BellJar: A new framework for testing system recoverability at scale

Building infrastructure that can easily recover from outages, particularly outages involving adjacent infrastructure, too often becomes a murky exploration of nuanced fate-sharing between systems. Untangling dependencies and uncovering the side effects of unavailability has historically been time-consuming work. A lack of great tooling built for this, and the rarity of infrastructure outages, make reasoning about them [...] The post BellJar: A new framework for testing system recoverability at scale appeared first on Engineering at Meta.
http://dlvr.it/SPqvD7

Popular posts from this blog

Inside Meta’s first smart glasses

Charting the future of our bug bounty program

How Meta enforces purpose limitation via Privacy Aware Infrastructure at scale