Viewing the world as a computer: Global capacity management

Meta currently operates 14 data centers around the world. This rapidly expanding global data center footprint poses new challenges for service owners and for our infrastructure management systems. Systems like Twine, which we use to scale cluster management, and RAS, which handles perpetual region-wide resource allocation, have provided the abstractions and automation necessary for service [...] Originally published on Engineering at Meta.
http://dlvr.it/SXslZX
