
Lessons from Building and Operating Amazon S3

Andy Warfield, a VP and distinguished engineer at Amazon, has contributed a guest post to the All Things Distributed blog based on his keynote at the USENIX FAST ’23 conference. He sums up:

I came to Amazon expecting to work on a really big and complex piece of storage software. What I learned was that every aspect of my role was unbelievably bigger than that expectation. I’ve learned that the technical scale of the system is so enormous, that its workload, structure, and operations are not just bigger, but foundationally different from the smaller systems that I’d worked on in the past. I learned that it wasn’t enough to think about the software, that “the system” was also the software’s operation as a service, the organization that ran it, and the customer code that worked with it. I learned that the organization itself, as part of the system, had its own scaling challenges and provided just as many problems to solve and opportunities to innovate. And finally, I learned that to really be successful in my own role, I needed to focus on articulating the problems and not the solutions, and to find ways to support strong engineering teams in really owning those solutions.

Parts of this lengthy post will likely be over your head technically (as they were over mine), but the more down-to-earth nuggets are fascinating. Warfield moved directly from grad school to a startup and then returned to academia as a professor before eventually joining Amazon to work on its Simple Storage Service—Amazon S3. Despite the name, S3 is anything but simple.

The scale of Amazon S3 is mind-blowing: it holds over 280 trillion objects and averages over 100 million requests per second. It’s built on millions of hard drives, and it’s entirely likely that a single data request might be served by over 1 million individual drives. S3 is so large that the problems it presents, and the solutions it requires, are fundamentally different from what one might expect from simply scaling up a smaller system.

Warfield takes a side trip into the history of hard drives to note that their capacity has increased 7.2 million times while their physical size has decreased 5,000 times. To help visualize the technical wizardry encapsulated in hard drives, he updates the analogy of a hard drive head scaled up to the size of a 747 airplane flying over a grassy field, where each blade of grass is a bit of information. In his version, the plane flies at only 75 miles per hour, but the air gap between the bottom of the plane and the top of the grass is just two sheets of paper. As it flies, it counts each blade of grass and misses a blade only once every 25,000 trips around the Earth. Amazing.

Finally, Warfield closes with some views on the importance of “ownership” in an organization. Whether with grad students in his lab or engineers at Amazon, he found it more effective to encourage people to come up with their own thoughts than to feed them ideas. The sentence that resonated most deeply for me:

I consciously spend a lot more time trying to develop problems, and to do a really good job of articulating them, rather than trying to pitch solutions.

