Presentation: Help! I Accidentally Distributed My System!
This presentation is now available to view on InfoQ.com
What You’ll Learn
- Understand some of the key challenges of building a distributed system on the latest generation of cloud infrastructure.
- Learn approaches to reasoning about and debugging these distributed systems.
- Plan more strategically when designing distributed systems so that later troubleshooting is more effective.
Abstract
Mobile and web apps are increasingly built on Backends as a Service, Platforms as a Service, and Infrastructure as a Service solutions. We snap together SaaS software and vendor products, adding pieces until we’ve built a complex system out of seemingly simple parts. We’ve all become distributed systems engineers, intentionally or not.
This is a practical talk about the tools and skills we can use to get ourselves out of the corner we’ve boxed ourselves into. We propose ways to optimize for the operator experience and for our ability to understand how data moves through the system. We’ll discuss how to choose your pieces wisely, how to debug the newly complex systems you’ve built, and how to manage the ever-increasing cognitive load of it all.
Can you tell me about the work you do today?
Rachel: I work at Google on Firebase Security Rules and Google Cloud Policy. Firebase is a Backend as a Service, and we provide tools that help developers keep their apps secure. Google Cloud Policy helps system administrators ensure that all their apps have the same security policy.
Emily: I manage the product engineering team at Honeycomb. We are building a product to help engineers understand complex systems by helping them visualize production data in new and helpful ways.
Your talk is called “Help! I accidentally distributed my system.” What is the motivation for it?
Rachel: A lot of work is being done on building distributed systems, and we have been discussing different aspects of that work for some time. We felt there was a need to focus on long-term goals when building systems, while still taking into account the technical decisions that have to be made along the way.
Emily: We are two people with slightly different expertise. Over the years, we have done many joint talks about the hard problems that we solved together. It’s been interesting to see our hardest problems shift from the frontend to the backend, and go from being architecture problems to debugging problems.
Is this talk for someone who is migrating from a monolith and building their first service or for someone who has plans of building a large scale distributed system?
Rachel: This talk is for people who think they are building simple services one at a time and slowly realize that they have a complex distributed system instead. This could happen at both small and large companies.
Emily: People may be making good technology choices and the right business decisions along the way, but they may still run into problems once their system reaches a certain level of complexity. This talk is for them.
Can you give us examples of the types of problems that require a new way of thinking?
Emily: We’ll present some examples of the most interesting distributed system bugs we have experienced. My favorite is a bug whose symptoms suggested a complex distributed systems problem but which turned out to be quite straightforward, though only after we had spent a long time debugging it. We’re going to dive into examples like this and show how we reason about the system, and how good instrumentation can help reduce the time to resolution.
Rachel: We want to make people aware of where the problems are most likely to be found.
Who is the core persona that the talk is intended for?
Rachel: The talk is for people who make decisions about what their system will look like. In some companies, it could be the V.P.; at others, it could be the senior engineers.
Emily: We want to address senior software engineers, who own the system and the architectural decisions. We want to bring together mindsets from different areas of the stack.
What are the key takeaways from the talk?
Rachel: We want to give attendees enough examples to understand that every decision has consequences, so that when they evaluate new software they know which problems to consider.
Emily: We also want to share the toolset that we found useful in debugging visibility problems in distributed systems. There will be conceptual as well as practical takeaways from the talk.
What do you feel is the most important trend in software today?
Emily: There are a lot of surprise distributed systems around us. Any system with a lot of complex app logic in the browser together with a server qualifies as a distributed system.
Rachel: Distributed systems are the most important trend because everyone is building services or building on the cloud, which means they are building distributed systems.