Presentation: Streaming Microservices: Contracts & Compatibility
What You’ll Learn
- Understand the need for, and the role of, well-defined types in an enterprise application.
- Learn techniques and approaches for managing these schemas.
- Hear war stories and lessons about what can happen if you don't follow sound practices when defining your types.
Abstract
In a world of microservices that communicate via unbounded streams of events, schemas are the contracts between the services. Having an agreed contract allows the teams developing those services to move fast by reducing the risk involved in making changes. Yet delivering events with schema change in mind is not yet common practice.
In this presentation, we'll discuss patterns of schema design, schema storage and schema evolution that help development teams build better contracts through better collaboration - and deliver resilient applications faster. We'll look at how schemas were used in the past, how their meaning has changed over the years and why they gained particular importance with the rise of stream processing.
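To make schema evolution concrete, here is a minimal sketch using Apache Avro (the Order record, its fields, and the values are invented for illustration): a record written with an old schema is read with a newer one that adds a field with a default value, the classic backward-compatible change.

```java
import java.io.ByteArrayOutputStream;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

public class SchemaEvolutionDemo {
    public static void main(String[] args) throws Exception {
        // v1: the schema the producing service originally wrote with.
        Schema writer = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"Order\",\"fields\":["
            + "{\"name\":\"id\",\"type\":\"long\"}]}");

        // v2: the consumer's newer schema adds a field WITH a default,
        // the classic backward-compatible change.
        Schema reader = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"Order\",\"fields\":["
            + "{\"name\":\"id\",\"type\":\"long\"},"
            + "{\"name\":\"currency\",\"type\":\"string\",\"default\":\"USD\"}]}");

        // Serialize a record with the old (writer) schema.
        GenericRecord order = new GenericData.Record(writer);
        order.put("id", 42L);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(writer).write(order, encoder);
        encoder.flush();

        // Deserialize with the new (reader) schema; Avro's schema
        // resolution supplies the default for the missing field.
        BinaryDecoder decoder =
            DecoderFactory.get().binaryDecoder(out.toByteArray(), null);
        GenericRecord decoded =
            new GenericDatumReader<GenericRecord>(writer, reader).read(null, decoder);
        System.out.println(decoded); // {"id": 42, "currency": "USD"}
    }
}
```

These resolution rules are what let consumers upgrade their schemas independently of producers, which is the heart of the contract the talk describes.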
QCon: What’s the motivation for your talk?
Gwen: I’ve been concerned about the way companies manage their metadata since around 2012, when I first moved from managing relational databases to managing Hadoop clusters. DBAs take metadata, especially schemas, for granted. Then suddenly Hadoop was this wild west where people just dumped data and no one knew how to use it. You create all those crazy dependencies between teams, because whoever writes the data makes decisions that affect everyone and can break downstream apps at any time. This is even more difficult with stream processing because of the real-time and microservices nature of the applications.
I’ve spent the last five years working with customers on solving this problem with different tools and in different environments. I feel like I have quite a lot to share.
QCon: How would you describe the persona of the target audience for this talk?
Gwen: The relevant role is usually “enterprise architect”, because they have overall responsibility for how different applications communicate and play together, although I hope that many responsible engineers care as well. My target audience usually comes from medium-sized to very large companies - you need to be of a certain size before questions of compatibility become important.
QCon: How are you going to address these things?
Gwen: I'm going to spend part of my time just telling horror stories of what happens if you don't manage your schemas (I have four years' worth of horror stories to share). Then I'm going to talk about how it's a general problem: it's not about whether you use Avro or JSON or something else. It doesn't even matter if you do stream processing at all; it's a very generic problem of how components, services and teams communicate.
Then I'm going to go into some solutions, including the Confluent Schema Registry. It's important to note, though, that there are lots of other solutions you can use too.
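As a rough sketch of what using the Schema Registry looks like in practice, it exposes a REST endpoint for compatibility checks; here a team asks whether a proposed schema can replace the latest version registered under a subject (the `orders-value` subject and the localhost URL are assumptions for illustration):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegistryCompatibilityCheck {
    public static void main(String[] args) throws Exception {
        // Proposed new version of the value schema for the
        // (hypothetical) "orders-value" subject.
        String schema = "{\"type\":\"record\",\"name\":\"Order\",\"fields\":["
            + "{\"name\":\"id\",\"type\":\"long\"},"
            + "{\"name\":\"currency\",\"type\":\"string\",\"default\":\"USD\"}]}";

        // The registry expects the schema embedded as an escaped JSON string.
        String body = "{\"schema\":\"" + schema.replace("\"", "\\\"") + "\"}";

        // Ask the registry (assumed to run at localhost:8081) whether the
        // proposed schema is compatible with the latest registered version.
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(
                "http://localhost:8081/compatibility/subjects/orders-value/versions/latest"))
            .header("Content-Type", "application/vnd.schemaregistry.v1+json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // e.g. {"is_compatible":true}
    }
}
```

Running a check like this before deploying a producer is one way to turn the “contract” from an informal agreement into an enforced one.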
I want to end the talk with a few examples of the potential of this kind of centralized stream and schema management. In addition to the immediate compatibility benefits, a centralized metadata store can be used for data discovery and for governance. I hope to share some examples of what forward-looking enterprise architects in some organizations are currently exploring.
QCon: QCon targets advanced architects and senior development leads; what actionable takeaways do you feel that type of persona will walk away from your talk with?
Gwen: If you use events to communicate between applications (and this includes all stream processing apps), you absolutely need to figure out a way to detect and prevent schema compatibility issues early in the development process. You also need reasonable ways to allow schemas to change without breaking things. My talk is full of suggestions on how to do both.
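One way to catch such issues early, sketched here under the assumption that you use Avro: run a compatibility check as a unit test in CI, so an incompatible schema change fails the build before it ever reaches production (the Order schemas below are hypothetical).

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaCompatibility;
import org.apache.avro.SchemaCompatibility.SchemaCompatibilityType;
import org.apache.avro.SchemaCompatibility.SchemaPairCompatibility;

public class CompatibilityGate {
    public static void main(String[] args) {
        Schema oldSchema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"Order\",\"fields\":["
            + "{\"name\":\"id\",\"type\":\"long\"}]}");

        // Adding this field WITHOUT a default would make the check fail.
        Schema newSchema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"Order\",\"fields\":["
            + "{\"name\":\"id\",\"type\":\"long\"},"
            + "{\"name\":\"currency\",\"type\":\"string\",\"default\":\"USD\"}]}");

        // Backward compatibility: can a consumer on the new schema
        // still read data written with the old one?
        SchemaPairCompatibility result =
            SchemaCompatibility.checkReaderWriterCompatibility(newSchema, oldSchema);
        if (result.getType() != SchemaCompatibilityType.COMPATIBLE) {
            throw new AssertionError(
                "Incompatible schema change: " + result.getDescription());
        }
        System.out.println("Schema change is backward compatible");
    }
}
```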
QCon: What do you feel is the most important thing/practice/tech/technique for a developer/leader in your space to be focused on today?
Gwen: The transition from both request-response processing and batch processing to stream processing.
Every business has many applications that are either request-response or batch for historical reasons - but the real business process they model is a continuous stream of events. Using new technologies to model the business process more accurately in the applications will help make the entire process more efficient and more timely.
I am typically wary of cutting-edge technologies and prefer to use proven systems (like Kafka!), but one of the technologies I am currently most curious about is Lyft’s Envoy. I hope to learn more about it at QCon NYC.