Presentation: Scaling DB Access for Billions of Queries Per Day @PayPal
This presentation is now available to view on InfoQ.com
What You’ll Learn
- Hear about Hera and how it helps PayPal to manage database access and deal with issues.
- Learn about some of the problems PayPal solves with Hera.
Abstract
As microservices scale and proliferate, they add increasing load on databases in terms of connections and resource usage. Open sourced in the Go programming language, Hera (High Efficiency Reliable Access to data stores) scales thousands of PayPal’s applications with connection multiplexing, read-write split, and sharding. This talk covers various approaches taken over the years to handle a large growth in application connections and OLTP database utilization. Beyond pure connection and query scaling, Hera has functionality for better manageability. Automatic SQL eviction and DBA maintenance control help to more easily operate hundreds of databases.
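As a rough illustration of two of the ideas named in the abstract, read-write split and sharding, the Go sketch below routes a statement to a "read" or "write" pool and maps a shard key to a shard number. It is not Hera's implementation: the `Route` type, the `isRead`, `shardFor`, and `route` functions, the prefix-based read detection, and the FNV hash are all assumptions chosen for the example.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
)

// Route names the (hypothetical) backend pool and shard for one statement.
type Route struct {
	Pool  string // "read" or "write"
	Shard int
}

// isRead is a deliberately naive check: a statement counts as a read if it
// starts with SELECT and does not end with FOR UPDATE.
func isRead(sql string) bool {
	s := strings.ToUpper(strings.TrimRight(strings.TrimSpace(sql), ";"))
	return strings.HasPrefix(s, "SELECT") && !strings.HasSuffix(s, "FOR UPDATE")
}

// shardFor maps a shard key to one of n shards using an FNV-1a hash.
// Any stable hash would do; FNV is only an assumption for this sketch.
func shardFor(key string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % uint32(n))
}

// route combines both decisions for a single statement.
func route(sql, shardKey string, shards int) Route {
	pool := "write"
	if isRead(sql) {
		pool = "read"
	}
	return Route{Pool: pool, Shard: shardFor(shardKey, shards)}
}

func main() {
	fmt.Println(route("SELECT * FROM account WHERE id = :id", "acct-42", 4))
	fmt.Println(route("UPDATE account SET balance = :b WHERE id = :id", "acct-42", 4))
}
```

Both statements carry the same shard key, so they land on the same shard; only the pool differs. A real proxy would need a proper SQL parser and a managed shard map rather than a string prefix check and a plain modulo.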
What is the focus of your work today?
Petrica: I am working on developing Hera, which stands for High Efficiency Reliable Access to data stores. It is basically a proxy to databases. We support Oracle and MySQL. Hera helps PayPal scale.
What is the motivation for this talk?
Petrica: We are open-sourcing Hera and we want to talk about the product, about the issues that it solves and how it helps PayPal. We think this is applicable to other companies and other products. Hopefully, people will get familiar with it and start using it.
Who is your target audience, and what level of experience do they need to have?
Petrica: It's for engineers and DBAs. It shows how Hera helps PayPal DBAs with Oracle maintenance, for example, and it shows software engineers how to scale the database when they need to shard it. It also covers performance: with Hera, an application needs fewer Oracle connections. To answer your question, I would say mostly engineers, but also DBAs.
Does Hera act as a buffer between clients and the database, taking the spikes in usage?
Petrica: That's part of it. It's not necessarily the main focus, but it is an important one. We find this very useful at PayPal. When there is a spike, you sometimes have to choose how to handle it, and you may choose to kill a bad actor or a client that holds a resource for too long.
What has been your experience working in Go for such a system?
Petrica: We like it. We looked at Go earlier, maybe five years back, but at the time it was not good enough for our case because of the garbage collector's stop-the-world time. Two years ago we came to the conclusion that it was ready for us, so we adopted it. Most of us were C++ engineers. In the beginning we had some reservations about Go, but it worked for us and exceeded our expectations. Go has a new release every six months, and we’ve seen improvements every time. All you have to do is recompile the application and it runs faster.