Presentation: Scaling DB Access for Billions of Queries Per Day @PayPal

Track: Data Engineering for the Bold

Location: Empire Complex, 7th fl.

Duration: 10:35am - 11:25am

This presentation is now available to view on InfoQ.com

What You’ll Learn

  1. Hear about Hera and how it helps PayPal manage database access and deal with operational issues.
  2. Learn about some of the problems PayPal solves with Hera.

Abstract

As microservices scale and proliferate, they put increasing load on databases in terms of connections and resource usage. Written in the Go programming language and now open source, Hera (High Efficiency Reliable Access to data stores) scales thousands of PayPal’s applications with connection multiplexing, read-write split, and sharding. This talk covers the various approaches taken over the years to handle large growth in application connections and OLTP database utilization. Beyond pure connection and query scaling, Hera has functionality for better manageability. Automatic SQL eviction and DBA maintenance control make it easier to operate hundreds of databases.
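
To make read-write split and sharding concrete, here is a minimal sketch in Go. It is not Hera's code or API: the `Shard` type, the `Route` and `shardFor` functions, the endpoint names, and the hash-modulo placement rule are all assumptions chosen for illustration. It shows only the general idea of picking a shard from a key and sending plain reads to a replica while writes go to the primary.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
)

// Shard holds hypothetical endpoint names for one shard.
type Shard struct {
	Primary string // receives writes
	Replica string // receives plain reads
}

// shardFor maps a shard key to a shard index.
// Hash-modulo placement is an assumption made for this sketch.
func shardFor(key string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % uint32(n))
}

// isRead reports whether a statement is a SELECT that takes no row locks.
// This is a deliberately naive check based on the statement text.
func isRead(sql string) bool {
	s := strings.ToUpper(strings.TrimSpace(sql))
	return strings.HasPrefix(s, "SELECT") && !strings.Contains(s, "FOR UPDATE")
}

// Route picks the endpoint that should run the statement:
// first the shard for the key, then replica or primary by statement type.
func Route(shards []Shard, key, sql string) string {
	s := shards[shardFor(key, len(shards))]
	if isRead(sql) {
		return s.Replica
	}
	return s.Primary
}

func main() {
	shards := []Shard{
		{Primary: "primary-0", Replica: "replica-0"},
		{Primary: "primary-1", Replica: "replica-1"},
	}
	fmt.Println(Route(shards, "account-42", "SELECT balance FROM accounts WHERE id = :id"))
	fmt.Println(Route(shards, "account-42", "UPDATE accounts SET balance = :b WHERE id = :id"))
}
```

Both statements for the same key land on the same shard; only the choice between replica and primary differs. A real proxy would also have to handle transactions, replication lag, and moving data between shards, none of which this sketch attempts.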

Question: 

What is the focus of your work today?

Answer: 

Petrica: I am working on developing Hera, which is High Efficiency Reliable Access to data sources, basically a proxy to databases. We support Oracle and MySQL. Hera helps PayPal scale.

Question: 

What is the motivation for this talk?

Answer: 

Petrica: We are open-sourcing Hera and we want to talk about the product, about the issues that it solves and how it helps PayPal. We think this is applicable to other companies and other products. Hopefully, people will get familiar with it and start using it.

Question: 

Who is your target audience, and what level of experience do they need to have?

Answer: 

Petrica: It's for engineers and DBAs. It shows how Hera helps PayPal DBAs with Oracle maintenance, for example. It shows software engineers how to scale the database when they need to shard it. It also covers performance: with Hera, an application has to use fewer Oracle connections. To answer your question, I would say mostly engineers but also DBAs.

Question: 

Does Hera act as a buffer between clients and the database, taking the spikes in usage?

Answer: 

Petrica: It's part of it, not necessarily the main focus, but an important one. We find this very useful at PayPal. If you have a spike, sometimes you have to choose how to handle it, and you may choose to kill a bad actor or a client that holds a resource for too long.

Question: 

What has been your experience working in Go for such a system?

Answer: 

Petrica: We like it. We looked at Go earlier, but at the time it was not good enough for our case because of the garbage collector's stop-the-world time. This was maybe five years back, but two years ago we came to the conclusion that it was ready for us, so we adopted it. Most of us were C++ engineers. In the beginning we had some reservations about Go, but it worked for us and it exceeded our expectations. Go has a new release every six months and we’ve seen improvements every time. All you have to do is recompile the application and it works faster.

Speaker: Petrica Voicu

Software Engineer @PayPal

Petrica Voicu is a software developer at PayPal, where he develops Hera (High Efficiency Reliable Access to data stores), focusing on its scalability, resiliency, availability and performance. Before that, he worked on a high volume, zero-message loss, database-backed publish-subscribe messaging system. Prior to that, Petrica built the backend for the first version of PayPal Here, a PoS mobile-device accessory. In the more distant past, he developed embedded systems leveraging both C and assembly. He leveraged his assembly language experience more recently when he discovered and fixed a bug in the Go language runtime.

Speaker: Kenneth Kang

Software Engineer @PayPal

Kenneth Kang is a software engineer at PayPal. He's helped scale Hera (High Efficiency Reliable Access to data stores) since 2015. Coding in C++ and Go, he added a bit of code for sharding and helped engineers move to their new, sharded database. He enjoys the team's design discussions on how a LIFO backlog can improve recovery from an incident. In the past, he's worked on C++ frameworks and an async gateway. Recently, he's been interested in automated testing of deployment and monitoring.
