Presentation: Functional/Microservices in Real-Time Financials
What You’ll Learn
- Learn how to design a modern double entry accounting system from scratch (relevant buzzwords: event-driven, declarative, immutable, functional, correct).
- Discuss how an “immutable” database with a built-in audit trail (Nubank uses Datomic) can be a secret weapon for building a system of record.
- Gain knowledge of the challenges in scaling a real-time, financial system of record.
- Assess the empirical benefits for a financial firm based on real-time customer-level accounting visibility.
Abstract
Financial institutions have the responsibility of providing reliable, correct, and audited financial records for their customers and regulators. As financial services move towards real-time and adopt microservices architectures, how can they ensure data quality in a distributed system without compromising availability? We will present how we’ve built our system of record based on functional programming principles, the tools we used (Clojure, Datomic, Kafka), the challenges we faced when taking it to scale, and the benefits of our approach, including data science modeling, real-time customer visibility, guaranteed conservation of money, and customer account histories.
QCon: What are you doing today?
Vitor: I work for Nubank, which has a credit card business. I set up the building blocks that big financial institutions usually take for granted. First, there is the operational work of doing interest calculations, taxes, and other simple financial calculations. Also, because I'm the guy between the CTO and the CFO, I use software to solve business problems in a more strategic way. For example, if we need funding, we need to find ways to have a competitive edge. How do we do that? Let's do a securitization process that is 100% controlled by systems instead of the usual manual flow you see in other institutions.
With this in mind, we've been very innovative in Brazil, doing the first securitization of sales of credit card purchases, something that had never been done before because competitors did not have, and were not willing to build, the systems for it. In addition to that, I also work with new products. If we're thinking about new financial products, we always need to first establish the groundwork.
We are software specialists. We're not going to hire a bunch of people to run our new products; we don't have bankers. We need our software to be in charge of decisions and become the domain expert. One of the building blocks that we've created in this process is our system of record, which is a double-entry accounting system spanning three different products.
QCon: Are you going to dive into the system of record, how it's built?
Vitor: Yes. I'll briefly introduce people to what our services look like. We run a microservices architecture with over 90 services. Most of them are written in Clojure, use Datomic for storage, and are connected to other services and our clients through Kafka queues and REST APIs. Then I'll present the problem that we have: the numbers that drive our most important decisions depend on aggregates across many of these services. I need an entity that comes from that operational service, that database, that logic, and I need all of that to consolidate, because someone's limit is going to be affected by purchases, by payments, by interest, by taxes, by everything in it.
How do we define something that is canonical, something that can be used by the customer, by operations, by analysis, and sent to the regulators? These are the numbers that everyone is concerned about, not just Nubank. All the stakeholders look at them. Then I can show that with this architecture we could subscribe to existing Kafka topics and plug in a new, mostly functional service that would transform financially relevant events into a consolidated balance sheet. Then I talk about our original assumptions, what turned out to be true, and the problems that we had when taking it to scale.
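The idea of turning financially relevant events into a consolidated balance sheet can be sketched roughly as follows. This is an illustration only, not Nubank's code (their services are written in Clojure): the event shapes, the account names, and the `RULES` table are all hypothetical, and Kafka consumption is replaced by a plain list of events.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Entry:
    """One double-entry line: the same amount debits one account and credits another."""
    debit_account: str
    credit_account: str
    amount: Decimal


# Hypothetical mapping from event type to (debit account, credit account).
RULES = {
    "purchase": ("customer_receivable", "merchant_payable"),
    "payment": ("cash", "customer_receivable"),
    "interest": ("customer_receivable", "interest_revenue"),
}


def entries_for(event: dict) -> list[Entry]:
    """Pure function: one financially relevant event -> ledger entries."""
    debit, credit = RULES[event["type"]]
    return [Entry(debit, credit, Decimal(event["amount"]))]


def balances(events: list[dict]) -> dict[str, Decimal]:
    """Fold all entries into per-account balances (debits positive, credits negative)."""
    totals: dict[str, Decimal] = defaultdict(Decimal)
    for event in events:
        for entry in entries_for(event):
            totals[entry.debit_account] += entry.amount
            totals[entry.credit_account] -= entry.amount
    return dict(totals)


# Stand-in for events that would arrive from existing topics.
events = [
    {"type": "purchase", "amount": "100.00"},
    {"type": "interest", "amount": "5.00"},
    {"type": "payment", "amount": "40.00"},
]
sheet = balances(events)
```

Because every entry debits and credits the same amount, the balances across all accounts always sum to zero, which is one way to read "guaranteed conservation of money".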
Wes: From the CAP theorem perspective, because obviously you have a highly consistent environment that's built on top of microservices, what were some of the challenges you had to face?
Vitor: We like to say that the double-entry system is highly correct more than it is consistent. The first thing we can always guarantee is that, even if stale, the data is always in a valid state and is never corrupted by unexpected distributed systems issues.
Additionally, we can't guarantee consistency across all services, but we can guarantee traceability of when, how, and why we were inconsistent. As soon as we have that, we have the opportunity to take actions to fix decisions made with inconsistent or stale data. This is partly possible because we use Datomic. Datomic is essentially an append-only immutable database. This means that I can easily go back in time and see what the database looked like when a specific request or process was running. This gives us a strong audit trail and very powerful time series debugging.
Guarantees of consistency are concerns that sit above the core idea of the system. I make sure that happens by monitoring it, by having the tooling around synchronizing databases, by republishing events, things like that. That's how we think about it. It's not about making sure that the entity is consistent with all the other entities and all the other databases. It's more about having a source of truth that can be stale but never corrupted, and that has enough metadata and a strong enough audit trail to help us determine the moments in which it was stale, so we can use it to correct our operational decisions and our models.
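One way to picture "always in a valid state, even if stale" is a ledger that only ever accepts movements whose lines net to zero. The sketch below is an assumption about how such a check could look, not a description of Nubank's implementation; `apply_movement` and `UnbalancedMovement` are made-up names.

```python
from decimal import Decimal


class UnbalancedMovement(ValueError):
    """Raised when the lines of a movement do not net to zero."""


def apply_movement(ledger: tuple, lines: list[tuple[str, Decimal]]) -> tuple:
    """Return a new ledger with the movement appended, or raise if it is unbalanced.

    Each line is (account, signed_amount): debits positive, credits negative.
    The input ledger is never mutated, so a rejected movement leaves it intact.
    """
    if sum(amount for _, amount in lines) != 0:
        raise UnbalancedMovement(lines)
    return ledger + (tuple(lines),)


ledger = ()
ledger = apply_movement(
    ledger,
    [("customer_receivable", Decimal("100")), ("merchant_payable", Decimal("-100"))],
)

# An unbalanced movement is rejected and the ledger keeps its previous valid state.
try:
    apply_movement(
        ledger,
        [("cash", Decimal("40")), ("customer_receivable", Decimal("-30"))],
    )
except UnbalancedMovement:
    rejected = True
else:
    rejected = False
```

A ledger built this way may lag behind the services that feed it, but whatever it does contain is balanced.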
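The "go back in time" property described in this answer can be illustrated with a toy append-only store. This is plain Python written for illustration and is not Datomic's API; the class, method names, and the example entity and attribute are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Fact:
    tx: int          # monotonically increasing transaction number
    entity: str
    attribute: str
    value: Any


class AppendOnlyStore:
    """Toy store: facts are only appended, never updated or deleted."""

    def __init__(self) -> None:
        self._log: list[Fact] = []
        self._next_tx = 1

    def assert_fact(self, entity: str, attribute: str, value: Any) -> int:
        """Append a new fact and return the transaction number it was given."""
        tx = self._next_tx
        self._log.append(Fact(tx, entity, attribute, value))
        self._next_tx += 1
        return tx

    def as_of(self, tx: int, entity: str, attribute: str) -> Optional[Any]:
        """Latest value recorded at or before transaction `tx`, or None."""
        result = None
        for fact in self._log:
            if fact.tx > tx:
                break  # the log is ordered by tx, so later facts are out of scope
            if fact.entity == entity and fact.attribute == attribute:
                result = fact.value
        return result


store = AppendOnlyStore()
t1 = store.assert_fact("account-1", "limit", 1000)
t2 = store.assert_fact("account-1", "limit", 1500)

# What did a process running at t1 see? The old limit is still answerable.
limit_then = store.as_of(t1, "account-1", "limit")
limit_now = store.as_of(t2, "account-1", "limit")
```

Since nothing is overwritten, a decision made at `t1` can later be checked against exactly the data that was visible at `t1`, which is the audit-trail benefit the answer points to.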
QCon: Who's the main persona you're talking to?
Vitor: I would be talking more to architects looking to solve financial problems in a microservices architecture.