Presentation: Beyond REST: Coursera's Journey to GraphQL
What You’ll Learn
- Gain knowledge on how to introduce GraphQL.
- Discover how to add an adapter layer that saves rewriting all the APIs when moving to GraphQL.
- Learn what to look out for while migrating your clients to use GraphQL.
Abstract
Coursera's platform is composed of hundreds of APIs, implemented across dozens of services by various engineering teams. Our client engineers have faced many challenges while using these APIs, especially around discoverability and assembly of data from various services. We’re working to solve these problems by migrating all client data access from REST to GraphQL.
Our path to GraphQL is different from most: instead of manually adapting each of our REST APIs for GraphQL, we built a dynamic assembly layer that unifies our distributed APIs into a single GraphQL endpoint and corresponding schema. This unified schema allows clients to access data from across our various services in a single query.
In this talk, I’ll cover why we’re transitioning to GraphQL, share challenges and learnings from building our GraphQL assembly layer, and discuss a few open questions we have around designing APIs for simultaneous REST and GraphQL usage, and who owns the business logic in GraphQL.
QCon: What's the focus of the work that you do at Coursera today?
Bryan: I'm on the client infrastructure team, and my job is to make client developers as productive as possible. We do whatever it takes to make them more productive and to make Coursera more performant. We focus on two things right now. One is around APIs, adding this GraphQL layer. The other is around deployment, making it faster and safer for developers to deploy front-end applications. We also do a lot to maintain the libraries and the infrastructure around the front-end applications so that developers stay productive.
QCon: What's the motivation for your talk?
Bryan: We've been transitioning to GraphQL, and it has solved many of the problems we faced with REST that we had been trying to figure out on our own. It's still pretty new, and we've done things differently than most other people. Those who start with GraphQL typically hook it up manually to all their endpoints. We now have around 800 different API endpoints, and there are different ways to access data on each of them. We knew that we didn't want to make developers rewrite everything or find their own way to add GraphQL, because we knew they would not do that; they're too lazy or have other priorities. We found a way to adapt our existing API infrastructure to work with GraphQL without developers having to do anything on their own. One part of the motivation is to talk about GraphQL and how there are approaches other than rewriting everything, and to get people thinking about that. The other is to get people thinking, when they're building API frameworks and libraries, about how to make them more future-proof, so that if GraphQL or some other new thing comes along, they are not limited to whatever they have now.
QCon: What was the core problem that GraphQL could solve for Coursera?
Bryan: There were three different problems. One was that our APIs were inefficient: we would end up with 20-30 round trips from the browser to the server to fetch all the data. Another was documentation: people didn't know about all 800 APIs, and if a developer needed an API and did not know it existed, they would write it again themselves. So we had some duplicate APIs out there. The third was around tooling. We have this framework called Naptime that we adapted to work with GraphQL. We ended up writing a client-side framework to work with Naptime, and that was super buggy. People would try to Google how to use it, and there was nothing out there because we had built it in-house. Being able to use libraries that other people are working on, and to work with the community around them, has made things much easier for us: we can contribute and don't have to do everything on our own.
QCon: What do you mean by using GraphQL differently than others?
Bryan: We didn't want our developers to have to rewrite all the APIs we have. The framework gives us three different things: which services have which APIs, the schema for what each API returns, and the schema for its inputs. We take that information, and every five minutes we ask all of our services for the state of the world and rebuild our entire GraphQL schema automatically.
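The rebuild step described above can be sketched in Python. This is a minimal illustration under stated assumptions, not Coursera's implementation: `ResourceSpec`, `type_name`, `build_schema_sdl`, and `rebuild` are hypothetical names, and the sketch assumes each service can report, per resource, its output fields and input arguments as GraphQL type strings.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class ResourceSpec:
    """Hypothetical metadata a service reports about one of its REST resources."""
    service: str               # which service hosts the API
    name: str                  # resource name, e.g. "courses"
    fields: Dict[str, str]     # output schema: field name -> GraphQL type
    arguments: Dict[str, str]  # input schema: argument name -> GraphQL type


def type_name(resource: str) -> str:
    """Derive a GraphQL object type name from a resource name."""
    return resource[:1].upper() + resource[1:] + "Resource"


def build_schema_sdl(specs: Iterable[ResourceSpec]) -> str:
    """Assemble one GraphQL schema (as SDL text) from every reported resource."""
    ordered = sorted(specs, key=lambda s: s.name)  # stable output across rebuilds
    blocks: List[str] = []
    query_fields: List[str] = []
    for spec in ordered:
        body = "\n".join(f"  {f}: {t}" for f, t in spec.fields.items())
        blocks.append(f"type {type_name(spec.name)} {{\n{body}\n}}")
        args = ", ".join(f"{a}: {t}" for a, t in spec.arguments.items())
        arg_list = f"({args})" if args else ""
        query_fields.append(f"  {spec.name}{arg_list}: [{type_name(spec.name)}]")
    blocks.append("type Query {\n" + "\n".join(query_fields) + "\n}")
    return "\n\n".join(blocks)


def rebuild(fetchers: Iterable[Callable[[], List[ResourceSpec]]]) -> str:
    """One rebuild cycle: ask every service for its resources, then reassemble."""
    specs = [spec for fetch in fetchers for spec in fetch()]
    return build_schema_sdl(specs)
```

In this sketch each fetcher stands in for a call to one service's metadata endpoint. A scheduler would call `rebuild` on a fixed interval (every five minutes, per the answer above) and swap the resulting schema in for the old one.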
QCon: What do you consider the level of this talk?
Bryan: I'd say intermediate. It doesn't go too deep into GraphQL, like how to optimize it. It's more about how to move to GraphQL, or how to structure your framework so you don't get locked into one specific technology. I'm going to dive into how we got into GraphQL, with tips and tricks from migrating to it, but not too much on the ins and outs of GraphQL itself. There are takeaways that aren't specific to GraphQL.
QCon: What will a Java or Scala developer know after coming to your talk that they didn't know before?
Bryan: The big thing is that it's possible to write this adapter layer, and you don't have to rewrite every API. Also, if you're writing APIs, how to structure them in a way that makes them more future-proof. Can you pull your schema out in a language-independent format like ProtoBuf? We use another one from LinkedIn. Is there a way to structure all that data so that, if you need to move to something else in the future, it's much easier to do and gives you more flexibility around your APIs?
QCon: Are you going to start using GraphQL to directly talk to the data stores or are you continuing to write RESTful APIs that GraphQL will consume?
Bryan: We're not rewriting any API on the back-end. We have 800 APIs that developers know how to use. All of our back-end services still speak REST to each other. It's just that we have this adapter layer between the servers and the client, where we want to use GraphQL because it's much more efficient. We have been talking about letting servers speak GraphQL to each other. If you need to get some data from the catalog service or the payment service, and that ends up fetching a whole bunch of other things, GraphQL might be more efficient. But that opens up a whole new can of worms if you have loops or other things to worry about. Right now that's not something that we're trying to optimize.
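To make the adapter idea concrete, here is a minimal Python sketch of a layer that answers one client request by fanning out to existing REST endpoints on the server side. Everything in it is an assumption for illustration: the `/api/{resource}?ids=...&fields=...` URL shape, the `rest_url` and `resolve` functions, and the `selection` dictionary, which is a simplified stand-in for a parsed GraphQL query rather than real GraphQL parsing.

```python
from typing import Any, Callable, Dict, List
from urllib.parse import urlencode

# A REST fetcher takes a URL path and returns decoded JSON rows. It is injected
# so the sketch runs without a network; a real adapter would use an HTTP client.
Fetcher = Callable[[str], List[Dict[str, Any]]]


def rest_url(resource: str, ids: List[str], fields: List[str]) -> str:
    """Translate one selected resource into a (hypothetical) REST request path."""
    query = urlencode({"ids": ",".join(ids), "fields": ",".join(fields)}, safe=",")
    return f"/api/{resource}?{query}"


def resolve(selection: Dict[str, Dict[str, Any]], fetch: Fetcher) -> Dict[str, Any]:
    """Answer one client query by calling the existing REST APIs server-side.

    `selection` maps a resource name to {"ids": [...], "fields": [...]}.
    """
    result: Dict[str, Any] = {}
    for resource, wanted in selection.items():
        rows = fetch(rest_url(resource, wanted["ids"], wanted["fields"]))
        # Return only the fields the client asked for, as GraphQL would.
        result[resource] = [{f: row.get(f) for f in wanted["fields"]} for row in rows]
    return {"data": result}
```

The client makes a single round trip to `resolve`, while the REST calls happen between servers, which is where the efficiency gain over many round trips from the browser comes from. The back-end APIs themselves are untouched.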