Presentation: Canopy: Scalable Distributed Tracing & Analysis @Facebook
Abstract
How do you understand the performance of a request that is executed in a large-scale system, potentially fanning out across thousands of machines and services? To answer this question at Facebook, we built a distributed tracing framework, Canopy, which has provided visibility into an otherwise intractable problem.
In this talk we present Canopy, Facebook’s performance and efficiency tracing infrastructure. Canopy recoards causally related events across the end-to-end execution path of requests, including from browsers, mobile applications, and backend services. Canopy processes traces in near real-time, derives user-specified features, and outputs to datasets that aggregate across billions of requests. At Facebook, Canopy is used to query and analyze performance and efficiency data in real-time.
Canopy addresses three challenges we have encountered: (1) supporting the range of execution and performance models used by different components of the Facebook stack; (2) supporting interactive ad-hoc and real-time analysis of trace data; and (3) operating at massive scale - Canopy currently records and processes over 1 billion traces per day.
We conclude by discussing lessons learned applying Canopy to a wide range of use cases at Facebook and present case studies of its use in solving various performance and efficiency challenges
Similar Talks
Inside Job: How to Build Great Teams Within a Legacy Organization?
Engineering Director @Meetup
Francisco Trindade
Self-Selection for Resilience and Better Culture
Agile/DevOps Trainer & Founder of Agile Play Consulting, LLC
Dana Pylayeva
CockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB maintainer, Co-founder & CTO @CockroachDB
Peter Mattis
Breaking Hierarchy - How Spotify Enables Engineer Decision Making
Senior Engineering Manager, Data and Machine Learning Infrastructure @Spotify
Kristian Lindwall
Context Matters: Improving the Performance and Wellbeing of Teams
Director of IT @Etsy
Shawn Carney
Maintaining the Go Crypto Libraries
Cryptogopher @Google
Filippo Valsorda
Video Streaming at Scale
IBM Distinguished Engineer, CTO Watson Media Cognitive Solutions @IBM
Lysa Banks
Machine-to-Machine Interfaces
Sr. Consultant, AppDev @awscloud
Ari Lerner
From Research to Production With PyTorch
Engineering Manager @Facebook AI