Presentation: Probabilistic Programming from Scratch
This presentation is now available to view on InfoQ.com
What You’ll Learn
- Gain a deeper understanding of how Probabilistic Programming can be used to help engineers solve problems around incomplete or partial data.
- Learn a new programming paradigm using Python and PyMC3.
- Hear how Probabilistic Programming is being used in places like Facebook, Twitter, and Google in time series forecasting systems.
Abstract
This talk is for anyone who deals with real-world data. Such data is always incomplete or imperfect in some way. Bayesian inference is a framework that allows us to draw conclusions from that data. And despite a reputation for mathematical and computational complexity, you don’t need a statistics background to understand Bayes at a conceptual level. We’ll develop that understanding by building a lightweight probabilistic programming system from scratch with simple Python. We’ll use the code we write to solve two real data problems: an A/B test and the German Tank problem. We’ll also look at how we’d solve those problems using PyMC3, a much more powerful, fully-featured probabilistic programming system.
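To give a flavour of what a "from scratch" system for the A/B test might look like, here is a minimal sketch. The abstract does not name the inference algorithm, so the choice of rejection sampling (guess a parameter, simulate the experiment, keep the guess only if the simulation matches the data) is an assumption. The function name `conversion_rate_posterior`, the uniform prior, and the visitor and conversion counts are all hypothetical and not taken from the talk.

```python
import random

def conversion_rate_posterior(n_visitors, n_conversions, n_draws=1000, seed=0):
    """Approximate the posterior over a conversion rate by rejection sampling."""
    rng = random.Random(seed)
    accepted = []
    while len(accepted) < n_draws:
        # 1. Guess a conversion rate from the prior (assumed uniform on [0, 1]).
        rate = rng.random()
        # 2. Simulate the experiment under that guess.
        simulated = sum(rng.random() < rate for _ in range(n_visitors))
        # 3. Keep the guess only if the simulation reproduces the real data.
        if simulated == n_conversions:
            accepted.append(rate)
    return accepted

# Hypothetical A/B test results, not data from the talk.
a_samples = conversion_rate_posterior(n_visitors=40, n_conversions=4, seed=1)
b_samples = conversion_rate_posterior(n_visitors=50, n_conversions=9, seed=2)

# Fraction of paired posterior draws in which layout B beats layout A.
prob_b_better = sum(b > a for a, b in zip(a_samples, b_samples)) / len(a_samples)
print(f"P(B converts better than A) is roughly {prob_b_better:.2f}")
```

The accepted guesses act as samples from the posterior, so a question such as "how sure are we that B is better?" becomes a matter of counting. This approach is easy to read but wasteful, since most guesses are thrown away. That inefficiency is one reason to reach for a fuller system such as PyMC3 on larger problems.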
What do you want someone to leave your talk with?
The audience will leave with a strong non-mathematical intuition for how Bayesian inference allows us to quantify the strength of conclusions drawn from real-world data. They’ll hopefully be excited to solve other toy problems with the tool we put together during the talk, and keen to check out PyMC3.
This talk is perhaps most useful for people who deal with real world data and face concrete statistical problems. But Bayesian inference provides a powerful day-to-day mental model for thinking about data and belief. And in keeping with the CS track, this talk will be an introduction to a new programming language paradigm for some. So I hope it will be at least interesting to a very wide audience!
Is probabilistic programming a real thing? Can you give me an example of where it's being used today?
Yes, it's a real thing! The most prominent examples of tech companies using these ideas in the real world are Facebook's Prophet time series forecasting system (which I'll discuss in the talk), and Uber's release of Pyro, an open source deep probabilistic programming system built on top of PyTorch. And Google are now getting involved with TensorFlow Probability.
What is the level of experience someone attending this talk should have?
This might seem like a talk about statistics, mathematics and computer science. But my goal is that everyone who can write a for loop will understand everything we do. I attempt to ensure this by implementing things from scratch, and choosing a Bayesian inference algorithm that is particularly transparent and non-mathematical. I happen to use Python a little in this talk, but it's not essential that you can code in Python. And very importantly: no mathematics is required!
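As an illustration of how little machinery a "for loop" approach needs, here is a hedged sketch of the German Tank problem mentioned in the abstract. The answer above does not say which algorithm is used, so rejection sampling is an assumption here. The serial numbers, the prior upper bound `max_tanks`, and the function name `tank_count_posterior` are hypothetical.

```python
import random

def tank_count_posterior(observed_serials, max_tanks=1000, n_trials=200_000, seed=1):
    """Approximate the posterior over the total number of tanks."""
    rng = random.Random(seed)
    k = len(observed_serials)
    largest_seen = max(observed_serials)
    accepted = []
    for _ in range(n_trials):
        # Guess a fleet size from the prior (assumed uniform on 1..max_tanks).
        n_tanks = rng.randint(1, max_tanks)
        # A fleet smaller than the largest observed serial cannot produce it.
        if n_tanks < largest_seen:
            continue
        # Simulate capturing k distinct tanks from a fleet of that size.
        captured = rng.sample(range(1, n_tanks + 1), k)
        # Keep the guess if the simulated capture has the same largest serial.
        if max(captured) == largest_seen:
            accepted.append(n_tanks)
    return accepted

# Hypothetical captured serial numbers, not data from the talk.
posterior = sorted(tank_count_posterior([10, 256, 202, 97]))
median_estimate = posterior[len(posterior) // 2]
print(f"{len(posterior)} accepted guesses, median fleet size {median_estimate}")
```

The simulated capture is compared with the data on the largest serial number only, which keeps the acceptance rate workable. The whole procedure is a loop, a random guess, a simulation, and a comparison, with no formulas to derive.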