Keynote: Incident Response: Trade-Offs Under Pressure

Location: Salon A/B/C/D/E


Abstract

The increasing complexity of software applications and architectures in Internet services challenges the reasoning of operators tasked with diagnosing and resolving outages and degradations as they arise. This talk will give a glimpse into how other fields handle incident response and paint a picture of the active steps we can take to support engineers in those uncertain and ambiguous scenarios. Examples include the military, surgical trauma units, space transportation, aviation and air traffic control, and wildland firefighting.

Despite a greater focus on how failures can be prevented through more robust and fault-tolerant system design, little research explores the cognitive challenges engineers face when those preventative designs fail and they are left to think through and react to scenarios that hadn’t been imagined, much as in fields outside of technology.

Speaker: John Allspaw

CTO @Etsy

Web operations manager and engineer with 12 years of systems engineering experience. Building web operations teams and capacity planning for large-scale infrastructure projects. Data center operations, high availability, disaster recovery and general Business Continuity Planning. Outage and degradation investigation as well as PostMortem practices and policies. Capital expenditure budget management and technical project planning and delivery. Engineering culture hacker.

