A.M.B.R.O.S.I.A. - Conferring Immortality on Distributed Applications

Talk by Jonathan Goldstein

Location: APB 3105
Start: 20. September 2019 at 1:00 pm
End: 20. September 2019 at 2:00 pm

Abstract:

When writing today’s distributed programs, which frequently span both devices and cloud services, programmers face complex decisions and coding tasks around coping with failure, especially when the distributed components are stateful. If their application can be cast as pure data processing, they benefit from the past 40-50 years of work from the database community, which has shown how declarative database systems can completely isolate the developer from the possibility of failure in a performant manner. There have been some attempts at bringing similar functionality into the more general distributed programming space. A compelling general-purpose system, however, must handle non-determinism, be performant, support a variety of machine types with varying resiliency goals, and be language-agnostic, allowing distributed components written in different languages to communicate. This talk describes Ambrosia, the first system to satisfy all these requirements, which is publicly available on GitHub. We coin the term “virtual resiliency”, analogous to virtual memory, for the platform feature that allows failure-oblivious code to run in a failure-resilient manner. We also introduce a programming construct, the “impulse”, which resiliently handles non-deterministic information originating from outside the resilient component. Of further interest to our community is the effective reapplication of much database performance optimization technology to make Ambrosia more performant than many of today’s non-resilient cloud solutions.

Bio:

Over the last 20 years, I have worked at Microsoft in a combination of research and product roles. In particular, I’ve spent about 15 years as a researcher at MSR, doing fundamental research in streaming, big data processing, databases, and distributed computing. My style of working is to attack difficult problems and, through fundamental understanding and insight, create new artifacts that enable important problems to be solved in vastly better ways. For instance, my work on streaming data processing enabled people with real-time data processing problems to specify their processing logic in new, powerful ways, and also resulted in an artifact called Trill, which was orders of magnitude more performant than anything that preceded it. Within the academic community, I have published many papers, some with best paper awards (e.g. Best Paper Award at ICDE 2012) and two with test of time awards (e.g. SIGMOD 2011 Test of Time award and ICDT 2018 Test of Time award), and have also taken many organizational roles in database conferences. My research has also had significant impact on many Microsoft products, including SQL Server, Office, Windows, Bing, and Halo, and has led to the creation of entirely new products like Microsoft StreamInsight, Azure Stream Analytics, Trill, and most recently, Ambrosia. I spent 5 years building Microsoft StreamInsight, serving as a founder and architect for the product. Trill has become the de facto standard for temporal and stream data processing within Microsoft and, years after its creation, is still the most expressive and performant general-purpose stream data processor in the world. I am also an inventor of 30+ patents.