Large-scale applications for mobile devices and the Internet of Things run in stressful real-world environments: they experience both a continuous background of faults and bursts of high fault rates. Typical faults are node crashes, network partitions, and communication delays. We give a principled way to build applications that survive in such environments, using the concepts of Reversibility and Phase. A system is Reversible if the set of operations it provides depends only on its current fault rate and not on the history of the fault rate. Reversibility generalizes standard fault tolerance by using nested fault models: when the fault rate goes outside one model, it is still inside the next. Phase is a per-node property that gives a qualitative indication of which system operations are available at each node, given the current fault rate. Phase can be determined with no additional distributed computation. We present two case studies. First, we present a transactional key-value store built on a structured overlay network and explain how to make it Reversible. Second, we present a distributed collaborative graphic editor built on top of the key-value store and explain how to make it Phase-Aware, i.e., able to optimize its behavior according to a real-time observation of phase at each node through a Phase API. Together, the case studies show the usefulness of Reversibility and Phase-Awareness for building large-scale Internet applications.
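To make the two definitions concrete, the following is a minimal sketch in Python and not the Phase API of the systems described above. The phase labels (`CONNECTED`, `DEGRADED`, `ISOLATED`), the reachability threshold of 0.8, the operation names, and the functions `local_phase` and `available_operations` are all assumptions introduced for illustration. The sketch shows a node deriving its phase from purely local information, and the set of available operations being a function of the current phase only, so that it does not depend on the fault history.

```python
from enum import Enum


class Phase(Enum):
    # Hypothetical phase labels, chosen for illustration only.
    CONNECTED = "connected"
    DEGRADED = "degraded"
    ISOLATED = "isolated"


def local_phase(reachable_neighbors: int, expected_neighbors: int) -> Phase:
    """Derive a node's phase from local observations only.

    The inputs are counts the node already has, so no extra distributed
    computation is needed. The 0.8 threshold is an assumed value.
    """
    if expected_neighbors <= 0:
        raise ValueError("expected_neighbors must be positive")
    ratio = reachable_neighbors / expected_neighbors
    if ratio >= 0.8:
        return Phase.CONNECTED
    if ratio > 0:
        return Phase.DEGRADED
    return Phase.ISOLATED


# Hypothetical mapping from phase to the operations an application offers.
# The sets are nested: each harsher phase keeps a subset of the operations.
OPERATIONS = {
    Phase.CONNECTED: frozenset({"transactional_write", "read", "local_edit"}),
    Phase.DEGRADED: frozenset({"read", "local_edit"}),
    Phase.ISOLATED: frozenset({"local_edit"}),
}


def available_operations(phase: Phase) -> frozenset:
    """Return the operations for the current phase, ignoring past phases."""
    return OPERATIONS[phase]


if __name__ == "__main__":
    # A fault burst followed by recovery: the node sees 10, then 0,
    # then 10 reachable neighbors out of 10 expected.
    for reachable in (10, 0, 10):
        phase = local_phase(reachable, 10)
        print(phase.name, sorted(available_operations(phase)))
```

Because `available_operations` takes only the current phase as input, a node that passes through a fault burst and recovers offers the same operations as it did before the burst, which is the behavior the Reversibility definition asks for. A Phase-Aware application could call `local_phase` periodically and, for example, fall back to local edits while the node is isolated.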
 Ruma R. Paul, Peter Van Roy, and Vladimir Vlassov. Reversible Phase Transitions in a Structured Overlay Network with Churn. NETYS 2016, Marrakech, Morocco, May 18-20, 2016.
 Ruma R. Paul, Peter Van Roy, and Vladimir Vlassov. Interaction Between Network Partitioning and Churn in a Self-Healing Structured Overlay Network. ICPADS 2015, Melbourne, Australia, Dec. 14-17, 2015.