This document describes Tsync (pronounced "sink"), which provides transparent synchronization across a set of machines for existing files and directories. A transparent synchronization system makes keeping a set of files consistent across many machines, possibly with differing degrees of connectivity and availability, as simple as possible while requiring minimal effort from the user and maintaining security, robustness to failure, and fast performance.
Traditional synchronization tools, such as the popular Rsync and Unison, require that the user manually synchronize her files after changing them. Moreover, these tools are designed to only synchronize a pair of hosts: if the user wishes to synchronize N machines, then she must run the tool N-1 times. Not only is it inefficient to unicast the same data N-1 times, but the user is also burdened with remembering to restart synchronizations that are interrupted and manually recovering failed hosts.
Tsync will solve the problem of providing transparent synchronization under the assumption of optimistic consistency. Optimistic consistency assumes that the same file is not modified on two hosts at the same time. In the Tsync usage model, the user writes a simple configuration file, similar to /etc/exports, that describes which directories should be synchronized and lists one or more other hosts that are part of the Tsync group (this list does not have to contain all the hosts in the group). The user runs the Tsync daemon, tsyncd, on each machine in the group. Then, when the user creates, modifies, or deletes files on one machine, those changes are automatically propagated to all the others. For example, if the user adds a bookmark on her machine at the university, it is reflected on her desktops at home. If some computers are not connected at that moment (for example, if her laptop is powered off), each disconnected machine automatically learns about the change and updates itself the next time it regains connectivity.
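To make the usage model more concrete, here is a minimal sketch of what such a configuration file and a parser for it might look like. The file syntax (one directory per line, followed by one or more bootstrap hosts), the names `SyncEntry` and `parse_tsync_config`, and the example paths and hostnames are all assumptions made for illustration; the format actually read by tsyncd is not specified here.

```python
from dataclasses import dataclass, field


@dataclass
class SyncEntry:
    """One synchronized directory plus the bootstrap hosts listed for it."""
    directory: str
    bootstrap_hosts: list[str] = field(default_factory=list)


def parse_tsync_config(text: str) -> list[SyncEntry]:
    """Parse a HYPOTHETICAL exports-style config: '<directory> <host> [<host> ...]'."""
    entries = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue  # skip blank and comment-only lines
        directory, *hosts = line.split()
        if not hosts:
            raise ValueError(f"line {lineno}: at least one bootstrap host is required")
        entries.append(SyncEntry(directory, hosts))
    return entries


# Illustrative content only; paths and hostnames are made up.
EXAMPLE_CONFIG = """
# directory            bootstrap host(s)
/home/alice/bookmarks  desktop.example.org
/home/alice/papers     desktop.example.org laptop.example.org
"""

entries = parse_tsync_config(EXAMPLE_CONFIG)
for entry in entries:
    print(entry.directory, "->", ", ".join(entry.bootstrap_hosts))
```

Note that each entry names only a subset of the group. In this sketch, as in the usage model above, a host needs just one reachable bootstrap peer and is expected to learn about the remaining hosts at run time.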
A synchronization system for widely distributed hosts faces scalability and reliability challenges. The system must gracefully scale to accommodate tens or even hundreds of hosts. To keep the system simple to manage, the user cannot be required to manually configure each host with every other host. Hosts must have a way of learning about other hosts, as well as a way of efficiently distributing control messages and data to all other hosts. Furthermore, the system must automatically adapt as hosts are powered off, lose connectivity, or crash, and must rapidly re-synchronize these computers when they re-join. Similarly, adding new hosts should be a simple process, and they should rapidly be brought up-to-date.
The design of Tsync uses peer-to-peer and overlay techniques to provide scalable and efficient mechanisms for transparently synchronizing many hosts. Tsync organizes a user's machines into an overlay network with a tree topology. The overlay network, through probing and a root fail-over protocol, ensures that each node remains connected with all other connected nodes. The overlay network also provides a scalable means by which a Tsync node can learn about other hosts, besides the bootstrap host with which it was configured. The tree topology allows any Tsync host to efficiently multicast a message to all the other hosts. The overlay also handles authentication and encryption: hosts authenticate each other using RSA keys, and all data is encrypted using TLS.
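The idea of multicasting over a tree overlay can be illustrated with a small sketch. It is not Tsync's protocol: the `Overlay` class, the host names, and the forwarding rule (send to every tree neighbor except the one the message arrived from) are assumptions chosen to show the general technique, and probing, root fail-over, authentication, and encryption are not modeled.

```python
from collections import defaultdict


class Overlay:
    """Undirected tree of hosts; each host only knows its direct neighbors."""

    def __init__(self):
        self.neighbors = defaultdict(set)

    def add_link(self, a, b):
        """Record a bidirectional tree edge between hosts a and b."""
        self.neighbors[a].add(b)
        self.neighbors[b].add(a)

    def multicast(self, origin, message):
        """Forward message from origin along tree edges.

        Returns (delivered, transmissions): a host -> message map and the
        total number of link-level sends that were needed.
        """
        delivered = {origin: message}
        transmissions = 0
        pending = [(origin, None)]  # (host, neighbor the message arrived from)
        while pending:
            host, came_from = pending.pop()
            for peer in self.neighbors[host]:
                if peer == came_from:
                    continue  # never send back along the incoming edge
                delivered[peer] = message
                transmissions += 1
                pending.append((peer, host))
        return delivered, transmissions


# Hypothetical five-host group arranged as a tree.
overlay = Overlay()
overlay.add_link("desktop", "laptop")
overlay.add_link("desktop", "office")
overlay.add_link("office", "lab1")
overlay.add_link("office", "lab2")

delivered, transmissions = overlay.multicast("lab1", "bookmarks changed")
print(sorted(delivered), transmissions)
```

Because the topology is a tree, every host receives the message exactly once, and the N-1 sends are spread across the hosts rather than all issued by the originating machine. In this example `lab1` sends a single copy, to `office`, which forwards it onward.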