Home About MIT CC BY-SA 4.0 RSS

Concurrency Control in Groupware Systems

What follows is an overview of a 1989 journal article by Ellis and Gibbs regarding collaborative editing. It is the origin of the use of operational transforms. For me, it is particularly useful as the dOPT algorithm for reconstructing a cohoerent global state is introduced here. The global state, in the case of the concurrent text editor, is the contents of the document. In more complex scenarios, the state can consist of much more ill-defined properties – in the case of Google Docs, the state would consist of not only the document text, but also the style rendered (bold, italic, etc.). Indeed, much of the work in finally implementing a robust operational transform framework was actually done while developing Google Wave.

In my case, I am developing a code editor in the style of nano or vi – the global state here is restricted to the original case of plain-text. Other features, such as syntax highlighting, does not change the global state, and therefore does not need to be communicated to others; indeed, if the endpoints are identical, then assuming a coherent global state, each endpoint can reconstruct the proper (identical) syntax highlighting by examining the text.

Definitions

The article defines groupware systems as related to the scope of the paper, as a piece of software which can support real-time editing of a shared resource (shared context), and the changes made by one client (participant) can be seen by all parties by default. That is, if client A were to edit the shared resource (for sake of concreteness, imagine a Word document), other connected clients B, C, and D must see the changes reflected in their own local interfaces, in a time period which is close to local latency. That is, the effective remote latency, from time of input from A to output on B (notification time), must be on the same order of magnitude as the latency from input on B to output on B (response time).

Such a groupware system G is specified uniquely by the set of clients S connected to the system, and O the set of operators which can alter the shared state of the system:

G = < { S_1, ..., S_n }, { O_1, ..., O_m } >

The paper is very convenient in my case, as it demonstrates the usability of operational transforms on text editing – so I can lift large portions of the examples given to suit my own needs. In the case of the set of operations, we are given two such examples:

O_1 = insert[c, i]
O_2 = delete[i]

An application of an operator (that is, when the operator is given arguments) is called an operation. Each client has (among other things) an active operation queue, of operations not yet acted upon (gathered from both remote and local input), and a logged queue, which keeps track of all previous operations acted upon. The crux of operational transform is to:

  1. determine which operation in the queue to act upon first,
  2. where it fits chronologically within the log queue,
  3. transform the operation based how far back it should be in the queue, and
  4. act upon the transformed operation.

All other data structures listed in the algorithm are merely for the sake of enacting the previous protocol.

Step three is very important, as this is the hardest part to specify in general – as per the paper, this is application-dependent. In the text, we are thankfully given the transformation matrix for the insert and delete operations (p404 - 5).

Limitations

As we will see in the next paper, we will still need to correct this algorithm to suit our needs – however, this is a very good start. Moreover, this paper assumes

  1. static number of clients, and
  2. reliable network.

We will have to extend dOPT to take care of these two points of contention; conveniently, the next paper addresses these two points succinctly as well.

Sources

  1. Concurrency Control in Groupware Systems (Ellis and Gibbs, 1989)