Operational Transform
For the past few months, I’ve been attempting to develop a collaborative code
editor in the command line. What that entails essentially is a Google Doc-like
interface, but using ncurses
, allow the program to be used in the terminal.
There are of course software out which allows editing collaboration away from
the browser (a partial list can be found on
Wikipedia),
but a scant few offer this capability from the command line. What I want to do
is build a vi
-like editor which can be connected to from halfway around the
world, as well as locally.
Needless to say, it’s been a lot harder than I anticipated.
Currently, I’m working on writing the final protocal needed to sync communication between an editing server with the remote client – how does the client efficiently send data to the server, and how does the server relay to the client the global state of the text document? I’ve made some headway, and I would like to document it for posterity.
Given a server S with read-write permission on a file F, and clients A, B both connected to S, we need a way to signal to S:
client::insert("some text")
client::delete(n)
The server in turn needs to be able to support:
server::update(client)
,which will update the given client to the correct global state.
To solve this problem, I hatched out a crude versioning solution, which brought me very close to a final answer:
- The server must keep track of a boolean
client_is_synced
on its end, one per client connected. - The server must keep track of state changes to each client – if client A sends a command to the server which will alter the state of client B, the server must update its internal client B’s state list.
- There exists a
SYNC
message with data and a unique, increasing id. - The server shall send a
SYNC
message to synchronize a client state with the global state of the file. - The client keeps a version number property, of the last server
SYNC
message. - Before sending a
SYNC
message to a client, the server shall markclient.client_is_synced
to false. - Upon receiving a
SYNC
message from the server, the client must send anACK
response to the server, with the message id in theSYNC
packet. - If the message is received by the server, and the message id within the
ACK
response corresponds to the latest state version, the server marksclient.client_is_synced = 1
- If the message is not received by the server (dropped during either the original send or ack), the client continues to be marked unsynced by the server.
- If the unsycned client sends some communique to the server in the future, the server must attempt to recover the client.
The recovery protocol is as follows:
- Server sends
DESYNC
error to the client. - Client responds with
SYNC
request (not an acknowledgement), with its version number, the number of the lastSYNC
packet sent by the server. - The server then responds to this request by sending the client
SYNC
packets (or a mergedSYNC
packet) and with the last version number. - The client updates its state accordingly, and must
ACK
the serverSYNC
. - The client then attempts to resend the original request (which may be altered from the original request due to state changes), or drop the request and await further user instructions.
The main puzzle of this protocol is how to represent the state changes of the
client and server. An obvious answer to this would be to use diff
and patch
magic to reconstruct the client state. The downside, however, is that each
client state can be upwards of 10 - 20K in length (roughly the maximum size of
an editing screen) – the problem of this is that diff starts slowing down
dramatically at these sizes – we’ll need a different way to log the changes. A
possible library to use in this scenario,
dtl-cpp implemented an algorithm detailed
in An O(NP) Sequence Comparison Algorithm (Wu et. al 1989); however, past 20%
buffer replacement (something that happens during a scrolling event), times
measured was at about .2s per diff calculation, which is prohibitively large.
Currently, I am considering the operational
transformation
route, which is what Google Docs uses to update client states. Essentially,
rather than calculating the string diff as a whole, I will need to broadcast the
microtransactions made by every client without a lot of merging edits: if
clients A and B both are operating on the same relevant data, then effectively,
a call to A.insert
will result in client B receiving a command for it to also
insert the appropriate data into its own state, sent by the server upon
receiving the packet from A. Updates to follow.
Resources
These are meant to be an index of resources for myself to use in the future.
- Concurrency Control in Groupware Systems (Ellis and Gibbs 1989) This is the original paper on operational transform, with proposed implementation (“dOPT”).
- A Counterexample to the Distributed Operational Transform and a Corrected Algorithm for Point-to-point Communication (Cormack 1995) This cites an oversight to the original dOPT algorithm and proposes changes to correct this oversight.
- dOPT (Barnes 2005) Presentation in simple English describing the oversight.