A Counterexample to dOPT (Cormack 1995)

10 Sep 2014

writeup, dOPT, journal notes, and giga

Cormack’s 1995 paper shows a flaw in the original dOPT algorithm, in showing that the strength of the algorithm – the completely decentralized nature of the messaging protocol – leads to divergent states across different clients. Fortunately, Cormack outlines a possible way to fix the algorithm, provided that message passing between nodes is retricted to a point-to-point protocol – that is, we can no longer afford the luxury in a network topology of having circular circuits. Fortunately, for our collaborative editing needs, having a p2p network is actually counterintuitive to our needs, and will not inhibit our needs.

The strength of the Cormack approach for our editing needs is that the point-to-point protocol in the paper is quite easy to understand, and can be applied to a server-client network model in a trivial manner: a server-client topology is just a star topology, with the hub of the star as the server. A brief conversation with Dr. Cormack indeed confirmed this –

Me I am interested in creating a simple collaborative text editor for a personal project, and I was reading your paper A Counterexample to dOPT (1995) – you mentioned in the paper a way to extend the point-to-point communications protocol into a server-client model, but I wasn’t too clear as to how that would be exactly implemented, and was wondering if you had any pointers in this area.

Gordan Cormack …If you implement your system as a tree (of which a simple star with a central hub is a trivial case) you have no problem. Just use the two-player version on every connection. That is, input an edit, apply the transform appropriate to that link, and then send the transformed op to all other neighbours.

Me Just to clarify, each link would have its own local log, and only one log would be updated per transform? That is, the central node in a star topography would have n logs, and upon receiving an update u from site i, transform u by searching through log_i, updating only log_i, and broadcast this change to all links? I was under the assumption that the log would need to be flattened because it’s suppose to be a global edit history – if I’m searching through a link-specific log, wouldn’t that lead to state inconsistencies?

Gordan Cormack No you need a log for each connection. So in the star topology, the hub would be responsible for receiving each incoming update, transforming it (with the neighbor-specific log), and then sending the transformed updates to each of the others (who would each transform it with their respective logs).

Unfortunately, the meat of this article – the corrected algorithm – needs to be slightly cleaned up in notation. I have done so below. The “Figure” reference refers to the corresponding figure within the paper, and the page refers the the page on which the figures occur.

Type Declarations:
	sit_t : int                                     (* set of site IDs *)
	obj_t : list of bits                            (* the type of the shared context *)
	upd_t : obj_t -> obj_t
	vec_t : dict of {sit_t : int}                   (* state vector *)
	pri_t : int                                     (* priority *)
	log_t : list of (sit_t, vec_t, upd_t, pri_t)

Client Immutables:
	S   : sit_t                                     (* unique ID per client *)
	X_0 : obj_t                                     (* the initial value *)
	T   : (upd_t, upd_t, pri_t, pri_t) -> upd_t     (* transformation matrix *)
	P   : (sit_t, upd_t, ...) -> pri_t              (* priority -- this isn't important *)

Client Mutables:
	(X : obj_t) := X_0                              (* local object copy *)
	(L : log_t) := nil                              (* local history log *)
	(Q : log_t) := nil                              (* local operation queue -- this is implicit *)
	(V : vec_t) := { 0 : 0, 1 : 0, ..., n : 0 }     (* local state vector *)

Events:
	local_enqueue(U : upd_t) =
		Q := (S, V, U, P(S, X, ...))::Q
		broadcast (remote_update(S, V, U, P(S, X, ...))

	remote_enqueue(s : sit_t, v : vec_t, u : upd_t, p : pri_t) =
		Q := (s, v, u, p)::Q

	(* given an update from the queue *)
	update(s : sit_t, v : vec_t, u : upd_t, p : pri_t) =
		(* Lamport ordering *)
		if v <= V:
			for each (s', v', u', p') in L
				if v'[s'] > v[s']
					(* transform the operation *)
					u := T(u, u', p, p')
			L := (s, v, u, p)::L
			(* update local context *)
			X := U(X)
			(* update local command count (equivalent to incrementing a timestep) *)
			V[s] := V[s] + 1
			(* update state *)
			X := u(X)

Figure 1. dOPT algorithm (p2)

Type Declarations:
	sit_t : int
	obj_t : list of bits
	upd_t : obj_t -> obj_t
	vec_t : dict of {sit_t, int}
	log_t : list of upd_t                           (* note that this is **different** *)
	pri_t : sit_t
	q_t   : list of (sit_t, vec_t, upd_t)           (* this is **different** than log_t *)

Client Immutables:
	S   : sit_t                                     (* 0 if server, non-zero if client *)
	X_0 : obj_t
	T   : (upd_t, upd_t, pri_t, pri_t) -> upd_t

Client Mutables:
	(X  : obj_t) := X_0
	(V  : vec_t) := { 0 : 0, S : 0 }                (* see post for details *)
	(Q : q_t) := nil
	(L  : log_t) := nil

Events:
	local_enqueue(U : upd_t) =
		Q := (S, V, U)::Q
		broadcast (remote_update(S, V, U)

	remote_enqueue(s : sit_t, v : vec_t, u : upd_t) =
		Q := (s, v, u)::Q

	update(s : sit_t, v : vec_t, u : upd_t) =
		if v <= V:
			(* shifts the log -- there exists a typo in the original *)
			L[(V[s] + v[S] + 1):(V[s] + V[S] + 1)] := L[(V[s] + v[S]):(V[s] + V[S])]
			L[V[s] + v[S]] := u
			for each k in ((V[s] + v[S] + 1), (V[s] + V[S] + 1))
				U := L[k]
				L[k] := T(U, u, S, s)
				u := T(u, U, s, S)
			V[s] := V[s] + 1
			X := u(X)

Figure 4. corrected dOPT algorithm (p4)

Sources

A Counterexample to the Distributed Operational Transform and a Corrected Algorithm for Point-to-point Communication (Cormack 1995)