[Twisted-Python] pb.Cacheable vs. offline queuing

Sat Nov 5 20:29:41 MST 2005

On 11/5/05, Brian Warner <warner at lothar.com> wrote:

> > the two servers (public and office) must both keep serving users, even if
> > the connection between them is severed.
>
> Be aware that trying to accommodate disconnected operation is like planting
> kudzu in your flowerbed with a promise that you'll "keep an eye on it". Each
> time you think you've figured out everything there is to know about
> distributed synchronization, you discover a new academic paper explaining a
> new horrible failure mode that you're vulnerable to. The depths waiting to
> be plumbed know no bounds. (the problem is basically equivalent to merging in
> a distributed version control system, and look at how many of *those* we've
> got running around :).

Been there, ran away from that :)  In this case, all data and business
logic (relevant to the syncing) are narrowly defined, and I plan to
treat any data collisions the same way any sane merging system does --
let the user figure it out (with an iSync-esque conflict resolver
interface).  I don't want to spend x amount of time building the
system and then 3-4x that writing syncing logic.

> Worse yet, the same problems actually exist in always-connected operation,
> it's just that you can trick yourself into thinking you can ignore them more
> often. Unidirectional dataflow is at least four orders of magnitude easier to
> deal with.

This system isn't symmetric, thankfully, and only certain types of
data will be updatable on both ends.  While connected, both systems
won't be doing quite the same thing (for the most part) so the window
for collisions while connected is incredibly small.  The same objects
would be edited, but not the same attributes on each end.
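
To make the "same objects, different attributes" point concrete, here is
a rough sketch of an attribute-level three-way merge.  Everything in it
is hypothetical (merge_attributes, the dict-based records, the field
names); it only illustrates that edits to different attributes merge
cleanly, while edits to the same attribute get handed to a person.

```python
_MISSING = object()  # marks an attribute that is absent on one side


def merge_attributes(base, local, remote):
    """Three-way merge of one record's attributes (plain dicts).

    base   -- attributes as of the last successful sync
    local  -- attributes on this server now
    remote -- attributes on the other server now

    Returns (merged, conflicts).  conflicts maps an attribute name to
    (local_value, remote_value) for a human to resolve; the merged
    dict keeps the base value for those attributes in the meantime.
    """
    merged = {}
    conflicts = {}
    for key in set(base) | set(local) | set(remote):
        b = base.get(key, _MISSING)
        l = local.get(key, _MISSING)
        r = remote.get(key, _MISSING)
        if l == r:        # both sides agree (or neither changed it)
            value = l
        elif l == b:      # only the remote side changed it
            value = r
        elif r == b:      # only the local side changed it
            value = l
        else:             # both changed it, differently: collision
            conflicts[key] = (l, r)
            value = b
        if value is not _MISSING:
            merged[key] = value
    return merged, conflicts


base = {"status": "open", "notes": ""}
office = {"status": "done", "notes": ""}
public = {"status": "open", "notes": "photo attached"}
merged, conflicts = merge_attributes(base, office, public)
```

With the sample records above nothing collides, so conflicts comes back
empty; a collision only shows up when both sides change the same
attribute to different values.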

> > any collisions would be dealt with manually by the office staff once the
> > connection came back up.
>
> Good. You might still have a chance to retain your sanity :).

Sanity is overrated :)

I don't plan to spend a lot of time writing code to deal with
synchronization, though, 'cause it's both tedious and complex at once.

Offline operation is *not* going to be nominal operation, thanks to
the relatively good connectivity between the two servers.  This is
mostly to ensure that the office doesn't shut down if their hosted
server's connection dies, and so that end users are still able to do
things if their office connection goes out, like attach media to jobs,
complete forms, check status, modify miscellaneous info, etc.

> Yeah, Cacheable won't be enough for you. My hunch is that you'll be
> implementing enough new code that you might as well not bother building it on
> top of pb.Cacheable: start with a pb.Referenceable and a remote_acceptUpdate
> method, an outboundUpdates() queue, and some setter methods to trigger
> outbound update messages.

That sounds fine, actually.  I'm not trying to build a general-purpose
diff engine here, just some specific things, so I'm probably better
off ignoring Cacheable and focusing on the logic instead.
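
For the archives, a minimal sketch of the shape described above, kept
free of Twisted so it runs on its own.  The class name SyncedRecord, the
set/get/flush methods and the peer callable are all made up for
illustration; in real code the class would presumably subclass
pb.Referenceable, and the peer would be a RemoteReference called via
callRemote("acceptUpdate", ...), with the queue entry dropped only once
the returned Deferred fires.

```python
from collections import deque


class SyncedRecord:
    """Hypothetical record with an outbound update queue.

    Assumption: `peer` is a plain callable(name, value) standing in
    for the remote side; it is None while the link is down.
    """

    def __init__(self, **attrs):
        self._attrs = dict(attrs)
        self._outbound = deque()  # updates not yet delivered
        self.peer = None

    def get(self, name):
        return self._attrs[name]

    def set(self, name, value):
        # Setter: apply locally, queue the update, try to send it.
        self._attrs[name] = value
        self._outbound.append((name, value))
        self.flush()

    def remote_acceptUpdate(self, name, value):
        # Update from the other side: apply it without re-queuing,
        # so the two ends do not echo updates back and forth.
        self._attrs[name] = value

    def outboundUpdates(self):
        # Snapshot of the updates still waiting for a connection.
        return list(self._outbound)

    def flush(self):
        # Deliver queued updates in order while a peer is attached.
        # An entry is removed only after the send returns.
        while self.peer is not None and self._outbound:
            name, value = self._outbound[0]
            self.peer(name, value)
            self._outbound.popleft()


office = SyncedRecord(status="open")
public = SyncedRecord(status="open")

office.set("status", "done")       # link is down: the update queues
queued = office.outboundUpdates()

office.peer = public.remote_acceptUpdate  # link comes back up
office.flush()
```

This deliberately skips collision handling; it only shows the queue
filling while disconnected and draining on reconnect.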

Thanks for the advice :)

J.