[Twisted-Python] Re: Handling PBConnectionLost errors

Daniel Miller daniel at keystonewood.com
Mon Jul 30 10:00:13 EDT 2007


You have gone above and beyond my expectations to answer my questions.
Thank you.

On Jul 28, 2007, at 1:07 AM, David Bolen wrote:

> Daniel Miller <daniel at keystonewood.com> writes:
>> Is this such a stupid question that it doesn't even warrant a  
>> response?
>> ~ Daniel
> I agree with the other comment to the effect that the lack of response
> may be more due to the underlying complexity of the question as to
> lack of interest. ...

It's funny, my question was complex, but it nevertheless contained  
too many assumptions about my application and environment to allow  
you to answer easily. Thanks for taking a stab at it anyway.

> For example, your opening point about:
>>>                       (...)                                 It
>>> would be nice to implement a fail-safe(er) way of calling remote
>>> methods that would retry when necessary until the remote method has
>>> been called successfully and the result has been returned.  (...)
> has an implicit assumption that the remote method will even continue
> to exist once the disconnect has occurred - something that is by no
> means guaranteed with PB.

I hadn't even thought of that, although now that you point it out  
it's obvious. My (server-side) application is just a singleton facade  
to an accounting system database. I'm posting orders from an order  
entry system to invoices in the accounting system. The
server-supplied "referenceable" will always be available assuming something  
terrible has not happened to the server (e.g. crashed, hacked or  
physically damaged--none of which are things I'm trying to solve here).
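
To make the retry idea concrete, here is a rough sketch of the kind of
helper in question. It is not PB code: it uses plain asyncio so that it
runs on its own, ConnectionLost is a hypothetical stand-in for
PBConnectionLost, and get_root and call_with_retry are made-up names.
It assumes the server-supplied object can always be fetched again after
a reconnect.

```python
import asyncio


class ConnectionLost(Exception):
    """Hypothetical stand-in for twisted.spread.pb.PBConnectionLost."""


async def call_with_retry(get_root, method, *args, attempts=5, delay=0.0):
    """Call `method` on the server-supplied object, retrying on disconnect.

    `get_root` is a hypothetical coroutine function that (re)connects and
    returns the remote object. It is awaited again on every attempt,
    because a reference obtained before a disconnect is no longer usable.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            root = await get_root()
            return await getattr(root, method)(*args)
        except ConnectionLost as exc:
            # Remember the failure, wait, then try again with a new reference.
            last_error = exc
            await asyncio.sleep(delay)
    # Every attempt failed: surface the last disconnect to the caller.
    raise last_error
```

One caveat with any helper like this: if the connection drops after the
server has run the method but before the result arrives, a retry runs
the method a second time. For posting orders to invoices, the remote
method would need to tolerate being called twice for the same order.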

> Perhaps some earlier messages of mine when we had just finished
> putting together the remote wrapping and reconnect support in our
> system.  See my responses to the thread at:
> http://twistedmatrix.com/pipermail/twisted-python/2005-July/011030.html
> and
> http://twistedmatrix.com/pipermail/twisted-python/2005-July/011046.html

Thanks, I'll take a look at them.

> It hits on topics beyond that of just a reliable method call, but the
> second message more specifically talks about the wrapper that
> implements reconnections, and how we dealt with updating references
> post-reconnect.  You can probably see how the design dovetailed with
> our particular server side structure (the registry was persistent as
> were the managers, so they provided the concrete point of
> reattachment).  And the use of the wrappers around references meant we
> could "correct" the wrappers for a new connection without having to
> worry about what parts of the client application may have been holding
> references.  Perhaps it will give you some other ideas in your own
> system.

This sounds good. I think my setup is similar enough that I will at
least be able to gain some good ideas.
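
As an illustration of the wrapper idea described above, here is a guess
at its general shape in plain Python. This is not David's actual code:
RemoteWrapper, ConnectionManager, and the lookup callback are all
hypothetical names, and real PB calls would return Deferreds rather
than plain values.

```python
class RemoteWrapper:
    """Stable handle that client code holds on to.

    The live remote reference behind it can be swapped after a reconnect,
    so client code never needs to replace the wrapper itself.
    """

    def __init__(self, name):
        self.name = name      # key used to look the remote object up again
        self._target = None   # current live reference, or None if disconnected

    def rebind(self, target):
        self._target = target

    def call(self, method, *args):
        if self._target is None:
            raise RuntimeError("%s is not connected" % self.name)
        return getattr(self._target, method)(*args)


class ConnectionManager:
    """Hypothetical client-side tracker of every wrapper handed out."""

    def __init__(self):
        self._wrappers = []

    def wrap(self, name):
        wrapper = RemoteWrapper(name)
        self._wrappers.append(wrapper)
        return wrapper

    def reconnected(self, lookup):
        # `lookup` maps a name to a fresh reference on the new connection.
        for wrapper in self._wrappers:
            wrapper.rebind(lookup(wrapper.name))
```

Because the rest of the client only ever holds RemoteWrapper objects,
calling reconnected() once after a new connection is made updates every
reference in use, wherever it happens to be stored.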

> For your other points:
>>> I have two questions:
>>> 1. Does something like this already exist?
> <snip>
> ... I'm not aware of any existing approach that is generally suitable
> for any application.  I rather doubt any single generic approach would
> be possible, since PB provides for many mechanisms of statement
> management and referenceability among servers and clients.

You're probably right, although the problem domain is interesting  
enough to me that I may try to see what I can do if I ever get enough  
time :)

>>> 2. Is this a totally stupid idea? (would it be better to improve
>>> our physical network than to try to band-aid the problem with
>>> something like this?)
> It's never a stupid idea to engineer for network interruptions, but
> like everything else a design must weigh benefits against
> cost/development.  With that said, it might not be a bad idea to also
> look into your network.  TCP connections are rather hard to break just
> due to network transmission problems, and all your PB calls are going
> across a single TCP session.  They might be significantly delayed on a
> bad network, but the connection itself shouldn't fail unless something
> more extreme (and unusual) is happening.  Given the level of problems
> you're encountering, I wouldn't be surprised if something else was
> awry.

That's what I thought (the connections shouldn't just be dropping for  
no apparent reason, especially since they are all within the bounds  
of a LAN). I know this is getting off topic, but I thought maybe  
you'd know: collisions on the hub should be handled by TCP, and my  
application should not have to worry about them, correct? Even that  
doesn't answer why there are dropped connections on the switched side  
of the network. Maybe we have some bad wiring? FWIW, I am planning to
replace the hub with another switch (there are other problems
as well).

Again, thanks very much for your well-thought-out response.

~ Daniel
