[Twisted-Python] How to debug an AMP connection?

Peter Westlake peter.westlake at pobox.com
Mon Aug 10 16:25:05 MDT 2009


On Sat, 08 Aug 2009 16:32 +0000, exarkun at twistedmatrix.com wrote:
> On 5 Aug, 03:59 pm, peter.westlake at pobox.com wrote:
> >Background: I have a client program that calls callRemote, but the
> >Deferred that callRemote returns is not fired. This is an intermittent
> >error that only happens after some hours of traffic.
...
> >The client sends a request.
> >The server sends a reply with the same tag, and logs a message.
> >Until it goes wrong, the client receives the reply and logs it.
> >When it goes wrong, the client does not see the reply.
> >
> >The protocol has the unfired Deferred in _outstandingRequests,
> >with the missing tag as key.
> >
> >All this suggests that the problem is either in the low-level
> >network code, or somewhere in the network between the client
> >and server. But doesn't TCP/IP tell you if a packet doesn't
> >get through?
> 
> It doesn't really /tell/ you.  But if there is some issue with
> the network that prevents packet delivery for long enough, then
> the connection breaks (giving you a connectionLost call in Twisted,
> which should errback your Deferred).
> >
> >I tried tracing the packets using a python-pycapy script (attached),
> >but it showed more packets disappearing than actually were.
> 
> It might be simpler to use something like wireshark, which already
> knows how to do TCP stream re-assembly and such.
> 
> I know this isn't much of a suggestion, but it's the only thing that
> really comes to mind.

No, it's a good suggestion - the only problem is that it makes the
problem happen far less often! What I've done is to capture the first
few bytes of each packet, enough to show the whole of the AMP reply
packets (which are just {'_answer':tag, 'status':True}) without any
need for full-scale analysis and reassembly.

I'm pleased to say that after many hours of watching tcpdump,
the bug eventually happened again just minutes before I left
for a long weekend. Sure enough, TCP/IP delivers the packet
successfully. Tomorrow's first task is going to be putting
tracing into the low-level code where the reactor is meant
to receive the packet and call the AMP _outstandingRequests
entry for it.

It would have been almost impossible to get even this far
without the Manhole feature. I've been using it all along
to inspect the workings of the client code, and it was a big
help on the server side too. The server handles a lot of
connections, and I was able to set a debug flag in the protocol
object for the connection I was tracing. There would have been
very much more log data otherwise. It makes a big difference
being able to see values of variables without having to add
new log statements and spend another working day waiting for
the bug. Whoever invented it - thank you!

Peter.




More information about the Twisted-Python mailing list