[Twisted-Python] How do I debug this network problem?

Peter Westlake peter.westlake at pobox.com
Thu Nov 13 05:15:28 MST 2014


TL;DR: how do I debug the sequence of events between an AMP answer box
arriving at a NIC and AMP firing the callRemote Deferred?

I have an application with two processes, on separate machines,
communicating using AMP. One process does a callRemote, which returns a
Deferred, which is never fired. I know from tcpdump that the AMP answer
box arrives safely at the network interface card.

The problem can't easily be reproduced, so instead I want to ask a
specific question: how do I debug the data path from the NIC to AMP
firing its Deferred?

I've had a look at the code, and got rather lost amongst the interfaces
and inheritance and protocols and transports. If someone can help me
narrow down the relevant bits of code, I can put in some Python tracing.
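
As a rough illustration of the kind of Python tracing meant here, the
sketch below wraps named methods on a class so that every call, and any
exception raised from it, is recorded. The helper trace_methods and the
StandInProtocol class are hypothetical names made up for this example;
the stand-in only exists so the sketch runs without Twisted. On a real
system the likely candidates to wrap would be dataReceived and
ampBoxReceived on twisted.protocols.amp.AMP, assuming the installed
Twisted version routes incoming boxes through those methods.

```python
import functools


def trace_methods(cls, names, sink):
    """Wrap each named method of cls so calls and exceptions land in sink."""
    for name in names:
        original = getattr(cls, name)

        def make_wrapper(name, original):
            @functools.wraps(original)
            def wrapper(self, *args, **kwargs):
                # Record the call before running it, so a hang inside the
                # method still leaves a trace of how far the data got.
                sink.append(("call", name, args))
                try:
                    return original(self, *args, **kwargs)
                except Exception as exc:
                    sink.append(("raise", name, repr(exc)))
                    raise
            return wrapper

        setattr(cls, name, make_wrapper(name, original))


# Hypothetical stand-in for a protocol class (not Twisted's AMP).
class StandInProtocol:
    def dataReceived(self, data):
        self.ampBoxReceived({"raw": data})

    def ampBoxReceived(self, box):
        self.last_box = box


events = []
trace_methods(StandInProtocol, ["dataReceived", "ampBoxReceived"], events)
StandInProtocol().dataReceived(b"answer")
for event in events:
    print(event)
```

If the trace shows dataReceived being called but nothing after it, the
loss is inside the protocol code; if dataReceived is never called, the
problem lies below it, between the kernel socket and the reactor.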

FWIW, this is happening on Debian Squeeze and Wheezy, on VMs hosted on
Xen 6.5. It only happens on some specific machines, and only sometimes.
The same code has run flawlessly for many years elsewhere, though this
same bug did happen there too some years ago. That time, it went away
after most of the software in the system was upgraded. I tried that this
time - Debian Squeeze to Wheezy, with associated kernel, Python and
Twisted versions - but the problem persists. Anyway, I don't want to
make the problem go away without understanding it, for fear that it will
come back a third time.

Peter.


