FW: [Twisted-Python] strange server crash

Alec Matusis matusis at yahoo.com
Wed Mar 25 00:31:57 MDT 2009


One more thing:
This server has ECC memory, and it also has a BMC controller that externally logs all hardware errors independently of the memory condition.
From what I understand, an ECC memory module has an extra memory chip for error-correcting check bits. When data is written into ECC memory, the check bits are computed and stored in the extra chip, and when the data is read back, they are verified. When a mismatch occurs, even if the kernel does not log such an error, the BMC controller logs it as:
# ipmitool sel list
1c10 | 02/19/2009 | 16:29:39 | Memory #0x08 | Uncorrectable ECC |
..
I retrieved the system event log (SEL) from the BMC controller, and there are no errors whatsoever.
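
For anyone who wants to repeat this check, here is a minimal sketch of filtering saved `ipmitool sel list` output for ECC entries. It assumes the pipe-separated layout of the sample line above (record id, date, time, sensor, event); the helper names parse_sel_line and ecc_events are hypothetical, and real SEL output may carry additional columns.

```python
def parse_sel_line(line):
    """Split one pipe-separated SEL line into named fields.

    Assumes the layout of the sample entry in this message:
    id | date | time | sensor | event |
    Returns None for lines that do not have at least five fields.
    """
    fields = [f.strip() for f in line.split("|")]
    if len(fields) < 5:
        return None
    return {
        "id": fields[0],
        "date": fields[1],
        "time": fields[2],
        "sensor": fields[3],
        "event": fields[4],
    }


def ecc_events(sel_text):
    """Return the parsed SEL entries whose event text mentions ECC."""
    events = []
    for line in sel_text.splitlines():
        entry = parse_sel_line(line)
        if entry is not None and "ECC" in entry["event"]:
            events.append(entry)
    return events


# The sample entry quoted above, used here only as illustrative input.
sample = "1c10 | 02/19/2009 | 16:29:39 | Memory #0x08 | Uncorrectable ECC |"
found = ecc_events(sample)
```

An empty result from ecc_events on the full SEL dump corresponds to the "no errors" outcome described above.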

This looks like a Python error or something very basic. It started happening after I slightly changed the code for this particular server.

-----Original Message-----
From: Alec Matusis [mailto:matusis at yahoo.com] 
Sent: Tuesday, March 24, 2009 6:26 PM
To: 'Twisted general discussion'
Subject: RE: [Twisted-Python] strange server crash

This server crashed again today, again during maximum load for the day.
This time there were no errors in the twisted log, and not even a segfault message in /var/log/messages: the pid simply ceased to exist.
Once again, this machine runs 8 twisted servers, but this one is slightly different from the others, and the error started happening after the code for this server had been slightly modified.
I do not think this is bad RAM anymore, because there's one particular server that keeps crashing on this machine.

> -----Original Message-----
> From: twisted-python-bounces at twistedmatrix.com [mailto:twisted-python-
> bounces at twistedmatrix.com] On Behalf Of glyph at divmod.com
> Sent: Monday, March 23, 2009 1:37 AM
> To: 'Twisted general discussion'
> Subject: RE: [Twisted-Python] strange server crash
> 
> 
> On 07:25 am, matusis at yahoo.com wrote:
> >Very strange. I am not using any custom C extensions...
> >In the last two days, it has been under larger load, and it has not
> >crashed.
> >I will update to Python 2.6 soon.
> 
> Have you tested for bad RAM on that server?  The error mode is
> sufficiently weird and rare to make me suspect cosmic rays.
> 
> _______________________________________________
> Twisted-Python mailing list
> Twisted-Python at twistedmatrix.com
> http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
