[Twisted-Python] Contributing?

James Y Knight foom at fuhm.net
Thu Aug 26 10:35:48 MDT 2004

On Aug 22, 2004, at 3:28 PM, angryhicKclown at netscape.net wrote:
> I was looking over the page on twistedmatrix.com on contributing, and 
> it referred me to here. Over at the mono project, they have a 
> todo-list sort of thing, that idle hackers such as myself can work on. 
> I was wondering what the best way (besides monetary...I am a poor 
> student) to contribute to the Twisted project is?

Welllll, since you ask.. :)

Here's a relatively self-contained project that could use working on:

twisted.web.microdom and twisted.web.sux are supposed to implement an 
XML/XHTML and HTML parser. They are pretty useless as an XML parser, given 
their relative slowness and the existence of expat/python xml libraries 
which already do a very good job of being an XML parser. Microdom is 
*almost* a useful HTML parser, but it's missing support for a lot of 
HTML peculiarities that really need to be handled 
("<tr><td>foo<tr><td>bar" for one, strange whitespace collapsing rules, 
for another, and I'm sure there are more). Perl has a very good HTML 
parser in HTML::TreeBuilder whose algorithms could be duplicated.
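
To make the "<tr><td>foo<tr><td>bar" case concrete, here is a rough 
sketch of the implied-end-tag handling such a parser needs. It is not 
microdom code and not how the Perl module does it: it is built on the 
standard library's html.parser, and IMPLIED_CLOSE, Node, 
LenientTreeBuilder and parse are made-up names with a deliberately tiny 
rule table.

```python
from html.parser import HTMLParser

# Hypothetical rule table: opening the key tag implicitly closes any of
# the listed tags that are still open. Real HTML has many more rules.
IMPLIED_CLOSE = {
    "tr": {"tr", "td", "th"},
    "td": {"td", "th"},
    "th": {"td", "th"},
    "li": {"li"},
    "p": {"p"},
}


class Node:
    def __init__(self, tag):
        self.tag = tag
        self.children = []  # Node instances or plain text strings

    def to_markup(self):
        # Serialize with every end tag written out explicitly.
        inner = "".join(
            c if isinstance(c, str) else c.to_markup() for c in self.children
        )
        return "<%s>%s</%s>" % (self.tag, inner, self.tag)


class LenientTreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.stack = [self.root]  # currently open elements

    def handle_starttag(self, tag, attrs):
        closes = IMPLIED_CLOSE.get(tag, set())
        # Pop open elements that the new tag implicitly terminates.
        # Any other open tag (e.g. a nested <table>) stops the popping.
        while len(self.stack) > 1 and self.stack[-1].tag in closes:
            self.stack.pop()
        node = Node(tag)
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Close up to the nearest matching open element; ignore strays.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        self.stack[-1].children.append(data)


def parse(html):
    builder = LenientTreeBuilder()
    builder.feed(html)
    builder.close()  # flushes any text still buffered at end of input
    return builder.root
```

Calling parse("<tr><td>foo<tr><td>bar").to_markup() gives two sibling 
rows instead of a second row nested inside the first cell. Whitespace 
collapsing and the rest of the peculiarities are not attempted here.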

This project isn't even very Twisted-specific (sux/microdom only have 
very minor dependencies on the rest of Twisted) so it could conceivably 
be made into a general-purpose Python module in its own right. There 
are a variety of other Python HTML parsers, but from what I can tell, 
they're even worse than microdom is. It'd be way cool to have a Python 
HTML parser that actually works. Can't let Perl win! Any 
victi...volunteers? ;0


