Ticket #4124 enhancement new
Re-implement twisted.mail.imap4's parsers using real parser technology
| Reported by: | exarkun | Owned by: | |
|---|---|---|---|
| Priority: | high | Milestone: | |
| Component: | | Keywords: | |
| Cc: | khorn | Branch: | |
| Author: | | Launchpad Bug: | |
Description
twisted.mail.imap4 deals with the rather complicated task of parsing the IMAP4 protocol (in both directions - the protocol is not symmetrical) with a series of alternately naive, gross, broken, inefficient, poorly tested, and very nearly unmaintainable hacks, tricks, kludges, and plain stupidity (I can say this, I wrote most of it).
#4049 has at last convinced me that this parsing code must be rewritten. This will be difficult for a number of reasons:
- The IMAP4 grammar is somewhat complicated. This is probably surmountable, but it would be good if someone with significant experience developing parsers could be involved in the process.
- There is a lot of grammar to account for.
- Quirks of the existing parser are exposed to the application-level APIs. The most reasonable resolution to this may be the introduction of new APIs and the deprecation of old ones. Of course, figuring out how to continue to support the old APIs with a new parser in place may be tricky.
- The client and server presently approach the problem from different directions, sharing almost none of their parsing code (but of course not exactly none - that would be too easy).
- The current parser is wrong in a number of places, yet all the existing tests are passing, so the existing test suite is clearly not sufficient to demonstrate that the new parser is correct.
I think it would be nice if this opportunity were taken to approach parser development differently than it is usually approached in Twisted. If possible, declaratively specifying much or all of the grammar and using this to automatically generate the parser would be preferable (and if we can do it here, I doubt any other protocols will give us much trouble). However, I won't say this is necessary outright, as it is more important to have a working parser than to have a parser with a neat implementation.
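To make the "declarative grammar" idea concrete, here is a rough sketch of what it might look like for a tiny, simplified slice of IMAP4 data (atoms, numbers, `NIL`, quoted strings, and parenthesized lists). Everything in it is hypothetical: `regex`, `choice`, `ref`, `parenthesized`, `GRAMMAR`, and `parse` are illustrative names, not existing Twisted APIs, and the character classes only approximate the RFC 3501 rules. Literals (`{n}` followed by CRLF), flags such as `\Seen`, and incremental parsing are left out entirely.

```python
import re


class ParseError(Exception):
    """Raised when the input does not match the grammar."""


# Each parser is a callable (data: bytes, pos: int) -> (value, new_pos).

def regex(pattern, convert=lambda m: m.group(0)):
    """Build a parser from a bytes regular expression."""
    compiled = re.compile(pattern)

    def parser(data, pos):
        m = compiled.match(data, pos)
        if m is None:
            raise ParseError("expected %r at offset %d" % (pattern, pos))
        return convert(m), m.end()
    return parser


def choice(*parsers):
    """Try each alternative in order; the first one that matches wins."""
    def parser(data, pos):
        for p in parsers:
            try:
                return p(data, pos)
            except ParseError:
                continue
        raise ParseError("no alternative matched at offset %d" % pos)
    return parser


def ref(name):
    """Late-bound reference to a named rule, so rules can be recursive."""
    def parser(data, pos):
        return GRAMMAR[name](data, pos)
    return parser


def parenthesized(item):
    """'(' item (' ' item)* ')' or '()'; returns a Python list."""
    def parser(data, pos):
        if data[pos:pos + 1] != b"(":
            raise ParseError("expected '(' at offset %d" % pos)
        pos += 1
        items = []
        while data[pos:pos + 1] != b")":
            if items:
                if data[pos:pos + 1] != b" ":
                    raise ParseError("expected ' ' or ')' at offset %d" % pos)
                pos += 1
            value, pos = item(data, pos)
            items.append(value)
        return items, pos + 1
    return parser


def unescape(m):
    """Strip the backslash escapes inside a quoted string."""
    return re.sub(rb'\\(["\\])', rb"\1", m.group(1))


# The grammar itself is plain data: rule name -> parser.
# The lookahead on "nil" and "number" keeps them from matching a prefix
# of a longer atom such as NILS or 42abc.
GRAMMAR = {
    "nil": regex(rb"NIL(?![^\s()])", lambda m: None),
    "number": regex(rb"[0-9]+(?![^\s()])", lambda m: int(m.group(0))),
    "quoted": regex(rb'"((?:[^"\\\r\n]|\\["\\])*)"', unescape),
    "atom": regex(rb'[^\s(){%*"\\\x00-\x1f\x7f]+'),
    "list": parenthesized(ref("value")),
    "value": choice(ref("nil"), ref("number"), ref("quoted"),
                    ref("list"), ref("atom")),
}


def parse(data, rule="value"):
    """Parse all of data with the named rule, or raise ParseError."""
    value, pos = GRAMMAR[rule](data, 0)
    if pos != len(data):
        raise ParseError("trailing data at offset %d" % pos)
    return value


print(parse(b'(UID 42 ENVELOPE ("hello" NIL (1 2)))'))
```

The point of keeping `GRAMMAR` as a table is that, in principle, the client and server could share one set of rules, and each rule can be tested in isolation (for example `parse(b"42", rule="number")`). Whether a hand-rolled combinator set like this or an existing parser generator is the better fit is an open question.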
See these other open IMAP4-related tickets for some idea of the current problems:
