Opened 6 years ago

Last modified 6 years ago

#8128 enhancement new

ESMTP class can't be usefully subclassed to add extensions

Reported by: Alastair Houghton Owned by: Alastair Houghton
Priority: normal Milestone:
Component: mail Keywords:
Cc: Branch: branches/esmtp-extensibility-8128
branch-diff, diff-cov, branch-cov, buildbot
Author: adiroiban

Description

You would think it would be possible to implement ESMTP extensions by subclassing twisted.mail.smtp.ESMTP, but in practice this is made difficult by two features of the design of its superclass, twisted.mail.smtp.SMTP, namely:

  • SMTP derives from LineOnlyReceiver, which precludes implementing any extension that requires the ability to read bytes as opposed to lines.
  • SMTP breaks out the ESMTP options when parsing MAIL FROM: and RCPT TO: commands, but fails to do anything with them; it doesn’t even parse them (or provide any hook to do so).

To fix this, we should change SMTP to derive from LineReceiver and should make it parse the ESMTP options into dictionaries that are accessed from either the SMTP object itself or the User object (in the case of the RCPT TO: command).

(This ticket was split from #8126).

Attachments (3)

esmtp-extensibility-8128-1.patch (7.7 KB) - added by Alastair Houghton 6 years ago.
Fixes extensibility problems with ESMTP
esmtp-extensibility-8128-2.patch (8.2 KB) - added by Alastair Houghton 6 years ago.
Fixed a couple of problems and added a news file.
esmtp-extensibility-8128-3.patch (14.1 KB) - added by Alastair Houghton 6 years ago.
Updated in light of review comments.

Download all attachments as: .zip

Change History (11)

Changed 6 years ago by Alastair Houghton

Fixes extensibility problems with ESMTP

comment:1 Changed 6 years ago by Alastair Houghton

Keywords: review added

I've attached a patch, with two new test cases that exercise the new code.

comment:2 Changed 6 years ago by Adi Roiban

Author: adiroiban
Branch: branches/esmtp-extensibility-8128

(In [46344]) Branching to esmtp-extensibility-8128.

Changed 6 years ago by Alastair Houghton

Fixed a couple of problems and added a news file.

comment:3 Changed 6 years ago by Adi Roiban

Keywords: review removed
Owner: set to Alastair Houghton

Thanks for your contribution!


The patch requires a news file fragment as discussed here https://twistedmatrix.com/trac/wiki/ReviewProcess#Newsfiles


This patch still tries to do 2 things:

  • new feature of parsing the MAIL and RCPT options
  • new feature to allow receiving binary data in STMP

I don't understand how the new feature of allowing STMP in binary mode is useful with the current code. Maybe we should do it only once support for receiving binary data is added to STMP or ESMPT.

I don't understand how the new feature of parsing the MAIL options can be used.

So your changes are OK, but without documentation (docstrings, narrative docs or examples) I find it hard to see how they can be used.

If the purpose of this patch it to allow adding extentions, please take the time to write an example and narrative docs to describe how extentsions can be implemented.


Please add some reference to the RFCs describing how the MAIL and RCPT command are supposted to parse their options.


For the This used to derive from LineOnlyReceiver,... comment. I don't see much value in it.

If required, you can update the SMTP class docstring to inform that it can also handle binary data... but an example would be best :)

Put important code usage info in the docstring and don't hide them as comments. Docstrings will be visibile in the API docs.


The new User.optinon instance variables needs to be documented.

Do we need it to be public ? I don't see any reference to it other than as self.option.

Is better to start with it as few public things as possible as later we can not change them as we like since this will break the applications using the Twisted library.

The User.opts init argument needs docstring.

Why go with opts as the init argument and not just options ?

Just because existing code has some bad abreviasion like 'orig' should not prevent us from writing state of the art code :)


Do we need STMP.opt_re and STMP.space_re as public class members?

With public or private, all new class or instance members need docstrings.

I know that previous regular expression were public, but at least for the new code we should try to minimize the public API


Please check the Twisted conding convention. Instead of opt_re it should be something like optRE ... or better optionsRE


it looks to me like the code from STMP.do_RCPT and STMP.do_MAIL around

opts = m.group('opts')
if opts:
   MORE DUPLICATION

is duplicate code.

Can it be extracted into a helper method/function to reduce the duplication ?


I don't see any tests for the changes related to pickling support in User

User initialization with or without an opts argument also require tests.


Full test coverate is critical during refactoring, and they should came of great help once we try to port to Py3.

Documenting all new code and expected types will also higly reduce the effort required to port to Py3


Do we really need self._options = options in STMP.do_MAIL ? If yes, it should be initialized in init and documented in the STMP docstring.

I don't see how the new self._options can be used as there is no code using it. If you expect it to be used by external code, then it should be made public.


testESMTPOptions should be test_ESMTPOptions to comply with the current coding standard.


I find it hard to read and understand the testESMTPBinaryMode and testESMTPOptions.

I do manage to read and understand them but there are no tests for the test code itself, so I think that it is very important to have easy to read tests.

Here is some info about writing test inspectors https://glyph.twistedmatrix.com/2015/05/separate-your-fakes-and-your-inspectors.html

I think that testESMTPOptions should be split into a test for options received by the MAIL command and another test received byt he RCPT command.

It must be possible to obtain the options passed by the client. is a bit vague. The test should specify when and in which conditions those options are availale so that a person trying to refactor a code should know options are only availale in MAIl and RCPT command and not in any other arbitrary command.

For testESMTPBinaryMode do we really need to define ESMTPBinaryModeTest to test that setRawMode() can be called, and after it is called all recieved data are passed to STMP.rawDataReceived ... that after setLineMode() is called received data is passed again to the command dispatcher?


You can see some of the style guide errors reported here

https://buildbot.twistedmatrix.com/builders/twistedchecker/builds/3659/steps/run-twistedchecker/logs/new%20twistedchecker%20errors


Thanks again for your contribution and sorry for being a PITA.

I have sent your patch to the buildbot builders... beside style errors, all builders look green.

Please consider my feedback and submit a new patch on top of the attached branch.

Thanks!

comment:4 Changed 6 years ago by Adi Roiban

My previous review was done based on esmtp-extensibility-8128-1.patch

Please note that the news file fragment can onlu contain a single change.

At release time, its content will be concatenated into a single line and wrapped to fit the news file format.

Try to rephase it as a single phase or split it further into 2 tickets

Thanks!

Changed 6 years ago by Alastair Houghton

Updated in light of review comments.

comment:5 Changed 6 years ago by Alastair Houghton

Keywords: review added

Going through the comments.


News file fragment added, and updated to be a single line.


Yes, the patch does parse the MAIL FROM: and RCPT TO: options, and yes, it also allows receiving binary data in ESMTP subclasses when appropriate. The point is to make ESMTP extensible — I regard that as a single change.


I’ve added a documentation file that hopefully will make things clearer.


Added a comment with a reference to the relevant section of the relevant RFC


The reason for the comment is to dissuade future editors from changing it back to LineOnlyReceiver as an "optimisation". That's why it's useful to have it as a comment instead of in the docstrings.


I've added a docstring to User.__init__. Yes, options needs to be a public member; see the documentation file I added.


No need for them to be public. Fixed.


Happy to rename them. Fixed.


It isn’t identical, and it’s only a few lines. I don’t think it’s worth extracting, especially as we still need to error and return from the do_XXX function.


Added pickling tests. Also added code to cope with unpickling User objects from older Twisted versions that don’t have the options member variable. Added a test for that also.

Added tests with and without the options argument too.


Agreed.


Yes. It was actually initialised in the other ticket; I missed that bit out when splitting the patch. Fixed.

Also, I decided you were right that it should be public, though in 99% of cases it’ll be a subclass using it.


OK, haven't fixed that (thought I'd got everything, obviously not).


I'm not sure how you think the binary mode test could be improved. The point is that it is testing that a subclass can put the protocol into binary mode and that it can be taken back out of binary mode again.

I've split the options test in two. I will say that anyone with knowledge of the SMTP protocol would probably have understood the previous docstring, but I've clarified the docstrings for the new tests.


Fixed those, but there might be more?


You aren’t a PITA; you’re doing exactly what I’d hope you’d do. Plus, it’s my first time contributing to Twisted, so there’s a lot I don’t know — your comments are very helpful.

comment:6 Changed 6 years ago by Adi Roiban

We plan to use automatic linters to help up keep the good shape so having all new code comply with the new coding standard is a requirement.


Thanks for the newsfile fragment update.

As long as the changes can be expressed into a single phrase, is fine to be in a single patch :)

But do you expect from Twisted's end users to implement their own CHUNKING or BINARYMIME extensions ?

I am expecting from Twisted's ESTMP implementation to already provide those extensions as the are pretty standard and have a RFC.

As far as I understand, users can not implement their random extensions but each extension is stricly defined by a RFC

https://en.wikipedia.org/wiki/Extended_SMTP#Extensions

From this perspecting, I would say that Twisted ESMTP should not be designed to allow inheritance, but rather extensions should be implemented inside the ESTMP class (or helper class)


The documentation is great! Many, many thanks! It really make reviewing this patch a joy and gets me out of the "confusing" zone :)

I would argue that every new feature should start with writing documentation... and only after documentation is approved start to work at the actual code and tests.

With documentation is much easier to see the end result and what the patch tries to achieve.

Please add docstring or comment to explain what the examples are supposed to do.

I don't understand the example describing how to use the extension.

Is really bad that ESMTP has no dedicate method for extension advertising, other than the (gross) overwriting of the extentions() method.

I would argue that in this case inheritance is bad and we should look for some other mechanism to advertise the new extensions... maybe a dedicated method which can hook a

@implemented(IESTMPExtensions)
class ChunkingESTMPExtension(object):
    name = b'CHUNKING'

    do_BDAT(self):
       pass

server = ESTMP()
server.addExtension(ChunkingESTMPExtension()

I am not sure if the persons who have initially implemented the ESTMP class wanted the ESTMP to be extended in this way... since we don't have documention or docstrings, is hard to tell.

Again, I think that ESMPT should not be extended in the way you have documented here, but rather have those extensions integrated into Twisted.

Since I am not an ESTMP expert I might be wrong, and it might make sense to implement arbirary (non-standard) extensions.


Regarding LineOnlyReceiver comment. Just write a test to check that SMTP should support receiving raw data. If later someone wants to do any optimization and put it back to LineOnlyReceiver those tests should fail.

But, the test should not really care about LineOnlyReceiver or LineReceiver but rather focus on the SMTP class provide a set of functionalities and APIs

Later maybe we can implement this with a FancyLineReceiver... as long as the test pass and the API is the same, we should not care what is the name of the inherited class.


For

# Holds ESMTP options found on the MAIL FROM: command

please place this info into a the STMP class docstring as ivar.

Same for _optionRE and _spaceRE


The code should be designed and written to speak for itself, and documentation placed in dedicated docstring. For me, a comment is a sign of failing to find the right design ... so that poor comment was needed to help understanding the code :)


When documenting types please note that 'str' is closer to 'Unicode' than to 'bytes'. If your API should accept bytes, please document them as bytes and not str.

In Python 2 they might be the same thing but we are looking forward to Python3 where we have Unicode/str and bytes.

New code should be written with that in mind.

I would say that many of the variable types are just bytes as I don't see any explicit encoding or decoding in use.

Also if some strings should be byte strings, please mark them as such

Instead of

'Hello World!'

use

b'Hello World!'

This will make Py3 porting much easier.


A STMP expert might know that only MAIL and RCPT command accept options and any other command has no options... but it might take a long time to get an STMP expert to volunteer and review this ticket.

In fact Twisted in a great need of any reviewers ... so bad that Twisted ended up raising money to pay someone (like me) to do reviews.

I will sleep on it and see if your changes still make sense tomorrow morning

The current code looks much better, but I would like to see if these extensions make sense...

Maybe you can provide more info and convince me that support for the CHUNKING should not be implemented in Twisted but rather allow people to come with their own implementation of CHUNKING :)

comment:7 Changed 6 years ago by hawkowl

Just a note: We don't provide support for pickling compatibility between versions, and I really really would like to remove all support for pickling from Twisted -- so feel free to remove the pickling code, as it's not really required.

comment:8 Changed 6 years ago by Adi Roiban

Keywords: review removed

More feedback from hawkowl

(16:11:30) adiroiban: hawkowl: can you please check https://twistedmatrix.com/trac/ticket/8128
(16:11:44) adiroiban: just to see if the patch is on the right track on no
(16:11:57) adiroiban: I would say that ESTMP extension should be implemented in Twistd
(16:12:09) adiroiban: rather than create API to allow people to re-invent the wheel
(16:12:34) adiroiban: and create multiple implementation of the same extension
(16:16:39) hawkowl: adiroiban: i don't think subclassing is the method to do it by
(16:16:55) hawkowl: i dunno
(16:17:02) adiroiban: ok
(16:17:03) adiroiban: np
(16:17:08) adiroiban: I will follow up over the mailing list
(16:17:24) hawkowl: i think it's fine to have an extension point there
(16:17:36) hawkowl: the implemented stuff just needs to comply a bit more to our style guide
(16:17:41) adiroiban: ... I was also waiting for some feedback from the initial contributor
(16:17:42) hawkowl: eg. there's non-prefixed strings
(16:17:55) hawkowl: needs more b""
(16:18:03) adiroiban: I can handle the b'' and style
(16:18:09) adiroiban: and I think I have already commented on that
(16:18:22) hawkowl: also the docs need one-sentence per line
(16:18:24) adiroiban: but I wanted to make sure that we can to encourage people to reinvent to wheel
(16:18:33) hawkowl: well, we're not, really
(16:18:35) adiroiban: or force/encourage them to submit the implementations to Twisted
(16:18:36) adiroiban: :)
(16:18:45) hawkowl: we can use this interface ourselves
(16:18:56) hawkowl: but our interface might be wrong, because SMTP is crap

After sleeping on it, twice, I think that we should have a private API for implementing extensions

I am that type of persons which first tries to consider composition, prior to inheritance... but if the majority prefers more inheritance in Twisted I will go on and review and accept the ticket.

Since I am not an ESTMP expert I tried to gather more feedback from the Twisted mailing list. Please send your comments here and check the latest discussions on the ML:

http://twistedmatrix.com/pipermail/twisted-python/2015-November/029995.html

Thanks!

Note: See TracTickets for help on using tickets.