Opened 6 years ago
Last modified 2 years ago
If you do write("lotsofdata"); startTLS(), sometimes twisted will do the startTLS() before the write("lotsofdata"). The condition for "sometimes" being whether the lotsofdata was large enough to cause the data to be buffered in twisted, instead of sent immediately, or buffered by the kernel. Apparently this doesn't show up normally, because people don't often send much data before doing STARTTLS, and thus the data normally *is* sent out immediately. See also twisted.test.test_ssl.SpammyTLSTestCase test cases.
Propose this patch to make the problem more visible and less data-corruptful if you do happen to run into it: Index: twisted/internet/tcp.py =============================================================== ==== --- twisted/internet/tcp.py (revision 11439) +++ twisted/internet/tcp.py (working copy) @@ -211,6 +211,17 @@ def startTLS(self, ctx): assert not self.TLS + if self.dataBuffer: + written = self.writeSomeData(buffer(self.dataBuffer, self.offset)) + offset = self.offset + dataLen = len(self.dataBuffer) + self.offset = 0 + self.dataBuffer = "" + if isinstance(written, Exception) or (offset + written != len(dataBuffer)): + warnings.warn("startTLS with unwritten buffered data currently doesn't work right. See issue #686. Closing connection.", category=RuntimeWarning, stacklevel=2) + self.loseConnection() + + self.stopReading() self.stopWriting() self._startTLS()
This should be fixed correctly by 2.0.
Still urgent, but jknight explained that it's very unlikely to happen in practice given his fixes, so it's not necessary for 2.0.
urgent *means* "next release", so it's not urgent (not to mention no failing unit test), sorry :-)
twisted.test.test_ssl.SpammyTLSTestCase covers this bug, I think. Note that it is set todo.
Replying to exarkun:
I don't see any TODOs in the SpammyTLSTestCase anymore, guess this means it's fixed?
Replying to thijs:
Oh damn, I now see the :( faces and todo properties.. sorry for the noise.
(In [24382]) Branching to 'starttls-with-buffered-686'
I think this is basically resolved, but it's blocked on #2874.
(In [24414]) Branching to 'starttls-with-buffered-686-2'
Great, nice fix. Some small comments:
Flakes:
twisted/test/test_ssl.py:8: 'trial_util' imported but unused twisted/internet/tcp.py:20: 'warnings' imported but unused
+ self._tlsBufferedData = [] + self._tlsContext = ctx + self._tlsExtra = extra
I think it may be better if one object could hold all this state, eg self._tlsDelayed = _TLSDelayed([], ctx, extra). What do you think?
(In [24509]) fix pyflakes warnings
refs #686
(In [24510]) replace four _tls* attributes with _tlsWaiting
I've made those two changes in the two revisions linked above (no other changes since the last review). Here are the most current build results:
http://buildbot.twistedmatrix.com/boxes-supported?branch=/branches/starttls-with-buffered-686-2
I'm ok with the branch, but the build results show some problems:
(In [24525]) Branching to 'starttls-with-buffered-686-3'
I think those points have been taken care of in trunk already. I merged forward, here are the results for the new branch:
http://buildbot.twistedmatrix.com/boxes-supported?branch=/branches/starttls-with-buffered-686-3
I should have done that before, I knew there were still weird problems in the -2 branch.
runReactor still doesn't directly cancel its timeout, but unbuildReactor does. I think this is good enough until we have a case that demands something more.
Please merge.
(In [24527]) Merge starttls-with-buffered-686-3
Author: exarkun Reviewer: jknight, therve Fixes: #686
Change the implementation of ITLSTransport.startTLS so that if there is unsent data in the buffer, TLS negotiation is delayed until all of that data is sent. This changes the existing behavior which would log an error if the data could not be sent immediately and then drop the connection.
Note that this has some fallout: see Divmod ticket #2673.
Callers can no longer safely assume that the TLS session is established once startTLS returns, and code that used to work due to startTLS opportunistically flushing the buffer can suddenly start failing.
I'm guessing that startTLS should be defined to return a Deferred (or provide some other mechanism) to allow callers to wait for the delayed TLS session.
(Should this be reopened?)
One small correction: callers could never assume that the TLS session was established immediately after startTLS returned; they could rely on one of two things - that the connection was closed if there was data in the application buffer which couldn't be sent, or that the TLS handshake had probably started. As it turns out, the handle returned by getHandle would also be a pyOpenSSL Connection object, but the getHandle documentation is pretty clear:
Return a system- and reactor-specific handle. This might be a socket.socket() object, or some other type of object, depending on which reactor is being used. Use and manipulate at your own risk. This might be used in cases where you want to set specific options not exposed by the Twisted APIs.
I consider the change made here, from always returning a pyOpenSSL Connection object immediately after a startTLS call returning one at a somewhat later time in some cases to be reasonable. I also don't expect that it will break anything except for problemsFromTransport (however, if it does break some other applications, then I'll happily consider making some further changes to avoid that).
I agree that there should be a callback to notify application code of successful (and likely failed) TLS initiation, but that's a task for a separate ticket. Additionally, in all real-world protocols that I'm aware of, such callbacks aren't particularly useful, since the protocol is designed to allow that information to be discovered in other ways.