Ticket #686 (closed defect: fixed )

Opened 5 years ago

Last modified 11 months ago

[TEST] startTLS is broken if there's already data in the outgoing buffer.

Reported by: jknight Assigned to: exarkun
Type: defect Priority: high
Milestone: Component: core
Keywords: Cc: exarkun, itamarst, jknight, thijs, therve, pjd
Branch: branches/starttls-with-buffered-686-3 Author: exarkun
Launchpad Bug:

Attachments

Change History

  2004-08-24 05:53:32+00:00 changed by jknight

If you do write("lotsofdata"); startTLS(), sometimes twisted will do the startTLS() before the
write("lotsofdata"). The condition for "sometimes" being whether the lotsofdata was large enough to
cause the data to be buffered in twisted, instead of sent immediately, or buffered by the kernel.
Apparently this doesn't show up normally, because people don't often send much data before doing
STARTTLS, and thus the data normally *is* sent out immediately.
See also twisted.test.test_ssl.SpammyTLSTestCase test cases.

  2004-08-26 00:10:06+00:00 changed by jknight

Propose this patch to make the problem more visible and less data-corruptful if you do happen to run
into it:
Index: twisted/internet/tcp.py
===============================================================
====
--- twisted/internet/tcp.py     (revision 11439)
+++ twisted/internet/tcp.py     (working copy)
@@ -211,6 +211,17 @@
         def startTLS(self, ctx):
             assert not self.TLS
+            if self.dataBuffer:
+                written = self.writeSomeData(buffer(self.dataBuffer, self.offset))
+                offset = self.offset
+                dataLen = len(self.dataBuffer)
+                self.offset = 0
+                self.dataBuffer = ""
+                if isinstance(written, Exception) or (offset + written != len(dataBuffer)):
+                    warnings.warn("startTLS with unwritten buffered data currently doesn't work right. See
issue #686. Closing connection.", category=RuntimeWarning, stacklevel=2)
+                    self.loseConnection()
+
+
             self.stopReading()
             self.stopWriting()
             self._startTLS()

  2004-09-12 05:47:43+00:00 changed by itamarst

This should be fixed correctly by 2.0.

  2004-09-16 22:45:21+00:00 changed by itamarst

Still urgent, but jknight explained that it's very unlikely to happen in
practice given his fixes, so it's not necessary for 2.0.

  2004-11-20 16:07:20+00:00 changed by radix

urgent *means* "next release", so it's not urgent (not to mention no failing
unit test), sorry :-)

follow-up: ↓ 7   2004-11-20 21:20:22+00:00 changed by exarkun

twisted.test.test_ssl.SpammyTLSTestCase covers this bug, I think.  Note that it
is set todo.

in reply to: ↑ 6 ; follow-up: ↓ 8   2008-07-13 10:15:15+00:00 changed by thijs

  • author deleted
  • cc changed from exarkun, itamarst, jknight to exarkun, itamarst, jknight, thijs
  • component set to core
  • owner changed from jknight to exarkun
  • branch deleted
  • keywords set to review

Replying to exarkun:

twisted.test.test_ssl.SpammyTLSTestCase covers this bug, I think. Note that it is set todo.

I don't see any TODOs in the SpammyTLSTestCase anymore, guess this means it's fixed?

in reply to: ↑ 7   2008-07-13 10:17:00+00:00 changed by thijs

  • keywords deleted
  • owner changed from exarkun to jknight

Replying to thijs:

I don't see any TODOs in the SpammyTLSTestCase anymore, guess this means it's fixed?

Oh damn, I now see the :( faces and todo properties.. sorry for the noise.

  2008-07-25 21:09:20+00:00 changed by exarkun

  • branch set to branches/starttls-with-buffered-686
  • author set to exarkun

(In [24382]) Branching to 'starttls-with-buffered-686'

  2008-07-25 21:33:44+00:00 changed by exarkun

I think this is basically resolved, but it's blocked on #2874.

  2008-07-29 12:52:41+00:00 changed by exarkun

  • branch changed from branches/starttls-with-buffered-686 to branches/starttls-with-buffered-686-2

(In [24414]) Branching to 'starttls-with-buffered-686-2'

  2008-07-29 12:54:54+00:00 changed by exarkun

  • keywords set to review
  • owner deleted

  2008-08-02 10:03:27+00:00 changed by therve

  • keywords deleted
  • owner set to exarkun

Great, nice fix. Some small comments:

Flakes:

twisted/test/test_ssl.py:8: 'trial_util' imported but unused
twisted/internet/tcp.py:20: 'warnings' imported but unused
+                self._tlsBufferedData = []
+                self._tlsContext = ctx
+                self._tlsExtra = extra

I think it may be better if one object could hold all this state, eg self._tlsDelayed = _TLSDelayed([], ctx, extra). What do you think?

  2008-08-02 10:13:35+00:00 changed by therve

  • cc changed from exarkun, itamarst, jknight, thijs to exarkun, itamarst, jknight, thijs, therve

  2008-08-04 12:07:58+00:00 changed by exarkun

(In [24509]) fix pyflakes warnings

refs #686

  2008-08-04 12:37:50+00:00 changed by exarkun

(In [24510]) replace four _tls* attributes with _tlsWaiting

refs #686

  2008-08-04 14:07:55+00:00 changed by exarkun

  • keywords set to review
  • owner deleted

I've made those two changes in the two revisions linked above (no other changes since the last review). Here are the most current build results:

http://buildbot.twistedmatrix.com/boxes-supported?branch=/branches/starttls-with-buffered-686-2

  2008-08-05 11:37:59+00:00 changed by therve

  • keywords deleted
  • owner set to exarkun

I'm ok with the branch, but the build results show some problems:

  • gtk2/glib2 is not happy on windows it seems
  • I've noticed that the runReactor method doesn't cancel the timeout :/ Should be pretty easy to fix, though.

  2008-08-05 11:48:23+00:00 changed by exarkun

  • branch changed from branches/starttls-with-buffered-686-2 to branches/starttls-with-buffered-686-3

(In [24525]) Branching to 'starttls-with-buffered-686-3'

  2008-08-05 12:27:40+00:00 changed by exarkun

  • keywords set to review
  • owner changed from exarkun to therve

I think those points have been taken care of in trunk already. I merged forward, here are the results for the new branch:

http://buildbot.twistedmatrix.com/boxes-supported?branch=/branches/starttls-with-buffered-686-3

I should have done that before, I knew there were still weird problems in the -2 branch.

runReactor still doesn't directly cancel its timeout, but unbuildReactor does. I think this is good enough until we have a case that demands something more.

  2008-08-05 13:01:01+00:00 changed by therve

  • keywords deleted
  • owner changed from therve to exarkun

Please merge.

  2008-08-05 13:26:48+00:00 changed by exarkun

  • status changed from new to closed
  • resolution set to fixed

(In [24527]) Merge starttls-with-buffered-686-3

Author: exarkun Reviewer: jknight, therve Fixes: #686

Change the implementation of ITLSTransport.startTLS so that if there is unsent data in the buffer, TLS negotiation is delayed until all of that data is sent. This changes the existing behavior which would log an error if the data could not be sent immediately and then drop the connection.

  2008-08-09 06:58:22+00:00 changed by pjd

  • cc changed from exarkun, itamarst, jknight, thijs, therve to exarkun, itamarst, jknight, thijs, therve, pjd

Note that this has some fallout: see Divmod ticket #2673.

Callers can no longer safely assume that the TLS session is established once startTLS returns, and code that used to work due to startTLS opportunistically flushing the buffer can suddenly start failing.

I'm guessing that startTLS should be defined to return a Deferred (or provide some other mechanism) to allow callers to wait for the delayed TLS session.

(Should this be reopened?)

  2008-08-10 15:19:43+00:00 changed by exarkun

One small correction: callers could never assume that the TLS session was established immediately after startTLS returned; they could rely on one of two things - that the connection was closed if there was data in the application buffer which couldn't be sent, or that the TLS handshake had probably started. As it turns out, the handle returned by getHandle would also be a pyOpenSSL Connection object, but the getHandle documentation is pretty clear:

Return a system- and reactor-specific handle. This might be a socket.socket() object, or some other type of object, depending on which reactor is being used. Use and manipulate at your own risk. This might be used in cases where you want to set specific options not exposed by the Twisted APIs.

I consider the change made here, from always returning a pyOpenSSL Connection object immediately after a startTLS call returning one at a somewhat later time in some cases to be reasonable. I also don't expect that it will break anything except for problemsFromTransport (however, if it does break some other applications, then I'll happily consider making some further changes to avoid that).

I agree that there should be a callback to notify application code of successful (and likely failed) TLS initiation, but that's a task for a separate ticket. Additionally, in all real-world protocols that I'm aware of, such callbacks aren't particularly useful, since the protocol is designed to allow that information to be discovered in other ways.

Note: See TracTickets for help on using tickets.