Opened 4 years ago

Closed 4 years ago

#7860 enhancement closed fixed (fixed)

Add support for Next Protocol Negotiation and Application Layer Protocol Negotiation

Reported by: Cory Benfield Owned by: Cory Benfield
Priority: normal Milestone:
Component: core Keywords:
Cc: Hynek Schlawack, hawkowl Branch: branches/tls-alpn-npn-7860-25
branch-diff, diff-cov, branch-cov, buildbot
Author: hawkowl, adiroiban, glyph

Description

ALPN and NPN are TLS extensions that allow for negotiation of the protocol to speak once the TLS connection has been established. Their purpose is to avoid latency by using the TLS handshake to do very simple protocol negotiation.

NPN was originally specified outside the IETF: the IETF later adapted it into ALPN and specified it as such. The two have different handshake procedures and APIs, but fundamentally do the same thing. NPN is more widely supported at the moment because it was deployed earlier, but with ALPN support having landed in OpenSSL 1.0.2 ALPN will rapidly overtake NPN.

This ticket would add support for both ALPN and NPN to Twisted's SSL implementation. Some details:

  • This is one ticket, not two, because fundamentally ALPN and NPN do the same job in basically the same way. There's little point implementing one without the other.
  • PyOpenSSL obtained support for NPN and ALPN in the 0.15 release, so consideration needs to be made to Twisted users with older PyOpenSSL releases.
  • Not all versions of OpenSSL support NPN, and even fewer support ALPN. Consideration needs to be made for Twisted users with old versions of OpenSSL.

I am preparing a patch that adds this support, but if someone gets there before me I won't be too broken up about it!

Attachments (25)

7860_1.patch (9.0 KB) - added by Cory Benfield 4 years ago.
First draft patch
7860_2.patch (15.5 KB) - added by Cory Benfield 4 years ago.
Second draft patch
7860_3.patch (22.5 KB) - added by Cory Benfield 4 years ago.
Third draft patch
7860_4.patch (23.2 KB) - added by Cory Benfield 4 years ago.
Patch version 4
7860_5.patch (23.4 KB) - added by Cory Benfield 4 years ago.
Fifth draft patch
7860_6.patch (23.4 KB) - added by Cory Benfield 4 years ago.
Sixth draft patch
7860_7.patch (23.4 KB) - added by Cory Benfield 4 years ago.
Seventh draft patch
7860_8.patch (26.0 KB) - added by Cory Benfield 4 years ago.
Draft 8 of the patch, containing an example
7860_9.patch (27.1 KB) - added by Cory Benfield 4 years ago.
Ninth draft, headers added to the example.
7860_10.patch (22.1 KB) - added by Cory Benfield 4 years ago.
Tenth draft patch
7860_11.patch (27.0 KB) - added by Cory Benfield 4 years ago.
Eleventh draft patch
7860_12.patch (28.1 KB) - added by Cory Benfield 4 years ago.
Twelfth draft patch
7860_13.patch (28.1 KB) - added by Cory Benfield 4 years ago.
Thirteenth draft patch
7860_14.patch (37.4 KB) - added by Cory Benfield 4 years ago.
Draft patch 14
7860_15.patch (37.3 KB) - added by Cory Benfield 4 years ago.
Draft fifteen
7860_16.patch (36.2 KB) - added by Cory Benfield 4 years ago.
Sixteenth draft patch
7860_17.patch (36.1 KB) - added by Cory Benfield 4 years ago.
Seventeenth draft patch
7860_18.patch (37.1 KB) - added by Cory Benfield 4 years ago.
Eighteenth draft patch
7860_19.patch (37.7 KB) - added by Cory Benfield 4 years ago.
Nineteenth draft patch
7860_20.patch (40.3 KB) - added by Cory Benfield 4 years ago.
Twentieth draft patch
7860_21.patch (42.1 KB) - added by Cory Benfield 4 years ago.
Twenty-first draft patch
7860_22.patch (43.6 KB) - added by Cory Benfield 4 years ago.
Twenty-second draft patch
7860_23.patch (39.6 KB) - added by Cory Benfield 4 years ago.
Twenty-third draft patch
7860_24.patch (40.1 KB) - added by Cory Benfield 4 years ago.
Twenty-fourth draft patch
7860_25.patch (40.2 KB) - added by Cory Benfield 4 years ago.
Twenty-fifth draft patch

Download all attachments as: .zip

Change History (120)

comment:1 Changed 4 years ago by Hynek Schlawack

Cc: Hynek Schlawack added

Changed 4 years ago by Cory Benfield

Attachment: 7860_1.patch added

First draft patch

comment:2 Changed 4 years ago by Cory Benfield

Keywords: review added

Alright, a first draft patch is attached. This patch transparently uses both NPN and ALPN when available, to avoid providing too many choices. These could easily be split out if people believed there was value in doing that.

comment:3 Changed 4 years ago by Adi Roiban

Keywords: review removed
Owner: set to Cory Benfield

Many thanks for the patch.

I am not familiar with NPN and ALPN so this is a high level review... but I will read about it and prepare for the next round of reviews.

I think that nextProtos should be renamed to nextProtocol to avoid abbreviations in public API.

I think that current implemetation with a single list for ALPN and NPN is good.

I am ok with a single patch as changes don't look big :)


All code from the tests need dostring and documentation as much as the non-testing code, please don't treat it as second class citizen as it also needs to be maintained :)

Since it adds a new API, can you please also update the narrative documentation to show how it can be used.

This can also help reviewers do manual/functional/interoperability tests of the new feature.

Right now I have no idea how to test your changes and how to check that the changes work with other NPN/ALPN implementations.

Without docstring I find it hard to understand what you are trying to test ... I see the names hinting about success and failure... but what kind of failure, why it fails ? A assume that ie fails because there intersection of client / server protocols is empty... but please make it explicit.

Now sure what to do when this feature is used with pyOpenSSL less than 0.15 ... reading the code I see that it tries to ignore the errors. Maybe we should raise a warning and inform users they need to upgrade.

I see that set_npn_advertise_callback are called outside of try/catch are they implemented in older versions of OpenSSL ?

Since _npnSelectCallback is reused for ALPN why not use a generic name + docstring ex (nextProtocolSelectCallbac) and direclty assing the callback... without the _alpnSelectCallback = _npnSelectCallback alias and associated comment ?

From the test it looks like you are testing both client and server side using Twisted code and using the same test.... With this infrastructure I don't know how can we check that demonstrate that ALPN is prefered over NPN and also that NPN is used when ALPN is not set.

We would also need a test where nextProtocol is used but which runs on OpenSSL < 0.15.


As a technical detail, this will need a buildbot slave running pyOpenSSL 0.15 so that we can do continuous testing and make sure that the feature is not broken in the future.

I will try to raise a discussion over this patch on the twisted mailing lists so that we can get more feedback and testing.


Please check previous comments and provide your feedback.

Many thanks again for working on this!

comment:4 Changed 4 years ago by hawkowl

Cc: hawkowl added

comment:5 Changed 4 years ago by Cory Benfield

Thanks adiroiban! While I work on the next iteration of the patch, let me answer some of your questions.

Now sure what to do when this feature is used with pyOpenSSL less than 0.15 ... reading the code I see that it tries to ignore the errors. Maybe we should raise a warning and inform users they need to upgrade.

I see that set_npn_advertise_callback are called outside of try/catch are they implemented in older versions of OpenSSL ?

Yeah, the docstrings need to be clearer about this, including documenting possible exceptions. Weirdly, the tests show this better.

The logic is based on the fact that NPN was added in 1.0.1, and ALPN in 1.0.2: if they're not present, PyOpenSSL will raise a NotImplementedError. The logic ignores missing ALPN if NPN is present (1.0.1 <= OpenSSL Version < 1.0.2), because you can still do negotiation: however, if both ALPN and NPN are missing then a NotImplementedError will be raised. The code assumes that you cannot have an OpenSSL that has ALPN and no NPN, though it's possible that OpenSSL may one day deprecate/remove NPN. For the moment that's not true, though.

I see that set_npn_advertise_callback are called outside of try/catch are they implemented in older versions of OpenSSL ?

1.0.1 onwards. I deliberately leave them outside a try...catch because I want that exception to propagate, as shown above.

From the test it looks like you are testing both client and server side using Twisted code and using the same test.... With this infrastructure I don't know how can we check that demonstrate that ALPN is prefered over NPN and also that NPN is used when ALPN is not set.

Yeah, this is tricky. It's one of the major downsides of the 'combined' API. We can do it by getting coverage on a wide number of build machines: specifically, one machine with PyOpenSSL 0.15+ and OpenSSL 1.0.2; one with PyOpenSSL 0.15+ and OpenSSL 1.0.1; and one with PyOpenSSL 0.15+ and an earlier PyOpenSSL. The first two should run the tests and pass, the second one should skip the tests.

If that's not sufficiently good, I think we can do it by subclassing the OpenSSLCertificateOptions class and only installing one of NPN/ALPN. Does that sound better?

We would also need a test where nextProtocol is used but which runs on OpenSSL < 0.15

I can write a test that inverts the skip logic, so it only runs for PyOpenSSL < 0.15.

Changed 4 years ago by Cory Benfield

Attachment: 7860_2.patch added

Second draft patch

comment:6 Changed 4 years ago by Cory Benfield

Keywords: review added

Ok, I've attached the second draft patch that addresses the feedback above. Please let me know what you think!

comment:7 Changed 4 years ago by Cory Benfield

Owner: Cory Benfield deleted

comment:8 Changed 4 years ago by Adi Roiban

Author: adiroiban
Branch: branches/tls-alpn-npn-7860

(In [44549]) Branching to tls-alpn-npn-7860.

comment:9 Changed 4 years ago by Adi Roiban

Keywords: review removed
Owner: set to Cory Benfield

Thanks. I have applied your patch and sent it to buildbot.


The narrative documentation is great.

It is great to see information regarding the required openssl version. I think it can be extended to also document the required pyopenssl version.... or otherwise document this requirement in the docstring for OpenSSLCertificateOptions.

I think that we should not expect people to discover this dependencies only at runtime.


For getNextProtocol . Will it return str or bytes? I see that it uses the tls_connection... which should be an pyopessl object.. and I think that pyOpenSSL will return bytes.

To avoid confusing maybe the code should be written using b'some-bytes' and u'some-text' to make it clear where data is used as bytes or text.


For testing client and server side I was looking to see a custom context for the remote peer, in this way we can have fine grained control over the remote peer and arrange it in ways not directly supported by Twisted new public API.

For example for test_NPNAndALPNSuccess(self) the docstring talks about "when both ALPN and NPN are used" ... but the arrange part of the test has no explicit way to guarantee that ALPN and NPN is used. A custom context should help arrange the test in the required state.

Instead of using the OpenSSLCertificateOptions for the remote peer you can create a custom CertificateOption class, dedicated for these tests and which can be configured to explictly set only NPN or ALPN or both.


NegotiatedProtocol will also need docstring. For it's method you can just put a link to Protocol method but the generic docstring from NegotiatedProtocol should describe the intended purpose of this new class, what it does and you expect it to be used.

I see that it should be used for both client and server side... or if not used for both peers the other peer should send data so that the deferred is triggered.


You can remove 'Test that,' from test docstring see https://plus.google.com/+JonathanLange/posts/YA3ThKWhSAj


I was hoping to see a test where client and server use different protocols, and both lists are not empty so that if self.nextProtocols is triggered.

For example

client = ['http1/1', 'h2']
server = ['foo', 'bar', 'h2']

or to demonstrat that ALPN is preferred and server is the boss

client = ['bar', 'h2', 'foo']
server = ['foo', 'bar', 'h2']

Now... since we don't have buildslaves for that I will need to look for a VM/container with openssl 1.0.2 to test the changes and give them a spin.

Do you have some notes regarding the procedure you have used to do manual / functional / black box tests for this new feature?


Please check the current commend and let me know what do you think.... also check buildbot results... but since they don't have newer openssl or pyopenssl there is not much to see

I think that the patch is in good shape but we need to have correct documentation and also to prepare the automated test infrastructure to check that this feature will always work no mater what refactoring is done in Twisted.

Thanks!

comment:10 Changed 4 years ago by Cory Benfield

For getNextProtocol . Will it return str or bytes?

As stated in the docstring, it returns 'str', but on Python 2 'str' *is* bytes. Given that Twisted does not fully support Python 3 at this time I believe this to be unambiguous, but I can change the docstring to say 'bytes' instead if that would be clearer.

or to demonstrat that ALPN is preferred and server is the boss

This one is tricky, I'll need to feature detect ALPN as well. Given that the test here would actually be a test of OpenSSL's ALPN implementation, I don't think it adds very much to the test suite to have it, but I suppose I could if you believe it's really worthwhile.

Do you have some notes regarding the procedure you have used to do manual / functional / black box tests for this new feature?

At this point I'm afraid I have to be a bit unhelpful. My development machine is an OS X laptop, and Homebrew has made this very easy. I have an OpenSSL 0.9.8 install that came with the system, and a separately built 1.0.2 install from Homebrew that I can link PyOpenSSL against. This has allowed me to do functional testing quite easily.

For Linux this is trickier. If you use Debian as a base, Jessie is using 1.0.1 (and so has NPN support), and there's a 1.0.2 package in experimental that you could try. Alternatively, you could compile it from scratch.

Otherwise, I'm about to upload a patch that addresses the other comments in your feedback.

Changed 4 years ago by Cory Benfield

Attachment: 7860_3.patch added

Third draft patch

comment:11 Changed 4 years ago by Cory Benfield

Keywords: review added
Owner: Cory Benfield deleted

Ok, next patch draft is up.

comment:12 Changed 4 years ago by Adi Roiban

Many thanks for the patch. It looks.

In case you think that a test does not add value you should not add it. Feel free to ignore my comments if they don't make sense :)

The no ALPN and no NPN tests are done by setting an empty list for ALPN/NPN ... I am not sure if this is the same thing as a peer which don't talk ALPN or NPN ... but the test can be used to demonstrate that ALPN has priority and that NPN is also used... anyway most of those tests will just be tests for pyopenssl.

I only want to make sure that we have good coverage and that the test suite will touch at least each line / each branch.


I think there is a missing test for the new code.

raise NotImplementedError( 
    "nextProtocols requires PyOpenSSL 0.15 or later") 

This test would be require to make sure this branch is hit in a test and we have good coverage but also to make sure that the error is later converted into an errback and is not lost as an unhandled error.


Since we don't have buildslaves for pyopenssl 0.15 and newer openssl, in order to continue this review I will need to set up a local test environment... and this will take a while ... for the next 3 weeks I will be travelling.

Leaving this for review so that another reviewer can pick it up.

Thanks!

comment:13 Changed 4 years ago by Cory Benfield

I think there is a missing test for the new code.

No, the test is just hiding in the NPNAndALPNAbsentTest class. =)

Thanks for your review, I hope you enjoy your travels!

comment:14 Changed 4 years ago by hawkowl

Author: adiroibanhawkowl, adiroiban
Branch: branches/tls-alpn-npn-7860branches/tls-alpn-npn-7860-2

(In [44950]) Branching to tls-alpn-npn-7860-2.

comment:15 Changed 4 years ago by hawkowl

(In [44951]) apply patch from lukasa, refs #7860

comment:16 Changed 4 years ago by hawkowl

Applied the patch, builders spun.

The twisted-coverage.py builder (which is also debian7-py27) has PyOpenSSL 0.15.1, and so does debian8-py27.

comment:17 Changed 4 years ago by hawkowl

Keywords: review removed
Owner: set to Cory Benfield

Hi lukasa,

Thanks for the patch.

As mentioned on IRC, there are three failing tests on the builders that do not have PyOpenSSL installed. There are also some failures on the twistedchecker builders -- please see which ones are appropriate to fix (https://buildbot.twistedmatrix.com/builders/twistedchecker/builds/3036/steps/run-twistedchecker/logs/new%20twistedchecker%20errors)

Please address these issues and resubmit. Thanks again!

comment:18 Changed 4 years ago by hawkowl

Also, it fails on the Python 3 builder with PyOpenSSL 0.15.1. https://buildbot.twistedmatrix.com/builders/debian7-py3.3/builds/145/steps/shell/logs/stdio

comment:19 Changed 4 years ago by Cory Benfield

Keywords: review added
Owner: Cory Benfield deleted

Ok, I believe I've addressed those problems and the test failures. Hopefully this new patch will go more smoothly.

Changed 4 years ago by Cory Benfield

Attachment: 7860_4.patch added

Patch version 4

comment:20 Changed 4 years ago by Adi Roiban

Branch: branches/tls-alpn-npn-7860-2branches/tls-alpn-npn-7860-4

(In [44957]) Branching to tls-alpn-npn-7860-4.

comment:21 Changed 4 years ago by Adi Roiban

Thanks for the changes.

I still did not had time to set up the required openssl boxes to test this patch.

I have created a new branch based on the latest patch and sent it to the current builders.

Thanks!

comment:22 Changed 4 years ago by Cory Benfield

Thanks adiroban!

Looks like I missed an edge case in the tests. The new patch (number 5, sigh) should fix that problem. It isn't anything structurally important in the tests, just missed that we might want to run an NPN-only test case in an environment without ALPN.

Changed 4 years ago by Cory Benfield

Attachment: 7860_5.patch added

Fifth draft patch

comment:23 Changed 4 years ago by Adi Roiban

Branch: branches/tls-alpn-npn-7860-4branches/tls-alpn-npn-7860-5

(In [44983]) Branching to tls-alpn-npn-7860-5.

comment:24 Changed 4 years ago by Adi Roiban

Branch: branches/tls-alpn-npn-7860-5branches/tls-alpn-npn-7860-4

Thanks for the latest patch. I have sent it to buildbot and it looks good.

I have also created a ticket to track the progress of adding a new buildslave on which we can run the new tests. https://github.com/twisted-infra/braid/issues/88

Is it OK to use Debian Sid for testing your patch?

comment:25 Changed 4 years ago by Adi Roiban

Branch: branches/tls-alpn-npn-7860-4branches/tls-alpn-npn-7860-5
Keywords: review removed
Owner: set to Cory Benfield

Hi,

I have installed a Sid docker and was able to run the test suite on pyopenssl 0.15.1 and openssl 1.0.2. Tests passed.

Will run the test on debian:testing


There is a problem with the patch. ALPNTest should be renamed to ALPNTests , same story for the other test suite class.


Do you have a real-world use case for this new feature so that I can run a functional and inter-operational test?

Thanks!

comment:26 Changed 4 years ago by Cory Benfield

Ok, will upload a new patch that contains that change: thanks adiroiban.

Do you have a real-world use case for this new feature so that I can run a functional and inter-operational test?

Sure: this can be used to connect to HTTP/2 servers and ask them to speak HTTP/2. You can try connecting to http2bin.org, which is a server that I run that can speak both HTTP/1.1 and HTTP/2. If you use NPN/ALPN here you should be able to play with the settings and confirm that the correct protocols are negotiated.

If you really want to go further let me know and I can give you a fake HTTP/2 protocol that can make a single web request.

Changed 4 years ago by Cory Benfield

Attachment: 7860_6.patch added

Sixth draft patch

comment:27 Changed 4 years ago by Cory Benfield

Keywords: review added
Owner: Cory Benfield deleted

Ok, the sixth draft patch has been uploaded, which contains the change to the test class names.

comment:28 Changed 4 years ago by Adi Roiban

Keywords: review removed
Owner: set to Adi Roiban

Thanks!

I have also installed an debian stable container with openssl 1.0.1k and run the tests.

All good.


I think that NPNAndALPNAbsentTest should be renamed to NPNOrALPNAbsentTest as I see that the test is skipped when npn or alpn is present.

Same for NPNAndALPNTest. Should be renamed to NPNOrALPNTest as I see that the test are executed when npn or alpn is present :)

Sorry for this late comment.


Now regarding the functional and interoperability tests.

It would help if you have any Twisted code based on this branch which makes an client request to https://http2bin.org/ or a http2/http1 server which accepts client requests.

If you don't have such a code, that is fine. I will work an creating it. Maybe can be used as an official example.

Thanks!

comment:29 Changed 4 years ago by Adi Roiban

Owner: changed from Adi Roiban to Cory Benfield

comment:30 Changed 4 years ago by Cory Benfield

For functional testing, feel free to give this gist a try: https://gist.github.com/Lukasa/57bd9a7332991821329d It contains a basic HTTP/2 client.

I think that NPNAndALPNAbsentTest should be renamed to NPNOrALPNAbsentTest as I see that the test is skipped when npn or alpn is present.

While that's true, ALPN can only be present when NPN is also present: that means the cases where ALPN is present are a subset of the cases where NPN is present. For that reason, if NPN is absent ALPN must also be absent.

Same for NPNAndALPNTest. Should be renamed to NPNOrALPNTest as I see that the test are executed when npn or alpn is present :)

This one is right, so I'll rename it.

Changed 4 years ago by Cory Benfield

Attachment: 7860_7.patch added

Seventh draft patch

comment:31 Changed 4 years ago by Cory Benfield

Keywords: review added
Owner: Cory Benfield deleted

Alright, patch 7 contains that small change.

comment:32 Changed 4 years ago by Adi Roiban

Branch: branches/tls-alpn-npn-7860-5branches/tls-alpn-npn-7860-7

(In [45011]) Branching to tls-alpn-npn-7860-7.

comment:33 Changed 4 years ago by Adi Roiban

Keywords: review removed
Owner: set to Cory Benfield

Thanks for the latest patch.

I have sent it to buildbot.

I tried your gist and it works.

I tried to modify it and check that http/1.1 also works when h2 is not present.

I ended up with this gist https://gist.github.com/adiroiban/a88e7127671ca24647c8 ... but it does not switch to http1 , but rather gets [('SSL routines', 'SSL3_GET_SERVER_HELLO', 'parse tlsext')]

Please take a look at my gist and let me know what is wrong.

My test scenarios are:

  • extraCertificateOptions={'nextProtocols': [b'h2', b'http/1.1']}
  • `extraCertificateOptions={'nextProtocols': [b'bla', b'http/1.1']}
  • extraCertificateOptions={'nextProtocols': []}
  • extraCertificateOptions={'nextProtocols': [b'tra', b'la']}

I don't care what data is sent/received. I only care about protocol negociation over TLS

Maybe we can refactor the gist something like this https://gist.github.com/adiroiban/43ffa442ab23e5ff5ba8 ... and commit it as an example together with this patch.

What do you think?

Thanks!

comment:34 Changed 4 years ago by Cory Benfield

Owner: changed from Cory Benfield to Adi Roiban

Sorry I lost track of this, but I'm free up to spend some time working on it now.

To your questions, you'll want to make two changes to your second gist: the first is that you'll want to switch your example string from b'SOMETHING\r\n\r\n' to b'PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n'. The main reason for that is that those are the bytes initially sent by all HTTP/2 clients, so the remote server will be a little happier.

The second thing you might want to do is change your target. http2bin.org's web server does something slightly odd when you don't have h2 in your ALPN list, which is that it doesn't return ALPN at all. This means, when I run your second gist on my machine, I get next protocol as None in my output. However, if I change your target to, say, Google, I get http/1.1 in the output.

This probably is a little bit better for the sake of an example. If you'd like the example committed with this patch, let me know where you want me to put it.

comment:35 Changed 4 years ago by Adi Roiban

Owner: changed from Adi Roiban to Cory Benfield

Changed 4 years ago by Cory Benfield

Attachment: 7860_8.patch added

Draft 8 of the patch, containing an example

comment:36 Changed 4 years ago by Cory Benfield

Keywords: review added
Owner: changed from Cory Benfield to Adi Roiban

Ok, I've uploaded draft 8 of the patch, which now contains an example. Adi, care to take another swing at reviewing this?

Changed 4 years ago by Cory Benfield

Attachment: 7860_9.patch added

Ninth draft, headers added to the example.

comment:37 Changed 4 years ago by Adi Roiban

Branch: branches/tls-alpn-npn-7860-7branches/tls-alpn-npn-7860-9

(In [45804]) Branching to tls-alpn-npn-7860-9.

comment:38 Changed 4 years ago by Adi Roiban

Keywords: review removed
Owner: changed from Adi Roiban to Cory Benfield

Many thanks for your changes.

I have applied it and send them to buildbot.

While reviewing your changes I made a few changes. Here are my changes - https://github.com/twisted/twisted/commit/db5098e4431ff1868f5d5b40fbe136005c5e5780

They are brutal, but I hope that they help.


The docstring from getContext contains an incomplete raise part.


I have converted self.nextProtocol to self._nextProtocol as I don't see a reason why it should be public... and if you think that it needs to be public then you need to document it as an ivar... but I think it should stay private and only pass it at init time


Is it ok to ignore APN NotImplementedError error. What would happen if we have PyOpenSSL 0.15 + openSSL 1.0.1 ? I don't see a test for this case.


I think that is best to link the example from the ssl.rst file, rather from the generic examples... you can put the example link at the end of NPN ALPN docs

Also, I think that tls_alpn_npn.py should be renamed into tls_alpn_npn_client.py as it performs the client side negotiation.


The self.deferred from the NegotiatedProtocol needs to be documented.

I am not sure that ALPNOnlyOptions is a good idea... we should not inherit from OpenSSLCertificateOptions but rather create a minimal Options object for testing. Inheritance and overwrite can have side-effects which are not easy to detect... and we are in the testing area.


Please try to avoid the abbreviations ... sKey is serverKey ? clientConn is clientConnnection ?.. and many more.


I think that the loopback help method should be renamed to connectOverLoopback .. or something which hint that it does a connection.

Also when calling the lookback() method you can provide the argument names to make it easier to read.

We don't have test for the test code and is critical for the test code to be easy to read and understand to avoid any confusion.

For example:

        self.loopback(
            serverOptions=sslverify.OpenSSLCertificateOptions(
                privateKey=self.sKey,
                certificate=self.sCert,
                verify=True,
                requireCertificate=True,
                caCerts=[self.cCert],
                nextProtocols=protocols,
            ),
            clientOptions=sslverify.OpenSSLCertificateOptions(
                privateKey=self.cKey,
                certificate=self.cCert,
                verify=True,
                requireCertificate=True,
                caCerts=[self.sCert],
                nextProtocols=protocols,
            ),
        )

Since all loopback calls use the same client/server keys maybe we can reduce it to something like this... but maybe it does not help to much... so this is just minor comment.

In test, I prefer to highlight the parts which are critical to the test and try to hide the supporting code.

self.connectOverLoopback(
    serverNextProtocols=something,
    clientNextProtocol=somethingElse,
    )

self.connectOverLoopback(
    serverNextProtocols=something,
    clientOptions=NPNOnlyOptions,
    clientNextProtocol=somethingElse,
    )


In the docs you said that connection fail when a common protocol is not agreed. in test_NPNAndALPNFailure I don't see any failure... it looks to me like the connection is successfully made and you can then call getNextProtocol().


I see that ALPNTests, NPNOrALPNTests, NPNAndALPNAbsentTests have the same setUp/tearDown loopback() methods. Please refactor them into a common test case.


Please check my comments + my diff and submit a new patch.

Many thanks for your help!

comment:39 Changed 4 years ago by Cory Benfield

Is it ok to ignore APN NotImplementedError error. What would happen if we have PyOpenSSL 0.15 + openSSL 1.0.1 ? I don't see a test for this case.

I believe it is, yes. If ALPN is not implemented, NPN may be, and this patch assumes that having just NPN is acceptable. If both ALPN and NPN are missing, then NotImplementedError will still be raised because we don't try to catch that. Basically, here's the support matrix for ALPN/NPN:

  • OpenSSL < 1.0.1: No NPN, no ALPN. Raises NotImplementedError.
  • OpenSSL >= 1.0.1, < 1.0.2: Yes NPN, no ALPN. Does not raise, does not use ALPN, uses NPN.
  • OpenSSL >= 1.0.2: Yes NPN, Yes ALPN. Does not raises, uses ALPN and NPN, prefers ALPN.

I am not sure that ALPNOnlyOptions is a good idea... we should not inherit from OpenSSLCertificateOptions but rather create a minimal Options object for testing. Inheritance and overwrite can have side-effects which are not easy to detect... and we are in the testing area.

I am sympathetic to this, but it's not at all clear to me what such a minimal Options object would look like. The Options object is sufficiently complex that attempting to write a minimal version of it probably requires more familiarity with the interface than I have.

Please try to avoid the abbreviations ... sKey is serverKey ? clientConn is clientConnnection ?.. and many more.

I think that the loopback help method should be renamed to connectOverLoopback .. or something which hint that it does a connection.

In this instance I was trying to imitate the style used in the rest of the test file, which uses those names. I'll update the patch to not use them.

In the docs you said that connection fail when a common protocol is not agreed. in test_NPNAndALPNFailure I don't see any failure... it looks to me like the connection is successfully made and you can then call getNextProtocol().

Ah, apologies, you're right. What I meant to say is that the connection *may* fail if the remote peer chooses to terminate it, but it may also succeed with None being negotiated. I'll update.

Changed 4 years ago by Cory Benfield

Attachment: 7860_10.patch added

Tenth draft patch

comment:40 Changed 4 years ago by Cory Benfield

Keywords: review added
Owner: changed from Cory Benfield to Adi Roiban

Ok, I've just submitted the tenth draft patch which I believe addresses all your feedback (excepting what I said in my previous comment). Back to you for more, adiroiban.

comment:41 Changed 4 years ago by Adi Roiban

Branch: branches/tls-alpn-npn-7860-9branches/tls-alpn-npn-7860-10

(In [45849]) Branching to tls-alpn-npn-7860-10.

comment:42 Changed 4 years ago by Adi Roiban

Keywords: review removed
Owner: changed from Adi Roiban to Cory Benfield

Thanks for the latest patch.

You forgot to add the example to the last patch.

I have created a branch for latest patch and I am sending it to the builbot...

Is a bit late here so I will do the review tomorrow ... but meanwhile here are my initial comments.


If ALPN is not implemented, NPN may be, and this patch assumes that having just NPN is acceptable.

Many thanks for your comment. It makes total sense now :)

This should be a documented as a comment, just before ignoring that NotImplementedError.

I think that ignoring exceptions without a comments is a bad practice.


For the misc file

takes a nextProtos parameter

Please use the full name :)

Is the comma required?

Here is my rephasing... I am not a native English speaker.

twisted.internet.ssl.CertificateOptions now takes a nextProtocols parameter that enables negotiation of the protocol used after the TLS handshake. It uses ALPN or NPN.

Some tests are still failing - https://buildbot.twistedmatrix.com/boxes-supported?branch=/branches/tls-alpn-npn-7860-10

We can not merge this branch if test are not green.

You can check the rendered narative documentation here:

https://buildbot.twistedmatrix.com/builds/docs/doc-a38a9a4802c2b5c9e4f80924fd7f7699c7869a44.tar.bz2

Thanks!

Changed 4 years ago by Cory Benfield

Attachment: 7860_11.patch added

Eleventh draft patch

comment:43 Changed 4 years ago by Cory Benfield

Keywords: review added
Owner: changed from Cory Benfield to Adi Roiban

I've uploaded the eleventh draft patch. This should contain fixes for everything you mentioned above, adiroiban, as well as fix the broken unit tests.

comment:44 Changed 4 years ago by Adi Roiban

Branch: branches/tls-alpn-npn-7860-10branches/tls-alpn-npn-7860-11

(In [45870]) Branching to tls-alpn-npn-7860-11.

Changed 4 years ago by Cory Benfield

Attachment: 7860_12.patch added

Twelfth draft patch

comment:45 Changed 4 years ago by Cory Benfield

Uploaded another draft patch that should resolve the test failures.

comment:46 Changed 4 years ago by Adi Roiban

Branch: branches/tls-alpn-npn-7860-11branches/tls-alpn-npn-7860-12

(In [45888]) Branching to tls-alpn-npn-7860-12.

comment:47 Changed 4 years ago by Adi Roiban

Some twistedchecker errors ... while the others tests are running

W9013:1810,0: Expected 3 blank lines, found 2
C0103:1852,33:NPNAndALPNBaseTestCase._buildCertificateOptions: Invalid name "ca_certificate" for type variable (should match ([a-z_][a-zA-Z0-9]*)$)

Changed 4 years ago by Cory Benfield

Attachment: 7860_13.patch added

Thirteenth draft patch

comment:48 Changed 4 years ago by Cory Benfield

Alright, looks like all the buildbots are green: we just have the twistedchecker moans. I've uploaded a new patch that I believe will fix them.

comment:49 Changed 4 years ago by Adi Roiban

Branch: branches/tls-alpn-npn-7860-12branches/tls-alpn-npn-7860-13

(In [45894]) Branching to tls-alpn-npn-7860-13.

comment:50 Changed 4 years ago by Glyph

Keywords: review removed
Owner: changed from Adi Roiban to Cory Benfield

Wow. 13 revisions already. Dang. This is quite a patch :). Thanks for your contribution and your enduring patience.

I do have some feedback, and it is of the "design feedback on a public API we have to commit to" variety. Therefore, I'm going to leave it all up to your discretion whether you want to implement my suggestions or not; if you'd prefer to leave things as they are I'm OK with that. If you were a committer I'd say you could land if you liked; however, since I'd really like you to at least have a chance to consider dealing with some of this stuff before landing it, I'm going to reassign it to you. If you'd rather just land it as-is, then feel free to just reply to this and assign it to me and I will do so.

Now. On to the feedback!

  1. You're implementing detection of ALPN/NPN in 3 places:
    1. In getContext you are doing some tricks to avoid blowing up in an untoward way. However, getContext is awful late to be doing this, because it is called by framework code which is ready to make a connection. I'd really want to know that I have no next-protocol negotiation option at construction time when I build the CertificateOptions instance so I can easily fix the code that generated it.
    2. in docs/core/howto/ssl.rst you document the versions of OpenSSL and pyOpenSSL which are required so that users can detect it for themselves (although you do not cover the (different!) version-detection-and-comparison idioms required to discover the version of OpenSSL you have and the version of pyOpenSSL you have)
    3. In the tests, you compute both skipALPN and skipNPN.
    I would very much prefer it if you could instead expose a single function that does this computation, like supportsProtocolNegotiation which returned False, APLN, or NPN, perhaps via a twisted.python.constants.FlagConstant instance.
  2. Given how much care has gone into the API design on this, it seems a pity that it has to be accessed (in the official example, no less) via extraCertificateOptions. It seems like this is a first-class feature which ought to be supported directly by optionsForClientTLS.
  3. Some concerns about getNextProtocol:
    1. We typically try to avoid get in method names; if it's just an attribute that defines the next protocol and it might have one of those values, why not make it an @ivar nextProtocol rather than a method? (You can always implement it with @property if it actually needs to run a little code to retrieve it.)
    2. You should probably know about the ticket about handshake completion notification, since getNextProtocol is not actually legit to call until that has happened. The example may want to link to it; however, getNextProtocol should.
    3. The @rtype should either be L{bytes} or L{unicode}, not L{str}. It seems like L{bytes} would be symmetric with the input here.
    4. Is it ever important / useful for applications to know whether ALPN or NPN was used? I'm getting the impression that this is a pointless implementation detail, so maybe nothing to do here.
    5. We typically put new methods or attributes like this onto a formal interface (like for example, ISSLTransport) so that applications can do ISomething.providedBy(self.transport) rather than blindly hasattring or isinstanceing a too-specific concrete type. One glorious day, I'd love to get rid of OpenSSL, and to facilitate that, the more of this stuff that is abstractly defined, the better. You'll notice we've managed to mostly keep TLSMemoryBIOProtocol out of the documentation, and I'd like to keep it that way.
  4. Why no ALPN/NPN server example? Is that a separate ticket? It seems like this would be the first thing someone would want to implement? In the documentation, you only mention CertificateOptions but then the example uses optionsForClientTLS, which is a bit confusing.
  5. We're trying to move away from base classes as an API for constructing test cases. Inheritance is generally bad, and the proliferation of base-classes-as-test-fixtures makes it very hard to compose integrations between different test systems as they arise. Instead, consider making NPNAndALPNBaseTestCase into an object whose setUp is just the constructor (that takes a test case) and whose tearDown is just a callback added via TestCase.addCleanup on the TestCase passed to NPNAndALPNHelper's constructor.
  6. You're using the global reactor in the tests. This makes the tests both slower and less reliable than necessary.
    1. You're not quite doing all the steps you need to do to completely reliably tear down the connections in the tests. To summarize: you absolutely need a Deferred to make sure your protocol connections have gone away. (The complexity of this tear-down is a good reason to avoid the real reactor.)
    2. This is duplicating code. If you were going to continue to use the real reactor (please don't) please refactor OpenSSLOptionsTests.loopback to be suitable for your uses.
    3. Luckily there's an alternative! Please check out the loopbackTLSConnection in test_sslverify which sets up an in-memory TLS connection and hooks them together. You might need to modify this slightly but it should do almost everything that you need, and make your tests faster and more reliable by getting socket I/O out of the picture. (One of the nice things about working on Twisted itself is that you get to use not-yet-public APIs like twisted.test.iosim... although we should really get around to making that public)

Please address as you see fit and resubmit. Thanks again!

comment:51 Changed 4 years ago by Cory Benfield

So using FlagConstant here turns out to be quite painful, because FlagConstant doesn't really define any logic for having no flags set. You can define a FlagConstant with a numerical value of 0, but that doesn't work quite right because it gets "remembered". This means you end up having a return value of False *or* FlagConstant. That leads to some nasty logic where you have to say "if not x...elif y in x" which is remarkably unclear.

The other problem with switching to this "single query point" is that it gets harder to signal whether the issue is PyOpenSSL or OpenSSL. Currently in the draft patch the notification about skipping the tests is really clear, and points out whether PyOpenSSL or OpenSSL is at fault. Switching to a central place to check this means we lose that bit of information when we need it, because most of the time we don't. This would make me inclined to leave the check in test_sslverify.py in place anyway.

Additionally, most of the work being done in getContext is not removable, because we need to tolerate having only ALPN or NPN available. This means we still need plenty of conditional logic (either 'try...except' or 'if'). For that reason, the only place I can really see this being useful is in the constructor to openSSLCertificateOptions, where it might save us a tiny bit of work, and as a one-stop-shop for Twisted users to validate what's going on. Still a good idea, I'm still going to add it, but it's worth bearing in mind that it is somewhat limited as a tool.

Aside from this caveat, I've done everything else you mentioned here. There are some notes below, but they are just clarifying points on how I ended up doing things.

We're trying to move away from base classes as an API for constructing test cases.

Switching to using loopbackTLSConnection was a little bit of work because it was writing files to disk, and who needs that, but the end result means that I don't actually need the base class anymore anyway. For that reason, it's simply gone. I also refactored loopbackTLSConnection, because it currently writes files to disk. I didn't want to change that behaviour for the existing tests, but it seems totally unnecessary, so I wrote a new implementation that basically just doesn't write to disk. No real drama. The new version is also slightly more flexible because it lets you specify the OpenSSLCertificateOptions you want to use, which is handy.

Why no ALPN/NPN server example?

Nope, just an oversight. In an effort to keep it simple, I've taken my Twisted HTTP/2 example server[0] and stripped out all the HTTP/2 bits. If you want a more fully-featured example server, you're welcome to include that one in its entirety: it's licensed under the MIT license.

Note that I didn't upgrade the client example to connect to the server example expressly because it's a pain to get optionsForClientTLS to stop verifying certificates, which means the handshake never succeeds and you can never see the result working. Instead, in the docstring for the server example, I provided an openssl command line that can be used to test.

Draft 14 of the patch, which I'll upload shortly, should contain all these changes. Please let it run through the buildbots though: it's a substantial patch. For reviewers, please keep a very close eye on the work I did with IALPNTransport and supportedProtocolNegotiationMechanisms: those are all brand new code and I'm not necessarily 100% happy with them, so opinions would be welcomed on those.

[0]: https://github.com/python-hyper/hyper-h2/blob/master/examples/twisted/twisted-server.py

Changed 4 years ago by Cory Benfield

Attachment: 7860_14.patch added

Draft patch 14

comment:52 Changed 4 years ago by Cory Benfield

Keywords: review added
Owner: changed from Cory Benfield to Glyph

Ok, draft patch is read to go, review is welcome!

comment:53 Changed 4 years ago by Adi Roiban

Branch: branches/tls-alpn-npn-7860-13branches/tls-alpn-npn-7860-14

(In [45923]) Branching to tls-alpn-npn-7860-14.

comment:54 Changed 4 years ago by Adi Roiban

Thanks for the latest patch.

I have applied the patch and sent it to buildbot ... twisted.test.test_sslverify.NPNOrALPNTests.test_NPNIsSupported is failing ... and maybe others

comment:55 Changed 4 years ago by Cory Benfield

Ah, sorry adiroiban and glyph, the test I had was simply not valid. Resubmitting with that test fixed.

Changed 4 years ago by Cory Benfield

Attachment: 7860_15.patch added

Draft fifteen

comment:56 Changed 4 years ago by Glyph

Author: hawkowl, adiroibanhawkowl, adiroiban, glyph
Branch: branches/tls-alpn-npn-7860-14branches/tls-alpn-npn-7860-15

(In [45927]) Branching to tls-alpn-npn-7860-15.

comment:57 in reply to:  51 Changed 4 years ago by Glyph

Keywords: review removed
Owner: changed from Glyph to Cory Benfield

Replying to Lukasa:

So using FlagConstant here turns out to be quite painful, because FlagConstant doesn't really define any logic for having no flags set. You can define a FlagConstant with a numerical value of 0, but that doesn't work quite right because it gets "remembered". This means you end up having a return value of False *or* FlagConstant. That leads to some nasty logic where you have to say "if not x...elif y in x" which is remarkably unclear.

Dang. Would you mind filing this - and by "this" I mean the invalid "being remembered" - as a bug?

I do have a better workaround for you, however. I think the important desirable attributes of a solution here would be:

  • you can do if supportedProtocolNegotiationMechanisms():
  • you can do if supportedProtocolNegotiationMechanisms() & ProtocolNegotiationSupportFlags.NPN:
  • you can do repr(supportedProtocolNegotiationMechanisms()) and see something meaningful.

So here's a hack to get this without modifying twisted.python.constants that will be forward-compatible with such a fix (I hope). Luckily you already documented this fix for me in your patch (via @cvar NOSUPPORT) ;-) so you need to fix it either way:

from twisted.python.constants import Flags, FlagConstant
class ProtocolNegotiationSupportFlags(Flags):
    NPN = FlagConstant(0x0001)
    ALPN = FlagConstant(0x0002)
ProtocolNegotiationSupportFlags.NOSUPPORT = (
    ProtocolNegotiationSupportFlags.NPN ^ ProtocolNegotiationSupportFlags.NPN
)

With that workaround, this program:

print("NPN?", bool(ProtocolNegotiationSupportFlags.NPN))
print("NOSUPPORT?", bool(ProtocolNegotiationSupportFlags.NOSUPPORT))
print("|?", repr(ProtocolNegotiationSupportFlags.NPN |
                 ProtocolNegotiationSupportFlags.NOSUPPORT))
print("Both?", bool(ProtocolNegotiationSupportFlags.ALPN |
                    ProtocolNegotiationSupportFlags.ALPN))

produces this output (as desired):

NPN? True
NOSUPPORT? False
|? <ProtocolNegotiationSupportFlags=NPN>
Both? True

Also, after typing those names a few times, I think we could make them a bit less wordy :). The flags suffix is probably unnecessary, and supportedProtocolNegotiationMechanisms could be shortend to nextProtocolMechanisms. Or some other abbreviation. But my wrists hurt after typing this section :).

The other problem with switching to this "single query point" is that it gets harder to signal whether the issue is PyOpenSSL or OpenSSL. Currently in the draft patch the notification about skipping the tests is really clear, and points out whether PyOpenSSL or OpenSSL is at fault. Switching to a central place to check this means we lose that bit of information when we need it, because most of the time we don't. This would make me inclined to leave the check in test_sslverify.py in place anyway.

I don't want to gold-plate the gestalt API here, but perhaps supportedProtocolNegotiationMechanisms (or whatever it's renamed to) could take an optional whyNot callable that would be called with a string describing why the support isn't available, if it isn't available. Most of the time you don't care, but for tests you might. Or, rather than the blank NOSUPPORT given above, you could have multiple FlagConstant values with a value of 0, taking advantage of a quirk of twisted.python.constants, like so:

class ProtocolNegotiationSupportFlags(Flags):
    NPN = FlagConstant(0x0001)
    ALPN = FlagConstant(0x0002)
    OPENSSL_TOO_OLD = FlagConstant(0)
    PYOPENSSL_TOO_OLD = FlagConstant(0)

This gives you different kinds of 'no', which you can distinguish:

print("NPN?", bool(ProtocolNegotiationSupportFlags.NPN))
print("OPENSSL?", ProtocolNegotiationSupportFlags.OPENSSL_TOO_OLD,
      bool(ProtocolNegotiationSupportFlags.OPENSSL_TOO_OLD))
print("PYOPENSSL?", ProtocolNegotiationSupportFlags.PYOPENSSL_TOO_OLD,
      bool(ProtocolNegotiationSupportFlags.PYOPENSSL_TOO_OLD))
NPN? True
OPENSSL? <ProtocolNegotiationSupportFlags=OPENSSL_TOO_OLD> False
PYOPENSSL? <ProtocolNegotiationSupportFlags=PYOPENSSL_TOO_OLD> False

Additionally, most of the work being done in getContext is not removable, because we need to tolerate having only ALPN or NPN available. This means we still need plenty of conditional logic (either 'try...except' or 'if'). For that reason, the only place I can really see this being useful is in the constructor to openSSLCertificateOptions, where it might save us a tiny bit of work, and as a one-stop-shop for Twisted users to validate what's going on. Still a good idea, I'm still going to add it, but it's worth bearing in mind that it is somewhat limited as a tool.

My main concern is not that getContext shouldn't do this work, but rather that the users of this API should find out earlier if what they're trying isn't going to work, and the change to CertificateOptions.__init__ does that just fine.

However, I have a secondary concern; this code is full of giant comments about compatibility. We try hard in Twisted to keep all our maintainer documentation in docstrings, since inline comments are an indication that the code is not expressing itself well. This is not a blanket policy, since some code really will be nuanced, and it's better to have comments than not, but we lean away from it whenever possible. So, in this case, instead of the gigantic multi-line essay about pyOpenSSL raising AttributeError, you might do something like this:

supported = supportedProtocolNegotiationMechanisms()
if supported & NPN:
    ctx.set_npn_advertise_callback(npnAdvertiseCallback)
    ctx.set_npn_select_callback(protoSelectCallback)
if supported & ALPN:
    ctx.set_alpn_select_callback(protoSelectCallback)
    ctx.set_alpn_protos(self._nextProtocols)

This is way easier for a maintainer to read, and if they're interested in the nuances of why it would or wouldn't be supported they can read the implementation of supportedProtocolNegotiationMechanisms.

Also, getContext still documents that it @raises NotImplementedError, but that is no longer true; you should kill that part of the docstring. (Note that I didn't include any if self._nextProtocols and not supported block, because it's not possible to get there any more if you go through the constructor. Although that is one thing that might be worth having a brief comment about :))

npnAdvertiseCallback is only used in the NPN case, so I think it should be defined in that branch of the conditional.

Finally (about getContext, anyway), protoSelectCallback and npnAdvertiseCallback don't need docstrings. Straightforward inner callbacks (especially npnAdvertiseCallback) don't reqiure docs unless you think they need an explanation (twistedchecker should not be complaining).

Aside from this caveat, I've done everything else you mentioned here. There are some notes below, but they are just clarifying points on how I ended up doing things.

We're trying to move away from base classes as an API for constructing test cases.

Switching to using loopbackTLSConnection was a little bit of work because it was writing files to disk, and who needs that, but the end result means that I don't actually need the base class anymore anyway. For that reason, it's simply gone. I also refactored loopbackTLSConnection, because it currently writes files to disk. I didn't want to change that behaviour for the existing tests, but it seems totally unnecessary, so I wrote a new implementation that basically just doesn't write to disk. No real drama. The new version is also slightly more flexible because it lets you specify the OpenSSLCertificateOptions you want to use, which is handy.

Changing the behavior for the existing tests would actually have been great. You're right: writing to disk is totally irrelevant. It probably does that just because years of constant abuse from OpenSSL's APIs has made me cagey about believeing that either PyOpenSSL or OpenSSL itself could handle sharing a data structure between two different pieces of code without serializing it and deserializing it again.

Maybe we can do that refactoring in a subsequent ticket.

Why no ALPN/NPN server example?

Nope, just an oversight. In an effort to keep it simple, I've taken my Twisted HTTP/2 example server[0] and stripped out all the HTTP/2 bits. If you want a more fully-featured example server, you're welcome to include that one in its entirety: it's licensed under the MIT license.

Cool, thank you. I think the trivial example is good enough for pedagogical purposes, but hooray for license compatibility; thanks so much for selecting MIT for that work. It will make it way easier to broadly adopt.

Note that I didn't upgrade the client example to connect to the server example expressly because it's a pain to get optionsForClientTLS to stop verifying certificates, which means the handshake never succeeds and you can never see the result working. Instead, in the docstring for the server example, I provided an openssl command line that can be used to test.

Maybe it should be a little less annoying to do this, but on the other hand... no, no it shouldn't ;-). Lots of other frameworks make it easy to stop verifying certs and we've kinda seen how that plays out.

Draft 14 of the patch, which I'll upload shortly, should contain all these changes. Please let it run through the buildbots though: it's a substantial patch. For reviewers, please keep a very close eye on the work I did with IALPNTransport and supportedProtocolNegotiationMechanisms: those are all brand new code and I'm not necessarily 100% happy with them, so opinions would be welcomed on those.

So, apparently the test-skipping logic was broken by the changes to the gestalt API, and the builders are unhappy with this again.

Obviously this needs another round but I think that the API is pretty much the right shape now; the biggest changes are to fix up the gestalt.

Last edited 4 years ago by Glyph (previous) (diff)

comment:58 Changed 4 years ago by Cory Benfield

Dang. Would you mind filing this - and by "this" I mean the invalid "being remembered" - as a bug?

Done: see #8074.

Changed 4 years ago by Cory Benfield

Attachment: 7860_16.patch added

Sixteenth draft patch

comment:59 Changed 4 years ago by Cory Benfield

Keywords: review added
Owner: changed from Cory Benfield to Glyph

Alright, patch 16 is up. Let's make sure we send it to the builders again: I wrote a new test for a topology I don't have easy access to and I want to make sure it actually works!

comment:60 Changed 4 years ago by Adi Roiban

Branch: branches/tls-alpn-npn-7860-15branches/tls-alpn-npn-7860-16

(In [46003]) Branching to tls-alpn-npn-7860-16.

Changed 4 years ago by Cory Benfield

Attachment: 7860_17.patch added

Seventeenth draft patch

comment:61 Changed 4 years ago by Cory Benfield

There were problems with patch 16, so I uploaded a patch that I believe fixes that.

comment:62 Changed 4 years ago by Adi Roiban

Branch: branches/tls-alpn-npn-7860-16branches/tls-alpn-npn-7860-17

(In [46005]) Branching to tls-alpn-npn-7860-17.

Changed 4 years ago by Cory Benfield

Attachment: 7860_18.patch added

Eighteenth draft patch

comment:63 Changed 4 years ago by Cory Benfield

Ok, the prior patch failed twistedchecker. 18 should do better.

comment:64 Changed 4 years ago by Adi Roiban

Branch: branches/tls-alpn-npn-7860-17branches/tls-alpn-npn-7860-18

I don't know why the branch was not auto-udpated

comment:65 Changed 4 years ago by Adi Roiban

Many thanks for the latest patch.

I think that it looks great... with example, documenation.. state of the art :) Many thanks for all your hard work.

Here is my initial code review.

I have not yet check / reviewed the tests coverage.


The new server example should also be listed in docs/core/examples/index.rst or remove the client example from there


For client example, for the TLS_TRIGGER_DATA maybe we need to add a comment to the link to the Twisted bug due to which this is still needed


For the server example

It will print what protocol was negotiated and exit.

maybe this can be a docstring for BasicResponderProtocol or maybe use a better name for the BasicResponderProtocol ... Basic in what? Reponser .... how?

To me, it does not look like the BasicResponderProtocol will triger an exit. Exit from what?

To me, it just accept any connection, for each data receive it prints the next protocol and waits for client to close the connection.


For server example maybe we can have the LISTEN_PORT as a top variable/constant


In _sslverify for

# This is a workaround for #8074: remove when that issue is fixed.

please use the standerd FIXME format

http://twistedmatrix.com/documents/current/core/development/policy/coding-standard.html#comments


The nextProtocols docstring from optionsForClientTLS, ClientTLSOptions, TLSMemoryBIOProtocol and IALPNTransport are somehow duplicated ... Later when we update the code we migh forget to update the other parts.

Not sure how to handle this... or how to add a note to prevent getting them out of sync.

Maybe we can just document them in one part, and from the other part we just put a link with @see: L{IALPNTransport.nextProtocol}

Also, the note about the handshake completion should come with a link to the ticket

In this way, if the ticket is fixed we can do a search on the code and also update all places where this bug is referred


for loopbackTLSConnectionWithoutFiles , instead and used a name based on what is is not... maybe we should name it based on what it is...ex

loopbackTLSConnectionInMemory :)


I will try to find some time to get VM/dockers with various openssl version and give the client/server example a try.

Thanks!

comment:66 Changed 4 years ago by Adi Roiban

I have paired the client with the server and I got the following errros

$ py docs/core/examples/tls_alpn_npn_server.py
/home/adi/chevah/twisted/twisted/python/modules.py:786: UserWarning: docs/core/examples (for module __main__) not in path importer cache (PEP 302 violation - check your local configuration).
  return theSystemPath[moduleName]




Connection made
Connection lost due to error [Failure instance: Traceback: <class 'OpenSSL.SSL.Error'>: [('SSL routines', 'SSL3_READ_BYTES', 'tlsv1 alert unknown ca'), ('SSL routines', 'SSL3_READ_BYTES', 'ssl handshake failure')]
/home/adi/chevah/twisted/twisted/internet/posixbase.py:597:_doReadOrWrite
/home/adi/chevah/twisted/twisted/internet/tcp.py:209:doRead
/home/adi/chevah/twisted/twisted/internet/tcp.py:215:_dataReceived
/home/adi/chevah/twisted/twisted/protocols/tls.py:422:dataReceived
--- <exception caught here> ---
/home/adi/chevah/twisted/twisted/protocols/tls.py:360:_flushReceiveBIO
/home/adi/chevah/twisted/build/local/lib/python2.7/site-packages/OpenSSL/SSL.py:1320:recv
/home/adi/chevah/twisted/build/local/lib/python2.7/site-packages/OpenSSL/SSL.py:1187:_raise_ssl_error
/home/adi/chevah/twisted/build/local/lib/python2.7/site-packages/OpenSSL/_util.py:48:exception_from_error_queue
]
$ py docs/core/examples/tls_alpn_npn_client.py
Connection made
Connection lost due to error [Failure instance: Traceback: <class 'OpenSSL.SSL.Error'>: [('SSL routines', 'SSL3_GET_SERVER_CERTIFICATE', 'certificate verify failed')]
/home/adi/chevah/twisted/twisted/internet/posixbase.py:597:_doReadOrWrite
/home/adi/chevah/twisted/twisted/internet/tcp.py:209:doRead
/home/adi/chevah/twisted/twisted/internet/tcp.py:215:_dataReceived
/home/adi/chevah/twisted/twisted/protocols/tls.py:415:dataReceived
--- <exception caught here> ---
/home/adi/chevah/twisted/twisted/protocols/tls.py:554:_write
/home/adi/chevah/twisted/build/local/lib/python2.7/site-packages/OpenSSL/SSL.py:1271:send
/home/adi/chevah/twisted/build/local/lib/python2.7/site-packages/OpenSSL/SSL.py:1187:_raise_ssl_error
/home/adi/chevah/twisted/build/local/lib/python2.7/site-packages/OpenSSL/_util.py:48:exception_from_error_queue
]

this is on Ubuntu 14.04

>>> OpenSSL.__version__
'0.15.1'

$ openssl version
OpenSSL 1.0.1f 6 Jan 2014

comment:67 Changed 4 years ago by Cory Benfield

Adi, I mentioned this quite a few comments up (this has become a massive issue so it's hard to find stuff like this):

Note that I didn't upgrade the client example to connect to the server example expressly because it's a pain to get optionsForClientTLS to stop verifying certificates, which means the handshake never succeeds and you can never see the result working. Instead, in the docstring for the server example, I provided an openssl command line that can be used to test.

In short: the client and the server don't work together, and won't unless you can get a valid TLS cert for wherever you put the server. Otherwise, it's just impossible to get Twisted to be happy to establish that connection. I recommend testing the server using a browser or openssl (there's a sample OpenSSL command line in the server).

Changed 4 years ago by Cory Benfield

Attachment: 7860_19.patch added

Nineteenth draft patch

comment:68 Changed 4 years ago by Cory Benfield

Otherwise, I've addressed your feedback Adi, and uploaded patch 19.

Changed 4 years ago by Cory Benfield

Attachment: 7860_20.patch added

Twentieth draft patch

comment:69 Changed 4 years ago by Cory Benfield

On adiroiban's advice, I've submitted a new patch, patch 20. This patch adjusts the client and server examples so they can talk to each other by re-minting the server.pem file. I did my best to keep it as close to the same as the original server.pem, with the following changes:

  • Changed the hash algorithm from MD5 to SHA256 (because, really?).
  • Added 'localhost' as a subjectAlternativeName domain to the cert, allowing it to be used for loopback connections.
  • Changed its expiration date to be 2045 so we can avoid needing to touch it again for a *long* time.

Nothing else changed.

It's my hope, therefore, that all the examples will continue to work.

I also slightly adjusted the example server so that it would send data back before tearing the connection down, so that both the client and the server print the protocol they agreed on.

comment:70 Changed 4 years ago by Adi Roiban

Branch: branches/tls-alpn-npn-7860-18branches/tls-alpn-npn-7860-20

(In [46043]) Branching to tls-alpn-npn-7860-20.

comment:71 Changed 4 years ago by Adi Roiban

Thanks for the latest patch.

Automated tests are green and I can use both client and server examples.


I have also tried it with pyOpenSSL 0.13 and the NotImplementedError is raised.


If there is no overlap between the client and the server with OpenSSL 1.0.1 I get

On the client side

Connection lost due to error [Failure instance: Traceback: <class 'OpenSSL.SSL.Error'>: [('SSL routines', 'SSL3_GET_SERVER_HELLO', 'parse tlsext')]

On the server side

Connection lost due to error [Failure instance: Traceback: <class 'OpenSSL.SSL.Error'>: [('SSL routines', 'SSL3_READ_BYTES', 'tlsv1 alert internal error'), ('SSL routines', 'SSL3_READ_BYTES', 'ssl handshake failure')]

This is related to test test_NPNAndALPNFailure ... which contain the Failure in the test name, but I can see that the connection is successfully created in the test... and in my manual test it fails.

Somehow the test are not in sync with my manual tests.


I have done the tests with OpenSSL 1.0.1...

I am now waiting for my container to have OpenSSL 1.0.2 and give it a try.

comment:72 Changed 4 years ago by Adi Roiban

I tried the example without matching protocols on OpenSSL 1.0.2d in stretch/sid

client side error

Connection lost due to error [Failure instance: Traceback: <class 'OpenSSL.SSL.Error'>: [('SSL routines', 'SSL23_GET_SERVER_HELLO', 'tlsv1 alert internal error')]

server side error

Connection lost due to error [Failure instance: Traceback: <class 'OpenSSL.SSL.Error'>: [('SSL routines', 'ssl3_get_client_hello', 'parse tlsext')]

but I don't know why I get ssl3 errors... when only TLS was supposed to be used.

I have also explicitly tried method=OpenSSL.SSL.TLSv1_METHOD ... maybe this is just how OpenSSL is reusing some SSL3 code in TLS1

comment:73 Changed 4 years ago by Adi Roiban

This ticket also adds nextProtocols in optionsForClientTLS and it adds the new IALPNTransport interface.

I think that this should be part of the news fragment.

comment:74 Changed 4 years ago by Cory Benfield

Aha, that explains a lot. I was wondering why the behaviour I'd seen back when I started working on this (failure to negotiate loses the connection) went away, and instead the connection was set up with nextProtocol None. But I've worked it out.

If one peer does not negotiate a protocol *at all* (that is, does not provide nextProtocol, or provides an empty list), then the handshake works fine. However, if both peers provide protocols with no overlap at all, the connection fails.

I'm going to quickly dive in and see if that's a guaranteed behaviour, or if we change it, because some servers in wild don't exhibit that behaviour. However, if we can't change it then I'll add a test that verifies that logic too.

Changed 4 years ago by Cory Benfield

Attachment: 7860_21.patch added

Twenty-first draft patch

comment:75 Changed 4 years ago by Cory Benfield

Ok, I added those documentation notes and updated the news fragment. Patch 21 also adds a new test for the two kinds of connection negotiation behaviours, so it needs to spin around the builders again to confirm that it's all good.

comment:76 Changed 4 years ago by Adi Roiban

Branch: branches/tls-alpn-npn-7860-20branches/tls-alpn-npn-7860-21

(In [46052]) Branching to tls-alpn-npn-7860-21.

comment:77 Changed 4 years ago by Adi Roiban

Thanks for the last patch.

Don't forget to also update the public.pem so that it is in sync with server.pem


ssl.rst

It talks about supportedProtocolNegotiationMechanisms and does not mention optionsForClientTLS.

For the

If ALPN or NPN are used and no overlap can be found, then the remote peer may choose to terminate the connection.

This may cause the TLS handshake to fail, or may result in the connection being torn down immediately after being made.

maybe we can have it as a warning or a note block... to stand out as I find that this is important.


_sslverify.py

I don't see any tests in which nextProtocolMechanisms is used with the whyNot callback.

I don't see ProtocolNegotiationSupport.NOSUPPORT used in the test.

Do we have coverage for these parts?


Minor comment for tests.

Rather than having many test calling negotiateProtocol and then checking that reason is None, maybe we can have

negoticateProtocolsWithSuccess and negoticateProtocolsWithFailure and pass the error only in negoticateProtocolsWithFailure.

Maybe the docstring for test_NextProtocolReturnsNone should also talk about the fact that when nextProtocols is None NPN and ALPN will not be tried so the connection will succeed.

comment:78 Changed 4 years ago by Cory Benfield

I don't see any tests in which nextProtocolMechanisms is used with the whyNot callback.

Yeah, glyph and I talked a bit about having this but then I never actually used it, so I'm removing it.

Rather than having many test calling negotiateProtocol and then checking that reason is None, maybe we can have negoticateProtocolsWithSuccess and negoticateProtocolsWithFailure and pass the error only in negoticateProtocolsWithFailure.

We could, but it makes life a little trickier because those functions don't have access to test assertions. This means they can't fail the tests on connection success/failure without raising opaque test errors or throwing exceptions, neither of which is great. The assertion that lostReason is None is actually important: it signals that the connection was successfully established and cleanly closed.

For the moment, I think I'm happy with the tests as they are, but I'm willing to let glyph be a tiebreaker here.

All your other feedback is addressed in the next patch, which I'll upload shortly.

Changed 4 years ago by Cory Benfield

Attachment: 7860_22.patch added

Twenty-second draft patch

comment:79 Changed 4 years ago by Adi Roiban

Branch: branches/tls-alpn-npn-7860-21branches/tls-alpn-npn-7860-22

(In [46056]) Branching to tls-alpn-npn-7860-22.

comment:80 Changed 4 years ago by Adi Roiban

Maybe we should not update the server certificate, but rather instruct people to create their own certificates.

I hope that some day, Twisted, PyOpenSSL or cryptography will provide a nice API for generating RSA keys and x.509 certs

Changed 4 years ago by Cory Benfield

Attachment: 7860_23.patch added

Twenty-third draft patch

comment:81 Changed 4 years ago by Cory Benfield

Ok, changes to the server certificate have been removed and instructions for generating a new one added.

comment:82 Changed 4 years ago by Adi Roiban

Branch: branches/tls-alpn-npn-7860-22branches/tls-alpn-npn-7860-23

(In [46093]) Branching to tls-alpn-npn-7860-23.

comment:83 Changed 4 years ago by Adi Roiban

Owner: changed from Glyph to Adi Roiban

I am doing a final review, reading all strings and making sure all is ok.

There are a few minor issues, but I am fixing them while doing the review.

Will come back soon :)

comment:84 Changed 4 years ago by Adi Roiban

Keywords: review removed
Owner: changed from Adi Roiban to Cory Benfield

Hi,

I did another full review.


I have updated the examples to use cert from current working directory... users should not have to overwrite certs found in the example dir.

I have also broke server key and cert into separate files.


I moved the raised exception docstring from makeContext to init


I don't like the way NPN/ALPN is set up in the tests.... as it is first set and then re-set stuff. I broke the code into separate methods. Please check that it makes sense. For me, when you only want to set up NPN it makes more sense to just see NPN calls, rather than only un-set ALPN calls.


Please review my latest changes

https://github.com/twisted/twisted/commit/fda4b0b48a0b59c8dae26df8d81b86ed0223692c

https://github.com/twisted/twisted/commit/fdc994a31c3a7ef7f19b802a478d7e12d3620747


If possible, please submit a patch based on latest branch so that the diff is easier to read.

Thanks!

comment:85 Changed 4 years ago by Cory Benfield

Owner: changed from Cory Benfield to Adi Roiban

Adiroiban, those changes seem fine to me. No notes. =)

comment:86 Changed 4 years ago by Adi Roiban

Owner: changed from Adi Roiban to Glyph

I am assigning this to glpyh as a final warning in case he want to review it :)

If not review until Monday evening (GMT) I will look at fixing the remaining api-doc warnings and merge it.

Many, many thanks for all your hard work!

comment:87 Changed 4 years ago by Cory Benfield

Thanks to you too adiroiban! This has been a big ticket and you've done a great deal, thanks so much. =)

Once this is landed, I can start work on #7460!

comment:88 Changed 4 years ago by Adi Roiban

Ok. Summary of latest feedback from IRC.

The only concerns were about the naming. No complains about the design.

I think that you can start working on #7460 as I hope that we will have soon an agreement over the names.... but the design should stay the same

I know that naming is hard, but we have some feedback and maybe we can improve this.

Here are my suggestions for renaming.

  • IHasProtocolNegotiation / ICanNegotiateProtocol
  • ssl.optionsForClientTLS(negotiableProtocols)
  • self.transport.negotiatedProtocol

Cheers!

Changed 4 years ago by Cory Benfield

Attachment: 7860_24.patch added

Twenty-fourth draft patch

comment:89 Changed 4 years ago by Cory Benfield

Alrighty, names changed and patch uploaded. For the record, the names are now:

IALPNTransport -> INegotiated nextProtocols -> acceptableProtocols nextProtocol -> negotiatedProtocol

We should run the buildbots once more to confirm I didn't miss anything.

comment:90 Changed 4 years ago by Adi Roiban

Branch: branches/tls-alpn-npn-7860-23branches/tls-alpn-npn-7860-24

(In [46137]) Branching to tls-alpn-npn-7860-24.

comment:91 Changed 4 years ago by Adi Roiban

Owner: changed from Glyph to Cory Benfield

Can we have nextProtocolMechanisms -> protocolNegotiationMechanisms ?

waiting for the builds

Changed 4 years ago by Cory Benfield

Attachment: 7860_25.patch added

Twenty-fifth draft patch

comment:92 Changed 4 years ago by Cory Benfield

Owner: changed from Cory Benfield to Glyph

Glyph, I've made that change in draft patch 25.

comment:93 Changed 4 years ago by Adi Roiban

Branch: branches/tls-alpn-npn-7860-24branches/tls-alpn-npn-7860-25

(In [46143]) Branching to tls-alpn-npn-7860-25.

comment:94 Changed 4 years ago by Adi Roiban

Owner: changed from Glyph to Cory Benfield

will merge. thanks!

comment:95 Changed 4 years ago by Adi Roiban

Resolution: fixed
Status: newclosed

(In [46146]) Merge tls-alpn-npn-7860-25: Add NPN and ALPN support for the TLS transport.

Author: lukasa Reviewers: hawkowl, adiroiban, glyph Fixes: #7860

Note: See TracTickets for help on using tickets.