Opened 3 years ago

Last modified 2 years ago

#5374 enhancement new

TLS Server Name Indication Support (Client)

Reported by: moxie Owned by: moxie
Priority: normal Milestone:
Component: core Keywords:
Cc: Branch:
Author: Launchpad Bug:

Description (last modified by exarkun)

Twisted should support making TLS connections with SNI information when possible (eg. PyOpenSSL >= 0.13).

It should be possible to specify the SNI hostname separately from the connection address (ie, it should be possible to connect to ("74.125.31.99", 443) but specify an SNI hostname of "www.google.com").

Attachments (1)

sni-client.patch (6.8 KB) - added by moxie 3 years ago.
Patch to enable client-side SNI support for TLS connections.

Download all attachments as: .zip

Change History (20)

Changed 3 years ago by moxie

Patch to enable client-side SNI support for TLS connections.

comment:1 Changed 3 years ago by moxie

  • Keywords review added; tls ssl sni removed

comment:2 Changed 3 years ago by exarkun

  • Keywords review removed
  • Owner set to moxie

It should, for sure.

I wonder if it would be better to expand the context factory interface, though, rather than all of the connectSSL-like interfaces? Changing interfaces is a bit scary, because it weakens what an interface is (a definition of the expected behavior of a object - now should I use the definition from Twisted 10.0 or Twisted 12.0?). In particular, adding arguments to a signature means disagreement over which version of the interface should be used means calls may fail with a TypeError. Adding an optional attribute to an object is perhaps just as problematic, but somehow feels safer. I'd love to hear from some other Twisted developers on this as well, though.

It would also be useful to have some unit tests and documentation for the functionality. :)

Thanks!

comment:3 Changed 3 years ago by itamar

  1. Wouldn't one want to always include SNI by default? Or will it break if the server doesn't support it?
  1. If we do want to include it by default: since we're already including a hostname parameter in connectSSL, why not use it by default, instead of supplying it again? One might want to override it, but seems unnecessary to have to supply it twice for the normal case.

Having it on context factory is one way to do it... but what about sharing the same context factory with multiple connections, as twisted.web.client.Agent does IIRC? Perhaps we should have a ContextFactory.getSNI(hostname) method, rather than a hardcoded value?

comment:4 Changed 3 years ago by moxie

I contemplated adding a method to the ContextFactory instead, but putting it directly in the connectSSL() call felt cleaner to me, since I didn't know whether some callers would prefer to reuse the same ContextFactory for multiple connections.

I'm not sure whether doing SNI by default is a good idea or not, but I would definitely like a method for making it explicit. There are cases where you want to call connectSSL() for a specific IP address and specify a hostname for the SNI parameter separately, which I feel are important to support.

I don't know enough about Twisted internals to have a strong preference for whether it's better to do this in the ContextFactory or the connectSSL call. If the consensus is the ContextFactory, I'm totally willing to change the patch. I might even write a unit test. =)

comment:5 follow-up: Changed 3 years ago by exarkun

what about sharing the same context factory with multiple connections, as twisted.web.client.Agent does

Agent doesn't.

There are cases where you want to call connectSSL() for a specific IP address and specify a hostname for the SNI parameter separately, which I feel are important to support.

Yep, I agree, this is important.

Perhaps we should have a ContextFactory.getSNI(hostname) method, rather than a hardcoded value?

This doesn't help, because of the IP address case moxie mentioned.

If the consensus is the ContextFactory, I'm totally willing to change the patch.

One thing that occurs to me this morning that I forgot about before was that the underlying OpenSSL APIs don't put this onto the context. Currently our context factory objects are basically only thin wrappers around OpenSSL context objects. Adding SNI information would be a point of divergence between the two APIs.

comment:6 Changed 3 years ago by antoine

SSL contexts are meant to be shared so, unless I'm mistaking what this is talking about, sharing a context factory accross connections is reasonable. The SNI hostname, on the other hand, is pretty much connection-specific, as is the IP address of the host you're connecting to.

However, the context factory could take an argument indicating whether SNI should be automatically enabled using the connectSSL() hostname.

For the record, Python 3.2 enables SNI by default in high-level client libraries (urllib.request), but it leaves it as a manual option in the lower-level ssl module.

comment:7 in reply to: ↑ 5 Changed 3 years ago by moxie

Replying to exarkun:

One thing that occurs to me this morning that I forgot about before was that the underlying OpenSSL APIs don't put this onto the context. Currently our context factory objects are basically only thin wrappers around OpenSSL context objects. Adding SNI information would be a point of divergence between the two APIs.

Yeah, I was imagining another method on the ContextFactory like getServerNameIndication() that by default returns None. It does seem like this is more "backwards compatible" in the sense that merely attempting to override that method won't break anything when running on older versions of Twisted without SNI support, where as changing connectSSL() can potentially give callers an "unexpected keyword argument" error when their code runs on old versions.

It doesn't seem totally unreasonable to me that if people want to use SNI, they can't reuse their ContextFactory?

comment:8 follow-up: Changed 3 years ago by exarkun

SSL contexts are meant to be shared

I don't know of an explicit statement of this by the OpenSSL developers. Rather, SSL contexts are a big bag of parameters and configuration that are easier to pass around as a single object than as separate arguments.

It can be convenient to share them, if you happen to be setting up multiple connections with the same parameters. This already means that you can only share the context if you want to use the same protocol, ciphers, certificate verification rules, and a bunch of other things.

It's true that you currently you can consider sharing without considering SNI and if we add SNI to the context factory that will no longer be true. However, if you were matching up certificate names with hostnames before, then you could only share contexts between connections to the same host. If you weren't matching up names, or you weren't verifying certificates at all, then you probably won't care about SNI and you can leave use the old behavior of not specifying it (which should probably be the default).

So it doesn't seem like the opportunities for sharing will be noticeably reduced.

For the record, Python 3.2 enables SNI by default in high-level client libraries (urllib.request), but it leaves it as a manual option in the lower-level ssl module.

Do the APIs which enable SNI by default also enable certificate verification by default? If so, how do they get the CAs to use for that verification?

It doesn't seem totally unreasonable to me that if people want to use SNI, they can't reuse their ContextFactory?

In case it isn't clear, this also doesn't seem totally unreasonable to me.

comment:9 in reply to: ↑ 8 Changed 3 years ago by antoine

Replying to exarkun:

SSL contexts are meant to be shared

I don't know of an explicit statement of this by the OpenSSL developers.

Well, I don't know of any explicit public statement by the OpenSSL developers. OpenSSL docs are a pile of crummy incomplete man pages, and you'll be hard-pressed to get any actually useful help out of them or their developers (or perhaps you're luckier than I am :-)). It just seems that sharing context objects between SSL connections is the intuitive thing implied by the OpenSSL APIs.

Similarly, even without looking for any explicit statement by the Twisted developers, I guess Factory objects are meant to be shared accross multiple connections (please correct me if I'm wrong :-)).

Rather, SSL contexts are a big bag of parameters and configuration that are easier to pass around as a single object than as separate arguments.

Well, wouldn't it be annoying to lose that property because of the SNI setting?
I'm also assuming that parsing certificates isn't totally costless, and doing it once and for all instead of once per connection seems to make sense.
SSL contexts also allow you to cache and reuse sessions, although I'm not sure Twisted takes advantage of that (the Python ssl module definitely doesn't - not on the client side anyway).

It can be convenient to share them, if you happen to be setting up multiple connections with the same parameters. This already means that you can only share the context if you want to use the same protocol, ciphers, certificate verification rules, and a bunch of other things.

You don't force the ciphers at the context level (well, you can, but usually you don't). However, certificate verification policy is quite often a program-wide policy indeed, and that's also why sharing contexts makes sense.

However, if you were matching up certificate names with hostnames before, then you could only share contexts between connections to the same host.

That's false. You can certainly validate different hostnames and certificates using a single context. You have to write your own code anyway, OpenSSL doesn't do anything to ensure the subject or subjectAltName match your connection request.

If you weren't matching up names, or you weren't verifying certificates at all, then you probably won't care about SNI and you can leave use the old behavior of not specifying it (which should probably be the default).

That's quite orthogonal. SNI doesn't change how or even whether the client should check the server's certificate, it changes how the server reacts to the request. It might imply e.g. serving different contents depending on the SNI hostname (regardless of whether the client validates the certificate's subject name).

For the record, Python 3.2 enables SNI by default in high-level client libraries (urllib.request), but it leaves it as a manual option in the lower-level ssl module.

Do the APIs which enable SNI by default also enable certificate verification by default?

Nope (but I would have liked to, if set_default_verify_paths() at least had had some kind of error checking :-)). SNI is orthogonal to certificate verification by the client.

comment:10 follow-up: Changed 3 years ago by exarkun

Rather, SSL contexts are a big bag of parameters and configuration that are easier to pass around as a single object than as separate arguments.

Well, wouldn't it be annoying to lose that property because of the SNI setting?

The property lost is the shareability. After adding SNI, it's still a big bag of parameters and configuration that is easier to pass around as a single object.

I don't know how annoying it would be lose to shareability. Maybe it would be somewhat annoying. I do know that adding a new SNI parameter to every SSL connection setup API will be annoying, though. (And adding whatever new parameter is required next will be annoying, and adding the parameter after that...)

However, if you were matching up certificate names with hostnames before, then you could only share contexts between connections to the same host.

That's false. You can certainly validate different hostnames and certificates using a single context.

Can you show me how?

comment:11 in reply to: ↑ 10 Changed 3 years ago by antoine

I don't know how annoying it would be lose to shareability. Maybe it would be somewhat annoying. I do know that adding a new SNI parameter to every SSL connection setup API will be annoying, though. (And adding whatever new parameter is required next will be annoying, and adding the parameter after that...)

That's why I'm also suggesting a parameter on factories so that the connection hostname is also used as the SNI hostname for the 98% cases where it is sufficient :)

However, if you were matching up certificate names with hostnames before, then you could only share contexts between connections to the same host.

That's false. You can certainly validate different hostnames and certificates using a single context.

Can you show me how?

I'm not sure I understand the question, because it's done the same way as validating a single connection.
If I take the ssl module idioms, you first set up a context:

ctx = ssl.SSLContext(ssl.PROTOCOL_TLSv1)
ctx.set_default_verify_paths()   # I'm lazy
ctx.verify_mode = ssl.CERT_REQUIRED

Then you do your connection and start the SSL layer on it, specifying the desired hostname:

conn = socket.create_connection(("myhost.com", 443))
ssl_conn = ctx.wrap_socket(conn, server_hostname="mysubdomain.myhost.com")

Then you grab the certificate and validate it against the desired hostname:

cert = ssl_conn.getpeercert()
ssl.match_hostname(cert, "mysubdomain.myhost.com")  # raises if the hostname doesn't match

See http://docs.python.org/py3k/library/ssl.html for the API description
(I haven't actually checked the code, but barring typos it should work)

comment:12 Changed 3 years ago by exarkun

Then you grab the certificate and validate it against the desired hostname:

cert = ssl_conn.getpeercert()
ssl.match_hostname(cert, "mysubdomain.myhost.com")  # raises if the hostname doesn't match

Okay, but this is the part I don't follow. How do you decide what hostname to check, if you are using this code to connect to multiple different hostnames?

Also, this looks like it might not be part of a verification callback. Is it, or is it code run after OpenSSL is done with the handshake and has decided to let the connection proceed? For the latter case, it has nothing to do with the context at all. My comments above are based on the former case, where verification happens in a callback during the handshake.

comment:13 Changed 3 years ago by antoine

How do you decide what hostname to check, if you are using this code to connect to multiple different hostnames?

Obviously you don't hardwire the hostname as a constant. e.g. in HTTP, the hostname would come from the URL.

My comments above are based on the former case, where verification happens in a callback during the handshake.

I don't know why that would change anything. If you have the certificate and know which hostname you want to check for, it shouldn't be a problem to do that from the callback.

comment:14 Changed 3 years ago by exarkun

I think we have been talking about different things. pyOpenSSL exposes a Python callback function API to verify peer certificates. The callback is specified on the context object. From this callback, it is hard to make decisions about hostnames. It is equally possible to ignore this feature of pyOpenSSL and verify the hostname after the connection has been established; this is similar to the code snippets based on the stdlib ssl library.

It seems this conversation has wandered a bit from the topic, though.

comment:15 Changed 3 years ago by exarkun

  • Description modified (diff)

comment:16 Changed 3 years ago by exarkun

The current patch makes no change to twisted.internet.endpoints.SSL4ClientEndpoint. That API should definitely support client-side SNI eventually. Whether it's logically part of this ticket or not, I'm not sure. If the feature were to be implemented as a change to context factories, that endpoint would inherit it by accident (which is more or less the point of context factories - big bag of configuration that can all be passed around as one object).

comment:17 Changed 2 years ago by hathawsh

May I suggest a different way to expand the connectSSL method? The patch suggested expanding the connectSSL function signature with an optional sni parameter:

def connectSSL(self, host, port, factory, contextFactory, timeout=30, bindAddress=None, sni=None):

Here is my alternative:

def connectSSL(self, host, port, factory, contextFactory, timeout=30, bindAddress=None):
    sni = getattr(host, 'sni', host)

What I like about this alternative:

  • Just as we expect modern web browsers to send SNI by default, this makes Twisted send SNI by default.
  • SNI can be overridden easily using a subclass of str:
class HostWithSNI(str):
    def __new__(cls, host, sni):
        s = str.__new__(cls, host)
        s.sni = sni
        return s
  • SNI can be disabled by setting the sni attribute to None.
  • This is both backward and forward compatible: Applications that set the sni attribute still work with old versions of Twisted reactors; and applications that don't set the sni attribute still work with new versions of Twisted.

What I don't like:

  • If regular Python strings ever grow a "sni" attribute, or if people are already passing objects with an 'sni' attribute as the host parameter, Twisted will freak out. While those possibilities don't seem very likely to me, we could avoid any possible clashes using an interface lookup rather than getattr().

Thoughts?

comment:18 Changed 2 years ago by exarkun

There is already a place to hang random SSL-related configuration. It is the context, produced by the context factory. I don't see the advantage of starting to hang random SSL-related configuration off of a different connectSSL parameter - and certainly not off of one that is currently a string type.

All of the listed advantages are to be had by adding SNI configuration to the context factory. So this seems at best equivalent, but by involving subclasses of str, seems not as good.

comment:19 Changed 2 years ago by hathawsh

Using the context sounds fine to me, exarkun. What will it take to get this into Twisted?

Note: See TracTickets for help on using tickets.