New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bpo-28414] Make all hostnames in SSL module IDN A-labels #5128

Merged
merged 6 commits into from Feb 24, 2018

Conversation

Projects
None yet
7 participants
@tiran
Member

tiran commented Jan 7, 2018

Alternative approach to PR #3010

  • Really all server_hostname attributes are IDN A-labels
  • SSLContext.set_servername_callback argument is IDN A-label, too
  • Replace all externally replaced certs with proper certs from internal test helper

https://bugs.python.org/issue28414

Show outdated Hide outdated Doc/whatsnew/3.7.rst
Show outdated Hide outdated Lib/test/test_asyncio/test_events.py
@njsmith

This comment has been minimized.

Show comment
Hide comment
@njsmith

njsmith Jan 10, 2018

Contributor

I did a quick skim and the code seems fine to me, but... I do think that for set_servername_callback, instead of just breaking things we maybe we need to rename it and add a compatibility shim. Quoting from https://mail.python.org/pipermail/python-dev/2017-December/151527.html:

On the server, the obvious fix would be to start passing
A-label-encoded names to the servername_callback, instead of
U-label-encoded names. Unfortunately, this is a bit trickier, because
this has historically worked (AFAIK) for IDNA names, so long as they
didn't use one of the four magic characters who changed meaning
between IDNA 2003 and IDNA 2008. But we do still need to do something.
For example, right now, it's impossible to use the ssl module to
implement a web server at https://straße.de, because incoming
connections will use SNI to say that they expect a cert for
"xn--strae-oqa.de", and then the ssl module will freak out and throw
an exception instead of invoking the servername callback.

It's ugly, but probably the simplest thing is to add a new function
like set_servername_callback2 that uses the A-label, and then redefine
set_servername_callback as a deprecated compatibility shim:

def set_servername_callback(self, cb):
    def shim_cb(sslobj, servername, sslctx):
        if servername is not None:
            servername = servername.encode("ascii").decode("idna")
        return cb(sslobj, servername, sslctx)
    self.set_servername_callback2(shim_cb)

We can bikeshed what the new name should be. Maybe set_sni_callback?
or set_server_hostname_callback, since the corresponding client-mode
argument is server_hostname?

Contributor

njsmith commented Jan 10, 2018

I did a quick skim and the code seems fine to me, but... I do think that for set_servername_callback, instead of just breaking things we maybe we need to rename it and add a compatibility shim. Quoting from https://mail.python.org/pipermail/python-dev/2017-December/151527.html:

On the server, the obvious fix would be to start passing
A-label-encoded names to the servername_callback, instead of
U-label-encoded names. Unfortunately, this is a bit trickier, because
this has historically worked (AFAIK) for IDNA names, so long as they
didn't use one of the four magic characters who changed meaning
between IDNA 2003 and IDNA 2008. But we do still need to do something.
For example, right now, it's impossible to use the ssl module to
implement a web server at https://straße.de, because incoming
connections will use SNI to say that they expect a cert for
"xn--strae-oqa.de", and then the ssl module will freak out and throw
an exception instead of invoking the servername callback.

It's ugly, but probably the simplest thing is to add a new function
like set_servername_callback2 that uses the A-label, and then redefine
set_servername_callback as a deprecated compatibility shim:

def set_servername_callback(self, cb):
    def shim_cb(sslobj, servername, sslctx):
        if servername is not None:
            servername = servername.encode("ascii").decode("idna")
        return cb(sslobj, servername, sslctx)
    self.set_servername_callback2(shim_cb)

We can bikeshed what the new name should be. Maybe set_sni_callback?
or set_server_hostname_callback, since the corresponding client-mode
argument is server_hostname?

@tiran

This comment has been minimized.

Show comment
Hide comment
@tiran

tiran Jan 13, 2018

Member

Meh! :(

Let's make it PEP 543 compatible and add sni_callback property. The Python subclass can provide your shim cb. Thanks for proposing it!

Member

tiran commented Jan 13, 2018

Meh! :(

Let's make it PEP 543 compatible and add sni_callback property. The Python subclass can provide your shim cb. Thanks for proposing it!

njsmith and others added some commits Aug 6, 2017

[bpo-28414] In SSL module, store server_hostname as an A-label
Historically, our handling of international domain names (IDNs) in the
ssl module has been very broken. The flow went like:

1. User passes server_hostname= to the SSLSocket/SSLObject
   constructor. This gets normalized to an A-label by using the
   PyArg_Parse "et" mode: bytes objects get passed through
   unchanged (assumed to already be A-labels); str objects get run
   through .encode("idna") to convert them into A-labels.

2. newPySSLSocket takes this A-label, and for some reason decodes
   it *back* to a U-label, and stores that as the object's
   server_hostname attribute.

3. Later, this U-label server_hostname attribute gets passed to
   match_hostname, to compare against the hostname seen in the
   certificate. But certificates contain A-labels, and match_hostname
   expects to be passed an A-label, so this doesn't work at all.

This PR fixes the problem by removing the pointless decoding at step
2, so that internally we always use A-labels, which matches how
internet protocols are designed in general: A-labels are used
everywhere internally and on-the-wire, and U-labels are basically just
for user interfaces.

This also matches the general advice to handle encoding/decoding once
at the edges, though for backwards-compatibility we continue to use
'str' objects to store A-labels, even though they're now always
ASCII. Technically there is a minor compatibility break here: if a
user examines the .server_hostname attribute of an ssl-wrapped socket,
then previously they would have seen a U-label like "pythön.org", and
now they'll see an A-label like "xn--pythn-mua.org". But this only
affects non-ASCII domain names, which have never worked in the first
place, so it seems unlikely that anyone is relying on the old
behavior.

This PR also adds an end-to-end test for IDN hostname
validation. Previously there were no tests for this functionality.

Fixes bpo-28414.
Drop trustme certs
All test certs must be generated by CPython's own test helper.

Signed-off-by: Christian Heimes <christian@python.org>
Fix IDN SAN handling
Signed-off-by: Christian Heimes <christian@python.org>

@tiran tiran changed the title from [bpo-28414][WIP] Make all hostnames in SSL module IDN A-labels to [bpo-28414] Make all hostnames in SSL module IDN A-labels Feb 22, 2018

@tiran tiran requested a review from njsmith Feb 22, 2018

njsmith added some commits Feb 23, 2018

@njsmith

I pushed some prose changes, and here are some comments on the code. Overall looks good, but there are a few small cleanups to consider, and a few points where I have questions.

Show outdated Hide outdated Lib/ssl.py
Show outdated Hide outdated Lib/ssl.py
@@ -355,13 +355,20 @@ def __new__(cls, protocol=PROTOCOL_TLS, *args, **kwargs):
self = _SSLContext.__new__(cls, protocol)
return self
def __init__(self, protocol=PROTOCOL_TLS):
self.protocol = protocol
def _encode_hostname(self, hostname):

This comment has been minimized.

@njsmith

njsmith Feb 23, 2018

Contributor

Maybe make this a top-level utility function, instead of a method?

@njsmith

njsmith Feb 23, 2018

Contributor

Maybe make this a top-level utility function, instead of a method?

This comment has been minimized.

@tiran

tiran Feb 23, 2018

Member

I'd prefer to keep this a method. It makes it easier to modify the behavior in subclasses or in tests.

@tiran

tiran Feb 23, 2018

Member

I'd prefer to keep this a method. It makes it easier to modify the behavior in subclasses or in tests.

{"options", (getter) get_options,
(setter) set_options, NULL},
{"protocol", (getter) get_protocol,
NULL, NULL},

This comment has been minimized.

@njsmith

njsmith Feb 23, 2018

Contributor

Why is this new protocol attribute in this PR? Should it be documented or anything?

@njsmith

njsmith Feb 23, 2018

Contributor

Why is this new protocol attribute in this PR? Should it be documented or anything?

This comment has been minimized.

@tiran

tiran Feb 23, 2018

Member

It's not new: https://docs.python.org/3/library/ssl.html#ssl.SSLContext.protocol

I just moved it from Python code into C code so I can access the value from C methods. The new setter for the callback uses the value to prevent setting a SNI callback for a client-only context.

@tiran

tiran Feb 23, 2018

Member

It's not new: https://docs.python.org/3/library/ssl.html#ssl.SSLContext.protocol

I just moved it from Python code into C code so I can access the value from C methods. The new setter for the callback uses the value to prevent setting a SNI callback for a client-only context.

This comment has been minimized.

@njsmith

njsmith Feb 24, 2018

Contributor

Ah, gotcha.

@njsmith

njsmith Feb 24, 2018

Contributor

Ah, gotcha.

Show outdated Hide outdated Modules/_ssl.c
Address review comments
Drop extra code for PEP 543 future compatibility in sni callback

Use callable() and encode_hostname in shim for old SNI callback.

Signed-off-by: Christian Heimes <christian@python.org>
@tiran

This comment has been minimized.

Show comment
Hide comment
@tiran

tiran Feb 24, 2018

Member

The context special handling is pining for the fjords.

Member

tiran commented Feb 24, 2018

The context special handling is pining for the fjords.

@njsmith njsmith merged commit 11a1493 into python:master Feb 24, 2018

4 checks passed

bedevere/issue-number Issue number 28414 found
Details
bedevere/news News entry found in Misc/NEWS.d
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@miss-islington

This comment has been minimized.

Show comment
Hide comment
@miss-islington

miss-islington Feb 24, 2018

Thanks @tiran for the PR, and @njsmith for merging it 🌮🎉.. I'm working now to backport this PR to: 3.7.
🐍🍒🤖

miss-islington commented Feb 24, 2018

Thanks @tiran for the PR, and @njsmith for merging it 🌮🎉.. I'm working now to backport this PR to: 3.7.
🐍🍒🤖

miss-islington added a commit to miss-islington/cpython that referenced this pull request Feb 24, 2018

[bpo-28414] Make all hostnames in SSL module IDN A-labels (GH-5128)
Previously, the ssl module stored international domain names (IDNs)
as U-labels. This is problematic for a number of reasons -- for
example, it made it impossible for users to use a different version
of IDNA than the one built into Python.

After this change, we always convert to A-labels as soon as possible,
and use them for all internal processing. In particular, server_hostname
attribute is now an A-label, and on the server side there's a new
sni_callback that receives the SNI servername as an A-label rather than
a U-label.
(cherry picked from commit 11a1493)

Co-authored-by: Christian Heimes <christian@python.org>
@bedevere-bot

This comment has been minimized.

Show comment
Hide comment
@bedevere-bot

bedevere-bot Feb 24, 2018

GH-5843 is a backport of this pull request to the 3.7 branch.

bedevere-bot commented Feb 24, 2018

GH-5843 is a backport of this pull request to the 3.7 branch.

njsmith added a commit that referenced this pull request Feb 24, 2018

[bpo-28414] Make all hostnames in SSL module IDN A-labels (GH-5128) (G…
…H-5843)

Previously, the ssl module stored international domain names (IDNs)
as U-labels. This is problematic for a number of reasons -- for
example, it made it impossible for users to use a different version
of IDNA than the one built into Python.

After this change, we always convert to A-labels as soon as possible,
and use them for all internal processing. In particular, server_hostname
attribute is now an A-label, and on the server side there's a new
sni_callback that receives the SNI servername as an A-label rather than
a U-label.
(cherry picked from commit 11a1493)

Co-authored-by: Christian Heimes <christian@python.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment