Discussion:
[PEP466] SSLSockets, and sockets, _socketobjects oh my!
Alex Gaynor
2014-07-22 21:03:36 UTC
Permalink
Hi all,

I've been happily working on the SSL module backports for Python2 (pursuant to
PEP466), and I've hit something of a snag:

In python3, the SSLSocket keeps a weak reference to the underlying socket,
rather than a strong reference, as Python2 uses.

Unfortunately, due to the way sockets work in Python2, this doesn't work:

On Python2, _socketobject composes around _real_socket from the _socket module,
whereas on Python3, it subclasses _socket.socket. Since you now have a Python-
level class, you can weak reference it.

The question is:

a) Should we backport weak referencing _socket.sockets (changing the structure
of the module seems overly invasive, albeit completely backwards
compatible)?
b) Does anyone know why weak references are used in the first place? The commit
message just alludes to fixing a leak with no reference to an issue.

Anyone who's interested in the state of the branch can see it at:
github.com/alex/cpython on the backport-ssl branch. Note that many many tests
are still failing, and you'll need to apply the patch from
http://bugs.python.org/issue22023 to get it to work.

Thanks,
Alex

PS: Any help in getting http://bugs.python.org/issue22023 landed which be very
much appreciated.
Antoine Pitrou
2014-07-22 21:25:27 UTC
Permalink
Post by Alex Gaynor
a) Should we backport weak referencing _socket.sockets (changing the structure
of the module seems overly invasive, albeit completely backwards
compatible)?
b) Does anyone know why weak references are used in the first place? The commit
message just alludes to fixing a leak with no reference to an issue.
Because :
- the SSLSocket has a strong reference to the ssl object (self._sslobj)
- self._sslobj having a strong reference to the SSLSocket would mean
both would only get destroyed on a GC collection

I assume that's what "leak" means here :-)

As for 2.x, I don't see why you couldn't just continue using a strong
reference.

Regards

Antoine.
Nick Coghlan
2014-07-22 21:44:54 UTC
Permalink
Post by Antoine Pitrou
Post by Alex Gaynor
a) Should we backport weak referencing _socket.sockets (changing the structure
of the module seems overly invasive, albeit completely backwards
compatible)?
b) Does anyone know why weak references are used in the first place? The commit
message just alludes to fixing a leak with no reference to an issue.
- the SSLSocket has a strong reference to the ssl object (self._sslobj)
- self._sslobj having a strong reference to the SSLSocket would mean both
would only get destroyed on a GC collection
Post by Antoine Pitrou
I assume that's what "leak" means here :-)
As for 2.x, I don't see why you couldn't just continue using a strong
reference.

As Antoine says, if the cycle already exists in Python 2 (and it sounds
like it does), we can just skip backporting the weak reference change.

I'll also give the Fedora Python list a heads up about your repo to see if
anyone there can help you with the backport.

Cheers,
Nick.
Post by Antoine Pitrou
Regards
Antoine.
_______________________________________________
Python-Dev mailing list
https://mail.python.org/mailman/listinfo/python-dev
https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com
Antoine Pitrou
2014-07-22 23:00:18 UTC
Permalink
Post by Antoine Pitrou
Post by Antoine Pitrou
As for 2.x, I don't see why you couldn't just continue using a strong
reference.
As Antoine says, if the cycle already exists in Python 2 (and it sounds
like it does), we can just skip backporting the weak reference change.
No, IIRC there shouldn't be a cycle. It's just complicated in a
different way than 3.x :-)

Regards

Antoine.
Alex Gaynor
2014-07-23 19:36:07 UTC
Permalink
This post might be inappropriate. Click to display it.
Antoine Pitrou
2014-07-23 21:02:26 UTC
Permalink
Post by Alex Gaynor
That said, I've hit another issue, with SNI callbacks. The first argument to an
SNI callback is the socket. The callback is set up by some C code, which right
now has access to only the _socket.socket object, not the ssl.SSLSocket object,
which is what the public API needs there.
* Pass the SSLObject *in addition* to the _socket.socket object to the C code.
This generates some additional divergence from the Python3 code, but is
probably basically straightforward.
You mean for use with SSL_set_app_data?
Alex Gaynor
2014-07-23 21:10:39 UTC
Permalink
Post by Antoine Pitrou
You mean for use with SSL_set_app_data?
Yes, if you look in ``_servername_callback``, you can see where it uses
``SSL_get_app_data`` and then reads ``ssl->Socket``, which is supposed to be
the same object that's returned by ``context.wrap_socket()``.

Alex
Nick Coghlan
2014-07-23 22:06:26 UTC
Permalink
Post by Alex Gaynor
* Pass the SSLObject *in addition* to the _socket.socket object to the C code.
This generates some additional divergence from the Python3 code, but is
probably basically straightforward.
* Try to refactor the socket code in the same way as Python3 did, so we can
pass *only* the SSLObject here. This is some nasty scope creep for PEP466,
but would make the overall _ssl.c diff smaller.
* Some super sweet and simple thing I haven't thought of yet.
Thoughts?
Wearing my "risk management" hat, option 1 sounds significantly more
appealing than option 2 :)

Cheers,
Nick.

Loading...