Discussion:
Memory BIO for _ssl
Geert Jansen
2014-07-05 18:04:04 UTC
Hi,

the topic of a memory BIO for the _ssl module in the stdlib was
discussed before here:

http://mail.python.org/pipermail/python-ideas/2012-November/017686.html

Since I need this for my Gruvi async framework, I want to volunteer to
write a patch. It should be useful as well to Py3K's asyncio and other
async frameworks. It would be good to get some feedback before I start
on this.

I was thinking of the following approach:

* Add a new type to _ssl: PySSLMemoryBIO
* PySSLMemoryBIO has a public constructor, and at least the following
methods: puts(), puts_eof() and gets(). I aligned the terminology with
the method names in OpenSSL. puts_eof() does a
BIO_set_mem_eof_return(-1).
* All accesses to the memory BIO are non-blocking.
* Update PySSLSocket to add support for SSL_set_bio(). The fact that
the memory BIO is non-blocking makes it easier. None of the logic in
and around check_socket_and_wait_for_timeout() for example needs to be
changed. For the parts that deal with the socket directly, and that
are in the code path for non-blocking IO, I think the preference would
be i) try to change the code to use BIO methods that work for both
sockets and memory BIOs, and ii) if not possible, special case it.
* At this point the PySSLSocket name is a bit of a misnomer as it
does more than sockets. Probably not an issue.
* Add a method _wrap_bio(rbio, wbio, ...) to _SSLContext.
* Expose the low-level methods via the "ssl" module.

Creating an SSLSocket with a memory BIO would work something like this:

context = SSLContext()
rbio = ssl.MemoryBIO()
wbio = ssl.MemoryBIO()
sslsock = ssl.wrap_bio(rbio, wbio)

To pass SSL data from the network and decrypt it into application
level data (and potentially new SSL level data):

rbio.puts(ssldata)
appdata = sslsock.read()
ssldata = wbio.gets()
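
For reference, the same flow can be sketched against the ssl module as it exists in Python 3.5 and later, where the pieces are named MemoryBIO.write()/read() and SSLContext.wrap_bio() rather than the draft puts()/gets() and ssl.wrap_bio() used above. The helper names start_handshake() and feed() and the host name "example.org" are made up for this illustration; this is a rough sketch under those assumptions, not the code from any patch.

```python
import ssl

# Stdlib names (Python 3.5+), not the draft names used in this message.
context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
incoming = ssl.MemoryBIO()   # network -> SSL engine (the "rbio" above)
outgoing = ssl.MemoryBIO()   # SSL engine -> network (the "wbio" above)
sslobj = context.wrap_bio(incoming, outgoing, server_hostname="example.org")


def start_handshake(sslobj, outgoing):
    """Hypothetical helper: begin the handshake, return bytes to send."""
    try:
        sslobj.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: the engine now waits for the peer's reply
    return outgoing.read()


def feed(sslobj, incoming, outgoing, ssldata):
    """Hypothetical helper: push SSL bytes in, get (appdata, ssldata) out."""
    incoming.write(ssldata)
    try:
        appdata = sslobj.read(65536)
    except ssl.SSLWantReadError:
        appdata = b""  # no complete application record available yet
    return appdata, outgoing.read()


# No socket is involved: the ClientHello simply lands in the outgoing BIO.
client_hello = start_handshake(sslobj, outgoing)
```

The caller is responsible for moving bytes between the two BIOs and whatever transport it uses, which is what makes this usable from an event loop.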

I currently have a utility class in my async IO framework (gruvi.io)
called SslPipe that does the above, but it uses a socketpair instead
of a memory BIO, and hence it works with the current _ssl. See here:

https://github.com/geertj/gruvi/blob/master/gruvi/ssl.py#L86

This approach, while fine and very fast on Linux, gives me problems on
Windows. It appears that on some older Windows versions, when I write
data to one side of an (emulated) socket pair, it takes some time for
it to become available at the other side. That breaks the synchronous
interface that I need in order for this to work. And I can't fully
work around it as I do not know in all situations whether or not to
expect data on the socketpair. A memory BIO should be the right
solution to this.

Any feedback?

Regards,
Geert
Antoine Pitrou
2014-07-06 23:49:23 UTC
Hi,
Post by Geert Jansen
Since I need this for my Gruvi async framework, I want to volunteer to
write a patch. It should be useful as well to Py3K's asyncio and other
async frameworks. It would be good to get some feedback before I start
on this.
Thanks for volunteering! This would be a very welcome addition.
Post by Geert Jansen
* Add a new type to _ssl: PySSLMemoryBIO
* PySSLMemoryBIO has a public constructor, and at least the following
methods: puts(), puts_eof() and gets(). I aligned the terminology with
the method names in OpenSSL. puts_eof() does a
BIO_set_mem_eof_return(-1).
Hmm... I haven't looked in detail, but at least I'd like those to be
called read() and write() (and write_eof()), like most other I/O methods
in Python.
Or if we want to avoid confusion, add an explicit suffix (write_incoming?).
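
For illustration, this is roughly how a memory BIO with those method names behaves. The sketch uses ssl.MemoryBIO as it exists in the stdlib from Python 3.5 onwards; whether the patch discussed here ends up with exactly this shape is an assumption.

```python
import ssl

bio = ssl.MemoryBIO()

# Data written to the BIO is buffered in memory until it is read back.
bio.write(b"hello ")
bio.write(b"world")
pending_before = bio.pending   # number of bytes currently buffered

first = bio.read(5)            # read at most 5 bytes
rest = bio.read()              # read everything that is left
empty = bio.read()             # nothing buffered: returns b"" without blocking

# write_eof() marks the end of the stream; eof becomes True once the
# buffer has been drained as well.
eof_before = bio.eof
bio.write_eof()
eof_after = bio.eof
```

Reads never block: an empty buffer yields an empty bytes object, which is the non-blocking behaviour described in the proposal.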
Post by Geert Jansen
* All accesses to the memory BIO are non-blocking.
Sounds sensible indeed (otherwise what would they wait for?).
Post by Geert Jansen
* Update PySSLSocket to add support for SSL_set_bio(). The fact that
the memory BIO is non-blocking makes it easier. None of the logic in
and around check_socket_and_wait_for_timeout() for example needs to be
changed. For the parts that deal with the socket directly, and that
are in the code path for non-blocking IO, I think the preference would
be i) try to change the code to use BIO methods that work for both
sockets and memory BIOs, and ii) if not possible, special case it.
That sounds good in principle. I don't know enough about memory BIOs to
know whether you will have issues doing so :-)
Post by Geert Jansen
* At this point the PySSLSocket name is a bit of a misnomer as it
does more than sockets. Probably not an issue.
Agreed.
Post by Geert Jansen
* Add a method _wrap_bio(rbio, wbio, ...) to _SSLContext.
* Expose the low-level methods via the "ssl" module.
context = SSLContext()
rbio = ssl.MemoryBIO()
wbio = ssl.MemoryBIO()
sslsock = ssl.wrap_bio(rbio, wbio)
The one thing I find confusing is the r(ead)bio / w(rite)bio terminology
(because you actually read and write from both). Perhaps incoming and
outgoing would be clearer.

Regards

Antoine.
Geert Jansen
2014-07-12 09:12:37 UTC
Post by Antoine Pitrou
Post by Geert Jansen
Since I need this for my Gruvi async framework, I want to volunteer to
write a patch. It should be useful as well to Py3K's asyncio and other
async frameworks. It would be good to get some feedback before I start
on this.
Thanks for volunteering! This would be a very welcome addition.
I have a first patch and have submitted it as issue #21965:

http://bugs.python.org/issue21965

I've incorporated your feedback.

Regards,
Geert
