Bug #9028

Make SSLSocket Support Encodings

Added by whitehat101 (Jeremy Ebler) over 10 years ago. Updated over 10 years ago.

Status:
Closed
Target version:
-
ruby -v:
1.9.3, 2.0.0-p0
[ruby-core:57906]

Description

I was working on a bug in the xmpp4r project that caused REXML exceptions when receiving UTF-8 Strings.
https://github.com/xmpp4r/xmpp4r/issues/13

The issue ended up being that SSLSocket#readline didn't always return strings with the same encoding. It gave plain ASCII strings an encoding of UTF-8, and UTF-8 strings an encoding of ASCII-8BIT. We were passing the SSLSocket directly to REXML::Parsers::SAX2Parser and REXML throws exceptions when the input is not UTF-8.

Our solution: subclass the socket so that it always returns consistently encoded strings:

class SSLSocketUtf8 < OpenSSL::SSL::SSLSocket
  def sysread(*args)
    super.force_encoding(::Encoding::UTF_8)
  end
end
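
For illustration only, here is a minimal sketch of what `force_encoding` does in the subclass above: it relabels the bytes it is given and does not transcode or validate them. The sample string and the variable names `raw`, `labeled`, and `partial` are assumptions made up for this example; no socket is involved. The last lines show a possible caveat: if a read happened to end in the middle of a multi-byte character, the relabeled chunk would not be valid UTF-8 on its own.

```ruby
# Hypothetical sample data: the binary (ASCII-8BIT) bytes of a UTF-8 string,
# similar to what a byte-oriented read might hand back.
raw = "caf\u00e9".b

# force_encoding only changes the encoding label; the bytes stay the same.
labeled = raw.dup.force_encoding(Encoding::UTF_8)
puts labeled.encoding           # the label is now UTF-8
puts labeled.bytes == raw.bytes # the underlying bytes are unchanged
puts labeled.valid_encoding?    # the complete byte sequence is valid UTF-8

# Assumed edge case: a chunk that stops one byte short, in the middle of
# the two-byte sequence for the last character.
partial = raw.byteslice(0, raw.bytesize - 1).force_encoding(Encoding::UTF_8)
puts partial.valid_encoding?    # a truncated multi-byte character is not valid
```

Whether this edge case matters in practice depends on how the consumer buffers its input, which this sketch does not model.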

Hello, I'm investigating some strange behavior with OpenSSL::SSL::SSLSocket and string encodings
#readline returns UTF-8 encoded strings, until the string actually contains UTF-8, then it claims that the encoding is ASCII-8BIT
I've been reading through the source, and I'm not sure where to try to patch it
whitehat101: have an example script?
whitehat101: can you reproduce it with #sysread?
if you can, the problem lies in the C code
if you cannot, the problem lies in the OpenSSL::Buffering module
I don't have a concise example, I'm working with the xmpp4r project
whitehat101: look at sample/openssl/echo_*
you can probably make a simple example out of that
I found that #sysread always returns 8BIT, but #readline usually gives UTF-8
Thank you, I'll look at those
whitehat101: then I imagine the problem is that OpenSSL::Buffering#initialize creates a UTF-8 buffer
(@rbuffer)
I bet that # encoding: ASCII-8BIT at the very top of the file will fix it
in buffering.rb?
in ext/openssl/lib/openssl/buffering.rb
My feeling is that these functions should be returning UTF-8
A patch that works for my project:
class SSLSocketUtf8 < OpenSSL::SSL::SSLSocket
  def sysread(*args)
    super.force_encoding(::Encoding::UTF_8)
  end
end
hrm
they should be returning the encoding of the SSLSocket
It doesn't look like SSLSocket has any support for encodings
I tried setting the encoding of the TCPSocket, but it had no effect
since SSLSocket wraps the TCPSocket, I don't know if that has an effect on SSLSocket#sysread
I'm guessing that SSLSocket has no idea what the encoding is, it just deals with bytes
We're passing the SSLSocket directly to REXML::Parsers::SAX2Parser
and REXML throws exceptions when the input is not UTF-8
possibly, since it isn't an IO subclass and doesn't seem to respond to #set_encoding
setting the encoding on the TCPSocket probably has no effect because SSLSocket needs to read binary data off the TCPSocket
the ultimate solution would be "make SSLSocket support encodings"
That sounds right to me
a short-term fix would be "make the SSLSocket methods return a consistent encoding, regardless of correctness"
whitehat101: if you file a bug, maybe I'll find the time to fix it for ruby 2.1
you can file one here: http://bugs.ruby-lang.org/projects/ruby-trunk/issues/new
That would be excellent, thanks
Should I try to make an example, or just include this conversation?
this conversation is enough
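
The hypothesis in the conversation above (a UTF-8 read buffer being filled with binary data from #sysread) can be sketched with plain strings, without any socket. This is only an illustration of Ruby's string-append rules, not the code in buffering.rb; the helper name `append_chunk` and the sample payloads are assumptions made up for this example.

```ruby
# Hypothetical stand-in for the buffering logic: start with an empty
# UTF-8 buffer and append one binary (ASCII-8BIT) chunk to it.
def append_chunk(chunk)
  buffer = String.new("", encoding: Encoding::UTF_8)
  buffer << chunk.dup.force_encoding(Encoding::ASCII_8BIT)
  buffer
end

ascii_result = append_chunk("<presence/>")
utf8_result  = append_chunk("<body>caf\u00e9</body>")

# An ASCII-only chunk leaves the buffer labeled UTF-8.
puts ascii_result.encoding
# A chunk with non-ASCII bytes makes the empty buffer take on the
# chunk's ASCII-8BIT label.
puts utf8_result.encoding
```

This matches the symptom reported in the description: ASCII-only data comes back labeled UTF-8, while data that actually contains UTF-8 byte sequences comes back labeled ASCII-8BIT.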


Files

openssl.buffering.rb.encoding.patch (1.02 KB), drbrain (Eric Hodel), 12/03/2013 07:14 AM