Project

General

Profile

Bug #6799

Digest::*.hexdigest returns an ASCII-8BIT String

Added by Eregon (Benoit Daloze) about 5 years ago. Updated almost 5 years ago.

Status:
Closed
Priority:
Normal
Target version:
ruby -v:
ruby 2.0.0dev (2012-07-15 trunk 36395) [x86_64-darwin10.8.0]
[ruby-core:46792]

Description

$ ruby -rdigest/sha1 -e 'p Digest::SHA1.hexdigest("a").encoding'

#Encoding:ASCII-8BIT

I'm happy to provide a patch.

0001-ext-digest-digest.c-hexencode_str_new-return-an-ASCI.patch (2.59 KB) 0001-ext-digest-digest.c-hexencode_str_new-return-an-ASCI.patch Eregon (Benoit Daloze), 07/27/2012 09:01 PM
noname (500 Bytes) noname Anonymous, 07/29/2012 10:53 AM
secure_random.patch (863 Bytes) secure_random.patch tenderlovemaking (Aaron Patterson), 09/06/2012 08:58 AM
noname (500 Bytes) noname Anonymous, 09/15/2012 03:53 PM

Associated revisions

Revision 36588
Added by Eregon (Benoit Daloze) almost 5 years ago

ext/digest/digest.c (hexencode_str_new): return an ASCII string

* test/digest: tests for all kind of digests encodings
[Bug #6799]

Revision 36588
Added by Eregon (Benoit Daloze) almost 5 years ago

ext/digest/digest.c (hexencode_str_new): return an ASCII string

* test/digest: tests for all kind of digests encodings
[Bug #6799]

Revision 36588
Added by Eregon (Benoit Daloze) almost 5 years ago

ext/digest/digest.c (hexencode_str_new): return an ASCII string

* test/digest: tests for all kind of digests encodings
[Bug #6799]

Revision 36588
Added by Eregon (Benoit Daloze) almost 5 years ago

ext/digest/digest.c (hexencode_str_new): return an ASCII string

* test/digest: tests for all kind of digests encodings
[Bug #6799]

Revision 37269
Added by Eregon (Benoit Daloze) almost 5 years ago

set encoding to ASCII for appropriate String#unpack modifiers

* pack.c (pack_unpack): set encoding of the
'H','h','B' and 'B' modifiers to US-ASCII.
* test/ruby/test_pack.rb: tests for the above.
[Bug #7050]
* test/test_securerandom.rb: tests for SecureRandom.hex
from tenderlove. [Bug #6799]

Revision 37269
Added by Eregon (Benoit Daloze) almost 5 years ago

set encoding to ASCII for appropriate String#unpack modifiers

* pack.c (pack_unpack): set encoding of the
'H','h','B' and 'B' modifiers to US-ASCII.
* test/ruby/test_pack.rb: tests for the above.
[Bug #7050]
* test/test_securerandom.rb: tests for SecureRandom.hex
from tenderlove. [Bug #6799]

Revision 37269
Added by Eregon (Benoit Daloze) almost 5 years ago

set encoding to ASCII for appropriate String#unpack modifiers

* pack.c (pack_unpack): set encoding of the
'H','h','B' and 'B' modifiers to US-ASCII.
* test/ruby/test_pack.rb: tests for the above.
[Bug #7050]
* test/test_securerandom.rb: tests for SecureRandom.hex
from tenderlove. [Bug #6799]

Revision 37269
Added by Eregon (Benoit Daloze) almost 5 years ago

set encoding to ASCII for appropriate String#unpack modifiers

* pack.c (pack_unpack): set encoding of the
'H','h','B' and 'B' modifiers to US-ASCII.
* test/ruby/test_pack.rb: tests for the above.
[Bug #7050]
* test/test_securerandom.rb: tests for SecureRandom.hex
from tenderlove. [Bug #6799]

History

#1 [ruby-core:46795] Updated by duerst (Martin Dürst) about 5 years ago

So what should it return? [If you create a bug, saying what the expected behavior is (and also why you expect it) is quite important.]

#2 [ruby-core:46796] Updated by shyouhei (Shyouhei Urabe) about 5 years ago

  • Status changed from Open to Feedback

#3 [ruby-core:46808] Updated by Eregon (Benoit Daloze) about 5 years ago

duerst (Martin Dürst) wrote:

So what should it return? [If you create a bug, saying what the expected behavior is (and also why you expect it) is quite important.]

I thought it was clear enough, but indeed I should have mentioned it.
An hexadecimal String like this should have the US-ASCII encoding, because, by definition, it only has ASCII characters ([a-f0-9]).

This for example, makes YAML dump it as a binary String, and may cause other problems as the String is believed to be "binary", while it's not (#digest on the other hand has the ASCII-8BIT encoding, which is right).

#5 [ruby-core:46846] Updated by Anonymous almost 5 years ago

On Fri, Jul 27, 2012 at 07:34:30PM +0900, Eregon (Benoit Daloze) wrote:

Issue #6799 has been updated by Eregon (Benoit Daloze).

duerst (Martin Dürst) wrote:

So what should it return? [If you create a bug, saying what the expected behavior is (and also why you expect it) is quite important.]

I thought it was clear enough, but indeed I should have mentioned it.
An hexadecimal String like this should have the US-ASCII encoding, because, by definition, it only has ASCII characters ([a-f0-9]).

FWIW, I would like this feature too. In Rails, we have to work around
this issue in some database drivers because the ASCII-8BIT string will
be considered to be binary.

--
Aaron Patterson
http://tenderlovemaking.com/

#6 [ruby-core:46871] Updated by duerst (Martin Dürst) almost 5 years ago

  • Status changed from Feedback to Assigned
  • Assignee set to Eregon (Benoit Daloze)

Given that the string has only ASCII characters, an encoding of US-ASCII indeed seems best. Benoit, unless somebody objects soon, I think you should go ahead with your patch.

#7 Updated by Eregon (Benoit Daloze) almost 5 years ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r36588.
Benoit, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


ext/digest/digest.c (hexencode_str_new): return an ASCII string

  • test/digest: tests for all kind of digests encodings [Bug #6799]

#8 [ruby-core:46914] Updated by Eregon (Benoit Daloze) almost 5 years ago

duerst (Martin Dürst) wrote:

Given that the string has only ASCII characters, an encoding of US-ASCII indeed seems best. Benoit, unless somebody objects soon, I think you should go ahead with your patch.

Thank you, I just did.

#9 [ruby-core:47440] Updated by tenderlovemaking (Aaron Patterson) almost 5 years ago

Could we do this for SecureRandom.hex as well? I've attached a patch. /cc akr

#10 [ruby-core:47528] Updated by Eregon (Benoit Daloze) almost 5 years ago

@Aaron: Maybe String#unpack should be fixed instead to return an ASCII string for the 'H', 'h', 'B' and 'b' directives?
It makes sense to me since these will always contain 0-9a-f (0-1 for 'B','b') chars.
This is already done for the 'm', 'M' and 'u' directives in Array#pack.
What do you think?

P.S.: Does anyone know the rationale behind which method to use for 'u','m','M' and 'H','h','B','b' ?
To have a base64 representation you need [s].pack('m') (and so #pack is used for 'u','m','M' to obtain the specified format), but for 'H','h','B','b' you need s.unpack(directive). I think the latter makes more sense when compared to other types such as 'l', in which #unpack is used to obtain the desired format.

#11 [ruby-core:47539] Updated by Anonymous almost 5 years ago

On Fri, Sep 14, 2012 at 07:35:46PM +0900, Eregon (Benoit Daloze) wrote:

Issue #6799 has been updated by Eregon (Benoit Daloze).

@Aaron: Maybe String#unpack should be fixed instead to return an ASCII string for the 'H', 'h', 'B' and 'b' directives?
It makes sense to me since these will always contain 0-9a-f (0-1 for 'B','b') chars.
This is already done for the 'm', 'M' and 'u' directives in Array#pack.
What do you think?

I agree (and would prefer that). It just seems like a larger change
than making this method work for me. ;-)

--
Aaron Patterson
http://tenderlovemaking.com/

Also available in: Atom PDF