Feature #21785: Add signed and unsigned LEB128 support to pack / unpack - Ruby - Ruby Issue Tracking System

Added by tenderlovemaking (Aaron Patterson) 4 months ago. Updated 2 months ago.

Status: Closed
Assignee:
Target version:

[ruby-core:124258]

Description

Hi,

I'd like to add signed and unsigned LEB128 support to the pack and unpack methods. LEB128 is a variable-length encoding scheme for integers. You can read the Wikipedia entry about it here: https://en.wikipedia.org/wiki/LEB128

LEB128 is used in DWARF, WebAssembly, MQTT, and Protobuf. I'm sure there are other formats, but these are the ones I'm familiar with.
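
For readers unfamiliar with the encoding, here is a minimal pure-Ruby sketch of unsigned LEB128. The helper names uleb128_encode and uleb128_decode are hypothetical and are not part of the proposed patch; they only illustrate the 7-bits-per-byte layout with a continuation bit.

```ruby
# Hypothetical helper (not part of the patch): encode a non-negative Integer
# as unsigned LEB128, 7 bits per byte, least significant group first.
def uleb128_encode(value)
  raise ArgumentError, "value must be non-negative" if value.negative?

  bytes = []
  loop do
    byte = value & 0x7F          # take the low 7 bits
    value >>= 7
    byte |= 0x80 if value > 0    # continuation bit: more bytes follow
    bytes << byte
    break if value.zero?
  end
  bytes.pack("C*")               # binary (ASCII-8BIT) string
end

# Hypothetical helper: decode one unsigned LEB128 value from the start of str.
def uleb128_decode(str)
  result = 0
  shift = 0
  str.each_byte do |byte|
    result |= (byte & 0x7F) << shift
    return result if byte & 0x80 == 0   # last byte has the high bit clear
    shift += 7
  end
  raise ArgumentError, "truncated LEB128 input"
end

uleb128_encode(624_485)            # => "\xE5\x8E\x26" (the Wikipedia example)
uleb128_decode("\xE5\x8E\x26".b)   # => 624485
```

Small values fit in a single byte, and each additional byte carries 7 more bits.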

I sent a pull request here: https://github.com/ruby/ruby/pull/15589

I'm proposing K for the unsigned version and k for the signed version. I just picked K/k because the letter was available; I'm open to other format strings.

Thanks for your consideration!


Updated by tenderlovemaking (Aaron Patterson) 4 months ago
#1 [ruby-core:124259]

Sorry, I probably should have put an example in the original post. Here is a sample of the usage:

irb(main):003> [0xFFF].pack("K")
=> "\xFF\x1F"
irb(main):004> [0xFFF].pack("K").unpack1("K")
=> 4095
irb(main):005> [-123].pack("k")
=> "\x85\x7F"
irb(main):006> [-123].pack("k").unpack1("k")
=> -123
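
The signed bytes in the session above can be reproduced with a pure-Ruby sketch of signed LEB128. The helpers sleb128_encode and sleb128_decode are hypothetical names used for illustration and do not come from the patch; the sketch assumes the usual two's-complement rule where encoding stops once the remaining value is pure sign extension.

```ruby
# Hypothetical helper: encode an Integer as signed LEB128.
def sleb128_encode(value)
  bytes = []
  loop do
    byte = value & 0x7F
    value >>= 7                          # arithmetic shift keeps the sign
    sign_bit_set = (byte & 0x40) != 0
    # Stop when the rest of the value is only sign extension of this byte.
    done = (value == 0 && !sign_bit_set) || (value == -1 && sign_bit_set)
    byte |= 0x80 unless done             # continuation bit
    bytes << byte
    break if done
  end
  bytes.pack("C*")
end

# Hypothetical helper: decode one signed LEB128 value from the start of str.
def sleb128_decode(str)
  result = 0
  shift = 0
  str.each_byte do |byte|
    result |= (byte & 0x7F) << shift
    shift += 7
    next if byte & 0x80 != 0             # more bytes follow
    result -= (1 << shift) if byte & 0x40 != 0   # sign-extend the final group
    return result
  end
  raise ArgumentError, "truncated LEB128 input"
end

sleb128_encode(-123)             # => "\x85\x7F"
sleb128_decode("\x85\x7F".b)     # => -123
```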

Updated by matz (Yukihiro Matsumoto) 4 months ago
#2 [ruby-core:124268]

I am positive about the addition of LEB128. But I don't really like K/k because it doesn't remind me of LEB128 at all (though I know we've used L, E, B already).

Given that the only case pairs not yet used are k, r, and y, either R (vaRiable length), or Y (next to W - BER) would be better than K/k.

Matz.

Updated by tenderlovemaking (Aaron Patterson) 4 months ago
#3 [ruby-core:124272]

matz (Yukihiro Matsumoto) wrote in #note-2:

> I am positive about the addition of LEB128. But I don't really like K/k because it doesn't remind me of LEB128 at all (though I know we've used L, E, B already).
>
> Given that the only case pairs not yet used are k, r, and y, either R (vaRiable length), or Y (next to W - BER) would be better than K/k.
>
> Matz.

Thanks for the feedback. I've updated the patch to use R/r!

Updated by mame (Yusuke Endoh) 4 months ago
#4 [ruby-core:124287]

It's a shame unpack doesn't tell you how many bytes it read. You'd probably want an unpack variant that returns the final offset too, or a specifier that returns the current offset (like o?).

bytes = "\x01\x02\x03"
offset = 0
leb128_value1, offset = bytes.unpack("Ro", offset: offset) #=> 1
leb128_value2, offset = bytes.unpack("Ro", offset: offset) #=> 2
leb128_value3, offset = bytes.unpack("Ro", offset: offset) #=> 3
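
Until unpack can report an offset, the same sequential reading can be sketched in pure Ruby. The helper read_uleb128 below is hypothetical (it is not an existing or proposed API); it returns both the decoded value and the offset of the next unread byte.

```ruby
# Hypothetical helper: read one unsigned LEB128 value starting at `offset`
# and return [value, next_offset].
def read_uleb128(bytes, offset = 0)
  result = 0
  shift = 0
  loop do
    byte = bytes.getbyte(offset)
    raise ArgumentError, "truncated LEB128 input" if byte.nil?
    offset += 1
    result |= (byte & 0x7F) << shift
    return [result, offset] if byte & 0x80 == 0   # high bit clear: last byte
    shift += 7
  end
end

bytes = "\x01\x02\x03".b
offset = 0
values = []
while offset < bytes.bytesize
  value, offset = read_uleb128(bytes, offset)
  values << value
end
values   # => [1, 2, 3]
```

Returning the offset alongside the value lets the caller keep a single cursor into the buffer without slicing the string.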

Updated by tenderlovemaking (Aaron Patterson) 4 months ago
#5 [ruby-core:124294]

mame (Yusuke Endoh) wrote in #note-4:

> It's a shame unpack doesn't tell you how many bytes it read. You'd probably want an unpack variant that returns the final offset too, or a specifier that returns the current offset (like o?).
>
> bytes = "\x01\x02\x03"
> offset = 0
> leb128_value1, offset = bytes.unpack("Ro", offset: offset) #=> 1
> leb128_value2, offset = bytes.unpack("Ro", offset: offset) #=> 2
> leb128_value3, offset = bytes.unpack("Ro", offset: offset) #=> 3

You could tell how many bytes you read based on the size of the leb128_value returned. But I agree, getting the information directly from unpack would be nice.

Updated by mame (Yusuke Endoh) 4 months ago
#6 [ruby-core:124298]

> You could tell how many bytes you read based on the size of the leb128_value returned.

That approach is unreliable because LEB128 is redundant. For example, both "\x03" and "\x83\x00" are valid LEB128 encodings of the value 3.
See the note of the section Values - Integers, in the Wasm spec.
https://webassembly.github.io/spec/core/binary/values.html#integers
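
The redundancy is easy to demonstrate with a small pure-Ruby sketch. Both helpers below are hypothetical names introduced for illustration: uleb128_padded produces an encoding of a chosen byte width, and uleb128_value_and_length reports how many bytes a decode consumed.

```ruby
# Hypothetical helper: unsigned LEB128 padded to exactly `width` bytes.
def uleb128_padded(value, width)
  if value.negative? || (value >> (7 * width)) != 0
    raise ArgumentError, "value does not fit in #{width} bytes"
  end

  bytes = Array.new(width) do |i|
    group = (value >> (7 * i)) & 0x7F
    i == width - 1 ? group : group | 0x80   # continuation bit on all but the last
  end
  bytes.pack("C*")
end

# Hypothetical helper: decode one value and report the bytes consumed.
def uleb128_value_and_length(str)
  result = 0
  str.each_byte.with_index do |byte, i|
    result |= (byte & 0x7F) << (7 * i)
    return [result, i + 1] if byte & 0x80 == 0
  end
  raise ArgumentError, "truncated LEB128 input"
end

short  = uleb128_padded(3, 1)        # => "\x03"
padded = uleb128_padded(3, 2)        # => "\x83\x00"
uleb128_value_and_length(short)      # => [3, 1]
uleb128_value_and_length(padded)     # => [3, 2]
```

The same value comes back with two different lengths, so the consumed byte count cannot be recovered from the decoded value alone.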

Updated by tenderlovemaking (Aaron Patterson) 4 months ago
#7 [ruby-core:124304]

mame (Yusuke Endoh) wrote in #note-6:

> That approach is unreliable because LEB128 is redundant. For example, both "\x03" and "\x83\x00" are valid LEB128 encodings of the value 3.

Ah of course. I didn't think about that. 🤦‍♀️

Updated by tenderlovemaking (Aaron Patterson) 4 months ago
#8

Status changed from Open to Closed

Applied in changeset git|d0b72429a93e54f1f956b4aedfc25c57dc7001aa.

Add support for signed and unsigned LEB128 to pack/unpack.

This commit adds a new pack format command R and r for unsigned and
signed LEB128 encoding. The "r" mnemonic is because this is a
"vaRiable" length encoding scheme.

LEB128 is used in various formats including DWARF, WebAssembly, MQTT,
and Protobuf.

[Feature #21785]

Updated by byroot (Jean Boussier) 4 months ago
#9

Related to Feature #21796: unpack variant that returns the final offset added

Updated by matz (Yukihiro Matsumoto) 4 months ago
#10 [ruby-core:124334]

It is too late to introduce it in Ruby 4.0; let's aim for 4.1.

Matz.

Updated by byroot (Jean Boussier) 4 months ago
#11

Status changed from Closed to Open

Updated by tenderlovemaking (Aaron Patterson) 2 months ago
#12 [ruby-core:124676]

Is it OK if I merge this again?

Thanks

Updated by matz (Yukihiro Matsumoto) 2 months ago
#13 [ruby-core:124776]

Yes.

Matz.

Updated by tenderlovemaking (Aaron Patterson) 2 months ago
#14

Status changed from Open to Closed

Applied in changeset git|c61f52a012f0a390a869db4825143187ea468d21.

[Feature #21785] Add LEB128 again (#16123)

Revert "Revert pack/unpack support for LEB128"

This reverts commit 77c3a9e447ec477be39e00072e1ce3348d0f4533.

Update specs for LEB128
