Feature #18598

Add String#bytesplice

Added by shugo (Shugo Maeda) almost 3 years ago. Updated almost 2 years ago.

Status: Closed
Assignee: -
Target version: -
[ruby-core:107707]

Description

I withdrew the proposal of String#bytesplice in #13110 because it may cause problems if the specified offset does not land on a character boundary.
But how about raising IndexError in such cases?

# encoding: utf-8

s = "あいうえおかきくけこ"
s.bytesplice(9, 6, "xx")
p s #=> "あいうxxかきくけこ"
s.bytesplice(2, 3, "x") #=> offset 2 does not land on character boundary (IndexError)
s.bytesplice(3, 4, "x") #=> offset 7 does not land on character boundary (IndexError)

Pull request

https://github.com/ruby/ruby/pull/5584

Spec

bytesplice(index, length, str) -> string
bytesplice(range, str)         -> string

Replaces some or all of the content of +self+ with +str+, and returns +str+.
The portion of the string affected is determined using the same criteria as String#byteslice, except that +length+ cannot be omitted.
If the replacement string is not the same length as the text it is replacing, the string will be adjusted accordingly.
The form that takes an Integer will raise an IndexError if the value is out of range; the Range form will raise a RangeError.
If the beginning or ending offset does not land on a character (codepoint) boundary, an IndexError will be raised.
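
The boundary rule above can be illustrated with a minimal pure-Ruby sketch. The helper name bytesplice_sketch is hypothetical and is not the implementation in the pull request; it covers only the (index, length, str) form, assumes that the receiver is a validly encoded string, and treats a negative length as an IndexError as a simplification.

```ruby
# Hypothetical emulation of the proposed (index, length, str) form.
# Assumes s is validly encoded; not the code from the pull request.
def bytesplice_sketch(s, index, length, replacement)
  index += s.bytesize if index < 0            # negative index counts from the end
  if index < 0 || index > s.bytesize
    raise IndexError, "index #{index} out of string"
  end
  raise IndexError, "negative length #{length}" if length < 0

  finish = [index + length, s.bytesize].min   # clamp the end to the string size

  # An offset is on a character boundary when the bytes before it
  # form a validly encoded string on their own.
  [index, finish].each do |offset|
    unless s.byteslice(0, offset).valid_encoding?
      raise IndexError, "offset #{offset} does not land on character boundary"
    end
  end

  head = s.byteslice(0, index)
  tail = s.byteslice(finish, s.bytesize - finish)
  s.replace(head + replacement + tail)
  replacement                                 # the spec above returns +str+
end

s = +"あいうえおかきくけこ"                  # each character is 3 bytes in UTF-8
bytesplice_sketch(s, 9, 6, "xx")
p s #=> "あいうxxかきくけこ"

begin
  bytesplice_sketch(s, 3, 4, "x")             # ends at byte 7, inside "う"
rescue IndexError => e
  puts e.message
end
```

Checking both the beginning and the ending offset before modifying anything means a rejected call leaves the string untouched.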

Motivation

In the text editor Textbringer, the content of a buffer is represented by a String whose encoding is ASCII-8BIT, and force_encoding(Encoding::UTF_8) is called when necessary.
This is because the point (cursor position) and marks are represented by byte offsets for performance, and currently there is no way to modify UTF-8 strings with byte offsets.
If String#bytesplice is introduced, the content of a text buffer can be represented by a UTF-8 string, and force_encoding can be removed: https://github.com/shugo/textbringer/pull/31/files
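
The workaround described above can be sketched roughly as follows. This is an illustration only, not Textbringer's actual code; the helper names replace_bytes and buffer_text are hypothetical.

```ruby
# Hypothetical sketch of editing a binary buffer by byte offsets.
# On an ASCII-8BIT string, String#[]= indexes by byte.
def replace_bytes(buffer, offset, length, text)
  # The replacement must be binary too; otherwise non-ASCII UTF-8 text
  # is incompatible with the non-ASCII bytes already in the buffer.
  buffer[offset, length] = text.b
  buffer
end

# Re-tag a copy of the bytes as UTF-8 whenever text is needed.
def buffer_text(buffer)
  buffer.dup.force_encoding(Encoding::UTF_8)
end

buffer = "あいうえお".b            # String#b returns an ASCII-8BIT copy
replace_bytes(buffer, 6, 3, "ん")  # bytes 6..8 hold "う"
p buffer.encoding                  #=> #<Encoding:ASCII-8BIT> on Ruby versions before 3.4
p buffer_text(buffer)              #=> "あいんえお"
```

Nothing in this approach stops an edit from starting or ending in the middle of a character, which is the case the proposed IndexError would catch.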


Related issues

Related to Ruby master - Feature #19315: Lazy substrings in CRuby (Open)
