Project

General

Profile

Actions

Bug #10845

closed

Subclassing String

Added by sawa (Tsuyoshi Sawada) over 6 years ago. Updated 11 months ago.

Status:
Closed
Priority:
Normal
Target version:
-
[ruby-core:68084]

Description

If I make a subclass of String, the method * returns an instance of that class.

class MyString < String
end

MyString.new("foo").*(2).class #=> MyString

This is different from other similar operations like + and %, which return a String instance.

MyString.new("foo").+("bar").class #=> String
MyString.new("%{foo}").%(foo: "bar").class #=> String

I don't see clear reason why * is to be different from + and %, and thought that perhaps either the behaviour with * is a bug, or the behaviour with + and % is a bug.

Or, is a reason why they are different?


Related issues

Related to Ruby master - Bug #6087: How should inherited methods deal with return values of their own subclass? Closedmatz (Yukihiro Matsumoto)Actions
Has duplicate Ruby master - Bug #11209: [PATCH] Fix for String#+ when subclassedClosedActions

Updated by dummey (Ricky Ng) over 6 years ago

Hmm, guessing that '+' and '%' are being a bit weird...

Verified the some thing happens in: ruby 2.0.0p481

It does look like '<<' is working (or not working) though.

 irb(main):007:0> MyString.new("foo").<<("foo").class
 => MyString

--
Incoherently,
Ricky Ng

Updated by sawa (Tsuyoshi Sawada) over 6 years ago

Ricky

Methods that modify the receiver and return the receiver would have to return the exact same object, which means that the class is the same. That is not the issue here.

Updated by dummey (Ricky Ng) over 6 years ago

Ahh yea, me being derpy =(. What I wanted to actually do was sanity
check some other methods that would return a new_str:

 irb(main):005:0> MyString.new("hello")[0].class
 => MyString
 irb(main):006:0> MyString.new("hello").byteslice(1).class
 => MyString
 irb(main):007:0> MyString.new("hello").capitalize.class
 => MyString
 irb(main):008:0> MyString.new("hello").center(1).class
 => MyString
 irb(main):009:0> MyString.new("hello").chomp.class
 => MyString
 irb(main):010:0> MyString.new("hello").chop.class
 => MyString
 irb(main):011:0> MyString.new("hello").delete("ll").class
 => MyString
 irb(main):012:0> MyString.new("hello").downcase.class
 => MyString
 irb(main):013:0> MyString.new("hello").dump.class
 => MyString
 irb(main):014:0> MyString.new("hello").gsub("ll", "LL").class
 => MyString

--
Incoherently,
Ricky Ng

Updated by marcandre (Marc-Andre Lafortune) over 6 years ago

  • Assignee set to matz (Yukihiro Matsumoto)

It's clear to me that there's no rationale behind the current behavior.

The question is broader than that.

First, it's not only for String:

class MyArray < Array; end
x = MyArray.new([1,2,3])
x.first(2).class == x[0..1].class # => false
(x+x).class == (x * 2).class # => false

etc...

Also troubling is the fact that no constructor is called at all... Take this somewhat absurd example where @other is normally guaranteed to be a Hash (and to_s assumes that):

class MyArray < Array
  def initialize(size = 0, default = nil, **other)
    @other = other
    super(size, default)
  end

   def initialize_clone(x)
    raise "This is never called"
  end

  def initialize_dup(x)
    raise "This is never called"
  end

  def initialize_copy(x)
    raise "This is never called"
  end

 def to_s
    super + @other.keys.to_s
  end
end

MyArray.new(2, :foo, bar: 42).*(2).to_s
# => undefined method `keys' for nil:NilClass (NoMethodError)

The newly created MyArray has no @other because no constructor is called.

Actions #5

Updated by nobu (Nobuyoshi Nakada) over 6 years ago

  • Has duplicate Bug #11209: [PATCH] Fix for String#+ when subclassed added
Actions #6

Updated by mame (Yusuke Endoh) almost 2 years ago

  • Related to Bug #6087: How should inherited methods deal with return values of their own subclass? added
Actions #7

Updated by mame (Yusuke Endoh) almost 2 years ago

  • Description updated (diff)

Updated by jeremyevans0 (Jeremy Evans) 12 months ago

I've added a pull request for making String methods consistently return String instances when called on String subclasses: https://github.com/ruby/ruby/pull/3701

For the reasons marcandre (Marc-Andre Lafortune) mentioned, I think this makes more sense. Instance specific state is not copied over, and you can end up with invalid objects. If we are going to make similar changes for Array in 3.0 (see #6087), I think it's worth considering the same change for String.

Updated by matz (Yukihiro Matsumoto) 11 months ago

I am OK with experimenting. But I've heard Rails relies on the old behavior (SafeStringBuffer?). In that case, it might be hard to apply this.

Matz.

Updated by matsuda (Akira Matsuda) 11 months ago

I've heard Rails relies on the old behavior (SafeStringBuffer?)

Just commented about the impact of this patch on Rails (ActiveSupport::SafeBuffer) https://github.com/ruby/ruby/pull/3701#issuecomment-730960288

Updated by ioquatix (Samuel Williams) 11 months ago

Is there are logic why we should prefer to return String vs custom subclass instances? I can see argument for both way. Why is one way preferred over the other?

Updated by Eregon (Benoit Daloze) 11 months ago

ioquatix (Samuel Williams) See #6087. Returning subclass instances leads to all kind of issues, because there is no good way to build the subclass instance.

+1 to do it for String since we did for Array.

Maybe we should do it for Hash too?
It seems already the case for most Hash methods but not all (e.g. Hash#merge).

Updated by jeremyevans0 (Jeremy Evans) 11 months ago

Eregon (Benoit Daloze) wrote in #note-12:

Maybe we should do it for Hash too?
It seems already the case for most Hash methods but not all (e.g. Hash#merge).

I looked at Hash#merge, but it doesn't have the same issue as the String and Array methods, since it is implemented as dup.merge!, and dup copies the state into the new object instead of losing it.

Updated by Eregon (Benoit Daloze) 11 months ago

jeremyevans0 (Jeremy Evans) wrote in #note-13:

I looked at Hash#merge, but it doesn't have the same issue as the String and Array methods, since it is implemented as dup.merge!, and dup copies the state into the new object instead of losing it.

Indeed, I think too that case is fine.

I checked all Hash methods listed in the docs and returning a Hash and not just returning self: compact, filter, select, reject, invert, transform_keys, transform_values, and all of them return Hash, except for merge:

h = SubHash.new
h[:foo] = 42

# Seems fine, uses #dup
h.merge({}).class # => SubHash

(keep_if returns self, that confused me)

So Hash is fine already :)

Updated by jeremyevans0 (Jeremy Evans) 11 months ago

  • Status changed from Open to Closed

Thanks to matsuda (Akira Matsuda)'s work showing that Rails already does something very similar and only needs a small patch to work with this, I have committed the patch at 58325daae3beefda13ed100782cd19a89cc68771. If it causes any problems or breakage we are unwilling to accept, we can revert the patch.

Updated by naruse (Yui NARUSE) 11 months ago

In 2.7 we break large compatibilities, and this tickets breaks Rails again.
"Rails master already fixes it" doesn't care the problem; we breaks compatibility.

Breaking compatibility itself is not problem, but we can provide very small benefit for Rails application in 2.7 and 3.0.
We only provide disadvantages in recent versions.
Ractor and types are great milestone for future Rails, but they are not for Rails at this time.

I'm serious about the motivation of upgrading to Ruby 3.0 for Rails users.
I worry that so much.

Updated by jeremyevans0 (Jeremy Evans) 11 months ago

naruse (Yui NARUSE) wrote in #note-16:

In 2.7 we break large compatibilities, and this tickets breaks Rails again.
"Rails master already fixes it" doesn't care the problem; we breaks compatibility.

In this case, Ruby's behavior resulted in objects that were not internally consistent. Rails has a fair amount of code just to work around the inconsistency , which they will be able to drop once 3.0 is the minimum version (Rails 7?). See https://github.com/rails/rails/blob/1165401ee962aef0aaf81080e3e7dcab522efa40/activesupport/lib/active_support/core_ext/string/output_safety.rb#L248-L253

I appreciate the importance of backwards compatibility. Almost all of my libraries run on Ruby 1.9 - Ruby 3.0. However, I don't think we should skip fixing problems in core Ruby just because it breaks a small part of Rails, especially if fixing the issue in Rails is simple. We should always weigh the costs and benefits when deciding whether compatibility should be broken. That being said, I respect your opinion, and if this experiment causes too many issues, we can revert.

Updated by marcandre (Marc-Andre Lafortune) 11 months ago

I'll ask the same question I've asked before... Why not deprecate this first?

This gives time to Rails to issue a simple update to the versions that are supported, and in a year or two the behavior can change without breaking anything.

Fixing things is good, but there is no urgency to fix this.

Updated by Eregon (Benoit Daloze) 11 months ago

There are far more (intentional and some of them warned before) breaking changes in Ruby 3 than this, so I don't see the point.
I guess Rails backported all the keyword arguments changes to previous major versions?
Then maybe we can do the same for this.

An old release of e.g. Rails 5 already doesn't work on Ruby 3, so there is no point to be compatible with that.
I understand being compatible with the latest Rails 5 release makes sense, though.

naruse (Yui NARUSE) wrote in #note-16:

I'm serious about the motivation of upgrading to Ruby 3.0 for Rails users.
I worry that so much.

I wouldn't worry.
If people upgraded to Ruby 2.7, they'll be happy to update to Ruby 3 which has understandable keyword arguments semantics and many other things.
I think a lot of people are excited about Ruby 3.
If there was a release to worry whether people would upgrade to it it was 2.7, but it seems many people upgraded to it.

Actions

Also available in: Atom PDF