Feature #14336

Create new method String#symbol? and deprecate Symbol class

Added by dsferreira (Daniel Ferreira) almost 2 years ago. Updated almost 2 years ago.

Status:
Rejected
Priority:
Normal
Assignee:
-
Target version:
-
[ruby-core:84698]

Description

From the discussions on the three previous issues related to the String vs Symbol subject (5964, 7792, 14277) there are some conclusions we can assume:

  • Current String vs Symbol is not the ideal scenario. See: Matz and Koichi comments.
  • Current philosophy is to use Symbols as identifiers and Strings when strings are needed.
  • Current situation is that Symbols are being used in many code bases as strings except for strings that really need the String methods.
  • Current situation is that we are designing APIs to handle both String and Symbol inputs forcing an overhead of API development.
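To make the last point concrete, here is a minimal sketch of the kind of normalisation an API ends up doing when it accepts both key types. The method name `fetch_setting` and the `settings` hash are hypothetical, chosen only for illustration:

```ruby
# Hypothetical helper: accept a String or a Symbol and normalise it once,
# so the rest of the method only has to deal with one key type.
def fetch_setting(settings, name)
  key = name.to_sym # both String and Symbol respond to #to_sym
  settings.fetch(key)
end

settings = { timeout: 30 }
from_symbol = fetch_setting(settings, :timeout)
from_string = fetch_setting(settings, "timeout")
```

Both calls resolve to the same entry; the cost is the extra conversion line in every method that wants to be lenient about its input.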

I propose the deprecation of Symbol class and the introduction of String#symbol?.

foo = :foo
foo.class # => String
foo.symbol? # => true
bar = "bar"
bar.class # => String
bar.symbol? # => false

As a backwards-compatible transition path I propose:

class Symbol
  # Transitional shim: warn, then match real Symbols and symbol-like Strings.
  def self.===(var)
    warn("Warning message regarding deprecated class")
    var.class == Symbol || (var.class == String && var.symbol?)
  end
end

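class String
  def is_a?(klass)
    # Compare the class objects directly: `case klass when String` would test
    # String === klass, which is false when klass is the class String itself.
    if klass.equal?(Symbol)
      symbol?
    else
      # Keep the normal answer for String, Object, Comparable, etc.
      super
    end
  end
end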

History

#1

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

  • Description updated (diff)

Updated by shevegen (Robert A. Heiler) almost 2 years ago

Some comments.

Current String vs Symbol is not the ideal scenario. See:
Matz and Koichi comments.

I think both pointed out some problems; koichi pointed out the transition
path problem.

I think IF there is a decision to remove symbols (just assuming this for
the moment), then a transition path should be enabled to start early.

Possibly an early deprecation warning; a document on the official ruby
homepage too, to encourage ruby hackers to use strings rather than
symbols (again, IF it would be decided to do so); and possibly a hybrid
situation where both symbols and strings would still work, before
eventually removing symbols. Alternatively ruby could also internally
treat both the same but I really don't know anything near as much to
be able to say something ... useful. What I am just pointing out is
that people would need hints about any decision in regards to a
change. I guess it could be done in ruby 3.x but it should not be done
earlier really.

Even then there will be quite a transition cost involved. I am personally
still not sure whether this would be worth it (and it's good that I
don't have to decide), so I am writing this as a big IF.

Matz said a few more things in regards to how symbols originated; and
it is also somewhat true that the distinction between symbols and
by-default frozen strings, is not a huge one, in my opinion. I mean,
a frozen string, that remains the same, should be quite similar to
a symbol, that also never changes, yes? Even if the semantics are not
completely the same.

  • One thing that I would like to point out is the syntax issue. I
    like being able to do:

    :foobar

If I would have to do 'foobar' then I'd have to use one extra character.
So I prefer :foobar here. I also actually like Symbols and would rather
keep them - HOWEVER, I also agree (and wrote so myself) that newcomers
may be confused about the difference and the different semantics. So it
may be easier for newcomers to not have to wonder about when to use
what.

There are also abominations such as HashWithIndifferentAccess. Now I
never used it (it is too long, too verbose and too crazy an idea) but
I can understand it. People will say something like "hey, I don't want
to have to care between symbols and strings, either way, I just want
to have a hash that can deal with this".

For that situation, I think it may be easier to just add a method to
Hash to allow either key variant; or add an extension class to hash
but with a sane name, not this long name. (Or remove symbols, then
nobody has to think about the difference; and we'd not see
HashWithIndifferentAccess).
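A minimal sketch of the "method that allows either key variant" idea could look like the following. `fetch_either` is a hypothetical helper name, not something that exists in Ruby or ActiveSupport, and it is written as a plain method rather than a patch on Hash:

```ruby
# Hypothetical helper: look a key up under its String form first and fall
# back to its Symbol form. Raises KeyError if neither form is present.
def fetch_either(hash, key)
  hash.fetch(key.to_s) { hash.fetch(key.to_sym) }
end

by_string = { "name" => "ruby" }
by_symbol = { name: "ruby" }

found_in_string_hash = fetch_either(by_string, :name)
found_in_symbol_hash = fetch_either(by_symbol, "name")
```

This only papers over lookups; it does nothing about a hash that holds both "name" and :name with different values.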

  • IF symbols are removed, then String#symbol? does not make any sense. :)

I think it would not be logical to require ruby hackers to query
whether a string is a symbol, when symbols ... don't exist anymore. :P

  • Current situation is that we are designing APIs to handle both String and Symbol inputs forcing an overhead of API development.

You mention one (possible) drawback - but you don't mention drawbacks
in regards to other changes that are required if symbols were removed.

Even your example of duck patching class String, it is not the same
as the situation BEFORE.

Note that I am not at all against removing symbols; and I don't mind
to change my code either. But I still don't really see the huge benefit
in it for non-newcomers. I have absolutely no problem in dealing with
both symbols and strings in any of my ruby code, so there would not
be a huge net benefit to me.

Even though I am not opposed to getting rid of symbols, to be honest,
I'd rather like to keep the current status quo, simply because it
does not give me any problem, whereas with the suggestion to change,
I'd have to change a lot without a huge net benefit.

I don't want to distract from the discussion though, so this is just
my personal opinion, and I'll let others comment.

At the end of the day, matz has to decide and only he knows how he
wants ruby people to use symbols (or not use them; I think matz once
said that he was surprised to see how symbols were used e. g. in
the rails ecosystem many years ago; I am sure I also misuse symbols,
but I like them too).

I also just realized that Symbols do not have a .new method. :)

Updated by zverok (Victor Shepelev) almost 2 years ago

To be completely honest, I feel like you are solving a problem that exists for a very small percentage of Rubyists (I am not saying "non-existent problem" only out of politeness).

When people answer you with "removing Symbol will lead to this, that, and that problem", they are just giving you examples of how deeply Symbols are rooted in Ruby, and why they will never be removed; they are not asking you to "solve those three problems and we are done".

There is absolutely nothing about the Symbol/String difference that needs "fixing", and it is this way for most of us. Symbol is an internal name, String is user input; it is a neat and useful semantic difference, not "ambiguity" or "bad design".

And let it just be this way.

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

Hi Victor,

Symbol is an internal name, String is user input

What do you mean by that? Can you be more specific?

Updated by rosenfeld (Rodrigo Rosenfeld Rosas) almost 2 years ago

I like this transition path and support it. I'd love to see symbols gone for good. The 'string'.symbol? trick seems to fix most issues with existing library designs that interpret symbols and strings differently.

Updated by zverok (Victor Shepelev) almost 2 years ago

Symbol is an internal name, String is user input

What do you mean by that? Can you be more specific?

Symbols are identifiers, strings are data

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

There is also one important thing here that deserves consideration: We are making a great effort to improve the performance of strings and everything associated with them.

Symbols are not taking advantage of those improvements.

If we keep saying that everything is a Symbol except user input we will be constraining ourselves to use less performant code. Correct me if I’m wrong but if I’m not mistaken immutable strings can be more performant than symbols, not to mention the memory consumption.

By merging symbols into strings we can focus all our efforts on improving string performance, leaving symbols out of the equation.

I only see positive things here.

The transition path that I present will make sure that no code will be broken.

Updated by Hanmac (Hans Mackowiak) almost 2 years ago

dsferreira (Daniel Ferreira) wrote:

There is also one important thing here that deserves consideration: We are making a great effort to improve the performance of strings and everything associated with them.

Symbols are not taking advantage of those improvements.

There you are totally wrong. Symbols can't take advantage of that because Symbols were already optimized to the max before (frozen) strings were even added.

If you want to talk about memory usage, google why ruby treats strings with more than 23 characters differently.

You only see that Strings got a bit optimized, but you totally miss that this is only a fraction of how Symbols were optimized to begin with.

Updated by jeremyevans0 (Jeremy Evans) almost 2 years ago

  • Backport deleted (2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN)
  • Tracker changed from Bug to Feature

dsferreira (Daniel Ferreira) wrote:

There is also one important thing here that deserves consideration: We are making a great effort to improve the performance of strings and everything associated with them.

Symbols are not taking advantage of those improvements.

I'm not sure what you mean here. Symbols were already as fast or faster than Strings as internally they are stored as a direct value (at least real symbols, as opposed to those created via String#to_sym).
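One observable consequence of this (a sketch of object identity only, not a demonstration of the internal representation) is that a given Symbol is always the same object, whereas separately allocated Strings with the same content are not:

```ruby
# Two occurrences of the same Symbol literal are the very same object.
same_symbol = :foo.equal?(:foo)

# Converting a String to an already existing Symbol yields that same object.
same_interned = "foo".to_sym.equal?(:foo)

# Two separately allocated Strings are equal by value but distinct objects.
a = String.new("foo")
b = String.new("foo")
equal_value = (a == b)
same_string = a.equal?(b)
```

Here `same_symbol`, `same_interned` and `equal_value` are true, while `same_string` is false, which is why Symbol comparison can be an identity check while String comparison has to look at the contents.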

If we keep saying that everything is a Symbol except user input we will be constraining ourselves to use less performant code.

I don't think I've read anyone saying that "everything is a Symbol except user input". Multiple people have told you that symbols serve as identifiers. While it is true that some other languages merge identifiers and strings, some other languages use different classes for text and data too, where Ruby uses String for both. Symbols are a core part of ruby, and removing/deprecating them would cause serious problems, as I explained in #7792 and #14277.

Correct me if I’m wrong but if I’m not mistaken immutable strings can be more performant than symbols, not to mention the memory consumption.

I think there are few cases where a frozen string performs better than a symbol. I'm guessing #to_s would be faster, but beyond that and possibly other cases where you are returning a string, symbols generally perform as well or better than strings.

Have you actually done any benchmarking in this area? Here's a simple benchmark:

require 'benchmark/ips'

a = 'a'
a2 = a.dup
fa = a.dup.freeze
fa2 = a.dup.freeze
Benchmark.ips do |x|
  x.report("string"){a == a2}
  x.report("frozen string"){fa == fa2}
  x.report("literal fstring"){'a'.freeze == 'a'.freeze}
  x.report("symbol"){:a == :a}
end

results:

              string      6.031M (± 0.6%) i/s -     30.213M in   5.009466s
       frozen string      6.032M (± 0.6%) i/s -     30.237M in   5.012902s
     literal fstring      6.997M (± 0.7%) i/s -     35.148M in   5.023640s
              symbol      7.108M (± 0.7%) i/s -     35.580M in   5.005652s

Then you have cases like #hash, where String#hash is O(n) and Symbol#hash is O(1).

str = 'a'*1_000_000
fstr = str.dup.freeze
sym = str.to_sym
Benchmark.ips do |x|
  x.report("string"){str.hash}
  x.report("frozen string"){fstr.hash}
  x.report("symbol"){sym.hash}
end

results, note the difference between k and M:

              string      2.712k (± 0.4%) i/s -     13.770k in   5.077791s
       frozen string      2.713k (± 0.4%) i/s -     13.770k in   5.076325s
              symbol      6.589M (± 0.7%) i/s -     33.047M in   5.015775s

By merging symbols into strings we can focus all our efforts on improving string performance, leaving symbols out of the equation.

Symbols were already faster, so this doesn't make sense.

I only see positive things here.

I see a huge negative thing here.

The transition path that I present will make sure that no code will be broken.

It sounds like these would still be broken, with no deprecation warning:

h['a'] = 1
h[:a] = 2
h['a'] # was 1, now 2

'a' == :a # was false, now true

'a'.hash == :a.hash # was false, now true
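For reference, the "was" side of these examples can be checked as a runnable sketch of current behaviour (the empty hash `h` is an assumption, since the snippet above does not show how it was created):

```ruby
h = {}
h['a'] = 1
h[:a]  = 2

string_value  = h['a']      # the String key keeps its own entry
symbol_value  = h[:a]       # the Symbol key is a separate entry
entry_count   = h.size
equal_objects = ('a' == :a) # a String never equals a Symbol today
```

Today this gives 1, 2, 2 and false respectively; any unification that makes 'a' and :a equal would silently change all four.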

I don't see any transition plan for the C-API.

Daniel, are you OK with closing #14277? There doesn't seem to be a reason to have two separate issues opened for basically the same feature request.

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

Jeremy, so we have some possible performance problems regarding frozen string literals that we should try to understand.

This is what I have on my side:

Code:

#!/usr/bin/env ruby --enable-frozen-string-literal

require "benchmark/ips"

Benchmark.ips do |x| 
  x.report("literal fstring"){ a = 'a'; a == 'a'}
  x.report("symbol"){a = :a; a == :a} 
end

Results for ruby 2.3.0:

Warming up --------------------------------------
     literal fstring   263.644k i/100ms
              symbol   260.400k i/100ms
Calculating -------------------------------------
     literal fstring     10.969M (± 3.7%) i/s -     54.838M in   5.007198s
              symbol     10.809M (± 5.3%) i/s -     53.903M in   5.002792s

Results for ruby 2.5.0

Warming up --------------------------------------
     literal fstring   301.226k i/100ms
              symbol   304.846k i/100ms
Calculating -------------------------------------
     literal fstring     10.701M (± 3.3%) i/s -     53.618M in   5.016530s
              symbol     11.121M (± 3.4%) i/s -     55.787M in   5.022618s

Why did frozen string literals outperform symbols in ruby 2.3.0 while that is no longer the case in 2.5.0?
In fact it seems frozen string literals have regressed!

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

It sounds like these would still be broken, with no deprecation warning:

h['a'] = 1
h[:a] = 2
h['a'] # was 1, now 2

'a' == :a # was false, now true

'a'.hash == :a.hash # was false, now true

I don't see any transition plan for the C-API.

From my proposal:

bar = :foo
bar.class # => String
bar.symbol? # => true

baz = "foo"
baz.class # => String
baz.symbol? # => false

bar == baz # => false

h[bar] == h[baz] # => ? Open to discussion but I would discuss it in a different issue. Not this one. For now let's not break backwards compatibility.

Let's keep the current C API in the background. (Since we still have performance gains by using symbols in certain situations, maybe we can use the mechanism to some degree and do the optimisations in the background. Not sure if this is feasible or not!)

Updated by jeremyevans0 (Jeremy Evans) almost 2 years ago

dsferreira (Daniel Ferreira) wrote:

Why did frozen string literals outperform symbols in ruby 2.3.0 while that is no longer the case in 2.5.0?
In fact it seems frozen string literals have regressed!

They didn't regress. There is overlap between the ranges, so you can't say with statistical confidence from this benchmark whether symbols or strings are faster, or whether the results have changed between 2.3 and 2.5:

2.3:
     literal fstring     10.969M (± 3.7%) => 10.56..11.37
              symbol     10.809M (± 5.3%) => 10.24..11.38

2.5:
     literal fstring     10.701M (± 3.3%) => 10.35..11.05
              symbol     11.121M (± 3.4%) => 10.74..11.50
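The ranges can be reproduced from the reported mean and percentage with a short sketch. The helper names `ips_range` and `overlap?` are made up for illustration, and treating "mean ± percentage" as a plain interval is a simplification rather than a proper significance test:

```ruby
# Turn "mean (± pct%)" into the interval mean - pct% .. mean + pct%.
def ips_range(mean, pct)
  delta = mean * pct / 100.0
  (mean - delta)..(mean + delta)
end

# Two closed intervals overlap when each one starts before the other ends.
def overlap?(a, b)
  a.begin <= b.end && b.begin <= a.end
end

fstring_23 = ips_range(10.969, 3.7)
symbol_23  = ips_range(10.809, 5.3)
fstring_25 = ips_range(10.701, 3.3)
symbol_25  = ips_range(11.121, 3.4)

overlap_23 = overlap?(fstring_23, symbol_23)
overlap_25 = overlap?(fstring_25, symbol_25)
```

Both `overlap_23` and `overlap_25` come out true, so neither run separates the two cases.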

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

Daniel, are you OK with closing #14277? There doesn't seem to be a reason to have two separate issues opened for basically the same feature request.

Jeremy, there are still conversations going on out there. Let's see how it evolves. I will redirect people to this one if need be.
But maybe we can update its description with the information that solution 1. has now a proposal open.
What do you think?

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

They didn't regress. There is overlap between the ranges, so you can't say with statistical confidence from this benchmark whether symbols or strings are faster, or whether the results have changed between 2.3 and 2.5:

True. But the behaviour statistically is consistent on my end so I believe it would be better to take a closer look at this.

Updated by jeremyevans0 (Jeremy Evans) almost 2 years ago

dsferreira (Daniel Ferreira) wrote:

Jeremy, there are still conversations going on out there. Let's see how it evolves. I will redirect people to this one if need be.

When I made the recommendation to close #14277, there weren't any recent conversations going on in it. However, as it has seen activity in the last hour or so, there's a reason to keep it open now.

But maybe we can update its description with the information that solution 1. has now a proposal open.
What do you think?

That sounds fine, but I'm not sure how to edit the description.

Updated by matz (Yukihiro Matsumoto) almost 2 years ago

  • Status changed from Open to Rejected

Symbols in Ruby are for identifiers. They are kind of like Enums in other languages such as Swift or C# (although Symbols are more primitive). I don't think you would propose unifying Strings and Enums to the Swift community.

I understand some other languages do not provide Symbols, and you are not familiar with the concept of Symbols in Ruby. But it's not wise to change Ruby because of your unfamiliarity.

To persuade me to unify them, you need a clear benefit, big enough to compensate potential compatibility breakage. I don't think this proposal has it. 'less confusion' is not big enough.

Matz.

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

matz (Yukihiro Matsumoto) wrote:

But it's not wise to change Ruby because of your unfamiliarity.

This subject is really hard to discuss.

In the description I start by mentioning the other three issues already created about the subject.
And I also mention the comments previously made.

If it was just my unfamiliarity I don't think we would have a situation where so many people using ruby use things like HashWithIndifferentAccess.
If there is something that I don't like, it is using such things.
If I was unfamiliar with ruby I would happily use it as many do.

I refuse to do it. But then I have to deal with that.
HashWithIndifferentAccess is a very big smell in the language.

This proposal now rejected was an attempt to come up with a transition path to put things in the right place.

Because if the ruby community has the need to come up with such things as HashWithIndifferentAccess then we have a really big problem.

I still think ruby 3 is an awesome opportunity we will have to resolve such matter.
And I'm still committed to help on that.

Many Thanks

Updated by nobu (Nobuyoshi Nakada) almost 2 years ago

Note that we had tried it once and concluded it wasn't good.
We don't reject it with no evidence.

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

nobu (Nobuyoshi Nakada) wrote:

Note that we had tried it once and concluded it wasn't good.

I understand that Nobu.
Could you please tell us what was the solution implemented and why it didn't work?

Updated by rosenfeld (Rodrigo Rosenfeld Rosas) almost 2 years ago

Matz, I'd just like to remind you that this discussion is not about unfamiliarity.

When I suggested the removal of symbols from the language several years ago I already understood the differences between Symbols and Strings. I'd argue that most Rubyists do, and yet many would love to see Symbols gone from the language (and apparently many others certainly want to keep them).

Most of the confusion caused by the existence of symbols comes from hash[:a] != hash['a'], and this leads to bugs every single day in some Ruby project out there. It's pretty rare that someone would intentionally want hash[:a] to be different from hash['a']. For most code bases that would usually mean a bug.

The fact that the new hash syntax creates symbols as keys just makes it worse, as people usually prefer the new syntax as they have to type much less. So, here's what typically happens in a Ruby application:

unless user = user_from_cache(session_id)
  user = { id: 1, username: 'rodrigo', permissions: { may_view_reports: true } }
  store_user_in_cache session_id, user
end

Now, suppose user_from_cache deserializes from Redis, while store_user_in_cache serializes to Redis. The Redis driver supports symbols as keys and will convert them to strings. When deserializing back, the hash keys will be strings. That means calling user[:id] would return nil if user was retrieved from cache. This sort of thing happens very often, unfortunately, so we can't just pretend that all the problems that come from the existence of symbols are simply caused by people being ignorant of the conceptual difference between symbols and strings. They cause real pain to many rubyists, even experienced ones.
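The same round-trip effect can be reproduced without Redis. The sketch below uses the standard library's JSON as a stand-in for the cache serialization, which is an assumption made only so that the example is self-contained:

```ruby
require 'json'

user = { id: 1, username: 'rodrigo' }

# Serialising turns the Symbol keys into plain JSON strings...
cached = JSON.generate(user)

# ...so by default they come back as String keys.
restored = JSON.parse(cached)
by_symbol = restored[:id]
by_string = restored['id']

# symbolize_names: true asks the parser to hand back Symbol keys instead.
symbolized_id = JSON.parse(cached, symbolize_names: true)[:id]
```

After the round trip `by_symbol` is nil and `by_string` is 1; only the explicitly symbolized parse gives back 1 for `[:id]`.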

We never know whether hash keys should be considered symbols or strings when we get them from somewhere else. For example, Sequel models have a #to_h instance method and the only way we can know whether keys are symbols or strings is by either looking at the documentation or trying it and seeing what we get. This is more work than we'd like to have when dealing with hashes all over the place. It is extra time spent reading documentation elsewhere, just because for some people it's important that hash[:a] be different from hash['a'].

This has nothing to do with unfamiliarity. There's a gray area anyway when we talk about what should be considered an identifier. Having to think about whether symbols or strings should be used for a particular case is also a waste of time for many cases while designing Ruby code, in my opinion.

You often say that symbols already exist in other languages and have been borrowed from them. Since I don't know those languages, let me ask you: do those languages support the equivalent of Ruby's Hash and allow both strings and symbols as keys? If so, haven't their users ever complained about that?

It feels very confusing to me, even if you clearly understand the conceptual differences between symbols and strings.

That's why I suggested Hash to be implemented as HashWithIndifferentAccess in issue #7797.

If we can't get rid of Symbols, but if issue #7797 would be accepted, I guess most of the confusion in existing Ruby code bases would be gone. Could you please reconsider it?

Updated by zverok (Victor Shepelev) almost 2 years ago

It's pretty rare that someone would intentionally want hash[:a] to be different from hash['a'].

This (and most of what follows) is only true for a lazily designed web app with no boundaries between input and internals. For me, "I want to write hash[:foo] but this hash turns out to have string keys" is always a useful and enlightening sign of a design flaw, not a sign of "Screw it and give me HashWithIndifferentAccess".

I can (and will) only blame Rails for this misconception. Funny thing Rails themselves eventually thought better against "just take this hash, it has all your GET params", but, being Rails, solved the problem by introducing some more "hash but not exactly hash" (BlahParams) concepts. ¯\_(ツ)_/¯

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

zverok (Victor Shepelev) wrote:

I can (and will) only blame Rails for this misconception.

What about ruby 1.9 Hash syntax? Wasn't it an invitation to use symbols all over the place? Who's to blame then?

Since ruby 1.9 it seems it is a crime to use strings as hash keys. And yes I still use the old hash syntax every time I have the chance but again. Fighting the status quo is an inglorious task...

Let's stop with the blame here and fix the issue, shall we?

Updated by zverok (Victor Shepelev) almost 2 years ago

Let's stop with the blame here and fix the issue, shall we?

There is no issue to fix, except for not understanding why the difference matters and is a feature, not a bug of the language. But I really don't know how to fix the latter :(

Updated by zverok (Victor Shepelev) almost 2 years ago

I can (and will) only blame Rails for this misconception.

What about ruby 1.9 Hash syntax? Wasn't it an invitation to use symbols all over the place? Who's to blame then?

Probably you misread me. I blame Rails for people thinking "Symbol and String is just ambiguous, HashWithIndifferentAccess to rule them all!"

Ruby 1.9 Hash syntax is a blessing, exactly because it forces you to tell internal identifiers from external data.

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

Aren't we supposed to be able to perform string operations on hash keys?
People don't care about that.
They simply use the cool syntax and then leave the burden of dealing with the necessary conversions for later.
Is it just me? I don't think so. I know from experience it is not just me.

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

Victor, the idealisation of a feature is one thing; its use in the wild is another.
When one thing doesn't match the other there is something broken.
And that is the case.

Updated by zverok (Victor Shepelev) almost 2 years ago

Victor, the idealisation of a feature is one thing; its use in the wild is another.

Yeah, yeah, been told so. I don't know whether this is the right moment to say it, but just so you know, I have used Ruby since 2004, both for work and hobby projects, and stopped idealizing anything in the language... let's say, a while ago.

Though, I sometimes smile remembering my crusades in ruby-talk (it was pretty active those days, including regular matz's presence) about how Time#inspect is so broken and why NOBODY can see this.

When one thing doesn't match the other there is something broken.
And that is the case.

Are you not surprised by how the discussion is going, then?
Two or three users are talking about "everybody hates", "people are confused", "WE need to fix this", and everybody else is either indifferent or "nah, nothing is broken here". Don't you think it may be a sign of misjudging the "brokenness" on your side?

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

zverok (Victor Shepelev) wrote:

Ruby 1.9 Hash syntax is a blessing, exactly because it forces you to tell internal identifiers from external data.

Couldn't we use symbols as keys in ruby original hash syntax?
Couldn't we use strings as keys as well?
Couldn't we use integers as keys?
What was wrong then?
Notation? Syntax?

How can we use strings as keys with the cool notation?
How can we use integers as keys with that new cool blessing notation?
What is then the purpose of that thing then?
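For readers following along, this is how the two notations behave today, as a small sketch of current behaviour. The quoted-label form needs Ruby 2.2 or later, and the variable names are only illustrative:

```ruby
symbol_keys  = { a: 1 }        # the shorthand always produces Symbol keys
quoted_keys  = { "a b": 1 }    # quoted shorthand (Ruby 2.2+) still yields a Symbol
string_keys  = { "a" => 1 }    # String keys need the hash-rocket form
integer_keys = { 1 => :one }   # and so do Integer keys

symbol_key_class  = symbol_keys.keys.first.class
quoted_key_class  = quoted_keys.keys.first.class
string_key_class  = string_keys.keys.first.class
integer_key_class = integer_keys.keys.first.class
```

So the shorthand is purely a Symbol-key notation; every other key type still goes through `=>`.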

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

zverok (Victor Shepelev) wrote:

Two or three users are talking about "everybody hates", "people are confused", "WE need to fix this", and everybody else is either indifferent or "nah, nothing is broken here". Don't you think it may be a sign of misjudging the "brokenness" on your side?

I speak for myself and my experience, not the others.
In this discussion if I was somebody else I would have been gone for a long long time.

I continue to say:

I still think ruby 3 is an awesome opportunity we will have to resolve such matter.
And I'm still committed to help on that.

Many Thanks

Updated by rosenfeld (Rodrigo Rosenfeld Rosas) almost 2 years ago

Two or three users are talking about "everybody hates", "people are confused", "WE need to fix this", and everybody else is either indifferent or "nah, nothing is broken here". Don't you think it may be a sign of misjudging the "brokenness" on your side?

You should be able to find a few more here:

https://blog.arkency.com/could-we-drop-symbols-from-ruby/

You know, not every regular Ruby user follows ruby-core list and not even all of them follow articles mentioned in the Ruby Weekly ;)

So, I'm not surprised that this is not considered a problem in the ruby-core list audience, but this shouldn't be considered a good sample for representing most Ruby programmers if you ask me.

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

You know, not every regular Ruby user follows ruby-core list

Isn't that obvious for everyone here? I thought it would be.

Let me tell you something then:

Real world problems are what matter the most when we design a language and that is why ruby is so awesome.
Because ruby was designed from the bottom up with that in mind.
In the real world people use the language features, people don't think about how to design the language in a better way.
They take the language features for granted and they create patterns to take advantage of those same features.
Ruby community is also awesome on that.
That is why Rails is an icon of technology followed by all other languages.
Cucumber, Rspec.
Even Github for me exists as it is today because of ruby. (but that is my impression, nothing else)
Ruby allows all that wonderful creativity to be put in real products as an artwork. Clear masterpieces.

Ruby core is a totally different world.
And in ruby core we should continue with the philosophy that took ruby up to here.
Ruby 3 will be out there.
It will be different but should be different for the better.
And in order to do that in ruby core the real world needs to be present in every decision.

In the real world symbols are being used for everything.
And that is the situation.
So they are not being used as they were meant to be.
And that is why I think ruby should fix that in ruby 3 in some way or another.

Updated by spatulasnout (B Kelly) almost 2 years ago

rr.rosas@gmail.com wrote:

You often say that symbols already exist in other languages and have been
borrowed from them. Since I don't know those languages, let me ask you:
do those languages support the equivalent of Ruby's Hash and allow both
strings and symbols as keys? If so, haven't their users ever complained
about that?

Smalltalk's hash (dictionary) does indeed allow both symbols and strings
as keys (and the language differentiates between them as Ruby does.)

Though, the existence of manual pages like the following ("Two flavors
of equality") suggests the distinction was anticipated to be a point of
confusion for some users:

https://www.gnu.org/software/smalltalk/manual/html_node/Two-flavors-of-equality.html

Regards,

Bill

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

distinction was anticipated to be a point of
confusion for some users

That is very important to note, plus in Smalltalk there is no dictionary syntax that forces the user to use Symbols like ruby 1.9 hash syntax.

The Smalltalk manual page specifically states that the use of Symbols should be the exception and that “we generally use strings”.

Can we come up with a new syntax for symbols and let them be the exception again?

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

Looking at Smalltalk deeper we have: Smalltalk Symbol class

Symbol is a subclass of String. (If this was the reality in ruby I believe most of our problems would be resolved. Can we add it as possible solution?)
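For comparison, a short sketch of where Ruby stands today: Symbol is not a String subclass, although it does borrow a handful of String-like methods (the variable names are only illustrative):

```ruby
is_subclass    = Symbol.ancestors.include?(String) # Symbol does not inherit from String
is_a_string    = :foo.is_a?(String)

# A few String-like methods exist on Symbol and return Symbols or Integers...
upcased        = :foo.upcase
symbol_length  = :foo.length

# ...but most String methods are missing and need an explicit #to_s first.
has_gsub       = :foo.respond_to?(:gsub)
via_string     = :foo.to_s.gsub("o", "0")
```

`is_subclass`, `is_a_string` and `has_gsub` are all false here, `upcased` is :FOO, `symbol_length` is 3, and `via_string` is "f00".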

Also there is nothing stating that symbols are for identifiers. What we see is:


“In general, you can use strings for almost all your tasks. If you ever get into a performance-critical function which looks up strings, you can switch to Symbol. It takes longer to create a Symbol, and the memory for a Symbol is never freed (since the class has to keep tabs on it indefinitely to guarantee it continues to return the same object). You can use it, but use it with care.”

All of this makes more sense to me. It reminds me of the old days of ruby 1.8. I guess this is what Koichi was referring to when he mentioned going back in time to before ruby 1.9.

Updated by jeremyevans0 (Jeremy Evans) almost 2 years ago

dsferreira (Daniel Ferreira) wrote:

Can we come up with a new syntax for symbols and let them be the exception again?

If you have a separate proposal, please submit a separate feature request. However, keep in mind that "you need a clear benefit, big enough to compensate potential compatibility breakage" is a very high bar. All of your previous explanations regarding the benefits of unifying symbols and strings are not enough. Only open a new feature request if you can demonstrate clear and compelling benefits that you have not discussed before and that outweigh all of the problems that unifying strings and symbols will cause.

Updated by matz (Yukihiro Matsumoto) almost 2 years ago

In the early stage of Ruby1.9 development, I tried to unify strings and symbols, first by making Symbol class compatible with String class. But so many existing Ruby programs failed to run. So I gave up. If we would unify strings and symbols again in Ruby3, even more existing Ruby programs would fail, and I'm afraid we would see community division as Python2 and 3 did for a long time. It's like a dark age. From some point of view, separation of Symbols and Strings is not ideal, I admit, but I don't believe it's worth crashing so many software.

Even if you believe we won't see such breakage (by the proposal like this one), it's hard for me to believe it from the past experience. I need a proof that the breakage is trivial or the benefit outperforms the disadvantage.

Matz.

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

Thank you very much Matz.

I don’t mind giving my time to that task. For a long time I’ve wanted to contribute to ruby core but my professional contracts didn’t allow it.
Today I believe the situation is about to change.
I will ask my company for the green light and will work hard on this endeavour once I get it.

Something tells me that it will be possible. That ruby 3 is the right moment to make this happen and I trust a lot in my intuition. I agree with all of you that we should not allow ruby to break backwards compatibility without a clear transition path and try our very best to keep breakage at its bare minimum if we really must have it.

I believe my passion for ruby is clear to everyone by now, as is how much I think this situation is a big minus for the language. It really impacts me on a daily basis, since ruby has been a very important part of my life for many years and will continue to be for many years to come.

I am what I am today because of ruby. I can say it. If it weren’t for ruby I wouldn’t be what I am today in professional terms, so this commitment is my payback to all of you who have worked very hard to provide me the tools that allowed me to succeed.

I will need some cooperation and support from experienced ruby core members to act as mentors on this task, since it will be a cumbersome one, but we do have time on our side since ruby 3 is not going to be out so soon.

And in the end, if we reach the conclusion that it is not worth it, then I will be someone with a very deep knowledge of the cruby codebase and for me that is invaluable.

How does it sound to you?

Updated by rosenfeld (Rodrigo Rosenfeld Rosas) almost 2 years ago

Thanks for the explanation on SmallTalk symbols, Bill and Daniel.

It seems that symbols are seldom used in SmallTalk while they are often used in Ruby applications, which might explain why SmallTalk users might not complain that much. If you are kind enough, would you mind answering another question on SmallTalk?

Is it possible to generate a symbol from a string in SmallTalk, such as Ruby's 'string'.to_sym?

Conceptually speaking, I see symbols as something like: "MYSYMBOL = Object.new". Of course, they could be made so that they would be cheaper to create and use less memory, and even have a corresponding name, mostly for debugging or reporting purposes. But when you allow strings to be converted to symbols, people may (and will) just abuse symbols and they will be used everywhere. If you create a much easier way to write hashes which favors symbols as keys, then things get even more out of control.
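A small sketch of that "one unique object per name" idea in current Ruby, using only core methods; the name `status` is made up for illustration:

```ruby
# Every occurrence of a symbol with the same name is the same object.
same_symbol = :status.equal?(:status)   # => true

# Two separately built strings are equal in content but are distinct objects.
a = String.new('status')
b = String.new('status')
a == b                                  # => true
a.equal?(b)                             # => false

# Converting a string hands back the one shared Symbol for that name.
converted = a.to_sym
converted.equal?(:status)               # => true
```

So `to_sym` is exactly the door that lets arbitrary strings become symbols.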

In the early days, when I was learning Ruby, the books I read would also stress that symbols should be used when performance was a concern (which I consider bad advice) while talking about the differences between symbols and strings, and would provide some micro-benchmarks, which showed big differences back then, by the way. As people just love micro-benchmarks and feeling performant, a whole lot of people simply used symbols in their libraries and code for "performance" reasons, ignoring that any performance gains achieved by using symbols would be lost if they always had to convert between symbols and strings.

I understand and embrace the philosophy that we should have great reasons for introducing backwards incompatibility and I do think we have them. Currently Ruby's broken from my point of view. The fact that most code bases are fighting bugs caused by h[:a] being different from h['a'] leads to applications crashing much more often than applications would break if we introduced some incompatibilities. I don't really think this could cause big divisions like Python2/3 or Perl5/6.

We have introduced support for frozen strings and it went pretty well. Maybe we could replicate the experiment. I disagree with Jeremy Evans that we should immediately open new issues for every idea we come up with while suggesting ways to improve the situation between symbols and strings. Once one of the ideas seems plausible, it's fine to open a separate issue, but for now, it's just a generic discussion. So, how about using magical comments to change the way symbols are created?

I suggested once that we could add some sort of HashWithIndifferentAccess class (with a shorter name hopefully) and create a special short syntax to declare them in issue #9980. What if we extended that to support some flag and magical comment, like we do for frozen strings? When hashes should be interpreted as with indifferent access by default, then {my: 'hash'} would be interpreted as {my: 'hash'}i, which would be a shorter syntax for HashWithIndifferentAccess.new(my: 'hash').
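To make the idea a bit more tangible, here is a rough plain-Ruby sketch of such a class. `IndifferentHash` and its `from` constructor are hypothetical names for illustration only; this is not ActiveSupport's implementation and not a proposed core API. It simply normalizes Symbol keys to Strings on the way in and out:

```ruby
# Hypothetical sketch: a Hash subclass that treats :key and "key" as the same key.
class IndifferentHash < Hash
  # Build an instance from an ordinary hash, normalizing every key.
  def self.from(hash)
    hash.each_with_object(new) { |(key, value), result| result[key] = value }
  end

  def [](key)
    super(normalize(key))
  end

  def []=(key, value)
    super(normalize(key), value)
  end

  def key?(key)
    super(normalize(key))
  end

  private

  # Symbols are stored as Strings; every other key type is left untouched.
  def normalize(key)
    key.is_a?(Symbol) ? key.to_s : key
  end
end

h = IndifferentHash.from(my: 'hash')
h[:my] == h['my']   # => true
h.keys              # => ["my"]
```

Only `[]`, `[]=` and `key?` are covered here; a real implementation would also need `fetch`, `delete`, `merge`, nested hashes and so on, which is part of why doing this in the language itself is appealing.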

Since the hash issue is the one holding most of the confusion around the differences between symbols and strings, I'm focusing on it in order to try to minimize the impact of introducing incompatible changes and controlling its scope.

So, if we add a magical comment to our files, during the transition phase, we could have code like this working the way I'd like it to be:

# hwia-by-default: true

h = {my: 'hash'}
assert h[:my] == h['my']

Would something like that be considered for a future version of Ruby, as a transition path? We could even warn that the current hash behavior will be changed and that hwia will be the default in the future, so that people would have time to update their code in case they expect h[:a] to be different from h['a']. Ruby could rename the current implementation to something else, and provide, again, a transition path for existing code bases, by allowing them to revert the behavior for those sources by setting "# hwia-by-default: false".

This might cause some confusion in the initial phase, as we still can't know whether the hashes we are getting from external sources are HWIA or legacy hashes, but I think the community could embrace it. If the community doesn't embrace it, we won't see any magical comments such as "# hwia-by-default: true" used in the wild. Otherwise we'll know it's a desired feature. Ruby has introduced refinements as an experimental feature initially, and it ended up merged to core. Maybe we could think about a similar strategy for dealing with hashes.

As for comparing strings and symbols, we might provide another flag/magical comment to interpret :a == 'a' as true. Or we could unify them in a single flag/magical comment, so that if :a == 'a', then it would make sense that hash[:a] should equal hash['a'].

I understand this proposal is not very complete. It's not meant to be. It's supposed to be a starting point for a discussion around this idea. What do you think?

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

rosenfeld (Rodrigo Rosenfeld Rosas) wrote:

Is it possible to generate a symbol from a string in SmallTalk, such as Ruby's 'string'.to_sym?

Symbol is a subclass of String in Smalltalk. See one of my previous comments.

Since the hash issue is the one holding most of the confusion around the differences between symbols and strings, I'm focusing on it in order to try to minimize the impact of introducing incompatible changes and controlling its scope.

The hash issue is a consequence of the ecosystem.
We need to heal the ecosystem first.
Once properly healed issues like the hash one will be easily resolved.

The focus should be on the String vs Symbol relationship.
The transition path with minimal impact on current software.
Then next steps.
One step at a time.

Updated by rosenfeld (Rodrigo Rosenfeld Rosas) almost 2 years ago

Hi Daniel, I've read your previous comment, but I also read this from Bill Kelly:

Smalltalk's hash (dictionary) does indeed allow both symbols and strings
as keys (and the language differentiates between them as Ruby does.)

I've assumed that dict[symbol] would be interpreted as different from dict['symbol']. If that was the case, it doesn't matter whether Symbols would be inherited from String, since they would be considered as different things when looking up in the dictionary. In that case, it could make sense to allow a symbol to be converted to a plain string. Did I interpret it wrongly?

Updated by rosenfeld (Rodrigo Rosenfeld Rosas) almost 2 years ago

Thank you for confirming, Daniel. Indeed it seems the confusion would be about the same except that it seems symbols are seldom used in Smalltalk from what I've read in this thread. But it's good to know that Ruby and Smalltalk symbols vs strings comparison are pretty similar to each other, even if symbols are inherited from strings in Smalltalk while they are not in Ruby. It doesn't make much difference in practice, since many string operations don't make much sense for symbols in my opinion and since symbols wouldn't compare to strings in Smalltalk either. Thank you for your research and explanations :)

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

The fact that in Smalltalk Symbol is a subclass of String makes perfect sense.
It is key in the discussion we are having.

Why did Matz try to unify symbols and strings prior to ruby 1.9?
Because it made sense to do so. The attempt didn’t go well but that outcome didn’t change the situation.
So if it made sense back then to consider symbols and strings equivalent in nature, it continues to make sense today.

Today we have even bigger reasons to consider it because symbols in ruby are having widespread use and the overlap between symbols and strings is growing.

Converting symbols into strings just because we want to metaprogramme some identifiers dynamically (and I do that a lot) and then convert back into symbols because that is what an identifier should be? That is an overlap.
If Symbol class was a subclass of String we wouldn’t have to do so.
And if we didn’t have this convention that identifiers are symbols and text/data are strings we wouldn’t have to worry about it either.
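A small sketch of the kind of round trip I mean. The `Account` class and its fields are made up for illustration; the point is only the Symbol to String to Symbol conversion when building a new identifier:

```ruby
# Hypothetical example class: generates <field>_present? predicates dynamically.
class Account
  [:name, :email].each do |field|
    # Symbol -> String to build the new identifier, then String -> Symbol again.
    predicate = (field.to_s + '_present?').to_sym

    define_method(predicate) do
      !instance_variable_get("@#{field}").nil?
    end
  end

  def initialize(name: nil, email: nil)
    @name = name
    @email = email
  end
end

account = Account.new(name: 'Daniel')
account.name_present?    # => true
account.email_present?   # => false
```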

What this discussion has shown me so far is that people are justifying the current status quo based on made-up conventions, totally neglecting the fundamentals.

And the fundamentals are directly related to an attempt that failed.

That failure took ruby to a totally different route. The point of failure was a bifurcation.

Koichi said that if it was possible to get back to the past...

I want to get back to that bifurcation point. The one that is there in the past. Revisit it. Speak with it and try all I can to help ruby select the other route that is also there forgotten and still waiting for us.

Usually we can’t go back to previous missed opportunities but sometimes we are allowed to do so. Ruby 3 is a one time opportunity. And I don’t want us to miss it.

So please stop saying that everything is as it should be because Matz had his say now and has been having it throughout the times.

I’m trying to come up with the best solution (which I don’t know what it is by the way). I will also try to make attempts.
In the end Matz will decide if my efforts were worthy or not.
Meanwhile all your feedback is more than welcome but if you show up just to rant at me like many did in this discussion understand that I will not consider any of your lines in my exercise.

I will say it again that I expect someone from the core team to work closely with me as a mentor. Until the day that person exists I will work by myself and this is a commitment.

Many thanks,

Daniel

Updated by Hanmac (Hans Mackowiak) almost 2 years ago

we might talk about the different size of RString and RSymbol struct in MRI

or why Strings longer than 23 characters are handled differently from shorter ones

Updated by normalperson (Eric Wong) almost 2 years ago

danieldasilvaferreira@gmail.com wrote:

I will say it again that I expect someone from the core team
to work closely with me as a mentor. Until the day that person
exists I will work by myself and this is a commitment.

doc/extension.rdoc should tell you what you need to know.
I don't know what your level of C experience is; but you should
know unions and integer representations well. That's as far
as I'm willing to help you on this topic.

IMHO, the incompatibility is too big and as matz says;
I don't want Ruby to have the problems Python 2 => 3 has.
Our 1.8 => 1.9 transition was bad enough and already
lost us many users.

If you really have a problem with HashWithIndifferentAccess,
why don't you work with Rails developers to fix/discourage it in
Rails? It may take 10 years or more to change developers'
habits, though.

Fwiw; the success and longevity of Unix, C, Linux and git is
not because they are perfect or strive to be; but they got
"good enough" and would rather continue with their warts than
have breaking changes. This is a stark contrast to the CADT[1]
mentality of desktop software which is willing to completely
throw out stuff and rewrite every few years.

[1] https://www.jwz.org/doc/cadt.html

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

doc/extension.rdoc should tell you what you need to know.

Thanks Eric.

That's as far
as I'm willing to help you on this topic.

Hopefully someone will be able to answer one or two questions. :-)

IMHO, the incompatibility is too big

It will be an exercise.
Ruby will not be worse with it.
Will I lose my time? No.
I will learn and I’m good with that.

Updated by matz (Yukihiro Matsumoto) almost 2 years ago

Daniel, don't try to read my mind. I tried years ago because I wanted to experiment what others liked (yes, we had similar people like you since the old days). And so many of our tests, samples, and utilities failed miserably. During the experiment, I found out the symbol is the fundamental part of the language.

Now I am negative about unifying symbols and strings.

So, to persuade me, you have to answer following questions very clearly, concretely:

  • why do you want that change? (I am sick of "make sense" or "natural" or "consistent")
  • how big is the impact of the change? (compatibility, performance)
  • how big is the benefit of the change?
  • who gets the benefit?
  • who gets the penalty?
  • after considering above questions, is the change reasonable?

Currently, my answer to the last question is "no".

If you are going to make your own experiment, it's good. I am curious about your approach and the result. The result may persuade me in the future.

You might feel I am too conservative. But remember, people blame me, not you, for making any change to the language. I need to be very positive before taking responsibility for any change to the language.

Matz.

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

matz (Yukihiro Matsumoto) wrote:

Daniel, don't try to read my mind. I tried years ago because I wanted to experiment what others liked (yes, we had similar people like you since the old days).

Thanks for the clarification Matz. I wasn’t trying to read your mind. It was just a logical deduction. I get it. It made sense for others.

And so many of our tests, samples, and utilities failed miserably.

Probably, with something like 99.999% certainty, my endeavour will fare much worse, and if that is the case it will be work to be forgotten. Like I said: I’m comfortable with that. I respect very much all of your experience and knowledge. When I say “your” I mean not just you as the leader but all of you as a team. I know very little about the language compared to you. The key here for me is that you didn’t say “it is not going to happen”. You left a “but” behind. An open door that I’m willing to take.

During the experiment, I found out the symbol is the fundamental part of the language.

Once again Matz many thanks for the insight. It is an invaluable knowledge that you are passing to me. I will keep that in mind during the exercise.

If you are going to make your own experiment, it's good. I am curious about your approach and the result. The result may persuade me in the future.

That is all I need to hear. Your curiosity about the work I will do is my biggest motivation right now.

You might feel I am too conservative.

Not even close. I already said: I totally agree with you that it is a bad thing to break backwards compatibility.
My goal is to come up with a proper transition path like requested by Koichi whilst keeping breakage as minimal as possible. And once again I state that I deeply respect all of you.

Thank you very much Matz for this opportunity!

Updated by rosenfeld (Rodrigo Rosenfeld Rosas) almost 2 years ago

Meanwhile all your feedback is more than welcome but if you show up just to rant at me like many did in this discussion understand that I will not consider any of your lines in my exercise.

Daniel, I have no ideas on why you got those impressions. I've been asking for making Symbols the same as Strings for much longer than you and haven't changed my mind yet. I just mentioned that simply making Symbols inherit from Strings doesn't fix anything on its own if :id != 'id' and h[:id] != h['id']. Since this seems to be the case with Smalltalk, I just mentioned that the situation seems to be as broken in Smalltalk as it is in Ruby. I wasn't trying to justify the status quo in any way.

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

rosenfeld (Rodrigo Rosenfeld Rosas) wrote:

Daniel, I have no ideas on why you got those impressions.

Nothing related to you Rodrigo. We are fine and I’m counting on you to provide me your feedback. ;-)

Updated by rosenfeld (Rodrigo Rosenfeld Rosas) almost 2 years ago

Matz, if you don't mind, I'd like to give your questions a try:

So, to persuade me, you have to answer following questions very clearly, concretely:

why do you want that change? (I am sick of "make sense" or "natural" or "consistent")

Serializing values to and from JSON is becoming a very very common operation no matter what languages you are using. We do that all the time. It often leads to confusion in Ruby programs if the original hash being serialized had keys as symbols and after deserialization they are strings, which means one would have to re-convert them to symbols, for example. This sort of confusion happens in several other situations, including some hashes we get from other's code and the only way we can know if keys are expecting to be string or symbol is by either giving it a try or looking into the documentation (if any) or tracking down the code until we know what to expect. This is very costly and could be avoided if h[:a] == h['a']. So, my main motivation is to make it easier for most Rubyists to write code without having to waste too much time thinking about symbols and strings.
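To make the round trip concrete, a minimal sketch using the json library that ships with Ruby (the hash contents are made up for illustration):

```ruby
require 'json'

original = { id: 1, name: 'ruby' }

# Serialize and parse back with default options.
restored = JSON.parse(JSON.generate(original))

original.keys          # => [:id, :name]
restored.keys          # => ["id", "name"]
restored[:id]          # => nil   (the symbol key no longer exists)
restored['id']         # => 1
original == restored   # => false
```

Nothing raises here; the lookup with the "wrong" key type just returns nil, which is why this kind of bug tends to show up far from where the hash was created.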

how big is the impact of the change? (compatibility, performance)

It can be big or small. Since I know you won't consider big incompatibilities, I'm interpreting this question as "how can we reduce the impact of the change?".

If we use the same approach taken by the frozen strings magical comment and apply it to symbols, the impact would be minimal. Making the comparison of strings and symbols work seamlessly based on such magical comments or flags seems complicated and could introduce more incompatibilities, so I'd like to focus on the main issue as I see it: h[:a] != h['a'].

For that, I'd propose a new flag/magical comment such as "# hwia_by_default: true". A new read-only property (set at creation time) would be added to Hash such as :indifferent_access. Example: h = Hash.new({a: 1}, indifferent_access: true). Alternatively we could add a method to Hash to return a copy with this property set: new_hash = hash.with_indifferent_access; new_hash.has_indifferent_access? # true. Those names are just suggestions and they are not important for the time being. We can always discuss better names. For now, let me explain the idea on how that would work.

# hwia_by_default: true

h = {a: 1} # translated to Hash.new({a: 1}, indifferent_access: true)
assert h[:a] == h['a']

It would be even better if we could add a short syntax to create such hashes when we can't add the magical comment:

h = {a: 1}i # equivalent of Hash.new({a: 1}, indifferent_access: true)

So, with regards to compatibility it seems we would be fine since such approach shouldn't break existing code.

Regarding performance, this is not that simple to answer. I can't tell about internal performance since I'm not an implementor, but lots of applications have to convert hash keys between symbols and strings, for example by calling transform_keys (or ActiveSupport's symbolize_keys) when not using the new flag. That cost should be taken into consideration when evaluating the performance impact.
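For reference, this is the kind of conversion I have in mind, sketched with core Hash#transform_keys (available since Ruby 2.5). The sample hashes are made up:

```ruby
string_keyed = { 'a' => 1, 'b' => 2 }

# Each call walks the whole hash and allocates a new one.
symbol_keyed = string_keyed.transform_keys(&:to_sym)
symbol_keyed[:a]       # => 1

back = symbol_keyed.transform_keys(&:to_s)
back == string_keyed   # => true

# transform_keys is shallow: nested hashes keep their original key type.
nested = { 'a' => { 'b' => 2 } }
converted = nested.transform_keys(&:to_sym)
converted[:a].keys     # => ["b"]
```

Every boundary between a "symbol" part and a "string" part of an application pays for a copy like this, and for nested data it has to be done recursively.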

how big is the benefit of the change?

It sounds huge to me. Not having to worry about using symbols or strings as keys looks like a huge win for my happiness when coding less bug-prone applications. And judging by some articles, questions and complaints on StackOverflow, it seems I'm not alone in thinking this way.

who gets the benefit?

Most (if not all) Rubyists in my opinion.

who gets the penalty?

The language implementors, maybe?

after considering above questions, is the change reasonable?

Definitely yes, if you ask me ;)

Updated by matz (Yukihiro Matsumoto) almost 2 years ago

rosenfeld (Rodrigo Rosenfeld Rosas), that sounds like if the de-facto JSON library can automatically convert map keys to symbols, most of the needs are fulfilled. And that must be much easier and simpler than unifying symbols and strings.

Matz.

Updated by jeremyevans0 (Jeremy Evans) almost 2 years ago

matz (Yukihiro Matsumoto) wrote:

rosenfeld (Rodrigo Rosenfeld Rosas), that sounds like if the de-facto JSON library can automatically convert map keys to symbols, most of the needs are fulfilled. And that must be much easier and simpler than unifying symbols and strings.

Matz.

This is already supported by the json library that ships with ruby:

JSON.parse('{"a":{"b":2}}')
# => {"a"=>{"b"=>2}}
JSON.parse('{"a":{"b":2}}', :symbolize_names=>true)
#=> {:a=>{:b=>2}}

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

matz (Yukihiro Matsumoto) wrote:

rosenfeld (Rodrigo Rosenfeld Rosas), that sounds like if the de-facto JSON library can automatically convert map keys to symbols, most of the needs are fulfilled. And that must be much easier and simpler than unifying symbols and strings.

Yes Matz. That kind of approach to the hash keys problem makes perfect sense to me.

Updated by rosenfeld (Rodrigo Rosenfeld Rosas) almost 2 years ago

Thanks, Matz, it certainly helps, but there are plenty of cases where we are not in the control of how hashes are serialized/deserialized. For example, when using Redis, the interface accepts a hash and it will serialize behind the scenes using strings as keys but you have no choice upon deserialization.

But even if you had, that wouldn't completely fix the issue, because part of the application might use strings as keys while another part of the application might use symbols. Unless this information was stored by Redis in the database, Redis couldn't know whether symbols or strings should be used when deserializing a particular stored value.

This sort of thing happens all the time and is not specific to just JSON or even just JSON and Redis. Other examples could include PostgreSQL json columns, ElasticSolr interfaces and so forth. Pretending there's an easy solution to the serialization problem that only exists due to the existence of symbols doesn't help to improve the situation. There's also the problem that forces us, Ruby developers, to always look at the documentation or other's code, just to figure out whether symbols or strings are supposed to be used as hash keys.

That's why I'd like to see at least some sort of out-of-the-box HWIA-like solution. That, when implemented as I suggested, should allow a great improvement while not being backward incompatible.

Updated by Hanmac (Hans Mackowiak) almost 2 years ago

rosenfeld (Rodrigo Rosenfeld Rosas) wrote:

But even if you had, that wouldn't completely fix the issue, because part of the application might use strings as keys while another part of the application might use symbols.

that itself is the problem there

it means that the developer doesn't think about his own application enough to know if something would be a String or Symbol, or uses user input without checking which is even worse

There's also the problem that forces us, Ruby developers, to always look at the documentation or other's code, just to figure out whether symbols or strings are supposed to be used as hash keys.

you should always check the documentation of the code you use to check all the corner cases where something would throw an Exception

Updated by rosenfeld (Rodrigo Rosenfeld Rosas) almost 2 years ago

it means that the developer doesn't think about his own application enough to know if something would be a String or Symbol, or uses user input without checking which is even worse

It has nothing to do with the developer but with the library author. They can't know whether the developer would want a symbol or a string as the hash key when deserializing back.

And this has nothing to do with user input, since the user doesn't talk directly to Redis.

you should always check the documentation of the code you use to check all the corner cases where something would throw an Exception

Most of the time we spend coding is reading other's code. Most of the time other's code exist directly in the application and there's no documentation for them at all. You know, from the method name, that it's supposed to return a hash with identifier as keys, but you always have to check just to make sure those identifiers are strings or symbols, or maybe the hash is a HWIA. If Ruby was a static language, we would know immediately because the return value could be declared as Hash<String, Object> for example. But Ruby isn't, and it would be super helpful if whenever there's a hash with identifiers as keys we would know for sure that it must be strings, for example. Or, at least, that it doesn't matter whether we lookup by string or symbol.

There are lots of aging code bases around there and teams with high turnover rates. You can't just assume that all code in the code base was written by a single developer or a consistent team.

Updated by Hanmac (Hans Mackowiak) almost 2 years ago

rosenfeld (Rodrigo Rosenfeld Rosas) wrote:

it means that the developer doesn't think about his own application enough to know if something would be a String or Symbol, or uses user input without checking which is even worse

It has nothing to do with the developer but with the library author. They can't know whether the developer would want a symbol or a string as the hash key when deserializing back.

And this has nothing to do with user input, since the user doesn't talk directly to Redis.

it does: if the application developer doesn't care what he input and output into a third party library he used, then it's his problem
you can't know what the third party does with it, thats what documentation of that third party library is for

for example if i wrote something that returns deserialized stuff, i would add a parameter to control what kind should be returned
like what if your application wants the third party library to return deserialized stuff as objects like Struct?
yaml for example supports some magic comment for ruby struct, while json does not.
what if your application or your users want json to return objects and not just Hash?
does your thing support that too?

you should always check the documentation of the code you use to check all the corner cases where something would throw an Exception

Most of the time we spend coding is reading other's code. Most of the time other's code exist directly in the application and there's no documentation for them at all.
You know, from the method name, that it's supposed to return a hash with identifier as keys, but you always have to check just to make sure those identifiers are strings or symbols, or maybe the hash is a HWIA.

that's what rdoc and yard are for (and by extension ri): you not only have the method name, but also the parameters, what type they should be, an example of how to call the function, and what it should return

for example the Module#constants or const_get or const_set functions. it's explicitly stated that you can use either a symbol or a string as the name for const_get and const_set, but constants only returns symbols.
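a quick sketch of that behaviour, with a made-up `Palette` module:

```ruby
# Hypothetical module used only for illustration.
module Palette
  RED = '#f00'
end

# const_get accepts either a Symbol or a String as the name.
Palette.const_get(:RED)              # => "#f00"
Palette.const_get('RED')             # => "#f00"

# const_set accepts both as well.
Palette.const_set('BLUE', '#00f')

# constants always reports Symbols, however the constant was defined.
Palette.constants.include?(:BLUE)    # => true
Palette.constants.include?('BLUE')   # => false
```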

or were you one of those who had a problem with that change after it was made in ruby 1.9?

Updated by rosenfeld (Rodrigo Rosenfeld Rosas) almost 2 years ago

I give up on this discussion since I'm very low on free time these days and it's clear to me that I'm talking to the wind here, since most people discussing here seems to believe that "h[:a] != h['a']" is a non issue and that it's always the Ruby developer's fault. I've already stated all arguments I could think about, so I'm not willing to repeat them over and over. I'm convinced that if this is going to change at some point it won't be through my arguments, so I give up on this discussion until someone brings a new perspective into it.

Updated by Hanmac (Hans Mackowiak) almost 2 years ago

rosenfeld (Rodrigo Rosenfeld Rosas) wrote:

I give up on this discussion since I'm very low on free time these days and it's clear to me that I'm talking to the wind here, since most people discussing here seems to believe that "h[:a] != h['a']" is a non issue and that it's always the Ruby developer's fault. I've already stated all arguments I could think about, so I'm not willing to repeat them over and over. I'm convinced that if this is going to change at some point it won't be through my arguments, so I give up on this discussion until someone brings a new perspective into it.

why are you guys not have a problem with that h[0] != h[0.0] ?
compared to :a and "a", 0 and 0.0 are == true but hash does treat them differently?
why isn't there a Hash class extra for this?

HWIA wouldn't handle that case too, so we need an extra HashWithExtraAccessForNumbers?
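to spell out what i mean, a short sketch (Hash lookup uses #hash and #eql?, not ==):

```ruby
h = { 0 => 'int key' }

0 == 0.0       # => true   (numeric comparison converts)
0.eql?(0.0)    # => false  (eql? also requires the same class)
h[0]           # => "int key"
h[0.0]         # => nil    (hash lookup follows eql?, so the Float misses)

# for comparison, Symbol and String are not even == to begin with
:a == 'a'      # => false
```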

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

rosenfeld (Rodrigo Rosenfeld Rosas) wrote:

I give up on this discussion until someone brings a new perspective into it.

Rodrigo, I'm working on a concrete proposal.
Matz's statement that "symbol is the fundamental part of the language" makes all this discussion very tricky.
The solution needs to be really clever in order to be accepted.
Strings vs Symbols is all over the place and ruby 3 will only be out at least almost 3 years from now if I'm not mistaken. (Christmas 2020?)
The solution should work as a whole.
I have more things to present to the community but they need some more work.

Stay tuned. ;-)

Updated by rosenfeld (Rodrigo Rosenfeld Rosas) almost 2 years ago

why are you guys not have a problem with that h[0] != h[0.0] ?

Aren't you exaggerating? When did you ever need to use a non integer number as a hash key? I never did and I never saw a single code base using non integer numeric values as hash keys in all those years programming in Ruby or whatever other language. That argument makes zero sense.

Updated by Hanmac (Hans Mackowiak) almost 2 years ago

rosenfeld (Rodrigo Rosenfeld Rosas) wrote:

Why don't you guys have a problem with h[0] != h[0.0]?

Aren't you exaggerating? When did you ever need to use a non integer number as a hash key? I never did and I never saw a single code base using non integer numeric values as hash keys in all those years programming in Ruby or whatever other language. That argument makes zero sense.

What if you have hash data for a color gradient?
Then you have { 0 => 'start color', 0.5 => 'step color', 1.0 => 'end color' }.
If you then looked it up with 1 instead of 1.0, it wouldn't be the same key and your program would fail.
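The gradient case can be sketched as follows. The normalisation step is an assumed workaround, not something proposed in this thread, and it relies on Hash#transform_keys (Ruby 2.5+).

```ruby
gradient = { 0 => "start color", 0.5 => "step color", 1.0 => "end color" }

exact = gradient[1.0]   # "end color"
miss  = gradient[1]     # nil: Integer 1 is not the Float key 1.0

# Assumed workaround: convert every key, and every lookup value, to Float.
normalized = gradient.transform_keys(&:to_f)
found = normalized[1.to_f]   # "end color"
start = normalized[0.to_f]   # "start color"
```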

That argument makes zero sense.

isn't it cool how much "your" argument makes sense? saying that some argument makes no sense is the perfect way to silence other opinions
would be sad if someone would do that to you, ne?

Updated by rosenfeld (Rodrigo Rosenfeld Rosas) almost 2 years ago

It's not about what I think makes sense. I can provide you tons of code where people are using HWIA because it's a real problem Rubyists face every day somewhere around the globe. It's not hard to find many examples of strings and symbols being used as hash keys. It would be hard, though, to find a single code base that doesn't contain a hash using either symbols or strings as keys.

Now, can you show me many examples of applications using numerical non-integer values as hash keys? If you can't that means it's not a common issue, just one you're making up to try to get an argument since you lack one.

Updated by Hanmac (Hans Mackowiak) almost 2 years ago

Some older code of mine where I needed to handle both user input and input from XML or JSON:
https://github.com/Hanmac/tmx-ruby/blob/master/lib/tiled_tmx/map.rb#L29-L34
That was before Ruby got keyword arguments.
I know how to handle both.

rosenfeld (Rodrigo Rosenfeld Rosas) wrote:

Now, can you show me many examples of applications using numerical non-integer values as hash keys? If you can't that means it's not a common issue, just one you're making up to try to get an argument since you lack one.

I ALREADY did! Data for a color gradient (see above).
You're just pretending that my example doesn't exist to prove your point.

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

Hans, now I completely missed your point.

What does this prove?

@height = (node[:height] || node["height"]).to_i
@width = (node[:width] || node["width"]).to_i
@tileheight = (node[:tileheight] || node["tileheight"]).to_i
@tilewidth = (node[:tilewidth] || node["tilewidth"]).to_i

@orientation = (node[:orientation] || node["orientation"] || :orthogonal).to_sym

Updated by Hanmac (Hans Mackowiak) almost 2 years ago

dsferreira (Daniel Ferreira) wrote:

Hans, now I completely missed your point.

What does this prove?

@height = (node[:height] || node["height"]).to_i
@width = (node[:width] || node["width"]).to_i
@tileheight = (node[:tileheight] || node["tileheight"]).to_i
@tilewidth = (node[:tilewidth] || node["tilewidth"]).to_i

@orientation = (node[:orientation] || node["orientation"] || :orthogonal).to_sym

It's an old (pre-keyword-arguments) example of how to make it work for all three: user input (symbols or strings), hash data from JSON, or nodes from XML.
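The dual lookup in the quoted code can be factored into a small helper. This is only a sketch: fetch_either is a hypothetical name, and node is assumed to be a plain Hash here rather than an XML node.

```ruby
# Hypothetical helper: try the Symbol form of the key first, then the String form.
# Like the original `a || b` pattern, a nil or false value falls through to the second lookup.
def fetch_either(node, key)
  node[key.to_sym] || node[key.to_s]
end

from_user = { height: "10" }       # symbol keys, e.g. written by hand
from_json = { "height" => "20" }   # string keys, e.g. parsed JSON

user_height = fetch_either(from_user, :height).to_i
json_height = fetch_either(from_json, :height).to_i
missing     = fetch_either(from_json, :width)
```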

I made another example that might be interesting: it shows how Ruby handles keyword and non-keyword arguments, and how to combine them so that input with either string or symbol keys works.

def meth(opts={}, key: opts['key'], **other)
  p key,opts,other
end

meth("key"=>"abc", :o => "val")

#"abc"
#{"key"=>"abc"}
#{:o=>"val"}

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

Hanmac (Hans Mackowiak) wrote:

an old (pre-keyword-arguments) example of how to make it work

I don't know what to think about that.
Isn't the whole point of the discussion to find a solution that would allow us to handle those situations without doing that extra work?

I made another example

Are you aware of the new keyword feature that will break that code?

https://bugs.ruby-lang.org/issues/14183

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

Wouldn't this be beautiful:

%w/height width tileheight tilewidth/.each do |dimension|
  instance_variable_set("@#{dimension}".to_sym, node[dimension])
end

Now imagine this:

%w/height width tileheight tilewidth/.each do |dimension|
  instance_variable_set(dimension, node[dimension])
end

Note: I don't understand why we need to prefix the dimension with @ when the method is instance_variable...
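For reference, a runnable sketch of the first loop and of what happens today when the @ is left out. MapSketch is a hypothetical stand-in class, and node is assumed to be a Hash with string keys; the .to_i conversion is carried over from the earlier quoted code.

```ruby
# Hypothetical stand-in for the map class discussed above.
class MapSketch
  def initialize(node)
    %w[height width tileheight tilewidth].each do |dimension|
      # The "@" prefix is part of the instance variable name.
      instance_variable_set("@#{dimension}", node[dimension].to_i)
    end
  end
end

map = MapSketch.new("height" => "10", "width" => "20",
                    "tileheight" => "32", "tilewidth" => "32")
height = map.instance_variable_get(:@height)

# Without the "@" prefix the name is rejected with a NameError.
error = begin
  map.instance_variable_set("height", 1)
  nil
rescue NameError => e
  e
end
```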

Updated by nobu (Nobuyoshi Nakada) almost 2 years ago

dsferreira (Daniel Ferreira) wrote:

Are you aware of the new keyword feature that will break that code?

https://bugs.ruby-lang.org/issues/14183

Why and how do you think it will break that code?

dsferreira (Daniel Ferreira) wrote:

Note: I don't understand why we need to prefix the dimension with @ when the method is instance_variable...

Instance variables must start with @ in ruby.
It's a language design.

Updated by rosenfeld (Rodrigo Rosenfeld Rosas) almost 2 years ago

Instance variables must start with @ in ruby.

I believe that was exactly Daniel's point. Since it must start with @, why do we have to include the @ when setting the instance variable? Why not just the variable name? I also never understood that design, but this is off-topic for this issue ;)

Updated by rosenfeld (Rodrigo Rosenfeld Rosas) almost 2 years ago

I ALREADY did! Data for a color gradient (see above). You're just pretending that my example doesn't exist to prove your point.

I was talking specifically about dealing with common cases, not exceptions. I told you that you'll find hashes using either strings or symbols as keys in basically every code base. That makes it a common case. Then I asked you if you could come up with tons of examples of code out there (meaning not your own code) where people are using non-integer numerical values as hash keys. You failed to show us that this is a common case. We're not discussing the exceptions in this issue, but the common cases.

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

nobu (Nobuyoshi Nakada) wrote:

Why and how do you think it will break that code?

Sorry, Nobu. Now I can see it maps to the first example of the description. I misread it. So it will not break Hans's example.
I'm not sure what will break, then.
I have a comment in the issue asking for some clarification about it.

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

nobu (Nobuyoshi Nakada) wrote:

Instance variables must start with @ in ruby.
It's a language design.

What I meant was what Rodrigo stated. (Thanks Rodrigo).
But that was just a side note. Maybe I'll open an issue to discuss that some day.

I believe it would be great to have:

instance_variable_set(:foo, 1)
instance_variable_get(:foo) # => 1
@foo # => 1

Updated by nobu (Nobuyoshi Nakada) almost 2 years ago

@ is a part of instance variable names.
Do you want to access constants without the first upper case letter by const_get?

class X
  Constant = 1
end
X.const_get(:onstant) #=> 1
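The call above is a rhetorical analogy. As a sketch of what current Ruby actually does with it, const_get rejects a name that does not start with an upper-case letter:

```ruby
class X
  Constant = 1
end

full_name = X.const_get(:Constant)   # 1

# Dropping the leading upper-case letter is not a valid constant name.
error = begin
  X.const_get(:onstant)
rescue NameError => e
  e
end
```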

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

nobu (Nobuyoshi Nakada) wrote:

Do you want to access constants without the first upper case letter by const_get?

Would you like me to open an issue to discuss it? This String vs Symbol discussion is already too problematic in nature.
My bad for adding the side note.

Updated by jeremyevans0 (Jeremy Evans) almost 2 years ago

dsferreira (Daniel Ferreira) wrote:

I believe it would be great to have:

instance_variable_set(:foo, 1)
instance_variable_get(:foo) # => 1
@foo # => 1

This doesn't work by design. You can actually store instance variables that don't start with @ using the C-API (rb_ivar_set); these are hidden from Ruby land. You will occasionally see this used in C extensions, and it is also used in core Ruby (transcode.c, vm.c, enum.c).

In any case, this is unrelated to this feature request. This feature request has already been rejected, so there is little reason to continue discussion. If an alternative proposal is put forth as a new feature request, it can be discussed there.

Updated by dsferreira (Daniel Ferreira) almost 2 years ago

jeremyevans0 (Jeremy Evans) wrote:

You can actually store instance variables that don't start with @ using the C-API

Thanks for the clarification. That makes perfect sense.
I suspected there would be some good explanation for the feature.

Updated by rosenfeld (Rodrigo Rosenfeld Rosas) almost 2 years ago

It still doesn't make sense to me, since it's only possible to add instance variables starting with @ when using the publicly exposed API, so it seems completely unnecessary to require the @ when using the public API, regardless of how the internal implementation can be used. But I don't really care about it, especially as I rarely feel the need for metaprogramming in the Ruby code I usually write, so I'm not really bothered by this :) I just never thought it made sense, even though it never bothered me either ;)
