Feature #715

Ruby interpreter should understand UTF-8 symbols with special meaning

Added by Fjan (Jan Maurits Faber) almost 12 years ago. Updated over 4 years ago.

Status:
Rejected
Priority:
Normal
Assignee:
-
Target version:
-
[ruby-core:19714]

Description

I would like the ruby interpreter to understand symbols such as the greater-than-or-equal sign, as an alias for '>='.

This is not simply because it would look pretty: it would reduce the cognitive load on the programmer. At the moment many ASCII characters are overloaded to mean different things in different contexts, especially characters like $, :, > and =. If the relevant symbols were used, the programmer's brain would be free to do more useful things.

For example, something like:

a>=b ? {:a=>!b} : nil

Could be displayed as:

a≥b ? :a → ¬b : ∅

(in case the UTF-8 characters don't come across: I just replaced several characters with mathematical symbols)

If the Ruby interpreter supported this, text editors could be improved to insert the appropriate symbol automatically.

I don't know of any language that can do this yet, so it would be a unique selling point for Ruby, but it would seem rather easy to implement.
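For comparison, here is a minimal sketch of how far one can get today without interpreter changes, assuming source files are read as UTF-8. The module name `UnicodeOperators` and the `Score` class are made up for illustration and are not part of Ruby. It only gives method-call aliases; infix use such as `a ≥ b` would still need the parser support this request asks for.

```ruby
# Hypothetical helper (not part of Ruby): method aliases named after
# mathematical symbols, each forwarding to the existing ASCII operator.
module UnicodeOperators
  ALIASES = { :"≥" => :>=, :"≤" => :<=, :"≠" => :!= }.freeze

  ALIASES.each do |symbol, ascii|
    define_method(symbol) { |other| public_send(ascii, other) }
  end
end

# Illustrative value class; any Comparable object would do.
class Score
  include Comparable
  include UnicodeOperators

  attr_reader :points

  def initialize(points)
    @points = points
  end

  def <=>(other)
    points <=> other.points
  end
end

a = Score.new(3)
b = Score.new(2)
puts a.public_send(:"≥", b)
puts a.public_send(:"≤", b)
```

The ASCII operators keep working unchanged, which matches the "alias, not replacement" intent of the request.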

#1

Updated by zenspider (Ryan Davis) almost 12 years ago

On Nov 6, 2008, at 08:06 , Jan Maurits Faber wrote:

(in case the UTF-8 characters don't come across: I just replaced
several characters with mathematical symbols)

Doesn't this sum up at least one of the problems with the proposal?

Another is "cognitive load": all you're doing is shifting the load
over to the developer, who has to figure out how to type such symbols.
I certainly don't know how to type much of any math/logical characters
in unicode with ease. You do mention improving text editors to insert
the symbols for us, but that seems like you just cancelled out the
benefit at that stage. What do you get at that point, besides arguably
prettier code?

#2

Updated by matz (Yukihiro Matsumoto) almost 12 years ago

  • Status changed from Open to Rejected

It's worse than you thought. Trust me.
We had a similar temptation to use Japanese characters in our programs.
And if we had, I bet you wouldn't be using Ruby now.

Updated by Fjan (Jan Maurits Faber) over 4 years ago

  • Description updated (diff)

Just in case someone stumbles upon this old rejected feature request: there is a clever way of achieving most of this by using a font that combines successive ASCII characters into ligatures:
https://github.com/tonsky/FiraCode

Perhaps now that Ruby is firmly in the UTF-8 world we can reconsider this feature though?

Updated by shyouhei (Shyouhei Urabe) over 4 years ago

No. Have you heard of the APL programming language? If not, please just google it. I don't want that madness imported into Ruby.

Updated by matz (Yukihiro Matsumoto) over 4 years ago

There was a language named Fortress that did something similar, but it has disappeared into history.

Matz.

Updated by Fjan (Jan Maurits Faber) over 4 years ago

Both APL and Fortress predate decent UTF-8 support by a long time. I'm not sure that the fact that they failed in the past is still a convincing reason to reject out of hand the human-factors advantages this could have. Do we want to limit Ruby to ASCII for the rest of eternity?

We already see editors like RubyMine implement support for things like FiraCode and Hasklig, so it's clear there is demand for this from the programming community.

Updated by nobu (Nobuyoshi Nakada) over 4 years ago

  • Description updated (diff)

Updated by darix (Marcus Rückert) over 4 years ago

Jan Maurits Faber wrote:

This is not simply because it would look pretty: it would reduce the cognitive load on the programmer. At the moment many ASCII characters are overloaded to mean different things in different contexts, especially characters like $, :, > and =. If the relevant symbols were used, the programmer's brain would be free to do more useful things.

Counterpoint: the advantage of all those ASCII characters is that I can type them easily because they are on my keyboard. Memorizing the code points of those special Unicode characters so I can enter them doesn't really help me, and I don't want to program with a character-selection app open all the time. Now you could argue that editors could replace those ASCII character combinations with the Unicode characters automatically, similar to what FiraCode and the like do. But then ... why? I still typed the same stuff as before. And if it is just a display issue, use FiraCode and friends.

Updated by Fjan (Jan Maurits Faber) over 4 years ago

It's really easy to remember: to type ≥ is just alt + >, to type ≤ is alt + < and ≠ is alt + =, so I don't think you would need a character picker for very long. But more importantly using the new symbols would be entirely optional. If you don't like them then the ASCII way would of course keep working forever.

FiraCode is an interesting hack, but it is very limited: it cannot tell the difference between "==" inside a string and == inside code, for example. Also, mathematical symbols like pi (π, alt + p, easy right?) or logical not (¬, alt + l) do not have a straightforward ASCII equivalent.
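To illustrate the string-versus-code distinction, here is a rough sketch of a display-only filter that works on tokens rather than raw characters, using the `ripper` lexer from Ruby's standard library. The `prettify` name and the `DISPLAY_SYMBOLS` mapping are assumptions made up for this example, and the output is meant for display, not for feeding back to the interpreter.

```ruby
require "ripper"

# Hypothetical display mapping; which symbol stands for which operator
# is an assumption for this sketch.
DISPLAY_SYMBOLS = { ">=" => "≥", "<=" => "≤", "!=" => "≠" }.freeze

# Returns a display-only rendering of the source: operator tokens are
# swapped for symbols, while string contents and everything else are
# passed through untouched.
def prettify(source)
  Ripper.lex(source).map do |_position, type, token|
    type == :on_op ? DISPLAY_SYMBOLS.fetch(token, token) : token
  end.join
end

code = %q(x = 3; puts "3 >= 2" if x >= 2 && x != 5)
puts prettify(code)
# prints: x = 3; puts "3 >= 2" if x ≥ 2 && x ≠ 5
```

Because the decision is made per token, the `>=` inside the string literal is left alone, which a ligature font cannot do.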

Updated by phluid61 (Matthew Kerwin) over 4 years ago

Jan Maurits Faber wrote:

It's really easy to remember: to type ≥ is just alt + >, to type ≤ is alt + < and ≠ is alt + =, so I don't think you would need a character picker for very long. But more importantly using the new symbols would be entirely optional. If you don't like them then the ASCII way would of course keep working forever.

I just opened up Notepad++ (my editor of choice at the moment on Windows) and tried alt + =. Didn't work. Doesn't work here in Chrome either. In fact, I can't type ≠ using my keyboard at all, even with the magic Alt+numpad combinations. This is a pretty typical US-101 keyboard, in one of the most used operating systems in the world.

Considering there are users for whom curly braces {} are enough of an issue that they prefer to avoid block forms at all costs (trying to cram as much as possible into to_proc'd &foo forms; see previous discussions on the Ruby Talk mailing list about that, and look at e.g. the Italian keyboard layout), I think adding even more niche characters to the core repertoire is a bad idea.
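For readers unfamiliar with the two forms mentioned above, a small illustration (the `names` array is just sample data):

```ruby
names = %w[alice bob carol]

# Block form, which needs the curly braces:
with_block = names.map { |name| name.upcase }

# Symbol#to_proc form, which avoids them:
with_to_proc = names.map(&:upcase)

puts with_block == with_to_proc
```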

Also a≥b ? :a → ¬b : ∅ is really hard for me to understand. I'm not a mathematician. Please don't write that in any code I may have to one day maintain.

Updated by Fjan (Jan Maurits Faber) over 4 years ago

OK, I agree that it's probably a good idea to introduce only well-known symbols like ≥ at first, to avoid confusion. But I don't think the argument that it's hard to type on your particular computer is valid, partly because your editor could autocomplete it for you, but mainly because you wouldn't have to use it.

Maybe 1% of Ruby programmers use refinements, but that was not a reason to leave the feature out. By contrast, this feature is popular enough that editors like RubyMine have started adding hacks to work around its absence, so it would be nice to have for people who want it.

Updated by shyouhei (Shyouhei Urabe) over 4 years ago

I happen to be Japanese and understand the Japanese language. Throughout the history of programming language construction here, many Japanese developers have tried to create Japanese-aware programming languages and failed, without exception. It has been possible for us to type the "≥" character natively for a loooong time. Yet that didn't help those failed attempts at all. Everyone could easily understand what a program said, but no one wanted to write anything in those languages. It was just too annoying to use an IME to program something. I bet other people have had similar failures with other natural languages too. Like Matz said in comment #2, it is worse than you think. People just avoid using such things.

I can point to other similar failed attempts, for instance C's trigraphs. I wonder whether you could learn something from those failures, rather than just dismissing them because they predate UTF-8.

Updated by duerst (Martin Dürst) over 4 years ago

I have a somewhat mixed opinion. The proposal is for aliases, not for replacements. Therefore, whether some people think it's difficult to type isn't relevant. If you don't want to type it, you don't have to. One question is whether the symbols will be visible 'everywhere' (100% may be difficult, but let's say 99.99% of the time). The situation here is definitely a lot better than 7 years ago, and only getting better.

Another question is whether (or how much) people who usually write '>=' would be confused if they suddenly saw '≥'. At least for '≥', the chance of confusion on a reader's side is probably quite low. My guess is based on the fact that my students often write '≥' instead of '>=' when they have to write a program on paper.

But then we get to the question of what cases exactly should or should not have aliases. For example, should '=>' be aliased to '→', as in the OP, or to '⇒', which is a more direct equivalent (whereas '→' would stand for '->')? Also, should '∅' really stand for nil? I'd be confused, because in mathematics '∅' is used for the empty set, which I'd write Set.new. So, implementing aliases could be easy, but deciding which aliases to define for which operators or objects could lead to long unproductive discussions.

Japanese-language programming was also mentioned. I'm sure there have been quite a few reasons why these programming languages failed. One may have been that these languages were intended for beginners, without a story for advanced users. One may be that using Japanese automatically limited the size of the ecosystem. One may be that some of these languages tried to be too 'natural-language like' (think e.g. COBOL). Another may have been the overhead of entering Japanese characters. None of these would apply to Ruby users who don't want to use these aliases.

APL was also mentioned. It's clear that it's mostly a dead language, and that it would be difficult to input these days, when using both upper and lower case letters is widespread. It's also true that it's very easy (easier than in Perl) to create difficult-to-read programs. But that's not as much of a problem as it may sound, because APL programs are extremely short. Also, I have fond memories of APL. I used it in high school, and it was great fun. I learned to think about programming in a more global way than if I had learned e.g. C first. To an APL programmer, something like 'map-reduce' sounds very obvious, except for the name. It makes using Ruby methods such as map, select, inject,... feel very natural, whereas people coming e.g. from C will tend to write explicit while loops.
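As a small illustration of that contrast (the sample data and variable names are made up), here is the same computation, the sum of the squares of the even numbers, written both ways:

```ruby
numbers = [1, 2, 3, 4, 5, 6]

# Explicit loop, in the style someone coming from C might write:
sum_loop = 0
i = 0
while i < numbers.length
  sum_loop += numbers[i] * numbers[i] if numbers[i].even?
  i += 1
end

# Whole-collection style with select, map and inject:
sum_chain = numbers.select(&:even?).map { |n| n * n }.inject(0, :+)

puts sum_loop == sum_chain
```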
