Bug #6312

Psych needlessly noisy parsing string node starting with number-ish string

Added by gabriele renzi about 2 years ago. Updated 8 months ago.

[ruby-core:44426]
Status:Assigned
Priority:Low
Assignee:Aaron Patterson
Category:-
Target version:-
ruby -v:ruby 1.9.3p188 (2012-04-17 revision 35365) [x86_64-darwin10.8.0] Backport:

Description

For example:

 $ ruby -d -rpsych -e 'Psych.load("4 weddings")' 2>&1 | tail -n 2
 Exception `ArgumentError' at /Users/riffraff/.rvm/rubies/ruby-1.9.3-head/lib/ruby/1.9.1/psych/scalar_scanner.rb:99 - invalid value for Integer(): "4 weddings"
 4 weddings

(tail because there are a bunch more printout due to load/name errors)

This is due to assuming by default that anything that is not another scalar type should be considered first as a YAML !!int, and only if that fails with an exception, as a string.

There was already a specific fix for one instance of this issue (#5186), but it would be nicer to avoid it altogether.

Small patch attached importing the spec from yaml.org for what an int should be. All psych tests still passing for me.

Notes:

  • I did add a tiny test and some setup/teardown in the specific file so that the debug would be visible on screen.
    It could make sense to replace STDERr with a StringIO and check that but it feels fragile, and I don't know how to test "does not cause debug printouts" otherwise.

  • checking for the INT regex makes the check for two "." in integer unnecessary. I have added it back to the float case as r32957 had fixed the issue but it's been reintroduced (the yaml.org float regexp is wrong or we don't parse the same floats)

  • psych treats '1,2' as '12'. This seem like a bug as I could not see it in the spec, but I have changed the regexp accordingly.

  • if the "1,2" == "12" parsing is removed then the String#gsub calls become unnecessary

  • there seem to be many capturing groups in this file which are not really necessary

  • sexagesimal formatting is handled by itself in another node, but it's still in the FLOAT regex so I left it in the INT one too

Hope this is somewhat helpful.

0001-make-psych-less-noisy-by-checking-if-a-value-is-a-va.patch Magnifier (2.89 KB) gabriele renzi, 04/18/2012 01:26 AM

History

#1 Updated by Yusuke Endoh about 2 years ago

  • Status changed from Open to Assigned
  • Assignee set to Aaron Patterson

#2 Updated by Zachary Scott 8 months ago

I get a NameError exception when I try to run your example on trunk:

Exception NameError' at /Users/zzak/.rubies/ruby-trunk/lib/ruby/2.1.0/psych/core_ext.rb:29 - methodyamlas' not defined in Module
Exception NameError' at /Users/zzak/.rubies/ruby-trunk/lib/ruby/2.1.0/psych/deprecated.rb:81 - undefined methodto
yaml_properties' for class `Object'

#3 Updated by gabriele renzi 8 months ago

Sorry I hadn't noticed this issue was updated, possibly rubymine is not sending notifications anymore?

Anyway It's been 15 months since I submitted this, so please forgive me if I'm not very fresh on the subject, but as far as I can tell the issue that I reported appear to have been accidentally fixed by a change a couple weeks ago:

https://github.com/tenderlove/psych/issues/156

So I guess this issue can be closed.

The errors you see are another example of the basic problem though: needlessly raising and catching exception causes noise when running with -d.
The lines in question are

removemethod :yamlas rescue nil #in core_ext.rb

and

undef :toyamlproperties rescue nil # in deprecated.rb

both can, AFAICT, be replaced with explicit testing (undef if defined) which avoids raising exceptions and eliminates this noise, with the possible side effect of being more efficient (no need to create exceptions, fill stacktrace etc).

Please excuse me if I am missing something, but as I wrote, it's been a while since I looked into this, and it's quite late in the night at the time of this writing.

Also available in: Atom PDF