Bug #6312

Psych needlessly noisy parsing string node starting with number-ish string

Added by riffraff (gabriele renzi) over 12 years ago. Updated almost 6 years ago.

Status: Closed
Target version: -
ruby -v: ruby 1.9.3p188 (2012-04-17 revision 35365) [x86_64-darwin10.8.0]
Backport:
[ruby-core:44426]

Description

For example:

 $ ruby -d -rpsych -e 'Psych.load("4 weddings")' 2>&1 | tail -n 2
 Exception `ArgumentError' at /Users/riffraff/.rvm/rubies/ruby-1.9.3-head/lib/ruby/1.9.1/psych/scalar_scanner.rb:99 - invalid value for Integer(): "4 weddings"
 4 weddings

(Piped through tail because there is a lot more output caused by load/name errors.)

This happens because Psych assumes by default that any scalar that is not another scalar type should first be tried as a YAML !!int, and treated as a string only if that attempt fails with an exception.
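To illustrate the difference, here is a minimal sketch contrasting the two strategies. The method names and the simplified SIMPLE_INT pattern are hypothetical and only for illustration; this is not Psych's actual code.

```ruby
# Hypothetical helpers for illustration only; not Psych's implementation.

# Strategy described above: try Integer() first and fall back to the string.
# Under `ruby -d`, every ArgumentError raised here is reported on stderr,
# even though it is rescued.
def tokenize_with_rescue(string)
  Integer(string)
rescue ArgumentError
  string
end

# Alternative: call Integer() only when the string already looks like an int,
# so no exception is raised for input such as "4 weddings".
# SIMPLE_INT is a deliberately simplified decimal-only pattern (an assumption).
SIMPLE_INT = /\A[-+]?(?:0|[1-9][0-9]*)\z/

def tokenize_with_check(string)
  SIMPLE_INT =~ string ? Integer(string) : string
end
```

Both helpers return the same values for these inputs; only the second avoids raising (and therefore avoids the debug printout) when the string is not an integer.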

There was already a specific fix for one instance of this issue (#5186), but it would be nicer to avoid it altogether.

A small patch is attached that imports the definition of an int from the yaml.org spec. All psych tests still pass for me.

Notes:

  • I added a tiny test and some setup/teardown in that test file so that the debug output would be visible on screen.
    It could make sense to replace STDERR with a StringIO and check that, but it feels fragile, and I don't know how else to test "does not cause debug printouts".

  • checking against the INT regex makes the check for two "." in an integer unnecessary. I have added it back to the float case, since r32957 had fixed the issue but it has been reintroduced (either the yaml.org float regexp is wrong or we don't parse the same floats)

  • psych treats '1,2' as '12'. This seems like a bug, as I could not find it in the spec, but I have changed the regexp accordingly.

  • if the "1,2" == "12" parsing is removed then the String#gsub calls become unnecessary

  • there seem to be many capturing groups in this file which are not really necessary

  • sexagesimal formatting is handled by itself in another node, but it's still in the FLOAT regex so I left it in the INT one too
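As a rough sketch of what the notes above refer to, the int pattern from the yaml.org type description can be written in Ruby roughly as follows. This is transcribed from memory of yaml.org/type/int.html and is not the attached patch; the constant name YAML_INT and the helper strip_separators are hypothetical, and the commas the notes mention are deliberately left out of the pattern.

```ruby
# Assumed transcription of the yaml.org int pattern; not the attached patch.
YAML_INT = /\A(?:
   [-+]?0b[0-1_]+                      # base 2
  |[-+]?0[0-7_]+                       # base 8
  |[-+]?(?:0|[1-9][0-9_]*)             # base 10
  |[-+]?0x[0-9a-fA-F_]+                # base 16
  |[-+]?[1-9][0-9_]*(?::[0-5]?[0-9])+  # base 60 (sexagesimal)
)\z/x

# Hypothetical helper mirroring the gsub step mentioned in the notes:
# drop "," and "_" separators before handing the string to Integer().
def strip_separators(string)
  string.gsub(/[,_]/, "")
end

YAML_INT =~ "4 weddings"          # nil: not an int, so no Integer() call is needed
YAML_INT =~ "1,2"                 # nil: commas are not part of this pattern
Integer(strip_separators("1,2"))  # 12: the lenient reading described in the notes
```

With a pattern like this, a string is passed to Integer() only after it has matched, so the two-dots check becomes redundant for ints, and if commas are not accepted the only separator left to strip is "_".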

Hope this is somewhat helpful.

