Feature #7788

YAML Tag Schema Support

Added by Thomas Sawyer about 1 year ago. Updated about 1 year ago.

[ruby-core:51882]
Status:Open
Priority:Normal
Assignee:Aaron Patterson
Category:lib
Target version:next minor

Description

=begin
I have endeavoured to add proper Schema support to Psych (see (()) on Schemas). The primary reasons for supporting schemas are twofold: security and global tag conflict. The first is well known b/c of recent events. The second is less widely recognized, but consider that it is the same problem as using global variables. Different apps have different tags; two identical local tags may have different meanings and thus cause conflict.

The API works like this:

class Foo
end

foo_schema = YAML::Schema.new do |s|
  s.tag '!foo', Foo
end

YAML.load(File.read('foo.yml'), :schema=>foo_schema)

This code would allow only failsafe and json schema tags (core defaults), plus the specifically defined !foo tag.
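
To make the whitelisting idea concrete, here is a minimal sketch that runs without Psych. ToySchema and its resolve method are hypothetical stand-ins, not the classes in the branch; the sketch only shows that a tag outside the schema is rejected rather than resolved to a class.

```ruby
# Hypothetical stand-in for the proposed YAML::Schema; not part of Psych.
class ToySchema
  class UnknownTagError < StandardError; end

  def initialize
    @tags = {}
    yield self if block_given?
  end

  # Register a tag and the class it should resolve to.
  def tag(name, klass)
    @tags[name] = klass
    self
  end

  # Return the class registered for a tag; refuse anything not registered.
  def resolve(name)
    @tags.fetch(name) { raise UnknownTagError, "tag not in schema: #{name}" }
  end
end

class Foo
end

foo_schema = ToySchema.new do |s|
  s.tag '!foo', Foo
end

foo_schema.resolve('!foo')   # returns Foo

begin
  foo_schema.resolve('!ruby/object:Kernel')
rescue ToySchema::UnknownTagError => e
  warn e.message             # the unregistered tag is refused
end
```

A loader built this way never instantiates a class that the application did not name explicitly, which is the security half of the argument above.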
Also, %TAG prefix is supported:

foo_schema = YAML::Schema.new(:prefix=>{'!'=>'tag:foo.org/'}) do |s|
  s.tag '!foo', Foo
end

This will add the tag 'tag:foo.org/foo' instead of the local '!foo' tag.
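
The prefix expansion can be sketched as a small helper. The name expand_tag is hypothetical and this is not how the branch implements it; it only illustrates how a handle such as '!' is swapped for its prefix, as a %TAG directive does.

```ruby
# Hypothetical helper: expand a shorthand tag using a handle => prefix table.
def expand_tag(shorthand, prefixes)
  # Prefer the longest matching handle, so '!!' is tried before '!'.
  handle = prefixes.keys.sort_by { |h| -h.length }
                   .find { |h| shorthand.start_with?(h) }
  return shorthand unless handle

  prefixes[handle] + shorthand[handle.length..-1]
end

expand_tag('!foo', { '!' => 'tag:foo.org/' })   # => "tag:foo.org/foo"
expand_tag('!foo', {})                          # => "!foo" (stays local)
```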

To properly support schemas, objects must store the tag with which they were loaded in order to ensure correct round tripping. For this there is a tag_uri attribute.
(Note: I am not sure if it is best to store it as an instance variable, which it currently is, or in a global table. Feedback needed.)
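
For anyone weighing the two storage options, here is a rough sketch of each. The names are illustrative only (TAG_URIS is hypothetical, and the tag_uri accessor on Foo is declared by hand here). One practical difference is that a frozen object cannot take a new instance variable, while an identity-keyed side table can still record its tag.

```ruby
# Option A: keep the tag on the object itself as an instance variable.
class Foo
  attr_accessor :tag_uri
end

foo = Foo.new
foo.tag_uri = 'tag:foo.org/foo'

# Option B: keep tags in a side table keyed by object identity.
# Note: a plain Hash keeps its keys alive, so a real table would need
# some way to drop entries for objects that are no longer in use.
TAG_URIS = {}.compare_by_identity

frozen = 'immutable value'.freeze
TAG_URIS[frozen] = 'tag:foo.org/str'

# Option A is not available for frozen objects:
begin
  frozen.instance_variable_set(:@tag_uri, 'tag:foo.org/str')
rescue RuntimeError => e   # FrozenError on newer Rubies, a RuntimeError subclass
  warn e.message
end
```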

In the process of adding schema support I was able to clean up and generalize the loading code. For immutable types and class factories, a class can define (({ClassName.new_with(coder)})), which is then used to instantiate it.
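
A sketch of what a new_with hook might look like for an immutable value class. Point is a made-up example, and a plain Hash stands in for the coder object; the exact coder interface in the branch may differ.

```ruby
# Made-up immutable value class used only for illustration.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
    freeze   # instances are immutable once constructed
  end

  # Proposed hook: build the instance from loaded data via the normal
  # constructor, instead of filling in an already-allocated object.
  def self.new_with(coder)
    new(coder['x'], coder['y'])
  end
end

coder = { 'x' => 1, 'y' => 2 }   # stand-in for the coder (assumption)
point = Point.new_with(coder)
point.frozen?                    # => true
```

The existing init_with(coder) hook is called on an object that has already been allocated, which is awkward for classes that expect to be built through their constructor. A class-level hook avoids that.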

Implementation is close to complete; I believe this is all that remains:

  1. ScalarScanner needs to respect schema (basically if failsafe and/or json schemas are not used).
  2. Dumping needs to take :schema option to limit it to schema tags.
  3. Dumping needs to look to tag_uri for tag by default.
  4. There is one bug I have yet to figure out (test_spec_builtin_map).
  5. I have questions about Coder, b/c it seems more complex than it needs to be.

I am also considering refactoring Schemas as modules that can be included into other schemas. Currently they are classes/objects that can be subclassed or merged via +, e.g.

LEGACY_SCHEMA = CORE_SCHEMA + RUBY_SCHEMA + OBJECT_SCHEMA + SYCK_SCHEMA
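
The merge-by-+ behaviour can be sketched with a toy class. MergeableSchema is hypothetical and only models the tag table. It assumes that + returns a new schema and that the right-hand side wins when both define the same tag, which may not match the branch.

```ruby
# Hypothetical model of schema merging; only the tag table is represented.
class MergeableSchema
  attr_reader :tags

  def initialize(tags = {})
    @tags = tags.dup
    yield self if block_given?
  end

  def tag(name, klass)
    @tags[name] = klass
    self
  end

  # Combine two schemas into a new one, leaving both operands untouched.
  # On a conflicting tag the right-hand schema wins (an assumption).
  def +(other)
    self.class.new(@tags.merge(other.tags))
  end
end

core = MergeableSchema.new { |s| s.tag '!!str', String }
ruby = MergeableSchema.new { |s| s.tag '!ruby/range', Range }

legacy = core + ruby
legacy.tags.keys   # => ["!!str", "!ruby/range"]
```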

Of course, as with any new code, there are sure to be corner cases to work out. Having others pound on it for a while would be very helpful. Oh, and I should also mention I am documenting as much of the code as I can.

Feel free to ask me any questions for more details about the code. You can find the branch here: https://github.com/trans/psych/tree/isotag
=end

History

#1 Updated by Koichi Sasada about 1 year ago

  • Assignee set to Aaron Patterson
