-
Notifications
You must be signed in to change notification settings - Fork 211
Description
As a user, if my serialized data doesn't match the current definition, I'd much rather see a crash than have it continue with no exception but corrupted values.
#749 changed this so that missing keys will be converted to nil and extra keys will be ignored. Probably some tests need to be added for the following failure cases (I've included Marshal for comparison).
With the following serializations (same for any recent version of psych):
$ irb -ryaml
irb(main):001> D = Data.define(:foo, :bar)
=> D
irb(main):002> d = D["foo", "bar"]
=> #<data D foo="foo", bar="bar">
irb(main):003> d.to_yaml
=> "--- !ruby/data:D\nfoo: foo\nbar: bar\n"
irb(main):004> Marshal.dump d
=> "\x04\bS:\x06D\a:\bfooI\"\bfoo\x06:\x06ET:\bbarI\"\bbar\x06;\aT"
irb(main):005> I see the following behavior in psych 5.3.0:
$ irb -ryaml
irb(main):001> D = Data.define(:a, :b)
=> D
irb(main):002> YAML.unsafe_load "--- !ruby/data:D\nfoo: foo\nbar: bar\n"
=> #<data D a=nil, b=nil>
irb(main):003> I believe that is invalid behavior. It should have crashed. Here's how Marshal and psych 5.2.6 handle an invalid serialization:
$ irb -ryaml
irb(main):001> D = Data.define(:a, :b)
=> D
irb(main):002> Marshal.load "\x04\bS:\x06D\a:\bfooI\"\bfoo\x06:\x06ET:\bbarI\"\bbar\x06;\aT"
<internal:marshal>:34:in 'Marshal.load': struct D not compatible (:foo for :a) (TypeError)
from (irb):2:in '<main>'
from <internal:kernel>:168:in 'Kernel#loop'
from /home/nick/.local/share/rubies/3.4.5/lib/ruby/gems/3.4.0/gems/irb-1.15.3/exe/irb:9:in '<top (required)>'
from /home/nick/.local/share/rbenv/versions/3.4.5/bin/irb:25:in 'Kernel#load'
from /home/nick/.local/share/rbenv/versions/3.4.5/bin/irb:25:in '<main>'
irb(main):003> YAML.unsafe_load "--- !ruby/data:D\nfoo: foo\nbar: bar\n"
/home/nick/.local/share/rubies/3.4.5/lib/ruby/gems/3.4.0/gems/psych-5.2.6/lib/psych/visitors/to_ruby.rb:222:in 'Psych::Visitors::ToRuby#init_struct': unknown keywords: foo, bar (ArgumentError)
init_struct(data, **members)
^^^^^^^^^^^^^^^
from /home/nick/.local/share/rubies/3.4.5/lib/ruby/gems/3.4.0/gems/psych-5.2.6/lib/psych/visitors/to_ruby.rb:222:in 'Psych::Visitors::ToRuby#visit_Psych_Nodes_Mapping'
from /home/nick/.local/share/rubies/3.4.5/lib/ruby/gems/3.4.0/gems/psych-5.2.6/lib/psych/visitors/visitor.rb:30:in 'Psych::Visitors::Visitor#visit'
from /home/nick/.local/share/rubies/3.4.5/lib/ruby/gems/3.4.0/gems/psych-5.2.6/lib/psych/visitors/visitor.rb:6:in 'Psych::Visitors::Visitor#accept'
from /home/nick/.local/share/rubies/3.4.5/lib/ruby/gems/3.4.0/gems/psych-5.2.6/lib/psych/visitors/to_ruby.rb:35:in 'Psych::Visitors::ToRuby#accept'
from /home/nick/.local/share/rubies/3.4.5/lib/ruby/gems/3.4.0/gems/psych-5.2.6/lib/psych/visitors/to_ruby.rb:345:in 'Psych::Visitors::ToRuby#visit_Psych_Nodes_Document'
from /home/nick/.local/share/rubies/3.4.5/lib/ruby/gems/3.4.0/gems/psych-5.2.6/lib/psych/visitors/visitor.rb:30:in 'Psych::Visitors::Visitor#visit'
from /home/nick/.local/share/rubies/3.4.5/lib/ruby/gems/3.4.0/gems/psych-5.2.6/lib/psych/visitors/visitor.rb:6:in 'Psych::Visitors::Visitor#accept'
from /home/nick/.local/share/rubies/3.4.5/lib/ruby/gems/3.4.0/gems/psych-5.2.6/lib/psych/visitors/to_ruby.rb:35:in 'Psych::Visitors::ToRuby#accept'
from /home/nick/.local/share/rubies/3.4.5/lib/ruby/gems/3.4.0/gems/psych-5.2.6/lib/psych/nodes/node.rb:49:in 'Psych::Nodes::Node#to_ruby'
from /home/nick/.local/share/rubies/3.4.5/lib/ruby/gems/3.4.0/gems/psych-5.2.6/lib/psych.rb:275:in 'Psych.unsafe_load'
from (irb):3:in '<main>'
from <internal:kernel>:168:in 'Kernel#loop'
from /home/nick/.local/share/rubies/3.4.5/lib/ruby/gems/3.4.0/gems/irb-1.15.3/exe/irb:9:in '<top (required)>'
from /home/nick/.local/share/rbenv/versions/3.4.5/bin/irb:25:in 'Kernel#load'
from /home/nick/.local/share/rbenv/versions/3.4.5/bin/irb:25:in '<main>'
irb(main):004>For what it's worth, I don't think the scenario where the parameter order changed is important. Although Data.define does provide an explicit ordering, which can be used with ::new or ::[], Data#initialize always uses keyword args which never care about order. I believe Psych handles this correctly and maybe Marshal should be updated to not care about Data member order.
$ ruby --version
ruby 3.4.5 (2025-07-16 revision 20cda200d3) +PRISM [x86_64-linux]
$ irb -ryaml
irb(main):001> D = Data.define(:bar, :foo)
=> D
irb(main):002> Marshal.load "\x04\bS:\x06D\a:\bfooI\"\bfoo\x06:\x06ET:\bbarI\"\bbar\x06;\aT"
<internal:marshal>:34:in 'Marshal.load': struct D not compatible (:foo for :bar) (TypeError)
from (irb):2:in '<main>'
from <internal:kernel>:168:in 'Kernel#loop'
from /home/nick/.local/share/rubies/3.4.5/lib/ruby/gems/3.4.0/gems/irb-1.15.3/exe/irb:9:in '<top (required)>'
from /home/nick/.local/share/rbenv/versions/3.4.5/bin/irb:25:in 'Kernel#load'
from /home/nick/.local/share/rbenv/versions/3.4.5/bin/irb:25:in '<main>'
irb(main):003> YAML.unsafe_load "--- !ruby/data:D\nfoo: foo\nbar: bar\n"
=> #<data D bar="bar", foo="foo">
irb(main):004> Originally posted by @nevans in #749 (comment)