How do I deserialize in Psych to return an existing object, such as a class object?
To serialize a class, I can do:
require "psych"

class Class
  yaml_tag 'class'

  def encode_with coder
    coder.represent_scalar 'class', name
  end
end

yaml_string = Psych.dump(String) # => "--- !<class> String\n...\n"
But if I call Psych.load on that, I get an anonymous class rather than the String class.
The normal deserialization method is Object#init_with(coder), but that only changes the state of the existing anonymous class, whereas I want the String class itself.
Psych::Visitors::ToRuby#visit_Psych_Nodes_Scalar(o) has cases where rather than modifying existing objects with init_with, they make sure the right object is created in the first place (for example, calling Complex(o.value) to deserialize a complex number), but I don’t think I should be monkeypatching that method.
Am I doomed to working with low level or medium level emitting, or am I missing something?
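One route that avoids monkeypatching the visitor is Psych's domain-type hook: Psych.add_domain_type registers a block that receives the value Psych has already built for a tagged node, and whatever the block returns becomes the loaded object. Below is a minimal sketch of that idea, not a tested recipe for every Psych version. The domain "example.com,2011" is a placeholder, resolve_class_name is a hypothetical helper, and the YAML is written by hand with the full tag rather than produced by the encode_with code above.

```ruby
require "psych"

# Hypothetical helper: look up an existing class or module by its name.
def resolve_class_name(name)
  constant = Object.const_get(name)
  raise TypeError, "#{name} is not a class or module" unless constant.is_a?(Module)
  constant
end

# "example.com,2011" is a placeholder domain chosen for this sketch.
# The block receives the tag and the value Psych built for the node
# (here the plain string "String"); its return value replaces that value.
Psych.add_domain_type("example.com,2011", "class") do |_tag, value|
  resolve_class_name(value)
end

# Hand-written YAML using the full tag registered above.
yaml_string = "--- !<tag:example.com,2011:class> String\n"
loaded = Psych.load(yaml_string)
```

Here loaded should be the existing String class rather than a new anonymous class, because the block looks the constant up instead of allocating anything.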
Background
I’ll describe the project, why it needs classes, and why it needs
(de)serialization.
Project
The Small Eigen Collider aims to create random tasks for Ruby to run.
The initial aim was to see if the different implementations of Ruby
(for example, Rubinius and JRuby) returned the same results when given
the same random tasks, but I’ve found that it’s also good for
detecting ways to segfault Rubinius and YARV.
Each task is composed of the following:
receiver.send(method_name, *parameters, &block)
where receiver is a randomly chosen object, and method_name is the
name of a randomly chosen method, and *parameters is an array of
randomly chosen objects. &block is not very random – it’s basically
equivalent to {|o| o.inspect}.
For example, if receiver were "a", method_name were :casecmp, and
parameters were ["b"], then you’d be calling
"a".send(:casecmp, "b") {|x| x.inspect}
which is equivalent to (since the block is irrelevant)
"a".casecmp("b")
The Small Eigen Collider runs this code and logs the inputs along with
the return value. In this example, most implementations of Ruby
return -1, but at one stage Rubinius returned +1. (I filed this as a
bug, https://github.com/evanphx/rubinius/issues/518 , and the Rubinius
maintainers fixed it.)
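To make the run-and-log step concrete, here is a rough sketch of what running one task might look like. Task and run_task are names made up for this illustration and are not taken from the Small Eigen Collider's actual code; recording exceptions by class name is also an assumption.

```ruby
# Hypothetical container for one task; the real project may store tasks differently.
Task = Struct.new(:receiver, :method_name, :parameters)

# Run one task and return a log entry holding the inputs and the outcome.
def run_task(task)
  entry = {
    receiver: task.receiver.inspect,
    method_name: task.method_name,
    parameters: task.parameters.inspect
  }
  begin
    # Same shape as receiver.send(method_name, *parameters, &block).
    entry[:result] = task.receiver.send(task.method_name, *task.parameters) { |o| o.inspect }
  rescue StandardError => e
    # Assumption: an exception is logged by its class name instead of a result.
    entry[:error] = e.class.name
  end
  entry
end

log_entry = run_task(Task.new("a", :casecmp, ["b"]))
failed_entry = run_task(Task.new("a", :casecmp, []))
```

The first entry records the return value of "a".casecmp("b"); the second records the ArgumentError raised when casecmp is called with no arguments.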
Why it needs classes
I want to be able to use class objects in my Small Eigen Collider.
Typically, they would be the receiver, but they could also be one of
the parameters.
For example, I found that one way to segfault YARV is to do
Thread.kill(nil)
In this case, receiver is the class object Thread, and parameters is
[nil]. (Bug report: http://redmine.ruby-lang.org/issues/show/4367 )
Why it needs (de)serialization
The Small Eigen Collider needs serialization for a couple of reasons.
One is that using a random number generator to generate a series of
random tasks every time isn’t practical. JRuby has a different builtin
random number generator, so even when given the same PRNG seed it’d
give different tasks to YARV. Instead, I create the list of
random tasks once (on the first run of ruby
bin/small_eigen_collider), have that first run serialize the list
of tasks to tasks.yml, and then have subsequent runs of the
program (using different Ruby implementations) read in that tasks.yml
file to get the list of tasks.
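The generate-once, replay-everywhere workflow could look roughly like the sketch below. The hash layout (string keys, method names stored as strings) is an assumption made for this example so that the file loads with plain Psych.load; the real tasks.yml may be structured differently, and a temporary directory stands in for the project directory.

```ruby
require "psych"
require "tmpdir"

# Assumed task layout: plain hashes with string keys and string method names.
tasks = [
  { "receiver" => "a",   "method_name" => "casecmp", "parameters" => ["b"] },
  { "receiver" => "abc", "method_name" => "center",  "parameters" => [7] }
]

loaded_tasks = nil
results = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, "tasks.yml")

  # First run: serialize the generated task list.
  File.write(path, Psych.dump(tasks))

  # Later runs (possibly on another Ruby implementation): read it back and replay.
  loaded_tasks = Psych.load(File.read(path))
  results = loaded_tasks.map do |task|
    task["receiver"].send(task["method_name"], *task["parameters"]) { |o| o.inspect }
  end
end
```

Because the file is plain YAML, it can also be trimmed by hand between runs.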
Another reason I need serialization is that I want to be able to edit
the list of tasks. If I have a long list of tasks that leads to a
segmentation fault, I want to reduce the list to the minimum required
to cause a segmentation fault. For example, with the following bug
https://github.com/evanphx/rubinius/issues/643 ,
ObjectSpace.undefine_finalizer(:symbol)
by itself doesn’t cause a segmentation fault, and nor does
Symbol.all_symbols.inspect
but if you put the two together, you get one. I started out with
thousands of tasks, and needed to pare the list back to just those two
tasks.
Does deserialization returning existing class objects make sense in
this context, or do you think there’s a better way?
The Psych maintainer has implemented the serialization and deserialization of classes and modules. It’s now in Ruby!
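A quick round trip with the built-in support might look like this. It is a sketch assuming a reasonably recent Ruby: on Psych 4 and later, plain Psych.load is restricted to a small set of types by default, so the example uses Psych.unsafe_load when it is available and falls back to Psych.load on older versions. Only do this with YAML you trust.

```ruby
require "psych"

# Dump a class and a module using Psych's built-in support.
class_yaml = Psych.dump(String)
module_yaml = Psych.dump(Kernel)

# Pick a loader that is allowed to resolve class names.
# unsafe_load exists on newer Psych versions; older ones use load.
load_trusted =
  if Psych.respond_to?(:unsafe_load)
    Psych.method(:unsafe_load)
  else
    Psych.method(:load)
  end

loaded_class = load_trusted.call(class_yaml)
loaded_module = load_trusted.call(module_yaml)
```

Both loaded_class and loaded_module are the existing String and Kernel objects, not fresh anonymous ones.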