XML-MAPPING: XML-to-object (and back) Mapper for Ruby, including XPath Interpreter
Xml-mapping is an easy to use, extensible library that allows you to semi-automatically map Ruby objects to XML trees and vice versa.
Download
To download the latest version or to access the git repository, go to:
github.com/multi-io/xml-mapping
Example
(example document stolen + extended from www.castor.org/xml-mapping.html)
Input Document:
<?xml version="1.0" encoding="ISO-8859-1"?>
<Order reference="12343-AHSHE-314159">
<Client>
<Name>Jean Smith</Name>
<Address where="home">
<City>San Mateo</City>
<State>CA</State>
<ZIP>94403</ZIP>
<Street>2000, Alameda de las Pulgas</Street>
</Address>
<Address where="work">
<City>San Francisco</City>
<State>CA</State>
<ZIP>94102</ZIP>
<Street>98765, Fulton Street</Street>
</Address>
</Client>
<Item reference="RF-0001">
<Description>Stuffed Penguin</Description>
<Quantity>10</Quantity>
<UnitPrice>8.95</UnitPrice>
</Item>
<Item reference="RF-0034">
<Description>Chocolate</Description>
<Quantity>5</Quantity>
<UnitPrice>28.50</UnitPrice>
</Item>
<Item reference="RF-3341">
<Description>Cookie</Description>
<Quantity>30</Quantity>
<UnitPrice>0.85</UnitPrice>
</Item>
<Signed-By>
<Signature>
<Name>John Doe</Name>
<Position>product manager</Position>
</Signature>
<Signature>
<Name>Jill Smith</Name>
<Position>clerk</Position>
</Signature>
<Signature>
<Name>Miles O'Brien</Name>
</Signature>
</Signed-By>
</Order>
Mapping Class Declaration:
require 'xml/mapping'

## forward declarations
class Client; end
class Address; end
class Item; end
class Signature; end

class Order
  include XML::Mapping

  text_node :reference, "@reference"
  object_node :client, "Client", :class=>Client
  hash_node :items, "Item", "@reference", :class=>Item
  array_node :signatures, "Signed-By", "Signature", :class=>Signature, :default_value=>[]

  def total_price
    items.values.map{|i| i.total_price}.inject(0){|x,y|x+y}
  end
end

class Client
  include XML::Mapping

  text_node :name, "Name"
  object_node :home_address, "Address[@where='home']", :class=>Address
  object_node :work_address, "Address[@where='work']", :class=>Address, :default_value=>nil
end

class Address
  include XML::Mapping

  text_node :city, "City"
  text_node :state, "State"
  numeric_node :zip, "ZIP"
  text_node :street, "Street"
end

class Item
  include XML::Mapping

  text_node :descr, "Description"
  numeric_node :quantity, "Quantity"
  numeric_node :unit_price, "UnitPrice"

  def total_price
    quantity*unit_price
  end
end

class Signature
  include XML::Mapping

  text_node :name, "Name"
  text_node :position, "Position", :default_value=>"Some Employee"
end
Usage:
####read access
o=Order.load_from_file("order.xml")
=> #<Order:0x007ff64a0fe8b0 @signatures=[#<Signature:0x007ff64a0ce3e0 @position="product manager", @name="John Doe">, #<Signature:0x007ff64a0cd210 @position="clerk", @name="Jill Smith">, #<Signature:0x007ff649a322e8 @position="Some Employee", @name="Miles O'Brien">], @reference="12343-AHSHE-314159", @client=#<Client:0x007ff64a0fd6b8 @work_address=#<Address:0x007ff64a0ed678 @city="San Francisco", @state="CA", @zip=94102, @street="98765, Fulton Street">, @name="Jean Smith", @home_address=#<Address:0x007ff64a0efef0 @city="San Mateo", @state="CA", @zip=94403, @street="2000, Alameda de las Pulgas">>, @items={"RF-0001"=>#<Item:0x007ff64a0df550 @descr="Stuffed Penguin", @quantity=10, @unit_price=8.95>, "RF-0034"=>#<Item:0x007ff64a0ddbd8 @descr="Chocolate", @quantity=5, @unit_price=28.5>, "RF-3341"=>#<Item:0x007ff64a0dc0d0 @descr="Cookie", @quantity=30, @unit_price=0.85>}>
o.reference
=> "12343-AHSHE-314159"
o.client
=> #<Client:0x007ff64a0fd6b8 @work_address=#<Address:0x007ff64a0ed678 @city="San Francisco", @state="CA", @zip=94102, @street="98765, Fulton Street">, @name="Jean Smith", @home_address=#<Address:0x007ff64a0efef0 @city="San Mateo", @state="CA", @zip=94403, @street="2000, Alameda de las Pulgas">>
o.items.keys
=> ["RF-0001", "RF-0034", "RF-3341"]
o.items["RF-0034"].descr
=> "Chocolate"
o.items["RF-0034"].total_price
=> 142.5
o.signatures
=> [#<Signature:0x007ff64a0ce3e0 @position="product manager", @name="John Doe">, #<Signature:0x007ff64a0cd210 @position="clerk", @name="Jill Smith">, #<Signature:0x007ff649a322e8 @position="Some Employee", @name="Miles O'Brien">]
o.signatures[2].name
=> "Miles O'Brien"
o.signatures[2].position
=> "Some Employee"
## default value was set
o.total_price
=> 257.5
####write access
o.client.name="James T. Kirk"
o.items['RF-4711'] = Item.new
o.items['RF-4711'].descr = 'power transfer grid'
o.items['RF-4711'].quantity = 2
o.items['RF-4711'].unit_price = 29.95
s=Signature.new
s.name='Harry Smith'
s.position='general manager'
o.signatures << s
xml=o.save_to_xml #convert to REXML node; there's also o.save_to_file(name)
=> <order reference='12343-AHSHE-314159'> ... </>
xml.write($stdout,2)
<order reference='12343-AHSHE-314159'>
<Client>
<Name>
James T. Kirk
</Name>
<Address where='home'>
<City>
San Mateo
</City>
<State>
CA
</State>
<ZIP>
94403
</ZIP>
<Street>
2000, Alameda de las Pulgas
</Street>
</Address>
<Address where='work'>
<City>
San Francisco
</City>
<State>
CA
</State>
<ZIP>
94102
</ZIP>
<Street>
98765, Fulton Street
</Street>
</Address>
</Client>
<Item reference='RF-0001'>
<Description>
Stuffed Penguin
</Description>
<Quantity>
10
</Quantity>
<UnitPrice>
8.95
</UnitPrice>
</Item>
<Item reference='RF-0034'>
<Description>
Chocolate
</Description>
<Quantity>
5
</Quantity>
<UnitPrice>
28.5
</UnitPrice>
</Item>
<Item reference='RF-3341'>
<Description>
Cookie
</Description>
<Quantity>
30
</Quantity>
<UnitPrice>
0.85
</UnitPrice>
</Item>
<Item reference='RF-4711'>
<Description>
power transfer grid
</Description>
<Quantity>
2
</Quantity>
<UnitPrice>
29.95
</UnitPrice>
</Item>
<Signed-By>
<Signature>
<Name>
John Doe
</Name>
<Position>
product manager
</Position>
</Signature>
<Signature>
<Name>
Jill Smith
</Name>
<Position>
clerk
</Position>
</Signature>
<Signature>
<Name>
Miles O'Brien
</Name>
</Signature>
<Signature>
<Name>
Harry Smith
</Name>
<Position>
general manager
</Position>
</Signature>
</Signed-By>
</order>
####Starting a new order from scratch
o = Order.new
=> #<Order:0x007ff64a206050 @signatures=[]>
## attributes with default values (here: signatures) are set
## automatically
xml=o.save_to_xml
XML::MappingError: no value, and no default value, for attribute: reference
from /Users/oklischat/xml-mapping/lib/xml/mapping/base.rb:724:in `obj_to_xml'
from /Users/oklischat/xml-mapping/lib/xml/mapping/base.rb:218:in `block in fill_into_xml'
from /Users/oklischat/xml-mapping/lib/xml/mapping/base.rb:217:in `each'
from /Users/oklischat/xml-mapping/lib/xml/mapping/base.rb:217:in `fill_into_xml'
from /Users/oklischat/xml-mapping/lib/xml/mapping/base.rb:229:in `save_to_xml'
## can't save as long as there are still unset attributes without
## default values
o.reference = "FOOBAR-1234"
o.client = Client.new
o.client.name = 'Ford Prefect'
o.client.home_address = Address.new
o.client.home_address.street = '42 Park Av.'
o.client.home_address.city = 'small planet'
o.client.home_address.zip = 17263
o.client.home_address.state = 'Betelgeuse system'
o.items={'XY-42' => Item.new}
o.items['XY-42'].descr = 'improbability drive'
o.items['XY-42'].quantity = 3
o.items['XY-42'].unit_price = 299.95
xml=o.save_to_xml
xml.write($stdout,2)
<order reference='FOOBAR-1234'>
<Client>
<Name>
Ford Prefect
</Name>
<Address where='home'>
<City>
small planet
</City>
<State>
Betelgeuse system
</State>
<ZIP>
17263
</ZIP>
<Street>
42 Park Av.
</Street>
</Address>
</Client>
<Item reference='XY-42'>
<Description>
improbability drive
</Description>
<Quantity>
3
</Quantity>
<UnitPrice>
299.95
</UnitPrice>
</Item>
</order>
## the root element name when saving an object to XML will by default
## be derived from the class name (in this example, "Order" became
## "order"). This can be overridden on a per-class basis; see
## XML::Mapping::ClassMethods#root_element_name for details.
As shown in the example, you have to include XML::Mapping into a class to turn it into a “mapping class”. There are no other restrictions imposed on mapping classes; you can add attributes and methods to them, include additional modules in them, derive them from other classes, derive other classes from them, etc.
An instance of a mapping class can be created from/converted into an XML node with methods like XML::Mapping::ClassMethods#load_from_xml, XML::Mapping#save_to_xml, XML::Mapping::ClassMethods#load_from_file, XML::Mapping#save_to_file. Special class methods like “text_node”, “array_node” etc., called node factory methods, may be called from the body of the class definition to define instance attributes that are automatically and bidirectionally mapped to subtrees of the XML element an instance of the class is mapped to.
Single-attribute Nodes
For example, in the definition
class Address
  include XML::Mapping

  text_node :city, "City"
  text_node :state, "State"
  numeric_node :zip, "ZIP"
  text_node :street, "Street"
end
the first call to text_node creates an attribute named “city” which is
mapped to the text of the XML child element defined
by the XPath expression “City” (xml-mapping includes an XPath interpreter
that can also be used separately; see below). When
you create an instance of Address from an XML element (using Address.load_from_file(file_name) or
Address.load_from_xml(rexml_element)), that instance's “city” attribute
will be set to the text of the XML element's
“City” child element. When you convert an instance of Address
into an XML element, a sub-element “City” is added
and its text is set to the current value of the city
attribute. The other node types (numeric_node, array_node etc.) work
analogously. Generally speaking, when an instance of the above
Address class is created from or converted to an XML tree, each of the four nodes in the class maps some
parts of that XML tree to a single, specific
attribute of the Address instance. The name of that attribute
is given in the first argument to the node factory method. Such a node is
called a “single-attribute node”. All node types that come with xml-mapping
except one (choice_node, which I'll talk about below) are
single-attribute nodes.
Default Values
For each single-attribute node you may define a default value which will be set if there was no value defined for the attribute in the XML source.
From the example:
class Signature
  include XML::Mapping

  text_node :position, "Position", :default_value=>"Some Employee"
end
The semantics of default values are as follows:
- when creating a new instance from scratch:
  - attributes with default values are set to their default values
  - attributes without default values are left unset
  (when defining your own initializer, you'll have to call the inherited initialize method in order to get this behaviour)
- when loading an instance from an XML document:
  - attributes without default values that are not represented in the XML raise an error
  - attributes with default values that are not represented in the XML are set to their default values
  - all other attributes are set to their respective values as present in the XML
- when saving an instance to an XML document:
  - unset attributes without default values raise an error
  - attributes with default values that are set to their default values are not saved
  - all other attributes are saved
This implies that:
- attributes that are set to their respective default values are not represented in the XML
- attributes without default values must be set explicitly before saving
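The save-time rules above can be modelled in a few lines of plain Ruby. This is only a sketch of the described behaviour, not xml-mapping's implementation; attributes_to_save and the NO_DEFAULT marker are hypothetical names introduced for illustration.

```ruby
# Hypothetical marker meaning "no default value was declared for this attribute".
NO_DEFAULT = Object.new.freeze

# Sketch of the save-time rules: returns the attributes that would be written.
# `values` maps attribute names to current values (nil = unset),
# `defaults` maps every declared attribute to its default or to NO_DEFAULT.
def attributes_to_save(values, defaults)
  defaults.each_with_object({}) do |(attr, default), saved|
    value = values[attr]
    has_default = !default.equal?(NO_DEFAULT)
    if value.nil? && !has_default
      raise ArgumentError, "no value, and no default value, for attribute: #{attr}"
    end
    next if has_default && value == default   # equal to the default: not saved
    saved[attr] = value unless value.nil?
  end
end

defaults = { name: NO_DEFAULT, position: "Some Employee" }

# position equals its default, so only name would be written
p attributes_to_save({ name: "Miles O'Brien", position: "Some Employee" }, defaults)
# position differs from its default, so both attributes would be written
p attributes_to_save({ name: "John Doe", position: "product manager" }, defaults)
```

Calling attributes_to_save({}, defaults) raises, mirroring the error shown earlier for an Order without a reference.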
Single-attribute Nodes with Sub-objects
Single-attribute nodes of type array_node,
hash_node, and object_node recursively map one or
more subtrees of their XML to sub-objects (e.g.
array elements or hash values) of their attribute. For example, with the
line
array_node :signatures, "Signed-By", "Signature", :class=>Signature, :default_value=>[]
an attribute named “signatures” is added to the surrounding class (here:
Order); the attribute will be an array whose elements
correspond to the XML sub-trees yielded by the XPath
expression “Signed-By/Signature” (relative to the tree corresponding to the
Order instance). Each element will be of class
Signature (internally, each element is created from its
corresponding XML subtree by just calling
Signature.load_from_xml(the_subtree)). The reason why the path
“Signed-By/Signature” is provided in two arguments instead of just one
combined one becomes apparent when marshalling the array (along with the
surrounding Order object) back into a sequence of XML elements. When that happens, “Signed-By” names the
common base element for all those elements, and “Signature” is the path
that will be duplicated for each element. For example, when the
signatures attribute contains an array with 3
Signature instances (let's call them sig1,
sig2, and sig3) in it, it will be marshalled to
an XML tree that looks like this:
<Signed-By>
<Signature>
[marshalled object sig1]
</Signature>
<Signature>
[marshalled object sig2]
</Signature>
<Signature>
[marshalled object sig3]
</Signature>
</Signed-By>
Internally, each Signature instance is stored into its
<Signature> sub-element by calling
the_signature_instance.fill_into_xml(the_sub_element). The
input document in the example above shows how this ends up looking.
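The role of the two path arguments can be illustrated with plain REXML. The sketch below is not how xml-mapping is implemented; marshal_array is a hypothetical helper, and plain strings stand in for the marshalled Signature objects.

```ruby
require 'rexml/document'

# Hypothetical helper: creates the common base element once, then one
# element per array entry below it.
def marshal_array(parent, base_path, element_path, values)
  base = parent.add_element(base_path)
  values.each do |value|
    base.add_element(element_path).text = value
  end
  base
end

order = REXML::Element.new("order")
signed_by = marshal_array(order, "Signed-By", "Signature", %w[sig1 sig2 sig3])

puts signed_by.elements.size   # number of <Signature> elements created
puts order                     # the resulting tree, unindented
```

If the path were given as a single argument, there would be no way to tell which part should be created once and which part should be repeated per element.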
hash_nodes work similarly, but they define hash-valued
attributes instead of array-valued ones.
object_nodes are the simplest of the three types of
single-attribute nodes with sub-objects. They just map a single given
subtree directly to their attribute value. See the example for examples :)
The mentioned methods load_from_xml and
fill_into_xml are the only methods classes must implement in
order to be usable in the :class=> keyword arguments to
node factory methods. Mapping classes (i.e. classes that include
XML::Mapping) automatically inherit those functions and can thus be
readily used in :class=> arguments, as shown for the
Signature class in the array_node call above. In
addition to that, xml-mapping adds those methods to some of Ruby's core
classes, namely String and Numeric (and thus
its subclasses such as Float and Integer). So you
can also use strings or numbers as sub-objects of attributes of
array_node, hash_node, or
object_node nodes. For example, say you have an XML document like this one:
<?xml version="1.0" encoding="ISO-8859-1"?>
<people>
<names>
<name>Jim</name>
<name>Susan</name>
<name>Herbie</name>
<name>Nancy</name>
</names>
</people>
If you want to map all the names to a string array attribute
names, you could do it like this:
require 'xml/mapping'

class People
  include XML::Mapping

  array_node :names, "names", "name", :class=>String
end
Usage:
ppl=People.load_from_file("stringarray.xml")
=> #<People:0x007ff64a0cda08 @names=["Jim", "Susan", "Herbie", "Nancy"]>
ppl.names
=> ["Jim", "Susan", "Herbie", "Nancy"]
ppl.names.concat ["Mary","Arnold"]
=> ["Jim", "Susan", "Herbie", "Nancy", "Mary", "Arnold"]
ppl.save_to_xml.write $stdout,2
<people>
<names>
<name>
Jim
</name>
<name>
Susan
</name>
<name>
Herbie
</name>
<name>
Nancy
</name>
<name>
Mary
</name>
<name>
Arnold
</name>
</names>
</people>
As a side note, this feature actually makes text_node and
numeric_node special cases of object_node. For
example, text_node :attr, "path" is the same as
object_node :attr, "path", :class=>String.
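As a rough sketch of what a class needs in order to be usable in a :class=> argument, here is a small value class that implements the two methods by hand. Percentage is a hypothetical class made up for this illustration, and the options parameter is modelled on the Time.load_from_xml signature shown later in this document; passing it to fill_into_xml as well is an assumption. The example exercises the methods directly with REXML rather than through a node factory method.

```ruby
require 'rexml/document'

# Hypothetical value class, not part of xml-mapping.
class Percentage
  attr_reader :value

  def initialize(value)
    @value = value
  end

  # Build an instance from the given XML element.
  def self.load_from_xml(xml, options = {})
    new(xml.text.to_s.delete("%").to_f)
  end

  # Write this instance into the given (initially empty) XML element.
  def fill_into_xml(xml, options = {})
    xml.text = "#{value}%"
  end
end

source = REXML::Document.new("<discount>12.5%</discount>").root
discount = Percentage.load_from_xml(source)

target = REXML::Element.new("discount")
discount.fill_into_xml(target)
puts target.text
```

With these two methods in place, a declaration along the lines of object_node :discount, "discount", :class=>Percentage should be able to delegate to them.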
Polymorphic Sub-objects, Marshallers/Unmarshallers
Besides the :class keyword argument, there are alternative
ways for a single-attribute node with sub-objects to specify the way the
sub-objects are created from/marshalled into their subtrees.
First, it's possible not to specify anything at all – in that case, the class of a sub-object will be automatically deduced from the root element name of its subtree. This allows you to achieve a kind of “polymorphic”, late-bound way to decide about the sub-object's class. The following example document contains a hierarchical, recursive set of named “documents” and “folders”, where folders hold a set of entries, each of which may again be either a document or a folder:
<?xml version="1.0" encoding="ISO-8859-1"?>
<folder name="home">
<document name="plan">
<contents> inhale, exhale</contents>
</document>
<folder name="work">
<folder name="xml-mapping">
<document name="README">
<contents>foo bar baz</contents>
</document>
</folder>
</folder>
</folder>
This can be mapped to Ruby like this:
require 'xml/mapping'

class Entry
  include XML::Mapping

  text_node :name, "@name"
end

class Document < Entry
  include XML::Mapping

  text_node :contents, "contents"
end

class Folder < Entry
  include XML::Mapping

  array_node :entries, "document|folder", :default_value=>[]

  def [](name)
    entries.select{|e|e.name==name}[0]
  end

  def append(name,entry)
    entries << entry
    entry.name = name
    entry
  end
end
Usage:
root = XML::Mapping.load_object_from_file "documents_folders.xml"
=> #<Folder:0x007ff6499c0f58 @entries=[#<Document:0x007ff6499bb8a0 @name="plan", @contents=" inhale, exhale">, #<Folder:0x007ff6499ba298 @entries=[#<Folder:0x007ff6499b84e8 @entries=[#<Document:0x007ff6499b1fa8 @name="README", @contents="foo bar baz">], @name="xml-mapping">], @name="work">], @name="home">
root.name
=> "home"
root.entries
=> [#<Document:0x007ff6499bb8a0 @name="plan", @contents=" inhale, exhale">, #<Folder:0x007ff6499ba298 @entries=[#<Folder:0x007ff6499b84e8 @entries=[#<Document:0x007ff6499b1fa8 @name="README", @contents="foo bar baz">], @name="xml-mapping">], @name="work">]
root.append "etc", Folder.new
root["etc"].append "passwd", Document.new
root["etc"]["passwd"].contents = "foo:x:2:2:/bin/sh"
root["etc"].append "hosts", Document.new
root["etc"]["hosts"].contents = "127.0.0.1 localhost"
xml = root.save_to_xml
=> <folder name='home'> ... </>
xml.write $stdout,2
<folder name='home'>
<document name='plan'>
<contents>
inhale, exhale
</contents>
</document>
<folder name='work'>
<folder name='xml-mapping'>
<document name='README'>
<contents>
foo bar baz
</contents>
</document>
</folder>
</folder>
<folder name='etc'>
<document name='passwd'>
<contents>
foo:x:2:2:/bin/sh
</contents>
</document>
<document name='hosts'>
<contents>
127.0.0.1 localhost
</contents>
</document>
</folder>
</folder>
As you see, the Folder#entries attribute is mapped via an
array_node that does not specify a :class or anything else to
govern the instantiation of the array's elements. This causes
xml-mapping to deduce the class of each array element from the root element
name of the corresponding XML tree. In this example,
the root element name is either “document” or “folder”. The mapping between
root element names and class names is the one briefly described in the example at the beginning of this document – the
unqualified class name is just converted to lower case and “dashed”, e.g.
Foo::Bar::MyClass becomes “my-class”; and you may override this on a
per-class basis by calling root_element_name
"the-new-name" in the class body. In our example, the root
element name “document” leads to an instantiation of class
Document, and the root element name “folder” leads to an
instantiation of class Folder.
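The default naming rule described above can be approximated in plain Ruby as follows. This is a sketch, not the library's code: default_root_element_name is a hypothetical helper, and the exact handling of edge cases such as consecutive capital letters is an assumption.

```ruby
# Hypothetical helper approximating the default rule: take the unqualified
# class name, insert a dash at each lower-to-upper case boundary, downcase.
def default_root_element_name(class_name)
  unqualified = class_name.split("::").last
  unqualified.gsub(/([a-z0-9])([A-Z])/, '\1-\2').downcase
end

puts default_root_element_name("Foo::Bar::MyClass")   # my-class
puts default_root_element_name("Order")               # order
puts default_root_element_name("Folder")              # folder
```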
Incidentally, the last example shows that you can readily derive mapping
classes from one another (as said before, you can also derive mapping
classes from other classes, include other modules into them etc. at will).
This works the way you would expect – when deriving one mapping
class from another one, the list of nodes in effect when loading/saving
instances of the derived class will consist of all nodes of that class and
all superclasses, starting with the topmost superclass that has nodes
defined. There is one thing to take care of though: When deriving mapping
classes from one another, you have to make sure to include
XML::Mapping in each class. This requirement exists purely due to
ease-of-implementation considerations; there are probably ways to do away
with it, but the inconvenience seemed not severe enough for me to bother
(as yet). Still, you might get “strange” errors if you forget to do it for
a class.
Besides the :class keyword argument and no argument, there is
a third way to specify the way the sub-objects are created from/marshalled
into their subtrees: :marshaller and/or
:unmarshaller keyword arguments. Here you pass procs in which
you just do all the work manually. So this is basically a “catch-all” for
cases where the other two alternatives are not appropriate for the problem
at hand. Let's say we want to
extend the Signature class from the initial example to include
the date on which the signature was created. We want the new XML representation of such a signature to look like
this:
<Signature>
<Name>John Doe</Name>
<Position>product manager</Position>
<signed-on>
<day>13</day>
<month>2</month>
<year>2005</year>
</signed-on>
</Signature>
So, a new “signed-on” element was added that holds the day, month, and
year. In the Signature instance in Ruby, we want the date to
be stored in an attribute named signed_on of type
Time (that's Ruby's built-in Time class).
One could think of using object_node, but something like
object_node :signed_on, "signed-on", :class=>Time
won't work because Time isn't a mapping class and
doesn't define methods load_from_xml and
fill_into_xml (we could easily define those though; that
possibility is discussed later in this document). The fastest, most ad-hoc way to
achieve what we want is to use the :marshaller and :unmarshaller keyword arguments,
like this:
require 'xml/mapping'
require 'xml/xxpath_methods'

class Signature
  include XML::Mapping

  text_node :name, "Name"
  text_node :position, "Position", :default_value=>"Some Employee"
  object_node :signed_on, "signed-on",
              :unmarshaller=>proc{|xml|
                y,m,d = [xml.first_xpath("year").text.to_i,
                         xml.first_xpath("month").text.to_i,
                         xml.first_xpath("day").text.to_i]
                Time.local(y,m,d)
              },
              :marshaller=>proc{|xml,value|
                e = xml.elements.add; e.name = "year"; e.text = value.year
                e = xml.elements.add; e.name = "month"; e.text = value.month
                e = xml.elements.add; e.name = "day"; e.text = value.day

                # xml.first("year",:ensure_created=>true).text = value.year
                # xml.first("month",:ensure_created=>true).text = value.month
                # xml.first("day",:ensure_created=>true).text = value.day
              }
end
The :unmarshaller proc will be called whenever a
Signature instance is being read in from an XML source. The xml argument passed to the
proc contains (as a REXML::Element
instance) the XML subtree corresponding to the
node's attribute's sub-object currently being read. In the case of
our object_node, the sub-object is just the node's
attribute (signed_on) itself, and the subtree is the one
rooted at the <signed-on> element (if this were e.g. an
array_node, the :unmarshaller proc would be
called once for each array element, and xml would hold the
subtree corresponding to the “current” array element). The proc is expected
to extract the sub-object's data from xml and return the
sub-object. So we have to read the “year”, “month”, and “day” elements,
construct a Time instance from them and return that. One could
just use the REXML API to do that, but I've
decided here to use the XPath interpreter that comes with xml-mapping
(xml/xxpath), and specifically the 'xml/xxpath_methods' utility
library that adds methods like first_xpath to REXML::Element. We
call first_xpath on xml three times, passing XPath
expressions to extract the “year”/“month”/“day” sub-elements, construct the
Time instance from that and return it. The XPath library is
explained in more detail below.
The :marshaller proc will be called whenever a
Signature instance is being written into an XML tree. xml is again the XML subtree rooted at the <signed-on> element (it
will still be empty when this proc is called), and value is
the current value of the sub-object (again, since this is an
object_node, value is the node's attribute,
i.e. the Time instance). We have to fill xml with
the data from value here. So we add three elements “year”,
“month” and “day” and set their texts to the corresponding values from
value. The commented-out code shows an alternative
implementation of the same thing using the XPath interpreter.
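To see the contract of the two procs in isolation, they can be called by hand on a bare REXML element, without any mapping class involved. The sketch below uses only the plain REXML API instead of xml/xxpath_methods, so it is a variation on the procs above rather than a copy of them; the variable names are made up for the illustration.

```ruby
require 'rexml/document'

# Reads <year>, <month>, <day> from the given element and returns a Time.
unmarshaller = proc { |xml|
  y, m, d = %w[year month day].map { |name| xml.elements[name].text.to_i }
  Time.local(y, m, d)
}

# Fills the given (empty) element with <year>, <month>, <day> sub-elements.
marshaller = proc { |xml, value|
  xml.add_element("year").text  = value.year.to_s
  xml.add_element("month").text = value.month.to_s
  xml.add_element("day").text   = value.day.to_s
}

signed_on = REXML::Element.new("signed-on")
marshaller.call(signed_on, Time.local(2005, 2, 13))
restored = unmarshaller.call(signed_on)

puts signed_on
puts restored == Time.local(2005, 2, 13)
```

A round trip through both procs should give back an equal Time value, which is a cheap check to run before wiring the procs into a node declaration.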
It should be mentioned again that :marshaller/:unmarshaller procs are
possible with all single-attribute nodes with sub-objects, i.e. with
object_node, array_node, and
hash_node. So, if you wanted to map a whole array of date
values, you could use array_node with the same
:marshaller/:unmarshaller procs as above, for example:
array_node :birthdays, "birthdays", "birthday",
:unmarshaller=> <as above>,
:marshaller=> <as above>
You can see that :marshaller/:unmarshaller procs give you more flexibility,
but they also impose more work because you essentially have to do all the
work of marshalling/unmarshalling the sub-objects yourself. If you find
yourself copying and pasting marshaller/unmarshaller procs all over the
place, you should instead define your own node type or mix the
marshalling/unmarshalling capabilities into the Time class
itself. Both approaches are explained later in this document, and you'll see that it's not
really much more work than writing :marshaller and :unmarshaller procs (you
essentially just move the code from those procs into your own node type
or into the Time class), so you should not hesitate to do
this.
Another thing worth mentioning is that you don't have to specify
both a :marshaller and an :unmarshaller simultaneously. You can as
well give only one of them, and in addition to that pass a
:class argument or no argument. When you do that, the
specified marshaller (or unmarshaller) will be used when marshalling (resp.
unmarshalling) the sub-objects, and the other passed argument
(:class or none) will be employed when unmarshalling (resp.
marshalling) the sub-objects. So, in effect, you can deactivate or
“short-cut” some part of the marshalling/unmarshalling functionality of a
node type while retaining another part.
Attribute Handling Details, Augmenting Existing Classes
I'll shed some more light on how single-attribute nodes add mapped attributes to Ruby classes. An attribute declaration like
text_node :city, "City"
maps some portion of the XML tree (here: the “City”
sub-element) to an attribute (here: “city”) of the class whose body the
declaration appears in. When writing (marshalling) instances of the
surrounding class into an XML document, xml-mapping
will read the attribute value from the instance using the function named
city; when reading (unmarshalling) an instance from an XML document, xml-mapping will use the one-parameter
function city= to set the attribute in the instance to the
value read from the XML document.
If these functions don't exist at the time the node declaration is
executed, xml-mapping adds default implementations that simply read/write
the attribute value to instance variables that have the same name as the
attribute. For example, the city attribute declaration in the
Address class in the example added functions city
and city= that read/write from/to the instance variable
@city.
If, however, these functions already exist prior to defining the attributes, xml-mapping will leave them untouched, so your precious self-written accessor methods that do whatever complicated internal processing of the data won't be overwritten.
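The "only add accessors that are missing" behaviour can be sketched in plain Ruby like this. It is not xml-mapping's actual code; DefaultAccessors, mapped_attr, and the Temperature class are hypothetical names used only for illustration.

```ruby
# Hypothetical module: defines a default reader/writer for an attribute
# only if the class does not already have one.
module DefaultAccessors
  def mapped_attr(name)
    attr_reader name unless method_defined?(name)
    attr_writer name unless method_defined?("#{name}=")
  end
end

class Temperature
  extend DefaultAccessors

  # Self-written setter with some internal processing.
  def celsius=(value)
    @celsius = Float(value).round(1)
  end

  # The reader is generated; the existing writer is left untouched.
  mapped_attr :celsius
end

t = Temperature.new
t.celsius = "21.46"
puts t.celsius   # 21.5, because the custom setter was used
```

The ordering matters in this sketch just as in the text above: the custom method has to exist before the attribute is declared.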
This means that you can not only create new mapping classes from scratch, you can also take existing classes that contain some “business logic” and “augment” them with xml-mapping capabilities. As a simple example, let's augment Ruby's “Time” class with node declarations that declare XML mappings for the day, month etc. fields:
class Time
include XML::Mapping
numeric_node :year, "year"
numeric_node :month, "month"
numeric_node :day, "mday"
numeric_node :hour, "hours"
numeric_node :min, "minutes"
numeric_node :sec, "seconds"
end
nowxml=Time.now.save_to_xml
=> <time> ... </>
nowxml.write($stdout,2)
<time>
<year>
2015
</year>
<month>
3
</month>
<mday>
1
</mday>
<hours>
15
</hours>
<minutes>
31
</minutes>
<seconds>
6
</seconds>
</time>
Here XML mappings are defined for the existing
fields year, month etc. Xml-mapping noticed that
the getter methods for those attributes existed, so it didn't overwrite
them. When calling save_to_xml on a Time object,
these methods are called and return the object's values for those
fields, which then get written to the output XML.
So you can convert Time objects into XML trees. What about reading them back in from XML? All XML reading operations
go through <Class>.load_from_xml. The
load_from_xml class method inherited from XML::Mapping (see XML::Mapping::ClassMethods#load_from_xml)
allocates a new instance of the class (Time), then calls
fill_from_xml (i.e. XML::Mapping#fill_from_xml)
on it. fill_from_xml iterates over all our nodes in the order
of their definition. For each node, its data (the <year>, or
<month>, or <mday> etc. element) is read from the XML source and then written to the Time
instance via the respective setter method (year=,
month=, day= etc.). These methods didn't
exist in Time before (Time objects are
immutable), so xml-mapping defined its own, default setter methods that
just set @year, @month etc. This is of course
pretty useless because Time objects don't hold their time
in these variables, so the setter methods don't really change the time
of the Time object. So we have to redefine
load_from_xml for the Time class:
def Time.load_from_xml(xml, options={:mapping=>:_default})
  year,month,day,hour,min,sec = [xml.first_xpath("year").text.to_i,
                                 xml.first_xpath("month").text.to_i,
                                 xml.first_xpath("mday").text.to_i,
                                 xml.first_xpath("hours").text.to_i,
                                 xml.first_xpath("minutes").text.to_i,
                                 xml.first_xpath("seconds").text.to_i]
  Time.local(year,month,day,hour,min,sec)
end
Other Nodes
All nodes I've shown so far (node types text_node, numeric_node, boolean_node, object_node, array_node, and hash_node) were single-attribute nodes: The first parameter to the node factory method of such a node is an attribute name, and the attribute of that name is the only piece of the state of instances of the node's mapping class that gets read/written by the node.
choice_node
There is one node type distributed with xml-mapping that is not a
single-attribute node: choice_node. A choice_node
allows you to specify a sequence of pairs, each consisting of an XPath
expression and another node (any node is supported here, including other
choice_nodes). When reading in an XML source, the
choice_node will delegate the work to the first node in the sequence whose
corresponding XPath expression was matched in the XML. When writing an object back to XML, the choice_node will delegate the work to the
first node whose data was “present” in the object (for single-attribute
nodes, the data is considered “present” if the node's attribute is
non-nil; for choice_nodes, the data is considered “present” if at least one
of the node's sub-nodes is “present”).
As a (somewhat contrived) example, here's a mapping for
Publication objects that have either a single author
(contained in an “author” XML attribute) or several
“contributors” (contained in a sequence of “contr” XML elements):
class Publication
include XML::Mapping
choice_node :if, '@author', :then, (text_node :author, '@author'),
:elsif, 'contr', :then, (array_node :contributors, 'contr', :class=>String)
end
### usage
p1 = Publication.load_from_xml(REXML::Document.new('<publication author="Jim"/>').root)
=> #<Publication:0x007ff64a166a78 @author="Jim">
p2 = Publication.load_from_xml(REXML::Document.new('
<publication>
<contr>Chris</contr>
<contr>Mel</contr>
<contr>Toby</contr>
</publication>').root)
=> #<Publication:0x007ff64a155f48 @contributors=["Chris", "Mel", "Toby"]>
The symbols :if, :then, and :elsif (but not :else – see below) in the
choice_node's node factory method call are ignored; they
may be sprinkled across the argument list at will (preferably the way shown
above of course) to increase readability.
The rest of the arguments specify the mentioned sequence of XPath expressions and corresponding nodes.
When reading a Publication object from XML, the XPath expressions from the
choice_node (@author and contr) will
be matched in sequence against the source XML tree
until a match is found or the end of the argument list is reached. If the
end is reached, an exception is raised. Otherwise, for the first XPath
expression that matched, the corresponding node will be invoked (i.e. used
to read actual data from the XML source into the
Publication object). If you specify :else, :default, or :otherwise
in place of an XPath expression, this is treated as an XPath expression
that always matches. So you can use :else (or :default or :otherwise) for a
“fallback” node that will be used if none of the other XPath expressions
matched (an example for this follows).
When writing a Publication object back to XML, the first node in the sequence whose data is
“present” in the source object will be invoked to write data from the
object into the target XML tree (and the
corresponding XPath expression will be created in the XML tree if it doesn't exist already). If there is
no such node in the sequence, an exception is raised. As said above, for
single-attribute nodes, the node's data is considered “present” if the
node's attribute is non-nil. So, if you write a
Publication object to XML, and either
the author or the contributors attribute of the
object is set, it will be written; if both attributes are nil, an exception
will be raised.
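The two selection rules (first matching XPath when reading, first "present" attribute when writing) can be sketched in plain Ruby. This is only an illustration under simplifying assumptions, not xml-mapping code: Alternative, choose_for_reading and choose_for_writing are made-up names, an XML element is reduced to the list of XPath expressions that would match it, and an object is reduced to a Hash of attributes.

```ruby
# Hypothetical stand-in for the selection rule of a choice_node.
# Not xml-mapping code: XML is modelled as the list of XPath expressions
# that would match, and the object as a Hash of attribute values.
Alternative = Struct.new(:xpath, :attr)

# Reading: the first alternative whose XPath matched wins; :else always matches.
def choose_for_reading(alternatives, matching_xpaths)
  alternatives.find { |alt| alt.xpath == :else || matching_xpaths.include?(alt.xpath) } or
    raise ArgumentError, "no alternative matched the XML"
end

# Writing: the first alternative whose attribute is non-nil wins.
def choose_for_writing(alternatives, object)
  alternatives.find { |alt| !object[alt.attr].nil? } or
    raise ArgumentError, "no alternative has data present"
end

publication = [Alternative.new("@author", :author),
               Alternative.new("contr", :contributors)]

puts choose_for_reading(publication, ["contr"]).attr             # contributors
puts choose_for_writing(publication, { :author => "Jim" }).attr  # author
```

If both XPath expressions match, or both attributes are set, the alternative listed first wins in this sketch, which mirrors the "first in the sequence" wording above.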
A frequent use case for choice_nodes will probably be object attributes
that may be represented in multiple alternative ways in XML. As an example, consider “Person” objects where the
name of the person should be stored alternatively in a sub-element named
name, or an attribute named name, or in the text
of the person element itself. You can achieve this with
choice_node like this:
class Person
include XML::Mapping
choice_node :if, 'name', :then, (text_node :name, 'name'),
:elsif, '@name', :then, (text_node :name, '@name'),
:else, (text_node :name, '.')
end
### usage
p1 = Person.load_from_xml(REXML::Document.new('<person name="Jim"/>').root)
=> #<Person:0x007ff64a1cd660 @name="Jim">
p2 = Person.load_from_xml(REXML::Document.new('<person><name>James</name></person>').root)
=> #<Person:0x007ff64a1c54b0 @name="James">
p3 = Person.load_from_xml(REXML::Document.new('<person>Suzy</person>').root)
=> #<Person:0x007ff64a1b6820 @name="Suzy">
p1.save_to_xml.write($stdout)
<person><name>Jim</name></person>
p2.save_to_xml.write($stdout)
<person><name>James</name></person>
p3.save_to_xml.write($stdout)
<person><name>Suzy</name></person>
Here all sub-nodes of the choice_node are single-attribute nodes
(text_nodes) with the same attribute (name). As you see, when
writing persons to XML, the name is always stored in
a <name> sub-element. Of course, this is because that alternative
appears first in the choice_node.
Readers/Writers¶ ↑
Finally, all nodes support keyword arguments :reader and :writer which allow you to extend or completely override the reading and/or writing functionality of the node with your own code. The :reader as well as the :writer argument must be a proc that takes as its arguments the Ruby object to be read/written (instance of the mapping class the node belongs to) and the XML tree to be written to/read from. An optional third argument may be specified – it will receive a proc that wraps the default reader/writer functionality of the node.
The :reader proc is for reading (from the XML into the object), the :writer proc is for writing (from the object into the XML).
Here's a (really contrived) example:
class Foo
include XML::Mapping
text_node :name, "@name", :reader=>proc{|obj,xml,default_reader|
default_reader.call(obj,xml)
obj.name += xml.attributes['more']
},
:writer=>proc{|obj,xml|
xml.attributes['bar'] = "hi #{obj.name} ho"
}
end
f = Foo.load_from_xml(REXML::Document.new('<foo name="Jim" more="XYZ"/>').root)
=> #<Foo:0x007ff64a10e8c8 @name="JimXYZ">
xml = f.save_to_xml
xml.write $stdout,2
<foo bar='hi JimXYZ ho'/>
So there's a “Foo” class with a text_node that would by default
(without the :reader and :writer proc) map the Ruby attribute “name” to the
XML attribute “name”. The :reader proc is invoked
when reading from XML into a Foo
object. The xml argument is the XML
tree, obj is the object. default_reader is the
proc that wraps the default reading functionality of the node. We invoke it
at the beginning. For this text_node, the default reading functionality is
to take the text of the “name” attribute of xml and put it
into the name attribute of obj. After that, we
take the text of the “more” attribute of xml and append it to
the name attribute of obj. So the XML tree <foo name="Jim"
more="XYZ"/> is converted to a Foo object
with name=“JimXYZ”.
In our :writer proc, we only take obj (the Foo
object to be written to XML) and xml
(the XML tree the stuff is to be written to).
Analogously to the :reader, we could take a proc that wraps the default
writing functionality of the node, but we don't do that here–we
completely override the writing functionality with our own code, which just
takes the name attribute of the object and writes “hi <the
name> ho” to a bar XML attribute in
the XML tree (stupid example, I know).
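To make the difference between a two-argument and a three-argument proc more tangible, here is a small sketch in plain Ruby. It assumes the proc's arity decides whether the default behaviour is handed over; that is an assumption for illustration, and xml-mapping's real implementation may differ. invoke_reader and Thing are made-up names, and the XML element is modelled as a plain Hash of attribute values.

```ruby
# Hypothetical dispatcher; xml-mapping's real implementation may differ.
# The "XML element" is modelled as a plain Hash of attribute values.
def invoke_reader(reader, obj, xml, default_reader)
  if reader.arity == 2
    reader.call(obj, xml)                  # reader replaces the default completely
  else
    reader.call(obj, xml, default_reader)  # reader may call the default itself
  end
end

Thing = Struct.new(:name)
default_reader = proc { |obj, xml| obj.name = xml["name"] }

# Augments the default behaviour, like the :reader in the Foo example.
augmenting = proc { |obj, xml, default|
  default.call(obj, xml)
  obj.name += xml["more"]
}
# Ignores the default behaviour altogether.
overriding = proc { |obj, xml| obj.name = "constant" }

xml = { "name" => "Jim", "more" => "XYZ" }
a = Thing.new
invoke_reader(augmenting, a, xml, default_reader)
b = Thing.new
invoke_reader(overriding, b, xml, default_reader)
puts a.name  # JimXYZ
puts b.name  # constant
```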
As a special convention, if you specify both a :reader and a :writer for a
node, and in both cases you do /not/ call the default behaviour, then you
should use the generic node type node, e.g.:
class SomeClass
include XML::Mapping
....
node :reader=>proc{|obj,xml| ...},
:writer=>proc{|obj,xml| ...}
end
(since you're completely replacing both the reading and the writing functionality, you're effectively replacing all the functionality of the node, so it would be pointless and confusing to use one of the more “specific” node types)
As you see, the purpose of readers and writers is to make it possible to
augment or override a node's functionality arbitrarily, so there
shouldn't be anything that's absolutely impossible to achieve with
xml-mapping. However, if you use readers and writers without invoking the
default behaviour, you really do everything manually, so you're not
doing any less work than you would do if you weren't using xml-mapping
at all. So you'll probably use readers and/or writers for those bits of
your mapping semantics that can't be achieved with xml-mapping's
predefined node types (an alternative approach might be to override the
post_load and/or post_save instance methods on
the mapping class – see the reference documentation).
Advice similar to that given above for marshallers/unmarshallers applies here as well: if you find yourself writing lots of readers and writers that differ only in some easily parameterizable aspects, you should think about defining your own node types. We talk about that below; it generally just means moving the (sensibly parameterized) code from your readers/writers into your node types.
Multiple Mappings per Class¶ ↑
Sometimes you might want to represent the same Ruby object in multiple alternative ways in XML. For example, the name of a “Person” object could be represented either in a “name” element or a “name” attribute.
xml-mapping supports this by allowing you to define multiple disjoint
“mappings” for a mapping class. A mapping is by convention identified with
a symbol, e.g. :my_mapping, :other_mapping etc.,
and each mapping comprises a root element name and a set of node
definitions. In the body of a mapping class definition, you switch to
another mapping with use_mapping :the_mapping. All following
node declarations will be added to that mapping unless you specify
the option :mapping=>:another_mapping for a node declaration (all node
types support that option). The default mapping (the mapping used if there
was no previous use_mapping in the class body) is named
:_default.
All the worker methods like load_from_xml/file,
save_to_xml/file, load_object_from_xml/file
support a :mapping keyword argument to specify the mapping,
which again defaults to :_default.
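The bookkeeping described above (a current mapping switched by use_mapping, a per-node :mapping override, and :_default as the fallback) can be modelled with a few lines of plain Ruby. This is a rough sketch only: MiniMapping, nodes_for and PersonSketch are made-up names, and the real library stores actual node objects rather than attribute/path pairs.

```ruby
# Hypothetical mini-DSL illustrating per-mapping node lists; not xml-mapping code.
module MiniMapping
  # Switch the mapping that subsequent node declarations are added to.
  def use_mapping(name)
    @current_mapping = name
  end

  # One list of node declarations per mapping name.
  def nodes_for(mapping = :_default)
    (@nodes ||= Hash.new { |h, k| h[k] = [] })[mapping]
  end

  # A per-node :mapping option overrides the current mapping.
  def text_node(attr, path, options = {})
    mapping = options.fetch(:mapping) { @current_mapping || :_default }
    nodes_for(mapping) << [attr, path]
  end
end

class PersonSketch
  extend MiniMapping
  text_node :name, "@name"                       # goes to :_default
  use_mapping :other
  text_node :name, "name"                        # goes to :other
  text_node :age, "@age", :mapping => :_default  # explicit override
end

p PersonSketch.nodes_for(:_default)  # [[:name, "@name"], [:age, "@age"]]
p PersonSketch.nodes_for(:other)     # [[:name, "name"]]
```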
In the following example, we define two mappings (the default one and a
mapping named :other) for Person objects with a
name, an age and an address:
require 'xml/mapping'
class Address; end
class Person
include XML::Mapping
# the default mapping. Stores the name and age in XML attributes,
# and the address in a sub-element "address".
text_node :name, "@name"
numeric_node :age, "@age"
object_node :address, "address", :class=>Address
use_mapping :other
# the ":other" mapping. Non-default root element name; name and age
# stored in XML elements; address stored in the person's element
# itself
root_element_name "individual"
text_node :name, "name"
numeric_node :age, "age"
object_node :address, ".", :class=>Address
# you could also specify the mapping on a per-node basis with the
# :mapping option, e.g.:
#
# numeric_node :age, "age", :mapping=>:other
end
class Address
include XML::Mapping
# the default mapping.
text_node :street, "street"
numeric_node :number, "number"
text_node :city, "city"
numeric_node :zip, "zip"
use_mapping :other
# the ":other" mapping.
text_node :street, "street-name"
numeric_node :number, "street-name/@number"
text_node :city, "city-name"
numeric_node :zip, "city-name/@zip-code"
end
### usage
## XML representation of a person in the default mapping
xml = REXML::Document.new('
<person name="Suzy" age="28">
<address>
<street>Abbey Road</street>
<number>72</number>
<city>London</city>
<zip>18827</zip>
</address>
</person>').root
## load using the default mapping
p = Person.load_from_xml xml
=> #<Person:0x007ff64a23e9c8 @name="Suzy", @age=28, @address=#<Address:0x007ff64a23d4b0 @street="Abbey Road", @number=72, @city="London", @zip=18827>>
## save using the default mapping
xml2 = p.save_to_xml
xml2.write $stdout,2
<person name='Suzy' age='28'>
<address>
<street>
Abbey Road
</street>
<number>
72
</number>
<city>
London
</city>
<zip>
18827
</zip>
</address>
</person>
## xml2 identical to xml
## now, save the same person to XML using the :other mapping...
other_xml = p.save_to_xml :mapping=>:other
other_xml.write $stdout,2
<individual>
<name>
Suzy
</name>
<age>
28
</age>
<street-name number='72'>
Abbey Road
</street-name>
<city-name zip-code='18827'>
London
</city-name>
</individual>
## load it again using the :other mapping
p2 = Person.load_from_xml other_xml, :mapping=>:other
=> #<Person:0x007ff64a20c838 @name="Suzy", @age=28, @address=#<Address:0x007ff64a2079a0 @street="Abbey Road", @number=72, @city="London", @zip=18827>>
## p2 identical to p
In this example, each of the two mappings contains nodes that map the same set of Ruby attributes (name, age and address). This is probably what you want most of the time (since you're normally defining multiple XML mappings for the same Ruby data), but it's not a necessity at all. When a mapping class is defined, xml-mapping will add all Ruby attributes from all mappings to it.
You may have noticed that the object_nodes in the
Person class apply the mapping they were themselves defined in
to their subordinate class (Address). This is the case for
all Single-attribute Nodes with Sub-objects
(object_node, array_node and
hash_node) unless you explicitly specify a different mapping
for the sub-object(s) using the option :sub_mapping, e.g.
object_node :address, "address", :class=>Address, :sub_mapping=>:other
Defining your own Node Types¶ ↑
It's easy to write additional node types and register them with the
xml-mapping library (the following node types come with xml-mapping:
node, text_node, numeric_node,
boolean_node, object_node,
array_node, hash_node, choice_node).
I'll first show an example, then some more theoretical insight.
Example¶ ↑
Let's say we want to extend the Signature class from the
example to include the time at which the signature was created. We want the
new XML representation of such a signature to look
like this:
<Signature>
<Name>John Doe</Name>
<Position>product manager</Position>
<signed-on>
<day>13</day>
<month>2</month>
<year>2005</year>
</signed-on>
</Signature>
(we only save year, month and day to make this example shorter), and the mapping class declaration to look like this:
class Signature
  include XML::Mapping
  text_node :name, "Name"
  text_node :position, "Position", :default_value=>"Some Employee"
  time_node :signed_on, "signed-on", :default_value=>Time.now
end
(i.e. a new “time_node” declaration was added).
We want this time_node call to define an attribute named
signed_on which holds the date value from the XML document in an instance of class Time.
This node type can be defined with this piece of code:
require 'xml/mapping/base'
class TimeNode < XML::Mapping::SingleAttributeNode
  def initialize(*args)
    path,*args = super(*args)
    @y_path = XML::XXPath.new(path+"/year")
    @m_path = XML::XXPath.new(path+"/month")
    @d_path = XML::XXPath.new(path+"/day")
    args
  end
  def extract_attr_value(xml)
    y,m,d = default_when_xpath_err{ [@y_path.first(xml).text.to_i,
                                     @m_path.first(xml).text.to_i,
                                     @d_path.first(xml).text.to_i]
                                  }
    Time.local(y,m,d)
  end
  def set_attr_value(xml, value)
    @y_path.first(xml,:ensure_created=>true).text = value.year
    @m_path.first(xml,:ensure_created=>true).text = value.month
    @d_path.first(xml,:ensure_created=>true).text = value.day
  end
end
XML::Mapping.add_node_class TimeNode
The last line registers the new node type with the xml-mapping library. The name of the node factory method (“time_node”) is automatically derived from the class name of the node type (“TimeNode”).
There will be one instance of the node type TimeNode per
time_node declaration per mapping class (not per mapping class
instance). That instance (the “node” for short) will be created by the node
factory method (time_node); there's no need to instantiate
the node type directly. The time_node method places the node
into the mapping class; the @owner attribute of the node is set to
reference the mapping class. The node factory method passes the mapping
class the node appears in (Signature), followed by its own
arguments, to the node's constructor. In the example, the
time_node method calls TimeNode.new(Signature,
:signed_on, "signed-on", :default_value=>Time.now).
new of course creates the node and then delegates the
arguments to our initializer initialize. We first call the
superclass's initializer, which strips off from the argument list those
arguments it handles itself, and returns the remaining ones. In this case,
the superclass XML::Mapping::SingleAttributeNode
handles the Signature, :signed_on and
:default_value=>Time.now arguments – Signature
is stored into @owner, :signed_on is stored into
@attrname, and {:default_value=>Time.now} is
stored into @options. The remaining argument list
["signed-on"] is returned; we capture the
"signed-on" string in path and the rest of the
argument list (an empty array) in args, which we return
at the end of the initializer. Returning it isn't strictly necessary; it's
just a convention that a node class initializer should always return those
arguments it didn't handle itself. We'll interpret path
as an XPath expression that locates the time value relative to the parent
mapping object's XML tree (in this case, this
would be the XML tree rooted at the
<Signature> element, i.e. the tree the
Signature instance was read from). We'll later have to
read/store the year, month, and day values from
path+"/year", path+"/month",
and path+"/day", respectively, so we create (and
precompile) three corresponding XPath expressions using XML::XXPath.new and store them into
member variables of the node. XML::XXPath is
an XPath implementation that is bundled with xml-mapping. It is very
incomplete, but it supports writing (not just reading) of XML nodes, which is needed to support writing data back
to XML. The XML::XXPath library is explained in more detail
below.
The extract_attr_value method is called whenever an instance
of the mapping class the node belongs to (Signature in the
example) is being created from an XML tree. The
parameter xml is that tree (again, this is the tree rooted at the
<Signature> element in this example). The method
implementation is expected to extract the single attribute's value from
xml and return it, or raise XML::Mapping::SingleAttributeNode::NoAttrValueSet
if the attribute was “unset” in the XML (this
exception tells the framework that the default value should be put in place
if it was defined), or raise any other exception to signal an error and
abort the whole process. Our superclass XML::Mapping::SingleAttributeNode
will store the returned single attribute's value into the
signed_on attribute of the Signature instance
being read in. In our implementation, we apply the xpath expressions
created during initialization to xml (e.g.
@y_path.first(xml)). An expression
xpath_expr.first(xml) returns (as a REXML element) the first sub-element of xml
that matches xpath_expr, or raises XML::XXPathError if there was no such
element. We apply REXML's text method to the returned element
to get out the element's text, convert it to integer, and supply it to
the constructor of the Time object to be returned. As a side
note, if an XPath expression matches XML attributes,
XML::XXPath methods like first will
return XML::XXPath::Accessors::Attribute
nodes that behave similarly to REXML::Element nodes, including support for
messages like name and text, so this would've worked
also if our XPath expressions had referred to XML
attributes, not elements. The default_when_xpath_err helper
calls the supplied block and returns its value, but maps the exception XML::XXPathError to the mentioned XML::Mapping::SingleAttributeNode::NoAttrValueSet
(any other exceptions fall through unchanged). As said above, XML::Mapping::SingleAttributeNode::NoAttrValueSet
is caught by the framework (more precisely, by our superclass XML::Mapping::SingleAttributeNode),
and the default value is set if it was provided. So you should just wrap
default_when_xpath_err around any applications of XPath
expressions whose non-presence in the XML you want
to be considered a non-presence of the attribute you're trying to
extract. (XML::XXPath is designed to know nothing about XML::Mapping, so it doesn't raise XML::Mapping::SingleAttributeNode::NoAttrValueSet
directly.)
The set_attr_value method is called whenever an instance of
the mapping class the node belongs to (Signature in the
example) is being stored into an XML tree. The
xml parameter is the XML tree (a REXML element node; here this is again the tree
rooted at the <Signature> element); value is
the current value of the single attribute (in this example, the
signed_on attribute of the Signature instance
being stored). xml will most probably be “half-populated” by the
time this method is called – the framework calls the
set_attr_value methods of all nodes of a mapping class in the
order of their definition, letting each node fill its “bit” into
xml. The method implementation is expected to write value
into (the correct sub-elements of) xml, or raise an exception to
signal an error and abort the whole process. No default value handling is
done here; set_attr_value won't be called at all if the
attribute had been set to its default value. In our implementation we grab
the year, month and day values from value (which must be a
Time), and store them into the sub-elements of xml
identified by XPath expressions @y_path, @m_path
and @d_path, respectively. We do this by calling XML::XXPath#first with an
additional parameter :ensure_created=>true. An expression
xpath_expr.first(xml,:ensure_created=>true) works just
like xpath_expr.first(xml) if xpath_expr was
already present in xml. If it was not, it is created (preferably
at the end of xml's list of sub-nodes), and returned. See below for more detailed documentation of the XPath
interpreter.
Element order in created XML documents¶ ↑
As just said, XML::XXPath, when used to
create new XML nodes, generally appends those nodes
to the end of the list of subnodes of the node the xpath expression was
applied to. All xml-mapping nodes that come with xml-mapping use XML::XXPath when writing data to XML, and therefore also append their data to the XML data written by preceding nodes (the nodes are
invoked in the order of their definition). This means that, generally, your
output data will appear in the XML document in the
same order in which the corresponding xml-mapping node definitions appeared
in the mapping class (unless you used XPath expressions with a positional index, such as foo[3], which explicitly dictate a fixed position in the
sequence of XML nodes). For instance, in the
Order class from the example at the beginning of this
document, if we put the :signatures node before the
:items node, the <Signed-By> element will
appear before the sequence of <Item> elements
in the output XML.
The following is a more systematic overview of the basic node types. The description is self-contained, so some information from the previous section will be repeated.
Node Types Are Ruby Classes¶ ↑
A node type is implemented as a Ruby class derived from XML::Mapping::Node or one of its subclasses.
The following node types (node classes) come with xml-mapping (they all live in the XML::Mapping namespace, which I've left out here for brevity):
Node
 +-SingleAttributeNode
 |  +-SubObjectBaseNode
 |  |  +-ObjectNode
 |  |  +-ArrayNode
 |  |  +-HashNode
 |  +-TextNode
 |  +-NumericNode
 |  +-BooleanNode
 +-ChoiceNode
XML::Mapping::Node is the base class
for all nodes, XML::Mapping::SingleAttributeNode
is the base class for single-attribute nodes,
and XML::Mapping::SubObjectBaseNode
is the base class for single-attribute nodes
with sub-objects. XML::Mapping::TextNode, XML::Mapping::ArrayNode etc. are of
course the text_node, array_node etc. we've
talked about in this document. When you've written a new node class,
you register it with xml-mapping by calling
XML::Mapping.add_node_class MyNode. When you do that,
xml-mapping automatically defines the node factory method for your class –
the method's name (e.g. my_node) is derived from the
node's class name (e.g. Foo::Bar::MyNode) by stripping all parent
module names, and then converting capital letters to lowercase and
preceding them with an underscore. In fact, this is just how all the
predefined node types are defined – those node types are not “special”;
they're defined in the source file
xml/mapping/standard_nodes.rb and then registered normally in
xml/mapping.rb. The source code of the built-in nodes is not
very long or complicated; you may consider reading it in addition to this
text to gain a better understanding.
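The naming rule just described can be written down in a few lines of plain Ruby. This is only a sketch of the rule as stated in the text; the helper name factory_method_name is made up for illustration and is not part of xml-mapping.

```ruby
# Hypothetical helper that follows the rule described above:
# strip parent module names, then turn each capital letter into
# an underscore followed by its lowercase form.
def factory_method_name(class_name)
  base = class_name.split("::").last
  base.gsub(/[A-Z]/) { |c| "_" + c.downcase }.sub(/\A_/, "")
end

puts factory_method_name("Foo::Bar::MyNode")        # my_node
puts factory_method_name("TimeNode")                # time_node
puts factory_method_name("XML::Mapping::TextNode")  # text_node
```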
How Node Types Work¶ ↑
The xml-mapping core “operates” node types as follows:
Node Initialization¶ ↑
As said above, when a node class is registered with xml-mapping by calling
XML::Mapping.add_node_class TheNodeClass, xml-mapping
automatically generates the node factory method for that type. The node
factory method will effectively be defined as a class method of the XML::Mapping module, which is why one can call
it from the body of a mapping class definition. The generated method will
create a new instance of the node class (a node for short) by
calling new on the node class. The list of parameters to
new will consist of the mapping class, followed by all
arguments that were passed to the node factory method. For example,
when you have this node declaration:
class MyMappingClass
  include XML::Mapping
  my_node :foo, "bar", 42, :hi=>"ho"
end
, then the node factory method (my_node) calls
MyNode.new(MyMappingClass, :foo, "bar", 42,
:hi=>"ho").
new of course creates the instance and calls initialize on it. The initialize implementation will generally store the parameters into some instance variables for later usage. As a convention, initialize should always extract from the parameter list those parameters it processes itself, process them, and return an array containing the remaining (still unprocessed) parameters. Thus, an implementation of initialize follows this pattern:
def initialize(*args)
  myparam1,myparam2,...,myparamx,*args = super(*args)
  .... process the myparam1,myparam2,...,myparamx ....
  # return still unprocessed args
  args
end
(since the called superclass initializer is written the same way, the parameter array it returns will already be stripped of all parameters that the superclass initializer (or any of its superclasses' initializers) processed)
This technique is a simple way to “chain” the initializers of all superclasses of a node class, starting with the topmost one (Node), so that each initializer can easily find out and process the parameters it is responsible for.
The base node class XML::Mapping::Node provides an initialize implementation that, among other things (described below), adds self (i.e. the created node) to the internal list of nodes held by the mapping class, and sets the @owner attribute of self to reference the mapping class.
So, effectively there will be one instance of a node class (a node) per node definition, and that instance lives in the mapping class the node was defined in.
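Here is a self-contained sketch of the initializer chaining convention in plain Ruby. The three Sketch... classes are stand-ins invented for this illustration; they only mimic which arguments the corresponding xml-mapping classes are described as consuming, and they do not register themselves with any mapping class.

```ruby
# Stand-in classes (not xml-mapping code) showing how each initializer
# takes the arguments it is responsible for and returns the rest.
class SketchNode
  attr_reader :owner, :options
  def initialize(owner, *args)
    @owner = owner
    # a trailing Hash is treated as the option arguments
    @options = args.last.is_a?(Hash) ? args.pop : {}
    args  # still unprocessed arguments
  end
end

class SketchSingleAttributeNode < SketchNode
  attr_reader :attrname
  def initialize(*args)
    @attrname, *args = super(*args)
    args
  end
end

class SketchPathNode < SketchSingleAttributeNode
  attr_reader :path
  def initialize(*args)
    @path, *args = super(*args)
    args
  end
end

# String merely plays the role of the mapping class here.
node = SketchPathNode.new(String, :signed_on, "signed-on", :default_value => 0)
p node.owner     # String
p node.attrname  # :signed_on
p node.path      # "signed-on"
```

Note that new discards the value returned by initialize; the returned array only matters to subclasses that call super.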
Node Operation during Marshalling and Unmarshalling¶ ↑
When an instance of a mapping class is created or filled from an XML tree, xml-mapping will call xml_to_obj
on all nodes defined in that mapping class for the mapping in use, in the order of
their definition. Two parameters will be passed: the mapping class instance
being created/filled, and the XML tree the instance
is being created/filled from. The implementation of xml_to_obj
is expected to read whatever pieces of data it is responsible for from the
XML tree and put it into the appropriate
variables/attributes etc. of the instance.
When an instance of a mapping class is stored or filled into an XML tree, xml-mapping will call obj_to_xml
on all nodes defined in that mapping class for the mapping in use, in the order of
their definition, again passing as parameters the mapping class instance
being stored, and the XML tree the instance is being
stored/filled into. The implementation of obj_to_xml is
expected to read whatever pieces of data it is responsible for from the
instance and put it into the appropriate XML
elements/attributes etc. of the XML tree.
Basic Node Types Overview¶ ↑
The following is an overview of how initialization and marshalling/unmarshalling is implemented in the node base classes (Node, SingleAttributeNode, and SubObjectBaseNode).
TODO: summary table: member var name; introduced in class; meaning
Node¶ ↑
In initialize, the mapping class and the option arguments are
stripped from the argument list. The mapping class is stored in @owner, the
option arguments are stored (as a hash) in @options (the hash will be empty
if no options were given). The mapping the node
is defined in is determined (:mapping option, last use_mapping
or :_default) and stored in @mapping. The node then stores
itself in the list of nodes of the mapping class belonging to the mapping
(@owner.xml_mapping_nodes(:mapping=>@mapping); see XML::Mapping::ClassMethods#xml_mapping_nodes).
This list is the list of nodes later used when marshalling/unmarshalling an
instance of the mapping class with respect to a given mapping. This means
that node implementors will not normally “see” anything of the mapping
(they don't need to access the @mapping variable) because the
marshalling/unmarshalling methods
(obj_to_xml/xml_to_obj) simply won't be
called if the node's mapping is not the same as the mapping the
marshalling/unmarshalling is happening with.
Furthermore, if :reader and/or :writer options were given,
xml_to_obj resp. obj_to_xml are transparently
overwritten on the node to delegate to the supplied :reader/:writer procs.
The marshalling/unmarshalling methods
(obj_to_xml/xml_to_obj) are not implemented in
Node (they just raise an exception).
SingleAttributeNode¶ ↑
In initialize, the attribute name is stripped from the argument list and stored in @attrname, and an attribute of that name is added to the mapping class the node belongs to.
During marshalling/unmarshalling of an object to/from XML, single-attribute nodes only read/write a single
piece of the object's state: the single attribute (@attrname) the node
handles. Because of this, the
obj_to_xml/xml_to_obj implementations in
SingleAttributeNode call two new methods introduced by SingleAttributeNode,
which must be overridden by subclasses:
extract_attr_value(xml)
set_attr_value(xml, value)
extract_attr_value(xml) is called by xml_to_obj
during unmarshalling. xml is the XML tree
being read. The method must read the attribute's value from
xml and return it. xml_to_obj will set the attribute
to that value.
set_attr_value(xml, value) is called by
obj_to_xml during marshalling. xml is the XML tree being written, value is the current
value of the attribute. The method must write value into (the
correct sub-elements/attributes of) xml.
SingleAttributeNode also handles the default value, if it was specified
(via the :default_value option): When writing data to XML, set_attr_value(xml, value) won't
be called if the attribute was set to the default value. When reading data
from XML, the extract_attr_value(xml)
implementation must raise a special exception, XML::Mapping::SingleAttributeNode::NoAttrValueSet,
if it wants to indicate that the data was not present in the XML. SingleAttributeNode will catch this exception and
put the default value, if it was defined, into the attribute.
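The division of labour between the base class and its subclasses can be sketched as follows. This is a simplified model, not the library's source: the class names are made up, the XML tree and the object are both plain Hashes, and re-raising the exception when no default value was defined is an assumption of this sketch.

```ruby
# Simplified model of the default value handling; not xml-mapping code.
class NoAttrValueSet < StandardError; end

class SketchSingleAttr
  def initialize(attrname, options = {})
    @attrname = attrname
    @options = options
  end

  # Unmarshalling: fall back to the default if the data was not present.
  def xml_to_obj(obj, xml)
    value = begin
      extract_attr_value(xml)
    rescue NoAttrValueSet
      raise unless @options.key?(:default_value)  # assumption: no default is an error
      @options[:default_value]
    end
    obj[@attrname] = value
  end

  # Marshalling: skip writing if the attribute holds its default value.
  def obj_to_xml(obj, xml)
    value = obj[@attrname]
    return if @options.key?(:default_value) && value == @options[:default_value]
    set_attr_value(xml, value)
  end
end

# A subclass only supplies the two hooks.
class SketchKeyNode < SketchSingleAttr
  def initialize(attrname, key, options = {})
    super(attrname, options)
    @key = key
  end

  def extract_attr_value(xml)
    xml.fetch(@key) { raise NoAttrValueSet }
  end

  def set_attr_value(xml, value)
    xml[@key] = value
  end
end

node = SketchKeyNode.new(:position, "Position", :default_value => "Some Employee")
obj = {}
node.xml_to_obj(obj, {})  # "Position" is missing, so the default is used
xml = {}
node.obj_to_xml(obj, xml) # default value, so nothing is written
puts obj[:position]       # Some Employee
puts xml.empty?           # true
```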
SubObjectBaseNode¶ ↑
The initializer will set up additional member variables @sub_mapping, @marshaller, and @unmarshaller.
@sub_mapping contains the mapping to be used when reading/writing the sub-objects (either specified with :sub_mapping, or, by default, the mapping the node itself was defined in).
@marshaller and @unmarshaller contain procs that encapsulate
writing/reading of sub-objects to/from XML, as
specified by the user with :class/:marshaller/:unmarshaller etc. options
(the meaning of those different options was described above). The procs are there to be called from
extract_attr_value or set_attr_value whenever the
need arises.
XPath Interpreter¶ ↑
XML::XXPath is an XPath parser. It is used in
xml-mapping node type definitions, but can just as well be utilized
stand-alone (it does not depend on xml-mapping). XML::XXPath is very incomplete and probably will
always be, but it should be reasonably efficient (XPath expressions are
precompiled), and, most importantly, it supports write access, which is
needed for writing objects to XML. For example, if
you create the path /foo/bar[3]/baz[@key='hiho'] in the XML document
<foo>
<bar>
<baz key="ab">hello</baz>
<baz key="xy">goodbye</baz>
</bar>
</foo>
, you'll get:
<foo>
<bar>
<baz key='ab'>hello</baz>
<baz key='xy'>goodbye</baz>
</bar>
<bar/>
<bar>
<baz key='hiho'/>
</bar>
</foo>
XML::XXPath is explained in more detail in the reference documentation and the user_manual_xxpath file.
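To illustrate why creating /foo/bar[3]/baz[@key='hiho'] yields an empty second <bar> in the example above, here is a toy model of the "create what is missing" idea in plain Ruby. It is not XML::XXPath code and makes no claim about how XML::XXPath is implemented; El, ensure_nth and ensure_with_attr are invented for this sketch, and elements are simple Structs.

```ruby
# Toy element tree; not REXML and not XML::XXPath.
El = Struct.new(:name, :attrs, :children)

# Return the n-th child called name, appending empty ones until it exists.
def ensure_nth(parent, name, n)
  same = parent.children.select { |c| c.name == name }
  while same.size < n
    child = El.new(name, {}, [])
    parent.children << child
    same << child
  end
  same[n - 1]
end

# Return the first child called name with the given attribute value,
# appending a new one if there is none.
def ensure_with_attr(parent, name, key, value)
  found = parent.children.find { |c| c.name == name && c.attrs[key] == value }
  return found if found
  child = El.new(name, { key => value }, [])
  parent.children << child
  child
end

bar = El.new("bar", {}, [El.new("baz", { "key" => "ab" }, []),
                         El.new("baz", { "key" => "xy" }, [])])
foo = El.new("foo", {}, [bar])

baz = ensure_with_attr(ensure_nth(foo, "bar", 3), "baz", "key", "hiho")
puts foo.children.size              # 3
puts foo.children[1].children.size  # 0
puts baz.attrs["key"]               # hiho
```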
License¶ ↑
xml-mapping is licensed under the Apache License, version 2.0. See the LICENSE file for details.