XML standards cheatsheet
Here’s a simple guide to crafting standards-compliant XML. I’m currently working on an API, and found it easy to knock out some XML for the output without thinking about it much. That’s not enough, though: I wanted to follow some defined standards, so I did some research and decided to get the key points down here.
Define XML
XML is a way of marking up data. The data could come directly from a database, for example as the result of a nested query. Instead of just outputting the results as plain text, we can mark up each element of the results in a machine-readable format. That’s XML.
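To make that concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The row, its column names and the person wrapper element are all assumptions made up for illustration, not output from a real database.

```python
import xml.etree.ElementTree as ET

# Hypothetical database row; the column names are assumptions for this sketch.
row = {"firstname": "Mark", "surname": "Kirby"}

# Wrap the row in a parent element and mark up each column as a child element.
person = ET.Element("person")
for column, value in row.items():
    ET.SubElement(person, column).text = value

# encoding="unicode" makes tostring return a str instead of bytes.
marked_up = ET.tostring(person, encoding="unicode")
print(marked_up)
```

Building the elements through a library rather than by string concatenation means special characters such as & and < in the data get escaped for you.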
Simple XML element
A single element representing a stand-alone piece of information, such as a name, would be represented as follows:
<name>Mark Kirby</name>
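As a rough sketch of how a program sees such an element, the snippet below parses one with Python's standard xml.etree.ElementTree. The element name, name, and its value are assumptions picked for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical single element: opening tag, text value, matching closing tag.
simple_xml = "<name>Mark Kirby</name>"

element = ET.fromstring(simple_xml)
print(element.tag)   # the element name
print(element.text)  # the value between the tags
```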
Complex Types
A name on its own might not be enough; for a person you might have other pieces of information. A person can be considered a complex type of XML element, constructed from several other XML elements.
<person>
<firstname>Mark</firstname>
<surname>Kirby</surname>
<address>50 Poles Hill, Brighton, BN1</address>
</person>
We could introduce another complex type to break up the address into further elements.
<person>
<firstname>Mark</firstname>
<surname>Kirby</surname>
<address>
<firstline>50 Poles Hill</firstline>
<town>Brighton</town>
<postcode>BN1</postcode>
</address>
</person>
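One payoff of breaking the address up is that each part can be read on its own. The following is a small sketch, assuming Python's standard xml.etree.ElementTree and the person document above; the variable names are mine.

```python
import xml.etree.ElementTree as ET

person_xml = """
<person>
  <firstname>Mark</firstname>
  <surname>Kirby</surname>
  <address>
    <firstline>50 Poles Hill</firstline>
    <town>Brighton</town>
    <postcode>BN1</postcode>
  </address>
</person>
"""

person = ET.fromstring(person_xml)

# Direct children of the complex person element.
child_tags = [child.tag for child in person]

# A simple path walks into the nested address element.
town = person.findtext("address/town")
postcode = person.findtext("address/postcode")
print(child_tags, town, postcode)
```

With the single-string address, getting at the town would mean splitting on commas and hoping every address follows the same pattern.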
Nested elements
A root element (say, person) could have more than one of a specific element (say, phone number). This is fine, and would be represented as follows:
<person>
<firstname>Mark</firstname>
<surname>Kirby</surname>
<phone>01273 4444441</phone>
<phone>07988 3838381</phone>
</person>
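Repeated elements are easy for a consumer to pick up as a list. Here is a minimal sketch, again assuming Python's standard xml.etree.ElementTree and the person document above.

```python
import xml.etree.ElementTree as ET

person_xml = """
<person>
  <firstname>Mark</firstname>
  <surname>Kirby</surname>
  <phone>01273 4444441</phone>
  <phone>07988 3838381</phone>
</person>
"""

person = ET.fromstring(person_xml)

# findall returns every direct child with the given tag, in document order.
phones = [phone.text for phone in person.findall("phone")]
print(phones)
```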
Attributes
XML elements can have attributes added to them to add more information about the element.
I feel you should use attributes to add meta information about the element: information about the data itself, useful to machines rather than people. Here are a few examples:
- Database IDs
- Created date
- Edited date
Here’s our person with some suitable attributes:
<person id="..." created="..." edited="...">
<firstname>Mark</firstname>
<surname>Kirby</surname>
</person>
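To show how that split between data and meta information looks from the consuming side, here is a hedged sketch using Python's standard xml.etree.ElementTree. The attribute names id, created and edited and all of their values are hypothetical, chosen only to match the list above.

```python
import xml.etree.ElementTree as ET

# Attribute names and values here are invented for illustration.
person_xml = """
<person id="42" created="2009-06-01" edited="2009-06-15">
  <firstname>Mark</firstname>
  <surname>Kirby</surname>
</person>
"""

person = ET.fromstring(person_xml)

# Meta information lives in attributes and always comes back as strings.
person_id = person.get("id")
created = person.get("created")

# The data people care about lives in child elements.
full_name = f"{person.findtext('firstname')} {person.findtext('surname')}"
print(person_id, created, full_name)
```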
I’ve seen APIs come up with a single element (say person), and stuff the tag with attributes containing all the details, thus:
<person firstname="Mark" surname="Kirby" address="50 Poles Hill, Brighton, BN1" />
This is very bad! It’s hard to read, hard to process, inflexible and doesn’t follow established standards.
An attribute can only be used once per element, and you shouldn’t store multiple values for a single attribute.
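This is wrong: the same attribute name repeated on one element.
This is wrong: several values packed into a single attribute.
This is correct: each value in its own child element.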
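The difference between those cases can be checked with a parser. The sketch below assumes Python's standard xml.etree.ElementTree; the three sample strings and the helper is_well_formed are hypothetical, reusing the phone numbers from earlier.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The same attribute twice: rejected by the parser outright.
duplicate_attribute = '<person phone="01273 4444441" phone="07988 3838381" />'

# Several values in one attribute: parses, but the consumer has to split the string.
packed_attribute = '<person phones="01273 4444441, 07988 3838381" />'

# One child element per value: parses, and each value stands alone.
repeated_elements = (
    "<person>"
    "<phone>01273 4444441</phone>"
    "<phone>07988 3838381</phone>"
    "</person>"
)

for sample in (duplicate_attribute, packed_attribute, repeated_elements):
    print(is_well_formed(sample))
```

Note that the packed attribute is still well-formed XML; it is bad practice rather than a syntax error, which is why it tends to slip through.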
Constructing an XML document
An XML document should have the following features. Only the single root element is a strict well-formedness rule in XML 1.0, but for standards’ sake I’d read every “should” here as “must”:
An XML document should have the extension .xml
An XML document should begin with the XML declaration
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
- Version should be set to 1.0 to comply with the majority of parsers.
- Encoding can be specified, or left out, in which case parsers assume UTF-8 (or UTF-16 if the file starts with a byte order mark).
- Standalone “yes” declares that the document does not rely on declarations in an external DTD; “no” means it may. Leave the attribute out to assume “no”.
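If you generate the XML from code, most libraries can write the declaration for you. This is a minimal sketch assuming Python's standard xml.etree.ElementTree; the people root element is a placeholder, and this sketch does not set the standalone attribute.

```python
import io
import xml.etree.ElementTree as ET

# Placeholder root element for the sketch.
root = ET.Element("people")

# Ask the writer to emit the XML declaration with an explicit encoding.
buffer = io.BytesIO()
ET.ElementTree(root).write(buffer, encoding="UTF-8", xml_declaration=True)

document = buffer.getvalue()
first_line = document.decode("utf-8").splitlines()[0]
print(first_line)
```

The declaration has to be the very first thing in the file, so it is safer to let the writer produce it than to prepend it by hand.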
An XML document must have one and only one root element
In the above example, if you wanted to list more than one person, you could start with a root element, ‘people’.
<people>
<person>
<firstname>Mark</firstname>
<surname>Kirby</surname>
<address>50 Poles Hill, Brighton, BN1</address>
</person>
<person>
<firstname>Chris</firstname>
<surname>Kirby</surname>
<address>24 Linfield Road, Brighton, BN1</address>
</person>
</people>
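The single-root rule is one that parsers enforce strictly. Here is a short sketch, assuming Python's standard xml.etree.ElementTree, with trimmed-down versions of the two person elements above.

```python
import xml.etree.ElementTree as ET

# Two sibling elements with nothing wrapping them: not a valid document.
two_roots = (
    "<person><firstname>Mark</firstname></person>"
    "<person><firstname>Chris</firstname></person>"
)

try:
    ET.fromstring(two_roots)
    two_roots_accepted = True
except ET.ParseError as error:
    two_roots_accepted = False
    print("Rejected:", error)

# Wrapping the same content in a single people root makes it parse.
people = ET.fromstring("<people>" + two_roots + "</people>")
first_names = [person.findtext("firstname") for person in people.findall("person")]
print(first_names)
```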
