Saturday, February 15, 2014

XML Concepts

As I posted yesterday, I am going to try to articulate the easiest way to understand XML from what I know so far. First off, as I wrote previously, XML doesn't do anything by itself; it's just a different way to format data that makes the data easier to transport.

XML documents look something like this (taken from here):
<bookstore>
  <book category="CHILDREN">
    <title>Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title>Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>

The first few lines of the code above, trimmed down:
<bookstore>
  <book category="CHILDREN"></book>
</bookstore>

carry the same information as:
<bookstore>
  <book>
    <category>CHILDREN</category>
  </book>
</bookstore>

Can you see the difference? The first version uses an attribute (category="CHILDREN"), while the second puts the same value in the content of an element. The best way to recognize an attribute is that it is a name="value" pair inside an element's opening tag, and its value always has quotation marks. Elements are a little trickier because elements can CONTAIN "other elements, text, attributes, or a mix of both" according to w3schools.com.
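To make the attribute/element distinction concrete, here is a minimal sketch that reads the bookstore sample with Python's standard xml.etree.ElementTree module. Python is only my choice for illustration (the post itself doesn't use it); the XML is the sample from above, trimmed to one book.

```python
import xml.etree.ElementTree as ET

# The bookstore sample from earlier in the post, trimmed to one book.
doc = """
<bookstore>
  <book category="CHILDREN">
    <title>Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>
"""

root = ET.fromstring(doc)
book = root.find("book")

# Attributes live inside the opening tag and come back as a dict.
print(book.attrib)                # {'category': 'CHILDREN'}

# Child elements are separate nodes; their data is the text between the tags.
print(book.find("title").text)    # Harry Potter

# An element can contain other elements: list the children of <book>.
child_tags = [child.tag for child in book]
print(child_tags)                 # ['title', 'author', 'year', 'price']
```

The parser hands back attributes and child elements through different doors (attrib versus find), which is a handy way to remember that they are different things even when they hold the same kind of data.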

So is it better to have element-centric XML or attribute-centric XML? After a Google search, here is a decent explanation taken from Stack Overflow:

Attribute centric
  • Smaller size than element centric.
  • Not very interoperable, since most XML parsers will think the user data is presented by the element; attributes are used to describe the element.
  • There is no way to present a nullable value for some data types, e.g. a nullable int.
  • Cannot express complex types.
Element centric
  • A complex type can only be presented as an element node.
  • Very interoperable.
  • Bigger size than attribute centric (compression can be used to reduce the size significantly).
  • Nullable data can be expressed with the attribute xsi:nil="true".
  • Faster to parse since the parser only looks for elements for user data.
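
As a rough illustration of a few of the points above, here is a small Python sketch that writes the same book both ways. The attribute_centric and element_centric strings are my own made-up examples, and I pretend the year is unknown just to show xsi:nil. The xsi prefix is bound to the standard XML Schema instance namespace.

```python
import xml.etree.ElementTree as ET

# Standard XML Schema instance namespace, used for xsi:nil.
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical attribute-centric version: all data sits in attributes.
attribute_centric = '<book category="WEB" title="Learning XML" year="2003"/>'

# Hypothetical element-centric version: all data sits in child elements.
# The year is marked as null here purely for illustration.
element_centric = f"""
<book xmlns:xsi="{XSI}">
  <category>WEB</category>
  <title>Learning XML</title>
  <year xsi:nil="true"/>
</book>
"""

a = ET.fromstring(attribute_centric)
e = ET.fromstring(element_centric)

# Same data, different access path.
print(a.get("title"))         # Learning XML
print(e.findtext("title"))    # Learning XML

# ElementTree stores a namespaced attribute under the key "{uri}name".
year = e.find("year")
is_null = year.get(f"{{{XSI}}}nil") == "true"
print(is_null)                # True

# The attribute-centric text is the shorter of the two.
print(len(attribute_centric) < len(element_centric))   # True
```

This only demonstrates the size and nullability points; it says nothing about parsing speed or interoperability, which I have not measured.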

The next thing to be able to identify is the namespace within XML.
I think about a namespace prefix kind of like how you would use the AS keyword in T-SQL, say during an inner join:
(SELECT h.empid, o.firstname
FROM hr.employees AS h
INNER JOIN 
hr.oldemployees AS o 
ON h.empid = o.empid)

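The namespace prefixes would be the h or o in h.xxxxx or o.xxxxx: a short label that says where a name comes from, so two things with the same name (like empid) don't get mixed up.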

That's a brief overview of namespaces. I learned a lot from this link, which explains namespaces very well. It discusses when they are created and how they are used in a practical sense.
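
Here is a minimal sketch of the alias analogy in Python. The namespace URIs (urn:example:hr-current, urn:example:hr-old), the h and o prefixes, and the sample data are all invented for this example to mirror the join above; they don't come from any real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two kinds of <employee>, told apart by namespace.
doc = """
<staff xmlns:h="urn:example:hr-current" xmlns:o="urn:example:hr-old">
  <h:employee><h:empid>1</h:empid></h:employee>
  <o:employee><o:empid>1</o:empid><o:firstname>Ann</o:firstname></o:employee>
</staff>
"""

root = ET.fromstring(doc)

# Map the prefixes used in our queries to namespace URIs (like declaring aliases).
ns = {"h": "urn:example:hr-current", "o": "urn:example:hr-old"}

current = root.findall("h:employee", ns)
old = root.findall("o:employee", ns)
print(len(current), len(old))     # 1 1

# Internally the prefix is replaced by the full URI, so the names never clash.
print(old[0].tag)                 # {urn:example:hr-old}employee

first_name = old[0].findtext("o:firstname", namespaces=ns)
print(first_name)                 # Ann
```

The prefix is only a shorthand, just like a table alias: what really identifies the element is the URI the prefix points to.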

Well, that gets us a few ideas about XML (elements, attributes, and namespaces), but there is a lot more to go. It's important to know these things because, as far as the 461 test goes, I know that XML is something that is covered.

As homework for Chapter 7, the book has a few suggested practices, which I think are straightforward and easy to work with. They suggest: use a simple XML column on a sample database, and search it for elementary things like a person's first name, a person's last name, a person's age, etc. Here is a bit of code found on Stack Overflow that can help you find the columns in your current database that contain XML data:

SELECT table_name AS [Table Name], column_name AS [Column Name]
FROM information_schema.columns
WHERE data_type = 'xml';

Cheers until next time!
