Friday, February 28, 2014

Xquery

Alright, so I'm ready to dive into some Xquery discussions…

Xquery, is just as it sounds that you can query information from and XML document.  The thing that I had a hard time understanding about Xquery was due to the fact that I had been studying so much TSQL, and was comfortable with SELECT, FROM WHERE, ORDER BY, etc… and I assumed that the result that I would get from Xquery would be represented as a normal value that you could then use somewhere else (a CTE, apply, SELECT INTO).  It's  little different than that… while you can shread XML data into "normal" looking columns, most of the time the results that your query returns is in and XML format (a format that you can specify) which is one of the GOOD things about XML.

So to start with, going back to my example yesterday of the Bagdad Theater in Portland, OR… the XML looked like this:

<usa>
  <or>
    <portland>
      <southeast>
        <hawthorn>
          <thirtyseventh>
            <BagdadTheater>SeatThree</BagdadTheater>
          </thirtyseventh>
        </hawthorn>
      </southeast>
    </portland>
  </or>
</usa>


Well you have to turn this into an XML document, which would be a declaration like this (modified it and added data as you'll notice):

DECLARE @abc AS XML;
SET @abc = N '
<usa>
  <or>
    <portland>
      <southeast>
        <hawthorn>
          <thirtyseventh>
            <BagdadTheater>SeatThree</BagdadTheater>
          </thirtyseventh>
        </hawthorn>
      </southeast>
    </portland>
  </or>
</usa>'


Now we can query @abc

SELECT @abc.query('*') AS Complete_Sequence, 

That query is going to give us back the complete XML document (@abc).  

You can also query specific areas of the document:

SELECT @abc.query('data(usa/or/portland/southeast/hawthorn/37ave/BagdadTheater)') AS Element_theaterseat_data

Can you guess what would be returned?

A big piece of Xquery is the use of FLWOR (I call it flower) which stands for FOR, LET, WHERE, ORDER, RETURN.  FLWOR is similar to SELECT, FROM, WHERE in SQL when querying tables.  

In the example above we could use the FLWOR to get the same result, it would look like this:

FOR  $x in doc("Theater.XML")/usa/or/portland/southeast/hawthorn/37ave
LET $x /BagdadTheater
RETURN $x (I'm not sure about this part, so don't quote me on it)

This gives you an idea of how Xquery works… I've began to think about XML really as a big array that we've developed paths for and modes of navigation to move around.  I still think the best way to get comfortable with some XML is to work out some problems with it.  The book gives some very simple practice lessons that I am going to try, anyone else who is reading this and is at the same level as me I encourage to do the same thing!…. I hope this helps!

Thursday, February 27, 2014

My own words Xpath and Navigation

As I wrote in my last post I was going to study further XML so that I'm comfortable explaining a few of the concepts in my own words.

First XPath.  Xpath refers to a position in an XML documents, I saw a good example somewhere online as I was working on this that helped me visualize it more.  Suppose you want to get to some directions to a geographical location, for example you would like to go to the Bagdad Theater in Portland, Oregon (my home town), well you might write down the location something like this:

usa/or/portland/southeast/hawthorn/37ave/BagdadTheater

In XML this might look something like this:
<usa>
  <or>
    <portland>
      <southeast>
        <hawthorn>
          <thirtyseventh>
            </BagdadTheater>
          </thirtyseventh>
        </hawthorn>
      </southeast>
    </portland>
  </or>
</usa>


The Xpath for the Bagdad Theater could be presented as we saw it above: usa/or/portland/southeast/hawthorn/37ave/BagdadTheater

That's all Xpath is, it explains where something is, then you can reference what ever is there.  Simple right?

Next this leads us into Navigation.

DOM node tree
OR:
Node tree
Credit to W3schools.org for these pics!

They say a picture is worth of thousand words.  This picture helped me to understand what exactly an XML tree looks like, and what the relationship is within it.  It's not surprising that when it comes to navigation most books will talk about "child nodes", "siblings", "parents", and "descendants"... they do this because again XML is formatted like a tree (think family tree).  So there are children, parents etc.  To make navigation a little easier we have some abbreviations for the nodes:

This:    //     "Retrieves the context node and all its descendants"
This:    @    "Retrieves the specified ATTRIBUTE (remember attributes are always in quotes) of the context                     node"
This:    ..     "Retrieves the parent of the context node" I tend to think about Linux on this one, and how to                        navigate folders; ".." always goes up one folder

Hopefully this clears up some navigation within XML.

I'll write more on Xquery tomorrow... cheers until then!

Credits: 
wikipedia.org
w3schools.org
Querying Microsoft SQL 2012; Itzik Ben-Gan, Dejan Sarka, Ron Talmage

Wednesday, February 26, 2014

Continuing XML

I intended on reading chapter 8 last night, but I flipped to the end of chapter 7 first where my book mark was placed.  I then began going through some of the practice within chapter 7, and felt like I still needed to understand XML clearer.  Not just elements/attributes, but the use of xquery, xpath, navigation of an xml documents including parent/child nodes etc.  So my goal today is to drill deeper into these ideas and be able to write out exactly what these pieces of XML are and include some examples in my own words.  When I can describe things in my own words I feel like I'm grasping the concept more firmly.

For the record I did read some of chapter 8 as well which is about creating objects in the RDBMS such as tables, and views.  Chapter 8 is also going to cover adding constraints (and I think checks) to a table too.  Again, I find it interesting in the order that the book covers these things as creating tables is a fairly rudimentary piece of TSQL, and it seems like you should know that before you begin to create CTE's.  It will be fun to start combining some of the lessons that I've learned so far and playing around with Selecting data into another table or creating a view.  Cool stuff!

So I have my goals today:
-Be able to define xquery, what does it do?
-Define xpath and what it does
-Describe navigation through an XML document


Cheers!

Sunday, February 23, 2014

XML Follow Up

After I posted my homework and answers at SSC, I had a great response from one of the members there.  Here is a link to the forum, and the explanation of Russel Loski.  I believed that I've learned sufficient XML through my studies over the last week, but I understand that there is A LOT that I still don't know.  I know this because simply going to amazon.com and searching XML comes up with many different books to advance ones knowledge of the subject... I've only spent one chapter.  So I understand it in a broad sense, but I would need to study XML specifically for sometime in order to grasp the many other components of it.

As a side note, this is what I love about learning this technologies...  I could spend one chapter on a subject such as XML and get a rudimentary education on it.  However I could also spend five years to learn deeper and deeper how it works... there is nothing that is stopping me from doing so.  How cool!

With this all being said, I'm going to read through chapter 7 one more time, then push on to chapter 8.

Cheers all!

Wednesday, February 19, 2014

Homework and answers

So like I suggested for homework querying an XML document I went with the books suggestion which was to use the Resume column in humanresources.jobcandidates table (AdventureWorks).  The suggested practice was to find everyone's first and last name... then the secondary part of the problem was to find all candidates who lived in Chicago (location.city).  Here is the way that I found to do it:

WITH XMLNAMESPACES('http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume' AS ns)
SELECT [Resume].value('(/ns:Resume/ns:Name/ns:Name.First)[1]', 'varchar(50)') AS Firstname,
[Resume].value('(/ns:Resume/ns:Name/ns:Name.Last)[1]', 'varchar(50)') AS Lastname
FROM HumanResources.JobCandidate
WHERE [Resume].exist('/ns:Resume/ns:Employment/ns:Emp.Location/ns:Location/ns:Loc.City[.="Chicago"]')=1


I'm going to post this to SSC to see if there is an easier way of doing this (meaning an easier way of writing it, not optimization wise), and what are the search methods when values are attributes instead of elements as the resumes were.

Cheers!

Answers soon

Working through chapter 7 while mixing in homework from the other chapters. I'm literally dreaming about SQL! 

I'll post some answers to the "simple" XML queries that I suggested doing the other day, such as searching for the first and last name of someone.

Cheers!

Sunday, February 16, 2014

Sandbox Homework

As opposed to writing out business querying objects, and working through them in SQL.  Tonight I spent time creating objects on my virtual machine, as well as my native machine.  I worked on creating a database that had a few different tables in it, and had a compound primary key on one of the tables.  When I started looking into these things (especially the second piece) I came across a blog that I thought was interesting because it discussed the use of ssn#'s as primary keys, and whether or not it was a good idea.  The overall answer was no, that it was not a good idea to use them as PK's.

For the database that I'm creating (contacts in my phone book) I want to use the phone number as the primary key, but not all of these are unique because they can be assigned to multiple "people"... therefore I began looking at the idea of a composite primary key.  I would likely have to add the phone number and firstname as the composite key... anyways, I'm working on it.  It's fun play around with this stuff in SQL, as I have in Synergex in the past.  The phone numbers are currently stored as VARCHAR(12), it'll be fun to removing the "-" from the phone numbers (if they exist), and casting them into an integer.

Once I have my DB setup, I'll want to be able to link up to it from my current cpu instead of the virtual one I currently am working on.  If anyone is reading this, I encourage you to do the same thing, as it proves valuable in navigating SQL, not just TSQL.

Saturday, February 15, 2014

XML Concepts

As I posted yesterday, I am going to try to articulate the easiest way to understand XML from what I know so far.  First off as I wrote previously, XML doesn't do anything, it's just a different way to format data that makes it easier to transport.

XML documents look something like this (taken from here):
<bookstore>
  <book category="CHILDREN">
    <title>Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title>Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>

The first few lines of code for above:
<bookstore>
<book category = "CHILDREN">
</bookstore>

Is the same as
<bookstore>
<book category> CHILDREN </book category>
</bookstore>

Can you seen the differences?  The difference is that the first example of the short code includes elements and attributes… the best way to recognize an attribute is that an attribute always has quotation marks.  Elements are a little trickier because elements can CONTAIN "other elements, text, attributes, or a mix of both" according to w3schools.com.

So is it better to have an element centric XML code, or attribute centric?  After a google search here is a  decent explanation taken from stack overflow:

Attribute centric
  • Smaller size than element centric.
  • Not very interoperable, since most XML parsers will think the user data is presented by the element, Attributes are used to describe the element.
  • There is no way to present nullable value for some data type. e.g. nullable int
  • Can not express complex type.
Element centric
  • Complex type can be only presented as an element node.
  • Very interoperable
  • Bigger size than attribute centric. (compression can be used to eliminated the size significantly)
  • Nullable data can be expressed with attribute xsi:nil="true"
  • Faster to parse since the parser only looks for elements for user data.
The next thing to be able to identify would be the namespace within XML: 
I think about a namespace kind of like how you would us the AS feature in TSQL… say during an inner join.  
(SELECT h.empid, o.firstname
FROM hr.employees AS h
INNER JOIN 
hr.oldemployees AS o 
ON h.empid = o.empid)

The name spaces would be the h or o in h.xxxxx or o.xxxxx.

That's a brief overview of namespaces, I learned a lot from this link that explained namespaces very well.  It discusses when they are created, and how they are used in a practical sense.  

Well that's gets us a few ideas about XML (elements, attributes, and namespaces) but there is a lot more to go.  It's important to know these things because as far as the 461 test goes, I know that XML is something that is covered.  

As homework for Chapter 7 the book has a few suggested practices which I think are straight forward and easy to work with they suggest: Use a simple XML column on a sample database, and search it for  elementary things like; a persons first name, a persons last name, a persons age, etc.  Here is a bit of code found on stackoverflow that can help you to find the columns in all of your databases that contain XML data:

SELECT table_name [Table Name], column_name [Column Name], 

FROM information_schema.columns where data_type = 'XML'

Cheers until next time!

Thursday, February 13, 2014

Homework Expanded I

I labeled this post "Expanded I" because I'm sure that I"m going to be doing this again in the future... essentially what I did was expanded on some home work that I've already done.

Tonight I started with this as my homework and was going to work through it real quick, it's from chapter four:

1) Use production.products table to group products by categoryid, and returns for each category the minimum unit price. 

2) define a CTE based on question 1 and join the CTE to the production.products table to return per each category the products with the minimum unit price.

Then I decided to take it a little further and said... what if I wanted an additional column that told me how many products where in categoryid, then out of those categories, which one had the smallest unit price. It took me a while, but I finally got it... the batch of code is below.  I'm going to post to SSC and see if there is different way of doing this that may be easier. I don't there is because I have two aggregates.  I just created two CTE's in order to implement this.

WITH cte AS(
SELECT COUNT(categoryid) AS catct, categoryid
FROM production.Products
GROUP BY categoryid),
ctea AS(
SELECT MIN(unitprice) AS mn, categoryid
FROM production.Products
GROUP BY categoryid)
SELECT a.categoryid, a.catct, b.mn
FROM cte AS a
INNER JOIN ctea AS b
ON b.categoryid = a.categoryid

More XML tomorrow, I watched some youtube videos on XML today to try to get comfortable with whole elements, attributes, and namespaces.  I'm going to try to articulate what I learn tomorrow with a post in the afternoon.  Cheers until then.

Wednesday, February 12, 2014

SSC post

I posted a question at SSC last night seen here:


The answers of which definitely made me feel better about my query visualization. It also makes me think about restructuring my home work questions and rewriting them in a very specific manner.  I'm still working through chapter 7, will give it another read today... That's all for now, cheers!

Saturday, February 8, 2014

XML Thoughts

Just a quick post this morning re: XML…

The 461TK book does a decent job talking about the elements/attributes of XML, but it's important for people to remember that XML doesn't really do ANYTHING.  It stores data, and makes it easier to transfer… it's really just information wrapped in tags.

See… short and to the point…. more later on all of this

Cheers

Friday, February 7, 2014

Chapter 7 XML

Well chapter 6 seems like something that I need to work through at home using SSMS due to the problems with needing some of the files associated with the training kit disc.  I'll work through those as best I can tonight, in the mean time I'll be working through homework that I've assigned myself, and looking at chapter 7.

Chapter 7 is all about XML which I'm excited to learn because I remember this being on the test the first time that I took it.  It looks very confusing when looking at it at first, but again the book does a great job simplifying things.  As always, I'm reading through the lesson once knowing that I won't absorb everything… then I'll go back through it again (maybe tonight) slower, and slower, and slower until I have it down.

More on this later, this afternoon will consist of homework to stay fresh, and possibly playing around with XML Elements

Thursday, February 6, 2014

Training Kit Errata

Well, I've knew that there were errors in this book mainly from a few of the quizzes given at the end of lessons that had the incorrect letter, but correct answers on them, but just as a reminder there are A LOT more errors than I would've imagined for such a precise publication.  Training Kit Errata is a reminder of the errors noted by others, and is important to look at when executing some of the written procedures in the book.

Another useful link is here which gives you the necessary source code to work through all of the chapters, and exercises in book.

I never knew this existed until now!.. kind of a bummer because I've been reading the book, following along and writing out what the book has written verbatim…. oh well it's a good way to keep me engaged.

I write this today at the end of a very long day because chapter 6 has to do with full text querying (finding text within a file) and the exercises require documents that were supposed to be on the training kit cd but were not, thus I have provided the explanation and resolution in this blog post… until tomorrow!  Cheers


Wednesday, February 5, 2014

Keep practicing even when studying new material

Again, just like the post says I think it's EXTREMELY important to keep practicing things that I've learned even though I am moving onto chapter 6.  Looking over the homework that I've posted thus far I believe that the oldest homework to date only goes back to chapter 3 (out of 6, that's kind of funny).  However, there has been a TON of content covered in those three chapters, so it's a must to get good at them.

Each day I am going to make an effort to do some type of practicing as well as advance my studies in another area.  Here is the practice for today, it comes from chapter 3 I believe... something I haven't thought about for a few weeks:

Use hr.employees table in TSQL2012

Find all employees that were born after 1963

Born before 1963

Born between Feb. 1 1965 and today

Find employees who's first name starts with 'D' and is born after 1960

Create column "astrology", sort employees, if they are born in months Jan-Jun label them "libra" in 'astrology' column, if born between July-December "taurus"

Answers have already been posted for this... it takes discipline to keep problems like such relevant.

Tuesday, February 4, 2014

Chapter 5 answers

Here are the answers to the homework questions... keep working on these on a daily basis even if you begin learning new things... it's always great to refresh!

SELECT custid, orderid, val,
SUM(val) OVER(PARTITION BY custid) AS custid,
SUM(val) OVER() AS Gratot
FROM sales.OrderValues
--
SELECT custid, orderid, val,
CAST(100 * val / (SUM(val) OVER(PARTITION BY custid)) AS NUMERIC (5,2)) AS custpct,
CAST(100 * val / (SUM(val) OVER()) AS NUMERIC (5,2)) AS Totpct
FROM sales.ordervalues
--
SELECT custid, orderid, val,
SUM(val) OVER(PARTITION BY custid
ORDER BY orderdate, orderid
ROWS UNBOUNDED PRECEDING) AS runningtot
FROM sales.ordervalues
--
SELECT orderid, custid, val,
AVG(val) OVER(PARTITION BY custid
ORDER BY orderdate, orderid
ROWS BETWEEN 3 PRECEDING
AND CURRENT ROW) AS lastfour
FROM sales.OrderValues
--
WITH cte AS(
SELECT shipperid, freight, orderid,
ROW_NUMBER() OVER(PARTITION BY shipperid
ORDER BY freight DESC, orderid) AS rankpership
FROM sales.Orders)
SELECT * FROM cte
WHERE rankpership <= 3
--
USE TSQL2012
IF OBJECT_ID('fa') IS NOT NULL DROP TABLE fa;
GO
WITH freightavg AS(
SELECT custid, shipperid, freight
FROM sales.Orders)
SELECT *
INTO fa
FROM freightavg
PIVOT(AVG(freight) FOR shipperid IN ([1], [2], [3])) AS p;
SELECT * FROM fa
WHERE custid BETWEEN 5 AND 10
--
SELECT custid, shipperid, freight
FROM sales.freighttotals
UNPIVOT(freight FOR shipperid IN ([1], [2], [3])) AS un
--
SELECT o.custid, c.country, c.city, COUNT(*) AS swededorders
FROM sales.Orders AS o
INNER JOIN sales.Customers AS c
ON o.custid = c.custid
WHERE c.country LIKE 'Sweden'
GROUP BY o.custid, c.country, c.city
--
SELECT o.custid, c.city, COUNT(*) AS swededorders
FROM sales.Orders AS o
INNER JOIN sales.Customers AS c
ON o.custid = c.custid
WHERE c.country LIKE 'Sweden'
GROUP BY GROUPING SETS ((o.custid, c.city), ())

Chapter 5 hw posted tonight

Appropriate heading for my post this afternoon... This will ensure that I do the homework. I've done it a number of times already, but have occasionally looked at the answers while doing the hw. Tonight I won't do any of that.

I also am going to start mixing in old hw assignments to stay fresh with those.

Monday, February 3, 2014

Chapter 5 homework notes

Chapter 5 homework takes a bit to get comfortable and competent with, but with persistence and practice it's very doable.  Part of the challenging part is trying to figure out exactly what the question is asking for (yay word problems!), the figuring out how you're going to write it in TSQL..  Understanding the over clause and how partitions work can be confusing as well, so I wanted to write an example of some simple code (question 1 of last homework set) with a slight variation so that it's easier to understand.

This second part of the code is the answer to the question, can you figure out what the first one does?

SELECT custid, orderid, val,
SUM(val) OVER(PARTITION BY custid, orderid) AS custot,
SUM(val) OVER() AS grandtot
FROM sales.ordervalues;
--
SELECT custid, orderid, val,
SUM(val) OVER(PARTITION BY custid) AS custot,
SUM(val) OVER() AS grandtot
FROM sales.ordervalues

The first one helps you understand the over clause more clearly because when you run both queries at the same time you see that the second set of  code partitions ONLY by custid, while the first set partitions by custid AND orderid... which there are no duplicates of.  Notice how the 'val' and 'custot' column are the same when running the first set of code.  Hopefully this helps in understanding the OVER clause that uses PARTITION...