70-451–MCITP–Design an XML Strategy–Design a Strategy to Query and Modify XML Data

Requirements

Design a strategy to query and modify XML data.
This objective may include but is not limited to: when to use appropriate XPath and XQuery expressions, .query vs. .value, XML indexes for performance, typed vs. untyped, .exist, .modify

My Take

I believe that this section is dealing with how we will get at the data in the XML structures. This includes understanding each of the methods you can use on the xml data type. It will also help to understand xml indexes as these can vastly improve the performance of querying your xml data.

As we will be learning about transforming XML into Relational Data in a future post, I will attempt to concentrate on the aspects we might use this in returning the data in the XML format. However, a lot of the ideas here are probably going to overlap. Really, once you understand navigating the XML using XPath and XQuery as well as the concepts behind FOWLR it shouldn’t be too big of a stretch to use these to return the data in a more relational structure.

For all of the sections on XML I have watched the XML MCM Video, and read Chapter 13: XML and XQuery of Inside Microsoft SQL Server 2008: T-SQL Programming (Itzik Ben Gan et al, Chapter by Dejan Sarka).  The article Introduction to XQuery in SQL Server 2005 might be a good start for some of this material if you don’t have access to the T-SQL Programming book.

When to use XPath and XQuery Expressions

XPath is an expression language which uses a path notation to navigate and investigate portions of an XML document (www.w3.org/TR/xpath). XQuery is used to query structured or semi-structured xml data and is based on the Xpath language with support for better iteration, better sorting, and the ability to construct necessary XML. XQuery expressions will be used any time that you call one of the xml data type methods. XPath is, in essence, used whenever XQuery is used. However, we also use XPath when constructing xml with the FOR XML syntax.

XQuery supports the FLWOR Statement and Iteration syntax. FLWOR is an acronym (initialism? I say “flower” even though that probably isn’t right, but it helps me remember the definition) – For, Let,Where, Order By, and Return. The syntax is essentially a for each loop that is used to define, filter and shape the data returned.

Remember that Xpath and XQuery are CaSe SenSItivE.

Using .query vs. .value

Deciding when to use .query vs when to use .value is really knowing what type of data you need returned. The .query method will return an xml data type. This can have one or more nodes / elements / attributes / etc. The .value() method will return a single value of the type specified.

.query()

The query() method accepts an XQuery expression and returns an xml data type. As such, it is not limited to a single result.

Sample query()

DECLARE @x AS XML = '<root><row OrderID="43659" OrderDate="2001-07-01T00:00:00">
<PO>PO522145787<Customer>676</Customer></PO>
</row>
<row OrderID="43660" OrderDate="2001-07-01T00:00:00">
<PO>PO18850127500<Customer>117</Customer></PO>
</row>
<row OrderID="43661" OrderDate="2001-07-01T00:00:00">
<PO>PO18473189620<Customer>442</Customer></PO>
</row></root>';
SELECT @x.query('/root/row[@OrderID="43660"]');

Sample query() results

<row OrderID="43660"OrderDate="2001-07-01T00:00:00"
<PO>PO18850127500<Customer>117</Customer></PO>
</row>

.value()

The value() method accepts an XQuery expression and an sql type and returns a scalar value of the type specified. This is useful for querying xml to return data along with other relational data. Note that the query using the .value() method has to explicitly return a single value.

Sample value()

DECLARE @x AS XML = '<root><row OrderID="43659" OrderDate="2001-07-01T00:00:00">
<PO>PO522145787<Customer>676</Customer></PO>
</row>
<row OrderID="43660" OrderDate="2001-07-01T00:00:00">
<PO>PO18850127500<Customer>117</Customer></PO>
</row>
<row OrderID="43661" OrderDate="2001-07-01T00:00:00">
<PO>PO18473189620<Customer>442</Customer></PO>
</row></root>';
SELECT @x.value('(/root/row[@OrderID="43660"]/PO)[1]', 'Varchar(50)');

Sample value() output

PO18850127500117

XML Indexes for Performance

When looking at Indexes on XML Data Type Columns it is important to note that there are really two vastly different types of indexes. The Primary Index is similar to a clustered index on a table. The data in the primary index for the XML column is actually stored in a relational format. The Secondary index(es) are more like a traditional non-clustered index that is persisted over the table(s) created for the primary index.

Primary XML Indexes

The Primary XML Index, like a clustered index is persisted. For a XML, this means that the data is shredded and stored as relational data in the index. There will be approximately one record per node. Each record stored holds information relating to the tag name, the node value, the node type, the document order information, the path from each node to the root of the tree, and the primary key of the base table(which is required for a primary xml index). Note that, even though this index is similar to a clustered index, if the full xml is requested from the table, the index will be ignored and the original data retrieved that is stored in the XML column.

Secondary XML Indexes

A secondary xml index requires the primary xml index already exist and will consist of either a PATH, VALUE or PROPERTY index. The PATH index will be useful in fulfilling path expressions which are generally encountered in the where clause of a T-SQL statement using a .exist() statement. The VALUE index will order items by the value of the property or element and is useful when a query will search for a value in an unknown node or with an unknown attribute. Finally, the PROPERTY index will be useful when retrieving data with the .value() method through the primary key of the base table when the path is known.

Typed vs. Untyped

I have talked some about the differences between typed and untyped xml data in the Design XML Storage post. However, when querying a database it is often times necessary to have a typed xml column in order for the optimizer to know when it can expect a simpleton node (i.e. single value). Without this prior knowledge, the engine is forced to throw an error. You can get around this by rewriting the queries to be more precise in identifying the exact path and occurrence to return.

Using .exist

The exist() method takes an XQuery expression and returns a bit. The value is 1 if there is a non-empty result set returned, 0 if there is an empty result set returned, and null if the xml that it was called against was null. This method is usually, but not always, used in a WHERE clause.

Sample .exist() statement 1

DECLARE @x AS XML = '<root><row OrderID="43659" OrderDate="2001-07-01T00:00:00">
<PO>PO522145787<Customer>676</Customer></PO>
</row>
<row OrderID="43660" OrderDate="2001-07-01T00:00:00">
<PO>PO18850127500<Customer>117</Customer></PO>
</row>
<row OrderID="43661" OrderDate="2001-07-01T00:00:00">
<PO>PO18473189620<Customer>442</Customer></PO>
</row></root>';
SELECT @x.exist('/root/row[@OrderID="43668"]');

Sample .exist() results

0

Sample .exist() statement 2

DECLARE @x AS XML = '<root><row OrderID="43659" OrderDate="2001-07-01T00:00:00">
<PO>PO522145787<Customer>676</Customer></PO>
</row>
<row OrderID="43660" OrderDate="2001-07-01T00:00:00">
<PO>PO18850127500<Customer>117</Customer></PO>
</row>
<row OrderID="43661" OrderDate="2001-07-01T00:00:00">
<PO>PO18473189620<Customer>442</Customer></PO>
</row></root>';
/* Note that we are asking to return ALL POs Customers if there is even one OrderID = 43660 */
WITH myXML AS (
SELECT @x.query('//PO/*') AS Output
)
SELECT
Output
FROM
myXML
WHERE
@x.exist('/root/row[@OrderID="43660"]') = 1
;

Sample.exist() result statement 2

<Customer>676</Customer>
<Customer>117</Customer>
<Customer>442</Customer>

Using .modify

The .modify() method uses XML Data Modification Language (XML DML) to insert, replace the value of (i.e. update) or delete nodes from XML data. Other than getting used to the syntax, the results are similar to what one would expect from a normal insert, update or delete. It should be noted that this functionality of XQuery is not supported by the WC3. This is a SQL Server XML only implementation (as far as I know).

Insert

With the insert statement, the syntax uses an expression to be inserted, an expression to locate where the first expression will be inserted, and a declaration of whether the first expression is a sibling or descendant of the second.

Insert Expression1 {as first | as last} into | after | before Expression2

Example insert XML DML statement

DECLARE @x AS XML = '<root><row OrderID="43659" OrderDate="2001-07-01T00:00:00">
<PO>PO522145787<Customer>676</Customer></PO>
</row>
<row OrderID="43660" OrderDate="2001-07-01T00:00:00">
<PO>PO18850127500<Customer>117</Customer></PO>
</row>
<row OrderID="43661" OrderDate="2001-07-01T00:00:00">
<PO>PO18473189620<Customer>442</Customer></PO>
</row></root>';
SET @x.modify('insert <anothernode>SomeText</anothernode> as last
into (/root/row[@OrderID="43660"]/PO)[1]');
SELECT @x;

Example insert XML DML statement results

<root>
<row OrderID="43659"OrderDate="2001-07-01T00:00:00"
<PO>PO522145787<Customer>676</Customer></PO>
</row>
<row OrderID="43660"OrderDate="2001-07-01T00:00:00"
<PO>PO18850127500<Customer>117</Customer><anothernode>SomeText</anothernode></PO>
</row>
<row OrderID="43661"OrderDate="2001-07-01T00:00:00"
<PO>PO18473189620<Customer>442</Customer></PO>
</row>
</root>

Replace Value of

The replace value of syntax gives Expression1 which is the path to the value you would like to replace, and Expression2 which is the value to which you will be setting the Expression1.

replace value of Expression1 with Expression2

Example replace value of XML DML statement

DECLARE @x AS XML = '<root><row OrderID="43659" OrderDate="2001-07-01T00:00:00">
<PO>PO522145787<Customer>676</Customer></PO>
</row>
<row OrderID="43660" OrderDate="2001-07-01T00:00:00">
<PO>PO18850127500<Customer>117</Customer></PO>
</row>
<row OrderID="43661" OrderDate="2001-07-01T00:00:00">
<PO>PO18473189620<Customer>442</Customer></PO>
</row></root>';
SET @x.modify('insert <anothernode>SomeText</anothernode> as last
into (/root/row[@OrderID="43660"]/PO)[1]');
SET @x.modify('replace value of (//anothernode/text())[1] with "different text"');
SELECT @x;

Example replace value of XML DML statement results

<root>
<row OrderID="43659"OrderDate="2001-07-01T00:00:00"
<PO>PO522145787<Customer>676</Customer></PO>
</row>
<row OrderID="43660"OrderDate="2001-07-01T00:00:00"
<PO>PO18850127500<Customer>117</Customer><anothernode>different TEXT</anothernode></PO>
</row>
<row OrderID="43661"OrderDate="2001-07-01T00:00:00"
<PO>PO18473189620<Customer>442</Customer></PO>
</row>
</root>

Delete

The delete syntax simply takes an expression to navigate to the element or attribute which you would like to delete.

delete Expression

Example delete XML DML statement

DECLARE @x AS XML = '<root><row OrderID="43659" OrderDate="2001-07-01T00:00:00">
<PO>PO522145787<Customer>676</Customer></PO>
</row>
<row OrderID="43660" OrderDate="2001-07-01T00:00:00">
<PO>PO18850127500<Customer>117</Customer></PO>
</row>
<row OrderID="43661" OrderDate="2001-07-01T00:00:00">
<PO>PO18473189620<Customer>442</Customer></PO>
</row></root>';
SET @x.modify('insert <anothernode>SomeText</anothernode> as last
into (/root/row[@OrderID="43660"]/PO)[1]');
SET @x.modify('replace value of (//anothernode/text())[1] with "different text"');
SET @x.modify('delete (//anothernode)');
SELECT @x;

Example delete XML DML statement results

<root>
<row OrderID="43659"OrderDate="2001-07-01T00:00:00"
<PO>PO522145787<Customer>676</Customer></PO>
</row>
<row OrderID="43660"OrderDate="2001-07-01T00:00:00"
<PO>PO18850127500<Customer>117</Customer></PO>
</row>
<row OrderID="43661"OrderDate="2001-07-01T00:00:00"
<PO>PO18473189620<Customer>442</Customer></PO>
</row>
</root>

Advertisements
This entry was posted in SQL and tagged , , . Bookmark the permalink.

2 Responses to 70-451–MCITP–Design an XML Strategy–Design a Strategy to Query and Modify XML Data

  1. Pingback: MCITP 70-451 Links Page | Destination: Change

  2. Musa says:

    Thanx Eric for the Post it has help me a lot.What books can you suggest for Exam 451 & 450 Since Microsoft does not prescribe any books.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s