Design XML storage.
This objective may include but is not limited to: determining when to use XML for storage, untyped vs. typed (XML schema collection)
It seems to me that this section wants us to look into why it is that we might want to store data as XML in the database. This is a separate issue than why we might query XML or when we might attempt to retrieve data as XML. If we have determined that XML is the correct form of storage, we now should consider whether or not we would like to store this as typed (Bound by a well defined Schema Collection) versus untyped (xml which is not defined by a schema) xml .
For all of the sections on XML I have watched the XML MCM Video, and read Chapter 13: XML and XQuery of Inside Microsoft SQL Server 2008: T-SQL Programming (Itzik Ben Gan et al, Chapter by Dejan Sarka).
Determining when to use XML for storage
The article Understanding XML in SQL Server gives several examples of why you might use XML in SQL Server. The first and most important (at least to me) reason for using xml would be that it is a platform independent storage model. Xml can be used as a structure to communicate between disparate systems or systems that might communicate different structures depending on workflow. XML is extremely powerful in it’s ability to cope with dynamic or sparse data structures. A few prime examples of this behavior can be seen in the query execution plans, service broker and extended events modules of sql server. XML also handles hierarchies and recursive hierarchies well. The article referenced above also mentions two other points, when the order is inherent in the data and when you might want to query into the data or make modifications based on its’ structure. I suppose that both of these are valid reasons, although I must say that it is slightly foreign to me to think in those terms. I would instead model the order or mitigating flag(s) into the data model.
Of course, even if you have decided to use XML in SQL Server you still have to decide if using the native xml data type is the correct option. You can instead use an Annotated XML Schema (AXSD) to map between XML and the relational engine without physically storing XML, or the FOR XML syntax to return relational data as XML (as discussed in an upcoming post).
Untyped vs. Type (XML Schema Collection)
With typed xml you have an XML Schema against which the xml is validated and data type information for the elements and attributes. The XML Schema will need to be defined in the database in which the xml data will be defined. Note that the default “facet” for typed xml is content, which allows for xml fragments (multiple top level nodes, or text data in the top-level elements). This can be controlled by defining the facet as a document.
Untyped XML data types are all xml data types which do not have a schema associated with them. However, because they still adhere to the well-formed XML version 1.0 documents, the xml data type will still reject data that is not well formed. Note that, like typed xml, you can store xml fragments (i.e. multiple top level nodes). However, unlike the typed xml, this behavior is not modifiable.
Untyped vs Typed
There are several guidelines and arguments which relate to when to use Typed XML Compared to Untyped XML. You might like to store the xml as untyped if you either do not have the schema for the xml data, you believe that the structure will be dynamic in nature and modified often, or if you don’t want the server to validate the data (for instance, if the data might be temporarily stored in an invalid state). Other than these states, however, you probably want to use the typed xml as it provides data validation and allows optimization based on the type information for storage, query and compilation.