openxml system procedure

Generates a result set from an XML document.

Syntax
openxml( xml-data,
 xpath [, flags [, namespaces ] ] )
WITH ( column-name column-type [ xpath ],... )
Arguments
  • xml_data   The XML on which the result set is based. This can be any string expression, such as a constant, variable, or column.

  • xpath   A string containing an XPath query. XPath allows you to specify patterns that describe the structure of the XML document you are querying. The XPath pattern included in this argument selects the nodes from the XML document. Each node that matches the XPath query in the second xpath argument generates one row in the table.

    Metaproperties can only be specified in WITH clause xpath arguments. A metaproperty is accessed within an XPath query as if it was an attribute. If a namespaces is not specified, then by default the prefix mp is bound to the Uniform Resource Identifier (URI) urn:ianywhere-com:sa-xpath-metaprop. If a namespaces is specified, this URI must be bound to mp or some other prefix to access metaproperties in the query. Metaproperty names are case sensitive. The openxml statement supports the following metaproperties:

    • @mp:id   returns an ID for a node that is unique within the XML document. The ID for a given node in a given document may change if the database server is restarted. The value of this metaproperty increases with document order.

    • @mp:localname   returns the local part of the node's name, or NULL if the node does not have a name.

    • @mp:prefix   returns the prefix part of the node's name, or NULL if the node does not have a name or if the name is not prefixed.

    • @mp:namespaceuri   returns the URI of the namespace that the node belongs to, or NULL if the node is not in a namespace.

    • @mp:xmltext   returns a subtree of the XML document in XML form. For example, when you match an internal node, you can use this metaproperty to return an XML string, rather than the concatenated values of the descendant text nodes.

  • flags   Indicates the mapping that should be used between the XML data and the result set when an XPath query is not specified in the WITH clause. If the flags parameter is not specified, the default behavior is to map attributes to columns in the result set. The flags parameter can have one of the following values:

    Value Description
    1 XML attributes are mapped to columns in the result set (the default).
    2 XML elements are mapped to columns in the result set.

  • namespace-declaration   An XML document. The in-scope namespaces for the query are taken from the root element of the document. If namespaces are specified, then you must include a flags argument, even if all the xpath arguments are specified.

  • WITH clause   Specifies the schema of the result set and how the value is found for each column in the result set. WITH clause xpath arguments are matched relative to the matches for the xpath in the second argument. If a WITH clause expression matches more than one node, then only the first node in the document order is used. If the node is not a text node, then the result is found by appending all the text node descendants. If a WITH clause expression does not match any nodes, then the column for that row is NULL.

    The openxml WITH clause syntax is similar to the syntax for selecting from a stored procedure.

    For information about selecting from a stored procedure, see FROM clause.

  • column-name   The name of the column in the result set.

  • column-type   The data type of the column in the result set. The data type must be compatible with the values selected from the XML document. See SQL data types.

Usage

The openxml system procedure parses the xml_data and models the result as a tree. The tree contains a separate node for each element, attribute, and text node, or other XML construct. The XPath queries supplied to the openxml system procedure are used to select nodes from the tree, and the selected nodes are then mapped to the result set.

The XML parser used by the openxml system procedure is non-validating, and does not read the external DTD subset or external parameter entities.

When there are multiple matches for a column expression, the first match in the document order (the order of the original XML document before it was parsed) is used. NULL is returned if there are no matching nodes. When an internal node is selected, the result is all the descendant text nodes of the internal node concatenated together.

Columns of type BINARY, LONG BINARY, IMAGE, and VARBINARY are assumed to be in base64-encoded format and are decoded automatically. If you generate XML using the FOR XML clause, these types are base64-encoded, and can be decoded using the openxml system procedure. See FOR XML and binary data.

The openxml system procedure supports a subset of the XPath syntax, as follows:

  • The child, self, attribute, descendant, descendant-or-self, and parent axes are fully supported.

  • Both abbreviated and unabbreviated syntax can be used for all supported features. For example, 'a' is equivalent to 'child::a' and '..' is equivalent to 'parent::node()'.

  • Name tests can use wildcards. For example, 'a/*/b'.

  • The following kind tests are supported: node(), text(), processing-instruction(), and comment().

  • Qualifiers of the form expr1[expr2] and expr1[expr2="string" ] can be used, where expr2 is any supported XPath expression. A qualifier evaluates TRUE if expr2 matches one or more nodes. For example, 'a[b]' finds a nodes that have at least one b child, and a[b="I"] finds a nodes that have at least one b child with a text value of I.

See also
Example

The following query generates a result set from the XML document supplied as the first argument to the openxml system procedure:

SELECT * FROM openxml( '<products>
                 <ProductType ID="301">Tee Shirt</ProductType>
                 <ProductType ID="401">Baseball Cap</ProductType>
                 </products>',
                 '/products/ProductType' )
WITH ( ProductName LONG VARCHAR 'text()', ProductID CHAR(3) '@ID');

This query generates the following result:

ProductName ProductID
Tee Shirt 301
Baseball Cap 401

In the following example, the first <ProductType> element contains an entity. When you execute the query, this node is parsed as an element with four children: Tee, &amp;, Sweater, and Set. You can use . to concatenate the children together in the result set.

SELECT * FROM openxml( '<products>
                 <ProductType ID="301">Tee &amp; Sweater Set</ProductType>
                 <ProductType ID="401">Baseball Cap</ProductType>
                 </products>',
                 '/products/ProductType' )
WITH ( ProductName LONG VARCHAR '.', ProductID CHAR(3) '@ID');

This query generates the following result:

ProductName ProductID
Tee Shirt & Sweater Set 301
Baseball Cap 401

The following query uses an equality predicate to generate a result set from the supplied XML document.

SELECT * FROM openxml('<EmployeeDirectory>
   <Employee>
      <column name="EmployeeID">105</column>
      <column name="GivenName">Matthew</column>
      <column name="Surname">Cobb</column>
      <column name="Street">7 Pleasant Street</column>
      <column name="City">Grimsby</column>
      <column name="State">UT</column>
      <column name="PostalCode">02154</column>
      <column name="Phone">6175553840</column>
   </Employee>
  <Employee>
      <column name="EmployeeID">148</column>
      <column name="GivenName">Julie</column>
      <column name="Surname">Jordan</column>
      <column name="Street">1244 Great Plain Avenue</column>
      <column name="City">Woodbridge</column>
      <column name="State">AZ</column>
      <column name="PostalCode">01890</column>
      <column name="Phone">6175557835</column>
   </Employee>
  <Employee>
      <column name="EmployeeID">160</column>
      <column name="GivenName">Robert</column>
      <column name="Surname">Breault</column>
      <column name="Street">358 Cherry Street</column>
      <column name="City">Milton</column>
      <column name="State">PA</column>
      <column name="PostalCode">02186</column>
      <column name="Phone">6175553099</column>
   </Employee>
  <Employee>
      <column name="EmployeeID">243</column>
      <column name="GivenName">Natasha</column>
      <column name="Surname">Shishov</column>
      <column name="Street">151 Milk Street</column>
      <column name="City">Grimsby</column>
      <column name="State">UT</column>
      <column name="PostalCode">02154</column>
      <column name="Phone">6175552755</column>
   </Employee> 
</EmployeeDirectory>', '/EmployeeDirectory/Employee')
WITH ( EmployeeID INT 'column[@name="EmployeeID"]',
       GivenName    CHAR(20) 'column[@name="GivenName"]',
       Surname      CHAR(20) 'column[@name="Surname"]',
       PhoneNumber  CHAR(10) 'column[@name="Phone"]');

This query generates the following result set:

EmployeeID GivenName Surname PhoneNumber
105 Matthew Cobb 6175553840
148 Julie Jordan 6175557835
160 Robert Breault 6175553099
243 Natasha Shishov 6175552755

The following query uses the XPath @attribute expression to generate a result set:

SELECT * FROM openxml( '<Employee
      EmployeeID="105"
      GivenName="Matthew"
      Surname="Cobb"
      Street="7 Pleasant Street"
      City="Grimsby"
      State="UT"
      PostalCode="02154"
      Phone="6175553840"
/>', '/Employee' )
WITH ( EmployeeID INT '@EmployeeID',
       GivenName    CHAR(20) '@GivenName',
       Surname      CHAR(20) '@Surname',
       PhoneNumber  CHAR(10) '@Phone');

The following query operates on an XML document like the one used in the above query, except that an XML namespace has been introduced. It demonstrates the use of wildcards in the name test for the XPath query, and generates the same result set as the above query.

SELECT * FROM openxml( '<Employee  xmlns="http://www.iAnywhere.com/EmployeeDemo"
      EmployeeID="105"
      GivenName="Matthew"
      Surname="Cobb"
      Street="7 Pleasant Street"
      City="Grimsby"
      State="UT"
      PostalCode="02154"
      Phone="6175553840"
/>', '/*:Employee' )

WITH ( EmployeeID INT '@EmployeeID',
       GivenName    CHAR(20) '@GivenName',
       Surname      CHAR(20) '@Surname',
       PhoneNumber  CHAR(10) '@Phone');

Alternatively, you could specify a namespace declaration:

SELECT * FROM openxml( '<Employee  xmlns="http://www.iAnywhere.com/EmployeeDemo"
      EmployeeID="105"
      GivenName="Matthew"
      Surname="Cobb"
      Street="7 Pleasant Street"
      City="Grimsby"
      State="UT"
      PostalCode="02154"
      Phone="6175553840"
/>', '/prefix:Employee', 1, '<r xmlns:prefix="http://www.iAnywhere.com/EmployeeDemo"/>' )
WITH ( EmployeeID INT '@EmployeeID',
       GivenName    CHAR(20) '@GivenName',
       Surname      CHAR(20) '@Surname',
       PhoneNumber  CHAR(10) '@Phone');

For more examples of using the openxml system procedure, see Importing XML using openxml.