xmlextract

A built-in function that applies the XML_query_expression to the xml_data_expression and returns the result. This function resembles a SQL substring operation.

Syntax

xmlextract_expression ::=
xmlextract (xml_query_expression,xml_data_expression
      [optional_parameters])
 xml_query_expression ::=basic_string_expression 
 xml_data_expression ::= general_string_expression
 optional_parameters ::=
     options_parameter
     | return_type
     | options_parameterreturn_type
 options_parameter ::= [,] option option_string
 returns_type ::= [,] returns {varchar[(integer)] | text | image }
 option_string ::= [,] basic_string_expression

Description

NoteSee Chapter 3, “XML Language and XML Query Language,”for the following topics:

option_string

The general format of the option_string is described in “option_strings: general format”.

The option supported for the xmlextract function is:

 xmlerror = {exception| null| message}

Exceptions

If the value of the xml_data_expression is not not valid XML, or is an all blank or empty string:

If the return_type of the xmlextract_expression is varchar and the runtime result of evaluating the xml_query_expression parameter is longer than the maximum length of a varchar, then an exception is raised:

 result exceeded maximum varchar length

Examples

The following examples use the sample_docs table described in Appendix B, “The sample_docs Example Table”.

This example selects the title of documents that have a bookstore/book/price of 55 or a bookstore/book/author/degree whose from attribute is “Harvard”.

select xmlextract('/bookstore/book[price=55        
   | author/degree/[@from="Harvard"] ]/title'
     text_doc )
from sample_docs
------------------------------------------------------ 
<title>History of Trenton</title> 
<title>Trenton Today, Trenton Tomorrow</title>   

NULL 

NULL 

The following example selects the row/pub_id elements of documents whose row elements either have a price element that is less than 10 or a city element equal to “Boston”. This query returns three rows:

select xmlextract('//row[price<10 | city="Boston" ]/pub_id',     
     text_doc) from sample_docs2>  
------------------------------------   
NULL            

<pub_id>0736</pub_id>

<pub_id>0736</pub_id>   
<pub_id>0877</pub_id>   
<pub_id>0736</pub_id>   
<pub_id>0736</pub_id>     

(3 rows affected)

The following example selects the price of “Seven Years in Trenton” as an integer. This query has a number of steps.

  1. To select the price of “Seven Years in Trenton” as an XML element:

    select xmlextract 
    ('/bookstore/book[title="Seven Years in Trenton"]/price',text_doc)
    from sample_docs
    where name_doc='bookstore' 
    --------------------------------------
    <price>12</price> 
    
  2. The following attempts to select the full price as an integer by adding a returns integer clause:

    select xmlextract  
         ('/bookstore/book[title="Seven Years in Trenton"]/price',
        text_doc returns integer)
        from sample_docs
        where name_doc='bookstore' 
    Msg 249, Level 16, State 1:
    Line 1:
    Syntax error during explicit conversion of VARCHAR value '<price>12</price>' to an INT field.
    
  3. To specify a returns clause with a numeric, money, or date-time datatype, the XML query must return value suitable for conversion to the specified datatype. The query must therefore use the text() reference to remove the XML tags:

    select xmlextract
      ('/bookstore/book[title="Seven Years in Trenton"]/price/text()',
      text_doc returns integer)
      from sample_docs
      where name_doc='bookstore'
    -----------
             12
    
  4. To specify a returns clause with a numeric, money, or date-time datatype, the XML query must also return a single value, not a list. For example, the following query returns a list of prices:

    select xmlextract
        ('/bookstore/book/price',   
        text_doc)
        from sample_docs
        where name_doc='bookstore'
    -----------
    <price>12</price>
    <price>55</price>
    <price intl="canada" exchange="0.7">6.50</price> 
    
  5. Adding the text() reference yields the following result:

    select xmlextract
    ('/bookstore/book/price/text()',
    text_doc)
    from sample_docs
    where name_doc='bookstore'
    ----------------------------- 
    12 55 6.50 
    
  6. Specifying the returns integer clause produces an exception, indicating that the combined values aren’t suitable for conversion to integer:

    select xmlextract
       ('/bookstore/book/price/text()',
       text_doc returns integer)
       from sample_docs
       where name_doc='bookstore'
    Msg 249, Level 16, State 1:
    Line 1:
    Syntax error during explicit conversion of VARCHAR value '12556.50' to an INT field.
    

To illustrate the xmlerror options, the following command inserts an invalid document into the sample_docs table:

insert into sample_docs (name_doc, text_doc) 
values ('invalid doc', '<a>unclosed element<a>)

(1 row affected)

In the following example, the xmlerror options determine the treatment of invalid XML documents by the xmlextract function:

The xmlerror option is used by xmlextract only when the xml_data_expression is a varchar or text expression, or an image expression whose value is an unparsed XML document. In other words, the xmlerror option doesn't apply to a document that is a parsed XML document or to a document returned by an explicit nested call by xmlparse.

For example, in the following xmlextract call, the xml_data_expression is an unparsed character-string document, so the xmlerror option applies to it. The document is invalid XML, so an exception is raised, and the xmlerror option indicates that the exception message should be returned as an XML document with the exception message:

select xmlextract('/',  '<a>A<a>'  option'xmlerror=message')
---------------------------------------------------
<xml_parse_error>The input ended before all started tags were ended.    
   Last tag started was 'a'</xml_parse_error> 

In the following xmlextract call, the xml_data_expression is returned by an explicit call by the xmlparse function (see section “xmlparse”). Therefore, the default xmlerror option of the explicit xmlparse call applies, rather than the xmlerror option of the outer xmlextract call. That default xmlerror option is exception, so the explicit xmlparse call raises an exception:

select xmlextract('/', xmlparse('<a>A<a>')
     option 'xmlerror=message'))
---------------------------------------------------
Msg 14702, Level 16, State 0:
Line 2:
XMLPARSE(): XML parser fatal error   
 <<The input ended before all started tags were ended.
 Last tag started was 'a'>> at line 1, offset 8.

To apply the xmlerror=message option to the explicit nested call of xmlparse, specify it as an option in that call:

select xmlextract('/',       
   xmlparse('<a>A<a>' option 'xmlerror=message'))
----------------------------------------------------
<xml_parse_error>The input ended before all started
tags were ended. Last tag started was
'a'</xml_parse_error> 

To summarize the treatment of the xmlerror option for unparsed XML documents and nested calls of xmlparse:

This command restores the sample_docs table to its original state:

delete from sample_docs
where na_doc=’invalid doc’