A built-in function that applies the XML_query_expression to the xml_data_expression and returns the result. This function resembles a SQL substring operation.
xmlextract_expression ::= xmlextract (xml_query_expression,xml_data_expression [optional_parameters]) xml_query_expression ::=basic_string_expression xml_data_expression ::= general_string_expression optional_parameters ::= options_parameter | return_type | options_parameterreturn_type options_parameter ::= [,] option option_string returns_type ::= [,] returns {varchar[(integer)] | text | image } option_string ::= [,] basic_string_expression
A basic_string_expression is a sql_query_expression whose datatype is character, varchar, or java.lang.String.
A general_string_expression is a sql_query_expression whose datatype is text, image, character, varchar, or java.lang.String.
An xmlextract expression can be used in SQL language wherever a character expression is allowed.
The default options_parameter is an empty string. A null options parameter is treated as an empty string.
If the xml_query_expression of an xmlextract call is null, then the result of the xmlextract call is null.
The value of the xml_data_expression parameter is the runtime context for execution of the XML query expression.
The datatype of an xmlextract call is specified by the returns_type.
The default returns_type is text.
If a varchar specified as the returns_type without an integer, the default is 30.
If numeric or decimal is specified without a precision (the first integer), the default is 18. If it is specified without a scale (the second integer), the default is 0.
The initial result of an xmlextract call is the result of applying the xml_query_expression to the xml_data_expression. That result is specified by the XPath standard.
If the returns_type specifies varcharimage, text, or image, the initial result value is returned as a character-string document of that datatype.
If the returns_type specifies a numeric, money, or date-time datatype, the initial result is converted to that datatype and returned. The conversion follows the rules specified for the convert built-in function.
The initial result must be a value suitable for the convert built-in function. This requires using the text() reference in the XML query expression. See the examples following.
See Chapter 3, “XML Language and XML Query Language,”for the following topics:
Restrictions on external URL references, XML namespaces, and XML schemas.
Treatment of predefined entities and their corresponding characters: & (&), < (,), > (>), "e; (“), and ' (;). Be careful to include the semicolon as part of the entity.
Treatment of whitspace.
Treatment of empty elements.
The general format of the option_string is described in “option_strings: general format”.
The option supported for the xmlextract function is:
xmlerror = {exception| null| message}
If the value of the xml_data_expression is not not valid XML, or is an all blank or empty string:
If the explicit or default option specifies that xmlerror=exception, an exception is raised:
invalid XML data
If the explicit or default option specifies xmlerror=null a null value is returned.
If the explicit or default options specifies xmlerror=message, a character string containing an XML element, which contains the exception message, is returned. This value is valid XML.
If the return_type of the xmlextract_expression is varchar and the runtime result of evaluating the xml_query_expression parameter is longer than the maximum length of a varchar, then an exception is raised:
result exceeded maximum varchar length
The following examples use the sample_docs table described in Appendix B, “The sample_docs Example Table”.
This example selects the title of documents that have a bookstore/book/price of 55 or a bookstore/book/author/degree whose from attribute is “Harvard”.
select xmlextract('/bookstore/book[price=55 | author/degree/[@from="Harvard"] ]/title' text_doc ) from sample_docs ------------------------------------------------------ <title>History of Trenton</title> <title>Trenton Today, Trenton Tomorrow</title> NULL NULL
The following example selects the row/pub_id elements of documents whose row elements either have a price element that is less than 10 or a city element equal to “Boston”. This query returns three rows:
A null value from the bookstore row
A single “<row>...</row>” element from the publishers row
4 “<row>...</row>” elements from the titles row
select xmlextract('//row[price<10 | city="Boston" ]/pub_id', text_doc) from sample_docs2> ------------------------------------ NULL <pub_id>0736</pub_id> <pub_id>0736</pub_id> <pub_id>0877</pub_id> <pub_id>0736</pub_id> <pub_id>0736</pub_id> (3 rows affected)
The following example selects the price of “Seven Years in Trenton” as an integer. This query has a number of steps.
To select the price of “Seven Years in Trenton” as an XML element:
select xmlextract ('/bookstore/book[title="Seven Years in Trenton"]/price',text_doc) from sample_docs where name_doc='bookstore' -------------------------------------- <price>12</price>
The following attempts to select the full price as an integer by adding a returns integer clause:
select xmlextract ('/bookstore/book[title="Seven Years in Trenton"]/price', text_doc returns integer) from sample_docs where name_doc='bookstore' Msg 249, Level 16, State 1: Line 1: Syntax error during explicit conversion of VARCHAR value '<price>12</price>' to an INT field.
To specify a returns clause with a numeric, money, or date-time datatype, the XML query must return value suitable for conversion to the specified datatype. The query must therefore use the text() reference to remove the XML tags:
select xmlextract ('/bookstore/book[title="Seven Years in Trenton"]/price/text()', text_doc returns integer) from sample_docs where name_doc='bookstore' ----------- 12
To specify a returns clause with a numeric, money, or date-time datatype, the XML query must also return a single value, not a list. For example, the following query returns a list of prices:
select xmlextract ('/bookstore/book/price', text_doc) from sample_docs where name_doc='bookstore' ----------- <price>12</price> <price>55</price> <price intl="canada" exchange="0.7">6.50</price>
Adding the text() reference yields the following result:
select xmlextract ('/bookstore/book/price/text()', text_doc) from sample_docs where name_doc='bookstore' ----------------------------- 12 55 6.50
Specifying the returns integer clause produces an exception, indicating that the combined values aren’t suitable for conversion to integer:
select xmlextract ('/bookstore/book/price/text()', text_doc returns integer) from sample_docs where name_doc='bookstore' Msg 249, Level 16, State 1: Line 1: Syntax error during explicit conversion of VARCHAR value '12556.50' to an INT field.
To illustrate the xmlerror options, the following command inserts an invalid document into the sample_docs table:
insert into sample_docs (name_doc, text_doc) values ('invalid doc', '<a>unclosed element<a>) (1 row affected)
In the following example, the xmlerror options determine the treatment of invalid XML documents by the xmlextract function:
If xmlerror=exception (this is the default), an exception is raised:
select xmlextract('//row', text_doc option 'xmlerror=exception') from sample_docs Msg 14702, Level 16, State 0: Line 2: XMLPARSE(): XML parser fatal error <<The input ended before all started tags were ended. Last tag started was 'a'>> at line 1, offset 23.
If xmlerror=null, a null value is returned:
select xmlextract('//row', test_doc option 'xmlerror=null') from sample_docs (0 rows affected)
If xmlerror=message, a parsed XML document with an error message will be returned:
select xmlextract('//row', test_doc option 'xmlerror=message') from sample_docs ----------------------------------
<xml_parse_error>The input ended before all startedtags were ended. Last tag started was 'a'</xml_parse_error>
The xmlerror option is used by xmlextract only when the xml_data_expression is a varchar or text expression, or an image expression whose value is an unparsed XML document. In other words, the xmlerror option doesn't apply to a document that is a parsed XML document or to a document returned by an explicit nested call by xmlparse.
For example, in the following xmlextract call, the xml_data_expression is an unparsed character-string document, so the xmlerror option applies to it. The document is invalid XML, so an exception is raised, and the xmlerror option indicates that the exception message should be returned as an XML document with the exception message:
select xmlextract('/', '<a>A<a>' option'xmlerror=message') --------------------------------------------------- <xml_parse_error>The input ended before all started tags were ended. Last tag started was 'a'</xml_parse_error>
In the following xmlextract call, the xml_data_expression is returned by an explicit call by the xmlparse function (see section “xmlparse”). Therefore, the default xmlerror option of the explicit xmlparse call applies, rather than the xmlerror option of the outer xmlextract call. That default xmlerror option is exception, so the explicit xmlparse call raises an exception:
select xmlextract('/', xmlparse('<a>A<a>') option 'xmlerror=message')) --------------------------------------------------- Msg 14702, Level 16, State 0: Line 2: XMLPARSE(): XML parser fatal error <<The input ended before all started tags were ended. Last tag started was 'a'>> at line 1, offset 8.
To apply the xmlerror=message option to the explicit nested call of xmlparse, specify it as an option in that call:
select xmlextract('/', xmlparse('<a>A<a>' option 'xmlerror=message')) ---------------------------------------------------- <xml_parse_error>The input ended before all started tags were ended. Last tag started was 'a'</xml_parse_error>
To summarize the treatment of the xmlerror option for unparsed XML documents and nested calls of xmlparse:
The xmlerror option is used by xmlextract only when the document operand is an unparsed document.
When the document operand is an explicit xmlparse call, the implicit or explicit xmlerror option of that call overrides the implicit or explicit xmlerror option of the xmlextract.
This command restores the sample_docs table to its original state:
delete from sample_docs where na_doc=’invalid doc’