|
|
OLE DB Provider for XMLWhile researching how I could generate data reports for a project, I found myself needing access to XML with a regular database interface. After looking for a solution on the Internet, I found to my surprise that I had only two options:
An OLE DB Provider is a database driver, which uses Microsoft COM technology.
The more famous layer, ADO, is built on top of OLE DB and leverages an
"easy access to databases for Visual Basic programmers"-layer.
The XML ProviderThis Provider allows any OLE-DB / ADO application to access data in tabular and hierarchical formats. Using standard structured query language (SQL) to query and retrieve XML data.The Provider is built by using existing XML technologies. Specifically the Microsoft XML Parser 4.0 is required. Performance was never a consideration for this project; simplicity was. SQL conformanceThe Provider supports a minimum of SQL, enough to allow searches and filtering of columns. This basically means that you can use a simple standard SQL query such as...
The Provider deduces the table relations itself, so no additional information is required to describe how SQL relations are structured in the document (no additional schema information required).
It is possible to use
The Provider is a read-only provider. It does not support
TablesXML excels at hierarchial structuring. SQL on the other hand tends to look at relations. It is not easy to mix the two or create a query language that seems natural for the other.
The XML Provider looks for repeating nodes at the same hierarchy level and
treats them as tables. Meaning, all nodes that contain child nodes are potential
tables.
Here is a sample:
In this XML document, the "Person" table contains the set of
Person nodes. Each record row in this table contains columns, such as
"familyName", "firstName" and "Description".
The Address and JobInfo nodes are not part of the table,
because they do not contain a text node.
There are several other tables, which can be queried in this sample. They are "AddressBook", "Address" and "JobInfo". Table HintsXML documents are set up in hierarchical structure. It may be nessecary to indicate which hierarchy level represents a table. For this purpose, you can use table hints.The syntax is:
Here is a sample.
This finds all persons, but selects only those with an address in California.
Of course, this is the same as writing...
But hints are still useful when you need to navigate to parent nodes, or
specific grand-children nodes.
ColumnsAll immediate child nodes in a table that do not themselves contain child nodes are cadidates as selectable table columns. All node attributes are included as columns as well.Remember that columns names are case-sensitive and that attribute columns are not referenced with their @-character prefix. If your query contains multiple relations (tables) you may reference columns in different tables by prefixing with the table-name. This is standard SQL stuff.
Special ColumnsA special column is available for all tables. The rowid column is automatically included and represents a unique identifier for each row. Sort of the same as a unique key in a relational database. The key is generated using the generate-id() XPath method.Data typesThe Provider data source can read any valid XML - including raw XML. Raw XML is XML without schema information. For this kind of XML, all columns are treated as string columns. If schema information is present the Provider will load it and attempt to do mapping between the most ordinary data types. This includes integers and numeric values.String columns are truncated to 120 characters! This is configurable through the Providers properties as described later. Joining tablesOnly simple table joins are supported.The provider supports two types of joins. The first is a little non-standard.
This is not meaningful SQL, but the Provider accepts this and assumes that
the Address table is located below the previous table,
the Person table in this sample.
It then does a simple table outer join and
includes all addresses of the person in the result set.
Use a WHERE clause to filter records.
The second join syntax looks more like the standard SQL syntax.
When the provider stumbles over this syntax, it assumes that the
Address is not necessarily placed directly below
the Person table and scans the entire XML document for
Address nodes.
The Provider always starts from the left-most table (as defined in the
Because of the nesting of table access when you don't provide a unique ID for the join, you may need a table hint when you join more than 2 tables:
SortingSorting is possible by using theORDER BY SQL clause.
However, only columns from the first table can be included in the
sort list.
Reserved KeywordThe following SQL keywords are reserved as they are used by the SQL engine.SELECT UPDATE DELETE FROM WHERE ORDER BY SET AND ORKeywords are case-insensitive. Do not use these names for XML nodes. Using the ProviderConfiguration propertiesAny OLE DB Provider can be configured by setting a number of properties on the data source object. The XML Provider defines a few of its own values:
XML Root -
sets the default root to query for the first table.
Use this to minimize the initial search.
Max String Length -
All string columns are truncated to this size.
Validate on Parse -
Validate document when loaded. Default is True.
Resolve Externals -
Indicates whether external definitions, resolvable namespaces,
document type definition (DTD) external subsets, and
external entity references, are to be resolved at load time.
Default is True.
Protect Missing Tables -
Indicates whether unknown tables in a query causes a failure
or the Provider to ignore/skip the table.
Default is False (failure).
Accessing the dataThe following JavaScript sample opens an XML document and reads thefirstName field from the table.
The use of adUseClient is mandatory because ADO behaves
strangely when it tries to access it using a server cursor.
Accessing the Provider from C++ is similar, and the standard ATL Consumer templates can be used. Note that the Provider's ProgID is XmlOleDb.XML. The provider also works with the Data Bound controls in Visual Basic 6. You need to use the ADO Data Source as the data source control. A final note...This provider has not been thoroughly tested. It is not unlikely that you will stumble into problems and get all sorts of strange errors. Even worse, if you hit a bug the driver will probably crumble and Dr. Watson takes over the conversation... You have been warned.And please don't ask me to publish the source code for this component. The source code for the XML OLEDB Provider will not be released. Installation RequirementsMicrosoft XML Parser 4.0Installation Guide
Download Files
|
![]() | OLE DB Provider for XML (170 Kb) |