The solution

So, how might a solution be approached? The World Wide Web Consortium’s (W3C) XML specification does define the rules that an XML document must follow to be considered “well-formed”. But two well-formed purchase orders can look very different from each other. Even if they share the same concepts, such as a reference number and customer information, XML’s flexibility means that two companies would be hard-pressed to exchange these documents reliably.
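To make the problem concrete, here is a minimal sketch using the JDK's built-in DOM parser. The two purchase orders, their element names, and the class name WellFormedDemo are all hypothetical, invented only for illustration. Both documents parse without error because both are well-formed, yet they have nothing structural in common.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class WellFormedDemo {
    // Hypothetical purchase order from company A: reference number as an attribute.
    static final String ORDER_A =
        "<purchaseOrder ref=\"1001\"><customer>Acme</customer></purchaseOrder>";

    // Hypothetical purchase order from company B: same concepts, different layout.
    static final String ORDER_B =
        "<po><reference>1001</reference><buyer name=\"Acme\"/></po>";

    // Parses the XML (throwing an exception if it is not well-formed)
    // and returns the name of its root element.
    static String rootName(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getNodeName();
    }

    public static void main(String[] args) throws Exception {
        // Both calls succeed, but the receiving code cannot rely on a shared structure.
        System.out.println(rootName(ORDER_A));
        System.out.println(rootName(ORDER_B));
    }
}
```

A program written to read the ref attribute of a purchaseOrder element would find nothing useful in the second document, even though a parser accepts both.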

Fortunately, the creators of XML recognized the need to allow document structures to be more finely constrained so that multiple parties could interpret them consistently. Generically speaking, the set of rules that describes a document’s components, structure, and content is called a schema. Schemas can be written in many languages, but the two most important schema languages for constraining XML are Document Type Definitions (DTDs) and XML Schema. Documents that conform to a schema are said to be “valid”, not just “well-formed”. Therefore, developers who want to use Java objects to represent the data in XML documents should model their Java classes on whatever schema constrains the XML document.
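The difference between “valid” and “well-formed” can be sketched with the standard javax.xml.validation API. The schema below is a hypothetical, deliberately tiny XML Schema for a purchase order, and the class name SchemaValidationDemo and the sample documents are likewise assumptions made for this example, not part of any real exchange format.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class SchemaValidationDemo {
    // Hypothetical schema: a purchaseOrder has a required integer "ref"
    // attribute and exactly one "customer" child element.
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"purchaseOrder\">"
      + "<xs:complexType><xs:sequence>"
      + "<xs:element name=\"customer\" type=\"xs:string\"/>"
      + "</xs:sequence>"
      + "<xs:attribute name=\"ref\" type=\"xs:int\" use=\"required\"/>"
      + "</xs:complexType></xs:element></xs:schema>";

    // Returns true if the document conforms to XSD, false if it violates it.
    static boolean isValid(String xml) throws SAXException, IOException {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        try {
            // With no ErrorHandler set, validation errors are thrown as SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String conforming =
            "<purchaseOrder ref=\"1001\"><customer>Acme</customer></purchaseOrder>";
        String wellFormedOnly = "<po><reference>1001</reference></po>";
        System.out.println(isValid(conforming));
        System.out.println(isValid(wellFormedOnly));
    }
}
```

Under this schema, a Java class modeled on the document would plausibly carry an int field for ref and a String field for customer, which is the kind of correspondence the paragraph above describes.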