There are multiple ways to validate input, and this article will look at two of them: Document Type Definitions (DTD) and XML Schema (XSD).
A third option is Relax NG, which tries to find a middle ground between DTD's lack of expressiveness and XSD's Byzantine structure.
Before continuing, I want to add a third, non-standard term to describe XML documents: “correct.” A validator can only check the existence, ordering, and general content of an XML file; it's equivalent to the syntax check of a Java compiler.
I recommend always parsing with namespaces enabled, with one exception: in legacy code that uses XPath or XSLT.
As I describe elsewhere, XPath has its own hoops with regard to namespaces.
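To make the difference concrete, here is a minimal sketch that parses the same document with and without namespace awareness and reports what the DOM says about the root element. It assumes the JDK's bundled JAXP parser; the class name NamespaceAwareSketch, the helper rootNamespace, and the urn:example:ns URI are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of any library.
class NamespaceAwareSketch {

    // Parses the XML text and returns the namespace URI that the DOM
    // reports for the root element (null if it reports none).
    static String rootNamespace(String xml, boolean namespaceAware) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        // Namespace awareness is off by default, so it must be requested
        // on the factory before the parser is created.
        dbf.setNamespaceAware(namespaceAware);
        DocumentBuilder db = dbf.newDocumentBuilder();
        Element root = db.parse(new InputSource(new StringReader(xml)))
                         .getDocumentElement();
        return root.getNamespaceURI();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<foo xmlns='urn:example:ns'><bar/></foo>";
        System.out.println("namespace-aware:   " + rootNamespace(xml, true));
        System.out.println("namespace-unaware: " + rootNamespace(xml, false));
    }
}
```

With namespace awareness turned off, the xmlns declaration is treated as an ordinary attribute, so code that later asks for elements by namespace will not find them.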
Instead it read from an doesn't match the actual content.
And that brings up the second case: if you get XML from someone who doesn't know the rules.
A “valid” document, by comparison, is one whose structure and content correspond to some specification.
In the real world, this wouldn't be very useful (and gives no more information than the default); generally the first few messages will tell you why the parser went off the rails.
The XML specification requires that an XML document either have a prologue that specifies its encoding, or be encoded in UTF-8 or UTF-16. But in this example I used a Java String, which is UTF-16 encoded, without a prologue. That worked because the parser did not read the string directly.
Except for one small problem: the Namespace spec was introduced in 1999, while the DOM level 1 spec was released in 1998 and knew nothing of namespaces.
The JDK's XML API predated namespaces, and due to backwards compatibility you must explicitly tell it that you want namespace-aware parsing by calling setNamespaceAware(true) on the DocumentBuilderFactory, not the parser.
However, chances are good that you're not parsing simple literal strings, so read on …
The DOM API is filled with design patterns, especially creational patterns: package consists solely of interfaces), which can let a misbehaved program wreak havoc in a shared environment such as an app-server.
Let validation do what it can, but ultimately your program must explicitly verify that an XML file contains the correct data.
The Document Type Definition is part of the XML specification. A DTD describes the organization and content of an XML document in a form similar to Backus-Naur notation: a tree structure in which each element specifies the elements that it may contain (potentially none), and the order in which they must appear.
A DTD is attached to a document with a DOCTYPE declaration, which must appear before the first element in the document (but after the prologue!). The DOCTYPE may specify an embedded DTD, as in the example below, or it may reference an external DTD, as we'll see later.
Error Handler was not set, which is probably not what is desired.
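For reference, here is a sketch of DTD validation against an embedded DTD, with an explicit error handler installed so that validation problems are collected by the program rather than left to the parser's default handling. It assumes the JDK's bundled JAXP parser; the class name DtdValidationSketch, the validate helper, and the sample foo/bar documents are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical demo class, not part of any library.
class DtdValidationSketch {

    // Parses with DTD validation enabled and returns the warnings and
    // validation errors that the parser reported.
    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setValidating(true);   // DTD validation is off by default
        DocumentBuilder db = dbf.newDocumentBuilder();

        final List<String> problems = new ArrayList<>();
        db.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) {
                problems.add("warning: " + e.getMessage());
            }
            public void error(SAXParseException e) {
                // Validation failures arrive here; record and keep parsing.
                problems.add("error: " + e.getMessage());
            }
            public void fatalError(SAXParseException e) throws SAXParseException {
                // The document is not well-formed, so give up.
                throw e;
            }
        });

        db.parse(new InputSource(new StringReader(xml)));
        return problems;
    }

    public static void main(String[] args) throws Exception {
        // Embedded DTD: foo must contain exactly one bar, which holds text.
        String dtd = "<!DOCTYPE foo [\n"
                   + "<!ELEMENT foo (bar)>\n"
                   + "<!ELEMENT bar (#PCDATA)>\n"
                   + "]>\n";
        System.out.println(validate(dtd + "<foo><bar>baz</bar></foo>"));
        System.out.println(validate(dtd + "<foo><baz/></foo>"));
    }
}
```

Collecting the messages in a list is only one choice; throwing from error() would instead stop the parse at the first validation failure.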