This package provides the core SAX APIs. Some SAX1 APIs are deprecated to encourage integration of namespace-awareness into designs of new applications and into maintenance of existing infrastructure.

See http://www.saxproject.org for more information about SAX.

SAX2 Standard Feature Flags

One of the essential characteristics of SAX2 is that it added feature flags which can be used to examine and perhaps modify parser modes, in particular modes such as validation. Since features are identified by (absolute) URIs, anyone can define such features. Currently defined standard feature URIs have the prefix http://xml.org/sax/features/ before an identifier such as validation. Turn features on or off using setFeature. Those standard identifiers are:

Feature ID Default Description
external-general-entities unspecified Reports whether this parser processes external general entities; always true if validating
external-parameter-entities unspecified Reports whether this parser processes external parameter entities; always true if validating
is-standalone none May be examined only during a parse, after the startDocument() callback has been completed; read-only. The value is true if the document specified the "standalone" flag in its XML declaration, and otherwise is false.
lexical-handler/parameter-entities unspecified true indicates that the LexicalHandler will report the beginning and end of parameter entities
namespaces true true indicates namespace URIs and unprefixed local names for element and attribute names will be available
namespace-prefixes false true indicates XML qualified names (with prefixes) and attributes (including xmlns* attributes) will be available
resolve-dtd-uris true A value of "true" indicates that system IDs in declarations will be absolutized (relative to their base URIs) before reporting. (That is the default behavior for all SAX2 XML parsers.) A value of "false" indicates those IDs will not be absolutized; parsers will provide the base URI from Locator.getSystemId(). This applies to system IDs passed in
  • DTDHandler.notationDecl(),
  • DTDHandler.unparsedEntityDecl(), and
  • DeclHandler.externalEntityDecl().
It does not apply to EntityResolver.resolveEntity(), which is not used to report declarations, or to LexicalHandler.startDTD(), which already provides the non-absolutized URI.
string-interning unspecified true if all XML names (for elements, prefixes, attributes, entities, notations, and local names), as well as Namespace URIs, will have been interned using java.lang.String.intern. This supports fast testing of equality/inequality against string constants, rather than forcing slower calls to String.equals().
use-attributes2 unspecified Returns true if the Attributes objects passed by this parser in ContentHandler.startElement() implement the org.xml.sax.ext.Attributes2 interface. That interface exposes additional DTD-related information, such as whether the attribute was specified in the source text rather than defaulted.
use-locator2 unspecified Returns true if the Locator objects passed by this parser in ContentHandler.setDocumentLocator() implement the org.xml.sax.ext.Locator2 interface. That interface exposes additional entity information, such as the character encoding and XML version used.
use-entity-resolver2 true (when recognized) Returns true if, when setEntityResolver is given an object implementing the org.xml.sax.ext.EntityResolver2 interface, those new methods will be used. Returns false to indicate that those methods will not be used.
validation unspecified Controls whether the parser is reporting all validity errors; if true, all external entities will be read.
xmlns-uris false Controls whether, when the namespace-prefixes feature is set, the parser treats namespace declaration attributes as being in the http://www.w3.org/2000/xmlns/ namespace. By default, SAX2 conforms to the original "Namespaces in XML" Recommendation, which explicitly states that such attributes are not in any namespace. Setting this optional flag to true makes the SAX2 events conform to a later backwards-incompatible revision of that recommendation, placing those attributes in a namespace.

Support for the default values of the namespaces and namespace-prefixes features is required. Support for any other feature flags is entirely optional.
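As a rough illustration of what the namespaces and namespace-prefixes flags change, the sketch below parses the same one-element document twice and records what startElement() receives. The class name NamespaceModes and the sample document are made up for this example, and the reader is obtained through JAXP's SAXParserFactory, which is an assumption about the environment rather than something this package requires.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of SAX.
final class NamespaceModes extends DefaultHandler {
    static final String FEATURES = "http://xml.org/sax/features/";

    // Summary of the last startElement() call.
    String seen = "";

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        seen = "uri=" + uri + " local=" + localName + " attributes=" + atts.getLength();
    }

    // Parses xml with namespaces=true and the requested namespace-prefixes setting.
    static String parse(String xml, boolean namespacePrefixes) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setFeature(FEATURES + "namespaces", true);
            reader.setFeature(FEATURES + "namespace-prefixes", namespacePrefixes);
            NamespaceModes handler = new NamespaceModes();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.seen;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<p:a xmlns:p='urn:x' id='1'/>";
        System.out.println(parse(xml, false)); // xmlns:p is not reported as an attribute
        System.out.println(parse(xml, true));  // xmlns:p is reported alongside id
    }
}
```

In both runs the namespace URI and local name are available; only the second run should also expose the xmlns:p declaration in the Attributes object.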

For default values not specified by SAX2, each XMLReader implementation specifies its default, or may choose not to expose the feature flag. Unless otherwise specified here, implementations may support changing current values of these standard feature flags, but not while parsing.
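Because most of these flags are optional, application code usually has to probe for them. The following is a minimal sketch of that pattern, using getFeature() and setFeature() and treating SAXNotRecognizedException and SAXNotSupportedException as ordinary outcomes. The class FeatureProbe and its method names are illustrative only, and the reader again comes from JAXP as an assumption.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

// Hypothetical helper; not part of SAX.
final class FeatureProbe {
    static final String FEATURES = "http://xml.org/sax/features/";

    // Creates a namespace-aware reader through JAXP (an assumption of this sketch).
    static XMLReader newReader() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newSAXParser().getXMLReader();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns "true", "false", "not recognized", or "not supported" for a standard feature ID.
    static String query(XMLReader reader, String id) {
        try {
            return String.valueOf(reader.getFeature(FEATURES + id));
        } catch (SAXNotRecognizedException e) {
            return "not recognized";
        } catch (SAXNotSupportedException e) {
            return "not supported";
        }
    }

    // Attempts to change a flag; returns false if the reader refuses.
    static boolean trySet(XMLReader reader, String id, boolean value) {
        try {
            reader.setFeature(FEATURES + id, value);
            return true;
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        XMLReader reader = newReader();
        String[] ids = {"namespaces", "namespace-prefixes", "validation", "string-interning"};
        for (String id : ids) {
            System.out.println(id + " = " + query(reader, id));
        }
        // Flags are changed before parse() is called, never during a parse.
        System.out.println("set namespace-prefixes: " + trySet(reader, "namespace-prefixes", true));
    }
}
```

The values printed for the unspecified defaults depend on the XMLReader implementation in use.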

SAX2 Standard Handler and Property IDs

For parser interface characteristics that are described as objects, a separate namespace is defined. The objects in this namespace are again identified by URI, and the standard property URIs have the prefix http://xml.org/sax/properties/ before an identifier such as lexical-handler or dom-node. Manage those properties using setProperty(). Those identifiers are:

Property ID Description
declaration-handler Used to see most DTD declarations except those treated as lexical ("document element name is ...") or which are mandatory for all SAX parsers (DTDHandler). The Object must implement org.xml.sax.ext.DeclHandler.
dom-node For "DOM Walker" style parsers, which ignore their parser.parse() parameters, this is used to specify the DOM (sub)tree being walked by the parser. The Object must implement the org.w3c.dom.Node interface.
lexical-handler Used to see some syntax events that are essential in some applications: comments, CDATA delimiters, selected general entity inclusions, and the start and end of the DTD (and declaration of document element name). The Object must implement org.xml.sax.ext.LexicalHandler.
xml-string Readable only during a parser callback, this exposes a TBS chunk of characters responsible for the current event.

All of these standard properties are optional; XMLReader implementations need not support them.
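As one possible use of setProperty(), the sketch below registers a lexical-handler so that comments and CDATA delimiters are reported. It extends org.xml.sax.ext.DefaultHandler2, which implements LexicalHandler; the class name CommentCollector and the sample input are invented for the example, and a reader that does not support the property would throw from setProperty().

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical demo class; not part of SAX.
final class CommentCollector extends DefaultHandler2 {
    static final String LEXICAL_HANDLER = "http://xml.org/sax/properties/lexical-handler";

    final List<String> events = new ArrayList<>();

    @Override
    public void comment(char[] ch, int start, int length) {
        events.add("comment:" + new String(ch, start, length).trim());
    }

    @Override
    public void startCDATA() {
        events.add("startCDATA");
    }

    @Override
    public void endCDATA() {
        events.add("endCDATA");
    }

    // Parses xml and returns the lexical events seen, in document order.
    static List<String> collect(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            CommentCollector handler = new CommentCollector();
            reader.setContentHandler(handler);
            // The property value must implement org.xml.sax.ext.LexicalHandler.
            reader.setProperty(LEXICAL_HANDLER, handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(collect("<doc><!-- note --><![CDATA[a<b]]></doc>"));
    }
}
```

The same handler object is registered both as the ContentHandler and as the lexical-handler property; without the setProperty() call the comment and CDATA callbacks are never invoked.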

SAX2 Standard Exception IDs

SAX 2.1 defines a standard SAXParseException.getExceptionId() method to identify which kind of error is being reported, since any diagnostic message will vary between parsers. Those identifiers are URIs, which are used in much the same way that they are used for feature and property IDs. Systems can define nonstandard IDs when needed, by using a different base URI.

Moreover, for the XML (and related) standards relied on by the SAX specification itself (including XML and Namespaces in XML), SAX also standardizes the IDs used to identify those errors. The identifiers all start with the exception base URI http://xml.org/sax/exception/ which is then combined with additional information describing the error encountered. Not all parsers will choose to provide all these IDs, but those that provide any must only use the exception IDs defined by SAX. Any errors defined by those specifications which are not yet addressed by the SAX specification must not include any exception ID (but see below, more identifiers can be defined).

Parsers that correctly use exception IDs thus allow application software to reason, in a parser-independent manner, about all the basic XML errors that may be reported by an XMLReader. For example, they could assemble (and translate) catalogs of messages that make more sense to their users, because they can use application context to supplement a parser's textually oriented diagnostic. In some cases they can write code that uses knowledge of the errors to decide how to proceed most effectively. For example, some validity errors might be of no concern, while others might need to be treated as fatal.
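The idea of deciding per application whether validity errors are tolerable can be sketched with a plain ErrorHandler. This example deliberately does not call getExceptionId(), because the SAXParseException class shipped with the JDK does not provide that method; it only distinguishes the error() and fatalError() callbacks. The class name ValidityPolicy, the strict flag, and the sample documents are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of SAX.
final class ValidityPolicy extends DefaultHandler {
    final boolean strict;
    int errors;

    ValidityPolicy(boolean strict) {
        this.strict = strict;
    }

    // Recoverable errors (including validity errors) arrive here.
    @Override
    public void error(SAXParseException e) throws SAXException {
        errors++;
        if (strict) {
            throw e; // escalate: abort the parse
        }
    }

    // Returns the number of errors reported, or -1 if the parse was aborted.
    static int check(String xml, boolean strict) {
        ValidityPolicy policy = new ValidityPolicy(strict);
        XMLReader reader;
        try {
            reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setFeature("http://xml.org/sax/features/validation", true);
            reader.setErrorHandler(policy);
        } catch (Exception e) {
            throw new IllegalStateException("validation is not available", e);
        }
        try {
            reader.parse(new InputSource(new StringReader(xml)));
            return policy.errors;
        } catch (SAXException e) {
            return -1;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String invalid = "<!DOCTYPE doc [<!ELEMENT doc EMPTY>]><doc>text</doc>";
        System.out.println("lenient: " + check(invalid, false));
        System.out.println("strict:  " + check(invalid, true));
    }
}
```

With standardized exception IDs, the body of error() could branch on the specific constraint that was violated instead of applying one policy to every validity error.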

The core SAX identifiers are described here, and a more current version might be available through http://www.saxproject.org. Parser writers should work with the SAX project to define standard IDs for any error cases that are identified in the relevant specifications, but for which no IDs are defined here.

IDs for XML 1.0 Parsing Errors

These IDs are derived from the current XML 1.0 (2nd edition) recommendation. The IDs start with the exception base URI, and append to that xml/ and then an additional string that provides more specific identification of the rule being violated. Those additional strings are defined as follows:

So for example http://xml.org/sax/exception/xml/rule-66 indicates a violation of grammar rule 66 (a malformed character reference), which is a fatal error. And http://xml.org/sax/exception/xml/vc-one-id-per-el indicates a violation of the One ID per Element Type validity constraint, which is a validity error reflecting a bug in the document's DTD.

That list is subject to evolution for several reasons, such as:

However, potential changes to the XML source of the XML Recommendation which change those identifiers will not change that list. Such cases would be addressed by applying these rules to a base revision of that specification, and assigning IDs manually for such problem cases.

IDs for XML Namespaces Violations

These IDs are derived from the current XML Namespaces recommendation. The IDs start with the exception base URI, and append to that xmlns/ and then an additional string that provides more specific identification of the rule being violated. Those additional strings are defined as follows:

So for example http://xml.org/sax/exception/xmlns/qname might indicate a "localName" that begins with a digit, which is a nonfatal error.

That list is subject to evolution, although it has substantially less need to evolve than the corresponding list for XML 1.0 errors, since there are only two NSCs and one other conformance constraint.
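
The ID layout described in the two sections above (exception base URI, then xml/ or xmlns/, then a rule-specific string) can be taken apart with ordinary string handling. The helper below is purely hypothetical and is not part of any SAX distribution; it only shows how an application might separate the specification segment from the rule segment once it has an ID string in hand.

```java
// Hypothetical helper; not part of SAX.
final class SaxExceptionIds {
    static final String BASE = "http://xml.org/sax/exception/";

    // Returns {specification, rule} for a standard ID, or null for any other URI.
    static String[] split(String id) {
        if (id == null || !id.startsWith(BASE)) {
            return null; // nonstandard IDs use a different base URI
        }
        String rest = id.substring(BASE.length());
        int slash = rest.indexOf('/');
        if (slash <= 0 || slash == rest.length() - 1) {
            return null; // missing specification or rule segment
        }
        return new String[] {rest.substring(0, slash), rest.substring(slash + 1)};
    }

    public static void main(String[] args) {
        String[] ids = {
            "http://xml.org/sax/exception/xml/rule-66",
            "http://xml.org/sax/exception/xmlns/qname",
            "http://example.com/errors/custom"
        };
        for (String id : ids) {
            String[] parts = split(id);
            System.out.println(parts == null
                    ? id + " -> nonstandard"
                    : id + " -> spec=" + parts[0] + ", rule=" + parts[1]);
        }
    }
}
```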