Class XMLDocumentFragmentScannerImpl

java.lang.Object
org.apache.xerces.impl.XMLScanner
org.apache.xerces.impl.XMLDocumentFragmentScannerImpl
All Implemented Interfaces:
XMLEntityHandler, org.apache.xerces.xni.parser.XMLComponent, org.apache.xerces.xni.parser.XMLDocumentScanner, org.apache.xerces.xni.parser.XMLDocumentSource
Direct Known Subclasses:
XMLDocumentScannerImpl

public class XMLDocumentFragmentScannerImpl extends XMLScanner implements org.apache.xerces.xni.parser.XMLDocumentScanner, org.apache.xerces.xni.parser.XMLComponent, XMLEntityHandler
This class is responsible for scanning the structure and content of document fragments. The scanner acts as the source for the document information which is communicated to the document handler.

This component requires the following features and properties from the component manager that uses it:

  • http://xml.org/sax/features/validation
  • http://apache.org/xml/features/scanner/notify-char-refs
  • http://apache.org/xml/features/scanner/notify-builtin-refs
  • http://apache.org/xml/properties/internal/symbol-table
  • http://apache.org/xml/properties/internal/error-reporter
  • http://apache.org/xml/properties/internal/entity-manager
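Because this class is internal, application code normally supplies such settings through a public parser API rather than through the component manager itself. The following is a minimal sketch, assuming the JAXP SAX parser that ships with the JDK; only the standard SAX validation feature from the list above is used, and the class and method names (ValidationFeatureSketch, newReader, isValidating) are hypothetical.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

class ValidationFeatureSketch {
    static final String VALIDATION = "http://xml.org/sax/features/validation";

    /** Creates a JAXP-provided XMLReader and sets the validation feature on it. */
    static XMLReader newReader(boolean validate) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setFeature(VALIDATION, validate);
            return reader;
        } catch (Exception e) {
            throw new IllegalStateException("could not configure XMLReader", e);
        }
    }

    /** Reads the feature back from the reader. */
    static boolean isValidating(XMLReader reader) {
        try {
            return reader.getFeature(VALIDATION);
        } catch (Exception e) {
            throw new IllegalStateException("could not query XMLReader", e);
        }
    }

    public static void main(String[] args) {
        XMLReader reader = newReader(false);
        System.out.println(VALIDATION + " = " + isValidating(reader));
    }
}
```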

INTERNAL:

Usage of this class is not supported. It may be altered or removed at any time.
Version:
$Id: XMLDocumentFragmentScannerImpl.java 572055 2007-09-02 17:55:43Z mrglavas $
Author:
Glenn Marcy, IBM, Andy Clark, IBM, Arnaud Le Hors, IBM, Eric Ye, IBM
  • Field Details

    • SCANNER_STATE_START_OF_MARKUP

      protected static final int SCANNER_STATE_START_OF_MARKUP
      Scanner state: start of markup.
    • SCANNER_STATE_COMMENT

      protected static final int SCANNER_STATE_COMMENT
      Scanner state: comment.
    • SCANNER_STATE_PI

      protected static final int SCANNER_STATE_PI
      Scanner state: processing instruction.
    • SCANNER_STATE_DOCTYPE

      protected static final int SCANNER_STATE_DOCTYPE
      Scanner state: DOCTYPE.
    • SCANNER_STATE_ROOT_ELEMENT

      protected static final int SCANNER_STATE_ROOT_ELEMENT
      Scanner state: root element.
    • SCANNER_STATE_CONTENT

      protected static final int SCANNER_STATE_CONTENT
      Scanner state: content.
    • SCANNER_STATE_REFERENCE

      protected static final int SCANNER_STATE_REFERENCE
      Scanner state: reference.
    • SCANNER_STATE_END_OF_INPUT

      protected static final int SCANNER_STATE_END_OF_INPUT
      Scanner state: end of input.
    • SCANNER_STATE_TERMINATED

      protected static final int SCANNER_STATE_TERMINATED
      Scanner state: terminated.
    • SCANNER_STATE_CDATA

      protected static final int SCANNER_STATE_CDATA
      Scanner state: CDATA section.
    • SCANNER_STATE_TEXT_DECL

      protected static final int SCANNER_STATE_TEXT_DECL
      Scanner state: Text declaration.
    • NAMESPACES

      protected static final String NAMESPACES
      Feature identifier: namespaces.
    • NOTIFY_BUILTIN_REFS

      protected static final String NOTIFY_BUILTIN_REFS
      Feature identifier: notify built-in references.
    • ENTITY_RESOLVER

      protected static final String ENTITY_RESOLVER
      Property identifier: entity resolver.
    • DEBUG_CONTENT_SCANNING

      protected static final boolean DEBUG_CONTENT_SCANNING
      Debug content dispatcher scanning.
    • fDocumentHandler

      protected org.apache.xerces.xni.XMLDocumentHandler fDocumentHandler
      Document handler.
    • fEntityStack

      protected int[] fEntityStack
      Entity stack.
    • fMarkupDepth

      protected int fMarkupDepth
      Markup depth.
    • fScannerState

      protected int fScannerState
      Scanner state.
    • fInScanContent

      protected boolean fInScanContent
      SubScanner state: inside scanContent method.
    • fHasExternalDTD

      protected boolean fHasExternalDTD
      Has external DTD.
    • fStandalone

      protected boolean fStandalone
      Standalone.
    • fIsEntityDeclaredVC

      protected boolean fIsEntityDeclaredVC
      True if [Entity Declared] is a VC; false if it is a WFC.
    • fExternalSubsetResolver

      protected ExternalSubsetResolver fExternalSubsetResolver
      External subset resolver.
    • fCurrentElement

      protected org.apache.xerces.xni.QName fCurrentElement
      Current element.
    • fElementStack

      protected final XMLDocumentFragmentScannerImpl.ElementStack fElementStack
      Element stack.
    • fNotifyBuiltInRefs

      protected boolean fNotifyBuiltInRefs
      Notify built-in references.
    • fDispatcher

      protected XMLDocumentFragmentScannerImpl.Dispatcher fDispatcher
      Active dispatcher.
    • fContentDispatcher

      protected final XMLDocumentFragmentScannerImpl.Dispatcher fContentDispatcher
      Content dispatcher.
    • fElementQName

      protected final org.apache.xerces.xni.QName fElementQName
      Element QName.
    • fAttributeQName

      protected final org.apache.xerces.xni.QName fAttributeQName
      Attribute QName.
    • fAttributes

      protected final XMLAttributesImpl fAttributes
      Element attributes.
    • fTempString

      protected final org.apache.xerces.xni.XMLString fTempString
      String.
    • fTempString2

      protected final org.apache.xerces.xni.XMLString fTempString2
      String.
  • Constructor Details

    • XMLDocumentFragmentScannerImpl

      public XMLDocumentFragmentScannerImpl()
      Default constructor.
  • Method Details

    • setInputSource

      public void setInputSource(org.apache.xerces.xni.parser.XMLInputSource inputSource) throws IOException
      Sets the input source.
      Specified by:
      setInputSource in interface org.apache.xerces.xni.parser.XMLDocumentScanner
      Parameters:
      inputSource - The input source.
      Throws:
      IOException - Thrown on i/o error.
    • scanDocument

      public boolean scanDocument(boolean complete) throws IOException, org.apache.xerces.xni.XNIException
      Scans a document.
      Specified by:
      scanDocument in interface org.apache.xerces.xni.parser.XMLDocumentScanner
      Parameters:
      complete - True if the scanner should scan the document completely, pushing all events to the registered document handler. A value of false indicates that the scanner should only scan the next portion of the document and return. A scanner instance is permitted to completely scan a document if it does not support this "pull" scanning model.
      Returns:
      True if there is more to scan, false otherwise.
      Throws:
      IOException
      org.apache.xerces.xni.XNIException
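The two modes of the complete parameter can be illustrated with a toy model. This is only a sketch of the contract described above, not the Xerces implementation: PullScanSketch, its event list, and the delivered field are hypothetical stand-ins for the real scanner, its input, and the document handler.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical toy scanner that mimics the scanDocument(boolean) contract. */
class PullScanSketch {
    private final Iterator<String> events;
    /** Stands in for the registered document handler. */
    final List<String> delivered = new ArrayList<>();

    PullScanSketch(List<String> events) {
        this.events = events.iterator();
    }

    /**
     * With complete == true, delivers every remaining event.
     * With complete == false, delivers at most one event and returns.
     * Returns true if there is more to scan, false otherwise.
     */
    boolean scanDocument(boolean complete) {
        while (events.hasNext()) {
            delivered.add(events.next());
            if (!complete) {
                break;
            }
        }
        return events.hasNext();
    }

    public static void main(String[] args) {
        PullScanSketch scanner = new PullScanSketch(
                Arrays.asList("startElement", "characters", "endElement"));
        int calls = 1;
        // Pull model: keep asking for the next portion until nothing is left.
        while (scanner.scanDocument(false)) {
            calls++;
        }
        System.out.println(calls + " calls, events: " + scanner.delivered);
    }
}
```

A caller that passes true gets the same events from a single call, so the choice only affects how control is interleaved between the scanner and the caller.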
    • reset

      public void reset(org.apache.xerces.xni.parser.XMLComponentManager componentManager) throws org.apache.xerces.xni.parser.XMLConfigurationException
      Resets the component. The component can query the component manager about any features and properties that affect the operation of the component.
      Specified by:
      reset in interface org.apache.xerces.xni.parser.XMLComponent
      Overrides:
      reset in class XMLScanner
      Parameters:
      componentManager - The component manager.
    • getRecognizedFeatures

      public String[] getRecognizedFeatures()
      Returns a list of feature identifiers that are recognized by this component. This method may return null if no features are recognized by this component.
      Specified by:
      getRecognizedFeatures in interface org.apache.xerces.xni.parser.XMLComponent
    • setFeature

      public void setFeature(String featureId, boolean state) throws org.apache.xerces.xni.parser.XMLConfigurationException
      Sets the state of a feature. This method is called by the component manager any time after reset when a feature changes state.

      Note: Components should silently ignore features that do not affect the operation of the component.

      Specified by:
      setFeature in interface org.apache.xerces.xni.parser.XMLComponent
      Overrides:
      setFeature in class XMLScanner
      Parameters:
      featureId - The feature identifier.
      state - The state of the feature.
    • getRecognizedProperties

      public String[] getRecognizedProperties()
      Returns a list of property identifiers that are recognized by this component. This method may return null if no properties are recognized by this component.
      Specified by:
      getRecognizedProperties in interface org.apache.xerces.xni.parser.XMLComponent
    • setProperty

      public void setProperty(String propertyId, Object value) throws org.apache.xerces.xni.parser.XMLConfigurationException
      Sets the value of a property. This method is called by the component manager any time after reset when a property changes value.

      Note: Components should silently ignore properties that do not affect the operation of the component.

      Specified by:
      setProperty in interface org.apache.xerces.xni.parser.XMLComponent
      Overrides:
      setProperty in class XMLScanner
      Parameters:
      propertyId - The property identifier.
      value - The value of the property.
    • getFeatureDefault

      public Boolean getFeatureDefault(String featureId)
      Returns the default state for a feature, or null if this component does not want to report a default value for this feature.
      Specified by:
      getFeatureDefault in interface org.apache.xerces.xni.parser.XMLComponent
      Parameters:
      featureId - The feature identifier.
      Since:
      Xerces 2.2.0
    • getPropertyDefault

      public Object getPropertyDefault(String propertyId)
      Returns the default state for a property, or null if this component does not want to report a default value for this property.
      Specified by:
      getPropertyDefault in interface org.apache.xerces.xni.parser.XMLComponent
      Parameters:
      propertyId - The property identifier.
      Since:
      Xerces 2.2.0
    • setDocumentHandler

      public void setDocumentHandler(org.apache.xerces.xni.XMLDocumentHandler documentHandler)
      Sets the document handler.
      Specified by:
      setDocumentHandler in interface org.apache.xerces.xni.parser.XMLDocumentSource
      Parameters:
      documentHandler - The document handler.
    • getDocumentHandler

      public org.apache.xerces.xni.XMLDocumentHandler getDocumentHandler()
      Returns the document handler.
      Specified by:
      getDocumentHandler in interface org.apache.xerces.xni.parser.XMLDocumentSource
    • startEntity

      public void startEntity(String name, org.apache.xerces.xni.XMLResourceIdentifier identifier, String encoding, org.apache.xerces.xni.Augmentations augs) throws org.apache.xerces.xni.XNIException
      This method notifies of the start of an entity. The DTD has the pseudo-name "[dtd]"; parameter entity names start with '%'; and general entities are specified by their name alone.
      Specified by:
      startEntity in interface XMLEntityHandler
      Overrides:
      startEntity in class XMLScanner
      Parameters:
      name - The name of the entity.
      identifier - The resource identifier.
      encoding - The auto-detected IANA encoding name of the entity stream. This value will be null in those situations where the entity encoding is not auto-detected (e.g. internal entities or a document entity that is parsed from a java.io.Reader).
      augs - Additional information that may include infoset augmentations
      Throws:
      org.apache.xerces.xni.XNIException - Thrown by handler to signal an error.
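A handler can use the naming convention described above to tell the three cases apart. The following is a minimal sketch that covers only those three cases; the class EntityNameSketch and its Kind enum are hypothetical and not part of Xerces.

```java
class EntityNameSketch {
    enum Kind { DTD, PARAMETER, GENERAL }

    /** Classifies an entity name as passed to startEntity or endEntity. */
    static Kind classify(String name) {
        if ("[dtd]".equals(name)) {
            return Kind.DTD;        // pseudo-name of the DTD
        }
        if (name.startsWith("%")) {
            return Kind.PARAMETER;  // parameter entity names start with '%'
        }
        return Kind.GENERAL;        // anything else is treated as a general entity
    }

    public static void main(String[] args) {
        for (String name : new String[] { "[dtd]", "%common", "copyright" }) {
            System.out.println(name + " -> " + classify(name));
        }
    }
}
```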
    • endEntity

      public void endEntity(String name, org.apache.xerces.xni.Augmentations augs) throws org.apache.xerces.xni.XNIException
      This method notifies of the end of an entity. The DTD has the pseudo-name "[dtd]"; parameter entity names start with '%'; and general entities are specified by their name alone.
      Specified by:
      endEntity in interface XMLEntityHandler
      Overrides:
      endEntity in class XMLScanner
      Parameters:
      name - The name of the entity.
      augs - Additional information that may include infoset augmentations
      Throws:
      org.apache.xerces.xni.XNIException - Thrown by handler to signal an error.
    • createContentDispatcher

      protected XMLDocumentFragmentScannerImpl.Dispatcher createContentDispatcher()
      Creates a content dispatcher.
    • scanXMLDeclOrTextDecl

      protected void scanXMLDeclOrTextDecl(boolean scanningTextDecl) throws IOException, org.apache.xerces.xni.XNIException
      Scans an XML or text declaration.

       [23] XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
       [24] VersionInfo ::= S 'version' Eq ("'" VersionNum "'" | '"' VersionNum '"')
       [80] EncodingDecl ::= S 'encoding' Eq ('"' EncName '"' |  "'" EncName "'" )
       [81] EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
       [32] SDDecl ::= S 'standalone' Eq (("'" ('yes' | 'no') "'")
                       | ('"' ('yes' | 'no') '"'))
      
       [77] TextDecl ::= '<?xml' VersionInfo? EncodingDecl S? '?>'
       
      Parameters:
      scanningTextDecl - True if a text declaration is to be scanned instead of an XML declaration.
      Throws:
      IOException
      org.apache.xerces.xni.XNIException
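The difference between the two declarations (VersionInfo is required in an XML declaration but optional in a text declaration, while EncodingDecl is the reverse) can be checked with regular expressions built from the productions above. This is a sketch for illustration only, not how the scanner works internally: the class XmlDeclSketch is hypothetical, and VersionNum is assumed to be '1.' [0-9]+ as in XML 1.0 Fifth Edition.

```java
import java.util.regex.Pattern;

class XmlDeclSketch {
    private static final String S = "[ \\t\\r\\n]+";
    private static final String OPT_S = "[ \\t\\r\\n]*";
    private static final String EQ = OPT_S + "=" + OPT_S;

    /** Wraps a value pattern in matching single or double quotes. */
    private static String quoted(String inner) {
        return "(?:'" + inner + "'|\"" + inner + "\")";
    }

    // Assumption: VersionNum ::= '1.' [0-9]+
    private static final String VERSION_INFO = S + "version" + EQ + quoted("1\\.[0-9]+");
    private static final String ENCODING_DECL =
            S + "encoding" + EQ + quoted("[A-Za-z][A-Za-z0-9._-]*");
    private static final String SD_DECL = S + "standalone" + EQ + quoted("(?:yes|no)");

    // [23] XMLDecl: VersionInfo required, EncodingDecl and SDDecl optional.
    static final Pattern XML_DECL = Pattern.compile(
            "<\\?xml" + VERSION_INFO + "(?:" + ENCODING_DECL + ")?(?:" + SD_DECL + ")?"
            + OPT_S + "\\?>");

    // [77] TextDecl: VersionInfo optional, EncodingDecl required, no SDDecl.
    static final Pattern TEXT_DECL = Pattern.compile(
            "<\\?xml(?:" + VERSION_INFO + ")?" + ENCODING_DECL + OPT_S + "\\?>");

    /** Mirrors the scanningTextDecl flag: true checks [77], false checks [23]. */
    static boolean matches(String decl, boolean scanningTextDecl) {
        return (scanningTextDecl ? TEXT_DECL : XML_DECL).matcher(decl).matches();
    }

    public static void main(String[] args) {
        String decl = "<?xml encoding='UTF-8'?>";
        System.out.println("as XMLDecl:  " + matches(decl, false));
        System.out.println("as TextDecl: " + matches(decl, true));
    }
}
```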
    • scanPIData

      protected void scanPIData(String target, org.apache.xerces.xni.XMLString data) throws IOException, org.apache.xerces.xni.XNIException
      Scans processing instruction data. This is needed to handle the situation where a document starts with a processing instruction whose target name starts with "xml" (e.g. xmlfoo).
      Overrides:
      scanPIData in class XMLScanner
      Parameters:
      target - The PI target
      data - The string to fill in with the data
      Throws:
      IOException
      org.apache.xerces.xni.XNIException
    • scanComment

      protected void scanComment() throws IOException, org.apache.xerces.xni.XNIException
      Scans a comment.

       [15] Comment ::= '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'
       

      Note: Called after scanning past '<!--'

      Throws:
      IOException
      org.apache.xerces.xni.XNIException
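Production [15] implies that the sequence "--" may appear inside a comment only as part of the closing "-->". The following minimal sketch applies that rule to a string; the class CommentScanSketch is hypothetical, works on an in-memory string rather than an entity stream, and does not check that each character is a legal XML Char.

```java
class CommentScanSketch {
    /**
     * Scans comment content beginning at offset start, which is assumed to be
     * just past the opening "<!--". Returns the text before the closing "-->".
     */
    static String scanComment(String input, int start) {
        int i = start;
        while (i < input.length()) {
            if (input.startsWith("--", i)) {
                if (input.startsWith("-->", i)) {
                    return input.substring(start, i);
                }
                // "--" that is not the start of "-->" violates production [15].
                throw new IllegalArgumentException(
                        "'--' is not allowed inside a comment (offset " + i + ")");
            }
            i++;
        }
        throw new IllegalArgumentException("unterminated comment");
    }

    public static void main(String[] args) {
        String doc = "<!-- a single - is fine -->";
        System.out.println("[" + scanComment(doc, 4) + "]");
    }
}
```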
    • scanStartElement

      protected boolean scanStartElement() throws IOException, org.apache.xerces.xni.XNIException
      Scans a start element. This method will handle the binding of namespace information and notifying the handler of the start of the element.

       [44] EmptyElemTag ::= '<' Name (S Attribute)* S? '/>'
       [40] STag ::= '<' Name (S Attribute)* S? '>'
       

      Note: This method assumes that the leading '<' character has been consumed.

      Note: This method uses the fElementQName and fAttributes variables. The contents of these variables will be destroyed. The caller should copy important information out of these variables before calling this method.

      Returns:
      True if the element is empty (i.e., it matches production [44]).
      Throws:
      IOException
      org.apache.xerces.xni.XNIException
    • scanStartElementName

      protected void scanStartElementName() throws IOException, org.apache.xerces.xni.XNIException
      Scans the name of an element in a start or empty tag.
      Throws:
      IOException
      org.apache.xerces.xni.XNIException
    • scanStartElementAfterName

      protected boolean scanStartElementAfterName() throws IOException, org.apache.xerces.xni.XNIException
      Scans the remainder of a start or empty tag after the element name.
      Returns:
      True if element is empty.
      Throws:
      IOException
      org.apache.xerces.xni.XNIException
    • scanAttribute

      protected void scanAttribute(org.apache.xerces.xni.XMLAttributes attributes) throws IOException, org.apache.xerces.xni.XNIException
      Scans an attribute.

       [41] Attribute ::= Name Eq AttValue
       

      Note: This method assumes that the next character on the stream is the first character of the attribute name.

      Note: This method uses the fAttributeQName and fQName variables. The contents of these variables will be destroyed.

      Parameters:
      attributes - The attributes list for the scanned attribute.
      Throws:
      IOException
      org.apache.xerces.xni.XNIException
    • scanContent

      protected int scanContent() throws IOException, org.apache.xerces.xni.XNIException
      Scans element content.
      Returns:
      Returns the next character on the stream.
      Throws:
      IOException
      org.apache.xerces.xni.XNIException
    • scanCDATASection

      protected boolean scanCDATASection(boolean complete) throws IOException, org.apache.xerces.xni.XNIException
      Scans a CDATA section.

      Note: This method uses the fTempString and fStringBuffer variables.

      Parameters:
      complete - True if the CDATA section is to be scanned completely.
      Returns:
      True if CDATA is completely scanned.
      Throws:
      IOException
      org.apache.xerces.xni.XNIException
    • scanEndElement

      protected int scanEndElement() throws IOException, org.apache.xerces.xni.XNIException
      Scans an end element.

       [42] ETag ::= '</' Name S? '>'
       

      Note: This method uses the fElementQName variable. The contents of this variable will be destroyed. The caller should copy the needed information out of this variable before calling this method.

      Returns:
      The element depth.
      Throws:
      IOException
      org.apache.xerces.xni.XNIException
    • scanCharReference

      protected void scanCharReference() throws IOException, org.apache.xerces.xni.XNIException
      Scans a character reference.

       [66] CharRef ::= '&#' [0-9]+ ';' | '&#x' [0-9a-fA-F]+ ';'
       
      Throws:
      IOException
      org.apache.xerces.xni.XNIException
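Production [66] allows a decimal form and a hexadecimal form that is introduced by a lowercase 'x'. The following sketch decodes a complete reference held in a string; the class CharRefSketch is hypothetical, and it does not verify that the resulting code point is a legal XML character.

```java
class CharRefSketch {
    /** Decodes a reference such as "&#65;" or "&#x41;" into its code point. */
    static int parseCharRef(String ref) {
        if (!ref.startsWith("&#") || !ref.endsWith(";")) {
            throw new IllegalArgumentException("not a character reference: " + ref);
        }
        boolean hex = ref.startsWith("&#x");
        String digits = ref.substring(hex ? 3 : 2, ref.length() - 1);
        if (digits.isEmpty()) {
            throw new IllegalArgumentException("no digits in: " + ref);
        }
        for (char c : digits.toCharArray()) {
            boolean ok = (c >= '0' && c <= '9')
                    || (hex && ((c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F')));
            if (!ok) {
                throw new IllegalArgumentException("invalid digit '" + c + "' in: " + ref);
            }
        }
        // Very long digit runs overflow int and raise NumberFormatException.
        return Integer.parseInt(digits, hex ? 16 : 10);
    }

    public static void main(String[] args) {
        int cp = parseCharRef("&#x20AC;");
        System.out.println("U+" + Integer.toHexString(cp).toUpperCase());
    }
}
```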
    • scanEntityReference

      protected void scanEntityReference() throws IOException, org.apache.xerces.xni.XNIException
      Scans an entity reference.
      Throws:
      IOException - Thrown if i/o error occurs.
      org.apache.xerces.xni.XNIException - Thrown if handler throws exception upon notification.
    • handleEndElement

      protected int handleEndElement(org.apache.xerces.xni.QName element, boolean isEmpty) throws org.apache.xerces.xni.XNIException
      Handles the end element. This method will make sure that the end element name matches the current element and notify the handler about the end of the element and the end of any relevant prefix mappings.

      Note: This method uses the fQName variable. The contents of this variable will be destroyed.

      Parameters:
      element - The element.
      isEmpty - True if the element is empty.
      Returns:
      The element depth.
      Throws:
      org.apache.xerces.xni.XNIException - Thrown if the handler throws a SAX exception upon notification.
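The name check and depth bookkeeping described above can be modelled with a simple stack. This is a sketch under assumptions, not the Xerces logic: ElementDepthSketch is hypothetical, it compares plain strings instead of QName objects, and it assumes "element depth" means the number of elements still open after the end tag is handled.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ElementDepthSketch {
    private final Deque<String> open = new ArrayDeque<>();

    /** Records a start tag and returns the new depth. */
    int startElement(String name) {
        open.push(name);
        return open.size();
    }

    /** Checks that the end tag matches the current element and returns the remaining depth. */
    int endElement(String name) {
        String expected = open.peek();
        if (expected == null || !expected.equals(name)) {
            throw new IllegalStateException(
                    "end tag </" + name + "> does not match open element " + expected);
        }
        open.pop();
        return open.size();
    }

    public static void main(String[] args) {
        ElementDepthSketch stack = new ElementDepthSketch();
        stack.startElement("root");
        stack.startElement("child");
        System.out.println("depth after </child>: " + stack.endElement("child"));
        System.out.println("depth after </root>: " + stack.endElement("root"));
    }
}
```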
    • setScannerState

      protected final void setScannerState(int state)
      Sets the scanner state.
      Parameters:
      state - The new scanner state.
    • setDispatcher

      protected final void setDispatcher(XMLDocumentFragmentScannerImpl.Dispatcher dispatcher)
      Sets the dispatcher.
      Parameters:
      dispatcher - The new dispatcher.
    • getScannerStateName

      protected String getScannerStateName(int state)
      Returns the scanner state name.
    • getDispatcherName

      public String getDispatcherName(XMLDocumentFragmentScannerImpl.Dispatcher dispatcher)
      Returns the dispatcher name.