Class XMLTokenizer
- java.lang.Object
- de.pdark.decentxml.XMLTokenizer
- Direct Known Subclasses: DTDTokenizer
public class XMLTokenizer extends java.lang.Object
This class chops an XMLSource into tokens. You can use it to parse XML yourself, or use the XMLParser to parse XML into a Document.
-
-
Nested Class Summary
static class XMLTokenizer.Type
  Types of tokens the tokenizer can return
-
Field Summary
protected boolean inStartElement
  true if we're currently inside of a start tag
protected int pos
  The current position in the source
protected XMLSource source
-
Constructor Summary
XMLTokenizer(XMLSource source)
-
Method Summary
protected Token createToken()
  All tokens are created here.
protected void expect(char expected)
  Check that the next character is expected and skip it.
CharValidator getCharValidator()
EntityResolver getEntityResolver()
int getOffset()
  Get the current parsing position (for error handling, for example).
XMLSource getSource()
boolean isTreatEntitiesAsText()
protected java.lang.String lookAheadForErrorMessage(java.lang.String conditionalPrefix, int pos, int len)
Token next()
  Fetch the next token from the source.
protected char nextChar(java.lang.String errorMessage)
protected void nextChars(java.lang.String expected, int startPos, java.lang.String errorMessage)
protected void parseAttribute(Token token)
  Read the attribute of an element.
protected void parseBeginElement(Token token)
  Read the name of an element.
protected void parseBeginSomething(Token token)
  Read one of "<tag", "<?pi", "<!--", "<![CDATA[" or an end tag.
protected void parseCData(Token token)
  Parse a CDATA element.
protected void parseComment(Token token)
  Read a comment.
protected void parseDocType(Token token)
  Parse a doctype declaration.
protected void parseEndElement(Token token)
  Read an end tag.
protected void parseEntity(Token token)
protected void parseExcalamation(Token token)
  Parse "<!--" or "<![CDATA["
protected void parseName(java.lang.String objectName)
  Read an XML name.
protected void parseProcessingInstruction(Token token)
  Read a processing instruction.
protected void parseText(Token token)
  Read a piece of text.
XMLTokenizer setCharValidator(CharValidator charValidator)
XMLTokenizer setEntityResolver(EntityResolver resolver)
void setOffset(int offset)
  Set the current parsing position.
XMLTokenizer setTreatEntitiesAsText(boolean treatEntitiesAsText)
protected void skipChar(char c)
  Advance one or two positions, depending on whether the current character is the high part of a surrogate pair.
protected void skipWhiteSpace()
  Advance the current position past any whitespace in the input.
protected void verifyEntity(int start, int end)
  Verify an entity.
-
-
-
Field Detail
-
source
protected final XMLSource source
-
pos
protected int pos
The current position in the source
-
inStartElement
protected boolean inStartElement
true if we're currently inside of a start tag
-
-
Constructor Detail
-
XMLTokenizer
public XMLTokenizer(XMLSource source)
-
-
Method Detail
-
setTreatEntitiesAsText
public XMLTokenizer setTreatEntitiesAsText(boolean treatEntitiesAsText)
-
isTreatEntitiesAsText
public boolean isTreatEntitiesAsText()
-
getCharValidator
public CharValidator getCharValidator()
-
setCharValidator
public XMLTokenizer setCharValidator(CharValidator charValidator)
-
getEntityResolver
public EntityResolver getEntityResolver()
-
setEntityResolver
public XMLTokenizer setEntityResolver(EntityResolver resolver)
-
next
public Token next()
Fetch the next token from the source. Returns null if there are no more tokens in the input.
- Returns:
- The next token or null at EOF
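The null-at-EOF contract suggests a simple pull loop. The following is a minimal, self-contained sketch of that loop shape only. MiniTokenizer is a hypothetical stand-in, not part of decentxml, and it merely splits on '<' and '>' instead of applying XML rules.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; NOT the decentxml implementation.
final class MiniTokenizer {
    private final String source;
    private int pos = 0;

    MiniTokenizer(String source) {
        this.source = source;
    }

    /** Returns the next chunk of input, or null once the input is exhausted. */
    String next() {
        if (pos >= source.length()) {
            return null; // same EOF convention as described above
        }
        int start = pos;
        if (source.charAt(pos) == '<') {
            // Markup: read up to and including the closing '>'
            int end = source.indexOf('>', pos);
            pos = (end < 0) ? source.length() : end + 1;
        } else {
            // Text: read up to, but not including, the next '<'
            int end = source.indexOf('<', pos);
            pos = (end < 0) ? source.length() : end;
        }
        return source.substring(start, pos);
    }

    /** Drains the tokenizer with the usual "loop until null" idiom. */
    static List<String> tokenize(String input) {
        MiniTokenizer tokenizer = new MiniTokenizer(input);
        List<String> tokens = new ArrayList<>();
        String token;
        while ((token = tokenizer.next()) != null) {
            tokens.add(token);
        }
        return tokens;
    }
}
```

With the real class, the same `while ((token = tokenizer.next()) != null)` shape would presumably apply, with Token objects in place of strings.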
-
createToken
protected Token createToken()
All tokens are created here. Use this method to create custom tokens with additional information.
- Returns:
- a new, pre-initialized token
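Routing all token creation through one overridable method is the factory-method pattern. The sketch below illustrates it with hypothetical classes (SketchToken, SketchTokenizer, NumberedToken, NumberingTokenizer); none of them belong to decentxml, and the real Token class may need different initialization.

```java
// All names below are hypothetical stand-ins; this is NOT the decentxml code.
class SketchToken {
    int startOffset;
}

class SketchTokenizer {
    protected int pos;

    /** Single creation point: subclasses override this to return richer tokens. */
    protected SketchToken createToken() {
        return new SketchToken();
    }

    /** Produces one token per call and advances by one position (toy logic). */
    SketchToken next() {
        SketchToken token = createToken();
        token.startOffset = pos;
        pos++;
        return token;
    }
}

/** A token that carries additional information, here a sequence number. */
class NumberedToken extends SketchToken {
    final int sequence;

    NumberedToken(int sequence) {
        this.sequence = sequence;
    }
}

class NumberingTokenizer extends SketchTokenizer {
    private int created = 0;

    @Override
    protected SketchToken createToken() {
        // Every token produced by next() now carries its creation index.
        return new NumberedToken(created++);
    }
}
```

Because next() only ever calls createToken(), the subclass does not need to touch any parsing logic to change the token type.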
-
getSource
public XMLSource getSource()
-
getOffset
public int getOffset()
Get the current parsing position (for error handling, for example). This value is not very accurate because the tokenizer might be anywhere in the stream.
-
setOffset
public void setOffset(int offset)
Set the current parsing position. You can use this to restart parsing after an error or to jump around in the input.
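One way to use a readable and writable offset is to save it, attempt a read, and rewind on failure. The sketch below shows that idea on a hypothetical RewindableCursor over a String; it is an assumption about how such a pair of accessors can be used, not a description of the tokenizer's internals.

```java
// Hypothetical cursor for illustration; not part of decentxml.
final class RewindableCursor {
    private final String text;
    private int pos;

    RewindableCursor(String text) {
        this.text = text;
    }

    int getOffset() {
        return pos;
    }

    void setOffset(int offset) {
        if (offset < 0 || offset > text.length()) {
            throw new IndexOutOfBoundsException("offset " + offset);
        }
        pos = offset;
    }

    /**
     * Tries to consume the given literal. On a mismatch the cursor is
     * rewound to where it started, so another alternative can be tried.
     */
    boolean tryRead(String literal) {
        int saved = getOffset();
        for (int i = 0; i < literal.length(); i++) {
            if (pos >= text.length() || text.charAt(pos) != literal.charAt(i)) {
                setOffset(saved); // restart from the saved position
                return false;
            }
            pos++;
        }
        return true;
    }
}
```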
-
parseBeginSomething
protected void parseBeginSomething(Token token)
Read one of "<tag", "<?pi", "<!--", "<![CDATA[" or an end tag.
-
parseBeginElement
protected void parseBeginElement(Token token)
Read the name of an element. The resulting token will contain the '<' plus any whitespace between it and the name plus the name itself but no whitespace after the name.
-
parseEndElement
protected void parseEndElement(Token token)
Read an end tag. The resulting token will contain the '</' and '>' plus the name plus any whitespace between those three.
-
parseExcalamation
protected void parseExcalamation(Token token)
Parse "<!--" or "<![CDATA["
-
parseDocType
protected void parseDocType(Token token)
Parse a doctype declaration. The resulting token will contain "
-
parseCData
protected void parseCData(Token token)
Parse a CDATA element. The resulting token will contain the "<![CDATA[" plus the terminating "]]>".
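As an illustration of a token that spans from the opening marker through the terminator, here is a minimal sketch that extracts a CDATA section from a String. CDataScan and readCData are hypothetical names, and throwing IllegalArgumentException is an assumption; the library's own error handling is not documented here.

```java
// Hypothetical helper for illustration; not the decentxml implementation.
final class CDataScan {
    private static final String OPEN = "<![CDATA[";
    private static final String CLOSE = "]]>";

    /**
     * Returns the complete CDATA section starting at the given offset,
     * including both the opening marker and the terminating "]]>".
     */
    static String readCData(String text, int start) {
        if (!text.startsWith(OPEN, start)) {
            throw new IllegalArgumentException("No CDATA section at offset " + start);
        }
        // Search for the terminator only after the opening marker.
        int end = text.indexOf(CLOSE, start + OPEN.length());
        if (end < 0) {
            throw new IllegalArgumentException("Unterminated CDATA section at offset " + start);
        }
        return text.substring(start, end + CLOSE.length());
    }
}
```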
-
parseComment
protected void parseComment(Token token)
Read a comment. The resulting token will contain the "<!--" plus the terminating "-->".
-
parseProcessingInstruction
protected void parseProcessingInstruction(Token token)
Read a processing instruction. The resulting token will contain the "<?" plus the terminating "?>".
-
parseAttribute
protected void parseAttribute(Token token)
Read the attribute of an element. The resulting token will contain the name, the "=", the quotes, and the value.
-
parseName
protected void parseName(java.lang.String objectName)
Read an XML name
-
parseText
protected void parseText(Token token)
Read a piece of text. The resulting token will contain the text as is, including all entity and numeric character references.
-
skipChar
protected void skipChar(char c)
Advance one or two positions, depending on whether the current character is the high part of a surrogate pair.
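The one-or-two-positions rule can be sketched with the standard Character.isHighSurrogate check. SurrogateSkip is a hypothetical helper operating on a String and an index; the real method works on the tokenizer's own pos field and may validate the low surrogate as well.

```java
// Hypothetical helper for illustration; not the decentxml implementation.
final class SurrogateSkip {
    /**
     * Returns the index after the character at pos: two positions further
     * if that char is a high surrogate (first half of a pair), else one.
     */
    static int skip(String text, int pos) {
        char c = text.charAt(pos);
        return Character.isHighSurrogate(c) ? pos + 2 : pos + 1;
    }

    /** Counts logical characters by stepping through the text with skip(). */
    static int countCharacters(String text) {
        int count = 0;
        for (int pos = 0; pos < text.length(); pos = skip(text, pos)) {
            count++;
        }
        return count;
    }
}
```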
-
verifyEntity
protected void verifyEntity(int start, int end)
Verify an entity. If no entityResolver is installed, this does nothing.
-
parseEntity
protected void parseEntity(Token token)
-
nextChars
protected void nextChars(java.lang.String expected, int startPos, java.lang.String errorMessage)
-
nextChar
protected char nextChar(java.lang.String errorMessage)
-
expect
protected void expect(char expected)
Check that the next character is expected and skip it.
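A check-and-skip helper of this kind can be sketched as follows. ExpectCursor is a hypothetical class, and IllegalStateException is an assumed stand-in for whatever exception the library actually raises on a mismatch.

```java
// Hypothetical cursor for illustration; not the decentxml implementation.
final class ExpectCursor {
    private final String text;
    private int pos;

    ExpectCursor(String text) {
        this.text = text;
    }

    int position() {
        return pos;
    }

    /** Verifies that the next character equals expected, then moves past it. */
    void expect(char expected) {
        if (pos >= text.length()) {
            throw new IllegalStateException(
                "Expected '" + expected + "' but reached end of input");
        }
        char actual = text.charAt(pos);
        if (actual != expected) {
            // The position is left unchanged so the caller can report it.
            throw new IllegalStateException(
                "Expected '" + expected + "' but found '" + actual + "' at " + pos);
        }
        pos++;
    }
}
```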
-
lookAheadForErrorMessage
protected java.lang.String lookAheadForErrorMessage(java.lang.String conditionalPrefix, int pos, int len)
-
skipWhiteSpace
protected void skipWhiteSpace()
Advance the current position past any whitespace in the input
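XML 1.0 defines whitespace as space, tab, carriage return, and line feed. Assuming the tokenizer follows that definition (it may instead delegate to its CharValidator), skipping whitespace could look like the sketch below. WhitespaceSkip is a hypothetical helper working on a String and an index.

```java
// Hypothetical helper for illustration; not the decentxml implementation.
final class WhitespaceSkip {
    /** XML 1.0 whitespace: space, tab, carriage return, line feed. */
    static boolean isXmlWhitespace(char c) {
        return c == ' ' || c == '\t' || c == '\r' || c == '\n';
    }

    /** Returns the index of the first non-whitespace char at or after pos. */
    static int skipWhiteSpace(String text, int pos) {
        while (pos < text.length() && isXmlWhitespace(text.charAt(pos))) {
            pos++;
        }
        return pos;
    }
}
```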
-
-
-