Package org.w3c.tidy

Class Tidy

  • All Implemented Interfaces:
    java.io.Serializable

    public class Tidy
    extends java.lang.Object
    implements java.io.Serializable
    HTML parser and pretty printer.
    Version:
    $Revision: 1191 $ ($Author: aditsu $)
    Author:
    Dave Raggett dsr@w3.org, Andy Quick ac.quick@sympatico.ca (translation to Java), Fabrizio Giustina
    See Also:
    Serialized Form
    • Constructor Summary

      Constructors 
      Constructor Description
      Tidy()
      Instantiates a new Tidy instance.
    • Method Summary

      Modifier and Type Method Description
      static org.w3c.dom.Document createEmptyDocument()
      Creates an empty DOM Document.
      java.lang.String getAltText()
      alt-text- default text for alt attribute.
      boolean getAsciiChars()
      ascii-chars- convert quotes and dashes to nearest ASCII char.
      boolean getBreakBeforeBR()
      break-before-br - output newline before <br>.
      boolean getBurstSlides()
      split- create slides on each h2 element.
      Configuration getConfiguration()
      Returns the current configuration.
      java.lang.String getDocType()
      doctype- user specified doctype.
      boolean getDropEmptyParas()
      drop-empty-paras- discard empty p elements.
      boolean getDropFontTags()
      drop-font-tags- discard presentation tags.
      boolean getDropProprietaryAttributes()
      drop-proprietary-attributes- discard proprietary attributes.
      boolean getEmacs()
      gnu-emacs- if true format error output for GNU Emacs.
      boolean getEncloseBlockText()
      enclose-block-text- if true text in blocks is wrapped in <p>'s.
      boolean getEncloseText()
      enclose-text- if true text at body is wrapped in <p>'s.
      java.lang.String getErrfile()
      Errfile - file name to write errors to.
      java.io.PrintWriter getErrout()
      Errout - the error output stream.
      boolean getEscapeCdata()
      escape-cdata - replace CDATA sections with escaped text.
      boolean getFixBackslash()
      fix-backslash- fix URLs by replacing \ with /.
      boolean getFixComments()
      fix-bad-comments- fix comments with adjacent hyphens.
      boolean getFixUri()
      fix-uri- fix uri references applying URI encoding if necessary.
      boolean getForceOutput()
      force-output- output document even if errors were found.
      boolean getHideComments()
      hide-comments- hides all (real) comments in output.
      boolean getHideEndTags()
      hide-endtags - suppress optional end tags.
      boolean getIndentAttributes()
      indent-attributes- newline+indent before each attribute.
      boolean getIndentCdata()
      indent-cdata- indent CDATA sections.
      boolean getIndentContent()
      indent - indent content of appropriate tags.
      java.lang.String getInputEncoding()
      input-encoding the character encoding used for input.
      java.lang.String getInputStreamName()  
      boolean getJoinClasses()
      join-classes- join multiple class attributes.
      boolean getJoinStyles()
      join-styles- join multiple style attributes.
      boolean getKeepFileTimes()
      keep-time- if true last modified time is preserved.
      boolean getLiteralAttribs()
      literal-attributes- if true attributes may use newlines.
      boolean getLogicalEmphasis()
      logical-emphasis- replace i by em and b by strong.
      boolean getLowerLiterals()
      lower-literals- folds known attribute values to lower case.
      boolean getMakeBare()
      make-bare - remove Microsoft cruft.
      boolean getMakeClean()
      make-clean - remove presentational clutter.
      boolean getNumEntities()
      numeric-entities- output entities other than the built-in HTML entities in the numeric rather than the named entity form.
      boolean getOnlyErrors()
      only-errors - if true normal output is suppressed.
      java.lang.String getOutputEncoding()
      output-encoding the character encoding used for output.
      int getParseErrors()
      ParseErrors - the number of errors that occurred in the most recent parse operation.
      int getParseWarnings()
      ParseWarnings - the number of warnings that occurred in the most recent parse operation.
      boolean getPrintBodyOnly()
      print-body-only- output BODY content only.
      boolean getQuiet()
      quiet - no 'Parsing X', guessed DTD or summary.
      boolean getQuoteAmpersand()
      quote-ampersand- output naked ampersand as &amp;.
      boolean getQuoteMarks()
      quote-marks- output " marks as &quot;.
      boolean getQuoteNbsp()
      quote-nbsp- output non-breaking space as entity.
      boolean getRawOut()
      output-raw- avoid mapping values > 127 to entities.
      int getRepeatedAttributes()
      repeated-attributes- keep first or last duplicate attribute.
      boolean getReplaceColor()
      replace-color- replace hex color attribute values with names.
      int getShowErrors()
      show-errors- number of errors to put out.
      boolean getShowWarnings()
      show-warnings - show warnings? (errors are always shown).
      boolean getSmartIndent()
      SmartIndent - does text/block level content affect indentation.
      int getSpaces()
      indent-spaces- default indentation.
      java.io.PrintWriter getStderr()  
      int getTabsize()
      tab-size- tab size in chars.
      boolean getTidyMark()
      tidy-mark- add meta element indicating tidied doc.
      boolean getTrimEmptyElements()
      trim-empty-elements- trim empty elements.
      boolean getUpperCaseAttrs()
      uppercase-attributes - output attributes in upper case.
      boolean getUpperCaseTags()
      uppercase-tags - output tags in upper case.
      boolean getWord2000()
      word-2000- draconian cleaning for Word2000.
      boolean getWrapAsp()
      wrap-asp- wrap within ASP pseudo elements.
      boolean getWrapAttVals()
      wrap-attributes- wrap within attribute values.
      boolean getWrapJste()
      wrap-jste- wrap within JSTE pseudo elements.
      int getWraplen()
      wrap- default wrap margin.
      boolean getWrapPhp()
      wrap-php- wrap within PHP pseudo elements.
      boolean getWrapScriptlets()
      wrap-script-literals- wrap within JavaScript string literals.
      boolean getWrapSection()
      wrap-sections- wrap within <![ ...
      boolean getWriteback()
      writeback - if true then output tidied markup.
      boolean getXHTML()
      output-xhtml - output extensible HTML.
      boolean getXmlOut()
      output-xml - create output as XML.
      boolean getXmlPi()
      add-xml-pi- add <?xml?> for XML docs.
      boolean getXmlPIs()
      assume-xml-procins This option specifies if Tidy should change the parsing of processing instructions to require ?> as the terminator rather than >.
      boolean getXmlSpace()
      add-xml-space- if set to yes adds xml:space attr as needed.
      boolean getXmlTags()
      input-xml - treat input as XML.
      static void main​(java.lang.String[] argv)
      Command line interface to parser and pretty printer.
      protected int mainExec​(java.lang.String[] argv)
      Main method, but returns the return code as an int instead of calling System.exit(code).
      Node parse​(java.io.InputStream in, java.io.OutputStream out)
      Reads from the given input and returns the root Node.
      Node parse​(java.io.InputStream in, java.io.Writer out)
      Reads from the given input and returns the root Node.
      Node parse​(java.io.Reader in, java.io.OutputStream out)
      Reads from the given input and returns the root Node.
      Node parse​(java.io.Reader in, java.io.Writer out)
      Reads from the given input and returns the root Node.
      org.w3c.dom.Document parseDOM​(java.io.InputStream in, java.io.OutputStream out)
      Parses InputStream in and returns a DOM Document node.
      org.w3c.dom.Document parseDOM​(java.io.Reader in, java.io.Writer out)  
      void pprint​(org.w3c.dom.Document doc, java.io.OutputStream out)
      Pretty-prints a DOM Document.
      void pprint​(org.w3c.dom.Node node, java.io.OutputStream out)
      Pretty-prints a DOM Node.
      void setAltText​(java.lang.String altText)
      alt-text- default text for alt attribute.
      void setAsciiChars​(boolean asciiChars)
      ascii-chars- convert quotes and dashes to nearest ASCII char.
      void setBreakBeforeBR​(boolean breakBeforeBR)
      break-before-br - output newline before <br>.
      void setBurstSlides​(boolean burstSlides)
      split- create slides on each h2 element.
      void setConfigurationFromFile​(java.lang.String filename)
      Sets the configuration from a configuration file.
      void setConfigurationFromProps​(java.util.Properties props)
      Sets the configuration from a properties object.
      void setDocType​(java.lang.String doctype)
      doctype- user specified doctype.
      void setDropEmptyParas​(boolean dropEmptyParas)
      drop-empty-paras- discard empty p elements.
      void setDropFontTags​(boolean dropFontTags)
      drop-font-tags- discard presentation tags.
      void setDropProprietaryAttributes​(boolean dropProprietaryAttributes)
      drop-proprietary-attributes- discard proprietary attributes.
      void setEmacs​(boolean emacs)
      gnu-emacs- if true format error output for GNU Emacs.
      void setEncloseBlockText​(boolean encloseBlockText)
      enclose-block-text- if true text in blocks is wrapped in <p>'s.
      void setEncloseText​(boolean encloseText)
      enclose-text- if true text at body is wrapped in <p>'s.
      void setErrfile​(java.lang.String errfile)
      Errfile - file name to write errors to.
      void setErrout​(java.io.PrintWriter out)  
      void setEscapeCdata​(boolean escapeCdata)
      escape-cdata- replace CDATA sections with escaped text.
      void setFixBackslash​(boolean fixBackslash)
      fix-backslash- fix URLs by replacing \ with /.
      void setFixComments​(boolean fixComments)
      fix-bad-comments- fix comments with adjacent hyphens.
      void setFixUri​(boolean fixUri)
      fix-uri- fix uri references applying URI encoding if necessary.
      void setForceOutput​(boolean forceOutput)
      force-output- output document even if errors were found.
      void setHideComments​(boolean hideComments)
      hide-comments- hides all (real) comments in output.
      void setHideEndTags​(boolean hideEndTags)
      hide-endtags - suppress optional end tags.
      void setIndentAttributes​(boolean indentAttributes)
      indent-attributes- newline+indent before each attribute.
      void setIndentCdata​(boolean indentCdata)
      indent-cdata- indent CDATA sections.
      void setIndentContent​(boolean indentContent)
      indent - indent content of appropriate tags.
      void setInputEncoding​(java.lang.String encoding)
      input-encoding the character encoding used for input.
      void setInputStreamName​(java.lang.String name)
      InputStreamName - the name of the input stream (printed in the header information).
      void setJoinClasses​(boolean joinClasses)
      join-classes- join multiple class attributes.
      void setJoinStyles​(boolean joinStyles)
      join-styles- join multiple style attributes.
      void setKeepFileTimes​(boolean keepFileTimes)
      keep-time- if true last modified time is preserved.
      void setLiteralAttribs​(boolean literalAttribs)
      literal-attributes- if true attributes may use newlines.
      void setLogicalEmphasis​(boolean logicalEmphasis)
      logical-emphasis- replace i by em and b by strong.
      void setLowerLiterals​(boolean lowerLiterals)
      lower-literals- folds known attribute values to lower case.
      void setMakeBare​(boolean makeBare)
      make-bare - remove Microsoft cruft.
      void setMakeClean​(boolean makeClean)
      make-clean - remove presentational clutter.
      void setMessageListener​(TidyMessageListener listener)
      Attach a TidyMessageListener which will be notified for messages and errors.
      void setNumEntities​(boolean numEntities)
      numeric-entities- output entities other than the built-in HTML entities in the numeric rather than the named entity form.
      void setOnlyErrors​(boolean onlyErrors)
      only-errors - if true normal output is suppressed.
      void setOutputEncoding​(java.lang.String encoding)
      output-encoding the character encoding used for output.
      void setPrintBodyOnly​(boolean bodyOnly)
      print-body-only- output BODY content only.
      void setQuiet​(boolean quiet)
      quiet - no 'Parsing X', guessed DTD or summary.
      void setQuoteAmpersand​(boolean quoteAmpersand)
      quote-ampersand- output naked ampersand as &amp;.
      void setQuoteMarks​(boolean quoteMarks)
      quote-marks- output " marks as &quot;.
      void setQuoteNbsp​(boolean quoteNbsp)
      quote-nbsp- output non-breaking space as entity.
      void setRawOut​(boolean rawOut)
      output-raw- avoid mapping values > 127 to entities.
      void setRepeatedAttributes​(int repeatedAttributes)
      repeated-attributes- keep first or last duplicate attribute.
      void setReplaceColor​(boolean replaceColor)
      replace-color- replace hex color attribute values with names.
      void setShowErrors​(int showErrors)
      show-errors- set the number of errors to put out.
      void setShowWarnings​(boolean showWarnings)
      show-warnings - show warnings? (errors are always shown).
      void setSmartIndent​(boolean smartIndent)
      SmartIndent - does text/block level content affect indentation.
      void setSpaces​(int spaces)
      indent-spaces- default indentation.
      void setTabsize​(int tabsize)
      tab-size- tab size in chars.
      void setTidyMark​(boolean tidyMark)
      tidy-mark- add meta element indicating tidied doc.
      void setTrimEmptyElements​(boolean trimEmpty)
      trim-empty-elements- trim empty elements.
      void setUpperCaseAttrs​(boolean upperCaseAttrs)
      uppercase-attributes - output attributes in upper case.
      void setUpperCaseTags​(boolean upperCaseTags)
      uppercase-tags - output tags in upper case.
      void setWord2000​(boolean word2000)
      word-2000- draconian cleaning for Word2000.
      void setWrapAsp​(boolean wrapAsp)
      wrap-asp- wrap within ASP pseudo elements.
      void setWrapAttVals​(boolean wrapAttVals)
      wrap-attributes- wrap within attribute values.
      void setWrapJste​(boolean wrapJste)
      wrap-jste- wrap within JSTE pseudo elements.
      void setWraplen​(int wraplen)
      wrap- default wrap margin.
      void setWrapPhp​(boolean wrapPhp)
      wrap-php- wrap within PHP pseudo elements.
      void setWrapScriptlets​(boolean wrapScriptlets)
      wrap-script-literals- wrap within JavaScript string literals.
      void setWrapSection​(boolean wrapSection)
      wrap-sections- wrap within <![ ...
      void setWriteback​(boolean writeback)
      writeback - if true then output tidied markup.
      void setXHTML​(boolean xhtml)
      output-xhtml - output extensible HTML.
      void setXmlOut​(boolean xmlOut)
      output-xml - create output as XML.
      void setXmlPi​(boolean xmlPi)
      add-xml-pi- add <?xml?> for XML docs.
      void setXmlPIs​(boolean xmlPIs)
      assume-xml-procins This option specifies if Tidy should change the parsing of processing instructions to require ?> as the terminator rather than >.
      void setXmlSpace​(boolean xmlSpace)
      add-xml-space- if set to yes adds xml:space attr as needed.
      void setXmlTags​(boolean xmlTags)
      input-xml - treat input as XML.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • Tidy

        public Tidy()
        Instantiates a new Tidy instance. It is recommended to use a new instance for each parse.
    • Method Detail

      • getConfiguration

        public Configuration getConfiguration()
        Returns the current configuration.
        Returns:
        tidy configuration
      • getStderr

        public java.io.PrintWriter getStderr()
      • getParseErrors

        public int getParseErrors()
        ParseErrors - the number of errors that occurred in the most recent parse operation.
        Returns:
        number of errors that occurred in the most recent parse operation.
      • getParseWarnings

        public int getParseWarnings()
        ParseWarnings - the number of warnings that occurred in the most recent parse operation.
        Returns:
        number of warnings that occurred in the most recent parse operation.
      • setInputStreamName

        public void setInputStreamName​(java.lang.String name)
        InputStreamName - the name of the input stream (printed in the header information).
        Parameters:
        name - input stream name
      • getInputStreamName

        public java.lang.String getInputStreamName()
      • getErrout

        public java.io.PrintWriter getErrout()
        Errout - the error output stream.
        Returns:
        error output stream.
      • setErrout

        public void setErrout​(java.io.PrintWriter out)
      • setConfigurationFromFile

        public void setConfigurationFromFile​(java.lang.String filename)
        Sets the configuration from a configuration file.
        Parameters:
        filename - configuration file name/path.
      • setConfigurationFromProps

        public void setConfigurationFromProps​(java.util.Properties props)
        Sets the configuration from a properties object.
        Parameters:
        props - Properties object
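        A minimal sketch of preparing the Properties object for setConfigurationFromProps. It assumes that the property keys are the hyphenated option names used throughout this page (indent-spaces, wrap, tab-size, quote-ampersand) and that boolean options are spelled yes/no, as the add-xml-space description hints. Neither assumption is confirmed here, so check both against the Configuration class. The class name TidyPropsSketch is hypothetical.

```java
import java.util.Properties;

final class TidyPropsSketch {
    /** Builds option values keyed by the option names used on this page. */
    static Properties build() {
        Properties props = new Properties();
        props.setProperty("indent-spaces", "2");      // counterpart of setSpaces(int)
        props.setProperty("wrap", "72");              // counterpart of setWraplen(int)
        props.setProperty("tab-size", "4");           // counterpart of setTabsize(int)
        props.setProperty("quote-ampersand", "yes");  // assumed boolean spelling
        return props;
    }

    public static void main(String[] args) {
        Properties props = build();
        // With JTidy on the classpath the object would be handed over like this:
        //   new org.w3c.tidy.Tidy().setConfigurationFromProps(props);
        props.forEach((key, value) -> System.out.println(key + ": " + value));
    }
}
```

        Keeping the options in one Properties object makes it easy to apply the same settings to each new Tidy instance.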
      • createEmptyDocument

        public static org.w3c.dom.Document createEmptyDocument()
        Creates an empty DOM Document.
        Returns:
        a new org.w3c.dom.Document
      • parse

        public Node parse​(java.io.InputStream in,
                          java.io.OutputStream out)
        Reads from the given input and returns the root Node. If out is non-null, pretty prints to out. Warning: caller is responsible for calling close() on input and output after calling this method.
        Parameters:
        in - input
        out - optional destination for pretty-printed document
        Returns:
        parsed org.w3c.tidy.Node
      • parse

        public Node parse​(java.io.Reader in,
                          java.io.OutputStream out)
        Reads from the given input and returns the root Node. If out is non-null, pretty prints to out. Warning: caller is responsible for calling close() on input and output after calling this method.
        Parameters:
        in - input
        out - optional destination for pretty-printed document
        Returns:
        parsed org.w3c.tidy.Node
      • parse

        public Node parse​(java.io.Reader in,
                          java.io.Writer out)
        Reads from the given input and returns the root Node. If out is non-null, pretty prints to out. Warning: caller is responsible for calling close() on input and output after calling this method.
        Parameters:
        in - input
        out - optional destination for pretty-printed document
        Returns:
        parsed org.w3c.tidy.Node
      • parse

        public Node parse​(java.io.InputStream in,
                          java.io.Writer out)
        Reads from the given input and returns the root Node. If out is non-null, pretty prints to out. Warning: caller is responsible for calling close() on input and output after calling this method.
        Parameters:
        in - input
        out - optional destination for pretty-printed document
        Returns:
        parsed org.w3c.tidy.Node
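        Because the parse overloads leave closing to the caller, a try-with-resources block is one way to make sure both streams are closed even when parsing fails. The sketch below is an illustration only: StreamParser is a hypothetical stand-in with the same parameter shape as parse(InputStream, OutputStream), so the code runs without JTidy, and the class name CloseStreamsSketch is also invented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class CloseStreamsSketch {
    /** Hypothetical stand-in for a parse(InputStream, OutputStream) call. */
    interface StreamParser {
        void parse(InputStream in, OutputStream out) throws IOException;
    }

    /** Runs the parser, then closes both streams, even if parsing throws. */
    static String run(StreamParser parser, String html) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (InputStream in = new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
             OutputStream out = sink) {
            parser.parse(in, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // A ByteArrayOutputStream keeps its contents after close().
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A copy-through lambda stands in for the real parser here.
        // With JTidy on the classpath: (in, out) -> new org.w3c.tidy.Tidy().parse(in, out)
        String result = run((in, out) -> {
            int b;
            while ((b = in.read()) != -1) {
                out.write(b);
            }
        }, "<p>hello</p>");
        System.out.println(result);
    }
}
```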
      • parseDOM

        public org.w3c.dom.Document parseDOM​(java.io.InputStream in,
                                             java.io.OutputStream out)
        Parses InputStream in and returns a DOM Document node. If out is non-null, pretty prints to OutputStream out.
        Parameters:
        in - input stream
        out - optional output stream
        Returns:
        parsed org.w3c.dom.Document
      • parseDOM

        public org.w3c.dom.Document parseDOM​(java.io.Reader in,
                                             java.io.Writer out)
      • pprint

        public void pprint​(org.w3c.dom.Document doc,
                           java.io.OutputStream out)
        Pretty-prints a DOM Document. The document must be an instance of org.w3c.tidy.DOMDocumentImpl. Caller is responsible for closing the output stream after calling this method.
        Parameters:
        doc - org.w3c.dom.Document
        out - output stream
      • pprint

        public void pprint​(org.w3c.dom.Node node,
                           java.io.OutputStream out)
        Pretty-prints a DOM Node. Caller is responsible for closing the output stream after calling this method.
        Parameters:
        node - org.w3c.dom.Node. Must be an instance of org.w3c.tidy.DOMNodeImpl.
        out - output stream
      • main

        public static void main​(java.lang.String[] argv)
        Command line interface to parser and pretty printer.
        Parameters:
        argv - command line parameters
      • mainExec

        protected int mainExec​(java.lang.String[] argv)
        Main method, but returns the return code as an int instead of calling System.exit(code). Needed for testing main method without shutting down tests.
        Parameters:
        argv - command line parameters
        Returns:
        return code
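        The split between main and mainExec follows a common pattern: the worker method returns a code, and only a thin wrapper decides whether to terminate the JVM, so tests can call the worker directly. The sketch below shows the pattern in isolation and is not JTidy's implementation. The class name, the -q option and the code 2 are hypothetical, and the worker is static here for brevity, whereas the documented mainExec is a protected instance method.

```java
final class ExitCodeSketch {
    /** Does the work and reports the outcome as a return code; never exits the JVM. */
    static int mainExec(String[] argv) {
        for (String arg : argv) {
            // Hypothetical rule: "-q" is the only option this sketch accepts.
            if (arg.startsWith("-") && !arg.equals("-q")) {
                System.err.println("unknown option: " + arg);
                return 2; // hypothetical code for a usage error
            }
        }
        return 0;
    }

    public static void main(String[] argv) {
        int code = mainExec(argv);
        // Only this thin wrapper may terminate the JVM; returning normally yields status 0.
        if (code != 0) {
            System.exit(code);
        }
    }
}
```

        A test can then assert on the returned code without the test runner being shut down.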
      • setMessageListener

        public void setMessageListener​(TidyMessageListener listener)
        Attach a TidyMessageListener which will be notified for messages and errors.
        Parameters:
        listener - TidyMessageListener implementation
      • setSpaces

        public void setSpaces​(int spaces)
        indent-spaces- default indentation.
        Parameters:
        spaces - number of spaces used for indentation
        See Also:
        Configuration.spaces
      • getSpaces

        public int getSpaces()
        indent-spaces- default indentation.
        Returns:
        number of spaces used for indentation
        See Also:
        Configuration.spaces
      • setWraplen

        public void setWraplen​(int wraplen)
        wrap- default wrap margin.
        Parameters:
        wraplen - default wrap margin
        See Also:
        Configuration.wraplen
      • getWraplen

        public int getWraplen()
        wrap- default wrap margin.
        Returns:
        default wrap margin
        See Also:
        Configuration.wraplen
      • setTabsize

        public void setTabsize​(int tabsize)
        tab-size- tab size in chars.
        Parameters:
        tabsize - tab size in chars
        See Also:
        Configuration.tabsize
      • getTabsize

        public int getTabsize()
        tab-size- tab size in chars.
        Returns:
        tab size in chars
        See Also:
        Configuration.tabsize
      • setErrfile

        public void setErrfile​(java.lang.String errfile)
        Errfile - file name to write errors to.
        Parameters:
        errfile - file name to write errors to
        See Also:
        Configuration.errfile
      • getErrfile

        public java.lang.String getErrfile()
        Errfile - file name to write errors to.
        Returns:
        error file name
        See Also:
        Configuration.errfile
      • setWriteback

        public void setWriteback​(boolean writeback)
        writeback - if true then output tidied markup. NOTE: this property is ignored when parsing from an InputStream.
        Parameters:
        writeback - true= output tidied markup
        See Also:
        Configuration.writeback
      • getWriteback

        public boolean getWriteback()
        writeback - if true then output tidied markup. NOTE: this property is ignored when parsing from an InputStream.
        Returns:
        true if tidy will output tidied markup in input file
        See Also:
        Configuration.writeback
      • setOnlyErrors

        public void setOnlyErrors​(boolean onlyErrors)
        only-errors - if true normal output is suppressed.
        Parameters:
        onlyErrors - if true normal output is suppressed.
        See Also:
        Configuration.onlyErrors
      • getOnlyErrors

        public boolean getOnlyErrors()
        only-errors - if true normal output is suppressed.
        Returns:
        true if normal output is suppressed.
        See Also:
        Configuration.onlyErrors
      • setShowWarnings

        public void setShowWarnings​(boolean showWarnings)
        show-warnings - show warnings? (errors are always shown).
        Parameters:
        showWarnings - if false warnings are not shown
        See Also:
        Configuration.showWarnings
      • getShowWarnings

        public boolean getShowWarnings()
        show-warnings - show warnings? (errors are always shown).
        Returns:
        false if warnings are not shown
        See Also:
        Configuration.showWarnings
      • setQuiet

        public void setQuiet​(boolean quiet)
        quiet - no 'Parsing X', guessed DTD or summary.
        Parameters:
        quiet - true= don't output summary, warnings or errors
        See Also:
        Configuration.quiet
      • getQuiet

        public boolean getQuiet()
        quiet - no 'Parsing X', guessed DTD or summary.
        Returns:
        true if tidy will not output summary, warnings or errors
        See Also:
        Configuration.quiet
      • setIndentContent

        public void setIndentContent​(boolean indentContent)
        indent - indent content of appropriate tags.
        Parameters:
        indentContent - indent content of appropriate tags
        See Also:
        Configuration.indentContent
      • getIndentContent

        public boolean getIndentContent()
        indent - indent content of appropriate tags.
        Returns:
        true if tidy will indent content of appropriate tags
        See Also:
        Configuration.indentContent
      • setSmartIndent

        public void setSmartIndent​(boolean smartIndent)
        SmartIndent - does text/block level content affect indentation.
        Parameters:
        smartIndent - true if text/block level content should affect indentation
        See Also:
        Configuration.smartIndent
      • getSmartIndent

        public boolean getSmartIndent()
        SmartIndent - does text/block level content affect indentation.
        Returns:
        true if text/block level content should affect indentation
        See Also:
        Configuration.smartIndent
      • setHideEndTags

        public void setHideEndTags​(boolean hideEndTags)
        hide-endtags - suppress optional end tags.
        Parameters:
        hideEndTags - true= suppress optional end tags
        See Also:
        Configuration.hideEndTags
      • getHideEndTags

        public boolean getHideEndTags()
        hide-endtags - suppress optional end tags.
        Returns:
        true if tidy will suppress optional end tags
        See Also:
        Configuration.hideEndTags
      • setXmlTags

        public void setXmlTags​(boolean xmlTags)
        input-xml - treat input as XML.
        Parameters:
        xmlTags - true if tidy should treat input as XML
        See Also:
        Configuration.xmlTags
      • getXmlTags

        public boolean getXmlTags()
        input-xml - treat input as XML.
        Returns:
        true if tidy will treat input as XML
        See Also:
        Configuration.xmlTags
      • setXmlOut

        public void setXmlOut​(boolean xmlOut)
        output-xml - create output as XML.
        Parameters:
        xmlOut - true if tidy should create output as xml
        See Also:
        Configuration.xmlOut
      • getXmlOut

        public boolean getXmlOut()
        output-xml - create output as XML.
        Returns:
        true if tidy will create output as xml
        See Also:
        Configuration.xmlOut
      • setXHTML

        public void setXHTML​(boolean xhtml)
        output-xhtml - output extensible HTML.
        Parameters:
        xhtml - true if tidy should output XHTML
        See Also:
        Configuration.xHTML
      • getXHTML

        public boolean getXHTML()
        output-xhtml - output extensible HTML.
        Returns:
        true if tidy will output XHTML
        See Also:
        Configuration.xHTML
      • setUpperCaseTags

        public void setUpperCaseTags​(boolean upperCaseTags)
        uppercase-tags - output tags in upper case.
        Parameters:
        upperCaseTags - true if tidy should output tags in upper case (default is lowercase)
        See Also:
        Configuration.upperCaseTags
      • getUpperCaseTags

        public boolean getUpperCaseTags()
        uppercase-tags - output tags in upper case.
        Returns:
        true if tidy will output tags in upper case
        See Also:
        Configuration.upperCaseTags
      • setUpperCaseAttrs

        public void setUpperCaseAttrs​(boolean upperCaseAttrs)
        uppercase-attributes - output attributes in upper case.
        Parameters:
        upperCaseAttrs - true if tidy should output attributes in upper case (default is lowercase)
        See Also:
        Configuration.upperCaseAttrs
      • getUpperCaseAttrs

        public boolean getUpperCaseAttrs()
        uppercase-attributes - output attributes in upper case.
        Returns:
        true if tidy will output attributes in upper case
        See Also:
        Configuration.upperCaseAttrs
      • setMakeClean

        public void setMakeClean​(boolean makeClean)
        make-clean - remove presentational clutter.
        Parameters:
        makeClean - true to remove presentational clutter
        See Also:
        Configuration.makeClean
      • getMakeClean

        public boolean getMakeClean()
        make-clean - remove presentational clutter.
        Returns:
        true if tidy will remove presentational clutter
        See Also:
        Configuration.makeClean
      • setMakeBare

        public void setMakeBare​(boolean makeBare)
        make-bare - remove Microsoft cruft.
        Parameters:
        makeBare - true to remove Microsoft cruft
        See Also:
        Configuration.makeBare
      • getMakeBare

        public boolean getMakeBare()
        make-bare - remove Microsoft cruft.
        Returns:
        true if tidy will remove Microsoft cruft
        See Also:
        Configuration.makeBare
      • setBreakBeforeBR

        public void setBreakBeforeBR​(boolean breakBeforeBR)
        break-before-br - output newline before <br>.
        Parameters:
        breakBeforeBR - true if tidy should output a newline before <br>
        See Also:
        Configuration.breakBeforeBR
      • getBreakBeforeBR

        public boolean getBreakBeforeBR()
        break-before-br - output newline before <br>.
        Returns:
        true if tidy will output a newline before <br>
        See Also:
        Configuration.breakBeforeBR
      • setBurstSlides

        public void setBurstSlides​(boolean burstSlides)
        split- create slides on each h2 element.
        Parameters:
        burstSlides - true if tidy should create slides on each h2 element
        See Also:
        Configuration.burstSlides
      • getBurstSlides

        public boolean getBurstSlides()
        split- create slides on each h2 element.
        Returns:
        true if tidy will create slides on each h2 element
        See Also:
        Configuration.burstSlides
      • setNumEntities

        public void setNumEntities​(boolean numEntities)
        numeric-entities- output entities other than the built-in HTML entities in the numeric rather than the named entity form.
        Parameters:
        numEntities - true if tidy should output entities in the numeric form.
        See Also:
        Configuration.numEntities
      • getNumEntities

        public boolean getNumEntities()
        numeric-entities- output entities other than the built-in HTML entities in the numeric rather than the named entity form.
        Returns:
        true if tidy will output entities in the numeric form.
        See Also:
        Configuration.numEntities
      • setQuoteMarks

        public void setQuoteMarks​(boolean quoteMarks)
        quote-marks- output " marks as &quot;.
        Parameters:
        quoteMarks - true if tidy should output " marks as &quot;
        See Also:
        Configuration.quoteMarks
      • getQuoteMarks

        public boolean getQuoteMarks()
        quote-marks- output " marks as &quot;.
        Returns:
        true if tidy will output " marks as &quot;
        See Also:
        Configuration.quoteMarks
      • setQuoteNbsp

        public void setQuoteNbsp​(boolean quoteNbsp)
        quote-nbsp- output non-breaking space as entity.
        Parameters:
        quoteNbsp - true if tidy should output non-breaking space as entity
        See Also:
        Configuration.quoteNbsp
      • getQuoteNbsp

        public boolean getQuoteNbsp()
        quote-nbsp- output non-breaking space as entity.
        Returns:
        true if tidy will output non-breaking space as entity
        See Also:
        Configuration.quoteNbsp
      • setQuoteAmpersand

        public void setQuoteAmpersand​(boolean quoteAmpersand)
        quote-ampersand- output naked ampersand as &amp;.
        Parameters:
        quoteAmpersand - true if tidy should output naked ampersand as &amp;
        See Also:
        Configuration.quoteAmpersand
      • getQuoteAmpersand

        public boolean getQuoteAmpersand()
        quote-ampersand- output naked ampersand as &amp;.
        Returns:
        true if tidy will output naked ampersand as &amp;
        See Also:
        Configuration.quoteAmpersand
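
        The effect of quote-ampersand can be pictured with a small standalone sketch. It does not call JTidy: the class AmpersandSketch, the method quoteNakedAmpersands, and the regular expression are assumptions made for illustration, and JTidy's own lexer may treat edge cases differently. With JTidy itself the option would be switched on through setQuoteAmpersand(true).

```java
// Illustration only: mimics the effect described for quote-ampersand.
// AmpersandSketch and quoteNakedAmpersands are hypothetical names, not JTidy API.
final class AmpersandSketch {

    // An ampersand counts as "naked" here when it does not start a named,
    // decimal, or hexadecimal character reference terminated by ';'.
    private static final java.util.regex.Pattern NAKED_AMP = java.util.regex.Pattern.compile(
            "&(?![A-Za-z][A-Za-z0-9]*;|#[0-9]+;|#[xX][0-9A-Fa-f]+;)");

    static String quoteNakedAmpersands(String text) {
        return NAKED_AMP.matcher(text).replaceAll("&amp;");
    }

    public static void main(String[] args) {
        // The first ampersand is naked and gets escaped; the existing &amp; is left alone.
        System.out.println(quoteNakedAmpersands("Fish & chips &amp; peas"));
    }
}
```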
      • setWrapAttVals

        public void setWrapAttVals​(boolean wrapAttVals)
        wrap-attributes- wrap within attribute values.
        Parameters:
        wrapAttVals - true if tidy should wrap within attribute values
        See Also:
        Configuration.wrapAttVals
      • getWrapAttVals

        public boolean getWrapAttVals()
        wrap-attributes- wrap within attribute values.
        Returns:
        true if tidy will wrap within attribute values
        See Also:
        Configuration.wrapAttVals
      • setWrapScriptlets

        public void setWrapScriptlets​(boolean wrapScriptlets)
        wrap-script-literals- wrap within JavaScript string literals.
        Parameters:
        wrapScriptlets - true if tidy should wrap within JavaScript string literals
        See Also:
        Configuration.wrapScriptlets
      • getWrapScriptlets

        public boolean getWrapScriptlets()
        wrap-script-literals- wrap within JavaScript string literals.
        Returns:
        true if tidy will wrap within JavaScript string literals
        See Also:
        Configuration.wrapScriptlets
      • setWrapSection

        public void setWrapSection​(boolean wrapSection)
        wrap-sections- wrap within <![ ... ]> section tags
        Parameters:
        wrapSection - true if tidy should wrap within <![ ... ]> section tags
        See Also:
        Configuration.wrapSection
      • getWrapSection

        public boolean getWrapSection()
        wrap-sections- wrap within <![ ... ]> section tags
        Returns:
        true if tidy will wrap within <![ ... ]> section tags
        See Also:
        Configuration.wrapSection
      • setAltText

        public void setAltText​(java.lang.String altText)
        alt-text- default text for alt attribute.
        Parameters:
        altText - default text for alt attribute
        See Also:
        Configuration.altText
      • getAltText

        public java.lang.String getAltText()
        alt-text- default text for alt attribute.
        Returns:
        default text for alt attribute
        See Also:
        Configuration.altText
      • setXmlPi

        public void setXmlPi​(boolean xmlPi)
        add-xml-pi- add <?xml?> for XML docs.
        Parameters:
        xmlPi - true if tidy should add <?xml?> for XML docs
        See Also:
        Configuration.xmlPi
      • getXmlPi

        public boolean getXmlPi()
        add-xml-pi- add <?xml?> for XML docs.
        Returns:
        true if tidy will add <?xml?> for XML docs
        See Also:
        Configuration.xmlPi
      • setDropFontTags

        public void setDropFontTags​(boolean dropFontTags)
        drop-font-tags- discard presentation tags.
        Parameters:
        dropFontTags - true if tidy should discard presentation tags
        See Also:
        Configuration.dropFontTags
      • getDropFontTags

        public boolean getDropFontTags()
        drop-font-tags- discard presentation tags.
        Returns:
        true if tidy will discard presentation tags
        See Also:
        Configuration.dropFontTags
      • setDropProprietaryAttributes

        public void setDropProprietaryAttributes​(boolean dropProprietaryAttributes)
        drop-proprietary-attributes- discard proprietary attributes.
        Parameters:
        dropProprietaryAttributes - true if tidy should discard proprietary attributes
        See Also:
        Configuration.dropProprietaryAttributes
      • getDropProprietaryAttributes

        public boolean getDropProprietaryAttributes()
        drop-proprietary-attributes- discard proprietary attributes.
        Returns:
        true if tidy will discard proprietary attributes
        See Also:
        Configuration.dropProprietaryAttributes
      • setDropEmptyParas

        public void setDropEmptyParas​(boolean dropEmptyParas)
        drop-empty-paras- discard empty p elements.
        Parameters:
        dropEmptyParas - true if tidy should discard empty p elements
        See Also:
        Configuration.dropEmptyParas
      • getDropEmptyParas

        public boolean getDropEmptyParas()
        drop-empty-paras- discard empty p elements.
        Returns:
        true if tidy will discard empty p elements
        See Also:
        Configuration.dropEmptyParas
      • setFixComments

        public void setFixComments​(boolean fixComments)
        fix-bad-comments- fix comments with adjacent hyphens.
        Parameters:
        fixComments - true if tidy should fix comments with adjacent hyphens
        See Also:
        Configuration.fixComments
      • getFixComments

        public boolean getFixComments()
        fix-bad-comments- fix comments with adjacent hyphens.
        Returns:
        true if tidy will fix comments with adjacent hyphens
        See Also:
        Configuration.fixComments
      • setWrapAsp

        public void setWrapAsp​(boolean wrapAsp)
        wrap-asp- wrap within ASP pseudo elements.
        Parameters:
        wrapAsp - true if tidy should wrap within ASP pseudo elements
        See Also:
        Configuration.wrapAsp
      • getWrapAsp

        public boolean getWrapAsp()
        wrap-asp- wrap within ASP pseudo elements.
        Returns:
        true if tidy will wrap within ASP pseudo elements
        See Also:
        Configuration.wrapAsp
      • setWrapJste

        public void setWrapJste​(boolean wrapJste)
        wrap-jste- wrap within JSTE pseudo elements.
        Parameters:
        wrapJste - true if tidy should wrap within JSTE pseudo elements
        See Also:
        Configuration.wrapJste
      • getWrapJste

        public boolean getWrapJste()
        wrap-jste- wrap within JSTE pseudo elements.
        Returns:
        true if tidy will wrap within JSTE pseudo elements
        See Also:
        Configuration.wrapJste
      • setWrapPhp

        public void setWrapPhp​(boolean wrapPhp)
        wrap-php- wrap within PHP pseudo elements.
        Parameters:
        wrapPhp - true if tidy should wrap within PHP pseudo elements
        See Also:
        Configuration.wrapPhp
      • getWrapPhp

        public boolean getWrapPhp()
        wrap-php- wrap within PHP pseudo elements.
        Returns:
        true if tidy will wrap within PHP pseudo elements
        See Also:
        Configuration.wrapPhp
      • setFixBackslash

        public void setFixBackslash​(boolean fixBackslash)
        fix-backslash- fix URLs by replacing \ with /.
        Parameters:
        fixBackslash - true if tidy should fix URLs by replacing \ with /
        See Also:
        Configuration.fixBackslash
      • getFixBackslash

        public boolean getFixBackslash()
        fix-backslash- fix URLs by replacing \ with /.
        Returns:
        true if tidy will fix URLs by replacing \ with /
        See Also:
        Configuration.fixBackslash
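
        As a rough picture of what fix-backslash does to a single URL value, the sketch below replaces every backslash with a forward slash. BackslashSketch and fixBackslash are hypothetical names used only for this illustration; which attributes JTidy treats as URLs is not shown here.

```java
// Illustration only: the effect described for fix-backslash, applied to one URL string.
// BackslashSketch and fixBackslash are hypothetical names, not JTidy API.
final class BackslashSketch {

    static String fixBackslash(String url) {
        // String.replace(char, char) swaps every occurrence; no regex is involved.
        return url.replace('\\', '/');
    }

    public static void main(String[] args) {
        System.out.println(fixBackslash("images\\icons\\logo.png"));
    }
}
```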
      • setIndentAttributes

        public void setIndentAttributes​(boolean indentAttributes)
        indent-attributes- newline+indent before each attribute.
        Parameters:
        indentAttributes - true if tidy should output a newline+indent before each attribute
        See Also:
        Configuration.indentAttributes
      • getIndentAttributes

        public boolean getIndentAttributes()
        indent-attributes- newline+indent before each attribute.
        Returns:
        true if tidy will output a newline+indent before each attribute
        See Also:
        Configuration.indentAttributes
      • setDocType

        public void setDocType​(java.lang.String doctype)
        doctype- user specified doctype.
        Parameters:
        doctype - omit | auto | strict | loose | fpi, where the fpi is a string similar to "-//ACME//DTD HTML 3.14159//EN". Note: for fpi, include the double-quotes in the string.
        See Also:
        Configuration.docTypeStr, Configuration.docTypeMode
      • getDocType

        public java.lang.String getDocType()
        doctype- user specified doctype.
        Returns:
        omit | auto | strict | loose | fpi, where the fpi is a string similar to "-//ACME//DTD HTML 3.14159//EN". Note: for fpi, include the double-quotes in the string.
        See Also:
        Configuration.docTypeStr, Configuration.docTypeMode
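
        The note about double-quotes is easy to miss in Java source, where the quotes have to be escaped inside the string literal. The sketch below shows such a literal and one possible way to tell the documented values apart. DocTypeSketch and classify are hypothetical names, and the classification rule is an assumption for illustration, not JTidy's parsing code.

```java
// Illustration only: distinguishing the documented doctype values.
// DocTypeSketch and classify are hypothetical names, not JTidy API.
final class DocTypeSketch {

    static String classify(String doctype) {
        // An FPI is passed with its double quotes included in the string.
        if (doctype.length() >= 2 && doctype.startsWith("\"") && doctype.endsWith("\"")) {
            return "fpi";
        }
        switch (doctype) {
            case "omit":
            case "auto":
            case "strict":
            case "loose":
                return doctype;
            default:
                throw new IllegalArgumentException("unrecognised doctype value: " + doctype);
        }
    }

    public static void main(String[] args) {
        // The escaped quotes are part of the value that would be handed to setDocType.
        String fpi = "\"-//ACME//DTD HTML 3.14159//EN\"";
        System.out.println(classify("strict"));
        System.out.println(classify(fpi));
    }
}
```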
      • setLogicalEmphasis

        public void setLogicalEmphasis​(boolean logicalEmphasis)
        logical-emphasis- replace i by em and b by strong.
        Parameters:
        logicalEmphasis - true if tidy should replace i by em and b by strong
        See Also:
        Configuration.logicalEmphasis
      • getLogicalEmphasis

        public boolean getLogicalEmphasis()
        logical-emphasis- replace i by em and b by strong.
        Returns:
        true if tidy will replace i by em and b by strong
        See Also:
        Configuration.logicalEmphasis
      • setXmlPIs

        public void setXmlPIs​(boolean xmlPIs)
        assume-xml-procins- specifies whether Tidy should change the parsing of processing instructions to require ?> as the terminator rather than >. This option is automatically set if the input is in XML.
        Parameters:
        xmlPIs - true if tidy should expect a ?> at the end of processing instructions
        See Also:
        Configuration.xmlPIs
      • getXmlPIs

        public boolean getXmlPIs()
        assume-xml-procins- specifies whether Tidy should change the parsing of processing instructions to require ?> as the terminator rather than >. This option is automatically set if the input is in XML.
        Returns:
        true if tidy will expect a ?> at the end of processing instructions
        See Also:
        Configuration.xmlPIs
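
        The difference between the two terminators matters when a processing instruction contains a > of its own. The following standalone sketch locates the end of a processing instruction under each setting. PiTerminatorSketch and endOfPi are hypothetical names, and the search is a simplification for illustration rather than JTidy's lexer.

```java
// Illustration only: where a processing instruction ends under each setting.
// PiTerminatorSketch and endOfPi are hypothetical names, not JTidy API.
final class PiTerminatorSketch {

    // 'start' is the index of the opening "<?".
    // Returns the index just past the terminator, or -1 if none is found.
    static int endOfPi(String text, int start, boolean xmlPIs) {
        String terminator = xmlPIs ? "?>" : ">";
        int at = text.indexOf(terminator, start + 2);
        return at < 0 ? -1 : at + terminator.length();
    }

    public static void main(String[] args) {
        String text = "<?php if ($a > 1) echo 'x'; ?><p>";
        // With xmlPIs false the instruction is cut at the first '>'.
        System.out.println(text.substring(0, endOfPi(text, 0, false)));
        // With xmlPIs true it runs to the closing "?>".
        System.out.println(text.substring(0, endOfPi(text, 0, true)));
    }
}
```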
      • setEncloseText

        public void setEncloseText​(boolean encloseText)
        enclose-text- if true text at body is wrapped in <p>'s.
        Parameters:
        encloseText - true if tidy should wrap text at body in <p>'s.
        See Also:
        Configuration.encloseBodyText
      • getEncloseText

        public boolean getEncloseText()
        enclose-text- if true text at body is wrapped in <p>'s.
        Returns:
        true if tidy will wrap text at body in <p>'s.
        See Also:
        Configuration.encloseBodyText
      • setEncloseBlockText

        public void setEncloseBlockText​(boolean encloseBlockText)
        enclose-block-text- if true text in blocks is wrapped in <p>'s.
        Parameters:
        encloseBlockText - true if tidy should wrap text in blocks in <p>'s.
        See Also:
        Configuration.encloseBlockText
      • getEncloseBlockText

        public boolean getEncloseBlockText()
        enclose-block-text- if true text in blocks is wrapped in <p>'s.
        Returns:
        true if tidy will wrap text in blocks in <p>'s.
        See Also:
        Configuration.encloseBlockText
      • setWord2000

        public void setWord2000​(boolean word2000)
        word-2000- draconian cleaning for Word2000.
        Parameters:
        word2000 - true if tidy should clean word2000 documents
        See Also:
        Configuration.word2000
      • getWord2000

        public boolean getWord2000()
        word-2000- draconian cleaning for Word2000.
        Returns:
        true if tidy will clean word2000 documents
        See Also:
        Configuration.word2000
      • setTidyMark

        public void setTidyMark​(boolean tidyMark)
        tidy-mark- add meta element indicating tidied doc.
        Parameters:
        tidyMark - true if tidy should add meta element indicating tidied doc
        See Also:
        Configuration.tidyMark
      • getTidyMark

        public boolean getTidyMark()
        tidy-mark- add meta element indicating tidied doc.
        Returns:
        true if tidy will add meta element indicating tidied doc
        See Also:
        Configuration.tidyMark
      • setXmlSpace

        public void setXmlSpace​(boolean xmlSpace)
        add-xml-space- if set to yes adds xml:space attr as needed.
        Parameters:
        xmlSpace - true if tidy should add xml:space attr as needed
        See Also:
        Configuration.xmlSpace
      • getXmlSpace

        public boolean getXmlSpace()
        add-xml-space- if set to yes adds xml:space attr as needed.
        Returns:
        true if tidy will add xml:space attr as needed
        See Also:
        Configuration.xmlSpace
      • setEmacs

        public void setEmacs​(boolean emacs)
        gnu-emacs- if true format error output for GNU Emacs.
        Parameters:
        emacs - true if tidy should format error output for GNU Emacs
        See Also:
        Configuration.emacs
      • getEmacs

        public boolean getEmacs()
        gnu-emacs- if true format error output for GNU Emacs.
        Returns:
        true if tidy will format error output for GNU Emacs
        See Also:
        Configuration.emacs
      • setLiteralAttribs

        public void setLiteralAttribs​(boolean literalAttribs)
        literal-attributes- if true attributes may use newlines.
        Parameters:
        literalAttribs - true if attributes may use newlines
        See Also:
        Configuration.literalAttribs
      • getLiteralAttribs

        public boolean getLiteralAttribs()
        literal-attributes- if true attributes may use newlines.
        Returns:
        true if attributes may use newlines
        See Also:
        Configuration.literalAttribs
      • setPrintBodyOnly

        public void setPrintBodyOnly​(boolean bodyOnly)
        print-body-only- output BODY content only.
        Parameters:
        bodyOnly - true = print only the document body
        See Also:
        Configuration.bodyOnly
      • getPrintBodyOnly

        public boolean getPrintBodyOnly()
        print-body-only- output BODY content only.
        Returns:
        true if tidy will print only the document body
      • setFixUri

        public void setFixUri​(boolean fixUri)
        fix-uri- fix uri references applying URI encoding if necessary.
        Parameters:
        fixUri - true = fix uri references
        See Also:
        Configuration.fixUri
      • getFixUri

        public boolean getFixUri()
        fix-uri- fix uri references applying URI encoding if necessary.
        Returns:
        true if tidy will fix uri references
      • setLowerLiterals

        public void setLowerLiterals​(boolean lowerLiterals)
        lower-literals- folds known attribute values to lower case.
        Parameters:
        lowerLiterals - true = folds known attribute values to lower case
        See Also:
        Configuration.lowerLiterals
      • getLowerLiterals

        public boolean getLowerLiterals()
        lower-literals- folds known attribute values to lower case.
        Returns:
        true if tidy will fold known attribute values to lower case
      • setHideComments

        public void setHideComments​(boolean hideComments)
        hide-comments- hides all (real) comments in output.
        Parameters:
        hideComments - true = hides all comments in output
        See Also:
        Configuration.hideComments
      • getHideComments

        public boolean getHideComments()
        hide-comments- hides all (real) comments in output.
        Returns:
        true if tidy will hide all comments in output
      • setIndentCdata

        public void setIndentCdata​(boolean indentCdata)
        indent-cdata- indent CDATA sections.
        Parameters:
        indentCdata - true = indent CDATA sections
        See Also:
        Configuration.indentCdata
      • getIndentCdata

        public boolean getIndentCdata()
        indent-cdata- indent CDATA sections.
        Returns:
        true if tidy will indent CDATA sections
      • setForceOutput

        public void setForceOutput​(boolean forceOutput)
        force-output- output document even if errors were found.
        Parameters:
        forceOutput - true = output document even if errors were found
        See Also:
        Configuration.forceOutput
      • getForceOutput

        public boolean getForceOutput()
        force-output- output document even if errors were found.
        Returns:
        true if tidy will output document even if errors were found
      • setShowErrors

        public void setShowErrors​(int showErrors)
        show-errors- set the number of errors to put out.
        Parameters:
        showErrors - number of errors to put out
        See Also:
        Configuration.showErrors
      • getShowErrors

        public int getShowErrors()
        show-errors- number of errors to put out.
        Returns:
        the number of errors tidy will put out
      • setAsciiChars

        public void setAsciiChars​(boolean asciiChars)
        ascii-chars- convert quotes and dashes to nearest ASCII char.
        Parameters:
        asciiChars - true = convert quotes and dashes to nearest ASCII char
        See Also:
        Configuration.asciiChars
      • getAsciiChars

        public boolean getAsciiChars()
        ascii-chars- convert quotes and dashes to nearest ASCII char.
        Returns:
        true if tidy will convert quotes and dashes to nearest ASCII char
      • setJoinClasses

        public void setJoinClasses​(boolean joinClasses)
        join-classes- join multiple class attributes.
        Parameters:
        joinClasses - true = join multiple class attributes
        See Also:
        Configuration.joinClasses
      • getJoinClasses

        public boolean getJoinClasses()
        join-classes- join multiple class attributes.
        Returns:
        true if tidy will join multiple class attributes
      • setJoinStyles

        public void setJoinStyles​(boolean joinStyles)
        join-styles- join multiple style attributes.
        Parameters:
        joinStyles - true = join multiple style attributes
        See Also:
        Configuration.joinStyles
      • getJoinStyles

        public boolean getJoinStyles()
        join-styles- join multiple style attributes.
        Returns:
        true if tidy will join multiple style attributes
      • setTrimEmptyElements

        public void setTrimEmptyElements​(boolean trimEmpty)
        trim-empty-elements- trim empty elements.
        Parameters:
        trimEmpty - true = trim empty elements
        See Also:
        Configuration.trimEmpty
      • getTrimEmptyElements

        public boolean getTrimEmptyElements()
        trim-empty-elements- trim empty elements.
        Returns:
        true if tidy will trim empty elements
      • setReplaceColor

        public void setReplaceColor​(boolean replaceColor)
        replace-color- replace hex color attribute values with names.
        Parameters:
        replaceColor - true = replace hex color attribute values with names
        See Also:
        Configuration.replaceColor
      • getReplaceColor

        public boolean getReplaceColor()
        replace-color- replace hex color attribute values with names.
        Returns:
        true if tidy will replace hex color attribute values with names
      • setEscapeCdata

        public void setEscapeCdata​(boolean escapeCdata)
        escape-cdata- replace CDATA sections with escaped text.
        Parameters:
        escapeCdata - true = replace CDATA sections with escaped text
        See Also:
        Configuration.escapeCdata
      • getEscapeCdata

        public boolean getEscapeCdata()
        escape-cdata- replace CDATA sections with escaped text.
        Returns:
        true if tidy will replace CDATA sections with escaped text
      • setRepeatedAttributes

        public void setRepeatedAttributes​(int repeatedAttributes)
        repeated-attributes- keep first or last duplicate attribute.
        Parameters:
        repeatedAttributes - Configuration.KEEP_FIRST | Configuration.KEEP_LAST
        See Also:
        Configuration.duplicateAttrs
      • getRepeatedAttributes

        public int getRepeatedAttributes()
        repeated-attributes- keep first or last duplicate attribute.
        Returns:
        Configuration.KEEP_FIRST | Configuration.KEEP_LAST
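
        The two policies can be illustrated without JTidy by deduplicating a list of name/value pairs. In the sketch below, RepeatedAttrSketch, dedupe, and the boolean keepFirst flag are hypothetical stand-ins for the Configuration.KEEP_FIRST and Configuration.KEEP_LAST constants; it shows the described outcome, not JTidy's implementation.

```java
// Illustration only: keep-first versus keep-last handling of duplicate attributes.
// RepeatedAttrSketch and dedupe are hypothetical names, not JTidy API.
final class RepeatedAttrSketch {

    // Each attribute is a {name, value} pair, listed in source order.
    static java.util.Map<String, String> dedupe(String[][] attrs, boolean keepFirst) {
        java.util.Map<String, String> kept = new java.util.LinkedHashMap<>();
        for (String[] attr : attrs) {
            if (keepFirst) {
                kept.putIfAbsent(attr[0], attr[1]); // later duplicates are ignored
            } else {
                kept.put(attr[0], attr[1]);         // later duplicates overwrite earlier ones
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String[][] attrs = {{"align", "left"}, {"id", "x"}, {"align", "right"}};
        System.out.println(dedupe(attrs, true));
        System.out.println(dedupe(attrs, false));
    }
}
```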
      • setKeepFileTimes

        public void setKeepFileTimes​(boolean keepFileTimes)
        keep-time- if true last modified time is preserved.
        Parameters:
        keepFileTimes - true if tidy should preserve the last modified time of the input file.
        See Also:
        Configuration.keepFileTimes
        To Do:
        this is NOT supported at this time.
      • getKeepFileTimes

        public boolean getKeepFileTimes()
        keep-time- if true last modified time is preserved.
        Returns:
        true if tidy will preserve the last modified time of the input file.
        See Also:
        Configuration.keepFileTimes
        To Do:
        this is NOT supported at this time.
      • setRawOut

        public void setRawOut​(boolean rawOut)
        output-raw- avoid mapping values > 127 to entities. This has the same effect as specifying a "raw" encoding in the original version of tidy.
        Parameters:
        rawOut - avoid mapping values > 127 to entities
        See Also:
        Configuration.rawOut
      • getRawOut

        public boolean getRawOut()
        output-raw- avoid mapping values > 127 to entities.
        Returns:
        true if tidy will not map values > 127 to entities
        See Also:
        Configuration.rawOut
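
        To make "mapping values > 127 to entities" concrete, the standalone sketch below writes each such character as a numeric character reference unless the raw flag is set. RawOutSketch and encode are hypothetical names. Numeric references are used here for simplicity, which is an assumption: the form JTidy actually emits may depend on other options such as numeric-entities and the output encoding.

```java
// Illustration only: values above 127 become entities unless raw output is requested.
// RawOutSketch and encode are hypothetical names, not JTidy API.
final class RawOutSketch {

    static String encode(String text, boolean rawOut) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp > 127 && !rawOut) {
                out.append("&#").append(cp).append(';'); // decimal numeric reference
            } else {
                out.appendCodePoint(cp);                 // passed through unchanged
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("caf\u00e9", false));
        System.out.println(encode("caf\u00e9", true));
    }
}
```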
      • setInputEncoding

        public void setInputEncoding​(java.lang.String encoding)
        input-encoding- the character encoding used for input.
        Parameters:
        encoding - a valid java encoding name
      • getInputEncoding

        public java.lang.String getInputEncoding()
        input-encoding- the character encoding used for input.
        Returns:
        the java name of the encoding currently used for input
      • setOutputEncoding

        public void setOutputEncoding​(java.lang.String encoding)
        output-encoding- the character encoding used for output.
        Parameters:
        encoding - a valid java encoding name
      • getOutputEncoding

        public java.lang.String getOutputEncoding()
        output-encoding- the character encoding used for output.
        Returns:
        the java name of the encoding currently used for output
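
        Because both encoding setters expect a valid Java encoding name, it can help to check a name before passing it in. The sketch below does this with the standard java.nio.charset.Charset class. EncodingNameSketch and isUsableEncoding are hypothetical names, and it is an assumption that every name the JVM recognises is also accepted by setInputEncoding and setOutputEncoding.

```java
// Illustration only: checking an encoding name with the JDK before handing it to tidy.
// EncodingNameSketch and isUsableEncoding are hypothetical names, not JTidy API.
final class EncodingNameSketch {

    static boolean isUsableEncoding(String name) {
        try {
            return java.nio.charset.Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            // Covers IllegalCharsetNameException (a subclass) and a null name.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isUsableEncoding("UTF-8"));
        System.out.println(isUsableEncoding("not a charset!"));
    }
}
```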