Class XmlDetagger

  • All Implemented Interfaces:
    AnalysisComponent

    public class XmlDetagger
    extends CasAnnotator_ImplBase
    A multi-sofa annotator that does XML detagging. Reads XML data from the input Sofa (named "xmlDocument"); this data can be stored in the CAS as a string or array, or it can be a URI to a remote file. The XML is parsed using the JVM's default parser, and the plain-text content is written to a new sofa called "plainTextDocument".
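The detagging step described above can be illustrated without any UIMA dependency. The following is a minimal sketch, not the XmlDetagger implementation: it assumes the XML is already available as a string, parses it with the JVM's default SAX parser, and collects the character data. The class name `DetagSketch` and the method `detag` are hypothetical, and reading from or writing to sofas is left out.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only; not part of UIMA.
class DetagSketch {

    /** Returns the concatenated character data of the given XML string. */
    static String detag(String xml) {
        StringBuilder text = new StringBuilder();
        // Keep every run of character data and ignore all tags.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        try {
            // SAXParserFactory.newInstance() selects the JVM's default parser.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML input", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        System.out.println(detag("<doc><title>Hello</title> <body>world</body></doc>"));
    }
}
```

In the annotator itself, the resulting string would be what ends up in the "plainTextDocument" sofa; the sketch only returns it.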
    • Field Detail

      • PARAM_TEXT_TAG

        public static final java.lang.String PARAM_TEXT_TAG
        Name of the optional configuration parameter that contains the name of an XML tag appearing in the input file. Only text that falls within this XML tag is considered part of the "document" that is added to the CAS by this annotator. If not specified, the entire file is considered the document.
        See Also:
        Constant Field Values
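The effect of the text-tag parameter can be sketched in plain Java. This is an illustration under stated assumptions rather than the annotator's actual code: the class `TextTagSketch`, the method `extract`, and the nesting counter are hypothetical, and a `null` tag name stands in for "parameter not specified".

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only; not part of UIMA.
class TextTagSketch {

    /**
     * Returns the character data found inside elements named textTag,
     * or all character data when textTag is null.
     */
    static String extract(String xml, String textTag) {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            // Counts how many textTag elements are currently open.
            // Starts at 1 when no tag is given, so everything is kept.
            private int depth = (textTag == null) ? 1 : 0;

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if (qName.equals(textTag)) depth++;
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (qName.equals(textTag)) depth--;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (depth > 0) text.append(ch, start, length);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML input", e);
        }
        return text.toString();
    }
}
```

For example, `TextTagSketch.extract(xml, "TEXT")` keeps only the content of `TEXT` elements (including text in their child elements), while `TextTagSketch.extract(xml, null)` treats the entire file as the document.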
    • Constructor Detail

      • XmlDetagger

        public XmlDetagger()