Class Utils


  • public class Utils
    extends java.lang.Object

    Common utilities.

    Created by: Vladimir Nikic
    Date: November, 2006.
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static java.util.regex.Pattern DECIMAL  
      static java.util.regex.Pattern HEX_RELAXED  
      static java.util.regex.Pattern HEX_STRICT  
    • Constructor Summary

      Constructors 
      Constructor Description
      Utils()  
    • Method Summary

      All Methods Static Methods Concrete Methods 
      Modifier and Type Method Description
      static java.lang.String deserializeEntities​(java.lang.String str, boolean recognizeUnicodeChars)  
      static java.lang.String escapeHtml​(java.lang.String s, CleanerProperties props)
      Escapes an HTML string.
      static java.lang.String escapeXml​(java.lang.String s, boolean advanced, boolean recognizeUnicodeChars, boolean translateSpecialEntities, boolean isDomCreation, boolean transResCharsToNCR, boolean translateSpecialEntitiesToNCR)
      Change notes: 1) converts ASCII characters encoded in the &#xx; format back to the ASCII characters themselves, since such encoding may be an attempt to slip in malicious HTML; 2) converts &#xxx; format characters to the &quot;-style (named entity) representation, if one is available for the character.
      static java.lang.String escapeXml​(java.lang.String s, boolean advanced, boolean recognizeUnicodeChars, boolean translateSpecialEntities, boolean isDomCreation, boolean transResCharsToNCR, boolean translateSpecialEntitiesToNCR, boolean isHtmlOutput)
      Change notes: 1) converts ASCII characters encoded in the &#xx; format back to the ASCII characters themselves, since such encoding may be an attempt to slip in malicious HTML; 2) converts &#xxx; format characters to the &quot;-style (named entity) representation, if one is available for the character.
      static java.lang.String escapeXml​(java.lang.String s, CleanerProperties props, boolean isDomCreation)
      Escapes XML string.
      static java.lang.String fullUrl​(java.lang.String pageUrl, java.lang.String link)
      Calculates the full URL for the specified page URL and link, where the link may be full, absolute or relative, as found in A or IMG tags.
      static java.lang.String getXmlName​(java.lang.String name)  
      static java.lang.String getXmlNSPrefix​(java.lang.String name)  
      static boolean isEmptyString​(java.lang.Object o)  
      static boolean isFullUrl​(java.lang.String link)
      Checks if specified link is full URL.
      static boolean isValidHtmlAttributeName​(java.lang.String name)  
      static boolean isValidXmlIdentifier​(java.lang.String s)
      Checks whether the specified string can be a valid tag name or attribute name in XML.
      static boolean isValidXmlIdentifierStartChar​(java.lang.String identifier)
      Determines whether the initial character of an identifier is valid for XML
      static boolean isWhitespaceString​(java.lang.Object object)
      Checks whether the specified object's string representation is an empty string (containing only whitespace).
      static boolean isXmlReservedCharacter​(java.lang.String c)  
      static java.lang.String ltrim​(java.lang.String s)
      Trims specified string from left.
      static java.lang.String replaceInvalidXmlIdentifierCharacters​(java.lang.String name, java.lang.String replacement)
      Strips out invalid characters from names used for XML Elements and replaces them with the specified character.
      static java.lang.String rtrim​(java.lang.String s)
      Trims specified string from right.
      static java.lang.String sanitizeHtmlAttributeName​(java.lang.String name)  
      static java.lang.String sanitizeXmlIdentifier​(java.lang.String attName)  
      static java.lang.String sanitizeXmlIdentifier​(java.lang.String attName, java.lang.String prefix)  
      static java.lang.String sanitizeXmlIdentifier​(java.lang.String attName, java.lang.String prefix, java.lang.String replacementCharacter)
      Attempts to replace invalid attribute names with valid ones.
      static java.lang.String[] tokenize​(java.lang.String s, java.lang.String delimiters)  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • HEX_STRICT

        public static java.util.regex.Pattern HEX_STRICT
      • HEX_RELAXED

        public static java.util.regex.Pattern HEX_RELAXED
      • DECIMAL

        public static java.util.regex.Pattern DECIMAL
    • Constructor Detail

      • Utils

        public Utils()
    • Method Detail

      • isFullUrl

        public static boolean isFullUrl​(java.lang.String link)
        Checks if specified link is full URL.
        Parameters:
        link - the link to check
        Returns:
        True if the link is a full URL, false otherwise.
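        The check can be approximated with the standard library alone. The sketch below does not call Utils.isFullUrl and is not its implementation; looksLikeFullUrl is a hypothetical helper, and treating "has both a scheme and a host" as the meaning of "full URL" is an assumption.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class FullUrlCheckSketch {

    // Hypothetical stand-in: a link counts as "full" when it parses as a URI
    // that carries both a scheme (e.g. http) and a host.
    static boolean looksLikeFullUrl(String link) {
        if (link == null) {
            return false;
        }
        try {
            URI uri = new URI(link.trim());
            return uri.getScheme() != null && uri.getHost() != null;
        } catch (URISyntaxException e) {
            // Unparseable input is treated as "not a full URL".
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(looksLikeFullUrl("http://example.com/a.html"));
        System.out.println(looksLikeFullUrl("/images/logo.png"));
    }
}
```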
      • fullUrl

        public static java.lang.String fullUrl​(java.lang.String pageUrl,
                                               java.lang.String link)
        Calculates the full URL for the specified page URL and link, where the link may be full, absolute or relative, as found in A or IMG tags. (Reinstated as per user request in bug 159.)
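        The three kinds of link (full, absolute, relative) can be illustrated with java.net.URI.resolve. This is a sketch of the general technique, not the code behind Utils.fullUrl, whose handling of edge cases may differ; the resolve helper and the example.com URLs are made up for illustration.

```java
import java.net.URI;

final class FullUrlSketch {

    // Resolves a link against the URL of the page it appears on.
    // URI.create throws IllegalArgumentException for malformed input
    // (for example, a link containing unescaped spaces).
    static String resolve(String pageUrl, String link) {
        return URI.create(pageUrl).resolve(link.trim()).toString();
    }

    public static void main(String[] args) {
        String page = "http://example.com/docs/index.html";
        System.out.println(resolve(page, "http://other.org/x.png")); // full link
        System.out.println(resolve(page, "/img/a.png"));             // absolute path
        System.out.println(resolve(page, "img/a.png"));              // relative path
    }
}
```

        A full link is returned unchanged, an absolute path replaces the page's path, and a relative path is resolved against the page's directory.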
      • escapeHtml

        public static java.lang.String escapeHtml​(java.lang.String s,
                                                  CleanerProperties props)
        Escapes an HTML string.
        Parameters:
        s - String to be escaped
        props - Cleaner properties that affect escaping behaviour
        Returns:
        the escaped string
      • escapeXml

        public static java.lang.String escapeXml​(java.lang.String s,
                                                 CleanerProperties props,
                                                 boolean isDomCreation)
        Escapes an XML string.
        Parameters:
        s - String to be escaped
        props - Cleaner properties that affect escaping behaviour
        isDomCreation - Tells if escaped content will be part of the DOM
      • escapeXml

        public static java.lang.String escapeXml​(java.lang.String s,
                                                 boolean advanced,
                                                 boolean recognizeUnicodeChars,
                                                 boolean translateSpecialEntities,
                                                 boolean isDomCreation,
                                                 boolean transResCharsToNCR,
                                                 boolean translateSpecialEntitiesToNCR)
        Change notes:
        1) Converts ASCII characters encoded in the &#xx; format back to the ASCII characters themselves, since such encoding may be an attempt to slip in malicious HTML.
        2) Converts &#xxx; format characters to the &quot;-style (named entity) representation, if one is available for the character.
        3) Converts HTML special entities to XML &#xxx; references when outputting XML.
        Parameters:
        s -
        advanced -
        recognizeUnicodeChars -
        translateSpecialEntities -
        isDomCreation -
        Returns:
        TODO Consider moving to CleanerProperties since a long list of params is misleading.
      • escapeXml

        public static java.lang.String escapeXml​(java.lang.String s,
                                                 boolean advanced,
                                                 boolean recognizeUnicodeChars,
                                                 boolean translateSpecialEntities,
                                                 boolean isDomCreation,
                                                 boolean transResCharsToNCR,
                                                 boolean translateSpecialEntitiesToNCR,
                                                 boolean isHtmlOutput)
        Change notes:
        1) Converts ASCII characters encoded in the &#xx; format back to the ASCII characters themselves, since such encoding may be an attempt to slip in malicious HTML.
        2) Converts &#xxx; format characters to the &quot;-style (named entity) representation, if one is available for the character.
        3) Converts HTML special entities to XML &#xxx; references when outputting XML.
        Parameters:
        s -
        advanced -
        recognizeUnicodeChars -
        translateSpecialEntities -
        isDomCreation -
        isHtmlOutput -
        Returns:
        TODO Consider moving to CleanerProperties since a long list of params is misleading.
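        Change notes 1 and 3 can be pictured with a small standalone sketch. It is not the escapeXml implementation and ignores all of the boolean flags. The three-entry entity table, the rewriteForXml name, and the choice to leave XML reserved characters in their encoded form are assumptions made for illustration; note 2 (numeric to named conversion) is not covered.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EntityRewriteSketch {

    // Tiny illustrative table; a real HTML entity table is far larger.
    private static final Map<String, Integer> NAMED_TO_CODE =
            Map.of("nbsp", 160, "copy", 169, "eacute", 233);

    // Characters that stay encoded because they are markup-significant in XML.
    private static final String RESERVED = "&<>\"'";

    // Matches decimal references such as &#65; and named entities such as &nbsp;.
    private static final Pattern ENTITY =
            Pattern.compile("&(?:#(\\d{1,7})|([A-Za-z][A-Za-z0-9]*));");

    static String rewriteForXml(String s) {
        Matcher m = ENTITY.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = m.group(); // default: leave the entity untouched
            if (m.group(1) != null) {
                // Note 1: decode printable, non-reserved ASCII references.
                int code = Integer.parseInt(m.group(1));
                if (code >= 32 && code < 127 && RESERVED.indexOf(code) < 0) {
                    replacement = String.valueOf((char) code);
                }
            } else {
                // Note 3: turn a known HTML named entity into a numeric reference.
                Integer code = NAMED_TO_CODE.get(m.group(2));
                if (code != null) {
                    replacement = "&#" + code + ";";
                }
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(rewriteForXml("&#65;&#66; &copy; 2006"));
        System.out.println(rewriteForXml("&#60;script&#62;"));
    }
}
```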
      • sanitizeXmlIdentifier

        public static java.lang.String sanitizeXmlIdentifier​(java.lang.String attName)
      • sanitizeXmlIdentifier

        public static java.lang.String sanitizeXmlIdentifier​(java.lang.String attName,
                                                             java.lang.String prefix)
      • sanitizeHtmlAttributeName

        public static java.lang.String sanitizeHtmlAttributeName​(java.lang.String name)
      • isValidHtmlAttributeName

        public static boolean isValidHtmlAttributeName​(java.lang.String name)
      • sanitizeXmlIdentifier

        public static java.lang.String sanitizeXmlIdentifier​(java.lang.String attName,
                                                             java.lang.String prefix,
                                                             java.lang.String replacementCharacter)
        Attempts to replace invalid attribute names with valid ones.
        Parameters:
        attName - the attribute name to fix
        prefix - the prefix to use to indicate an attribute name has been altered
        Returns:
      • isValidXmlIdentifier

        public static boolean isValidXmlIdentifier​(java.lang.String s)
        Checks whether the specified string can be a valid tag name or attribute name in XML.
        Parameters:
        s - String to be checked
        Returns:
        True if string is valid xml identifier, false otherwise
      • isEmptyString

        public static boolean isEmptyString​(java.lang.Object o)
        Parameters:
        o -
        Returns:
        True if the specified string is null or contains only whitespace characters
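        The documented return value corresponds to a check along the following lines. This is a sketch, not the library source; isBlank is a hypothetical name, and using toString() followed by trim() is an assumption about how the object is converted and what counts as whitespace.

```java
final class BlankCheckSketch {

    // True for null and for objects whose string form is empty or whitespace-only.
    static boolean isBlank(Object o) {
        return o == null || o.toString().trim().isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isBlank(null));
        System.out.println(isBlank("  \t\n"));
        System.out.println(isBlank(" x "));
    }
}
```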
      • tokenize

        public static java.lang.String[] tokenize​(java.lang.String s,
                                                  java.lang.String delimiters)
      • isXmlReservedCharacter

        public static boolean isXmlReservedCharacter​(java.lang.String c)
      • getXmlNSPrefix

        public static java.lang.String getXmlNSPrefix​(java.lang.String name)
        Parameters:
        name -
        Returns:
        For xml element name or attribute name returns prefix (part before :) or null if there is no prefix
      • getXmlName

        public static java.lang.String getXmlName​(java.lang.String name)
        Parameters:
        name -
        Returns:
        For xml element name or attribute name returns name after prefix (part after :)
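        The prefix and name split described for getXmlNSPrefix and getXmlName can be sketched as a split at the first colon. The helper names are hypothetical, and returning the whole name when there is no colon is an assumption; the documentation above only specifies the null result for a missing prefix.

```java
final class XmlNameSplitSketch {

    // Part before the first ':' or null when the name has no prefix.
    static String prefixOf(String name) {
        int colon = name.indexOf(':');
        return colon > 0 ? name.substring(0, colon) : null;
    }

    // Part after the first ':'; the whole name when there is no ':' (assumed).
    static String localNameOf(String name) {
        int colon = name.indexOf(':');
        return colon >= 0 ? name.substring(colon + 1) : name;
    }

    public static void main(String[] args) {
        System.out.println(prefixOf("xlink:href"));
        System.out.println(localNameOf("xlink:href"));
        System.out.println(prefixOf("href"));
    }
}
```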
      • ltrim

        public static java.lang.String ltrim​(java.lang.String s)
        Trims specified string from left.
        Parameters:
        s -
      • rtrim

        public static java.lang.String rtrim​(java.lang.String s)
        Trims specified string from right.
        Parameters:
        s -
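        One-sided trimming as described for ltrim and rtrim can be sketched as below. This is not the library source: the class name is hypothetical, and both the use of Character.isWhitespace and the null-in, null-out behaviour are assumptions.

```java
final class OneSidedTrimSketch {

    // Drops leading whitespace only.
    static String ltrim(String s) {
        if (s == null) {
            return null; // assumed behaviour for null input
        }
        int start = 0;
        while (start < s.length() && Character.isWhitespace(s.charAt(start))) {
            start++;
        }
        return s.substring(start);
    }

    // Drops trailing whitespace only.
    static String rtrim(String s) {
        if (s == null) {
            return null; // assumed behaviour for null input
        }
        int end = s.length();
        while (end > 0 && Character.isWhitespace(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        System.out.println("[" + ltrim("  a b  ") + "]");
        System.out.println("[" + rtrim("  a b  ") + "]");
    }
}
```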
      • isWhitespaceString

        public static boolean isWhitespaceString​(java.lang.Object object)
        Checks whether the specified object's string representation is an empty string (containing only whitespace).
        Parameters:
        object - Object whose string representation is checked
        Returns:
        true, if empty string, false otherwise
      • deserializeEntities

        public static java.lang.String deserializeEntities​(java.lang.String str,
                                                           boolean recognizeUnicodeChars)
      • isValidXmlIdentifierStartChar

        public static boolean isValidXmlIdentifierStartChar​(java.lang.String identifier)
        Determines whether the initial character of an identifier is valid for XML
        Parameters:
        identifier -
        Returns:
      • replaceInvalidXmlIdentifierCharacters

        public static java.lang.String replaceInvalidXmlIdentifierCharacters​(java.lang.String name,
                                                                             java.lang.String replacement)
        Strips out invalid characters from names used for XML elements and replaces them with the specified replacement.
        Parameters:
        name -
        Returns:
        valid XML name
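        The replacement step can be pictured with a simplified sketch. It is not the library implementation: replaceInvalid and isNameChar are hypothetical helpers, and the set of allowed characters (letters, digits, '_', '-', '.', ':') is a deliberate simplification of the XML name rules, which also restrict the first character.

```java
final class XmlNameCleanupSketch {

    // Simplified notion of a character that may appear in an XML name.
    static boolean isNameChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_' || c == '-' || c == '.' || c == ':';
    }

    // Replaces every character outside the simplified set with the replacement string.
    static String replaceInvalid(String name, String replacement) {
        StringBuilder out = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (isNameChar(c)) {
                out.append(c);
            } else {
                out.append(replacement);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceInvalid("my tag!", "_"));
    }
}
```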