Class StringNormalizer


  • public class StringNormalizer
    extends Object
    A utility class offering string operations: normalizing strings and removing or replacing non-printable characters.
    Author:
    Matthais Müller
    • Field Detail

      • NON_PRINTABLE_CHARACTER

        public static final Pattern NON_PRINTABLE_CHARACTER
        Matches every control character (\p{Cc}, a Unicode general category) except tab (\t), carriage return (\r), and line feed (\n).
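
        The description above can be expressed as a character-class intersection in java.util.regex. The following is a minimal sketch, not the actual field definition: the regex string and the class name NonPrintablePatternSketch are assumptions chosen to match the documented behavior.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; the real field's regex is not shown in this documentation.
final class NonPrintablePatternSketch {

    // Assumed regex: every control character (\p{Cc}) that is not tab, CR, or LF.
    static final Pattern NON_PRINTABLE_CHARACTER =
            Pattern.compile("[\\p{Cc}&&[^\\t\\r\\n]]");

    public static void main(String[] args) {
        // BEL (U+0007) is a control character, so it matches.
        System.out.println(NON_PRINTABLE_CHARACTER.matcher("\u0007").matches()); // true
        // Tab is explicitly excluded.
        System.out.println(NON_PRINTABLE_CHARACTER.matcher("\t").matches());     // false
        // Ordinary letters are not control characters.
        System.out.println(NON_PRINTABLE_CHARACTER.matcher("a").matches());      // false
    }
}
```

        Note that \p{Cc} also covers DEL (U+007F) and the C1 controls (U+0080 to U+009F), so those match as well under this sketch.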
    • Constructor Detail

      • StringNormalizer

        public StringNormalizer()
    • Method Detail

      • normalize

        public static final String normalize(String input)
        Removes the following characters from the given string:
        ! " § $ % & / ( ) = ? ´ { [ ] } \ ` + - * % : , ; < > ° ^ # ~ ' |
        Additionally, all diacritics are removed from the string.
        Finally, German umlauts and the sharp s (ä, ö, ü, ß) are replaced with their two-letter representations (ae, oe, ue, ss).
        Parameters:
        input - the input string
        Returns:
        the "normalized" string
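
        The three steps described above might be combined as in the following sketch. It is an illustration under stated assumptions, not the class's actual implementation: the class name NormalizeSketch and the REMOVED constant are hypothetical, diacritics are stripped via java.text.Normalizer (NFD decomposition followed by removal of combining marks), and the umlaut replacement runs first, because stripping diacritics first would turn ä into a before it could become ae. Only the lowercase umlauts listed above are transliterated.

```java
import java.text.Normalizer;

// Hypothetical sketch of the documented behavior; step order is an assumption.
final class NormalizeSketch {

    // The listed punctuation. Non-ASCII entries are written as escapes:
    // U+00A7 (section sign), U+00B4 (acute accent), U+00B0 (degree sign).
    private static final String REMOVED =
            "!\"\u00A7$%&/()=?\u00B4{[]}\\`+-*:,;<>\u00B0^#~'|";

    static String normalize(String input) {
        // Transliterate a-umlaut, o-umlaut, u-umlaut, and sharp s first,
        // so the diacritic stripping below does not reduce them to a, o, u.
        String s = input
                .replace("\u00E4", "ae")
                .replace("\u00F6", "oe")
                .replace("\u00FC", "ue")
                .replace("\u00DF", "ss");

        // Decompose accented letters (NFD), then drop the combining marks.
        s = Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{M}", "");

        // Copy every character that is not in the removal list.
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (REMOVED.indexOf(c) < 0) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("M\u00FCller"));  // Mueller
        System.out.println(normalize("Caf\u00E9!"));   // Cafe
    }
}
```

        Whitespace and digits pass through unchanged in this sketch, since neither appears in the removal list.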
      • removeNonPrintableCharacters

        public static String removeNonPrintableCharacters(String value)
        Removes all non-printable characters from the given string.
        Parameters:
        value - the string
        Returns:
        the string without any non-printable characters
      • replaceNonPrintableCharacters

        public static String replaceNonPrintableCharacters(String value,
                                                           String replacement)
        Replaces all non-printable characters in the given string with the given replacement.
        Parameters:
        value - the string
        replacement - the replacement string
        Returns:
        the string with the replacements for non-printable characters
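
        A minimal sketch of how removeNonPrintableCharacters and replaceNonPrintableCharacters could be built on a control-character pattern. Several details are assumptions rather than documented behavior: the regex, the class name NonPrintableSketch, treating the replacement as a literal string (via Matcher.quoteReplacement), and implementing removal as replacement with the empty string. Null handling is not documented and is not addressed here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; the real methods may differ in details.
final class NonPrintableSketch {

    // Assumed regex: control characters except tab, CR, and LF.
    static final Pattern NON_PRINTABLE_CHARACTER =
            Pattern.compile("[\\p{Cc}&&[^\\t\\r\\n]]");

    static String replaceNonPrintableCharacters(String value, String replacement) {
        // quoteReplacement keeps '$' and '\' in the replacement literal
        // instead of letting them be read as group references.
        return NON_PRINTABLE_CHARACTER.matcher(value)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }

    static String removeNonPrintableCharacters(String value) {
        // Removal is replacement with the empty string.
        return replaceNonPrintableCharacters(value, "");
    }

    public static void main(String[] args) {
        // BEL (U+0007) is dropped; the tab is kept.
        System.out.println(removeNonPrintableCharacters("a\u0007b\tc").equals("ab\tc")); // true
        // NUL (U+0000) is replaced by a question mark.
        System.out.println(replaceNonPrintableCharacters("a\u0000b", "?"));              // a?b
    }
}
```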