Class StopwordLists

java.lang.Object
opennlp.tools.stopword.StopwordLists

public final class StopwordLists extends Object
Static factory for StopwordFilter instances backed by bundled language-specific stopword resources or caller-supplied input streams.

Bundled lists ship for the eleven languages enumerated in OPENNLP-660: Bulgarian (bg), Danish (da), German (de), English (en), Spanish (es), Finnish (fi), French (fr), Italian (it), Dutch (nl), Portuguese (pt), Russian (ru). Each list is keyed by its ISO 639-1 two-letter code.
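
The following is a minimal sketch, not the StopwordLists implementation, of how a lookup keyed by two-letter code could combine three-letter normalization with per-language caching as described for forLanguage. The names LanguageCodeSketch, Filter, normalize and filterFor are hypothetical. The three-letter table is derived from java.util.Locale, which reports ISO 639-2/T codes (for example "deu" rather than "ger"); whether the real class also accepts the bibliographic variants is not stated here.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.MissingResourceException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical illustration only; all names here are assumptions. */
final class LanguageCodeSketch {

    /** Placeholder for a cached, immutable filter object. */
    static final class Filter {
        final String language;
        Filter(String language) { this.language = language; }
    }

    // The eleven bundled ISO 639-1 codes listed in the class description.
    static final Set<String> BUNDLED = Set.of(
            "bg", "da", "de", "en", "es", "fi", "fr", "it", "nl", "pt", "ru");

    // Three-letter (ISO 639-2/T) code -> two-letter code, built from the JDK's tables.
    private static final Map<String, String> THREE_TO_TWO = new HashMap<>();
    static {
        for (String two : Locale.getISOLanguages()) {
            try {
                THREE_TO_TWO.put(Locale.forLanguageTag(two).getISO3Language(), two);
            } catch (MissingResourceException e) {
                // The JDK has no three-letter code for this entry; skip it.
            }
        }
    }

    // One shared Filter per two-letter code, created on first request.
    private static final Map<String, Filter> CACHE = new ConcurrentHashMap<>();

    /** Maps a two- or three-letter code to a bundled two-letter code, or throws. */
    static String normalize(String code) {
        if (code == null) {
            throw new IllegalArgumentException("code must not be null");
        }
        String two = code.length() == 3 ? THREE_TO_TWO.get(code) : code;
        if (two == null || !BUNDLED.contains(two)) {
            throw new IllegalArgumentException("No bundled list for: " + code);
        }
        return two;
    }

    /** Returns the shared Filter for the normalized code. */
    static Filter filterFor(String code) {
        return CACHE.computeIfAbsent(normalize(code), Filter::new);
    }

    public static void main(String[] args) {
        System.out.println(normalize("deu"));
        System.out.println(filterFor("eng") == filterFor("en"));
    }
}
```

Because the cache is keyed by the normalized two-letter code, a two-letter and a three-letter request for the same language resolve to the same shared instance in this sketch.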

  • Method Details

    • forLanguage

      public static StopwordFilter forLanguage(String iso639Code)
      Returns a case-insensitive StopwordFilter for the given ISO 639 language code. Three-letter codes are normalized to their two-letter equivalent when a bundled list exists for the latter.
      Parameters:
      iso639Code - The ISO 639-1 or ISO 639-2/3 language code. Must not be null.
      Returns:
      A StopwordFilter backed by the bundled resource. The returned instance is immutable, thread-safe and cached, so repeated calls for the same language return the same shared filter.
      Throws:
      IllegalArgumentException - if iso639Code is null, is not a valid ISO 639 code, or identifies a language for which no list is bundled.
      UncheckedIOException - if reading the bundled resource fails.
    • supportedLanguages

      public static Set<String> supportedLanguages()
      Returns:
      An unmodifiable set of the ISO 639-1 codes for which bundled stopword lists ship.
    • load

      public static StopwordFilter load(InputStream in, Charset cs, boolean caseSensitive) throws IOException
      Loads a stopword filter from a caller-supplied input stream.
      Parameters:
      in - The input stream. Must not be null.
      cs - The Charset to decode with. Must not be null.
      caseSensitive - Whether the resulting filter matches case-sensitively.
      Returns:
      A StopwordFilter populated from in.
      Throws:
      IllegalArgumentException - if in or cs is null.
      IOException - if an I/O error occurs while reading in.
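
The sketch below illustrates the contract documented for load using only the JDK; it is not the library's code. It assumes one stopword per line with blank lines ignored, which this page does not specify, and it assumes the caller owns and closes the stream. StopwordLoadSketch and isStopword are hypothetical names introduced for illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical stand-in for a stream-backed stopword filter. */
final class StopwordLoadSketch {

    private final Set<String> words;
    private final boolean caseSensitive;

    private StopwordLoadSketch(Set<String> words, boolean caseSensitive) {
        this.words = Set.copyOf(words); // immutable snapshot
        this.caseSensitive = caseSensitive;
    }

    /** Reads one word per line (assumed format); does not close the stream. */
    static StopwordLoadSketch load(InputStream in, Charset cs, boolean caseSensitive)
            throws IOException {
        if (in == null || cs == null) {
            throw new IllegalArgumentException("in and cs must not be null");
        }
        Set<String> words = new HashSet<>();
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, cs));
        String line;
        while ((line = reader.readLine()) != null) {
            String word = line.strip();
            if (word.isEmpty()) {
                continue; // skip blank lines
            }
            // Case-insensitive filters store a lowercased form of each word.
            words.add(caseSensitive ? word : word.toLowerCase(Locale.ROOT));
        }
        return new StopwordLoadSketch(words, caseSensitive);
    }

    /** Checks a token, lowercasing it first when the filter is case-insensitive. */
    boolean isStopword(String token) {
        return words.contains(caseSensitive ? token : token.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "The\n\nand\n".getBytes(StandardCharsets.UTF_8);
        // The caller closes the stream it opened.
        try (InputStream in = new ByteArrayInputStream(data)) {
            StopwordLoadSketch filter = load(in, StandardCharsets.UTF_8, false);
            System.out.println(filter.isStopword("THE"));
        }
    }
}
```

Lowercasing with Locale.ROOT keeps matching independent of the JVM's default locale; a real implementation might use a different case-folding strategy.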