Class SpellCorrectingObjectStream

java.lang.Object
opennlp.tools.util.FilterObjectStream<String,String>
opennlp.spellcheck.stream.SpellCorrectingObjectStream
All Implemented Interfaces:
AutoCloseable, opennlp.tools.util.ObjectStream<String>

public class SpellCorrectingObjectStream extends opennlp.tools.util.FilterObjectStream<String,String>
A FilterObjectStream that spell-corrects each String line read from a wrapped ObjectStream (for example a PlainTextByLineStream).

Correction is delegated to a SpellCheckingCharSequenceNormalizer, so the same per-token / compound modes and skip guards apply. Each source line is passed through the normalizer and the corrected line is emitted; null (end of stream) is forwarded unchanged, honoring the ObjectStream exhaustion contract. reset() and close() are inherited from FilterObjectStream and delegate to the wrapped stream.
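
The read-through behavior described above can be illustrated with a minimal, self-contained sketch. It does not use the OpenNLP or spellcheck classes: LineStream, ListLineStream, CorrectingLineStream and the toy corrector are hypothetical stand-ins, chosen only to show the pattern of correcting each line, forwarding null at end of stream, and delegating reset() and close() to the wrapped source.

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.List;
import java.util.Objects;
import java.util.function.UnaryOperator;

public class CorrectingStreamSketch {

    /** Hypothetical stand-in for a read/reset/close stream; not the OpenNLP interface. */
    interface LineStream extends AutoCloseable {
        String read() throws IOException; // returns null once the stream is exhausted
        void reset() throws IOException;
        @Override
        void close() throws IOException;
    }

    /** In-memory source used only for this illustration. */
    static final class ListLineStream implements LineStream {
        private final List<String> lines;
        private Iterator<String> it;

        ListLineStream(List<String> lines) {
            this.lines = lines;
            this.it = lines.iterator();
        }

        @Override public String read() { return it.hasNext() ? it.next() : null; }
        @Override public void reset() { it = lines.iterator(); }
        @Override public void close() { /* nothing to release */ }
    }

    /** Corrects each line from the wrapped source and forwards null unchanged. */
    static final class CorrectingLineStream implements LineStream {
        private final LineStream source;
        private final UnaryOperator<String> corrector;

        CorrectingLineStream(LineStream source, UnaryOperator<String> corrector) {
            this.source = Objects.requireNonNull(source, "source");
            this.corrector = Objects.requireNonNull(corrector, "corrector");
        }

        @Override
        public String read() throws IOException {
            String line = source.read();
            // End of stream is passed through without touching the corrector.
            return line == null ? null : corrector.apply(line);
        }

        @Override public void reset() throws IOException { source.reset(); }
        @Override public void close() throws IOException { source.close(); }
    }

    public static void main(String[] args) throws IOException {
        // Toy corrector standing in for a real spell-correcting normalizer.
        UnaryOperator<String> toyCorrector = s -> s.replace("teh", "the");
        try (LineStream corrected = new CorrectingLineStream(
                new ListLineStream(List.of("teh cat", "teh dog")), toyCorrector)) {
            String line;
            while ((line = corrected.read()) != null) {
                System.out.println(line); // prints "the cat", then "the dog"
            }
        }
    }
}
```

The read-until-null loop inside try-with-resources is the usual way to drain such a stream; closing the outer wrapper also closes the source it wraps.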

Typical use is to drop the corrector into an existing preprocessing chain that already reads text line-by-line:


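 // factory: an InputStreamFactory for the source text (assumed to be created elsewhere)
 ObjectStream<String> lines = new PlainTextByLineStream(factory, StandardCharsets.UTF_8);
 // model: a loaded SymSpellModel (assumed to be created elsewhere)
 ObjectStream<String> corrected = new SpellCorrectingObjectStream(lines, model);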
 

For tokenized data (one whitespace-separated token list per element) use SpellCorrectingTokenStream, which corrects token-by-token and re-joins.
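
As a rough illustration of what "corrects token-by-token and re-joins" means, the sketch below splits a line on whitespace, applies a corrector to each token, and joins the results with single spaces. It is not the SpellCorrectingTokenStream implementation: the correctTokens helper and the toy corrector are hypothetical, and the real class may treat whitespace and empty elements differently.

```java
import java.util.Arrays;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

public class TokenRejoinSketch {

    /** Hypothetical helper: correct each whitespace-separated token, then re-join. */
    static String correctTokens(String line, UnaryOperator<String> tokenCorrector) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return line; // nothing to correct
        }
        return Arrays.stream(trimmed.split("\\s+"))
                .map(tokenCorrector)
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // Toy per-token corrector standing in for a real spell checker.
        UnaryOperator<String> toy = t -> {
            if (t.equals("teh")) return "the";
            if (t.equals("borwn")) return "brown";
            return t;
        };
        System.out.println(correctTokens("teh quick borwn fox", toy)); // prints "the quick brown fox"
    }
}
```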

  • Constructor Details

    • SpellCorrectingObjectStream

      public SpellCorrectingObjectStream(opennlp.tools.util.ObjectStream<String> samples, SpellChecker spellChecker)
      Wraps samples with a default per-token corrector backed by a SpellChecker.
      Parameters:
      samples - the source line stream; must not be null
      spellChecker - the engine used to correct lines; must not be null
    • SpellCorrectingObjectStream

      public SpellCorrectingObjectStream(opennlp.tools.util.ObjectStream<String> samples, SymSpellModel model)
      Wraps samples with a default per-token corrector backed by a loaded SymSpellModel.
      Parameters:
      samples - the source line stream; must not be null
      model - the loaded model whose engine is used; must not be null
    • SpellCorrectingObjectStream

      public SpellCorrectingObjectStream(opennlp.tools.util.ObjectStream<String> samples, SpellCheckingCharSequenceNormalizer normalizer)
      Wraps samples with an explicitly configured corrector, so callers can pick the mode and guards through SpellCheckingCharSequenceNormalizer.Builder.
      Parameters:
      samples - the source line stream; must not be null
      normalizer - the corrector to apply to each line; must not be null
  • Method Details