Package opennlp.spellcheck.stream
Class SpellCorrectingObjectStream
- All Implemented Interfaces:
AutoCloseable, opennlp.tools.util.ObjectStream<String>
public class SpellCorrectingObjectStream
extends opennlp.tools.util.FilterObjectStream<String,String>
A FilterObjectStream that spell-corrects each String line read from a
wrapped ObjectStream (for example, a PlainTextByLineStream).
Correction is delegated to a SpellCheckingCharSequenceNormalizer, so the
same per-token / compound modes and skip guards apply. Each source line is passed
through the normalizer and the corrected line is emitted; null (end of
stream) is forwarded unchanged, honoring the ObjectStream exhaustion
contract. The inherited reset() and close() methods come from FilterObjectStream and
delegate to the wrapped stream.
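The read/forward behavior described above can be sketched without the OpenNLP dependency. The following is a minimal illustration only, not the class's actual implementation: LineStream, ListLineStream, and CorrectingLineStream are hypothetical stand-ins for ObjectStream<String>, a line source, and this class, and the toy corrector is a plain string replacement standing in for the normalizer.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Objects;
import java.util.function.UnaryOperator;

public class FilterSketch {

    /** Hypothetical stand-in for opennlp.tools.util.ObjectStream<String>. */
    interface LineStream extends AutoCloseable {
        String read() throws IOException;   // returns null once the stream is exhausted
        void reset() throws IOException;
        @Override
        void close() throws IOException;
    }

    /** In-memory line source, used only for this illustration. */
    static final class ListLineStream implements LineStream {
        private final List<String> lines;
        private Iterator<String> it;

        ListLineStream(List<String> lines) {
            this.lines = lines;
            this.it = lines.iterator();
        }

        @Override public String read() { return it.hasNext() ? it.next() : null; }
        @Override public void reset() { it = lines.iterator(); }
        @Override public void close() { }
    }

    /** Filter that corrects each line and forwards null (end of stream) unchanged. */
    static final class CorrectingLineStream implements LineStream {
        private final LineStream source;
        private final UnaryOperator<String> corrector;

        CorrectingLineStream(LineStream source, UnaryOperator<String> corrector) {
            this.source = Objects.requireNonNull(source, "source");
            this.corrector = Objects.requireNonNull(corrector, "corrector");
        }

        @Override public String read() throws IOException {
            String line = source.read();
            // Do not pass null to the corrector: null signals exhaustion.
            return line == null ? null : corrector.apply(line);
        }

        // reset and close simply delegate to the wrapped stream.
        @Override public void reset() throws IOException { source.reset(); }
        @Override public void close() throws IOException { source.close(); }
    }

    /** Reads a stream until it returns null and collects the lines. */
    static List<String> drain(LineStream stream) throws IOException {
        List<String> out = new ArrayList<>();
        String line;
        while ((line = stream.read()) != null) {
            out.add(line);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Toy corrector: an assumption for the example, not a real spell checker.
        UnaryOperator<String> toyCorrector = s -> s.replace("teh", "the");
        try (LineStream corrected = new CorrectingLineStream(
                new ListLineStream(List.of("teh cat", "a dog")), toyCorrector)) {
            System.out.println(drain(corrected));
        }
    }
}
```

The same read-until-null loop applies when consuming a real ObjectStream<String>; because the stream is AutoCloseable, try-with-resources closes the wrapper and, through delegation, the wrapped source.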
Typical use is to drop the corrector into an existing preprocessing chain that already reads text line by line (here, factory is an InputStreamFactory and model is a loaded SymSpellModel):
ObjectStream<String> lines = new PlainTextByLineStream(factory, StandardCharsets.UTF_8);
ObjectStream<String> corrected = new SpellCorrectingObjectStream(lines, model);
For tokenized data (one whitespace-separated token list per element) use
SpellCorrectingTokenStream, which corrects token-by-token and re-joins.
-
Constructor Summary
Constructors
Constructor: SpellCorrectingObjectStream(opennlp.tools.util.ObjectStream<String> samples, SymSpellModel model)
Description: Wraps samples with a default per-token corrector backed by a loaded SymSpellModel.
Constructor: SpellCorrectingObjectStream(opennlp.tools.util.ObjectStream<String> samples, SpellCheckingCharSequenceNormalizer normalizer)
Description: Wraps samples with an explicitly configured corrector, so callers can pick the mode and guards through SpellCheckingCharSequenceNormalizer.Builder.
Constructor: SpellCorrectingObjectStream(opennlp.tools.util.ObjectStream<String> samples, SpellChecker spellChecker)
Description: Wraps samples with a default per-token corrector backed by a SpellChecker.
-
Method Summary
Methods inherited from class opennlp.tools.util.FilterObjectStream
close, reset
-
Constructor Details
-
SpellCorrectingObjectStream
public SpellCorrectingObjectStream(opennlp.tools.util.ObjectStream<String> samples, SpellChecker spellChecker)
Wraps samples with a default per-token corrector backed by a SpellChecker.
- Parameters:
samples - the source line stream; must not be null
spellChecker - the engine used to correct lines; must not be null
-
SpellCorrectingObjectStream
public SpellCorrectingObjectStream(opennlp.tools.util.ObjectStream<String> samples, SymSpellModel model)
Wraps samples with a default per-token corrector backed by a loaded SymSpellModel.
- Parameters:
samples - the source line stream; must not be null
model - the loaded model whose engine is used; must not be null
-
SpellCorrectingObjectStream
public SpellCorrectingObjectStream(opennlp.tools.util.ObjectStream<String> samples, SpellCheckingCharSequenceNormalizer normalizer)
Wraps samples with an explicitly configured corrector, so callers can pick the mode and guards through SpellCheckingCharSequenceNormalizer.Builder.
- Parameters:
samples - the source line stream; must not be null
normalizer - the corrector to apply to each line; must not be null
-
-
Method Details
-
read
- Throws:
IOException
-