Class SymSpellModels

java.lang.Object
opennlp.spellcheck.dictionary.SymSpellModels

public final class SymSpellModels extends Object
Convenience factory and (de)serialization helpers for SymSpellModel.

This is the high-level entry point for the persistence layer. It builds a model from plain-text frequency dictionaries, writes a model to or reads it from a stream, and emits the model.properties descriptor consumed by the OpenNLP model-resolver.

Classpath resolution of packaged models lives in SymSpellModelResolver.

  • Method Details

    • buildModel

      public static SymSpellModel buildModel(String language, SymSpellConfig config, Charset charset, opennlp.tools.util.InputStreamFactory unigramSource, opennlp.tools.util.InputStreamFactory bigramSource) throws IOException
      Builds a SymSpellModel from a unigram dictionary and an optional bigram dictionary using the supplied configuration.

      Each dictionary is parsed by a FrequencyDictionaryLoader into a map of source counts. Counts for duplicate keys are accumulated, mirroring the engine's semantics. SymSpellModel then builds the engine itself from these maps.

      Parameters:
      language - the language tag (e.g. "en"); must not be blank
      config - the engine configuration; must not be null
      charset - the charset to decode the dictionaries with; must not be null
      unigramSource - the word<TAB>count dictionary source; must not be null
      bigramSource - the w1 w2<TAB>count dictionary source; may be null to skip bigrams
      Returns:
      the built model
      Throws:
      IOException - Thrown on IO errors or on a malformed dictionary line.
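
The parsing step described above can be pictured with a minimal, standard-library-only sketch. It is not the FrequencyDictionaryLoader implementation: the class name FrequencyLines, the skipping of blank lines, and the error messages are assumptions made for illustration. It shows key<TAB>count parsing in which duplicate keys have their counts summed and a malformed line surfaces as an IOException.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a frequency-dictionary loader; not the OpenNLP class. */
final class FrequencyLines {

    private FrequencyLines() {}

    /** Parses key<TAB>count lines, summing the counts of duplicate keys. */
    static Map<String, Long> parse(BufferedReader reader) throws IOException {
        Map<String, Long> counts = new LinkedHashMap<>();
        String line;
        int lineNo = 0;
        while ((line = reader.readLine()) != null) {
            lineNo++;
            if (line.isBlank()) {
                continue; // assumption: blank lines are ignored
            }
            // Split on the last tab so a bigram key such as "w1 w2" stays intact.
            int tab = line.lastIndexOf('\t');
            if (tab <= 0 || tab == line.length() - 1) {
                throw new IOException("Malformed dictionary line " + lineNo);
            }
            String key = line.substring(0, tab);
            long count;
            try {
                count = Long.parseLong(line.substring(tab + 1).trim());
            } catch (NumberFormatException e) {
                throw new IOException("Bad count on line " + lineNo, e);
            }
            counts.merge(key, count, Long::sum); // accumulate duplicates
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        String unigrams = "hello\t3\nworld\t5\nhello\t2\n";
        System.out.println(parse(new BufferedReader(new StringReader(unigrams))));
    }
}
```

The same routine handles both dictionary shapes, because a bigram line only differs in that its key contains a space.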
    • serialize

      public static void serialize(SymSpellModel model, OutputStream out) throws IOException
      Serializes a model to the given stream using SymSpellModelSerializer.
      Parameters:
      model - the model to write; must not be null
      out - the destination stream; must not be null. The stream is not closed by this method.
      Throws:
      IOException - Thrown on IO errors.
    • deserialize

      public static SymSpellModel deserialize(InputStream in) throws IOException
      Deserializes a model from the given stream using SymSpellModelSerializer.
      Parameters:
      in - the source stream; must not be null. The stream is not closed by this method.
      Returns:
      the deserialized model
      Throws:
      IOException - Thrown on IO errors or on a malformed stream.
    • toBytes

      public static byte[] toBytes(SymSpellModel model) throws IOException
      Serializes a model to a byte array.
      Parameters:
      model - the model to serialize; must not be null
      Returns:
      the binary model bytes
      Throws:
      IOException - Thrown on IO errors.
    • fromBytes

      public static SymSpellModel fromBytes(byte[] bytes) throws IOException
      Deserializes a model from a byte array.
      Parameters:
      bytes - the binary model bytes; must not be null
      Returns:
      the deserialized model
      Throws:
      IOException - Thrown on IO errors or on a malformed stream.
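
The relationship between the stream-based and byte-array-based methods can be sketched as follows. This is an illustration under stated assumptions, not the SymSpellModelSerializer format: CountCodec and its length-prefixed layout are hypothetical, and a plain Map<String, Long> stands in for SymSpellModel. It shows the byte-array helpers layered over the stream methods, and a stream method that flushes without closing the caller's stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical codec for a count map; the layout is invented for this sketch. */
final class CountCodec {

    private CountCodec() {}

    static void serialize(Map<String, Long> counts, OutputStream out) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(counts.size());
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            data.writeUTF(e.getKey());
            data.writeLong(e.getValue());
        }
        data.flush(); // flush only; closing stays the caller's responsibility
    }

    static Map<String, Long> deserialize(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int size = data.readInt();
        if (size < 0) {
            throw new IOException("Malformed stream: negative entry count");
        }
        Map<String, Long> counts = new LinkedHashMap<>();
        for (int i = 0; i < size; i++) {
            String key = data.readUTF();
            counts.put(key, data.readLong());
        }
        return counts; // a truncated stream surfaces as EOFException, an IOException
    }

    static byte[] toBytes(Map<String, Long> counts) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        serialize(counts, buffer);
        return buffer.toByteArray();
    }

    static Map<String, Long> fromBytes(byte[] bytes) throws IOException {
        return deserialize(new ByteArrayInputStream(Objects.requireNonNull(bytes, "bytes")));
    }

    public static void main(String[] args) throws IOException {
        Map<String, Long> counts = new LinkedHashMap<>();
        counts.put("hello", 5L);
        System.out.println(fromBytes(toBytes(counts)).equals(counts));
    }
}
```

Because the stream methods never close what they are given, a caller can wrap the real file or socket stream in try-with-resources and keep ownership of its lifetime.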
    • buildProperties

      public static Properties buildProperties(SymSpellModel model, byte[] modelBytes)
      Builds the model.properties descriptor for a serialized model, computing the model.sha256 over the supplied binary form.
      Parameters:
      model - the model the properties describe; must not be null
      modelBytes - the serialized binary form of model (see toBytes(SymSpellModel)); must not be null
      Returns:
      the populated Properties
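
A checksum entry of this kind can be computed with the standard library alone. The sketch below is not the buildProperties implementation: DescriptorSketch is a hypothetical name, and encoding the digest as lowercase hexadecimal is an assumption. Only the model.sha256 key comes from the description above; any other descriptor keys are left out because they are not listed here.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Properties;

/** Hypothetical helper that fills in a model.sha256 entry. */
final class DescriptorSketch {

    private DescriptorSketch() {}

    /** Returns the SHA-256 digest of the bytes as lowercase hex (assumed encoding). */
    static String sha256Hex(byte[] bytes) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(bytes);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(Character.forDigit((b >> 4) & 0xF, 16));
                hex.append(Character.forDigit(b & 0xF, 16));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    /** Builds a Properties object holding only the checksum of the binary form. */
    static Properties describe(byte[] modelBytes) {
        Properties props = new Properties();
        props.setProperty("model.sha256", sha256Hex(modelBytes));
        return props;
    }

    public static void main(String[] args) {
        byte[] sample = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println(describe(sample).getProperty("model.sha256"));
    }
}
```

Computing the digest over the exact bytes that will be shipped lets a consumer detect a binary that does not match its descriptor.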
    • writePackage

      public static void writePackage(SymSpellModel model, OutputStream binaryOut, OutputStream propertiesOut) throws IOException
      Writes a packaged model pair to the given streams: the binary model and the matching model.properties. This is the on-disk shape expected inside an opennlp-models-spellcheck-{lang} jar (the binary entry must have a .bin suffix to be discoverable by the model-resolver).
      Parameters:
      model - the model to package; must not be null
      binaryOut - destination for the binary model; must not be null. Not closed by this method.
      propertiesOut - destination for model.properties; must not be null. Not closed by this method.
      Throws:
      IOException - Thrown on IO errors.
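
The two-stream contract can be illustrated with a short sketch. It is not the writePackage implementation: PackageSketch and writePair are hypothetical names, and the model is represented by its already serialized bytes. It shows both outputs being written and flushed while the streams stay open for the caller.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Properties;

/** Hypothetical writer for a binary-plus-descriptor pair. */
final class PackageSketch {

    private PackageSketch() {}

    static void writePair(byte[] modelBytes, Properties descriptor,
                          OutputStream binaryOut, OutputStream propertiesOut)
            throws IOException {
        binaryOut.write(modelBytes);
        binaryOut.flush();                     // flushed, not closed
        descriptor.store(propertiesOut, null); // store() flushes and leaves the stream open
    }

    public static void main(String[] args) throws IOException {
        Properties descriptor = new Properties();
        descriptor.setProperty("model.sha256", "placeholder"); // illustrative value only
        ByteArrayOutputStream binary = new ByteArrayOutputStream();
        ByteArrayOutputStream properties = new ByteArrayOutputStream();
        writePair(new byte[] {1, 2, 3}, descriptor, binary, properties);
        System.out.println(binary.size() + " binary bytes written");
    }
}
```

When the destinations are files or jar entries, the caller would typically open both streams in one try-with-resources statement, giving the binary a name ending in .bin and the descriptor the name model.properties.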
    • artifactId

      public static String artifactId(String language)
      Returns the conventional Maven artifactId for a packaged model of the given language.
      Parameters:
      language - a language tag; must not be null
      Returns:
      the conventional Maven artifactId for a packaged model of that language, e.g. "opennlp-models-spellcheck-en"
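
The naming convention can be sketched in a few lines. This is an assumption-laden illustration rather than the library's code: ArtifactIds is a hypothetical class, and it simply appends the tag to the opennlp-models-spellcheck- prefix without normalizing case or validating the tag.

```java
import java.util.Objects;

/** Hypothetical illustration of the artifactId naming convention. */
final class ArtifactIds {

    static final String PREFIX = "opennlp-models-spellcheck-";

    private ArtifactIds() {}

    static String artifactId(String language) {
        Objects.requireNonNull(language, "language");
        return PREFIX + language; // assumption: the tag is appended unchanged
    }

    public static void main(String[] args) {
        System.out.println(artifactId("en")); // prints opennlp-models-spellcheck-en
    }
}
```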