Class SymSpellModel
- All Implemented Interfaces:
opennlp.tools.util.model.SerializableArtifact
A SymSpell engine together with
the source frequency data and the metadata needed to reproduce and identify it.
The model keeps the source dictionary (unigram counts and optional bigram
counts) and the configuration rather than the derived delete
index. This is what the SymSpellModelSerializer writes; the (much larger)
delete index is rebuilt by replaying the source through SymSpell.add(java.lang.String, long) /
SymSpell.addBigram(java.lang.String, java.lang.String, long) when the engine is constructed. See the serializer's class
javadoc for the rationale.
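The replay idea can be sketched with a small stand-in. The class below is hypothetical and is not the real SymSpell engine: it only mirrors the shape of add(String, long) and addBigram(String, String, long), generates single-character deletes, and assumes bigram keys are well-formed "w1 w2" strings separated by one space. It shows how a derived index can be rebuilt from nothing more than the two compact source maps.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for the engine; NOT the real SymSpell class. */
final class ReplaySketch {

  // Derived data: delete variant -> dictionary words that produce it.
  final Map<String, Set<String>> deleteIndex = new HashMap<>();
  // Source data as it was replayed into this stand-in.
  final Map<String, Long> unigramCounts = new HashMap<>();
  final Map<String, Long> bigramCounts = new HashMap<>();

  /** Records the word and indexes it under itself and every single-character delete. */
  void add(String word, long count) {
    unigramCounts.merge(word, count, Long::sum);
    deleteIndex.computeIfAbsent(word, k -> new LinkedHashSet<>()).add(word);
    for (int i = 0; i < word.length(); i++) {
      String variant = word.substring(0, i) + word.substring(i + 1);
      deleteIndex.computeIfAbsent(variant, k -> new LinkedHashSet<>()).add(word);
    }
  }

  /** Records a bigram count under the "w1 w2" key. */
  void addBigram(String first, String second, long count) {
    bigramCounts.merge(first + " " + second, count, Long::sum);
  }

  /** Rebuilds all derived state by replaying the source maps. */
  static ReplaySketch replay(Map<String, Long> unigrams, Map<String, Long> bigrams) {
    ReplaySketch engine = new ReplaySketch();
    unigrams.forEach((word, count) -> engine.add(word, count));
    bigrams.forEach((key, count) -> {
      int space = key.indexOf(' ');
      if (space < 0) {
        // Assumption of this sketch: keys without a space are rejected.
        throw new IllegalArgumentException("bigram key must look like \"w1 w2\": " + key);
      }
      engine.addBigram(key.substring(0, space), key.substring(space + 1), count);
    });
    return engine;
  }
}
```

Because the derived index is a pure function of the source maps and the configuration, only those inputs need to be stored; the stand-in's deleteIndex is typically several times larger than its unigramCounts map even at edit distance 1.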
Instances are immutable: the engine is built once at construction time and exposed
read-only via getSymSpell(); the source maps returned by unigrams()
and bigrams() are unmodifiable views.
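A minimal sketch of what an unmodifiable view means for callers, using a hypothetical holder class rather than SymSpellModel itself. The defensive copy in the constructor is an assumption of this sketch; the documentation above only states that the returned maps are unmodifiable views.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical holder illustrating read-only source maps; NOT the real SymSpellModel. */
final class SourceViewSketch {

  private final Map<String, Long> unigrams;

  SourceViewSketch(Map<String, Long> unigrams) {
    Objects.requireNonNull(unigrams, "unigrams must not be null");
    // Assumption: copy first so later changes to the caller's map cannot leak in,
    // then wrap so that callers cannot write through the accessor.
    this.unigrams = Collections.unmodifiableMap(new LinkedHashMap<>(unigrams));
  }

  /** Returns a read-only map; any mutator call throws UnsupportedOperationException. */
  Map<String, Long> unigrams() {
    return unigrams;
  }

  /** Helper for demonstration: reports whether a write attempt is rejected. */
  static boolean rejectsWrites(Map<String, Long> view) {
    try {
      view.put("probe", 1L);
      return false;
    } catch (UnsupportedOperationException expected) {
      return true;
    }
  }
}
```

Callers that need a mutable working copy can create one themselves, for example with new HashMap<>(model.unigrams()).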
As a SerializableArtifact this type can be embedded in OpenNLP model
containers and is round-tripped by SymSpellModelSerializer.
-
Constructor Summary
Constructors
SymSpellModel(String language, String name, String version, SymSpellConfig config, Map<String, Long> unigrams, Map<String, Long> bigrams)
Creates a model with explicit name and version, and builds its SymSpell engine from the supplied source data.
SymSpellModel(String language, SymSpellConfig config, Map<String, Long> unigrams, Map<String, Long> bigrams)
Creates a model and builds its SymSpell engine from the supplied source data.
Field Details
-
DEFAULT_MODEL_NAME
The default model name fragment used for classpath discovery.
-
DEFAULT_MODEL_VERSION
The default model version used when none is supplied.
-
-
Constructor Details
-
SymSpellModel
public SymSpellModel(String language, SymSpellConfig config, Map<String, Long> unigrams, Map<String, Long> bigrams)
Creates a model and builds its SymSpell engine from the supplied source data.
- Parameters:
- language - an IETF/ISO language tag (e.g. "en"); must not be null or blank
- config - the engine configuration; must not be null
- unigrams - the word -> count source map; must not be null
- bigrams - the "w1 w2" -> count source map; must not be null (may be empty)
-
SymSpellModel
public SymSpellModel(String language, String name, String version, SymSpellConfig config, Map<String, Long> unigrams, Map<String, Long> bigrams)
Creates a model with explicit name and version, and builds its SymSpell engine from the supplied source data.
- Parameters:
- language - an IETF/ISO language tag (e.g. "en"); must not be null or blank
- name - the model name (becomes model.name); must not be null or blank
- version - the model version (becomes model.version); must not be null or blank
- config - the engine configuration; must not be null
- unigrams - the word -> count source map; must not be null
- bigrams - the "w1 w2" -> count source map; must not be null (may be empty)
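The parameter contracts of the two constructors can be illustrated with a hypothetical stand-in. This sketch assumes that the shorter constructor delegates to the longer one with default name and version values, and that violations raise IllegalArgumentException or NullPointerException; neither the real exception types nor the real default values are stated in this documentation, so the constants below are placeholders. The config parameter is omitted to keep the sketch self-contained.

```java
import java.util.Map;
import java.util.Objects;

/** Hypothetical stand-in for the constructor contracts; NOT the real SymSpellModel. */
final class ModelArgsSketch {

  // Placeholders only: the real DEFAULT_MODEL_NAME / DEFAULT_MODEL_VERSION values are not shown here.
  static final String PLACEHOLDER_NAME = "placeholder-name";
  static final String PLACEHOLDER_VERSION = "placeholder-version";

  final String language;
  final String name;
  final String version;
  final Map<String, Long> unigrams;
  final Map<String, Long> bigrams;

  /** Short form: falls back to the placeholder name and version. */
  ModelArgsSketch(String language, Map<String, Long> unigrams, Map<String, Long> bigrams) {
    this(language, PLACEHOLDER_NAME, PLACEHOLDER_VERSION, unigrams, bigrams);
  }

  /** Long form: every text argument must be non-blank, every map non-null. */
  ModelArgsSketch(String language, String name, String version,
                  Map<String, Long> unigrams, Map<String, Long> bigrams) {
    this.language = requireText(language, "language");
    this.name = requireText(name, "name");
    this.version = requireText(version, "version");
    this.unigrams = Objects.requireNonNull(unigrams, "unigrams must not be null");
    // An empty bigram map is allowed; only null is rejected.
    this.bigrams = Objects.requireNonNull(bigrams, "bigrams must not be null");
  }

  /** Rejects null, empty, and whitespace-only values. */
  static String requireText(String value, String label) {
    if (value == null || value.trim().isEmpty()) {
      throw new IllegalArgumentException(label + " must not be null or blank");
    }
    return value;
  }
}
```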
-
-
Method Details
-
getSymSpell
- Returns:
- the ready-to-query engine backed by this model.
-
getLanguage
- Returns:
- the language tag of this model.
-
getName
- Returns:
- the model name (also emitted as
model.name).
-
getVersion
- Returns:
- the model version (also emitted as
model.version).
-
getConfig
- Returns:
- the configuration used to build the engine.
-
unigrams
- Returns:
- an unmodifiable view of the
word -> count source map.
-
bigrams
- Returns:
- an unmodifiable view of the
"w1 w2" -> count source map.
-
getArtifactSerializerClass
- Specified by:
getArtifactSerializerClass in interface opennlp.tools.util.model.SerializableArtifact
-