Class DamerauOSADistance

java.lang.Object
opennlp.spellcheck.distance.DamerauOSADistance
All Implemented Interfaces:
EditDistance

public final class DamerauOSADistance extends Object implements EditDistance
Optimal String Alignment (restricted Damerau-Levenshtein) edit distance.

Counts insertions, deletions, substitutions, and transpositions of two adjacent symbols, each at unit cost. Under the optimal-string-alignment restriction, no substring may be edited more than once, so a transposed pair cannot be modified again by a later edit. This is the variant used by the SymSpell reference implementation.
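
The restriction is easiest to see on a small input. The following is a minimal, unbounded sketch of the optimal-string-alignment recurrence, not this class's actual implementation; the class name OsaSketch and the method osa are hypothetical and exist only for illustration.

```java
public final class OsaSketch {
    private OsaSketch() {}

    /** Unbounded OSA distance over Unicode code points (illustrative only). */
    public static int osa(CharSequence a, CharSequence b) {
        int[] s = a.codePoints().toArray();
        int[] t = b.codePoints().toArray();
        // d[i][j] = distance between the first i symbols of s and the first j of t.
        int[][] d = new int[s.length + 1][t.length + 1];
        for (int i = 0; i <= s.length; i++) d[i][0] = i; // i deletions
        for (int j = 0; j <= t.length; j++) d[0][j] = j; // j insertions
        for (int i = 1; i <= s.length; i++) {
            for (int j = 1; j <= t.length; j++) {
                int cost = (s[i - 1] == t[j - 1]) ? 0 : 1;
                int best = Math.min(
                        Math.min(d[i - 1][j] + 1,      // deletion
                                 d[i][j - 1] + 1),     // insertion
                        d[i - 1][j - 1] + cost);       // match or substitution
                // Transposition of two adjacent symbols, counted as one edit.
                if (i > 1 && j > 1 && s[i - 1] == t[j - 2] && s[i - 2] == t[j - 1]) {
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[s.length][t.length];
    }

    public static void main(String[] args) {
        System.out.println(osa("ab", "ba"));  // one adjacent transposition
        System.out.println(osa("ca", "abc")); // transposed pair may not be edited again
    }
}
```

In this sketch, osa("ab", "ba") is 1, while osa("ca", "abc") is 3: the unrestricted Damerau-Levenshtein distance for the second pair would be 2 (transpose to "ac", then insert "b" between the letters), but that path edits the transposed substring a second time, which the restriction forbids.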

This is the default edit distance for the engine. It is Unicode-aware: comparison happens on Unicode code points, so characters outside the Basic Multilingual Plane (e.g. many emoji) are treated as single symbols.
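
To illustrate what comparing on code points means, the sketch below splits a sequence into symbols using the standard CharSequence.codePoints() method. The class CodePointSketch and the helper symbols are hypothetical names, and this is an assumption about how such a conversion could look rather than a description of this class's internals.

```java
public final class CodePointSketch {
    private CodePointSketch() {}

    /** Splits a sequence into Unicode code points, one int per symbol. */
    public static int[] symbols(CharSequence text) {
        return text.codePoints().toArray();
    }

    public static void main(String[] args) {
        // 'a', U+1F600 (encoded as a surrogate pair in UTF-16), 'b'
        String s = "a\uD83D\uDE00b";
        System.out.println(s.length());        // 4 UTF-16 code units
        System.out.println(symbols(s).length); // 3 code points
    }
}
```

A distance computed over the int[] form counts replacing the emoji with an ordinary letter as a single substitution, whereas a comparison over char values would see two differing code units.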

Instances are immutable and thread-safe. A bounded computation with early exit is provided through distance(CharSequence, CharSequence, int).

  • Constructor Details

    • DamerauOSADistance

      public DamerauOSADistance()
  • Method Details

    • distance

      public int distance(CharSequence a, CharSequence b, int max)
      Description copied from interface: EditDistance
      Computes the edit distance between a and b, giving up early once it is certain the distance exceeds max.

      A max of 0 is permitted and meaningful: it asks only whether the two sequences are equal, returning 0 when they are and -1 as soon as any difference is found. It is the natural lower bound of the contract, not a degenerate case.

      Specified by:
      distance in interface EditDistance
      Parameters:
      a - the first sequence; must not be null
      b - the second sequence; must not be null
      max - the maximum acceptable distance; must be >= 0
      Returns:
      the edit distance, or -1 if it is strictly greater than max
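
The bounded contract described above can be sketched as follows. This is an illustrative stand-in, not the library's code: the class BoundedOsaSketch is hypothetical, the two early-exit checks (length difference and row minimum) are one possible strategy, and the exceptions thrown for invalid arguments are assumptions, since the documentation above does not specify them.

```java
import java.util.Objects;

public final class BoundedOsaSketch {
    private BoundedOsaSketch() {}

    /** Returns the OSA distance, or -1 once it is certain to exceed max. */
    public static int distance(CharSequence a, CharSequence b, int max) {
        Objects.requireNonNull(a, "a");   // assumed behaviour for null input
        Objects.requireNonNull(b, "b");
        if (max < 0) {                    // assumed behaviour for a negative bound
            throw new IllegalArgumentException("max must be >= 0");
        }
        int[] s = a.codePoints().toArray();
        int[] t = b.codePoints().toArray();
        // The distance is never smaller than the difference in length.
        if (Math.abs(s.length - t.length) > max) return -1;

        int[][] d = new int[s.length + 1][t.length + 1];
        for (int j = 0; j <= t.length; j++) d[0][j] = j;
        for (int i = 1; i <= s.length; i++) {
            d[i][0] = i;
            int rowMin = d[i][0];
            for (int j = 1; j <= t.length; j++) {
                int cost = (s[i - 1] == t[j - 1]) ? 0 : 1;
                int best = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                    d[i - 1][j - 1] + cost);
                if (i > 1 && j > 1 && s[i - 1] == t[j - 2] && s[i - 2] == t[j - 1]) {
                    best = Math.min(best, d[i - 2][j - 2] + 1); // adjacent transposition
                }
                d[i][j] = best;
                rowMin = Math.min(rowMin, best);
            }
            // Once every cell in a row exceeds max, no later cell can recover.
            if (rowMin > max) return -1;
        }
        int result = d[s.length][t.length];
        return result <= max ? result : -1;
    }
}
```

With this sketch, distance("abc", "abc", 0) returns 0 and distance("abc", "abd", 0) returns -1, matching the equality-check reading of max = 0. Likewise distance("ca", "abc", 2) returns -1, while raising the bound to 3 returns 3.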