Class LevenshteinDistance

java.lang.Object
opennlp.spellcheck.distance.LevenshteinDistance
All Implemented Interfaces:
EditDistance

public final class LevenshteinDistance extends Object implements EditDistance
Plain Levenshtein edit distance: insertions, deletions, and substitutions, with no transpositions. Offered as a selectable alternative to the default DamerauOSADistance, with which it shares the bounded EditDistance contract. The only behavioural difference is that an adjacent transposition costs two edits here (for example, one deletion plus one insertion, or two substitutions) rather than one.

The computation is bounded with early exit and is Unicode-aware: comparison happens on Unicode code points, so characters outside the Basic Multilingual Plane (e.g. many emoji) are treated as single symbols.
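To make the code point remark concrete, the following is a minimal sketch of how a character outside the Basic Multilingual Plane occupies two char values but only one code point. The class and method names (CodePointSketch, toCodePoints) are hypothetical and are not part of this library; only the standard CharSequence.codePoints() call is used.

```java
// Hypothetical helper for illustration only; not part of opennlp.spellcheck.
final class CodePointSketch {

    private CodePointSketch() {}

    // Decodes a sequence into Unicode code points, so that a surrogate pair
    // becomes a single int rather than two separate char values.
    static int[] toCodePoints(CharSequence s) {
        return s.codePoints().toArray();
    }
}
```

For U+1F600 (written in Java source as "\uD83D\uDE00"), length() reports 2 while toCodePoints returns a single element. A distance computed over code points therefore counts such a character as one symbol.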

Instances are immutable and thread-safe.

  • Constructor Details

    • LevenshteinDistance

      public LevenshteinDistance()
  • Method Details

    • distance

      public int distance(CharSequence a, CharSequence b, int max)
      Description copied from interface: EditDistance
      Computes the edit distance between a and b, giving up early once it is certain the distance exceeds max.

      A max of 0 is permitted and meaningful: it asks only whether the two sequences are equal, returning 0 when they are and -1 as soon as any difference is found. It is the natural lower bound of the contract, not a degenerate case.

      Specified by:
      distance in interface EditDistance
      Parameters:
      a - the first sequence; must not be null
      b - the second sequence; must not be null
      max - the maximum acceptable distance; must not be negative (>= 0)
      Returns:
      the edit distance, or -1 if it is strictly greater than max
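The contract above can be illustrated with a standalone sketch. This is not the library's implementation: BoundedLevenshteinSketch is a hypothetical class, and both the two-row dynamic programming layout and the exceptions thrown for invalid arguments are assumptions chosen for the example. It only aims to show one way a bounded, code-point-based Levenshtein distance could satisfy the documented return values.

```java
import java.util.Objects;

// Hypothetical sketch for illustration only; not the opennlp implementation.
final class BoundedLevenshteinSketch {

    private BoundedLevenshteinSketch() {}

    // Returns the Levenshtein distance between a and b, or -1 if it exceeds max.
    static int distance(CharSequence a, CharSequence b, int max) {
        // Assumption: invalid arguments are rejected with these exception types.
        Objects.requireNonNull(a, "a");
        Objects.requireNonNull(b, "b");
        if (max < 0) {
            throw new IllegalArgumentException("max must be >= 0");
        }

        // Compare code points, not chars, so surrogate pairs count as one symbol.
        int[] s = a.codePoints().toArray();
        int[] t = b.codePoints().toArray();

        // The length difference is a lower bound on the distance.
        if (Math.abs(s.length - t.length) > max) {
            return -1;
        }

        int[] prev = new int[t.length + 1];
        int[] curr = new int[t.length + 1];
        for (int j = 0; j <= t.length; j++) {
            prev[j] = j; // distance from the empty prefix of a
        }

        for (int i = 1; i <= s.length; i++) {
            curr[0] = i;
            int rowMin = curr[0];
            for (int j = 1; j <= t.length; j++) {
                int cost = (s[i - 1] == t[j - 1]) ? 0 : 1;
                int deletion = prev[j] + 1;
                int insertion = curr[j - 1] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(deletion, insertion), substitution);
                rowMin = Math.min(rowMin, curr[j]);
            }
            // Early exit: the final distance is never below a row's minimum.
            if (rowMin > max) {
                return -1;
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }

        int d = prev[t.length];
        return d <= max ? d : -1;
    }
}
```

Under this sketch, distance("ab", "ba", 2) returns 2, reflecting the two-edit cost of an adjacent transposition, while distance("ab", "ba", 1) returns -1. With max set to 0, distance("abc", "abc", 0) returns 0 and distance("abc", "abd", 0) returns -1, matching the equality-check reading of the contract.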