langlab.core.comparators

This module contains various tools for fuzzy string comparison.

calc-common-prefix-length

(calc-common-prefix-length s1 s2)

Calculates the length of the common initial substring of s1 and s2.
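The behaviour described above can be sketched in plain Clojure. The function name `common-prefix-length-sketch` is a hypothetical stand-in for illustration, not the langlab source, and it assumes both arguments are non-nil strings.

```clojure
;; Hypothetical stand-in for illustration; not the langlab implementation.
(defn common-prefix-length-sketch
  [s1 s2]
  ;; Compare the characters position by position and count the leading
  ;; run of equal pairs. `map` stops at the end of the shorter string.
  (count (take-while true? (map = s1 s2))))

(common-prefix-length-sketch "prefix" "preview") ; => 3 ("pre")
(common-prefix-length-sketch "abc" "xyz")        ; => 0
```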

calc-jaro-winkler-dist-aclang

(calc-jaro-winkler-dist-aclang s1 s2)

Computes the Jaro-Winkler distance between s1 and s2.

http://en.wikipedia.org/wiki/Jaro-Winkler_distance

It uses Apache Commons Lang3 (aclang).

calc-levenshtein-dist-aclang

(calc-levenshtein-dist-aclang s1 s2)

Calculates the Levenshtein distance between s1 and s2. It uses Apache Commons Lang3 (aclang).
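For readers unfamiliar with the metric, here is a minimal pure-Clojure sketch of the classic row-by-row dynamic-programming computation. It is an illustration only: `levenshtein-sketch` is a hypothetical name, and the library function itself delegates to Apache Commons Lang3 rather than using this code.

```clojure
;; Hypothetical illustration of the metric; the library itself delegates
;; to Apache Commons Lang3.
(defn levenshtein-sketch
  [s1 s2]
  ;; Row 0: turning the empty prefix of s1 into each prefix of s2
  ;; costs one insertion per character.
  (let [initial-row (vec (range (inc (count s2))))]
    (peek
      (reduce
        (fn [prev-row [i c1]]
          (reduce
            (fn [row [j c2]]
              (conj row
                    (min (inc (nth prev-row (inc j)))        ; delete c1
                         (inc (peek row))                    ; insert c2
                         (+ (nth prev-row j)                 ; substitute
                            (if (= c1 c2) 0 1)))))
            [(inc i)]                                        ; s2 prefix is empty
            (map-indexed vector s2)))
        initial-row
        (map-indexed vector s1)))))

(levenshtein-sketch "kitten" "sitting") ; => 3
```

Only the previous row is kept at each step, so memory use grows with the length of s2 rather than with the product of both lengths.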

calc-trunc-levenshtein-dist-aclang

(calc-trunc-levenshtein-dist-aclang s1 s2 truncation)

Calculates the Levenshtein distance between s1 and s2, but only if it is less than or equal to truncation. If the distance is greater than truncation, false is returned. It uses Apache Commons Lang3 (aclang).
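The return contract can be sketched independently of how the distance is computed. In the snippet below, `truncate-distance-sketch` is a hypothetical helper that takes an already computed distance. It only illustrates the "distance or false" result described above and is not the library's implementation.

```clojure
;; Hypothetical helper illustrating the return contract only.
(defn truncate-distance-sketch
  [dist truncation]
  ;; Keep the distance when it is within the limit, otherwise signal
  ;; "too far apart" with false.
  (if (<= dist truncation) dist false))

(truncate-distance-sketch 2 3) ; => 2
(truncate-distance-sketch 3 3) ; => 3
(truncate-distance-sketch 4 3) ; => false

;; 0 is truthy in Clojure, so a result of this shape can be tested
;; directly with if-let, even for identical strings.
(if-let [d (truncate-distance-sketch 0 3)]
  (str "within limit, distance " d)
  "too far apart")
;; => "within limit, distance 0"
```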