langlab.algs.readability
Module contains functions for computing readability indices.
calc-automated-readability-index
(calc-automated-readability-index stats)
Calculates the Automated Readability Index from the given stats. Stats should include the fields :n-words, :n-hard-words, :n-sentences, and :n-chars.
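As an illustration, the commonly published ARI formula is 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43. The sketch below applies it to a stats map of the shape described above. The function name ari-sketch is hypothetical, and the coefficients are assumed from the standard formula rather than taken from langlab's source, so the library's result may differ.

```clojure
;; Minimal sketch of the standard ARI formula (an assumption, not langlab's code).
(defn ari-sketch
  "Hypothetical helper: computes ARI from :n-chars, :n-words and :n-sentences."
  [{:keys [n-chars n-words n-sentences]}]
  (- (+ (* 4.71 (/ (double n-chars) n-words))      ; average characters per word
        (* 0.5 (/ (double n-words) n-sentences)))  ; average words per sentence
     21.43))

;; 100 words, 450 letters, 5 sentences gives roughly 9.77.
(ari-sketch {:n-chars 450 :n-words 100 :n-hard-words 10 :n-sentences 5})
```

Note that this formula does not use :n-hard-words; the key is included only to match the documented shape of stats.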
calc-coleman-liau-index
(calc-coleman-liau-index stats)
Calculates the Coleman-Liau Readability Index from the given stats. Stats should include the fields :n-words, :n-hard-words, :n-sentences, and :n-chars.
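For reference, the commonly published Coleman-Liau formula is 0.0588 * L - 0.296 * S - 15.8, where L is the number of letters per 100 words and S is the number of sentences per 100 words. The following sketch assumes that formula; coleman-liau-sketch is a hypothetical name and the library's implementation may use different constants.

```clojure
;; Minimal sketch of the standard Coleman-Liau formula (an assumption, not langlab's code).
(defn coleman-liau-sketch
  "Hypothetical helper: computes the index from :n-chars, :n-words and :n-sentences."
  [{:keys [n-chars n-words n-sentences]}]
  (let [l (* 100.0 (/ (double n-chars) n-words))       ; letters per 100 words
        s (* 100.0 (/ (double n-sentences) n-words))]  ; sentences per 100 words
    (- (* 0.0588 l) (* 0.296 s) 15.8)))

;; 100 words, 450 letters, 5 sentences gives roughly 9.18.
(coleman-liau-sketch {:n-chars 450 :n-words 100 :n-hard-words 10 :n-sentences 5})
```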
calc-gunning-fog-index
(calc-gunning-fog-index stats)
Calculates the Gunning Fog Readability Index from the given stats. Stats should include the fields :n-words, :n-hard-words, and :n-sentences.
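The commonly published Gunning Fog formula is 0.4 * ((words / sentences) + 100 * (hard words / words)). The sketch below assumes that formula; gunning-fog-sketch is a hypothetical name, and langlab's own function may differ in detail.

```clojure
;; Minimal sketch of the standard Gunning Fog formula (an assumption, not langlab's code).
(defn gunning-fog-sketch
  "Hypothetical helper: computes the index from :n-words, :n-hard-words and :n-sentences."
  [{:keys [n-words n-hard-words n-sentences]}]
  (* 0.4
     (+ (/ (double n-words) n-sentences)             ; average sentence length
        (* 100.0 (/ (double n-hard-words) n-words))))) ; percentage of hard words

;; 100 words, 10 hard words, 5 sentences gives roughly 12.0.
(gunning-fog-sketch {:n-words 100 :n-hard-words 10 :n-sentences 5})
```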
calc-text-stats
(calc-text-stats s env)
Calculates statistics of the text s. The env supports the following keys, each mapping to a function:
 - :split-sentences-f - splits text into sentences (mandatory),
 - :split-tokens-f - splits text into tokens (mandatory),
 - :trans-drop-punct-f - removes all non-word tokens (default trans-drop-punct),
 - :count-chars-f - counts chars in a string (default :en-count-chars-bi),
 - :is-hard-word-f - checks if a token is a hard word (default: count-latin-vowel-groups-without-final > 2).
The result is a map with the following fields:
 - :n-chars - total number of letters in words,
 - :n-words - number of words,
 - :n-hard-words - number of hard words according to is-hard-word-f,
 - :n-sentences - number of sentences.
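To make the env contract concrete, here is a self-contained sketch that mimics the described behaviour without calling langlab. Every function in it (the naive-* helpers and text-stats-sketch) is a hypothetical stand-in; in particular, the defaults used here are crude approximations, not the library's trans-drop-punct or count-latin-vowel-groups-without-final.

```clojure
(require '[clojure.string :as str])

;; Hypothetical stand-ins for the env functions; real tokenizers will differ.
(defn naive-split-sentences [s]
  (remove str/blank? (str/split s #"(?<=[.!?])\s+")))

(defn naive-split-tokens [s]
  (re-seq #"[A-Za-z']+|[^\sA-Za-z']" s))

(defn naive-drop-punct [tokens]
  ;; keep only tokens that contain at least one letter
  (filter #(re-find #"[A-Za-z]" %) tokens))

(defn naive-hard-word? [token]
  ;; crude approximation: more than two vowel groups
  (> (count (re-seq #"[aeiouyAEIOUY]+" token)) 2))

(defn text-stats-sketch
  "Hypothetical re-creation of the documented contract: two mandatory
  splitters, three optional functions with defaults."
  [s env]
  (let [{:keys [split-sentences-f split-tokens-f
                trans-drop-punct-f count-chars-f is-hard-word-f]
         :or   {trans-drop-punct-f naive-drop-punct
                count-chars-f      count
                is-hard-word-f     naive-hard-word?}} env
        words (trans-drop-punct-f (split-tokens-f s))]
    {:n-chars      (reduce + (map count-chars-f words))
     :n-words      (count words)
     :n-hard-words (count (filter is-hard-word-f words))
     :n-sentences  (count (split-sentences-f s))}))

(text-stats-sketch "The cat sat. It was a remarkable animal!"
                   {:split-sentences-f naive-split-sentences
                    :split-tokens-f    naive-split-tokens})
;; => {:n-chars 31, :n-words 8, :n-hard-words 2, :n-sentences 2}
```

A map of this shape is what the calc-*-index functions in this module expect as their stats argument.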
count-sentences
(count-sentences s env)
Counts the number of sentences in s based on the provided env. The env supports the key :split-sentences-f (mandatory).
count-words
(count-words s env)
Counts the number of words in s based on the provided env. The env supports the keys:
 - :split-tokens-f (mandatory),
 - :trans-drop-punct-f (defaults to trans-drop-punct).
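The interplay of the two keys can be sketched as follows. This is a standalone approximation: count-words-sketch and its letter-based default filter are hypothetical and only illustrate the "tokenize, then drop punctuation, then count" flow described above.

```clojure
(require '[clojure.string :as str])

(defn count-words-sketch
  "Hypothetical re-creation: split into tokens, drop non-word tokens, count the rest."
  [s env]
  (let [{:keys [split-tokens-f trans-drop-punct-f]
         ;; assumed default: keep tokens containing at least one letter
         :or   {trans-drop-punct-f (fn [tokens]
                                     (filter #(re-find #"\p{L}" %) tokens))}} env]
    (count (trans-drop-punct-f (split-tokens-f s)))))

;; With the default filter, punctuation tokens are not counted.
(count-words-sketch "Hello , world !"
                    {:split-tokens-f #(str/split % #"\s+")})
;; => 2

;; Overriding :trans-drop-punct-f with identity counts every token.
(count-words-sketch "Hello , world !"
                    {:split-tokens-f     #(str/split % #"\s+")
                     :trans-drop-punct-f identity})
;; => 4
```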