langlab.algs.readability
Module contains functions for computing readability indices.
calc-automated-readability-index
(calc-automated-readability-index stats)
Calculates the Automated Readability Index from the given stats. Stats should include the fields :n-words, :n-hard-words, :n-sentences, and :n-chars.
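As a rough illustration of what such an index computes, here is a minimal sketch of the commonly published ARI formula applied to a stats map. The function name ari-sketch is hypothetical, and the constants are the textbook ones; the exact computation inside calc-automated-readability-index is not shown in this reference and may differ.

```clojure
;; Sketch only: the commonly published ARI formula,
;;   4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43
;; `ari-sketch` is a hypothetical name, not part of langlab.
(defn ari-sketch
  [{:keys [n-chars n-words n-sentences]}]
  (- (+ (* 4.71 (/ (double n-chars) n-words))
        (* 0.5 (/ (double n-words) n-sentences)))
     21.43))

;; Example stats map using the field names documented above.
(def example-stats
  {:n-chars 500 :n-words 100 :n-hard-words 10 :n-sentences 5})

(ari-sketch example-stats) ; approximately 12.12
```

The :n-hard-words field is not used by this formula; it is kept in the example map only because the documentation lists it among the expected fields.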
calc-coleman-liau-index
(calc-coleman-liau-index stats)
Calculates the Coleman-Liau Readability Index from the given stats. Stats should include the fields :n-words, :n-hard-words, :n-sentences, and :n-chars.
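For orientation, the commonly published Coleman-Liau formula can be sketched as follows. The name coleman-liau-sketch is hypothetical, and the constants are the textbook ones, which are an assumption about, not a statement of, what calc-coleman-liau-index does internally.

```clojure
;; Sketch only: the commonly published Coleman-Liau formula,
;;   0.0588 * L - 0.296 * S - 15.8
;; where L = letters per 100 words and S = sentences per 100 words.
;; `coleman-liau-sketch` is a hypothetical name, not part of langlab.
(defn coleman-liau-sketch
  [{:keys [n-chars n-words n-sentences]}]
  (let [l (* 100.0 (/ (double n-chars) n-words))
        s (* 100.0 (/ (double n-sentences) n-words))]
    (- (* 0.0588 l) (* 0.296 s) 15.8)))

(coleman-liau-sketch
  {:n-chars 500 :n-words 100 :n-hard-words 10 :n-sentences 5})
;; approximately 12.12
```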
calc-gunning-fog-index
(calc-gunning-fog-index stats)
Calculates the Gunning Fog Readability Index from the given stats. Stats should include the fields :n-words, :n-hard-words, and :n-sentences.
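The role of :n-hard-words is easiest to see in the commonly published Gunning Fog formula, sketched below. The name gunning-fog-sketch is hypothetical, and the formula is the textbook one rather than a confirmed copy of what calc-gunning-fog-index computes.

```clojure
;; Sketch only: the commonly published Gunning Fog formula,
;;   0.4 * ((words / sentences) + 100 * (hard-words / words))
;; `gunning-fog-sketch` is a hypothetical name, not part of langlab.
(defn gunning-fog-sketch
  [{:keys [n-words n-hard-words n-sentences]}]
  (* 0.4
     (+ (/ (double n-words) n-sentences)
        (* 100.0 (/ (double n-hard-words) n-words)))))

(gunning-fog-sketch {:n-words 100 :n-hard-words 10 :n-sentences 5})
;; approximately 12.0
```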
calc-text-stats
(calc-text-stats s env)
Calculates statistics of text s. The env supports the following keys, each mapping to a function:
- :split-sentences-f - splits text into sentences (mandatory),
- :split-tokens-f - splits text into tokens (mandatory),
- :trans-drop-punct-f - removes all non-word tokens (default: trans-drop-punct),
- :count-chars-f - counts chars in a string (default: en-count-chars-bi),
- :is-hard-word-f - checks whether a token is a hard word (default: count-latin-vowel-groups-without-final > 2).

The result is a map with the following fields:
- :n-chars - total number of letters in words,
- :n-words - number of words,
- :n-hard-words - number of hard words according to is-hard-word-f,
- :n-sentences - number of sentences.
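The env-driven design described above can be sketched in a self-contained way. Everything in this block is a hypothetical stand-in built on clojure.string and regular expressions: the helper names, the regex-based splitters, and the crude vowel-group test for hard words are assumptions for illustration and are not langlab's own implementations or defaults.

```clojure
(require '[clojure.string :as str])

;; Hypothetical stand-ins for the env functions (not langlab code).
(defn split-sentences [s]
  ;; Split on whitespace that follows ., ! or ?
  (remove str/blank? (str/split s #"(?<=[.!?])\s+")))

(defn split-tokens [s]
  ;; Runs of letters/apostrophes are words; other non-space chars are tokens.
  (re-seq #"[A-Za-z']+|[^\sA-Za-z']" s))

(defn drop-punct [tokens]
  ;; Keep only tokens that contain at least one letter.
  (filter #(re-find #"[A-Za-z]" %) tokens))

(defn count-letters [token]
  (count (re-seq #"[A-Za-z]" token)))

(defn hard-word? [token]
  ;; Crude proxy: more than two vowel groups.
  (> (count (re-seq #"(?i)[aeiouy]+" token)) 2))

(defn text-stats-sketch
  "Mandatory keys are read from env; optional keys fall back to defaults."
  [s env]
  (let [{:keys [split-sentences-f split-tokens-f trans-drop-punct-f
                count-chars-f is-hard-word-f]
         :or   {trans-drop-punct-f drop-punct
                count-chars-f      count-letters
                is-hard-word-f     hard-word?}} env
        words (trans-drop-punct-f (split-tokens-f s))]
    {:n-chars      (reduce + (map count-chars-f words))
     :n-words      (count words)
     :n-hard-words (count (filter is-hard-word-f words))
     :n-sentences  (count (split-sentences-f s))}))

(text-stats-sketch "The cat sat. Readability is measurable!"
                   {:split-sentences-f split-sentences
                    :split-tokens-f    split-tokens})
;; => {:n-chars 32, :n-words 6, :n-hard-words 2, :n-sentences 2}
```

Passing the splitters in through env keeps the statistics code language-agnostic: swapping in a different tokenizer or hard-word predicate changes the counts without touching the counting logic.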
count-sentences
(count-sentences s env)
Counts the number of sentences in s based on the provided env. The env supports the key :split-sentences-f (mandatory).
count-words
(count-words s env)
Counts the number of words in s based on the provided env. The env supports the keys:
- :split-tokens-f (mandatory),
- :trans-drop-punct-f (defaults to trans-drop-punct).