langlab.core.stopwords
This module contains predefined sets of stopwords and articles for various languages, together with functions that operate on them.
All constants are lowercase.
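A minimal sketch of how one of these sets might be used to filter tokens. It assumes the functions below return an ordinary Clojure set of lowercase strings. To keep the example self-contained, the `stopwords` value is a small hypothetical stand-in, not the real Lucene list. The helper `remove-stopwords` is also hypothetical and is not part of this module.

```clojure
(require '[clojure.string :as str])

;; Hypothetical stand-in. With the library on the classpath this would
;; presumably be (langlab.core.stopwords/en-get-stopwords-lucene).
(def stopwords #{"a" "the" "of"})

(defn remove-stopwords
  "Drops every token whose lowercase form is in the stopword set.
  Tokens are lowercased before lookup because the sets are lowercase."
  [stopwords tokens]
  (remove #(contains? stopwords (str/lower-case %)) tokens))

(remove-stopwords stopwords ["The" "history" "of" "a" "language"])
;; => ("history" "language")
```

Lowercasing at lookup time leaves the surviving tokens in their original case.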
ar-get-stopwords-lucene
(ar-get-stopwords-lucene)
Returns default Arabic stopword set used by Lucene.
bg-get-stopwords-clef
(bg-get-stopwords-clef)
Returns Bulgarian stopword set. Taken from http://members.unine.ch/jacques.savoy/clef/
bg-get-stopwords-lucene
(bg-get-stopwords-lucene)
Returns default Bulgarian stopword set used by Lucene.
br-get-stopwords-lucene
(br-get-stopwords-lucene)
Returns default Brazilian Portuguese stopword set used by Lucene.
ca-get-stopwords-lucene
(ca-get-stopwords-lucene)
Returns default Catalan stopword set used by Lucene.
ca-get-stopwords-ranks
(ca-get-stopwords-ranks)
Returns Catalan stopword set. Taken from http://www.ranks.nl/resources/stopwords.html.
cz-get-stopwords-clef
(cz-get-stopwords-clef)
Returns Czech stopword set. Taken from http://members.unine.ch/jacques.savoy/clef/
cz-get-stopwords-lucene
(cz-get-stopwords-lucene)
Returns default Czech stopword set used by Lucene.
cz-get-stopwords-ranks
(cz-get-stopwords-ranks)
Returns Czech stopword set. Taken from http://www.ranks.nl/resources/stopwords.html.
da-get-stopwords-lucene
(da-get-stopwords-lucene)
Returns default Danish stopword set used by Lucene.
da-get-stopwords-ranks
(da-get-stopwords-ranks)
Returns Danish stopword set. Taken from http://www.ranks.nl/resources/stopwords.html.
de-get-stopwords-clef
(de-get-stopwords-clef)
Returns German stopword set. Taken from http://members.unine.ch/jacques.savoy/clef/
de-get-stopwords-lucene
(de-get-stopwords-lucene)
Returns default German stopword set used by Lucene.
de-get-stopwords-ranks
(de-get-stopwords-ranks)
Returns German stopword set. Taken from http://www.ranks.nl/resources/stopwords.html.
el-get-stopwords-lucene
(el-get-stopwords-lucene)
Returns default Greek stopword set used by Lucene.
en-get-stopwords-clef
(en-get-stopwords-clef)
Returns English stopword set. Taken from http://members.unine.ch/jacques.savoy/clef/
en-get-stopwords-lucene
(en-get-stopwords-lucene)
Returns default English stopword set used by Lucene.
en-get-stopwords-ranks-long
(en-get-stopwords-ranks-long)
Returns English stopword set (longer list). Taken from http://www.ranks.nl/resources/stopwords.html.
en-get-stopwords-ranks-short
(en-get-stopwords-ranks-short)
Returns English stopword set (shorter list). Taken from http://www.ranks.nl/resources/stopwords.html.
en-get-stopwords-ranks-vlong
(en-get-stopwords-ranks-vlong)
Returns English stopword set (very long list). Taken from http://www.ranks.nl/resources/stopwords.html.
es-get-stopwords-long-clef
(es-get-stopwords-long-clef)
Returns Spanish stopword set (longer version). Taken from http://members.unine.ch/jacques.savoy/clef/
es-get-stopwords-lucene
(es-get-stopwords-lucene)
Returns default Spanish stopword set used by Lucene.
es-get-stopwords-ranks
(es-get-stopwords-ranks)
Returns Spanish stopword set. Taken from http://www.ranks.nl/resources/stopwords.html.
es-get-stopwords-short-clef
(es-get-stopwords-short-clef)
Returns Spanish stopword set (shorter version). Taken from http://members.unine.ch/jacques.savoy/clef/
eu-get-stopwords-lucene
(eu-get-stopwords-lucene)
Returns default Basque stopword set used by Lucene.
fa-get-stopwords-lucene
(fa-get-stopwords-lucene)
Returns default Persian stopword set used by Lucene.
fi-get-stopwords-lucene
(fi-get-stopwords-lucene)
Returns default Finnish stopword set used by Lucene.
fi-get-stopwords-ranks
(fi-get-stopwords-ranks)
Returns Finnish stopword set. Taken from http://www.ranks.nl/resources/stopwords.html.
fr-get-stopwords-clef
(fr-get-stopwords-clef)
Returns French stopword set. Taken from http://members.unine.ch/jacques.savoy/clef/
fr-get-stopwords-lucene
(fr-get-stopwords-lucene)
Returns default French stopword set used by Lucene.
fr-get-stopwords-ranks
(fr-get-stopwords-ranks)
Returns French stopword set. Taken from http://www.ranks.nl/resources/stopwords.html.
ga-get-stopwords-lucene
(ga-get-stopwords-lucene)
Returns default Irish stopword set used by Lucene.
gl-get-stopwords-lucene
(gl-get-stopwords-lucene)
Returns default Galician stopword set used by Lucene.
hi-get-stopwords-lucene
(hi-get-stopwords-lucene)
Returns default Hindi stopword set used by Lucene.
hu-get-stopwords-clef
(hu-get-stopwords-clef)
Returns Hungarian stopword set. Taken from http://members.unine.ch/jacques.savoy/clef/
hu-get-stopwords-lucene
(hu-get-stopwords-lucene)
Returns default Hungarian stopword set used by Lucene.
hu-get-stopwords-ranks
(hu-get-stopwords-ranks)
Returns Hungarian stopword set. Taken from http://www.ranks.nl/resources/stopwords.html.
hy-get-stopwords-lucene
(hy-get-stopwords-lucene)
Returns default Armenian stopword set used by Lucene.
id-get-stopwords-lucene
(id-get-stopwords-lucene)
Returns default Indonesian stopword set used by Lucene.
it-get-stopwords-clef
(it-get-stopwords-clef)
Returns Italian stopword set. Taken from http://members.unine.ch/jacques.savoy/clef/
it-get-stopwords-lucene
(it-get-stopwords-lucene)
Returns default Italian stopword set used by Lucene.
it-get-stopwords-ranks
(it-get-stopwords-ranks)
Returns Italian stopword set. Taken from http://www.ranks.nl/resources/stopwords.html.
lv-get-stopwords-lucene
(lv-get-stopwords-lucene)
Returns default Latvian stopword set used by Lucene.
nl-get-stopwords-ranks
(nl-get-stopwords-ranks)
Returns Dutch stopword set. Taken from http://www.ranks.nl/resources/stopwords.html.
no-get-stopwords-lucene
(no-get-stopwords-lucene)
Returns default Norwegian stopword set used by Lucene.
no-get-stopwords-ranks
(no-get-stopwords-ranks)
Returns Norwegian stopword set. Taken from http://www.ranks.nl/resources/stopwords.html.
pl-get-stopwords-long-clef
(pl-get-stopwords-long-clef)
Returns Polish stopword set (longer version). Taken from http://members.unine.ch/jacques.savoy/clef/
pl-get-stopwords-lucene
(pl-get-stopwords-lucene)
Returns default Polish stopword set used by Lucene.
pl-get-stopwords-ranks
(pl-get-stopwords-ranks)
Returns Polish stopword set. Taken from http://www.ranks.nl/resources/stopwords.html.
pl-get-stopwords-short-clef
(pl-get-stopwords-short-clef)
Returns Polish stopword set (shorter version). Taken from http://members.unine.ch/jacques.savoy/clef/
pl-get-stopwords-wiki-long
(pl-get-stopwords-wiki-long)
Returns Polish stopword set. Taken from http://pl.wikipedia.org/wiki/Wikipedia:Stopwords.
pl-get-stopwords-wiki-short
(pl-get-stopwords-wiki-short)
Returns Polish stopword set. Taken from http://pl.wikipedia.org/wiki/Wikipedia:Stopwords. Extended with the polysemous words "bez" and "go".
pt-get-stopwords-long-clef
(pt-get-stopwords-long-clef)
Returns Portuguese stopword set (longer version). Taken from http://members.unine.ch/jacques.savoy/clef/
pt-get-stopwords-lucene
(pt-get-stopwords-lucene)
Returns default Portuguese stopword set used by Lucene.
pt-get-stopwords-ranks
(pt-get-stopwords-ranks)
Returns Portuguese stopword set. Taken from http://www.ranks.nl/resources/stopwords.html.
pt-get-stopwords-short-clef
(pt-get-stopwords-short-clef)
Returns Portuguese stopword set (shorter version). Taken from http://members.unine.ch/jacques.savoy/clef/
ro-get-stopwords-clef
(ro-get-stopwords-clef)
Returns Romanian stopword set. Taken from http://members.unine.ch/jacques.savoy/clef/
ro-get-stopwords-lucene
(ro-get-stopwords-lucene)
Returns default Romanian stopword set used by Lucene.
ru-get-stopwords-clef
(ru-get-stopwords-clef)
Returns Russian stopword set. Taken from http://members.unine.ch/jacques.savoy/clef/
ru-get-stopwords-lucene
(ru-get-stopwords-lucene)
Returns default Russian stopword set used by Lucene.
ru-get-stopwords-ranks
(ru-get-stopwords-ranks)
Returns Russian stopword set. Taken from http://www.ranks.nl/resources/stopwords.html.
sv-get-stopwords-clef
(sv-get-stopwords-clef)
Returns Swedish stopword set. Taken from http://members.unine.ch/jacques.savoy/clef/
sv-get-stopwords-lucene
(sv-get-stopwords-lucene)
Returns default Swedish stopword set used by Lucene.
sv-get-stopwords-ranks
(sv-get-stopwords-ranks)
Returns Swedish stopword set. Taken from http://www.ranks.nl/resources/stopwords.html.
th-get-stopwords-lucene
(th-get-stopwords-lucene)
Returns default Thai stopword set used by Lucene.
tr-get-stopwords-lucene
(tr-get-stopwords-lucene)
Returns default Turkish stopword set used by Lucene.
tr-get-stopwords-ranks
(tr-get-stopwords-ranks)
Returns Turkish stopword set. Taken from http://www.ranks.nl/resources/stopwords.html.
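Several languages above have lists from more than one source (Lucene, CLEF, ranks.nl). A rough sketch of how two such lists might be combined or compared, again assuming the functions return plain Clojure sets. The two sets here are small hypothetical stand-ins, not the real lists.

```clojure
(require '[clojure.set :as set])

;; Hypothetical stand-ins for the results of, for example,
;; (en-get-stopwords-lucene) and (en-get-stopwords-ranks-short).
(def lucene-like #{"a" "an" "the"})
(def ranks-like  #{"a" "about" "the"})

;; Every word that appears in either source.
(def combined (set/union lucene-like ranks-like))

;; Words that only the second source contains.
(def only-in-ranks (set/difference ranks-like lucene-like))

only-in-ranks
;; => #{"about"}
```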