How to make a word list from an aspell dictionary

Posted on 2007-11-26

I am working on something that needs a list of words, without regard to American vs. British spelling, accents, or anything else: it just has to contain as many words as possible.  There are a whole bunch of aspell dictionaries available.  First expand the files:

preunzip *.cwl

Then merge into a single list, eliminating duplicates:

sort --unique --ignore-case *.wl >list.txt
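To see what the merge does with case, here is a small sketch using two made-up sample files in place of the real expanded lists. The file names and words are hypothetical, and LC_ALL=C is set only to keep the demo predictable across locales. Because --ignore-case is combined with --unique, "Apple" and "apple" count as the same word and only one of them survives.

```shell
# Hypothetical sample files standing in for the expanded .wl lists.
tmpdir=$(mktemp -d)
printf 'colour\nApple\nzebra\n' > "$tmpdir/a.wl"
printf 'apple\ncolor\nzebra\n' > "$tmpdir/b.wl"

# Merge and drop duplicates, comparing lines case-insensitively.
# LC_ALL=C is an assumption for this demo, not part of the original command.
LC_ALL=C sort --unique --ignore-case "$tmpdir"/*.wl > "$tmpdir/list.txt"

# Show the merged list and how many words it holds.
cat "$tmpdir/list.txt"
wc -l < "$tmpdir/list.txt"
```

If capitalized proper nouns need to stay distinct from lowercase words, dropping --ignore-case would keep both spellings.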

In addition, I want everything to be UTF-8:

iconv -f ISO8859-1 -t UTF-8 list.txt >ulist.txt
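A quick way to check the conversion is to run it on a one-word sample. This sketch assumes the input really is ISO-8859-1; the sample word and temporary directory are made up for illustration. The accented letter is one byte in Latin-1 and two bytes in UTF-8, so the output file should be one byte longer.

```shell
tmpdir=$(mktemp -d)

# "café" with é written as the single Latin-1 byte 0xE9 (octal 351).
printf 'caf\351\n' > "$tmpdir/list.txt"

# Same conversion as above, applied to the sample file.
iconv -f ISO-8859-1 -t UTF-8 "$tmpdir/list.txt" > "$tmpdir/ulist.txt"

# Byte counts before and after the conversion.
wc -c < "$tmpdir/list.txt"
wc -c < "$tmpdir/ulist.txt"
```

Running iconv -f UTF-8 -t UTF-8 on the result is a cheap validity check: it exits with an error if the file contains bytes that are not valid UTF-8.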

Pretty simple.  The merged English word list has 137,883 words.

Tags: aspell iconv sort wordlist