Printable Shavian dictionary of the 1650 most common English words
Published by Arun Isaac on
I am trying to learn the Shavian alphabet[1], but I miss having a good paper dictionary. Flitting back and forth to my computer when I want to be writing away from it is a no-go. So, I made this printable dictionary of the 1650[2] most common English words.
I used the Kingsley Read lexicon[3] with some Unix-fu to produce this printable Shavian dictionary. Here's the code.
wget https://github.com/Shavian-info/readlex/raw/refs/heads/main/kingsleyreadlexicon.tsv
# The number of lines (1650, in this case) is constrained to be a
# multiple as described in fold.awk.

# Combine parts of speech that are spelt the same, sort by word
# frequency, take the first 1650 words, extract only the Latin and
# Shavian spellings, sort alphabetically, and "fold" columns for
# printing.
awk -f collapse.awk kingsleyreadlexicon.tsv | \
    sort -nrk 3,3 | head -n1650 \
    | cut -f1,2 | sort -k 1,1 \
    | awk -f fold.awk > shavian-1650.tsv
where collapse.awk is
BEGIN {
    OFS = "\t"
}

{
    latin = $1
    shavian = $2
    frequency = $5
}

NR == 1 {
    # Initialize state.
    previous_latin = latin
    previous_shavian = shavian
    accumulated_frequency = frequency
}

NR > 1 {
    # Output line and clear accumulator if this is a different word.
    if ((latin != previous_latin) || (shavian != previous_shavian)) {
        print previous_latin, previous_shavian, accumulated_frequency
        accumulated_frequency = 0
    }
    # Keep state.
    accumulated_frequency += frequency
    previous_latin = latin
    previous_shavian = shavian
}

END {
    # This is the end. Output line unconditionally.
    print previous_latin, previous_shavian, accumulated_frequency
}
and fold.awk is
# This script only works if the total number of lines is a multiple of
# lines_per_page * sections_per_page. This is when pages are
# perfectly filled, and there is no empty space left.

BEGIN {
    lines_per_page = 55;
    sections_per_page = 3;
}

# Accumulate lines in a matrix.
{
    page_line = (NR - 1) % (lines_per_page * sections_per_page)
    lines[page_line % lines_per_page, int(page_line / lines_per_page)] = $0
}

# Dump accumulated matrix once end of page is reached. Then, clear the
# matrix so the next page can be built up.
(page_line == lines_per_page*sections_per_page - 1) {
    for (i=0; i<lines_per_page; i++) {
        for (j=0; j<sections_per_page; j++) {
            printf (j == 0) ? "%s" : "\t%s", lines[i, j]
            delete lines[i, j]
        }
        printf "\n"
    }
}
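To make the folding easier to picture, here is a minimal sketch that runs the same logic on just six lines. The function name fold_demo and the tiny page dimensions (2 lines per page, 3 sections) are made up for this illustration, and the dimensions are passed in with -v rather than hard-coded as they are in fold.awk.

```shell
# Sketch: the folding logic of fold.awk, with the page dimensions
# passed as arguments (an assumption made only for this demo).
fold_demo() {
    awk -v lines_per_page="$1" -v sections_per_page="$2" '
    # Store each input line at (row, column) of the current page.
    {
        page_line = (NR - 1) % (lines_per_page * sections_per_page)
        lines[page_line % lines_per_page, int(page_line / lines_per_page)] = $0
    }
    # Once the page is full, print it row by row and clear it.
    page_line == lines_per_page * sections_per_page - 1 {
        for (i = 0; i < lines_per_page; i++) {
            for (j = 0; j < sections_per_page; j++) {
                printf "%s%s", (j == 0 ? "" : "\t"), lines[i, j]
                delete lines[i, j]
            }
            printf "\n"
        }
    }'
}

# Six input lines become two rows of three tab-separated columns.
printf '%s\n' a b c d e f | fold_demo 2 3
```

The first row comes out as a, c, e and the second as b, d, f, so the alphabetical order runs down each column and then continues at the top of the next one, the way a paper dictionary page reads.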
Finally, I printed shavian-1650.tsv using LibreOffice Calc. LibreOffice took care of neatly aligning the columns.
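Since fold.awk only works when every page is perfectly filled, it may be worth checking a candidate word count before printing. The following is a rough sketch; the function name fills_pages and its argument order (line count, lines per page, sections per page) are hypothetical and not part of the scripts above.

```shell
# Sketch: report whether a line count fills whole pages.
# Arguments (illustrative): line count, lines per page, sections per page.
fills_pages() {
    per_page=$(( $2 * $3 ))
    if [ $(( $1 % per_page )) -eq 0 ]; then
        echo "$1 lines fill $(( $1 / per_page )) pages exactly"
    else
        echo "$1 lines leave a partial page ($(( $1 % per_page )) lines over)"
        return 1
    fi
}

# With the values used in fold.awk: 55 lines per page, 3 sections.
fills_pages 1650 55 3
```

With 55 lines and 3 sections, a page holds 165 words, so 1650 words fill 10 pages exactly.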
On a related note, you may also be interested in this Shavian alphabet chart from Omniglot.
UPDATE on 30 March, 2025: Based on a suggestion on the fediverse, I added collapse.awk to combine parts of speech that are spelt the same.
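For illustration, the same combining can also be sketched with an awk associative array keyed on the Latin and Shavian spellings. This is a different technique from collapse.awk: it does not rely on identical spellings sitting on adjacent lines, but awk does not guarantee the order of a for-in loop, so the output is sorted afterwards. The sample rows, their frequencies, the SHAV-... placeholders standing in for real Shavian spellings, and the function name collapse_demo are all made up. The sketch assumes tab-separated input with the same column positions that collapse.awk reads (Latin in column 1, Shavian in column 2, frequency in column 5).

```shell
# Sketch: sum frequencies per (Latin, Shavian) pair with an
# associative array, instead of comparing against the previous line.
collapse_demo() {
    awk '
    BEGIN { FS = OFS = "\t" }
    # Accumulate the frequency (column 5) under a combined key.
    { total[$1 OFS $2] += $5 }
    # Emit one line per distinct spelling pair; order is unspecified.
    END { for (key in total) print key, total[key] }
    ' | sort
}

# Made-up rows: "like" appears twice with the same spelling.
printf 'like\tSHAV-LIKE\tVERB\t-\t100\nlike\tSHAV-LIKE\tPREP\t-\t250\ntree\tSHAV-TREE\tNOUN\t-\t40\n' \
    | collapse_demo
```

The two "like" rows are merged into a single line with their frequencies added together, while "tree" passes through unchanged.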
Footnotes:
1. Thanks to indieterminacy for nudging me in this direction a long time ago.
2. Why 1650, you say? I was going for something above 1000, and 1650 fit perfectly into 10 pages.
3. The Kingsley Read Lexicon, and therefore my printable dictionary, is copyright shavian.info 2020-2022 under a Creative Commons Attribution-ShareAlike 4.0 International Licence.