History of Words

Now for a change from procedural generation, a bit of linguistics!

I’ve always had a few low-key linguistics projects going on, generally related to language learning or the history of language. This particular project stems from an attempt to write a script that converts one language into another using only language change rules. For example, French “cavalier” corresponds to Spanish “caballero”, so how about making “fake Spanish” by taking French and replacing “v” with “b” and “-ier” with “-ero”? But that project ended up being too complicated, and the outputs it produced weren’t much fun. Take the opening of the second part of Don Quixote:

CUENTA Cide Hamete Benengeli, en la segunda parte desta historia y tercera salida de don Quijote, que el cura y el barbero se estuvieron casi un mes sin verle, por no renovarle y traerle a la memoria las cosas pasadas; pero no por esto dejaron de visitar a su sobrina y a su ama, encargándolas tuviesen cuenta con regalarle, dándole a comer cosas confortativas y apropiadas para el corazón y el cerebro, de donde procedía, según buen discurso, toda su mala ventura. …

In “Frenchified” Spanish, this became:

CONTE Cide Famete Benengeli, en le ségonde parte deste hèsteure xiste terzére salide de don Quixote, qe el cure xiste el barbére se estuveron chasi un mes sin verle, peur ne rénovarle xiste trairle e le memeure las cosas pasadas; pére ne peur este delleron de visiter e su sobrine xiste e su ame, enchargandolas tuvessen conte con régalarle, dandole e comér cosas confortativas xiste appropiadas pare el ceurason xiste el cérébre, de done procédie, ségun bon discurse, tode su male venture. …

… which, apart from sounding a bit more like old French, is not very interesting. (I could have made something better out of it, but that would have required rethinking a few aspects of my approach, and I had already spent enough time on it, so I switched to another project.)
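To make the rule idea concrete, here is a minimal sketch of what such a converter could look like. It is not the original script: the `RULES` list and the `apply_rules` function are hypothetical names, and only the two French-to-Spanish rules mentioned above are included.

```python
import re

# Hypothetical rule list: (regex pattern, replacement), applied in order.
# Only the two rules named in the post are included; a real converter
# would need many more, plus some care about rule ordering.
RULES = [
    (r"ier\b", "ero"),  # word-final "-ier" becomes "-ero"
    (r"v", "b"),        # every "v" becomes "b"
]


def apply_rules(text, rules=RULES):
    """Apply each rewrite rule, in order, to the whole text."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text


print(apply_rules("cavalier"))  # prints "cabalero"
```

With just these two rules, “cavalier” comes out as “cabalero” rather than “caballero”, which hints at how quickly the number of rules (and exceptions) grows.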

But this summer I dug up this old project and used the same data to build a couple of nice visualizations, published as the History of Words: a visual exploration of the linguistic data I used in the previous project.

It has two main visualizations:

1) Trees of Indo-European Cognates:

[Screenshot: trees of Indo-European cognates]

2) A History of English, in the form of a Sankey diagram:

[Screenshot: Sankey diagram of the history of English]

(click on it for the full, interactive, version)

It was a fun project, and I have ideas on ways to improve it, but I have other itches to scratch, so I’d rather publish it as it is and come back in a few weeks or months to add improvements (for example: highlighting words with an unusual history, illustrating sound laws …).

