Random Language Generator

training text (choose at least one)


dependency depth:  




This program generates new words and sentences based on patterns in one or more texts of your choice. It works as follows. Let d be your choice of dependency depth. For every (d-1)-tuple of consecutive characters in the training text(s) you chose, the program finds the empirical distribution of the character that follows that (d-1)-tuple. Generating text is then a matter of repeatedly sampling the next character from the distribution that belongs to the most recent d-1 characters.
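The procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program's actual source: the names build_model and generate are hypothetical, multiple training texts are assumed to be pooled into one set of counts, and the handling of a context that has no recorded successor (restart from a random observed context) is a guess.

```python
import random
from collections import Counter, defaultdict


def build_model(texts, d):
    """Map every (d-1)-character context to a Counter of the characters that follow it.

    Assumption: counts from all training texts are simply pooled.
    """
    if d < 1:
        raise ValueError("dependency depth d must be at least 1")
    k = d - 1  # context length
    model = defaultdict(Counter)
    for text in texts:
        for i in range(len(text) - k):
            context = text[i:i + k]
            model[context][text[i + k]] += 1
    return model


def generate(model, d, length, rng=None):
    """Sample `length` characters, each drawn from the empirical
    distribution attached to the preceding d-1 characters."""
    if not model:
        raise ValueError("training text is too short for this depth")
    rng = rng or random.Random()
    k = d - 1
    contexts = sorted(model)       # sorted so a fixed seed gives a fixed result
    out = rng.choice(contexts)     # start from some observed context
    while len(out) < length:
        context = out[len(out) - k:]
        counts = model.get(context)
        if not counts:
            # Assumed fallback: this context was never followed by anything
            # in the training text, so continue from a random context.
            out += rng.choice(contexts)
            continue
        chars = list(counts)
        weights = list(counts.values())
        out += rng.choices(chars, weights=weights)[0]
    return out[:length]


if __name__ == "__main__":
    training = ["the quick brown fox jumps over the lazy dog and then the dog naps"]
    model = build_model(training, d=3)
    print(generate(model, d=3, length=60))
```

With d = 1 the context is empty and the characters are drawn independently according to their overall frequencies; larger d makes each character depend on more of what came before.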

For generating new words, or a hybrid language (e.g. Hispano-Czech, Italo-Polish), the best dependency depth is about 3 or 4. Slightly higher values for dependency depth net you mostly real words, but the syntax is strange. At the highest values of dependency depth you get whole words and pretty good syntax, the only missing ingredient being a coherent train of thought; at that point you are, in effect, generating college-level essays.

Some of the "languages" here, aren't. That is, they're particular texts in English, included here for their unique style, vocabulary or subject matter. Select a high dependency depth, check "Obama", and you're off and running generating never-before-heard Obamaesque speeches. Check "Bush" and "Obama" together for the best of both!

The generated texts are kept short in the interest of conserving CPU cycles at the server end. However, every time you press the submit button, the program will reseed its random number generator and you will therefore obtain a different text.
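A rough sketch of that behaviour, again in Python and again hypothetical: handle_submit, sample_next and the MAX_CHARS value are made-up names and numbers for illustration, and the real server may well seed and truncate differently.

```python
import random
import secrets

MAX_CHARS = 200  # hypothetical cap on output length; the real limit is not stated


def handle_submit(sample_next, max_chars=MAX_CHARS, seed=None):
    """Produce one short text per request using a freshly seeded generator.

    `sample_next` is any callable that takes a random.Random instance and
    returns the next piece of text (a stand-in for the actual sampler).
    """
    if seed is None:
        seed = secrets.randbits(64)  # new seed on every submission
    rng = random.Random(seed)
    pieces = [sample_next(rng) for _ in range(max_chars)]
    return seed, "".join(pieces)


if __name__ == "__main__":
    def toy_sampler(rng):
        return rng.choice("abc ")

    for _ in range(2):
        seed, text = handle_submit(toy_sampler, max_chars=40)
        print(seed, repr(text))
```

Because each submission draws a new seed, two presses of the button will almost always give different texts, while passing the same seed twice reproduces the same text.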

Author: Ted Sternberg