'Up-Goer Five' text editor restricts writers to 1000 most commonly used words

Feb 01, 2013 by Bob Yirka
Someone describes parliamentary democracy. Credit: upgoer5

(Phys.org) -- Geneticist Theo Sanderson has written a simple text editor that lets a writer use only the 1000 ("ten hundred," since "thousand" isn't on the list) most commonly used words in the English language. He calls it the Up-Goer Five Text Editor, in honor of an xkcd comic that describes a Saturn V rocket using only those words. Sanderson has made the editor available online for free, which intrigued bloggers Chris Rowan and Anne Jefferson to the extent that they've set up a Tumblr page called "Ten Hundred Words of Science," where they display the results of a challenge they've issued to scientists: describe what you do for a living using Sanderson's text editor. The results are thought-provoking, interesting and quite often humorous.

Writers the world over spend their days converting scientific jargon into prose that most anyone can understand. They do so because the results of scientific efforts interest a wide range of people who want to know what's going on. Unfortunately, many people who might be interested in such work cannot make sense of what is presented in a scientific journal (or gain access to it without paying for it) because of the word choices its authors make. To make the science more easily understood, such writers must use less jargon and more relatable analogies. Some might wonder why scientists and academics don't simply write their papers in ways that everyone can understand in the first place. The answer is that doing so would lengthen the paper until it became unwieldy, and it would take far longer to write, using time that would be better spent doing research.

The Up-Goer Five editor challenges such thinking, however, by causing those who use it to think about what they wish to convey in ways they likely never thought of before. It forces expression to shift from a word-driven approach to an idea-driven one, which, when written down, often sounds like the way ideas are expressed to children. That's not coincidental: children have a very limited perspective and background, so new information has to be given in a context that they are capable of understanding, and that generally means using a reasonably small vocabulary.

The Up-Goer Five text editor isn't likely to change the ways of the world, of course, but it might just offer some people an opportunity to consider how they express themselves in a more profound way, and to perhaps cause them to gain some insight into how they communicate with others in general.
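
The core check such an editor performs, flagging any word that is not on the permitted list, can be sketched in a few lines of Python. This is only an illustration and not Sanderson's actual code: the tiny ALLOWED_WORDS set is a hypothetical stand-in for the real ten-hundred-word list, the function name find_disallowed is invented for this example, and the sketch makes no attempt to handle plurals or other word forms.

```python
import re

# Hypothetical stand-in for the real list of the ten hundred most common words;
# the actual list used by the editor is not reproduced here.
ALLOWED_WORDS = {
    "i", "work", "on", "things", "that", "fly", "people",
    "the", "a", "and", "big", "up", "go",
}

# A word is treated as a run of letters and apostrophes (an assumption of this sketch).
WORD_PATTERN = re.compile(r"[a-z']+")


def find_disallowed(text, allowed=ALLOWED_WORDS):
    """Return the words in text that are not in allowed, in order of first use."""
    seen = set()
    rejected = []
    for word in WORD_PATTERN.findall(text.lower()):
        if word not in allowed and word not in seen:
            seen.add(word)
            rejected.append(word)
    return rejected


print(find_disallowed("I work on rockets that go up"))  # prints ['rockets']
```

A writer using a tool built this way would see "rockets" rejected and have to find another way to say it, which is the kind of rephrasing the editor is meant to provoke.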

User comments : 26

NotAsleep
5 / 5 (3) Feb 01, 2013
I work on seats that throw people from flying things really fast before they hit the ground. If people die, I probably haven't done my job well. I also make sure that my flying things don't hurt the trees and air with the bad things they give off. If one of my flying things hits the ground wrong, I'm there to save the trees, air and water before bad things happen for good
antialias_physorg
4 / 5 (4) Feb 01, 2013
Orwellian Newspeak, here we come.

But seriously: sometimes you get fun stuff if you limit the choice of words.
Another approach I stumbled upon a good while ago was someone using 4-letter words (or less) to explain relativity (and doing a pretty good job of it, too)
http://www.muppet.../al.html

Some might wonder why scientists and academics don't simply write their papers in ways that everyone can understand in the first place

Because science aims to be precise. Concepts are very precisely defined. However, the more precisely you define something the more words you need for it (consider: lake, ocean, river, etc. vs. 'body of water'). Science happens at the edges of the definitions.

Knowing the definitions is important to express ideas succinctly. But science isn't ABOUT technical terms. They are merely tools so that people don't have to waste a lot of time explaining stuff over and over, but can get on with the important parts.
Eikka
1 / 5 (1) Feb 01, 2013
What they're doing is forcing a user of an isolating language - English - to write in a way that resembles a synthetic language. Synthetic, not in the sense of a made-up language, but in the sense that the language structures itself around a core vocabulary and synthesizes the necessary words that describe things outside of that vocabulary.

Think of the German word Fernsehen for television, or indeed the French-derived loanword television itself. If the same concept was applied to the English language to make it a synthesizing one, you'd arrive at a recognizable word: seefar.

Seefar - wouldn't that be a perfectly reasonable name for a television?

The word itself is its own description for the most part, which is how you avoid coming up with impenetrable jargon in sciences as well. For the English word gear, the Germans use Zahnrad, which is literally "teeth-wheel".
Eikka
1 / 5 (1) Feb 01, 2013
The main problem of an isolating language like English is that you do end up with a vast collection of jargon. The Oxford English Dictionary alone contains the definitions of roughly 600,000 word-forms, and the last time they made a complete printed run, it took up 20 volumes of books.

Meanwhile a single word in Finnish can take up roughly 2500 possible forms, the meanings of which are described by a collection of about 25-75 rules that apply the same to all words of the same class, verb or noun. That means you can express millions of ideas just by knowing a limited set of rules and a limited core vocabulary.
antialias_physorg
4.2 / 5 (5) Feb 01, 2013
Seefar - wouldn't that be a perfectly reasonable name for a television?

How would you distinguish it from binoculars, telescopes, remote cameras, clairvoyance, ... ?

The problem is that there are far more than 1000 concepts - and the more concepts you have the more you have to aggregate ever longer word-monsters.
To give you an example in German, you can end up with something like this (which is a valid word):
"Kühlschrankreparaturdienstleistungsgesellschaftspressevereinigungsredaktionssprecher"
(Which is the editorial spokesperson of the union of departments of all the refrigerator repair service press offices.)

And it took me a full 5 minutes to deconstruct it for proper translation. And at least 1 minute to figure out what it meant in German at all - and I am German.

Communication shouldn't be like that. It should be a speedy way of transferring information without ambiguity. And that sometimes requires technical terms.
tekram
4.8 / 5 (5) Feb 01, 2013
..: seefar.

Seefar - wouldn't that be a perfectly reasonable name for a television?
stupid box
Theo Sanderson can call his editor "stupid writing thing" and his occupation "too much time on hand"
Eikka
1 / 5 (1) Feb 01, 2013
How would you distinguish it from binoculars, telescopes, remote cameras, clairvoyance, ... ?


By coming up with different words for those things. It's usually the context that gives away what kind of a device it is, so the word doesn't have to be completely self-describing.

A telescope can be a seepipe, and binoculars can be pairseers. A remote camera doesn't need a single word because you don't need to roll adjectives into it, and clairvoyance is obviously truthseeing, to give some possible examples.

Besides, "clairvoyance", "binoculars", "telescope" are originally synthesized words that got loaned into English. English simply treats them as isolated concepts instead of recognizing the parts.

"Kühlschrankreparaturdienstleistungsgesellschaftspressevereinigungsredaktionssprecher"


And if you had to invent a jargon word for the same thing, what would it be?

"Bob?"
Eikka
1 / 5 (1) Feb 01, 2013
The problem is that there are far more than 1000 concepts


The problem here is that the 1000 concepts appear to include names as well, so you can't use e.g. "Saturn", which is a completely artificial restriction for a real language and results in wordmonsters. You don't have to call the Sun a "roundhotthinginthesky", because it has a name: the sun. What we're talking about here are concepts like "sunshine", or "sunbeam", or "moonrocket". If you don't know what the sun is, or what the moon is, you'll just have to ask for a clarification.

Picking a thousand basic concepts that aren't trivial names or adjectives, and restricting yourself to combinations of three words gives you 997,002,000 possible combinations. Even if 99% of those are meaningless, you're still left with the means to express ten million separate concepts with your base vocabulary.
Eikka
3 / 5 (2) Feb 01, 2013
Playing the game by its own rules:

I don't agree with the idea of the write-make-thing. It doesn't allow for simple make-different of words like "make", to say "a person or a thing who makes". This is not good because it doesn't allow you to use your ten hundred words word-bag in a normal way. It doesn't allow you to make other words from the words you already know, like people would do. This do-not is not human.
baudrunner
4 / 5 (4) Feb 01, 2013
An effort to eliminate from usage the highfalutin colloquialisms and idioms threatening to acculturate into meme territory, not to mention the technical bafflegab and gobbledygook that the unmerited are forced to assimilate to feign understanding of concepts beyond the ken of most of the corporeal set.
baudrunner
3 / 5 (2) Feb 01, 2013
So... just so we're clear, does this text editor have or not have an "add to dictionary" feature? ..'cause like, I could probably use it.
antialias_physorg
4 / 5 (4) Feb 01, 2013
By coming up with different words for those things.

Guess what. The words we use now are the ones we use BECAUSE they are different.
It's usually the context that gives away what kind of a device it is

So you always have to provide the context. The point of using these specific words is so that you DON'T have to provide the context.
Imagine what scientific papers would look like if you'd have to provide the full context to everything that way? The usual 3-7 page papers would each be blown up to over 100 pages, easily.

And at this point we have to face the fact: scientific papers are written for scientists. Even if they were written in crayon and baby-talk the average Joe (even one reading a science site like this) wouldn't DO anything with it.
So it would be wasted effort while negating that which such articles are supposed to do - inform other scientists in a succinct manner.
antialias_physorg
3.7 / 5 (3) Feb 01, 2013
Picking a thousand basic concepts that aren't trivial names or adjectives, and restricting yourself to combinations of three words gives you 997,002,000 possible combinations.

The overwhelming part of which make no sense.

But you know: we already employ a very similar system to the one you propose. We use 26 letters and rearrange them to mean different things. And it works quite well.
Silverhill
5 / 5 (3) Feb 01, 2013
This reminds me of a science cartoon by Nick Downes:
A puzzled-looking man is standing near a professor with the stereotypic chalkboard full of arcana.
The professor is saying, "In layman's terms? I'm afraid I don't know any layman's terms!"
IronhorseA
5 / 5 (3) Feb 01, 2013
There is such a thing as oversimplifying.
Anda
5 / 5 (3) Feb 01, 2013
And I thought there was at least one American enlightened in here...
but of course now I understand, you're not American, antialias
frajo
not rated yet Feb 02, 2013
Communication shouldn't be like that. It should be a speedy way of transferring information without ambiguity.

You describe scientific communication.
But generally, communication is a term that encompasses art, too. The most noble task of art is to convey the ambiguities in everything that is human.
VendicarE
1 / 5 (1) Feb 02, 2013
Does it allow you to create your own dictionary?

t.r.e.e = tall woodie with bush on top.

Mmmmmm... Bush on top....
alfie_null
5 / 5 (2) Feb 02, 2013
You can't force people to write clearly (or well). Some of the worst stuff I have had to read came from authors whose vocabulary is already limited.

I'm reminded of the history of computer languages, which is full of languages created in part to prevent people from writing bad code. Eventually most of us come to realize that people who write bad code will continue to do so regardless of language strictures.
Eikka
1 / 5 (1) Feb 03, 2013
The overwhelming part of which make no sense.


I already addressed that. You still get millions of expressions.

But you know: we already employ a very similar system to the one you propose. We use 26 letters and rearrange them to mean different things. And it works quite well.


That's completely irrelevant. You know that we're talking about meta-concepts and not individual words on their own.

We're talking about words like gear versus Zahnrad, where the former explains nothing and the latter explains at least something to the audience, so they aren't completely ignorant even if they aren't familiar with the term.

And that makes a world of difference in public understanding of sciences, which btw. already use synthesized words to be able to communicate between different sciences. For example "isotropic", which consists of the Latin words "iso" and "tropos". The precise meaning differs from case to case, but it gives you some understanding of what is going on.
Eikka
1 / 5 (1) Feb 03, 2013
The unfortunate thing about scientists is that you have to guess when they're using Latin and when they're using Greek, especially if you don't know either.

The above example for isotropic was in fact Greek, and the word is "trópos" for "way/manner".

So the word isotropic means "in the same manner", which for material sciences means that the material has the same properties in every direction. The opposite would be anisotropic.

Of course this is lost to English speakers who only see the word "isotropic" as a stand-alone concept that you either know, or you don't know. They can't deduce the meaning because they don't speak Greek, and aren't used to words that contain other words.
antialias_physorg
1 / 5 (1) Feb 03, 2013
You can't force people to write clearly (or well).

And it wouldn't make any difference. Using different words doesn't mean that understanding a subject is any easier. The words aren't important. Whether you arrange simple words to mean complex subjects or use a technical term - in any case you have to know beforehand what it means in order to understand what the author tries to express.

Whether it's a "farsee" or a "television" - you still have to learn first what the word MEANS.

And if the Hamming-distance between words is larger then the chance for confusion is lessened. So using few words as a source is counter-productive beyond a certain extent. Language evolves. And it has evolved so that it is useful. If 'farsee' had been useful we would use it.
Eikka
1 / 5 (1) Feb 03, 2013
Imagine what scientific papers would look like if you'd have to provide the full context to everything that way?


Not much different from what they are today.

As I already explained, scientific literature already uses synthesized words based on Greek and Latin, so that anyone who understands these two languages can get a basic idea of what the terms mean, and what everyone else is saying without being an expert on every subject.

For the layman who doesn't speak those languages, these terms appear as jargon because the only way they'll know what they mean is by looking them up in a book and memorizing them.

That's why the ordinary everyday language would benefit from having synthesis as the basis of word-formation, where new terms are based on old established terms, so they become intrinsically understandable. Then you could translate these Greek and Latin terms into everyday speech with good success.
Eikka
1 / 5 (1) Feb 03, 2013
If 'farsee' had been useful we would use it.


That's a logical fallacy and you know it.

The English language is not conducive to such word formation, so we borrow words instead of making them up. That's why the English language recognizes such words as schadenfreude or smorgasbord, or television, or binoculars, or telescope... all of which are synthesized words in their language of origin.

Whether it's a "farsee" or a "television" - you still have to learn first what the word MEANS.


Not necessarily, and that's the point. If someone says "The farsee allows people to transmit information", you can already guess that the information is going to be visual information without knowing what the word "farsee" refers to in specific.

hudres
not rated yet Feb 03, 2013
This takes the concept of "dumbing down" to new heights. A valiant blow for illiteracy. Whoever pays this guy should think twice.
Onathan
not rated yet Feb 03, 2013
This may seem a bit funny, but the story about Up-Goer Five didn't use Up-Goer Five. When you put it into Up-Goer Five, 93 out of the 448 words that are not names are shown as not allowed.
