Sunday, May 1, 2011

Looking at words differently using the Google Books Ngram Viewer















In December 2010 Google released a new software tool called the Ngram Viewer. An n-gram is a sequence of n words, and the viewer handles phrases of one to five words. It lets you look at the relative frequency of a word or phrase over the time period from before 1800 to 2000. Data for the viewer come from a huge sample of five million books from the collection that Google has scanned and digitized. You can read a tutorial here. There’s a longer article about this new topic of culturomics.
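To make the idea of relative frequency concrete, here is a minimal Python sketch. It assumes that the relative frequency of a phrase is its count divided by the total number of phrases of the same length in the text. The function names and the sample sentence are made up for illustration; the real viewer works year by year on its own book data and may differ in the details.

```python
from collections import Counter


def ngrams(words, n):
    """Return every run of n consecutive words as a space-joined string."""
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def relative_frequency(phrase, text):
    """Share of all n-grams of the phrase's length that match the phrase.

    Assumption: simple lowercase, whitespace-split tokens. This is an
    illustration, not the viewer's actual method.
    """
    words = text.lower().split()
    n = len(phrase.split())
    grams = ngrams(words, n)
    if not grams:
        return 0.0
    return Counter(grams)[phrase.lower()] / len(grams)


# Made-up sample sentence, not data from Google Books.
sample = "the speaker stood behind the lectern on the podium"
print(relative_frequency("the", sample))          # 1-gram
print(relative_frequency("the lectern", sample))  # 2-gram
```

Dividing by the total keeps the numbers comparable between a year with few books and a year with many, which is why the graphs show percentages rather than raw counts.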

Back in February I blogged about “What are you standing on (or behind)?” and mentioned the words rostrum, lectern, and podium. A graph for the word rostrum is shown above. Click on it to view a larger version. Frequency of use for rostrum peaked back in 1890, then fluctuated, and has been declining steadily since 1970.













We can see more by comparing the words rostrum, lectern, and podium. Back in 1950 podium became more popular than lectern, and starting in 1980 it also became a more popular term than rostrum.










Vacuum tubes were replaced by transistors, and integrated circuits came later. Then why does the word transistors show up between 1900 and 1910? It’s an error: the optical character recognition software misread the word translators. I had to check in the Google Books database to find that little glitch.













When we compare five words, things get even more interesting. Look at the comparison between different types of engineers, which start at different times but fluctuate similarly.













Also, look at what happens to the names of the Intermountain states.
