Friday, March 29, 2024

What’s Wrong With Google Dictionary?

My birthday was December 3rd. One of my gifts this year was the Oxford English Dictionary on CD-ROM. (CD-ROM technology is dated, but it’s by far the cheapest way to own the best dictionary in the world.) I’m having problems with the software, which chokes during installation. Tech support said my problem is a well-known error, and told me to just keep installing it until it works. Sigh!

On that same day, Google gave the world a lesser gift: Google Dictionary.

The Oxford English Dictionary on CD-ROM and Google Dictionary are in fact opposites. One is clunky, barely functional, overly complex software that provides incredibly great content, and the other is sleek, fast and advanced software that serves up junk content.

The media reported the news with the kind of passivity and acquiescence we’ve come to expect, applauding Google for rolling out a dictionary that’s “fast,” and has “cool features.”

Journalists are supposed to be skeptical wordsmiths, but the reporting on Google Dictionary unmasked most of them for what they really are: corporate shills. Reporters expressed huge concern over the fate of online dictionary competitors, and no concern over the fate of the English language.

What Is Google Dictionary?

Google search gets hundreds of millions of search queries each day. Most single-word searches on Google offer a “definition” link prominently on the top right of the results page. Click on the link, and you get a dictionary entry for the word.

Until last week, the link took you to an entry provided by Answers.com. Now it takes you to a new service called Google Dictionary.

The dictionary provides definitions, parts of speech, pronunciation and synonyms. It also shows phrases that use the word, followed by what Google calls “Web definitions,” which come from Wikipedia and other online sources.

The Google Dictionary service used to be called Google Translate. It offers dictionaries and translation features for 28 languages: from English to French to German to Chinese to Hindi to Kannada and so on. (Presumably Kannada is what they speak in Canada.)

As a Web app, Google Dictionary is truly great. It’s fast, clean, functional, and well linked to relevant pages both within the service and outside of it. It retains your recent searches in a list.

What’s Wrong With Google Dictionary?

Unlike real dictionaries, which are transparent to a fault, Google Dictionary hides important information from the user. For example, Google Dictionary provides a main list of definitions. Where do they come from? How were they created, by whom and with what process?

Since Google won’t tell you, I’ll try to.

Google appears to get most or all of its definitions, as well as its synonyms, antonyms and pronunciations, from the Collins COBUILD Advanced Learner’s English Dictionary.

Have you ever even heard of Collins COBUILD Advanced Learner’s English Dictionary? I hadn’t. Those of us who grew up speaking English as our primary language would never have any reason to. A learner’s dictionary is specifically designed for foreign-language speakers learning English. It’s an ESL (English as a Second Language) reference tool.

The difference between a learner’s dictionary and one written for native speakers is clearly detailed in the Wikipedia entry for “Advanced Learner’s Dictionary”:

“A learner’s dictionary is intended for non-native speakers who want information about the meaning and usage of words and phrases. Such dictionaries focus on current meanings, omitting outdated uses; etymology, a staple of standard dictionaries, is also usually omitted. All headwords are explained in uncomplicated language, typically using a core defining vocabulary of some 3,000 words, thus making the definitions more digestible to learners. There are many example phrases and sentences, but no quotations.”

Just like that: Gone are outdated uses, etymology and quotations, which to me are the best ways to understand the nuanced differences between similar words, and to truly understand the full context of words.

In fact, it’s precisely these historical components of words that make the great dictionaries so great. The Oxford English Dictionary isn’t the best English language dictionary because of its primary definitions and pronunciations, but because of its “outdated uses,” etymology and quotations, which are unparalleled.

What’s happening is that the Collins COBUILD Learner’s Dictionary, designed for foreign language speakers, has been elevated to a vastly higher profile than, say, the Oxford English, American Heritage and Merriam-Webster dictionaries.

The word “COBUILD” in the dictionary’s title stands for “Collins Birmingham University International Language Database.” That means definitions are largely computer-generated, or at least that the crafting of definitions was computer-assisted, drawing on a wide variety of TV shows, newspapers, books and other sources. That makes me suspect that Google may want to take over the project and develop techniques for automating the process of gleaning definitions from popular culture.

Not only is Collins COBUILD Advanced Learner’s English Dictionary not the best English dictionary, it’s not even the best learner’s dictionary, according to many online sources. According to one review of learner’s dictionaries, it’s the lowest rated of the five dictionaries reviewed, earning a solid F for foreign students learning American English (the second lowest grade was a C+). The reviewer summarized thus: “If dictionaries were cars, the 6th edition COBUILD would be a wheelbarrow.”

From a culture wars perspective, what’s happening here is that the numbers people (engineers at Google) are choosing the world’s dictionary, instead of the word people. And the engineers don’t seem to appreciate the importance of a high-quality dictionary. It would be as if lexicographers got to choose which programming language all software engineers — including those at Google — used by default. (“We choose Visual Basic.NET. So easy to use!”)

No wonder Google hides the origins of Google Dictionary definitions! But the obfuscation continues. Weirdly, Google Dictionary has no key, or place where abbreviations and symbols are explained, at least none that I could find. For example, one of the primary purposes of a dictionary is to serve as a guide to pronunciation. Google Dictionary has no place where users can find out what its phonetic symbols mean. In fact, its notation is based on A. C. Gimson’s phonemic system, which uses symbols from the International Phonetic Alphabet and is fairly standard.

The actual Collins COBUILD Advanced Learner’s English Dictionary provides a key or guide to the pronunciation symbols, as do all real dictionaries. Google assumes we’ve all memorized this system, or know what it is and where to look it up.

Let’s test their assumption. You’re far more educated than the average Google user. How do you pronounce this? ˈɒbfʌskeɪt (The word is “obfuscate,” but without a key, the pronunciation symbols are useless to most users.)

It’s just as well. The pronunciations, if you could understand them, would probably do more harm than good. The Collins dictionary glosses over international, regional and even contextual variations in pronunciation, providing one generic, international pronunciation for every word.

Parts of speech are helpfully included in a lighter color, but they’re not explained, either. Many users may not know what “N-UNCOUNT” means, for example, and Google doesn’t say.

Why isn’t any of this specified? Google’s attitude appears to be, “don’t worry your pretty little head about all that hard academic stuff. Just accept all this at face value.”

Google’s transgression here isn’t the switch from Answers.com, which also sucked, or the desire to offer a dictionary that it controls. It’s not even the elevation of a two-bit, third-rate dictionary for foreigners learning English to the position of being the most influential and powerful language resource in human history.

The problem is opacity and exclusivity. Why the secrecy? Why doesn’t Google reveal its source for the main definitions? Why doesn’t it provide a key for understanding cryptic symbols and abbreviations? Why doesn’t it at least link to real, high-quality dictionaries crafted for the purpose of fully understanding words?

Google Dictionary is automatically authoritative, simply because it’s on Google. These definitions, for example, will be where the vast majority of students in English-speaking countries get their word definitions. A dictionary designed very specifically for one purpose, educating ESL students, will be used by untold millions for an entirely different purpose: understanding their own language.

There’s no way around it. Google Dictionary represents a colossal dumbing down of the English language (and presumably others). None of us should rely on it for our everyday understanding of the English language, but we will.

Well, most of us will. I’ll be using the Oxford English Dictionary — if I can ever get the damned thing installed.
