Languages are windows into the worlds of the people who speak them – reflecting what they value and experience daily.
So perhaps it's no surprise that different languages highlight different areas of vocabulary. Scholars have noted that Mongolian has many horse-related words, that Maori has many words for ferns, and that Japanese has many words related to taste.
Some links are unsurprising, such as German having many words related to beer, or Fijian having many words for fish. The linguist Paul Zinsli wrote an entire book on Swiss-German words related to mountains.
In our recently published study we took a broad approach towards understanding the links between different languages and concepts.
Using computational methods, we identified areas of vocabulary that are characteristic of specific languages, to provide insight into linguistic and cultural variation.
Our work adds to a growing understanding of language, culture, and how the two relate.

Our method
We tested 163 links between languages and concepts, drawn from the literature.
We compiled a digital dataset of 1574 bilingual dictionaries that translate between English and 616 different languages. Since many of these dictionaries were still under copyright, we only had access to counts of how often a particular word appeared in each dictionary.
One example of a concept we looked at was "horse", for which the top-scoring languages included French, German, Kazakh and Mongolian. This means dictionaries in these languages had a relatively high number of:
- words for horses. For instance, Mongolian аргамаг means "a good racing or riding horse".
- words related to horses. For instance, Mongolian чөдөрлөх means "to hobble a horse".
However, it is also possible the counts were influenced by "horse" appearing in example sentences for unrelated terms.
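To make the idea of a per-language score concrete, here is a minimal sketch in Python. The article does not spell out the scoring formula, so the smoothed log-ratio below is an assumption chosen for illustration, not the study's actual method. The language labels, the counts, and the function names (`concept_score`, `top_languages`, `top_concepts`) are all invented.

```python
import math

# Toy data, invented for illustration only.
# concept_counts[lang][concept]: how often the English concept word appears
# in that language's bilingual dictionary.
concept_counts = {
    "lang_a": {"snow": 40, "horse": 5},
    "lang_b": {"snow": 2, "horse": 30},
    "lang_c": {"snow": 3, "horse": 4},
}
# dictionary_sizes[lang]: total number of English word tokens in the dictionary.
dictionary_sizes = {"lang_a": 10_000, "lang_b": 8_000, "lang_c": 12_000}


def concept_score(language, concept, counts, sizes, alpha=1.0):
    """Log-ratio of the concept's relative frequency in one dictionary to its
    relative frequency in all other dictionaries pooled together.

    alpha is an additive smoothing constant (an assumption of this sketch)
    that keeps the ratio finite when a count is zero.
    """
    own_hits = counts[language].get(concept, 0)
    own_rate = (own_hits + alpha) / (sizes[language] + alpha)

    others = [lang for lang in counts if lang != language]
    rest_hits = sum(counts[lang].get(concept, 0) for lang in others)
    rest_size = sum(sizes[lang] for lang in others)
    rest_rate = (rest_hits + alpha) / (rest_size + alpha)

    return math.log(own_rate / rest_rate)


def top_languages(concept, counts, sizes, k=3):
    """Languages ranked by how strongly their dictionary favours the concept."""
    scored = [(lang, concept_score(lang, concept, counts, sizes)) for lang in counts]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


def top_concepts(language, counts, sizes, k=3):
    """Concepts ranked by how characteristic they are of one language."""
    scored = [(c, concept_score(language, c, counts, sizes)) for c in counts[language]]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


if __name__ == "__main__":
    print(top_languages("snow", concept_counts, dictionary_sizes))
    print(top_concepts("lang_b", concept_counts, dictionary_sizes))
```

Comparing rates rather than raw counts matters because dictionaries differ greatly in size: a large dictionary would otherwise score highly for almost every concept. Because only counts are used, a sketch like this cannot tell a headword about horses apart from an example sentence that happens to mention one.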

Not a hoax after all?
Our findings support most links previously highlighted by researchers, including that Hindi has many words related to love and Japanese has many words related to obligation and duty.
We were especially interested in testing the idea that Inuit languages have many words for snow. This notorious claim has long been distorted and exaggerated. It has even been dismissed as the "great Eskimo vocabulary hoax", with some experts saying it simply isn't true.
But our results suggest the Inuit snow vocabulary is indeed exceptional. Out of 616 languages, the language with the top score for "snow" was Eastern Canadian Inuktitut. The other two Inuit languages in our dataset (Western Canadian Inuktitut and North Alaskan Inupiatun) also achieved high scores for "snow".
The Eastern Canadian Inuktitut dictionary in our dataset includes terms such as kikalukpok, which means "noisy walking on hard snow", and apingaut, which means "first snow fall".
The top 20 languages for "snow" included several other languages of Alaska, such as Ahtena, Dena'ina and Central Alaskan Yupik, as well as Japanese and Scots.
Scots includes terms such as doon-lay, meaning "a heavy fall of snow", feughter, meaning "a sudden, slight fall of snow", and fuddum, meaning "snow drifting at intervals".
You can explore our findings using the tool we developed, which allows you to identify the top languages for any given concept, and the top concepts for a particular language.

Language and environment
Although the languages with top scores for "snow" are all spoken in snowy regions, the top-ranked languages for "rain" were not always from the rainiest parts of the world.
For instance, southern Africa has a medium level of rainfall, but languages from this region, such as Nyanja, East Taa and Shona, have many rain-related words. This is probably because, unlike snow, rain is important for human survival – which means people still talk about it in its absence.
For speakers of East Taa, rain is both relatively rare and desirable. This is reflected in terms such as lábe ||núu-bâ, an "honorific form of address to thunder to bring rain" and |qába, which refers to the "ritual sprinkling of water or urine to bring rain".
Our tool can also be used to explore various concepts related to perception ("smell"), emotion ("love") and cultural beliefs ("ghost").
The top-scoring languages for "smell" include a cluster of Oceanic languages, among them Marshallese, which has terms such as jatbo, meaning "smell of damp clothing", meļļā, meaning "smell of blood", and aelel, meaning "smell of fish, lingering on hands, body, or utensils".
Prior to our research, the smell terms of the Pacific Islands had received little attention.

Some caveats
Although our analysis reveals many interesting links between languages and concepts, the results aren't always reliable – and should be checked against original dictionaries where possible.
For example, the top concepts for Plautdietsch (Mennonite Low German) include von ("of"), den ("the") and und ("and") – all of which are unrevealing. We excluded similar words from other languages using Wiktionary, but our method did not filter out these common words for Plautdietsch.
Also, the word counts reflect both dictionary definitions and other elements, such as example sentences. While our analysis excluded words that are especially likely to appear in example sentences (such as "woman" and "father"), such words could still have influenced our results to some extent.
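The two kinds of exclusion described above can be sketched as a simple filtering step applied to a ranked list of concepts. This is an illustrative sketch only: the function name `filter_concepts`, the placeholder concepts, and the scores are invented, and the word lists contain only the examples mentioned in this article, not the lists the study actually used.

```python
def filter_concepts(ranked, function_words, example_prone):
    """Drop entries whose concept is a known function word or a word that
    tends to show up in example sentences. Comparison is case-insensitive."""
    excluded = {w.lower() for w in function_words} | {w.lower() for w in example_prone}
    return [(concept, score) for concept, score in ranked if concept.lower() not in excluded]


# Function words of the kind that slipped through for Plautdietsch.
function_words = {"von", "den", "und"}
# Words the article names as especially likely to appear in example sentences.
example_prone = {"woman", "father"}

# Invented ranking: (concept, score) pairs with placeholder concept names.
ranked = [
    ("von", 3.1),
    ("den", 2.9),
    ("concept_a", 2.4),
    ("und", 2.2),
    ("woman", 1.8),
    ("concept_b", 1.5),
]

if __name__ == "__main__":
    print(filter_concepts(ranked, function_words, example_prone))
```

A filter like this is only as good as its word lists, which is one reason a language whose function words are missing from the list can end up with unrevealing top concepts.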
Most importantly, our results run the risk of perpetuating potentially harmful stereotypes if taken at face value. So we urge caution and respect while using the tool. The concepts it lists for any given language provide, at best, a crude reflection of the cultures associated with that language.
Charles Kemp, Professor, School of Psychological Sciences, The University of Melbourne; Ekaterina Vylomova, Lecturer, Computing and Information Systems, The University of Melbourne; Temuulen Khishigsuren, PhD Candidate, The University of Melbourne, and Terry Regier, Professor, Language and Cognition Lab, University of California, Berkeley
This article is republished from The Conversation under a Creative Commons license.