Genes and Language Group makes strides

May 15, 2016

The Genes and Language Group was established to identify aspects of linguistic variation that are not merely superficial (i.e., the result of chance events in the course of language change or of cultural-historical factors) but are rooted in the constraints of human cognition and hence may have genetic correlates.

During the 2015-2016 academic year, the group expanded to include six faculty members (Chuanzhu Fan, Haiyong Liu, Geoffrey Nathan, Ljiljana Progovac, Natalia Rakhlin, Martha Ratliff) and four graduate students (Xiayimaierdan Abdushalamu, Feng Tao, Yamei Wang, Jinhan Yu) from several fields, including linguistics, anthropology, modern and classical languages, communication sciences and disorders, and biology.

This year, group members discussed key readings on the genetic bases of language variation and planned their investigation of the relationship between language typology and genetic variation. Their goal is to search for meaningful, statistically significant associations between points of language variation and inter-population differences in allele frequency for single-nucleotide polymorphisms (SNPs) and haplotypes of certain genes.

The Genes and Language Group is interested in documenting these correlations when they are not attributable to shared geography or history but may have arisen independently in diverse populations. The idea is that the presence of certain genetic variants may confer on a carrier a cognitive bias that would have a subtle but important effect on language processing in individual children. This subtle difference, over many generations in the course of historical language change, would translate into certain linguistic features becoming established in the languages spoken by populations that have a high frequency of the allele associated with the language processing bias in question.

This is an ambitious goal, and very little research on the question has been published to date. The work the group has undertaken so far is only a first step in a long-term research program.

Accomplishments to report

Identified two public databases of genetic data containing information on allele frequency in anthropologically defined populations: the Allele Frequency Database (ALFRED), maintained by Yale University, and the "1000 Genomes" project database, maintained by the International Genome Sample Resource (IGSR).
Identified a list of published candidate "language genes."
Identified a list of diverse languages from all major geographic regions of the world that overlap with the populations represented in the genetic databases; these languages will be the focus of the investigation.
Created a list of linguistic parameters (points of broad variation in the sound systems, grammatical organization, and lexical packaging of meaning across languages) to be examined in the languages of interest, such as the use of linguistic tone, ergative case pattern, accusative case pattern, reduplication, large consonant inventories, uncommon vowels, and semantic or formal gender.
Began populating a shared database: each member of the group is researching a set of features for the languages of interest and entering a binary value for each feature (1 if the feature is present in a given language and 0 if it is absent).
Hired a graduate student assistant from the biology department to enter genetic data into the database and to conduct statistical analyses that look for meaningful correlations between linguistic and genetic variation.
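
To make the last two items concrete, here is a minimal sketch in Python of how one binary linguistic feature might be compared with an allele frequency across populations. It is an illustration under stated assumptions, not the group's actual method: the population values are invented placeholders, the names `has_tone`, `allele_freq`, `pearson`, and `permutation_p_value` are hypothetical, and the choice of a Pearson (point-biserial) correlation with a permutation test is an assumption about one reasonable way to run such an analysis.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation; with a 0/1 variable this is the point-biserial r."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(feature, freqs, n_perm=2000, seed=0):
    """Two-sided p-value: how often does a random relabeling of the feature
    give a correlation at least as strong as the observed one?"""
    rng = random.Random(seed)
    observed = abs(pearson(feature, freqs))
    shuffled = list(feature)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point ties.
        if abs(pearson(shuffled, freqs)) >= observed - 1e-12:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical placeholder data: one entry per population.
# 1 = the language spoken by the population has the feature, 0 = it does not.
has_tone = [1, 1, 1, 0, 0, 0, 1, 0]
# Invented frequencies of one candidate allele in the same populations.
allele_freq = [0.8, 0.7, 0.9, 0.2, 0.3, 0.1, 0.6, 0.4]

r = pearson(has_tone, allele_freq)
p = permutation_p_value(has_tone, allele_freq)
print(f"point-biserial r = {r:.3f}, permutation p = {p:.3f}")
```

A simple test like this treats populations as independent data points. Because the group is specifically interested in correlations that are not attributable to shared geography or history, a real analysis would also need to control for relatedness among populations and languages, which this sketch does not attempt.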
