Tracing the First Speakers of Indo-European: Ancient DNA Challenges Old Theories
A Language Family That Spanned Continents
Nearly half of the world’s population speaks a language that traces back to a single ancestral tongue. From English to Hindi, Russian to Farsi, hundreds of languages, known collectively as Indo-European, share deep linguistic roots. For more than two centuries, scholars have debated who first spoke this ancient language and where it originated. A recent study published in Nature (Lazaridis et al., 2025) takes a fresh look at this long-standing question, using ancient DNA to propose that the first Indo-European speakers were a group of hunter-gatherers who lived in what is now southern Russia about 6,000 years ago.
The study, led by Harvard geneticist David Reich with an international team of researchers, points to a previously unknown population, the Caucasus-Lower Volga (CLV) people, as the possible source of the Indo-European languages.
The Search for Proto-Indo-European
The idea that many modern languages stem from a single ancestor was first proposed in 1786 by British judge and philologist William Jones, who noticed remarkable similarities between Sanskrit, Latin, and Greek. Over the next two centuries, linguists reconstructed what they called Proto-Indo-European, the hypothetical mother language spoken thousands of years ago. But where, and by whom, this language was spoken remained an open question.
Two major competing theories have dominated the debate. The "Steppe Hypothesis" argues that Proto-Indo-European was spoken by nomadic herders in the Pontic-Caspian steppe, an area that stretches from Ukraine to Kazakhstan. These herders, known as the Yamnaya, are thought to have spread both their genes and their language across Europe and Asia as they migrated about 4,500 years ago.
The competing "Anatolian Hypothesis," first proposed by archaeologist Colin Renfrew in 1987, suggests that the language originated much earlier, around 8,000 years ago, among early farmers in Anatolia (modern-day Turkey). According to this model, Proto-Indo-European spread gradually through agricultural expansion rather than rapid migration.
The new genetic evidence complicates both theories.
A Newly Discovered Ancestral Population
By analyzing DNA from 435 ancient skeletons across Eurasia, Reich and his colleagues identified the Caucasus-Lower Volga (CLV) people, who lived north of the Caucasus Mountains between 4,400 and 4,000 BCE. Their genetic signatures appear in populations that later split into two major groups:
One group moved west into Ukraine, where they mixed with local hunter-gatherers and became the Yamnaya.
The other group moved south into Anatolia, where they interbred with early farmers.

This discovery suggests that Indo-European languages may have originated with the CLV people before being carried westward by their Yamnaya descendants and southward by the CLV group that moved into Anatolia and mixed with the farmers there.
How Language and Genes Moved Together
Archaeological evidence has long suggested that the Yamnaya were highly mobile, spreading across vast distances in a short period. Previous DNA studies showed that by 4,500 years ago, a large portion of central and northern Europeans had Yamnaya ancestry, indicating widespread migration. The new study builds on this by linking these migrations to the spread of Indo-European languages.
The researchers propose that the Indo-European languages spoken in Europe today—including Germanic, Slavic, and Romance languages—descend from Yamnaya expansions. Meanwhile, the early Indo-European languages of Anatolia, such as Hittite, may have come from the southward movement of the CLV people rather than from steppe migrations.
Guus Kroonen, a linguist at Leiden University, called the study "a very intelligent scenario that’s difficult to criticize."
However, some researchers remain skeptical.
“Genes don’t tell us anything about language, period,” said Mait Metspalu, a population geneticist at the University of Tartu in Estonia.
Others, like Paul Heggarty of the Pontifical Catholic University of Peru, argue that Indo-European languages may have much older roots in the Fertile Crescent, aligning more closely with the Anatolian Hypothesis.
Revisiting the Indo-European Debate
The genetic findings challenge earlier assumptions but do not settle the debate. Indo-European may have developed within a network of interacting cultures rather than from a single "homeland." Some scholars suggest that rather than a single moment of linguistic origin, the language family evolved over centuries of contact between different groups.

One thing the research does make clear is that Indo-European languages did not emerge in isolation. They spread alongside people, technologies, and ways of life—whether through pastoralist migrations or agricultural expansions.
Dispelling Myths of “Aryan Purity”
For much of the 19th and early 20th centuries, discussions about Indo-European origins were tainted by racist pseudoscience. European nationalists promoted the idea of an “Aryan race” descending from a superior, pure bloodline of Indo-European speakers. The Nazis later weaponized these ideas, distorting linguistic research to justify genocide.
The new genetic data dismantles any notion of Indo-European "purity." Ancient DNA shows a complex history of migration and mixing across Eurasia.
"There’s all sorts of mixtures and movements from places that these myths never imagined," said Reich. "It really teaches us that there’s no such thing as purity."
Where Does the Research Go Next?
The new study has added another layer to an already complex debate. Researchers will continue analyzing ancient DNA, archaeological sites, and linguistic patterns to refine the picture of how Indo-European languages spread. The next step may involve studying populations further east, in Iran and Central Asia, to see how they fit into this broader story.
As genetic technology advances, the search for the origins of the world’s most widely spoken language family continues. Whether it began with nomads on the Eurasian steppes or farmers in Anatolia, one thing is certain: Indo-European languages—and the people who spoke them—moved, mixed, and adapted in ways that shaped human history.
Related Research
Haak, W., Lazaridis, I., Patterson, N., et al. (2015). Massive migration from the steppe was a source for Indo-European languages in Europe. Nature, 522(7555), 207–211. https://doi.org/10.1038/nature14317
Heggarty, P., Anderson, C., et al. (2023). Language trees with sampled ancestors support a hybrid model for the origin of Indo-European languages. Science, 381(6656), eabg0818. https://doi.org/10.1126/science.abg0818
Anthony, D. W. (2007). The Horse, the Wheel, and Language: How Bronze-Age Riders from the Eurasian Steppes Shaped the Modern World. Princeton University Press.
Lazaridis, I., Patterson, N., Anthony, D., Vyazov, L., Fournier, R., Ringbauer, H., Olalde, I., Khokhlov, A. A., Kitov, E. P., Shishlina, N. I., Ailincăi, S. C., Agapov, D. S., Agapov, S. A., Batieva, E., Bauyrzhan, B., Bereczki, Z., Buzhilova, A., Changmai, P., Chizhevsky, A. A., … Reich, D. (2025). The genetic origin of the Indo-Europeans. Nature, 1–11. https://doi.org/10.1038/s41586-024-08531-5