How related are South Asians?

I’ve been geeking out on genetics recently and I came across this Wikipedia page:

It lists the results of several studies that sampled various ethnic / linguistic / geographic groups in India and the haplogroups they belong to. I’m not going to pretend I knew what a haplogroup was until recently, so I’ll quickly explain it:

A haplogroup is basically a set of genetic markers that show up together frequently in a given population. The ones in these studies come from the Y chromosome, so the presence of one group or another indicates a common male ancestor. I gather it’s more complicated than this, but that’s what I understand.

I took the data from some of the groups, and shoved it into a spreadsheet to see how related these different (and often acrimonious) communities are genetically:

These are not apples-to-apples comparisons, and the datasets have wildly varying margins of error.
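For anyone who wants to try this outside a spreadsheet, here is a minimal sketch of one way to compare groups: a Pearson correlation between their haplogroup frequency profiles. This is an assumption about what "how related" could mean, not necessarily the exact calculation I ran, and the group names and frequencies below are made up purely for illustration. They are not taken from any study.

```python
from math import sqrt

# Hypothetical haplogroup frequencies (share of sampled men per haplogroup).
# These numbers are invented for illustration; they are NOT real study data.
HAPLOGROUPS = ["H", "J2", "L", "O", "R1a", "R2"]
FREQS = {
    "group_a": [0.20, 0.10, 0.10, 0.00, 0.45, 0.15],
    "group_b": [0.25, 0.10, 0.15, 0.00, 0.35, 0.15],
    "group_c": [0.10, 0.00, 0.05, 0.70, 0.05, 0.10],
}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / sqrt(var_x * var_y)


def correlation_table(freqs):
    """Correlation for every ordered pair of groups, keyed by (name, name)."""
    names = list(freqs)
    return {(a, b): pearson(freqs[a], freqs[b]) for a in names for b in names}


if __name__ == "__main__":
    table = correlation_table(FREQS)
    for (a, b), r in table.items():
        if a < b:  # print each unordered pair once
            print(f"{a} vs {b}: r = {r:+.2f}")
```

A value near +1 means two groups have similar haplogroup profiles, near 0 means little relationship, and a negative value means the haplogroups common in one group tend to be rare in the other. With only a handful of haplogroups and small samples, these numbers are rough at best.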

A couple things that are interesting:

  • There is a significant difference between low and high caste people, but less between middle and high.
  • The difference between Indo-European Indians and Dravidians is actually about the same as the high/low caste difference across the entire subcontinent.
  • There is almost no correlation between Indo-European (IE) and Munda people, despite thousands of years of cohabitation across broad stretches of east and north-east India.
  • The Tharu people have an inverse correlation to Indo-Europeans across most haplotypes, even though they are mostly linked to them linguistically and in religion. As a side note, they are known for carrying a thalassemia trait that offers some protection against malaria (thalassemia is an inherited hemoglobin disorder, like sickle cell, but a distinct one). Weird…

Okay, I hope that was mildly interesting. It was fun interneting for me.

Originally published on May 28, 2015.

Jacob Singh
CTO in residence at Sequoia Capital. Independent product and engineering coach. Mediocre guitarist, singer, rock climber, point guard and baker. Dedicated dad. American in New Delhi.