Happy Sunday! I am supposed to be working on something else. But I am here procrastinating on writing by writing. I was going through some key genetics papers from the past on the topic of drug target discovery, and I couldn’t resist musing on a clever trick scientists used in the early days to get their hands on some of the major discoveries.
I was rereading this gem of a paper from Jonathan Cohen, Helen Hobbs and team at the University of Texas, who first found that loss-of-function variants in PCSK9 reduce blood cholesterol. Their work built on an earlier study by Catherine Boileau and team at Inserm, Paris, who found that gain-of-function mutations in PCSK9 increase blood cholesterol.
Every time I think about this work, I am amazed that they landed on a breakthrough discovery by studying merely 128 individuals, more than 50% of whom just happened to be African Americans. One other factor that helped the discovery (which I've read about before, but which only hit me recently) is their clever idea of sampling individuals from the extremes of the LDL distribution. They deliberately chose study participants who had extremely low LDL cholesterol to increase their chances of discovering large-effect rare variants. It amazes me that this concept was already in use in the 2000s, when the field was just beginning to understand the genetic architecture of complex traits.
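To get a feel for why sampling the tail helps, here is a minimal simulation sketch. It is not the authors' analysis: the carrier frequency (2%), the effect size (-1.5 standard deviations), the 5% tail cutoff, and the function name `simulate_tail_enrichment` are all made-up assumptions chosen purely for illustration.

```python
import random


def simulate_tail_enrichment(n=100_000, carrier_freq=0.02, effect_sd=-1.5,
                             tail_fraction=0.05, seed=0):
    """Toy model: a standardized LDL-like trait plus one rare large-effect variant.

    All parameter values are hypothetical; they are not taken from any paper.
    Returns (carrier fraction overall, carrier fraction in the lower tail).
    """
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        carrier = rng.random() < carrier_freq
        # Background trait is standard normal; carriers are shifted downward.
        trait = rng.gauss(0.0, 1.0) + (effect_sd if carrier else 0.0)
        people.append((trait, carrier))

    # Keep only the individuals with the lowest trait values.
    people.sort(key=lambda p: p[0])
    tail = people[: int(n * tail_fraction)]

    overall = sum(c for _, c in people) / n
    in_tail = sum(c for _, c in tail) / len(tail)
    return overall, in_tail


overall, in_tail = simulate_tail_enrichment()
print(f"carriers overall: {overall:.2%}")
print(f"carriers in lowest 5%: {in_tail:.2%}")
print(f"fold enrichment: {in_tail / overall:.1f}x")
```

Under these toy assumptions, carriers are several times more common in the lower tail than in the population as a whole, so sequencing a small number of extreme individuals can surface variants that would need a far larger random sample to find.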
The Cohen and Hobbs team may well have bagged a lot of low-hanging fruit using this trade secret. In 2004, just a year before publishing the PCSK9 findings in Nature Genetics, they published another great piece of work in Science. There they sequenced Mendelian genes associated with HDL cholesterol (ABCA1, APOA1, and LCAT) in individuals from the lower tail of the HDL distribution and found that large-effect deleterious coding variants are more common in people with extremely low HDL cholesterol in the population.
Another great example of a discovery made by studying phenotypic extremes is the association of BCL11A with fetal hemoglobin levels in adults, a discovery that was successfully translated into a life-saving medicine last year. Menzel et al., published in Nature Genetics in 2007, is one of the two publications that reported the BCL11A discovery. The authors sampled 179 individuals from both extremes (<5th and >95th percentiles) of the blood HbF level distribution in the population and landed on a strong trans-QTL signal near the BCL11A gene that influenced fetal hemoglobin levels. This eventually led to the discovery of the molecular on/off switch behind the fetal-to-adult hemoglobin transition after birth.
Coming back to our original topic: two factors contributed to the Cohen and Hobbs team's successful discovery, a diverse sample and extreme phenotypes. I am not sure whether studying a sample that was >50% African American was part of the plan, or whether it just so happened that African Americans were enriched at the lower end of the cholesterol distribution. Either way, it worked out great. It turned out that two nonsense mutations in the PCSK9 gene are severalfold enriched in individuals of African ancestry. The combined frequency of these two variants in African Americans in the US is 1.8%, which rises to 11% in the sample studied by Cohen et al., enabling them to identify 7 carriers among 64 samples. This is in stark contrast to European ancestry groups, where the combined frequency of loss-of-function variants was <0.1%.
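As a quick back-of-the-envelope check on those numbers, the snippet below recomputes the fraction of carriers in the low-LDL sample and compares it with the population figure. It assumes, for simplicity, that both the 1.8% and the 7-in-64 figures can be treated as the same kind of proportion; the variable names are just for illustration.

```python
carriers_in_sample = 7       # carriers found among the low-LDL samples
samples = 64                 # samples screened
population_freq = 0.018      # combined frequency quoted in the text (1.8%)

# Fraction of carriers in the extreme-phenotype sample.
sample_freq = carriers_in_sample / samples

# How much more common carriers are in the sample than in the population.
fold_enrichment = sample_freq / population_freq

print(f"sample frequency: {sample_freq:.1%}")        # 10.9%
print(f"fold enrichment: {fold_enrichment:.1f}x")    # 6.1x
```

So picking from the low-LDL tail made carriers roughly six times easier to find than they would have been in a random sample of the same size.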
This work served as the foundation for their subsequent paper published in NEJM in 2006, considered a landmark in the drug development field. Having identified the nonsense mutations that are enriched in African Americans, the Cohen and Hobbs team genotyped those variants in 3363 African American participants of the Atherosclerosis Risk in Communities (ARIC) study and showed the impressive cardioprotection conferred by PCSK9 nonsense mutations. That result inspired two companies, Regeneron and Amgen, to enter the race to develop antibody-based PCSK9 inhibitors for the treatment of hypercholesterolemia, with both landing FDA approvals at almost the same time in 2015.