Sasha Gusev (@sashagusevposts.bsky.social) reply parent
very relevant, thank you!
Statistical geneticist. Associate Prof at Dana-Farber / Harvard Medical School. www.gusevlab.org
6,350 followers 678 following 758 posts
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
That's a great idea, and looks something like the below. I'm working on the size scaling a bit more to convey the point more clearly
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Yeah I was thinking about this initially and unable to find any examples in the literature where epistasis tagged by additivity versus "pure" additivity mattered. I think the response really is just about narrow-sense h2 (as in the Breeder's Equation).
Sasha Gusev (@sashagusevposts.bsky.social)
Great mini thread here on lessons for GxG and GxE from non-human organisms:
Ewan Birney (@ewanbirney.bsky.social) reposted reply parent
Nice blog and good to see this also from the twins/shared environment side. We (with my colleagues in @wittbrodtlab.bsky.social) have tried to tackle the non-additive in experimental settings (in medaka fish) which we can map to human (as the medaka fish are "wild") www.biorxiv.org/content/10.1...
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Yeah, uncentered effects make more sense to me biologically but of course you could design a statistical generative model where the interaction is purely non-additive.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
great commentary article (and thanks for the thread!)
Giacomo Bignardi (@giacomobignardi.bsky.social) reposted
🧵 on cool choices of image headers (and titles) all from @sashagusevposts.bsky.social's Substack 🔗 at the end 'Beneath the surface of the sum' Genetic interactions may look like the thing they deviate from Interaction (1964) by Julian Stanczak
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Yeah, in general multi component models will only take the part of the interaction that is not captured by additivity. The twin ADE model partitions them correctly but ONLY if there's no shared environment (and other assumptions hold).
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Sure thing, see this gist. Note that this is always the effect allele frequency and not the minor allele frequency and there's no genotype scaling. Let me know if you think there's a better presentation.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Everyone is free to make their own choices, but I personally think what platform you use has negligible impact on the influence of wealthy and powerful people in the US, and far far less than being heard on important topics.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
:) I still post on twitter as well
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
So the mystery remains. It is tempting to conclude that biological epistasis is widespread but mostly gets mapped to statistical additivity. But that does not explain the deviations from additivity often observed in twins. /x
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
There is no statistical power to identify individual rare var interactions. But interactions between rare variants and common polygenic scores can be tested and show ... nothing. This is genuinely surprising: large deleterious effects simply add up with common polygenic burden.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
So do we see non-additive epistasis in real data? For common variants, not really. Dominance has largely been ruled out, and various attempts to estimate components of pairwise epistasis have found little to none. Could it be hiding among rare variants?
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Finally, where epistasis really matters is in the context of interventions. If non-additive epistasis is substantial, then acting on the additive genetic mechanisms will frequently produce highly unexpected traits in individuals. Try to move a trait down but it actually goes up:
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
What about natural selection? For the short-term response, all that matters is additive/narrow-sense heritability. But the long-term response can be very different depending on whether epistasis is aligned with or against fitness:
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
That's at the variant level, but Zuk et al. 2012 proposed a model where a trait is defined by the lowest value of multiple heritable pathways ("each unhappy family is unhappy in its own way"). This *induces* epistasis and can similarly severely distort heritability estimates.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Less appreciated is the fact that this inflation has to go ~somewhere~, so it also leads to a corresponding deflation in the estimate of the shared environment. The same twin correlations can arise if [A]dditivity=74% and [C]ommon-Env=12% or from [A]=20%, [C]=30%, [D]=35%.
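A minimal sketch of the arithmetic behind this point, assuming the textbook twin expectations r_MZ = A + C + D and r_DZ = A/2 + C + D/4 (with D treated as dominance-like non-additive variance) and Falconer-style ACE estimates. The function names are illustrative and not from the post.

```python
def twin_correlations(a, c, d=0.0):
    """Expected MZ and DZ twin correlations under textbook ACE/ADE assumptions."""
    r_mz = a + c + d                # MZ twins share all A, C, and D
    r_dz = 0.5 * a + c + 0.25 * d   # DZ twins share half of A and a quarter of D
    return r_mz, r_dz


def falconer_ace(r_mz, r_dz):
    """ACE estimates that ignore D: A = 2(rMZ - rDZ), C = 2 rDZ - rMZ."""
    a_hat = 2.0 * (r_mz - r_dz)
    c_hat = 2.0 * r_dz - r_mz
    return a_hat, c_hat


# Scenario 1: purely additive genetics plus shared environment.
r_mz_ace, r_dz_ace = twin_correlations(0.74, 0.12)

# Scenario 2: much less additivity, more shared environment, plus non-additivity.
r_mz_acde, r_dz_acde = twin_correlations(0.20, 0.30, 0.35)

# Fitting a plain ACE model to scenario 2 recovers something close to scenario 1.
a_hat, c_hat = falconer_ace(r_mz_acde, r_dz_acde)
```

Under these assumptions the two scenarios give nearly the same correlations (about 0.86/0.49 versus 0.85/0.4875), and the ACE fit to the second scenario returns A of about 0.725 and C of about 0.125, so the non-additive variance inflates A and deflates C.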
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Epistasis decays very quickly with genotypic relatedness, which creates problems for family based heritability estimators. Narrow-sense heritability estimates from unrelated individuals pick up ~only the additive component, whereas estimates from twins can be severely inflated.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
The same phenomenon holds for genome-wide epistasis between many loci. As the causal variants become more common, the epistasis (A*B) can be better approximated with additivity (A+B). For uniformly selected frequencies, >80% of biological epistasis looks like additivity!
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
For two loci in a simple additive model (A+B), the effects on the trait are "parallel" (left). Interactions between the loci (AxB) draw the effects away from parallel, with larger deviations for rare variants. Surprisingly, high frequency AxB looks very additive *statistically*.
Sasha Gusev (@sashagusevposts.bsky.social)
I wrote about gene-gene interactions (epistasis) and the implications for heritability, trait definitions, natural selection, and therapeutic interventions. Biology is clearly full of causal interactions, so why don't we see them in the data? A 🧵:
Elliot Tucker-Drob (@tuckerdrob.bsky.social) reposted reply parent
The twin/family studies were not clear whether it was dominance vs. epistasis vs. some other sort of nonadditivity. Dominance was just the easiest to model. We can now estimate dominance SNP h2 but can’t obtain an estimate of epistatic SNP h2. Assimilation/contrast effects may also be at play.
Michel Nivard (@michelnivard.bsky.social) reposted reply parent
While I am not entirely convinced this is “real” and not some psychometric equivalent of a vowel shift (and PRS could help find out a bit I think) this is relevant:
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Yeah I'm actually *very* curious about the psychometric consistency of these scales across the cohorts. But definitely suggests that cohort effects on the measured variables can be substantial.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Yeah agreed. Something I've always been curious about is whether AxC interactions would also get picked up as D in the extended twin designs.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
And to push back a bit, if these massive differences in estimates were not memory-holed, I would have expected them to at least be mentioned in a review on personality genomics that otherwise talks about the foresight of twin studies quite a lot.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Twin models can partition heritability into additive and non-additive (dominance + interactions) and find very large estimates for the latter. On the other hand, molecular GWAS data has estimated dominance heritability to be zero for pretty much all traits. No one knows why.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Would be interesting to read. Sibling interactions should show up as differences in within-family versus pop PGI performance, right, but you don't see that either? And dominance GWAS heritability has estimated very confident ~0% for essentially every trait.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Apropos this interesting review of the genomics of personality: www.annualreviews.org/content/jour...
Sasha Gusev (@sashagusevposts.bsky.social)
One of the largest twin studies of personality anticipated massive non-additive dominance/epistasis components. These have not materialized with modern data for *any trait*, no one knows why, and the entire set of findings has apparently been memory-holed.
Sasha Gusev (@sashagusevposts.bsky.social)
Decrease in phenotype by intervening on additive or epistatic mechanisms in the presence of low (left) or high (right) epistasis. Red lines indicate an increase in the phenotype when a decrease was intended.
Jeffrey Ross-Ibarra (@jrossibarra.bsky.social) reposted
Popgen, day7: single-locus selection models. Allele frequency change is faster at intermediate frequencies! Fixation takes longer in larger pops! And a bit more from yesterday on MK (asymptotic MK test is cool!)
Sasha Gusev (@sashagusevposts.bsky.social)
Any discussion of embryo selection should probably start with the description of IVF in this piece, to at least be connected to some aspect of tangible reality.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Seems reasonable for 40-50% relative risk reduction, but would be interesting to see this type of model validated in family data. It's still not clear to me whether shifts to the liability translate directly into shifts to the log hazard, nor how decreasing h2 with age fits into hazard models.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Hah. I already hate AI for making me immediately skeptical of this kind of prose. I've gotten a few comments along these lines though and it sort of feels like they're saying "please make more emotional claims so I can then accuse you of relying too much on emotional claims".
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Yeah ... I don't think you're going to convince people to spend $50k and go through an elective in vitro procedure to get a probabilistic hope at a few IQ points if all assumptions hold, instead of just putting $50k in a college fund and conceiving the conventional way.
Mykhaylo M. Malakhov (@mykmal.xyz) reposted
The use of dichotomized labels for traits that are actually continuous has bothered me ever since I started working on complex trait genetics. Sasha's blog post provides an excellent explanation of why this matters in the context of polygenic risk and embryo selection.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Good points. Here is the code for the hazard model: gist.github.com/sashagusev/8...
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Yes, good point! I'll add the ref.
Patrick Turley (@paturley.bsky.social) reposted reply parent
Great thread! We make this point in the NEJM paper, but I think it flew under the radar for many.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
I bring this up as more embryo selection companies are advertising large relative risk reductions for traits that clearly lie on a spectrum. These reductions are in turn interpreted as probabilistic "cures" even by a fairly informed audience (e.g. astralcodexten.com/p/suddenly-t...). /x
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
In short, risk models matter. Polygenic embryo selection acts on risk in a way that we are not typically used to thinking about, and the conventional liability threshold model -- while statistically convenient -- may not be the right way to convey the cost/benefit tradeoffs.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
When we look at the expected additional "disease-free time" for a typical person (including those who would never develop the disease in their lifetime) the risk reduction adds just a few months. In contrast, classical relative risk estimates are a sizable ~20%.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
We can again use simulations to see what happens when risk is reduced. For a trait that mimics breast cancer incidence and a 0.5SD risk reduction, individuals who would have developed the condition instead develop it 3-6 years later in life.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
What about diseases like cancer? A second model we can think about is a `hazard` or survival model, where individuals develop the disease at different rates. Here embryo screening essentially extends the age of onset for the disease in the offspring, rather than curing it.
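To make the "later onset rather than cure" idea concrete, here is a toy sketch. It is not the model in the gist linked above: it assumes a Weibull baseline hazard and assumes that a risk reduction acts as a constant shift on the log hazard, and the shape, scale, and individual quantile are made-up illustrative values.

```python
import math


def onset_age(u, log_hazard, shape=5.0, scale=120.0):
    """Age of onset for a person with survival quantile u (0 < u < 1).

    Assumes cumulative hazard H(t) = (t / scale) ** shape * exp(log_hazard),
    so onset occurs at the age where H(t) = -ln(u). All parameters hypothetical.
    """
    return scale * (-math.log(u) / math.exp(log_hazard)) ** (1.0 / shape)


# One hypothetical individual, before and after lowering the log hazard by 0.5.
u = 0.97
reduction = 0.5
baseline_age = onset_age(u, log_hazard=0.0)
shifted_age = onset_age(u, log_hazard=-reduction)
delay_years = shifted_age - baseline_age
```

In this toy the onset age is simply multiplied by exp(reduction / shape), so the same person still develops the condition, just later. Whether a liability shift maps onto the log hazard this cleanly is itself an assumption.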
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
In the case of obesity, there's nothing magical about crossing the threshold. Such ad hoc thresholds -- sometimes called "dichotomania" -- are surprisingly common in medicine. In fact, many traits currently being screened for have clinically important sub-threshold analogs:
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
In the setting of embryo selection, small changes in the underlying liability can lead to large apparent "relative risk reductions", because individuals are shifted from just above to just below the threshold. See this simulated example with Class III obesity:
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
I argue that the main distinction here is the use of a statistical model that messes with our intuitions about disease. The `liability threshold model` assumes all individuals lie along a liability continuum; those above a threshold are sick and those below are perfectly healthy.
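A minimal sketch of the liability threshold arithmetic, assuming a standard normal liability and a fixed diagnostic threshold set by the population prevalence. The prevalence and shift values are illustrative only, not those of the simulated obesity example.

```python
from statistics import NormalDist


def relative_risk_reduction(prevalence, shift_sd):
    """Relative risk reduction from lowering mean liability by shift_sd (in SD units)."""
    std_normal = NormalDist()
    # Threshold above which an individual is labeled as a case.
    threshold = std_normal.inv_cdf(1.0 - prevalence)
    # Lowering everyone's liability by shift_sd is equivalent to raising the
    # threshold by shift_sd for a standard normal liability.
    new_risk = 1.0 - std_normal.cdf(threshold + shift_sd)
    return 1.0 - new_risk / prevalence


# The same 0.3 SD shift applied to a rare and a common "disease" label.
rrr_rare = relative_risk_reduction(prevalence=0.01, shift_sd=0.3)
rrr_common = relative_risk_reduction(prevalence=0.30, shift_sd=0.3)
```

The same modest shift in liability produces a much larger relative risk reduction when the threshold sits far out in the tail, even though the underlying continuous trait moved by the same amount in both cases.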
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
The core issue is captured by two seminal papers. First, Karavani et al. 2019 (pubmed.ncbi.nlm.nih.gov/31761530/) found that screening embryos for continuous traits has limited utility. Then, in 2021 many of the same authors found that selection CAN reduce risk for certain diseases.
Sasha Gusev (@sashagusevposts.bsky.social)
I wrote about how genetic risk works in the context of embryo selection and how people often think about it all wrong. A short 🧵:
Graham Coop (@gcbias.bsky.social) reposted
An old article on "PolygenX" marketing IQ embryo selection and its links to the far-right. The newly launched company HeraSight seems to share a pretty similar business model to PolygenX and an overlapping set of employees. investigations.hopenothate.org.uk/superbaby-fa...
Keck School of Medicine of USC (@keck.usc.edu) reposted
Arun Durvasula, PhD, of @uscpphs.bsky.social writes, "Lifestyle factors play a large role in determining who gets a disease and who doesn’t. But they are far from the entire story." A fascinating read about the relationship between genetics and our environment. ⬇️ 🔗 #Science #MedSky #ResearchSky
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
If you center the genotypes then you are effectively adding only non-additive epistasis so the additive component misses all of it (except for dominance).
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Yeah it's a bit confusing. This is generating a trait that is entirely driven by epistatic effects and then estimating how much of the trait variance can be explained by additive effects only. The genotypes are NOT centered here, which matters a lot.
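A two-locus toy version of this calculation, as a sketch only: it assumes independent loci in Hardy-Weinberg proportions, a trait equal to the product of uncentered allele counts (g1 * g2), and it enumerates genotypes exactly rather than simulating. It is not the genome-wide simulation behind the figure, and the function names are illustrative.

```python
from itertools import product


def genotype_probs(p):
    """Hardy-Weinberg probabilities for 0, 1, 2 copies of an allele at frequency p."""
    q = 1.0 - p
    return {0: q * q, 1: 2.0 * p * q, 2: p * p}


def additive_fraction(p1, p2):
    """Share of Var(g1 * g2) captured by the best linear fit on g1 and g2."""
    probs1, probs2 = genotype_probs(p1), genotype_probs(p2)
    cells = [(g1, g2, probs1[g1] * probs2[g2])
             for g1, g2 in product(range(3), repeat=2)]

    def mean(f):
        # Probability-weighted expectation over the nine genotype combinations.
        return sum(w * f(g1, g2) for g1, g2, w in cells)

    m1, m2 = mean(lambda a, b: a), mean(lambda a, b: b)
    my = mean(lambda a, b: a * b)
    var1 = mean(lambda a, b: (a - m1) ** 2)
    var2 = mean(lambda a, b: (b - m2) ** 2)
    var_y = mean(lambda a, b: (a * b - my) ** 2)
    cov1 = mean(lambda a, b: (a - m1) * (a * b - my))
    cov2 = mean(lambda a, b: (b - m2) * (a * b - my))
    # With independent loci, the regression R^2 is the sum of per-locus terms.
    return (cov1 ** 2 / var1 + cov2 ** 2 / var2) / var_y


fractions = {p: additive_fraction(p, p) for p in (0.01, 0.1, 0.5, 0.9)}
```

In this toy the fraction works out to 4p / (1 + 3p) when both loci have frequency p, so it is 0.8 at p = 0.5 and about 0.04 at p = 0.01: a purely multiplicative trait looks mostly additive for common alleles and mostly non-additive for rare ones. Centering the genotypes before taking the product would drive the additive share to zero, which is why the uncentered coding matters.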
Sasha Gusev (@sashagusevposts.bsky.social)
The proportion of epistatic heritability that is estimated as additive by quantitative genetic models. Epistasis deviates more from additivity for lower frequency causal alleles, but on average >80% of biological GxG will just look like statistical G.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
This is an interesting cast of characters to be involved in a product which, while clearly conceived and marketed around the idea of IQ prediction, has so far chosen to hide all the relevant validation. /x
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
A fourth employee of Heliospect is an active quantitative racism researcher, publishing with the usual race science crew on bogus Jewish polygenic score comparisons (work that was immediately and summarily rebutted: trejo.scholar.princeton.edu/sites/g/file...).
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
A third employee (and senior author on the research whitepaper) somehow accidentally ended up posting anti-immigrant slogans in a private chat with an Austrian far-right activist (aka neo-Nazi) Martin Sellner:
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
A second employee (and the first author of the Herasight research whitepaper) was a member of a Telegram group for the pro-eugenics online publication Aporia. Aporia also frequently publishes on the topic of white racial solidarity:
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
In 2023, one of their geneticists got busted buying Nazi books and posters online and was eventually let go.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
A final point about trust. Herasight has apparently partnered with Heliospect Genomics, founded by self-described eugenicist Jonathan Anomaly (a member of both companies). Anomaly and Heliospect were profiled last year by Hope Not Hate: investigations.hopenothate.org.uk/superbaby-fa...
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
This is a complicated product. Customers are making long bets on what diseases and environments will still be relevant ~50 years from now and what genetic correlations will not be. They are trusting companies to get it right -- something the companies have so far failed to do.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
But beyond the technical details, what stuck out for me was that Herasight immediately broke their promise to put "research over marketing" by releasing and advertising IQ prediction while including zero methods/performance details in their white-paper.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
3) Risk reduction calculations that still rely on a liability threshold assumption that often reflects arbitrary diagnostic cutoffs: a small reduction in BMI, for example, can be made to look like a large (and IMO highly misleading) "risk" reduction in obesity when the former is what matters.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
2) How do they handle genetic correlations and pleiotropy across traits, where selecting on something like IQ can also lead to substantial relative risk increases for a genetically correlated condition like Autism? And what about conditions we haven't measured?
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
There are still important details lacking. 1) How to address the substantial differences in prediction calibration and uncertainty across different environmental contexts (see: Hou et al. 2024 pubmed.ncbi.nlm.nih.gov/38886587/), many of which are unknown for an embryo by definition?
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
What about Herasight itself? They claim to "prioritize research over marketing" and their white-paper does include much more rigorous validation than the competitors (a low bar), including quantifying and reporting within-family and cross-population accuracy.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
People who care about this technology should be furious at Nucleus and their collaborators (as well as Orchid and Genomic Prediction for their own errors). Finding such flaws should not require reverse-engineering by a competitor. These products clearly need independent audits.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
I wrote last year about the un-seriousness with which Nucleus approached their IQ product and the damage it could do to genetic prediction and research more broadly (theinfinitesimal.substack.com/p/genomic-pr...). This appears to have been a broader pattern beyond IQ, extending even to rare disease.
Sasha Gusev (@sashagusevposts.bsky.social)
A few thoughts on Herasight, the new embryo selection company. First, their whitepaper (drive.google.com/file/d/1EpFi...) implies that competitors like Nucleus have been marketing and selling grossly erroneous risk estimates. This is shocking if true! 🧵
Graham Coop (@gcbias.bsky.social) reposted
It is depressing, but all too predictable, how swiftly we’ve gone from the Social Science Genetic Association Consortium offering reassurances about the uses of behavioural polygenic scores to one of their lead authors marketing embryo selection for IQ
Andrea Ganna (@andganna.bsky.social) reposted
Leena Peltonen School of Human Genetics in full-swing! @gosiatrynka.bsky.social @dgmacarthur.bsky.social @bpasaniuc.bsky.social @tuuliel.bsky.social @hilarycmartin.bsky.social @sashagusevposts.bsky.social @zkutalik.bsky.social @mashaals.bsky.social @alemedinarivera.bsky.social
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
In this case, the government says "we think you're breaking the law", illegally terminates contracts and services with the university, and then negotiates a backroom deal where the university pays the government to be left alone. Seems like a bad system.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Typically, the government tells the university "we think you're breaking the law", conducts an investigation, proves their case in court (subject to appeal), the court determines the punishment, and if needed Congress passes new laws for additional oversight.
Sasha Gusev (@sashagusevposts.bsky.social)
Good take on Columbia: "Higher ed policy in the United States is now being developed through ad hoc deals, a mode of regulation that is not only inimical to the ideal of the university as a site of critical thinking but also corrosive to the democratic order and to law itself."
Na Cai (@caina89.bsky.social) reposted
Very happy to share our new paper now on @medrxivpreprint: “Genetic risk effects on psychiatric disorders act in sets”, a great effort led by my PhD student @jolienrietkerk.bsky.social, and performed together with collaborators Andy Dahl, Jonathan Flint, Andrew Schork etc. Thread 1/n
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
AxC is simply not a research direction of any interest to the field. Most twin study reviews don't even mention AxC at all, as either a limitation or an open research area or anything (e.g. pubmed.ncbi.nlm.nih.gov/37188734/).
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
These are cool models but they just treat C's in the two generations as independent latent variables (only sometimes estimable) and they have no way of modeling interactions with either C at all.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
With respect to EEA in GWAS, yes, I was referring to the assumption that environmental confounding / pop strat can be adequately dealt with using linear covariate adjustment and mixed models; which works well enough for most traits (though unfortunately not behavioral/social outcomes).
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
I'm not aware of any twin models for testing AxC that are not very esoteric tests for higher moments and therefore require impossible sample sizes. Even for extended family methods you need multi-generational models of the shared environment, which have not been developed.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
"We have this great design for partitioning phenotypic variance!" Oh, what does it do with environmental interactions? "Just treats them as genes" So everyone must be really concerned about that? "No, our biggest finding is that the shared environment doesn't matter. We even have a law ..."
Nicholas Mancuso (@nmancuso.bsky.social) reposted
Super excited to see this out. What started as some math in a grant in 2020, to a student deciding to take this on in 2022, to published in 2025. These things can take time and patience is key!
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Aside from the EEA (which plays a fundamental role in twin studies and a minor one in GWAS), twin studies treat AxC as A and GWAS (or RDR) treat AxC as E. That's a huge difference!
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
My dad won a pair of Bose computer speakers on the mid-90's internet by clicking a banner ad that asked which hat a cowboy would wear (you had to click the cowboy hat) and I still use them 30 years later and they sound fantastic.
Jeffrey Pullin (@jeffreypullin.bsky.social) reposted
Very excited to share new work from my PhD on a new software package for eQTL mapping: quasar. The quasar software package is a C++ program designed to provide a flexible and efficient eQTL mapping. www.medrxiv.org/content/10.1...
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Nice, looking forward to it! That's one setting where I expect an LMM component to be critical.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Very useful! Is it possible to test interactions between genotype and cell level covariates in the PGLMM?
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
In these figures the PCA analysis was LD pruned to avoid picking up high LD regions (which is pretty standard, as pointed out in Prive et al), but the variant differences are computed from the full sequence with no exclusions. So these two distance metrics are impacted by demography differently.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Yes, right, this is what I was trying to get at by "counting site differences, not alleles". But now I think a simple EUR bottleneck isn't enough to get |EUR-YRI| << |YRI-YRI|, not sure what other demographic components are required though.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Ah you're right, thanks for walking through it, somehow I wasn't thinking about the symmetry. But now I must be missing something obvious about this point in Biddanda et al. (elifesciences.org/articles/60107):
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
Random A-B or A-C pairs will then have fewer differing sites than random B-C pairs because of lower diversity in A (counting site differences, not alleles). Whereas the bottleneck actually increases the A-B and A-C distance in terms of drift (and therefore PCA).
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
So the way I'm thinking about it: Take three populations A, B, C that split from a common ancestor and drift. A has a severe bottleneck that fixes many variants, so subsequent random mating produces offspring with fewer total heterozygotes.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
I think it's both. You can shift the PCs around with sampling. But the fact that EUR-AFR pairs will often have fewer differences than AFR-AFR pairs is, I believe, due to EUR founder effects / demography. Whereas most "conventional" PC1-PC2 will place EUR-AFR much farther apart than AFR-AFR.
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
More examples. Data from the SGDP (reichdata.hms.harvard.edu/pub/datasets...).
Sasha Gusev (@sashagusevposts.bsky.social) reply parent
PCA attempts to find axes that explain the most variance, but variance can increase due to founder events (e.g. bottlenecks outside of Africa) while simultaneously reducing genetic diversity and the total # of genetic differences to people from an ancestral population.
Sasha Gusev (@sashagusevposts.bsky.social)
People often intuitively interpret genetic ancestry PCA plots as simple genetic distances, but this is not always the case. Below are random pairs of individuals that are far apart in PCA space (lines) but actually have fewer total variant differences (numbers).
Sasha Gusev (@sashagusevposts.bsky.social)
Nice discussion here. A particularly good point about why we both do and do not care about heritability.