Hypothesis generation

A mutation may have a variety of possible effects at the cellular level. To pinpoint the likely effect and so direct further studies, a variety of factors need to be considered. Several of these hypotheses could be combined or manifest at different severities. This is not meant to be an exhaustive list, but simply an aid for thought.
gnomAD is a very useful resource for untangling the effect. However, the sampling in gnomAD is not exhaustive (16k samples for the v3 control set). A heterozygous variant more frequent than 5e-5 may exist homozygously in the population without having been sampled yet. The square of the allele frequency gives a ballpark figure of the fraction of individuals one would expect to be homozygous (erroneously assuming homogeneity in the human population).
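
The ballpark estimate above can be sketched in a few lines. This is only an illustration under the same simplifying assumption of a homogeneous population (Hardy-Weinberg proportions); the function names are hypothetical and not part of Venus or gnomAD.

```python
def expected_homozygotes(allele_frequency: float, n_samples: int) -> float:
    """Expected number of homozygous individuals in a sample.

    Assumes a homogeneous, randomly mating population, so the
    homozygote fraction is the square of the allele frequency.
    """
    return allele_frequency ** 2 * n_samples


def probability_no_homozygote(allele_frequency: float, n_samples: int) -> float:
    """Chance that a sample of this size contains no homozygote at all."""
    return (1 - allele_frequency ** 2) ** n_samples


if __name__ == "__main__":
    # Figures taken from the text: frequency 5e-5, about 16k samples.
    print(expected_homozygotes(5e-5, 16_000))
    print(probability_no_homozygote(5e-5, 16_000))
```

With the figures quoted above, the expected count is about 4e-5 homozygotes, so the absence of a homozygote in the control set says very little about whether one exists in the wider population.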


Hypothesis: truncation (frameshift or nonsense) of all domains simply results in less protein.
Details: If heterozygous, in a simple system there should be 50% less functional protein. If homozygous, there should be no functional protein. The heterozygous pathogenic case is referred to as haploinsufficiency. If a heterozygous truncation is tolerated, the gene is referred to as dosage balanced.
gnomAD: in gnomAD controls, one would expect no similar case, except possibly for truncations at the very end of the C-terminus.
Venus does not make predictions for truncations, but submitting a silent mutation will still provide domain information.

Interdomain truncation

Hypothesis: truncation of regulatory domains, resulting in misregulation.
Details: Although nonsense-mediated mRNA decay will decrease the level of protein produced from that allele, it is not complete. Therefore, if the remnant is stable, some 20%-30% of it may be made, which may be sufficient to cause an effect.
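
As a rough worked example of these figures, the sketch below estimates protein levels for a heterozygous truncation, relative to a total of 1.0 in a wild-type homozygote. It assumes both alleles are normally expressed equally and that the escape from decay (the 20%-30% above) translates directly into protein; the function name is hypothetical.

```python
def heterozygous_truncation_levels(nmd_escape: float) -> dict:
    """Relative protein levels for a heterozygous truncating variant.

    nmd_escape is the fraction of the variant allele's output that
    survives nonsense-mediated decay (0.2 to 0.3 in the text above).
    Assumes each allele normally contributes half of the total protein.
    """
    if not 0.0 <= nmd_escape <= 1.0:
        raise ValueError("nmd_escape must be between 0 and 1")
    return {
        "full_length": 0.5,                # from the unaffected allele
        "truncated": 0.5 * nmd_escape,     # stable remnant, if any
    }


if __name__ == "__main__":
    for escape in (0.2, 0.3):
        print(escape, heterozygous_truncation_levels(escape))
```

Under these assumptions the truncated remnant would be present at roughly 10%-15% of the normal total protein level, alongside 50% full-length protein.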


Hypothesis: The protein is destabilised and less active
Details: This is the classic case of loss of function. The protein may aggregate and/or be more easily targeted for degradation. In the simplest and most severe case it is no different from a truncation. However, the variant may be less functional rather than fully non-functional, so a gradient of severity may be present. The case where a domain is destabilised and less active, but the rest of the protein is functional, is a more complicated situation.
gnomAD: To be a valid hypothesis, one would not expect there to be any truncations at a similar zygosity in the gnomAD control set.
If the variant is part of a cohort, truncations being absent from the cohort but present in gnomAD would indicate that other effects are at play (e.g. sequestration). If there are no truncations in either and the cohort is large, it could be that the mutation is so pathogenic that it is embryonically lethal and therefore not detected (survivorship bias).

Weaker interface

Hypothesis: The protein is less able to bind a partner protein
Details: If the partner protein is kept in check by the protein, the disruption will deregulate the former, potentially manifesting in a dominant manner. A variant of this is when a regulatory domain is part of the same protein (cf. 'Domain destabilisation').
gnomAD: To be valid, one would not expect similarly severe gnomAD variants at a similar zygosity on the surface in question.


Hypothesis: The protein loses its catalytic activity or is unable to switch conformation
Details: In the simplest case it is the same as a truncation or simple destabilisation. However, it may still bind to another protein, effectively sequestering it (see sequestration).


Hypothesis: The protein is inactive but still binds a partner protein, decreasing the functional concentration of the latter
Details: This scenario may present a worse phenotypic consequence than a truncation or severe destabilisation. However, to be valid, the protein must form a complex with another protein that is intolerant to decreased protein levels. For example, in RPB1 (POLR2A-encoded), missense variants that abrogate activity are worse than nonsense variants, potentially due to RPABC3 (POLR2H) sequestration.

Non-functional oligomer

Hypothesis: A single variant chain makes the complex non-functional to some degree
Details: The back-of-the-envelope maths for the fraction of complexes with no variant chain is 0.5^n, where n is the number of chains in the oligomer. This value is greater than zero, and so higher than for a homozygous truncation, unless the protein forms a hetero-oligomer with a homologue (see sequestration).
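
The 0.5^n estimate can be tabulated with a short sketch. It assumes a heterozygous variant, equal expression of both alleles and random incorporation of chains into the oligomer; the function name and the `variant_fraction` parameter are hypothetical additions for illustration.

```python
def fraction_all_wild_type(n_chains: int, variant_fraction: float = 0.5) -> float:
    """Fraction of oligomers that contain no variant chain.

    Assumes chains are incorporated at random, so each of the n_chains
    positions is wild type with probability (1 - variant_fraction).
    The default of 0.5 corresponds to a heterozygous variant.
    """
    if n_chains < 1:
        raise ValueError("n_chains must be at least 1")
    return (1 - variant_fraction) ** n_chains


if __name__ == "__main__":
    # Monomer, dimer, trimer, tetramer
    for n in range(1, 5):
        print(n, fraction_all_wild_type(n))
```

For a tetramer only about 6% of complexes would be free of variant chains, which illustrates why a heterozygous variant in an oligomer can behave in a dominant manner.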

Deregulation via post-translational modification site

Hypothesis: loss of a post-translational modification site
Details: Phosphorylation and other modifications are used to control the conformation of proteins (e.g. hRas), which controls their activity, or are used to target the protein for degradation.
gnomAD: Different post-translational modification sites have different strengths and effects, so gnomAD variants affecting other predicted sites do not invalidate this hypothesis. However, mutations adjacent to the one in question do.

Domain destabilisation

Hypothesis: the mutation destabilises a regulatory domain, but not the whole protein
Details: Exposed hydrophobic residues do result in higher aggregation and degradation, but a limited destabilisation may abrogate the activity (if it is via inter-domain binding, then it would be no different from the weaker interface scenario already discussed). If the domain is at the C-terminus, then truncations that do not affect the preceding domains (cf. 'interdomain truncation' above) would have a similar effect, albeit affected by nonsense-mediated mRNA decay.
gnomAD: One would not expect severe mutations in the domain.

Altered catalysis

Hypothesis: a mutation alters the specificity of the protein
Details: Albeit possible, most active site mutations are likely to cause a loss of activity rather than alter specificity. It is presented here for completeness.


Hypothesis: loss of localisation signal
Details: A mutation in the localisation signal may result in loss of function or gain of function, depending on whether the migration was required for the protein's activity or for its repression.
gnomAD: In most cases it is not a structural effect but a motif disruption, so ∆∆G is not a valid metric to assess gnomAD variants in a signal.