Michelanglo — VENUS

Variant effect on structure — Free energy

Hypothesis generation

A mutation may have a variety of possible effects at the cellular level. Several of these hypotheses could be combined or manifest at different severities. To pinpoint the effect in order to direct further studies, a variety of factors need to be considered.
gnomAD, a repository of genome and exome sequencing data, is the ideal resource for untangling the effect. However, the sampling in gnomAD is not exhaustive (16k samples for the v3 control set). Consequently, a heterozygous variant more frequent than 10^-5 may be found homozygously in the population, but be absent from the gnomAD database. The square of the frequency gives a ballpark figure of the fraction of individuals one would expect to be homozygous (erroneously assuming homogeneity in the human population).
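The ballpark estimate above can be sketched in a few lines of Python. This is a minimal illustration, not part of Venus: it assumes the quoted frequency is an allele frequency, that genotypes follow Hardy-Weinberg proportions (the homogeneity assumption the text flags as erroneous), and it uses a rough world population of 8 billion, which is a figure introduced here purely for illustration. The function name is hypothetical.

```python
def expected_homozygotes(allele_frequency: float, n_individuals: int) -> float:
    """Expected number of homozygous individuals in a sample.

    Assumes Hardy-Weinberg proportions, i.e. the homozygous fraction
    is the square of the allele frequency.
    """
    return allele_frequency ** 2 * n_individuals


GNOMAD_V3_CONTROLS = 16_000        # approximate sample size quoted in the text
WORLD_POPULATION = 8_000_000_000   # rough figure, an assumption for illustration

for q in (1e-5, 1e-3, 1e-2):
    in_gnomad = expected_homozygotes(q, GNOMAD_V3_CONTROLS)
    in_world = expected_homozygotes(q, WORLD_POPULATION)
    print(f"q={q:g}: ~{in_gnomad:.2g} in gnomAD controls, ~{in_world:.2g} worldwide")
```

Under these assumptions, a variant at a frequency of 10^-5 is expected to be homozygous in roughly one person worldwide, yet in far fewer than one of the 16k control samples.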
The list of hypotheses below is not meant to be exhaustive, but simply an aid for thought.

Truncation

Hypothesis: truncations (frameshift and nonsense) of all domains simply result in less protein.
Details: If heterozygous, in a simple system there should be 50% less functional protein. If homozygous, there should be no functional protein. The heterozygous pathogenic case is referred to as haploinsufficiency. If a heterozygous truncation is tolerated it is referred to as dosage balanced.
C-terminal truncations: The C-terminus is frequently disordered, but it can have a structural role, and in some cases the carboxylate forms a buried salt bridge or the region is heavily post-translationally modified. A well-studied example of the latter is DNA polymerase.
gnomAD: In gnomAD controls, one would expect no similar truncations, except possibly truncations at the very end of the C-terminus, unless this is structural or modified as mentioned.
NB: Venus does not make predictions on truncations. However, if one wants to use Venus for a truncation, submitting a silent mutation (an identity) will provide domain information...

Interdomain truncation

Hypothesis: truncation of regulatory domains results in misregulation.
Details: Premature terminations may lead to nonsense-mediated decay, where the transcript gets degraded, resulting in a diminished but not absent level of translation from the defective allele. Therefore, if the remnant is stable, some 20%-30% of it may be made, which may be sufficient to cause an effect, especially if the C-terminal domain was regulatory.

Destabilisation

Hypothesis: The protein is destabilised and less active
Details: This is the classic case of loss of function. The protein may aggregate and/or be more easily targeted for degradation. In the simplest and most severe case, the effect is no different from a truncation. However, cases can range from somewhat less functional to entirely non-functional, with a gradient of severity. An additional layer of complexity arises where a specific domain is destabilised and less active, while other domains remain functional.
gnomAD: To be a valid hypothesis one would not expect there to be any truncations, before or within that domain, at a similar zygosity in the gnomAD control set.
If the variant is part of a disease cohort and the control set includes truncations, this would indicate that other effects are at play in the disease cohort (e.g. sequestration). If there are no truncations in either the control set or the cohort, assuming the cohort is large, it could be that truncation of the affected allele is embryonically lethal and therefore not detected (survivorship bias).

Weaker interface

Hypothesis: The protein is less able to bind a partner protein
Details: If the partner protein is kept in check by the affected protein, the disruption will deregulate the partner protein, potentially manifesting in a dominant manner. A variant of this is when a regulatory domain is part of the same protein (see 'Domain destabilisation').
gnomAD: To be a valid hypothesis, one would not expect similarly severe variants at a similar zygosity affecting the same surface within the gnomAD control set.

Compromised activity

Hypothesis: The protein loses its catalytic activity or is unable to switch conformation
Details: In the simplest case the effect is the same as a truncation or severe destabilisation, i.e. there is no functional protein. However, it may still bind to a partner protein, effectively sequestering it (see 'Sequestration').
Ideally, to study a protein that undergoes a conformational switch, one would run a long molecular dynamics simulation. However, calculating the ∆∆G of different conformations (single static snapshots) can be used as a proxy to assess whether a conformation is disfavoured or not. For example, if an apo conformation has a worse (more destabilising) ∆∆G, while a bound conformation has a neutral ∆∆G, then the latter state is potentially favoured. To investigate this in Venus, one needs to understand which models represent which state and submit them as custom models.
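Once a ∆∆G has been obtained for each conformation, the comparison described above can be sketched as follows. This is only an illustration of the reasoning, not a Venus function: the helper name, the example values, and the cutoff of 2.0 (assumed here to be in kcal/mol, with positive values meaning destabilising) are all assumptions and should be adjusted to the scoring method actually used.

```python
def disfavoured_states(ddg_by_state: dict[str, float], cutoff: float = 2.0) -> list[str]:
    """Return the conformational states that the variant destabilises.

    ddg_by_state maps a state label (e.g. "apo", "bound") to the ∆∆G
    calculated on a model of that state. The cutoff is an assumed
    threshold above which a ∆∆G is treated as destabilising.
    """
    return [state for state, ddg in ddg_by_state.items() if ddg >= cutoff]


# Made-up values for illustration only
example = {"apo": 3.1, "bound": 0.2}
worse = disfavoured_states(example)
favoured = [state for state in example if state not in worse]
print("disfavoured:", worse)
print("potentially favoured:", favoured)
```

A static ∆∆G comparison of this kind is only a proxy: it says nothing about the barrier between states or about dynamics.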

Sequestration

Hypothesis: The protein is inactive but still binds a partner protein decreasing functional concentration of the latter
Details: This scenario may present a worse phenotypic consequence than a truncation or severe destabilisation. If a protein forms a complex with another protein that is intolerant to decreased protein levels, and the function of the latter is affected by the interaction, an inactive affected protein would in effect sequester the latter, keeping it out of service. For example, in RPB1 (POLR2A-encoded), missense variants that abrogate activity are worse than nonsense variants, potentially due to RPABC3 (POLR2H) sequestration.
Some native proteins are inactive when sequestered by another protein in a particular conformation, as is the case for the β-γ subunits of G-proteins, which bind to the α subunit in its GDP-bound conformation. Mutations in the complex may therefore result in a loss of sequestration of the β subunit. Although this interface on the β subunit is also involved in binding its downstream target, it is a large surface, so differences can easily result in differential affinities for the different proteins.

Non-functional oligomer

Hypothesis: A single variant chain makes the complex non-functional to some degree
Details: Theoretically, the fraction of complexes with no variant chains is 0.5ⁿ, where n is the number of chains in an oligomer, assuming that the variant chain is at an equal concentration to the unaffected chain and has the same affinity.
A homozygous truncation that abrogates the protein activity has a functionality of 0%. The functional fraction of a pool of two-chain oligomers assembled from one affected and one unaffected allele (0.5² = 25%) is therefore greater than this, but less than the functional fraction of a pool of mixed monomers (50%). If a complex is formed with a homologue of the variant, as is the case for some oligomers, then the relative activity may be lower than that seen for a homozygous truncation (see 'Sequestration'). However, these back-of-the-envelope calculations omit whether the variant chain is degraded more, is less likely to form a complex or is marginally active, any of which would nudge the relative activity towards 50%.
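The 0.5ⁿ estimate can be tabulated with a short Python sketch. It makes the same simplifying assumptions as the text (variant and unaffected chains at equal concentration and with equal affinity, and a single variant chain being enough to inactivate the complex); the function name is hypothetical.

```python
def unaffected_complex_fraction(n_chains: int) -> float:
    """Fraction of oligomers containing no variant chain in a heterozygote.

    Assumes each position in the oligomer is filled independently with a
    variant or unaffected chain at equal probability (0.5 each).
    """
    if n_chains < 1:
        raise ValueError("an oligomer needs at least one chain")
    return 0.5 ** n_chains


for n in (1, 2, 3, 4, 6):
    print(f"{n} chain(s): {unaffected_complex_fraction(n):.1%} of complexes are variant-free")
```

The fraction halves with every additional chain, which is why a dominant-negative effect becomes more likely the larger the oligomer.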

Deregulation via post-translational modification site

Hypothesis: loss of a post-translational modification site
Details: Phosphorylation and other modifications are used to control the conformation of proteins (e.g. hRas), which controls their activity, or are used to target the protein for degradation.
gnomAD: Different post-translational modification sites have different strengths and effects, so the presence of variants within the gnomAD control dataset affecting other predicted sites may not invalidate this hypothesis. Variants adjacent to the one in question, however, do.

Domain destabilisation

Hypothesis: the mutation destabilises a regulatory domain, but not the whole protein
Details: As with destabilising mutations, variants that expose hydrophobic residues result in higher aggregation and degradation. A limited destabilisation may abrogate the activity: if it does so via inter-domain binding, then it would be no different from the weaker interface scenario described above. If the domain is at the C-terminus, then truncations that do not affect the preceding domains (see 'Interdomain truncation') would have a similar effect, albeit affected by nonsense-mediated mRNA decay.
gnomAD: One would not expect severe mutations in the affected domain within the gnomAD control dataset.

Altered catalysis

Hypothesis: a mutation alters the specificity of the protein
Details: While possible, this scenario is a much less likely outcome of active site mutations than a loss of activity. It is presented here for completeness.

Mislocalisation

Hypothesis: loss of localisation signal
Details: A mutation in the localisation signal may result in loss of function or gain of function, depending on whether the migration was required for the protein's activity or its repression.
gnomAD: In most cases this is not a structural effect but a motif disruption, so ∆∆G is not a valid metric to assess gnomAD variants in a signal.