Package org.snpeff.snpEffect
Class HgvsDna
java.lang.Object
  org.snpeff.snpEffect.Hgvs
    org.snpeff.snpEffect.HgvsDna
public class HgvsDna extends Hgvs
Coding DNA reference sequence

References: http://www.hgvs.org/mutnomen/recs.html

Nucleotide numbering:
- there is no nucleotide 0
- nucleotide 1 is the A of the ATG-translation initiation codon
- the nucleotide 5' of the ATG-translation initiation codon is -1, the previous -2, etc.
- the nucleotide 3' of the translation stop codon is *1, the next *2, etc.
- intronic nucleotides (coding DNA reference sequence only)
  - beginning of the intron; the number of the last nucleotide of the preceding exon, a plus sign and the position in the intron, like c.77+1G, c.77+2T, ....
  - end of the intron; the number of the first nucleotide of the following exon, a minus sign and the position upstream in the intron, like ..., c.78-2A, c.78-1G.
  - in the middle of the intron, numbering changes from "c.77+.." to "c.78-.."; for introns with an uneven number of nucleotides the central nucleotide is the last described with a "+" (see Discussion)

Genomic reference sequence:
- nucleotide numbering starts with 1 at the first nucleotide of the sequence. NOTE: the sequence should include all nucleotides covering the sequence (gene) of interest and should start well 5' of the promoter of a gene
- no +, - or other signs are used
- when the complete genomic sequence is not known, a coding DNA reference sequence should be used
- for all descriptions the most 3' position possible is arbitrarily assigned to have been changed (see Exception)
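The exonic part of the numbering rules above can be illustrated with a minimal sketch. It is not the HgvsDna implementation: the class name, the method name and the parameters (a 0-based index into the spliced transcript, plus the 0-based indexes of the A of the ATG and of the last base of the stop codon) are assumptions made only for this example.

```java
// Hypothetical sketch of coding DNA numbering for exonic bases.
// All names and the 0-based transcript index convention are assumptions for illustration.
final class CodingDnaNumberingSketch {

    /**
     * @param idx      0-based index of the base in the spliced transcript
     * @param cdsStart 0-based index of the A of the ATG initiation codon
     * @param cdsEnd   0-based index of the last base of the stop codon
     */
    static String cDot(int idx, int cdsStart, int cdsEnd) {
        if (idx < cdsStart) {
            // 5' of the ATG: -1 is the base right before the A, then -2, etc.
            return "c.-" + (cdsStart - idx);
        }
        if (idx <= cdsEnd) {
            // Coding sequence: the A of the ATG is nucleotide 1 (there is no nucleotide 0)
            return "c." + (idx - cdsStart + 1);
        }
        // 3' of the stop codon: *1 is the base right after it, then *2, etc.
        return "c.*" + (idx - cdsEnd);
    }

    public static void main(String[] args) {
        int cdsStart = 50;  // 50 bases of 5'UTR
        int cdsEnd = 349;   // 300 coding bases, stop codon included
        for (int idx : new int[] {0, 49, 50, 349, 350}) {
            System.out.println(idx + " -> " + cDot(idx, cdsStart, cdsEnd));
        }
    }
}
```

Because the label jumps from c.-1 directly to c.1, the sketch never produces a nucleotide 0.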
-
-
Field Summary
static boolean debug
-
Fields inherited from class org.snpeff.snpEffect.Hgvs
duplication, genome, hgvsTrId, marker, MAX_SEQUENCE_LEN_HGVS, strandMinus, strandPlus, tr, variant, variantEffect
-
-
Constructor Summary
HgvsDna(VariantEffect variantEffect)
-
Method Summary
protected java.lang.String alt()
protected java.lang.String dnaBaseChange()
DNA level base changes
protected boolean isDuplication()
Is this a duplication?
protected java.lang.String pos()
Genomic position for exonic variants
protected java.lang.String pos(int pos)
HGVS position based on genomic coordinates (chr is assumed to be the same as in transcript/marker).
protected java.lang.String posDownstream(int pos)
Position downstream of the transcript
protected java.lang.String posExon(int pos)
Convert genomic position to HGVS compatible (DNA) position
protected java.lang.String posIntron(int pos, Intron intron)
Intronic position
protected java.lang.String posUpstream(int pos)
Position upstream of the transcript
protected java.lang.String posUtr3(int pos)
Position within 3'UTR
protected java.lang.String posUtr5(int pos)
Position within 5'UTR
protected java.lang.String prefixTranslocation()
Translocation nomenclature.
protected java.lang.String ref()
java.lang.String toString()
protected java.lang.String typeOfReference()
Prefix for coding or non-coding sequences
-
Methods inherited from class org.snpeff.snpEffect.Hgvs
initStrand, parseTranscript, removeTranscript
-
-
-
-
Constructor Detail
-
HgvsDna
public HgvsDna(VariantEffect variantEffect)
-
-
Method Detail
-
alt
protected java.lang.String alt()
-
dnaBaseChange
protected java.lang.String dnaBaseChange()
DNA level base changes
-
isDuplication
protected boolean isDuplication()
Is this a duplication?
-
pos
protected java.lang.String pos()
Genomic position for exonic variants
-
pos
protected java.lang.String pos(int pos)
HGVS position based on genomic coordinates (chr is assumed to be the same as in transcript/marker).
-
posDownstream
protected java.lang.String posDownstream(int pos)
Position downstream of the transcript
-
posExon
protected java.lang.String posExon(int pos)
Convert genomic position to HGVS compatible (DNA) position
-
posIntron
protected java.lang.String posIntron(int pos, Intron intron)
Intronic position
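The intron numbering rule from the class description (a "+" offset from the preceding exon in the first half, a "-" offset from the following exon in the second half, with the central nucleotide of an odd-length intron taking "+") can be sketched as follows. This is an illustration only and does not reproduce posIntron; the class name, the method name and all parameters are assumptions.

```java
// Hypothetical sketch of HGVS intronic numbering; not the posIntron implementation.
final class IntronNumberingSketch {

    /**
     * @param lastExonBase coding DNA number of the last base of the preceding exon (e.g. 77)
     * @param nextExonBase coding DNA number of the first base of the following exon (e.g. 78)
     * @param intronLength number of bases in the intron
     * @param offset       1-based position within the intron, counted from its 5' end
     */
    static String intronPos(int lastExonBase, int nextExonBase, int intronLength, int offset) {
        if (offset < 1 || offset > intronLength) {
            throw new IllegalArgumentException("offset is outside the intron");
        }
        // Number of bases described with "+": the first half, rounded up,
        // so the central base of an odd-length intron is the last "+" position.
        int plusCount = (intronLength + 1) / 2;
        if (offset <= plusCount) {
            return "c." + lastExonBase + "+" + offset;
        }
        // Second half: distance back from the first base of the following exon.
        int distanceToNextExon = intronLength - offset + 1;
        return "c." + nextExonBase + "-" + distanceToNextExon;
    }

    public static void main(String[] args) {
        // A 5-base intron between c.77 and c.78
        for (int offset = 1; offset <= 5; offset++) {
            System.out.println(intronPos(77, 78, 5, offset));
        }
    }
}
```

Both flanking numbers are passed in explicitly, so the same sketch also covers introns in the 5'UTR, for example between c.-15 and c.-14.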
-
posUpstream
protected java.lang.String posUpstream(int pos)
Position upstream of the transcript.

Note: How to calculate the upstream position. If the strand is '-', as for NM_016176.3, with "genomicTxStart" being the rightmost transcript coordinate:

cDotUpstream = -(cdsStart + variantPos - genomicTxStart)

instead of "-(variantPos - genomicCdsStart)".

The method that stays in transcript space until extending beyond the transcript is correct because of these statements on http://varnomen.hgvs.org/bg-material/numbering/:

* nucleotides upstream (5') of the ATG-translation initiation codon (start) are marked with a "-" (minus) and numbered c.-1, c.-2, c.-3, etc. (i.e. going further upstream)

* Question: When the ATG translation initiation codon is in exon 2, and we find a variant in exon 1, should we include intron 1 (upstream of c.-14) in nucleotide numbering? (Isabelle Touitou, Montpellier, France)

  Answer: Nucleotides in introns 5' of the ATG translation initiation codon (i.e. in the 5'UTR) are numbered as introns in the protein coding sequence (see coding DNA numbering). In your example, based on a coding DNA reference sequence, the intron is present between nucleotides c.-15 and c.-14. The nucleotides for this intron are numbered as c.-15+1, c.-15+2, c.-15+3, ...., c.-14-3, c.-14-2, c.-14-1. Consequently, regarding the question, when a coding DNA reference sequence is used, the intronic nucleotides are not counted.
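A small worked example may help show why the two formulas differ for a minus-strand transcript with an intron in its 5'UTR. The sketch assumes that "cdsStart" is the 0-based transcript-space index of the first coding base (that is, the number of exonic 5'UTR bases). The note does not state this, so treat it as an interpretation. The class and method names are hypothetical.

```java
// Hypothetical sketch comparing the two upstream formulas on the minus strand.
// Assumption: cdsStart is the number of exonic 5'UTR bases (transcript space).
final class UpstreamPositionSketch {

    /** Transcript-space formula: -(cdsStart + variantPos - genomicTxStart). */
    static int cDotUpstream(int cdsStart, int variantPos, int genomicTxStart) {
        int distanceUpstream = variantPos - genomicTxStart; // minus strand: upstream is to the right
        if (distanceUpstream < 1) {
            throw new IllegalArgumentException("variant is not upstream of the transcript");
        }
        return -(cdsStart + distanceUpstream);
    }

    /** Genomic-space formula: -(variantPos - genomicCdsStart). It also counts 5'UTR intron bases. */
    static int genomicOnly(int variantPos, int genomicCdsStart) {
        return -(variantPos - genomicCdsStart);
    }

    public static void main(String[] args) {
        // Illustrative minus-strand layout (made-up coordinates):
        //   transcript start (rightmost base) at 1000
        //   5'UTR exon 1: 1000..971 (30 bases), intron: 970..771 (200 bases)
        //   5'UTR part of exon 2: 770..751 (20 bases), A of the ATG at 750
        int genomicTxStart = 1000;
        int genomicCdsStart = 750;
        int cdsStart = 50;     // 30 + 20 exonic 5'UTR bases
        int variantPos = 1010; // 10 bases upstream of the transcript

        System.out.println("transcript space: c." + cDotUpstream(cdsStart, variantPos, genomicTxStart));
        System.out.println("genomic space:    c." + genomicOnly(variantPos, genomicCdsStart));
    }
}
```

In this layout the two results differ by exactly the 200 intronic bases, which matches the HGVS answer that intronic nucleotides are not counted when a coding DNA reference sequence is used.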
-
posUtr3
protected java.lang.String posUtr3(int pos)
Position within 3'UTR
-
posUtr5
protected java.lang.String posUtr5(int pos)
Position within 5'UTR
-
prefixTranslocation
protected java.lang.String prefixTranslocation()
Translocation nomenclature.

From HGVS: Translocations are described at the molecular level using the format "t(X;4)(p21.2;q34)", followed by the usual numbering, indicating the position of the translocation breakpoint. The sequences of the translocation breakpoints need to be submitted to a sequence database (Genbank, EMBL, DDBJ) and the accession.version numbers should be given (see Discussion).

E.g.: t(X;4)(p21.2;q35)(c.857+101_857+102) denotes a translocation breakpoint in the intron between coding DNA nucleotides 857+101 and 857+102, joining chromosome bands Xp21.2 and 4q34
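As an illustration of the string layout only, the example description can be assembled from its parts as below. The class, method and parameter names are hypothetical, and the sketch makes no claim about how prefixTranslocation obtains chromosome names or bands.

```java
// Hypothetical sketch of the "t(chr1;chr2)(band1;band2)(breakpoint)" layout.
final class TranslocationPrefixSketch {

    static String describe(String chr1, String chr2, String band1, String band2, String breakpoint) {
        // Chromosomes and bands are each paired with ';' inside their own parentheses,
        // followed by the breakpoint in the usual coding DNA numbering.
        return "t(" + chr1 + ";" + chr2 + ")"
                + "(" + band1 + ";" + band2 + ")"
                + "(" + breakpoint + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe("X", "4", "p21.2", "q35", "c.857+101_857+102"));
    }
}
```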
-
ref
protected java.lang.String ref()
-
toString
public java.lang.String toString()
Overrides: toString in class java.lang.Object
-
typeOfReference
protected java.lang.String typeOfReference()
Prefix for coding or non-coding sequences
-
-