Class GeneSets

  • All Implemented Interfaces:
    java.io.Serializable, java.lang.Iterable<GeneSet>
    Direct Known Subclasses:
    GeneSetsRanked

    public class GeneSets
    extends java.lang.Object
    implements java.lang.Iterable<GeneSet>, java.io.Serializable
    A collection of GeneSets. Genes have associated "experimental values".
    Author:
    Pablo Cingolani
    See Also:
    Serialized Form
    • Constructor Summary

      Constructors 
      Constructor Description
      GeneSets()
      Default constructor
      GeneSets​(java.lang.String msigDb)  
      GeneSets​(GeneSets geneSets)  
    • Method Summary

      Modifier and Type Method Description
      boolean add​(java.lang.String gene)
      Add a gene and aliases
      boolean add​(java.lang.String gene, GeneSet geneSet)
      Add a gene and its corresponding gene set
      void add​(GeneSet geneSet)
      Add a gene set
      boolean addInteresting​(java.lang.String gene)
      Add a symbol as 'interesting' gene (to every corresponding GeneSet in this collection)
      void checkInterestingGenes​(java.util.Set<java.lang.String> intGenes)
      Checks that every symbol ID is in the set (as 'interesting' genes)
      protected void copy​(GeneSets geneSets)
      Copy all data from geneSets
      GeneSet disjointSet​(java.util.List<GeneSet> geneSetList, int activeSets)
      Produce a GeneSet based on a list of GeneSets and a 'mask'
      static GeneSets factory​(GoTerms goTerms)
      Create gene sets from GoTerms
      java.util.List<GeneSet> geneSetsSorted()
      All gene sets in this GeneSets, as a sorted list
      java.util.List<GeneSet> geneSetsSortedSize​(boolean reverse)
      Gene sets sorted by size (if same size, sort by name).
      int getGeneCount()
      How many genes do we have?
      java.util.Set<java.lang.String> getGenes()
      Get all genes in this set
      GeneSet getGeneSet​(java.lang.String geneSetName)
      Get a gene set named 'geneSetName'
      int getGeneSetCount()
      Get number of gene sets
      java.util.HashSet<GeneSet> getGeneSetsByGene​(java.lang.String gene)
      All gene sets that this gene belongs to
      java.util.HashMap<java.lang.String,​GeneSet> getGeneSetsByName()  
      java.util.HashSet<java.lang.String> getInterestingGenes()  
      int getInterestingGenesCount()  
      java.lang.String getLabel()  
      double getValue​(java.lang.String gene)
      Get experimental value
      java.util.HashMap<java.lang.String,​java.lang.Double> getValueByGene()  
      boolean hasGene​(java.lang.String geneId)  
      boolean hasValue​(java.lang.String gene)  
      boolean isInteresting​(java.lang.String geneName)  
      boolean isRanked()  
      protected boolean isUsed​(java.lang.String geneName)  
      protected boolean isUsed​(GeneSet gs)
      Is this gene set used? I.e. is there at least one gene 'used'? (e.g. interesting or ranked)
      java.util.Iterator<GeneSet> iterator()
      Iterate through each GeneSet in this GeneSets
      java.util.Iterator<GeneSet> iteratorSorted()
      Iterate through each GeneSet in this GeneSets, in sorted order
      java.util.Set<java.lang.String> keySet()  
      java.util.List<GeneSet> listTopTerms​(int numberToSelect)
      Select a number of GeneSets
      java.util.List<java.lang.String> loadExperimentalValues​(java.lang.String fileName, boolean maskException)
      Reads a file with a list of genes and experimental values.
      boolean loadMSigDb​(java.lang.String gmtFile, boolean maskException)
      Read an MSigDB file and add every gene set (do not add relationships between nodes in DAG)
      void remove​(GeneSet geneSet)  
      void removeGeneSet​(java.lang.String geneSetName)
      Remove a GeneSet
      void removeUnusedSets()
      Remove unused gene sets
      void reset()
      Reset every 'interesting' gene or ranked gene (on every single GeneSet in this GeneSets)
      void saveGseaGeneSets​(java.lang.String fileName)
      Save gene sets file for GSEA analysis. Format specification: http://www.broad.mit.edu/cancer/software/gsea/wiki/index.php/Data_formats#GMT:_Gene_Matrix_Transposed_file_format_.28.2A.gmt.29
      void setDoNotAddIfNotInGeneSet​(boolean doNotAddIfNotInGeneSet)  
      void setGeneSetByName​(java.util.HashMap<java.lang.String,​GeneSet> geneSets)  
      void setInterestingGenes​(java.util.HashSet<java.lang.String> interestingGenesIdSet)  
      void setValue​(java.lang.String geneId, double value)
      Set experimental value for this gene
      void setVerbose​(boolean verbose)  
      java.lang.String toString()  
      java.util.Collection<GeneSet> values()  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
      • Methods inherited from interface java.lang.Iterable

        forEach, spliterator
    • Field Detail

      • debug

        public static boolean debug
      • LOG2

        public static double LOG2
      • PRINT_SOMETHING_TIME

        public static long PRINT_SOMETHING_TIME
    • Constructor Detail

      • GeneSets

        public GeneSets()
        Default constructor
      • GeneSets

        public GeneSets​(GeneSets geneSets)
      • GeneSets

        public GeneSets​(java.lang.String msigDb)
    • Method Detail

      • factory

        public static GeneSets factory​(GoTerms goTerms)
        Create gene sets from GoTerms
        Parameters:
        goTerms - GoTerms to use
      • add

        public void add​(GeneSet geneSet)
        Add a gene set
        Parameters:
        geneSet - The gene set to add
      • add

        public boolean add​(java.lang.String gene)
        Add a gene and aliases
      • add

        public boolean add​(java.lang.String gene,
                           GeneSet geneSet)
        Add a gene and its corresponding gene set
        Parameters:
        gene - The gene to add
        geneSet - The gene set the gene belongs to
      • addInteresting

        public boolean addInteresting​(java.lang.String gene)
        Add a symbol as 'interesting' gene (to every corresponding GeneSet in this collection)
      • checkInterestingGenes

        public void checkInterestingGenes​(java.util.Set<java.lang.String> intGenes)
        Checks that every symbol ID is in the set (as 'interesting' genes)
        Parameters:
        intGenes - A set of interesting genes. Throws an exception on error.
      • copy

        protected void copy​(GeneSets geneSets)
        Copy all data from geneSets
        Parameters:
        geneSets - The GeneSets to copy data from
      • disjointSet

        public GeneSet disjointSet​(java.util.List<GeneSet> geneSetList,
                                   int activeSets)
        Produce a GeneSet based on a list of GeneSets and a 'mask'
        Parameters:
        geneSetList - A list of GeneSets
        activeSets - An integer (binary mask) that specifies whether each set in the list should be taken into account or not. The operation performed is: Intersection{ GeneSets where mask_bit == 1 } - Union{ GeneSets where mask_bit == 0 }, where the minus sign '-' is a 'set minus' operation. This operation is done for both sets in GeneSet (i.e. genes and interestingGenes).
        Returns:
        A GeneSet
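        The mask operation described above can be sketched on plain sets of gene names. This is a minimal illustration, not the library's implementation: DisjointSetSketch and disjoint are hypothetical names, and the sketch assumes that bit i of the mask (least significant bit first) refers to element i of the list, which the documentation does not state. When no bit is set, the sketch returns an empty set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in: plain sets of gene names instead of GeneSet objects. */
class DisjointSetSketch {

    /**
     * Intersection{ sets whose mask bit is 1 } minus Union{ sets whose mask bit is 0 }.
     * Assumption: bit i (least significant first) refers to sets.get(i).
     */
    static Set<String> disjoint(List<Set<String>> sets, int mask) {
        Set<String> intersection = null;       // stays null until the first active set is seen
        Set<String> union = new HashSet<>();   // union of all inactive sets
        for (int i = 0; i < sets.size(); i++) {
            boolean active = ((mask >> i) & 1) == 1;
            if (active) {
                if (intersection == null) intersection = new HashSet<>(sets.get(i));
                else intersection.retainAll(sets.get(i));
            } else {
                union.addAll(sets.get(i));
            }
        }
        if (intersection == null) return new HashSet<>(); // no active set: empty result
        intersection.removeAll(union);                    // the 'set minus' step
        return intersection;
    }

    public static void main(String[] args) {
        List<Set<String>> sets = new ArrayList<>();
        sets.add(new HashSet<>(Arrays.asList("A", "B", "C")));
        sets.add(new HashSet<>(Arrays.asList("B", "C", "D")));
        sets.add(new HashSet<>(Arrays.asList("C", "E")));
        // Mask 0b011: genes in sets 0 and 1 that are not in set 2.
        System.out.println(disjoint(sets, 0b011)); // prints [B]
    }
}
```

        In the class documented here the same operation is applied twice, once to the genes and once to the interestingGenes of each GeneSet.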
      • geneSetsSorted

        public java.util.List<GeneSet> geneSetsSorted()
        All gene sets in this GeneSets, as a sorted list
      • geneSetsSortedSize

        public java.util.List<GeneSet> geneSetsSortedSize​(boolean reverse)
        Gene sets sorted by size (if same size, sort by name).
        Parameters:
        reverse - Reverse size sorting (does not affect name sorting)
        Returns:
        The list of gene sets, sorted by size
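        The ordering described above (size first, name as tie-breaker, with the flag reversing only the size comparison) can be sketched with a composed Comparator. SortBySizeSketch and NamedSet are hypothetical stand-ins, and the sketch assumes ascending size order when reverse is false, which the documentation does not state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SortBySizeSketch {

    /** Hypothetical stand-in for GeneSet: only a name and a size. */
    static final class NamedSet {
        final String name;
        final int size;

        NamedSet(String name, int size) {
            this.name = name;
            this.size = size;
        }

        @Override
        public String toString() {
            return name + ":" + size;
        }
    }

    /** Sort by size, then by name; 'reverse' flips only the size comparison. */
    static List<NamedSet> sortedBySize(List<NamedSet> sets, boolean reverse) {
        Comparator<NamedSet> bySize = Comparator.comparingInt((NamedSet s) -> s.size);
        if (reverse) bySize = bySize.reversed();
        Comparator<NamedSet> byName = Comparator.comparing((NamedSet s) -> s.name);
        List<NamedSet> sorted = new ArrayList<>(sets); // leave the input list untouched
        sorted.sort(bySize.thenComparing(byName));
        return sorted;
    }

    public static void main(String[] args) {
        List<NamedSet> sets = Arrays.asList(
                new NamedSet("b", 2), new NamedSet("a", 2),
                new NamedSet("c", 5), new NamedSet("d", 1));
        System.out.println(sortedBySize(sets, false)); // [d:1, a:2, b:2, c:5]
        System.out.println(sortedBySize(sets, true));  // [c:5, a:2, b:2, d:1]
    }
}
```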
      • getGeneCount

        public int getGeneCount()
        How many genes do we have?
        Returns:
        The number of genes
      • getGenes

        public java.util.Set<java.lang.String> getGenes()
        Get all genes in this set
        Returns:
        The set of all genes
      • getGeneSet

        public GeneSet getGeneSet​(java.lang.String geneSetName)
        Get a gene set named 'geneSetName'
        Parameters:
        geneSetName - Name of the gene set
        Returns:
        The gene set named 'geneSetName'
      • getGeneSetCount

        public int getGeneSetCount()
        Get number of gene sets
        Returns:
        The number of gene sets
      • getGeneSetsByGene

        public java.util.HashSet<GeneSet> getGeneSetsByGene​(java.lang.String gene)
        All gene sets that this gene belongs to
        Parameters:
        gene - The gene to look up
        Returns:
        The gene sets that contain this gene
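        A lookup from a gene to the sets that contain it is commonly backed by an inverted index. The sketch below illustrates that idea only; GeneIndexSketch, add and setsFor are hypothetical names, gene sets are represented by their names, and returning an empty set for an unknown gene is an assumption (the documented method may behave differently).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical inverted index: gene name -> names of the gene sets that contain it. */
class GeneIndexSketch {

    private final Map<String, Set<String>> setNamesByGene = new HashMap<>();

    /** Record that 'gene' belongs to the gene set called 'setName'. */
    void add(String gene, String setName) {
        setNamesByGene.computeIfAbsent(gene, k -> new HashSet<>()).add(setName);
    }

    /** All set names recorded for 'gene'; empty if the gene is unknown (an assumption). */
    Set<String> setsFor(String gene) {
        return setNamesByGene.getOrDefault(gene, Collections.emptySet());
    }

    public static void main(String[] args) {
        GeneIndexSketch index = new GeneIndexSketch();
        index.add("TP53", "SET_A");
        index.add("TP53", "SET_B");
        index.add("EGFR", "SET_B");
        System.out.println(index.setsFor("TP53").size()); // prints 2
    }
}
```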
      • getGeneSetsByName

        public java.util.HashMap<java.lang.String,​GeneSet> getGeneSetsByName()
      • getInterestingGenes

        public java.util.HashSet<java.lang.String> getInterestingGenes()
      • getInterestingGenesCount

        public int getInterestingGenesCount()
      • getLabel

        public java.lang.String getLabel()
      • getValue

        public double getValue​(java.lang.String gene)
        Get experimental value
        Parameters:
        gene - The gene to look up
        Returns:
        The experimental value for this gene
      • getValueByGene

        public java.util.HashMap<java.lang.String,​java.lang.Double> getValueByGene()
      • hasGene

        public boolean hasGene​(java.lang.String geneId)
      • hasValue

        public boolean hasValue​(java.lang.String gene)
      • isInteresting

        public boolean isInteresting​(java.lang.String geneName)
      • isRanked

        public boolean isRanked()
      • isUsed

        protected boolean isUsed​(GeneSet gs)
        Is this gene set used? I.e. is there at least one gene 'used'? (e.g. interesting or ranked)
        Parameters:
        gs - The gene set to check
        Returns:
        true if at least one gene in the gene set is used
      • isUsed

        protected boolean isUsed​(java.lang.String geneName)
      • iterator

        public java.util.Iterator<GeneSet> iterator()
        Iterate through each GeneSet in this GeneSets
        Specified by:
        iterator in interface java.lang.Iterable<GeneSet>
      • iteratorSorted

        public java.util.Iterator<GeneSet> iteratorSorted()
        Iterate through each GeneSet in this GeneSets, in sorted order
      • keySet

        public java.util.Set<java.lang.String> keySet()
      • listTopTerms

        public java.util.List<GeneSet> listTopTerms​(int numberToSelect)
        Select a number of GeneSets
        Parameters:
        numberToSelect - Number of GeneSets to select
        Returns:
        A list of the selected GeneSets
      • loadExperimentalValues

        public java.util.List<java.lang.String> loadExperimentalValues​(java.lang.String fileName,
                                                                       boolean maskException)
        Reads a file with a list of genes and experimental values. Format: "gene \t value \n"
        Parameters:
        fileName - Name of the file to read
        Returns:
        A list of genes not found
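        The "gene \t value \n" format can be parsed as in the sketch below. It is an illustration under stated assumptions, not the library's code: ExperimentalValuesSketch and load are hypothetical names, the input is passed as already-read lines rather than a file name, and genes that are not in the known set are reported and not stored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ExperimentalValuesSketch {

    /**
     * Parse lines of the form "gene<TAB>value".
     * Values for known genes go into valueByGene; unknown genes are returned.
     */
    static List<String> load(List<String> lines, Set<String> knownGenes, Map<String, Double> valueByGene) {
        List<String> notFound = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) continue;   // skip blank lines
            String[] fields = line.split("\t");
            if (fields.length < 2) continue;       // hypothetical policy: ignore malformed lines
            String gene = fields[0].trim();
            double value = Double.parseDouble(fields[1].trim());
            if (knownGenes.contains(gene)) valueByGene.put(gene, value);
            else notFound.add(gene);
        }
        return notFound;
    }

    public static void main(String[] args) {
        Set<String> known = new HashSet<>(Arrays.asList("TP53", "BRCA1"));
        Map<String, Double> values = new HashMap<>();
        List<String> missing = load(Arrays.asList("TP53\t1.5", "XYZ\t2.0"), known, values);
        System.out.println(missing); // prints [XYZ]
    }
}
```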
      • loadMSigDb

        public boolean loadMSigDb​(java.lang.String gmtFile,
                                  boolean maskException)
        Read an MSigDB file and add every gene set (do not add relationships between nodes in DAG)
        Parameters:
        gmtFile - Name of the MSigDB file (GMT format)
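        In the GMT format each line describes one gene set: a set name, a description, and then the genes, all separated by tabs. The sketch below shows one way to parse such lines. GmtParseSketch and parse are hypothetical names, the input is passed as already-read lines, and skipping lines with fewer than three fields is an assumption.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class GmtParseSketch {

    /** Parse GMT lines ("name<TAB>description<TAB>gene1<TAB>gene2...") into name -> genes. */
    static Map<String, Set<String>> parse(List<String> lines) {
        Map<String, Set<String>> genesBySetName = new LinkedHashMap<>();
        for (String line : lines) {
            String[] fields = line.split("\t");
            if (fields.length < 3) continue; // need a name, a description and at least one gene
            // Fields 0 and 1 are the name and description; the rest are genes.
            Set<String> genes = new LinkedHashSet<>(Arrays.asList(fields).subList(2, fields.length));
            genesBySetName.put(fields[0], genes);
        }
        return genesBySetName;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> sets = parse(Arrays.asList(
                "SET_A\tfirst example set\tTP53\tBRCA1",
                "SET_B\tsecond example set\tEGFR"));
        System.out.println(sets.keySet()); // prints [SET_A, SET_B]
    }
}
```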
      • remove

        public void remove​(GeneSet geneSet)
      • removeGeneSet

        public void removeGeneSet​(java.lang.String geneSetName)
        Remove a GeneSet
      • removeUnusedSets

        public void removeUnusedSets()
        Remove unused gene sets
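        Following the isUsed description (a gene set is used if at least one of its genes is 'used'), removal of unused sets can be sketched as below. RemoveUnusedSketch and removeUnused are hypothetical names, gene sets are modelled as a map from set name to gene names, and only 'interesting' genes count as used here; ranked genes are not modelled.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class RemoveUnusedSketch {

    /** Drop every gene set that shares no gene with the interesting genes. */
    static void removeUnused(Map<String, Set<String>> genesBySetName, Set<String> interestingGenes) {
        // Removing through the values() view also removes the corresponding map entry.
        genesBySetName.values().removeIf(genes -> Collections.disjoint(genes, interestingGenes));
    }

    public static void main(String[] args) {
        Map<String, Set<String>> sets = new HashMap<>();
        sets.put("SET_A", new HashSet<>(Arrays.asList("TP53", "EGFR")));
        sets.put("SET_B", new HashSet<>(Arrays.asList("BRCA1")));
        removeUnused(sets, Collections.singleton("TP53"));
        System.out.println(sets.keySet()); // prints [SET_A]
    }
}
```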
      • reset

        public void reset()
        Reset every 'interesting' gene or ranked gene (on every single GeneSet in this GeneSets)
      • saveGseaGeneSets

        public void saveGseaGeneSets​(java.lang.String fileName)
        Save gene sets file for GSEA analysis. Format specification: http://www.broad.mit.edu/cancer/software/gsea/wiki/index.php/Data_formats#GMT:_Gene_Matrix_Transposed_file_format_.28.2A.gmt.29
        Parameters:
        fileName - Name of the output file
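        A GMT line consists of the set name, a description, and the genes, joined by tabs. The sketch below builds one such line. GmtWriteSketch and toGmtLine are hypothetical names, and what the documented method writes in the description column is not stated here, so the description is simply a caller-supplied string.

```java
import java.util.Arrays;
import java.util.Collection;

class GmtWriteSketch {

    /** Build one GMT line: name<TAB>description<TAB>gene1<TAB>gene2... */
    static String toGmtLine(String name, String description, Collection<String> genes) {
        StringBuilder sb = new StringBuilder(name).append('\t').append(description);
        for (String gene : genes) sb.append('\t').append(gene);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints the three-plus columns separated by tab characters.
        System.out.println(toGmtLine("SET_A", "example set", Arrays.asList("TP53", "BRCA1")));
    }
}
```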
      • setDoNotAddIfNotInGeneSet

        public void setDoNotAddIfNotInGeneSet​(boolean doNotAddIfNotInGeneSet)
      • setGeneSetByName

        public void setGeneSetByName​(java.util.HashMap<java.lang.String,​GeneSet> geneSets)
      • setInterestingGenes

        public void setInterestingGenes​(java.util.HashSet<java.lang.String> interestingGenesIdSet)
      • setValue

        public void setValue​(java.lang.String geneId,
                             double value)
        Set experimental value for this gene
        Parameters:
        geneId - The gene identifier
        value - The experimental value to set
      • setVerbose

        public void setVerbose​(boolean verbose)
      • toString

        public java.lang.String toString()
        Overrides:
        toString in class java.lang.Object
      • values

        public java.util.Collection<GeneSet> values()