README

The goal of GARCOM is to provide mutation counts per individual within genetic boundaries (genes). It accepts PLINK .raw input together with either a SNP-to-gene annotation or SNP locations plus gene boundaries. It also accepts VCF input. The VCF file is assumed to be a pVCF, that is, a single file in which all samples are merged.

library(GARCOM)
## basic example code
## sample data provided with library: recodedgen, snpgene, snppos and genecoord

## Input data is the .raw file produced by the PLINK --recode A option, for example:
## plink --bfile input --recode A --out sample_output

#input data: .raw formatted and SNP-gene (two columns)
gene_annot_counts(recodedgen,snpgene) 
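To make the idea of a per-gene, per-individual count concrete, here is a minimal base-R sketch on toy data. It does not call GARCOM and is not the package's implementation. The objects `toy_raw`, `toy_snpgene` and `toy_counts` and their column names are hypothetical, chosen only to mimic the layout of a PLINK .raw file and a two-column SNP-gene table.

```r
# Toy stand-in for a PLINK .raw file: six leading columns, then one column
# per SNP holding the count (0, 1 or 2) of the counted allele.
toy_raw <- data.frame(
  FID = c("F1", "F2", "F3"),
  IID = c("IID1", "IID2", "IID3"),
  PAT = 0, MAT = 0, SEX = c(1, 2, 1), PHENOTYPE = -9,
  SNP1 = c(0, 1, 2),
  SNP2 = c(1, 0, 0),
  SNP3 = c(0, 0, 1),
  stringsAsFactors = FALSE
)

# Toy SNP-to-gene annotation (two columns; names are assumptions).
toy_snpgene <- data.frame(
  SNP  = c("SNP1", "SNP2", "SNP3"),
  GENE = c("GENE_1", "GENE_1", "GENE_2"),
  stringsAsFactors = FALSE
)

# For each gene, sum the allele counts of its SNPs within each individual.
snps_by_gene <- split(toy_snpgene$SNP, toy_snpgene$GENE)
toy_counts <- t(sapply(snps_by_gene, function(snps) {
  rowSums(toy_raw[, snps, drop = FALSE], na.rm = TRUE)
}))
colnames(toy_counts) <- toy_raw$IID

print(toy_counts)  # rows are genes, columns are individuals
```

In this toy layout, rows are genes and columns are individuals. The shape of the object returned by GARCOM itself may differ.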

#input data: .raw formatted, SNP location (two columns) and Gene boundaries (three columns)
gene_pos_counts(recodedgen, snppos, genecoord) 
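When only SNP positions and gene boundaries are available, each SNP first has to be assigned to a gene. The base-R sketch below shows one way to do that on toy data. It assumes a single chromosome and inclusive start and end coordinates. The objects and column names (`toy_snppos`, `toy_genecoord`, `BP`, `START`, `END`) are hypothetical, and GARCOM may handle boundaries differently.

```r
# Toy SNP positions (two columns) and gene boundaries (three columns).
toy_snppos <- data.frame(
  SNP = c("SNP1", "SNP2", "SNP3"),
  BP  = c(150, 420, 900),
  stringsAsFactors = FALSE
)
toy_genecoord <- data.frame(
  GENE  = c("GENE_1", "GENE_2"),
  START = c(100, 800),
  END   = c(500, 1000),
  stringsAsFactors = FALSE
)

# in_gene[i, j] is TRUE when SNP i lies inside the [START, END] interval of gene j.
in_gene <- outer(toy_snppos$BP, toy_genecoord$START, ">=") &
  outer(toy_snppos$BP, toy_genecoord$END, "<=")
dimnames(in_gene) <- list(toy_snppos$SNP, toy_genecoord$GENE)

# Turn the TRUE cells into a two-column SNP-gene table.
hits <- which(in_gene, arr.ind = TRUE)
toy_snp_to_gene <- data.frame(
  SNP  = rownames(in_gene)[hits[, "row"]],
  GENE = colnames(in_gene)[hits[, "col"]],
  stringsAsFactors = FALSE
)

print(toy_snp_to_gene)
```

The resulting table has the same two-column shape as a SNP-gene annotation, so the per-gene counting step can proceed as in the annotation-based case.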

#read VCF file
vcf_data <- vcfR::read.vcfR("CHRXX.vcf.gz", verbose=TRUE)
vcf_counts_annot(vcf_data,df_snpgene) # pass the VCF data and a data frame with the SNP-gene annotation
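VCF genotypes are stored as strings such as 0/1 or 1|1 and not as allele counts, so a conversion step is implied before counts can be summed. The sketch below shows one plausible conversion in base R on a toy genotype matrix. It assumes that every non-reference allele counts as one mutation and that missing calls become NA. `toy_gt` and `gt_to_count` are hypothetical names, and this is not necessarily how GARCOM treats genotypes internally.

```r
# Toy genotype matrix: rows are SNPs, columns are individuals.
toy_gt <- matrix(
  c("0/0", "0/1", "1|1",
    "0/1", "./.", "0|0"),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("SNP1", "SNP2"), c("IID1", "IID2", "IID3"))
)

# Convert one GT string to the number of non-reference alleles.
gt_to_count <- function(gt) {
  alleles <- strsplit(gt, "[/|]")[[1]]        # handles unphased and phased calls
  if (any(alleles == ".")) return(NA_integer_)  # missing genotype
  sum(alleles != "0")
}

# Apply the conversion to every cell, keeping the row and column names.
toy_dosage <- apply(toy_gt, c(1, 2), gt_to_count)

print(toy_dosage)
```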

#read VCF file
vcf_data <- vcfR::read.vcfR("CHRXX.vcf.gz", verbose=TRUE)
vcf_counts_SNP_genecoords(vcf_data,df_snppos,df_genecoords) # pass the VCF data, a data frame of SNP positions, and a data frame of gene coordinates

#subset individuals 
vcf_counts_SNP_genecoords(vcf_data,df_snptestpos, df_genecoordstestpos,keep_indiv=c("IID1","IID2"))

ind_select<-c("IID1","IID2") ## store in a vector
vcf_counts_SNP_genecoords(vcf_data,df_snptestpos, df_genecoordstestpos,keep_indiv=ind_select)

## Filter individuals and genes for VCF data
vcf_counts_SNP_genecoords(vcf_data,df_snptestpos,df_genecoordstestpos,keep_indiv=c("IID1","IID2"),filter_gene="GENE_1") #returns a matrix of data.table class

## For more examples, refer to the package manual
