
Drug databases vary widely in format and structure, which makes related data analysis difficult: working with even two databases together, such as DrugBank, OnSIDES, and TWOSIDES, requires considerable effort.
Hence, the dbparser package aims to parse different public
drug databases into a single, unified R object called
dvobject (short for drugverse object).
With recent updates, dbparser has evolved into an
integration engine, allowing you to merge mechanistic
data (DrugBank) with real-world phenotypic data (OnSIDES) and drug-drug
interaction risks (TWOSIDES).
That should help in analyzing a dvobject and storing results in
the same object in a straightforward way.

dvobject introduces a unified and compressed format for
drug data. It is an R list object.
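Because a dvobject is an ordinary R list, base R tools are enough to get a first overview of it. The sketch below uses a hand-built stand-in list, not a real dvobject; the element names `group_a`, `group_b`, `table_1`, and `table_2` are placeholders chosen for illustration and are not the package's actual sub-list names.

```r
# A stand-in nested list built by hand. The element names are placeholders,
# not the actual sub-lists of a dvobject.
mock_obj <- list(
  group_a = list(
    table_1 = data.frame(id = c("X1", "X2"), name = c("alpha", "beta")),
    table_2 = data.frame(id = c("X1", "X1", "X2"), value = c(1, 2, 3))
  ),
  group_b = data.frame(id = "X1", note = "example")
)

# Names of the top-level elements
top_names <- names(mock_obj)

# Compact overview, limited to two levels of nesting
str(mock_obj, max.level = 2)

# Number of rows in each table of one sub-list
rows_per_table <- vapply(mock_obj$group_a, nrow, integer(1))
print(rows_per_table)
```

The same calls should apply to any parsed object, since they rely only on it being a list of data frames and sub-lists.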
For a single database (e.g., DrugBank): It contains one or more of the following sub-lists:
For a merged database (Integrated
Pharmacovigilance): When databases are merged using
merge_drugbank_onsides or
merge_drugbank_twosides, the dvobject becomes
a nested structure containing:
Parsers are available for the following databases (the list is still growing):
The DrugBank database is a comprehensive, freely accessible, online database containing information on drugs and drug targets. As both a bioinformatics and a cheminformatics resource, DrugBank combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e. sequence, structure, and pathway) information. More information about DrugBank can be found here.
In its raw form, the DrugBank database is a single XML file. Users must create an account with DrugBank and request permission to download the database. Note that this may take a couple of days.
The dbparser package parses the DrugBank XML database
into R tibbles that can be explored and analyzed by the
user; see this
tutorial for more details.
If you are waiting for access to the DrugBank database, or do not
intend to do a deep dive with the data, you may wish to use the
dbdataset package,
which contains the DrugBank database already parsed into
dvobject. Note that this is a large package that exceeds
the limit set by CRAN. It is only available on GitHub.
dbparser has been tested successfully against DrugBank versions
5.1.0 through 5.1.12. If you find errors
with these versions or any other version, please submit an issue here.
OnSIDES provides adverse drug
events extracted from thousands of FDA drug labels using machine
learning.

* Parser: parseOnSIDES()
* Input: Directory containing OnSIDES CSV files.
TWOSIDES
provides data on drug-drug interactions and the adverse events that
arise when two drugs are taken together.

* Parser: parseTWOSIDES()
* Input: The TWOSIDES.csv.gz file.
The power of dbparser lies in its ability to chain
parsers and mergers together. Here is how you can build a complete
pharmacovigilance dataset:
library(dbparser)
library(dplyr)

# 1. Parse the raw databases
drugbank_db <- parseDrugBank("data/drugbank.xml")
onsides_db <- parseOnSIDES("data/onsides/")
twosides_db <- parseTWOSIDES("data/TWOSIDES.csv.gz")

# 2. Build the Integrated Knowledge Graph
# DrugBank serves as the hub. We chain the merges.
final_db <- drugbank_db %>%
  merge_drugbank_onsides(onsides_db) %>%
  merge_drugbank_twosides(twosides_db)

# 3. Analyze Results
# Example: Accessing the enriched drug-drug interaction table
head(final_db$integrated_data$drug_drug_interactions)

For a detailed case study, please refer to the Integrated Pharmacovigilance Vignette.
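Before running the pipeline above, it can help to confirm that the three inputs are in place, so that a missing file is reported up front rather than partway through parsing. The sketch below uses only base R; the paths are the ones from the example, and `check_inputs` is a hypothetical helper written for illustration, not a dbparser function.

```r
# Hypothetical helper (not part of dbparser): report which input paths exist.
check_inputs <- function(paths) {
  # Drop trailing slashes, which can make directory checks fail on Windows
  cleaned <- sub("/+$", "", paths)
  found <- file.exists(cleaned)  # TRUE for existing files and directories
  names(found) <- names(paths)
  found
}

# Paths taken from the pipeline example; adjust to your own layout
inputs <- c(
  drugbank = "data/drugbank.xml",
  onsides  = "data/onsides/",
  twosides = "data/TWOSIDES.csv.gz"
)

status <- check_inputs(inputs)
if (!all(status)) {
  message("Missing inputs: ", paste(names(status)[!status], collapse = ", "))
}
```

The helper only reports what it finds; whether to stop or continue on a missing input is left to the caller.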
You can install the released version of dbparser from CRAN with:

install.packages("dbparser")

or you can install the latest updates directly from the repo:

library(devtools)
devtools::install_github("ropensci/dbparser")

Please note that the ‘dbparser’ project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
👍🎉 First off, thanks for taking the time to contribute! 🎉👍 Please review our Contributing Guide.
Think dbparser is useful? Let others discover it, by telling them in person, via Twitter or a blog post.
Using dbparser for a paper you are writing? Consider citing it
citation("dbparser")
#> To cite dbparser in publications use:
#>
#> Mohammed Ali, Ali Ezzat (). dbparser: DrugBank Database XML Parser.
#> R package version 2.2.0.
#>
#> A BibTeX entry for LaTeX users is
#>
#> @Manual{,
#> title = {DrugBank Database XML Parser},
#> author = {Mohammed Ali and Ali Ezzat},
#> organization = {Interstellar for Consultinc inc.},
#> note = {R package version 2.2.0},
#> url = {https://CRAN.R-project.org/package=dbparser},
#> }