Eudract and CT.gov Safety XML

Simon Bond

2026-01-09

Introduction

The European Clinical Trials Database, EudraCT, is run by the European Medicines Agency. All officially registered clinical trials must enter the results of their final analysis into the website, where they are made public.

The equivalent database hosted by the U.S. government is ClinicalTrials.gov, abbreviated here to CT.gov.

There is a large amount of documentation online that will not be repeated here.

Entering detailed safety information by hand is an onerous task. However, an XML file can be uploaded to automate this step. This R package enables the production of such an XML file from safety data recorded at the patient level in a standard structure.

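The key functions are safety_summary() and safety_summary_adam(), which calculate the summary statistics; simple_safety_xml(), which writes them to a simple XML file; eudract_convert() and clintrials_gov_convert(), which convert that file into the upload formats; and clintrials_gov_upload(), which uploads directly to ClinicalTrials.gov.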

We provide a dummy safety data set with one line per patient-event; see ?safety. The format of the data, specifically the variable names, is described in the help file printed below. The data need to be turned into several frequency tables: one giving information at the group level, and others giving information at the event level, separately for serious and non-serious events. The term “group” here corresponds to a treatment arm in a randomised controlled trial. You need to define a group within EudraCT even for a one-armed study, in which case it can be given a “dummy” label.

Some information is hard-coded below, although in real use it would be generated by code. Each entry in the vectors below corresponds to the count in one of the two groups.

subjectsExposed <- c("Control"=99,"Experimental"=101)
#count of deaths not in the Safety data. Could be c(0,0)
deathsExternal <- c("Control"=3,"Experimental"=5)
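In real use these two vectors would be derived from subject-level records rather than typed in. The following is a minimal sketch in base R, assuming a hypothetical data frame subjects with one row per subject and a hypothetical flag died_outside_ae; neither name comes from the package.

```r
# Hypothetical subject-level data: one row per subject,
# including subjects who never had an adverse event.
subjects <- data.frame(
  subjid = 1:6,
  group  = c("Control", "Control", "Control",
             "Experimental", "Experimental", "Experimental"),
  # TRUE if the subject died and the death is not recorded in the safety data
  died_outside_ae = c(FALSE, TRUE, FALSE, FALSE, FALSE, TRUE)
)

# Named vector: number of subjects in each group
group_table <- table(subjects$group)
exposed_sketch <- setNames(as.vector(group_table), names(group_table))

# Named vector: deaths per group that are absent from the safety data
deaths_sketch <- sapply(split(subjects$died_outside_ae, subjects$group), sum)

exposed_sketch
deaths_sketch
```

Both results are named by group, which is the shape of the hard-coded subjectsExposed and deathsExternal vectors above.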

Adverse events need to be coded to be useful and to avoid the task of reconciling minor spelling or text inconsistencies. This package and vignette assume that this is the case, and will not work in the absence of coding. We cannot provide the full MedDRA dictionary for copyright reasons, but it is normally available to sponsors. For upload into EudraCT, however, the minimal requirement is that the System Organ Class (SOC) is fully coded using EudraCT's internal coding system. The package includes an internal data set for this purpose, derived from the eutct site; see ?soc_code.

Safety data set

There are two possible formats for inputting the safety data.

Original Format

A sample data set with the required fields is included.

safety R Documentation

Example of safety data

Description

A dataset containing some example data of safety events in raw source format

Usage

safety

Format

a data frame with 8 columns and 16 rows

pt: MedDRA preferred term code

subjid: a unique subject identifier

related: a logical indicating if the event is related to the treatment

soc: the MedDRA code for the System Organ Class

fatal: a numerical 0/1 to indicate if the event was fatal

serious: a numerical 0/1 to indicate if the event was serious

group: the treatment group for the subject

term: a text description of the event. Must match the pt code one-to-one

Details

The data contain one row per patient-event, so the numbers exposed in each arm cannot be inferred from them: patients with no events are not included.

The variable names and formats are those required by safety_summary. The variable pt is not strictly required. An alternative to soc would be the equivalent character string from soc_code.
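Before calling safety_summary it can help to check that your own data carry the variables listed above. The following is a minimal sketch in base R; the helper missing_safety_cols is hypothetical and not part of the eudract package, and the toy values (including the soc codes) are placeholders rather than real MedDRA codes.

```r
# Columns described in the help file; pt is left out because it is
# described above as not strictly required.
required_cols <- c("subjid", "related", "soc", "fatal", "serious", "group", "term")

# Hypothetical helper: names of required columns absent from `data`
missing_safety_cols <- function(data, required = required_cols) {
  setdiff(required, names(data))
}

# Toy data, one row per patient-event, with made-up placeholder values
toy_safety <- data.frame(
  subjid  = c(1, 1, 2),
  related = c(TRUE, FALSE, FALSE),
  soc     = c(1, 1, 2),            # placeholders, not real MedDRA codes
  fatal   = c(0, 0, 1),
  serious = c(0, 1, 1),
  group   = c("Control", "Control", "Experimental"),
  term    = c("Event A", "Event B", "Event A")
)

missing_safety_cols(toy_safety)                         # nothing missing
missing_safety_cols(toy_safety[, c("subjid", "term")])  # five columns missing

# fatal and serious are documented as numerical 0/1 indicators
all(toy_safety$fatal %in% c(0, 1)) && all(toy_safety$serious %in% c(0, 1))
```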

ADaM format

A pair of example data sets is taken from the pharmaverseadam package: the individual subject-level data pharmaverseadam::adsl and the adverse event data pharmaverseadam::adae. The adverse event data are readily convertible to the original format, just with different variable names. The subject-level data are used internally to calculate the numbers exposed and the excess deaths not included in the adverse event data.
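To illustrate what "different variable names" means, here is a rough sketch of a manual mapping from ADaM-style adverse event records to the original format. It is an illustration only, not a description of what safety_summary_adam does internally. The variable names USUBJID, AEDECOD, AEBODSYS, AESER, AEREL and AESDTH are standard CDISC names; using ARM as the group variable and "Y"/"N" as the flag values are assumptions, and all the values are made up.

```r
# Toy adverse event records with ADaM-style variable names (values made up)
adae_toy <- data.frame(
  USUBJID  = c("01-001", "01-001", "01-002"),
  ARM      = c("Placebo", "Placebo", "Active"),
  AEDECOD  = c("HEADACHE", "SYNCOPE", "NAUSEA"),
  AEBODSYS = c("NERVOUS SYSTEM DISORDERS", "NERVOUS SYSTEM DISORDERS",
               "GASTROINTESTINAL DISORDERS"),
  AESER    = c("N", "Y", "N"),
  AEREL    = c("NONE", "PROBABLE", "POSSIBLE"),
  AESDTH   = c("N", "N", "N")
)

# Free-text AEREL values that are taken to mean "related"
related_terms <- c("POSSIBLE", "PROBABLE", "DEFINITELY")

# The same records under the variable names of the original format
safety_like <- data.frame(
  subjid  = adae_toy$USUBJID,
  group   = adae_toy$ARM,
  term    = adae_toy$AEDECOD,
  soc     = adae_toy$AEBODSYS,   # SOC as text; matching to soc_code not shown
  serious = as.numeric(adae_toy$AESER == "Y"),
  fatal   = as.numeric(adae_toy$AESDTH == "Y"),
  related = adae_toy$AEREL %in% related_terms
)

safety_like
```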

Calculate Summary Statistics

We provide functions that derive the required patient and event counts and store them in a format internal to R.

Original Data Format

safety_statistics <- safety_summary(safety, 
                                    exposed=subjectsExposed, 
                                    excess_deaths = deathsExternal, 
                                    freq_threshold = 1
                                    )
safety_statistics
## Group-Level Statistics
## 
##          title subjectsAffectedBySeriousAdverseEvents
## 1      Control                                     15
## 2 Experimental                                     33
##   subjectsAffectedByNonSeriousAdverseEvents deathsResultingFromAdverseEvents
## 1                                        15                                9
## 2                                        24                               22
##   subjectsExposed deathsAllCauses
## 1              99              12
## 2             101              27
## 
## Non-serious event-level statistics (intial rows)
## 
##   groupTitle subjectsAffected occurrences                              term
## 1    Control                1           1           Acute coronary syndrome
## 2    Control                1           1            Intestinal perforation
## 3    Control                1           1                Laryngeal stenosis
## 4    Control                1           1 Lower respiratory tract infection
## 5    Control                1           1               Lung adenocarcinoma
## 6    Control                2           2                         Pneumonia
##        eutctId
## 1 100000004849
## 2 100000004856
## 3 100000004855
## 4 100000004855
## 5 100000004855
## 6 100000004862
## 
## Serious event-level statistics (intial rows)
## 
##   groupTitle subjectsAffected occurrences                       term
## 1    Control                1           1             Abdominal pain
## 2    Control                1           1   Aortic valve replacement
## 3    Control                1           1            B-cell lymphoma
## 4    Control                0           0          Bladder papilloma
## 5    Control                1           1             Cardiac arrest
## 6    Control                0           0 Cardiac failure congestive
##        eutctId occurrencesCausallyRelatedToTreatment deaths
## 1 100000004856                                     0      0
## 2 100000004865                                     0      0
## 3 100000004851                                     0      0
## 4 100000004864                                     0      0
## 5 100000004849                                     0      1
## 6 100000004849                                     0      0
##   deathsCausallyRelatedToTreatment
## 1                                0
## 2                                0
## 3                                0
## 4                                0
## 5                                0
## 6                                0
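The difference between subjectsAffected and occurrences can be sketched with base R on a toy data set. This is an illustration of the two kinds of count under made-up data, not the package's own implementation. Unlike the output above, this sketch does not create rows for group and term combinations with zero events.

```r
# Toy patient-event data (made up): subject 1 has the same event twice
toy_events <- data.frame(
  subjid = c(1, 1, 2, 3, 3),
  group  = c("Control", "Control", "Control", "Experimental", "Experimental"),
  term   = c("Pneumonia", "Pneumonia", "Pneumonia", "Pneumonia", "Cardiac arrest")
)

# Occurrences: number of events (rows) per group and term
occurrences <- aggregate(subjid ~ group + term, data = toy_events, FUN = length)
names(occurrences)[3] <- "occurrences"

# Subjects affected: number of distinct subjects per group and term
affected <- aggregate(subjid ~ group + term, data = toy_events,
                      FUN = function(x) length(unique(x)))
names(affected)[3] <- "subjectsAffected"

counts <- merge(affected, occurrences)
counts

# In the group-level output above, deathsAllCauses equals
# deathsResultingFromAdverseEvents plus the excess deaths supplied by the user.
deaths_from_ae  <- c(Control = 9, Experimental = 22)  # from the printed output
deaths_external <- c(Control = 3, Experimental = 5)   # deathsExternal
deaths_from_ae + deaths_external                      # 12 and 27
```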

ADaM data format

The same arguments freq_threshold and na.action are used. The excess_deaths and exposed arguments are calculated internally and so are not needed. The ADaM data have an unconstrained text field AEREL, so the argument related_terms gives the set of terms used to identify related events; the default is c("POSSIBLE","PROBABLE","DEFINITELY").

library(pharmaverseadam)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
data("adsl")
data("adae")

safety_statistics_adam <- safety_summary_adam(
  adsl |> filter(ARM != "Screen Failure"),
  adae)
safety_statistics_adam 
## Group-Level Statistics
## 
##                  title subjectsAffectedBySeriousAdverseEvents
## 1              Placebo                                      0
## 2 Xanomeline High Dose                                      2
## 3  Xanomeline Low Dose                                      1
##   subjectsAffectedByNonSeriousAdverseEvents deathsResultingFromAdverseEvents
## 1                                        69                                2
## 2                                        78                                0
## 3                                        77                                1
##   subjectsExposed deathsAllCauses
## 1              86               2
## 2              84               0
## 3              84               1
## 
## Non-serious event-level statistics (intial rows)
## 
##   groupTitle subjectsAffected occurrences                 term      eutctId
## 1    Placebo                0           0 ABDOMINAL DISCOMFORT 100000004856
## 2    Placebo                1           1       ABDOMINAL PAIN 100000004856
## 3    Placebo                0           0 ACROCHORDON EXCISION 100000004865
## 4    Placebo                0           0    ACTINIC KERATOSIS 100000004858
## 5    Placebo                2           2            AGITATION 100000004873
## 6    Placebo                0           0          ALCOHOL USE 100000004869
## 
## Serious event-level statistics (intial rows)
## 
##             groupTitle subjectsAffected occurrences
## 1              Placebo                0           0
## 2              Placebo                0           0
## 3 Xanomeline High Dose                1           1
## 4 Xanomeline High Dose                1           1
## 5  Xanomeline Low Dose                0           0
## 6  Xanomeline Low Dose                1           1
##                                             term      eutctId
## 1 PARTIAL SEIZURES WITH SECONDARY GENERALISATION 100000004852
## 2                                        SYNCOPE 100000004852
## 3 PARTIAL SEIZURES WITH SECONDARY GENERALISATION 100000004852
## 4                                        SYNCOPE 100000004852
## 5 PARTIAL SEIZURES WITH SECONDARY GENERALISATION 100000004852
## 6                                        SYNCOPE 100000004852
##   occurrencesCausallyRelatedToTreatment deaths deathsCausallyRelatedToTreatment
## 1                                     0      0                                0
## 2                                     0      0                                0
## 3                                     0      0                                0
## 4                                     1      0                                0
## 5                                     0      0                                0
## 6                                     1      0                                0

Convert to XML

If you have produced these statistics through separate coding, then you can use the eudract:::create.summary_statistics() function to put them into the correct internal format and start the conversion to XML directly.

First we export the safety_statistics to a human-readable XML document, referred to as “simple.xml” (a temporary file in the code below). Then we convert it to the EudraCT and CT.gov formats.

simple <- tempfile(fileext = ".xml")
eudract_upload_file <- tempfile(fileext = ".xml")
ct_upload_file <- tempfile(fileext = ".xml")
simple_safety_xml(safety_statistics, simple)
## '/tmp/RtmpctG5NZ/file39702568bfe5.xml' is created or modified
eudract_convert(input=simple,
                output=eudract_upload_file)
## '/tmp/RtmpctG5NZ/file397059b5447e.xml' is created or modified
## Please email cuh.cctu@nhs.net to tell us if you have successfully uploaded a study to EudraCT.
## This is to allow us to measure the impact of this tool.
clintrials_gov_convert(input=simple,
                       original=system.file("extdata", "1234.xml", package ="eudract"),
                output=ct_upload_file)
## '/tmp/RtmpctG5NZ/file3970958dcb4.xml' is created or modified
## Please email cuh.cctu@nhs.net to tell us if you have successfully uploaded a study to ClinicalTrials.gov .
## This is to allow us to measure the impact of this tool.

Note that for ClinicalTrials.gov a study must first be set up within the website and its XML downloaded. This downloaded file is the original argument. The safety events data in the original file are then overwritten, and the result can be manually uploaded back into ClinicalTrials.gov.

Alternatively, if you have a user account within CT.gov, the study still needs to be set up there first, but the API can then be used to upload directly, with no need to interact with the site manually.

# Not actually run. It needs real user account details: the ones below are fictitious.
clintrials_gov_upload(
    input=simple,
    orgname="CTU",
    username="Student",
    password="Guinness",
    studyid="1234"
    )
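With a real account you may prefer not to type credentials into a script. One option, sketched here in base R, is to read them from environment variables. The helper get_credential and the variable names CTGOV_USERNAME and CTGOV_PASSWORD are hypothetical and not part of the package.

```r
# Hypothetical helper: read a credential from an environment variable
# and stop with a clear message if it is unset or empty.
get_credential <- function(var) {
  value <- Sys.getenv(var, unset = NA)
  if (is.na(value) || !nzchar(value)) {
    stop("Environment variable ", var, " is not set", call. = FALSE)
  }
  value
}

# Demonstration with a throwaway variable
Sys.setenv(CTGOV_DEMO_VALUE = "example")
get_credential("CTGOV_DEMO_VALUE")

# In real use (variable names are assumptions, set outside the script):
# username <- get_credential("CTGOV_USERNAME")
# password <- get_credential("CTGOV_PASSWORD")
```

The retrieved values could then be passed as the username and password arguments of clintrials_gov_upload() shown above.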

Output

The key outputs are the three XML files created above: the simple XML file, the EudraCT upload file and the CT.gov upload file. Note that these are semi-readable files of code/data rather than standard web pages.

We can validate the output against the XML schemas provided by EudraCT and CT.gov, although the calls to eudract_convert() and clintrials_gov_convert() also do this behind the scenes, returning the value TRUE if there are no errors against the schema validation.

myschema <- xml2::read_xml(system.file("extdata","adverseEvents.xsd", package="eudract"))
aes <- xml2::read_xml(eudract_upload_file)
check <- xml2::xml_validate(aes,myschema)
if(check){print("Validation against eudraCT schema has passed!")}
## [1] "Validation against eudraCT schema has passed!"
myschema <- xml2::read_xml(system.file("extdata","ProtocolRecordSchema.xsd", package="eudract"))
aes <- xml2::read_xml(ct_upload_file)
check <- xml2::xml_validate(aes,myschema)
if(check){print("Validation against CT.gov schema has passed!")}
## [1] "Validation against CT.gov schema has passed!"
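When validation fails, the result of xml2::xml_validate() is FALSE and carries an "errors" attribute holding the messages. The following self-contained sketch shows this with a toy schema that is hypothetical and far simpler than the schema files shipped with the package.

```r
library(xml2)

# Toy schema (hypothetical): a root element <events> holding
# one or more integer <count> elements.
toy_schema <- read_xml(
'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="events">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="count" type="xs:integer" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>')

good <- read_xml("<events><count>3</count><count>5</count></events>")
bad  <- read_xml("<events><count>three</count></events>")

check_good <- xml_validate(good, toy_schema)
check_bad  <- xml_validate(bad, toy_schema)

as.logical(check_good)     # TRUE: the document conforms to the schema
as.logical(check_bad)      # FALSE: "three" is not an integer
attr(check_bad, "errors")  # messages describing the failure
```

Printing attr(check, "errors") after a failed check on a real upload file is a quick way to see which element the schema rejected.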

Manual Upload

To use the resulting EudraCT XML file, log in online and navigate to the study-specific area of the EudraCT site. On the top banner is a link, “Upload XML”, which you follow. Choose the option “Adverse Events” rather than “Full data set”, and select the XML file you have produced. The resulting information can be viewed interactively in the browser or as a static PDF file (note this is a fictitious study and fictitious data). This is not the only step in completing the EudraCT report, as the description of the study, baseline characteristics and efficacy analysis will all need to be added. That is not the remit of this package though.

For ClinicalTrials.gov, once logged in, there is a button titled “Records” near the top. From there select “Upload Record (XML)”. On the new page, use the “Choose File” button to select the newly created XML file, and click “Upload”.

To extract the original study record for overwriting, go into the specific study record from the home page. Beneath the initial section titled “Record Status” there is a link, “Download XML”, which lets you save the required file locally.