Number of trial registrations by year, location, disease and phase of development (1999-2019)
Published: March 2020
The number of trials listed in the WHO International Clinical Trials Registry Platform (ICTRP) is reported by year, location (worldwide, WHO region and country), disease (or condition) and phase of development, for the period of 1999–2019. Note that the ICTRP comprises both interventional and observational trials. See more on scope of ICTRP below.
What you see
The data visualization above shows trials on ICTRP from 1999 to 2019 as follows:
- Trials by year worldwide (chart A)
- Trials by phase of development (chart B)
- Trials by year and WHO region (chart C)
- Trials by disease or condition (chart D)
- Trials by country (chart E), colour coded by WHO region
Chart A shows the total number of trials, regardless of the number of trial sites involved. A multicountry trial is counted once for each participating region in chart C and once for each participating country in chart E. By default, charts B and D count each trial once, regardless of the number of sites. When a selection is made by filtering on a region or country, the counts follow the same principle. The year corresponds to the date of enrollment of the first trial participant; in the small number of cases where this date is not available, the registration year is used.
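The counting rules described above can be illustrated with a minimal sketch. The trial records, field names and region codes below are hypothetical and serve only to show why the totals differ between charts; this is not the code used to produce the visualization.

```python
from collections import Counter

# Hypothetical records: each trial lists the (region, country) of every site.
trials = [
    {"id": "T1", "sites": [("EUR", "France"), ("EUR", "Germany"), ("AMR", "Brazil")]},
    {"id": "T2", "sites": [("AMR", "Brazil")]},
    {"id": "T3", "sites": [("EUR", "France"), ("EUR", "France")]},
]


def count_worldwide(trials):
    """Chart A principle: one count per trial, however many sites it has."""
    return len(trials)


def count_by_region(trials):
    """Chart C principle: one count per trial for each distinct region."""
    counts = Counter()
    for trial in trials:
        for region in {region for region, _ in trial["sites"]}:
            counts[region] += 1
    return counts


def count_by_country(trials):
    """Chart E principle: one count per trial for each distinct country."""
    counts = Counter()
    for trial in trials:
        for country in {country for _, country in trial["sites"]}:
            counts[country] += 1
    return counts


worldwide = count_worldwide(trials)
by_region = count_by_region(trials)
by_country = count_by_country(trials)
```

In this toy data set there are three trials worldwide, but the regional counts sum to four and the country counts sum to five, because the multicountry trial T1 contributes to two regions and three countries.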
The data can be visualized separately for interventional or observational trials. With respect to disease focus, trials on neglected tropical diseases or R&D Blueprint pathogens can also be viewed separately (select desired options using the circular buttons at the top left or top right of the data visualization).
Points to note:
- The United States of America had the highest total number of trials registered during 1999-2019 (134,516), followed by China (46,149) and Japan (45,856) (chart E).
- Of interventional trials (tick the corresponding box on the top left to filter), 58% (283,308) have no information on the phase of development. Of those with an identified phase of development, the largest number of trials is in phase II (70,309) (chart B).
- ICTRP and its underlying sources do not include a field describing the disease or condition investigated in each trial. Data mining techniques were, therefore, used to assign a primary disease to each trial (see approach under analysis below). It was not possible to match 110,957 trials (21.15%) to any disease. For some trials, it was only possible to classify them into broader categories.
- Of the trials that were categorized into a disease or condition, 84.50% were for noncommunicable diseases, 10.82% for communicable, maternal, perinatal and nutritional conditions, and 4% for injuries (chart D).
- Among the trials on one of the R&D Blueprint pathogens (tick the corresponding box at the top right to filter), the top three diseases were Ebola virus disease with 109 trials, Zika virus disease (46) and severe acute respiratory syndrome (17) (chart D).
To explore the data further
- Tick the box (top left) to filter the results for only interventional or observational trials.
- To filter the results for one or a combination of the following: year, region, country, disease or phase, click on the relevant data element (for example on a region from the key in chart C; a point on the trend line for the year; or a bar beside the disease, phase or country of interest) in the relevant charts.
-- For any of the above selections, information in the other charts will update accordingly, as relevant.
- Hover the cursor on a data element of interest (for example a bar or a point on a trend line) to see more information in a popup window.
- Hold the Ctrl key to select more than one option, for example two regions.
- Undo a selection by clicking ‘undo’ or ‘reset’ near the bottom of the page or by clicking the same element again.
Limitations of the data analysis
- There are several gaps in the ICTRP data source, which required data cleaning to uniformly classify data elements when possible. In some cases no information was available, e.g. on the country where the trial is conducted (7.07%) or the phase of development of clinical trials.
- Automated data mining was used to generate information on the primary disease investigated in each trial using text-based data fields.
-- A list of disease synonyms was compiled using as a base the Unified Medical Language System (UMLS). This was complemented by synonyms drawn from the data, mostly to account for errors in data entry such as spelling errors or use of abbreviations.
-- An automated algorithm was applied to two data fields using the list of disease synonyms to generate the uniform disease classification field used in this analysis. The first field is based on free-text keywords provided by the registrant. The second is the scientific title of the trial. If the first field provided a match the second was not used.
-- Where a field contained more than one match, the match closest to the beginning of the text was selected and taken as the primary disease investigated by the trial. A trial may have more than one disease focus, which this analysis does not capture.
-- The algorithm was refined through several iterations but, as with any automated algorithm, it is likely that some trials were not correctly matched. A full description of the methods and approach is available in this paper: Resource allocation for biomedical research: analysis of investments by major funders.
-- This method resulted in matching around 80% of the trials to a disease or condition. Information on the diseases presented above is therefore not representative of all trials on the ICTRP and must be interpreted with caution.
- The data presented in this visualization use classifications that are not mutually exclusive. For example, a registered trial can recruit participants from multiple countries and regions. In this case, the trial will be counted once per region in chart C but once per country in chart E. The total number of trials across the two charts is therefore not equivalent.
- The analysis will be updated at regular intervals, but time lags relative to the scheduled updates by the data sources are inevitable. Accuracy and completeness of the information are the responsibility of the data source; see the terms and conditions of use.
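The disease-matching steps described in this section can be sketched roughly as follows. This is an illustrative approximation only: the synonym list, the function names and the whole-word, case-insensitive matching rule are assumptions, and the actual algorithm is the one documented in the paper cited above.

```python
import re

# Hypothetical synonym list. The source describes a list based on UMLS,
# complemented by variants (abbreviations, misspellings) found in the data.
SYNONYMS = {
    "ebola virus disease": "Ebola virus disease",
    "ebola": "Ebola virus disease",
    "evd": "Ebola virus disease",
    "type 2 diabetes": "Diabetes mellitus",
    "t2dm": "Diabetes mellitus",
    "hypertension": "Hypertension",
}


def earliest_match(text):
    """Return the disease whose synonym occurs closest to the start of text."""
    best = None  # (start position, disease name)
    for synonym, disease in SYNONYMS.items():
        # Assumed rule: whole-word, case-insensitive match.
        pattern = r"\b" + re.escape(synonym) + r"\b"
        found = re.search(pattern, text, flags=re.IGNORECASE)
        if found and (best is None or found.start() < best[0]):
            best = (found.start(), disease)
    return best[1] if best else None


def classify(keywords, title):
    """Try the registrant's keywords first; use the title only as a fallback."""
    for field in (keywords, title):
        if field:
            disease = earliest_match(field)
            if disease is not None:
                return disease
    return None  # trial stays unmatched


from_keywords = classify("hypertension; type 2 diabetes", "A trial in T2DM")
from_title = classify("", "Safety of an EVD vaccine in adults with hypertension")
unmatched = classify("quality of life", "A survey of sleep habits")
```

Here the first trial is classified from its keywords alone, so the title is never consulted. The second falls back to the title and takes the synonym nearest its start. The third matches nothing and would be counted among the trials not assigned to any disease.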