Predict Rate of Recruitment for a Clinical Trial

Industry: Pharmaceutical and Life Sciences, Healthcare

Specialization or Business Function: Biology, Health and Medicine

Technical Function: Analytics (Machine Learning)

Technology & Tools: Programming Languages and Frameworks (Python)

CLOSED FOR BIDDING

Project Description

Knowledgent is a precision-focused data intelligence firm with consistent, field-proven results across industries. Rather than follow the latest industry hype, we rise above the noise to craft innovative and reliable data and analytics solutions that help organizations use information as a strategic asset.

Problem:

We wish to develop a model capable of predicting the rate of recruitment (number of patients per site per month) for a clinical trial on a country-by-country basis and with a known confidence interval. The data include both free-text elements and structured fields; given the limited time available, we expect structured data to play the larger role in the analysis.
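As a rough illustration of the target quantity, the sketch below computes patients per site per month for each country and for the trial as a whole. The function name, the country figures, and the choice to pool enrollment over total site-months for the trial-level rate are all assumptions made for illustration; they are not taken from the project data.

```python
def recruitment_rate(patients_enrolled, site_months):
    """Return patients per site per month.

    site_months is the number of sites multiplied by the months each
    site was open for enrollment (an assumed way to aggregate exposure).
    """
    if site_months <= 0:
        raise ValueError("site_months must be positive")
    return patients_enrolled / site_months


# Hypothetical figures: country -> (patients enrolled, sites, months open)
country_data = {
    "US": (120, 40, 12),
    "DE": (30, 10, 10),
}

# Country-level rates
country_rates = {
    country: recruitment_rate(patients, sites * months)
    for country, (patients, sites, months) in country_data.items()
}

# Trial-level rate, pooled over all countries (assumed definition)
total_patients = sum(p for p, _, _ in country_data.values())
total_site_months = sum(s * m for _, s, m in country_data.values())
trial_rate = recruitment_rate(total_patients, total_site_months)

print(country_rates)
print(round(trial_rate, 4))
```

A pooled trial-level rate weights each country by its site-months; averaging the country rates directly would be a different (also plausible) definition.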

Expertise Required: 

We need someone with data science expertise, preferably with some experience in the clinical trial space. Expertise in NLP may be beneficial, as much of the available data that differentiates one clinical trial from another is in the form of free-text inclusion/exclusion criteria.

Data sources:

Internal trial monitoring data from one company, public clinical trial data (e.g., clinicaltrials.gov), and incidence/prevalence data by country (public and third-party licensed data). The data are available in Hive tables in an AWS environment, as well as in the original formats if needed (these vary by data source, but include XML and xls files). This is known not to be a comprehensive list of the data that would influence the rate of recruitment. While any publicly available data may be incorporated to improve predictions, we expect that a basic model can be produced using only the data provided.

We cannot provide a sample of the licensed data, but much of the relevant data is publicly available. Data from clinicaltrials.gov can be accessed most easily from http://aact.ctti-clinicaltrials.org/

Technology stack:

Data are available in Hive on AWS. A Python server runs on an AWS EC2 instance with Jupyter notebook. Note that any solution must be provided in Python.

Deliverable:

An algorithm capable of predicting the rate of recruitment (number of patients per site per month) for a clinical trial within a fixed ±0.05 range, with 80% of predictions falling within that target range. Typical recruitment rates can be expected to be in the 0.1 to 1.0 range. Predictions must be made for each country for which incidence/prevalence data is provided (approximately 7 countries), as well as for the trial as a whole, although the success criteria stated above apply only to the trial-level prediction.
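One way to check trial-level predictions against this success criterion is sketched below. It assumes that "within the target range" means the predicted rate is no more than 0.05 away from the observed rate; the function names and the sample values are hypothetical.

```python
def fraction_within_tolerance(predicted, actual, tolerance=0.05):
    """Return the share of predictions within +/- tolerance of the observed rate."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one prediction is required")
    # Small epsilon so that differences sitting exactly on the boundary
    # are not rejected because of floating-point rounding.
    eps = 1e-12
    hits = sum(
        1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance + eps
    )
    return hits / len(predicted)


def meets_success_criterion(predicted, actual, tolerance=0.05, required=0.80):
    """True if at least `required` of the predictions fall within tolerance."""
    return fraction_within_tolerance(predicted, actual, tolerance) >= required


# Hypothetical trial-level rates (patients per site per month)
predicted_rates = [0.20, 0.50, 0.90, 0.35, 0.60]
observed_rates = [0.22, 0.48, 0.80, 0.36, 0.63]

share = fraction_within_tolerance(predicted_rates, observed_rates)
print(share, meets_success_criterion(predicted_rates, observed_rates))
```

Because typical rates lie between 0.1 and 1.0, a fixed ±0.05 band is a much tighter relative target for slow-recruiting trials than for fast ones, which may be worth checking separately across the rate range.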

Location Preference:

We have some preference for people located close to our Warren, NJ office, although this is not a strict requirement.

Project Overview

  • Posted: October 17, 2017
  • Planned Start: October 23, 2017
  • Delivery Date: November 13, 2017
  • Preferred Location: United States
