Ongoing Culture Changes in the Safety Landscape

John Tukey – the Father of Data Science – recognized a problem brewing back when the most powerful tests for efficacy analyses were first being developed, soon after randomization began to be leveraged in clinical trials.

[John] Tukey hoped to restore a balance between open-ended discovery of potentially important context-dependent information and mathematical techniques for sorting the signal from the noise. The ultimate decisions, he believed, were mainly in the province of those with substantive knowledge. The role of data analysis was to provide quantitative evidence and to assist, but not to replace, the substantive expert to sift through and interpret the evidence. By focusing exclusively on [decision rules], the researcher was effectively cutting the clinician out of the ‘judicial’ process of evaluation. (Weisberg, 2014)

This evocative idea is encapsulated in John Tukey's famous remark (1962): "Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise."

Forty years down the pike, we are still struggling with the same issues, as Bob O'Neill (2002) observed: "Statistical methodology has not been developed for safety monitoring to match that for efficacy monitoring." He was continuing a call to arms he had initiated in a provocative paper when ICH was still getting its legs under it (O'Neill, 1995).

Complex challenges exist in evaluating the relationship between study drug and the occurrence of adverse events (AEs): accounting for duration of exposure, patient-level covariates, and other clinical considerations. But these challenges also create opportunities for expanded interest and participation by clinical safety professionals and safety data scientists working closely together, and for sponsors to partner with regulatory authorities in developing interdisciplinary safety evaluation procedures.
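As a minimal illustration of the exposure-time issue mentioned above, consider the exposure-adjusted incidence rate (EAIR), a standard summary that divides the number of subjects with an event by total patient-years of exposure rather than by subject count. The sketch below is hypothetical: the function name and the arm-level figures are invented for illustration, not drawn from any actual study.

```python
# Sketch: exposure-adjusted incidence rate (EAIR), one common way to
# account for unequal durations of exposure across treatment arms.
# All numbers below are hypothetical.

def eair(n_subjects_with_event: int, total_exposure_years: float) -> float:
    """Subjects with at least one event, per 100 patient-years of exposure."""
    return 100.0 * n_subjects_with_event / total_exposure_years

# Hypothetical arm-level summaries
treated = eair(n_subjects_with_event=12, total_exposure_years=480.0)
placebo = eair(n_subjects_with_event=5, total_exposure_years=450.0)

print(f"treated EAIR: {treated:.2f} per 100 patient-years")  # 2.50
print(f"placebo EAIR: {placebo:.2f} per 100 patient-years")  # 1.11
```

A crude-proportion comparison (12/200 vs. 5/200 subjects, say) could mislead if one arm was followed much longer than the other; normalizing by exposure time is the simplest of the adjustments the paragraph above alludes to, before patient-level covariates are brought in.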

More research is needed to improve consistency in the identification and evaluation of adverse events across a development program, not only for comprehensive analysis at the end of studies but also for ongoing aggregate safety evaluation during them, and to develop methodologies for distinguishing adverse drug reactions (ADRs) from background adverse events.