Thwarting Subjectivity in Clinical Trials

19 November 2019

Dr. Nathaniel Katz / FiercePharma

No one would proceed with a clinical trial if a key measurement tool (a thermometer or fMRI, for example) were improperly calibrated. So why don’t we hold our participants, who are arguably the most critical component of any trial, to a similar standard of calibration?

A clinical trial is itself a measurement system, or at least it should be. Each component must be calibrated to precisely measure treatment effect. Too often, however, scientists fail to calibrate the most significant of their measurement instruments: trial participants. Optimizing the performance of each study participant is essential in preventing or mitigating data-quality risks.

Let’s take a step back and consider what’s meant by a “measurement system.” It’s a “set of one or more measuring instruments and often other devices assembled to give information used to generate measured quantity values,” according to the Joint Committee for Guides in Metrology.

So, if a clinical trial is a measurement system, the measured quantity values would be the patient's pain score, for example, or their blood pressure, depression score, cholesterol level, etc. Some values are objective, while others are subjective. But all have to be calibrated in order for the clinical trial — as a whole — to generate a clear estimate of the treatment effect.

You wouldn’t dream of using uncalibrated scientific instruments in proper clinical research. Yet we allow trial participants, both subjects and staff, to go uncalibrated without giving it a second thought. Here’s why that’s problematic: patients and raters both vary in their ability to report symptoms. For example, many patients find it difficult to accurately report their own pain, while many raters find it difficult to rate a patient’s depression. This high degree of variability poses a significant risk to data quality, particularly in clinical trials with subjective endpoints.

Grappling with subjectivity

Everyone understands that, especially in CNS trials, outcomes are based largely on patients’ subjective reports. What’s not widely understood is that some patients are better than others at accurately reporting their symptoms. Take pain: our research, published in the Journal of the International Association for the Study of Pain, found that 20% to 30% of subjects enrolled in analgesic trials are unable to accurately report their pain.

Fortunately, we can quantify how accurately somebody reports their symptoms. One tool is the Focused Analgesia Selection Test, which involves exposing subjects to multiple painful stimuli of different intensities to quantify pain-reporting reliability and accuracy.[i] As a recent study published in Pain concludes, “patients whose attention is externally directed do not perceive or report bodily sensations accurately, are more vulnerable to external cues (such as placebos), and should either be excluded from clinical trials or trained to accurately report what is being measured in the study.”[ii]
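To make the idea of quantifying reporting accuracy concrete, here is a minimal sketch in Python. It is not the published scoring procedure for the Focused Analgesia Selection Test. It simply assumes that each subject rates a series of stimuli of known intensity and that the squared correlation between stimulus intensity and reported pain is a reasonable stand-in for accuracy. The function name and the example numbers are hypothetical.

```python
from statistics import mean


def reporting_r_squared(stimulus, reported):
    """Squared Pearson correlation between stimulus intensity and reported pain.

    Illustrative only: values near 1 mean the reports track the stimuli
    closely; values near 0 mean the reports are largely unrelated to them.
    """
    if len(stimulus) != len(reported) or len(stimulus) < 3:
        raise ValueError("need at least three paired observations")

    mean_x, mean_y = mean(stimulus), mean(reported)
    sxx = sum((x - mean_x) ** 2 for x in stimulus)
    syy = sum((y - mean_y) ** 2 for y in reported)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(stimulus, reported))

    # A subject who gives the same rating to every stimulus shows no
    # discrimination at all, so treat that case as zero accuracy.
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)


# Hypothetical example: four stimulus levels and one subject's 0-10 ratings.
stimulus_levels = [1, 2, 3, 4]
consistent_reporter = [2, 4, 6, 8]
flat_reporter = [5, 5, 5, 5]

print(reporting_r_squared(stimulus_levels, consistent_reporter))
print(reporting_r_squared(stimulus_levels, flat_reporter))
```

A real screening procedure would also need repeated presentations of each intensity and a pre-specified cut-off, neither of which is defined here.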

If a patient cannot report symptoms accurately, it may be tempting to remove that subject from the trial. But the ability to accurately report pain is a skill, and we’ve found that many patients can improve their accuracy with training and practice (see sidebar: “What do we mean by training?”).

Placebo response, too

Along with the inability of many patients to accurately report symptoms, placebo response may constitute the biggest challenge facing CNS trials. And the two are closely related.

Patients who are externally focused in constructing their symptom reports have both higher variability (because their symptom reports are vulnerable to external influences), and higher placebo responses (because they pay excessive attention to external cues in constructing their symptom reports).

Several studies have demonstrated relationships between high pain-score variability and a high placebo response — and these relationships are not unique to pain.[iii],[iv],[v],[vi] You can identify placebo-prone patients by looking for high variability in reported symptom intensity.
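As a rough illustration of that screening step, the Python sketch below computes each subject’s standard deviation across baseline diary scores and flags those above a cut-off. The data layout, the function name and the 1.5-point threshold are all assumptions made for the example. None of them comes from the studies cited above, and any real cut-off would have to be pre-specified in the protocol.

```python
from statistics import stdev


def flag_high_variability(diaries, sd_threshold=1.5):
    """Return {subject_id: sd} for subjects whose baseline SD exceeds the threshold.

    `diaries` maps a subject ID to that subject's list of daily baseline
    scores (for example, seven days of 0-10 pain ratings). The threshold
    is a placeholder value, not a validated cut-off.
    """
    flagged = {}
    for subject_id, scores in diaries.items():
        if len(scores) < 2:
            continue  # a single score carries no variability information
        sd = stdev(scores)  # sample standard deviation within the subject
        if sd > sd_threshold:
            flagged[subject_id] = sd
    return flagged


# Hypothetical seven-day baseline diaries for two subjects.
baseline = {
    "S01": [5, 5, 6, 5, 5, 6, 5],  # stable reporter
    "S02": [2, 8, 3, 9, 1, 7, 4],  # highly variable reporter
}

for subject_id, sd in flag_high_variability(baseline).items():
    print(f"{subject_id}: baseline SD = {sd:.2f}")
```

Flagging a subject this way identifies a candidate for training or closer review. On its own it does not show that the subject will respond to placebo.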

So — as surprising as it may sound — placebo response is another example of poor calibration. One that can be both predicted and reduced.

In a measurement system, it all comes down to calibrating — or rather, training — your instruments.

Calibrate your patients

It’s a simple proposition: train your patients and get better data. Training patients to assess and report their symptoms more accurately diminishes placebo response, reduces variation and improves data accuracy.[vii] Fail to train them, and your trial is at risk.

Accurate-pain-reporting training can help patients better assess their own pain. Research shows that those who received such training had lower variability in pain scores and improved discrimination between active drug and placebo compared to those who did not undergo training.[viii]

Although it may sound obvious, training should familiarize patients with the pain scales that will be used during the trial. We’ve discovered that a surprising number of clinical trial patients haven’t been taught how to use a pain scale correctly.

Other issues come into play as well. Patients often fail to appreciate their role in a study, not always distinguishing between clinical care and research. Study procedures often resemble therapies, especially in CNS trials, so it’s not always easy for patients to tell the difference. Lack of understanding leads to unrealistic expectations of improvement. Many patients fail to accurately report outcomes, while others may focus on the wrong phenomenon. For instance, in a depression trial, the patient may focus on either pain or anxiety relief, or fail to report symptom intensity.

By getting patients to be more introspective about what’s going on inside their own bodies and training them to report their pain more accurately, we inoculate them against the external cues that drive the placebo response.

This has implications for other CNS trials. As researchers noted in a 2018 PLoS One paper:

“The use of training approaches in future analgesic and potentially other neurological and psychiatric clinical trials has the potential to improve assay sensitivity, reduce sample size requirements, increase the likelihood of trial success, and accelerate the development of new treatment options for those who suffer.”[ix]

Never forget how much rests on a subjective response. If it’s a patient completing the assessment by filling out a patient-reported outcome measure, you’re dependent upon how reliably that patient performs the assessment. If it’s an investigator doing an assessment, such as the Ham-D, you’re entirely dependent upon his or her level of skill.

Calibrate your staff

The goal of any trial is to neutralize patient expectations. But where do those expectations come from? Often, from the staff. Maybe they come from the written and verbal information patients receive. Perhaps the rater or other members of the team are warm and empathetic, rather than detached and objective.

Everyone knows the value of rater training, but raters aren’t the only ones who need to be “calibrated.” Anyone with patient contact — face-to-face or virtual — can affect subject responses.

Well-trained research staff avoid infecting patients with their own biases, and they help patients neutralize their own expectations. For instance, the ways in which staff communicate verbally and non-verbally with patients can create or control expectations. Even a reassuring smile and pat on the back can create expectations (high expectations of personal benefit) that should be avoided. We don’t expect the site teams to be unkind, but they must remain neutral.

Right now, most trial staff aren’t appropriately calibrated. Staff training around placebo response and symptom reporting must become a standard part of pre-trial activities.

Pharma companies and their CRO partners invest heavily in internal training, but the concept of training clinical trial participants has yet to gain traction in the industry. To date, training has been regarded as a “check-the-box” activity — something done to please regulators — rather than an important scientific occupation.

Training is so much more than a regulatory “checkbox.” In fact, training patients to report symptoms accurately is a matter of ethical (as well as practical) consideration. If you don’t, you are experimenting on a patient who cannot generate scientifically useful data. And it’s time to take that seriously.

Calibrated tools, more precise measurements

Precision medicine demands precision measurement, and as was mentioned at the outset of this article, a clinical trial is fundamentally a measurement system. But too often, especially in CNS, that measurement system is imprecise. High placebo response and the inability of patients to accurately report symptoms can be devastating to measurement reliability, which can undermine the entire trial.

To truly demonstrate a drug’s efficacy, measurement instruments must give us precise, reliable information.

It’s time to calibrate all of our instruments — including each trial participant, from the patient to the Principal Investigator. Until we approach the calibration of our participants with the same rigor as we approach that of traditional measurement tools, we’ll only see more trials fail — not because the therapy is ineffective, but because the trial is.

Sidebar: What do we mean by training?

Training is an activity designed to optimize real-world performance, generally incorporating education, practice and feedback.

It’s vital to understand that education is but one aspect of training. Education is about what you know; training is about what you do. Giving someone a manual to read helps to educate them. Showing them how to perform a task described in the manual — and allowing them to practice this newly acquired skill — is how we train them. Education doesn’t necessarily change behavior, but training does.

Until we take training seriously in the world of clinical trials, we’ll never adequately address the challenge of subjectivity.


[i] Treister R, Eaton TA, et al. “Development and preliminary validation of the focused analgesia selection test to identify accurate pain reporters.” J Pain Res. 2017;10:319–326. Published 2017 Feb 9. doi:10.2147/JPR.S121455

[ii] Treister R, Honigman L, et al. “A deeper look at pain variability and its relationship with the placebo response: results from a randomized, double-blind, placebo-controlled clinical trial of naproxen in osteoarthritis of the knee” [published online February 25, 2019]. Pain. doi:10.1097/j.pain.0000000000001538

[iii] Harris RE, Williams DA, et al. “Characterization and consequences of pain variability in individuals with fibromyalgia.” Arthritis Rheum. 2005;52:3670–4.

[iv] Farrar JT, Katz NP, et al. “Effect of variability in the 7-day baseline pain diary on the assay sensitivity of neuropathic pain randomized clinical trials: an ACTTION study.” Pain. 2014;155:1622–31.

[v] Zilcha-Mano S, Barber JP. “Instability of depression severity at intake as a moderator of outcome in the treatment for major depressive disorder.” Psychother Psychosom. 2014;83:382–3. pmid:25323818

[vi] Treister R, et al. “Accurate pain reporting training diminishes the placebo response: Results from a randomised, double-blind, crossover trial.” PLoS One. 2018;13(5):e0197844. Published 2018 May 24. doi:10.1371/journal.pone.0197844

[vii] Treister R, et al. “Training subjects to report their pain more accurately improves study power: Results of a randomized placebo-controlled study of pregabalin vs placebo in PDN.” IASP, Japan, 2016.

[viii] Treister R, IASP, Japan, 2016 op. cit.

[ix] Treister R, PLoS ONE 2018, op. cit.



