
How Not to Do Science

Harriet Hall

According to research methodologist R. Barker Bausell, “CAM [complementary and alternative medicine] therapists simply do not value (and most, in my experience, do not understand) the scientific process.” They have seen their patients improve, and that’s all the “evidence” they think they need. They don’t understand that they may have been deceived by the post hoc ergo propter hoc logical fallacy. The patient may have improved despite their care rather than because of it. They don’t understand that the only reliable way to know if a therapy is effective is to do a properly designed scientific trial with a credible control group. They don’t value science, but they know that most of the rest of us do, so they want to do science to convince us that they are right. But they don’t understand how to do science. All too often, their attempts end in laughable fiascos.

A prime example was recently published in the journal Complementary Therapies in Medicine. The title was “Pediatric perioperative measures of sleep, pain, anxiety and anesthesia emergence: A healing touch proof of concept randomized clinical trial.” The full text is available online. It can serve as a lesson in how not to do science.

The Study

“The purpose of this study was to determine the impact of healing touch (HT) on sleep, anxiety, anesthesia emergence and pain.” The subjects were 5–21 years of age, scheduled for an elective operation at a pediatric burns hospital. A total of forty-one subjects were randomized into four groups:

  • Healing touch (HT) (ten subjects). An elaborate sixty-minute ritual by experienced HT practitioners, a mixture of hands-on and hands-hovering-above exercises.
  • HT sham (twelve subjects). Aides with no HT experience or knowledge who had observed HT practitioners and mimicked their arm movements, but with no conscious intent to heal.
  • Control/presence (CP) (eight subjects). Aides unfamiliar with HT simply sat with the patient for an hour with no interaction.
  • Control/no presence (CNP) (eleven subjects). No intervention. Patients were left alone in the room.

Data was collected from polysomnography (PSG), subjective anxiety and pain scales, lab tests that the researchers thought would measure stress and anxiety, and a satisfaction survey.

Conclusion: “Although no tracked parameters showed statistically significant findings, anecdotal HT benefits included enhanced relaxation and sounder sleep.”

Translation: This study showed HT didn’t work, but we will disregard the findings of our study and will continue to believe it works based on anecdotal evidence rather than on science.

A Lesson on How Not to Do Science

  • Test something imaginary. I call this Tooth Fairy Science. You can do a scientific study on the Tooth Fairy—for instance, measuring how much money she leaves to children of rich families vs. poor families, how much she pays for the first tooth compared to the last tooth, or whether she leaves more money for a tooth in a baggie than for a tooth wrapped in Kleenex. You can run statistical analyses and imitate all the trappings of science. You can get results that are reproducible and statistically significant, and you can convince yourself that you have learned valuable truths about the Tooth Fairy’s behavior. But you haven’t, because the Tooth Fairy doesn’t exist. You have actually only been studying parental behavior and popular customs.

This study was on healing touch, a type of energy medicine. The authors explain that energy medicine is “based on the concept that there is a universal human energy subject to imbalances and that a therapist can re-pattern the disrupted field by imparting compassionate healing energy into a person to improve health.”

They deliberately fail to acknowledge that there is no evidence that any such human energy field exists and that the consensus of reputable scientists is that energy medicine is pseudoscientific.

  • Use meaningless unscientific language without comment. The experienced HT practitioners used “self-centering exercises,” “magnetic clearing to clear the patient’s energy field,” “mind clearing,” and hand positions on or above various parts of the body including the “root chakra.” An atmosphere was created so “the full body could be connected (chakra).”
  • Don’t bother to proofread your manuscript for consistency or mathematical correctness. A table shows that 95 patients were assessed for eligibility, with 44 excluded and 41 randomized. (44 + 41 = 85, not 95). Two subjects were dropped because of restlessness or inability to complete PSG, so only 39 patients remained. The report inconsistently states that either 39 or 41 patients were randomized. The treatments reportedly lasted 60 minutes but the maneuvers used were described as lasting 10, 10, and 20 minutes, which adds up to only 40 minutes.
  • Use surrogate measures for your endpoints. For a study on interventions to prevent heart attacks, instead of following patients to see how many have heart attacks, just test for cholesterol levels, BP, or other measures known to be correlated with heart attacks. In this study, they measured stress and anxiety using the unvalidated surrogates CRP, vitamin D, and glucose levels; their cited evidence was a single study for each that only reported a correlation.
  • Confuse correlation with causation. If two factors A and B are correlated, that could be because A causes B, or because B causes A, or because both A and B are caused by a third factor C, or simply because of a meaningless coincidence like the correlation between the number of pirates and the diagnoses of autism.
  • Put a positive spin on negative results. The study’s conclusion states “Although no tracked parameters showed statistically significant findings, anecdotal HT benefits included enhanced relaxation and sounder sleep.” The discussion of lab results says, “while none of these differences was statistically significant, they are clinically meaningful.” They can’t be clinically meaningful if the results were negative. And no reputable scientist would suggest ignoring study results in favor of anecdotal reports. I even ran across one study that concluded, “This treatment works but this study was unable to demonstrate that fact.”
  • Use an inadequate sample size. According to Bausell’s checklist, a credible study should have at least fifty subjects in each group. These researchers knew they would need 200 patients to reliably detect differences, but when the funding period ran out, they had only enrolled forty-one patients. So they knew they wouldn’t get reliable results, but they persevered. (The sketch after this list gives a sense of just how little a sample that small can detect.)
  • Ignore red flags. These researchers were surprised at their inability to recruit enough patients. Didn’t they wonder why most potential subjects refused to enroll? They didn’t see this as a sign of bias that might compromise their results. Those who agreed to participate might have viewed HT more favorably and their subjective reports might be skewed.
  • Refer to evidence from disreputable studies. The reference to HT’s efficacy for wound healing is based on a study by Daniel Wirth in the journal Subtle Energies. Wirth was imprisoned for fraud and several of his studies have been retracted. Other studies have failed to confirm Wirth’s findings.
  • Don’t bother trying to get your study published in a reputable, high-quality, prestigious, science-based, peer-reviewed journal. They don’t accept poor-quality studies. But there are plenty of CAM journals whose editors and peer reviewers are less rigorous, and if all else fails, there are pay-to-publish journals.
  • Accept imperfect controls. You can’t blind the HT providers, and you can’t control for intention. There’s no way to know what is in the provider’s or the sham provider’s mind, and there’s no evidence that intention can make a difference.
  • Always call for more research. If results are consistently negative, you can argue that the next study might be positive if only you could do a larger study or make slight alterations in the protocol. The research could conceivably go on forever with one small variation after another.
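
How little can a trial of forty-one subjects split four ways actually detect? Here is a minimal simulation sketch in Python; the 0.4-standard-deviation effect size and the simple two-group comparison are illustrative assumptions, not figures from the study.

```python
# A minimal power sketch (illustrative only): how often would a two-sided
# t-test detect a real effect of 0.4 standard deviations -- an assumed
# "modest" effect, not a figure from the study -- at various group sizes?
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)

def estimated_power(n_per_group, effect_size=0.4, trials=5_000, alpha=0.05):
    """Fraction of simulated trials in which the t-test reaches p < alpha."""
    hits = 0
    for _ in range(trials):
        control = rng.normal(0.0, 1.0, n_per_group)
        treated = rng.normal(effect_size, 1.0, n_per_group)
        _, p_value = ttest_ind(control, treated)
        if p_value < alpha:
            hits += 1
    return hits / trials

for n in (10, 50, 100):
    print(f"{n:3d} subjects per group -> power ~ {estimated_power(n):.2f}")

# Typical output: ~0.13 power at 10 per group, ~0.5 at 50, ~0.8 at 100.
# At this study's group sizes, a real but modest effect would be missed
# almost nine times out of ten.
```

Even fifty subjects per group catches an effect of that size only about half the time, which helps explain both Bausell’s fifty-per-group minimum and the researchers’ own estimate that they needed 200 patients.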

Other Ways Not to Do Good Science

This study provides a lot of lessons in how not to do science, but the list is far from exhaustive. Here are just a few of the many others I have seen in other studies:

  • Don’t use a control group at all. If you test anything, you’re likely to get positive results due to a placebo response.
  • Use the A vs. A+B format. Adding anything to usual care A or to a mainstream treatment A is guaranteed to make B look better than it really is.
  • Don’t do an exit poll to determine if patients could guess which group they were in. If they can guess better than chance, either the blinding procedures or the placebo controls were faulty. If the results are positive, don’t ask; you don’t want to know if the study is flawed.
  • Only submit positive studies for publication. Leave studies with negative results in the file drawer.
  • Don’t report your own bias. Is the treatment you’re studying part of your practice? Would giving it up reduce your income?
  • Do some data-mining to see if you can torture subgroup data into confessing a positive result.
  • Don’t correct for multiple endpoints. If you have enough endpoints, some are likely to come out positive just by chance (false positives), and you can claim success. (See the sketch after this list.)
  • Do animal studies or test-tube studies. They may be easier and more likely to produce positive results, although those results may not be relevant to humans.
  • Report statistical significance and imply that it proves clinical significance even when it doesn’t.
  • Use the wrong statistical tests if that’s the only way to make it look like your results are positive.
  • Do pragmatic studies of real-world everyday performance. Such studies were never meant to establish whether a treatment works better than placebo, and they may falsely make an unproven treatment with strong placebo effects look better than an evidence-based treatment.
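
To put a number on the multiple-endpoints problem flagged above: if every endpoint is pure noise, the chance that at least one of them comes out “significant” grows quickly with the number of endpoints. The sketch below (plain Python, assuming independent endpoints for arithmetic simplicity) shows the effect, along with the simple Bonferroni correction that careful researchers apply.

```python
# If every endpoint is pure noise, how often does at least one come out
# "significant" at p < 0.05? (Independent endpoints assumed for simplicity.)
alpha = 0.05

for k in (1, 5, 10, 20):
    p_any_false_positive = 1 - (1 - alpha) ** k
    bonferroni_alpha = alpha / k  # corrected per-endpoint threshold
    print(f"{k:2d} endpoints: P(at least one false positive) = "
          f"{p_any_false_positive:.2f}; Bonferroni threshold = {bonferroni_alpha:.4f}")

# With 20 independent null endpoints there is roughly a 64 percent chance
# that something comes out "significant" purely by luck.
```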

That’s only a start. Despite their best efforts, even the best scientists can be misled by things like unrecognized contaminants or by conscious or unconscious manipulation of data by technicians. And someone who starts out doing good science can later lapse into deliberate fraud. We rely on replication and confirmation and don’t trust single studies, because there are so many things that can go wrong in scientific studies even when you know how to do science. And there are many, many more things that can go wrong if you don’t know how not to do science. You can find examples of how to do good science in prestigious mainstream publications such as The New England Journal of Medicine. You can find examples of how not to do science in any CAM journal. They’re not good for understanding reality, but they’re good for a laugh.

Harriet Hall

Harriet Hall, MD, a retired Air Force physician and flight surgeon, writes and educates about pseudoscientific and so-called alternative medicine. She is a contributing editor and frequent contributor to the Skeptical Inquirer and contributes to the blog Science-Based Medicine. She is author of Women Aren’t Supposed to Fly: Memoirs of a Female Flight Surgeon and coauthor of the 2012 textbook Consumer Health: A Guide to Intelligent Decisions.