Readers may be unaware that we have published numerous articles over the last decade suggesting that, to a large degree, equipment error might be responsible for many unstable findings across the field (Plant et al. onwards).
Stéphane Doyen, a cognitive psychologist at the Free University of Brussels, encountered similar issues when he and his colleagues failed to replicate a classic experiment by John Bargh of Yale University in New Haven, Connecticut, showing that people walk more slowly if they have been unconsciously primed with age-related words.
Wagenmakers argues that replication attempts should also be published under different rules. Like clinical trials in medicine, he says, they should be pre-registered to avoid the post-hoc data-torturing practices that Simmons describes, and published irrespective of outcome. Engaging or even collaborating with the original authors early on could pre-empt any later quibbling over methods.
Driven by these controversies, many psychologists are now searching for ways to encourage replications. "I think psychology has taken the lead in addressing this challenge," says Jonathan Schooler, a cognitive psychologist at the University of California, Santa Barbara. In January, Hal Pashler, a psychologist from the University of California, San Diego, in La Jolla, and his colleagues created a website called PsychFileDrawer where psychologists can submit unpublished replication attempts, whether successful or not. The site has been warmly received but has only nine entries so far. There are few incentives to submit: any submission opens up scientists to criticism from colleagues and does little to help their publication record.
Such conceptual replications are useful for psychology, which often deals with abstract concepts. "The usual way of thinking would be that [a conceptual replication] is even stronger than an exact replication. It gives better evidence for the generalizability of the effect," says Eliot Smith, a psychologist at Indiana University in Bloomington and an editor of JPSP.
Bem published his findings in the Journal of Personality and Social Psychology (JPSP) along with eight other experiments
These problems occur throughout the sciences, but psychology has a number of deeply entrenched cultural norms that exacerbate them. It has become common practice, for example, to tweak experimental designs in ways that practically guarantee positive results. And once positive results are published, few researchers replicate the experiment exactly, instead carrying out conceptual replications that test similar hypotheses using different methods. This practice, say critics, builds a house of cards on potentially shaky foundations.
Positive results in psychology can behave like rumours: easy to release but hard to dispel. They dominate most journals, which strive to present new, exciting research. Meanwhile, attempts to replicate those studies, especially when the findings are negative, go unpublished, languishing in personal file drawers or circulating in conversations around the water cooler. "There are some experiments that everyone knows don't replicate, but this knowledge doesn't get into the literature," says Wagenmakers. The publication barrier can be chilling, he adds. "I've seen students spending their entire PhD period trying to replicate a phenomenon, failing, and quitting academia because they had nothing to show for their time."
The failed replication drew an irate blog post from Bargh, who described Doyen's team as "inexpert researchers" and later took issue with the writer of this story for a blog post about the exchange. Bargh says that he responded so strongly partly because he saw growing scepticism of the idea that unconscious thought processes are important, and felt that damage was being done to the field.
"I've done everything possible to encourage replications," says Bem, who stands by his results and has put details of all his methods and tests online. But he adds that one replication is uninformative on its own. "It's premature," he says. "It can take years to figure out what can make a replication fail or succeed. You need a meta-analysis of many experiments."
These problems have been brought into sharp focus by some high-profile fraud cases, which many believe were able to flourish undetected because of the challenges of replication. Now psychologists are trying to fix their field. Initiatives are afoot to assess the scale of the problem and to give replication attempts a chance to be aired. "In the past six months, there are many more people talking and caring about this," says Joseph Simmons, an experimental psychologist at the University of Pennsylvania in Philadelphia. "I'm hoping it's reaching a tipping point."
Lance Nizami BSc (Physics) MSc (Physiology) PhD (Psychology)
posted on behalf of Lance Nizami:
All this puts the burden of proof on those who try to replicate studies, but they face a tough slog. Consider the aftermath of Bem's notorious paper. When the three groups who failed to reproduce the word-recall results combined and submitted their results for publication, the JPSP, Science and Psychological Science all said that they do not publish straight replications. The British Journal of Psychology sent the paper out for peer review, but rejected it; Bem was one of the peer reviewers on the paper. The beleaguered paper eventually found a home at PLoS ONE
Independent Research Scholar
Simmons designed the experiments to show how unacceptably easy it can be to find statistically significant results to support a hypothesis. Many psychologists make on-the-fly decisions about key aspects of their studies, including how many volunteers to recruit, which variables to measure and how to analyse the results. These choices could be innocently made, but they give researchers the freedom to torture experiments and data until they produce positive results.
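One of those on-the-fly decisions, choosing when to stop recruiting, can be made concrete with a small simulation (an illustrative sketch, not taken from Simmons's paper; all parameter values here are assumptions). If an experimenter checks significance after every batch of volunteers and stops as soon as p < 0.05, the false-positive rate climbs well above the nominal 5% even when no effect exists at all:

```python
import math
import random
import statistics

def t_test_p(a, b):
    """Two-sample p-value via a normal approximation to Welch's t
    (adequate for this illustration at moderate sample sizes)."""
    se = math.sqrt(statistics.variance(a) / len(a) +
                   statistics.variance(b) / len(b))
    z = abs(statistics.mean(a) - statistics.mean(b)) / se
    # two-sided p from the normal CDF
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

def experiment(peek_every=10, max_n=100, rng=None):
    """Simulate a NULL effect, but peek after every batch of volunteers
    and stop as soon as p < 0.05 -- the 'optional stopping' practice."""
    rng = rng or random
    a, b = [], []
    while len(a) < max_n:
        a.extend(rng.gauss(0, 1) for _ in range(peek_every))
        b.extend(rng.gauss(0, 1) for _ in range(peek_every))
        if len(a) >= 2 * peek_every and t_test_p(a, b) < 0.05:
            return True          # "significant" -- stop and write it up
    return False                 # gave up at max_n

random.seed(1)
trials = 2000
false_pos = sum(experiment() for _ in range(trials)) / trials
print(f"false-positive rate with peeking: {false_pos:.1%}")
```

Because the data get multiple chances to cross the threshold, the printed rate comes out well above the 5% that a single pre-planned test would give, which is exactly the freedom the critics object to.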
Simmons should know. He recently published a tongue-in-cheek paper in Psychological Science showing that listening to the song When I'm Sixty-Four by the Beatles can actually reduce a listener's age by . years
This could avoid many problems, from the file-drawer bias to stopping data collection as soon as the p value crosses the finishing line. It would also avoid the temptation to oversell findings (even real ones!). Of course, it would not work for all interesting/important results. "Wagenmakers argues that replication attempts should also be published under different rules. Like clinical trials in medicine, he says, they should be pre-registered to avoid the post-hoc data-torturing practices that Simmons describes, and published irrespective of outcome."
Palo Alto, California
We will be presenting at SCiP and Psychonomics in Minneapolis in the next few days if readers would like to talk to us about the issues in person.
In Bad Copy (Nature, May, pp. -), Ed Yong notes a lack of interest in (indeed, perhaps a fear of) replication within scientific research, particularly among those who publish in psychiatry and psychology, who as peer reviewers seem especially disinclined to publish work that disagrees with tested hypotheses. Yong describes the circumstances of this behavior in fascinating detail. However, I fear that his description might lead some readers to believe that replication is avoided in the behavioral sciences either because it is less effective, or because it is inherently more difficult. Neither is true. I myself have degrees in physics, physiology, and psychology, giving me a unique perspective, and I can assure the readers of Nature that the different degrees of emphasis on replication across the scientific disciplines (as noted by Yong) do not reflect the differing natures of the objects of study. Rather, they reflect the different personalities involved in the studying. That is, the physical sciences, which Yong notes as being least hostile to replication, attract long-suffering, hard-nosed individuals who expect proof to the umpteenth decimal place, and are willing to allow for differences in measurement. The physiological sciences, in contrast, tend to attract those with greater tolerance for ambiguity, and greater need for short-term reward, whether manifested as the gratefulness of patients, higher social standing, or a higher paycheck, all especially for medical doctors. Finally, there are the psychiatrists and psychologists, whose interest in human behavior inevitably proves to originate from serious emotional problems of their own, for which they compensate through the conceited belief that they and their colleagues are free of impure thoughts and incapable of mistakes.
Such conceit also allows overestimation of the peer review system, such that the first to report a phenomenon is deemed correct, with any subsequent contrary reports deemed intolerable and assumed to be acts of hostility or incompetence. Of course, the latter attitude requires the admission that some of the time at least, some psychiatrists and psychologists are less than perfect, an admission that creates a cognitive dissonance which actually reinforces the problem. The moral of the story is that human behavior is also found among scientists, not just among those non-scientists whom they study.
In a survey of more than , psychologists, Leslie John, a consumer psychologist from Harvard Business School in Boston, Massachusetts, showed that more than % had waited to decide whether to collect more data until they had checked the significance of their results, thereby allowing them to hold out until positive results materialize. More than % had selectively reported studies that worked
"One reason for the excess in positive results for psychology is an emphasis on slightly freak-show-ish results," says Chris Chambers, an experimental psychologist at Cardiff University, UK. High-impact journals often regard psychology as a sort of "parlour-trick area," he says. Results need to be exciting, eye-catching, even implausible. Simmons says that the blame lies partly in the review process. "When we review papers, we're often asking authors to prove that their findings are novel or interesting," he says. "We're not often asking them to prove that their findings are true."
Some researchers are agnostic about the outcome, but Pashler expects to see confirmation of his fears: that the corridor gossip about irreproducible studies and the file drawers stuffed with failed attempts at replication will turn out to be real. "Then, people won't be able to dodge it," he says.
These practices can create an environment in which misconduct goes undetected. In November, Diederik Stapel, a social psychologist from Tilburg University in the Netherlands and a rising star in the field, was investigated for, and eventually confessed to, scientific fraud on a massive scale. Stapel had published a stream of attention-grabbing studies, showing for example that disordered environments, such as a messy train station, promote discrimination.
In the wake of high-profile controversies, psychologists are facing up to problems with replication.
After Bargh's paper on unconscious priming, dozens of other labs followed suit with their own versions of priming experiments. Volunteers who were primed by holding a heavy clipboard, for example, took interview candidates more seriously and deemed social problems to be more pressing than did those who held light boards.
If academics/funders can take control of the publishing process (e.g., eLife), perhaps this vision of a clinical trial approach could be realised. As grant applications are assessed on their potential merits, experimental designs could be accepted for publication before the results are known. In particular, funders could make a commitment to publish as part of their commitment to fund. Then as long as everything is executed correctly, the results should be of interest to the scientific community (otherwise, why fund the research in the first place?).
providing evidence for what he refers to as "psi", or psychic effects. There is, needless to say, no shortage of scientists sceptical about his claims. Three research teams independently tried to replicate the effect Bem had reported and, when they could not, they faced serious obstacles to publishing their results. The episode served as a wake-up call. "The realization that some proportion of the findings in the literature simply might not replicate was brought home by the fact that there are more and more of these counterintuitive findings in the literature," says Eric-Jan Wagenmakers, a mathematical psychologist from the University of Amsterdam.
These changes may be a far-off hope. Some scientists still question whether there is a problem, and even Nosek points out that there are no solid estimates of the prevalence of false positives. To remedy that, late last year, he brought together a group of psychologists to try to reproduce every study published in three major psychological journals. The teams will adhere to the original experiments as closely as possible and try to work with the original authors. The goal is not to single out individual work, but "to get some initial evidence about the odds of replication across the field," Nosek says.
Of course, one negative replication does not invalidate the original result. There are many mundane reasons why such attempts might not succeed. If the original effect is small, negative results will arise through chance alone. The volunteers in a replication attempt might differ from those in the original. And one team might simply lack the skill to reproduce another's experiments.
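The first of these mundane reasons, small effects, can be put in numbers with a quick simulation (a hypothetical sketch with assumed but plausible parameters, not figures from the article). With a true effect of 0.3 standard deviations and 25 volunteers per group, a faithful replication reaches p < 0.05 well under half the time, so frequent "failures" are expected even when the original finding is real:

```python
import math
import random
import statistics

def t_test_p(a, b):
    """Two-sample p-value via a normal approximation to Welch's t."""
    se = math.sqrt(statistics.variance(a) / len(a) +
                   statistics.variance(b) / len(b))
    z = abs(statistics.mean(a) - statistics.mean(b)) / se
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

def replication_power(effect=0.3, n=25, trials=4000, seed=7):
    """Fraction of faithful replications reaching p < 0.05 when the
    true effect (in standard-deviation units) really exists."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        treated = [rng.gauss(effect, 1) for _ in range(n)]
        control = [rng.gauss(0, 1) for _ in range(n)]
        if t_test_p(treated, control) < 0.05:
            hits += 1
    return hits / trials

power = replication_power()
print(f"chance a faithful replication 'succeeds': {power:.0%}")
```

Under these assumptions most honest replication attempts of a real but small effect come up non-significant, which is why a single negative result settles so little.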
Matthew Lieberman, a social psychologist from the University of California, Los Angeles, suggests a different approach. The top psychology programmes in the United States could require graduate students to replicate one of several nominated studies within their own field, he says. The students would build their skills and get valuable early publications, and the field would learn whether surprising effects hold up.
, are the worst offenders: they are five times more likely to report a positive result than are the space sciences, which are at the other end of the spectrum (see "Accentuate the positive"). The situation is not improving. In , statistician Theodore Sterling found that % of the studies in four major psychology journals had reported statistically significant positive results
But to other psychologists, reliance on conceptual replication is problematic. "You can't replicate a concept," says Chambers. "It's so subjective. It's anybody's guess as to how similar something needs to be to count as a conceptual replication." The practice also produces a logical double standard, he says. For example, if a heavy clipboard unconsciously influences people's judgements, that could be taken to conceptually replicate the slow-walking effect. But if the weight of the clipboard had no influence, no one would argue that priming had been conceptually falsified. With its ability to verify but not falsify, conceptual replication allows weak results to support one another. "It is the scientific embodiment of confirmation bias," says Brian Nosek, a social psychologist from the University of Virginia in Charlottesville. "Psychology would suffer if it wasn't practised, but it doesn't replace direct replication. To show that A is true, you don't do B. You do A again."
, a journal that publishes all technically sound papers, regardless of novelty.
For many psychologists, the clearest sign that their field was in trouble came, ironically, from a study about premonition. Daryl Bem, a social psychologist at Cornell University in Ithaca, New York, showed student volunteers words and then abruptly asked them to write down as many as they could remember. Next came a practice session: students were given a random subset of the test words and were asked to type them out. Bem found that some students were more likely to remember words in the test if they had later practised them. Effect preceded cause.
Wilkie Way, Apt. C
But all the factors that make replication difficult helped him to cover his tracks. The scientific committee that investigated his case wrote: "Whereas all these excessively neat findings should have provoked thought, they were embraced... People accepted, if they even attempted to replicate the results for themselves, that they had failed because they lacked Mr Stapel's skill." It is now clear that Stapel manipulated and fabricated data in at least publications.
The conduct of subtle experiments has much in common with the direction of a theatre performance, says Daniel Kahneman, a Nobel-prizewinning psychologist at Princeton University in New Jersey. Trivial details such as the day of the week or the colour of a room could affect the results, and these subtleties never make it into methods sections. Bargh argues, for example, that Doyen's team exposed its volunteers to too many age-related words, which could have drawn their attention to the experiment's hidden purpose. "In priming studies, you must tweak the situation just so, to make the manipulation strong enough to work, but not salient enough to attract even a little attention," says Kahneman. "Bargh has a knack that not all of us have." Kahneman says that he attributes a special knack only to those who have found an effect that has been reproduced in hundreds of experiments. Bargh says of his priming experiments that he never wanted there to be some secret knowledge about how to make these effects happen. "We've always tried to give that knowledge away, but maybe we should specify more details about how to do these things."
On average, most respondents felt that these practices were defensible. "Many people continue to use these approaches because that is how they were taught," says Brent Roberts, a psychologist at the University of Illinois at Urbana-Champaign.
Stapel's story mirrors those of psychologists Karen Ruggiero and Marc Hauser of Harvard University in Cambridge, Massachusetts, who published high-profile results on discrimination and morality, respectively. Ruggiero was found guilty of research fraud and Hauser was found guilty of misconduct. Like Stapel, they were exposed by internal whistle-blowers. "If the field was truly self-correcting, why didn't we correct any single one of them?" asks Nosek.
That's a great idea, but it would be an even better idea to just make all studies pre-registered, which would go a long way to avoiding the problem of unreplicable studies in the first place, in my view.
, John Ioannidis, an epidemiologist currently at Stanford School of Medicine in California, argued that most published research findings are false, according to statistical logic. In a survey of studies from across the sciences, Daniele Fanelli, a social scientist at the University of Edinburgh, UK, found that the proportion of positive results rose by more than % between and (ref. ). Psychology and psychiatry, according to other work by Fanelli
This is not a new problem in computer-based studies and, counter-intuitively, one which is actually getting worse year-on-year. We hope shortly to be summarising this decade's worth of work in a CABN special issue, which should be due out next month. In the meantime, readers are strongly advised to visit our website for an overview of the issues: