Famous cognitive psychology experiments that failed to replicate (buttondown.com)
118 points by PaulHoule 4 hours ago | 73 comments
jbentley1 2 hours ago [-]
This is a great list for people who want to smugly say "Um, actually" a lot in conversation.

Based on my brief stint doing data work in psychology research, amongst many other problems they are AWFUL at stats. And it isn't a skill issue as much as a cultural one. They teach it wrong and have a "well, everybody else does it" attitude towards p-hacking and other statistical malpractice.
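
A minimal sketch in Python of the multiple-comparisons problem behind p-hacking, with purely synthetic noise data (all numbers illustrative): run enough tests on nothing and something will look "significant".

    import numpy as np
    from scipy import stats
    rng = np.random.default_rng(0)
    # 20 "measures" of pure noise across two groups that are truly
    # identical. At alpha = 0.05, about one test in twenty comes out
    # "significant" by chance -- report only that one and you have p-hacked.
    n_tests, n_per_group = 20, 30
    hits = 0
    for i in range(n_tests):
        a = rng.normal(0, 1, n_per_group)
        b = rng.normal(0, 1, n_per_group)
        if stats.ttest_ind(a, b).pvalue < 0.05:
            hits += 1
            print(f"measure {i}: 'publishable'")
    print(f"{hits} spurious hits out of {n_tests} tests on pure noise")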

wduquette 2 hours ago [-]
"they are AWFUL at stats."

SF author Michael Flynn was a process control engineer as his day job; he wrote about how designing statistically valid experiments is incredibly difficult, and the potential for fooling yourself is high, even when you really do know what you are doing and you have nearly perfect control over the measurement setup.

And on top of that, you're trying to measure the behavior of people, not widgets; and people change their behavior based on the context and what they think you're measuring.

There was a lab set up to do "experimental economics" at Caltech back in the late 80's/early 90's. Trouble is, people make different economic decisions when they are working with play money rather than real money.

dgfitz 20 minutes ago [-]
> Trouble is, people make different economic decisions when they are working with play money rather than real money.

Understated even. Ever play poker with just chips and no money behind them? Nobody cares, there is no value to the plastic coins.

sputr 2 hours ago [-]
As someone who's part of a startup (hrpotentials.com) trying to bring truly scientifically valid psychological testing into HR processes .... yeah. We've been at it for almost 7 years, and we're finally at a point where we can say we have something that actually makes scientific sense - and we're not inventing anything new, just commercializing the science! It only took an electrical engineer (not me) with a strong grasp of statistics working for years with a competent professor of psychology to separate the wheat from the chaff. There's some good science there, it's just ... not used much.
obviouslynotme 1 hour ago [-]
How are you going to get around Griggs v. Duke Power Co.? AFAIK, personality tests have not (yet) been given the regulatory eye, but testing cognitive ability has.
PaulHoule 2 hours ago [-]
Yeah, this is an era which is notorious for pseudoscience.
odyssey7 2 hours ago [-]
There’s surely irony here
Waterluvian 2 hours ago [-]
Um, actually I’d say it is the responsibility of all scientists, both professional and amateur, to point out falsehoods when they’re uttered, and not an act of smugness.
rolph 2 hours ago [-]
[um] has contexts, but is usually a cue that something unexpected, off the average, is about to be said.

[actually] is a neutral declaration that some cognitive structure was presented but is at odds with physically observable fact, which will now be laid out for you.

delichon 4 hours ago [-]
Approximate replication rates in psychology:

  social      37%
  cognitive   42%
  personality 55%
  clinical    44%
So a list of famous psychology experiments that do replicate may be shorter.

https://www.nature.com/articles/nature.2015.18248

t_mann 10 minutes ago [-]
Thanks for providing the reference, that's useful context. Those are awful replication rates, worse than a coin flip. Sounds like the OP can add their own introduction to their list. From the introduction:

> Most results in the field do actually replicate and are robust[citation needed]

NewJazz 2 hours ago [-]
I think one would wish the famous ones to be more often replicable.
tomjakubowski 2 hours ago [-]
Nonreplicable publications are cited more than replicable ones (2021)

> We use publicly available data to show that published papers in top psychology, economics, and general interest journals that fail to replicate are cited more than those that replicate. This difference in citation does not change after the publication of the failure to replicate. Only 12% of postreplication citations of nonreplicable findings acknowledge the replication failure.

https://www.science.org/doi/10.1126/sciadv.abd1705

Press release: https://rady.ucsd.edu/why/news/2021/05-21-a-new-replication-...

dlcarrier 24 minutes ago [-]
Isn't the unexpected more famous than the expected?
sunscream89 2 hours ago [-]
There may be minute details, like having a confident frame of reference for the confidence tests. Cultures, even psychologies, might swing certain ideas and their compulsions.
glial 3 hours ago [-]
The incentive of all psychology researchers is to do new work rather than replications. Because of this, publicly-funded psychology PhDs should be required to perform study replication as part of their training. Protocol + results should be put in a database.
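
A minimal sketch of what one entry in such a database might look like (Python; the field names and values are invented for illustration, not an existing standard):

    from dataclasses import dataclass
    # Hypothetical registry entry: enough to find the original study,
    # the preregistered protocol, and the replication outcome.
    @dataclass
    class ReplicationRecord:
        original_doi: str     # paper being replicated
        protocol_url: str     # preregistered protocol, e.g. an OSF page
        sample_size: int
        effect_size: float    # e.g. Cohen's d found in the replication
        replicated: bool      # met the preregistered criterion?
    record = ReplicationRecord(
        original_doi="10.1000/example",         # placeholder DOI
        protocol_url="https://osf.io/example",  # hypothetical URL
        sample_size=240,
        effect_size=0.05,
        replicated=False,
    )
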
analog31 25 minutes ago [-]
Sure, dump it on the lowest level employee, who has the least training and the most to lose. Punish them for someone else's bad research. Grad school already takes too long, pays too little, and involves too much risk of not finishing. And it doesn't solve the problem of people having to generate copious quantities of research in order to sustain their careers.

Disclosure: Physics PhD.

gwd 2 hours ago [-]
How interesting would it be if every PhD thesis had to have a "replication" section, where they tried to replicate some famous paper's results.
epolanski 2 hours ago [-]
> Claimed result: Women risk being judged by the negative stereotype that women have weaker math ability, and this apprehension disrupts their math performance on difficult tests.

I'll never understand stances trying to hide biological differences between different sexes or ethnic backgrounds.

We know for a fact that sex or ethnicity impacts the body, yet we seem unable to cope with the idea that there are also differences in how brains (and hormones) work.

Women have, on average, a higher emotional intelligence which is e.g. tied to higher linguistic proficiency. That helps in many different fields and, on average, women tend to learn languages easier than men.

At the same time, on average, they may perform slightly worse than men in highly computational fields (math or chess).

I want to reiterate what I'm getting at before the rest of the post:

Genetics matter when you look at very large samples, but they are irrelevant on smaller (or single) samples.

I feel the NBA provides a great example.

On average, African Americans are taller than white men and have a higher muscular density.

On large samples, they tend to outperform white men. But as soon as you make the samples smaller, even at elite levels, you find out that Larry Bird (30+ years ago) or Nikola Jokic (today) are the best players in the world.

The same applies to women: just because large samples explain some statistics, such as women on average performing worse at math, doesn't mean that women can't be the best chess players or cryptographers in the world.
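
A quick illustration of that statistical point with synthetic numbers (the distributions below are made up; only the logic matters): a small difference in group means leaves the distributions almost entirely overlapping, so it says little about any individual.

    import numpy as np
    rng = np.random.default_rng(1)
    # Two synthetic populations whose means differ by 0.2 standard
    # deviations -- a small effect size, as in many group comparisons.
    group_a = rng.normal(100.0, 15.0, 1_000_000)
    group_b = rng.normal(103.0, 15.0, 1_000_000)
    # The aggregate difference is unmistakable...
    print(group_b.mean() - group_a.mean())  # ~3.0
    # ...yet a random member of the "lower" group still outscores a
    # random member of the "higher" group about 44% of the time.
    wins = rng.choice(group_a, 100_000) > rng.choice(group_b, 100_000)
    print(wins.mean())  # ~0.44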

viewtransform 51 minutes ago [-]
> On average, African Americans are taller than white men and have a higher muscular density.

Are you comparing direct descendants of Yoruba versus descendants of Celts in America ? or mixed descendants of Bantu and Cherokee versus mixed descendants of Anglo-Saxons and Slavs ? In your study would Barack Obama be a person of color or a person of pallor ?

Or is this data you have gathered observing people at Costco? Just checking on your scientific methodology.

fny 38 minutes ago [-]
Differences are hidden because (1) differences, even small ones, are used to justify discrimination, (2) some feel the need to correct for stereotypes, and (3) these differences often don't really exist or amount to a small effect size.[0]

In the end, we're talking about distributions of people, and staring at these differences mischaracterizes all but those at the mean.

All that matters is who can pass the test.

[0]: I also encourage you to ask ChatGPT/Grok/Claude "men vs women math performance studies." You'll be shocked to find most studies point to no or small differences.

[1]: Malcolm Gladwell wrote a great piece about his experience as a runner that seems appropriate to share https://www.newyorker.com/magazine/1997/05/19/the-sports-tab...

runarberg 8 minutes ago [-]
Quite often those differences exist because of systemic or cultural bias that affects the test design. Tests are often validated based off of other tests that showed a difference, but those tests often had a severe sampling bias that showed a group difference where none existed. It then became an established theory that if you design a test that measures e.g. “emotional intelligence” (whatever that means) and it didn't show a group difference, it was invalid and had to be adjusted until it did.
myhf 2 hours ago [-]
Circular reasoning can be used to "prove" anything, so it's not helpful as a basis for policy making.
runarberg 14 minutes ago [-]
> Women have, on average, a higher emotional intelligence which is e.g. tied to higher linguistic proficiency. That helps in many different fields and, on average, women tend to learn languages easier than men.

Has this been experimentally shown to be the case with studies that don't fail to replicate?

Between studies that fail to replicate and pure conjecture and pseudo-science, I certainly favor the former; at least actual studies that fail to replicate can be disproven. Your conjectures are just race/sex science and nothing but pseudo-science. I can either take you at your word or choose not to believe you. I pick the latter.

us-merul 1 hour ago [-]
> We know for a fact that sex or ethnicity impacts body yet we seem unable to cope with the idea that there are also differences in how brains work.

Here is your error. You’re assuming that a physical difference in morphology is linked to behavioral or neural correlates. That’s not the case, since observed statistical- or group-level differences need not be driven by biology. You’re assuming biological determinism, and the evidence for direct genetic effects on behavior isn’t there.

Aurornis 1 hour ago [-]
> and the evidence for direct genetic effects on behavior isn’t there.

Yes it is. There's an entire field for studying this called Behavioral Genetics.

The easiest evidence comes from comparing monozygotic and dizygotic twins (identical vs. fraternal twins). The variance in behavior is higher among the dizygotic twins, who have different genomes.
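
For reference, the textbook estimate from such comparisons is Falconer's formula, which doubles the gap between identical-twin and fraternal-twin correlations; a sketch with made-up correlation values:

    # Falconer's formula: h^2 = 2 * (r_MZ - r_DZ), where r_MZ and r_DZ
    # are within-pair trait correlations for identical (monozygotic)
    # and fraternal (dizygotic) twins. Values below are illustrative.
    r_mz, r_dz = 0.70, 0.45
    h2 = 2 * (r_mz - r_dz)  # variance share attributed to genes: 0.50
    c2 = r_mz - h2          # shared-environment share: 0.20
    e2 = 1.0 - r_mz         # nonshared environment + error: 0.30
    print(f"h^2 = {h2:.2f}, c^2 = {c2:.2f}, e^2 = {e2:.2f}")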

epolanski 1 hour ago [-]
It's not an error unless you're able to demonstrate the opposite.

I have yet to see studies that demonstrate that different sexes, hormones or even ethnicities do not impact cognitive abilities or higher proficiency in different fields.

Whereas I've seen plenty that show that women, on average, demonstrate higher cognitive abilities linked to verbal proficiency, text comprehension or executive tasks. Women also tend to have better memory than men.

The fact is that there are genetic differences in how our brains work. And let's not ignore the huge importance of hormones, extremely potent regulators of how we function.

To ignore that we have differences that, in the aggregate, help explain statistics is asinine.

us-merul 1 hour ago [-]
And how are you able to rule out that societal or environmental effects are the primary driver? How is your argument not circular, that observed differences are therefore the result of biology?
epolanski 1 hour ago [-]
I've never stated that biology is the primary driver.

I merely stated that biology should not be ignored when judging very large samples.

There are cross-sex cognitive tests at which women and men tend to perform differently, such as spatial awareness or speed perception, and many others.

What's the environmental or cultural factor behind the fact that a female's brain, on average, is able to judge speed much more accurately than a male's?

us-merul 1 hour ago [-]
I see you edited your response after my reply. I’m not denying that you’ve read about those observed differences. I’m trying to say that those differences don’t need to be driven by biology, and evidence suggests otherwise. Behavior can’t be reduced to genetics, and the mechanistic link isn’t there. You are claiming that morphological differences explain the variation. Besides, by your reasoning, you could look at the NBA before Bill Russell and make very different claims.
3cKU 1 hour ago [-]
> And how are you able to rule out

It is not possible to rule out unfalsifiable hypotheses.

aeve890 3 hours ago [-]
>Source: Stern, Gerlach, & Penke (2020)

Wow, what are the odds?

https://en.wikipedia.org/wiki/Stern%E2%80%93Gerlach_experime...

dlcarrier 19 minutes ago [-]
I thought you were pointing out some bias by comparing the research to previous research from the same authors. It took me far too long to realize that the experiment was from 100 years ago, and you were pointing out that the names were coincidentally the same.
NooneAtAll3 2 hours ago [-]
I'm still amazed that Wikipedia doesn't have a redirect away from its mobile site
dang 2 hours ago [-]
(It's on my list to rewrite those URLs in HN comments at least)
dlcarrier 25 minutes ago [-]
Disturbing fact: The Stanford prison experiment, run by Philip Zimbardo, wasn't reproducible, but that didn't stop Zimbardo from using it to promote his ideologies about the impossibility of rehabilitating criminals, or from becoming the president of the American Psychological Association.

The APA has a really good style guide, but I don't trust them for actual psychology.

Terr_ 3 hours ago [-]
> Source: Hagger et (63!) al. 2016

I can't help chuckling at the idea that over 1.98 * 10^87 people were involved in the paper.
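
For the record, the arithmetic of the joke checks out:

    import math
    print(f"{math.factorial(63):.3e}")  # 1.983e+87 "co-authors"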

dlcarrier 16 minutes ago [-]
If you were to meet a "normal" person, would you interpret that as meaning "perpendicular" or as meaning "the kind of person that doesn't look at everything like it's a mathematical expression"?
Aurornis 1 hour ago [-]
> Claimed result: Adopting expansive body postures for 2 minutes (like standing with hands on hips or arms raised) increases testosterone, decreases cortisol, and makes people feel more powerful and take more risks.

A heuristic I use that is unreasonably good at identifying grifters and charlatans: Unnecessarily invoking cortisol or other hormones when discussing behavioral topics. Influencers, podcasters, and pseudoscience practitioners love to invoke cortisol, testosterone, inflammation, and other generic concepts to make their ideas sound more scientific. Instead of saying "stress levels" they say "cortisol". They also try to suggest that cortisol is bad and you always want it lower, which isn't true.

Dopamine is another favorite of the grifters. Whenever someone starts talking about raising dopamine or doing something to increase dopamine, they're almost always being misleading or just outright lying. Health and fitness podcasters are the worst at this right now.

quickthrowman 36 minutes ago [-]
> Dopamine is another favorite of the grifters. Whenever someone starts talking about raising dopamine or doing something to increase dopamine, they're almost always being misleading or just outright lying. Health and fitness podcasters are the worst at this right now.

Yeah it’s ridiculous. You know what raises dopamine very effectively? Cocaine, gambling, heroin, meth, etc. If they really believed their own advice they’d all be doing meth or cocaine all day every day. If you look at what happens to regular meth users, it doesn’t seem like raising dopamine all the time is a good idea.

fsckboy 3 hours ago [-]
famous cognitive psychology experiments that do replicate: IQ tests

http://www.psychpage.com/learning/library/intell/mainstream....

in fact, the foundational statistical models considered the gold standard for statistics today were developed for this testing.

dlcarrier 59 seconds ago [-]
I took an IQ test as a high school student, and one of the subtests involved placing a stack of shuffled pictures in chronological order. I had one series in the incorrect order, because I had no understanding of the typical behavior of snowfall. The test proctor said almost everyone she tested mixed that one up, because it doesn't snow in the area where I live.

I have no doubt that IQ tests reproducibly measure the test taker's ability to pass tests, as well as to perform in a society that the tests are based on.

I think it's disingenuous to attribute IQ to intelligence as a whole though, and it is better understood as an indicator of cultural intelligence.

I would expect that, for cultures whose members score below average on IQ tests from the US, an equivalent IQ test created within that culture would show average members of that culture scoring higher than average members of US culture.

alphazard 2 hours ago [-]
> in fact, the foundational statistical models considered the gold standard for statistics today were developed for this testing.

The normal distribution predates the general factor model of IQ by hundreds of years.[0]

You can try other distributions yourself, it's going to be hard to find one that better fits the existing IQ data than the normal (bell curve) distribution.

[0] https://en.wikipedia.org/wiki/Normal_distribution#History
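
A sketch of the suggested comparison on synthetic bell-shaped "IQ" scores (the normal wins here by construction, since the data is generated normal; real test data would be needed for the comparison to mean anything):

    import numpy as np
    from scipy import stats
    rng = np.random.default_rng(2)
    scores = rng.normal(100, 15, 10_000)  # stand-in for real IQ data
    # Fit normal and Laplace by maximum likelihood, compare log-likelihoods.
    mu, sigma = stats.norm.fit(scores)
    loc, scale = stats.laplace.fit(scores)
    print(stats.norm.logpdf(scores, mu, sigma).sum())     # higher = better fit
    print(stats.laplace.logpdf(scores, loc, scale).sum())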

fsckboy 2 hours ago [-]
Darwin's cousin, Francis Galton, after whom the log-normal distribution is often called the Galton distribution, was among the first to investigate psychometrics.

Not realizing he was hundreds of years late to the game, he still went ahead and coined the term "median".

more tidbits here https://en.wikipedia.org/wiki/Francis_Galton#Statistical_inn...

astrange 25 minutes ago [-]
Survivorship bias. You can easily make someone's IQ test not replicate. (Hit them on the head really hard.)
gwd 2 hours ago [-]
> Smile to Feel Better Effect

> Claimed result: Holding a pen in your teeth (forcing a smile-like expression) makes you rate cartoons as funnier compared to holding a pen with your lips (preventing smiling). More broadly, facial expressions can influence emotional experiences: "fake it till you make it."

I read this about a decade ago, and started grimacing maniacally, like I had a pencil in my teeth, when going into a situation where I wanted to have a natural smile. The thing is, it's just so silly, it always makes me laugh at myself, at which point I have a genuine smile. I always doubted whether the claimed connection was real, but it's been a useful tool anyway.

sunscream89 2 hours ago [-]
Yeah, the marshmallow one taught me to have patience and look for the long returns on investments of personal effort.

I think there may be something to a few of these, and more consideration may be needed of how they are conducted.

Let’s leave open our credulities for the inquest of time.

tryauuum 28 minutes ago [-]

    claimed result: Women are more attracted to hot guys during high-fertility days of their cycles

wait why not? I hoped I was attractive at least some days of the month :(
sunrunner 1 hour ago [-]
No mention of the Stanford Prison Experiment I notice.
systemstops 2 hours ago [-]
Is anyone tracking how much damage to society bad social science has done? I imagine it's quite a bit.
roadside_picnic 43 minutes ago [-]
The most obvious one is the breakdown of trust in scientific research. A frequent discussion I would have with another statistics friend of mine was that the anti-vax crowd really isn't as far off base as they are popularly portrayed, and if anything the "trust the science!" rhetoric is more clearly incorrect.

Science should never be taught as dogmatic, but the reproducibility crisis has ultimately fostered a culture where one should not question "established" results (in his famous book, Kahneman proclaimed that one "must" accept the unbelievable priming results), especially if one is interested in a long academic career.

The trouble is that some trust is necessary in communicating scientific observations and hypotheses to the general public. It's easy to blame the public's failure to unify around Covid on cultural divides, but the truth is that skepticism around high-stakes, hastily done science is well warranted. The trouble is that even when you can step through the research and see the conclusions are sound, the skepticism remains.

However, as someone that has spent a long career using data to understand the world, I suspect the harm directly caused by the wrong conclusions being reached is more minimal than one would think. This is largely because, despite lip service to "data driven decision making", science and statistics very rarely are the prime driver of any policy decision.

BeetleB 32 minutes ago [-]
I imagine it's comparable to the damage done when policies are set that are not based on studies.

Let's be candid: Most policies have no backing in science whatsoever. The fact that some were backed by poor science is not an indictment of much.

feoren 2 hours ago [-]
We rack up quite a lot of awfulness with eugenics, phrenology, the "science" that influenced Stalin's disastrous agriculture policies in the early USSR, overpopulation scares leading to China's one-child policy, etc. Although one could argue these were back-justifications for the awfulness that people wanted to do anyway.
systemstops 2 hours ago [-]
Those things were not done by awful people though - they all thought they were serving the public good. We only judge it as awful now because of the results. Nearly all of these ideas (Lysenkoism I think was always fringe) were embraced by the educated elites of the time.
feoren 1 hour ago [-]
Lysenkoism! That's the one. Thank you for reminding me of the name (and for knowing what I was grasping at).

I think some "bad people" used eugenics and phrenology to justify prior hate, but they were also effective tools at convincing otherwise "good people" to join them.

izabera 2 hours ago [-]
i'm struggling to imagine many negative effects on society caused by the specific papers in this list
systemstops 2 hours ago [-]
Public policies were made (or justified) based on some of this research. People used this "settled science" to make consequential decisions.

Stereotype threat for example was widely used to explain test score gaps as purely environmental, which contributed to the public seeing gaps as a moral emergency that needed to be fixed, leading to affirmative action policies.

bogtog 2 hours ago [-]
Little of this is considered cognitive psychology. The vast majority would be viewed as "social psychology"

Setting that aside, among any scientific field I'm aware of, psychology has taken the replication crisis most seriously. Rigor across all areas of psychology is steadily increasing: https://journals.sagepub.com/doi/full/10.1177/25152459251323...

blindriver 2 hours ago [-]
Papers should not be accepted until an independent lab has replicated the results. It's pretty simple, but people are incentivized not to care whether it's replicable because they need the paper published to advance their careers.
picardo 2 hours ago [-]
Well, at least the growth mindset study is not fully debunked yet. It's basically a modern interpretation of what we've known to be true about self-fulfilling prophecies. If you tell children they can be smart and competent if they work hard, then they will work hard and become smart and competent. This should be a given.
lutusp 26 minutes ago [-]
A key factor behind psychology's low replication rate is the absence of theories that define the field. In most science fields, an initial finding can be compared to theory before publication, which may weed out unlikely results in advance. But psychology doesn't have this option -- no theories, so no litmus test.

It's important to say that a psychology study can be scientific in one sense -- say, rigorous and disciplined -- but at the same time be unscientific, in the sense that it doesn't test a falsifiable, defining psychological theory, because there aren't any of those.

Or, to put it more simply, scientific fields require falsifiable theories about some aspect of nature, and the mind is not part of nature.

Future neuroscience might fix this, but don't hold your breath for that outcome. I suspect we'll have AGI in artificial brains before we have testable, falsifiable neuroscience theories about our natural brains.

ausbah 3 hours ago [-]
i wonder what the replication rate is for ML papers
PaulHoule 2 hours ago [-]
From working in industry and rubbing shoulders with CS people who prioritize writing papers over writing working software I’m sure that in a high fraction of papers people didn’t implement the algorithm they thought they implemented.
avdelazeri 2 hours ago [-]
Don't get me started, I have seen repos that I'm fairly sure never ran in their presented form. A guy in our lab thinks authors purposefully mess up their code when publishing on GitHub to make it harder to replicate. I'm starting to come around on his theory.
WesolyKubeczek 2 hours ago [-]
> Claimed result: Listening to Mozart temporarily makes you smarter.

This belongs in a dungeon crawl game. You find an artifact that plays music to you. Depending on the music played (depends on the artifact's enchantment and blessed status), it can buff or debuff your intelligence by several points temporarily.

insane_dreamer 1 hour ago [-]
If the "failed replication" was a single study, as in many cases listed here, there is still an open question as to whether 1) the replication study was underpowered (the ones I looked at had pretty small n's), or 2) the re-implementation of the original study was flawed. So I'm not so sure we can quickly label the original studies as "debunked", any more than we can express a high level of confidence in the original studies.

(This isn't a comment on any of the individual studies listed.)
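
The underpowered-replication worry can be made concrete with a quick simulation (all numbers illustrative): with a genuinely small true effect and a small-n replication, a "failed" replication is the most likely outcome.

    import numpy as np
    from scipy import stats
    rng = np.random.default_rng(3)
    d, n, sims = 0.2, 20, 10_000  # small true effect, n = 20 per group
    hits = sum(
        stats.ttest_ind(rng.normal(0, 1, n), rng.normal(d, 1, n)).pvalue < 0.05
        for _ in range(sims)
    )
    print(f"power ~ {hits / sims:.2f}")  # ~0.09: the real effect is
                                         # usually "not replicated"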

Animats 2 hours ago [-]
> Most results in the field do actually replicate and are robust [citation needed], so it would be a pity to lose confidence in the whole field just because of a few bad apples.

Is there a good list of results that do consistently replicate?

hn_throw_250915 2 hours ago [-]
I thought we knew that these were vehicles by wannabe self-help authors to puff up their status for money. See for example “Grit” and “Deep Work” and other bullshit entries in a breathlessly hyped up genre of pseudoscience.
SpaceManNabs 3 hours ago [-]
One thing that confuses me is that some of these papers were successfully replicated, so juxtaposing them with the ones that have not been replicated at all, given the title of the page, feels a bit off. Not sure if that's fair.

The ego depletion effect seems intuitively surprising to me. Science is often unintuitive. I do know that it is easier to make forward-thinking decisions when I am not tired, so I don't know.

ceckGrad 2 hours ago [-]
>some of these papers were successfully replicated, so juxtaposing them to the ones that have not been replicated at all given the title of the page feels a bit off. Not sure if fair.

I don't like Giancotti's claims. He wrote:

> This post is a compact reference list of the most (in)famous cognitive science results that failed to replicate and should, for the time being, be considered false.

I don't agree with Giancotti's epistemological claims but today I will not bloviate at length about the epistemology of science. I will try to be brief.

If I understand Marco Giancotti correctly, one particular point is that he seems to be saying that Hagger et al. have impressively debunked Baumeister et al.

The ego depletion "debunking" is not really what I would call a refutation. It says, "Results from the current multilab registered replication of the ego-depletion effect provide evidence that, if there is any effect, it is close to zero. ... Although the current analysis provides robust evidence that questions the strength of the ego-depletion effect and its replicability, it may be premature to reject the ego-depletion effect altogether based on these data alone."

Maybe Baumeister's protocol was fundamentally flawed, but the counter-argument from Hagger et al. does not convince me. I wasn't thrilled with Baumeister's claims when they came out, but now I am somehow even less thrilled with the claims of Hagger et al., and I absolutely don't trust Giancotti's assessment. I could believe that Hagger executed Baumeister's protocol correctly, but I can't believe Giancotti has a grasp of what scientific claims "should" be "believed."

SpaceManNabs 12 minutes ago [-]
You make some good points based on your deeper read. I am a bit saddened that the rest of the comment section (the top 6 comments as of right now) devolved into "look at how silly psychology is with all its p-hacking"

That might be true, but this article's comment section isn't a good place for it because it doesn't seem like the article is entirely fair. I would not call it dishonest, but there is a lack of certainty and finality in being able to conclude that these papers have been successfully proven to not be replicable.

taeric 3 hours ago [-]
The idea isn't that it is easier to do things when not tired. It is that you specifically get tired exercising self control.

I think that can be subtly confused with people thinking you can't get better at self control with practice. That is, I would think a deliberate practice of doing more and more self control every day should build up your ability to exercise more self control. And it would be easy to think that means you have a stamina for self control that depletes in the same way that aerobic fitness works. But those don't necessarily follow from each other.

juujian 2 hours ago [-]
Now I want to know which cognitive psychology experiments were successfully replicated though.