At the PsychMethods Facebook discussion group, Uli Schimmack et al. have recently been discussing the lack of merit of the implicit self-esteem (ISE) construct. I chimed in with a brief note concurring with Uli, stating that during my first three years of graduate school I amassed over 20 “failed studies” involving implicit self-esteem (building upon Dijksterhuis’s (2004) seminal ISE paper).[1] This led to a tremendous waste of time and research resources, substantially derailing my main line of research, before I abandoned the topic altogether a few years later.
Uli asked me if I’d ever published or at least archived such failed studies. I replied in the negative because this wasn’t done in pre-2010 days. I did mention, however, that I would be publicly releasing more details of these failed studies in a book I’m currently writing about social psychology’s unraveling in the context of the broken academic system.
As a sneak preview, I’ve decided to empty my entire file-drawer for all implicit social cognition studies[2] I executed during my first three years of graduate school (2005-2008), which includes the 20+ failed studies on implicit self-esteem specifically:
I became so frustrated with my “lack of success” that I created this table in the Spring of 2008 to more carefully document my failures. I also printed out a hard copy of the table and would show it to professors and visiting external speakers. In an exasperated tone, I would ask them: What the hell am I doing wrong?
1. I wouldn’t go as far as Uli in declaring that “implicit self-esteem is DEAD; R.I.P Implicit Self-Esteem (2000-2015).” I would, however, strongly caution any researcher, particularly early-career researchers, against investing research resources on this topic.
2. Sample sizes for the studies ranged from N=80 to N=140, following the traditional heuristic of roughly N=20 per cell for between-subjects designs (sometimes with an additional N=20 to N=40 sampled in the case of statistically marginal effects).
At this past SPSP, Uri Simonsohn gave a talk on new ways of thinking about statistical power. From this new perspective, you first determine how large a sample size you can afford for a particular project. Then, you determine the minimum effect size that can be reliably detected (i.e., with 95% power) at that sample size (e.g., d_min = .73 can be reliably detected with n=50/cell). I believe that this approach is a much more productive way of thinking about power for several reasons, one being that it substantially enhances the interpretation of null results. For instance, you can conclude (assuming integrity of methods and measurement instruments) that the effect you’re studying is unlikely to be as large as the minimum effect size reliably detectable for your sample size (or else you would have detected it). That being said, it is still possible the effect exists but is much smaller in magnitude, which would require a much larger sample size to reliably detect.
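To make the d_min calculation concrete, here is a minimal sketch. It assumes a two-cell between-subjects design, a two-tailed test at alpha = .05, and a normal approximation to the t distribution; the function name `min_detectable_d` is mine, not from the talk. Because of the approximation, it gives about .72 for n=50/cell, slightly below the .73 you get from an exact t-based power calculation.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_d(n_per_cell, power=0.95, alpha=0.05):
    """Approximate smallest Cohen's d detectable with the given power.

    Assumes two independent cells of equal size and a two-tailed test,
    and uses the normal approximation rather than the noncentral t,
    so values run slightly low for small cells.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = z.inv_cdf(power)          # quantile for the desired power
    return (z_alpha + z_power) * sqrt(2 / n_per_cell)


if __name__ == "__main__":
    for n in (20, 50, 100):
        print(f"n={n}/cell -> d_min ~ {min_detectable_d(n):.2f}")
```

For exact values you would swap in a proper power routine based on the noncentral t distribution; the approximation is only meant to show the logic.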
In this post, I use the core ideas from this new approach to come up with a simpler and more intuitive way of gauging publication bias for extant empirical studies.
The idea is simple. If a study reports an observed effect size smaller than the minimum effect size reliably detectable for the sample size used, then the study likely suffers from publication bias and should be interpreted with caution. The further away the observed effect size is from the minimally detectable effect size, the larger the bias. Let’s look at some concrete examples.
Zhong & Liljenquist’s (2006) Study 1 on the “Macbeth effect” found d=.53 using n=30/cell. At this sample size, however, only effect sizes of d=.95 or greater are reliably detectable with 95% power. On the other hand, Tversky & Kahneman’s (1981) framing effect study found d=1.13 using n=153/cell. At that sample size, effect sizes as small as d=.41 are reliably detectable. See the table below for other examples:
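The comparison in these two examples can be sketched in a few lines. This is only the observed-versus-detectable check described above, not the bias index itself. It assumes two equal cells, a two-tailed alpha of .05, and a normal approximation (so the thresholds come out near .93 and .41 rather than the exact .95 and .41); the names `min_detectable_d` and `below_detectable` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_d(n_per_cell, power=0.95, alpha=0.05):
    """Approximate minimum detectable Cohen's d (normal approximation)."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * sqrt(2 / n_per_cell)


def below_detectable(observed_d, n_per_cell):
    """True if the reported effect is smaller than what the design
    could reliably detect, i.e. the case flagged for caution above."""
    return observed_d < min_detectable_d(n_per_cell)


# (label, observed d, n per cell) taken from the two examples in the text
studies = [
    ("Zhong & Liljenquist (2006), Study 1", 0.53, 30),
    ("Tversky & Kahneman (1981)", 1.13, 153),
]

if __name__ == "__main__":
    for label, d_obs, n in studies:
        flag = "interpret with caution" if below_detectable(d_obs, n) else "ok"
        print(f"{label}: d={d_obs}, d_min ~ {min_detectable_d(n):.2f} -> {flag}")
```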
The new bias index can be calculated as follows:
(And note we’d want to calculate a 95% C.I. around the bias estimate, given that bias estimates should be more precise for larger Ns all else being equal.)
To shed more light on the value of this simpler publication bias index, in the near future I will calculate these for studies where replicability information exists and test empirically whether the index predicts lower likelihood of replication.