Concept: Jacob Cohen


The purpose of this study was to compare the effects of 10 weeks of effort-matched short intervals (SI; n = 9) or long intervals (LI; n = 7) in cyclists. The high-intensity interval sessions (HIT) were performed twice a week, interspersed with low-intensity training. There were no differences between groups at pretest. There were no differences between groups in total volume of both HIT and low-intensity training. The SI group achieved a larger relative improvement in VO2max than the LI group (8.7% ± 5.0% vs 2.6% ± 5.2%, respectively; P ≤ 0.05). Mean effect size (ES) of the relative improvement in all measured parameters, including performance measured as mean power output during 30-s all-out, 5-min all-out, and 40-min all-out tests, revealed a moderate-to-large effect of SI training vs LI training (ES range was 0.86-1.54). These results suggest that the present SI protocol induces superior training adaptations on both the high-power region and lower power region of cyclists' power profile compared with the present LI protocol.

Concepts: Present, Time, Medical statistics, Statistical significance, Odds ratio, Effect size, Jacob Cohen, Equal temperament
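
The standardized effect sizes quoted in the cycling abstract above can be illustrated from its reported VO2max summary statistics. The sketch below assumes the common pooled-standard-deviation form of Cohen's d. The abstract does not say which ES formula the authors used, so the function name `cohens_d` and the choice of formula are illustrative assumptions, not the study's method.

```python
import math

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference using the pooled standard deviation."""
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)

# Relative VO2max improvement as reported: SI 8.7% ± 5.0% (n = 9), LI 2.6% ± 5.2% (n = 7).
d_vo2max = cohens_d(8.7, 5.0, 9, 2.6, 5.2, 7)
print(f"Cohen's d for relative VO2max improvement: {d_vo2max:.2f}")
```

Under these assumptions the result is roughly 1.2, which falls inside the 0.86-1.54 range the abstract reports across all measured parameters.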


There is emerging evidence that the supplementation of omega-3 contributes to a decrease in aggressive behaviour in prison populations. A challenge of such research is achieving statistical power against effect sizes which may be affected by the baseline omega-3 index. There are no published data on the blood omega-3 index with studies of this kind to assess the variability of the blood omega-3 index in conjunction with aggression and attention deficit assessments.

Concepts: Attention, Statistical significance, Educational psychology, Attention-deficit hyperactivity disorder, Aggression, Effect size, Prison, Jacob Cohen


BACKGROUND: Physical activity is assumed to be important in the prevention and treatment of frailty. It is however unclear to what extent frailty can be influenced, because an outcome instrument is lacking. OBJECTIVES: An Evaluative Frailty Index for Physical activity (EFIP) was developed based on the Frailty Index Accumulation of Deficits and clinimetric properties were tested. DESIGN: The content of the EFIP was determined in a written Delphi procedure. Intra-rater reliability, inter-rater reliability, and construct validity were determined in an observational study (n=24) and, to determine responsiveness, the EFIP was used in a physical therapy intervention study (n=12). METHOD: Intra-rater reliability and inter-rater reliability were calculated using Cohen’s kappa; construct validity was determined by correlating the score on the EFIP with those on the Timed Up & Go Test (TUG), the Performance Oriented Mobility Assessment (POMA), and the Cumulative Illness Rating Scale for geriatrics (CIRS-G). Responsiveness was calculated by means of the Effect Size (ES), the Standardized Response Mean (SRM), and a paired sample t-test. RESULTS: Fifty items were included in the EFIP. Inter-rater (Cohen’s kappa: 0.72) and intra-rater reliability (Cohen’s kappa: 0.77 and 0.80) were good. A moderate correlation with the TUG, POMA, and CIRS-G was found (0.68, -0.66, and 0.61, respectively; P < 0.001). Responsiveness was moderate to good (ES: -0.72 and SRM: -1.14) for an intervention with a significant effect (P < 0.01). LIMITATIONS: The clinimetric properties of the EFIP have been tested in a small sample and anchor-based responsiveness could not be determined. CONCLUSIONS: The EFIP is a reliable, valid, and responsive instrument to evaluate the effect of physical activity on frailty in research and clinical practice.

Concepts: Scientific method, Psychometrics, Student's t-test, Reliability, Cohen's kappa, Inter-rater reliability, Jacob Cohen, Fleiss' kappa
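
Cohen's kappa, the reliability statistic used in the EFIP abstract above, corrects the raw agreement between two raters for the agreement expected by chance. A minimal sketch of the unweighted two-rater form follows. The ratings are invented purely for illustration and are not data from the study, and `cohens_kappa` is a hypothetical helper name.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items on which the raters give the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions, summed over categories.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)

# Hypothetical deficit ratings (1 = present, 0 = absent) for ten items.
rater_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
rater_2 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
kappa = cohens_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.2f}")
```

Here the raters agree on 8 of 10 items (0.80) while chance agreement is 0.50, giving a kappa of 0.60.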


Many factors affect the microbiomes of humans, mice, and other mammals, but substantial challenges remain in determining which of these factors are of practical importance. Considering the relative effect sizes of both biological and technical covariates can help improve study design and the quality of biological conclusions. Care must be taken to avoid technical bias that can lead to incorrect biological conclusions. The presentation of quantitative effect sizes in addition to P values will improve our ability to perform meta-analysis and to evaluate potentially relevant biological effects. A better consideration of effect size and statistical power will lead to more robust biological conclusions in microbiome studies.

Concepts: Medical statistics, Statistical significance, Odds ratio, Effect size, Meta-analysis, Statistical power, Jacob Cohen, Gene V. Glass


The sample size necessary to obtain a desired level of statistical power depends in part on the population value of the effect size, which is, by definition, unknown. A common approach to sample-size planning uses the sample effect size from a prior study as an estimate of the population value of the effect to be detected in the future study. Although this strategy is intuitively appealing, effect-size estimates, taken at face value, are typically not accurate estimates of the population effect size because of publication bias and uncertainty. We show that the use of this approach often results in underpowered studies, sometimes to an alarming degree. We present an alternative approach that adjusts sample effect sizes for bias and uncertainty, and we demonstrate its effectiveness for several experimental designs. Furthermore, we discuss an open-source R package, BUCSS, and user-friendly Web applications that we have made available to researchers so that they can easily implement our suggested methods.

Concepts: Statistics, Sample size, Statistical significance, Future, Effect size, Meta-analysis, Statistical power, Jacob Cohen
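
The bias described in the sample-size planning abstract above can be made visible with a small simulation: when only significant results are published, the published effect sizes overestimate the population effect. The sketch below is not the adjustment implemented in BUCSS. It only illustrates the problem, assuming two equal-sized groups with unit-variance normal data and a normal-approximation significance cutoff of 1.96. All names and parameter values are hypothetical.

```python
import math
import random
import statistics

def simulate_published_effects(true_d, n_per_group, n_studies, seed=0):
    """Return (mean d over all studies, mean d over 'published' studies)."""
    rng = random.Random(seed)
    all_d, published_d = [], []
    for _ in range(n_studies):
        treated = [rng.gauss(true_d, 1.0) for _ in range(n_per_group)]
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        pooled_sd = math.sqrt(
            (statistics.variance(treated) + statistics.variance(control)) / 2
        )
        d = (statistics.mean(treated) - statistics.mean(control)) / pooled_sd
        all_d.append(d)
        # With equal group sizes the test statistic is d * sqrt(n / 2).
        # Assumption: a study is "published" only if it is significant
        # in the expected direction (normal approximation, cutoff 1.96).
        if d * math.sqrt(n_per_group / 2) > 1.96:
            published_d.append(d)
    return statistics.mean(all_d), statistics.mean(published_d)

mean_all, mean_published = simulate_published_effects(
    true_d=0.3, n_per_group=20, n_studies=2000
)
print(f"mean d, all studies:       {mean_all:.2f}")
print(f"mean d, published studies: {mean_published:.2f}")
```

With 20 participants per group, a study can only pass the cutoff if its sample d exceeds about 0.62, so every "published" estimate is more than double the assumed true value of 0.3. Planning a follow-up study from such an estimate would yield far too small a sample.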


Moral licensing refers to the effect that when people initially behave in a moral way, they are later more likely to display behaviors that are immoral, unethical, or otherwise problematic. We provide a state-of-the-art overview of moral licensing by conducting a meta-analysis of 91 studies (7,397 participants) that compare a licensing condition with a control condition. Based on this analysis, the magnitude of the moral licensing effect is estimated to be a Cohen’s d of 0.31. We tested potential moderators and found that published studies tend to have larger moral licensing effects than unpublished studies. We found no empirical evidence for other moderators that were theorized to be of importance. The effect size estimate implies that studies require many more participants to draw solid conclusions about moral licensing and its possible moderators.

Concepts: Effect size, Meta-analysis, Morality, Jacob Cohen, Gene V. Glass
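
The closing point of the moral licensing abstract above, that an effect of d = 0.31 demands large samples, can be made concrete with the standard normal-approximation formula for a two-group comparison, n per group ≈ 2((z_{1-α/2} + z_{power}) / d)². The two-sided α of 0.05, the 80% power target, and the simple two-group design are assumptions chosen for illustration and are not taken from the meta-analysis.

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided, two-sample comparison."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)

for d in (0.31, 0.5, 0.8):
    print(f"d = {d}: about {n_per_group(d)} participants per group")
```

Under these assumptions, d = 0.31 calls for about 164 participants per group, compared with about 63 for d = 0.5. An exact t-based calculation gives slightly larger numbers.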


The authors evaluate the quality of research reported in major journals in social-personality psychology by ranking those journals with respect to their N-pact Factors (NF), the statistical power of the empirical studies they publish to detect typical effect sizes. Power is a particularly important attribute for evaluating research quality because, relative to studies that have low power, studies that have high power are more likely (a) to provide accurate estimates of effects, (b) to produce literatures with low false positive rates, and (c) to lead to replicable findings. The authors show that the average sample size in social-personality research is 104 and that the power to detect the typical effect size in the field is approximately 50%. Moreover, they show that there is considerable variation among journals in sample sizes and power of the studies they publish, with some journals consistently publishing higher power studies than others. The authors hope that these rankings will be of use to authors who are choosing where to submit their best work, provide hiring and promotion committees with a superior way of quantifying journal quality, and encourage competition among journals to improve their NF rankings.

Concepts: Scientific method, Sample size, Statistical significance, Type I and type II errors, Effect size, Statistical power, Power, Jacob Cohen
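
The relationship between sample size and power that underlies the N-pact Factor above can be sketched with the Fisher z approximation for testing a correlation. The abstract does not state what the "typical effect size" is, so the value r = 0.20 used below is an assumption for illustration only, as are the function name and the two-sided α of 0.05.

```python
import math
from statistics import NormalDist

def power_correlation(r, n, alpha=0.05):
    """Approximate two-sided power to detect a population correlation r with n cases."""
    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha / 2)
    # Fisher z-transformed correlation has standard error 1 / sqrt(n - 3).
    noncentrality = math.atanh(r) * math.sqrt(n - 3)
    return normal.cdf(noncentrality - z_crit) + normal.cdf(-noncentrality - z_crit)

for n in (50, 104, 200, 400):
    print(f"N = {n}: power = {power_correlation(0.20, n):.2f}")
```

With the assumed r = 0.20, a sample of 104 gives power a little above 50%, in line with the figure the abstract reports, and power rises steadily as N increases.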


Proposals to increase research reproducibility frequently call for focusing on effect sizes instead of p values, as well as for increasing the statistical power of experiments. However, it is unclear to what extent these two concepts are indeed taken into account in basic biomedical science. To study this in a real-case scenario, we performed a systematic review of effect sizes and statistical power in studies on learning of rodent fear conditioning, a widely used behavioral task to evaluate memory. Our search criteria yielded 410 experiments comparing control and treated groups in 122 articles. Interventions had a mean effect size of 29.5%, and amnesia caused by memory-impairing interventions was nearly always partial. Mean statistical power to detect the average effect size observed in well-powered experiments with significant differences (37.2%) was 65%, and was lower among studies with non-significant results. Only one article reported a sample size calculation, and our estimated sample size to achieve 80% power considering typical effect sizes and variances (15 animals per group) was reached in only 12.2% of experiments. Actual effect sizes correlated with effect size inferences made by readers on the basis of textual descriptions of results only when findings were non-significant, and neither effect size nor power correlated with study quality indicators, number of citations or impact factor of the publishing journal. In summary, effect sizes and statistical power have a wide distribution in the rodent fear conditioning literature, but do not seem to have a large influence on how results are described or cited. Failure to take these concepts into consideration might limit attempts to improve reproducibility in this field of science.

Concepts: Statistics, Sample size, Statistical significance, Statistical hypothesis testing, Effect size, Meta-analysis, Statistical power, Jacob Cohen


The controversies surrounding the effectiveness of cognitive behavioural therapy and graded exercise therapy for chronic fatigue syndrome are explained using Cohen’s d effect sizes rather than arbitrary thresholds for ‘success’. This article shows that the treatment effects vanish when switching to objective outcomes. The preference for subjective outcomes by the PACE trial team leads to false hope. This article provides a more realistic view, which will help patients and their doctors to evaluate the pros and cons.

Concepts: Effect, Effectiveness, Effect size, Cognitive behavioral therapy, Chronic fatigue syndrome, Jacob Cohen


Functional magnetic resonance imaging (fMRI) research is routinely criticized for being statistically underpowered due to characteristically small sample sizes, and much larger sample sizes are being increasingly recommended. Additionally, various sources of artifact inherent in fMRI data can have a detrimental impact on effect size estimates and statistical power. Here we show how specific removal of non-BOLD artifacts can improve effect size estimation and statistical power in task-fMRI contexts, with particular application to the social-cognitive domain of mentalizing/theory of mind. Non-BOLD variability identification and removal is achieved in a biophysically and statistically principled manner by combining multi-echo fMRI acquisition and independent components analysis (ME-ICA). Without smoothing, group-level effect size estimates on two different mentalizing tasks were enhanced by ME-ICA at a median rate of 24% in regions canonically associated with mentalizing, while much more substantial boosts (40-149%) were observed in non-canonical cerebellar areas. Effect size boosting occurs via reduction of non-BOLD noise at the subject-level and consequent reductions in between-subject variance at the group-level. Smoothing can attenuate ME-ICA-related effect size improvements in certain circumstances. Power simulations demonstrate that ME-ICA-related effect size enhancements enable much higher-powered studies at traditional sample sizes. Cerebellar effects observed after applying ME-ICA may be unobservable with conventional imaging at traditional sample sizes. Thus, ME-ICA allows for principled design-agnostic non-BOLD artifact removal that can substantially improve effect size estimates and statistical power in task-fMRI contexts. ME-ICA could mitigate some issues regarding statistical power in fMRI studies and enable novel discovery of aspects of brain organization that are currently under-appreciated and not well understood.

Concepts: Brain, Sample size, Statistical significance, Type I and type II errors, Magnetic resonance imaging, Effect size, Statistical power, Jacob Cohen