Journal: Behavior Research Methods
This study documents reporting errors in a sample of over 250,000 p-values reported in eight major psychology journals from 1985 until 2013, using the new R package “statcheck.” statcheck retrieved null-hypothesis significance testing (NHST) results from over half of the articles from this period. In line with earlier research, we found that half of all published psychology papers that use NHST contained at least one p-value that was inconsistent with its test statistic and degrees of freedom. One in eight papers contained a grossly inconsistent p-value that may have affected the statistical conclusion. In contrast to earlier findings, we found that the average prevalence of inconsistent p-values has been stable over the years or has declined. The prevalence of gross inconsistencies was higher in p-values reported as significant than in p-values reported as nonsignificant. This could indicate a systematic bias in favor of significant results. Possible solutions for the high prevalence of reporting inconsistencies could be to encourage sharing data, to let co-authors check results in a so-called “co-pilot model,” and to use statcheck to flag possible inconsistencies in one’s own manuscript or during the review process.
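The kind of consistency check described above can be illustrated with a minimal Python sketch. It is not statcheck's implementation (statcheck is an R package). To stay within the standard library, the sketch covers only two-sided z tests. The function names, the rounding rule, and the .05 threshold used to flag a "gross" inconsistency are assumptions made for illustration.

```python
import math

def two_sided_p_from_z(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def check_result(z, reported_p, decimals=2, alpha=0.05):
    """Compare a reported p-value with the one recomputed from z.

    Assumption: a result is 'inconsistent' if the recomputed p-value,
    rounded to the reported number of decimals, differs from the reported
    value, and 'grossly inconsistent' if, in addition, the significance
    decision at alpha changes.
    """
    recomputed = two_sided_p_from_z(z)
    inconsistent = round(recomputed, decimals) != round(reported_p, decimals)
    gross = inconsistent and ((reported_p < alpha) != (recomputed < alpha))
    return recomputed, inconsistent, gross

# Hypothetical reported results: (z statistic, reported p-value)
for z, p in [(1.96, 0.05), (2.50, 0.02), (1.50, 0.03)]:
    recomputed, inconsistent, gross = check_result(z, p)
    print(f"z = {z:.2f}, reported p = {p:.2f}, recomputed p = {recomputed:.4f}, "
          f"inconsistent = {inconsistent}, gross = {gross}")
```

A fuller check would also allow for rounding in the reported test statistic itself, which this sketch ignores.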
MorePower 6.0 is a flexible freeware statistical calculator that computes sample size, effect size, and power statistics for factorial ANOVA designs. It also calculates relational confidence intervals for ANOVA effects based on formulas from Jarmasz and Hollands (Canadian Journal of Experimental Psychology 63:124-138, 2009), as well as Bayesian posterior probabilities for the null and alternative hypotheses based on formulas in Masson (Behavior Research Methods 43:679-690, 2011). The program is unique in affording direct comparison of these three approaches to the interpretation of ANOVA tests. Its high numerical precision and ability to work with complex ANOVA designs could facilitate researchers' attention to issues of statistical power, Bayesian analysis, and the use of confidence intervals for data interpretation. MorePower 6.0 is available at https://wiki.usask.ca/pages/viewpageattachments.action?pageId=420413544 .
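For readers unfamiliar with the Bayesian quantities mentioned above, the following Python sketch shows the BIC approximation to posterior probabilities described in Masson (2011). Whether MorePower 6.0 computes its values in exactly this way is not stated here and should be treated as an assumption. The function name and the example numbers are hypothetical.

```python
import math

def bic_posterior_probabilities(n, sse_ratio, extra_params=1):
    """Approximate posterior probabilities of H0 and H1 from ANOVA output.

    n            : number of independent observations
    sse_ratio    : SSE under H1 divided by SSE under H0
                   (equal to 1 - partial eta squared)
    extra_params : number of additional free parameters under H1
    """
    # Difference in BIC between the alternative and the null model.
    delta_bic = n * math.log(sse_ratio) + extra_params * math.log(n)
    # Bayes factor in favor of the null hypothesis.
    bf01 = math.exp(delta_bic / 2)
    p_h0 = bf01 / (1 + bf01)
    return p_h0, 1 - p_h0

# Hypothetical example: n = 40 and a partial eta squared of .10
p_h0, p_h1 = bic_posterior_probabilities(40, 1 - 0.10)
print(f"p(H0|D) = {p_h0:.3f}, p(H1|D) = {p_h1:.3f}")
```

The approximation assumes equal prior probabilities for the two hypotheses, so values near .5 indicate that the data barely discriminate between them.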
This article introduces the ArduiPod Box, an open-source device built using two main components (i.e., an iPod Touch and an Arduino microcontroller), developed as a low-cost alternative to the standard operant conditioning chamber, or “Skinner box.” Because of its affordability, the ArduiPod Box provides an opportunity for educational institutions with small budgets seeking to set up animal laboratories for research and instructional purposes. A pilot experiment is also presented, which shows that the ArduiPod Box, in spite of its extraordinary simplicity, can be effectively used to study animal learning and behavior.
In covariance structure analysis, two-stage least-squares (2SLS) estimation has been recommended for use over maximum likelihood estimation when model misspecification is suspected. However, 2SLS often fails to provide stable and accurate solutions, particularly for structural equation models with small samples. To address this issue, a regularized extension of 2SLS is proposed that integrates a ridge type of regularization into 2SLS, thereby enabling the method to effectively handle the small-sample-size problem. Results are then reported of a Monte Carlo study conducted to evaluate the performance of the proposed method, as compared to its nonregularized counterpart. Finally, an application is presented that demonstrates the empirical usefulness of the proposed method.
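The general idea of adding a ridge penalty to 2SLS can be sketched in Python for the simplest case of one endogenous regressor and one instrument. This is not the authors' estimator: where the penalty enters in the proposed method, and how its value is chosen, are not stated above. The function name, the placement of the penalty in the second stage, and the toy data are all assumptions made for illustration.

```python
def ridge_2sls_slope(z, x, y, ridge=0.0):
    """Slope of y on x, using z as the instrument, with an optional ridge penalty.

    All three lists are assumed to be mean-centered, so no intercept is fitted.
    With ridge = 0 this reduces to ordinary 2SLS.
    """
    zz = sum(zi * zi for zi in z)
    zx = sum(zi * xi for zi, xi in zip(z, x))
    # Stage 1: project the endogenous regressor x onto the instrument z.
    x_hat = [zi * zx / zz for zi in z]
    # Stage 2: regress y on the fitted values; ridge > 0 shrinks the slope.
    num = sum(xh * yi for xh, yi in zip(x_hat, y))
    den = sum(xh * xh for xh in x_hat) + ridge
    return num / den

# Hypothetical centered toy data in which y is exactly 2 * x
z = [-2.0, -1.0, 0.0, 1.0, 2.0]
x = [-1.9, -1.2, 0.1, 0.9, 2.1]
y = [2 * xi for xi in x]

print(ridge_2sls_slope(z, x, y))             # ordinary 2SLS
print(ridge_2sls_slope(z, x, y, ridge=1.0))  # shrunken estimate
```

The penalty trades a small bias toward zero for lower variance, which is the usual motivation for ridge-type regularization in small samples.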
Psychological researchers have begun to use Amazon’s Mechanical Turk (MTurk) marketplace as a participant pool. Past work has established that MTurk is well suited to examining individual behavior. However, pseudo-dyadic interactions, in which participants falsely believe they are interacting with a partner, are also a key element of social and cognitive psychology. The ability to conduct such interdependent research on MTurk would increase the utility of this online population for a broad range of psychologists. The present research therefore attempts to qualitatively replicate well-established pseudo-dyadic tasks on MTurk in order to establish the utility of this platform as a tool for researchers. We find that participants do behave as if a partner is real, even when doing so incurs a financial cost, and that they are sensitive to subtle information about the partner in a minimal-groups paradigm, supporting the use of MTurk for pseudo-dyadic research.
Tachistoscopes allow the delivery of brief visual stimuli, which is crucial for experiments in which subliminal presentation is required. Up to now, tachistoscopes have had shortcomings with respect to timing accuracy, reliability, and flexibility of use. Here, we present a new and inexpensive two-channel tachistoscope that allows for exposure durations in the submillisecond range with an extremely high timing accuracy. The tachistoscope consists of two standard liquid-crystal display (LCD) monitors of the light-emitting diode (LED) backlight type, a semipermeable mirror, a mounting rack, and an experimental personal computer (PC). The monitors have been modified to provide external access to the LED backlights, which are controlled by the PC via the standard parallel port. Photodiode measurements confirmed reliable operation of the tachistoscope and revealed switching times of 3 μs. Our method may also be of great advantage in single-monitor setups, in which it allows for manipulating the stimulus timing with submillisecond precision in many experimental situations. Where this is not applicable, the monitor can be operated in standard mode by disabling the external backlight control instantaneously.
We report object-naming and object recognition times collected from Russian native speakers for the colorized version of the Snodgrass and Vanderwart (Journal of Experimental Psychology: Human Learning and Memory 6:174-215, 1980) pictures (Rossion & Pourtois, Perception 33:217-236, 2004). New norms for image variability, body-object interaction [BOI], and subjective frequency collected in Russian, as well as new name agreement scores for the colorized pictures in French, are also reported. In both object-naming and object comprehension times, the name agreement, image agreement, and age-of-acquisition variables made significant independent contributions. Objective word frequency was reliable in object-naming latencies only. The variables of image variability, BOI, and subjective frequency were not significant in either object naming or object comprehension. Finally, imageability was reliable in both tasks. The new norms and object-naming and object recognition times are provided as supplemental materials.
The aim of this article is to describe a database of diphone positional frequencies in French. More specifically, we provide frequencies for word-initial, word-internal, and word-final diphones of all words extracted from a subtitle corpus of 50 million words that come from movie and TV series dialogue. We also provide intra- and intersyllable diphone frequencies, as well as interword diphone frequencies. To our knowledge, no other such tool is available to psycholinguists for the study of French sequential probabilities. This database and its new indicators should help researchers conducting new studies on speech segmentation.
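To make the notion of positional diphone frequencies concrete, the following Python sketch counts word-initial, word-internal, and word-final diphones in a small lexicon. It is only an illustration, not the procedure used to build the database. The function name, the toy phoneme sequences and frequencies, and the treatment of two-phoneme words (whose single diphone is counted as both initial and final) are assumptions.

```python
from collections import Counter

def positional_diphone_counts(words):
    """Count diphones by position.

    words: iterable of (phoneme_sequence, frequency) pairs, where
    phoneme_sequence is a tuple of phoneme symbols.
    """
    counts = {"initial": Counter(), "internal": Counter(), "final": Counter()}
    for phonemes, freq in words:
        # Adjacent phoneme pairs within the word.
        diphones = list(zip(phonemes, phonemes[1:]))
        last = len(diphones) - 1
        for i, diphone in enumerate(diphones):
            if i == 0:
                counts["initial"][diphone] += freq
            if i == last:
                counts["final"][diphone] += freq
            if 0 < i < last:
                counts["internal"][diphone] += freq
    return counts

# Hypothetical toy lexicon: placeholder phoneme symbols and token frequencies
toy_lexicon = [
    (("p", "a", "t"), 2),
    (("p", "a"), 1),
    (("s", "t", "a", "r"), 1),
]

counts = positional_diphone_counts(toy_lexicon)
print(counts["initial"])
print(counts["internal"])
print(counts["final"])
```

Weighting each diphone by the word's corpus frequency gives token-based counts. Passing a frequency of 1 for every word would give type-based counts instead.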
This article describes a laboratory system for running learning experiments in operant chambers with various species. It is based on a modern version of a classical learning chamber for operant conditioning, the so-called “Skinner box.” Rather than constituting a stand-alone unit, as is usually the case, it is an integrated part of a comprehensive technical solution, thereby eliminating a number of practical problems that are frequently encountered in research on animal learning and behavior. The Vienna comparative cognition technology combines modern computer, stimulus presentation, and reinforcement technology with flexibility and user-friendliness, which allows for efficient, largely automated cross-species experimentation and thus makes the system appropriate for use in a broad range of learning tasks.
As human randomness production has come to be more closely studied and used to assess executive functions (especially inhibition), many normative measures for assessing the degree to which a sequence is randomlike have been suggested. However, each of these measures focuses on one feature of randomness, leading researchers to have to use multiple measures. Although algorithmic complexity has been suggested as a means for overcoming this inconvenience, it has never been used, because standard Kolmogorov complexity is inapplicable to short strings (e.g., of length l ≤ 50), due to both computational and theoretical limitations. Here, we describe a novel technique (the coding theorem method) based on the calculation of a universal distribution, which yields an objective and universal measure of algorithmic complexity for short strings that approximates Kolmogorov-Chaitin complexity.
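The core relation behind the coding theorem method can be sketched in a few lines of Python: the complexity of a string is approximated by the negative base-2 logarithm of its probability under the universal distribution. The sketch below assumes such output probabilities are already available. The table values are invented placeholders, not actual results, and the names are hypothetical.

```python
import math

# Hypothetical output probabilities for a few short binary strings.
# Real values would come from an approximation of the universal
# distribution; these numbers are placeholders for illustration only.
TOY_DISTRIBUTION = {
    "0000": 0.25,
    "0101": 0.125,
    "0110": 0.0625,
}

def ctm_complexity(string, distribution):
    """Approximate algorithmic complexity as -log2 of the output probability."""
    return -math.log2(distribution[string])

for s in TOY_DISTRIBUTION:
    print(s, ctm_complexity(s, TOY_DISTRIBUTION))
```

Strings that are produced more often receive lower complexity values, so a single number can stand in for the several feature-specific randomness measures mentioned above.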