Concept: Applied mathematics


Machine Learning (ML) methods have been proposed in the academic literature as alternatives to statistical ones for time series forecasting. Yet, scant evidence is available about their relative performance in terms of accuracy and computational requirements. The purpose of this paper is to evaluate such performance across multiple forecasting horizons using a large subset of 1045 monthly time series used in the M3 Competition. After comparing the post-sample accuracy of popular ML methods with that of eight traditional statistical ones, we found that the former are dominated across both accuracy measures used and for all forecasting horizons examined. Moreover, we observed that their computational requirements are considerably greater than those of statistical methods. The paper discusses the results, explains why the accuracy of ML models is below that of statistical ones and proposes some possible ways forward. The empirical results found in our research stress the need for objective and unbiased ways to test the performance of forecasting methods that can be achieved through sizable and open competitions allowing meaningful comparisons and definite conclusions.
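
As an illustration of how post-sample accuracy can be compared horizon by horizon, the sketch below scores a naive forecast on a held-out tail of a toy series. The abstract does not name its two accuracy measures, so the symmetric mean absolute percentage error (sMAPE), the helper names `smape` and `naive_forecast`, and the toy data are all assumptions made for illustration, not the paper's setup.

```python
def smape(actual, forecast):
    """Symmetric MAPE in percent: mean of 200 * |a - f| / (|a| + |f|)."""
    terms = []
    for a, f in zip(actual, forecast):
        denom = abs(a) + abs(f)
        # Define the term as 0 when both values are 0 to avoid dividing by zero.
        terms.append(0.0 if denom == 0 else 200.0 * abs(a - f) / denom)
    return sum(terms) / len(terms)


def naive_forecast(history, horizon):
    """Hypothetical baseline: repeat the last observed value."""
    return [history[-1]] * horizon


# Toy monthly series (made-up numbers, not M3 data).
series = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118]
horizon = 3
train, test = series[:-horizon], series[-horizon:]
forecast = naive_forecast(train, horizon)

# Post-sample accuracy accumulated over horizons 1..h.
for h in range(1, horizon + 1):
    print(f"horizons 1-{h}: sMAPE = {smape(test[:h], forecast[:h]):.2f}%")
```

Any competing method can be dropped in place of `naive_forecast` as long as it sees only `train`, which keeps the comparison strictly out of sample.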

Concepts: Machine learning, Formal sciences, Biostatistics, Applied mathematics, Statistics, Time series, Mathematics, Scientific method


Alternating projection algorithms are easy to implement and effective for large-scale complex optimization problems, such as constrained reconstruction in X-ray computed tomography (CT). A typical method uses projection onto convex sets (POCS) for data fidelity and non-negativity constraints, combined with total variation (TV) minimization (so-called TV-POCS), for sparse-view CT reconstruction. However, this type of method relies on empirically selected parameters for satisfactory reconstruction, is generally slow and lacks a convergence analysis. In this work, we use a convex feasibility set approach to address the problems associated with TV-POCS and propose a framework using full sequential alternating projections or POCS (FS-POCS) to find the solution in the intersection of convex constraints of bounded TV function, bounded data fidelity error and non-negativity. The rationale behind FS-POCS is that the mathematically optimal solution of the constrained objective function may not be the physically optimal solution. Breaking the constrained reconstruction down into an intersection of several feasible sets can lead to faster convergence and to reconstruction parameters that are quantified in a physically meaningful way rather than by empirical trial and error. In addition, first-order methods are usually used for large-scale optimization problems. We derive the condition for convergence of gradient-based methods and use a primal-dual hybrid gradient (PDHG) method for fast convergence of bounded TV. The newly proposed FS-POCS is evaluated and compared with TV-POCS and another convex feasibility projection method (CPTV) using both digital phantom and pseudo-real CT data to show its superior performance in reconstruction speed, image quality and quantification.
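
The basic mechanism of sequential projections onto convex sets can be sketched in a few lines. The example below is not FS-POCS: it leaves out the bounded-TV set and the PDHG step, and it replaces the bounded data-fidelity set with exact hyperplane constraints. The tiny system matrix `A`, the data `b` and all function names are hypothetical.

```python
def project_hyperplane(x, a, b):
    """Orthogonal projection of x onto the convex set {z : a . z = b}."""
    residual = sum(ai * xi for ai, xi in zip(a, x)) - b
    norm_sq = sum(ai * ai for ai in a)
    return [xi - residual * ai / norm_sq for ai, xi in zip(a, x)]


def project_nonnegative(x):
    """Orthogonal projection of x onto the non-negative orthant."""
    return [max(xi, 0.0) for xi in x]


def pocs(A, b, n_iter=200):
    """Cycle through one data hyperplane per row, then non-negativity."""
    x = [0.0] * len(A[0])
    for _ in range(n_iter):
        for row, bi in zip(A, b):
            x = project_hyperplane(x, row, bi)
        x = project_nonnegative(x)
    return x


# Toy "measurements": two ray sums over three pixels (assumed values).
A = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]
b = [3.0, 0.5]

x = pocs(A, b)
residuals = [sum(a * v for a, v in zip(row, x)) - bi for row, bi in zip(A, b)]
print("estimate:", x)
print("data residuals:", residuals)
```

In this toy system the minimum-norm solution of the two equations has a negative third component, so the non-negativity projection changes the result rather than being a no-op.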

Concepts: Mathematics, Optimization problem, Optimization, Applied mathematics, Tomography, Scientific method, Medical imaging, Tomographic reconstruction


How rare are magic squares? So far, the exact number of magic squares of order n is only known for n ≤ 5. For larger squares, we need statistical approaches for estimating the number. For this purpose, we formulated the problem as a combinatorial optimization problem and applied the Multicanonical Monte Carlo method (MMC), which has been developed in the field of computational statistical physics. Among all the possible arrangements of the numbers 1, 2, …, n² in an n × n square, the probability of finding a magic square decreases faster than the exponential of n. We estimated the number of magic squares for n ≤ 30. The number of magic squares for n = 30 was estimated to be 6.56(29) × 10²⁰⁵⁶ and the corresponding probability is as small as 10⁻²¹². Thus the MMC is effective for counting very rare configurations.
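
For the smallest non-trivial order, the rarity can be checked by exhaustive enumeration. The sketch below is a brute-force count, not the Multicanonical Monte Carlo method. It treats rotations and reflections as distinct arrangements, and the function names are hypothetical.

```python
from itertools import permutations
from math import factorial


def is_magic(cells, n):
    """True if the flat tuple `cells` is an n x n magic square."""
    target = n * (n * n + 1) // 2  # common sum of rows, columns, diagonals
    rows = [cells[i * n:(i + 1) * n] for i in range(n)]
    if any(sum(row) != target for row in rows):
        return False
    if any(sum(rows[i][j] for i in range(n)) != target for j in range(n)):
        return False
    if sum(rows[i][i] for i in range(n)) != target:
        return False
    return sum(rows[i][n - 1 - i] for i in range(n)) == target


def count_magic_squares(n):
    """Count magic arrangements of 1..n^2 by testing every permutation."""
    return sum(1 for p in permutations(range(1, n * n + 1)) if is_magic(p, n))


n = 3
count = count_magic_squares(n)
total = factorial(n * n)
print(f"n = {n}: {count} magic arrangements out of {total}")
print(f"probability = {count / total:.3e}")
```

Already at n = 4 there are 16! (about 2 × 10¹³) arrangements to test, which is why exhaustive search stops being practical almost immediately and sampling methods are needed for larger n.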

Concepts: Optimization, Monte Carlo method, Monte Carlo, Combinatorial optimization, Probability theory, Statistics, Applied mathematics, Mathematics


Population numbers at local levels are fundamental data for many applications, including the delivery and planning of services, election preparation, and response to disasters. In resource-poor settings, recent and reliable demographic data at subnational scales can often be lacking. National population and housing census data can be outdated, inaccurate, or missing key groups or areas, while registry data are generally lacking or incomplete. Moreover, at local scales accurate boundary data are often limited, and high rates of migration and urban growth make existing data quickly outdated. Here we review past and ongoing work aimed at producing spatially disaggregated local-scale population estimates, and discuss how new technologies are now enabling robust and cost-effective solutions. Recent advances in the availability of detailed satellite imagery, geopositioning tools for field surveys, statistical methods, and computational power are enabling the development and application of approaches that can estimate population distributions at fine spatial scales across entire countries in the absence of census data. We outline the potential of such approaches as well as their limitations, emphasizing the political and operational hurdles for acceptance and sustainable implementation of new approaches, and the continued importance of traditional sources of national statistical data.

Concepts: Applied mathematics, Sociology, Data, Social research, Spatial analysis, Mathematics, Demography, Statistics


RELION, for REgularized LIkelihood OptimizatioN, is an open-source computer program for the refinement of macromolecular structures by single-particle analysis of electron cryo-microscopy (cryo-EM) data. Whereas alternative approaches often rely on user expertise for the tuning of parameters, RELION uses a Bayesian approach to infer parameters of a statistical model from the data. This paper describes developments that reduce the computational costs of the underlying maximum a posteriori (MAP) algorithm, as well as statistical considerations that yield new insights into the accuracy with which the relative orientations of individual particles may be determined. A so-called gold-standard Fourier shell correlation (FSC) procedure to prevent overfitting is also described. The resulting implementation yields high-quality reconstructions and reliable resolution estimates with minimal user intervention and at acceptable computational costs.

Concepts: Resolution, Bayesian inference, Statistical inference, Programming language, Computer program, Algorithm, Applied mathematics, Computer


We consider exact enumerations and probabilistic properties of ranked trees when generated under the random coalescent process. Using a new approach, based on generating functions, we derive several statistics such as the exact probability of finding k cherries in a ranked tree of fixed size n. We then extend our method to consider also the number of pitchforks. We find a recursive formula to calculate the joint and conditional probabilities of cherries and pitchforks when the size of the tree is fixed. These results provide insights into structural properties of coalescent trees under the model of neutral evolution.
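
The distribution of the number of cherries can also be computed with a simple forward recursion, sketched below. This is the standard leaf-splitting argument rather than the generating-function approach of the paper, and it does not cover pitchforks. It assumes that the tree shape for n + 1 leaves arises by splitting a uniformly chosen leaf of a tree with n leaves: splitting a leaf that sits in a cherry keeps the count at k, and splitting any other leaf raises it to k + 1. The function name is hypothetical.

```python
from fractions import Fraction


def cherry_distribution(n):
    """Return {k: P(k cherries)} for a tree with n >= 2 leaves."""
    dist = {1: Fraction(1)}  # a 2-leaf tree is a single cherry
    for m in range(2, n):  # grow from m leaves to m + 1 leaves
        new = {}
        for k, p in dist.items():
            # 2k of the m leaves belong to a cherry: count stays at k.
            stay = Fraction(2 * k, m)
            if stay:
                new[k] = new.get(k, Fraction(0)) + p * stay
            # The other m - 2k leaves create a new cherry when split.
            grow = Fraction(m - 2 * k, m)
            if grow:
                new[k + 1] = new.get(k + 1, Fraction(0)) + p * grow
        dist = new
    return dist


for size in (4, 6, 10):
    dist = cherry_distribution(size)
    mean = sum(k * p for k, p in dist.items())
    print(size, {k: str(p) for k, p in sorted(dist.items())}, "mean =", mean)
```

Exact fractions are used so that the probabilities can be compared directly with closed-form results.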

Concepts: Mathematics, Applied mathematics, Decision theory, Conditional probability, Probability space, Probability theory, Probability, Event


The aim of the current investigation is to develop and statistically optimize nanoethosomes for transdermal valsartan delivery. A Box-Behnken experimental design was applied to optimize the nanoethosomes. The independent variables were phospholipids 90G (X₁), ethanol (X₂), valsartan (X₃) and sonication time (X₄), while entrapment efficiency (Y₁), vesicle size (Y₂) and flux (Y₃) were the dependent variables. The optimized formulation was then tested in rats in an in vivo pharmacokinetic study. Results indicate that valsartan nanoethosomes provide better flux, reasonable entrapment efficiency and greater effectiveness for transdermal delivery than rigid liposomes. The optimized nanoethosomal formulation, with a mean particle size of 103 ± 5.0 nm, showed 89.34 ± 2.54% entrapment efficiency and achieved a mean transdermal flux of 801.36 ± 21.45 μg/cm²/h. Nanoethosomes proved significantly superior in terms of the amount of drug permeated in the skin, with an enhancement ratio of 43.38 ± 1.37 compared with rigid liposomes. Confocal laser scanning microscopy revealed enhanced permeation of Rhodamine-Red-loaded nanoethosomes to the deeper layers of the skin compared with conventional liposomes. The in vivo pharmacokinetic study of the nanoethosomal transdermal therapeutic system showed a significant increase in bioavailability (3.03 times) compared with an oral suspension of valsartan. Our results suggest that nanoethosomes are an efficient carrier for transdermal delivery of valsartan.
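
To make the design concrete, the sketch below generates the coded runs of a Box-Behnken design using the usual pairwise construction for three to five factors: every pair of factors is set to all four (±1, ±1) combinations while the remaining factors stay at their centre level 0. The number of centre points, the factor labels and the function name are assumptions, since the abstract does not report the run list.

```python
from itertools import combinations, product


def box_behnken(n_factors, n_center=3):
    """Coded Box-Behnken runs (-1, 0, +1) built from factor pairs."""
    runs = []
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            run = [0] * n_factors
            run[i], run[j] = a, b
            runs.append(tuple(run))
    # Replicated centre points (the count is an assumption).
    runs.extend([(0,) * n_factors] * n_center)
    return runs


# Hypothetical labels for the four independent variables in the abstract.
factors = ("phospholipid", "ethanol", "valsartan", "sonication_time")
design = box_behnken(len(factors))
print(len(design), "runs")
for run in design[:6]:
    print(dict(zip(factors, run)))
```

Each coded level would then be mapped to a real low, middle or high setting of its factor before the formulations are prepared.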

Concepts: Confocal laser scanning microscopy, Pharmacokinetics, Optimization, Skin, Applied mathematics, Pharmacology, Transdermal patch, Statistics


In surveillance of subterranean fauna, especially in the case of rare or elusive aquatic species, traditional techniques used for epigean species are often not feasible. We developed a non-invasive survey method based on environmental DNA (eDNA) to detect the presence of the red-listed cave-dwelling amphibian, Proteus anguinus, in the caves of the Dinaric Karst. We tested the method in fifteen caves in Croatia where the species had previously been recorded or was expected to occur. We successfully confirmed the presence of P. anguinus in ten caves and detected the species for the first time in five others. Using a hierarchical occupancy model, we compared the availability and detection probability of eDNA for two water sampling methods, filtration and precipitation. The statistical analysis showed that both availability and detection probability depended on the method, and estimates of both probabilities were higher for filtration samples than for precipitation samples. Combining reliable field and laboratory methods with robust statistical modeling will give the best estimates of species occurrence.

Concepts: Croatia, Official statistics, Survey sampling, Probability, Statistics, Applied mathematics, Probability theory, Slovenia


For privacy and practical reasons, it is sometimes necessary to minimize sharing of individual-level information in multisite studies. However, individual-level information is often needed to perform more rigorous statistical analysis.

Concepts: Applied mathematics, Statistic, Statistics, Ronald Fisher, Mathematical analysis, Data, Social research, Scientific method


In hair cells, mechanotransduction channels are located in the membrane of stereocilia tips, where the base of the tip link is attached. The tip-link force determines the system of other forces in the immediate channel environment, which change the channel open probability. This system of forces includes components that are out of plane and in plane relative to the membrane; the magnitude and direction of these components depend on the channel environment and arrangement. Using a computational model, we obtained the major forces involved as functions of the force applied via the tip link at the center of the membrane. We simulated factors related to channels and the membrane, including finite-sized channels located centrally or acentrally, stiffness of the hypothesized channel-cytoskeleton tether, and bending modulus of the membrane. Membrane forces are perpendicular to the directions of the principal curvatures of the deformed membrane. Our approach allows for a fine vectorial picture of the local forces gating the channel; membrane forces change with the membrane curvature and are themselves sufficient to affect the open probability of the channel.

Concepts: Applied mathematics, Continuum mechanics, Universe, General relativity, Cochlea, Classical mechanics, Action potential, Curvature