SciCombinator


Concept: Algorithmic efficiency

171

Network motifs are small connected sub-graphs that have recently attracted much attention as a way to discover the structural behavior of large, complex networks. Finding motifs of any size is one of the most important problems in the analysis of such networks, and it requires fast and reliable algorithms and tools. CytoKavosh is one of the best choices for finding motifs of any given size in any complex network. It relies on a fast algorithm, Kavosh, which makes it faster than other existing tools. The Kavosh algorithm combines some well-known algorithmic features with less obvious techniques, which together make it efficient in this field. CytoKavosh is a Cytoscape plug-in that finds motifs of a given size in a network (directed or undirected) that has already been loaded into the Cytoscape workspace. Its high performance is achieved by dynamically linking highly optimized functions of Kavosh’s C++ implementation to the Cytoscape Java program, which makes the plug-in suitable for analyzing large biological networks. Significant attributes of CytoKavosh are its efficiency in time and memory usage and the absence of any implementation-related limit on motif size. CytoKavosh is implemented in the visual environment of Cytoscape, which makes it convenient for users to interact with it and to create visual options for analyzing the structural behavior of a network. The plug-in can work on any given network, is very simple to use, and generates graphical results of the discovered motifs with any required details. No existing Cytoscape plug-in is dedicated to finding network motifs based on the original concept. We therefore introduce CytoKavosh as the first such plug-in, and we hope that it can be improved to cover other options and become the best motif-analyzing tool.

Concepts: Algorithm, Graph theory, Social network, Network theory, Network science, Complex network, Discrete mathematics, Algorithmic efficiency
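
As a rough illustration of the first step of motif finding described in the CytoKavosh entry above (listing the connected sub-graphs of a given size), the following is a minimal brute-force sketch. It is not the Kavosh algorithm, whose details the abstract does not give; the function names `is_connected` and `connected_subgraphs` are hypothetical, and the graph is assumed to be undirected and given as an edge list.

```python
from itertools import combinations


def is_connected(nodes, adj):
    """Return True if the sub-graph induced by `nodes` is connected."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        # Only follow edges that stay inside the candidate node set.
        for v in adj[u] & nodes:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def connected_subgraphs(edges, k):
    """List every k-node set whose induced sub-graph is connected.

    Brute force: tests all k-combinations of nodes, so it is only
    practical for small graphs (an assumption of this sketch).
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return [c for c in combinations(sorted(adj), k) if is_connected(c, adj)]


if __name__ == "__main__":
    # A triangle (1, 2, 3) with a pendant node 4 attached to node 3.
    edges = [(1, 2), (2, 3), (3, 1), (3, 4)]
    print(connected_subgraphs(edges, 3))
```

A full motif finder would additionally group these sub-graphs by isomorphism class and compare their frequencies against randomized networks; both steps are omitted here.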

170

Reverse engineering gene networks and identifying regulatory interactions are integral to understanding cellular decision-making processes. Advances in high-throughput experimental techniques have initiated innovative data-driven analysis of gene regulatory networks. However, the inherent noise associated with biological systems requires numerous experimental replicates for reliable conclusions. Furthermore, few robust algorithms directly exploit basic biological traits. Such algorithms are expected to be efficient in their performance and robust in their prediction.

Concepts: Algorithm, Decision making, Engineering, Feedback, Gene regulatory network, Systems biology, Discrete mathematics, Algorithmic efficiency

138

The underwater acoustic sensor network (UWASN) is a system that exchanges data between numerous sensor nodes deployed in the sea. The UWASN uses an underwater acoustic communication technique to exchange data. Therefore, it is important to design a robust system that will function even in severely fluctuating underwater communication conditions, along with variations in the ocean environment. In this paper, a new algorithm to find the optimal deployment positions of underwater sensor nodes is proposed. The algorithm uses the communication performance surface, which is a map showing the underwater acoustic communication performance of a targeted area. A virtual force-particle swarm optimization algorithm is then used as an optimization technique to find the optimal deployment positions of the sensor nodes, using the performance surface information to estimate the communication radii of the sensor nodes in each generation. The algorithm is evaluated by comparing simulation results between two different seasons (summer and winter) for an area located off the eastern coast of Korea as the selected targeted area.

Concepts: Operations research, Optimization, Ocean, Particle swarm optimization, Algorithmic efficiency, Exchange, Microsoft Exchange Server, Underwater acoustic communication
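
To make the optimization step in the UWASN entry above more concrete, here is a minimal sketch of plain particle swarm optimization. It deliberately omits the virtual-force component and the communication performance surface, which the abstract does not specify; the function name `pso`, the parameter values, and the test objective are all assumptions chosen for illustration.

```python
import random


def pso(f, dim, bounds, n_particles=30, n_iters=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize f over a box using a basic global-best PSO.

    w is the inertia weight; c1 and c2 weight the pull toward each
    particle's own best position and the swarm's best position.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


if __name__ == "__main__":
    # Hypothetical objective: squared distance from the origin.
    best, value = pso(lambda x: sum(v * v for v in x), dim=2, bounds=(-5.0, 5.0))
    print(best, value)
```

In a deployment problem each particle would encode the coordinates of all sensor nodes, and the objective would score coverage or connectivity rather than a simple test function.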

31

Deep learning has become a promising approach for automated support for clinical diagnosis. When medical data samples are limited, collaboration among multiple institutions is necessary to achieve high algorithm performance. However, sharing patient data often has limitations due to technical, legal, or ethical concerns. In this study, we propose methods of distributing deep learning models as an attractive alternative to sharing patient data.

Concepts: Medicine, Medical imaging, Physician, Distribution, Machine learning, Psychiatry, Technical support, Algorithmic efficiency

24

This paper reviews recent studies on the Particle Swarm Optimization (PSO) algorithm. The review focuses on recent high-impact articles that have analyzed and/or modified PSO algorithms. This paper also presents some potential areas for future study.

Concepts: Algorithm, Peer review, Optimization, Particle swarm optimization, Ant colony optimization, Swarm intelligence, Algorithmic efficiency, Articles with example pseudocode

23

The objective of this study was to quantify any improvement obtained with the GE ‘Sharp IR’ point-spread function (PSF) reconstruction algorithm when used in addition to the ordered subsets expectation maximization (OSEM) and time-of-flight (TOF) reconstruction algorithms, and to establish the optimum parameters to be used in clinical studies.

Concepts: Algorithm, Mathematics, Placebo, Graph theory, Programming language, Linear programming, Reconstruction algorithm, Algorithmic efficiency

22

Detecting overlapping communities is essential to analyzing and exploring natural networks such as social networks, biological networks, and citation networks. However, most existing approaches do not scale to the size of networks that we regularly observe in the real world. In this paper, we develop a scalable approach to community detection that discovers overlapping communities in massive real-world networks. Our approach is based on a Bayesian model of networks that allows nodes to participate in multiple communities, and a corresponding algorithm that naturally interleaves subsampling from the network and updating an estimate of its communities. We demonstrate how we can discover the hidden community structure of several real-world networks, including 3.7 million US patents, 575,000 physics articles from the arXiv preprint server, and 875,000 connected Web pages from the Internet. Furthermore, we demonstrate on large simulated networks that our algorithm accurately discovers the true community structure. This paper opens the door to using sophisticated statistical models to analyze massive networks.

Concepts: Scientific method, Sociology, Observation, Philosophy of science, Networks, Complex network, Algorithmic efficiency, M. Scott Peck

12

We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.

Concepts: Gene, Genetics, Bioinformatics, Mathematics, Computer, Computer science, Computational science, Algorithmic efficiency
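
The range operations named in the Bioconductor entry above (overlap detection and coverage calculation) can be sketched in a few lines. This is an illustrative Python sketch, not the implementation used by IRanges or GenomicRanges; the function names `find_overlaps` and `coverage` are hypothetical, and ranges are assumed to be 1-based closed integer intervals `(start, end)`.

```python
from bisect import bisect_right


def find_overlaps(queries, subjects):
    """Return (query_index, subject_index) pairs of overlapping ranges.

    Subjects are sorted by start so that a binary search can discard
    every subject that begins after the query ends.
    """
    order = sorted(range(len(subjects)), key=lambda j: subjects[j][0])
    starts = [subjects[j][0] for j in order]
    hits = []
    for qi, (q_start, q_end) in enumerate(queries):
        upper = bisect_right(starts, q_end)
        for j in order[:upper]:
            # The subject starts in time; it overlaps if it also ends late enough.
            if subjects[j][1] >= q_start:
                hits.append((qi, j))
    return hits


def coverage(ranges, length):
    """Count how many ranges cover each position 1..length.

    Assumes every range satisfies 1 <= start <= end <= length.
    """
    diff = [0] * (length + 2)
    for start, end in ranges:
        diff[start] += 1
        diff[end + 1] -= 1
    depth, running = [], 0
    for position in range(1, length + 1):
        running += diff[position]
        depth.append(running)
    return depth


if __name__ == "__main__":
    queries = [(5, 10), (20, 25)]
    subjects = [(1, 5), (8, 12), (11, 19), (25, 30)]
    print(find_overlaps(queries, subjects))
    print(coverage([(1, 3), (2, 5)], 6))
```

The filtering loop is linear in the worst case; interval trees or nested containment lists are the usual way to bound it when many long subjects are present.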

10

Several algorithms exist for detecting copy number variants (CNVs) from human exome sequencing read depth, but previous tools have not been well-suited for large population studies on the order of tens or hundreds of thousands of exomes. Their limitations include being difficult to integrate into automated variant-calling pipelines and being ill-suited for detecting common variants. To address these issues, we developed a new algorithm, Copy number estimation using Lattice-Aligned Mixture Models (CLAMMS), which is highly scalable and suitable for detecting CNVs across the whole allele frequency spectrum.

Concepts: Gene, Genetics, Algorithm, Molecular biology, Copy number variation, Allele frequency, Analysis of algorithms, Algorithmic efficiency

8

The K-means algorithm is one of the most popular clustering algorithms in current use as it is relatively fast yet simple to understand and deploy in practice. Nevertheless, its use entails certain restrictive assumptions about the data, the negative consequences of which are not always immediately apparent, as we demonstrate. While more flexible algorithms have been developed, their widespread use has been hindered by their computational and technical complexity. Motivated by these considerations, we present a flexible alternative to K-means that relaxes most of the assumptions, whilst remaining almost as fast and simple. This novel algorithm, which we call MAP-DP (maximum a-posteriori Dirichlet process mixtures), is statistically rigorous as it is based on nonparametric Bayesian Dirichlet process mixture modeling. This approach allows us to overcome most of the limitations imposed by K-means. The number of clusters K is estimated from the data instead of being fixed a-priori as in K-means. In addition, while K-means is restricted to continuous data, the MAP-DP framework can be applied to many kinds of data, for example, binary, count or ordinal data. It can also efficiently separate outliers from the data. This additional flexibility does not incur a significant computational overhead compared to K-means, with MAP-DP convergence typically achieved in the order of seconds for many practical problems. Finally, in contrast to K-means, since the algorithm is based on an underlying statistical model, the MAP-DP framework can deal with missing data and enables model testing such as cross validation in a principled way. We demonstrate the simplicity and effectiveness of this algorithm on the health informatics problem of clinical sub-typing in a cluster of diseases known as parkinsonism.

Concepts: Cluster analysis, Statistics, Machine learning, Computational complexity theory, Mixture model, K-means clustering, Algorithmic efficiency
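
For reference, the baseline that the MAP-DP entry above compares against can be sketched as follows. This is standard K-means (Lloyd's algorithm), not MAP-DP; the function name `kmeans` and the choice to pass the initial centers explicitly are assumptions made to keep the sketch deterministic. Note that the number of clusters is fixed in advance by the number of initial centers, which is the restriction the abstract highlights.

```python
def kmeans(points, centers, n_iters=100):
    """Cluster `points` around len(centers) centroids with Lloyd's algorithm.

    Alternates between assigning each point to its nearest center and
    moving each center to the mean of its assigned points.
    """
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(n_iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(len(centers)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centers[k])))
            for p in points
        ]
        # Update step: recompute each center as the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centers.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                # Keep an empty cluster's center where it was.
                new_centers.append(centers[k])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


if __name__ == "__main__":
    points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, labels = kmeans(points, centers=[(0, 0), (10, 10)])
    print(centers, labels)
```

Because every point is assigned purely by Euclidean distance, the sketch also makes visible why K-means suits continuous data and has no built-in notion of outliers or missing values.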