The maximum clique enumeration (MCE) problem asks that we identify all maximum cliques in a finite, simple graph. MCE is closely related to two other well-known and widely-studied problems: the maximum clique optimization problem, which asks us to determine the size of a largest clique, and the maximal clique enumeration problem, which asks that we compile a listing of all maximal cliques. Naturally, these three problems are NP-hard, given that they subsume the classic version of the NP-complete clique decision problem. MCE can be solved in principle with standard enumeration methods due to Bron, Kerbosch, Kose and others. Unfortunately, these techniques are ill-suited to graphs encountered in our applications. We must solve MCE on instances deeply seeded in data mining and computational biology, where high-throughput data capture often creates graphs of extreme size and density. MCE can also be solved in principle using more modern algorithms based in part on vertex cover and the theory of fixed-parameter tractability (FPT). While FPT is an improvement, these algorithms too can fail to scale sufficiently well as the sizes and densities of our datasets grow.
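As a point of reference for the standard enumeration methods mentioned above, the following is a minimal sketch of Bron-Kerbosch enumeration with pivoting, followed by a filter that keeps only the largest cliques. It is not the authors' algorithm: the function names (`build_adjacency`, `maximal_cliques`, `maximum_cliques`) and the edge-list input format are assumptions made for illustration, and the approach enumerates every maximal clique first, which is the kind of cost that becomes prohibitive on very large, dense graphs.

```python
from typing import Dict, FrozenSet, Hashable, Iterable, List, Set, Tuple

Vertex = Hashable


def build_adjacency(edges: Iterable[Tuple[Vertex, Vertex]]) -> Dict[Vertex, Set[Vertex]]:
    """Build an undirected adjacency map from an edge list (self-loops ignored)."""
    adj: Dict[Vertex, Set[Vertex]] = {}
    for u, v in edges:
        if u == v:
            continue  # simple graph: no self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def maximal_cliques(adj: Dict[Vertex, Set[Vertex]]) -> List[FrozenSet[Vertex]]:
    """Enumerate all maximal cliques with Bron-Kerbosch and pivoting."""
    cliques: List[FrozenSet[Vertex]] = []

    def expand(r: Set[Vertex], p: Set[Vertex], x: Set[Vertex]) -> None:
        if not p and not x:
            cliques.append(frozenset(r))  # r cannot be extended: it is maximal
            return
        # Pivot on the vertex with the most neighbours in p to prune branches.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}  # v has been fully explored
            x = x | {v}  # exclude v from later cliques at this level

    expand(set(), set(adj), set())
    return cliques


def maximum_cliques(adj: Dict[Vertex, Set[Vertex]]) -> List[FrozenSet[Vertex]]:
    """Keep only the maximal cliques of largest size."""
    cliques = maximal_cliques(adj)
    best = max(len(c) for c in cliques)
    return [c for c in cliques if len(c) == best]


# Hypothetical example graph: two triangles sharing vertex 2, plus a pendant edge.
example_edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (2, 4), (4, 5)]
example_result = maximum_cliques(build_adjacency(example_edges))
```

In this example the edge (4, 5) is a maximal clique but not a maximum one, so it is enumerated and then discarded by the size filter.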
Particle swarm optimization is a popular method for solving difficult optimization problems. There have been attempts to formulate the method in formal probabilistic or stochastic terms (e.g. bare bones particle swarm) with the aim of achieving more generality and explaining the practical behavior of the method. Here we present a Bayesian interpretation of particle swarm optimization. This interpretation provides a formal framework for incorporating prior knowledge about the problem being solved. Furthermore, it allows the particle swarm optimization method to be extended through kernel functions that represent an intermediate transformation of the data into a different space where the optimization problem is expected to be easier to solve; such a transformation can be seen as a form of prior knowledge about the nature of the optimization problem. From the general Bayesian formulation we derive the commonly used particle swarm methods as particular cases.
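To make the probabilistic flavor of the bare bones particle swarm concrete, here is a minimal sketch of that variant, in which each particle samples its next position from a Gaussian centred midway between its personal best and the global best. This is the commonly described bare bones scheme, not the Bayesian or kernel-based formulation of the abstract; the function name `bare_bones_pso`, its parameters, and the sphere test function are assumptions chosen for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple


def bare_bones_pso(
    f: Callable[[Sequence[float]], float],
    bounds: Sequence[Tuple[float, float]],
    n_particles: int = 30,
    n_iters: int = 200,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Minimise f with a bare bones particle swarm (no velocities)."""
    rng = random.Random(seed)
    # Personal bests start at uniformly random positions inside the bounds.
    pbest = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    pbest_val = [f(p) for p in pbest]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = list(pbest[g]), pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            # Per dimension: mean is the midpoint of personal and global best,
            # standard deviation is the distance between them.
            cand = [
                rng.gauss((pi + gi) / 2.0, abs(pi - gi))
                for pi, gi in zip(pbest[i], gbest)
            ]
            val = f(cand)
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = cand, val
                if val < gbest_val:
                    gbest, gbest_val = list(cand), val
    return gbest, gbest_val


def sphere(x: Sequence[float]) -> float:
    """Simple convex test function with its minimum at the origin."""
    return sum(v * v for v in x)


best_x, best_val = bare_bones_pso(sphere, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
```

The sampling distribution here is fixed by the swarm's current bests; a formulation with explicit priors would replace it with a distribution that also encodes knowledge about the problem.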
The power and performance management problem in large-scale computing systems such as data centers has attracted considerable interest from both enterprises and academic researchers, as power saving has become increasingly important in many fields. Because of the multiple objectives, multiple influential factors and hierarchical structure of such systems, the problem is complex and hard. In this paper, the problem is investigated in a virtualized computing system. Specifically, it is formulated as a power optimization problem with constraints on performance. In the first step, an adaptive controller based on a least-squares self-tuning regulator (LS-STR) is designed to track performance; in the second step, the resources computed by the controller are allocated so as to minimize power consumption. Simulations are designed to test the effectiveness of this method and to compare it with other controllers. The simulation results show that the adaptive controller is generally effective: it is applicable to different performance metrics, to different workloads, and to single and multiple workloads; it tracks the performance requirement effectively and reduces power consumption significantly.
The problem of determining the optimal geometric configuration of a sensor network that will maximize the range-related information available for multiple target positioning is of key importance in a multitude of application scenarios. In this paper, a set of sensors that measures the distances between the targets and each of the receivers is considered, assuming that the range measurements are corrupted by white Gaussian noise, in order to search for the formation that maximizes the accuracy of the target estimates. Using tools from estimation theory and convex optimization, the problem is converted into that of maximizing, by proper choice of the sensor positions, a convex combination of the logarithms of the determinants of the Fisher Information Matrices corresponding to each of the targets in order to determine the sensor configuration that yields the minimum possible covariance of any unbiased target estimator. Analytical and numerical solutions are well defined and it is shown that the optimal configuration of the sensors depends explicitly on the constraints imposed on the sensor configuration, the target positions, and the probabilistic distributions that define the prior uncertainty in each of the target positions. Simulation examples illustrate the key results derived.
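The objective described above can be sketched numerically for the planar case. The code below assumes range-only measurements with distance-independent white Gaussian noise of standard deviation `sigma`, for which the Fisher Information Matrix of a target is the sum of outer products of the unit sensor-to-target directions scaled by `1 / sigma**2`. The function names and the 2D restriction are illustrative assumptions, and the sketch omits the prior uncertainty on target positions that the abstract mentions.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def range_fim_2d(sensors: Sequence[Point], target: Point, sigma: float = 1.0) -> Tuple[float, float, float]:
    """Return (a, b, c) for the symmetric 2x2 FIM [[a, b], [b, c]]."""
    a = b = c = 0.0
    for sx, sy in sensors:
        dx, dy = target[0] - sx, target[1] - sy
        r = math.hypot(dx, dy)
        if r == 0.0:
            raise ValueError("sensor coincides with target; direction undefined")
        ux, uy = dx / r, dy / r  # unit vector from sensor to target
        a += ux * ux
        b += ux * uy
        c += uy * uy
    scale = 1.0 / sigma ** 2
    return a * scale, b * scale, c * scale


def log_det_objective(
    sensors: Sequence[Point],
    targets: Sequence[Point],
    weights: Sequence[float],
    sigma: float = 1.0,
) -> float:
    """Convex combination of log det FIM over targets (weights should sum to 1)."""
    total = 0.0
    for target, w in zip(targets, weights):
        if w == 0.0:
            continue  # a zero-weight target contributes nothing
        a, b, c = range_fim_2d(sensors, target, sigma)
        det = a * c - b * b
        total += w * (math.log(det) if det > 0.0 else -math.inf)
    return total


# Three sensors spaced 120 degrees apart on the unit circle around one target.
spread = [(math.cos(t), math.sin(t)) for t in (0.0, 2 * math.pi / 3, 4 * math.pi / 3)]
# Three sensors bunched together on one side of the same target.
clustered = [(1.0, 0.0), (1.0, 0.1), (1.0, -0.1)]
spread_value = log_det_objective(spread, [(0.0, 0.0)], [1.0])
clustered_value = log_det_objective(clustered, [(0.0, 0.0)], [1.0])
```

Under these assumptions the evenly spread formation scores higher than the clustered one, because nearly parallel lines of sight leave one direction of the target position poorly observed.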
Computer modeling, simulation and optimization are powerful tools that have seen increased use in biomechanics research. Dynamic optimizations can be categorized as either data-tracking or predictive problems. The data-tracking approach has been used extensively to address human movement problems of clinical relevance. The predictive approach also holds great promise, but has seen limited use in clinical applications. Enhanced software tools would facilitate the application of predictive musculoskeletal simulations to clinically-relevant research. The open-source software OpenSim provides tools for generating tracking simulations but not predictive simulations. However, OpenSim includes an extensive application programming interface that permits extending its capabilities with scripting languages such as MATLAB. In the work presented here, we combine the computational tools provided by MATLAB with the musculoskeletal modeling capabilities of OpenSim to create a framework for generating predictive simulations of musculoskeletal movement based on direct collocation optimal control techniques. In many cases, the direct collocation approach can be used to solve optimal control problems considerably faster than traditional shooting methods. Cyclical and discrete movement problems were solved using a simple 1 degree of freedom musculoskeletal model and a model of the human lower limb, respectively. The problems could be solved in reasonable amounts of time (several seconds to 1-2 hours) using the open-source IPOPT solver. The problems could also be solved using the fmincon solver that is included with MATLAB, but the computation times were excessively long for all but the smallest of problems. The performance advantage for IPOPT was derived primarily by exploiting sparsity in the constraints Jacobian. The framework presented here provides a powerful and flexible approach for generating optimal control simulations of musculoskeletal movement using OpenSim and MATLAB. 
This should allow researchers to more readily use predictive simulation as a tool to address clinical conditions that limit human mobility.
Using a database with the mitochondrial DNA (mtDNA) of 513 Neolithic individuals, we quantify the space-time variation of the frequency of haplogroup K, previously proposed as a relevant Neolithic marker. We compare these data to simulations based on a mathematical model in which a Neolithic population spreads from Syria to Anatolia and Europe, possibly interbreeding with Mesolithic individuals (who lack haplogroup K) and/or teaching farming to them. Both the data and the simulations show that the percentage of haplogroup K (%K) decreases with increasing distance from Syria and that, in each region, the %K tends to decrease with increasing time after the arrival of farming. Both the model and the data display a local minimum of the genetic cline, and they do so for the same Neolithic regional culture (Sweden). Comparing the observed ancient cline of haplogroup K to the simulation results reveals that about 98% of farmers were involved in neither interbreeding nor acculturation (cultural diffusion). Therefore, cultural diffusion involved only a tiny fraction (about 2%) of farmers and, in this sense, the most relevant process in the spread of the Neolithic in Europe was demic diffusion (i.e., the dispersal of farmers), as opposed to cultural diffusion (i.e., the incorporation of hunter-gatherers).
Alternating projection algorithms are easy to implement and effective for large-scale complex optimization problems, such as constrained reconstruction in X-ray computed tomography (CT). A typical method uses projection onto convex sets (POCS) for data fidelity and non-negativity constraints, combined with total variation (TV) minimization (so-called TV-POCS), for sparse-view CT reconstruction. However, this type of method relies on empirically selected parameters for satisfactory reconstruction, is generally slow, and lacks a convergence analysis. In this work, we use a convex feasibility set approach to address the problems associated with TV-POCS and propose a framework using full sequential alternating projections or POCS (FS-POCS) to find a solution in the intersection of the convex constraints of bounded TV, bounded data fidelity error and non-negativity. The rationale behind FS-POCS is that the mathematically optimal solution of the constrained objective function may not be the physically optimal solution. Breaking constrained reconstruction down into an intersection of several feasible sets can lead to faster convergence and to a quantification of reconstruction parameters that is physically meaningful rather than based on empirical trial and error. In addition, first-order methods are usually used for large-scale optimization problems. We derive the condition for convergence of gradient-based methods and use a primal-dual hybrid gradient (PDHG) method for fast convergence of the bounded TV constraint. The newly proposed FS-POCS is evaluated and compared with TV-POCS and another convex feasibility projection method (CPTV) using both digital phantom and pseudo-real CT data to show its superior performance in reconstruction speed, image quality and quantification.
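The convex feasibility idea can be illustrated with a toy sketch of sequential projections onto two convex sets. This is not FS-POCS itself: a single hyperplane `a . x = b` stands in for the data fidelity constraint, the bounded TV set is omitted entirely, and all function names and the example values are assumptions made for illustration.

```python
from typing import Callable, List, Sequence

Projection = Callable[[List[float]], List[float]]


def project_nonneg(x: Sequence[float]) -> List[float]:
    """Project onto the non-negative orthant by clipping negatives to zero."""
    return [max(v, 0.0) for v in x]


def project_hyperplane(x: Sequence[float], a: Sequence[float], b: float) -> List[float]:
    """Orthogonal projection onto the hyperplane {x : a . x = b}."""
    residual = (sum(ai * xi for ai, xi in zip(a, x)) - b) / sum(ai * ai for ai in a)
    return [xi - residual * ai for ai, xi in zip(a, x)]


def pocs(
    x0: Sequence[float],
    projections: Sequence[Projection],
    n_iters: int = 200,
    tol: float = 1e-12,
) -> List[float]:
    """Apply the projections in sequence until a full sweep barely moves x."""
    x = list(x0)
    for _ in range(n_iters):
        prev = x
        for proj in projections:
            x = proj(x)
        if max(abs(u - v) for u, v in zip(x, prev)) < tol:
            break
    return x


# Toy problem: find x >= 0 with x[0] + x[1] = 1, starting from an infeasible point.
a_vec, b_val = [1.0, 1.0], 1.0
feasible_point = pocs(
    [-2.0, 5.0],
    [lambda x: project_hyperplane(x, a_vec, b_val), project_nonneg],
)
```

Each additional constraint, such as a bound on total variation, would enter as one more projection in the list, provided its projection operator can be computed.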
As a specific kind of non-Euclidean metric that lies in projective space, the Cayley-Klein metric has recently been introduced in metric learning to deal with the complex data distributions in computer vision tasks. In this paper, we extend the original Cayley-Klein metric to the multiple Cayley-Klein metric, which is defined as a linear combination of several Cayley-Klein metrics. Since the Cayley-Klein metric is non-linear, such a combination could model the data space better and thus lead to improved performance. We show how to learn a multiple Cayley-Klein metric by iteratively optimizing the single Cayley-Klein metrics and their combination coefficients, with the objective of maximizing performance in separating inter-class instances and gathering intra-class instances. Our experiments on several benchmarks are quite encouraging.
The underwater acoustic sensor network (UWASN) is a system that exchanges data between numerous sensor nodes deployed in the sea. The UWASN uses an underwater acoustic communication technique to exchange data. Therefore, it is important to design a robust system that will function even in severely fluctuating underwater communication conditions, along with variations in the ocean environment. In this paper, a new algorithm to find the optimal deployment positions of underwater sensor nodes is proposed. The algorithm uses the communication performance surface, which is a map showing the underwater acoustic communication performance of a targeted area. A virtual force-particle swarm optimization algorithm is then used as an optimization technique to find the optimal deployment positions of the sensor nodes, using the performance surface information to estimate the communication radii of the sensor nodes in each generation. The algorithm is evaluated by comparing simulation results between two different seasons (summer and winter) for an area located off the eastern coast of Korea as the selected targeted area.
- Proceedings of the National Academy of Sciences of the United States of America
- Published almost 4 years ago
Ride-sharing services are transforming urban mobility by providing timely and convenient transportation to anybody, anywhere, and anytime. These services present enormous potential for positive societal impacts with respect to pollution, energy consumption, congestion, etc. Current mathematical models, however, do not fully address the potential of ride-sharing. Recently, a large-scale study highlighted some of the benefits of car pooling but was limited to static routes with two riders per vehicle (optimally) or three (with heuristics). We present a more general mathematical model for real-time high-capacity ride-sharing that (i) scales to large numbers of passengers and trips and (ii) dynamically generates optimal routes with respect to online demand and vehicle locations. The algorithm starts from a greedy assignment and improves it through a constrained optimization, quickly returning solutions of good quality and converging to the optimal assignment over time. We quantify experimentally the tradeoff between fleet size, capacity, waiting time, travel delay, and operational costs for low- to medium-capacity vehicles, such as taxis and van shuttles. The algorithm is validated with ∼3 million rides extracted from the New York City taxicab public dataset. Our experimental study considers ride-sharing with rider capacity of up to 10 simultaneous passengers per vehicle. The algorithm applies to fleets of autonomous vehicles and also incorporates rebalancing of idling vehicles to areas of high demand. This framework is general and can be used for many real-time multivehicle, multitask assignment problems.