Concept: Object-oriented programming
Complex diseases are typically caused by combinations of molecular disturbances that vary widely among different patients. Endophenotypes, combinations of genetic factors associated with a disease, offer a simplified approach to dissecting complex traits by reducing genetic heterogeneity. Because molecular dissimilarities often exist between patients with indistinguishable disease symptoms, these unique molecular features may reflect pathogenic heterogeneity. To detect molecular dissimilarities among patients and reduce the complexity of high-dimensional data, we have explored an endophenotype-identification analytical procedure that combines non-negative matrix factorization (NMF) and the adjusted Rand index (ARI), a measure of the similarity of two clusterings of a data set. To evaluate this procedure, we compared it with a commonly used method, principal component analysis with k-means clustering (PCA-K). A simulation study with a gene expression dataset and genotype information was conducted to examine the performance of our procedure and PCA-K. The results showed that NMF mostly outperformed PCA-K. Additionally, we applied our endophenotype-identification analytical procedure to a publicly available dataset containing data derived from patients with late-onset Alzheimer’s disease (LOAD). NMF distilled information associated with 1,116 transcripts into three metagenes and three molecular subtypes (MS) for patients in the LOAD dataset: MS1 (n1=80), MS2 (n2=73), and MS3 (n3=23). ARI was then used to determine the most representative transcripts for each metagene; 123, 89, and 71 metagene-specific transcripts were identified for MS1, MS2, and MS3, respectively. These metagene-specific transcripts were identified as the endophenotypes. Our results showed that 14, 38, 0, and 28 candidate susceptibility genes listed in the AlzGene database were found for all patients, MS1, MS2, and MS3, respectively. Moreover, we found that MS2 might be a normal-like subtype.
Our proposed procedure provides an alternative approach to investigate the pathogenic mechanism of disease and better understand the relationship between phenotype and genotype.
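The abstract above describes ARI as a measure of the similarity of two clusterings of a data set. As a minimal sketch of that measure only (not the authors' transcript-selection procedure), the standard pair-counting formula can be written in plain Python. The function name and the example labels are illustrative assumptions.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two clusterings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        # Fewer than two items: the clusterings agree trivially.
        return 1.0

    # Contingency counts: items in cluster i of A and cluster j of B.
    cell_counts = Counter(zip(labels_a, labels_b))
    a_counts = Counter(labels_a)
    b_counts = Counter(labels_b)

    # Number of item pairs placed together in both, in A, and in B.
    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_a = sum(comb(c, 2) for c in a_counts.values())
    sum_b = sum(comb(c, 2) for c in b_counts.values())

    expected = sum_a * sum_b / total_pairs  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        # Degenerate case, e.g. both clusterings use a single cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical subtype labels for six patients from two methods.
method_1 = ["MS1", "MS1", "MS2", "MS2", "MS3", "MS3"]
method_2 = ["x", "x", "y", "y", "z", "y"]
print(adjusted_rand_index(method_1, method_2))
```

The index is 1 when two clusterings agree up to a renaming of the clusters, and it is close to 0 (possibly negative) when they agree no more than chance would predict.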
In a number of applications there is a need to determine the most likely pedigree for a group of persons based on genetic markers. Adequate models are needed to reach this goal. The markers used to perform the statistical calculations can be linked and there may also be linkage disequilibrium (LD) in the population. The purpose of this paper is to present a graphical Bayesian Network framework to deal with such data. Potential LD is normally ignored and it is important to verify that the resulting calculations are not biased. Even if linkage does not influence results for regular paternity cases, it may have substantial impact on likelihood ratios involving other, more extended pedigrees. Models for LD influence likelihoods for all pedigrees to some degree and an initial estimate of the impact of ignoring LD and/or linkage is desirable, going beyond mere rules of thumb based on marker distance. Furthermore, we show how one can readily include a mutation model in the Bayesian Network; extending other programs or formulas to include such models may require considerable amounts of work and will in many cases not be practical. As an example, we consider the two STR markers vWa and D12S391. We estimate probabilities for population haplotypes to account for LD using a method based on data from trios, while an estimate for the degree of linkage is taken from the literature. The results show that accounting for haplotype frequencies is unnecessary in most cases for this specific pair of markers. When doing calculations on regular paternity cases, the markers can be considered statistically independent. In more complex cases of disputed relatedness, for instance cases involving siblings or so-called deficient cases, or when small differences in the LR matter, independence should not be assumed. (The networks are freely available at http://arken.umb.no/~dakl/BayesianNetworks.)
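To illustrate why ignoring LD can bias a calculation, the sketch below compares haplotype frequencies computed with and without an LD term. It is a deliberately simplified assumption: two biallelic loci with the textbook coefficient D = P(AB) - p_A * p_B, whereas the STR markers in the abstract are multi-allelic and are handled there with a Bayesian Network. All names and numbers are hypothetical.

```python
def haplotype_frequencies(p_a, p_b, d):
    """Frequencies of the four haplotypes at two biallelic loci.

    p_a, p_b: frequencies of allele A (locus 1) and allele B (locus 2).
    d: linkage disequilibrium coefficient, D = P(AB) - p_a * p_b.
    """
    freqs = {
        "AB": p_a * p_b + d,
        "Ab": p_a * (1 - p_b) - d,
        "aB": (1 - p_a) * p_b - d,
        "ab": (1 - p_a) * (1 - p_b) + d,
    }
    if any(f < 0 for f in freqs.values()):
        raise ValueError("D is outside the range allowed by the allele frequencies")
    return freqs


def independence_error(p_a, p_b, d):
    """Factor by which the AB haplotype frequency differs from p_a * p_b."""
    return haplotype_frequencies(p_a, p_b, d)["AB"] / (p_a * p_b)


# Hypothetical values: with D = 0 the loci behave as independent.
print(haplotype_frequencies(0.3, 0.2, 0.0))
print(independence_error(0.3, 0.2, 0.03))
```

Any likelihood that multiplies single-locus allele frequencies where a haplotype frequency is needed would be off by a factor of this kind, which is one way to get an initial estimate of the impact of ignoring LD.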
The solar-powered production of hydrogen for use as a renewable fuel is highly desirable for the world’s future energy infrastructure. However, difficulties in achieving reasonable efficiencies, and thus cost-effectiveness, have hampered significant research progress. Here we propose the use of semiconductor nanostructures to create a type-II heterojunction at the semiconductor-water interface in a photoelectrochemical cell (PEC) and theoretically investigate it as a method of increasing the maximum photovoltage such a cell can generate under illumination, with the aim of increasing the overall cell efficiency. A model for the semiconductor electrode in a PEC is created, which solves the Schrödinger, Poisson and drift-diffusion equations self-consistently. From this, it is determined that ZnO quantum dots on bulk n-InGaN with low In content x is the most desirable system, having electron-accepting and -donating states straddling the oxygen- and hydrogen-production potentials for x < 0.26, though large variance in literature values for certain material parameters means large uncertainties in the model output. Accordingly, results presented here should form the basis for further experimental work, which will in turn provide input to refine and develop the model.
Elucidating the druggable interface of protein-protein interactions using fragment docking and coevolutionary analysis
- Proceedings of the National Academy of Sciences of the United States of America
- Published over 4 years ago
Protein-protein interactions play a central role in cellular function. Improving the understanding of complex formation has many practical applications, including the rational design of new therapeutic agents and the elucidation of the mechanisms governing signal transduction networks. The generally large, flat, and relatively featureless binding sites of protein complexes pose many challenges for drug design. Fragment docking and direct coupling analysis are used in an integrated computational method to estimate druggable protein-protein interfaces. (i) This method explores the binding of fragment-sized molecular probes on the protein surface using a molecular docking-based screen. (ii) The energetically favorable binding sites of the probes, called hot spots, are spatially clustered to map out candidate binding sites on the protein surface. (iii) A coevolution-based interface interaction score is used to discriminate between different candidate binding sites, yielding potential interfacial targets for therapeutic drug design. This approach is validated for important, well-studied disease-related proteins with known pharmaceutical targets, and also identifies targets that have yet to be studied. Moreover, therapeutic agents are proposed by chemically connecting the fragments that are strongly bound to the hot spots.
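Step (ii) above, the spatial clustering of hot spots, can be sketched as follows. The abstract does not say which clustering algorithm is used, so this example assumes single-linkage clustering with a distance cutoff. The function name, the coordinates, and the cutoff value are all hypothetical.

```python
from math import dist


def cluster_hot_spots(points, cutoff):
    """Single-linkage clustering: points within `cutoff` share a cluster."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the cluster representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps trees shallow
            i = parent[i]
        return i

    # Merge the clusters of every pair of points closer than the cutoff.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(points[i], points[j]) <= cutoff:
                parent[find(i)] = find(j)

    clusters = {}
    for i, point in enumerate(points):
        clusters.setdefault(find(i), []).append(point)
    return list(clusters.values())


# Hypothetical probe positions (in angstroms) on a protein surface.
probes = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0),
          (10.5, 0.0, 0.0), (5.0, 5.0, 5.0)]
for site in cluster_hot_spots(probes, cutoff=2.0):
    print(len(site), site)
```

Each resulting cluster stands for one candidate binding site. Scoring the sites, as in step (iii), would need coevolutionary data that this sketch does not model.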
IsoMIF Finder is an online server for the identification of molecular interaction field (MIF) similarities. User defined binding site MIFs can be compared to datasets of pre-calculated MIFs or against a user-defined list of PDB entries. The interface can be used for the prediction of function, identification of potential cross-reactivity or polypharmacological targets and drug repurposing. Detected similarities can be viewed in a browser or within a PyMOL session.
The Defense Advanced Research Projects Agency (DARPA) has funded innovative scientific research and technology developments in the field of brain-computer interfaces (BCI) since the 1970s. This review highlights some of DARPA’s major advances in the field of BCI, particularly those made in recent years. Two broad categories of DARPA programs are presented with respect to the ultimate goals of supporting the nation’s warfighters: 1) BCI efforts aimed at restoring neural and/or behavioral function, and 2) BCI efforts aimed at improving human training and performance. The programs discussed are synergistic and complementary to one another, and, moreover, promote interdisciplinary collaborations among researchers, engineers, and clinicians. Finally, this review includes a summary of some of the remaining challenges for the field of BCI, as well as the goals of new DARPA efforts in this domain.
Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
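The role of backpropagation described above can be made concrete with a very small example. The sketch below assumes a toy network with one hidden tanh layer, a single linear output, and a squared-error loss. It computes the gradient of the loss with respect to each weight by passing the output error back through the layers. The architecture, names, and numbers are illustrative assumptions, not taken from the abstract.

```python
import math


def forward(x, w1, w2):
    """One hidden tanh layer followed by a single linear output unit."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    output = sum(w * h for w, h in zip(w2, hidden))
    return hidden, output


def loss(x, target, w1, w2):
    """Squared-error loss for one training example."""
    _, output = forward(x, w1, w2)
    return 0.5 * (output - target) ** 2


def backprop(x, target, w1, w2):
    """Gradients of the loss with respect to w1 and w2."""
    hidden, output = forward(x, w1, w2)
    d_output = output - target                  # dL/d(output)
    grad_w2 = [d_output * h for h in hidden]    # output-layer weights
    # Pass the error back through the tanh units: tanh'(z) = 1 - tanh(z)^2.
    d_hidden = [d_output * w * (1 - h * h) for w, h in zip(w2, hidden)]
    grad_w1 = [[dh * xi for xi in x] for dh in d_hidden]
    return grad_w1, grad_w2


# Hypothetical example: one gradient-descent step on a single data point.
x, target = [0.5, -1.0], 0.3
w1 = [[0.1, -0.2], [0.4, 0.3]]
w2 = [0.7, -0.5]
grad_w1, grad_w2 = backprop(x, target, w1, w2)
rate = 0.1
new_w1 = [[w - rate * g for w, g in zip(row, g_row)]
          for row, g_row in zip(w1, grad_w1)]
new_w2 = [w - rate * g for w, g in zip(w2, grad_w2)]
print(loss(x, target, w1, w2), loss(x, target, new_w1, new_w2))
```

The gradients tell each layer how to change its parameters to reduce the error. Practical deep networks apply the same chain-rule bookkeeping to many layers and large batches, normally through an automatic-differentiation library rather than hand-written code.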
Half of human spinal cord injuries lead to chronic paralysis. Here, we introduce an electrochemical neuroprosthesis and a robotic postural interface designed to encourage supraspinally mediated movements in rats with paralyzing lesions. Despite the interruption of direct supraspinal pathways, the cortex regained the capacity to transform contextual information into task-specific commands to execute refined locomotion. This recovery relied on the extensive remodeling of cortical projections, including the formation of brainstem and intraspinal relays that restored qualitative control over electrochemically enabled lumbosacral circuitries. Automated treadmill-restricted training, which did not engage cortical neurons, failed to promote translesional plasticity and recovery. By encouraging active participation under functional states, our training paradigm triggered a cortex-dependent recovery that may improve function after similar injuries in humans.
The function of neural circuits is an emergent property that arises from the coordinated activity of large numbers of neurons. To capture this, we propose launching a large-scale, international public effort, the Brain Activity Map Project, aimed at reconstructing the full record of neural activity across complete neural circuits. This technological challenge could prove to be an invaluable step toward understanding fundamental and pathological brain processes.
To examine the long-term effects of exercise modality during weight loss on body composition and associations between body composition and physical function changes.