Concept: Allele frequency
We present a statistical framework for estimation and application of sample allele frequency spectra from Next-Generation Sequencing (NGS) data. In this method, we first estimate the allele frequency spectrum using maximum likelihood. In contrast to previous methods, the likelihood function is calculated using a dynamic programming algorithm and numerically optimized using analytical derivatives. We then use a Bayesian method for estimating the sample allele frequency in a single site, and show how the method can be used for genotype calling and SNP calling. We also show how the method can be extended to various other cases including cases with deviations from Hardy-Weinberg equilibrium. We evaluate the statistical properties of the methods using simulations and by application to a real data set.
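The abstract above does not spell out its dynamic programming recursion, so the following is only a minimal sketch of one common formulation, assuming diploid individuals, known per-individual genotype likelihoods, and Hardy-Weinberg-style random assignment of alleles. The function names and the input layout are hypothetical and are not taken from the paper.

```python
from math import comb


def site_allele_count_likelihoods(genotype_likelihoods):
    """Return h where h[j] = P(data | j derived alleles among 2n chromosomes).

    genotype_likelihoods: one (gl0, gl1, gl2) tuple per diploid individual,
    giving P(reads | genotype has 0, 1 or 2 derived alleles). Assumed input.
    """
    h = [1.0]  # with zero individuals there is exactly one (empty) configuration
    for gl0, gl1, gl2 in genotype_likelihoods:
        new = [0.0] * (len(h) + 2)
        for j, value in enumerate(h):
            new[j] += value * gl0            # individual adds 0 derived alleles
            new[j + 1] += value * 2.0 * gl1  # heterozygote: 2 chromosome orderings
            new[j + 2] += value * gl2        # individual adds 2 derived alleles
        h = new
    chromosomes = len(h) - 1
    # Divide by the number of ways to place j derived alleles on 2n chromosomes.
    return [value / comb(chromosomes, j) for j, value in enumerate(h)]


def posterior_allele_count(likelihoods, prior):
    """Bayes rule: combine per-site likelihoods with a prior over allele counts."""
    weights = [lik * p for lik, p in zip(likelihoods, prior)]
    norm = sum(weights)
    if norm == 0.0:
        raise ValueError("likelihoods and prior have no overlapping support")
    return [w / norm for w in weights]
```

In this sketch the prior passed to `posterior_allele_count` would be an estimated allele frequency spectrum; the recursion costs on the order of n² operations per site rather than enumerating all 3ⁿ genotype configurations.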
HLA-G molecules seem to have a protective effect for the semi-allogeneic fetus through maternal immunosuppression. Also, pregnancy pathologies have been associated with the HLA-G∗01:05N “null allele”. In addition, other general regulatory immune functions have been associated with HLA-G in infections, tumors and autoimmunity. Thus, it is striking that the HLA-G∗01:05N allele is maintained at a substantial frequency in certain human populations. In the present work, we have analysed HLA-G allele frequencies in Amerindian Mayans from Guatemala and in Uros from the Titikaka Lake “totora” (reed) floating islands (Peru). No HLA-G∗01:05N has been found in either of these Amerindian populations; these findings have a bearing on future clinical/epidemiological studies in Amerindians. Further studies in worldwide populations show that the highest HLA-G∗01:05N allele frequencies are found in the Middle East. This would suggest that this area was close to the “null” allele origin (as predicted by us) and/or that some evolutionary pressures are maintaining these high frequencies in the Middle East. However, the fact that the Cercopithecinae primate family (primates postulated as distant human ancestors) also has an MHC-G “null” allele in all individuals suggests that this allele may confer some advantage either at the maternal/fetal interface or at another level of HLA-G immune function (tumors, infections, autoimmunity). Human HLA-G∗01:05N may produce HLA-G isoforms, as Cercopithecinae monkeys may, which may suffice for function.
What is the best way to teach evolution? As microevolution may be configured as a branch of genetics, it being a short conceptual leap from understanding the concepts of mutation and alleles (i.e., genetics) to allele frequency change (i.e., evolution), we hypothesised that learning genetics prior to evolution might improve student understanding of evolution. In the UK, genetics and evolution are typically taught to 14- to 16-y-old secondary school students as separate topics with few links, in no particular order and sometimes with a large time span between. Here, then, we report the results of a large trial into teaching order of evolution and genetics. We modified extant questionnaires to ascertain students' understanding of evolution and genetics along with acceptance of evolution. Students were assessed prior to teaching, immediately post teaching and again after several months. Teachers were not instructed what to teach, just to teach in a given order. Regardless of order, teaching increased understanding and acceptance, with robust signs of longer-term retention. Importantly, teaching genetics before teaching evolution has a significant (p < 0.001) impact on improving evolution understanding by 7% in questionnaire scores beyond the increase seen for those taught in the inverse order. For lower ability students, an improvement in evolution understanding was seen only if genetics was taught first. Teaching genetics first additionally had positive effects on genetics understanding, by increasing knowledge. These results suggest a simple, minimally disruptive, zero-cost intervention to improve evolution understanding: teach genetics first. This same alteration does not, however, result in a significantly increased acceptance of evolution, which reflects a weak correlation between knowledge and acceptance of evolution. Qualitative focus group data highlights the role of authority figures in determination of acceptance.
- Proceedings of the National Academy of Sciences of the United States of America
The Out-of-Africa (OOA) dispersal ∼50,000 y ago is characterized by a series of founder events as modern humans expanded into multiple continents. Population genetics theory predicts an increase of mutational load in populations undergoing serial founder effects during range expansions. To test this hypothesis, we have sequenced full genomes and high-coverage exomes from seven geographically divergent human populations from Namibia, Congo, Algeria, Pakistan, Cambodia, Siberia, and Mexico. We find that individual genomes vary modestly in the overall number of predicted deleterious alleles. We show via spatially explicit simulations that the observed distribution of deleterious allele frequencies is consistent with the OOA dispersal, particularly under a model where deleterious mutations are recessive. We conclude that there is a strong signal of purifying selection at conserved genomic positions within Africa, but that many predicted deleterious mutations have evolved as if they were neutral during the expansion out of Africa. Under a model where selection is inversely related to dominance, we show that OOA populations are likely to have a higher mutation load due to increased allele frequencies of nearly neutral variants that are recessive or partially recessive.
CRISPR/Cas9 gene drive (CGD) promises a highly adaptable approach for spreading genetically engineered alleles throughout a species, even if those alleles impair reproductive success. CGD has been shown to be effective in laboratory crosses of insects, yet it remains unclear to what extent potential resistance mechanisms will affect the dynamics of this process in large natural populations. Here we develop a comprehensive population genetic framework for modeling CGD dynamics, which incorporates potential resistance mechanisms as well as random genetic drift. Using this framework, we calculate the probability that resistance against CGD evolves from standing genetic variation, de novo mutation of wildtype alleles, or cleavage-repair by nonhomologous end joining (NHEJ) – a likely byproduct of CGD itself. We show that resistance to standard CGD approaches should evolve almost inevitably in most natural populations, unless repair of CGD-induced cleavage via NHEJ can be effectively suppressed, or resistance costs are on par with those of the driver. The key factor determining the probability that resistance evolves is the overall rate at which resistance alleles arise at the population level by mutation or NHEJ. By contrast, the conversion efficiency of the driver, its fitness cost, and its introduction frequency have only minor impact. Our results shed light on strategies that could facilitate the engineering of drivers with lower resistance potential, and motivate the possibility to embrace resistance as a possible mechanism for controlling a CGD approach. This study highlights the need for careful modeling of the population dynamics of CGD prior to the actual release of a driver construct into the wild.
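To make the interplay between driver conversion and NHEJ-generated resistance concrete, here is a deliberately simplified deterministic sketch. It is not the framework of the study above: it assumes random mating, an infinite population (so no drift), no fitness costs, and no standing variation or de novo mutation. The parameter names `conversion` and `nhej` are hypothetical. In driver/wild-type heterozygotes, the wild-type allele is converted to a driver with probability `conversion` and to a resistance allele with probability `nhej`.

```python
def next_generation(driver, wild, resistant, conversion, nhej):
    """Advance allele frequencies (driver, wild, resistant) by one generation.

    Assumes random mating and no selection, so only driver/wild heterozygotes
    (frequency 2 * driver * wild) change the allele frequencies.
    """
    if conversion < 0 or nhej < 0 or conversion + nhej > 1:
        raise ValueError("conversion and nhej must be probabilities summing to <= 1")
    flux = driver * wild  # half the heterozygote frequency: the exposed wild alleles
    return (
        driver + conversion * flux,
        wild - (conversion + nhej) * flux,
        resistant + nhej * flux,
    )


def simulate(driver, wild, resistant, conversion, nhej, generations):
    """Return the list of (driver, wild, resistant) tuples over time."""
    trajectory = [(driver, wild, resistant)]
    for _ in range(generations):
        driver, wild, resistant = next_generation(
            driver, wild, resistant, conversion, nhej
        )
        trajectory.append((driver, wild, resistant))
    return trajectory
```

Even in this toy version, any nonzero `nhej` leaves a share of resistance alleles in the population once the wild-type allele is depleted, which is the qualitative point the abstract makes about suppressing NHEJ. Quantitative conclusions require the drift, fitness, and mutation terms omitted here.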
Previous genome-wide scans of positive natural selection in humans have identified a number of non-neutrally evolving genes that play important roles in skin pigmentation, metabolism, or immune function. Recent studies have also shown that a genome-wide pattern of local adaptation can be detected by identifying correlations between patterns of allele frequencies and environmental variables. Despite these observations, the degree to which natural selection is primarily driven by adaptation to local environments, and the role of pathogens or other ecological factors as selective agents, is still under debate. To address this issue, we correlated the spatial allele frequency distribution of a large sample of SNPs from 55 distinct human populations to a set of environmental factors that describe local geographical features such as climate, diet regimes, and pathogen loads. In concordance with previous studies, we detected a significant enrichment of genic SNPs, and particularly non-synonymous SNPs associated with local adaptation. Furthermore, we show that the diversity of the local pathogenic environment is the predominant driver of local adaptation, and that climate, at least as measured here, only plays a relatively minor role. While background demography by far makes the strongest contribution in explaining the genetic variance among populations, we detected about 100 genes which show an unexpectedly strong correlation between allele frequencies and pathogenic environment, after correcting for demography. Conversely, for diet regimes and climatic conditions, no genes show a similar correlation between the environmental factor and allele frequencies. This result is validated using low-coverage sequencing data for multiple populations. 
Among the loci targeted by pathogen-driven selection, we found an enrichment of genes associated with autoimmune diseases, such as celiac disease, type 1 diabetes, and multiple sclerosis, which lends credence to the hypothesis that some susceptibility alleles for autoimmune diseases may be maintained in human populations due to past selective processes.
Comparisons of DNA from archaic and modern humans show that these groups interbred, and in some cases received an evolutionary advantage from doing so. This process - adaptive introgression - may lead to a faster rate of adaptation than is predicted from models with mutation and selection alone. Within the last couple of years, a series of studies have identified regions of the genome that are likely examples of adaptive introgression. In many cases, once a region was ascertained as being introgressed, commonly used statistics based on both haplotype as well as allele frequency information were employed to test for positive selection. Introgression by itself, however, changes both the haplotype structure and the distribution of allele frequencies, thus confounding traditional tests for detecting positive selection. Therefore, patterns generated by introgression alone may lead to false inferences of positive selection. Here we explore models involving both introgression and positive selection to investigate the behavior of various statistics under adaptive introgression. In particular, we find that the number and allelic frequencies of sites that are uniquely shared between archaic humans and specific present-day populations are particularly useful for detecting adaptive introgression. We then examine the 1000 Genomes dataset to characterize the landscape of uniquely shared archaic alleles in human populations. Finally, we identify regions that were likely subject to adaptive introgression and discuss some of the most promising candidate genes located in these regions.
We describe the astonishing changes and progress that have occurred in the field of population genetics over the past 50 years, slightly longer than the time since the first Population Genetics Group (PGG) meeting in January 1968. We review the major questions and controversies that have preoccupied population geneticists during this time (and were often hotly debated at PGG meetings). We show how theoretical and empirical work has combined to generate a highly productive interaction involving successive developments in the ability to characterise variability at the molecular level, to apply mathematical models to the interpretation of the data and to use the results to answer biologically important questions, even in nonmodel organisms. We also describe the changes from a field that was largely dominated by UK and North American biologists to a much more international one (with the PGG meetings having made important contributions to the increased number of population geneticists in several European countries). Although we concentrate on the earlier history of the field, because developments in recent years are more familiar to most contemporary researchers, we end with a brief outline of topics in which new understanding is still actively developing. Heredity advance online publication, 27 July 2016; doi:10.1038/hdy.2016.55.
Extensive genomic resources are available in the model legume Medicago truncatula. Here, we present the discovery and design of the first array of single-nucleotide polymorphism (SNP) markers in M. truncatula through large-scale Sanger resequencing of genomic fragments spanning the genome, in a diverse panel of 16 M. truncatula accessions. Both anonymous fragments and fragments targeting candidate genes for flowering phenology and symbiosis were surveyed for nucleotide variation in almost 230 kb of unique genomic regions. A set of 384 SNP markers was designed for an Illumina GoldenGate assay, genotyped on a collection of 192 inbred lines (CC192) representing the geographical range of the species and used to survey the diversity of two natural populations. Finally, 86% of the tested SNPs were of high quality and exhibited polymorphism in the CC192 collection. Even at the population level, we detected polymorphism for more than 50% of the selected SNPs. Analysis of the allele frequency spectrum in the CC192 showed a reduced ascertainment bias, mostly limited to very rare alleles (frequency <0.01). The substantial polymorphism detected at the species and population levels, the high marker quality and the potential to survey large samples of individuals make this set of SNP markers a valuable tool to improve our understanding of the effect of demographic and selective factors that shape the natural genetic diversity within the selfing species Medicago truncatula.
A number of previous studies suggested the presence of deleterious amino acid altering nonsynonymous single-nucleotide polymorphisms (nSNPs) in human populations. However, the proportions of deleterious nSNPs among rare and common variants are not known. To estimate these, >77 000 SNPs from human protein-coding genes were analyzed. Based on two independent methods, this study reveals that up to 53% of rare nSNPs (minor allele frequency (MAF)<0.002) could be deleterious in nature. The fraction of deleterious nSNPs declines with the increase in their allele frequencies and only 12% of the common nSNPs (MAF>0.4) were found to be harmful. This shows that even at high frequencies significant fractions of deleterious polymorphisms are present in human populations. These results could be useful for genome-wide association studies in understanding the relative contributions of rare and common variants in causing human genetic diseases.
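The rare/common comparison above rests on classifying each variant by its minor allele frequency. The sketch below shows that bookkeeping only. The input layout (alternate allele count, total allele count, and a deleteriousness flag from some external predictor) is a hypothetical assumption, and the default thresholds simply reuse the MAF cutoffs quoted in the abstract.

```python
def minor_allele_frequency(alt_count, total_alleles):
    """Frequency of the less common of two alleles at a biallelic site."""
    if total_alleles <= 0 or not 0 <= alt_count <= total_alleles:
        raise ValueError("need 0 <= alt_count <= total_alleles and total_alleles > 0")
    minor_count = min(alt_count, total_alleles - alt_count)
    return minor_count / total_alleles


def deleterious_fraction_by_class(variants, rare_below=0.002, common_above=0.4):
    """Fraction of variants flagged deleterious in each frequency class.

    variants: iterable of (alt_count, total_alleles, is_deleterious) tuples.
    Returns a dict mapping class name to a fraction, or None for empty classes.
    """
    tallies = {"rare": [0, 0], "intermediate": [0, 0], "common": [0, 0]}
    for alt_count, total_alleles, is_deleterious in variants:
        maf = minor_allele_frequency(alt_count, total_alleles)
        if maf < rare_below:
            name = "rare"
        elif maf > common_above:
            name = "common"
        else:
            name = "intermediate"
        tallies[name][0] += 1 if is_deleterious else 0
        tallies[name][1] += 1
    return {
        name: (flagged / seen if seen else None)
        for name, (flagged, seen) in tallies.items()
    }
```

Note that a site with an alternate allele frequency of 0.6 has a MAF of 0.4, because the minor allele is then the reference allele; taking `min` over both counts handles this case.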