Lack of biased gene conversion repair favoring G/C nucleotides in D. melanogaster
The analysis of the distribution of γ along chromosomes at the 100-kb scale reveals a more uniform distribution than that of CO (c) rates, with no reduction near telomeres or centromeres (Figure 5). More than 80% of 100-kb windows show γ within a 2-fold range, a percentage that contrasts with the distribution of CO, where only 26.3% of 100-kb windows along chromosomes show c within a 2-fold range of the chromosome average. To test specifically whether the distribution of CO events is more variable across the genome than that of either GC or the combination of GC and CO events (i.e., number of DSBs), we estimated the coefficient of variation (CV) along chromosomes for each of the three parameters for different window sizes and chromosome arms. In all cases (window size and chromosome arm), the CV for CO is much greater (more than 2-fold) than that for either GC or DSBs (CO+GC), while the CV for DSBs is only marginally greater than that for GC: for 100-kb windows, the average CV per chromosome arm for CO, GC and DSBs is 0.90, 0.37 and 0.38, respectively. Nevertheless, we can also rule out the possibility that the distribution of GC events or DSBs is completely random, with significant heterogeneity along each chromosome (P<0.0001 at all physical scales analyzed, from 100 kb to 10 Mb; see Materials and Methods for details). Not surprisingly given the excess of GC over CO events, GC is a much better predictor than CO of the total number of DSBs or total recombination events across the genome, with semi-partial correlations of 0.96 for GC and 0.38 for CO to explain the overall variance in DSBs (not taking into account the fourth chromosome).
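The two summary statistics used in this comparison (the share of windows within a 2-fold range of the average, and the coefficient of variation) can be illustrated with a minimal Python sketch. The function names, the reading of "2-fold range" as between half and twice the mean, the use of the population standard deviation, and the example window values are all assumptions made for illustration; they are not taken from the study's Materials and Methods.

```python
from statistics import mean, pstdev


def coefficient_of_variation(rates):
    """CV = standard deviation / mean of per-window rates.

    Assumption: population standard deviation (pstdev); the original
    analysis may have used the sample estimator instead.
    """
    m = mean(rates)
    if m == 0:
        raise ValueError("CV is undefined when the mean rate is zero")
    return pstdev(rates) / m


def fraction_within_twofold(rates):
    """Share of windows whose rate lies between mean/2 and 2*mean.

    Assumption: this is one plausible reading of "within a 2-fold
    range of the chromosome average".
    """
    m = mean(rates)
    inside = [r for r in rates if m / 2 <= r <= 2 * m]
    return len(inside) / len(rates)


# Hypothetical per-window rates for one chromosome arm (made-up values).
co_windows = [0.0, 0.6, 3.0, 0.2, 1.3]   # uneven, CO-like landscape
gc_windows = [0.9, 1.1, 1.0, 1.2, 0.8]   # more uniform, GC-like landscape

for label, windows in (("CO", co_windows), ("GC", gc_windows)):
    print(label,
          round(coefficient_of_variation(windows), 3),
          round(fraction_within_twofold(windows), 3))
```

With real data, each list would hold the rates for all 100-kb windows of a chromosome arm, and the CVs would then be averaged across arms.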
DSB resolution involves the formation of heteroduplex sequences (for either CO or GC events; Figure S1). These heteroduplex sequences can contain A(T):C(G) mismatches that are repaired either at random or favoring specific nucleotides. In Drosophila, there is no direct experimental evidence supporting G+C biased gene conversion repair, and evolutionary analyses have provided contradictory results when using CO rates as a proxy for heteroduplex formation. Note, however, that GC events are more frequent than CO events in Drosophila as well as in other organisms, and that GC (γ) rates should be more relevant than CO (c) rates when investigating the possible consequences of heteroduplex repair.
In some species, gene conversion mismatch repair has been proposed to be biased, favoring G and C nucleotides and predicting a positive relationship between recombination rates (sensu frequency of heteroduplex formation) and the G+C content of noncoding DNA.
Our data show no association of γ with G+C nucleotide composition at intergenic sequences (R = +0.036, P>0.20) or introns (R = −0.041, P>0.16). The same lack of association is observed when G+C nucleotide composition is compared to c (P>0.25 for both intergenic sequences and introns). We therefore find no evidence of gene conversion bias favoring G and C nucleotides in D. melanogaster based on nucleotide composition. The reasons for some of the previous results that inferred gene conversion bias towards G and C nucleotides in Drosophila are several and include the use of sparse CO maps as well as incomplete genome annotation. Because gene density in D. melanogaster is higher in regions with non-reduced CO, the many recently annotated transcribed regions and G+C rich exons may have been previously analyzed as neutral sequences, particularly in these genomic regions with non-reduced CO.
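As an illustration of the kind of test reported here, the sketch below computes the G+C content of a set of sequences and its correlation with a per-region recombination rate. It assumes a Pearson correlation; the text reports only R and does not state which coefficient was used. The function names and the example values are hypothetical.

```python
from math import sqrt


def gc_content(seq):
    """Fraction of G or C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence has no unambiguous bases")
    return (seq.count("G") + seq.count("C")) / counted


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length (>= 2)")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical intergenic sequences and matching GC (gene conversion) rates.
intergenic = ["ATGCGCTA", "TTTTAACG", "GGGCCCAT", "ATATATGC"]
gc_rates = [1.1, 0.9, 1.0, 1.2]

composition = [gc_content(s) for s in intergenic]
print(round(pearson_r(composition, gc_rates), 3))
```

An R close to zero, as reported for both intergenic sequences and introns, is the pattern expected when mismatch repair does not favor G and C nucleotides.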
The motifs of recombination in Drosophila
To discover DNA motifs associated with recombination events (CO or GC), we focused on 1,909 CO and 3,701 GC events delimited by 500 bp or less (CO500 and GC500, respectively). Our D. melanogaster data reveal many motifs significantly enriched in sequences surrounding recombination events (18 and 10 motifs for CO and GC, respectively) (Figure 6 and Figure 7). Individually, the motifs surrounding CO events (MCO) are present in 6.8 to 43.2% of CO500 sequences, while motifs surrounding GC events (MGC) are present in 7.8 to 27.6% of GC500 sequences. Note that 97.7% of all CO500 sequences contain at least one MCO motif and 85.0% of GC500 sequences contain at least one MGC motif (Figure S4).
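The per-motif and "at least one motif" percentages can be tallied as in the following minimal Python sketch. The motifs and sequences shown are placeholders, not the MCO or MGC motifs of Figures 6 and 7. Exact string matching on both strands is an assumption made for simplicity; the actual motif discovery and scanning procedure is not described in this section.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_motif(seq, motif):
    """True if the motif occurs on either strand (exact match)."""
    seq = seq.upper()
    return motif in seq or reverse_complement(motif) in seq


def motif_coverage(seqs, motifs):
    """Return (fraction of sequences per motif, fraction with any motif)."""
    n = len(seqs)
    per_motif = {m: sum(has_motif(s, m) for s in seqs) / n for m in motifs}
    any_motif = sum(any(has_motif(s, m) for m in motifs) for s in seqs) / n
    return per_motif, any_motif


# Placeholder sequences and motifs (hypothetical, for illustration only).
example_seqs = ["AAGTGGTT", "CCCACTCC", "TTTTTTTT", "ACACACAC"]
example_motifs = ["GTGG", "ACAC"]

per_motif, any_motif = motif_coverage(example_seqs, example_motifs)
print(per_motif, any_motif)
```

Applied to the CO500 or GC500 sequences with the corresponding motif set, `per_motif` would give the individual percentages and `any_motif` the share of sequences carrying at least one motif.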