QTL Mapping in Plant Breeding

QTL Mapping

QTL mapping is a statistical analysis that identifies which molecular markers are associated with quantitative variation in a particular trait.

Principle of QTL Mapping

Ø The objective behind QTL mapping is to find the association between genotypic and phenotypic data. The molecular markers are used to divide the mapping population into distinct genotypic classes depending on genotypes at the marker locus, and correlative statistics are used to see if individuals of one genotype vary substantially from those of the other genotype in terms of the characteristic under investigation.

Ø A significant difference in phenotypic means between the two or more genotype groups (the number of groups depends on the marker system and the population type) suggests that the marker locus used to partition the mapping population is linked to a QTL controlling the trait.

Ø Whether the difference between group means is significant (a small P-value) depends on recombination between the marker and the QTL. The closer a marker is to a QTL, the less likely it is that the marker and QTL will recombine.

Ø As a result, a QTL and a tightly linked marker will generally be transmitted together from generation to generation, and the mean of the group carrying the tightly linked marker will be significantly different (P < 0.05) from the mean of the group without the marker.

Ø When a marker is loosely linked or unlinked to a QTL, the marker and QTL segregate independently. In this case, the presence or absence of the loosely linked marker makes no significant difference to the genotype group means.

Ø Unlinked markers, located far from the QTL or on separate chromosomes, are inherited independently of the QTL, so there will be no significant differences between genotype group means.
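
The effect of recombination on the marker contrast can be made concrete with a short calculation. The sketch below assumes a population with two marker classes (for example a backcross or DH population), a single QTL whose two genotypes differ in mean by `qtl_effect`, and a recombination fraction `r` between marker and QTL. Under those assumptions the expected difference between marker-class means is (1 - 2r) times the QTL effect. The function name and the numbers are illustrative and are not taken from the source.

```python
def expected_marker_contrast(qtl_effect, r):
    """Expected difference between the two marker-class means.

    Assumes two marker classes (backcross or DH), one QTL whose genotypes
    differ by qtl_effect, and recombination fraction r between marker and QTL.
    """
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination fraction must lie between 0 and 0.5")
    return (1.0 - 2.0 * r) * qtl_effect


# Hypothetical QTL effect of 10 trait units, markers at increasing distance.
for r in (0.0, 0.05, 0.2, 0.5):
    contrast = expected_marker_contrast(10.0, r)
    print(f"r = {r:.2f}: expected contrast = {contrast:.2f}")
```

At r = 0.5 (an unlinked marker) the expected contrast is zero, which matches the statement that unlinked markers show no significant difference between genotype group means.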

Steps involved in QTL Mapping

Several stages are involved in identifying and characterising quantitative trait loci (QTL) for use in marker-assisted selection. The four primary stages of QTL mapping are listed in the figure, and each is described under its own subheading.

A. Mapping Populations

Population development for QTL mapping requires selecting the target parents, crossing them and advancing the progeny to produce a set of lines segregating for the traits of interest. Mapping populations include F2 populations, RILs (Recombinant Inbred Lines), BC (Backcross) populations, DH (Doubled Haploid) lines, NILs (Near Isogenic Lines), CSSLs (Chromosome Segment Substitution Lines), F2-derived F3 populations and immortalized F2 populations. BC, DH, RIL and F2 populations are the most commonly used in QTL mapping (Zhang et al., 2010). Table 1 shows the types of mapping populations and their expected ratios with dominant and codominant markers.

F2 population: Progeny produced by selfing the F1 individuals from the cross of selected parents. It is best suited for mapping markers and oligogenes, as it requires only two generations to develop. F2 populations provide estimates of the additive, dominance and epistatic components of genetic variance. An F2 population was used by DeVicente and Tanksley (1993) to evaluate 11 quantitative traits in a cross between cultivated tomato and a wild relative.

F2 derived F3 population: Selfing F2 individuals for one generation and harvesting the seeds from each F2 plant separately yields an F2 derived F3 population. These populations are suitable for mapping QTLs and oligogenic traits controlled by recessive genes.

Backcross population: The population obtained by crossing the F1 with either of its two parents is the backcross population. BC genetic analysis can be used if the target trait shows detectable phenotypic segregation. Because each BC individual is a unique genotype that cannot be replicated, these populations are poorly suited to QTL mapping of traits that must be evaluated in replicated trials.

Doubled Haploids: Ovule or pollen cultures from F1 plants can be treated with colchicine to double the chromosome number. The resulting DH lines are completely homozygous at all loci in the genome, with no residual heterozygosity. They can be multiplied and maintained indefinitely and shared for further research. They can be evaluated in replicated trials, which makes them suitable for both qualitative and quantitative traits. These populations are described as suitable for mapping heterosis QTLs; because the lines themselves carry no heterozygotes, this requires crosses made with the DH lines rather than the lines alone.

RILs (Recombinant Inbred Lines): A population obtained by continuous inbreeding of individual F2 plants constitutes a set of RILs. RILs help in detecting markers located much closer to the target gene than is possible with BC, F2 and DH populations. In terms of recombination, RILs are twice as informative as F2 populations. They are perpetual (permanent) mapping populations, as they can be maintained and propagated indefinitely. They are widely used for developing molecular marker linkage maps, detecting markers linked with qualitative and quantitative traits, and mapping QTLs and genes.

Immortalized F2 population: The F3 progeny from each F2 plant are intermated in two groups, and the seeds from at least 20 plants, harvested in bulk, constitute an immortalized F2 (IF2) population. Such a population is not itself perpetual; it is as ephemeral as an F2 population. However, the same IF2 can be reconstructed from the parental RILs, which are perpetual, hence the name immortalized F2 population. Zhang et al. (2014) identified 54 main QTLs for five kernel-related traits in maize using an immortalized F2 population consisting of 243 crosses.

NILs (Near Isogenic Lines): Lines produced by backcrossing the F1 to the recurrent parent for several generations, while selecting strictly for the trait introgressed from the donor parent (the trait of interest), constitute NILs. NILs also make it possible to integrate genes into existing linkage maps and to identify molecular markers located near the introgressed gene. They are a homozygous and perpetual mapping resource and can be used to construct a high-resolution mapping population. NILs and CSSLs are suitable for fine mapping and map-based cloning.

CSSLs (Chromosome Segment Substitution Lines): A group of lines, each carrying a single distinct chromosome segment from the donor parent in the genetic background of the recurrent parent. It is a perpetual mapping population suitable for mapping both oligogenes and QTLs and for detecting QTLs with small additive effects.


Table 1: The types of mapping populations and their expected ratio with dominant marker and codominant marker.



B. Development of Linkage Map

Mapping means simply arranging the markers in order, showing the genetic distance between them, and assigning them to linkage groups based on the recombination values obtained from all pairwise combinations. The linkage map shows the position and relative genetic distance between markers along the chromosome. By genotyping the mapping population with polymorphic molecular markers, we may assess the segregation patterns for each of the markers. Individual QTLs have been identified and their effects and positions determined using a variety of molecular markers including RFLPs, RAPDs, SSRs, AFLPs, and SNPs.
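
The step from recombination values to genetic distance can be sketched as follows. The example assumes a population in which recombinants can be counted directly (for instance a backcross or DH population) and converts the recombination fraction to centimorgans with the Haldane and Kosambi mapping functions, two standard choices that the source does not name. The function names and the counts are hypothetical.

```python
import math


def recombination_fraction(recombinants, total):
    """Proportion of recombinant individuals between two markers."""
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need total > 0 and 0 <= recombinants <= total")
    return recombinants / total


def haldane_cm(r):
    """Haldane map distance in cM (assumes no crossover interference)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must satisfy 0 <= r < 0.5")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Kosambi map distance in cM (allows for crossover interference)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# Hypothetical marker pair: 12 recombinants among 120 individuals.
r = recombination_fraction(recombinants=12, total=120)
print(f"r = {r:.3f}")
print(f"Haldane distance: {haldane_cm(r):.2f} cM")
print(f"Kosambi distance: {kosambi_cm(r):.2f} cM")
```

For small recombination fractions the two functions give similar distances; they diverge as r approaches 0.5, where map distance is no longer well estimated from a single marker pair.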

Table shows the most widely used molecular markers:

Ø  Polymorphic markers are classed as dominant or codominant according to whether they can distinguish homozygotes from heterozygotes: codominant markers can, dominant markers cannot.

Ø  A codominant marker shows differences in band size, whereas a dominant marker is scored as present or absent.

Ø  The different-sized bands on a gel are referred to as marker alleles. A codominant marker may have many distinct alleles, but a dominant marker has only two.

Ø  This difference in the information they provide is why dominant and codominant markers are distinguished from each other.

Ø  The kind of markers (dominant or codominant) and the type of mapping population are both factors that influence the genetic segregation ratio at marker loci.
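
A common way to check the segregation ratio at a marker locus is a chi-square goodness-of-fit test. The sketch below uses textbook Mendelian expectations (1:2:1 for a codominant marker and 3:1 for a dominant marker in an F2; 1:1 for a codominant marker in BC, DH and RIL populations). These ratios are assumptions of the example and are not copied from Table 1. The critical values are the standard chi-square values at P = 0.05; the function names and the counts are hypothetical.

```python
# Textbook expected ratios, keyed by (population, marker type); assumed here.
EXPECTED_RATIOS = {
    ("F2", "codominant"): (1, 2, 1),
    ("F2", "dominant"): (3, 1),
    ("BC", "codominant"): (1, 1),
    ("DH", "codominant"): (1, 1),
    ("RIL", "codominant"): (1, 1),
}

# Chi-square critical values at P = 0.05 for 1 and 2 degrees of freedom.
CRITICAL_005 = {1: 3.841, 2: 5.991}


def chi_square(observed, ratio):
    """Chi-square statistic for observed class counts against a ratio."""
    if len(observed) != len(ratio):
        raise ValueError("observed counts and ratio must have equal length")
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * part / ratio_sum for part in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def fits_expected_ratio(observed, ratio):
    """True if the marker shows no significant segregation distortion."""
    df = len(ratio) - 1
    return chi_square(observed, ratio) < CRITICAL_005[df]


# Hypothetical codominant marker scored on 200 F2 plants.
counts = [48, 105, 47]
ratio = EXPECTED_RATIOS[("F2", "codominant")]
print(f"chi-square = {chi_square(counts, ratio):.2f}")
print("fits 1:2:1" if fits_expected_ratio(counts, ratio) else "distorted")
```

Markers that deviate significantly from the expected ratio are usually inspected before map construction, since segregation distortion can affect the estimated distances.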

C. Phenotyping of Mapping Population

Ø  The quantitative traits must be measured as accurately as possible. Ideally there should be no missing data, although a small amount can be tolerated.

Ø  Results obtained from an analysis with missing data are less robust. Missing data reduce the effective sample size, which in turn reduces the power of QTL mapping.

Ø  To produce a single quantitative value for each line, the data are pooled across locations and replications.

Ø  To better understand QTL x Environment interaction, the target traits must also be evaluated in trials conducted at several locations.

Ø  Robust, good-quality results require good-quality phenotypic data. Otherwise the phenotypic and genotypic data will not correspond, and the expected results will not be obtained.
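
Pooling the data across locations and replications can be sketched as below. This minimal example takes a simple arithmetic mean per line and skips missing observations (recorded as `None`). The record layout, the line names and the values are hypothetical, and a real analysis may prefer adjusted means from a statistical model over raw means.

```python
from collections import defaultdict
from statistics import mean


def line_means(records):
    """Pool phenotype records into one value per line.

    Each record is (line, location, replication, value); a value of None
    marks a missing observation and is skipped. Lines with no observed
    value at all are left out of the result.
    """
    values_by_line = defaultdict(list)
    for line, _location, _replication, value in records:
        if value is not None:
            values_by_line[line].append(value)
    return {line: mean(values) for line, values in values_by_line.items()}


# Hypothetical plant-height records (cm) from two locations, two replications.
records = [
    ("RIL_001", "Loc1", 1, 92.0),
    ("RIL_001", "Loc1", 2, 95.0),
    ("RIL_001", "Loc2", 1, 88.0),
    ("RIL_001", "Loc2", 2, None),   # missing plot
    ("RIL_002", "Loc1", 1, 110.0),
    ("RIL_002", "Loc1", 2, 108.0),
    ("RIL_002", "Loc2", 1, 104.0),
    ("RIL_002", "Loc2", 2, 106.0),
]

for line, value in sorted(line_means(records).items()):
    print(f"{line}: {value:.2f}")
```

Keeping the location column in the records also allows per-location means to be computed later when QTL x Environment interaction is of interest.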

D. QTL Detection

Ø  The primary goal of QTL mapping is to discover the loci associated with particular traits while avoiding false positives, which occur when a marker is declared to be associated with a QTL that does not exist. The methods and models for QTL detection described below are implemented in software that identifies the QTLs in the population.

Ø  Software used to identify QTLs from phenotypic and genotypic data is listed below:

·         QTL IciMapping

·         PLABQTL

·         Windows QTL Cartographer

·         MapQTL

·         GeneNetwork

Methods for QTL detection

1.   Single QTL mapping

a)   Single-Marker Analysis:

Ø Single-marker analysis estimates the association of each marker with the target trait separately.

Ø The phenotypic means of the different genotype groups are compared, and the significance of the differences between means is used to detect QTLs at or near the marker.

Ø Whether a QTL is detected depends on the recombination rate between the QTL and the marker and on the size of the QTL effect.

Ø It is the simplest of the QTL mapping methods: it does not require a linkage map and can be performed with statistical software such as QGene and MapManager QTX.

Ø Since the analysis does not estimate the recombination rate between the QTL and the marker, the exact position of the QTL in the genome remains unknown.

Ø Because the QTL effect is confounded with the marker-QTL distance, the effect size tends to be underestimated. The analysis also cannot determine whether more than one QTL is linked to a marker.
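
A minimal sketch of single-marker analysis follows. It assumes a population with two genotype classes per marker, coded "A" and "B", and compares the class means with Welch's t statistic; analysis of variance or regression on marker genotype are equally common choices. Only the t statistic is computed, not its P-value. The marker names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for the difference between two group means."""
    if len(group_a) < 2 or len(group_b) < 2:
        raise ValueError("each genotype group needs at least two individuals")
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


def single_marker_scan(marker_genotypes, phenotypes):
    """Test every marker separately; returns {marker: t statistic}."""
    results = {}
    for marker, genotypes in marker_genotypes.items():
        group_a = [y for g, y in zip(genotypes, phenotypes) if g == "A"]
        group_b = [y for g, y in zip(genotypes, phenotypes) if g == "B"]
        results[marker] = welch_t(group_a, group_b)
    return results


# Hypothetical data: eight lines, two markers.
phenotypes = [10, 12, 11, 13, 7, 8, 6, 9]
marker_genotypes = {
    "M1": ["A", "A", "A", "A", "B", "B", "B", "B"],  # tracks the trait
    "M2": ["A", "B", "A", "B", "A", "B", "A", "B"],  # weak association
}

for marker, t_value in single_marker_scan(marker_genotypes, phenotypes).items():
    print(f"{marker}: t = {t_value:.2f}")
```

Each marker is tested on its own, so the scan reports which markers are associated with the trait but says nothing about how far the QTL lies from them.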

b)       Simple Interval Mapping

Ø  Simple Interval Mapping uses a marker linkage map to search for QTLs at positions every 1-2 cM within each marker interval.

Ø  It produces a LOD score curve, which localises the QTL on the linkage map; the location is reported as a support interval.

Ø  It can also accommodate missing marker genotype data, which improves the reliability of the findings.

Ø  It is a statistically powerful, linkage-map-based tool for QTL detection.
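
Two quantities used in interval mapping can be sketched briefly. Under a normal regression model, the LOD score at a test position can be computed from the residual sums of squares of the models without and with a QTL, as LOD = (n/2) log10(RSS0/RSS1). The support interval is taken here as the contiguous region around the peak where the LOD stays within a fixed drop of the maximum; the 1-LOD drop is a widely used convention and is an assumption of this example, as are the function names and the LOD curve.

```python
import math


def lod_from_rss(n, rss_null, rss_qtl):
    """LOD score from residual sums of squares of the no-QTL and QTL models."""
    if n <= 0 or rss_null <= 0 or rss_qtl <= 0:
        raise ValueError("n and both residual sums of squares must be positive")
    return (n / 2.0) * math.log10(rss_null / rss_qtl)


def support_interval(positions, lods, drop=1.0):
    """Contiguous region around the LOD peak with LOD >= peak - drop.

    positions are map positions in cM, in increasing order; lods are the LOD
    scores at those positions. Returns (left, right) in cM.
    """
    if len(positions) != len(lods) or not lods:
        raise ValueError("positions and lods must be non-empty and equal length")
    peak = max(range(len(lods)), key=lambda i: lods[i])
    threshold = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= threshold:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= threshold:
        right += 1
    return positions[left], positions[right]


# Hypothetical LOD curve scanned every 2 cM along one linkage group.
positions = [0, 2, 4, 6, 8, 10]
lods = [0.5, 1.8, 3.2, 4.1, 3.4, 1.2]
print(f"LOD at one position: {lod_from_rss(100, 200.0, 160.0):.2f}")
print("1-LOD support interval (cM):", support_interval(positions, lods))
```

The resolution of the reported interval is limited by the scan step, so a finer step gives a more precise boundary.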

2.       Multiple QTL Mapping

a)       Composite Interval Mapping (CIM)

Ø  CIM combines interval mapping with multiple regression on additional markers (cofactors), which are selected by stepwise regression.

Ø  Markers are evaluated for significance in descending order of LOD score, and the significant ones are brought into the model as cofactors. In this way the entire genome is scanned for QTL detection and mapping.

Ø  It is implemented in QTL Cartographer, which has made it a widely used method for QTL mapping in biparental populations. It does not detect interacting QTLs, which is a limitation when epistasis exists among QTLs.

b)       Inclusive Composite Interval Mapping (ICIM)

Ø  The standard stepwise linear regression model of CIM is used to identify the markers most strongly linked with QTLs, and thereby the most significant QTLs affecting a trait.

Ø  ICIM can also detect dominance and two-gene epistasis. Its visibly higher LOD scores reflect improved QTL mapping power and precision compared with CIM.

Ø  The efficiency of its results is comparable with that of the Bayesian model of QTL mapping.

c) Joint Inclusive Composite Interval Mapping (JICIM)

Ø  JICIM is an extension of the ICIM algorithm designed for the joint analysis of multiple cross populations that have one parent in common, such as a NAM (Nested Association Mapping) population.

Ø  It allows simultaneous analysis of multiple alleles segregating at a QTL. It retains high QTL detection power even when the QTL is located in the centre of the marker interval, which is the most difficult position to detect.

d) Multiple Interval Mapping (MIM)

Ø MIM maps QTLs in multiple marker intervals simultaneously. If epistasis is present among multiple QTLs, MIM can detect it.

e)   Bayesian Multiple QTL mapping

Ø It uses reversible jump Markov Chain Monte Carlo (MCMC) for modelling. It combines the likelihood function with prior distributions rather than relying on maximum likelihood alone. It offers flexibility in the number and locations of QTLs and in handling missing QTL genotypes.

Ø Its major advantage is that it estimates the probability that a QTL exists in a given marker interval. It has not been widely used because of its limitations, including the heavy computation that MCMC requires.

Factors affecting the QTL Mapping

A.  Genes that control the target traits

Ø The performance and effectiveness of QTL mapping are influenced by the position of the gene on the chromosome.

Ø The target locus is more likely to be detected if the gene lies close to the relevant genetic marker, because the marker's banding pattern then tracks the gene.

Ø If the gene lies far from the marker, crossing over between them is more likely, and the marker's banding pattern no longer reliably reflects the gene.

Ø In that case it is difficult to determine where the target genes are actually located.

B.  Heritability in a mapping population

Ø Characters controlled by single genes or oligogenes have a higher heritability than those controlled by polygenes.

Ø Thus, when the trait of interest is governed largely by oligogenes, the phenotype is transmitted more faithfully to the next generation. Heritability therefore plays a major role in how well QTLs can be mapped in a population.
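
The source does not give a formula for heritability. As an illustration only, the sketch below uses a standard definition of broad-sense heritability on an entry-mean basis for a single environment, H2 = Vg / (Vg + Ve / r), where Vg is the genotypic variance, Ve the error variance and r the number of replications. The function name and the variance values are hypothetical.

```python
def entry_mean_heritability(var_genotype, var_error, n_reps):
    """Broad-sense heritability of line means in a single environment.

    Assumed model: H2 = Vg / (Vg + Ve / r).
    """
    if var_genotype < 0 or var_error < 0 or n_reps < 1:
        raise ValueError("variances must be >= 0 and n_reps >= 1")
    phenotypic_variance = var_genotype + var_error / n_reps
    if phenotypic_variance == 0:
        raise ValueError("phenotypic variance of line means is zero")
    return var_genotype / phenotypic_variance


# Hypothetical variance components for one trait.
for reps in (1, 3):
    h2 = entry_mean_heritability(var_genotype=4.0, var_error=12.0, n_reps=reps)
    print(f"{reps} replication(s): H2 = {h2:.2f}")
```

Under this model, replication raises the heritability of line means by shrinking the error term, which is one reason replicated trials are valued for QTL mapping.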

C.  Size of mapping population

Ø QTLs with large effects can be detected in both small and large populations, whereas QTLs with small effects are unlikely to be detected reliably, particularly in small populations. Small-effect QTLs that do appear in the results are of limited value in future breeding programmes.

Ø The quantitative traits of interest should be measured as precisely as feasible, with only a small proportion of missing data permitted.

Ø The power to resolve the QTL position is limited first by the sample size and then by the coverage of the genome by genetic markers.


D.  Type of mapping population used

Ø A non-random mating population is generally suited to QTL mapping.

Ø Such a population results from factors such as mutation, genetic drift, natural selection and crossing programmes.

E.  Types and Number of markers used

Ø The precision with which both QTL position and QTL effect are estimated increases as the number of markers used increases.

Ø A codominant marker reveals three genotype classes, whereas a dominant marker reveals only two. As a result, codominant markers give more information about recombination within marker intervals than dominant markers.

Applications of QTL Mapping

Ø Plant breeders do not need to know the exact position of a QTL when working on a trait such as disease or pest resistance, provided the QTL has a large effect and can be introgressed via marker assisted backcrossing (MABC).

Ø From a breeder's point of view, foreground and background selection are very important. Approaches such as Marker Assisted Selection (MAS), Marker Assisted Back Cross (MABC) and Marker Assisted Recurrent Selection (MARS) provide indirect selection of QTLs, which requires less time, labour, resources and space.

Ø QTL mapping removes the need to grow plants for several generations to obtain the desired progeny from a cross. This ultimately saves labour cost and time, since the desired line can be identified in the early years of the breeding programme.

Ø Crop improvement has benefited from QTLs discovered for a variety of traits in many crops, particularly in increasing production and developing disease-resistant elite lines.

Ø In the case of disease resistance, there is no requirement for lab work involving pathogen inoculation and disease development. A significant reduction in the linkage drag that occurs during QTL introgression into the target population can also be achieved.

Limitations of QTL Mapping

Ø QTL mapping is a low-resolution method for identifying genetic regions. A further drawback is that researchers are restricted to the genetic diversity present in the parents of the segregating population. Complex intercrosses can be used to improve the resolution.

Ø In expression QTL studies, existing genotyped strains and low-probe-density microarrays can give misleading information: probe mismatches create erroneous cis-eQTLs that cannot be totally eliminated.

Ø A large quantity of pure DNA is required to identify QTLs in a study.

Ø Polymorphism may be limited, especially when working with closely related lines.

Ø Identifying QTLs with SSR markers requires a particular kind of gel electrophoresis, polyacrylamide gel electrophoresis (PAGE).



















