QTL Mapping in Plant Breeding
QTL Mapping
QTL mapping is a statistical analysis that identifies which molecular markers are associated with quantitative variation in a particular trait.
Principle of QTL Mapping
Ø The objective of QTL mapping is to find associations between genotypic and phenotypic data. Molecular markers are used to divide the mapping population into distinct genotypic classes according to the genotype at the marker locus, and statistical tests are used to see whether individuals of one genotype differ significantly from those of the other genotype for the trait under investigation.
Ø A significant difference in phenotypic means between the two or more groups (the number depends on the marker system and population type) suggests that the marker locus used to split the mapping population is linked to a QTL controlling the trait.
Ø Recombination between a marker and a QTL weakens this association and so leads to a larger P-value. The closer a marker is to a QTL, the less likely it is that the marker and QTL will recombine.
Ø As a result, a tightly linked marker and QTL are usually inherited together from generation to generation, and the mean of the group carrying the tightly linked marker will be significantly different (P < 0.05) from the mean of the group without the marker.
Ø When a marker is loosely linked or unlinked to a QTL, the marker and QTL segregate independently. In this case, the presence or absence of the loosely linked marker makes no significant difference to the genotype group means.
Ø Unlinked markers, located far from the QTL or on a different chromosome, are inherited independently of it, so there is no significant difference between genotype group means.
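The comparison of genotype group means described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the marker has two classes coded "A" and "B" (as in a DH or backcross population), the data are made up, and a permutation test stands in for whatever significance test a given QTL package actually uses. The function names are hypothetical.

```python
import random
from statistics import mean


def class_mean_difference(genotypes, phenotypes):
    """Mean phenotype of marker class 'A' minus mean phenotype of class 'B'."""
    class_a = [p for g, p in zip(genotypes, phenotypes) if g == "A"]
    class_b = [p for g, p in zip(genotypes, phenotypes) if g == "B"]
    return mean(class_a) - mean(class_b)


def permutation_p_value(genotypes, phenotypes, n_perm=2000, seed=1):
    """Two-sided permutation P-value for the difference in class means.

    Phenotypes are shuffled relative to the marker genotypes, which breaks
    any marker-trait association while keeping the class sizes unchanged.
    """
    rng = random.Random(seed)
    observed = abs(class_mean_difference(genotypes, phenotypes))
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(class_mean_difference(genotypes, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimated P-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: 8 individuals per marker class.
genotypes = ["A"] * 8 + ["B"] * 8
# Trait that differs between marker classes (marker linked to a QTL).
linked_trait = [12.1, 11.8, 12.5, 11.9, 12.3, 12.0, 11.7, 12.4,
                9.9, 10.2, 9.6, 10.1, 9.8, 10.3, 9.7, 10.0]
# Trait with identical values in both classes (marker unlinked).
unlinked_trait = [9.0, 11.0, 10.5, 9.5, 10.0, 10.8, 9.2, 10.0] * 2

p_linked = permutation_p_value(genotypes, linked_trait)
p_unlinked = permutation_p_value(genotypes, unlinked_trait)
print("linked marker   P =", round(p_linked, 4))
print("unlinked marker P =", round(p_unlinked, 4))
```

With these made-up numbers the linked marker falls below the P < 0.05 cut-off mentioned above, while the unlinked marker does not.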
Steps involved in QTL Mapping
Several stages are involved in identifying and characterising quantitative trait loci (QTL) for use in marker-assisted selection. The four primary stages of QTL mapping are listed in the figure, and each is described in detail under its own subheading below.
A. Mapping Populations
Population development for QTL mapping requires selecting the target parents, crossing them and advancing the progeny to produce a set of lines segregating for the traits of interest. Mapping populations include F2 populations, RILs (Recombinant Inbred Lines), BC (Backcross) populations, DH (Doubled Haploid) lines, NILs (Near Isogenic Lines), CSSLs (Chromosome Segment Substitution Lines), F2-derived F3 populations and immortalised F2 populations. BC, DH, RIL and F2 populations are the most commonly used in QTL mapping (Zhang et al., 2010). Table 1 shows the types of mapping populations and their expected segregation ratios with dominant and codominant markers.
F2 population: The progeny produced by selfing the F1 individuals from a cross between the selected parents. It is well suited to mapping markers and oligogenes, as it takes only two generations to develop. F2 populations provide estimates of the additive, dominance and epistatic components of genetic variance. DeVicente and Tanksley (1993) used an F2 population to evaluate 11 quantitative traits in cultivated tomato and its wild relative.
F2-derived F3 population: Selfing the F2 individuals for one generation and harvesting the seeds from each F2 plant separately yields an F2-derived F3 population. These populations are suitable for mapping QTLs and oligogenic traits controlled by recessive genes.
Backcross population: The population obtained by crossing the F1 with either of its two parents. BC genetic analysis can be used when the target trait shows detectable phenotypic segregation. Because each backcross individual is genetically unique, these populations cannot be evaluated in replicated trials, which limits their usefulness for QTL mapping.
Doubled Haploids: Ovule or pollen cultures from F1 plants are treated with colchicine to double the chromosomes. DH lines are completely homozygous at all loci in the genome, with no residual heterozygosity. They can be multiplied and maintained indefinitely and can be shared for further research. They can also be evaluated in replicated trials, which makes them suitable for mapping both qualitative and quantitative traits. Because they contain no heterozygotes, dominance effects cannot be estimated from DH lines directly, so heterosis QTLs are mapped in crosses derived from them rather than in the lines themselves.
RILs (Recombinant Inbred Lines): A population obtained by continuous inbreeding of individual F2 plants. RILs help in detecting markers located much closer to the target gene than is possible with BC, F2 and DH populations, and they are twice as informative as F2 populations in terms of recombination. They are perpetual (permanent) mapping populations, as they can be maintained and propagated indefinitely. They are widely used for developing molecular marker linkage maps, detecting markers linked with qualitative and quantitative traits, and mapping QTLs and genes.
Immortalized F2 population: The F3 progeny of F2 plants are intermated in two groups, and the seeds from at least 20 plants, harvested in bulk, constitute an immortalised F2 (IF2) population. The population itself is ephemeral, like an F2, but the same IF2 can be reconstructed at any time from the parental RILs, which are perpetual; hence the name. Zhang et al. (2014) identified 54 main QTLs for five kernel-related traits in maize using an immortalised F2 population consisting of 243 crosses.
NILs (Near Isogenic Lines): Lines produced by backcrossing the F1 to the recurrent parent for several generations while selecting strictly for the trait introgressed from the donor parent (the trait of interest). NILs can be used to integrate genes into existing linkage maps and to identify molecular markers located near the introgressed gene. They are a homozygous and perpetual mapping resource and can be used to construct high-resolution mapping populations. NILs and CSSLs are suitable for fine mapping and map-based cloning.
CSSLs (Chromosome Segment Substitution Lines): A group of lines, each carrying a single distinct chromosome segment from the donor parent in the genetic background of the recurrent parent. It is a perpetual mapping population suitable for mapping both oligogenes and QTLs, and for detecting QTLs with small additive effects.
Table 1: The types of mapping populations and their expected ratio with dominant marker and codominant marker.
B. Development of Linkage Map
Mapping simply means arranging the markers in order, showing the genetic distance between them, and assigning them to linkage groups based on the recombination values obtained from all pairwise combinations. The linkage map shows the position of, and relative genetic distance between, markers along the chromosome. Genotyping the mapping population with polymorphic molecular markers lets us assess the segregation pattern of each marker. Individual QTLs have been identified, and their effects and positions determined, using a variety of molecular markers including RFLPs, RAPDs, SSRs, AFLPs and SNPs.
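As a rough illustration of how pairwise recombination values become map distances, the sketch below counts recombinants between two markers and converts the recombination fraction to centimorgans with the Haldane and Kosambi mapping functions. It assumes a DH- or backcross-type population in which each individual has one of two genotype codes per marker, so a recombinant is simply an individual whose two markers differ. The genotype strings and function names are hypothetical.

```python
import math


def recombination_fraction(marker1, marker2):
    """Fraction of individuals whose genotypes differ at the two markers.

    Individuals with a missing call (None) at either marker are skipped.
    """
    pairs = [(a, b) for a, b in zip(marker1, marker2)
             if a is not None and b is not None]
    recombinants = sum(1 for a, b in pairs if a != b)
    return recombinants / len(pairs)


def haldane_cm(r):
    """Haldane map distance in cM (assumes no crossover interference)."""
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Kosambi map distance in cM (allows for crossover interference)."""
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# Hypothetical genotype calls for 10 DH lines at two markers.
marker_1 = list("AAAAABBBBB")
marker_2 = list("AAAABBBBBA")

r = recombination_fraction(marker_1, marker_2)
print("recombination fraction:", r)
print("Haldane distance (cM): ", round(haldane_cm(r), 2))
print("Kosambi distance (cM): ", round(kosambi_cm(r), 2))
```

Both mapping functions require r below 0.5. For small r the two distances are nearly identical and close to 100 × r; they diverge as r grows.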
The table shows the most widely used molecular markers:
Ø Polymorphic markers are either dominant or codominant, according to whether they can distinguish homozygotes from heterozygotes.
Ø A codominant marker shows differences in band size, so both alleles are visible, whereas a dominant marker shows only the presence or absence of a band.
Ø The different-sized bands on a gel are referred to as marker alleles. A codominant marker may have many distinct alleles, but a dominant marker has only two.
Ø This is the basis on which dominant and codominant markers are distinguished from each other.
Ø Both the kind of marker (dominant or codominant) and the type of mapping population influence the genetic segregation ratio at marker loci.
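One common way to check a marker against its expected segregation ratio is a chi-square goodness-of-fit test. The sketch below assumes the standard Mendelian expectations for an F2 (1:2:1 for a codominant marker, 3:1 for a dominant marker); the counts are invented, the function names are hypothetical, and the critical values are the usual 5% chi-square thresholds for 1 and 2 degrees of freedom.

```python
# 5% critical values of the chi-square distribution, keyed by degrees of freedom.
CRITICAL_5PCT = {1: 3.841, 2: 5.991}


def chi_square_segregation(observed, expected_ratio):
    """Chi-square statistic for observed genotype counts vs an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * part / ratio_sum for part in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def fits_expected_ratio(observed, expected_ratio):
    """True if the marker does not deviate significantly at the 5% level."""
    degrees_of_freedom = len(observed) - 1
    statistic = chi_square_segregation(observed, expected_ratio)
    return statistic < CRITICAL_5PCT[degrees_of_freedom]


# Hypothetical F2 counts for a codominant marker (AA, AB, BB), expected 1:2:1.
codominant_ok = [24, 52, 24]
codominant_distorted = [40, 40, 20]
# Hypothetical F2 counts for a dominant marker (band present, band absent), expected 3:1.
dominant_ok = [70, 30]

print(chi_square_segregation(codominant_ok, [1, 2, 1]))
print(fits_expected_ratio(codominant_ok, [1, 2, 1]))
print(fits_expected_ratio(codominant_distorted, [1, 2, 1]))
print(fits_expected_ratio(dominant_ok, [3, 1]))
```

Markers that fail this test show segregation distortion and are often examined more closely before being placed on the linkage map.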
C. Phenotyping of Mapping Population
Ø The quantitative traits being measured must be recorded as accurately as possible. Ideally there should be no missing data, although a small amount can be tolerated.
Ø Results from an analysis with missing data are less robust, because missing data reduce the effective sample size and with it the power of QTL mapping.
Ø To produce a single quantitative value for each line, the data are pooled across locations and replications.
Ø To better understand QTL x Environment interaction, the target traits also need to be evaluated in trials at several locations.
Ø Robust, good-quality results require good-quality phenotypic data. Otherwise, the phenotypic and genotypic data will not correspond, and the analysis will not give the expected results.
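The pooling step can be illustrated with a small sketch. It assumes phenotypic records of the form (line, location, replicate, value), with None marking a missing observation, and it uses a plain arithmetic mean; real analyses often use adjusted means from a mixed model instead. The record layout, line names and function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def line_means(records):
    """One pooled value per line, averaged over all locations and replicates."""
    by_line = defaultdict(list)
    for line, _location, _replicate, value in records:
        if value is not None:  # skip missing observations
            by_line[line].append(value)
    return {line: mean(values) for line, values in by_line.items()}


def line_by_location_means(records):
    """One value per (line, location), useful for looking at QTL x Environment."""
    cells = defaultdict(list)
    for line, location, _replicate, value in records:
        if value is not None:
            cells[(line, location)].append(value)
    return {key: mean(values) for key, values in cells.items()}


# Hypothetical trial: 2 lines x 2 locations x 2 replicates, one missing value.
records = [
    ("RIL1", "LocA", 1, 10.0), ("RIL1", "LocA", 2, 12.0),
    ("RIL1", "LocB", 1, 14.0), ("RIL1", "LocB", 2, None),
    ("RIL2", "LocA", 1, 20.0), ("RIL2", "LocA", 2, 22.0),
    ("RIL2", "LocB", 1, 18.0), ("RIL2", "LocB", 2, 20.0),
]

pooled = line_means(records)
per_location = line_by_location_means(records)
print(pooled)
print(per_location)
```

Note that with missing data a simple mean over all observations weights the locations unequally, which is one reason missing values make the results less robust.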
D. QTL Detection
Ø The primary goal of QTL mapping is to discover the loci associated with particular traits while avoiding false positives, which occur when a marker is declared to be linked to a QTL that does not exist. The methods and models for QTL detection described below are implemented in software that identifies the QTLs in the population.
Ø The software used to identify QTLs from phenotypic and genotypic data includes:
· QTL IciMapping
· PLABQTL
· Windows QTL Cartographer
· MapQTL
· GeneNetwork
Methods for QTL detection
1. Single QTL mapping
a) Single-Marker Analysis:
Ø Single-marker analysis estimates the association of each marker with the target trait separately.
Ø The phenotypic means of the different genotype groups are compared, and a significant difference between means indicates a QTL at or near the marker.
Ø Whether a QTL is detected depends on the recombination rate between the QTL and the marker and on the size of the QTL effect.
Ø It is the simplest QTL mapping method to carry out: it does not depend on a linkage map and can be performed with statistical software such as QGene and MapManager QTX.
Ø Since the analysis does not estimate the recombination rate between QTL and marker, the exact position of the QTL in the genome remains unknown.
Ø Recombination between marker and QTL also causes the QTL effect size to be underestimated, and more than one QTL linked to the same marker cannot be resolved.
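The downward bias can be made concrete with a short calculation. Assume a population with two genotype classes per locus (for example DH lines or a backcross) and a marker at recombination fraction r from the QTL. Each marker class then contains a fraction 1 - r of the matching QTL genotype and a fraction r of the other, so the expected difference between marker class means is (1 - 2r) times the true difference between QTL genotypes. The function name and numbers below are hypothetical.

```python
def expected_marker_contrast(qtl_difference, r):
    """Expected difference between marker class means.

    qtl_difference: true difference between the two QTL genotype means.
    r: recombination fraction between marker and QTL (0 to 0.5).
    """
    return (1.0 - 2.0 * r) * qtl_difference


# A QTL with a true difference of 4 units, seen through markers at
# increasing recombination fractions.
for r in (0.0, 0.1, 0.3, 0.5):
    print(r, round(expected_marker_contrast(4.0, r), 2))

# A small QTL close to the marker and a large QTL far from it can give
# the same marker contrast, so effect size and position are confounded.
near_small = expected_marker_contrast(4.0, 0.1)
far_large = expected_marker_contrast(8.0, 0.3)
print(round(near_small, 2), round(far_large, 2))
```

This confounding of effect size with distance is what interval mapping methods try to untangle by using flanking markers.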
b) Simple Interval Mapping
Ø Simple Interval Mapping uses a marker linkage map to search for QTLs at steps of 1-2 cM within each marker interval.
Ø It produces a LOD score curve, which allows the QTL to be localised on the linkage map; the likely position is represented by a support interval.
Ø It also makes use of individuals with missing marker genotypes, which improves the reliability of the findings.
Ø It is a statistically powerful, linkage-map-based tool for QTL detection.
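To make the LOD curve and support interval more tangible, here is a minimal sketch. It assumes the LOD score is the base-10 logarithm of the likelihood ratio between a model with a QTL and a model without one, which for a normal regression model can be written in terms of residual sums of squares, and it reads a 1-LOD support interval off a LOD profile evaluated on a 2 cM grid. The profile values and function names are made up for illustration; no interpolation between grid points is attempted.

```python
import math


def lod_from_rss(n, rss_null, rss_qtl):
    """LOD score for a normal regression model with n individuals.

    rss_null: residual sum of squares with no QTL in the model.
    rss_qtl:  residual sum of squares with the putative QTL in the model.
    """
    return (n / 2.0) * math.log10(rss_null / rss_qtl)


def lod_support_interval(positions_cm, lod_scores, drop=1.0):
    """Return (peak position, left bound, right bound) of the support interval.

    The interval extends from the peak outwards for as long as the LOD
    score stays within `drop` units of the peak value.
    """
    peak = max(range(len(lod_scores)), key=lambda i: lod_scores[i])
    threshold = lod_scores[peak] - drop
    left = peak
    while left > 0 and lod_scores[left - 1] >= threshold:
        left -= 1
    right = peak
    while right < len(lod_scores) - 1 and lod_scores[right + 1] >= threshold:
        right += 1
    return positions_cm[peak], positions_cm[left], positions_cm[right]


# Hypothetical LOD profile along one linkage group, scanned every 2 cM.
positions = [0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
lods = [0.5, 0.9, 1.6, 2.4, 3.3, 4.1, 3.6, 2.9, 1.8, 1.0, 0.4]

peak_pos, left_pos, right_pos = lod_support_interval(positions, lods)
print("peak at", peak_pos, "cM; 1-LOD support interval:", left_pos, "to", right_pos, "cM")
```

A wider drop (for example 2 LOD units) gives a more conservative interval; the choice of drop is a convention rather than something fixed by the method.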
2. Multiple QTL Mapping
a) Composite Interval Mapping (CIM)
Ø CIM combines interval mapping with a multiple-QTL model in which additional markers, chosen by stepwise regression, are included as cofactors.
Ø Markers are evaluated for significance in descending order of LOD score, and the significant ones are brought into the model as cofactors; with these in place, the entire genome is scanned for QTL detection and mapping.
Ø It is implemented in QTL Cartographer, which has made it a widely used method for QTL mapping in biparental populations. It does not detect interacting QTLs, which is a limitation when epistasis is present among QTLs.
b) Inclusive Composite Interval Mapping (ICIM)
Ø ICIM uses the standard stepwise linear regression model of CIM to pick out the markers most strongly linked with QTLs, and so identifies the most significant QTLs affecting a trait.
Ø ICIM can also detect dominance and two-gene epistasis. It gives visibly higher LOD scores, which reflects its better QTL mapping power and precision compared with CIM.
Ø Its efficiency is comparable with that of the Bayesian model of QTL mapping.
c) Joint Inclusive Composite Interval Mapping (JICIM)
Ø JICIM is an extension of the ICIM algorithm designed for the joint analysis of multiple cross populations that share one common parent, such as a NAM (Nested Association Mapping) population.
Ø It allows simultaneous analysis of multiple alleles segregating at a QTL. It retains high QTL detection power even when the QTL lies in the centre of a marker interval, which is the most difficult position to detect.
d) Multiple Interval Mapping (MIM)
Ø Simultaneous QTL mapping in multiple marker intervals constitutes
MIM. If epistasis is present, MIM can detect it among multiple QTLs.
e) Bayesian Multiple QTL mapping
Ø It uses reversible jump Markov Chain Monte Carlo (MCMC) for modelling and combines the likelihood function with prior distributions. It offers flexibility in the number of QTLs, their locations and the handling of missing QTL genotypes.
Ø Its major advantage is that it estimates the probability that a QTL exists in a given marker interval. Despite this, its limitations have kept it from being widely used.
Factors affecting QTL Mapping
A. Genes that control the target traits
Ø The performance and effectiveness of QTL mapping are influenced by the position of the gene on the chromosome relative to the markers.
Ø Target loci are more likely to be detected when the genes lie close to the relevant genetic marker, because the marker's banding pattern then closely follows the genotype at the gene.
Ø When genes lie far from the marker, crossing over between them is more likely, and the marker's banding pattern no longer reliably tracks the gene.
Ø In that case it is difficult to work out where the target genes are actually located.
B. Heritability in a mapping population
Ø Traits controlled by single genes or oligogenes have higher heritability than those controlled by polygenes.
Ø Populations in which the trait of interest is largely under oligogenic control therefore transmit it faithfully to the next generation, so heritability plays an important role in choosing a mapping population for QTL mapping.
C. Size of mapping population
Ø QTLs with small effects are difficult to detect whatever the sample size, and those that are detected are of limited value in future breeding programmes; QTLs with large effects are detected more readily.
Ø The quantitative traits of interest should be measured as precisely as feasible, with only a small proportion of missing data permitted.
Ø The power to resolve QTL position is limited first by the sample size and then by the coverage of the genome by genetic markers.
D. Type of mapping population used
Ø A non-random mating population is generally suited to QTL mapping.
Ø Such populations result from factors such as mutation, genetic drift, natural selection, crossing programmes and so on.
E. Types and Number of markers used
Ø The precision with which both QTL position and QTL effect are estimated increases as the number of markers used increases.
Ø A co-dominant marker distinguishes three genotype classes, whereas a dominant marker distinguishes only two; as a result, co-dominant markers give more information about recombination within marker intervals than dominant markers.
Applications of QTL Mapping
Ø Plant breeders may not know the exact site of a QTL when working on a trait such as disease or pest resistance, but a QTL with a large effect can still be introgressed through marker-assisted backcrossing (MABC).
Ø From a breeder’s point of view, foreground and background selection are very important. Approaches such as Marker Assisted Selection (MAS), Marker Assisted Backcrossing (MABC) and Marker Assisted Recurrent Selection (MARS) select for QTLs indirectly, which requires less time, labour, resources and space.
Ø QTL mapping saves several generations of cultivation otherwise needed to obtain the desired progeny from a cross. This ultimately saves labour cost and time, because promising lines can be identified in the early years of the breeding programme.
Ø Crop improvement has benefited from QTLs discovered for a variety of traits in many crops, particularly in increasing production and developing disease-resistant elite lines.
Ø In the case of disease resistance, there is no need for lab work involving pathogen inoculation and disease development. The linkage drag that occurs during QTL introgression into the target population can also be reduced significantly.
Limitations of QTL Mapping
Ø QTL mapping is a low-resolution method for identifying genetic regions. A further drawback is that researchers are restricted to the genetic diversity present in the parents of the segregating population. Complex intercrosses can be used to improve the resolution.
Ø In expression QTL studies, existing genotype data for strains and low-probe-density microarrays can give wrong information, and mismatched probes create erroneous cis-eQTLs that cannot be totally eliminated.
Ø A large quantity of pure DNA is required to identify QTLs in a study.
Ø Polymorphism can be limited, especially when working with closely related lines.
Ø Identifying QTLs with SSR markers requires a special kind of gel electrophoresis, polyacrylamide gel electrophoresis (PAGE).