Our newsletter reports the latest news in quantitative real-time PCR
(qPCR and qRT-PCR), as compiled and summarised on the Gene
Quantification homepage. The focus of this newsletter issue is:
qPCR Biostatistics and qPCR BioInformatics
qPCR data analysis
SSC - A data-driven clustering method for time course gene
Distribution-insensitive cluster analysis in SAS on real-time PCR
gene expression data of steadily expressed genes
BiSearch Web Server
GenEx is the ultimate tool for Real-Time PCR expression profiling
Relative Expression Software Tools (REST) released in summer 2006
TATAA course in qPCR Biostatistics
BioInformatics in real-time qPCR and qRT-PCR
Bioinformatics is a multidisciplinary approach to describe, model and
understand biological processes on the basis of information on genes,
proteins and metabolism. It uses computers, databases and algorithms
to link information and translate it back into biology, physiology or
BioInformatics = Database Management Systems, Data Mining, Sample
Tracking, Information Management, Data Acquisition, Data Analysis,
Statistics, Pattern Recognition & Classification, Simulation & Modeling
Bioinformatics initially centered on sequence and genome analysis but
now the extensive use of microarrays, mass spectrometry, qPCR and
qRT-PCR, has stimulated bioinformatic work in data acquisition, signal
processing, and data mining. Simulation and modeling are also becoming
increasingly important areas of focus in bioinformatics, and will
ultimately lead to a new level of understanding of the networks in
metabolism: Genomics, Transcriptomics, Splicomics, Proteomics,
To date, there are three popular methods for quantifying the change in
mRNA levels: the Standard Curve Method, the Pfaffl Method, and the
Delta-Delta (DD) CT Method. All three methods use the change in
fluorescence during the reaction steps as a basis for quantification.
This works because signal intensity is directly proportional to the
amount of PCR product in a given reaction, so the change in PCR product
can be quantified from the change in fluorescent signal intensity. The
cycle number at which the fluorescent signal crosses the "threshold",
set within the exponential phase of amplification, is referred to as
the Ct. There are advantages and disadvantages to each approach, so it
is important to select the most appropriate method for your research
goals. This section will provide you with additional information on
these three methods of data analysis along with a few advantages and
disadvantages for each.
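
The three quantification methods named above can be sketched in a few
lines of Python. This is only a minimal illustration, not code from any
of the tools in this newsletter: the function names and the example Ct
values are hypothetical, and the efficiency E is expressed as the
amplification factor per cycle (E = 2 means perfect doubling).

```python
import math


def standard_curve_fit(log10_quantities, ct_values):
    """Least-squares line Ct = slope * log10(quantity) + intercept."""
    n = len(ct_values)
    mean_x = sum(log10_quantities) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_quantities, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def efficiency_from_slope(slope):
    """Amplification factor per cycle from the standard curve slope."""
    return 10 ** (-1.0 / slope)


def ddct_ratio(ct_target_control, ct_ref_control,
               ct_target_sample, ct_ref_sample):
    """Delta-Delta Ct: assumes both assays double perfectly (E = 2)."""
    delta_control = ct_target_control - ct_ref_control
    delta_sample = ct_target_sample - ct_ref_sample
    return 2 ** -(delta_sample - delta_control)


def pfaffl_ratio(e_target, e_ref,
                 ct_target_control, ct_target_sample,
                 ct_ref_control, ct_ref_sample):
    """Efficiency-corrected ratio: each gene uses its own efficiency."""
    target_change = e_target ** (ct_target_control - ct_target_sample)
    ref_change = e_ref ** (ct_ref_control - ct_ref_sample)
    return target_change / ref_change


# Hypothetical Ct values: target gene crosses 3 cycles earlier in the
# treated sample, reference gene is unchanged.
print(ddct_ratio(25.0, 18.0, 22.0, 18.0))
print(pfaffl_ratio(2.0, 2.0, 25.0, 22.0, 18.0, 18.0))
```

With E = 2 for both genes the efficiency-corrected ratio reduces to the
Delta-Delta Ct result; the two differ as soon as the measured
efficiencies deviate from perfect doubling.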
Vaerman JL, Saussoy P, Ingargiola I. J Biol Regul Homeost Agents. 2004
Apr-Jun;18(2):212-4. UCL, Cliniques Saint Luc, Bruxelles, Belgium.
If real-time PCR is to be of much worth to its user, some idea
regarding the reliability of its data is essential. We discuss here
some of the problems associated with interpreting numerical real-time
PCR data that lend themselves to analytical evaluation. We translate
into the language of molecular biology some of the criteria which are
used to evaluate the performance of any new method (linearity,
precision, specificity, limit of detection and quantification).
A data-driven clustering method for time course gene expression data.
Ma P, Castillo-Davis CI, Zhong W, Liu JS. Nucleic Acids Res. 2006 Mar
1;34(4):1261-9. Print 2006.
Gene expression over time is, biologically, a continuous process and
can thus be represented by a continuous function, i.e. a curve.
Individual genes often share similar expression patterns (functional
forms). However, the shape of each function, the number of such
functions, and the genes that share similar functional forms are
typically unknown. Here we introduce an approach that allows direct
discovery of related patterns of gene expression and their underlying
functions (curves) from data without a priori specification of either
cluster number or functional form. Smoothing spline clustering (SSC)
models natural properties of gene expression over time, taking into
account natural differences in gene expression within a cluster of
similarly expressed genes, the effects of experimental measurement
error, and missing data. Furthermore, SSC provides a visual summary of
each cluster's gene expression function and goodness-of-fit by way of a
'mean curve' construct and its associated confidence bands. We apply
this method to gene expression data over the life-cycle of Drosophila
melanogaster and Caenorhabditis elegans to discover 17 and 16 unique
patterns of gene expression in each species, respectively. New and
previously described expression patterns in both species are
discovered, the majority of which are biologically meaningful and
exhibit statistically significant gene function enrichment. Software
and source code implementing the algorithm, SSClust, is freely
Cluster analysis is a tool often employed in micro-array techniques
but used less in real-time PCR. Herein we present core SAS code that
takes the correlation coefficient, instead of Euclidean distances,
as a dissimilarity measure. The dissimilarity measure is made robust
using a rank-order correlation coefficient rather than a parametric
one. There is no need for an overall probability adjustment like in
scoring methods based on repeated pair-wise comparisons. The rank-order
correlation matrix gives a good base for the clustering procedure of
gene expression data obtained by real-time RT-PCR as it disregards the
different expression levels. Associated with each cluster is a linear
combination of the variables in the cluster, which is the first
principal component. A large set of variables can then be replaced by the
set of cluster components with little loss of information. In this way,
distinct clusters containing unregulated housekeeping genes along with
other steadily expressed genes can be disclosed and utilized for
standardization purposes. Simulated data in parallel with the data from
a biological experiment were taken to validate the SAS macro. For both
cases, good intuitive results were obtained.
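
The idea of a rank-order correlation as a dissimilarity measure can be
illustrated outside SAS. The sketch below is not the authors' macro: it
is plain Python, the gene names and Ct values are hypothetical, and
converting the Spearman coefficient rho into a dissimilarity as
1 - rho is an assumption made here for illustration.

```python
import math


def average_ranks(values):
    """Rank values 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank-order correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def dissimilarity_matrix(profiles):
    """Pairwise 1 - rho for a dict mapping gene name -> list of Ct values."""
    return {(a, b): 1.0 - spearman(profiles[a], profiles[b])
            for a in profiles for b in profiles}


# Hypothetical Ct values for three genes measured in five samples.
profiles = {
    "gene_a": [20.1, 21.3, 19.8, 22.0, 20.7],
    "gene_b": [28.0, 30.5, 27.1, 33.0, 29.2],  # same pattern, higher Ct
    "gene_c": [25.0, 24.0, 25.5, 23.0, 24.5],  # opposite pattern
}
distances = dissimilarity_matrix(profiles)
print(distances[("gene_a", "gene_b")])
print(distances[("gene_a", "gene_c")])
```

Because only the ranks enter the calculation, gene_a and gene_b come
out as identical in pattern despite their very different expression
levels, which is the property the abstract relies on.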
Statistical practice in high-throughput screening data analysis.
Malo N, Hanley JA, Cerquozzi S, Pelletier J, Nadon R. Nat Biotechnol.
2006 Feb;24(2):167-75. McGill University and Genome Quebec Innovation
Centre, 740 avenue du Docteur Penfield, Montreal, Quebec, Canada, H3A
High-throughput screening is an early critical step in drug discovery.
Its aim is to screen a large number of diverse chemical compounds to
identify candidate 'hits' rapidly and accurately. Few statistical tools
are currently available, however, to detect quality hits with a high
degree of confidence. We examine statistical aspects of data
preprocessing and hit identification for primary screens. We focus on
concerns related to positional effects of wells within plates, choice
of hit threshold and the importance of minimizing false-positive and
false-negative rates. We argue that replicate measurements are needed
to verify assumptions of current methods and to suggest data analysis
strategies when assumptions are not met. The integration of replicates
with robust statistical methods in primary screens will facilitate the
discovery of reliable hits, ultimately improving the sensitivity and
specificity of the screening process.
Aranyi T, Varadi A, Simon I, Tusnady GE. BMC Bioinformatics. 2006 ;7:
431. Institute of Enzymology, BRC, HAS, H-1113 Karolina ut 29,
BACKGROUND: A large number of PCR primer-design tools are available
online. However, only very few of them can be used for the design of
primers to amplify bisulfite-treated DNA templates, necessary to
determine genomic DNA methylation profiles. Indeed, the number of
studies on bisulfite-treated templates exponentially increases as
determining DNA methylation becomes more important in the diagnosis of
cancers. Bisulfite-treated DNA is difficult to amplify since undesired
PCR products are often amplified due to the increased sequence
redundancy after the chemical conversion. In order to increase the
efficiency of PCR primer-design, we have developed BiSearch web server,
an online primer-design tool for both bisulfite-treated and native DNA
templates. RESULTS: The web tool is composed of a primer-design and an
electronic PCR (ePCR) algorithm. The completely reformulated ePCR
module detects potential mispriming sites as well as undesired PCR
products on both cDNA and native or bisulfite-treated genomic DNA
libraries. Due to the new algorithm of the current version, the ePCR
module became approximately hundred times faster than the previous one
and gave the best performance when compared to other web based tools.
This high-speed ePCR analysis made possible the development of the new
option of high-throughput primer screening. BiSearch web server is
available online to academic researchers.
CONCLUSION: BiSearch web server is a useful tool for primer-design for
any DNA template and especially for bisulfite-treated genomes. The ePCR
tool for fast detection of mispriming sites and alternative PCR
products in cDNA libraries and native or bisulfite-treated genomes are
the unique features of the new version of BiSearch software.
BiSearch: primer-design and search tool for PCR on bisulfite-treated
Tusnady GE, Simon I, Varadi A, Aranyi T.
Nucleic Acids Res. 2005 Jan 13;33(1):e9.
Institute of Enzymology, BRC, Hungarian Academy of Sciences H-1113
Budapest, Karolina ut 29, Hungary.
Bisulfite genomic sequencing is the most widely used technique to
analyze the 5-methylation of cytosines, the prevalent covalent DNA
modification in mammals. The process is based on the selective
transformation of unmethylated cytosines to uridines. Then, the
investigated genomic regions are PCR amplified, subcloned and
sequenced. During sequencing, the initially unmethylated cytosines are
detected as thymines. The efficacy of bisulfite PCR is generally low;
mispriming and non-specific amplification often occurs due to the T
richness of the target sequences. In order to ameliorate the efficiency
of PCR, we developed a new primer-design software called BiSearch,
available on the World Wide Web. It has the unique property of
analyzing the primer pairs for mispriming sites on the
bisulfite-treated genome and determines potential non-specific
amplification products with a new search algorithm. The options of
primer-design and analysis for mispriming sites can be used
sequentially or separately, both on bisulfite-treated and untreated
sequences. In silico and in vitro tests of the software suggest that
new PCR strategies may increase the efficiency of the amplification.
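
The conversion described in this abstract, and the T richness it
causes, can be illustrated with a short in silico sketch. This is not
BiSearch code: the function names, the example sequence and the
methylated position are hypothetical, and only one strand is treated.

```python
def bisulfite_convert(sequence, methylated_positions=()):
    """Return the sequence as read after bisulfite treatment and PCR.

    Unmethylated cytosines end up as T; cytosines whose 0-based index
    is listed in methylated_positions are protected and stay C.
    """
    methylated = set(methylated_positions)
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def base_fraction(sequence, base):
    """Fraction of positions in the sequence occupied by the given base."""
    return sequence.count(base) / len(sequence)


# Hypothetical 12-base template with one methylated cytosine (index 5).
template = "ACGTCCGATCGA"
treated = bisulfite_convert(template, methylated_positions=[5])
print(treated)
print(base_fraction(template, "T"), base_fraction(treated, "T"))
```

In this toy example the share of T rises from 2 of 12 bases to 5 of 12,
which shows on a small scale why primers designed on converted
sequences find more potential mispriming sites.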
GenEx is the ultimate tool for Real-Time PCR expression profiling
GenEx was developed in collaboration with TATAA Biocenter. GenEx
offers advanced methods to handle real-time PCR data behind a
user-friendly interface. The methods are well suited to selecting and
validating reference genes, classifying samples, monitoring
time-dependent processes and the like.
GenEx ver. 4.1.7 will be released in Autumn 2006.
Optimal use of real-time PCR measurements requires proper analysis of
real-time PCR data. The DATAN framework provides the appropriate tools to
analyze real-time PCR gene expression data and to extract valuable
information from the measurements. Features in this user-friendly
· Advanced plotting functions
· Grouping of data
· Handling of missing data
· Hierarchical Clustering to find associations between data
· Kohonen Neural Networks to classify data
· Pearson to calculate correlations between genes
· Principal Component Analysis to find hidden structures in data
· Scaling and normalization options for gene expression data
· Geometric averaging for assessing best reference gene
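
As an illustration of the geometric averaging mentioned in the last
item, the sketch below normalises a target gene against the geometric
mean of several reference genes. It is not taken from GenEx; the
function names and the relative quantities are hypothetical.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalise(target_quantity, reference_quantities):
    """Divide the target quantity by the reference genes' geometric mean."""
    return target_quantity / geometric_mean(reference_quantities)


# Hypothetical relative quantities for one sample.
reference_quantities = [0.8, 1.25, 1.0]
print(normalise(2.0, reference_quantities))
```

The geometric mean is less sensitive than the arithmetic mean to a
single reference gene with an outlying value, which is the usual
argument for averaging reference genes this way.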
GenEx can be purchased from authorized MultiD representatives
You can also be invoiced for your copy of GenEx by mailing us the
serial number that appears on the registration screen of your trial
version. Price per license (excl. VAT)
GenEx Light: Editor for pre-processing of gene expression data, Scatter
plots, Advanced 2D and 3D graphics, Genorm, Normfinder (from September
2006), Powerful data management system.
GenEx Professional: All functionality in GenEx Light + Principal
Component Analysis, Hierarchical Clustering, Self-Organizing Map.
GenEx Enterprise: All functionality in GenEx Professional + Partial
Least Square analysis, Neural Networks etc. GenEx Enterprise is
scheduled for spring
To purchase, please e-mail us the serial number of your copy.
Visit the Academic & Industrial Information qPCR Platform - The
reference in qPCR !
A lot of qPCR companies and independent institutions participate in our
qPCR information platform. On the company sub-pages new real-time
cyclers, consumables, newly released kits and detection chemistries, as
well as innovative amplification technologies are presented. Links from
the sub-pages will lead you to the related product pages at the
respective homepage. You are welcome to join our Academic & Industrial
qPCR platform with your company or institution !
TATAA Biocenter have found that the worldwide demand for training in
the field of quantitative real-time PCR (qPCR) is huge. To coordinate
this we aim to arrange practical courses, e.g. 3-day Core Module and
2-day Biostatistics Module:
* The basic qPCR Core Module contains three workshop days:
o First workshop day will be directed to people planning or
considering using qPCR in their research and also users not yet fully
familiar with quantitative PCR.
o Second day targets more advanced users and people
concentrating on different quantification strategies.
o The third day focuses on aspects in sample preparation and
* The additional qPCR Biostatistics Module explains statistics
applicable to qPCR and teaches how to use statistics to interpret qPCR
gene expression data, and classify samples based on qPCR expression
* Courses contain both theoretical seminars and practical hands-on
training with experienced supervision.
* Practical training will be performed on different real-time PCR
cyclers, using multiple detection chemistries.
* The Biostatistics Module is further based on computer-based
demonstrations. Please bring your own Laptop !
qPCR courses are held regularly in Göteborg, Sweden and in
Freising-Weihenstephan, Germany (near Munich, very close to the Munich
Airport - MUC). Depending on the occasion different prices may apply.
Also different course modules are available on the different occasions.
Further customized workshops and specialized trainings will be held as
well across Europe and world-wide. TATAA Biocenter Germany courses are
held in cooperation with the Institute of Physiology, located at the
Technical University of Munich, in Freising-Weihenstephan.
Course Occasions 2006:
* 20th - 22nd November 2006 3-day qPCR Core Module ( fully booked
* 23rd - 24th November 2006 2-day qPCR Biostatistics Module (
still available )
Forward: Please send the qPCR NEWS on to other scientists and friends
who are interested in qPCR and in our Academic & Industrial Information
Platform for qPCR.
Michael W. Pfaffl
Responsible Editor of the Gene Quantification Pages