An integrative method to normalize RNA-Seq data
Filloux C; Cédric M; Romain P; Lionel F; Christophe K; Dominique R; Abderrahman M; Daniel P
BMC Bioinformatics 2014 Jun; 15: 188
PMID: 24929920
BACKGROUND: Transcriptome sequencing is a powerful tool for measuring gene
expression but, as with other technologies, various artifacts and biases
affect the quantification. To correct some of them, several
normalization approaches have emerged, differing both in the statistical strategy
employed and in the type of corrected biases. However, there is no clear standard
normalization method. RESULTS: We present a novel methodology to normalize
RNA-Seq data, taking into account transcript size, GC content, and sequencing
depth, which are the major quantification-related biases. In this study, we first found
that the expression level of transcripts shorter than 600 bp is underestimated,
while that of longer transcripts is overestimated, increasingly so as their length grows. Second,
it is well known that, above 50% GC content, the higher the GC content, the more transcript
expression is underestimated. Third, we demonstrated that the sequencing depth impacts the
size bias and proposed a correction allowing the comparison of expression levels
among many samples. The efficiency of our approach was then tested by comparing
the correlation between normalized RNA-Seq data and qRT-PCR expression
measurements. All the steps are automated in a program written in Perl and
available on request. CONCLUSIONS: The methodology presented in this article
identifies and corrects different biases that influence RNA-Seq quantification,
and provides more accurate estimations of gene expression levels. This method can
be used to compare expression quantifications across many samples, preferably
from the same tissue. Comparing samples from different
tissues requires a calibration using several reference genes.
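
The abstract names three bias covariates (transcript size, GC content, sequencing depth) and a validation step (correlation with qRT-PCR), but does not give the correction formulas. As a rough illustration only, the Python sketch below computes GC content, applies the standard RPKM scaling for length and depth as a generic stand-in (it is not the correction proposed in the article, whose implementation is a Perl program available on request), and measures agreement with qRT-PCR using a Pearson coefficient. All function names and all numeric values are hypothetical and were invented for this example.

```python
from math import sqrt


def gc_fraction(sequence: str) -> float:
    """Fraction of G and C bases in a transcript sequence."""
    seq = sequence.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def rpkm(read_count: int, length_bp: int, total_mapped_reads: int) -> float:
    """Standard RPKM: reads per kilobase of transcript per million mapped reads.

    Generic stand-in for length and depth scaling, NOT the article's method.
    """
    return read_count * 1e9 / (length_bp * total_mapped_reads)


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: (read count, transcript length in bp).
transcripts = [(500, 2000), (120, 600), (900, 4500)]
total_mapped_reads = 10_000_000  # hypothetical sequencing depth

normalized = [rpkm(count, length, total_mapped_reads)
              for count, length in transcripts]
qpcr = [2.4, 2.1, 1.9]  # hypothetical qRT-PCR measurements

print("normalized expression:", normalized)
print("correlation with qRT-PCR:", pearson(normalized, qpcr))
```

A higher correlation after normalization than before would suggest, under these assumptions, that the correction moves RNA-Seq estimates closer to the qRT-PCR measurements, which is the kind of comparison the abstract describes.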