Bioinformatics 2015; 31(21): 3468-75
ERGC: an efficient referential genome compression algorithm
Saha S; Rajasekaran S
Bioinformatics 2015[Nov]; 31(21): 3468-75
PMID: 26139636
MOTIVATION: Genome sequencing has become faster and more affordable.
Consequently, the number of available complete genomic sequences is increasing
rapidly. As a result, the cost to store, process, analyze and transmit the data
is becoming a bottleneck for research and future medical applications. The need
for efficient data compression and data reduction techniques for biological
sequencing data is therefore growing by the day. Although a number of standard
data compression algorithms exist, they are not efficient at compressing
biological data, because these generic algorithms do not exploit inherent
properties of the sequencing data. To exploit the statistical and
information-theoretic properties of genomic sequences, we need specialized
compression algorithms. Five different next-generation sequencing data
compression problems have been identified and studied in the literature. We
propose a novel algorithm for one of these problems, known as reference-based
genome compression.

RESULTS: We performed extensive experiments using five real
sequencing datasets. The results on real genomes show that our proposed algorithm
is competitive: it achieves better compression ratios than the currently
best-performing algorithms for this problem. The time to compress and decompress
a whole genome is also very promising.

AVAILABILITY AND IMPLEMENTATION: The
implementations are freely available for non-commercial purposes. They can be
downloaded from http://engr.uconn.edu/~rajasek/ERGC.zip.

CONTACT: rajasek@engr.uconn.edu.
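
The abstract names the problem of reference-based genome compression without describing the algorithm. As a rough illustration of the general idea only, the sketch below encodes a target sequence as copies from a shared reference plus literal bases. It is not ERGC itself: the greedy k-mer matching, the function names (`build_index`, `compress`, `decompress`), the operation format, and the toy sequences are all assumptions made for this example.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical operation format for this sketch:
#   ("M", ref_pos, length)  copy `length` bases from the reference at ref_pos
#   ("L", text)             literal bases that had no match in the reference
Op = Union[Tuple[str, int, int], Tuple[str, str]]


def build_index(reference: str, k: int) -> Dict[str, int]:
    """Map each k-mer to the first position where it occurs in the reference."""
    index: Dict[str, int] = {}
    for i in range(len(reference) - k + 1):
        index.setdefault(reference[i:i + k], i)
    return index


def compress(target: str, reference: str, k: int = 4) -> List[Op]:
    """Greedily encode the target as reference copies and literals."""
    index = build_index(reference, k)
    ops: List[Op] = []
    literal: List[str] = []
    i = 0
    while i < len(target):
        # Look up the k-mer starting at i, if a full k-mer is still available.
        pos = index.get(target[i:i + k]) if i + k <= len(target) else None
        if pos is None:
            literal.append(target[i])
            i += 1
            continue
        # Extend the seed match for as long as target and reference agree.
        length = k
        while (i + length < len(target)
               and pos + length < len(reference)
               and target[i + length] == reference[pos + length]):
            length += 1
        # Flush pending literal bases before recording the match.
        if literal:
            ops.append(("L", "".join(literal)))
            literal = []
        ops.append(("M", pos, length))
        i += length
    if literal:
        ops.append(("L", "".join(literal)))
    return ops


def decompress(ops: List[Op], reference: str) -> str:
    """Rebuild the target from the operations and the same reference."""
    parts: List[str] = []
    for op in ops:
        if op[0] == "M":
            _, pos, length = op
            parts.append(reference[pos:pos + length])
        else:
            parts.append(op[1])
    return "".join(parts)


if __name__ == "__main__":
    # Toy sequences: the target differs from the reference by one substitution.
    reference = "ACGTACGTTTGACCA"
    target = "ACGTACGATTGACCA"
    ops = compress(target, reference)
    print(ops)
    print(decompress(ops, reference) == target)
```

Because two genomes of the same species are nearly identical, most of the target collapses into a few long copy operations, which is the property that generic compressors cannot exploit without access to the reference. A practical tool would additionally entropy-code the operations and handle much longer sequences. Those steps are omitted here.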