Probabilistic PCA of censored data: accounting for uncertainties in the
visualization of high-throughput single-cell qPCR data
Buettner F; Moignard V; Göttgens B; Theis FJ
Bioinformatics 2014[Jul]; 30(13): 1867-75
PMID: 24618470
MOTIVATION: High-throughput single-cell quantitative real-time polymerase chain
reaction (qPCR) is a promising technique allowing for new insights in complex
cellular processes. However, a PCR product can be detected only up to a
certain detection limit, and a failed reaction may reflect either low or absent
expression, so the true expression level is unknown. Because this censoring can
occur for high proportions of the data, it is one of the main challenges when
dealing with single-cell qPCR data. Principal component analysis (PCA) is an
important tool for visualizing the structure of high-dimensional data as well as
for identifying subpopulations of cells. However, to date it is not clear how to
perform a PCA of censored data. We present a probabilistic approach that accounts
for the censoring and evaluate it for two typical datasets containing single-cell
qPCR data. RESULTS: We use the Gaussian process latent variable model framework
to account for censoring by introducing an appropriate noise model and allowing a
different kernel for each dimension. We evaluate this new approach for two
typical qPCR datasets (of mouse embryonic stem cells and blood stem/progenitor
cells, respectively) by performing linear and non-linear probabilistic PCA.
Taking the censoring into account results in a 2D representation of the data,
which better reflects its known structure: in both datasets, our new approach
results in a better separation of known cell types and is able to reveal
subpopulations in one dataset that could not be resolved using standard PCA.
AVAILABILITY AND IMPLEMENTATION: The implementation was based on the existing
Gaussian process latent variable model toolbox
(https://github.com/SheffieldML/GPmat); extensions for noise models and kernels
accounting for censoring are available at
http://icb.helmholtz-muenchen.de/censgplvm.
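
The abstract does not spell out the noise model, so the following is only a minimal sketch of one common way to account for censoring in a Gaussian likelihood (a Tobit-style model), not the authors' implementation. It assumes that expression values are on a scale where a failed reaction means the true value lies at or below a detection limit, that censored entries are marked with None, and that the latent means are supplied by some model (for example a latent variable model). The function names, the None encoding, and the parameters sigma and limit are all hypothetical.

```python
import math


def normal_logpdf(y, mean, sigma):
    """Log density of a Gaussian with the given mean and standard deviation."""
    z = (y - mean) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def normal_logcdf(x, mean, sigma):
    """Log of P(Y <= x) for a Gaussian; 0.5 * erfc(-z / sqrt(2)) is the normal CDF.

    Note: this simple form underflows when x lies far below the mean.
    """
    z = (x - mean) / sigma
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))


def censored_loglik(observations, latent_means, sigma, limit):
    """Sum the log-likelihood over measurements (assumed encoding: None = censored).

    Observed values contribute the Gaussian density; censored values contribute
    the probability mass at or below the detection limit.
    """
    total = 0.0
    for y, f in zip(observations, latent_means):
        if y is None:
            total += normal_logcdf(limit, f, sigma)
        else:
            total += normal_logpdf(y, f, sigma)
    return total


# Hypothetical values: two detected measurements and one failed reaction.
print(censored_loglik([2.1, None, 0.7], [2.0, -1.0, 1.0], sigma=0.5, limit=0.0))
```

Under this sketch, a censored entry favours latent means below the limit without pinning them to a specific value, which is the kind of uncertainty that replacing failed reactions with a fixed number before standard PCA would ignore.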