This function estimates and returns parameters needed for spike-in count simulations using supplementary code from Kim et al. 2016 (DOI: 10.1038/ncomms9687).
estimateSpike(spikeData, spikeInfo,
              MeanFragLengths = NULL,
              batchData = NULL,
              Normalisation = c('depth', 'none'),
              SampleFilter = 3,
              RNAseq = c("bulk", "singlecell"),
              Protocol = c('UMI', 'Read'),
              verbose = TRUE)
Argument | Description |
---|---|
spikeData | is a count matrix of spike-in expression. |
spikeInfo | is a molecule count data.frame. |
MeanFragLengths | is a numeric vector of the mean fragment length. Default is NULL. |
batchData | is a batch annotation data.frame. Default is NULL. |
Normalisation | is a character value: 'depth' or 'none'. For more information, please consult the details section. |
SampleFilter | is a numeric vector indicating the minimal number of MADs (median absolute deviations) away from the median number of features detected, as well as from the median sequencing depth across all samples, at which a sample is considered outlying and removed prior to normalisation and parameter estimation. The default is 3. |
RNAseq | is a character value: "bulk" or "singlecell". |
Protocol | is a character value defining the type of counts given in spikeData: 'UMI' or 'Read'. |
verbose | Logical value to indicate whether to print function information. Default is TRUE. |
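The SampleFilter rule can be illustrated with a minimal base-R sketch. It assumes a two-sided test on the raw scale and uses a hypothetical toy matrix and a hypothetical helper `is_outlier`; it is not the internal code of estimateSpike, which may transform the metrics or test only one tail.

```r
# Hypothetical toy counts: 5 spike-ins (rows) x 6 samples (columns).
counts <- matrix(
  c(50, 48, 52, 47, 53,   # S1
    45, 55, 49, 51, 46,   # S2
    52, 50, 54, 48, 51,   # S3
    49, 47, 50, 53, 49,   # S4
    51, 52, 48, 50, 52,   # S5
     1,  0,  0,  0,  0),  # S6: a deliberately poor library
  nrow = 5,
  dimnames = list(paste0("spike", 1:5), paste0("S", 1:6))
)

# Hypothetical helper: flag values more than `nmads` MADs from the median.
is_outlier <- function(x, nmads = 3) {
  abs(x - median(x)) > nmads * mad(x)
}

depth    <- colSums(counts)       # sequencing depth per sample
detected <- colSums(counts > 0)   # number of spike-ins detected per sample

# A sample is dropped if it is outlying in either metric.
keep     <- !(is_outlier(depth) | is_outlier(detected))
filtered <- counts[, keep, drop = FALSE]
```

In this toy case only S6 is removed, because both its depth and its number of detected spike-ins lie far from the respective medians.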
List object with the following entries:
The normalised spike-in read counts data.frame.
The mean and standard deviation per spike-in using normalised read counts in a data.frame.
The ad-hoc estimated as well as fitted detection probabilities with confidence interval per spike-in using normalised read counts in a data.frame.
Library size, i.e. total number of reads per library.
The estimated library size factors.
Estimation of the four parameters capturing technical variability, namely E[\(\gamma\)], Var[\(\gamma\)], E[\(\theta\)] and Var[\(\theta\)]. For more details, please consult supplementary information of Kim et al. 2016 (DOI: 10.1038/ncomms9687). These estimates are needed for simulating spike-in read counts.
The input spike-in expression matrix, molecule counts data.frame and batch annotation data.frame, filtered so that only spike-ins with nonzero expression and samples with at least 100 reads are retained.
Reporting the chosen normalisation framework.
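The filtering applied to the returned input objects could look roughly like the following base-R sketch. It assumes that "nonzero expression" means a nonzero total count across samples and that the annotation tables are keyed by row names; the toy objects and the columns `SpikeInput` and `Batch` are hypothetical.

```r
# Hypothetical toy inputs: 3 spike-ins (rows) x 3 samples (columns).
spikeData <- matrix(
  c(120, 0, 30,   # S1, total 150 reads
     90, 0, 25,   # S2, total 115 reads
      5, 0,  2),  # S3, total 7 reads
  nrow = 3,
  dimnames = list(c("ERCC-A", "ERCC-B", "ERCC-C"), c("S1", "S2", "S3"))
)
spikeInfo <- data.frame(SpikeInput = c(1000, 500, 250),
                        row.names = rownames(spikeData))
batchData <- data.frame(Batch = c("a", "a", "b"),
                        row.names = colnames(spikeData))

# Keep spike-ins with a nonzero total and samples with at least 100 reads.
keep_spikes  <- rowSums(spikeData) > 0
keep_samples <- colSums(spikeData) >= 100

spikeData_f <- spikeData[keep_spikes, keep_samples, drop = FALSE]

# Subset the annotation tables so they stay aligned with the count matrix.
spikeInfo_f <- spikeInfo[rownames(spikeData_f), , drop = FALSE]
batchData_f <- batchData[colnames(spikeData_f), , drop = FALSE]
```

Subsetting the annotation by the retained row and column names keeps all three objects consistent, which matters when they are passed on together.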
Normalisation methods
'depth': applies the depth normalisation method as implemented in computeSpikeFactors.
'none': No normalisation is applied. This approach can be used for prenormalised expression estimates, e.g. TPM/FPKM/RPKM estimated by RSEM, salmon, cufflinks etc.
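The idea behind depth normalisation can be sketched in base R, assuming that size factors are proportional to the total spike-in count of each sample and scaled to a mean of one. This is an illustration on hypothetical toy data, not a reimplementation of computeSpikeFactors.

```r
# Hypothetical toy counts: identical composition, different depths.
counts <- matrix(
  c(10, 20, 30,   # S1, total 60
    20, 40, 60,   # S2, total 120
    30, 60, 90),  # S3, total 180
  nrow = 3,
  dimnames = list(paste0("spike", 1:3), paste0("S", 1:3))
)

# Size factors: per-sample totals scaled so that their mean is 1.
totals <- colSums(counts)
sf     <- totals / mean(totals)

# Divide each column (sample) by its size factor.
normCounts <- sweep(counts, 2, sf, "/")
```

Because the three toy samples differ only in depth, their normalised columns become identical. With Normalisation = 'none' this step would be skipped and the input used as given.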
if (FALSE) {
# load example spike-in read counts and spike-in information
data("SmartSeq2_SpikeIns_Read_Counts")
data("SmartSeq2_SpikeInfo")
# batch annotation: the first "_"-separated field of each sample name
Batches <- data.frame(Batch = sapply(strsplit(colnames(SmartSeq2_SpikeIns_Read_Counts), "_"), "[[", 1),
                      stringsAsFactors = FALSE,
                      row.names = colnames(SmartSeq2_SpikeIns_Read_Counts))
# estimation
spike_param <- estimateSpike(spikeData = SmartSeq2_SpikeIns_Read_Counts,
                             spikeInfo = SmartSeq2_SpikeInfo,
                             MeanFragLengths = NULL,
                             batchData = Batches,
                             Normalisation = 'depth')
# plotting
plotSpike(estSpike = spike_param, Annot = FALSE)
}