Table 2 Summary of analysis for the power-law corrected spike-in background and NUGC3 dilution datasets

From: Finite-size effects in transcript sequencing count distribution: its power-law correction necessarily precedes downstream normalization and comparative analysis

  1. A summary of the analysis of the Zipf’s-law-corrected datasets, namely the spike-in background and dilution datasets, is presented. The spike-in set consists of 1387 transcripts over 12 replicates, while the dilution set has 865 transcripts over 8 replicates. For each segmented range, the slope of the fitted Pareto distribution, the total number of points, and the observed and expected standard deviations are calculated. The expected standard deviation \( \sigma^{\mathrm{exp}} \) gives the corrected standard deviation of each “slope < 1” segment as if its slope were the same as that of the reference segment (indicated by *). It is calculated via the formula \( \sigma_{\mathrm{seg}_i}^{\mathrm{exp}} = \sigma_{\mathrm{seg}_{\mathrm{ref}}}^{\mathrm{obs}} \left( s_{\mathrm{seg}_{\mathrm{ref}}} / s_{\mathrm{seg}_i} \right) \), where \( s \) denotes the fitted slope of a segment, using the highest-count segment as the reference. In the worst case, the observed and expected standard deviations differ by a factor of about 1.1 for the spike-in set and about 1.6 for the dilution set (highlighted in red).
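The scaling in the formula above can be illustrated with a minimal sketch. The function name `expected_std` and all numeric values below are hypothetical placeholders chosen for illustration; they are not taken from Table 2.

```python
def expected_std(sigma_obs_ref: float, slope_ref: float, slope_seg: float) -> float:
    """Expected standard deviation of a segment, rescaled from the reference segment.

    Implements sigma_exp_i = sigma_obs_ref * (s_ref / s_i), where s is the
    fitted Pareto slope of a segment.
    """
    if slope_seg <= 0:
        raise ValueError("segment slope must be positive")
    return sigma_obs_ref * (slope_ref / slope_seg)


# Hypothetical values (NOT from the paper): the reference is the
# highest-count segment; the other segments have fitted slopes below 1.
sigma_obs_ref = 0.20
slope_ref = 1.0
segment_slopes = [1.0, 0.8, 0.5]

expected = [expected_std(sigma_obs_ref, slope_ref, s) for s in segment_slopes]
for s, sigma in zip(segment_slopes, expected):
    print(f"slope={s:.2f}  sigma_exp={sigma:.3f}")
```

Under this formula, a segment whose slope is smaller than the reference slope receives a proportionally larger expected standard deviation, which can then be compared with the value observed for that segment.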