Fig. 1 | Biology Direct

Fig. 1

From: Finite-size effects in transcript sequencing count distribution: its power-law correction necessarily precedes downstream normalization and comparative analysis

Pareto distributions and scatterplots of the spike-in background and dilution datasets. a and b show the Pareto distribution plots of the scaled background counts from the spike-in background and NUGC3 dilution datasets, respectively. Both plots are segmented from the highest-count to the lowest-count region, with one order of magnitude per segment (see the vertical dotted lines across the horizontal axis). Generally, Zipf’s law (i.e., a slope of −1) holds well for the highest-count segments. c and d show the scatterplots of the highest-sequencing-depth replicate against the remaining replicates for the spike-in background and NUGC3 dilution datasets, respectively. Both plots exhibit the hallmark of the Pareto distribution’s moments, where a change in variance is driven by a change in the power-law exponent. The noise that plagues the low- and lowest-count segments highlights the instability of the replicated count values when the corresponding power-law moments stem from exponents that are not only low but also of non-comparable magnitude.
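
The per-segment slope check described for panels a and b can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: it assumes the plot places log10(count) on the horizontal axis and log10(rank) on the vertical axis, that each segment spans one decade of count, and that the slope is fitted by ordinary least squares. The function name `decade_slopes` and the synthetic data are hypothetical.

```python
import math
from collections import defaultdict


def decade_slopes(counts):
    """Least-squares slope of log10(rank) against log10(count), per decade of count.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    # Rank 1 is the largest count; zero counts cannot be placed on a log axis.
    ordered = sorted((c for c in counts if c > 0), reverse=True)

    # For tied counts keep the largest rank, i.e. the number of values >= count.
    rank_of = {}
    for rank, count in enumerate(ordered, start=1):
        rank_of[count] = rank

    # Group the (log10 count, log10 rank) points by order of magnitude of count.
    segments = defaultdict(list)
    for count, rank in rank_of.items():
        x = math.log10(count)
        segments[math.floor(x)].append((x, math.log10(rank)))

    slopes = {}
    for decade, points in segments.items():
        if len(points) < 2:
            continue  # a slope needs at least two distinct counts
        mean_x = sum(x for x, _ in points) / len(points)
        mean_y = sum(y for _, y in points) / len(points)
        sxx = sum((x - mean_x) ** 2 for x, _ in points)
        if sxx == 0:
            continue  # degenerate segment, no horizontal spread
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
        slopes[decade] = sxy / sxx
    return slopes


# Synthetic counts that follow Zipf's law exactly: count = 10000 / rank.
zipf_counts = [10000 / r for r in range(1, 1001)]
for decade, slope in sorted(decade_slopes(zipf_counts).items(), reverse=True):
    print(f"10^{decade} to 10^{decade + 1}: slope {slope:.3f}")
```

On the synthetic Zipf data every fitted segment should have a slope close to −1. On real count data, under the assumptions above, a departure from −1 in the lower decades would correspond to the low-count behaviour the caption describes.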
