Table 1 Primary and unknown data sets. Number of samples for each city and for the unknown set, along with the total size of the clean files (in GB)

From: Massive metagenomic data analysis using abundance-based machine learning

| Location | Acronym | Number of samples | Total size (GB) of clean files (FASTQ format) | Total number of reads (filtered) |
|---|---|---|---|---|
| Auckland, New Zealand | AKL | 15 | 47.8 | 136,022,160 |
| Hamilton, Canada | HAM | 16 | 61.5 | 179,554,428 |
| Sacramento, US | SAC | 16 | 36.5 | 105,326,430 |
| Santiago, Chile | SCL | 20 | 215.3 | 613,721,390 |
| Offa, Nigeria | OFA | 20 | 438.2 | 1,267,427,220 |
| Porto, Portugal | PXO | 60 | 132.2 | 380,372,340 |
| Tokyo, Japan | TOK | 20 | 308.6 | 1,103,076,136 |
| New York, US | NYC | 26 | 368.8 | 1,086,713,476 |
| Unknown | UNK | 30 | 75.3 | 219,935,058 |