Table 5 Detailed statistics for the available datasets

From: IPC – Isoelectric Point Calculator

| Dataset | Initial no. entries | No. entries with sequence and pI | No. entries after removing outliers | No. entries after removing duplicates |
|---|---|---|---|---|
| Gauci et al. | 5,758 | 5,758 | NA | NA |
| PHENYX | 7,582 | 7,582 | NA | NA |
| SEQUEST | 7,629 | 7,629 | NA | NA |
| IPC_peptide | - | 20,969 | 20,969 | 16,882 [25] [75] |
| SWISS-2DPAGE | 2,530 | 1,054 | 1,029 | 982 |
| PIP-DB | 4,947 | 2,427 | 2,254 | 1,307 |
| IPC_protein | - | 3,481 | 3,283 | 2,324 [25] [75] |
  1. NA (not available) means that the given dataset was not created at that step because a merged version was used instead
  2. Note: all datasets presented in the table are available as hyperlinks; the final datasets were divided randomly into 75% training and 25% testing subsets (denoted as [75] and [25], respectively)
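
The random 75%/25% division mentioned in the note can be illustrated with a minimal sketch. This is not the authors' procedure: the function name `split_dataset`, the fixed seed, the rounding rule, and the placeholder records are all assumptions made for illustration. Only the 75/25 ratio and the IPC_protein entry count (2,324) come from the table.

```python
import random


def split_dataset(entries, train_fraction=0.75, seed=0):
    """Shuffle a copy of `entries` and split it into (training, testing).

    Hypothetical helper: the seed and the rounding rule are assumptions,
    not details taken from the source.
    """
    shuffled = list(entries)               # copy so the input is left untouched
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder records standing in for the 2,324 final IPC_protein entries.
entries = [f"protein_{i}" for i in range(2324)]
train, test = split_dataset(entries)
print(len(train), len(test))  # 1743 581
```

Fixing the seed makes the split reproducible, which matters if the training and testing subsets are to be published alongside the results.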