DeepTrio training data
WGS models
version | Replicates | #examples |
---|---|---|
Child model | ||
1.1.0 | 4 HG001/NA12891/NA12892 trios, 7 HG005/HG006/HG007 trios, 3 HG002/HG003/HG004 trios | 566,589,652(1) |
1.2.0 | (Same model as 1.1.0) | |
1.3.0 | (Same model as 1.1.0) | |
1.4.0 | 4 HG001/NA12891/NA12892 trios, 7 HG005/HG006/HG007 trios, 3 HG002/HG003/HG004 trios | 704,228,446 |
1.5.0 | (6) 4 HG001, 3 HG002, 3 HG003, 3 HG004, 7 HG005, 6 HG006, 6 HG007, 4 NA12891, 4 NA12892 | 704,228,358 |
Parent model | ||
1.1.0 | 7 HG005/HG006/HG007 trios, 3 HG002/HG003/HG004 trios | 315,847,934 |
1.2.0 | (Same model as 1.1.0) | |
1.3.0 | (Same model as 1.1.0) | |
1.4.0 | 7 HG005/HG006/HG007 trios, 3 HG002/HG003/HG004 trios | 457,374,516 |
1.5.0 | (6) 3 HG002, 3 HG003, 3 HG004, 7 HG005, 6 HG006, 6 HG007 | 457,374,464 |
WES models
version | Replicates | #examples |
---|---|---|
Child model | ||
1.1.0 | 27 HG001/NA12891/NA12892 trios, 6 HG005/HG006/HG007 trios, 7 HG002/HG003/HG004 trios | 18,002,596 |
1.2.0 | (Same model as 1.1.0) | |
1.3.0 | (Same model as 1.1.0) | |
1.4.0 | 27 HG001/NA12891/NA12892 trios, 6 HG005/HG006/HG007 trios, 6 HG002/HG003/HG004 trios | 27,776,416 |
1.5.0 | (6) 9 HG001, 7 HG002, 7 HG003, 7 HG004, 8 HG005, 8 HG006, 8 HG007, 9 NA12891, 9 NA12892 | 27,791,954 |
Parent model | ||
1.1.0 | 6 HG005/HG006/HG007 trios, 6 HG002/HG003/HG004 trios | 4,131,018 |
1.2.0 | (Same model as 1.1.0) | |
1.3.0 | (Same model as 1.1.0) | |
1.4.0 | 6 HG005/HG006/HG007 trios, 6 HG002/HG003/HG004 trios | 13,036,995 |
1.5.0 | (6) 6 HG002, 6 HG003, 6 HG004, 8 HG005, 8 HG006, 8 HG007 | 13,036,998 |
PacBio models (2)(3)
version | Replicates | #examples |
---|---|---|
Child model | ||
1.1.0 | 1 HG005/HG006/HG007 trio, 8 HG002/HG003/HG004 trios | 397,610,700 |
1.2.0 | 1 HG005/HG006/HG007 trio, 8 HG002/HG003/HG004 trios | 406,893,180(4) |
1.3.0 | 2 HG005/HG006/HG007 trios, 10 HG002/HG003/HG004 trios | 539,382,124(5) |
1.4.0 | (Same model as 1.3.0) | |
Parent model | ||
1.1.0 | 1 HG005/HG006/HG007 trio, 8 HG002/HG003/HG004 trios | 386,418,918 |
1.2.0 | 1 HG005/HG006/HG007 trio, 8 HG002/HG003/HG004 trios | 392,749,204(4) |
1.3.0 | 2 HG005/HG006/HG007 trios, 10 HG002/HG003/HG004 trios | 533,353,050(5) |
1.4.0 | (Same model as 1.3.0) | |
(1): We include HG002/HG003/HG004 when training the WGS model, but use only examples from the NIST truth confident regions of v4.2 minus the confident regions of v3.3.2.
(2): We use the entire HG002/HG003/HG004 trio for PacBio model training.
(3): The PacBio training data contains examples with both haplotag-sorted images and unsorted images.
(4): In v1.2.0, we updated the NIST truth versions we used for training.
(5): In v1.3.0, we included PacBio Sequel II Chemistry v2.2 data in the training dataset. We also updated the NIST truth version to v4.2.1.
(6): Starting in v1.5.0, for clarity, we report the number of unique BAM files used. Note that this doesn’t mean all the trios were paired together to produce training data.
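
Footnote (1) describes restricting examples to one set of confident regions with another set subtracted. As a rough illustration only, the sketch below shows what such an interval subtraction looks like for half-open, BED-style intervals on a single chromosome. The function name `subtract_intervals` and the coordinates are hypothetical and are not taken from the DeepTrio training pipeline or from any NIST release; in practice a tool such as `bedtools subtract` is commonly used for this kind of operation.

```python
from typing import List, Tuple

# Half-open [start, end) interval on one chromosome, as in BED files.
Interval = Tuple[int, int]


def subtract_intervals(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the parts of the intervals in `a` not covered by any interval in `b`.

    Assumes both lists are sorted by start and non-overlapping within themselves.
    (Hypothetical helper for illustration only.)
    """
    result: List[Interval] = []
    j = 0
    for start, end in a:
        cur = start
        # Skip intervals in b that end at or before the current position.
        while j < len(b) and b[j][1] <= cur:
            j += 1
        k = j
        # Walk over every b interval that overlaps [cur, end).
        while k < len(b) and b[k][0] < end:
            if b[k][0] > cur:
                # Gap before this b interval survives the subtraction.
                result.append((cur, b[k][0]))
            cur = max(cur, b[k][1])
            k += 1
        if cur < end:
            result.append((cur, end))
    return result


# Made-up coordinates standing in for "newer regions" and "older regions".
newer = [(100, 200), (300, 400)]
older = [(150, 180), (190, 320)]
print(subtract_intervals(newer, older))
```

Only the stretches present in the first list and absent from the second remain, which mirrors the idea of keeping examples that fall in regions newly covered by the later truth set.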