DeepTrio runtime and accuracy metrics for all release models

WGS (Illumina)

Runtime

Runtime is on HG002/HG003/HG004 (all chromosomes).

Stage	Wall time (minutes)
make_examples	~441m
call_variants for HG002	~356m
call_variants for HG003	~360m
call_variants for HG004	~359m
postprocess_variants (parallel)	~62m
total	~1578m = ~26.3 hours

Accuracy

We report hap.py results on HG002/HG003/HG004 trio (chr20, using NIST v4.2.1 truth), which was held out while training.

HG002:

Type	TRUTH.TP	TRUTH.FN	QUERY.FP	METRIC.Recall	METRIC.Precision	METRIC.F1_Score
INDEL	11204	52	19	0.99538	0.99837	0.996873
SNP	71081	252	29	0.996467	0.999592	0.998027

HG003:

Type	TRUTH.TP	TRUTH.FN	QUERY.FP	METRIC.Recall	METRIC.Precision	METRIC.F1_Score
INDEL	10588	40	21	0.996236	0.998102	0.997168
SNP	69988	178	64	0.997463	0.999087	0.998274

HG004:

Type	TRUTH.TP	TRUTH.FN	QUERY.FP	METRIC.Recall	METRIC.Precision	METRIC.F1_Score
INDEL	10952	48	27	0.995636	0.997645	0.99664
SNP	71456	203	66	0.997167	0.999078	0.998122

See VCF stats report (for all chromosomes)
- HG002
- HG003
- HG004

PacBio (HiFi)

The numbers below are the same as https://github.com/google/deepvariant/blob/r1.4/docs/metrics-deeptrio.md#pacbio-hifi. To run DeepTrio PacBio, please use v1.4.0.

Runtime

Runtime is on HG002/HG003/HG004 (all chromosomes).

Stage	Wall time (minutes)
make_examples	~829m
call_variants for HG002	~263m
call_variants for HG003	~266m
call_variants for HG004	~268m
postprocess_variants (parallel)	~78m
total	~1704m = ~28.4 hours

See VCF stats report (for all chromosomes)
- HG002
- HG003
- HG004

Accuracy

We report hap.py results on HG002/HG003/HG004 trio (chr20, using NIST v4.2.1 truth), which was held out while training.

HG002:

Type	TRUTH.TP	TRUTH.FN	QUERY.FP	METRIC.Recall	METRIC.Precision	METRIC.F1_Score
INDEL	11233	23	62	0.997957	0.994734	0.996343
SNP	71272	61	22	0.999145	0.999692	0.999418

HG003:

Type	TRUTH.TP	TRUTH.FN	QUERY.FP	METRIC.Recall	METRIC.Precision	METRIC.F1_Score
INDEL	10595	33	54	0.996895	0.995155	0.996024
SNP	70144	22	19	0.999686	0.999729	0.999708

HG004:

Type	TRUTH.TP	TRUTH.FN	QUERY.FP	METRIC.Recall	METRIC.Precision	METRIC.F1_Score
INDEL	10967	33	53	0.997	0.995403	0.996201
SNP	71594	65	38	0.999093	0.99947	0.999281

Whole Exome Sequencing (Illumina)

Runtime

Runtime is on HG002/HG003/HG004 (all chromosomes).

Stage	Wall time (minutes)
make_examples	~16m
call_variants for HG002	~5m
call_variants for HG003	~5m
call_variants for HG004	~5m
postprocess_variants (parallel)	~1m
total	~32m

Accuracy

We report hap.py results on HG002/HG003/HG004 trio (chr20, using NIST v4.2.1 truth), which was held out while training.

HG002:

Type	TRUTH.TP	TRUTH.FN	QUERY.FP	METRIC.Recall	METRIC.Precision	METRIC.F1_Score
INDEL	34	0	0	1.0	1.0	1.0
SNP	670	2	0	0.997024	1.0	0.99851

HG003:

Type	TRUTH.TP	TRUTH.FN	QUERY.FP	METRIC.Recall	METRIC.Precision	METRIC.F1_Score
INDEL	29	0	0	1.0	1.0	1.0
SNP	683	2	0	0.99708	1.0	0.998538

HG004:

Type	TRUTH.TP	TRUTH.FN	QUERY.FP	METRIC.Recall	METRIC.Precision	METRIC.F1_Score
INDEL	32	1	0	0.969697	1.0	0.984615
SNP	676	3	0	0.995582	1.0	0.997786

See VCF stats report (for all chromosomes)
- HG002
- HG003
- HG004

How to reproduce the metrics on this page

For simplicity and consistency, we report runtime with a CPU instance with 64 CPUs For bigger datasets (WGS and PACBIO), we used bigger disk size (900G). This is NOT the fastest or cheapest configuration.

Use gcloud compute ssh to log in to the newly created instance.

Download and run any of the following case study scripts:

curl -O https://raw.githubusercontent.com/google/deepvariant/r1.5/scripts/inference_deeptrio.sh

# WGS
bash inference_deeptrio.sh --model_preset WGS

# WES
bash inference_deeptrio.sh --model_preset WES

# PacBio
bash inference_deeptrio.sh --model_preset PACBIO --bin_version 1.4.0

Runtime metrics are taken from the resulting log after each stage of DeepTrio. The runtime numbers reported above are the average of 5 runs each. The accuracy metrics come from the hap.py summary.csv output file. The runs are deterministic so all 5 runs produced the same output.