DeepTrio runtime and accuracy metrics for all release models

WGS (Illumina)

Runtime

Runtimes are measured on the HG002/HG003/HG004 trio (all chromosomes).

| Stage                           | Wall time (minutes)  |
| ------------------------------- | -------------------- |
| make_examples                   | ~441m                |
| call_variants for HG002         | ~356m                |
| call_variants for HG003         | ~360m                |
| call_variants for HG004         | ~359m                |
| postprocess_variants (parallel) | ~62m                 |
| total                           | ~1578m = ~26.3 hours |

Accuracy

We report hap.py results for the HG002/HG003/HG004 trio on chr20, using NIST v4.2.1 truth; chr20 was held out during training. An illustrative hap.py command is sketched after the HG004 table below.

HG002:

| Type  | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
| ----- | -------- | -------- | -------- | ------------- | ---------------- | --------------- |
| INDEL | 11204    | 52       | 19       | 0.99538       | 0.99837          | 0.996873        |
| SNP   | 71081    | 252      | 29       | 0.996467      | 0.999592         | 0.998027        |

HG003:

| Type  | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
| ----- | -------- | -------- | -------- | ------------- | ---------------- | --------------- |
| INDEL | 10588    | 40       | 21       | 0.996236      | 0.998102         | 0.997168        |
| SNP   | 69988    | 178      | 64       | 0.997463      | 0.999087         | 0.998274        |

HG004:

| Type  | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
| ----- | -------- | -------- | -------- | ------------- | ---------------- | --------------- |
| INDEL | 10952    | 48       | 27       | 0.995636      | 0.997645         | 0.99664         |
| SNP   | 71456    | 203      | 66       | 0.997167      | 0.999078         | 0.998122        |
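
For reference, a per-sample chr20 comparison like the ones above can be produced with hap.py along the following lines. This is a sketch rather than the exact command used by the case-study script, and every file name below is a placeholder for the corresponding GIAB truth VCF, confident-region BED, reference, and DeepTrio call set:

# Sketch of a chr20 hap.py comparison; all file names here are placeholders.
hap.py \
  HG002_GRCh38_v4.2.1_benchmark.vcf.gz \
  HG002.deeptrio.output.vcf.gz \
  -f HG002_GRCh38_v4.2.1_benchmark.bed \
  -r GRCh38_no_alt_analysis_set.fasta \
  -o HG002.happy.output \
  -l chr20 \
  --engine=vcfeval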

PacBio (HiFi)

The numbers below are the same as those reported in https://github.com/google/deepvariant/blob/r1.4/docs/metrics-deeptrio.md#pacbio-hifi. To run DeepTrio PacBio, please use v1.4.0.

Runtime

Runtimes are measured on the HG002/HG003/HG004 trio (all chromosomes).

| Stage                           | Wall time (minutes)  |
| ------------------------------- | -------------------- |
| make_examples                   | ~829m                |
| call_variants for HG002         | ~263m                |
| call_variants for HG003         | ~266m                |
| call_variants for HG004         | ~268m                |
| postprocess_variants (parallel) | ~78m                 |
| total                           | ~1704m = ~28.4 hours |

Accuracy

We report hap.py results for the HG002/HG003/HG004 trio on chr20, using NIST v4.2.1 truth; chr20 was held out during training.

HG002:

| Type  | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
| ----- | -------- | -------- | -------- | ------------- | ---------------- | --------------- |
| INDEL | 11233    | 23       | 62       | 0.997957      | 0.994734         | 0.996343        |
| SNP   | 71272    | 61       | 22       | 0.999145      | 0.999692         | 0.999418        |

HG003:

| Type  | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
| ----- | -------- | -------- | -------- | ------------- | ---------------- | --------------- |
| INDEL | 10595    | 33       | 54       | 0.996895      | 0.995155         | 0.996024        |
| SNP   | 70144    | 22       | 19       | 0.999686      | 0.999729         | 0.999708        |

HG004:

| Type  | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
| ----- | -------- | -------- | -------- | ------------- | ---------------- | --------------- |
| INDEL | 10967    | 33       | 53       | 0.997         | 0.995403         | 0.996201        |
| SNP   | 71594    | 65       | 38       | 0.999093      | 0.99947          | 0.999281        |

Whole Exome Sequencing (Illumina)

Runtime

Runtimes are measured on the HG002/HG003/HG004 trio (all chromosomes).

| Stage                           | Wall time (minutes) |
| ------------------------------- | ------------------- |
| make_examples                   | ~16m                |
| call_variants for HG002         | ~5m                 |
| call_variants for HG003         | ~5m                 |
| call_variants for HG004         | ~5m                 |
| postprocess_variants (parallel) | ~1m                 |
| total                           | ~32m                |

Accuracy

We report hap.py results for the HG002/HG003/HG004 trio on chr20, using NIST v4.2.1 truth; chr20 was held out during training.

HG002:

| Type  | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
| ----- | -------- | -------- | -------- | ------------- | ---------------- | --------------- |
| INDEL | 34       | 0        | 0        | 1.0           | 1.0              | 1.0             |
| SNP   | 670      | 2        | 0        | 0.997024      | 1.0              | 0.99851         |

HG003:

| Type  | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
| ----- | -------- | -------- | -------- | ------------- | ---------------- | --------------- |
| INDEL | 29       | 0        | 0        | 1.0           | 1.0              | 1.0             |
| SNP   | 683      | 2        | 0        | 0.99708       | 1.0              | 0.998538        |

HG004:

| Type  | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
| ----- | -------- | -------- | -------- | ------------- | ---------------- | --------------- |
| INDEL | 32       | 1        | 0        | 0.969697      | 1.0              | 0.984615        |
| SNP   | 676      | 3        | 0        | 0.995582      | 1.0              | 0.997786        |

How to reproduce the metrics on this page

For simplicity and consistency, we report runtimes from a CPU instance with 64 CPUs. For the larger datasets (WGS and PACBIO), we used a larger disk size (900G). This is NOT the fastest or cheapest configuration.
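
For reference, an instance of that shape could be created along the following lines. This is a sketch only; the machine type, zone, image, and instance name are illustrative assumptions, not the exact configuration used for these runs:

# Illustrative only: a 64-vCPU instance with a 900G boot disk. The machine type,
# zone, image, and the name "deeptrio-runtime" are assumptions.
gcloud compute instances create deeptrio-runtime \
  --zone=us-central1-b \
  --machine-type=n2-standard-64 \
  --image-family=ubuntu-2004-lts \
  --image-project=ubuntu-os-cloud \
  --boot-disk-size=900GB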

Use gcloud compute ssh to log in to the newly created instance.
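
For example, using the same placeholder instance name and zone as in the sketch above:

# Log in to the instance created above (name and zone are placeholders).
gcloud compute ssh deeptrio-runtime --zone=us-central1-b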

Download and run any of the following case study scripts:

curl -O https://raw.githubusercontent.com/google/deepvariant/r1.5/scripts/inference_deeptrio.sh

# WGS
bash inference_deeptrio.sh --model_preset WGS

# WES
bash inference_deeptrio.sh --model_preset WES

# PacBio
bash inference_deeptrio.sh --model_preset PACBIO --bin_version 1.4.0

Runtime metrics are taken from the log output of each DeepTrio stage. The runtime numbers reported above are the average of 5 runs each. The accuracy metrics come from the hap.py summary.csv output file. The runs are deterministic, so all 5 runs produced the same output.
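
If you want to pull these numbers out of a hap.py run yourself, the PASS rows of summary.csv hold the metrics quoted above. A minimal sketch, assuming a placeholder output prefix of HG002.output (use whatever -o prefix your hap.py run wrote):

# Print the header plus the PASS rows (the numbers quoted in the tables above).
awk -F, 'NR == 1 || $2 == "PASS"' HG002.output.summary.csv | column -t -s,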