FAQs about the GATK

If you have a question not listed here, contact us and we'll add the question and a response to this list.

1. What is the GATK?

First developed and introduced by the Broad Institute in 2009, the Genome Analysis Toolkit (GATK) consists of an industrial-strength computational infrastructure and computational engine that empowers the development of analysis tools for next-generation sequencing. A rich ecosystem of specialized analysis tools, called “walkers,” lives on top of this infrastructure to process data from any NGS platform and identify changes in sequences that may be associated with disease. The GATK framework provides the widest variety of tools that can be used individually out of the box or chained together into scripted workflows to perform anything from simple data diagnostics to complex "reads-to-results" analyses. The GATK’s robust architecture, powerful processing engine, and high-performance, scalable computing features enable it to take on projects of any size.


2. What distinguishes the GATK from other options for NGS analysis?

The GATK offers several advantages over other available tools for genetic analysis:

  • The GATK offers a comprehensive end-to-end solution covering a wide range of NGS analysis workflows. Most other NGS analysis tools only address specific parts of an overall analysis pipeline (e.g., alignment). The GATK contains over 70 tools covering everything from data processing through variant calling, evaluation, and manipulation—and more tools are always being developed to cover the latest trends in NGS analysis. Click here for a full list of GATK tools.
  • The GATK is uniquely flexible, processing data from any NGS platform and handling any type of genetic data with any level of ploidy: exomes or whole genomes, human or non-human DNA. This makes the GATK an ideal tool for a range of workflows, including targeted panels, somatic mutation studies, and even agricultural genomics. See some example GATK workflows here.
  • Base quality score recalibration (BQSR) in the GATK provides a unique data processing capability that gets data into an optimal form for discovering mutations. Other tools have attempted to implement similar capabilities, but none do it as well as the GATK.
  • The Unified Genotyper is widely used and considered one of the fastest and most accurate variant callers available.
  • The GATK is more user-friendly than most other available analysis tools. Broad has for years offered an active support forum, which has provided insights into using the tool and has helped improve the toolkit. Broad’s decision to partner with Appistry builds on this foundation, giving for-profit users even more support to streamline installation, implementation, and use.


3. Who uses the GATK?

GATK is the de facto standard for cutting-edge NGS analysis (see a representative list of GATK publications here). The GATK is used by thousands of bioinformatics professionals, biomedical researchers, and clinicians. The GATK has been used on initiatives ranging from the 1000 Genomes Project to The Cancer Genome Atlas and by many sequencing centers and leading researchers to conduct population studies and explore the genetic origins of disease.


4. What is the nature of Appistry’s agreement with the Broad Institute?

The Broad Institute has partnered with Appistry to be the exclusive distributor of the GATK for researchers using the toolkit in a for-profit context. The GATK license from Appistry includes commercial-grade support for evaluating, installing, configuring, and using the GATK. Appistry also provides documentation and guided workflows and long-term support for each commercial release, which makes it easier to use the GATK in regulated environments. By partnering with Appistry, the Broad Institute is able to continue to develop and improve the GATK while ensuring the high level of support researchers have sought.


5. Why did the Broad Institute choose Appistry as the first authorized distributor of the GATK?

The Broad Institute believes that Appistry’s deep experience in accelerating the science behind NGS research and in bringing to market new technology for high-throughput NGS analysis and next-generation medicine makes it a perfect partner for providing commercial-grade support to users of the GATK.


6. What’s in the latest release from Appistry?

For full details on the new functionality in Appistry’s latest release, visit our What’s New page.


7. What are the advanced features that the Broad introduced with version 2.0 of the GATK?

Version 2.0 of the GATK, introduced in the fall of 2012, added several completely new walkers, which were made available in both the academic version from the Broad and the first for-profit release offered by Appistry. None of this advanced functionality is included in any of the versions of the GATK available as GATK-lite. The advanced features address major challenges in NGS analysis, including the need for better sensitivity and specificity in indel calling; integrated calling of SNPs, indels, and SVs; and ways to make increasingly voluminous datasets easier to transfer, manage, and analyze. The advanced tools are:

  • Base quality score recalibration (BQSR) v2, which accurately estimates not only base substitution probabilities determined by the original BQSR, but also base insertion and base deletion probabilities. Downstream variant calling hinges on these three statistical measures, which make BQSR v2 even better at distinguishing between true and false positive indels.
  • The Haplotype Caller, a multi-sample local de novo assembly and integrated SNP, indel, and short SV caller. This caller improves the discovery of larger indels (10-50 bp events), reducing the chance of missing or incorrectly determining the alleles underlying a mutation. Taken together, BQSR v2 and the Haplotype Caller significantly improve alignments and increase the chances of discovering the changes that signal mutations in a sequence.
  • Powerful enhancements to the original Unified Genotyper, including a novel error modeling approach that uses a reference sample to build a site-specific error model for SNPs and indels that vastly improves calling accuracy.
  • ReduceReads, a BAM compression algorithm that reduces file sizes by 20x-100x while preserving all information necessary for accurate SNP and indel calling. ReduceReads enables the GATK to call tens of thousands of deeply sequenced NGS samples simultaneously.
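
As a rough illustration of the idea behind base quality score recalibration, the sketch below converts observed error counts into Phred-scaled quality scores (Q = -10 log10 of the error probability) for the three error types named above. It is not the GATK's implementation. The function names, the add-one smoothing, and the tallies are all hypothetical and chosen only to show the arithmetic.

```python
import math


def phred(p_error):
    """Convert an error probability into a Phred-scaled quality score."""
    return -10.0 * math.log10(p_error)


def empirical_quality(errors, observations):
    """Estimate a Phred quality from observed error counts.

    The +1/+2 smoothing is an assumption of this sketch; it keeps the
    estimate finite when no errors have been observed.
    """
    return phred((errors + 1) / (observations + 2))


# Hypothetical tallies for one group of bases: (errors, observations)
# for each of the three error types.
tallies = {
    "substitution": (999, 99_998),
    "insertion": (99, 99_998),
    "deletion": (9, 99_998),
}

# One recalibrated score per error type, rounded to one decimal place.
recalibrated = {
    kind: round(empirical_quality(errors, observations), 1)
    for kind, (errors, observations) in tallies.items()
}
print(recalibrated)
```

With these made-up counts, the three error types work out to roughly Q20, Q30, and Q40. A real recalibrator would group bases by additional covariates before tallying errors.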

For more information on these advanced features, download Appistry’s white paper.


8. What is included with an Appistry license?

Two critical benefits offered by Appistry are version control and commercial support, both of which make it easier to implement and use the GATK, particularly in regulated environments. Appistry’s quarterly releases package up all updates, enhancements, and issue resolutions provided by the Broad Institute during the time between commercial releases to ensure product stability and a high level of quality control. Appistry also conducts QA and validation on its quarterly releases beyond that performed at the Broad Institute and reports results to customers and the Broad to inform ongoing GATK development.

In addition to the quarterly releases, Appistry provides interim “beta” releases for customers who wish to take advantage of the latest functionality from the Broad, though these releases will not receive full QA testing and should be used at a customer’s risk.

The Appistry license also provides commercial-grade support, including:

  • Extensive installation and user documentation that will speed implementation and reduce the need for training
  • A customer support director embedded at the Broad to provide a timely response to issues (within one business day)
  • A comprehensive best practices guide that provides guided workflows for common analyses, along with other useful resources and tutorials
  • Long-term support for each commercial release of the GATK that will make it easier to use the toolkit in certified processes, such as CLIA


9. What use of the GATK requires a license through Appistry?

Any entity using the GATK in any for-profit context or to generate revenue must purchase a license from Appistry. This includes:

  • All commercial organizations. If you are employed by a for-profit, commercial entity and are using the GATK, you must purchase a license from Appistry regardless of the nature of the work for which you are using the toolkit.
  • Any commercial or non-profit entity charging for the use of a genetic service that employs the GATK.
  • Academic institutions using the GATK (either directly or in a fee-for-service model) who receive funds from a commercial entity when the funds are used to provide a service to the commercial entity that involves the exchange of experimental results. (Collaborative research with a commercial entity to develop theses, reports, or publications produced by the academic institution does not require a commercial license.)


10. How does Appistry license the GATK?

Appistry offers a range of annual license models, including individual user, workgroup and lab licenses, and enterprise licenses, which were designed to meet the specific GATK usage needs of individual scientists as well as small, medium, and large organizations. To learn more, visit our Licensing the GATK page.


11. What is Appistry’s GATK release cycle?

Quarterly releases of the GATK from Appistry package up all updates, enhancements, and issue resolutions provided by the Broad Institute during the time between commercial releases to ensure product stability and a high level of quality control. Appistry also provides interim “beta” releases for customers who wish to take advantage of the latest functionality from the Broad, though these releases will not receive full QA testing and should be used at a customer’s risk.


12. I have been using GATK-lite. Why should I purchase a GATK license from Appistry?

When it announced the licensing changes with the 2.0 release of the GATK in the summer of 2012, the Broad Institute opted to release a “base” set of GATK functionality that it dubbed “GATK-lite.” In January of 2013, the Broad Institute announced that no further updates would be made to GATK-lite. This version of the toolkit is now frozen at version 2.3.9.

Even before this decision, GATK-lite included only a subset of GATK functionality. The full academic version of the GATK from the Broad Institute and the commercial version from Appistry include, for instance, expansions on toolsets found in GATK-lite (such as a new version of BQSR and powerful extensions to the Unified Genotyper) and completely new functionality (the new Haplotype Caller, a local de novo assembler and integrated SNP, indel, and short SV caller; and ReduceReads, a compression tool). The chart below details these differences for the 2.3 releases of the toolkits.

Feature comparison table

Commercial organizations who do not wish to purchase a license from Appistry can certainly continue to use GATK-lite, but as the Broad Institute and Appistry continue to extend the features and functionality in the full GATK (and as scientific publications continue to reference these new extensions), GATK-lite is rapidly becoming outdated.


13. Does Appistry provide access to the source code for the commercial releases of the GATK?

Because the GATK from Appistry is a supported product, Appistry has restricted access to the source code. The source code will be available as part of the license to organizations that purchase an enterprise license from Appistry. Holders of individual named-user licenses from Appistry may purchase the right to access the commercial source code for an additional fee. In these cases, Appistry will no longer be able to guarantee full commercial support should a site or individual opt to make changes to the source code.


14. I wish to purchase a license through Appistry. Who do I contact and how will the purchasing process work?

Researchers using the GATK in a for-profit context can receive more information on licensing the GATK from Appistry by contacting their sales account manager, completing the “Contact Us” form on the Appistry website, or calling Appistry at +1 314-450-5748.


15. I use a licensed version of the GATK and have a question about how to use the tools. Who do I contact with my question?

Licensed users through Appistry should direct questions to Appistry GATK Support on the Appistry website. Licensed users through the Broad Institute should continue to direct questions to the support forums and resources provided by the Broad at broadinstitute.org/gatk/.

