  • Protocol
Next-generation diagnostics and disease-gene discovery with the Exomiser

Abstract

Exomiser is an application that prioritizes genes and variants in next-generation sequencing (NGS) projects for novel disease-gene discovery or differential diagnostics of Mendelian disease. Exomiser comprises a suite of algorithms for prioritizing exome sequences using random-walk analysis of protein interaction networks, clinical relevance and cross-species phenotype comparisons, as well as a wide range of other computational filters for variant frequency, predicted pathogenicity and pedigree analysis. In this protocol, we provide a detailed explanation of how to install Exomiser and use it to prioritize exome sequences in a number of scenarios. Exomiser requires 3 GB of RAM and roughly 15–90 s of computing time on a standard desktop computer to analyze a variant call format (VCF) file. Exomiser is freely available for academic use from http://www.sanger.ac.uk/science/tools/exomiser.
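As a rough illustration of the random-walk idea mentioned in the abstract, the sketch below runs a random walk with restart on a toy interaction network and ranks genes by their proximity to a known disease gene. It is not Exomiser's code: the gene names (`SEED1`, `A`, `B`, `C`, `D`), the edge list, the function name and the restart probability of 0.7 are all hypothetical choices made for this example.

```python
def random_walk_with_restart(edges, seeds, restart=0.7, tol=1e-9, max_iter=1000):
    """Return a steady-state visiting probability for every node.

    edges   -- iterable of (node, node) pairs, treated as undirected
    seeds   -- nodes where the walk starts and restarts (e.g. known disease genes)
    restart -- probability of jumping back to a seed at each step (illustrative value)
    """
    seeds = set(seeds)
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    for s in seeds:
        neighbors.setdefault(s, set())
    nodes = sorted(neighbors)

    # Start vector: probability mass spread evenly over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: a fraction of the mass always returns to the seeds.
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            degree = len(neighbors[n])
            if degree == 0:
                # An isolated node keeps its own walking mass.
                nxt[n] += (1 - restart) * p[n]
                continue
            share = (1 - restart) * p[n] / degree
            for m in neighbors[n]:
                nxt[m] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Toy network (hypothetical): D - SEED1 - A - B - C
toy_edges = [("SEED1", "A"), ("A", "B"), ("B", "C"), ("SEED1", "D")]
scores = random_walk_with_restart(toy_edges, seeds=["SEED1"])
ranked = sorted((g for g in scores if g != "SEED1"), key=scores.get, reverse=True)
print(ranked)
```

In this toy setting, genes closer to the seed in the network receive higher scores, which is the intuition behind using network proximity to known disease genes as one line of evidence when ranking candidates.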

Figure 1: Overview of the processing steps of Exomiser.
Figure 2: Choice of Exomiser prioritization method.
Figure 3: Screenshot of Exomiser output.

Acknowledgements

This project was supported by the Bundesministerium für Bildung und Forschung (BMBF; project no. 0313911), the European Community's Seventh Framework Programme (grant agreement no. 602300; SYBIL) and NIH grant no. 5R24OD011883 (Monarch Initiative).

Author information

Contributions

P.N.R. and D.S. conceived the project, programmed the prototype code and wrote the manuscript. J.O.B.J., M.J., S.K., M.S., N.L.W. and E.S. developed software. T.Z., O.J.B. and W.P.B. tested code and contributed to the development of analysis strategies. S.K., M.A.H. and P.N.R. developed the phenotype analysis framework. M.A.H. helped develop the ontologies and the HPO curation standard. All authors reviewed and approved the manuscript.

Corresponding author

Correspondence to Peter N Robinson.

Ethics declarations

Competing interests

S.K. and P.N.R. are holders of a patent for an ontology-based search methodology.

About this article

Cite this article

Smedley, D., Jacobsen, J., Jäger, M. et al. Next-generation diagnostics and disease-gene discovery with the Exomiser. Nat Protoc 10, 2004–2015 (2015). https://doi.org/10.1038/nprot.2015.124

  • DOI: https://doi.org/10.1038/nprot.2015.124
