British Biotechnology Journal, ISSN: 2231-2927, Vol.: 13, Issue: 4

Review Article

Current Opportunities and Challenges of Next Generation Sequencing (NGS) of DNA; Determining Health and Diseases


Carlo P. J. M. Brouwer1,2,3*, Thuy Duong Vu1, Miaomiao Zhou1, Gianluigi Cardinali4, Mick M. Welling5, Nathalie van de Wiele1 and Vincent Robert1,3

1CBS-KNAW Fungal Biodiversity Center, Uppsalalaan 8, Utrecht 3584 CT, The Netherlands.

2CBMR Scientific Inc., Suite 161, 2057-111 Street NW, Edmonton, Alberta, Canada.

3BioAware Life Sciences Data Management Software, Rue du Henrifontaine 20, B-4280 Hannut, Belgium.

4University of Perugia, Department Applied Biology-Microbiology, Borgo 20 Giugno, 74, I-06121 Perugia, Italy.

5Department of Radiology, Leiden University Medical Center, Interventional Molecular Imaging Laboratory, Room C2-204, Leiden, The Netherlands.

Article Information
(1) Chung-Jen Chiang, Department of Medical Laboratory Science and Biotechnology, China Medical University, Taiwan.
(1) Michael R. Emmert-Buck, Avoneaux Medical Institute, Oxford, USA.
(2) Molobe Ikenna Daniel, International Institute of Risk and Safety Management (IIRSM), Nigeria.
(3) Triveni Krishnan, National Institute of Cholera and Enteric Diseases, Kolkata, India.
Many publications have demonstrated the huge potential of NGS methods for new species discovery, environmental monitoring, ecological studies, etc. [24,35,92,97,103]. Undoubtedly, NGS will become one of the major tools for species identification and for routine diagnostic use. Read lengths are still quite short for most existing systems, ranging between 50 bp and 800 bp, but they are likely to improve soon. Longer reads will enable easier, faster, and more reliable contig assembly and subsequent matching against reference databases. As data generation ceases to be a bottleneck, the storage, speed of analysis, and interpretation of DNA sequence data are becoming the major challenges. The integration and use of data originating from diverse datasets and a variety of data providers are also serious issues that need to be addressed. Poor sequence record annotations and species name assignments are known problems that should be addressed immediately; resolving them would allow the creation of reference databases for routine NGS-based diagnostics. Samples containing huge numbers of short DNA fragments need to be analyzed and compared against reference databases efficiently and quickly. Although industry has proposed a number of solutions in the form of commercial software, hurdles remain. One of the challenges is data upload from clients' computers to central or distributed data storage and analysis services. Another is the efficient parallelization of analyses using cloud or grid solutions. The reliability and up-time of storage and analysis facilities is another important problem that needs to be addressed if they are to be used for routine diagnostics. Finally, the management, reporting, and visualization of the analysis results are among the last issues, but not the least challenging ones.
Considering the constant growth of computational power and storage capacity needed by different bioinformatics applications, working with a single server or a limited number of servers is no longer realistic. Using a cloud environment and grid computing is becoming a must. Even a single cloud service provider can be restrictive for bioinformatics applications, and working with more than one cloud can make the workflow more robust in the face of failures and ever-growing capacity needs. In this white paper we review the current state of the art in this field. We discuss the main limitations and challenges that need to be addressed: data upload from clients' computers to central or distributed data storage and analysis services; efficient parallelization of analyses using grid solutions; reliability and up-time of storage and analysis facilities for routine diagnostics; and management, retrieval, and visualization of the analysis results.

Keywords:

DNA; next generation sequencing; bioinformatics; clinical approaches; limitations.

DOI: 10.9734/BBJ/2016/25662

