Abstract
Tuberculosis (TB) remains a public health problem in Thailand, with socio-economic consequences. Studies of the epidemiology and biology of Mycobacterium tuberculosis, the causative agent of TB, together with rapid diagnosis of drug-resistant TB and effective treatment, are urgently required for effective prevention and control of the disease. Genotyping of M. tuberculosis strains based on specific genomic markers has been widely used in epidemiological studies. In particular, single nucleotide polymorphisms (SNPs) are genomic markers well suited to epidemiological and evolutionary studies and to prediction of drug resistance. In the first year of the study, we developed a whole-genome sequence analysis pipeline for predicting resistance to first-line anti-TB drugs (isoniazid, rifampicin, streptomycin and ethambutol) and for rapid genotyping. Compared with results from standard drug susceptibility testing, the pipeline had sensitivity and specificity of 94.6% and 98.5% for isoniazid, 92.4% and 99.6% for rifampicin, 85.3% and 98.9% for streptomycin, and 66.7% and 97.9% for ethambutol, respectively. For genotyping, the pipeline showed 100% concordance with standard large sequence polymorphism (LSP) typing. In addition, SNP databases for drug resistance and genotyping were created. In the second year of the study, an automated genome analysis pipeline for prediction of drug resistance and for genotyping was developed and validated with a test set of 50 M. tuberculosis strains comprising 41 isoniazid-, 43 rifampicin-, 23 streptomycin-, 27 ethambutol-, 12 amikacin-, 13 kanamycin-, 25 fluoroquinolone-, 11 ethionamide- and 15 PAS-resistant strains. Compared with standard drug susceptibility testing, sensitivity ranged from 82.6% to 92.6% and specificity from 87.0% to 100% for prediction of resistance to first-line anti-TB drugs.
Sensitivity was lower for prediction of resistance to second-line drugs: sensitivity ranged from 66.7% to 80.0% and specificity from 92.0% to 100%. The automated genome analysis pipeline also showed 100% concordance with standard LSP typing for genotyping.
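The sensitivity and specificity figures above come from comparing each genotypic prediction with the phenotypic drug susceptibility test for the same strain. The following is a minimal sketch of that calculation, not the pipeline's own code; the function name `sensitivity_specificity` and the ten-strain data set are hypothetical and chosen only for illustration.

```python
def sensitivity_specificity(predicted, reference):
    """Return (sensitivity, specificity) as fractions between 0 and 1.

    predicted: genotypic calls, True = predicted resistant (hypothetical input).
    reference: phenotypic DST results, True = resistant (the reference standard).
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")

    pairs = list(zip(predicted, reference))
    tp = sum(p and r for p, r in pairs)              # resistant, called resistant
    fn = sum((not p) and r for p, r in pairs)        # resistant, called susceptible
    tn = sum((not p) and (not r) for p, r in pairs)  # susceptible, called susceptible
    fp = sum(p and (not r) for p, r in pairs)        # susceptible, called resistant

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one resistant and one susceptible strain")

    sensitivity = tp / (tp + fn)  # fraction of resistant strains detected
    specificity = tn / (tn + fp)  # fraction of susceptible strains correctly cleared
    return sensitivity, specificity


# Hypothetical toy data for one drug: 4 resistant and 6 susceptible strains.
reference = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, False, False, False, False, False, True]

sens, spec = sensitivity_specificity(predicted, reference)
print(f"sensitivity={sens:.1%}, specificity={spec:.1%}")
```

In a sketch like this the calculation would be repeated for each drug, since every drug has its own set of resistant and susceptible strains in the test set.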