OTU Clustering: A window to analyse uncultured microbial world
Ashaq Hussain Bhat1, Puniethaa Prabhu2
1. Department of Biotechnology, K. S. Rangasamy College of Technology, Tiruchengode, India.
2. Department of Biotechnology, K. S. Rangasamy College of Technology, Tiruchengode, India.
Correspondence should be addressed to: ashaq11bhat@gmail.com.
Section: Review Paper, Product Type: Isroset-Journal
Vol. 5, Issue 6, pp. 62-68, Dec 2017
CrossRef-DOI: https://doi.org/10.26438/ijsrcse/v5i6.6268
Online published on Dec 31, 2017
Copyright © Ashaq Hussain Bhat, Puniethaa Prabhu. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
How to Cite this Paper
IEEE Style Citation: Ashaq Hussain Bhat, Puniethaa Prabhu, "OTU Clustering: A window to analyse uncultured microbial world," International Journal of Scientific Research in Computer Science and Engineering, Vol.5, Issue.6, pp.62-68, 2017.
MLA Style Citation: Ashaq Hussain Bhat, Puniethaa Prabhu "OTU Clustering: A window to analyse uncultured microbial world." International Journal of Scientific Research in Computer Science and Engineering 5.6 (2017): 62-68.
APA Style Citation: Ashaq Hussain Bhat, Puniethaa Prabhu, (2017). OTU Clustering: A window to analyse uncultured microbial world. International Journal of Scientific Research in Computer Science and Engineering, 5(6), 62-68.
BibTex Style Citation:
@article{Bhat_2017,
author = {Ashaq Hussain Bhat and Puniethaa Prabhu},
title = {OTU Clustering: A window to analyse uncultured microbial world},
journal = {International Journal of Scientific Research in Computer Science and Engineering},
issue_date = {12 2017},
volume = {5},
Issue = {6},
month = {12},
year = {2017},
issn = {2347-2693},
pages = {62-68},
url = {https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=519},
doi = {https://doi.org/10.26438/ijsrcse/v5i6.6268},
publisher = {IJCSE, Indore, INDIA},
}
RIS Style Citation:
TY - JOUR
DO - https://doi.org/10.26438/ijsrcse/v5i6.6268
UR - https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=519
TI - OTU Clustering: A window to analyse uncultured microbial world
T2 - International Journal of Scientific Research in Computer Science and Engineering
AU - Ashaq Hussain Bhat
AU - Puniethaa Prabhu
PY - 2017
DA - 2017/12/31
PB - IJCSE, Indore, INDIA
SP - 62
EP - 68
IS - 6
VL - 5
SN - 2347-2693
ER -
Abstract:
Clustering is a technique for handling large amounts of data by partitioning them into groups based on shared attributes. It has many applications across science and technology. In genomics and metagenomics, it is an important tool for taxonomic profiling of the microbial world, grouping 16S rDNA amplicon reads into clusters called Operational Taxonomic Units (OTUs). With Next Generation Sequencing (NGS) tools and clustering, it has become easier for scientists to survey microbial diversity in different environments without culturing the microbes. Assigning 16S rDNA sequences to OTUs is the central task of metagenomics algorithms and also the main bottleneck in analysing microbial communities. Taxonomic profiling of 16S rDNA is an important step in metagenomic analysis pipelines. Several OTU clustering algorithms cluster 16S rDNA amplicon reads into OTUs, each using a specific type of clustering technique. Some of the most widely used are Uclust, Swarm, SUMACLUST, SortMeRNA, and USEARCH. In this paper, we first give a brief overview of the major clustering techniques and their types. We then provide a comprehensive overview of OTU clustering algorithms.
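To make the idea of grouping amplicon reads into OTUs concrete, the following is a minimal sketch of greedy centroid-based clustering, one general family of approaches. It is not the implementation of any tool named above. The function names, the 0.97 identity threshold (a common convention that the abstract does not state), and the position-by-position identity measure on equal-length reads are all assumptions made for illustration; real tools compute identity from sequence alignments.

```python
from typing import List


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length reads.

    Assumption: a simplified stand-in for alignment-based identity.
    Reads of different length are treated as non-matching.
    """
    if len(a) != len(b) or not a:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def greedy_otu_clustering(reads: List[str], threshold: float = 0.97) -> List[List[str]]:
    """Assign each read to the first OTU whose centroid it matches.

    The first read placed in an OTU serves as its centroid. A read that
    matches no existing centroid at the given threshold starts a new OTU.
    """
    otus: List[List[str]] = []  # otus[i][0] is the centroid of OTU i
    for read in reads:
        for otu in otus:
            if identity(read, otu[0]) >= threshold:
                otu.append(read)
                break
        else:
            # No centroid was similar enough: this read founds a new OTU.
            otus.append([read])
    return otus


# Hypothetical toy reads of length 100 (not real 16S rDNA data).
read_a = "A" * 100
read_b = "A" * 98 + "CC"      # 98% identical to read_a
read_c = "A" * 90 + "C" * 10  # 90% identical to read_a

otus = greedy_otu_clustering([read_a, read_b, read_c])
print(len(otus))
```

Because the first read in each OTU is its centroid, the result of this sketch depends on input order. That sensitivity is a known property of greedy approaches in general.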
Keywords / Index Terms:
16S rDNA; OTUs; Uclust; SUMACLUST; SortMeRNA; USEARCH; taxonomic profiling