A Robust Multi-Keyword Text Content Retrieval by Utilizing Hash Indexing
Keywords:
Information Retrieval, Text Feature, Text Mining, Text Ontology
Abstract
The growing volume of digital content on servers makes storage and retrieval harder, so researchers in this field work on organizing content for fast retrieval while keeping the data secure. This paper addresses the retrieval of digital text content stored as documents and files. A user searches for a desired file with a text query and receives a list of relevant files. Keywords are extracted from the text content after noisy data is removed during pre-processing. Each pre-processed keyword is identified by a number called its term ID. From these term IDs, each text document is assigned a hash index, whose entries serve as the key numbers in the document index. Because document terms and user queries are compared through their term IDs rather than the words themselves, hash-based searching preserves privacy during matching. Since documents are identified by their hash index keys, the text content is stored in encrypted numeric form; once a user selects a document for reading, it is decrypted for that particular user. Experiments were conducted on real and artificial text datasets covering different topics. The results indicate that the proposed model of hash indexing and term-based retrieval improves privacy while keeping results relevant to the query.
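The indexing and matching steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the keyed hash (HMAC-SHA-256 truncated to 64 bits) used to derive term IDs, the small stop-word list, the overlap-count ranking, and the names `term_id`, `keywords`, `build_index`, and `search` are all assumptions made for illustration. Document encryption and decryption are left out.

```python
import hashlib
import hmac
import re

# Hypothetical stop-word list; the paper does not specify its pre-processing rules.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "and", "or", "is", "are", "to", "for"}


def term_id(term, key):
    """Map a term to a numeric ID with a keyed hash (an assumed scheme)."""
    digest = hmac.new(key, term.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big")


def keywords(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def build_index(docs, key):
    """Build a document index that stores only term IDs, never the words."""
    return {name: {term_id(t, key) for t in keywords(body)}
            for name, body in docs.items()}


def search(index, query, key):
    """Rank documents by how many query term IDs they share with the index."""
    query_ids = {term_id(t, key) for t in keywords(query)}
    scored = [(len(query_ids & ids), name) for name, ids in index.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in ranked if score > 0]
```

In this sketch the query is hashed with the same key as the documents, so matching compares numbers only. A document that shares more term IDs with the query ranks higher, and ties are broken by file name.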