A Survey on Technologies to Handle Big Data
Sri Vasavi College, Erode Self-Finance Wing, 3rd February 2017. National Conference on Computer and Communication, NCCC’17. International Journal of Computer Science (IJCS) Published by SK Research Group of Companies (SKRGC)
Abstract
The term Big Data describes itself: data that is very large, that is, large-scale and multifaceted data processed using specific analytical methods. Big Data emerged when traditional relational database tools and packages proved unable to handle such huge data sets in terms of size, speed, source, storage, and range of data. The field focuses on techniques to collect, process, analyse, and visualize potentially large sets of data within a stipulated timeframe. The challenges of Big Data analytics paved the way for distributed file systems, which are flexible, scalable, and fault tolerant; the architecture therefore moves from centralized to distributed. The technologies used to handle such massive data include Hadoop, MapReduce, NoSQL, Apache Hive, Pig, and MongoDB. Researchers working on Big Data seek to gather massive data from social media such as Twitter and Facebook, news services, web logs, images and videos, documents and PDFs, biometric data, government agency sources, and human-generated data. Because large data sets make computation complex in areas such as physics simulation, meteorology, genomics, and biological and environmental research, scientists have moved to Big Data techniques. For processing, Big Data relies on HDFS (Hadoop Distributed File System) together with MapReduce programs, which run on clusters of computers that process massive amounts of data in parallel and generate large amounts of intermediate data. Big Data is used not only in the IT industry and business but has also taken root in healthcare, social media, government, transportation, and education and research. This paper analyses the various technologies that researchers have proposed to deal with heterogeneous data.
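The MapReduce processing mentioned in the abstract can be illustrated with a minimal single-machine sketch. This is not the paper's implementation and does not use Hadoop; the word-count task and the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are assumptions chosen for illustration. It only shows how a map step emits intermediate key-value pairs that are grouped by key and then reduced.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(document: str) -> List[Tuple[str, int]]:
    """Emit one (word, 1) pair per word; this is the intermediate data."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group intermediate values by key, as a cluster would before reducing."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the grouped values for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce sequentially (hypothetical driver)."""
    intermediate: List[Tuple[str, int]] = []
    for doc in documents:
        # On a real cluster each document split would be mapped in parallel.
        intermediate.extend(map_phase(doc))
    return reduce_phase(shuffle(intermediate))


if __name__ == "__main__":
    print(word_count(["big data", "Big Data tools"]))
```

In a real deployment the map and reduce calls would run on different nodes and read their input from a distributed file system; here they run in one process so that the data flow is easy to follow.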
Keywords
Big Data, Hadoop, MapReduce, Social Media, Apache Hive, Twitter, Research