The Necessity of Exploratory Data Analysis: How are preprocessing activities beneficial to Data Analysts and Professional Researchers in Academia?

Ismail Olaniyi Muraina1, Olayemi Muyideen Adesanya2, Moses Adeolu Agoi3, Solomon Onen Abam4

  1. Department of Computer Science, Lagos State University of Education, Lagos, Nigeria.
  2. Department of Computer Science, Lagos State University of Education, Lagos, Nigeria.
  3. Department of Computer Science, Lagos State University of Education, Lagos, Nigeria.
  4. Department of Computer Science, Federal College of Education Technical, Ebonyi, Nigeria.

Section: Research Paper, Product Type: Journal-Paper
Vol.11, Issue.3, pp.22-28, Jun-2023


Online published on Jun 30, 2023


Copyright © Ismail Olaniyi Muraina, Olayemi Muyideen Adesanya, Moses Adeolu Agoi, Solomon Onen Abam. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
 

How to Cite this Paper

IEEE Style Citation: Ismail Olaniyi Muraina, Olayemi Muyideen Adesanya, Moses Adeolu Agoi, Solomon Onen Abam, “The Necessity of Exploratory Data Analysis: How are preprocessing activities beneficial to Data Analysts and Professional Researchers in Academia?,” International Journal of Scientific Research in Computer Science and Engineering, Vol.11, Issue.3, pp.22-28, 2023.

MLA Style Citation: Ismail Olaniyi Muraina, Olayemi Muyideen Adesanya, Moses Adeolu Agoi, Solomon Onen Abam "The Necessity of Exploratory Data Analysis: How are preprocessing activities beneficial to Data Analysts and Professional Researchers in Academia?." International Journal of Scientific Research in Computer Science and Engineering 11.3 (2023): 22-28.

APA Style Citation: Ismail Olaniyi Muraina, Olayemi Muyideen Adesanya, Moses Adeolu Agoi, Solomon Onen Abam, (2023). The Necessity of Exploratory Data Analysis: How are preprocessing activities beneficial to Data Analysts and Professional Researchers in Academia?. International Journal of Scientific Research in Computer Science and Engineering, 11(3), 22-28.

BibTex Style Citation:
@article{Muraina_2023,
author = {Ismail Olaniyi Muraina and Olayemi Muyideen Adesanya and Moses Adeolu Agoi and Solomon Onen Abam},
title = {The Necessity of Exploratory Data Analysis: How are preprocessing activities beneficial to Data Analysts and Professional Researchers in Academia?},
journal = {International Journal of Scientific Research in Computer Science and Engineering},
issue_date = {6 2023},
volume = {11},
number = {3},
month = {6},
year = {2023},
issn = {2347-2693},
pages = {22--28},
url = {https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=3141},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
UR - https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=3141
TI - The Necessity of Exploratory Data Analysis: How are preprocessing activities beneficial to Data Analysts and Professional Researchers in Academia?
T2 - International Journal of Scientific Research in Computer Science and Engineering
AU - Ismail Olaniyi Muraina
AU - Olayemi Muyideen Adesanya
AU - Moses Adeolu Agoi
AU - Solomon Onen Abam
PY - 2023
DA - 2023/06/30
PB - IJCSE, Indore, INDIA
SP - 22
EP - 28
IS - 3
VL - 11
SN - 2347-2693
ER -


Abstract:
Data analysis is used in all academic disciplines. Still, research has shown that some studies can appear ambiguous when the analyzed data is not illustrated well enough to reveal trends, patterns, and underlying assumptions. Identifying these typically enables researchers to present the statistical summary using pertinent, self-explanatory graphical representations. Analysts need a method that helps them condense the dataset's critical characteristics for straightforward interpretation and presentation to an audience. In addition to presenting the impact of preprocessing activities in ensuring an error-free dataset before the actual analysis is done, this study shows how to investigate a dataset adequately so that a clean dataset is available for accurate analysis and interpretation. The most popular preprocessing procedures, including the handling of missing values, outliers, and variable transformation, are listed. The study used a descriptive survey design, with a questionnaire administered to respondents through Google Forms. Information was collected on respondents' gender, amount of data analysis experience, institution type, and role within the academic community. Face validity and construct validity methods were used to validate the instrument. Cronbach's alpha yielded a reliability index of 0.88, indicating good reliability. Because the questionnaire was prepared and administered online using Google Forms, data collection and collation took only four days. Appropriate visualization software was used for the analysis. The results showed that thoroughly exploring the data and removing any bias or outliers is the first step a data analyst must take before starting the analysis proper. The use of some of the tools available for cleaning datasets was also outlined, and it was recommended that novice analysts take the time to learn how to use them. Final-year undergraduate and postgraduate students should be exposed to all exploratory data analysis methods before beginning their final-year projects.
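The three preprocessing procedures named in the abstract (missing values, outliers, and variable transformation) can be illustrated with a minimal sketch. This is not the authors' procedure or tooling, which the abstract does not specify; the function names, the median-imputation choice, Tukey's 1.5 × IQR rule, the log(1 + x) transformation, and the sample data are all assumptions made for illustration, using only the Python standard library.

```python
import math
import statistics


def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def iqr_outlier_flags(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


def log_transform(values):
    """Apply log(1 + x) to compress a right-skewed, non-negative variable."""
    return [math.log1p(v) for v in values]


# Hypothetical survey column (e.g. years of data analysis experience);
# None marks a missing response. The numbers are invented for illustration.
raw = [2, 3, None, 4, 5, 3, 40, None, 6]

filled = impute_missing(raw)                 # step 1: handle missing values
flags = iqr_outlier_flags(filled)            # step 2: detect outliers
cleaned = [v for v, is_outlier in zip(filled, flags) if not is_outlier]
transformed = log_transform(cleaned)         # step 3: transform the variable
```

Whether an outlier should be dropped, capped, or kept is a judgment that depends on the study; the sketch drops it only to keep the example short.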

Keywords / Index Terms:
Data Analysts, Exploratory Data Analysis, Dataset, Tools, Statistical Summary, Preprocessing

