META-LEARNING-BASED AUTOMATED FEATURE ENGINEERING FOR HIGH-DIMENSIONAL DATA

International Journal of Computer Science (IJCS) Published by SK Research Group of Companies (SKRGC)

Abstract

High-dimensional datasets pose significant challenges for traditional machine learning models: redundant, irrelevant, and highly correlated features degrade predictive performance and increase computational cost. This paper proposes an Automated Feature Engineering (AFE) system that uses meta-learning to identify and generate the most informative features across diverse data domains. Drawing on a meta-knowledge base of past performance, the system selects feature transformation and selection strategies, significantly reducing the manual effort required in the data preprocessing phase. The framework integrates search algorithms and performance estimation techniques to explore the feature space efficiently. Experimental evaluations show that the meta-learning approach consistently outperforms baseline methods in accuracy, scalability, and efficiency in high-dimensional settings. This research provides a scalable solution for accelerating the data science lifecycle in complex, large-scale applications.
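
The idea of choosing a feature strategy from a meta-knowledge base can be illustrated with a minimal sketch. It is not the authors' implementation: the meta-features, the strategy names (`variance_filter`, `pca`), the `META_KB` table, and every score in it are hypothetical placeholders chosen only to show the mechanism, and nearest-neighbour lookup is just one simple way such a selection could be made.

```python
import math
from statistics import mean, pstdev


def meta_features(rows):
    """Summarise a dataset (a list of equal-length numeric rows) as a small vector."""
    n_rows, n_cols = len(rows), len(rows[0])
    columns = list(zip(*rows))
    return (
        math.log10(n_rows),                     # sample count, log scale
        math.log10(n_cols),                     # dimensionality, log scale
        mean(pstdev(col) for col in columns),   # average per-feature spread
    )


# Hypothetical meta-knowledge base: meta-features of past datasets paired with
# the score each strategy reached there. All numbers are made-up placeholders,
# not results from the paper.
META_KB = [
    ((2.0, 3.0, 1.0), {"variance_filter": 0.81, "pca": 0.86}),
    ((4.0, 1.0, 0.5), {"variance_filter": 0.90, "pca": 0.84}),
]


def recommend_strategy(rows, kb=META_KB):
    """Return the strategy that scored best on the most similar past dataset."""
    target = meta_features(rows)
    # Nearest past dataset by Euclidean distance in meta-feature space.
    _, past_scores = min(kb, key=lambda entry: math.dist(entry[0], target))
    return max(past_scores, key=past_scores.get)


if __name__ == "__main__":
    # A wide toy dataset: 100 samples, 1000 features.
    wide = [[(i + j) % 3 for j in range(1000)] for i in range(100)]
    print(recommend_strategy(wide))
```

A fuller system would presumably use richer meta-features and validate the recommended strategy on the new dataset before committing to it; the sketch only shows how past performance can steer the choice without an exhaustive search.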

Keywords

Automated Feature Engineering, Meta-Learning, High-Dimensional Data, Machine Learning, Data Preprocessing, Feature Selection.

  • Format: Volume 14, Issue 1, No 25, 2026
  • Copyright: All Rights Reserved ©2026
  • Year of Publication: 2026
  • Author: R. Harishma, Dr. D. Ragupathi
  • Reference: IJCS-694
  • Page No: 001-008

Copyright 2026 SK Research Group of Companies. All Rights Reserved.