Selection of Feature Data in KNN Classification Datasets

Muhammad Iman Nur Hakim; Iwan Setyawan; Danny Manongga; Hindriyanto Dwi Purnomo; Hendry

doi:10.12962/j20882033.v37i1.9263

Published: Feb 12, 2026

Keywords: Feature data, KNN, Accuracy

Muhammad Iman Nur Hakim

Department of Automotive Engineering Technology, Polytechnic of Road Transportation Safety, Tegal, 52125, Indonesia

Iwan Setyawan

Department of Computer Science, Satya Wacana Christian University, Salatiga, 50711, Indonesia

Danny Manongga

Department of Computer Science, Satya Wacana Christian University, Salatiga, 50711, Indonesia

Hindriyanto Dwi Purnomo

Department of Computer Science, Satya Wacana Christian University, Salatiga, 50711, Indonesia

Hendry

Department of Computer Science, Satya Wacana Christian University, Salatiga, 50711, Indonesia

Abstract

The feature data in a dataset can affect data processing for better or for worse, and can also affect processing time. It may therefore be necessary to select feature data that represent the dataset as a whole. This study searches for the feature data that yield better data processing. Classification is carried out on the Iris dataset with the KNN algorithm. The Iris dataset has 4 features (Sepal Length, Sepal Width, Petal Length, Petal Width), and the study determines which variation of these features is most suitable for this classification. The dataset is broken down into 7 variations of feature data and tested at training-to-test data ratios of 90:10, 80:20, 70:30, 60:40, 50:50, 40:60, 30:70, 20:80 and 10:90. The KNN algorithm is configured with 5 neighbors (n_neighbors = 5) and the Minkowski metric. The highest accuracy obtained was 96% and the lowest was 71%. The highest accuracy came from the variation using Petal Length and Petal Width, while the lowest came from the variation using Sepal Length and Sepal Width.
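The procedure described in the abstract, KNN with 5 neighbors and the Minkowski metric evaluated on different feature subsets, can be sketched roughly as follows. This is a minimal pure-Python illustration, not the authors' implementation. The function names, the Minkowski order p = 2, and the toy rows are all assumptions made for the example (the rows are invented and are not Iris measurements). The abstract does not say which 7 feature variations were used, so the sketch enumerates every non-empty subset of the 4 columns, and it leaves out the train/test ratio sweep.

```python
from collections import Counter
from itertools import combinations


def minkowski(a, b, p=2):
    """Minkowski distance of order p (p=2 is the Euclidean distance)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def knn_predict(train_X, train_y, query, k=5, p=2):
    """Return the majority label among the k training rows nearest to query."""
    order = sorted(range(len(train_X)),
                   key=lambda i: minkowski(train_X[i], query, p))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def select(rows, cols):
    """Keep only the feature columns listed in cols."""
    return [[row[c] for c in cols] for row in rows]


def accuracy(train_X, train_y, test_X, test_y, cols, k=5):
    """Fraction of test rows classified correctly using only columns cols."""
    tr, te = select(train_X, cols), select(test_X, cols)
    hits = sum(knn_predict(tr, train_y, q, k) == y for q, y in zip(te, test_y))
    return hits / len(test_y)


def accuracy_by_subset(train_X, train_y, test_X, test_y, k=5):
    """Accuracy for every non-empty subset of feature columns."""
    n_features = len(train_X[0])
    return {
        cols: accuracy(train_X, train_y, test_X, test_y, cols, k)
        for size in range(1, n_features + 1)
        for cols in combinations(range(n_features), size)
    }


# Hypothetical toy rows (NOT Iris data): columns 0-1 overlap between the
# two classes, columns 2-3 separate them cleanly.
TRAIN_X = [
    [5.0, 3.0, 1.0, 0.2], [6.0, 3.0, 1.2, 0.2], [5.0, 2.0, 1.4, 0.3],
    [6.0, 2.0, 1.1, 0.1], [5.5, 2.5, 1.3, 0.2], [5.0, 2.5, 1.5, 0.4],
    [5.0, 3.0, 4.0, 1.3], [6.0, 3.0, 4.2, 1.4], [5.0, 2.0, 4.4, 1.2],
    [6.0, 2.0, 4.1, 1.5], [5.5, 2.5, 4.3, 1.3], [6.0, 2.5, 4.5, 1.1],
]
TRAIN_Y = ["a"] * 6 + ["b"] * 6
TEST_X = [[5.2, 2.8, 1.2, 0.2], [5.8, 2.2, 4.3, 1.3]]
TEST_Y = ["a", "b"]

if __name__ == "__main__":
    results = accuracy_by_subset(TRAIN_X, TRAIN_Y, TEST_X, TEST_Y)
    for cols, acc in sorted(results.items(), key=lambda kv: -kv[1]):
        print(cols, acc)
```

To repeat the experiment on real data, the toy rows would be replaced by the Iris measurements and the evaluation repeated for each training-to-test ratio listed in the abstract.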

How to Cite

Muhammad Iman Nur Hakim, Iwan Setyawan, Danny Manongga, Hindriyanto Dwi Purnomo, & Hendry. (2026). Selection of Feature Data in KNN Classification Datasets. IPTEK The Journal for Technology and Science, 37(1), 7–16. https://doi.org/10.12962/j20882033.v37i1.9263

Issue

Vol. 37 No. 1 (2026)

Section

Articles
