Selection of Feature Data in KNN Classification Datasets

Muhammad Iman Nur Hakim
Iwan Setyawan
Danny Manongga
Hindriyanto Dwi Purnomo
Hendry

Abstract

The features in a dataset can affect data processing, either for the better or for the worse, and they can also affect processing time. Selecting the right features, those that can represent the dataset as a whole, may therefore be necessary. This study searches for the feature data that results in better data processing. Classification is carried out on the Iris dataset with the KNN algorithm. The Iris dataset has 4 features (Sepal Length, Sepal Width, Petal Length, Petal Width), and the study determines which feature variation is most suitable for this classification. The dataset is broken down into 7 feature variations, each tested with training-to-test data ratios of 90:10, 80:20, 70:30, 60:40, 50:50, 40:60, 30:70, 20:80 and 10:90. The KNN algorithm is configured with 5 neighbors and the Minkowski metric. The highest accuracy obtained in this study was 96% and the lowest was 71%. The highest accuracy was obtained from the variation using the Petal Length and Petal Width features, while the lowest was obtained from the variation using the Sepal Length and Sepal Width features.
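
The procedure described in the abstract can be illustrated with a minimal sketch: restrict the data to a chosen set of feature columns, classify each test row by majority vote among its 5 nearest training rows under the Minkowski distance, and report accuracy. This is not the authors' code. The function names, the Minkowski order p = 2 (the abstract does not state p), the single fixed train/test split, and the eight rows of data are all assumptions made for illustration. The rows are made up, they are not Iris measurements, and the numbers printed say nothing about the study's results.

```python
from collections import Counter

# Column order follows the four features named in the abstract.
FEATURES = ("Sepal Length", "Sepal Width", "Petal Length", "Petal Width")


def minkowski(a, b, p=2):
    """Minkowski distance of order p (p=2 is an assumption; it gives Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def select(rows, cols):
    """Keep only the feature columns whose indices are listed in cols."""
    return [tuple(row[c] for c in cols) for row in rows]


def knn_predict(train_X, train_y, query, k=5, p=2):
    """Return the majority label among the k training rows nearest to query."""
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: minkowski(pair[0], query, p))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, cols, k=5, p=2):
    """Fraction of test rows classified correctly using only columns in cols."""
    tr, te = select(train_X, cols), select(test_X, cols)
    hits = sum(knn_predict(tr, train_y, x, k, p) == y
               for x, y in zip(te, test_y))
    return hits / len(test_y)


# Hypothetical toy rows (NOT Iris data), two made-up classes "A" and "B".
train_X = [(5.0, 3.4, 1.4, 0.2), (5.8, 2.8, 1.5, 0.3), (6.1, 3.0, 1.3, 0.2),
           (5.1, 3.3, 4.5, 1.5), (5.9, 2.9, 4.7, 1.4), (6.0, 3.1, 4.4, 1.3)]
train_y = ["A", "A", "A", "B", "B", "B"]
test_X = [(5.2, 3.2, 1.5, 0.2), (5.7, 3.0, 4.6, 1.5)]
test_y = ["A", "B"]

# Two of the feature variations named in the abstract.
sepal_cols = (0, 1)  # Sepal Length, Sepal Width
petal_cols = (2, 3)  # Petal Length, Petal Width

acc_sepal = accuracy(train_X, train_y, test_X, test_y, sepal_cols)
acc_petal = accuracy(train_X, train_y, test_X, test_y, petal_cols)

for name, cols, acc in (("sepal", sepal_cols, acc_sepal),
                        ("petal", petal_cols, acc_petal)):
    print(name, [FEATURES[c] for c in cols], acc)
```

To approximate the full experiment, the same `accuracy` call would be repeated for each feature variation and each training-to-test ratio on the real Iris data. A library classifier could replace the hand-written functions; scikit-learn's `KNeighborsClassifier`, for example, defaults to 5 neighbors and the Minkowski metric with p = 2, although the abstract does not say which implementation was used.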

Article Details

How to Cite
Muhammad Iman Nur Hakim, Iwan Setyawan, Danny Manongga, Hindriyanto Dwi Purnomo, & Hendry. (2026). Selection of Feature Data in KNN Classification Datasets. IPTEK The Journal for Technology and Science, 37(1), 7–16. https://doi.org/10.12962/j20882033.v37i1.9263
Section
Articles