Classification Modeling using Methods Cross Industry Standard Process for Data Mining to Improve Product Quality

Pravitasari Anjar Pratiwi; Ahmad Saikhu

doi:10.12962/j24609463.v8i2.1405

Authors

Pravitasari Anjar Pratiwi, PT HM Sampoerna Tbk
Ahmad Saikhu, Institut Teknologi Sepuluh Nopember

Keywords:

Cigarette Filter, Data Mining, Product Quality, Logistic Regression, Random Forest

Abstract

The purpose of this research is to develop a classification model and to measure the effect of changes in production inputs on product quality at PT. XZY. The study adopts the CRISP-DM (Cross Industry Standard Process for Data Mining) framework as its analytical method. At the modeling stage, classification models are developed using Random Forest and Logistic Regression. The results show that applying the k-means SMOTE oversampling technique produces product classification models with better predictive performance. For Random Forest, k-means SMOTE increased the AUC from 0.810 to 0.944, while for Logistic Regression the AUC increased from 0.690 to 0.724. The higher AUC values indicate that k-means SMOTE yields a model that is highly sensitive in predicting the data classes, with low false positive and false negative rates. Changes in tow, adhesive, and triacetin materials have a significant effect on the quality of the resulting product. Based on their respective odds ratios, using adhesive, triacetin, and tow materials with SA/TRIAL status increases the chance of producing a product that does not meet the company's quality standards.
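The two summary statistics the abstract relies on, AUC and the odds ratio, can be illustrated with a minimal sketch. The function names, the toy labels and scores, and the 2x2 counts below are all hypothetical and are not taken from the study. The sketch does not reproduce the Random Forest, Logistic Regression, or k-means SMOTE steps, and the study's odds ratios appear to come from the fitted Logistic Regression rather than from raw counts as here. It only shows what the reported quantities measure.

```python
from itertools import product


def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def odds_ratio(fail_exposed, pass_exposed, fail_ref, pass_ref):
    """Odds of a failing product under one material status relative to a reference status."""
    return (fail_exposed / pass_exposed) / (fail_ref / pass_ref)


# Hypothetical toy data, NOT from the study: 1 = product fails the quality standard.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.2, 0.1]
toy_auc = auc(labels, scores)  # 13 of 15 positive/negative pairs are ranked correctly

# Hypothetical counts: 30 fails / 70 passes with a trial-status material,
# 10 fails / 90 passes with the reference material.
toy_or = odds_ratio(30, 70, 10, 90)  # (30/70) / (10/90) = 27/7

print(f"AUC = {toy_auc:.3f}, odds ratio = {toy_or:.2f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes. An odds ratio above 1 means the odds of a failing product are higher under the material status being compared than under the reference status.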