Abstract:
Diabetes is a common health problem whose incidence is increasing worldwide. It is a chronic disease that,
if not brought under control, can damage organs such as the eyes, heart, and kidneys and can lead to death.
Early diagnosis is important for preventing complications and improving quality of life. Machine learning
techniques, which are widely used in medicine, can serve as intelligent decision support systems that help
experts diagnose a range of diseases.
This study applies six machine learning techniques to the Pima Indian Diabetes dataset for the early
diagnosis of diabetes. One of the main goals is to increase prediction accuracy. To improve classifier
performance, fourteen resampling methods were applied to the dataset. In total, ninety classifications were
carried out: each of the six models was trained once without resampling and once with each of the fourteen
resampling methods. Each classification was evaluated with five performance metrics. The highest performance
was obtained with an accuracy
of 96.296% using Random Forest with the InstanceHardnessThreshold undersampling technique. Resampling
techniques generally improved classifier performance and were more effective when combined with ensemble
learning methods. The results obtained in this study are higher than those reported in similar studies in
the literature.
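
The count of ninety classifications follows from pairing each of the six models with fifteen settings (no resampling plus fourteen resampling methods). The sketch below only enumerates that grid. Apart from Random Forest and InstanceHardnessThreshold, which the abstract names, every classifier and resampler label is a hypothetical placeholder, and no training is performed.

```python
from itertools import product

# Only "RandomForest" and "InstanceHardnessThreshold" are named in the abstract.
# All other labels are hypothetical placeholders for the unnamed methods.
classifiers = ["RandomForest"] + [f"classifier_{i}" for i in range(2, 7)]

# None stands for the baseline run without resampling.
resamplers = [None, "InstanceHardnessThreshold"] + [
    f"resampler_{i}" for i in range(2, 15)
]

# One run per (classifier, resampling setting) pair.
runs = list(product(classifiers, resamplers))

print(len(classifiers), "classifiers x", len(resamplers), "settings =", len(runs), "runs")
```

In a real experiment, each pair in `runs` would be mapped to a resample, fit, and evaluate step; that step is omitted here because the abstract does not specify it.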