dc.contributor.author: Ümit Yilmaz
dc.contributor.author: Zafer Aydin
dc.contributor.author: V. Çağri Güngör
dc.contributor.author: Cengiz Gezer
dc.date.accessioned: 2022-04-08T07:16:55Z
dc.date.available: 2022-04-08T07:16:55Z
dc.date.issued: 2021 [en_US]
dc.identifier.isbn: 978-145038954-9
dc.identifier.uri: https://doi.org/10.1145/3471287.3471299
dc.identifier.uri: https://hdl.handle.net/20.500.12573/1256
dc.description.abstract: Determining potential customers is very important in direct marketing. Data mining techniques are among the most important methods companies use to identify potential customers. However, since the number of potential customers is very low compared to the number of non-potential customers, there is a class imbalance problem that significantly affects the performance of data mining techniques. In this paper, different combinations of basic and advanced resampling techniques, such as Synthetic Minority Oversampling Technique (SMOTE), Tomek Link, Random Under-Sampling (RUS), and Random Over-Sampling (ROS), were evaluated to improve the performance of customer classification. Different feature selection techniques, such as Information Gain, Gain Ratio, Chi-squared, and Relief, were used to reduce the number of non-informative features in the data. Classification performance was compared across several data mining techniques, namely LightGBM, XGBoost, Gradient Boost, Random Forest, AdaBoost, ANN, Logistic Regression, Decision Trees, SVC, and Bagging Classifier, based on ROC AUC and sensitivity metrics. A combination of Tomek Link and Random Under-Sampling as the resampling technique and the Chi-squared method as the feature selection algorithm showed superior performance among the combinations tested. Detailed performance evaluations demonstrated that, with the proposed approach, LightGBM, a gradient boosting algorithm based on decision trees, gave the best results among the classifiers, with 0.947 sensitivity and a ROC AUC value of 0.896. © 2021 ACM. [en_US]
dc.description.sponsorship: Illinois State University; South Asia Institute of Science and Engineering (SAISE); University of Hawaii at Hilo [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Association for Computing Machinery [en_US]
dc.relation.isversionof: 10.1145/3471287.3471299 [en_US]
dc.rights: info:eu-repo/semantics/closedAccess [en_US]
dc.subject: Data Mining [en_US]
dc.subject: Direct Marketing [en_US]
dc.subject: Imbalanced Data [en_US]
dc.subject: Machine Learning [en_US]
dc.subject: Tomek Link [en_US]
dc.title: Data Mining Techniques in Direct Marketing on Imbalanced Data using Tomek Link Combined with Random Under-sampling [en_US]
dc.type: conferenceObject [en_US]
dc.contributor.department: AGÜ, Faculty of Engineering, Department of Computer Engineering [en_US]
dc.contributor.institutionauthor: Yilmaz, Ümit
dc.contributor.institutionauthor: Aydin, Zafer
dc.contributor.institutionauthor: Güngör, V. Çağri
dc.relation.journal: ACM International Conference Proceeding Series [en_US]
dc.relation.publicationcategory: Conference Item - International - Institutional Faculty Member [en_US]
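
Illustrative note on the resampling step named in the abstract: the sketch below shows one plausible way to combine Tomek Link removal with Random Under-Sampling, using only the Python standard library. It is not the authors' implementation. The function names, the toy data, the choice to drop only the majority-class member of each link, and the 1:1 target ratio are all assumptions made for illustration. Feature selection (Chi-squared) and the LightGBM classifier are not shown.

```python
import math
import random


def nearest_neighbor(i, points):
    """Index of the point closest to points[i] (brute force, Euclidean)."""
    best_j, best_d = None, math.inf
    for j, p in enumerate(points):
        if j == i:
            continue
        d = math.dist(points[i], p)
        if d < best_d:
            best_j, best_d = j, d
    return best_j


def remove_tomek_links(points, labels, majority=0):
    """Drop the majority-class member of every Tomek link.

    Two samples form a Tomek link when they carry different labels and
    each one is the other's nearest neighbor.
    """
    nn = [nearest_neighbor(i, points) for i in range(len(points))]
    drop = set()
    for i, j in enumerate(nn):
        if nn[j] == i and labels[i] != labels[j]:
            drop.add(i if labels[i] == majority else j)
    keep = [i for i in range(len(points)) if i not in drop]
    return [points[i] for i in keep], [labels[i] for i in keep]


def random_under_sample(points, labels, majority=0, seed=0):
    """Randomly discard majority samples until the classes are equal in size.

    The 1:1 target ratio is an assumption of this sketch.
    """
    rng = random.Random(seed)
    maj = [i for i, y in enumerate(labels) if y == majority]
    mino = [i for i, y in enumerate(labels) if y != majority]
    kept_maj = rng.sample(maj, min(len(maj), len(mino)))
    keep = sorted(kept_maj + mino)
    return [points[i] for i in keep], [labels[i] for i in keep]


# Hypothetical toy data: label 0 = non-potential customer (majority),
# label 1 = potential customer (minority).
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0),
     (1.05, 1.0), (2.0, 2.0), (3.0, 3.0), (5.0, 5.0)]
y = [0, 0, 0, 0, 1, 1, 0, 0]

X_t, y_t = remove_tomek_links(X, y)        # step 1: clean the class boundary
X_b, y_b = random_under_sample(X_t, y_t)   # step 2: balance the classes
```

In this toy set, the majority sample at (1.0, 1.0) and the minority sample at (1.05, 1.0) are mutual nearest neighbors with different labels, so the first step removes the majority sample; the second step then trims the remaining majority samples down to the size of the minority class.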

