{"id":2989,"date":"2016-08-12T23:16:25","date_gmt":"2016-08-12T23:16:25","guid":{"rendered":"http:\/\/www.rezafaisal.net\/?p=2989"},"modified":"2016-08-12T23:16:25","modified_gmt":"2016-08-12T23:16:25","slug":"r-tools-for-visual-studio-effect-of-imbalanced-class","status":"publish","type":"post","link":"https:\/\/www.rezafaisal.net\/?p=2989","title":{"rendered":"R Tools for Visual Studio: Effect of Imbalanced Class"},"content":{"rendered":"<p>In data mining, and in many machine learning techniques, data is the source of knowledge: an algorithm learns from it, and what it learns becomes the basis for recognizing new instances.<\/p>\n<p>This is the situation in supervised learning, particularly classification, where a classifier needs data to build a model that later decides which class a new instance belongs to. In general, the more data available, the better the resulting model and the more accurate the classification.<\/p>\n<p>Classification usually aims to distinguish between at least two classes. But what if the available data contains far more instances of one class than of the other? For example, in a two-class problem there might be 666 instances labeled A and only 23 labeled B. According to the literature, this hurts conventional classifiers such as KNN (k-nearest neighbors), SVM (support vector machine), Na\u00efve Bayes, decision trees, and others. What exactly happens to a classifier (classification algorithm) in this situation? 
To find out, we will run an experiment on \u201creal world\u201d data.<\/p>\n<p>For reference, this post uses R as the programming language and R Tools for Visual Studio as the programming tool (IDE). The code below also needs the following libraries.<\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 4px;\">\n<pre id=\"codeSnippet\" style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span style=\"color: #008000;\">#library<\/span>\r\nlibrary(kernlab)\r\nlibrary(caret)\r\nlibrary(ggfortify)\r\nlibrary(rgl)<\/pre>\n<p>&nbsp;<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<p>{<strong><em>Abalone<\/em><\/strong>}<\/p>\n<p>The data used is the Abalone dataset from the UCI Repository (<a title=\"https:\/\/archive.ics.uci.edu\/ml\/datasets\/Abalone\" href=\"https:\/\/archive.ics.uci.edu\/ml\/datasets\/Abalone\">https:\/\/archive.ics.uci.edu\/ml\/datasets\/Abalone<\/a>). The dataset file is abalone.data. The following functions, run in the R environment, show basic information about the dataset.<\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 4px;\">\n<div id=\"codeSnippet\" style=\"font-size: 8pt; overflow: 
visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\">\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum1\" style=\"color: #606060;\">   1:<\/span> abalone = read.csv(<span style=\"color: #006080;\">\"abalone.data\"<\/span>, header = FALSE)<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum2\" style=\"color: #606060;\">   2:<\/span> head(abalone)<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum3\" style=\"color: #606060;\">   3:<\/span> table(abalone[,9])<\/pre>\n<p><!--CRLF--><\/p>\n<\/div>\n<\/div>\n<p>Line 1 reads the file abalone.data into the object abalone. Line 2 displays the first rows of the data (head() shows 6 rows by default). Line 3 counts the instances for each class label (the class label is in column 9, the ring count, which indicates the abalone's age). 
The result in R Tools for Visual Studio looks like this.<\/p>\n<p><a href=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/blog01-3.jpg\"><img loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-width: 0px;\" title=\"blog01\" src=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/blog01_thumb-3.jpg\" alt=\"blog01\" width=\"550\" height=\"270\" border=\"0\" \/><\/a><\/p>\n<p>The output of line 3 shows that the abalone dataset has 29 classes (along with the number of instances for each class label), and nrow() gives the total number of instances in the dataset, 4177.<\/p>\n<p>&nbsp;<\/p>\n<p>{<strong><em>Class Imbalance<\/em><\/strong>}<\/p>\n<p>Class imbalance is the situation where one class has many more instances than the other. To create this situation, we combine the instances of two different class labels:<\/p>\n<ul>\n<li>class 12 vs class 6, with 267 and 259 instances respectively, as an example of balanced classes.<\/li>\n<li>class 12 vs class 14, with an imbalance ratio (IR) of roughly 2.<\/li>\n<li>class 12 vs class 17, with IR ~ 4.<\/li>\n<li>class 12 vs class 23, with IR ~ 30 (267 vs 9 instances).<\/li>\n<\/ul>\n<p>Imbalance makes it hard for a classifier to recognize the minority class (class B in the earlier example), so instances that actually belong to class B are likely to be predicted as class A.<\/p>\n<p>The following code builds each of the combinations above.<\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 
4px;\">\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\">class_majority = <span style=\"color: #006080;\">\"12\"<\/span>\r\nclass_minority = <span style=\"color: #006080;\">\"6\"<\/span>\r\n<span style=\"color: #008000;\">#drop column 1 (sex); keep the 7 numeric features and the class label V9<\/span>\r\nabalone = read.csv(<span style=\"color: #006080;\">\"abalone.data\"<\/span>, header = FALSE)[,2:9]\r\nmain_data = rbind(abalone[which(abalone$V9 == class_majority),], abalone[which(abalone$V9 == class_minority),])\r\n\r\n<span style=\"color: #008000;\">#relabel the two ring values as majority and minority<\/span>\r\n<span style=\"color: #0000ff;\">for<\/span> (i in 1:nrow(main_data)) {\r\n    <span style=\"color: #0000ff;\">if<\/span> (as.character(main_data[i, 8]) == class_majority) {\r\n        main_data[i, 8] = <span style=\"color: #006080;\">\"majority\"<\/span>\r\n    } <span style=\"color: #0000ff;\">else<\/span> {\r\n        main_data[i, 8] = <span style=\"color: #006080;\">\"minority\"<\/span>\r\n    }\r\n}<\/pre>\n<p>&nbsp;<\/p>\n<\/div>\n<p>To build the other combinations, just change class_minority to the desired class label. The combined data is stored in the object main_data. To see where each instance sits in the data space, we can use PCA as in the code below. 
For more detail on PCA and the data space, see <a title=\"http:\/\/www.rezafaisal.net\/?p=2951\" href=\"http:\/\/www.rezafaisal.net\/?p=2951\">http:\/\/www.rezafaisal.net\/?p=2951<\/a>.<\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 4px;\">\n<pre id=\"codeSnippet\" style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\">class_label = c(<span style=\"color: #006080;\">\"majority\"<\/span>=2, <span style=\"color: #006080;\">\"minority\"<\/span>=4)\r\nabalone_pca = princomp(main_data[, 1:7], cor = TRUE, scores = TRUE)\r\nautoplot(abalone_pca, data = main_data, colour = <span style=\"color: #006080;\">'V9'<\/span>, label = FALSE, label.size = 3)\r\nplot3d(abalone_pca$scores[, 1:3], col = <span style=\"color: #0000ff;\">as<\/span>.numeric(class_label[main_data$V9]))<\/pre>\n<p>&nbsp;<\/p>\n<\/div>\n<p>The result is shown in the figure below.<\/p>\n<p><a href=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/blog02-3.jpg\"><img loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-width: 0px;\" title=\"blog02\" src=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/blog02_thumb-3.jpg\" alt=\"blog02\" width=\"550\" height=\"328\" border=\"0\" \/><\/a><\/p>\n<p>&nbsp;<\/p>\n<p>{<strong><em>Class Imbalance &amp; SVM<\/em><\/strong>}<\/p>\n<p>This section shows how to run a classification with SVM. The function used is 
ksvm() from the \u201ckernlab\u201d package. If the package is not installed yet, install it first with install.packages(\"kernlab\").<\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 4px;\">\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span style=\"color: #0000ff;\">for<\/span> (i in 1:nrow(main_data)) {\r\n    <span style=\"color: #008000;\">#leave-one-out: row i is the test set, the rest is the training set<\/span>\r\n    main_data.test = main_data[i,]\r\n    main_data.train = main_data[ - i,]\r\n\r\n    <span style=\"color: #008000;\">#V9 is the class label column<\/span>\r\n    model &lt;- ksvm(V9 ~ ., main_data.train)\r\n    pred &lt;- predict(model, main_data.test)\r\n\r\n    <span style=\"color: #0000ff;\">if<\/span> (!exists(<span style=\"color: #006080;\">\"result\"<\/span>)) {\r\n        result = c(as.character(pred), as.character(main_data.test[, 8]))\r\n    } <span style=\"color: #0000ff;\">else<\/span> {\r\n        result = rbind(result, c(as.character(pred), as.character(main_data.test[, 8])))\r\n    }\r\n}<\/pre>\n<p>&nbsp;<\/p>\n<\/div>\n<p>The code contains a loop that builds a model with ksvm() and makes a prediction with predict(). The loop implements leave-one-out cross-validation: in each iteration one instance serves as the test data and the rest as training data, as illustrated in the figure below. Each pair of 
predicted and actual values is then stored in the object result.<\/p>\n<p><a href=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/01.jpg\"><img loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-width: 0px;\" title=\"01\" src=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/01_thumb.jpg\" alt=\"01\" width=\"550\" height=\"330\" border=\"0\" \/><\/a><\/p>\n<p>To measure the performance of the resulting classifier, we can use confusionMatrix() from the \u201ccaret\u201d package. If the package is not available yet, install it with install.packages(\"caret\"). A confusion matrix can be pictured as follows.<\/p>\n<p><a href=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/02.jpg\"><img loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-width: 0px;\" title=\"02\" src=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/02_thumb.jpg\" alt=\"02\" width=\"550\" height=\"344\" border=\"0\" \/><\/a><\/p>\n<p>Notes:<\/p>\n<ol>\n<li>Row 1, column 1 gives the number of instances correctly predicted as the positive class P (true positives). 
For example, if this cell holds 5, then 5 instances were predicted as class P and those predictions were correct.<\/li>\n<li>Row 2, column 1 gives the number of instances predicted as the positive class P that actually belong to the negative class N (false positives).<\/li>\n<li>Row 1, column 2 gives the number of instances predicted as the negative class N that actually belong to the positive class P (false negatives).<\/li>\n<li>Row 2, column 2 gives the number of instances correctly predicted as the negative class N (true negatives).<\/li>\n<\/ol>\n<p>From these values we obtain the following important measures:<\/p>\n<ol>\n<li><strong><u>Accuracy<\/u><\/strong>, computed as (true positives + true negatives) \/ (number of positive instances + number of negative instances).<\/li>\n<li><strong><u>Sensitivity<\/u><\/strong>, or true positive rate, measures how well the positive class P is predicted. It is computed as true positives \/ number of positive instances.<\/li>\n<li><strong><u>Specificity<\/u><\/strong>, or true negative rate, measures how well the negative class N is predicted. It is computed as true negatives \/ number of negative instances.<\/li>\n<\/ol>\n<p>Given the predicted values and the actual values, confusionMatrix() computes all of this automatically. 
Because the object result already holds both, the code is as follows (recent versions of caret expect factors with matching levels, hence the factor() calls).<\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 4px;\">\n<pre id=\"codeSnippet\" style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\">lv = c(<span style=\"color: #006080;\">\"majority\"<\/span>, <span style=\"color: #006080;\">\"minority\"<\/span>)\r\nconfusionMatrix(factor(result[, 1], levels = lv), factor(result[, 2], levels = lv))<\/pre>\n<p>&nbsp;<\/p>\n<\/div>\n<p>The SVM results for each of the data combinations above follow.<\/p>\n<p><strong><em>Class 12 vs Class 6<\/em><\/strong><\/p>\n<p><a href=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/03.jpg\"><img loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-width: 0px;\" title=\"03\" src=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/03_thumb.jpg\" alt=\"03\" width=\"550\" height=\"245\" border=\"0\" \/><\/a><\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 4px;\">\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\">Confusion Matrix and Statistics<\/pre>\n<p>Reference<br 
\/>\nPrediction majority minority<br \/>\nmajority 249 17<br \/>\nminority 18 242<\/p>\n<p>Accuracy : 0.9335<br \/>\n95% CI : (0.9087, 0.9532)<br \/>\nNo Information Rate : 0.5076<br \/>\nP-Value [Acc &gt; NIR] : &lt;2e-16<\/p>\n<p>Kappa : 0.8669<br \/>\nMcnemar's Test P-Value : 1<\/p>\n<p>Sensitivity : 0.9326<br \/>\nSpecificity : 0.9344<br \/>\nPos Pred Value : 0.9361<br \/>\nNeg Pred Value : 0.9308<br \/>\nPrevalence : 0.5076<br \/>\nDetection Rate : 0.4734<br \/>\nDetection Prevalence : 0.5057<br \/>\nBalanced Accuracy : 0.9335<\/p>\n<p>'Positive' Class : majority<\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<p>For this combination, sensitivity and specificity are balanced, which means the learned model predicts both the majority and the minority class well. The confusion matrix confirms this: only 18 majority and 17 minority instances are misclassified.<\/p>\n<p><strong><em>Class 12 vs Class 14<\/em><\/strong><\/p>\n<p><strong><em><a href=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/04.jpg\"><img loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-width: 0px;\" title=\"04\" src=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/04_thumb.jpg\" alt=\"04\" width=\"550\" height=\"248\" border=\"0\" \/><\/a><\/em><\/strong><\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 4px;\">\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: 
none; padding: 0px;\">Confusion Matrix and Statistics<\/pre>\n<p>Reference<br \/>\nPrediction majority minority<br \/>\nmajority 264 123<br \/>\nminority 3 3<\/p>\n<p>Accuracy : 0.6794<br \/>\n95% CI : (0.6308, 0.7253)<br \/>\nNo Information Rate : 0.6794<br \/>\nP-Value [Acc &gt; NIR] : 0.5241<\/p>\n<p>Kappa : 0.0168<br \/>\nMcnemar's Test P-Value : &lt;2e-16<\/p>\n<p>Sensitivity : 0.98876<br \/>\nSpecificity : 0.02381<br \/>\nPos Pred Value : 0.68217<br \/>\nNeg Pred Value : 0.50000<br \/>\nPrevalence : 0.67939<br \/>\nDetection Rate : 0.67176<br \/>\nDetection Prevalence : 0.98473<br \/>\nBalanced Accuracy : 0.50629<\/p>\n<p>'Positive' Class : majority<\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<p>In this case the majority and minority classes start to become imbalanced: the majority class has about twice as many instances as the minority class. The effect of class imbalance begins to show, with many wrong predictions for the minority class. 
Most minority instances (123 of 126) are predicted as the majority class, so specificity is very low (0.024).<\/p>\n<p><strong><em>Class 12 vs Class 17<\/em><\/strong><\/p>\n<p><a href=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/05.jpg\"><img loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-width: 0px;\" title=\"05\" src=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/05_thumb.jpg\" alt=\"05\" width=\"550\" height=\"252\" border=\"0\" \/><\/a><\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 4px;\">\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\">Confusion Matrix and Statistics<\/pre>\n<p>Reference<br \/>\nPrediction majority minority<br \/>\nmajority 267 56<br \/>\nminority 0 2<\/p>\n<p>Accuracy : 0.8277<br \/>\n95% CI : (0.7822, 0.8671)<br \/>\nNo Information Rate : 0.8215<br \/>\nP-Value [Acc &gt; NIR] : 0.4198<\/p>\n<p>Kappa : 0.0554<br \/>\nMcnemar's Test P-Value : 1.987e-13<\/p>\n<p>Sensitivity : 1.00000<br \/>\nSpecificity : 0.03448<br \/>\nPos Pred Value : 0.82663<br \/>\nNeg Pred Value : 1.00000<br \/>\nPrevalence : 0.82154<br \/>\nDetection Rate : 0.82154<br \/>\nDetection Prevalence : 0.99385<br \/>\nBalanced Accuracy : 0.51724<\/p>\n<p>'Positive' Class : majority<\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<p>In this case the gap between the number of majority and 
minority instances is even wider, so predictions for the minority class drift further toward the majority class (a bias toward the majority class).<\/p>\n<p><strong><em>Class 12 vs Class 23<\/em><\/strong><\/p>\n<p><a href=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/06.jpg\"><img loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-width: 0px;\" title=\"06\" src=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/06_thumb.jpg\" alt=\"06\" width=\"550\" height=\"254\" border=\"0\" \/><\/a><\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 4px;\">\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\">Confusion Matrix and Statistics<\/pre>\n<p>Reference<br \/>\nPrediction majority minority<br \/>\nmajority 267 9<br \/>\nminority 0 0<\/p>\n<p>Accuracy : 0.9674<br \/>\n95% CI : (0.939, 0.985)<br \/>\nNo Information Rate : 0.9674<br \/>\nP-Value [Acc &gt; NIR] : 0.587420<\/p>\n<p>Kappa : 0<br \/>\nMcnemar's Test P-Value : 0.007661<\/p>\n<p>Sensitivity : 1.0000<br \/>\nSpecificity : 0.0000<br \/>\nPos Pred Value : 0.9674<br \/>\nNeg Pred Value : NaN<br \/>\nPrevalence : 0.9674<br \/>\nDetection Rate : 0.9674<br \/>\nDetection Prevalence : 1.0000<br \/>\nBalanced Accuracy : 0.5000<\/p>\n<p>'Positive' Class : majority<\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<p>The last case has an IR greater than 9, 
which puts it in the highly imbalanced category: the ratio between the majority and minority classes is roughly 30 (267 to 9 instances). Predictions for the majority class now dominate, and the classifier can no longer predict even a single minority instance correctly; the confusion matrix shows every minority instance predicted as the majority class.<\/p>\n<p>&nbsp;<\/p>\n<p>{<strong><em>Class Imbalance &amp; KNN<\/em><\/strong>}<\/p>\n<p>This section shows the effect of class imbalance on the KNN (k-nearest neighbors) classifier. The test follows the same flow as the SVM test above. The following code runs the class imbalance test with the KNN classifier.<\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 4px;\">\n<div id=\"codeSnippet\" style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\">\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum1\" style=\"color: #606060;\">   1:<\/span> <span style=\"color: #008000;\">#library<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; 
border-style: none; padding: 0px;\"><span id=\"lnum2\" style=\"color: #606060;\">   2:<\/span> library(caret)<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum3\" style=\"color: #606060;\">   3:<\/span> library(<span style=\"color: #0000ff;\">class<\/span>)<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum4\" style=\"color: #606060;\">   4:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum5\" style=\"color: #606060;\">   5:<\/span> <span style=\"color: #008000;\">#init<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum6\" style=\"color: #606060;\">   6:<\/span> rm(<span style=\"color: #0000ff;\">list<\/span> = ls())<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum7\" style=\"color: #606060;\">   
7:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum8\" style=\"color: #606060;\">   8:<\/span> <span style=\"color: #008000;\">#Take 2 classes from the abalone dataset<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum9\" style=\"color: #606060;\">   9:<\/span> class_majority = <span style=\"color: #006080;\">\"12\"<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum10\" style=\"color: #606060;\">  10:<\/span> class_minority = <span style=\"color: #006080;\">\"23\"<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum11\" style=\"color: #606060;\">  11:<\/span> abalone = read.csv(<span style=\"color: #006080;\">\"abalone.data\"<\/span>, header = FALSE)[, 2:9]<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum12\" style=\"color: 
#606060;\">  12:<\/span> main_data = rbind(abalone[which(abalone$V9 == class_majority),], abalone[which(abalone$V9 == class_minority),])<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum13\" style=\"color: #606060;\">  13:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum14\" style=\"color: #606060;\">  14:<\/span> <span style=\"color: #0000ff;\">for<\/span> (i in 1:nrow(main_data)) {<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum15\" style=\"color: #606060;\">  15:<\/span>     <span style=\"color: #0000ff;\">if<\/span> (<span style=\"color: #0000ff;\">as<\/span>.numeric(main_data[i, 8]) == <span style=\"color: #0000ff;\">as<\/span>.character(class_majority)) {<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum16\" style=\"color: #606060;\">  16:<\/span>         main_data[i, 8] = <span style=\"color: #006080;\">\"majority\"<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; 
direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum17\" style=\"color: #606060;\">  17:<\/span>     } <span style=\"color: #0000ff;\">else<\/span> {<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum18\" style=\"color: #606060;\">  18:<\/span>         main_data[i, 8] = <span style=\"color: #006080;\">\"minority\"<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum19\" style=\"color: #606060;\">  19:<\/span>     }<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum20\" style=\"color: #606060;\">  20:<\/span> }<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum21\" style=\"color: #606060;\">  21:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum22\" 
style=\"color: #606060;\">  22:<\/span> <span style=\"color: #008000;\">#build the model and test classification with leave-one-out cross validation<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum23\" style=\"color: #606060;\">  23:<\/span> <span style=\"color: #0000ff;\">for<\/span> (i in 1:nrow(main_data)) {<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum24\" style=\"color: #606060;\">  24:<\/span>     main_data.test = main_data[i,]<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum25\" style=\"color: #606060;\">  25:<\/span>     main_data.train = main_data[ - i,]<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum26\" style=\"color: #606060;\">  26:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum27\" style=\"color: #606060;\">  
27:<\/span>     pred &lt;- knn(train = main_data.train[, 1:7], test = main_data.test[, 1:7], cl = main_data.train[,8], k = 5)<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum28\" style=\"color: #606060;\">  28:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum29\" style=\"color: #606060;\">  29:<\/span>     <span style=\"color: #0000ff;\">if<\/span> (!exists(<span style=\"color: #006080;\">\"result\"<\/span>)) {<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum30\" style=\"color: #606060;\">  30:<\/span>         result = c(<span style=\"color: #0000ff;\">as<\/span>.character(pred), <span style=\"color: #0000ff;\">as<\/span>.character(main_data.test[, 8]))<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum31\" style=\"color: #606060;\">  31:<\/span>     } <span style=\"color: #0000ff;\">else<\/span> {<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; 
margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum32\" style=\"color: #606060;\">  32:<\/span>         result = rbind(result, c(<span style=\"color: #0000ff;\">as<\/span>.character(pred), <span style=\"color: #0000ff;\">as<\/span>.character(main_data.test[, 8])))<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum33\" style=\"color: #606060;\">  33:<\/span>     }<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum34\" style=\"color: #606060;\">  34:<\/span> }<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum35\" style=\"color: #606060;\">  35:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum36\" style=\"color: #606060;\">  36:<\/span> <span style=\"color: #008000;\">#compute classifier performance<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; 
background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum37\" style=\"color: #606060;\">  37:<\/span> confusionMatrix(result[, 1], result[, 2])<\/pre>\n<p><!--CRLF--><\/p>\n<\/div>\n<\/div>\n<p>What distinguishes the code above from the code in the SVM section is line 27. Line 27 shows how model training and prediction are performed together in a single call to the knn() function (from the class package). The results for each case are shown below as confusion matrices.<\/p>\n<p><strong><em>Class 12 vs Class 6<\/em><\/strong><\/p>\n<p>&nbsp;<\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 4px;\">\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\">Confusion Matrix and Statistics<\/pre>\n<p>Reference<br \/>\nPrediction majority minority<br \/>\nmajority 249 14<br \/>\nminority 18 245<\/p>\n<p>Accuracy : 0.9392<br \/>\n95% CI : (0.9152, 0.958)<br \/>\nNo Information Rate : 0.5076<br \/>\nP-Value [Acc &gt; NIR] : &lt;2e-16<\/p>\n<p>Kappa : 0.8783<br \/>\nMcnemar's Test P-Value : 0.5959<\/p>\n<p>Sensitivity : 0.9326<br \/>\nSpecificity : 0.9459<br \/>\nPos Pred Value : 0.9468<br \/>\nNeg Pred Value : 0.9316<br \/>\nPrevalence : 0.5076<br \/>\nDetection Rate : 0.4734<br \/>\nDetection Prevalence : 0.5000<br \/>\nBalanced Accuracy : 0.9393<\/p>\n<p>'Positive' Class : majority<\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<p>As with the SVM classifier, the predictions obtained with 
KNN are not much different. Predictions for the majority and minority classes look balanced, and the classifier performs well on both (sensitivity 0.9326, specificity 0.9459).<\/p>\n<p>&nbsp;<\/p>\n<p><strong><em>Class 12 vs Class 14<\/em><\/strong><\/p>\n<p>&nbsp;<\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 4px;\">\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\">Confusion Matrix and Statistics<\/pre>\n<p>Reference<br \/>\nPrediction majority minority<br \/>\nmajority 230 99<br \/>\nminority 37 27<\/p>\n<p>Accuracy : 0.6539<br \/>\n95% CI : (0.6046, 0.7009)<br \/>\nNo Information Rate : 0.6794<br \/>\nP-Value [Acc &gt; NIR] : 0.8714<\/p>\n<p>Kappa : 0.087<br \/>\nMcnemar's Test P-Value : 1.689e-07<\/p>\n<p>Sensitivity : 0.8614<br \/>\nSpecificity : 0.2143<br \/>\nPos Pred Value : 0.6991<br \/>\nNeg Pred Value : 0.4219<br \/>\nPrevalence : 0.6794<br \/>\nDetection Rate : 0.5852<br \/>\nDetection Prevalence : 0.8372<br \/>\nBalanced Accuracy : 0.5379<\/p>\n<p>'Positive' Class : majority<\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<p>Once the classes become imbalanced, however, prediction of the minority class clearly degrades. 
Only 27 instances are correctly predicted as the minority class, while the remaining 99 minority instances are misclassified.<\/p>\n<p>&nbsp;<\/p>\n<p><strong><em>Class 12 vs Class 17<\/em><\/strong><\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 4px;\">\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\">Confusion Matrix and Statistics<\/pre>\n<p>Reference<br \/>\nPrediction majority minority<br \/>\nmajority 254 46<br \/>\nminority 13 12<\/p>\n<p>Accuracy : 0.8185<br \/>\n95% CI : (0.7722, 0.8588)<br \/>\nNo Information Rate : 0.8215<br \/>\nP-Value [Acc &gt; NIR] : 0.5917<\/p>\n<p>Kappa : 0.2035<br \/>\nMcnemar's Test P-Value : 3.099e-05<\/p>\n<p>Sensitivity : 0.9513<br \/>\nSpecificity : 0.2069<br \/>\nPos Pred Value : 0.8467<br \/>\nNeg Pred Value : 0.4800<br \/>\nPrevalence : 0.8215<br \/>\nDetection Rate : 0.7815<br \/>\nDetection Prevalence : 0.9231<br \/>\nBalanced Accuracy : 0.5791<\/p>\n<p>'Positive' Class : majority<\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<p>The same happens in this case: only 12 of the 58 minority instances are predicted correctly.<\/p>\n<p><strong><em>Class 12 vs Class 23<\/em><\/strong><\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px 
solid; padding: 4px;\">\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\">Confusion Matrix and Statistics<\/pre>\n<p>Reference<br \/>\nPrediction majority minority<br \/>\nmajority 266 9<br \/>\nminority 1 0<\/p>\n<p>Accuracy : 0.9638<br \/>\n95% CI : (0.9344, 0.9825)<br \/>\nNo Information Rate : 0.9674<br \/>\nP-Value [Acc &gt; NIR] : 0.70798<\/p>\n<p>Kappa : -0.0066<br \/>\nMcnemar's Test P-Value : 0.02686<\/p>\n<p>Sensitivity : 0.9963<br \/>\nSpecificity : 0.0000<br \/>\nPos Pred Value : 0.9673<br \/>\nNeg Pred Value : 0.0000<br \/>\nPrevalence : 0.9674<br \/>\nDetection Rate : 0.9638<br \/>\nDetection Prevalence : 0.9964<br \/>\nBalanced Accuracy : 0.4981<\/p>\n<p>'Positive' Class : majority<\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<p>When the minority class has even fewer instances, as in this case, none of the minority instances is predicted correctly: all 9 are classified as majority, even though accuracy still reaches 0.9638.<\/p>\n<p>&nbsp;<\/p>\n<p>{<strong><em>Class Imbalance &amp; Na\u00efve Bayes<\/em><\/strong>}<\/p>\n<p>To test the Na\u00efve Bayes classifier in R, the following code is used.<\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 4px;\">\n<div id=\"codeSnippet\" style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\">\n<pre style=\"font-size: 8pt; overflow: 
visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum1\" style=\"color: #606060;\">   1:<\/span> <span style=\"color: #008000;\">#library<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum2\" style=\"color: #606060;\">   2:<\/span> library(caret)<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum3\" style=\"color: #606060;\">   3:<\/span> library(e1071)<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum4\" style=\"color: #606060;\">   4:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum5\" style=\"color: #606060;\">   5:<\/span> <span style=\"color: #008000;\">#balanced-class classification - start<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 
12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum6\" style=\"color: #606060;\">   6:<\/span> <span style=\"color: #008000;\">#---------------------------------------------<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum7\" style=\"color: #606060;\">   7:<\/span> <span style=\"color: #008000;\">#init<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum8\" style=\"color: #606060;\">   8:<\/span> rm(<span style=\"color: #0000ff;\">list<\/span> = ls())<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum9\" style=\"color: #606060;\">   9:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum10\" style=\"color: #606060;\">  10:<\/span> <span style=\"color: #008000;\">#Take 2 classes from the abalone dataset<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; 
border-style: none; padding: 0px;\"><span id=\"lnum11\" style=\"color: #606060;\">  11:<\/span> class_majority = <span style=\"color: #006080;\">\"12\"<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum12\" style=\"color: #606060;\">  12:<\/span> class_minority = <span style=\"color: #006080;\">\"14\"<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum13\" style=\"color: #606060;\">  13:<\/span> abalone = read.csv(<span style=\"color: #006080;\">\"abalone.data\"<\/span>, header = FALSE)[, 2:9]<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum14\" style=\"color: #606060;\">  14:<\/span> main_data = rbind(abalone[which(abalone$V9 == class_majority),], abalone[which(abalone$V9 == class_minority),])<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum15\" style=\"color: #606060;\">  15:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; 
line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum16\" style=\"color: #606060;\">  16:<\/span> <span style=\"color: #0000ff;\">for<\/span> (i in 1:nrow(main_data)) {<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum17\" style=\"color: #606060;\">  17:<\/span>     <span style=\"color: #0000ff;\">if<\/span> (<span style=\"color: #0000ff;\">as<\/span>.character(main_data[i, 8]) == class_majority) {<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum18\" style=\"color: #606060;\">  18:<\/span>         main_data[i, 8] = <span style=\"color: #006080;\">\"majority\"<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum19\" style=\"color: #606060;\">  19:<\/span>     } <span style=\"color: #0000ff;\">else<\/span> {<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum20\" style=\"color: #606060;\">  20:<\/span>         main_data[i, 8] = <span style=\"color: 
#006080;\">\"minority\"<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum21\" style=\"color: #606060;\">  21:<\/span>     }<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum22\" style=\"color: #606060;\">  22:<\/span> }<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum23\" style=\"color: #606060;\">  23:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum24\" style=\"color: #606060;\">  24:<\/span> <span style=\"color: #008000;\">#build the model and test classification with leave-one-out cross validation<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum25\" style=\"color: #606060;\">  25:<\/span> <span style=\"color: #0000ff;\">for<\/span> (i in 1:nrow(main_data)) {<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: 
visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum26\" style=\"color: #606060;\">  26:<\/span>     main_data.test = main_data[i,]<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum27\" style=\"color: #606060;\">  27:<\/span>     main_data.train = main_data[ - i,]<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum28\" style=\"color: #606060;\">  28:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum29\" style=\"color: #606060;\">  29:<\/span>     main_data.test[, 8] = <span style=\"color: #0000ff;\">as<\/span>.factor(main_data.test[, 8])<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum30\" style=\"color: #606060;\">  30:<\/span>     main_data.train[, 8] = <span style=\"color: #0000ff;\">as<\/span>.factor(main_data.train[, 8])<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 
'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum31\" style=\"color: #606060;\">  31:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum32\" style=\"color: #606060;\">  32:<\/span>     model = naiveBayes(main_data.train[, -8], main_data.train[, 8])<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum33\" style=\"color: #606060;\">  33:<\/span>     pred &lt;- predict(model, main_data.test[,-8])<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum34\" style=\"color: #606060;\">  34:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum35\" style=\"color: #606060;\">  35:<\/span>     <span style=\"color: #0000ff;\">if<\/span> (!exists(<span style=\"color: #006080;\">\"result\"<\/span>)) {<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; 
direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum36\" style=\"color: #606060;\">  36:<\/span>         result = c(<span style=\"color: #0000ff;\">as<\/span>.character(pred), <span style=\"color: #0000ff;\">as<\/span>.character(main_data.test[, 8]))<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum37\" style=\"color: #606060;\">  37:<\/span>     } <span style=\"color: #0000ff;\">else<\/span> {<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum38\" style=\"color: #606060;\">  38:<\/span>         result = rbind(result, c(<span style=\"color: #0000ff;\">as<\/span>.character(pred), <span style=\"color: #0000ff;\">as<\/span>.character(main_data.test[, 8])))<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum39\" style=\"color: #606060;\">  39:<\/span>     }<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum40\" style=\"color: #606060;\">  40:<\/span> }<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; 
overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum41\" style=\"color: #606060;\">  41:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum42\" style=\"color: #606060;\">  42:<\/span> <span style=\"color: #008000;\">#compute classifier performance<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum43\" style=\"color: #606060;\">  43:<\/span> confusionMatrix(result[, 1], result[, 2])<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum44\" style=\"color: #606060;\">  44:<\/span> <span style=\"color: #008000;\">#---------------------------------------------<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum45\" style=\"color: #606060;\">  45:<\/span> <span style=\"color: #008000;\">#balanced-class classification - end<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<\/div>\n<\/div>\n<p>What distinguishes the code 
above from the previous code is lines 29 and 30, which convert the class label column to a factor. Line 32 then uses the naiveBayes() function to build the model. naiveBayes() comes from the \u201ce1071\u201d package, which is why that library is loaded on line 3.<\/p>\n<p>The prediction itself is made on line 33.<\/p>\n<p>&nbsp;<\/p>\n<p><strong><em>Class 12 vs Class 6<\/em><\/strong><\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 4px;\">\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\">Confusion Matrix and Statistics<\/pre>\n<p>Reference<br \/>\nPrediction majority minority<br \/>\nmajority 237 14<br \/>\nminority 30 245<\/p>\n<p>Accuracy : 0.9163<br \/>\n95% CI : (0.8893, 0.9386)<br \/>\nNo Information Rate : 0.5076<br \/>\nP-Value [Acc &gt; NIR] : &lt; 2e-16<\/p>\n<p>Kappa : 0.8328<br \/>\nMcnemar's Test P-Value : 0.02374<\/p>\n<p>Sensitivity : 0.8876<br \/>\nSpecificity : 0.9459<br \/>\nPos Pred Value : 0.9442<br \/>\nNeg Pred Value : 0.8909<br \/>\nPrevalence : 0.5076<br \/>\nDetection Rate : 0.4506<br \/>\nDetection Prevalence : 0.4772<br \/>\nBalanced Accuracy : 0.9168<\/p>\n<p>'Positive' Class : majority<\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<p><strong><em>Class 12 vs Class 14<\/em><\/strong><\/p>\n<p>&nbsp;<\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 
8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 4px;\">\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\">Confusion Matrix <span style=\"color: #0000ff;\">and<\/span> Statistics<\/pre>\n<p>Reference<br \/>\nPrediction majority minority<br \/>\nmajority 265 122<br \/>\nminority 2 4<\/p>\n<p>Accuracy : 0.6845<br \/>\n95% CI : (0.636, 0.7302)<br \/>\nNo Information Rate : 0.6794<br \/>\nP-Value [Acc &gt; NIR] : 0.4381<\/p>\n<p>Kappa : 0.0324<br \/>\nMcnemar<span style=\"color: #006080;\">&#8216;s Test P-Value : &lt;2e-16 <\/span><\/p>\n<p>Sensitivity : 0.99251<br \/>\nSpecificity : 0.03175<br \/>\nPos Pred Value : 0.68475<br \/>\nNeg Pred Value : 0.66667<br \/>\nPrevalence : 0.67939<br \/>\nDetection Rate : 0.67430<br \/>\nDetection Prevalence : 0.98473<br \/>\nBalanced Accuracy : 0.51213<\/p>\n<p>&#8216;Positive&#8217; Class : majority<\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<p><strong><em>Class 12 vs Class 17<\/em><\/strong><\/p>\n<p>&nbsp;<\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 4px;\">\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\">Confusion Matrix <span style=\"color: 
#0000ff;\">and<\/span> Statistics<\/pre>\n<p>Reference<br \/>\nPrediction majority minority<br \/>\nmajority 258 48<br \/>\nminority 9 10<\/p>\n<p>Accuracy : 0.8246<br \/>\n95% CI : (0.7788, 0.8644)<br \/>\nNo Information Rate : 0.8215<br \/>\nP-Value [Acc &gt; NIR] : 0.4773<\/p>\n<p>Kappa : 0.1882<br \/>\nMcnemar<span style=\"color: #006080;\">&#8216;s Test P-Value : 4.823e-07 <\/span><\/p>\n<p>Sensitivity : 0.9663<br \/>\nSpecificity : 0.1724<br \/>\nPos Pred Value : 0.8431<br \/>\nNeg Pred Value : 0.5263<br \/>\nPrevalence : 0.8215<br \/>\nDetection Rate : 0.7938<br \/>\nDetection Prevalence : 0.9415<br \/>\nBalanced Accuracy : 0.5694<\/p>\n<p>&#8216;Positive&#8217; Class : majority<\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<p><strong><em>Class 12 vs Class 23<\/em><\/strong><\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 4px;\">\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\">Confusion Matrix <span style=\"color: #0000ff;\">and<\/span> Statistics<\/pre>\n<p>Reference<br \/>\nPrediction majority minority<br \/>\nmajority 266 9<br \/>\nminority 1 0<\/p>\n<p>Accuracy : 0.9638<br \/>\n95% CI : (0.9344, 0.9825)<br \/>\nNo Information Rate : 0.9674<br \/>\nP-Value [Acc &gt; NIR] : 0.70798<\/p>\n<p>Kappa : -0.0066<br \/>\nMcnemar<span style=\"color: #006080;\">&#8216;s Test P-Value : 0.02686 <\/span><\/p>\n<p>Sensitivity : 0.9963<br \/>\nSpecificity : 0.0000<br \/>\nPos Pred Value : 0.9673<br \/>\nNeg Pred Value : 0.0000<br \/>\nPrevalence : 0.9674<br \/>\nDetection Rate : 0.9638<br \/>\nDetection 
Prevalence : 0.9964<br \/>\nBalanced Accuracy : 0.4981<\/p>\n<p>&#8216;Positive&#8217; Class : majority<\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<p>The results above show that the ability to predict the minority class weakens as the number of minority instances shrinks relative to the majority class: Specificity, which here measures how often minority instances are correctly identified because majority is the positive class, drops from 0.9459 in the nearly balanced class 6 case to 0.0000 for class 23. This shows that the class imbalance effect also affects the Na\u00efve Bayes classifier.<\/p>\n<p>&nbsp;<\/p>\n<p>{<strong><em>Class Imbalance &amp; Decision Tree<\/em><\/strong>}<\/p>\n<p>Finally, we test the effect of class imbalance on a decision tree classifier.\u00a0 In R, decision tree classification can be done with the J48 function from the \u201cRWeka\u201d package.\u00a0 J48 is Weka's implementation of the C4.5 algorithm.\u00a0 The code used is shown below.<\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 4px;\">\n<div id=\"codeSnippet\" style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\">\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum1\" style=\"color: #606060;\">   1:<\/span> <span style=\"color: #008000;\">#library<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 
12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum2\" style=\"color: #606060;\">   2:<\/span> library(caret)<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum3\" style=\"color: #606060;\">   3:<\/span> library(RWeka)<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum4\" style=\"color: #606060;\">   4:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum5\" style=\"color: #606060;\">   5:<\/span> <span style=\"color: #008000;\">#balanced class classification - start<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum6\" style=\"color: #606060;\">   6:<\/span> <span style=\"color: #008000;\">#---------------------------------------------<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum7\" style=\"color: 
#606060;\">   7:<\/span> <span style=\"color: #008000;\">#init<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum8\" style=\"color: #606060;\">   8:<\/span> rm(<span style=\"color: #0000ff;\">list<\/span> = ls())<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum9\" style=\"color: #606060;\">   9:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum10\" style=\"color: #606060;\">  10:<\/span> <span style=\"color: #008000;\">#take 2 classes from the abalone dataset<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum11\" style=\"color: #606060;\">  11:<\/span> class_majority = <span style=\"color: #006080;\">\"12\"<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum12\" style=\"color: #606060;\">  12:<\/span> class_minority = <span 
style=\"color: #006080;\">\"23\"<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum13\" style=\"color: #606060;\">  13:<\/span> abalone = read.csv(<span style=\"color: #006080;\">\"abalone.data\"<\/span>, header = FALSE)[, 2:9]<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum14\" style=\"color: #606060;\">  14:<\/span> main_data = rbind(abalone[which(abalone$V9 == class_majority),], abalone[which(abalone$V9 == class_minority),])<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum15\" style=\"color: #606060;\">  15:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum16\" style=\"color: #606060;\">  16:<\/span> <span style=\"color: #0000ff;\">for<\/span> (i in 1:nrow(main_data)) {<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum17\" 
style=\"color: #606060;\">  17:<\/span>     <span style=\"color: #0000ff;\">if<\/span> (<span style=\"color: #0000ff;\">as<\/span>.character(main_data[i, 8]) == class_majority) {<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum18\" style=\"color: #606060;\">  18:<\/span>         main_data[i, 8] = <span style=\"color: #006080;\">\"majority\"<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum19\" style=\"color: #606060;\">  19:<\/span>     } <span style=\"color: #0000ff;\">else<\/span> {<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum20\" style=\"color: #606060;\">  20:<\/span>         main_data[i, 8] = <span style=\"color: #006080;\">\"minority\"<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum21\" style=\"color: #606060;\">  21:<\/span>     }<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; 
margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum22\" style=\"color: #606060;\">  22:<\/span> }<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum23\" style=\"color: #606060;\">  23:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum24\" style=\"color: #606060;\">  24:<\/span> <span style=\"color: #008000;\">#build the model and test the classification with leave-one-out cross validation<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum25\" style=\"color: #606060;\">  25:<\/span> <span style=\"color: #0000ff;\">for<\/span> (i in 1:nrow(main_data)) {<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum26\" style=\"color: #606060;\">  26:<\/span>     main_data.test = main_data[i,]<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: 
none; padding: 0px;\"><span id=\"lnum27\" style=\"color: #606060;\">  27:<\/span>     main_data.train = main_data[ - i,]<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum28\" style=\"color: #606060;\">  28:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum29\" style=\"color: #606060;\">  29:<\/span>     main_data.test[, 8] = <span style=\"color: #0000ff;\">as<\/span>.factor(main_data.test[, 8])<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum30\" style=\"color: #606060;\">  30:<\/span>     main_data.train[, 8] = <span style=\"color: #0000ff;\">as<\/span>.factor(main_data.train[, 8])<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum31\" style=\"color: #606060;\">  31:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum32\" style=\"color: 
#606060;\">  32:<\/span>     model = J48(V9 ~ ., data = main_data.train)<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum33\" style=\"color: #606060;\">  33:<\/span>     pred &lt;- predict(model, main_data.test)<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum34\" style=\"color: #606060;\">  34:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum35\" style=\"color: #606060;\">  35:<\/span>     <span style=\"color: #0000ff;\">if<\/span> (!exists(<span style=\"color: #006080;\">\"result\"<\/span>)) {<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum36\" style=\"color: #606060;\">  36:<\/span>         result = c(<span style=\"color: #0000ff;\">as<\/span>.character(pred), <span style=\"color: #0000ff;\">as<\/span>.character(main_data.test[, 8]))<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; 
border-style: none; padding: 0px;\"><span id=\"lnum37\" style=\"color: #606060;\">  37:<\/span>     } <span style=\"color: #0000ff;\">else<\/span> {<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum38\" style=\"color: #606060;\">  38:<\/span>         result = rbind(result, c(<span style=\"color: #0000ff;\">as<\/span>.character(pred), <span style=\"color: #0000ff;\">as<\/span>.character(main_data.test[, 8])))<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum39\" style=\"color: #606060;\">  39:<\/span>     }<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum40\" style=\"color: #606060;\">  40:<\/span> }<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum41\" style=\"color: #606060;\">  41:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum42\" 
style=\"color: #606060;\">  42:<\/span> plot(model)<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum43\" style=\"color: #606060;\">  43:<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum44\" style=\"color: #606060;\">  44:<\/span> <span style=\"color: #008000;\">#compute classifier performance<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum45\" style=\"color: #606060;\">  45:<\/span> confusionMatrix(factor(result[, 1], levels = c(\"majority\", \"minority\")), factor(result[, 2], levels = c(\"majority\", \"minority\")))<\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum46\" style=\"color: #606060;\">  46:<\/span> <span style=\"color: #008000;\">#---------------------------------------------<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\"><span id=\"lnum47\" style=\"color: #606060;\">  47:<\/span> <span style=\"color: #008000;\">#balanced class classification - end<\/span><\/pre>\n<p><!--CRLF--><\/p>\n<\/div>\n<\/div>\n<p>The code above 
loads the \u201cRWeka\u201d library on line 3 and calls the J48() function on line 32. The plot() function on line 42 draws the resulting tree; because it runs after the loop, it shows the model from the last leave-one-out iteration.<\/p>\n<p>The model plot and the classifier performance for each class imbalance case are shown below.<\/p>\n<p>&nbsp;<\/p>\n<p><strong><em>Class 12 vs Class 6<\/em><\/strong><\/p>\n<p><a href=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/c01.jpg\"><img loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border: 0px;\" title=\"c01\" src=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/c01_thumb.jpg\" alt=\"c01\" width=\"550\" height=\"417\" border=\"0\" \/><\/a><\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 4px;\">\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\">Confusion Matrix <span style=\"color: #0000ff;\">and<\/span> Statistics<\/pre>\n<p>Reference<br \/>\nPrediction majority minority<br \/>\nmajority 249 25<br \/>\nminority 18 234<\/p>\n<p>Accuracy : 0.9183<br \/>\n95% CI : (0.8915, 0.9402)<br \/>\nNo Information Rate : 0.5076<br \/>\nP-Value [Acc &gt; NIR] : &lt;2e-16<\/p>\n<p>Kappa : 0.8364<br \/>\nMcnemar<span style=\"color: #006080;\">&#8216;s Test P-Value : 0.3602 <\/span><\/p>\n<p>Sensitivity : 0.9326<br 
\/>\nSpecificity : 0.9035<br \/>\nPos Pred Value : 0.9088<br \/>\nNeg Pred Value : 0.9286<br \/>\nPrevalence : 0.5076<br \/>\nDetection Rate : 0.4734<br \/>\nDetection Prevalence : 0.5209<br \/>\nBalanced Accuracy : 0.9180<\/p>\n<p>&#8216;Positive&#8217; Class : majority<\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<p><strong><em>Class 12 vs Class 14<\/em><\/strong><\/p>\n<p><a href=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/c02.jpg\"><img loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border: 0px;\" title=\"c02\" src=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/c02_thumb.jpg\" alt=\"c02\" width=\"550\" height=\"422\" border=\"0\" \/><\/a><\/p>\n<p>In this case the tree \u201cfailed\u201d to be drawn.<\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 4px;\">\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\">Confusion Matrix <span style=\"color: #0000ff;\">and<\/span> Statistics<\/pre>\n<p>Reference<br \/>\nPrediction majority minority<br \/>\nmajority 267 126<br \/>\nminority 0 0<\/p>\n<p>Accuracy : 0.6794<br \/>\n95% CI : (0.6308, 0.7253)<br \/>\nNo Information Rate : 0.6794<br \/>\nP-Value [Acc &gt; NIR] : 0.5241<\/p>\n<p>Kappa : 0<br \/>\nMcnemar<span style=\"color: #006080;\">&#8216;s Test P-Value : &lt;2e-16 <\/span><\/p>\n<p>Sensitivity : 1.0000<br \/>\nSpecificity : 0.0000<br \/>\nPos Pred Value : 0.6794<br \/>\nNeg Pred Value : NaN<br \/>\nPrevalence : 0.6794<br \/>\nDetection Rate : 
0.6794<br \/>\nDetection Prevalence : 1.0000<br \/>\nBalanced Accuracy : 0.5000<\/p>\n<p>&#8216;Positive&#8217; Class : majority<\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<p><strong><em>Class 12 vs Class 17<\/em><\/strong><\/p>\n<p><a href=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/c03.jpg\"><img loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border: 0px;\" title=\"c03\" src=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/c03_thumb.jpg\" alt=\"c03\" width=\"550\" height=\"419\" border=\"0\" \/><\/a><\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 4px;\">\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\">Confusion Matrix <span style=\"color: #0000ff;\">and<\/span> Statistics<\/pre>\n<p>Reference<br \/>\nPrediction majority minority<br \/>\nmajority 254 55<br \/>\nminority 13 3<\/p>\n<p>Accuracy : 0.7908<br \/>\n95% CI : (0.7424, 0.8337)<br \/>\nNo Information Rate : 0.8215<br \/>\nP-Value [Acc &gt; NIR] : 0.9336<\/p>\n<p>Kappa : 0.0042<br \/>\nMcnemar<span style=\"color: #006080;\">&#8216;s Test P-Value : 6.627e-07 <\/span><\/p>\n<p>Sensitivity : 0.95131<br \/>\nSpecificity : 0.05172<br \/>\nPos Pred Value : 0.82201<br \/>\nNeg Pred Value : 0.18750<br \/>\nPrevalence : 0.82154<br \/>\nDetection Rate : 0.78154<br \/>\nDetection Prevalence : 0.95077<br \/>\nBalanced Accuracy : 0.50152<\/p>\n<p>&#8216;Positive&#8217; Class : 
majority<\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<p><strong><em>Class 12 vs Class 23<\/em><\/strong><\/p>\n<p><a href=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/c04.jpg\"><img loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border: 0px;\" title=\"c04\" src=\"http:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/c04_thumb.jpg\" alt=\"c04\" width=\"550\" height=\"422\" border=\"0\" \/><\/a><\/p>\n<div id=\"codeSnippetWrapper\" style=\"font-size: 8pt; overflow: auto; cursor: text; font-family: 'Courier New', courier, monospace; width: 97.5%; direction: ltr; text-align: left; margin: 20px 0px 10px; line-height: 12pt; max-height: 200px; background-color: #f4f4f4; border: silver 1px solid; padding: 4px;\">\n<pre style=\"font-size: 8pt; overflow: visible; font-family: 'Courier New', courier, monospace; width: 100%; color: black; direction: ltr; text-align: left; margin: 0em; line-height: 12pt; background-color: #f4f4f4; border-style: none; padding: 0px;\">Confusion Matrix <span style=\"color: #0000ff;\">and<\/span> Statistics<\/pre>\n<p>Reference<br \/>\nPrediction majority minority<br \/>\nmajority 267 9<br \/>\nminority 0 0<\/p>\n<p>Accuracy : 0.9674<br \/>\n95% CI : (0.939, 0.985)<br \/>\nNo Information Rate : 0.9674<br \/>\nP-Value [Acc &gt; NIR] : 0.587420<\/p>\n<p>Kappa : 0<br \/>\nMcnemar<span style=\"color: #006080;\">&#8216;s Test P-Value : 0.007661 <\/span><\/p>\n<p>Sensitivity : 1.0000<br \/>\nSpecificity : 0.0000<br \/>\nPos Pred Value : 0.9674<br \/>\nNeg Pred Value : NaN<br \/>\nPrevalence : 0.9674<br \/>\nDetection Rate : 0.9674<br \/>\nDetection Prevalence : 1.0000<br \/>\nBalanced Accuracy : 0.5000<\/p>\n<p>&#8216;Positive&#8217; Class : majority<\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<p>Dari keempat kasus di atas, dapat dilihat bahwa pengaruh jelak dari class imbalance juga berpengaruh terhadap classifier decision 
tree.<\/p>\n<p>&nbsp;<\/p>\n<p>{<strong><em>Class Imbalance &amp; Solutions<\/em><\/strong>}<\/p>\n<p>Because the conventional classifiers mentioned above perform poorly on the minority class, a different approach is needed to solve the class imbalance problem.\u00a0 The best way to avoid the problem is to increase the amount of minority data by collecting more of it, so that the data become balanced and conventional classifiers can work normally again.<\/p>\n<p>If that is not possible, the alternative is to make the classification process smarter than before.\u00a0 Several types of technique can be used:<\/p>\n<ul>\n<li>\u201cRepair the data\u201d with data sampling. There are three options. The first is oversampling, which increases the number of minority instances by adding artificial instances. Funny, isn't it \u2026 the data are scarce, so we pad them out to get more #eh. The second is undersampling, which reduces the number of majority instances by removing some of them.\u00a0 This can throw away information that was already available. The third is a combination of the two.<\/li>\n<li>Improve the existing algorithm so that it is smarter.\u00a0 The techniques available are cost-sensitive learning and ensemble methods, both of which aim to make the algorithm smarter, and of course being smarter means working harder and possibly longer.<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<p>{<strong><em>Conclusion<\/em><\/strong>}<\/p>\n<p>The experiments and results above show that imbalanced data can affect the performance of the classifier being used. 
Here are some things to keep in mind when working with classification and classifiers:<\/p>\n<ul>\n<li>Don't celebrate too early when accuracy is above 90%, because that figure may only reflect correct predictions for the majority class, with every error falling on the minority class, as seen in the &#8220;<strong>class 12 vs class 23<\/strong>&#8221; case, where accuracy is 0.9674 but specificity is 0.0000.<\/li>\n<li>Check Sensitivity and Specificity, not accuracy alone, to judge classifier performance.<\/li>\n<li>Get to know the data better by exploring it with common statistical techniques, and look at how the data are distributed in the data space so that you understand how complex the data are, as discussed in this post\u00a0<a href=\"http:\/\/www.rezafaisal.net\/?p=2951\">http:\/\/www.rezafaisal.net\/?p=2951<\/a>.<\/li>\n<li>Predicting the minority class matters! Why? Because in real-world cases the important thing is often the minority. For example, records of cancer patients are fewer than records of people without cancer, so classification that predicts who has cancer is the more important task. Likewise, in financial transactions, fraudulent transactions are fewer than normal ones, so predicting which transactions are fraudulent is what matters more.<\/li>\n<li>If you run into the class imbalance problem, don't run away! Handle it wisely, either by preprocessing the data with data sampling or by building an algorithm that works smarter on imbalanced classes. 
Both approaches will be covered in the next post.<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In data mining and some machine learning techniques, data are the source of knowledge used for learning, which later serves as the basis for recognizing new instances. This situation arises in supervised learning, particularly classification, where a classifier&hellip;<\/p>\n","protected":false},"author":1,"featured_media":2969,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[6],"tags":[150,156,157],"class_list":["post-2989","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-softwaredev","tag-r","tag-r-tools-for-visual-studio","tag-vs2015"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"https:\/\/www.rezafaisal.net\/wp-content\/uploads\/2016\/08\/06_thumb.jpg","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p1sNAL-Md","_links":{"self":[{"href":"https:\/\/www.rezafaisal.net\/index.php?rest_route=\/wp\/v2\/posts\/2989","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.rezafaisal.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.rezafaisal.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.rezafaisal.net\/index.php?rest_route=\/wp\/v2\/users\/1"}],"
replies":[{"embeddable":true,"href":"https:\/\/www.rezafaisal.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=2989"}],"version-history":[{"count":1,"href":"https:\/\/www.rezafaisal.net\/index.php?rest_route=\/wp\/v2\/posts\/2989\/revisions"}],"predecessor-version":[{"id":2990,"href":"https:\/\/www.rezafaisal.net\/index.php?rest_route=\/wp\/v2\/posts\/2989\/revisions\/2990"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.rezafaisal.net\/index.php?rest_route=\/wp\/v2\/media\/2969"}],"wp:attachment":[{"href":"https:\/\/www.rezafaisal.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=2989"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.rezafaisal.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=2989"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.rezafaisal.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=2989"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}