<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article"><front><journal-meta><journal-id journal-id-type="publisher-id">JCC</journal-id><journal-title-group><journal-title>Journal of Computer and Communications</journal-title></journal-title-group><issn pub-type="epub">2327-5219</issn><publisher><publisher-name>Scientific Research Publishing</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.4236/jcc.2019.711003</article-id><article-id pub-id-type="publisher-id">JCC-96177</article-id><article-categories><subj-group subj-group-type="heading"><subject>Articles</subject></subj-group><subj-group subj-group-type="Discipline-v2"><subject>Computer Science&amp;Communications</subject></subj-group></article-categories><title-group><article-title>
 
 
  Churn Prediction Using Machine Learning and Recommendations Plans for Telecoms
 
</article-title></title-group><contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Khulood</surname><given-names>Ebrah</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref><xref ref-type="corresp" rid="cor1"><sup>*</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Selma</surname><given-names>Elnasir</given-names></name><xref ref-type="aff" rid="aff2"><sup>2</sup></xref></contrib></contrib-group><aff id="aff2"><addr-line>Department of Computer Science, International University of Africa, Khartoum, Sudan</addr-line></aff><aff id="aff1"><addr-line>Department of Information Technology, International University of Africa, Khartoum, Sudan</addr-line></aff><pub-date pub-type="epub"><day>04</day><month>11</month><year>2019</year></pub-date><volume>07</volume><issue>11</issue><fpage>33</fpage><lpage>53</lpage><history><date date-type="received"><day>28,</day>	<month>September</month>	<year>2019</year></date><date date-type="rev-recd"><day>2,</day>	<month>November</month>	<year>2019</year>	</date><date date-type="accepted"><day>5,</day>	<month>November</month>	<year>2019</year></date></history><permissions><copyright-statement>&#169; Copyright  2014 by authors and Scientific Research Publishing Inc. </copyright-statement><copyright-year>2014</copyright-year><license><license-p>This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/</license-p></license></permissions><abstract><p>
 
 
  Keeping customers satisfied is essential to business success, especially in the telecom sector. Many companies experiment with techniques that predict churn and help design effective customer retention plans, since acquiring a new customer costs much more than retaining an existing one. In this paper, three machine learning algorithms, namely Na&#239;ve Bayes, SVM and decision trees, are used to predict churn on two benchmark datasets: the IBM Watson dataset, which consists of 7033 observations and 21 attributes, and the cell2cell dataset, which contains 71,047 observations and 57 attributes. Model performance is measured by the area under the curve (AUC); the models scored 0.82, 0.87 and 0.77, respectively, on the IBM dataset and 0.98, 0.99 and 0.98, respectively, on the cell2cell dataset. The proposed models also obtained better accuracy than previous studies using the same datasets.
 
</p></abstract><kwd-group><kwd>Churn Prediction</kwd><kwd> Telecommunication</kwd><kwd> Modeling</kwd><kwd> Analysis</kwd><kwd> SVM</kwd><kwd>  Na&#239;ve Bayes</kwd><kwd> Decision Trees</kwd><kwd> Cell2cell</kwd><kwd> IBM</kwd></kwd-group></article-meta></front><body><sec id="s1"><title>1. Introduction</title><p>Customer retention is among the most important assets of any business; it has been stated that “the cost of acquiring a new customer can be higher than that of retaining a customer by as much as 700%; increasing customer retention rates by a mere 5% could increase profits by 25% to 95%” [<xref ref-type="bibr" rid="scirp.96177-ref1">1</xref>]. One of the best ways to retain customers is therefore to reduce the churn rate, where “churn” means a customer moving from one service provider to another, or ceasing to use specific services over a specific period. The reasons behind churn can be detected in advance if the company analyzes its data records and uses machine learning, which enables it to predict the customers who are likely to churn. Many studies have confirmed the effectiveness of this approach [<xref ref-type="bibr" rid="scirp.96177-ref2">2</xref>] [<xref ref-type="bibr" rid="scirp.96177-ref3">3</xref>] [<xref ref-type="bibr" rid="scirp.96177-ref4">4</xref>], which lets the company respond quickly to changes in customer behavior. Telcos today are refining &amp; optimizing the customer experience, which is the key to sustaining market differentiation and reducing churn [<xref ref-type="bibr" rid="scirp.96177-ref5">5</xref>], since retaining an existing customer costs much less than acquiring a new one. This research studies machine learning algorithms and recommends the best solutions for telecoms. 
In the competitive telecom sector, customers can easily switch from one provider to another, which leaves providers worried about how to retain them. However, providers can predict in advance which customers will move to another provider by analyzing their behavior, and can retain them by offering deals and preferred services based on their historical records. The aim of this study is therefore to predict churn in advance and to detect the main factors that may lead a user to move to another telecom provider.</p></sec><sec id="s2"><title>2. Related Work</title><p>Many studies address the churn problem from different viewpoints, with different datasets and algorithms and for different industries; churn analysis is one of the most widely used methods worldwide for analyzing customer behavior and predicting which customers are about to leave a company’s service agreement. Studies have revealed that gaining new customers is 5 to 10 times costlier than keeping existing customers happy and loyal in today’s competitive conditions, and that an average company loses 10 to 30 percent of its customers annually [<xref ref-type="bibr" rid="scirp.96177-ref6">6</xref>] [<xref ref-type="bibr" rid="scirp.96177-ref7">7</xref>]. Most of the literature focuses on data mining algorithms; only a few studies focus on identifying the important input variables for churn prediction and on enhancing the data samples through efficient pre-processing before data mining algorithms are applied [<xref ref-type="bibr" rid="scirp.96177-ref8">8</xref>] [<xref ref-type="bibr" rid="scirp.96177-ref9">9</xref>]. Amin, A., et al. 
[<xref ref-type="bibr" rid="scirp.96177-ref10">10</xref>] presented a novel churn prediction approach based on estimating the classifier’s certainty using a distance factor. They grouped the dataset into zones based on this distance and divided the zones into two categories, with high and low certainty. They used 4 datasets with different samples; the datasets were discretized by size, that is, by the values that exist in each attribute, and these values were then assigned labels, producing for each attribute a specific list of values in a different number of groups. They used Na&#239;ve Bayes as the classifier, and it obtained higher accuracy in the zone with the greater distance factor value (i.e., customer churn and non-churn with high certainty) than in the zone with the smaller distance factor value (i.e., customer churn and non-churn with low certainty). Accuracy in the final (tenth) iteration was (82.91% &amp; 84.30%, 70.60% &amp; 74.80%, 70.00% &amp; 89.01%, 57.00% &amp; 56.00%) for (UDT &amp; LDT) on the 4 datasets used. Andrews, R., et al. [<xref ref-type="bibr" rid="scirp.96177-ref3">3</xref>] used a dataset of 10,000 client records, each with 21 attributes, from the customers of a telecom company in Belgium, of which 2900 are churners. They applied deep learning models and used 10-fold cross-validation to check the prediction accuracy; the area under the curve score was 0.89. Ahmad, A. K., Jafar, A. and Aljoumaa, K. [<xref ref-type="bibr" rid="scirp.96177-ref2">2</xref>] developed machine learning techniques on a big data platform to analyze data from SyriaTel telecom containing all customers’ information over 9 months. They experimented with four algorithms: Decision Tree, Random Forest, Gradient Boosted Machine Tree “GBM” and Extreme Gradient Boosting “XGBOOST”. The AUC values for the four models were 83, 87.76, 90.89 and 93.3. 
The best result, 93.3%, was obtained with XGBOOST, which used SNA features that enhanced the performance of the model from 84% to 93.3%. The model was prepared and tested in a Spark environment. Saraswat, S. and Tiwari, A. [<xref ref-type="bibr" rid="scirp.96177-ref11">11</xref>] described a framework for churn prediction that uses the Na&#239;ve Bayes algorithm for the classification task and then applies the Elephant Herding Optimization algorithm for the optimization task. They used a dataset obtained from https://www.kaggle.com that contains 21 attributes and 3333 instances. The data contains 483 churned customers, of whom 244 were correctly predicted as churners using the na&#239;ve equation and 199 after applying the Elephant Herding Optimization algorithm; the model accuracy was 87%. Ahmed, A.A. and D. Maheswari [<xref ref-type="bibr" rid="scirp.96177-ref12">12</xref>] used different algorithms, namely the Firefly algorithm and the Hybrid Firefly algorithm, on the Orange dataset, which contains 50,000 samples and 230 attributes. The dataset was split into 90% for training and 10% for testing. The search space was populated with 20 fireflies and classification was carried out with a maxgen of 1000. The ACC obtained was (86.36%, 86.38%). Some researchers compared different models, such as Kumar, N. and C. Naik [<xref ref-type="bibr" rid="scirp.96177-ref13">13</xref>], who applied three models (logistic regression, random forest and balanced random forest) to a dataset of 25,000 samples and 110 attributes, used PCA for feature selection, and split the data 70% &amp; 30% for training and testing. Their results showed that the logistic regression model has the highest area under the curve; the ACC of the three models was (0.861, 0.83, 0.83).</p></sec><sec id="s3"><title>3. 
The Research Strategy</title><p>The method used in this paper is summarized in <xref ref-type="fig" rid="fig1">Figure 1</xref> and explained in detail in the following paragraphs.</p><sec id="s3_1"><title>3.1. Datasets Visualization</title><p>Two datasets are used in this study. The first dataset consists of 7034 samples and 20 attributes, while the second contains 71,047 samples and 57 attributes. Details of the datasets are shown in <xref ref-type="table" rid="table1">Table 1</xref>. Both datasets were visualized using Orange.</p><p><xref ref-type="fig" rid="fig2">Figure 2</xref> &amp; <xref ref-type="fig" rid="fig3">Figure 3</xref> illustrate the churn class histograms for both datasets. The value 0 refers to non-churned customers, shown in blue, and the value 1 refers to churned customers, shown in orange.</p><p><xref ref-type="table" rid="table2">Table 2</xref> shows sample rows from the IBM dataset with the features used in the prediction models.</p><p><xref ref-type="table" rid="table3">Table 3</xref> shows sample rows from the cell2cell dataset with their features.</p><table-wrap id="table1" ><label><xref ref-type="table" rid="table1">Table 1</xref></label><caption><title> Datasets used</title></caption><table><tbody><thead><tr><th align="center" valign="middle" >Dataset</th><th align="center" valign="middle" >Dataset 1</th><th align="center" valign="middle" >Dataset 2</th></tr></thead><tr><td align="center" valign="middle" >Samples</td><td align="center" valign="middle" >7034</td><td align="center" valign="middle" >71,047</td></tr><tr><td align="center" valign="middle" >Features</td><td align="center" valign="middle" >20</td><td align="center" valign="middle" >57</td></tr><tr><td align="center" valign="middle" >Classes</td><td align="center" valign="middle" >2</td><td align="center" valign="middle" >2</td></tr><tr><td align="center" valign="middle" >Missing values %</td><td 
align="center" valign="middle" >0.0%</td><td align="center" valign="middle" >0.7%</td></tr><tr><td align="center" valign="middle" >negative samples</td><td align="center" valign="middle" >1869 (73.46%)</td><td align="center" valign="middle" >20,609 (29.01%)</td></tr><tr><td align="center" valign="middle" >Data sources</td><td align="center" valign="middle" >IBM Watson [<xref ref-type="bibr" rid="scirp.96177-ref14">14</xref>] [<xref ref-type="bibr" rid="scirp.96177-ref15">15</xref>]</td><td align="center" valign="middle" >Cell2cell [<xref ref-type="bibr" rid="scirp.96177-ref16">16</xref>]</td></tr></tbody></table></table-wrap><table-wrap id="table2" ><label><xref ref-type="table" rid="table2">Table 2</xref></label><caption><title> IBM dataset samples</title></caption><table><tbody><thead><tr><th align="center" valign="middle" >Churn</th><th align="center" valign="middle" >Dependents</th><th align="center" valign="middle" >Tenure</th><th align="center" valign="middle" >Phone Service</th><th align="center" valign="middle" >Multiple Lines</th><th align="center" valign="middle" >Internet Service</th><th align="center" valign="middle" >Online Backup</th><th align="center" valign="middle" >Device Protection</th><th align="center" valign="middle" >Contract</th><th align="center" valign="middle" >Paperless Billing</th><th align="center" valign="middle" >Payment Method</th><th align="center" valign="middle" >Monthly Charges</th><th align="center" valign="middle" >Total Charges</th></tr></thead><tr><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >2</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >1</td><td align="center" 
valign="middle" >29.9</td><td align="center" valign="middle" >29.85</td></tr><tr><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >34</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >2</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >2</td><td align="center" valign="middle" >57</td><td align="center" valign="middle" >1890</td></tr><tr><td align="center" valign="middle" >1</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >2</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >2</td><td align="center" valign="middle" >53.9</td><td align="center" valign="middle" >108.2</td></tr><tr><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >45</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >2</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >2</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >3</td><td align="center" valign="middle" >42.3</td><td align="center" valign="middle" >1841</td></tr><tr><td align="center" valign="middle" >1</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >2</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >0</td><td align="center" 
valign="middle" >2</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >70.7</td><td align="center" valign="middle" >151.7</td></tr><tr><td align="center" valign="middle" >1</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >8</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >2</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >99.7</td><td align="center" valign="middle" >820.5</td></tr><tr><td align="center" valign="middle" >0</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >22</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >2</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >4</td><td align="center" valign="middle" >89.1</td><td align="center" valign="middle" >1949</td></tr></tbody></table></table-wrap><table-wrap id="table3" ><label><xref ref-type="table" rid="table3">Table 3</xref></label><caption><title> Cell2cell dataset samples</title></caption><table><tbody><thead><tr><th align="center" valign="middle" >churn</th><th align="center" valign="middle" >revenue</th><th align="center" valign="middle" >mou</th><th align="center" valign="middle" >recchrge</th><th align="center" valign="middle" >changem</th><th align="center" valign="middle" >custcare</th><th align="center" valign="middle" >mourec</th><th 
align="center" valign="middle" >months</th><th align="center" valign="middle" >phones</th><th align="center" valign="middle" >models</th><th align="center" valign="middle" >eqpdays</th><th align="center" valign="middle" >creditaa</th><th align="center" valign="middle" >prizmtwn</th><th align="center" valign="middle" >webcap</th></tr></thead><tr><td align="center" valign="middle" >0</td><td align="center" valign="middle" >−6.2</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >−6</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >7</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >203</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >1</td></tr><tr><td align="center" valign="middle" >0</td><td align="center" valign="middle" >−5.9</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >−5</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >15</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >452</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >1</td></tr><tr><td align="center" valign="middle" >0</td><td align="center" valign="middle" >−2.5</td><td align="center" valign="middle" >211</td><td align="center" valign="middle" >0.5</td><td align="center" valign="middle" >NA</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >1.69</td><td align="center" valign="middle" >18</td><td align="center" valign="middle" >2</td><td align="center" valign="middle" >2</td><td 
align="center" valign="middle" >281</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >1</td></tr><tr><td align="center" valign="middle" >1</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >2</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >NA</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >27</td><td align="center" valign="middle" >2</td><td align="center" valign="middle" >2</td><td align="center" valign="middle" >597</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >1</td></tr><tr><td align="center" valign="middle" >1</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >55</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >NA</td><td align="center" valign="middle" >5</td><td align="center" valign="middle" >7.06</td><td align="center" valign="middle" >26</td><td align="center" valign="middle" >3</td><td align="center" valign="middle" >3</td><td align="center" valign="middle" >371</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >1</td></tr><tr><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >7</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >199</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >1</td></tr><tr><td align="center" 
valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >76</td><td align="center" valign="middle" >30</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >11.2</td><td align="center" valign="middle" >30</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >883</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >1</td></tr><tr><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0.2</td><td align="center" valign="middle" >12</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >7.24</td><td align="center" valign="middle" >31</td><td align="center" valign="middle" >3</td><td align="center" valign="middle" >3</td><td align="center" valign="middle" >263</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >1</td></tr></tbody></table></table-wrap><sec id="s3_1_1"><title>3.1.1. IBM Dataset Visualization and Preprocessing</title><p>The dataset is for customers who left within the last month. 
Whether a customer left is recorded in the column called Churn, and the dataset also contains the following attributes [<xref ref-type="bibr" rid="scirp.96177-ref14">14</xref>]:</p><p>• Services that each customer has signed up for: internet, online security, online backup, device protection, tech support, and streaming TV and movies;</p><p>• Customer account information: how long they have been a customer, contract, payment method, paperless billing, monthly charges, and total charges;</p><p>• Demographic information about customers: gender, age range, and whether they have partners and dependents.</p><p>Figures 4-12 show the attributes and their distributions by churn class, where orange indicates churned customers and blue indicates non-churned customers. The following was observed:</p><p>• Most churned customers have Fiber optic internet service;</p><p>• They use paperless billing;</p><p>• Most of the customers were dependents;</p><p>• Their payment method was electronic check;</p><p>• They do not use the “device protection” or “online backup” services, but they do use phone service;</p><p>• Their tenure was less than 14 months.</p><p>Therefore, the predictor attributes were selected according to this analysis.</p><p><xref ref-type="fig" rid="fig10">Figure 10</xref> and <xref ref-type="fig" rid="fig11">Figure 11</xref> show the correlation between Total Charges, Monthly Charges and Tenure.</p></sec><sec id="s3_1_2"><title>3.1.2. Cell2cell Dataset Visualization and Preprocessing</title><p>Cell2cell is the 6th largest wireless company in the US. The Cell2cell dataset consists of 71,047 observations, each signifying whether the customer had left the company two months after observation, and 57 attributes [<xref ref-type="bibr" rid="scirp.96177-ref17">17</xref>]. The histograms in Figures 13-17 show the attributes and their distributions by churn class, in the same way as for the IBM dataset. 
The following was observed for the cell2cell dataset:</p><p>• Churned customers have average (mean) monthly minutes of use of less than 530 minutes;</p><p>• They have had service for only 11 - 15 months;</p><p>• The number of days of their equipment was between 300 &amp; 361 days;</p><p>• The number of models issued is less than 2;</p><p>• Their prizm code refers to town;</p><p>• Their handsets have web capability.</p><p>The excluded data by churn class is illustrated in <xref ref-type="fig" rid="fig18">Figure 18</xref> and <xref ref-type="fig" rid="fig19">Figure 19</xref>. <xref ref-type="fig" rid="fig18">Figure 18</xref> plots the Churn attribute vs Total Revenue and Total Charge. <xref ref-type="fig" rid="fig19">Figure 19</xref> plots the surface fitting for Total Charges, Change in Minute Use and Change in Revenues. The 238 outlier samples that were removed are marked in red.</p></sec></sec><sec id="s3_2"><title>3.2. Na&#239;ve Bayes Algorithm</title><p>The Naive Bayes algorithm is a classification algorithm based on Bayes rule and a set of conditional independence assumptions [<xref ref-type="bibr" rid="scirp.96177-ref18">18</xref>]. To predict the class label of X, P(X|C<sub>i</sub>)P(C<sub>i</sub>) is evaluated for each class C<sub>i</sub>. The classifier predicts that the class label of tuple X is the class C<sub>i</sub> if and only if</p><p>$$P(X \mid C_i)P(C_i) &gt; P(X \mid C_j)P(C_j) \quad \text{for } 1 \le j \le m,\ j \ne i$$ (1)</p><p>In other words, the predicted class label is the class C<sub>i</sub> for which P(X|C<sub>i</sub>)P(C<sub>i</sub>) is the maximum [<xref ref-type="bibr" rid="scirp.96177-ref19">19</xref>]. The Naive Bayes model estimates posterior probabilities according to Bayes rule. 
That is, for all k = 1, ⋯, K,</p><p>$$\hat{P}(Y = k \mid X_1, \dots, X_P) = \frac{\pi(Y = k)\prod_{j=1}^{P} P(X_j \mid Y = k)}{\sum_{k=1}^{K} \pi(Y = k)\prod_{j=1}^{P} P(X_j \mid Y = k)}$$ (2)</p><p>where:</p><p>Y is the random variable corresponding to the churn class index of an observation.</p><p>X<sub>1</sub>, ⋯, X<sub>P</sub> are the predictors of an observation.</p><p>π(Y = k) is the prior probability that a class index is k.</p><p>The model uses the mean and standard deviation to model the distribution of the predictors within each class.</p><p>Naive Bayes classification proceeds in two steps. Training step: Using the training data, the method estimates the parameters of a probability distribution, assuming predictors are conditionally independent given the class. Prediction step: For any unseen test data, the method computes the posterior probability of that sample belonging to each class. The method then classifies the test data according to the largest posterior probability.</p></sec><sec id="s3_3"><title>3.3. Support Vector Machine Algorithm</title><p>SVM is an algorithm for the classification of both linear and nonlinear data. It transforms the original data into a higher dimension, from where it can find a hyperplane for data separation using essential training tuples called support vectors [<xref ref-type="bibr" rid="scirp.96177-ref19">19</xref>]. The SVM binary classification algorithm searches for an optimal hyperplane that separates the data into two classes. For separable classes, the optimal hyperplane maximizes a margin (space that does not contain any observations) surrounding itself, which creates boundaries for the positive and negative classes. The data for training is a set of points (vectors) x<sub>j</sub> along with their categories y<sub>j</sub>. For some dimension d, x<sub>j</sub> ∈ R<sup>d</sup> and y<sub>j</sub> = &#177;1. 
The equation of a hyperplane is [<xref ref-type="bibr" rid="scirp.96177-ref20">20</xref>]</p><p>$$f(x) = x'\beta + b = 0$$ (3)</p><p>where β ∈ R<sup>d</sup> and b is a real number.</p><p>As the data used do not allow for a separating hyperplane, the SVM uses a soft margin, meaning a hyperplane that separates many, but not all, data points. There are two standard formulations of soft margins. Both involve adding slack variables ξ = (ξ<sub>1</sub>, ξ<sub>2</sub>, ⋯, ξ<sub>N</sub>) and a penalty parameter C.</p><p>• The L<sup>1</sup>-norm problem is:</p><p>$$\min_{\beta, b, \xi} \left( \frac{1}{2}\beta'\beta + C\sum_j \xi_j \right)$$ (4)</p><p>such that</p><p>$$y_j f(x_j) \ge 1 - \xi_j, \qquad \xi_j \ge 0$$ (5)</p><p>• The L<sup>2</sup>-norm problem is:</p><p>$$\min_{\beta, b, \xi} \left( \frac{1}{2}\beta'\beta + C\sum_j \xi_j^2 \right)$$ (6)</p><p>In these formulations, it can be seen that increasing C places more weight on the slack variables ξ<sub>j</sub>, meaning the optimization attempts to make a stricter separation between classes. Equivalently, reducing C towards 0 makes misclassification less important.</p><p>The proposed SVM model standardizes the predictors using their corresponding weighted means and weighted standard deviations. That is, it standardizes predictor j (x<sub>j</sub>) using</p><p>$$x_j^* = \frac{x_j - \mu_j^*}{\sigma_j^*}$$ (7)</p><p>$$\mu_j^* = \frac{1}{\sum_k w_k^*} \sum_k w_k^* x_{jk}$$ (8)</p><p>Here x<sub>jk</sub> is observation k (row) of predictor j (column).</p><p>$$(\sigma_j^*)^2 = \frac{v_1}{v_1^2 - v_2} \sum_k w_k^* (x_{jk} - \mu_j^*)^2$$ (9)</p><p>$$v_1 = \sum_j w_j^*$$ (10)</p><p>$$v_2 = \sum_j (w_j^*)^2$$ (11)</p></sec><sec id="s3_4"><title>3.4. Decision Tree Algorithm</title><p>Decision tree induction is the learning of decision trees from class-labeled training tuples. A decision tree is a flowchart-like tree structure, where each internal node (nonleaf node) denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node (or terminal node) holds a class label. The topmost node in a tree is the root node [<xref ref-type="bibr" rid="scirp.96177-ref19">19</xref>]. 
The classification tree splits nodes based on either impurity or node error. Impurity means one of several things, depending on the Split Criterion name-value pair argument:</p><p>• Gini’s Diversity Index (gdi): the Gini index of a node is</p><p>$$\mathrm{Gini}(D) = 1 - \sum_{i=1}^{m} p(i)^2,$$ (12)</p><p>where the sum is over the classes i at the node, and p(i) is the observed fraction of observations with class i that reach the node. A node with just one class (a pure node) has Gini index 0; otherwise the Gini index is positive. So the Gini index is a measure of node impurity.</p><p>• Deviance (“deviance”): with p(i) defined the same as for the Gini index, the deviance of a node is</p><p>$$-\sum_i p(i) \log_2 p(i)$$ (13)</p><p>A pure node has deviance 0; otherwise, the deviance is positive.</p></sec><sec id="s3_5"><title>3.5. Model Evaluation Methods</title><p>The models were evaluated using the holdout method and k-fold cross-validation. In the holdout method, the given data are randomly partitioned into two independent sets, a training set and a test set [<xref ref-type="bibr" rid="scirp.96177-ref19">19</xref>]. In this partition type, a scalar parameter p sets the share of data held out: approximately p·n of the n observations are randomly selected for the test set. The p value used here is 0.3, which divides each dataset into 70% for training and 30% for testing. In k-fold cross-validation, the initial data are randomly partitioned into k mutually exclusive subsets or “folds”, D<sub>1</sub>, D<sub>2</sub>, ... D<sub>k</sub>, each of approximately equal size. Training and testing are performed k times [<xref ref-type="bibr" rid="scirp.96177-ref19">19</xref>]. The datasets here were divided into 10 folds.</p></sec></sec><sec id="s4"><title>4. Experiments and Results</title><p>The three models were trained on the IBM and cell2cell datasets, which were divided into training and test sets using the “hold-out” partition with 30% held out and the “k-fold” partition with k = 10. 
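</p><p>The two partition schemes just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors’ implementation: the function names holdout_split and kfold_indices, the fixed random seed and the use of the standard random module are all hypothetical choices made for this example.</p><preformat>
```python
import random


def holdout_split(n, p=0.3, seed=0):
    """Randomly hold out about p * n of n observations for testing."""
    rng = random.Random(seed)  # seed is an assumption, for repeatability
    indices = list(range(n))
    rng.shuffle(indices)
    n_test = round(p * n)
    # The first n_test shuffled indices form the test set, the rest train.
    return sorted(indices[n_test:]), sorted(indices[:n_test])


def kfold_indices(n, k=10, seed=0):
    """Partition n observations into k mutually exclusive folds."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    return [sorted(indices[i::k]) for i in range(k)]


train, test = holdout_split(7034, p=0.3)
folds = kfold_indices(7034, k=10)
print(len(train), len(test))
print([len(fold) for fold in folds])
```
</preformat><p>With n = 7034 and p = 0.3, the sketch holds out 2110 observations and keeps 4924 for training. In the 10-fold case, each fold would serve once as the test set while the remaining nine folds form the training set.</p><p>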
The training and testing errors are shown in <xref ref-type="table" rid="table4">Table 4</xref>, which reports the best result obtained from training and testing. The models were trained 4 to 5 times for each dataset, and the repeated runs did not give better accuracy.</p><p>The ROC curves for the IBM dataset are shown in Figures 20-22, according to <xref ref-type="table" rid="table2">Table 2</xref>, for each model output, whereas Figures 23-25 show the ROC curves for the cell2cell dataset according to <xref ref-type="table" rid="table4">Table 4</xref>. The ROC curve for each of the three models shows the trade-off between the true positive rate (TPR) and the false positive rate (FPR). Given a test set and a model, TPR is the proportion of positive (churned) tuples that are correctly labeled by the model; FPR is the proportion of negative (non-churned) tuples that are mislabeled as positive [<xref ref-type="bibr" rid="scirp.96177-ref19">19</xref>].</p><p>In the following experiment, the models were evaluated with a k-fold value of 10, as shown in <xref ref-type="table" rid="table5">Table 5</xref> for both datasets. There are small variances between the error rates within the k-fold cross-validation experiment. 
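</p><p>The TPR, FPR and area under the curve used in this evaluation can be sketched as follows. This is a minimal illustration in Python, not the Matlab routine used to produce the reported figures. The function names roc_points and auc and the toy labels and scores are assumptions introduced for the example, and the sketch assumes that all scores are distinct.</p><preformat>
```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs obtained by sweeping the decision threshold.

    labels: 1 for churned (positive) tuples, 0 for non-churned (negative) tuples.
    scores: model scores, a higher score meaning more likely to churn.
    Assumes distinct scores and at least one tuple of each class.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Visit the tuples from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for _, label in ranked:
        if label == 1:
            tp += 1  # a churned tuple correctly labeled as positive
        else:
            fp += 1  # a non-churned tuple mislabeled as positive
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


if __name__ == "__main__":
    toy_labels = [1, 1, 0, 1, 0, 0]  # hypothetical test-set labels
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]  # hypothetical model scores
    print(auc(roc_points(toy_labels, toy_scores)))
```
</preformat><p>An AUC of 1 corresponds to a model that ranks every churned tuple above every non-churned one, while 0.5 corresponds to a random ranking.</p><p>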
The best result was obtained from the SVM model in fold number 8 on the IBM dataset and in fold number 1 on the cell2cell dataset.</p><table-wrap id="table4" ><label><xref ref-type="table" rid="table4">Table 4</xref></label><caption><title>Training with 30% holdout</title></caption><table><tbody><thead><tr><th align="center" valign="middle" >Dataset</th><th align="center" valign="middle" >Model</th><th align="center" valign="middle" >Training Error</th><th align="center" valign="middle" >Testing Error</th><th align="center" valign="middle" >AUC</th><th align="center" valign="middle" >ACC</th></tr></thead><tr><td align="center" valign="middle"  rowspan="3"  >IBM Watson</td><td align="center" valign="middle" >Na&#239;ve Bayes</td><td align="center" valign="middle" >0.23829</td><td align="center" valign="middle" >0.25334</td><td align="center" valign="middle" >0.81721</td><td align="center" valign="middle" >76%</td></tr><tr><td align="center" valign="middle" >SVM</td><td align="center" valign="middle" >0.20564</td><td align="center" valign="middle" >0.20335</td><td align="center" valign="middle" >0.83683</td><td align="center" valign="middle" >80%</td></tr><tr><td align="center" valign="middle" >Decision Tree</td><td align="center" valign="middle" >0.086392</td><td align="center" valign="middle" >0.23684</td><td align="center" valign="middle" >0.76198</td><td align="center" valign="middle" >76.3%</td></tr><tr><td align="center" valign="middle"  rowspan="3"  >Cell2cell</td><td align="center" valign="middle" >Na&#239;ve Bayes</td><td align="center" valign="middle" >0.021036</td><td align="center" valign="middle" >0.019907</td><td align="center" valign="middle" >0.98149</td><td align="center" valign="middle" >90%</td></tr><tr><td align="center" valign="middle" >SVM</td><td align="center" valign="middle" >0.0082883</td><td align="center" valign="middle" >0.0089536</td><td align="center" valign="middle" >0.99212</td><td align="center" valign="middle" >98.2%</td></tr><tr><td 
align="center" valign="middle" >Decision Tree</td><td align="center" valign="middle" >0.12620</td><td align="center" valign="middle" >0.012142</td><td align="center" valign="middle" >0.9855</td><td align="center" valign="middle" >98.8%</td></tr></tbody></table></table-wrap><table-wrap id="table5" ><label><xref ref-type="table" rid="table5">Table 5</xref></label><caption><title>Training with 10-fold cross-validation for the IBM and cell2cell datasets</title></caption><table><tbody><thead><tr><th align="center" valign="middle"  rowspan="2"  >Model</th><th align="center" valign="middle"  rowspan="2"  >Fold</th><th align="center" valign="middle"  colspan="4"  >IBM Dataset</th><th align="center" valign="middle"  colspan="4"  >Cell2cell Dataset</th><th align="center" valign="middle" ></th></tr></thead><tr><td align="center" valign="middle" >Training Error</td><td align="center" valign="middle" >Testing Error</td><td align="center" valign="middle" >AUC</td><td align="center" valign="middle" >ACC</td><td align="center" valign="middle" >Training Error</td><td align="center" valign="middle" >Testing Error</td><td align="center" valign="middle" >AUC</td><td align="center" valign="middle"  colspan="2"  >ACC (%)</td></tr><tr><td align="center" valign="middle"  rowspan="10"  >Na&#239;ve Bayes</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >0.2436</td><td align="center" valign="middle" >0.2270</td><td align="center" valign="middle" >0.8135</td><td align="center" valign="middle" >77.30%</td><td align="center" valign="middle" >0.0309</td><td align="center" valign="middle" >0.0305</td><td align="center" valign="middle" >0.9665</td><td align="center" valign="middle" >97</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >2</td><td align="center" valign="middle" >0.2435</td><td align="center" valign="middle" >0.2482</td><td align="center" valign="middle" >0.8149</td><td align="center" valign="middle" >75.20%</td><td 
align="center" valign="middle" >0.0295</td><td align="center" valign="middle" >0.0302</td><td align="center" valign="middle" >0.9671</td><td align="center" valign="middle" >97</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >3</td><td align="center" valign="middle" >0.2394</td><td align="center" valign="middle" >0.2639</td><td align="center" valign="middle" >0.8156</td><td align="center" valign="middle" >73.60%</td><td align="center" valign="middle" >0.0314</td><td align="center" valign="middle" >0.0254</td><td align="center" valign="middle" >0.9667</td><td align="center" valign="middle" >97</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >4</td><td align="center" valign="middle" >0.2403</td><td align="center" valign="middle" >0.2596</td><td align="center" valign="middle" >0.8181</td><td align="center" valign="middle" >74.00%</td><td align="center" valign="middle" >0.0295</td><td align="center" valign="middle" >0.0270</td><td align="center" valign="middle" >0.9681</td><td align="center" valign="middle" >97.3</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >5</td><td align="center" valign="middle" >0.2423</td><td align="center" valign="middle" >0.2386</td><td align="center" valign="middle" >0.8117</td><td align="center" valign="middle" >76.10%</td><td align="center" valign="middle" >0.0312</td><td align="center" valign="middle" >0.0328</td><td align="center" valign="middle" >0.9663</td><td align="center" valign="middle" >96.7</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >6</td><td align="center" valign="middle" >0.2390</td><td align="center" valign="middle" >0.2613</td><td align="center" valign="middle" >0.8166</td><td align="center" valign="middle" >73.90%</td><td align="center" valign="middle" >0.0300</td><td align="center" valign="middle" >0.0321</td><td align="center" valign="middle" 
>0.9691</td><td align="center" valign="middle" >96.8</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >7</td><td align="center" valign="middle" >0.2440</td><td align="center" valign="middle" >0.2400</td><td align="center" valign="middle" >0.8125</td><td align="center" valign="middle" >76.00%</td><td align="center" valign="middle" >0.0318</td><td align="center" valign="middle" >0.0343</td><td align="center" valign="middle" >0.9647</td><td align="center" valign="middle" >96.6</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >8</td><td align="center" valign="middle" >0.2425</td><td align="center" valign="middle" >0.2457</td><td align="center" valign="middle" >0.8123</td><td align="center" valign="middle" >75.40%</td><td align="center" valign="middle" >0.0304</td><td align="center" valign="middle" >0.0316</td><td align="center" valign="middle" >0.9679</td><td align="center" valign="middle" >96.8</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >9</td><td align="center" valign="middle" >0.2447</td><td align="center" valign="middle" >0.2272</td><td align="center" valign="middle" >0.8133</td><td align="center" valign="middle" >77.30%</td><td align="center" valign="middle" >0.0312</td><td align="center" valign="middle" >0.0330</td><td align="center" valign="middle" >0.9652</td><td align="center" valign="middle" >96.7</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >10</td><td align="center" valign="middle" >0.2448</td><td align="center" valign="middle" >0.2187</td><td align="center" valign="middle" >0.8106</td><td align="center" valign="middle" >79.10%</td><td align="center" valign="middle" >0.0309</td><td align="center" valign="middle" >0.0326</td><td align="center" valign="middle" >0.9677</td><td align="center" valign="middle" >96.7</td><td align="center" valign="middle" ></td></tr><tr><td align="center" 
valign="middle"  rowspan="10"  >SVM</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >0.2048</td><td align="center" valign="middle" >0.2033</td><td align="center" valign="middle" >0.8300</td><td align="center" valign="middle" >79.90%</td><td align="center" valign="middle" >0.0087</td><td align="center" valign="middle" >0.0072</td><td align="center" valign="middle" >0.9927</td><td align="center" valign="middle" >99</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >2</td><td align="center" valign="middle" >0.2046</td><td align="center" valign="middle" >0.1836</td><td align="center" valign="middle" >0.8262</td><td align="center" valign="middle" >81.70%</td><td align="center" valign="middle" >0.0086</td><td align="center" valign="middle" >0.0084</td><td align="center" valign="middle" >0.9908</td><td align="center" valign="middle" >98.9</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >3</td><td align="center" valign="middle" >0.2048</td><td align="center" valign="middle" >0.1838</td><td align="center" valign="middle" >0.8498</td><td align="center" valign="middle" >81.70%</td><td align="center" valign="middle" >0.0087</td><td align="center" valign="middle" >0.0075</td><td align="center" valign="middle" >0.9935</td><td align="center" valign="middle" >99</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >4</td><td align="center" valign="middle" >0.1993</td><td align="center" valign="middle" >0.2330</td><td align="center" valign="middle" >0.8064</td><td align="center" valign="middle" >76.70%</td><td align="center" valign="middle" >0.0085</td><td align="center" valign="middle" >0.0085</td><td align="center" valign="middle" >0.9920</td><td align="center" valign="middle" >98.9</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >5</td><td align="center" valign="middle" >0.2037</td><td 
align="center" valign="middle" >0.1925</td><td align="center" valign="middle" >0.8447</td><td align="center" valign="middle" >80.80%</td><td align="center" valign="middle" >0.0086</td><td align="center" valign="middle" >0.0082</td><td align="center" valign="middle" >0.9924</td><td align="center" valign="middle" >98.8</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >6</td><td align="center" valign="middle" >0.2018</td><td align="center" valign="middle" >0.2150</td><td align="center" valign="middle" >0.8138</td><td align="center" valign="middle" >78.60%</td><td align="center" valign="middle" >0.0083</td><td align="center" valign="middle" >0.0109</td><td align="center" valign="middle" >0.9882</td><td align="center" valign="middle" >98.7</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >7</td><td align="center" valign="middle" >0.2042</td><td align="center" valign="middle" >0.1945</td><td align="center" valign="middle" >0.8239</td><td align="center" valign="middle" >80.50%</td><td align="center" valign="middle" >0.0086</td><td align="center" valign="middle" >0.0079</td><td align="center" valign="middle" >0.9917</td><td align="center" valign="middle" >99</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >8</td><td align="center" valign="middle" >0.2062</td><td align="center" valign="middle" >0.1831</td><td align="center" valign="middle" >0.8655</td><td align="center" valign="middle" >81.70%</td><td align="center" valign="middle" >0.0085</td><td align="center" valign="middle" >0.0092</td><td align="center" valign="middle" >0.9926</td><td align="center" valign="middle" >98.8</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >9</td><td align="center" valign="middle" >0.2002</td><td align="center" valign="middle" >0.2272</td><td align="center" valign="middle" >0.8089</td><td align="center" valign="middle" 
>77.30%</td><td align="center" valign="middle" >0.0086</td><td align="center" valign="middle" >0.0081</td><td align="center" valign="middle" >0.9926</td><td align="center" valign="middle" >99</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >10</td><td align="center" valign="middle" >0.2033</td><td align="center" valign="middle" >0.2118</td><td align="center" valign="middle" >0.8224</td><td align="center" valign="middle" >78.80%</td><td align="center" valign="middle" >0.0085</td><td align="center" valign="middle" >0.0091</td><td align="center" valign="middle" >0.9913</td><td align="center" valign="middle" >98.8</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle"  rowspan="10"  >Decision Tree</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >0.0858</td><td align="center" valign="middle" >0.2499</td><td align="center" valign="middle" >0.7240</td><td align="center" valign="middle" >75%</td><td align="center" valign="middle" >0.0001</td><td align="center" valign="middle" >0.0120</td><td align="center" valign="middle" >0.9853</td><td align="center" valign="middle" >98.8</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >2</td><td align="center" valign="middle" >0.0841</td><td align="center" valign="middle" >0.2469</td><td align="center" valign="middle" >0.7667</td><td align="center" valign="middle" >75.30%</td><td align="center" valign="middle" >0.0001</td><td align="center" valign="middle" >0.0130</td><td align="center" valign="middle" >0.9859</td><td align="center" valign="middle" >98.7</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >3</td><td align="center" valign="middle" >0.0896</td><td align="center" valign="middle" >0.2298</td><td align="center" valign="middle" >0.7447</td><td align="center" valign="middle" >77%</td><td align="center" valign="middle" >0.0002</td><td 
align="center" valign="middle" >0.0123</td><td align="center" valign="middle" >0.9826</td><td align="center" valign="middle" >98.8</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >4</td><td align="center" valign="middle" >0.0882</td><td align="center" valign="middle" >0.2795</td><td align="center" valign="middle" >0.7168</td><td align="center" valign="middle" >72.10%</td><td align="center" valign="middle" >0.0001</td><td align="center" valign="middle" >0.0124</td><td align="center" valign="middle" >0.9846</td><td align="center" valign="middle" >98.9</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >5</td><td align="center" valign="middle" >0.0880</td><td align="center" valign="middle" >0.2618</td><td align="center" valign="middle" >0.7264</td><td align="center" valign="middle" >73.90%</td><td align="center" valign="middle" >0.0002</td><td align="center" valign="middle" >0.0109</td><td align="center" valign="middle" >0.9887</td><td align="center" valign="middle" >98.9</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >6</td><td align="center" valign="middle" >0.0882</td><td align="center" valign="middle" >0.2684</td><td align="center" valign="middle" >0.6994</td><td align="center" valign="middle" >73.20%</td><td align="center" valign="middle" >0.0001</td><td align="center" valign="middle" >0.0110</td><td align="center" valign="middle" >0.9853</td><td align="center" valign="middle" >98.8</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >7</td><td align="center" valign="middle" >0.0868</td><td align="center" valign="middle" >0.2542</td><td align="center" valign="middle" >0.7125</td><td align="center" valign="middle" >74.60%</td><td align="center" valign="middle" >0.0000</td><td align="center" valign="middle" >0.0136</td><td align="center" valign="middle" >0.9831</td><td align="center" valign="middle" 
>98.6</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >8</td><td align="center" valign="middle" >0.0841</td><td align="center" valign="middle" >0.2641</td><td align="center" valign="middle" >0.7321</td><td align="center" valign="middle" >73.60%</td><td align="center" valign="middle" >0.0001</td><td align="center" valign="middle" >0.0117</td><td align="center" valign="middle" >0.9854</td><td align="center" valign="middle" >98.8</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >9</td><td align="center" valign="middle" >0.0844</td><td align="center" valign="middle" >0.2400</td><td align="center" valign="middle" >0.7079</td><td align="center" valign="middle" >76%</td><td align="center" valign="middle" >0.0002</td><td align="center" valign="middle" >0.0127</td><td align="center" valign="middle" >0.9863</td><td align="center" valign="middle" >98.7</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >10</td><td align="center" valign="middle" >0.0899</td><td align="center" valign="middle" >0.2854</td><td align="center" valign="middle" >0.7020</td><td align="center" valign="middle" >71.40%</td><td align="center" valign="middle" >0.0001</td><td align="center" valign="middle" >0.0120</td><td align="center" valign="middle" >0.9847</td><td align="center" valign="middle" >98.8</td><td align="center" valign="middle" ></td></tr></tbody></table></table-wrap><p>To check the models, they have been compared with previous papers that used the same datasets. The results indicate that the proposed models are more accurate, as shown in <xref ref-type="table" rid="table4">Table 4</xref>.</p><p>ApurvaSree, G., et al. [<xref ref-type="bibr" rid="scirp.96177-ref4">4</xref>] and Induja, S. and D. V. P. 
Eswaramurthy [<xref ref-type="bibr" rid="scirp.96177-ref21">21</xref>] used the IBM Watson dataset with different algorithms, including SVM in the first paper and Na&#239;ve Bayes in the second; both results are similar to the results obtained in this paper. However, our proposed method obtained higher accuracy by using the SVM model on the IBM dataset with a k-fold partition (k = 10), producing an area under the curve of 0.86548. As for the cell2cell dataset, the papers in [<xref ref-type="bibr" rid="scirp.96177-ref22">22</xref>] [<xref ref-type="bibr" rid="scirp.96177-ref23">23</xref>] [<xref ref-type="bibr" rid="scirp.96177-ref24">24</xref>] [<xref ref-type="bibr" rid="scirp.96177-ref25">25</xref>] also used different algorithms including SVM, where the best result in previous studies was an AUC of 94.13, whereas the AUC of the proposed model using SVM is 0.99, as shown in <xref ref-type="table" rid="table6">Table 6</xref>.</p><p>The AUC values have been plotted for the three models, as shown in <xref ref-type="fig" rid="fig2">Figure 2</xref>6 and <xref ref-type="fig" rid="fig2">Figure 2</xref>7, which show that the best result was obtained using the SVM algorithm for both datasets.</p><table-wrap id="table6" ><label><xref ref-type="table" rid="table6">Table 6</xref></label><caption><title>Comparison with previous papers which used the same datasets</title></caption><table><tbody><thead><tr><th align="center" valign="middle" >Paper</th><th align="center" valign="middle" >Algorithms</th><th align="center" valign="middle" >Result</th><th align="center" valign="middle" >Dataset</th><th align="center" valign="middle" >Proposed Model Result</th></tr></thead><tr><td align="center" valign="middle" >[<xref ref-type="bibr" rid="scirp.96177-ref4">4</xref>]</td><td align="center" valign="middle" >Random forest, Logistic regression, SVM</td><td align="center" valign="middle" >Accuracy (80.75%, 80.88%, 82%)</td><td align="center" valign="middle" >IBM Watson</td><td 
align="center" valign="middle" >AUC for SVM (87%)</td></tr><tr><td align="center" valign="middle" >[<xref ref-type="bibr" rid="scirp.96177-ref21">21</xref>]</td><td align="center" valign="middle" >Naive Bayes</td><td align="center" valign="middle" >AUC (83%)</td><td align="center" valign="middle" >IBM Watson</td><td align="center" valign="middle" >AUC for Naive Bayes is the same (83%)</td></tr><tr><td align="center" valign="middle" >[<xref ref-type="bibr" rid="scirp.96177-ref22">22</xref>]</td><td align="center" valign="middle" >Decision trees, Logistic regression, Neural networks and SVM</td><td align="center" valign="middle" >Accuracy (62.98, 61.65, 61.40 and 61.78)</td><td align="center" valign="middle" >Cell2cell</td><td align="center" valign="middle" >ACC for decision trees (98%) and SVM (99%)</td></tr><tr><td align="center" valign="middle" >[<xref ref-type="bibr" rid="scirp.96177-ref23">23</xref>]</td><td align="center" valign="middle" >GP-AdaBoost</td><td align="center" valign="middle" >AUC (0.91)</td><td align="center" valign="middle" >Cell2cell</td><td align="center" valign="middle" >Best AUC for SVM (99%); AdaBoost not used</td></tr><tr><td align="center" valign="middle" >[<xref ref-type="bibr" rid="scirp.96177-ref24">24</xref>]</td><td align="center" valign="middle" >C4.5 decision tree</td><td align="center" valign="middle" >AUC (63.04)</td><td align="center" valign="middle" >Cell2cell</td><td align="center" valign="middle" >AUC for decision trees (98%)</td></tr><tr><td align="center" valign="middle" >[<xref ref-type="bibr" rid="scirp.96177-ref25">25</xref>]</td><td align="center" valign="middle" >SVM</td><td align="center" valign="middle" >AUC (94.13)</td><td align="center" valign="middle" >Cell2cell</td><td align="center" valign="middle" >AUC for SVM (99%)</td></tr></tbody></table></table-wrap></sec><sec id="s5"><title>5. 
Conclusion</title><p>This paper analyzed two datasets: the IBM Watson dataset, which consists of 7033 observations and 21 attributes, and the cell2cell dataset, which consists of 71,047 observations and 57 attributes. Both were visualized using the Orange software. The three predictive models (Na&#239;ve Bayes, SVM and decision tree) have been implemented in Matlab. The paper aims to find the most accurate model for churn prediction in telecom and to select the most important reasons that lead customers to churn. The performance of the models has been measured by the area under the curve, where the best AUCs are (0.82, 0.87, 0.78) for the IBM dataset and (0.98, 0.99, 0.98) for the cell2cell dataset. The AUC obtained using the SVM algorithm is better than those reported in the previous papers. It was noticed that the churned customers share some similar services, which means that any telecom company can detect the predictors and retain its customers. The paper concludes that telecom operators can obtain the best predictive models if they analyze their whole records and track the customers’ behavior, so that they can build different marketing approaches to retain the churners based on the predictors detected when analyzing the historical customer records. All churn prediction models in this paper can be used in other customer response models as well, such as cross-selling, up-selling, or customer acquisition.</p></sec><sec id="s6"><title>Acknowledgements</title><p>This work is supported by the International University of Africa. The authors would like to thank the International University of Africa for its support in research and development. In addition, the authors would like to thank the IBM Watson and Cell2cell companies for making the datasets freely available for research. 
The authors are also immensely grateful to Prof. Saad Subair for his support in publishing in this journal.</p></sec><sec id="s7"><title>Conflicts of Interest</title><p>The authors declare no conflicts of interest regarding the publication of this paper.</p></sec><sec id="s8"><title>Cite this paper</title><p>Ebrah, K. and Elnasir, S. (2019) Churn Prediction Using Machine Learning and Recommendations Plans for Telecoms. Journal of Computer and Communications, 7, 33-53. https://doi.org/10.4236/jcc.2019.711003</p></sec></body><back><ref-list><title>References</title><ref id="scirp.96177-ref1"><label>1</label><mixed-citation publication-type="other" xlink:type="simple">John, T., et al. (2018) Telecom Churn.</mixed-citation></ref><ref id="scirp.96177-ref2"><label>2</label><mixed-citation publication-type="other" xlink:type="simple">Ahmad, A.K., Jafar, A. and Aljoumaa, K. (2019) Customer Churn Prediction in Telecom Using Machine Learning in Big Data Platform. Journal of Big Data, 6, 28. https://doi.org/10.1186/s40537-019-0191-6</mixed-citation></ref><ref id="scirp.96177-ref3"><label>3</label><mixed-citation publication-type="other" xlink:type="simple">Andrews, R., et al. (2019) Churn Prediction in Telecom Sector Using Machine Learning. International Journal of Information Systems and Computer Sciences, 8, 132-134. https://doi.org/10.30534/ijiscs/2019/31822019</mixed-citation></ref><ref id="scirp.96177-ref4"><label>4</label><mixed-citation publication-type="other" xlink:type="simple">ApurvaSree, G., et al. (2019) Churn Prediction in Telecom Using Classification Algorithms. 
International Journal of Scientific Research and Engineering Development, 5, 19-28.</mixed-citation></ref><ref id="scirp.96177-ref5"><label>5</label><mixed-citation publication-type="other" xlink:type="simple">Tata Tele Business Services (2018) Big Data and the Telecom Industry.</mixed-citation></ref><ref id="scirp.96177-ref6"><label>6</label><mixed-citation publication-type="journal" xlink:type="simple"><name name-style="western"><surname>Kayaalp</surname><given-names> F. </given-names></name>,<etal>et al</etal>. (<year>2017</year>)<article-title>Review of Customer Churn Analysis Studies in Telecommunications Industry</article-title><source> Karaelmas Science Engineering Journal</source><volume> 7</volume>,<fpage> 696</fpage>-<lpage>705</lpage>.<pub-id pub-id-type="doi"></pub-id></mixed-citation></ref><ref id="scirp.96177-ref7"><label>7</label><mixed-citation publication-type="other" xlink:type="simple">Umayaparvathi, V. and Iyakutti, K. (2016) A Survey on Customer Churn Prediction in Telecom Industry: Datasets, Methods and Metrics. International Research Journal of Engineering and Technology, 3, 1065-1070.</mixed-citation></ref><ref id="scirp.96177-ref8"><label>8</label><mixed-citation publication-type="journal" xlink:type="simple"><name name-style="western"><surname>Kaur</surname><given-names> S. </given-names></name>,<etal>et al</etal>. (<year>2017</year>)<article-title>Literature Review of Data Mining Techniques in Customer Churn Prediction for Telecommunications Industry</article-title><source> Journal of Applied Technology and Innovation</source><volume> 1</volume>,<fpage> 28</fpage>-<lpage>40</lpage>.<pub-id pub-id-type="doi"></pub-id></mixed-citation></ref><ref id="scirp.96177-ref9"><label>9</label><mixed-citation publication-type="other" xlink:type="simple">Ahmed, A. and Linen, D.M. (2017) A Review and Analysis of Churn Prediction Methods for Customer Retention in Telecom Industries. 
4th International Conference on Advanced Computing and Communication Systems, Coimbatore, 6-7 January 2017, 1-7. https://doi.org/10.1109/ICACCS.2017.8014605</mixed-citation></ref><ref id="scirp.96177-ref10"><label>10</label><mixed-citation publication-type="other" xlink:type="simple">Amin, A., et al. (2019) Customer Churn Prediction in Telecommunication Industry Using Data Certainty. Journal of Business Research, 94, 290-301. https://doi.org/10.1016/j.jbusres.2018.03.003</mixed-citation></ref><ref id="scirp.96177-ref11"><label>11</label><mixed-citation publication-type="other" xlink:type="simple">Saraswat, S. and Tiwari, A. (2018) A New Approach for Customer Churn Prediction in Telecom Industry. International Journal of Computer Applications, 181, 40-46. https://doi.org/10.5120/ijca2018917698</mixed-citation></ref><ref id="scirp.96177-ref12"><label>12</label><mixed-citation publication-type="other" xlink:type="simple">Ahmed, A.A. and Maheswari, D. (2017) Churn Prediction on Huge Telecom Data Using Hybrid Firefly Based Classification. Egyptian Informatics Journal, 18, 215-220. https://doi.org/10.1016/j.eij.2017.02.002</mixed-citation></ref><ref id="scirp.96177-ref13"><label>13</label><mixed-citation publication-type="other" xlink:type="simple">Kumar, N. and Naik, C. (2017) Comparative Analysis of Machine Learning Algorithms for Their Effectiveness in Churn Prediction in the Telecom Industry. International Research Journal of Engineering and Technology, 4, 485-489.</mixed-citation></ref><ref id="scirp.96177-ref14"><label>14</label><mixed-citation publication-type="other" xlink:type="simple">IBM Watson Dataset 2018-11-29. https://www.kaggle.com/jpacse/datasets-for-churn-telecom</mixed-citation></ref><ref id="scirp.96177-ref15"><label>15</label><mixed-citation publication-type="other" xlink:type="simple">IBM Data. 
https://www.ibm.com/communities/analytics/watson-analytics-blog/predictive-insights-in-the-telco-customer-churn-data-set</mixed-citation></ref><ref id="scirp.96177-ref16"><label>16</label><mixed-citation publication-type="other" xlink:type="simple">Cell2cell Dataset. https://www.kaggle.com/jpacse/telecom-churn-new-cell2cell-dataset</mixed-citation></ref><ref id="scirp.96177-ref17"><label>17</label><mixed-citation publication-type="other" xlink:type="simple">Business F.S.O. (2002) Cell2cell: The Churn Game. ((A) 8/26/02).</mixed-citation></ref><ref id="scirp.96177-ref18"><label>18</label><mixed-citation publication-type="other" xlink:type="simple">Mitchell, T.M. (2015) Generative and Discriminative Classifiers: Naive Bayes and Logistic Regression.</mixed-citation></ref><ref id="scirp.96177-ref19"><label>19</label><mixed-citation publication-type="other" xlink:type="simple">Han, J., Pei, J. and Kamber, M. (2011) Data Mining: Concepts and Techniques. Elsevier, Amsterdam.</mixed-citation></ref><ref id="scirp.96177-ref20"><label>20</label><mixed-citation publication-type="other" xlink:type="simple">Hastie, T., Tibshirani, R. and Friedman, J. (2008) The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer, Berlin.</mixed-citation></ref><ref id="scirp.96177-ref21"><label>21</label><mixed-citation publication-type="other" xlink:type="simple">Induja, S. and Eswaramurthy, D.V.P. (2016) Customers Churn Prediction and Attribute Selection in Telecom Industry Using Kernelized Extreme Learning Machine and Bat Algorithms. International Journal of Science and Research, 5, 258-265.</mixed-citation></ref><ref id="scirp.96177-ref22"><label>22</label><mixed-citation publication-type="other" xlink:type="simple">Gajowniczek, K., Orlowski, A. and Zabkowski, T. (2016) Entropy Based Trees to Support Decision Making for Customer Churn Management. Acta Physica Polonica A, 129, 971-979. 
https://doi.org/10.12693/APhysPolA.129.971</mixed-citation></ref><ref id="scirp.96177-ref23"><label>23</label><mixed-citation publication-type="other" xlink:type="simple">Idris, A., Iftikhar, A. and ur Rehman, Z.J.C.C. (2017) Intelligent Churn Prediction for Telecom Using GP-AdaBoost Learning and PSO Undersampling. Springer Science + Business Media, Berlin, 1-15. https://doi.org/10.1007/s10586-017-1154-3</mixed-citation></ref><ref id="scirp.96177-ref24"><label>24</label><mixed-citation publication-type="other" xlink:type="simple">Gajowniczek, K., Zabkowski, T. and Orlowski, A. (2015) Comparison of Decision Trees with Rényi and Tsallis Entropy Applied for Imbalanced Churn Dataset. Federated Conference on Computer Science and Information Systems, Lodz, 13-16 September 2015, 39-44. https://doi.org/10.15439/2015F121</mixed-citation></ref><ref id="scirp.96177-ref25"><label>25</label><mixed-citation publication-type="other" xlink:type="simple">Maldonado, S., et al. (2015) Profit-Based Feature Selection Using Support Vector Machines—General Framework and an Application for Customer Retention. Applied Soft Computing, 35, 740-748. https://doi.org/10.1016/j.asoc.2015.05.058</mixed-citation></ref></ref-list></back></article>