<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article"><front><journal-meta><journal-id journal-id-type="publisher-id">OJS</journal-id><journal-title-group><journal-title>Open Journal of Statistics</journal-title></journal-title-group><issn pub-type="epub">2161-718X</issn><publisher><publisher-name>Scientific Research Publishing</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.4236/ojs.2020.104039</article-id><article-id pub-id-type="publisher-id">OJS-101976</article-id><article-categories><subj-group subj-group-type="heading"><subject>Articles</subject></subj-group><subj-group subj-group-type="Discipline-v2"><subject>Physics&amp;Mathematics</subject></subj-group></article-categories><title-group><article-title>
 
 
  Empirical Study on the Sustainable Development of Domestic Tourism
 
</article-title></title-group><contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Shichang</surname><given-names>Shen</given-names></name><xref ref-type="aff" rid="aff1"><sub>1</sub></xref><xref ref-type="corresp" rid="cor1"><sup>*</sup></xref></contrib></contrib-group><aff id="aff1"><label>1</label><addr-line>School of Mathematics and Statistics, Qinghai Nationalities University, Xining, China</addr-line></aff><pub-date pub-type="epub"><day>10</day><month>07</month><year>2020</year></pub-date><volume>10</volume><issue>04</issue><fpage>651</fpage><lpage>658</lpage><history><date date-type="received"><day>29,</day>	<month>June</month>	<year>2020</year></date><date date-type="rev-recd"><day>2,</day>	<month>August</month>	<year>2020</year>	</date><date date-type="accepted"><day>5,</day>	<month>August</month>	<year>2020</year></date></history><permissions><copyright-statement>&#169; Copyright  2014 by authors and Scientific Research Publishing Inc. </copyright-statement><copyright-year>2014</copyright-year><license><license-p>This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/</license-p></license></permissions><abstract><p>
 
 
  Taking the number of domestic tourists as the research object, an appropriate model is established to analyze the sustainable development of China’s tourism industry. The number of domestic tourists in China from 1985 to 2015 is taken as the sample, and candidate ARIMA models are estimated using Eviews 6.0, from which the ARIMA(0, 2, 1) model is selected. Testing shows that the model fits well (MAPE = 7.363) and that the out-of-sample prediction accuracy is high (99.56%). The ARIMA model’s short-term prediction of the number of tourists is therefore reasonable.
 
</p></abstract><kwd-group><kwd>Number of Tourists</kwd><kwd>ARIMA Model</kwd><kwd>Predictive Analysis</kwd></kwd-group></article-meta></front><body><sec id="s1"><title>1. Introduction</title><p>Rising incomes, higher consumption levels, and national policies have made tourism develop rapidly. During the “Twelfth Five-Year Plan” period, the scale of China’s tourism industry continued to expand, and the added value of tourism and related industries accounted for 4.33% of GDP. The tourism industry has gradually become a new growth point of the national economy. Its development brings substantial economic benefits to the country and to individuals and can improve the people’s happiness index, so it cannot be ignored. To promote the healthy and sustainable development of tourism, we must first make a reasonable forecast of the number of future tourists. Such a forecast helps government tourism departments formulate tourism development strategies, helps managers of tourism enterprises adjust marketing policies to maximize profits, and can serve as a reference for those who intend to travel. At present, the ARIMA model has been widely used in other related fields [<xref ref-type="bibr" rid="scirp.101976-ref1">1</xref>] [<xref ref-type="bibr" rid="scirp.101976-ref2">2</xref>] [<xref ref-type="bibr" rid="scirp.101976-ref3">3</xref>], but previous studies of tourism have mostly used the BP neural network model and grey system theory [<xref ref-type="bibr" rid="scirp.101976-ref4">4</xref>] [<xref ref-type="bibr" rid="scirp.101976-ref5">5</xref>]. 
Therefore, this article uses the relevant sample data and the ARIMA model to analyze the future development trend of the number of tourists in China, and puts forward suggestions to promote the sustainable development of China’s tourism industry and to provide a reference for government tourism departments and tourism enterprise managers.</p></sec><sec id="s2"><title>2. Empirical Analysis</title><p>This article takes the number of domestic tourists in China from 1985 to 2015 as a sample and uses Eviews 6.0 to analyze the data [<xref ref-type="bibr" rid="scirp.101976-ref6">6</xref>]. A prediction model is established for the sequence <inline-formula><tex-math>Y_t</tex-math></inline-formula> (unit: million person-times) of domestic tourists from 1985 to 2011. The numbers of domestic tourists from 2012 to 2015 are reserved to test the predictive performance of the model. The data used are from China Statistical Yearbook 2015.</p><sec id="s2_1"><title>2.1. Data Preprocessing</title><p>The trend of domestic tourist numbers from 1985 to 2011 (<xref ref-type="fig" rid="fig1">Figure 1</xref>) shows that the series grows roughly exponentially and therefore appears to be non-stationary. To check the stationarity of the domestic tourist number sequence <inline-formula><tex-math>Y_t</tex-math></inline-formula> formally, an ADF unit root test is performed; the results are shown in <xref ref-type="table" rid="table1">Table 1</xref>.</p><p>From the unit root test results in <xref ref-type="table" rid="table1">Table 1</xref>, the value of the t statistic is 8.084385, which is larger than the critical values at the 1%, 5%, and 10% significance levels. The null hypothesis of a unit root therefore cannot be rejected, so the original sequence <inline-formula><tex-math>Y_t</tex-math></inline-formula> is non-stationary. To make the sequence stationary, a logarithmic transformation and second-order differencing are applied to the original data. The processed sequence is recorded as <inline-formula><tex-math>W_t</tex-math></inline-formula>. 
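</p><p>The preprocessing step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name log_second_difference and the demonstration series y_demo are hypothetical, the logarithm is assumed to be taken before differencing, and the analysis in this paper was carried out in Eviews 6.0, not with this code.</p><preformat>
```python
import math

def log_second_difference(y):
    """Second-order difference of the natural log of a positive series."""
    log_y = [math.log(v) for v in y]
    first = [b - a for a, b in zip(log_y, log_y[1:])]
    return [b - a for a, b in zip(first, first[1:])]

# Hypothetical series with exact 10% annual growth (not the paper's data).
y_demo = [100.0 * 1.1 ** t for t in range(27)]
w_demo = log_second_difference(y_demo)

# Two observations are lost to second-order differencing.
print(len(w_demo))
# Pure exponential growth is removed entirely by the transformation.
print(all(math.isclose(w, 0.0, abs_tol=1e-9) for w in w_demo))
```
</preformat><p>With 27 annual observations (1985 to 2011), the transformed series has 25 values, which agrees with the residual sample size of 25 reported in Section 2.4.</p><p>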
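An ADF unit root test is then performed on the <inline-formula><tex-math>W_t</tex-math></inline-formula> sequence to check whether the processed sequence is stationary; the results are shown in <xref ref-type="table" rid="table2">Table 2</xref> and <xref ref-type="fig" rid="fig2">Figure 2</xref>. According to the unit root test results in <xref ref-type="table" rid="table2">Table 2</xref>, the value of the t statistic is −7.857, which is smaller than the critical values at the 1%, 5%, and 10% significance levels, and the P value is almost zero. The null hypothesis of a unit root is therefore rejected, so the processed sequence <inline-formula><tex-math>W_t</tex-math></inline-formula> is a stationary sequence.</p>
<table-wrap id="table1" ><label><xref ref-type="table" rid="table1">Table 1</xref></label><caption><title>ADF unit root test of <inline-formula><tex-math>Y_t</tex-math></inline-formula></title></caption><table><thead><tr><th align="center" valign="middle" ></th><th align="center" valign="middle" >t-Statistic</th><th align="center" valign="middle" >Prob.</th></tr></thead><tbody><tr><td align="center" valign="middle" >Augmented Dickey-Fuller test statistic</td><td align="center" valign="middle" >8.084</td><td align="center" valign="middle" >1.000</td></tr><tr><td align="center" valign="middle" >Test critical value, 1% level</td><td align="center" valign="middle" >−2.657</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >Test critical value, 5% level</td><td align="center" valign="middle" >−1.954</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >Test critical value, 10% level</td><td align="center" valign="middle" >−1.609</td><td align="center" valign="middle" ></td></tr></tbody></table></table-wrap>
<table-wrap id="table2" ><label><xref ref-type="table" rid="table2">Table 2</xref></label><caption><title>ADF unit root test for <inline-formula><tex-math>W_t</tex-math></inline-formula></title></caption><table><thead><tr><th align="center" valign="middle" ></th><th align="center" valign="middle" >t-Statistic</th><th align="center" valign="middle" >Prob.</th></tr></thead><tbody><tr><td align="center" valign="middle" >Augmented Dickey-Fuller test statistic</td><td align="center" valign="middle" >−7.857</td><td align="center" valign="middle" >0.000</td></tr><tr><td align="center" valign="middle" >Test critical value, 1% level</td><td align="center" valign="middle" >−3.738</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >Test critical value, 5% level</td><td align="center" valign="middle" >−2.992</td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >Test critical value, 10% level</td><td align="center" valign="middle" >−2.636</td><td align="center" valign="middle" ></td></tr></tbody></table></table-wrap>
</sec><sec id="s2_2"><title>2.2. Model Recognition</title><p>The autocorrelation function (ACF) and partial autocorrelation function (PACF) are the most important tools for identifying ARIMA models [<xref ref-type="bibr" rid="scirp.101976-ref7">7</xref>]. In Eviews 6.0, the sample autocorrelation and partial autocorrelation diagrams are usually used to identify a model and determine its order. The PAC column <inline-formula><tex-math>\varphi_{kk}</tex-math></inline-formula> and the AC column <inline-formula><tex-math>\rho_k</tex-math></inline-formula> are significantly different from 0 when <inline-formula><tex-math>k = 1</tex-math></inline-formula>, so <inline-formula><tex-math>p = 1</tex-math></inline-formula> and <inline-formula><tex-math>q = 1</tex-math></inline-formula> are considered. In addition, following the recognition principle of the ARIMA model, the autocorrelation function <inline-formula><tex-math>\rho_k</tex-math></inline-formula> and the partial autocorrelation function <inline-formula><tex-math>\varphi_{kk}</tex-math></inline-formula> are checked as follows:</p><p>When <inline-formula><tex-math>m = 1</tex-math></inline-formula> and <inline-formula><tex-math>k = 1</tex-math></inline-formula>, the proportion of <inline-formula><tex-math>\rho_{k+1}, \rho_{k+2}, \cdots, \rho_{k+M}</tex-math></inline-formula> (where <inline-formula><tex-math>M = [\sqrt{N}] = [\sqrt{27}] = 5</tex-math></inline-formula> and <inline-formula><tex-math>N</tex-math></inline-formula> is the sample size of the <inline-formula><tex-math>Y_t</tex-math></inline-formula> sequence) meeting</p><p><disp-formula><tex-math>\left| \rho_{k+i} \right| \le \left[ \frac{1}{N} \left( 1 + 2 \sum_{l=1}^{m} \hat{\rho}_l^{2} \right) \right]^{1/2}, \quad i = 1, 2, \cdots, M</tex-math></disp-formula></p><p>and the proportion of <inline-formula><tex-math>\rho_{k+1}, \rho_{k+2}, \cdots, \rho_{k+M}</tex-math></inline-formula> meeting</p><p><disp-formula><tex-math>\left| \rho_{k+i} \right| \le 2 \left[ \frac{1}{N} \left( 1 + 2 \sum_{l=1}^{m} \hat{\rho}_l^{2} \right) \right]^{1/2}, \quad i = 1, 2, \cdots, M</tex-math></disp-formula></p><p>is 100% &gt; 95%, so <inline-formula><tex-math>\rho_k</tex-math></inline-formula> is truncated in one step.</p><p>When <inline-formula><tex-math>m = 1</tex-math></inline-formula> and <inline-formula><tex-math>k = 1</tex-math></inline-formula>, the percentages of <inline-formula><tex-math>\varphi_{k+1,k+1}, \varphi_{k+2,k+2}, \cdots, \varphi_{k+M,k+M}</tex-math></inline-formula> satisfying <inline-formula><tex-math>|\varphi_{kk}| &gt; 1/\sqrt{N}</tex-math></inline-formula> and <inline-formula><tex-math>|\varphi_{kk}| &gt; 2/\sqrt{N}</tex-math></inline-formula> are 20% and 0%, respectively. The former is less than 31.7% and the latter is less than 4.5%, so <inline-formula><tex-math>\varphi_{kk}</tex-math></inline-formula> is truncated in one step.</p><p>In summary, the autocorrelation function <inline-formula><tex-math>\rho_k</tex-math></inline-formula> is truncated in one step, and the partial autocorrelation function <inline-formula><tex-math>\varphi_{kk}</tex-math></inline-formula> is also truncated in one step, which is consistent with the result obtained by visually inspecting the correlation diagram of the <inline-formula><tex-math>W_t</tex-math></inline-formula> series. Accordingly, the models that may be suitable are ARIMA(1, 2, 1), ARIMA(1, 2, 0), and ARIMA(0, 2, 1).</p></sec><sec id="s2_3"><title>2.3. Model Establishment</title><p>For the ARIMA model, the adjusted R<sup>2</sup>, AIC, and SC criteria are all important factors to consider when choosing a model. 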
When judging the quality of a model by the AIC and SC criteria, the model with the smaller AIC and SC values is generally considered better. It can be seen from <xref ref-type="table" rid="table3">Table 3</xref> that, among the three models, the AIC value and the SC value of the ARIMA(0, 2, 1) model are the smallest. For the adjusted coefficient of determination R<sup>2</sup>, a larger value indicates a better fit, and among the three models the adjusted R<sup>2</sup> of the ARIMA(0, 2, 1) model is the largest.</p><table-wrap id="table3" ><label><xref ref-type="table" rid="table3">Table 3</xref></label><caption><title>Model comparison table</title></caption><table><thead><tr><th align="center" valign="middle" >Model</th><th align="center" valign="middle" >Inverted Roots</th><th align="center" valign="middle" >Adjusted R<sup>2</sup></th><th align="center" valign="middle" >AIC</th><th align="center" valign="middle" >SC</th></tr></thead><tbody><tr><td align="center" valign="middle" >ARIMA(1, 2, 1)</td><td align="center" valign="middle" ><inline-formula><tex-math>|\lambda_i| &lt; 1</tex-math></inline-formula></td><td align="center" valign="middle" >0.357</td><td align="center" valign="middle" >−1.585</td><td align="center" valign="middle" >−1.486</td></tr><tr><td align="center" valign="middle" >ARIMA(1, 2, 0)</td><td align="center" valign="middle" ><inline-formula><tex-math>|\lambda_i| &lt; 1</tex-math></inline-formula></td><td align="center" valign="middle" >0.231</td><td align="center" valign="middle" >−1.444</td><td align="center" valign="middle" >−1.395</td></tr><tr><td align="center" valign="middle" >ARIMA(0, 2, 1)</td><td align="center" valign="middle" ><inline-formula><tex-math>|\lambda_i| &lt; 1</tex-math></inline-formula></td><td align="center" valign="middle" >0.380</td><td align="center" valign="middle" >−1.697</td><td align="center" valign="middle" >−1.648</td></tr></tbody></table></table-wrap><p>In summary, the ARIMA(0, 2, 1) model should be selected. 
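</p><p>As a consistency check on the criteria used for the comparison, the AIC and SC values of the selected model can be recomputed from the log likelihood reported in Table 4. The sketch below assumes the per-observation form of the criteria (AIC = −2l/T + 2k/T and SC = −2l/T + k ln(T)/T); this form is an assumption, since the paper does not state the formula, and the function name eviews_style_criteria is hypothetical.</p><preformat>
```python
import math

def eviews_style_criteria(log_likelihood, n_obs, n_params):
    """Per-observation AIC and SC (Schwarz) values.

    Assumed form (not stated in the paper):
    AIC = -2*l/T + 2*k/T and SC = -2*l/T + k*ln(T)/T.
    """
    base = -2.0 * log_likelihood / n_obs
    aic = base + 2.0 * n_params / n_obs
    sc = base + n_params * math.log(n_obs) / n_obs
    return aic, sc

# Inputs read from the paper: log likelihood 22.213 (Table 4),
# 25 observations (Section 2.4), one estimated MA(1) coefficient.
aic, sc = eviews_style_criteria(22.213, 25, 1)
print(round(aic, 3), round(sc, 3))  # -1.697 -1.648
```
</preformat><p>Under this assumption, the recomputed values agree with the AIC (−1.697) and SC (−1.648) entries reported for ARIMA(0, 2, 1) in Table 3 and Table 4.</p><p>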
In addition, the inverse roots of the lag polynomial of the ARIMA(0, 2, 1) model all have modulus less than 1, so the estimated process is invertible and meets the stability requirement.</p><p>The parameter estimation results of the ARIMA(0, 2, 1) model are shown in <xref ref-type="table" rid="table4">Table 4</xref>.</p></sec><sec id="s2_4"><title>2.4. Model Residual Sequence Test</title><p>After the model is preliminarily identified as ARIMA(0, 2, 1), it must also pass an adequacy test, that is, an independence test of the model residual series <inline-formula><tex-math>a_t</tex-math></inline-formula>, to determine whether the time series is properly described by this model and whether the model needs further improvement.</p><p>The <inline-formula><tex-math>\chi^2</tex-math></inline-formula> test is performed in Eviews 6.0. The sample size of the residual sequence is 25, and the maximum lag is taken as [25/10]. The P value corresponding to the Q test statistic is 0.771, so the null hypothesis that the residual sequence is purely random cannot be rejected; that is, the model’s residual sequence <inline-formula><tex-math>a_t</tex-math></inline-formula> can be regarded as a white noise sequence.</p></sec></sec><sec id="s3"><title>3. Model Prediction</title><p>Since the model has passed the above tests, the ARIMA(0, 2, 1) model can be used for short-term prediction. To assess the model, we first use the mean absolute percentage error (MAPE) and the Theil inequality coefficient (TIC) [<xref ref-type="bibr" rid="scirp.101976-ref8">8</xref>] to test the model fitting effect, where</p><p><disp-formula><tex-math>\mathrm{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|, \qquad \mathrm{TIC} = \frac{\sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}}{\sqrt{\frac{1}{n} \sum_{i=1}^{n} y_i^2} + \sqrt{\frac{1}{n} \sum_{i=1}^{n} \hat{y}_i^2}}.</tex-math></disp-formula></p><p>Here <inline-formula><tex-math>y_i</tex-math></inline-formula> is the real value, <inline-formula><tex-math>\hat{y}_i</tex-math></inline-formula> is the value given by the model, and <inline-formula><tex-math>n</tex-math></inline-formula> is the number of observations. The ARIMA(0, 2, 1) model is used to obtain the predicted values of the number of domestic tourists from 1985 to 2011, which are compared with the real values (see <xref ref-type="fig" rid="fig3">Figure 3</xref>), and the MAPE and TIC values are calculated. 
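</p><p>The two fit measures defined above can be sketched directly in Python. This is a minimal illustration, not the Eviews implementation used in the paper: the function names are hypothetical, the demonstration values are invented for the example, and the TIC is written in its usual square-root form.</p><preformat>
```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    n = len(actual)
    return 100.0 / n * sum(abs((y - f) / y) for y, f in zip(actual, predicted))

def theil_inequality_coefficient(actual, predicted):
    """Theil inequality coefficient: 0 is a perfect fit, 1 is the worst case."""
    n = len(actual)
    rmse = math.sqrt(sum((y - f) ** 2 for y, f in zip(actual, predicted)) / n)
    scale = (math.sqrt(sum(y ** 2 for y in actual) / n)
             + math.sqrt(sum(f ** 2 for f in predicted) / n))
    return rmse / scale

# Hypothetical values for illustration only (not the paper's data).
actual_demo = [100.0, 200.0, 400.0]
predicted_demo = [110.0, 190.0, 400.0]

print(round(mape(actual_demo, predicted_demo), 3))  # 5.0
print(theil_inequality_coefficient(actual_demo, predicted_demo))
```
</preformat><p>Both measures are scale free, which is why they can be compared against fixed thresholds such as MAPE below 10 and TIC close to 0.</p><p>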
It can be seen from <xref ref-type="fig" rid="fig3">Figure 3</xref> that the predicted value curve (X) and the true value curve (Y) have a high degree of coincidence, and their trends are basically the same. The residual curve (RESSID) is almost a straight line, and the mean absolute percentage error (MAPE) is 7.363 &lt; 10, which indicates that the model fits well.</p><table-wrap id="table4" ><label><xref ref-type="table" rid="table4">Table 4</xref></label><caption><title>ARIMA(0, 2, 1) model parameter estimation results</title></caption><table><thead><tr><th align="center" valign="middle" >Variable</th><th align="center" valign="middle" >Coefficient</th><th align="center" valign="middle" >Std. Error</th><th align="center" valign="middle" >t-Statistic</th><th align="center" valign="middle" >Prob.</th></tr></thead><tbody><tr><td align="center" valign="middle" >MA(1)</td><td align="center" valign="middle" >−0.922</td><td align="center" valign="middle" >0.053</td><td align="center" valign="middle" >−17.257</td><td align="center" valign="middle" >0.000</td></tr><tr><td align="center" valign="middle" >R-squared</td><td align="center" valign="middle" >0.380</td><td align="center" valign="middle"  colspan="2"  >Mean dependent var</td><td align="center" valign="middle" >0.004</td></tr><tr><td align="center" valign="middle" >Adjusted R-squared</td><td align="center" valign="middle" >0.380</td><td align="center" valign="middle"  colspan="2"  >S.D. dependent var</td><td align="center" valign="middle" >0.129</td></tr><tr><td align="center" valign="middle" >S.E. of regression</td><td align="center" valign="middle" >0.102</td><td align="center" valign="middle"  colspan="2"  >Akaike info criterion</td><td align="center" valign="middle" >−1.697</td></tr><tr><td align="center" valign="middle" >Sum squared resid</td><td align="center" valign="middle" >0.248</td><td align="center" valign="middle"  colspan="2"  >Schwarz criterion</td><td align="center" valign="middle" >−1.648</td></tr><tr><td align="center" valign="middle" >Log likelihood</td><td align="center" valign="middle" >22.213</td><td align="center" valign="middle"  colspan="2"  >Hannan-Quinn criter.</td><td align="center" valign="middle" >−1.684</td></tr><tr><td align="center" valign="middle" >Durbin-Watson stat</td><td align="center" valign="middle" >1.731</td><td align="center" valign="middle" ></td><td align="center" valign="middle" ></td><td align="center" valign="middle" ></td></tr><tr><td align="center" valign="middle" >Inverted MA Roots</td><td align="center" valign="middle"  colspan="2"  >0.920</td><td align="center" valign="middle" ></td><td align="center" valign="middle" ></td></tr></tbody></table></table-wrap><p>
In addition, TIC = 0.0398 &lt; 1 and is close to 0, which means that the difference between the predicted value and the true value is very small, and the model is suitable.</p><p>Next, the ARIMA(0, 2, 1) model is used to make dynamic predictions for the 4 observations following the <inline-formula><tex-math>Y_t</tex-math></inline-formula> series, and the predicted values are compared with the real values to obtain the absolute and relative errors (see <xref ref-type="table" rid="table5">Table 5</xref>).</p><p>It can be seen from <xref ref-type="table" rid="table5">Table 5</xref> that the relative errors of the out-of-sample predictions of the model are all less than 1% and the average relative error is about 0.44%, so the difference between the predicted value and the true value is very small and the prediction accuracy of the ARIMA(0, 2, 1) model is high.</p><p>The above analysis shows that the model established for the number of domestic tourists is suitable. We now make short-term predictions of the number of domestic tourists in China from 2016 to 2020 (see <xref ref-type="table" rid="table6">Table 6</xref>). 
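</p><p>The out-of-sample figures quoted from Table 5 can be checked with a few lines of arithmetic. The sketch below uses only the actual and predicted values listed in Table 5; the variable names are hypothetical, and reading the 99.56% accuracy as 100% minus the average relative error is an assumption, since the paper does not define it explicitly.</p><preformat>
```python
# Actual values and dynamic predictions from Table 5 (million person-times).
actual = [2957.0, 3262.0, 3611.0, 4000.0]
predicted = [2934.399, 3260.393, 3622.603, 4025.052]

# Relative error of each out-of-sample year (2012 to 2015), in percent.
relative_errors = [abs(a - p) / a * 100.0 for a, p in zip(actual, predicted)]
mean_relative_error = sum(relative_errors) / len(relative_errors)

print([round(e, 3) for e in relative_errors])  # [0.764, 0.049, 0.321, 0.626]
print(round(mean_relative_error, 2))           # 0.44
# Assumed definition of accuracy: 100% minus the average relative error.
print(round(100.0 - mean_relative_error, 2))   # 99.56
```
</preformat><p>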
To observe and analyze the changes in the number of tourists more intuitively, the forecast values are plotted together with the previous values (see <xref ref-type="fig" rid="fig4">Figure 4</xref>). It can be seen from <xref ref-type="fig" rid="fig4">Figure 4</xref> that in the next five years the number of domestic tourists will continue to increase along its historical trend.</p><table-wrap id="table5" ><label><xref ref-type="table" rid="table5">Table 5</xref></label><caption><title>ARIMA(0, 2, 1) model out-of-sample prediction error analysis table (Unit: million person-times)</title></caption><table><thead><tr><th align="center" valign="middle" >Time</th><th align="center" valign="middle" >Actual value</th><th align="center" valign="middle" >Dynamic prediction</th><th align="center" valign="middle" >Absolute error</th><th align="center" valign="middle" >Relative error (%)</th></tr></thead><tbody><tr><td align="center" valign="middle" >2012</td><td align="center" valign="middle" >2957</td><td align="center" valign="middle" >2934.399</td><td align="center" valign="middle" >22.601</td><td align="center" valign="middle" >0.764</td></tr><tr><td align="center" valign="middle" >2013</td><td align="center" valign="middle" >3262</td><td align="center" valign="middle" >3260.393</td><td align="center" valign="middle" >1.607</td><td align="center" valign="middle" >0.049</td></tr><tr><td align="center" valign="middle" >2014</td><td align="center" valign="middle" >3611</td><td align="center" valign="middle" >3622.603</td><td align="center" valign="middle" >−11.603</td><td align="center" valign="middle" >0.321</td></tr><tr><td align="center" valign="middle" >2015</td><td align="center" valign="middle" >4000</td><td align="center" valign="middle" >4025.052</td><td align="center" valign="middle" >−25.052</td><td align="center" valign="middle" >0.626</td></tr></tbody></table></table-wrap><table-wrap id="table6" ><label><xref ref-type="table" rid="table6">Table 6</xref></label><caption><title>Forecast results of the number of domestic tourists from 2016 to 2020 (Unit: million person-times)</title></caption><table><thead><tr><th align="center" valign="middle" >Time</th><th align="center" valign="middle" >2016</th><th align="center" valign="middle" >2017</th><th align="center" valign="middle" >2018</th><th align="center" valign="middle" >2019</th><th align="center" valign="middle" >2020</th></tr></thead><tbody><tr><td align="center" valign="middle" >Predictive value</td><td align="center" valign="middle" >4472.211</td><td align="center" valign="middle" >4969.046</td><td align="center" valign="middle" >5521.077</td><td align="center" valign="middle" >6134.435</td><td align="center" valign="middle" >6815.933</td></tr></tbody></table></table-wrap></sec><sec id="s4"><title>4. Conclusions and Recommendations</title><p>1) Based on the sample data of the number of domestic tourists from 1985 to 2015, an ARIMA(0, 2, 1) model was established through comparative analysis. The mean absolute percentage error (MAPE) of the model and the Theil inequality coefficient (TIC) are 7.363 and 0.0398, respectively. The data from 2012 to 2015 were predicted, and the predicted values were compared with the reserved actual values for the same period; the model prediction accuracy reached 99.56%. These test results mean that the model fits well and the prediction accuracy is high. 2) The number of domestic tourists from 2016 to 2020 was forecast and compared with historical data (see <xref ref-type="table" rid="table6">Table 6</xref> and <xref ref-type="fig" rid="fig4">Figure 4</xref>). In the next 5 years, the overall trend of tourist numbers is the same as in previous years. 
The number of tourists continues to grow, and the growth rate is slightly larger than before, which will promote the economic development of our country.</p><p>Based on the above conclusions, the author puts forward the following suggestions for reference: 1) Avoid overcrowding and improve tourist satisfaction in scenic areas. 2) In the face of the rapidly increasing number of tourists, the scale of tourist attractions should be expanded and the activities offered in them increased, to alleviate the pressure of “crowding” in the tourist attractions. 3) Improve tourism service facilities, appropriately reduce the price of products in scenic spots, and promote consumption. 4) Expand the space for tourism development, increase innovation in tourism methods and types, and promote sustainable development of the tourism industry.</p></sec><sec id="s5"><title>Funds</title><p>This work is supported by the National Natural Science Foundation of China (No. 11561056) and Natural Science Foundation of Qinghai (No. 2016-ZJ-914).</p></sec><sec id="s6"><title>Conflicts of Interest</title><p>The author declares no conflicts of interest regarding the publication of this paper.</p></sec><sec id="s7"><title>Cite this paper</title><p>Shen, S.C. (2020) Empirical Study on the Sustainable Development of Domestic Tourism. Open Journal of Statistics, 10, 651-658. https://doi.org/10.4236/ojs.2020.104039</p></sec></body><back><ref-list><title>References</title><ref id="scirp.101976-ref1"><label>1</label><mixed-citation publication-type="other" xlink:type="simple">Chen, P., Wu, L. and Song, H. (2012) Forecast of Inbound Tourists in Anhui Province Based on ARIMA Model. Journal of Anhui Agricultural University (Social Science Edition), No. 1, 32-35.</mixed-citation></ref><ref id="scirp.101976-ref2"><label>2</label><mixed-citation publication-type="other" xlink:type="simple">Li, G., Zhao, J.L. and Su, Y. (2012) Prediction of Handheld Order Volume of World Container Ships Based on ARMA Model. 
Research on Science and Technology Management, No. 16, 7-11.</mixed-citation></ref><ref id="scirp.101976-ref3"><label>3</label><mixed-citation publication-type="other" xlink:type="simple">Liu, L.P. (2011) Time Series Model and Forecast of Total Retail Sales of Social Consumer Goods in China. Economic Forum, No. 6, 3-6.</mixed-citation></ref><ref id="scirp.101976-ref4"><label>4</label><mixed-citation publication-type="other" xlink:type="simple">Guo, Q.C., Kong, L.J. and Cui, W.J. (2011) Forecast of Domestic Tourists Based on BP Neural Network Model. Value Engineering, No. 27, 7-8.</mixed-citation></ref><ref id="scirp.101976-ref5"><label>5</label><mixed-citation publication-type="other" xlink:type="simple">Liu, H.M. and Fan, S.H. (2010) Forecast and Analysis of Guangzhou Tourism Reception Number Based on Grey System Theory. Statistics and Decision, No. 17, 64-66.</mixed-citation></ref><ref id="scirp.101976-ref6"><label>6</label><mixed-citation publication-type="other" xlink:type="simple">Yi, D.H. (2008) Data Analysis and EViews Application. Renmin University of China Press, Beijing.</mixed-citation></ref><ref id="scirp.101976-ref7"><label>7</label><mixed-citation publication-type="other" xlink:type="simple">Wang, Z.L. (2010) Applied Time Series. China Statistics Press, Beijing.</mixed-citation></ref><ref id="scirp.101976-ref8"><label>8</label><mixed-citation publication-type="other" xlink:type="simple">Amirhan, A. and Liu, W.Z. (2014) Analysis and Forecast of Xinjiang Mutton Yield Based on ARIMA Model. Heilongjiang Animal Husbandry and Veterinary Medicine (Exploration and Research), No. 8, 16-19.</mixed-citation></ref></ref-list></back></article>