Estimation of Growth Model for Population of Ethiopia Using Least Square Method

This study models the population of Ethiopia using different growth models and estimates the model parameters by the least squares method. The models applied were the Malthus growth model, the Logistic growth model and the General growth model. To identify the model that best predicts the actual population, measures of accuracy were used, and the model that best satisfies these measures is taken as the best statistical model. The results of the analysis are presented in tables and graphs, which make it easy to compare the effectiveness of the models. MAPE, RSE, MAD and R2 were used to measure the accuracy of the models. All three models were fitted to the population of Ethiopia from 1980 to 2020 inclusive, with data obtained from the International Data Base (IDB). RStudio 3.6.3 was used to estimate the model parameters using simple code. The study proposes projecting the population of Ethiopia with the General growth model, which performed best on the accuracy measures and is therefore more effective and efficient than the other models. This model had the smallest RSE (492,155), MAPE (0.75%) and MAD (379,942), as well as the highest R2 (99.97%), relative to the other models.

order to estimate the model parameters with simple built-in functions of RStudio. The results are conveyed using tables and graphs, which give essential insight into the model parameters and allow a comparison among the models for the Ethiopian population. The parameters of all models were estimated using the least squares technique (Mulugeta et al, 2020). The next section explains the development of the models, parameter estimation and model selection methods.

Development of the models

Malthus growth model
Thomas R. Malthus (1798) proposed a mathematical model of population growth under the assumption that the population grows at a constant rate proportional to the current population size. The Malthus model has been applied to the population of Ethiopia, where it performed better for projection (Sintayehu A, 2016), and it has also been used to project the population of Tanzania (Andongwisye et al, 2019). Malthus did not take into account that, in any given environment, population growth may stop due to population density or competition for resources (Al-Eideh and AlOmar 2019). The model is expressed as a simple differential equation:

$\frac{dP}{dt} = rP$ (1)

where $P$ is the total population size and $r$ is the constant growth rate, defined as the difference between the birth rate and the death rate for a given population size. Rearranging and integrating both sides of equation (1) gives

$\int \frac{dP}{P} = \int r\,dt \;\Rightarrow\; \ln P = rt + c$

Finally, the exponential model is expressed as

$P(t) = P_0 e^{rt}$ (2)

where $P_0$ represents the population at the specified time $t = 0$, which means that $P_0$ is a constant. Taking the initial time as $t = 0$ and rescaling time accordingly, equation (2) is expressed as

$P(t) = P_0 \cdot e^{rt}$ (3)

Logistic growth model
The logistic growth model was proposed by Verhulst in 1845 (Marsden et al. 2003). The model is an extension of the Malthusian model. Its main feature is the inclusion of a limiting value, the size of population that an environment can support (Andongwisye J and Allen R, 2019). Ali et al. (2015) studied census data and predicted the population of Bangladesh using the logistic model. They used a curve fitting method and compared the predictions for the cases where the carrying capacity is known and unknown. The model suggests that population growth depends on the carrying capacity and the maximum rate of growth. The model was applied to the population of Ethiopia and performed less well than the exponential model (Sintayehu A, 2016). The logistic model extends the exponential model by including the limiting conditions that were excluded from it, as follows:

$\frac{dP}{dt} = rP\left(1 - \frac{P}{K}\right), \quad r, K > 0$ (4)

where $K$ is the maximum sustainable population (carrying capacity), assumed to be a constant, $r$ is the growth rate, which is the unknown parameter in this case, and $r$ and $r/K$ are the vital constants. When $P$ is small relative to $K$, the logistic model reduces to exponential growth, whereas when $P$ is greater than $K$ the rate of growth becomes negative and the population decreases. From equation (4), the solution of the non-linear differential equation is found as follows.

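By applying calculus and some rearrangement, the logistic model for the population is given as (Sintayehu A, 2016, Mussa A. and Jung I, 2019)

$P(t) = \frac{K}{1 + \left(\frac{K}{P_0} - 1\right)e^{-rt}}$ (5)

Let $P_0$ be the population at $t = 0$, $P_1$ the population at $t = T$ and $P_2$ the population at $t = 2T$. Then

$\frac{1}{K}\left(1 - e^{-rT}\right) = \frac{1}{P_1} - \frac{e^{-rT}}{P_0}$ (6)

$\frac{1}{K}\left(1 - e^{-2rT}\right) = \frac{1}{P_2} - \frac{e^{-2rT}}{P_0}$ (7)

Dividing equation (7) by equation (6) eliminates $K$ and gives $e^{-rT} = \frac{P_0(P_2 - P_1)}{P_2(P_1 - P_0)}$; substituting this back into equation (6) yields

$K = \frac{P_1\left(P_0 P_1 - 2 P_0 P_2 + P_1 P_2\right)}{P_1^2 - P_0 P_2}$ (8)

The value of $K$ in equation (8) is a constant, since all the values in the equation are given. This implies that equation (5) can be rearranged as

$\frac{P(t) \cdot (K - P_0)}{P_0 \cdot (K - P(t))} = e^{rt}$ (9)

Let $Z(t)$ denote the left-hand side of equation (9); then equation (9) is expressed as

$Z(t) = e^{r \cdot t}$ (10)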
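As a minimal sketch (written in Python for illustration, not the R code used in the study), the closed-form solution of the logistic differential equation can be wrapped in a small function. The parameter values below are hypothetical and serve only to show that the curve starts at the initial population and levels off at the carrying capacity.

```python
import math


def logistic_population(t, K, P0, r):
    """Closed-form logistic solution P(t) = K / (1 + (K/P0 - 1) * exp(-r*t))."""
    return K / (1.0 + (K / P0 - 1.0) * math.exp(-r * t))


# Hypothetical values for illustration only (not the study's estimates).
K, P0, r = 1000.0, 100.0, 0.05

start = logistic_population(0.0, K, P0, r)    # equals the initial population P0
late = logistic_population(500.0, K, P0, r)   # approaches the carrying capacity K
print(start, late)
```

For small populations relative to K the curve grows almost exponentially, which is why the exponential and logistic fits can look similar over a short observation window.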

General growth model
This is a modification of the Malthus growth model in which $P_0$ is treated as an unknown parameter. The model was applied to the populations of Tanzania and Ethiopia as a candidate for population projection (Sintayehu A, 2016, Mussa A. and Jung I, 2019), where it performed less well than the Malthus and logistic models. On the other hand, a review (Mulugeta et al, 2020) indicates that the general growth model was the best for predicting the population of the United Republic of Tanzania. Amare and Mulugeta (2017) studied the Ethiopian population and selected an exponential trend model for projecting it. The general expression of the model is given as follows (Sintayehu A., 2016 and Mussa A. and Jung I, 2019):

$P(t) = \alpha \cdot e^{\beta \cdot t}$ (11)

where α and β are the parameters to be estimated for the prediction of the population.

Parameter Estimation of the Models
The method of least squares, developed by the French mathematician Adrien-Marie Legendre (1805), is an algorithm in regression analysis whose most important application is data fitting. The algorithm minimizes the sum of squared residuals in order to optimize the model parameters. In various areas of experimental science, optimization theory is associated with accuracy and precision, and the accuracy of the predicted output is attained by error reduction. For many observed data points, the method of least squares is arguably the most systematic procedure for fitting a unique curve. Suppose we have a set of $n$ observations, $(t_1, y_1), (t_2, y_2), \dots, (t_n, y_n)$.

Malthus growth Model
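Using equation (3), we can estimate the parameter $r$ by the method of least squares. We transform the Malthus growth model, the exponential model ($P(t) = P_0 \cdot e^{rt}$), into a linear model by taking the natural logarithm (ln) of both sides, and the linear model is given as

$\ln\frac{P(t)}{P_0} = r \cdot t$

Suppose that $y(t) = \ln\frac{P(t)}{P_0}$; thus the linear model is

$y(t) = r \cdot t$ (12)

The sum of squared errors of equation (12), with $y_i = y(t_i)$, is formulated as

$SSE(r) = \sum_{i=1}^{n}\left(y_i - r t_i\right)^2$

To estimate the parameter $r$, we minimize the sum of squared errors by differentiating with respect to $r$ and setting the derivative equal to 0:

$\frac{d\,SSE}{dr} = -2\sum_{i=1}^{n} t_i\left(y_i - r t_i\right) = 0$

so that the estimated value of $r$ can be calculated using the following equation:

$\hat{r} = \frac{\sum_{i=1}^{n} t_i \cdot y_i}{\sum_{i=1}^{n} t_i^2}$ (13)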
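The through-origin least squares estimate of the growth rate can be sketched in a few lines. The example below is in Python for illustration (the study itself used R), the function name is hypothetical, and the data are synthetic and noise-free, so the estimate should recover the rate used to generate them.

```python
import math


def fit_rate_through_origin(times, populations, P0):
    """Least-squares slope of y = r*t, where y = ln(P/P0): r = sum(t*y) / sum(t^2)."""
    ys = [math.log(p / P0) for p in populations]
    numerator = sum(t * y for t, y in zip(times, ys))
    denominator = sum(t * t for t in times)
    return numerator / denominator


# Synthetic data generated from a known rate (hypothetical values).
P0_demo, r_demo = 1000.0, 0.03
times = list(range(0, 11))
pops = [P0_demo * math.exp(r_demo * t) for t in times]

r_hat = fit_rate_through_origin(times, pops, P0_demo)
print(r_hat)
```

Because the initial population is held fixed, the fitted line has no intercept, which is why only one parameter is estimated in this model.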

Logistic growth Model
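Using equation (10), we estimate the model parameter $r$ by the method of least squares. We transform the logistic growth model in its exponential form ($Z(t) = e^{r \cdot t}$, where $Z(t) = \frac{P(t)(K - P_0)}{P_0(K - P(t))}$) into a linear model by taking the natural logarithm (ln) of both sides, and the linear model is given as

$\ln Z(t) = r \cdot t$

Suppose that $y(t) = \ln Z(t)$; thus the linear model is

$y(t) = r \cdot t$ (14)

The sum of squared errors of equation (14), with $y_i = y(t_i)$, is formulated as

$SSE(r) = \sum_{i=1}^{n}\left(y_i - r \cdot t_i\right)^2$

To estimate the parameter $r$, we minimize the sum of squared errors by differentiating with respect to $r$ and setting the derivative equal to 0:

$\frac{d\,SSE}{dr} = -2\sum_{i=1}^{n} t_i\left(y_i - r t_i\right) = 0$

so that the estimated value of $r$ can be calculated using the following equation:

$\hat{r} = \frac{\sum_{i=1}^{n} t_i \cdot y_i}{\sum_{i=1}^{n} t_i^2}$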
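The same through-origin fit can be applied to the logistic model once the carrying capacity is fixed. The sketch below is in Python for illustration and assumes the linearizing transform y = ln[P (K - P0) / (P0 (K - P))], which follows from the closed-form logistic solution; the function names and the synthetic data are hypothetical.

```python
import math


def logistic(t, K, P0, r):
    """Closed-form logistic curve used here only to generate synthetic data."""
    return K / (1.0 + (K / P0 - 1.0) * math.exp(-r * t))


def fit_logistic_rate(times, populations, K, P0):
    """Estimate r from y = r*t with y = ln(P*(K - P0) / (P0*(K - P)))."""
    ys = [math.log(p * (K - P0) / (P0 * (K - p))) for p in populations]
    return sum(t * y for t, y in zip(times, ys)) / sum(t * t for t in times)


# Synthetic, noise-free observations (hypothetical values).
K_demo, P0_demo, r_demo = 1000.0, 100.0, 0.05
times = list(range(0, 11))
pops = [logistic(t, K_demo, P0_demo, r_demo) for t in times]

r_hat = fit_logistic_rate(times, pops, K_demo, P0_demo)
print(r_hat)
```

The transform is only defined while every observation lies below K, so an underestimated carrying capacity would make the logarithm fail.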

General growth model
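Considering equation (11), we transform the general growth model ($P(t) = \alpha \cdot e^{\beta \cdot t}$) into a linear model by taking the natural logarithm (ln) of both sides in order to apply the least squares method for parameter estimation. The linear model can be written as

$\ln P(t) = \ln\alpha + \beta \cdot t$

Assume that $y = \ln P(t)$ and $a = \ln\alpha$. Generally we have

$y = a + \beta \cdot t$ (16)

We define the error associated with fitting equation (16) to the set of data by

$E(a, \beta) = \sum_{i=1}^{n}\left(y_i - (a + \beta t_i)\right)^2$

which corresponds, up to a constant factor, to the variance of the data set $\{y_1 - (a + \beta t_1), \dots, y_n - (a + \beta t_n)\}$. To get the optimal values of $a$ and $\beta$, we minimize the residual $E(a, \beta)$ by taking the partial derivatives of the error with respect to $a$ and $\beta$ and setting them equal to 0, which can be done using calculus:

$\frac{\partial E}{\partial a} = 0, \quad \frac{\partial E}{\partial \beta} = 0$

Differentiating $E(a, \beta)$ with respect to $a$ and $\beta$ and solving for $a$ and $\beta$ respectively, after some rearrangement, the estimated values of $a$ and $\beta$ are calculated as

$\hat{\beta} = \frac{n\sum_{i=1}^{n} t_i y_i - \sum_{i=1}^{n} t_i \sum_{i=1}^{n} y_i}{n\sum_{i=1}^{n} t_i^2 - \left(\sum_{i=1}^{n} t_i\right)^2}, \quad \hat{a} = \frac{\sum_{i=1}^{n} y_i - \hat{\beta}\sum_{i=1}^{n} t_i}{n}$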
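The two-parameter fit can be illustrated with the standard closed-form solution of the normal equations for a straight line fitted to ln P against t. This is a sketch in Python under that assumption, not the study's R code; the function name and the synthetic data are hypothetical.

```python
import math


def fit_general_growth(times, populations):
    """Ordinary least squares on ln P = a + beta*t; returns (alpha, beta) with alpha = exp(a)."""
    n = len(times)
    ys = [math.log(p) for p in populations]
    sum_t = sum(times)
    sum_y = sum(ys)
    sum_tt = sum(t * t for t in times)
    sum_ty = sum(t * y for t, y in zip(times, ys))
    beta = (n * sum_ty - sum_t * sum_y) / (n * sum_tt - sum_t ** 2)
    a = (sum_y - beta * sum_t) / n
    return math.exp(a), beta


# Synthetic, noise-free data from known parameters (hypothetical values).
alpha_demo, beta_demo = 500.0, 0.02
times = list(range(0, 21))
pops = [alpha_demo * math.exp(beta_demo * t) for t in times]

alpha_hat, beta_hat = fit_general_growth(times, pops)
print(alpha_hat, beta_hat)
```

Note that the fit minimizes squared errors on the logarithmic scale, so the back-transformed curve is not necessarily the least squares fit on the original population scale.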

Model Selection criteria
The study applied different models to the same dataset, which requires ways to compare the statistical models in order to select the best one for the dataset. The following statistical techniques were used to compare the models fitted to the population of Ethiopia.

Mean Absolute Percentage Error (MAPE)
Mean absolute percentage error is a measure of the accuracy of a model's predictions. It is formulated as (Sintayehu A. 2016 and Amare and Mulugeta, 2017)

$MAPE = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{P_i - \hat{P}_i}{P_i}\right|$

where $P_i$ is the actual observation of the population of Ethiopia at the $i$th time, $\hat{P}_i$ is the fitted value of the population at the $i$th time and $n$ is the number of observations. The model with the smallest MAPE predicts the population better because it has the smallest percentage error.

Residual Standard Error (RSE)
Residual standard error (residual standard deviation) measures the variation between the actual and the predicted values of the dataset. It is calculated as the square root of the sum of squared residuals, $\sum_{i=1}^{n}(P_i - \hat{P}_i)^2$, divided by the residual degrees of freedom, where $P_i$ is the actual observation of the population of Ethiopia at the $i$th time, $\hat{P}_i$ is the fitted value of the population at the $i$th time and $n$ is the number of observations. A model is said to perform best if its RSE is the smallest, which indicates that the model is the best for predicting the population (Mulugeta et al, 2020 and Amare and Mulugeta, 2017).

Mean Absolute Residual Deviation (MAD)
Mean absolute residual deviation is a measure of the accuracy of a model's predictions, defined as the average of the absolute differences between the actual and fitted values of the population. This can be expressed mathematically as (Mulugeta et al, 2020)

$MAD = \frac{1}{n}\sum_{i=1}^{n}\left|P_i - \hat{P}_i\right|$ (20)

where $P_i$ is the actual observation of the population of Ethiopia at the $i$th time, $\hat{P}_i$ is the fitted value of the population at the $i$th time and $n$ is the number of observations. The model with the smallest MAD predicts the population better because it has the smallest average absolute error (Amare and Mulugeta, 2017).

Coefficient of Determination (R2)
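This measures how well the model fits the dataset, as the proportion of the variation in the actual population that is explained by the values predicted by the model (Mulugeta et al, 2020). It is calculated as

$R^2 = \rho^2$

where $\rho$ is the correlation coefficient between the actual and predicted observations, calculated as follows:

$\rho = \frac{\sum_{i=1}^{n}(P_i - \bar{P})(\hat{P}_i - \bar{\hat{P}})}{\sqrt{\sum_{i=1}^{n}(P_i - \bar{P})^2 \sum_{i=1}^{n}(\hat{P}_i - \bar{\hat{P}})^2}}$

Here $\bar{P}$ and $\bar{\hat{P}}$ are the means of the actual and fitted values. A model with the highest R2 is the best model for the prediction of the population (Mulugeta et al, 2020).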
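The four accuracy measures can be sketched as small functions. The example is in Python for illustration; the function names and the toy data are hypothetical, and the residual standard error assumes the usual n minus number-of-parameters degrees of freedom, which the text does not state explicitly.

```python
import math


def mape(actual, fitted):
    """Mean absolute percentage error, in percent."""
    return 100.0 / len(actual) * sum(abs((a - f) / a) for a, f in zip(actual, fitted))


def mad(actual, fitted):
    """Mean absolute deviation between actual and fitted values."""
    return sum(abs(a - f) for a, f in zip(actual, fitted)) / len(actual)


def rse(actual, fitted, n_params):
    """Residual standard error with n - n_params degrees of freedom (assumed)."""
    sse = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    return math.sqrt(sse / (len(actual) - n_params))


def r_squared(actual, fitted):
    """Squared Pearson correlation between actual and fitted values."""
    mean_a = sum(actual) / len(actual)
    mean_f = sum(fitted) / len(fitted)
    cov = sum((a - mean_a) * (f - mean_f) for a, f in zip(actual, fitted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_f = sum((f - mean_f) ** 2 for f in fitted)
    return cov ** 2 / (var_a * var_f)


# Toy data for illustration only.
actual = [100.0, 200.0, 300.0, 400.0]
fitted = [110.0, 190.0, 310.0, 390.0]
print(mape(actual, fitted), mad(actual, fitted), rse(actual, fitted, 1), r_squared(actual, fitted))
```

MAPE, MAD and RSE are error measures, so smaller is better, while a larger R2 indicates a closer fit.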

Result
The unknown parameters of the models explained in section 2 were estimated using population data for the Ethiopia Federal Democratic Republic (EFDR). Using the measures of accuracy, the best model was selected and used for the projection of the future population of Ethiopia.

Malthus growth model
The model has one parameter, the growth rate of the population of the country, whose value was estimated by the method of least squares, with the result given in table 1. The estimated growth rate was 0.0276, and the parameter is statistically significant at the 0.1% level (p < 0.001) with a small standard error of the estimate (table 1: p-value < 2e-16, *** significant at 0.1%). The significance of the parameter implies that the Malthus growth model is applicable to modeling the population of Ethiopia. Therefore, the Malthus growth model of the population is given as

$P(t) = 36036457 \cdot e^{0.0276 t}$

Logistic growth model
The carrying capacity of the population of Ethiopia, which is a deterministic value in the logistic model and is treated as a constant, can be estimated at the points $(P_0, P_1, P_2)$ = (36,036,457, 62,891,069, 108,113,150). The value of $K$ is then estimated as

$K = \frac{P_1\left(P_0 \cdot P_1 - 2 \cdot P_0 \cdot P_2 + P_1 \cdot P_2\right)}{P_1^2 - P_0 \cdot P_2}$

Substituting the values into the equation, the carrying capacity of the population of Ethiopia is estimated as $K$ = 1,351,470,438, which is the maximum population the country can sustain under this model. The growth rate ($r$) of the logistic growth model was then estimated by the least squares technique. As shown in table 2, the estimated growth rate of the population of Ethiopia is 2.88%, which is a statistically significant estimate for equation (10) (p < 0.001) and has a small standard error of the estimate.
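The carrying capacity estimate can be checked numerically. The sketch below is in Python for illustration and assumes the standard three-point formula for equally spaced observations on a logistic curve, applied to the three population values quoted above; the function name is hypothetical.

```python
def carrying_capacity(P0, P1, P2):
    """Three-point estimate K = P1*(P0*P1 - 2*P0*P2 + P1*P2) / (P1^2 - P0*P2)."""
    numerator = P1 * (P0 * P1 - 2 * P0 * P2 + P1 * P2)
    denominator = P1 * P1 - P0 * P2
    return numerator / denominator


# Population values quoted in the text for the three equally spaced points.
K = carrying_capacity(36_036_457, 62_891_069, 108_113_150)
print(round(K))  # about 1.351e9, in line with the value reported in the text
```

The denominator is the difference of two large, nearly equal numbers, so the estimate is sensitive to small changes in the three input populations.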

General growth Model
We estimated α and β after transforming the general growth model into a linear model. We used R software to estimate the parameters by the least squares method. The estimates are given in table 3. As indicated in table 3, the estimates were statistically significant at the 0.1% level (p-value < 0.001); the surviving entries of table 3 are a standard error of 1.476e-04, a t value of 184.9 and a p-value < 2e-16 (*** significant at 0.1%). We then transformed the linear model back into the exponential model, with $\ln\alpha = 17.41$ and $\beta = 0.0273$, such that $\alpha = e^{17.41} = 36397112$. Therefore, the general equation for the population growth of Ethiopia is given as

$P(t) = \alpha \cdot e^{\beta \cdot t} = 36397112 \cdot e^{0.0273 \cdot t}$

We thus have three candidate models for predicting the population of Ethiopia, and we need to select the best one.
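The back-transformation from the linear fit to the exponential model can be illustrated with the values quoted above. This Python sketch assumes that t is measured in years since 1980, the first year of the data, which the text does not state explicitly; the function name is hypothetical.

```python
import math

LN_ALPHA = 17.41   # rounded intercept of the linear model, as quoted in the text
BETA = 0.0273      # estimated slope, as quoted in the text

# Back-transform the intercept; the rounded value 17.41 gives roughly 36.4 million.
alpha_from_rounded = math.exp(LN_ALPHA)


def project_population(t, alpha=36_397_112, beta=BETA):
    """Fitted general growth model P(t) = alpha * exp(beta * t)."""
    return alpha * math.exp(beta * t)


# Assumption: t = 40 corresponds to the year 2020 if t counts years since 1980.
print(alpha_from_rounded, project_population(40))
```

The small gap between exp(17.41) and the reported 36397112 is consistent with the intercept having been rounded to two decimals in the text.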
As indicated in figure 1, the models perform almost the same in predicting the actual population of Ethiopia. However, their errors relative to the actual population may not be identical, and the model with the smallest average error from the actual observations should be selected as the best model. Based on table 4 and figure 1, the General growth model performs best and is suggested for predicting the population of Ethiopia. Table 4 gives the numerical diagnostics of the models using RSE, MAD, MAPE and R2, which are essential for selecting the best model. The model with the smallest RSE, MAD and MAPE and the highest R2 is the best model to predict the population of Ethiopia. The General growth model has the smallest RSE, MAD and MAPE and also a higher R2 than the other models. Hence, $P(t) = 36397112 \cdot e^{0.0273 \cdot t}$ is the best model to predict the population of Ethiopia. Figure 2 shows the residual plot of the models, which conveys the deviation between the actual and predicted values and their distance from zero. On the basis of this plot, the General growth model is proposed, because on average it is the best relative to the other models, as supported by table 4.

Discussion
As the figures above show, the actual and predicted populations were almost the same for the Malthus, Logistic and General growth models. The study used 41 years of population data, from 1980 to 2020 inclusive, and found that the models fitted the dataset very well. The model parameters were statistically significant at the 0.1% level for all models, which supports using the models to fit the data and to project the population of Ethiopia.
The models fitted the population closely, as indicated in figure 1 by the fitted versus actual population. The models were compared using the accuracy measures to identify the best of the three. The mean absolute percentage errors (MAPE) of the Malthus, Logistic and General growth models were 0.79%, 0.78% and 0.75% respectively (table 4). The best model for population projection was found to be the General growth model, because it had the smallest MAD (379,942), MAPE (0.75%) and RSE (492,155) and the highest R2 (99.97%). This result agrees with that of Mulugeta et al (2020) but contradicts earlier results in which the Malthus growth model was selected as the best model for projecting the Ethiopian population (Sintayehu A., 2016) and the Logistic growth model was found to be the best model for projecting the population of the United Republic of Tanzania (Mussa A. et al 2019).
The second-best model in the study was the Malthus growth model, as indicated in table 4. According to the International Data Base (IDB) and Worldometer, the population growth rate of Ethiopia was approximately 2.74% per year in 2017, 2.62% in 2018 and 2.60% in 2019, compared with the estimated growth rates of 2.76% per year for the Malthus model, 2.88% for the Logistic growth model and 2.73% for the General growth model. Of the three, the growth rate of the General growth model is the closest to these approximate growth rates.

Conclusion
The study modeled the population of Ethiopia using the Malthus, Logistic and General growth models, and the conclusions are based on the results obtained in the analysis. The estimated model parameters were statistically significant, which means that all the models are useful for projecting the population of Ethiopia. The model parameters were estimated by the least squares method. The General growth model was the best model for projecting the Ethiopian population because it ranked first on the measures of accuracy. The Malthus growth model ranked second in predicting the population of Ethiopia. The study proposes the General growth model as the best model for projecting the future population of Ethiopia. As a recommendation, the model parameters (growth rates) may be estimated by various other estimation techniques, some of which may estimate them more effectively; future work should compare different estimation methods rather than relying on the least squares method alone.