A Study of Middle Class Proportion in Saudi Arabia using Engel's Coefficient

Based on the Engel coefficient, this paper presents a descriptive statistical analysis of the current state of the middle class in Saudi Arabia and measures its proportion. The research aims to identify the trend of change and the development pattern of the middle class, which is of great significance for achieving Saudi Vision 2030 and building a moderately prosperous society in all respects.


Introduction
To develop into a sustainable modern society, Saudi Arabia needs to cultivate a large intermediate social force, namely the middle class. Several questions are therefore worth studying. How large is the middle class? What is its current state of development? How is it changing? Scholars at home and abroad have studied the middle class comprehensively and in depth. Economists often start from income indicators, while sociologists often use questionnaires or surveys to study the middle class through multiple indicators. Building on this earlier research, this paper uses a single index, the Engel coefficient, to define the middle class. It presents a descriptive statistical analysis of the current state of the middle class, measures its proportion, and then studies how that proportion changes over time. The aim is to identify the development pattern of the middle class, which is of great significance for achieving Saudi Vision 2030 and building a moderately prosperous society in all respects.

Definition of Middle Class Concept
The term middle class describes individuals and households who fall between the working class and the upper class within a societal hierarchy. The relevant literature indicates that people in the middle class tend to hold college degrees in a higher proportion than those in the working class, have more income available for consumption, and may own property. They are often employed as professionals, managers and civil servants. In general, most scholars hold that the middle class is closely related to occupation and can be defined by income, property or consumption indicators. To avoid the dispersion that comes with multiple variables, this research equates the concept of the middle class with that of the middle-income group; both are referred to here as the middle class.

Data Description and Collection
Reported income can be distorted by many factors: inconsistent definitions of income, the difficulty of surveying it, regional differences in income and consumption levels, respondents' reluctance to reveal their wealth, vague or evasive answers, hidden income, and so on. This research therefore uses the Engel coefficient as its analytical tool. The Engel coefficient, the share of food expenditure in total consumption expenditure, is easier to measure accurately when defining the middle class. It also reflects a person's living habits and quality of life more systematically.
The Engel coefficient is a commonly used consumption indicator. Considering that Saudi Arabia is still in a stage of economic transition, and taking residents' own living habits into account, this research modifies the Engel coefficient criterion for the middle class. It subtracts 0.1 from the internationally standard Engel coefficient range, so that the middle class in Saudi Arabia is defined by an Engel coefficient between 0.3 and 0.4.
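The classification rule above can be illustrated with a minimal Python sketch. The function names, the household figures and the choice to treat both bounds as inclusive are assumptions made for illustration; the paper does not state how boundary values are handled.

```python
def engel_coefficient(food_expenditure, total_expenditure):
    """Share of total consumption expenditure that goes to food."""
    if total_expenditure <= 0:
        raise ValueError("total expenditure must be positive")
    return food_expenditure / total_expenditure


def is_middle_class(coefficient, lower=0.3, upper=0.4):
    """True when the coefficient lies in the 0.3-0.4 band (bounds assumed inclusive)."""
    return lower <= coefficient <= upper


# Hypothetical households: (food spending, total consumption spending).
households = [(2800, 8000), (1500, 7500), (4500, 9000)]
coefficients = [engel_coefficient(food, total) for food, total in households]
flags = [is_middle_class(c) for c in coefficients]

# Sample share of households classified as middle class.
share = sum(flags) / len(flags)
```

Only the first hypothetical household, with a coefficient of 0.35, falls inside the band, so the sample share is one third.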
The data in this investigation come from two sources. The first is macro-level data from the General Authority of Statistics and the Ministry of Economy and Planning, covering both urban and rural areas. The second is the latest micro-level data released by the Saudi General Social Survey (SGSS), namely the comprehensive social survey data for 2014 and 2016. Because students' income and consumption behavior are uncertain, student data are difficult to use as a sound basis. This research therefore selects the age-appropriate social population with statistical analysis value, that is, the non-student group aged 25-55 in the SGSS. The number of valid questionnaires was 1051 in 2014 and 825 in 2016.

Analysis of the Current Situation of Middle Class
According to the General Authority of Statistics, a very important change in Saudi Arabia's consumption structure was the decrease of the Engel coefficient from 30.1% in 2017 to 29.3% in 2018, which indicates that residents' living standards improved. In recent years, the Engel coefficients across the Kingdom, in both cities and villages, have been between 30% and 40%. However, it is clearly unreasonable to conclude from these aggregate figures that everyone in Saudi Arabia belongs to the middle class as defined by the Engel coefficient. The macro-level data only suggest that the proportion of the middle class is expanding.
According to the micro-level SGSS data (see Table 1.1), the average Engel coefficient of the age-appropriate social population in 2016 is between 30% and 40%, which suggests that Saudi Arabia has reached a modest level of prosperity. Table 1.1 also shows that the standard deviation is lower than the mean, so the dispersion of the data is low, the Engel coefficients are relatively concentrated, and consumption behavior is similar across households. Under the definition adopted here, the proportion of the middle class by consumption in Saudi Arabia reached 18.22% in 2016. Compared with 17.97% in 2014, both the Engel coefficient and the proportion of the middle class increased slightly. Because consumption levels differ by region, Engel coefficients are further calculated for three regions: developed, relatively developed and underdeveloped. In the developed region, the mean Engel coefficient fell and the proportion of the middle class decreased significantly in 2016, which indicates that the quality of life is better but the gap between rich and poor is widening. In the underdeveloped region, the mean Engel coefficient fell significantly and the proportion of the middle class rose, indicating that residents' living standards have improved.

Measurement of the Proportion of Middle Class
First, a density function is constructed to describe the distribution of the Engel coefficient. Second, this density is estimated by kernel density estimation, which yields the kernel density function of the Engel coefficient. Third, the lower and upper limits of the Engel coefficient for the middle class are set to 0.3 and 0.4. Finally, the kernel density function is numerically integrated between these limits to obtain the proportion of the middle class.
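The four steps above amount to a single integral, written out here as a sketch. The symbols $\hat{f}$ (the estimated kernel density) and $P_{\text{mid}}$ (the middle-class proportion) are notation introduced for this illustration. The trapezoidal rule on a grid $0.3 = x_0 < x_1 < \cdots < x_m = 0.4$ is shown as one possible numerical integration scheme; the paper does not state which scheme was used.

```latex
P_{\text{mid}} = \int_{0.3}^{0.4} \hat{f}(x)\,dx
\approx \sum_{j=1}^{m} \frac{\hat{f}(x_{j-1}) + \hat{f}(x_j)}{2}\,(x_j - x_{j-1})
```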

Measurement Model of Kernel Density Estimation
Kernel density estimation is a method for estimating an unknown density function and is one of the modern nonparametric methods. Let $f(x)$ be the density function of a one-dimensional population, let $X_1, X_2, \ldots, X_n$ be a sample from it, and let $K(\cdot)$ be a given Borel measurable function on $\mathbb{R}$. Let $h_n > 0$ be a constant depending on $n$ that satisfies $\lim_{n \to \infty} h_n = 0$. Then
$$\hat{f}_n(x) = \frac{1}{n h_n} \sum_{i=1}^{n} K\left(\frac{x - X_i}{h_n}\right)$$
is called the kernel density estimate of the one-dimensional population density $f(x)$. Here $K(\cdot)$ is the kernel function. The estimate is not sensitive to the choice of kernel, and when $n$ is very large the choice has little influence on the result, so the Gaussian kernel is used in this research. $h_n$ is the bandwidth, i.e. the smoothing parameter. Generally speaking, once a sample is given, the performance of the kernel estimate depends mainly on whether the bandwidth $h_n$ is chosen appropriately. The bandwidth is therefore chosen to minimize the mean integrated squared error (MISE) proposed by Rosenblatt (1956),
$$\mathrm{MISE}(h_n) = E \int \left(\hat{f}_n(x) - f(x)\right)^2 dx,$$
whose leading terms are minimized by the optimal bandwidth
$$h_n^* = \left[\frac{\int K^2(u)\,du}{\left(\int u^2 K(u)\,du\right)^2 \int \left(f''(x)\right)^2 dx}\right]^{1/5} n^{-1/5}.$$
However, this expression contains the unknown density $f(x)$. This research applies the rule of thumb proposed by Silverman (1986): if $f(x)$ is assumed to be a normal density $N(0, \sigma^2)$ and the Gaussian kernel is selected, the optimal bandwidth is
$$h_n^* = 1.06\,\sigma\,n^{-1/5}.$$
Finally, the one-dimensional kernel density function is
$$\hat{f}_n(x) = \frac{1}{n h_n^*} \sum_{i=1}^{n} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{(x - X_i)^2}{2 (h_n^*)^2}\right).$$
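The estimator described above can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions, not the paper's R code. The function names are hypothetical, the sample standard deviation stands in for the unknown $\sigma$, the trapezoidal rule stands in for the unspecified numerical integration, and the input data are synthetic rather than SGSS observations.

```python
import math
import random
import statistics


def silverman_bandwidth(sample):
    """Rule-of-thumb bandwidth 1.06 * sigma * n^(-1/5) for a Gaussian kernel.

    The sample standard deviation is used as the estimate of sigma (an assumption).
    """
    return 1.06 * statistics.stdev(sample) * len(sample) ** (-1 / 5)


def gaussian_kde(sample, x, bandwidth):
    """Gaussian kernel density estimate evaluated at a single point x."""
    total = sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample)
    return total / (len(sample) * bandwidth * math.sqrt(2 * math.pi))


def proportion_between(sample, lower=0.3, upper=0.4, steps=200):
    """Integrate the estimated density over [lower, upper] with the trapezoidal rule."""
    h = silverman_bandwidth(sample)
    width = (upper - lower) / steps
    values = [gaussian_kde(sample, lower + j * width, h) for j in range(steps + 1)]
    return width * (sum(values) - 0.5 * (values[0] + values[-1]))


# Synthetic Engel coefficients, for illustration only (not SGSS data).
random.seed(0)
synthetic = [min(max(random.gauss(0.32, 0.12), 0.01), 0.99) for _ in range(500)]

# Estimated share of the synthetic population with a coefficient in [0.3, 0.4].
share = proportion_between(synthetic)
```

Integrating the same estimate over a range that covers the whole sample returns a value close to 1, which is a quick check that the density is normalized correctly.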

Proportion of Middle Class
Engel coefficients for 2014 and 2016 are calculated from the SGSS data, and the kernel density curves are drawn with R software, as shown in Figure 1.1. The figure shows that the Engel coefficient of most people in Saudi Arabia is about 0.3. The kernel density curve is also slightly right-skewed, which shows that the proportion of people with a large food expenditure share is relatively small and that living standards vary considerably. Compared with 2014, the 2016 curve has the following characteristics. The curve shifted slightly to the right, indicating that the share of food expenditure increased and the family living standard declined slightly. The peak of the curve is slightly higher and its width slightly smaller, which indicates that the gap in Engel coefficients narrowed and the proportion of the middle class increased slightly.
Furthermore, the corresponding kernel density curves for 2014 and 2016 are drawn with R software for urban and rural areas and for developed, relatively developed and underdeveloped areas. They lead to the following conclusions. The Engel coefficient of urban families in 2016 changed little from 2014, while that of rural families shifted clearly to the right, indicating that the living standard of rural families declined. Between 2014 and 2016, the kernel density curve widened in developed areas, remained basically unchanged in relatively developed areas, and shifted to the right as a whole in underdeveloped areas. This shows that the gap in quality of life widened in developed areas, that the quality of life in relatively developed areas remained basically unchanged, and that living standards in underdeveloped areas deteriorated.

Figure 1.1 Engel Coefficient of Families in 2014 and 2016 (solid line: 2014; dashed line: 2016)
According to the lower and upper limits of the consumption-defined middle class given above, that is, an Engel coefficient between 0.3 and 0.4, the proportion of the middle class is obtained as shown in Table 1.2. Compared with 2014, the national proportion of the middle class increased slightly in 2016, while the proportion of the urban middle class decreased slightly and the proportion in developed regions also decreased. The reasons may include changes in economic growth and income, as well as greater pressure from housing and consumption costs on people in cities and developed areas.

Analysis of the Changes of Middle Class
Functional data analysis treats each sample observation as a whole curve. In the analysis of economic functional data, scholars often need to find the main modes of variation of the variables of interest over time, and to know how many such modes are needed to fit the original sample of curves well. In other words, they need to determine the main components of variation in the data from the typical functional features of the curves. Principal component analysis is well suited to this kind of problem. Because this research studies how the middle class defined by the Engel coefficient changes over time, functional principal component analysis is introduced here. Moreover, because the micro-level data cover only two years and are not enough to represent the changes of the middle class in recent years, this research uses the urban seven-group data and the rural five-group data published by the General Authority of Statistics as the basis for the functional data expansion and applies functional principal component analysis to the changes of the middle class.

Basis Expansion of Functional Data
In this research, a B-spline basis is used to transform the discrete data into functions. Assuming that the underlying function $f(x_i)$ is a spline function, a cubic spline with $k$ knots can be written as a linear combination of basis functions $b_1(x_i), b_2(x_i), \ldots, b_{k+3}(x_i)$:
$$f(x_i) = \beta_0 + \beta_1 b_1(x_i) + \beta_2 b_2(x_i) + \cdots + \beta_{k+3} b_{k+3}(x_i).$$
The basis functions $b_j$ can be chosen in various ways. In this research, the cubic polynomial terms $x$, $x^2$, $x^3$ are taken as the starting basis, and one truncated power basis function is added for each knot, namely
$$h(x, \xi) = (x - \xi)_+^3 = \begin{cases} (x - \xi)^3, & x > \xi, \\ 0, & \text{otherwise,} \end{cases}$$
where $\xi$ is a knot. When the model is estimated, the least squares method is used to estimate the $k+4$ coefficients.
For the smoothed curves $x_i(s)$, $s \in T$, the principal component scores are defined as
$$f_{ij} = \int_T \xi_j(s)\, x_i(s)\, ds,$$
where $f_{ij}$ is the $j$-th principal component score of the $i$-th sample, $i = 1, 2, \ldots, N$. The weight functions $\xi_j(s)$ satisfy the orthonormality constraints, namely
$$\int_T \xi_j(s)^2\, ds = 1, \qquad \int_T \xi_j(s)\, \xi_m(s)\, ds = 0 \quad (m < j).$$

Method for Solving Functional Principal Components
In multivariate statistical analysis, principal components are found by computing the eigenvalues and eigenvectors of the covariance matrix or the correlation matrix. In functional principal component analysis, it is required to solve the characteristic equation $V\xi = \rho\xi$ for the eigenvalues and the corresponding characteristic functions.
(1) Solving the characteristic equation
In multivariate statistical analysis, let $X$ be the $N \times p$ data matrix and let the $p \times p$ matrix $V = N^{-1} X'X$ be the sample variance-covariance matrix; the principal components are then found from $V\xi = \rho\xi$. By analogy, solving for the functional principal component weight function $\xi_j(s)$ is transformed into solving the characteristic equation
$$\int_T v(s, t)\, \xi(t)\, dt = \rho\, \xi(s),$$
where $v(s, t)$ is the covariance function of $x_i(s)$ and $x_i(t)$, $s, t \in T$, i.e.
$$v(s, t) = N^{-1} \sum_{i=1}^{N} x_i(s)\, x_i(t).$$
If an operator $V$ is defined by
$$(V\xi)(s) = \int_T v(s, t)\, \xi(t)\, dt,$$
then $V$ is an integral transform of the weight function $\xi$ and is called the covariance operator. The characteristic equation above can therefore be expressed as $V\xi = \rho\xi$ (note that $\xi$ here is a characteristic function, not an eigenvector).
(2) Solving for the weight function by discretization
Let the observation times $t_1, t_2, \ldots, t_n$ of $x_i(t)$ be equally spaced in the interval $T$, i.e. values are taken at $n$ equally spaced points of $T$. This gives the $N \times n$ multivariate data set
$$X = \begin{pmatrix} x_1(t_1) & x_1(t_2) & \cdots & x_1(t_n) \\ x_2(t_1) & x_2(t_2) & \cdots & x_2(t_n) \\ \vdots & \vdots & & \vdots \\ x_N(t_1) & x_N(t_2) & \cdots & x_N(t_n) \end{pmatrix}.$$
As in multivariate statistical analysis, the eigenvalues $\lambda$ and eigenvectors $u$ of $V = N^{-1} X'X$ can be obtained from
$$V u = \lambda u,$$
where $u$ is an $n$-dimensional vector.
The elements of the sample variance-covariance matrix $V$ are $v(t_j, t_k)$. For a given function $\xi$, let $\tilde{\xi}$ be the $n$-dimensional column vector with elements $\xi(t_j)$, let $l$ be the length of the interval $T$, and let $w = l/n$. Then, for any $t_j$,
$$(V\xi)(t_j) = \int_T v(t_j, t)\, \xi(t)\, dt \approx w \sum_{k=1}^{n} v(t_j, t_k)\, \tilde{\xi}_k.$$
The functional characteristic equation therefore has the approximate discrete form
$$w V \tilde{\xi} = \rho\, \tilde{\xi}.$$
The solutions of this equation correspond to the solutions of $Vu = \lambda u$, and the eigenvalues are related by $\rho = w\lambda$. The approximate discrete form of the normalization constraint $\int_T \xi(s)^2\, ds = 1$ is $w \|\tilde{\xi}\|^2 = 1$, so if $u$ is a normalized eigenvector of the matrix $V$, then $\tilde{\xi} = w^{-1/2} u$. Once $\tilde{\xi}$ is obtained, the approximate characteristic function $\xi$ can be recovered from its discrete values by any simple interpolation method.
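The discretization procedure can be sketched with NumPy as follows. This is an illustrative sketch rather than the paper's R implementation. The function name `discretized_fpca` is hypothetical, the curves are synthetic, and centering the curves before forming the covariance matrix is an assumption that the text leaves implicit.

```python
import numpy as np


def discretized_fpca(curves, interval_length):
    """Approximate functional PCA for curves sampled on an equally spaced grid.

    curves: array of shape (N, n); row i holds x_i(t_1), ..., x_i(t_n).
    Returns (rho, xi, scores): eigenvalues in descending order, weight-function
    values (column j holds xi_j at the grid points) and principal component scores.
    """
    X = np.asarray(curves, dtype=float)
    X = X - X.mean(axis=0)          # centre the curves (assumption, see lead-in)
    N, n = X.shape
    w = interval_length / n         # quadrature weight w = l / n
    V = X.T @ X / N                 # sample variance-covariance matrix
    lam, U = np.linalg.eigh(V)      # solves V u = lambda u with unit-norm u
    order = np.argsort(lam)[::-1]   # eigh returns ascending order; reverse it
    lam, U = lam[order], U[:, order]
    rho = w * lam                   # rho = w * lambda
    xi = U / np.sqrt(w)             # xi~ = w^(-1/2) u, so w * ||xi~||^2 = 1
    scores = w * X @ xi             # f_ij ~ w * sum_k x_i(t_k) * xi_j(t_k)
    return rho, xi, scores


# Synthetic curves for illustration: one dominant mode of variation plus noise.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 50, endpoint=False)
amplitudes = rng.normal(size=(12, 1))
curves = amplitudes * np.sin(2 * np.pi * grid) + 0.01 * rng.normal(size=(12, 50))

rho, xi, scores = discretized_fpca(curves, interval_length=1.0)
explained = rho / rho.sum()         # share of variation explained by each component
```

In this synthetic example nearly all variation comes from a single sine-shaped mode, so the first entry of `explained` dominates. The remaining interpolation step, which turns the discrete values in `xi` into a function, is omitted here.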

Data Analysis of Middle Class Changes
In this research, the B-spline basis expansion is carried out with R software, and the Engel coefficients of the urban seven-group data and their rates of change are plotted as smooth curves. The conclusions are as follows. Apart from the relatively large fluctuation of the Engel coefficient in the lowest-income group, the Engel coefficients of the other six groups fluctuate only slightly and have remained stable on the whole. The three middle curves (the lower-middle, middle and upper-middle groups), which measure the middle class, fluctuate in a consistent way: all of them first increase and then decrease periodically. In addition, the Engel coefficient showed its largest increases in 2004 and 2008 and a downward trend after 2012. B-spline smoothing is likewise applied to the rural five-group Engel coefficients and their rates of change, with the following conclusions. The five rural curves change more sharply than the seven urban curves, and their overall trend is downward. They peaked in 2004, which may be due to prices soaring in 2004 without a corresponding increase in income. The three middle curves, which measure the middle class, have declined since 2012, which indicates that the living standard of the rural middle class has been improving in recent years. In the rural five-group rate-of-change data, the groups fluctuate in basically the same way, except that the rate of change of the low-income group is extremely low around 2012.
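The smoothing step can be illustrated with the cubic truncated power basis and least squares fit described in the basis expansion section. The sketch below uses NumPy instead of R. The function names, the knot positions and the yearly values are hypothetical, and rescaling the years to [0, 1] is a numerical convenience added here, not something the paper specifies.

```python
import numpy as np


def truncated_power_basis(x, knots):
    """Design matrix with columns 1, x, x^2, x^3 and (x - knot)_+^3 for each knot."""
    x = np.asarray(x, dtype=float)
    columns = [np.ones_like(x), x, x**2, x**3]
    columns += [np.clip(x - knot, 0.0, None) ** 3 for knot in knots]
    return np.column_stack(columns)


def fit_cubic_spline(x, y, knots):
    """Least squares estimate of the k + 4 spline coefficients."""
    design = truncated_power_basis(x, knots)
    target = np.asarray(y, dtype=float)
    coefficients, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coefficients


def evaluate_spline(x, knots, coefficients):
    """Evaluate the fitted spline at the points x."""
    return truncated_power_basis(x, knots) @ coefficients


# Hypothetical yearly Engel coefficients for one income group (not official data).
years = np.arange(2000, 2019, dtype=float)
t = (years - years[0]) / (years[-1] - years[0])   # rescale years to [0, 1]
engel = 0.38 - 0.05 * t + 0.02 * np.sin(2 * np.pi * t)
knots = [0.25, 0.5, 0.75]                         # k = 3 interior knots (assumed)

beta = fit_cubic_spline(t, engel, knots)          # k + 4 = 7 coefficients
smooth = evaluate_spline(t, knots, beta)          # smoothed curve at the data points
```

With three knots the fit has seven coefficients, matching the $k+4$ count in the text. A B-spline basis spans the same space of cubic splines and is usually better conditioned numerically, so it would typically be preferred in practice.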
Furthermore, plots of the effect of the principal components as deviations from the mean are drawn with R software for the rates of change of the seven-group urban and five-group rural Engel coefficients. As can be seen from Figures 1.2 and 1.3, both the urban and the rural Engel coefficients deviate considerably from the mean and fluctuate strongly, and the principal components shown explain most of the variation in the functions. Moreover, their first principal components account for the large changes in the five-group Engel coefficient before 2006 and after 2010. In fact, a series of national policies benefiting the people, promulgated in 2006 and 2010, increased people's income and correspondingly increased consumer spending. The five-group Engel coefficient therefore changed under the influence of these policies.

Conclusion
The density function of the Engel coefficient used to define the middle class is slightly right-skewed, which shows that people's living standards differ markedly. Compared with 2014, the quality of life in 2016 declined, but the gap between rich and poor narrowed. From the urban-rural perspective, the Engel coefficient in urban areas changed little, while that in rural areas moved clearly to the right, which indicates that the quality of life in rural areas declined. From the regional perspective, the gap in quality of life widened in developed areas, the quality of life in relatively developed areas remained basically unchanged, and living standards in underdeveloped areas deteriorated. Generally speaking, the proportion of Saudi Arabia's middle class is not large, but the overall trend is upward. The results of the functional data analysis show that the Engel coefficient curves of the middle-class groups fluctuate consistently: all of them first increase and then decrease periodically, and they have trended downward since 2012, indicating that the food expenditure share of people in all income groups has decreased and living standards have improved. The smoothed rate-of-change curves of the urban and rural Engel coefficients fluctuate in roughly the same way, but the rural curve fluctuates more sharply than the urban one. The plots of the principal components as deviations from the mean show substantial changes in the Engel coefficient before 2006 and after 2010 in both urban and rural areas.
In short, the proportion of the middle class in Saudi Arabia is generally increasing, but its scale is not large. There is still a big gap compared with the proportion of the middle class in developed economies such as Europe and the United States. Although people's quality of life has indeed improved, the task of expanding the middle class remains arduous.