Errors Measurement and their Presence in Group Slope Differences - Evaluation of OLS, SEM and EIV Estimators

Measurement error biases interaction effects and can distort inferences about interactive hypotheses. This study focuses on the single-indicator case and examines how accurately group slope differences can be estimated by disattenuating interaction effects with errors-in-variables (EIV) regression. Simulation results and analytic findings were used to compare the relative bias, Type I error rate, and power of EIV, sparse multi-group structural equation modelling (SEM), and ordinary least squares (OLS). The results showed that EIV estimators were less biased than the OLS and SEM estimators. When groups differ in predictor reliability, the OLS and SEM estimators are unable to control the Type I error rate. Additional derivations examined the use of Cronbach's alpha, which typically understates reliability, with the EIV estimator. When alpha was used, the bias in the EIV estimators remained smaller than the bias in the SEM and OLS estimators. The results suggest that the EIV estimator should be used instead of the OLS and SEM estimators to estimate group slope differences in the presence of measurement error.


Introduction
Many economic data sets are contaminated by mismeasured variables, and measurement error is one of the essential issues in empirical economics. Measurement errors produce inconsistent and biased parameter estimates and lead, to varying degrees, to erroneous conclusions in economic analysis [1]. Measurement error problems can be addressed along two dimensions: linear errors-in-variables (EIV) models and nonlinear EIV models [2]. Similarly, different methods are used to treat classical and nonclassical measurement errors.
Social scientists recognise the problem of measurement error during data collection but usually ignore it in the corresponding statistical analyses. In the most optimistic scenario, the bias induced by measurement error can be ignored if it is judged to be small relative to the effects being measured [3]. Application-specific methods for handling measurement error exist, but they are complicated to implement, rely on difficult-to-satisfy assumptions, or lead to high levels of model dependence [4].
Methodological issues related to examining and interpreting measurement effects are discussed in several studies across a variety of disciplines. One consistent finding in previous studies is the statistical difficulty of detecting interaction effects [5]. For example, a researcher's ability to detect such effects is significantly impaired by study characteristics such as scale coarseness, sampling error, range restriction, measurement error, multicollinearity, and heterogeneity of group error variances [6]. Spurious interactions have also been found to arise when classical test theory is used instead of item response theory.
The situation becomes more complicated when researchers have only a single indicator of the latent predictor, which rules out multigroup structural equation models that rely on multiple indicators. Prediction bias research is one such situation, because test scores are often treated as a single predictor [7]. However, prediction bias research is not the only setting in which researchers face single predictors. For instance, single indicators occur naturally when researchers in economics, psychology, and education use standardized test scores to predict important outcomes [8]. Given how often single indicators occur, this study assesses methodological alternatives for estimating group slope differences in the presence of measurement error.
Currently, researchers use two data-analytic options in the single-indicator case when measurement error is present: sparse multigroup SEM and ordinary least squares (OLS). Previous research has not evaluated the relative performance of OLS and sparse SEM as estimators of interaction effects. This study used Monte Carlo simulation and analytic derivation to evaluate SEM and OLS. A new errors-in-variables (EIV) estimator is presented as an alternative and compared with SEM and OLS.
This study makes four contributions. Firstly, it extends the use of EIV models for deattenuating interaction effects. Secondly, new equations are presented to examine the comparative performance of EIV versus OLS in estimating interaction effects when an unbiased estimate of reliability is available. Only one study has analytically compared the relative power, bias, and Type I error rates of EIV and OLS estimators of interaction effects. Thirdly, the performance of sparse SEM relative to EIV and OLS is evaluated through new simulation results. Fourthly, Cronbach's alpha, which is a biased estimate of reliability in most practical situations, is used as the measure of reliability. The study therefore focuses on the presence of measurement error in group slope differences and on how accurately the OLS, SEM, and EIV estimators recover group slope differences when interaction effects are deattenuated.

Estimating Group Slope Differences using Statistical Alternatives
Selection procedures are used to predict outcomes, and those predictions may differ across sub-groups. In slope-based tests, the size of observed sample-based effects is reduced by statistical and methodological artefacts, including range restriction and measurement error [9].

OLS Estimators
In a linear regression model, OLS estimators are used to estimate the unknown parameters. OLS minimizes the sum of squared differences between the observed responses in the dataset and the responses predicted by a linear function of the explanatory variables [11]. The model fits the data well when the sum of squared vertical distances between each data point and the corresponding point on the regression line is small. Under the classical assumptions, OLS estimators are efficient, minimum-variance, and unbiased. The shortcomings of OLS estimators have been investigated in several studies, and the attenuated OLS coefficients are given by
$$\hat{\beta}_{OLS} = \Sigma_{xx}^{-1} \Sigma_{xy} \qquad (1)$$
where $\Sigma_{xx}$ is the covariance matrix of the observed predictors and $\Sigma_{xy}$ is the vector of covariances between the predictors and the criterion. The standard errors of these coefficients also depend on $\sigma_y^2$, the variance of the criterion, and on $k$, the number of indicators measured.
Although statisticians are well aware of the measurement error bias of the OLS estimator, researchers still use it to evaluate group slope differences [10]. OLS estimators are consistent when the regressors are exogenous [12]. They provide minimum-variance, mean-unbiased estimation when the errors have finite variances, and they coincide with the maximum likelihood estimator when the errors are normally distributed.
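As a minimal sketch of how OLS slopes follow from sample moments, the snippet below solves $\Sigma_{xx}\beta = \Sigma_{xy}$ in plain Python. The helper names (`covariance`, `solve_linear`, `ols_from_moments`) and the toy data are illustrative assumptions and are not taken from the study.

```python
from typing import List, Sequence


def covariance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sample covariance with an n - 1 denominator."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (n - 1)


def solve_linear(matrix: List[List[float]], rhs: List[float]) -> List[float]:
    """Solve matrix @ beta = rhs by Gaussian elimination with partial pivoting."""
    k = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= factor * aug[col][c]
    beta = [0.0] * k
    for r in reversed(range(k)):
        tail = sum(aug[r][c] * beta[c] for c in range(r + 1, k))
        beta[r] = (aug[r][k] - tail) / aug[r][r]
    return beta


def ols_from_moments(predictors: List[Sequence[float]],
                     y: Sequence[float]) -> List[float]:
    """Slope vector Sigma_xx^{-1} Sigma_xy; centring absorbs the intercept."""
    sigma_xx = [[covariance(p, q) for q in predictors] for p in predictors]
    sigma_xy = [covariance(p, y) for p in predictors]
    return solve_linear(sigma_xx, sigma_xy)


# Hypothetical toy data: y = 1 + 2*x1 - 3*x2 exactly, without measurement error.
x1 = [0.0, 1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 0.0, 2.0, 1.0, 3.0]
y = [1 + 2 * a - 3 * b for a, b in zip(x1, x2)]
slopes = ols_from_moments([x1, x2], y)
```

Because the toy predictors are error-free, the recovered slopes match the generating values of 2 and -3 up to floating-point error; attenuation only appears once the predictors are measured with error.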

EIV Estimators
EIV estimators are regression models that account for measurement errors in the independent variables. EIV is expected to outperform competing estimators, including OLS and SEM, in the context of analysis of covariance [13]. The EIV coefficients are defined by Equation (3). Inconsistent estimates arise when the regressors are measured with error: even in large samples, the parameter estimates do not converge to the true values. The direction of the bias is more complicated in non-linear models [14]. Errors of different nature and magnitude exist in all data sets, so attenuation bias is frequent in multivariate regression. As with OLS estimators, EIV estimators extend from the simple to the multivariable case.
Let x be an observed indicator of τ with mean μ and variance σ². The proportion of subjects in the focal group is denoted by p. Subgroup moments of x are defined separately for the focal group and for the reference group, and ∆μ and ∆σ² denote the respective differences in means and variances between the reference and focal groups.
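One way to see why a reliability correction matters for group slope differences is the single-predictor case, where classical measurement error shrinks the large-sample OLS slope by the reliability of the predictor. The sketch below is an illustration under stated assumptions rather than the estimator defined in Equation (3): the function names, the common latent slope of 0.40, and the subgroup reliabilities of 0.90 and 0.70 are all hypothetical.

```python
def attenuated_slope(true_slope: float, reliability: float) -> float:
    """Large-sample OLS slope on an error-prone predictor (classical error)."""
    return true_slope * reliability


def disattenuated_slope(ols_slope: float, reliability: float) -> float:
    """Moment-based correction: divide the OLS slope by the reliability."""
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must lie in (0, 1]")
    return ols_slope / reliability


# Hypothetical scenario: both groups share the same latent slope,
# but the predictor is measured less reliably in the focal group.
true_slope = 0.40
rel_reference, rel_focal = 0.90, 0.70

ols_reference = attenuated_slope(true_slope, rel_reference)  # about 0.36
ols_focal = attenuated_slope(true_slope, rel_focal)          # about 0.28
ols_difference = ols_reference - ols_focal                   # spurious gap

eiv_reference = disattenuated_slope(ols_reference, rel_reference)
eiv_focal = disattenuated_slope(ols_focal, rel_focal)
eiv_difference = eiv_reference - eiv_focal                   # essentially zero
```

In this scenario the uncorrected slopes differ by about 0.08 even though the latent slopes are identical, which is the kind of artefact that can inflate Type I error when groups differ in reliability. Correcting each group with its own reliability removes the gap.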

SEM Estimators
SEM is an alternative to the OLS and EIV estimators. As shown in a simulation study, the SEM estimator fixes the variances in the measurement model [15]. SEM draws on a diverse group of mathematical models, statistical methods, and computer algorithms to fit networks of constructs to data. It includes latent growth modelling, path analysis, confirmatory factor analysis, and partial least squares path modelling. Structural equation models are often used to assess unobservable latent constructs.
SEM estimators can infer relationships between latent variables (unobserved constructs) from observable variables [16]. Parameters are estimated by comparing the model-implied covariance matrix with the observed covariance matrix, which is done with a dedicated SEM analysis program [17]. SEM includes a measurement model, which defines each latent variable through one or more observed variables, and a structural model, which specifies the relationships between the latent variables.

Material and Methods
The study adopted a mixed research design for the collection and analysis of data. A qualitative approach was used to assess the validity of previously published data. The effects and accuracy of the OLS, EIV, and SEM estimators were examined through simulation. For accuracy, each unique combination of parameters was estimated with around 450-500 replications.

Data
Data were generated with equal sample sizes and contain measurement error in the group slope differences, which allows the different estimators to be evaluated. The data were generated from a beta distribution, treated as a single population. The analytic findings were used together with the simulation results to compare the Type I error, power, and relative bias of the OLS, EIV, and SEM estimators.
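A replication loop of this kind can be sketched as follows. This is a simplified, single-group illustration and not the study's simulation code: normal draws are used because the shape parameters of the beta distribution are not reported, and the latent slope of 0.40, the reliability of 0.80, the seed, and the function name `simulate_ols_slope` are assumptions. The sample size of 400 and the 500 replications follow the values mentioned in the text.

```python
import random
import statistics


def simulate_ols_slope(n: int, true_slope: float, reliability: float,
                       rng: random.Random) -> float:
    """Draw one sample and return the OLS slope of y on the error-prone x."""
    # With var(tau) = 1, reliability = 1 / (1 + error variance).
    error_sd = ((1.0 - reliability) / reliability) ** 0.5
    xs, ys = [], []
    for _ in range(n):
        tau = rng.gauss(0.0, 1.0)                    # latent predictor
        xs.append(tau + rng.gauss(0.0, error_sd))    # observed single indicator
        ys.append(true_slope * tau + rng.gauss(0.0, 1.0))
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


rng = random.Random(2024)  # fixed seed so that the run is repeatable
replications = 500
slopes = [simulate_ols_slope(400, 0.40, 0.80, rng) for _ in range(replications)]

mean_ols = statistics.fmean(slopes)                       # near 0.40 * 0.80
mean_eiv = statistics.fmean([s / 0.80 for s in slopes])   # near 0.40
relative_bias_ols = (mean_ols - 0.40) / 0.40              # negative: attenuated
```

Averaging over replications separates systematic bias from sampling noise: the mean OLS slope settles near the attenuated value, while the reliability-corrected slope settles near the latent slope.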

Procedure
Quantitative analysis was carried out using Cronbach's alpha to assess reliability. Simulation results were also obtained for errors-in-variables (EIV) regression. SPSS Version 20 (for Cronbach's alpha) and AMOS Graphics (for the structural equation model) were the core software applications.

Results
The data were simulated for the following parameter values: p = 0.3, n = 400, μ = 0, σ² = 1, ∆μ = 1, ∆σ² = .5, .20, .02, 0.1 0.1. The first step of the EIV analysis computes the covariance matrix among the observed predictors, $\Sigma_{xx}$, and the vector of covariances between the criterion and the predictors, $\Sigma_{xy}$, for the simulated data. The first, second, and third rows of $\Sigma_{xx}$ and $\Sigma_{xy}$ hold the corresponding estimated covariances.
The true values of the coefficients for τ, G, and τG were .400, .185, and .175, respectively. The EIV estimates were less biased than the OLS estimates, which were .203, .110, and .139. The standard errors of the estimates, which equal the square roots of the diagonal elements of the estimated covariance matrix of the coefficients, were used to construct confidence intervals or t values.
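The size of the OLS bias can be made concrete with the values reported above. The short sketch below assumes the usual definition of relative bias, (estimate - true value) / true value; the dictionary keys and the function name are illustrative choices.

```python
# True coefficients and OLS estimates as reported in the text.
true_values = {"tau": 0.400, "G": 0.185, "tau*G": 0.175}
ols_estimates = {"tau": 0.203, "G": 0.110, "tau*G": 0.139}


def relative_bias(estimate: float, truth: float) -> float:
    """(estimate - truth) / truth; negative values mean underestimation."""
    return (estimate - truth) / truth


ols_relative_bias = {
    name: relative_bias(ols_estimates[name], true_values[name])
    for name in true_values
}
```

Under this definition the OLS estimates understate the three coefficients by roughly 49%, 41%, and 21%, respectively.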

Interactive Effects of OLS, SEM, and EIV Estimators
This section discusses the equations used to calculate the Type I error rate, power, and relative bias of the OLS, SEM, and EIV estimators of group differences in slopes. The performance of these estimators is also evaluated with respect to reliability. When alpha is used, the bias in the EIV estimates is smaller than the bias produced by sparse SEM and OLS [7]. The OLS estimator minimizes the sum of squared errors between the observed data and the fitted function. The fitted values of the dependent variable differ from the data depending on the resulting constants. Moreover, it is reported to be concerned only with the occurrence of median error and to be less time-efficient than the other two techniques [18].

Cronbach's Alpha and Relative Biasness of EIV
Cronbach's alpha is widely used to estimate internal consistency reliability. It assumes that the average correlation among a set of items is an accurate estimate of the average correlation among all items that pertain to a given construct. Using the EIV estimator with α may cause biased estimation and overcorrection [7]. Additional derivations examined the effect of using Cronbach's alpha, which is typically a lower bound on reliability, with EIV. Cronbach's alpha is also known as a measure of internal consistency because it generally increases as the intercorrelations among the test items increase, which happens when all of the items measure the same construct. It indicates the degree to which a set of items measures a single unidimensional latent construct. Cronbach's alpha is calculated from the correlation between the score on each scale item and the total score for each observation, compared with the variances of the individual item scores.
Figure 1 includes a reference line through the origin with a slope of one, which shows when the relative EIV bias is greater than a given value of the function. Figure 1 shows an approximately one-to-one relationship between the relative bias of the EIV estimates and the function when the plotted parameter equals 0.2. In contrast, the relative bias of the EIV estimates exceeds the function as that parameter increases. In particular, this result shows that the bias of the EIV estimates grows as the parameter grows and shrinks as it falls. Figure 3 shows that EIV controls the Type I error rate for tests of slope differences irrespective of the values of the group-difference (∆) parameters. Figure 3 also shows that the bias in interaction effects can distort statistical inferences that depend on SEM and OLS.
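For readers who want to compute Cronbach's alpha directly, the sketch below uses the standard variance form, $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i \sigma_i^2}{\sigma_T^2}\right)$, where $\sigma_i^2$ is the variance of item $i$ and $\sigma_T^2$ is the variance of the total score. The study itself reports using SPSS for this step; the function name and the item scores below are hypothetical.

```python
import statistics
from typing import List, Sequence


def cronbach_alpha(items: List[Sequence[float]]) -> float:
    """Cronbach's alpha; items[i][j] is the score of respondent j on item i."""
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance_sum = sum(statistics.variance(item) for item in items)
    total_variance = statistics.variance(totals)
    return (k / (k - 1)) * (1.0 - item_variance_sum / total_variance)


# Hypothetical scores: three items answered by five respondents.
items = [
    [2.0, 4.0, 3.0, 5.0, 1.0],
    [3.0, 4.0, 3.0, 5.0, 2.0],
    [2.0, 5.0, 4.0, 4.0, 1.0],
]
alpha = cronbach_alpha(items)
```

For these scores alpha is about 0.94, which reflects the strong intercorrelations among the three hypothetical items.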

Discussion
The results, together with the simulation outcomes, present new analytical findings. These findings supported a comparison of the power, relative bias, and Type I error rates of the EIV, OLS, and SEM estimators. The study also presents four significant implications of these estimators.
This study compared the ability of the estimators to assess slope differences when a single indicator is measured with error. A new estimator was presented that assesses slope differences accurately. The study also provided analytical evidence for preferring the EIV estimator over the other two estimators.
Measurement error significantly affects the interaction effects estimated by these estimators [7]. The results show that bias and overcorrection in EIV estimates depend on the use of alpha. When Cronbach's alpha is used with EIV, the bias is tied to the variability among the item loadings and decreases as either the average loading or the test length increases [7]. The results therefore provide compelling evidence for using EIV in place of OLS and sparse SEM to estimate group slope differences in the presence of measurement error.
Culpepper [8] showed that OLS yields inaccurate inferences about prediction bias hypotheses. The effects of selection, measurement bias, and measurement error were demonstrated using a criterion-predictor factor model that relies on OLS. Type I error and power rates were computed through the criterion-predictor factor model, which was combined with regression analysis to assess hypotheses of prediction bias. The analysis indicated that OLS is not suited to testing hypotheses about group differences in latent slopes and intercepts [8].
Group differences in linear regression intercepts or slopes have been examined through differential prediction from one or more scores. Measurement equivalence has been defined as factorial invariance within a single-factor model for the test and the criterion. Measurement bias in the EIV, OLS, and SEM estimators does not result from slope or intercept differences [19]. However, multi-group confirmatory factor analysis has been used to test different theorems. EIV is considered among the most popular statistical procedures for investigating regression slope differences across groups. Heterogeneous error variances produced large bias, and the conditions that lead to heterogeneity are common.
Error in the measurement process biases estimates of the associations between constructs. Because of measurement error, the observed relationship between specific measures underestimates the association between the constructs. SEM estimators are more efficient than the other estimators in estimating mediation effects. SEM can analyse the association between an unobserved latent concept, such as depression, and the observed variables that measure it [17]. Correlated errors and endogenous variables can be used to model a system or to fit a model with complex relationships between observed and latent variables. These models are usually fitted with continuous, ordinal, binary, survival, and fractional outcomes. A familiar property of OLS is that its coefficient estimators are inconsistent when the regressors contain random measurement errors, which is the problem that EIV estimators address [20]. Under standard assumptions, with one regressor or with multiple uncorrelated regressors, the estimator of the slope coefficient is biased towards zero, which is known as attenuation [11].
The slope coefficients cannot be identified from standard data without valid parameter restrictions or valid instruments for the error-ridden regressors. The lack of identification in EIV estimators is therefore related to uni-dimensional data. The EIV identification problem can be handled more easily if the variables are observed as panel data, which exhibit two-dimensional variation. Slope coefficients can then be estimated consistently without extraneous information, provided that the distributions of the latent regressors and the measurement errors satisfy certain weak conditions [19].
Having two dimensions of variation in the observed variables makes EIV identification easier through repeated measurements and linear data transformations [20]. Repeated measurements reduce the error through averaging while retaining sufficient variation. A large set of linear data transformations also assists accurate estimation. These transformations are required for uni-dimensional variables that are potentially related to the regressor. Moreover, data transformations that solve other problems in the estimation of slope coefficients may aggravate the EIV problem [20].
EIV models are useful when researchers have only one indicator variable for τ. There are many examples in psychology and education where researchers have only a single indicator. In contrast, when multiple indicators are available, SEM models can be a useful alternative for estimating interactions between continuous latent variables and for comparing model invariance across groups. Researchers can use multigroup SEM if item-level data or multiple measures of the latent variable are available. Additional research is therefore required to understand the relative performance of SEM and EIV in the multi-indicator case.
The EIV procedure has two limitations. Firstly, an accurate measure of reliability is needed to obtain unbiased estimates of interaction effects. Cronbach's alpha, which has the benefit of requiring only a single test administration, is not always the best estimator of reliability. For example, α tends to be biased when item loadings vary, when transient errors influence responses over time, and when measurement errors are correlated. The performance of EIV therefore depends on whether important assumptions are met and on whether researchers have good estimates of reliability when disattenuating interaction effects with EIV.
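The effect of unequal item loadings on α can be illustrated with population moments. The sketch below assumes a one-factor (congeneric) model with uncorrelated errors, standardized items, and a single error-prone predictor; the loadings, the latent slope of 0.40, and the function name are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple


def alpha_and_reliability(loadings: Sequence[float],
                          error_variances: Sequence[float]) -> Tuple[float, float]:
    """Population alpha and true reliability of a sum score.

    Assumes item_i = loading_i * tau + e_i with var(tau) = 1 and
    mutually uncorrelated errors.
    """
    k = len(loadings)
    true_part = sum(loadings) ** 2                      # variance due to tau
    total_variance = true_part + sum(error_variances)   # variance of the sum score
    item_variance_sum = sum(l * l + t for l, t in zip(loadings, error_variances))
    alpha = (k / (k - 1)) * (1.0 - item_variance_sum / total_variance)
    return alpha, true_part / total_variance


# Equal loadings: alpha coincides with the true reliability.
equal_alpha, equal_rho = alpha_and_reliability([0.6, 0.6, 0.6], [0.64, 0.64, 0.64])

# Unequal loadings: alpha falls below the true reliability.
unequal_alpha, unequal_rho = alpha_and_reliability([0.9, 0.6, 0.3], [0.19, 0.64, 0.91])

true_slope = 0.40                              # hypothetical latent slope
ols_slope = true_slope * unequal_rho           # attenuated by the true reliability
eiv_alpha_slope = ols_slope / unequal_alpha    # overcorrected because alpha < rho
```

With the unequal loadings, α is about 0.60 while the true reliability is about 0.65, so the α-corrected slope overshoots to about 0.44. That overcorrection is still much closer to 0.40 than the uncorrected slope of about 0.26, in line with the smaller bias reported for EIV when alpha is used.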
A second limitation arises when applied researchers have no estimates of subgroup reliabilities. In that case, OLS regression is the only option left. Researchers may be able to use approximations of the subgroup reliabilities without severely biasing the EIV estimates. Further research is required to guide researchers on suitable alternatives when estimates of reliability are unavailable.

Conclusion
The results indicate that the EIV estimator, rather than the OLS and SEM estimators, should be used to estimate group slope differences in the presence of measurement error. This does not diminish the relevance of SEM estimators in modelling economic phenomena. The success of SEM estimators depends on a high degree of interdependence among the variables associated with the phenomena of interest. SEM estimators can handle both a single-equation model and a complete system of equations. It is therefore proposed that the cross-sectional and temporal heterogeneity of panel data be accounted for by means of different error component structures within the structural equations of a simultaneous equation system.
The analytical results provided in this study give researchers grounds to prefer EIV over OLS or sparse SEM. For the conditions presented in Figures 1 through 3, EIV always outperformed sparse SEM and OLS in recovering true interaction effects and in controlling Type I error rates. Researchers should consider EIV an accurate method for estimating interaction effects and for distinguishing true effects from Type I errors. SEM and OLS had comparable statistical power to detect interaction effects when groups did not differ in predictor reliability, across sample sizes and proportions of the sample in the focal group.

Funding
This research is not funded through any source.

Disclosure Statement
This research holds no conflict of interest.

Data Availability Statement
The data will be available for review from the corresponding author on request.

Acknowledgements
The author thanks all associated personnel who contributed to this research.