Comparing Performances of Homogeneity Tests Used for Intraclass Version of Kappa

Harika Gozde Gozukara Bag; Celal Reha Alpar

Abstract

The reliability of a measure is an important component of the quality of the measurement. Reliability can be defined as the repeatability or consistency of duplicate measurements in a measurement process. In many fields, reliability studies are based on assessing agreement between observations or observers. In this study, we considered the most common use of the intraclass kappa statistic, which is the widely accepted measure for assessing the reliability between two ratings on a binary trait. In a meta-analysis of kappa statistics obtained from multiple studies using the same measure, in multicenter studies, or in a stratified study, one would like to compare the kappa statistics and present a common or summary kappa using all available information. A homogeneity test is required before an overall kappa can be estimated from two or more independent kappa coefficients. The aim of this study was to compare the Fleiss, Donner’s goodness-of-fit, Likelihood Score, Modified Score and Pearson’s goodness-of-fit test statistics, which are used to test the homogeneity of two or more independent intraclass kappa statistics. The test procedures were evaluated separately under the assumption of equal prevalences and under unequal prevalences. To compare the tests in terms of Type I error rate and power, a Monte Carlo approach with 10000 simulations was used. Under the assumption of equal prevalences, Pearson’s goodness-of-fit test showed the best performance in terms of Type I error rate, while the Fleiss test tended to be liberal because it is based on a large-sample variance. Under unequal prevalences, Donner’s goodness-of-fit and Modified Score tests performed better than under the assumption of equal prevalences, the Fleiss test was found to be liberal when testing the homogeneity of more than two kappa statistics, and the Likelihood Score test held its Type I error rate at the nominal level and exhibited the best performance.
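The Monte Carlo evaluation of Type I error described above can be illustrated with a minimal sketch. It is not the authors' code. It assumes the common correlation model for two binary ratings per subject, the usual maximum likelihood estimator of intraclass kappa, and the standard large-sample variance of that estimator to build a Fleiss-type weighted statistic for two independent samples. The sample size, prevalence, kappa value, seed and all function names are hypothetical choices made only for illustration.

```python
import random

CHI2_CRIT_1DF_05 = 3.841  # upper 5% point of chi-square with 1 df (two samples)

def cell_probabilities(pi, kappa):
    """Common correlation model: P(both positive), P(discordant), P(both negative)."""
    shared = kappa * pi * (1 - pi)
    return pi ** 2 + shared, 2 * pi * (1 - pi) * (1 - kappa), (1 - pi) ** 2 + shared

def simulate_counts(n, pi, kappa, rng):
    """Draw n subjects and count the three response patterns."""
    draws = rng.choices(range(3), weights=cell_probabilities(pi, kappa), k=n)
    return draws.count(0), draws.count(1), draws.count(2)

def intraclass_kappa(both_pos, discordant, both_neg):
    """Return (pi_hat, kappa_hat), or None when kappa is undefined."""
    n = both_pos + discordant + both_neg
    pi_hat = (2 * both_pos + discordant) / (2 * n)
    if pi_hat == 0.0 or pi_hat == 1.0:
        return None
    return pi_hat, 1 - discordant / (2 * n * pi_hat * (1 - pi_hat))

def large_sample_variance(n, pi_hat, k):
    """Large-sample variance of the intraclass kappa estimator (assumed form)."""
    return (1 - k) / n * ((1 - k) * (1 - 2 * k) + k * (2 - k) / (2 * pi_hat * (1 - pi_hat)))

def fleiss_statistic(samples):
    """Weighted sum of squared deviations from the pooled kappa.

    samples is a list of (n, pi_hat, kappa_hat); returns None if a variance
    estimate is not positive, so that the replicate can be skipped.
    """
    weights = []
    for n, pi_hat, k in samples:
        var = large_sample_variance(n, pi_hat, k)
        if var <= 0:
            return None
        weights.append(1 / var)
    kappas = [k for _, _, k in samples]
    pooled = sum(w * k for w, k in zip(weights, kappas)) / sum(weights)
    return sum(w * (k - pooled) ** 2 for w, k in zip(weights, kappas))

def type1_error_rate(n, pi, kappa, reps=10000, seed=1):
    """Share of replicates rejecting homogeneity when both samples share one kappa."""
    rng = random.Random(seed)
    rejected = valid = 0
    for _ in range(reps):
        samples = []
        for _ in range(2):
            est = intraclass_kappa(*simulate_counts(n, pi, kappa, rng))
            if est is None:
                break
            samples.append((n,) + est)
        else:  # both samples gave a defined kappa estimate
            stat = fleiss_statistic(samples)
            if stat is not None:
                valid += 1
                rejected += stat > CHI2_CRIT_1DF_05
    return rejected / valid if valid else float("nan")

if __name__ == "__main__":
    # Hypothetical setting: equal prevalences, 100 subjects per sample.
    print(type1_error_rate(n=100, pi=0.3, kappa=0.4, reps=10000))
```

Replicates in which kappa or its variance estimate is undefined are dropped here for simplicity; a full study would need to state how such cases are handled, and would replace the fixed critical value with the chi-square quantile for the number of samples minus one.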

Keywords: Agreement, intraclass kappa statistic, homogeneity test, chi-square distribution, reliability

ISSN (online) 2422-8702