Developing Model Eliciting Activities for Teaching of Middle School Mathematics: Evaluations of Self, Peers, and Experts

This study aims to present expert, peer and self-evaluations of model eliciting activities (MEAs) designed by prospective mathematics teachers in terms of the basic principles of modeling (reality, model construction, self-assessment, construct documentation, model generalization). A case study design was used. The study group consists of 15 mathematics teacher candidates studying at Kastamonu University. The activities that the teacher candidates created were used as data collection tools. The data obtained were analyzed descriptively. The findings reveal that two activities were suitable for the model construction principle and that all but one of the activities were evaluated as partially appropriate in terms of the self-assessment and model generalization principles. When the proficiency evaluations were examined in detail, it was determined that the groups generally considered their own activities sufficient and tended to score them higher than their peers did. Similarly, it was determined that both peer and self-assessments differed from expert assessments.

which is thought to be related to the principle of self-assessment in terms of enabling students to express their own thoughts, argues that the solutions created should clearly reveal how students think (Lesh & Caylor, 2007; Lesh et al., 2000). The model generalization principle, which covers the shareability and reusability of the developed model, refers to the development of generalizable models that others can use in parallel situations and for different purposes, rather than models tied to a specific situation or purpose (Lesh et al., 2000). Finally, the effective prototype principle holds that the developed model should be an effective first example for similar situations and should be remembered by students even after time has passed (Lesh & Caylor, 2007).
Studies investigating the compatibility of modeling activities with modeling principles have reported different results regarding the realization of the principles. Tekin Dede and Bukova Güzel (2013a) examined the MEA design processes of secondary school mathematics teachers and evaluated the extent to which the MEAs met the design principles. Their findings revealed that the designs were in accordance with the principles of reality, model construction, construct documentation and model generalization, but that teachers had difficulty realizing the principles of self-evaluation and effective prototype. In another study in which MEAs were discussed in terms of modeling principles, it was stated that MEA designs were in accordance with the principles of reality and model construction, but that the other four principles were not observed (Yu & Chang, 2011). Similarly, Tekin, Hıdıroğlu, and Bukova Güzel (2011) stated that prospective teachers took the principles of reality, model generalization and effective prototype into account in the MEAs they designed, that one problem situation was not completely compatible with the principle of model construction, and that three were not compatible with the principle of construct documentation. However, studies in the literature usually rely on experts to evaluate the conformity of MEAs to modeling principles (Lesh & Caylor, 2007; Tekin Dede & Bukova Güzel, 2013b; Yu & Chang, 2011). It is thought that evaluating modeling activities not only through experts but also through self-assessment and peer assessment, carried out by participants who have received theoretical information about mathematical modeling and its processes, have taken part in the design process and have worked on the evaluation criteria, will be useful in interpreting modeling competencies and will contribute to the literature.
In line with the above explanations, the purpose of this study is to reveal the conformity of MEAs designed by pre-service mathematics teachers with the basic modeling principles in light of the evaluations obtained from experts, peers and the designers themselves. Accordingly, the research problem of this study is as follows: "How have the modeling competencies of MEAs designed by prospective mathematics teachers (PMTs) been evaluated by experts, peers and themselves, and how coherent are these evaluations?".

Methodology
Being qualitative in nature, this was a case study, in which a researcher examines a situation within its context, limited by time and activity, and collects detailed information (Merriam, 1988;Yin, 2003). The case that was investigated in the current research involves PMTs' evaluations of MEAs in terms of basic principles of mathematical modelling.

Participants
The study group consists of 15 prospective mathematics teachers studying in the fourth year of the Kastamonu University Faculty of Education Elementary Mathematics Education Undergraduate Program in the fall and spring terms of the 2019-2020 academic year. The study was conducted within the scope of the undergraduate Elective II course. Having completed the courses on the mathematical concepts and pedagogical processes required for the MEAs discussed in the study was set as a criterion in the selection of the participants. Participants worked in groups during the mathematical modeling training process. The groups named themselves walking death, infinite / infinite, Pythagorean, and selective permeable.

Procedures of the study
The research process took 10 weeks (Table 1) and was carried out in the fourth-year Elective II course (3 course hours per week, 45 minutes per course hour) in the fall semester of the primary school mathematics teaching program. The training was planned in two parts: a mathematical modeling education process and an MEA design process. The first three weeks provided the theoretical background of model eliciting activities in mathematics teaching. Weeks four and five were devoted to solving mathematical modeling problems (see Table 1) under Borromeo Ferri's cognitive perspective on modeling competencies. The next week aimed to define and discuss the basic principles of model eliciting activities. Afterwards, each group was asked to enter the MEA design process (Weeks 8 through 10). There was no restriction on the content of the MEA; the groups were only asked to make the MEA suitable for a selected grade level in secondary education and to direct it to the subject or subjects they chose from this grade level. The MEAs used in the modeling training process had been piloted, and the necessary arrangements had been made so that they met all modeling principles (Table 1).
Table 1 note: (Herget, Jahnke, & Kroll, 2001); PF: Population forecast (Ural, 2014); GB: Giant's boat (Ural, 2018

Data Collection Process
A modeling principles evaluation form was used as the data collection tool in the study. The purpose of this form is to have the MEAs designed by prospective teachers evaluated by the group that designed the activity, by other peers and by experts. The modeling principles framework survey contains the evaluators' assessments in the context of the basic modeling principles (reality principle, model construction principle, self-assessment principle, construct documentation principle, model generalization principle) (Appendix 1).
The MEAs designed by the groups were subjected to document analysis in order to reveal the extent to which they met the MEA design principles in light of the existing theoretical framework. Çepni (2007) defines document analysis as the process of collecting existing records and documents related to the work to be done, and coding and reviewing them according to a certain norm or system. Through this examination, the MEAs designed by the pre-service teachers were analyzed by the groups themselves, their peers and experts in order to reveal the extent to which the principles were met. Table 2 shows the criteria taken into consideration for each principle while making the evaluations.

Table 2. Evaluation criteria for modeling principles
Principle | Evaluation criteria
Reality | The introductory article and the problem situation of the MEA describe a situation that the students for whom it is intended may encounter in their real life.
Model construction | The statement in the problem situation of the MEA requires creating a model.
Self-assessment | The MEA gives students the opportunity to reach the solution by making decisions about the process within the group, without getting support from their teachers.
Construct documentation | The extent to which the statements in the MEA allow students to present all their thoughts about the solution process in a way that the client can understand.
Model generalization | The expressions in the problem situation of the MEA lead students to create a generalizable model.
Note: Adapted from Tekin-Dede and Bukova Güzel (2013a).
The principle of "Effective Prototype" was not included among the modeling principles discussed in this study. The effective prototype principle is related to "whether the model created and the solution made in solving the problem situation of MEA can be remembered and benefited by the students even when time passes" (Tekin-Dede & Bukova Güzel, 2013a). However, this study did not aim to investigate whether the models created by the students can be remembered and (or) reused. Therefore, evaluations regarding this principle have not been made in the MEAs prepared by prospective teachers.
Tekin-Dede and Bukova Güzel (2013a) exemplified the use of four categories in evaluating modeling activities. Following their approach, conformity with each principle was evaluated under the categories "eligible", "somewhat eligible", "ineligible" and, where no trace of the principle could be found, "not determinable". In addition, scores were used in this study to compare the evaluations made by the different evaluators (self, peer, and expert): 0 (zero) points for "not determinable", 1 (one) point for "ineligible", 2 (two) points for "somewhat eligible" and 3 (three) points for "eligible", so that each principle was scored out of a maximum of 3 points.
The coding was carried out on the transcripts of the designed MEAs. During the expert evaluation process, two field experts were asked to evaluate the MEAs using the MEA evaluation questionnaire. The averages of the scores obtained from the self, peer and expert evaluations were analyzed descriptively and compared by means of tables and graphs.
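The scoring scheme described above can be illustrated with a short sketch. It is only a minimal illustration, assuming that each evaluator's judgment is recorded as one of the four category labels; the function name `mean_score` and the example ratings are hypothetical and are not data from the study.

```python
from statistics import mean

# Category-to-points mapping as described in the text (0 to 3 points).
CATEGORY_POINTS = {
    "not determinable": 0,
    "ineligible": 1,
    "somewhat eligible": 2,
    "eligible": 3,
}


def mean_score(ratings):
    """Average the point values of a list of category labels.

    Hypothetical helper: `ratings` holds the labels given by several
    evaluators to one MEA on one modeling principle.
    """
    return mean(CATEGORY_POINTS[label] for label in ratings)


# Hypothetical ratings for one MEA on one principle (not study data).
example_ratings = ["eligible", "somewhat eligible", "eligible", "ineligible"]
print(mean_score(example_ratings))  # (3 + 2 + 3 + 1) / 4 = 2.25
```

Averaging per principle and per evaluator type (self, peer, expert) in this way yields values on the same 0 to 3 scale, which is what makes the three sources comparable.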

Reliability issues
The reliability of a study is ensured by providing (1) time invariance (continuity), (2) consistency between independent experts or raters (rater consistency) and (3) internal consistency (Baxter & Jack, 2008; Patton, 2002). According to Creswell (2013), it is important that the transcribed data be coded by multiple coders and that consensus be reached among the coders in order to ensure reliability.
Coding was done by the researcher of this study and an expert who has conducted research on mathematical modeling competencies. The percentage of agreement between the analyses was determined using the calculation proposed by Miles and Huberman (1994). The items on which the coders disagreed were re-analyzed, and consensus was reached through discussion.
Two mathematics education experts took part in the expert evaluation of the MEAs prepared by the prospective teachers, and they coded independently. As a result of the coding of the data obtained from the modeling principles evaluation form, the inter-coder agreement rate was calculated as 86%. Miles and Huberman (1994) emphasize that, for good qualitative reliability, agreement between coders should be at least 80%. In this context, the reliability between coders was considered sufficient in the study.
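The agreement calculation attributed to Miles and Huberman (1994) is commonly stated as the number of agreements divided by the sum of agreements and disagreements, expressed as a percentage. The sketch below shows that calculation under the assumption that both coders rated the same items in the same order; the function name `percent_agreement` and the example codes are hypothetical and do not reproduce the study's data.

```python
def percent_agreement(codes_a, codes_b):
    """Percent agreement between two coders.

    agreement % = agreements / (agreements + disagreements) * 100
    Assumes both lists cover the same items in the same order.
    """
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must rate the same non-empty item set.")
    agreements = sum(a == b for a, b in zip(codes_a, codes_b))
    disagreements = len(codes_a) - agreements
    return 100 * agreements / (agreements + disagreements)


# Hypothetical codes (0-3 points) from two coders for five items.
coder_1 = [3, 2, 3, 1, 2]
coder_2 = [3, 2, 2, 1, 2]
print(percent_agreement(coder_1, coder_2))  # 4 of 5 items agree: 80.0
```

A result at or above 80 would meet the threshold cited in the text; items on which the coders disagree are the ones to revisit in discussion.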

Ethical Issues
All participants volunteered to participate in the study and their signatures were obtained accordingly. In the research, there was no harmful application in any way for the participants. During the research, the same information about the research processes was provided to each participant and their personal information was kept confidential.

Findings regarding the evaluation of the modelling competencies of the designed MEAs
The conformity of the designed MEAs to the modeling principles was evaluated by experts and by the teacher candidates. Table 3 presents the average evaluation scores of the modeling activities designed by the groups in the context of the modeling principles. According to the table, the MEAs generated by the PMTs were generally found eligible in terms of the principles of reality (mean = 2.71), model construction (mean = 2.29) and construct documentation (mean = 2.62). On the other hand, the MEAs were considered somewhat eligible in terms of the principles of self-assessment and model generalization, with mean scores of 1.95 and 2.20, respectively.
Note: 1) The relationship between score and level of compliance with the modeling principles is as follows: 0.00-0.74 "not determined"; 0.75-1.49 "not eligible"; 1.50-2.24 "somewhat eligible"; 2.25-3.00 "eligible". 2) Points within the "eligible" range are marked in green and those within the "somewhat eligible" range in yellow.
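The score bands given in the note can be expressed as a simple lookup. The following is a minimal sketch, assuming that the bands are read as half-open intervals (for example, any mean from 1.50 up to but not including 2.25 counts as "somewhat eligible"); the function name `compliance_level` is hypothetical. The means used in the example are the ones reported in the text.

```python
def compliance_level(mean_score):
    """Map a mean score on the 0-3 scale to its compliance level.

    Band edges follow the table note; treating them as half-open
    intervals is an assumption made for this sketch.
    """
    if not 0.0 <= mean_score <= 3.0:
        raise ValueError("Mean score must lie between 0 and 3.")
    if mean_score < 0.75:
        return "not determined"
    if mean_score < 1.50:
        return "not eligible"
    if mean_score < 2.25:
        return "somewhat eligible"
    return "eligible"


# Overall means per principle as reported in the text.
reported_means = {
    "reality": 2.71,
    "model construction": 2.29,
    "construct documentation": 2.62,
    "self-assessment": 1.95,
    "model generalization": 2.20,
}
for principle, score in reported_means.items():
    print(principle, score, compliance_level(score))
```

With these bands, the first three principles fall in the "eligible" range and the last two in the "somewhat eligible" range, which matches the reading given in the paragraph above.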
When the conformity of each group's MEA to the modeling principles was analyzed separately, the activity of the walking death group was found to be generally eligible, scoring 2.38 or above out of three on all principles. While the MEAs of the Pythagoreans and selective permeable groups were evaluated as "eligible" in terms of the principles of reality and model construction, they were considered "somewhat eligible" in terms of the other modeling principles. The activity belonging to the infinite / infinite group was evaluated as "eligible" except for the principles of self-assessment and model generalization. It is noteworthy that the activities of all groups reached their highest mean scores on the reality principle and their lowest mean scores on the self-assessment principle.
The MEAs designed in this study were evaluated according to the modeling principles by the groups that designed them (self-evaluation), by peers and by experts. Table 5 shows the average scores obtained from these evaluations. According to the table, in general terms, the self-evaluations of the groups have higher averages than the peer and expert evaluations. This can be interpreted as each group trusting its own MEA design and tending to score it highly.
Note: 1) The relationship between score and level of compliance with the modeling principles is as follows: 0.00-0.74 "not determined"; 0.75-1.49 "not eligible"; 1.50-2.24 "somewhat eligible"; 2.25-3.00 "eligible". 2) Points within the "eligible" range are marked in green, those within the "somewhat eligible" range in yellow and those within the "ineligible" range in white.
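The comparison between evaluator types can be made explicit by looking at the gap between the self-evaluation mean and the peer and expert means for the same MEA and principle. The sketch below is only an illustration of that comparison; the function name `score_gaps` and the example means are hypothetical and are not values from Table 5.

```python
def score_gaps(self_mean, peer_mean, expert_mean):
    """Return how far the self-evaluation mean lies above the other two.

    Positive values mean the group scored its own MEA higher than
    its peers or the experts did. Values are rounded to two decimals.
    """
    return {
        "self_minus_peer": round(self_mean - peer_mean, 2),
        "self_minus_expert": round(self_mean - expert_mean, 2),
    }


# Hypothetical means for one MEA on one principle (not study data).
gaps = score_gaps(self_mean=2.8, peer_mean=2.4, expert_mean=2.0)
print(gaps)  # {'self_minus_peer': 0.4, 'self_minus_expert': 0.8}
```

Consistently positive gaps across groups and principles would correspond to the pattern described above, in which self-evaluations are higher than peer and expert evaluations.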
Expert evaluations indicate that none of the MEAs designed by the PMTs is well aligned with all of the mathematical modeling principles. While the experts evaluated the designed MEAs as "eligible" in terms of the model generalization principle, they considered them somewhat eligible or ineligible in terms of the self-assessment principle. In addition, it is understood from the expert evaluations that some of the groups' MEAs have deficiencies in the principles of reality, model construction and construct documentation.

Results and discussion
All of the MEAs created by the PMTs were evaluated as eligible in terms of the principles of reality and construct documentation. While two MEAs were found suitable for the model construction principle, all activities except one were considered partially appropriate in terms of the self-assessment and model generalization principles. The compliance of the MEAs with the effective prototype principle was not examined within the scope of this study. Comparison of our findings with the literature suggests that it is difficult to create activities that are completely suitable for all MEA design principles and that developing an appropriate design requires expertise. As a matter of fact, a group of experts in the field of engineering and mathematics education entered the MEA design process and found that their MEA designs were suitable for all modeling principles (Moore & Diefes-Dux, 2004).
The literature indicates that the MEAs designed by (prospective) teachers have deficiencies in compliance with the MEA design principles even when they receive training on MEA design. Deniz (2014) provided teachers with theoretical training on MEAs and analyzed the MEAs they designed. The research findings revealed that all of the activities were completely compatible with the principles of reality and model generalization, but complied with the principle of self-assessment only to some extent. In addition, important issues were identified in the compliance of the activities with the principles of model construction and construct documentation. The compliance of the MEAs with the effective prototype principle was not examined. , as a result of the research they aimed to design MEA with pre-service mathematics teachers, drew attention to the fact that all of the prepared MEAs were well aligned with the principles of reality and model generalization and, in particular, to the abundance of activities that did not meet the construct documentation principle. Similarly, other studies conducted with teachers and prospective teachers indicated that the MEAs were more successful in meeting the principles of reality, construct documentation and model construction, and that there were deficiencies in the context of the self-assessment and model generalization principles (Yu & Chang, 2011). Also Carlson, Larsen and  found that there are problems in the compliance of the MEA designs with the reality principle. Accordingly, it might have been expected that the MEAs designed in this study would not fully meet all of the modeling principles.
Mathematical modeling studies state that the teaching and learning of mathematical modeling is complex and is affected by many factors (Borromeo-Ferri & Blum, 2011). Moreover, model eliciting activities are inherently difficult, and experience is required in both the design and the implementation process (Yu & Chang, 2011). In this study, although they had received training on mathematical modeling and MEA principles, the prospective teachers were creating an MEA for the first time and did not have the chance to conduct enough research on the subject. This may have caused the deficiencies. In addition, the difficulties in integrating MEA designs into traditional learning environments (Galbraith & Clatworthy, 1990; Ji, 2012; Kaiser, 2007), the difficulties experienced in the transitions between mathematical modeling steps (Blomhoj & Kjeldsen, 2006; Thomas & Hart, 2010) and the negative effects of misconceptions that may occur during the modeling problem solving process (Maaß, 2006) on the successful completion of the mathematical modeling process should be taken into account. These shortcomings can prevent designs from reaching the planned targets (Baki & Aydın-Güç, 2014; Maaß, 2006).
When the evaluations of compliance with the modeling principles were examined in detail, it was determined that the groups generally saw their own activities as sufficient and scored them higher than their peers did. Similarly, it was determined that expert evaluations differ from peer and self-evaluations. Expert and non-expert evaluations and processes have been compared in many teaching studies (Chi, Feltovich, & Glaser, 1981; Fadde, 2009). According to Fadde, "a clear measure of expert status is the amount of experience" (2009). In parallel with the findings of this study, some studies have found that novices (students, teachers or teacher candidates) tend to give higher marks than experts (Meyer, 2004; Sancar Tokmak, Incikabi, & Yanpar Yelken, 2012). According to Fadde (2009), novices may not make any effort to understand the meaning of the evaluation criteria, which causes some misinterpretations; experts, on the other hand, take the same measures for each criterion to increase consistency and reliability. These differences may lead to different outcomes in the evaluation. With a similar approach, Incikabi and Kacar (2017), in their study with prospective mathematics teachers, analyzed changes in lesson plan design and pedagogical competence in teaching processes through peer, self and expert evaluations. The results showed that pre-service teachers gave higher efficacy scores in their self-assessments than were given in the peer and expert evaluations. It was stated that this was because students had more self-confidence before receiving any feedback about their teaching practices (Incikabi & Kacar, 2017).
However, it is reported that the differences in the evaluation made by the experts and novices may be due to the misinterpretation of the criteria, the limitations in the methods to evaluate each criterion, the knowledge about the content and skills addressed, and the lack of a common rating strategy (Sancar Tokmak et al., 2012). Accordingly, it is recommended to consider these factors in the evaluation processes of novices and to take measures against these obstacles in the MEA design processes. Moreover, modeling design activities are inherently difficult activities and experience is required both in the design and implementation process. In this regard, providing prospective teachers with opportunities to increase such experiences will support reaching more effective MEA designs.