Using Data Visualization and Data Science to Explore Self-efficacy in the Classroom and Academic Mindset by Grouping Demographic Variables

Janet L. Hanson 1*, Chong Ho Yu 2
1. School of Education, Azusa Pacific University, PO Box 7000, Azusa, California, USA 91702-7000
2. School of Psychology, Azusa Pacific University, PO Box 7000, Azusa, California, USA 91702-7000
* E-mail of the corresponding author: jhason@apu.edu

Abstract
This study explored the effects of students' demographic characteristics on the outcome variable, students' self-efficacy on classroom tasks (SE), using data visualization and data science techniques, which aim to discover patterns in the data. Grouping variables included students' self-reports of their gender (Male vs Female) and cultural identification. Data were drawn from five elementary schools (n=1986 students) and two middle schools (n=1257 students) in one suburban school district in the southwestern U.S. School contextual variables included socioeconomic status (operationalized as percent enrollment in the Free and Reduced Meal Plan) and school level (elementary vs middle school). Main effect variables explored included Individual Mindset (IM), Belonging in the classroom, and Relevance of classroom tasks. JMP Pro 15 and SPSS 26 were used to perform the analyses. Teachers can learn the methods and use the results of this study to improve their understanding of their students in diverse populations. Teacher skills in developing student self-efficacy promote student motivation, leading to improved outcomes.


Introduction
In this project, potential factors contributing to the improvement of student self-efficacy were examined with data mining and visualization methods, which emphasize exploration and pattern seeking rather than dichotomous decision-making based solely on a cut-off point. Traditional exploratory data analysis (EDA) is a precursor of modern data mining. Traditional EDA tools include residual analysis, data re-expression, resistant procedures, and data visualization (Yu, 2017). However, with advances in computing and big data analytics, modern EDA has become more goal-oriented, using clustering, variable screening, and pattern recognition. According to Yu (2014), "data analysis is a process of reducing large amounts of information to parsimonious summaries while remaining accurate in the description of the total data" (p. 10). In order to manage large amounts of data, certain automated data reduction algorithms were introduced into data science. However, some big data analysts today may naively and blindly execute algorithms, leaving human intervention aside and deferring to the black box. The extremely large sample size and high dimensionality of big data still present issues of heterogeneity from sub-populations, noise accumulation from supposedly irrelevant data, "spurious correlations" with non-related variables, and incidental endogeneity, which result in biases in the model. These features of big data make traditional statistical methods invalid (Fan, Han, & Liu, 2014). Exploration of the data (EDA) and human insight are still as relevant as ever. Specifically, data visualization, which necessitates human intervention, was utilized to extract insight from the data.

Exploratory data analysis (EDA)
EDA is a philosophy in which the researcher comes to the data with no assumptions, preconceived ideas, or a priori hypotheses, and with an open and sceptical mind. Traditional EDA does not seek to confirm or disprove a conjecture or hypothesis (Yu, 2014, 2017). John Tukey (1977), the father of Exploratory Data Analysis (EDA), described the EDA process as exploring the data until a "plausible story" can be developed to explain the relationships of the variables. Though Tukey promoted EDA nearly half a century ago, the process is just coming into broad use in the world of industry. Instead, confirmatory data analysis (CDA), in the Fisherian tradition, has been the main approach for evaluating data taught in academia, especially in psychology and the social sciences. CDA requires a variety of assumptions about the data before beginning the process of analysis, including normality, homogeneity, and independence of the data points, and seeks to confirm or test a priori hypotheses. Deming even went so far as to declare hypothesis testing "one of the evils taught in statistics courses" (Boardman, 1994, in Yu, 2014).
In contrast, EDA can be considered an end in itself, or it can be combined with Confirmatory Data Analysis (CDA), using "critical tests as may be applied to the data" (Fisher, 1932, in Yu, 2014). Tukey, on the other hand, encouraged everyone using data to come to the process with a spirit of inquiry.

Data visualization (DV)
DV is a methodology of exploring the data, which extends from the philosophy of EDA and the process of Total Quality Management (TQM). DV, which has been called a process of analogy-making, can facilitate data analysis by "developing rich descriptions through graphic summary, robust statistics, and model fit indicators" (Yu, 2014, p. 7). EDA proceeds with no assumptions, seeking patterns, and using tools of data visualization because the brain can be easily deceived by numbers, while it is much easier for the eyes to quickly understand relationships in the data through visuals (Velleman & Hoaglin, 1981; Yu, 2014, 2017).

Resistance
Tools of EDA that are resistant to outliers help in exploring data without the outliers distorting the picture. Resistance is different from robustness, which refers to EDA methods that are not influenced by parametric assumption violations, such as normality, heteroscedasticity, and independence of data. Finding tools that are resistant to the influence of outliers in the data has been a boon to data analysts. Tukey developed one of the best graphical techniques used today, the boxplot or five-number summary. The boxplot creates a visual summary of the data using five important data points, namely, the minimum, the 25th percentile, the median, the 75th percentile, and the maximum value (Velleman & Hoaglin, 1981; Yu, 2014, 2017). Appendix D shows a comparison of SE medians and confidence intervals by cultural identity using boxplots and diamond plots with data from this study.
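The five-number summary behind the boxplot, and the idea of resistance, can be illustrated with a minimal Python sketch. The scores are hypothetical and are not taken from the study data; the quartile rule (the "inclusive" method of the standard library) is an assumption, since statistical packages differ slightly in how they compute percentiles.

```python
from statistics import mean, median, quantiles

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    data = sorted(values)
    # "inclusive" is one common quartile convention; other software may differ.
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return (data[0], q1, q2, q3, data[-1])

# Hypothetical scores, identical except for one extreme value.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
scores_with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 100]

print(five_number_summary(scores))
print(five_number_summary(scores_with_outlier))

# The median is resistant: the outlier leaves it unchanged.
print(median(scores), median(scores_with_outlier))
# The mean is not resistant: the outlier pulls it upward.
print(mean(scores), mean(scores_with_outlier))
```

In this sketch the quartiles and the median are the same for both lists, while the mean and the maximum change, which is the sense in which the boxplot's central summary is resistant to outliers.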

Revelation
This is the primary tool for EDA and is governed by considerations of the data format for the task, the complexity, and the distribution for determining the appropriateness of the visualization technique. For example, function-driven plots are useful when the researcher's goal is to find relationships between the variables, but not useful when looking for patterns in the data. Smooth plots are useful as teaching models whereas dynamic plots are often more useful than static, though dynamic effects cannot be shown in a print format. Dynamic models, such as the bubble plot, can lead to cognitive overload. Both JMP and Tableau software provide the option of creating multipanel visualizations (Velleman & Hoaglin, 1981; Yu, 2014, 2017). Appendix E shows a multi-panel visualization of data used in this study.

Definitions
Individual growth mindset (Items 1-3 reversed) is "the way in which children interpret human behavior and their beliefs about the stability of human traits" (Heyman & Dweck, 1998, p. 391).
Sense of belonging in the classroom (Items 4-7) is "[T]he extent to which students feel personally accepted, respected, included, and supported in the school social environment" (Goodenow & Grady, 1993, in Ma, 2003).
Task relevance (Items 8-11) is "a student's sense that the subject matter he or she is studying is interesting and holds value [to them]" (Eccles et al., 1983, in Farrington et al., 2012).
Student self-efficacy on classroom tasks (Items 12-15) is domain specific, meaning that a student may feel able to complete classroom tasks, but not feel they can participate well on a sport team. As an expectancy theory, students with high self-efficacy for a certain task would expect to be able to accomplish that task within the given context. Self-efficacy beliefs are comprised of two components: first is a belief that a specific behavior will produce an outcome, and secondly, the belief in one's personal efficacy to produce that behavior in the present context and in future situations (Hanson, 2017a).
Teacher self-efficacy is domain specific and refers to teachers' beliefs that, by their own efforts, they can positively influence their students' learning and outcomes (Wheatley, 2005).

Theoretical Lens
This study used Bandura's Social Cognitive Theory (SCT) as the theoretical lens for designing the study and developing conclusions from the results. In SCT, self-efficacy has been shown to be positively associated with performance (Dybowski, Sehner, & Harendza, 2017). SE is defined as "one's belief in his or her ability to be successful at a task or in fulfilling a goal" (Bandura, 1977, 1986, 1996, 2001). Lawmakers encourage research on psycho-social variables to improve student self-efficacy in the classroom (Hanson, 2017a, 2017b). Researchers suggest elements of classroom culture explain improvements in SE. The literature provides limited insights on how demographic variables may influence classroom culture and a student's SE. This project also takes into account recent research that indicated a non-linear relationship between academic self-concept and academic performance (Yu & Lee, 2020; Yu, Lee, Gan, & Brown, 2017). Students' self-efficacy on classroom tasks in face-to-face and online collaborative group work is explained in part by their trust in, and the influence of, their teacher leader (Du et al., 2018). Teachers' use of data analytics provides knowledge and skills to develop differentiated, autonomy-supportive instructional practices for their students' diverse needs and backgrounds; such practices have been shown to be positively related to students' choice to engage in classroom tasks (Fong, Dillard, & Hatcher, 2019). Tunç, Çakıroğlu, and Bulut (2019) found that the effectiveness of any teaching strategy or device is dependent on the teacher's knowledge and skill in developing and implementing it effectively. Increasingly, teachers are finding themselves engaged in online teaching environments, where teachers' knowledge of their students' unique needs becomes even more important due to the reduced transactional distance (Hanson, Loose, & Reveles, 2020).
Liu, Joy, and Griffiths (2010) explained that students in online learning environments become more dependent on their peers. Teacher development of group collaborations may become challenging if the teacher has little knowledge of their students' unique demographics and perceptions of the classroom culture. The importance of teacher use of data analytics becomes apparent as it improves teachers' skills in designing and implementing instruction tailored to their unique students' demographics. Teachers' self-efficacy improves as a result of experiencing their students' academic successes (Du et al., 2018).

Gender
The literature suggests a moderating influence exists from the Gender demographic on main effects of students' perceptions in the classroom through socialized gender roles (Carney, Kim, Bright, & Hazler, 2020; Schwarzer, 2008). One's sense of gender role is quantified as a moderating variable on one's sense of efficacy on classroom tasks. For example, female gender roles have been studied for the relationship with students' perceptions of their ability to perform classroom tasks in science and mathematics. Students' perceptions of ability can influence choices of their future studies and careers.

School level
A student's school level has been shown to influence students' perceptions in the classroom through transitions. Boundary crossing affects culture and students' sense of efficacy on classroom tasks. Prior research suggested elementary school students reported stronger self-efficacy than middle and high school students (Hanson et al., 2017a; Pajares, Johnson, & Usher, 2007). For example, middle school students experience changes in classroom structure (teachers change with subject matter expertise), increased expectations by teachers for independent work, and reduced teacher/student social relationship building. School level showed an inverse relationship to students' sense of belonging and depressive symptoms related to the transition from middle to high school (Newman, Newman, Griffen, O'Connor, & Spas, 2007).

Culture
Culture has been shown to influence student satisfaction in the classroom. Ideal classrooms that promote academic achievement have high satisfaction and cohesiveness and low friction, and do not have excessive competition or task difficulty (Fraser, 1984; Fraser & O'Brien, 1985, cited in McMahon et al., 2009).

Data sources
Data were sourced from observational data of students' self-reports from five elementary schools (n=1986 students) and two middle schools (n=1257 students) in one suburban school district in the southwestern U.S., with approximately an 84% response rate. The Project for Educational Research that Scales (PERTS) survey was administered during the 2015/2016 school year for use in the District's Local Control Accountability Plan (LCAP) culture assessment for planning and accountability purposes. Data were entered into JMP and reviewed for outliers and missing data.

Instrument
The Project for Educational Research that Scales (PERTS, 2015) survey, consisting of Likert-style items, was used to collect students' perceptions of their classroom academic mindsets, including students' individual mindset beliefs (items 1-3, reversed), sense of belonging in the classroom (items 4-7), relevance of classroom tasks (items 8-11), and students' self-efficacy on classroom tasks (items 12-15). The scale reliability was estimated in the form of internal consistency, yielding a total Cronbach's α = .801 (Hanson, 2017b). Examples of the questions include: "You have a certain amount of intelligence, and you really can't do much to change it," "I feel like I belong in this class," and "My class gives me useful preparation for what I plan to do in life." Students self-reported their responses on a scale of 1 to 6, from 1 = "Strongly disagree" to 6 = "Strongly agree." Appendix A shows the LCAP survey used to collect the data for this study.
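Reverse scoring and internal consistency can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the responses are hypothetical, the function names are invented for this example, and the reported α = .801 was produced by the authors' own software, not by this code. On a 1 to 6 scale, a reversed item is rescored as 7 minus the response; Cronbach's α is computed as k/(k-1) × (1 − Σ item variances / variance of total scores).

```python
from statistics import variance

def reverse_score(response, scale_min=1, scale_max=6):
    """Reverse a Likert response, e.g. 1 -> 6 and 6 -> 1 on a 1-6 scale."""
    return scale_min + scale_max - response

def cronbach_alpha(rows):
    """Cronbach's alpha; `rows` holds one list of item scores per respondent."""
    k = len(rows[0])                              # number of items
    items = list(zip(*rows))                      # transpose to per-item columns
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])  # variance of summed scores
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical responses from five students on three reverse-keyed items.
raw = [[5, 6, 5], [2, 1, 2], [4, 4, 5], [3, 2, 2], [6, 5, 6]]
rescored = [[reverse_score(x) for x in row] for row in raw]

print(rescored[0])
print(round(cronbach_alpha(rescored), 3))
```

Reversing every item in a set leaves α unchanged; reversal matters when reverse-keyed items are combined with items worded in the opposite direction.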

Analyses
Journal of Education and Practice www.iiste.org ISSN 2222-1735 (Paper) ISSN 2222-288X (Online) Vol.11, No.24, 2020
Scatterplot 3D was used to identify outliers in the data prior to performing the analysis. No patterns were found and extreme observations were removed before further analysis. Participants included White/Non-Hispanic/Latino (n = 1245), African American (n = 108), Hispanic/Latino (n = 413), Pacific Islander (n = 27), Asian (n = 56), Combination (n = 649), No answer (n = 45), Other (n = 169), Native American (n = 20), and American (n = 46). The two school levels tested were elementary (n = 1521) and middle school (n = 1257). Gender categories by number included Female (n = 1360) and Male (n = 1385). Because the sample size is fairly large, traditional statistical methods tend to yield significant results even if the effects are trivial. To rectify the situation, robust data science methods were utilized. Figure 1 shows that the data points are spread across the whole plot. Because of over-plotting, there is a possibility that the pattern is hidden inside the data cloud. Although IM, Belonging, and Relevance show small to medium effects with SE (r = .23, .42, and .37 respectively), the significance test for Pearson's correlation coefficient is sensitive to the sample size, which necessitates data visualization. Appendix B shows the relationships of the variables using a scatterplot with density ellipse, histograms, regression line, and correlations for the main effects variables and SE.
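The dependence of significance on sample size can be made concrete with the standard test statistic for a Pearson correlation, t = r·sqrt(n − 2) / sqrt(1 − r²). The sketch below holds r fixed at .23 (the IM value reported above) and varies n; both sample sizes are illustrative assumptions, not the exact analytic n of the study.

```python
from math import sqrt

def correlation_t(r, n):
    """t statistic for testing H0: rho = 0, given Pearson r and sample size n."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

r = 0.23
t_small = correlation_t(r, 30)     # a classroom-sized sample (assumed)
t_large = correlation_t(r, 2778)   # a district-sized sample (assumed)

# The same correlation gives a very different test statistic as n grows.
print(round(t_small, 2), round(t_large, 2))
```

With the conventional two-sided threshold of roughly 2, the same r = .23 would not reach significance in the small sample but would far exceed the threshold in the large one, which is why the effect size and the visual pattern carry more information here than the p-value.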

Heat Map
In a heat map, a high concentration of observations is shown in a hue closer to the red end of the spectrum, and a low concentration in a hue closer to the blue end. These heat maps show distinct patterns, with the majority of data grouped in a fairly linear pattern, evidencing a relationship between the independent variables and the outcome variable, SE. Appendix C shows the results of the heat maps for the main effects variables.
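The quantity a heat map encodes is simply the number of observations falling in each cell of a grid. A minimal sketch, using hypothetical Belonging and SE scores and an assumed bin width of one scale point (JMP's actual binning is not described in the paper):

```python
from collections import Counter

def cell_counts(xs, ys, bin_width=1.0):
    """Count observations per (x_bin, y_bin) cell; the count drives the hue."""
    counts = Counter()
    for x, y in zip(xs, ys):
        cell = (int(x // bin_width), int(y // bin_width))
        counts[cell] += 1
    return counts

# Hypothetical scale scores on the 1-6 range.
belonging = [4.0, 4.2, 4.5, 4.9, 2.1, 5.5]
se = [4.1, 4.3, 4.8, 4.0, 2.5, 5.2]

counts = cell_counts(belonging, se)
# The densest cell would be drawn nearest the red end of the spectrum.
print(counts.most_common(1))
```

Because each cell is summarized by a count rather than by individual points, the heat map is not affected by over-plotting in the way a raw scatterplot is.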

Data Reduction
Attempting to predict the outcome with such over-plotting of the large data sample could result in large individual residuals. Without reducing the data, there are too many levels and they become "noise." As a remedy, this study used data reduction techniques. For Self-efficacy (SE) on the Y axis and Main Effects on the X axis, the data were reduced to "high" and "low" categories. According to Box (1987), all models are wrong, though some are more useful than others. Both a complicated and a simple model may be appropriate at different times. In this case it is better to go for a reduced model.
Data reduction and visualization methods were used to suppress noise. SE, relevance, belonging, and IM were converted from continuous to binary (high-low) based on the median split. Afterwards, mosaic plots and Fisher's exact test were employed to examine their inter-relationships. The result of the Chi-square test is subject to the sample size. Fisher's exact test was employed to rectify the situation because in this test an empirical sampling distribution, rather than a theoretical sampling distribution, was used for computing the probability. The test is so named because it yields the exact p-value based on the empirical sampling distribution resulting from permutation.
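The median split and the 2×2 exact test can be sketched as follows. This is a minimal illustration under stated assumptions: the scores are hypothetical, values equal to the median are placed in the "low" group (the paper does not say how ties were handled), and the two-sided p-value is obtained by enumerating every table with the observed margins under the hypergeometric distribution, which is the textbook form of Fisher's exact test and may differ in detail from the JMP or SPSS implementation.

```python
from math import comb
from statistics import median

def median_split(values):
    """Label each value 'high' if above the median, else 'low' (ties go low)."""
    m = median(values)
    return ["high" if v > m else "low" for v in values]

def crosstab_2x2(x_labels, y_labels):
    """Return cell counts (a, b, c, d) = (low/low, low/high, high/low, high/high)."""
    a = b = c = d = 0
    for x, y in zip(x_labels, y_labels):
        if x == "low" and y == "low":
            a += 1
        elif x == "low":
            b += 1
        elif y == "low":
            c += 1
        else:
            d += 1
    return a, b, c, d

def fisher_exact_2x2(a, b, c, d):
    """Two-sided exact p-value: sum of all tables no more probable than observed."""
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):  # hypergeometric probability of x in the top-left cell
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))

# Hypothetical Relevance and SE scale scores for eight students.
relevance = [2.0, 2.5, 3.0, 3.5, 4.5, 5.0, 5.5, 6.0]
se = [2.5, 3.0, 2.0, 4.5, 3.5, 5.0, 5.5, 6.0]

table = crosstab_2x2(median_split(relevance), median_split(se))
p = fisher_exact_2x2(*table)
print(table, round(p, 4))
```

The four cell counts returned by `crosstab_2x2` are also what a mosaic plot draws, with each rectangle's area proportional to its cell's share of the sample.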

Results
First, the inter-relationships of the data were reviewed using non-parametric Kendall's correlations, linking and brushing, and Mosaic plots of SE versus the independent variables Relevance and IM. Using the categorical "Low" and "High" data, all PERTS variables showed statistically significant relationships with SE. Figure 2 shows the p-values for the relationships of SE with the main effects. Whether these relationships hold for all subgroups (e.g. gender) was examined in further analyses.

The Mosaic plots
The Mosaic plots displayed in Figures 3 and 4 are the graphical equivalent of a crosstab table. In the mosaic plot the percentage of each cell is depicted by the size of the rectangle. When relevance is low, the portion of low SE outweighs the portion of high SE, but this is reversed when relevance is high. This relationship is confirmed by Fisher's exact test (p < .0001) and the result is consistent across both males and females. Figure 4 indicates that when IM is low, the portion of low SE slightly outnumbers the portion of high SE. When IM is high, the ratio of the two is reversed. Fisher's exact test confirmed this relationship (p < .0001) and the result is consistent across both males and females.

Linking and Brushing
Next, the linking and brushing results were inspected visually. Figure 5 shows the visual link created by this analysis between histograms of multiple variables. By clicking on the high end of the histogram for the outcome variable (SE), the connected main effects graphs are then highlighted to show the item distributions that produced the SE scores.

Interpretation of the results of Figure 5
Gender: The Male gender distribution showed a greater proportion in the "High" category for SE than the Female gender distribution.
School Level effect on PERTS variables: A greater proportion of elementary students reported in the "High" category for SE than did middle school students.
Student class size showed no trend: the data were widely distributed across an approximately normal bell curve, suggesting that class size had no effect in explaining "High" perceptions of student SE.

Analysis of means (ANOM)
ANOM was the third visualization technique used to explore relationships in the data. ANOM is a graphical method showing the distance between each category's mean and the grand mean. In Figure 6, the grey boxes show the confidence intervals; a wider range represents a lower likelihood of accurate prediction. The dots show each category mean's distance from the grand mean. Appendix D shows Box and Diamond Plots for a visual comparison of the medians and confidence intervals, with a Tukey HSD test showing statistical significance at the α < .05 level.

Interpretation of the ANOM Plot
Self-efficacy: The Asian category shows the highest overall Self-efficacy mean, the White category is next, and the Combination category mean is equal to the grand mean. All other categories show means lower than the grand mean, with Hispanic, Pacific Islander, and American showing the lowest means.
Main effects median comparison results: The Asian category was in the highest group for SE score, the second highest group in IM (with White, Combination, and Other), the lowest in Belonging, and the second lowest in Relevance. The African American category was in the middle scoring group in SE, second lowest in IM, highest in Belonging, and highest in Relevance. Students who wrote in American as their cultural identity scored in the lowest group in SE, the highest group in IM, the middle group for Belonging, and the middle group in Relevance. The Combination category scored in the second highest group in SE, second highest in IM, the middle group in Belonging, and the middle group in Relevance. The Hispanic category scored in the lowest group in SE, and the middle group in IM, Belonging, and Relevance. The Native American category scored highest in SE, lowest in IM, second highest in Belonging, and highest, along with African American, in Relevance of classroom tasks.
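The quantity plotted as dots in an ANOM chart, each category mean's deviation from the grand mean, can be computed directly. The sketch below uses hypothetical group labels and scores; it deliberately omits the decision limits (the shaded band), because those depend on a tabulated ANOM critical value that is not reproduced here.

```python
from statistics import mean

def anom_deviations(groups):
    """Return (grand_mean, {group: group_mean - grand_mean})."""
    all_values = [v for values in groups.values() for v in values]
    grand = mean(all_values)  # grand mean over all observations
    deviations = {name: mean(values) - grand for name, values in groups.items()}
    return grand, deviations

# Hypothetical SE scores for three placeholder categories.
se_by_group = {
    "Group A": [4, 5, 6],
    "Group B": [3, 4, 5],
    "Group C": [2, 3, 4],
}

grand, deviations = anom_deviations(se_by_group)
print(grand, deviations)
```

A group whose deviation is zero sits on the grand-mean line; positive and negative deviations plot above and below it.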

Discussion and Conclusion
This study seeks to fill a gap in the literature by exploring relationships between specific psychosocial variables associated with the classroom culture that have not been fully reported. This study compared the influences of demographic variables on self-efficacy and main effects using data visualization techniques to find patterns in the data. Teachers can use the results of this study to improve their understanding of their students in diverse populations. The results suggest that teachers can be informed of trends in the classroom culture based upon unique combinations of demographics. However, it is important to point out that although trends may exist as averages or medians by groups, each individual is unique and may diverge from the central tendency statistics.
Administrators may provide professional guidelines to improve teachers' understanding of strategies to promote students' sense of belonging (shown to be the strongest predictor of students' SE) and develop task relevance across diverse cultural and demographic backgrounds. For example, teachers in "Medium" and "High" SES schools (operationalized by FRMP percentage school enrolments) can focus efforts on the IM factor along with Belonging and Relevance factors. Teachers in Middle School contexts can focus on learning new ways to build relationships with students to improve students' sense of belonging by reducing their sense of isolation. Identifying real world significance by diverse demographics may improve the students' sense of relevance of classroom tasks, thereby increasing student motivation and perseverance. Increased student engagement and connection have been shown to increase one's belief in their ability to perform the tasks.
When teachers are able to detect their students' perceptions of the classroom culture through self-reports on measures such as the PERTS scale and recognize differing perceptions resulting from students' demographic identities, teachers can modify instruction and resources accordingly. The goal is to improve equitable educational opportunity for all students.

Implications
The full diversity of results cannot be provided in this paper. Hence, schools are encouraged to collect their own data and perform their own exploration through a dashboard system that lets administrators, faculty, and staff explore the patterns in their own data. Appendix E shows a dashboard that can increase the ease of exploration of the patterns in the data.

Limitations
This study used data from a single district in a suburban setting of a large southwestern state of the US. Results may vary based upon other school contexts. Some sample sizes by cultural demographic categories were small and may not reflect the larger population. In addition, students self-selected their cultural identification based upon their personal perception. For example, the write-in category "American" was not defined. Nor were the students' perceptions verified through follow-up interviews, due to the confidentiality of the participants and coding that kept the participants' identities anonymous.

Recommendations
Further exploration of demographic data collected from schools can be performed that disaggregates school outcomes as a dependent variable to identify specific influences on student learning. Teachers' epistemological beliefs can be explored and compared to students' perceptions of variables of their classroom cultures. Next, a study using multi-level modelling would provide further insights to determine the influence of correlated error from multi-level groups of data nested or clustered together. For example, a main effect variable, Sense of belonging, is a grouping variable of students within classrooms, while FRMP is a grouping variable calculated at the school level. Further, qualitative studies can be performed to validate the findings that data visualization yields from the exploration of a school's data set, especially the categories of cultural identification that students self-select, in order to develop validity of the constructs used by administrators and teachers for differentiating instruction. A qualitative study exploring how teachers experience professional development for the use of data exploration would provide insights into ways to improve pre-service and in-service teacher education in this area. Finally, a quantitative study exploring the relationship between teacher self-efficacy in the classroom and students' self-efficacy on classroom tasks might provide further insights into how teachers' self-efficacy influences the classroom culture.