Instrumentation

From Practical Statistics for Educators

When attempting to measure a phenomenon in educational research, a reliable and valid instrument is necessary. Below are descriptions of several instruments. The writing samples come from dissertation proposals.


Levels of Use (LoU).

This instrument is one of three diagnostic instruments of the Concerns-Based Adoption Model (CBAM), which evolved out of the educational change work of Fuller, Hall, Dirksen, and George during the 1970s (SEDL, 2006). The purpose of the LoU structured interview is to identify teachers’ current behaviors with regard to a specific innovation. The instrument uses a branching technique based on operationally defined phenomena to differentiate eight Levels of Use and the decision points between levels (see Appendix E). The district will identify a research-based instructional strategy as the innovation to be measured before the study begins. The LoU breaks use and nonuse of the innovation, or instructional strategy, into a continuum of eight categories: (a) Nonuse, (b) Orientation, (c) Preparation, (d) Mechanical Use, (e) Routine, (f) Refinement, (g) Integration, and (h) Renewal. These levels characterize each teacher’s development in acquiring new skills and using the innovation. Each level describes a distinct set of behavioral actions and related understandings of the innovation and its use, and operational definitions have been developed for each Level of Use.

Validity of the LoU was established using ethnographic methodology. First, teachers were assigned LoU ratings based on interviews using the instrument. These ratings were compared to ratings assigned to the same teachers by (a) an observer who spent a full day observing the teacher, and (b) an independent rater who read the observer’s notes and assigned a rating based on their content. Correlations between interview-based LoU ratings and these two sets of ratings were .98 and .65, respectively. Inter-rater reliability for the LoU ratings was established by converting the ratings to numeric values; this analysis yielded a coefficient of .98 (Cronbach’s alpha).
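The inter-rater coefficient above comes from treating each rater’s numeric rating as an “item” and applying Cronbach’s alpha. A minimal sketch of the computation (the function name and the ratings below are illustrative, not the study’s actual data):

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a subjects-by-items matrix of numeric scores.

    For inter-rater reliability, each column holds one rater's ratings.
    """
    k = len(scores[0])                       # number of items (here, raters)
    columns = list(zip(*scores))             # one tuple of scores per item
    item_vars = [variance(col) for col in columns]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical LoU ratings (numeric values on the 8-level continuum)
# for five teachers rated by two raters.
ratings = [[1, 2], [2, 2], [3, 4], [4, 4], [5, 5]]
print(round(cronbach_alpha(ratings), 2))     # -> 0.96 (high agreement)
```

When the raters agree perfectly, the formula yields exactly 1.0; disagreement inflates the item variances relative to the total variance and pulls the coefficient down.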


contributed by Jennifer Mitchell, EdD


Assessment of Reading Comprehension (ARC).

This reading comprehension assessment was developed by the researcher. Reliability and validity data for the Assessment of Reading Comprehension (ARC; Form A and Form B) were collected during a pilot study. The instrument was designed to reflect the comprehension strands measured on the Connecticut Mastery Test (CMT): (a) forming a general understanding, (b) developing an interpretation, (c) making reader/text connections, and (d) examining the content/structure of text (CSDE, 2006; see Appendix B). The researcher collected evidence for content validity by having a panel of reading experts review the ARC. The instrument was revised to more accurately reflect question stems on the CMT, and the panel determined that it had strong content validity. The reliability estimates indicate strong total-test internal consistency: coefficient values for both Form A and Form B were .85 (Cronbach’s alpha). The alternate-form reliability correlation for the ARC was .76, indicating a high positive correlation between Form A (pretest) and Form B (posttest). Refer to Appendix C for a summary of procedures conducted during the ARC pilot study and Appendix D for copies of Form A and Form B of the ARC.


contributed by Jennifer Mitchell, EdD


Gates-MacGinitie Reading Test (GMRT) and Degrees of Reading Power (DRP).

Students will also be administered either the Gates-MacGinitie Reading Test (GMRT) or the Degrees of Reading Power (DRP). Data from one of these instruments will serve as a covariate to produce adjusted means for students’ initial reading achievement. The district’s reading and language arts coordinator will determine which assessment to administer based on which instrument yields the most valuable information for the district. Refer to Appendix C for reliability and validity information for both instruments.


contributed by Jennifer Mitchell, EdD


Structured Coaching Log (SCL).

The purpose of the coaching logs is to document the events that occur during the coaching treatments (independent variable) throughout the 10-week quasi-experiment. The SCL will document all professional development training components and coaching strategies implemented with each teacher. Log codes will include a teacher code, a professional development component code, the amount of time spent on each training component, and the instructional strategy focus of each coaching session. Codes have been predetermined by the researcher to create consistent and standard log entries (see Appendix F). Coaches will be trained to use these codes. Evidence for content validity (Gall, Gall, & Borg, 2003) of the SCL was gathered during a pilot study (see Appendix E).


contributed by Jennifer Mitchell, EdD


The School Counselor Activity Rating Scale.

This scale was developed by Janna L. Scarborough, Ph.D., NCC, NCSC, ACS, Assistant Professor and School Counseling Program Coordinator, Counseling & Human Services, Syracuse University. The developer has granted permission to use the instrument.

The School Counselor Activity Rating Scale supports a logical method of evaluation that includes (a) examining the rationale for each objective within each subgroup of the rating scale as defined by the instrument (coordination, consultation, curriculum, and other activities); (b) examining the consequences of achieving each objective, as defined by preferred and actual activities; and (c) considering higher-order goals, which in New York State align with the comprehensive model of school counseling. The scale was developed by establishing a list of work activities that reflected the job of school counselors. Task statements were created to reflect the activities under the four major interventions described in the National Model for School Counseling Programs (ASCA, 2003). Items described activities in counseling (individual and group), consultation, coordination, curriculum (classroom lessons), and other duties.

The School Counselor Activity Rating Scale uses a response format that asks school counselors how often an activity is performed. The author recognizes that a verbal frequency scale has limitations but selected it for its perceived ease, comprehensiveness, and flexibility. Two types of frequency are measured, actual and preferred, on a 5-point rating scale: (1) never do this; (2) rarely do this; (3) occasionally do this; (4) frequently do this; and (5) routinely do this.

Content validity of the School Counselor Activity Rating Scale was supported by administering a pretest to check for production mistakes (Scarborough, 2005), and the instrument was also reviewed by professionals in the school counseling field. A field test of the survey was then conducted; construct validity was examined through factor analysis with varimax rotation and by reviewing the results of a one-way ANOVA (Scarborough, 2005). Internal consistency was estimated with Cronbach’s coefficient alpha for each subscale of the survey (Scarborough, 2005, p. 278). The coefficient alpha results for each subscale are as follows: counseling, .85 for actual and .83 for preferred; coordination, .84 for actual and .85 for preferred; consultation, .75 for actual and .77 for preferred; and curriculum, .93 for actual and .90 for preferred (Scarborough, 2005).


contributed by Deborah Hardy, EdD


Readiness Survey.

The Readiness Survey (Carey, 2005) was developed to help school counselors and administrators assess their district’s readiness to implement the American School Counselor Association National Model (ASCA, 2000) and to determine areas that will need to be addressed to implement the National Model successfully (Poynton, 2005). The survey identifies areas of need for implementation and diagnoses readiness problems for integration into local school districts.

The Readiness Survey (Carey, 2005) is composed of seven indicator areas: community support, leadership, guidance curriculum, staffing time and use, school counselors’ beliefs and attitudes, school counselors’ skills, and district resources. The survey uses a 3-point rating scale: (1) like my district; (2) somewhat like my district; (3) not like my district. Validity and reliability of the instrument are in the process of being determined by the University of Massachusetts National Outreach Center for School Counseling.


contributed by Deborah Hardy, EdD


The Gates-MacGinitie Reading Test.

The Gates-MacGinitie Reading Test, Fourth Edition (GMRT-4; MacGinitie et al., 2002) will be used in the study; it will be administered to students in May 2007 to assess their level of reading achievement. The GMRT-4 has strong reliability and validity. The reliability estimates indicate strong total-test and subtest internal consistency, with coefficient values at or above .90. Content validity was documented through a test-development process that identified the scope of the subtests and effective items within subtests. Construct validity is supported by strong intercorrelations between subtest and total test scores. Students’ raw scores will be converted into national stanines, normal curve equivalents, percentile ranks, grade equivalents, and extended scale scores (MacGinitie et al., 2002).


contributed by Patricia Cosentino, EdD


The Roxy Kindergarten Inventory of Skills.

Another instrument that will be used to assess the kindergarten students is The Roxy (pseudonym) Kindergarten Inventory of Skills, a district assessment. It will assess students in the following content areas: upper- and lowercase letter recognition, rhyme recognition and rhyme production, initial sound production, oral blending, and oral segmentation. Content validity was originally established through the design of the test: literacy experts from the Roxy district reviewed the Connecticut State Frameworks, examined alternate tests, and included the important concepts in the inventory. Additional evidence of content validity will be gathered by having a jury of 10 experts, including kindergarten and first-grade teachers and early childhood administrators, review the document and validate its content against the Connecticut State Frameworks. The instrument was used in a pilot study in the spring of 2006, which provided evidence of construct validity: the 26 students who were deemed to be below grade level and struggling in kindergarten performed poorly on the assessment, whereas students who performed on grade level in class scored on grade level on the assessment.


contributed by Patricia Cosentino, EdD

The California Measure of Mental Motivation (CM3)

The California Measure of Mental Motivation (CM3) is a quantitative instrument focused on measuring cognitive competencies (Giancarlo, 2010). The CM3 is administered to measure cognitive engagement and motivation toward problem solving and learning (Giancarlo, Blohm, & Urdan, 2004). The CM3 comprises four major scales, learning orientation, creative problem solving, mental focus, and cognitive integrity, measured by approximately 25 items (Giancarlo et al., 2004). These four factors demonstrate stability across study samples, and scales derived from the major factors correlate with known measures of student motivation and achievement (Giancarlo et al., 2004). Level II+ of the CM3 adds a fifth scale, scholarly rigor, and Level III adds a sixth, technical orientation.

The response format is an X-point Likert scale with anchors ranging from "strongly agree" to "strongly disagree." Sample items from the instrument are not available for view due to test security. Scores are reported on a 50-point metric: scores from 0 – 9 points represent individuals who are “strongly negatively opposed” to a particular characteristic; scores from 10 – 19 reflect “somewhat negative” perceptions; scores in the 20 – 30 range are considered “ambivalent;” scores in the 31 – 40 range are “somewhat disposed” toward the topic; and scores of 41 and above are “strongly disposed” to the attribute (Giancarlo, 2010, p. 26).

The CM3 is both a valid and reliable quantitative instrument (Giancarlo et al., 2004). Cronbach’s alpha coefficient was used to evaluate the internal consistency of scores on the four subscales of the 25-item version; across the studies conducted, values ranged from .53 to .83 (Giancarlo et al., 2004). The reliability estimates for learning orientation ranged from .79 to .83 across the various studies. Creative problem solving produced alpha coefficients ranging from .70 to .77, mental focus from .79 to .83, and cognitive integrity from .53 to .63 (Giancarlo et al., 2004).
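The 50-point score bands quoted from Giancarlo (2010) amount to a simple lookup from score range to disposition label. A minimal sketch of that mapping (the function name is mine, not part of the CM3 materials):

```python
def cm3_disposition(score):
    """Map a CM3 scale score on the 0-50 metric to its reported
    disposition band (Giancarlo, 2010, p. 26)."""
    if not 0 <= score <= 50:
        raise ValueError("CM3 scores are reported on a 0-50 metric")
    if score <= 9:
        return "strongly negatively opposed"
    if score <= 19:
        return "somewhat negative"
    if score <= 30:
        return "ambivalent"
    if score <= 40:
        return "somewhat disposed"
    return "strongly disposed"

print(cm3_disposition(34))   # -> somewhat disposed
```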


References:

Giancarlo, C. A. (2010). The California Measure of Mental Motivation: User manual. Millbrae, CA: California Academic Press.

Giancarlo, C. A., Blohm, S. W., & Urdan, T. (2004). Assessing secondary students’ disposition toward critical thinking: Development of the California Measure of Mental Motivation. Educational and Psychological Measurement, 64(2), 347-364.


contributed by Scott Trungadi, Cohort 8


Maslach Burnout Inventory - Educators Survey (MBI-ES)

The Maslach Burnout Inventory-Educators Survey (MBI-ES; Maslach, Jackson, & Leiter, 1996) is designed to assess and measure levels of professional burnout in education professions. Burnout is a psychological syndrome of emotional exhaustion that depletes workers’ emotional resources and prevents the human service worker from contributing on a psychological level to their clients. The MBI-ES can help people develop awareness of whether burnout is something they need to address in order to understand personal feelings potentially related to job satisfaction, stress levels, and overall motivation (Maslach, Jackson, & Leiter, 1996). The MBI-ES does not measure specific stressors or particular reasons for burnout; instead, it is used to determine whether participants are experiencing barely noticeable or major feelings, attitudes, or ideas of burnout.

The MBI-ES takes about 10 to 15 minutes to complete and consists of 22 items divided into three subscales. The examiner should be a neutral person, and it is suggested that the examiner not be someone who has direct authority over the respondents. Factor analysis of the instrument revealed three subscales: emotional exhaustion, depersonalization, and personal accomplishment. Emotional exhaustion is characterized by feelings of emotional or physical depletion and is represented by 9 items on the questionnaire. Five items measure characteristics of depersonalization, which indicates a lack of empathy and emotional distance between the respondent and their coworkers. The third subscale, personal accomplishment, describes feelings of confidence and competence in one’s job and is assessed by 10 items. The items are written as statements about personal feelings or attitudes; “I feel burned out from my work” and “I don’t really care what happens to some recipients” are items taken directly from the survey.
Items are answered in terms of the frequency with which the respondent experiences these feelings, on a 7-point scale ranging from 0 (“never”) to 6 (“every day”). Each respondent’s test form is scored using a scoring key for each subscale. The scores for the three subscales are considered separately and are not combined into a single total score, so three scores are computed for each respondent. Each score is then coded as low, average, or high using numerical cutoff points listed on the scoring key. The consequences of burnout are potentially very serious for workers, their clients, and the larger institutions in which they interact. The MBI-ES is considered the leading measure of burnout and will provide a foundation for exploring the individual perspectives that create stress and the ways professionals cope with that stress. Combined with an additional measure of burnout, the Areas of Worklife Survey (AWS), both instruments can contribute to exploring perceptions of work-setting qualities that may also contribute to worker burnout.
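The scoring procedure described above, three separate subscale sums, each coded low, average, or high against cutoffs, can be sketched as follows. The item groupings and cutoff values below are placeholders of my own; the actual key is published in the MBI manual (Maslach, Jackson, & Leiter, 1996) and is not reproduced here.

```python
def subscale_score(responses, item_numbers):
    """Sum a respondent's 0-6 frequency responses for one subscale."""
    return sum(responses[i] for i in item_numbers)

def code_level(score, low_max, average_max):
    """Code a subscale score as low/average/high against two cutoffs."""
    if score <= low_max:
        return "low"
    if score <= average_max:
        return "average"
    return "high"

# Placeholder item groupings (NOT the real MBI-ES key), numbered
# arbitrarily: 9 emotional exhaustion items, 5 depersonalization
# items, 10 personal accomplishment items.
SUBSCALES = {
    "emotional_exhaustion": range(1, 10),
    "depersonalization": range(10, 15),
    "personal_accomplishment": range(15, 25),
}

responses = {i: 3 for i in range(1, 25)}  # a respondent answering 3 everywhere
ee = subscale_score(responses, SUBSCALES["emotional_exhaustion"])
print(ee, code_level(ee, low_max=16, average_max=26))  # placeholder cutoffs -> 27 high
```

The key design point from the manual survives in the sketch: the three subscale scores stay separate and are never added into one total.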


References:

Maslach, C., Jackson, S. E., & Leiter, M. P. (1996). Maslach burnout inventory. (3rd ed.). Palo Alto, CA: Consulting Psychologists Press.

contributed by Joseph W. Sullivan, Cohort 8

The School Attitude Assessment Survey – Revised (SAAS - R)

The School Attitude Assessment Survey – Revised (SAAS-R) is designed to measure academic self-perceptions, attitudes toward school, attitudes toward teachers, goal valuation, and motivation/self-regulation in secondary school students. The purpose of measuring these factors is to distinguish underachievers from achievers in a secondary school setting. The instrument measures the factors through 36 questions using a 7-point Likert-type agreement scale. Scoring of the instrument is standardized, and scores are derived from means. McCoach and Siegle (2003) report that the SAAS-R demonstrates evidence of adequate internal consistency reliability. A confirmatory factor analysis exhibited reasonable fit, χ²(55) = 1,581.7, CFI = .911, TLI = .918, RMSEA = .059, SRMR = .057 (McCoach & Siegle, 2003). As reported by McCoach and Siegle (2003), the scores demonstrated a classical test theory internal consistency reliability coefficient of at least .85 on each of the five factors. Interfactor correlations for the five factors of the SAAS-R range from .86 to .91, indicating related domains across the subscales.

References: McCoach, D. B., & Siegle, D. (2003). The school attitude assessment survey – revised: A new instrument to identify academically able students who underachieve. Educational and Psychological Measurement, 63(3), 414-429. doi:10.1177/0013164402251057

contributed by Lauren Moyer

Multicultural Awareness, Skills, and Knowledge Survey

The Multicultural Awareness, Skills, and Knowledge Survey (MASKS) assesses the multicultural awareness, skills, and knowledge of pre-service teacher education majors. The survey consists of 54 questions answered on a 5-point Likert-type scale ranging from 1 (not at all) to 5 (to a very great extent). The survey includes three scales and six subscales. The Knowledge scale includes a total of 12 items: 7 for Institutional Barriers Teaching Strategies and 5 for Gay, Lesbian, Bisexual, Transgender. The Skills scale includes a total of 14 items: 10 for Ability to Teach and Assess and 4 for Comfortable Communicating. The Awareness scale includes a total of 28 items: 10 for Cultural Biases and Stereotypes, 12 for Cultural Background Influence, and 6 for Academic Difficulties. Reliability analysis revealed that the overall 54-item survey and each subscale surpassed the Cronbach’s alpha threshold of .70, with every reported alpha at .90 or higher: Knowledge had an alpha of .93, Skills .95, and Awareness .97.

Reference: Jones, J. (2017). The development of the Multicultural Awareness, Skills, and Knowledge Survey: An instrument for assessing the cultural competency of pre-service teachers. "Diversity, Social Justice, and the Educational Leader," 1(2), 40-54.

contributed by Héctor Huertas


Multicultural Teaching Competency Scale

The Multicultural Teaching Competency Scale (MTCS) is a 16-item inventory with two subscales: 10 items measure multicultural teaching skill and 6 measure multicultural teaching knowledge. Responses use a 6-point Likert-type scale ranging from 1 (strongly disagree) to 6 (strongly agree), and the survey questions are formatted in two columns separated by a vertical line. The authors delineate three adverse consequences for multiracial and multiethnic students whose instructors do not possess multicultural teaching competencies: first, “lower teacher expectations for racial minority students’ academic ability, [secondly] inequitable assignment of racial minority students of special education classes, and [lastly] disproportionate experiences of academic and social failure among racial minority students” (Spanierman et al., 2011, p. 441). Together, these consequences may negatively affect multiracial and multiethnic students’ academic achievement and widen the achievement gap between them and their white peers. Spanierman and colleagues offer a possible approach to remediating this problem: “A survey instrument grounded in extant literature that measures teachers’ self-reported multicultural teaching competence would provide an efficient method of assessment to understand which approach works for whom under what circumstances” (Spanierman et al., 2011, p. 443). The authors argue that previous instruments were either poorly constructed or did not yield pertinent information about an individual teacher’s multicultural competencies (pp. 442-443). Therefore, the team developed the Multicultural Teaching Competency Scale (MTCS).

Reference: Spanierman, L. B., Oh, E., Heppner, P. P., Neville, H. A., Mobley, M., Wright, C. V., & Navarro, R. (2011). The multicultural teaching competency scale: Development and initial validation. "Urban Education," 46(3), 440-464.

contributed by Héctor Huertas