Rubrics and Self-Assessment Project
Scoring rubrics are among the most popular innovations in education (Goodrich, 1997a; Jensen, 1995; Ketter, 1997; Luft, 1997; Popham, 1997). However, little research has been undertaken on their design or their effectiveness. Moreover, few of the existing research and development efforts have focused on the ways in which rubrics can serve the purposes of learning and cognitive development as well as the demands of evaluation and accountability. The two studies that made up Project Zero's research focused on the effect of instructional rubrics and rubric-referenced self-assessment on the development of seventh- and eighth-grade students' writing skills and their understandings of the qualities of good writing.
These studies draw on two areas of research: authentic assessment and self-regulated learning. Perspectives on authentic assessment provide a guiding definition of assessment as an educational tool that serves the purposes of learning as well as the purposes of evaluation (Gardner, 1991; Goodrich, 1997b; Wiggins, 1989a, 1989b; Wolf & Pistone, 1991). In addition, the literature on authentic assessment provides guidance on the characteristics of effective assessment (see Goodrich, 1996a, for a review). These characteristics influenced the design of the studies reviewed below, which:
Articulated clear criteria for assessing writing,
Asked students to assess their own work,
Provided opportunities for improvement through revision, and
Were sensitive to students' developmental stages, referring to appropriate grade-level standards.
The literature on self-regulated learning and feedback suggests that learning improves when feedback reminds students of the need to monitor their learning and guides them in how to achieve learning objectives (Bangert-Drowns et al., 1991; Butler & Winne, 1995). The Rubrics and Self-Assessment Project is based on the hypothesis that students themselves can be the source of feedback, given the appropriate conditions and supports.
Taken together, the research on authentic assessment and on self-regulated learning points to the potential for instructional rubrics and self-assessment to support learning and skill development. In both of the studies reviewed below, these principles were made concrete by giving students instructional rubrics that describe good and poor writing (see examples below). The term "instructional rubrics" refers to rubrics designed to support student learning and development in addition to serving as standards-referenced assessment tools. Instructional rubrics have several features that support student learning. They:
are written in language that students can understand;
refer to common weaknesses in students' work and indicate how such weaknesses can be avoided; and
can be used by students to evaluate their works-in-progress and thereby guide revision and improvement.
The Historical Fiction Rubric and Persuasive Essay Rubric are examples of instructional rubrics designed for use in this research. Like all of the rubrics used, they draw on district, state, and national standards as well as on feedback from colleagues and teachers. They articulate the criteria for the essay, describe levels of quality from excellent to poor, and make suggestions for avoiding typical writing pitfalls. The expectation was that instructional rubrics, either alone or in combination with a formal process of self-assessment, would have significant effects on students' writing and learning.
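The structure described above, named criteria crossed with levels of quality from excellent to poor, can be sketched as a small data structure. The criterion names and level descriptions below are hypothetical placeholders for illustration, not the project's actual rubric wording; only the overall shape (criteria mapped to leveled, student-friendly descriptions) comes from the text.

```python
# Hypothetical sketch of an instructional rubric's structure: each criterion
# maps quality levels (4 = excellent ... 1 = poor) to descriptions written
# in language students can understand. All wording here is invented.
rubric = {
    "Ideas and content": {
        4: "The essay is clear and focused; rich, relevant details support it.",
        3: "The essay is mostly clear; some details support the topic.",
        2: "The topic is present but unfocused; details are thin.",
        1: "The topic is unclear and details are missing.",
    },
    "Organization": {
        4: "A strong opening, a logical sequence, and a satisfying conclusion.",
        3: "An opening and conclusion are present; the sequence mostly works.",
        2: "Ideas wander; the opening or conclusion is weak.",
        1: "No discernible opening, sequence, or conclusion.",
    },
}

def describe(criterion: str, level: int) -> str:
    """Return the student-facing description for a criterion at a level."""
    return rubric[criterion][level]
```

A student revising a draft could look up, say, `describe("Organization", 4)` to see what the top level of quality asks for.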
Study 1—Instructional Rubrics
The first study spanned the 1996-97 school year and focused on the effects of instructional rubrics on eighth-grade students' writing and on their understandings of the qualities of good writing. Students in nine eighth-grade classes in two urban middle schools were asked to write three different essays: a persuasive essay, an autobiographical incident essay, and a historical fiction essay. Before writing a first draft of each essay, students in the treatment classes were given an instructional rubric. Students in the control classes were not given a rubric but were asked to write first and second drafts of the essays.
Results for Study 1
Findings from Study 1 paint an uneven but intriguing pattern of results. On average, treatment students received higher scores on only one of the three essays, but the difference on that essay was statistically significant. In general, it appears that instructional rubrics can help students write better, but that a more intensive intervention may be necessary to help all students perform at higher levels consistently. These results are nonetheless somewhat encouraging, since the half-point difference between the treatment and control groups on the essay that showed a significant difference was educationally as well as statistically meaningful: a half-point difference on a 4-point scale is a 12.5% difference. The effect is all the more meaningful because of the minimal amount of classroom time taken by the intervention: less than 40 minutes was spent introducing and reviewing each rubric. Those 40 minutes may have translated into a 12.5% difference in students' scores.
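The arithmetic behind that 12.5% figure is easy to verify; the minimal check below uses only the numbers reported above (a 0.5-point difference on a 4-point scale).

```python
def score_difference_pct(point_diff: float, scale_max: float) -> float:
    """Express a raw score difference as a percentage of the full scale."""
    return point_diff / scale_max * 100

# Half a point on a 4-point scale:
print(score_difference_pct(0.5, 4))  # → 12.5
```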
Questionnaires given at the end of the first study were also encouraging. Students answered this question: "When your teachers read your essays and papers, how do they decide whether your work is excellent (A) or very good (B)?" Students' responses showed that control students tended to have a poorer understanding of what counts in good writing. For example:
Well, they give us the assignment and they know the qualifications and if you have all of them you get an A and if you don't get any you get a F and so on [my emphasis].
Note that this student knows that the teacher has her standards or "qualifications," but he does not suggest that he knows what they are. The treatment students, on the other hand, tended to refer to rubrics, "rebeks," and "root braks" as grading guides and often listed criteria from the rubrics they had seen. For example:
An A would consist of a lot of good expressions and big words. He/she also uses relevant and rich details and examples. The sentences are clear, they begin in different ways, some are longer than others, and no fragments. Has good grammar and spelling. A B would be like an A but not as much would be on the paper.
Many of the criteria referred to by this and other treatment students were included in the rubrics used during this study. When compared to the students in the control group, treatment students tended to refer to a greater variety of criteria for high quality writing, including word choice, voice and tone, organization, paragraph format, detail, etc. These differences suggest that the students who received instructional rubrics had more knowledge of what counts in good writing and of the criteria by which their essays were evaluated. It appears that instructional rubrics have the potential to broaden students' conception of good writing beyond the recognition of mechanics and neatness and, as several control students put it, "whether the teacher likes you or not."
Study 2—Rubric-Referenced Self-Assessment
The second study took place during the 1997-98 school year and looked at the effects of instructional rubrics and guided self-assessment on students' writing and understandings of good writing. This study involved thirteen seventh- and eighth-grade classes in the same two urban schools. Both the treatment and control groups wrote two essays: a historical fiction essay, and a response to literature. Students in all participating classes were given instructional rubrics, but only the treatment classes were engaged in a process of guided self-assessment.
The two self-assessment lessons focused on a formal process of guided self-assessment designed in collaboration with the participating teachers. Students used markers to color code the criteria on the rubric and the evidence in their essays that showed that they met the criteria. A simple example comes from the Historical Fiction Rubric, which includes a criterion requiring students to "bring the time and place in which the character lived alive." During class, students were asked to underline "time and place" in red on their rubrics, then underline the information they provided about the time and place of their story in red on their essay. If they could not find the information in their essay—and they were often shocked to discover they could not—they were asked to write at the top of their papers a reminder to add the missing information when they wrote a second draft. This process was followed for all seven criteria on the rubrics. Control classes received copies of the rubrics but did not formally assess their own work in class.
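The guided self-assessment routine above is essentially a per-criterion checklist pass over a draft: find evidence for each criterion, and note a reminder when none is found. A minimal sketch of that logic follows; the criterion names, keyword lists, and the substring match that stands in for a student's own reading of the draft are all hypothetical.

```python
# Hypothetical sketch of the color-coding routine: for each rubric criterion,
# look for evidence in the draft; if none is found, record a reminder to add
# the missing information in the next draft. A keyword match stands in for
# the student's judgment about whether evidence is present.
def self_assess(draft: str, criteria_evidence: dict) -> list:
    """Return reminder notes for criteria with no evidence in the draft."""
    reminders = []
    for criterion, keywords in criteria_evidence.items():
        found = any(kw.lower() in draft.lower() for kw in keywords)
        if not found:
            reminders.append(f"Add evidence for: {criterion}")
    return reminders

# Invented example: two criteria and a one-sentence historical fiction draft.
criteria = {
    "time and place": ["1776", "Boston"],
    "character's feelings": ["felt", "afraid"],
}
draft = "In 1776 the streets of Boston were crowded with soldiers."
print(self_assess(draft, criteria))  # → ["Add evidence for: character's feelings"]
```

The design choice mirrors the classroom process: the routine does not grade the draft, it only surfaces what is missing so the writer knows what to add in revision.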
Results for Study 2
The results of Study 2 suggest that rubric-referenced self-assessment can have a positive effect on girls' writing but no effect on boys' writing. In broad strokes, this is consistent with research on sex differences in how boys and girls respond to feedback (Deci & Ryan, 1980; Dweck & Bush, 1976; Dweck, Davidson, Nelson & Enna, 1978; Hollander & Marcia, 1970). Study 2 did not examine students' cognitive and emotional responses to self-assessment, however, so this explanation of the differences between boys and girls is speculative. The different ways in which boys and girls respond to self-assessment need to be better understood.
Project Zero's Rubrics and Self-Assessment Project was supported by the Edna McConnell Clark Foundation.
Heidi Goodrich Andrade
Selected readings and materials:
Bangert-Drowns, R., Kulik, C., Kulik, J., & Morgan, M. (1991). The instructional effect of feedback in test-like events. Review of Educational Research, 61, 213-238.
Butler, D. & Winne, P. (1995). Feedback and self-regulated learning: A theoretical synthesis. Review of Educational Research, 65(3), 245-281.
Deci, E., & Ryan, R. (1980). The empirical exploration of intrinsic motivational processes. In L. Berkowitz, (Ed.), Advances in experimental social psychology. New York: Academic Press.
Dweck, C., & Bush, E. (1976). Sex differences in learned helplessness: 1. Differential debilitation with peer and adult evaluators. Developmental Psychology, 12, 147-156.
Dweck, C., Davidson, W., Nelson, S., & Enna, B. (1978). Sex differences in learned helplessness: 2. Contingencies of evaluative feedback in the classroom and 3. An experimental analysis. Developmental Psychology, 14(3), 268-276.
Gardner, H. (1991). Assessment in context: The alternative to standardized testing. In B. R. Gifford and M. C. O'Connor (Eds.), Changing assessments: Alternative views of aptitude, achievement and instruction. Boston: Kluwer.
Goodrich, H. (1996a). Student self-assessment: At the intersection of metacognition and authentic assessment. Doctoral dissertation, Harvard University, Cambridge, MA.
Goodrich, H. (1997a). Understanding rubrics. Educational Leadership, 54(4), 14-17.
Goodrich, H. (1997b). Thinking-centered assessment. In S. Veenema, L. Hetland, & K. Chalfen (Eds.), The Project Zero classroom: New approaches to thinking and understanding. Cambridge, MA: Project Zero, Harvard Graduate School of Education.
Hollander, E. & Marcia, J. (1970). Parental determinants of peer orientation and self-orientation among preadolescents. Developmental Psychology, 2, 292-302.
Jensen, K. (1995). Effective rubric design: Making the most of this powerful assessment tool. Science Teacher, 62(5), 34-37.
Ketter, J. (1997). Using rubrics and holistic scoring of writing. In Tchudi, S. (Ed.) Alternative to grading student writing. Urbana, IL: National Council of Teachers of English.
Luft, J. (1997). Design your own rubric. Science Scope, 20(5), 25-27.
Popham, W. J. (1997). What's wrong—and what's right—with rubrics. Educational Leadership, 55(2), 72-75.
Wiggins, G. (1989a). A true test: Toward more authentic and equitable assessment. Phi Delta Kappan, 70(9), 703-713.
Wiggins, G. (1989b). Teaching to the (authentic) test. Educational Leadership, 46(7), 41-47.
Wolf, D., & Pistone, N. (1991). Taking full measure: Rethinking assessment through the arts. New York: College Board Publications.