Introduction of new teaching strategies often expands the expectations for student learning, creating a parallel need to redefine how we collect the evidence that assures both us and our students that these expectations are in fact being met. The default assessment strategy of the typical large, introductory, college-level science course, the multiple-choice (fixed response) exam, when used to best advantage can provide feedback about what students know and recall about key concepts. Leaving aside the difficulty inherent in designing a multiple-choice exam that captures deeper understandings of course material, its limitations become particularly notable when learning objectives include what students are able to do as well as know as the result of time spent in a course. If we want students to build their skill at conducting guided laboratory investigations, developing reasoned arguments, or communicating their ideas, other means of assessment such as papers, demonstrations (the “practical exam”), other demonstrations of problem solving, model building, debates, or oral presentations, to name a few, must be enlisted to serve as benchmarks of progress and/or in the assignment of grades. What happens, however, when students are novices at responding to these performance prompts in the context of science learning, and faculty are novices at communicating to students their expectations for a high-level performance? The more familiar terrain of the multiple-choice exam can lull both students and instructors into a false sense of security about the clarity and objectivity of the evaluation criteria (Wiggins, 1989) and make these other types of assessment strategies seem subjective and unreliable (and sometimes downright unfair) by comparison.
In a worst-case scenario, the use of alternatives to the conventional exam to assess student learning can lead students to feel that there is an implicit or hidden curriculum—the private curriculum that seems to exist only in the mind's eye of a course instructor.
Use of rubrics provides one way to address these issues. Rubrics can be designed to formulate standards for levels of accomplishment and used to guide and improve performance, and they can also be used to make these standards clear and explicit to students. Although the use of rubrics has become common practice in the K–12 setting (Luft, 1999), the good news for those instructors who find the idea attractive is that more and more examples of the use of rubrics are being noted at the college and university level, with a variety of applications (Ebert-May, undated; Ebert-May et al., 1997; Wright and Boggs, 2002; Moni et al., 2005; Porter, 2005; Lynd-Balta, 2006).
WHAT IS A RUBRIC?
Although definitions for the word “rubric” abound, for the purposes of this feature article we use the word to denote a type of matrix that provides scaled levels of achievement or understanding for a set of criteria or dimensions of quality for a given type of performance, for example, a paper, an oral presentation, or use of teamwork skills. In this type of rubric, the scaled levels of achievement (gradations of quality) are indexed to a desired or appropriate standard (e.g., to the performance of an expert or to the highest level of accomplishment evidenced by a particular cohort of students). The descriptions of the possible levels of attainment for each of the criteria or dimensions of performance are described fully enough to make them useful for judgment of, or reflection on, progress toward valued objectives (Huba and Freed, 2000).
A good way to think about what distinguishes a rubric from an explanation of an assignment is to compare it with a more common practice. When communicating to students our expectations for writing a lab report, for example, we often start with a list of the qualities of an excellent report to guide their efforts toward successful completion; we may have drawn on our knowledge of how scientists report their findings in peer-reviewed journals to develop the list. This checklist of criteria is easily turned into a scoring sheet (to return with the evaluated assignment) by the addition of checkboxes for indicating either a “yes-no” decision about whether each criterion has been met or the extent to which it has been met. Such a checklist in fact has a number of fundamental features in common with a rubric (Bresciani et al., 2004), and it is a good starting point for beginning to construct a rubric. Figure 1 gives an example of such a scoring checklist that could be used to judge a high school student poster competition.
Figure 1. An example of a scoring checklist that could be used to judge a high school student poster competition.
However, what is referred to as a “full rubric” is distinguished from the scoring checklist by its more extensive definition and description of the criteria or dimensions of quality that characterize each level of accomplishment. Table 1 provides one example of a full rubric (of the analytical type, as defined in the paragraph below) that was developed from the checklist in Figure 1. This example uses the typical grid format in which the performance criteria or dimensions of quality are listed in the rows, and the successive cells across the three columns describe a specific level of performance for each criterion. The full rubric in Table 1, in contrast to the checklist that only indicates whether a criterion exists (Figure 1), makes it far clearer to a student presenter what the instructor is looking for when evaluating student work.
Table 1. A full analytical rubric for assessing student poster presentations that was developed from the scoring checklist (simple rubric) from Figure 1.
DESIGNING A RUBRIC
A more challenging aspect of using a rubric can be finding a rubric to use that provides a close enough match to a particular assignment with a specific set of content and process objectives. This challenge is particularly true of so-called analytical rubrics. Analytical rubrics use discrete criteria to set forth more than one measure of the levels of an accomplishment for a particular task, as distinguished from holistic rubrics, which provide more general, uncategorized (“lumped together”) descriptions of overall dimensions of quality for different levels of mastery. Many users of analytical rubrics often resort to developing their own rubric to have the best match between an assignment and its objectives for a particular course.
As an example, examine the two rubrics presented in Tables 2 and 3, in which Table 2 shows a holistic rubric and Table 3 shows an analytical rubric. These two versions of a rubric were developed to evaluate student essay responses to a particular assessment prompt. In this case the prompt is a challenge in which students are to respond to the statement, “Plants get their food from the soil. What about this statement do you agree with? What about this statement do you disagree with? Support your position with as much detail as possible.” This assessment prompt can serve both as a preassessment, to establish what ideas students bring to the teaching unit, and as a postassessment in conjunction with the study of photosynthesis. As such, the rubric is designed to evaluate student understanding of the process of photosynthesis, the role of soil in plant growth, and the nature of food for plants. The maximum score using either the holistic or the analytical rubric would be 10, with 2 points possible for each of five criteria. The holistic rubric outlines five criteria by which student responses are evaluated, puts a 3-point scale on each of these criteria, and holistically describes what a 0-, 1-, or 2-point answer would contain. However, this holistic rubric stops short of defining in detail the specific concepts that would qualify an answer for 0, 1, or 2 points on each criterion scale. The analytical rubric shown in Table 3 does define these concepts for each criterion, and it is in fact a fuller development of the holistic rubric shown in Table 2. As mentioned, the development of an analytical rubric is challenging in that it pushes the instructor to define specifically the language and depth of knowledge that students need to demonstrate competency, and it is an attempt to make discrete what is fundamentally a fuzzy, continuous distribution of ways an individual could construct a response.
As such, informal analysis of student responses can often play a large role in shaping and revising an analytical rubric, because student answers may hold conceptions and misconceptions that have not been anticipated by the instructor.
Table 2. Holistic rubric for responses to the challenge statement, “Plants get their food from the soil”
Table 3. An analytical rubric for responses to the challenge statement, “Plants get their food from the soil”
The various approaches to constructing rubrics can themselves, in a sense, be characterized as holistic or analytical. Those who offer recommendations about how to build rubrics often approach the task from the perspective of describing the essential features of rubrics (Huba and Freed, 2000; Arter and McTighe, 2001), or by outlining a discrete series of steps to follow one by one (Moskal, 2000; Mettler, 2002; Bresciani et al., 2004; MacKenzie, 2004). Regardless of the recommended approach, there is general agreement that a rubric designer must approach the task with a clear idea of the desired student learning outcomes (Luft, 1999) and, perhaps more importantly, with a clear picture of what meeting each outcome “looks like” (Luft, 1999; Bresciani et al., 2004). If this picture remains fuzzy, perhaps the outcome is not observable or measurable and thus not “rubric-worthy.”
Reflection on one's particular answer to two critical questions—“What do I want students to know and be able to do?” and “How will I know when they know it and can do it well?”—is not only essential to beginning construction of a rubric but also can help confirm the choice of a particular assessment task as being the best way to collect evidence about how the outcomes have been met. A first step in designing a rubric, the development of a list of qualities that the learner should demonstrate proficiency in by completing an assessment task, naturally flows from this prior rumination on outcomes and on ways of collecting evidence that students have met the outcome goal. A good way to get started with compiling this list is to view existing rubrics for a similar task, even if this rubric was designed for younger or older learners or for different subject areas. For example, if one sets out to develop a rubric for a class presentation, it is helpful to review the criteria used in a rubric for oral communication in a graduate program (organization, style, use of communication aids, depth and accuracy of content, use of language, personal appearance, responsiveness to audience; Huba and Freed, 2000) to stimulate reflection on and analysis of what criteria (dimensions of quality) align with one's own desired learning outcomes. There is technically no limit to the number of criteria that can be included in a rubric, other than presumptions about the learners' ability to digest and thus make use of the information that is provided. In the example in Table 1, only three criteria were used, as judged appropriate for the desired outcomes of the high school poster competition.
After this list of criteria is honed and pruned, the dimensions of quality and proficiency will need to be separately described (as in Table 1), and not just listed. The extent and nature of this commentary depends upon the type of rubric, analytical or holistic. Expanding the criteria is inherently difficult, because it requires a thorough familiarity with both the elements comprising the highest standard of performance for the chosen task and the range of capabilities of learners at a particular developmental level. A good way to get started is to think about how the attributes of a truly superb performance could be characterized in each of the important dimensions, that is, the level of work to which students should aspire. Common advice (Moskal, 2000) is to avoid use of words that connote value judgments in these commentaries, such as “creative” or “good” (as in “the use of scientific terminology is ‘good’”). These terms are so general as to be of little value in guiding a learner to emulate specific standards for a task, and although it is admittedly difficult, they need to be defined in a rubric. Again, perusal of existing examples is a good way to get started with writing the full descriptions of criteria. Fortunately, there are a number of data banks that can be searched for rubric templates of virtually all types (Chicago Public Schools, 2000; Arter and McTighe, 2001; Shrock, 2006; Advanced Learning Technologies, 2006; University of Wisconsin-Stout, 2006).
The final step toward filling in the grid of the rubric is to benchmark the remaining levels of mastery or gradations of quality. There are a number of descriptors that are conventionally used to denote the levels of mastery in addition to the conventional excellent-to-poor scale (with or without accompanying symbols for letter grades), and several examples from among the more common of these are listed below:
Scale 1: Exemplary, Proficient, Acceptable, Unacceptable
Scale 2: Substantially Developed, Mostly Developed, Developed, Underdeveloped
Scale 3: Distinguished, Proficient, Apprentice, Novice
Scale 4: Exemplary, Accomplished, Developing, Beginning
In this case, unlike the number of criteria, there might be a natural limit to how many levels of mastery need this expanded commentary. Although it is common to have multiple levels of mastery, as in the examples above, some educators (Bresciani et al., 2004) feel strongly that it is not possible for individuals to make operational sense out of inclusion of more than three levels of mastery (in essence, a “there, somewhat there, not there yet” scale). As expected, the final steps in having a “usable” rubric are to ask both students and colleagues to provide feedback on the first draft, particularly with respect to the clarity and gradations of the descriptions of criteria for each level of accomplishment, and to try out the rubric using past examples of student work.
Huba and Freed (2000) offer the interesting recommendation that the descriptions for each level of performance provide a “real world” connection by stating the implications for accomplishment at that level. This description of the consequences could be included in a criterion called “professionalism.” For example, in a rubric for writing a lab report, at the highest level of mastery the rubric could state, “this report of your study would persuade your peers of the validity of your findings and would be publishable in a peer-reviewed journal.” Acknowledging this recommendation in the construction of a rubric might help to steer students toward the perception that the rubric represents the standards of a profession, and away from the perception that a rubric is just another way to give a particular teacher what he or she wants (Andrade and Du, 2005).
As a further aid for beginning instructors, a number of Web sites, both commercial and open access, have tools for online construction of rubrics from templates, for example, Rubistar (Advanced Learning Technologies, 2006) and TeAch-nology (TeAch-nology, undated). These tools allow the would-be “rubrician” to select from among the various types of rubrics, criteria, and rating scales (levels of mastery). Once these choices are made, editable descriptions fall into place in the proper cells in the rubric grid. The rubrics are stored in the site databases, but typically they can be downloaded using conventional word processing or spreadsheet software. Further editing can result in a rubric uniquely suitable for your teaching/learning goals.
ANALYZING AND REPORTING INFORMATION GATHERED FROM A RUBRIC
Whether used with students to set learning goals, as scoring devices for grading purposes, to give formative feedback to students about their progress toward important course outcomes, or for assessment of curricular and course innovations, rubrics allow for both quantitative and qualitative analysis of student performance. Qualitative analysis could yield narrative accounts of where students in general fell in the cells of the rubric, and these accounts can provide interpretations, conclusions, and recommendations related to student learning and development. For quantitative analysis, the various levels of mastery can be assigned different numerical scores to yield quantitative rankings, as has been done for the sample rubric in Table 1. If desired, the criteria can be given different scoring weightings (again, as in the poster presentation rubric in Table 1) if they are not considered to have equal priority as outcomes for a particular purpose. The total scores given to each example of student work on the basis of the rubric can be converted to a grading scale. Overall performance of the class could also be analyzed for each of the criteria.
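The weighting and conversion steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the criterion names, the weights, and the 1-to-3 level scores are hypothetical and are not taken from Table 1, and expressing the total as a percentage of the maximum is just one possible way to map rubric totals onto a grading scale.

```python
# Hypothetical criteria and weights (not the values used in Table 1).
WEIGHTS = {"content": 2.0, "organization": 1.0, "visual_design": 1.0}
MAX_LEVEL = 3  # assumed score for the highest level of mastery


def weighted_total(level_scores, weights=WEIGHTS):
    """Sum each criterion's level score multiplied by its weight."""
    return sum(weights[c] * level_scores[c] for c in weights)


def percent_of_maximum(level_scores, weights=WEIGHTS, max_level=MAX_LEVEL):
    """Express the weighted total as a percentage of the best possible total."""
    max_total = max_level * sum(weights.values())
    return 100.0 * weighted_total(level_scores, weights) / max_total


def class_means(all_scores, weights=WEIGHTS):
    """Mean (unweighted) level score per criterion across a class."""
    return {c: sum(s[c] for s in all_scores) / len(all_scores) for c in weights}


# Two made-up posters scored against the hypothetical rubric.
posters = [
    {"content": 3, "organization": 2, "visual_design": 1},
    {"content": 1, "organization": 2, "visual_design": 3},
]

for scores in posters:
    print(weighted_total(scores), percent_of_maximum(scores))
print(class_means(posters))
```

In this sketch the two posters earn the same unweighted sum of level scores, but the heavier weight on the hypothetical "content" criterion separates their totals, which is the effect the weighting is meant to have.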
Multiple-choice exams have the advantage that they can be computer or machine scored, allowing for analysis and storage of more specific information about different content understandings (particularly misconceptions) for each item, and for large numbers of students. The standard rubric-referenced assessment is not designed to easily provide this type of analysis about specific details of content understanding; for the types of tasks for which rubrics are designed, content understanding is typically displayed by some form of narrative, free-choice expression. To capture the benefits of the free-choice narrative while also generating an in-depth analysis of students' content understanding, particularly for large numbers of students, a special type of rubric, called the double-digit rubric, is typically used. A large-scale example of use of this type of scoring rubric is given by the Trends in International Mathematics and Science Study (1999). In this study, double-digit rubrics were used to code and analyze student responses to short essay prompts.
To better understand how and why these rubrics are constructed and used, refer to the example provided in Figure 2. This double-digit rubric was used to score and analyze student responses to an essay prompt about ecosystems that was accompanied by the standard “sun-tree-bird” diagram (a drawing of the sun, a tree, and other plants; various primary and secondary consumers; and some not well-identifiable decomposers, with interconnecting arrows that could be interpreted as energy flow or cycling of matter). A brief narrative, summarizing the “big ideas” that could be included in a complete response, along with a sample response that captures many of these big ideas accompanies the actual rubric. The rubric itself specifies major categories of student responses, from complete to various levels of incompleteness. Each level is assigned one of the first digits of the scoring code, which could actually correspond to a conventional point total awarded for a particular response. In the example in Figure 2, a complete response is awarded a maximum number of 4 points, and the levels of partially complete answers, successively lower points. Here, the “incomplete” and “no response” categories are assigned first digits of 7 and 9, respectively, rather than 0 for clarity in coding; they can be converted to zeroes for averaging and reporting of scores.
Figure 2. A double-digit rubric used to score and analyze student responses to an essay prompt about ecosystems.
The second digit is assigned to types of student responses in each category, including the common approaches and misconceptions. For example, code 31 under the first partial-response category denotes a student response that “talks about energy flow and matter cycling, but does not mention loss of energy from the system in the form of heat.” The sample double-digit rubric in Figure 2 shows the code numbers that were assigned after a “first pass” through a relatively small number of sample responses. Additional codes were later assigned as more responses were reviewed and the full variety of student responses revealed. In both cases, the second digit of 9 was reserved for a general description that could be assigned to a response that might be unique to one or only a few students but nevertheless belonged in a particular category. When refined by several assessments of student work by a number of reviewers, this type of rubric can provide a means for a very specific quantitative and qualitative understanding, analysis, and reporting of the trends in student understanding of important concepts. A high number of 31 scores, for example, could provide a major clue about deficiencies in past instruction and thus goals for future efforts. However, this type of analysis remains expensive, in that scores must be assigned and entered into a database, rather than the simple collection of student responses possible with a multiple-choice test.
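As a rough illustration of how double-digit codes might be tallied once they have been entered, the Python sketch below splits each code into its two digits, converts first digits of 7 and 9 to zero points as described above, and counts how often each code occurs. The list of codes is invented for the example, and the function names are hypothetical; only the meaning of the first digit, the 7 and 9 convention, and code 31 come from the text.

```python
from collections import Counter

# First digits that carry no points: 7 = incomplete, 9 = no response.
ZERO_FIRST_DIGITS = {7, 9}


def split_code(code):
    """Split a two-digit code into (first digit, second digit)."""
    return divmod(code, 10)


def points(code):
    """Points for a response: the first digit, or 0 for codes in the 70s and 90s."""
    first, _ = split_code(code)
    return 0 if first in ZERO_FIRST_DIGITS else first


def summarize(codes):
    """Return the mean point score and a tally of how often each code was assigned."""
    mean_points = sum(points(c) for c in codes) / len(codes)
    return mean_points, Counter(codes)


# Invented codes for eight student responses (not data from Figure 2).
codes = [40, 31, 31, 22, 79, 99, 31, 19]

mean_points, tally = summarize(codes)
print(mean_points)           # average score for reporting
print(tally.most_common(1))  # the most frequently assigned code
```

A tally of this kind is what would reveal, for instance, an unusually high count of code 31 responses in a class.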
WHY USE RUBRICS?
When used as teaching tools, rubrics not only make the instructor's standards and resulting grading explicit, but they can give students a clear sense of what the expectations are for a high level of performance on a given assignment, and how they can be met. This use of rubrics can be most important when the students are novices with respect to a particular task or type of expression (Bresciani et al., 2004).
From the instructor's perspective, although the time expended in developing a rubric can be considerable, once rubrics are in place they can streamline the grading process. The more specific the rubric, the less the requirement for spontaneous written feedback for each piece of student work—the type that is usually used to explain and justify the grade. Although provided with fewer written comments that are individualized for their work, students nevertheless receive informative feedback. When information from rubrics is analyzed, a detailed record of students' progress toward meeting desired outcomes can be monitored and then provided to students so that they may also chart their own progress and improvement. With team-taught courses or multiple sections of the same course, rubrics can be used to make faculty standards explicit to one another, and to calibrate subsequent expectations. Good rubrics can be critically important when student work in a large class is being graded by teaching assistants.
Finally, by their very nature, rubrics encourage reflective practice on the part of both students and teachers. In particular, the act of developing a rubric, whether or not it is subsequently used, instigates a powerful consideration of one's values and expectations for student learning, and the extent to which these expectations are reflected in actual classroom practices. If rubrics are used in the context of students' peer review of their own work or that of others, or if students are involved in the process of developing the rubric, these processes can spur the development of their ability to become self-directed and help them develop insight into how they and others learn (Luft, 1999).
We gratefully acknowledge the contribution of Richard Donham (Mathematics and Science Education Resource Center, University of Delaware) for development of the double-digit rubric in Figure 2.
- Advanced Learning Technologies, University of Kansas Rubistar. 2006. [28 May 2006]. http://rubistar.4teachers.org/index.php.
- Andrade H., Du Y. Student perspectives on rubric-referenced assessment. Pract. Assess. Res. Eval. 2005;10. [18 May 2006]. http://pareonline.net/pdf/v10n3.pdf.
- Arter J. A., McTighe J. Scoring Rubrics in the Classroom: Using Performance Criteria for Assessing and Improving Student Performance. Thousand Oaks, CA: Corwin Press; 2001.
- Bresciani M. J., Zelna C. L., Anderson J. A. Assessing Student Learning and Development: A Handbook for Practitioners. Washington, DC: National Association of Student Personnel Administrators; 2004. Criteria and rubrics; pp. 29–37.
- Chicago Public Schools The Rubric Bank. 2000. [18 May 2006]. http://intranet.cps.k12.il.us/Assessments/Ideas_and_Rubrics/Rubric_Bank/rubric_bank.html.
- Ebert-May D., Brewer C., Allred S. Innovation in large lectures—teaching for active learning. Bioscience. 1997;47:601–607.
- Ebert-May D. Scoring Rubrics. Field-tested Learning Assessment Guide. [18 May 2006]. undated. http://www.wcer.wisc.edu/archive/cl1/flag/cat/catframe.htm.
- Huba M. E., Freed J. E. Learner-Centered Assessment on College Campuses. Boston: Allyn and Bacon; 2000. Using rubrics to provide feedback to students; pp. 151–200.
- Luft J. A. Rubrics: design and use in science teacher education. J. Sci. Teach. Educ. 1999;10:107–121.
- Lynd-Balta E. Using literature and innovative assessments to ignite interest and cultivate critical thinking skills in an undergraduate neuroscience course. CBE Life Sci. Educ. 2006;5:167–174.
- MacKenzie W. NETS●S Curriculum Series: Social Studies Units for Grades 9–12. Washington, DC: International Society for Technology in Education; 2004. Constructing a rubric; pp. 24–30.
- Mettler C. A. Designing scoring rubrics for your classroom. In: Boston C., editor. Understanding Scoring Rubrics: A Guide for Teachers. University of Maryland, College Park, MD: ERIC Clearinghouse on Assessment and Evaluation; 2002. pp. 72–81.
- Moni R., Beswick W., Moni K. B. Using student feedback to construct an assessment rubric for a concept map in physiology. Adv. Physiol. Educ. 2005;29:197–203.
- Moskal B. M. Scoring Rubrics Part II: How? ERIC/AE Digest, ERIC Clearinghouse on Assessment and Evaluation. Eric Identifier #ED446111. 2000. [21 April 2006]. http://www.eric.ed.gov.
- Porter J. R. Information literacy in biology education: an example from an advanced cell biology course. Cell Biol. Educ. 2005;4:335–343.
- Shrock K. Kathy Shrock's Guide for Educators. 2006. [5 June 2006]. http://school.discovery.com/schrockguide/assess.html#rubrics.
- TeAch-nology, Inc TeAch-nology. [7 June 2006]. undated. http://teach-nology.com/web_tools/rubrics.
- Trends in International Mathematics and Science Study Science Benchmarking Report, 8th Grade, Appendix A: TIMSS Design and Procedures. 1999. [9 June 2006]. http://timss.bc.edu/timss1999b/sciencebench_report/t99bscience_A.html.
- University of Wisconsin–Stout Teacher Created Rubrics for Assessment. 2006. [7 June 2006]. http://www.uwstout.edu/soe/profdev/rubrics.shtml.
- Wiggins G. A true test: toward more authentic and equitable assessment. Phi Delta Kappan. 1989;49:703–713.
- Wright R., Boggs J. Learning cell biology as a team: a project-based approach to upper-division cell biology. Cell Biol. Educ. 2002;1:145–153.
Articles from CBE Life Sciences Education are provided here courtesy of American Society for Cell Biology
Grading rubrics can be of great benefit to both you and your students. For you, a rubric saves time and decreases subjectivity. Specific criteria are explicitly stated, facilitating the grading process and increasing your objectivity. For students, the use of grading rubrics helps them to meet or exceed expectations, to view the grading process as being fair, and to set goals for future learning.
In order to help your students meet or exceed expectations of the assignment, be sure to discuss the rubric with your students when you assign an essay. It is helpful to show them examples of written pieces that meet and do not meet the expectations. As an added benefit, because the criteria are explicitly stated, the use of the rubric decreases the likelihood that students will argue about the grade they receive. The explicitness of the expectations helps students know exactly why they lost points on the assignment and aids them in setting goals for future improvement.
- Routinely have students score peers' essays using the rubric as the assessment tool. This increases their level of awareness of the traits that distinguish successful essays from those that fail to meet the criteria. Have peer editors use the Reviewers Comments section to add any praise, constructive criticism, or questions.
- Alter some expectations or add traits to the rubric as needed. Students' needs may necessitate making more rigorous criteria for advanced learners or less stringent guidelines for younger or special needs students. Furthermore, the content area for which the essay is written may require some alterations to the rubric. In social studies, for example, an essay about geographical landforms and their effect on the culture of a region might necessitate additional criteria about the use of specific terminology.
- After you and your students have used the rubric, have them work in groups to suggest alterations to the rubric so that it more precisely matches their needs or the parameters of a particular writing assignment.