Rater reliability refers to the degree of agreement between people who evaluate or judge student performance according to specific criteria. For ratings to be considered reliable, the agreement between two or more raters must be consistent and dependable.
Koelsch, Estrin, and Farr (1995) describe the scoring sessions by which rater reliability is determined:
"Decisions about how well a student does on a particular performance task are made in scoring sessions, where educators come together to score student work against an agreed-upon set of performance standards and detailed descriptions about levels of performance (outlined in a scoring guide often called a rubric).... Usually scoring is considered reliable if scores on a single piece of work scored by different teachers differ by no more than one score point. (Rubrics typically have from four to six score points.)" (pp. 12-13)
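The one-score-point rule in the passage above can be checked with simple arithmetic. The sketch below is only an illustration: the function name `agreement_rates`, the `tolerance` parameter, and the sample scores are hypothetical and do not come from Koelsch, Estrin, and Farr. It reports the share of pieces on which two raters gave identical scores and the share on which their scores differed by no more than one point. The source does not say what share of pieces must meet the rule, so no pass/fail threshold is assumed here.

```python
from typing import Sequence, Tuple


def agreement_rates(
    rater_a: Sequence[int],
    rater_b: Sequence[int],
    tolerance: int = 1,
) -> Tuple[float, float]:
    """Return (exact agreement, agreement within `tolerance` points).

    Each position holds the scores two raters gave the same piece of work.
    The default tolerance of 1 mirrors the "no more than one score point"
    rule quoted above; the function itself is a hypothetical helper.
    """
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must score the same pieces of work.")
    if len(rater_a) == 0:
        raise ValueError("At least one scored piece of work is required.")

    n = len(rater_a)
    pairs = list(zip(rater_a, rater_b))
    # Count pieces where the two scores are identical.
    exact = sum(1 for a, b in pairs if a == b)
    # Count pieces where the two scores differ by at most `tolerance`.
    within = sum(1 for a, b in pairs if abs(a - b) <= tolerance)
    return exact / n, within / n


# Made-up scores on a four-point rubric for five pieces of student work.
scores_teacher_1 = [4, 3, 2, 4, 1]
scores_teacher_2 = [4, 2, 4, 3, 1]

exact_rate, adjacent_rate = agreement_rates(scores_teacher_1, scores_teacher_2)
print(f"Exact agreement: {exact_rate:.0%}")        # Exact agreement: 40%
print(f"Within one point: {adjacent_rate:.0%}")    # Within one point: 80%
```

In this made-up sample, the third piece of work (scored 2 and 4) is the only one on which the two teachers differ by more than one score point.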