Critical Issue: Rethinking Assessment and Its Role in Supporting Educational Reform

ISSUE: Assessment of student achievement is changing, largely because today's students face a world that will demand new knowledge and abilities. In the global economy of the 21st century, students will need to understand the basics, but also to think critically, to analyze, and to make inferences. Helping students develop these skills will require changes in assessment at the school and classroom level, as well as new approaches to large-scale, high-stakes assessment.

Audio item: Linda Bond, director of assessment at NCREL, explains how the new skills needed in the 21st-century global economy have spurred the accountability/assessment movement in education. Excerpted from NCREL's Policy Talks, audiotape #2, Reaching for New Goals and Standards: The Role of Testing in Educational Reform Policy (NCREL, 1994).

OVERVIEW: Assessment is changing for many reasons. Shifts in the skills and knowledge needed for success, in our understanding of how students learn, and in the relationship between assessment and instruction are reshaping our learning goals for students and schools. Consequently, we must change our assessment strategies to tie assessment design and content to new outcomes and purposes for assessment (Bond, Herman, & Arter, 1994; Bond, Herman, & Arter, in press).

As society shifts from an industrial age, in which a person could get by with basic reading and arithmetic skills, to an information age, which requires the ability to access, interpret, analyze, and use information for making decisions, the skills and competencies needed to succeed in today's workplace are changing as well (Bond, 1992; National Center on Education and the Economy, 1989; U.S. Department of Labor, 1991). In response to these changes, content standards - the knowledge, skills, and behaviors needed for students to achieve at high levels - are being developed at the national and state levels in areas such as mathematics, science, geography, and history. These standards have been synthesized in The Systematic Identification and Articulation of Content Standards and Benchmarks (Kendall & Marzano, March 1995 update).

In this atmosphere of reform, student assessment is the centerpiece of many educational improvement efforts. Policymakers hope that changes in assessment will cause teachers and schools to do things differently (Linn, 1987; Madaus, 1985). Assessment reform is viewed as a means of setting more appropriate targets for students, focusing staff development efforts for teachers, encouraging curriculum reform, and improving instruction and instructional materials (Darling-Hammond & Wise, 1985).

Many educators and policymakers believe that what gets assessed is what gets taught and that the format of assessment influences the format of instruction (O'Day & Smith, 1993). Contrary to our understanding of how students learn, many assessments - particularly traditional multiple-choice and true-false assessments - test facts and skills in isolation, seldom requiring students to apply what they know and can do in real-life situations. Standardized tests do not match the emerging content standards, and over-reliance on this type of assessment often leads to instruction that stresses basic knowledge and skills (Corbett & Wilson, 1991; Shepard & Smith, 1988; Smith & Cohen, 1991). Rather than encouraging changes in instruction toward the engaged learning that will prepare students for the 21st century, these tests encourage instruction of less important skills and passive learning:

"The notion that learning comes about by the accretion of little bits is outmoded learning theory. Current models of learning based on cognitive psychology contend that learners gain understanding when they construct their own cognitive maps of the interconnections among concepts and facts. Thus, real learning cannot be spoon-fed, one skill at a time." (Shepard, 1989, pp. 5-6).

Although basic skills may be important goals of education, they are often over-emphasized in an effort to raise standardized test scores. Basic skills and minimum competencies become the overarching goal of schools and teachers as accountability and minimum competency exams concentrate on these areas.

However, educators, policymakers, and parents are beginning to recognize that minimums and basics are no longer sufficient (Winking & Bond, 1995) and are calling for a closer match between the skills students learn in school and the skills they will need upon leaving school. Schools are now expected to help students develop skills and competencies in real-life, "authentic" situations, and schools are expected to graduate students who can demonstrate these abilities - often by their performance on alternative assessments rather than standardized tests.

Audio item: Eva Baker, codirector of the National Center for Research on Evaluation, Standards, and Student Testing (CRESST) at UCLA, talks about students demonstrating what they can do. Excerpted from Schools That Work: The Research Advantage, videoconference 4, Alternatives for Measuring Performance (NCREL, 1991).

IMPLEMENTATION PITFALLS: Some schools try to change everything at once without adequate buy-in from staff. Assessment decisions should always be tied to the purpose of the assessment and the content to be assessed. Teachers need to be involved in the changes, and they need time to decide how best to change the strategies that they use with their students and to incorporate the changes into their practice. Changing learner outcomes and assessments without teacher input and buy-in often results in resistance to change or ineffective shortcuts to change (Corbett & Wilson, 1991).

Advocates of either extreme approach to assessment - alternative assessment only or traditional assessment only - ignore the important goal of selecting assessments that match both the outcomes to be assessed and the purpose of the assessment.

DIFFERENT POINTS OF VIEW: The two major points of view on assessment are those of the constructivist instructional reform advocate and the measurement/technical quality advocate. Indeed, instructional reform and measurement quality are increasingly perceived as two ends of the assessment reform continuum. Most advocates of either goal recognize the importance of the other goal but view their own as paramount. Instructional reform is paramount when the assessment is to be used for local purposes - at the classroom and school level, where the local curriculum is addressed and individual students have multiple opportunities to demonstrate what they know and can do. Technical quality issues become paramount with large-scale assessments at the state, district, or national level that involve high stakes (student or school accountability). Some advocates on either side fail to recognize how this fundamental difference in purpose affects the design of the assessment. Policymakers often try to use the same assessment for both purposes, which leads to conflict between instructional reform and technical quality.

Many constructivist researchers believe that because teachers tend to modify the content and format of instruction to fit a high-profile test, an obvious reform strategy is to change the content and format of such tests to enhance the coverage of important learning outcomes and to mirror good instruction (Simmons & Resnick, 1993). Assessment in a constructivist classroom becomes a learning experience for both student and teacher:

"Instead of giving the children a task and measuring how well they do or how badly they fail, one can give the children the task and observe how much and what kind of help they need in order to complete the task successfully. In this approach the child is not assessed alone. Rather, the social system of the teacher and child is dynamically assessed to determine how far along it has progressed." (Newman, Griffin, & Cole, 1989, p. 87)

Professional development and teacher involvement in assessment design are important components of this approach. The primary goal is to change what and how teachers teach rather than to measure performance for accountability purposes. Advocates of this approach are willing to sacrifice some technical quality to achieve these desired ends. They believe that the measurement community has focused exclusively on technical concerns (reliability or consistency of scores across time and raters) and ease of measurement to the detriment of what is important to assess (validity).

However, measurement issues become more important as the stakes attached to these assessments increase. The consequences or decisions to be made based on any individual exam determine the degree of technical quality demanded of that exam. If a student is denied a high school diploma or access to an educational opportunity based upon the results of a single assessment strategy, the assessment must meet very stringent technical quality criteria. On the other hand, when the results of a weekly classroom exam are combined with the results of several other assessments to determine a student's grade for the semester, the technical quality of each individual assessment is less of a concern.

An equally concerned group of measurement specialists worries that in our rush to match the content of the assessment with new learning goals, we will overlook the technical quality of the assessment. If the results of the assessment are not reliable - that is, consistent no matter who does the assessing or when - students may be judged unfairly, and incorrect decisions may be made about their educational needs. Above all, they argue, if we say that student X has more ability than student Y on a given assessment, we need to have technical evidence to back up this conclusion.
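The kind of consistency described above, results that hold "no matter who does the assessing," can be checked numerically. The following is a minimal Python sketch added for illustration, not a procedure taken from the sources cited here. It assumes two raters have scored the same set of student responses on a hypothetical 1-4 rubric and compares them using simple percent agreement and Cohen's kappa, one common chance-corrected agreement statistic. The function names and the sample scores are invented for the example.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of responses on which the two raters gave the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of responses.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for agreement expected by chance."""
    observed = percent_agreement(rater_a, rater_b)
    n = len(rater_a)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability both raters pick the same score
    # if each assigned scores independently at their own observed rates.
    expected = sum(counts_a[score] * counts_b[score] for score in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical score throughout
    return (observed - expected) / (1 - expected)


# Hypothetical rubric scores (1-4) given by two raters to the same eight essays.
scores_rater_a = [4, 3, 3, 2, 4, 1, 2, 3]
scores_rater_b = [4, 3, 2, 2, 4, 1, 3, 3]

if __name__ == "__main__":
    print(f"Percent agreement: {percent_agreement(scores_rater_a, scores_rater_b):.2f}")
    print(f"Cohen's kappa:     {cohens_kappa(scores_rater_a, scores_rater_b):.2f}")
```

In this invented example the raters agree on six of eight essays, yet the chance-corrected figure is noticeably lower than the raw agreement rate. How much agreement is "enough" depends on the stakes attached to the scores.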

Audio item: The narrator of The ABCs of School Testing describes the need for reliability when scoring writing samples (Joint Committee on Testing Practices, 1993).

This group also points to assessment strategies that yield inconsistent, uninterpretable results. They argue against the use of assessment as a "motivator" and believe that the quality of the assessment information gained is jeopardized when teachers teach to the test (Mehrens & Kaminski, 1989). Every assessment can assess only a sample of the content that the teacher has taught and the student has learned. If that sample adequately represents the total content and the technical quality of the assessment is high, we can infer that students who do well on the assessment also would do well on the remaining content that is not assessed. If teachers teach to the content on the test - ignoring the rest of the content domain - then we cannot make this inference and the utility of the assessment as a measure of the broader domain is lost. Because only a small number of tasks can be given in a performance assessment format (each task or item can take from 20 to 30 minutes or even several days to complete), this concern about the accuracy of inferences becomes even greater.
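The sampling argument above can be made concrete with a rough calculation. The sketch below is an illustration under simplifying assumptions, not an analysis from the cited authors: it treats each task as an independent draw from the content domain and assumes a hypothetical task-to-task standard deviation in a student's proportion-correct score. Under those assumptions, the standard error of the student's average score shrinks with the square root of the number of tasks, so an assessment with only a few tasks supports a much less precise inference about the whole domain. The function names and all numbers are invented for the example.

```python
import math


def standard_error(task_sd, n_tasks):
    """Standard error of a mean score over n independent tasks (assumed model)."""
    if n_tasks < 1:
        raise ValueError("At least one task is required.")
    return task_sd / math.sqrt(n_tasks)


def margin_of_error(task_sd, n_tasks, z=1.96):
    """Approximate 95% margin of error around the mean score (normal approximation)."""
    return z * standard_error(task_sd, n_tasks)


# Hypothetical spread of one student's scores from task to task (0-1 scale).
ASSUMED_TASK_SD = 0.2

if __name__ == "__main__":
    # Compare a short performance assessment with a longer multiple-choice test.
    for label, n_tasks in [("3 performance tasks", 3), ("40 short items", 40)]:
        moe = margin_of_error(ASSUMED_TASK_SD, n_tasks)
        print(f"{label}: margin of error about +/- {moe:.3f}")
```

Real tasks are rarely independent or equally variable, so this understates the complexity of the problem. It shows only the direction of the effect that the measurement specialists describe.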

Audio item: Gwyneth Boodoo and William Mehrens, measurement specialists, talk about the difficulty of making inferences regarding student abilities when using performance assessments. Excerpted from The ABCs of School Testing (Joint Committee on Testing Practices, 1993).

This Critical Issue Summary was researched and written by Linda Ann Bond, director of assessment, North Central Regional Educational Laboratory.

Date posted: 1995