Critical Issue: Ensuring Equity with Alternative Assessments
ISSUE: If American students are to be held responsible for achieving high educational standards, it is ethically imperative that educators develop assessment strategies that ensure equity in assessing and interpreting student performance. In order to protect students from unfair and damaging interpretations and to provide parents and communities with an accurate overall picture of student achievement, educators need to be aware of the promise and the challenges inherent in using alternative assessment practices for high-stakes decisions (such as student retention, promotion, graduation, and assignment to particular instructional groups), which have profound consequences for the students affected. Only then will educators be able to build and use an assessment system that is a vehicle for eliminating, as opposed to underscoring, educational inequities. Although alternative assessments can help ensure ethnic, racial, economic, and gender fairness, equity cannot be achieved by reforms to assessment alone. Change will result only from a trio of reform initiatives aimed at ongoing professional development in curriculum and instruction, improved pedagogy, and quality assessment.
OVERVIEW: One of the reasons for the current national disenchantment with standardized multiple-choice tests, secured tests, and other norm-referenced assessments has been the gross inequities that have resulted from inferences based solely on these tests. In many schools, districts, and states, interpretations based on a single test score have been used to place students in low-track classes, to require students to repeat grades, and to deny high school graduation diplomas. The negative personal and societal effects for students are well-documented: exposure to a less challenging curriculum, significantly increased dropout rates, and lives of unemployment and welfare dependency (Oakes, 1986a; Oakes, 1986b; Shepard & Smith, 1986; Jaeger, 1991). Clearly, using testing as a mechanism for sorting and selecting students for access to educational and economic opportunities is antithetical to achieving equity.
Charlotte Higuchi, a third- and fourth-grade teacher at Farmdale Elementary School in Los Angeles, California, discusses the problems inherent in standardized testing. Excerpted from the video series Schools That Work: The Research Advantage, videoconference #4, Alternatives for Measuring Performance (North Central Regional Educational Laboratory, 1992).
At all levels, educators are turning to alternative, performance-based assessments that are backed by criterion-referenced standards. Such assessments help educators gain a deeper understanding of student learning, and enable them to communicate evidence of that learning to parents, employers, and the community at large. These new alternative assessments and standards have been heralded as the answer to a whole host of education ills, including the apparent or real gap in performance between students of different ethnic, socioeconomic, and language backgrounds. Research on learning and assessment and on the prevailing practice of shaping instruction to meet test requirements help build the case for alternative assessment.
Findings from cognitive psychology on the nature of meaningful, engaged learning support the use of alternative assessments that are tied to curriculum and instruction and that emphasize higher-order thinking skills and authentic tasks. Alternative assessments often have high fidelity to the goals of instruction and require students to solve complex, real-life problems. Some educators believe that alternative assessments motivate students to show their best performance--performance that may have been masked in the past by standardized fixed-response tests and by unmotivating content. However, the biggest mistake that schools, districts, and states can make is thinking that exchanging one high-stakes test for another will result in equitable assessment or elimination of the performance gap between students. Darling-Hammond (1994) believes that if new forms of assessment are to support real and lasting reforms and to close--as opposed to accentuate--the achievement gap between students, they must be developed carefully and used for different purposes than the norm-referenced tests that preceded them. These purposes must be made explicit before the assessment system is built.
Linda Darling-Hammond, a researcher and author in the area of assessment and equity, discusses how assessment can enhance equity when changes are made in the ways that assessments are used. Excerpted from an interview with Linda Darling-Hammond (North Central Regional Educational Laboratory, 1996).
It is true that new forms of assessment are powerful tools for understanding student performance, particularly in areas that require critical thinking and complex problem solving. However, until high expectations for success, sufficient opportunity to learn, and challenging instruction are the standard educational fare for all children, some evidence (Elliott, 1993; LeMahieu, Eresh, & Wallace, 1992) suggests that alternative assessments may reveal even greater achievement gaps than standardized assessments.
One of the most exciting and liberating things about the current interest in assessment is the recognition that numerous assessment tools are available to schools, districts, and states that are developing new assessment systems. These tools range from standardized fixed-response tests to alternatives such as performance assessment, exhibitions, portfolios, and observation scales. Each type of assessment brings different strengths and weaknesses to the problem of fair and equitable assessment. Given the complexity of understanding performance or success for individuals, it is virtually impossible for any single tool to do the job of fairly assessing student performance. Instead, the National Center for Research on Evaluation, Standards, and Student Testing (1996) suggests that an assessment system made up of multiple assessments (including norm-referenced or criterion-referenced assessments, alternative assessments, and classroom assessments) can produce "comprehensive, credible, dependable information upon which important decisions can be made about students, schools, districts, or states." Koelsch, Estrin, and Farr (1995) note that multiple assessment indicators are especially important for assessing the performance of ethnic-minority and language-minority students. The real challenge comes in selecting or developing a combination of assessments that work together as part of a comprehensive assessment system to assess all students equitably within the school community.
The first and most critical step in assessing with equity is determining the purposes for assessing and clarifying whether those purposes are low stakes or high stakes (Winking & Bond, 1995). In many cases, schools, districts, and states have not a single purpose, but multiple purposes--some low stakes and some high stakes--for assessing student performance.
Beau Fly Jones, director of educational programs at the Ohio Supercomputing Center in Columbus, Ohio, discusses the purposes of assessment. Excerpted from the video series Restructuring to Promote Learning in America's Schools, videoconference #4, Multidimensional Assessment: Strategies for the Classroom (North Central Regional Educational Laboratory, 1990).
In the low-stakes case of classroom-based assessment, where the primary purpose is determining content coverage and conceptual understanding or diagnosing learning styles, teachers are able to take into account the student's culture, prior knowledge, experiences, and language differences. When preparing and administering assessments, teachers can follow guidelines for equitable assessment in the classroom and make use of accommodations and adaptations to the assessment to ensure that all students have an equal opportunity to demonstrate their abilities and achievement. Teachers also are able to make inferences about student performance and how they must refine their instruction to increase or maintain high performance without calling into question the technical adequacy of the assessment.
However, when tests have high-stakes consequences (such as student retention, promotion, or graduation), it is important to understand ways to maximize equity while not compromising the technical quality of alternative assessments. In high-stakes situations, the technical adequacy of the assessment affects the validity of inferences made regarding the performance of all students. When alternative tests are used for high-stakes purposes, schools--in addition to being concerned about equity when selecting or developing assessments--must take advantage of methods for maximizing fairness in administering and scoring them. Of utmost importance is ensuring that students have had adequate opportunity to learn the material on which they are being tested.
To help ensure equity, an assessment system should be planned by an interdisciplinary group that includes assessment experts, curriculum experts, teachers, and professional developers, as well as administrators responsible for planning and allocating resources. All involved parties need to understand exactly what alternative assessment systems can and cannot achieve, including the fact that unless instruction and pedagogy change and opportunities are provided for all children to experience the same challenging curriculum, alternative assessments may reveal even greater performance gaps than the standardized assessments they replace. (For further information on the relationship between assessment and school reform, refer to the Critical Issue "Rethinking Assessment and Its Role in Supporting Educational Reform.") Teachers and other staff members need to be provided with professional development and support to learn about alternative assessments. (Refer to the Critical Issue "Realizing New Learning for All Students Through Professional Development.")
The actual design of the assessment system should include input from students and individuals who can provide advice on different cultural interpretations of various assessment tasks. After the planning is completed, a bias-review committee (comprising representatives from cultural and ethnic groups for whom the assessment is intended) can preview the assessment and ensure that it is fair and equitable. The planning team's next task is to ensure that the methods for scoring and interpreting the assessment results reflect the concern for equity that has driven the development of the alternative system. Finally, decisions should be made on the best methods for reporting results to various audiences and for various purposes. (Refer to the Critical Issue "Reporting Assessment Results.")
At the national level, state and local issues related to assessing with equity are mirrored and compounded. Because cultural learnings and context are so important to students' interpretations and responses (Winfield & Woodard, 1994; Darling-Hammond, 1994), moving high-stakes assessment to a national level makes it even more difficult to align tasks with students' culture and context, and potentially reduces the legal defensibility of these assessments. The landmark case of Debra P. v. Turlington (1979) set a precedent for challenging assessment inferences when students have not had sufficient opportunity to learn the content assessed. This precedent may easily be transferred to high-stakes assessments that are not culturally or contextually based within students' realm of experience. The ability to assess equitably in high-stakes situations is crucial when considering a national assessment and suggests that the most useful context for developing performance-based assessments may be the local level. On the other hand, the New Standards Project provides an example of a voluntary large-scale standards and assessment reform system that combines national reference assessments with locally developed performance tasks and portfolios in ways that potentially allow for culturally and contextually valid assessment.
Steve Ferrara, director of student assessment at the Maryland State Department of Education, talks about the difficulty of changing curricula, instruction, expectations, and standards--all of which affect assessment. Excerpted from an interview with Steve Ferrara (North Central Regional Educational Laboratory, 1995).
Regardless of the level of the assessment effort, equity will never be achieved as long as everyone involved in educating children sees the assessment tools themselves as responsible for ensuring fairness. It is not just the tools, but also the curriculum, instruction, professional development, parent and community involvement, and leadership practices that affect the fairness of assessments and the inferences based on them. Using alternative assessment to assess with equity requires the comprehensive inclusion of each of these elements of the equity equation. Without these supporting systems, new forms of assessment are likely to maintain and perhaps magnify educational inequities.
ACTION OPTIONS: Because the use of alternative assessment--including performance assessment--for high-stakes purposes is relatively new, there is still much debate about the appropriate standards for technical rigor, and practitioners and researchers are still exploring methods for maximizing equity. Although ensuring fairness in performance assessment remains a challenge, some procedures are available to help increase equity in alternative assessment. In addition to applying statistical techniques such as differential item functioning (DIF) analysis, which is used with standardized tests to determine item bias, educators can take the following actions to help ensure the building of a performance-based assessment system that will address high standards and achieve equitable outcomes.
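To make the DIF analysis mentioned above concrete, the sketch below implements the Mantel-Haenszel procedure, a standard statistical check for item bias that compares a reference group and a focal group after matching examinees on total test score. The function name and sample data are illustrative; an operational DIF screening would rely on an established psychometric package and much larger samples.

```python
import math
from collections import defaultdict

def mantel_haenszel_dif(responses, groups, scores):
    """Return the ETS MH D-DIF statistic for one item, or None if undefined.

    responses: list of 0/1 item scores (1 = correct)
    groups:    list of 'ref' or 'focal' labels, one per examinee
    scores:    list of total-test scores used as the matching variable
    """
    # Build a 2x2 table (group x correct/incorrect) at each matched score level.
    tables = defaultdict(lambda: [[0, 0], [0, 0]])
    for r, g, s in zip(responses, groups, scores):
        row = 0 if g == 'ref' else 1
        col = 0 if r == 1 else 1
        tables[s][row][col] += 1

    # Mantel-Haenszel common odds ratio, pooled across score levels.
    num = den = 0.0
    for (a, b), (c, d) in tables.values():
        n = a + b + c + d
        if n == 0:
            continue
        num += a * d / n   # ref-correct * focal-incorrect
        den += b * c / n   # ref-incorrect * focal-correct
    if num == 0 or den == 0:
        return None
    alpha = num / den
    # ETS delta-scale transformation; negative values favor the reference group.
    return -2.35 * math.log(alpha)

# Illustrative data: both groups perform identically at the matched score level,
# so the item shows no DIF (a D-DIF of approximately zero).
responses = [1] * 10 + [0] * 10 + [1] * 10 + [0] * 10
groups = ['ref'] * 20 + ['focal'] * 20
scores = [5] * 40
print(mantel_haenszel_dif(responses, groups, scores))
```

By the commonly cited ETS convention, absolute D-DIF values below 1.0 are treated as negligible; larger values flag the item for review. Note that such statistics only screen individual items for bias--they do not address the opportunity-to-learn and instructional concerns discussed throughout this section.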
When planning assessment systems, educators can:
When developing, selecting, and administering alternative assessments, educators can:
When interpreting, scoring, reporting, and using assessment results, educators can:
IMPLEMENTATION PITFALLS: Some types of alternative assessment require teachers to devote considerable time to planning and administering the assessment as well as interpreting student achievement.
Schools may think that the substitution of one high-stakes test for another will result in equitable assessment or the elimination of performance gaps. Yet performance gaps are likely to continue if teaching and assessment strategies remain unchanged. Linn, Baker, and Dunbar (1991) note:
"Gaps in performance among groups exist because of differences in familiarity, exposure, and motivation on the tasks of interest. Substantial changes in instructional strategy and resource allocation are required to give students adequate preparation for complex, time-consuming, open-ended assessments." (p. 18)
Schools may develop and use alternative assessments with the expectation that a better monitoring system or new forms of assessment alone will address inequitable learning outcomes for students. In actuality, assessment must be integrated with curriculum and instruction in order to promote equity in student learning. (Refer to the Critical Issue "Integrating Assessment and Instruction in Ways That Support Learning.")
In an effort to address higher-order cognitive skills, schools may develop assessments that have ambiguous performance tasks or requirements. Such tasks or requirements may be interpreted very differently by different cultural groups.
Schools may attempt to use alternative assessments for sorting and classifying students according to ability level instead of for improving instruction and raising student achievement. Darling-Hammond (1994) notes that in order to close the achievement gap, new forms of assessment must be developed carefully and be used for different purposes than norm-referenced tests.
Schools and districts may fail to develop policies for using alternative assessment information to improve instruction. They also may not provide ongoing professional development in alternative assessment for teachers. Winfield and Woodard (1994) note: "Merely setting high standards and developing a new assessment system will not ensure changes in teacher behavior or student performance unless professional development activities and capacity building at the school level are given equal priority" (p. 8).
Bond, Moss, and Carr (1996) caution that assessments--even those deemed to be unbiased--may be used to support a policy or program that does not promote equity:
"Concerns about equity spill over the consensual bonds of validity and bias to include questions about the educational system in which the assessment was used. It is possible for an assessment to be considered unbiased in a technical sense--in the sense that the intended interpretation is equally valid across various groups of concern--and yet be used in service of a policy that fails to promote equity....The question for assessment evaluators is whether an assessment is contributing to or detracting from the fairness of the educational system of which it is a part." (p. 118)
Some teachers, parents, and community members may express resistance to any form of alternative assessment. Teachers, in particular, may object to the additional time necessary for developing and grading performance assessments, and may have difficulty in specifying criteria for judging student work.
Schools, districts, and states may exempt from assessments students who traditionally have not performed well (e.g., second-language learners), thereby avoiding the problem of developing fair measures that provide a picture of the entire school community (Phillips, 1996).
Educators may administer alternative assessments and then rush to blame the test or the children for performance gaps. Instead, educators need to be accountable for student achievement. They also must align assessment with curriculum and instruction in order to improve student learning.
When reporting assessment results, educators must learn to use opportunity-to-learn data with care. Some schools and districts report scores for subgroups of students in the absence of opportunity-to-learn data; other schools develop opportunity-to-learn standards that measure only easy-to-access variables that are ancillary to good instruction (e.g., number of books in the library).
When analyzing test results, pairing isolated opportunity-to-learn variables with subgroup data can lead to erroneous cause-and-effect interpretations. For example, comparing the performance of Hispanic and non-Hispanic students along with the amount of reading assigned outside of school is inappropriate because of the lack of information on other important contextualizing factors.
DIFFERENT POINTS OF VIEW: Although no educator would say that equitable assessment is unimportant, there are emerging schools of thought about the nature of equity and how it relates to assessment. In particular, these viewpoints relate to achieving a level playing field for assessing student work. Most researchers and practitioners agree that equity must be a major consideration when planning, developing, and administering assessment systems. Some researchers (Garcia & Pearson, 1994; Johnston, 1992; Estrin, 1993), however, believe that students' cultural learnings and interpretations of the world around them are so tied to their responses that it is unfair not to address these learnings and interpretations directly. These researchers feel that the only way to truly understand a student's performance is through assessments that are situated in the local realities of schools, classrooms, teachers, and students. Proponents of situated assessment argue that it is unlikely that large-scale, high-stakes assessment could ever equitably measure student performance. They see familiar raters (the student's teacher or panels of individuals) as best able to judge a student's work because familiarity is necessary to understand the response patterns and culturally tied conceptions of testing and learning that each student brings to the assessment situation.
Maryland School Performance Assessment Program
Kentucky Instructional Results Information System (KIRIS)
Assessment at The Key Learning Community, Indianapolis, Indiana
Assessment Training Institute
50 S.W. Second Ave., Suite 300
Portland, OR 97204-2636
Contact: Rick Stiggins
Grant Wiggins & Associates
4095 US Route 1 - Suite 104
Monmouth Junction, NJ 08852
732.329.0641 , Fax: 732.438.9527
Contact: Grant Wiggins
Center for the Study of Testing, Evaluation, and Educational Policy
School of Education
Chestnut Hill, MA 02167
(617) 552-4521; fax (617) 552-8419
Consortium for Equity in Standards and Testing (CTEST)
School of Education
Chestnut Hill, MA 02167
(617) 552-3574; fax (617) 552-8419
Contact: Steve Stemler, Research Assistant
Laboratory Network Program on Alternative Assessment
Northwest Regional Educational Laboratory
101 S.W. Main Street, Suite 500
Portland, OR 97204
(503) 275-9500; fax (503) 275-9489
Contact: Judy Arter
National Center for Fair and Open Testing (FairTest)
Cambridge, MA 02139
(617) 864-4810; fax (617) 497-2224
Contact: Leslie Snerder
National Center for Research on Evaluation, Standards, and Student Testing
UCLA Graduate School of Education and Information Studies
Center for the Study of Evaluation
405 Hilgard Ave.
1320 Moore Hall, Mailbox 951522
Los Angeles, CA 90095-1522
(310) 206-1532; fax (310) 825-3883
Contact: Joan Herman, Associate Director
National Center on Educational Outcomes (NCEO)
University of Minnesota
College of Education and Human Development
350 Elliott Hall
75 E. River Road
Minneapolis, MN 55455
(612) 626-1530; fax (612) 624-0879
Contact: Jim Ysseldyke, Director
Executive Director, Philadelphia Education Fund
7 Benjamin Franklin Parkway
Philadelphia, PA 19183
(215) 665-1400, ext. 3313
Deborah Winking, Ph.D.
Senior Consultant, Panasonic Foundation
8191 Moller Ranch Drive
Pleasanton, CA 94588
(510) 417-0943; fax (510) 461-2576
This Critical Issue summary was researched and written by Deborah Winking, senior consultant at Panasonic Foundation in Pleasanton, California, and formerly an evaluation associate at North Central Regional Educational Laboratory in Oak Brook, Illinois.
Date posted: 1997