alternative assessment: An assessment in which students originate a response to a task or question. Such responses could include demonstrations, exhibits, portfolios, oral presentations, or essays. (Compare to traditional assessment.)
authentic assessment: An assessment presenting tasks that reflect the kind of mastery demonstrated by experts. Authentic assessment of a student’s ability to solve problems, for example, would assess how effectively a student solves a real problem.
benchmark: A standard or point of reference for measuring and analyzing existing or proposed curricula in light of content-specific goals or standards. Benchmarks often are used in conjunction with standards.
data-driven decision making: A process of making decisions about curriculum and instruction based on the analysis of classroom data and standardized test data. Data-driven decision making uses data on the function, quantity, and quality of inputs, and on how students learn, to suggest educational solutions. It is based on the assumption that scientific methods used to solve complex problems in industry can effectively evaluate educational policy, programs, and methods.
high-stakes testing: “The term used for assessments that determine if a student is retained in a grade or allowed to receive a diploma and graduate” (Lynch, 2000, p. 216).
performance assessment: Direct, systematic observation of an actual student performance, or of examples of student performances, and rating of that performance according to pre-established performance criteria. Students are assessed on both the result and the process they engage in while completing a complex task or creating a product.
qualitative research: Collection of nonnumerical data using interviews, observations, and open-ended questions to gather meaning from nonquantified narrative information.
quantitative research: Collection of numerical data in order to describe, explain, predict, and/or control phenomena of interest.
reliability: The degree to which an instrument consistently measures in the same way on repeated trials (e.g., a math test given to a student one day would yield roughly the same score if given to the same student the next day). A reliable assessment is one in which the same answers receive the same score regardless of who performs the scoring or how or where the scoring takes place, and in which the same person is likely to get approximately the same score across multiple test administrations.
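The two senses of reliability in the entry above (stable scores across administrations, and the same score regardless of who does the scoring) can be illustrated with a small calculation. The sketch below is only one possible illustration, not a method prescribed by this glossary: it assumes that consistency across two administrations is summarized with a Pearson correlation and that scorer consistency is summarized as the share of responses given identical scores. The function names and all score data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for the correlation to be defined")
    return cov / sqrt(var_x * var_y)


def percent_agreement(rater_a, rater_b):
    """Share of responses to which two scorers gave exactly the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty lists of scores")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return matches / len(rater_a)


# Hypothetical data: the same five students take the same math test on two days.
day_one = [78, 85, 62, 90, 71]
day_two = [80, 83, 65, 92, 70]
test_retest = pearson_r(day_one, day_two)

# Hypothetical data: two scorers rate the same five essays on a 1-4 rubric.
scorer_a = [4, 3, 2, 4, 1]
scorer_b = [4, 3, 3, 4, 1]
agreement = percent_agreement(scorer_a, scorer_b)

print(f"test-retest correlation: {test_retest:.2f}")
print(f"scorer agreement: {agreement:.0%}")
```

A correlation near 1 and a high agreement rate would both point toward a reliable instrument in the sense defined above; how high is high enough depends on the purpose of the assessment and is not specified here.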
standardized tests: Tests administered under controlled conditions that specify where, when, how, and for how long students may respond to the questions or "prompts." Assessments are administered and scored in exactly the same way for all students. Traditional standardized tests are typically mass-produced and machine-scored; they are designed to measure skills and knowledge that are thought to be taught to all students in a fairly standardized way. Performance assessments also can be standardized if they are administered and scored in the same way for all students.
summative assessment: An assessment at the end of a program period determining the overall quality of a program's results.
traditional assessment: An assessment in which students select responses from a multiple-choice, true/false, or matching list, or work out the full solution of an equation, set out the proof in a geometry problem, etc. (Compare to alternative assessment.)
validity of an instrument: The degree to which a measure accurately assesses the specific concept it is designed to measure, excluding extraneous features from such measurement (e.g., whether a reading-comprehension assessment focuses on students' understanding of a story or their ability to read the story).
Lynch, S. J. (2000). Equity and science education reform. Mahwah, NJ: Lawrence Erlbaum Associates.
Copyright © North Central Regional Educational Laboratory. All rights reserved.