Standards for Educational and Psychological Testing (American Psychological Association, 1985) is the definitive statement of measurement standards issued by the American Educational Research Association, the American Psychological Association, and the National Council on Measurement in Education. It consists of four parts: technical standards for test construction and evaluation, professional standards for test use, standards for particular applications, and standards for administrative procedures. It is designed to guide professional practice and to be useful to a wide range of people who work with tests or test results. Using these standards, educators can judge whether a particular test is technically adequate and appropriate for their purposes, and whether they can reasonably draw conclusions from its results.
A test that is technically adequate meets criteria for validity, reliability, and norms:
Validity is the extent to which the instrument measures what it purports to measure. Validity, therefore, must always be judged in relation to a particular test use (Langhorst, 1989, p. 5).
Adequate reliability requires that errors of measurement be small. Such errors can be introduced, for example, by inconsistencies in the performance of those being tested or by variability in their interest, motivation, or other emotional and physical states, any of which may be unrelated to the purpose of the test. Such influences make the stability of young children's test scores very problematic.
Norms permit the performance of new groups of test takers to be compared with that of the sample of students on whom the test was standardized. Appropriate use of tests requires that the norming sample be comparable in age, demographic characteristics, and community background to the group of children being assessed.
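Two of these criteria, reliability and norms, lend themselves to a small numerical illustration. The sketch below is not drawn from the Standards themselves. It assumes the classical test theory relationship between reliability and the standard error of measurement, uses a test-retest correlation as one common index of score stability, and computes a percentile rank against a norming sample using one common convention for tied scores. All function names and scores are hypothetical.

```python
import math


def pearson_r(first, second):
    """Correlation between two administrations of the same test.

    Used here as a test-retest index of score stability (an assumption;
    other reliability estimates exist).
    """
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    return cov / math.sqrt(var_a * var_b)


def standard_error_of_measurement(score_sd, reliability):
    """Classical test theory: SEM = SD * sqrt(1 - reliability).

    The lower the reliability, the larger the expected error in any
    single observed score.
    """
    return score_sd * math.sqrt(1.0 - reliability)


def percentile_rank(score, norm_scores):
    """Percent of the norming sample scoring below `score`.

    Tied scores count as half (one common convention, assumed here).
    """
    below = sum(1 for s in norm_scores if s < score)
    equal = sum(1 for s in norm_scores if s == score)
    return 100.0 * (below + 0.5 * equal) / len(norm_scores)
```

For example, under these assumptions a test with a score standard deviation of 15 and a reliability of 0.91 has a standard error of measurement of about 4.5 points, while the same test with a reliability of 0.64 has one of about 9 points. A percentile rank computed this way is only meaningful if the norming sample resembles the children being assessed, which is the comparability requirement stated above.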