Chapter 1: Reducing Class Size in Public Schools: Cost-Benefit Issues and Implications
John F. Witte
A Review of the Research
The effect of class size on achievement, along with the larger question of whether the amount of money spent on K-12 education is related to achievement levels, has been controversial for more than two decades in the United States. The literature of Hanushek (1979, 1986, & 1997); Hedges, Laine, & Greenwald (1994); and Hedges & Greenwald (1996) reviews more than one hundred experimental and quasi-experimental studies. These studies reach conclusions ranging from no long-term achievement gains to modestly positive long-term effects. The largely academic debate that centered on tying achievement to expenditures and class size has been overshadowed in recent years by state and national policies to reduce class size. By 1995, 11 states had passed some form of legislation to reduce class sizes in some schools (Bracey, 1995). The National Conference of State Legislatures reports that 30 states are engaged in some form of class-size reduction effort. National legislation in 1998 provided $1.2 billion for class-size reductions, and proposed 1999 legislation sought to provide another $20.8 billion over ten years.
This nationwide movement toward smaller classes was undoubtedly affected by a class-size reduction experiment begun in 1985 in Tennessee called the Student/Teacher Achievement Ratio, or STAR, study. The positive and lasting effects on achievement reported in that study received widespread publicity. Since then, projects and evaluations also have occurred in Wisconsin (the Student Achievement Guarantee in Education, or SAGE, program), California (the Class Size Reduction, or CSR, program), and other sites. These three state-level projects are compared and reviewed later in this chapter.
Because of the potential benefits and increased costs of class-size reduction, the Regional Educational Policy Research Consortium, convened under the auspices of the North Central Regional Educational Laboratory (NCREL), sponsored a conference on recent research on class-size reduction programs. This conference was held in Chicago on October 1, 1999. The purpose of the conference was to address four central questions:
As an overview of the cost and benefit issues related to class-size reduction, this chapter first summarizes the discussion concerning the three programs in Tennessee, Wisconsin, and California that provide the most recent data on class-size reduction. As will be shown, each is distinct and in a different stage of analysis, but each adds to our knowledge of this policy intervention. Next is a discussion of the framing questions listed above and a look at some of the observations and conclusions that can be drawn from the NCREL conference. (Question 4 is subsumed under the first question because the discussion of benefits and costs at the NCREL conference included a lengthy discussion of the effects of smaller classes on teacher quality and demand.) The chapter closes with a summary of the research issues and priorities indicated by conference participants. This chapter will not minimize points of contention among educators concerning what we know and what the data show. However, its purpose is to note areas of agreement and identify where future research should be directed.
A Comparison of Tennessee, Wisconsin, and California Programs and Experiments With Class-Size Reduction
Three relatively large-scale experiments or programs in class-size reduction have been completed or are ongoing in the United States: Tennessee's STAR study, Wisconsin's SAGE program, and California's CSR program.
Tennessee's Student/Teacher Achievement Ratio (STAR)
The Tennessee STAR study was actually a constellation of studies, beginning with the DuPont Pilot Project in 1984, which was a pilot for the STAR experiment from 1985 to 1990; the Lasting Benefits Study from 1989 to 1995; and follow-up work that continues today. The main study began in fall 1985. Schools volunteered for the program, and 79 schools were selected representing a mixture of school areas (rural, suburban, urban, inner-city).
To be eligible for selection, schools had to be large enough to have three kindergarten classes (57 students) and to accommodate at least one control and two treatment groups (a small class and a regular class with a teacher and an aide). Students were added to the experiment as the first class progressed into first grade (kindergarten was not required in Tennessee). New students also were admitted to program groups laterally in the higher grades. Students were randomly assigned to classes and treatment groups, and teachers were randomly assigned to groups each year. No other interventions were conducted in order to provide as little disruption as possible and provide as uncontaminated a test of smaller classes as possible. After the four-year experiment, follow-up data collection occurred as part of the Lasting Benefits Study as students entered higher grades and returned to normal classroom situations. To date, analysis of data through Grade 8 is available. A range of outcome measures were used, including both norm- and criterion-referenced achievement tests, retention in grade, class behavior, class disruptions (pull-outs), and teacher and aide assessments of classroom conditions and satisfaction. Considerable classroom observation took place, and data were collected on each school.
Part of the Tennessee class-size reduction efforts was the Challenge project, which was directed at the state's 16 poorest districts. These districts were given grants to reduce the sizes of K-3 classes all at one time. Study of that program is at the aggregate district level (1989-1995).
Wisconsin's Student Achievement Guarantee in Education (SAGE)
The SAGE program in Wisconsin, discussed in greater detail later, grew out of a state commission headed by Alex Molnar that was charged with studying and recommending policies on improving urban education and reducing youth violence.
Based on their own research, as well as the STAR study reports, the commission members conceived of the idea of reducing class size for the purposes of increasing meaningful contact between youth and adults. The commission proposed the class-size reduction program, along with other school interventions, and the legislature funded the program in 1995. The SAGE program began in the 1996-97 school year in 30 volunteer schools in which at least 30 percent of the students lived below the poverty line. No schools were turned down in the first year, and the program employed a range of classroom treatments. It established the class size at 15 students with a single teacher. The program also required other changes in the school: using rigorous curricula, extending school days, opening schools to students and the community in the evening, increasing staff development, and improving teacher accountability. Up to $2,000 per low-income pupil was provided for the program classes.
The research design did not include randomization. Rather, it relied on a "matched" set of control schools (on family income, reading achievement, size, and racial composition) from the same districts as the experimental schools but where no programmed interventions took place. The experimental intervention began in kindergarten and Grade 1 and continued as students progressed to Grades 2 and 3. The Terra Nova Comprehensive Test of Basic Skills was administered in both October and May in Grade 1 and in May for Grades 2 and 3. Additional information was obtained from teacher questionnaires and surveys, teachers' logs, classroom observations, and student administrative records. The study pilot program and the study itself continue for five years, through 2001-02. However, in 1997 and again in 1998, the legislature increased the size of the program.
California's Class Size Reduction (CSR)
The California Class Size Reduction (CSR) program is not an experiment at all.
The legislature enacted Governor Pete Wilson's proposal to reduce class size throughout the state in spring 1996, and the CSR program began that fall. The program affected K-3 classes and provided per-student funding for small classes for all classes in a school if all K-3 classes were limited to 20 students. If this reduction in class size was achieved, the districts received $650 per student (raised the following year to $800). They also received facilities grants of $25,000 (raised the following year to $40,000). Schools that already had classes at the 20-student limit were eligible for funding. In the second year (1997-98), 1.6 million students were in small classes at an annual cost of $1.5 billion (Brewer et al., 1999). It is estimated that the program eventually will affect 2.6 million students.
A CSR Research Consortium study of the program provides the first outcome measures through the 1998-99 year. The study design included 432 schools and surveys of 1,485 teachers, 336 principals, and 2,113 third-grade parents. Because all schools were offered small classrooms, the study compared those that implemented the program with those that did not. There was no random assignment, and therefore schools could only be matched on whether or not they had implemented small class sizes. Data collection included Stanford Achievement Test scores; administrative data; data on students with disabilities; parent, teacher, principal, and district superintendent surveys; and classroom observations and case studies.
Summary
These three prominent programs in class-size reduction vary considerably. The STAR study was meant to be a random-assignment, isolated-effects study of two treatments: small sizes or normal sizes with a teacher and an aide in the classroom. It included a within-school randomization in an attempt to control for school-level effects.
The SAGE study targeted low-income students, but did not employ random assignment and involved a range of interventions in addition to reduced class size. The evaluation employs a matched-school comparison. The CSR program is a statewide, nontargeted program that will affect every school district in the state. The evaluation study compares schools that did not implement class-size reduction in the first years with those that did. Although this variety of programs and evaluations seems problematic for making generalizations, the variations also provide unique information, which does require confirmation but makes strong suggestions regarding both the benefits and costs of these programs.
Benefits and Costs of Class-Size Reduction Programs
A number of states had class-size reduction programs before the recent surge (Texas, 1985; Indiana, 1985; Oklahoma, 1989; and Utah, 1990). Tennessee, Wisconsin, and California, however, provide the most useful information. Evidence from the SAGE and CSR programs covers only the first years of those evaluations. In addition, many of the crucial issues are just beginning to be addressed and considerably more research and analysis are required. Later in this chapter, we will try to highlight points of general agreement among educators, but current data are inadequate to understand more subtle issues, seriously contested issues, and unexplored questions for further research.
Benefits
Although most educators recognize the limitations of relying solely on achievement test data to measure educational success and many evaluations are attempting to analyze other measurement criteria, most evaluations have focused on test scores. All three studies provide some evidence for standardized test advantages for small classes. The advantage on a yearly basis was at least 0.1 standard deviations. The findings covered language arts and mathematics in all grades, and science and other subjects in several grades.
The STAR study claimed such benefits over four years and that the beneficial effects lasted at least through Grade 8. The SAGE results were reported for first- and second-grade students, representing the first two years in a five-year evaluation. Similarly, the CSR Research Consortium reports only first-year results for third-grade students. SAGE estimates were based on value-added measures, meaning that prior achievement was controlled for, and achievement can be viewed as added education over a year. They also used standard regression analysis and hierarchical linear models that estimated small classroom differences after controlling for individual student differences. Both the STAR and SAGE studies report considerably higher achievement differences of smaller classes for minority students than for white students. For example, Finn and Achilles (1999) report for STAR that after kindergarten, minority student achievement gains from small classes were at least twice as great as the gains of white students in reading and close to that level in mathematics. Similarly, Molnar et al. (1999) report that African-American students in first grade in the SAGE schools gained more on all subtests than African-American students in the control schools. For the total test score, the advantage is approximately 50 percent. African-American students also gained more than white students in the SAGE schools (Molnar et al., 1999). The CSR study, however, did not find any differences between groups of students. It concluded that "[r]elative to students in larger classes, third-grade students in smaller classes showed, on average, a small positive achievement gain. The level of gain was similar for all groups of students, regardless of ethnicity, income status, or English language ability" (CSR Research Consortium, 1999, p. 1). Study findings also differed for classroom configurations other than single-teacher, small classes. 
The CSR study did not analyze any configuration other than small versus regular classes. However, the STAR and SAGE studies had alternative models. The STAR study concluded that adding a teacher aide in a regular classroom had no statistically significant effect over a single-teacher, regular-size class. In an analysis of 1996-97 SAGE first-year classes, SAGE reports a similar effect on the post-achievement results for both single-teacher, small classes and a two-teacher, 30-student class. Hierarchical/linear modeling in both configurations produced positive results compared to the control classrooms (Molnar et al., 1999). This finding, if supported by further study, is potentially very important, because it provides policymakers with an option that could reduce costs of adding new facilities. Other benefits also were claimed for some of the small-size class experiments. The STAR study found that teachers reported more positive classroom behavior in the smaller classes during the experiment. In addition, a follow-up study in the fourth grade found better learning behavior for small-class students than others. The study included measures of effort, initiative, and nonparticipatory behaviors (disciplining). The effects ranged from 0.11 to 0.14 standard deviations (Finn & Achilles, 1999). STAR also found fewer class pull-outs (for disabled students and others), less retention in grade, and positive effects on parental involvement and teacher satisfaction for small classes. STAR researchers estimated that 383 fewer teachers were needed after the program because fewer students were retained. Similar results over much less time were reported for SAGE and CSR. SAGE reported improved classroom discipline and other pedagogical benefits that will be discussed below. Statistical significance was not discussed (Molnar, Smith, & Zahorik, 1998). 
Similarly, the CSR study reported that less time was spent on disciplining students, and that parents of children in reduced-size classes were more satisfied than parents of children in larger classes (CSR Research Consortium, 1999). However, the report did not indicate statistical significance. The researchers did report that differences in parental involvement between large and reduced-size classes were not statistically significant.
Issues, Differences, and Uncertainties Concerning Benefits
Discussion of the limitations and issues in these studies revolves primarily around research design and achievement test score results. Eric Hanushek of Stanford University, while lauding the general approach of the STAR study and its value as a random-assignment experiment, points out that the results deviated from a considerable amount of prior research and that within the experiment there were a series of problems. The prior studies of whether money affected academic achievement were not reviewed, but the debate is well known (Hanushek, 1979, 1986, & 1997; Hedges, Laine, & Greenwald, 1994; Hedges & Greenwald, 1996). Charles Achilles of the STAR study challenges this literature as being based on nonexperimental data and teacher-pupil ratios, which he argues do not reflect actual class size.
Hanushek's concerns about potential biases in the STAR research focus on school selection; inadequate data on teacher randomization and quality; inadequate checks on randomization of students, especially the lack of prior achievement tests; and student attrition and switching from treatment categories. The school selection issue is based on two concerns. The first, which also applies to SAGE and CSR, is that schools had to volunteer for the program. This requirement introduces potential selection bias in the factors that might be associated with volunteering and nonvolunteering schools.
This problem could affect the within-school randomization if there are systematic characteristics that distinguish volunteering schools from others. No evidence of these differences was introduced, however. Hanushek also notes that, although teachers were to be randomly assigned to the various treatment and control groups, the study contained little information on how this assignment was done or on the critical characteristics of teacher quality. In his 1999 article, he counters a finding by Krueger (1997) that showed no differences in teacher experience, race, or degree level between the groups, noting that those variables are not very highly correlated with teacher quality. No differences that would indicate nonrandom teacher assignment were described, however. Hanushek also lamented the fact that prior achievement was not controlled for in the experiment (thus the estimates were not value-added estimates) and that the lack of prior tests prevented an adequate test of the random assignment of students. He says that he understood why this was difficult for the initial kindergartners, where testing is very difficult, but not on students who later entered in higher grades. STAR's Charles Achilles and others counter that a large-scale randomization experiment should not require a value-added model, because students would be randomly distributed in their initial ability and that, once in the program, students could be tracked by incorporating prior-year tests. Hanushek also notes that overall attrition from the experiment was more than 50 percent. In addition, 10 percent of the students crossed over from one treatment group to another, and another 10 to 12 percent did not take tests in the last two years. He also cites the work of Goldstein and Blatchford (1998) and Krueger (1997) to indicate that the attrition was not random. 
Furthermore, students in both groups who dropped out were not doing as well as students who remained, and those dropping out of regular classes were farther below average than those in small classes. Hanushek speculates that this difference may be due to higher retention in grades among students in regular-size classes. But whatever the reason, the differential should work against the small classes following attrition because a larger percentage of poorly achieving students would have left the regular classes. Students also switched from the control to the treatment group (and to a lesser degree from the small to regular-size group). This movement also could produce bias, but it at least raises questions concerning the precision of the randomization process.
Achilles, however, cites a recent paper by Nye, Hedges, and Konstantopoulus (1999) that attempts to answer both of these problems. They estimated separate achievement models for those actually receiving the treatment to which they were assigned and for those who switched and did not follow through in the assigned category. They argue that this method should understate results for small classes unless the small classes are detrimental to achievement. The achievement advantages compared to students in regular-size classes for these two categories were similar in mathematics, reading, and science in Grades 3, 4, 6, and 8 (Nye et al., 1999). To answer the potential attrition problem, Nye et al. (1999) provide mean third-grade test scores in math, reading, and science for both actual assignment and initially assigned students broken down by those who were present or not present in the eighth-grade follow-up. Although, as with other studies, those who left both small and regular classes were doing considerably worse in Grade 3, the researchers discern no differences between the groups for either actual or assigned-only treatment groups.
They conclude: "As a result, it is implausible that attrition made small classes appear more favorable than if there were no attrition" (Nye et al., 1999, p. 133). There are differences of opinion concerning substantive findings of the STAR study. Probably the most relevant is the question of when the small-size class effect occurs. For example, Hanushek argues that the effect seems to occur in kindergarten and perhaps Grade 1, but there is no effect following those grades. He supports this argument by noting that the differences between small- and regular-size classes appear after kindergarten and improve slightly after Grade 1, but the gaps then remain the same in Grades 2 and 3. He also notes a major difference between annual cohort advantages and the test advantages of small classes for those who remain for four years in the treatment groups. The annual cohort advantages increase each year. However, the advantages for the four-year group appear approximately the same for each succeeding grade (K-3) in reading, but decline considerably in math in Grades 2 and 3 (Hanushek, 1999). He interprets this finding as "consistent with a one-time effect of smaller classes that either erodes or can be made up for over time in regular classes" (Hanushek, 1999, p. 155). This argument is countered by Finn and Achilles (1999), who reported growth in achievement in each year for those students in small classes. A one-year effect also was reported in the SAGE program. Specifically, in the second-year small class, students did not improve on their first-year advantages over students in comparison classes. However, that result might have been caused by late implementation of second-year small classes in many SAGE schools (see below). And this finding might not persist for the last three years of the experiment (Molnar, Smith & Zahorik, 1998). 
Achilles again counters these conclusions for the STAR study by referring to their understanding of the increasing variance over grades of the test measures used and points to a recent study by Krueger that controls for that variance. The Krueger study, however, finds a similar result to Hanushek's for students entering kindergarten and staying in the program for four years. But that effect does not hold up when he analyzes all students by year of entry. Those entering in first and second grade seem to benefit considerably from more years in the program (Krueger, 1999). The Nye, Hedges, and Konstantopoulus (1999) study also reports on estimates of cumulative advantages in Grades 4, 6, and 8 in math, reading, and science for students in small classes in one through four years of prior small classes. The effects are significant in all but Grade 8 for the first year only, and the effects increase with more years of small classes (Nye et al., 1999).
Finally, Hanushek reports on a study he did at the suggestion of Achilles that looked at the distribution of gains in small classes by school. The study was of kindergarten effects comparing all regular, regular with aide, and small classes in 79 schools. He found that smaller classes were superior to both other categories in only 40 schools. Although this result is better than what would be expected with equal probability across the three categories, it suggests that something in addition to small class size might be at work (Hanushek, 1999). This important point will be addressed in the discussion of difference in classroom behavior.
There are also difficulties in the Wisconsin and California programs, and the researchers involved in the SAGE and CSR studies are forthright about the problems in their research situations. By contrast, the STAR study may have fewer inferential and design problems than the other two major studies under way. To begin, Wisconsin's SAGE program is not a random-assignment study.
This situation is partially offset by the ability of the researchers to do comparative value-added models. However, there were also several problems with the comparison schools. First, there were only 17 comparison schools the first year, compared to 30 SAGE schools. The second year, two comparison schools withdrew and one converted to a SAGE school. Although on most student characteristics students appear similar in SAGE and comparison schools, in the second year there are considerably more white students from families ineligible for free lunch in the comparison schools (Molnar et al., 1999). These individual differences can be accounted for in multivariate analyses, but they might indicate differences in school characteristics that are not controlled. In addition, in the second year, with funding unclear, many SAGE schools did not reduce classes for first-graders until late in the fall or, in one case, the beginning of the second semester. This timing might have had an impact on the failure of achievement to increase further in the second year of the program.
As with the STAR study, there was high attrition from the SAGE program: approximately 30 percent during the first two years. As in the STAR program, attrition occurred more among underachieving students in both the small classes and in the comparison groups. Differences on pretests between those who left and those who remained in each group were very close, however, indicating that attrition would have little impact on the comparison between small classes and regular classes (Molnar et al., 1999). Finally, the authors reported ceiling effects on 1996-97 first-grade tests, effects that should bias the small-class size advantages downward. The problem required switching the form of the test being used in 1997-98.
The California study presents even more potential problems. There was no experimental assignment. Study comparisons are between classrooms that were reported with smaller classes and those that were not.
Reports emphasize that poorer, more heavily Hispanic districts were less able to implement small classes in the first two years. Crowded school districts were slow to implement the program, and this meant schools with high proportions of English language learning (ELL) students. Those schools in poor areas also did not have enough extra funds to implement the program, and therefore diverted money from other programs. If those programs affected achievement, a further unmeasured bias is introduced. Additionally, students cannot be tracked over time, and hence value-added achievement measures cannot be used. These conditions create assorted problems and place heavy emphasis on controlling for student, parent, and school differences. And it is unclear how many of these controls can be instituted or appropriately linked to classroom type.
Perhaps the biggest problem may be yet to come. The CSR Research Report estimates that by 2000-01 almost all first- and second-graders will be in small classes, as will 90 percent of third-graders and 95 percent of kindergartners. The upshot is that comparisons in California largely will disappear by next year, and those used in the past might be questionable if the goal is to try to ascertain the pure effects of smaller classes on achievement.
Costs
The costs associated with smaller class sizes can be divided into monetary costs and quality-of-instruction costs. Both are difficult to determine and in the long term require many assumptions. Monetary costs include operating costs and fixed or facility costs. Quality-of-instruction costs include the effects of reduced-size classes on the supply of teachers and/or the substitution or addition of teacher aides and other support personnel. With the exception of the CSR program in California, existing experiments provide little useful data for long-term cost estimates. That is because the STAR and SAGE programs were meant to be small-scale experiments or pilot programs.
The STAR program cost about $12 million at the time; the SAGE program allocated up to $2,000 per pupil for participating schools and cost $4.59 million in the first year and $6.96 million in the second year. The CSR program provides a better indication of costs. The state agreed to pay $650 per pupil for students in classes of 20 or fewer students in the first year and $800 per pupil in subsequent years. The state also gave each participating school a $25,000 facilities grant that increased to $40,000 in 1997-98. Total per-year program costs for 1997-98 were approximately $1.5 billion, and 1.6 million students were in kindergarten through third-grade classes. Combining the facilities grants and cost-per-pupil for a 200-student K-3 school brings the costs to approximately $1,000 per pupil, or $20,000 per 20-student classroom in the California case. It should be noted that these costs were merely set by the California legislature, with little effort to relate them to actual costs of reaching the 20-student targets or needs (some schools already had classes with fewer than 20 students, but they still received state money).
National estimates are problematic and require a number of assumptions just to estimate new personnel costs. Because space inventories do not exist on a national basis, facilities costs cannot be estimated with any reliability. It is worth noting, however, that the CSR study found that space problems were listed as the number one problem by principals in schools that were unable to implement reduced-size classes in the first year (CSR, 1999). Estimating personnel costs requires making assumptions on the class-size limit, which varies from 15 to 20. However, it also depends on how class size is measured and how flexible the measurement would be.
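The California figure of approximately $1,000 per pupil quoted above can be reproduced with a short calculation. This is only a sketch: it assumes the second-year amounts ($800 per pupil and a $40,000 facilities grant) are the ones being combined, which the text does not state explicitly, and the function name is illustrative.

```python
def per_pupil_cost(per_pupil_grant, facilities_grant, enrollment):
    """Per-pupil grant plus the facilities grant spread evenly over all pupils."""
    return per_pupil_grant + facilities_grant / enrollment

# Assumption: second-year (1997-98) CSR amounts for a 200-student K-3 school.
cost_per_pupil = per_pupil_cost(800, 40_000, 200)

# A 20-student classroom at that per-pupil rate.
cost_per_classroom = cost_per_pupil * 20

print(cost_per_pupil, cost_per_classroom)
```

Under the same assumptions, the first-year amounts ($650 and $25,000) would give a lower figure, so the $1,000 estimate appears to reflect the later funding levels.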
An example of a flexible system would be one that relies on teacher-pupil ratios across a school or district, while an inflexible system (such as the California program) would require each classroom in a school to be below a specified size before any classes in the school would qualify. In addition, the grades to which small classes would apply, and whether the program is targeted to low-income students or all students, would also greatly affect estimates. Finally, labor costs and the dynamics of teacher supply will affect costs over time. CSR researchers approach the estimates for operating cost increases by creating a "base policy" set of assumptions and then altering important assumptions to indicate the range of costs with different assumptions. The most important assumptions they make are to apply reduced classes to all students (no targeting) in "grouped" Grades 1-3 on a district level. Thus, a district average across these grades must meet the target classroom size, which varied from 15 to 18 to 20. They assumed that this system was "inflexible," because if the average were one student higher than needed, the district had to add a classroom. However, in comparison to the existing California policy and SAGE and STAR programs, grouping by grades and averaging across the system appears to be highly flexible. They also assumed that all grades would implement the policy at one time. In the results for this base model, the first thing that is apparent is that the target-level class size is important. To reduce class size to an average of 15 students requires more than five times more classes than to reduce them to 20 students. For example, in 1997-98 there were actually about 510,000 classrooms for Grades 1 to 3 in the United States (Brewer et al., 1999). A class size of 15 in 1998-99 would require 226,910 new classrooms, or an increase of 44.5 percent. How sensitive are these results to varying assumptions? That depends on the assumption. 
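The 44.5 percent increase cited for a class size of 15 follows directly from the two classroom counts given above. A minimal check, using only the figures reported from Brewer et al. (1999) and variable names chosen here for illustration:

```python
# Figures as reported in the text for Grades 1 to 3 nationally.
existing_classrooms = 510_000   # approximate count in 1997-98
new_classrooms_at_15 = 226_910  # additional classrooms needed for a class size of 15

increase = new_classrooms_at_15 / existing_classrooms
print(f"{increase:.1%}")  # prints 44.5%
```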
Working with the middle-size class of 18, it appears that the largest effect is created by targeting the program to low-income students. When an eligible school is defined as one in which 50 percent or more of students qualify for free or reduced-price lunch (approximately 180 percent of the poverty line), the annual cost of reducing classes to a grouped average of 18 falls by more than 60 percent, from $5.05 billion to approximately $1.8 billion. In contrast, eliminating the grouping and requiring each class to meet the average increases costs about 10 percent; setting the average on a school rather than a district basis adds approximately 20 percent to the base model costs (Brewer et al., 1999).
If one uses the base model with a class size of 18, the costs per classroom are about $10,000 per year for 1999-2000. That cost is considerably lower than what is being spent in California for a reduction to 20. The California policy is inflexible, however, requiring all K-3 classes in a school to be at or below the limit. This requirement could be much more costly. The other possibility is that California is providing too much support, and schools are gaining overall resources after they reduce classes. At any rate, somewhere between $10,000 and $20,000 per classroom per year might serve as a personnel cost estimate for reasonable programs.
Although the study by Brewer, Krop, Gill, and Reichardt (1999) is clear, carefully thought out, and the best available, the authors indicate a number of limitations and issues. In addition to not being able to estimate fixed or facilities costs, they also do not take into account a number of dynamic aspects of teacher supply. For example, although they build cost-of-living increases into average salaries and benefits, they do not take into account aging of teachers or retirements.
Aging is likely to add to out-year costs in that teachers move up in the salary grid; retirements will likely work in the opposite direction as new teachers at lower pay replace more highly paid retirees. Finally, their estimates do not include two possible savings: the reduced need for teacher aides and possible cost savings of educating students with disabilities. All of these factors might influence ultimate costs, and it seems impossible to determine even an assumed aggregate direction for these factors. Clearly, considerably more research is needed on these effects as states implement their programs. With even crude empirical measures of some of these changes, we will be able to assume parameter estimates and take these factors into account. A final cost is the effect of class-size reduction on teacher quality. Evidence from the CSR program in California clearly indicates short-term problems in providing the 23,500 new teachers in the first two years of the California program. The most direct indicator of this was that the number of uncertified K-3 teachers rose from 1 percent before class-size reduction to 12 percent two years later. In addition, the number of uncertified teachers in schools varied dramatically by the income of the students. In schools in the lowest income quartile, more than 20 percent of the teachers were uncertified by 1997-98 (up from 2 percent in 1995-96) compared to 5 percent in the highest income quartile (up from less than 1 percent). (See CSR, 1999, Figures 6 and 7.) Whether teacher quality will be affected adversely in the long term is unknown. Increased demand could increase teacher salaries and smaller class sizes could make the job of teaching more attractive. Both factors could increase the quality of teachers. As Hanushek and others note, there are other important forces at work. 
For example, teaching, especially in the elementary grades, has traditionally been a woman's profession; as opportunities for women in other professions expand, the quality could be adversely affected. Also, markets for teachers vary dramatically across the country and across districts. That means that market shortages of crisis proportions could exist in one area while other areas would be much less affected by class-size reductions. Thus, as with the benefits of class-size reduction, a number of research issues remain to be addressed before we can achieve an accurate estimate of the potential costs of this important policy intervention.
Pedagogical and Classroom Practices With Reduced Class Sizes
Most educators agree about what the small-size class research has concluded on classroom practices. All three of the central studies reviewed above analyzed pedagogy and classroom behavior using surveys (usually of teachers and aides), classroom observation by researchers in the classroom or by videotape, or both. In all three studies and in several recently published nonexperimental studies, two conclusions seem to emerge. The first is that radical changes in pedagogy do not result from smaller class sizes. Simply stated, teachers continue to do the same thing, but they seem to do it better. Specifically, the substance of lessons (what was taught) seemed to be similar in small and larger classrooms, and the approaches teachers took to teaching, such as large-group discussion, seat work, group exercises, and so forth, did not radically change. However, teachers in small classes reported, and it was observed, that more overall time was spent on instruction. In addition, there were two important shifts in classroom behavior in small as compared to large classes. The first, highlighted in the STAR, SAGE, and CSR studies and reinforced by other studies (Betts & Shkolnik, 1999; Rice, 1999), was that more individualization occurred in small classes.
This was reflected in increased time devoted to individuals as opposed to groups and to working closely with students who were having difficulty. A second and directly related result was that less time was spent on disciplining and other noninstructional activities. Thus it appears clear from a range of studies that class-size reduction beneficially increases time-on-task.
What is not known from the existing studies, but could be the subject of future experiments, is whether within small classes, different overall approaches to teaching and learning might provide superior achievement results. For example, one could compare an accelerated school, readiness to learn, and a Montessori program all using smaller classes. The point is that while teachers' pedagogy seemed to remain the same in smaller as in regular-sized classes in the three most-studied programs, it is not evident that the gains from those approaches maximize the advantages of small classes. Thus future theories, trials, and empirical research are needed to think about and test the differential achievement results of various pedagogical approaches in reduced-size classes.
Class-Size Reduction Compared to Other Reforms
The issue of benefits and costs compared to other reforms and programs is a critical one. However, very little is known in detail about the costs or benefits of other reforms. Areas of study might include, for example, systemic reform efforts with increased use of standards and testing, the benefits and costs of choice programs and charter schools, or intensive staff development interventions. Costs often are not tracked on a program basis and benefits often are difficult to measure or to isolate from other changes occurring in school districts. A more reasonable approach may be simply to calculate and debate the alternative uses of additional resources being devoted to class-size reduction.
For example, if the costs fall between the $10,000 estimate offered by CSR researchers (for reduction to 18) and the $20,000 being spent in California, one could ask whether it would be better to reduce class sizes, increase teacher salaries, add technology, or improve professional development. This would mean estimating costs without associated estimates of achievement for students, but the policy debate would at least have some substance with that approach. That minimalist approach assumes that at least those alternative costs can be accurately estimated.
References
Betts, J. R., & Shkolnik, J. L. (1999). The behavioral effects of variations in class size: The case of math teachers. Educational Evaluation and Policy Analysis, 21(2), 193-214.
Bracey, G. (1995). Research oozes into practice: The case of class size. Phi Delta Kappan, 77, 89.
Brewer, D., Krop, C., Gill, B. P., & Reichardt, R. (1999). Estimating the costs of national class size reductions under different policy alternatives. Educational Evaluation and Policy Analysis, 21(2), 179-92.
CSR Research Consortium. (1999). Class size reduction in California 1996-1998: Early findings signal promise and concerns [Online]. Available: http://www.classize.org.
Finn, J. D., & Achilles, C. M. (1999). Tennessee's class size study: Findings, implications, misconceptions. Educational Evaluation and Policy Analysis, 21, 97-110.
Goldstein, H., & Blatchford, P. (1998). Class size and educational achievement: A review of methodology with particular reference to study design. British Educational Research Journal, 24, 255-268.
Hanushek, E. A. (1979). Conceptual and empirical issues in the estimation of educational production functions. Journal of Human Resources, 14, 351-88.
Hanushek, E. A. (1986). The economics of schooling: Production and efficiency in public schools. Journal of Economic Literature, 24, 1141-71.
Hanushek, E. A. (1997). Assessing the effects of school resources on student performance: An update. Educational Evaluation and Policy Analysis, 19(2), 141-64.
Hanushek, E. A. (1999). Some findings from an independent investigation of the Tennessee STAR experiment and from other investigations of class size effects. Educational Evaluation and Policy Analysis, 21(2), 143-64.
Hanushek, E. A. (forthcoming). The evidence on class size. In S. E. Mayer & P. E. Peterson (eds.). Earning and learning: How schools matter. Washington, DC: The Brookings Institution.
Hedges, L. V., & Greenwald, R. (1996). Have times changed? The relation between school resources and student performance. In G. Burtless (ed.), Does money matter? The effect of student resources on student achievement and adult success (pp. 43-73). Washington, DC: The Brookings Institution.
Hedges, L. V., Laine, R. D., & Greenwald, R. (1994). Does money matter? A meta-analysis of the effects of differential inputs on student outcomes. Educational Researcher, 23(3), 63-85.
Krueger, A. B. (1997). Experimental estimates of educational production functions (NBER Working Paper No. 6051). Cambridge, MA: National Bureau of Economic Research.
Krueger, A. B. (1999). Experimental estimates of educational production functions. Quarterly Journal of Economics, CXIV, 497-532.
Molnar, A., Smith, P., & Zahorik, J. (1998). 1997-98 evaluation results of the student achievement guarantee in education (SAGE) program. [Online]. Available: http:/www.uwm.edu/SOE/centers&projects/sage.
Molnar, A., Smith, P., Zahorik, J., Palmer, A., Halbach, A., & Ehrle, K. (1999). Evaluating the SAGE program: A pilot program in targeted pupil-teacher reduction in Wisconsin. Educational Evaluation and Policy Analysis, 21(2), 165-78.
Nye, B., Hedges, L. V., & Konstantopoulos, S. (1999). The long-term effects of small classes: A five-year follow-up of the Tennessee class size experiment. Educational Evaluation and Policy Analysis, 21(2), 127-42.
Rice, J. K. (1999). The impact of class size on instructional strategies and use of time in high school math and science courses. Educational Evaluation and Policy Analysis, 21(2), 215-30.