
Reviewing the Conclusions of CREDO’s National Charter School Study 2013

Prior to releasing its National Charter School Study 2013, the Center for Research on Education Outcomes (CREDO) produced a series of reports, one national and the others state-based, comparing student achievement in charter schools with that in traditional public schools. CREDO's 2009 national report, Multiple Choice: Charter School Performance in 16 States, is the basis for the claim, often repeated in certain media, policy and education circles, that only one out of five charter schools succeeds.

We at the Center for Education Reform (CER) have questioned that conclusion, both because of the lack of rigor in the study's methodology (pairing charter school students with "virtual twins" in traditional public schools) and because of the subjective judgments required to make such assessments about unobserved student variables. In addition, CREDO's continuing habit of comparing schools and student achievement across state lines ignores the wide variation in state standards, tests, and measurements.

This CER analysis scrutinizes the new CREDO National Charter School Study 2013, identifies the problems with its data, and calls into question its conclusions.

Summary of Analyses

This report is broken down into two separate analyses. First, it updates the 2009 report, which reviewed charter school performance in 16 states yet made generalizations about charters nationwide. Second, it examines learning gains across states, schools and the nation using data from 27 locations (New York City is treated as separate data from New York state). In the 16-state revisit, schools are divided into continuing schools (those included in the 2009 report) and new schools, and overall charter impact in reading and math is analyzed within these states and by school type. In the 16 states, continuing schools made modest progress: about seven more days of learning in reading compared with traditional school students, and a seven-day narrowing of the learning gap in math. CREDO attributes this improved charter school performance to the closure of eight percent of the schools from 2009.

The data from the 27 locations were used to show national trends in charter schools as compared with traditional schools; CREDO found that the average charter student gains an additional eight days of learning in reading and is on par with traditional school counterparts in math. In reading, charter students performed better than traditional students in 16 states, worse in eight, and about the same in three. In math, charter academic growth was stronger in 12 states, weaker in 13, and similar in two.
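To make the "days of learning" figures concrete: CREDO reports effects in standard deviations and translates them into days. The Python sketch below assumes the conversion commonly attributed to the report (roughly 0.25 standard deviations per 180-day school year); the effect sizes used are illustrative, chosen only because they reproduce the seven-to-eight-day figures above.

```python
# Minimal sketch: converting an effect size (in standard deviations) into
# CREDO-style "days of learning". The 0.25-SD-per-180-day-year conversion
# is an assumption here; the effect sizes below are illustrative, not
# taken from the report's tables.

DAYS_PER_SCHOOL_YEAR = 180
SD_PER_SCHOOL_YEAR = 0.25   # assumed: one year of learning ~ 0.25 SD

def effect_size_to_days(effect_sd: float) -> float:
    """Translate an effect size in SD units into equivalent days of learning."""
    return effect_sd * (DAYS_PER_SCHOOL_YEAR / SD_PER_SCHOOL_YEAR)

if __name__ == "__main__":
    for label, effect in [("reading (illustrative)", 0.011),
                          ("math (illustrative)", 0.0)]:
        print(f"{label}: {effect_size_to_days(effect):+.1f} days of learning")
```

Note how small the underlying effects are: an effect of about 0.01 standard deviations becomes "eight days of learning," a framing that can make a statistically marginal difference sound substantial.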

If National Comparisons Are So Easy, Why Do We Need Common Core?

Within this CREDO study it is said that, "not surprisingly, the performance of charter schools was found to vary significantly across states." CREDO acknowledges in fine print that there are wide variations in state tests, yet claims to have determined a way to align them for meaningful comparison. That of course raises the question: if it is so easy to align state tests and results across state lines, why is there a national push for Common Core State Standards and aligned tests? Leaders across the political spectrum recognize that America's school standards are a mixed bag in terms of rigor and requirements. The assessments that measure those standards are completely different from one another and impossible for even the best researchers to standardize. Not only are the rules uneven and the assessments varied, but the cut scores that determine which outcomes count as passing and which as failing are all over the map.
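For readers who want to see what "aligning" state tests typically amounts to: the usual device is a within-state z-score, sketched below in Python with hypothetical scores. Whether CREDO's exact procedure matches this sketch is an assumption on our part; the point stands either way, since a z-score can equate ranks within each test but cannot equate what different tests measure or where states set cut scores.

```python
import pandas as pd

# Minimal sketch of the within-state standardization that cross-state
# comparisons of this kind typically rely on: each raw score becomes a
# z-score relative to its own state/grade/year test. Note what this does
# NOT do: it cannot align what different tests measure, how rigorous the
# standards behind them are, or where each state sets its cut scores.

def standardize_within_state(df: pd.DataFrame) -> pd.DataFrame:
    """Add a z-score column computed separately for each state/grade/year cell."""
    grouped = df.groupby(["state", "grade", "year"])["raw_score"]
    df["z_score"] = (df["raw_score"] - grouped.transform("mean")) / grouped.transform("std")
    return df

# Hypothetical scores on two entirely different state tests: identical
# z-scores, yet nothing guarantees the underlying achievement is comparable.
scores = pd.DataFrame({
    "state": ["A"] * 3 + ["B"] * 3,
    "grade": [5] * 6,
    "year": [2012] * 6,
    "raw_score": [300, 350, 400, 10, 20, 30],
})
print(standardize_within_state(scores))
```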

This great variance in school standards is why the NAEP results, limited as they are to a snapshot in time, are so compelling year after year and so universally accepted by those who understand research, and why CREDO's results are so wanting and, in some quarters, derided. When NAEP measures student performance across state lines, it measures students on an identical instrument, albeit in a small sample and not for the same students each time. When CREDO claims to do the same, it is at best improvising and at worst jumping to erroneous conclusions that are potentially detrimental to students.

In addition to the two national charter reports, CREDO has released 25 state reports using the same methodology, many of which find overall positive results. Such comparisons, while still based on the same questionable methodology, at least compare students on the same state assessments. CREDO argues that many states now have data that permit growth-over-time comparisons. That acknowledgement makes one wonder: why bother with sweeping national generalizations when one can obtain state-by-state results and compare actual, rather than virtual, traditional public school students with charter students in the same state and, in some cases, the same cities? Comparing students within the same location, under the same policy environment and laws, is closer to the gold-standard methodology for which researchers have been advocating. Researchers want to know the effects of the activity or intervention being studied, even if only for a limited population. Inaccurate measurements of that activity do researchers and policymakers no good.

Even if the methodology employed were based on randomized controlled trials (RCTs), one would still want to account for variations in the state laws that dictate the conditions under which charter schools operate. Some states require charters to recruit only at-risk students; most fund charter school students, on average, 30 percent below their traditional public school peers; a very select few have policies in place that allow for objective authorizing and oversight; and all vary greatly in how they evaluate school and student performance.

Student Achievement

CREDO's report argues that it employs growth data for students to create a picture of student achievement gains, or losses, over time. It attributes this ability to better and more consistent data collected by states. However, it's not that simple. For example, some students in the groups are only in their first testing year in a charter school; others have been tested each year over five years in the same school. Growth measures are supposed to be grounded in comparable data for comparable students year after year. If the sample does not track the same students year after year, how can the study conclude that achievement is positive or negative?

Page 24 of CREDO's Supplementary Findings Report illustrates the conundrum of analyzing groups of data, rather than individual students tracked consistently, over varying periods of time. For example, CREDO acknowledges that its results include students who have spent only one or two years in charter schools, "not allowing much time for their cumulative impact to be seen." Much more in the report is of concern, and anyone using it to draw conclusions would be wise to read the fine print first.
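The sampling concern can be made concrete with a toy example. The hypothetical Python sketch below contrasts "growth" computed from whoever happens to be tested each year with growth computed for the same students followed over time; a single new entrant is enough to pull the two apart.

```python
import pandas as pd

# Hypothetical illustration of the panel problem this section raises:
# year-over-year "growth" computed from whoever was tested each year can
# diverge from growth computed for the same students followed over time.

records = pd.DataFrame([
    # (student, year, score) -- student s3 appears only in year 2
    ("s1", 1, 50.0), ("s1", 2, 55.0),
    ("s2", 1, 60.0), ("s2", 2, 64.0),
    ("s3", 2, 90.0),                    # new, high-scoring entrant
])
records.columns = ["student", "year", "score"]

# Naive "growth": difference in the average score of everyone tested.
naive = records[records.year == 2].score.mean() - records[records.year == 1].score.mean()

# Panel growth: average gain among students observed in BOTH years.
both = records.pivot(index="student", columns="year", values="score").dropna()
panel = (both[2] - both[1]).mean()

print(f"naive growth: {naive:+.1f}")   # +14.7, inflated by the entrant
print(f"panel growth: {panel:+.1f}")   # +4.5, the same-student gain
```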

Methodology

CER has argued, echoing highly respected researchers, that the only studies valid for understanding and comparing charter school achievement are "gold standard" randomized controlled studies such as those done by Stanford economist Dr. Caroline Hoxby and the University of Arkansas' Dr. Patrick Wolf, to name just two among at least a dozen more. Such studies compare two randomly separated pools of applicants: students who won an admissions lottery and attend the school of choice, and students who entered the same lottery but did not win a seat. Hoxby has done such studies for charter schools, and Wolf has conducted them for voucher programs.
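For illustration, here is a minimal sketch of the lottery-based design described above, using entirely hypothetical data and an assumed charter effect; real studies of this kind work from actual lottery records and test scores.

```python
import random
import statistics

# Minimal sketch of the lottery-based ("gold standard") design: applicants
# are randomly split by the admissions lottery itself, and outcomes are
# compared between winners and losers. All data here are hypothetical.

random.seed(0)
applicants = [{"id": i, "baseline": random.gauss(0, 1)} for i in range(200)]

# The lottery is the randomizer: winners get charter seats, losers do not.
random.shuffle(applicants)
winners, losers = applicants[:100], applicants[100:]

def outcome(student, attended_charter):
    # Hypothetical outcome model with an assumed +0.1 SD charter effect.
    effect = 0.1 if attended_charter else 0.0
    return student["baseline"] + effect + random.gauss(0, 0.5)

win_scores = [outcome(s, True) for s in winners]
lose_scores = [outcome(s, False) for s in losers]

# Because assignment was random, this simple difference in means is an
# unbiased estimate of the charter effect for these applicants.
print(statistics.mean(win_scores) - statistics.mean(lose_scores))
```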

The CREDO study employs a completely different method of assessing student achievement, which is described in detail in the report. Because of the Center for Education Reform's ongoing critique of its methodology, CREDO addresses the issue of randomized controlled "gold standard" studies and argues that RCTs are not valid for broad charter school studies.

The 2013 CREDO Study takes CER's previous critiques into account in a side-by-side rebuttal, stating, "The lottery must be random. This is often not true in charter schools, as many schools permit preferences to siblings of current students, children of school founders or staff, or residential preferences for students who live near the school." Once again, we take issue with this statement.

CER has responded with a point-by-point counter-response, which you can find here.

The bottom line is that RCT "gold standard" research is the prominent, accepted research practice, and CREDO rejects it. In fact, some researchers have suggested that rejecting it is a way to draw easier, not better, conclusions.

RCTs easily handle students who are exempt from lotteries, like siblings, by excluding those students from the analysis. This is common research practice and in no way threatens the internal validity of the research. Randomization can be tested and is tested by all ‘gold standard’ student performance analyses.
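To illustrate what "testing randomization" means in practice, the sketch below runs a simple baseline balance check on hypothetical data: if the lottery was truly random, winners and losers should look statistically alike on anything measured before the lottery.

```python
import random
from scipy import stats

# Sketch of a baseline balance check: with a truly random lottery, winners
# and losers should be statistically indistinguishable on characteristics
# measured BEFORE the lottery. Data here are hypothetical.

random.seed(1)
winners_baseline = [random.gauss(0, 1) for _ in range(100)]
losers_baseline = [random.gauss(0, 1) for _ in range(100)]

t_stat, p_value = stats.ttest_ind(winners_baseline, losers_baseline)
print(f"t = {t_stat:.2f}, p = {p_value:.2f}")
# A large p-value is consistent with successful randomization; a small one
# would flag imbalance. Real studies run this across many covariates
# (prior scores, demographics), after excluding lottery-exempt siblings.
```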

The CREDO study's authors have conceded that it is easier to "generalize" about a charter school by creating so-called virtual twins, while acknowledging that head-to-head studies (referred to as "lottery studies") are superior to their approach. According to respected researcher Dr. Caroline Hoxby of Stanford, Harvard, and the National Bureau of Economic Research, "the CREDO study does not have data on charter schools' admissions lotteries, so it does not use a randomization-based method of evaluation. Randomization is the 'gold standard' method of evaluating charter schools' effects on student achievement because it effectively eliminates all forms of selection bias so long as (i) randomized admissions lotteries were used and (ii) a sufficient number of students participated in them."

Matched vs. Unmatched Students

CREDO acknowledges that there are problems with finding matches for all students, and that in some cases students whose exclusion could significantly affect the achievement results are excluded altogether. Whereas this report finds matches for 85 percent of charter students, the last report found only 75-80 percent. However, as Dr. Hoxby argues, "lacking lottery data, the CREDO study depends on a matching method based on charter school students' prior histories in the traditional public schools. But it does not match each charter school student to individual traditional public school students with similar demographic characteristics. Rather, it matches each charter school student to a group of students in traditional public schools. A charter school student can potentially be matched to a group that contains many students. The study then computes average achievement and other average characteristics of each group. Thereafter the study treats these group averages as though they were students." There are numerous other problems with this approach, which experts such as Hoxby have enumerated and which CREDO itself addresses in this newest report:

“Although the VCR method used in this report provides matches for 85% of the charter students in our data set, it is important to identify ways in which unmatched students may differ from those included in the analysis. The ability to extrapolate findings from a particular sample to the broader population is referred to as external validity (discussed above). In the case of this analysis, CREDO’s sample encompasses a large proportion of the entire population of charter students across the country, but as can be seen below, unmatched charter students do differ from their matched counterparts.

“We see that the test scores of matched charter students are significantly higher than for unmatched students in both math and reading in the year in which they were matched (period 1). This is because charter students at the very low and high end of the test score distribution have more trouble finding matches in TPS. The fact that our data represent over 90% of all charter students in the country makes us confident that estimates are highly aligned with actual population values, although we are uncertain to what extent our results apply to students without matches.”
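A minimal sketch of the "virtual twin" construction as Hoxby describes it appears below; the matching keys and tolerance are our own illustrative assumptions, not CREDO's actual specification. Note how the high-scoring TPS student in the toy data never enters any match, echoing CREDO's own admission that students at the tails of the score distribution have trouble finding matches.

```python
import pandas as pd

# Minimal sketch of the "virtual twin" construction as described in the
# Hoxby quote above: each charter student is matched to a GROUP of
# traditional public school (TPS) students with similar characteristics,
# and the group's average is then treated as if it were a single student.
# Matching keys and the tolerance here are illustrative assumptions.

def virtual_twin(charter_student: dict, tps: pd.DataFrame,
                 score_tolerance: float = 5.0) -> float:
    """Average the growth of all TPS students matching on demographics
    and (approximately) on prior score; this average is the 'twin'."""
    pool = tps[
        (tps["grade"] == charter_student["grade"])
        & (tps["frl"] == charter_student["frl"])       # free/reduced lunch
        & ((tps["prior_score"] - charter_student["prior_score"]).abs()
           <= score_tolerance)
    ]
    return pool["growth"].mean()  # group average stands in for one student

tps = pd.DataFrame({
    "grade": [5, 5, 5, 5],
    "frl": [True, True, True, False],
    "prior_score": [48.0, 52.0, 90.0, 50.0],  # the 90.0 student never matches
    "growth": [3.0, 5.0, 9.0, 4.0],
})
student = {"grade": 5, "frl": True, "prior_score": 50.0}
print(virtual_twin(student, tps))  # averages the two matching TPS students
```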

Policy Prescriptions vs. Data

As weak as CREDO's research is, its policy prescriptions are even more troubling. For instance, CREDO's plan to address what it concludes is uneven student achievement in charters betrays a lack of experience with how state policies are written and how they affect actual schools and students. CREDO concludes that the closure of eight percent of charter schools in the 16 states originally studied in 2009 may be a strong factor in why those states have improved. However, this latest report also concludes that new schools alone are not responsible for the improved quality.

In reality, inadequate charter schools tend to close long before they can be shown to be academically deficient. As CER has documented in years of study, operational and financial deficiencies are the first and earliest signs that a school may not be equipped to educate children.

Conclusion

At the Center for Education Reform, we follow a simple premise: all schools, including charter schools, must be held accountable. The path to accountability for charter schools must start with strong laws that provide multiple, independent charter school authorizers and tools to hold charters to the highest academic and operational standards.

State-by-state and community-by-community analyses are the only measures to date valid enough for parents, policymakers and the media to rely on in making smart decisions about educational choices and outcomes for students.

Given the voluminous nature of the data, continued analysis of this report is ongoing, and further findings are forthcoming.