Chronicles of (the conceptually incoherent & empirically invalid world of) VergarNYa

As with the Vergara case in California, a central claim of the New York City Parents Union is that the presence of statutory tenure protections in New York State leads to a persistent and systematic deprivation of a sound basic education which falls disproportionately on the state’s low income and minority children.

Let’s review again the basic structure of this argument. The argument challenges state statutes that impose restrictions on district contractual agreements pertaining to procedures for evaluation and dismissal of teachers once they achieve “tenure” or continuing contract status.

The argument goes – within districts, minority and low income children are disproportionately assigned the “least effective” teachers.

Within districts, minority and low income children are disproportionately affected by assignment of the least qualified teachers, including novice teachers and those not classified as “highly qualified.”

And this occurs because of statutory definitions of and job protections pertaining to “tenure.”

Now, to the extent that substantive disparities of the types mentioned above exist, the next trick is to show some connection to the laws in question.

These laws are presumed to affect all districts operating under them in similar fashion.

If these laws are unchanged over time, it is presumed that districts have little room to effect positive change in the distribution of teacher attributes when operating under these laws.

If a similar or greater share of variation in teacher attributes actually exists across districts (across separate teacher contracts) than within them, then it is likely that other factors are driving the disparate assignment of teachers, including the sorting of teachers across the labor market as they apply for jobs, in response to variations in working conditions and compensation.

Unless of course, we are arguing this case in the offbeat world of VergarNYa.
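Before getting to the data, here is a minimal sketch (in Python, with pandas) of the kind of variance decomposition this argument turns on. The DataFrame and column names here are hypothetical stand-ins, not the actual files used below:

```python
import pandas as pd

def variance_shares(df: pd.DataFrame, attr: str, group: str = "district") -> dict:
    """One-way decomposition: how much of the variance in a teacher
    attribute lies between districts versus within them?"""
    grand_mean = df[attr].mean()
    group_means = df.groupby(group)[attr].transform("mean")
    between = ((group_means - grand_mean) ** 2).mean()  # variance of district means
    within = ((df[attr] - group_means) ** 2).mean()     # residual, within-district variance
    total = between + within
    return {"between_share": between / total, "within_share": within / total}

# If between_share rivals or exceeds within_share, then cross-district labor
# market sorting, not within-district assignment under a single contract,
# is the more plausible driver of disparities.
```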

Let’s take a look at the actual data on NYC and NY State (NYC labor market) teachers to see just how badly the actual data might undermine plaintiffs’ arguments before their case even gets off the ground.

For the following analyses, I’ve mined three sources of data which I have available at my fingertips because of previous projects:

  1. NYC Value Added estimates publicly posted in 2012.
  2. New York State Personnel Master File (teacher credentials and compensation).
  3. New York State School Report Cards (school demographics).

Do Low Income and Minority Children within NYC have the Least Effective Teachers using the City’s Own VAM Estimates?

I’ve explained previously the problems with using “effect” ratings themselves in determining equitable distributions, since we can’t always tell whether the distributions of teacher “effects” are inequitable, or simply biased. That is, do we appear to have more “bad” teachers in high poverty schools, or are teachers getting bad ratings in part because they work in high poverty settings?

I’ve also explained previously that while the New York State (NYSED) growth percentile scores tend to be significantly biased by poverty and other demographic characteristics, New York City’s more refined Value Added Model produces significantly less bias. You might say – ah… that’s good… and in some ways it is. But it certainly doesn’t help the VergarNYa arguments, does it?

For example, Figure 1 below compares the demographic characteristics of the NYC schools of teachers in the upper half of the Value Added percentile distribution with those of teachers in the bottom third – among teachers remaining in the upper half or bottom third for three consecutive years:

Figure 1.


As it turns out, the percent black or Hispanic, and the percent free or reduced-price lunch, is actually higher, on average, in the schools of the teachers in the upper half.

Ah… but you say, what about the really really really bad bottom 5% of teachers? What are the demographics of their schools? Well, again, comparing the bottom 5% to all others, the bottom 5% are in schools with a) lower shares of low income children and b) lower shares of disadvantaged minorities.

Figure 2.


 

Are the Odds Greater that Low Income or Minority Children within NYC have Less Qualified, or Novice Teachers?

Well, then, since our indicators of teacher “effect” aren’t sharply disparately distributed (unless we use the state’s biased measures), what about more traditional attributes of teachers – like concentrations of novice teachers, or state policy designations like “highly qualified”?

The next figure is based on logistic regression models evaluating the relative odds that a teacher in a school with X% versus X+1% low income or disadvantaged minority students is novice, or is highly qualified. The models again focus on New York City schools (a minimal sketch of this type of model appears after the list below). Figure 3 shows us that:

a)      There is little if any shift in the likelihood that a teacher is novice when % free or reduced-price lunch is higher.

b)      There is a slight uptick (small but statistically significant, in part because we have such a large data set) in the likelihood that a teacher is novice as % black and Hispanic population increases, coupled with a slight decrease in the likelihood that a teacher is highly qualified.

  1. The uptick amounts to a <1% increase in likelihood that a teacher is novice for each 1% increase in % black and Hispanic population;
  2. The decrease amounts to about half of one percent in the likelihood that a teacher is highly qualified for each 1% increase in % black and Hispanic population.
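For the mechanically inclined, here is a minimal sketch of this type of model in Python with statsmodels. The data are simulated and the column names (novice, pct_frl, pct_black_hisp) are hypothetical stand-ins for the actual NYC personnel variables; the seeded coefficient merely mimics the “slight uptick” described above:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 5000
teachers = pd.DataFrame({
    "pct_frl": rng.uniform(0, 100, n),         # % free or reduced-price lunch
    "pct_black_hisp": rng.uniform(0, 100, n),  # % black or Hispanic
})
# Seed a weak positive link between minority share and novice status
# (values hypothetical, chosen to mimic the small effect described above).
logit_p = -1.5 + 0.005 * teachers["pct_black_hisp"]
teachers["novice"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

fit = smf.logit("novice ~ pct_frl + pct_black_hisp", data=teachers).fit(disp=0)
print(np.exp(fit.params))  # odds ratios per 1-point change in each share
```

Exponentiated coefficients barely above 1.0 correspond to the small shifts in likelihood reported in items 1 and 2 above.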

Figure 3.


So, we do have some disparity here. That said, it’s still a heavy lift to suggest that a) this disparity rises to the level of substantial constitutional deprivation, and an even heavier lift to suggest that b) state teacher tenure laws have anything to do with the presence of this disparity (that is, with the apparent inability of the district to reshuffle teachers to negate this relationship between novice teacher concentration and student minority concentration).

The next figure evaluates total teaching experience using a regression model to parse the relationship between school demographics and average total years teaching. And this figure shows that for each additional 1% low income population, teacher experience… well… doesn’t really change. For each 1% additional disadvantaged minority population, teacher experience declines by 5% of one year.

Figure 4.


Again, we have some disparity with respect to minority concentration… but again, it’s one heck of a stretch to assume causation between state teacher tenure laws and differences of 5% of one year in average experience of teachers associated with each 1% change in percent minority student concentration.
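For concreteness, the linear analogue of the experience model might look like the following, continuing the simulated teachers frame (and rng) from the sketch above; the seeded slope is hypothetical, chosen only to echo the reported relationship:

```python
import statsmodels.formula.api as smf

# 'experience' (in years) is seeded to decline ~0.05 years (5% of one year)
# per 1% increase in minority share, per the relationship described above.
teachers["experience"] = (
    12 - 0.05 * teachers["pct_black_hisp"] + rng.normal(0, 4, len(teachers))
)
fit = smf.ols("experience ~ pct_frl + pct_black_hisp", data=teachers).fit()
print(fit.params)  # expect roughly -0.05 on pct_black_hisp and ~0 on pct_frl
```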

Is this really a within district/within contract problem?

As I’ve pointed out in my previous two posts, the presumption that a major cause of teacher quality disparity affecting low income and minority children is state statutory protection of due process in dismissal cases relies on substantial disparity in teacher attributes across schools within districts, as opposed to across districts. The idea, as expressed by the various local administrators in California who took the stand at trial, is that their hands are tied: because of the shackles of tenure and seniority protections, they have no choice but to keep bad teachers in low income and minority schools, and of course, to keep good teachers in their less low income, less minority schools. It’s not their fault. It’s the law [a baffling admission indeed…].

But it’s quite possible that, in fact, the major cause of disparity in teacher attributes disparately affecting low income and minority children lies in the ways teachers sort on the labor market – across districts, and thus across contracts – and not within districts, with state law leveraged as defense and advantage.

How then, do student population characteristics compare for novice teachers and highly qualified teachers across versus within New York State districts? Figure 5 compares the demographics of schools of Novice teachers within NYC and within the NYC metro area, across districts.

Figure 5.


Figure 5 shows us that the disparities in populations are much greater across districts than within NYC.

  1. The % black or Hispanic population is about 15% higher in the districts of novice teachers than those who are not novice.
  2. The % low income is nearly 10% higher in the districts of novice teachers than those who are not novice.
  3. By contrast, within NYC, the percent black or Hispanic is about 7% higher (half the between-district disparity) in schools of novice versus non-novice teachers, and the percent low income is between 1 and 2% greater in schools of novice teachers.

Do districts really have no ability to leverage change?

Finally, as I explained in my previous post, there already exists a substantial body of literature which severely undermines the assertion that local public districts in New York simply have no way to resolve teaching inequality across student populations. Most specifically, this one piece by Boyd and colleagues documents that New York City in particular made significant strides in the early 2000s in narrowing what had been far more substantive gaps in teacher attributes.

The gap between the qualifications of New York City teachers in high-poverty schools and low-poverty schools has narrowed substantially since 2000. For example, in 2000, teachers in the highest-poverty decile of schools had math SAT scores that on average were 43 points lower than their counterparts in the lowest-poverty decile of schools. By 2005 this gap had narrowed to 23 points. The same general pattern held for other teacher qualifications such as the failure rate on the Liberal Arts and Sciences (LAST) teacher certification exam, the percentage of teachers who attended a “least competitive” undergraduate college, and verbal SAT scores. Most of the gap-narrowing resulted from changes in the characteristics of newly hired teachers, rather than from differences in quit and transfer rates between high and low-poverty schools.

Boyd, D., Lankford, H., Loeb, S., Rockoff, J., & Wyckoff, J. (2008). The narrowing gap in New York City teacher qualifications and its implications for student achievement in high‐poverty schools. Journal of Policy Analysis and Management, 27(4), 793-818. http://cepa.stanford.edu/sites/default/files/Narrowing.pdf

 

And so it goes…. In the land of VergarNYa… a world where logical fallacy rules the day and where empirical evidence simply doesn’t matter…

 

 

The VergarGuments are Coming to New York State!

And so it goes… The VergarGuments keep-a-comin… spreading their way from California to the Empire State, from Albany to Buffalo. And what are VergarGuments you say?

Well, a VergarGument is a fallacious form of legal reasoning applied in the context of state constitutional litigation over causes of inequities and inadequacies of schooling selectively suffered by disadvantaged children. Yeah… that’s a mouthful, but it is worthy of its own newly minted, excessively precise definition.

The VergarGument arises from the recent Vergara case in California, where a sufficiently gullible (or politically predisposed – you be the judge) judge accepted whole hog the assertion that state laws governing the assignment and dismissal of teachers under district contractual agreements caused that state’s most disadvantaged children to be disproportionately subjected to “grossly ineffective” teachers.

You see, “causation” is a pretty important part of such legal challenges. And here, the burden on those bringing the case against the statutes in question was to show (reasonably/sufficiently display a connection… not “prove” beyond any reasonable doubt… and also not quite the same as statistical causation) that those statutes are responsible for the selective mistreatment of those bringing the case to court (deprivation of their state constitutionally guaranteed rights). As I explained in my previous post, the causation assertion is suspect on simple logical grounds, with little need to get into the weeds of the statistical analyses.

The assertion that state policy restrictions on local contractual agreements are a primary (or even a significant) cause of teaching inequity is problematic at many levels.

First, variation in access to teacher quality across schools within districts varies… across districts. Some districts (in California or elsewhere) achieve reasonably equitable distributions of teachers while others do not. If state laws were the cause, these effects would be more uniform across districts – since they all have to deal with the same state statutory constraints (perhaps those district leaders testifying at trial in Vergara and bemoaning the inequities within their own districts were, in fact, revealing their own incompetence, rather than the supposed shackles of state laws?).

Second, teacher quality measures and attributes tend to vary far more across than within districts, making it really hard to assert that district contractual constraints (which constrain within, not cross-district sorting) imposed by state law have any connection to the largest share of teacher quality inequity.

But hey, let’s take a closer look at the evidence that already exists on these and related points in New York State, home to the newest rounds of VergarGuments, where a group calling itself the NYC Parents Union has filed suit claiming that New York State’s tenure laws cause poor minority children to be deprived of a Sound Basic Education (constitutional requirement upheld in previous litigation over funding disparities that have yet to be resolved).

We know from a long line of research, much of which can be found here, that teacher quality & qualification distributions vary roughly as much across New York State school districts as they do across schools within the largest district(s). Specifically, in one of the first major published studies of the new era of teacher quality research, Lankford, Loeb and Wyckoff (2002)[1] found:

  • Teachers are systematically sorted across schools and districts such that some schools employ substantially more qualified teachers than others do.
  • Differences in the qualifications of teachers in New York State occur primarily between schools within districts and between districts within regions, not across regions.
  • The exception to the result that there is little difference in average teacher characteristics across regions is for the New York City region, which on average employs substantially less qualified teachers.
  • Nonwhite, poor, and low performing students, particularly those in urban areas, attend schools with less qualified teachers.

http://cepa.stanford.edu/sites/default/files/TeacherSorting.pdf

Again, to support the idea that state restrictions on local contracts are the primary, or even a significant cause of teacher quality disparity to the point of deprivation of the constitutional right to sound basic education, one would expect to find that within districts, under a common contract and tenure protections, children in high need schools suffer disproportionately, and that districts have limited if any control (due to state law constraints) over that disproportionate suffering.

A second bit of empirical evidence that severely undermines the claim that state legal restrictions prevent districts from improving the distribution of teaching quality is the finding, a few years later, by an overlapping group of researchers, that New York City had taken steps to substantially mitigate differences in teacher qualifications across schools, in part with the help of additional state policy restrictions.[2]

The gap between the qualifications of New York City teachers in high-poverty schools and low-poverty schools has narrowed substantially since 2000. For example, in 2000, teachers in the highest-poverty decile of schools had math SAT scores that on average were 43 points lower than their counterparts in the lowest-poverty decile of schools. By 2005 this gap had narrowed to 23 points. The same general pattern held for other teacher qualifications such as the failure rate on the Liberal Arts and Sciences (LAST) teacher certification exam, the percentage of teachers who attended a “least competitive” undergraduate college, and verbal SAT scores. Most of the gap-narrowing resulted from changes in the characteristics of newly hired teachers, rather than from differences in quit and transfer rates between high and low-poverty schools.

That last part is critical here, since the central VergarGument for why state imposed legal protections harm low income and minority children is that they force districts to retain and place the worst teachers with the neediest students. Recruitment and retention are completely overlooked in the VergarGument, which instead places the entire emphasis on dismissal, and on legal restrictions on dismissal.

But there’s another piece to this puzzle as well, also documented in research done on New York State. And that is that variations in compensation matter – and may determine whether a district even has the capacity to retain the teachers it would need to, say, provide a sound basic education to low income and minority children. Ondrich, Pas and Yinger (2008) explain:

We find that teachers in districts with higher salaries relative to nonteaching salaries in the same county are less likely to leave teaching and that a teacher is less likely to change districts when he or she teaches in a district near the top of the teacher salary distribution in that county.[3]

And it is abundantly clear that New York State school districts – especially those serving the state’s neediest children – lack the ability to pay the necessary wages to recruit and retain the workforce they need.

VergarGuments are an absurd smokescreen, failing to pass muster at even the most basic level of logical evaluation of causation – that A (the state laws in question) can somehow logically (much less statistically) be associated with selective deprivation of children’s constitutional rights.

Are children in New York State being deprived of their right to a sound basic education?

Absolutely.

Yes.

Most certainly.

Are VergarGuments the most logical path toward righting those wrongs? Uh… no.

[1] Lankford, H., Loeb, S., & Wyckoff, J. (2002). Teacher sorting and the plight of urban schools: A descriptive analysis. Educational Evaluation and Policy Analysis, 24(1), 37-62.

[2] Boyd, D., Lankford, H., Loeb, S., Rockoff, J., & Wyckoff, J. (2008). The narrowing gap in New York City teacher qualifications and its implications for student achievement in high-poverty schools. Journal of Policy Analysis and Management, 27(4), 793-818. http://cepa.stanford.edu/sites/default/files/Narrowing.pdf

[3] Ondrich, J., Pas, E., & Yinger, J. (2008). The determinants of teacher attrition in upstate New York. Public Finance Review, 36(1), 112-144. http://www-cpr.maxwell.syr.edu/efap/Papers_reports/The_Determinants_of_Teacher_Attrition.pdf

 

On “Access to Teacher Quality” as the New Equity Concern

A short while back, the Center for American Progress posted their takeaway from the Vergara decision. That takeaway was that equity of teacher quality distribution is the new major concern, or as they framed it, Access to Effective Teaching. Certainly, the distribution of teaching quality is important. But let me set the record straight on a few major issues I have with this claim.

First, this is not new. It is relatively standard in the context of state constitutional litigation over equity and adequacy of educational resources to focus on the distributions of programs and services, as well as student outcomes, AND TEACHER ATTRIBUTES!

I (and many others) have regularly addressed these issues in reports and on the witness stand for years. It is important to understand that school finance equity litigation as it is often identified, actually tends these days to focus more broadly on equity and adequacy of educational programs and services, including teacher characteristics, and their relation to inequities and inadequacies of funding.

Second, modern measures of effective teaching, as I have explained in a previous post, are very problematic for evaluating “equity,” because these measures of teacher effectiveness have a tendency to be associated with demographic context and, for that matter, with access to resources. To review, as I’ve explained numerous previous times, growth percentile and value added measures contain 3 basic types of variation:

  1. Variation that might actually be linked to practices of the teacher in the classroom;
  2. Variation that is caused by other factors not fully accounted for among the students, classroom setting, school and beyond;
  3. Variation that is, well, complete freakin statistical noise (in many cases generated by the persistent rescaling and stretching, cutting and compressing, then stretching again of test scores over time, which may be built on underlying shifts of 1 to 3 additional items answered right or wrong by 9 year olds filling in bubbles with #2 pencils).

Our interest is in #1 above, but to the extent that there is predictable variation, which combines #1 and #2, we are generally unable to determine what share of the variation is #1 and what share is #2. A really important point here is that many if not most models I’ve seen actually adopted by states for evaluating teachers do a particularly poor job of parsing 1 & 2. This is partly due to the prevalence of growth percentile measures in state policy.
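As a toy illustration of why this matters, consider simulating a rating built from the three components above (all magnitudes hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
true_effect = rng.normal(0, 1.0, n)  # (1) variation from classroom practice
context = rng.normal(0, 0.8, n)      # (2) unadjusted student/school factors
noise = rng.normal(0, 1.2, n)        # (3) measurement and rescaling noise
rating = true_effect + context + noise

# The observed rating mixes all three; its "predictable" part blends (1)
# and (2), and nothing in the rating itself says which share is which.
for name, part in [("practice", true_effect), ("context", context), ("noise", noise)]:
    print(name, round(np.var(part) / np.var(rating), 2))
```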

This issue becomes particularly thorny when we try to make assertions about the equitable distribution of teaching quality. Yes, as per the figure above, teachers do sort across schools, and we have much reason to believe that they sort inequitably with respect to student population characteristics. The problem is that those same student population characteristics in many cases also strongly influence teacher ratings.

As such, those teacher ratings themselves aren’t very useful for evaluating the equitable distribution of teaching. In fact, in most cases it’s a pretty darn useless exercise, ESPECIALLY with the measures commonly adopted across states to characterize teacher quality. Being able to determine the inequity of teacher quality sorting requires that we can separate #1 and #2 above – that is, that we know the extent to which the uneven distribution of students affected the teacher rating versus the extent to which teachers with higher ratings sorted into more advantaged school settings.

Third and finally, claims of identifying some big new equity concern seem almost always intended to divert attention from the substantive persistent inequities of state school finance systems (like this). That is, the intent seems far too often to assert that equity can be fixed without any attention to funding inequity. That in fact, the inequity of teacher quality distribution is somehow exclusively a function of state statutory job protections for teachers and/or corrupt adult-self-interested district management and teachers union arrangements.

The assertion that state policy restrictions (and no other possible major cause?) on local contractual agreements are the primary (or even a significant) cause of teaching inequity is problematic at many levels.

First, variation in access to teacher quality across schools within districts varies… across districts. Some districts (in California or elsewhere) achieve reasonably equitable distributions of teachers while others do not. If state laws were the cause, these effects would be more uniform across districts – since they all have to deal with the same state statutory constraints (perhaps those district leaders testifying at trial in Vergara and bemoaning the inequities within their own districts were, in fact, revealing their own incompetence, rather than the supposed shackles of state laws?).*

Second and most importantly, teacher quality measures and attributes tend to vary far more across than within districts, making it really hard to assert that district contractual constraints (which constrain within, not cross-district sorting) imposed by state law have any connection to the largest share of teacher quality inequity.

Setting aside the ludicrous logical fallacies on which the Vergara ruling rests, let’s take a more reasonable look at the distribution of teacher attributes with respect to resources and contexts in five states – based on prior reports I have prepared on behalf of plaintiff school children and the districts they attend, and drawn from academic papers (New York & Illinois).

Evaluating Disparities in Teacher Attributes

Ample research suggests that teacher quality is an important determinant of student achievement.[1] Although not the only policy instrument available, one way districts can try to attract higher quality teachers is by increasing salaries. Teacher salaries, however, are dependent on the availability of state and local revenues. Moreover, district working conditions play a significant role in influencing the job choices of teachers. All else equal, teachers tend to avoid or exit schools with higher concentrations of children in poverty and higher concentrations of minority – specifically black – children. Some researchers have attempted to estimate the extent of salary differentials needed to offset the problem of teachers transferring from predominantly black schools. For example, Hanushek, Kain, and Rivkin (2004) note: “A school with 10% more black students would require about 10% higher salaries in order to neutralize the increased probability of leaving.”[2] Thus, to attract teachers of equal quality, high need districts, and particularly the severe disparity districts, would likely need to pay higher salaries than low need districts. The analyses presented here show that this is not the case.

A substantial body of literature has found that concentrations of novice teachers (i.e. teachers with less than 3 or 4 years of experience) can have significant negative effects on student outcomes.[3] Rivkin, Hanushek, and Kain (2005) find that teacher experience is important in the first two years of a teaching career (but not thereafter).[4] Hanushek and Rivkin note that: “we find that identifiable school factors – the rate of student turnover, the proportion of teachers with little or no experience, and student racial composition – explain much of the growth in the achievement gap between grades 3 and 8 in Texas schools.”[5] Notably, evidence from a variety of state and local contexts provides a consistent picture that higher concentrations of novice teachers are associated with negative effects on student outcomes.

Framework for Identifying “Disadvantaged Districts”

Figure 1 provides a conceptual framing of the distribution of local public school districts in terms of resource allocation and re-allocation pressures. Along the horizontal axis are “cost-adjusted” expenditures per pupil and along the vertical axis are actual measured outcomes, with both measures standardized around statewide means. Per pupil expenditures are “adjusted” for the cost of achieving specific (state average district) outcomes, where factors that influence cost include district structural characteristics, geographic location (labor costs) and various student need factors. One would assume that if expenditure measures are appropriately adjusted for costs, districts would cluster around the diagonal line of expected values – where districts with more resources on average have higher outcomes. To the extent that this relationship holds with real data on real districts, one can then explore differences in resource allocation between districts falling in different regions (or quadrants) of Figure 1.
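A minimal sketch of this classification step, assuming a district-level DataFrame with hypothetical columns adj_spending (cost-adjusted dollars per pupil) and outcome (a standardized outcome measure):

```python
import numpy as np
import pandas as pd

def classify_quadrants(df: pd.DataFrame) -> pd.Series:
    """Standardize cost-adjusted spending and outcomes around statewide
    means, then label each district by the quadrant it falls in."""
    spend_z = (df["adj_spending"] - df["adj_spending"].mean()) / df["adj_spending"].std()
    out_z = (df["outcome"] - df["outcome"].mean()) / df["outcome"].std()
    labels = np.select(
        [(spend_z >= 0) & (out_z >= 0),
         (spend_z < 0) & (out_z < 0),
         (spend_z >= 0) & (out_z < 0)],
        ["high spending / high outcome",
         "low spending / low outcome",
         "high spending / low outcome"],
        default="low spending / high outcome",
    )
    return pd.Series(labels, index=df.index)
```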

 Figure 1. Hypothetical Distribution of School Districts


 

 

EVIDENCE FROM CONNECTICUT

In this section, we explore the resource and resource allocation differences across districts that fall into quadrants 2 and 4. We also examine the resources used in districts at the extremes of quadrant 4 – those with the lowest outcomes and greatest needs. More specifically, we separately examine resource levels in the five districts that have

  • EEO funding deficits of greater than $3,000 per pupil;
  • average standardized assessment scores more than 1.5 standard deviations below the mean district; and
  • LEP/ELL shares in 2007-08 greater than 10%.

We refer to these districts as severe disparity districts, and they include Meriden, Waterbury, New London, Bridgeport and New Britain.

Figure 2 – Severe Funding Disparities and Outcomes


At least two considerations limit the usefulness of simply comparing average salary levels across districts. First, competitive wages for professional occupations vary across regions in the state. Because, for instance, competitive wages in the Bridgeport-Norwalk-Stamford area are about 20 percent higher than in the Hartford area, a given nominal salary in Bridgeport has different purchasing power than the same nominal salary in Hartford. Second, teacher salaries vary substantially across different experience levels within districts. Thus, two districts that pay identical salaries for teachers with the same level of experience can have much different average salaries if one district has more experienced teachers than the other. Because differences in the experience distribution of teachers across districts are interesting in their own right, we examine them directly in the next section. In this section, we maintain focus on differences in salaries controlling for experience levels.

To address these issues, we estimated a salary model for Connecticut teachers using individual teacher level data on Connecticut teachers.[6] The goal of the wage model is to determine the average disparity in teacher salary between a) high spending/high outcomes districts and low spending/low outcomes districts and b) between severe disparity districts and other low spending/low outcomes districts controlling for teacher experience levels and the region of the state where the teacher works. The resulting estimates indicate, on average, how much more or less a teacher with similar qualifications, in the same labor market, is expected to be paid in FTE salary if working in a disadvantaged district.
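A minimal, simulated sketch of such a wage model (Python/statsmodels) follows; the variable names are hypothetical and the group gaps are seeded to roughly mimic the estimates reported below, not drawn from the actual Connecticut file:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 2000
teachers = pd.DataFrame({
    "exper_cat": rng.choice(["0-4", "5-9", "10-14", "15+"], n),
    "masters": rng.integers(0, 2, n),
    "region": rng.choice(["Hartford", "Bridgeport", "Rural"], n),
    "group": rng.choice(["high/high", "low/low", "severe"], n),
})
base = {"0-4": 48000, "5-9": 58000, "10-14": 66000, "15+": 72000}
gap = {"high/high": 0, "low/low": -1000, "severe": -1800}  # seeded gaps
teachers["salary"] = (
    teachers["exper_cat"].map(base)
    + 4000 * teachers["masters"]
    + teachers["group"].map(gap)
    + rng.normal(0, 2000, n)
)

fit = smf.ols("salary ~ C(exper_cat) + masters + C(region) + C(group)",
              data=teachers).fit()
print(fit.params.filter(like="group"))  # estimated district-group salary gaps
# Adding C(group):C(exper_cat) or a group-by-year interaction yields the
# experience-profile and over-time comparisons discussed further below.
```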

The results of the regression analysis are presented in Table 1. The results indicate that salaries for teachers with more experience are higher, that teachers with advanced degrees, controlling for experience level are paid more, and that teachers tend to be paid less in regions other than Bridgeport-Stamford, and particularly so in the more rural parts of the state. With respect to differences across the three categories of districts, the results indicate that all else equal:

  1. A teacher in a low spending/low outcome district is likely to be paid about $1,000 less than a comparable teacher in a high spending/high outcome district in the same labor market;
  2. A teacher in a severe disparity district is likely to be paid about $1,800 less than a comparable teacher in all other districts in the same labor market;
  3. A teacher in a severe disparity district is likely to be paid about $1,600 less than a comparable teacher in other low spending/low outcome districts in the same labor market.

Thus, despite the expectation that severe disparity districts would need to pay higher salaries to attract teachers of equal quality, we find they pay lower salaries than other districts in the same regions.

Figure 3 uses a variation on the statistical model in Table 1, including an interaction term between district group and experience category, to project the expected salaries of teachers in each experience category, holding other teacher characteristics constant. By interacting district group and experience, we are able to determine whether at some experience levels, teachers in severe disparity districts have more or less competitive salaries (whereas the model in Table 1 tells us only that, on average, across all experience levels, teachers’ salaries differ across district groups).

 Table 1 – Regression Estimates of Connecticut Teacher Salary Structures


 

At all experience levels, teachers in high spending/high outcome districts are paid more than their otherwise comparable peers in low spending/low outcome districts or in severe disparity districts. The gap appears to grow at higher levels of experience for teachers in severe disparity districts, and the gap is largest for teachers in low spending/low outcome districts across the mid-ranges of experience. For example, in the first few years of teaching, a teacher in a severe disparity district earns a wage of about $51,300 compared to a teacher in an advantaged district at $52,707, a difference of just over $1,400. But, by the 10th year of experience, that wage gap has grown to over $3,000, by the 15th year, nearly $4,000 and by the 20th year, over $4,300.

 Figure 3 – Teacher Salary Disparities


Data Source: http://sdeportal.ct.gov/Cedar/WEB/ct_report/StaffExport.aspx

We used another variation on the statistical model to project salaries for each group, for teachers with equated characteristics, in order to evaluate whether teacher salaries in one group are falling further behind teacher salaries in another group over time. In this case, we interact the district group with the year variable in order to allow for the possibility that teacher salary disparities may be different in different years. Results from these regressions help to evaluate whether teacher salaries in severe disparity districts are catching up or falling even further behind.

Figure 4 shows that both teachers in the low spending/low outcomes group as a whole and in the severe disparity group in particular, are falling further behind teacher salaries in the high spending/high outcome group (in the same labor market). The growth in the salary gap between teachers in severe disparity districts and those in high resource districts is particularly disconcerting having grown from a difference of $1,054, or 1.7%, in 2005 to a difference of $5,517, or 8.1%, in 2010.

Figure 4 – Salary Disparities over Time


Data Source: http://sdeportal.ct.gov/Cedar/WEB/ct_report/StaffExport.aspx

Figures predicted for an individual with 5 to 9 years of experience, a Master’s degree, working in CBSA 25540 (Hartford)

Figure 5 shows that, compared to high spending/high outcome districts, low spending/low outcome districts, including severe disparity districts, have higher shares of teachers in their first four years of experience. Districts in the low spending/low outcomes group generally have smaller shares of teachers in the 5 to 9 year and 10 to 14 year categories, whereas districts facing severe disparities have shortfalls of the most experienced teachers.

Figure 5 – Teacher Experience


Data Source: http://sdeportal.ct.gov/Cedar/WEB/ct_report/StaffExport.aspx

Table 2 provides the estimates of a logistic regression model of the probability that a teacher is in his or her first three years of teaching, after correcting for other factors. The purpose of this analysis is to identify factors associated with, or predictors of, the likelihood that a teacher is a novice teacher. Figure 5 above indicates a greater share of novice teachers in low resource, low outcome districts and in severe disparity districts than in high resource, high outcome districts. Unlike the chart above, the logistic regression model allows us to determine the relative probability that a teacher in a severe disparity district is a novice, compared a) in the same year and b) to other districts in the same labor market (metropolitan area), and c) to evaluate whether those probabilities change over time. The results in Table 2 show that on average (a worked example of reading such odds appears after the list below):

  1. Teachers working in the severe disparity group are 20% more likely to be “novice” teachers than teachers in all other districts.
  2. Teachers in low spending/low outcomes districts are 19% more likely to be novice teachers than those in high spending/high outcomes districts.
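For clarity on how a logit estimate becomes a “20% more likely” statement, here is a small worked example; the coefficient value is illustrative, not taken from Table 2:

```python
import numpy as np

beta = 0.182               # hypothetical logit coefficient for the group indicator
odds_ratio = np.exp(beta)  # about 1.20
print(f"{odds_ratio - 1:.0%} higher odds that the teacher is a novice")
```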

Table 2 – Estimates of the Odds that a Teacher is in Her First 3 Years


*p<.05, **p<.10

Data Source: http://sdeportal.ct.gov/Cedar/WEB/ct_report/StaffExport.aspx

EVIDENCE FROM TEXAS

Here, I explore the distribution of teachers’ salaries and concentrations of novice teachers across Texas school districts. I explore how salaries and novice teacher concentrations vary by:

  • Poverty Quintile (U.S. Census poverty rate)
  • District Property Value Quintile
  • Resource/Outcome Group

Resource/outcome groups are determined according to Figure 6. I showed in the previous section that adjusted current operating expenditures were associated with actual outcomes. On average, higher spending districts had higher outcomes. I expressed spending and outcomes around their averages, such that there were high spending, high outcome districts where spending and outcomes were both above average, and there were low spending, low outcome districts where both were below average, as shown in Figure 6. A similar classification is constructed for both the college readiness model results and for the TAKS model results.

Figure 6


Table 3 provides the results of four wage models which attempt to discern the extent of variation in teacher wages between groups of districts, among districts in the same labor market, and for teachers with the same number of years of experience and the same degree level.

Table 3. Salary Parity for Teachers across Districts by District Group/Type (2008-09 to 2009-10)


*p<.05

Includes controls for labor market.

Table 3 provides mixed findings. First, teachers in low resource, low outcome districts earn about $271 more than teachers in high resource, high outcome districts using the TAKS based cost model for classification, and $449 more using the college readiness model for classification. These are very small salary differentials and hardly likely to be sufficient for recruiting teachers of comparable qualifications to those in the more advantaged districts.

Teachers in the highest poverty quintile of districts are paid about $1,660 more than teachers in the lowest poverty quintile. While larger than the wage premium difference between resource/outcome categories, this difference is also hardly likely to balance the distribution of teacher qualifications between the highest and lowest poverty districts. But low property wealth districts pay, on average, a lower teacher wage than high property wealth districts, by about $1,306. None of these differences is huge. Wages are relatively flat across these groups. The contrast in findings by property wealth and by poverty is intriguing, suggesting perhaps that in Texas property wealth related disparities in resources remain more persistent than even poverty related disparities.

Table 4 addresses the distribution of novice teachers by the same group classifications, again comparing districts to others in the same labor market. Table 4 uses a logistic regression model to determine the odds that a teacher is in his or her first 3 years of teaching.

  • Table 4 shows that a teacher in a low resource, low outcome district is 53% (TAKS model) or 40% (college readiness model) more likely to be novice than a teacher in a high resource, high outcome district.
  • A teacher in a district in the highest poverty quintile is 26% more likely to be novice than a teacher in a district in the lowest poverty quintile.
  • Finally, and quite strikingly, a teacher in a district in the lowest wealth quintile is 66% more likely than a teacher in a district in the highest wealth quintile to be novice. Again, property wealth disparities rule the day.

Table 4. Likelihood that a Teacher is a Novice (2008-09 to 2009-10)

*p<.05

Includes controls for labor market.

EVIDENCE FROM KANSAS

In this subsection, I explore disparities in actual staffing distributions and assignments to courses across Kansas public school districts using data on individual teachers, focusing on the most recent two years of data (2010 & 2011). For illustrative purposes, I organize Kansas school districts into quadrants, based on where each district falls in terms of a) total expenditures per pupil adjusted for the costs of achieving comparable (average) student outcomes (using the Duncombe cost index)[7], and b) actual district average proficiency rates on state reading (grades 5, 8 and 11) and math (grades 4, 7 and 10) assessments.

Figure 7 shows the distribution of districts by their quadrants. As an important starting point, Figure 7 shows that there exists a reasonably strong positive relationship between adjusted spending per pupil and outcomes (r-squared = .45, weighted for district enrollment). That is, districts with more resources have higher outcomes and districts with fewer resources have lower outcomes. Placing a horizontal line at the average actual outcomes and a vertical line at the average adjusted spending carves districts into four groups or quadrants. It is important to understand, however, that districts nearer the intersection of the horizontal and vertical lines are more similar to one another and less representative of their quadrants. That is, “average” Kansas districts are characterized by the cluster around the intersection as opposed to the few districts right at the intersection. To explore the extent of disparities between the most and least advantaged districts statewide, some analyses herein focus specifically on those districts which are deeper into their quadrants, labeled as “extreme” and colored in red in the figure.[8]
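A minimal sketch of the adjustment described in note [7], and of the enrollment-weighted fit behind Figure 7, follows; the data are simulated and all column names are hypothetical:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 280  # roughly the number of Kansas districts
districts = pd.DataFrame({
    "spending_pp": rng.normal(13000, 1500, n),  # total expenditures per pupil
    "cost_index": rng.normal(1.0, 0.1, n),      # Duncombe-style cost index
    "enroll": rng.integers(100, 20000, n),
})
# Per note [7]: divide expenditures by the cost index to express them in
# terms of the cost of achieving average outcomes.
districts["adj_spending"] = districts["spending_pp"] / districts["cost_index"]
districts["proficiency"] = 50 + 0.003 * districts["adj_spending"] + rng.normal(0, 8, n)

fit = smf.wls("proficiency ~ adj_spending", data=districts,
              weights=districts["enroll"]).fit()
print(fit.rsquared)  # the post reports an enrollment-weighted r-squared of about .45
```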

Figure 7. Distribution of Districts by Resources & Outcomes (2010)


The quadrants of the figure may be characterized as follows:

  • Upper Left: Lower than average adjusted spending with higher than average outcomes
  • Upper Right: Higher than average adjusted spending with higher than average outcomes
  • Lower Right: Higher than average adjusted spending with lower than average outcomes
  • Lower Left: Lower than average adjusted spending with lower than average outcomes

Again, some caution is warranted in interpreting these quadrants. One can be fairly confident that those districts deeper into the upper right and lower left quadrants legitimately represent high resource, high outcome, and low resource low outcome districts. But, one should avoid drawing bold “efficiency” conclusions about districts in the upper left or lower right. For example, the relationship appears somewhat curved, not straight, shifting larger numbers of districts that lie at the middle of the distribution into the upper left quadrant (rather than distributing them evenly around the intersection).

The largest numbers of children in the state attend school districts that fall in the expected quadrants – those in the upper right which have high resource levels and high outcomes – and those in the lower left which have low resource levels and low outcomes. While a significant number of districts fall in the upper left – appearing to have high outcomes and low resources – most are relatively near the center of the distribution, and in total, they serve fewer students than either those in the upper right or lower left quadrants.

It is also important to understand that comparisons of staffing configurations made across these quadrants are all normative – based on evaluating what some children have access to relative to others. Most of the following comparisons are between school districts in the upper right and lower left hand quadrants. That is, what do children in low resource, low outcome schools have access to compared to children in high resource, high outcome schools? We know from the previous figures, based on the Office for Civil Rights data, that participation rates in advanced courses decline precipitously as poverty increases across Kansas schools and districts. We also know that access to such opportunities is important for success in college. And, we know that such opportunities can only be provided by making available sufficient numbers of qualified teaching staff. Further, we know that districts serving higher need student populations face resource allocation pressures to allocate more staffing to basic, general and remedial courses. Research on staffing configurations in other states generally supports these assertions.

Table 5 summarizes the characteristics of districts falling into each quadrant. Of the approximately 474,000 students matched to districts for which full information was available in 2010, 172,671 attend districts with high spending and high outcomes, at least compared to averages. 154,000 attend districts with low spending and low outcomes. Smaller groups attend districts in the other two quadrants.

For adjusted total expenditures per pupil, districts in the higher spending, higher outcome quadrant have about $4,000 per pupil more than those in the lower spending, low outcomes quadrant. The difference for general fund budgets is about $800. Also related to resources, districts with high spending levels and high outcomes have fewer pupils per teacher assignment when compared to low spending, low outcome districts. That is, from the outset, low spending low outcome districts have fewer teacher assignments to spread across children. Yet, these low spending low outcome districts, which are invariably higher need districts, must find ways to both provide basic and remedial programming to bring their students up to minimum standards, and must find some way to offer the types of advanced courses required for their graduates to have meaningful access to higher education.

Table 5. Characteristics of Districts by Group (2010)


Figure 8 shows that districts with higher concentrations of low income populations have systematically higher concentrations of novice teachers (in their first or second year). In fact, low income concentration alone explains nearly 40% of the variation in novice teacher concentration. Districts like Kansas City have much higher rates of novice teachers than neighboring suburban districts, including those which are growing rapidly and have increased demand for new teachers. This finding suggests that districts like Kansas City and Turner have much higher turnover rates than districts like DeSoto, Blue Valley or Shawnee Mission. Yet, current Kansas school finance policies provide financial support for teacher retention in the districts already advantaged with systematically lower concentrations of novice teachers.

 

Figure 8. Shares of First and Second Year Teachers by Low Income Student Shares


Table 6 uses data from the statewide staffing files for 2010 and 2011 and compares teachers by quadrant and then for the extreme groups. Based on the indicator of teacher prior-year status, differences appear relatively small, with marginally higher shares of teachers indicating that they are returning teachers in high resource, high outcome districts or very high resource, very high outcome districts.

Table 6. Shares of Returning and Novice Teachers by District Group


Data Source: Statewide Staffing Assignment Database, 2010-2011

 

But, shares of novice teachers reveal more substantive differences. Table 6 shows that in low resource, low outcome districts over 23% of teachers have 3 or fewer years of experience, compared to 15.36% in high resource high outcome districts. The share of novice teachers increases to 26.56% in very low resource very low outcome districts.

Table 7. Odds that a Teacher is Novice by District Group (Logistic Regression)


Table 7 provides more precise estimates of the odds that a teacher is novice, given the group that the district is in, and compared against districts in the same labor market. The baseline comparison group is the high resource high outcome group. Compared to teachers in the high resource high outcome districts, teachers in the low resource low outcome districts are nearly 70% more likely to be novice.

Table 8 asks whether teachers in low resource low outcome districts are receiving lower base salaries than teachers of the same experience level in high resource high outcome districts in the same labor market.

Table 8. Salary Disparities by District Group (linear regression)


*p<.05

Table 8 shows that teachers in low resource low outcome districts at the same experience level are paid, on average, in base salary, about $450 less than teachers in high resource high outcome districts in the same labor market. Teachers in other districts are actually paid even less in base salary. That is, there exists no compensating differential to attract teachers to low resource low outcome districts. In fact, arguably, current policies which provide for additional local budget authority to affluent suburban districts work to reinforce the salary disparities shown in Table 8 and the novice teacher concentration disparities shown in Tables 6 and 7, and in Figure 8.

 

EVIDENCE FROM ILLINOIS

Figure 9 depicts the distribution of our Illinois school districts by outcome-resource quadrant, and by grade ranges served. Both the cost adjusted spending measure and the outcome index are standardized around a mean of “0.” On average, districts in Illinois cluster around the expected values and therefore are concentrated in the expected quadrants. Unified K-12 districts are least spread out across the quadrants. That is, there are fewer extremes among Unified K-12 districts. It should also be noted that the low resource, low outcome group of Unified K-12 districts is heavily influenced by the presence of Chicago City schools. The greatest extremes exist for secondary districts, primarily in the Chicago metropolitan area. In the case of outcome measures, districts are standardized around the mean for their grade range (district type).

Figure 9. Distribution of Illinois School Districts


Table 9 characterizes the Illinois districts in each quadrant. There are 146 high spending high outcome elementary districts serving over 250 thousand children and 120 low spending low outcome districts serving about 150 thousand children. There are 41 high spending high outcome secondary districts serving up to 150 thousand children and 25 low spending low outcome secondary districts serving about 50 thousand children. For unified districts, there are 82 that are high spending and high outcome, serving 250 thousand children and 156 that are low spending with low outcomes, serving over 800 thousand children, with about half of those children attending Chicago Public Schools.

Even without any adjustment for costs or needs, the average per pupil operating expenditures are lower in low spending, low outcome districts. After adjustment, they are substantially lower. The percent of children who are low income is substantially higher in low spending, low outcome districts. Further, low spending, low outcome districts have fewer total staffing assignments per 1,000 students than their more affluent peers, and have lower teacher salaries at given levels of experience and degree level. Overall, lower spending low outcome districts in Illinois face substantial deficits from the outset.

Table 9. Descriptive Characteristics of Illinois School Districts


EVIDENCE FROM NEW YORK

Figure 10 depicts the distribution of New York State school districts. As with the Illinois districts, the New York district spending and outcome measures are standardized around a mean of “0.” Again, districts tend to fall clustered around expectations (correlation, weighted for district enrollment = 0.63). High spending, high outcome districts spread far into the upper right corner of Figure 10, whereas disadvantaged districts tend to be more clustered toward the center of the figure. However, some notable exceptions fall well into the lower left quadrant, including the mid-size cities of Utica and Poughkeepsie along with the larger upstate cities of Syracuse, Rochester and Buffalo.

Figure 10. Distribution of New York School Districts


Table 10 summarizes the characteristics of New York State school districts in the low spending, low outcomes and high spending, high outcomes quadrants. There are 186 districts serving nearly 580 thousand children in the high spending high outcomes quadrant and 194 districts in the low spending, low outcomes quadrant serving just over 450 thousand children. Low spending, low outcome districts have significantly higher rates of children in poverty, significantly lower nominal spending per pupil and substantially lower need and cost adjusted spending per pupil, lower teacher salaries (at similar degree and experience), but they do have slightly more total teacher assignments per 1,000 pupils.

Table 10. Descriptive Characteristics of New York School Districts


 

==================

*Anzia and Moe, here, find that variance of teacher attributes is greater in larger than in smaller districts in California, still asserting state policy to be the cause, but to have differential effect because large bureaucracies (large districts) are more susceptible to state policy constraints. This argument is plainly illogical because a large district has the capacity to organize/operate as if it were several small districts – whereas a small district does not have the capacity to act as a large district. In other words, small districts are more likely constrained. If large districts respond more bureaucratically, that is in fact the responsibility of district leadership, not a direct result of state policy. More likely, however, the greater variance in large districts regarding the relationship between teacher characteristics and student populations is a function of greater variance, in general, on all measures, across schools in larger, more heterogeneous districts.

==============

NOTES

[1] For example, see Eric A. Hanushek, John F. Kain, and Steven G. Rivkin, “Teachers, Schools, and Academic Achievement,” Econometrica 73, no. 2 (March 2005): 417-458; Daniel Aaronson, Lisa Barrow, and William Sander, “Teachers and Student Achievement in Chicago Public High Schools,” Federal Reserve Bank of Chicago Working Paper 2002-28, 2002.

[2] Hanushek, Kain, Rivkin, “Why Public Schools Lose Teachers,” p. 350

[3] See Charles T. Clotfelter, Helen F. Ladd and Jacob L. Vigdor, “Who Teaches Whom? Race and the distribution of novice teachers,” Economics of Education Review 24, no. 4 (August 2005): 377-392; Charles T. Clotfelter, Helen F. Ladd and Jacob L. Vigdor, “Teacher sorting, teacher shopping, and the assessment of teacher effectiveness,” Sanford Institute of Public Policy, Duke University, 2004; and Hanushek, Kain, and Rivkin, “Teachers, schools, and academic achievement.”

[4] Hanushek, Kain, and Rivkin, “Teachers, schools, and academic achievement.”

[5] http://edpro.stanford.edu/hanushek/admin/pages/files/uploads/w12651.pdf

[6] The Connecticut Department of Education provides a 6-year extractable panel (2005 to 2010) of individual teacher level data, available at: http://sdeportal.ct.gov/Cedar/WEB/ct_report/StateStaffReport.aspx. This file includes just over 50,000 cases (individuals) per year, with indicators of district and school assignment, teacher position type, assignment and salaries.

[7] The Duncombe Cost Index is used to adjust expenditures for the value of those expenditures toward achieving common outcome goals (the statewide average). This is done by taking the expenditure figure (either general fund budgets or total expenditures per pupil) and dividing that figure by the cost index.

[8] having either <75% proficient & <$12,000 per pupil in total expenditures, adjusted for need and costs, or having >90% proficient & >$14,000 in need and cost adjusted spending

The real path to quality, equitable and adequate schooling (hint – It’s not Vergara!)

The blogging has been sparse lately because my head is buried in really cool and important projects these days. My apologies to those anxiously awaiting glib, sarcastic updates and smackdowns on issues such as the Vergara case (where the logical fallacies run wild – more later if I ever get the chance) or the multitude of nonsensical reports that continue to flow out of beltway think tanks.

Here’s a quick summary on a topic I’ve addressed previously – but with some new studies included. That is, what do we really know about the importance of school finance reforms – the equitable and adequate distribution of resources – for providing the necessary underlying conditions for an equitable and adequate system of elementary and secondary schooling?

Over the past several decades, many states have pursued substantive changes to their state school finance systems, while others have not. Some reforms have come and gone. Some reforms have been stimulated by judicial pressure resulting from state constitutional challenges and others have been initiated by legislatures. In an evaluation of judicial involvement in school finance and resulting reforms from 1971 to 1996, Murray, Evans and Schwab (1998) found that “court ordered finance reform reduced within-state inequality in spending by 19 to 34 percent. Successful litigation reduced inequality by raising spending in the poorest districts while leaving spending in the richest districts unchanged, thereby increasing aggregate spending on education. Reform led states to fund additional spending through higher state taxes.” (p. 789)

Evaluating whether state school finance systems, or reforms to those systems, lead to increases in spending generally, or to spending targeted to children from economically disadvantaged backgrounds, is of little relevance absent evidence that such reforms are effective. There exists an increasing body of evidence that substantive and sustained state school finance reforms matter for improving both the level and distribution of short term and long run student outcomes. A few studies have attempted to tackle school finance reforms broadly, applying multi-state analyses over time. Card and Payne (2002) found “evidence that equalization of spending levels leads to a narrowing of test score outcomes across family background groups.” (p. 49) Jackson, Johnson and Persico (2014) use data from the Panel Study of Income Dynamics (PSID) to evaluate long term outcomes of children exposed to court-ordered school finance reforms, based on matching PSID records to childhood school districts for individuals born between 1955 and 1985 and followed through 2011. They find that the “Effects of a 20% increase in school spending are large enough to reduce disparities in outcomes between children born to poor and non‐poor families by at least two‐thirds,” and further that “A 1% increase in per‐pupil spending increases adult wages by 1% for children from poor families.” (p. 42)

Figlio (2004) explains that the influence of state school finance reforms on student outcomes is perhaps better measured within states over time, explaining that national studies of the type attempted by Card and Payne confront problems of a) the enormous diversity in the nature of state aid reform plans, and b) the paucity of national level student performance data. Most recent peer reviewed studies of state school finance reforms have applied longitudinal analyses within specific states, and several such studies provide compelling evidence of the potential positive effects of school finance reforms. Roy (2011) published an analysis of the effects of Michigan’s 1990s school finance reforms, which led to a significant leveling up for previously low-spending districts. Roy, whose analyses measure both whether the policy resulted in changes in funding and who was affected, found that “Proposal A was quite successful in reducing interdistrict spending disparities. There was also a significant positive effect on student performance in the lowest-spending districts as measured in state tests.” (p. 137) Similarly, Papke (2005), also evaluating Michigan school finance reforms from the 1990s, found that “increases in spending have nontrivial, statistically significant effects on math test pass rates, and the effects are largest for schools with initially poor performance.” (Papke, 2005, p. 821)[1] Deke (2003) evaluated “leveling up” of funding for very-low-spending districts in Kansas, following a 1992 lower court threat to overturn the funding formula (without a formal ruling to that effect), and found that a 20 percent increase in spending was associated with a 5 percent increase in the likelihood of students going on to postsecondary education. (p. 275)

Three studies of Massachusetts school finance reforms from the 1990s find similar results. The first, a non-peer-reviewed report by Downes, Zabel, and Ansel (2009), explored, in combination, the influence on student outcomes of accountability reforms and changes to school spending. It found that “Specifically, some of the research findings show how education reform has been successful in raising the achievement of students in the previously low-spending districts.” (p. 5) The second study, an NBER working paper by Guryan (2001), focused more specifically on the redistribution of spending resulting from changes to the state school finance formula. It found that “increases in per-pupil spending led to significant increases in math, reading, science, and social studies test scores for 4th- and 8th-grade students. The magnitudes imply a $1,000 increase in per-pupil spending leads to about a third to a half of a standard-deviation increase in average test scores. It is noted that the state aid driving the estimates is targeted to under-funded school districts, which may have atypical returns to additional expenditures.” (p. 1)[2] The most recent, by Nguyen-Hoang & Yinger (2014), also found that “changes in the state education aid following the education reform resulted in significantly higher student performance.” (p. 297)

Downes had conducted earlier studies of Vermont school finance reforms in the late 1990s (Act 60). In a 2004 book chapter, Downes noted “All of the evidence cited in this paper supports the conclusion that Act 60 has dramatically reduced dispersion in education spending and has done this by weakening the link between spending and property wealth. Further, the regressions presented in this paper offer some evidence that student performance has become more equal in the post-Act 60 period. And no results support the conclusion that Act 60 has contributed to increased dispersion in performance.” (p. 312)[3] Most recently, Hyman (2013) also found positive effects of Michigan school finance reforms in the 1990s, but raised some concerns regarding the distribution of those effects, finding that much of the increase was targeted to schools serving fewer low income children. Still, the study found that students exposed to an additional “12% more spending per year during grades four through seven experienced a 3.9 percentage point increase in the probability of enrolling in college, and a 2.5 percentage point increase in the probability of earning a degree.” (p. 1)

Indeed, this point is not without some controversy, much of which is easily discarded. Second-hand references to dreadful failures following massive infusions of new funding can often be traced to methodologically inept, anecdotal tales of desegregation litigation in Kansas City, Missouri, or court-ordered financing of urban districts in New Jersey (see Baker & Welner, 2011).[4] Hanushek and Lindseth (2009) use a similar anecdote-driven approach in which they dedicate a chapter of a book to proving that court-ordered school funding reforms in New Jersey, Wyoming, Kentucky, and Massachusetts resulted in few or no measurable improvements. However, these conclusions are based on little more than a series of graphs of student achievement on the National Assessment of Educational Progress in 1992 and 2007 and an untested assertion that, during that period, each of the four states infused substantial additional funds into public education in response to judicial orders.[5] Greene and Trivitt (2008) present a study in which they claim to show that court ordered school finance reforms led to no substantive improvements in student outcomes. However, the authors test only whether the presence of a court order is associated with changes in outcomes, never once measuring whether substantive school finance reforms followed the court order, yet they still conclude that court ordered funding increases had no effect. In an equally problematic analysis, Neymotin (2010) set out to show that massive court ordered infusions of funding in Kansas following Montoy v. Kansas led to no substantive improvements in student outcomes. However, Neymotin evaluated changes in school funding from 1997 to 2006, while the first additional funding infused following the January 2005 Kansas Supreme Court decision occurred in the 2005-06 school year, the end point of Neymotin’s outcome data.

On balance, it is safe to say that a sizeable and growing body of rigorous empirical literature validates that state school finance reforms can have substantive, positive effects on student outcomes, including reductions in outcome disparities and increases in overall outcome levels. Further, it stands to reason that if positive changes to school funding have positive effects on the level and distribution of short and long run outcomes, then negative changes to school funding likely have negative effects on student outcomes. Thus it is critically important to understand the impact of the recent recession on state school finance systems, since the effects on long term student outcomes remain several years down the line. It is also important to understand the features of state school finance systems, including the balance of revenue sources, that may make these systems particularly susceptible to economic downturns.

NOTES

[1] In a separate study, Leuven and colleagues (2007) attempted to isolate specific effects of increases to at-risk funding on at-risk pupil outcomes, but did not find any positive effects.

[2] While this paper remains an unpublished working paper, the advantage of Guryan’s analysis is that he models the expected changes in funding at the local level as a function of changes to the school finance formula itself, through what is called an instrumental variables or two stage least squares approach. Then, Guryan evaluates the extent to which these policy induced variations in local funding are associated with changes in student outcomes. Across several model specifications, Guryan finds increased outcomes for students at grade 4 but not grade 8. A counter study by the Beacon Hill Institute suggests that reduced class size and/or increased instructional spending either has no effect on or actually worsens student outcomes (Jaggia & Vachharajani, 2004).

[3] Two additional studies of school finance reforms in New Jersey also merit some attention in part because they directly refute findings of Hanushek and Lindseth and of the earlier Cato study and do so with more rigorous and detailed methods. The first, by Alex Resch (2008) of the University of Michigan (doctoral dissertation in economics), explored in detail the resource allocation changes during the scaling up period of school finance reform in New Jersey. Resch found evidence suggesting that New Jersey Abbott districts “directed the added resources largely to instructional personnel” (p. 1) such as additional teachers and support staff. She also concluded that this increase in funding and spending improved the achievement of students in the affected school districts. Looking at the statewide 11th grade assessment (“the only test that spans the policy change”), she found: “that the policy improves test scores for minority students in the affected districts by one-fifth to one-quarter of a standard deviation” (p. 1). Goertz and Weiss (2009) also evaluated the effects of New Jersey school finance reforms, but did not attempt a specific empirical test of the relationship between funding level and distributional changes and outcome changes. Thus, their findings are primarily descriptive. Goertz and Weiss explain that on state assessments achievement gaps closed substantially between 1999 and 2007, the period over which Abbott funding was most significantly scaled up. Goertz & Weiss further explain: “State Assessments: In 1999 the gap between the Abbott districts and all other districts in the state was over 30 points. By 2007 the gap was down to 19 points, a reduction of 11 points or 0.39 standard deviation units. The gap between the Abbott districts and the high-wealth districts fell from 35 to 22 points. Meanwhile performance in the low-, middle-, and high-wealth districts essentially remained parallel during this eight-year period” (Figure 3, p. 23).

[4] Two reports from Cato Institute are illustrative (Ciotti, 1998, Coate & VanDerHoff, 1999).

[5] That is, the authors merely assert that these states experienced large infusions of funding, focused on low income and minority students, within the time period identified. They necessarily assume that, in all other states which serve as a comparison basis, similar changes did not occur. Yet they validate neither assertion. Baker and Welner (2011) explain that Hanushek and Lindseth failed to even measure whether substantive changes had occurred to the level or distribution of school funding, as well as when and for how long. In New Jersey, for example, infusion of funding occurred from 1998 to 2003 (or 2005), thus Hanushek and Lindseth’s window includes 6 years on the front end where little change occurred (When?). Kentucky reforms had largely faded by the mid to late 1990s, yet Hanushek and Lindseth measure post reform effects in 2007 (When?). Further, in New Jersey, funding was infused into approximately 30 specific districts, but Hanushek and Lindseth explore overall changes to outcomes among low-income children and minorities using NAEP data, where some of these children attended the districts receiving additional support but many did not (Who?). In short, the slipshod comparisons made by Hanushek and Lindseth provide no reasonable basis for asserting either the success or failure of state school finance reforms. Hanushek (2006) goes so far as to title his book “How School Finance Lawsuits Exploit Judges’ Good Intentions and Harm Our Children.” The premise that additional funding for schools – often leveraged toward class size reduction, additional course offerings or increased teacher salaries – causes harm to children is, on its face, absurd. And the book which implies as much in its title never once validates that such reforms ever do cause harm. Rather, the title is little more than a manipulative attempt to convince the non-critical spectator who never gets past the book’s cover to fear that school finance reforms might somehow harm children. The book also includes two examples of a type of analysis that occurred with some frequency in the mid-2000s, which also had the intent of showing that school funding doesn’t matter. These studies would cherry pick anecdotal information on either or both a) poorly funded schools that have high outcomes or b) well-funded schools that have low outcomes (see Evers & Clopton, 2006; Walberg, 2006).

==================

References

Ajwad, M. I. (2006). Is intra-jurisdictional resource allocation equitable? An analysis of campus level spending data from Texas elementary schools. The Quarterly Review of Economics and Finance, 46, 552-564.

Baker, B. D. (2012). Re-arranging deck chairs in Dallas: Contextual constraints on within district resource allocation in large urban Texas school districts. Journal of Education Finance, 37(3), 287-315.

Baker, B. D., & Corcoran, S. P. (2012). The Stealth Inequities of School Funding: How State and Local School Finance Systems Perpetuate Inequitable Student Spending. Center for American Progress.

Baker, B., & Green, P. (2008). Conceptions of equity and adequacy in school finance. Handbook of research in education finance and policy, 203-221.

Baker, B. D., Sciarra, D. G., & Farrie, D. (2012). Is School Funding Fair?: A National Report Card. Education Law Center. http://schoolfundingfairness.org/National_Report_Card_2012.pdf

Baker, B. D., Taylor, L. L., & Vedlitz, A. (2008). Adequacy estimates and the implications of common standards for the cost of instruction. National Research Council.

Baker, B. D., & Welner, K. G. (2011). School finance and courts: Does reform matter, and how can we tell. Teachers College Record, 113(11), 2374-2414.

Baker, B.D., Welner, K.G. (2010) Premature celebrations: The persistence of inter-district funding disparities. Education Policy Analysis Archives. http://epaa.asu.edu/ojs/article/viewFile/718/831

Card, D., and Payne, A. A. (2002). School Finance Reform, the Distribution of School Spending, and the Distribution of Student Test Scores. Journal of Public Economics, 83(1), 49-82.

Chambers, J. G., Levin, J. D., & Shambaugh, L. (2010). Exploring weighted student formulas as a policy for improving equity for distributing resources to schools: A case study of two California school districts. Economics of Education Review, 29(2), 283-300.

Chambers, J., Shambaugh, L., Levin, J., Muraki, M., & Poland, L. (2008). A Tale of Two Districts: A Comparative Study of Student-Based Funding and School-Based Decision Making in San Francisco and Oakland Unified School Districts. Palo Alto, CA: American Institutes for Research.

Ciotti, P. (1998). Money and School Performance: Lessons from the Kansas City Desegregation Experience. Cato Policy Analysis #298.

Coate, D. & VanDerHoff, J. (1999). Public School Spending and Student Achievement: The Case of New Jersey. Cato Journal, 19(1), 85-99.

Corcoran, S., & Evans, W. N. (2010). Income inequality, the median voter, and the support for public education (No. w16097). National Bureau of Economic Research.

Dadayan, L. (2012) The Impact of the Great Recession on Local Property Taxes. Albany, NY: Rockefeller Institute. http://www.rockinst.org/pdf/government_finance/2012-07-16-Recession_Local_%20Property_Tax.pdf

Deke, J. (2003). A study of the impact of public school spending on postsecondary educational attainment using statewide school district refinancing in Kansas, Economics of Education Review, 22(3), 275-284.

Downes, T. A., Zabel, J., and Ansel, D. (2009). Incomplete Grade: Massachusetts Education Reform at 15. Boston, MA. MassINC.

Downes, T. A. (2004). School Finance Reform and School Quality: Lessons from Vermont. In Yinger, J. (ed), Helping Children Left Behind: State Aid and the Pursuit of Educational Equity. Cambridge, MA: MIT Press

Duncombe, W., Yinger, J. (2008) Measurement of Cost Differentials In H.F. Ladd & E. Fiske (eds) pp. 203-221. Handbook of Research in Education Finance and Policy. New York: Routledge.

Duncombe, W., & Yinger, J. (1998). School finance reform: Aid formulas and equity objectives. National Tax Journal, 239-262.

Edspresso (2006, October 31). New Jersey learns Kansas City’s lessons the hard way. Retrieved October 23, 2009, from http://www.edspresso.com/index.php/2006/10/new-jersey-learns-kansas-citys-lessons-the-hard-way-2/

Evers, W. M., and Clopton, P. (2006). “High-Spending, Low-Performing School Districts,” in Courting Failure: How School Finance Lawsuits Exploit Judges’ Good Intentions and Harm our Children (Eric A. Hanushek, ed.) (pp. 103-194). Palo Alto, CA: Hoover Press.

Figlio, D.N. (2004) Funding and Accountability: Some Conceptual and Technical Issues in State Aid Reform. In Yinger, J. (ed) p. 87-111 Helping Children Left Behind: State Aid and the Pursuit of Educational Equity. MIT Press.

Goertz, M., and Weiss, M. (2009). Assessing Success in School Finance Litigation: The Case of New Jersey. New York City: The Campaign for Educational Equity, Teachers College, Columbia University.

Greene, J. P., & Trivitt, J. R. (2008). Can Judges Improve Academic Achievement? Peabody Journal of Education, 83(2), 224-237.

Guryan, J. (2001). Does Money Matter? Estimates from Education Finance Reform in Massachusetts. Working Paper No. 8269. Cambridge, MA: National Bureau of Economic Research.

Hanushek, E. A., and Lindseth, A. (2009). Schoolhouses, Courthouses and Statehouses. Princeton, N.J.: Princeton University Press., See also: http://edpro.stanford.edu/Hanushek/admin/pages/files/uploads/06_EduO_Hanushek_g.pdf

Hanushek, E. A. (Ed.). (2006). Courting failure: How school finance lawsuits exploit judges’ good intentions and harm our children (No. 551). Hoover Press.

Hyman, J. (2013). Does Money Matter in the Long Run? Effects of School Spending on Educational Attainment.

Imazeki, J., & Reschovsky, A. (2004). School finance reform in Texas: A never ending story. Helping children left behind: State aid and the pursuit of educational equity, 251-281.

Jackson, C. K., Johnson, R. C., & Persico, C. (2014). The Effect of School Finance Reforms on the Distribution of Spending, Academic Achievement, and Adult Outcomes.

Jaggia, S., Vachharajani, V. (2004) Money for Nothing: The Failures of Education Reform in Massachusetts http://www.beaconhill.org/BHIStudies/EdStudy5_2004/BHIEdStudy52004.pdf

Leuven, E., Lindahl, M., Oosterbeek, H., and Webbink, D. (2007). The Effect of Extra Funding for Disadvantaged Pupils on Achievement. The Review of Economics and Statistics, 89(4), 721-736.

Murray, S. E., Evans, W. N., & Schwab, R. M. (1998). Education-finance reform and the distribution of education resources. American Economic Review, 88(4), 789-812.

Neymotin, F. (2010) The Relationship between School Funding and Student Achievement in Kansas Public Schools. Journal of Education Finance 36 (1) 88-108

Nguyen-Hoang, P., & Yinger, J. (2014). Education Finance Reform, Local Behavior, and Student Performance in Massachusetts. Journal of Education Finance, 39(4), 297-322.

Papke, L. (2005). The effects of spending on test pass rates: Evidence from Michigan. Journal of Public Economics, 89(5-6), 821-839.

Resch, A. M. (2008). Three Essays on Resources in Education (dissertation). Ann Arbor: University of Michigan, Department of Economics. Retrieved October 28, 2009, from http://deepblue.lib.umich.edu/bitstream/2027.42/61592/1/aresch_1.pdf

Roy, J. (2011). Impact of school finance reform on resource equalization and academic performance: Evidence from Michigan. Education Finance and Policy, 6(2), 137-167.

Sciarra, D., Farrie, D., Baker, B.D. (2010) Filling Budget Holes: Evaluating the Impact of ARRA Fiscal Stabilization Funds on State Funding Formulas. New York, Campaign for Educational Equity. http://www.nyssba.org/clientuploads/nyssba_pdf/133_FILLINGBUDGETHOLES.pdf

Taylor, L. L., & Fowler Jr, W. J. (2006). A Comparable Wage Approach to Geographic Cost Adjustment. Research and Development Report. NCES-2006-321. National Center for Education Statistics.

Walberg, H. (2006) High Poverty, High Performance Schools, Districts and States. in Courting Failure: How School Finance Lawsuits Exploit Judges’ Good Intentions and Harm our Children (Eric A. Hanushek, ed.) (pp. 79-102). Palo Alto, CA: Hoover Press.

Stronger than the Scorn: How do NJ schools really stack up?

This is my end of school year review for New Jersey schools. Indeed, much of the data in this post is from prior years. Nonetheless, these data affirm the long standing strong position of New Jersey schools in both the U.S. and international context. On many occasions I’ve pointed out better and worse uses of national and international assessment data – of Mis-NAEPery and PISA-Palooza… wherein the media and punditocracy go wild with gross misrepresentations and misinterpretations of relatively limited, albeit not entirely useless, assessment data.

In New Jersey, we’ve been told in recent years that while our average scores remain high, we must not rest on our laurels, because our gains pale by comparison to reformy standout states like Tennessee. We’ve been told that while our average performance is high, our gaps in achievement are among the largest in the nation and certainly not improving at any reasonable rate. We’ve also been told that these findings provide strong proof that all the money New Jersey has thrown at schools in response to years of litigation over school funding has not only been unhelpful, but that the additional funding to high poverty settings has actually caused harm. As such, the way to repair that harm is to reduce funding to high need settings and redistribute those harmful resources across other, less needy districts likely to use it more wisely (yeah… really… they did say that… and they’ve followed through on that redistribution plan!) And that will help fix the achievement gap!

But what really is the state of student outcomes in New Jersey, if we apply a few basic principles to the analysis of NAEP data – guidelines I have addressed in numerous previous posts:

  • First, average state contexts differ and those differences strongly influence average NAEP scores at all grade levels. In short, poverty matters! As such, performance levels should be adjusted for poverty.
  • Second, NAEP gains over time (which are cohort gains), are strongly influenced by initial NAEP performance. That is, those who started lower and had more to gain, gained more. As such, changes over time should be adjusted for initial scale scores.
  • Third, achievement gaps between low income and non-low-income children, or among races, are substantially influenced by the income gaps between these groups. As such, achievement gaps should be adjusted for differences in income between groups. (A rough sketch of this adjustment approach follows this list.)
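All three adjustments boil down to the same move: regress the outcome on the contextual factor and compare each state to its predicted value. Here is a minimal sketch of that residual approach, using made-up numbers rather than actual NAEP data.

```python
import numpy as np

# Hypothetical state-level data (illustrative only, not actual NAEP results)
poverty = np.array([0.12, 0.15, 0.18, 0.22, 0.25, 0.30])     # poverty rate
naep = np.array([292.0, 288.0, 285.0, 279.0, 281.0, 272.0])  # scale score

# Fit the linear trend of scores on poverty...
slope, intercept = np.polyfit(poverty, naep, 1)
predicted = intercept + slope * poverty

# ...and take each state's difference from expectation. States above the
# line (positive residual) beat the odds given their poverty rate.
residuals = naep - predicted
for p, r in zip(poverty, residuals):
    print(f"poverty={p:.2f}  difference from expected={r:+.1f}")
```

The same residualizing logic carries through the gain and gap analyses below, swapping in initial scale scores or between-group income gaps as the predictor.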

Average Performance Level Adjusted for Poverty

This first figure shows the scatterplot of state average poverty rates against average 8th grade NAEP scale scores. Those states falling above the line have greater than expected scale scores; those below the line have lower than expected scale scores. Notably, the correlations are quite strong. New Jersey beats the odds on both 8th grade reading and math. That is, NJ scores are higher than would be expected even given New Jersey’s low poverty rate.

Figure 1

Slide1

If we rank states by their average difference from expectations, New Jersey comes in fifth (averaging the math and reading differentials).

Figure 2

Slide2

But that’s only because the whole U.S. stinks, right?

Of course, some might argue that it’s really nothing to cheer about – doing better than other U.S. states – because of course, we all know that the U.S. performs miserably compared to other nations. But as I’ve pointed out previously, when conducting similar poverty adjusted comparisons across countries, the U.S. doesn’t look so bad. (see here for more explanation/discussion)

Figure 3

Slide11

Ah… but you say (Amanda Ripley style)… outcomes of even high performing – non-low income kids in the U.S. still stink compared to those in other countries. Again, I respond by pointing out that most such comparisons are deeply conceptually flawed. Perhaps most importantly, as I’ve explained on numerous previous posts, the U.S. average is only as low as it is in international comparisons because of the large number of low performing (relatively high poverty) states that have largely thrown their education systems under the bus for the past several decades. Massachusetts and New Jersey (among others) are not contributing to that drag, and if treated as their own nations (like the silliness of comparing against Shanghai or Singapore), they’d look pretty good.

Figure 4

Slide12

Aren’t other states kicking NJ’s butt on NAEP gains because of their reformy policies?

Okay – so New Jersey’s average performance is pretty high even adjusted for its low poverty rate. But we all know – as does the U.S. Secretary of Education – that states that have been much more aggressive with teacher effectiveness policies and charter expansion have kicked NJ’s butt over the past several years on gains. NJ is simply resting on its laurels. Complacent and, as such, a serious laggard. Heck, any day now, Tennessee, Louisiana, Colorado and even Washington DC will be blowing away New Jersey on NAEP. Any day now. Any day!

As I’ve pointed out previously, NAEP gains are sensitive to NAEP starting point. Yeah… if you start low, it seems that for whatever reason (test scaling, etc.) it’s easier to show bigger gains.

So then, where does NJ stand on 10 year NAEP gains (because looking at 2 year gains is simply pointless) when compared to other states, after adjusting for starting point? First, here’s the relationship between starting point and gains (clearer picture in this post, dropping DC):

Figure 5

Slide3

Figure 6

Slide4

Now, here are the adjusted ranks.

Figure 7

Slide5

But what about those massive achievement gaps?

One of the common reformy assertions in New Jersey in recent years has been that even if we wish to deceive ourselves that on average, New Jersey is doing pretty well, there’s simply no refuting that our achievement gaps are among the worst in the nation. Further, these gaps are proof positive of the harms of targeting additional resources at children in need.

Let’s take a look at those achievement gaps, with an appropriate adjustment for income gaps. As I’ve explained over and over and over again – achievement gaps between low income and non-low income kids, or between kids of different races, are significantly explained by differences in family income of these groups. Yeah… put simply, income gaps predict outcome gaps. As such, one can use the income gaps to adjust the outcome gaps, just like these previous analyses.

First, here are the scatterplots revealing the relationships between income gaps and outcome gaps.

Figure 8

 Slide6

Figure 9

Slide7

Now, here are the adjusted achievement gaps. New Jersey’s achievement gap is only above average for one measure – NAEP reading grade 8. At the fourth grade level, New Jersey’s achievement gaps are among the smallest in the country!

Figure 10

Slide8

Figure 11

Slide9

Summary?

To summarize:

  • NJ schools do better than expected on NAEP given statewide poverty rates, ranking among the highest states.
  • NJ schools have gained more on NAEP than nearly all other states (when correcting for starting point)
  • NJ’s 8th grade achievement gaps are relatively average (when correcting for income gaps). The only NJ achievement gap that is greater than average is grade 8 reading.
  • NJ’s 4th grade achievement gaps are among the smallest among states (when correcting for income gaps)

So congratulations, NJ… you’re doin’ pretty well. That’s not to say by any stretch of the imagination that we should be complacent. We’ve still got Massachusetts to catch up to in most cases. They, not Tennessee or Louisiana, are giving us a run for our money. And as I pointed out in my most recent post, we need to give serious consideration to reinvesting in our neediest communities. Prior investments (including early childhood programs) may provide a partial explanation for why our fourth grade achievement gaps are so relatively small. But we’ve backed off substantially on funding fairness in recent years, the consequences of which are yet to be measured.

On Teacher Effect vs. Other Stuff in New Jersey’s Growth Percentiles

PDF: BBaker.SGPs_and_OtherStuff

In this post, I estimate a series of models to evaluate variation in New Jersey’s school median growth percentile measures. These measures of student growth are intended by the New Jersey Department of Education to serve as measures of both school and teacher effectiveness. That is, the effect that teachers and schools have on marginal changes in their median student’s test scores in language arts and math from one year to the next, all else equal. But all else isn’t equal and that matters a lot!

Variations in student test score growth estimates, generated either by value-added models or growth percentile methods, contain three distinct parts:

  1. “Teacher” effect: Variations in changes in numbers of items answered correctly that may be fairly attributed to specific teaching approaches/ strategies/ pedagogy adopted or implemented by the child’s teacher over the course of the school year;
  2. “Other stuff” effect: Variations in changes in numbers of items answered correctly that may have been influenced by some non-random factor other than the teacher, including classroom peers, after school activities, health factors, available resources (class size, texts, technology, tutoring support), room temperature on testing days, other distractions, etc;
  3. Random noise: Variations in changes in numbers of items answered correctly that are largely random, based on poorly constructed/asked items, child error in responding to questions, etc.

In theory, these first two types of variations are predictable. I often use a version of Figure 1 below when presenting on this topic.

We can pick up variation in growth across classrooms, which is likely partly attributable to the teacher and partly attributable to other stuff unique to that classroom or school. The problem is that, since the classroom (or school) is the unit of comparison, we really can’t sort out what share is what.

Figure 1

Slide1

We can try to sort out the variance by adding more background measures to our model, including student individual characteristics, student group characteristics, class sizes, etc., or by constructing more intricate analyses involving teachers who switch settings. But we can never really get to a point where we can be confident that we have correctly parsed that share of variance attributable to the teacher versus that share attributable to other stuff. And the most accurate, intricate analyses can rarely be applied to any significant number of teachers.
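A simple simulation illustrates the identification problem. The variance shares below are assumed for illustration, not estimated from New Jersey data; the point is that anything constant within a classroom is statistically inseparable from the teacher.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classrooms, n_students = 200, 25

# Assumed variance components (illustrative only)
teacher = rng.normal(0, 1.0, n_classrooms)  # true teacher effect
other = rng.normal(0, 1.0, n_classrooms)    # classroom-level "other stuff"
noise = rng.normal(0, 2.0, (n_classrooms, n_students))  # student-level noise

growth = teacher[:, None] + other[:, None] + noise

# Averaging across students shrinks the noise, but the teacher effect and
# classroom-level other stuff remain perfectly confounded in the mean.
class_means = growth.mean(axis=1)
print(np.corrcoef(class_means, teacher + other)[0, 1])  # near 1.0
print(np.corrcoef(class_means, teacher)[0, 1])          # well below 1.0
```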

Thankfully, to make our lives easier, the New Jersey Department of Education has chosen not to try to parse the extent to which variation in teacher or school median growth percentiles is influenced by other stuff. They rely instead on two completely unfounded, thoroughly refuted claims:

  1. By accounting for prior student performance (measuring “growth” rather than level) they have fully accounted for all student background characteristics (refuted here[1]); and
  2. Thus, any uneven distribution of growth percentiles, for example, lower growth percentiles in higher poverty schools, is a true reflection of the distribution of teacher quality (refuted here[2]).

In previous analyses I have explored predictors of New Jersey growth percentiles at the school level, including the 2012 and 2013 school reports. Among other concerns, I have found that the year over year correlation (across schools) between growth percentiles is only slightly stronger than the correlation between growth percentiles and school poverty.[3] That is, NJ SGPs tend to be about as correlated with other stuff as they are with themselves year over year. One implication of this finding is that even the year-over-year consistency is merely consistently measuring the wrong effect year over year. That is, the effect of poverty.

In the following models, I take advantage of a richer data set, combining a) school report card measures, b) school enrollment characteristics and c) detailed statewide staffing files into one multi-year data set, which includes outcome measures (SGPs and proficiency rates), enrollment characteristics (low income shares, ELL shares) and resource measures derived from the staffing files.

Following are what I would characterize as exploratory regression models, using 3 years of measures of student populations, resources and school features as predictors of 2012 and 2013 school median growth percentiles.

Resource measures include:

  • Competitiveness of wages: a measure of how much teachers’ actual wages differ from predicted wages for all teachers in the same labor market (metro area) in the same job code, with the same total experience and degree level (estimated via regression model; see the sketch after this list). This measure indicates the wage premium (>1.0) or deficit (<1.0) associated with working in a given school or district. This measure is constant across all same job code teachers across schools within a district. This measure is created using teacher level data from the fall staffing reports from 2010 through 2012.
  • Total certified teaching staff per pupil (staffing intensity): This measure is created by summing the full time certified classroom teaching staff for each school and dividing by the total enrolled pupils. This measure is created using teacher level data from the fall staffing reports from 2010 through 2012.
  • % Novice teachers with only a bachelor’s degree: This measure also focuses on classroom teachers, taking the number with fewer than 3 years of experience and only a bachelor’s degree and dividing by the total number of classroom teachers. This measure is created using teacher level data from the fall staffing reports from 2010 through 2012.
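For readers curious how the wage competitiveness measure can be built, here is a rough sketch under simplifying assumptions. The data frame and column names are hypothetical stand-ins for the staffing file fields, not the actual variables used in the estimation.

```python
import pandas as pd
import statsmodels.formula.api as smf

def wage_competitiveness(teachers: pd.DataFrame) -> pd.Series:
    """teachers: one row per teacher, with hypothetical columns
    salary, experience, degree, job_code, metro, district."""
    # Predict each teacher's wage from experience, degree, job code and
    # labor market, then take the ratio of actual to predicted wages.
    model = smf.ols(
        "salary ~ experience + C(degree) + C(job_code) + C(metro)",
        data=teachers,
    ).fit()
    ratio = teachers["salary"] / model.predict(teachers)
    # District average ratio: >1.0 indicates a wage premium, <1.0 a deficit.
    return ratio.groupby(teachers["district"]).mean()
```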

I have pointed out previously that it would be inappropriate to consider a teacher or school to be failing, or successful for that matter, simply because of the children they happen to serve. Bias in the estimates with respect to student population characteristics is a huge validity concern regarding the intended uses of New Jersey’s growth percentile measures.

The potential influence of resource variations presents a comparable validity concern, though the implications vary by resource measure. If we find, for example that teachers receiving a more competitive wage are showing greater gains, we might assert that the wage differential offered by a given district is leading to a more effective teacher workforce. A logical policy implication would then be to provide resources to achieve wage premiums in schools and districts serving the neediest children, and otherwise lagging most on measures of student growth.

Of course, schools having more resources for use in one way – wages – may also have other advantages. If we find that overall staffing intensity is a significant predictor of student growth, it would be unfair to assert that the growth percentiles reflect teacher quality when growth in some schools is greater than in others because of more advantageous staffing ratios. Rather than firing the teachers in the schools producing low growth, the more logical policy response would be to provide those schools the additional resources to achieve similarly advantageous staffing ratios.

With these models, I also test assumptions about variations across schools within larger and smaller geographic areas – counties and cities. This geography question is important for a variety of reasons.

New Jersey is an intensely racially and socioeconomically segregated state. Most of that segregation occurs between municipalities far more so than within municipalities. That is, one is far more likely to encounter rich and poor neighboring school districts than rich and poor schools within districts. Yet education policy in New Jersey, like elsewhere, has taken a sharp turn toward reforms which merely reshuffle students and resources among schools (charter and district) within cities, pulling back significantly from efforts to target additional resources to high need settings.

Figure 2 shows that from the early 1990s through about 2005, New Jersey placed significant emphasis on targeting additional resources to higher poverty school districts. Since that time, New Jersey’s school funding progressiveness has backslid dramatically. And these are the very resources needed for districts – especially high need districts – to provide wage differentials to recruit and retain a high quality workforce, coupled with sufficient staffing ratios to meet their students’ needs.

Figure 2

Slide2

Findings

Table 1 shows the estimates from the first set of regression models which identify predictors of cross school and district, within county variation in growth percentiles. The four separate models are of language arts and math growth percentiles (school level) from the 2012 and 2013 school report cards. These models show that:

Student Population Other Stuff

  1. % free lunch is significantly, negatively associated with growth percentiles for both subjects and both years. That is, schools with higher shares of low income children have significantly lower growth percentiles;
  2. When controlling for low income concentrations, schools with higher shares of English language learners have higher growth percentiles on both tests in both years;
  3. Schools with larger shares of children already at or above proficiency tend to show greater gains on both tests in both years;

School Resource Other Stuff

  1. Schools with more competitive teacher salaries (at constant degree and experience) have higher growth percentiles on both tests in both years.
  2. Schools with more full time classroom teachers per pupil have higher growth percentiles on both tests in both years.

Other Other Stuff

  1. Charter schools have neither higher nor lower growth percentiles than otherwise similar schools in the same county.

TABLE 1. Predicting within County, Cross School (cross district) Variation in New Jersey SGPs

Slide3

*p<.05, **p<.10

TABLE 2. Predicting within City Cross School (primarily within district) Variation in New Jersey SGPs

Slide4

*p<.05, **p<.10

Table 2 includes a fixed effect for city location. That is, Table 2 runs the same regressions as in Table 1, but compares schools only against others in the same city. In most cases, because of municipal/school district alignment in New Jersey, comparing within the same city means comparing within the same school district. But, using city as the unit of analysis permits comparisons of district schools with charter schools in the same city.
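In regression terms, this within-city comparison amounts to adding a city fixed effect to the Table 1 specification. A minimal sketch, again assuming a hypothetical school-level data frame with the column names shown:

```python
import pandas as pd
import statsmodels.formula.api as smf

def within_city_model(schools: pd.DataFrame):
    """schools: one row per school-year, with hypothetical columns
    mgp, pct_free_lunch, pct_ell, pct_proficient_prior,
    staff_per_pupil, avg_experience, charter, city."""
    # C(city) absorbs all between-city differences, so the remaining
    # coefficients reflect only variation across schools in the same city.
    return smf.ols(
        "mgp ~ pct_free_lunch + pct_ell + pct_proficient_prior"
        " + staff_per_pupil + avg_experience + charter + C(city)",
        data=schools,
    ).fit()
```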

In Table 2 we see that student population characteristics remain the dominant predictor of growth percentile variation. That is, across schools within cities, student population characteristics significantly influence growth percentiles. But the influence of demography on destiny, shall we say (as measured by SGPs), is greater across cities than within them, an entirely unsurprising finding. Resource variations within cities show few significant effects. Notably, our wage index measure does not vary within districts but rather across them and was replaced in these models by a measure of average teacher experience. Again, there was no significant difference in average growth achieved by charters than by other similar schools in the same city.

Preliminary Policy Implications

The following preliminary policy implications may be drawn from the preceding regressions.

Implication 1: Because student population characteristics are significantly associated with SGPs, the SGPs are measuring differences in the students served rather than, or at the very least in addition to, differences in collective (school) teacher effectiveness. As such, it would simply be wrong to use these measures in any consequential way to characterize either teacher or school performance.

Implication 2: SGPs reveal positive effects of substantive differences in key resources, including staffing intensity and competitive wages. That is, resource availability matters and teachers in settings with access to more resources are collectively achieving greater student growth. SGPs cannot be fairly used to compare school or teacher effectiveness across schools and districts where resources vary.

These findings provide support for a renewed emphasis on progressive distribution of school funding – that is, providing the opportunity for schools and districts serving higher concentrations of low income children, and achieving lower current growth, to provide the wage premiums and staffing intensity required to offset these deficits.[4]

Implication 3: The presence of stronger relationships between student characteristics and SGPs across schools and districts within counties, versus across schools within individual cities, highlights the reality that between district (between city) segregation of students remains a more substantive equity concern than within city segregation of students across schools.

As such, policies which seek merely to reshuffle students across charter and district schools within cities and without attention to resources are unlikely to yield any substantive positive effect in the long run. In fact, given the influence of student sorting on the SGPs, sorting students within cities into poorer and less poor clusters will likely exacerbate within city achievement gaps.

Implication 4: The presence of significant resource effects across schools and districts within counties, but the lack of resource effects across schools within cities, reveals that between district disparities in resources, coupled with sorting of students and families, remain a significant concern, and a more substantive concern than within district inequities. Again, this finding supports a renewed emphasis on targeting additional resources to districts serving the neediest children.

Implication 5: Charter schools do not vary substantively on measures of student growth from other schools in the same county or city when controlling for student characteristics and resources. As such, policies assuming that “chartering” in-and-of-itself (without regard for key resources) can improve outcomes are likely misguided. This is especially true where such policies do little more than reshuffle low and lower income minority students across schools within city boundaries.

====================

[1]http://njedpolicy.wordpress.com/2013/05/02/deconstructing-disinformation-on-student-growth-percentiles-teacher-evaluation-in-new-jersey/

[2]https://schoolfinance101.wordpress.com/2014/04/18/the-endogeneity-of-the-equitable-distribution-of-teachers-or-why-do-the-girls-get-all-the-good-teachers/

[3]https://schoolfinance101.wordpress.com/2014/01/31/an-update-on-new-jerseys-sgps-year-2-still-not-valid/

[4]This finding also directly refutes the dubious assertion by NJDOE officials in their 2012 school funding report that the additional targeted funding was not only doing no good, but potentially causing harm and inducing inefficiency. https://schoolfinance101.wordpress.com/2012/12/18/twisted-truths-dubious-policies-comments-on-the-njdoecerf-school-funding-report/

The Cuomology of State Aid (or Tales from Lake Flaccid)

We’ve heard much bluster over time about school funding in New York State, and specifically how money certainly has no role in the policy debate over how to fix New York State schools, unless it has to do with providing more money to charter schools, or decrying the fact that district schools statewide are substantially over-funded.  See for example, this wonderfully absurd rant from DFER.

Here’s a recap of other posts I’ve written in recent years on NY school finance:

  1. On how New York State crafted a low-ball estimate of what districts needed to achieve adequate outcomes and then still completely failed to fund it.
  2. On how New York State maintains one of the least equitable state school finance systems in the nation.
  3. On how New York State’s systemic, persistent underfunding of high need districts has led to significant increases of numbers of children attending school with excessively large class sizes.
  4. On how New York State officials crafted a completely bogus, racially and economically disparate school classification scheme in order to justify intervening in the very schools they have most deprived over time.

I also recently wrote about this interesting trend in NY state school finance and NY state outcome standards – specifically that the state continues to raise the bar on outcomes while lowering the funding target intended to be sufficient for meeting those outcomes.

As I previously explained, regarding outcome standards:

Put simply, higher student outcome standards cost more to achieve, not less. As explained above, the New York State school finance formula is built on an underlying basic cost estimate of what it would take for a low need (no additional student needs) district to achieve adequate educational outcomes as measured on state assessments. The current formula is built on average spending estimates dating back several years now and based on prior outcome standards, tied to a goal of achieving 80% proficient or higher. More than once in the past several years, the state has substantively increased the measured outcome standards.

For 2010, the Regents adjusted the assessment cut scores to address the inflation issue, and as one might expect proficiency rates adjusted accordingly. The following figure shows the rates of children scoring at level 3 or 4 in 2009 and again in 2010. I have selected a few key, rounded, points for comparison. Districts where 95% of children were proficient or higher in 2009 had approximately 80% in 2010. Districts that had 80% in 2009 had approximately 50% in 2010. This means that the operational standard of adequacy using 2009 data was equivalent to 50% of children scoring level 3 or 4 in 2010. This also means that if we accept as reasonable, a standard of 80% at level 3 or 4 in 2010, that was equivalent to 95% – not 80% – in 2009.

Slide1

This next figure shows the resulting shift from the change in assessments from 2012 to 2013, also for 8th grade math. Again, I’ve applied ballpark cutpoint comparisons. Here, a school where 60% were proficient in 2012 was likely to have 20% proficient in 2013. A school where 90% were proficient in 2012 was likely to have 50% proficient in 2013. If, as state policymakers argue, the 2013 assessments do more accurately represent the standard for college readiness, and thus the constitutional standard of a meaningful high school education, it is quite likely that the cost of achieving that constitutional standard is much higher than previously estimated. Notably, only a handful of schools surpass the 80% threshold on math proficiency for the 2013 assessments.

Slide2
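Reading equivalent cut points off these scatterplots amounts to interpolating along the relationship between old and new proficiency rates. A minimal sketch, using made-up matched rates rather than the actual New York distributions:

```python
import numpy as np

# Hypothetical matched proficiency rates for the same districts under the
# old and new assessments, sorted by old rate (illustrative values only).
old_rate = np.array([40, 50, 60, 70, 80, 90, 95, 99])
new_rate = np.array([8, 12, 20, 33, 50, 65, 80, 92])

# What new-assessment rate corresponds to 80% under the old assessment?
print(np.interp(80, old_rate, new_rate))  # ~50, as in the 2009-to-2010 shift
# And what old rate was equivalent to 80% under the new assessment?
print(np.interp(80, new_rate, old_rate))  # ~95
```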

Yet, as I also explained in that previous post, while it appears that the state has been chipping away at funding gaps for districts including New York City, it has not done so by substantively increasing funding, but rather by decreasing the adequate funding target. This figure shows that the underlying basic cost figure for the foundation aid formula climbed gradually as planned through 2012-13. Note that this climb was based on the assumed 80% success rate on the 2007-08 outcome standard, not considering the 2009-10 adjustment to that outcome standard. But inexplicably, the state has chosen to reduce the basic funding figure each year since, despite raising the outcome standards dramatically.

Slide4

In their final adopted 2014-15 budget, state legislators did come through with some more funding for high need districts. That should not be discounted entirely. BUT… the state has played numerous games of late with the funding formula to far overstate its accomplishments – one of which is lowering the target. Of course it’s easier to slam dunk when you lower the rim. But even then, there’s no slam dunking going on here. Let’s take a look at the structure of funding and remaining gaps from FY14 to FY15.

This figure shows that for low need districts, the funding gap (to target funding) in 2014 was about $1,200 per pupil, and that gap was reduced by about $200 per pupil. For high need districts, the gap was over $3,400 per pupil and was reduced to around $2,600 per pupil, a seemingly large reduction. But, first, remember that the target was lowered. And second, take a look at the green/blue numbers. State aid was indeed increased in high need districts by about $300 per pupil, but the required local contribution for these districts was increased by about $400 per pupil!

Slide3
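Laying out the accounting makes the trick plain. The per pupil figures below are the rounded, ballpark values read off these charts, so they will not reconcile to the dollar.

```python
# The funding gap identity, per pupil:
#   gap = adequacy target - (state aid + required local contribution)
# so a shrinking gap can reflect more state aid, a heavier required local
# burden, or simply a lowered target.
d_state_aid = 300       # state aid increase for high need districts
d_required_local = 400  # increase in required local contribution
d_target = -300         # rough effect of lowering the base (a ~$2,639 gap
                        # that would be over ~$2,900 had the bar held)

d_gap = d_target - (d_state_aid + d_required_local)
print(f"change in measured gap: {d_gap:+,} per pupil")
# Only the state aid piece is new state money; the rest is shifted local
# burden and a lowered bar.
```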

Yeah… that’s right… you tell ’em Andy – Go pay for it yourself… uh… except that… uh… we told you that you can’t actually raise your local levy more than 2%. And if you do, we’ll penalize you! Damnit! So there!

Let’s peel this back a bit. First, here’s the effect of the lowering of the base on the funding gaps. That gap, which appears to be only about $2,639 in 2015 for high need districts, would still be over $2,900 if the state hadn’t lowered the bar! In fact, the bar should have been rising if for no other reason than basic inflationary adjustment, setting aside the higher standards.

Slide5

Now it would be one thing if local contributions were generally lagging in high need districts. Indeed, some districts like Poughkeepsie have historically had lower than average local effort. But, many of these districts are so property poor and low income that raising local taxes generates little additional revenue. Here’s the average local effort by need group, from 2011-12.

Slide6

And here’s a look at the changes in required local contribution. Yeah… that’s right… if you’re a low need district, on average, your required local contribution goes down slightly. And if you’re a high need district, your local contribution skyrockets!

Slide7

So, should NY bureaucrats really be patting themselves on the back for their great accomplishments at closing funding gaps? Well, I’m certainly willing to give credit for the fact that the state did increase aid to a greater degree in districts with greater need.

But, let’s be clear here:

1) the state has still barely put a dent in the funding gaps that exist between actual foundation aid and funding targets defined by the state’s own (woefully inadequate) foundation aid formula.

2) the state has placed an additional burden on high need districts that many likely can’t even (or wouldn’t be allowed to) meet.

3) the state has engaged in egregious manipulation of state aid runs in their efforts to deceive local district officials and the general public regarding the adequacy of state aid.

And that’s just wrong!

Uncommon Denominators: Understanding “Per Pupil” Spending

This post is another in my series on data issues in education policy. The point of this post is to encourage readers of education policy research to pay closer attention to the fact that any measure of “per pupil spending” contains two parts – a measure of “spending” in the numerator and a measure of “pupils” in the denominator.

Put simply, both measures matter, and matching the right numerator to the right denominator matters.

Below are a few illustrations of why it’s important to pay attention to both the numerator and the denominator when considering either variation across settings in education spending or variation over time.

Declining Enrollment Growth and Exploding Spending!

First, it is important to understand that when the ratio of spending to pupils is growing over time, that growth may be a function of increasing expenditures in the numerator, declining pupils in the denominator, or both. Usually, both parts are moving simultaneously, making interpretation more difficult. The State of Vermont over the past 20 years makes a fun example.

Vermont is among the highest per pupil spending states in the nation, and Vermont’s per pupil spending has continued to grow at a relatively fast pace over the past 20 years. Figure 1 shows Vermont’s per pupil spending growth (not adjusted for inflation, because choice of an inflator adds another level of complexity) in the upper half of the figure.

But, the lower half of the figure shows Vermont’s enrollment over the same period.

Figure 1. Vermont per Pupil Spending and Enrollments

Slide2

Clearly, given the dramatic statewide enrollment decline, even if total revenue and spending remained constant, or lagged significantly in their decline, per pupil spending would continue to grow.

Figure 2 breaks out the year over year growth rates of a) total revenue, b) enrollments and c) revenue per pupil. The math is pretty simple here, and the issue almost too obvious to bother with on this blog… but the point here is that if enrollment is declining by 2% annually, and total revenue (or spending) increasing by 4% to 6%, then per pupil revenue will increase by 6% to 8%.

Figure 2. Vermont % Change in Spending and Enrollment

Slide3
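The arithmetic, as a one-line sketch:

```python
def per_pupil_growth(spending_growth: float, enrollment_growth: float) -> float:
    """Growth in spending per pupil implied by growth in the numerator
    (spending) and the denominator (pupils)."""
    return (1 + spending_growth) / (1 + enrollment_growth) - 1

# 4% to 6% total spending growth against a 2% annual enrollment decline:
print(f"{per_pupil_growth(0.04, -0.02):.1%}")  # ~6.1%
print(f"{per_pupil_growth(0.06, -0.02):.1%}")  # ~8.2%
```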

Yes, that’s all pretty simple and seemingly obvious. But, that doesn’t stop many from simply looking at per pupil spending growth as if it all represents spending growth. 8% annual growth likely plays differently to a political audience than 4% or 6% growth. Both parts are moving and we can’t forget that. Further, because the provision of education involves a mix of fixed, step and variable costs, we can’t expect spending changes to track perfectly with enrollment changes over time. But yes, we can and should expect appropriate adjustments down the line to accommodate the pupils that need to be served.

Equity Implications of Alternative Denominators: The ADA Game

I’ve written previously on this blog about different measures of student enrollment used in state school finance formulas, which are also used in presenting per pupil spending. A handful of states rely on “Average Daily Attendance” as a basis for providing state aid, and in turn as the method by which they report per pupil spending. As I’ve explained in previous posts, Average Daily Attendance measures vary systematically with respect to poverty, compared with enrollment measures. That is, on average, among those enrolled in a school, attendance rates tend to be lower on a daily basis in schools serving more low income and minority students. So, if one uses these measures to drive state aid to local districts, the result is systematically lower state aid in higher poverty, higher minority districts. But, if one uses these same measures to report per pupil spending, then no harm no foul… or so it seems.
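A toy example, with two hypothetical districts, shows how the choice of denominator plays out:

```python
# Identical spending and enrollment; only attendance rates differ
# (attendance tends to be lower in higher poverty districts).
districts = [
    {"name": "Low poverty", "spending": 50_000_000, "enrolled": 5000,
     "attendance_rate": 0.96},
    {"name": "High poverty", "spending": 50_000_000, "enrolled": 5000,
     "attendance_rate": 0.90},
]

for d in districts:
    ada = d["enrolled"] * d["attendance_rate"]
    print(f"{d['name']}: per enrolled pupil ${d['spending'] / d['enrolled']:,.0f}, "
          f"per pupil in ADA ${d['spending'] / ada:,.0f}")
# Per-ADA figures overstate spending most where attendance is lowest, and
# if ADA drives the aid formula itself, the high poverty district receives
# less money for the same number of enrolled students.
```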

As an aside, when pushed to rationalize financing schools on the basis of attendance, state policymakers often suggest that the purpose of the policy is to create an incentive for school officials to increase attendance rates.[i] The problems with this argument are many. First, local public school districts remain responsible for providing the resources to educate all eligible enrolled children. While 90% may be in attendance on any given day, and while some children may be absent more than others, the same 90% are not in attendance every day. In all likelihood, 100% of eligible enrolled children attend at some point in the year. Second, depriving local public school districts of state aid lessens their capacity to provide interventions that might lead to improved attendance rates. Third, many school absences are simply beyond the control of local public school officials. This is particularly the case for poverty-induced, chronic health related absences. Finally, there exists little or no sound empirical evidence that this approach provides an effective incentive.[ii]

Figure 3 provides an illustration of how different per pupil spending figures look across Texas districts when reported by a) enrollment and b) average daily attendance, with respect to shares of low income children. First, because few if any districts have perfect average daily attendance, the green dots (spending per enrolled pupil) sit below the orange dots (spending per pupil in average daily attendance). Further, while spending per pupil in average daily attendance appears higher in higher poverty districts than in lower poverty ones, the difference is much smaller for spending per enrolled pupil.

Figure 3. Per Pupil Spending and Low Income Concentrations in Texas


Figure 4 provides an alternative view, collapsing the data into low income quintiles.

Figure 4. Per Pupil Spending by Low Income Quintile in Texas


And it is certainly relevant that the districts in question here are obligated not merely to serve those who show up on a given day, but to have resources available for all of the children for whom they are held responsible. That is, those enrolled.

Matching the Numerator and Denominator: My expenditures on your pupils?

Finally, I’d like to address the somewhat more convoluted issue of matching the right numerator to the right denominator, especially when making spending comparisons across schools or districts.

I wrote extensively here about making comparisons between brick and mortar vs. online schools.

And I wrote extensively here about making comparisons between charter schools and district schools in New York City.

The increasingly complex interdependencies between host districts and charter schools create significant confusion when comparing per pupil spending between host district and charter schools. In a recent report, I provide explanations of common (and, by the 3rd or 4th iteration, likely intentional) mistakes. Here is one version of my critique of the Ball State study, which appears in footnote 22, page 49, of this report: http://nepc.colorado.edu/files/rb-charterspending_0.pdf

For example, under many state charter laws, host districts or sending districts retain responsibility for providing transportation services, subsidizing food services, or providing funding for special education services. Revenues provided to host districts to provide these services may show up on host district financial reports, and if the service is financed directly by the host district, the expenditure will also be incurred by the host, not the charter, even though the services are received by charter students.

Drawing simple, direct comparisons thus can result in a compounded error: Host districts are credited with an expense on children attending charter schools, but children attending charter schools are not credited to the district enrollment. In a per pupil spending calculation for the host districts, this may lead to inflating the numerator (district expenditures) while deflating the denominator (pupils served), thus significantly inflating the district's per pupil spending. Concurrently, the charter expenditure is deflated.

Correct budgeting would reverse those two entries, essentially subtracting the expense from the budget calculated for the district, while adding the in-kind funding to the charter school calculation. Further, in districts like New York City, the city Department of Education incurs the expense of providing facilities to several charters. That is, the City's budget, not the charter budgets, incurs yet another expense that serves only charter students. The Ball State/Public Impact study errs egregiously on all fronts, assuming in each and every case that the revenue reported by charter schools versus traditional public schools provides the same range of services, and provides those services exclusively for the students in that sector (district or charter).
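To see the compounded error arithmetically, here is a simplified sketch (in Python, with hypothetical numbers, not drawn from any actual district or from the studies discussed):

```python
# Hypothetical host district and charter; the district incurs $5M in
# transportation and special education expense on behalf of charter students.
district_spending = 500_000_000   # includes the $5M spent on charter students
district_pupils = 45_000          # charter pupils are NOT in this count
charter_spending = 60_000_000
charter_pupils = 5_000
services_for_charter = 5_000_000  # incurred by district, received by charter students

# Naive comparison: district numerator inflated, charter numerator deflated.
print(district_spending / district_pupils)   # ~$11,111 per pupil
print(charter_spending / charter_pupils)     # $12,000 per pupil

# Corrected: charge the in-kind expense to the sector whose pupils receive it.
print((district_spending - services_for_charter) / district_pupils)  # $11,000
print((charter_spending + services_for_charter) / charter_pupils)    # $13,000
```

Under the naive calculation the two sectors look roughly comparable; under the corrected one, the charter is spending about $2,000 more per pupil than the host district.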

Here's a relatively straightforward, albeit incomplete, illustration. Figure 5 shows that in many states, like New York, Connecticut or New Jersey, the relationship between district host and charter spending creates significant problems in matching numerators and denominators. In many states, as explained above, host districts retain responsibility for spending on such things as charter student transportation or special education. Districts within states may opt for different approaches to transportation financing, and some districts may opt to provide funding for centralized enrollment management or for facilities co-locations. The costs of providing these services typically remain on the ledger of the district. That is, they are in the district's numerator, even when the pupils are removed from the denominator. This makes the resulting per pupil spending comparisons, well, simply wrong.

Figure 5. The Conceptual Problem with Matching Numerators and Denominators – Charter Spending Comparisons


Connecticut is one state where responsibility for transportation and special education expense is retained by the district (while many CT charters serve very few children with disabilities to begin with). Figure 6 below provides an illustration of how charter to host district spending comparisons differ when one removes special education and transportation expenses from the districts' numerator. When these expenses are included in district spending, district spending is somewhat higher than charter spending; when they are removed, district spending is lower in both cases.

Figure 6. Matching spending responsibilities for more accurate comparisons


Notably, this is far from a complete analysis. It is merely illustrative. Similar problems exist with reported data on charter school revenues and spending in New Jersey.

In New York City, the Independent Budget Office has produced a handful of useful reports on making relevant comparisons there.

============

Note: Charter advocates often argue that charters are most disadvantaged in financial comparisons because charters must often cover, out of their annual operating expenses, the costs associated with leasing facilities space. Indeed it is true that charters are not afforded the ability to levy taxes to carry public debt to finance construction of facilities. But it is incorrect to assume, when comparing expenditures, that facilities for traditional public schools are already paid for and carry no associated costs, while charter schools must bear the full burden of leasing at market rates (essentially an "all versus nothing" comparison). First, public districts do have ongoing maintenance and operations costs of facilities, as well as payments on debt incurred for capital investment, including new construction and renovation. Second, charter schools finance their facilities through a variety of mechanisms, with many in New York City operating in space provided by the city, many charters nationwide operating in space fully financed with private philanthropy, and many holding lease agreements for privately or publicly owned facilities. (for more, see: http://nepc.colorado.edu/files/rb-charterspending_0.pdf, pp. 49-50)

==============

[i] Recently, when New Jersey slipped the attendance factor into the determination of state aid, Education Commissioner Chris Cerf argued:

“When you look at the (difference) between the number of children on the rolls and the number of children in some of these schools, it can be very distressing,” Cerf said. “Pushing these districts to do everything in their power to get kids to attend class is good.” http://blogs.app.com/capitolquickies/2012/04/24/cerf-said-push-districts-to-get-kids-in-school/

[ii] A study published in the Spring 2013 issue of the Journal of Education Finance purports to find positive effects on attendance and graduation rates in states with a "strong incentive" enrollment basis for funding, with particular emphasis on states relying on average daily attendance, but it lumps these together with the many (most) states using an average daily membership figure. Most problematically, the study draws its main conclusion from aggregate, cross-sectional state analyses, applies unsatisfyingly ambiguous classifications of state school finance count methods, and uses an approach that cannot separate finance policy effects from other contextual differences across states.

The final study is published here: Ely, Todd L., and Mark L. Fermanich. “Learning to count: school finance formula count methods and attendance-related student outcomes.” Journal of Education Finance 38.4 (2013): 343+

An earlier draft is available here: http://www.aefpweb.org/sites/default/files/webform/Fermanich_Ely_AEFP_2012.pdf

On “Dropout Factories” & (Fraudulent) Graduation Rates in NJ

This NJ Star Ledger piece the other day reminded me of an issue I've been wanting to check out for some time now. I'm skeptical of graduation rates as a measure of student outcomes to begin with because, of course, graduation can be strongly influenced by local norms and practices. As such, it's really hard to validly compare graduation rates from one place to another, or even over time, as graduation standards may change. Notably, arbitrary assignment of "passing" cut scores on high stakes assessments isn't particularly helpful and can be quite harmful. But I digress.

What piqued my interest a while back was the apparent disconnect between cohort attrition measures from 9th to 12th grade, or 10th to 12th grade, and reported graduation rates. Indeed, these are two different things. BUT, it seems strange, for example, that North Star Academy in Newark could report a 100% graduation rate! and a .3% dropout rate! while having an approximately 50% attrition rate between grades 5 and 12! How can you lose half your kids over time and still have 100% graduation and effectively no dropouts? Of course the answer is that none of these students are counted as dropouts; rather, they are voluntary transfers (with no follow up to determine where they've gone or what happened to them).

In any case, it seemed, at best, a bit disingenuous and, at worst, outright fraudulent for North Star to present itself as near perfect, when a deeper dive into the data (something North Star's own data-driven leaders fail to ever report) suggests otherwise.

Here, I quickly explore the significance of this issue across charter and district schools statewide.

First, let's look at 2013 graduation rates and the 2012-13 fall enrollment cohorts as seniors relative to themselves as freshmen.

As the key indicates, orange dots are district cohort ratios, representing the senior class of 2013 as a percent of its size as the freshman class of 2009-10. Green dots are graduation rates for the same district schools. Blue circles are reported graduation rates for charters, and red squares are charter cohort ratios. The trendline is fit to charter and district school cohort ratios combined. In most cases, the cohort ratios are lower than the reported grad rates, but not by a whole lot. For TEAM Academy the two are close enough to overlap. For Central Jersey College Prep and University Academy, there appears to be a differential of about 5 to 7%.
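For those who want to replicate the basic comparison, here is a minimal sketch (in Python, with invented counts, not actual NJ data) of why the two measures can diverge so sharply: reported graduation rates remove "transfers" from the cohort denominator, while the cohort ratio does not.

```python
freshmen_2009 = 100  # fall 2009-10 freshman class
transfers_out = 45   # students who left, with no follow-up on where they went
seniors_2013 = freshmen_2009 - transfers_out  # 55 remaining as seniors
graduates = 55                                # all remaining seniors graduate

cohort_ratio = seniors_2013 / freshmen_2009              # 0.55
grad_rate = graduates / (freshmen_2009 - transfers_out)  # 1.00

print(f"cohort ratio: {cohort_ratio:.0%}")           # 55%
print(f"reported graduation rate: {grad_rate:.0%}")  # 100%
```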

But for North Star, the gap is huge. If we evaluate North Star on its reported graduation rate, the school looks great. Nearly perfect! But even compared to other schools statewide, on the same measure of cohort loss, North Star is no leader. Rather, it's a laggard (not a Paterson Sci/Tech or Hoboken laggard, but a laggard nonetheless).

Let's take a look now at the 2012 and 2013 graduation rates, averaged, and the last 3 cohorts of sophomores to seniors, averaged, just to see whether the single year estimates above are anomalous.

Taking the two years of grad rates and three cohorts actually reveals that North Star a) falls even further below the trendline (worse than the average district school) AND b) still has a massive gap between its reported grad rate and cohort loss. TEAM now also has a gap, but that gap is smaller than North Star's. Indeed, it is possible that TEAM is backfilling enrollments (adding kids in high school to fill empty seats), but I'll leave it to Ryan Hill, chief exec of TEAM, to let me know if that's the case.

Now, it's certainly also possible that district schools in Newark are adding kids in upper grades as they exit from North Star or other charter or magnet schools. It is far less likely that many of these students are shifting to selective private schools (an upward, outward transfer) after the 10th grade.

Finally, let’s take a look at the gaps between reported graduation rates and cohort ratios, again using the last three sophomore to senior cohorts and last two years of graduation rates.

Consider this a test of the legitimacy of using the graduation rate to characterize the extent to which schools actually help students persist toward high school completion. The above graph suggests that North Star's graduation rate is overstated by 10 to 15% averaged over time, and that its graduation rate is far more inflated than nearly any other school's, except American History High. TEAM is actually quite low in average difference between cohort attrition and reported graduation rate. The other high outlier here is Central Jersey College Prep.
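That gap measure, sketched with hypothetical values:

```python
grad_rates = [0.98, 1.00]           # last two years of reported grad rates
cohort_ratios = [0.84, 0.87, 0.86]  # last three sophomore-to-senior cohorts

gap = (sum(grad_rates) / len(grad_rates)
       - sum(cohort_ratios) / len(cohort_ratios))
print(f"reported grad rate exceeds cohort persistence by {gap:.1%}")  # ~13.3%
```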

Again, there can be a number of enrollment flow/transfer reasons for the gaps between cohort attrition and reported graduation rate. But, at the very least these figures should be regularly reported and used as a basis for evaluating the validity of reported graduation rates.


Welcome to Relay Medical College & North Star Community Hospital

Arne Duncan has one whopper of an interview available here: http://www.msnbc.com/andrea-mitchell-reports/watch/better-preparing-our-nations-educators-237066307522

Related to his new push to evaluate teacher preparation programs using student outcome data: https://schoolfinance101.wordpress.com/2014/04/25/arne-ology-the-bad-incentives-of-evaluating-teacher-prep-with-student-outcome-data/

And the White House press release can be found here: http://www.whitehouse.gov/the-press-office/2014/04/25/fact-sheet-taking-action-improve-teacher-preparation

Now, there’s a whole lot to chew on here, but let me focus on one of the more absurd hypocrisies in all of this.

First, Duncan seems to think the world of medical education without apparently having the first clue about how any of it actually works. In his view, it’s really just a matter of intensive clinical training (no academic prerequisites required) and competitive wages (a reasonable, though shallowly articulated argument).

Second, Duncan also seems to think that a major part of the solution for Ed Schools can be found in entrepreneurial startups like Relay Graduate School of Education. The White House press release proclaims:

Relay Graduate School of Education, founded by three charter management organizations, measures and holds itself accountable for both program graduate and employer satisfaction, and requires that teachers meet high goals for student learning growth before they can complete their degrees. There is promise that this approach translates into classroom results as K-12 students of Relay teachers grew 1.3 years in reading performance in one year.

Now, I’ll set aside for the moment that the student outcome metrics proposed for use in evaluating ed schools create the same bad incentives (and unproven benefits) that the feds have imposed for evaluating physicians and hospitals.

Let’s instead consider the model of the future – one which blends Arne Duncan’s otherwise entirely inconsistent models of training. I give you:

The Relay Medical College and North Star Community Hospital

Here’s how it all works. Deep in the heart of some depressed urban core where families and their children suffer disproportionate obesity, asthma and other chronic health conditions, where few healthy neighborhood groceries exist, but plenty of fast food joints are available, sits the newly minted North Star Community Hospital.

It all starts here. NSCH is a new kind of hospital that does not require any of its staff to actually hold medical degrees, any form of board certification or nursing credential, or even special technician degrees to operate medical equipment or handle medications. Rather, NSCH recruits bright young undergrads from top liberal arts colleges, with liberal arts majors, and puts them through an intense 5 week training program where they learn to berate and belittle minority families and children and shame them into eating more greens and fiber. Where they learn to demean them into working out – walking the treadmill, etc. It's rather like an episode of the Biggest Loser. And the Hospital is modeled on the premise that if it can simply engage enough of the community members in its bootcamp style wellness program, delivered by these fresh young faces, it can substantively alter the quality of life in the urban core.

There is indeed some truth to the argument. Getting more community members to eat healthier and exercise will improve their health stats, including the morbidity and mortality measures commonly used in hospital rating systems. In fact, over time, this Hospital, which provides no actual medical diagnosis or treatment, does produce annual reports that show astoundingly good outcome measures for community members who complete its program.

These great outcome measures generate headlines from the local news writers who fail to explore more deeply what they mean (Yes, Star Ledger editorial board, that's you!). NSCH becomes such a darling of the media and politicians that it is granted authority to start its own medical school to replicate its "successes." And it is granted the authority to run a medical school where medical training need not even be provided by individuals with medical training!

Rather, they will grant medical degrees to their own incoming staff based on their own experiences with healthcare awesomeness. That’s right, individuals who themselves had little or no basic science or actual supervised clinical training in actual medicine, but have 3 to 5 years of experience in medical awesomeness in this start-up (pseudo) Hospital will grant medical degrees – to their own incoming peers!

Acknowledging the brilliance of this new model, US Dept of Health officials establish a new rating system for all medical colleges whereby they must show that graduates of their programs reduce patient morbidity and mortality. RMC and NSCH continue to lead the nation, despite providing no actual medical interventions, sticking instead to their plan of tough love, no excuses wellness training.

But, one day, it comes to light that while approximately 50 community members per year succeeded in the NSCH program and did in fact experience improved quality of life, there had been over 150 entrants to the program each year (like this). In fact, most failed. Some simply weren't up for the daily berating inflicted on them by NSCH staff. Some had other chronic health ailments and were told by NSCH staff to suck it up, get in line (literally, in line, step left only when told) or leave.

It became clear that patients with diabetes and heart conditions need not apply. None of the staff employed at NSCH had training in cardiology or, for that matter, any CPR or basic life support skills. That stuff really didn't matter to them, and they sure as heck weren't going to stand for someone keeling over on the treadmill and ruining the NSCH mortality stats.

Sadly, by this point in time RMC and NSCH had become such a touted model that the real urban hospitals had all been closed. Further, there were few if any incentives for real medical colleges to train physicians to work in the urban core, where the traditional medical model had now been fully replaced by the RMC/NSCH model. They certainly couldn't match the stats that NSCH was posting if they chose to serve patients who actually had chronic health conditions, or who were non-compliant patients.

And those 100 dropouts from each NSCH cohort, those with diabetes, heart disease and other health conditions not so easily remedied with a good shout down, were simply out of luck. Actual community morbidity and mortality stats skyrocketed. But alas, no one was left to care.

Note:

Indeed, wellness is key to the provision of high quality healthcare in the urban core and elsewhere. But it is not a replacement. And yes, one can make an argument that the bootcamp program described above as NSCH legitimately helped to improve the health outcomes, and perhaps even the overall quality of life, of the 50 program completers, much as the reality TV show Biggest Loser does for its contestants.

One can certainly make the comparison to the benefits obtained by the 50% or so of students who actually complete the most "no excuses" charter schools, like North Star Academy in Newark, NJ. Those few students who do succeed and complete are likely better off academically than they might otherwise have been. But this by no means indicates that North Star Academy and Relay Graduate School of Education, or my hypothetical North Star Community Hospital and Relay Medical College, are model programs for serving the public good. In fact, as pointed out here, assuming so, and applying bogus, easily manipulated, and simply wrongheaded metrics to proclaim success, may in the end cause far more harm than good.