Inexcusable Inequalities! This is NOT the post funding equity era!

I’ve heard it over and over again from reformy pundits. Funding equity? Been there done that. It doesn’t make a damn bit of difference. It’s all about teacher quality! (which of course has little or nothing to do with funding equity?).  The bottom line is that equitable and adequate financing of schools is a NECESSARY UNDERLYING CONDITION FOR EVERYTHING ELSE!

I’m sick of hearing it from pundits who’ve never run a number themselves and have merely passed along copies of the meaningless NCES table showing national average spending in high-poverty districts to be slightly greater than that in lower-poverty ones.

I’m sick of the various iterations of the “we’ve tripled spending and gotten nothing for it” argument and its accompanying bogus graphs. And I’m sick of the implication, put forward by pundits, that these graphs and tables taken together mean that we’ve put our effort into the finance side for kids in low-income schools, but it’s their damn lazy, overpaid teachers who just aren’t cutting it.

I’m intrigued by those pundits who would point out that outcomes of low-income children have perhaps improved over the past few decades, and that the improvement is entirely attributable to increased accountability measures (when the same pundits have argued previously that massive increases in funding led to no improvement). Perhaps there has been improvement, and perhaps there has been some increase in funding on average… and perhaps that’s the connection? More insights on achievement gap closure and shifting resources here!

I’m also sick of those who would so absurdly argue that districts serving low-income and minority children really have more than enough money to deliver good programs, but they’ve squandered it all on useless stuff like cheerleading and ceramics.

Anyway, the goal of this post is to point out some of the inexcusable inequalities that persist in K-12 education, inequalities that have real consequences for kids. Let’s take a look, for example, at two states that have persistently large achievement gaps between low-income and non-low-income students – Illinois and Connecticut. These two states have somewhat different patterns of overall funding disparity, but suffice it to say, both states have their winners and losers, and the differences between them are ugly and unacceptable.

Let’s start with Connecticut. Below is a graph of Connecticut school district “need and cost adjusted current spending per pupil” and standardized test outcomes on the Connecticut Mastery Test (CMT). Expenditures are adjusted for differences in competitive labor market wages, and for shares of children qualifying for free or reduced-price lunch and of children with limited English language proficiency (based on estimates reported here). I’ve used essentially the same methods I discussed in this previous post.
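The general logic of a need- and cost-adjustment can be sketched as follows. This is a minimal illustration with made-up weights and made-up districts, not the actual model behind the estimates cited above:

```python
# Hypothetical sketch of a need- and cost-adjusted spending index.
# The weights and district figures below are illustrative only -- they are NOT
# the estimates used in the post.

def adjusted_spending(spending_per_pupil, wage_index,
                      pct_free_reduced_lunch, pct_lep,
                      frl_weight=0.4, lep_weight=0.5):
    """Deflate nominal spending by a regional wage index and by an
    estimated extra-need factor for low-income and LEP pupils."""
    need_index = 1 + frl_weight * pct_free_reduced_lunch + lep_weight * pct_lep
    return spending_per_pupil / (wage_index * need_index)

# Two hypothetical districts with IDENTICAL nominal spending:
affluent = adjusted_spending(15000, wage_index=1.00,
                             pct_free_reduced_lunch=0.05, pct_lep=0.01)
high_need = adjusted_spending(15000, wage_index=1.10,
                              pct_free_reduced_lunch=0.80, pct_lep=0.25)

print(round(affluent))   # -> 14634: nearly full effective resources
print(round(high_need))  # -> 9437: far less, despite identical nominal spending
```

The point of the sketch: two districts can spend the same nominal dollars per pupil while the high-need, high-wage-market district has dramatically fewer effective resources per need-weighted pupil.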

What we see here is that resources – after adjustment for needs and costs – vary widely. Heck, they vary quite substantially even without these adjustments! What we also see is that we’ve got some really high flyers, like Weston, New Canaan and Westport, and we’ve got some that, well, are a bit behind in both equitable resources and outcomes (Bridgeport and New Britain in particular). To be blunt, THIS MATTERS! Yeah… okay, reformy pundits are saying, but they really have enough anyway. Why put anything else into those rat-holes?

Let’s break it down a bit further. Here are the characteristics of a few of the most advantaged and most disadvantaged districts in the above figure.

But of course, all we need to do is reshuffle the deck chairs in Bridgeport and New Britain – fire their bottom 5% of teachers – heck, let’s go for 20% – pay the new ones based on test scores… and all will be fixed! Those deficits in average salaries might be a bit problematic. And even the nominal (unadjusted) spending figures fall well short of their advantaged neighbors’. But bring on those reformy fixes, and throw in some funding cuts while you’re at it!

I’m sure… absolutely sure that the only reason those salaries are low is that they’ve wasted too much money on administrators and on reducing class size… which we all know doesn’t accomplish anything???? But wait – here are the elementary class sizes:

Well, there goes that ridiculous reformy assumption. Class sizes are actually larger in these higher-need districts! And salaries are lower. Damn cheerleading costs! Killing us! Perhaps it’s even going into junk like band and art, which are obviously a waste of time and money on these kids!

Well, here are the staffing structures of the schools, with staffing positions reported per 100 pupils.

Hmmm… disadvantaged districts have far fewer total positions per child, and if we click and blow up the graph, we can see some striking discrepancies! Those high need districts have far more special education and bilingual education teachers (squeezing out other options, from their smaller pot!). Those high need districts have only about half the access to teachers in physical education assignments or art, much less access to Band (little or none to Orchestra), and significantly less access to math teachers!

But, okay… this Connecticut thing is a freakin’ anomaly, right? These kinds of disparities – savage inequalities – are surely a thing of the past. This is, after all, THE POST-FUNDING EQUITY ERA? Been there and done that!

Let’s do the same walk through for a few Illinois districts. First, here are the graphs of need and cost adjusted (based on a cost model used in my previous post and related working paper) operating expenditures and outcomes –

For unified K-12 districts

For High School districts

Here are the basic stats on these districts

In this case, imagine trying to recruit and retain teachers of comparable quality in JS Morton to those in New Trier at $20k less on average, or in Aurora East compared to Barrington, at nearly $20k less. Ahh… you say… Baker… you’re making way too much of the funding issue. First, we know they’re wasting it all on small class sizes and cheerleading. Second, Baker… you’re missing the point that if we fire the bad teachers and pay the good teachers based on student test scores, those New Trier teachers will be banging down the door to get into JS Morton! That’s real reform, dammit! And we know it works (even though we don’t have an ounce of freakin’ evidence to that effect!).

Clearly, if schools in Aurora East and JS Morton are slated for closure under NCLB (I’ve not checked this actually), it’s not because of poverty. It’s not for lack of resources… Clearly it’s their lazy, overpaid teachers who refuse to pull all-nighters with their kids to beat those odds????? To get those kids into calculus and trig classes presently filled with empty seats (and their own overpaid under-worked teachers!)

So, here’s what the staffing ratios look like.

First, those advantaged districts just have a lot more teacher assignments (position assignments) than the disadvantaged ones. And they especially have far more assignments in advanced math, advanced science, library/media, art and music. There’s not a whole lot of squandering on extras going on in JS Morton and Aurora East. Like CT, though, the disadvantaged districts do have bilingual education and special education teachers! The staffing disparities are baffling – savage, in fact!

In fact, I must be making this stuff up, right? After all, THIS IS THE POST-FUNDING DISPARITY ERA? This kind of stuff is just pulled from the chapters of an old Kozol book! Teachers matter. Not funding. We all know that (except perhaps the various researchers who’ve actually explored the relationship between school funding reforms and student outcomes, only to find that funding does matter).

Clearly, this matters. These funding disparities are substantial. And while these examples are selected from the extremes of the distributions, these districts have plenty of company at the extremes, and these districts fall along a clearly patterned continuum. And, with enough data and enough space, I could keep going and going here. CT and IL are not unique – though IL is clearly among the worst in the nation. New York anyone?

Utica is quite possibly one of the most financially screwed local public school districts in the nation (Poughkeepsie isn’t far behind)!

Arguably, there are entire states – like Tennessee and Arizona that are approaching (if they’ve not already surpassed) the conditions of districts like Utica, JS Morton, Bridgeport or New Britain.

Until we take these disparities seriously and stop counting on miracles and superman to give us a free ride, we’re not likely to make real progress on the “Scarsdale-Harlem” achievement gap.

Treating teachers like crap, cutting state funding, and basing teacher salaries on student test scores will do nothing to correct these disparities, and will likely only make them worse. Nor can we expect to close the gap by simply replacing the current underfunded schools with comparably underfunded schools under new management (or simply paying parents of kids in these districts a discount rate to just go somewhere else, and never following up on the kids). This reformy goo is a dangerous distraction from the real issues!





Logic Gaps in the NJ Ed Reform Debate

Not much time for another full length post today. There are numbers to be crunched. But, I did feel it necessary to clear up a few issues regarding NJ Education Reform proposals, including those laid out yesterday focused on a) reforming teacher evaluation to focus on student assessment data, b) tying evaluation to compensation, tenure and dismissal policies, c) ending last in first out, and d) requiring mutual consent in placement/hiring of teachers to specific school locations.

And of course, these policy proposals are framed with the usual urgency.

Here are four overarching claims (and a few other things) based on reformy logic being applied in the New Jersey policy debate:

1. We must act now!

The argument goes that we must act now, before it’s too late, because things are so awful. First, it’s rather hard to argue with a straight face – and certainly not with any data – that NJ’s public education system is so awful. NJ performs at or near the top among states on national assessments, and NJ low-income students (those qualifying for free lunch) also do quite well nationally and have improved over the years (one example here). Typically, the great-urgency argument is a ruse to get policymakers to act in haste and adopt policies that they – and especially those who voted for them – will regret later.

2. We couldn’t possibly do worse!

Then there’s the argument that we couldn’t possibly do worse! Clearly, New Jersey could do worse, since New Jersey currently does quite well. That’s not to say that we shouldn’t keep trying to do better, or that we shouldn’t be trying to do better specifically in those areas where we aren’t doing as well as we should. But we could surely do worse, as the vast majority of states do! See:

3. Teacher evaluation, compensation and tenure reform are the key variables!

All of the current proposals center on what are argued to be necessary changes to teacher evaluation, compensation, tenure and dismissal. That is, the assumption is that we can improve all schools by making these changes and specifically that we can improve the 200 failing schools which serve over 100,000 students. For these changes to be reasonable, one would have to have some idea, some empirical basis perhaps, for why these policy changes might have any positive effect on either our highly successful districts or those supposedly dreadfully failing ones. Since the existing research literature provides no real substantive support for merit pay (as a way to either stimulate immediate, or long-term improvements), or using student test scores for teacher evaluation, one might logically look at the differences between NJ’s highest performing schools and NJ’s lowest performing ones. Of course, what we find there is that the teacher contractual agreements are quite similar in higher and lower performing schools in NJ. Of course, other things are different, most notably the demographics of those schools.

Let’s make this really simple – IT’S PLAINLY ILLOGICAL TO BLAME SUCCESS OR FAILURE ON A FACTOR THAT DOESN’T VARY ACROSS SUCCESSFUL AND FAILING SCHOOLS. That’s just middle school science logic. Perhaps we should fire the middle school science teachers who taught the current crop of ed reformers?

4. No business in its right mind would retain “ineffective” employees, so why should we let this happen in schools?

There’s also that fun argument that no business in its right mind would or should retain ineffective, low-quality employees. Why would they? Why do they? Well, it’s all relative. Now surely, anyone reading this has encountered at least a few employees of private companies, or perhaps even colleagues, who, well, just aren’t that good at what they do. Some people do better than others in any field, and there’s always a bottom rung. We ask ourselves, why do we retain these people? Why would a school retain an ineffective teacher? Why would a school grant tenure to such a large share of teachers, some of whom might not be that great? Sometimes the answer is pretty simple: those waiting in line to take those jobs at present salaries might not be any better, and in fact might be worse! You don’t let go of your bottom rung unless you are pretty sure you can replace them with something better. Applied to the current NJ school reform debate: one cannot simply assume that if we force poor urban districts to lay off large numbers of teachers whom we would consider “ineffective,” there will be a long line of better teachers waiting to take those jobs. In fact, the alternatives might be worse in many cases, unless we significantly step up teacher pay and maintain quality benefits, including job stability and the potential for consistent income growth over time (which can itself allow a lower wage than would otherwise be required).


We must fix LIFO now!

That is, clearly, the most offensive policies that exist today across states and in district contractual agreements are those that protect old, crusty, ineffective, uncaring curmudgeons while discarding – throwing out onto the streets – young, energetic and caring teachers.

This one is really a smokescreen issue, especially when coupled with the immediacy claim. It makes for good sound bites and has a catchy acronym – LIFO – which must be bad, because it sounds so bad! But when you dig deeper, even though it seems to make sense that quality should trump seniority in layoff decisions, it’s not that simple – nor is it the huge money saver and job saver some assert.

  • First, layoffs are here and now – in very tight budget times – and the supposed evaluations to be used don’t yet exist. So suggesting that this is a necessary immediate change is foolish.
  • Second, if we are relying heavily on test scores to decide quality – the only teachers who would have scores attached to them would be those in core content teaching in grades 3 to 8. But, layoffs are likely to occur in other areas first – and unlikely to reach core teaching in K-8 in many cases. In fact, schools and districts already have significant latitude to restructure programs and offerings leading to layoffs that may not all fall entirely on the basis of seniority (programmatic & position cuts).
  • Third, there is more research out there than is acknowledged in the present debate that actually does speak to the value of experience.
  • Fourth, replacing a not-so-great, convenience-based (and perhaps turf-protecting) measure like seniority with a potentially politically charged, manipulable, and random-error-prone alternative (like test-score-based evaluations) CAN ACTUALLY MAKE THINGS WORSE. While LIFO may not be great, the alternatives could be worse, and could be an even greater deterrent to the recruitment of a talented teacher workforce.

A few other notes

Regarding what we know about mutual consent teacher hiring/placement policies:

Oh, and by the way, just to be absolutely clear, NEW JERSEY IS NOT THE HIGHEST SPENDING STATE IN THE NATION!

Dumbest “real” reformy graphs!

So in my previous post I created a set of hypothetical research studies that might be presented at the Reformy Education Research Association annual meeting. In creating the hypotheticals I actually tried to stay pretty close to reality, setting up reasonable tables with information that is actually quite plausible. Now, when we get down to the real reformy stuff that’s out there, it’s a whole lot worse. In fact, had I presented the “real” stuff in my previous post, I’d have been criticized for fabricating examples that are just too stupid to be true. Let’s take a look at some real “reformy” examples here:

1. From Democrats for Education Reform of Indiana

According to the DFER web site post which includes this graph:

True, there are some great, traditional public schools in Indiana and throughout the nation.  We’re also fortunate that a vast majority of our educators excel at their jobs and are dedicated to doing whatever it takes to help students succeed.  However, that doesn’t mean we should turn a blind eye to what ISN’T working.  Case in point?  The following diagram displays how all 5th grade classes in the span of a year in one central Indiana school district are doing on a set of state Language Arts student academic standards.  Because 5th grade classes in Indiana are only taught by one teacher, the dots can be translated to display how well the students of individual teachers are doing.

Now, ask yourself this:  In which dot or class would you want your child?  And, imagine if your child were in the bottom performing classroom for not one but MULTIPLE years.  In spite of lofty claims made by those who defend the current system, refusal to offer constructive alternatives to rectify charts such as the one above represents the sad state of education dialogue in America today.

So, here we have a graph… a line graph, of all things, across classrooms (3rd-grade graphing note – a bar graph would be better, but still stupid). This graph shows the average pass rates on state assessments for kids in each class. Nothin’ else. Not gains. Just average scores. Gains wouldn’t necessarily tell us that much either. But this is truly absurd. The author of the DFER post makes the bold leap that the only conclusion one can draw from differences in average pass rates across a set of Indiana classrooms is that some teachers are great and others suck! Had I used this “real” example to criticize reformers, most would have argued that I had gone overboard.
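A quick toy simulation shows why raw average pass rates can’t isolate teacher quality. In this sketch (entirely hypothetical numbers), every class gets the exact same teacher effect – zero – and classes differ only in their incoming achievement. The pass rates still spread widely:

```python
import random

random.seed(0)

PASS_CUTOFF = 500  # hypothetical scale-score cutoff for "passing"

def class_pass_rate(incoming_mean, n=100):
    # Each student's score = incoming achievement + noise; NO teacher effect.
    scores = [random.gauss(incoming_mean, 40) for _ in range(n)]
    return sum(s >= PASS_CUTOFF for s in scores) / n

# Four classes with identical (null) teaching, different incoming achievement:
rates = [class_pass_rate(m) for m in (460, 490, 510, 540)]
print([round(r, 2) for r in rates])
# The rates differ substantially even though "teacher quality" is constant,
# which is exactly what a raw average-pass-rate chart cannot distinguish.
```

Pick any dot on such a chart and you’re mostly picking the demographics and prior achievement of the class, not the teacher.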

2. Bill Gates’ brilliant exposition on turning that curve upside down – and making money matter

Now I’ve already written about this graph, or at least the post in which it occurs, but I didn’t include the graph itself.

Gates uses this chart to advance the argument:

Over the last four decades, the per-student cost of running our K-12 schools has more than doubled, while our student achievement has remained flat, and other countries have raced ahead. The same pattern holds for higher education. Spending has climbed, but our percentage of college graduates has dropped compared to other countries… For more than 30 years, spending has risen while performance stayed flat. Now we need to raise performance without spending a lot more.

Among other things, the chart includes no international comparison, which becomes the centerpiece of the policy argument. Beyond that, the chart provides no real evidence of a lack of connection between spending and outcomes across districts within U.S. states. Instead, the chart juxtaposes completely different measures on completely different scales to make it look like one number is rising dramatically while the others stay flat. This tells us NOTHING. It’s just embarrassing. Simply from a graphing standpoint, a blogger at Junk Charts noted:

Using double axes earns justified heckles but using two gridlines is a scandal!  A scatter plot is the default for this type of data. (See next section for why this particular set of data is not informative anyway.)

Not much else to say about that one. Again, had I used an example this absurd to represent reformy research and thinking, I’d have likely faced stern criticism for mis-characterizing the rigor of reformy research!

Hat tip to Bob Calder on Twitter for finding an even more absurd representation of pretty much the same graph used by Gates above. This one comes to us from none other than Andrew Coulson of the Cato Institute. Coulson has a stellar record of this kind of stuff. So, what would you do to the Gates graph above if you really wanted to make your case that spending has risen dramatically and we’ve gotten no outcome improvement? First, use total rather than per-pupil spending (and call it “cost”), and stretch the scale on the vertical axis for the spending data to make it look even steeper. And then express the achievement data in percent-change terms, because NAEP scale scores are in the 215 to 220 range for 4th grade reading, for example, but are scaled such that even small point gains may be important/relevant, yet won’t show as even a blip if expressed as a percent over the base year.
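The percent-over-base-year trick is easy to demonstrate with a couple of lines of arithmetic. The numbers below are illustrative stand-ins in the ranges described above, not the actual NAEP or spending series:

```python
# NAEP 4th grade reading scale scores sit roughly in the 215-220 range.
# A several-point gain can be educationally meaningful, but expressed as a
# percent change over the base year it looks like a flat line.
base_score, later_score = 215, 220          # illustrative scale scores
score_pct_change = 100 * (later_score - base_score) / base_score

# Meanwhile a nominal spending series more than doubles:
base_spend, later_spend = 5000, 11000       # illustrative dollars
spend_pct_change = 100 * (later_spend - base_spend) / base_spend

print(f"Achievement: +{score_pct_change:.1f}%")  # +2.3% -- the "flat" line
print(f"Spending:    +{spend_pct_change:.1f}%")  # +120.0% -- the soaring line
```

Plot those two percent-change series on the same axis and the achievement line is visually indistinguishable from zero, regardless of whether the underlying score gain mattered.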

And here’s the StudentsFirst version of the same old story:

3. Original promotional materials from the reformy documentary, The Cartel (a manifesto on New Jersey public schools)

The Cartel is essentially the ugly step-cousin of Waiting for Superman and The Lottery. I wrote extensively about The Cartel when it was originally released, and again when it made its Jersey tour. Thankfully, it didn’t get much beyond that. Back when it was merely a small-time, low-budget, ill-conceived, and even more poorly researched pile of reformy drivel, The Cartel had a promotional web site (different from the current one) which included a page of documented facts explaining why reform was necessary in New Jersey. The central message was much the same as the Gates message above. The graphs that follow are no longer there, but the message is – for example – here:

With spending as high as $483,000 per classroom (confirmed by NJ Education Department records), New Jersey students fare only slightly better than the national average in reading and math, and rank 37th in average SAT scores.

Here are the truly brilliant graphs that support this irrefutable conclusion:

I have discussed these graphs at length previously! I’m not sure it’s even worth reiterating my previous comments. But, just to clarify, it is entirely conceivable that participation rates for the SAT differ somewhat across states and may actually be an important intervening factor? Nah… couldn’t be.

Reformy Disconnect: “Quality Based” RIF?

I addressed this point previously in my post on cost-effectiveness of quality based layoffs, but it was buried deep in the post.

Reformers are increasingly calling for quality-based layoffs versus seniority-based layoffs, as if it were a simple dichotomy. Sounds like a no-brainer when framed in these distorted terms.

I pointed out in the previous post that if the proposal on the table is really about using value-added teacher effect estimates versus years of service, we’re really talking about a choice between significantly biased and error-prone – largely random – layoffs and layoffs based on years of service. It doesn’t sound as much like a no-brainer when put in those terms, does it? While reformers might argue that seniority-based layoffs are still more “error prone” than effectiveness-rating layoffs, it is actually quite difficult to determine which, in this case, is more error prone. Existing simulation studies identifying value-added estimates as the less-bad option use value-added estimates to determine which option is better. Circular logic (as I previously wrote)?
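The error-prone point can be made concrete with a toy simulation. The noise level here is an assumption for illustration; real value-added error structures are more complicated, but the mechanism is the same:

```python
import random

random.seed(1)

# Toy model: each teacher has a true effect; the observed value-added estimate
# is the true effect plus substantial noise (the noise scale is assumed).
# Lay off the bottom 10% by ESTIMATED effect and see how many teachers in the
# TRUE bottom decile we actually hit.
N = 1000
true_effect = [random.gauss(0, 1) for _ in range(N)]
estimate = [t + random.gauss(0, 1.5) for t in true_effect]  # assumed noise

cut = N // 10
worst_true = set(sorted(range(N), key=lambda i: true_effect[i])[:cut])
laid_off = set(sorted(range(N), key=lambda i: estimate[i])[:cut])

overlap = len(worst_true & laid_off) / cut
print(f"Share of laid-off teachers actually in the true bottom decile: {overlap:.0%}")
# With noise of this magnitude, most layoffs miss the truly least effective
# teachers -- the procedure is closer to a lottery than a quality screen.
```

Under these assumed error levels, well under half of the teachers dismissed “for ineffectiveness” would actually be in the true bottom decile.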

We’re having this policy conversation about layoffs now because states are choosing (yes, choosing – not forced, not by necessity) to slash aid to high-need school districts that are highly dependent on state aid and will likely be implementing reduction in force (RIF) policies. That is, laying off teachers. So, reformy pundits argue that they should be laying off those dead-wood teachers – those with bad effectiveness ratings – instead of those young, energetic, highly qualified ones.

So, here are the basic parameters for quality-based RIF:

1. We must mandate test-score based teacher effectiveness ratings as a basis for teacher layoffs.

2. But, we acknowledge that those effectiveness ratings can at best be applied to less than 20% of teachers in our districts – specifically teachers of record (classroom teachers) responsible for teaching math and reading in grades 3 to 8 (4 to 8 if only annual assessment data are available).

3. Districts are going to be faced with significant budget cuts which may require laying off around 5% or somewhat more of their total staff, including teaching staff.

4. But, districts should make efforts to lay off staff (teachers) not responsible for teaching core subject areas.

Is anyone else seeing the disconnect here? Yeah, there are many levels of it, some more obvious than others. Let’s take this from the district administrator’s/local board of education perspective:

“Okay, so I’m supposed to use effectiveness measures to decide which teachers to lay off. But, I only have effectiveness measures for those teachers who are supposed to be last on my list for lay offs? Those in core areas. The tested areas. How is that supposed to work?”

Indeed, the point of the various “quality based layoff” simulations that have been presented (the logic of which is problematic) is to lay off teachers in core content areas and rely on improved average quality of core content teachers over time to drive system-wide improvements. These simulations rely on heroic assumptions of a long waiting list of higher-quality teacher applicants just frothing at the mouth to take those jobs – jobs from which they too might be fired within a few years due to random statistical error (or biased estimates) alone.

That aside, reduction in force isn’t about choosing which teachers to be dismissed so that you can replace them with better ones. It’s about budgetary crisis mode and reduction of total staffing costs. And reduction in force is not implemented in a synthetic scenario where the choice only exists to lay off either core classroom teachers based on seniority, or core classroom teachers based on effectiveness ratings (the constructed reality of the layoff simulations). Reduction in force is implemented with consideration for the full array of teaching positions that exist in any school or district. “Last in, first out” or LIFO as reformy types call it, does not mean ranking all teachers systemwide by experience and RIF-ing the newest teachers regardless of what they teach, or the program they are in. Specific programs and positions can be cut, and typically are.

And it is unlikely that local district administrators in high-need districts would, or even should, look first to cut deeply into core content area teachers. So, a 5% staffing cut might be accomplished before ever cutting a single teacher for whom an effectiveness rating exists – or at most very few. So, in the context of RIF, layoffs actually based on effectiveness ratings are a drop in the bucket.
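The arithmetic of the disconnect is straightforward. With hypothetical district numbers (the 20%-rated figure comes from the parameters above; the district size is made up):

```python
# Hypothetical district: only grade 3-8 math/reading teachers of record carry
# test-based effectiveness ratings (at best ~20% of teachers, per the post).
total_teachers = 400               # illustrative district size
rated_share = 0.20                 # share with effectiveness ratings, at best
required_cut = 0.05                # budget-driven RIF of ~5%

rated = int(total_teachers * rated_share)      # teachers WITH ratings
unrated = total_teachers - rated               # teachers WITHOUT ratings
layoffs = int(total_teachers * required_cut)   # positions to be cut

# If the district (sensibly) protects core tested grades, the entire cut can
# fall on unrated positions -- the effectiveness ratings never enter into it.
print(layoffs, "layoffs needed;", unrated, "unrated positions available")
print("Cut can avoid every rated teacher:", layoffs <= unrated)  # True
```

In other words, the mandated rating system would govern almost none of the layoff decisions the budget crisis actually forces.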

So now I’m confused. Why is this such a pressing policy issue here and now? Does chipping away at seniority based provisions really have much to do with improving the implementation of RIF policies? Perhaps some are using the current economic environment and reformy momentum to achieve other long-run objectives?

Stretching Truth, Not Dollars?

This week, Mike Petrilli (Thomas B. Fordham Institute) and Marguerite Roza (Gates Foundation) released a “policy brief” identifying 15 ways to “stretch” the school dollar. Presumably, what Petrilli and Roza mean by stretching the school dollar is finding ways to cut spending while either not harming educational outcomes or actually improving them. With that goal in mind, it’s pretty darn hard to see how any of the 15 proposals would lead to progress toward it.

The new policy brief reads like School Finance Reform in a Can. I’ve written previously about what I called Off-the-Shelf school finance reforms – quick and easy, generally ineffective and meaningless (or potentially damaging), revenue-neutral school finance fixes. In this new brief, Petrilli and Roza have pulled out all the stops. They’ve generated a list that could easily have been produced by a random search engine scouring “reformy” think tank websites, excluding any ideas actually supported by the research literature.

The policy brief includes some introductory ramblings about district-level practices for “stretching” the school dollar, but it focuses on state policies that can assist in stretching the school dollar at the state level and give local districts greater options to do the same. I will focus my efforts on the state policy list.

Here’s the state policy recommendation list:

1. End “last hired, first fired” practices.

2. Remove class-size mandates.

3. Eliminate mandatory salary schedules.

4. Eliminate state mandates regarding work rules and terms of employment.

5. Remove “seat time” requirements.

6. Merge categorical programs and ease onerous reporting requirements.

7. Create a rigorous teacher evaluation system.

8. Pool health-care benefits.

9. Tackle the fiscal viability of teacher pensions.

10. Move toward weighted student funding.

11. Eliminate excess spending on small schools and small districts.

12. Allocate spending for learning-disabled students as a percent of population.

13. Limit the length of time that students can be identified as English Language Learners.

14. Offer waivers of non-productive state requirements.

15. Create bankruptcy-like loan provisions.

This list can be lumped into four basic categories:

A) Regurgitation of “reformy” ideology for which there exists absolutely no evidence that the “reforms” in question lead to any improvement in schooling efficiency. That is, no evidence that these reforms either “cut costs” (meaning reduce spending without reducing outcomes) or improve benefits (or outcome effects).

  1. Creating a rigorous evaluation system
  2. Ending “last hired, first fired” practices
  3. Move toward weighted student funding

B) Relatively common “money saving” ideas, backed by little or no actual cost-benefit analysis – the kind of stuff you’d be likely to read in a personal finance column in a magazine in a dentist’s office.

  1. Pool health-care benefits.
  2. Create bankruptcy-like loan provisions. (???)
  3. Tackle pensions
  4. Cut spending on small districts and schools (consolidate?)

C) Reducing expenditures on children with special needs by pretending they don’t exist.

  1. Allocate spending for learning-disabled students as a percent of population.
  2. Limit the length of time that students can be identified as English Language Learners.

D) Un-regulation

  1. eliminate class-size limits
  2. provide waivers for ineffective mandates
  3. eliminate seat time requirements
  4. merge categorical programs
  5. eliminate work rules
  6. eliminate mandatory salary schedules

So, let’s walk through a few of these in greater detail. Let’s address whether there is any evidence whatsoever that these policies a) would actually lead to reduced short run costs while not harming, or even improving outcomes, or b) are for any other reason a good idea.

Creating an Evaluation System

This likely requires significant up-front spending – heavy front-end investment to design the system and put it into place. Yes, increased, not decreased, spending. And in the short term, while money is tight. AND there is little or no evidence that what is being recommended – a Tennessee- or Colorado-style teacher evaluation model (50% based on value-added scores) – would actually reduce spending and/or improve outcomes. Rather, I could make a strong case that such a model will lead to exorbitant legal fees for the foreseeable future (I have a forthcoming law review article on this topic). The likelihood of achieving long-run benefits from these short-run expenses is questionable at best. In fact, the likelihood of significant harm seems equal if not greater (see my previous post on this topic: value-added teacher evaluation).

Ending “Last Hired, First Fired” layoff policies

In very crude terms, this approach might simply allow a district – or entire state – to lay off senior, higher-salary teachers. Yeah… that could reduce the payroll. Good policy? Really questionable! Of course, Petrilli and Roza also argue that we simply shouldn’t be paying teachers for experience or degrees anyway. So I guess if we did that, we wouldn’t generate savings from this recommendation. Silly me. One or the other, I guess.

Now, we could generate performance increases (at lower spending, if we keep seniority pay, or at constant spending if we don’t) if, and only if, the future actually plays out as simulated in the various performance-based layoff simulations which I, and others have recently discussed. The assumptions in these simulations are bold (unrealistic), and much of the logic circular.

And then there are those short-term legal costs of defending the racially disparate firings, and random error firings.

Eliminating Class Size Limits

Yes, larger classes require less spending – on a per pupil basis. Smaller classes have greater benefit (greater “bang for the buck” shall we so boldly say) in higher poverty settings. A labor market dynamic problem realized in the late 1990s, when CA implemented statewide class size reduction, was that the policy stretched the pool of highly qualified teachers and ultimately made it even harder for high poverty schools to get high quality teachers (a dreadfully oversimplified and disputable version of the story).

Removing class size limits might be reasonable if only affluent districts agreed to increase their class sizes, putting more “high quality” teachers into the available labor pool… who might then be recruited into high poverty districts (another dreadfully oversimplified, if not absurd scenario).  But who really thinks it will play out this way? We already know that affluent school districts a) have strong preferences for very small class sizes and b) have the resources to retain those small class sizes or reduce them further. See Money and the Market for High Quality Schooling.

Eliminating mandatory salary schedules

It seems that in this recommendation, Petrilli and Roza are arguing against state policies that mandate the adoption by local public school districts of specific step and lane salary schedules. They really only provide one brief paragraph with little or no explanation regarding what the heck they are talking about.

I’ve personally never been much of a fan of state rigidity regarding locally negotiated agreements – at least in terms of steps and lanes. Many problems can occur when states enact policies as rigid as those of Washington State, where teachers statewide are on a single salary schedule.

The best work on this topic (and I’ve worked on the same topic with Washington data) is by Lori Taylor of Texas A&M who shows that the Washington single salary schedule leads to non-competitive wages for teachers in metro areas, and also leads to non-competitive wages for teachers in math and science relative to other career opportunities in metro areas. The statewide salary schedule in Washington is arguably too rigid. Here’s a link to Taylor’s study:

Taylor, L. (2008) Washington Wages: An Analysis of Educator and Comparable Non-educator Wages in the State of Washington. Washington State Institute for Public Policy.

But this does not mean, by any stretch of the imagination, that removing this requirement would save money, or “stretch” the education dollar. It might allow bargaining units in metro areas in Washington to scale up salaries over time as the economy improves. And it might lead to some creative differentiation across negotiated agreements, with districts trying to leverage different competitive advantages over one another for teacher recruitment.

But, these competitive behaviors among districts may also lead to ratcheting of teacher salaries across neighboring bargaining units, and may lead to increased salary expense with small marginal returns (as clusters of districts compete to pay more for an unchanging labor pool). For an analysis of this effect, see Mike Slagle’s work on spatial relationships in teacher salaries in Missouri. In short, Slagle finds that changes to neighboring district salary schedules are among the strongest predictors of an individual district’s salary schedule. Ratcheting upward of salaries in neighboring districts is likely to lead to adjustment by each neighboring district (to the extent resources are available). Ratcheting downward does not tend to occur (not reported in this article).

Slagle, M. (2010) A Comparison of Spatial Statistical Methods in a School Finance Policy Context. Journal of Education Finance 35 (3)

[note: this article is a shortened version of Mike’s dissertation. The article addresses only the ratcheting of per pupil spending, but the full dissertation also addresses teacher salaries]

In any case, we certainly have no evidence that removing state level requirements for mandatory salary schedules would save money while holding outcomes harmless – hence improving efficiency. Like I said, I’m not a big fan of such restrictions either, but I have no delusion that removing them will save any district a ton of money – or any for that matter.

This recommendation seems to also be tied up in the notion that we shouldn’t be paying teachers for experience or degree levels anyway. Therefore, mandating as much would clearly be foolish. I’ve addressed this idea previously in The Research Question that Wasn’t Asked.

In addition, this recommendation seems to adopt the absurd assumption that we could immediately just pay every teacher in the current system the bachelor’s degree base salary (okay, the salary of a teacher with three years and a bachelor’s degree, the point where marginal test-score returns to experience fade) – that we could immediately recapture all of the salary money dumped into differentiation by experience or degree, and that we could reap massive savings with absolutely no harm to the quality of schooling – or the quality of the teacher labor force – in the short run or the long term. Again, that’s the research question that was never asked. Previous estimates of all of the money wasted on the master’s degree salary “bump” are actually this crude.

For similarly absurd analysis by Marguerite Roza regarding teacher pay, see my previous post on “inventing research findings.”

Move toward Weighted Student Funding

Petrilli and Roza also advocate moving to Weighted Student Funding. They seem to argue that the “big” savings here will come from the ability of states and school districts to immediately take back funding as student enrollments decline. That is, a district in a state, or school in a district gets a certain amount per kid. If they lose the kid, they lose the money. This keeps us from wasting a whole lot of money on kids who aren’t there anymore.

Okay… Now… most state aid is allocated on a per pupil basis to begin with. And, in general, as enrollments fluctuate, state aid fluctuates. Lose a kid. Lose the state aid that is driven by that kid. Some states have recognized that the costs of providing education don’t actually decline linearly (or increase linearly) with changes in enrollment and have included safety valves to slow the rate of aid loss as enrollments decline. Such policies are reasonable.

Petrilli and Roza seem to be belligerently and ignorantly declaring that there is simply never a legitimate reason for a funding formula to include small school district or declining enrollment provisions. I have testified in court as an expert against such provisions when those provisions are completely “out of whack”, but would never say they are entirely unwarranted. That’s just foolish, and ignorant.

Local revenues in many states (and in many districts within states) still make up a large share of public school funding, and local revenues are typically derived from property taxes applied to the total taxable property wealth of the school district. As kids come and go, local revenues do not come and go. If a tax levy of X% on the district’s assessed property values raises $8,000 per pupil – and if enrollment declines, but the total assessed value stays constant, the same tax raises more per pupil, perhaps $8,100. The district would lose state funding because it has fewer pupils (and perhaps also because it can generate larger local share per pupil).  But that’s really nothing new.
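The local-revenue arithmetic above can be sketched in a few lines. The numbers here (total assessed value, levy rate, enrollments) are hypothetical, chosen only to reproduce the $8,000-to-roughly-$8,100 example in the text: a fixed levy on a fixed tax base raises *more* per pupil as enrollment declines, so local dollars do not simply “follow the kid.”

```python
# Hypothetical illustration: a fixed levy on a fixed tax base, divided
# across a shrinking enrollment, yields MORE revenue per pupil.

def local_revenue_per_pupil(assessed_value, levy_rate, enrollment):
    """Total levy revenue divided across enrolled pupils."""
    return assessed_value * levy_rate / enrollment

BASE = 800_000_000   # hypothetical total assessed property value ($)
RATE = 0.01          # hypothetical 1% levy

before = local_revenue_per_pupil(BASE, RATE, 1000)  # $8,000 per pupil
after = local_revenue_per_pupil(BASE, RATE, 988)    # ~$8,097 per pupil

print(f"before: ${before:,.0f}  after: ${after:,.0f}")
```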

There’s really no new “huge” savings to be had here. Unless, that is:


a) we are talking about kids moving to charter schools from the traditional public schools, and for each kid who moves to a charter school, we either require the district to pass along the local property tax share of funding associated with that child (Many states), or reduce state aid by the equivalent amount (Missouri).

b) there exists a property tax revenue limit tied specifically to the number of pupils served in the district (as in Wisconsin and other states) which then means that the district would have to reduce its local property taxes to generate only the per pupil revenue allowed. That’s not savings. It’s a state enforced local tax cut.

So then, why do Petrilli and Roza care about Weighted Student Funding as an option? The above two “Unless” scenarios are possible suspects. Blind reformy punditry regardless of logic is equally possible (WSF is cool… reformy… who cares what it does?).

It’s not really about “saving” money at all. Rather, it’s about creating mechanisms to enable local property tax revenues to be diverted in support of charter schools (even if the local taxpayers did not approve the charter), or to have local budgets forcibly reduced/capped when students opt-in to voucher programs (Milwaukee).

And this isn’t really a “weighted student funding” issue at all. In many states, it already works this way (WSF or not). Big savings? Perhaps an opportunity to reduce the state subsidy to charter schools by requiring greater local pass through – in those states where this doesn’t already occur. But these provisions face significant legal battles in some states. If a state is not already doing this, this policy change would also likely lead to significant up front legal expenses.

In fact, I can’t imagine a circumstance where adopting weighted student funding can be expected to either save money or improve outcomes for the same money. There’s simply no proof to this effect. Sadly, while it would seem at the very least, that adopting weighted funding might improve transparency and equity of funding across schools or districts, that’s not necessarily the case either.

My own research finds that districts adopting weighted funding formulas have not necessarily done any better than districts using other budgeting methods when it comes to targeting financial resources on the basis of student needs. See the Baker and Elmer (2009) and Baker (2009) references below.

Petrilli and Roza’s Weighted Funding recommendation for “stretching” the dollar is strange at best. As a recommendation to state policymakers, adoption of weighted funding provides few options for “stretching” the dollar, but may provide a mechanism for diverting districts’ local revenues to support choice programs (potentially reducing state support for those programs).

As a recommendation to local school district officials, adoption of weighted funding really provides no options for “stretching” the dollar, and may, in fact, increase the centralized bureaucracy required to develop and manage the complex system of decentralized budgeting that accompanies WSF (again, see the references below).


No savings?

No improvements to equity?

No evidence of improved efficiency?

What then, does WSF have to do with “stretching” the school dollar?

Baker, B.D., Elmer, D.R. (2009) The Politics of Off‐the‐Shelf School Finance Reform. Educational Policy 23 (1) 66‐105

Baker, B.D. (2009) Evaluating Marginal Costs with School Level Data: Implications for the Design of Weighted Student Allocation Formulas. Education Policy Analysis Archives 17 (3)

Savings from Small Districts and Schools

I am one who believes in creating savings through consolidation of unnecessarily small schools and school districts. And, at the school or district level, some sizeable savings can be achieved by reorganizing schools into more optimal size configurations (elementary schools of 300 to 500 students and high schools of 600 to 900, for example; see Andrews, Duncombe and Yinger).

For other research on the extent to which consolidation can help cut costs, see Does School District Consolidation Cut Costs, also by Bill Duncombe and John Yinger (the leading experts on this stuff).

Now, Petrilli and Roza, however, seem to imply that the savings from these consolidations or simply from starving the small schools and districts can perhaps help states to sustain the big districts – STRETCHING that small school dollar. Note that Petrilli and Roza ignore entirely the possibility that some of these small schools and districts (in states like Wyoming, western Kansas, Nebraska) might actually have no legitimate consolidation options. Kill them all! Get rid of those useless small schools and districts, I say!

Here’s the thing about de-funding small schools and districts to save big ones. The total amount of money often is not much… BECAUSE THEY ARE SMALL SCHOOLS!!!!!  I learned this while working in Kansas, a state which arguably substantially oversubsidizes small rural school districts, creating significant inequities between those districts and some of the state’s large towns and cities with high concentrations of needy students. While the inequity can (and should) be reduced, the savings don’t go very far.

So, let’s say we have 6 school districts serving 100 kids each, and spending $16,000 per pupil to do so. Let’s say we can lump them all together and make them produce equal outcomes for only $10,000 per pupil. A bold, bold assumption. We just saved $6,000 per pupil (really unlikely), across 600 pupils. That’s not chump change… it’s $3,600,000 (okay… in most state budgets that is chump change).

So, now let’s take this savings, and give it to the rest of the kids in the state – oh – about 400,000. Well, we just got ourselves about $9 per pupil. Even if we try to save the mid-sized city district of 50,000 students down the road, it’s about $72 per pupil. That is something. And if we can achieve that, then fine. But slashing small districts and schools to save big, or even average ones, usually doesn’t get us very far. BECAUSE THEY ARE SMALL! GET IT! SMALL DISTRICTS WITH SMALL BUDGETS!
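The back-of-the-envelope arithmetic in the two paragraphs above checks out, and can be written down directly (using the text’s own stipulated numbers, including the “bold, bold” assumption of a $6,000-per-pupil savings):

```python
# Checking the consolidation arithmetic from the text: six 100-pupil
# districts consolidated from $16,000/pupil down to $10,000/pupil.

districts = 6
pupils_each = 100
savings_per_pupil = 16_000 - 10_000      # the "bold, bold" assumption

total_savings = districts * pupils_each * savings_per_pupil
print(total_savings)                      # 3600000

# Spread the savings across other pupils:
statewide = total_savings / 400_000       # ~$9 per pupil statewide
mid_city = total_savings / 50_000         # ~$72 per pupil for a 50,000-pupil district
print(round(statewide), round(mid_city))  # 9 72
```

The point stands on its own: the savings pool is small because the districts are small, so it dilutes to almost nothing when spread across a state.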

Similar issues apply to elimination of very small schools in large urban districts. It’s an appropriate strategy – balancing and optimizing enrollment (reorganizing those too-small high schools created as a previous Gates-funded reform?). It should be done. But unless a district is a complete mess of tiny, poorly organized schools, the savings aren’t likely to go that far.

Let’s also remember that major reconfiguration of school level enrollments will require significant up front capital expense! Yep, here we are again with a significant increased expense in the short-term. Duncombe and Yinger discuss this in their work. Strangely, this slips right past Petrilli and Roza.

Use Census Based Funding for Special Education

So, what Petrilli and Roza are arguing here is that states could somehow save money by allocating their special education funding to school districts on an assumption that every school district has a constant share of its enrollment that qualifies for special education programs. Those districts that presently have more? Well, they’ve just been classifying every kid they can find so they can get that special education money. This flat-funding policy will bring them into line… and somehow “stretch” that dollar.

Let’s say we assume that every district has 16% (Pennsylvania) or 14.69% (New Jersey) of children qualifying for special education. Let’s say we pick some number, like these, that is about the current average special education population. Our goal is really to reduce the money flowing to those districts that have higher-than-average rates. Of course, if we pick the average, we’ll be reducing money to the districts with higher rates and increasing money to the districts with lower rates and you know what – WE’LL SPEND ABOUT THE SAME IN SPECIAL EDUCATION AID! “Stretching” how?

And will we have accomplished anything close to logical? Let’s see, we will have slammed those districts that have been supposedly over-identifying kids for decades just to get more special ed aid. That, of course, must be good.

BUT, we will also be providing aid for 14.69% of kids to districts that have only 7% or 8% children with disabilities. Funding on a census basis or flat basis requires that we provide excess special education aid to many districts – unless we fund all districts as if they have the same proportion of special education kids as the district with the fewest special education kids. That is, simply cut special education aid to all districts except the one that currently receives the least.
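A minimal sketch makes the point above concrete. The districts, enrollments, identification rates, and per-pupil aid amount here are all hypothetical: funding every district at the average identification rate shifts aid away from high-need districts and toward low-need ones, while leaving the total spent essentially unchanged.

```python
# Hypothetical illustration of flat/census-based special education funding:
# same total aid, redistributed from high-identification districts to low.

AID_PER_IDENTIFIED = 10_000  # hypothetical aid per identified pupil

# (enrollment, actual identification rate) for four hypothetical districts
districts = [(1000, 0.20), (1000, 0.16), (1000, 0.10), (1000, 0.08)]

avg_rate = sum(r for _, r in districts) / len(districts)   # 0.135

# Aid based on actual counts vs. aid assuming every district is average:
actual_aid = sum(n * r * AID_PER_IDENTIFIED for n, r in districts)
census_aid = sum(n * avg_rate * AID_PER_IDENTIFIED for n, _ in districts)

print(round(actual_aid), round(census_aid))  # 5400000 5400000
```

Only the distribution changes: the 20% district loses aid and the 8% district gains aid it does not need, which is exactly the inequity described above.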

How is that smart “stretching?”

The only way to “save” money with this recommendation is simply to “cut funding” and “cut services.” And, unless cut to the bare minimum, the “flat allocation” strategy requires choosing to “overfund” some districts while “underfunding” others. One might try to argue that this policy change would at least reduce further growth in special ed populations. But the article below suggests that this is not likely the case either. The resulting inequities significantly offset any potential benefits.

There exist a multitude of problems with flat, or census-based special education funding, which have led to declining numbers of states moving in this direction in recent years, New Jersey being an exception. I discuss this with co-authors Matt Ramsey and Preston Green in our forthcoming chapter on special education finance in the Handbook on Special Education Policy Research.

Of course, there also exists the demographic reality that children with disabilities are simply not distributed evenly across cities, towns and rural areas within states, leading to significant inequities when using Census Based funding. CB Funding is, in fact, the antithesis of Weighted Student Funding. How does one reconcile that?

For a recent article on the problems with the underlying assumptions of Census Based special education funding, see:

Baker, B.D., Ramsey, M.J. (2010) What we don’t know can’t hurt us? Evaluating the equity consequences of the assumption of uniform distribution of needs in Census Based special education funding. Journal of Education Finance 35 (3) 245‐275

Here’s a draft copy of our forthcoming book chapter on special education finance: SEF.Baker.Green.Ramsey.Final

Limit Time for ELL/LEP

This one is both absurd and obnoxious. Essentially, Petrilli and Roza argue that kids should be given a time limit to become English proficient and should not be provided supplemental programs or services – or at least the money for them – beyond that time frame. For example, a child might be funded for supplemental services for 2 years, and 2 years only. Some states have done this. Again, there is no clear basis for such cutoffs, nor is it clear how one would even establish the “right” time limit, or whether that time limit would somehow vary based on the level of language proficiency at the starting time.

Yes, this approach, like cutting special education funding, can be used to cut spending and to reduce the quality of services. But that’s all it is. It’s not “stretching” any dollar.

Other Stuff

Now, the brief does list other state policy options as well as other district practices. Some of these are rather mundane, typical ideas for “cost saving.” But, of course, no evidence or citation of actual cost-effectiveness, cost-benefit or cost-utility analysis is presented. Petrilli and Roza toss around ideas like a) pooling health care costs, b) redesigning sick leave policies or c) shifting health care costs to employees. These are the kind of things that are often on the table anyway.

I fail to see how this new policy brief provides any useful insights in this regard. Some actual cost-benefit analysis would be the way to go. As a guide for such analyses, I recommend Henry Levin and Patrick McEwan’s book on Cost Effectiveness Analysis in Education.

There are a handful of articles available on the topic of incentives associated with varied sick leave policies, including THIS ONE, School District Leave Policies, Teacher Absenteeism, and Student Achievement, by Ron Ehrenberg of Cornell (back in 1991).

One category I might have included above is that at least two of the recommendations embedded in the report argue for stretching the school dollar, so-to-speak, by effectively taxing school employees. That is, setting up a pension system that requires greater contribution from teacher salaries, and doing the same for health care costs. This is a tax – revenue generating (or at least a give-back). This is not stretching an existing dollar. This is requiring the public employees, rather than the broader pool of taxpayers (state and/or local), to pay the additional share. One could also classify it as a salary cut. But Petrilli and Roza have already proposed salary cuts in half of the other recommendations. Just say it. Hey… why not just take the “master’s bump” money and use that to pay for pensions and health care? No one will notice it’s even gone? We all know it was wasted and unnoticed to begin with.

I was particularly intrigued by the entirely reasonable point that school districts should NOT make harmful cuts by narrowing their curriculum. I was intrigued by this point because this is precisely what Marguerite Roza has been arguing that poor districts MUST do in order to achieve minimum standards within their existing budgets. I wrote about this issue previously HERE. It is an interesting, but welcome, about-face to see Roza no longer arguing that poor, resource-constrained school districts should dump all but the basics (while other districts, with more advantaged student populations and more adequate resources, need not do the same).

Utter lack of sources/evidence for any/all of this junk

Finally, I encourage you to explore the utter lack of support (or analysis) that the policy brief provides for any/all of its recommendations. It won’t take much time or effort. Read the footnotes. They are downright embarrassing, and in some cases infuriating. At the very least, they border on THINK TANKY MALPRACTICE.

There is a reference to the paper by Dan Goldhaber simulating seniority based layoffs, but that paper provides no analysis of cost/benefit, the central premise of the dollar stretching brief. The Petrilli/Roza (not Goldhaber) assumption is simply that the results will be good, and because we are firing more expensive teachers, it will cost less to get those good results.

The policy brief makes a reference to “typical teacher contracts” (FN2) regarding sick leave, with no citation… no supporting evidence, and phrased rather offensively (18 weeks a year off? For all teachers? Everywhere! OMG???)

FN2: Typical U.S. teacher contracts are for 36.5 weeks per year and include 2.5 weeks sick and personal days for a total work year of 34 weeks, or 18 weeks time off.

The brief refers to work by NCTQ (not the strongest “research” organization) for how to restructure teacher pay.

The report self-cites The Promise of Cafeteria Style Pay (by Roza, non-peer reviewed… schlock), and makes a bizarre generalized attack in footnote 5 that school districts uniformly defend the use of non-teaching staff as substitutes (no evidence/source provided).

FN5: Districts requiring non-teaching staff to serve as substitutes argue that it is good practice to have all staff in classrooms at least a few days a year.

The brief cites policy reports (and punditry) on pension gaps (including the Pew Center report), and those reports refer to alternative plans for closing gaps over time. These are important issues, but the question of how this “stretches” the school dollar is noticeably absent.

And that’s it. That’s the entire extent of “research” and “evidence” used to support this policy brief.

The problem? Cheerleading and Ceramics, of course!

David Reber with the Topeka Examiner had a great post a while back (April, 2010) addressing the deceptive logic that we should be outraged by supposed exorbitant spending on things like cheerleading and ceramics, and not worry so much about the little things, like disparities between wealthy and poor school districts. I finally saw this post today, from a tweet, and realized I had not yet blogged on this topic.

This logic/argument comes from the “research” of Marguerite Roza, who, well, has a track record of making such absurd arguments in an effort to place blame on poor urban districts and take attention away from disparities between poor urban districts and their more affluent suburban neighbors.

This new argument is really just more of the same ol’ flimsy logic from this crew. For the past several years, Roza and colleagues have attempted to argue that states have largely done their part to fix inequities in funding between school districts, and that now, the burden falls on local public school districts to clean up their act. Here’s an excerpt from one of my recent articles on this topic:

On other occasions, Roza and Hill have argued that persistent between-district disparities may exist but are relatively unimportant. Following a state high court decision in New York mandating increased funding to New York City schools, Roza and Hill (2005) opined: “So, the real problem is not that New York City spends some $4,000 less per pupil than Westchester County, but that some schools in New York [City] spend $10,000 more per pupil than others in the same city.” That is, the state has fixed its end of the system enough.

This statement by Roza and Hill is even more problematic when one dissects it more carefully. What they are saying is that the average of per pupil spending in suburban districts is only $4,000 greater than spending per pupil in New York City but that the difference between maximum and minimum spending across schools in New York City is about $10,000 per pupil. Note the rather misleading apples-and-oranges issue. They are comparing the average in one case to the extremes in another.

In fact, among downstate suburban[1] New York State districts, the range of between-district differences in 2005 was an astounding $50,000 per pupil (between the small, wealthy Bridgehampton district at $69,772 and Franklin Square at $13,979). In that same year, New York City as a district spent $16,616 per pupil, while nine downstate suburban districts spent more than $26,616 (that is, more than $10,000 beyond the average for New York City). Pocantico Hills and Greenburgh, both in Westchester County (the comparison County used by Roza and Hill), spent over $30,000 per pupil in 2005.[2] These numbers dwarf even the purported $10,000 range within New York City (a range that we agree is presumptively problematic); our conclusion based on this cursory analysis is that the bigger problem likely remains the between-district disparity in funding.

My article (with Kevin Welner) goes on to show how states have far from resolved between-district disparities and that New York State in particular has among the most substantial persistent disparities between wealthy and poor school districts. For more information on persistent between-district disparities that really do exist, see: Is School Funding Fair?.

I have a forthcoming paper this spring where I begin to untangle the new argument about poor urban districts really having plenty of money but simply wasting it on cheerleading and ceramics. Here’s a draft of a section of the introduction to that paper:

A handful of authors, primarily in non-peer-reviewed and think tank reports, posit that poor urban school districts have more than enough money to achieve adequate student outcomes and simply need to reallocate what they have toward improving achievement on tested subject areas. These authors, including Marguerite Roza and colleagues of the Center for Reinventing Public Education, encourage public outrage that any school district not presently meeting state outcome standards would dare to allocate resources to courses like ceramics or activities like cheerleading. To support their argument, the authors provide anecdotes of per pupil expense on cheerleading being far greater than per pupil expense on core academic subjects like math or English.

Imagine a high school that spends $328 per student for math courses and $1,348 per cheerleader for cheerleading activities. Or a school where the average per-student cost of offering ceramics was $1,608; cosmetology, $1,997; and such core subjects as science, $739.[1]

These shocking anecdotes, however, are unhelpful for truly understanding resource allocation differences and reallocation options. For example, the major reason why cheerleading or ceramics expenses per pupil are highest is the relatively small class sizes, compared to those in English or math. In total, the funds allocated to either cheerleading or ceramics are unlikely to have much if any effect if redistributed to reading or math.
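The small-denominator effect described above is easy to see with numbers. The instructor cost and group sizes here are hypothetical, not figures from Roza’s anecdotes: the same section cost looks several times larger per pupil when divided across a small squad rather than a full class.

```python
# Hypothetical illustration: identical section cost, very different
# per-pupil figures, driven entirely by the number of students served.

instructor_cost = 12_000  # hypothetical cost of staffing one course/activity

math_section = instructor_cost / 30   # $400 per math student
cheer_squad = instructor_cost / 9     # ~$1,333 per cheerleader

print(round(math_section), round(cheer_squad))  # 400 1333
```

The shocking ratio reflects the denominator, not a pot of money big enough to matter if reallocated.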

Further, the requirement that poor urban (or other) districts currently falling below state outcome standards must re-allocate any and all resources from co-curricular and extracurricular activities toward improving achievement on tested outcomes may increase inequities in the depth and breadth of curricular offerings between higher and lower poverty schools – inequities that may be already quite substantial. That is, it may already be the case that higher poverty districts and those facing greater resource constraints are reallocating resources toward core, tested areas of curriculum and away from more advanced course offerings which extend beyond the tested curriculum and enriched opportunities including both elective courses and extracurricular activities.  Some evidence on this point already exists.

The perspective that low performing districts merely need to reallocate what they already have is particularly appealing in the current fiscal context, where state budgets and aid allocations to local public school districts are being slashed. Accepting Roza’s logic, states under court mandates or in the shadows of recent rulings regarding educational adequacy, but facing tight budgets may simply argue that high poverty and/or low performing districts should shift all available resources into the teaching of core, tested subjects. Lower poverty districts with ample resources that exceed minimum outcome standards face no such reallocation obligations, leading to substantial differences in depth and breadth of curriculum. Arguably a system that is both adequate and fair would protect the availability of deep and broad curriculum while simultaneously attempting to improve narrowly measured outcomes.

More later as this research progresses.

[1] “Downstate Suburban” refers to areas such as Westchester County and Long Island and is an official regional classification in the New York State Education Department Fiscal Analysis and Research Unit Annual Financial Reports data, which can be found here: and

[2] Interestingly, however, Bridgehampton and New York City have relatively similar “costs” due to Bridgehampton’s small size and New York City’s high student needs (see Duncombe and Yinger, 2009). The figures offered in this paragraph are based on Total Expenditures per Pupil from State Fiscal Profiles 2005. Results are similar when comparing current operating expenditures per pupil.

The Circular Logic of Quality-Based Layoff Arguments

Many pundits are responding enthusiastically to the new LA Times article on quality-based layoffs – or how dismissing teachers based on value-added scores rather than on seniority would have saved LAUSD many of its better teachers, rather than simply saving its older ones.

Some are pointing out that this new LA Times report is the “right” way to use value-added, as compared with the “wrong” way that the LA Times had used the information earlier this year.

Recently, I explained the problematic circular logic being used to support these “quality-based layoff” arguments. Obviously, if we dismiss teachers based on “true” quality measures, rather than experience which is, of course, not correlated with “true” quality measures, then we save the jobs of good teachers and get rid of bad ones. Simple enough? Not so. Here’s my explanation, once again.

This argument draws on an interesting thought piece and simulation (Teacher Layoffs: An Empirical Illustration of Seniority vs. Measures of Effectiveness), which was later summarized in a (less thoughtful) recent Brookings report.

That paper demonstrated that if one dismisses teachers based on VAM, future predicted student gains are higher than if one dismisses teachers based on experience (or seniority). The authors point out that less experienced teachers are scattered across the full range of effectiveness – based on VAM – and therefore, dismissing teachers on the basis of experience leads to dismissal of both good and bad teachers – as measured by VAM. By contrast, teachers with low value-added are invariably – low value-added – BY DEFINITION. Therefore, dismissing on the basis of low value-added leaves more high value-added teachers in the system – including more teachers who show high value-added in later years (current value added is more correlated with future value added than is experience).

It is assumed in this simulation that VAM (based on a specific set of assessments and model specification) produces the true measure of teacher quality both as basis for current teacher dismissals and as basis for evaluating the effectiveness of choosing to dismiss based on VAM versus dismissing based on experience.

The authors similarly dismiss principal evaluations of teachers as ineffective because they too are less correlated with value-added measures than value-added measures with themselves.

Might I argue the opposite? – Value-added measures are flawed because they only weakly predict which teachers we know – by observation – are good and which ones we know are bad? A specious argument – but no more specious than its inverse.

The circular logic here is, well, problematic. Of course if we measure the effectiveness of the policy decision in terms of VAM, making the policy decision based on VAM (using the same model and assessments) will produce the more highly correlated outcome – correlated with VAM, that is.

However, it is quite likely that if we simply used different assessment data or a different VAM model specification to evaluate the results of the alternative dismissal policies, we might find neither VAM-based dismissal nor experience-based dismissal better or worse than the other.

For example, Corcoran and Jennings conducted an analysis of the same teachers on two different tests in Houston, Texas, finding:

…among those who ranked in the top category (5) on the TAKS reading test, more than 17 percent ranked among the lowest two categories on the Stanford test. Similarly, more than 15 percent of the lowest value-added teachers on the TAKS were in the highest two categories on the Stanford.

  • Corcoran, Sean P., Jennifer L. Jennings, and Andrew A. Beveridge. 2010. “Teacher Effectiveness on High- and Low-Stakes Tests.” Paper presented at the Institute for Research on Poverty summer workshop, Madison, WI.

So, what would happen if we did a simulation of “quality based” layoffs versus experience-based layoffs using the Houston data, where the quality-based layoffs were based on a VAM model using the Texas Assessments (TAKS), but then we evaluate the effectiveness of the layoff alternatives using a value-added model of Stanford achievement test data? Arguably the odds would still be stacked in favor of VAM predicting VAM – even if different VAM measures (and perhaps different model specifications). But, I suspect the results would be much less compelling than the original simulation.

The results under this alternative approach may, however, be reduced entirely to noise – meaning that the VAM-based layoffs would be the equivalent of random firings – drawn from a hat and poorly if at all correlated with the outcome measure estimated by a different VAM – as opposed to experience-based firings. Neither would be a much better predictor of future value-added. But for all their flaws, I’d take the experience-based dismissal policy over the roll-of-the-dice, randomized firing policy any day.
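The circularity is easy to see in a toy simulation. All numbers below are made up purely for illustration: a latent teacher quality, two equally noisy value-added estimates of it derived from two different tests, and seniority unrelated to quality. Ranking layoffs by one test’s VAM and then judging the result with that same VAM makes the policy look far better than judging it with the other test’s VAM does:

```python
import random

random.seed(1)
N = 1000  # teachers

teachers = []
for _ in range(N):
    q = random.gauss(0, 1)              # latent "true" quality
    vam_a = q + random.gauss(0, 1.5)    # noisy VAM estimate from test A
    vam_b = q + random.gauss(0, 1.5)    # equally noisy VAM from test B
    seniority = random.random()         # unrelated to quality in this sketch
    teachers.append((q, vam_a, vam_b, seniority))

def survivor_mean(rank_key, eval_metric):
    """Lay off the bottom 10% ranked by rank_key; return the mean of
    eval_metric among the surviving 90%."""
    survivors = sorted(teachers, key=rank_key)[N // 10:]
    return sum(eval_metric(t) for t in survivors) / len(survivors)

# Judge VAM-A-based layoffs by VAM-A itself (the circular comparison)...
circular = survivor_mean(lambda t: t[1], lambda t: t[1])
# ...judge the same layoffs by a different test's VAM...
cross_test = survivor_mean(lambda t: t[1], lambda t: t[2])
# ...and judge seniority-based layoffs by that second VAM.
by_seniority = survivor_mean(lambda t: t[3], lambda t: t[2])

print(round(circular, 3), round(cross_test, 3), round(by_seniority, 3))
```

The apparent “gain” from VAM-based layoffs shrinks sharply the moment the yardstick is anything other than the measure used to make the cuts – which is precisely the problem with evaluating a policy against itself.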

In the case of the LA Times analysis, the situation is particularly disturbing if we look back on some of the findings in their own technical report.

I explained in a previous post that the LA Times value-added model had potentially significant bias in its estimates of teacher quality. For example, in my earlier post, I explain that:

Buddin finds that black teachers have lower value-added scores for both ELA and MATH. Further, these are some of the largest negative effects in the second level analysis – especially for MATH. The interpretation here (for parent readers of the LA Times web site) is that having a black teacher for math is worse than having a novice teacher. In fact, it’s the worst possible thing! Having a black teacher for ELA is comparable to having a novice teacher.

Buddin also finds that having more black students in your class is negatively associated with teacher’s value-added scores, but writes off the effect as small. Teachers of black students in LA are simply worse? There is NO discussion of the potentially significant overlap between black teachers, novice teachers and serving black students, concentrated in black schools (as addressed by Hanushek and Rivken in link above).

By contrast, Buddin finds that having an Asian teacher is much, much better for MATH. In fact, Asian teachers are as much better (than white teachers) for math as black teachers are worse! Parents – go find yourself an Asian math teacher in LA? Also, having more Asian students in your class is associated with higher teacher ratings for Math. That is, you’re a better math teacher if you’ve got more Asian students, and you’re a really good math teacher if you’re Asian and have more Asian students?????

One of the more intriguing arguments in the new LA Times article is that under the seniority based layoff policy:

Schools in some of the city’s poorest areas were disproportionately hurt by the layoffs. Nearly one in 10 teachers in South Los Angeles schools was laid off, nearly twice the rate in other areas. Sixteen schools lost at least a fourth of their teachers, all but one of them in South or Central Los Angeles.

That is, new teachers who were laid off based on seniority preferences were concentrated in high need schools. But so too were teachers with low value-added ratings?

While arguing that “far fewer” teachers would be laid off in high need schools under a quality-based layoff policy, the LA Times does not however offer up how many teachers would have been dismissed from these schools had their biased value-added measures been used instead? Recall that from the original LA Times analysis:

97% of children in the lowest performing schools are poor, and 55% in higher performing schools are poor.

Combine this finding with the findings above regarding the relationship between race and value-added ratings and it is difficult to conceive how VAM based layoffs of teachers in LA would not also fall disparately on high poverty and high minority schools. The disparate effect may be partially offset by statistical noise, but that simply means that some teachers in lower poverty schools will be dismissed on the basis of random statistical error, instead of race-correlated statistical bias (which leads to a higher rate of dismissals in higher poverty, higher minority schools).

Further, the seniority based layoff policy leads to more teachers being dismissed in high poverty schools because the district placed more novice teachers in high poverty schools, whereas the value-added based layoff policy would likely lead to more teachers being dismissed from high poverty, high minority schools, experienced or not, because they were placed in high poverty, high minority schools.

So, even though we might make a rational case that seniority based layoffs are not the best possible option, because they may not be highly correlated with true (not “true”) teaching quality, I fail to see how the current proposed alternatives are much if any better.  They only appear to be better when we measure them against themselves as the “true” measure of success.

Thought for the day…

Many will consider this blasphemy, but, I’ve been pondering lately:

If our best public and private schools are pretty good (perhaps even better than Finland?),

And, if the majority (not all, but most) of our best AND our worst public (and private) schools use salary schedules which base teacher compensation primarily on degrees/credits/credentials obtained and years of experience or service…

Can we really attribute the failures of our worst schools to these features of teacher compensation?

Yeah… there might be a better (more efficient and effective) way, but is this really the main problem?

When schools have money…

When schools and school districts have more money and spend more money, what do they spend it on?

We are told these days to believe that everything we thought about the virtues of small class size back in the 1990s was misguided. That improving teacher quality trumps reducing class size any day when it comes to efficiently improving student outcomes. We are told to believe that teacher quality can be improved at nominal cost, whereas achieving similar gains via class size reduction would be absurdly inefficient and very costly. Yet to date, we have little evidence that we can actually achieve the same measured outcome gains achieved by reducing class size in the 1990s by instead improving teacher quality… and that this can easily be done at lower cost. In fact, many go so far as to argue that we can take the same average teacher wage, and instead of paying older teachers more and younger teachers less, we just have to pay better teachers more and worse ones less – and that somehow this will lead to a new interest in teaching among our best and brightest college graduates. I struggle with the reasoning here, and certainly have not seen the evidence.

I am particularly skeptical that dramatically reducing the predictability and stability of career earnings, while not dramatically altering the average level of compensation, can result in any positive changes to teacher quality. This is especially true if higher teacher wages are tied to extremely noisy measures of teacher performance – making it difficult for a teacher to control, let alone predict, his/her career earnings.
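To see why noisy measures make earnings unpredictable, consider a back-of-the-envelope sketch. The salary, bonus, and noise figures below are entirely hypothetical, chosen only to illustrate the mechanism:

```python
import random

random.seed(7)

# A single teacher of exactly average true effectiveness. Her measured
# VAM score each year equals that effectiveness plus a large noise term
# (deliberately large, since year-to-year VAM correlations are modest).
true_effect = 0.0
base_salary = 50_000          # hypothetical figures throughout
bonus_per_sd = 10_000         # pay tied to measured, not true, performance

# 1000 draws of what this same teacher might be paid in a given year.
year_pay = sorted(
    base_salary + bonus_per_sd * (true_effect + random.gauss(0, 1))
    for _ in range(1000)
)

# Roughly the 5th and 95th percentiles of her possible pay that year.
low, high = year_pay[50], year_pay[-50]
print(round(low), round(high))
```

An exactly average teacher, doing nothing differently from year to year, could see her pay swing by tens of thousands of dollars on measurement noise alone.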

We do know from many older studies that improving wages can improve the teacher workforce:

  • Murnane and Olson (1989) find that salaries affect the decision to enter teaching and the duration of the teaching career.[1]
  • Figlio (1997, 2002) and Ferguson (1991) find that higher salaries are associated with better qualified teachers.[2]
  • Loeb and Page (1998, 2000) find that raising teacher wages by ten percent reduces high school dropout rates by between three and six percent and increases college enrollment rates by two percent.[3]

We also know that imposing strict spending limits on public schooling, thereby limiting the ability of public schools to pay competitive salaries can harm teacher quality over time:

  • Figlio and Rueben (2001) explain: “Using data from the National Center for Education Statistics we find that tax limits systematically reduce the average quality of education majors, as well as new public school teachers in states that have passed these limits.”[4]

Interest in teacher quality over class size reduction has grown so strong that some are beginning to make the leap that we should simply increase class size to 30 or even 35 students per class in order to pay enough to get really good teachers. After all, who can argue with the logic that a good teacher with 35 kids is better than a crappy one with 20 kids? Of course, this assumes falsely that every class of 35 would be taught by a better teacher, on average, than those teaching the classes of 20, because every teacher currently teaching the smaller classes is crappy. That said, we do have pretty consistent evidence that salary increases could increase teaching quality.

However, we also have at least some evidence that teacher quality and class size interact.  We may find that we are fighting a losing battle trying to recruit high quality teachers to teach classes of 35 kids even at the higher salary. This may especially be the case in schools and districts where large classes are particularly difficult to manage. Class size is a working condition and more desirable working conditions can reduce the need for paying higher salaries – another trade-off for which we have no good dollar to dollar estimates.

This brings me to my somewhat related data query for the day. When schools have money to spend, what do they spend it on? When looking at high spending suburban school districts, or looking at private independent schools, which I have referred to elsewhere as “luxury” schooling, what are their defining attributes?

Arguably, the defining attributes of luxury schooling simply reflect the demands of luxury schooling consumers – residents of high spending affluent suburban communities and parents who send their children to private independent schools. Go to nearly any private school web site – or any web site on “why you should choose a private school” – and you will find one item at the top of the list nearly every time: Small Class Size, or Individual Attention (an alternative angle on what? small class size!). But this is irrational, right? Why should affluent suburban consumers or private school parents prefer something that simply drives up the price, and with diminishing marginal returns? Whatever the reasons, they do, and arguably lower pupil to teacher ratios and smaller class sizes are a, if not the, defining feature of “luxury” schooling.

First, here are the per pupil spending levels in select labor markets, for private schools by type and for public schools in the same labor market.

FIGURE 1 : Per Pupil Spending of Private and Public Schools by Labor Market

Private independent schools in particular, systematically outspend public schools in the same labor market by about 2/1!

And this graph shows the pupil to teacher ratios for public schools, all private schools, private Catholic schools and private independent schools. Private independent schools spend double what public schools spend, and leverage most of that money to provide pupil to teacher ratios that are approximately half those of the public schools (teacher salaries are similar to slightly lower than public school salaries).

FIGURE 2: Pupil to Teacher Ratios of Private and Public Schools

Now, this graph shows the per pupil state and local revenues of public school districts in the NY metropolitan area, by district poverty rates. In New York State, as we have shown elsewhere, higher poverty districts have systematically fewer resources than their lower poverty, often very affluent suburban neighbors. This graph validates that pattern.

FIGURE 3: Per pupil Revenues of New York Metro Area Districts (in NY State) by Poverty

Now here are the elementary class sizes by district spending group. Note that as spending per pupil increases, class sizes systematically decrease.

FIGURE 4: Spending and Elementary Class Size

The same pattern holds for middle and secondary class sizes.

FIGURE 5: Spending and Middle/Secondary Class Size

Note that at least some of the smaller class size at the middle/secondary level in the highest spending public school districts is a function of providing a diverse set of specialized elective courses, advanced placement classes, multiple languages and so on. The same is true for private independent schools. These are opportunities that many lower spending and/or higher poverty districts in many states go without.

Yes, consumers of luxury schooling seem to have a pretty strong preference for small classes, despite modern wisdom that class size is clearly second fiddle to teaching quality. Imagine the teacher salaries one could pay by moving pupil to teacher ratios in independent schools from 8/1 up to the public school average of 16/1. Imagine the salaries that could be paid in affluent Westchester County and Long Island school districts by increasing class sizes from 16 or 18 up to 35? (see this post on just how high these salaries already are!)

For some reason these private schools and affluent public school districts – more specifically, those who support these schools – exhibit a strong preference for small class size even when given wide latitude to choose differently. Perhaps they are on to something?


[1] Richard J. Murnane and Randall Olsen (1989) The effects of salaries and opportunity costs on length of stay in teaching: Evidence from Michigan. Review of Economics and Statistics 71 (2) 347-352.
[2] David N. Figlio (1997) Teacher Salaries and Teacher Quality. Economics Letters 55, 267-271. David N. Figlio (2002) Can Public Schools Buy Better-Qualified Teachers? Industrial and Labor Relations Review 55, 686-699. Ronald Ferguson (1991) Paying for Public Education: New Evidence on How and Why Money Matters. Harvard Journal on Legislation 28 (2) 465-498.

[3] Susanna Loeb and Marianne Page (2000) Examining the link between teacher wages and student outcomes: the importance of alternative labor market opportunities and non-pecuniary variation. Review of Economics and Statistics 82, 393-408. Susanna Loeb and Marianne Page (1998) Examining the link between wages and quality in the teacher workforce. Department of Economics, University of California, Davis.

[4] David N. Figlio and Kim S. Rueben (2001) Tax limits and the qualifications of new teachers. Journal of Public Economics 80 (1) 49-71.

Intellectual Pathologies of the Reformy World (Kevin vs. Kevin)

Yesterday, a colleague and coauthor on two recent articles – Kevin Welner (U. of Colorado) – wrote a scathing critique of the manifesto on fixing urban schools that was released last week by several large city superintendents.

Kevin Welner’s commentary can be found here.

The manifesto can be found here.

Kevin Carey notes in his critique of Kevin Welner:

I highlight this because it’s crucial to understanding the worst intellectual pathologies of the education establishment. People like Welner don’t just think that Joel Klein, Michele Rhee, Andres Alonso, and Arlene Ackerman are making bad decisions in the course of helping poor children learn. Welner believes that by asserting that poor children can learn, the superintendents are hurting the cause of making poor children less poor. While many people believe this, most choose not to say it so clearly.

I urge you to take a look at what Kevin Welner actually said in his commentary. The centerpiece of Kevin Welner’s argument was that the superintendents and others behind the manifesto were making a strong sales pitch for fast-tracking education reform strategies for which the research base is mixed at best. Kevin Welner asks:

Are these adults acting responsibly when they advocate for even more test-based accountability and school choice? Over the past two decades, haven’t these two policies dominated the reform landscape – and what do we have to show for it? Wouldn’t true reform move away from what has not been working, rather than further intensifying those ineffective policies? Are they acting responsibly when they promote unproven gimmicks as solutions?

Are they acting responsibly when they do not acknowledge their own role in failing to secure the opportunities and resources needed by students in their own districts, opting instead to place the blame on those struggling in classrooms to help students learn?

And Kevin Welner summarizes the manifesto as follows:

Move money from neighborhood schools to charter schools!
Make children take more tests!
Move money from classrooms to online learning!
Blame teachers and their unions – make them easier to fire!
Tie teacher jobs and salaries to student test scores!


None – literally NONE – of these gimmicks is evidence-based.

I tend to agree that the findings on expansion of charters are mixed at best, and that tying teacher ratings to test scores is deeply problematic. Perhaps what irked Kevin Carey most here is that he has convinced himself, through exceedingly flimsy logic, that he, Kevin Carey, is right, and that the other Kevin, Kevin Welner, is simply wrong on these points. Allow me to bring you back to a series of recent comments by Kevin Carey that display his completely distorted understanding of research on charters (and its implications for policy) and the usefulness of value-added modeling to rate teachers.

Kevin Carey on Charters

Here’s a recent quote from Kevin Carey, attacking the civil rights framework on whether the evidence supports expansion of charter schools.

Here’s the problem: the contention that charters have “little or no evidentiary support” rests on studies finding that the average performance of all charters is generally indistinguishable from the average regular public school. At the same time, reasonable people acknowledge that the best charter schools–let’s call them “high-quality” charter schools–are really good, and there’s plenty of research to support this.

I have noted previously, here, that I find this to be one of the most patently stupid arguments I’ve seen in a long time.

To put it in really simple terms:


Good schools outperform average ones. Really?

Why should that be any different for charter schools (accepting a similar distribution) that have a similar average performance to all schools?

This is absurd logic for promoting charter schools as some sort of unified reform strategy – saying… we want to replicate the best charter schools (not that other half of them that don’t do so well).

Yes, one can point to specific analyses of specific charter models adopted in specific locations and identify them as particularly successful. And, we might learn something from these models which might be used in new charter schools or might even be used in traditional public schools.

But the idea that “successful charters” (the upper half) are evidence that charters are “successful” is just plain silly.

Kevin Carey on Value-Added Teacher Ratings

In the New York Times Room for Debate series on value-added measurement of teachers, Carey argued that value-added measures would protect teachers from favoritism. Principals would no longer be able to go after certain teachers based on their own personal biases. Teachers would be able to back up their “real” performance with hard data. Here’s a quote:

“Value-added analysis can protect teachers from favoritism by using hard numbers and allow those with unorthodox methods to prove their worth.” (Kevin Carey, here)

The reality is that value-added measures simply create new opportunities to manipulate teacher evaluations through favoritism. In fact, it might even be easier to get a teacher fired by making sure the teacher has a weak value-added scorecard. Because value-added estimates are sensitive to non-random assignment of students, principals can easily manipulate the distributions of disruptive students, students with special needs, students with weak prior growth and other factors, which, if not fully accounted for by the VA model will bias teacher ratings. More here!

Kevin Carey also claims as a matter of accepted fact, that VA measures “level the playing field for teachers who are assigned students of different ability.” This statement, as a general conclusion, is wrong.

  1. VA measures do account for the initial performance level of individual students, or they would not be VA measures. Even this becomes problematic when measures are annual rather than fall/spring, so that summer learning loss is included in the year to year gain. An even more thorough approach for reducing model bias is to have multiple years of lagged scores on each child, in order to estimate the extent to which a teacher can change a child’s trajectory (growth curve). That makes it more difficult to evaluate 3rd or 4th grade teachers, for whom many lagged scores aren’t yet available. The LA Times model may have had multiple years of data on each teacher, but it didn’t have multiple lagged scores on each child. All the LA Times approach does is generate a more stable measure for a teacher, even if it is merely a stable measure of the bias arising from the students that teacher is typically assigned.
  2. VA measures might crudely account for socio-economic status, disability status or language proficiency status, which may also affect learning gains. But typical VA models, like the LA Times model by Buddin, tend to use relatively crude, dichotomous proxies/indicators for these things. They don’t effectively capture the range of differences among kids. They don’t capture numerous potentially important, unmeasured differences. Nor do they typically capture classroom composition – peer group – effects, which have been shown to be significant in many studies, whether measured by the racial/ethnic/socioeconomic composition of the peer group or by the average performance of the peer group.
  3. For students who have more than one teacher across subjects (and/or teaching aides/assistants), each teacher’s VA measures may be influenced by the other teachers serving the same students.

I could go on, but recommend revisiting my previous posts on the topic where I have already addressed most of these concerns.

Intellectual pathologies?  Pot… kettle?