Blog

Stretching Truth, Not Dollars?

This week, Mike Petrilli (Thomas B. Fordham Institute) and Marguerite Roza (Gates Foundation) released a “policy brief” identifying 15 ways to “stretch” the school dollar. Presumably, what Petrilli and Roza mean by stretching the school dollar is finding ways to cut spending while either not harming educational outcomes or actually improving them. With that goal in mind, it’s pretty darn hard to see how any of the 15 proposals would lead to progress toward it.

The new policy brief reads like School Finance Reform in a Can. I’ve written previously about what I called Off-the-Shelf school finance reforms, which are quick and easy – generally ineffective and meaningless, or potentially damaging – revenue-neutral school finance fixes. In this new brief, Petrilli and Roza have pulled out all the stops. They’ve generated a list, which could easily have been generated by a random search engine scouring “reformy” think tank websites, excluding any ideas actually supported by research literature.

The policy brief includes some introductory ramblings about district-level practices for “stretching” the school dollar, but its focus is on state policies that can stretch the school dollar at the state level and give local districts greater latitude to do the same. I will focus my efforts on the state policy list.

Here’s the state policy recommendation list:

1. End “last hired, first fired” practices.

2. Remove class-size mandates.

3. Eliminate mandatory salary schedules.

4. Eliminate state mandates regarding work rules and terms of employment.

5. Remove “seat time” requirements.

6. Merge categorical programs and ease onerous reporting requirements.

7. Create a rigorous teacher evaluation system.

8. Pool health-care benefits.

9. Tackle the fiscal viability of teacher pensions.

10. Move toward weighted student funding.

11. Eliminate excess spending on small schools and small districts.

12. Allocate spending for learning-disabled students as a percent of population.

13. Limit the length of time that students can be identified as English Language Learners.

14. Offer waivers of non-productive state requirements.

15. Create bankruptcy-like loan provisions.

This list can be lumped into four basic categories:

A) Regurgitation of “reformy” ideology for which there exists absolutely no evidence that the “reforms” in question lead to any improvement in schooling efficiency – that is, no evidence that these reforms either “cut costs” (reduce spending without reducing outcomes) or improve benefits (outcome effects).

  1. Creating a rigorous evaluation system
  2. Ending “last hired, first fired” practices
  3. Moving toward weighted student funding

B) Relatively common “money saving” ideas, backed by little or no actual cost-benefit analysis – the kind of stuff you’d be likely to read in a personal finance column in a magazine in a dentist’s office.

  1. Pool health-care benefits.
  2. Create bankruptcy-like loan provisions. (???)
  3. Tackle pensions
  4. Cut spending on small districts and schools (consolidate?)

C) Reducing expenditures on children with special needs by pretending they don’t exist.

  1. Allocate spending for learning-disabled students as a percent of population.
  2. Limit the length of time that students can be identified as English Language Learners.

D) Un-regulation

  1. eliminate class-size limits
  2. provide waivers for ineffective mandates
  3. eliminate seat time requirements
  4. merge categorical programs
  5. eliminate work rules
  6. eliminate mandatory salary schedules

So, let’s walk through a few of these in greater detail. Let’s address whether there is any evidence whatsoever that these policies a) would actually lead to reduced short run costs while not harming, or even improving outcomes, or b) are for any other reason a good idea.

Creating an Evaluation System

This likely requires significant up-front spending – heavy front-end investment to design the system and put it into place. Yes, increased, not decreased spending. And in the short term, while money is tight. AND, there is little or no evidence that what is being recommended – a Tennessee or Colorado-style teacher evaluation model (50% on value-added scores) – would actually reduce spending and/or improve outcomes. Rather, I could make a strong case that such a model will lead to exorbitant legal fees for the foreseeable future (I have a forthcoming law review article on this topic). The likelihood of achieving long-run benefits from these short-run expenses is questionable at best. In fact, the likelihood of significant harm seems equal if not greater (see my previous post on this topic: value-added teacher evaluation).

Ending “Last Hired, First Fired” layoff policies

In very crude terms, this approach might simply allow a district – or entire state – to lay off senior, higher-salary teachers. Yeah… that could reduce the payroll. Good policy? Really questionable! Of course, Petrilli and Roza also argue that we simply shouldn’t be paying teachers for experience or degrees anyway. So I guess if we did that, we wouldn’t generate savings from this recommendation. Silly me. One or the other, I guess.

Now, we could generate performance increases (at lower spending, if we keep seniority pay, or at constant spending if we don’t) if, and only if, the future actually plays out as simulated in the various performance-based layoff simulations which I and others have recently discussed. The assumptions in these simulations are bold (unrealistic), and much of the logic circular.

And then there are those short-term legal costs of defending the racially disparate firings, and random error firings.

Eliminating Class Size Limits

Yes, larger classes require less spending – on a per pupil basis. Smaller classes have greater benefit (greater “bang for the buck” shall we so boldly say) in higher poverty settings. A labor market dynamic problem realized in the late 1990s, when CA implemented statewide class size reduction, was that the policy stretched the pool of highly qualified teachers and ultimately made it even harder for high poverty schools to get high quality teachers (a dreadfully oversimplified and disputable version of the story).

Removing class size limits might be reasonable if only affluent districts agreed to increase their class sizes, putting more “high quality” teachers into the available labor pool… who might then be recruited into high poverty districts (another dreadfully oversimplified, if not absurd scenario).  But who really thinks it will play out this way? We already know that affluent school districts a) have strong preferences for very small class sizes and b) have the resources to retain those small class sizes or reduce them further. See Money and the Market for High Quality Schooling.

Eliminating mandatory salary schedules

It seems that in this recommendation, Petrilli and Roza are arguing against state policies that mandate the adoption by local public school districts of specific step and lane salary schedules. They really only provide one brief paragraph with little or no explanation regarding what the heck they are talking about.

I’ve personally never been much of a fan of state rigidity regarding locally negotiated agreements – at least in terms of steps and lanes. Many problems can occur where states enact policies as rigid as those of Washington State, where teachers statewide are on a single salary schedule.

The best work on this topic (and I’ve worked on the same topic with Washington data) is by Lori Taylor of Texas A&M, who shows that the Washington single salary schedule leads to non-competitive wages for teachers in metro areas generally, and for teachers in math and science in particular, relative to other career opportunities in those areas. The statewide salary schedule in Washington is arguably too rigid. Here’s a link to Taylor’s study:

Taylor, L. (2008) Washington Wages: An Analysis of Educator and Comparable Non-educator Wages in the State of Washington. Washington State Institute for Public Policy.

But this does not mean, by any stretch of the imagination, that removing this requirement would save money, or “stretch” the education dollar. It might allow bargaining units in metro areas in Washington to scale up salaries over time as the economy improves. And it might lead to some creative differentiation across negotiated agreements, with districts trying to leverage different competitive advantages over one another for teacher recruitment.

But, these competitive behaviors among districts may also lead to ratcheting of teacher salaries across neighboring bargaining units, and may lead to increased salary expense with small marginal returns (as clusters of districts compete to pay more for an unchanging labor pool). For an analysis of this effect, see Mike Slagle’s work on spatial relationships in teacher salaries in Missouri. In short, Slagle finds that changes to neighboring district salary schedules are among the strongest predictors of an individual district’s salary schedule. Ratcheting upward of salaries in neighboring districts is likely to lead to adjustment by each neighboring district (to the extent resources are available). Ratcheting downward does not tend to occur (not reported in this article).

Slagle, M. (2010) A Comparison of Spatial Statistical Methods in a School Finance Policy Context. Journal of Education Finance 35 (3)

[note: this article is a shortened version of Mike’s dissertation. The article addresses only the ratcheting of per pupil spending, but the full dissertation also addresses teacher salaries]

In any case, we certainly have no evidence that removing state level requirements for mandatory salary schedules would save money while holding outcomes harmless – hence improving efficiency. Like I said, I’m not a big fan of such restrictions either, but I have no delusion that removing them will save any district a ton of money – or any for that matter.

This recommendation seems to also be tied up in the notion that we shouldn’t be paying teachers for experience or degree levels anyway. Therefore, mandating as much would clearly be foolish. I’ve addressed this idea previously in The Research Question that Wasn’t Asked.

In addition, this recommendation seems to adopt the absurd assumption that we could immediately pay every teacher in the current system the bachelor’s degree base salary (okay, the salary of a teacher with a bachelor’s degree and 3 years of experience, where marginal test-score returns to experience fade); that we could immediately recapture all of the salary money dumped into differentiation by experience or degree; and that we could reap massive savings with absolutely no harm to the quality of schooling – or the quality of the teacher labor force – in the short run or the long term. Again, that’s the research question that was never asked. Previous estimates of all of the money wasted on the master’s degree salary “bump” are actually this crude.

For similarly absurd analysis by Marguerite Roza regarding teacher pay, see my previous post on “inventing research findings.”

Move toward Weighted Student Funding

Petrilli and Roza also advocate moving to Weighted Student Funding. They seem to argue that the “big” savings here will come from the ability of states and school districts to immediately take back funding as student enrollments decline. That is, a district in a state, or school in a district gets a certain amount per kid. If they lose the kid, they lose the money. This keeps us from wasting a whole lot of money on kids who aren’t there anymore.

Okay… Now… most state aid is allocated on a per pupil basis to begin with. And, in general, as enrollments fluctuate, state aid fluctuates. Lose a kid. Lose the state aid that is driven by that kid. Some states have recognized that the costs of providing education don’t actually decline linearly (or increase linearly) with changes in enrollment and have included safety valves to slow the rate of aid loss as enrollments decline. Such policies are reasonable.

Petrilli and Roza seem to be belligerently and ignorantly declaring that there is simply never a legitimate reason for a funding formula to include small school district or declining enrollment provisions. I have testified in court as an expert against such provisions when those provisions are completely “out of whack”, but would never say they are entirely unwarranted. That’s just foolish, and ignorant.

Local revenues in many states (and in many districts within states) still make up a large share of public school funding, and local revenues are typically derived from property taxes applied to the total taxable property wealth of the school district. As kids come and go, local revenues do not come and go. Say a tax levy of X% on the district’s assessed property values raises $8,000 per pupil. If enrollment declines but the total assessed value stays constant, the same levy raises more per pupil – perhaps $8,100. The district would lose state funding because it has fewer pupils (and perhaps also because it can generate a larger local share per pupil). But that’s really nothing new.
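The arithmetic here is worth spelling out. A minimal sketch, assuming a hypothetical district of 1,000 pupils shrinking to 988 (both enrollment figures are my own illustrative assumptions; only the $8,000-per-pupil figure comes from the example above):

```python
# Local levy revenue is fixed by assessed property value, not by enrollment,
# so the same levy yields more per pupil as enrollment falls.
# Enrollment figures below are assumed for illustration.
per_pupil_before = 8000
enrollment_before = 1000
total_levy = per_pupil_before * enrollment_before  # $8,000,000 either way

enrollment_after = 988  # fewer pupils, same assessed value
per_pupil_after = total_levy / enrollment_after

print(round(per_pupil_after))  # 8097 -- roughly the $8,100 in the example
```

The district hasn’t raised taxes or “found” new money; the denominator simply shrank, which is why per-pupil local revenue rising as enrollment falls is nothing new.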

There’s really no new “huge” savings to be had here.

UNLESS:

a) we are talking about kids moving to charter schools from the traditional public schools, and for each kid who moves to a charter school, we either require the district to pass along the local property tax share of funding associated with that child (Many states), or reduce state aid by the equivalent amount (Missouri).

b) there exists a property tax revenue limit tied specifically to the number of pupils served in the district (as in Wisconsin and other states) which then means that the district would have to reduce its local property taxes to generate only the per pupil revenue allowed. That’s not savings. It’s a state enforced local tax cut.

So then, why do Petrilli and Roza care about Weighted Student Funding as an option? The above two “Unless” scenarios are possible suspects. Blind reformy punditry regardless of logic is equally possible (WSF is cool… reformy… who cares what it does?).

It’s not really about “saving” money at all. Rather, it’s about creating mechanisms to enable local property tax revenues to be diverted in support of charter schools (even if the local taxpayers did not approve the charter), or to have local budgets forcibly reduced/capped when students opt-in to voucher programs (Milwaukee).

And this isn’t really a “weighted student funding” issue at all. In many states, it already works this way (WSF or not). Big savings? Perhaps an opportunity to reduce the state subsidy to charter schools by requiring greater local pass through – in those states where this doesn’t already occur. But these provisions face significant legal battles in some states. If a state is not already doing this, this policy change would also likely lead to significant up front legal expenses.

In fact, I can’t imagine a circumstance where adopting weighted student funding can be expected to either save money or improve outcomes for the same money. There’s simply no proof to this effect. Sadly, while it would seem at the very least, that adopting weighted funding might improve transparency and equity of funding across schools or districts, that’s not necessarily the case either.

My own research finds that districts adopting weighted funding formulas have not necessarily done any better than districts using other budgeting methods when it comes to targeting financial resources on the basis of student needs. See: http://epaa.asu.edu/ojs/index.php/epaa/article/view/5

Petrilli and Roza’s Weighted Funding recommendation for “stretching” the dollar is strange at best. As a recommendation to state policymakers, adoption of weighted funding provides few options for “stretching” the dollar, but may provide a mechanism for diverting districts’ local revenues to support choice programs (potentially reducing state support for those programs).

As a recommendation to local school district officials, adoption of weighted funding really provides no options for “stretching” the dollar, and may, in fact, increase centralized bureaucracy required to develop and manage the complex system of decentralized budgeting that accompanies WSF (see: http://epx.sagepub.com/content/23/1/66.short)

So,

No savings?

No improvements to equity?

No evidence of improved efficiency?

What then, does WSF have to do with “stretching” the school dollar?

Baker, B.D., Elmer, D.R. (2009) The Politics of Off‐the‐Shelf School Finance Reform. Educational Policy 23 (1) 66‐105

Baker, B.D. (2009) Evaluating Marginal Costs with School Level Data: Implications for the Design of Weighted Student Allocation Formulas. Education Policy Analysis Archives 17 (3)

Savings from Small Districts and Schools

I am one who believes in creating savings through consolidation of unnecessarily small schools and school districts. And, at the school or district level, some sizeable savings can be achieved by reorganizing schools into more optimal size configurations (elementary schools of 300 to 500 students and high schools of 600 to 900, for example; see Andrews, Duncombe and Yinger).

For other research on the extent to which consolidation can help cut costs, see Does School District Consolidation Cut Costs, also by Bill Duncombe and John Yinger (the leading experts on this stuff).

Now, Petrilli and Roza, however, seem to imply that the savings from these consolidations or simply from starving the small schools and districts can perhaps help states to sustain the big districts – STRETCHING that small school dollar. Note that Petrilli and Roza ignore entirely the possibility that some of these small schools and districts (in states like Wyoming, western Kansas, Nebraska) might actually have no legitimate consolidation options. Kill them all! Get rid of those useless small schools and districts, I say!

Here’s the thing about de-funding small schools and districts to save big ones. The total amount of money often is not much… BECAUSE THEY ARE SMALL SCHOOLS!!!!! I learned this while working in Kansas, a state which arguably substantially oversubsidizes small rural school districts, creating significant inequities between those districts and some of the state’s large towns and cities with high concentrations of needy students. While the inequity can (and should) be reduced, the savings don’t go very far.

So, let’s say we have 6 school districts serving 100 kids each, and spending $16,000 per pupil to do so. Let’s say we can lump them all together and make them produce equal outcomes for only $10,000 per pupil. A bold, bold assumption. We just saved $6,000 per pupil (really unlikely), across 600 pupils. That’s not chump change… it’s $3,600,000 (okay… in most state budgets that is chump change).

So, now let’s take this savings, and give it to the rest of the kids in the state – oh – about 400,000. Well, we just got ourselves about $9 per pupil. Even if we try to save the mid-sized city district of 50,000 students down the road, it’s about $72 per pupil. That is something. And if we can achieve that, then fine. But slashing small districts and schools to save big, or even average ones, usually doesn’t get us very far. BECAUSE THEY ARE SMALL! GET IT! SMALL DISTRICTS WITH SMALL BUDGETS!
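The back-of-the-envelope math above is easy to check. A quick sketch (all figures come from the example in the text; the consolidation savings themselves remain a bold assumption):

```python
# Consolidation arithmetic from the example above.
small_districts = 6
pupils_each = 100
spending_before = 16000  # per pupil in the small districts
spending_after = 10000   # per pupil after a (boldly assumed) consolidation

pupils = small_districts * pupils_each  # 600 pupils
total_savings = (spending_before - spending_after) * pupils
print(total_savings)  # 3600000

# Spread the savings across the rest of the state's students...
print(total_savings / 400_000)  # 9.0  -> about $9 per pupil statewide
print(total_savings / 50_000)   # 72.0 -> about $72 per pupil for one
                                #         mid-sized district of 50,000
```

Small districts have small budgets, so even an implausibly large per-pupil saving dilutes to pocket change when spread across the rest of a state.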

Similar issues apply to elimination of very small schools in large urban districts. It’s an appropriate strategy – balancing and optimizing enrollment (reorganizing those too-small high schools created as a previous Gates-funded reform?). It should be done. But unless a district is a complete mess of tiny, poorly organized schools, the savings aren’t likely to go that far.

Let’s also remember that major reconfiguration of school level enrollments will require significant up front capital expense! Yep, here we are again with a significant increased expense in the short-term. Duncombe and Yinger discuss this in their work. Strangely, this slips right past Petrilli and Roza.

Use Census Based Funding for Special Education

So, what Petrilli and Roza are arguing here is that states could somehow save money by allocating their special education funding to school districts on an assumption that every school district has a constant share of its enrollment that qualifies for special education programs. Those districts that presently have more? Well, they’ve just been classifying every kid they can find so they can get that special education money. This flat-funding policy will bring them into line… and somehow “stretch” that dollar.

Let’s say we assume that every district has 16% (Pennsylvania) or 14.69% (New Jersey) of children qualifying for special education. Let’s say we pick some number, like these, that is about the current average special education population. Our goal is really to reduce the money flowing to those districts that have higher than average rates. Of course, if we pick the average, we’ll be reducing money to the districts with higher rates and increasing money to the districts with lower rates, and you know what – WE’LL SPEND ABOUT THE SAME IN SPECIAL EDUCATION AID. “Stretching” how?
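A toy example makes the point. Here’s a minimal sketch with three hypothetical districts of equal size (the $10,000 aid amount, enrollments, and identification rates are all my own illustrative assumptions, not figures from the brief):

```python
# Census-based funding at the average rate just redistributes aid;
# the statewide total is unchanged. All figures are hypothetical.
aid_per_identified_pupil = 10000
enrollment = [5000, 5000, 5000]
sped_rate = [0.20, 0.15, 0.09]  # actual identification rates by district

# Aid under actual counts
actual_aid = sum(e * r * aid_per_identified_pupil
                 for e, r in zip(enrollment, sped_rate))

# Census-based: fund every district at the statewide average rate
avg_rate = sum(e * r for e, r in zip(enrollment, sped_rate)) / sum(enrollment)
census_aid = sum(e * avg_rate * aid_per_identified_pupil for e in enrollment)

print(actual_aid, census_aid)  # same total either way (about $22 million)
```

High-rate districts lose, low-rate districts gain a windfall, and the state spends about the same. Savings appear only if the census rate is set below the average, which is simply a funding cut by another name.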

And will we have accomplished anything close to logical? Let’s see, we will have slammed those districts that have been supposedly over-identifying kids for decades just to get more special ed aid. That, of course, must be good.

BUT, we will also be providing aid for 14.69% of kids to districts that have only 7% or 8% children with disabilities. Funding on a census basis or flat basis requires that we provide excess special education aid to many districts – unless we fund all districts as if they have the same proportion of special education kids as the district with the fewest special education kids. That is, simply cut special education aid to all districts except the one that currently receives the least.

How is that smart “stretching?”

The only way to “save” money with this recommendation is simply to “cut funding” and “cut services.” And, unless cut to the bare minimum, the “flat allocation” strategy requires choosing to “overfund” some districts while “underfunding” others. One might try to argue that this policy change would at least reduce further growth in special ed populations. But the article below suggests that this is not likely the case either. The resulting inequities significantly offset any potential benefits.

There exist a multitude of problems with flat, or census-based special education funding, which have led to declining numbers of states moving in this direction in recent years, New Jersey being an exception. I discuss this with co-authors Matt Ramsey and Preston Green in our forthcoming chapter on special education finance in the Handbook on Special Education Policy Research.

Of course, there also exists the demographic reality that children with disabilities are simply not distributed evenly across cities, towns and rural areas within states, leading to significant inequities when using Census Based funding. CB Funding is, in fact, the antithesis of Weighted Student Funding. How does one reconcile that?

For a recent article on the problems with the underlying assumptions of Census Based special education funding, see:

Baker, B.D., Ramsey, M.J. (2010) What we don’t know can’t hurt us? Evaluating the equity consequences of the assumption of uniform distribution of needs in Census Based special education funding. Journal of Education Finance 35 (3) 245‐275

Here’s a draft copy of our forthcoming book chapter on special education finance: SEF.Baker.Green.Ramsey.Final

Limit Time for ELL/LEP

This one is both absurd and obnoxious. Essentially, Petrilli and Roza argue that kids should be given a time limit to become English proficient and should not be provided supplemental programs or services – or at least the money for them – beyond that time frame. For example, a child might be funded for supplemental services for 2 years, and 2 years only. Some states have done this. Again, there is no clear basis for such cutoffs, nor is it clear how one would even establish the “right” time limit, or whether that time limit would somehow vary based on the level of language proficiency at the starting time.

Yes, this approach, like cutting special education funding, can be used to cut spending and reduce the quality of services. But that’s all it is. It’s not “stretching” any dollar.

Other Stuff

Now, the brief does list other state policy options as well as other district practices. Some of these are rather mundane, typical ideas for “cost saving.” But, of course, no evidence or citation of actual cost-effectiveness, cost-benefit or cost-utility analysis is presented. Petrilli and Roza toss around ideas like a) pooling health care costs, b) redesigning sick leave policies or c) shifting health care costs to employees. These are the kind of things that are often on the table anyway.

I fail to see how this new policy brief provides any useful insights in this regard. Some actual cost-benefit analysis would be the way to go. As a guide for such analyses, I recommend Henry Levin and Patrick McEwan’s book on Cost Effectiveness Analysis in Education.

There are a handful of articles available on the topic of incentives associated with varied sick leave policies, including THIS ONE, School District Leave Policies, Teacher Absenteeism, and Student Achievement, by Ron Ehrenberg of Cornell (back in 1991).

One category I might have included above: at least two of the recommendations embedded in the report argue for stretching the school dollar, so to speak, by effectively taxing school employees – that is, setting up a pension system that requires greater contributions from teacher salaries, and doing the same for health care costs. This is a tax – revenue generating (or at least a give-back). This is not stretching an existing dollar. This is requiring the public employees, rather than the broader pool of taxpayers (state and/or local), to pay the additional share. One could also classify it as a salary cut. But Petrilli and Roza have already proposed salary cuts in half of the other recommendations. Just say it. Hey… why not just take the “master’s bump” money and use that to pay for pensions and health care? No one will notice it’s even gone. We all know it was wasted and unnoticed to begin with.

I was particularly intrigued by the entirely reasonable point that school districts should NOT make harmful cuts by narrowing their curriculum. I was intrigued because this is precisely what Marguerite Roza has been arguing that poor districts MUST do in order to achieve minimum standards within their existing budgets. I wrote about this issue previously HERE. It is an interesting, but welcome, about-face to see Roza no longer arguing that poor, resource-constrained school districts should dump all but the basics (while other districts, with more advantaged student populations and more adequate resources, need not do the same).

Utter lack of sources/evidence for any/all of this junk

Finally, I encourage you to explore the utter lack of support (or analysis) that the policy brief provides for any/all of its recommendations. It won’t take much time or effort. Read the footnotes. They are downright embarrassing, and in some cases infuriating. At the very least, they border on THINK TANKY MALPRACTICE.

There is a reference to the paper by Dan Goldhaber simulating seniority based layoffs, but that paper provides no analysis of cost/benefit, the central premise of the dollar stretching brief. The Petrilli/Roza (not Goldhaber) assumption is simply that the results will be good, and because we are firing more expensive teachers, it will cost less to get those good results.

The policy brief makes a reference to “typical teacher contracts” (FN2) regarding sick leave, with no citation… no supporting evidence, and phrased rather offensively (18 weeks a year off? For all teachers? Everywhere! OMG???)

FN2: Typical U.S. teacher contracts are for 36.5 weeks per year and include 2.5 weeks sick and personal days for a total work year of 34 weeks, or 18 weeks time off.

The brief refers to work by NCTQ (not the strongest “research” organization) for how to restructure teacher pay.

The report self-cites The Promise of Cafeteria Style Pay (by Roza, non-peer reviewed… schlock), and makes a bizarre generalized attack in footnote 5 that school districts uniformly defend the use of non-teaching staff as substitutes (no evidence/source provided).

FN5: Districts requiring non-teaching staff to serve as substitutes argue that it is good practice to have all staff in classrooms at least a few days a year.

The brief cites policy reports (and punditry) on pension gaps (including the Pew Center report), and those reports refer to alternative plans for closing gaps over time. These are important issues, but the question of how this “stretches” the school dollar is noticeably absent.

And that’s it. That’s the entire extent of “research” and “evidence” used to support this policy brief.

Introducing the Reform-Inator!

Introducing the Coolest New Gadget of the Year – just in time for last-day shopping! The Reform-inator!

  1. Can be used to instantly fire and/or de-tenurize teachers. However, in order to use the Reform-inator for these purposes you must line up 100 teachers including all of the good, bad and average ones. The Reform-inator is a bit touchy… and misfires quite frequently… hitting an average teacher instead of a truly bad one about 35% of the time, and hitting a good teacher instead of a truly bad one about 20% of the time. But what the heck… go for it. Thin the herd. Probabilities are in your favor, if only marginally. And besides, there will be plenty more teachers willing to step up and face the firing line next year.
  2. Can be used to instantly replicate (or new reformy term: scalify, or scalification) only the upper half of charter schools, because we all know that the upper half of charter schools are … well… better than average ones, and well… good charters are good… and bad ones bad (but no need to talk about those, just as there’s no need to talk about the good traditional public schools)… so we really want to replicate and expand only those good charters (primarily by reduced regulation, increased numbers of authorizers and reduced oversight requirements, even though the track record to date hasn’t really shown that to be easily accomplished).
  3. Can be used to take anything that is presently about 7% smaller than it was in the past, and make it disappear entirely – GONE… ALL GONE… just like all of the money for public schools. It’s not just recessed – temporarily diminished – it’s just gone. Vanished. Time to shut it all down! No more sweetheart deals (especially in those really crazy overspending states like Arizona and Utah)!
  4. Can instantly make value-added estimates of teacher effectiveness the “true” measure of teacher effectiveness, and further, can make value-added estimates of teacher effectiveness a stronger predictor of themselves… which of course, are the true measure of effectiveness (stronger than a weak to moderate correlation, that is). Use the special self-validation trigger for this particular effect. Also works for low self-esteem.
  5. Can be used to locate Superman (‘cuz I sure can’t find him in these scatterplots of NYC charter school performance compared to traditional public schools, or these from Jersey either).
  6. Will eliminate entirely anything that might be labeled as Status Quo! Because we all know that if it’s status quo – it’s got to go (or at the very least, the first reformy rule of logic: “anything is better than the status quo”).
  7. Most importantly, like any good REFORMY tool, it’s got a Trigger!

Other ideas?

Is it the “New Normal” or the “New Stupid?”

I’ll admit from the start that I’m recycling some arguments here (okay… all of the arguments) … but this stuff needs to be reinforced, over and over again. Quite honestly, to me, from a school finance perspective, this is the most important issue that has surfaced in the past year, and potentially the most dangerous and damaging for the future of American public education.

Robert Reich of Berkeley recently wrote of the Attack on American Education:

http://wallstreetpit.com/54502-the-attack-on-american-education

Specifically, Reich pointed to substantial budget cuts across states as evidence of our de-investment in public schooling. Here are the first three states (by alphabetical order), and the education spending cuts mentioned by Reich in his blog post:

  • Arizona has eliminated preschool for 4,328 children, funding for schools to provide additional support to disadvantaged children from preschool to third grade, aid to charter schools, and funding for books, computers, and other classroom supplies. The state also halved funding for kindergarten, leaving school districts and parents to shoulder the cost of keeping their children in school beyond a half-day schedule.
  • California has reduced K-12 aid to local school districts by billions of dollars and is cutting a variety of programs, including adult literacy instruction and help for high-needs students.
  • Colorado has reduced public school spending in FY 2011 by $260 million, nearly a 5 percent decline from the previous year. The cut amounts to more than $400 per student.

As I have mentioned on numerous previous occasions, even the assumption that these cuts represent “de-investment” (suggesting cutting back on something that has been scaled up over time) is flawed, because it accepts that these states actually invested to begin with. Reich points out that the current attack on public education budgets across states – in both K-12 and higher education – is seemingly unprecedented, and arguably an attack on promoting an educated society more generally:

Have we gone collectively out of our minds? Our young people — their capacities to think, understand, investigate, and innovate — are America’s future. In the name of fiscal prudence we’re endangering that future.

But even Reich’s arguments fail to point out that in many of these states, the attack on education and de-investment (if there ever was significant investment, or scale-up) has been occurring for decades. In good times and in bad… bad economic times just provide a more convenient excuse. Couple that with all of the new rhetoric about the “New Normal” and the excuses to slash-and-burn public school funding are at an all-time high.

Let’s review:

First, here’s where the above three states fit into comparisons of state and local education revenue per pupil. Yes, some of the higher spending states are cutting back as well, if you read down Reich’s list of education spending cuts, but these three states have a particularly rich history of low spending and education cutbacks (including year after year of mid-year funding rescissions, even in good economic times, in Colorado).

Figure 1

Okay, so who cares if they aren’t spending that much? Maybe it’s because they’ve been taxing themselves to death… like we all have, obviously… we all know that… and education spending is simply eating away at their economies. It’s just not sustainable!

So, here are direct expenditures on education (K-12 and higher ed) as a percent of aggregate personal income for each state. California has been flat and low for over 30 years, and Colorado and Arizona, which were once relatively high, have decreased their effort consistently for about 30 years, in a race to the bottom.

Figure 2

Total Direct Education Spending as a Percent of Personal Income

Yeah but… yeah but….yeah but… it’s because their total taxes are so darn high. This is just education. Well then:

Figure 3

Yes, even on these measures, California is perhaps somewhat above average, whereas Colorado in recent years has been sitting near the bottom. Arizona jumped up in recent years, but is by no means high compared with other states, nor has it trended out of control over time.

But even then, we know they’ve all gone wild on teacher hiring… bloating that teacher workforce, reducing class sizes and pupil teacher ratios to inefficiently low levels:

Figure 4

Pupil to Teacher Ratios over Time

Okay, well maybe not California, Arizona or Colorado (or Utah… in gray at the top of the figure). California did increase teacher numbers in the late 1990s with class-size reduction, but that trend flattened out, and ratios have increased since, for lack of financial support.

But we all know that none of this matters anyway, right?

In fact, REFORMY logic dictates that it’s those states which have been spending like crazy, wasting their effort and paying for way too many teachers that are a real drag on our national test scores AND our economy.

The problem is not states like California, Arizona or reformy standouts like Colorado (or Tennessee or Louisiana), but rather, those over-educated curmudgeonly high spending non-reformy, low pupil-to-teacher ratio states like Vermont, Massachusetts and New Jersey.

They – yes they – with their gold-plated schools are the shame of our nation (and why we can’t be Finland, right?)!  Our national education emergency (if there is one) is certainly not the fault of those states exercising consistent and appropriate fiscal austerity in good times or in bad.

Well:

Figure 5

Relationship Between State & Local Revenue per Pupil (for high poverty districts) & NAEP Mean Scale Scores

www.schoolfundingfairness.org

On average, states like Arizona and California which have high need student populations, but have thrown their public schools under the bus, are a significant drag on our national performance.

And this is due to lack of effort as much as it is lack of capacity.  Higher effort states also tend to be the higher spending states which also tend to have the higher outcomes. And, when taken as a separate group, compare quite favorably on international performance comparisons.

Figure 6

Relationship between Fiscal Effort and Level of Financial Resources

www.schoolfundingfairness.org

Finally, these differences in outcomes, effort and pupil to teacher ratios are not all about differences in poverty. Again, I’ve already pointed out that these states have high pupil-to-teacher ratios and low spending not because they are poor but rather because they don’t put up the effort.

And now we are boldly (and belligerently) encouraging them to “do more with less” by which we actually mean “do even less with less?”

To clarify how poverty rates fit within this picture, Figure 7 provides adjusted state poverty estimates (see citation below figure) and pupil-to-teacher ratios. At their respective poverty levels, each of these states has higher – if not much higher – than average pupil-to-teacher ratios. They also have much lower than average per-pupil spending.

Figure 7

State Cost Adjusted Poverty Estimates and Pupil to Teacher Ratios

Renwick, Trudi. Alternative Geographic Adjustments of U.S. Poverty Thresholds: Impact on State Poverty Rates. U.S. Census Bureau, August 2009

Further, while these states have higher pupil to teacher ratios than other states with similar poverty rates, they also have very low outcomes even compared to other states with similar corrected poverty rates. Colorado remains somewhat in the middle of the pack on outcomes, having a lower poverty population than either Arizona or California and also having more recently slashed and burned its public education system. Colorado pupil to teacher ratios have also remained closer to those of other states, and much lower than California or Arizona.

Figure 8

State Cost Adjusted Poverty Estimates and NAEP Mean Outcomes

 

How does this all fit into the long-run picture of investment in public schooling? Yes, we’ve had the most significant economic downturn in several decades. State budgets took a hit, and good information on that budget hit can be found at www.rockinst.org, where, among other things, data show that the most recent quarterly estimates of state revenue are still about 7% off their peak in 2008. That’s 7% – not 100%, not 20% (even more important is the variation across states). It’s a hole. But it’s not ALL GONE (and only a complete fool would argue as much)! Note that there have been in the past few decades at least two other significant economic slowdowns/downturns that affected state revenues and education spending – from about 1989 to 1992, with lagged effects in some regions, and from 2001 to 2002 (post-9/11 shock). In some states, education spending rebounded in the wake of these downturns, but in others, state legislatures continued to constrain if not outright slash-and-burn state education budgets (while expanding tax cuts) throughout the economic good times that followed each downturn (1996ish to 2001 and 2002 to 2008).

What’s different now? Why are we sitting at the edge of a much more dangerous policy agenda? Well, the recent economic downturn was greater. But again, recent data show the beginnings of a rebound. What is most different is that we are now faced with this completely absurd argument of The New Normal – as a national agenda to scale back education spending EVEN IN STATES WHERE IT HAD ALREADY BEEN SCALED BACK FOR DECADES. But who knew? Didn’t every state just spend out of its freakin’ mind for… oh… the past hundred years or so?

The New Normal argument – that we must cut back our bloated education budgets and raise class sizes and pupil-to-teacher ratios back to reasonable levels – is, at best, based on the shallowest understanding of (hyper-aggregated and overstated) national “trends” in education spending and pupil-to-teacher ratios. It is coupled with complete obliviousness to the variations in effort, spending and pupil-to-teacher ratios that exist across states – and, for that matter, to the demographic trends in some states (Vermont) which make it appear as if education spending has spiraled out of control. That is, if we assume that those pitching-tweeting-blogging The New Normal have even the first clue about trends in education spending, state school finance systems, and the quality of public schooling across states to begin with. Personally, I’m not sure they do. In fact, I’m increasingly convinced they don’t.

A few comments on the Gates/Kane value-added study

(My apologies in advance for an excessively technical, research geeky post, but I felt it necessary in this case)

Take home points

1) As I read it, the new Gates/Kane value-added findings are NOT by any stretch of the imagination an endorsement of using value-added measures of teacher effectiveness for rating individual teachers as effective or not or for making high-stakes employment decisions. In this regard, the Gates/Kane findings are consistent with previous findings regarding stability, precision and accuracy of rating individual teachers.

2) Even in the best of cases, measures used in value-added models remain insufficiently precise or accurate to account for the differences in children served by different teachers in different classrooms (see discussion of the poverty measure in point #2 of the first section below).

3) Too many of these studies, including this one, adopt the logic that value-added outcomes can be treated both as a measure of effectiveness to be investigated (independent variable) and as the true measure of effectiveness (the dependent measure). That is, this study like others evaluates the usefulness of both value added measures and other measures of teacher quality by their ability to predict future (or different group) value-added measures. Certainly, the deck is stacked in favor of value added measures under such a model. See value-added as a predictor of itself below.

4) Value-added measures can be useful for exploring variations in student achievement gains across classroom settings and teachers, but I would argue that they remain of very limited use for identifying more precisely or accurately, the quality of individual teachers.  Among other things, the most useful findings in the new Gates/Kane study apply to very few teachers in the system (see final point below).

Detailed discussion

Much has been made of the preliminary findings of the Gates Foundation study on teacher effectiveness. Jason Felch of the LA Times has characterized the study as an outright endorsement of the use of Value-added measures as the primary basis for determining teacher effectiveness. Mike Johnston, the Colorado State Senator behind that state’s new teacher tenure law, which requires that 50% of teacher evaluation be based on student growth (and tenure and removal of tenure based on the evaluation scheme), also seemed thrilled – via twitter – that the Gates study found that value-added scores in one year predict value-added scores in another – seemingly assuming this finding unproblematically endorses his policies (?) (via Twitter: SenJohnston Mike Johnston New Gates foundation report on effective teaching: value added on state test strongest predictor of future performance).

But, as I read it, the new Gates study is – even setting aside its preliminary nature – NOT AN OUTRIGHT ENDORSEMENT OF USING VALUE-ADDED MEASURES AS A SIGNIFICANT BASIS FOR MAKING HIGH STAKES DECISIONS ABOUT TEACHER DISMISSAL/RETENTION, AS IS MANDATED VIA STATE POLICIES LIKE THOSE ADOPTED IN COLORADO – OR AS SUGGESTED BY THE ABSURDLY NARROW APPROACH FOR “OUTING” TEACHERS TAKEN BY MR. FELCH AND THE LA TIMES.

Rather, the new Gates study tells us that we can use value-added analysis to learn about variations in student learning (or at least in test score growth) across classrooms and schools and that we can assume that some of this variation is related to variations in teacher quality. But, there remains substantial uncertainty in the capacity to estimate whether any one teacher is a good teacher or a bad one.

Perhaps the most important and interesting aspects of the study are its current and proposed explorations of the relationship between value-added measures and other measures, including student perceptions, principal perceptions and external evaluator ratings.

Gates Report vs. LA Times Analysis

In short, data quality and modeling matter, but you can only do so much.

For starters, let’s compare some of the features of the Gates study value added models to the LAT models. These are some important differences to look for when you see value- added models being applied to study student performance differences across classrooms – especially where the goal is to assign outcome effects to teachers.

  1. The LA Times model, like many others, uses annual achievement data (as far as I can tell) to determine teacher effectiveness, whereas the Gates study at least explores the seasonality of learning – or more specifically, how much achievement change occurs over the summer (which is certainly outside of teachers’ control AND differs across students by their socioeconomic status). One of the more interesting findings of the Gates study is that from 4th grade on: “The norm sample results imply that students improve their reading comprehension scores just as much (or more) between April and October as between October and April in the following grade. Scores may be rising as kids mature and get more practice outside of school.” This means that if there exist substantial differences in summer learning by students’ family income level and/or other factors, as has been found in other studies, then using annual data could significantly and inappropriately disadvantage teachers who are assigned students whose reading skills lagged over the summer. The existing blunt indicator of low-income status is unlikely to be sufficiently precise to correct for summer learning differences.
  2. The LA Times model did include such blunt measures for poverty status and language proficiency, as well as disability status (single indicator), but later found shares of gifted children to be associated with differences in teacher ratings, along with student race. The Gates study includes similarly crude indicators of socioeconomic status, but does include in its value-added model whether individual children are classified as gifted. It also includes student race and the average characteristics of students in each classroom (peer group effect). This is a much richer and more appropriate model, but still likely insufficient to fully account for the non-random distribution of students. That is, the Gates study models at least attempt to correct for the influence of peers in the classroom in addition to individual characteristics of students, but even this may be insufficient. One particular concern of mine is the use of a single dichotomous measure of child poverty – whether the child qualifies for free or reduced-price lunch – and the share of children in each class who do. The reality is that in many urban public schooling settings like those involved in the Gates study, several elementary/middle schools have over 80% of children qualifying for free or reduced-price lunch, but this apparent similarity is no guarantee of similar poverty conditions among the children in one school or classroom compared to another. One classroom might be filled 80% with children whose family income is at or below 100% of the income threshold for poverty, whereas another classroom might be filled with 80% children whose family income is 85% above that threshold (the cutoff for “reduced” price lunch). This is a big difference that is not captured with this crude measure.
  3. The LA Times analysis uses a single set of achievement measures. Other studies, like the work of Sean Corcoran (see below) using data from Houston, TX, have shown us the relatively weak relationship between value-added ratings of teachers produced by one test and value-added ratings of teachers produced by another test. Thankfully, the Gates foundation analysis takes steps to explore this question further, but, I would argue, overstates the relationship found between tests, or states that relationship in a way that might be misinterpreted by pundits seeking to advance the use of value-added for high-stakes decisions (more later).

Learning about Variance vs. Rating Individual Teachers with Precision and Accuracy

If we are talking about using the value-added method to classify individual teachers as effective or ineffective and to use this information as the basis for dismissing teachers or for compensation, then we should be very concerned with the precision and accuracy of the measures as they apply to each individual teacher. In this context, one can characterize precision and accuracy as follows.

  • Precision – That there exists little error in our estimate that a teacher is responsible for producing good or bad student value-added on the test instrument used.  That is, we have little chance of classifying a good teacher as bad, an average teacher as bad, or vice versa.
  • Accuracy – That the test instrument and our use of it to measure teacher effectiveness is really measuring “true” effectiveness of the teacher – or truly how good that teacher is at doing all of the things we expect that teacher to do.

If, instead of classifying individual teachers as good or bad (and firing them, or shaming them in the newspaper or on milk cartons), we are actually interested in learning about variations in “effectiveness” across many teachers and many sections of students over many years, and whether student perceptions, supervisor evaluations, classroom conditions and teaching practices are associated with differences in effectiveness, we are less concerned about precise and accurate classification of individuals and more concerned about the relationships between measures, across many individuals (measured with error).  That is, do groups of teachers who do more of “X” seem to produce better value-added gains? Do groups of teachers prepared in this way seem to produce better outcomes? We are not concerned about whether a given teacher is accurately “scored.” Instead, we are concerned about general trends and averages.

The Gates study, like most previous studies, finds what I would call relatively weak correlations between the value-added score an individual teacher receives for one section of students in math or reading compared to another, and from one year to the next. The Gates research report noted:

“When the between-section or between-year correlation in teacher value-added is below .5, the implication is that more than half of the observed variation is due to transitory effects rather than stable differences between teachers. That is the case for all of the measures of value-added we calculated.”

Below is a table of those correlations – taken from their Table #5.

Unfortunately, summaries of the Gates study seem to obsess over how relatively high the year-to-year correlation is for teachers rated by student performance on the state math test (.404), and largely ignore how much lower many of the other correlations are. Why is the correlation for the ELA test under .20, and what does that say about the high-stakes usefulness of the approach? Like other studies evaluating the stability of value-added ratings, the correlations seem to run between .20 and .40, with some falling below .20. That’s not a very high correlation – which suggests not a very high degree of precision in figuring out which individual teacher is good and which is bad. BUT THAT’S NOT THE POINT EITHER!
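To see what a year-to-year correlation of around .40 means for classifying individuals, here is a rough simulation – my own back-of-the-envelope illustration, not a calculation from the Gates report. It treats each observed value-added score as a stable teacher component plus transitory noise, with the stable variance share set to match a .40 correlation, and asks: of the teachers flagged in the observed bottom quartile, how many are not actually bottom-quartile teachers on the stable component?

```python
import random

random.seed(0)
N = 100_000
# A year-to-year correlation of ~.40 is consistent with ~40% of observed
# variance being a stable teacher component and ~60% transitory noise.
stable_sd = 0.40 ** 0.5
noise_sd = 0.60 ** 0.5
stable = [random.gauss(0, stable_sd) for _ in range(N)]
observed = [s + random.gauss(0, noise_sd) for s in stable]

# Flag the bottom quartile of observed scores, then check whether each
# flagged teacher is truly in the bottom quartile of stable effects.
obs_cut = sorted(observed)[N // 4]
true_cut = sorted(stable)[N // 4]
flagged = [i for i in range(N) if observed[i] <= obs_cut]
misfires = sum(1 for i in flagged if stable[i] > true_cut)
rate = misfires / len(flagged)
print(f"share of flagged teachers not truly bottom-quartile: {rate:.2f}")
```

Under these assumptions the misfire rate comes out around 40 percent – strikingly close to the satirical reform-inator error rates above. The point is not the exact number (which depends on the assumed variance split) but that a .40 correlation leaves a great deal of room for flagging the wrong people.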

Now, the Gates study rightly points out that lower correlations do not mean that the information is entirely unimportant. The study focuses on what it calls “persistent” effects or “stable” effects, arguing that if there’s a ton of variation across classrooms and teachers, being able to explain even a portion of that variation is important – A portion of a lot is still something. A small slice of a huge pie may still provide some sustenance. The report notes:

“Assuming that the distribution of teacher effects is “bell-shaped” (that is, a normal distribution), this means that if one could accurately identify the subset of teachers with value-added in the top quartile, they would raise achievement for the average student in their class by .18 standard deviations relative to those assigned to the median teacher. Similarly, the worst quarter of teachers would lower achievement by .18 standard deviations. So the difference in average student achievement between having a top or bottom quartile teacher would be .36 standard deviations.” (p.19)

The language here is really, really, important, because it speaks to a theoretical and/or hypothetical difference between high and low performing teachers drawn from a very large analysis of teacher effects (across many teachers, classrooms, and multiple years). THIS DOES NOT SPEAK TO THE POSSIBILITY THAT WE CAN PRECISELY AND ACCURATELY IDENTIFY WHETHER ANY SINGLE TEACHER FALLS IN THE TOP OR BOTTOM GROUP! It’s a finding that makes sense when understood correctly but one that is ripe for misuse and misunderstanding.
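As a back-of-the-envelope check on that .18 figure (my own arithmetic, not the report’s code): under the report’s normal-distribution assumption, the average teacher in the top quartile sits about 1.27 standard deviations of the teacher-effect distribution above the mean, so a .18 student-SD average effect implies a persistent teacher-effect standard deviation of roughly .14 student SDs.

```python
from statistics import NormalDist

z = NormalDist()
z75 = z.inv_cdf(0.75)                  # 75th percentile cutoff, ~0.6745
# Mean of the top quartile of a standard normal: pdf(cutoff) / tail mass
top_quartile_mean = z.pdf(z75) / 0.25  # ~1.27 SDs above the mean
implied_teacher_sd = 0.18 / top_quartile_mean
print(f"implied SD of persistent teacher effects: {implied_teacher_sd:.2f}")
```

Note that this .14 describes the hypothetical spread of persistent effects across all teachers; it assumes top- and bottom-quartile teachers can be identified without error, which the stability correlations discussed above suggest they cannot.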

Yes, in probabilistic terms, this does suggest that if we implement mass layoffs in a system as large as NYC and base those layoffs on value-added measures, we have a pretty good chance of increasing value-added in later years – assuming our layoff policy does not change other conditions (class size, average quality of those in the system – replacement quality). But any improvements can be expected to be far, far, far less than the .18 figure used in the passage above. Even assuming no measurement error – that the district is laying off the “right” teachers (a silly assumption) – the newly hired teachers can be expected to fall, at best, across the same normal curve. But I’ve discussed my taste for this approach to collateral damage in previous posts. In short, I believe it’s unnecessary and not that likely to play out as we might assume. (see discussion of reform engineers at bottom)

A Few more Technical Notes

Persistent or Stable Effects: The Gates report focuses on what it terms “persistent” effects of teachers on student value-added – assuming that these persistent effects represent the consistent, over-time or across-section influence of a specific teacher on his/her students’ achievement gains. The report focuses on such “persistent” effects for a few reasons. First, the report uses this discussion to, I would argue, overplay the persistent influence teachers have on student outcomes – as in the quote above, which is later used in the report to explain the share of the black-white achievement gap that could be closed by highly effective teachers. The assertion is that even if teacher effects explain a small portion of variations in student achievement gains, if variations in those gains are huge, then explaining a portion is important. Nonetheless, the persistent effects remain a relatively small (at best a “modest”) portion – which dramatically reduces the precision with which we can identify the effectiveness of any one teacher (taking as given that the tests are the true measure of effectiveness – the validity concern).

AND, I would argue that it is a stretch to assume that the persistent effects within teachers are entirely a function of teacher effectiveness. The persistent effect of teachers may also include the persistent characteristics of students assigned to that teacher – that the teacher, year after year, and across sections is more likely to be assigned the more difficult students (or the more expert students). Persistent pattern yes. Persistent teacher effect? Perhaps partially (How much? Who knows?).

Like other studies, the identification of persistent effects from year to year, or across sections in the new Gates study merely reinforces that with more sections and/or more years of data (more students passing through) for any given teacher, we can gain a more stable value-added estimate and more precise indication of the value-added associated with the individual teacher. Again, the persistent effect may be a measure of the persistence of something other than the teacher’s actual effectiveness (teacher X always has the most disruptive kids, larger classes, noisiest/hottest/coldest – generally worst classroom).  The Gates study does not (BECAUSE IT WASN’T MEANT TO) assess how the error rate of identifying a teacher as “good” or “bad” changes with each additional year of data, but given that other findings are so consistent with other studies, I would suspect the error rate to be similar as well.
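The gain in stability from pooling more years or sections can be sketched with the standard Spearman-Brown formula from measurement theory – my illustration, not a calculation the Gates report presents. If a single year’s score correlates .40 with another year’s, the reliability of a K-year average is K·r / (1 + (K−1)·r):

```python
def reliability_of_average(r: float, k: int) -> float:
    """Spearman-Brown: reliability of the mean of k parallel measures,
    where each pair of single measures correlates r."""
    return k * r / (1 + (k - 1) * r)

# Starting from a .40 year-to-year correlation, pooling years helps,
# but with diminishing returns.
for k in (1, 2, 3, 5):
    print(k, round(reliability_of_average(0.40, k), 2))
```

Even five years of data under these assumptions yields a reliability below .80 – better, but still far from the precision that high-stakes dismissal decisions would seem to demand, and the formula assumes the “persistent” component really is the teacher, which, as noted above, it may only partially be.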

Differences Between Tests: The Gates study provides some useful comparisons of value-added ratings of teachers on one test, compared with ratings of the same teachers on another test – a) for kids in the same section in the same year, and b) for kids in different sections of classes with the same teacher.

Note that in a similar analysis, Corcoran, Jennings and Beveridge found:

“among those who ranked in the top category (5) on the TAKS reading test, more than 17 percent ranked among the lowest two categories on the Stanford test. Similarly, more than 15 percent of the lowest value-added teachers on the TAKS were in the highest two categories on the Stanford.”

Corcoran, Sean P., Jennifer L. Jennings, and Andrew A. Beveridge. 2010. “Teacher Effectiveness on High- and Low-Stakes Tests.” Paper presented at the Institute for Research on Poverty summer workshop, Madison, WI.

That is, analysis of teacher value-added ratings on two separate tests called into question the extent to which individual teachers might accurately be classified as effective using a single testing instrument. That is, if we assume both tests measure how effective a teacher is at teaching “math,” or a specific subject within “math,” then both tests should tell us the same thing about each teacher – which ones are truly effective math teachers and which ones are not. Corcoran’s findings raise serious questions about accuracy in this regard.

The Gates study argues that comparing teacher value-added across two math tests – where one is more conceptual – allows them to check that doing well on the state test did not compromise conceptual learning, so long as the results on the two tests are correlated. That seems reasonable enough, to the extent that the testing instruments are being appropriately described (and to the extent they are valid instruments). In terms of value-added ratings, the Gates study, like the Corcoran study, finds only a modest relationship between ratings of teachers based on one test and ratings based on the other:

“the correlation between a teacher’s value-added on the state test and their value-added on the Balanced Assessment in Math was .377 in the same section and .161 between sections.”

But the Gates study also explores the relationships between “persistent” components across tests – which must be done across sections taking the test in the same year (until subsequent years become available). They find:

“we estimate the correlation between the persistent component of teacher impacts on the state test and on BAM is moderately large, .54.”

“The correlation in the stable teacher component of ELA value-added and the Stanford 9 OE was lower, .37.”

I’m uncomfortable with the phrasing here that says – “persistent component of teacher impacts” – in part because there exist a number of other persistent conditions or factors that may be embedded in the persistent effect, as I discuss above. Setting that aside, however, what the authors are exploring is whether the correlated component – the portions of student performance on any given test that are assumed to represent teacher effectiveness – is similar between tests.

In any case, however, these correlations like the others in the Gates analysis are telling us how highly associated – or not – the assumed persistent component is across tests across many teachers teaching many sections of the same class.  This allows the authors to assert that across all of these teachers and the various sections they teach, there is a “moderately” large relationship between student performance on the two different tests, supporting the authors’ argument that one test somewhat validates the other. But again, this analysis, like the others in the report, does not suggest by any stretch of the imagination that either one test or the other will allow us to precisely identify the good teacher versus the bad one. There is still a significant amount of reshuffling going on in teacher ratings from one test to the next, even with the same students in the same class sections in the same year. And, of course, good teaching is not synonymous with raising a student’s test scores.

This analysis does suggest that we might – by using several tests – get a more accurate picture of student performance and how it varies across teachers, and does at least suggest that across multiple tests – if the persistent component is correlated – just like across multiple years – we might get a more stable picture of which teachers are doing better/worse.  Precise enough for high stakes decisions (and besides, how much more testing can we/they handle?)? I’m still not confident that’s the case.

Value-added is the best predictor of itself

This seems to be one of the findings that gets the most media play (and was the basis of Senator Johnston’s proud tweets). Of course value-added is a better predictor of future value-added (on the same test and with the same model) than other factors are – even if value-added is only a weak predictor of future (or different-section) value-added. Amazingly, however, many of the student survey responses on factors related to things like “Challenge” seem almost as related to value-added as value-added is to itself. That is a surprising finding, and I’m not sure yet what to make of it. [Note that the correlation between student ratings and VAM was for the same class and year, whereas VAM predicting VAM is a) across sections and b) across years.]

Again, the main problem with this VAM-predicts-VAM argument is that it assumes value-added ratings in the subsequent year to be THE valid measure of the desired outcome. But that's the part we just don't yet know. Perhaps the student perceptions are actually a more valid representation of good teaching than the value-added measure? Perhaps we should flip the question around? It does seem reasonable enough to assume that we want to see students improve their knowledge and skills in measurable ways on high-quality assessments. Whether our current batch of assessments – as we are currently using them, and as they are being used in this analysis – accomplishes that goal remains questionable.

What is perhaps most useful about the Gates study and future research questions is that it begins to explore with greater depth and breadth the other factors that are – and are not – associated with student achievement gains.

Findings apply to a relatively small share of teachers

I have noted in other blog posts on this topic that in the best of cases (or perhaps worst, if we actually followed through with it), we might apply value-added ratings to somewhat less than 20% of teachers – those directly and solely responsible for teaching reading or math to insulated clusters of children in grades 3 to 8 – well… 4-8, actually… since many VA models use annual data and the testing starts with grade 3. Even for the elementary school teachers who could be rated, the content of the ratings would exclude a great deal of what they teach. Note that most of the interesting findings in the new Gates study are those which allow us to evaluate the correlations of teachers across different sections of the same course in addition to subsequent years. These comparisons can only be made at the middle school level (and/or upper elementary, if taught by section). Further, many of the language arts correlations were very low, limiting the more interesting discussions to math alone. That is, we need to keep in mind that in this particular study, most of the interesting findings apply to no more than 5% to 10% of teachers – those teaching math in the upper elementary and middle grades, and specifically those teaching multiple sections of the same math content each year.


Still searching for that pot of gold

The rhetoric about our decades-long drunken spending spree just won’t stop, nor will the rhetoric that the money is all gone. All of it. Nothin’ left. We spent it all. We taxed ourselves to the limit and those damn teachers unions and public schools just took it all and left us with the bill. It’s gone! all gone!

Here are some recent quotes/comments from pundits who've done little analytically but offer a few absurd back-of-the-napkin explanations for why they believe that a) we've been on a drunken spending spree and b) it's all gone!

Andy Rotherham in Time:

the golden age of school spending is likely coming to an end.

http://www.time.com/time/nation/article/0,8599,2035999,00.html

There’s so much more in this article, including statements about how it’s plainly obvious that for each worker added to a private firm, there is an immediate incremental return in production output (each additional worker adds $x worth of output to any private firm) whereas in education we continue to add workers and see nothing in return. Both parts of this assumption are… well… just nutty.

So, Rotherham has given us the argument that our “golden age” of school spending is coming to an end. And Mike Petrilli, in a twitter-battle with Diane Ravitch has laid down the Petrillian Truth (roll with that one Mike…it’s got a nice ring) that “The Money is Gone!”

MichaelPetrilli: That’s a great line, Diane, but it doesn’t solve the problem. The money is gone. We have to help schools cut smart.
http://educationnext.org/in-which-i-debate-diane-ravitch-in-140-characters-or-less/

That’s right. It’s all gone. It’s freakin’ gone. Cut, cut, cut. Cut it all. Zero out public education. It doesn’t matter what state you live in, what part of the country, your state has taxed you to the limit and has spent it all on the edu-bureaucracy. Every state… the whole nation has simply been pouring money into schools and they have to stop because the money is gone.

Okay, really, how much is gone? And has any of it come back yet? Is it really all gone forever? Is 20% gone, 50%, or perhaps even 70%? Must we reset the system to an average cost that is, say, 20% below where it was in 2008? 10%?

You know, there are actually legitimate researchers and organizations out there tracking the condition of state and local revenues. And while these have been some tough times, their findings are somewhat less apocalyptic than the comments of Rotherham and Petrilli above – who don't actually look at state budget data when making these claims. Here are the findings from the most recent quarterly report from the Rockefeller Institute:

The Rockefeller Institute’s compilation of data from 48 early reporting states shows collections from major tax sources increased by 3.9 percent in nominal terms compared to the third quarter of 2009, but was 7.0 percent below the same period two years ago. Gains were widespread, with 42 states showing an increase in revenues compared to a year earlier. After adjusting for inflation, tax revenues increased by 2.6 percent in the third quarter of 2010 compared to the same quarter of 2009. States’ personal income taxes represented a $2.5 billion gain and sales taxes a $2.0 billion gain for the period.
www.rockinst.org

Yes, revenues are down. State revenues are still rolling in about 7% below where they were in 2008, but in most states they have begun to rebound toward that level. We took a hit. States took a hit. Some took a bigger hit than others, and some are rebounding more quickly than others.
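As a back-of-the-envelope check on those Rockefeller figures (my arithmetic, not theirs): if collections grew 3.9 percent over the past year yet remain 7.0 percent below the level of two years ago, the implied decline in the intervening year was roughly 10.5 percent.

```python
# Implied prior-year change, from the Rockefeller quarterly figures:
# current level = (two years ago) * (1 + prior_year_change) * 1.039 = (two years ago) * 0.93
two_year_ratio = 1 - 0.070   # current revenue relative to two years ago
recent_growth = 1 + 0.039    # most recent year-over-year growth
implied_prior = two_year_ratio / recent_growth
print(f"implied prior-year change: {(implied_prior - 1) * 100:.1f}%")  # → -10.5%
```

In other words: a real hit, followed by a real (if partial) rebound – not money that is simply "gone."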

But, I must also reiterate that not every state really put their heart into public schools or the combination of their elementary and secondary and higher education systems to begin with. Many have already been systematically reducing their spending effort for years.

A few national graphs first. Here's total state and local government expenditure as a share of personal income over time. Yes, on average, it has climbed slightly over 30 years. And it has oscillated in between, with government expenditure (state and local) declining as a share of personal income during those periods when personal income grew quickly.

Elementary, secondary and higher education do make up a sizable share of this spending – albeit not clearly a drunken spree. Here’s education direct expenditures as a share of state and local general expenditures over the same time period.

So, the reality is that education spending first declined as a share of general spending and has since leveled off. So actually, it may be some of that other stuff that’s creating pressure on the system, a point duly acknowledged by Rotherham. But, the current argument seems to be that public schools are discretionary – negotiable – and all of that other stuff is not. Either way, even the total growth in the previous figure is not that disconcerting.  A whole other discussion for a later point in time is the issue of how many states have kicked non-current expenditures (pension obligations and other debt) down the road for someone else to deal with.

Most importantly, however, here are the differences in direct education spending as a share of personal income across states. When it comes to public K-12 and higher education systems, states vary widely. Some have provided high levels of support for schools, allocated that support fairly and maintained appropriate levels of effort to finance their education systems. Others have thrown their education systems under the bus. They don’t need some data-proof ideologue to tell them that the money is gone and now’s the time to cut.

This figure, like the ones in my previous "bubble" post, shows the variation in "effort" across states – measured somewhat differently, but with the same conclusion. That's the thing – I keep taking different angles on these data and they keep telling me similar stories – that many states have systematically reduced their "effort" to finance public education systems over time, while yes, some have increased effort. And there's an interesting story behind each trend. Again, Vermont has systematically scaled up education spending relative to personal income over time. New Jersey has increased over time as well, but has only risen to a position that remains below average. By contrast, Colorado and Arizona both provided LESS direct spending on education as a share of personal income in 2008 than they did in 1977! And they are not the only ones. Perhaps those states need a correction in the other direction?

It will indeed be interesting to see how these "effort" measures shift as income takes a temporary hit – and a bigger one than it has in the past. Most of the differences in the level of "effort" in the above figure are a function of income. States with higher personal income are able to raise what they need in education spending with a much smaller share of that income. Even New Jersey, which is a relatively high-spending state, has relatively low effort. Other lower-effort states include Connecticut and Massachusetts.

But, back to the point – these national aggregate claims that we're tapped out – all of us, every state – are entirely inappropriate and irresponsible. Let's take a harder, more precise look at what's really going on. Let's focus our attention on useful quarterly reports like those from the Rockefeller Institute on the condition of state revenue, and let's provide appropriately differentiated instruction to states based on the widely varied conditions they face and the widely varied levels of effort they've applied thus far toward improving their education systems. The current rhetoric is unhelpful, and sadly, I think that's the point!

The problem? Cheerleading and Ceramics, of course!

David Reber with the Topeka Examiner had a great post a while back (April, 2010) addressing the deceptive logic that we should be outraged by supposed exorbitant spending on things like cheerleading and ceramics, and not worry so much about the little things, like disparities between wealthy and poor school districts. I finally saw this post today, from a tweet, and realized I had not yet blogged on this topic.

This logic/argument comes from the “research” of Marguerite Roza, who, well, has a track record of making such absurd arguments in an effort to place blame on poor urban districts and take attention away from disparities between poor urban districts and their more affluent suburban neighbors.

This new argument is really just more of the same ol’ flimsy logic from this crew. For the past several years, Roza and colleagues have attempted to argue that states have largely done their part to fix inequities in funding between school districts, and that now, the burden falls on local public school districts to clean up their act. Here’s an excerpt from one of my recent articles on this topic:

On other occasions, Roza and Hill have argued that persistent between-district disparities may exist but are relatively unimportant. Following a state high court decision in New York mandating increased funding to New York City schools, Roza and Hill (2005) opined: “So, the real problem is not that New York City spends some $4,000 less per pupil than Westchester County, but that some schools in New York [City] spend $10,000 more per pupil than others in the same city.” That is, the state has fixed its end of the system enough.

This statement by Roza and Hill is even more problematic when one dissects it more carefully. What they are saying is that the average of per pupil spending in suburban districts is only $4,000 greater than spending per pupil in New York City but that the difference between maximum and minimum spending across schools in New York City is about $10,000 per pupil. Note the rather misleading apples-and-oranges issue. They are comparing the average in one case to the extremes in another.

In fact, among downstate suburban[1] New York State districts, the range of between-district differences in 2005 was an astounding $50,000 per pupil (between the small, wealthy Bridgehampton district at $69,772 and Franklin Square at $13,979). In that same year, New York City as a district spent $16,616 per pupil, while nine downstate suburban districts spent more than $26,616 (that is, more than $10,000 beyond the average for New York City). Pocantico Hills and Greenburgh, both in Westchester County (the comparison County used by Roza and Hill), spent over $30,000 per pupil in 2005.[2] These numbers dwarf even the purported $10,000 range within New York City (a range that we agree is presumptively problematic); our conclusion based on this cursory analysis is that the bigger problem likely remains the between-district disparity in funding.

http://epaa.asu.edu/ojs/article/viewFile/718/831

My article (with Kevin Welner) goes on to show that states have far from resolved between-district disparities and that New York State in particular has among the most substantial persistent disparities between wealthy and poor school districts. For more information on persistent between-district disparities that really do exist, see: Is School Funding Fair?

I have a forthcoming paper this spring where I begin to untangle the new argument about poor urban districts really having plenty of money but simply wasting it on cheerleading and ceramics. Here’s a draft of a section of the introduction to that paper:

A handful of authors, primarily in non-peer-reviewed think tank reports, posit that poor urban school districts have more than enough money to achieve adequate student outcomes and simply need to reallocate what they have toward improving achievement in tested subject areas. These authors, including Marguerite Roza and colleagues of the Center on Reinventing Public Education, encourage public outrage that any school district not presently meeting state outcome standards would dare to allocate resources to courses like ceramics or activities like cheerleading. To support their argument, the authors provide anecdotes of per pupil expense on cheerleading being far greater than per pupil expense on core academic subjects like math or English.

Imagine a high school that spends $328 per student for math courses and $1,348 per cheerleader for cheerleading activities. Or a school where the average per-student cost of offering ceramics was $1,608; cosmetology, $1,997; and such core subjects as science, $739.[1]

These shocking anecdotes, however, are unhelpful for truly understanding resource allocation differences and reallocation options. For example, the major reason cheerleading or ceramics expenses per pupil appear so high is the relatively small group sizes, compared to those in English or math classes. In total, the funds allocated to either cheerleading or ceramics are unlikely to have much if any effect if redistributed to reading or math.
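A quick sketch of why the per-participant figures mislead – the dollar figures below reuse the quoted anecdote, but the squad size and enrollment are my own hypothetical assumptions: the per-pupil figure is driven by small group size, and the total pot is tiny relative to core spending.

```python
# Dollar figures from the quoted anecdote; participant count and enrollment
# are hypothetical assumptions for illustration.
cheer_cost_per_participant = 1_348
cheer_participants = 25          # assumed squad size
math_cost_per_student = 328
enrollment = 1_500               # assumed school enrollment

cheer_total = cheer_cost_per_participant * cheer_participants  # $33,700 total
# Redirecting the ENTIRE cheerleading budget into math yields, per student:
per_student_boost = cheer_total / enrollment
print(f"${per_student_boost:.2f} per student")                              # → $22.47
print(f"{per_student_boost / math_cost_per_student:.1%} of math spending")  # → 6.8%
```

Under these assumptions, zeroing out cheerleading entirely buys a one-time bump of a couple dozen dollars per student – hardly the pot of gold the outrage implies.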

Further, the requirement that poor urban (or other) districts currently falling below state outcome standards must re-allocate any and all resources from co-curricular and extracurricular activities toward improving achievement on tested outcomes may increase inequities in the depth and breadth of curricular offerings between higher and lower poverty schools – inequities that may be already quite substantial. That is, it may already be the case that higher poverty districts and those facing greater resource constraints are reallocating resources toward core, tested areas of curriculum and away from more advanced course offerings which extend beyond the tested curriculum and enriched opportunities including both elective courses and extracurricular activities.  Some evidence on this point already exists.

The perspective that low performing districts merely need to reallocate what they already have is particularly appealing in the current fiscal context, where state budgets and aid allocations to local public school districts are being slashed. Accepting Roza’s logic, states under court mandates or in the shadows of recent rulings regarding educational adequacy, but facing tight budgets may simply argue that high poverty and/or low performing districts should shift all available resources into the teaching of core, tested subjects. Lower poverty districts with ample resources that exceed minimum outcome standards face no such reallocation obligations, leading to substantial differences in depth and breadth of curriculum. Arguably a system that is both adequate and fair would protect the availability of deep and broad curriculum while simultaneously attempting to improve narrowly measured outcomes.

More later as this research progresses.


[1] “Downstate Suburban” refers to areas such as Westchester County and Long Island and is an official regional classification in the New York State Education Department Fiscal Analysis and Research Unit Annual Financial Reports data, which can be found here: http://www.oms.nysed.gov/faru/PDFDocuments/2008_Analysis.pdf and http://www.oms.nysed.gov/faru/Profiles/profiles_cover.html

[2] Interestingly, however, Bridgehampton and New York City have relatively similar “costs” due to Bridgehampton’s small size and New York City’s high student needs (see Duncombe and Yinger, 2009). The figures offered in this paragraph are based on Total Expenditures per Pupil from State Fiscal Profiles 2005. http://www.oms.nysed.gov/faru/Profiles/profiles_cover.html. Results are similar when comparing current operating expenditures per pupil.

Potential abuses of the Parent Trigger???

This article in the LA Times has been getting a lot of buzz today – http://www.latimes.com/news/local/la-me-compton-parents-20101207,0,1116485.story

The article discusses the use of what is called a “parent trigger” policy.  Here’s the synopsis:

On Tuesday, they intend to present a petition signed by 61% of McKinley parents that would require the Compton Unified School District to bring in a charter company to run the school. Charter schools are independently operated public schools.

“I know it’s never been done before, but I want to step up because I’m a parent who cares about my children and their education,” Murphy said Monday. She and other parents were meeting with organizers from Parent Revolution, a nonprofit that lobbied successfully last year for the so-called parent-trigger law.

So, what you’ve got is 61% of parents in a community pushing for a school to be converted to a charter school and potentially pushing for that school to be a specific type of charter school. This presents all sorts of interesting – and twisted – possibilities.

I wrote about a week ago on how some charter schools, like North Star Academy in Newark have established themselves as the equivalent of elite magnet schools – potentially engaging in activities such as pushing out lower performing kids over time.

So, my question for the day is whether these “parent trigger” policies might allow a simple majority of parents – or some defined majority share – to force a reorganization of their neighborhood school into a charter – that would subsequently weed out those other “less desirable kids?”

That is, does this new policy of simple majority (mob) rule allow parents in a specific community to redefine their neighborhood school so that the school no longer serves lower performing kids, or kids whose parents are less able – or for that matter less interested – in engaging in the level of parent involvement that might be required by a specific charter operator? In short, can the majority of parents effectively kick out a minority of parents that they don’t like – including parents of kids with disabilities or non-English-speaking parents?

Sure, you say – charters can’t discriminate in this way because they must rely on lotteries for admissions and must take children with disabilities and those unable to speak English. They would have to accept those kids in the neighborhood. Yes, by law this might be true. But experience with many charters suggests otherwise. Many do rely on attrition to boost scores, and somehow avoid serving kids with disabilities and non-English-speaking kids – in ways the regular neighborhood school never could.

Taking this a step further, envision a neighborhood split along language, ethnic or even religious lines. Can the parents of the majority group force their neighborhood school to be reconstituted as a cultural, language or for that matter religion (argued as culture) specific school that is effectively hostile to the minority?

Hey education law friends – help me out with the possibilities here?

The Circular Logic of Quality-Based Layoff Arguments

Many pundits are responding enthusiastically to the new LA Times article on quality-based layoffs – on how dismissing teachers based on value-added scores rather than seniority would have saved LAUSD many of its better teachers, rather than simply its older ones.

Some are pointing out that this new LA Times report is the “right” way to use value-added, as compared with the “wrong” way the LA Times had used the information earlier this year.

Recently, I explained the problematic circular logic being used to support these “quality-based layoff” arguments. Obviously, if we dismiss teachers based on “true” quality measures, rather than experience which is, of course, not correlated with “true” quality measures, then we save the jobs of good teachers and get rid of bad ones. Simple enough? Not so. Here’s my explanation, once again.

This argument draws on an interesting thought piece and simulation posted at http://www.caldercenter.org  ( Teacher Layoffs: An Empirical Illustration of Seniority vs. Measures of Effectiveness), which was later summarized in a (less thoughtful) recent Brookings report (http://www.brookings.edu/~/media/Files/rc/reports/2010/1117_evaluating_teachers/1117_evaluating_teachers.pdf).

That paper demonstrated that if one dismisses teachers based on VAM, future predicted student gains are higher than if one dismisses teachers based on experience (or seniority). The authors point out that less experienced teachers are scattered across the full range of effectiveness – based on VAM – and therefore, dismissing teachers on the basis of experience leads to dismissal of both good and bad teachers – as measured by VAM. By contrast, teachers with low value-added are invariably – low value-added – BY DEFINITION. Therefore, dismissing on the basis of low value-added leaves more high value-added teachers in the system – including more teachers who show high value-added in later years (current value added is more correlated with future value added than is experience).

It is assumed in this simulation that VAM (based on a specific set of assessments and model specification) produces the true measure of teacher quality both as basis for current teacher dismissals and as basis for evaluating the effectiveness of choosing to dismiss based on VAM versus dismissing based on experience.

The authors similarly dismiss principal evaluations of teachers as ineffective because they too are less correlated with value-added measures than value-added measures with themselves.

Might I argue the opposite? – Value-added measures are flawed because they only weakly predict which teachers we know – by observation – are good and which ones we know are bad? A specious argument – but no more specious than its inverse.

The circular logic here is, well, problematic. Of course if we measure the effectiveness of the policy decision in terms of VAM, making the policy decision based on VAM (using the same model and assessments) will produce the more highly correlated outcome – correlated with VAM, that is.

However, it is quite likely that if we simply used different assessment data or a different VAM model specification to evaluate the results of the alternative dismissal policies, we might find neither VAM-based dismissal nor experience-based dismissal better or worse than the other.

For example, Corcoran and Jennings conducted an analysis of the same teachers on two different tests in Houston, Texas, finding:

…among those who ranked in the top category (5) on the TAKS reading test, more than 17 percent ranked among the lowest two categories on the Stanford test. Similarly, more than 15 percent of the lowest value-added teachers on the TAKS were in the highest two categories on the Stanford.

  • Corcoran, Sean P., Jennifer L. Jennings, and Andrew A. Beveridge. 2010. “Teacher Effectiveness on High- and Low-Stakes Tests.” Paper presented at the Institute for Research on Poverty summer workshop, Madison, WI.

So, what would happen if we did a simulation of “quality based” layoffs versus experience-based layoffs using the Houston data, where the quality-based layoffs were based on a VAM model using the Texas Assessments (TAKS), but then we evaluate the effectiveness of the layoff alternatives using a value-added model of Stanford achievement test data? Arguably the odds would still be stacked in favor of VAM predicting VAM – even if different VAM measures (and perhaps different model specifications). But, I suspect the results would be much less compelling than the original simulation.

The results under this alternative approach may, however, be reduced entirely to noise – meaning that the VAM-based layoffs would be the equivalent of random firings – drawn from a hat and poorly if at all correlated with the outcome measure estimated by a different VAM – as opposed to experience-based firings. Neither would be a much better predictor of future value-added. But for all their flaws, I’d take the experience-based dismissal policy over the roll-of-the-dice, randomized firing policy any day.
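The circularity can be illustrated with a toy simulation – entirely invented numbers, not the Houston or LAUSD data: rank and dismiss the bottom 10% either by this year's VAM on test A or by experience, then judge the retained teachers either on next year's test-A VAM or on VAM from a different test that is only weakly linked to the same underlying skill.

```python
import random

random.seed(7)

N = 10_000        # hypothetical teachers
LAYOFFS = 1_000   # dismiss the bottom 10% under each policy

skill = [random.gauss(0, 1) for _ in range(N)]
experience = [random.uniform(0, 20) for _ in range(N)]      # assume unrelated to skill
vam_a_now  = [s + random.gauss(0, 1) for s in skill]        # this year's VAM, test A
vam_a_next = [s + random.gauss(0, 1) for s in skill]        # next year's VAM, test A
vam_b_next = [0.2 * s + random.gauss(0, 1) for s in skill]  # VAM on a different test

def retained_mean(rank_by, outcome):
    """Lay off the bottom LAYOFFS teachers by rank_by; average outcome of the rest."""
    keep = sorted(range(N), key=lambda i: rank_by[i])[LAYOFFS:]
    return sum(outcome[i] for i in keep) / len(keep)

# Advantage of VAM-based over experience-based layoffs, judged on the SAME test...
same_gap = retained_mean(vam_a_now, vam_a_next) - retained_mean(experience, vam_a_next)
# ...and judged on a DIFFERENT test.
cross_gap = retained_mean(vam_a_now, vam_b_next) - retained_mean(experience, vam_b_next)
print(round(same_gap, 3), round(cross_gap, 3))
```

Judged against its own future values, VAM-based dismissal looks clearly better; judged against a different assessment, most of the apparent advantage evaporates. That's the circular logic in a nutshell: the policy wins mainly when scored by its own yardstick.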

In the case of the LA Times analysis, the situation is particularly disturbing if we look back on some of the findings in their own technical report.

I explained in a previous post that the LA Times value-added model had potentially significant bias in its estimates of teacher quality. For example, I explained that:

Buddin finds that black teachers have lower value-added scores for both ELA and MATH. Further, these are some of the largest negative effects in the second level analysis – especially for MATH. The interpretation here (for parent readers of the LA Times web site) is that having a black teacher for math is worse than having a novice teacher. In fact, it’s the worst possible thing! Having a black teacher for ELA is comparable to having a novice teacher.

Buddin also finds that having more black students in your class is negatively associated with teachers’ value-added scores, but writes off the effect as small. Teachers of black students in LA are simply worse? There is NO discussion of the potentially significant overlap between black teachers, novice teachers and serving black students, concentrated in black schools (as addressed by Hanushek and Rivkin in link above).

By contrast, Buddin finds that having an Asian teacher is much, much better for MATH. In fact, Asian teachers are as much better (than white teachers) for math as black teachers are worse! Parents – go find yourself an Asian math teacher in LA? Also, having more Asian students in your class is associated with higher teacher ratings for Math. That is, you’re a better math teacher if you’ve got more Asian students, and you’re a really good math teacher if you’re Asian and have more Asian students?????

One of the more intriguing arguments in the new LA Times article is that under the seniority based layoff policy:

Schools in some of the city’s poorest areas were disproportionately hurt by the layoffs. Nearly one in 10 teachers in South Los Angeles schools was laid off, nearly twice the rate in other areas. Sixteen schools lost at least a fourth of their teachers, all but one of them in South or Central Los Angeles.

http://articles.latimes.com/2010/dec/04/local/la-me-1205-teachers-seniority-20101204/2

That is, new teachers who were laid off based on seniority preferences were concentrated in high-need schools. But so too were teachers with low value-added ratings.

While arguing that “far fewer” teachers would be laid off in high-need schools under a quality-based layoff policy, the LA Times does not, however, offer up how many teachers would have been dismissed from these schools had its biased value-added measures been used instead. Recall from the original LA Times analysis:

97% of children in the lowest performing schools are poor, and 55% in higher performing schools are poor.

Combine this finding with the findings above regarding the relationship between race and value-added ratings and it is difficult to conceive how VAM based layoffs of teachers in LA would not also fall disparately on high poverty and high minority schools. The disparate effect may be partially offset by statistical noise, but that simply means that some teachers in lower poverty schools will be dismissed on the basis of random statistical error, instead of race-correlated statistical bias (which leads to a higher rate of dismissals in higher poverty, higher minority schools).

Further, the seniority based layoff policy leads to more teachers being dismissed in high poverty schools because the district placed more novice teachers in high poverty schools, whereas the value-added based layoff policy would likely lead to more teachers being dismissed from high poverty, high minority schools, experienced or not, because they were placed in high poverty, high minority schools.

So, even though we might make a rational case that seniority based layoffs are not the best possible option, because they may not be highly correlated with true (not “true”) teaching quality, I fail to see how the current proposed alternatives are much if any better.  They only appear to be better when we measure them against themselves as the “true” measure of success.

The Curious Duplicity of NCTQ

NCTQ fashions itself as a leading think tank on promoting teacher quality in K-12 education. NCTQ adopts a relatively extreme position that teacher quality is the one and only thing that matters! Teacher quality is THE determining factor of school quality.

I also believe that teacher quality is very important. I also agree with NCTQ on the point that content knowledge, at the middle and secondary levels especially, is particularly important and that simply being listed as “qualified” to teach specific content is no guarantee.

As part of their effort to improve teacher quality, NCTQ has been going around doing “studies” and applying ratings to the quality of teacher preparation institutions. Now, I noted in my previous post that NCTQ and others may actually be missing the boat on who is actually preparing teachers. But let’s set that aside for a moment. One would think that if NCTQ is so interested in teacher quality as the primary determinant of school quality and student success – and teacher expertise as an important part of that equation at higher grade levels – then any analysis of the quality of undergraduate or graduate programs that train teachers would have to place significant emphasis on faculty quality and expertise, right? It would make little sense to simply review which textbooks are used, what the course descriptions say, or what the curricular sequence happens to be. Right?

Out of a multitude of indicators on teacher preparation institutions, NCTQ includes only one – yes, one – regarding faculty quality, which is described as follows:

In our evaluation of programs, we examined teaching responsibilities for all faculty members, as indicated by course assignments in course schedules, excluding all clinical coursework. We looked for two specific examples of inappropriate assignments: 1) an instructor teaching across the areas of foundations of education, methods and educational psychology; and/or 2) an instructor who teaches both reading and mathematics methods courses. Other inappropriate assignments may well be made but were not included in our review.

http://www.nctq.org/edschoolreports/illinois/standards/26Methodology.jsp

Yep, that’s it. All they address is whether a faculty member appears to teach across two areas that no faculty member, in their view, could be sufficiently prepared to teach. The rest is based largely on textbooks chosen, syllabi and course descriptions, regardless of faculty expertise. Clearly this was a matter of data convenience. It’s hard to figure out whether individual faculty members truly possess expertise in their fields, short of evaluating their individual academic backgrounds, research and writing on the topic.

But it is absurd for an organization that believes teacher quality in K-12 education is paramount, and content expertise critical, to ignore faculty expertise outright in its evaluations of teacher preparation institutions.

Here’s their FAQ on the long-term project of evaluating teacher preparation programs: http://www.nctq.org/p/response/evaluation_faq.jsp

Related reading (actual research):

Wolf-Wendel, L., Baker, B.D., Twombly, S., Tollefson, N., & Mahlios, M. (2006). Who’s Teaching the Teachers? Evidence from the National Survey of Postsecondary Faculty and Survey of Earned Doctorates. American Journal of Education, 112(2), 273–300.

Ed Schools

Ed schools seem to make an easy target in public policy debates over the quality of American public schooling and the American teacher workforce.

In many recent lopsided “ed school as the root of all evil” presentations, “Ed Schools,” are treated as some easily defined, static entity over time. In the book of reformyness (chapter 7, verse 2), “Ed Schools” necessarily consist of some static set of traditional higher education institutions – 4 year teachers colleges including regional state colleges and flagship universities – where a bunch of crusty old education professors spew meaningless theory at wide-eyed undergrads (who graduated at the bottom of their high school class) seeking that golden ticket to a job for life – with summers off.

In order to craft a clearly understandable (albeit entirely false) dichotomy of policy alternatives, pundits then present teachers who have obtained alternative certification as a group of individuals, nearly all of whom necessarily attended highly selective colleges and majored in something really, really rigorous and then received their certification through some more expeditious and clearly much more practical and useful fast-tracked option.

This was certainly the theme of a discussion (hashtag #edschools) at the Thomas B. Fordham Institute, actively tweeted the other day by Mike Petrilli and a few others. What I found most interesting was that no one really challenged the assumptions that “ed schools” are some easily definable group of traditional higher education institutions – that this group has been unchanged over decades – and that teacher training is some consistent, exclusive domain of traditional public higher education institutions, specifically as an undergraduate degree-granting enterprise. That there are, and have always been, oh… about a thousand or so ed schools… that well… keep on doing the same damn thing over and over again (for the past 50 years, one participant tweeted)… and well… no one ever shuts down the bad Ed Schools… and that’s why we’re in such bad shape! It’s really that simple.

Because this characterization is simply assumed to be true, the obvious way to crack this broken and declining system is to expand alt. certification and allow more non-traditional, for profit and entrepreneurial organizations – especially non-university organizations to grant teaching credentials – heck – let’s let them actually grant degrees. Who needs brick-and-mortar colleges anyway? Given the assumed static nature of the declining and antiquated system of “Ed Schools” that has brought us to our knees, this is the only answer!!!!!

One of my favorite tweets from the event was from Mike Petrilli, relaying a comment by Kate Walsh:

Walsh: There are 1410 Ed schools in the country. NCTQ spent 5 years determining that number.

You know what, Kate, by the time you were done figuring that out (however you did it), the number had already changed. Also, FYI, there are actually some data sources out there that might have been helpful for tabulating the existing degree-granting programs and the numbers of degrees conferred by those programs.

So, let’s take a look at some of the data on degrees conferred across all education fields in 1990, 2000 and 2010.

Let’s start with a quick look at the total degrees conferred in “education” as defined by degree classification codes (CIP Codes), across all institutions granting such degrees nationally. The interesting twist here is that bachelor’s degree production of education degrees has been relatively constant over time for about 20 years and perhaps longer. Doctoral degree production increased from 1990 to 2000, but stagnated after that. On the other hand, Master’s degree production has skyrocketed.

Now, one might try to argue that what that’s really about is all of those currently practicing teachers who are just accumulating those worthless master’s degrees to get that salary bump. I will write more on this topic at a later point, but that’s not likely the dominant scenario. Yes, many of the master’s degrees are obtained to broaden fields of certification in order to give current teachers more options – either assignment options in their current districts, or other job opportunities. AND, many of the master’s degrees these days are initial credentials granted to individuals who did not receive their teaching credential as an undergraduate. Many initial teaching credentials are granted at the master’s, not bachelor’s, level. A substantial amount of teacher training goes on at the master’s, not undergraduate, level. No matter the case, the master’s degrees – of which there are so many, and so many more being granted than bachelor’s degrees – are the interesting story here.

Is it really that the same old traditional higher education institutions, with crusty old, out-of-date professors, are now just spewing out master’s degrees? Or is something else at work here?

Well, here are the top 25 master’s producers in education back in 1990. Even at that time, the largest master’s degree granting institutions were not the top universities – or even the top teachers colleges. But some of those schools were at least in the mix. Teachers College of Columbia University, Ohio State, Michigan State and Harvard all appear in the top 25 in 1990.

Here are the top 25 master’s producers in 2000. Here, the tide begins to shift a bit. Schools like NOVA Southeastern with their online programs, and National-Louis grow even bigger than they had been a decade earlier. Teachers College retains a top 25 spot, as does Ohio State, and University of Minnesota makes the list. Harvard is gone.

By 2009, “Ed Schools” are a substantially different mix. Not only that, but look at the volume of degree production. Back in 1990, Ed Schools at respectable major universities were putting out about 600 master’s degrees in education-related fields per year. They held on to similar rates in 2000 and still in 2009. But by 2009, Walden University and U. of Phoenix were each cranking out 4,500+ master’s degrees per year. Grand Canyon U. comes in next in line. These are the entrepreneurial upstarts that are the product of minimized regulation of teaching credentials.

If there truly has been a decline in the quality of the teacher workforce, and if pundits truly believe that this supposed decline is related somehow to “Ed Schools,” then it might behoove those same pundits to explore the dramatic changes that have, in fact, already occurred in the “Ed School” marketplace.

If there has been a dramatic decline in teacher preparation, and in specialized training, it may be worth taking a look at those institutions that have emerged to dominate the production of education degrees and credentials in recent years. After all, Walden and Phoenix each produce 5 to 10 times the master’s degree credentials in education of major public universities. And, production of education master’s degrees is now nearly double the level of production of education bachelor’s degrees. And many of these entrepreneurial start-ups specifically frame their master’s programs as an option for individuals with a bachelor’s degree in “something else” to obtain a teaching credential.

Is even more deregulation and entrepreneurial teacher preparation what we really need? Can one really blame the traditional higher education institutions, whose share of production has declined steadily for decades, for declining teacher quality? Only if you ignore these trends, which I expect these pundits will continue to do.