Truly Uncommon in Newark…

A while back I wrote a post explaining why I felt that while Robert Treat Academy Charter School in Newark is a fine school, it’s hardly a replicable model for large scale reform in Newark, or elsewhere. I have continued over time to write about the extent to which Newark charter schools in particular have engaged in a relatively extreme pattern of cream skimming. The same is true in Jersey City and Hoboken, but not so in Trenton. But Trenton also offers us fewer examples of those high-flying charters that we are supposed to view as models for the future of NJ education. When I wrote my earlier post on Treat, I somehow completely bypassed North Star Academy, which I would now argue is even less scalable than Robert Treat. That’s not to say that North Star Academy is not a highly successful school for the students that it serves… or at least for those who actually stay there over time. Rather, Star of the North is yet another example of why the “best” New Jersey charter schools provide a very limited path forward for New Jersey urban school reform. Let’s take a look:

So, here’s where North Star fits in my 8th grade performance comparisons of beating the odds, based on the statistical model I explain in previous posts:

In this figure (above), we see that North Star certainly beats the odds at 8th grade. Now, we can also already see that North Star has a much lower % free lunch than nearly any other school in Newark, limiting scalability right off the bat. There just aren’t enough non-poor kids in Newark to create many more schools with demography like North Star. Not to mention the complete lack of children with disabilities or limited English language proficiency.

Here’s North Star on the map, in context. Smaller lighter circles are lower % free lunch schools. Most of the charters in this map are… well.. smaller lighter circles (with charters identified with a red asterisk). Not all, however, are as non-representative as North Star.

Now, here’s the part that sets North Star and a few others apart – at first in a seemingly good way…

If we take the 2009 assessments for each grade level, one interesting finding is that the charter schools serving lower grade levels in Newark are generally doing less well than the NPS average (red line). But those schools that start at grade 5 seem to be picking up a population that right away is performing comparably to or better than the NPS average. See, for example, TEAM and Greater Newark (comparable to NPS in their first grade served, 5th) and, of course, North Star, whose students perform well above NPS in their first year – likely not fully a North Star effect, but rather at least partly a selection effect (lottery or not, it’s a different population than those served in the district). More strikingly, with each increase in grade level, proficiency rates climb dramatically toward 100% by 8th grade. Either they are simply doing an amazing job of bringing these kids to standards over a 3 year period… or … well… something else.

The figure above looks at 6th, 7th, and 8th graders in the same year. That is, they aren’t the same kids over time doing better and better. But, even if we looked at 6th graders in one year, 7th graders the next year and 8th graders the following year, we wouldn’t necessarily be looking at the same kids. In fact, one really easy way to make cohort test scores rise is to systematically shed – push out – those students who perform less well each year. Sadly, NJDOE does not provide the individual student data necessary for such tracking. But there are a few other ways to explore this possibility.

First, here are the cohort “attrition rates” based on 3 sequential cohorts for Newark Charter schools:

In this figure, we can see that for the 2009 8th graders, North Star began with 122 5th graders and ended with 101 in 8th. The subsequent cohort also began with 122, and ended with 104. These are sizable attrition rates. Robert Treat, on the other hand, maintains cohorts of about 50 students – non-representative cohorts indeed – but without the same degree of attrition as North Star. Now, a school could maintain cohort size even with attrition if that school were to fill vacant slots with newly lotteried-in students. This, however, is risky to the performance status of the school, if performance status is the main selling point.
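For readers who want the arithmetic behind those cohort counts, here is a minimal sketch. The function name is mine, and the only inputs are the two North Star cohort sizes quoted above. Note that this is net attrition: if any departing students were replaced, the number who actually left would be higher than what these counts show.

```python
def attrition_rate(start_count, end_count):
    """Share of the starting cohort that is gone by the end (net of any backfill)."""
    if start_count <= 0:
        raise ValueError("start_count must be positive")
    return (start_count - end_count) / start_count


# Cohort sizes quoted in the text: 5th grade enrollment -> 8th grade enrollment.
north_star_cohorts = {
    "2009 8th graders": (122, 101),
    "subsequent cohort": (122, 104),
}

for label, (start, end) in north_star_cohorts.items():
    rate = attrition_rate(start, end)
    print(f"{label}: {start} -> {end}, net attrition {rate:.1%}")
```

That works out to roughly 17% and 15% of each entering cohort.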

Here’s what the cohort attrition looks like when tracked with the state assessment data.

Here, I take two 8th grade cohorts and trace them backwards. I focus on General Test Takers only, and use the ASK Math assessment data in this case. Quick note about those data – Scores across all schools tend to drop in 7th grade due to cut-score placement (not because kids get dumber in 7th grade and wise up again in 8th). The top section of the table looks at the failure rates and number of test takers for the 6th grade in 2005-06, 7th in 2006-07 and 8th in 2007-08. Over this time period, North Star drops 38% of its general test takers. And, cuts the already low failure rate from nearly 12% to 0%. Greater Newark also drops over 30% of test takers in the cohort, and reaps significant reductions in failures (partially proficient) in the process.

The bottom half of the table shows the next cohort in sequence. For this cohort, North Star sheds 21% of test takers between grades 6 and 8, and cuts failure rates nearly in half, from a rate that was low to begin with (and already low in the previous grade level, 5th grade, the entry year for the school). Gray and Greater Newark also shed significant numbers of students, and Greater Newark in particular sees significant reductions in the share of non(uh… partially)proficient students.
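To see why attrition alone can move a failure rate, here is a hypothetical sketch. The cohort of 100 test takers is invented for illustration; only the rough magnitudes (about 12% failing at the start, 38% of test takers gone by 8th grade) echo the first cohort above. Nobody's score changes in either scenario. Without student-level data we cannot tell which scenario, if either, resembles what actually happened.

```python
def failure_rate_after_attrition(n_takers, n_failing, n_leavers, failing_leavers):
    """Failure rate among the students who remain, holding every score fixed."""
    if failing_leavers > min(n_failing, n_leavers):
        raise ValueError("failing_leavers exceeds failing students or leavers")
    remaining = n_takers - n_leavers
    return (n_failing - failing_leavers) / remaining


# Hypothetical cohort: 100 test takers, 12 failing, 38 gone by 8th grade.
n_takers, n_failing, n_leavers = 100, 12, 38

# Scenario 1: leavers fail at the same rate as the cohort as a whole.
proportional = failure_rate_after_attrition(
    n_takers, n_failing, n_leavers, n_failing * n_leavers / n_takers
)

# Scenario 2: every failing student happens to be among the leavers.
selective = failure_rate_after_attrition(n_takers, n_failing, n_leavers, n_failing)

print(f"attrition unrelated to performance: {proportional:.1%} failing")
print(f"attrition concentrated among failing students: {selective:.1%} failing")
```

In the first scenario the failure rate stays at 12%. In the second it falls to 0% with no improvement in any remaining student's performance.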

My point here is not that these are bad schools, or that they are necessarily engaging in any particular immoral or unethical activity. But rather, that a significant portion of the apparent success of schools like North Star is a) attributable to the demographically different population they serve to begin with and b) attributable to the patterns of student attrition that occur within cohorts over time.

Again, the parent perspective and public policy perspective are entirely different. From a parent (or child) perspective, one is relatively unconcerned whether the positive school effect is a function of the selectivity of the peer group and attrition, so long as there is a positive effect. But, from a public policy perspective, the model is only useful if the majority of positive effects are not due to peer group selectivity and attrition, but rather to the efficacy and transferability of the educational models, programs and strategies. Given the uncommon student populations served by many Newark charters and even more uncommon attrition patterns among some… not to mention the grossly insufficient data… we simply have no way of knowing whether these schools can provide insights for scalable reforms.

As they presently operate, however, many of the standout schools – with North Star as a shining example – do not represent scalable reforms.

New Jersey Superintendent Salaries in Context

These two figures provide some updated context to an earlier post on Arbitrary Pay Limits for NJ Administrators. The bombastic rhetoric on this topic refuses to die down. So, here are a few more figures to put NJ public school district administrator salaries into context. Note that these two figures compare THE TOP 20 SUPERINTENDENT SALARIES to salaries of a) the majority of private independent school headmasters statewide, and b) the average of a large number of NJ Hospital administrators (non-physician chief executives). Just a little more fodder for the conversation.

Figure 1

Mean and Median Compensation by Group

Figure 2

Top 20 Public School Superintendents (2009-10) Compared with Private School Headmasters (2008)

Enough said.

3 very weak arguments for using weak indicators

This post is partly in response to the Brookings Institution report released this week which urged that value-added measures be considered in teacher evaluation:

http://www.brookings.edu/~/media/Files/rc/reports/2010/1117_evaluating_teachers/1117_evaluating_teachers.pdf

However, this post is more targeted at the punditry that has followed the Brookings report – the punditry that now latches onto this report as a significant endorsement of using value-added ratings as a major component in high-stakes personnel decisions. Personally, I didn’t read it that way. Nowhere did I see this report arguing strongly for a substantial emphasis on value-added measures. That said, I actually felt that the report based its rather modest conclusions on 3 deeply flawed arguments.

Argument 1 – Other methods of teacher evaluation are ineffective at determining good versus bad teachers because those methods are only weakly correlated with value-added measures.

Or, in other words, current value added measures, while only weak predictors of future value-added, are still stronger predictors of future value-added (using the same measures and models) than other indirect measures of teacher quality such as experience or principal evaluations.

This logic undergirds the quality-based dismissal example in the recent Brookings report which is based on an earlier Calder Center paper (www.caldercenter.org). That paper showed that if one dismisses teachers based on VAM, future predicted student gains are higher than if one dismisses teachers based on experience (or seniority). The authors point out that less experienced teachers are scattered across the full range of effectiveness – based on VAM – and therefore, dismissing teachers on the basis of experience leads to dismissal of both good and bad teachers – as measured by VAM. By contrast, teachers with low value-added are invariably – low value-added – BY DEFINITION. Therefore, dismissing on the basis of low value-added leaves more high value-added teachers in the system – including more teachers who show high value-added in later years (current value added is more correlated with future value added than is experience).

It is assumed in this simulation that VAM (based on a specific set of assessments and model specification) produces the true measure of teacher quality both as basis for current teacher dismissals and as basis for evaluating the effectiveness of choosing to dismiss based on VAM versus dismissing based on experience.

The authors similarly write off principal evaluations of teachers as ineffective because they too are less correlated with value-added measures than value-added measures with themselves.

Might I argue the opposite? – Value-added measures are flawed because they only weakly predict which teachers we know – by observation – are good and which ones we know are bad? A specious argument – but no more specious than its inverse.

The circular logic here is, well, problematic. Of course if we measure the effectiveness of the policy decision in terms of VAM, making the policy decision based on VAM (using the same model and assessments) will produce the more highly correlated outcome – correlated with VAM, that is.

However, it is quite likely that if we simply used different assessment data or a different VAM model specification to evaluate the results of the alternative dismissal policies, we might find neither VAM-based dismissal nor experience-based dismissal better or worse than the other.

For example, Corcoran and Jennings conducted an analysis of the same teachers on two different tests in Houston, Texas, finding:

…among those who ranked in the top category (5) on the TAKS reading test, more than 17 percent ranked among the lowest two categories on the Stanford test. Similarly, more than 15 percent of the lowest value-added teachers on the TAKS were in the highest two categories on the Stanford.

Corcoran, Sean P., Jennifer L. Jennings, and Andrew A. Beveridge. 2010. “Teacher Effectiveness on High- and Low-Stakes Tests.” Paper presented at the Institute for Research on Poverty summer workshop, Madison, WI.

So, what would happen if we did a simulation of “quality based” layoffs versus experience-based layoffs using the Houston data, where the quality-based layoffs were based on a VAM model using the Texas Assessments (TAKS), but then we evaluate the effectiveness of the layoff alternatives using a value-added model of Stanford achievement test data? Arguably the odds would still be stacked in favor of VAM predicting VAM – even if different VAM measures (and perhaps different model specifications). But, I suspect the results would be much less compelling than the original simulation.

The results under this alternative approach may, however, be reduced entirely to noise – meaning that the VAM-based layoffs would be the equivalent of random firings – drawn from a hat and poorly if at all correlated with the outcome measure estimated by a different VAM – as opposed to experience-based firings. Neither would be a much better predictor of future value-added. But for all their flaws, I’d take the experience-based dismissal policy over the roll of the dice, randomized firing policy any day.
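The circularity can be illustrated with a toy simulation. To be clear, this is not the Calder simulation and it uses no Houston data. Every parameter below is invented, and the model assumes by construction that part of what a value-added estimate picks up is specific to the test it is built on. Under that assumption, dismissing the bottom quarter of teachers on test A's year 1 estimate looks far more effective when graded on test A in year 2 than when graded on a different test B.

```python
import random
import statistics


def simulate_dismissals(n_teachers=20000, dismiss_share=0.25, seed=0):
    """Return (gain on the same measure, gain on a different measure).

    All parameters are assumptions chosen for illustration only.
    """
    rng = random.Random(seed)
    teachers = []
    for _ in range(n_teachers):
        shared = rng.gauss(0, 0.5)      # skill that shows up on any test
        a_specific = rng.gauss(0, 1.0)  # persistent fit with test A only
        b_specific = rng.gauss(0, 1.0)  # persistent fit with test B only
        teachers.append({
            "a_year1": shared + a_specific + rng.gauss(0, 1.7),
            "a_year2": shared + a_specific + rng.gauss(0, 1.7),
            "b_year2": shared + b_specific + rng.gauss(0, 1.7),
        })

    # "Quality based" dismissal: drop the lowest share on test A, year 1.
    ranked = sorted(teachers, key=lambda t: t["a_year1"])
    retained = ranked[int(n_teachers * dismiss_share):]

    def gain(measure):
        everyone = statistics.mean([t[measure] for t in teachers])
        kept = statistics.mean([t[measure] for t in retained])
        return kept - everyone

    return gain("a_year2"), gain("b_year2")


same_measure_gain, other_measure_gain = simulate_dismissals()
print(f"gain judged by the same VAM, next year: {same_measure_gain:.3f}")
print(f"gain judged by a different test's VAM:  {other_measure_gain:.3f}")
```

The noise level is set so that the same-test year-to-year correlation is about 0.3. How much of a real estimate is test-specific is exactly the open empirical question, so the size of the gap here means nothing. The point is only that grading a VAM-based policy with the same VAM stacks the deck.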

Argument 2 – We should be unconcerned about high misclassification errors – falsely identifying good teachers as bad, therefore resulting in random harm to teachers – Rather, we should be concerned that current methods falsely identify bad teachers as good, doing lifelong harm to kids.

The Brookings report argues:

Much of the concern and cautions about the use of value-added have focused on the frequency of occurrence of false negatives, i.e., effective teachers who are identified as ineffective. But framing the problem in terms of false negatives places the focus almost entirely on the interests of the individual who is being evaluated rather than the students who are being served. It is easy to identify with the good teacher who wants to avoid dismissal for being incorrectly labeled a bad teacher. From that individual’s perspective, no rate of misclassification is acceptable. However, an evaluation system that results in tenure and advancement for almost every teacher and thus has a very low rate of false negatives generates a high rate of false positives, i.e., teachers identified as effective who are not. These teachers drag down the performance of schools and do not serve students as well as more effective teachers.

Again, the false identification assumption regarding current evaluations is based on the assumption that the value-added measure is a true measure of teacher quality. That is, we know current evaluations are bad because many teachers get tenure but have bad value-added ratings. But, value-added ratings are good because some teachers who had good value-added ratings at one point, under one type of value-added model applied to one type of assessments, also have good value-added ratings later using the same model specification applied to similar or same testing data.

Setting that circular logic issue aside, we are faced with the moral dilemma I posed in an earlier post. This argument is all about the “adults vs. kids” issue, and the assumption that if it’s really about the kids, it can’t be at all about the adults in the system, and vice versa. The reality however is that a system that is a great workplace for adults can translate to a better educational setting for children and a system that creates a divisive, negative workplace setting for the adults is unlikely to translate to a better educational setting for the kids. It’s more likely to be a both/and, not either/or situation.

I explained previously:

I guess that one could try to dismiss those moral, ethical and legal concerns regarding wrongly dismissing teachers by arguing that if it’s better for the kids in the end, then wrongly firing 1 in 4 average teachers along the way is the price we have to pay. I suspect that’s what the pundits would argue – since it’s about fairness to the kids, not fairness to the teachers, right? Still, this seems like a heavy toll to pay, an unnecessary toll, and quite honestly, one that’s not even that likely to work even in the best of engineered circumstances.

Too often overlooked in these analyses is the question of who will really want to teach in an education system where the chance of having one’s career cut short by random statistical error is quite high????? Who will be waiting in line? What kind of workplace will that create? And can we really expect average teaching quality to improve as a result?

Argument 3 – Other professions use “weak” indicators or signals of performance, like using the SAT for college admission or using patient mortality rates to evaluate hospital quality.

The Brookings report argues:

It is instructive to look at other sectors of the economy as a gauge for judging the stability of value-added measures. The use of imprecise measures to make high stakes decisions that place societal or institutional interests above those of individuals is wide-spread and accepted in fields outside of teaching.

Examples from Brookings:

In health care, patient volume and patient mortality rates for surgeons and hospitals are publicly reported on an annual basis by private organizations and federal agencies and have been formally approved as quality measures by national organizations.

The correlation of the college admission test scores of college applicants with measures of college success is modest (r = .35 for SAT combined verbal + math and freshman GPA[9]). Nevertheless nearly all selective colleges use SAT or ACT scores as a heavily weighted component of their admission decisions even though that produces substantial false negative rates (students who could have succeeded but are denied entry).

On its face, the argument that because other professions use weak indicators, public education should do the same is absurd. And, this argument presents as a given, with very weak justification and a handful of cherry-picked citations, that these weak signals play a significant role in high stakes decision-making. A more thorough review of health-care policy literature in particular raises many of the same concerns we hear in the education debate over institutional and individual performance measures – precision, accuracy and incentives. There also exists a similar divide in perspectives between healthcare policy wonks and management organizations versus physicians with regard to the accuracy and usefulness of the indicators and the incentives created by specific measures.

Many pundits out there tweeting and blogging about this new Brookings report are the same pundits who continue to argue that value-added ratings should constitute as much as 50% of teacher evaluation – and that somehow this new Brookings report validates their claim. I don’t see where the Brookings report goes anywhere near that far.

To those viewing the Brookings report in that light, implicit in the “other sectors do it” argument is that the SAT and mortality rates are considered major factors for evaluating students for admission or for evaluating hospital quality. Are they really? In an era where more and more colleges are making the SAT optional, how many are using it as 50% of admissions criteria? Yes, most highly selective colleges do still require the SAT, and it no doubt serves as a tipping factor on admissions decisions (largely out of convenience when taking the first cut at a large applicant pool). But, several have abandoned use of SAT altogether (http://www.fairtest.org/university/optional), perhaps because it is perceived to be such a weak signal – or because of all of the perverse incentives and inequities associated with the SAT. Would anyone seriously consider using patient mortality rates alone as 50% of the value for rating hospital quality – determining hospital closures?

And even the batting average comparison – the authors argue that past batting average is only a weak predictor of future batting average – but is clearly still important in player personnel decisions. But what percentage does batting average count for? Does a baseball GM consider which pitchers the batter went up against that season? Would batting average count for, say, 50% of the decision – absolutely, a fixed share, deal breaker? It may be important, but it’s one of many, many statistics most of which are also likely considered in context – in a very flexible decision framework (more art than science?).

And then there’s the issue of the incentives created by emphasizing a specific measure. What may be good in baseball may not in healthcare or education!

Let’s say we put this much emphasis on batting average. What is a player to do? How can the player improve his worth? Getting more hits – getting traded to a team in a division with poor pitching… to get more hits? Neither is a deeply problematic incentive. There’s not much downside to improving one’s batting average (setting aside the role of performance enhancing drugs and all that). Besides, it’s a freakin’ game!!!!!… a game that involves gaming the system. A game that is based largely on obsession with statistics and trying to figure out which ones are and aren’t meaningful.

The SAT is different. Much has been made of the perverse incentives, including increased classification among students from higher income families to take an un-timed SAT (http://www.slate.com/id/2141820/) and entire industries that have emerged around the emphasis on SAT scores for college admission, reinforcing socio-economic disparities in SAT performance. Even if the SAT could be a reasonable indicator, its usefulness has been distorted and significantly compromised by the incentives and resulting behaviors. Hence the decision by many colleges to consider the test optional.

What about hospitals trying to reduce mortality rates? Turning away the sickest, highest risk patients and most complex cases is one option. Unlike the batting average measure, there is a significant downside to this one. Those with the greatest needs don’t get served!

To argue that these supposedly analogous measures are widely accepted in healthcare as a basis for high-stakes decisions is a foolish stretch. The citations in the Brookings report to sources that report healthcare indicators (like: http://www.hospitalcompare.hhs.gov/) are more analogous to the school report cards that already exist on state department of education web sites (but with additional survey responses added) – and not analogous to the publication of individual teacher value-added ratings as done earlier this year by the LA Times. Comparable information – and comparably useless information – is already widely available on public schools through both government sources and a plethora of private vendors. For that matter, consumer ratings of individual teachers are also available through sources like www.ratemyteacher.com.

To wrap this up…

Do I think value-added measures are useless? No! Do I think they should be used to evaluate and compare individual teachers and should be used as a significant factor in employment decisions? No! But, there is likely still much that can be learned from studying different approaches to value-added modeling, developing more useful assessment instruments, using these instruments and models to better understand what works in schools and what doesn’t. Clearly we need to learn how to use data more thoughtfully to inform school improvement efforts. Quite honestly, I believe many already are.

Follow up Response to Chad Aldeman’s critique of my argument:

You (Chad) seem to have missed the point, which I perhaps did not make clear enough. Indeed, if we were able to measure precisely what we wanted to measure and if we measured it, and it predicted itself, we’d be in great shape. The problem is that we can’t measure precisely what we think we want to measure.

Given the same assessment instrument and value added model parameters, we get a .2 to .3 correlation from year to year… (possibly partly a function of non-random assignment).

Given different assessments in the same year, Sean Corcoran found:

…among those who ranked in the top category (5) on the TAKS reading test, more than 17 percent ranked among the lowest two categories on the Stanford test. Similarly, more than 15 percent of the lowest value-added teachers on the TAKS were in the highest two categories on the Stanford.

* Corcoran, Sean P., Jennifer L. Jennings, and Andrew A. Beveridge. 2010. “Teacher Effectiveness on High- and Low-Stakes Tests.” Paper presented at the Institute for Research on Poverty summer workshop, Madison, WI.

So… what this calls into question is whether we really are measuring what we want to measure. Are we able to precisely determine that a math teacher is good at teaching math – regardless of the students they have, or the year they have them, or the test we use to measure it? Or even regardless of the scaling of the test.

When we measure points per game, we know what we are measuring. I’m curious now about the correlation between points per game and win/loss, or even a championship season… or annual revenue, since that’s the end goal… but I digress.

Searching for Superguy in Jersey…

A short while back I did a post called Searching for Superguy in Gotham. In that post, I tackled the assumption that Superguy was easily identifiable as a hero leader of charter schools – or at least that was one distorted portrayal of Superguy in Waiting for Superman. Now, I should point out here that I really don’t know of anyone actually out there running charter schools who wishes to portray him/herself in this way. So, to be absolutely clear, this post is in no way an attack on those who are out there just trying to do the best job they can for kids in need.

This post IS a criticism of the punditry around charter schools – the notion that charter schools are easy to pick out from the crowd of urban (or other) schools – because they are necessarily, plainly and obviously better. That classic argument that the upper half is better than average!

This was the basis of my Searching for Superguy in Gotham activity. In that activity, I estimated a relatively simple statistical model to determine which schools performed better than expected, given their students and location and which schools performed less well than expected, given their students and location. I had been planning all along to do something similar with New Jersey Charter Schools. Now is that time!!!!!

As I did with New York City charter schools, I have estimated a statistical model of the proficiency rates of each charter school and each other school in the same New Jersey city. In the model, I correct for a) free lunch rates, b) homelessness rates, c) student racial composition (Hispanic and black). AND, I compare each test – grade level and subject – to the same test across all schools. AND, I compare each school to other schools in the same city (by using a “city” dummy variable). I obtained all necessary variables from a) NJ school report cards (outcome measures) and b) NJ enrollment data file (free lunch, race, homelessness) and c) NCES Common Core of data for “city” location of school.
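For readers who want to see the mechanics of a "beating the odds" comparison, here is a stripped-down sketch. It is not the model used for the figures in this post. The eight schools and all of their numbers are made up, the race and test (grade and subject) controls are left out to keep it short, and the least squares fit is done by hand so that nothing beyond plain Python is needed. The idea is the same, though: predict each school's proficiency rate from its students and its city, then look at whether the actual rate falls above or below the prediction.

```python
def fit_ols(rows, outcomes):
    """Least squares coefficients via the normal equations (Gauss-Jordan).

    Assumes the predictors are not perfectly collinear.
    """
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, outcomes))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(aug[idx][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for idx in range(k):
            if idx != col:
                factor = aug[idx][col] / aug[col][col]
                aug[idx] = [a - factor * b for a, b in zip(aug[idx], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Hypothetical schools: (name, city, % free lunch, % homeless, % proficient).
schools = [
    ("School A", "Newark", 85, 3, 52),
    ("School B", "Newark", 70, 1, 61),
    ("School C", "Newark", 45, 0, 88),
    ("School D", "Newark", 90, 4, 47),
    ("School E", "Jersey City", 60, 1, 70),
    ("School F", "Jersey City", 75, 2, 58),
    ("School G", "Jersey City", 40, 0, 80),
    ("School H", "Jersey City", 82, 3, 55),
]

# Design matrix: intercept, free lunch, homelessness, "city" dummy.
design = [
    [1.0, lunch, homeless, 1.0 if city == "Newark" else 0.0]
    for _, city, lunch, homeless, _ in schools
]
actual = [prof for *_, prof in schools]
coefs = fit_ols(design, actual)

# Residual = actual minus expected; positive means "beat the odds".
residuals = {
    school[0]: y - sum(c * x for c, x in zip(coefs, row))
    for school, row, y in zip(schools, design, actual)
}
for name, resid in sorted(residuals.items(), key=lambda item: -item[1]):
    print(f"{name}: {resid:+.1f} points vs. expected")
```

Because the model includes an intercept, the residuals always sum to zero. Some schools will land above the line and some below it no matter what. That is the "upper half is better than average" point in numerical form.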

So now, the search for Jersey Superguy begins! Let’s start with 4th Grade Math performance in 2009. This scatterplot includes all schools with ASK4 Math scores in cities where charters existed in 2009. Schools above the red horizontal line are schools that “beat the odds.” That is, they are schools that had proficiency rates that were above the expected proficiency rates for that school, given its students, the test, and the location (city). Schools below the red line are schools that did not meet expectations. So, is superman (mythical super charter school leader) hiding in one of those dots way at the top of the scatter? Is he in a high-flying, high poverty school? Is he in a high-flying low poverty school? Certainly, he could not be down in the lower half of the graph.

CLICK HERE TO SEE WHICH SCHOOLS ARE CHARTERS AND WHICH ARE DISTRICT SCHOOLS

CLICK HERE FOR A CLOSE UP ON NEWARK SCHOOLS OVER AND UNDER THE LINE

NOTE: I’m in the process of fixing a data error that occurs on a few charter schools (affecting merging of data). These figures still include the merge error, but the overall distributions are not affected. Schools affected include: Environment Community School, Liberty Academy, Hope Academy, International CS of Trenton, Jersey City Community CS and Jersey City Golden Door. UPDATE: I HAVE NOW EXCLUDED MISMATCHED SCHOOLS.

The source of the error is the NJDOE enrollment file, which, for example, identifies Environment Community School as both 80_6232_920 (county, district, school) and as 80_6235_900. The first of these codes is correct. The second is for Liberty Academy CS (according to the School Report Card and according to NCES data).

Now, let’s take a look at the 8th Grade Math outcomes. Here’s the statewide scatterplot:

Surely superguy must be hangin’ out in one of those high flyin’ dots way at the top of the scatter?

CLICK HERE TO SEE WHICH SCHOOLS ARE CHARTERS AND WHICH ARE DISTRICT SCHOOLS

CLICK HERE FOR A CLOSE UP ON NEWARK SCHOOLS OVER AND UNDER THE LINE

As you can see, there are plenty of charters and traditional public schools above the line, and below the line. The point here is by no means to bash charters. Rather, this is about being realistic about charters and more importantly realistic about the difficulty of truly overcoming the odds. It’s not easy and any respectable charter school leader or teacher and any respectable traditional public school leader or teacher will likely confirm that. It’s not about superguy. It’s about hard work and sustained support – be it for charters or for traditional public schools.

As I noted in my previous searching for superguy post:

Yeah… I’d like to be a believer. I don’t mean to be that much of a curmudgeon. I’d like to sit and wait for Superguy – perhaps watch a movie while waiting (gee… what to watch?). But I think it would be a really long wait and we might be better off spending this time, effort and our resources investing in the improvement of the quality of the system as a whole. Yeah, we can still give Superguy a chance to show himself (or herself), but let’s not hold our breath, and let’s do our part on behalf of the masses (not just the few) in the meantime.

TECHNICAL APPENDIX

Here is a link to the model used for generating the over/under performing graphs above

And here is a separate model in which I test whether Charter schools on average outperform traditional public schools in the same city. This model shows that they don’t, or at least that their 1 to 3 percentage point edge on proficiency is not statistically significant. But whether charters on average outperform – or don’t – traditional public schools is not the point. The point is that like traditional public schools – they vary – and it’s important for us to get a handle on how and why all schools vary in their successes and failures – charter or not.

Complete slide set here: New Charter Figures Nov 12

BONUS MAPS

Here are some updated maps of the demographics and adjusted performance measures of charter and district schools in Newark.

First, % Free Lunch 2009-10:

Next, a new one, % LEP/ELL – note that the % LEP/ELL for NWK charters is so low, and their dots therefore so small, that the star indicating “charter” covers them entirely:

Finally, here are the Beating the Odds figures converted into color coded circles – with large purple circles being high performers – better than expectations – medium size pale dots being relatively average performers – and large yellow dots performing below expectations:

Jersey City % LEP/ELL

Jersey City % Free Lunch

Jersey City Performance Index

Getting all “bubbly” over that spending “bubble?”

Let me just say that I hate this graph!

FIGURE 1 – NATIONAL TRENDS IN PUPIL TO TEACHER RATIOS

Why? Well, this is my own version of the graph… but it is a graph that has been used many times over, of late, to make the argument that American public schools have simply been drowning in an excess of public funding for decades and that public school districts nationally have leveraged all of that additional money over time to flood classrooms with additional staff. Of course, this framing is being used to set up the argument that it’s time for a logical correction – a return to sanity – a return to reasonable class sizes, etc. etc. etc.

First of all, even this national graph – which isn’t particularly meaningful – shows a reduction of slightly over 2 students per teacher over the past 25 years. Chop off the period prior to 1985 and the most dramatic shifts in pupil to teacher ratio are wiped away.

Second, national averages just aren’t that meaningful. Claims like this one, from Mike Petrilli of Fordham Institute, misunderstand entirely the variation in resources across states and variation in effort across states:

The tough-love message to superintendents and school boards nationwide should be clear: The day of reckoning has arrived; let the de-leveraging begin. The spending bubble is over. No more adding staff at a pace that outstrips student enrollment; no more sweetheart deals on pensions or health insurance; no more whining about “large” class sizes of twenty-five. It’s time to live within our means.

http://www.edexcellence.net/flypaper/index.php/2010/11/welcome-to-a-new-era-of-restraint/

The new “reform” story line is that all states have been spending out of control, that they have taxed their citizenry to death, and that they have spent most of that money hiring more and more teachers even while student populations have stagnated. Further, we have done all of this out of control spending and class size reduction while seeing absolutely no return for our dollar! Pretty simple! Very compelling, eh? NO.

When one takes a look at individual states – in terms of relative tax burden, changes in tax burden over time – in terms of changes in pupil to teacher ratios over time – and in terms of student outcomes in relation to pupil to teacher ratios and tax burdens – it’s pretty damn hard to find states which actually fit that story line. And public K-12 education is largely a state and local endeavor, not a federal one.

So, let’s take a state by state look. Let’s start here – with the total revenues per pupil – state aggregate – including state, local and federal sources. I’ve adjusted these for inflation (Employment Cost Index – Gov’t Workers), but not for regional variation. For more information on comparing across states, see THIS POST. (<–See this link if you want to know how states rank compared to one another on state and local revenues).

FIGURE 2 – TOTAL REVENUES PER PUPIL – INFLATION ADJUSTED, NOT REGIONALLY ADJUSTED (2005 Dollars)

What this shows us is that there’s quite a bit of variation in revenues for local public school districts across states, with some states investing a lot – like New Jersey, Vermont and Wyoming – and some not so much – like Utah, Arizona and California. This graph also tells us that many states really didn’t see consistent spending/district revenue growth throughout the period. No massive… long term… uniform… spending bubble… from which they all benefited.

The patterns of revenue increases mirror patterns of pupil to teacher ratio change over time. Yes, some states like Vermont, Wyoming and New Jersey did see pupil to teacher ratios decline. But, not so for Utah, California (after 1998) or Arizona. In fact, in Arizona pupil to teacher ratios increased over time!

FIGURE 3 – PUPIL TO TEACHER RATIOS ACROSS STATES AND OVER TIME

And, it is similarly foolish to assert that all states put themselves over the brink in terms of taxes to support all of this supposedly lavish spending. Here are the state and local direct expenditures as a percent of personal income across states:

FIGURE 4 – STATE AND LOCAL DIRECT EXPENDITURES (All, not just Educ.) AS A PERCENT OF PERSONAL INCOME

Look at New Jersey, which has provided relatively high levels of funding, with increases in the early 1990s and again from 1998 to 2003 (though lagging in the middle). New Jersey is not among the highest at all on state and local direct expenditures as a percent of personal income – because incomes in New Jersey are quite high. Arizona is also relatively low and Utah and California only average – not high. The “high” state and local direct spending burden states in this mix are Wyoming and Vermont.

So then, what about that story line we’re being told applies across our nation’s schools –

that they’ve been swimming in money for decades – a huge ongoing spending bubble – and
that they’ve spent it all on pupil to teacher ratios – increasing teacher quantity not quality – and
that they’ve taxed their state residents to death in the process – and
got nothing for it in improved student outcomes.

Well, the only state that seems to come close here is Vermont (and perhaps Wyoming) – which does have a high and growing public expenditure burden (and the highest “effort” index at www.schoolfundingfairness.org), increased education spending per pupil over time and a continued decline in pupil to teacher ratios.

Of course, that last part of the story line about outcome failure doesn’t fit so well, because Vermont does very well on outcomes (See this article for discussion of empirical research on results of Vermont school finance reforms: https://schoolfinance101.com/wp-content/uploads/2010/01/doreformsmatter-baker-welner.pdf)

It also turns out that even Vermont’s “bubble” story isn’t so simple. Vermont really hasn’t been adding significant numbers of additional staff and spending a lot more in recent years. Rather, Vermont is experiencing a significant decline in student population – but has not adjusted its public education system accordingly by reorganizing into more efficiently organized schools and districts. You see, if you spend the same amount and retain similar numbers of teachers, when enrollments decline, pupil to teacher ratios decline and per pupil spending goes up. Yeah… same effect as a spending bubble – but very different cause. And yes, this is something that should be addressed. But again – somewhat different issue! Except for an infusion of funding in the late 1990s in response to school funding equity litigation, Vermont has not necessarily been on a wild education spending binge to reduce class sizes. Rather, the state is a victim of declining school-aged population, resulting in declining enrollment and declining pupil to teacher ratios.
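To see how that works, here’s a quick back-of-the-envelope sketch in Python. The enrollment, staffing and spending numbers are invented round figures, not Vermont data, and the function name is mine. The only point is that holding teachers and dollars constant while enrollment falls mechanically lowers the pupil to teacher ratio and raises per pupil spending.

```python
def ratios(enrollment, teachers, total_spending):
    """Return (pupils per teacher, spending per pupil)."""
    return enrollment / teachers, total_spending / enrollment


# Hypothetical figures chosen for round arithmetic (NOT actual Vermont data).
TEACHERS = 8_000
SPENDING = 1_200_000_000  # total dollars, held constant

before = ratios(100_000, TEACHERS, SPENDING)  # before the enrollment decline
after = ratios(90_000, TEACHERS, SPENDING)    # after a 10% enrollment decline

print("before:", before)
print("after: ", after)
```

With these made-up numbers, a 10% enrollment drop moves the ratio from 12.5 to 11.25 pupils per teacher and per pupil spending from $12,000 to about $13,333, without a single new hire or new dollar.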

Here’s Vermont enrollment over this same period.

FIGURE 5 – VERMONT ENROLLMENTS

This map shows the locations and enrollment of Vermont schools. Small red dots are schools with 50 or fewer students and small orange/brown dots have 50 to 100. In some areas of the state, you can find small red dots within a few miles of each other (many of which are small town schools – in separate towns – actually serving the same grade levels/ranges).

FIGURE 6 – LOCATIONS AND ENROLLMENT SIZES OF VERMONT SCHOOLS

Okay, so if this story line doesn’t even fit for Vermont, then who? Perhaps no-one! Arguments built on assumptions of “national trends” regarding financing, class size and/or most features of our public education systems – our 50 and then some systems – are generally unhelpful (if not entirely misguided).

Arguments that encourage state legislators across the nation – including those in states like Utah, Arizona and California – to believe that now is the time to cut, cut, cut, because we just went through decades of spend, spend, spend, are downright ignorant and irresponsible.

And please check out this post: https://schoolfinance101.wordpress.com/2010/10/27/when-schools-have-money/ where I explain that within states – especially inequitable states like NY or IL – if and when anyone did benefit from reduced pupil to teacher ratios and more importantly smaller class sizes, it was often those in the most affluent communities.

A few updated NJ charter figures

New, updated slides in PPT format (for clarity on labels): CHARTER SCHOOLS_NOV2010

I expect people will be asking why some of my figures previously posted don’t match up exactly with figures presented by others on New Jersey Charter Schools – including those produced by ACNJ in a new report. In short, the answer is that at least with regard to “poverty” measurement and comparisons across charters and Newark Public Schools, they are different measures. In my previous slides I show a bar graph of Free Lunch rates and later show scatterplots of performance by Free or Reduced Price Lunch rates. ACNJ and many others use only Free or Reduced lunch rates, never exploring the distinction between the two. Seems like a subtle difference for the lay reader and one that might not sink in right away. But, it can actually be an important distinction in this type of comparison.

Here’s a link to the differences in eligibility guidelines: http://www.fns.usda.gov/cnd/governance/notices/iegs/iegs.htm

For children to qualify for Free Lunch, their family income must be at or below 130% of the Poverty Income Level (that is, up to 30% above the poverty line).

The income threshold for Reduced Price Lunch is the 185% income level with respect to the poverty income level.

The fact is that most school aged children in Newark fall under the 185% income level with respect to the poverty income level. As such, most schools in Newark have over 80% children in this category. Therefore, it is hard to use this relatively “generous” income threshold in order to distinguish differences in populations across Newark Schools- NPS or charter. The lower income threshold serves as a better way to distinguish the differences.
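Here’s a small Python sketch of why the two measures can tell such different stories. Everything in it is illustrative: the single poverty line (the real guidelines vary by household size), the two lists of family incomes and the function names are all made up. Only the 130% and 185% cutoffs come from the guidelines described above.

```python
FREE_PCT = 130     # free lunch: income at or below 130% of the poverty level
REDUCED_PCT = 185  # reduced price lunch: income at or below 185%


def lunch_category(income, poverty_line):
    """Classify a family as 'free', 'reduced' or 'paid'."""
    # Integer arithmetic avoids floating point surprises at the cutoffs.
    if income * 100 <= FREE_PCT * poverty_line:
        return "free"
    if income * 100 <= REDUCED_PCT * poverty_line:
        return "reduced"
    return "paid"


def rates(incomes, poverty_line):
    """Return (free share, free-or-reduced share) for a list of incomes."""
    cats = [lunch_category(i, poverty_line) for i in incomes]
    free = cats.count("free") / len(cats)
    free_or_reduced = (cats.count("free") + cats.count("reduced")) / len(cats)
    return free, free_or_reduced


POVERTY_LINE = 22_000  # hypothetical single poverty line, illustration only
school_a = [15_000, 20_000, 25_000, 27_000, 35_000]  # hypothetical families
school_b = [20_000, 30_000, 33_000, 38_000, 39_000]  # hypothetical families

print("School A:", rates(school_a, POVERTY_LINE))
print("School B:", rates(school_b, POVERTY_LINE))
```

Both hypothetical schools come out at 100% free or reduced lunch, but School A is 80% free lunch and School B only 20%. On the higher threshold they look identical. On the lower one they are clearly serving different populations.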

Here is the % Free Lunch using NJDOE 2009-10 data: http://www.state.nj.us/education/data/enr/enr10/

These data are highly consistent (except for Lady Liberty) with my 2008-09 data from the National Center for Education Statistics Common Core of Data. Most Newark Charter Schools, especially the frequently touted high performers, have very low relative rates of children below the 130% poverty threshold.

Here is the % Free or Reduced Lunch using NJDOE 2009-10 data:

Here, the charter schools scatter themselves more widely among the NPS schools. They appear more comparable and their average is only marginally different by some accounts. BUT, the reality is that most kids in Newark fall under this threshold and nearly every school in the above figure exceeds 70% free or reduced lunch and the vast majority exceed 80%. This higher income threshold limits our ability to distinguish real differences in student populations across Newark schools.

Another angle would be to say that the difference in the position of charter schools in the second graph versus the first is an indication that CHARTERS ARE SERVING THE LESS POOR AMONG THE POOR. Not all, but many are doing this. Most surprising perhaps is that Robert Treat in particular remains a standout even with regard to the less poor.

Additional Figures:

Here are the special education classification rate data for 2004 through 2007:

NJDOE has not posted the more recent classification rate data in the same format. Enrollment files used in the first part of this post have disaggregated classification data, but report mostly “0” values for charters because counts were too small to report. NJDOE does report placement data, but again, these data are spotty at best for NJ Charter schools.

Here are the frequency distributions by school, for Newark Schools, by Free Lunch and by Free or Reduced Lunch. As you can see, the distribution for Free or Reduced Lunch is all crunched in the range above 80% making it more difficult to distinguish true poverty differences among schools serving Newark children.

GUIDELINES FOR USING/COMPARING NJ CHARTER DATA

When comparing across schools within a poor urban setting, compare on the basis of free lunch, not free or reduced, so as to pick up variation across schools. The reduced lunch income threshold is too high to pick up variation.
When comparing free lunch rates across schools, either a) compare against individual schools and nearest schools, or b) compare against district averages by GRADE LEVEL. Subsidized lunch rates decline in higher grade levels (for many reasons, to be discussed later). Most charter schools serve elementary and/or middle grades. As such they should be compared to traditional public schools of the same grade level. High school students bring district averages down.
When comparing test score outcomes using NJ report card data, be sure to compare General Test Takers, not Total Test Takers. Total Test Takers include scores/pass rates for children with disabilities. But, as we have seen time and time again in the charts above, Charters tend not to serve these students. Therefore, it is best to exclude scores of these students from both the Charter Schools and Traditional Public Schools.

ACNJ’s Newark Kids Count 2010 report appears to fail on all 3 guidelines above.

ACNJ’s Newark Kids Count: http://acnj.org/admin.asp?uri=2081&action=15&di=1841&ext=pdf&view=yes

ADDITIONAL STUFF

One question raised by the ACNJ Kids Count yearbook is why the NPS schools hold ground with Newark Charters through 4th grade, but appear to lose ground in 8th grade. The charter advocate explanation is that charters are simply doing better, cumulatively, with students through 8th grade and preparing them for college. However, there are two other equally if not more likely explanations.

First, the mix of schools that are charter schools serving 8th grade students is different from the mix serving 4th grade students. Heavy “cream-skimmers” like North Star Academy start at 5th grade. And some lower performing charters, actually serving more representative populations, end at 4th grade. The different mix of charters having students taking the 8th grade test versus those taking the 4th grade test may explain a substantial portion of the difference. It’s also important to understand that at this break – where low performers end and where high performers start up – many low performing students may be pushed back into NPS schools while higher performing ones are creamed off. Here are the charter school proficiency rates (general test takers only) from 2009 state report cards, alongside NPS proficiency rates (averaged across tests).

Second, charter schools have the ability to use cohort attrition to their advantage, over time, shedding the students who perform less well on assessments, perhaps due to the extent of parental obligation involved in keeping students in the school or even due to the message that the child “just can’t cut it here.” NJDOE data don’t allow for precise student level tracking to see whether individual students stay on in particular charter schools or which students do. But, one can do a relatively simple back of the napkin approach using the grade level enrollment files to determine whether or not cohort attrition may be an issue. Note from the performance graph above, North Star in particular shows incrementally higher proficiency rates at each higher grade level. While this is not a cohort comparison, it is possible that this pattern arises due to attrition of weaker students in higher grades.

Here’s a quick look at 3 cohorts of 5th graders across these schools:

This tabulation shows significant cohort attrition for North Star in particular.
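For anyone who wants to try the same back of the napkin approach, here’s a minimal Python sketch. The enrollment counts are invented for illustration (they are not North Star’s numbers), and the function and the dictionary layout are my own assumptions about how one might organize grade level enrollment files. It simply follows a 5th grade class to 6th grade the next year, 7th the year after, and so on.

```python
def cohort_retention(enrollment, start_year, start_grade, years):
    """Follow one cohort up the grades and return its size in each year
    as a share of its starting size."""
    base = enrollment[(start_year, start_grade)]
    return [
        enrollment[(start_year + k, start_grade + k)] / base
        for k in range(years)
    ]


# Hypothetical counts keyed by (school year, grade). NOT actual school data.
enrollment = {
    (2006, 5): 100,
    (2007, 6): 85,
    (2008, 7): 70,
    (2009, 8): 60,
}

shares = cohort_retention(enrollment, 2006, 5, 4)
print([round(s, 2) for s in shares])
```

Keep in mind that this only captures net change in cohort size. Because it isn’t student level tracking, it can’t tell you which students left, or whether any who left were replaced by new entrants.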

Now, there is nothing particularly conclusive about the above slides, but they do raise questions as to whether the difference in 8th grade scores between NPS and Newark Charters is at least partly if not substantially a function of a) the different mix of schools serving 8th grade and b) the significant cohort attrition of at least one of the larger schools. Note that these attrition patterns, if they shed lower performing students, have the effect of both raising the charter 8th grade average and depressing the NPS 8th grade average.

New Jersey Charter Schools Association gets angry over… data?

For some time now, I’ve been pulling together data from the National Center for Education Statistics and from the New Jersey Department of Education on New Jersey Charter Schools. Why do I do it? Mainly out of frustration that no-one else seems to be playing a monitoring role. I’ve not seen any good compilations or presentations of the various types of data that exist on New Jersey Charter Schools. That said, the data aren’t great. They aren’t worthy of high level academic research. But they are what we’ve got, and they are from the primary government sources charged with collecting these data. So, here are a series of my slides compiled from the data:

Link to PDF slides: CHARTER SCHOOLS_OCT2010

CHARTER SCHOOLS_NOV2010 (Includes updated slides)

Updated Figures: https://schoolfinance101.wordpress.com/2010/11/10/a-few-updated-nj-charter-figures/

CHARTER SCHOOL DEMOGRAPHICS

Data: LINK TO UPDATED SPREADSHEET OF FREE LUNCH AND SPECIAL ED DATA

On second look, it appears that this first graph matches the 2008-09 data from the spreadsheet linked above (not the 2007-08 as originally labeled).

CHARTER SCHOOL PERFORMANCE WITH RESPECT TO DEMOGRAPHICS (NEWARK)

CHARTER SCHOOLS IN SPATIAL CONTEXT (CLICK FOR READABILITY)

Previous posts and additional figures on NJ charters can be found throughout my blog at:

1. Math Trends over Time by District Factor Group: https://schoolfinance101.wordpress.com/2009/12/14/nj-charter-update-math-trends-over-time/

2. Playing with Charter Numbers: https://schoolfinance101.wordpress.com/2009/11/13/playing-with-charter-numbers-in-nj/

3. Replicating Robert Treat Academy: https://schoolfinance101.wordpress.com/2009/11/05/replicating-robert-treat-academy/

My general conclusions from these previous posts and the above graphs?

New Jersey Charter Schools generally serve smaller shares of children qualifying for free lunch than schools in their host district and schools in their immediate surroundings.
New Jersey Charter Schools serve very few children with disabilities.
New Jersey Charter School performance, like charter school performance elsewhere is a mixed bag. Some of the highest performers are simply not comparable to traditional public schools in their districts because they serve such different student populations (far fewer low income children and few or no special education students). So, even if we found that these schools produced greater gains for their students than similar students would have achieved in the traditional public schools, we could not sort out whether that effect came from school quality differences or from peer group differences (which doesn’t matter from the parent perspective, but does from the policy perspective).

A colleague of mine shared these data with an interested reporter. I spoke with the reporter. And the reporter requested a response from a representative of the New Jersey Charter Schools Association.

The New Jersey Charter Schools Association responded:

The New Jersey Charter Schools Association seriously questions the credibility of this biased data. Rutgers University Professor Bruce Baker is closely aligned with teachers unions, which have been vocal opponents of charter schools and have a vested financial interest in their ultimate failure.

Baker is a member of the Think Tank Review Panel, which is bankrolled by the Great Lakes Center for Education Research and Practice. Great Lakes Center members include the National Education Association and the State Education Affiliate Associations in Illinois, Indiana, Michigan, Minnesota, Ohio and Wisconsin. Its chairman is: Lu Battaglieri, the executive director of the Michigan Education Association.

There are now thousands of children on waiting lists for charters schools in New Jersey. This demand shows parents want the option of sending their children to these innovative schools and are satisfied with the results.

Wow. That’s quite interesting. These data can’t be credible simply because I sit on the Think Tank Review Panel and I am – ACCORDING TO THEM (news to me) – closely aligned with teachers’ unions. According to this statement, these data are necessarily “biased,” even though the statement provides no evidence whatsoever to that effect. Heck, I’ve merely graphed and mapped NCES and NJDOE data. Did my mapping software introduce some devious union bias? Damn that ArcView!

By the way, I don’t get any kind of ongoing pay for doing this Think Tank Review stuff. I do get contracted to write a policy brief or critique on occasion, and it’s a relatively small sum of money for each brief or critique. I consult for a lot of groups around the country and a long list can be found on my vitae, here: B.Baker.Vitae.October5_2010

I don’t take any money for this blog or reprints/re-posts of it, and quite honestly, when I do take contract money to write a policy brief or report – whoever it’s for – I go to extra lengths to make sure that the data and analysis are defensible, typically opting for the most conservative representation of the data, knowing full well that the instinct of any opposing critic will be to pounce.

Hey… these data are what they are. I’m just making graphs of them. This official statement of the New Jersey Charter Schools Association is a childish personal attack from an organization that apparently has little else to stand on.

SOURCE LINKS:

For free lunch data and enrollments: http://nces.ed.gov/ccd/

Use the “build a table” function (under CCD Data tools)

For special education count data:

General NJDOE site: http://www.state.nj.us/education/specialed/data/

For 2007 classification rates: http://www.state.nj.us/education/specialed/data/2007.htm

First link: http://www.state.nj.us/education/specialed/data/ADR/2007/classification/distclassification.xls

Note that same link is dead for 2008 and 2009: http://www.state.nj.us/education/specialed/data/2008.htm

For test score data: http://education.state.nj.us/rc/rc09/database.htm

No more “sweetheart deals” for poor schools…

Fordham Institute’s Mike Petrilli is showing his “tough love” for public schools in these tough economic times:

The tough-love message to superintendents and school boards nationwide should be clear: The day of reckoning has arrived; let the de-leveraging begin. The spending bubble is over. No more adding staff at a pace that outstrips student enrollment; no more sweetheart deals on pensions or health insurance; no more whining about “large” class sizes of twenty-five. It’s time to live within our means.

http://www.edexcellence.net/flypaper/index.php/2010/11/welcome-to-a-new-era-of-restraint/

Of course, a major problem with this assertion for anyone who has read… well… much of anything about school funding in the past decade or so, is the underlying assumption that schools nationwide, uniformly and especially poor urban districts have simply been punch drunk on excess funding resources for the past few decades… with dramatic increases for all especially from the 1990s forward.

Now Mike Petrilli is actually better than many on this point, on most occasions. But the problem with Petrilli’s statement above is that it fails to recognize the incredible variation that persists across states and school districts, and perpetuates the myth of school districts nationally and uniformly being punch drunk on the public dollar – the “bubble” is over. The “bubble” that everyone enjoyed!

Here are two recent sources which show the extent of persistent disparities across states and across districts within states by district poverty rates:

Is School Funding Fair? www.schoolfundingfairness.org
Baker, B.D., Welner, K.G. (2010) Premature celebrations: The persistence of inter-district funding disparities. Education Policy Analysis Archives. http://epaa.asu.edu/ojs/article/viewFile/718/831

In our report on Fair School Funding, we show just how large the funding disparities are across states and also show that in many states, higher poverty districts still receive systematically fewer resources per pupil – and that’s in dollars adjusted for regional wage variation, economies of scale and population density – no fancy weights to reduce spending for poverty differences themselves. Yes, in some, many states, higher poverty districts have far fewer resources than lower poverty ones! Shocking, I know.

Further, the first of these sources explains that much of the deprivation of resources that has occurred in certain states is a function of complete and utter lack of state financial effort – not lack of capacity.

Additionally, in the second article above, Kevin Welner and I show by a similar method that overall, nationally, there remains a positive relationship between school district state and local revenues and resident income levels (across districts within states). Here’s a figure from our report, based on a regression model characterizing the relationship between median household income and state and local revenues per pupil over time. Even by 2005, that relationship remained positive – and still does through 2008. Some progress was made through 1996, and then leveled off.

Figure 1

Relationship between Median Household Income and State & Local Revenues per Pupil (District Level) Nationally
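For readers unfamiliar with what a “positive relationship” means here, this is a stripped-down Python illustration. It is not the model from the article, just a plain two-variable least squares slope, and the five districts are made-up numbers. A positive slope means districts with higher income residents have more state and local revenue per pupil.

```python
def ols_slope(x, y):
    """Ordinary least squares slope of y on x (single predictor)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical districts: median household income (in $1,000s) and
# state and local revenue per pupil (in dollars). Illustration only.
income = [30, 40, 50, 60, 70]
revenue = [9_000, 9_500, 10_000, 10_500, 11_000]

slope = ols_slope(income, revenue)
print(slope)  # dollars of revenue per pupil per $1,000 of median income
```

In this toy example each additional $1,000 of median household income goes along with $50 more in revenue per pupil. A slope of zero would mean no relationship, and a negative slope would mean poorer districts get more.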

More interesting, however, is the variation in support for schools generally, across states, and for higher poverty schools within states.

Yes, state budgets oscillate over time. There are good times and there are bad times. But some… if not many states… have chosen consciously to use the argument of “tough economic times” to throw their public education systems under the bus. Heck, many of these states used the good economic times to argue for throwing their education systems under the bus. Wouldn’t want to slow the economy by “overtaxing” and spending too much on schools.

Here are some snapshots of state spending, from a recent post.

Figure 2

Figure 3

For more information on these graphs, see my earlier post on State Ranking Madness. Suffice it to say that not all states have put a lot of funding into their schools and these figures come from 2007-08, the front end of the downturn – or back side of the supposed bubble. Some states like Utah and Tennessee allocate very little to their public education systems. Perhaps those are poor states that simply can’t afford to do more? Perhaps they were “taxed to death” in the good times – during that bubble – and simply can’t sustain that in the bad times! That’s simply not the case!

Figure 4 shows the relationship between state and local revenue levels and the percent of gross state product spent on schools (from an earlier post). Tennessee, which is near the bottom on state and local revenue is also near bottom on “effort,” along with Louisiana, Colorado and North Carolina. These states are using far less of their capacity to support public education. Their lagging resources are a function of political choices as much if not more than a function of economic conditions.

Figure 4
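The “effort” idea is simple enough to spell out in a few lines of Python. The two states and their dollar figures below are invented, and the function name is mine. Effort here is just school revenue divided by gross state product, so two states with identical capacity can end up with very different effort.

```python
def effort(school_revenue, gross_state_product):
    """Share of gross state product spent on schools."""
    return school_revenue / gross_state_product


# Hypothetical states with the same economic capacity. NOT real data.
states = {
    "State A": (4_000_000_000, 100_000_000_000),
    "State B": (2_500_000_000, 100_000_000_000),
}

efforts = {name: effort(rev, gsp) for name, (rev, gsp) in states.items()}
for name, value in efforts.items():
    print(f"{name}: {value:.1%} of gross state product")
```

Same capacity, different choices: the hypothetical State A puts 4.0% of its economy into schools and State B puts in 2.5%.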

As it turns out, these states also somehow missed out on that whole bubble thing. Figure 5 shows the ECWI (Education Comparable Wage Index) adjusted current operating expenditures per pupil from 1997 to 2005 (range for which ECWI is available) for states we classify as “highlands” in our fairness report. Adjusted for competitive wage growth (of non-teachers in same labor markets), per pupil spending in Tennessee stayed a) relatively flat and b) below all others in its region. Where’s their bubble? Where’s all of that punch drunk public spending? Well, perhaps if we could take it back to about 1920 or so. Then we’d find the bubble?

Figure 5

Utah is my new favorite standout these days, and is shown in this figure. Yeah… you have to look carefully to find Utah per pupil spending down there blending in with the horizontal axis of the graph. Utah per pupil spending is a) absurdly lower than all others around it and b) really, really, flat over time. Hey, where’s that bubble? Where’s all that punch drunk public expenditure? In fact, the only state with a really steady climb in spending here is Wyoming (perhaps Montana also).

Figure 6

Oh, and back to that issue of how all of these states have taxed themselves to oblivion to support that punch drunk spending binge. This figure shows the cumulative taxes as a percent of income with a handful of states including Utah and Tennessee identified. Note that for most states, these trends change little even if we take the graph back to 1980. While the above “effort” figure focuses only on public K-12 funding as a share of gross state product, this graph focuses on all tax revenues as a share of income. Yeah… Tennessee and Utah really taxed themselves to death to support their massive spending bubble (albeit my time line is slightly different… but heck… there wasn’t a bubble anyway). We really need to bring their taxes and spending back into line. Really!

Figure 7

I’d find it pretty hard to argue that schools – and the adult interests who run schools – in states like Louisiana, Utah, Colorado and Tennessee have been punch drunk for decades and need to get with the program and learn how to live within their means! MAKING BROAD NATIONAL STATEMENTS TO THIS EFFECT IS DOWNRIGHT ABSURD, GIVEN THAT PUBLIC SCHOOL FINANCE REMAINS PRIMARILY A STATE AND LOCAL FUNCTION!

Just as spending levels vary across states, so do funding levels vary across children, schools and districts within states. That is the central concern in Is School Funding Fair? Funding fairness, like overall funding levels, varies widely across states, even within regions. Figure 8 shows that among the Mid-Atlantic states, only New Jersey reversed the relationship between household income and school district revenues. Even then, many affluent suburban New Jersey districts continue to outpace spending of their poorer urban neighbors. New York State remains among the most regressively funded states in the nation, with affluent districts continuing to far outspend New York City schools and schools in other poorer mid-size and small cities around the state.

Figure 8

Trends in Income-Revenue Relationships in Mid-Atlantic States

Figure 9 shows the same trends for North Central states, where Illinois is the standout, not only maintaining a regressive distribution of resources but actually increasing the regressiveness of funding from 1998 to 2005. Over time, affluent suburban Chicago school districts have continually outpaced state and local revenues of poorer urban, minority districts.

Figure 9

Trends in Income-Revenue Relationships in North Central States

Figure 10 compares the within state fairness measure with the measure of overall support for public education. Some states like Florida and North Carolina simply spend little, while having reasonable capacity to spend more, and distribute what they have inequitably. Tennessee spends little – distributing those shares of near nothing relatively equitably – a strange variation on fairness.

Figure 10

Relationship between Fairness and Overall Support for Public Schooling

This argument that it’s time to accept reality, tighten our belts, etc. etc. etc. is essentially an argument that we should simply let class sizes climb back over 25 or 30 per classroom in poor urban districts and that we have no choice in the matter because there just isn’t enough money to do otherwise. There’s just no more money… in New Jersey, in Tennessee, in Louisiana, in Colorado, in Utah… you name it. They’ve all tried their hardest and spent themselves into oblivion while teachers, administrators and public education bureaucrats have been throwing extravagant parties on the public dime! That’s just how it is???????

I’m not buying it. I assure you that within states, affluent suburban districts will be among the last to increase class sizes (as I discuss in When Schools Have Money), unless state imposed spending limits force them to. And I assure you that some states that already put up the least effort to support their public schools will proclaim most loudly the need to do even less – raising class sizes from 35 toward 40 (or cutting corners elsewhere). The level of financial support provided to public education systems may, on average, be tied to general economic conditions. But the extent to which that support varies so widely across states and the extent to which states allocate those resources fairly across wealthier and poorer communities, is significantly if not predominantly in the control of states.

I close this with a favorite video clip – A Utah/Tennessee (and others) perspective on education funding!

Pondering the “Master’s Bump”

Last month, I wrote a post titled “The Research Question that Wasn’t Asked,” in which I questioned whether many of us have overstated the connection between research on teachers with advanced degrees and compensation policies that link compensation to advanced degrees.

I pointed out in my post that:

Studies of the association between different levels of experience and the association between having a master’s degree or not and student achievement gains have never attempted to ask about the potential labor market consequences of stopping providing additional compensation for teachers choosing to further their education – even if only for personal interest – or stopping providing any guarantee that a teacher’s compensation will grow at a predictable rate over time throughout the teacher’s career.

Many, like Rotherham and, even more so, NCTQ, present this as a “research given”: that clearly, it’s just dumb to pay teachers more who possess attributes we know are not associated with student achievement differences (across teachers). Is it possible, however, that changing these conditions could have significant labor market consequences? Perhaps good… but equally likely… unintended negative consequences.

Yes, teachers with any old master’s degree or teachers with more than 10 years behind them might not, on average, be “measurably more productive.” But does the option to pay and recruit more experienced teachers or teachers with master’s degrees enhance the likelihood that a district can attract teachers who are actually better teachers? I’m not so sure that the answer to this question unasked is so obvious that we need not ask it. So let’s stop pretending that it is.

In another recent post – When Schools Have Money – I pointed out how all of the cool kids on the reform block have also come to the undisputed conclusion that smaller class sizes are an inefficient waste of resources and that we should instead focus our resources on hiring better rather than more teachers (as if a simple tradeoff) – even increasing class size to free up the money to increase wages (ignoring that larger student load is a working condition that might offset much of the competitive advantage gained from the higher wage).

In that post, I pointed out how one of the defining features of both high spending traditional public school districts and of high spending elite private schools is SMALLER CLASS SIZE. It’s a selling point for those private schools – appealing to parents choosing to pay the high tuition – and also appealing to parents of school aged children choosing a neighborhood to live in. Apparently, small class size is also a preferred strategy of the very high spending Harlem Children’s Zone charter schools (more on this at a later point). In this post, I concluded:

For some reason these private schools and affluent public school districts – more specifically those who support these schools – exhibit a strong preference for small class size even when given wide latitude to choose differently. Perhaps they are on to something?

That got me to thinking – thinking about the potential relationship between these posts, in part based on my research findings using state data systems and studying the distribution of teacher and principal academic backgrounds and credentials. AND, considering the simple logic that … if both highly successful and unsuccessful school districts do these things (pay based on degrees and experience) can these things really be the primary cause of failure in the unsuccessful districts?

As mentioned in my first post above, I suspect that most school districts which offer a significant “bump” for teachers holding a master’s degree are not doing so on some assumption of a direct, simple, causal link between holding a master’s degree or not, and teaching effectiveness (measured exclusively by student achievement gains).

Yes, it would be one thing if the argument for offering a salary bump for a master’s degree was predicated entirely on the assumption of specific average achievement gains resulting from a teacher having a master’s degree. Then one day, we find out that having a master’s degree isn’t associated with the assumed achievement gains. Problem identified. Therefore, we must stop providing the salary increment. Solution found. Money saved. All is well and good. But very little is that simple, especially in complex social systems like teacher labor markets.

I suspect that the “master’s bump” more likely exists in many districts as a competitive recruitment/retention tool. Districts offer this bump (or agree to it, through negotiations) because they prefer the option to recruit teachers who have obtained higher degrees, for any number of reasons. In addition, districts offer this bump so that current teachers are incentivized to further their education (or to recruit teachers interested in pursuing further education). Districts may offer this bump knowing full well that many teachers will use it to get the degree, stick around for a few years at the higher salary, and then use their degree to pursue other opportunities. Even then, the district may have benefited from providing a favorable work environment that encouraged teachers to pursue further education and advance their own careers.

It may also be that the master’s bump saves districts money, in that they would otherwise raise all teachers’ wages to the level currently paid for a teacher with a master’s degree. Having a share of teachers with only a bachelor’s degree paid at a lower wage therefore reduces the total wage bill. That is, perhaps it’s not a master’s bump at all? Perhaps it’s a bachelor’s deduction?
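To make the “bachelor’s deduction” arithmetic concrete, here is a minimal sketch. Every number in it (staff counts, base salary, size of the bump) is hypothetical and chosen only for illustration; none of it comes from New Jersey data or any actual salary schedule.

```python
# Hypothetical illustration of the "bachelor's deduction" idea.
# All figures below are made up for the example, not real district data.

def payroll_with_bump(n_bachelors, n_masters, base_salary, masters_bump):
    """Total payroll when only master's holders are paid base + bump."""
    return n_bachelors * base_salary + n_masters * (base_salary + masters_bump)

def payroll_everyone_at_masters_rate(n_bachelors, n_masters, base_salary, masters_bump):
    """Total payroll if the district instead paid every teacher the master's rate."""
    return (n_bachelors + n_masters) * (base_salary + masters_bump)

# Assumed district: 40 teachers with a bachelor's, 60 with a master's.
n_bachelors, n_masters = 40, 60
base_salary, masters_bump = 50_000, 5_000

with_bump = payroll_with_bump(n_bachelors, n_masters, base_salary, masters_bump)
flat = payroll_everyone_at_masters_rate(n_bachelors, n_masters, base_salary, masters_bump)
deduction = flat - with_bump  # what the district "saves" on bachelor's-only teachers

print(with_bump, flat, deduction)
```

Under these assumed numbers, the district pays less in total than it would if everyone earned the master’s rate, and the gap is simply the bump multiplied by the number of bachelor’s-only teachers. Whether districts would in fact raise everyone to the higher wage absent the bump is exactly the open question.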

The “master’s bump” is likely to be used more by some districts than others. My previous analyses suggest that more affluent school districts in metropolitan areas tend to offer a larger master’s bump. This is certainly the case in the Chicago metro area. Yep, there they go again – those highly successful, high spending districts – throwing their money at things that we know have no effect on student learning?

Here’s a quick snapshot of a) the distribution of master’s degree (or higher) salary premiums (master’s bump) across New Jersey districts, by wealth/income group (district factor group) and b) the distribution of teachers holding master’s degrees (or higher) by district wealth/income group.

FIGURE 1: Salary premium associated with holding a master’s degree for teachers in wealthier (District Factor Groups I&J) and poorer (District Factor Groups A&B) New Jersey school districts

So, what we see here is that districts in the wealthiest two factor groups provide the largest master’s bump. Are they just being wasteful? As with the preference for seemingly frivolous smaller class sizes, affluent districts in New Jersey, like elsewhere, tend to offer larger premiums for teachers with advanced degrees. It may be that there is simply public demand in these communities for small classes taught by teachers with advanced degrees. Perhaps the public has not been clued in that these preferences are misguided? Or once again, perhaps these affluent and generally successful school districts and their patrons are on to something?

FIGURE 2: Percent of teachers holding master’s degrees in wealthier (District Factor Groups I&J) and poorer (District Factor Groups A&B) New Jersey school districts

Figure 2 merely validates that the districts offering larger premiums for teachers with a master’s degree also seem to have more of them. It would be a stretch to try to link the two as a causal relationship, given that many other factors are in play. Nonetheless, the apparent preference for teachers with a master’s degree in high wealth districts is realized in the percentages that actually hold a master’s degree.

FIGURE 3: Comparison of wealthy district and poor district wages for a teacher with a master’s degree across experience levels in 2009-10

Figure 3 shows that at the master’s degree level, poorer districts even in New Jersey – which provides substantial additional support to those districts – just can’t keep up with affluent district teacher wages. And one reason is the larger master’s bump provided in the wealthier suburban districts.

Notably, the poorer districts in these graphs have fewer teachers holding master’s degrees and provide a smaller master’s bump. That is, these districts are not “wasting” as much money on frivolous degrees as their wealthier peers. Is it really logical then to suggest that the presence of the “master’s bump” is in any way clearly associated with the difficulties of improving performance in high poverty districts? Low poverty successful districts are doing it even more! Can we make any assertion in this regard at all? Do we really know that we could simply recapture money “wasted” by these districts on master’s degrees and use that money more wisely to improve outcomes? I’m skeptical. In fact, one could equally logically argue that we should provide sufficient funds to the poorer districts to offer higher salaries and a larger master’s bump, to put them in better position to recruit and retain the teachers currently flocking to high wealth districts.

Does offering a substantial premium for teachers with higher levels of educational attainment allow districts to attract and retain better teachers than they might if they didn’t offer this bump? We don’t know if it does (as far as I can tell), AND we don’t know that it doesn’t! Further, if some districts stopped offering the master’s bump, would it affect, positively or negatively, the quality of teachers they could attract? We don’t know (as far as I can tell). These are very different questions than the ones that have been addressed in existing research, and these are questions that have not been sufficiently explored. Thus, we should not be so quick to assume that we know the simple and obvious solutions.

Thought for the day…

Many will consider this blasphemy, but, I’ve been pondering lately:

If our best public and private schools are pretty good (perhaps even better than Finland?),

And, if the majority (not all, but most) of our best AND our worst public (and private) schools use salary schedules which base teacher compensation primarily on degrees/credits/credentials obtained and years of experience or service…

Can we really attribute the failures of our worst schools to these features of teacher compensation?

Yeah… there might be a better (more efficient and effective) way, but is this really the main problem?
