Where are the most economically disproportionate charter schools? (& why does it matter?) UPDATED

Updated: It seems that Mike Petrilli on Twitter takes issue with my reference to these schools below as “segregated.” In his view, if a city includes some charter schools that have more of a 50/50 balance of low income and non-low income kids, those are the integrated schools, even if they achieve their balance by creaming off the non-low income kids in a district that is 80% low income. Petrilli seems to suggest that it is necessarily a good thing if charters can create a balanced population for themselves, even if they create an imbalanced population (an even more intense concentration of poverty) for the system as a whole. Notably, a question the data below leave unanswered is the extent to which the creation of economically non-representative charters in a city can help to retain some middle class families that might not have otherwise sent their children to the district schools. Certainly, there exists at least some evidence that Catholic school enrollments have suffered from charter expansion. It seems far less likely that these charters are recruiting higher income children into the city from neighboring districts. To suggest that a majority, or even a large share, of the non-low-income students in charters would not have otherwise attended the public system – that they were retained families who would have left, brought in from lower poverty neighboring suburbs, or siphoned from private schools – is a huge stretch – a smokescreen. It remains most likely that the vast majority of the sorting displayed herein is internal to the public-charter system and unlikely to cross school district or city boundaries. [more below]

In this first of several posts, I explore economic variation in charter enrollments in the states of Massachusetts, New Jersey and Connecticut.

I’m taking a fairly simple, easily replicable approach here and encourage any data-savvy readers to take their own shot at it. For this analysis I’m using the most recent three years of non-preliminary school level enrollment data from the National Center for Education Statistics Common Core of Data, Public School Universe Survey.

http://nces.ed.gov/ccd/pubschuniv.asp

I’m only using a handful of variables here. I’m using:

  • City of location (lcity)
  • Total school enrollment (member)
  • Total number of free lunch qualified children (frelch)
  • Charter school indicator (chartr)

For each year of the data, I sum the enrollment of all schools in the city of location, including charters and district schools and magnets or other special schools. That gives me the total number of all kids enrolled in a city (yeah… it’s a little messy in that some cities include schools that also enroll kids from outside the city – I limit the final lists to large enough enrollment areas where such cases should not substantively distort final numbers). I do the same for kids qualified for free lunch. So, I have:

  • City Total Enrollment
  • City Free Lunch Enrollment

Note that this is by city, not host district, but city is a relevant geographic unit for many reasons, including the fact that many US cities are actually carved into multiple segregated public school districts. Part of the point here is to run a quick-and-dirty summary with the publicly available, readily useable data.

Next, I determine each charter school’s market share:

  • School market share = school enrollment/city enrollment

And then each school’s share of low income kids served:

  • School free lunch share = school free lunch / city free lunch

If a school was serving a representative population by low income status, then the free lunch share for the school would equal the market share for the school. That is, the school would be serving both X% of total enrollment and X% of low income kids. I use a simple disparity ratio here:

  • School free lunch share / school market share

If the disparity ratio is, say, .50, then the charter school is serving only half as many low income kids as would be proportional for that school.

To make the final data set manageable… I focus on charter schools in cities where the aggregate enrollment is greater than 10,000. And to have more stable numbers, 1) I use only those charters with at least a 1% market share, and 2) I use a three-year average (2009 to 2011).
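For the data savvy readers mentioned above, here’s a minimal sketch of the computation in Python/pandas. It assumes a school-level data frame stacked across the three CCD files with the variables named earlier (lcity, member, frelch, chartr); the year and school name columns, and the charter flag being coded 1, are my assumptions – check them against the actual file layout.

    import pandas as pd

    def disparity_ratios(df, min_city_enroll=10000, min_share=0.01):
        # City totals across ALL schools: district, charter, magnet, etc.
        city = df.groupby(['year', 'lcity']).agg(
            city_enroll=('member', 'sum'),
            city_frelch=('frelch', 'sum')).reset_index()
        d = df.merge(city, on=['year', 'lcity'])

        # Market share, free lunch share, and the disparity ratio.
        d['market_share'] = d['member'] / d['city_enroll']
        d['frelch_share'] = d['frelch'] / d['city_frelch']
        d['disparity'] = d['frelch_share'] / d['market_share']

        # Charters only, in large-enough cities, with non-trivial share;
        # then average over the three years for stability.
        keep = ((d['chartr'] == 1) &
                (d['city_enroll'] >= min_city_enroll) &
                (d['market_share'] >= min_share))
        return (d[keep].groupby(['lcity', 'school_name'])['disparity']
                       .mean().sort_values())

A school with a ratio of 1.0 enrolls exactly its proportional share of free lunch kids; the tables below are just this ratio, school by school.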

So, let’s have at it. Here are the ratios for Connecticut schools:

Slide4

All but two CT charters underserve low income students in these data. Four are under 70%. Park City, Jumoke and AF Bridgeport are particularly egregious examples!

Here’s Massachusetts:

Slide5

Many Boston area schools are excluded from the above table on the basis that what outsiders generally think of as “Boston” is actually carved into many smaller city areas, many of which fell under my 10,000 aggregate enrollment threshold. I will report additional data on these areas at a later date.

And finally, New Jersey:

Slide6

Unfortunately, in this last figure, we actually lose some of New Jersey’s most economically disproportionate charter schools – those in Hoboken – because Hoboken fell under the aggregate enrollment threshold.

Why does this matter?

There exist at least two reasons why it matters to pay close attention to just how different charter schools are from their surroundings – that is, if and when they are. First, better understanding the demographic differences of charter schools – or any school for that matter – provides a useful backdrop for claims of chartery miracles. Second, the demography of charters in their local contexts, and demographic shifts induced by choice programs, or attendance boundary reconfiguration for that matter, have implications for schools on both ends – sending and receiving.

1. Claims of reformy miracles

I don’t know how many times I’ve come across tweets and blog posts, for example, talking about how BASIS charter schools in Arizona are better than Singapore or Shanghai, or even Finland. And that, since we all know Arizona is a high poverty state, BASIS must be serving low income kids, and thus achieving some transferable miracle.

If we put BASIS into a scatterplot among Arizona schools – its % free or reduced lunch share against its national percentile ranking for math – we get this picture:

Slide2

Here, BASIS looks rather not-so-miraculous. In fact, it’s right about where one would expect given the students it serves.

Likewise, schools like Robert Treat Academy and North Star Academy often receive praise for their outcomes in New Jersey. Here’s where they lie when we take into account free lunch shares alone (and use general test taker outcomes to reduce special ed and ELL effects).

Slide1

Both are near where one would expect them to be given their students. In fact, many more Newark Public Schools district schools deviate positively – and more positively – from expectations than either of these “miracle” schools.
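The “where one would expect” test in these figures is nothing fancier than a residual from a simple regression of outcomes on low income share. Here’s a sketch with synthetic stand-in data (the real inputs would be the school-level files behind the scatterplots):

    import numpy as np

    rng = np.random.default_rng(0)
    # Synthetic stand-ins: % free/reduced lunch, and an outcome that
    # declines with poverty plus noise.
    pct_frl = rng.uniform(0, 100, 300)
    score = 90 - 0.6 * pct_frl + rng.normal(0, 8, 300)

    # Fit the trendline, then measure each school's distance from it.
    b1, b0 = np.polyfit(pct_frl, score, 1)
    residual = score - (b0 + b1 * pct_frl)  # > 0 means beating expectations

A “miracle” school sitting right on the line has a residual near zero – which is precisely the point about BASIS, Robert Treat and North Star.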

2. Effects on the system as a whole

As I’ve shown in several previous posts (like this one), when charter schools (or districts’ own magnet schools) siphon off lower need students they leave behind higher need students. Just as the concentration of lower need students in charter or magnet schools may provide advantageous peer group influence on those involved, the concentration of higher need students left behind in district or other charter schools has adverse peer group effects. Similar concerns arise with neighborhood level sorting of children and families. The policy goal is to figure out how best to manage student sorting so as not to exacerbate these problems via under-regulated choice programs (with incentives to cream-skim).

Regulation need not take the form of requiring all charter (or district magnet) schools to serve proportionate shares of specific populations (by race, economic status or disability). The reality is that some charter schools, like districts’ own magnet schools, may work better with some populations than others, and thus forcing them to serve a population they are ill-equipped to serve is neither productive for the school nor the child.

However, where charter (or magnet) success depends on ability to serve a select population, alternative policy constraints like growth caps may be in order, to restrain otherwise parasitic tendencies.

Thus far, however, unfettered, largely parasitic charter growth continues to have the potential to do much more harm than good in the long run.

UPDATE

Some have pointed out that the charter sector in these states appears relatively “balanced” overall. Thus, what’s the harm? They merely introduce heterogeneity based on the preferences of individual parents on behalf of their children. The problem is that charter enrollment behaviors seem to vary substantially by city. So, statewide averages, or statewide distributions, can mask real local level problems. For example, in New Jersey, most of the charter schools in Trenton over enroll low income kids, while on average in Newark, they under enroll. That charters in Trenton over enroll low income kids does not help the Newark situation, though it does raise different questions for Trenton. Notably, when CREDO conducted its study of charter school effects in New Jersey, the identified positive effect came entirely from Newark, whereas charters elsewhere in the state underperformed.

Here are a few additional slides showing the city level aggregate disproportionality for the states above. Note that there may be a few cases where charter operators submitted the WRONG information about their “city of location” to their state, for the national data. In such cases, a charter may show up in the city where it keeps its management office rather than where it runs its school. Don’t blame me for wrong addresses in the data. Blame those who submitted their information WRONG!

Here’s NJ, where the greatest aggregate disproportionality is in Princeton. And to those arguing that charters merely create more balance than can the district – that is NOT the case in Princeton, NJ. Note that the net disproportionality in Newark is about 84%. Thus, while there is heterogeneity, with some schools overserving low income kids, there are enough schools underserving low income kids, and by a large enough margin, that the net effect is that charters in Newark are underserving. Some other smaller towns with single charters stand out… Camden is approximately balanced between charter and district schools, and Trenton has a higher concentration of low income kids in charters. Statewide, NJ is, on average, relatively balanced.

Slide7

Here’s Massachusetts, which on average is imbalanced, with significant disproportionality in locations like Dorchester, which is home to many charters. Charters within the city of Boston itself are more balanced.

Slide8

Here’s Connecticut, which on average is also imbalanced.

Slide9

Another point that has been raised, related to the issue of charters attracting suburbanites and retaining “wealthier” families that might otherwise leave the cities, is the argument that these most disproportionate charters likely represent their neighborhoods within the cities, and the schools around them. First, as I explain in the comments below, this apparent skimming pattern isn’t so much a function of some charters serving wealthy populations (not so much a Princeton problem), but rather a function of charters in otherwise poor neighborhoods skimming off the less poor from surrounding neighborhoods and schools. Indeed, the other scenario likely exists in a few select cases. But having reviewed numerous maps of charter locations and demography, I don’t suspect that’s the norm. Here are a few maps for illustration.

Here are Newark charters:

Slide11

Note, for example, that Robert Treat Academy stands out like a sore thumb. And even TEAM, which is more representative than other Newark charters, sticks out in its context (a yellow circle surrounded by red ones). So too does Greater Newark, which is surrounded both by higher poverty district schools and by other higher poverty charters.

Here’s Hartford, CT, where nearly every other district school – except for the magnet schools – is a red circle – serving very high poverty concentrations.

Hartford Charters

But, Hartford is wonderfully illustrative of the fact that some districts also impose on themselves a significant degree of economic segregation. Hartford’s Capital Prep is as disproportionate in low income enrollment as Jumoke and Achievement First. But none – none – of the district’s regular public schools, including those right next door, serve such low shares of kids qualified for free lunch.

Comments on NJ’s Teacher Evaluation Report & Gross Statistical Malfeasance

A while back, in a report from the NJDOE, we learned that outliers are all that matters. They are where life’s important lessons lie! Outliers can provide proof that poverty doesn’t matter. Proof that high poverty schools – with a little grit and determination – can kick the butts of low poverty schools. We were presented with what I, until just the other day, might have considered the most disingenuous, dishonest, outright corrupt graphic representation I’ve seen… (with this possible exception)!

Yes, this one:

Slide5

This graph was originally presented by NJ Commissioner Cerf in 2012 as part of his state of the schools address. I blogged about this graph and several other absurd misrepresentations of data in the same presentation here & here.

Specifically, I showed before that the absurd selective presentation of data in this graph completely misrepresents the actual underlying relationship, which looks like this:

Slide6

Yep, that’s right: % free or reduced price lunch alone explains 68% of the variation in proficiency rates between 2009 and 2012 (okay, that’s one more year than in the misleading graph above, but the pattern is relatively consistent over time).

But hey, it’s those outliers that matter right? It’s those points that buck the trend that really define where we want to look…what we want to emulate? right?

Actually, the supposed outliers above are predictably different, as a function of various additional measures that aren’t included here. But that’s a post for another day. [and discussed previously here]

THEN came the recent report on progress being made on teacher evaluation pilot programs, and with it, this gem of a scatterplot:

Slide1

This scatterplot is intended to represent a validation test of the teacher practice ratings generated by observations. As reformy logic tells us, an observed rating of a teacher’s actual classroom practice is only ever valid if those ratings are correlated with some measure of test score gains.

In this case, the scatterplot is pretty darn messy looking. Amazingly, the report doesn’t actually present either the correlation coefficient (r) or coefficient of determination (r-squared) for this graph, but I gotta figure in the best case it’s less than a .2 correlation.

Now, state officials could just use that weak correlation to argue that “observations BAD, SGP good!” which they do, to an extent. But before they even go there, they make one of the most ridiculous statistical arguments I’ve seen, well… since I last wrote about one of their statistical arguments.

They argue – in picture and in words above – that if we cut off points from opposite corners – lower right and upper left – of a nearly random distribution – there otherwise exists a pattern. They explain that “the bulk of the ratings show a positive correlation” but that some pesky outliers buck the trend.

Here’s a fun illustration. I generated 100 random numbers and another 100 random numbers, normally distributed, and then graphed the relationship between the two:

Slide2

And this is what I got! The overall correlation between the first set of random numbers and the second set was .03.

Now, applying NJDOE Cerfian outlier exclusion, I exclude those points where X (first set of numbers) >.5 & Y (second set) <-.5 [lower right], and similarly for the upper left. Ya’ know what happens when I cut off those pesky supposed outliers in the upper left and lower right? The remaining “random” numbers now have a positive correlation of .414! Yeah… when we chisel a pattern out of randomness, it creates… well… sort of… a pattern.

Mind you, if we cut off the upper right and lower left, the bulk of the remaining points show a negative correlation. [in my random graph, or in theirs!]
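The stunt is easy to reproduce (a sketch using the same cutoffs described above):

    import numpy as np

    rng = np.random.default_rng(42)
    x = rng.standard_normal(100)
    y = rng.standard_normal(100)
    print(np.corrcoef(x, y)[0, 1])  # near zero: the data are pure noise

    # 'Cerfian outlier exclusion': drop lower-right and upper-left corners.
    keep = ~(((x > 0.5) & (y < -0.5)) | ((x < -0.5) & (y > 0.5)))
    print(np.corrcoef(x[keep], y[keep])[0, 1])  # now solidly positive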

But alas, the absurdity really doesn’t even end there… because the report goes on to explain how school leaders should interpret this lack of a pattern that after reshaping is really kind of a pattern, that isn’t.

Based on these data, the district may want to look more closely at its evaluation findings in general. Administrators might examine who performed the observations and whether the observation scores were consistently high or low for a particular observer or teacher. They might look for patterns in particular schools, noting the ones where many points fell outside the general pattern of data. These data can be used for future professional development or extra training for certain administrators. (page 32)

That is, it seems that state officials would really like local administrators to get those outliers in line – to create a pattern where there previously was none – to presume that the reason outliers exist is because the observers were wrong, or at least inconsistent in some way.  Put simply, that the SGPs are necessarily right and the observations wrong, and that the way to fix the whole thing is to make sure that the observations in the future better correlate with the necessarily valid SGP measures.

Which would be all fine and dandy… perhaps… if those SGP measures weren’t so severely biased as to be meaningless junk. 

Slide4

Yep, that’s right – SGPs, at least at the school level, and thus by extension at the underlying teacher level, are (see the sketch after this list):

  1. higher in schools with higher average performance to begin with in both reading and math
  2. lower in schools with higher concentrations of low income children
  3. lower in schools with higher concentrations of non-proficient special education children
  4. lower in schools with higher concentrations of black and Hispanic children
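The sketch promised above: checking for this kind of bias is just a matter of regressing school-level SGPs on the demographic measures and seeing whether the coefficients stray from zero. Here it is with synthetic stand-in data that bakes in the pattern (the real check uses the state’s school-level files):

    import numpy as np

    rng = np.random.default_rng(1)
    n = 400
    # Synthetic school-level inputs mimicking the pattern described above.
    prior = rng.normal(0, 1, n)         # prior average performance
    pct_frl = rng.uniform(0, 1, n)      # share low income
    pct_sped = rng.uniform(0, 0.25, n)  # share special education
    sgp = 50 + 8*prior - 10*pct_frl - 6*pct_sped + rng.normal(0, 4, n)

    # If SGPs were unbiased measures of school (or teacher) effects,
    # every coefficient except the constant would hover near zero.
    X = np.column_stack([np.ones(n), prior, pct_frl, pct_sped])
    beta, *_ = np.linalg.lstsq(X, sgp, rcond=None)
    print(dict(zip(['const', 'prior', 'frl', 'sped'], beta.round(2))))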

So then, what would it take to bring observation ratings in line with SGPs? It would take extra care to ensure that ratings based on observations of classroom practice, regardless of actual quality of classroom practice, were similarly lower in higher poverty, higher minority schools, and higher in higher performing schools. That is, let’s just make sure our observation ratings are similarly biased – similarly wrong – to make sure that they correlate.  Then all of the wrong measures can be treated as if they are consistently right???????

Actually, I take some comfort in the fact that the observation ratings weren’t correlated with the SGPs. The observation ratings may be meaningless and unreliable… but at least they’re not highly correlated with the SGPs which are otherwise correlated with a lot of things they shouldn’t be.

When will this madness end?


A few quick thoughts and graphs on Mis-NAEP-ery

Update: Here are a bunch of additional graphs relating Students First Report Card grades with unadjusted and adjusted NAEP Gains (hint – it’s the adjusted gains that matter since low performing states are able to post bigger gains, and also generally received higher grades from Students First). Mis_naep_ery9

Yesterday brought the release of the 2013 NAEP results, which of course brings with it a bunch of ridiculous attempts to cast those results as supporting the reform-du-jour. Most specifically yesterday, the big media buzz was around the gains from 2011 to 2013, which were argued to show that Tennessee and Washington DC are huge outliers – modern miracles – and that, because these two settings have placed significant emphasis on teacher evaluation policy, current trends in teacher evaluation policy are working – that tougher evaluations are the answer to improving student outcomes – not money… not class size… none of that other stuff.

I won’t even get into all of the different things that might be picked up in a supposed swing of test scores at the state level over a 2-year period. Whether 2-year swings are substantive and important or not can certainly be debated (not really), but whether policy implementation can yield a shift in state average test scores in a 2-year period is perhaps even more suspect.

Setting all that aside, let’s just take a step back and look at the NAEP data: changes in scores from 03-13, 09-13 and 11-13 for 4th grade reading and 8th grade math. BUT, as I’ve shown before, since gains on NAEP appear correlated with starting point – lower performing states show higher gains – let’s condition those gains on starting point by representing them in scatterplots against starting points.

Here are the figures. In some of the figures below, I’ve cut out Washington, DC because it is such a low performing outlier. It does creep into the picture as its scores rise. But this is a rise over the longer haul, beginning long before teacher evaluation reforms.

If teacher evaluation reform (or expanded choice, etc.) has caused great NAEP gains, then the graphs below should show that especially from pre-RTTT baseline year 2009 to 2013, states adopting RTTT-style teacher eval policies should be rising above the trendline – but not those curmudgeonly states that have lagged in such reform efforts.

Grade 4 Reading 2003-2013

Slide1

Over the 10 year period, Maryland is the miracle state in 4th grade reading. Matt Di Carlo has pointed this out in the past. Florida does okay, and Alabama is also a standout. New Jersey and Massachusetts – both initial high performers also exceed expectations given starting point. Louisiana falls right on the line.

Grade 4 Reading 2009-2013

Slide2

From 2009-2013, Maryland remains the standout. Georgia, Washington, Utah and Minnesota also do pretty darn well, and yes… Tennessee is in that next batch, but even Wyoming and New Hampshire beat expectations by more, having started higher. Louisiana beats expectations. In any case, it’s hard to make a case that from 2009 to 2013, states that moved most aggressively on teacher evaluation are those that showed greatest gains.

Grade 4 Reading 2011-2013

Slide3

On the recent 2-year bump, Tennessee and DC do quite well, but so too does Minnesota. Colorado (another teacher eval state) does pretty well on this one. This graph may provide the “best” (albeit painfully weak, suspect and short term) “evidence” for teacher eval states – well – except for Minnesota, which I don’t believe was leading the reformy pack on that issue. Of course there are also those who wish to point to choice policies as the driver – noting Indiana’s presence in the mix – but similar inconsistencies undermine this argument (with larger and smaller charter and voucher share states falling, well, all over the place in this figure – but that does warrant some additional figures at a later point.)

Let’s move to 8th grade math.

Grade 8 Math 2003-2013

Slide4

New Jersey and Massachusetts lead the way on 10 year gains – even though they started high – with Vermont and New Hampshire doing okay as well. Hawaii also isn’t looking bad here. But Louisiana, despite starting low, posted lackluster gains. Tennessee is right below Nevada – falling pretty much in line with expectations.

Grade 8 Math 2009-2013

Slide5

From 2009 to 2013, New Jersey and Massachusetts along with Rhode Island, Hawaii, Ohio, California and Mississippi do pretty well. Not your most reformy mix of states – regarding teacher evaluation or choice programs (but for Ohio’s charter expansion). Louisiana is still sucking it up, and Tennessee falling more or less in line with expectations.

Grade 8 Math 2011-2013

Slide6

Finally, in the much noisier two-year bump on math from 2011 to 2013, we get a little more spreading out – because a two-year bump is noisier – less certain – less decisive in any way, and also less related to initial level. Here, New Jersey and Massachusetts are still about as far above expected growth as is Tennessee, which for the first time jumps above expectations for grade 8 math growth. DC does creep into the picture here, and posts some pretty nice gains. BUT… the issue with DC is that its average starting point is so low that it’s hard to predict accurately what its gain would likely be.

Is Tennessee’s 2-year growth an anomaly? We’ll have to wait at least another two years to figure that out. Was it caused by teacher evaluation policies? That’s really unlikely, given that those states that are equally and even further above their expectations have approached teacher evaluation in very mixed ways, and other states that had taken the reformy lead on teacher policies – Louisiana and Colorado – fall well below expectations.

UPDATE: Classic example of Mis-NAEP-ery

Here are some additional versions of the figures above, in which I have identified the states that received passing grades from Students First for “teaching” related policies.

Clarification: The graphs below separate states that received above/below a “teaching” grade point average of 2.0 from Students First.

Slide1 Slide2 Slide3 Slide4

Another UPDATE: Here are the trends on DC score improvements… So, in other words, are you really telling me that teacher contractual changes adopted in the last few years affected student gains starting back in 1996?

Slide1 Slide2 Slide3 Slide4

Failure is in the Eye of the Political Hack: Thoughts & Data on NJ Failure Factories & NOLA Miracles

We all know… by the persistent blather emanating from reformy-land that some common truths exist in education policy.

Among those truths are that New Jersey’s urban public school districts are absolute, undeniable Failure Factories, while New Orleans’ Post-Katrina charter invasion is the future of greatness in public (well, not really public) education – the ultimate example of how reformyness taken to its logical extreme saves children from failure factories.

Thus, we must take New Jersey down that New Orleans path toward greatness. It’s really that simple. Dump this union-protectionist favor-my-failure-factory mindset… throw all caution (and public tax dollars) to the wind – jump on that sector agnostic train and relinquish all adult self interest.

But like most reformy truths, this one is a bit fact challenged, even when mining reformy preferred data sources.

Now – as I’ve explained previously, I do have my concerns with the Global Report Card method for bridging state, NAEP and international assessments. But why should a little statistical validity concern keep us from having some fun with it?

Wouldn’t it be fun, for example, if we could make some direct comparisons between NOLA’s miracle relinquished, sector agnostic charter schools and New Jersey’s union-protectin’ public bureaucracy failure factories? Wouldn’t it just?

And of course, if we can compare our individual school districts with Finland and Singapore using the Global Report Card, then why the heck can’t we compare NJ Failure Factories and NOLA Miracle Charters? I guess we can.

Let’s start with a global look at NJ’s massively failing public schooling system when compared with Finland, and all other U.S. public school districts. In this graph we have NJ districts in orange, compared against the Finnish average (50% on the vertical axis) and in the context of all other U.S. districts, from lower to higher percent free/reduced lunch.

Slide4

In fact, a whole bunch of NJ districts look like they’re doin’ pretty darn well – above that Finnish median (the Finnish line?). But hey… there are those high poverty NJ districts over to the right… those failure factories… those where children are being dreadfully failed by their unionized teachers (yeah… you!)… they do indeed fall well below the Finnish median… and that’s just not acceptable!

Certainly, Louisiana as a state must look at least as good as NJ when compared to Finland… especially given the massive gains of NOLA children after that wonderfully beneficial weather event some years back (or so it’s been characterized by many a reformy public official in the past 5 years or so).

Slide5

Well, that’s not a very good start, is it? But hey, Louisiana is a very high poverty state with many issues to overcome. And it is well understood that the best way to overcome poverty is to put very little fiscal effort into public education, to rate teachers by their students’ test scores, and to evaluate teacher preparation similarly.

NOLA Miracles and NJ Failure Factories

Let’s dig deeper into that lower right hand corner for a bit. Let’s specifically isolate those NJ public districts and those NOLA Recovery School District charters with over 80% of children qualified for free or reduced price lunch, and let’s see how they stack up against each other, by their percentile rank among U.S. districts in 2009 reading and math.

Slide1

Here, it certainly appears that NJ failure factories are actually doing about as well as NOLA miracle schools. Heck, Union City, West New York and East Newark beat them all, including NOLA KIPP. Newark – one of the most failure-factory of all – is ahead of most NOLA charter organizations and not far behind KIPP.

The picture is pretty similar for reading, but with two NOLA charter operators rising higher in the picture. The others, not-so-much!

Slide2

Now – you say… but the NOLA charters are much higher in poverty. This isn’t really a fair comparison, even though I’ve isolated only the highest poverty districts and schools. But to say that, you’d have to be ignorant of a key problem with poverty measurement – about which I’ve written on numerous occasions in recent years in my blog, in peer reviewed articles and in recent reports.

Put simply, because the same income thresholds are used across the whole country for determining those free/reduced lunch rates above, poverty in NOLA schools is significantly overstated, and poverty in the NJ schools is significantly understated.

So, I can use adjustments that we generated for our research on poverty measures to correct the free/reduced lunch rates for our NOLA charters and NJ districts. For the most part, the NJ districts, by comparison, move up to 100%, but here, I allow them to go above 100% just to spread them out. By contrast, and as expected, the NOLA charters go down in poverty.
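This is NOT the actual adjustment from that research – just a sketch of the direction it works: rescale each location’s rate by a regional cost (or wage) index, so that the fixed national income cutoff means roughly the same thing everywhere. The index values below are invented purely for illustration:

    # Illustrative only; invented index values, not the research adjustment.
    regional_cost_index = {'NJ district': 1.25, 'NOLA charter': 0.90}
    frl_rate = {'NJ district': 0.85, 'NOLA charter': 0.92}

    for place, rate in frl_rate.items():
        adjusted = rate * regional_cost_index[place]  # NJ can exceed 100%
        print(f'{place}: reported {rate:.0%}, adjusted {adjusted:.0%}')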

And the pictures look like this:

Slide3Slide4

And what do we see?

We see that ON AVERAGE, NJ PUBLIC SCHOOL DISTRICT FAILURE FACTORIES ARE BOTH HIGHER IN POVERTY AND HIGHER IN AVERAGE PERFORMANCE OUTCOMES THAN THE VAST MAJORITY OF NOLA MIRACLE CHARTERS.

Heck, even Camden City Schools, slated to be dismantled NOLA-style… already performs about as well (middle of the pack) as most post-Katrina NOLA miracle charters. On reading, Union City, West New York, East Newark, Elizabeth, East Orange, New Brunswick and Newark all beat NOLA KIPP, and all have higher adjusted low income rates!

Yes… the statistical bridging method here from state assessments to national percentiles is, well, imperfect at best.

But that never stopped reformy-pundits from arguing that all U.S. schools suck when compared to Finland or Singapore.

Thus by empirical and logical extension, the NOLA reformy miracle is a cesspool when compared to New Jersey’s failure factories.

Either that, or New Jersey’s failure factories really aren’t as bad as we’ve been led to believe (except maybe this one).

Well that doesn’t fit the reformy narrative very well does it?

Charter Schools & the Public Good: Jersey City Version

As I’ve discussed in several recent posts, I’m increasingly concerned with how charter school expansion has played out both in our cities and in our suburbs.

My one post that perhaps best captures my overarching concerns is here.

It seems that increasingly, no matter where I look, my worst fears are realized. As I’ve explained numerous times – I began my work on charter school policy with positive expectations. Not so much anymore. Here’s how it’s all playing out in Jersey City, NJ.

First… the map… where we have our two highly skimmed schools – Soaring Heights and Learning Community Charter.

Slide1

And yes, we do have some charters for the commoners, at least in terms of income status.

NOTE: While LCCS has not updated its latitude/longitude data for its new location – the enrollment data characterizing their actual student enrollments are from 2012-13.

Slide2

The skimming behavior of the elite charters not only disadvantages other district schools, but also those charters for the commoners.

Perhaps more problematic than the number of lower income children left behind in district and non-elite charters is the number and share and type of children with disabilities. Here are the aggregate shares, which are disparate enough.

Slide3

More problematic, however, is the fact that the big red bar representing district schools includes much larger shares of children with far more severe and more costly disabilities. Charters are serving only those children with the least severe disabilities: a) mild specific learning disabilities, b) speech/language needs and c) in some cases “other” health impairments.

Slide4

And under New Jersey’s persistently biased growth measures, these strong patterns of student sorting not only have consequences for the average level of student performance, but also for the average gains. Clustering more disadvantaged peers together – which necessarily happens when you cluster more advantaged peers together – has consequences.

Higher poverty settings have lower gains and vice versa. Does this mean, as NJDOE would have us believe [by arguing that their growth measures fully account for student background and that teachers are the most important in-school determinant of growth], that teachers in Liberty Academy and Jersey City Comm Charter suck and teachers in Learning Community and Soaring Heights are awesome? This is a highly suspect (read: totally ridiculous, offensive and asinine) conclusion to draw.

Slide5

And lower performing settings have lower gains, though this picture is somewhat less clear, because Soaring Heights fails to soar to its expected heights.

Slide6

The sorting induced by some though not all charter schools in Jersey City raises concerns about how New Jersey charter policy should move forward in the future. This is not to suggest that any and all sorting is bad and should never occur – or be immediately stopped. But, we cannot ignore it… nor should we let the system run wild on its current path.

NJ Education Spending & the Collapse of Equity [Update]

A while back, I wrote this post on the collapse of educational equity in New Jersey.

A few years back, I wrote this post to try to clear up the multitude of falsehoods I kept hearing about New Jersey taxes and spending.

Well… not much time to write a great deal of explanatory text today… but here are a few updated figures. Tax figures are from the state and local tax query system of Taxpolicycenter.org. Note that these figures only go through FY 2011, as do Census data on local public school district spending used in the retreat from equity post above.

But before I go to my updated slides, note that the Center on Budget and Policy Priorities also recently produced a report on education spending since 2008, finding that New Jersey was among those states that, either in percentage terms or on a per pupil basis, had seen reductions in inflation-adjusted elementary and secondary education spending.

Slide5

So, here’s New Jersey spending on education since 1990, beginning with STATE DIRECT EXPENDITURES on k-12 and higher education. Note that the peak of state direct spending on k-12 was in 2006, following the largest scale up of “Abbott” funding (from 1998 to 2005ish). Since that time, first with the adoption of the School Funding Reform Act (for comments on problems with SFRA, see this post) and then with recession-era cuts, state support has declined.

Slide1

Here’s combined state and local direct spending on education PER CAPITA (NOT per pupil, but per population) – which is the lion’s share of spending on education (federal being relatively small… but for a temporary “stabilization” boost).

Slide2

Now, one argument for the per capita drop is/was that earnings/incomes, etc. were dropping, and thus the burden on the taxpayer was simply too high and climbing. But here’s what the direct spending – state and local – on education looks like as a share of personal income.

Slide3

Yes, even as a share of income, education spending declined.
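For clarity, the two measures in the last couple of figures are simple ratios. The dollar figures below are placeholders for the arithmetic only, not actual NJ values:

    # Placeholder magnitudes, for the arithmetic only.
    spending = 25.0e9           # state + local direct K-12 spending
    population = 8.9e6          # state population
    personal_income = 470.0e9   # aggregate personal income

    per_capita = spending / population            # ~ $2,809 per resident
    share_of_income = spending / personal_income  # ~ 5.3 cents per dollar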

And this decline comes largely as a function of state aid decline. And, when state aid declines, the natural tendency is to use local property taxes to the extent possible to offset that decline.

So, as we can see, property taxes spiked.

Slide4

And, of course, some local public districts have far more capacity to offset their losses with property tax increases than do others. See this post.

And for more information on persistent property tax disparities by wealth in NJ see this post!

So… during this period, as the post mentioned at the outset explains, the progressiveness of New Jersey’s state school finance system began to decline.

Slide6

progressiveness

That previously progressive system had actually made some substantial strides for low income children.

Friday Story Time: Deconstructing the Cycle of Reformy Awesomeness

Once upon a time, there was this totally awesome charter school in Newark, NJ. It was a charter school so awesome that its leaders and founders and all of their close friends decided they must share their miracle with the world in books on the reasons for their awesomeness, including being driven by data and teaching like a champion!

The school’s break-the-mold – beating the odds – disruptively innovative awesomeness was particularly important during this critical time of utter collapse of the American education system which had undoubtedly been caused by corrupt self-interested public school teachers (& their unions) who had been uniformly ill-trained by antiquated colleges and universities that themselves were corrupt and self-interested and generally in the business of selling worthless graduate degrees.

In fact, the undisputed awesomeness of this North Star Academy could, in theory, provide the foundation for a whole new approach to turning around the dreadful state of American education.

And thus came the Cycle of Reformy Awesomeness, which looks something like this:

Slide1

Built on the foundation of awesomeness established by THE North Star Academy, and since teachers are the undisputed most important in-school factor determining student outcomes, the awesomeness of North Star could be attributed primarily to the quality of the teachers and the innovative practices they used in their data driven classrooms!

Thus, by extension, we must establish new institutions of teacher preparation whereby these truly exceptional teachers (of 3 to 5 years experience) not only are provided the opportunity to share their expertise on a personal collaborative level with their own colleagues, but rather, we should let these teachers be the instructors in a new graduate school of education (regardless of academic qualifications) and we should actually let them grant graduate degrees in education to their own colleagues.

This new approach of letting teachers in a school grant graduate degrees to their own work colleagues (and those in other network schools) could lead to rapid diffusion of excellence and would most certainly negate the corrupt perverse incentives pervasive throughout the current, adult oriented self-interested American higher education system! Disruptive innovation indeed!

And so their founders and disciples took their show on the road. They took their show to state departments of education to urge fast-tracked uncritical promotion of their cycle of awesomeness. They gained leverage on local boards of education in nearby school districts to promote diffusion of their awesomeness. And they set out to other state departments of education to share their insights on how to achieve awesomeness with drive by data… excuse me… being driven by data!

And driven by data they were… for example… absolutely all of the kids in their school passed that test in high school.

And there was much rejoicing.

Slide2

And that one too:

Slide3


And there was much rejoicing.

And they were only getting better, and better and better:

Slide4

And there was much rejoicing.

And better:

Slide5

The more they looked at their own data – well, really only one measure of their data – the more they patted themselves on the back, congratulated their own reformy awesomeness and shared it with the world. And the state!

Slide6

And there was much rejoicing.

Yup… 100% graduation rate… which is totally unheard of for a high poverty, urban high school in dreadful Newark, NJ! [or at least for a school that happens to be located in the high poverty city of Newark].

A true miracle it was… is… and shall be. One that must be proliferated and shared widely.

But alas, the more they shared, the more they touted their awesomeness, the more it started to become apparent that all might not be quite so rosy in North Star land.

As it turned out, those kids in North Star really didn’t look so much like those others they were apparently so handily blowing out on state tests…

Slide11

And there was complete freakin’ silence!

Somehow, this rapidly growing miracle school was managing to serve far fewer poor children than others (except a few other charter schools also claiming miracle status) around them.

And, they were serving hardly any children with disabilities and few or none with more severe disabilities.

Slide12

And again there was complete freakin’ silence!

And if that was the case, was it really reasonable to attribute their awesomeness to the awesomeness of their own teachers – their innovative strategies… and the nuanced, deep understanding of being driven by data?

Actually, it is perhaps most befuddling, if not outright damning, that such non-trivial data could be so persistently ignored in a school that is so driven by data.

And there was complete freakin’ silence!

But alas, these were mere minor signals that all might not be as awesome as originally assumed.

It also turned out that of all the 5th graders who entered the halls of awesomeness, only about half ever made it to senior year – year after year after year after year… after year.

Slide14

And for black boys in the school, far fewer than that:

Slide15

And there was complete freakin’ silence!

And in any given year, children were being suspended from the school at an alarming rate.

Slide13

Again… raising the question of how a school driven by data could rely so heavily on a single metric – test scores and pass rates derived from them – to proclaim their awesomeness, when in fact, things were looking somewhat less than awesome.

Could a school really be awesome if only the fewer than half who remain (or the 20% of black boys who remain) pass the test? Might it matter at least equally as much what happened to the other half who left?

Was it perhaps possible that the “no excuses” strategies endorsed as best practices both in their school and in their training of each other really weren’t working so well… and weren’t the strategies of true teaching champions… but rather created a hostile and oppressive environment causing their high attrition rate? Well… one really can’t say, one way or the other…

Regardless of the cause, what possibly could such a school share with those traditional, supposedly failing public schools that lacked a similar ability to send the majority of their children packing? Further, what possibly could the rather novice teachers in this school, charged with granting their own co-workers graduate credentials, share with experienced researchers and university faculty training the larger public school teacher workforce?

Alas the miracle was (is) crumbling.

But that miracle wasn’t just any ol’ miracle. Rather, it was the entire foundation for the reformy cycle of awesomeness! And without that foundation, the entire cycle comes crumbling down.

Slide16

No miraculously awesome charter school [in fact, one might argue that any school with such attrition is an unqualified failure].

Thus no valid claim of miraculous teachers and teaching.

Thus no new secret sauce for teacher preparation.

All perpetrated with deceptive and in some cases downright fraudulent (100% graduation rate?) presentation of data.

And thus the search continues… for the next miracle… and the next great disruptive innovation to base on that miracle… whatever… wherever it may be.


The “Ed Schools are the Problem” Fallacy

I had the displeasure of waking up to this drivel in my in-box this morning:

“Those who can, do. Those who can’t, teach. And those who can’t teach, teach teaching.”

http://www.nytimes.com/2013/10/21/opinion/keller-an-industry-of-mediocrity.html?_r=0

yeah… and those completely lacking in critical thinking, basic research and data interpretation skills write op-eds for the Times.

I don’t really teach teachers myself, so I guess I shouldn’t take offense. But I do mainly because the core argument advanced here is so ill-informed and poorly conceived.

Allow me to start by pointing out that I have actually written detailed, quantitative research in peer reviewed journals on the very topic of who’s teaching the teachers. In fact, the article we wrote was done partly in response to the Arthur Levine report cited in the Times op-ed piece. And it’s not as if the article title really conceals its contents:

  • Wolf‐Wendel, L., Baker, B. D., Twombly, S., Tollefson, N., & Mahlios, M. (2006). Who’s teaching the teachers? Evidence from the National Survey of Postsecondary Faculty and the Survey of Earned Doctorates. American Journal of Education, 112(2), 273-300.

My apologies for the fact that this article is fire-walled. I really don’t expect all of my blog readers to go through the trouble of paying for it or finding an academic library that carries it. But any responsible journalist, pundit or author proclaiming a strong policy position on this issue ought to at least do some reading on the topic first. The above article is certainly not uncritical of teacher preparation. [UPDATE: Full version here, courtesy of the kind folks at AJE Baker2006]

And the issues of complexity and variation in teacher preparation I explore in the above research article are not the only massive omission or conflation put forth in the New York Times piece, which operates on the overly crude assumption of a uniform system of content-free instruction across any and all ed schools.

Let’s tackle the bigger and much simpler issue here – the broad notion advanced in this op-ed that Ed Schools are the problem! Ed Schools are the primary threat to the quality of our public schooling system as a whole, and by extension Ed Schools are a threat to our national security. [yeah… he didn’t really say that… but somehow it often goes there] And further, that if we can just replace ed schools – with some other unknown thing – we’ll all be better off.

A kinder, gentler variant on this argument is that it’s just the bad ed schools that are a threat and that we can weed out those bad ed schools by looking at how the students of their graduates perform. I’ve addressed this issue in a few previous blog posts. First, I’ve addressed the question of whether “ed school” is really some static, monolithic entity. Second, I’ve addressed the feasibility of rating ed schools by twice removed outcome measures.

But there’s actually a simpler logical fallacy at play here which lies at the root of many reformy arguments regarding causes and consequences – failure to acknowledge that the U.S. has a wide range of elementary and secondary schools, both high performing and low performing, and that the defining features differentiating higher and lower performing schools are not found primarily in their teachers or the preparation programs they attended – or whether they attended any at all – but rather in the communities they serve, the resources available to them and the backgrounds, health and economic well-being of the children and families they serve.

This is not about the poverty-as-excuse argument. This is about the simple point that our highest performing public schools also employ teachers from traditional public college and university preparation programs and, in many cases, teachers from the same – or substantively overlapping – college and university preparation programs as teachers in our lowest performing schools in the same region.

If that’s the case, then how is it possible that teacher preparation programs are the problem?

I know… the good reformer at this point is thinking – but there are no good U.S. public schools or districts. They all suck and that’s precisely why teacher preparation is the problem. Of course, if that were the case – that all K-12 public schools suck – it would be hard, by research design – with a dependent variable that doesn’t vary – to attribute that sucky-ness to a single cause. But the dependent variable does vary… even when we rely on reformy resources like the Global Report Card I wrote about here.

First, here’s a location where you, yourself can actually download the reformy report card, which in large part was designed to shake the confidence of America’s suburban parents by taking a few statistical leaps to show them their leafy suburban schools wouldn’t stack up so well if we transported them to Finland or Singapore.

http://globalreportcard.org/docs/Global-Report-Card-Data-11.14.12.xlsx

I’ll save that argument for another day, and just select two sets of districts from this report card, from Illinois and Kansas, because I have the data readily available. Let’s look at local public school districts that are

1) Better than the Average Finn and those that are…

2) Worse than 80% of beer-swillin’, hockey-lovin’ Canadians.

That’s quite a contrast (even though both are high performing countries – on average – setting aside demographics, etc.).

Here are the lists:

Slide1

Slide2

So, we’ve got some school districts in each state that are better than the average Finnish school and some that get trampled by those syrup-swillin’ hosers from the Great White North.

The only plausible explanation is that the teachers in the Better than Finland category are either from completely non-traditional ed schools or not ed schools at all, while the teachers in the not-so-great schools all come from your typical state ed school.

Certainly, we know from large bodies of teacher labor market research that graduates of various preparation programs, colleges and universities, and alternative route programs more broadly, sort themselves on the labor market, with those who possess stronger academic credentials often sorting into the “more desirable” jobs.

But that’s somewhat of an aside here. For the basic reformy premise of massive uniform ed school failure to be true – we would have to see little or no commonality in the ed school preparation of teachers across these settings – across totally awesome U.S. schools and totally sucky ones.

So, here’s the recent distribution of graduates of Kansas teacher preparation programs in the Kansas City metropolitan area, which includes the Blue Valley School District – better than the average Finn – and Kansas City, Kansas, which, well, gets its butt kicked by Canada!

Slide3

Hmmm… you can’t possibly be telling me that both KCK and BVSD have teachers who graduated from the major state teacher preparation colleges, can you? If that’s the case, then their relative international rankings might not be determined by teacher preparation?

[ignore the poverty shading in the background…’cuz payin attention to poverty… well… just isn’t cool with the reformy crowd!]

There are some notable features to this map. One is that BVSD and Olathe to its west were still significantly growing districts during this period. So it makes sense that they hired a lot of new teachers during that time. It makes less sense that KCK, more stagnant (and declining) in population, hired so many new teachers – but for the relatively high turnover rate more common in such high poverty settings! There are also some distributional differences in the dots – which universities produce more teachers for which districts (or provide more credentials). Pittsburg State (blue dots), more prevalent in KCK, provides a local program that feeds into KCK. I’d be hard pressed, however, to lay blame on Pitt State for KCK’s Canadian butt-whoopin’, and I’d be equally hard pressed to credit K-State’s producing more teachers for Blue Valley as the cause of Blue Valley’s competitive match-up with Finland! The fact is that all of these Kansas districts draw heavily on teachers produced by the public teachers colleges of that state – and some do as well as Finland while others struggle.

As such, it’s pretty darn hard to lay blame on traditional teacher preparation in Kansas for these differences in outcomes.

Now, let’s take a look at a few high performing and lower performing districts in Illinois.

First, here are the top 15 undergraduate degree producers for Chicago and Aurora East, and for Naperville and Lake Forest. Rather than taking the degree producers’ perspective, these data simply include all instructional staff in these districts, downloadable here: http://www.isbe.state.il.us/research/xls/2012-tsr-public-dataset-instr.xlsx

The data include where teachers got their undergraduate and advanced degrees.
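Replicating the overlap check from that file is a short exercise (a sketch; the column names and district labels here are guesses at the spreadsheet’s layout and must be verified against the actual file):

    import pandas as pd

    df = pd.read_excel('2012-tsr-public-dataset-instr.xlsx')

    # Assumed labels; verify against the file before running.
    districts = ['City of Chicago SD 299', 'Aurora East USD 131',
                 'Naperville CUSD 203', 'Lake Forest SD 67']
    sub = df[df['district_name'].isin(districts)]

    # Share of each district's staff by undergraduate institution.
    shares = pd.crosstab(sub['undergrad_institution'], sub['district_name'],
                         normalize='columns')
    print(shares.sort_values(districts[0], ascending=False).head(15).round(3))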

Slide5

Wow… there’s actually quite a bit of overlap in the institutions. Sure, there are differences. Where a state name is listed, the teacher received his/her undergraduate degree from an unnamed institution in that state (such are the shortcomings of state administrative data). The City of Chicago does have larger shares from some Chicago based programs. But there’s also overlap, and there’s significant overlap for the state’s major public teacher preparation institutions, like Illinois State University, Northern Illinois University and the University of Illinois main campus (Champaign). How can that be? How can there possibly be school districts that compete favorably with Finland while employing graduates of traditional teachers colleges?

While the percentages of teachers in these districts who attended any one preparation institution tend to be small, the shares who attended major public preparation institutions for their bachelor’s degrees appear marginally larger in the high performers (over 10% for both IL State and Northern).

That’s impossible! But… But… But… graduates of those same colleges are teaching in districts that got whooped by the Canadians? So how can we possibly place blame for systemic failure of American schools on teacher prep programs? I’m struggling with the logic here.

One more look… here are the advanced degree granting institutions for teachers in higher versus lower performing Chicago area districts. Note that “NULL” refers to those not holding (or reporting) advanced degrees, and that the share holding only a bachelor’s degree is higher in the lower performing districts (poorer, minority districts).

Slide6

Again, these degrees – which include both initial and additional certifications – are dominated by traditional credential granting institutions, with substantial overlap across teachers between higher and lower performing schools.

This is a separable but related issue to the evaluation of ed schools by student outcome measures. I’ll continue digging more into that issue in future posts.

It is certainly hard to make a compelling case that traditional teacher preparation institutions are the primary cause of our supposed lagging national education system when our highest performing schools – those that compete favorably with Finland – also employ, in large numbers, graduates of those preparation programs, and in many cases employ significant numbers of graduates of the same programs that provide teachers for our supposed failing schools.

$500 million? No! $3 BILLION! That’s $3 BILLION! Comments on New York State’s Underfunding of NYC Schools

New York’s Governor Cuomo has been big on words promising NOT TO FUND New York State schools and to squeeze them to the maximum extent possible with layers of cuts and caps. After all – NOT FUNDING SCHOOLS is the most noble of endeavors – that, along with declaring death penalties for those underfunded, high need schools that post low average test scores.

The Governor’s most recent anti-school-funding attack comes in response to NYC Mayoral Candidate Bill de Blasio’s campaign promise to push for universal preschool for city school children. At least as characterized in a New York Daily News editorial, the noble Governor is again set to dig in his heels against any additional spending on schools:

Laying claim to big ideas, Bill de Blasio has promised to deliver universal all-day pre-kindergarten, paid for by raising taxes on wealthy New Yorkers.

The attractive concept helped boost de Blasio to a commanding lead over Republican Joe Lhota, and is key to his education program. He calls universal pre-k “how we will start to close the achievement gap” between minority and white children.

Now is the time for voters to consider whether de Blasio has a prayer of fulfilling his pledge to provide services to 70,000 4-year-olds, along with after-school programs for middle-schoolers.

He’d need a half-billion dollars a year and has staked all on convincing Albany to okay a city income tax hike on high earners.

Bill, meet Andrew:

Gov. Cuomo says no.

In an interview with the Daily News Editorial Board, Cuomo made clear that he has no intention of pressing the Legislature to give de Blasio the $500 million in tax money he’s counting on.

Read more: http://www.nydailynews.com/opinion/bill-rude-awakening-article-1.1487839#ixzz2i5NJcOJT

So, as the Daily News would characterize it, the Governor is incensed that de Blasio would even consider the absurd possibility of needing an additional half a billion (that’s HALF A BILLION DOLLARS!) to finance the frivolity of universal preschool.

Yeah… HALF A BILLION DOLLARS sure sounds like an obscene number. But let’s reflect for a moment on just how much money the state of New York, under the leadership of Governor Cuomo, continues to come up short in financing the school finance formula it adopted back in 2007 to comply with a NY State High Court ruling that funding levels at that time for New York City were inadequate.

In 2012-13, and again in 2013-14, New York State continues to short NYC on general state aid for schools to the tune of around 3 BILLION DOLLARS! Yeah… that’s six times the seemingly obscenely huge funding request for de Blasio’s pre-k proposal.

Let’s take a look. First, here’s a graph of the relationship between state aid shortfalls per pupil and the state’s own pupil need index. Larger circles represent larger (enrollment) districts. As can be seen here, in 2013-14, based on March/April 2013 adopted budget figures (state aid run worksheets), New York City – the bowling ball in the picture – is shorted by just under $3,000 per pupil in State Aid! That’s $3,000 PER PUPIL IN STATE AID!

[Slide 1: state aid shortfall per pupil versus the state’s pupil need index, by district]

Now, here’s how the funding formula that got the state out from under litigation is supposed to work.

A district’s target funding level is supposed to be the product of a) the foundation level of funding per pupil [appropriately inflated to represent current year costs], times b) the pupil need index for each district, times c) the regional cost index for each district, times d) the number of “aidable foundation pupil units” (an enrollment count including adjustments for special education and other factors). The first three terms yield a per-pupil target; multiplying by the pupil units yields the district total.

The adjusted foundation amount per pupil in NYC is $16,562 in 2013-14.

Bear in mind that even this figure is based on rigged analyses that severely underestimate actual needs and costs.

Then, the state determines the share of that figure to be covered by the district and the balance to be covered by the state. The state share for NYC is supposed to be $7,006 per pupil.

Take that figure times the total aidable foundation pupil units (TAFPU) and you’ve got….. $8.8 BILLION DOLLARS!

THAT’S $8.8 BILLION DOLLARS!!!!!
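To make the arithmetic concrete, here’s a quick sketch using only the figures quoted above. Note that the pupil unit count is inferred from those figures ($8.8 billion divided by $7,006), not copied from an actual state aid run worksheet.

```python
# Foundation aid math, per the formula described above:
# per-pupil target = foundation level x pupil need index x regional cost index,
# total state aid target = per-pupil state share x aidable pupil units (TAFPU).
adjusted_foundation_per_pupil = 16_562  # the (a) x (b) x (c) product for NYC, 2013-14
state_share_per_pupil = 7_006           # the portion of that target the state owes

target_total_aid = 8.8e9                # the $8.8 billion figure quoted above
implied_tafpu = target_total_aid / state_share_per_pupil  # inferred, not official
print(f"Implied TAFPU: {implied_tafpu:,.0f}")  # roughly 1.26 million pupil units
```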

[Slide 2: NYC foundation aid target versus actual allocations]

But alas, that’s far more than the state actually allocates to New York City through the foundation aid program it adopted to get out from under litigation – litigation brought because funding was (and still is) inadequate!

I’m not even going to quibble (here and now) over the broader conception of “adequacy” involved, or the fact that the state concocted ways to reduce its estimated targets. The fact is that even though they set a low bar and further lowballed their funding targets, they (meaning the Governor and Legislature) have chosen not to even come close to funding those targets!

Instead, the state begins by freezing foundation aid at past levels, setting NYC’s foundation aid at just under $6.4 billion.

Yeah… sounds like a lot… but that’s already well short of what the city is supposed to get. And that’s just the first CUT.

Next, the state applies what it calls the Gap Elimination Adjustment (and then partially restores that cut, to make it seem like a gift), further reducing state aid to NYC down to just under $5.9 billion.

That’s a total state aid shortfall of nearly $3 billion! THAT’S $3 BILLION!!!!!!!!!!!  [with the figure fluctuating around $3 billion from year to year – the figure was higher for 2012-13]
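Stringing the cuts together, here’s a quick back-of-envelope check. All dollar figures are as quoted above; the rough one-million NYC pupil count is my approximation, used only to sanity-check the per-pupil shortfall shown in the scatterplot.

```python
target_aid   = 8.8e9  # foundation aid NYC is supposed to receive
frozen_aid   = 6.4e9  # after freezing foundation aid at past levels (first cut)
post_gea_aid = 5.9e9  # after the Gap Elimination Adjustment (second cut)

shortfall = target_aid - post_gea_aid
print(f"Total shortfall: ${shortfall / 1e9:.1f} billion")  # ~$2.9B, nearly $3 billion

# Spread across roughly one million NYC pupils (an approximation), that's just
# under $3,000 per pupil -- consistent with the bowling ball in the plot above.
print(f"Per-pupil shortfall: ${shortfall / 1.0e6:,.0f}")
```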

So before the good Governor Cuomo decries the obscene half a billion dollar request presented by future mayor(?) de Blasio, it might be wise for him to reflect on the state’s own past and still relevant promises to New York City… promises that would rightfully (constitutionally… as per the high court decision in Campaign for Fiscal Equity v. State) drive an additional $3 Billion to NYC.

THAT’S $3 BILLION!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

 

More Thoughts on Interpreting Educational/Economic Research: DC Impact Study

Today brings us yet another opportunity to apply common sense interpretation to an otherwise seemingly complex research study – this time on the “effectiveness” of the DC Impact teacher evaluation system in improving teaching quality in the district. The study, by some of my favorite researchers (no sarcasm here; these are good, thoughtful individuals who do high quality work), is nicely described in the New York Times Economix blog:

To study the program’s effect, the researchers compared teachers whose evaluation scores were very close to the threshold for being considered a high performer or a low performer. This general method is common in social science. It assumes that little actual difference exists between a teacher at, say, the 16th percentile and the 15.9th percentile, even if they fall on either side of the threshold. Holding all else equal, the researchers can then assume that differences between teachers on either side of the threshold stem from the threshold itself.

The results suggest that the program had perhaps its largest effect on the rate at which low-performing teachers left the school system. About 20 percent of teachers just above the threshold for low performance left the school system at the end of a year; the probability that a teacher just below the threshold would quit was instead above 30 percent.

In addition, low-performing teachers who remained lifted their performance, according to the system’s criteria. To give a sense of scale, the researchers noted that the effect was about half as large as the substantial gains that teachers typically make in their first years of teaching combined.

http://economix.blogs.nytimes.com/2013/10/17/a-new-look-at-teacher-evaluations-and-learning/?_r=1&

The study: http://cepa.stanford.edu/sites/default/files/w19529.pdf

So, for research and stats geeks, this description speaks to a design referred to as regression discontinuity analysis. It sounds complicated but it’s really not. The idea is that whenever we stick cut-points through some distribution of ratings or scores – through messy/noisy data – some people fall just below those cut-points and others just above. But the cut-points are really arbitrary, and those who fall just above the line really aren’t substantively, or even statistically significantly, different from those who fall just below the line. It’s almost equivalent (and assumed equivalent for research purposes) to taking a group of otherwise similar individuals and randomly assigning one score (below the line) to some and another score (above the line) to others.
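For the curious, here’s a toy simulation of that logic. The cutoff, rating scale, and sample size are made up; the attrition rates loosely echo the roughly 30 percent versus 20 percent figures quoted above, but they are built into the simulation, not estimated from the study’s data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000
cutoff = 250  # hypothetical rating cutoff separating "low performers" from the rest

# Simulate evaluation ratings with a built-in jump in attrition at the cutoff:
# teachers labeled low performers leave at a higher rate, all else equal.
scores = rng.uniform(150, 350, n)
leave_prob = np.where(scores < cutoff, 0.31, 0.20)
left = rng.random(n) < leave_prob

# The regression discontinuity estimate: compare attrition just below versus
# just above the cutoff, within a narrow bandwidth where teachers are
# essentially interchangeable.
bw = 10
below = left[(scores >= cutoff - bw) & (scores < cutoff)]
above = left[(scores >= cutoff) & (scores < cutoff + bw)]
print(f"Attrition just below the line: {below.mean():.1%}")  # ~31%
print(f"Attrition just above the line: {above.mean():.1%}")  # ~20%
print(f"Estimated jump at the cutoff:  {below.mean() - above.mean():.1%}")
```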

In one application of this approach, researchers from Harvard studied the effect of high stakes high school exit exams on students in Massachusetts. Students who barely passed the test were compared with those who barely failed it on the first try. In reality, missing or correctly answering one or two additional questions does not validly indicate that one child knows math better than another. The kids were otherwise comparable, but some were labeled failures and others successes. Those labeled failures were more likely to drop out of high school and less likely to attend college.

The conclusion: these arbitrary, non-meaningful distinctions, adopted in policy, are harmful.

This brings us to the present study of the DC Impact teacher evaluation system. Here, the researchers identified teachers who were statistically no different from one another on their DC Impact ratings, but some fell just a few fractions of a point too low, were labeled Ineffective, and faced the threat of dismissal, while others scored just high enough to be out of the woods for now. That is, there really aren’t any substantive observed quality differences between the two groups. Note that the researchers studied the high end of the ratings distribution as well, but didn’t really find as much going on there.

Put simply, what this study says is that if we take a group of otherwise similar teachers, and randomly label some as “ok” and tell others they suck and their jobs are on the line, the latter group is more likely to seek employment elsewhere. No big revelation there and certainly no evidence that DC Impact “works.”

Rather, arbitrary, non-meaningful distinctions are still consequential. This is largely what was found in the Massachusetts high stakes testing studies.

Actually, one implication for supervisors is that if you want to get a teacher you don’t like to leave your school, find a way to give them a bad rating. But I think most supervisors and principals could already figure that one out.

Here’s an alternative experiment to try – take a group of otherwise similar teachers and randomly assign them to Group 1 and Group 2. We’ll treat Group 1 okay… just okay… no real pats on the back or accolades. Group 2, on the other hand, will be berated and treated like crap by the principal on a daily basis, and each day in passing, the principal will scowl at them and say, “Your job’s on the line!”

My thinking is that Group 2 teachers will be more likely to seek employment elsewhere. That’s not hugely different from the DC Impact research framework, and nor are the policy implications. Does this mean that teacher evaluation “works,” that it has appropriate labor market consequences? No… not at all. It means that arbitrary differential treatment matters.

Of course, this would be an unethical experiment, unlikely to make it through IRB approval. But heck, screwing with people’s lives via actual arbitrary and capricious employee rating schemes, adopted as policy, is totally okay.

As for the second conclusion… that those who do stay appear to improve their game… it certainly makes sense that individuals would try not to continue being the whipping boy, even if they perceive their prior selection as whipping boy to be arbitrary and capricious. Notably, the bulk of the evaluations in this study were based on observed behaviors, not test-based metrics, and with observations, teachers have more direct control over what their supervisors observe and can respond accordingly. Whether these behavior changes have anything to do with better actual on-the-job performance – “good teaching” – is at least questionable.