Thoughts on “Randomized” vs. Randomized Charter School Studies

There’s much talk in education research about Randomized Control Trials and truly “experimental” research being the “gold standard” for determining whether a specific intervention “works” or not. This is the basis for the Institute of Education Sciences’ What Works Clearinghouse. It is often argued that randomized, or experimental, studies are “good” and decisive, and that other approaches simply don’t match up. Therefore, if someone really wants to know what works or doesn’t with regard to a specific intervention or set of interventions, one need only review those randomized, experimental studies to identify the consensus finding.

There’s so much to discuss on these issues, including the extent to which truly randomized experiments can actually shed light on how interventions might play out in other settings or at scale. But I’ll stick to a much narrower focus in this post, and that is, just how randomized is randomized? Most recently, this question came to mind after reading this post addressing “experimental” vs. “non-experimental” studies of charter schools by Matt Di Carlo at Shanker Blog, and this post over at Jay P. Greene’s blog on RIGOROUS charter research (meaning experimental, or randomized).

There tend to be two types of studies done to determine the relative effectiveness of “charter schools” versus traditional “district schools.” The basic idea of either type of study is to determine the effect that “charter schooling” or some specific set of policies/practices and instructional models and strategies about “charter schooling”, has on students’ outcomes, when compared to kids who don’t receive those strategies. That is, exposure to “charter schooling” is assumed to be a treatment, and non-exposure, whatever that constitutes, is the control.

One type of study tries to identify after the fact, otherwise similar kids (matched pairs) attending a set of charter schools and a set of district schools in the same city, and then compares their achievement growth over time. These studies often fall short in two important ways.

The other type of study is often referred to as meeting the gold standard – as being a randomized study – or lottery-based study. It is assumed, since these studies are declared golden, that they therefore necessarily resolve both above concerns. And it is possible, that if these studies truly were randomized (or even could be) that they could resolve the above concerns. But they don’t (resolve these concerns), because they aren’t (really randomized).

First, what would a randomized study look like? Well, it would have to look something like this – where we randomly take a group of kids – with consent or even against their will – and assign them to either the charter or traditional school option. The mix of kids in each group is truly random and checked to ensure that the two groups are statistically representative (using better than the usual measures) of the population.  Then, we have to make sure that all other “non-treatment” factors are equivalent, including access to facilities, resources, etc. That is, anything that we don’t consider to be a feature of the treatment itself. This is especially important if we want to know whether expanding elements of the treatment are likely to work for a representative population.  This is a randomized, controlled trial.

Slide1

So then, what’s randomized in a randomized charter school study? Or lottery-based study?  One might sketch out a lottery-based study as follows:

Slide2

Here, the study is really only randomized at one point in a long, complicated sequence – the lottery itself. Students and families have to decide they want to enter the lottery – that they are interested in attending a charter school – and that decision ultimately affects the composition of charter school enrollments. Then, among those selecting into the pool, some students are randomly chosen to attend the charters alongside others randomly chosen to attend (from a non-random pool of lottery participants), and the rest are randomly selected to go, well, somewhere else… with a group of peers non-randomly chosen to end up in that same somewhere else.

So, while the studies compare the achievement of kids randomly chosen to those randomly un-chosen (thus comparing only those who tried to get a charter slot), the kids are shuffled into settings that are anything but randomly assigned, containing potentially vastly different peer groups and a variety of other differences in setting. Add to this the likelihood of non-random student attrition, further altering peer group over time.

As such, I very much prefer these studies to be referred to as “lottery-based” rather than randomized or experimental. These studies are randomized at only one step in this process, potentially conflating setting/peer effects with treatment effects, thus substantially compromising policy implications.
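To make the conflation concrete, here is a minimal simulation sketch. Everything in it is an assumption invented for illustration: the application probabilities, the size of the "true" charter effect, the weight given to peers, and the simplification that every lottery winner attends and nobody leaves. It is not a model of any actual lottery study. The point is only that the lottery balances winners and losers on prior achievement, while the simple winner-minus-loser comparison still picks up the difference in peer settings.

```python
import random
from statistics import fmean


def simulate_lottery_study(n_students=40000, true_effect=0.10,
                           peer_weight=0.5, seed=0):
    """Toy lottery study: random at the lottery, non-random everywhere else."""
    rng = random.Random(seed)

    # Hypothetical standardized prior achievement for every student in a city.
    population = [rng.gauss(0.0, 1.0) for _ in range(n_students)]

    # Non-random step 1: families decide whether to enter the lottery.
    # Assumption: higher-achieving students are more likely to apply.
    applicants, non_applicants = [], []
    for prior in population:
        p_apply = 0.5 if prior > 0 else 0.2
        (applicants if rng.random() < p_apply else non_applicants).append(prior)

    # The one random step: the lottery splits the applicant pool in half.
    rng.shuffle(applicants)
    half = len(applicants) // 2
    winners, losers = applicants[:half], applicants[half:]

    # Non-random step 2: the settings. Winners sit with other winners;
    # losers go back to district schools alongside the non-applicants.
    charter_peer_mean = fmean(winners)
    district_peer_mean = fmean(losers + non_applicants)

    def outcome(prior, treated, peer_mean):
        effect = true_effect if treated else 0.0
        return prior + effect + peer_weight * peer_mean + rng.gauss(0.0, 0.5)

    winner_scores = [outcome(p, True, charter_peer_mean) for p in winners]
    loser_scores = [outcome(p, False, district_peer_mean) for p in losers]

    return {
        "true_effect": true_effect,
        "lottery_estimate": fmean(winner_scores) - fmean(loser_scores),
        "prior_gap_winners_vs_losers": fmean(winners) - fmean(losers),
        "peer_gap": charter_peer_mean - district_peer_mean,
    }


result = simulate_lottery_study()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under these made-up settings, winners and losers look alike on prior achievement, yet the estimate overstates the assumed treatment effect because the peer-setting difference rides along with it.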

As with those matching studies, the types of variables used to check and/or correct for peer composition and non-randomness of attrition are often too imprecise to be useful.

One fun alternative would be to pull a switch, whereby the charter teachers, their model, instructional strategies etc. would be traded with the district schools’ teachers, model and strategies, as a confirmatory test to see whether the charter model effects are actually transferable (assuming there were effects to begin with).

Slide5

Clearly, I’m asking way too much to assume that charter school, or most other program/intervention research in education be based on real RCTs. That’s not going to happen. And I’m not convinced it would be that useful for informing policy anyway. But, my point in this post is to make it clear that the difference between the types of matched student studies done by CREDO, for example, and the studies being (mis)characterized as “gold standard” randomized studies is far more subtle than many are willing to admit and NEITHER ARE WHAT THEY’RE REALLY CRACKED UP TO BE!

Dumbest “School Finance” Tweet Ever?

Critics say only public systems can focus 100% on the children, but vast majority of K-12 $$ goes to employees not kids bit.ly/SLrNUn

— AEI Education(@AEIeducation) December 18, 2012

How Modern School Finance/Education Policy Works: Lessons from New York

I’ll admit that the more I do this stuff, the more I write about today’s education policy environment and especially the environment around school funding, I do get more cynical. And few states have done more to encourage my cynicism than New York, of late. But I suspect that the tales from the trenches in many other states might be quite similar. So let me use New York as a prototype of the twists and turns and warped logic of modern state education policy.  New York education policy has followed a four step process:

Step 1: Slither out from court order by rigging low-ball foundation aid formula

As I noted in another recent post, several years back the New York Court of Appeals ordered that the state legislature provide sufficient funding (specifically to New York City) to achieve a “sound basic education,” which was ultimately equated with a “meaningful high school education.”  The city and governor’s office presented to the court alternative estimates of what that would cost. The state (governor/legislature/regents), as might be expected, sought a “less expensive” option. And the court largely took their side. That is, the court ordered that the system be fixed, but largely (uncritically, but for some dissenting minority opinion) accepted the state’s proposal to fix it.

The state achieved their low-ball estimate by pulling a few classic tricks, some of which have been used in other states. First, the state based their minimum funding level on average spending of existing districts meeting the state standards – but had set a relatively low bar for those standards (a bar most were already surpassing anyway). Then they chose to look only at the “instructional” spending share of current spending (lopping off a large chunk of spending that’s actually needed to operate a school).  Rhode Island recently pulled the same garbage, but instead of looking at instructional spending for districts within Rhode Island they used instructional spending in the neighboring states of Massachusetts, Connecticut and New Hampshire (okay… NH doesn’t border RI… does it… but don’t tell their Commissioner… ‘cuz including NH allowed them to bring the average down! See link above).

The final step in their low-ball analysis was to look only at the average spending of the lower half spending districts that meet the state standards – assuming those districts to be the “efficient” ones, better reflecting minimum “costs.” Of course, what this does in New York State is to eliminate from the calculation nearly every district in the Rockland, Westchester, NYC and Long Island regions. So… base level of funding is essentially the average instruction-only spending of the lower half spending districts that have at least somewhat below current average outcomes, and lie somewhere between Syracuse and Buffalo. That makes sense right? That should give us a reasonable ballpark cost for New York City, Mount Vernon or Yonkers, right?
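The filtering steps described above can be sketched in a few lines. The districts and dollar figures below are hypothetical, invented purely to show the mechanics, and the function names are mine, not the state's. The sketch compares the average total spending of districts meeting the standard with the "low-ball" figure obtained by keeping only the lower-spending half of those districts and counting only their instructional spending.

```python
from statistics import mean

# Hypothetical districts, invented purely for illustration.
districts = [
    {"name": "Upstate A", "total_pp": 12000, "instructional_pp": 7000, "meets_standard": True},
    {"name": "Upstate B", "total_pp": 14000, "instructional_pp": 8000, "meets_standard": True},
    {"name": "Downstate C", "total_pp": 20000, "instructional_pp": 12000, "meets_standard": True},
    {"name": "Downstate D", "total_pp": 24000, "instructional_pp": 14000, "meets_standard": True},
    {"name": "City E", "total_pp": 16000, "instructional_pp": 9000, "meets_standard": False},
]


def mean_total_spending(districts):
    """Average total per-pupil spending of all districts meeting the standard."""
    meeting = [d for d in districts if d["meets_standard"]]
    return mean(d["total_pp"] for d in meeting)


def low_ball_base(districts):
    """Average instruction-only spending of the lower-spending half
    of the districts meeting the standard."""
    meeting = sorted(
        (d for d in districts if d["meets_standard"]),
        key=lambda d: d["total_pp"],
    )
    lower_half = meeting[: len(meeting) // 2]
    return mean(d["instructional_pp"] for d in lower_half)


print(mean_total_spending(districts))  # 17500
print(low_ball_base(districts))        # 7500
```

With these invented numbers, each filter (instruction only, lower half only) pulls the base down, and the higher-cost downstate districts drop out of the calculation entirely.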

Even for my 2012-13 analyses below, the foundation level per pupil is set to only $6,570, which is assumed to be the average instructional spending per pupil needed in a New York State district to achieve state standards.  So then, how does that stack up against alternative cost estimates of what would actually be needed to achieve specific state outcome targets?

I don’t have time to explain the chart below in great detail, but I do provide complete analysis/explanation in this report on New York State school finance.

In short, what Figure 1 shows us is in PURPLE, the foundation level, or target funding calculated to be needed by districts in each poverty quintile under the state’s own proposed remedy to their constitutional violation.  The PURPLE is the amount of money a district would have under the foundation aid formula, as a combination of state aid and levying the minimum required local effort.

The blue bars come from a cost model produced a few years back by William Duncombe of Syracuse University in which he used that model to estimate the average spending actually required to achieve a 90% proficiency rate on state assessments (where the average had drifted over time, making the 80% standard relatively meaningless – again, see report).  The red arrows show the gap between estimated costs of reasonable outcome goals and guaranteed funding under the foundation formula.

Figure 1:

Slide1

The point here is simply to show a) how much the state low-balled the target funding using their approach vs. a more rigorous approach, and b) how those funding gaps increase quite dramatically for higher poverty districts. In fact, the target funding level is not that far off for low poverty districts, but it’s only slightly better than half of the cost of comparable outcomes for high poverty districts.

Step 2: Conjure annual excuses for why the state can’t afford to fund even its own low-balled targets for local districts

Given figure 1 above, it might be bad enough if the state did follow through and fund its formula. The formula itself was/is grossly insufficient, determined by bogus calculations and filtrations (exclusions) of data all toward the end goal of generating the lowest possible politically palatable estimate of the cost of providing a sound basic education in New York.

But no… no… low-balling the cost wasn’t nearly far enough for the NY legislature and Gov(ernors) to go. The next step was to say – We can’t afford it (they were saying this even before the economy tanked, and they set out a multiyear phase in)!  We can’t afford our own low-ball estimate (while decrying that the estimate was somehow actually overly generous?).

Did they cut back just a little from their target? Oh… say… give districts about 90% or 80% (uh… that would actually be a lot of cut) of what the formula said they needed? Nope. They went much deeper than that. In fact, as I showed in one recent post, as student population needs escalate (according to the state’s own Pupil Need Index) under-funding with respect to foundation targets grows in some cases to over $4,000 per pupil and in New York City to over $3,000 per pupil.

Figure 2.

Slide1

As I showed in that same post, among the most screwed large districts in the state, several receive from the state in general foundation aid only about half (or less) of what they should receive under the STATE’S OWN LOW-BALL FORMULA!

Figure 3.

Slide2

Let’s be clear here. I’m not talking about shortfalls from the relatively high cost targets in that first graph. I’m talking about state aid shortfalls relative to the STATE’S OWN LOW-BALL Foundation Aid model – the model represented by the purple bars in the first graph.  Note also that the state in proposing this foundation model that they’ve subsequently underfunded, essentially declared that low-ball model to be the empirical manifestation of their own state constitutional obligation. It’s their own freakin’ definition of their constitutional obligation…. And they’ve chosen to ignore it.

Step 3: Pretend that it’s all the teachers’ fault and use that as a basis for holding hostage additional funding that should have gone to high need districts years ago!

Oh… but it doesn’t end there!

Riding the national, Duncanian wave of new normalcy (which I’ve come to learn is an extreme form of innumeracy) & reformyness, the only possible cause of lagging achievement in New York State is bad teachers – greedy overpaid teachers with fat pensions – and protectionist unions who won’t let us fire them. Clearly, the lagging state of performance in low income and minority districts in New York State has absolutely nothing at all to do with lack of financial resources under the low-balled aid formula that the state has chosen to not even half fund for the past 5 years? Nah… that couldn’t have anything to do with it. Besides, money certainly has nothing to do with providing decent working conditions and pay which might be leveraged to recruit and retain teachers.

And we all know that if New York State’s average per pupil spending is high, or so the Gov proclaims, then spending clearly must be high enough in each and every one of the state’s high need districts! (Right… because averages always represent what everyone has and needs, right? Reformy innumeracy rears its ugly head again!)

So it absolutely has to be the fact that no teacher in NY has ever been evaluated at all, or fired for being bad even though we know for sure that at least half of them stink. The obvious solution is that they must be evaluated by egregiously flawed metrics – and we must ram those metrics down their throats.

In fact, the New York legislature and Governor even found it appropriate to hold hostage additional state aid if districts don’t adopt teacher evaluation plans compliant with the state’s own warped demands and ill-conceived policy framework.

As I understand it, legislation passed this past year actually tied receipt of state general aid to compliance with the state teacher evaluation mandate. That is, in order to receive any increase in state general/foundation aid over the prior year, a district would have to file, and have accepted, its teacher evaluation plan.

That’s it – we’ll take away their general state aid – their foundation aid – the aid they are supposed to be getting in order to comply with that court order of several years back. The aid they are constitutionally guaranteed under that order. I’m having some trouble accepting the supposed constitutional authority of a state legislature and governor to cut back general aid on this basis – where they’ve already failed to provide most of the aid they themselves identified as constitutionally adequate under court order? But I guess that’s for the New York Court system to decide.

If nothing else, it is thoroughly obnoxious, arbitrary and capricious and grossly inequitable treatment. I hear the reformers (who understand neither math nor school finance) whine… But why… why is it inequitable to require similarly that poor and rich districts follow state teacher and principal evaluation guidelines. Setting aside the junk nature of that evaluation system and the bogus measures on which it rests (and the fact that the reformers’ fav-fab-charters have largely rightfully ignored the eval mandate), it is inequitable because districts serving higher poverty children stand to lose more money per child as a result of non-compliance. And they’ve already been squeezed.

And here’s how that plays out. As I understand it, if districts don’t comply by January, they face the threat of losing the small increase in state aid they received for the current year (compared to 11-12). So, they’d lose it retro-actively, part way through this year. And guess what? Because higher need districts received a marginally greater increase in state aid, they’d lose more per pupil. But the gaps shown above actually already include that oh-so-generous increase! That’s right, the poorer you are, the bigger the financial penalty for non-compliance with the teacher evaluation mandate – and the bigger the financial hole the state has put you in to begin with!

Figure 4. State aid Per Pupil Before and After Non-Compliance Penalty by Student Need

Slide4

Figure 5. Compliance Penalty by Student Need

Slide5

This recent article explains that Hempstead, already underfunded by the largest per pupil amount of any large district in the state, stands to lose another $3.5 million in aid if it does not come to agreement on a teacher evaluation plan. State general aid is for the general provision of education to these kids – to pay for enough teachers, classrooms etc. It’s about the day-to-day operations of schools to ensure the provision of a sound basic education.  This funding shouldn’t be held hostage over reformy whims.

Note that for many districts I have likely understated the amount of aid they would lose because I have counted only changes to general, foundation aid, including “gap elimination adjustment” and partial restoration of those funds. (It would appear, for example, that the potential losses to Hempstead reported in the news are closer to that district’s total aid change, not just foundation/GEA change.)

Step 4: Protect billions in state aid still being allocated to districts with far fewer additional student needs/costs

And let us not forget that New York State was one of the shining stars – a poster child – of my report with Sean Corcoran for the Center for American Progress where we chronicled how states actually use their aid systems to make equity worse, not better.  While the NY Gov and Legislature have continued to shed elephant tears (in purely political terms) about their dire fiscal straits, the state persists in protecting billions in state direct aid and indirect tax relief subsidies that largely support the state’s lower and lowest need local public school districts.

Figure 6 shows that if we look at state general aid, based on initial calculations to local districts by poverty (left hand panel), even after allocating state general aid, there remains an $1,100 per pupil gap in state and local revenue between high and lower poverty districts. But, after the state “tweaks” the  state general aid distribution to provide minimum aid to the wealthiest districts and increase aid to middle/upper middle class districts, and then adds on “tax relief” subsidies, the gap between higher and lower poverty districts increases to $2,300 per pupil. Yep – NY state is actually using billions in state funding to make the system less equitable!  Read the report below for more thorough explanation/analysis!

Figure 6. School Finance Pork in New York!

Slide6

Baker, B.D., Corcoran, S.P.(2012) The Stealth Inequalities of School Funding: How Local Tax Systems and State Aid Formulas Undermine Equality. Washington, DC. Center for American Progress. http://www.americanprogress.org/wp-content/uploads/2012/09/StealthInequities.pdf

And that is how modern state education policy works!

Forget the $300m Deal! Let’s talk $3.4 billion (or more)!

Sometime last week or so, Sockpuppets for Ed Reform marched on City Hall in NY demanding that the city and teachers union come to a deal on a teacher evaluation system compliant with the state’s new regulations for such systems, so that the district could receive an approximately $300 million grant payment associated with the implementation of that system. Well, actually, it was more about trying to enrage the public that the evil teachers union in particular was at fault for holding hostage and potentially losing this supposedly massive sum of funding.

As one can see by the signs the SFER protesters were displaying, the protest was much less clearly articulated than I’ve described above. One would think, from looking at stuff like this: http://nyulocal.com/wp-content/uploads/2012/11/DSC_0841.jpg that this protest was actually about obtaining funding for the district – funding that would provide for substantive and sustained improvement to district programs/services.

But hey, far be it from SFER to actually carry placards that are in any way accurate or precise (or to have any clue what they are talking about). At this particular event in NYC, they even convinced a 15-year-old that the fight was really about funding.

So, we’ve got a protest that is presented as being about funding, but is really about a teacher evaluation system driven by student test scores, being carried out by a group that clearly has little or no understanding of either.

You know, I would typically give a group of undergrads a break on stuff like this.  Hey, they’re undergrads and have time to learn/develop the discipline/understanding of these complex topics. Heck, I was anything but a disciplined undergrad myself.  But unfortunately, this group has thus far displayed to me the worst attributes of the most intellectually lazy of today’s college students – a persistent pattern of copying and pasting low quality content from web sites and presenting it as novel content of their own. It’s as if their placards, and their entire website was generated by lifting content from “reformy-pedia.”

So then, what is the real story on what’s goin’ on with Teacher Evaluation and School Funding in New York State?

The State Evaluation System/Guidelines

I’ve written several posts recently about the state metrics for teacher evaluation and the state department of education push to get districts on board. I also wrote about the letter from the Chancellor of the Board of Regents which appeared in the NY Post, encouraging NYC in particular to get on board with that $300m RAW deal!

In my humble opinion, no-one should sign on to a deal to implement a teacher evaluation system under the current NYSED guidelines, given the evidence I’ve laid out over the past few weeks. No-one. Just say NO.

First, the state’s consultants designing their teacher and principal effectiveness measures find that those measures are substantively biased:

Despite the model conditioning on prior year test scores, schools and teachers with students who had higher prior year test scores, on average, had higher MGPs. Teachers of classes with higher percentages of economically disadvantaged students had lower MGPs. (p. 1) https://schoolfinance101.com/wp-content/uploads/2012/11/growth-model-11-12-air-technical-report.pdf

But instead of questioning their own measures, they decide to give them their blessing and pass them along to the state as being “fair and accurate.”

The model selected to estimate growth scores for New York State provides a fair and accurate method for estimating individual teacher and principal effectiveness based on specific regulatory requirements for a “growth model” in the 2011-2012 school year. p. 40 https://schoolfinance101.com/wp-content/uploads/2012/11/growth-model-11-12-air-technical-report.pdf

The next step was for the Chancellor to take this misinformation and polish it up as pure spin as part of the power play against the teachers in New York City (who’ve already had the opportunity to scrutinize what is arguably a better but still substantially flawed set of metrics). The Chancellor proclaimed:

The student-growth scores provided by the state for teacher evaluations are adjusted for factors such as students who are English Language Learners, students with disabilities and students living in poverty. When used right, growth data from student assessments provide an objective measurement of student achievement and, by extension, teacher performance. http://www.nypost.com/p/news/opinion/opedcolumnists/for_nyc_students_move_on_evaluations_EZVY4h9ddpxQSGz3oBWf0M

Then send in the enforcers…. The statement below came from a letter sent to a district that did decide to play ball with the state on the teacher evaluation regulations. The state responded that… sure… you can adopt the system of multiple measures you propose – BUT ONLY AS LONG AS ALL OF THOSE OTHER MEASURES ARE SUFFICIENTLY CORRELATED WITH OUR BIASED MEASURES… AND ONLY AS LONG AS AT LEAST SOMEONE GETS A BAD RATING.

The department will be analyzing data supplied by districts, BOCES and/or schools and may order a corrective action plan if there are unacceptably low correlation results between the student growth subcomponent and any other measure of teacher and principal effectiveness… https://schoolfinance101.wordpress.com/2012/12/05/its-time-to-just-say-no-more-thoughts-on-the-ny-state-tchr-eval-system/

This is a raw deal, whether attached to what appears to be a pretty big bribe or not. And quite honestly, while $300 million is nothing to sneeze at, it pales in comparison to what the city schools are actually owed under the state’s own proposal for how it would fund its schools to comply with a court order of nearly a decade ago.

THE REAL ISSUE in NY State

Meanwhile, at the other end of the state – well sort of – a different protest was going on. This protest in Albany actually was about funding and the fact that the state of New York has repeatedly cut state aid from local public school districts each of the past few years, has systematically cut more per pupil funding from districts serving needier student populations and has never once come close to providing the funding levels that the state’s own funding formula suggests are needed (actually, were needed back in 2007!).

Here’s a quick run-down on the state of school funding in New York:

  1. New York continues to maintain one of the least equitable school finance systems in the country, where districts serving higher concentrations of children in poverty have systematically less state and local revenue per pupil.
  2. New York State accomplishes these patterns of egregious disparity not merely by lack of effort, but by actually allocating substantial state resources – disproportionate state resources – toward buying down the tax rates of the state’s wealthiest districts and making other politically convenient state aid allocations to economically advantaged districts, at the expense of children in poverty.
  3. Even though the state was ordered by the NY court of appeals nearly a decade ago to provide adequate resources to children attending high need districts, and even though the court accepted the state’s own proposed funding formula to meet that goal (which was much lower than more rigorously determined spending targets), the state has chosen to not even come close to funding those targets and in recent years has systematically cut more funding from children with greater needs.

So, how does this all affect districts across New York State and NYC in particular? I’m going to set a really low bar here for my comparisons. In response to court order in the Campaign for Fiscal Equity case the state of New York proposed a new school finance formula – a foundation aid formula – to begin implementation in 2007. It was actually a pretty lame, relatively low-balled funding formula to begin with, as explained here!

But even that low-balled estimate of what districts were supposed to get has never been close to fully funded. Several large districts, including Albany, for example, receive in 2012-13 less than half of the state aid they would receive if the formula were implemented.

The formula provides a target level of funding for each district based on student needs and regional costs. Then, the formula determines the share of that target funding that should come from the state. Then, the formula as actually implemented, ignores all of that and provides a marginal increase or decrease (over what districts have historically received) maintaining the persistent inequities of the system.
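The steps in that paragraph can be sketched as follows. This is a simplified illustration, not the statutory formula: treating the need and regional cost adjustments as multipliers on a base amount is my assumption, and the index values, the expected local contribution, and the actual aid figure are all hypothetical. Only the $6,570 base comes from the discussion of the foundation level elsewhere on this blog.

```python
def foundation_target_per_pupil(base, pupil_need_index, regional_cost_index):
    """Target funding per pupil, adjusted for student needs and regional costs.
    Assumption: both adjustments act as simple multipliers on the base."""
    return base * pupil_need_index * regional_cost_index


def state_share_per_pupil(target, expected_local_contribution):
    """Portion of the target the state is supposed to cover."""
    return max(target - expected_local_contribution, 0.0)


def funding_gap_per_pupil(state_share, actual_state_aid):
    """Shortfall between what the formula calls for and what was paid."""
    return state_share - actual_state_aid


BASE = 6570  # foundation level per pupil cited for 2012-13

# Hypothetical high-need district in a high-cost region.
target = foundation_target_per_pupil(BASE, pupil_need_index=1.8,
                                     regional_cost_index=1.4)
share = state_share_per_pupil(target, expected_local_contribution=2000)
gap = funding_gap_per_pupil(share, actual_state_aid=10500)

print(f"target per pupil:      ${target:,.0f}")
print(f"state share per pupil: ${share:,.0f}")
print(f"gap per pupil:         ${gap:,.0f}")
```

The last function is the whole story of the formula "as actually implemented": the first two steps produce a number, and the aid that shows up bears little relation to it.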

The first figure below shows the difference between actual state foundation aid per pupil (after applying this trick they refer to as gap elimination adjustment) and the aid calculated to be needed according to THE STATE’S OWN FORMULA for addressing regional costs and student needs. Districts are organized from low need (left) to high need (right) using the state’s own pupil need index. Bubble size indicates district enrollment size. NYC is the BIG ONE! And, we can see, by eyeballing the middle of that bubble, that NYC is being shorted between $3,000 and $4,000 per pupil. At 1 million kids, that’s about $3.4 billion … each year… every year… over time.  No, not a $300m implementation grant, but $3.4 billion in annual operating funds. Yeah… the stuff that actually provides for smaller class sizes, decent teacher pay, up to date materials, supplies and equipment, and arts, music and all that other stuff!

Slide1
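The back-of-the-envelope arithmetic behind the $3.4 billion figure is easy to check. The enrollment of 1 million, the $3,000 to $4,000 per-pupil range, and the $300 million grant come from the text; the $3,400 per-pupil value is simply what the $3.4 billion total implies at that enrollment, not a separately reported number.

```python
def annual_shortfall(per_pupil_gap, enrollment):
    """Total yearly shortfall in dollars."""
    return per_pupil_gap * enrollment


ENROLLMENT = 1_000_000          # roughly 1 million NYC students
ONE_TIME_GRANT = 300_000_000    # the $300m implementation grant

low = annual_shortfall(3000, ENROLLMENT)
high = annual_shortfall(4000, ENROLLMENT)
implied_per_pupil = 3_400_000_000 / ENROLLMENT
mid = annual_shortfall(3400, ENROLLMENT)

print(f"range: ${low:,} to ${high:,} per year")
print(f"per-pupil gap implied by $3.4 billion: ${implied_per_pupil:,.0f}")
print(f"annual shortfall vs. one-time grant: {mid / ONE_TIME_GRANT:.1f}x")
```

So a single year of the shortfall is on the order of eleven times the one-time grant, and the shortfall recurs every year.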

The table below provides a closer look at districts with the largest funding gap between what the formula calculates is needed and what districts actually receive in state aid.

Slide2

So, instead of talking about a one shot $300m bribe to implement a bad system based on bad data, at a cost that may exceed the amount of grant to begin with, perhaps it would make more sense to focus on that $3.4 billion deal! You know, the one state officials themselves promised in response to that court order all those years ago.

And when we do start taking more seriously this much bigger funding issue, don’t forget to send me a cool lookin’ knit protest hat!

Readings

Policy Brief on State Aid in New York (Summer 2011) NY Aid Policy Brief_Fall2011_DRAFT6

Baker, B.D., Welner, K.G. (2012) Evidence and Rigor: Scrutinizing the Rhetorical Embrace of Evidence-based Decision-making. Educational Researcher 41 (3) 98-101

Baker, B.D., Welner, K. (2011) School Finance and Courts: Does Reform Matter, and How Can We Tell? Teachers College Record 113 (11) p. –

Baker, B.D., Corcoran, S.P. (2012) The Stealth Inequalities of School Funding: How Local Tax Systems and State Aid Formulas Undermine Equality. Washington, DC. Center for American Progress. http://www.americanprogress.org/wp-content/uploads/2012/09/StealthInequities.pdf

Baker, B.D., Sciarra, D., Farrie, D. (2012) Is School Funding Fair? Second Edition, June 2012. http://schoolfundingfairness.org/National_Report_Card_2012.pdf

Baker, B.D. (2012) Revisiting the Age Old Question: Does Money Matter in Education. Shanker Institute. http://www.shankerinstitute.org/images/doesmoneymatter_final.pdf

Baker, B.D., Welner, K.G. (2011) Productivity Research, the U.S. Department of Education, and High-Quality Evidence. Boulder, CO: National Education Policy Center. Retrieved [date] from http://nepc.colorado.edu/publication/productivity-research.

Friday Thoughts on Data, Assessment & Informed Decision Making in Schools

Some who read this blog might assume that I am totally opposed, in any/all circumstances to using data in schools to guide decision-making. Despite my frequent public cynicism I assure you that I believe that much of the statistical information we collect on and in schools and school systems can provide useful signals regarding what’s working and what’s not, and may provide more ambiguous signals warranting further exploration – through both qualitative information gathering (observation, etc.) and additional quantitative information gathering.

My personal gripe is that thus far – especially in public policy – we’ve gone about it all wrong.  Pundits and politicians seem to have this intense desire to impose certainty where there is little or none and impose rigid frameworks with precise goals which are destined to fail (or make someone other than the politician look as if they’ve failed).

Pundits and politicians also feel the intense desire to over-sample the crap out of our schooling system – taking annual measurements on every child over multiple weeks of the school year when strategic sampling of selected testing items across samples of students and settings might provide more useful information at lower cost and be substantially less invasive (NAEP provides one useful example). To protect the health of our schoolchildren, we don’t make them all walk around all day with rectal thermometers hanging out of…well… you know?  Nor do political pollsters attempt to poll 100% of likely voters.  Nor should we feel the necessity to have all students take all of the assessments, all of the time, if our goal is to ensure that the system is getting the job done/making progress.

In my view, a central reason for testing and measurement in schools is what I would refer to as system monitoring,  where system monitoring is best conducted in the least intrusive and most cost-effective way – such that the monitoring itself does not become a major activity of the system!  We just need enough sampling density in our assessments to generate sufficient estimates at each relevant level of the system.
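To make "enough sampling density" a bit more concrete, here is a minimal sketch of the usual back-of-the-envelope calculation. It assumes simple random sampling and a hypothetical scale-score standard deviation of 40 points; neither number nor function name comes from any actual assessment, and real designs (NAEP included) have to account for students being clustered within schools, which pushes the required sample up.

```python
import math

def standard_error(sd, n):
    """Standard error of a sample mean under simple random sampling."""
    return sd / math.sqrt(n)

def margin_of_error(sd, n, z=1.96):
    """Approximate 95% margin of error for the mean of n sampled students."""
    return z * standard_error(sd, n)

def sample_size_needed(sd, target_margin, z=1.96):
    """Smallest n whose margin of error is at or below target_margin."""
    return math.ceil((z * sd / target_margin) ** 2)

# Hypothetical values: score SD of 40 points, desired precision of +/- 2 points.
ASSUMED_SD = 40
n_required = sample_size_needed(ASSUMED_SD, target_margin=2)
print(n_required, margin_of_error(ASSUMED_SD, n_required))
```

The point of the sketch is only that the required sample depends on the precision wanted at a given level of the system, not on the total number of students enrolled in it.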

I know there are those who would respond that testing everyone every year ensures that no kids fall through the cracks. If we did it my less intrusive way… kids who weren’t given all test questions in math in a given year might fall through some hypothetical math crack somewhere. But it is foolish to assume that NCLB-every-student-every-year testing regimes actually solve that problem. Further, high stakes testing with specific cut scores either for graduation or grade promotion violates one of the most basic tenets of statistical measurement of student achievement – that these measures are not perfectly precise. They can’t identify exactly  where that crack is, or which kid actually fell through it! One can’t select a cut score and declare that the child one point above that score (who got one more question correct on that given day) is ready (with certainty) for the next grade (or to graduate) and the child 1 point below is not. In all likelihood these two children are not different at all in their actual “proficiency” in the subject in question. We might be able to say – by thoughtful and rigorous analysis – that on average, students who got around this score in one year, were likely to get a certain score in a later year, and perhaps even more likely to make it beyond remedial course work in college. And we might be able to determine if students attending a particular school or participating in a particular program are more or less likely (yeah… probability again) to succeed in college.
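The cut score problem can be illustrated with a few lines of arithmetic. This is a rough sketch under classical test theory assumptions: observed score equals true score plus normally distributed error. The cut score of 300 and the standard error of measurement of 10 points are made-up values for illustration, not figures from any particular state test.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_below_cut(true_score, cut_score, sem):
    """Chance that the observed score lands below the cut on a given day,
    assuming observed = true + Normal(0, sem) measurement error."""
    return normal_cdf((cut_score - true_score) / sem)

# Hypothetical values: cut score of 300, standard error of measurement of 10.
CUT, SEM = 300, 10
for true_score in (299, 300, 301):
    print(true_score, round(prob_below_cut(true_score, CUT, SEM), 3))
```

Under these assumptions, a child whose true score sits one point above the cut and a child whose true score sits one point below it each have close to a coin flip's chance of landing on either side of the line.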

Thoughtful analysis and more importantly thoughtful USE of testing data in schools requires a healthy respect for what those numbers can and cannot tell us… and nuanced understanding that the numbers typically include a mix of non-information (noise/unexplainable, non-patterned information), good information (true signal) and perhaps misinformation (false signal, or bias, variation caused by something other than what we think it’s caused by).

These issues apply generally to our use of student assessment data in schools and also apply specifically to an area I discuss often on this blog – statistical evaluation of teacher influence on tested student outcomes.

I was pleased to see the Shankerblog column by Doug Harris a short while back in which Doug presented a more thoughtful approach to integrating value-added estimates into human resource management in the schooling context. Note that Doug’s argument is not new at all, nor is it really his own unique view. I first heard this argument in a presentation by Steve Glazerman (of Mathematica) at Princeton a few years ago. Steve also used the noisy medical screening comparison to explain the use of known-to-be-noisy information to assist in making more efficient decisions/taking more efficient steps in diagnosis. That is, with appropriate respect for the non-information in the data, we might actually find ways to use that information productively.

Last spring, I submitted an article (still under review) in which I, along with my coauthors Preston Green and Joseph Oluwole, explained:

As we have explained herein, value-added measures have severe limitations when attempting even to answer the narrow question of the extent to which a given teacher influences tested student outcomes. Those limitations are sufficiently severe such that it would be foolish to impose on these measures, rigid, overly precise high stakes decision frameworks.  One simply cannot parse point estimates to place teachers into one category versus another and one cannot necessarily assume that any one individual teacher’s estimate is necessarily valid (non-biased).  Further, we have explained how student growth percentile measures being adopted by states for use in teacher evaluation are, on their face, invalid for this particular purpose.  Overly prescriptive, overly rigid teacher evaluation mandates, in our view, are likely to open the floodgates to new litigation over teacher due process rights, despite much of the policy impetus behind these new systems supposedly being reduction of legal hassles involved in terminating ineffective teachers.

This is not to suggest that any and all forms of student assessment data should be considered moot in thoughtful management decision making by school leaders and leadership teams. Rather, that incorrect, inappropriate use of this information is simply wrong – ethically and legally (a lower standard) wrong. We accept the proposition that assessments of student knowledge and skills can provide useful insights both regarding what students know and potentially regarding what they have learned while attending a particular school or class. We are increasingly skeptical regarding the ability of value-added statistical models to parse any specific teacher’s effect on those outcomes. Further, the relative weight in management decision-making placed on any one measure depends on the quality of that measure and likely fluctuates over time and across settings. That is, in some cases, with some teachers and in some years, assessment data may provide leaders and/or peers with more useful insights.  In other cases, it may be quite obvious to informed professionals that the signal provided by the data is simply wrong – not a valid representation of the teacher’s effectiveness.

Arguably, a more reasonable and efficient use of these quantifiable metrics in human resource management might be to use them as a knowingly noisy pre-screening tool to identify where problems might exist across hundreds of classrooms in a large district. Value-added estimates might serve as a first step toward planning which classrooms to observe more frequently. Under such a model, when observations are completed, one might decide that the initial signal provided by the value-added estimate was simply wrong. One might also find that it produced useful insights regarding a teacher’s (or group of teachers’) effectiveness at helping students develop certain tested algebra skills.

School leaders or leadership teams should clearly have the authority to make the case that a teacher is ineffective and that the teacher even if tenured should be dismissed on that basis. It may also be the case that the evidence would actually include data on student outcomes – growth, etc. The key, in our view, is that the leaders making the decision – indicated by their presentation of the evidence – would show that they have used information reasonably to make an informed management decision. Their reasonable interpretation of relevant information would constitute due process, as would their attempts to guide the teacher’s improvement on measures over which the teacher actually had control.

By contrast, due process is violated where administrators/decision makers place blind faith in the quantitative measures, assuming them to be causal and valid (attributable to the teacher) and applying arbitrary and capricious cutoff-points to those measures (performance categories leading to dismissal).   The problem, as we see it, is that some of these new state statutes require these due process violations, even where the informed, thoughtful professional understands full well that she is being forced to make a wrong decision. They require the use of arbitrary and capricious cutoff-scores. They require that decision makers take action based on these measures even against their own informed professional judgment.
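The pre-screening idea in the passage above can be sketched as a small simulation. Everything here is an assumption made for illustration: teacher "true effects" are drawn from a standard normal distribution, the estimates add independent noise, and the reliability of 0.4 is a placeholder rather than a figure from any study or state model. The function name and parameters are hypothetical.

```python
import random

def simulate_prescreen(n_teachers=1000, reliability=0.4, flag_share=0.1, seed=0):
    """Share of flagged teachers who are truly in the bottom flag_share.

    reliability is var(true) / var(estimate), with var(true) fixed at 1,
    so the noise variance is (1 - reliability) / reliability.
    """
    rng = random.Random(seed)
    noise_sd = ((1 - reliability) / reliability) ** 0.5
    true_effect = [rng.gauss(0, 1) for _ in range(n_teachers)]
    estimate = [t + rng.gauss(0, noise_sd) for t in true_effect]

    k = int(n_teachers * flag_share)
    # Flag the k lowest estimates for additional classroom observation.
    flagged = set(sorted(range(n_teachers), key=lambda i: estimate[i])[:k])
    # The k teachers who are actually lowest on the simulated true effect.
    truly_low = set(sorted(range(n_teachers), key=lambda i: true_effect[i])[:k])
    return len(flagged & truly_low) / k

if __name__ == "__main__":
    print(simulate_prescreen())
```

In a setup like this, the flagged group contains more truly low performers than a random draw would, but also many teachers who do not belong there. That is an argument for using the flag to decide where to look more closely, and against using it as a dismissal trigger.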

My point is that we can have thoughtful, data informed (NOT DATA DRIVEN) management in schools. We can and should! Further, we can likely have thoughtful data informed management (system monitoring) through far less intrusive methods than currently employed – taking advantage of advancements in testing and measurement, sampling design etc. But we can only take these steps if we recognize the limits of data and measurement in our education systems.

Unfortunately, as I see it, current policy efforts enforcing the misuse of assessment data (as illustrated here, here and here) and misuse of estimates of teacher effectiveness based on those data (as illustrated here) will likely do far more harm than good.  Unfortunately, I don’t see things turning the corner any time soon.

Until then, I may just have to stick to my current message of Just say NO!

Ed Schools – The Sequel: Rise of the Intellectually Dead

Warning: The following post contains the elitist musings of an ivory tower professor who has only professed at major research universities, who attended a selective liberal arts college & received his doctorate from an Ivy league institution (well… a branch of one… Teachers College at Columbia).

A while back, I wrote a post on “ed schools” the point of which was to show the shift in production of degrees that had occurred between the early 1990s and late 2000s. When I wrote that first post, ed schools were coming under fire from DC think tanks like the National Council on Teacher Quality (NCTQ), which seemed largely unable to understand the most basic issues of degree production in education (I’m unsure they’ve learned much since then!). And now, it would appear that our esteemed U.S. Secretary of Education has decided that ed schools and teacher preparation will be of primary interest in the second term of this administration.

The problem, as I previously indicated, is that most of this rhetoric about ed schools, their supposed failure of society and their production of generations of ill-equipped American youth assumes a static definition of “ed school” – rooted in a 1950s to 1970s characterization of the regional public teachers college, and built on an assumption that teachers obtain their training and a teaching credential – for the one thing they teach – through a single institution as the core of their undergraduate education. Being “teachers colleges,” these schools are obviously lax on admission standards, have curricula that are neither academically rigorous nor practical, etc. etc. etc. (the conflicting rhetoric in this regard is fun to follow – too much theory… no practical application… but not academically rigorous, etc.), and well… simply must be replaced by a vast set of alternative routes/pathways/programs!

In short, the vast majority of the critique of teacher education assumes this monolithic AND STATIC entity of teacher preparation housed in state colleges and universities. Emporia State in Kansas – that’s you! Montclair in NJ – that’s you! West Georgia – you too! And those state flagships with teacher prep programs? Damn you Rutgers, Michigan, Illinois for producing increasing numbers of underqualified teachers! The wrath of NCTQ and now Arne Duncan will be upon you!

But degree & credential production in education has not entirely been static over time. In fact, anything but! There are clearly emerging trends. And if we believe that there really has been a decline in the academic quality of those receiving credentials in education, it would behoove us to take a close look at those trends. But since no-one else seems to be doing that – especially not NCTQ – I figured I should take another shot at it.

A couple of key points are in order. FIRST – it is important to understand that these days, many initial teaching credentials are already granted through alternate routes outside of undergraduate programs and to individuals with degrees in fields other than education. In addition to non-degree alternate routes, which I cannot even capture with the data in this post, many initial teaching credentials are granted through graduate programs at the masters degree level, and an even larger share of the additional (second/third) credentials received by practicing teachers are obtained through graduate programs. Individual teachers may have collected a handful of different credentials, all from different institutions.

So, let’s take a look at undergraduate and masters degree production trends.

Undergraduate Training

Undergraduate degree production in “education” fields generally (most of which involves teacher preparation) has been most stable over time. Using 1994 Carnegie Classifications (the most stratified system of Carnegie classifications of the past few decades: see end of post for definitions), we see that what were the public “teachers colleges” (Comprehensive 1… as opposed to those labeled as “Teachers Colleges”) still hold the lion’s share of degrees produced, but that their share has declined over time. Research Universities, which produced around 14% in 1990, now produce closer to 10% (those are your state flagships & major private universities). So… the major traditional public college and university role is declining slightly in market share.

That loss is being picked up by what is actually a very small subset of colleges that also tend to be relatively small and not so prestigious. These are the “LA – Liberal Arts 2” colleges. It’s quite striking that growth in this subset is sufficient to shift the market shares of major state universities and comprehensive regional colleges. Incidentally, LA 2s were among the first to expand rapidly their production of online and distance MBAs… around the same time they started tapping the ed market. (This period overlaps with a trend among financially strapped, less selective colleges making the move to change their name to “university.”)

Slide19

Patterns are also relatively stable by the Barrons’ competitiveness ratings. Notably, colleges right in the middle of the competitiveness ratings have the largest market share. I know this conflicts with reformy ideas that all ed degrees are produced by the worst colleges – but at the undergrad level, it’s a pretty normal distribution. Competitive colleges have a consistent 50% market share. Indeed, they are not the top third. They are also not the bottom! They are… the middle… as one would expect for a profession with modest (at best) earnings expectations.

The next two categories out from there – one up (very) and one down (less), have just under 20%. But, the “less competitive” group seems to be showing an uptick (they are also heavy on those LA2s!). Highly Competitive and Non-Competitive are also relatively comparable, but with non-competitive slightly outpacing highly-competitive.

 

Slide20

Masters Degrees

It’s in the production of masters degrees where the real fun stuff is happening. First, let’s take a look at what’s been happening across institutions by type. Note that Comprehensive colleges were, in large part, designed to deliver bachelors and masters degree programs and many from early on had large education programs and teacher preparation programs in particular. But we see in the figure below that the market share of masters degree production for Comp1s has declined over time. So too has the market share for masters degrees for Research Universities (including state flagship universities).

Amazingly, it’s those LA2s again that have risen dramatically in degree production. These lower tier liberal arts colleges (we’re not talkin’ Williams, Haverford, etc… which are LA1s. Those schools aren’t crankin’ up masters in Ed… and they’re also not changing their name to Williams University, etc.), have become the second largest producers of masters degrees in education. Bear in mind that liberal arts colleges, as classified in the 1990s, were never really intended to be handing out graduate degrees – no less massive numbers of them.  LA2s have gone from only about 1% of ed masters production in 1990 to over 10% by 2011.

Slide23

The next figure reclassifies these schools by the competitiveness of their undergraduate programs (since we lack competitiveness measures for graduate programs). What we see here is that masters programs housed in “LESS COMPETITIVE” undergraduate colleges are the ones that are creeping up in market share. To a significant extent, these are online, credential granting programs run through LA2s.

Slide24

So, what we have here, is a rather dramatic expansion of graduate credentials in education being handed out by what some (including myself) might characterize as relatively low quality, non-selective undergraduate institutions that were never meant to be handing out graduate degrees to begin with.  But perhaps that’s just my ivory tower, Research I perspective.

Now let’s take a look at the top 20 Masters degree producers in the early 1990s and then in the most recent three years. In the early 1990s, the largest producers were crankin’ out a few thousand over a three year period. These included some early entrants – pre-online era – to the degree mass-production game like Lesley College and National Louis U. But, there were also many programs housed in brick and mortar public universities in the mix, including both state flagships (UT Austin, Ohio State) and other pretty solid academic schools (Harvard, Columbia/TC).  Arguably, these [the public colleges in particular] are the schools now taking the brunt of the blame for the state of teacher preparation – Northern Arizona, Northern Colorado, Eastern Michigan, etc.

Slide26

But who has actually been crankin’ out the masters degrees and credentials in recent years? And, if there is a decline and pending crisis in education training/preparation, who might instead be to blame? Below is the more recent production of graduate degrees/credentials. First and foremost, we’ve now got schools crankin’ out over 3,000 per year – or 9k per 3 years. Phoenix, Walden and Grand Canyon together produce more masters degrees than many of the next several combined.  There is a substantial gap in production before one reaches the first traditional teacher preparation program on the list.

Is it possible that the emphasis on traditional “ed schools” within state boundaries as the obvious source of our problems is misplaced?

Slide25

Graduate Degree Production in Educational Leadership/Administration

I’ve got one last bit to address here and that’s training in educational leadership/administration, a topic I’ve written about in my academic publications (see below). Degree production in educational leadership has followed many of the same trends we see in education more generally. And there has been a comparable push to provide more “alternatives” for gaining access to principal, supervisor and district leadership credentials. NOTE- if you think some of what I’m displaying here makes education grad degree production look like a cesspool, I assure you that when it comes to the production of MBAs, the picture is equally if not even more ugly! (One can buy an MBA almost anywhere… perhaps even more easily than a degree in ed admin… and in many cases which I have observed directly, the level of academic rigor, even within major universities, is hardly different!)

The figure below shows that major research universities have played a declining role in the production of graduate degrees (all levels) in educational administration. Again, it’s those entrepreneurial LA2s that are crankin’ up the production – moving into 2nd place among institution types.

Slide7

Now let’s take a look specifically at doctoral degrees. One can almost kind of understand the mass production of masters degrees, which in education are often tied to obtaining specific certifications, perhaps in additional fields of specialization (special education, etc.). Yes, in many states, administration degrees are structured such that the masters is coupled with building level certification and the doctorate with district level certification. Even then, how many doctorates does any one institution need to be cranking out? And who should be granting that level of degree?

By 1990s Carnegie classifications, doctorates should be (have been) granted largely by Research and Doctoral Universities. Comprehensive colleges were generally masters producing schools, not doctoral granting institutions. These strata were, in fact, intended to reflect the capacity of institutions to grant certain types/levels of degrees.

Already by the early 1990s, Nova Southeastern had pioneered mass production of the education doctorate. But outside of the Nova model, most major producers of doctorates were actual universities (okay… a bit harsh… since NOVA actually is a university, and has a pretty well defined, conventional curriculum for their graduate programs).

Slide12

In the most recent years, Nova Southeastern has remained strong… but now right up there are such stellar academic powerhouses as Walden, Capella and Phoenix! (and Argosy)… many of which probably occasionally show up as side-bar advertisements on my blog! (as they do when I log into facebook).

A notable change in the past few years is the entrance of USC and Penn to this mix, with their new practitioner preparation programs which apparently crank out a sizable number of doctorates per year.  This raises the interesting question of whether leading universities should try to get into the mass production game? Is the system overall better for it, even if those institutions have to sacrifice some quality in order to mass produce? We’ll have to see if they can keep up with the Waldens and Capellas over the next several years.

Slide14

Closing Thoughts

To me, these trends are pretty astounding, and serious consideration of these trends must play into any discussion that alarmists might have about the supposed decline in the quality of teacher and administrator preparation (to the extent these alarmists give serious consideration to anything).  Those ringing these alarm bells seem more than happy to suggest that the obvious problem lies with traditional “ed schools” (read, regional and state flagship public colleges and universities) and that the obvious solution is to provide more alternative routes, online options – teacher preparation by MOOC…  (and likely not a MOOC delivered by Stanford U. faculty… but rather through Walden, Capella and the like) & expansion of schools relying on imported, short term labor supply.

I also find it strange, to say the least, that those who argue that the problem is that our teachers don’t come from the upper third of college graduates seem to believe that the solution is to expand the types of programs that tend to grow most rapidly among colleges that cater to the bottom third (less & non-competitive).  To those reformy alarmists who feel they’ve identified the obvious problems and logical solutions, the above data should make sufficiently clear that we’ve already gone down that road.

Further, I’m thoroughly unconvinced that new models purporting to be more selective in the teachers they prepare, but relying largely on a self-credentialing model (we use our teachers to credential our teachers… and only accept as graduate students those who work in our schools?) focused primarily on ideological & cultural indoctrination are a step in the right direction.  I have little doubt they’ll find a captive audience to self-credential and maintain a viable “business model,” (by requiring their own teachers to take courses delivered by their peers & bosses to achieve the credentials needed to keep their jobs) but this endogenous, back-patting, self-validating model is no way to train the future teacher workforce.*

All of this raises the question of what next? Where do we go from here? How do we achieve integrity and quality in the production of degrees and credentials, and more broadly in the training and preparation of future teachers and administrators? I really don’t have any answers for these questions right now. But I’m pretty sure that the last two decades have taken us in the wrong direction!

Related Research

Baker, B.D, Orr, M.T., Young, M.D. (2007) Academic Drift, Institutional Production and Professional Distribution of Graduate Degrees in Educational Administration. Educational Administration Quarterly  43 (3)  279-318

Baker, B.D., Fuller, E. The Declining Academic Quality of School Principals and Why it May Matter. Baker.Fuller.PrincipalQuality.Mo.Wi_Jan7

Baker, B.D., Wolf-Wendel, L.E., Twombly, S.B. (2007) Exploring the Faculty Pipeline in Educational
Administration: Evidence from the Survey of Earned Doctorates 1990 to 2000. Educational
Administration Quarterly 43 (2) 189-220

Wolf-Wendel, L, Baker, B.D., Twombly, S., Tollefson, N., & Mahlios, M.  (2006) Who’s Teaching the Teachers? Evidence from the National Survey of Postsecondary Faculty and Survey of Earned Doctorates.  American Journal of Education 112 (2) 273-300

1994 Carnegie Classifications

  • Research Universities I: These institutions offer a full range of baccalaureate programs, are committed to graduate education through the doctorate, and give high priority to research. They award 50 or more doctoral degrees each year. In addition, they receive annually $40 million or more in federal support.
  • Research Universities II: These institutions offer a full range of baccalaureate programs, are committed to graduate education through the doctorate, and give high priority to research. They award 50 or more doctoral degrees each year. In addition, they receive annually between $15.5 million and $40 million in federal support.
  • Doctoral Universities I: These institutions offer a full range of baccalaureate programs and are committed to graduate education through the doctorate. They award at least 40 doctoral degrees annually in five or more disciplines.
  • Doctoral Universities II: These institutions offer a full range of baccalaureate programs and are committed to graduate education through the doctorate. They award annually at least ten doctoral degrees—in three or more disciplines—or 20 or more doctoral degrees in one or more disciplines.
  • Master’s (Comprehensive) Universities and Colleges I: These institutions offer a full range of baccalaureate programs and are committed to graduate education through the master’s degree. They award 40 or more master’s degrees annually in three or more disciplines. [Includes typical regional, within-state public normal schools/teachers colleges]
  • Master’s (Comprehensive) Universities and Colleges II: These institutions offer a full range of baccalaureate programs and are committed to graduate education through the master’s degree. They award 20 or more master’s degrees annually in one or more disciplines.
  • Baccalaureate (Liberal Arts) Colleges I: These institutions are primarily undergraduate colleges with major emphasis on baccalaureate degree programs. They award 40 percent or more of their baccalaureate degrees in liberal arts fields and are restrictive in admissions.
  • Baccalaureate Colleges II: These institutions are primarily undergraduate colleges with major emphasis on baccalaureate degree programs. They award less than 40 percent of their baccalaureate degrees in liberal arts fields or are less restrictive in admissions. [Includes many cash-strapped, relatively non-selective, smaller private liberal arts colleges]

*I still like to believe that the most important background attribute of a “good teacher” or school leader is someone who is enthusiastic about their own learning, constantly seeking intellectual growth and challenge and that this attribute is often revealed in the types of advanced studies an individual chooses to pursue. To me, even if the Relay model does tap into a set of graduates of more selective colleges, if the Relay program itself is little more than a workshop on “no excuses” classroom disciplinary practices and typical inspiring edu-guru staff development fodder, then the Relay model is antithetical to developing truly good teachers. A workshop or two and perhaps some practical guidance from peers or teacher leaders – okay. But a graduate degree based on this stuff? Are you kidding? (just watch the RELAY GSE Videos here: http://www.relayschool.org/videos?vidid=5)

When Disinformation is Fueled by Misinformation! CHANCELLOR TISCH, YOU ARE WRONG!

Very recently, I posted a critique of the recent technical report on New York State median growth percentiles to be used in that state’s teacher and principal evaluation system.

Today, I read this piece in the NY Post – an editorial by NY State Board of Regents Chancellor Merryl Tisch, and well, MY HEAD ALMOST EXPLODED!

The point of the editorial is to encourage NY City’s teachers and DOE to agree to a teacher evaluation system based on supposedly objective measures – where “objective measures” seems largely to be code language for estimates of teacher effectiveness derived from student assessment data.

First, I have written several previous posts on the usefulness of NYC’s value-added model for determining teacher effectiveness.

  1. the NYC VAM model retains some persistent biases
  2. the NYC VAM model is highly unstable from year to year
  3. the NYC VAM results capture only a handful of teachers per school and their results tend to jump all over the place
  4. adopting the NCTQ irreplaceables logic, the NYC VAM data are so noisy that few if any teachers are persistently irreplaceable
  5. for various reasons, it is unlikely that these are just early glitches in the system that will get better with time

Setting aside this long list of concerns about the NYC VAM results, I now turn to the NYSED – state median growth percentile data (which actually seem inferior to the NYC VAM model/estimates). In her editorial, Chancellor Tisch proclaims:

The student-growth scores provided by the state for teacher evaluations are adjusted for factors such as students who are English Language Learners, students with disabilities and students living in poverty. When used right, growth data from student assessments provide an objective measurement of student achievement and, by extension, teacher performance.

Let me be blunt here. CHANCELLOR TISCH – YOU ARE WRONG! FLAT OUT WRONG! IRRESPONSIBLY & PERHAPS NEGLIGENTLY WRONG!

[now, one might quibble that Chancellor Tisch has merely stated that the measures are “adjusted for” certain factors and she has not claimed that those adjustments actually work to eliminate bias. Further, she has merely declared that the measures are “objective” and not that they are accurate or precise. Personally, I don’t find this deceptive language at all comforting!]

Indeed, the measures attempt – but fail to sufficiently adjust for key factors. They retain substantial biases as identified in the state’s own technical report. And they are subject to many of the same error concerns as the NYC VAM model.  Given the findings of the state’s own technical report, it is irresponsible to suggest that these measures can and should be immediately considered for making personnel and compensation decisions.

Finally, as I laid out in my previous blog post, to suggest that “growth data from student assessments provide an objective measure of student achievement, and, by extension, teacher performance” IS A HUGE UNWARRANTED STRETCH!

While I might concur with the follow up statement from Chancellor Tisch that “We should never judge an educator solely by test scores, but we shouldn’t completely disregard student performance and growth either.” I would argue that school leaders/peer teachers/personnel managers should absolutely have the option to completely disregard data that have high potential to be sending false signals, either as a function of persistent bias or error. Requiring action based on biased and error prone data (rather than permitting those data to be reasonably mined to the extent they may, OR MAY NOT, be useful) is a toxic formula for public schooling quality.

The one thing I can’t quite figure out here is which is the misinformation and which is the disinformation. In any case, both are wrong!

The rest of what I have to say, I’ve already said. But, so readers don’t have to click the link below to access the previous post, I’ve pasted  the entire thing below. Enjoy!

COMPLETE PREVIOUS POST!

I was immediately intrigued the other day when a friend passed along a link to the recent technical report on the New York State growth model, the results of which are expected/required to be integrated into district level teacher and principal evaluation systems under that state’s new teacher evaluation regulations.  I did as I often do and went straight for the pictures – in this case- the scatterplots of the relationships between various “other” measures and the teacher and principal “effect” measures.  There was plenty of interesting stuff there, some of which I’ll discuss below.

But then I went to the written language of the report – specifically the report’s (albeit in DRAFT form)  conclusions. The conclusions were only two short paragraphs long, despite much to ponder being provided in the body of the report. The authors’ main conclusion was as follows:

The model selected to estimate growth scores for New York State provides a fair and accurate method for estimating individual teacher and principal effectiveness based on specific regulatory requirements for a “growth model” in the 2011-2012 school year. p. 40

http://engageny.org/wp-content/uploads/2012/06/growth-model-11-12-air-technical-report.pdf

13-Nov-2012 20:54

Updated Final Report: http://engageny.org/sites/default/files/resource/attachments/growth-model-11-12-air-technical-report_0.pdf

Local copy of original DRAFT report: growth-model-11-12-air-technical-report

Local copy of FINAL report: growth-model-11-12-air-technical-report_FINAL

Unfortunately, the multitude of graphs that immediately preceded this conclusion undermine it entirely. But first, allow me to address the egregious conceptual problems with the framing of this conclusion.

First Conceptually

Let’s start with the low hanging fruit here. First and foremost, nowhere in the technical report, nowhere in their data analyses, do the authors actually measure “individual teacher and principal effectiveness.” And quite honestly, I don’t give a crap if the “specific regulatory requirements” refer to such measures in these terms. If that’s what the author is referring to in this language, that’s a pathetic copout.  Indeed it may have been their charge to “measure individual teacher and principal effectiveness based on requirements stated in XYZ.” That’s how contracts for such work are often stated. But that does not obligate the author to conclude that this is actually what has been statistically accomplished. And I’m just getting started.

So, what is being measured and reported?  At best, what we have are:

  • An estimate of student relative test score change on one assessment each for ELA and Math (scaled to growth percentile) for students who happen to be clustered in certain classrooms.

THIS IS NOT TO BE CONFLATED WITH “TEACHER EFFECTIVENESS”

Rather, it is merely a classroom aggregate statistical association based on data points pertaining to two subjects being addressed by teachers in those classrooms, for a group of children who happen to spend a minority share of their day and year in those classrooms.

  • An estimate of student relative test score change on one assessment each for ELA and Math (scaled to growth percentile) for students who happen to be clustered in certain schools.

THIS IS NOT TO BE CONFLATED WITH “PRINCIPAL EFFECTIVENESS”

Rather, it is merely a school aggregate statistical association based on data points pertaining to two subjects being addressed by teachers in classrooms that are housed in a given school under the leadership of perhaps one or more principals, vps, etc., for a group of children who happen to spend a minority share of their day and year in those classrooms.

Now Statistically

Following are a series of charts presented in the technical report, immediately preceding the above conclusion.

Classroom Level Rating Bias

School Level Rating Bias

And there are many more figures displaying more subtle biases, but biases that for clusters of teachers may be quite significant and consequential.

Based on the figures above, there certainly appears to be, both at the teacher, excuse me – classroom, and principal – I mean school level, substantial bias in the Mean Growth Percentile ratings with respect to initial performance levels on both math and reading. Teachers with students who had higher starting scores and principals in schools with higher starting scores tended to have higher Mean Growth Percentiles.
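For readers who want to run this kind of check on their own data, here is a minimal sketch of one crude way to look for the pattern: split classrooms at the median prior score and compare average MGPs. The field names and the four classrooms are hypothetical, made up purely for illustration. They are not from the technical report, and this is not the method the report's authors used.

```python
from statistics import mean, median

def mgp_gap_by_prior_score(classrooms):
    """Mean MGP of classrooms above the median prior score,
    minus mean MGP of classrooms at or below it."""
    cut = median(c["prior_score"] for c in classrooms)
    high = [c["mgp"] for c in classrooms if c["prior_score"] > cut]
    low = [c["mgp"] for c in classrooms if c["prior_score"] <= cut]
    return mean(high) - mean(low)

# Hypothetical classrooms (illustrative values only, not real NYSED data)
classrooms = [
    {"prior_score": 640, "mgp": 42},
    {"prior_score": 655, "mgp": 47},
    {"prior_score": 670, "mgp": 52},
    {"prior_score": 690, "mgp": 58},
]

gap = mgp_gap_by_prior_score(classrooms)
print(f"High-minus-low MGP gap: {gap:.1f} percentile points")
```

If the ratings were unrelated to starting scores, a gap like this should hover near zero. A large positive gap is the kind of pattern the report's scatterplots display, though a two-bin split is far blunter than the plots themselves.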

This might occur for several reasons. First, it might just be that the tests used to generate the MGPs are scaled such that it’s just easier to achieve growth in the upper ranges of scores. I came to a similar finding of bias in the NYC value added model, where schools having higher starting math scores showed higher value added. So perhaps something is going on here. It might also be that students clustered among higher performing peers tend to do better. And, it’s at least conceivable that students who previously had strong teachers and remain clustered together from year to year, continue to show strong growth. What is less likely is that many of the actual “better” teachers just so happen to be teaching the kids who had better scores to begin with.

That the systemic bias appears greater in the school level estimates than in the teacher level estimates suggests that the teacher level estimates may actually be even more biased than they appear. The aggregation of otherwise less biased estimates should not reveal more bias.

Further, as I’ve mentioned several times on this blog previously, even if there weren’t such glaringly apparent overall patterns of bias there still might be underlying biased clusters.  That is, groups of teachers serving certain types of students might have ratings that are substantially WRONG, either in relation to observed characteristics of the students they serve or their settings, or of unobserved characteristics.

Closing Thoughts

To be blunt – the measures are neither conceptually nor statistically accurate. They suffer significant bias, as shown and then completely ignored by the authors. And inaccurate measures can’t be fair. Characterizing them as such is irresponsible.

I’ve now written 2 articles and numerous blog posts in which I have raised concerns about the likely overly rigid use of these very types of metrics when making high stakes personnel decisions. I have pointed out that misuse of this information may raise significant legal concerns. That is, when district administrators do start making teacher or principal dismissal decisions based on these data, there will likely follow, some very interesting litigation over whether this information really is sufficient for upholding due process (depending largely on how it is applied in the process).

I have pointed out that the originators of the SGP approach have stated in numerous technical documents and academic papers that SGPs are intended to be a descriptive tool and are not for making causal assertions (they are not for “attribution of responsibility”) regarding teacher effects on student outcomes. Yet, the authors persist in encouraging states and local districts to do just that. I certainly expect to see them called to the witness stand the first time SGP information is misused to attribute student failure to a teacher.

But the case of the NY-AIR technical report is somewhat more disconcerting. Here, we have a technically proficient author working for a highly respected organization – American Institutes for Research – ignoring all of the statistical red flags (after waving them), and seemingly oblivious to gaping conceptual holes (commonly understood limitations) between the actual statistical analyses presented and the concluding statements made (and language used throughout).

The conclusions are WRONG statistically and conceptually.  And the author needs to recognize that being so damn bluntly wrong may be consequential for the livelihoods of thousands of individual teachers and principals! Yes, it is indeed another leap for a local school administrator to use their state approved evaluation framework, coupled with these measures, to actually decide to adversely affect the livelihood and potential career of some wrongly classified teacher or principal – but the author of this report has given them the tool and provided his blessing. And that’s inexcusable.

Teachers Unions: Scourge of the Nation?

UPDATED: 1/29/2015

Let me start by stating that I, myself am somewhat agnostic when it comes to the questions around whether I believe teachers unions are generally good or bad for the overall quality of our education system and for educational equity.  In my personal experiences as a young teacher in the early 1990s, I had my issues with my local teachers unions (in New York State in particular), resulting in some pretty heated battles with local and regional union officials [and some pretty nasty internal politics in my own school].  As a young teacher, I was anything but a fan of the teachers union. But unlike many of my TFA pals [I was a few years too early for TFA, but had friends & later colleagues in the first few waves] who only stuck it out in teaching for a year or two and may have developed similar negative feelings toward their local union, I did outgrow that initial reaction – which in my view- was somewhat isolated – and partly a function of my own youthful ignorance.  I didn’t stick it out in public school teaching much longer than that [the local union actually ran me out!], but did have the unique experience of working in an elite private school that had a union, and I worked in that school during a contract renegotiation.

The idea for this post first came about when I read the following quote in an article in the Economist. This has to be among the most utterly stupid statements I think I’ve ever read in my life:

…no Wall Street financier has done as much damage to American social mobility as the teachers’ unions have. http://www.economist.com/node/21564556

And then there’s this more recent quote:

Many schools are in the grip of one of the most anti-meritocratic forces in America: the teachers’ unions, which resist any hint that good teaching should be rewarded or bad teachers fired. http://www.economist.com/news/leaders/21640331-importance-intellectual-capital-grows-privilege-has-become-increasingly

Now… these quotes are ridiculous at many levels.  Most notably, the first quote is stupid simply because one could never possibly contrive a reasonable quantifiable comparison of the supposed negative effects of either the individual hedge fund manager or the supposed monolithic “teachers union.” It’s the empirical equivalent of arguing whether Superman can beat up Hulk. It’s just asinine.

UPDATE: The second quote above comes from a piece that subsequently implies that teachers’ unions are a major, if not the primary cause of educational inequality across children- specifically between rich and poor children. Here’s a little more on the topic of “teacher equity” in particular. (Post 1 | Post 2)

On the heels of this quote came the Thomas B. Fordham Institute report rating the strength of teachers unions – or unionization more generally – across states.  Perhaps the most useful aspect of this report is that it provides us with insights regarding the heterogeneity of unionization across American states.  Unions and unionization are not monolithic.

As recognized by the Fordham report, we really don’t have an American education system. We have 51 systems. They are all somewhat different, with different standards, different funding systems, different union rules and protections and different student outcomes.  The existing variations across our state systems of education alone render the Economist statement utterly stupid and misguided.  Those variations also provide for some fun opportunities to explore the relationship between TB Fordham’s characterization of teachers’ union strength across states and other features of state education systems.

In this post, I use data from several reports that attempt to characterize state education systems to probe two main questions – whether there exists any association between general indicators of education quality across states and union strength, and whether there exists any association between indicators of educational equality across states and union strength.

How is union strength related to funding levels and funding fairness?

Along with colleagues at the Education Law Center of New Jersey, I have been preparing for the past few years, annual reports on education funding fairness. In the Funding Fairness report, we use a statistical model on three years of national data on all school districts to project the cost adjusted per pupil state and local revenues for all districts and state averages nationally, and we characterize the overall fairness – progressiveness or regressiveness of state school finance systems. Below, I evaluate the relationship between “union strength rank” from the TB Fordham report and funding “levels” (an indicator of adequacy) and funding “fairness” (whether higher poverty districts receive systematically more, or less funding per pupil than lower poverty districts in that state).

An important caveat here since I like to pick on inappropriate graphs myself is that I really should not be making scatterplots where the x-axis variable is a “rank” measure. Rank is not an interval measure. But this is purely for illustrative purposes, so please forgive my misuse of rank data in this way! [or at least if you slam me for it, acknowledge that I pointed this out!]
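For anyone who wants a number to go with the pictures, a rank-based statistic like Spearman's correlation is the more defensible companion to a rank variable, since it only uses ordering. Here is a minimal sketch in plain Python. The six "states" and their funding values are hypothetical, invented for illustration. They are not the Fordham ranks or the Funding Fairness figures.

```python
from statistics import mean

def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def pearson(x, y):
    """Ordinary Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: union strength rank (1 = strongest) and adjusted funding
union_rank = [1, 2, 3, 4, 5, 6]
funding = [15000, 14000, 14500, 11000, 10500, 9000]

rho = spearman(union_rank, funding)
print(f"Spearman rho = {rho:.3f}")
```

A negative rho here simply means funding falls as the rank number rises (that is, as unions get weaker). Like the scatterplots, it says nothing about the direction of causation.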

Figure 1

In Figure 1 we can see that states with stronger teachers unions [left hand end] tend to have more adequate overall funding levels. It is however more clearly the case that states with weak teachers unions (ranked 45th to 50th) tend to have particularly low adjusted funding levels. This is certainly not to suggest any direction of causation. That’s the whole trick here. Most of this is probably quite circular – endogenous. [the union cynic might argue that this merely shows that teachers’ unions have extorted funds from the taxpayer] It is likely that states which tend to be more educated and progressive happen both to have stronger teachers unions and to spend more on education – except for those states like California that by referendum have systematically deprived their education systems for decades.

Figure 2

Perhaps more to the point of the Economist assertion, we see that states with weaker teachers unions also tend to have less fair funding distributions – or are systems where it is more likely that high poverty districts have systematically fewer resources per pupil than lower poverty ones.  Again, this result is likely a function of the endogenous relationships mentioned previously.

See: http://www.schoolfundingfairness.org/

UPDATE: So, wait a second, if stronger union states tend to have fairer funding distributions, might that actually enhance equity? In a really big, important and substantive way? Hmmm….

How is union strength related to competitiveness of teacher pay?

Here, I look at the relationship between union strength and the relative wage of teachers compared to non-teachers in the same state.  This is a particularly important comparison for two reasons. First of all, the relative competitiveness of teacher wages likely has significant effects on the quality of individuals who choose to enter the teacher workforce versus other employment opportunities (selecting from HS into College).  Overall wage competitiveness can have long run effects on overall teacher workforce quality.  Further, this is the one comparison I make in this post where we might hypothesize a direct, easily interpreted relationship. That is, we might expect stronger unions to lead to more competitive wages.  Here, I compare the weekly wage % (teacher percent of non-teacher) from the Economic Policy Institute with the TBF union strength rank.

Figure 3

Somewhat to my own surprise, this relationship is actually quite strong!… with states having stronger teachers unions also having generally more competitive teacher wages.

See: http://www.epi.org/publication/the_teaching_penalty_an_update_through_2010/

Is union strength associated with NAEP achievement levels?

Now, the usual retort to teacher union bashing is to point out that states like New Jersey and Massachusetts have strong unions and also have high NAEP scores, and states like Alabama and Mississippi have weak unions and low NAEP scores.  Yeah… okay… but clearly there’s a lot goin’ on there that has little or nothing to do with unions.  But let’s indulge this premise a little further with some additional graphs just to see the patterns.

In these first few figures I present the relationship between NAEP scores for children in families above the 185% income level for poverty (not on free or reduced lunch) and union strength. Note that the patterns are similar for scores for children qualified for reduced lunch or for free lunch, but I’ve not included them here… ‘cuz there are already enough graphs in this post. I’d be happy to share them though.  In general, what we see in Figure 4 and Figure 5 is that NAEP scores for non-low income kids tend to be slightly lower – with little clear pattern – in weak union states.

Figure 4

Figure 5

Figure 6, however, clarifies that NAEP scores tend to be higher for non-low income children in states where incomes are higher for non-low income children.

Figure 6 (but income dictates NAEP)

We can use the information in Figure 6 to adjust the NAEP scores (are they higher or lower than would be expected, given the income levels) for household income differences.  When we make that adjustment, we get Figures 7 and 8.
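For the curious, the adjustment amounts to fitting a line of NAEP score on income and keeping each state's residual (actual minus expected). Here's a minimal sketch of that idea, assuming a simple one-variable linear fit. The four data points are hypothetical, chosen only so the arithmetic is easy to follow, and the function names are mine.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares slope and intercept for y on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

def income_adjusted(scores, incomes):
    """Residuals: how far each score sits above or below the
    score expected at that income level."""
    slope, intercept = fit_line(incomes, scores)
    return [s - (intercept + slope * inc) for s, inc in zip(scores, incomes)]

# Hypothetical states: income in $1,000s and mean NAEP scale score
income = [60, 70, 80, 90]
naep = [240, 246, 248, 256]

adjusted = income_adjusted(naep, income)
for inc, adj in zip(income, adjusted):
    print(f"income {inc}k: {adj:+.1f} points vs. expected")
```

Positive values are states scoring higher than their income level would predict, negative values lower. By construction the residuals average to zero, so the adjusted scores are only meaningful relative to each other.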

Figure 7 (income adjusted NAEP)

Figure 8 (income adjusted NAEP)

Still we see that adjusted NAEP scores are somewhat, though hardly systematically, lower in states with weaker unions. What we certainly do not see here is that NAEP scores are systematically lower in states with stronger unions. That is, unions certainly aren’t driving NAEP scores into the ground!

But, while the second set of graphs is more appropriate than the first, both are dreadfully oversimplified characterizations of complex relationships.

Is union strength associated with NAEP achievement gaps?

This question is perhaps most on target with the Economist claim. Following the Economist logic, one might assert that teachers unions likely lead to larger achievement gaps, thus limiting social mobility. Measuring poverty related achievement gaps and comparing them across states is tricky, as I’ve discussed in numerous previous posts. Specifically, the size of the achievement gap between kids not qualified for free or reduced lunch and those qualified for either free or reduced lunch tends to be highly related to the size of the income gap between the two groups – as shown in Figure 9! That is, we can’t just do straight up achievement gap comparisons – we must adjust for the income gap.

Figure 9 (Income Gaps and NAEP Gaps)

Figure 10 and Figure 11 present the income gap adjusted achievement gaps in relation to union strength rank.  What we see is little or no relationship between union strength and achievement gaps. While this does not illustrate that stronger unions lead to smaller achievement gaps…. It also does not by any stretch illustrate that stronger unions lead to larger achievement gaps… an expectation that might reasonably be derived from the claim made in the Economist.

Figure 10

Figure 11

Then again… these are still cursory… descriptive analyses – using only two variables at a time to characterize education systems that are far more complex than can be legitimately characterized with only two variables at a time. It’s exploratory. It’s a start… and there’s certainly more to be explored here… but likely questions that can never be satisfactorily untangled with available data.

See: https://schoolfinance101.wordpress.com/2011/09/13/revisiting-why-comparing-naep-gaps-by-low-income-status-doesnt-work/

Is union strength associated with NAEP achievement growth?

Finally, I suspect that some curmudgeonly reactors to this post will attempt to argue that weak union states have seen more growth in NAEP achievement over time. Well, Figure 12 kind of thwarts that notion as well. Not much relationship there either, but certainly the only one in this post at all that shows even the slightest upward tilt.

Figure 12

But alas, even that tiny upward tilt is a function of the fact that states that saw the greatest growth on NAEP were simply the states that had and still have the lowest overall performance levels – as shown in Figure 13. And, states with lower average performance levels – now and then – tend to have weaker unions.

Figure 13

For a more thorough discussion on this point, see: https://schoolfinance101.wordpress.com/2012/07/27/learning-from-really-bad-graphs-ill-informed-conclusions-thoughts-on-the-new-pepg-catching-up-report/

Conclusions

So what does this all mean then? Are unions good, or are they bad? Do they increase inequality and lower quality? It’s certainly difficult given the data provided above to swallow the bold assertion in the Economist that teachers’ unions are the scourge of the nation and primary cause of declining social mobility.  That’s just a load of unsubstantiated crap!

But then what can we learn here? Well, it is perhaps important that there appears to be at least some likely indirect and certainly endogenous relationship between unionization and funding fairness and funding levels. As I’ve discussed in related research, funding fairness and funding levels – and school finance reforms that improve equity and adequacy – do matter!  To summarize:

Do state school finance reforms matter? Yes. Sustained improvements to the level and distribution of funding across local public school districts can lead to improvements in the level and distribution of student outcomes. While money alone may not be the answer, more equitable and adequate allocation of financial inputs to schooling provide a necessary underlying condition for improving the equity and adequacy of outcomes. The available evidence suggests that appropriate combinations of more  adequate funding with more accountability for its use may be most promising.

http://www.shankerinstitute.org/images/doesmoneymatter_final.pdf

See also this post in which I probe more specifically the changes in achievement gaps over time in Massachusetts and New Jersey.

Further, the potentially more direct relationship between unionization and relative competitiveness of teacher wages compared to other labor market opportunities may be important in the long run.  In a related policy brief from last winter, I noted:

To summarize, despite all the uproar about paying teachers based on experience and education, and its misinterpretations in the context of the “Does money matter?” debate, this line of argument misses the point. To whatever degree teacher pay matters in attracting good people into the profession and keeping them around, it’s less about how they are paid than how much. Furthermore, the average salaries of the teaching profession, with respect to other labor market opportunities, can substantively affect the quality of entrants to the teaching profession, applicants to preparation programs, and student outcomes. Diminishing resources for schools can constrain salaries and reduce the quality of the labor supply. Further, salary differentials between schools and districts might help to recruit or retain teachers in high need settings. In other words, resources used for teacher quality matter.

http://www.shankerinstitute.org/images/doesmoneymatter_final.pdf

So, while nothing in this post puts to rest the big – unanswerable – questions of the overall equity and quality effects of teachers unions on our supposed monolithic American public education system, these analyses do at least raise serious questions about the notion that teachers unions are the scourge of the nation and the cause of all of the supposed – also unfounded – ills of American public schooling.

Cheers! It’s good to be back!

Friday Afternoon Graphs: Graduate Degree Production in Educational Administration 1992 to 2011

I’ll let the pictures tell the story this time. [UPDATED – Errors in original]

Data source: http://nces.ed.gov/ipeds/datacenter/DataFiles.aspx

Data, Data, Data? Dissecting & Debunking NJDOE’s State of the Schools Message

Time again for an NJ State of the Schools Address, as reported HERE in NJ Spotlight (with absolutely no critical question/reporting whatsoever! More or less spoon fed regurgitation).

As I’ve written a number of times on this blog, state officials in New Jersey have decided on a specific marketing/messaging plan in order to support current policy initiatives. Those policy initiatives involve:

  1. expanding NJDOE authority to impose desired “reforms” (charter/management takeover, staff replacement, etc.) on specific schools otherwise not under their direct authority.
  2. cutting funding from higher poverty, higher need districts and shifting it toward lower poverty, lower need ones.
  3. expanding charter schooling and promoting other  “innovations” in high poverty concentration schools.

The supposed impetus for these reforms is that New Jersey faces a very large achievement gap between low income and non-low income children (one that is largely mis-measured). While it would seem inconsistent to suggest reducing funding in low income districts and shifting it to others, the creative messaging has been that the additional resources are quite possibly the source of the harm… or at the very least those resources are doing no good. Thus, the path to improvement for low income kids is to transfer their resources to others.  What I have found most disturbing about this messaging – other than the ridiculous message itself! – is the flimsy logic and disingenuous presentations of DATA that have been used to advance the argument.

Look, if the message is going to be about Data, Data, Data – then now is the time to take a more thorough, context-sensitive look at the data, and try to better understand what’s really going on.

Let’s do a walk through of some of the information presented in the most recent state of the schools presentation.

Here’s a link to the slides from the recent presentation:

http://www.state.nj.us/education/news/2012/0919con.pdf

NJDOE Message

The most recent state of the schools presentation is now in the post-NCLB waiver era, where we are now presented with those template classifications of schools as Priority, Focus and Reward schools.
The state of the schools presentation revolves to a large extent around these categories, because it is those Priority schools that are the target of the most immediate and disruptive interventions.

Below are the slides that were presented to characterize schools by their performance category. The message to be conveyed by these slides was:

  1. Priority Schools are overspenders (or at least very well resourced)
  2. Priority Schools have very well paid teachers who have slightly higher than average experience
  3. Yet still, priority schools have really crummy outcomes!

Therefore, we must have wide latitude to intervene!

EXHIBIT A – PRIORITY SCHOOLS SPEND MORE(?)

EXHIBIT B – PRIORITY SCHOOLS HAVE HIGH PAID TEACHERS & LOW OUTCOMES!

EXHIBIT C- GAPS REMAIN LARGE

Omitted Information: What about demographic differences?

Clearly, a few things are being overlooked in the first two slides, which claim to characterize Priority schools as schools with plenty of resources that simply don’t get the job done. Now, there’s a little more to the story than that!

Most notably, as I show below, priority schools have about 80% of children qualified for free lunch and reward schools less than 10%! Yet as the NJDOE slide above shows, at the high end these school districts spend slightly under 30% more than state average. Notably, this shoddy comparison does not compare these districts to others in their own labor market.

Indeed, New Jersey more than other states has put some money into these districts. See “Is school funding fair?” But, let’s be clear, these margins of funding difference, while helpful, hardly make these districts – given their needs – flush with excess resources!

In fact, the strongest empirical research on this topic suggests that it would take an additional 100% or so per pupil funding for a district that is 100% low income versus a district that is 0% low income. Here, we are looking at nearly that extreme of low income differential, and not nearly that extreme of funding support! So while these districts are better off than similar districts in other states, implying that they’ve got more than enough to close achievement gaps is a huge stretch.
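To put rough numbers on that, here's a back-of-the-envelope sketch. It assumes (my simplification, not something the research states in this form) that the cost premium scales linearly with the low income share, from 0% extra at 0% low income to 100% extra at 100% low income. It also treats reward schools as sitting right at 10% free lunch, which is the upper bound cited above.

```python
def cost_index(low_income_share, full_premium=1.0):
    """Relative per pupil cost under an ASSUMED linear premium:
    1.0 at 0% low income, 1.0 + full_premium at 100% low income."""
    return 1.0 + full_premium * low_income_share

priority_share = 0.80  # about 80% free lunch (from the figures in this post)
reward_share = 0.10    # "less than 10%", so 10% is an upper bound

ratio = cost_index(priority_share) / cost_index(reward_share)
needed_pct = (ratio - 1.0) * 100

print(f"Implied funding differential, priority vs. reward: {needed_pct:.0f}%")
```

Under those assumptions, priority schools would need something on the order of 60%+ more per pupil than reward schools. Keep in mind the NJDOE slide's "slightly under 30%" figure is measured against the state average rather than against reward schools, so the two numbers are not strictly comparable. The point is only the rough order of magnitude.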

But do those demographic differences matter?

This figure shows just how much the demographic differences represented above matter with respect to student achievement, and specifically how much school demography continues to dictate the performance classification of schools under the NJDOE waiver plan.

As I pointed out on a recent post, NJDOE has basically flagged schools in low income neighborhoods for experimentation and substantial disruption (closure, etc.) with an option to override any/all local input.

Notably, this pattern is likely better than it would otherwise be because of New Jersey’s past efforts to target additional resources to high need settings, including pre-kindergarten programs, smaller class sizes and more competitive teacher salaries than might otherwise exist in these settings.

What about the teacher pay and teacher characteristics claim?

But what about those salaries? The NJDOE slides present a picture of teachers who – by their argument – are certainly paid enough. And, in fact, setting aside (ignoring entirely) the demography of the schools, the implication of the NJDOE slides is that hey… we’re paying these teachers a few thousand more than the average teacher in the state, but clearly they just aren’t very good, or at least there are a bunch of them that aren’t and need to be fired! Further, they have slightly more experience than teachers in other schools… yet they still stink… indicating that experience clearly doesn’t matter. Notice that they didn’t present degree levels.

Okay… now let’s do a legitimate walkthrough of the most recent available data on NJ teachers with respect to the performance categories of schools. I use the 2011-12 Fall Staffing Reports and I fit a regression model of teacher salaries for all elementary and middle level classroom teachers (secondary later if I get a chance). In that model, my goal is to compare the salary a teacher would make:

  • at the same experience level
  • with the same degree level
  • having the same job code
  • working full time
  • in the same labor market (and type of district in that market)
  • in the same year

That is, I’m comparing apples with apples. This first graph shows the average difference in salary on the above comparison bases, statewide. Statewide, teachers in priority schools are earning a lower salary and teachers in reward schools a higher salary than teachers in “all other schools.” But these averages do mask some important differences across labor markets.
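For readers who want to see the mechanics, here is a minimal sketch of that kind of "apples with apples" salary comparison. It is not my actual model or data: the teacher records are synthetic, the variable names (has_ma, north_market, priority, reward) and all dollar amounts are made up for illustration, and the sketch leaves out job code, full time status, district type and year to stay short. The idea is simply that the coefficients on the school category indicators give the salary difference relative to "all other schools," holding the other factors constant.

```python
from itertools import product

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

NAMES = ["intercept", "experience", "has_ma", "north_market", "priority", "reward"]

def design_row(experience, has_ma, market, category):
    """One teacher's regressors; 'other' schools are the omitted baseline."""
    return [1.0, float(experience), float(has_ma), float(market == "north"),
            float(category == "priority"), float(category == "reward")]

# Synthetic teachers (hypothetical values, not NJ staffing data).
rows, salaries = [], []
for exp, ma, market, cat in product([1, 5, 10, 20], [0, 1],
                                    ["north", "south"],
                                    ["priority", "reward", "other"]):
    rows.append(design_row(exp, ma, market, cat))
    # Made-up salary schedule used only to generate the toy data.
    salaries.append(50000 + 1200 * exp + 4000 * ma
                    + 3000 * (market == "north")
                    - 2000 * (cat == "priority")
                    + 1500 * (cat == "reward"))

coef = dict(zip(NAMES, fit_ols(rows, salaries)))
print(round(coef["priority"]), round(coef["reward"]))
```

Because the toy salaries are generated without noise, the fitted priority and reward coefficients simply recover the made-up gaps built into the data. With real staffing files the same setup would return estimated gaps, and one would likely want a more flexible experience term than a single straight line.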

Here are the North Jersey/NY projected teacher salaries by experience level, where Newark carries significant weight in the model. Priority school salaries by experience are in blue, reward in red. On average, the differences are rather subtle. Reward school salaries jump ahead in the mid-range, while priority school salaries fall behind there and rise again later. But it’s really important to understand that simply having roughly the same salary does not mean that salary is actually competitive for recruiting and retaining teachers of comparable qualifications! In fact, getting teachers to work in a high need setting is likely to require a substantively higher wage!

As I explain in a recent review of the literature on this topic: With regard to teacher quality and school racial composition, Hanushek, Kain, and Rivkin (2004) note: “A school with 10 percent more black students would require about 10 percent higher salaries in order to neutralize the increased probability of leaving.” [33] Others, however, point to the limited capacity of salary differentials to counteract attrition by compensating for working conditions. [34] See: http://www.shankerinstitute.org/images/doesmoneymatter_final.pdf

  • Hanushek, Kain, Rivkin, “Why Public Schools Lose Teachers,” Journal of Human Resources 39 (2) p. 350
  • Clotfelter, C., Ladd, H.F., Vigdor, J. (2011) Teacher Mobility, School Segregation and Pay Based Policies to Level the Playing Field. Education Finance and Policy, Vol. 6, No. 3, Pages 399–438
  • Clotfelter, Charles T., Elizabeth Glennie, Helen F. Ladd, and Jacob L. Vigdor. 2008. Would higher salaries keep teachers in high-poverty schools? Evidence from a policy intervention in North Carolina. Journal of Public Economics 92: 1352–70.

Now let’s look at South Jersey, which appears to be the source of most of the deficit that shows up statewide. In South Jersey/Philly metro, teachers in priority schools are making a much lower wage, especially in the mid-range. Non-classified and reward schools lead the way on salaries across most of the experience range. Hey… is this chicken or egg? Do salaries matter – or are more advantaged schools simply able to pay higher salaries?

One issue that NJDOE appears to be ignoring entirely is that the classification of these schools may actually lead to additional teacher sorting – making it even harder to staff priority schools with high quality teachers down the line.

Here are the degree levels of classroom teachers in these schools – something notably absent in the NJDOE presentation. The differences between priority and reward schools are quite striking.

PRIORITY SCHOOLS HAVE FAR MORE TEACHERS WITH ONLY A BA AND FEWER WITH AN MA THAN REWARD SCHOOLS!

Finally, here are the concentrations of novice teachers. A sizable body of research literature points to the problem of teacher churn in high need schools and to the relationship between high novice teacher concentrations and lower student outcomes.

What about the performance of low income children in New Jersey?

Again, part of the message being presented in the state of the schools address is that New Jersey in particular has failed its low income children – as indicated by the suspect over-time proficiency rate graphs presented above. These graphs are coupled with the funding/resource graphs to imply that funding is clearly unhelpful at best and harmful at worst when it comes to fixing the achievement gap.

As I’ve written on this blog before, New Jersey has made substantive gains in recent decades for low income children. Further, to make comparisons of achievement gaps, one must focus on the most comparable measures and most comparable settings. In one recent blog post, I compared Massachusetts, Connecticut and New Jersey – which, in terms of income distributions and the characteristics of those above and below the Free/Reduced Income thresholds, are most similar. The following graphs show that children of HS dropouts and low income children in NJ and MA both have higher levels of performance and have outpaced the gains in performance of similar children in Connecticut and Rhode Island (but especially CT!).

What has New Jersey done to improve performance of low income children?

I also elaborated in that previous post that one key difference between these states is that NJ and MA, more than the others, have shifted resources toward higher need districts. The first graph shows the disruption over time in the relationship between district income and district resources. MA and NJ have most significantly disrupted this relationship, providing systematically more resources per pupil in lower income districts.
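One rough way to put a number on "the relationship between district income and district resources" is the slope of per pupil spending on district income, computed separately for each period. The sketch below is only an illustration of that idea, not the method behind the graphs: the district incomes and spending figures are made up, and a simple straight-line slope is my assumption for the example. A positive slope means richer districts have more to spend per pupil; a negative slope means resources are tilted toward lower income districts.

```python
def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical districts: median household income in $1000s.
income = [30, 50, 70, 90, 110]

# Hypothetical per pupil spending ($) in an earlier and a later period.
spending_early = [9000, 10000, 11000, 12000, 13000]   # richer districts spend more
spending_late = [15000, 14000, 13000, 12000, 11000]   # poorer districts spend more

slope_early = ols_slope(income, spending_early)
slope_late = ols_slope(income, spending_late)

# Dollars of per pupil spending per additional $1000 of district income.
print(slope_early, slope_late)
```

In this toy data the slope flips sign between the two periods, which is the kind of "disruption" in the income/resources relationship described above. Real district data would be noisier, and one would want to weight districts by enrollment.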

This second graph shows the pattern across districts by poverty in each state. Note that in CT, while a few high poverty districts (Hartford and New Haven) have higher current spending, the CT pattern is less systematic. Further, in those few districts, much of the additional spending is granted through magnet school aid, and thus may have limited positive impact on the districts’ neediest students.

To the best of my understanding, teacher tenure laws are/were strong in each of these states. Few if any districts in these states base teacher evaluation heavily on student test scores – especially during the periods represented in the graphs above – which predate Race to the Top. That is, clearly the differences in low income achievement growth between these states have little/nothing to do with state teacher evaluation policy. To go even further, NJ and CT have relatively small charter school market share, so charter school market share likely is not a major factor either.

Further, as explained in this report, and in this article, substantive and sustained school finance reforms do matter! And the evidence on the effectiveness of these reforms far outweighs the more speculative reforms being suggested as replacements for funding in New Jersey.

What does NJDOE & the current administration propose to do about future funding?

Finally, as I noted previously, the current direction of policy initiatives is to attempt to reshuffle funding away from higher poverty/need districts and toward lower poverty/need ones. Here’s the graph from the previous post.

The Strange Logic of it All?

Coupling this DOOHNIBOR (uh… reverse robinhood) strategy with arguments for disruptive reforms in high poverty settings is illogical at best and reckless and irresponsible at worst.

Children in high poverty settings in New Jersey have made substantive gains over time.

It is quite likely that New Jersey’s investments in the schools and communities of these children have played a significant role in those gains.

Yet, even in New Jersey, where the state has made those efforts, poverty-related disparities do persist and require attention.

There is little or no evidence that expanded charter schooling is substantively improving the outcomes of our lowest income children, largely because those “successful charter schools” of which we most often speak are not serving our lowest income children in any significant numbers, and in some cases are increasing concentrations of disadvantaged children left behind in district schools.

And there’s little evidence that either New Jersey’s failures or gains are a function of an oversimplified good teacher/bad teacher dichotomy of the kind that would call for oversimplified reformy solutions like teacher deselection and/or pay-for-test scores.

Despite the state’s efforts to provide support to high poverty settings/schools, teacher wages still are not where they necessarily need to be in those districts to recruit and retain a high quality applicant pool year after year. There remain disparities in teacher qualifications, including novice teacher concentrations. Teacher quality disparities may be/are an issue – but not in the way they are presently being framed!

These are the basic issues that need to be addressed. They aren’t sexy. They aren’t reformy. They aren’t consistent with the current marketing/messaging of NJDOE.

But they are based on data, data, data, DATA, DATA and more freakin’ Data!

And there’s a lot more where that came from!