Litigating DC IMPACT: The real usefulness of the Dee/Wyckoff Regression Discontinuity Design

Much has been made of late regarding the erroneous classification of 44 teachers in Washington, DC as ineffective, such that they faced job consequences. This particular misclassification was based on an "error" in the calculation of the teachers' total ratings, as acknowledged by the consulting firm applying the ratings. That is, in this case, the consultants simply did not carry out their calculations as intended. This is not to suggest, by any stretch, that the intended calculations are necessarily more accurate or precise than the unintended error. Indeed, there may be far more – there likely are far more – than these 44 teachers whose ratings fall arbitrarily and capriciously in the zone in which they would face employment consequences.

So, how can we tell… how can we identify such teachers? Well, DC's own evaluation study of IMPACT provides us one useful road map, and even a list of individuals arbitrarily harmed by the evaluation model. As I've stated on many, many occasions, it is simply inappropriate to draw bright-line distinctions through fuzzy data. Teacher evaluation data are fuzzy. Yet teacher evaluation systems like IMPACT impose on those data many bright lines – cut points – to make important, consequential decisions. Distinctions which are unwarranted. Distinctions which characterize as substantively different individuals who simply are not.

Nowhere is this more clearly acknowledged than in Tom Dee and Jim Wyckoff's choice of regression discontinuity to evaluate the effect of being placed in different performance categories. I discussed this method in a previous post. As explained in the NY Times Econ blog:

To study the program’s effect, the researchers compared teachers whose evaluation scores were very close to the threshold for being considered a high performer or a low performer. This general method is common in social science. It assumes that little actual difference exists between a teacher at, say, the 16th percentile and the 15.9th percentile, even if they fall on either side of the threshold. Holding all else equal, the researchers can then assume that differences between teachers on either side of the threshold stem from the threshold itself.

In other words, the central assumption of the method is that those who fall just above and those just below a given, blunt threshold (through noisy data) really are no different from one another. Yet, they face different consequences and behave correspondingly. I pointed out in my previous post that in many ways, this was a rather silly research design to prove that “IMPACT works.” Really what it shows is that if you arbitrarily label otherwise similar teachers as acceptable and others as bad, those labeled as bad are going to feel bad about it and be more likely to leave. That’s not much of a revelation.
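
For readers unfamiliar with the method, here is a minimal sketch of the RD intuition in code. Everything in it is hypothetical – the 250-point cutoff and the attrition rates are invented for illustration, not drawn from IMPACT or from the Dee/Wyckoff data:

```python
import numpy as np

rng = np.random.default_rng(0)

n = 5000
score = rng.normal(300, 50, n)   # noisy evaluation scores (hypothetical scale)
cutoff = 250                     # an arbitrary bright line through fuzzy data
labeled_low = score < cutoff     # category assignment

# Outcome: probability of leaving, shifted only by the label itself
leave_prob = np.where(labeled_low, 0.25, 0.10)
left = rng.random(n) < leave_prob

# The RD comparison: restrict to a narrow bandwidth around the cutoff,
# where teachers on either side are statistically indistinguishable
bw = 5
near = np.abs(score - cutoff) < bw
print(f"attrition just below cutoff: {left[near & labeled_low].mean():.2f}")
print(f"attrition just above cutoff: {left[near & ~labeled_low].mean():.2f}")
# Any gap within the bandwidth is attributed to the label itself, because
# teachers on either side of the line are assumed essentially identical.
```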

But there may be other uses for the Dee/Wyckoff RD study and its underlying data, including the opportunity to call these researchers to the witness stand to explain the premise of regression discontinuity. You see, underlying their analyses is a sufficient sample of teachers in DC who were put in the bottom performance category and may have faced job consequences as a result. By admission of the research design itself, these teachers were arbitrarily placed in that category by the placement of a cut-score. They are, by admission of the research design, statistically no different from their peers who were lucky enough to be placed above that line and avoid consequences.

This seems at least a worthwhile due process challenge to me. To be clear, such violations are unavoidable in these teacher evaluation systems that try so hard to replace human judgment with mathematical algorithms, applying certain cut points to uncertain information.

So, to my colleagues in DC, I might suggest that now is the time to request the underlying data on the teachers included in that regression discontinuity model, and identify which ones were arbitrarily classified as ineffective and faced consequences as a result.

Point of clarification: Is this the best such case to be brought on this issue? Perhaps not. I think the challenges to SGPs being bluntly used for consequential decisions – measures not even designed for distilling teacher effect – are much more straightforward. But what is unique here is that we now have on record an acknowledgement that the cut-points distinguishing between some teachers facing job consequences and others not facing job consequences were considered, by way of the research evaluation design, to be arbitrary and not meaningful statistically. From a legal strategy standpoint, that's a huge admission. It's an admission that cut-points that forcibly (by policy design) override professional judgment are in fact arbitrary and distinguish between the non-distinguishable. And I would argue that it would be pretty damning to the district for plaintiffs' counsel to simply ask Dee or Wyckoff on the stand what a regression discontinuity design does… how it works… etc.

Additional Readings

Baker, B.D., Green, P.C., & Oluwole, J. (2013). The legal consequences of mandating high stakes decisions based on low quality information: Teacher evaluation in the Race-to-the-Top era. Education Policy Analysis Archives.

Green, P.C., Baker, B.D., & Oluwole, J. (2012). Legal implications of dismissing teachers on the basis of value-added measures based on student test scores. BYU Education and Law Journal, 2012(1).

Aggregated Ignorance…

Short one for today… on a personal pet peeve, which is apparently not only my pet peeve. Perhaps more than anything else, I hate it when pundits – who often have little clue what they are talking about to begin with – toss around big numbers with lots of zeros… or "illions" attached, in order to make their ideological point. Case in point this morning on Twitter:

Bear in mind, this is nothing new for this particular individual.

Thankfully, I don’t even have to write the critique of this utter foolishness, since the Center for Economic and Policy Research preemptively wrote it the other day! Here’s a portion of their explanation:

As we mark the 50th anniversary of the War on Poverty, it would be appropriate to note one of the main causes of its limited success, using big numbers without context. The issue here is a simple one; most people think that we have committed vastly more resources than is in fact the case to fighting this war. As a result, they are reasonably (based on their understanding) reluctant to contribute more resources. (emphasis added)

Please read the rest. It’s only a few paragraphs. Of course, if the intent is to deceive and warp public opinion, then Smarick and others are right on target.

Championing Fact-Challenged Facts

The New Teacher Project and Students First have recently posted/cross-posted one of the more impressively fact-challenged manifestos I’ve encountered.

The core argument in this recent post is that the facts on education reform speak for themselves and that the facts, as they describe them, simply need a champion – someone to make the public aware of these undeniable facts. However, the dreaded and evil teachers' union, with its stranglehold over the media and public opinion, is dead set on obfuscating those undeniable facts about the effectiveness of recent education reforms. As they put it:

The reality is that while unions and their allies have the motivation, discipline and resources to get their messages out and repeat them endlessly, the facts have no champion.

So then, what are these supposed facts that the teachers’ union has so successfully obfuscated?

The Facts According to TNTP/SF: U.S. Failure on PISA

According to TNTP and SF…

There’s no disputing that the results are pretty dismal—15-year-olds in the United States ranked 30th in math, 23rd in science and 20th in reading among participating industrialized countries. But the conversation about the PISA results was just as depressing.

Hayes argued that these results were a reflection of income inequality, not the poor quality of our schools, that we rank near the bottom because we have “so many test takers from the bottom of the social class distribution.” It’s a ridiculous assertion, and one that is easily disproved by a close look at the data, which compare the performance of students with similar socio-economic backgrounds around the globe. The wealthiest American 15-year-olds, for example—those in the top socio-economic quartile—rank 26th in math compared to their affluent peers elsewhere. In other words, poverty does not explain the poor performance of our K-12 education system. (Amanda Ripley has more on this, which you can read here.)

That’s right… no disputing. We all know it. It’s a simple fact. U.S. schools stink when compared on simple rankings to other countries… and this stinkiness can be attributed to bad teaching, limited choice and unions, of course. Okay… they didn’t say that… but it is certainly implied by the fact that their blog post blames unions, and Randi Weingarten specifically, for denying the facts and creating false public messages. Most importantly, Amanda Ripley, quantitative researcher extraordinaire, proves that poverty has nothing to do with our massive failure!

What do we actually know about U.S. Performance on PISA?

Here’s what I wrote back on PISA day!

With today’s release of PISA data it is once again time for wild punditry, mass condemnation of U.S. public schools and a renewed sense of urgency to ram through ill-conceived, destructive policies that will make our school system even more different from those breaking the curve on PISA.

With that out of the way, here’s my little graphic contribution to what has become affectionately known to the edu-pundit class as PISA-Palooza. Yep… it’s the ol’ poverty-as-an-excuse graph – well, really it’s just the ol’ poverty-in-the-aggregate-just-so-happens-to-be-pretty-strongly-associated-with-test-scores-in-the-aggregate graph… but that’s nowhere near as catchy.

[Figure: PISA math literacy scores vs. OECD relative child poverty, by country]

PISA Data: http://nces.ed.gov/pubs2014/2014024_tables.pdf (table M4)

OECD Relative Poverty: Provisional data from the OECD Income Distribution and Poverty Database (www.oecd.org/els/social/inequality).

Yep – that’s right… relative poverty – or the share of children in families below 50% of median income – is reasonably strongly associated with Math Literacy PISA scores. And this isn’t even a particularly good measure of actual economic deprivation. Rather, it’s the measure commonly used by OECD and readily available. Nonetheless, at the national aggregate, it serves as a pretty strong correlate of national average performance on PISA.

What our little graph tells us – though it’s not really that meaningful – is that if we account (albeit poorly) for child poverty, the U.S. is actually beating the odds. Way to go? (but for that really high poverty rate).

Bottom line – economic conditions matter, and simple rankings of countries by their PISA scores aren’t particularly insightful (the above graph being only marginally more insightful). Further, comparing cities in China to entire nations is a particularly silly approach.

But then how does one explain away Amanda Ripley’s supposedly brilliant rebuttal of the poverty concern? Note that she points to a table of how children in the top quartile within the United States, according to an OECD socioeconomic index, compare to children in the top quartile within other countries. This is a major math/logic fail on the part of Ripley and others interpreting these data. You see, the top quarter within a poorer country is, well, poorer than the top quarter within a richer country. So really, the above graph still applies.
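
If that point seems abstract, here is a toy simulation (the distributions are made up, not actual PISA SES data) showing that the "top quartile" of a poorer country is simply poorer, in absolute terms, than the top quartile of a richer one:

```python
import numpy as np

rng = np.random.default_rng(1)
richer = rng.normal(0.5, 1.0, 100_000)   # SES index, higher-SES country
poorer = rng.normal(-0.5, 1.0, 100_000)  # SES index, lower-SES country

for name, ses in [("richer", richer), ("poorer", poorer)]:
    top_quartile = ses[ses >= np.quantile(ses, 0.75)]
    print(f"{name} country, top-quartile mean SES: {top_quartile.mean():+.2f}")
# Both groups are "top quartile," but the poorer country's top quartile
# sits a full standard deviation lower on the very same index.
```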

But to illustrate my point, here are the countries – and Chinese Cities… and Singapore (hardly a relevant comparison) – ranked by math score, including some specific U.S. States. The top quarter of students in a “richer” U.S. state (because the top quarter among the rich are richer than the top quarter among the poor) seem to do pretty darn well… with Massachusetts being beaten only by Korea (along with select Chinese cities and Singapore – hardly relevant comparisons).  Of course, referring to these comparisons as comparing the wealthy, or affluent in one country to the wealthy, or affluent in another is offensive enough to begin with. It’s all relative.

[Figure: Countries, Chinese cities and U.S. states ranked by PISA math scores for the top socioeconomic quartile]

So, NO… the fact that the scores of our top quarter fall behind those of the top quarter in other nations does NOT by any means contradict the finding that poverty matters. In fact, breaking out U.S. states of varied poverty levels and ranking them among countries in this very graph provides additional support that economic context remains the primary driver of jurisdictional aggregate test score comparisons (or maybe these scores prove that Florida’s education reforms are a dreadful failure?).

The Facts According to TNTP/SF: Test Based School Closures Improve Outcomes!

This particular quote is truly baffling, since the linked study provides no support for the actual claim it makes – that policies such as closing failing schools based on test-score-based accountability are leading to performance gains.

And research also shows that these gains were not achieved through happenstance. They were caused, in part, by the very policies Randi decries, such as closing failing schools based on test-score-based accountability systems.

What does the linked study actually say?

The MDRC study linked above focused on the longer term outcomes of students attending small high schools in New York City. While it may be the case that some students migrated to these small high schools after having their larger neighborhood high schools closed – for any number of reasons, including test-based accountability – that was not the emphasis of the study. As stated in the study summary itself, here are the findings:

  • Small high schools in New York City continue to markedly increase high school graduation rates for large numbers of disadvantaged students of color, even as graduation rates are rising at the schools with which SSCs are compared. For the full sample, students at small high schools have a graduation rate of 70.4 percent, compared with 60.9 percent for comparable students at other New York City high schools.
  • The best evidence that exists indicates that small high schools may increase graduation rates for two new subgroups for which findings were not previously available: special education students and English language learners. However, given the still-limited sample sizes for these subgroups, the evidence will not be definitive until more student cohorts can be added to the analysis.
  • Principals and teachers at the 25 small high schools with the strongest evidence of effectiveness strongly believe that academic rigor and personal relationships with students contribute to the effectiveness of their schools. They also believe that these attributes derive from their schools’ small organizational structures and from their committed, knowledgeable, hardworking, and adaptable teachers.

The Facts According to TNTP/SF: DC & Tennessee NAEP Gains!

And finally, here’s one I’ve blogged about more than once in recent months – the bold and completely unfounded claim that NAEP gains in Washington DC and Tennessee provide proof positive of the value of recent “reforms” toward improving student outcomes.

So why the cognitive dissonance? While no one should be declaring victory based on these results (a large majority of kids in New York still do NOT graduate college-ready), you might expect that the city’s results (and the most recent NAEP results, which show similarly impressive gains in Washington, D.C. and Tennessee) would give Weingarten and like-minded stakeholders some pause before they continue to issue blanket indictments of the reform agenda.

And about that claim of DC & Tennessee “impressive” gains?

As I explain in my recent post, for these latest findings to actually validate that teacher evaluation and/or other favored policies are “working” to improve student outcomes, two empirically supportable conditions would have to exist.

  • First, that the gains in NAEP scores have actually occurred – changed their trajectory substantively – SINCE implementation of these reforms.
  • Second, that the gains achieved by states implementing these policies are substantively different from the gains of states not implementing similar policies, all else equal.

And neither condition holds, as I explain more thoroughly here! But here’s a quick graphic run down.

First, major gains in DC actually started long before recent evaluation reforms, whether we are talking about Common Core adoption or DC IMPACT. In fact, the growth trajectory really doesn’t change much in recent years. But hey, assertions of retroactive causation are actually more common than one might expect!

Figure 11. [DC NAEP score trends over time]

Note also that DC has experienced demographic change over time, an actual decline in cohort poverty rates over time and that these supposed score changes over time are actually simply score differences from one cohort to the next. This is not to downplay the gains, but rather to suggest that it’s rather foolish to assert that policies of the past few years have caused them.
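
For those who want to kick the tires on this sort of claim themselves, one simple check is to fit the pre-reform trend and ask whether post-reform scores depart from it. The scores below are placeholders, not actual NAEP values – the method, not the numbers, is the point:

```python
import numpy as np

years = np.array([2003, 2005, 2007, 2009, 2011, 2013])
scores = np.array([205.0, 211.0, 214.0, 220.0, 222.0, 229.0])  # placeholders
reform_year = 2010  # e.g., a hypothetical marker for the recent reform era

pre = years < reform_year
slope, intercept = np.polyfit(years[pre], scores[pre], 1)
predicted = slope * years + intercept

for y, s, p in zip(years, scores, predicted):
    tag = " (post-reform)" if y >= reform_year else ""
    print(f"{y}: actual {s:.0f}, pre-reform trend predicts {p:.1f}{tag}")
# If post-reform actuals track the pre-reform line, the "gains" are just a
# continuation of an existing trajectory, not evidence of a reform effect.
```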

Second, comparing cohort achievement gains (adjusted for initial scores… since lower scoring states have higher average gains on NAEP) with STUDENTS FIRST’s own measures of “reformyness,” we see first that DC and TN really aren’t standouts, that other reformy states actually did quite poorly (states on the right hand side of the graphs that fall below the red line), and that many non-reformy states like Maryland, New Jersey, New Hampshire and Massachusetts do quite well (states toward the middle or left that appear well above the line).

Needless to say, if we were to simply start with these graphs and ask ourselves who’s kickin’ butt on NAEP gains… and whether states with higher grades on Students First policy preferences are systematically more likely to be kickin’ butt, the answers might not be so obvious. But if we start with the assumption that DC and TN are kicking butt and have the preferred policies, and then just ignore all of the others, we can construct a pretty neat – but completely invalid – story line.

Figure 12. [NAEP cohort gains vs. StudentsFirst policy grades, by state]

And those, my friends, are the facts!

Thoughts on Elite Private Independent Schools and Public Education Reforms

I was informed by my brilliant and thoughtful cousin Bill the other day that on Jan 6-7 in Washington, DC, John Chubb, the new head of the National Association of Independent Schools, is convening what he refers to as a Prominent Research Gathering, described here:

NAIS will convene leading economists and educational research professionals with a cross section of independent school thinkers on January 6-7 at the association’s DC offices to address the economics of independent schools. The group will identify market trends affecting independent schools, new business models that will drive growth, and methodologies to measure and articulate the benefits of an independent school education.

There are many reasons why this gathering is both interesting and somewhat disconcerting.

First, few of the “prominent researchers” invited have actually done much, if any, research pertaining to private schools generally, or to NAIS and NAIS-type schools specifically.

Really, only Peter Cookson has written anything of substance on private independent schools (specifically on elite boarding schools). Others have opined broadly about private schooling writ large, usually in the context of voucher models. In fact, some (if not many) of these researchers often falsely project issues affecting one set of private schools onto all private schools.

In one particularly egregious example, Checker Finn recently proclaimed the impending collapse of private schooling, implying strongly that private schools invariably were in trouble and unsustainable.

What Checker Finn seems to have missed is that a) overall, private school enrollment shares in the U.S. actually aren’t declining (as evidenced by the American Community Survey), and b) that declining enrollments in private schools where they do exist appear relatively isolated among Catholic parochial schools – NOT NAIS/INDEPENDENT Schools.

Figure 1. Private Schooling as a Share of Population (by Income Group)

Figure 2. Private School Enrollments by Affiliation

But you see, unless you can create a crisis, or at least a feeling of crisis in the air, then you can’t scare enough people into rashly adopting ill-conceived policies that serve your goals (not necessarily theirs). That’s how the crisis mentality works, and that’s certainly the message of this particular group of researchers and those posing as researchers.

Note to NAIS leaders who may be graced with this message of impending doom today and tomorrow: reports of the death of private schools are greatly exaggerated!

I might go so far as to argue that some of those on the invited panel – through their repeated claims that public schools are wasteful and inefficient, must have their budgets slashed (the “new normal”), should reduce teacher compensation and increase class sizes, and should use test-score-based models to shed “weak teachers” – have exerted a strong negative influence on public school quality. Further, the policies endorsed by many in this crowd have arguably led to a decline in support for and funding of public schooling, to the point where private schooling alternatives are quite likely to benefit.

Second, the common threads and policy preferences of those invited run in stark contrast with goals and preferences of private independent schooling!

Now, it may in fact be John Chubb’s point to encourage private independent schools to get on board with the current reform preferences advocated by the members of this esteemed, generally like-minded panel.

I would counter that private independent schools would be better positioned by maintaining their differentiation, and sadly, by capitalizing on the damage many of these individuals have inflicted and continue to inflict on public school systems via their disproportionate leverage with select policymakers.

What are some of the specific policy messages from this crowd?

Many of them have written repeatedly that small class size simply doesn’t matter – that it’s too expensive and wasteful.

Matt Chingos called class size reduction the “most expensive” reform, albeit without ever actually providing a legitimate cost comparison to anything else (in my view, when you say “most” you really have to compare to something).

Hanushek and Hoxby, too, have claimed on numerous occasions that class size reduction and lower pupil-to-teacher ratios are inefficient and ineffective. In each case, the claims rest only on a lack of statistical relationship to tested student outcomes (discarding strong evidence to the contrary).

The work of these individuals has been used repeatedly to justify increases to class size in major urban districts, to levels unsupported, and unsupportable, by any legitimate research! (for a summary of the research, see: http://www.shankerinstitute.org/images/doesmoneymatter_final.pdf)

But check almost any prominent independent school web site and you’ll often see specific reference to small class size.  Notably, most Harkness Tables are made to seat about 12 students.

Accepting Hanushekian and Chingosian preferences, we might just have to start making Harkness Tables to seat 30+ students. I suspect most Independent School leaders realize how utterly foolish such a move would be.

Alternatively, we might just follow Checker Finn’s adoration of the Rocketship charter school model, which, instead of those cumbersome Harkness tables that actually encourage students to face one another and engage in intellectual debate, places students in cubicles with computers or tablets!

Figure 3. From Harkness to Rocketship

I suspect that’s just what every parent seeking an individualized, rich, balanced education for their child is looking for, right?

What do we know about private independent school pupil-to-teacher ratios (for lack of specific, comparable class size data)? Well, back in 2009, when I did my report dissecting the private school marketplace by affiliation, I found that NAIS and NIPSA schools tended to provide pupil-to-teacher ratios slightly greater than half those of public districts.

Figure 4. Pupil to Teacher Ratios

Now, I also suspect that independent school leaders view small class size as contributing to more than just marginal gains in measured standardized achievement scores. And that is the narrow perspective to which Chingos, Hoxby, Hanushek and others speak when they cast doubt on cost-effectiveness of class size reduction policies, based on what they characterize as weak statistical evidence and modest effects.

Small class sizes provide a unique learning environment, provide the opportunity for teachers to keep closer track of student learning, and also serve as a beneficial working condition for recruiting and retaining teachers. And small class size remains “marketable.” Prospective private school (and current public or private school) parents seem relatively unconvinced that their child would be as well off in a class of 30 with a “great” (albeit really hard to measure) teacher as in a class of 12 with an “average” teacher. Of course, private independent schools can (at least attempt to) lay claim to providing both “exceptional teachers” and small classes.

Many have argued that public school districts simply spend too much, are underproductive, inefficient and wasteful (in large part because they try to provide small class sizes).

Professor Hanushek in particular has made a fine living providing testimony that public school districts – regardless of how much money they already have or spend – simply have and spend too much. They are inefficient and wasteful, the argument goes, and should not be provided any additional resources until we change the way they operate (increase class sizes, impose merit pay, deselect bad teachers).

Such is the nature of his testimony recently provided to the Kansas courts, but thankfully the three-judge Kansas panel wasn’t having it! Specifically, regarding Hanushek’s premise that because money is spent so inefficiently, cuts imposed could do no harm, the panel opined:

This is simply not only a weak and factually tenuous premise, but one that seems likely to produce, if accepted, what could not be otherwise than characterized as sanctioning an unconscionable result within the context of the education system.

Now, if public districts are so woefully inefficient in their exorbitant spending, driven largely by small class sizes, I shudder to think what Hanushek would think of NAIS school spending, were he ever to take a look at it.

Private independent DAY schools tend to spend nearly twice as much per pupil as local public districts in the same labor market!

Figure 5. Per Pupil Spending (1): Nationally

Figure 6. Per Pupil Spending (2): Within Metro

Many have argued that schools should rely more heavily on student assessment data to evaluate and remove bad teachers (teacher deselection)

This argument is perhaps most attached to Hanushek – who crafted a nifty hypothetical simulation showing how if U.S. schools simply used value-added estimates to annually fire the bottom 5% of teachers, we could become Finland (at least in terms of test scores) in a decade! Several writers have challenged the logic of Hanushek’s assertions as well as the usefulness of this approach as an actual Human Resource Management tool. (Yes, even in private sector business)

Really, any thoughtful private school leader understands just how ill-conceived this approach is, especially when applied in the context of the typical private independent day or boarding school.

  • First, I suspect many parents would be less than thrilled at the prospect of the annual – spring and fall – standardized testing (weeks on end) in every subject, every year, for every student, required to estimate the optimal deselection statistical model.
  • Second, and this is true even in public districts, a good manager only seeks to shed his/her weakest link if he/she has some confidence that the weak link can be replaced with someone “better.”
  • Third, personnel decisions are complex and involve figuring out not just what a teacher might contribute to test scores in one content area, but how that teacher contributes to the community as a whole. This is especially true of private independent schools and a seemingly foreign concept to many on this esteemed panel!

And many have argued that technology can be an efficient replacement for brick and mortar classrooms and living/breathing teachers

As mentioned above, the folks at the Thomas B. Fordham Institute during the reign of Checker Finn (and likely still) certainly had a love affair with models like Rocketship Education and online learning more generally. But as the pictures above suggest, these models are in stark contrast with current preferences for private independent schooling, and I can’t see these approaches being in high demand among the parent population currently seeking out NAIS schools. For a more thorough analysis of the costs of online learning alternatives, see this report!

Of course, among the “researchers” in this mix, claims about the costs and cost effectiveness of online learning range from suspect to completely made up! Heads up to anyone attending this event: please see this patently absurd claim by Marguerite Roza regarding the supposed efficiency gains achieved by implementing “technology” solutions.

Many have in their writing advocated the virtues of vouchers

But a) the vast majority of research to which they point on this topic involves voucher models in large urban settings where most children apply the vouchers to Catholic schools, and b) these authors have never considered vouchers awarded at the levels of tuition and expenditure that exist for most NAIS schools. This is precisely the reason why most elite independent schools have not participated in voucher programs even where the opportunity exists (DC Scholarships). Voucher levels offered generally fall well below 50% of per-pupil operating costs for independent schools, requiring the school to provide substantial financial aid to offset costs, thus limiting their capacity to serve voucher-receiving students.

To extend Hanushek’s usual reasoning regarding public school spending, offering vouchers at the cost of private independent schooling would clearly be inefficient and wasteful. Why would anyone allocate a voucher at twice the average public district expense, simply to give kids access to small classes, which of course don’t matter?

Notably, at least some involved on this esteemed panel are prone to stretching their findings regarding the benefits of vouchers (see here, and here).

Many have found that peer effects matter!

Hanushek, Hoxby and Zimmer have each found that who you go to school with matters – that is, the composition of a student’s peer group affects how, and how much, each student learns.

I suspect that most private independent school leaders already get that!

To conclude

I suspect that heads of leading private schools will see that the proposals forwarded to them by this supposed esteemed research panel simply aren’t a good fit for the typical private independent school.  For those seeking a new marketing niche, might I suggest my fully research-based school about which I wrote some time ago. I would strongly assert, and other prominent scholars seem to agree, that these proposals aren’t a good fit for public districts either.  Nor are they representative of leading research on education, education interventions, public and private schooling productivity, cost and efficiency.

In fact, the now decades-long hoisting of these strategies onto public districts may just be the best thing going for private schools. Heavy-handed standardization of public schooling, over-testing, resource deprivation, and the broad political campaign to undermine the teaching profession are quickly rendering public districts less desirable places both to work and to attend, leading teachers, parents and children on the margins – those who might not otherwise have considered private schooling – to give it a second look. [the one potential threat being the emergent quasi-private suburban charter school]

Additional Readings:

For a better concept of private schooling distribution, labor markets and spending behavior, I encourage reading my 2009 report (based largely on 2006-2007 data).

For a thorough discussion of how and why money, class size and other resources matter in education, see:

Finally, for a discussion of the lack of research, and weak assumptions behind many of the proposals advanced by these scholars (and pseudo-scholars), see:

Posts in which I mention:

Matt Chingos

Eric Hanushek

Marguerite Roza

Center on Reinventing Public Education (Robin Lake)

Checker Finn

Response from John Chubb:

Dear Prof. Baker,

NAIS will be posting more details of the research meeting later this week. I think you will find that the meeting has a very different aim than you suggest.

The purpose of this meeting is to help NAIS develop its own robust research agenda that will best serve the interests of its members. In surveys of the top issues facing independent schools, members have asked NAIS to research financial models, new ways to demonstrate the value that independent schools add to students’ lives, and emerging issues that will inform schools’ strategic planning.

This meeting convenes researchers and thinkers who have experience in different areas (economics, education, etc.). Our intent was to bring together people whose diverse opinions and expertise could challenge NAIS as we determine which research topics will help independent schools thrive long into the future. We have been discussing what we should research, but also how we can gather the most useful information from various research projects.

For me, day one of the meeting has confirmed that brainstorming with people outside your own industry not only helps inspire new ideas, but it also helps articulate and reinforce the core values and attributes (many of which you mentioned) that matter most to members.

Sincerely,
John E. Chubb

My Reply

I appreciate your response and look forward to what comes of this meeting.

However, I would assert that the group you’ve convened is anything but diverse in terms of its views on effective and efficient resource allocation in education. Notably, few of these individuals actually work on financial models or resource allocation to begin with, except for their frequently stated views on class size, teacher compensation and overall spending, which clearly relate to resource allocation choices. Those on this panel who do focus on resource allocation more explicitly have a tendency to promote completely unfounded approaches (see: http://edr.sagepub.com/content/41/3/98.short).

Thanks again. I look forward to hearing more about the outcomes of this meeting.

Bruce

Data, Portfolios & the Path Forward for NYC (& Elsewhere)

As the new year begins, I’ve been pondering what I might recommend as guiding principles for the path forward for education policy in New York City under its new Mayor, Bill de Blasio, who is often referred to on Twitter as BDB. So here are my thoughts for the way forward, from one BDB (Bruce D. Baker) to another.

Note that I had drafted much of this content last spring when convening with a group of scholars to discuss the path forward for NYC education policies. Not being as well versed in the specifics of NYC education policies, but having at least written academically about some, I kept my ideas broad, and applicable to many educational settings across the U.S.

My recommendations fall into two broad categories:

Develop a robust, balanced, least intrusive system of indicators for evaluating New York City Schools and then use that information appropriately

NYC BOE policies of the past ten years have been rife with data abuse (though at times, merely in an effort to comply with state required data abuse). School closures have been based on ill-conceived measures of “school failure” which do little more than target the city’s neediest student populations, imposing on them repeated disruptions.

New York City’s teacher performance reports, albeit better than many, apply the worst form of statistical reductionism to quantify teacher “quality”: taking noisy statistical estimates of the association between teachers-of-record and their assigned students’ test score gains (applying only the most convenient statistical corrections) in limited curricular areas and grades, and assuming levels of precision and accuracy that are completely unwarranted.

Such data abuse – on both counts [school closures and teacher ratings] – is reprehensible.

Right-sized (NOT BIG) data can indeed be useful for guiding decision-making in large, complex urban education systems. But data should never be the exclusive determinant of policies or other high stakes decisions.

Human judgment matters, including human interpretation of the meaning and usefulness of data as it informs decisions which ultimately affect other human beings.

New York City should give serious consideration to how data are collected, maintained and ultimately used for informing policy and decision making. Four guiding principles are:

  1. Emphasis should be on understanding what the data can and cannot tell us about schools, their climate, students and their achievement and the role of teachers, leaders, programs and services. Policies should emphasize how various constituents can make sense of data, coupled with their knowledge and experience, to inform the path forward.
  2. Data should NEVER dictate decisions. Rather, data may inform them. Along these same lines, despite ill-conceived requirements of state policies, imprecise information (which includes nearly all social science measures) should NEVER be treated as determinative, attaching specific consequences to specific scores or estimates (splitting hairs that cannot or shouldn’t be split).
  3. Data systems must better capture the scope of public service that is public schooling in a modern era. This means collecting more than just that which is most easily quantifiable, and more than just achievement test scores on mandated core curriculum.
  4. Data should be collected in the least intrusive manner necessary to draw valid inference, or provide valid descriptive profiles.

School principals and leadership teams should have available to them sufficient and appropriate data to guide – Not Dictate – building level management decisions. Information might include typical measures of student achievement as well as measures of gain, but also include measures on longer term outcomes of students who attended any given school (graduation, college attendance/persistence) – linked longitudinally to both outcome data while they were in attendance at the school as well as data on programs and services in which they participated. Data on students should similarly be traceable backwards. Data might also include indicators of parent and student perceptions of school environment, etc.

Data should attempt to capture not only limited, easily measurable “outcomes” but also more accurately measure inputs and resources as well as characterize ongoing educational processes.  After all, a central objective of data collection and maintenance is to be able to make connections between inputs, process and outcomes. And the central objective of city leaders should be to ensure equitable and adequate inputs and processes, to support the achievement of more equitable, more adequate outcomes. Not the other way around.

An important consideration is that data should be collected more strategically so as to be far less intrusive than current practices on the actual educational processes being monitored with the data. That is, the goal of actors within the system should not be to improve the measured data elements, but rather to more substantively improve their practices in ways that lead to shifts in the measured data elements, assuming we’re measuring the right things (often a bold assumption).

Lengthy performance assessments, achievement tests or survey instruments need not be given every year to every child. Appropriate sampling can achieve robust data with far less intrusion (or expense). Such is the design of assessment systems like the National Assessment of Educational Progress. Providing samples of items to samples of students across schools can reduce cost, reduce intrusion, reduce the likelihood of teaching to the test, item familiarity and other threats to validity, and thus provide more useful information. This approach also reduces the digital record maintained on any one student.
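
Here is a rough sketch of what such matrix sampling looks like (all of the sizes – items, blocks, students – are invented for illustration; NAEP's actual design is considerably more sophisticated):

```python
import random

random.seed(42)
item_pool = [f"item_{i:03d}" for i in range(120)]          # full item pool
blocks = [item_pool[i:i + 15] for i in range(0, 120, 15)]  # 8 blocks of 15 items

students = [f"student_{i}" for i in range(400)]
sampled = random.sample(students, 80)                      # a sample, not a census

# Each sampled student sees one small block; the pool is covered across students
assignments = {s: random.choice(blocks) for s in sampled}

student, block = next(iter(assignments.items()))
print(f"{student} answers {len(block)} of {len(item_pool)} items")
# Aggregating across the sample still supports school- or district-level
# estimates, with far less testing time and a thinner record on any one child.
```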

As a basic rule of thumb, high stakes decisions should never be made with low quality information.  One should never conclude with certainty based on uncertain information.

The city should avoid the urge to apply categories to otherwise continuous and noisy data – such as applying specific cut points and imposing quality/value judgments based on those cut-points. Few if any measures collected in social science, including test scores from multiple choice assessments given to nine and ten year old children, are sufficiently precise for making high stakes determinations by splitting hairs between getting 20 versus 21 (or even 20 vs 25) correct responses. Most of the types of data collected in such an environment are simply not sufficiently precise for such determinations.
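
A quick back-of-the-envelope illustration of the hair-splitting problem, assuming a test scored in raw points with a standard error of measurement of about 2 points (an assumed, though not unusual, value):

```python
sem = 2.0                  # assumed standard error of measurement (raw points)
score_a, score_b = 20, 21  # two students on either side of a cut score

for score in (score_a, score_b):
    low, high = score - 1.96 * sem, score + 1.96 * sem
    print(f"observed {score}: plausible true score roughly {low:.1f} to {high:.1f}")
# The two intervals overlap almost entirely; a cut score drawn between 20
# and 21 sorts statistically indistinguishable students into different bins.
```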

Complex decisions like school closures and reorganization require multiple perspectives and varied forms of information/data. A school should not be closed simply on the basis of one or a handful of bad performance indicators – typical school report card elements. The role of schools in communities should not be completely ignored. Nor should the extent to which the same children’s lives/educations are repeatedly disrupted. More mundane considerations including transportation efficiency & facilities quality/efficiency/fit are likely more relevant than student outcomes when considering school closings. In fact, rarely if ever are “low tested outcomes” a legitimate reason for school closure. Rather, they are usually an indicator of other underlying processes – some non-school and perhaps some school processes – requiring far more thoughtful intervention than the current slash-and-burn approach to “failing schools.”

Similarly, data may inform but should never dictate human resource decisions. Such is the core problem with recently adopted statewide teacher and principal evaluation models that prescribe percentages of evaluations that must be dictated by X, Y or Z, and that require specific personnel actions be taken when the numbers fall into preset categories.

As my colleagues and I explain in a recent article,

Arguably, a more reasonable and efficient use of these quantifiable metrics in human resource management might be to use them as a knowingly noisy pre-screening tool to identify where problems might exist across hundreds of classrooms in a large district. Value-added estimates might serve as a first step toward planning which classrooms to observe more frequently. Under such a model, when observations are completed, one might decide that the initial signal provided by the value-added estimate was simply wrong. One might also find that it produced useful insights regarding a teacher’s (or group of teachers’) effectiveness at helping students develop certain tested skills.

School leaders or leadership teams should clearly have the authority to make the case that a teacher is ineffective and that the teacher, even if tenured, should be dismissed on that basis. It may also be the case that the evidence would actually include data on student outcomes – growth, etc. The key, in our view, is that the leaders making the decision – as indicated by their presentation of the evidence – would show that they have reasonably used information to make an informed management decision. Their reasonable interpretation of relevant information would constitute due process, as would their attempts to guide the teacher’s improvement on measures over which the teacher actually had control.
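
To make the pre-screening idea in the quote above concrete, here is a minimal sketch (classroom names, estimates and standard errors are all invented) of what using value-added as a knowingly noisy screen – and nothing more – might look like:

```python
# (classroom, value-added estimate, standard error) - all values invented
classrooms = [
    ("room_101", -0.45, 0.15),
    ("room_102", -0.05, 0.20),
    ("room_103",  0.30, 0.22),
    ("room_104", -0.35, 0.30),
]

# Flag classrooms whose estimates are low even after accounting for noise;
# the flag triggers more frequent observation, not a personnel action
flagged = [(room, est) for room, est, se in classrooms if est + 1.96 * se < 0]
for room, est in sorted(flagged, key=lambda pair: pair[1]):
    print(f"{room}: estimate {est:+.2f}, schedule additional observations")
# Note that room_104 is NOT flagged: its estimate is low but too noisy to act
# on. And observers may still conclude a flagged signal was simply wrong.
```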

Put simply, mindless reliance on prescribed metrics is not effective human resource management whether in the private sector or in public schools.

Additional Readings:

Balance choice with support for equity & access

The theme of individual (parent/child) liberty via parental choice that has dominated the past decade of education policy in New York City (and elsewhere) must be counterbalanced with a greater emphasis on equity and equality of opportunity and access. This requires considering carefully the geographic and socioeconomic distribution of educational opportunities at all grade levels across all children.

Choice in and of itself does not ensure equity. This false premise promoted by many “education reformers” runs counter to centuries of political theory, which explains that liberty (a core tenet of choice) and equality are in constant tension with one another, and only at some extreme point might they “meet and be confounded together” (Tocqueville, Alexis de, Democracy in America, volume 2, part II, chapter 1).

Implicit in policy preferences for choice program expansion is the notion that more children should have the choice to attend “higher quality” schooling options, and that such options will emerge as a function of the competitive marketplace for quality schooling – with little attention to the level of resources provided or other prerequisite conditions for sustaining an equitable distribution of quality schooling.

The notion that one would provide, via public subsidy, “higher quality” alternatives also means consciously providing lower quality ones. That is, it means consciously endorsing a policy of such inequity that the parents of children presently attending “low(er) quality” schools will endure the transaction costs (family/child disruption, geographic inconvenience) to move their child from their neighborhood schools. This is simply wrongheaded.

Even more wrongheaded are policies yielding outright deprivation by labeling neighborhood schools as failing and shuttering them on false pretenses (low test scores as a method of placing blame), leaving parents to scramble to find an acceptable alternative (one that is merely better than nothing). Such policies create a false sense of demand for those alternatives (typically charters), further advancing current policy preferences. That is, charter waiting lists are argued to provide validation for further charter expansion, even when those waiting lists have been induced by school closures.

Recent policy preferences are built on the assumption that the liberty achieved by choice programs serves as a substitute for the provision of broad-based, equitable and adequate financing. Studies purporting to show significant advantages for students attending charter schools have invariably neglected to evaluate those schools’ access to financial resources (see also) (while selectively evaluating outcomes of children attending those schools with access to resources), frequently downplaying the importance of money or the relevance of equity as traditionally conceived.

Advocates suggest that if some children are made better off by the presence of higher quality options, all are better off and certainly no one is worse off.  This too is false. In forthcoming work, I explain that:

Baker and Green (2008) as well as Koski and Reich (2006) explain that to a large extent education operates as a positional good, whereby the advantages obtained by some necessarily translate to disadvantages for others. For example, Baker and Green (2008) explain that “In a system where children are guaranteed only minimally adequate K–12 education, but where many receive far superior opportunities, those with only minimally adequate education will have limited opportunities in higher education or the workplace.” (p. 210) This concern is particularly pronounced in a city like New York where children and families are constantly jockeying for position to gain access to selective admissions public middle and secondary schools, and where the majority of charter schools serve elementary and middle grades. The competitive position of children in otherwise similar district or charter schools with fewer resources is compromised by the presence of better resourced district or charter schools. Though surely, all would be less well off if all were substantially though equally deprived.

Variation in resources across private providers, as well as across charter schools tends to be even greater than variation across traditional public schools (Baker, 2009, Baker, Libby & Wiley, 2012). Further, higher and lower quality private and charter schools are not equitably distributed geographically and broadly available to all. In the most extreme case, in New Orleans following Hurricane Katrina where traditional district schools were largely wiped out, and where choice based solutions were imposed during the recovery, entire sections of the city were left without secondary level options and provided a sparse few elementary and middle level options (Buras, 2011).

Baker, Libby and Wiley show that in New York City, charter expansion has yielded vastly inequitable choices. Most New York City charter school networks serve far fewer children qualifying for free lunch (<130% poverty level), far fewer English language learners and far fewer children with disabilities than same grade level schools in the same borough of the city. These patterns of student sorting induce inequities across schools. But, these schools also have widely varied access to financial resources despite being equitably funded by the city. Some charter networks are able to outspend demographically similar district schools by over $5,000 per pupil, and to provide class sizes that are 4 to 6 (or more) students smaller.

Put simply, one cannot assume that providing a “system of great schools” will necessarily yield an equitable system of high quality, operationally efficient schools. It hasn’t and it won’t – in New Orleans, New York, Sweden or anywhere. City leaders must actively manage the provision of an equitable, high quality and operationally efficient school system rather than simply assuming that a system of great schools will necessarily accomplish that goal.

Moving forward in the short term:

  1. The city should develop more transparent, comparable reporting of district and charter school site-based revenues and expenditures, inclusive of a) private contributions by source and b) in-kind expenditures from parent organizations, including salaries and benefits of centrally employed staff.  More detailed reporting of soft money and in-kind contributions may provide insights regarding policy efforts to improve resource equity between charter and district schools, and among charter schools.
  2. All schools operating within the city should be brought under the same policy umbrella to ensure more equitable distributions of students and the resources to serve them. This means financing charter schools in accordance with the student populations they serve, via weighted student funding. This also means considering policy alternatives for balancing resource access across schools, given their widely varied access to private resources.
  3. City leaders should push state leaders for the billions in resources still owed city school children in the years since the ruling in Campaign for Fiscal Equity.
  4. Finally, given the increased organizational complexities of privately governed and managed charter schools, the city should take steps to ensure that children’s and employees’ rights remain equally protected (when compared with their peers in “government operated” district schools). The choice between private management and public provision of schooling is not benign and should not be taken lightly. Increasingly, federal and state case law is revealing that children’s and employees’ rights are substantively lessened in schools managed and operated by private entities. In addition, taxpayer rights to gain access to the records and finances of private providers have also been interpreted by courts as more limited (than access to similar information from government entities).

Additional Readings

  • Baker, B.D., Libby, K., & Wiley, K. (2012). Spending by the Major Charter Management Organizations: Comparing charter school and local public district financial resources in New York, Ohio, and Texas. Boulder, CO: National Education Policy Center. Retrieved [date] from http://nepc.colorado.edu/files/rb-charterspending_0.pdf
  • Baker, B.D., Libby, K., Wiley, K. Charter School Expansion & Within District Equity: Confluence or Conflict? Education Finance and Policy
  • Baker, B.D. (2012). Review of “New York State Special Education Enrollment Analysis.” Boulder, CO: National Education Policy Center. Retrieved [date] from http://nepc.colorado.edu/thinktank/review-ny-special-ed.
  • Buras, K. L. (2011). Race, charter schools, and conscious capitalism: On the spatial politics of whiteness as property (and the unconscionable assault on black New Orleans). Harvard Educational Review, 81(2), 296-331.

The Post-Equity Era in School Finance

I’ve written a few posts in recent months where I’ve raised concern about the apparent complete disregard (and outright ignorance) of the role of equitable and adequate financing of our public schools. The bottom line is that providing for a high quality, equitably distributed system of public schooling in the United States requires equitable, adequate, stable and sustainable public financing. There’s no way around that. It’s a necessary underlying condition.

I too often hear pundits spew the vacuous mantra – it doesn’t matter how much money you have; it matters more how you spend it. But if you don’t have it, you can’t spend it. And if everyone around you has far more than you, their spending behavior may just price you out of the market for the goods and services you need to provide (quality teachers being critically important, and locally competitive wages being necessary to recruit and retain quality teachers). How much money you have matters. How much money you have relative to others matters in the fluid, dynamic and very much relative world of school finance (and economics more broadly). Equitable and adequate funding matters.

But alas, it seems that one of the first things to go when the economy tanked a few years back was any sense that equity could ever be important. Take, for example, NY Governor Cuomo’s recent response to a challenge to racial disparities in funding shortfalls in his state.

Asked in a radio interview this morning about Schenectady Schools Superintendent Larry Spring joining those filing a federal civil rights complaint against the state alleging its school funding mechanism shortchanges minority students and those with disabilities, Gov. Cuomo didn’t so much answer the question as elaborately re-phrase it.

That’s called democracy, and that’s what the Legislature debates every year and what is the fair amount of funding for each district. And should a rich district get no money because they’re a richer district, or should they get more money because they put in more money? Should the needier districts get all the money because they’re needier even though they put in less? And that is the annual debate of the state budget and the education funding formula.

Over the past several weeks, as I embark on a new project evaluating the past 20 years of funding equity and funding level shifts across states, I’ve begun playing around with alternative ways to characterize changes to funding over time, and to evaluate the causes of those changes. I’ve explained in previous posts how the amount of total state aid, for example, is only part of the puzzle. The extent to which state aid is targeted according to local fiscal capacity and need matters most for determining whether increased state aid will improve overall equity. For example, here is New Jersey state aid per pupil and total state and local revenue per pupil in 1997 and again in 2007.

In 1997, districts with higher poverty rates were already receiving higher levels of state aid than their less needy counterparts. But the differences in aid were not sufficient to create an overall upward tilt – a progressive pattern.

Figure 1 – NJ in 1997


By 2007, the infusion of state aid into high need districts had pushed those districts to a point where they were better positioned to provide smaller class sizes and to pay more competitive wages. The state aid had been sufficiently targeted to achieve an overall progressive distribution of state and local revenue.

Figure 2 – NJ in 2007

What I’ve been working on lately is tracking the relative progressiveness/regressiveness of state and local revenues, and of state aid, over time from 1993 to 2011 for all states, using the same “fairness ratio” we use in our annual report on school funding fairness.

Imagine, for example, having a state and local revenue fairness indicator for every year, for each state, from 1993 to 2011.

  • Where the index is 1.0, a district with 30% children in poverty (census poverty) is expected to have about the same state and local revenue per pupil as a district with 0% poverty.
  • Where the index is 1.2, a district with 30% children in poverty is expected to have about 120% (20% more) revenue per pupil than a district with 0% poverty.
  • And where the index is .8, a district with 30% poverty is expected to have about 80% of the revenue of a district with 0% poverty.

As in this hypothetical, one can track the changes in targeting of state aid along with the changes in overall state and local revenue fairness. Note that even if state aid fairness stays constant from one year to the next, changes in local revenue raising patterns can alter equity. Further, state aid might be allocated relatively “fairly” but at too low a level to improve overall state and local revenue equity.
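To make the indicator concrete, here’s a minimal sketch of how one might compute such a fairness ratio from district-level data. This is an illustration, not the exact specification from our fairness report: the column names (rev_pp, pov_rate, enroll) are hypothetical stand-ins, and I’m using a simple enrollment-weighted regression of log revenue per pupil on poverty rate, comparing predicted revenue at 30% versus 0% poverty.

```python
# Minimal sketch (hypothetical columns): an enrollment-weighted regression of
# log revenue per pupil on district poverty rate, then a "fairness ratio"
# comparing predicted revenue at 30% poverty to predicted revenue at 0%.
import numpy as np
import pandas as pd
import statsmodels.api as sm

def fairness_ratio(df: pd.DataFrame) -> float:
    X = sm.add_constant(df["pov_rate"])              # poverty rate as 0-1 proportion
    y = np.log(df["rev_pp"])                         # log state+local revenue per pupil
    fit = sm.WLS(y, X, weights=df["enroll"]).fit()   # weight districts by enrollment
    pred_30 = np.exp(fit.params["const"] + fit.params["pov_rate"] * 0.30)
    pred_0 = np.exp(fit.params["const"])
    return pred_30 / pred_0                          # 1.2 => 20% more revenue at 30% poverty

# Track the indicator over time, one ratio per year:
# trend = {yr: fairness_ratio(g) for yr, g in districts.groupby("year")}
```

Run once per year per state, this yields exactly the kind of trend line plotted in the hypothetical below.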

In my forthcoming academic papers on this topic, models include multiple moving parts.

In this hypothetical, in 1993, state aid is poorly targeted according to needs, but that targeting improves over time (moving to the right) and, as a result, state and local revenue fairness improves (moving up vertically). By 2007, the system reaches its peak of aid targeting and overall progressiveness. But then the system falls back as state aid targeting declines – perhaps as a function of disproportionate aid cuts to the neediest districts while holding harmless less needy districts.

Figure 3 – Hypothetical over Time


Well, that’s what it looks like in hypothetical land. Now how does this look for actual states? Let’s start with a few that have spent much of the period in the progressive funding zone, above the red horizontal line.

Each of these states reaches its peak around 2008 or 2009, then declines. New Jersey’s decline in aid targeting and overall progressiveness between 2009 and 2011 is particularly striking (at this point I’m waiting for the next year of data to see what’s going on here. Yes, it’s declined, but this seems more than expected).

Figure 4 – New Jersey


Massachusetts is a messier picture. School finance reforms in the early 1990s substantially shifted responsibility to the state (counterbalancing local revenue losses and emergent inequities from the 1980s in response to constitutional tax limits). Since that time, overall progressiveness has floated between 1.2 and 1.4 and state aid fairness between 2.5 and 4.0. Overall progressiveness appears to peak twice, in 2001 and again in 2008.

Notably, Massachusetts was among those states that took a pretty hard economic hit in the 2001-2003 economic slowdown (where states that had the largest share of income derived from investments/non-wage income seemed to suffer most). By 2011, however, Massachusetts is about as low on overall progressiveness as it has been since implementing school finance reforms in the 1990s.

Figure 5 – Massachusetts


Ohio’s story is similar. Ohio was under judicial pressure throughout the 1990s (the DeRolph cases) and is often characterized as being relatively non-responsive to that pressure. But Ohio did improve its overall progressiveness and its state aid targeting throughout that period, reaching a peak around 2007/08. But then, like others, declining state aid targeting set the state back quite significantly (though 2010 and 2011 stay about the same).

Figure 6 – Ohio

Connecticut really never implemented any systematic statewide school finance reform (maintaining variations on the Education Cost Sharing Formula since the mid 1990s). But Connecticut did, over time, allocate substantial lumps of aid to Hartford and New Haven for their magnet school programs, creating an appearance of a progressive statewide system. That system reaches its peak of progressiveness (leaving out many other high need districts) at a few points in the 2000s (2000, 2001, 2002, 2008) and its peak of targeting in 2007, then, like the others, declines quite substantially – ending at flat funding.

Figure 7 – Connecticut


Figure 8 – Kansas

Kansas is a fun case that begins, at the outset of the data, with adoption of a new weighted funding formula coupled with spending/revenue limits. Changes to local property taxation and shifting state aid led to a rather jumbled mess of persistent regressiveness through the early to mid 1990s. In the later 1990s, legislators imposed cuts to local revenue requirements, and then in 2001, state aid freezes and cuts led to a system that became more and more regressive, despite marginally better aid targeting, reaching its low in 2004. At this point, judicial pressure (2003) kicks in, followed by a high court ruling (2006) accepting reform legislation which temporarily drives Kansas school funding into the progressive zone. But that doesn’t last long, and by 2011, Kansas finds itself back in regressive territory, similar to year 2000 levels.

And now for the states that have never even come close to cracking the progressive threshold even with reforms (Pennsylvania) and judicial pressure (New York).

Figure 9 – New York

New York did make progress over time, and implementing a new funding formula after the Campaign for Fiscal Equity ruling did make some difference on state aid targeting. But that targeting has declined with steep aid cuts to needy districts.

Figure 10 – Pennsylvania


Pennsylvania had a fleeting moment in the late 2000s where targeting of aid and overall equity improved, but it has now reverted to regressiveness levels comparable to 1993.

Finally, Illinois never has, and perhaps never will really give a damn.

Figure 11 – Illinois

Aid became marginally more targeted around 2006-2009, but by 2011, Illinois aid is about as targeted as it was in the early 1990s, and the system remains even more regressive than it was during that period.

And yet we wonder why our lower income children’s educational outcomes continue to suffer? We pretend that if only our higher poverty districts would fire that bottom 5% of teachers who produce bad test scores (gains), they’d do better (because, of course, they can hire a new crop of better teachers even if they can’t pay a competitive wage?). We pretend that by expanding charter schooling – siphoning off the less needy among the needy into privately subsidized (soft money) schools with diminished legal protections – we’ll somehow achieve a desirable systemwide effect?

We continue to place risky bets on not only revenue neutral, but revenue negative “solutions.” But hey, those are other people’s children anyway, right?

Meanwhile, the damage that’s been done to our public education systems by outright and at times belligerent neglect of state school finance systems has, in the past 3 years alone, set us back in many cases 20 years.

Now is the time to turn that corner and attempt to repair that damage as quickly as it was inflicted.

Ignorati Honor Roll 2013: Pundit Version

As 2013 comes to an end, it’s time to review some of the more ridiculous claims and arguments made by pundits and politicians over the course of the past year.

A definition of “Ignorati” is important here:

Elites who, despite their power, wealth, or influence, are prone to making serious errors when discussing science and other technical matters. They resort to magical thinking and scapegoating with alarming ease and can usually be found furiously adding fuel to moral panics and information cascades. [ http://www.urbandictionary.com/define.php?term=Ignorati ]

I’m sure I’ve missed many good ones (please do send) and I’ve definitely put more weight in my selection on stuff I’ve come across recently than on stuff that appeared at the beginning of the year. I’ve tried to select statements and representations of data that are so foolish that, in my view, they severely undermine the credibility of their source. At least a few of these are statements made by pundits (this post) and politicians (next post), echoed by the media, that are so patently false and/or foolish that it’s rather surprising anyone could swallow them whole.

So, without further ado…

Petrilli on PISA and Poverty

Let’s start with two claims made by Mike Petrilli in a recent post at Ed Excellence, in which he opined that bad teachers (or at least bad teaching), not poverty, must be causing low PISA math scores for U.S. 15 year olds. Mike was perplexed a) that poverty might affect math outcomes as much as (if not more than) reading, thus something else must really be affecting math (bad teachers/teaching), and b) that poverty was affecting our 15 year olds’ outcomes, when we all know poverty affects younger kids more!? (really?). Mike’s goal was to explain that one must accept unreasonably complicated assumptions if one is to accept that poverty might comparably influence math, or that poverty affects outcomes of older children as well as younger ones. Here it is in Mike’s own words (setting up the supposed “bad” assumptions used by others).

First, one must assume that math is somehow more related to students’ family backgrounds than are reading and science, since we do worse in the former. That’s quite a stretch, especially because of much other evidence showing that reading is more strongly linked to socioeconomic class. It’s well known that affluent toddlers hear millions more words from their parents than do their low-income peers. Initial reading gaps in Kindergarten are enormous. And in the absence of a coherent, content-rich curriculum, schools have struggled to boost reading scores for kids coming from low-income families.

…the second assumption must be that “poverty” has a bigger impact on math performance for fifteen-year-olds than for younger students. But I can’t imagine why. If anything, it should have less of an impact, because our school system has had more time to erase the initial disadvantages that students bring with them into Kindergarten.

Here’s a link to my post with the complete rebuttal and explanation!

Petrilli’s conclusion in the face of these inexplicable assumptions?

Maybe we’re just not very good at teaching math, especially in high school.

Here’s a shortened version of my earlier critique. First, there’s no evidence that poverty affects measured reading outcomes more than measured math outcomes, especially for highly aggregated student populations. The key word here is “measured.” Often math achievement simply seems more precisely, accurately, or consistently measured (revealing more predictable variation), and thus clearer, more predictable gaps.

Yes, we have evidence of disparate outcomes by poverty for reading. But we have ample evidence of disparate outcomes by poverty for math. Even though we’ve been subjected lately to new reports (of old news) that higher income kids get exposed to more words earlier, that doesn’t mean that higher income kids don’t also get exposed to mathematical thinking/basic numeracy early on.

Here are the state aggregate math and reading outcomes by poverty for NAEP, and not-so-surprisingly (for anyone with an ounce of background in this stuff), the math scores are marginally more disparate than the reading scores.

Figure 1


The second assertion is actually even more silly – that poverty affects early learning, and thus only bad teaching affects what happens afterward (say, between the 4th and 8th grade tests). Put simply, the effects of poverty are cumulative over time, most often leading to larger gaps in later grades than in earlier grades (especially, I should note, if we do not put sufficient support into resolving those gaps).

Here’s the empirical snapshot.

Figure 2


Now, this type of thinking isn’t novel for Mike. He’s made up lots of stuff before that simply doesn’t pass the most basic smell test, presenting it as some form of clever insightful revelation that makes perfect sense if you have little or no background on the issue (to his credit, it’s always done with a grin/smirk and ability to dance).

Among the most egregious examples was his policy brief a few years back with Marguerite Roza on Stretching the School Dollar which included many examples of policies and spending practices he’d like to see changed in schools, many of which actually had little or nothing to do with stretching dollars at all. For more on this topic, see this post, this policy report, and this peer reviewed article.  (no, I can’t believe I wasted so much time rebutting utterly foolish schlock!)

And with the utmost class coupled with their usual depth of substance, TB Fordham responds by tweeting:

Good stuff. Deep.

Smarick on Propping up Philadelphia

For the most compelling evidence that U.S. schools are dreadfully failing in mathematics preparation, one might point to the quantitative wizardry of Andrew Smarick. Like Petrilli, Smarick can’t be confused with really basic facts and numbers.

Over the past year, Smarick has gone on at least a few twitter rants about how the City of Philadelphia has been propped up with so much additional funding over the years and has still proven itself to be a complete failure. His twitter rants about wasted state aid, and Philly’s egregious, inefficient, under-productive overspending are in support of his agenda to simply eliminate public urban school districts and replace them with collections of charter schools (despite evidence that PA charters haven’t done a very good job).

What’s so ridiculous about Smarick’s claims here is that Philadelphia is and has been for some time, among the least well-funded major urban school districts in the nation. One can find evidence of this in many, many places and public data sources. Smarick’s angle is to simply assert that Philly receives more state aid than other PA districts. Yes… and Philly has a lot more students.

While Smarick’s entire argument for ending the urban district is suspect and full of holes (laced with historical, policy, legal and empirical ignorance) there are many other cities that might serve as better examples than Philly.

Here are two representations of Philly school funding in context. First, here’s Philly’s state and local revenue per pupil, relative to the average for its metropolitan area (PA districts only), where 1.0 is average, with districts arranged by poverty.  Put simply, Philly has much greater need – higher poverty – than surrounding districts and lower than average funding. Philly is that big one… actually, those big four shapes, below the average line and with high poverty – getting higher from year to year. Way up in the upper left, is the adjacent leafy suburb – Lower Merion!

Figure 3


Let’s make this even simpler by comparing Philly, Allentown and Reading – three of the most fiscally screwed districts in the nation – to other more affluent Pennsylvania districts, and by breaking out their state and local revenues. Here, we can look specifically for that “propping up” with state aid effect. And guess what? It’s not there.

C’mon dude! This is ridiculous. Download some freakin’ data, either from PA dept of ed, or use the Census Fiscal Survey. It’s really not that hard. Make a graph. Philly has not been “propped up!” Okay… yeah… propped up more than if it was left entirely to raise school funds on local property taxes and propped up slightly more than Reading or Allentown. But to suggest that the state has, time and time again, bailed out Philly, given it more than would be necessary to achieve desired outcomes, is utterly ridiculous, reckless, irresponsible or downright incompetent.

Figure 4


Jeanne Allen’s Bizarre Interpretation of the U.S. & Louisiana Constitutions

This one is a bit different, but I would be remiss if I didn’t revisit it. No, it’s not one of those things that can be simply rebutted with a graph or two. Rather, this is one to be rebutted with a basic understanding of civics.  If I go back further in my posts over time, I can find at least a handful of blustery reformy posts which are illustrative of our failures of civics education in the U.S.

There’s this one from the ever insightful Bob Bowdon of Choice Media, in which Bowdon decries supposed union supported legal attacks on Georgia’s charter authorizing authority (totally neglecting the actual phrasing of the Georgia constitution and the role of the courts in interpreting that constitution).

There’s this one, in which a Kansas attorney wishes to argue that there exists an individual liberty interest to impose unlimited local property taxation (e.g. that state imposed tax and expenditure limits violate that individual liberty).

But earlier this year, when the Louisiana courts struck down that state’s voucher program redirecting tax dollars to private schools, Jeanne Allen of the Center for Education Reform penned a response of unprecedented civic ignorance, her core argument being that in her view, the U.S. Constitution (as interpreted in the Cleveland voucher case of Zelman v. Simmons-Harris) protects an individual liberty to taxpayer funded private schooling.

In her own words:

“If indeed the Louisiana constitution, as suggested by the majority court opinion, prohibits parents from directing the course of the funds allocated to educate their child, then the Louisiana constitution needs to be reviewed by the nation’s highest court,” said Center for Education Reform President Jeanne Allen.

Allen added: “I urge Governor Jindal to file an appeal to the US Supreme Court, and ask for the justices’ immediate review of the decision. The Louisiana justices actions today violate the civil rights of parents and children who above all are entitled to an education that our Founders repeated time and time again is the key to a free, productive democracy.”

This is a bizarre interpretation indeed, of a ruling (Zelman) that permits the public financing of private schooling, inclusive of religious alternatives (that is, the specific model used in Cleveland was found NOT to violate the establishment clause in its use of public dollars for vouchers to private religious schools). That is not to say, by any stretch of the imagination, that this case by extension establishes a right for children everywhere to access public dollars for their private education. You see, “permit” and “require” are two very different things.

For more explanation, please see this post.

Lerum/Students First (and many others) on DC and Tennessee NAEP Miracles

I conclude with perhaps my favorite of reformy echo-chamber claims of 2013, one which we were all graced with not just once, but twice in recent months with the release of 2013 State NAEP scores and the later release of 2013 large urban district NAEP scores.

Most of the mis-NAEP-ery centered on claims of great gains in achievement (between one cohort of kids two years ago and another cohort this year) in reformy favorites Tennessee and Washington DC. The central assertion of the reformy echo-chamber was that these great gains experienced in Tennessee and DC were proof positive that teacher evaluation reforms are working! Take, for example, Eric Lerum’s blog post on the Students First web site, which starts with:

The 2013 National Assessment of Educational Progress (NAEP) results provide some of the strongest evidence yet that investment in student-centered education reforms improves student achievement.

Further down in the post, Lerum explains just what that compelling evidence is and what it means! Lerum provides us three “truthys”:

First, that investment in teacher quality matters. Tennessee and D.C. have both implemented comprehensive teacher evaluation systems paired with targeted professional development, and (along with Florida) they were out ahead of all other states in doing so. This has established them as national leaders in policies related to teacher quality.

Second, we learned that rigorous academic standards make a difference. D.C. and Tennessee were early adopters of the Common Core State Standards and have been dedicated to good-faith implementation. They gave teachers and schools the resources and training necessary to put the standards in action, and students responded.

Third, it is clear that education reform isn’t about partisan politics. D.C. is one of the most liberal jurisdictions in the country; Tennessee is one of the most conservative. But when policymakers and education stakeholders withstand political pressure and make the changes needed to improve schools, kids win.

Well, that third one’s a bit of an aside, but let’s take a look at the first two. That recently adopted teacher evaluation policies and early adoption of common core standards have lifted DC and TN to new heights on NAEP!

Now, for these latest findings to actually validate that teacher evaluation and/or other favored policies are “working” to improve student outcomes, two empirically supportable conditions would have to exist.

  • First, that the gains in NAEP scores have actually occurred – changed their trajectory substantively – SINCE implementation of these reforms.
  • Second, that the gains achieved by states implementing these policies are substantively different from the gains of states not implementing similar policies, all else equal.

And neither claim is true, as I explain more thoroughly here! But here’s a quick graphic run down.

First, major gains in DC actually started long before recent evaluation reforms, whether we are talking about common core adoption or DC IMPACT. In fact, the growth trajectory really doesn’t change much in recent years.  But hey, assertions of retro-active causation are actually more common than one might expect!

 Figure 11


Note also that DC has experienced demographic change over time, an actual decline in cohort poverty rates over time and that these supposed score changes over time are actually simply score differences from one cohort to the next. This is not to downplay the gains, but rather to suggest that it’s rather foolish to assert that policies of the past few years have caused them.

Second, comparing cohort achievement gains (adjusted for initial scores… since lower scoring states have higher average gains on NAEP) with STUDENTS FIRST’s own measures of “reformyness” we see first that DC and TN really aren’t standouts, that other reformy states actually did quite poorly (states on the right hand side of the graphs that fall below the red line), and many non-reformy states like Maryland, New Jersey, New Hampshire and Massachusetts do quite well (states toward the middle or left that appear well above the line).

Needless to say, if we were to simply start with these graphs and ask ourselves who’s kickin’ butt on NAEP gains… and whether states with higher grades on Students First policy preferences are systematically more likely to be kickin’ butt, the answers might not be so obvious. But if we start with the assumption that DC and TN are kicking butt and have the preferred policies, and then just ignore all of the others, we can construct a pretty neat – but completely invalid – story line.
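For readers who want to check this themselves, here is a minimal sketch of the “gains adjusted for initial scores” comparison. The file name and column names (score_2011, score_2013, reform_grade as a numeric coding of the Students First grades) are hypothetical placeholders, not fields from the actual data release.

```python
# Sketch: "gains adjusted for initial scores" as residuals from a regression
# of 2011-2013 cohort score differences on 2011 starting scores. States with
# positive residuals outgained expectations given where they started.
import pandas as pd
import statsmodels.formula.api as smf

states = pd.read_csv("naep_state_scores.csv")    # hypothetical state-level file
states["gain"] = states["score_2013"] - states["score_2011"]

fit = smf.ols("gain ~ score_2011", data=states).fit()
states["adj_gain"] = fit.resid                   # gain beyond/below expectation

# Is "reformyness" systematically related to beating expectations?
print(states[["adj_gain", "reform_grade"]].corr())
```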

 Figure 12


Closing thoughts…

I hear you say… hey… this is a bit personal isn’t it? Well, yes, I’ve called out individuals for their arguments and I too prefer sticking to the substance of those arguments. But let’s be clear here – I’m calling out the arguments – their substance – or blatant lack thereof. These ridiculous arguments happen to have emanated from these individuals.

Certainly, Lerum was far from the only one to have made the absurd DC and Tennessee claims. I could pick many others for that one… but Lerum expressed it all so eloquently wrong in his Students First post, and I had already done the comparisons with Students First’s own ratings of reformyness. So this was low hanging fruit. I used a ridiculous Nick Kristof quote in my original post on this topic.

Smarick’s belligerent and repeated wrongness is simply inexcusable. Smarick simply can’t be bothered with facts. He would tweet this silliness. I’d write about how wrong it was… and a week or two later… he’d be back to Philly bashing – all again with his argument that the state has already done all it can to fiscally prop up Philly.

As for Petrilli, these completely wacky, ill-conceived and under-informed arguments are his own, and to his credit, his response to my earlier post was gracious. But that doesn’t keep him off the Ignorati honor roll. It just allows him to wear that badge with honor.

And now, we sit and wait to see what the new year will bring.

Cheers.

On Short-Term Memory & Statistical Ineptitude: A few reminders regarding NAEP TUDA results

Nothin’ brings out good ol’ American statistical ineptitude like the release of NAEP or PISA data. Even more disturbing, the short window between the release of state level NAEP results and city level results for large urban districts permits the same mathematically and statistically inept pundits to reveal their complete lack of short term memory – memory regarding the relevant caveats and critiques of the meaning of NAEP data, and NAEP gains in particular, that were addressed extensively only a few weeks back. That was when pundit after pundit offered wacky interpretations of how recently implemented policy changes affected previously occurring achievement gains on NAEP, and of how the policies implemented in DC and Tennessee were particularly effective (as evidenced by two-year gains on NAEP) – ignoring that states implementing similar policies did not experience such gains, and that states not implementing similar policies in some cases experienced even greater gains after adjusting for starting point.

Now that we have our NAEP TUDA results, and now that pundits can opine about how DC made greater gains than NYC because it allowed charter schools to grow faster, or teachers to be fired more readily by test scores… let’s take a look at where our big cities fit into the pictures I presented previously regarding NAEP gains and NAEP starting points.

The first huge caveat here is that any/all of these “gains” aren’t gains at all. They are cohort average score differences, which reflect differences in the composition of the cohort as much as anything else. Two-year gains are suspect for other reasons as well, perhaps relating to quirks in sampling, etc. Certainly, anyone making a big deal about which districts did or did not show statistically significant differences in mean scale scores from 2011 to 2013, without considering longer term shifts, is exhibiting the extremes of Mis-NAEP-ery!

So, here are the figures, starting with 10-year NAEP 8th grade math gains against the initial average score in 2003.

The relationship between 10-year gains on 8th grade math and initial average score is relatively strong. DC and LA, which appear to be getting the early applause for their reformy amazingness, pretty much fall right in line with expectations. Boston is a standout here… and Cleveland? Well… that’s a bit perplexing, but Cleveland reveals perplexing data on many levels in ed policy (including some of the consistently highest school level low income concentrations in the nation).

The relationship for reading is not quite as strong:

LA is lookin’ pretty good here, but starting pretty darn low – lower than DC… which, by the way, really isn’t a standout here on 10-year gains. Cleveland? Well… not a pretty sight… Other cities fall pretty much in line with expectations given their initial 2003 mean scores.

Here are the 4 year gains for math grade 8:

DC looks a little better here… but as previously, cities fall among the states in roughly their expected locations – but for Cleveland and Detroit, which seem to lag. San Diego, a relative standout on 10-year gains, lags on 4-year gains, but that’s hardly a condemnation of a city that a) has made longer term gains and b) as of 2009 sits among the higher performing jurisdictions.

Finally, here’s the 4 year gain for reading grade 8:

This relationship is certainly less consistent. DC falls more or less in line. Cleveland and Milwaukee aren’t lookin’ so good. San Diego is back above the line, but started lower and remains lower in the pack than it was on math.

Again, the big caveat here is that these aren’t “gains” but rather cohort differences. And one might suspect population change to occur more quickly in cities than in states, especially in those cases where cities have smaller overall student populations than states (setting aside those pesky low population states like VT, WY, etc.).

What to make of this all? Not much really. Does NAEP TUDA provide broad condemnation of urban education in the U.S.? Well, only to the extent that NAEP generally provides such condemnation, since cities and states tend to fall in line with one another (but for some notable standouts). Do these data present us with obvious pictures about current policy preferences or directions? Well, that would be hard to assert given that these data don’t really present us with consistent pictures – beyond the fact that starting point matters and, as my previous post illustrated, demography matters.

This is by no means to suggest that policies and practices don’t matter, but rather that frequent, egregious misinterpretation of NAEP data provides no value-added to the policy conversation. (yeah… I said value-added!?)

SUPPLEMENTAL FIGURES

Here are a few additional figures from a few years back… it took a while to find them (they are from a project I did on poverty measurement), but they establish the rather obvious fact that these NAEP TUDA scale scores (level scores) are also associated with economic context – specifically, poverty concentration.

Given that many of these cities are high poverty settings, the relationship is actually tighter when I use the more stringent census poverty threshold (rather than free lunch eligibility, which extends to 130% of the poverty level), even though these city level poverty data do not necessarily completely overlap with school district enrollments. What these data do show is that Cleveland and Detroit are simply much higher poverty settings than the other cities in the sample (for 5 to 17 year old children). And that is certainly relevant to both score levels and potential changes in cohort level scores over time.

NAEP scores are from 2009


Racial Disparities in NY State Aid Shortfalls

Yesterday, Ed Law Prof Blog posted an update about the Office of Civil Rights complaint to be filed by Schenectady School District claiming that shortfalls in New York State aid fall disparately by student race.

I’ve reported on numerous occasions on this blog the patterns of disparity in New York State funding. I actually hadn’t checked recently the strength of the relationship between funding shortfalls and school district racial composition. As the Ed Law blog explains, litigation around this question (that of racially disparate impact of school funding policy) was largely headed off by the Sandoval case which held that no private right of action exists for challenging policies violating disparate impact regulations promulgated under Title VI of the Civil Rights Act. “Disparate impact” occurs where a policy ends up having different effects on one group versus another, by race, ethnicity or national origin but not necessarily because the policy is written explicitly to treat individuals differently by race. That is, it’s a statistical association with race that may not have to do directly with race. But then again, it might. That’s the hard part to prove when race isn’t written right into the policy as it used to be, say, in the pre-Brown era. For those interested in some additional school finance reading on this topic see:

  • Baker, B. D., & Green III, P. C. (2005). Tricks of the Trade: State Legislative Actions in School Finance Policy That Perpetuate Racial Disparities in the Post-Brown Era. American Journal of Education, 111(3), 372-413.

In the post-Sandoval era, complaints regarding policies that yield racially disparate impact are to be brought as administrative claims, through the relevant federal agencies/departments, just as Schenectady has done here (as elaborated in Ed Law Prof Blog).

So today’s big question is just how bad are the racial disparities in state aid shortfalls in New York State?

Is Schenectady right?

First, let’s define state aid shortfall. As I’ve explained in previous posts, New York operates a foundation aid formula which defines the per pupil amount of funding required for each district, given its location (labor market) and students (needs), in order to achieve adequate outcomes (this formula being the state’s own proposed remedy to previous state litigation over the adequacy of funding). So, in step one, the state calculates the adequate funding target:

1) Sound Basic Funding Target = base funding figure x pupil need index x regional cost index x aidable pupil count

Where that “aidable pupil count” figure includes some additional adjustments.

Step two determines the amount the local district should contribute to the sound basic target funding and thus, the remaining amount to be contributed as state aid.

2) State Aid = Sound Basic Funding Target – Local Contribution

But the problem is that New York has, in nearly every year since proposing this remedy to past litigation, added a few more steps to the calculation, which include:

  1. freezing foundation funding to levels from several years prior
  2. invoking the deceptively named “Gap Elimination Adjustment” to inflict disproportionate cuts on needier districts 
  3. enforcing local property tax limits that effectively prohibit districts from making up their losses in state aid – and effectively prohibit districts from even coming close to achieving the level of funding the state itself has declared as constitutionally adequate. Notably, the aid shortfalls are so extreme that low wealth districts really couldn’t ever tax themselves locally enough to make up the losses even if they tried.

Point #3 above is the subject of a separate lawsuit challenging the absurdity of invoking a policy that would prohibit, even if possible, districts from raising the level of funding the state itself declares as adequate but refuses to provide.

So, after the additional freezes and cuts are invoked, we can determine the state aid gap as follows:

State Aid Shortfall = State Aid to Achieve Sound Basic Funding Target – Actual State Aid after Freeze and Gap Elimination Adjustment
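Here is a minimal sketch of that three-step arithmetic, with made-up numbers standing in for fields on the actual aid runs; all of the input values are hypothetical illustrations, not figures from the state data.

```python
# Sketch of the three-step shortfall calculation described above.
# All inputs are hypothetical stand-ins for fields on the state aid runs.
def sound_basic_target(base: float, need_idx: float,
                       cost_idx: float, aidable_pupils: float) -> float:
    # Step 1: base funding x pupil need index x regional cost index x pupil count
    return base * need_idx * cost_idx * aidable_pupils

def formula_state_aid(target: float, local_contribution: float) -> float:
    # Step 2: state aid covers whatever the local contribution leaves uncovered
    return max(target - local_contribution, 0.0)

def aid_shortfall(formula_aid: float, actual_aid: float) -> float:
    # Step 3: shortfall = aid the formula calls for, minus aid actually paid
    # (after the freeze and the Gap Elimination Adjustment)
    return formula_aid - actual_aid

# Example with made-up numbers:
target = sound_basic_target(base=6500, need_idx=1.4, cost_idx=1.2, aidable_pupils=10000)
aid = formula_state_aid(target, local_contribution=60_000_000)
print(aid_shortfall(aid, actual_aid=40_000_000))   # district-level gap in dollars
```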

And just how related to race are those aid shortfalls? Well, here it is, based on the 2013-14 State Aid Runs merged with demographic data from the 2012 NYSED School Report Cards:

Race & NY State Funding

Previously, I’ve shown that these aid shortfalls are pretty strongly associated with the state’s own Pupil Need Index with higher need districts facing larger shortfalls. And racial composition is associated with the pupil need index, if we focus on traditionally disadvantaged racial aggregate classifications (which is a whole separate can of worms).

To summarize the graph above, which visually displays only those districts with greater than 2,000 pupils, but includes all (weighted for enrollment) in statistical estimates, it is certainly the case that New York State districts with higher concentrations of black or Hispanic children have greater state aid shortfalls.

There is indeed a racially disparate impact.

Moreover, that impact is pretty darn big. Moving from a district with 0% black or Hispanic children to one with 100% black or Hispanic children yields a difference in funding gap of over $2,000 per pupil.

Many of the state’s highest minority concentration districts have state aid shortfalls between $5,000 and $10,000 per pupil whereas NONE of the lowest minority concentration districts has an aid shortfall over $5,000 per pupil!

And these state aid shortfalls are shortfalls against the State’s own (paltry, low-ball) estimates of what it might have taken to achieve the now dated outcome standards of 2007 (under previous litigation)!

UPDATE:

Here’s a quick multivariate run of the data to determine whether otherwise similar districts with more minority children have bigger funding gaps, where otherwise similar is determined with respect to components of the formula itself – the Regional Cost Index, Pupil Need Index and the additional weights included in the Total Aidable Foundation Pupil Unit count.

Somewhat surprisingly, in this regression the racially disparate impact is actually larger than when represented only as a bivariate relationship between funding gaps and race. I’d have expected the Pupil Need Index to have substantially moderated the relationship between race and funding gap. But it is also likely that within any region, the funding gaps are more disparate by race than they appear statewide. This occurs because many of the high minority districts are in higher cost regions.
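For those who want to replicate this kind of check, here is a minimal sketch of such a multivariate run. The file and column names are hypothetical, and the enrollment weighting is my assumption, not the exact specification behind the results above.

```python
# Sketch of the multivariate check described above: do districts that are
# otherwise similar on the formula's own components (regional cost index,
# pupil need index, foundation pupil weights) but have more black/Hispanic
# students face bigger funding gaps? Column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

runs = pd.read_csv("nysed_aid_runs_2013_14.csv")   # hypothetical merged file

model = smf.wls(
    "gap_per_pupil ~ pct_black_hisp + regional_cost_idx + pupil_need_idx + tafpu_weight",
    data=runs,
    weights=runs["enrollment"],                    # weight districts by enrollment
).fit()

# Coefficient on pct_black_hisp: dollars of additional shortfall per pupil
# associated with a 0-to-100% shift in black/Hispanic enrollment share.
print(model.params["pct_black_hisp"])
```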

Petrilli’s Hammer & the “poverty has nothing to do with PISA” argument

Mike Petrilli over at TB Fordham has made his case for why differences in national economic context do little to substantively explain variations in PISA scores.

He frames his argument in terms of Occam’s Razor, as if to sound well informed and deeply intellectual, setting the stage to share a profound logical argument, summarized as follows:

“among competing hypotheses, the hypothesis with the fewest assumptions should be selected.”

Petrilli asserts that while some might perceive a modest association (actually, it’s pretty strong) between national economic context and average tested outcomes in math, for example… like this…


…that it is entirely illogical to assert that child poverty has anything to do with national aggregate differences in math performance at age 15.

That is, the various assumptions that must be made to accept this crazy assertion – that economic context matters in math performance – simply don’t hold water in Petrilli’s mind. Rather, the answer must be much simpler and lie in the classroom, with our good ol’ American ineptitude at teaching math.

As Petrilli concludes in his post:

So what’s an alternative hypothesis for the lackluster math performance of our fifteen-year-olds? One in line with Occam’s Razor?

Maybe we’re just not very good at teaching math, especially in high school.

Accepting the bad math teaching conclusion simply requires fewer tricky assumptions than asserting any role for economic context in determining national aggregate outcomes.

Let’s call this Petrilli’s Hammer – an illogical, blunt & necessarily under-informed alternative to Occam’s Razor. When in doubt – when too lazy to develop a disciplined understanding of the field on which you choose to opine, and when data are just too hard to handle – get that hammer, and everything can look like a nail! (e.g., the bad teacher conclusion)

These two quotes frame Petrilli’s argument:

First, one must assume that math is somehow more related to students’ family backgrounds than are reading and science, since we do worse in the former. That’s quite a stretch, especially because of much other evidence showing that reading is more strongly linked to socioeconomic class. It’s well known that affluent toddlers hear millions more words from their parents than do their low-income peers. Initial reading gaps in Kindergarten are enormous. And in the absence of a coherent, content-rich curriculum, schools have struggled to boost reading scores for kids coming from low-income families.

AND

So the second assumption must be that “poverty” has a bigger impact on math performance for fifteen-year-olds than for younger students. But I can’t imagine why. If anything, it should have less of an impact, because our school system has had more time to erase the initial disadvantages that students bring with them into Kindergarten.

The problem is that both of these statements are a) conceptually foolish and b) statistically ignorant.

Let’s tackle the second claim conceptually first. These scores for 15 year olds are performance level – or status – scores. Status scores reflect the cumulative effects of schooling and family background. Most notably in this case, status scores – math performance at age 15 – reflect the cumulative influences of poverty: living in poverty, growing up in poverty, lacking resources over long periods of one’s early life.

Here’s some more reading on poverty timing and cumulative effects.

And then there’s this report which I prepared last summer with ETS.

So… setting measurement issues aside here, we can logically expect gaps between lower and higher income kids to grow between earlier grade assessments and later grade assessments – if we choose to do little or nothing in policy terms about the circumstances under which these children live. Yes, we can and should leverage resources in schools to offset these gaps. But we’re not necessarily applying those resources either.

Accepting Petrilli’s second point above requires that we ignore entirely that our school system remains vastly disparate in many states and locations between rich and poor communities and reinforces (rather than erasing) the initial disadvantages that students bring with them to Kindergarten.

Now, backing up to his first point: Petrilli argues that if higher poverty settings/contexts do worse relative to lower poverty settings on math than on reading assessments, there must be a simple answer for the math problem/disparity – like bad math teaching, of course. There can be no logical explanation for why math scores might be more sensitive than reading scores to poverty variation. Assuming bad math teaching to be the reason for greater disparity in math than in reading is much simpler than exploring why math test scores might appear more sensitive to context/poverty than reading scores. After all, we all know that poverty affects reading more than math – or so Mike says, without citation to any legitimate source validating his point.

This one is pretty simple. First, it may simply be the case that Mike Petrilli is wrong on all levels here – that, conceptually and statistically, economic deprivation has a stronger effect on numeracy than on literacy. But even accepting the idea that poverty affects literacy more – in a substantive way – doesn’t mean that we’d find a stronger statistical relationship between a) variations in poverty across settings and b) variations in measured outcomes across settings. The fact is that variations in math assessments are often simply more predictable. They may be both more stable/consistent and may actually have more variation to predict.

Empirical Illustrations

I’m going to use state level NAEP data within the US here to provide statistical illustrations of the rather simple flat-out-wrongness of Mike Petrilli’s Hammer.

The following illustrations simply reveal how data of this type tend to play out – something anyone reasonably well versed in using assessment data alongside economic data, at various levels of aggregation, would understand. Some of these patterns reveal conceptually sound underlying hypotheses, and some may simply be an artifact of typical issues occurring in the measurement of student outcomes at different ages and in different subjects.

So, for our first question we ask whether it can possibly be the case that there exists greater disparity in math outcomes in 8th grade than in 4th grade across US states of varying degrees of poverty (setting aside the substantive explanations for why such gaps increase).

Now, careful here, this one requires using a little algebra – slope/intercept analysis. The first figure here shows the variation in NAEP math outcomes for 8th graders and for 4th graders, both in 2013.

This figure shows us, first of all, that 8th grade math scores are more predictably disparate as a function of poverty than are 4th grade math scores. For 8th grade, poverty alone explains 63% of the cross state variation in math scores, but marginally less (59%) for 4th grade.

The figure also shows us that by 8th grade, each additional 1% poverty is associated with a 1.13 point lower state average scale score, whereas in 4th grade, a 1% higher poverty rate is associated with only a 0.83 point lower state average scale score. That is, the negative slope is steeper for 8th grade than for 4th grade.

There can be many, many reasons for this. Among these reasons might be that as time goes on, cumulative poverty related deficits do increase. Persistent disadvantage makes gaps grow. It may also be a measurement issue, pertaining to the precision of measurement of mathematics knowledge and skill, or it may even be an issue of the stability and predictability of tests on early grade math content given to 9 year olds versus tests on stuff like algebra and pre-algebra given to older, hopefully more mature kids (who’ve also taken far more tests by that time).
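The slope and r-squared comparisons above are straightforward to reproduce. Here’s a minimal sketch, assuming a hypothetical state-level file with NAEP mean scale scores and child poverty rates; the file and column names are placeholders, not the actual NAEP data tool output.

```python
# Sketch of the slope/intercept comparison above: fit the same state-level
# regression of NAEP mean scale score on child poverty rate for grade 4 and
# grade 8 math, then compare slopes and R-squared across grades.
import pandas as pd
import statsmodels.formula.api as smf

naep = pd.read_csv("naep_2013_state.csv")   # hypothetical state-level file

for col in ["math_g4", "math_g8"]:
    fit = smf.ols(f"{col} ~ pov_rate", data=naep).fit()
    # Per the figures above, one would expect slopes near -0.83 (g4) and
    # -1.13 (g8) points per 1% poverty, with R-squared near 0.59 and 0.63.
    print(col, round(fit.params["pov_rate"], 2), round(fit.rsquared, 2))
```

Swapping in reading columns reproduces the math-versus-reading comparisons in the next figures as well.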

But, instead of gettin’ all thoughtful about these possibilities and arming ourselves with well-conceived arguments grounded in data and knowledge of the literature, we could simply use Petrilli’s Hammer to assert that the one and only logical answer is that math teachers in high poverty states like Alabama and Mississippi suck and math teachers in low poverty states like New Jersey and Massachusetts rock!  It’s bad math teaching that is making this negative slope get worse between grade 4 and grade 8 – bad math teaching exclusively in high poverty states!

Is there greater disparity in Grade 8 Math than in Grade 4 Math by Contextual Poverty?


The next question then is how can it ever be that math scores might be more disparate as a function of poverty when we all know that poverty affects reading more?

The next figure shows the relationship between poverty by state and math and reading scores in grade 4. Rather amazingly, math scores are more predictable as a function of poverty than are reading scores – note the difference in variance explained (r-squared). Now, (almost) anyone who has ever plotted reading and math “level” (status) scores, or even estimated value-added scores for reading and math in relation to poverty or nearly any other covariate, knows that this is common. Variation in math scores – level or value-added – is often much more predictable than is variation in reading scores. As above, this may be for many, many reasons. Maybe we’re just not as good on the measurement side at teasing out differences in underlying skill in reading, with either 9 or 14 year olds?

That math scores are more predictably a function of poverty than reading scores – across states – doesn’t mean that our math teaching is better or worse than our reading teaching. Even though the math scores at 4th grade are more predictable than the reading scores, the reading slope appears slightly more disparate (steeper negative). And that doesn’t mean either that our reading teaching is more disparate, or that the 4th grade scores are picking up some differential on the baggage kids bring to school with them. It’s a statistical artifact of the data – based on how math and reading are being measured. It may mean something, but who knows what? It may mean absolutely nothing.

Are Grade 4 Math Scores more predictably a function of poverty than Grade 4 Reading Scores across contexts?


Finally, here’s the 8th grade math and reading. Here, math is marginally more predictable as a function of poverty and math outcomes are more disparate as a function of poverty.

At least by these measures – NAEP math and reading scores, aggregated to the state level, which is similar to making national comparisons – reading is NOT, as Petrilli so confidently argues above, “more strongly linked to socioeconomic class” than math.

International comparisons work much the same.

What about Grade 8 Math and Reading?


Indeed, Petrilli is attempting to assert that there exists an incongruity between the data and the underlying reality – that yes, reading scores are affected by poverty, but math not so much.  Thus, if the data show that math scores are more affected by poverty than are reading scores, then something much more nefarious must be going on – Yes – the bad teacher/teaching problem!

It couldn’t possibly have anything to do with measurement issues or the significant possibility that the full range of student outcomes measured are similarly affected by economic deprivation.  That would just be way too much to swallow.

But if we want to go there… if we want to accept Petrilli’s argument that there’s simply no excuse for U.S. students to fall where they do on international math comparisons – because poverty doesn’t affect 15 year olds or math, only younger kids and reading – then we must apply Petrilli’s Hammer to state-by-state comparisons as well.

And thus we logically conclude that math teaching in DC, MS, AL, and LA stinks, and math teaching in NJ, MA, VT, and NH is great! And that poverty really has nothing to do with it?

Ignoratis Paradox