On False Dichotomies and Warped Reformy Logic

Pundit Claim 1 – Value added modeling is necessarily better than the “status quo”

There exists this strange perspective that we are faced with a simple choice in teacher evaluation – a choice between using student test scores and value-added modeling, or continuing with the status quo. This is a false dichotomy, also known as a false dilemma, and it is a logical fallacy. In other words, it’s a really stupid argument in which we are forced to assume that there are only two choices that exist. This argument is usually coupled with an implicit assumption that one of the two must be superior.

“Reformers” continue to press the argument that current teacher evaluations are so bad, so unreliable, that anything is better than this “status quo.”

Expressed mathematically:

Anything > Status Quo

Bear with me while I use the “greater than” symbol to imply “really freakin’ better than… if not totally awesome… wicked awesome in fact,” but since it’s relative, it would have to be “wicked awesomer.”

Because value-added modeling exists and purports to measure teacher effectiveness, it therefore counts as “something,” which is a subclass of “anything” and therefore it is better than the “status quo.” That is:

Value-added modeling = “something”

Something ⊆ Anything (something is a subset of anything)

Something > Status Quo

Value-added modeling > Current Teacher Evaluation

Again, where “>”  means “awesomer” even though we know that current teacher evaluation is anything but awesome.

It’s just that simple!

After all, you can’t even measure the error rate in current principal and supervisor evaluations of teachers can you? And if you can’t measure the error rate it must be higher than any error rate you can measure? More really basic reformy logic! That is, the unobserved error rate in one system is necessarily greater than the observed error rate of another – even if we have no way to quantify it – in fact, because we have no way to quantify it?

Unobserved error rate of ‘status quo’ > measured error rate of VAM

Let’s be really blunt here. Both are patently stupid arguments.

And both of these arguments bring to mind one of my favorite analogies related to this issue. If we were in a society that still walked pretty much everywhere, and some tech genius invented a new cool thing – called the automobile – but the automobile would burst into a superheated fireball on every fifth start, I think I’d keep walking until they worked out that little kink. If they never worked out that little kink, I’d probably still be walking. I’ve written previously about how this relates to likely error rates in teacher dismissal (misclassifying truly effective teachers as ineffective) as would occur when using typical value-added modeling approaches.
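To make the misclassification worry concrete, here is a minimal sketch in Python. It is a toy simulation, not an estimate of any real error rate: the function name `share_wrongly_flagged`, the noise levels, and the 20% dismissal cutoff are all made-up assumptions for illustration. It asks a simple question: if we dismiss the bottom slice of teachers on a noisy rating, what share of those dismissed were not truly in the bottom slice?

```python
import random

def share_wrongly_flagged(n_teachers, noise_sd, cut_share, seed=0):
    """Share of flagged teachers who are NOT truly in the bottom group.

    Hypothetical setup: true effectiveness is standard normal, and the
    measured rating is the true value plus independent noise.
    """
    rng = random.Random(seed)
    true_effect = [rng.gauss(0, 1) for _ in range(n_teachers)]
    measured = [t + rng.gauss(0, noise_sd) for t in true_effect]

    n_cut = int(n_teachers * cut_share)
    # Teachers flagged for dismissal: lowest measured ratings.
    flagged = set(sorted(range(n_teachers), key=lambda i: measured[i])[:n_cut])
    # Teachers who are actually lowest on true effectiveness.
    truly_low = set(sorted(range(n_teachers), key=lambda i: true_effect[i])[:n_cut])
    return len(flagged - truly_low) / n_cut

# Made-up noise levels: 0.0 is a perfect measure, 1.0 is as noisy as the signal.
for noise_sd in (0.0, 0.5, 1.0):
    print(noise_sd, share_wrongly_flagged(10_000, noise_sd, cut_share=0.2))
```

With no noise, nobody is wrongly flagged. As the noise grows relative to the true differences among teachers, a larger share of the "dismissed" group consists of teachers who did not belong there.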

Pundit Claim 2 – If we get rid of the bad teachers, the system will necessarily be better

The assumption of many pundits is that replacing existing teachers necessarily improves the teaching workforce – that the average potential applicant for any/all available teaching jobs will be better than the average person already there, or at least better than the person we dismiss as ineffective. Now, recall that we have a pretty high chance of misclassifying truly effective teachers and dismissing them.

Now, the math here is similar to that above. The basic premise is that:

Anything > Status Quo

First of all, we know already that schools with more difficult working conditions have a much more difficult time recruiting and retaining quality teachers. Working conditions play a significant role in teacher sorting in initial job matches and in teacher moves over time.

We also know, just by looking at such information as the patterns of higher and lower “effectiveness” scores in the LA Times analysis, that if we dismiss teachers on the basis of their value added scores, we will be dismissing larger shares of teachers in higher poverty, higher minority schools. Or, we can just take the Central Falls, RI approach and declare the entire school failing based on its average performance over time (setting aside demographics and resources) and just fire everyone. Surely the replacements will be better. How could we do worse? Right?

Here’s the thing – even if we assume that some of the lower performance of teachers in poorer LA schools or the lower performance of Central Falls HS is a function of a weaker, less effective teacher workforce, we can only make things “better” by replacing that workforce with “better” teachers.

It is completely arrogant to take the reformy attitude of “how can we possibly do worse?” How could we possibly get a worse pool of teachers than the lazy slugs already in the system?

If the teacher pool in these schools is in fact less effective, and doesn’t just look that way statistically because of other factors, it may just be that these schools had a difficult time recruiting and retaining teachers to begin with. If we introduce our “game changing” policies – firing all of the teachers for low school performance, or firing individual teachers for bad effectiveness ratings – we will likely make things even worse.

Any teacher wishing to step in line to replace the previous cohort of “failures,” will have to not only consider the difficult working conditions but also the disproportionate likelihood that she/he will be fired a few years down the line, for factors well beyond his/her control (e.g. that pesky non-random assignment problem). That’s a significant change in working conditions – job risk. Without either changing other working conditions or substantially increasing compensation to offset this new risk, the applicant pool is not likely to get better – especially when risk is not increased similarly in other “more desirable” school districts. All else equal, the applicant pool is likely to get worse. The disparity in the quality of applicants for teaching positions is likely to increase dramatically, and the average quality of applicants to high poverty, high minority concentration districts may decline significantly.

Bonus video with thanks to Sherman Dorn:

A few thoughts on the unlikely alliance…

Today was the day of the big Oprah-Christie-Booker-Zuckerberg event, which I guess we can all watch around 4pm if we really want to. I’ve been trying to dig up any information I can, without wasting too much time on this, because there are certainly more important things to get to. That said, I do have a few brief comments in response to specific points and issues raised.

In an effort to get a good soundbite, Mayor Booker commented on Oprah that “You can not have a superior democracy with an inferior system of education” a comment that has now been re-tweeted over a hundred times. Here’s the thing. This whole situation is about a philanthropic contribution from a single wealthy individual, which has been described in the media as a contribution that carries with it a stipulation that the Governor grant unprecedented power to the Mayor to control Newark Public Schools. Anyone else seeing the contradiction here? My basic summary points are:

  • We should be concerned and skeptical any time a single individual uses their wealth to buy substantive changes to public policy.
  • Setting aside Booker’s loose use of the term democracy, I have to ask: Is it really democratic to have a single individual pay to alter the very structure of state and local government?

Would that be “democracy hypocrisy?”

Next on my list – the nature of the preferred reforms. We have little specific information on the types of school reforms that Mark Zuckerberg would like to see implemented in Newark or whether he has any specific interest in promoting certain education reforms. Zuckerberg provides some insights in this interview: http://techcrunch.com/2010/09/24/techcrunch-interview-with-mark-zuckerberg-on-100-million-education-donation/. Perhaps the most striking part of the interview is here:

So that – that way Cory is really aligned towards one – like this is his top priority. He just got re-elected by a pretty big margin and it’s his biggest priority. Then, so now – so that’s kind of what we’re doing, I mean, the idea is fund him and basically support him in doing a really comprehensive program to get all these things in place that they need to get done. [DELETE: So we should close down schools that are failing, get a lot of good charter schools and figure out new contracts for teachers so that better teachers can get paid more money, that more for performance as opposed to just based on how long you’ve been there. Have a lot of programs that are after schools that to keep kids healthy and safe and I mean, Newark, isn’t the safest city. So that’s the basic thing. And I mean for…]

I was particularly intrigued by that part in brackets, after DELETE – where Zuckerberg or the interviewer interpreting Zuckerberg seems to be suggesting a strong preference for massive charter expansion, closing public schools, and pushing for teacher contracts tied to student achievement data. The implication across media sources yesterday and today has been that the preference for these specific types of reforms across this seemingly diverse set of individuals – Zuckerberg, Oprah, Christie and Booker – validates the public interest in moving quickly toward accomplishing these public policy objectives. Setting aside the issue that fast-tracking these reforms under these circumstances is built on buying – with a big $$ gift – a change in state and local governance, I offer the following comments regarding this new unlikely alliance and these specific reform strategies:

  • It’s interesting to see such an eclectic cast of characters unify around a set of unproven and ill-conceived school reform strategies to foist upon the children of Newark.
  • The fact is that major research organizations including the National Research Council, American Educational Research Association, National Council on Measurement in Education, American Psychological Association and others have advised strongly against misusing student testing data to evaluate teacher effectiveness and there are many technical and statistical as well as practical reasons for their conclusions. With all due respect, a consensus vote in favor of these flawed policies from our Governor, the Newark Mayor, Oprah and Mark Zuckerberg doesn’t change that.

More Information: https://schoolfinance101.wordpress.com/category/race-to-the-top/value-added-teacher-evaluation/

  • The reality is that two of Newark’s most acclaimed charter schools – Robert Treat and North Star – both serve far fewer of the lowest income children than nearby Newark Public Schools (43% to 47% compared to over 70% NPS) and very few children with disabilities (3.8 to 7.8% compared to 18.1% NPS) or limited English skills. It may be ‘working’ for them, but that’s not scalable reform. Eventually someone has to serve all of those other kids.

Data link: https://sites.google.com/site/schoolfinancepolicy/Home/NJCharters.xls?attredirects=0&d=1

Update from the Star Ledger: http://www.nj.com/news/index.ssf/2010/09/facebook_ceo_mark_zuckerberg_s.html

Apparently the “deleted” section has been removed from the interview, but I’m not the only one who saw it!

A few pictures related to my comment on charter school demographics:


And here are the 2009 assessment results for NPS and Newark Charter schools. As you can see, the very low poverty charters do very well. But they just aren’t comparable to NPS schools. Other higher poverty charters, which are still actually much lower poverty (and low or no special ed) than NPS schools, are actually distributed among the NPS schools, regardless of test subject or grade.


If money doesn’t matter…

A) Then why do private independent schools, like those attended by our President’s children (Sidwell Friends in DC), or by Davis Guggenheim’s children (?), spend so much more than nearby traditional public schools?

Davis Guggenheim, director of Waiting for Superman, frequently explains to the media these days that he feels uneasy that he has made a personal choice to drive by his neighborhood school each day to bring his children to a private school. Now, I don’t know which private school his children attend, but I would suspect (though I may be wrong) that it is more likely to be an academically elite, private independent school than to be a conservative Christian or urban Catholic school. As I discuss in this previous report, the spending differences and resulting programmatic resources and teacher characteristics by type of private school are striking: http://epicpolicy.org/publication/private-schooling-US

I would see little problem with Guggenheim’s personal anecdote were it not for one of the central arguments of Superman being that money plays little or no role in fixing public education systems. Instead, tough-minded superintendents like Michelle Rhee, or charter schools are the solution – money or not.

Again, I’ll fess up to the fact that I am a former teacher at and big supporter of Private Independent Schools. Here’s the school in New York City where I used to teach: www.ecfs.org, and here’s its page on tuition: http://www.ecfs.org/admission/tuition.aspx. It was then, and I suspect still is, an outstanding example of what a school can be! But that outstanding-ness comes at a price!

(approximately $36,000 per year for middle school and up)

The problem with the assertion that “money wouldn’t help public schools anyway” is that many of those pitching the argument seem themselves to favor private schools that spend more – A LOT MORE – per child than the public schools they are criticizing as failing (to say nothing of the fact that the public schools are serving a much more diverse student population).

Here are some comparisons pulled from my 2009 study on private school expenditures.

First, here’s the per pupil spending in 2005-06 for a handful of major labor markets that had sufficient numbers of Private Independent Day Schools for calculating the averages. My original sample of IRS Tax filings covered about 75% of all Private Independent Day Schools (NAIS or NIPSA member schools), so these are not “outlier” schools.

FIGURE 1 (This figure is now the figure from my original report: http://epicpolicy.org/publication/private-schooling-US)

And here are the regional averages, adjusted for regional differences in competitive wages, using the NCES Education Comparable Wage Index.

FIGURE 2 (This figure is now the figure from my original report: http://epicpolicy.org/publication/private-schooling-US)

If money doesn’t matter when it comes to school quality, then why not pick one of those private schools that charges only $6,000 in tuition, and spends $8,000 per pupil? Clearly there is some basis for the decision to send a child to a more expensive private school? There is some “utility” placed on the differences in what those schools have to offer? In the complete report above, I discuss (in painful detail) those differences across private schools, but here, I quickly summarize some of the differences between private independent schools and traditional public school districts.

FIGURE 3 (This figure is now the figure from my original report: http://epicpolicy.org/publication/private-schooling-US)

Private independent schools a) spend a lot more per pupil, b) have much lower pupil to teacher ratios and c) have much higher shares of teachers who attended more competitive colleges. These seem like potentially substantive differences to me. And they are differences that come at a cost.

I am by no means criticizing the choice to provide your own child with a more expensive education. That is a rational choice, when more expensive is coupled with substantive, observable differences in what a school offers. I am criticizing the outright hypocritical argument that money wouldn’t/couldn’t possibly help public schools provide opportunities (breadth of high school course offerings, smaller class sizes) more similar to those of elite private independent day schools, when this argument is made by individuals who prefer private schools that spend double what nearby public schools spend.

Sidebar: I suspect there are few if any private independent day schools out there which currently evaluate their teachers based on student test score gains alone. Please let me know if you know of one? And, I should note that the private independent school where I worked in New York City was actually unionized and had a tenure system in place with a probationary period similar to that of public schools and a salary schedule tied to experience.

B) Then why do venture philanthropists continue to throw money at charter schools while throwing stones at traditional public schools?

Charter school backers like Whitney Tilson love to throw stones at public schools while throwing money at charter schools. Here’s one of his presentations:

http://www.tilsonfunds.com/Personal/TheCriticalNeedforGenuineSchoolReform.pptx

On Slide 13, Whitney Tilson opines that increased spending on public education has yielded no return to outcomes over time, and therefore, by extension, increased spending would not and could not help public schools in the future. Tilson is featured prominently in this New York Times article on affluent fund managers in NYC rallying for charter schools: http://www.nytimes.com/2010/05/10/nyregion/10charter.html?pagewanted=all

So, here we have one of many prominent New York City charter school supporters on the one hand arguing that throwing more money at the public school system could not possibly help that system, but on the other hand, providing substantial financial assistance to charter schools (or at least participating in and promoting groups that engage in such activity)?

A New York City Independent Budget Office report suggested that charter schools housed in public school facilities have comparable public subsidy to traditional NYC public schools, but charter schools not housed in public school facilities have to make up about $2,500 (per pupil) in difference. I will show in a future post, however, that student population differences (charters serving lower need populations) largely erase this differential.

Kim Gittleson points out here that in 2008-08, NYC Charter schools raised an average of $1,654 per pupil through philanthropy. But some raised as much as $8,000 per pupil. As a result, some charters – those most favored by venture philanthropists, including KIPP schools – spend on a per pupil basis much more than traditional NYC public schools. I will provide much more detail on this point in a future post.

One might argue that the Venture Philanthropists are trying to spend their way to success – To outspend the public schools in order to beat them! After all, it’s the New York Yankee, George Steinbrenner way? (spoken from the perspective of a Red Sox fan, who spent the last several years in Kansas City, supporting the underdog – low payroll – Royals).

But here’s the disconnect – These same Venture Philanthropists, like Tilson, who are committed to spending whatever it takes on charters in order to prove they can succeed, are arguing that public schools a) don’t need and b) could never use effectively any more money. They are trying to argue that charters are doing more with less, when some are doing more with more, others less with less, and some may be doing more with less, and others are actually doing less with more. Shouldn’t traditional public schools be given similar opportunity to do more with more? And don’t give me that … “we’ve already tried that and it didn’t work” claim. I’ll gladly provide the evidence to refute that one, much of which is in the article at the bottom of this post!

C) Then why do affluent – and/or low poverty – suburban school districts continue in many parts of the country to dramatically outspend their poorer urban neighbors?

Last but not least, why do affluent suburban school districts in many states continue to far outspend poor urban ones? If there is no utility to the additional dollar spent and/or no effect produced by that additional dollar then why spend it?

Here is the overall trend, over time in the relationship between community income and state and local revenues per pupil.

When the red line is above the green horizontal line, there exists a positive relationship between district income and state and local revenue. That is, higher income districts have more state and local revenue per pupil. The red line never drops below the green line. This graph, drawn from this article (http://epaa.asu.edu/ojs/article/view/718) shows that state and local revenues per pupil remain positively associated with income across school districts nationally, after controlling for a variety of factors (see article for full detail). Things improved somewhat in the 1990s, but then leveled off.

FIGURE 4 (from: http://epaa.asu.edu/ojs/article/view/718)

Here are the trends for mid-Atlantic states, where some states, including New York, improved, but where revenues remain strongly associated with income. New Jersey is the only state among these where the relationship between income and revenue is disrupted and ultimately reversed.

FIGURE 5 (from: http://epaa.asu.edu/ojs/article/view/718)

Here are the trends for the New England states, where New Hampshire school district state and local revenues remain strongly tied to income.

FIGURE 6 (from: http://epaa.asu.edu/ojs/article/view/718)

Here are the trends for the Great Lakes states, where Illinois remains among the most regressively funded systems in the nation (along with New York).

FIGURE 7 (from: http://epaa.asu.edu/ojs/article/view/718)

Here’s a specific look at state and local revenues per pupil in New York State districts in the NY metropolitan area, with districts organized by U.S. Census Poverty rates.

FIGURE 8

Is there a reason why Westchester County and Long Island school districts choose to spend so much more than New York City on a per pupil basis? What about those North Shore Chicago area districts?

These communities demand higher expenditures per pupil for their schools on a presumption that the marginal dollar does not go entirely to waste – that there is some value, some return for that dollar, perhaps in the richness of supplemental programs offered or the smaller class sizes – much like the differences in private schools seen above.

Finally, I point you to this recently published article in Teachers College Record, where Kevin Welner and I try to set the record straight on the effectiveness of “reforms” involving state school finance systems. They’re not the “reformy” reforms, but school finance reforms are reforms nonetheless.

Baker, B.D., Welner, K. School Finance and Courts: Does Reform Matter, and How Can We Tell? Teachers College Record

http://www.tcrecord.org/content.asp?contentid=16106


Value-Added and “Favoritism”

Kevin Carey from Ed Sector has done it again. He’s come up with yet another argument that fails to pass even the most basic smell test. A few weeks ago, I picked on Kevin for making the argument that while charter schools, on average, are average, really good charter schools are better than average. Or, as he himself phrased it:

reasonable people acknowledge that the best charter schools–let’s call them “high-quality” charter schools–are really good

I myself am reasonable on occasion and fully accept this premise. Some schools are really good, and some not so good. And that applies to charter schools and non-charters alike, as I show in my recent post Searching for Superguy.

Well, last week Kevin Carey did it again – made a claim that simply doesn’t even pass the most basic smell test. In the New York Times Room for Debate series on value-added measurement of teachers, Carey argued that value-added measures would protect teachers from favoritism. Principals would no longer be able to go after certain teachers based on their own personal biases. Teachers would be able to back up their “real” performance with hard data. Here’s a quote:

“Value-added analysis can protect teachers from favoritism by using hard numbers and allow those with unorthodox methods to prove their worth.” (Kevin Carey, here)

The reality is that value-added measures simply create new opportunities to manipulate teacher evaluations through favoritism. In fact, it might even be easier to get a teacher fired by making sure the teacher has a weak value-added scorecard. Because value-added estimates are sensitive to non-random assignment of students, principals can easily manipulate the distributions of disruptive students, students with special needs, students with weak prior growth and other factors, which, if not fully accounted for by the VA model, will bias teacher ratings. And some factors – like disruptive students, or those who simply don’t give a $#*! – won’t (and can’t) be addressed in the VA models. That is, a clever principal can use the VA non-random assignment bias to create a statistical illusion that a teacher is a bad teacher. One might argue that some principals likely already engage in a practice of assigning more “difficult” students to certain teachers – those less favored by the principal. So, even if the principal is less clever and merely spiteful, the same effect can occur.
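Here is a minimal sketch in Python of how that statistical illusion can arise. Everything in it is an assumption chosen for illustration: the function `simulate_gains`, the size of the "disengaged" penalty, the noise level, and the 500 students per teacher (think of several years of classes pooled together). The two teachers are equally effective by construction. The only difference is how many students with an unmeasured trait each is assigned.

```python
import random
import statistics

def simulate_gains(n_students, share_disengaged, teacher_effect, rng):
    """Return test-score gains for one teacher's students.

    Each gain = teacher effect + a penalty if the student is disengaged
    (a trait the rating does not observe) + random noise.
    All magnitudes are made-up illustration values.
    """
    gains = []
    for _ in range(n_students):
        disengaged = rng.random() < share_disengaged
        penalty = -5.0 if disengaged else 0.0
        gains.append(teacher_effect + penalty + rng.gauss(0, 3))
    return gains

rng = random.Random(42)
TRUE_EFFECT = 10.0  # identical for both teachers, by construction

# The principal steers few disengaged students to one teacher, many to the other.
favored = simulate_gains(500, share_disengaged=0.10, teacher_effect=TRUE_EFFECT, rng=rng)
unfavored = simulate_gains(500, share_disengaged=0.50, teacher_effect=TRUE_EFFECT, rng=rng)

# A naive rating: the average gain of each teacher's students.
print("favored teacher:  ", round(statistics.mean(favored), 2))
print("unfavored teacher:", round(statistics.mean(unfavored), 2))
```

The unfavored teacher's average gain comes out lower even though both teachers contribute exactly the same amount to each student. A rating that cannot see the trait driving the gap attributes it to the teacher.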

I wrote in an earlier post about the types of contractual protections teachers should argue for, in order to protect against such practices:

The language in the class size/random assignment clause will have to be pretty precise to guarantee that each teacher is treated fairly – in a purely statistical sense. Teachers should negotiate for a system that guarantees “comparable class size across teachers – not to deviate more than X” and that year to year student assignment to classes should be managed through a “stratified randomized lottery system with independent auditors to oversee that system.” Stratified by disability classification, poverty status, language proficiency, neighborhood context, number of books in each child’s home setting, etc. That is, each class must be equally balanced with a randomly (lottery) selected set of children by each relevant classification.
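For the curious, a "stratified randomized lottery" of this kind might look something like the following Python sketch. It is only an illustration under simplifying assumptions: the function name `stratified_lottery` is made up, each student is reduced to a single stratum label (here a hypothetical pair of flags), and real rosters would involve many more classifications and constraints.

```python
import random
from collections import defaultdict

def stratified_lottery(students, n_classes, seed=0):
    """Assign students to classes so each stratum is spread evenly.

    students: list of (student_id, stratum) pairs, where stratum is a
    hypothetical tuple such as (has_disability, low_income).
    Returns a dict mapping class index -> list of student ids.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for student_id, stratum in students:
        by_stratum[stratum].append(student_id)

    classes = {c: [] for c in range(n_classes)}
    offset = 0  # carry the rotation across strata so class sizes stay balanced
    for stratum in sorted(by_stratum):
        members = by_stratum[stratum]
        rng.shuffle(members)  # the lottery: random order within the stratum
        for i, student_id in enumerate(members):
            classes[(offset + i) % n_classes].append(student_id)
        offset += len(members)
    return classes

# Hypothetical roster of 60 students with two made-up classification flags.
roster = [(i, (i % 5 == 0, i % 2 == 0)) for i in range(60)]
for class_index, members in stratified_lottery(roster, n_classes=3).items():
    print(class_index, len(members))
```

Dealing each shuffled stratum out in rotation means no class ends up with more than one extra student from any stratum, and overall class sizes differ by at most one. Which students land in which class is left to chance, which is the part an independent auditor would want to verify.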

This may all sound absurd, but sadly, under policies requiring high stakes decisions such as dismissal to be based on value added measures, this stuff will likely become necessary. And, it will severely constrain principals who wish to work closely with teachers on making thoughtful, individualized classroom assignments for students. I address the new incentives of teachers to avoid taking on the “tough” cases in this post: https://schoolfinance101.wordpress.com/2010/09/01/kids-who-don%E2%80%99t-give-a-sht/

Technical follow-up: I noticed that Kevin Carey claims that VA measures “level the playing field for teachers who are assigned students of different ability.” This statement, as a general conclusion, is wrong.

a) VA measures do account for the initial performance level of individual students, or they would not be VA measures. Even this becomes problematic when measures are annual rather than fall/spring, so that summer learning loss is included in the year to year gain. An even more thorough approach for reducing model bias is to have multiple years of lagged scores on each child in order to estimate the extent to which a teacher can change a child’s trajectory (growth curve). That makes it more difficult to evaluate 3rd or 4th grade teachers, where many lagged scores aren’t yet available. The LAT model may have had multiple years of data on each teacher, but didn’t have multiple lagged scores on each child. All that the LAT approach does is to generate a more stable measure for a teacher, even if it is merely a stable measure of the bias of which students that teacher typically has assigned to him/her.

b) VA measures might crudely account for socio-economic status, disability status or language proficiency status, which may also affect learning gains. But typical VA models, like the LA Times model by Buddin, tend to use relatively crude, dichotomous proxies/indicators for these things. They don’t effectively capture the range of differences among kids. They don’t capture numerous potentially important, unmeasured differences. Nor do they typically capture classroom composition (peer group) effects, which have been shown to be significant in many studies, whether measured by racial/ethnic/socioeconomic composition of the peer group or by average performance of the peer group.

c) For students who have more than one teacher across subjects (and/or teaching aides/assistants), each teacher’s VA measures may be influenced by the other teachers serving the same students.

I could go on, but recommend revisiting my previous posts on the topic where I have already addressed most of these concerns.

Searching for Superguy in Gotham

Who is Superguy? By most popular accounts, Superguy is a figure of mythical proportion (urban legend proportion) capable of swooping down into the poorest of urban neighborhoods of America’s largest cities, gaining immediate access to schooling facilities, rounding up unthinkable private contributions from wealthy philanthropists and quite simply saving the lives of low-income urban school children trapped in bleak, adult-centered, perpetually failing traditional public schools.

Superguy could be found anywhere in the U.S. where urban charter schools have proliferated in the past decade – Kansas City, Washington D.C., Chicago, Dallas, Houston, or even more likely, New York City – Gotham itself (yeah… Gotham was a Batman thing, not Superman… but hang in there with me). I’ve chosen to focus on urban locations here, because who has ever heard of a “rural legend?”

I’ve written on a number of occasions about my general skepticism that Superguy really exists, or that he necessarily exists in the form of an urban charter school operator. My skepticism is based on my own read of the balance of research on charter schools and my own casual analysis  of New York City and New Jersey Charter Schools. New Jersey Charter Schools in particular are pretty average and those that are better than average serve very few of the lowest income children, no special needs children and few or no limited English proficient children. Personally, I’d expect Superguy to be out there fighting for these kids in particular, not just setting up shop in their neighborhood and cream-skimming the less needy among the more needy. But hey, that’s just my notion of what Superguy should be.

For an exceptional review of charter school research, I would recommend Robert Bifulco and Katrina Bulkley’s chapter on Charter Schools in the Handbook of Research on Education Finance and Policy. Neither of these scholars is a charter school naysayer, yet they conclude:

Research to date provides little evidence that the benefits envisioned in the original conceptions of charter schools – organizational and educational innovation, improved student achievement, and enhanced efficiency – have materialized.

Of course, the true believers in Superguy (as charter operator) will argue vehemently that the finding that charters, on average, are average does not shake their belief… because the “upper half of charter schools is really good!, better than average, in fact!” Skeptically, I respond – isn’t the upper half of all schools better than average? If so, might Superguy actually be found in any school that’s better than average? But who am I to nitpick?
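The nitpick is just arithmetic, and a tiny Python sketch with made-up scores shows it. The two score lists below are hypothetical, chosen only so that the two sectors are roughly "average" relative to each other. In any group of schools with some spread in performance, the upper half will beat the group's own average.

```python
import statistics

def upper_half_mean(scores):
    """Mean score of the top half of a list of school scores."""
    ordered = sorted(scores)
    return statistics.mean(ordered[len(ordered) // 2:])

# Hypothetical proficiency rates, not real data.
charter = [48, 52, 55, 60, 61, 67, 70, 74]
traditional = [45, 50, 56, 59, 62, 66, 71, 77]

for label, scores in (("charter", charter), ("traditional", traditional)):
    print(label, statistics.mean(scores), upper_half_mean(scores))
```

Both sectors have an upper half that is "really good, better than average." That says nothing about whether one sector is better than the other.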

The most compelling evidence that Superguy exists was provided in Caroline Hoxby’s finding regarding NYC charter schools that:

On average, a student who attended a charter school for all of grades kindergarten through eight would close about 86 percent of the “Scarsdale-Harlem achievement gap” in math and 66 percent of the achievement gap in English.

Who other than Superguy could close the Harlem-Scarsdale gap? However, Stanford University researcher Sean Reardon explains:

Because the report relies on an inappropriate set of statistical models to analyze the data, however, the results presented appear to overstate the cumulative effect of attending a charter school.

Superguy in Gotham is also assumed to have competitive effects, lifting entire neighborhoods wherever he may be present. This evidence is often cited to Marcus Winters’ (Manhattan Institute) finding that:

The analysis reveals that students benefit academically when their public school is exposed to competition from a charter.

But thwarting this Superguy sighting is Wellesley economist Patrick McEwan’s observation that:

The statistical analysis suggests that increasing competition has no statistically significant impact on math test scores, but that it has small positive effects on language scores. The report does not conclusively demonstrate that the results are explained by increasing competitive pressure on public school administrators; they may also be explained by shifting peer quality or declining short-run class sizes in public schools.

Those pesky, curmudgeonly academics are at it again… denying the true believers that Superguy comes in the singular form of a New York City charter school operator!

Then there’s the claim that Superguy himself may have been outed in Harlem (is Superguy really Geoffrey Canada?) – as evidenced by Dobbie and Fryer’s studies of the amazing success of Harlem Children’s Zone. But then Russ Whitehurst of Brookings stepped in to rain on this parade, finding the HCZ Promise Academy to be relatively average as far as NYC charter schools go.

For several additional curmudgeonly critiques of Superguy sightings, see: http://epicpolicy.org/search/epicpolicy/charter

It is with these contradictory findings in mind that I present the following figures, and we begin our statistical search for Superguy. Now, this is not a really deep, statistically rigorous search for Superguy. The approach here is what some refer to as a “beating the odds” (BTO) approach and is similar to the adjusted performance approach used by Whitehurst in his Brookings critique of HCZ.

It seems that the logical place to start would be New York City, home to the greatest number of Superguy sightings. Let’s begin with a simple flyover of NYC schools, including traditional public schools and charter schools just to get a feel for the demographics of those schools.

Here is the % of children qualifying for Free Lunch across Harlem (and South Bronx) schools. This map does not indicate which schools are charters, but you can click the link below the map to see which ones are.

CLICK HERE TO SEE WHICH SCHOOLS ARE CHARTER SCHOOLS (indicated with an asterisk)

Here is the % of children who are limited in their English language proficiency. Again, this map doesn’t show us which schools are charters, but you can click the link below.

CLICK HERE TO SEE WHICH SCHOOLS ARE CHARTER SCHOOLS (indicated with an asterisk)

Now, here’s our first “Beating the Odds” scatterplot. The predicted performance values expressed on the vertical axis are from a regression equation that accounts for a) limited English proficiency shares, b) free lunch shares, c) mobility shares, d) borough and e) year (includes 2008 & 2009). These graphs look at the adjusted performance levels (not value-added) of NYC traditional public and NYC charter schools (standardized difference between actual and predicted values for 2009). These are illustrative. While outcome levels do go up (are inflated) in 2009, the distributions in these scatterplots don’t change a whole lot if I use 2008 or earlier. (here are the models)
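For readers who want to see the mechanics, here is a minimal sketch of a “beating the odds” calculation. It is not the model behind these graphs. The data are made up, the field names (free_lunch, lep, mobility, score) are hypothetical, and the borough and year terms are left out to keep it short. The idea is simply: regress scores on student population shares, then express each school’s actual-minus-predicted gap in standard deviation units.

```python
import random
import statistics

def ols_fit(X, y):
    """Least squares via the normal equations (X'X)b = X'y, Gauss-Jordan solve."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        pv = A[col][col]
        A[col] = [v / pv for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]

def beating_the_odds(schools):
    """Return each school's standardized residual (actual minus predicted)."""
    X = [[1.0, s["free_lunch"], s["lep"], s["mobility"]] for s in schools]
    y = [s["score"] for s in schools]
    beta = ols_fit(X, y)
    residuals = [yi - sum(b * x for b, x in zip(beta, row))
                 for row, yi in zip(X, y)]
    sd = statistics.pstdev(residuals)
    return [r / sd for r in residuals]

# Entirely synthetic schools; the coefficients below are invented for illustration.
rng = random.Random(0)
schools = []
for _ in range(200):
    fl = rng.uniform(0.2, 1.0)    # share qualifying for free lunch
    lep = rng.uniform(0.0, 0.4)   # share with limited English proficiency
    mob = rng.uniform(0.0, 0.3)   # mobility share
    score = 700 - 40 * fl - 30 * lep - 20 * mob + rng.gauss(0, 10)
    schools.append({"free_lunch": fl, "lep": lep, "mobility": mob, "score": score})

z = beating_the_odds(schools)
n_beating = sum(1 for v in z if v > 0)
print(f"{n_beating} of {len(z)} schools score above their predicted level")
```

Schools with a positive value sit above the red line, and schools with a negative value sit below it. Note that by construction roughly half of any set of schools will “beat the odds” under this approach, which is part of why it is illustrative rather than rigorous.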

The first Beating the Odds scatterplot looks at the average performance from 2009 (yes, the really inflated test score year) for NYC public schools. Charter schools are not identified in this graph. Schools are displayed from low to high poverty along the horizontal axis. Schools above the red horizontal line are beating the odds, or scoring higher than expected given their location and student population. Schools below the line are, well, not beating the odds. We would, of course, expect our Superguy operated schools to be flying high above the rest and certainly not falling well below… at the bottom of the scatter. So, where is Superguy?

CLICK HERE TO SEE WHICH SCHOOLS ARE CHARTER SCHOOLS

Now, here’s the average of the 4th and 5th grade outcomes. Same deal. A good ol’ BTO analysis (yeah… this isn’t really rigorous stuff, but it is illustrative). Again, charters are not identified in this picture. Yes, there are some high and some low flyers in this graph too. But are all of the high flyers charter schools? Is Superguy really here?

CLICK HERE TO SEE WHICH SCHOOLS ARE CHARTER SCHOOLS

While I don’t think we’ve found Superguy here, we are left with some potential clues about the conditions surrounding Superguy sightings – A) that Superguy sightings seem more common in the presence of unexplainable deficits in the shares of children who qualify for Free Lunch, B) that Superguy sightings seem more common in the presence of unexplainable deficits in the shares of children with limited English proficiency. Other than that, it seems that Superguy is just as likely to be hiding in a traditional New York public school as he is to be secretly disguised as a charter school operator somewhere in Gotham.

Alternatively, there exists the depressing but real possibility that Superguy simply doesn’t exist – at least not in the expected form. That there just isn’t a charter school operator out there who can single-handedly swoop into poor urban neighborhoods and save children’s lives – creating results never seen before with a truly representative population of children. Or, at the very least, not all or even the average charter school operator qualifies as Superguy. Yes, some are better than others. And, some are quite good. But you know what I have to say about that argument (see above).

Yeah… I’d like to be a believer. I don’t mean to be that much of a curmudgeon. I’d like to sit and wait for Superguy – perhaps watch a movie while waiting (gee… what to watch?). But I think it would be a really long wait and we might be better off spending this time, effort and our resources investing in the improvement of the quality of the system as a whole. Yeah, we can still give Superguy a chance to show himself (or herself), but let’s not hold our breath, and let’s do our part on behalf of the masses (not just the few) in the meantime.

Value-added and the non-random sorting of kids who don’t give a sh^%t

Last week, this video from The Onion (asking whether tests are biased against kids who don’t give a sh^%t) was going viral among the education social networking geeks like me. At the same time, the conversations continued on the Los Angeles Times Value-Added story, with LAT releasing the scores for individual teachers.

I’ve written many blog posts in recent weeks on this topic. Lately, it seems that the emphasis of the conversation has turned toward finding a middle ground – discussing the appropriate role for VAM (Value Added Modeling) – if any, in teacher evaluation. But also, there is renewed rhetoric defending VAM. Most of that rhetoric seems to take on most directly the concern over the error rates in VAM – and the lack of strong year-to-year correlation between which teachers are rated high or low.

The new rhetoric points out that we’re only having this conversation about VAM error rates because we can measure the error rate in VAM, but can’t even do that for peer or supervisor evaluation – which might be much worse (argue the pundits). The new rhetoric argues that VAM is still the “best available” method for evaluating teacher “performance.” Let me point out that if the “best available” automobile burst into flames on every fifth start, I think I’d walk or stay home instead. I’d take pretty significant steps to avoid driving. Now, we’re not talking about death by VAM here, but the idea that random error alone – under an inflexible VAM-based policy structure – could lead to wrongfully firing a teacher is pretty significant.

Again, this current discussion pertains only to the “error rate” issue. Other major, perhaps even bigger, issues include the problem that so few teachers could even have test scores attached to them. That creates a whole separate sub-class (<20%) of teachers in each school system and increases divisions among teachers – creating significant tension, for example, between teachers under the VAM (math/reading) rating system and teachers who might want to meet with some of their students for music, art or other enrichment endeavors.

Perhaps most significantly, there still exists that pesky little problem of VAM not being able to sufficiently account for the non-random sorting of students across schools and teachers. For those who wish to use Kane and Staiger as their out on this (without reference to broader research on this topic), see my previous post on the LAT analysis. Their findings are interesting, but not the single definitive source on this issue. Note also that the LAT analysis itself reveals some bias likely associated with non-random assignment (the topic of my post).

So then, what the heck does this have to do with The Onion video about testing and kids who don’t give a sh^%t?

I would argue that the non-random assignment of kids who don’t give a sh^%t presents a significant concern for VAM. Consider any typical upper elementary school. It is quite possible that kids who don’t give a sh^%t are more likely to be assigned to one fourth grade teacher year after year than to another. This may occur because that fourth grade teacher really wants to try to help these kids out, and has some, though limited, success in doing so. This may also occur because the principal has it in for one teacher – and really wants to make his/her life difficult. Or, it may occur because all of the parents of kids who do give a sh^%t (in part because their parents give a sh^%t) consistently request the same teacher year after year.

In all likelihood, whether the kids give a sh^%t about doing well – and specifically doing well on the tests used for generating VA estimates – matters, and may matter a lot. Teachers with disproportionate numbers of kids who don’t give a sh^%t may, as a result, receive systematically lower VA scores, and if the sorting mechanisms above are in place, this may occur year after year.

What incentive does this provide for the teacher who wanted to help – to help kids give a sh^%t? Statistically, even if that teacher made some progress in overcoming the give a sh^%t factor, the teacher would get a low rating because the give a sh^%t factor would not be accounted for in the model. Buddin’s LAT model includes dummy variables for kids who are low income and kids who are limited in their English language proficiency. But, there’s no readily available indicator for kids who don’t give a sh^%t. So we can’t effectively compare one teacher with 10 (of 25) kids who don’t give a sh^%t to another with 5 (of 25) who don’t give a sh^%t. We can hope that giving a sh^%t, or not, is picked up by the child’s prior year performance, and even better, by the prior multiple years of value-added estimates on that child. But, do we really know whether giving a sh^%t is a stable student characteristic over time? Many VAM models like the LAT one don’t capture multiple prior years of value-added for each student.
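To make the 10-of-25 versus 5-of-25 comparison concrete, here is a toy simulation. It is not Buddin’s model, and every number in it besides the class compositions is an assumption I made up: two teachers with an identical true effect, a hypothetical 5-point hit to test score gains for each kid who doesn’t give a sh^%t, and random noise. The “rating” is just the class average gain, with nothing in the model to account for who gives a sh^%t.

```python
import random
import statistics

def simulate_class_gain(n_students, n_disengaged, teacher_effect,
                        penalty, noise_sd, rng):
    """Average test score gain for one class in one year."""
    gains = []
    for i in range(n_students):
        disengaged = i < n_disengaged  # the first n_disengaged kids don't care
        gain = teacher_effect
        if disengaged:
            gain -= penalty            # unmeasured by the rating below
        gain += rng.gauss(0.0, noise_sd)
        gains.append(gain)
    return statistics.mean(gains)

rng = random.Random(42)
TRUE_EFFECT = 3.0   # identical for both teachers (assumed)
PENALTY = 5.0       # hypothetical cost of not caring, in score points
NOISE_SD = 10.0     # hypothetical student-level noise

gaps = []
for _ in range(2000):  # many simulated years, same sorting every year
    teacher_a = simulate_class_gain(25, 10, TRUE_EFFECT, PENALTY, NOISE_SD, rng)
    teacher_b = simulate_class_gain(25, 5, TRUE_EFFECT, PENALTY, NOISE_SD, rng)
    gaps.append(teacher_a - teacher_b)

mean_gap = statistics.mean(gaps)
share_a_lower = sum(1 for g in gaps if g < 0) / len(gaps)
print(f"Average rating gap (A minus B): {mean_gap:.2f} points")
print(f"Share of years teacher A is rated lower: {share_a_lower:.0%}")
```

Under these assumptions the expected gap is (10 − 5)/25 × 5 = 1 point against teacher A, even though the two teachers are equally effective by design. In any single year noise can push the gap either way, but the sorting pushes it in the same direction every year, so this is bias and not just error.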

I noted in previous posts that peer effects are among those factors that compromise (bias) teacher VAM ratings. Buddin’s LAT model, as far as I can tell, doesn’t try to capture differences in peer group when attempting to “isolate” teacher effect (though this is very difficult to accomplish). Unlike racial characteristics or child poverty, whether 1 or 10 kids in a class give a sh^%t might rub off on others in the class. Or, the disruptive behavior of kids who don’t give a sh^%t might significantly compromise the learning (and value-added estimates) of others. Yet, all of this goes unmeasured in even the best VAMs.

Once again, just pondering…

NEW: BONUS VIDEO