NJ Charter Data Round-up

Note: I will be making updates to this post in the coming days/weeks.

As we once again begin discussing & debating the appropriate role for charter schools in New Jersey's education reform "mix," here's a round-up of the New Jersey charter school numbers: demographic comparisons to all other public and charter schools in the same 'city,' and proficiency rates (across all grades) compared to all others in the same 'city.'

Key Findings:

Many NJ charter schools, especially those most often touted in the media as great success stories, continue to serve student populations that differ dramatically from the populations of surrounding schools in the same city (see note *). These charters differ in the percentage of children who qualify for free lunch, the percentage classified as having disabilities, and the percentage with limited English language proficiency.

On average, given their demographics, NJ charter schools continue to have proficiency rates around where one would expect. Demographically advantaged charter schools have higher average proficiency than other schools around them. Demographically disadvantaged charter schools have lower average proficiency rates than others around them. No tricky/heavy statistics here. Just a comparison of relative proficiency and relative demography.

When one estimates what I would call a "descriptive regression" model characterizing the differences in proficiency rates across district and charter schools in the same cities, one finds that, compared against schools of similar demography and on the same grade level and subject area tests, charter proficiency rates are, on average, no different from those of their traditional public school counterparts. In this particular regression model, charters did have higher proficiency in science (a charter x science interaction). More descriptive stuff to come when I get a chance. Not sure when that will be.

Note: The model includes a fixed effect for CITY location for each traditional public and charter school, such that each charter is compared against other schools in the same CITY.
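For readers who want to see the shape of that model, here's a minimal sketch in Python using statsmodels. To be clear, this illustrates the general structure (city fixed effects, demographic controls, a charter-by-science interaction); it is not the exact specification I estimated, and the file and variable names are hypothetical placeholders.

```python
# Minimal sketch of a descriptive regression with CITY fixed effects.
# File and variable names are hypothetical placeholders, not the actual data.
import pandas as pd
import statsmodels.formula.api as smf

schools = pd.read_csv("nj_school_proficiency.csv")  # one row per school x grade x subject

# Indicator for the science test, derived from the subject column;
# 'charter' is assumed to be a 0/1 indicator for charter status.
schools["science"] = (schools["subject"] == "science").astype(int)

# Proficiency as a function of charter status, demographics, a charter x science
# interaction, and fixed effects for city, grade level, and subject-area test.
model = smf.ols(
    "proficiency ~ charter + charter:science"
    " + pct_free_lunch + pct_lep + pct_special_ed"
    " + C(city) + C(grade) + C(subject)",
    data=schools,
).fit()

print(model.summary())
```

The C(city) term is what forces each charter to be compared against other schools in the same city.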

But to be absolutely clear, this particular analysis misses the point entirely in two ways. First, it is merely descriptive of the average proficiency rates of charter and non-charter schools across tests, subjects & grades. It is not, by any means, a test of the comparative effectiveness of schools. Second, as I explain below, comparisons of charterness vs. non-charterness are not particularly helpful for informing policy.

Policy Perspectives:

Issue 1: The relevant policy question is not whether charters, on average, perform better than traditional public schools, and therefore whether we should simply replace more traditional public schools with more charter schools. The relevant questions are "what works? For whom? And under what circumstances?" Charter schools, traditional public schools and private schools all vary widely in quality and in their ability to serve different populations well. Some schools of each organizational type do well (at least for some kids) while others, quite bluntly, suck, no matter who they try to serve. Further, I've written previously about these arguments that charters or private schools "do more (than traditional public schools) with less money." However, rarely are those money comparisons rigorously or accurately conducted. Oftentimes, the assertion of "more with less" isn't backed by any analysis at all of the "with less" part of that equation (and sometimes not the "more" part either). But these are the types of issues we need to be exploring, including, specifically, the resource implications of the models being offered by those "successful" schools, be they charters, traditional public schools, or other alternatives.

Issue 2: It may not be that the only appropriate role for charters in the mix is for them all to try to serve the most representative population – a population mirroring that of the city as a whole or their zip codes. But for those that don't – for those that serve a niche – we need to recognize them as such, and we need to monitor the extent to which their demographic selection may have adverse effects on the system as a whole. We also need to recognize that their demographic difference may play a significant role in explaining either their apparent success or apparent failure. We should recognize, for example, that schools like Robert Treat or North Star Academy may be showing high outcomes but are doing so largely as a function of serving very different populations than others around them. Further, there may be nothing wrong with that if they are truly doing well by the kids they serve. That may just be their appropriate niche. We just can't pretend that this model of success can be spread citywide or statewide. And it may be inappropriate to encourage these schools to serve more representative populations. Perhaps they should stick with what they are good at. As a result, it may be more reasonable for charters like North Star or Robert Treat to establish similar niche schools in other New Jersey cities rather than pretending they can expand dramatically in the same cities and still maintain their current levels of achievement.

Issue 3: We also need to remember that NJ's large urban districts themselves operate a wide variety of schools and segment their own student populations at the secondary level through such options as magnet schools. Charters aren't the only segmenting force. Charters, including those that are demographically representative and those that aren't, have simply become a part of that mix. And we need to recognize where each fits into that mix and consider very seriously the implications for the system as a whole.

Issue 4: Finally, as I so often point out, policy perspectives and parental interests may differ sharply when it comes to "elite" charter schools. From a policy perspective, elite charter schools have limited implications for scalability (and for charters as a broad-based policy "solution") because their benefits are derived from concentrating motivated, often less poor (non-disabled & fluent English speaking), self-selected students with the staying power to endure "no excuses" charter models. From a parental perspective, this public policy limitation often provides the strongest personal incentive to pursue a specific school for one's own children. Again, it comes down to that "other" strongest in-school factor driving student success – the peer effect. The peer effect is a limitation (confounding factor) in public policy (unless we can find clever strategies to optimize peer distribution). But the peer effect may be a legitimate quality indicator for parental choices.

Data notes:

As I’ve noted numerous times on this blog, my goal here is to access and report on publicly available data from widely recognized and/or official government sources. These are the most recent data of that type available. And here are the sources:

District and Charter School Location Information: http://nces.ed.gov/ccd/bat (2009-2010)

District special education classification rates: http://www.nj.gov/education/specialed/data/ADR/2010/classification/distclassification.xls

School level % LEP/ELL & % Free Lunch: http://www.nj.gov/education/data/enr/enr11/enr.zip

Combined Demographic Data: Charter Demographics 2011

*Note: City and zip code averages were constructed by summing all students, all free lunch students, and all LEP/ELL students across all schools in each "city" and in each "zip code" (as identified by school location in the NCES Common Core of Data), and then dividing the city-wide (or zip-code-wide) totals of LEP/ELL and free lunch students by city-wide (or zip-wide) total enrollment, across both traditional public schools and charters (that is, charters are part of the city-wide, or zip-wide, average). For special education, to estimate the citywide (and zip) average for schools, the district overall rate was applied to district schools. This would not be an appropriate way to compare individual city schools to charter schools, since special education populations are not evenly distributed across city schools (or throughout a zip code), but it is a more reasonable approach for generating the citywide aggregates. Again, charters are included in citywide and in zip-code level averages.
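For the curious, that aggregation amounts to something like the following sketch (hypothetical column names, not the actual processing code):

```python
# Sketch of the citywide aggregation described above; column names are
# hypothetical. Charters are included in each city's totals.
import pandas as pd

schools = pd.read_csv("nj_school_demographics.csv")

city = schools.groupby("city").agg(
    total_enrollment=("enrollment", "sum"),
    total_free_lunch=("free_lunch_count", "sum"),
    total_lep=("lep_count", "sum"),
)

# Shares computed from summed counts, not by averaging school-level percentages.
city["pct_free_lunch"] = city["total_free_lunch"] / city["total_enrollment"]
city["pct_lep"] = city["total_lep"] / city["total_enrollment"]
```

The same logic applies at the zip-code level by grouping on zip code instead of city.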

Differentiating “cost savings” from “expenditure reduction”

Today, it's time for a little School Finance 101, clarifying the difference between a "cost savings" and an "expenditure reduction."

Cost savings means finding ways to reduce expenditure while still addressing the same range of objectives (goals, intended outcomes) and while still achieving the same level or quality of outcomes with respect to each objective.

Expenditure reduction typically means choosing not to address some objectives, goals or intended outcomes. That is, to cut back the scope of production (drop a product line, eliminate product features, cut curricular offerings, address fewer objectives/goals). Further, expenditure reduction might also mean simply choosing not to shoot for as high outcomes on specific objectives.

Note that an expenditure reduction can also be achieved by realizing actual cost savings. But, any old expenditure reduction cannot necessarily be classified as cost savings.

Here's an example of potential "cost savings." If you are running a small, remote rural district and questioning whether you can continue to maintain advanced placement calculus as an in-house, district staffed course for 3 to 5 students per year, you might consider the alternative of having those students take the course online. It may just be that student outcomes, for your particular students and that particular course, are not measurably adversely affected by the change, and that the change results in a substantial reduction in expenditure per child on AP calculus. That is, expenditure was reduced and the outcome held constant. Cost savings were achieved.

The cost savings example above is considerably different from comparing the total per pupil expense on brick and mortar schooling and all of the programs and services included in that, with the per pupil expense needed to offer an online curriculum of core required academic courses (or any subset that differs in scope of goals/objectives).  This is the critical flaw in the interpretations and presentations of (some though not all of) the findings in the recent Fordham Institute study on the supposed “cost” of online versus blended, versus brick and mortar schooling.

One might clean up this aggregate brick and mortar to online schooling comparison by attempting to isolate the per pupil costs of offering those same courses/programs/services within the brick and mortar structure and measuring and comparing the outcomes of those same courses offered each way (in house versus online, as above with AP Calculus).[1]

In these times of tight local school district budgets and rhetoric about “stretching the school dollar” and the “new normal,” paying close attention to the distinctions above is critically important.

The recent Fordham Report on online and blended learning provides some interesting new data, but provides no insights (yet) regarding “cost savings.” Again, there’s some potentially useful stuff in there, but comparisons like those made in Figure 1, p. 4  (comparing total brick and mortar per pupil spending to the other two options) are very deceptive and do much to undermine the rest of the report.

Similarly, the “stretching the dollar” brief released last year by the Fordham Institute provides little or no valuable information regarding “cost savings” but does provide a laundry list of ideas for cutting services (with no evidence or measure of the results of such cuts), such as cutting off services to limited English speaking children after two years or cutting total funding to special education (by capping and redistributing those funds uniformly across districts). Kevin Welner and I address in greater detail the various expenditure reduction strategies cast as “cost savings” by Petrilli and Roza in the “stretching the dollar” brief in a recent NEPC report.

Further, it's important to understand that it's not necessarily even an expenditure reduction when a school district cuts from its budget something that it then expects someone else, such as a parent, to pay for (like cutting district funding for athletic travel and either replacing it with fees or expecting local sports booster clubs to raise the money). It may be a school budget reduction, and a reduction to the school's expenditure, but the expenditure is still there.

I'm not saying that schools or districts should never simply cut expenditures by reducing the scope of their services, or by shooting lower on some goals. Some goals/objectives may no longer be (as) important, or may need to be traded off to use scarce resources toward other more "important" (importance being measured in any number of ways) goals/objectives.

Rather, I’m saying that if it’s an expenditure cut, it’s an expenditure cut.

If it’s really just a transfer of responsibility for the expenditure, acknowledge that.

And, if it really is an attempt at “cost savings” then it’s legit to call it that.

So, when presented with these quick and easy, off the shelf school finance solutions for supposed cost savings, please ask yourself whether the authors/presenters really have evaluated cost savings or merely expenditure reductions.

And to those authors/presenters, who I'm not always sure understand the difference themselves: please make at least some effort to differentiate between real "cost savings" and simple "expenditure reduction."

[1] Alternatively, one might argue that the singular goal of any of the three options is a high school diploma, and that the different estimates are of the “costs” under each model of achieving that singular goal. However, in this case, it becomes important to evaluate the “quality” of the outcome – high school diploma – when obtained these very different ways (perhaps by evaluating preparedness for higher education, access, persistence, 6-year graduation, for otherwise similar students).

Misunderstanding & Misrepresenting the “Costs” & “Economics” of Online Learning

The Fordham Institute has just released its report titled “The Costs of Online Learning” in which they argue that it is incrementally cheaper to move from a) brick and mortar schooling to b) blended learning and then c) fully online learning.

http://www.edexcellencemedia.net/publications/2012/20120110-the-costs-of-online-learning/20120110-the-costs-of-online-learning.pdf

Accompanying this report is a blog post titled “Understanding the Economics of Online Learning” from the Quick and the Ed. http://www.quickanded.com/2012/01/understanding-the-economics-of-online-learning.html/comment-page-1#comment-78690

At first glance, both the report itself and especially the blog post from Quick & Ed display fundamental misunderstandings of "cost," a very basic economic concept. I find this particularly disturbing in a blog post titled "understanding the economics of online learning."

"Cost" refers to the cost of providing a service of a specific quality, which in education might be measured in terms of student outcomes. That includes all costs of achieving those outcomes, whether covered by public subsidy or passed along to other participants in the system. A really good guide for understanding "costs" in this type of analysis is Hank Levin's book on Cost Effectiveness Analysis.

By contrast, an "expense" is that which is expended toward providing some given level or portion of service. You can spend less and get less. You can spend more and get more. But getting Y quality service will cost you X, and no less than X (where X represents the minimum amount you would need to spend, given the most efficient production technology for achieving Y quality of service). You can conceivably spend more than X for Y quality of service, but that would be, shall we say, inefficient.
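To put the distinction in symbols (my own notation, not a formula drawn from any of the sources discussed here): the cost of achieving outcome level Y is the minimum expenditure capable of producing it.

```latex
% Cost of outcome level Y: the minimum expenditure that can produce it,
% where x indexes ways of organizing production, E(x) is the expenditure
% under x, and Q(x) is the quality/outcome level produced.
C(Y) = \min_{x} \{\, E(x) : Q(x) \geq Y \,\}
```

Any actual expenditure that achieves quality Y must be at least C(Y); spending less than C(Y) doesn't "save" cost, it necessarily buys something less than Y.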

Often to cover the full cost of any particular service, like public schooling for example, several parties incur expenses. It is assumed that the majority of the cost of brick and mortar schooling is covered at government expense. But, we all know that there are also fees for many things in some states (and districts), such as participation fees for sports, personal expense on school lunch, or transportation fees. Assuming attendance is compulsory, transportation fees are necessarily part of the cost of the education system whether covered by parents through fees (a tax by another name) or covered by the local public school district.

The “cost” of brick and mortar schools doesn’t change if we simply decide to cut transportation services while maintaining compulsory attendance laws. Rather, we pass along that expense to someone else – the parents. That expense is still there, and it may have even increased if we add in the cumulative parental expense on transportation (in effect, a tax for school participation).

What’s being compared in the online learning report is not “cost” but expenses on varied levels of service provision.

We might be generous here, set aside the thorniest issue, and assume that the measured academic outcomes addressed by each option are the same regardless of model type or student served (likely a huge, unsupported assumption). But the outcomes of brick and mortar schooling include not only the measured academic outcomes, but any and all outcomes derived from the total expenses on brick and mortar schooling (those used in the study), including outcomes of athletic and arts participation, physical education, etc. If the range of outcomes covered by brick and mortar schooling is broader, that should be taken into account in this type of analysis. That is, if brick and mortar schooling is providing more than just the core academic programs – including sports, clubs, arts, phys ed – and online services are not, the analysis should either add these costs to the online service costs (what these things would cost if privately supplemented) or subtract them from the brick and mortar cost. Otherwise this is a rather pointless apples to five course meal comparison (unless we also throw in a utility analysis and assume all of that other stuff to have zero utility… a suspect assumption).
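The needed scope adjustment is simple arithmetic; here's an illustration (all dollar figures invented for the purpose) of how much it can change the comparison:

```python
# Illustrative only: invented per-pupil figures showing how adjusting for
# scope of services changes a brick-and-mortar vs. online comparison.
brick_total = 10_000        # full scope: academics plus sports, arts, phys ed, clubs
brick_non_academic = 2_000  # portion spent on the non-academic scope
online_academic = 6_500     # academics-only online program

# Compare like with like: strip the extra scope from brick and mortar...
print(brick_total - brick_non_academic, "vs.", online_academic)  # 8000 vs. 6500
# ...or add what families would pay to replace it to the online figure.
print(brick_total, "vs.", online_academic + brick_non_academic)  # 10000 vs. 8500
```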

One might argue… so what's the big deal? The kid goes to school in the kitchen of their house, and the parent is simply in the next room working from home, as opposed to the child being in a brick and mortar school for the day. Well, even that's not a $0 expense endeavor. To nitpick, it's likely that the increased monitoring role of the parent in this case would reduce the parent's work productivity to some extent – an opportunity cost. The opportunity costs become potentially much larger if the parent's productivity depends on not being at home, but they can no longer be away from home. Then there's the marginal increase in utilities associated with having the child at home and online, and potential increased food expense (a little hard to judge). Additional computer hardware, etc. This kind of "little" stuff adds up across large numbers of kids.

I do not see anywhere in this study (on a quick glance), or in the post above, any discussion of the varied amount of expense (portion of cost) that would be passed along to someone else (parents) under each model in order to achieve the same outcomes. This has to be accounted for in order to have a thoughtful conversation on public policy implications. In other words, the present study does little to advance thoughtful conversation on the public policy implications of online and blended learning models. But with some additional work, perhaps it might.

It may not be feasible to construct a full tally of all of the “costs” passed along to someone else under each model, but it’s at least worth listing out what some/many of those things might be and the likely range of costs being passed along.

It may still be reasonable to make the argument that government expense can be reduced, but it’s not necessarily a reduction in the cost of the service, but rather a transfer of responsibility for covering that cost. It may be… though I’m not entirely sure… that the total cost is also reduced. But taking that next step in the analysis also involves evaluating the full costs of inputs and full range of outcomes achieved.

Spending less to get less doesn’t reduce costs. It reduces only expenditures and that distinction is important.

Fire first, ask questions later? Comments on Recent Teacher Effectiveness Studies

Please also see follow-up discussion here: https://schoolfinance101.wordpress.com/2012/01/19/follow-up-on-fire-first-ask-questions-later/

Yesterday was a big day for big new studies on teacher evaluation. First, there was the New York Times report on the new study by Chetty, Friedman and Rockoff. Second, there was the release of the second part of the Gates Foundation’s Measures of Effective Teaching project.

There's still much to digest. But here's my first shot, based on first impressions of these two studies (with very little attention to the Gates study).

The second – the Gates MET study – didn't have a whole lot of punchline to it, but rather spent a great deal of time exploring alternative approaches to teacher evaluation and the correlations of those approaches with a) each other and b) measured student outcome gains. The headline that emerged from that study, in the Washington Post and in brief radio blurbs, was that teachers ought to be evaluated by multiple methods and should certainly be evaluated more than once a year or every few years with a single observation. That's certainly a reasonable headline and a reasonable set of assertions. Though, in reality, after reading the full study, I'm not convinced that the study validates the usefulness of the alternative evaluation methods, other than that they are marginally correlated with one another and to some extent with student achievement gains, or that the study tells us much if anything about what schools should do with the evaluation information to improve instruction and teaching effectiveness. I have a few (really just one for now) nitpicky concerns regarding the presentation of this study, which I will address at the end of this post.

The BIG STUDY of the day… with BIG findings… at least in terms of news headline fodder, was the Chetty, Friedman & Rockoff (CFR) study. For this study, the authors compile a massive freakin' data set for tech-data-statistics geeks to salivate over. The authors used data back to the early 1990s on children in a large urban school district, including a subset of children for whom the authors could gather annual testing data on math and language arts assessments. Yes, the tests changed at different points between 1991 and 2009, and the authors attempt to deal with this by standardizing yearly scores (likely a partial fix at best). The authors use these data to retrospectively estimate value-added scores for those (limited) cases where teachers could be matched to intact classrooms of kids (this would seem to be a relatively small share of teachers in the early years of the data, increasing over time… but still limited to grades 3 to 8 math & language arts). Some available measures of student characteristics also varied over time. The authors take care to include in their value-added model the full extent of available student characteristics (but remove some later) and also include classroom level factors to try to tease out teacher effects. Those who've read my previous posts understand that this is important, though quite likely insufficient!
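For those unfamiliar with the mechanics, a bare-bones value-added setup looks something like the sketch below. This is a simplification with hypothetical variable names, not CFR's actual specification, which also handles the changing tests, multiple lagged scores, and shrinkage of the teacher estimates.

```python
# Bare-bones value-added sketch: regress standardized current scores on
# prior scores, student characteristics, and classroom-level factors, then
# average residuals by teacher. Hypothetical names; CFR's model is richer.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("student_test_records.csv")

fit = smf.ols(
    "score_std ~ lagged_score_std + free_lunch + lep + special_ed"
    " + class_mean_lagged_score + class_size + C(grade) + C(year)",
    data=df,
).fit()

# A teacher's raw "value-added" is the mean residual across his or her students.
df["residual"] = fit.resid
teacher_va = df.groupby("teacher_id")["residual"].mean()
```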

The next big step the authors take is to use IRS tax record data of various types and link it to the student data. IRS data are used to identify earnings, to identify numbers and timing of dependent children (e.g. did an individual 20 years of age claim a 4 year old dependent?) and to identify college enrollment. Let's be clear what these measures are, though. The authors use reported earnings data for individuals in years following when they would have likely completed college (excluding incomes over $100k). The authors determine college attendance from tax records (actually from records filed by colleges/universities) on whether individuals paid tuition or received scholarships. This is a proxy measure – not a direct one. The authors use data on reported dependents & the birth date of the female reporting those dependents to create a proxy for whether the female gave birth as a teenager.[1] Again, a proxy, not a direct measure. More later on this one.
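Mechanically, that proxy amounts to something like the following sketch (hypothetical column names, for illustration only; see footnote [1] for the authors' own description):

```python
# Sketch of the teen-birth proxy: flag women who ever claim a dependent born
# while the filer was between ages 13 and 19 as of 12/31 of the birth year.
# Hypothetical column names, not the authors' actual code.
import pandas as pd

claims = pd.read_csv("dependent_claims.csv", parse_dates=["filer_dob", "dependent_dob"])

# Age as of Dec. 31 of the year the dependent was born is just the year difference.
age_at_birth = claims["dependent_dob"].dt.year - claims["filer_dob"].dt.year
claims["teen_birth"] = claims["is_female"] & age_at_birth.between(13, 19)

# One flag per filer: did she ever claim a dependent born while in her teens?
teen_birth_proxy = claims.groupby("filer_id")["teen_birth"].any()
```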

Tax data are also used to identify parent characteristics. All of these tax data are matched to student data by applying a thoroughly-documented algorithm based on names, birth dates, etc. to match the IRS filing records to school records (see their Appendix A).

And in the end, after 1) constructing this massive data set[2], 2) retrospectively estimating value-added scores for teachers and 3) determining the extent to which these value added scores are related to other stuff, the authors find…. well… that they are.

The authors find that teacher value added scores in their historical data set vary. No surprise. And they find that those variations are correlated to some extent with “other stuff” including income later in life and having reported dependents for females at a young age. There’s plenty more.

These are interesting findings. It's a really cool academic study. It's a freakin' amazing data set! But these findings cannot be immediately translated into what the headlines have suggested – that immediate use of value-added metrics to reshape the teacher workforce can lift the economy and increase wages across the board! The headlines and media spin have been dreadfully overstated and deceptive. Other headlines and editorial commentary have been simply ignorant and irresponsible. (No, Mr. Moran, this one study did not, does not, and cannot negate the vast array of concerns that have been raised about using value-added estimates as blunt, heavily weighted instruments in personnel policy in school systems.)

My 2 Big Points

First and perhaps most importantly, just because teacher VA scores in a massive data set show variance does not mean that we can identify, with any level of precision or accuracy, which individual teachers (plucking single points from a massive scatterplot) are "good" and which are "bad." Therein exists one of the major fallacies of moving from large scale econometric analysis to micro level human resource management.

Second, much of the spin has been on the implications of this study for immediate personnel actions. Here, two of the authors of the study bear some responsibility for feeding the media misguided interpretations. As one of the study’s authors noted:

“The message is to fire people sooner rather than later,” Professor Friedman said. (NY Times)

This statement is not justified by what this study actually tested/evaluated and ultimately found. Why? Because this study did not test whether adopting a sweeping policy of statistically based "teacher deselection" would actually lead to increased likelihood of students going to college (a half of one percent increase) or increased lifelong earnings. Rather, this study showed retrospectively that students who happened to be in classrooms that gained more seemed to have a slightly higher likelihood of going to college and slightly higher annual earnings. From that finding, the authors extrapolate that if we were to simply replace bad teachers with average ones, the lifetime earnings of a classroom full of students would increase by $266k in 2010 dollars. This extrapolation may inform policy or future research, but should not be viewed as an absolute determinant of best immediate policy action.

This statement is equally unjustified:

Professor Chetty acknowledged, “Of course there are going to be mistakes — teachers who get fired who do not deserve to get fired.” But he said that using value-added scores would lead to fewer mistakes, not more. (NY Times)

It is unjustified because the measurement of "fewer mistakes" is not compared against a legitimate, established counterfactual – an actual alternative policy. Fewer mistakes than by what method? Is Chetty arguing that if you measure teacher performance by value-added and then dismiss on the basis of low value-added, you will have selected on the basis of value-added? Really? No kidding! That is, you will have dumped more low value-added teachers than you would have if you had randomly dumped teachers (since you selected on that basis)? That's not a particularly useful insight if the value-added measures weren't a good indicator of true teacher effectiveness to begin with. And we don't know, from this study, whether other measures of teacher effectiveness might have been equally correlated with reduced teen pregnancy, college attendance or earnings.

These two quotes by authors of the study were unnecessary and inappropriate. Perhaps it's just how the NYT spun it… or simply what the reporter latched on to. I've been there. But these quotes, in my view, undermine a study that has a lot of interesting stuff and cool data embedded within.

These quotes are unfortunately illustrative of the most egregiously simpleminded, technocratic, dehumanizing and disturbing thinking about how to “fix” teacher quality.

Laundry list of other stuff…

Now on to my laundry list of what this new study adds and what it doesn't add to what we presently know about the usefulness of value-added measures for guiding personnel policies in education systems. In other words: which, if any, of my previous concerns are resolved by these new findings?

Issue #1: Isolating Teacher Effect from “other” classroom effects (removing “bias”)

The authors do provide some additional useful tests for determining the extent to which bias resulting from the non-random sorting of kids across classrooms might affect teacher ratings. In my view, the most compelling additional test involves evaluating the value-added changes that result from teacher moves across classrooms and schools. The authors also take advantage of their linked economic data on parents from tax returns to check for bias. And in their data set, comparing the results of these tests with other tests that involve using lagged scores (Rothstein's falsification test), the authors appear to find some evidence of bias, but, in their view, not enough to compromise the teacher ratings. I'm not yet fully convinced, but I've got a lot more digging to do. (I find Figure 3, p. 63 quite interesting.)

But more importantly, this finding is limited to the data and underlying assessments used by these authors in this analysis in whatever school system was used for the analysis. To their credit, the authors provide not only guidance, but great detail (and share their Stata code) for others to replicate their bias checks on other value added models/results in other contexts.

All of this stuff about bias is really about isolating the teacher effect from the classroom effect, and doing so by linking teachers (a classroom level variable) to student assessment data with all of the underlying issues of those data (test scaling, equating a move from x to x+10 on one test to the same move on another, and equating a move in one region of the scale to the same move in another region of the scale on the same test).

Howard Wainer explains the heroic assumptions necessary to assert a causal effect of teachers on student assessment gains here: http://www.njspotlight.com/ets_video2/

When it comes to linking the teacher value-added estimates to lifelong outcomes like student earnings, or teen pregnancy, the inability to fully isolate teacher effect from classroom effect could mean that this study shows little more than the fact that students clustered in classrooms which do well over time eventually end up less likely to have dependents while in their teens, more likely to go to college (.5%) and earn a few more dollars per week.[3]

These are (or may be) shockingly unsurprising findings.

Issue #2. Small Share of Teachers that Can Be Rated

This study does nothing to address the fact that relatively small shares of teachers can be assigned value-added scores. This study, like others, merely uses what it can – those teachers in grades 3 to 8 who can be attached to student test scores in math and language arts. More here.

Issue #3: Policy implications/spin from media assume an endless supply of better teachers?

This study, like others, makes assertions about how great it would all turn out – how many fewer teen girls would get pregnant, how much more money everyone would earn – if we could simply replace all of those bad teachers with average ones, or average ones with really good ones. But, as I noted above, these assertions are all contingent on an endless supply of "better" teachers standing in line to take those jobs. And this assertion is contingent upon there being no adverse effect on teacher supply quality if we were to all of a sudden implement mass deselection policies. The authors did not, nor can they in this analysis, address these complexities. I discuss deselection arguments in more detail in this previous post.

A few final comments on Exaggerations/Manipulations/Clarifications

I’ll close with a few things I found particularly annoying:

  • Use of super-multiplicative-aggregation to achieve a number that seems really, really freakin’ important (like it could save the economy!).

One of the big quotes in the New York Times article is that "Replacing a poor teacher with an average one would raise a single classroom's lifetime earnings by about $266,000, the economists estimate." This comes straight from the research paper. BUT… let's break that down. It's a whole classroom of kids. Let's say… for rounding purposes, 26.6 kids, if this is a large urban district like NYC. Let's say we're talking about earnings careers from age 25 to 65, or about 40 years. So, 266,000/26.6 = $10,000 in additional lifetime earnings per individual. Hmmm… no longer catchy headline stuff. Now, per year? 10,000/40 = 250. Yep, about $250 per year (in constant 2010 [I believe] dollars, which does mean a higher nominal total over time, as the value of the dollar declines with inflation). And that is about what the NYT graph shows: http://www.nytimes.com/interactive/2012/01/06/us/benefits-of-good-teachers.html?ref=education
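For anyone who wants to check that back-of-the-envelope arithmetic:

```python
# Breaking down the $266,000-per-classroom lifetime earnings figure.
classroom_gain = 266_000  # lifetime gain per classroom, 2010 dollars
class_size = 26.6         # assumed number of kids per classroom
career_years = 40         # roughly ages 25 to 65

per_student = classroom_gain / class_size  # 10,000 per student over a lifetime
per_year = per_student / career_years      # about 250 per student per year
print(per_student, per_year)               # 10000.0 250.0
```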

  • The super-elastic, super-extra-stretchy Y axis

Yeah… the NYT graph shows an increase in annual income from about $20,750 to $21,000. But they do the usual news reporting trick of having the Y axis run only from $20,250 to $21,250… so the $250 increase looks like a big jump upward. That said, the authors' own Figure 6 in the working paper does much the same!

  • Discussion/presentation of “proxy” measure as true measure (by way of convenient language use)

Many have pounced on the finding that having higher value added teachers reduces teen pregnancy and many have asked – okay… how did they get the data to show that? I explained above that they used a proxy measure based on the age of the female filer and the existence of dependents. It’s a proxy and likely an imperfect one. But pretty clever. That said, in my view I’d rather that the authors say throughout “reported dependents at a young age” (or specific age) rather than “teen pregnancy.” While clever, and likely useful, it seems a bit of a stretch, and more accurate language would avoid the confusion. But again, that doesn’t generate headlines.

  • Gates study gaming of stability correlations

I've spent my time here on the CFR paper and pretty much ignored the Gates study. It didn't have those really catchy findings or big headlines. And that's actually a good thing. I did find one thing in the Gates study that irked me (I may find more on further reading). In a section starting on page 39, the report acknowledges that a common concern about using value-added models to rate teachers is the year-to-year volatility of the effectiveness ratings. That volatility is often displayed with correlations between teachers' scores in one year and the same teachers' scores the next year, or across different sections of classes in the same year. Typically these correlations have fallen between .15 and .5 (.2 and .48 in the previous MET study). These low correlations mean that it's hard to pin down, from year to year, who really is a high or low value-added teacher. The previous MET report made a big deal of identifying the "persistent effect" of teachers, an attempt to ignore the noise (something which in practical terms can't be ignored), and they were called out by Jesse Rothstein in this critique: http://nepc.colorado.edu/thinktank/review-learning-about-teaching

The current report doesn't focus as much on the value-added metrics, but this one section goes to yet another length to boost the correlation and argue that value-added metrics are more stable and useful than they likely are. In this case, the authors propose that instead of looking at the year-to-year correlations between these annually noisy measures, we should correlate any given year with the teacher's career-long average, where that average is supposedly a better representation of "true" effectiveness. But this is not an apples-to-apples comparison with the previous correlations, and it is not a measure of "stability." This is merely a statistical attempt to make one measure in the correlation more stable (not actually more "true," just less noisy, by aggregating and averaging over time) and thereby inflate the correlation to make it seem more meaningful/useful. Don't bother! For teachers with a relatively short track record in a given school, grade level and specific assignment, and for schools with many such teachers, this statistical twist has little practical application, especially in the context of annual teacher evaluation and personnel decisions.
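A quick simulation shows why correlating one noisy year with a career-long average mechanically inflates the "stability" correlation. This is my own illustration with made-up parameters, not the MET data:

```python
# Simulate annual teacher scores as a stable true effect plus yearly noise.
# Correlating year 1 with year 2 gives the familiar low "stability" numbers;
# correlating year 1 with the multi-year average (which contains year 1 and
# has most of its noise averaged away) looks much higher, even though no
# teacher's scores actually became more stable.
import numpy as np

rng = np.random.default_rng(0)
n_teachers, n_years = 10_000, 8
true_effect = rng.normal(0.0, 1.0, n_teachers)
scores = true_effect[:, None] + rng.normal(0.0, 1.5, (n_teachers, n_years))

year_to_year = np.corrcoef(scores[:, 0], scores[:, 1])[0, 1]
year_to_avg = np.corrcoef(scores[:, 0], scores.mean(axis=1))[0, 1]
print(f"year-to-year r = {year_to_year:.2f}")           # roughly 0.3
print(f"year-to-career-average r = {year_to_avg:.2f}")  # roughly 0.6
```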


[1] “We first identify all women who claim a dependent when filing their taxes at any point before the end of the sample in tax year 2010. We observe dates of birth and death for all dependents and tax filers until the end of 2010 as recorded by the Social Security Administration. We use this information to identify women who ever claim a dependent who was born while the mother was a teenager (between the ages of 13 and 19 as of 12/31 the year the child was born).”

[2] "There are 974,686 unique students in our analysis dataset; on average, each student has 6.14 subject-school year observations."

[3] Note that the authors actually remove their student level demographic characteristics in the value-added model in which they associate teacher effect with student earnings. The authors note: "When estimating the impacts of teacher VA on adult outcomes using (9), we omit the student-level controls Xigt." (p. 22) Tables in the appendices do suggest that these student level covariates may not have made much difference. But this may be evidence that the student level covariates themselves were too blunt to capture real variation across students.

6 Things I’m Still Waiting for in 2012 (and likely will be for some time!)

I start this new year with reflections on some unfinished business from 2011 – here are a few bits of information I anxiously await in 2012. Some are likely within reach. Others, well, not so much.

  1. A thoroughly documented (rigorously vetted) study by Harvard economist Roland Fryer, which actually identifies and breaks out in sufficient detail (& with appropriate rigor & thorough documentation) the costs of delivering, in whole and in part (and the costs of bringing to scale), "no excuses" curricula/models/strategies and comprehensive wrap-around services.
  2. The long since promised rigorous New Jersey charter school evaluation – or even better – improved student level data in New Jersey such that researchers can actually conduct reasonable analyses of charter schooling and reforms/strategies more generally across New Jersey public & charter schools.
  3. That long list of all of those other average to below average paying professions – professions other than teaching – where compensation is entirely merit based and based substantially on (noisy) multiple regression estimates of employee effectiveness determined by the behavior of children as young as 8 years old [generously assuming 3rd grade test scores to represent the lower end of the value-added grade range],  AND where the top college graduates just can’t wait to sign up!
  4. That long list of highly successful market-based charter and/or independent private schools – schools not bound by the shackles of union negotiated agreements – where teacher compensation is not strongly predicted by (or directly a function of) experience and/or academic credentials,  AND where the top college graduates just can’t wait to sign up (or stick around)! (see also: https://schoolfinance101.wordpress.com/2010/10/09/the-research-question-that-wasn%E2%80%99t-asked/)
  5. Evidence that there really is enough money tied up in (wasted on) cheerleading and ceramics to be reallocated to provide sufficient class size reduction in core content areas and increased classroom teacher wages (toward improving teacher quality) to make substantive improvements to the quality of high poverty schools!
  6. Evidence that  the differences in student outcomes between high performing affluent suburban public school districts and lower performing poor urban and inner urban fringe school districts are somehow explained by substantial differences in personnel policies, merit-based teacher compensation, teacher benefits and negotiated agreements as opposed to substantive differences in family backgrounds and available resources.

For elaboration on a few of these issues, see my recent AP interview with Geoff Mulvihill: http://www.mycentraljersey.com/article/20120101/NJNEWS10/301010003

And so the new year of education policy research and blogging begins. A year in which I, myself, will be engaged in additional, more extensive analyses of the finances of charter schools, including revenue raising and expenditure patterns by location and by network affiliation. A year in which I also expect to be digging deeper into the distribution and effects of cuts in state aid and funding constraints on school and district resource allocation, and exploring across multiple states (and districts and schools within states) the causes and consequences of inequities and inadequacies in public education funding.