Dealing with the Devil? Policy Research in a Partisan World

This note is in response to James O’Keefe’s attempt to discredit me on his Project Veritas web site (though I think his point was intended to be larger than this). I was lucky (?) enough to be part of one of his investigative setups earlier this fall. I wrote and held on to this post and all related e-mails.

His scheme was uncovered in this Huffington Post piece to which he refers in his most recent report:

http://www.huffingtonpost.com/mobileweb/2011/10/17/james-okeefe-economic-policy-institute_n_1015845.html

The story:

Back in September, I was contacted by this fictional Peter Harmon who characterized himself as working for the Ohio Education Association, but never made it absolutely clear that he was working for the state teachers’ union of Ohio. In my case, unlike the EPI case, Harmon didn’t (I don’t recall) indicate being a hedge fund guy or being backed by one, but rather that he had “funders.” He dropped me a phone message and an email which were pretty innocuous, so I agreed to talk by phone. That’s where I pick up in this string of e-mails:

===================================

EMAIL #2 – PHONE CALL SET UP

From: peter.harmon@ohioedassoc.org

Sent: Monday, September 19, 2011 10:14 PM

To: bruce.baker@gse.rutgers.edu;

gse.rutgers.edu/bruce_baker@ohioedassoc.org

Subject: Meeting

Dr. Baker,

Thank you for getting back to me.  We are eager to talk with you about this project. Would 3pm tomorrow work alright for you?

Sincerely,

Peter Harmon

614-468-3941

===================================

Then there was the strange phone call (which I’m quite sure in retrospect was recorded) where first, “Peter Harmon” wanted me to do a study showing that the collective bargaining legislation in Ohio would hurt children, to which I suggested that a) evaluating collective bargaining legislation is outside the realm of my expertise and b) that even if I agreed that it might, I’d have no clear, defensible way to analyze and argue that point.

From there I suggested things that I can and often do analyze and argue, in each case pointing out that the ability to make such an argument is contingent upon data to support that argument. For example, evaluating the competitiveness of teacher wages over time, or evaluating the distribution of state aid cuts. These are two issues on which I have already actually evaluated Ohio data. I pointed out that there are 3 basic types of products we might be talking about – a) critiques of policy reports or arguments by others (for a few thousand dollars), b) policy briefs/research brief reports (typically about ten thousand dollars) or c) a full-scale research report (thirty to fifty thousand dollars, with clarification that projects of this magnitude would have to go through RU and/or be done over the summer). I attempted repeatedly to shift his focus to answerable questions and topics within my expertise, and to topics or issues where I felt I could be helpful to him, on the assumption that he was advocating for the state teachers’ union.

It got strange when Peter Harmon laid down his requirement that if they were going to fund a study, they didn’t want it coming out finding the opposite of what they wanted. I did explain that if he had a topic he was interested in, that I would be willing to explore the data to see if the data actually support his position on the issue and that I would do so before agreeing to write a report for him. The phone call ended with no clear agreement on anything, including no agreement on even what the topic of interest was.  In fact, my main point was repeatedly that he needed to figure out what the heck he even wanted to study, though I tried to keep it friendly and supportive. No reason to argue on a first phone call.

It was a strange and disturbing conversation, but I played along until I could get off the phone with the guy. Note that the playing along in a conversation like this also involves trying to figure out what the heck is up with the caller – whether he/she has a particular axe to grind – or other issues that would make any working relationship, well, not work out.

Sadly, as twisted as this phone call was, I’ve had similarly twisted conversations with real representatives of legitimate organizations. However, with most legitimate organizations, you can later identify the less sleazy contact person. My approach has generally been to humor them while on the phone… perhaps probe as to see how twisted they really are… and when the phone conversation ends….let it pass. Move on.

Then came the follow up:

===================================

EMAIL #3 – HARMON FOLLOW-UP

From: peter.harmon@ohioedassoc.org [mailto:peter.harmon@ohioedassoc.org]

Sent: Friday, September 23, 2011 10:01 AM

To: bruce.baker@gse.rutgers.edu; gse.rutgers.edu/bruce_baker@ohioedassoc.org

Subject: Next Meeting

Dr. Baker,

I have good news, my colleagues are very interested in moving forward.

We are confident we can cover the expense of this potential study.

We have a few ideas we would like to run by you for this project.

When would be a good time to call you next?

Regards,

Peter Harmon

614-468-3941

===================================

So now, Harmon is basically suggesting that he can generate the $30 to $50k figure which I had given him for a bigger study – a figure I had offered mainly to encourage him to think about doing something else, like contracting a few short policy briefs or critiques. But he still has no idea what he supposedly wants me to write about. Quite honestly, that’s really strange. So my response is simple – essentially, get your act together and don’t bother me again until you do. In other words: here are a few examples of the work I do and am proud of. Figure out your damn question and let me know when you do.

===================================

EMAIL #4 – BAKER REPLY

From: Bruce Baker [bruce.baker@gse.rutgers.edu]

Sent: Friday, September 23, 2011 10:06 AM

To: ‘peter.harmon@ohioedassoc.org’

Subject: RE: Next Meeting

Rather busy for next week or so. Would prefer if you could at least send an outline of potential topics & research questions of interest, so I can mull them over.

For examples of reviews/critiques of policy reports, see:

http://nepc.colorado.edu/thinktank/review-middle-class

http://nepc.colorado.edu/thinktank/review-spend-smart

For an example of a policy brief/research report, see:

http://nepc.colorado.edu/publication/NYC-charter-disparities

http://nepc.colorado.edu/publication/private-schooling-US

Thanks.

Bruce Baker

===================================

Here’s Harmon’s attempt at figuring out his question:

===================================

EMAIL #5 – HARMON REPLY

Dr. Baker,

Thanks for getting back to us.

Once of the topics we want to pursue is research regarding spending.

Specifically and increase in spending having a good effect on children. If you need to limit the scope of your research to a specific county, district or other local geographic area. that’s OK.

I will take a closer look at the examples you sent on your last email to get a better idea of what you would like from our end.  But,I hope this more specific goal better illustrates what we are looking for.

Let me know when would be good time to call, so I can clarify whatever questions you have about this.

Peter Harmon

614-468-3941

===================================

So, Peter Harmon wants me to explain, or more strangely to show that increasing spending is good for children. Okay. Anyone even modestly informed would know that’s an odd way to frame the question or issue. But clearly, given my body of work, I have argued on many occasions in writing and in court that having more funding available to schools can improve school quality, which is something I would certainly argue is good for children. Would I somehow use data on a specific district or county to do this? No…. uh… not sure? I’d probably start with an extensive review of what we already know from existing research on money and school quality.

At this point, I’m ready to drop the whole discussion, but receive an e-mail notice of a new Economic Policy Institute paper on public employee wages in Ohio. So, to save Mr. Harmon money paying for a new study on this topic, I a) send him a link to that study, and b) explain that I’m already working on a paper related to his issues of concern.

=================================== 

 EMAIL #6 – BAKER REPLY

From: Bruce Baker [bruce.baker@gse.rutgers.edu]

Sent: Thursday, October 06, 2011 10:44 AM

To: ‘peter.harmon@ohioedassoc.org’

Subject: FYI

From one of my Rutgers colleagues:

Briefing_Paper_329.pdf

Working on some related projects myself, which may be of use to you in near future. Will be back in touch as schedule frees up.

Bruce

===================================

And so it ended. And as I suspected by this point, it appears that this whole thing was a sham… and an attempt at a sting. Interestingly, this appears to be when Harmon moved on to go after EPI.

Quite honestly, O’Keefe’s concept for the investigation isn’t entirely unreasonable, except that he and his colleagues didn’t seem to fully understand the fundamental difference between research projects per se and policy analyses – between writing summaries and opinions based on data that already exist and research that’s already been done, versus exploring uncharted territory, where the data do not yet exist and the answers cannot yet be known.

At this point, I think a few clarifications are in order about doing policy research, or more specifically writing policy briefs in a highly political context.

First, why would I ever vet the data on an issue before signing on to do work for someone? Well, this is actually common, or should be in certain cases. For example, let’s say the funder wants me to show that “teachers in Ohio are underpaid.” I don’t know that to be true. I’m not going to take his money to study an issue where he has a foregone conclusion and a political interest in that conclusion but where the data simply don’t support that conclusion. It is relatively straightforward for me to check whether the data support the conclusion before I agree to write anything about it. This is an easy one to check. There are a standard set of databases to use, including statewide personnel data, census data and Bureau of Labor Statistics data, and there are standard credible methods for comparing teacher wages. If the argument holds up under the most conservative analysis – the analysis most deferential to the “other side” of the argument – then it’s worth discussing how to present it or whether to move forward.
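For readers curious what this kind of pre-contract vetting looks like in practice, here is a minimal sketch in Python, assuming a hypothetical IPUMS-CPS extract of Ohio workers. The file name, column names and occupation codes are my illustrative assumptions, not a prescribed recipe:

    import pandas as pd

    # Hypothetical IPUMS-CPS extract of Ohio workers; column names are assumptions
    df = pd.read_csv("ohio_cps_extract.csv")  # occ, educ, age, sex, wkswork, uhrswork, incwage

    # Restrict to full-year, college-educated workers of typical working age
    ft = df[(df.wkswork >= 50) & (df.educ >= 16) & df.age.between(25, 60)].copy()
    ft["weekly_wage"] = ft.incwage / ft.wkswork

    # Census occupation codes for K-12 teachers (verify against the extract's codebook)
    ft["teacher"] = ft.occ.isin({2310, 2320})

    # First pass: do teachers earn less than comparably educated nonteachers?
    print(ft.groupby("teacher").weekly_wage.describe())

If even this crude first pass runs against the funder’s preferred conclusion, that’s a strong signal to decline before any report is promised.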

A different type of example which I’ve learned by experience is that it’s always worth taking a look at the data before engaging as an expert witness on a school funding related case. I often get asked to serve as an expert witness to testify about inequities or inadequacies of funding under state school finance systems. Sometimes, attorneys have already decided what their argument is based only on the complaints of their clients. It would be utterly foolish of me to sign on to represent those clients and accept payment from them without first checking the data to see if they actually have a case.

Then there’s the issue of doing work for partisan clients to begin with. That’s a different question than doing work for sleazy clients. Sometimes a legitimate organization has a sleazy contact person, but further checking reveals that the organization as a whole is credible – and not sleazy. But back to the point…

Quite honestly, the toughest kind of policy analysis to do is for partisan clients – clients with an axe to grind or a strong interest in viewing an issue in one particular way. That is usually the case in litigation and increasingly the case when it comes to writing policy briefs on contentious topics. What this means is that the analyses have to be “bullet-proof.” There are a few key elements to making an analysis “bullet-proof.”

First, the analysis must be conservative in its estimates, and one must avoid at all costs overstating any claims favored by the client. In fact, the analysis needs to be deferential, perhaps even excessively so, to the opposing view.

Second, the analysis must use standard, credible methods that are well known, well understood and well documented by others. Examples in my field would include comparable wage analysis, or wage models which typically include a clearly defined set of variables.

Third, the analysis must rely on publicly accessible data, with preference for “official” data sources, such as state and federal government agencies. This is because the analyses should be easy for any reader to replicate by reading through my methods and downloading or requesting the data.
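To make the three elements concrete, here is a hedged sketch of what a “standard, credible” comparable-wage model might look like, building on the extract in the sketch above. The control set is illustrative; a genuinely conservative specification would be stress-tested against the strongest version of the opposing argument:

    import numpy as np
    import statsmodels.formula.api as smf

    ft["log_weekly_wage"] = np.log(ft.weekly_wage)

    # Wage model with a clearly defined set of variables: education, a quadratic
    # in age, sex and usual weekly hours. Controlling for hours is deliberately
    # deferential to the "teachers work fewer hours" counterargument.
    model = smf.ols(
        "log_weekly_wage ~ teacher + educ + age + I(age**2) + C(sex) + uhrswork",
        data=ft,
    ).fit(cov_type="HC1")

    # Approximate proportional wage differential for teachers
    print(model.params["teacher[T.True]"])

Because the data are public and the specification is standard, any reader can rerun this and argue with the result on the merits – which is the whole point of “bullet-proof.”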

So here are my final thoughts on this issue…

If this kind of stuff causes anyone to place greater scrutiny on my work or that of any others writing policy briefs on contentious topics, that’s fine. It’s not only fine, but desirable. I am fully confident that my work stands on its own. Unlike some, I don’t simply take a large commission to offer my opinion without ever having looked at any data. For example, Eric Hanushek of Stanford University took $50,000 from the State of Colorado to testify that more money wouldn’t help kids and that Colorado’s school funding system is just fine, without ever having looked at any data on Colorado’s school funding system. See:

http://www.edlawcenter.org/news/archives/school-funding/what-hanushek-shows-up-again.html?searched=hanushek&advsearch=oneword&highlight=ajaxSearch_highlight+ajaxSearch_highlight1

By contrast, I did indeed accept a payment of $15,000 for writing a nearly 100 page report filled with data and detailed analyses of Colorado’s school funding system raising serious questions about the equity and adequacy of that system (available on request). In fact, I had already come to the conclusions about the problems with Colorado’s school funding system long before I was engaged by the attorneys for the plaintiff districts (as one will find in many of my blog posts referring to Colorado).

My rule #1 is always to check the data first and to base my opinions on the data. So I welcome the scrutiny on my work and I especially welcome it directly. If you have a criticism of my work, write to me. The more scrutiny on my work the better.

=========

Note #1: for an example of the types of policy briefs and/or analyses to which I am referring here, see:  NY Aid Policy Brief_Fall2011_DRAFT6

In my view, this is a solid, rigorous and very defensible analysis. It is a policy brief. It uses numerous sources of publicly available data. And, it was written on behalf of an organization which has self-interested concerns with the NY school finance formula.

Note #2: Indeed there were some poor word choices on my part in the phone conversation. “Play with data” is how I tend to refer to digging in and vetting the data to see what’s there. This blog is dedicated to what I would refer to as playing with data.  Looking stuff up. Downloading large data files (IPUMS, NCES). Running statistical models. My friends and colleagues, as well as my students know full well that I take great joy in working with data and that I consider it play.  But I’ll admit that it sure doesn’t sound too good when taken out of that context.

Note #3: A few people have asked about the portion of the conversation where I suggest that if I find results that do not support the funders’ views, I will not charge them for the work. Some have suggested that this is an example of burying an undesirable result, which would in my view be unethical. So, what’s the point of not charging them? Actually, it’s so that the result won’t get buried. If I do a bunch of preliminary data analyses only to find that the data do not support a funder’s claims/preferences, I’d rather not write up the report for the funder and charge him/her, because they then own the report and its findings, and have the control to bury it if they so choose. Now, I typically don’t permit gag-order type clauses in my consulting contracts anyway, but, it’s much easier just to avoid the eventual pissing match over the findings and any pressure to recast them, which I will not do.  If I keep the results of my preliminary work for myself, then I have complete latitude to do with them as I see fit, regardless of the funder’s preferences. It’s my out clause. My freedom to convey the findings of any/all work I do.

I’ve come to this approach having had my results buried in the past on at least two occasions – one in particular where the funder clearly did not want the results published under their name, due in part to pending litigation in which they were a defendant. Much to my dismay, the project coordinators (the agency that subcontracted me) capitulated to the funder. I was, and remain to this day, deeply offended by the project coordinator’s choice, under pressure from the funder, to edit the report and exclude vital content. Yeah… I got paid for the work. But the work got buried, even though the work was highly relevant. I’m unwilling to go down that road again.

License to Experiment on Low Income & Minority Children?

John Mooney at NJ Spotlight provided a reasonable overview of the NJDOE waiver proposal to “reward” successful schools and sanction and/or take over “failing” ones.

The NJDOE waiver proposal includes an explanation of a new classification system for identifying which schools should be subject to state intervention, ultimately to be managed by regional offices throughout the state. This new targeted intervention system classifies schools in need of intervention as “priority” schools, with specific emphasis on “focus” schools. Mooney explains:

In all, 177 schools — known as Focus Schools — fell into this category, largely defined as the bottom 10 percent in terms of the achievement gaps between the highest- and lowest-performing student groups over three years.

http://www.njspotlight.com/stories/11/1117/0003/

The new system also has a reward program:

The same list also includes the schools that the state designates as Reward Schools, based on both their overall achievement and their progress. Reward Schools with high poverty concentrations will also be rewarded with cash: $100,000 each.

http://www.njspotlight.com/stories/11/1117/0003/

But, some significant questions persist as to whether the state is over-reaching its authority to intervene in the “focus” and priority schools. Here are a few comments from a related article:

“Consistent with state law, they can go in and direct districts to take particular actions,” said David Sciarra, director of the Education Law Center that has spearheaded the Abbott litigation. “All of that, they clearly have the authority to do.

“But nothing that I am aware of allows them to close existing schools,” he said. “And they have no power to withhold funds. That’s even outside the scope of the federal guidelines. ”

Paul Tractenberg, a Rutgers Law School professor and noted expert on education law, said he also questioned whether the application’s reform plans ran counter to the state’s current school-monitoring system, the Quality Single Accountability Continuum (QSAC).

“As a constitutional matter, it is pretty clear the commissioner has whatever power he needs to ensure a thorough and efficient education,” he said. “But that’s different than saying if there is a legislation out there, he can just ignore it.”

In terms of significant alterations such as reassigning staff or directing changes in collective bargaining, Tractenberg said, “there are all kinds of big-time issues about their legal authority to do that.”

http://www.njspotlight.com/stories/11/1117/2359/

Of course, a related twist here is just which schools are involved. NJDOE, like other state agencies, has adopted a set of performance metrics most likely to single out schools serving the largest shares of low-income and minority students for dramatic interventions – for school closure, or for major staffing disruptions (strategies with little track record of success).

Here’s the breakdown of which schools will be subject to closure, staff replacement or other intervention, versus those who will be left alone and those eligible for a check for $100,000.


When considering racial composition, poverty and geographic location (metro area) simultaneously as predictors of school classification:

  • A school that is approaching 100% free lunch is nearly 30 times more likely to be classified as a focus school (as opposed to all other categories including priority) than a school that is 0% free lunch.
  • A school that is approaching 100% free lunch is nearly 60 times more likely to be either a priority or focus school (compared to all other options) than a school that is 0% free lunch.

While the typical FOCUS school is 26% black, 39% Hispanic and 51% free lunch, the typical reward school is 7.2% black, 11.3% Hispanic and 10.3% free lunch.
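For the statistically inclined: odds ratios like those in the bullets above typically come from a logistic regression in which the demographic measures enter simultaneously. A minimal sketch, assuming a hypothetical DataFrame of NJ schools with the classification and composition columns named below:

    import numpy as np
    import statsmodels.formula.api as smf

    # schools: one row per school; category, pct_free_lunch (scaled 0-1), pct_black,
    # pct_hispanic and metro_area are assumed column names
    schools["focus"] = (schools.category == "Focus").astype(int)

    logit = smf.logit(
        "focus ~ pct_free_lunch + pct_black + pct_hispanic + C(metro_area)",
        data=schools,
    ).fit()

    # Odds ratio for moving from 0% to 100% free lunch, holding the rest constant
    print(np.exp(logit.params["pct_free_lunch"]))

The “priority or focus” figure comes from the same setup with the outcome redefined to include both categories.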

[note: several NJ schools had missing data in the 2009-10 NCES Common Core of Data which were merged with the NJDOE schools list http://www.njspotlight.com/assets/11/1116/2300. Total school enrollment data were most commonly missing, and where possible were replaced with the sum of racial subgroup data for calculating racial composition. Complete data were matched and available for 160 of the 177(9?) focus schools and 120 of the 138(?) reward schools. Thus, I am sufficiently confident that the above patterns will hold as remaining missing data are added.]
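The merge and enrollment fallback described in the note amount to a few lines of pandas; the file names, match key and column names below are again my assumptions about the CCD and NJDOE layouts:

    import pandas as pd

    ccd = pd.read_csv("ccd_2009_10_nj.csv")        # NCES Common Core of Data schools
    njdoe = pd.read_csv("njdoe_school_list.csv")   # focus/priority/reward list

    merged = njdoe.merge(ccd, on="nces_school_id", how="left")

    # Where total enrollment is missing, fall back on the sum of racial subgroup
    # counts (missing subgroups are treated as zero by the row sum)
    race_cols = ["white", "black", "hispanic", "asian", "amind"]
    merged["enrollment"] = merged["enrollment"].fillna(merged[race_cols].sum(axis=1))

    merged["pct_black"] = merged.black / merged.enrollment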

NJDOE will likely argue that they are intervening in these schools because poor and minority kids are the ones getting the worst education, which may in part be true. But causal attribution to the teachers and administrators in these schools and districts stands on really shaky ground – especially on the statistical basis provided by NJDOE.  The accountability framework chosen is merely identifying schools by the extent of the disadvantage of the students served and not by any legitimate measures of the quality of education being provided.

Further, and perhaps most disturbing, is that this policy framework, like those proposed and used elsewhere, is, in effect, a (self-granted) license for NJDOE to experiment on these children with unproven “reform” strategies which are as likely to do harm as to do good (that is, likely to do more harm than simply maintaining the status quo). Helen Ladd’s recent presidential address at the Association for Public Policy Analysis and Management provides exceptional insights in this regard!

Why we need those 15,000+ local governments?

Neal McCluskey at the Cato Institute makes a good point about our casual, imprecise use of the term “democracy” in the post linked here. I did not delve into this in my previous post, and more or less allowed the imprecise terminology to slip past. Clearly there are huge differences between simple majority rule through direct democracy and our constitutional republic with separation of powers, and I certainly favor the latter.

My original point was that Bowdon completely misrepresents not just a single judicial decision in Georgia, but the notion of the “will of the people” as expressed through our form of government, especially in Georgia and especially in this case. By Bowdon’s strange logic, the will of the people in Georgia is only expressed through the legislation adopted by elected state officials – the state legislature. Local elected officials apparently don’t count – and in Bowdon’s view, the choice of these local elected officials to challenge the constitutionality of state legislative action is somehow an attack on the will of the people. Further, the judicial mediation of this dispute – by an elected judiciary – is an extension of that attack on the will of the people?

Really, the big question, which goes back to Mike Petrilli’s post, is determining the right balance between centralized versus local control, as carried out by our elected officials at each level. Certainly the process of electing our officials at the local, state or federal level can become corrupted over time. Local elections can be corrupted (or at least become less expressive of the “will of the people”) by imbalanced influence (the will of some preferred more than others) on those elections, and so too can state and federal elections. It would seem that Petrilli’s core argument is that local elections are necessarily the most corrupt and most imbalanced because, as he sees it, local elections are entirely controlled – essentially owned – by teachers’ unions, whereas state and federal elections somehow remain more pure? Less influenced by imbalances of money/power? So, essentially, Mike’s argument is that we must negate the policy decision-making power of the most corrupted level of the system, which in his view is that of local elected officials. I find that a really hard argument to swallow.

Alternatively, one can argue in favor of centralization, as I used to (and still do on some occasions), that the higher levels of government should – by representing larger and more diverse constituencies and by having greater access to resources (including bigger budgets) – be able to accumulate better technical capacity to make more informed policy decisions. That is, to develop/design/adopt policies better grounded in technical analysis of what works. I’ve become increasingly cynical on this point of late, and quite honestly, I’m generally unwilling to see the overall power distribution shift more heavily from local to state, and especially to federal, policy decision making.

I still feel strongly that due to economic inequities in tax base and other measures of the collective fiscal capacity of communities to provide schools – many of which were induced by policies of housing segregation and discrimination – states must play a strong role in revenue redistribution in order to ensure that children, regardless of where they live, have access to equitable and adequate schooling. This is perhaps where my perspectives begin to diverge most dramatically from McCluskey’s preferred policy solutions (though we’ve not debated/discussed the particulars).

I still feel that state agencies can (in their better days), perhaps provide technical support to local schools and districts which are struggling, but I fear that state agencies (departments of education) have become increasingly politicized and instead of providing technical support, are now invariably promoting political agendas (perhaps I’m just waking up to something that’s been occurring all along?), and in many cases forcing ill-conceived politically motivated “reforms” on struggling districts and schools (rather than ensuring access to sufficient resources). See my previous post on pundits vs. practitioners.

So, at this stage in my life and career, I’m not willing to accede to the idea of eliminating entirely the role of local elected officials (or even unbalancing these roles further), as Mike Petrilli might wish. Nor do I accept that a reason for eliminating local elected officials from the mix is that local elections are most corrupted by money and uneven influence (of unions?). This seems merely an argument of convenience from the Petrillian standpoint that right now, he just happens to agree more with the policies of states – and the potential to influence federal policy in order to control states – than with the current push-back of locals. That’s a rather common perspective from inside the beltway (physically or mentally). It’s logistically easier for an organization like the Fordham Institute (which casts itself as providing research/technical guidance?) to have disproportionate impact on policy through a single locus of control – the federal gov’t – than through 15,000 local governments (that takes a lot of leg work). And that’s precisely why we need those 15,000+ local governments!

The Wrong Thinking about Measuring Costs & Efficiency in Higher Education (& how to fix it!)

There is a movement afoot to reduce the measurement of the value of public institutions of higher education to a simple ratio of the revenue brought in by full time faculty members divided by the salaries and benefits of those faculty members. That is, does each faculty member “pay” for him or herself, on an annual cash flow basis?[1]

Even some of the finest major public colleges and universities have recently succumbed to reporting such information, arguably, in an effort to appease politically motivated critics.[2] This seemingly simple ratio of the “net cost” of faculty salaries and benefits is presumed representative of the relative efficiency of higher education institutions and/or entire public systems of higher education.

This is a dreadfully oversimplified, if not simply wrongheaded, approach to measuring the cost of providing public higher education. It is also the wrong approach to characterizing the efficiency of production of higher education institutions or higher education systems, largely because it ignores entirely the question of what higher education institutions produce. More importantly, measuring institutional performance and efficiency in this way does little or nothing to inform policymakers or institutional leaders on how to get more bang for the buck from higher education. That is, how to generate greater economic benefit to the state or society as a whole by achieving more efficient production of an educated citizenry.

Arguably, the greatest economic (setting aside cultural and social) value-added of public higher education systems is achieved when those systems can efficiently transform high school graduates into college graduates, with all of the economic and societal benefits bestowed on them (at least in relative terms). This is especially true for high school graduates from low-income backgrounds, including first generation college students. Accepting an economic emphasis, public higher education institutions can and should substantially improve the economic outlook and lifelong earnings of students who otherwise have the least likelihood of college degree completion. Therein lies public higher education’s role in providing value added to the economy and to society as a whole.

As such, what we must begin to better understand is how colleges and universities can improve the efficiency with which they produce undergraduate (and graduate) degrees across a variety of fields, and for students of varied backgrounds. Further, we must establish metrics of cost and efficiency that promote the right incentives for faculty and institutions of higher education to improve degree production, especially for those students previously least likely to complete their undergraduate education in a timely and efficient manner. The current policy rhetoric and proposed metrics do little or nothing to advance these policy objectives.

Flawed Reasoning and Bad Incentives of the Net-Value Approach

Under the politically popular model of faculty “net value,” the basic underlying assumption is that higher education faculty are worth only as much as the sum of a) the grant funding they bring to the institution and b) the tuition revenue generated by the student credit hours they produce. It is then assumed that if the state-subsidized portion of the faculty member’s salary is greater than the sum of those two values, that faculty member is inefficient (or not worth it). Therefore, the incentives for any faculty member formally evaluated – or even informally characterized – by this model are either to track down enough external grant and contract funding to pay his or her own salary in full, or to teach enough sections of large classes and recruit enough students into those classes to cover salary and benefits. The same incentives apply to all faculty. But both are counterproductive incentives.
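The arithmetic of the net-value model is trivially simple, which is part of its political appeal. A sketch, with all figures invented for illustration:

    def faculty_net_value(grant_revenue, credit_hours, tuition_per_credit,
                          salary_and_benefits):
        """The 'net value' metric as described above: revenue attributed to a
        faculty member minus total compensation. A negative number is read,
        wrongly per the argument here, as 'not worth it'."""
        return grant_revenue + credit_hours * tuition_per_credit - salary_and_benefits

    # A faculty member teaching 300 student credit hours at $400 per credit,
    # with $20,000 in grants and $150,000 in salary and benefits
    print(faculty_net_value(20_000, 300, 400, 150_000))  # -10,000: "inefficient"

Nothing in that calculation asks whether any of those 300 credit hours moved a student closer to a degree.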

If the mission of public higher education is to produce an educated citizenry that contributes to the economy and society as a whole, as well as to be a direct engine of economic development through research and scholarly productivity, then having all faculty focus their efforts on chasing external funding to cover their costs, and on reducing or eliminating teaching from their responsibilities, is counterproductive. Second, producing credit hours and generating tuition may also operate at odds with helping college students progress most efficiently toward degree completion. Maximizing course enrollments generates tuition and credit hours, but may actually slow progress toward completion as more students get lost in the shuffle. It also reduces the incentive to provide lower-enrollment, higher-level courses that may improve completion rates.

The net-value metric is at best neutral to whether institutions try to move students forward toward completion, or allow them to flounder, repeat numerous (large enrollment) courses and never quite reach the end goal. That just doesn’t make sense, on many levels.

Finally, using this net-value metric forces the same incentive structure onto all faculty members uniformly, encouraging them to act as autonomous agents choosing one or the other approach to covering their margin.

Understanding the Role of Student Behaviors

How might we better think about productivity and efficiency in higher education? Again, consider that a primary goal is the efficient production of degreed or credentialed graduates. That is, taking high school completers and moving them efficiently through their coursework to degree completion, at which point they are likely to, at the very least, be a higher wage earner than they otherwise might have been, and in an even better light might be more likely to contribute more significantly to the economy and society as a whole.

Higher education institutions consist of a maze of pathways often navigated naively (or at least irregularly) by college students trying to find their way toward that light at the end of the tunnel. Evaluating the relative efficiency of higher education institutions requires that we better understand these student behaviors – student course-taking patterns – and figure out a) which behaviors seem to be more (and less) associated with successful degree completion and b) whether institutional constraints or supports make any difference. It is naïve, if not completely ignorant, to try to evaluate the productivity or efficiency of higher education systems and their economic contributions (or financial drain) without considering these student behaviors and how to influence them.

On the one hand, understanding student pathways helps us understand who is more likely to complete their degree in a timely manner. Further, for those critics of higher education who believe that too many students are pursuing (or at least completing) “useless” degrees in “unproductive” fields, it is important to understand how and why students migrate across degree programs through course selection behavior.

For example, let’s say that we believe society needs more electrical engineers than economists, a reasonable assertion indeed! (note the old adage that majoring in EE [electrical engineering] refers to “eventual economics”). Evaluation of course-taking behaviors may reveal that many EE majors become economics majors (without really wanting to) after performing poorly in specific lower level engineering courses, for a variety of reasons. It may be that these students would still have been great engineers and would have flourished in their higher level courses. But perhaps course delivery approaches (large lectures), lack of supports or other institutional barriers are partly at fault. Identifying these barriers and shifting institutional policies may lead to an increased production of electrical engineering completers (and most importantly a decrease in future economists).

Linking Student Behaviors to their Cost & Efficiency Implications

Building on understanding student pathways, we should shift our focus toward the way groups of faculty members and the sequences of courses (and degree programs) they provide lead to differences in the likelihood of degree completion, differences in time to completion and differences in the total costs of degree completion.  This is another area where higher education cost research has gone awry in the past. One cannot calculate the differences in costs of producing an economics versus an engineering major by simply looking at the costs of operating those departments. Departments are top down organizational units of universities. But students pursuing a degree in any one field take courses across many units. Instead, we can estimate the cost per credit hour for any one student taking any course in the university, and can then estimate the cumulative costs of common student pathways, and identify the higher and lower average and total cost pathways toward achieving any one degree.
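A sketch of the pathway-costing idea, assuming hypothetical transcript and course-cost files (every file and column name here is an assumption for illustration):

    import pandas as pd

    transcripts = pd.read_csv("transcripts.csv")    # student_id, course_id, credits,
                                                    # completed_degree, degree_field
    course_costs = pd.read_csv("course_costs.csv")  # course_id, cost_per_credit

    t = transcripts.merge(course_costs, on="course_id")
    t["course_cost"] = t.credits * t.cost_per_credit

    # Cumulative delivery cost of each student's pathway, repeats and detours included
    pathway_cost = t.groupby("student_id").course_cost.sum()

    # Average total cost per completed degree, by field: the quantity that matters,
    # rather than the unit cost of any single course or department
    students = transcripts.drop_duplicates("student_id").set_index("student_id")
    completers = students[students.completed_degree]
    print(pathway_cost.loc[completers.index].groupby(completers.degree_field).mean())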

Taking this approach, we might find, for example, that offering smaller class sizes (thus higher unit cost) in specific lower tier courses decreases the likelihood of repeating those courses and/or increases likelihood of successful completion of subsequent courses, leading to an overall more efficient pathway to degree completion.  But under the current model of evaluating the net cash value of faculty, the incentive works in the opposite direction by encouraging filling seats over completing degrees and programs.

We might find that offering additional supports for students from disadvantaged backgrounds (who attended high schools with weaker math and physical science programs) taking their lower level courses in engineering calculus leads to greater likelihood of timely degree completion in electrical engineering. Further, that doing so significantly decreases average cost to degree completion by decreasing course repeats.  Again, the current net-value approach creates the opposite incentive, favoring course repeats to beef up credit hour production in high enrollment lower level classes.

In reality, the unit costs of any single course, or net value of the faculty member delivering that course, matter far less than how that course more broadly influences the cost of degree completion overall.

Institutional and Public Policy Implications

For progress to be made in the current policy conversations around higher education costs and efficiency, we must improve our metrics and must link new metrics to a much deeper understanding of just how higher education systems work, the role of individual student behaviors and the complexity of the delivery systems and institutional structures designed to serve those students.

We must also be cognizant of the fact that higher education systems are not, as policy rhetoric often characterizes them, uniformly stagnant structures of ancient origin, all sharing a single woefully inefficient, exorbitantly costly and arcane governance and program delivery structure. Arguably, many elite institutions which best fit this caricature (elite private liberal arts colleges), while sustaining themselves with very high tuition, also achieve very high degree completion rates, albeit for the most advantaged high school graduates.

By contrast, in recent decades we have seen a dramatic proliferation of alternative delivery mechanisms, including rapid expansion of online and for profit higher education institutions. Further, many of these alternative delivery institutions have begun to disproportionately serve high school graduates with the least likelihood of timely (6 year or less) degree completion and have done so at substantial public expense through access to federal student loans. If evaluated on a net-value of faculty basis, these institutions likely look quite good. They must in order to achieve their desired financial bottom line. Yet, their financial bottom line (and in some cases stock value) comes at the taxpayer expense of high rates of loan default and societal and economic expenses of dismal rates of completion of meaningful degrees or credentials.

Getting higher education cost and efficiency measures right is critically important for informing the policy debate and for informing institutional practices. Getting these measures right means the difference between incentivizing non-productive course credit and financial debt accumulation versus incentivizing timely degree completion. When one group of students completes their degrees in a timely fashion, institutions have more resources available for the next wave. Finally, getting these measures right means the difference between a) having each and every faculty member in public institutions of higher education operate autonomously and inefficiently out of self-interest, often to the disadvantage of their students, or b) having faculty working collectively with colleagues and their institutions to improve degree production for the benefit of students, and the broader economy.

 

Professionals 2: Pundits 0! (The shifting roles of practitioners and state education agencies)

Professionals, Pundits and Evidence Based Decision Making

In Ed Schools housed within research universities, and in programs in educational leadership which are primarily charged with the training of school and district level leaders, we are constantly confronted with deliberations over how to balance teaching the “practical stuff” and “how to” information on running a school or school district, managing personnel, managing budgets, etc. etc. etc., and the “research stuff” like understanding how to interpret rigorous research in education and related social sciences (increasingly economic research).  Finding the right balance between theory, research and practice is an ongoing struggle and often the subject of bitter debate in professional programs housed in research universities.

Over the past year, I’ve actually become more supportive of the notion that our future school and district leaders really do need to know the research, understand statistics and other methods of inquiry and be able to determine how it all intersects with their daily practice, even when it seems like it couldn’t possibly do so.

Unfortunately, a major reason that it has become so important for school leaders to know their shit is because state agencies, including departments of education, which to some extent are supposed to be playing a “technical support role,” have drifted far more substantially toward political messaging than technical support, and have in many cases drifted toward driving their policy agendas with shoddy fact sheets, manifestos and other shallow, intellectually vacuous but “easy to digest” Think Tank fodder.

In many cases, this intellectually vacuous, technically bankrupt think tank fodder is actually being trotted out by state education agencies as technical guidance to local school administrators.

Punditry in NY State

SchoolFinanceForHighAchievement

commissioner-nyscoss-presentation-092611

nyssba2011

For example, consider these two graphs, which I’ve mentioned previously on this blog and which have now been repeatedly trotted out by New York State Education Commissioner John King in presentations to local school officials.

The first graph fabricates an argument that putting more funding into current practices in schools would necessarily be less efficient than putting more funding into either a) alternative compensation schemes which pay teachers based on performance (or at least not on experience and degree level) or b) tech-based solutions. While the latter is never even defined, neither has been shown to produce the efficiency gains the graph asserts.

Figure 1

The second graph basically argues that most money currently in schools is simply wasted because it’s allocated to portions of compensation that aren’t directly tied to performance. It is more or less an extension of the first graph, by a different author.

Figure 2

The latest version of the NYSED/King presentation also includes an exaggerated representation of what some refer to as the Three Great Teachers legend – the claim, based on estimates from a study in the 1990s, that having three great teachers in a row can close any/all achievement gaps. This is a seriously misguided overstatement/extrapolation from that one study.

Figure 3

To put it bluntly, these various materials compiled and presented by the New York State Education Department are, well, in most cases, not research at all, and in the one case, a gross misrepresentation of a single piece of research on a topic where there are numerous related sources available.

NY Professionals Respond (albeit not directly to the information above, but concurrent with it)

Thankfully, a very large group of principals on Long Island have been doing their reading, and have been making more legitimate attempts to understand and interpret research as it applies to their practice.

APPR_Position_Paper_10Nov11

The principals were primarily concerned with the requirement under new state policies that they begin using student assessment data as a substantial component of teacher evaluation. The principals raised their concerns as follows:

Concern #1: Educational research and researchers strongly caution against teacher evaluation approaches like New York Stateʼs APPR Legislation

A few days before the Regents approved the APPR regulations, ten prominent researchers of assessment, teaching and learning wrote an open letter that included some of the following concerns about using student test scores to evaluate educators1:

a) Value-added models (VAM) of teacher effectiveness do not produce stable ratings of teachers. For example, different statistical models (all based on reasonable assumptions) yield different effectiveness scores.2 Researchers have found that how a teacher is rated changes from class to class, from year to year, and even from test to test3.

b) There is no evidence that evaluation systems that incorporate student test scores produce gains in student achievement. In order to determine if there is a relationship, researchers recommend small-scale pilot testing of such systems. Student test scores have not been found to be a strong predictor of the quality of teaching as measured by other instruments or approaches4.

c) The Regents examinations and Grades 3-8 Assessments are designed to evaluate student learning, not teacher effectiveness, nor student learning growth5. Using them to measure the latter is akin to using a meter stick to weigh a person: you might be able to develop a formula that links height and weight, but there will be plenty of error in your calculations.

Citing:

  1. Baker, E., et al. (2011). Correspondence to the New York State Board of Regents. Retrieved October 16, 2011 from: http://www.washingtonpost.com/blogs/answer-sheet/post/the-letter-from-assessment-experts-the-ny-regentsignored/2011/05/21/AFJHIA9G_blog.html
  2. Papay, J. (2011). Different tests, different answers: The stability of teacher value-added estimates across outcome measures. American Educational Research Journal, 48(1), 163–193.
  3. McCaffrey, D., et al. (2004). Evaluating value-added models of teacher accountability. Santa Monica, CA: RAND Corporation.
  4. See Burris, C., & Welner, K. (2011). Conversations with Arne Duncan: Offering advice on educator evaluations. Phi Delta Kappan, 93(2), 38–41.
  5. New York State Education Department (2011). Guide to the 2011 Grades 3-8 Testing Program in English Language Arts and Mathematics. Retrieved October 18, 2011 from http://www.p12.nysed.gov/apda/ei/ela-mathguide-11.pdf
  6. Committee on Incentives and Test-Based Accountability in Education of the National Research Council (2011). Incentives and Test-Based Accountability in Education. Washington, D.C.: National Academies Press.
  7. Baker, E., et al. (2010). Problems with the use of test scores to evaluate teachers. Washington, D.C.: Economic Policy Institute. Retrieved October 16, 2011 from: http://epi.3cdn.net/b9667271ee6c154195_t9m6iij8k.pdf; Newton, X., et al. (2010). Value-added modeling of teacher effectiveness: An exploration of stability across models and contexts. Education Policy and Analysis Archives. Retrieved October 16, 2011 from http://epaa.asu.edu/ojs/article/view/810/858; Rothstein, J. (2009). Student sorting and bias in value-added estimation: Selection on observables and unobservables. Education Finance and Policy, 4(4), 537–571.

In short, the principals built their case against the punditry that’s been foisted upon them on a reasonable read of existing research. Thankfully, they had the capacity to do so, and the interest in pursuing guidance from experts around the country in crafting their response. I urge you to read the remainder of their memo and compare the rigor of the evidence behind their arguments to the type of content that has been presented to them in recent months.
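The instability point in the principals’ concern (a) is easy to see with a toy simulation (mine, not the cited studies’; the numbers are illustrative only): give every teacher a modest true effect, add realistic estimation noise, and year-to-year ratings barely agree.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1000
    true_effect = rng.normal(0, 0.10, n)      # true effects, in student SD units

    # Each year's estimate = true effect + noise from a small, changing roster
    year1 = true_effect + rng.normal(0, 0.15, n)
    year2 = true_effect + rng.normal(0, 0.15, n)

    print(np.corrcoef(year1, year2)[0, 1])    # roughly 0.3

    # Share of year-1 "bottom quintile" teachers who escape the bottom in year 2
    bottom1 = year1 <= np.quantile(year1, 0.2)
    print((year2[bottom1] > np.quantile(year2, 0.2)).mean())  # typically a majority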

New Jersey Punditry

The New York principals’ backlash was relatively high profile. A similar situation occurred last winter/spring in New Jersey, but went largely unnoticed, at least nationally. At that time, a Task Force established by the Governor released its report on how to reform teacher evaluation. The Task Force had been charged with developing an evaluation system based at least 50% on use of student assessment data. So, of course, it did. The task force included an odd array of individuals. It was not, as does occur in some cases, a true “citizen task force” of lay persons providing their lay perspectives. Rather, it was cast as a task force of interested and knowledgeable constituents.

Here is their report: NJ Teacher Effectiveness Task Force

The task force does have a bibliography in its report listing a number of potentially useful sources. Whether its members actually read any of them or understood any of the content is highly questionable, given the content of the recommendations and the footnotes actually cited to validate those recommendations.

And here are the majority of the footnotes (those which actually cite some supposed source of support) from the teacher evaluation section of their report (excluding the principal section), and the claims those footnotes are intended to support:

NJ Educator Effectiveness Task Force Report

Claim: And when used properly, a strong evaluation system will also help educators become more effective.2
Source: 2 For more on this subject, see the discussion in DC IMPACT: http://dc.gov/DCPS/Learn+About+Schools/School+Leadership/IMPACT+(Performance+Assessment)

Claim: The Task Force recommends that the new system have four summative categories: Highly Effective, Effective, Partially Effective, and Ineffective. The number of rating categories should be large enough to give teachers a clear picture of their performance, but small enough to allow for clear, consistent distinctions between each level and meaningful differentiation of teacher performance3
Source: 3 “Teacher Evaluation 2.0,” p. 7, The New Teacher Project, 2010.

Claim: The state review and approval of measurement tools and their protocols will assure that they are sufficiently rigorous, valid, and reliable while also providing districts flexibility to innovate and develop their own tools.4
Source: 4 The Bill and Melinda Gates Foundation in collaboration with many prominent research organizations are in the process of testing a wide array of measurement tools in the Measuring Effective Teaching project: http://metproject.org/

Claim: Studies have found that the results of student surveys can be tightly correlated with student achievement results. Persuasive evidence can be found in the Gates MET study, which uses a survey instrument called Tripod.5
Source: 5 Learning about Teaching: Initial Findings from the Measures of Effective Teaching Project, Bill and Melinda Gates Foundation, 2009

Claim: Growth scores are a fairer and more accurate means of measuring student performance and teachers’ contributions to student learning. In fact, over half of the states surveyed by the Council of Chief State School Officers (CCSSO)—24 out of 43—reported that they either already do or plan to use student growth in analyzing teacher effectiveness.7
Source: 7 State Growth Models for School Accountability: Progress on Development and Reporting Measures of Student Growth, 2010, by the Council of Chief State School Officers.

In short, most of these claims amount to either a) because The New Teacher Project said so, b) because Washington DC does it in the IMPACT evaluation model, or c) because one preliminary release study from the Gates Foundation included inferences to this effect.

NJ Professional Response

Like those pesky informed Long Island principals, a group of New Jersey educators responded, through an organization spearheaded by a local superintendent who has immersed himself in the relevant research on the issues, maintained constant open communication with economists engaged in teacher evaluation studies, and attended many of the sessions those economists have presented. The New Jersey group also engaged researchers from the region to assist in the development of their report.

Here’s a portion of their report, which was drafted concurrently with the Task Force’s activities (and presented to the Task Force, apparently to no avail):

EQUATE REPORT: NJ EQuATE Report

Once again, the professionals have far outpaced the pundits in their intellectual rigor, use and interpretation of far more legitimate, primarily peer reviewed research.

Summing it all up…

I am so thankful these days that we have in our schools professionals like these, who a) are willing to speak out in the face of pure punditry, and b) are capable of making such a strong and well-reasoned case for their own policy proposals – or at the very least for why they should not be backed into the ill-conceived, poorly grounded policy proposals of their governing bodies.

I expect that many “reformy” types and the politicos they support are thinking that these necessarily dumb, high-paid bureaucrat local public school administrators should just sit down and shut up (as in this case) and adopt the policies that they are being told to adopt by those (often highly educated pundits) who simply know better. How pundits “know better” stumps me, because the quality of evidence behind their all-knowing-ness is persistently weak.

I might be more inclined to accept an argument for state policy preferences and technical capacity over local resistance if the contrast in the quality of information being presented by the pundits and the professionals weren’t so damn stark.

Regardless of political disposition (which is obviously an impossible hypothetical to achieve), if each of these sources were handed to me as a paper to grade in a graduate class (even in a school of education), differentiating among them would be quite easy.

The NYSED materials include completely fabricated information, ill-defined concepts, little basis in peer reviewed (or any “real”) research, and such utterly silly things as claiming that we can quadruple outcomes by moving to some undefined strategy.  Yes, this stuff was presented to them by experts they hired. But rather than even attempt to think critically about any of it (and realize it was junk) they simply copied and pasted it into their report and took it on the road. This work fails on any level.

The NJ Task Force report argues that NJ should adopt a multi-category effectiveness classification system (without any understanding of the information lost in aggregation or the problems of aggregating around uncertain cut points) merely because TNTP said so, suggests that use of growth measures is “fair” by citation to a Council of Chief State School Officers report, and bases much of the rest of its recommendations on “what Washington DC did.” Yeah, I’ve read student papers like this. They fail too! Most of my students know full well not to hand me this kind of crap, even if they believe I’m sympathetic to their ultimate conclusion.

But the memo prepared by the NY principals and the report by the NJ professionals are pretty darn good when viewed as a paper I might have to grade. They use real research, and for the most part, use it responsibly. Their recommendations and criticisms are generally well thought out.  For that I applaud them.

That said, it is certainly discomforting that local practitioners have had to counter the pure punditry of the very agencies which arguably should be attempting to provide legitimate, well grounded technical support.

More Inexcusable Inequalities: New York State in the Post-Funding Equity Era

I did a post a short while back about the fact that there are persistent inequities in state school finance formulas, and that those persistent inequities have real consequences for students’ access to key resources in schools – specifically their access to a rich array of programs, services, courses and other opportunities. In that post I referred to the post-funding-equity era as this perceived time in which we live. Been there, done that. Funding equity? No problem. We all know funding doesn’t matter anyway. Funding can’t buy a better education. It’s all about reform. Not funding. And we all know that the really good reformy strategies can, in fact, achieve greater output with even less funding. Hey, just look at all of those high flying, no excuses charter schools. Wait… aw crap… it seems that many of them actually do spend quite a bit. But, back to my point. Alexander Russo put up a good post today about those pesky school funding gaps, asking whatever happened to them? And he nailed it when he pointed out:

 If funding didn’t matter, then rich districts wouldn’t bother taxing themselves to provide resources to local kids.  If funding didn’t matter, high-performing charter schools wouldn’t cost so much.  Until and unless funding matters again in the public debate over education, I fear that we’ll largely be left fiddling at the margins (which is what it feels like we’re doing now).

I will have much more to say in the near future about the mythology about whether, why and how money matters in education. In this post, I’d just like to illustrate some of the extremes in access to resources that persist across school districts in New York State, which along with Illinois (the topic of Russo’s post) remains among the most inequitable states in the nation. (see: http://www.schoolfundingfairness.org)

Let’s start here.

This is a snapshot of the total expenditures per pupil and the need- and cost-adjusted expenditures per pupil of some of the MOST and LEAST advantaged school districts in New York State (in terms of a mix of need & spending measures). Without any adjustment for needs and costs, the high poverty, high need districts in many cases are spending below $16,000 per pupil, while the Top 30 districts spend nearly double that. When adjusted for needs/costs, the disparities widen dramatically.

Even worse, as I’ve explained a few times on this blog, New York State actually uses state aid to help support these disparities, by giving unnecessarily large sums of aid to the top group while continuing to cut aid from the bottom. Here is the distribution of some of that aid:

And here is the distribution of the most recent per pupil cuts in aid:

This all results in a rather ugly pattern of disparities that look rather like this, when we compare current need and cost adjusted funding levels with current district outcomes, as I did in a recent post on Illinois and Connecticut schools:

Because NY has so many districts, I’ve included only the relatively large ones here. This graph shows that districts with more need- and cost-adjusted funding tend to have higher outcomes and those with less tend to have lower outcomes. But, this graph is not intended to be a causal representation of that relationship. Rather, it’s intended to display the patterns of disparity across these districts. In the lower left are districts that are very high need, with very low resources and very low outcomes. Among the standouts in this group are Utica and Poughkeepsie (in red in the first table above). In the upper right corner of the picture are the lower need, high resource and high outcome districts.

What I’ve been finding most interesting though hardly surprising in my research is just how stark the consequences of these disparities are in terms of the actual programs and services provided within these districts. Reformy logic has told us in the past (see: https://schoolfinance101.wordpress.com/2011/05/05/resource-deprivation-in-high-need-districts-caps-goofy-roi/) that really, these districts in the lower left have more than enough money but they insist on wasting it all on junk like cheerleading and ceramics when they should be putting it into basic math/reading coursework.  Alternatively, related reformy logic is that these districts are really just wasting it all on paying additional salaries for experience and degree levels when they could just pay teachers the base salary and do just as well (I’m sure Utica would have great luck in recruiting and retaining teachers with that kind of salary structure. Actually, one of the better articles on relative salaries and teacher job choices uses data on upstate NY cities: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.142.5636&rep=rep1&type=pdf)

Setting aside these, well, completely stupid and unfounded claims (which are so pervasive in today’s education policy debate, especially in NY State), these next few slides take a look at the types of disparities in access to specific courses and opportunities faced by students in New York State’s schools.

First, here are a few slides using data from the Office of Civil Rights data collection on AP participation rates and participation in other key milestone courses. These data are shown with respect to district poverty rates, and poor small city districts (and some less poor, but still not advantaged ones) are highlighted.

This first slide shows the ratio of students in 7th grade (early) algebra to those taking algebra in high school. As poverty rates increase, rates of participation in early algebra decline. Clearly, to a large extent, this pattern occurs because fewer students in these districts are prepared for early algebra.

This slide shows overall participation in advanced placement courses. Overall, AP participation declines as poverty increases. Again, this is likely partly due to differences in readiness for these courses among higher poverty populations.

But, it’s also likely due to differences in access to and availability of resources. For high need districts to both a) provide the advanced opportunities for kids in middle and secondary school and b) make sure kids are prepared to take advantage of those opportunities, they would need additional resources on the front end (to make sure kids are prepared for early algebra) and on the back end (to be able to provide the advanced courses once kids are prepared).

The contrast between the top 30 and bottom 30 (and small city) districts in New York State, as evidenced by the allocation of teaching assignments, is striking and disturbing. Let’s start with the allocation of teaching assignments to advanced and college credit courses (not all are included). I’ve tallied teaching assignments per 1,000 students (in the group of schools, excluding NYC) based on statewide staffing data from 2010-11. This is very preliminary stuff, from a large data set on all teacher assignments in NY State.

What this first tally shows is that in the high performing, high spending, affluent school districts, there are .5 teacher assignments per 1,000 pupils allocated to AP Physics B. In low performing, low spending, high poverty districts, there are only .05 teacher assignments per 1,000 pupils. That works out to a disparity ratio of 8.61 (computed from the unrounded figures). In other words, pupils in advantaged districts have nearly 9 times the access to teachers assigned to AP Physics as do pupils in disadvantaged districts. In nearly every college credit or AP course, disparity ratios run from about 2 to 9 fold. The same is true for disparities specifically between the top districts and the poor small city districts, which largely fall in the lower left of the quadrant figure above.
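For the data-curious, the tally itself is simple. Here’s a minimal sketch of the normalization and ratio calculation; the column names and counts below are hypothetical placeholders, not the actual NYSED staffing file layout.

```python
# Minimal sketch of the disparity-ratio tally described above.
# All names and numbers here are illustrative placeholders.
import pandas as pd

staffing = pd.DataFrame({
    "group": ["top30", "top30", "bottom30", "bottom30"],
    "assignment": ["AP Physics B"] * 4,
    "n_assignments": [40, 35, 4, 3],                 # illustrative assignment counts
    "enrollment": [80_000, 70_000, 75_000, 65_000],  # illustrative enrollments
})

# Assignments per 1,000 pupils, pooled within each group of districts
totals = staffing.groupby("group")[["n_assignments", "enrollment"]].sum()
per_1000 = 1000 * totals["n_assignments"] / totals["enrollment"]

# Disparity ratio: advantaged vs. disadvantaged access
print(per_1000, per_1000["top30"] / per_1000["bottom30"])
```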

Now, you might be saying…well… they don’t have these programs because of all of their frivolous spending on music and arts. Not so much.

On average, most middle and secondary music and arts staffing assignments also run at about a 2 fold or greater disparity between high and low need/resource districts in New York State.  Kids in high need, low resource, low outcome districts have substantially less access to band, chorus, orchestra, private instrumental or vocal lessons…. and JAZZ BAND! This is not an exhaustive list. And a handful of arts opportunities are allocated roughly with parity (1:1), but high need, low resource districts do not have substantially greater resources allocated to any of these areas and generally have much less.

The one area where the resource balance shifts systematically is in the allocation of remedial and special education related staffing assignments. Here are some examples. Even in special education, in some cases high resource districts retain their advantage. But on average, the higher need, lower resource districts are driving additional resources into special education related teaching assignments. And just to clarify, no, these districts are not way ahead on class size reduction. A few are. Others clearly are not!

In general in NY State, high need districts are, well, screwed. And as I’ve shown in recent posts, the current leadership in New York State has done little to really help – and arguably much to hurt.

Inequity still matters.

Funding inequity has real consequences for the programs, services and educational opportunities that can be provided to kids.

Anyone who suggests otherwise – that funding is somehow irrelevant to any and all of this – is, well, full of crap. These things cost money. Providing both/and costs more than providing either/or.

To reiterate, this is not the post-funding era!

In fact, quite depressingly, we may be sitting at the edge of a new era of dramatic educational inequalities unlike any we’ve experienced in recent decades.


When VAMs Fail: Evaluating Ohio’s School Performance Measures

Any reader of my blog knows already that I’m a skeptic of the usefulness of value-added models for guiding high stakes decisions regarding personnel in schools. As I’ve explained on previous occasions, while statistical models of large numbers of data points – like lots of teachers or lots of schools – might provide us with some useful information on the extent of variation in student outcomes across schools or teachers, and might reveal some useful patterns, it’s generally not a useful exercise to try to say anything about any one single point within the data set. Yes, teacher “effectiveness” estimates tend to be based on the many data points across the students taught by that teacher, but they are still highly unstable – unstable to the point where, even as a researcher hoping to find value in this information, I’ve become skeptical.

However, I had still been holding out more hope that school level aggregate information on student growth – value-added estimates – might be more useful, mainly because it represents a higher level of aggregation. That is, each school is indeed a single point in a school level analysis, but that point represents an aggregation of student points – more student points than would be aggregated to any one teacher in a school. Generally, school level value-added measures, BECAUSE of this aggregation, are somewhat more reliable.

I’m in the process of compiling data as part of a project which includes data on Ohio public schools. Ohio makes available school level value added ratings as well as traditional school performance level ratings. For that, I am grateful to them. Ohio also makes school site financial data available. Thanks again Ohio!

At the outset of any project, I like to explore the properties of various measures provided by the state. For example, to what extent are current accountability measures a) related to the same measures in the previous years, and b) related to factors such as student population characteristics?

Matt Di Carlo over at http://www.shankerblog.org (see: http://shankerblog.org/?p=3870) has already addressed many/most of these issues with regard to the Ohio data. But, I figured I’d just reiterate these points with a few additional figures, focusing especially on the school level value added ratings.

As Matt Di Carlo has already explained, Ohio’s performance index, which is based on percent passing data, is highly sensitive to concentrations of low income students.

Ohio performance index and % free lunch:

Nothing out of the ordinary here (except perhaps the large number of 0 values, which I didn’t bother to exclude – and which really compromise my r-squared… will fix if I get a chance). On this type of measure, this is pretty much expected and common across state systems. This is precisely why many state accountability system measures systematically penalize higher poverty schools and districts: they depend on performance level comparisons, and performance levels are highly sensitive to student/family backgrounds.

As a result, these heavily poverty biased measures are also pretty stable over time. Here’s the year to year correlation of the performance Index.

I’ve pointed out previously that one good way to get more stable performance measures over time – for schools, districts or for teachers – is to leave the bias in there. That is, keeping the measure heavily biased by student population characteristics keeps the measure more stable over time – if the student populations across schools and districts remain stable. More reliable yes. More useful, absolutely NOT.

It’s pretty much the case that the performance index received by a school this year will be in line with the index received the previous year.
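Checks like these are easy to reproduce. Here’s a minimal sketch, assuming a file with one row per school and hypothetical column names (the actual Ohio field names differ):

```python
# Sketch of the year-to-year stability check described above.
# File name and column names are hypothetical placeholders.
import pandas as pd

ohio = pd.read_csv("ohio_school_ratings.csv")

# Drop the zero values mentioned above, which otherwise distort the fit
ohio = ohio[(ohio["perf_index_2009"] > 0) & (ohio["perf_index_2010"] > 0)]

# Year-to-year correlation of the percent-passing-based performance index
r_stability = ohio["perf_index_2010"].corr(ohio["perf_index_2009"])

# Sensitivity to poverty concentration
r_poverty = ohio["perf_index_2010"].corr(ohio["pct_free_lunch"])

print(f"PI 2009 vs 2010: r = {r_stability:.2f}; PI vs % free lunch: r = {r_poverty:.2f}")
```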

Therein lies part of the argument for moving toward gain or value-added ratings. Note however that an exclusive emphasis on value-added without consideration for performance level means that we can ignore persistent achievement gaps between groups and the overall level of performance of lower performing groups.  That’s at least a bit problematic from a policy perspective! But I’ll set that aside for now.

Let’s take a look at what we can resolve and can’t resolve in Ohio school ratings by moving toward their value-added model (technical documentation here: http://www.ode.state.oh.us/GD/Templates/Pages/ODE/ODEDetail.aspx?Page=3&TopicRelationID=117&Content=113068)

As I noted above, I’d love to believe that the school level value-added estimates would provide at least some useful information to either policymakers or school officials. But, I’m now pretty damn skeptical, and here’s more evidence regarding why. Here is the relationship between 2008-09 and 2009-10 school value added ratings using the overall “value added index.”

Note that any school in the lower right quadrant had positive growth in 2009 but negative growth in 2010. Any school in the upper left had negative growth in 2009 and positive growth in 2010. It’s pretty much a random scatter. There is little relationship at all between what a school received in 2009 and in 2010 (or in 2008 or earlier, for that matter).

So, imagine you are a school principal and year after year your little dot in this scatter plot shows up in a completely different place – odds are quite in favor of that! What are you to do with this information? Imagine trying to attach state accountability to these measures? I’ve long expressed concern about attaching any immediate policy actions to this type of measure. But in this case, I’m even concerned as to whether I have any reasonable research use for these measures. They are pretty much noise.

Here’s a little fishing into the rather small predictable shares of variation in those measures:

As it turns out, the prior year index is a stronger (though still weak) predictor of the current year index. But, it’s also the case that districts that had higher overall performance levels in the prior year tended to have lower value added the following year, and districts with higher % free lunch and higher % special education populations also had lower value added (among those starting at the same performance index level). That is, some of the predictable stuff here is bias – indicative of model-related (if not test-related) ceiling effects as well as demographic bias. That’s really unhelpful, and likely overlooked by most playing around with these data.

I get a little further if I use the math gains (the reading gains are particularly noisy).

These are ever so slightly more predictable than the aggregate index, but not a whole lot. And they too are a predictable function of stuff they shouldn’t be:

Again, districts that started with a higher performance index have lower gains, and districts with higher free lunch and special education populations have lower gains… and yes… these biases cut in opposite directions. But that doesn’t provide any comfort that they are counterbalancing in any way that makes these data at all useful.
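The bias check itself amounts to a simple regression: current-year gains regressed on the prior gain, the prior performance level, and demographics. A sketch, again with hypothetical column names:

```python
# Sketch of the bias regression described above; column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

ohio = pd.read_csv("ohio_school_ratings.csv")

model = smf.ols(
    "va_math_gain_2010 ~ va_math_gain_2009 + perf_index_2009"
    " + pct_free_lunch + pct_special_ed",
    data=ohio,
).fit()
print(model.summary())

# Negative, significant coefficients on perf_index_2009, pct_free_lunch or
# pct_special_ed would indicate that part of the "predictable" variation
# reflects ceiling effects and demographic bias rather than school effectiveness.
```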

If anything, the properties of the Ohio value-added data are particularly disheartening.  There’s little if anything there to begin with and what appears to be there might be compromised by underlying biases.

Further, even if the estimates were both more reliable and potentially less biased, I’m not quite sure how local district administrators would derive meaning from them – meaning that would lead to actions that could be taken to improve – or turn around their school in future years.

At this point and given these data, the best way to achieve a statistical turn around is probably to simply do nothing and sit and wait until the next year of data. Odds are pretty good your little dot (school) on the chart will end up in a completely different location the next time around!


A Look at State Aid Cuts in New York State 2011-12

Following is another in my school finance geeky series of straight-up analyses of state school finance formulas. I wrote about New Jersey’s funding formula a few days ago. This analysis focuses specifically on the cuts levied across NY school districts for 2011-12 and the underfunding of the foundation formula for select districts.

In 2007, New York State adopted the new Foundation Aid Program.

A full critique of that state aid program can be found here: NY Aid Policy Brief_Fall2011_DRAFT6

That school funding formula was argued by the state to represent the state’s constitutional obligation to provide for a sound basic education. That argument was built on the assumption that the underlying base aid for the formula would be calculated by estimating the average instructional spending per pupil of districts statewide that were performing well, or achieving 80% proficiency on state assessments.[1] By 2011-12, the foundation level was to be set to $6,535.[2] For each district, the sound basic level of funding would be determined by multiplying the foundation funding level times that district’s Pupil Need Index to account for variations in student populations to be served, and Regional Cost Index to account for variations in regional labor costs.

Target “Sound Basic” Funding per Pupil = Foundation x PNI x RCI

Next, to determine each district’s total sound basic, or foundation formula, funding target, this per pupil funding figure was to be multiplied by the Total Aidable Foundation Pupil Units, or TAFPU. TAFPU is based on district enrollments, but includes additional weightings to account for student needs, such as students with disabilities and summer school pupils.

Total Sound Basic Funding Target = Sound Basic Funding per Pupil x TAFPU

Next, for each district, the state determines the share of the total to be raised locally and the share to be distributed in state foundation aid. A district receives the greater of the aid levels produced by two different calculations:

State Foundation Aid = Total Sound Basic Funding Target – Expected Minimum Local Contribution

OR

State Foundation Aid = Total Sound Basic Funding Target x State Aid Sharing Ratio
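In code, the whole calculation is a few lines. Here’s a sketch mirroring the three formulas above; the sharing ratio in the example is a purely hypothetical placeholder, and the actual parameter values come from the state aid runs:

```python
# Sketch of the Foundation Aid arithmetic laid out above.
def sound_basic_per_pupil(foundation, pni, rci):
    # Target "Sound Basic" Funding per Pupil = Foundation x PNI x RCI
    return foundation * pni * rci

def state_foundation_aid(target_per_pupil, tafpu,
                         local_contribution_per_pupil, sharing_ratio):
    # Total Sound Basic Funding Target = per-pupil target x TAFPU
    total_target = target_per_pupil * tafpu
    # The district receives the greater of the two calculations
    aid_net_of_local = (target_per_pupil - local_contribution_per_pupil) * tafpu
    aid_by_ratio = total_target * sharing_ratio
    return max(aid_net_of_local, aid_by_ratio)

# Albany, per the figures reported below: a $12,179 target and a $4,749 local
# contribution. The ~$93.5 million aid figure implies roughly 12,600 aidable
# pupils; the 0.50 sharing ratio here is purely hypothetical.
print(state_foundation_aid(12_179, 12_600, 4_749, 0.50))  # ~ $93.6 million
```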

 Applying the Formula to Small Cities and New York City

We can apply these calculations to determine the aid that should have been received in 2011-12 by several of the state’s small cities and by New York City, based on data and parameters from state aid runs as provided on April 1, 2011. (again… this is how it hypothetically works).

Table 1 shows the first portion of the calculations.

Note that these are all high need districts, though Tonawanda and North Tonawanda are certainly lower need than Utica or New York City. Among the districts Utica has by far the highest pupil need index. New York City and other downstate Hudson Valley districts have the highest labor market cost estimates. All but Tonawanda and North Tonawanda receive target per pupil funding levels over $10,000.

In the next step, we determine the total foundation funding and the state share of that funding target.

Table 2. Calculation of Promised State Aid

For example, for Albany, the target per pupil funding is $12,179. The expected minimum local contribution is $4,749, and the difference between the two is $7,430 per pupil. In the case of Albany, that difference becomes the state aid per pupil amount. Multiply that amount by the aidable pupils, and you’ve got total state aid of about $93.5 million. For New York City, it turns out that the higher aid amount comes from using the State Aid Sharing Ratio instead of the difference between target funding and the estimated local contribution. By the final calculation, New York City would receive about $8.6 billion in aid.

 Broken Promises: Aid Freezes and Gap Elimination

But, this is all hypothetical. This is all entirely based on the promised foundation aid formula. This is all based on the foundation aid formula that the state has argued is by its design the manifestation of the state’s own constitutional obligation to provide a sound basic and meaningful high school education to children across New York State.  Note that I have provided an entirely separate report which explains the insufficiency of these targets and the rationale behind them. But let’s accept these targets for the moment and explore the extent to which even these modest promises have been ignored. Because we are dealing with really big numbers here, Table 3 reports those numbers in millions.

Table 3. Foundation Freezes and Gap Reductions (or are they just aid cuts?)

For Albany, the sound basic level of aid calculated by the legislature’s own formula is about $93.5 million. But, from the start, foundation aid was frozen at prior year levels, which were actually frozen at the levels of the year prior to that. For Albany, the aid freeze brings them down to $56.7 million, or a $37 million shortfall from their sound basic aid calculation. For New York City, the freeze alone pulls out $2.4 billion in aid. For small cities, the total reduction from the freeze, the total underfunding of sound basic aid, is about $271 million.

But it doesn’t end there. The state budget for 2011-12 does not promise to fund even that frozen level of aid. Rather, an additional “Gap Elimination Adjustment” was applied to cut aid further. At the last minute of the legislative session, this adjustment was partially, but not fully, reduced. The adopted Gap Elimination Adjustment removes another $12.5 million from Albany, bringing their actual state aid level for 2011-12 to rest at $44.2 million, or less than half of their sound basic aid target. The total funding gap for small cities is $370 million. And the total funding gap for New York City after the Gap Elimination Adjustment is $3.2 billion.
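To make the arithmetic concrete, here’s Albany’s chain of reductions using the figures above (all in millions):

```python
# Albany's promised vs. actual aid, 2011-12, using figures from the text (millions)
promised_aid = 93.5   # formula-calculated sound basic aid
frozen_aid = 56.7     # after the foundation aid freeze
gea_cut = 12.5        # adopted Gap Elimination Adjustment

freeze_shortfall = promised_aid - frozen_aid   # ~36.8, the "$37 million" shortfall
actual_aid = frozen_aid - gea_cut              # 44.2
share_kept = actual_aid / promised_aid         # ~0.47 -- less than half the promise

print(freeze_shortfall, actual_aid, round(share_kept, 2))
```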

In summary, even if we pretend that the current foundation formula does provide for a sound basic education, even if we ignore that the current foundation formula is set to relatively low success rates on an assessment where scores had become inflated over time, the New York State Legislature has fallen 30% to 50% or more below these funding promises for many high need, large districts. Statewide, the foundation formula shortfall before Gap Elimination adjustment is approximately $5.5 billion, and after gap elimination adjustment is $8.1 billion. While the current formula itself falls short in many ways, the New York Legislature faces a serious uphill climb simply to keep their own promises.

Spreadsheet of Calculations: Funding Gap NY Calculations

Note: Analysis above focuses on the Foundation Aid Program. Other aids outside this formula include:

F(FA0013) 00 2011-12 CHARTER SCHOOL TRANSITIONAL

G(FA0029) 00 2011-12 HIGH TAX AID

H(FA0065) 00 2011-12 SUMMER TRANSPORTATION AID

I(FA0069) 00 2011-12 TRANSPORTATION AID W/O SUMMER

J(FA0073) 00 2011-12 BUILDING AID

K(FA0077) 00 2011-12 BUILDING  REORG INCENTIVE AID

L(FA0081) 00 2011-12 OPERATING REORG INCENTIVE AID

M(FA0085) 00 2011-12 NON-CMPNT COMPUTER ADMIN AID

N(FA0089) 00 2011-12 NON-CMPNT CAREER EDN AID

O(FA0021) 00 2011-12 NON-CMPNT ACADEMIC IMPROVMT AID

P(FA0093) 00 2011-12 BOCES AID

Q(FA0097) 00 2011-12 PUBLIC EC HIGH COST AID

R(FA0101) 00 2011-12 PRIVATE EXCESS COST AID

S(FA0105) 00 2011-12 SOFTWARE AID

T(FA0109) 00 2011-12 LIBRARY MATERIALS AID

U(FA0113) 00 2011-12 TEXTBOOK AID

V(FA0117) 00 2011-12 HARDWARE & TECHNOLOGY AID

W(FA0121) 00 2011-12 FULL DAY K CONVERSION

X(FA0125) 00 2011-12 UNIV PREKINDERGARTEN AID

Y(FA0033) 00 2011-12 SUPPLEMENTAL PUB EXCESS COST

Z(FA0185) 00 2011-12 ACADEMIC ENHANCEMENT AID

More with Less or More with More & Why it Matters!

I did a piece a short while back on TEAM Academy, a charter school in Newark, NJ, which I thus far admire. I admire the school because, while the data I’ve been able to gather from official sources still indicate that TEAM is far from a statistical match with its surroundings, and appears to have greater cohort attrition than I might like to see, I am, at this point, comfortable stating that TEAM Academy is more comparable to its surroundings than other Newark charters.

Allow me to restate why I care about the comparability piece of the puzzle. First, let me say that I do believe there is (or at least may be) an important role in urban school systems – or any school systems, for that matter – for schools that aren’t entirely comparable. That’s the case for magnet schools, for example, which have in some rigorous studies been shown to produce positive outcomes for kids who attend. (see: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.152.385&rep=rep1&type=pdf)

But, when schools like magnet schools show positive outcomes we must recognize them for what they are and not make bold assumptions that those schools can easily be replicated districtwide or nationwide for “all kids” otherwise “trapped” in “failing schools.” Magnet or other selective schools’ success is likely significantly contingent on the student population served. The same goes for some charter schools, which is a key point: it is foolish to ever lump all charters into one basket as if they represent a single reform strategy. They are a diverse mix of schools. Some serve populations more comparable to surrounding district schools and operate more like open enrollment public schools, while others are far more similar to magnet schools in terms of the population served and the curriculum that can then be delivered to that population. When charters are effectively magnet schools (like North Star in Newark), scalability must be viewed differently (in part because the “success” of the school is as likely dependent on the selective student body as on any program/services/curriculum provided).

But the debate on scalability of “successful” charters goes beyond just the student population comparability issue. Far too often the rhetoric around successful charters involves the following three part claim:

Claim: Successful charter schools serve the same students, for less money and get better outcomes than traditional public schools.

Rarely if ever are these three components sufficiently validated.  This is especially true of the same students and less money prongs of the argument.  If policymakers accept on faith that pundits are truthful in these claims, policymakers may develop a false confidence as to how easily and how cheaply charter expansion can lead to improved outcomes.  It would behoove policymakers to take a much closer look at all three prongs of the issue, and consider each of these possibilities in Table 1.

Table 1. Framework of Possibilities

Note that this table can be expanded to include those cases of charters that serve non-comparable populations that are more needy than nearby traditional public schools (a focus of many specialized charters).

As I noted in my post regarding TEAM Academy, while the expenditure comparisons (particularly in New Jersey) are complicated, they are critically important. And, perhaps my most important statement in that post is that there is no shame in spending more to provide a good education. Charter supporters (or anyone for that matter) should not understate the costs of their additional efforts. Charter supporters should not downplay the importance of class size reduction, teacher salaries, or extended learning time in an effort to fit themselves into a category in Table 1 into which they don’t really belong.

Policymakers need to know what works and why it works. If a charter school is really freakin’ successful by spending more money on certain things and/or spending it differently, that’s important to know, even if their overall success is partly contingent on serving a selective population. Simply adopting the rhetoric of serving the same students, for less money and getting better outcomes than traditional public schools is unhelpful when it’s simply not true. Even worse, it’s potentially harmful to promote expansion on such a false premise.

So, here are a few more examples which come from preliminary explorations which are part of a much bigger project to get a handle on charter spending. Note that I began this project over a year ago and released a detailed report on New York City charter spending last year: http://nepc.colorado.edu/publication/NYC-charter-disparities. That report provides important supporting detail for this post regarding making sound comparisons of spending in Charters and traditional public schools in NYC.

Let’s start with a look at Amistad Academy, a well-known high performing charter school in New Haven, Connecticut, and part of the Achievement First network (www.achievementfirst.org). By usual accounts, Amistad is a high flying charter school. Let me be absolutely clear about this – I’m not crapping on Amistad. To the best of my data-driven understanding, it’s a very good school providing strong academic opportunities for kids in New Haven. But, from a policy standpoint, it’s worth at least a cursory exploration of data on the three prongs above.

The following analyses use a mix of data from the National Center for Education Statistics Common Core of Data, from the CTDOE data system (http://sdeportal.ct.gov/Cedar/WEB/ct_report/DTHome.aspx) and from Guidestar (www.guidestar.org). In order to have all data elements lined up to a common fiscal and enrollment year, I’ve focused on school year 2008-09 here.

Figure 1. Amistad % Free Lunch compared to New Haven Schools

Figure 2. Map of Amistad % Free Lunch compared to Surround Schools


NOTE: I’m informed (see comment below) that the school location shown for Amistad is not correct. The school location is based on the latitude and longitude provided in the NCES Common Core of Data (www.nces.ed.gov/ccd/bat). As I suspected might be the case, the CCD lat/lon indicates the location of the central office of Achievement First (403 James Street). Amistad itself is located in the area indicated, near many high poverty traditional public schools (130 Edgewood Avenue, New Haven, CT 06511).

So, Figure 1 and Figure 2 show quite decisively that Amistad is not serving a population which is comparable to surroundings in terms of % qualified for free lunch. Amistad also reports 0% LEP/ELL [no data] while the district reports 12.6% (http://sdeportal.ct.gov/Cedar/WEB/ct_report/EllDTViewer.aspx)

Now, let’s take a look at Amistad’s per pupil spending compared to New Haven public schools. Note that it’s generally not a great idea to try to compare against the district as a whole. If we are comparing Amistad’s performance to elementary and middle schools in New Haven, we should be comparing Amistad’s spending to elementary and middle schools. I’ll provide examples for KIPP charter schools in NYC and Houston at the end of this post.

One must also figure out what components of spending are “in” and what components are “out” when making a host district comparison to charter spending. For example, host districts are responsible for transportation of children to charters in CT. So, that spending should be removed from host district spending. Note that Amistad logically reports no expenditures on transportation in CTDOE spending reports. http://sdeportal.ct.gov/Cedar/WEB/ct_report/FinanceDTViewer.aspx

Further, host districts are responsible for costs of all resident children with disabilities, and it is difficult to discern whether any of these costs (other than the regular education costs of those students) show up in the charter expenditures. Amistad reports no percentage of spending on special education in CTDOE reports (reporting its total general expenditure figure instead). It is most likely that a large share if not all of the district special education spending should be excluded from the district spending figure. http://sdeportal.ct.gov/Cedar/WEB/ct_report/FinanceDTViewer.aspx

Finally, Amistad is part of a national network which might be considered analogous to its “district,” and expenditures by that national organization should be included. I’ve played it very conservative by prorating only the “administrative” expenses (www.guidestar.org) of Achievement First Inc. across all students in the network, for an additional $218 per pupil (roughly $1.125 million spread over about 5,150 students – see the notes to Figure 3). (www.achievementfirst.org)
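Putting the last three adjustments together, the comparison logic looks roughly like this. The Amistad figures come from the IRS 990 and the footnotes below; the New Haven inputs are marked as illustrative placeholders, since the post draws them from the CTDOE reports rather than printing them:

```python
# Sketch of the charter-vs-host-district spending comparison described above.
# Charter side: add prorated network overhead.
amistad_total = 9_575_340                     # IRS 990 total expenditures, 2008-09
amistad_enrollment = 641
network_admin_per_pupil = 1_125_000 / 5_150   # ~ $218 prorated AF admin expense

amistad_per_pupil = amistad_total / amistad_enrollment + network_admin_per_pupil

# District side: strip spending the charter does not carry.
nh_total = 300_000_000       # ILLUSTRATIVE ONLY -- actual figure is in CTDOE reports
nh_enrollment = 20_000       # ILLUSTRATIVE ONLY
nh_transportation = 5_000_000  # ILLUSTRATIVE ONLY -- host district buses charter kids too
nh_special_ed = 0.1808 * nh_total   # 18.08% of total expense, per footnote [2]

nh_adjusted_per_pupil = (nh_total - nh_transportation - nh_special_ed) / nh_enrollment

print(round(amistad_per_pupil), round(nh_adjusted_per_pupil))
```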

Figure 3. Per pupil Spending in Amistad Academy and New Haven

Data Sources: New Haven City & Amistad CTDOE http://sdeportal.ct.gov/Cedar/WEB/ct_report/FinanceDTViewer.aspx Amistad Academy IRS 990 from www.guidestar.org (Total expenditures =  $9,575,340, enrollment = 641 in 2008-09)

[1] Host districts are responsible for transportation costs for in district students enrolled in charters.

[2] 18.08% of New Haven Public Schools total expense is on special education.  Amistad reported total expenditures as special educ. expenditures & 0% to special education in 2008-09. See: http://sdeportal.ct.gov/Cedar/WEB/ct_report/FinanceDTViewer.aspx

[3] Achievement First Administrative Expense in subsequent year (guidestar.org) $1.125 million with cumulative enrollment 2010-11 of approximately 5,150 (tallied from achievementfirst.org)

So, Figure 3 shows that Amistad Academy at the very least spends an amount comparable to New Haven district-wide spending after excluding transportation, and spends quite a bit more than the New Haven district per pupil if we exclude all of New Haven’s special education spending. Even if we excluded only a portion of New Haven’s special education spending (likely more appropriate), Amistad’s spending would be quite a bit higher than New Haven Public Schools’. Again, there should be no shame in trying to spend more to provide a good school. Rather, it’s arguably quite noble.

I’m not a big fan of relying exclusively on aggregate spending figures. Rather, I prefer to dig under the hood a bit to see how those dollars are leveraged. This is especially important if we really want to figure out how to replicate the successes of a school like Amistad, albeit with a very different population.

Figure 4 shows the class sizes by grade level in Amistad and New Haven public schools based on CTDOE data from 2008-09. Amistad appears to have leveraged money for smaller class sizes in the lower grades, a choice which arguably makes sense given the existing research on the effects of class size reduction. Overall, Amistad has lower class sizes than the district at the same grade levels. And that costs money.

Figure 4. Class Size by Grade Level

Now, on to teacher salaries. In my previous post on TEAM Academy in Newark, NJ, I found that TEAM had scaled up teacher salaries on the front end of experience and paid much higher salaries than Newark Public Schools (no easy accomplishment) for new to mid-career teachers, putting TEAM in a pretty good position for local recruitment and retention. Figure 5 shows that Amistad has done much the same. To construct Figure 5, I used six years of data on individual classroom teachers in Connecticut and estimated a teacher salary model as a function of experience, degree level and year of the data. I estimated separate models for New Haven schools and for Amistad, and used those models to impute the implicit teacher salary schedule.
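For those curious about the mechanics, here’s a sketch of that kind of salary model: pooled teacher-level records regressed on experience, degree and year, fit separately by employer. The file and column names are hypothetical stand-ins for the CTDOE data:

```python
# Sketch of the implicit-salary-schedule approach described above.
import pandas as pd
import statsmodels.formula.api as smf

teachers = pd.read_csv("ct_teachers.csv")  # hypothetical teacher-level extract

def fit_salary_model(df):
    # Salary as a function of experience (with curvature), degree and year
    return smf.ols(
        "salary ~ experience + I(experience**2) + C(degree) + C(year)",
        data=df,
    ).fit()

nh = fit_salary_model(teachers[teachers["employer"] == "New Haven"])
am = fit_salary_model(teachers[teachers["employer"] == "Amistad Academy"])

# Impute the implicit schedule: predicted salary at each experience step,
# holding degree and year constant
grid = pd.DataFrame({"experience": range(0, 21)})
grid["degree"], grid["year"] = "MA", 2009

schedule = pd.DataFrame({
    "new_haven": nh.predict(grid),
    "amistad": am.predict(grid),
})
print(schedule)
```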

Amistad is paying more on the front end, and far outpacing the district across the first several years of the salary schedule (figures jump around for later years in Amistad due to very few teachers in those categories). And perhaps this allows Amistad to recruit and retain the teachers it wants. More exploration is warranted.

Figure 5. Modeled Teacher Salaries by Degree and Experience Level

So, in summary, what we have here is a high performing school that does not serve the same population, spends more than the local district and chooses to leverage spending toward class size reduction in the early grades and toward competitive early to mid-career teacher salaries. That’s a realistic look at a school that by many accounts is a darn good one.

[but a look I suspect some will still take offense to]

The population differences of the school create serious limitations for determining its scalability. That is, is the performance a function of the students or of the school? That’s hard to tell (even in a rigorous lottery based analysis). Further, the expense of the Amistad model of reduced class size and higher wages on the front end may cause some policymakers to balk. But that expense may be indicative of what’s actually needed, even with a more selective student population.

Perhaps more importantly, even with publicly available macro level data we can gain some insights into how the additional money is leveraged. And it would appear that Amistad is doing things I would consider quite logical, such as early grade class size reduction and paying competitive teacher wages. Those aren’t necessarily the sexy things the “cool kids” might be expecting. And those are both things that cost money. It would be hard to run a school with both reduced class sizes AND competitive wages while spending substantially less. And it is critically important that we recognize this!

Addendum: Making school level spending comparisons in New York City and Houston

Note that a major shortcoming of the Connecticut data above is that they don’t allow for comparison of New Haven schools’ spending by grade level or against individual comparable schools. I have begun large scale analysis of school site expenditures in numerous other contexts. Below are two examples of school site comparisons against same grade level schools – including comparable budget components (as well as spelling out in the fine print those aspects which aren’t directly comparable – see the footnote about KIPP Academy financial reporting; much more detail in my NEPC report).

A1. KIPP Schools in New York City (preliminary analysis)

Like Amistad, KIPP middle schools in NYC appear to be spending more than NYC public middle schools in the same parts of the city. They are a) not serving comparable populations and b) spending more (even if we spread KIPP Academy spending across all schools and exclude KIPP to College spending).

Making the appropriate corrections for facilities access is complicated in Connecticut because facilities expenses are not broken out for the charters. The CTDOE figures for Amistad and New Haven above contain the same reported components (when transportation & special education are excluded for New Haven), but facilities lease payments may be (are likely) embedded in the operating expenses of Amistad (& tend to run around $1,500 per pupil in NJ cities, and over $2,000 per pupil in Manhattan). However, New Haven remains responsible for upkeep and renovation of its facilities as well as any payments on debt that may exist. That is, district facilities are not, as some might argue, “free.”

So, for example, Amistad spends about $828 per pupil on plant operations and maintenance, while New Haven spends $1,735 per pupil in 2008-09 (a difference of $907). But, on administrative & support services, New Haven spends $1,863 per pupil and Amistad spends $3,585 per pupil (a difference of $1,722). This latter figure likely includes a significant lease payment (or some other peculiar overhead expense), but is partially offset by the difference in operations and maintenance (net difference of $1,722 – $907 = $815, which is smaller than the total expenditure differences reported above, but does close some of the gap). But these back of the napkin approaches only get you so far.

I have greater capacity to correct for these differences in my more detailed NYC data used previously in my NEPC report and used above.

A2: KIPP (and all other charters) in Houston (preliminary analysis)

http://ritter.tea.state.tx.us/perfreport/aeis/2010/DownloadData.html

One can see in the figure above that many of the KIPP schools in Houston are spending well above a) most other charters, b) most Houston public schools, and c) the Houston district average expenditure. Yes, charters on the whole are a mixed bag. Many are quite low spending. These data likely need much more cleaning and cross-checking. But they are generally accessible through the TEA web site.

========================

NOTE: All data used in these posts come from official state, federal and IRS documents, in a few cases through respected aggregators of data (guidestar.org).  In a few cases above, I rely on total enrollment counts from the organization web site (Achievement First). Generally, I rely on official data and provide URLs to data sources so that any and all analyses can be checked, replicated, etc.  If you are a representative of a school and believe your data to be “wrong,” I will typically respond by at least checking that I have not made an error in reporting the data. But, if the data are what they are, then I suggest that you go to the source for any corrections. Most of these data are reported by the schools themselves to the state and federal agencies in question. I just report them as they are, and do certainly attempt to reconcile anything that appears out of line – and will make corrections when the correction can be validated.


Thoughts on Improving the School Funding Reform Act (SFRA) in NJ

I’ve seen a number of tweets and vague media references of late about the fact that NJ Education Commissioner Cerf will at some point in the near future be providing recommendations for how to change the School Funding Reform Act of 2008.

I also have it on good authority that NJDOE has convened a working group to discuss how to alter SFRA and are bringing in outside consultants for ideas. To no surprise, I’ve been left out of these conversations, despite my narrowly focused expertise on these very topics.

SFRA is subject to review by the department. Most of SFRA is laid out in statute, or laws passed by the legislature. But, as I understand it, the department of education does have some latitude to “tweak” parameters within SFRA. For example, adjusting/changing various weights and other factors which drive more money to some districts and less to others.

Now, I hate to stick my nose in on this process with my own preemptive recommendations, but you see, this happens to be a topic I know something about. After all, if within my broad areas of expertise on education policy/finance there is one area in which I really specialize it’s the design of state school finance formulas to meet student needs. And, I happen to have a little background on NJ’s SFRA. So, here’s my free advice. A little pro-bono technical advisement.

First, keep in mind that I have in the past testified on problems with SFRA, specifically focusing on what I consider to be technical errors made in the original design of the formula which fall under the umbrella of “tweakable” stuff. I also happen to have done research, given conference presentations, and published peer reviewed research related to some of the problematic features of SFRA – specifically the way the state chose to adjust for competitive wage variation across settings and the way the state chose to fund special education.

My apologies to all the non-Jersey and non-finance geeks out there for whom this analysis is going to quickly go technical. Can’t avoid it. It would take far too much space to provide full background on each issue. But I do have complete related documentation linked throughout. My reason for this post is simply to get this stuff out there. To make it known what the actual, technical issues are and what should be addressed when talking about “tweaking” SFRA. Some background is in order though, if for no other reason than to explain how I’ve narrowed my scope here.

First, state school funding formulas like SFRA start out by calculating an “adequacy budget” target for each school district:

Adequacy Budget = (Base Funding + Student Need Funding) x Geographic Cost Variation

Typically, the student need category includes additional funding for a) low income children, b) children with limited English language proficiency, and c) children with disabilities. Under geographic cost variation, states generally adjust for geographic variation in competitive wages (how much more does it cost to pay teachers competitively in one labor market versus another) and for small, remote and sparsely populated districts (economies of scale & sparsity). The latter issue is less relevant in NJ.

Typically the second step in a state school finance formula is the parsing of state versus local responsibility to pay for the adequacy budget:

Foundation Formula State Aid = Adequacy Budget – Local Fair Share
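In code form, the generic two-step structure above is just a couple of functions. This is a sketch of the structure only; SFRA’s actual parameters live in statute, and the zero floor on aid below is my assumption, not a statutory detail:

```python
# Generic two-step structure of a foundation-style formula like SFRA.
# All inputs are placeholders; actual SFRA parameters are set in statute.
def adequacy_budget(base, student_need, geo_cost_index):
    # Adequacy Budget = (Base Funding + Student Need Funding) x Geographic Cost Variation
    return (base + student_need) * geo_cost_index

def foundation_state_aid(adequacy, local_fair_share):
    # Foundation Formula State Aid = Adequacy Budget - Local Fair Share
    # (floored at zero here -- an assumption for this sketch)
    return max(adequacy - local_fair_share, 0.0)
```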

This part is important too, especially for balancing tax equity concerns. But, in this post and in most of my analyses of SFRA, I’m focused on getting those adequacy targets correct.  And with SFRA, there is plenty to talk about.

SFRA emerged in part from an analysis prepared for the department of education on the costs of providing an adequate education. That report, by John Augenblick and Associates, was delivered to the department around 2003, but was not released by the department until 2006. Elements of that report were used to guide a new school funding formula adopted in 2008 – SFRA.

It’s really important to understand that the adoption of state school funding formulas is necessarily a political process. That’s just reality. One can ponder a world in which technical expertise substitutes perfectly for political deliberation, but even I understand that’s not realistic.

And quite honestly the quality of technical advisement varies widely. I would go so far as to say that some technical advisement is clearly better than other technical advisement, and some is not worth a damn. For examples of the latter, see: https://schoolfinance101.wordpress.com/2011/06/06/roza-tinted-reality/   and:  https://schoolfinance101.wordpress.com/2011/04/01/publicincompetence/

So, the reality is that legislatures adopt something, perhaps with technical advisement and state courts are available to hear any legally relevant grievances (and consider technical advisement) to evaluate whether those concerns rise to the level of constitutional violation.

I often assist in identifying what those grievances are. Here, I’m pointing mainly to technical quibbles over what came out of the legislative process in New Jersey. These are technical quibbles for which I would argue the research suggests there is a “right way” to do things and the New Jersey legislature and department of education chose the “wrong way.” These are technical quibbles which result in relatively modest, though important corrections to the setting of district “adequacy budgets.” And these are technical quibbles which the court appointed special master decided did not rise to a level of constitutional violation. That is, SFRA was “good enough” to meet constitutional muster.

So then, I suggest that the departmental (regulatory) review process is the right time to address these technical problems.

Table 1 provides my short list of relatively easy fixes.

First, when adopting SFRA someone, somewhere along the line suggested that the formula provide substantially greater money for each high school student than for each elementary student and marginally more money for each middle school student than for each elementary student. But, there is no clear evidence – no firm research basis for such differentiation. No evidence, for example, that it costs more to provide equal educational opportunity in districts that have a larger share of secondary than elementary students. Rather, differences that do exist in spending on high school versus elementary students are merely artifacts of the ways in which districts have typically spent regardless of which children would benefit more from additional expenditure. The most problematic feature of this adjustment is that higher poverty districts tend to have smaller shares of their total enrollment in high school, meaning that this adjustment drives more money to lower poverty and less to higher poverty districts. And it does so without any real justification. This pattern occurs for a variety of reasons, including dropout rates but also family migration patterns and family economic status shifts with maturation.

Second, when determining how to include an adjustment for differences in competitive wages across areas of New Jersey, department officials decided to rely conceptually on a new approach proposed by the National Center for Education Statistics – the Comparable Wage Index (see link below). But then they abandoned the actual index and the actual methods behind it to come up with their own. In their own method, NJDOE looked not at labor market level wages but at county level wages of non-teachers (controlling for age, occupation, industry and education level). By using county level data, NJDOE officials came up with a “geographic cost adjustment” that gives the biggest adjustments to the highest income counties (Bergen, Morris, Essex) rather than broadly applying the adjustment to regions of the state. Most problematically, this GCA gives a bigger funding boost to affluent Ridgewood (Bergen) than to nearby Paterson (Passaic) and to Franklin Township than to New Brunswick. That’s just wrong!

Third, and this is a big one, when adopting SFRA the choice was made to fund special education by a method called Census Based funding. That is, assuming that every district really has or should have the same share of population in need of services. They set the rate to 14.69% of students. The argument is that districts with more than that have simply been identifying more to chase additional funding and not that they actually have greater need. I address the flaws of this logic extensively in the linked research article below. Of course, the most absurd aspect of financing every district as if they have 14.69% children with disabilities is the assumption that it is somehow appropriate to fund many districts at that level who actually have far fewer children in need. Fiscal prudence this is not! But again, it does tend to reduce funding in higher poverty urban districts as well as larger, poor remote southern NJ towns (see my research article).

Fourth, in another seemingly back of the napkin exercise, someone decided that a child who is both from a low income background and of limited English language proficiency clearly doesn’t need the additional funding tied to both characteristics, and instead should be provided something in between. So, they instituted a “combination weight” which was a marginal increase over the low income weight, instead of the sum of the low income weight and the LEP/ELL weight. I could probably make a stronger case that increased concentrations of both needs in districts serving very high concentrations of children who are both low income and non-English speaking lead to escalating, not diminishing, costs. Clearly, use of this weight instead of the sum of the two reduces funding to the districts with the highest concentrations of students who are both poor and non-English speaking. Further, if a district is majority low income, each marginal child who is non-English speaking is more likely to be both and receive the lower combination weight.
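To see why the combination weight matters, here’s a toy calculation. Every weight below is a hypothetical placeholder, not an actual SFRA parameter; the point is only the structure of the loss:

```python
# Toy illustration of the combination-weight problem; all weights hypothetical.
BASE = 9_649.0     # hypothetical base (adequacy) cost per pupil
AT_RISK_W = 0.47   # hypothetical at-risk (low income) weight
LEP_W = 0.50       # hypothetical LEP/ELL weight
COMBO_W = 0.52     # hypothetical "combination" weight (a bit above at-risk)

def need_funding(n_at_risk_only, n_lep_only, n_both, use_combo=True):
    both_w = COMBO_W if use_combo else AT_RISK_W + LEP_W
    return BASE * (AT_RISK_W * n_at_risk_only
                   + LEP_W * n_lep_only
                   + both_w * n_both)

# A district with 1,000 dual-qualified children loses the gap between the
# summed weights and the combination weight on every one of them:
loss = need_funding(0, 0, 1000, use_combo=False) - need_funding(0, 0, 1000)
print(f"${loss:,.0f}")  # BASE * (0.97 - 0.52) * 1,000
```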

Table 1. Summary of Current Errors and Proposed Fixes

Error in original SFRA (2008-09): Grade Level Weight
– How it works: 1.0 elementary, 1.04 middle, 1.17 secondary.
– Why it’s wrong: Based on back-of-the-napkin analysis, with no real basis in true cost differentials. Disadvantages higher poverty districts, which have a lower share of children in the upper grades.
– Alternative: Eliminate (revenue neutral, set to the average).

Error: Geographic Cost Adjustment
– How it works: Based on non-teacher wages in each county.
– Why it’s wrong: The county is the wrong unit for this analysis; it should be the labor market (clusters of counties). The current approach rewards affluent counties (Bergen, Morris, Somerset).
– Alternative: Labor market based Comparable Wage Index.

Error: Census Based Funding of Special Education
– How it works: Special education funding is allocated in a flat amount assuming each district has 14.69% of children qualified for special education.
– Why it’s wrong: That assumption is wrong, and it leads to significant inequities in special education funding per child with actual needs.
– Alternative: Allocate on a need basis.

Error: Combination Weight
– How it works: Children who are both ELL and low income do not receive weighted funding for both, but rather receive an adjustment between the two.
– Why it’s wrong: The reduction was based on a back-of-the-napkin estimate and significantly draws funding away from the most needy districts.
– Alternative: Reinstate full weighting for both.

Here is a link to my full report in which I first identify these issues:

Baker.PJP-SFRA.Report.WEB (My complete report explaining the above problems)

Figure 1 shows what happens if we run a formula simulation based on the original 2008 SFRA parameters, and if we incrementally fix each one of these errors.

First, I remove the Combination weight and replace it with an option where each child can receive the sum of the at risk weight and the LEP/ELL weight if they qualify for both.  Table 2 below shows that taking this approach raises the combo weight cost for TYPE 3 districts from $212 million to $330 million. And, looking at the second set of bars in Figure 1, it increases funding in lower income, higher need districts. Note that these are shifts in the total adequacy targets, for which costs will be shared between the state and local districts (albeit increasing targets more in districts heavily reliant on state aid).

Second, I allocate special education funding according to actual concentrations of children with disabilities. This does come at an increased total cost as well, raising total target funding for special education from $991 million to just over $1 billion. Again, total, to be funded by state and local, but again with stronger effect on districts more dependent on state aid.

Third, I get rid of that pesky grade level adjustment and replace it with the revenue neutral average foundation funding level. This does drive some more money into lower income districts.

Fourth, I replace the county level geographic cost adjustment with the National Center for Education Statistics adjustment, set to a statewide average of 1.0 (to make it more revenue neutral). This ain’t perfect. The NCES index has some “rough edges” (see my linked paper). But it’s still more justifiable in general, even if it does hurt some districts which actually need more help. This issue really requires a complete redo!

Figure 1. Simulation based on Operating Type 3 Districts

Table 2 provides some fiscal implications, as noted above, but it’s important to understand that these fiscal implications are based on a simulation of only Type 3 districts (which does include most of the kids). Table 2 is intended to show the patterns of reshuffling that would occur with these corrections.

Table 2. Simulation based on Operating Type 3 Districts

Formula Component | Status Quo | Remove Combo | Fix Special Ed | Remove Grade Level | Fix GCA | Fix All
Total Base Cost | $9,547 | $9,547 | $9,547 | $9,547 | $9,547 | $9,547
Total Cost of At Risk | $1,610 | $1,610 | $1,610 | $1,611 | $1,610 | $1,610
Total Cost of LEP/ELL | $70 | $70 | $70 | $70 | $70 | $70
Total Cost of Combo | $212 | $330 | $212 | $212 | $212 | $330
Total Cost of Special Ed Base | $991 | $991 | $1,018 | $991 | $991 | $1,018
Total Cost of Special Ed Categorical (full state funding) | $496 | $496 | $509 | $496 | $496 | $509
Bottom Line Before Regional Wage Index | $12,926 | $13,044 | $12,966 | $12,927 | $12,926 | $13,084
Bottom Line After Regional Wage Index | $13,007 | $13,126 | $13,043 | $13,008 | $13,041 | $13,198

(All figures in millions.)

Figure 2. Distribution of Need-based Adjustments before Adjustment

(excludes special education)

Figure 3. Distribution of Need-based Adjustments after Adjustment (Fix All)

(excludes special education)

The bottom line here is that the reason each and every one of these corrections is important is that each of the original errors of logic and analysis that found their way into the SFRA formula shifts funding away from higher need and toward lower need districts. These aren’t huge shifts, but they’re not trivial either.

For those who wish to play around, here’s the simulation:

Aid Simulation (MS Excel File with Macros)

And for those wishing some additional technical reading to explain my arguments above, here are links to some of my related writing.

AERA.WageIndexPaper.March2008 (Conference Paper on Problems with NJ Wage Index)

Link to Published Article on Problems with Census Based Special Education Funding

Cheers!