Disg-RACE to the TOP?

Here’s how Dems for Ed Reform characterizes Louisiana’s education reform efforts in relation to the federal Race to the Top competition:

Louisiana. The state passed legislation by Rep. Walt Leger III (D-New Orleans) lifting its charter school cap in June at the end of its legislative session. Louisiana is also pioneering an accountability system that tracks graduates of teacher training programs so that they can be held accountable for the performance of the teachers they train and so that their programs can be improved and/or revamped. A “unified group” of education and community-based organizations launched a statewide RttT effort in August.

http://www.dfer.org/2009/12/who_would_have.php

Note that I’m merely using this description as an example. DFER is far from the biggest offender when it comes to heaping praise on Louisiana.

Most pundits seem to agree that Louisiana is a front-runner for Race to the Top funding, primarily because of its efforts to expand data systems and link student data to teachers (for practical issues on this point, see: https://schoolfinance101.wordpress.com/2009/12/04/pondering-the-usefulness-of-value-added-assessment-of-teachers/) and because the state places no cap on the number of new charters that can be granted per year.

I continue to argue, however, that even if these two factors are signs of “innovation,” or of an environment that supports “innovation,” innovation without real investment or true commitment is doomed to fail. Louisiana is the perfect example of the insanity that is Race to the Top. I pick on Louisiana here because it is such an absurd case, because it illustrates the myopic and misguided criteria being used to evaluate innovation, and, even more so, because it illustrates the utter lack of critical thinking and analysis by pundits and ill-informed media junkies, ed-writers and twitterers (who seem unable to critically evaluate… anything… but will re-tweet anything that praises Louisiana’s RttT application).

Let’s take a look at Louisiana’s education system. Yes, their system needs help, but the reality is that Louisiana politicians have never attempted to help their own system. In fact, they’ve thrown it under the bus, and now they want an award? Here’s the rundown:

  • 3rd lowest (behind Delaware & South Dakota) % of gross state product spent on elementary and secondary schools (American Community Survey of 2005, 2006, 2007)
  • 2nd lowest percent of 6 to 16 year old children attending the public system at about 80% (tied with Hawaii, behind Delaware) (American Community Survey of 2005, 2006, 2007). The national average is about 87%.
  • 2nd largest (behind Mississippi) racial gap between % white in private schools (82%) and % white in public schools (52%) (American Community Survey of 2005, 2006, 2007). The national gap is about 13 percentage points, compared to 30 percentage points in Louisiana.
  • 3rd largest income gap between publicly and privately schooled children at about a 2 to 1 ratio. (American Community Survey of 2005, 2006, 2007)
  • 4th highest percent of teachers who attended non-competitive or less competitive (bottom 2 categories) undergraduate colleges based on Barrons’ ratings (NCES Schools and Staffing Survey of 2003-04). Almost half of Louisiana teachers attended less or non-competitive colleges, compared to 24% nationally.
  • Negative relationship between per pupil state and local revenues and district poverty rates, after controlling for regional wage variation, economies of scale, population density (poor get less).
  • 46th (of 52) on NAEP 8th Grade Math in 2009. 38th of 41 in 2000. http://nces.ed.gov/nationsreportcard/statecomparisons/
  • 49th (of 52) on NAEP 4th Grade Math in 2009. 35th of 42 in 2000.

So, this is a state where 20% abandon the public system, and where 82% of those who leave are white and have incomes twice those of the families left in the public system, half of whom are non-white. While the racial gap is large in Mississippi, a much smaller share of Mississippi children abandon the public system, and Mississippi is average on the percent of GSP allocated to public education. Mississippi simply lacks the capacity to do better. Louisiana doesn’t even try. And they deserve an award?
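For readers who want to see how these coverage and gap figures are constructed, here’s a minimal sketch of the calculations using invented ACS-style records (the five records below are placeholders for illustration only, not actual ACS microdata):

```python
# Hypothetical sketch: reproducing the coverage / gap statistics above from
# ACS-style microdata. The records below are invented for illustration only.
records = [
    # (sector, is_white, household_income) for 6 to 16 year olds
    ("public", False, 28000), ("public", False, 31000),
    ("public", True, 35000), ("public", True, 40000),
    ("private", True, 70000),
]

public = [r for r in records if r[0] == "public"]
private = [r for r in records if r[0] == "private"]

# "Coverage": share of school-age children served by the public system
coverage = len(public) / len(records)

# Racial gap: % white in private minus % white in public
pct_white = lambda group: sum(r[1] for r in group) / len(group)
racial_gap = pct_white(private) - pct_white(public)

# Income ratio between privately and publicly schooled children
mean_income = lambda group: sum(r[2] for r in group) / len(group)
income_ratio = mean_income(private) / mean_income(public)

print(coverage, racial_gap, round(income_ratio, 2))
```

With these made-up records, coverage is 80%, the racial gap is 50 percentage points, and private-school incomes are about twice public-school incomes, mirroring the structure (though not the exact values) of the Louisiana statistics above.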

I read an article the other day that was uncritically tweeted (http://www.washingtonpost.com/wp-dyn/content/article/2009/12/12/AR2009121202631.html), explaining how Louisiana has adopted this great new teacher evaluation system. But, hey, look above. Louisiana ranks right near the top of the pack on the percent of all public school teachers who attended the least competitive colleges (which matters). Why worry about a dysfunctional supply pipeline for teachers? You wouldn’t want to consider the possibility that improved teacher wages and working conditions and investment in higher education could improve that pipeline? A good teacher evaluation system will wash that supply problem away!

Quite simply, if you’ve got the academically weakest teachers to begin with and you’ve got a system where 20% of students, almost entirely white from households with twice the average income leave the system, and where you’re putting about the lowest share of your state productivity into schools, and where your kids continue to score near the bottom on national assessments, all the data and supposed accountability in the world is not going to make much difference. Throwing RttT money into this mess isn’t likely to help much either. Applying a business investment mindset, Louisiana schools are certainly not a product line in which I’d invest my own hard earned money (but wait, RttT is ours, isn’t it?). That is, if I bother to think critically for a minute or two.

While I sympathize with the 80% of children left in Louisiana public schools, it is not the federal gov’t via RttT that is going to begin to dig them out of the hole in which they’ve been buried for decades by their own political leadership. The state of Louisiana must step up first, and big-time. The state must invest sufficiently in public schools to improve quality to the point where some of the wealthier and whiter families might actually opt back into the public system. At the very least, the state should be required to put up “average” fiscal effort (% of GSP to schools) if it wants an award and should be required to show that it has targeted money to the highest need schools and children. Louisiana needs a stick, not a carrot!

Heaping mindless tweeted and re-tweeted praise on Louisiana is incredibly unhelpful and, quite honestly, a bit embarrassing! State data systems and charter caps alone cannot solve the world’s problems and certainly can’t solve Louisiana’s self-inflicted ailments.

Let’s hope the federal government can see through the smokescreen that it is at least partially responsible for creating, and make good use of RttT funding. Dumping that funding into states such as Louisiana, Delaware, Colorado, or Illinois is probably not the best use. See: https://schoolfinance101.wordpress.com/2009/12/14/racingwhere/

I have written previously about Louisiana among other states, here: https://schoolfinance101.wordpress.com/2009/12/15/why-do-states-with-best-data-systems/

And here: https://schoolfinance101.wordpress.com/2009/02/25/public-schooling-in-louisiana-and-mississippi/

Why do states with the “best” data systems have the worst schools?

Okay, so the title of this post is a bit over the top and potentially inflammatory, but let’s take a look at those states which, according to the Data Quality Campaign, have achieved the best possible state data systems by having all 10 elements recognized by the campaign. I should note that I appreciate the 10 data elements, especially as a data geek myself. It’s good stuff, and this post is not intended to criticize the Data Quality Campaign. Rather, this post is intended to question whether our recent focus – or obsession – with rating the quality of state education systems by two criteria alone – a) whether they have certain data linked to certain other data, and b) whether they have caps on charter schools – has created an unfortunate diversion. This obsession has caused us to take our eye off the ball – to applaud states that have, in reality, put little or no effort into improving their education systems; states that have, over time, dreadfully under-supplied public schooling; and states that have consistently produced the lowest educational outcomes (not merely as a function of the disadvantages of their student populations).

So, here’s a quick run-down. First, let’s begin with a look at the number of data quality elements compiled by states in relation to the percent of Gross State Product (Gross Domestic Product by State) allocated in the form of State and Local Revenue per Pupil to local public schools. There’s no really tight relationship here, but as we can see, Delaware, Louisiana and Tennessee are three states that now have all 10 data elements – HOORAY – but have very low educational effort. Utah and Washington also have low educational effort.

This might be inconsequential if it was… well… inconsequential. That is, if there was also no relationship to educational outcomes. Here’s a plot of the mean NAEP Math and Reading Grades 4 and 8 for 2007 (% Proficient) against the # of Data Quality Elements. In this case, there actually is some relationship. Yep, states with better data have lower outcomes. Maybe having better data will increase the likelihood that they figure this out. This is a somewhat unfair argument given that many of these states are relatively poor states, but it’s not all about poverty (in fact, higher poverty would require greater effort to improve outcomes – but it doesn’t play out that way for these states; see this post for a discussion of poverty variation across states). Low effort, low performing, but high data quality states include Louisiana and Tennessee. Yet, somehow, when viewed through a data quality lens alone, these states become superstars!

This next figure looks at the predicted per-pupil state and local revenue in each state for a district having 10% poverty (relatively average for U.S. Census poverty rates). The point here is to compare a truly comparable state and local revenue figure, corrected for poverty variation, regional wage variation, economies of scale and population density. Here, we see that Utah and Tennessee (again) are standouts – having the lowest state and local revenue per pupil. Recall that both are also low to very low effort. Their revenue to districts is not low because they are poor, but rather because they don’t put up the effort. But hey, they’ve got great data!!!!!
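For the curious, the “predicted revenue at 10% poverty” comparison can be sketched as a regression with cost controls. Everything below – the coefficients, sample size, and variables – is invented for illustration; the actual analysis uses real district finance data, but the mechanics are the same:

```python
import numpy as np

# Hypothetical sketch: regress per-pupil state & local revenue on district
# poverty plus cost controls, then predict revenue for a district at 10%
# poverty with cost factors held at their means. All numbers are invented.
rng = np.random.default_rng(0)
n = 200
poverty = rng.uniform(0.02, 0.35, n)       # Census poverty rate
wage_index = rng.normal(1.0, 0.1, n)       # regional wage variation
log_enroll = rng.normal(7.5, 1.0, n)       # economies of scale (log enrollment)
revenue = (9000 + 4000 * poverty + 3000 * (wage_index - 1)
           - 200 * (log_enroll - 7.5) + rng.normal(0, 300, n))

# Ordinary least squares via numpy
X = np.column_stack([np.ones(n), poverty, wage_index, log_enroll])
beta, *_ = np.linalg.lstsq(X, revenue, rcond=None)

# Predicted revenue at 10% poverty, cost controls at their means
x10 = np.array([1.0, 0.10, wage_index.mean(), log_enroll.mean()])
pred_at_10pct = x10 @ beta
print(round(pred_at_10pct))
```

Repeating that prediction state by state, at the same 10% poverty rate and mean cost conditions, is what makes the cross-state revenue comparison apples-to-apples.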

Another relevant “effort”-related point to consider is just how many children of school age in the state are actually even served by the public system. If we were discussing child health care across states, or even pre-school, we would most certainly consider the extent of “coverage.” We tend to ignore “coverage” in K-12 education because we too often assume near-universal coverage. But that’s not the case. And coverage varies widely across states. Here, I measure coverage by the % of 6 to 16 year olds (American Community Survey of 2007) enrolled in public schools.

Not only are Louisiana and Delaware very low in their effort for schools, and Louisiana low on outcomes, but both are also very low on coverage. They don’t even serve 80% of 6 to 16 year olds in their public school system (remember, charter schools are part of the public system)!!!! Yet somehow, having good data on those who remain in the public system is treated as a substitute, making the state worthy of praise!!!!!

One might speculate that these differences are mainly about the wealth of states – especially when it comes to the ability of states to spend on their schools and the outcomes achieved in those schools. This is indeed true to a significant extent. But, as it turns out, the effort a state puts up toward public school spending is actually more strongly related than wealth (per capita gross state product) to predicted state and local revenues per pupil. That is, states that put up more effort do raise more per pupil for their schools. Yes, states like Mississippi are at a disadvantage because they lack wealth. Tennessee and Utah have much less excuse! Delaware’s unique economic position allows it to raise significant revenue with little effort.

Finally, the effort –> revenue relationship would be of little consequence if it was not also the case that the predicted state and local revenue differences across states are associated with those pesky NAEP outcomes. Yes, there does exist a modest relationship (with many entangled underlying factors) between state and local revenues and NAEP outcomes.

There is indeed a lot tangled up in the various relationships presented above. But one thing is clear – DATA QUALITY ALONE PROVIDES LITTLE USEFUL INFORMATION ABOUT THE QUALITY OF A STATE’S EDUCATION SYSTEM! Our obsession with comparing states on this basis has caused us and policymakers alike to take our eyes off the ball (former tennis coach speaking here!). Applauding states and financially rewarding them (RttT) merely for collecting better data, with little attention to the actual school systems and children served (or not served) by those systems, is, at best, disingenuous.

To quote John McEnroe – You cannot be serious!

Racing Where? (the DFER list)

A few quick comments on the Democrats for Ed Reform list of states in line for RttT funds.

  • The list includes two of the only states that year after year maintain a pattern of higher poverty school districts receiving systematically fewer resources than lower poverty school districts – New York and Illinois. Colorado is also on the systematically regressive funding list – but is not a year after year standout like the other two.
  • The list includes the two states which allocate the smallest share of their gross state product to public education – Louisiana and Delaware. To add insult to injury, based on American Community Survey Data from 2007, neither Delaware nor Louisiana even serves 80% of its 6 to 16 year olds in the public school system (a “coverage” metric).
  • The list includes the state with the absolute lowest cost and need adjusted per pupil state and local revenue among all states – Tennessee.

The DFER post notes that Illinois’ chances aren’t lookin’ as good as earlier this year. But, TN, DE, CO and LA sound like strong contenders! (?)

Details on methods and analysis behind these findings available on request in the near future. Cheers!

Update – Do School Finance Reforms Matter?

Here’s an excerpt from a forthcoming article on whether school finance reforms have made any difference for students. The article is partly in response to claims by Eric Hanushek and Alfred Lindseth that school finance reforms have resulted in massive increases in funding to public schools which have not helped and may have in fact harmed children. My forthcoming work on this topic is co-authored with Kevin Welner of U. of Colorado.

=====

In terms of quality and scope, the most useful single study of judicially induced state finance reform was published by Card and Payne in 2002. They found that court declarations of unconstitutionality in the 1980s increased the relative funding provided to low-income districts. And they found that these school finance reforms had, in turn, significant equity effects on academic outcomes:

Using micro samples of SAT scores from this same period, we then test whether changes in spending inequality affect the gap in achievement between different family background groups. We find evidence that equalization of spending leads to a narrowing of test score outcomes across family background groups. (p. 49)

To evaluate distributional changes in school finance, Card and Payne estimated the partial correlations between current expenditures per pupil and median family income, conditional on other factors influencing demand for public schooling across districts within states and over time. Card and Payne then measured the differences in the change in income-associated spending distribution between states where school funding systems had been overturned, upheld, or where no court decision had been rendered. Importantly, they also evaluated whether structural changes to funding formulas (that is, the actual reforms) were associated with changes to the income-spending relationship, conditional on the presence of court rulings.

To make the final link between income-spending relationships and outcome gaps, Card and Payne evaluated changes in gaps in SAT scores among individual SAT test-takers categorized by family background characteristics.[1] Put in terms of our Figure 1, Card and Payne (2002) appear to have taken the greatest care in a multi-year, cross-state study, to establish appropriate linkages between litigation, reforms by type, changes in the distribution of funding, and related changes in the distribution of outcomes.
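The Card and Payne design – compare the change in the within-state income-spending gradient between reform and non-reform states – can be sketched roughly as follows. This is an illustration of the logic with simulated data, not their actual estimation:

```python
import numpy as np

# Hypothetical sketch of the Card & Payne design: estimate the within-state
# slope of per-pupil spending on district income in two periods, then compare
# the *change* in that slope between reform and non-reform states.
rng = np.random.default_rng(1)

def income_spending_slope(income, spending):
    """OLS slope of spending on income (the 'income gradient')."""
    x = income - income.mean()
    return float((x * (spending - spending.mean())).sum() / (x * x).sum())

def simulate_state(gradient):
    """Fifty districts with a given income-spending gradient (data invented)."""
    income = rng.uniform(20, 80, 50)         # district median income ($000s)
    spending = 6000 + gradient * income + rng.normal(0, 150, 50)
    return income, spending

# Reform state: gradient flattens after the court ruling (spending equalizes)
pre_reform = income_spending_slope(*simulate_state(gradient=40))
post_reform = income_spending_slope(*simulate_state(gradient=10))

# Comparison state: gradient unchanged
pre_ctrl = income_spending_slope(*simulate_state(gradient=40))
post_ctrl = income_spending_slope(*simulate_state(gradient=40))

# Difference-in-differences in the income gradient: strongly negative here,
# meaning the reform weakened the income-spending link
did = (post_reform - pre_reform) - (post_ctrl - pre_ctrl)
print(round(did))
```

A negative difference-in-differences estimate is the signature of equalization: low-income districts gained relative to high-income districts in reform states, and not elsewhere.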

Notwithstanding the generally acknowledged importance of this study,[2] Hanushek and Lindseth (2009) never mention it in their book, including the chapter in which they conclude that school finance reforms have no positive effects.

This omission – as well as the other omissions noted below – is telling of a larger point. The development of, and reliance upon, a research base should depend on relatively objective criteria. Readers depend on authors of literature reviews to come forward with the best and most applicable research bearing on the issues under consideration. While Hanushek and Lindseth might argue that this particular omission is because Card and Payne (2002) are speaking to equity (not adequacy) litigation, we have already described how the line between equity and adequacy is not so simple. Moreover, the research that Hanushek and Lindseth do choose to include goes far beyond that directly focused on adequacy – including the Cato study of a Kansas City desegregation order discussed below.

Another key study not mentioned by Hanushek and Lindseth (2009) concerned the effects of reforms implemented under the Kansas court’s pre-ruling in 1992 (Deke, 2003). The reforms leveled up funding in low-property-wealth school districts, and Deke found as follows:

Using panel models that, if biased, are likely biased downward, I have a conservative estimate of the impact of a 20% increase in spending on the probability of going on to postsecondary education. The regression results show that such a spending increase raises that probability by approximately 5% (p. 275).

The Kansas reforms addressed by Deke (2003) came as a result of a judicial pre-order, advising the legislature that if the pending suit made it to trial, the judge would declare the school finance system unconstitutional (Baker and Green, 2006).

Hanushek and Lindseth (2009) also omitted from their discussion two additional studies, both peer-reviewed, that explore the effects of Michigan’s school finance reforms, known as “Proposal A,” implemented in the mid-1990s. Michigan’s reforms were implemented without ruling or high level of litigation threat, but the reforms were nonetheless comparable in many ways to reforms implemented following judicial rulings[3] (see Leuven et al., 2007; and Papke, 2001). In the first study, Papke (2001) finds:

Focusing on pass rates for fourth-grade and seventh grade math tests (the most complete and consistent data available for Michigan), I find that increases in spending have nontrivial, statistically significant effects on math test pass rates, and the effects are largest for schools with initially poor performance. (Papke, 2001, p. 821.)

Leuven and colleagues (2007) find no positive effects of two specific increases in funding targeted to schools with elevated at-risk populations – a conclusion that would have been convenient for Hanushek and Lindseth to include.

A third Michigan study (available online since 2003 as a working paper from Princeton University, and now accepted for publication in Education Finance and Policy, a peer-reviewed journal) directly estimates the relationship between implemented reforms and subsequent outcomes (Roy, 2003). Roy, whose work was not cited by Hanushek and Lindseth, finds:

Proposal A was quite successful in reducing inter-district spending disparities. There were also significant gains in achievement in the poorest districts, as measured by success in state tests. However, as yet these improvements do not show up in nationwide tests like NAEP and ACT. (Roy, 2003, p. 1.)

Most recently, a study by Choudhary (2009) “estimate[s] the causal effect of increased spending on 4th and 7th grade math scores for two test measures—a scale score and a percent satisfactory measure” (p. 1). She “find[s] positive effects of increased spending on 4th grade test scores. A 60 percent increase in spending increases the percent satisfactory score by one standard deviation” (p. 1).

Perhaps because there was no judicial order involved in Michigan, researchers were able to avoid the tendency to focus on or classify the judicial order. Moreover, single-state studies generally avoid such problems because there is little statistical purpose in classifying litigation. Importantly, each of these studies focuses instead on measures of the changing distribution and level of spending (characteristics of the reforms themselves) and resulting changes in the distribution and level of outcomes. Each takes a different approach, but attempts to appropriately align their measures of spending change and outcome change, adhering to principles laid out in our Figure 1.

Other high-quality but non-peer reviewed empirical estimates of the effects of specific school finance reforms linked to court orders have been published for Vermont and Massachusetts. For example, Downes (2004), in an evaluation of Vermont school finance reforms that were ordered in 1997 and implemented in 1998, found as follows:

All of the evidence cited in this paper supports the conclusion that Act 60 has dramatically reduced dispersion in education spending and has done this by weakening the link between spending and property wealth. Further, the regressions presented in this paper offer some evidence that student performance has become more equal in the post–Act 60 period. And no results support the conclusion that Act 60 has contributed to increased dispersion in performance. (p. 312)

Hanushek and Lindseth (2009) never acknowledge this positive finding (although they do briefly cite the Downes evaluation, for a different point). Again, one might attribute this omission to the argument that the Vermont reforms were equity reforms, not adequacy reforms. However, similar to the 1992 Kansas reforms, the overall effect of the Vermont Act 60 reforms was to level up low-wealth districts and increase state school spending dramatically, thus addressing both adequacy and equity.

For Massachusetts, two independent sets of authors (in addition to Hanushek and Lindseth) have found positive reform effects. Most recently — after the Hanushek and Lindseth book was written — Downes, Zabel and Ansel (2009) found:

The achievement gap notwithstanding, this research provides new evidence that the state’s investment has had a clear and significant impact. Specifically, some of the research findings show how education reform has been successful in raising the achievement of students in the previously low-spending districts. Quite simply, this comprehensive analysis documents that without Ed Reform the achievement gap would be larger than it is today. (p. 5)

Previously, Guryan (2003) found:

Using state aid formulas as instruments, I find that increases in per-pupil spending led to significant increases in math, reading, science, and social studies test scores for 4th- and 8th-grade students. The magnitudes imply a $1,000 increase in per-pupil spending leads to about a third to a half of a standard-deviation increase in average test scores. It is noted that the state aid driving the estimates is targeted to under-funded school districts, which may have atypical returns to additional expenditures. (p. 1)

Although Hanushek and Lindseth concede that Massachusetts reforms appear successful,[4] they failed to cite Guryan’s NBER working paper, the inclusion of which would have (like most other omitted studies) weakened their overall conclusions about the non-impact of these reforms.

Turning to New Jersey, two recent (though not yet peer-reviewed) studies find positive effects of that state’s finance reforms. Alexandra Resch (2008), in a study published as a dissertation for the economics department at the University of Michigan, found evidence suggesting that New Jersey Abbott districts “directed the added resources largely to instructional personnel” (p. 1) such as additional teachers and support staff. She also concluded that this increase in funding and spending improved the achievement of students in the affected school districts. Looking at the statewide 11th grade assessment (“the only test that spans the policy change”), she found “that the policy improves test scores for minority students in the affected districts by one-fifth to one-quarter of a standard deviation” (p. 1).

The second recent study was originally presented at a 2007 conference at Columbia University, and a revised, peer-reviewed version was recently published by the Campaign for Educational Equity at Teachers College, Columbia University (Goertz and Weiss, 2009). This paper offered descriptive evidence that reveals some positive test results of recent New Jersey school finance reforms:

State Assessments: In 1999 the gap between the Abbott districts and all other districts in the state was over 30 points. By 2007 the gap was down to 19 points, a reduction of 11 points or 0.39 standard deviation units. The gap between the Abbott districts and the high-wealth districts fell from 35 to 22 points. Meanwhile performance in the low-, middle-, and high-wealth districts essentially remained parallel during this eight-year period (Figure 3, p. 23).

NAEP: The NAEP results confirm the changes we saw using state assessment data. NAEP scores in fourth-grade reading and mathematics in central cities rose 21 and 22 points, respectively between the mid-1990s and 2007, a rate that was faster than the urban fringe in both subjects and the state as a whole in reading (p. 26).
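As a quick sanity check, the effect-size arithmetic in the state assessment quote works out as reported:

```python
# Checking the effect-size arithmetic in the quoted Goertz & Weiss passage:
# the Abbott gap fell from 30 points (1999) to 19 points (2007), an 11-point
# reduction reported as 0.39 standard deviation units.
gap_1999, gap_2007 = 30, 19
reduction = gap_1999 - gap_2007          # 11 points

# The implied test-score standard deviation on this scale:
implied_sd = reduction / 0.39            # roughly 28 scale points

print(reduction, round(implied_sd, 1), round(reduction / implied_sd, 2))
```

In other words, an 11-point gap reduction corresponds to 0.39 standard deviations if the underlying test-score standard deviation is about 28 scale points, which is consistent with the figures the authors report.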

The Goertz and Weiss paper (which was, as designed and intended by the paper’s authors, the statistically least rigorous analysis of the ones presented here) does receive mention from Hanushek and Lindseth multiple times, but only in an effort to discredit and minimize its findings.

Card, D. and Payne, A. A. (2002). School Finance Reform, the Distribution of School Spending, and the Distribution of Student Test Scores. Journal of Public Economics, 83(1), 49-82.

Choudhary, L. (2009). Education Inputs, Student Performance and School Finance Reform in Michigan. Economics of Education Review, 28(1), 90-98.

Deke, J. (2003). A study of the impact of public school spending on postsecondary educational attainment using statewide school district refinancing in Kansas, Economics of Education Review, 22(3), 275-284.

Downes, T. A. (2004). School Finance Reform and School Quality: Lessons from Vermont. In Yinger, J. (ed), Helping Children Left Behind: State Aid and the Pursuit of Educational Equity. Cambridge, MA: MIT Press.

Downes, T. A., Zabel, J., Ansel, D. (2009). Incomplete Grade: Massachusetts Education Reform at 15. Boston, MA. MassINC.

Goertz, M., and Weiss, M. (2009). Assessing Success in School Finance Litigation: The Case of New Jersey. New York City: The Campaign for Educational Equity, Teachers College, Columbia University.

Guryan, J. (2003). Does Money Matter? Estimates from Education Finance Reform in Massachusetts. Working Paper No. 8269. Cambridge, MA: National Bureau of Economic Research.

Leuven, E., Lindahl, M., Oosterbeek, H., and Webbink, D. (2007). The Effect of Extra Funding for Disadvantaged Pupils on Achievement. The Review of Economics and Statistics, 89(4), 721-736.

Resch, A. M. (2008). Three Essays on Resources in Education (dissertation). Ann Arbor: University of Michigan, Department of Economics. Retrieved October 28, 2009, from http://deepblue.lib.umich.edu/bitstream/2027.42/61592/1/aresch_1.pdf

Roy, J. (2003). Impact of School Finance Reform on Resource Equalization and Academic Performance: Evidence from Michigan. Princeton University, Education Research Section Working Paper No. 8. Retrieved October 23, 2009 from http://papers.ssrn.com/sol3/papers.cfm?abstract_id=630121 (Forthcoming in Education Finance and Policy.)


[1] Card and Payne provide substantial detail on their methodological attempts to negate the usual role of selection bias in SAT test-taking patterns. They also explain that their preference was to measure more directly the effects of income-related changes in current spending per pupil on income-related changes in SAT performance, but that the income measures in their SAT database were unreliable and could not be corroborated by other sources. As such, Card and Payne used combinations of parent education levels to proxy for income and socio-economic differences between SAT test takers.

[2] As one indication of its prominence among researchers, as of the writing of this article, Google Scholar identified 153 citations to this article.

[3] There is little reason to assume that the presence of judicial order would necessarily make otherwise similar reforms less (or more) effective, though constraints surrounding judicial remedies may.

[4] Hanushek and Lindseth attribute the success of the Massachusetts reforms not to spending, but to the fact that the “remedial steps passed by the legislature also included a vigorous regime of academic standards, a high-stakes graduation test, and strict accountability measures of a kind that have run into resistance in other states, particularly from teachers unions” (p. 169). That is, it was not the funding that mattered in Massachusetts, but rather it was the accountability reforms that accompanied the funding.

NJ Charter Update – Math Trends over Time

Note: This is not a “Study.” This is just a summary of NJDOE Report Card Data, which can be found here: http://education.state.nj.us/rc/

I made a few more graphs for fun today, pursuing the question of whether the shares of children scoring only “partially proficient” over time are changing in New Jersey Charter Schools at any different rate from the shares scoring partially proficient in traditional public school districts. Note, of course, that partially proficient is a nice way of saying – failed the test. Each graph below includes only General Education students, as I have previously shown that NJ Charters serve very few if any special education students and including these students substantially changes performance levels for non-Charters. Again, it is most relevant to compare visually the Charter schools to schools in district factor groups A and B, because Charter schools do tend to serve children from these districts – primarily A. But, as the graphs show, Charters have continued to perform similarly to schools in DFG A (less well in Grade 4).

("R" indicates Charter Schools in each of the graphs above.)
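For anyone wanting to replicate the basic trend comparison from the Report Card data, the calculation behind these graphs can be sketched as follows (the percentages below are invented placeholders, not NJDOE figures):

```python
# Hypothetical sketch of the comparison described above: for each group
# (charters vs. district factor groups), track the share of general-education
# students scoring "partially proficient" over time and compare the trends.
# All percentages below are invented placeholders, not NJDOE data.
pct_partially_proficient = {
    "Charter": {2005: 38, 2006: 36, 2007: 35, 2008: 33},
    "DFG A":   {2005: 40, 2006: 37, 2007: 36, 2008: 34},
    "DFG J":   {2005: 8,  2006: 7,  2007: 7,  2008: 6},
}

def annual_trend(series):
    """Average year-over-year change, in percentage points per year."""
    years = sorted(series)
    return (series[years[-1]] - series[years[0]]) / (years[-1] - years[0])

for group, series in pct_partially_proficient.items():
    print(group, round(annual_trend(series), 2))
```

The question the graphs pose is simply whether the charter trend line falls faster or slower than the DFG A line; with these placeholder numbers the two move roughly in parallel, which is the pattern described above.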

Smarter School Leaders

Smarter School Leaders: Enough to reverse the trend?

http://www.nytimes.com/2009/12/05/opinion/05herbert.html?_r=1

This recent New York Times article highlights a new doctoral program for educational leaders that is a joint venture of Harvard Graduate School of Education, Kennedy School of Government and Harvard Business School. An interesting approach indeed and one that will hopefully generate some top quality leaders for public schools and school districts. But, there are about 100,000 public schools out there, spread across 16,000 or so school districts and public charter schools. In the best of cases, each of these schools and districts would get the best and brightest possible leader. My guess, however, is that this new Harvard program will barely make a dent in our national needs.

Perhaps the new Harvard program can serve as a model for making a bigger and better dent. Now, when I say that, I should clarify that I’m not taking the pop-policy position that this program is a model simply because it involves a business school, a public policy school and an education school, but rather because it involves a GOOD business school, a HIGH QUALITY public policy school and a TOP NOTCH education school. There are as many, if not more, intellectually vacuous b-school programs as there are comparably vacuous ed-school programs. You see, it’s not about b-school versus ed-school. It’s about high quality schools with highly self-selective pools of degree-seekers and top notch faculty deciding to play a more significant role in public school leadership. However, it’s going to be an uphill battle!

A few years back, Michelle Young, Terry Orr and I explored changing patterns of degree production in educational administration. With other colleagues, I explored the characteristics of faculty in educational administration programs, their pipeline and their qualifications. More recently, I’ve been exploring the effects of the changing principal preparation pipeline on schools in states like Missouri. AND IT’S NOT A PRETTY PICTURE!

Michelle, Terry and I found in our degree production study, that:

“The largest number and greatest increase were among master’s degrees. In 2003, there were 15,720 master’s degrees conferred in educational leadership, a 90 percent increase since 1993.”

And:

“Even more striking are the increases in master’s degree granting programs at Comprehensive II and Liberal Arts II institutions. Such program increases reflect a dramatic growth in the availability of programs in local and regional institutions.”

And further, that:

“The percentage of all master’s degrees produced by higher status institutions, the Research I through Doctoral II institutions dropped from 42 percent in 1993 to 36 percent in 2003.”

That is, master’s degree production in particular has mushroomed over the past decade-and-a-half, and many of the new master’s degrees are produced by institutions that previously had minimal involvement in educational administration and are generally considered lower status institutions.

The figure below shows the top educational administration master’s-granting institutions in 1990 and then again for the period from 2002 to 2005, based on data from my study with Michelle Young and Terry Orr. The data are from the National Center for Education Statistics, Integrated Postsecondary Education Data System – Degree Completion files. In 1990, Harvard made the list. But by the later period (and perhaps even worse by now), the list had changed – a lot. The list now includes mass-producers of graduate degrees like Nova Southeastern University and William Woods (Missouri), pumping out about 500 master’s degrees per year in educational administration (and related degree codes). Other standout newcomers include Lindenwood University (also Missouri), National Louis University (Illinois) and St. Peters College (New Jersey).

From 2002 to 2005, Harvard continued production at its 1990 levels, like many major research universities. But by 2002 to 2005, Harvard had dropped to 68th in production, right behind Mid-America Nazarene University in Kansas (their radio jingle still sticks in my head from my Kansas years… MNU’s, that is, not Harvard’s… I doubt Harvard has a radio jingle).

If the trends in master’s degree production weren’t bad enough, similar if not more disturbing trends have occurred in the production of doctoral degrees in educational administration. In 1990, Harvard reported about 40 doctoral degrees in Educational Administration and Nova Southeastern about 100. Bad enough already. By 2005, Harvard was no longer listing or reporting doctoral degrees granted under program codes for Educational Administration, and the biggest producers nationally were Nova Southeastern (368), Argosy University – Sarasota (196) and St. Louis University (62). Even if these programs were/are credible, managing the quality control on 200 to 400 doctoral candidates per year seems problematic at best. Simply finding, enrolling and retaining 200 to 400 high quality candidates willing to pursue this type of degree seems a bit of a stretch! How many applied? How many, if any, were rejected?

The damage done by these institutions and the diversified production of educational leaders is astounding in some states. In 1999, only a few principals of Missouri public schools held graduate degrees from the state’s emerging degree mills. By 2006, 185 held their master’s degrees from Lindenwood University and 205 from William Woods, out of a data set having just over 2,000 completely matched records over time. Nearly 400 of 2,000, or nearly 20%, of Missouri principals held degrees from institutions which are arguably hardly qualified to grant them.

Principals who attended these graduate programs are substantially more likely to have attended the least competitive undergraduate colleges. For William Woods University, 80% of master’s degree recipients who became Missouri principals attended undergraduate colleges in the bottom 3 (of 6) categories of competitiveness (based on Barron’s Guide ratings), compared to 68% of principals statewide.

And further, the shares of teachers who also attended the least competitive colleges, hired into schools headed by these principals, have grown dramatically – from 65% to 75% in the bottom two categories of Barron’s ratings over 7 years – and faster than for other schools statewide.

This shift would be inconsequential were it not for strong and consistent evidence from a multitude of studies that the academic caliber of the teacher workforce is highly relevant to student success. While many sources highlight this issue (see for example, Baker & Cooper, 2005), Loeb and colleagues provide particularly striking findings in their work in New York City. They report that:

“. . . almost half of the teachers in the most effective quintile (based on student outcomes) graduated from a college ranked competitive or higher by Barron’s, compared to only ten percent of the teachers in the least effective quintile” (p. 23).

This is a serious issue and one state policy makers seem unwilling to address. National accrediting agencies are comparably unwilling and/or incapable of addressing this educational leadership brain drain.

A graduate program in educational leadership or any field is only as good as the quality of its students and faculty, but criteria for program accreditation pay little attention to either the academic quality of students or qualifications of faculty.

Altering the quality of school leadership requires greater involvement of leading public and private universities, pursuing endeavors like the new Harvard program. But equally important, altering the quality of school leadership requires that state policymakers step up and shut down institutions that by the quality of their average student and qualifications of their faculty have no business preparing school leaders.

While this argument might easily be construed as academic elitism, it is important to acknowledge that this argument relates to the preparation of leaders for academic institutions –namely public schools. It is difficult to conceive of a rational argument for ignoring the relevance of academic credentials for individuals wishing to lead academic institutions.

Relevant research readings:

Baker, B., & Cooper, B. (2005). Do principals with stronger academic backgrounds hire better teachers? Policy implications for improving high-poverty schools. Educational Administration Quarterly, 41(3), 413-448.

Baker, B.D, Orr, M.T., Young, M.D. (2007) Academic Drift, Institutional Production and Professional Distribution of Graduate Degrees in Educational Administration. Educational Administration Quarterly 43 (3)  279-318

Baker, B.D., Wolf-Wendel, L.E., Twombly, S.B. (2007) Exploring the Faculty Pipeline in Educational Administration: Evidence from the Survey of Earned Doctorates 1990 to 2000. Educational Administration Quarterly 43 (2) 189-220

Pondering the Usefulness of Value-Added Assessment of Teachers

Value-added teacher assessment has been a mantra for education “reformers” throughout the debate over Race to the Top. We’ve got to evaluate teachers and make hiring and firing decisions on the basis of real student performance measures – you know, like businesses – like the real world does! (A highly questionable assumption indeed – AIG bonuses anyone?).

I address the technical issues with value-added assessment of teachers here, indicating just how premature these assertions are from a technical standpoint.

https://schoolfinance101.wordpress.com/2009/11/07/teacher-evaluation-with-value-added-measures/

At present, good value-added measures are little more than a really cool (if not totally awesome) research tool, but most of the best analyses of value-added as a tool for teacher evaluation suggest that, even in the best of cases, potentially problematic biases remain.

Let’s set these technical issues aside for now and explore some practical issues. For example, just how many teachers in a public education system could even be evaluated with value-added assessment? Consider these constraints.

  1. Most states, like New Jersey, implement yearly assessments in grades 3 through 8, plus perhaps end-of-course exams or some HS exit exam. (I’ll set aside concerns over the fact that annual, rather than fall-spring, assessment captures vast differences in summer learning which play out by student economic status – advantaging some teachers and disadvantaging others, depending on which kids they have.)
  2. In most cases, the established and more reliable tests exist only in language arts and math, though some states have implemented science and/or social studies tests, which are arguably less cumulative.
  3. The most reliable VA assessment of teachers occurs where there exist multiple points of historical scores on students prior to the observed teacher (a smaller technical point). This really casts doubt on the usefulness of VA assessment for evaluating teachers whose kids are in their first few years of being assessed (grades 3 and 4 in NJ and many states).
  4. By the time a student hits middle school, they typically interact with multiple teachers who may have simultaneous influences on each other’s content area success. Even if we ignore this, at best we can look at the language arts and math teachers in the middle school setting.
  5. You have to jump over those untested grade 9 and 10 students and their teachers. If we have end-of-course exams, we don’t know what the beginning-of-course status necessarily was – at least in a VA modeling sense.

So, here is a listing of the certified staffing in New Jersey (below) in 2008 based on their grade levels and areas of teaching. The list does not include everyone, but does capture the main assignment (JOB Code 1) for the vast majority of school assigned teaching (and principal) personnel.

What this list shows us is that, in the best possible case, in a state with annual grades 3 to 8 assessment and shifting to end-of-course exams, we might be able to generate VA estimates of effectiveness for about 10% of teachers – up to 20% if we count that “ungraded elementary” group. That is, 10% (up to 20%) would be subject to a different evaluation system than the rest – an important 10% (or 20%), to be sure. In fact, nearly 50% of teachers would be infeasible to evaluate at all.
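A minimal sketch of this coverage arithmetic, using the three categories from the staffing tally below that plausibly meet the constraints above (grades 4 to 6, MS language arts and MS math); the assumption that these are the only categories counted toward the roughly 10% figure is mine:

```python
# Back-of-the-envelope coverage tally for VA-based teacher evaluation,
# using counts from the 2007-08 NJ staffing file summary in this post.
total_recoded = 109_433  # teachers with a classifiable main assignment

# Categories with annual, cumulative tests AND prior-year scores available
# (assumed mapping; see the constraints listed above)
reliable_va = {
    "Grades 4 to 6": 7_012,
    "MS Lang Arts": 2_844,
    "MS Math": 2_439,
}

pct_reliable = 100 * sum(reliable_va.values()) / total_recoded
print(f"Potentially reliable VA coverage: {pct_reliable:.2f}% of teachers")
# prints 11.24
```

Adding the “ungraded elementary” group (10.33%) is what pushes the upper bound toward 20%.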

Okay, so maybe this would create incentive for the real gunners in the mix of potential teachers to dive into those areas evaluated by VA. There exists an equal if not stronger possibility that the real gunners in the mix of potential teachers will avoid those classrooms of kids, schools or districts where – in the evaluated content areas and grade levels – they face an uphill battle to improve outcomes (hopefully, some will welcome the challenge).

There are some obvious solutions to this dilemma –

  1. Test everything, every year, by cumulative measures, fall and spring. Okay. That seems a bit absurd, but it might be a good economic stimulus for the testing industry. I still struggle with how we would evaluate teachers in supporting roles, such as many of those listed below, or teachers in the Arts and Music (perhaps applause meters… but only if we measure applause gain from concert to concert, rather than applause level?). What about vocational education?
  2. Just dump all of those teachers and all of that frivolous stuff kids don’t really need, and assign each group of kids a 12 year sequence of reading and math teachers. Some have actually argued that this really should be done, especially in higher poverty and/or underperforming schools. Why, for example, should a school with inadequate math and reading scores offer instrumental music or advanced math or journalism courses? (Put down that saxophone and pick up that basic math book, Mr. Parker!) The reality is that high poverty and underperforming schools in New Jersey and elsewhere already have concentrated their teaching staff on core activities, to the extent that kids in poor urban schools have much less access to arts and athletics.

I personally have significant concerns over the idea that poor urban kids should have access to a string of remedial reading and math teachers over time and nothing else, while kids in affluent neighboring suburbs should be the ones with additional access to foreign languages, tennis and lacrosse teams and elite jazz ensembles (this one really irks me) and orchestras. Quite honestly, successful participation in these activities is highly relevant to college admission – at least at the competitive schools. Certainly, the affluent communities are not going to go along with dumping all of these things.

So, if we can’t test everything every year and if it is offensive to argue for dumping all areas that aren’t or can’t reasonably be evaluated, then we have a significant gap in the usefulness of VA teacher assessment.

I did this tally very quickly using 2007-08 NJ staffing files. Feel free to tally and re-tally and post alternative counts below. Note that most of the special education teachers are missing from the tally below because I’ve not yet fully recoded them for 2008. While I have done so for earlier years, those years of the staffing files don’t break out content area for MS teachers or grade level for elementary teachers. About 14% of teachers in the 2005 and 2006 data were special education. At a maximum, in 2005 and 2006 I get about 20% of teachers as ungraded elementary and about another 5% or so potentially relevant for VA assessment (without the ability to remove untested grades).

Main Assignment | Number of Teachers | % of Teachers | Potentially Reliable VA Assessment | No Assessment at All
Art | 3,106 | 2.84 | | X
Basic Skills | 1,779 | 1.63 | | X
Bilingual | 697 | 0.64 | | X
Computer | 917 | 0.84 | | X
Coord/Director | 1,263 | 1.15 | | X
Counselors | 29 | 0.03 | | X
Elem English | 522 | 0.48 | |
Elem Math | 535 | 0.49 | |
Elem Science | 381 | 0.35 | |
Ungraded Elem | 11,308 | 10.33 | ? |
ESL | 1,700 | 1.55 | | X
FCS | 837 | 0.76 | | X
Grades 1 to 3 | 12,006 | 10.97 | |
Grades 4 to 6 | 7,012 | 6.41 | X |
Grades 6 to 8 | 1,305 | 1.19 | ? |
HS English | 13 | 0.01 | |
HS English | 5,041 | 4.61 | |
HS Math | 4,727 | 4.32 | |
HS Science | 4,391 | 4.01 | |
HS Soc Studies | 3,968 | 3.63 | | X
HS World Language | 4,460 | 4.08 | | X
Indus Arts | 1,217 | 1.11 | | X
Kindergarten | 321 | 0.29 | | X
Kindergarten | 3,565 | 3.26 | | X
MS Lang Arts | 2,844 | 2.6 | X |
MS Math | 2,439 | 2.23 | X |
MS Science | 1,669 | 1.53 | ? |
MS Soc Studies | 1,629 | 1.49 | ? |
MS World Language | 440 | 0.4 | | X
Music | 3,665 | 3.35 | | X
PE | 6,963 | 6.36 | | X
Perf Arts | 222 | 0.2 | | X
Preschool | 1,052 | 0.96 | | X
Preschool | 557 | 0.51 | | X
Principal | 2,172 | 1.98 | ? |
Psychologist | 1,545 | 1.41 | | X
SC Spec Educ | 163 | 0.15 | | X
SC Spec Educ | 6,747 | 6.17 | | X
SE RR/Inclusion | 963 | 0.88 | | X
Supervisor | 2,360 | 2.16 | | X
Vice Principal | 1,828 | 1.67 | | X
Voc Ed | 1,067 | 0.98 | | X
Total | 109,433 (of about 142,000 recoded) | | 11.24 | 47.01

Okay – So New Jersey is just probably a wacky inefficient example that has way too many of those extra teachers in trivial and wasteful assignments. Well, here’s the breakout of Illinois teachers for 2008.


I could go on and do this for Missouri, Minnesota, Wisconsin, Iowa, Washington and many others, showing generally the same pattern. I chose New Jersey above because the most recent years of NJ data actually break out the grade level assignment of most elementary teachers, so we can see how many grades 1 through 3 teachers would fall outside the evaluation system.

My point here is not to try to trash VA evaluation of teachers, but rather to point out just how little – even in a practical sense – the pundits who are pitching immediate action on using VA for hiring and firing teachers and providing incentive pay have bothered to think about even the most basic issues. Not the technical and statistical issues, but really simple stuff like just how many teachers would even be evaluated under such a system. And more importantly, since this is supposedly about “incentives” – just what kind of incentives this selective evaluation might create.

Title I Does NOT make “Rich” states “Richer!”

This is one fly I keep forgetting to swat, but one that has been repeatedly advanced by the Center for American Progress with excessively crude analyses. See: http://www.americanprogress.org/issues/2009/08/title1_map.html WOW! Just look at it. Those darn rich states like Connecticut, New York and New Jersey are running away with federal funding that should be targeted to poor states like Arkansas, Alabama and Mississippi.

Two glaring omissions in this analysis entirely undermine its conclusions. First, there is the issue of regional variation in true poverty: because the poverty thresholds used in the CAP analysis are not regionally sensitive to variation in income or costs, poverty rates tend to be overstated in lower income, lower cost regions. The U.S. Census Bureau has been engaged in research on this topic and released a new report last summer.

Second, the value of the Title I dollar varies significantly by location, largely as a function of the competitive wages for staff and other resources that might be purchased with those Title I dollars.

So then, how does all of this trivial academic griping affect the CAP analysis? First, here’s a slide of the 2006-07 Title I allocations per poverty pupil – same measure as CAP – by state poverty rate.

What we see here is that the small state minimum allotment does generate distorted, higher amounts of Title I funding per poor child in states like North Dakota, Wyoming and Vermont. We would also be led to believe that states like Louisiana, Mississippi, Arkansas and Tennessee are significantly disadvantaged by the formula (each receiving well under $2,000 per poor child), while New York, Connecticut and New Jersey (hidden in the mass of points) receive around $2,000 or more (NY much more) per poor child. An abomination, I say! (Or at least CAP would argue.)

What happens when we correct for the mis-specification of poverty, using an average of the three alternatives from the August 2009 Census Bureau paper? Well, we get:

Hmmm… Now it would appear that states like Louisiana are actually getting much more funding than New York per corrected poverty child. And Tennessee more than New Jersey! Wait – are you telling me that Title I doesn’t make these rich states richer? Yep – and I’m not even done yet.

Let’s go the next step and correct these Title I allocations per actual poor child for the regional value (based on competitive wage variation) of the Title I allocation.  Now we get:

Now we see that states like New Jersey, New York and especially California are actually significantly more disadvantaged by the Title I formula than states like Mississippi or Louisiana.
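The two corrections can be sketched with made-up numbers for two hypothetical states. Every figure below (allocations, corrected poverty counts, wage indices) is purely illustrative and is NOT an actual state value:

```python
# Illustrative correction of Title I dollars per poor pupil:
# (1) divide by a regionally corrected poverty count, then
# (2) deflate by a regional competitive-wage index (1.0 = national average).

def adjusted_title1_per_poor_pupil(allocation, corrected_poor_count, wage_index):
    """Title I dollars per corrected-poverty pupil, wage-adjusted."""
    return allocation / corrected_poor_count / wage_index

# Hypothetical high-cost state: official poverty understates need, wages high
high_cost = adjusted_title1_per_poor_pupil(400e6, 250_000, wage_index=1.15)
# Hypothetical low-cost state: official poverty overstates need, wages low
low_cost = adjusted_title1_per_poor_pupil(180e6, 90_000, wage_index=0.90)

print(f"High-cost state: ${high_cost:,.0f} per corrected poor pupil")  # $1,391
print(f"Low-cost state:  ${low_cost:,.0f} per corrected poor pupil")   # $2,222
```

The nominal per-poor-pupil figures here start out $1,600 versus $2,000; after both corrections the gap widens rather than narrows, which is the direction of the reversal described above.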

Look, the Title I formula certainly doesn’t produce the most logical allocations, or most equitable ones. One might also argue that it doesn’t maximize incentives for states to clean up their own act on equity or effort.

That said, there exists little excuse for excessively crude analyses which lead to such absurdly bold – AND FLAT OUT WRONG – conclusions like the conclusion that Title I makes rich states richer. Yeah – this kind of claim sounds good – makes good political rhetoric – good stump speech stuff for the absurdities of government behavior. But in this case, the CAP critique is simply wrong!

Here is a previous presentation I made on this topic before the Census working paper was available:

Baker.AERA.Title1

Let me clarify that the same issue of mis-measurement of poverty plagues urban-rural comparisons within states. Rural poverty is, in relative terms, overstated compared to urban poverty. So too are rural costs (competitive wages) lower than urban costs. So, just as it is true that Title I does not necessarily overfund “rich” states, Title I also does not necessarily overfund urban districts at the expense of rural ones. Unfortunately, I do not yet have available a finer grained adjusted poverty measure which will allow me to easily display the urban/rural issue.

Checking the Tab

As a follow-up to yesterday’s post on the completely fabricated, back-of-the-napkin numbers presented in The Tab, here’s a quick simulated allocation of the $11,000 foundation + $3,000 poverty weight (applied to free or reduced lunch) + $400 per ELL/LEP child.
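The proposed formula itself is simple to simulate. Here is a minimal sketch applied to one hypothetical district; the enrollment and pupil counts below are made up for illustration, not drawn from any Connecticut district:

```python
# Sketch of The Tab's proposed formula: $11,000 foundation per pupil,
# plus $3,000 per free/reduced-lunch pupil, plus $400 per ELL/LEP pupil.

FOUNDATION = 11_000
POVERTY_WEIGHT = 3_000  # applied to the free or reduced-price lunch count
ELL_WEIGHT = 400

def tab_allocation(enrollment, frl_count, ell_count):
    return (enrollment * FOUNDATION
            + frl_count * POVERTY_WEIGHT
            + ell_count * ELL_WEIGHT)

# Hypothetical high-poverty district: 20,000 pupils, 85% FRL, 15% ELL
total = tab_allocation(20_000, frl_count=17_000, ell_count=3_000)
print(f"Total allocation: ${total:,}")       # $272,200,000
print(f"Per pupil: ${total / 20_000:,.0f}")  # $13,610
```

Run district by district against actual spending, a formula like this makes winners and losers immediately visible, which is exactly the comparison The Tab’s summary table obscures.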

The Tab pretty much conceals any real changes or patterns of changes by lumping them into a summary table by groups of districts, without any documentation as to how the summary stats were estimated (page 27). Above is what the district by district changes would look like. It looks pretty much like a back-of-the-napkin attempt at a roughly break-even analysis. Remember, this is a proposal for the future compared against actual spending from 2007-08 – two years back now!

Specifically, the proposal would appear to reduce funding in Hartford and New Haven by greater amounts than it would increase funding in districts like New Britain and Waterbury, and would increase Bridgeport’s funding by only a similar amount. That is, it levels down high poverty districts as much as it levels some up – a fact concealed by the claims of a net increase of $620 per pupil in the short term. Mind you, The Tab certainly provides no evidence that districts like Hartford and New Haven are massively over-funded, as its own policy solutions would imply. Oh wait… The Tab really doesn’t rely on evidence at all. Silly me.

Just checkin the numbers – the made up numbers.

Why is it OK for Think Tanks to just make stuff up?

Something that has perplexed me for some time in my field of school finance is why it seems to be okay for policy advocates and “Think Tanks” to just make stuff up. For example, to just make up what level of funding would be appropriate for accomplishing any particular set of goals, or to just make up a figure for how much more a child with specific educational needs requires under state school finance policy. Just “making stuff up” seems particularly problematic for “Think Tanks,” which, as far as I can tell, should be producing information backed by at least some degree of… thinking? Perhaps based on some of the more reasonable thinking of the field?

This topic comes to mind today because ConnCan has just released a report (http://www.conncan.org/matriarch/documents/TheTab.pdf) on how to fix Connecticut school funding, which provides classic examples of just makin’ stuff up (page 25). The report begins with a few random charts and graphs showing the differences in funding between wealthy and poor Connecticut school districts and their state and local shares of funding. These analyses, while reasonably descriptive, are relatively meaningless because they are not anchored to any well conceived or articulated explanation of “what should be.” Such a conception might be located here or even here (Chapters 13, 14 & 15 are particularly on target)!

The height of making stuff up in the report is its recommended policy solution to a problem that is never clearly articulated. There are problems in CT, but The Tab certainly doesn’t identify them!

The supposed ideal policy solution involves a pupil-based funding formula where each pupil should receive at least $11,000 per pupil (made up), and each child in poverty (no definition provided – just a few random ideas in a footnote) should receive an additional $3,000 per pupil (also made up) and each child with limited English language proficiency should receive an additional $400 per pupil (yep… totally made up). There is minimal attempt in the report (http://www.conncan.org/matriarch/documents/TheTab.pdf) to explain why these figures are reasonable. They’re simply made up.

The authors do provide some back-of-the-napkin explanations for the numbers they made up – based on those numbers being larger than the amounts typically allocated (not necessarily true). They write off the possibility that better numbers might be derived by way of a general footnote reference to a chapter in the Handbook of Research on Education Finance and Policy by Bill Duncombe and John Yinger which actually explains methods for deriving such estimates.

The authors of The Tab conclude: “Combined with federal funding that flows on the basis of poverty and (in some cases) the English Language Learner weight of an additional $400, the $3,000 poverty weight would enable districts and schools to devote considerable resources to meeting the needs of disadvantaged students.” I’m glad they are so confident in their “made up” numbers! I, however, am less so!

It would be one thing if there were no conceptual or methodological basis for figuring out which children require more resources or how much more they might actually need. Then, I guess, you might have to make stuff up. Even then, it might be reasonable to make at least some thoughtful attempt to explain why you made up the numbers you… well… made up. But alas, such thinking seems beyond the grasp of at least some “think tanks.” Guess what? There actually are some pretty good articles out there which attempt to distill the additional costs associated with specific poverty measures… like this one, by Bill Duncombe and John Yinger:

How much more does a disadvantaged student cost?

It’s not as if the title of this article somehow conceals its contents, is it? Nor is the journal in which it was published (Economics of Education Review) somehow tangential to the point at hand. This paper, prepared for the National Research Council, provides some additional insights into the additional costs associated with poverty and methods for estimating those costs.

Rather than even attempt to argue that these figures are somehow founded in something, the authors of The Tab seem to push the point that it really doesn’t matter what these numbers are, as long as the state allocates pupil-based funding. That’s the fix! That’s what matters… not how much funding, or whether the right kids get the right amounts. In fact, the reverse is true: the potential effectiveness, equity and adequacy of any decentralized weighted funding system is highly contingent upon driving appropriate levels of funding and funding differentials across schools and districts!

I’ve critiqued the notion of pupil-based funding as a panacea, here:

Review of Fund the Child: Bringing Equity, Autonomy and Portability to Ohio School Finance

Review of Shortchanging Disadvantaged Students: An Analysis of Intra-district Spending Patterns in Ohio

Review of Weighted Student Formula Yearbook 2009

Oh, and also here: http://epaa.asu.edu/epaa/v17n3/

Among other things, in each of these critiques of think-tank reports I question why it seems okay to just make up “weights” and cost figures when applying distribution formulas – either for within or between district distribution.

Just thinking… but not making stuff up!