Blog

Racing Where? (the DFER list)

A few quick comments on the Democrats for Ed Reform list of states in line for RttT funds.

  • The list includes two of the few states that, year after year, maintain a pattern of higher-poverty school districts receiving systematically fewer resources than lower-poverty school districts – New York and Illinois. Colorado is also on the systematically regressive funding list, but is not a year-after-year standout like the other two.
  • The list includes the two states that allocate the smallest share of their gross state product to public education – Louisiana and Delaware. To add insult to injury, based on American Community Survey data from 2007, neither Delaware nor Louisiana even serves 80% of its 6- to 16-year-olds in the public school system (a “coverage” metric).
  • The list includes the state with the absolute lowest cost and need adjusted per pupil state and local revenue among all states – Tennessee.

The DFER post notes that Illinois’ chances aren’t lookin’ as good as earlier this year. But, TN, DE, CO and LA sound like strong contenders! (?)

Details on the methods and analysis behind these findings will be available on request in the near future. Cheers!

Update – Do School Finance Reforms Matter?

Here’s an excerpt from a forthcoming article on whether school finance reforms have made any difference for students. The article is partly in response to claims by Eric Hanushek and Alfred Lindseth that school finance reforms have resulted in massive increases in funding to public schools which have not helped and may have in fact harmed children. My forthcoming work on this topic is co-authored with Kevin Welner of U. of Colorado.

=====

In terms of quality and scope, the most useful single study of judicially induced state finance reform was published by Card and Payne in 2002. They found that court declarations of unconstitutionality in the 1980s increased the relative funding provided to low-income districts. And they found that these school finance reforms had, in turn, significant equity effects on academic outcomes:

Using micro samples of SAT scores from this same period, we then test whether changes in spending inequality affect the gap in achievement between different family background groups. We find evidence that equalization of spending leads to a narrowing of test score outcomes across family background groups. (p. 49)

To evaluate distributional changes in school finance, Card and Payne estimated the partial correlations between current expenditures per pupil and median family income, conditional on other factors influencing demand for public schooling across districts within states and over time. Card and Payne then measured the differences in the change in income-associated spending distribution between states where school funding systems had been overturned, upheld, or where no court decision had been rendered. Importantly, they also evaluated whether structural changes to funding formulas (that is, the actual reforms) were associated with changes to the income-spending relationship, conditional on the presence of court rulings.
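
For readers who want a concrete sense of that two-step design, here is a minimal sketch of the logic. This is not Card and Payne’s code; the file and column names (district spending per pupil, median family income, court-ruling status) are hypothetical stand-ins, and their actual models conditioned on additional demand factors.

```python
# Rough sketch of a Card & Payne (2002)-style analysis;
# all file and column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

districts = pd.read_csv("district_finance_panel.csv")  # district-by-year panel

# Step 1: within each state-year, estimate the income-spending gradient
# (association between current spending per pupil and median family income).
def income_gradient(g):
    return smf.ols("spend_pp ~ med_income", data=g).fit().params["med_income"]

gradients = (districts.groupby(["state", "year"])
             .apply(income_gradient)
             .rename("income_slope")
             .reset_index())

# Step 2: ask whether the gradient flattened more in states whose funding
# systems were overturned than where they were upheld or never challenged.
courts = pd.read_csv("state_court_status.csv")  # state, court_status
panel = gradients.merge(courts, on="state")
print(smf.ols("income_slope ~ C(court_status) + C(year)", data=panel).fit().summary())
```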

To make the final link between income-spending relationships and outcome gaps, Card and Payne evaluated changes in gaps in SAT scores among individual SAT test-takers categorized by family background characteristics.[1] Put in terms of our Figure 1, Card and Payne (2002) appear to have taken the greatest care in a multi-year, cross-state study, to establish appropriate linkages between litigation, reforms by type, changes in the distribution of funding, and related changes in the distribution of outcomes.

Notwithstanding the generally acknowledged importance of this study,[2] Hanushek and Lindseth (2009) never mention it in their book, including the chapter in which they conclude that school finance reforms have no positive effects.

This omission – as well as the other omissions noted below – is telling of a larger point. The development of, and reliance upon, a research base should depend on relatively objective criteria. Readers depend on authors of literature reviews to come forward with the best and most applicable research bearing on the issues under consideration. While Hanushek and Lindseth might argue that this particular omission is because Card and Payne (2002) are speaking to equity (not adequacy) litigation, we have already described how the line between equity and adequacy is not so simple. Moreover, the research that Hanushek and Lindseth do choose to include goes far beyond that directly focused on adequacy – including the Cato study of a Kansas City desegregation order discussed below.

Another key study not mentioned by Hanushek and Lindseth (2009) concerned the effects of reforms implemented under the Kansas court’s pre-ruling in 1992 (Deke, 2003). The reforms leveled up funding in low-property-wealth school districts, and Deke found as follows:

Using panel models that, if biased, are likely biased downward, I have a conservative estimate of the impact of a 20% increase in spending on the probability of going on to postsecondary education. The regression results show that such a spending increase raises that probability by approximately 5% (p. 275).

The Kansas reforms addressed by Deke (2003) came as a result of a judicial pre-order, advising the legislature that if the pending suit made it to trial, the judge would declare the school finance system unconstitutional (Baker and Green, 2006).

Hanushek and Lindseth (2009) also omitted from their discussion two additional studies, both peer-reviewed, that explore the effects of Michigan’s school finance reforms, known as “Proposal A,” implemented in the mid-1990s. Michigan’s reforms were implemented without a court ruling or a high level of litigation threat, but the reforms were nonetheless comparable in many ways to reforms implemented following judicial rulings[3] (see Leuven et al., 2007; and Papke, 2001). In the first study, Papke (2001) finds:

Focusing on pass rates for fourth-grade and seventh grade math tests (the most complete and consistent data available for Michigan), I find that increases in spending have nontrivial, statistically significant effects on math test pass rates, and the effects are largest for schools with initially poor performance. (Papke, 2001, p. 821.)

Leuven and colleagues (2007) find no positive effects of two specific increases in funding targeted to schools with elevated at-risk populations, a convenient conclusion for Hanushek and Lindseth to have included.

A third Michigan study (available online since 2003 as a working paper from Princeton University, and now accepted for publication in Education Finance and Policy, a peer-reviewed journal) directly estimates the relationship between implemented reforms and subsequent outcomes (Roy, 2003). Roy, whose work was not cited by Hanushek and Lindseth, finds:

Proposal A was quite successful in reducing inter-district spending disparities. There were also significant gains in achievement in the poorest districts, as measured by success in state tests. However, as yet these improvements do not show up in nationwide tests like NAEP and ACT. (Roy, 2003, p. 1.)

Most recently, a study by Choudhary (2009) “estimate[s] the causal effect of increased spending on 4th and 7th grade math scores for two test measures—a scale score and a percent satisfactory measure” (p. 1). She “find[s] positive effects of increased spending on 4th grade test scores. A 60 percent increase in spending increases the percent satisfactory score by one standard deviation” (p. 1).

Perhaps because there was no judicial order involved in Michigan, researchers were able to avoid the tendency to focus on or classify the judicial order. Moreover, single-state studies generally avoid such problems because there is little statistical purpose in classifying litigation. Importantly, each of these studies focuses instead on measures of the changing distribution and level of spending (characteristics of the reforms themselves) and resulting changes in the distribution and level of outcomes. Each takes a different approach, but attempts to appropriately align their measures of spending change and outcome change, adhering to principles laid out in our Figure 1.

Other high-quality but non-peer reviewed empirical estimates of the effects of specific school finance reforms linked to court orders have been published for Vermont and Massachusetts. For example, Downes (2004), in an evaluation of Vermont school finance reforms that were ordered in 1997 and implemented in 1998, found as follows:

All of the evidence cited in this paper supports the conclusion that Act 60 has dramatically reduced dispersion in education spending and has done this by weakening the link between spending and property wealth. Further, the regressions presented in this paper offer some evidence that student performance has become more equal in the post–Act 60 period. And no results support the conclusion that Act 60 has contributed to increased dispersion in performance. (p. 312)

Hanushek and Lindseth (2009) never acknowledge this positive finding (although they do briefly cite the Downes evaluation, for a different point). Again, one might attribute this omission to the argument that the Vermont reforms were equity reforms, not adequacy reforms. However, similar to the 1992 Kansas reforms, the overall effect of the Vermont Act 60 reforms was to level up low-wealth districts and increase state school spending dramatically, thus addressing both adequacy and equity.

For Massachusetts, two independent sets of authors (in addition to Hanushek and Lindseth) have found positive reform effects. Most recently — after the Hanushek and Lindseth book was written — Downes, Zabel and Ansel (2009) found:

The achievement gap notwithstanding, this research provides new evidence that the state’s investment has had a clear and significant impact. Specifically, some of the research findings show how education reform has been successful in raising the achievement of students in the previously low-spending districts. Quite simply, this comprehensive analysis documents that without Ed Reform the achievement gap would be larger than it is today. (p. 5)

Previously, Guryan (2003) found:

Using state aid formulas as instruments, I find that increases in per-pupil spending led to significant increases in math, reading, science, and social studies test scores for 4th- and 8th-grade students. The magnitudes imply a $1,000 increase in per-pupil spending leads to about a third to a half of a standard-deviation increase in average test scores. It is noted that the state aid driving the estimates is targeted to under-funded school districts, which may have atypical returns to additional expenditures. (p. 1)

Although Hanushek and Lindseth concede that Massachusetts reforms appear successful,[4] they failed to cite Guryan’s NBER working paper, the inclusion of which would have (like most other omitted studies) weakened their overall conclusions about the non-impact of these reforms.

Turning to New Jersey, two recent (though not yet peer-reviewed) studies find positive effects of that state’s finance reforms. Alexandra Resch (2008), in a study published as a dissertation for the economics department at the University of Michigan, found evidence suggesting that New Jersey Abbott districts “directed the added resources largely to instructional personnel” (p. 1) such as additional teachers and support staff. She also concluded that this increase in funding and spending improved the achievement of students in the affected school districts. Looking at the statewide 11th grade assessment (“the only test that spans the policy change”), she found “that the policy improves test scores for minority students in the affected districts by one-fifth to one-quarter of a standard deviation” (p. 1).

The second recent study was originally presented at a 2007 conference at Columbia University, and a revised, peer-reviewed version was recently published by the Campaign for Educational Equity at Teachers College, Columbia University (Goertz and Weiss, 2009). This paper offered descriptive evidence that reveals some positive test results of recent New Jersey school finance reforms:

State Assessments: In 1999 the gap between the Abbott districts and all other districts in the state was over 30 points. By 2007 the gap was down to 19 points, a reduction of 11 points or 0.39 standard deviation units. The gap between the Abbott districts and the high-wealth districts fell from 35 to 22 points. Meanwhile performance in the low-, middle-, and high-wealth districts essentially remained parallel during this eight-year period (Figure 3, p. 23).

NAEP: The NAEP results confirm the changes we saw using state assessment data. NAEP scores in fourth-grade reading and mathematics in central cities rose 21 and 22 points, respectively between the mid-1990s and 2007, a rate that was faster than the urban fringe in both subjects and the state as a whole in reading (p. 26).

The Goertz and Weiss paper (which was, as designed and intended by the paper’s authors, the statistically least rigorous analysis of the ones presented here) does receive mention from Hanushek and Lindseth multiple times, but only in an effort to discredit and minimize its findings.

Card, D. and Payne, A. A. (2002). School Finance Reform, the Distribution of School Spending, and the Distribution of Student Test Scores. Journal of Public Economics, 83(1), 49-82.

Choudhary, L. (2009). Education Inputs, Student Performance and School Finance Reform in Michigan. Economics of Education Review, 28(1), 90-98.

Deke, J. (2003). A study of the impact of public school spending on postsecondary educational attainment using statewide school district refinancing in Kansas, Economics of Education Review, 22(3), 275-284.

Downes, T. A. (2004). School Finance Reform and School Quality: Lessons from Vermont. In Yinger, J. (ed), Helping Children Left Behind: State Aid and the Pursuit of Educational Equity. Cambridge, MA: MIT Press.

Downes, T. A., Zabel, J., Ansel, D. (2009). Incomplete Grade: Massachusetts Education Reform at 15. Boston, MA: MassINC.

Goertz, M., and Weiss, M. (2009). Assessing Success in School Finance Litigation: The Case of New Jersey. New York City: The Campaign for Educational Equity, Teachers College, Columbia University.

Guryan, J. (2003). Does Money Matter? Estimates from Education Finance Reform in Massachusetts. Working Paper No. 8269. Cambridge, MA: National Bureau of Economic Research.

Leuven, E., Lindahl, M., Oosterbeek, H., and Webbink, D. (2007). The Effect of Extra Funding for Disadvantaged Pupils on Achievement. The Review of Economics and Statistics, 89(4), 721-736.

Resch, A. M. (2008). Three Essays on Resources in Education (dissertation). Ann Arbor: University of Michigan, Department of Economics. Retrieved October 28, 2009, from http://deepblue.lib.umich.edu/bitstream/2027.42/61592/1/aresch_1.pdf

Roy, J. (2003). Impact of School Finance Reform on Resource Equalization and Academic Performance: Evidence from Michigan. Princeton University, Education Research Section Working Paper No. 8. Retrieved October 23, 2009 from http://papers.ssrn.com/sol3/papers.cfm?abstract_id=630121 (Forthcoming in Education Finance and Policy.)


[1] Card and Payne provide substantial detail on their methodological attempts to negate the usual role of selection bias in SAT test-taking patterns. They also explain that their preference was to measure more directly the effects of income-related changes in current spending per pupil on income-related changes in SAT performance, but that the income measures in their SAT database were unreliable and could not be corroborated by other sources. As such, Card and Payne used combinations of parent education levels to proxy for income and socio-economic differences between SAT test takers.

[2] As one indication of its prominence among researchers, as of the writing of this article, Google Scholar identified 153 citations to this article.

[3] There is little reason to assume that the presence of judicial order would necessarily make otherwise similar reforms less (or more) effective, though constraints surrounding judicial remedies may.

[4] Hanushek and Lindseth attribute the success of the Massachusetts reforms not to spending, but to the fact that the “remedial steps passed by the legislature also included a vigorous regime of academic standards, a high-stakes graduation test, and strict accountability measures of a kind that have run into resistance in other states, particularly from teachers unions” (p. 169). That is, it was not the funding that mattered in Massachusetts, but rather it was the accountability reforms that accompanied the funding.

NJ Charter Update – Math Trends over Time

Note: This is not a “Study.” This is just a summary of NJDOE Report Card Data, which can be found here: http://education.state.nj.us/rc/

I made a few more graphs for fun today, pursuing the question of whether the shares of children scoring only “partially proficient” over time are changing in New Jersey Charter Schools at any different rate from the shares scoring partially proficient in traditional public school districts. Note, of course, that partially proficient is a nice way of saying – failed the test. Each graph below includes only General Education students, as I have previously shown that NJ Charters serve very few if any special education students and including these students substantially changes performance levels for non-Charters. Again, it is most relevant to compare visually the Charter schools to schools in district factor groups A and B, because Charter schools do tend to serve children from these districts – primarily A. But, as the graphs show, Charters have continued to perform similarly to schools in DFG A (less well in Grade 4).

"R" indicates Charter Schools
"R" indicates Charter
"R" Indicates Charter Schools

Smarter School Leaders

Smarter School Leaders: Enough to reverse the trend?

http://www.nytimes.com/2009/12/05/opinion/05herbert.html?_r=1

This recent New York Times article highlights a new doctoral program for educational leaders that is a joint venture of Harvard Graduate School of Education, Kennedy School of Government and Harvard Business School. An interesting approach indeed and one that will hopefully generate some top quality leaders for public schools and school districts. But, there are about 100,000 public schools out there, spread across 16,000 or so school districts and public charter schools. In the best of cases, each of these schools and districts would get the best and brightest possible leader. My guess, however, is that this new Harvard program will barely make a dent in our national needs.

Perhaps the new Harvard program can serve as a model for making a bigger and better dent. Now, when I say that, I should clarify that I’m not taking the pop-policy position that this program is a model simply because it involves a business school and public policy school and the education school, but rather because it involves a GOOD business school, HIGH QUALITY public policy school and TOP NOTCH education school. There are at least as many intellectually vacuous b-school programs as there are comparably vacuous ed-school programs. You see, it’s not about b-school versus ed-school. It’s about high quality schools with highly self-selective pools of degree-seekers and top notch faculty deciding to play a more significant role in public school leadership. However, it’s going to be an uphill battle!

A few years back, Michelle Young, Terry Orr and I explored changing patterns of degree production in educational administration. With other colleagues, I explored the characteristics of faculty in educational administration programs, their pipeline and their qualifications. More recently, I’ve been exploring the effects of the changing principal preparation pipeline on schools in states like Missouri. AND IT’S NOT A PRETTY PICTURE!

Michelle, Terry and I found in our degree production study that:

“The largest number and greatest increase were among master’s degrees. In 2003, there were 15,720 master’s degrees conferred in educational leadership, a 90 percent increase since 1993.”

And:

“Even more striking are the increases in master’s degree granting programs at Comprehensive II and Liberal Arts II institutions. Such program increases reflect a dramatic growth in the availability of programs in local and regional institutions.”

And further, that:

“The percentage of all master’s degrees produced by higher status institutions, the Research I through Doctoral II institutions dropped from 42 percent in 1993 to 36 percent in 2003.”

That is, master’s degree production in particular has mushroomed over the past decade and a half, and many of the new master’s degrees produced are from institutions that previously had minimal involvement in educational administration and are generally considered lower status institutions.

The figure below shows the top master’s-granting institutions in educational administration in 1990 and then again for the period from 2002 to 2005, based on data from my study with Michelle Young and Terry Orr. The data are from the National Center for Education Statistics, Integrated Postsecondary Education Data System – Degree Completion files. In 1990, Harvard made the list. But by the later period (and perhaps even worse by now), the list had changed – a lot. The list now includes mass-producers of graduate degrees like Nova Southeastern University and William Woods (Missouri) pumping out about 500 master’s degrees per year in educational administration (and related degree codes). Other standout newcomers include Lindenwood University (also Missouri), National Louis University (Illinois) and St. Peter’s College (New Jersey).

From 2002 to 2005, Harvard continued production at its 1990 levels, like many major research universities, but by then it had dropped to 68th in production, right behind Mid-America Nazarene University in Kansas (MNU’s radio jingle still sticks in my head from my Kansas years… I doubt Harvard has one).

If trends in master’s degree production weren’t bad enough, similar if not more disturbing trends have occurred in the production of doctoral degrees in educational administration. In 1990, Harvard reported about 40 doctoral degrees in Educational Administration and Nova Southeastern about 100. Bad enough already. By 2005, Harvard was no longer listing or reporting doctoral degrees granted under program codes for Educational Administration, and the biggest producers nationally were Nova Southeastern (368), Argosy University – Sarasota (196) and St. Louis University (62). Even if these programs were/are credible, managing the quality control on 200 to 400 doctoral candidates per year seems problematic at best. Simply finding, enrolling and retaining 200 to 400 high quality candidates willing to pursue this type of degree seems a bit of a stretch! How many applied? How many, if any, were rejected?

The damage done by these institutions and the diversified production of educational leaders is astounding in some states. In 1999, only a few principals of Missouri public schools held graduate degrees from the state’s emerging degree-mills. By 2006, 185 held their master’s degrees from Lindenwood University and 205 from William Woods, out of a data set with just over 2,000 completely matched records over time. Nearly 400 of 2,000 – nearly 20% of Missouri principals – held degrees from institutions that are arguably hardly qualified to grant them.

Principals who attended these graduate programs are substantially more likely to have attended the least competitive undergraduate colleges. For William Woods University, 80% of master’s degree recipients who became Missouri principals attended undergraduate colleges in the bottom 3 (of 6) categories of competitiveness (based on Barron’s Guide ratings), compared to 68% of principals statewide.

Further, the share of teachers hired into schools headed by these principals who themselves attended the least competitive colleges has grown dramatically – from 65% to 75% in the bottom two categories of Barron’s ratings over seven years – and faster than for other schools statewide.

This shift would be inconsequential were it not for strong and consistent evidence from a multitude of studies that the academic caliber of the teacher workforce is highly relevant to student success. While many sources highlight this issue (see, for example, Baker & Cooper, 2005), Loeb and colleagues provide a particularly striking finding in their work on New York City. They report that:

“ . . . almost half of the teachers in the most effective quintile (based on student outcomes) graduated from a college ranked competitive or higher by Barron’s, compared to only ten percent of the teachers in the least effective quintile.” (p. 23)

This is a serious issue and one state policy makers seem unwilling to address. National accrediting agencies are comparably unwilling and/or incapable of addressing this educational leadership brain drain.

A graduate program in educational leadership or any field is only as good as the quality of its students and faculty, but criteria for program accreditation pay little attention to either the academic quality of students or qualifications of faculty.

Altering the quality of school leadership requires greater involvement of leading public and private universities, pursuing endeavors like the new Harvard program. But equally important, altering the quality of school leadership requires that state policymakers step up and shut down institutions that by the quality of their average student and qualifications of their faculty have no business preparing school leaders.

While this argument might easily be construed as academic elitism, it is important to acknowledge that this argument relates to the preparation of leaders for academic institutions –namely public schools. It is difficult to conceive of a rational argument for ignoring the relevance of academic credentials for individuals wishing to lead academic institutions.

Relevant research readings:

Baker, B., & Cooper, B. (2005). Do principals with stronger academic backgrounds hire better teachers? Policy implications for improving high-poverty schools. Educational Administration Quarterly, 41(3), 413-448.

Baker, B. D., Orr, M. T., & Young, M. D. (2007). Academic Drift, Institutional Production and Professional Distribution of Graduate Degrees in Educational Administration. Educational Administration Quarterly, 43(3), 279-318.

Baker, B. D., Wolf-Wendel, L. E., & Twombly, S. B. (2007). Exploring the Faculty Pipeline in Educational Administration: Evidence from the Survey of Earned Doctorates 1990 to 2000. Educational Administration Quarterly, 43(2), 189-220.

Pondering the Usefulness of Value-Added Assessment of Teachers

Value-added teacher assessment has been a mantra for education “reformers” throughout the debate over Race to the Top. We’ve got to evaluate teachers and make hiring and firing decisions on the basis of real student performance measures – you know, like businesses – like the real world does! (A highly questionable assumption indeed – AIG bonuses anyone?).

I address the technical issues with value-added assessment of teachers here, indicating just how premature these assertions are from a technical standpoint.

https://schoolfinance101.wordpress.com/2009/11/07/teacher-evaluation-with-value-added-measures/

At present, good value-added measures are little more than a really cool (if not totally awesome) research tool, but most of the best analyses of value-added as a tool for teacher evaluation suggest that even in the best of cases there still exist potentially problematic biases.

Let’s set these technical issues aside for now and explore some practical issues. For example, just how many teachers in a public education system could even be evaluated with value-added assessment? Consider these constraints.

  1. Most states, like New Jersey, implement yearly assessments in grades 3 through 8, and perhaps end-of-course exams or some HS exit exam. (I’ll set aside concerns over the fact that annual, rather than fall-spring, assessment captures vast differences in summer learning, which play out by student economic status – advantaging some teachers and disadvantaging others, depending on which kids they have.)
  2. In most cases, the established and more reliable tests exist only in language arts and math, though some states have implemented science and/or social studies tests which are arguably less cumulative.
  3. The most reliable VA assessment of teachers occurs where there exist multiple points of historical scores on students prior to the observed teacher (a smaller technical point; see the sketch after this list). This really casts doubt on the usefulness of VA assessment for evaluating teachers who have kids in their first few years of being assessed (grades 3 and 4 in NJ and many states).
  4. By the time a student hits middle school, they typically interact with multiple teachers who may have simultaneous influences on each other’s content area success. Even if we ignore this, at best we can look at the language arts and math teachers in the middle school setting.
  5. You have to jump over those untested grade 9 and 10 students and their teachers. If we have end-of-course exams, we don’t know what the beginning-of-course status necessarily was – at least in a VA modeling sense.
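
To make item 3 concrete, here is a minimal sketch of a bare-bones value-added regression, assuming a hypothetical student-level file with current and prior scores; it is not any state’s actual model.

```python
# Bare-bones value-added sketch; file and column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

students = pd.read_csv("student_scores.csv")

fit = smf.ols(
    "score_2008 ~ score_2007 + score_2006 + frl + ell + iep + C(teacher_id)",
    data=students,
).fit()

# The C(teacher_id) coefficients are the (naive) teacher effects. Students in
# grades 3 and 4 generally lack one or both prior scores, so their teachers
# simply drop out of the estimation - the practical point of item 3 above.
teacher_effects = fit.params.filter(like="teacher_id")
```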

So, here is a listing of the certified staffing in New Jersey (below) in 2008 based on their grade levels and areas of teaching. The list does not include everyone, but does capture the main assignment (JOB Code 1) for the vast majority of school assigned teaching (and principal) personnel.

What this list shows us is that in the best possible case, in a state with annual grades 3 to 8 assessment and shifting to end-of-course exams, we might be able to generate VA estimates of effectiveness for about 10% – or perhaps 20%, depending on how one counts that “ungraded elementary” group – of the teachers. That is, 10% (up to 20%) would be subject to a different evaluation system than the rest. In fact, nearly 50% of teachers would be infeasible to evaluate at all. Indeed, they are an important 10% (or perhaps 20%).

Okay, so maybe this would create incentive for the real gunners in the mix of potential teachers to dive into those areas evaluated by VA. There exists an equal if not stronger possibility that the real gunners in the mix of potential teachers will avoid those classrooms of kids, schools or districts where – in the evaluated content areas and grade levels – they face an uphill battle to improve outcomes (hopefully, some will welcome the challenge).

There are some obvious solutions to this dilemma –

  1. Test everything, every year, by cumulative measures, fall and spring. Okay. That seems a bit absurd, but it might be a good economic stimulus for the testing industry. I still struggle with how we would evaluate teachers in supporting roles, such as many of those listed below, or teachers in the arts and music (perhaps applause meters… but only if we measure applause gain from concert to concert, rather than applause level?). What about vocational education?
  2. Just dump all of those teachers and all of that frivolous stuff kids don’t really need and assign each group of kids a 12 year sequence of reading and math teachers. Some have actually argued that this really should be done, especially in higher poverty and/or underperforming schools. Why, for example, should a school with inadequate math and reading scores offer instrumental music or advanced math or journalism courses? (Put down that saxophone and pick up that basic math book Mr. Parker!) The reality is that high poverty and underperforming schools in New Jersey and elsewhere already have concentrated their teaching staff on core activities to the extent that kids in poor urban schools have much less access to arts and athletics.

I personally have significant concerns over the idea that poor urban kids should have access to a string of remedial reading and math teachers over time and nothing else, but kids in affluent neighboring suburbs should be the ones with additional access to foreign languages, tennis and lacrosse teams and elite jazz ensembles (this one really irks me) and orchestras. Quite honestly, successful participation in these activities is highly relevant to college admission – at least at the competitive schools. Certainly, the affluent communities are not going to go along with dumping all of these things.

So, if we can’t test everything every year and if it is offensive to argue for dumping all areas that aren’t or can’t reasonably be evaluated, then we have a significant gap in the usefulness of VA teacher assessment.

I did this tally very quickly using 2007-08 NJ staffing files. Feel free to tally and re-tally and post alternative counts below. Note that most of the special education teachers are missing from the tally below because I’ve not yet fully recoded them for 2008. While I have done so for earlier years, those years of the staffing files don’t break out content area for MS teachers or grade level for elem teachers. About 14% of teachers in 2005 or 2006 data were special education. At a maximum, I get to about 20% of teachers as ungraded elementary and about another 5% or so potentially relevant in 2005 and 2006 for VA assessment (without ability to remove untested grades).
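
For anyone who wants to re-run or challenge the tally, the basic computation is simple. Here is a rough sketch, with hypothetical file and column names standing in for the actual NJ staffing file layout.

```python
# Rough sketch of the staffing tally; file and column names are hypothetical.
import pandas as pd

staff = pd.read_csv("nj_certified_staff_2008.csv")
teachers = staff[staff["job_code"] == 1]          # main assignment only

counts = teachers["main_assignment"].value_counts()
shares = (100 * counts / counts.sum()).round(2)

# Assignments plausibly covered by grades 3-8 math/language arts VA models
va_feasible = ["Grades 4 to 6", "MS Lang Arts", "MS Math"]
print("Potentially VA-assessable share:", shares.reindex(va_feasible).sum(), "%")
```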

| Main Assignment | Number of Teachers | % of Teachers | Potentially Reliable VA Assessment | No Assessment at All |
|---|---|---|---|---|
| Art | 3,106 | 2.84 | | X |
| Basic Skills | 1,779 | 1.63 | | X |
| Bilingual | 697 | 0.64 | | X |
| Computer | 917 | 0.84 | | X |
| Coord/Director | 1,263 | 1.15 | | X |
| Counselors | 29 | 0.03 | | X |
| Elem English | 522 | 0.48 | | |
| Elem Math | 535 | 0.49 | | |
| Elem Science | 381 | 0.35 | | |
| Ungraded Elem | 11,308 | 10.33 | ? | |
| ESL | 1,700 | 1.55 | | X |
| FCS | 837 | 0.76 | | X |
| Grades 1 to 3 | 12,006 | 10.97 | | |
| Grades 4 to 6 | 7,012 | 6.41 | X | |
| Grades 6 to 8 | 1,305 | 1.19 | ? | |
| HS English | 13 | 0.01 | | |
| HS English | 5,041 | 4.61 | | |
| HS Math | 4,727 | 4.32 | | |
| HS Science | 4,391 | 4.01 | | |
| HS Soc Studies | 3,968 | 3.63 | | X |
| HS World Language | 4,460 | 4.08 | | X |
| Indus Arts | 1,217 | 1.11 | | X |
| Kindergarten | 321 | 0.29 | | X |
| Kindergarten | 3,565 | 3.26 | | X |
| MS Lang Arts | 2,844 | 2.60 | X | |
| MS Math | 2,439 | 2.23 | X | |
| MS Science | 1,669 | 1.53 | ? | |
| MS Soc Studies | 1,629 | 1.49 | ? | |
| MS World Language | 440 | 0.40 | | X |
| Music | 3,665 | 3.35 | | X |
| PE | 6,963 | 6.36 | | X |
| Perf Arts | 222 | 0.20 | | X |
| Preschool | 1,052 | 0.96 | | X |
| Preschool | 557 | 0.51 | | X |
| Principal | 2,172 | 1.98 | ? | |
| Psychologist | 1,545 | 1.41 | | X |
| SC Spec Educ | 163 | 0.15 | | X |
| SC Spec Educ | 6,747 | 6.17 | | X |
| SE RR/Inclusion | 963 | 0.88 | | X |
| Supervisor | 2,360 | 2.16 | | X |
| Vice Principal | 1,828 | 1.67 | | X |
| Voc Ed | 1,067 | 0.98 | | X |
| Total | 109,433 (of about 142,000 recoded) | | 11.24 | 47.01 |

Okay – So New Jersey is just probably a wacky inefficient example that has way too many of those extra teachers in trivial and wasteful assignments. Well, here’s the breakout of Illinois teachers for 2008.


I could go on, and do this for Missouri, Minnesota, Wisconsin, Iowa, Washington and many others showing generally the same pattern. I chose New Jersey above  because the most recent years of NJ data actually break out the grade level assignment of most elementary teachers so we can see how many grades 1 through 3 teachers would fall outside the evaluation system.

My point here is not to try to trash VA evaluation of teachers, but rather to point out just how little – even in a practical sense – the pundits who are pitching immediate action on using VA for hiring and firing teachers and providing incentive pay have bothered to think about even the most basic issues. Not the technical and statistical issues, but really simple stuff like just how many teachers would even be evaluated under such a system. And more importantly, since this is supposedly about “incentives” – just what kind of incentives this selective evaluation might create.

Title I Does NOT make “Rich” states “Richer!”

This is one fly I keep forgetting to swat, but one that has been repeatedly advanced by the Center for American Progress with excessively crude analyses. See: http://www.americanprogress.org/issues/2009/08/title1_map.html WOW! Just look at it. Those darn rich states like Connecticut, New York and New Jersey are running away with federal funding that should be targeted to poor states like Arkansas, Alabama and Mississippi.

Two glaring omissions in this analysis undermine its conclusions entirely. First, there is the issue of regional variation in true poverty: because the poverty thresholds used in the CAP analysis are not regionally sensitive to income variation or costs, poverty rates tend to be overstated in lower-income, lower-cost regions. The U.S. Census Bureau has been engaged in research on this topic and released a new report on alternative poverty measures last summer (August 2009).

Second, the value of the Title I dollar varies significantly by location, largely as a function of the competitive wages for staff and other resources that might be purchased with those Title I dollars.

So then, how does all of this academic, trivial griping affect the CAP analysis? First, here’s a slide of the 2006-07 Title I allocations per poverty pupil – same measure as CAP – by state poverty rate.

What we see here is that the small state minimum allotment does generate distorted, higher amounts of Title I funding per poor child in states like North Dakota, Wyoming and Vermont. We would also be led to believe that states like Louisiana, Mississippi, Arkansas and Tennessee are significantly disadvantaged by the formula (receiving well under $2,000 per poor child each), while New York, Connecticut and New Jersey (hidden in the mass of points) receive around $2,000 or more (NY much more) per poor child. An abomination I say! (Or at least CAP would argue.)

What happens when we correct for the mis-specification of poverty, using an average of the three alternatives from the August 2009 Census Bureau paper? Well, we get:

Hmmm… Now it would appear that states like Louisiana are actually getting much more funding than New York per corrected poverty child. And Tennessee more than New Jersey! Wait – are you telling me that Title I doesn’t make these rich states richer? Yep – and I’m not even done yet.

Let’s go the next step and correct these Title I allocations per actual poor child for the regional value (based on competitive wage variation) of the Title I allocation.  Now we get:

Now we see that states like New Jersey, New York and especially California are actually significantly more disadvantaged by the Title I formula than states like Mississippi or Louisiana.
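
The two corrections are simple arithmetic on the state-level figures. Below is a minimal sketch, with hypothetical column names for the official poverty count, the averaged Census alternative count, and a regional wage index.

```python
# Sketch of the two Title I corrections; file and column names are hypothetical.
import pandas as pd

t1 = pd.read_csv("title1_by_state_2007.csv")

# CAP-style measure: allocation per officially poor child
t1["per_poor_official"] = t1["title1_alloc"] / t1["poor_official"]

# Correction 1: substitute a corrected poverty count (average of the three
# alternative series from the August 2009 Census Bureau paper)
t1["per_poor_corrected"] = t1["title1_alloc"] / t1["poor_corrected"]

# Correction 2: deflate by a regional competitive-wage index so a Title I
# dollar reflects what it can actually buy in each state
t1["per_poor_adjusted"] = t1["per_poor_corrected"] / t1["wage_index"]
```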

Look, the Title I formula certainly doesn’t produce the most logical allocations, or most equitable ones. One might also argue that it doesn’t maximize incentives for states to clean up their own act on equity or effort.

That said, there exists little excuse for excessively crude analyses which lead to such absurdly bold – AND FLAT OUT WRONG – conclusions as the claim that Title I makes rich states richer. Yeah – this kind of claim sounds good – makes good political rhetoric – good stump speech stuff about the absurdities of government behavior. But in this case, the CAP critique is simply wrong!

Here is a previous presentation I made on this topic before the Census working paper was available:

Baker.AERA.Title1

Let me clarify that the same issue of mis-measurement of poverty plagues urban-rural comparisons within states. Rural poverty is, in relative terms, overstated compared to urban poverty. So too are rural costs (competitive wages) lower than urban costs. So, just as it is true that Title I does not necessarily overfund “rich” states, Title I also does not necessarily overfund urban districts at the expense of rural ones. Unfortunately, I do not yet have available a finer grained adjusted poverty measure which will allow me to easily display the urban/rural issue.

Checking the Tab

As a follow-up to yesterday’s post on the completely fabricated and back-of-the-napkin numbers presented in The Tab, here’s a quick simulated allocation of the $11,000 foundation + $3,000 poverty weight (applied to free or reduced lunch) + $400 per ELL/LEP child.

The Tab pretty much conceals any real changes or patterns of changes by lumping them into a summary table by groups of districts, without any documentation as to how the summary stats were estimated (page 27). Above is what the district-by-district changes would look like. It looks pretty much like a back-of-the-napkin attempt at a roughly break-even analysis. Remember, this is a proposal for the future compared against actual spending from 2007-08 – two years back now!
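
The simulation itself is easy to reproduce. Here is a rough sketch of the proposed formula applied to district data, with hypothetical column names for Connecticut district enrollment, free/reduced lunch counts, ELL counts, and actual 2007-08 spending.

```python
# Sketch of The Tab's proposed formula; file and column names are hypothetical.
import pandas as pd

ct = pd.read_csv("ct_districts_2007_08.csv")

ct["simulated_aid"] = (11000 * ct["enrollment"]
                       + 3000 * ct["frl_count"]     # poverty weight
                       + 400 * ct["ell_count"])     # ELL/LEP weight

# Per-pupil change relative to actual 2007-08 spending
ct["change_pp"] = (ct["simulated_aid"] - ct["actual_spending"]) / ct["enrollment"]
```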

Specifically, the proposal would appear to reduce funding in Hartford and New Haven by greater amounts than it would increase funding in districts like New Britain and Waterbury, and by roughly the same amount as the increase for Bridgeport. That is, it levels down high poverty districts as much as it levels some up – a fact concealed by the claims of a net increase of $620 per pupil in the short term. Mind you, The Tab certainly provides no evidence that districts like Hartford and New Haven are massively over-funded, as their own policy solutions would imply. Oh wait… The Tab really doesn’t rely on evidence at all. Silly me.

Just checkin the numbers – the made up numbers.

Why is it OK for Think Tanks to just make stuff up?

Something that has perplexed me for some time in my field of school finance is why it seems to be okay for policy advocates and “Think Tanks” to just make stuff up. For example, to just make up what level of funding would be appropriate for accomplishing any particular set of goals, or to just make up a figure for how much more a child with specific educational needs requires under state school finance policy. Just “making stuff up” seems particularly problematic for “Think Tanks,” which as far as I can tell should be producing information backed by at least some degree of … Thinking? Perhaps based on some of the more reasonable thinking of the field?

This topic comes to mind today because ConnCan has just released a report (http://www.conncan.org/matriarch/documents/TheTab.pdf) on how to fix Connecticut school funding which provides classic examples of just makin’ stuff up (page 25). The report begins with a few random charts and graphs showing the differences in funding between wealthy and poor Connecticut school districts and their state and local shares of funding. These analyses, while reasonably descriptive, are relatively meaningless because they are not anchored to any well conceived or articulated explanation of “what should be.” Such a conception might be located here or even here (Chapters 13, 14 & 15 are particularly on target)!

The height of making stuff up in the report is the recommended policy solution to a problem that is never clearly articulated. There are problems in CT, but The Tab certainly doesn’t identify them!

The supposed ideal policy solution involves a pupil-based funding formula in which each pupil would receive at least $11,000 (made up), each child in poverty (no definition provided – just a few random ideas in a footnote) would receive an additional $3,000 (also made up) and each child with limited English language proficiency would receive an additional $400 (yep… totally made up). There is minimal attempt in the report (http://www.conncan.org/matriarch/documents/TheTab.pdf) to explain why these figures are reasonable. They’re simply made up.

The authors do provide some back-of-the-napkin explanations for the numbers they made up – based on those numbers being larger than the amounts typically allocated (not necessarily true). They write off the possibility that better numbers might be derived by way of a general footnote reference to a chapter in the Handbook of Research on Education Finance and Policy by Bill Duncombe and John Yinger which actually explains methods for deriving such estimates.

The authors of The Tab conclude: “Combined with federal funding that flows on the basis of poverty and (in some cases) the English Language Learner weight of an additional $400, the $3,000 poverty weight would enable districts and schools to devote considerable resources to meeting the needs of disadvantaged students.” I’m glad they are so confident in their “made up” numbers! I, however, am less so!

It would be one thing if there was no conceptual or methodological basis for figuring out which children require more resources or how much more they might actually need. Then, I guess, you might have to make stuff up. Even then, it might be reasonable to make at least some thoughtful attempt to explain why you made up the numbers you… well… made up. But alas, such thinking seems beyond the grasp of at least some “think tanks.” Guess what? There actually are some pretty good articles out there which attempt to distill additional costs associated with specific poverty measures… like this one, by Bill Duncombe and John Yinger:

How much more does a disadvantaged student cost?

It’s not as if the title of this article somehow conceals its contents, is it? Nor is the journal in which it was published (Economics of Education Review) somehow tangential to the point at hand. This paper, prepared for the National Research Council, provides some additional insights into the additional costs associated with poverty and methods for estimating those costs.

Rather than even attempt to argue that these figures are somehow founded in something, the authors of The Tab seem to push the point that it really doesn’t matter what these numbers are as long as the state allocates pupil-based funding.  That’s the fix! That’s what matters… not how much funding or whether the right kids get the right amounts. In fact, the reverse is true. The potential effectiveness, equity and adequacy of any decentralized weighted funding system is highly contingent upon driving appropriate levels of funding and funding differentials across schools and districts!

I’ve critiqued the notion of pupil-based funding as a panacea, here:

Review of Fund the Child: Bringing Equity, Autonomy and Portability to Ohio School Finance

Review of Shortchanging Disadvantaged Students: An Analysis of Intra-district Spending Patterns in Ohio

Review of Weighted Student Formula Yearbook 2009

Oh, and also here: http://epaa.asu.edu/epaa/v17n3/

Among other things, in each of these critiques of think-tank reports I question why it seems okay to just make up “weights” and cost figures when applying distribution formulas – either for within or between district distribution.

Just thinking… but not making stuff up!

Playing with Charter Numbers in NJ

About a week ago, I commented that charter school average performance was not much, if any, different from the average performance of the poorest urban public schools. This is admittedly an oversimplified comparison, but not one I would have made had I believed it to be deceptive, which it is not – given the available data on New Jersey schools.

Here, I will walk through a more complicated though still imperfect analysis of elementary school performance in host districts and in charter schools based on data from 2004 to 2006 (data I had already compiled for related work). First, let’s begin with some descriptive characteristics of the charter schools and schools of similar grade level (elementary in this case) in their host districts based largely on school reports data from those years.

The table below shows that the data set includes 28 charter schools per year and 173 host district schools of the same grade level. The charters serve about 1,000 tested students and the host district schools about 11,000 tested students. While the free/reduced lunch share is roughly the same between the two, the free lunch share is higher in the host district schools (these are the poorer students). These differences vary by host district and charters. Newark charters, for example, are on average (though not all of them) relatively high poverty.

Note that the average free lunch share in DFG A schools, used in my previous comparison, is 63% (much higher than charters or their hosts on average).

Also higher in the host district schools are the shares of children who are LEP/ELL and who are classified as having disabilities. But, the host district schools do have higher total certified salaries per pupil (compiled from the state database on personnel salaries).

[Slide 1: descriptive characteristics of NJ charter schools and host district elementary schools, 2004-2006]

Three year average scale scores are also listed, for the 2004 to 2006 period.

But, the big question is what happens when you throw this all into the mix of a statistical model to evaluate whether charters outperform host district schools, controlling for the fact that they have less needy populations, but fewer resources to work with? Again, this is a simple school level model, which does not account for individual children’s relative gains in charters (treatment effect) compared to otherwise similar children not in charters but in host district schools. It would be wonderful to be able to conduct such analyses in NJ.

This school level model includes a dummy variable for each district that is a host district, such that charter performance in the model is measured against performance of the host district of that charter. The model includes only host districts and their respective charters. The overall charter effect is essentially the average of differences between charters and hosts, across hosts (and their respective charters).
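
In regression form, the model just described looks roughly like the sketch below; the variable names are hypothetical, and the actual model pooled combined math and language arts scale scores for 2004-2006.

```python
# Sketch of the school-level model with host district fixed effects;
# variable names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

schools = pd.read_csv("nj_charter_host_elem_2004_2006.csv")

fit = smf.ols(
    "scale_score ~ charter + free_lunch_pct + lep_pct + disability_pct "
    "+ cert_salaries_pp + C(host_district)",
    data=schools,
).fit()

# Average difference between charters and their hosts, net of the controls
print(fit.params["charter"])
```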

What we see in this model is that charters, on average, are no different from their hosts on the combined math and language scale scores for NJASK from 2004 to 2006.  While the statewide model of the same data shows a strong effect of cumulative salaries per pupil on outcomes, the model within host districts of charters does not – an interesting point to explore. But, other factors play out quite logically – with each student need factor statistically significantly depressing scale scores.

[Slide 3: school-level regression results comparing charter schools to host district schools, 2004-2006]

So, what does this more complicated, but still not complicated enough, analysis tell us? It tells us that average charter school performance on elementary assessments from 2004 to 2006 is no different from average performance in other poor urban schools – specifically the host districts of those charters. It just says this in a more complicated way. Sometimes simple averages – when not deceptive – can be sufficient.

One factor that could turn the findings in favor of charters (as a treatment effect) would be if the average starting performance level of charter students is lower than that of otherwise similar host school students – which could occur if there is a tendency for parents to look to charters when their children are under-performing. This appears to be the case in the Missouri data in the CREDO study noted below. But, this is unlikely to create a substantial effect.

Again, this is just playing with the numbers, albeit a more rigorous play than my previous posts – leading to the same conclusions.

For more thorough discussions of charter school research, see:

http://epicpolicy.org/think-tank/reviews

Check out, specifically, the original NYC Hoxby study and the critique of it, as well as the CREDO 16-state study and the RAND 8-state study. Exercise caution in linking any specific findings to the New Jersey context.

Illinois Salary Gaps – Do they matter?

I picked up this article on Twitter yesterday, which seemed at first to make a veiled version of the classic “money doesn’t matter” argument, or at least that’s how the tweets and headlines were spun. The article is somewhat more thoughtful, discussing many reasons why teacher salaries vary and how those variations are largely tied to differences in taxable property wealth across Illinois school districts, but the article misses a real opportunity to shed light on some striking disparities across the state and across districts within the Chicago metro area.

http://www.chicagotribune.com/news/education/chi-teacher-salary-09-nov09,0,1639857.story

So, how might we better understand salary variation across Illinois school districts and children, and whether that variation is problematic or not? First, we know that teachers matter! Second, we know from work by Hanushek and Rivkin that the uneven distribution of teaching quality by racial composition of students can explain substantial portions of the growth in the black-white achievement gap between 3rd and 8th grade. http://faculty.smu.edu/millimet/classes/eco7321/papers/hanushek%20rivkin%2002.pdf To quote:

Unequal distributions of inexperienced teachers and of racial concentrations in schools can explain all of the increased achievement gap between grades 3 and 8. (p. 1)

Further, we know from these same authors in an earlier study that:

Table 7 suggests that a school with 10 percent more black students would require about 10 percent higher salaries in order to neutralize the increased probability of leaving. (p. 38 of PDF, not numbered)

https://www.utdallas.edu/research/tsp-erc/pdf/jrnl_hanushek_2004_public_schools_lose.pdf.pdf

Recap – Two major factors determining how well kids do in school are the characteristics of the other kids in the same class and the quality of their teacher. Unfortunately, the characteristics of kids in a given class affect who typically ends up teaching that class. Classrooms with greater shares of minority children end up with less well educated, less experienced teachers. This, in combination with peer effects, produces substantial disparities in outcomes which grow over time. Salary differentials might help to offset these disparities.

As such, it would likely be quite problematic if teacher salaries in Illinois – WITHIN ANY GIVEN LABOR MARKET – were systematically lower in districts with higher concentrations of black and/or Hispanic children. By cursory analysis it is rather difficult to disentangle the adverse effect of salary alone. That is, you can’t just take average salaries and try to relate them to average test scores, and then conclude that salaries don’t matter, but student characteristics do. The reality is that the two simultaneously matter and interact in important ways.

In many states, salaries and overall funding are actually comparable between districts with higher minority concentrations and other districts and in some states salaries and overall funding are actually higher (though not necessarily enough higher) in higher minority concentration districts (see: http://eric.ed.gov/ERICWebPortal/custom/portlets/recordDetails/detailmini.jsp?_nfpb=true&_&ERICExtSearch_SearchValue_0=EJ718694&ERICExtSearch_SearchType_0=no&accno=EJ718694.)

OH, BUT NOT IN ILLINOIS!

In my most recent analysis of individual teacher salary data for 2004 to 2008 in Illinois, I find that a full-time teacher – holding contractual months, degree level and experience constant, and compared to districts in the same labor market – is paid about $2,000 less per year if she teaches in a school where the majority of children are black and Hispanic. Further, a teacher with a master’s degree makes about $8,500 more per year. And, teachers in majority minority schools in Illinois are only about 60% to 70% as likely to hold a master’s degree as teachers in predominantly white schools in the same labor market.
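
A minimal sketch of that kind of salary regression follows; the variable names are hypothetical, and the actual analysis used individual teacher records for 2004 to 2008.

```python
# Sketch of the within-labor-market salary comparison; variable names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

teachers = pd.read_csv("il_teacher_salaries_2004_2008.csv")

fit = smf.ols(
    "salary ~ fte + contract_months + masters + experience "
    "+ majority_black_hispanic + C(labor_market) + C(year)",
    data=teachers,
).fit()

# Coefficient of interest: the salary difference, all else equal, for teachers
# in schools where most children are black and/or Hispanic
print(fit.params["majority_black_hispanic"])
```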

We also know that the dropout rate is over 7% higher in majority minority districts compared with other districts in the same labor market. The mean ACT is over 4 points lower and the mean proficiency rates on state assessments are about 20% lower in majority minority districts compared to predominantly white districts in the same labor market.

These are striking disparities. And in Illinois, unlike many other states, state policymakers have applied no financial leverage to attempt to resolve these disparities. No harm, no foul? Doubtful!