Why I’m not crying for Louisiana and Colorado

Many of the “reformers” out there are whining and fist-thumping about the surprise omission of Louisiana and Colorado as Race to the Top winners. After all, Louisiana has been a heavy favorite from the outset of RttT, and Colorado… well, Colorado took the amazingly bold leap of adopting legislation to mandate that a majority of teacher evaluation be based on value-added test scores. That’s got to count for something. Heck, these two states should have gotten the whole thing! Here’s Tom Vander Ark’s take on this huge surprise loss: http://edreformer.com/2010/08/co-la-surprise-losers/

Now here’s why I find it somewhat of a relief that these two states did not find themselves in the winners’ circle (not that I can identify a great deal of logic to support those who did… but…).

I’ve written numerous times about Louisiana’s public education system, and that state’s support, or lack thereof, for providing a decent quality education for the children of Louisiana.

https://schoolfinance101.wordpress.com/2009/12/18/disg-race-to-the-top/

Here’s an excerpt from that previous post:

Let’s take a look at Louisiana’s education system. Yes, their system needs help, but the reality is that Louisiana politicians have never attempted to help their own system. In fact they’ve thrown it under the bus and now they want an award? Here’s the rundown:

  • 3rd lowest (behind Delaware & South Dakota) % of gross state product spent on elementary and secondary schools (American Community Survey of 2005, 2006, 2007)
  • 2nd lowest percent of 6 to 16 year old children attending the public system at about 80% (tied with Hawaii, behind Delaware) (American Community Survey of 2005, 2006, 2007). The national average is about 87%.
  • 2nd largest (behind Mississippi) racial gap between % white in private schools (82%) and % white in public schools (52%) (American Community Survey of 2005, 2006, 2007).  The national average is a 13% difference in whiteness, compared to 30% in Louisiana.
  • 3rd largest income gap between publicly and privately schooled children at about a 2 to 1 ratio. (American Community Survey of 2005, 2006, 2007)
  • 4th highest percent of teachers who attended non-competitive or less competitive (bottom 2 categories) undergraduate colleges based on Barron’s ratings (NCES Schools and Staffing Survey of 2003-04). Almost half of Louisiana teachers attended less or non-competitive colleges, compared to 24% nationally.
  • Negative relationship between per pupil state and local revenues and district poverty rates, after controlling for regional wage variation, economies of scale, population density (poor get less).
  • 46th (of 52) on NAEP 8th Grade Math in 2009. 38th of 41 in 2000. http://nces.ed.gov/nationsreportcard/statecomparisons/
  • 49th (of 52) on NAEP 4th Grade Math in 2009. 35th of 42 in 2000.

So, this is a state where 20% abandon the public system and 82% of those who leave are white and have income twice that of those left in the public system, half of whom are non-white. While the racial gap is large in Mississippi, a much smaller share of Mississippi children abandon the public system and Mississippi is average on the percent of GSP allocated to public education. Mississippi simply lacks the capacity to do better. Louisiana doesn’t even try. And they deserve an award?

Quite honestly, I hadn’t really thought much about Colorado’s chances until today. I was certainly aware of their finalist status and aware of the reform crowd support for their new teacher evaluation legislation. But I hadn’t really reviewed their “indicators.” Here’s my summary of Colorado from earlier today:

Using 2007-08 data, Colorado ranks:

  • 45th in effort (% gross state product spent on schools)
  • 39th in funding level overall
  • 32nd in funding fairness (has a system whereby higher poverty districts have systematically less state and local revenue per pupil than lower poverty districts)

Yes, better than Louisiana, but nothin’ to brag about. And yes, both are marginally better than Round 1 winner Tennessee… but nearly every other state in the nation is.

So, where do these two states fit into those scatterplots I posted earlier today which identified Round 1 and Round 2 winners? Here they are – First, fiscal effort and overall spending level. Both states are very low effort states, and both are relatively low spending states.

And next, effort and funding fairness – or the extent to which funding is allocated in greater amounts to districts with greater needs.

In both cases, Louisiana and Colorado fall toward the lower left hand corner of the plot. Both are very low fiscal effort states. They have the capacity to provide more support for public education – BUT DON’T! Both states are also “regressive” – allocating systematically less funding per pupil to higher need districts, with Louisiana close to a flat distribution. And both are generally low spending despite their capacity to do better.

Improving state data systems – linking those data to teacher preparation institutions in order to impose sanctions on those institutions – banning teachers from obtaining tenure until they can achieve 3 consecutive years of positive value-added scores (error rates alone and year to year fluctuations may make this a low probability event) – and expanding charter schools are not likely to dig these states out of their current position. Doing so will require far greater investment than RttT could ever provide, especially in the case of Louisiana.  In fact, dramatically increasing job risk and career instability for individuals interested in entering teaching without also increasing the reward is likely to have significant negative effects. Unfortunately, it is about as likely that losing RttT will cause these states to reconsider their shortsighted reform agendas as it is that reading this blog post will get them to reconsider the persistent deprivation of their public education systems.
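To see why three consecutive years of positive value-added scores can be a low-probability event, consider a back-of-the-envelope simulation. The numbers here are purely illustrative assumptions, not estimates from any actual value-added system: each year’s measured score is treated as a hypothetical true teacher effect plus independent noise.

```python
import random

def prob_three_consecutive_positive(true_effect, noise_sd, trials=100_000, seed=1):
    """Chance that a teacher's measured value-added score is positive
    three years running, when each year's measure is a hypothetical
    true effect plus independent normal noise."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if all(true_effect + rng.gauss(0, noise_sd) > 0 for _ in range(3)):
            hits += 1
    return hits / trials

# Truly average teacher (true effect = 0): each year's sign is a coin
# flip, so the probability is about 0.5 ** 3 = 12.5%.
print(prob_three_consecutive_positive(0.0, 1.0))

# Even a genuinely above-average teacher (true effect half the noise SD)
# clears the three-year bar well under half the time (roughly a third).
print(prob_three_consecutive_positive(0.5, 1.0))
```

In other words, under these assumed noise levels, tenure would hinge largely on the luck of the draw rather than on true effectiveness.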

RttT Round 2 – Stuff that Doesn’t Matter!

Unlike many RttT enthusiasts, I have to say that I was pleased to see that Louisiana and Colorado were not among the winners. I’ve written extensively about Louisiana public schools in the past:

https://schoolfinance101.wordpress.com/2009/12/18/disg-race-to-the-top/

Although Colorado doesn’t look as bad as Louisiana on the indicators I often use on this blog, it ain’t pretty.

Using 2007-08 data, Colorado ranks:

  • 45th in effort (% gross state product spent on schools)
  • 39th in funding level overall
  • 32nd in funding fairness (has a system whereby higher poverty districts have systematically less state and local revenue per pupil than lower poverty districts)

Of course, these indicators – which I believe tell us a lot about state education systems – don’t really matter much when it comes to the big race, as I pointed out here:

https://schoolfinance101.wordpress.com/2010/03/29/and-the-rttt-winners-are/

Thankfully, while these indicators of actual effort to finance state school systems and participation rates in those systems didn’t matter in Round 2 either, the picks for Round 2 winners are somewhat – though not entirely – less offensive. I’ve highlighted in yellow with red type any cases where a Round 1 or 2 winner comes in 40th or lower on an indicator – Bottom 10. I’ve indicated in green with blue type, cases where states are in the Top 10. Sadly, there are far more bottom 10 cases than top 10 cases.

I would consider EFFORT and FAIRNESS to be the two key indicators here over which states have greatest control. A poor state could put up significant effort and still not raise significant funding (Mississippi). The only Round 2 winner state with no “bad” marks and many good ones is Massachusetts. Massachusetts scores well on fairness and overall funding level. Tennessee, from Round 1, is simply a disgrace! North Carolina is perhaps the weakest link in Round 2, along with Florida, which ranks poorly but avoids the bottom ten on any measure, and Hawaii, which makes the bottom 10 on a measure less within the control of the state – coverage. But Hawaii has inflicted significant damage on its already struggling public schooling system in recent years.

And here are a few interesting two-dimensional views of RttT Round 1 and Round 2 states. First, here’s a two-dimensional view of educational effort and spending level – spending for high poverty districts. The two are reasonably related. Effort explains about half of the variation in spending levels. States like North Carolina and Tennessee are low on effort and low on spending. States like Massachusetts are relatively high on spending, but average on effort.  Rhode Island, Maryland, New York, and Ohio are above average on spending and effort. But spending level doesn’t guarantee that it’s spent – or distributed – fairly across wealthier and poorer districts.

Here’s a look at “fairness” and spending level. New York is high on spending level, but not so good on fairness. In New York State, wealthy districts in Westchester County and Long Island outspend much of the rest of the nation. But, poorer districts including New York City are largely left out, spending significantly less than the affluent suburbs.  Then there are those wonderful states where higher poverty districts have slightly higher revenue per pupil than lower poverty ones, but for the most part – everyone is similarly deprived. These are the “you get nothing!” (reference to Willy Wonka in previous post) states and Tennessee tops that list! Even more depressing are the states where “you get nothing” in general, and you get less if you are poor. Those states include RttT Round 2 standout North Carolina … and Florida sits on the margin of this group. Massachusetts is the “good” standout in this figure.


And here’s effort and coverage – or the share of 6 to 16 year olds attending the public school system.  New York, Maryland and Ohio (on the margin) do poorly on coverage, but have reasonable overall effort. Delaware is the real outlier here… with very low effort and very, very low coverage. Thankfully, none in Round 2 can match Delaware!

Finally, here is the state and local revenue predicted level for high poverty districts, and NAEP mean 2009, grade 4 reading and math scores (combined).  It’s always fun to throw the outcome data in there. And in this case, the RttT Round 1 and Round 2 winners are distributed across the range. Again, Tennessee from Round 1 is the biggest “bad” outlier, but one could say that Massachusetts from Round 2 is a positive counterbalance. Clearly, the demography and economy of these two states differs significantly. My complaint with Tennessee is not that it performs poorly partly because it has a large, low-income population. Rather, my problem with Tennessee, as I’ve noted many previous times is that TN puts up little effort and spends little, and barely spends even that paltry amount equitably. In addition, as I’ve discussed previously, TN has consistently had the lowest outcome standards.

For more on Rhode Island school funding, see: https://schoolfinance101.wordpress.com/2010/07/01/the-gist-twists-rhode-island-school-finance/

For more on Hawaii, see: https://schoolfinance101.wordpress.com/2009/11/06/hawaiis-funding-mess-my-thoughts-on-why/

As I noted on my previous post regarding Round 1 winners:

So then, who cares? Or why should we? Many have criticized me for raising these issues, arguing “that’s not the point of RttT. It’s (RttT) not about equity or adequacy of funding, or how many kids get that funding. That’s old school – stuff of the past – get over it! This… This is about INNOVATION! And RttT is based on the ‘best’ measures of states’ effort to innovate… to make change… to reach the top!”

My response is that the above indicators measure Essential Pre-Conditions! One cannot expect successful innovation without first meeting these essential preconditions. If you want to buy the “business-minded” rhetoric of innovation, which I wrote about here, you also need to buy into the reality that the way in which businesses achieve innovation also involves investment in both R&D and production (coupled with monitoring production quality). You can have all of the R&D and quality monitoring systems in the world, but if you go cheap on production and make a crappy product – you haven’t gotten very far. On average, it does cost more to produce higher quality products.

This also relates to my post on common standards and the capacity to achieve them. It’s great to set high standards, but if you don’t allocate the resources to achieve those standards, you haven’t gotten very far! It costs more to achieve high standards than low ones. Tennessee provides a striking example in the maps from this post! (Their low spending seems generally sufficient to achieve their even lower outcome standards!)

That in mind, should states automatically be disqualified from RttT for doing so poorly on these Essential Preconditions? Perhaps not. After all, these are states which may need to race to the top more than others (assuming the proposed RttT strategies actually have anything to do with improving schools). But, for states doing so poorly on key indicators like effort and overall resources, or even the share of kids using the public school system, those states should at least have to explain themselves – and show how they will do their part to rectify these concerns.

For a video version of my comments on the big race, see:


New from the Center on Inventing Research Findings

The other day, the Center on Reinventing Public Education (CRPE) at University of Washington released a bold new study claiming that Washington school districts underpay Math and Science teachers relative to other teachers – which is clearly an abomination in a state that is home to high-tech industries like Boeing and Microsoft.

The study consisted of looking at the average salaries of math and science teachers and other teachers in several large Washington State school districts and showing that in most, the average for math and science teachers is lower than for other teachers. As it turns out, the average experience of math and science teachers is lower and far more of them are in their first five years. So, it’s mainly about the experience differential. The authors infer from this that turnover of math and science teachers must be higher, but never actually test this assumption. They next infer that this turnover must be a function of having less competitive salaries – relative to what they could earn outside of teaching.

The study never calculates relative turnover of math and science versus other teachers. Rather, the study implies that lower average experience levels must be indicative of higher turnover. The only follow-up analysis on this point is to show that math and science teachers, in addition to being less experienced, are also younger. Wow! That doesn’t validate the turnover claim though, which may be true… but no validation here.
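A direct turnover check is not hard to sketch. Assuming access to linked personnel records across years (the teacher IDs and subject labels below are invented for illustration; real data would come from state personnel files), one could compute exit rates by subject rather than inferring them from average experience:

```python
# Hypothetical linked personnel records (IDs and subjects invented).
year1 = {"t1": "math", "t2": "science", "t3": "english", "t4": "math",
         "t5": "history", "t6": "english"}
year2_ids = {"t2", "t3", "t4", "t5", "t6"}  # still teaching in year 2

def turnover_rate(subjects):
    """Share of year-1 teachers in the given subjects who do not
    appear in year 2 -- the direct calculation the study skips."""
    staff = [t for t, s in year1.items() if s in subjects]
    leavers = [t for t in staff if t not in year2_ids]
    return len(leavers) / len(staff)

print(turnover_rate({"math", "science"}))     # 1 of 3 left
print(turnover_rate({"english", "history"}))  # 0 of 3 left
```

Only a comparison like this, at scale, would actually validate the claim that math and science teachers leave at higher rates.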

This is a silly study to begin with, but check out the not-so-subtle difference between the press release and the study itself.

The Press Release
http://www.crpe.org/cs/crpe/view/news/111

The analysis finds that in twenty-five of the thirty largest districts, math and science teachers had fewer years of teaching experience due to higher turnover—an indication that labor market forces do indeed vary with subject matter expertise. The subject-neutral salary schedule works to ignore these differences.

The Study
http://www.crpe.org/cs/crpe/download/csr_files/rr_crpe_STEM_Aug10.pdf

That said, the lower teacher experience levels are indicative of greater turnover among the math and science teaching ranks, lending support to the hypothesis that math and science teachers may have access to more compelling non-teaching opportunities than do their peers. (p. 5)

Both are a stretch, given the thin analysis, but the press release declares outright that turnover is the issue, while the study merely infers without ever testing or validating.

The study goes on to indict paying teachers more for years of experience – (because we all know that experience doesn’t matter?) – and argues that differential pay by teaching field is the answer. This is an absurd false dichotomy. Even if it is reasonable to differentiate pay by teaching field, that does not mean that it is unreasonable to differentiate by experience, or that taking dollars away from experience-based pay is the only way to differentiate by field.

I happen to agree that there exist significant problems with Washington’s statewide teacher salary schedule, and that among other things, math and science teachers in Washington State are disadvantaged on the broader labor market. But the CRPE study does nothing to advance this argument.

Previous work by Lori Taylor, of Texas A&M does:

Report on Taylor Study:

http://www.wsipp.wa.gov/rptfiles/08-12-2201.pdf

Taylor Study:

http://www.leg.wa.gov/JointCommittees/BEF/Documents/Mtg11-10_11-08/WAWagesDraftRpt.pdf

The CRPE study goes further to say that the findings indicate that school districts haven’t taken seriously a state policy initiative to increase investment in math and science teaching. So let’s say that the bill to which the CRPE press release refers – House Bill 2621 – really did stimulate districts to step up their efforts to hire more math and science teachers. What would likely happen to math and science teacher average salaries? Well, many new math and science teachers would enter the system. That would alter the experience distribution of math and science teachers – they would likely become less experienced on average – and hence their average salaries would decline and be lower than average salaries in other fields not stimulated by similar initiatives.
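The composition effect described above is easy to demonstrate. In this sketch (with an invented salary schedule and invented experience profiles, not Washington data), a hiring push that adds brand-new math/science teachers lowers the math/science average salary even though the schedule never changes and no individual is paid less:

```python
# A hypothetical single salary schedule: pay depends only on experience.
def salary(years_experience):
    return 40_000 + 1_500 * years_experience

# Suppose math/science and other teachers start with identical
# experience profiles (illustrative numbers only).
other = [2, 5, 8, 12, 15, 20] * 10
math_sci = list(other)

def avg_salary(teachers):
    return sum(salary(y) for y in teachers) / len(teachers)

assert avg_salary(math_sci) == avg_salary(other)  # same schedule, same pay

# Now a hiring push adds a wave of brand-new math/science teachers.
math_sci += [0] * 30

# The schedule hasn't changed and no one took a pay cut, but the
# math/science *average* now falls below the average for other fields.
print(avg_salary(math_sci) < avg_salary(other))
```

That is, a lower average salary for math and science teachers could be a sign that the hiring initiative worked, not that districts ignored it.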

When I get a chance, I’ll try to play around with my Washington teacher data set and post some follow-up analyses.

Kevin Welner and I point to similar misrepresentations of findings from several reports from this same center in this article on within and between-district financial disparities:

Baker, B. D., & Welner, K. G. (2010). “Premature celebrations: The persistence of interdistrict funding disparities” Educational Policy Analysis Archives, 18(9). Retrieved [date] from http://epaa.asu.edu/ojs/article/view/718

And now, for some fun follow-up figures:

These figures use individual teacher level data from the State of Washington. I include all teachers holding “secondary” assignments and identify teachers certified to teach biology, chemistry, physics, general science and math (and all subcategories) using the certification record files on the same teachers. Note that some teachers in the data set hold multiple assignments, so the total number of cases in these graphs is not an exact match for the total number of individual teachers. I haven’t asked for Washington teacher data for a few years, so these only go up to 2006-07. Unlike the CRPE report, which cherry-picks 30 districts, I use the whole state. If I get a chance, I’ll play with some other cuts at the data. These data don’t coincide at all with the CRPE “findings.”

Here are the experience differences:

Here are the salary differences, on average, which coincide with the experience differences:

Now, here are the total numbers of teachers, and the apparent decline in the share that are math/science certified over this time period. Math/science teacher counts were relatively flat, while other fields grew.

Finally, here’s a portion of the regression model of certified base salaries, where I control for degree level, experience, year, hours per day and days per year, all of which influence salaries. Interestingly, this regression shows that math and science teachers, holding all that other stuff constant, made about $380 more than non-math/science teachers, even under the fixed salary schedule.
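As a rough sketch of that kind of regression – run here on simulated data with a $380 field differential deliberately built in, not on the actual Washington records – ordinary least squares recovers the math/science coefficient while holding experience and degree level constant:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Simulated teacher records (illustrative, not the Washington data).
experience = rng.integers(0, 30, n)
masters = rng.integers(0, 2, n)    # 1 = holds a master's degree
math_sci = rng.integers(0, 2, n)   # 1 = math/science certified

# Build salaries with a $380 field differential deliberately baked in.
salary = (40_000 + 1_400 * experience + 3_000 * masters
          + 380 * math_sci + rng.normal(0, 500, n))

# OLS with an intercept: the math/science coefficient is the pay
# difference holding experience and degree level constant.
X = np.column_stack([np.ones(n), experience, masters, math_sci])
coef, *_ = np.linalg.lstsq(X, salary, rcond=None)
print(coef[3])  # recovers a value near the built-in 380
```

The point is that conditioning on experience and degree level is exactly what the simple averages in the CRPE report fail to do.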


LA Times Study: Asian math teachers better than Black ones

The big news over the weekend involved the LA Times posting of value-added ratings of LA public school teachers.

Here’s how the Times spun their methodology:

Seeking to shed light on the problem, The Times obtained seven years of math and English test scores from the Los Angeles Unified School District and used the information to estimate the effectiveness of L.A. teachers — something the district could do but has not.

The Times used a statistical approach known as value-added analysis, which rates teachers based on their students’ progress on standardized tests from year to year. Each student’s performance is compared with his or her own in past years, which largely controls for outside influences often blamed for academic failure: poverty, prior learning and other factors.

This spin immediately concerned me, because it appears to assume that simply using a student’s prior score erases, or controls for, any and all differences among students by family backgrounds as well as classroom level differences – who attends school with whom.

Thankfully (thanks to the immediate investigative work of Sherman Dorn), the analysis was at least marginally better than that and conducted by a very technically proficient researcher at RAND named Richard Buddin. Here’s his technical report:

The problem is that even someone as good as Buddin can only work with the data he has. And there are at least 3 major shortcomings of the data that Buddin appeared to have available for his value-added models. I’m setting aside here the potential quality of the achievement measures themselves. Calculating (estimating) a teacher’s effect on her students’ learning – and specifically, identifying differences across teachers where students are not randomly assigned (with the same class size, comparable peer group, same air quality, lighting, materials, supplies, etc.) – requires that we do a pretty damn good job of accounting for the measurable differences across the children assigned to teachers. This is especially true if our plan is to post names on the wall (or web)!

Here’s my quick read, short list of shortcomings to Buddin’s data, that I would suspect, lead to significant problems in precisely determining differences in teacher quality across students:

  1. While Buddin’s analysis includes student characteristics that may (and in fact appear to) influence student gains, Buddin – likely due to data limitations – includes only a simple classification variable for whether a student is a Title I student or not, and a simple classification variable for whether a student is limited in their English proficiency. These measures are woefully insufficient for a model being used to label teachers on a website as good or bad. Buddin notes that 97% of children in the lowest performing schools are poor, and 55% in higher performing schools are poor. Identifying children simply as poor or not poor misses entirely the variation among the poor to very poor children in LA public schools – which is most of the variation in family background in LA public schools. That is, the estimated model does not control at all for one teacher teaching a class of children who barely qualify for Title I programs, versus a teacher with a classroom of children of destitute homeless families, or multigenerational poverty. I suspect Buddin, himself, would have liked to have had more detailed information. But, you can only use what you’ve got. When you do, however, you need to be very clear about the shortcomings. Again, most kids in LA public schools are poor and the gradients of poverty are substantial. Those gradients are neglected entirely.  Further, the model includes no “classroom” related factors such as class size, student peer group composition (either by a Hoxby approach of average ability level of peer group, or considering racial composition of peer group as done by Hanushek and Rivkin. Then again, it’s nearly if not entirely impossible to fully correct for classroom level factors in these models.).
  2. It would appear that Buddin’s analysis uses annual testing data, not fall-spring assessments. This means that the year-to-year gains interpreted as “teacher effects” include summer learning and/or summer learning lag. That is, we are assigning blame, or praise to teachers based on what kids learned, or lost over the summer. If this is true of the models, this is deeply problematic. Okay, you say, but Buddin accounted for whether a student was a Title I student and summer opportunities are highly associated with Poverty Status. But, as I note above, this very crude indicator is far from sufficient to differentiate across most LA public school students.
  3. Finally, researchers like Jesse Rothstein, among others have suggested that having multiple years of prior scores on students can significantly reduce the influence of non-random assignment of students to teachers on the ratings of teachers. Rothstein speaks of using 3-years of lagged scores (http://gsppi.berkeley.edu/faculty/jrothstein/published/rothstein_vam2.pdf) so as to sufficiently characterize the learning trajectories of students entering any given teacher’s class. It does not appear that Buddin’s analysis includes multiple lagged scores.

So then what are some possible effects of these problems, where might we notice them, and why might they be problematic?

One important effect, which I’ve blogged about previously, is that the value-added teacher ratings could be substantially biased by the non-random sorting of students – or in more human terms – teachers of children having characteristics not addressed by the models could be unfairly penalized, or for that matter, unfairly benefited.
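Here is a stylized illustration of that penalty (invented numbers, not Buddin’s data): two teachers with identical true effectiveness, where one teaches children near the Title I threshold and the other teaches children in much deeper poverty. Because a binary poverty flag codes both classes identically, the model cannot separate depth of poverty from teacher effect:

```python
import numpy as np

rng = np.random.default_rng(42)

# Two hypothetical teachers with IDENTICAL true effectiveness (zero).
# Teacher A's class sits near the Title I eligibility line; teacher
# B's class is drawn from much deeper poverty. A binary poor/not-poor
# flag codes every student in both classes the same way, so it
# "controls" for nothing within this comparison.
poverty_depth_a = rng.normal(0.0, 1.0, 30)
poverty_depth_b = rng.normal(-2.0, 1.0, 30)

# Student gains depend on unobserved depth of poverty plus test noise.
gain_a = 0.3 * poverty_depth_a + rng.normal(0, 0.5, 30)
gain_b = 0.3 * poverty_depth_b + rng.normal(0, 0.5, 30)

# With the crude flag absorbed into the intercept, each teacher's
# "value-added" estimate is just her class mean gain.
va_a, va_b = gain_a.mean(), gain_b.mean()
print(va_a > va_b)  # teacher B looks worse despite equal true effect
```

Under these assumptions, teacher B is systematically penalized for her roster, not her teaching.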

Buddin is kind enough in his technical paper to provide for us, various teacher characteristics and student characteristics that are associated with the teacher value-added effects – that is, what kinds of teachers are good, and which ones are more likely to suck? Buddin shows some of the usual suspects, like the fact that novice (first 3 years) teachers tended to have lower average value added scores. Now, this might be reasonable if we also knew that novice teachers weren’t necessarily clustered with the poorest of students in the district. But, we don’t know that.

Strangely, Buddin also shows us that the number of gifted children a teacher has affects their value-added estimate – The more gifted children you have, the better teacher you are??? That seems a bit problematic, and raises the question as to why “gifted” was not used as a control measure in the value-added ratings? Statistically, this could be problematic if giftedness was defined by the outcome measure – test scores (making it endogenous). Nonetheless, the finding that having more gifted children is associated with the teacher effectiveness rating raises at least some concern over that pesky little non-random assignment issue.

Now here’s the fun, and most problematic part:

Buddin finds that black teachers have lower value-added scores for both ELA and MATH. Further, these are some of the largest negative effects in the second level analysis – especially for MATH. The interpretation here (for parent readers of the LA Times web site) is that having a black teacher for math is worse than having a novice teacher. In fact, it’s the worst possible thing! Having a black teacher for ELA is comparable to having a novice teacher.

Buddin also finds that having more black students in your class is negatively associated with a teacher’s value-added scores, but writes off the effect as small. Teachers of black students in LA are simply worse? There is NO discussion of the potentially significant overlap between black teachers, novice teachers, and serving black students concentrated in black schools (as addressed by Hanushek and Rivkin in the link above).

By contrast, Buddin finds that having an Asian teacher is much, much better for MATH. In fact, Asian teachers are as much better (than white teachers) for math as black teachers are worse! Parents – go find yourself an Asian math teacher in LA? Also, having more Asian students in your class is associated with higher teacher ratings for Math. That is, you’re a better math teacher if you’ve got more Asian students, and you’re a really good math teacher if you’re Asian and have more Asian students?????

Talk about some nifty statistical stereotyping.

It makes me wonder if there might also be some racial disparity in the “gifted” classification variable, with more Asian students and fewer black students district-wide being classified as “gifted.”

IS ANYONE SEEING THE PROBLEM HERE? Should we really be considering using this information to either guide parent selection of teachers or to decide which teachers get fired?

I discussed the link between non-random assignment and racially disparate effects previously here:

https://schoolfinance101.wordpress.com/2010/06/02/pondering-legal-implications-of-value-added-teacher-evaluation/

Indeed there may be some substantive differences in the average academic (undergraduate & high school) preparation in math of black and Asian teachers in LA. And these differences may translate into real differences in the effectiveness of math teaching. But sadly, we’re not having that conversation here. Rather, the LA Times is putting out a database, built on insufficient underlying model parameters, that produces these potentially seriously biased results.

While some of these statistically significant effects might be “small” across the entire population of teachers in LA, the likelihood that these “biases” significantly affect specific individual teacher’s value-added ratings is much greater – and that’s what’s so offensive about the use of this information by the LA Times. The “best possible,” still questionable, models estimated are not being used to draw simple, aggregate conclusions about the degree of variance across schools and classrooms, but rather they are being used to label individual cases from a large data set as “good” or “bad.” That is entirely inappropriate!

Note: On Kane and Staiger versus Rothstein and non-random assignment

Finally, a comment on references to two different studies on the influence of non-random assignment. Those wishing to write off the problems of non-random assignment typically refer to Kane and Staiger’s analysis using a relatively small, randomized sample. Those wishing to raise concerns over non-random assignment typically refer to Jesse Rothstein’s work. Eric Hanushek, in an exceptional overview article on Value Added assessment summarizes these two articles, and his own work as follows:

An alternative approach of Kane and Staiger (2008) of using estimates from a random assignment of teachers to classrooms finds little bias in traditional estimation, although the possible uniqueness of the sample and the limitations of the specification test suggest care in interpretation of the results.

A compelling part of the analysis in Rothstein (2010) is the development of falsification tests, where future teachers are shown to have significant effects on current achievement. Although this could be driven in part by subsequent year classroom placement based on current achievement, the analysis suggests the presence of additional unobserved differences.

In related work, Hanushek and Rivkin (2010) use alternative, albeit imperfect, methods for judging which schools systematically sort students in a large Texas district. In the “sorted” samples, where random classroom assignment is rejected, this falsification test performs like that in North Carolina, but this is not the case in the remaining “unsorted” sample where random assignment is not rejected.

http://edpro.stanford.edu/hanushek/admin/pages/files/uploads/HanushekRivkin%20AEA2010.CALDER.pdf

Newsflash: The upper half is better than average!

I’ve seen many versions of this argument in the past year, but this one comes from Kevin Carey in response to the Civil Rights Framework which criticized the current administration’s overemphasis on Charter Schools as lacking evidentiary support. Carey responds that the Civil Rights Framework selectively interprets the research on Charter schools, noting:

Here’s the problem: the contention that charters have “little or no evidentiary support” rests on studies finding that the average performance of all charters is generally indistinguishable from the average regular public school. At the same time, reasonable people acknowledge that the best charter schools–let’s call them “high-quality” charter schools–are really good, and there’s plenty of research to support this.

http://www.quickanded.com/2010/08/evidence-and-the-civil-rights-group-framework.html

I recall a similar comment in the media a few months back, by a researcher, regarding a national charter schools study – something to the effect of: charter schools on average performed similarly to traditional public schools, but if we look at the upper half of the charter schools in the sample, they substantially outperformed the average public school serving similar students.

These statements have been driving me crazy for months now. Here’s why –

To put it in really simple terms:

THE UPPER HALF OF ALL SCHOOLS OUTPERFORM THE AVERAGE OF ALL SCHOOLS!!!!!

or … Good schools outperform average ones. Really?

Why should that be any different for charter schools (assuming a similar distribution of performance) that have an average performance similar to that of all schools?
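To see just how empty the "upper half" claim is, here's a quick simulation – purely illustrative, with an arbitrary score scale – in which charters and traditional publics are drawn from the very same distribution. The upper half of either sector still handily beats the average of both:

```python
import random
import statistics

random.seed(0)

# Simulate "performance" for 1,000 traditional public schools and
# 1,000 charter schools drawn from the SAME distribution, so the two
# sectors have (nearly) identical averages by construction.
publics = [random.gauss(650, 30) for _ in range(1000)]
charters = [random.gauss(650, 30) for _ in range(1000)]

# The "upper half" of charters, per the pundit argument.
upper_half = sorted(charters)[len(charters) // 2:]

print(f"All charters:        {statistics.mean(charters):.1f}")
print(f"All publics:         {statistics.mean(publics):.1f}")
print(f"Upper-half charters: {statistics.mean(upper_half):.1f}")
# The upper half beats the overall average in ANY distribution --
# exactly as the upper half of the publics beats the public average.
```

No policy content whatsoever is required to produce the "high-quality charters outperform" result; it falls out of the arithmetic.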

This is absurd logic for promoting charter schools as some sort of unified reform strategy – saying we want to replicate the best charter schools (and not that other half that don't do so well).

Yes, one can point to specific analyses of specific charter models adopted in specific locations and identify them as particularly successful. And, we might learn something from these models which might be used in new charter schools or might even be used in traditional public schools.

But the idea that “successful charters” (the upper half) are evidence that charters are “successful” is just plain silly.

=======

Let’s throw a few visuals and numbers on my whining session above. Below are some snapshots of New York City charter schools. First, let’s take a quick look at the mismatched demographics of New York City charters compared to same grade level traditional public schools. Here are the free lunch rates. I’ve tended to focus on free lunch rates rather than free and reduced, because free lunch falls under a lower poverty threshold, and, as my previous analyses have shown, while charters often serve similar numbers of combined free and reduced lunch children, they tend to serve the less poor among the poor (larger reduced shares, smaller free shares). This graph confirms my previous findings and is based on data corroborated from both the NCES Common Core Public School Universe Data from 2007-08 and the New York State Education Department School Report Cards. Note also that the biggest differences are at the elementary level, which covers most of the charter schools.

Second, let’s look at the rates of children who are limited in their English Language proficiency. Here, the differences at the elementary level are huge! Charters in NYC simply don’t serve limited English proficient children!

Now for a few oversimplified scatterplots comparing charter school performance outcomes to those of traditional public schools – all “Regular Schools” by the school type classification in the NCES Common Core – in the same borough. I’ve focused on Brooklyn and the Bronx here because of the wide variations in student population composition across Manhattan schools.

First note that none of the charters in the Bronx which had 8th grade 2009 test scores available had a free lunch rate over 80%, while several traditional public schools in the Bronx did. This chart shows the relationship between % scoring level 4 (top level) and % qualifying for free lunch. Charters are named and shown in red. Traditional publics are hollow circles. Both groups scatter! In fact, there are a few traditional publics at the top (which may be classified as “regular schools” but may be far from regular). Among Charters, Bronx Prep, KIPP Academy and Icahn 1 do rather well. Hyde Leadership (higher poverty than the other charters) and Harriet Tubman – not so well. But there are plenty of traditional public schools in the Bronx that appear to do well, and others not so well.

Here are the Brooklyn Charters and traditional public schools on the same outcome measure – percent scoring level 4 or higher on 8th grade math.  Here, all but Brooklyn Excelsior Charter have much lower poverty rates and simply aren’t comparable to most Brooklyn traditional public schools. And don’t forget, there are also likely very large differences in rates of children with other needs – like limited English proficiency. Williamsburg Collegiate and Brooklyn Excelsior appear to be doing quite well. But then again, Williamsburg Collegiate starts at 5th grade, so their success is likely at least partly a function of feeder schools.  There are plenty of “high flying” traditional public schools in this picture as well… and likely a few unique explanations as to why they fly so high. There are also plenty of low-flying charters. Here are the Bronx Charters in 2009, on 5th grade math. Again, the charters generally have much lower free lunch rates than the traditional public schools. In this figure, most of the traditional public schools have free lunch rates over 80% while none of the charters do. And again, charter performance, like traditional public school performance is scattered – some low – some high.

And finally, those Brooklyn charters on 5th grade math performance: low poverty and scattered (except Brooklyn Excelsior, which is higher poverty and seemingly doin’ pretty well).

A few new ones – here are the Bronx and Brooklyn charter 5th grade performance levels based on a regression model controlling for stability rates, free lunch, ELL concentrations, and year of data (using 2008 and 2009), comparing specifically against other schools in the same borough. The performance levels are represented by the residuals of the regression model. Above “0” on the vertical axis is better than predicted – better than average at given characteristics – and below “0” is below expected at given characteristics.

In these graphs, most of the highest high flyers are non-charters. Charters are split above and below the “0” line, as one might expect.
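For readers curious about the mechanics, the residual approach above can be sketched roughly as follows. This is a toy version using simulated data in place of the actual NYC files; every predictor value and coefficient below is made up:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200  # hypothetical schools in one borough

# Hypothetical predictors mirroring the post's controls:
free_lunch = rng.uniform(0.2, 0.95, n)  # share qualifying for free lunch
ell        = rng.uniform(0.0, 0.30, n)  # share limited English proficient
stability  = rng.uniform(0.7, 1.0, n)   # student stability rate
year_2009  = rng.integers(0, 2, n)      # year dummy (2008 vs. 2009)

# Simulated outcome: % scoring level 4, declining with student need.
outcome = (60 - 45 * free_lunch - 30 * ell + 10 * stability
           + 2 * year_2009 + rng.normal(0, 5, n))

# Ordinary least squares with an intercept column.
X = np.column_stack([np.ones(n), free_lunch, ell, stability, year_2009])
beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)

# Residuals: above 0 = better than predicted at given characteristics.
residuals = outcome - X @ beta
print(f"{(residuals > 0).mean():.0%} of schools beat their prediction")
```

The residual, not the raw score, is what the vertical axes in the graphs above represent – which is why "high flying" depends on who a school serves, not just what it scores.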

Anyway, on this cursory walk-through of the relative demographics and relative position of charters in the performance mix, it continues to elude me why we should consider “charters” a specific reform strategy, and one that can raise urban school districts from their dreadful depths of failure. Had I not indicated which schools were charters in these graphs, I wonder how many “reformy” types could have picked out the dots that were charters. I suspect, given a blind sample, they would select the dots that fall furthest out of line in the upper right hand corner of each graph – the highest performing high poverty schools. In three of the four graphs above, they’d have picked non-charters first, and would have done so on the misguided perceptions that a) charters are the high flyers in any mix of schools and b) charters serve very high poverty populations. The reality is that charters are as scattered as traditional schools, and in general in NYC, they are serving lower need populations.

=======

A little more fun here. Here are schools in the area around the Harlem Children’s Zone. First, here are the maps of free lunch shares and LEP shares for charter and traditional public schools.  Green dots have lower rates of LEP or free lunch. Stars indicate charters. Names are adjacent to schools. Note that most of the charters are lower poverty and much lower LEP than surrounding schools.

And here are the residuals of the same regression model used above, applied in this case to Grade 5 Math Mean Scale Scores. Red dots are schools that perform less well than expected and green dots are those that perform much better than expected. Note that charters are a mixed bag, and the HCZ charter performs particularly poorly – which caught me off guard.

What does the education level of 25 to 34 year olds really mean?

About a week ago, The College Board released their latest status report in their college completion series.

http://completionagenda.collegeboard.org/sites/default/files/reports_pdf/Progress_Executive_Summary.pdf

The parts of the report that seemed to grab the most media attention were those related to a) comparing the US to other countries on the percent of 25 to 34 year olds who hold an associates degree or higher and b) comparing US states to one another on the same measure.

Newspapers across the country ran with this stuff and Twitter was buzzing with punditry on what these indicators meant about the quality of K-12 public schools in each state. Our public schools must be failing us if we’re only 24th on the education level of our younger adults – one Missouri pundit tweeted (related news story here).

The first thing that caught my eye was that Washington, DC was first in the rankings of percent of 25 to 34 year olds with an associates degree or higher.  Of course it is. Washington DC is a magnet for recent college graduates. Clearly, this particular indicator says as much about the employment options for a young, college educated workforce as it does about a state’s own education system. This indicator also tells us something about the education level and expectations of the previous generation – parents of these 25 to 34 year olds, whether in the same state or elsewhere. And, this indicator may also tell us something about the extent to which a state imports or exports college students.

So, I decided to play with some data…’cuz that’s what I like to do… just to see how these rankings might change if I tweaked them a bit.

I decided it might be fun to look at the differences in the rates of college educated adults – % of 25 to 34 year olds with a bachelors degree or higher – across states in three different ways:

  1. percent of 25 to 34 year old current adult residents who hold a BA or higher
  2. percent of 25 to 34 year old adult current residents who were born in the state who hold a BA or higher
  3. percent of 25 to 34 year old adults who were born in the state, whether they continue to reside there or not, who hold a BA or higher
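As a sketch of how these three measures differ mechanically, here's a toy computation on made-up microdata. The column names and values are hypothetical stand-ins, not actual IPUMS variables:

```python
import pandas as pd

# Toy stand-in for ACS/IPUMS microdata on 25 to 34 year olds.
people = pd.DataFrame({
    "state_born":  ["DC", "VA", "MD", "VA", "VA", "DC", "MD", "MD"],
    "state_lives": ["DC", "DC", "DC", "VA", "VA", "VA", "MD", "MD"],
    "has_ba":      [0,    1,    1,    1,    0,    1,    1,    0],
})

# 1) % BA+ among current residents of each state
residents = people.groupby("state_lives")["has_ba"].mean()

# 2) % BA+ among residents who were also born in the state
stayers = people[people["state_born"] == people["state_lives"]]
native_residents = stayers.groupby("state_lives")["has_ba"].mean()

# 3) % BA+ among everyone born in the state, wherever they live now
natives = people.groupby("state_born")["has_ba"].mean()

# A crude net "import" measure: resident rate minus native-resident rate
net_import = (residents - native_residents).sort_values(ascending=False)
print(net_import)
```

In this toy frame, the "DC" rows are constructed so that educated in-migrants inflate the resident measure well above the native-resident measure – the pattern discussed below.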

It would seem to me that the second of these measures is most on target – the percent of the native population that holds a certain level of education. Needless to say, when I focus on the second measure, the rankings change somewhat. Here it is:

Table 1

Education Level (% BA or Higher) of the 25 to 34 Year Old Population by State

U.S. Census – American Community Survey 2006 to 2008

Data Source: Steven Ruggles, J. Trent Alexander, Katie Genadek, Ronald Goeken, Matthew B. Schroeder, and Matthew Sobek. Integrated Public Use Microdata Series: Version 5.0 [Machine-readable database]. Minneapolis: University of Minnesota, 2010.

Washington DC, which ranks 1st on resident college graduates, drops to 24th on native college graduates. MA, NY and NJ, which were 2nd, 5th and 4th, are now 1st, 2nd and 3rd. Virginia goes from 9th to 26th and Maryland from 6th to 15th when only natives are considered – likely a DC effect as well. NH also drops quite a bit. Wisconsin rises quite a bit. Overall, there are some pretty big changes here.

Here are a few scatterplots – ‘cuz nothin’ is more fun than a good scatterplot.

This one shows on the horizontal axis, the share of 25 to 34 year old residents who are natives (born there).  On the vertical axis is the % BA or higher for all current residents. There’s DC, way above the rest on the vertical axis and pretty far to the left on the horizontal – that is, not too many 25 to 34 year olds who live there, were born there. The native share is only lower in Nevada. But Nevada doesn’t seem to be importing college grads!

This one shows the relationship between the % BA or higher among all current residents (horizontal axis) and % BA or higher among native residents (born and live there). Clearly there’s a pretty strong relationship between the two. But, there is enough variation to really change some rankings. Mass is high either way.  The big movers are those identified above, like Maryland, Virginia and New Hampshire, which have much more educated resident young adult populations than native resident young adult populations.

This one puts the “native share” again on the horizontal axis. On the vertical axis is a measure of the difference in the education level of all current residents (25 to 34) and native current residents. It’s somewhat of a net “import” measure: how much more educated is the current resident population than the born-and-raised population? Note this is a net difference, including the fact that some individuals who were born and raised in a state might have left and become more educated. Big net importers here appear to be Maryland, Virginia, Vermont and New Hampshire (Vermont surprised me a bit, since there isn’t a whole lot of industry to attract college grads, but Burlington does always make those “great places to live” lists; it might also be a small sample size issue with the Vermont data). At the other end of the picture are Nebraska and Nevada, which don’t appear to be importing a more educated adult population. Strangely, all but Nebraska are in the positive zone on this measure. (Note that this measure does not have to be net-zero across states, because between-state migration is not the only type of migration occurring: international migration may also affect these differences. The differences may also reflect the fact that more educated individuals tend to be more mobile. Just pondering.)

In this one, we have the “native share” again on the horizontal axis, and the difference between the education level of those born in the state – whether they stayed or not – and those who reside in the state. This is somewhat of a net “export” measure. In this case, it would appear that Wyoming is the big loser. So too are Nebraska and Wisconsin. This is the one interesting piece about Wyoming. In the rankings above, Wyoming doesn’t move much. It’s 47th in % BA for current residents and 48th for native residents. But, Wyoming does much better on the education level of those born in the state, whether they stay or not – which apparently they don’t if they have a BA or higher.

So what does all of this mean? Probably not much. These figures and additional analyses certainly tell a more nuanced story than the media buzz of last week. But, it’s hard to really link much of this back to the quality of states’ underlying elementary and secondary education systems. Far too many factors are in play here, and even tweaking this one factor – whether residents are native residents or not- has significant consequences for state rankings.

So much for attaching any simple, bold statement about [YOUR STATE HERE] to that huge, pull-out multi-color map in the College Board Report!


Rolling Dice: If I roll a “6” you’re fired!

Okay… picture this: I’m rolling dice, and each time I roll a “6” some loud-mouthed, tweet-happy pundit who just loves value-added assessment for teachers gets fired. Sound fair? It might happen to someone who sucks at their job… or it might be someone who is rather average. Doesn’t matter – they lost on the roll of the dice. A 1 in 6 chance. Not that bad. A 5 in 6 chance of keeping their job. Can’t you live with that?

This report was just released the other day from the National Center for Education Statistics:

http://ies.ed.gov/ncee/pubs/20104004/pdf/20104004.pdf

The report carries out a series of statistical tests to determine the identification “error” rates for “bad teachers” when using typical value added statistical methods. Here’s a synopsis of the findings from the report itself:

Type I and II error rates for comparing a teacher’s performance to the average are likely to be about 25 percent with three years of data and 35 percent with one year of data. Corresponding error rates for overall false positive and negative errors are 10 and 20 percent, respectively.

Where:

Type I error rate (α) is the probability that based on c years of data, the hypothesis test will find that a truly average teacher (such as Teacher 4) performed significantly worse than average. (p. 12)

So, that means there is about a 25% chance (if using three years of data) or a 35% chance (if using one year) that a teacher who is “average” would be identified as “significantly worse than average” and potentially be fired. So, what I really need are some 4-sided dice – I gave the pundits odds that are too good! Admittedly, this is the likelihood of identifying an “average” teacher as well below average; the likelihood of identifying an above average teacher as below average would be lower. Here’s the relevant definition of a “false positive” error rate from the study:

the false positive error rate, FPR(q), is the probability that a teacher (such as Teacher 5) whose true performance level is q SDs above average is falsely identified for special assistance. (p. 12)

From the first quote above, even this occurs 1 in 10 times (given three years of data, and 2 in 10 given only one year). And here’s the definition of a “false negative” error rate:

false negative error rate is the probability that the hypothesis test will fail to identify teachers (such as Teachers 1 and 2 in Figure 2.1) whose true performance is at least T SDs below average.

…which also occurs 1 in 10 times (given three years of data and 2 in 10 given only one year).
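To get an intuition for where error rates like these come from, here's a toy simulation. This is not the report's actual model – the signal and noise magnitudes below are assumptions chosen only to illustrate the mechanism by which noisy estimates flag teachers who are truly average or better, and why extra years of data help only partially:

```python
import random
import statistics

random.seed(1)

# Purely illustrative magnitudes (NOT the report's parameters):
SIGMA_TRUE = 1.0    # spread of true teacher effects
SIGMA_NOISE = 1.5   # single-year estimation noise
N = 20000           # simulated teachers

def flag_rate(years):
    """Share of truly average-or-better teachers whose estimated
    effect (averaged over `years` of noisy data) lands in the
    bottom quartile of all estimates."""
    true = [random.gauss(0, SIGMA_TRUE) for _ in range(N)]
    est = [t + statistics.mean(random.gauss(0, SIGMA_NOISE)
                               for _ in range(years))
           for t in true]
    cutoff = sorted(est)[N // 4]
    flagged = sum(1 for t, e in zip(true, est) if t >= 0 and e < cutoff)
    not_bad = sum(1 for t in true if t >= 0)
    return flagged / not_bad

print(f"1 year of data:  {flag_rate(1):.1%} of decent teachers flagged")
print(f"3 years of data: {flag_rate(3):.1%} of decent teachers flagged")
```

Averaging three years shrinks the noise by a factor of √3, so the false-flag rate falls – but it doesn't come close to zero, which is the report's basic point.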

These concerns are not new. In a previous post, I discuss various problems with using value added measures for identifying good and bad teachers, such as temporal instability: http://www.urban.org/UploadedPDF/1001266_stabilityofvalue.pdf.

The introduction of this new report notes:

Existing research has consistently found that teacher- and school-level averages of student test score gains can be unstable over time. Studies have found only moderate year-to-year correlations—ranging from 0.2 to 0.6—in the value-added estimates of individual teachers (McCaffrey et al. 2009; Goldhaber and Hansen 2008) or small to medium-sized school grade-level teams (Kane and Staiger 2002b). As a result, there are significant annual changes in teacher rankings based on value-added estimates.
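What do correlations in the 0.2 to 0.6 range imply for rankings? Here's a quick sketch with simulated, standardized estimates at the midpoint of that range (illustrative only):

```python
import math
import random

random.seed(2)

R = 0.4    # year-to-year correlation, mid-range of the cited 0.2-0.6
N = 10000  # simulated teachers

# Standardized estimates for two years with correlation R.
year1 = [random.gauss(0, 1) for _ in range(N)]
year2 = [R * y + math.sqrt(1 - R**2) * random.gauss(0, 1) for y in year1]

cut1 = sorted(year1)[N // 5]   # bottom-quintile cutoff, year 1
cut2 = sorted(year2)[N // 5]   # bottom-quintile cutoff, year 2

bottom1 = [i for i in range(N) if year1[i] < cut1]
escaped = sum(1 for i in bottom1 if year2[i] >= cut2)
print(f"{escaped / len(bottom1):.0%} of year-1 bottom-quintile teachers "
      f"leave the bottom quintile in year 2")
```

At a correlation of 0.4, well over half of the "worst" quintile in one year is no longer in the worst quintile the next year – exactly the kind of annual churn in rankings the quote describes.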

In my first post on this topic (and subsequent ones), I point out that the National Academies have already cautioned that:

“A student’s scores may be affected by many factors other than a teacher — his or her motivation, for example, or the amount of parental support — and value-added techniques have not yet found a good way to account for these other elements.”

http://www8.nationalacademies.org/onpinews/newsitem.aspx?RecordID=1278

And again, this new report provides a laundry list of factors that affect value-added assessment beyond the scope of the analysis itself:

However, several other features of value-added estimators that have been analyzed in the literature also have important implications for the appropriate use of value-added modeling in performance measurement. These features include the extent of estimator bias (Kane and Staiger 2008; Rothstein 2010; Koedel and Betts 2009), the scaling of test scores used in the estimates (Ballou 2009; Briggs and Weeks 2009), the degree to which the estimates reflect students’ future benefits from their current teachers’ instruction (Jacob et al. 2008), the appropriate reference point from which to compare the magnitude of estimation errors (Rogosa 2005), the association between value-added estimates and other measures of teacher quality (Rockoff et al. 2008; Jacob and Lefgren 2008), and the presence of spillover effects between teachers (Jackson and Bruegmann 2009).

In my opinion, the most significant problem here is the non-random assignment problem. The noise problem is significant and important, but much less significant than the non-random assignment problem. It just happens to be the topic of the day.

But alas, we continue to move forward… full steam ahead.

As I see it there are two groups of characters pitching fast-track adoption of value-added teacher evaluation policies.

Statistically Inept Pundits (who really don’t care anyway): The statistically inept pundits are those we see on Twitter every day, applauding the mass firing of DC teachers, praising the Colorado teacher evaluation bill and thinking that RttT is just AWESOME, regardless of the mixed (at best) evidence behind the reforms promoted by RttT (like value-added teacher assessment). My take is that they have no idea what any of this means, have little capacity to understand it anyway, and probably don’t much care. To them, I’m just a curmudgeonly academic throwing a wet blanket on their teacher bashing party. After all, who but a wet blanket could really be against making sure all kids have good teachers – making sure that we fire and/or lay off the bad teachers, not just the inexperienced ones? These teachers are dangerous, after all. They are hurting kids. We must stop them! Can’t argue with that. Or can we? The problem is, we just don’t have ideal, or even reasonably good, methods for distinguishing between those good and bad teachers. And school districts that are all of a sudden facing huge budget deficits and laying off hundreds of teachers don’t retroactively have in place an evaluation system with sufficient precision to weed out the bad – nor could they. Implementing “quality-based layoffs” here and now is among the most problematic suggestions currently out there. The value-added assessment systems yet to be implemented aren’t even up to the task. I’m really confused why these pundits, who have so little knowledge about this stuff, are so convinced that it is just so AWESOME.

Reform Engineers: Reform engineers view this issue in purely statistical and probabilistic terms – setting legal, moral and ethical concerns aside. I can empathize with that somewhat, until I try to make it actually work in schools and until I let those moral, ethical and legal concerns creep into my head. Perhaps I’ve gone soft; I’d have been all for this no more than 5 years ago. The reform engineer assumes first that it is the test scores that we want to improve as our central objective – and only the test scores. Test scores are the be-all and end-all measure. The reform engineer is okay with the odds above because more than 50% of the time they will fire the right person. That may be good enough – statistically. And, as long as they have decent odds of replacing the low performing teacher with at least an average teacher – each time – then the system should move gradually in a positive direction. All that matters is that we have the potential for a net positive quality effect from replacing the 3/4 of fired teachers who were correctly identified, and at least break even on the 1/4 who were falsely fired. That’s a pretty loaded set of assumptions, though. Are we really going to get the best applicants to a school district where they know they might be fired for no reason on a 25% chance (if using 3 years of data) or a 35% chance (if using one year)? Of course, I haven’t even factored in the number of bad teachers identified as good.
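For what it's worth, the reform engineer's bet can be written as a back-of-envelope expected-value calculation. The effect sizes below are entirely made up; the point is only the structure of the argument:

```python
# Back-of-envelope version of the "reform engineer" bet, with
# made-up effect sizes (in student-test SD units):
P_CORRECT = 0.75          # fired teacher truly below average
P_FALSE = 0.25            # fired teacher actually average
EFFECT_BAD = -0.10        # assumed effect of a truly weak teacher
EFFECT_AVG = 0.0          # effect of an average teacher
EFFECT_REPLACEMENT = 0.0  # assume replacements are average draws

expected_fired = P_CORRECT * EFFECT_BAD + P_FALSE * EFFECT_AVG
net_gain_per_firing = EFFECT_REPLACEMENT - expected_fired
print(f"Expected gain per dismissal: {net_gain_per_firing:+.3f} SD")
```

On these assumptions the bet has a small positive expected value – before counting litigation costs, morale, and the labor-market effects of telling applicants they face a 1-in-4 chance of being fired by mistake.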

I guess that one could try to dismiss those moral, ethical and legal concerns regarding wrongly dismissing teachers by arguing that if it’s better for the kids in the end, then wrongly firing 1 in 4 average teachers along the way is the price we have to pay. I suspect that’s what the pundits would argue – since it’s about fairness to the kids, not fairness to the teachers, right? Still, this seems like a heavy toll to pay, an unnecessary toll, and quite honestly, one that’s not even that likely to work even in the best of engineered circumstances.

========

Follow up notes: A few comments I have received have argued from a reform engineering perspective that if we a) use the maximum number of years of data possible, and b) focus on identifying the bottom 10% or fewer of teachers, then, based on the analysis in the NCES/Mathematica report, we might significantly reduce our error rate – down to, say, 10% of teachers being incorrectly fired. Further, those incorrectly identified as failing are more likely to be close to failing anyway. That is not, however, true in all cases. This raises an interesting ethical question: what is the tolerable threshold for randomly firing the wrong teacher? Or for keeping the wrong teacher?

Further, I’d like to emphasize again that there are many problems that seriously undermine the application of value-added assessment for teacher hiring/firing decisions. This issue probably ranks about 3rd among the major problem categories. And this issue has many dimensions. First there is the statistical and measurement issue of having statistical noise result in wrongful teacher dismissal. There are also the litigation consequences that follow. There are also the questions over how the use of such methods will influence individuals thinking about pursuing teaching as a career, if pay is not substantially increased to counterbalance these new job risks. It’s not just about tweaking the statistical model and cut-points to bring the false positives into a tolerable zone. This type of shortsightedness is all too common in the types of technocratic solutions I, myself, used to favor.

Here’s a quick synopsis of the two other major issues undermining the usefulness of value-added assessment for teacher evaluation & dismissal (on the assumption that majority weight is placed on value-added assessment):

1) That students are not randomly assigned across teachers and that this non-random assignment may severely bias estimates of teacher quality. The fact that non-random assignment of students may bias estimates of teacher quality will also likely have adverse labor market effects, making it harder to get the teachers we need in the classrooms where we need them most – at least without a substantial increase to their salaries to offset the risk.

2) That only a fraction of teachers can even be evaluated this way in the best of possible cases (generally less than 20%), and even their “teacher effects” are tainted – or enhanced – by one another. As I discussed previously, this means establishing different contracts for those who will versus those who will not be evaluated by test scores, creating at least two classes of teachers in schools and likely leading to even greater tensions between them. Further, there will likely be labor market effects with certain types of teachers either jockeying for position as a VAM evaluated teacher, or avoiding those positions.

More can be found on my entire blog thread on this topic: https://schoolfinance101.wordpress.com/category/race-to-the-top/value-added-teacher-evaluation/

Private Schools & Public Education Policy in New Jersey

The commission on private schools established by former Governor Corzine has just released its report:

http://nj.gov/governor/news/reports/pdf/20100720_np_schools.pdf

This report is more fun than many recent reports in New Jersey because it actually has some data and citations. Nonetheless, I have at least a few concerns regarding the presentation of the data and implications drawn from it. I was particularly intrigued by the graph on page 7 – which I replicate below:


This graph shows an apparent catastrophic collapse of the private schooling sector in New Jersey… or does it? Look at the Y (vertical) axis. The range runs from 160,000 to 192,000. Yeah… that makes for a really steep apparent drop-off. Note also that these data come from a state department of education source and are not reconciled against any other source. So: a stretched Y axis to make it look really, really, really dramatic; no second look or second opinion; and only a single aggregate count of private school kids to show a major across-the-board collapse.
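A quick bit of arithmetic shows how much work the truncated axis is doing. The enrollment figures here are hypothetical stand-ins, not the report's actual numbers:

```python
def apparent_drop(start, end, y_min, y_max):
    """Fraction of the visible axis height that a decline from
    `start` to `end` occupies, given axis limits [y_min, y_max]."""
    return (start - end) / (y_max - y_min)

START, END = 190_000, 165_000  # hypothetical enrollment counts

# Axis clipped at 160,000 (as in the report) vs. anchored at zero:
clipped = apparent_drop(START, END, 160_000, 192_000)
honest = apparent_drop(START, END, 0, 192_000)
print(f"Clipped axis:    drop fills {clipped:.0%} of the chart height")
print(f"Zero-based axis: drop fills {honest:.0%} of the chart height")
```

The same decline fills most of the chart on the clipped axis but only a modest slice on a zero-based one – that visual leverage is the whole trick.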

Here’s a more detailed exploration, using two data sources: 1) The National Center for Education Statistics Private School Universe Survey and 2) the U.S. Census Bureau American Community Survey, via the Integrated Public Use Microdata System.

First, here are the number of private schools by type in New Jersey over time:

This graph shows that the only significant decline in numbers of schools occurs for Catholic Parochial schools. Other private school types hold their ground in total numbers of schools.

Next, here is enrollment, and, in the second graph, enrollment adjusted for missing data.

As with numbers of schools, the most substantive decline is for Catholic Parochial schools. There is a smaller drop for Catholic Diocesan schools. Other schools stay relatively constant, with some reclassification occurring between Other Religious – Not Affiliated and Other Affiliated. Note that the corrected, weighted version in the second graph above shows a somewhat smaller decline in Catholic Parochial enrollment than the un-adjusted version.

Next, I address private school enrollment by grade level and as a share of the total population of students in public and private school. A drop in private school enrollment would only be significant if it occurred in a context of stable or growing overall student population.

Here’s the total school population by grade level:

And the private school population by grade level:

What we see in this second graph is that the Grades 1 to 4 population appears to be declining most.

Here’s the private school enrollment by grade level as a percent of total enrollment. Kindergarten private school enrollment as a share of kindergarten students has declined. But, other grade level private school populations have declined only very slightly as a share of all children statewide in the same grade level.

This much more refined picture, across two additional data sets casts some doubt on the significance of the first graph above. Is there really a massive collapse of private schooling in New Jersey? It doesn’t look that way to me.

Explanations and Policy Implications for Catholic Schooling in New Jersey

Indeed, there may be some cause for concern for Catholic Parochial schools, which appear to be closing and losing enrollment. But this phenomenon is not unique to New Jersey, and others have attempted to shed light on why Catholic schools are struggling in many urban centers. Catholic schools have tried to remain accessible to the middle class by holding tuition down. At the same time, costs have risen: decades ago, Catholic schools relied heavily on unpaid, church-affiliated staff, while now nearly all staff are salaried. My own recent analyses suggest that the costs of operating many Catholic schools are quite similar to those of traditional public school districts. The gap between tuition and cost has grown substantially over time for these schools. That’s not sustainable.

Two recent reports provide additional insights regarding public policy forces that may be compromising the stability of Catholic schooling in particular:

1) This Pew Trust report on parental choices in Philadelphia suggests that the expansion of Charter schools has potentially cut into the non-Catholic enrollment in urban Catholic schools.

http://www.pewtrusts.org/uploadedFiles/wwwpewtrustsorg/Reports/Philadelphia_Research_Initiative/PRI_education_report.pdf

Notably, New Jersey has not expanded charter schools as quickly as other states. But, it remains possible that existing New Jersey charter schools have drawn some students away from urban Catholic schools. As such, if the state is truly concerned with the sustainability of Catholic schools, the state should evaluate the effect of charter expansion on Catholic school enrollment (and on teacher recruitment/retention).

2) This Thomas B. Fordham Institute report suggests that vouchers in other locations, such as Milwaukee, have been a double-edged sword for Catholic schools. Vouchers do not provide a full cost subsidy, and they restrict charging tuition above the subsidy to cover the gap. As such, schools are required to take a loss on each voucher student accepted. Further, as Catholic schools take on more non-Catholic vouchered students, parishioner contributions tend to decline, because it is perceived that the Catholic mission of the school has been compromised.

http://www.edexcellence.net/doc/catholic_schools_08.pdf

This situation does not apply in New Jersey, but findings from other cities raise concern that an under-subsidized voucher or tuition tax credit like the proposed Opportunity Scholarship Act (NJOSA) could actually do more harm than good for many private schools.

Vouchers differ from other subsidies (like the transportation and textbook subsidies) because of the restriction on charging tuition to cover the margin between the subsidy level and actual cost.  Some schools may subvert this requirement with strongly implied requirements for “tithing” as a substitute for tuition – including for voucher-receiving families. In fact, families could be obligated to tithe sufficient income to the private schools (or the religious institution that governs those schools) such that the family then qualifies for the tax credit program. The state should attempt to guard against this possibility in the design of any related policy.

Follow-up information:

A reader was kind enough to send me this link: http://www.avi-chai.org/census.pdf

Page 23 of this census report on Jewish school enrollment explains:

The other side of the geographic distribution picture is the concentration of schools in New York and New Jersey, as well as the overwhelming Orthodox domination in these two states. New York has 132,500 students, up from 104,000 ten years ago, while New Jersey has nearly 29,000 students, up from 18,000 in 1998. New Jersey’s gain is nearly all attributable to Lakewood, although there has been meaningful growth in Bergen County and the Passaic area. At the same time, Solomon Schechter enrollment in New Jersey has declined precipitously.

Clearly, the Orthodox schools in New Jersey are not in a free fall, as implied by the aggregation of all private schools in the private school commission report.

Another reader sent me this link:  http://www.njpsa.org/userfiles/File/EO161.pdf

This link explains the charge of the commission. It would seem to me that the final report has strayed somewhat from this charge.


Another “You Cannot be Serious!” The demise of private sector preschool in New Jersey?

There is little I find more enjoyable than boldly stated claims that are entirely unsubstantiated… but for which data are relatively accessible for testing.

This week, the Governor’s Task Force on Privatization in New Jersey released their final report on the virtues of privatization for specific services. I took particular interest in the claims made about preschool in New Jersey. Preschool programs were expanded significantly with public support for both public and private programs for 3 and 4 year olds following the 1998 NJ Supreme Court ruling in Abbott v. Burke. For more information on the rulings and Abbott pre-school programs, see: http://www.edlawcenter.org/ELCPublic/AbbottPreschool/AbbottPreschoolProgram.htm

Here are the claims made in the privatization report:

• At the program’s inception, nearly 100 percent of students were served by providers in the private sector, many of which are women‐and minority‐owned businesses. Now, approximately 60 percent are served by private providers, as traditional districts have built preschools at great public expense and unfairly regulated their private‐sector competitors out of business.

• There are currently two sets of state regulations governing pre‐k. The majority of private pre‐k providers are subject to Dept. of Children and Families (DCF) regulations, but private pre‐k providers working in the former Abbott districts and serving low‐income children in some other districts are subject to the regulation of the DOE and the respective districts themselves, effectively crowding out the private sector and driving up costs to the taxpayer without any documented benefit to the children they serve.

To summarize, the over-subsidized public option of Abbott preschool has decimated the private preschool market in New Jersey, adding numerous women and minority business owners to the unemployment rolls since the program was implemented (okay… a bit extreme… but I suspect you’ll hear it spun this way… since the above language isn’t far off from this).

The last time I read something this silly was in a research report from The Reason Foundation regarding “weighted student funding.” Not surprisingly, the Reason Foundation is among the only sources cited for… anything… in this report on the virtues of privatization (see page 4).

In this post, I’ll address two issues:

First, I address whether the claim that private preschool enrollment has dropped is true. Has private preschool in New Jersey actually been decimated since the 1998 Abbott decision? Are there that many fewer slots in private versus public preschools than before that time? Have public programs continued to grow while private programs have been eliminated? Has private preschool enrollment declined at any greater rate than private school enrollment generally, if at all?

Second, I revisit some of my previous findings about private versus public school markets, cost and quality. The recommendation that follows from the above claims is that the state, instead of continuing to subsidize expensive Abbott preschool programs, should allow any private provider to participate without Abbott regulation. This, it is assumed, would dramatically reduce costs. Rather, it might merely reduce expenditures… and the quality of service along with it. Lower-spending (not lower-cost) private providers simply don’t and can’t offer what higher-spending providers do. Cost assumes a specific level of quality, and a lower “cost” assumes that less can be spent for the same quality. In this case, quality is being ignored entirely (or assumed entirely unimportant). That is, the proposed plan of allowing any private provider to house “preschool” students would likely be the equivalent of subsidized “daycare” (minimally compliant with Dept. of Children and Families (DCF) regulations) and not actual “pre-school.”

Issue 1

For these first four figures, I use data from the U.S. Census Bureau’s Integrated Public Use Microdata System. One of my favorites. Specifically, I evaluate the school enrollment patterns of 3 and 4 year olds in New Jersey from 1990 to 2008, by school type. Note that Census IPUMS data are actually not great for evaluating parent responses to the “school” enrollment question for 3 and 4 year olds, because in many cases a parent will identify their child as being in “school” even if the child is merely in daycare… home-based, non-instructional, or any other type of daycare. This is not hugely problematic here, because the report on privatization assumes that home-based daycare or anything registered with DCF to supervise children during the day qualifies as a pre-school.  If anything, there may be under-reporting of private enrollment in these data by parents who don’t actually consider their private daycare to be “school.”
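Mechanically, the enrollment shares behind these figures come from simple weighted tabulations of the microdata. Here is a rough sketch with invented rows (the records and weights below are placeholders, not actual IPUMS values; a real extract would use IPUMS variables such as AGE, SCHLTYPE, and the person weight PERWT):

```python
# Sketch: weighted enrollment shares by school type for children of a given age,
# in the style of an IPUMS tabulation. All records below are made up for
# illustration only.
from collections import defaultdict

# (age, school_type, person_weight) -- school_type: 'public', 'private', 'none'
records = [
    (3, "private", 120), (3, "none", 200), (3, "public", 90),
    (4, "public", 150), (4, "private", 110), (4, "none", 80),
    (3, "private", 100), (4, "public", 140),
]

def enrollment_shares(records, age):
    """Weighted share of children of the given age in each school type."""
    totals = defaultdict(float)
    for a, stype, wt in records:
        if a == age:
            totals[stype] += wt
    grand = sum(totals.values())
    return {stype: wt / grand for stype, wt in totals.items()}

shares3 = enrollment_shares(records, 3)
print({k: round(v, 3) for k, v in sorted(shares3.items())})
```

Repeating this by survey year and age produces the public/private/non-enrolled trend lines shown in the figures.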

For 3 year olds, from 1990 to 2000, both public and private enrollment increase, while non-enrollment decreases. Public and private enrollment then stay relatively steady, except for an apparent increase in private enrollment in 2008 (I’m not confident in this bump, having seen other odd jumps between 2007 and 2008 IPUMS data). In any case, it would not appear that public enrollment has continued to severely squeeze out the private marketplace, unless we were to assume that the private market would have absorbed the entirety of the reduction in non-enrollment.  The lack of substantive shift from 2000 to 2008, with privates, if anything, increasing their share, suggests that publicly subsidized programs have not led to the collapse of the private preschool market.

The next two figures show the enrollment patterns for 4 year olds. In general, 4 year olds are more likely to be enrolled in school, public or private, and less likely to be non-enrolled. As with 3 year olds, there really aren’t any substantive changes to the relative enrollment of 4 year olds in public and private settings between 2000 and 2008. No collapse of the private market here.


As an alternative, I explore the enrollment of private schools which provide pre-kindergarten programs statewide, using the National Center for Education Statistics Private School Universe Survey. Using this data set, we can determine whether the number of enrollment slots at the preschool level among private providers has declined, and whether the decline in private preschool enrollment has been greater than the decline in private school enrollment more generally.  Note that much has been made of the “collapse” of private schooling in New Jersey in the context of the New Jersey Opportunity Scholarship Act.

This figure shows that private school enrollment generally has declined more than private preschool enrollment since 2000. Private preschool enrollment has remained relatively stagnant statewide from 2002 to 2008. No real collapse of private preschools evident here.
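One straightforward way to make this comparison is to index each enrollment series to its base year, so that the relative declines can be read directly off the same scale. The counts below are invented placeholders, not actual Private School Universe Survey figures:

```python
# Sketch: comparing the relative decline of private preschool enrollment vs.
# private school enrollment overall by indexing each series to its first year.
# Enrollment counts here are hypothetical, for illustration only.

prek_enrollment = {2002: 50000, 2004: 49500, 2006: 49800, 2008: 49000}
total_private = {2002: 210000, 2004: 200000, 2006: 192000, 2008: 185000}

def indexed(series):
    """Express each year's count as a fraction of the base (earliest) year."""
    base = series[min(series)]
    return {year: count / base for year, count in sorted(series.items())}

prek_idx = indexed(prek_enrollment)
total_idx = indexed(total_private)

for year in sorted(prek_idx):
    print(year, round(prek_idx[year], 3), round(total_idx[year], 3))
```

With placeholder numbers shaped like the pattern described above, the overall private series falls well below 1.0 while the preschool series stays roughly flat – which is the comparison the figure makes.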

Issue 2

As I noted above, preschool might be defined in many different ways. On the one hand, we might wish to consider preschool to be any place that meets minimum health and safety guidelines for caring for children between the ages of 3 and 4. To me, that sounds more like daycare. Alternatively, preschool might actually involve specific curriculum and activities as well as training for personnel, etc. Obviously, these differences in definition can and likely do significantly influence the cost per child of offering the service. If I can hire high school graduates, rely heavily on parent volunteers, and use only minimally compliant physical space to supervise children at play – mix in story time – I can likely do things relatively cheaply. On the other hand, if I actually have to hire teachers who hold college degrees, provide a specific curriculum, and have appropriate physical spaces in which to do those things, it’s likely going to get more expensive – publicly or privately provided. It’s not so much about whether it’s publicly or privately provided, but whether there are minimum expectations for what defines “preschool.”

The elementary and secondary private school market is highly stratified by price and quality, as I have discussed on many previous occasions. YOU GET WHAT YOU PAY FOR. Yeah… I know that clashes with the appealing logic that private providers always do more with less…. thwarting the “you get what you pay for” assumption… or even reversing it… ‘cuz private providers do so much more with so much less. But let’s look again at one of my favorite summaries – with a new presentation – of the private school market. Here’s the earlier version.

This figure lines up the national average (regionally cost adjusted for each regional cluster) a) per-pupil spending, b) pupil-to-teacher ratios and c) percentage of teachers who attended competitive undergraduate colleges, for private schools by private school type. Public school expenditures sit right near the middle. The small group of Catholic schools in the national sample sit right alongside public schools (the system of Catholic schools has evolved to look much like their public school counterparts over time).  Independent schools spend nearly twice what public schools spend, have much smaller class sizes and have very high percentages of teachers who attended competitive undergraduate colleges. Hebrew and Jewish day schools lie about halfway between the elite privates and public and Catholic schools. At the other end of the private school market are Conservative Christian schools, which spend much less per pupil than public or Catholic schools. They do have somewhat smaller class sizes, but have very poorly paid teachers, and have few if any teachers who attended competitive colleges. For more on these comparisons, see: https://schoolfinance101.wordpress.com/2010/02/20/stossel-coulson-misinformation-on-private-vs-public-school-costs/. In short, this figure shows that even in the K-12 marketplace, private providers are very diverse, some offering small class sizes and highly qualified teachers for a much higher price than public schools, and others offering much less.
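For readers curious what “regionally cost adjusted” means in practice: nominal per-pupil spending is divided by a regional cost (wage) index so that schools in expensive and inexpensive labor markets can be compared on the same footing. The index values and school figures below are hypothetical, loosely in the spirit of NCES comparable-wage-style indices:

```python
# Sketch: regionally cost-adjusting per-pupil spending. Index values and
# spending figures are hypothetical placeholders for illustration only.
cost_index = {"Northeast": 1.15, "Midwest": 0.95, "South": 0.90, "West": 1.05}

schools = [
    {"name": "Independent A", "region": "Northeast", "spending": 23000},
    {"name": "Catholic B", "region": "Midwest", "spending": 11000},
    {"name": "Public C", "region": "South", "spending": 10800},
]

# Divide nominal spending by the regional index to get comparable dollars.
adjusted = {s["name"]: s["spending"] / cost_index[s["region"]] for s in schools}

for name, value in adjusted.items():
    print(name, round(value))
```

Note how a school that spends nominally less (Public C here) can end up ahead of one that spends nominally more once its cheaper labor market is accounted for – which is why the adjustment matters before lining schools up side by side.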

We can certainly expect at least as much variation in the private preschool marketplace, if not one-heck-of-a-lot more, since many private daycare facilities require little or no formal training and no college degree for their employees.

As an aside, I was driving down Route 202 the other day west of Somerville Circle and noticed that they are putting in a Creme-de-la-Creme “daycare/preschool.”  We had one around the corner from our house in Leawood, KS.  I suspect that few of the Abbott preschool facilities built at such great expense compare favorably to a “Creme” facility – with waterpark (we’re talking slides, fountains), mini tennis court, indoor fish pond, tv studio, etc. (at least that’s what the one in Leawood had. I expect nothing less here?).  I expect that many parents, having toured many other “less desirable” daycare and preschools, will decide that their child deserves the “Creme” lifestyle (I suspect that there are actually other options with better curriculum and perhaps better teachers in the area, but I have not had the occasion to research it). It’s just an extreme example of the diversity of the private preschool marketplace. I suspect the cost per pupil will far exceed that of the Abbott preschools (heck… it already exceeded $12k per year in Kansas several years ago).

To summarize, the Task Force report on privatization makes bold claims about Abbott preschool programs crowding out, and decimating, private preschool programs, many run by women and minority business owners. But the Task Force report does not bother to substantiate a) that private preschools have actually suffered, or b) that any, if they had suffered, were actually owned and operated by women or minorities. The only “evidence” the report has to offer is the undocumented claim that 100% of kids were in private programs and now only 60% are. Where does that come from? What the heck is that? 100% of whom? 60% of what?

Further, the Task Force report is willing to assume that warehousing 3 and 4 year olds under the supervision of high school graduates in physical spaces and with supervision ratios compliant with DCF regulations is sufficient for low-income and minority children… or rather… that it is the lower cost option with equivalent quality to Abbott pre-school programs (public or publicly regulated private). It is critically important that we acknowledge the difference in the quality or even type of service received at different price points. Like the private K-12 market, the private preschool market varies widely, and spending much less generally means getting much less.

=====

See also, the Abbott 5th year report: http://edlawcenter.org/ELCPublic/Publications/PDF/PreschoolFifthYearReport.pdf

Manual for Child Care Centers from DCF in NJ: http://www.nj.gov/dcf/divisions/licensing/CCCmanual.pdf

Can’t forget this: