What I Really Think About Teacher Effectiveness Ratings
by Grant Wiggins, Authentic Education
While exploring teacher effectiveness ratings, I went looking for a New York City high school that had the same demographic profile as the struggling high school referenced previously (see here), but had nonetheless earned good ratings for making adequate improvement on their graduation rates and test scores.
(Recall that school accountability in NY is based on improvement, not absolute scores and graduation rates, so schools with modest absolute scores can still be top-rated in New York. Put differently, the struggling school above failed to make adequate progress in the benchmark data of graduation rates and exam scores for 10 years.)
This “effective” school has the same demographics as the struggling school I cited above (more than 90% Black and Hispanic) and is located in Harlem. So what have I found thus far? What does the data suggest?
My Tentative Conclusions About Teacher Effectiveness Ratings
1. Local and state teacher effectiveness ratings are flawed if, as my sample suggests, teachers in less effective high schools generally receive high ratings – often much higher ratings than teachers in more successful high schools (especially in schools that have been struggling to improve for a decade).
2. The combination of survey data and Quality Review site visits presents a fairly consistent and credible picture of the relative strengths and weaknesses of schools – but these rarely align with teacher effectiveness ratings in either successful or unsuccessful schools.
3. This disconnect of Ratings vs. Review and Achievement data can only be reduced when there are exemplars of teaching to anchor the ratings, supported by calibration meetings across schools in a district. Without such exemplars, the teacher rating system will be based only on building-level norms (and politics) – and thus as flawed and non-standards-based as letter grades given to students.
Like grades, in other words, teacher effectiveness in schools appears to be scored on a curve (at best), rather than against any credible standard. (Credible data suggest that the typical percentage of employees in all walks of life whose work is judged ineffective is around 8%; across all NY schools, as the data show, the average is 1%.) NYSED (and/or each district) needs to rectify the lack of official models and calibration protocols.
4. At the very least, the state should declare that teacher ratings of 100% effective in ANY school are of questionable validity on their face, and that extra justification for such ratings should be provided. Beyond that audit indicator, NYSED might require schools to provide 2-3 video excerpts of effective teachers to ensure calibration across the state (as is typically done with scored student work in the AP and IB programs, and on the state’s writing assessments). The Teaching Channel videos could easily jumpstart such a process, as could videos that exist in other states, such as California.
5. For “rating inflation” to be lessened, there must be far better incentives for administrators to rate teachers accurately and honestly (and for teachers to propose credible value-added SLO measures). I have heard from principal and supervisor friends of mine that the politics of school climate, and the possible counterweight of exam scores, entice everyone to rate teachers higher than they truly believe is warranted.
Perhaps, for example, schools should only be required to give aggregate “standards-based” ratings separate from the more delicate teacher-by-teacher “local norm” ratings. At least there might be an independent check on local rating standards by NYSED or via regional peer meetings chaired by BOCES staff, in the same way that IB teachers have their student grades “moderated” by the external reviewers of exams.
6. It is high time we critically examined the wisdom of weighting each of the four dimensions in the Danielson Framework equally.
7. As it stands now, it is quite possible for a teacher in any school to do well on the three non-teaching dimensions of the four, do poorly on the teaching dimension, but still get a good “teacher effectiveness” score. That seems like a highly misleading way to rate “teacher effectiveness” (even as we should, of course, value planning, professionalism, and community relations).
8. Good schools improve, and improvement is possible in all kinds of schools serving all kinds of students. Despite what many anti-reformers endlessly argue, this data shows it is incorrect (and very fatalistic) to say that poverty plus ethnicity makes ongoing, adequate school and teacher improvement impossible, as I have long argued.
Furthermore, there are outlier schools in this and every other state that achieve strong absolute levels of student achievement in spite of SES-related obstacles. (See, for example). Alas, the most successful such charter schools in New York are not represented in the NYSED accountability data resources cited in this post.
A Final Thought
Do I think that Governor Cuomo is up to some crass politics in his report? I do.
Do I think that this is a hard time to be an educator? I do – so much harder than when I was a full-time teacher (when we were pretty much left alone, for good and for ill).
Yet I also believe that until we get an honest and credible accounting of teacher effectiveness in all schools (and especially in struggling ones), we will do a great disservice to kids in need – and, yes, to their teachers, who deserve more accurate feedback than many now receive.
POSTSCRIPT: Numerous people tweeted back after the previous post that many NYS struggling schools are under-funded and that far too much was thus being asked of them. While I agree that many schools are under-funded and that teachers in these schools face very difficult barriers, it seems odd that everyone who responded with this argument failed to engage with the teacher effectiveness data I provided. Many also said that without better funding and other kinds of state support, improvement in such schools is “impossible.”
The data simply do not support this fatalistic conclusion, nor does my work in New York City and Toledo where we, too, have seen solid gains through UbD training and implementation in previously struggling schools.
Let’s get on with improving the schools our students currently attend, doing what is in our control to improve them.
This article was excerpted from “What I Think I Think About Teacher Effectiveness Ratings,” a post that first appeared on Grant’s personal blog; Grant can be found on Twitter here. Image attribution: flickr user flickeringbrad.