how shall we evaluate teachers?

Can you be more specific about what might be involved in a “comprehensive, teacher-centered plan”? What sorts of criteria do you think are fairest in evaluating teachers?

Thanks to my best friend from freshman year for challenging me on Facebook about some of my thoughts in the last blog post. We’ve had a great discussion, and his last question, up top, deserves a thoughtful answer. Here’s mine.

First, let’s note that teacher quality begins with quality teachers: preservice and inservice preparation has a huge impact on teachers’ capacity to help their students. It may seem a dodge to open with this perhaps obvious fact, but it bears repeating in North Carolina, where (as usual) teacher induction programs and professional development funds have been the first on the budget crisis chopping block. Ain’t nothing for free.

The present national answer to teacher assessment is “value-added” assessment, despite overwhelming evidence that such data “should not be used to make operational decisions because such estimates are far too unstable to be considered fair or reliable”. VA analysis has the rhetorical advantage of yielding easily compared scores, and the appeal of apparently straightforward data is hard to resist. Our cultural tendency toward side-by-side comparisons is known to cloud our judgment of the quality of what is being compared, or whether it is even the right thing to measure.

That’s what’s happening here, as a model never intended to be used to these ends has been reverse-engineered to supply comparable scores, sometimes through statistical hijinks that would be funny were their consequences less damaging.

But if not like this – then how? Three thoughts follow.

First, this California teacher makes great points that would help:

  • be sure attendance is considered when deciding which students’ scores to include (she asks that only students with attendance of 90% or higher be counted);
  • empower teachers to remove disruptive students from the teaching environment;
  • ensure students start the year at a comparable base level by ending social promotion (I appreciate that this “kicking the problem downstairs” chases its tail, since someone, somewhere must draw a line, but I also know that schools continue to find ways for students to advance when they are not ready to);
  • benchmark student assessment to improvement, not an absolute standard.

Second, we should ensure that multiple measures of teacher efficacy are used when making assessments. Arne Duncan is on record as supporting this principle, and the National Board for Professional Teaching Standards has years of experience in developing portfolio-based assessments of teaching quality that use multiple measures. That experience should be used to point the way.

But doing teacher assessment this way is hard, and resource-intensive. The up-front costs are a fraction of the actual costs of completing the process, and most of the cost falls to the candidate (though NC is a state that has offered generous support for teachers seeking certification, which contributes, I am sure, to my own institution’s top rank of alumni teachers who are NBCTs).

I am not the only one concerned about the political will of policy makers to follow through on complex and resource-intensive assessment programs. But my concern really has little to do with whether or not those policy makers are acting in good faith: it has to do with what institutions are good at (counting and comparing standardized data) and what they are not good at (acknowledging and accounting for qualitative difference that falls outside the purview of comparability). As Parker Palmer notes, “the functions of a profession are not necessarily those of the institutional structures that house it.”

Despite a well-articulated and respected theory of what this kind of assessment can and should include, I despair a bit over whether we can ever actually institutionalize accurate and useful qualitative assessment of a process as complex and contingent upon conditions as teaching, even assuming infinite resources and well-considered intent (and we have neither). The NBPTS, despite its critics both friendly and not-so, seems like the best model we’ve got right now.

Third, we can capitalize on the potential for the debate to actually lead to a culture change in how we understand both teacher and student success. Our exercises in trying to define what good teaching and learning look like show us the shortcomings of our common sense. Or, as this pretty impressive B-School analysis has it,

…the use of hard data to pin down an objective measure of student progress and thereby teacher performance may be just the first step in a larger cultural shift toward a more rigorous and integrated evaluation of both student and teacher.

I see there’s probably some market-based ideology in this – i.e., as the public becomes less satisfied with the results of the current ways of doing things, they will press for more innovation that gets outcomes that better meet their needs. But let the market do what the market does best: make the changes that best serve the most, especially if those changes come from public dissatisfaction with how “good learning” and “good teaching” are beginning to be defined.

I hope this quick post gets my friend a little closer to an answer to his urgent question. Thoughtful responses don’t yield outcomes that make the cover of TIME: complex issues require complex responses. These are my best thoughts.

What have I missed? What do you think?

