Roger Williams has taught ESL at the college level on Guam and in Iran, remedial reading in a high school near the Navajo reservation and, for 33 years, middle school English in Richfield. Now retired, he is a literacy tutor in Provo. I received the following insights from Roger about value-added teacher assessments:
“Awarding teachers merit pay based solely or in part on student test scores assumes a simple cause-and-effect chain: Excellent teaching results in superior learning, which results in high test scores.
“If only it were that simple! The first problem with that assumption is test reliability. In my 39 years of giving (and writing) language arts tests, I’ve seen good tests and bad tests.
“When I was on a test writing committee in Iran, we came up with a Test Test. A good test was one on which the teachers and top students would get 100%. We spent hours constructing items that couldn’t be misread, argued about or rioted over.
“Some Utah end-of-level tests have been fairly good when judged by the Test Test. There were usually just a few items both teachers and top students missed. Looking at those particular items, one could see the problems.
“However, because end-of-level test scores were independent by grade level, there was no way to track student progress from year to year, so the state has tried tests from a national testing company.
“These tests would not pass the Test Test, as top students and teachers might get only 82% because test items covered an encyclopedic scope of concepts including esoteric terms and non-essential concepts from several grade levels. The 18% of items the top student missed reflected things he was not taught because they didn’t fit in his grade’s focused, useful, valuable curriculum.
“Another test problem is that tests are a small sample of the total curriculum, so the student may have mastered many concepts that don’t show up on the test. Further, some valuable concepts are difficult to test in a multiple-choice format.
“Because multiple-choice tests don’t directly test writing ability, the state also gave writing tests. The scores, though, often didn’t tell the student, parent or teacher much and often didn’t correlate with scores in the student’s writing portfolio.
“An additional problem was the great expense of paying readers to score the writing tests, so the state has tried computer-scored writing tests. Unfortunately, the machines will often reward trite, wordy writing over original, concise writing. So, again, these tests fail the Test Test by too often giving the top student’s brilliant writing a low score.
“To summarize, tests don’t always give a complete, clear picture of what a student has learned.
“But, you counter, what if students were given a good test, one that reflected what they had been taught? Then wouldn’t that test reflect good teaching?
“Not necessarily. The second complicating factor is the student.
“Imagine that you are put into a Utah classroom for two months, and you see firsthand innovative, effective teaching. At test time you (a top student) get 100% on that rare, perfect test. You are surprised, though, that many students missed many clear, obvious test items. You find these reasons:
“The student was
- new to the school and hadn’t received all the instruction
- physically or mentally absent during important instruction
- still struggling to learn the English language
- a student with learning disabilities
- suffering from test fatigue, having already put in seven hours of testing that week
- unmotivated, giving little or no effort on the test
- purposely choosing random answers because of personal “issues”
- in survival mode because of a family crisis and unable to focus on the test, and so on.
“You were in the class. You saw brilliant teaching, but that brilliance was not reflected in all students getting top scores.
“Now imagine you are put into a difficult college course, the very last one you would choose to enroll in. At test time you fail or receive a D. Was the score due to poor instruction? More likely it resulted from your lack of prior knowledge, skills and interest.
“So great teaching doesn’t always show up in high test scores because of complexities in both tests and students. Also, many subjects don’t have state or national tests – technology, music, the arts, PE, home ec, etc. Perhaps some subjects (math?) are more easily tested than others (language arts?).
“If student test scores can’t be used to show the quality of teaching, what about direct classroom observation? Those 1% who are poor teachers can often do quite well presenting a ‘dog and pony show.’ Even if 10 observers spent two months in the classroom, would they all agree? Would 10 observers of the president or of your doctor all agree on a competency level? We’re not judging the quality of assembly line products; even using objective standards, the subjectivity of humans judging humans enters in.
“After 39 years as a teacher, I don’t have the answer to clear, effective, reliable teacher evaluation, but I’m eager to hear from those who do and will watch this space for those answers.”