Ready, Fire, Aim

About half a year ago I wrote about my college classmate H, whose IQ went from very low during his elementary school years to 160 during his last year of high school. The story was part of a discussion of problems in social science and how they carry over into the rating of students and their teachers. It seems that a recently released study shows that the increase of children’s IQs, previously “thought” unlikely or impossible, is not so uncommon.

But it would be more proper to say that his IQ test score went up, for H was always intelligent. It’s true that the testimony of his intelligence came from his parents, but I would argue that “anecdotal evidence,” if treated carefully, can be truer than the results of tests, which test-takers react to in ways that are poorly understood and which are governed by assumptions that may not be true. As time passes, the results of psychometry and education “science” seem at times to climb in a hyperbolic line ever closer to those asymptotes of understanding, esprit de finesse and good judgment; but the path is not sure, and the goal is not reached. Why not go with finesse and judgment instead or in the first instance?

The article discussing the unexpected findings noted that the “phenomenon” was not understood. Psychological researchers are probably even now planning the expected “further research” to determine what has happened. Who knows but that in another few years these lines of research will grow even closer to the asymptotes? While we wait for them, some items of discussion immediately occur to me.

Just as it is possible but unlikely that a weak, unhealthy and uncoordinated boy of six can emerge chrysalis-like from school age a championship athlete or a ballet dancer, so it is possible but unlikely for a mentally unprepossessing boy to grow into a coruscating intelligence in a few years. Exceptions only sometimes prove the rule, as when young Theodore Roosevelt overcame the difficulties of his physically challenged youth. What is more likely is that boys like the four-year-old Albert Einstein had mental lives that went unsuspected by people looking for crude or generally familiar signs of “progress.” Finesse is ready to read unfamiliar signs and understand exceptions; testing and psychometry are not, as indeed how could they be? How can a test-giver go meta on the very test he is giving? How consider that maybe the test-taker is not getting brighter but more attuned (or reconciled) to taking tests?

The problems of surveying and testing students don’t stop with the possibility of instruments’ giving inaccurate readings. When I was a boy and wanted to play hooky one day, I complained of feverishness. My mother took my temperature, and while she was out of the room doing other things, I applied the base of the thermometer to an electric light bulb so that it would register a higher temperature. My mother, with her finesse, couldn’t reconcile the high reading of the thermometer and my unfeverish forehead. Her eyes narrowed, and she took my temperature again, this time staying quietly in the room. The jig was up, and off to school I went.

By contrast with the good sense of a narrow-eyed skeptical intelligence equipped with a multiplicity of aids to judgment, the big-eyed naïveté of test administrators is astonishing. Do they think that the Little Dears would never throw a test or a survey? If they needed to spend $45,000,000, that is forty-five million dollars, to learn that students can tell a good teacher from a bad, I am afraid for their judgment. A friend of mine who works at a university where students rate their teachers and where the ratings govern personnel decisions reports that most untenured faculty who mark to exacting standards, make unusually high demands, or assign “too much” homework receive poor ratings. The teachers, to prevent that, oblige the students. The evaluation of teaching is thus caught in an enfilade of survey-fixing. What kind of sense does it make to ask a student to offer a high-stakes evaluation of his teacher in such conditions? Would we allow hospital patients to offer high-stakes evaluations of their surgeons?

And, as so many stories now show, the results not just of surveys but also of tests are subject to corrupt distortion, and the already somewhat tenuous data that they generate are subject to manipulation. Campbell’s Law explains the corruption, but good sense can explain the rest of the trouble with “scientific” evaluation.

One may argue that the use of finesse and judgment can be corrupted too, and one would be right; but the answer to corruption in one method of evaluation is not to replace it with another corruptible method. Rather, we should use our good judgment on a variety of evidence to determine how our teachers and students are doing. We should be trying to get to the bottom of poor data. We should be questioning invalid assumptions. We should be fortifying or building a culture that casts a constantly cold eye on corrupt practices. If we have to use tests and surveys, we should be identifying and removing what Campbell called “corruption pressures” and the misuse of data that gives rise to them.


