Testing 3 – The Ambler

Many of the difficulties associated with testing and other assessments would fall away if they were viewed in a different and better light. Let’s try looking at them as I propose in this posting and see whether that view helps us to draw some useful conclusions. There seems to be a need: why else is the word “testing” so often associated with the words “mania” (mental illness) and “high stakes” (gambling)?

First, we should agree that there is no such thing as an objective test and then ask ourselves what other kinds of test there can be. I’ve never cared for the expression “objective test” because it doesn’t convey much except as a label for something that asks a lot of simple questions and requires choosing simple answers, usually by pointing. Let’s run through Webster’s definitions of the word “objective” and eliminate them one by one.

Does it form a “final cause,” as Aristotle called it? That is, does it constitute a purpose that shapes what aims towards it? All good sense tells us that it doesn’t, though the Testing-and-Accountability people have turned tests into final causes. If we didn’t feel the imperatives they artificially impose on testing, people would assume, correctly, that the purpose of taking, say, English is to learn English, not to do well on an English test. (Then they would buy more novels and fewer test prep books!)

Does the test exist independent of the mind? Well, yes, but only in a trivial sense, for all tests are in that sense objective: there they are, indubitably, on the desk, whether multiple-choice or essay. Surely that isn’t what we mean either.

Does it verify by scientific methods? In most cases, no, though some people claim that some tests are sometimes scientific. I think these claims need careful examination because “scientific” is usually used as a recommendation but is applied without much rigor. This is not a definition to rally around.

Does it ensure that the material examined is not affected by personal perspectives or feelings? Here we have a problem. The designer of the test always uses personal perspectives or feelings to decide what is important and therefore included. He or she also decides on wording, presentation, and “how much it counts,” which also require a perspective. Are the test-writer’s perspective and feelings somehow privileged against the “charge” of “subjectivity,” i.e., having a point of view? If a test were truly independent of perspectives, then anyone from any culture with any background could be expected to do equally well on it, assuming they knew the same things (whatever that means); but we know that that is not the case. (See my posting about the immigrant who thought the Eagle was a bird, not a spacecraft.)

Is that what we want in a test, assuming it can be achieved? I hope not in high school, where students should be learning to recognize and justify their take on things. A “subjective” test might help, not hinder, them in such endeavors.

Feelings, perspectives, and ideas shape facts and give them weight (or weightlessness). For example, fill in this blank: Columbus __________ America in 1492. It would be rather difficult to do so without a perspective on Columbus and on the Americas. The question is not whether perspectives per se are suspect or distorting, but what makes some perspectives useful or sound and others useless or unsound. Bertrand Russell’s “lunatic who thinks he is a poached egg” has a “perspective” that we may legitimately discount. Thomas Kuhn notes that even the equation f = ma depends for its meaning on the “perspective” of the scientist who encounters it. The object of testing should not be to eliminate perspective but to hold it to some standard of fitness. Please note that I’m not saying there is “no right answer.” That is something that a 9th-grader says before he or she learns what finesse is. I am saying that perspective and feeling can have a perfectly valid place in testing.

I would therefore argue in favor of tests that demand a grasp of factual detail and also of significant ideas, the two examined from a perspective articulately and sensibly maintained by the student and appropriately judged by the teacher or marker. Yes, I have used the J-word, one of the few words that matter in the evaluation of anything at all advanced. Otherwise, there is no alternative to tests that give the highest grade to Funes the Memorious, whose command of tens of thousands of unconnected facts is not just useless but harmful.

The answer to the danger of arbitrariness of judgment is not “objectivity.” It is the achievement of good judgment by teachers and the nurturing of their connoisseurship in subjects where it is needed.

But since good judgment is still not perfect, its vagaries can be minimized by ensuring ampleness of assessment. I mean not just lots of tests but a mixture of kinds: oral and written, ex tempore and prepared in advance. A good example would be the basket of assessments for the International Baccalaureate program’s “language A1” courses: two handwritten essays, one on works studied and one on a work not seen before; two oral presentations, one prepared ahead of time and the other, the fearsome “Formal Commentary,” prepared in twenty minutes and executed in fifteen (the commentary, not the student); and two papers written ahead of time by the student on a theme he or she has chosen. None of these counts for more than 25% of the final grade awarded by the IBO. Two of these are graded by the teacher, whose work is vetted by a “moderator.” The others are graded by professional “examiners.” The moderators offer comments to the teacher after the moderation is finished, and all teachers can attend workshops where they learn to teach these courses and to mark course material. These seem to me to be sounder precautions against arbitrariness and poor judgment than the whisking away of judgment and evaluation by ostensibly but not actually “objective” tests.

Another guarantee of sensible evaluation is to have teachers who have learned the subjects they are teaching—seemingly obvious until we discover that upwards of 40% of high school teachers teach subjects they did not major in. Yet another is collegiality in the faculty room, with all the good things it produces. A third, connected to collegiality, would be frequent table-talk about assessments, with more experienced teachers and newer ones discussing how to go about making them.

If we can manage some of these changes in how we view tests, much of the insanity and gambling will fall away.
