A Problem of Fine Distinction

Saturday, August 28th, 2010

Most of us have seen descriptions of what the letter grades mean. My own favorites are the simplest: A is excellent; B, good; C, fair; D, poor but passable; F, failing. It might be worth examining our sense of the five grades by trying to understand what they mean in particular cases, especially quizzes and tests. I fear that teachers and other people in the field do not often do so. Since to give a grade is to make a judgment, we should have a sense of particular qualitative elements of each grade, or at least have the sort of expertise and connoisseurship that can explain itself. However we do it, in whatever subject, we should be able to make some distinctions between A work and B.

A good assessment must give us what we need to make these distinctions. If a student of English is to get an A, he or she should do things with a depth, a flair, a thoroughness, a subtlety that a B student doesn’t bring to the task. The assessment should therefore give students the chance to show these qualities. Of course it should allow an A student to demonstrate knowledge, but the knowledge is best set in circumstances that also allow the demonstration of skill or know-how and the display of uncommon understanding. Similar distinctions should be possible between the other grades, or why have them?

(In effect, some teachers and schools do not have them, for everyone who attends their classes gets A’s and B’s. I am speaking now of comprehensive schools, not specialized ones with selective admissions. I would be willing to bet that teachers and administrators at such schools giving mostly A’s and B’s cannot readily tell the difference between excellent or good work and fair work without baloney. Of course it is possible that, like the statistically improbable students of Lake Wobegon, the students at such a school are all above average. It is also unfortunately possible that such places are loci of pedagogical or intellectual scandal, as in New York, some of whose “proficiency” tests for promotion could be passed by random guesswork. But let us assume competence, good will, and normal distribution of aptitude for the sake of this discussion.)

A corollary of this idea suggests that awarding grades for tests on a numerical continuum may undercut qualitative distinctions between grades. Imagine a multiple-choice test of one hundred questions. In a standard procedure used by many teachers, a student who answers eighty-nine of them correctly will get a B+, while one who answers ninety will get an A-. On what basis can the teacher giving that test assert that the 90 was excellent work but the 89 merely good?

Perhaps the teacher has devised ten questions unlikely to be answered correctly by anyone but an excellent student. That is fine, but it is probably more than many teachers do, or many of the test banks that go with textbooks and randomly generate multiple-choice and matching questions. Furthermore, the teacher who undertakes the extra work of trying to gauge questions to qualitative distinctions could still be undercut by students’ lucky guesses. Where is the distinction between excellent and good then?

It is often hard to make. Let us take a sample question:

The musical composition called “Emperor” is a. an anthem b. a concerto c. a string quartet d. an aria e. an opera[1]

The answer to this question, combined with those to ninety-nine others like it on a test, might determine whether a student in, say, a music appreciation class (if such things still exist) was excellent, good, or worse.  What makes correct answers to ninety such questions excellent work but to only 89 merely good work? Put this way, I think the question has no defensible answer unless an almost incredible amount of forethought went into the design of the test. (Now, professional test-writers make two or three times as much as professional teachers, but I wonder if even they give this kind of thought to the questions they devise.)

We must also ask: how is guesswork discounted? If the test is like most multiple-choice tests given in class, the answer is that it is not discounted at all. But let us say that on this test the teacher subtracts .2% for each wrong answer from the total of right answers. A student might have reasoned that no opera would have the title Emperor without an article, that an aria is named after its first line, which Emperor is unlikely to be, and that an anthem would name something or someone more particular, leaving answers b and c. If test-taking is like gambling (and many courses in “test-taking skills” make it so), the student has increased his odds of a right answer from .2 to .5, making a guess worth the chance. What does that guessed right answer, in effect a coin toss, have to do with music appreciation or even just musical knowledge? A further problem of this question is that the most obvious answer, b, is not the only correct answer, since that name is given not just to the piano concerto by Beethoven but also to a string quartet by Haydn.
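For readers who like to see the gambling arithmetic spelled out, here is a minimal sketch. It assumes that a right answer is worth one point and that the penalty for a wrong answer is a fifth of a point, which is my reading of the deduction described above; the function name `expected_gain` is invented for the illustration.

```python
def expected_gain(options_left: int, penalty: float) -> float:
    """Average change in score from one guess among the remaining options.

    Assumes a right answer earns 1 point and a wrong answer costs
    `penalty` points (an assumption made for this illustration).
    """
    p_right = 1 / options_left
    return p_right * 1 - (1 - p_right) * penalty


# A blind guess among all five options, then a guess after three
# options have been ruled out by reasoning about titles alone.
blind = expected_gain(5, 0.2)
narrowed = expected_gain(2, 0.2)

print(f"blind guess:    {blind:.2f} points on average")
print(f"narrowed guess: {narrowed:.2f} points on average")
```

Under these assumptions the narrowed guess pays about ten times what the blind one does, yet neither figure says anything about whether the student has ever heard the piece.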

If the student had been asked a short-answer question such as “Identify a work containing variations on a theme, and name its composer,” the student might have written “Haydn’s ‘Emperor’ quartet,” and the teacher could have been confident that the student knew her stuff. The problem of bad questions would have been sidestepped, and the teacher could at least have been sure that guesswork had been eliminated. (Of course, the teacher would still have had to know that both Beethoven and Haydn wrote compositions called “Emperor,” and he would need a reason for supposing that this fact was somehow representative of the knowledge to be tested and therefore worth including.)

But the fundamental question remains. Does such a test offer a way to distinguish between excellent work and good? I have my doubts. It might at least certify that a mind can retain factual detail, which is important; but where is the assessment of skill and understanding?

The problem demands a solution, and I hope to touch on some in future postings.

[1] The idea for this example comes from Professor Barzun’s discussion of Banesh Hoffmann’s book The Myth of Measurement.

(No) Comment

Thursday, August 19th, 2010

Though it would take time for me to tell all the ways in which I gained from being a student in Kenneth Koch’s course of modern poetry, I want to mention one in particular: what he taught me about making comments on students’ papers through the comments he made on mine. That is by way of saying a few things about comments in general.

The first is that properly prepared students are avid for comments. I heard a contrasting view before my third year of teaching, when I attended the summer workshop of a “project” famous for “developing” the “writing process.” The leaders of this workshop contended that students do not read their teachers’ comments, so there is no point in the teachers’ making them. This claim, supposedly based on “research,” was repugnant and, in my experience, a demonstrably false bit of Bracknellian[1] nonsense, but I entertained it provisionally, at least to try to understand in what circumstances it might be true.

I cast back to the team teachers of my 11th-grade English class, Mr. Z. and Mr. M. Any paper returned by Mr. Z. had comments such as “v. good” or “✔”. Those from Mr. M. had notes on my wordiness, my pompousness, or, sometimes, my concision and clarity. Errors of usage came under the red pencil, as did errors of grammar. The virtues he occasionally noted had specific names and were not subsumed with everything else under v. good.

But mostly I thought about Professor Koch (pronounced Coke), whose every returned paper was a course in writing. As you might guess of the teacher of a class in the modern poets who was himself a poet, much of our work consisted in writing poetry. If we studied Whitman, we would write a Whitman imitation, and likewise for Yeats, Pound, and Lawrence. We also wrote a term paper in prose, a midterm essay, and final exam essays, one of which could be a poem.

Unlike many or even most schoolteachers who examine their students’ poems, Koch would subject our poems to genuine criticism, including particular praises and reproofs. The reproofs were as gentle as he could make them if he thought the writer had made a good attempt, but if something was “unWhitman-like,” it was unWhitman-like. He did not accept the notion that students’ poetic work was exempt from discipline and criticism, and most of his students left his class with a realistic estimation of their talents at poetry. He was clear, however, that he wouldn’t let lack of talent get in the way of a decent grade if the student did well on the paper and the essays and gave the poems a try. I was in the unusual position of having him like my poems more than my prose, so my advantage worked the other way around.

The prized comment was “very exciting,” always combined with particulars. It brings me to another important point about comments. Students value the comments of teachers who take them and their work seriously enough to be excited by it, or absorbed, or at least demonstrably occupied. “V. good” doesn’t cut it. If that is what students face when they get their papers back, then of course they would not value the comments. One 9th-grade student during my student teaching submitted a homework assignment early on that said in the middle of a prose passage, “Mr. V are you reading this?” I wrote in the margin, “Of course. I assigned it.” He came to me the day after I returned the paper and thanked me for paying attention to his work. I had his attention for the rest of the semester.

That brings us to a third point. Professor Barzun said that teachers who offer their students the criticism their writing needs and deserves will “work like dogs.” If that was true of his colloquium in important books with its twenty students, or of Koch’s course in modern poetry with its forty, it is terribly true for a high-school teacher with an unspeakably large number of students. Writing, a skill or talent, requires the teacher to be a coach and editor; coaching and editing must aim at particular people. Teachers with large loads of students are bound to have trouble managing this demand unless they have extraordinary fortitude and stamina. The difficulty lies in the quality and intensity of attention required and in the degree of detail that has to come under the teacher’s active notice. It took me a long time to develop that kind of editorial stamina, and even now I have to pace myself when grading writing, taking breaks to stay fresh and open, not burnt out. New teachers daunted by the job should know that the needed ability will probably come, but they must not suppose that it will be easy. Candidates considering jobs at schools where they will have large loads of students may have to inure themselves to dealing with an insufficiency of time and knowing that even with the best will, their work may be “not altogether satisfactory.” Perhaps teachers working in these conditions end up not offering the kind of comments that students take to, but through no grave fault of their own. How sad, then, if they end up moths in a flame!

The answer to the problem of students’ not taking their teachers’ comments seriously does not lie in abandoning comments. Rather, it lies in establishing conditions in which teachers know what comments are worth making and have the working conditions in which to produce comments worth reading.  On the students’ part it means coming to an assignment prepared and ready to seek and take advice, the way my 9th-grader could do once he had satisfied himself that I really recognized him.

[1] This coinage refers to Lady Bracknell, who says that “statistics are laid down for our guidance.”

℞ Stone Tablets

Tuesday, August 10th, 2010

Sometimes it’s a pity that a valuable dictum cannot be presented on a stone tablet by a prophet. Lightning and thunder might help it make an impression too. My dicta of the day are sometimes called Campbell’s Law or Campbell’s Laws, after Donald T. Campbell, a social scientist who died about fifteen years ago. They appear not on stone tablets but on page 49 of a paper he wrote in 1975:

“The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures, and the more apt it will be to distort or corrupt the social processes it is intended to monitor.”

The next few pages of the paper give examples of corrupted processes from law enforcement and government administration, but they end with a clincher drawn from the world of education. A program of compensatory education in Texas was corrupted by the private contractors hired to administer it. It turned out that success was to be determined by the performance of the targeted students on a test, and the contractors “coached” their students in order to produce good results.

Campbell reports that the contractors “defended themselves with a logical-positivist, operational-definitionalist argument that their agreed-upon goal was defined as improving scores on that one test,” but that their methods were generally regarded as scandalous. How far we have come since this scandal was reported in a 1971 paper can be gauged by the fact that not just the city of Texarkana but much of the United States since No Child Left Behind is doing the same thing. Campbell saw it coming and warned that “when test scores become the goal of the teaching process, they both lose their value as indicators of educational status and distort the educational process in undesirable ways.”

Education, he points out, is not uniquely corruptible: the problem lies in the very idea of gathering simplistic data, deciding that they represent or encapsulate a complex state or process, letting them become normative, using them to determine whether someone has done something right, and attaching these evaluations to a system of rewards and punishments. He was pessimistic about the possibility of circumventing his laws, and the growing “testing-&-accountability” fiasco is bearing him out.

Reading and Wisdom

Friday, August 6th, 2010

The Introduction to Great Books series was originally intended for use in high schools, though a visit to its web site suggests that its main readership is now adults in reading groups. It is excellent to find people of any age reading Kafka, Conrad, Dinesen, O’Connor, and Tocqueville, but sad to think that the joys and rigors of such encounters may not be a part of many high-school English curricula any more.

If students have been properly trained in reading, study, and discussion for a number of years (maybe having read the Junior Great Books series), they will be able to handle these authors and others even more difficult: William James, Thomas Kuhn, Sir Karl Popper. It is not just their vocabulary that will grow, but also their range of expression and thinking.

The vocabulary and understanding grow in a healthy and productive way because the students encounter words alive in the reading and not dead on a test-prep list. The health is threefold. It assures them that the words they are learning are used by real writers and so worth learning as part of a real and not a fake landscape of language that they are exploring. It sets up opportunities for emulation or trying-on in writing and discussion, during which they can explore and test good usage and fit words to thoughts. And it gives them a chance to learn how to be rigorous and to increase their intellectual stamina.

William James reports that Nathaniel Bowditch, “who translated and annotated Laplace’s Mécanique céleste, said that whenever his author prefaced a proposition by the words ‘it is evident,’ he knew that many hours of hard study lay before him.” He had enough humility and eagerness and a strong enough sense of obligation to the masterpiece he was encountering to be ready to tangle with it hard and long. Eventually what had been evident to Laplace became evident to Bowditch too. The same process, on a less exalted but no less important level, occurs or can occur in high school with students who have effectively met a series of demanding but rewarding readings with study and careful conversation.

The sense that something is evident becomes stronger and more confident as it is tested, proved, probed, and exercised. And here we come to a benefit of a good course of reading and discussion that will escape capture by checklists of little skills and attainments, which embody the reductive fallacy in their pedagogical assumption (in this case the fallacy is that high-school reading is nothing but a string of discrete “competencies”). At some point, often but not always foreseeable, the student passes into an intellectual terrain to which James refers when he says that “the art of reading (after a certain stage in one’s education) is the art of skipping.”

But he makes that observation in an analogy whose second part is that “the art of being wise is the art of knowing what to overlook.” Here we have something difficult or impossible to manage in a pedagogy aimed entirely at taking baby steps even on the verge of adulthood through a course of explicit, identifiable minor achievements: how do we cultivate wisdom and understanding when they cannot be handled reductively? We come back to the problem Gilbert Ryle noted in his essay “On Forgetting the Difference between Right and Wrong.” It is that we are talking about something different from knowledge and skill.

If we don’t talk about it, we risk educating not a “mind of a high order” but someone like Funes the Memorious, the title character of a story by Borges. Poor Funes! He couldn’t skip or overlook anything. “My memory…is like a garbage heap,” he tells the narrator of the story. He could remember tens of thousands of futile details but had almost no ideas. James would say that he lacked the power of reasoning because he had learning but not sagacity. Of course, Funes would have been able to ace the kind of test that treasures the quick manipulation of learned detail. Of Funes and people like him one is tempted to ask, with T. S. Eliot, “Where is the wisdom we have lost in knowledge?/ Where is the knowledge we have lost in information?”