A student uses an iPad to take an online test. (Los Angeles Times)
Editorial writers are entitled to their opinions. But Karin Klein's May 6 piece on automated essay scoring simply misses the point.
She questions the value of automated essay scoring software (AES) as a teaching supplement in the area of writing, based in part on her own daughter’s experience. It is unfortunate that her singular personal experience was frustrating for mother and daughter, but it should not be used to censure an evolving technology that can, indeed, produce better writers.
The study for which I was principal investigator did not address the use of AES in formative applications like the teaching of writing, but examined its performance in scoring high-stakes writing assessments such as those on standardized tests. In that study (which included entries from eight commercial vendors and one university laboratory), the scoring engines performed impressively well. Our recommendation was that the two major Race to the Top consortia should further study the validity of machine scoring to ensure that it works as advertised. Three states (Utah, Louisiana and South Dakota) already use AES to score some aspect of their standardized tests, and other states are considering its adoption.
To Klein’s point, we believe AES has a role in the instruction of writing. It is lamentable that most high school writing classes offer only three writing assignments during a semester because teachers have such a high volume of essays to grade. And one thing we do know after 50 years of educational research is that if you want to be a writer, you have to write more.
AES can provide more opportunities for writing by reducing the burden on teachers, but the teacher would retain the role of coaching students to target what needs to be improved and to incorporate creativity, voice and fluidity in their writing. The automation provides a mechanism for consistent and continuous feedback, along with data to show students’ progress.
The feedback provided by the Web-based software is both quantitative and qualitative. That is, in addition to an overall rating, students may receive scores on individual attributes of writing, and the software may summarize or highlight a variety of errors, ranging from simple grammar to style or content. Some of the software packages also provide a discourse analysis of the work that might look something like this for a persuasive essay:
“The main topic of your paper seems to be the following…
“You have attempted to make three points, but you don’t have too much information for the second and third points.
“This is what I think your conclusion is….”
Certainly, a writer could conclude: “Clearly, the computer doesn’t understand the brilliance of my prose.” But our hope is that the writer would respond, “Wow, if the computer doesn’t understand what I’m trying to write, how can my teacher?” Students will work diligently to modify their writing so that the feedback matches what they expected or envisioned. This is a definition of learning.
Klein would like to see machine perfection, but the software is only as good as the models on which the scoring is based. For every example in which a machine makes a scoring error in the evaluation of a paper, there are at least 10 examples in which human graders get it wrong. This is one reason why we are reaching out to the writing community to gain consensus on what constitutes good writing.
So far, the best that we can agree on is that we know good writing when we read it. This really doesn’t help our students become better writers.
Mark D. Shermis, professor and dean of the College of Education at the University of Akron, is the lead researcher on a study of automated essay scoring software.