Archive for August, 2013

Evaluation rubrics: the good, the bad, and the ugly

A real-time chronicle of a seasoned professor just about to launch the third edition of his massively open online course.

With the third session of my MOOC Introduction to Mathematical Thinking starting on September 2, I am busy putting the final touches to the course materials. As I did when I offered the second session earlier this year, I have made some changes to the way the course is structured. The underlying content remains the same, however – indeed at heart it has not changed since I first began teaching a high school to university “transition” course back in the late 1970s, when I was a young university lecturer just starting out on my career.

With the primary focus on helping students develop a new way of thinking, the course was always very light on “content” but high on internal reflection. A typical assignment question might require four or five minutes to write out the answer; but getting to the point where that is possible might take the student several hours of thought, sometimes days. Students who approach the course thinking it is an introductory course on logic – some of whom likely will, as they have in the past, post on the course forum that they cannot understand why I am proceeding so slowly and making such heavy weather of the material – will, if they don’t walk away in disgust, eventually (by about week four) realize they are completely lost. Habituated to courses that rush through a pile of material that requires mostly procedural mastery, they find it challenging, and in many cases impossible, to slow down and adopt the questioning, reflective approach this course requires.

My course uses elementary linguistics and formal logic as a vehicle to help develop new thinking skills that are essential for university mathematics majors, very valuable for STEM majors, and of considerable value for anyone who wants to lead a more rewarding life. But it is definitely not a course in linguistics or logic. It is about thinking.

Starting with an analysis of certain features of ordinary language, as I do, provides a starting point that is accessible to everyone – though because the language I examine is English, students for whom that is a second language are at a disadvantage. That is unavoidable. (A Spanish language version, embedded in Hispanic culture, is currently under development. I hope other deep translations follow.)

And formal logic is so simple and structured, and so accessible to a beginner, that it too is well suited to an introductory level course on analytic, and in particular mathematical, thinking.

Why my course videos are longer than most

Because students need to devote substantial periods of time to sustained contemplation of the course material, I have made two decisions that go against the current grain in MOOCs. First, the pace is slow. I speak far more slowly than I normally do, and I repeat each point at least once, and often more. Second, I do not break my “lectures” into the now-almost-obligatory no-longer-than-seven-and-ideally-under-three-minutes snippets. For the course’s second running, I did split the later hour or more long videos into half-hour sections, but that was to make it easier for students without fast broadband access, who have to download the videos overnight to watch them.

Of course, students can speed up or slow down the videos, they can watch them as many times as they want, and they can stop and start them to suit their schedules. But then they are in control and make those decisions based on their own progress and understanding. My course does not come pre-digested. It is slow cooking, not fast food.

Learning by evaluation

The main difference returning students will notice in the new session is the much greater emphasis on developing evaluation skills. Fairly early in the course, students will be presented with purported mathematical proofs that they have to evaluate according to a grading rubric.

At first these will be fairly short arguments, designed by me to illustrate various key features of proofs, and often incorporating common mistakes beginners make. Later on, the complexity increases. For those students who elect to take the final exam (and thereby become eligible to earn a Distinction grade for the course), evaluation will culminate in grading three randomly assigned, anonymized exam submissions from fellow students, followed by grading their own submission.

Peer evaluation is essential in MOOCs that involve work that cannot be machine graded, definitely the category into which my Mathematical Thinking course falls. The method I use for the Final Exam is called Calibrated Peer Review. It has a long history and has been shown to give acceptable results. (I describe it in some detail on my MOOC course website – accessible to anyone who signs up for the course.) So adopting peer evaluation for my course was unavoidable.

The first time I offered the course, I delayed peer evaluation until the final couple of weeks, when it was restricted to the final exam. Though things went better than I had feared, there were problems. The main issues, which came as no surprise, were, first, that many students felt very uneasy grading the work of others, second, many of them did not do a good job, and third, the rubric (which I had taken off another university’s Internet shelf) did not work at all well.

On the other hand, many students posted forum comments saying they found they enjoyed that part of the course, and learned more in those final two weeks than in the entire earlier part of the course.

I had in fact expected this would be the case, and had told the class early on that many of them would have that reaction. In particular, evaluating the work of fellow students is a very powerful, known way to learn new material. Nevertheless, it came as a great relief when this actually transpired.

As a result of my experience in the first session, when I gave the course a second time this spring, I increased the number of assignment exercises that required students to evaluate purported proofs. I also altered the rubric to make it better suited to what I see as the main points in the course.

The outcome, as far as I could ascertain from reading the comments students posted on the course discussion forum, was that it went much better. But it was still far from perfect. The two main issues were the rubric itself and how to use it.

Designing a rubric

Designing a good rubric is not at all easy for any course, and I think particularly challenging for a course on more advanced parts of mathematics. Qualitative grading of mathematical arguments, like grading essays or works of art, is a holistic skill that takes years to acquire to a degree it can be used to evaluate performance with some degree of reliability. A beginner attempting evaluation needs guidance, most typically provided by an evaluation rubric. The idea is to replace the holistic application of a lifetime’s acquisition of tacit domain knowledge with a number of categories that the evaluator should look for.

The more fine-grained the rubric, the easier it will be for the novice evaluator, but the more onerous the grading task becomes. The rubric I started with for my course had six factors, which I felt was about right – enough to make the task doable for the student yet not too many to turn it into a dull chore. I have retained that number. But, based on the experiences of students using the rubric, I changed several categories the first time I repeated the course and I have changed one category for the upcoming third session.

In each of the six categories in the rubric, the student must choose between three levels, which I name Novice, Apprentice, and Practitioner. I chose the names to emphasize that we are using evaluation as a way to learn, and the focus is to measure progress along a path of development, not assign summative performance judgments of “poor”, “okay”, and “good”.

The intention in having just three levels is to force a student evaluator to make a decision about the work being assessed. But this can be particularly difficult for a beginner who is, of course, lacking in confidence in their ability to do that. To counter that, in this third session, when the student enters the numerical value that course software will use to track progress, the numerical equivalents to those three categories are not 0, 1, 2, but 0, 2, and 4. The student can enter 1 or 3 as a “middle value” if they are undecided as to which category to assign.
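The scoring scheme just described can be sketched in a few lines of Python. This is only an illustration of the 0, 2, 4 scale with its two middle values, not the code the course platform uses; the names LEVELS and describe_score are made up for the example.

```python
# Numeric values for the three named levels, as described in the post.
LEVELS = {"Novice": 0, "Apprentice": 2, "Practitioner": 4}


def describe_score(score: int) -> str:
    """Return a readable label for a single rubric score from 0 to 4.

    Hypothetical helper for illustration; not part of any course software.
    """
    if not isinstance(score, int) or not 0 <= score <= 4:
        raise ValueError("score must be an integer from 0 to 4")
    names = {value: name for name, value in LEVELS.items()}
    if score in names:
        return names[score]
    # Odd scores (1 and 3) are the "middle values" for an undecided evaluator.
    return f"between {names[score - 1]} and {names[score + 1]}"
```

So a student who cannot decide between Apprentice and Practitioner for some category enters 3, and describe_score(3) reports exactly that indecision.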

Using the rubric

Even with “middling” grades available for the rubric items, most students will find the evaluation process difficult and very time consuming. A rubric simply breaks a single evaluation task into a number of smaller evaluation tasks, six in my case. In so doing, it guides the student as to what things to look for, but the student still has to make qualitative judgments within each of the categories.

To help them make these judgments, the last time I gave the course, I provided them with tutorial videos that take them through the grading process. I record myself grading the same sample arguments that they have just attempted to evaluate, verbalizing my thinking process as I go, explaining why I make the calls I do. They are not the most riveting of videos, and they can be a bit long (ten minutes for some assignment questions). But I don’t know of any other way of conveying something of the expertise I have built up over a lifetime. It is essentially a modern implementation of the age-old apprentice system of acquiring tacit knowledge by working alongside the expert.

Unfortunately, as an expert, I make calls based on important distinctions that for me jump from the student’s page, but are not even remotely apparent to a beginner. The result last time was, for some questions, considerable frustration on the part of the students.

To try to mitigate this problem (I don’t think it can be eliminated), I changed some aspects of the way the rubric is formulated and described, and decided to introduce the entire evaluation notion much earlier in the course. The result is that evaluation is now a very central component of the course. Indeed, evaluating mathematical arguments now plays a role equal to constructing them.

If it goes well – and based on my previous experience with this course, I think it will go better than last time – I will almost certainly adopt a similar approach if and when I give the course in a traditional classroom setting once again. (A heavy travel schedule associated with running a research lab means I have not taught a regular undergraduate class for several years now, though an attractive offer to spend a term at Princeton early next year will give me a much welcomed opportunity to spend some time in the classroom once again.)

Evaluating to learn, not to grade

One feature of a MOOC – or at least a MOOC like mine that does not offer college credit – is that the focus is on learning, not acquiring a credential. Thus, grading can be used entirely for formative purposes, as a guide to progress, not to provide a summative measure of achievement. As an instructor, I find the separation of the teaching and the grading extremely freeing. For one thing, with the assignment of grades out of the picture, the relationship between teacher and student is changed significantly. Also, it means numerical grades can be used as useful indicators of progress. A grade of 35% can be given for a piece of work annotated as “good” (i.e., good for someone taking an introductory course for the first time). The number indicates how much improvement would be required to take the student to the level of an expert practitioner.

To be sure, students who encounter this use of grades for the first time find it takes some getting used to. They are so habituated to the (nonsensical but widespread) notion that anything less than an A is a “failure” that they can be very discouraged when their work earns them a “mere” 35%. But in order to function as a school-to-university transition course, it has to help them adjust to a world where 35% is often a respectable passing grade.

(A student who regularly scores in the 90% range in advanced undergraduate mathematics courses can likely jump straight into a Ph.D. program – and some have done just that. 35% really can be a good result for a beginner.)

One final point about peer evaluation is an issue I encountered last time that surprised me, though perhaps it should not have, given everything I know about a lot of high school mathematics instruction. Many students approached grading the work of others as a punitive process of looking to deduct points. Some went so far as to complain (sometimes angrily) on the discussion forums about my video-streamed grading as being far too lenient.

In fact, one or two even held the view that if a mathematical argument was not logically correct, the only possible grade to give was 0. This particular perspective worried me on two counts.

Firstly, it assumes a degree of logical infallibility that no living mathematician possesses. I doubt there is a single published mathematical proof of more than a few paragraphs that does not include some minor logical slips, and hence is technically incorrect. (Most of the geometric proofs in Euclid’s Elements would score 0 if logical correctness were the sole metric!)

Secondly, my course is not a mathematics course; it is about mathematical thinking, and has the clearly stated aim of looking at the many different aspects of mathematical arguments required to make them “good.” Logical correctness is just one item on that six-point rubric. As a result, at most 4 of the possible 24 points available can be deducted if an argument is logically incorrect. (Actually, 8 can be deducted, as the final category is “Overall assessment”, designed to encourage precisely what the phrase suggests.)
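For readers who like to see the arithmetic spelled out, here is a minimal Python sketch of the 24-point total. Only the category names “Logical correctness” and “Overall assessment” come from this post; the four middle names are placeholders, the position of logical correctness in the list is an assumption, and total_score is a made-up helper rather than anything the course software provides.

```python
MAX_PER_CATEGORY = 4

# Six equally weighted categories. The four middle names are placeholders,
# not the real rubric headings; the ordering is assumed, apart from
# "Overall assessment" being last.
CATEGORIES = [
    "Logical correctness",
    "placeholder 2",
    "placeholder 3",
    "placeholder 4",
    "placeholder 5",
    "Overall assessment",
]


def total_score(scores):
    """Sum the six category scores, checking that each lies in 0..4."""
    if set(scores) != set(CATEGORIES):
        raise ValueError("scores must cover exactly the six categories")
    for name, value in scores.items():
        if not 0 <= value <= MAX_PER_CATEGORY:
            raise ValueError(f"{name}: score must be from 0 to {MAX_PER_CATEGORY}")
    return sum(scores.values())


# An argument that earns full marks everywhere.
full_marks = {name: MAX_PER_CATEGORY for name in CATEGORIES}
# The same argument with a logical error: at most 4 points are lost.
flawed_logic = {**full_marks, "Logical correctness": 0}
# If the grader also zeroes the overall assessment, 8 points are lost.
flawed_and_overall = {**flawed_logic, "Overall assessment": 0}

print(total_score(full_marks))          # 24
print(total_score(flawed_logic))        # 20
print(total_score(flawed_and_overall))  # 16
```

Even in the harshest case, a logically incorrect argument that does everything else well still earns 16 of 24 points, which is the point of weighting the categories equally.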

To be sure, if my course were a mathematics course, I would assign greater weight to logical correctness. As it is, all six categories carry equal weight. But that is deliberate. Most of my students’ entire mathematical education has been in a world where “getting the right answer” is the holy grail. One other objective of transition courses is to break them of that debilitating default assumption.

Finally, and remember, this is for posterity, so be honest. How do you feel?

I’ve written elsewhere that I think MOOCs as such will not be the cause of a revolution in higher education. Rather they are just part of what is more likely to be an evolution, though a major one to be sure. From the point of view of an instructor, though, they are providing us with a wonderful domain to re-examine all of our assumptions about how to teach and how students learn. As you can surely tell, I continue to have a blast in the MOOCasphere.

To be continued …


I'm Dr. Keith Devlin, a mathematician at Stanford University. I gave my first free, open, online math course in fall 2012, and have been offering it twice a year since then. This blog chronicles my experiences as they happen.
