Designing questions for online, open-book assessment (Video case study)

Dimitra Petropoulou, London School of Economics
Video published September 2020 as part of the Virtual Symposium on Adaptable Assessment
Published as a case study October 2020

Downloadable material: Word document with "old"-style questions, "new"-style questions and reflections on the difference.

My name is Dimitra Petropoulou and the purpose of this video is to talk you through how I went about designing problems for my online open book assessment for a second year micro course at LSE.

So obviously over the last few months we've all been facing the difficulties that have arisen with the COVID-19 pandemic and the objective here is to design effective online assessment that enables our students to progress. But of course we face a lot of constraints. Our students are dispersed internationally, there are limitations in technology and also our students wouldn't necessarily have internet connectivity at specific times. Of course let's not forget that our courses are really geared towards on-campus timed final exams. So as a result of that our students and ourselves aren't very experienced in designing exams that are quite different in nature. So as a result of all these challenges many universities opted for online 24-hour open book exams this past summer. So we've gone from students that look a little bit like this to something closer to this.[1:14] Of course it's quite a different way of being assessed and in itself it poses quite a lot of challenges.

So the first of them is that open book exams typically mean that all aspects of course material are accessible including lecture capture. So, lecture slides, problem sets, solutions to problem sets, past papers, solutions to past papers, and so on. Now it might also mean students have access to search engines and books and online resources of different kinds, not to mention their handwritten notes. Another key factor is that candidates would have a far longer period of time to reflect and work through the computations of problems. So that really changes the nature of the assessment and it means that we really can't rely on time as a way of rationing how students can cope with the questions and also we can't rely on memory as an indicator of depth of study. This means we can't rely on time constraints and memory as a way of differentiating between first-class students, 2-1 students, 2-2 students, and so on and so forth.

So we have to think quite differently about how we design our assessments and how we can really test the learning outcomes of our courses. So how can we go about designing problems for open book online exams? This is my take on this issue, just based on my experience. The first point I want to make is that a good online problem isn't necessarily a difficult problem. What I've observed is that when faced with 24-hour exams questions have been ratcheted up a little bit in difficulty or made far longer but this isn't necessarily a good design. Making them harder isn't necessarily making them more sensitive and nuanced in their ability to discern the different levels of understanding and learning of different types of students.

A key principle that I try and follow is application not replication. So when designing your questions it's very important that the answers to those questions can't just simply be lifted off the slides or something that someone could look up on Google. This is a real problem because in many of our questions we ask for definitions or we ask students to discuss or show something that we've directly addressed in a lecture or a class. So we have to move away from that but then again we can't design questions that are very removed from what we've already taught. We have to balance the need for designing questions that are similar to what has been taught with the need to differentiate those questions to create some nuance and for students to have the opportunity to reflect on those questions.

Another thing that I think matters a lot is what proportion of marks arises from computations and what proportion of marks arises from explaining reasoning, discussing intuition, maybe giving an opinion or arguing a point of view - these more qualitative aspects to complement the computations - and it's the latter that really distinguishes between different types of students, I've found.

So let's think a little bit about keywords in exam questions: words like "define", "find", "show", "derive", "solve", were all over the past paper questions of micro principles which is the course I teach at LSE. Somewhat less so "reason", "explain", "discuss". There was a strong emphasis on computation, albeit with tweaks and twists and turns. Another aspect of designing problems for online exams is to really think about a further dimension which is removing or adding choice.

If you go down the road of making the questions just that little bit more challenging what you could do is introduce a little bit more choice to balance that out so students don't find themselves trapped. Alternatively, since students have all these notes at their disposal and 24 hours, what you could do is remove choice, so that students have to study a particular topic for the exam, which is something we might not do for a closed book timed exam.

So, to talk you through it, I'm going to show you two problems. One is going to be a micro problem that was in one of my exams a couple of years ago in the course I teach - an exam question from the past. And then the question that I set this past May on a similar theme which is oligopoly essentially. I'm going to talk you through why this traditional micro problem is not appropriate for an online 24 hour exam and how I hope that the question I did design was better at distinguishing between different types of students.

Let's go through this problem.[6:12] This is a question in which candidates are asked to define economies of scale, examine the conditions under which firms have economies of scale, and then look at a particular case where a technology is provided, a cost function is provided, and to establish whether a firm with this cost function would have economies of scale; then to explore whether perfect competition would be possible if firms in the industry had economies of scale. Once that's done students are asked to examine an industry with two firms, look at the possibility of a cartel with these firms, and to establish whether a cartel could operate profitably in this industry. This would be very much a computational part of this exam question. Moving on from the cartel, candidates are asked to basically solve for the Cournot-Nash equilibrium and establish whether the two firms could operate profitably in that situation, then to examine how many firms there will be in the industry.

So let's look at this question, which is a perfectly good question under normal circumstances, but why isn't it so good for online open book? First of all, if we look at the highlighted blue bits,[7:23] we see there's quite a lot of definition there: things that students can just lift from the internet or from their lecture slides. It's not a very good set of questions to ask because it's not very discriminating; everyone will be able to answer them.

The other thing to point out is the highlighted yellow bits.[7:43] These parts come directly from the lecture material. There are specific slides that address these points so a student could get quite a large chunk of marks simply by visiting the lecture slides, identifying slides that answer these particular questions and replicating them. So it's not a very useful element of assessment.

Last but not least, the last few bits of this problem are really heavily computational. There's very little scope for asking students to provide intuition or discuss something. It's very much about solving the model in different situations and drawing a conclusion. So "how many firms..." obviously they have to show their workings or their reasoning but a very specific concrete computational answer is required. So here the notion is that weak students would remember bits from the slides to pass and then hopefully muddle through with some of the computations while an excellent student would definitely understand what's in the lecture content, in the slides and then be able to solve this particular problem. As you can see, I would expect that, given 24 hours, most of my students would be able to get through this problem and it wouldn't be a very good problem if everyone got 50 out of 50 or thereabouts.

So this is the question I devised for the summer exam in my course.[9:05] I wanted to test oligopoly but didn't want to set a question that was really very similar to the examples that students saw in the lecture or the class and I wanted to really test Cournot competition, Stackelberg competition; really a quantity-setting set-up. So I thought about it a little bit and I came to the conclusion that I need to devise something that moves away from firms choosing quantity to maximize profit but to still have a setting in which quantity is the choice variable and to get students to apply the knowledge they have based on the Cournot-Nash equilibrium and Stackelberg equilibrium in a different context.


So I'll just quickly talk you through the question and what it's asking and then explain why I think it worked really well. I really think it did because the distribution of marks was really great and there was a full range of marks in this question so it seemed to be really good at discriminating between different types of students.

So, Andy is a student on a micro course and his performance depends on the number of days he chooses to revise, q, but revision is costly - it's lots of hard work - so there is a cost of effort/ cost of revision which is q squared and Andy's payoff is going to be his exam score, net of the cost of revising. The first 10 marks are pretty basic computational. Candidates would be asked to, first of all, identify what mark Andy would get and his payoff if he doesn't do any revision at all and then to actually work out if Andy was essentially a monopolist here - he was choosing things on his own, if he chose his q - what would be his optimal number of days of revision and what mark would he end up with. The problem goes on to introduce Barry.[10:53] Barry is also taking this course and Andy proposes that they cheat in the final exam. Now if they cheat they each get a score s which is now a function of the number of days each of them chooses to revise and candidates were given, as you can see, an equation for the score but also told the cost of effort still applies: the cost of revision.
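The first, single-student part of the problem can be sketched in a few lines of Python. This is only an illustration: the transcript does not give Andy's actual score function, so the linear form and the numbers below (`BASE_SCORE`, `MARKS_PER_DAY`) are assumptions made up for this sketch. Only the payoff structure, exam score minus a revision cost of q squared, comes from the talk.

```python
# Hypothetical parameters: the transcript does not state Andy's score function.
BASE_SCORE = 20.0      # assumed mark with no revision at all
MARKS_PER_DAY = 12.0   # assumed marks gained per day of revision


def score(q: float) -> float:
    """Assumed linear exam score after q days of revision."""
    return BASE_SCORE + MARKS_PER_DAY * q


def payoff(q: float) -> float:
    """Payoff as described in the talk: exam score net of the revision cost q**2."""
    return score(q) - q ** 2


def optimal_days() -> float:
    """Solve the first-order condition MARKS_PER_DAY - 2*q = 0 for q."""
    return MARKS_PER_DAY / 2.0


q_star = optimal_days()
print("No revision: score", score(0.0), "payoff", payoff(0.0))
print("Optimal revision: days", q_star, "score", score(q_star), "payoff", payoff(q_star))
```

Under these assumed numbers Andy revises for 6 days. The point of the sketch is only that this part of the question is routine optimisation, which is why it carries the "pretty basic computational" marks.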

So now students are asked to work out the reaction functions, figure out if revision days are strategic substitutes or complements, figure out the Cournot-Nash equilibrium and establish whether Andy and Barry should pursue the plan of cheating or not and to explain why or why not. Finally, I look at it sequentially from the perspective of Stackelberg where one student chooses how many days to revise followed by the next student and then the candidates would have been asked to analyze what they expect to happen.
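The steps in this part of the question (reaction functions, Cournot-Nash equilibrium, Stackelberg) can be sketched as follows. The joint score function used here is hypothetical: the exam's actual equation is shown on the slides and is not reproduced in the transcript, so the functional form and the parameters `A`, `B` and `C` are assumptions chosen only to make the mechanics concrete.

```python
# Hypothetical joint score if Andy and Barry cheat (NOT the exam's equation):
#   s(q_own, q_other) = A + B*(q_own + q_other) - C*q_own*q_other
# Each student's payoff is the shared score minus his own revision cost q_own**2.
A, B, C = 20.0, 12.0, 1.0  # assumed values; the sketch needs C**2 < 2


def joint_score(q_own: float, q_other: float) -> float:
    return A + B * (q_own + q_other) - C * q_own * q_other


def payoff(q_own: float, q_other: float) -> float:
    return joint_score(q_own, q_other) - q_own ** 2


def best_response(q_other: float) -> float:
    """Reaction function from the first-order condition B - C*q_other - 2*q_own = 0."""
    return max(0.0, (B - C * q_other) / 2.0)


def cournot_nash_days() -> float:
    """Symmetric fixed point of the reaction functions: q = (B - C*q) / 2."""
    return B / (2.0 + C)


def stackelberg_days() -> tuple[float, float]:
    """Leader maximises his payoff anticipating the follower's reaction.

    Substituting the reaction function into the leader's payoff gives the
    first-order condition B*(1 - C) - (2 - C**2)*q_leader = 0.
    """
    leader = max(0.0, B * (1.0 - C) / (2.0 - C ** 2))
    return leader, best_response(leader)


q_nash = cournot_nash_days()
q_lead, q_follow = stackelberg_days()
solo_payoff = A + B ** 2 / 4.0  # no cheating: score A + B*q, optimum q = B/2

print("Reaction function slope:", -C / 2.0)  # negative: strategic substitutes
print("Cournot-Nash days:", q_nash, "payoff each:", payoff(q_nash, q_nash))
print("Revising alone, payoff:", solo_payoff)
print("Stackelberg days:", q_lead, q_follow)
print("Stackelberg payoffs:", payoff(q_lead, q_follow), payoff(q_follow, q_lead))
```

With these assumed numbers, revision days are strategic substitutes, both students do better in the simultaneous game than revising alone, and the Stackelberg leader does no revision at all while the follower does all the work. A different score function would give different answers, so none of this should be read as the exam's solution.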

Last but not least,[11:51] since this is a question on cheating - we don't want students to be cheating - the university introduces a leniency scheme whereby students who admit to cheating and provide evidence are off the hook while everyone else gets seriously punished. I'm inviting candidates to discuss the likely effects of such a leniency scheme. Just to put it into context, we did discuss the case study of Virgin and British Airways and the collusion case between them over FM flights and we talked about the EU and US leniency programs and how they differ. So we have discussed leniency schemes in the context of the airline industry and other industries but not in this particular context. So here again students are asked to draw on the discussion we had in class and apply it to a new setting.

So why do I think this problem really was very good for the setting? The first thing is, you might have noticed there are no definitions here and this question has nothing to do with traditional Cournot or Stackelberg competition. These are students choosing revision days, not firms choosing output, but the principles are the same. The idea here is that if the student wanted to look this up in a book or online or in the slides they will not find it. I literally made this up in my head. So no one can look this up; it doesn't exist anywhere. However, it can be very easily solved using the skills the students have if they have learnt the syllabus in sufficient depth.

So the first thing is, no definitions. The next bit is the parts highlighted here in blue.[13:34] There are quite a lot of computational elements to begin. If I proceed to the next bit everything in blue is computational and if you add it all up a student wouldn't be able to get more than say 45 or 50% - at most 50% - in this question if they relied exclusively on computation without any explanation.

The key to this problem is the parts in yellow.[13:58] In the parts in yellow, students are asked to qualitatively discuss, interpret, or explain the rationale behind the results they get or the answer they provide. So they have to reason based on the mathematical computations; they have to discuss the implications; they have to explore whether the equilibrium is in fact realistic: is this Stackelberg equilibrium something that we would actually expect to happen? It turns out that it's somewhat problematic and some students - the better students - did arrive at that nuance of discussion; they said it's not really realistic that one of the students would tolerate colluding with someone in an exam when that other someone hasn't revised at all. So it's quite interesting to see how the nuances in the answers and the depth with which students approached these questions clearly reflected their level of reflection and understanding on these topics.

Finally, 15 marks are given to discussing the economic mechanisms of leniency programs in a particular context. All in all, I would say at least half the marks were about demonstrating reasoning and discussing a point of view. Notice the very last part here: "would you find such a scheme desirable?" Here I am explicitly asking students for their opinion - for their position - and they would need to argue whether they find it to be desirable or not. Now of course they could argue one way or the other. It doesn't matter. It's really about the clarity of the arguments they put forth on either side.

So moving away from just computation, definition, and replication and moving to application and discussion: maybe these things are actually good in general? Maybe we should be designing questions with this sort of approach anyway, but certainly when we have open book exams I found it to be really useful.

