Menus, not traffic lights: A different way to think about AI and assessments

Earlier this year I went on a Disney cruise with my extended family. Seasickness aside, another notable experience was the dining: a sit-down dinner each night with a full menu – appetizers, soups, salads, mains, light mains, bread service, desserts, no-sugar desserts, after-tinis… This bewildering selection was supported by an enthusiastic and caring ‘server’ (kind of like a personal maître d’) who got to know our family, our preferences, and our dietary needs – and was able to match these with expert recommendations from the menu.

It occurred to me, while looking out across the great expanse of the Pacific Ocean and trying to hold in my latest meal, that perhaps the way we approach students’ use of generative AI in assessment isn’t too far off this experience.

This article is an extension of a post that originally appeared on LinkedIn.

Traffic lights and assessment scales

A common approach to AI in assessments has been to use a traffic light system or a linear scale. Red: can’t use AI. Green: go for it. Yellow: only some forms of AI use are allowed. Sometimes a linear scale adds a yellow-green or yellow-red to the mix as well. This definitely brings familiarity and safety, which are much needed in a time of great turbulence. After all, traffic lights exude safety, and all of us educators are inherently comfortable with requiring students to use their own brains for most of their work. A scale is tempting because it suggests that we can dial up and down how students use AI.

But for take-home assessments that are not in some way supervised, we cannot know whether students use AI and cannot physically stop students from using it either. We also can’t truly AI-proof assessments by making questions that AI “can’t answer”. The yellow light is not really valid here when it comes to unsupervised assessments, and neither is the red light. Just because we say an assessment should have restricted or inhibited use of AI doesn’t mean that this will happen. As Phill Dawson, an assessment and integrity expert from Deakin University, says, any restrictions that are not enforceable hurt assessment validity.

Additionally, a scale or set of lights/numbers does not reflect the reality and complexity of how generative AI is used. That use cannot be sorted into neat buckets.

This leaves us really with ‘go’.

Menus

When you sit down at one of the on-board restaurants on a Disney cruise and are welcomed by the effervescent server and presented with the overwhelming menu, you know you’ll eat something (or you would have skipped it and gone to trivia instead). But what? This depends on your appetite, what others around you are eating, what is fresher on the menu, your dietary requirements, what you tried previously, etc. One of your server’s roles is to help you make sense of the menu and help you choose the most suitable items for that point in time.

Sometimes you’ll have an entrée and light main, sometimes you’ll have the bread and five desserts, and sometimes you might have a soup, main, and a sugar-free dessert. There is no right or wrong (or allowed/disallowed), but there may be options that are more suitable or less suitable.

I think generative AI use in assessments is a bit like this.

There is a bewildering (and growing) array of generative AI tools available that may be applied for assessments. In addition to the staple ChatGPT, there are tools that help you summarise the literature (e.g. scite.ai, elicit.com), get answers from the internet (e.g. perplexity.ai, copilot.microsoft.com), produce images (e.g. DALL-E 3, MidJourney), create presentations (e.g. Gamma, Plus), analyse data (e.g. ChatGPT Plus), improve writing (e.g. Grammarly), produce videos of yourself talking (e.g. HeyGen), make music (e.g. Udio), and more. The mainly text-based mainstream generative AI tools (e.g. ChatGPT, Google Gemini, Microsoft Copilot) can themselves be used in myriad ways – brainstorming, checking, reflecting, editing, etc.

As educators who know our students and our teaching and assessment contexts, we are uniquely positioned to know what menu items to recommend. For any one assessment (and student, for that matter), there will be ways to use AI and tools that are more suitable or less suitable. Sometimes it will be more productive and responsible for students to use particular AI tools to engage with literature and edit their text. Other times it will be more suitable for students to use other AI tools to provoke reflection, draft a structure, and then provide feedback on text. Want to order five desserts? Maybe not for this assessment, but it’s a great idea for the next one.

Knowing what to order

We’ve been here before though.

Consider some standard tools: Word, Excel, PowerPoint, and Photoshop. When we set an assessment that asks students to create an annual report, we don’t go around traffic-lighting particular tools. A good student will know to use Word for word processing and formatting, and Excel to calculate the tables of financial data needed. Another student may use tables in Word to lay out the financial data and work things out with a calculator – perhaps they were never taught how to use Excel. One student may use Photoshop to produce an impactful graphic for the front page, whereas another student may expertly use PowerPoint shapes to the same effect (this would be me).

The point here is that there is no right or wrong per se, but there are definitely tools (and ways to use them) that are more or less suitable. Some students will know, and some won’t.

Our role as educators, like our friendly server who started to feel like part of the family by the 4th night of the cruise, is to help our students understand which tools (and ways to use them) are best suited to their own needs at a particular point in time for a particular assessment. This is a key part of developing students’ information and digital literacy skills.

An ‘AI for assessment’ menu

Brainstorming with my team (the wonderful Jessica Frawley, Samantha Clarke, Eszter Kalman, Benjamin Miller, and Robyn Martin), we came up with a menu of sorts for AI in assessments. Each of these menu items is supported by different AI tools, used in different ways.

As a critical friend – Soups

Suggesting analyses
Provoking reflection
Providing study/organisation tips
Practising

Getting started – Entrées

Suggesting structure
Brainstorming ideas

Engaging with literature – Bread service

Suggesting search terms
Performing searches
Summarising literature
Identifying methodologies
Explaining jargon
Fixing reference list

Generating content – Mains

Writing some text
Making images, video, audio
Making slide decks

Analyses – Lighter mains

Performing analyses of data, text
Suggesting counterarguments

Editing – Coffees

Editing tone
Improving clarity and readability
Fixing grammar
Shortening

Feedback – Desserts

On all of the above elements
Specifically on rubric criteria

If you’ll excuse my poor attempt to connect these to food, hopefully this menu helps to illustrate the nature of generative AI for assessments. For a particular assessment, for example, we may know it will be more beneficial for student learning to use AI predominantly as a critical friend, to engage with literature, and then to edit. Sure, for that assessment they could use AI to generate some content or analyse data, but this will degrade students’ attainment of the learning outcomes. For another assessment and another set of learning outcomes, it may be quite palatable and beneficial to order bread, a main, and skip the AI dessert and do that oneself.

Many of these ‘AI for assessment menu’ items can be achieved through text-based generative AI tools like ChatGPT, Copilot, Gemini, and Claude. It really depends on how you prompt them. Some of these could be built as AI ‘agents’ that are designed by educators to perform a certain task, such as provoking reflection, providing rubric-specific feedback, or suggesting counterarguments.

As an example, for the unit that I’m teaching into this semester, these are the recommendations from the menu:

Assessment 1 (self-reflection about current challenges in a practice setting)
- Soup – AI to provoke reflection – we have built a Cogniti AI agent that was instructed to help students reflect on key issues in the context of the frameworks used in the unit
- Dessert – AI to provide feedback based on rubric criteria – we built another Cogniti AI agent that was given the marking criteria and assignment brief, and instructed to provide feedback to students before submission
Assessment 2 (annotated bibliography)
- Entrée/bread service – AI to brainstorm search terms and ideas – we built another Cogniti AI agent to help students unpack unfamiliar terminology and suggest potential search terms for literature databases
- Bread service – we showed our students scite.ai, elicit.com, and Research Rabbit and explained the ins and outs of each of these tools, and the importance of deep-diving into the literature once key papers were identified
- Dessert – AI to provide feedback based on rubric criteria – again, another Cogniti AI agent that was provided the rubric and assignment brief
We have yet to consider what menu items are most appropriate for Assessment 3 (group project with forward plan).

As educators, we know the contexts of our units. We know the learning outcomes, and the assessments, and how students might best learn. We know our students, their interests, and their needs. As enthusiastic and caring educators, let’s also come to know generative AI and the ways that it can be used, so that we can help students steer clear of less suitable choices and help them embrace choices that will not only be palatable, but productive and responsible.
