False flags and broken trust – Can we tell if AI has been used?

Adobe Stock, used under licence

A recent ABC News investigation and Sydney Morning Herald story revealed the human cost of AI detectors in some universities and schools, with students falsely accused of using AI in their assignments carrying the burden of proof for their innocence. In the articles, individuals reported withheld grades, failed subjects, impacted graduation prospects, and emotional harm.

University folklore says that once the jacaranda tree in the Quad is in flower, it is too late to start studying for final exams. Picture taken on 28th October 2025.

At the University of Sydney, we are approaching the end of the first semester under the Sydney Assessment Framework, which aims to assure both the integrity and the relevance of our degrees in a world where generative AI, trained to reproduce human output, is ubiquitous in workplaces and in the software we use daily. With the young jacaranda tree in the University’s quadrangle in full flower, we are now in the peak assessment period, and many “open” (lane 2) assignments, in which AI use is allowed if appropriately acknowledged, are currently being marked. Given the news articles linked above, in this post we consider the implications of these realities:

  • Generative AI tools are designed to accurately mimic human output. GPT-3.5 stunned the world and its own developers with its performance in late 2022, but the current tools are even more “human-like”. The tools presently in our students’ hands are the worst they will ever use in their studies and working lives – these tools are only going to become more advanced.
  • Machines and humans (even expert educators) cannot accurately or reliably detect AI-generated content, as shown in detailed empirical studies, including those outlined below. Worse still, as these studies highlight, humans can be very confident that they can detect AI use. This leads to:
    • False positives – falsely accused and damaged students, and lost trust between students and their educators.
    • False negatives – a false sense of assessment security and integrity, and lost trust between universities and their communities.
  • Our students need to know how to critically work with generative AI in their studies, lives and careers.

The illusion of accuracy

Software-based AI detectors typically rely on AI models that estimate the likelihood that a text was generated by another AI. These models often depend on linguistic markers such as ‘perplexity’ and ‘burstiness’, which are not exclusive to AI-generated content. Human writing can easily exhibit the same features: formulaic and bland prose; overuse of colons, semicolons and em dashes; and overly long sentences. This is especially true as students increasingly engage with generative AI in their studies and daily lives, and as the tools become ever better at mimicking us.

In practice, a detector’s verdict cannot be independently verified, which means that false positives are inevitable. These tools can also fail to detect actual AI use, creating a false sense of security and undermining the integrity of our assessments. Students with more knowledge of how to modify content – perhaps by simply breaking up long sentences or using other ‘adversarial techniques’ – or with the funds to buy better tools will bypass detection with ease.
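To make concrete how fragile these markers are, the sketch below shows the kind of heuristic a detector might apply: score a text by how predictable its words are under a language model (perplexity) and how much that predictability varies across sentences (a rough proxy for ‘burstiness’), then flag anything below arbitrary thresholds. This is a minimal illustration only – the model choice (GPT-2 via the Hugging Face transformers library), the thresholds, and the burstiness measure are assumptions for the sake of the example, not the workings of any commercial detector.

```python
# Illustrative sketch of a perplexity/burstiness heuristic.
# Assumes the Hugging Face `transformers` library and the public GPT-2 model;
# the thresholds and the burstiness measure are arbitrary choices, not those
# of any real detector.
import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Average per-token perplexity of `text` under GPT-2 (lower = more predictable)."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(enc.input_ids, labels=enc.input_ids)
    return math.exp(out.loss.item())

def burstiness(sentences: list[str]) -> float:
    """Standard deviation of sentence-level perplexities (higher = more variation)."""
    scores = [perplexity(s) for s in sentences if s.strip()]
    mean = sum(scores) / len(scores)
    return (sum((s - mean) ** 2 for s in scores) / len(scores)) ** 0.5

def looks_ai_generated(sentences: list[str],
                       ppl_threshold: float = 40.0,
                       burst_threshold: float = 10.0) -> bool:
    """Crude heuristic: low perplexity AND low variation -> flag as 'AI-like'.
    Careful, even-toned human writing can satisfy both conditions, which is
    exactly why such a flag is not evidence of AI use."""
    text = " ".join(sentences)
    return perplexity(text) < ppl_threshold and burstiness(sentences) < burst_threshold
```

The thresholds do all of the work here, and nothing ties them to authorship: a student who writes short, even, predictable sentences will trip the flag, while lightly edited AI output can sail past it.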

Human detection is no better

A recent study by Fleckenstein et al. (2024) tested whether educators could identify AI-generated essays among student work. The results were sobering: even experienced educators correctly identified AI-generated texts only 38% of the time. Novices performed worse, and both groups were overconfident in their judgments. Another study by Scarfe et al. (2024) found that 94% of AI submissions went undetected by human markers. Educators tended to misclassify low-quality AI texts as student-written and awarded higher grades to high-quality AI texts than to equivalent student work. If even trained educators are vulnerable to misjudging authorship and may inadvertently reward AI-generated work when marking, the implications for fairness and integrity are clear.

Confirmation bias and the human factor

As noted by Bassett et al. (2025), when AI detection software flags a student’s work, educators may search for stylistic features such as perfect grammar or formulaic phrasing to validate the result. The problems facing humans trying to detect AI-generated content are very similar to those facing machines; after all, these features appear in AI output because they appear in the human writing the models were trained on. Seeing these elements may introduce confirmation bias, where suspicion, rather than evidence, drives interpretation. Comparing student work to AI-generated examples, using multiple detectors, or referencing past writing styles only compounds the problem. These methods do not provide independent verification and risk penalising students for legitimate academic growth or stylistic variation.

The real-world test

In the blind study by Scarfe et al. (2024) mentioned above, entirely AI-generated responses were covertly submitted to five undergraduate take-home exams across all years of study at a UK university. On average, the AI responses outperformed real student work by half a grade boundary. Markers failed to spot the AI-generated responses despite an integrity focus on the perceived characteristics of AI-written text, such as answers that seem “too good to be true”, content predominantly not taught in the course, and the use of irrelevant references.

Kofinas et al. (2025) similarly reported that experienced markers could not reliably distinguish between human-authored, AI-modified, and AI-generated assessments, even when they were told to look for them. Worse, the suspicion of AI use influenced marking behaviour: in some cases, genuine student work was downgraded simply for being ‘too perfect’.

The false dichotomy of human vs AI

Perhaps the most challenging assumption underpinning AI detection is the binary view that work is either human-written or AI-generated. In reality, students increasingly produce work with AI, not by AI. The blurring of these lines means that helping students use AI responsibly and effectively to learn and produce better work should be our aim as educators.

The standard and burden of proof, and procedural fairness

The standard of proof required in academic integrity investigations is ‘on the balance of probabilities’, and the University, not the student, carries the burden of proof. Academics’ instincts and AI detection scores alone do not constitute credible evidence. As AI use becomes embedded in assessments, additional indicators such as stylistic markers and comparisons with past work become less reliable than they once were.

Students should not be required to prove their innocence, and silence or a lack of drafts cannot be interpreted as evidence of guilt. If students are required to submit drafts alongside their assessment submission, the assessment instructions need to state this clearly. However, we need to remember that common AI tools can easily generate such drafts and revision histories.

Acknowledgement of AI use

Our recently updated Academic Integrity Policy requires that “students must acknowledge any generative AI tools used in an assessment”. At a minimum, this includes explaining:

  • that they have used AI tools in completing the work
  • the name and version of the automated writing or generative AI tools they have used
  • the publisher
  • the uniform resource locator (URL)
  • and a brief description of how they have used them

Unit Coordinators may also provide additional stipulations for how AI use must be acknowledged/explained, such as by requiring students to keep drafts or a list of AI inputs and outputs. These stipulations must be clearly stated in the assessment instructions and should be used to provide students with the opportunity to reflect on how they have used generative AI, and for educators to provide feedback to help scaffold more effective use.

When submitting an assignment, students also acknowledge that they “have complied with all rules and referencing requirements set for this assessment task, including correctly acknowledging the use of any generative AI tools or assistance from others such as copyediting or proofreading.”

Students should not be required to state that they have not used generative AI tools. If they do not acknowledge use of AI, it should be assumed that it has not been used.

Marking open (lane 2) assessments

Under the Sydney Assessment Framework, the verification of students’ achievement of program learning outcomes comes through the secure (lane 1) assessments, not the open (lane 2) assessments.

Importantly, in all assessments, including open assessments, students are still responsible for the work they submit.

Our recent policy changes mean that educators cannot restrict or ban AI in open (lane 2) assessments. This is about keeping pace with reality and increasing trust and agency. Educators at the University of Sydney have never had access to AI detection tools, for the reasons argued strongly by Bassett et al. (2025) and outlined above. Software-based and human-based detection both produce unacceptable levels of false negatives and false positives. They risk punishing the innocent and, by giving a false sense of security, risk missing those who have not learnt. Generative AI tools are trained to mimic human outputs and will only get better with time.

If writing in a bland or imprecise style is contrary to the learning outcomes and disciplinary or professional expectations, it is more than appropriate to construct the rubric to reflect this and provide exemplars. A marker’s instinct that a student’s writing style is “AI-like” is both unprovable and unhelpful. As reflected by Bassett and colleagues, if an “arbitrary hurdle of ‘human-like-ness’” is required, “what is a student to think if their writing is not viewed as sufficiently ‘human’”?

Some educators may feel that the presence of falsified references is suggestive of generative AI use. Submitting an assignment with fabricated references reflects a learning error and may even constitute an academic integrity breach (for fabricating sources), regardless of whether generative AI was used in producing the work. The use of generative AI in an open assessment is not itself a breach of academic integrity, provided that students acknowledge it. Importantly, free and paid generative AI tools that connect to real sources and compose accurate bibliographies have existed for over two years (such as the deep research-capable tools like Perplexity, ChatGPT, Gemini, or Claude, or other literature-grounded tools like Scite or Scispace).

When grading an open assessment, educators should:

  • continue to mark the quality of the work submitted as reflected in the rubric;
  • trust that students want to learn and receive feedback;
  • resist instincts and unconscious bias about how work was produced.

The marking rubric is not to be used to manage suspected integrity breaches. Students must be given an opportunity to respond to any integrity allegations through a formal process.

 
