A Cautionary Tale: AI Use Without Human Oversight

When an organisation gets caught out for submitting flawed AI-generated work, the same narrative follows: “AI made a mistake.” It’s a convenient excuse, but a misleading one. Artificial intelligence does not make mistakes. It generates outputs based on prompts and data. It has no context, no judgment, and no understanding of consequences. The real problem lies with the people who use it without providing the correct context and, worse, without proper review. The recent Deloitte incident is a perfect example. A global consultancy, trusted by governments and major corporations, found itself refunding part of a 440,000 Australian dollar contract after errors were found in a report prepared for the Australian government. The issue was not that AI was used, but that human accountability was missing. Until organisations accept that technology is only as reliable as the people managing it, similar incidents will keep happening.

The Deloitte Example

Deloitte has agreed to repay part of a 440,000 Australian dollar fee after errors were discovered in an Australian government report that it helped produce with generative AI. The Department of Employment and Workplace Relations commissioned the firm to review the Targeted Compliance Framework and related IT systems. After publication, academics and journalists identified fabricated or incorrect references, along with a misdescribed court decision. Deloitte amended the document and disclosed the use of Azure OpenAI GPT-4o tools, while maintaining that the report’s core findings were unchanged. The department confirmed that the recommendations remain intact, yet the credibility damage was already done. A senator criticised the firm for allowing AI to do the heavy lifting with inadequate human checks. The result was a partial refund and a public lesson in accountability. 

Other outlets reported similar details. Coverage notes that the report included invented or misattributed citations and a fabricated legal quote later removed in revisions. The department accepted a partial refund and said the substance of the review stands, but questions about quality assurance and disclosure persist.

What “Hallucination” Really Means

The Deloitte report is a textbook example of AI hallucination in a high-stakes setting. Hallucination is the term for when AI models produce information that is fabricated or incorrect, often with total confidence. Generative models predict likely words based on patterns in data. They do not possess context, institutional memory, or legal expertise. When prompted for citations, they can produce references that are plausible but false. In the Deloitte case, watchdogs found references that did not exist and a legal citation that did not match the real judgment. After corrections, observers noted that replacing one weak reference with a cluster of alternatives did not fix the underlying problem: the initial claims were not grounded in verified sources.
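
To see the mechanism in miniature, consider the toy sketch below: a deliberately simplified bigram model (a stand-in for a large language model, not a description of any vendor’s actual system) learns which word tends to follow which from a handful of invented citation-like strings, then samples a new one. The case names and titles in the corpus are made up for illustration.

```python
import random
from collections import defaultdict

# Toy training corpus of citation-like strings (all invented).
# A real model is trained on billions of documents, but the
# sampling principle is the same: predict the next likely word.
corpus = [
    "Smith v Jones 2019 Federal Court of Australia 144",
    "Brown v State 2021 Federal Court of Australia 87",
    "Smith and Lee 2020 Review of Compliance Frameworks",
    "Lee and Brown 2018 Review of Welfare Policy",
]

# Learn bigram transitions: which word tends to follow which.
transitions = defaultdict(list)
starts = []
for line in corpus:
    words = line.split()
    starts.append(words[0])
    for a, b in zip(words, words[1:]):
        transitions[a].append(b)

def sample_citation(max_words=10, seed=None):
    """Sample a citation-shaped string word by word from learned transitions."""
    rng = random.Random(seed)
    word = rng.choice(starts)
    out = [word]
    for _ in range(max_words - 1):
        followers = transitions.get(word)
        if not followers:
            break
        word = rng.choice(followers)
        out.append(word)
    return " ".join(out)

# Every adjacent word pair in the output was seen in training, so the
# result reads fluently, yet the string as a whole may name a case or
# paper that does not exist.
print(sample_citation(seed=3))
```

Real models operate over far richer statistics, but the same property holds: fluency is evidence of plausibility, not of truth.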

This is not unique to consulting. Courts have sanctioned lawyers for filings that relied on AI text containing invented cases. In a recent Utah matter, the appeals court punished counsel after a brief cited a precedent that could not be found in any database. Local reporting and follow-up coverage make the same point. AI can assist research, but professionals must verify.

A Failure of Human Accountability, Not of AI

The fault does not lie with AI. It lies with the people who choose to ship AI-assisted work without the checks that any professional deliverable requires. Tools generate drafts. People are responsible for the truth. If a report contains a fabricated reference, that is a process failure. If a legal brief cites a non-existent case, that is a breach of professional duty. When organisations treat AI outputs as finished work instead of raw material, they convert a productivity aid into a reputational risk. AI can accelerate research and drafting, but it cannot (and should not) replace human expertise in validating content. When Deloitte’s team delivered a report riddled with fake references and errors, that was a breakdown in their quality control and professional duty. No AI policy or guideline can absolve professionals of accountability for what they present to clients. Just as the court in Utah made clear that lawyers must verify their filings despite using AI, consultants and analysts must ensure their AI-augmented reports are accurate and credible. Blaming the AI alone is irresponsible; the onus is on the people who use the tool to use it correctly.

Why This Will Keep Happening

The market rewards speed. AI accelerates drafting, so teams move faster. Without AI training and clear review steps built into the process, errors slip through, and clients pay for them. Public scrutiny then focuses on the tool rather than the system that enables misuse. Unless leaders establish clear expectations for verification, disclosure, and ownership of outcomes, similar incidents will continue to surface across consulting, legal, financial, policy, and technical work. The pattern is already visible. When errors like these come to light, they undermine trust, not just in AI tools, but in the organisations deploying them. As Senator Deborah O’Neill of New South Wales (Australian Labor Party) quipped, why shouldn’t a client “sign up for a ChatGPT subscription” instead of paying a hefty fee to a firm that handed in AI-written material? This sharp critique highlights a serious reputational risk: if consulting firms (or any experts) misuse AI, they may be seen as overcharging for work product that is no more reliable than something a layperson could prompt from a chatbot.

Moreover, incidents like this can cause backlash against the broader use of AI in business. Government clients, for example, might become wary of allowing consultants to use AI at all. That would be a shame, because when used responsibly, AI can be a powerful asset – saving time on data analysis, generating useful first drafts, and uncovering insights. The key differentiator is responsible use. The fallout from the Deloitte report is a reminder that we will keep seeing AI-related blunders until organisations institute proper oversight and take accountability. In an era of rapid AI adoption, those who integrate AI wisely – with transparency and robust quality controls – will earn trust, while those who don’t will face harsh scrutiny.

What Accountable Use of AI Looks Like

Leaders who want the benefits without the blow-ups should make five commitments. 

  1. Human in the loop at every critical step 
    No AI-assisted analysis or narrative should reach a client or the public without a competent reviewer signing off. That sign-off should include explicit checks on facts, figures, citations, and legal or regulatory interpretations.
  2. Transparent disclosure of AI assistance
    If AI tools contribute to material content, say so upfront. Hidden appendices added after publication erode trust. Clear disclosure aligns expectations and concentrates minds on quality control. 
  3. Source-first verification 
    Require authors to trace every external claim to a primary or authoritative source. No claim should rely on a model’s memory. If a source cannot be produced, the claim does not ship. This rule would have prevented fabricated references and misstated case law (a minimal sketch of such a check appears after this list).
  4. Skills and training for practitioners 
    Equip teams to understand model behaviour, including hallucination, context gaps, and the difference between retrieval and generation. Professionals should treat AI outputs as a starting point, not an answer. 
  5. Governance with teeth 
    Adopt internal policies that define permitted uses, mandatory reviews, and escalation paths for AI-assisted work. Assign a named owner for compliance on every project. When errors occur, correct the record quickly and accept responsibility. The Deloitte episode shows why governance must be operational, not theoretical. 
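
To make the source-first rule concrete, here is a minimal sketch of a pre-delivery citation gate. It assumes a hypothetical convention in which citations appear in the draft as [@key] markers and a verified_sources.json registry maintained by human reviewers; both are illustrative, not a description of any firm’s actual tooling.

```python
import json
import re
import sys
from pathlib import Path

# Hypothetical registry of sources a human reviewer has already traced
# to a primary or authoritative document, e.g.:
# { "oneill-2025": {"title": "...", "url": "...", "verified_by": "J. Smith"} }
REGISTRY_PATH = Path("verified_sources.json")

# Citations in the draft are assumed to look like [@oneill-2025].
CITATION_PATTERN = re.compile(r"\[@([A-Za-z0-9_-]+)\]")


def unverified_citations(draft_text: str, registry: dict) -> list[str]:
    """Return citation keys used in the draft that have no verified source."""
    cited = set(CITATION_PATTERN.findall(draft_text))
    return sorted(key for key in cited if key not in registry)


def main(draft_path: str) -> int:
    registry = json.loads(REGISTRY_PATH.read_text())
    draft = Path(draft_path).read_text()

    missing = unverified_citations(draft, registry)
    if missing:
        # "If a source cannot be produced, the claim does not ship."
        print("Blocking delivery; unverified citations:", ", ".join(missing))
        return 1

    print("All citations trace to a human-verified source.")
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.argv[1]))
```

A check like this could run in CI or as a final gate before delivery, with the registry writable only by the reviewer who signs off on each source.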

By implementing measures like these, organisations can harness the efficiency of AI without sacrificing integrity. The goal is to pair the strengths of AI (speed, pattern recognition, generative creativity) with the irreplaceable strengths of humans (judgment, context, ethical reasoning). When human intelligence guides and vets artificial intelligence, the result is far more reliable. 

For Clients and Procurers

The Deloitte incident is a wake-up call. It demonstrates that the weakest link in AI deployment is not the technology – it’s the human factor. AI will undoubtedly continue to advance and become even more integrated in our workflows. But no matter how powerful or “smart” these tools become, they lack true understanding and responsibility. Those qualities remain uniquely human. These are our recommendations for clients and procurers:   

  1. Ask who is doing the work, how AI is being used, and what verification exists between draft and delivery.  
  2. Request a quality checklist as part of the statement of work.  
  3. Require cross-checks for legal and regulatory claims.  
  4. Insist on access to the bibliography or evidence pack used to support conclusions.

If a provider cannot show a robust process, you are paying for risk. Organisations that are leading the AI transformation in the market understand this: they succeed by using AI to augment human expertise, not to replace it.

In the end, AI is a powerful enabler when used within a disciplined system. It accelerates discovery, helps surface patterns, and speeds up drafting. It does not remove the need for domain expertise, judgement, or professional responsibility. We advocate for AI-assisted workflows that are transparent, verifiable, and accountable. We will continue to use leading models, but we will never outsource our duty of care to a tool. 
 
Do not blame AI for flawed reports. Blame the lack of human oversight that allowed unverified text to become a deliverable. Until organisations take accountability seriously, incidents like Deloitte’s will recur. Those who combine AI with rigorous human review will earn trust and set the standard. Those who do not will keep issuing refunds and apologies while clients look elsewhere.

References:  

Dhanji, K. “Deloitte to pay money back to Albanese government after using AI in $440,000 report.” The Guardian, 6 October 2025.

“Deloitte to partially refund Australian government for report with apparent AI-generated errors.” AP News, October 2025.

“Deloitte issues refund for error-ridden Australian government report that used AI.” Financial Times, 2025.

“Deloitte is giving the Australian government a partial refund after it used AI to deliver a report with errors.” Business Insider, 2025.

“Lawyer punished for filing brief with ‘fake precedent’ created by AI.” The Salt Lake Tribune, May 2025.

“US lawyer sanctioned after using ChatGPT for court brief.” The Guardian, 31 May 2025.

“Deloitte to refund government, admits using AI in $440k report.” Australian Financial Review, October 2025.

“Deloitte will refund Australian government for AI hallucination-filled report.” Ars Technica, 2025.
