As powerful as generative AI tools like ChatGPT are, they are not infallible. AI “hallucinations” can occur when a type of AI known as a large language model generates false information. These hallucinations can range from simple errors to dramatic inventions. They can pose real-world problems, including in legal contexts where the stakes are high and accuracy is critical. In this article, we will look at a few situations in which AI tools “hallucinated” legal authorities that did not actually exist, and how the Courts responded.
One of the first cases to deal with AI hallucinations was the US case of Mata v Avianca 22-cv-1461 (PKC). The claimant filed a lawsuit against the defendant airline, alleging that he was injured when a metal serving cart struck his left knee during a flight. During the course of the lawsuit, the claimant’s attorneys filed legal submissions that cited and quoted from purported legal authorities. In reality, the attorneys had used ChatGPT, which had hallucinated—ie completely made up—the authorities. They did not actually exist. Unsurprisingly, neither the opposing attorneys nor the Court were able to find them, and so the Court ordered the claimant’s attorneys to provide copies. Rather than come clean at this stage, the claimant’s attorneys doubled down. After all, they had asked ChatGPT to verify that the authorities were real, and it had said yes! They provided copies of what they claimed were extracts from the authorities, which had themselves been generated by AI.
District judge P Kevin Castel of the Southern District of New York discussed the use of generative AI in the legal sphere. He noted that technological advances are commonplace and there was nothing inherently improper about using a reliable artificial intelligence tool for assistance. However, attorneys had a responsibility to ensure the accuracy of their filings. The claimant’s attorneys had abandoned their responsibilities when they submitted non-existent judicial opinions with fake quotes and citations created by ChatGPT, then continued to stand by the fake opinions after the opposing attorneys and the Court called their existence into question.
The judge highlighted the risks of citing non-existent authorities. The opposing party wastes time and money attempting to confirm their existence. The Court’s time is taken from other important endeavours. The client may be deprived of arguments based on authentic judicial precedents. There is potential harm to the reputation of judges and courts whose names are falsely invoked as authors of the bogus authorities, and to the reputation of persons attributed with fictional conduct. It promotes cynicism about the legal profession and the judicial system. Future litigants may be tempted to defy a judicial ruling by disingenuously claiming doubt about its authenticity.
The Court found that the claimant’s attorneys had acted in bad faith and ordered them to pay a penalty of US$5,000.
In the Canadian case of Zhang v Chen 2024 BCSC 285, another attorney filed court documents citing two non-existent cases that were hallucinated by ChatGPT. Unlike the attorneys in Avianca, this attorney disclosed and explained her error, apologised to opposing counsel and the Court, and indicated that she had no intention of relying on the fictitious authorities generated by ChatGPT.
The Court noted that citing fake cases in court filings was tantamount to making a false statement to the Court, and amounted to an abuse of process which, left unchecked, could lead to a miscarriage of justice. However, the attorney’s subsequent conduct showed no intention to deceive. She was ordered to pay certain legal costs arising out of the citation of the hallucinated cases but was spared other sanctions that had been sought by opposing counsel.
Attorneys are not the only ones guilty of misusing AI in Court.
The UK case of Harber v Commissioners for HMRC 2023 UKFTT 1007 involved a self-represented litigant—ie a litigant acting for themselves without the assistance of an attorney—who also cited fabricated authorities generated by AI. The Court accepted that the litigant had been unaware that the AI-generated authorities were not genuine and (as an unrepresented litigant) had not known how to verify their existence using other legal tools and resources. However, it rejected her attempt to downplay the significance of her error, echoing the sentiments expressed by the Court in the Avianca case about the dangers of submitting non-existent authorities. She ultimately lost her case.
In the US case of Weber 2024 NY Slip Op 24258, an expert witness relied on Copilot, a large language model chatbot, to cross-check certain calculations. The expert was unable to recall what input or prompt he had used, or to explain how Copilot arrived at its output. The Court’s own tests of Copilot produced varying results, raising concerns about reliability, and the Court observed that generative AI tools had “inherent reliability issues”. For these and other reasons, the expert’s evidence was held to be unreliable and speculative.
Closer to home, the Caribbean Court of Justice recently issued a practice direction on the use of generative AI tools in proceedings before it. This practice direction recognises the potential value of generative AI tools, and allows for their use in drafting submissions, summarising legal arguments and conducting basic research (though not in preparing witness statements). However, it also stresses that information generated by such tools must be thoroughly fact-checked and reviewed for accuracy by those seeking to rely on it.
AI tools can be of genuine benefit in many spheres, including the legal sphere. However, it is important to recognise and remember that they are just that—tools. They are not substitutes or replacements for careful human oversight and responsibility.
Catherine Ramnarine is a Partner at M Hamel-Smith & Co. She can be reached at mhs@trinidadlaw.com.