It is well documented, in New Zealand and around the world, that Artificial Intelligence, and in particular Large Language Models (LLMs), has been used for student plagiarism in schools, training institutes and universities. Less well documented are the false detections that result in innocent students being accused of cheating and plagiarism.
The common bias of assuming that an AI output is superior to a human one also appears among teachers and examiners who trust the AI system's output over the student's word. This bias is not limited to the education sector; it is widespread across many areas of society that use AI, such as Facial Recognition Technologies (FRT) deployed by law enforcement agencies.
A recent study found that AI-written work can in fact evade common plagiarism checks. In a rigorous blind study, 100% AI-written submissions were injected into the examinations system in five undergraduate modules, across all years of study, for a BSc degree in Psychology at a reputable UK university. The study found that 94% of the AI submissions went undetected. The grades awarded to the AI submissions were on average half a grade higher than those achieved by real students, and across modules there was an 83.4% chance that the AI submissions on a module would outperform a random selection of the same number of real student submissions.
Other research has also shown what many of us have noticed: AI detection tools can exhibit bias against non-native English writers, classifying their work as AI-generated more frequently than that of native English writers. This is a significant issue in New Zealand that must be mitigated.
Given the historical failure of the education system for Māori, the increasing numbers of Māori students who learn English as a second language, and the high numbers of immigrant students and families who have English as a second language or a different cultural background, many learners in New Zealand are likely to be affected.
With regard to mātauranga Māori (Māori knowledge), I have been experimenting with LLMs and Māori knowledge content as a student, teacher, examiner and tester for the past several years, and have previously shared some examples.
My experience is that no matter which LLM I use for textual or graphical outputs, none handle current or historical mātauranga Māori or Māori current affairs well, and Māori names for people, groups and places confuse them. This is likely due to a lack of Māori knowledge in the training data across the range of LLMs.
A recent example arose during professional contract work with a company. After some time discussing how to summarise Professor Hirini Mead's Tikanga test, it was suggested that Claude could be useful, as it had a long and accurate history of summarising complex business documents into succinct notes that were then checked by a human expert on the topic.
In this case, Claude hallucinated details about Professor Mead, stating that he was a Pākehā lawyer, born in the 18th century, and the son of a New Zealand judge. Claude also stated that the Tikanga test was a legal framework adapted by referencing New Zealand legislation. Here the words law and lore appear to have been confused by the LLM, which then cross-checked names against online data and produced the hallucinated, incorrect outputs.
Even when Māori knowledge holders create and modify a commercial LLM, there can be issues. This was highlighted with a recently announced Māori knowledge and Waitangi Tribunal LLM, whose results were shockingly incorrect. The GPT confused Sir Tipene O'Regan with Professor Rawinia Higgins, both of whom are profiled on a page in a PDF that the LLM acknowledged as a source. Anyone with even a minimal knowledge of Māori society will know that these two leaders have different tribal affiliations, a considerable age difference, and two quite different careers.
However, the Māori knowledge and Waitangi Tribunal LLM does show the potential of using publicly sourced and available New Zealand government digital data; it also shows a lack of AI and data governance across the AI life cycle.
With proper governance models in place, these LLMs could benefit many Māori and Māori organisations, as we don't really know what Māori data the government holds. This in turn highlights the need to repatriate New Zealand government data, which is largely hosted offshore, back to New Zealand. The risk of not repatriating is that our country's data could be consumed as training data in LLMs that are both hosted overseas and owned by foreign companies, making any later attempt at repatriation very difficult.
As an academic and expert in a number of areas of Māori knowledge, I am often contacted by examiners, teachers and upset students who have been falsely accused of using AI to write assignments about Māori knowledge. My advice to teachers and examiners who suspect a student of using an LLM to produce Māori knowledge writing, in work that requires a mixture of Māori language and Māori knowledge such as concepts and practices, is to have an independent human subject matter expert check the writing and to conduct a real-time, in-person interview with the student.