MundaneBlog

November 24, 2024

ChatGPT in the classroom – one thought experiment

Filed under: AI, Technology — DrMundane @ 2:08 pm

I was reading Mystery AI Hype Theatre 3000 this morning, on ChatGPT having no place in the classroom, and tried out a little thought experiment. It goes something like this:

Let us assume we are in, say, an AP Literature class. I imagine the hardest part of grading such a class is reading and scoring several essays for each student over the year. I know for sure my terrible longhand cursive was a painful thing for my poor teacher to read. But what about an AI? Can the AI (unable to reach any factual conclusion or to reason, only to generate text statistically) be used to speed this up?

I think the major problem is the inability to actually understand the students' writing, but let us take for granted that improvements to large language models will somehow overcome this limitation with sufficient data and computation. That seems to be the claim of Altman and the others, but to be clear, they will not.

We know that these LLMs currently scrape data from the public internet, so we can safely assume they have a robust source of information on AP Literature questions, books, themes, etc., from forums and other sources seeking to help students. Much of that writing will have been produced by students themselves, so as a bonus it is writing representative of the population. So far that seems reasonable.

But there is already one problem. The writing is not representative of the whole population of AP Literature students, only of those with access to these online resources and/or the willingness to engage with them. I was a student who would never workshop my ideas or practice my writing online. I was simply too private for that sort of thing. The dataset certainly would never include my writing.

I would therefore argue that any such LLM grading students' work would be inherently biased against students whose writing does not line up with the LLM's source material. It knows which essays deserve a 5 and which deserve a 1 based on its dataset. Based on statistics. It does not understand the rubric or the intent, and is therefore unable to rate new (to it) but correct writing in a fair manner. You will therefore teach students to write like the corpus of text ingested by the model, not to find their own voice and style. You teach them to look at the prompt and come up with the 'correct' answer, not an answer that necessarily comes from their own experience and understanding of the literature in question.
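To make the statistical objection concrete, here is a deliberately crude sketch (entirely hypothetical, not any real grading system): a "grader" that rates an essay purely by how much its vocabulary overlaps with the vocabulary of essays its corpus already scored as a 5. A real LLM is vastly more sophisticated, but the failure mode is the same in kind: writing that departs from the corpus scores low regardless of its quality.

```python
# Hypothetical sketch: a similarity-based "grader" that rewards writing
# resembling its training corpus and penalizes everything else.

def overlap_score(essay, reference_vocab):
    """Fraction of the essay's distinct words found in the reference vocabulary."""
    words = set(essay.lower().split())
    return len(words & reference_vocab) / len(words)

# Vocabulary statistically associated with '5' essays in the scraped corpus
# (invented for illustration).
five_vocab = {"theme", "symbolism", "narrator", "conflict", "imagery",
              "foreshadowing", "motif", "tone"}

conventional = "the narrator uses imagery and symbolism to develop the theme"
novel_lens = "a queer reading reframes the protagonist's silence as resistance"

print(overlap_score(conventional, five_vocab))  # high overlap, rated well
print(overlap_score(novel_lens, five_vocab))    # 0.0: brilliant or not, rated poorly
```

The second essay may be the stronger analysis, but because none of its words match the corpus statistics, the "grader" cannot tell.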

I think this alone makes the use of AI impossible in the classroom, owing to its discriminatory potential.

Another example springs to mind: what about queer analysis? What about LGBTQ+ students? Will their viewpoints and experiences be reflected in the corpus? I highly doubt it. This means any such student may write a brilliant analysis of a book through a queer lens and it will matter not, because statistically it doesn't match what a '5' is supposed to look like. It uses all sorts of words that probably aren't associated with '5' essays. The LLM may even deem them 'profane'. It is therefore not a '5', in the view of the LLM.

I think these two thought experiments illustrate why I believe, beyond all the technical problems and overselling, that even a GPT that lives up to the hype and can be made factually correct will never be suitable for any evaluative work. Used to such an end, the AI encourages normative expression and discourages breaking boundaries. It truly discourages real feeling and art.
