SkyNet is still a ways off...
Large Language Models are Amazing
We've all seen how these things, ChatGPT and all the emerging competition, can do remarkable things. If you go to the ChatGPT Web UI, you can ask it about anything and get an almost immediate response. The language is pretty natural. The recall and detail are amazing. These "large language models" (LLMs) possess an incredible ability to generate coherent and contextually relevant text, making them seem almost human. However, as with any advanced technology, LLMs also have their limitations and challenges.
What are they really?
Think of LLMs as sort of mirrors. They train on human-generated content to produce what looks, for all intents and purposes, like human-generated content. But it's not. It's a reflection of what the model has come to understand human-generated content to "look" like.
Now, the fact that the computer has trained on more data than a human could be expected to read in a lifetime means that it "knows" a lot. So, it's awe-inspiring to be able to ask it about just about anything and get a coherent response back.
They can live in a dreamworld.
But these models are just models. They don't "know" anything, really. They take a prompt and generate a response that seems to make sense based on the patterns they learned from their training data. The problem is that the AI has no real understanding of the implications or accuracy of what it's saying. This leads to:
- Inappropriate responses. The AI may divulge information that it shouldn't. It may express biases based on its training data. It may simply put words together that are either offensive or dangerous out of sheer lack of understanding.
- Inaccurate responses. It may just make something up that "fits" in its model. We see this in programming AIs a lot. They produce code that looks like it'd work. Variable names are correct. The general idea is contextually appropriate. But the code is wrong. It doesn't work, or it doesn't do what we want.
- "Hallucination". Along those lines, the AI can simply make stuff up because it "fits". It may generate "facts" to fit into a body of text because, according to the model, "something like that" would go there. The AI doesn't particularly care if the information is true or not. It has no way of independently validating every piece of information it disgorges.
They're (currently) slow
Though ChatGPT's web interface is typically pretty responsive, the API is not. A recent project I developed required a full background job management solution to deal with the long response times; more than a minute is not unusual. A chatbot I built first provides the "tl;dr" answer from the database and then, later, provides the more "human-like" response of the AI, primarily for entertainment value.
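To make that pattern concrete, here is a minimal Python sketch of the "quick answer now, AI answer later" idea. It is not the code from either project: the in-memory dictionary stands in for a real database, `slow_llm_reply` is a hypothetical placeholder that only simulates a slow API call, and a thread pool stands in for a full background job system.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor

# Hypothetical stand-in for a local database of short answers.
QUICK_ANSWERS = {"store hours": "Open 9am to 5pm, Monday through Friday."}


def quick_answer(question: str) -> str:
    """Return the short answer immediately from local data."""
    return QUICK_ANSWERS.get(question.lower(), "No short answer on file.")


def slow_llm_reply(question: str, delay: float = 0.1) -> str:
    """Placeholder for a slow LLM API call; the sleep simulates latency."""
    time.sleep(delay)
    return f"(LLM-style elaboration on: {question})"


def answer(question: str, executor: ThreadPoolExecutor) -> tuple[str, Future]:
    """Return the quick answer now and a Future for the slower reply."""
    future = executor.submit(slow_llm_reply, question)
    return quick_answer(question), future


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as pool:
        short, pending = answer("Store hours", pool)
        print(short)             # available right away
        print(pending.result())  # arrives once the slow call finishes
```

In a real application the slow call would more likely go through a persistent job queue, so that a minute-long request survives restarts and can be retried.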
They require careful application and meticulous prompt engineering
For these reasons, the application of this technology must be carefully considered. LLMs are definitely not a panacea. They are something akin to a word calculator: a tool that will generate an answer incredibly fast, but one that will only be as good as the person using it.
This has led to a new discipline called "prompt engineering", where one must gain an understanding of the AI's tendencies and shape the question asked (called the "prompt") to increase the likelihood of obtaining a useful, accurate response.
Often, with AI-integrated applications, the prompt is constructed from additional information-- such as a series of questions answered by a user or some pre-processing locally. The prompt engineer must determine the best way to aggregate that information into a "question" that the AI is likely to answer with a minimum of hallucination and a maximum of valuable new information.
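As a rough illustration of that aggregation step, here is a small Python sketch that folds a user's answers into a single prompt. The function name, the field names, and the instruction wording are all assumptions made for the example; whether this phrasing actually reduces hallucination for a given model is something to test, not assume.

```python
def build_prompt(answers: dict[str, str], question: str) -> str:
    """Combine collected user answers and a question into one prompt."""
    # Skip fields the user left blank so the model is not handed empty context.
    context_lines = [
        f"- {field}: {value.strip()}"
        for field, value in answers.items()
        if value.strip()
    ]
    context = "\n".join(context_lines) or "- (no details provided)"
    return (
        "Use only the details below to answer. "
        "If they are not sufficient, say so instead of guessing.\n\n"
        f"Details:\n{context}\n\n"
        f"Question: {question.strip()}"
    )


if __name__ == "__main__":
    # Hypothetical form answers gathered from a user.
    form = {"role": "developer", "language": "Python", "deadline": ""}
    print(build_prompt(form, "What should I learn next?"))
```

Keeping the prompt assembly in one small function makes it easy to try different wordings and compare the responses they produce.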
Huge potential, but probably more hype than the end of work (or the world) as we know it
Large language models hold tremendous promise, serving as versatile tools that can augment our linguistic and creative capabilities. They enable us to explore new ideas and new content, and they provide a programmatic way to delve into massive amounts of information.
I find them incredibly useful for avoiding "creativity fatigue" when generating a lot of documentation and in saving me time by remembering and typing in code syntax while programming.
However, today, it is crucial to acknowledge the shortcomings of LLMs-- such as hallucination, the need for careful prompt engineering, and the surrounding complications of building applications with them-- along with recognizing their potential. I am reminded of the sense of impending human obsolescence that the development of the personal computer created. Like PCs, AI is likely to emerge as an integral tool in the ongoing evolution of work.