Hidden power
At the heart of chatbots such as ChatGPT and image generation tools such as DALL-E lies the power of generative AI models that can create realistic content from text, images, audio and video inputs. These tools can be accessed through a browser or a smartphone app.
Since these AI models have no predefined use, they can be fine-tuned for a wide range of applications in a variety of domains ranging from finance to biology. The models, trained on vast quantities of data, can be adapted for different tasks with little to no coding and sometimes as easily as by describing a task in simple language.
Given that AI models such as GPT-3 and GPT-4 were developed by private organizations using proprietary data sets, the public doesn’t know the nature of the data used to train them. The opacity of training data and the complexity of the model architecture – GPT-3 was trained on over 175 billion variables or “parameters” – make it difficult for anyone to audit these models. Consequently, it’s difficult to prove that the way they are built or trained causes harm.
Hallucinations
In language model AIs, a hallucination is a confident response that is inaccurate and seemingly not justified by a model’s training data. Even some generative AI models that were designed to be less prone to hallucinations have amplified them.
There is a danger that generative AI models can produce incorrect or misleading information that can end up being damaging to users. A study investigating ChatGPT’s ability to generate factually correct scientific writing in the medical field found that ChatGPT ended up either generating citations to nonexistent papers or reporting nonexistent results. My collaborators and I found similar patterns in our investigations.
Such hallucinations can cause real damage when the models are used without adequate supervision. For example, ChatGPT falsely claimed that a professor it named had been accused of sexual harassment. And a radio host has filed a defamation lawsuit against OpenAI regarding ChatGPT falsely claiming that there was a legal complaint against him for embezzlement.
Bias and discrimination
Without adequate safeguards or protections, generative AI models trained on vast quantities of data collected from the internet can end up replicating existing societal biases. For example, organizations that use generative AI models to design recruiting campaigns could end up unintentionally discriminating against some groups of people.
When a journalist asked DALL-E 2 to generate images of “a technology journalist writing an article about a new AI system that can create remarkable and strange images,” it generated only pictures of men. An AI portrait app exhibited several sociocultural biases, for example by lightening the skin color of an actress.