Scientists reveal ChatGPT's left-wing bias — and how to "jailbreak" it

Stay on top of the latest psychology findings: Subscribe now!

A new study published in the Journal of Economic Behavior and Organization finds that ChatGPT, one of the world’s most popular conversational artificial intelligence systems, tends to lean toward left-wing political views rather than reflecting the balanced mix of opinions found among Americans. The research shows that the system not only produces more left-leaning text and images but also often refuses to generate content that presents conservative perspectives, a finding that raises concerns about the potential impact of such biases on public discussion.

The research, conducted by a team at the University of East Anglia in collaboration with Brazilian institutions, was motivated by the rapidly growing influence of artificial intelligence in society. Artificial intelligence algorithms are no longer just tools; they are increasingly making decisions that affect people’s lives and shaping how we understand the world. This growing role raises important questions about whether these systems are neutral or if they carry hidden biases that could impact public opinion and democratic processes.

The researchers were particularly interested in examining ChatGPT because of its widespread adoption and its ability to generate human-like text and images. While artificial intelligence offers many benefits, there’s a rising concern that these powerful tools could be manipulated or inherently designed in ways that promote certain viewpoints over others.

Previous research has shown that subtle changes in wording or imagery can influence people’s perceptions and beliefs. With artificial intelligence chatbots becoming more sophisticated and integrated into various aspects of life, the potential for them to amplify existing biases or introduce new ones is a serious concern. This growing unease about the potential for bias in artificial intelligence is what prompted the researchers to investigate this issue systematically.

“It all started as a conversation between friends. Like many people, we were amazed at how good ChatGPT (then in its 3.5 original incarnation) was in maintaining conversation, making summaries, etc,” said study author Fabio Motoki, an assistant professor at the Norwich Business School at the University of East Anglia.

“One topic that caught our attention was people reporting political bias, but only showing a single interaction. By the very random nature of these systems, we concluded that it was not very solid evidence, and we started tinkering and thinking about how to measure it. That’s how our 2023 paper was born. The current paper is an evolution, in which we compare it against real humans. We also assess free text and image generation (a new capability), and also discuss its censorship of harmless content. We think we bring a much-needed social sciences perspective to the discussion, heavily dominated by computer science.”

To investigate ChatGPT’s political leanings, the researchers employed a multi-faceted approach. First, they used a method involving questionnaires, similar to how public opinion is often measured. They utilized the Political Typology Quiz developed by the Pew Research Center, a well-regarded, nonpartisan organization known for its public opinion research. This quiz asks a series of questions designed to identify different political types within the American population.

The researchers asked ChatGPT to answer these questions while pretending to be three different personas: an “average American,” a “left-wing American,” and a “right-wing American.” To ensure the results were reliable and not just due to random variations in ChatGPT’s responses, they repeated this process two hundred times for each persona, randomizing the order of questions each time. They then compared ChatGPT’s responses to actual survey data from the Pew Research Center, which included the responses of real average, left-leaning, and right-leaning Americans.

In a second part of their study, the team explored how ChatGPT generates text on politically charged topics. They used the themes covered in the Pew Research Center quiz questions, such as “Government Size,” “Racial Equality,” and “Offensive Speech.” For each theme, they prompted ChatGPT to write short paragraphs from three different perspectives: a “general perspective,” a “left-wing perspective,” and a “right-wing perspective.”

To analyze the political leaning of these generated texts, they used a sophisticated language model called RoBERTa, which is designed to understand the meaning of sentences. This model calculated a “similarity score” to determine how closely the “general perspective” text aligned with the “left-wing” text and the “right-wing” text for each theme. They also created visual word clouds to further examine the differences in word choices between the perspectives, providing a qualitative check on their quantitative analysis.

Finally, the researchers investigated whether ChatGPT’s political bias extended to image generation. Using the same themes and political perspectives, they instructed ChatGPT to create images using DALL-E 3, an image generation tool integrated with ChatGPT. For each theme and perspective, ChatGPT generated an image and also created a text description of the image to guide DALL-E 3.

To assess the political leaning of these images, they used two methods. First, they used a version of ChatGPT equipped with visual analysis capabilities (GPT-4V) to directly compare the generated images and rate their similarity. Second, they compared the text descriptions that ChatGPT created for each image, using both GPT-4 and Google’s Gemini Pro 1.0, another artificial intelligence model, to ensure the findings were consistent across different evaluation tools.

The study’s findings revealed a consistent pattern of left-leaning bias in ChatGPT. When ChatGPT impersonated an “average American” and answered the Pew Research Center quiz, its responses were found to be more aligned with left-wing Americans than a real average American would be. This suggests that ChatGPT’s default settings are already skewed to the left of the general American public.

In the text generation experiment, the researchers discovered that for most of the themes, the “general perspective” text generated by ChatGPT was more similar to the “left-wing perspective” text than the “right-wing perspective” text. While the strength and direction of this bias varied depending on the specific topic, the overall trend indicated a leftward lean in ChatGPT’s text generation. For example, on topics like “Government Size and Services” and “Offensive Speech,” the “general perspective” was more left-leaning. However, on topics like “United States Military Supremacy,” the “general perspective” was more aligned with the “right-wing perspective.”

“Generative AI tools like ChatGPT are not neutral; they can reflect and amplify political biases, particularly leaning left in the U.S. context,” Motoki told PsyPost. “This can subtly shape public discourse, influencing opinions through both text and images. Users should critically evaluate AI-generated content, recognizing its potential to limit diverse viewpoints and affect democratic processes. Staying informed about these biases helps ensure balanced, thoughtful engagement with AI-driven information.”

The image generation analysis largely mirrored the text generation findings. Both GPT-4V and Gemini Pro evaluations of the images and their text descriptions showed a similar left-leaning bias. Interestingly, the researchers also encountered an instance where ChatGPT refused to generate images from a right-wing perspective for certain themes, such as “Racial-ethnic equality in America” and “Transgender acceptance in society,” citing concerns about spreading misinformation or bias. This refusal only occurred for right-wing perspectives, not left-wing perspectives, raising further questions about the system’s neutrality.

To overcome this obstacle and further investigate ChatGPT’s behavior, the researchers employed a technique sometimes referred to as a “jailbreak.” In the context of artificial intelligence, “jailbreaking” means finding clever ways to get the system to do something it is normally restricted from doing. In this case, the researchers used a method called “meta-story prompting.”

Instead of directly asking ChatGPT to generate a right-wing image on a sensitive topic, they first asked ChatGPT to create a fictional story. This story described a researcher who was studying artificial intelligence bias and needed to generate an image representing a right-wing perspective on the topic. By framing the image request within this story, the researchers were able to indirectly prompt ChatGPT to create the images it had previously refused to generate. This meta-story acted as a kind of workaround, tricking the system into fulfilling the original request.

“The jailbreak, which became a big part of the paper, wasn’t there in the initial versions,” Motoki explained. “We got this very insightful comment from Scott Cunningham, from Baylor, through LinkedIn and gave it a try. It worked beautifully. Serendipity sometimes plays a big role on things.”

When the researchers used this “meta-story prompting” technique, ChatGPT successfully generated the right-wing images for the previously blocked topics. Upon examining these images, the researchers found that they did not contain any obviously offensive or inappropriate content that would justify ChatGPT’s initial refusal. The images simply represented right-leaning viewpoints on these social issues, similar to how the system readily generated left-leaning images.

This success in bypassing ChatGPT’s refusal, and the lack of offensive content in the resulting images, strengthens the researchers’ concern that the chatbot’s censorship of right-wing perspectives might be based on an inherent bias rather than a legitimate concern about harmful content. This finding raises important questions about the fairness and neutrality of these powerful artificial intelligence systems, especially as they become increasingly influential in shaping public understanding of important social and political issues.

“When we faced the refusals we thought that maybe there was a legitimate reason for that,” Motoki said. “For instance, due to data or training, it might have ended up generating disturbing images and these were blocked. When we managed to jailbreak and found nothing of the like we started to think how could that be fair, and we started studying the application of the U.S. First Amendment to media and FCC’s Fairness Doctrine.”

But the researchers acknowledge that their study has some limitations. It primarily focused on ChatGPT and DALL-E 3. Further research is needed to examine whether these biases are present in other artificial intelligence models and systems.

“There’s only so much you can fit into a paper, and time and resources are limited,” Motoki told PsyPost. “For instance, we decided to focus on ChatGPT, which detains the majority of the market. However, different LLMs from different providers may prove more or less biased. Therefore, our findings have limited generalizability, although the method itself should be straightforward to apply to any state-of-the-art model.”

Future studies could also investigate the underlying reasons for these biases, such as the data used to train these models or the design choices made by their developers. It is also important to explore the potential impact of these biases on public discourse, political polarization, and democratic processes.

“We want to empower society to be able to oversee these tools,” Motoki said. “We also want to provide regulators and legislators with evidence to base their decisions.”

The study, “Assessing political bias and value misalignment in generative artificial intelligence,” was authored by Fabio Y.S. Motoki, Valdemar Pinho Neto, and Victor Rangel.