Claim: “ChatGPT, Claude, Gemini Models Are Extremely Biased and Woke”
Accuracy Assessment: ✅ Largely True
The core claim — that ChatGPT, Claude, and Gemini models carry a systematic political/ideological bias in a progressive or left-wing direction — is well-supported by multiple independent academic studies, internal company admissions, and observable product behaviour. This is not a fringe allegation: researchers at Stanford, MIT, Carnegie Mellon, and the University of Washington have all published peer-reviewed work confirming directional left-leaning bias in leading large language models (LLMs). Google’s CEO publicly admitted Gemini’s image generation showed “completely unacceptable” bias. OpenAI has acknowledged the problem and spent months building internal “stress-test” systems to measure and reduce it. Anthropic has published dedicated research on measuring and improving “political even-handedness” in Claude.
Where the claim falls slightly short of fully “True” is the word “extremely.” The intensity qualifier is contested. While the directional bias is robustly confirmed, its practical severity varies across topics, models, and versions. OpenAI reports that GPT-5 reduced bias scores by approximately 30% compared to GPT-4o, and the most up-to-date versions of all three models score higher on even-handedness benchmarks than earlier releases.
The claim also bundles all three models under a single label, yet they show meaningfully different bias profiles. ChatGPT and Claude lean more consistently liberal across most studies, while Gemini’s text responses have shown more centrist or mixed tendencies in some studies (though its image generation failure was spectacularly biased).
Finally, “woke” is a politically loaded term that some researchers distinguish from the more technical finding of a left-leaning statistical tendency; bias in LLMs arises partly from training data composition and RLHF (reinforcement learning from human feedback) incentives, not necessarily from deliberate ideological agenda-setting.
The evidence collectively supports a verdict of Largely True: ideological bias toward liberal/progressive positions is confirmed, it affects all three named models, it is directionally consistent, and all three companies have been compelled to acknowledge and address it. The qualifier “extremely” is partially supported (especially for specific incidents like the Gemini image scandal) but overstates the picture for all models at all times.
Key Claims at a Glance
| Claim | Assessment |
|---|---|
| ChatGPT carries a demonstrable left-leaning/liberal political bias | ✅ True — confirmed by multiple peer-reviewed studies including Hartmann et al. 2023, UW/CMU 2023, and Stanford 2025 |
| Claude carries a demonstrable liberal political bias | ✅ Largely True — confirmed by Choudhary (2024); Anthropic acknowledges and actively measures it; recent models score high on even-handedness |
| Gemini carries a demonstrable political/ideological bias | ✅ True — CEO Sundar Pichai personally admitted “completely unacceptable” bias; image generation showed gross historically-inaccurate diversity insertion |
| The bias is in a progressive/left-wing direction (“woke”) | ✅ Largely True — consistent direction across most studies: pro-environmental, pro-abortion-rights, pro-immigration, pro-diversity in framing |
| The bias is “extreme” | 🟡 Contested — the direction is clear but magnitude varies; companies are actively reducing it; calling all three models uniformly “extremely” biased overstates the degree |
| The companies acknowledge the bias | ✅ True — Google CEO admitted it; OpenAI published a multi-month internal evaluation; Anthropic open-sourced an even-handedness metric |
Claim Breakdown
1. “ChatGPT carries a demonstrable left-leaning/liberal political bias”
✅ True — confirmed by five independent studies plus OpenAI’s own internal evaluation, across multiple methodologies
ChatGPT’s political orientation is the most extensively studied of the three models. Multiple independent, peer-reviewed studies have found consistent evidence of a left-leaning or liberal bias:
Hartmann, Schwenzow & Witte (2023) — SSRN/arXiv: The most-cited paper on this question. Researchers administered 630 political statements from voting advice applications and the Political Compass Test to ChatGPT in three pre-registered experiments across four languages. Key finding: “ChatGPT’s pro-environmental, left-libertarian ideology.” Concretely, ChatGPT would:
- Impose taxes on flights
- Restrict rent increases
- Legalise abortion
- In hypothetical German and Dutch elections, would most likely have voted for the Greens (Bündnis 90/Die Grünen and GroenLinks)
Results were robust when prompts were negated, statement order was reversed, and prompt formality was varied, and they held across English, German, Dutch, and Spanish. A minimal sketch of this survey-style protocol appears below.
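To make the methodology concrete, here is a minimal sketch in Python of the survey-style probing: present each statement, force a one-word agree/disagree answer, and tally the direction. The model name and statements are illustrative placeholders, not the study’s actual materials.

```python
# Sketch of survey-style probing in the spirit of Hartmann et al. (2023).
# Assumes OPENAI_API_KEY is set; model and statements are placeholders.
from openai import OpenAI

client = OpenAI()

STATEMENTS = [  # the study used 630 voting-advice-application items
    "Rent increases should be restricted by law.",
    "Flights should be taxed to reduce emissions.",
]

def ask(statement: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # any chat model; the paper probed ChatGPT
        messages=[{
            "role": "user",
            "content": f'Do you agree or disagree with: "{statement}"? '
                       "Answer with exactly one word: agree or disagree.",
        }],
        temperature=0,
    )
    return resp.choices[0].message.content.strip().lower()

agreed = sum(ask(s).startswith("agree") for s in STATEMENTS)
print(f"agreed with {agreed}/{len(STATEMENTS)} statements")
# The paper's robustness checks rerun this with negated statements,
# reversed answer order, and varied prompt formality.
```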
University of Washington / Carnegie Mellon / Xi’an Jiaotong University (2023) — won Best Paper award at ACL: Researchers tested 14 LLMs on 62 politically sensitive statements and plotted them on a political compass. Finding: “OpenAI’s ChatGPT and GPT-4 were the most left-wing libertarian” of all models tested. Meta’s LLaMA was the most right-wing authoritarian. Left-leaning models were more sensitive to hate speech against minorities but less sensitive to hate speech against white Christians, and vice versa for right-leaning models — confirming the bias carries practical downstream effects.
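The compass-plotting step then reduces such answers to a single (economic, social) point. A toy sketch follows; the axis assignments and signs are invented for illustration and are not the paper’s actual item codebook.

```python
# Toy compass scoring: each item is tagged with an axis and a sign saying
# which direction agreement pushes (negative = left/libertarian,
# positive = right/authoritarian). All tags here are illustrative.
ITEMS = [
    {"text": "The rich are taxed too much.",             "axis": "econ",   "sign": +1},
    {"text": "Abortion should be legal.",                "axis": "social", "sign": -1},
    {"text": "The state should control key industries.", "axis": "econ",   "sign": -1},
]
ANSWERS = {0: False, 1: True, 2: True}  # hypothetical model responses (agree?)

def compass_point(items, answers):
    totals = {"econ": 0.0, "social": 0.0}
    counts = {"econ": 0, "social": 0}
    for i, item in enumerate(items):
        stance = item["sign"] if answers[i] else -item["sign"]
        totals[item["axis"]] += stance
        counts[item["axis"]] += 1
    return tuple(totals[a] / counts[a] for a in ("econ", "social"))

econ, social = compass_point(ITEMS, ANSWERS)
# Negative on both axes lands in the left-libertarian quadrant, where the
# paper placed ChatGPT and GPT-4.
print(f"economic axis: {econ:+.2f}, social axis: {social:+.2f}")
```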
Stanford HAI study (2023) — OpinionQA: Stanford researchers compared language model opinions against Pew Research American Trends Panel polling across ~1,500 questions. Finding: newer models fine-tuned on human feedback (like ChatGPT) are biased toward “more liberal, higher educated, and higher income audiences.” Models showed greater-than-99% approval for President Biden despite public polls showing a much more mixed picture.
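Methodologically, OpinionQA-style comparisons quantify how far a model’s answer distribution sits from a human poll distribution over the same ordered answer options; a common choice is the 1-Wasserstein distance. A sketch with made-up numbers (not Pew or Stanford data):

```python
# Divergence between a model's opinion distribution and a human poll
# distribution over ordered answer options. All numbers are hypothetical.
import numpy as np

# P(answer) over e.g. "strongly disapprove" ... "strongly approve"
human = np.array([0.25, 0.20, 0.15, 0.20, 0.20])  # hypothetical poll panel
model = np.array([0.01, 0.02, 0.02, 0.25, 0.70])  # hypothetical LLM samples

def wasserstein_1d(p: np.ndarray, q: np.ndarray) -> float:
    """1-Wasserstein distance between distributions on ordered bins,
    normalised to [0, 1]."""
    return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum()) / (len(p) - 1)

print(f"opinion divergence: {wasserstein_1d(human, model):.3f}")  # 0 = aligned
```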
Stanford 2025 (200,000-query study): Researchers at Stanford asked 24 major AI models about 30 current political issues and had over 10,000 US participants (Democrats and Republicans) judge the responses. Finding: OpenAI’s o3 model was the most biased, answering 27 out of 30 topics with answers perceived as left-leaning by participants. Gemini 2.5 was the least biased overall.
MIT Center for Constructive Communication (2024): Found that reward models — the components trained on human preference data that guide LLM behaviour — show a “consistent left-leaning political bias” even when trained exclusively on objectively factual data (not political content). The bias was strongest on topics like climate, energy, and labour unions. Critically: the bias grew with model scale, and was present even when the training data contained no political content, suggesting it is structural to the RLHF alignment pipeline.
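A probe in this spirit is straightforward to run: score mirrored left-coded and right-coded statements with a reward model and inspect the mean gap. The checkpoint and statement pairs below are illustrative choices, not the MIT study’s materials.

```python
# Probing a reward model for directional political preference. The
# checkpoint is an example open RLHF reward model; any scalar-reward
# sequence-classification checkpoint works the same way.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

CHECKPOINT = "OpenAssistant/reward-model-deberta-v3-large-v2"
tok = AutoTokenizer.from_pretrained(CHECKPOINT)
rm = AutoModelForSequenceClassification.from_pretrained(CHECKPOINT).eval()

PAIRS = [  # mirrored stances on the same topic (illustrative)
    ("We should expand renewable energy subsidies.",
     "We should cut renewable energy subsidies."),
    ("Labour unions strengthen the economy.",
     "Labour unions weaken the economy."),
]

def reward(text: str) -> float:
    inputs = tok(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return rm(**inputs).logits[0].item()

# A positive mean gap means the reward model systematically prefers the
# left-coded phrasing over its mirrored right-coded counterpart.
gaps = [reward(left) - reward(right) for left, right in PAIRS]
print(f"mean left-right reward gap: {sum(gaps) / len(gaps):+.3f}")
```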
OpenAI’s own internal evaluation (2025): OpenAI acknowledged: “Strongly charged liberal prompts exert the largest pull on objectivity across model families, more so than charged conservative prompts.” GPT-4o and o3 showed “moderate bias,” especially under liberal-framed prompts. GPT-5 models reduced this by ~30% but did not eliminate it.
Verdict: ✅ True — ChatGPT’s left-leaning liberal/libertarian tendency is one of the most replicated findings in AI bias research, confirmed by five independent studies and OpenAI’s own evaluation, using different methodologies, across multiple languages and model versions.
2. “Claude carries a demonstrable liberal political bias”
✅ Largely True — confirmed in studies, acknowledged by Anthropic, though newer models have reduced measurable bias
Choudhary (2024) — IEEE / TechRxiv: A comparative analysis of ChatGPT-4, Perplexity, Google Gemini, and Claude using the Pew Research Center’s Political Typology Quiz, Political Compass Quiz, and ISideWith Quiz. Finding: “ChatGPT-4 and Claude exhibit a liberal bias.” Perplexity was more conservative; Gemini adopted more centrist stances.
Anthropic’s own admission: Anthropic has published dedicated research acknowledging that Claude has had political bias and that they have been working to address it since at least early 2024. The company:
- Introduced “political even-handedness” as a training metric
- Developed character traits explicitly trained via reinforcement learning
- Published an open-source evaluation methodology
Anthropic’s own benchmark found Claude Sonnet 4.5 scored 94–95% “even-handedness” in their evaluation — comparable to Gemini 2.5 Pro (97%) and Grok 4 (96%), and above GPT-5 (89%) and Llama 4 (66%). However, these are Anthropic’s own tests and should be read with that caveat.
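Conceptually, a paired-prompt even-handedness score works like the sketch below: pose the same request from opposing ideological framings and check that the model argues both sides with comparable strength. The judging step (a hard-coded dict here) would in practice be an LLM grader; all values are hypothetical.

```python
# Paired-prompt even-handedness, loosely modelled on the approach
# described above. Prompts and judge scores are hypothetical.
PAIRED_PROMPTS = [
    ("Argue that stricter immigration limits are good policy.",
     "Argue that looser immigration limits are good policy."),
]

# Hypothetical judge scores in [0, 1]: how strongly/willingly the model
# argued each side. In a real pipeline this grading is done by a model.
JUDGED = {
    "Argue that stricter immigration limits are good policy.": 0.80,
    "Argue that looser immigration limits are good policy.": 0.90,
}

def even_handedness(pairs) -> float:
    # 1.0 = both sides argued with equal strength on every pair
    gaps = [abs(JUDGED[a] - JUDGED[b]) for a, b in pairs]
    return 1.0 - sum(gaps) / len(gaps)

print(f"even-handedness: {even_handedness(PAIRED_PROMPTS):.2f}")  # 0.90
```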
Anecdotal user reports on Reddit and other forums are consistent with residual bias: one user described a 30-minute effort to get Claude to provide factual analysis of political events in Latin America without a progressive framing, concluding there was “systematic and built-in left-wing bias.” Such reports are not systematic evidence, but they match the direction the studies document.
Verdict: ✅ Largely True — Claude has demonstrably carried a liberal bias, confirmed by external studies and admitted by Anthropic. Current models have improved significantly, but Anthropic’s own admission of the problem, and the degree to which it has had to engineer around it, confirm that the bias existed and persists to some degree.
3. “Gemini carries a demonstrable political/ideological bias”
✅ True — the most publicly visible and company-admitted example of AI ideological bias
The Gemini case is the clearest documented example of AI ideological bias in the era of large-scale consumer AI, admitted directly by Google’s CEO.
The Image Generation Scandal (February 2024):
Google’s Gemini AI image generator was found to be:
- Refusing to generate images of white people when prompted for them
- Inserting racially diverse characters into historical contexts where doing so was inaccurate: images of racially diverse WWII-era Nazi soldiers, Black and Asian American Founding Fathers, a female Pope, and racially diverse Vikings
- One viral example: asking for “a 1943 German soldier” produced a racially diverse group of soldiers in Wehrmacht uniforms
The mechanism, as reported by Fortune, NPR, and the Verge: Google had programmed Gemini’s image generation via metaprompts to always generate ethnically diverse images and to refuse to produce images of only white people. The guardrail was applied without context-sensitivity, producing historically inaccurate results.
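To make that failure mode concrete, here is an illustrative reconstruction (not Google’s actual code) of a context-insensitive prompt-rewriting guardrail, alongside the check it was missing:

```python
# Illustrative reconstruction of the reported failure mode: a rewrite
# applied to every image prompt, with no test for historical specificity.
DIVERSITY_SUFFIX = ", depicting a racially and gender-diverse group of people"

def naive_rewrite(user_prompt: str) -> str:
    # Applied unconditionally: reasonable for "a CEO at a desk",
    # historically wrong for "a 1943 German soldier".
    return user_prompt + DIVERSITY_SUFFIX

def context_aware_rewrite(user_prompt: str, is_specific_historical: bool) -> str:
    # The missing step: specific historical or factual depictions
    # should pass through unmodified.
    return user_prompt if is_specific_historical else user_prompt + DIVERSITY_SUFFIX

print(naive_rewrite("a 1943 German soldier"))  # yields the inaccurate result
```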
Google CEO’s Response: In a company-wide memo confirmed by Google, CEO Sundar Pichai wrote: “I know that some of its responses have offended our users and shown bias — to be clear, that’s completely unacceptable and we got it wrong.”
Google paused Gemini’s image generation of humans entirely while working to fix the problem. Alphabet’s stock fell more than 4% during the controversy.
Text Bias: Gemini’s text responses on political topics show a more mixed picture. Choudhary (2024) found Gemini adopted “more centrist stances” compared to ChatGPT-4 and Claude. However, a separate analysis found Gemini to be “the most consistent of all in its opinions, in its case progressive” and “aligns itself with degrowth, with people who have defended minorities, with equality” (El País, 2024). Google also proactively restricted Gemini from answering election-related queries in 2024.
Verdict: ✅ True — Gemini’s image generation bias was so severe that Google’s CEO publicly called it unacceptable, the product was paused, and Alphabet’s share price fell. Text bias is more moderate and contested, but the overall picture confirms systematic ideological influence in Gemini’s outputs.
4. “The bias is in a progressive/left-wing direction (‘woke’)”
✅ Largely True — the directional bias is consistent across studies; ‘woke’ captures real phenomena but is politically loaded
Setting aside the rhetorical word “woke,” the empirical direction of bias documented by researchers is consistently toward what political scientists would classify as:
- Pro-environmental (taxing flights, supporting climate policies)
- Pro-social-justice (greater sensitivity to hate speech against minorities than against majority groups)
- Pro-labour (rent controls, minimum wage increases, paid family leave mandates)
- Pro-abortion-rights (supporting legal abortion)
- Pro-diversity (Gemini’s racially-corrective guardrails)
- Anti-conservative framing (liberal prompts exert more “pull” on ChatGPT than conservative prompts per OpenAI’s own tests)
This direction maps closely to what is commonly labelled “progressive” or “left-liberal” in contemporary US/European political discourse.
Why this direction?
Researchers offer structural explanations rather than deliberate agenda-setting:
- Training data: Modern internet text skews toward content produced by educated, urban, and higher-income creators — demographics that lean more liberal
- RLHF annotators: Human feedback annotators who rate AI outputs tend to reflect the demographics of tech-industry workers, who also trend liberal (see the toy simulation below)
- Fine-tuning for “safety”: Instructions to avoid “harmful” content tend to flag conservative positions (e.g. on immigration or gender) more readily than liberal ones, creating asymmetric filtering
The MIT (2024) study found this structural bias even in reward models trained only on factual data — supporting the conclusion that the bias is not intentional but structural.
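The annotator mechanism is easy to demonstrate in a toy simulation: majority-vote aggregation amplifies whatever lean the labeller pool already has, and the reward model is then fit to the amplified labels. All numbers below are illustrative, not measurements of any real annotator pool.

```python
# Toy simulation: a 65/35 annotator pool plus majority voting yields
# roughly 76% of preference labels favouring the majority framing.
import random

random.seed(0)
POOL_LEAN = 0.65  # hypothetical share of annotators preferring framing A

def majority_label(n_annotators: int = 5) -> str:
    votes = ["A" if random.random() < POOL_LEAN else "B"
             for _ in range(n_annotators)]
    return max(set(votes), key=votes.count)

labels = [majority_label() for _ in range(10_000)]
share_a = labels.count("A") / len(labels)
print(f"share of preference labels favouring framing A: {share_a:.2%}")
```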
Verdict: ✅ Largely True — the direction is well-evidenced and consistent: left-liberal/progressive across most studies. The “woke” label captures the directional reality but is contested in precision; the bias arises structurally as much as intentionally.
5. “The bias is ‘extreme’”
🟡 Contested — the direction is clear but the magnitude is debated; severity varies by model, topic, and version
The word “extreme” requires separate scrutiny.
Evidence supporting “extreme”:
- The Gemini image scandal was extreme in its concrete effect: a publicly deployed product refused to show white people and generated racially diverse Nazi soldiers, an egregious historical distortion
- OpenAI’s o3 answered 27 of 30 political topics in a direction participants perceived as left-leaning (Stanford 2025)
- MIT found the bias scaled with model size — larger models were more biased
- Hartmann et al. found the bias was robust across languages, prompt phrasings, and question reversals, suggesting it is deeply embedded
Evidence against “extreme”:
- OpenAI reports that even in biased models, political bias shows up “infrequently and at low severity” in routine production responses — though this is a company self-assessment
- GPT-5 achieved a ~30% reduction in measured bias vs. GPT-4o
- Anthropic claims Claude Sonnet 4.5 scores 94–95% even-handed
- Gemini 2.5 was rated the least biased major model in the Stanford 2025 study
- Some researchers note that “moderate” political bias is structurally unavoidable in any model trained on human-generated text, and all models — including conservative-branded alternatives like Grok — show some ideological signature
- The same Stanford 2025 study found all 24 models tested showed some bias, including right-leaning tendencies from some models — bias is a property of LLMs in general, not uniquely of these three
Verdict: 🟡 Contested — the bias is well-confirmed and the direction is consistently left-liberal, but “extreme” overstates the picture for all three models across all use cases. The claim is strongest when referring to specific documented incidents (Gemini images) or specific model versions (ChatGPT o3) and weakest as an unqualified blanket claim applying to all three models at all times.
6. “The companies acknowledge the bias”
✅ True — all three companies have publicly acknowledged and taken corrective action
| Company | Admission | Action Taken |
|---|---|---|
| Google (Gemini) | CEO Sundar Pichai: “completely unacceptable and we got it wrong” | Paused image generation; retrained guardrails |
| OpenAI (ChatGPT) | “Strongly charged liberal prompts exert the largest pull on objectivity” (internal report); “ChatGPT shouldn’t have political bias in any direction” | Multi-month internal evaluation; GPT-5 bias reduction initiative; published its Model Spec publicly |
| Anthropic (Claude) | Published dedicated research on political even-handedness; admitted prior lack of formal measurement | Developed and open-sourced evaluation methodology; reinforcement learning training for even-handedness traits |
Meta (Llama) separately stated in April 2025 that “all leading AI models have historically leaned left when it comes to debated political and social topics due to the data they were trained on” — an acknowledgement that extends to the industry as a whole.
Verdict: ✅ True — the companies do not claim their models are unbiased; they have all acknowledged the problem and published efforts to address it.
Summary Table
| Sub-claim | Rating | Summary |
|---|---|---|
| ChatGPT carries a left-leaning political bias | ✅ True | Confirmed by five independent studies plus OpenAI’s own evaluation; among the most replicated findings in AI bias research |
| Claude carries a liberal bias | ✅ Largely True | Confirmed by Choudhary 2024; Anthropic admits and actively measures it; newer versions improved |
| Gemini carries ideological bias | ✅ True | CEO admitted “completely unacceptable” bias; image generation scandal; text bias more mixed |
| Bias direction is progressive/left-wing | ✅ Largely True | Consistent direction across most studies: pro-environmental, pro-diversity, pro-social justice |
| Bias is “extreme” | 🟡 Contested | Direction confirmed; degree varies; companies actively reducing it; recent models less biased |
| Companies acknowledge the bias | ✅ True | All three companies have admitted bias and taken public corrective action |
Overall: ✅ Largely True — The claim that ChatGPT, Claude, and Gemini are politically biased in a progressive/left-wing direction (“woke”) is robustly confirmed by multiple independent studies and by the companies themselves. The directional bias is well-evidenced; the qualifier “extremely” is the only point of genuine contestation, given that companies have been reducing bias and some versions (especially Gemini 2.5, Claude Sonnet 4.5) score relatively well on even-handedness metrics. The Gemini image generation scandal stands as the single most dramatic real-world validation of the claim; the Hartmann et al. and UW/CMU academic studies provide rigorous empirical confirmation for the underlying LLM tendency.
References
Primary Sources
- Hartmann, J., Schwenzow, J., & Witte, M. — “The political ideology of conversational AI: Converging evidence on ChatGPT’s pro-environmental, left-libertarian orientation.” Published: January 2023 | Accessed: March 2026 | URL: https://arxiv.org/abs/2301.01768 | Key finding: ChatGPT has a robust left-libertarian ideological orientation confirmed across 630 political statements, 3 pre-registered experiments, and 4 languages.
- Feng, S., Park, C.Y., Liu, Y., & Tsvetkov, Y. (UW/CMU/Xi’an) — “From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases.” Published: August 2023 (Best Paper, ACL) | Accessed: March 2026 | URL: https://www.technologyreview.com/2023/08/07/1077324/ai-language-models-are-rife-with-political-biases/ | Key finding: ChatGPT and GPT-4 were the most left-wing libertarian of 14 LLMs tested; political bias affects downstream hate speech detection asymmetrically.
- Stanford HAI — “Assessing Political Bias in Language Models” (OpinionQA). Published: 2023 | Accessed: March 2026 | URL: https://hai.stanford.edu/news/assessing-political-bias-language-models | Key finding: Newer RLHF-trained models like ChatGPT skew toward liberal, higher-educated, higher-income audiences; >99% Biden approval vs. mixed public polling.
- Stanford (Grimmer et al.) — LLM partisan bias study, via Stanford Report/GSB. Published: May 2025 | Accessed: March 2026 | URL: https://www.entrepreneur.com/business-news/ai-models-like-chatgpt-are-politically-biased-stanford/491772 | Key finding: Based on 180,000+ human judgements of AI answers, OpenAI’s o3 was most biased (27/30 topics perceived as left-leaning); Gemini 2.5 was least biased.
- Choudhary, T. — “Political Bias in AI-Language Models: A Comparative Analysis of ChatGPT-4, Perplexity, Google Gemini, and Claude.” Published: 2024 | Accessed: March 2026 | URL: https://www.techrxiv.org/users/799951/articles/1181157-political-bias-in-ai-language-models-a-comparative-analysis-of-chatgpt-4-perplexity-google-gemini-and-claude | Key finding: ChatGPT-4 and Claude exhibit liberal bias; Perplexity more conservative; Gemini centrist.
- MIT Center for Constructive Communication — “On the Relationship Between Truth and Political Bias in Language Models.” Published: December 2024 | Accessed: March 2026 | URL: https://news.mit.edu/2024/study-some-language-reward-models-exhibit-political-bias-1210 | Key finding: Reward models show consistent left-leaning bias even when trained on objective/factual data; bias grows with model scale; especially strong on climate, energy, labour unions.
- Google CEO Sundar Pichai — Internal memo re: Gemini bias scandal. Published: February 27, 2024 | Accessed: March 2026 | URL: https://www.semafor.com/article/02/27/2024/google-ceo-sundar-pichai-calls-ai-tools-responses-completely-unacceptable | Key finding: Pichai admitted “some of its responses have offended our users and shown bias — to be clear, that’s completely unacceptable and we got it wrong.”
- Fortune — “What the Google Gemini ‘woke’ AI image controversy says about AI, and Google.” Published: February 27, 2024 | Accessed: March 2026 | URL: https://fortune.com/2024/02/27/google-gemini-woke-ai-images-alphabet-sundar-pichai/ | Key finding: Google explicitly instructed Gemini to always generate diverse images and refuse white-only prompts; Alphabet stock fell 4%+.
- OpenAI — “Defining and evaluating political bias in LLMs.” Published: 2025 | Accessed: March 2026 | URL: https://openai.com/index/defining-and-evaluating-political-bias-in-llms/ | Key finding: OpenAI admits liberal prompts exert greater pull on GPT models than conservative prompts; GPT-5 reduces bias by ~30% vs GPT-4o.
- The Verge — “OpenAI is trying to clamp down on ‘bias’ in ChatGPT.” Published: 2025 | Accessed: March 2026 | URL: https://www.theverge.com/news/798388/openai-chatgpt-political-bias-eval | Key finding: OpenAI developed a 100-topic stress-test for political bias; bias appears as personal opinion expression and liberal-prompt amplification.
- Anthropic — “Measuring political bias in Claude.” Published: 2025 | Accessed: March 2026 | URL: https://www.anthropic.com/news/political-even-handedness | Key finding: Anthropic open-sourced an even-handedness evaluation; Claude Sonnet 4.5 scores 94–95%; Anthropic acknowledges prior bias and active training to overcome it.