In December 2020, Timnit Gebru, then co-lead of Google’s Ethical AI team, received an email while on vacation informing her that Google had fired her. Google had asked her either to retract a paper she had co-authored with colleagues or to remove the Google employees’ names from it, and she refused. The paper’s concerns (hallucinations and lack of understanding, bias amplification, environmental costs, training data that cannot be audited, and language centralization) have all been borne out in practice in the five years since.

Five Prophecies Matched to Reality: Verified Cases and Data

Hallucinations and Lack of Understanding: The 2021 paper described the phenomenon later known as “hallucination”: LLMs merely stitch linguistic forms together according to probabilities, “without any reference to meaning.” This has become a recognized flaw of mainstream LLM-based systems and has been confirmed in multiple independent academic evaluations.

Bias Amplification: Amazon’s AI recruiting tool, in development since 2014, was abandoned after it was found to systematically downgrade female applicants; the case became public in 2018. The model had learned male-favoring evaluation standards from historical resumes dominated by men. In 2019, Obermeyer et al. published a study in Science showing that a widely used healthcare risk algorithm used “healthcare spending” as a proxy for “disease severity.” As a result, Black patients were actually sicker than white patients with the same risk score. The study found that correcting this would raise the proportion of Black patients identified as needing additional care from 17.7% to 46.5%.
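
The proxy problem described above can be illustrated with a toy sketch. Everything in it is a made-up assumption for illustration: the `Patient` record, the group labels `A` and `B`, the ten illness levels, and the spending rates of 1000 and 700 per condition. It does not reproduce the Obermeyer et al. data or the 17.7% and 46.5% figures; it only shows how ranking by a spending proxy can under-select a group that is equally sick but generates less spending.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    group: str       # hypothetical group label
    illness: int     # true need, e.g. number of chronic conditions
    spending: float  # the proxy the risk score is built on


def make_patients(spend_per_condition):
    """Give every group the same illness distribution (0..9 conditions)."""
    return [
        Patient(group, illness, illness * rate)
        for group, rate in spend_per_condition.items()
        for illness in range(10)
    ]


def flagged_share(patients, score, top_n, group):
    """Share of `group` among the top_n patients ranked by `score`."""
    flagged = sorted(patients, key=score, reverse=True)[:top_n]
    return sum(p.group == group for p in flagged) / top_n


# Assumption: group B generates less spending per condition than group A.
patients = make_patients({"A": 1000.0, "B": 700.0})

share_by_proxy = flagged_share(patients, lambda p: p.spending, 6, "B")
share_by_need = flagged_share(patients, lambda p: p.illness, 6, "B")

print(f"Group B share when ranking by spending: {share_by_proxy:.2f}")
print(f"Group B share when ranking by illness:  {share_by_need:.2f}")
```

In this toy setup, group B makes up half of the six sickest patients but only a third of the six patients flagged by the spending-based score.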

Environmental Costs: Google’s 2024 environmental report disclosed that its greenhouse gas emissions in 2023 were about 14.3 million metric tons of CO₂e, up 48% from the 2019 baseline. Google attributed the rise mainly to a sharp increase in electricity consumption at AI-driven data centers, which directly threatens its goal of reaching net-zero emissions by 2030.
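
The two figures in this paragraph also imply a baseline. As a rough back-of-the-envelope sketch (the variable names are illustrative, and the derived values follow from the rounded numbers quoted here rather than being taken from Google’s report):

```python
# Figures quoted in the text (rounded).
emissions_2023_mt = 14.3   # million metric tons CO2e in 2023
increase_vs_2019 = 0.48    # 48% above the 2019 baseline

# Derived values: implied by the two figures above, not quoted from the report.
baseline_2019_mt = emissions_2023_mt / (1 + increase_vs_2019)
absolute_rise_mt = emissions_2023_mt - baseline_2019_mt

print(f"Implied 2019 baseline: {baseline_2019_mt:.2f} Mt CO2e")
print(f"Implied absolute rise: {absolute_rise_mt:.2f} Mt CO2e")
```

Under these rounded inputs, the 2019 baseline works out to roughly 9.7 million metric tons, so the increase is on the order of 4.6 million metric tons.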

Training Data Cannot Be Audited: In December 2023, the Stanford Internet Observatory found 3,226 suspected instances of child sexual abuse material (CSAM) in the LAION-5B dataset, which contains 5.85 billion image-text pairs and had been used to train Stable Diffusion. Of these, 1,008 were confirmed by external organizations. LAION-5B was taken offline immediately afterward.

Language Centralization: A 2024 study by Thompson et al. analyzed a web corpus of 6.38 billion sentences and found that 57.1% of them belonged to multilingual parallel sets, meaning the same content appears in several languages and is likely low-quality, repetitive machine translation. The proportion was even higher for low-resource languages, which suggests that their corpora are being polluted by machine-translation artifacts.

Verified Facts About Gebru’s Dismissal and the Paper’s Background

The paper had six authors, four of whom were Google employees. Google asked that the paper be retracted or that its employees’ names be removed from it. Gebru refused and was informed of her termination while she was still on vacation.

The paper was formally published in March 2021. It argues that the financial and competitive incentives of companies building LLMs make it structurally impossible for “safety and ethics” to slow the pace of product launches. Gebru’s firing has itself been widely cited as concrete confirmation of this structural argument.

Frequently Asked Questions

What is the core academic claim of the “Stochastic Parrots” paper?

The paper’s core argument has two layers. The first is technical: it identifies five categories of systemic risk in LLMs, namely hallucinations, bias amplification, environmental costs, training data that cannot be audited, and language centralization. The second, more fundamental layer explains why these five risks are hard to resolve: under competitive and financial pressure, companies building LLMs are structurally inclined to prioritize speed over safety. The paper passed peer review at the ACM FAccT conference.

How were the bias issues in Amazon’s AI recruiting tool discovered and handled?

According to public reporting, Amazon began developing its AI recruiting tool in 2014. The model was trained on a decade of historical resume data dominated by men, so it learned evaluation patterns that favored men and automatically penalized resumes containing phrases such as “women’s chess club.” The bias became public in 2018. Amazon scrapped the tool and said it had not been used to evaluate real applicants.

Is the rise in carbon emissions in Google’s 2024 environmental report fully attributable to AI?

According to Google’s 2024 environmental report, greenhouse gas emissions in 2023 were about 14.3 million metric tons of CO₂e, up 48% from the 2019 baseline. Google states that the main cause was a sharp rise in electricity consumption at AI-driven data centers. It does not claim that AI caused 100% of the increase, but it identifies the expansion of AI infrastructure as the most important driver.
