Google’s Gemini forces contractors to evaluate AI responses beyond their areas of expertise.

  • AI
  • December 19, 2024

The Unsung Heroes of AI Development

Generative AI systems, such as those developed by Google and OpenAI, have captivated the world with their seemingly magical abilities to produce human-like responses to a wide range of questions. However, behind the scenes, a crucial group of employees known as "prompt engineers" and analysts are working tirelessly to improve the accuracy of these chatbots.

The Concerns Surrounding Gemini

A recent internal guideline passed down from Google to contractors working on Gemini has raised concerns that the system may become more prone to producing inaccurate information on highly sensitive topics, such as healthcare. Under the updated guideline, contractors working with GlobalLogic, an outsourcing firm owned by Hitachi, are no longer allowed to skip certain prompts, even if they lack the domain expertise required to evaluate them.

The Role of Contractors in AI Development

Contractors working with GlobalLogic play a critical role in evaluating the accuracy of AI-generated responses. They are tasked with rating the outputs according to various factors, including "truthfulness." Previously, contractors could skip certain prompts if they lacked the necessary expertise, but this is no longer an option.

The Updated Guidelines: A Concern for Accuracy

According to internal correspondence seen by TechCrunch, the updated guidelines read:

"You should not skip prompts that require specialized domain knowledge. Instead, you should rate the parts of the prompt you understand and include a note that you don’t have domain knowledge."

This change has led to concerns about Gemini’s accuracy on certain topics, as contractors are sometimes tasked with evaluating highly technical AI responses about issues like rare diseases.

Contractors Weigh in: Concerns About Accuracy

Internal correspondence shows that contractors were initially concerned about the updated guidelines. One contractor noted:

"I thought the point of skipping was to increase accuracy by giving it to someone better?"

Under the new guidelines, contractors may skip prompts in only two cases:

  • If they’re "completely missing information" like the full prompt or response.
  • If they contain harmful content that requires special consent forms to evaluate.

Google’s Response: A Commitment to Factual Accuracy

When asked for comment, Google spokesperson Shira McNamara responded:

"We are constantly working to improve factual accuracy in Gemini. Raters perform a wide range of tasks across many different Google products and platforms. They do not solely review answers for content; they also provide valuable feedback on style, format, and other factors."

However, McNamara also acknowledged that the ratings provided by contractors do not directly impact the algorithms used to generate responses.

The Impact on Factual Accuracy

The updated guidelines have raised concerns about Gemini’s ability to produce accurate information on sensitive topics. Requiring contractors to evaluate prompts outside their domain expertise creates a risk that inaccurate responses will be rated as acceptable and go uncorrected.

Conclusion

The development of generative AI systems relies heavily on the work of prompt engineers and analysts who evaluate the accuracy of these systems. However, recent changes to internal guidelines have raised concerns about Gemini’s ability to produce accurate information on sensitive topics. The importance of human evaluation in AI development cannot be overstated, and it is crucial that companies like Google prioritize factual accuracy in their AI systems.
