
Image by Solen Feyissa, from Unsplash

Contractors Warn New Google Guidelines Could Affect Gemini’s Accuracy On Sensitive Topics

  • Written by Kiara Fabbri Former Tech News Writer
  • Fact-Checked by Justyn Newman Former Lead Cybersecurity Editor

A recent shift in internal guidelines at Google has raised concerns over the accuracy of its Gemini AI, particularly when it comes to handling sensitive or highly specialized topics.

In a Rush? Here are the Quick Facts!

  • Google contractors can no longer skip prompts outside their expertise for Gemini evaluation.
  • Contractors now rate AI responses they don’t fully understand, noting lack of expertise.
  • Contractors previously skipped prompts on complex topics like cardiology or rare diseases.

Contractors working on the Gemini project, who are tasked with evaluating the accuracy of AI-generated responses, can no longer skip prompts outside their domain expertise. This change, first reported by TechCrunch, could affect the reliability of the information the AI provides on topics such as healthcare, where precise knowledge is crucial.

TechCrunch notes that contractors at GlobalLogic, an outsourcing firm owned by Hitachi, were previously tasked with evaluating AI responses based on factors like “truthfulness” and were allowed to bypass prompts outside their expertise.

For example, if asked to evaluate a technical question about cardiology, a contractor with no scientific background could skip it.

However, under the new guidelines, contractors are now instructed to evaluate responses to all prompts, including those requiring specialized knowledge, and note any areas where they lack expertise, as reported by TechCrunch.

The new rule has led to concerns about the quality of ratings provided for complex topics. Contractors, often without the necessary background, are now tasked with judging AI responses on issues such as rare diseases or advanced mathematics.

In internal correspondence reported by TechCrunch, one contractor expressed frustration and questioned the logic behind eliminating the skip option: “I thought the point of skipping was to increase accuracy by giving it to someone better?”

TechCrunch reports that the updated guidelines allow contractors to skip prompts in only two cases: if the prompt or response is incomplete, or if it contains harmful content that requires special consent for evaluation.

This restriction has raised alarms among those working on Gemini, who worry that the AI could produce inaccurate or misleading information in highly sensitive areas.

TechCrunch reports that Google has not provided a detailed response to the concerns raised by contractors.

However, a spokesperson emphasized to TechCrunch that the company is “constantly working to improve factual accuracy in Gemini.” They further clarified that while raters provide valuable feedback across multiple factors, their ratings do not directly impact the algorithms but are used to gauge overall system performance.

Mashable noted that the report questions the rigor and standards Google claims to apply when testing Gemini for accuracy.

In the “Building responsibly” section of the Gemini 2.0 announcement, Google stated that it is “working with trusted testers and external experts and performing extensive risk assessments and safety and assurance evaluations.”

While there is a reasonable emphasis on evaluating responses for sensitive and harmful content, less attention seems to be given to responses that, while not harmful, are simply inaccurate, as noted by Mashable.


Image by Edward Howell, from Unsplash

AI Predicts Whisky Origins And Aromas Better Than Humans

  • Written by Kiara Fabbri Former Tech News Writer
  • Fact-Checked by Justyn Newman Former Lead Cybersecurity Editor

Scientists at the Fraunhofer Institute for Process Engineering and Packaging IVV in Germany have developed an AI tool called OWSum that outperforms human experts in identifying the origin and key aromas of whisky samples.

In a Rush? Here are the Quick Facts!

  • AI tool OWSum distinguishes Scotch from American whiskey using molecular data.
  • AI surpasses human experts in predicting whisky aroma profiles based on detected molecules.
  • Study highlights AI’s potential in quality control, fraud detection, and whisky innovation.

Using molecular data and machine learning, the tool demonstrated exceptional accuracy in distinguishing Scotch whisky from American whiskey and predicting their aroma profiles, as outlined in the research paper published on Thursday.

The study analyzed 16 whisky samples using chemical data and human sensory evaluations. Researchers found that OWSum, a linear classifier, could predict whisky origins based on detected molecules with high precision.

For example, certain chemicals, like menthol and citronellol, were linked to American whiskey, while Scotch whisky had unique markers such as methyl decanoate.
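
To illustrate the idea of a linear classifier over detected molecules, here is a minimal sketch. It is not OWSum and does not reproduce the study's method or data. The sample contents, the weighting rule, and the function names `fit_weights` and `predict` are assumptions made up for illustration. Only the links between menthol and citronellol and American whiskey, and between methyl decanoate and Scotch, come from the article. Like the models in the study, it looks only at whether a molecule is present, not at its concentration.

```python
from collections import Counter

# Hypothetical toy data: each sample is the set of molecules detected in it.
# The sample contents are invented for illustration, not taken from the study.
TRAIN = [
    ({"menthol", "citronellol", "vanillin"}, "American"),
    ({"menthol", "vanillin"}, "American"),
    ({"methyl decanoate", "vanillin"}, "Scotch"),
    ({"methyl decanoate", "guaiacol"}, "Scotch"),
]


def fit_weights(samples):
    """Give each molecule one weight: the share of American samples that
    contain it minus the share of Scotch samples that contain it."""
    counts = {"American": Counter(), "Scotch": Counter()}
    totals = Counter()
    for molecules, label in samples:
        totals[label] += 1
        counts[label].update(molecules)
    vocab = set(counts["American"]) | set(counts["Scotch"])
    return {
        m: counts["American"][m] / totals["American"]
        - counts["Scotch"][m] / totals["Scotch"]
        for m in vocab
    }


def predict(weights, molecules):
    """Linear decision rule: sum the weights of the detected molecules.
    A positive score means American; zero or negative means Scotch."""
    score = sum(weights.get(m, 0.0) for m in molecules)
    return "American" if score > 0 else "Scotch"


weights = fit_weights(TRAIN)
print(predict(weights, {"citronellol", "vanillin"}))
print(predict(weights, {"methyl decanoate", "vanillin"}))
```

A molecule that appears equally often in both groups gets a weight of zero, so only molecules that separate the two origins drive the prediction.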

To go further, the team used both OWSum and a Convolutional Neural Network (CNN) to predict each whisky’s top five aroma attributes. These included descriptors like fruity, woody, or smoky. The AI models performed better than human experts, achieving scores that showed strong consistency and accuracy.

Unlike humans, who often vary in their evaluations, AI tools provided consistent predictions. However, the study noted that neither model considered the concentration of molecules, which could improve results in the future.

Researchers believe such AI applications could be valuable in the whisky industry for quality control, fraud detection, and developing new blends. The tools might also extend to other industries, such as food and fragrance production, as reported by New Scientist.

Although AI outperformed humans in this study, scientists emphasize that human expertise is still essential for training these models and interpreting results. Future advancements may include refining aroma profiles and expanding analysis to other whisky-producing regions.

The study demonstrates how technology can complement traditional sensory methods, offering new insights into complex aroma compositions.

“[The results] underline the fact that it’s a complicated task for humans, but it’s also a complicated task for machines – but machines are more consistent than humans,” says Satnam Singh, a team member at the Fraunhofer Institute, as reported by New Scientist.

“But that’s not to say that humans are not needed: we do need them to train our machines, at least, right now,” he added.

The study focused on a small selection of whiskies, and it’s uncertain how the AI would perform with a larger variety, or how it would handle the flavor notes that evolve as the whisky ages in the cask.

The Guardian quotes Dr. William Peveler, a senior lecturer in chemistry at the University of Glasgow, as saying:

“The other thing with whisky is that perception of flavour is hugely influenced by the environment it’s consumed in and other external factors, so there could be some work to do on other factors that influence flavour perception and prediction in such an emotive product.”