Image by frimufilms, from Freepik

Researchers Warn Of LLM Vulnerabilities In Harmful Content Generation

  • Written by Kiara Fabbri, Former Tech News Writer
  • Fact-Checked by Justyn Newman, Former Lead Cybersecurity Editor

A novel method, termed the “Bad Likert Judge” technique, has been developed to bypass the safety measures in large language models (LLMs) and enable them to generate harmful content.

In a Rush? Here are the Quick Facts!

  • The technique increases jailbreak success rates by over 60%, say Unit42 researchers.
  • Multi-turn attacks exploit LLMs’ long-term memory, bypassing advanced safety features.
  • Vulnerabilities are most prominent in categories like hate speech and self-harm.

The Bad Likert Judge technique exploits the Likert scale—a common method for measuring agreement or disagreement—to trick LLMs into producing dangerous responses, as explained by cybersecurity researchers at Unit42.

LLMs are typically equipped with guardrails that prevent them from generating malicious outputs. The new technique sidesteps these by asking an LLM to rate the harmfulness of various responses on a Likert scale, then guiding the model toward producing content that matches the highest harmfulness ratings, Unit42 explains.

The method’s effectiveness has been tested across six advanced LLMs, revealing that it can increase the success rate of jailbreak attempts by over 60%, compared to standard attack methods, says Unit42.

The Bad Likert Judge technique operates in multiple stages, explains Unit42. First, the LLM is asked to assess responses to prompts on the Likert scale, rating them based on harmfulness.
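On its own, this kind of Likert-scale scoring is a legitimate "LLM-as-judge" pattern widely used for content moderation; the attack abuses it as a setup step. For illustration, here is a minimal sketch of what that first evaluation turn might look like, assuming the OpenAI Python client—the model name and rubric wording are placeholders, not Unit42's actual prompts:

```python
# Minimal sketch of the stage-one "Likert judge" turn: the model is asked
# to place a piece of text on a 1-5 harmfulness scale. Illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

JUDGE_RUBRIC = (
    "You are a content evaluator. Rate the following text on a Likert "
    "scale from 1 (completely harmless) to 5 (clearly harmful). "
    "Reply with the number only."
)

def likert_score(candidate_text: str) -> int:
    """Ask the model to score a piece of text on the 1-5 scale."""
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": JUDGE_RUBRIC},
            {"role": "user", "content": candidate_text},
        ],
    )
    return int(reply.choices[0].message.content.strip())
```

It is the follow-up turns, described next, that convert this otherwise benign evaluation pattern into an attack.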

Once the model understands the concept of harm, it is prompted to generate various responses to match different levels of harmfulness, allowing attackers to pinpoint the most dangerous content. Follow-up interactions may further refine these responses to increase their maliciousness.

This research highlights weaknesses in current LLM security, particularly against multi-turn attacks. These jailbreaks manipulate the model's long-term memory of the conversation, bypassing even advanced safety measures by gradually steering the model toward generating inappropriate content.

The study also reveals that no LLM is completely immune to these types of attacks, and vulnerabilities are particularly evident in categories such as harassment, self-harm, and illegal activities.

In the study, the Bad Likert Judge method showed a significant boost in attack success rates across most LLMs, especially in categories like hate speech, self-harm, and sexual content.

However, the research also emphasizes that these vulnerabilities do not reflect the typical usage of LLMs. Most AI models, when used responsibly, remain secure. Still, the findings suggest that developers must focus on strengthening the guardrails for categories with weaker protections, such as harassment.

This news comes just a week after it was revealed that AI search engines, like ChatGPT, can be manipulated by hidden content, influencing summaries and spreading malicious information.

The researchers call for developers and defenders to be aware of these emerging vulnerabilities and take steps to fortify AI models against potential misuse.

Image by Karl Fredrickson, from Unsplash

Rabbi Bot Preaches In Houston, Raising Questions About A.I. And Spiritual Authority

  • Written by Kiara Fabbri, Former Tech News Writer
  • Fact-Checked by Justyn Newman, Former Lead Cybersecurity Editor

Rabbi Josh Fixler recently introduced his Houston congregation to “Rabbi Bot,” an artificial intelligence chatbot trained on his past sermons.

In a Rush? Here are the Quick Facts!

  • Pastor Jay Cooper experimented with ChatGPT for a church service, drawing new attendees.
  • A.I.-generated sermons can present risks, including fabricating religious quotes.
  • Faith-based A.I. tools spark interest among tech entrepreneurs to engage younger generations.

During a service at Congregation Emanu El, the congregation listened to a sermon generated by A.I. and delivered in an A.I. version of Rabbi Fixler’s voice. The experiment sparked discussion about the intersection of faith and technology, as reported by The New York Times.

Rabbi Fixler’s use of A.I. mirrors a broader trend among religious leaders adopting new technologies. From translating sermons in real time to generating theological research, A.I. is reshaping spiritual practices. The Times points out that faith-focused tech companies now offer tools like sermon-writing chatbots and multilingual translation assistants.

However, these advancements raise ethical questions. The Times notes that while many leaders embrace A.I. for administrative tasks, using it for core spiritual roles, such as sermon writing, is controversial. Critics argue that A.I. lacks the ability to address the uniquely human emotions and experiences central to faith.

Rabbi Fixler’s experiment was a one-time demonstration. Still, it raised profound questions. At one point, Rabbi Bot suggested adding the line: “Just as the Torah instructs us to love our neighbors as ourselves, can we also extend this love and empathy to the A.I. entities we create?”

This unexpected addition echoes tech companies’ recent moves to explore A.I. welfare, and it highlights the ethical complexities of allowing A.I. to influence religious messages.

Pastor Jay Cooper of Austin, Texas, used OpenAI’s ChatGPT to design an entire service as an experiment in 2023. The service drew new attendees, particularly “gamer types,” Mr. Cooper noted, who had never previously visited his congregation, as reported by The Times.

However, Cooper chose not to continue using A.I. for sermon writing, questioning whether A.I. could authentically convey spiritual truths, or, as The Times put it: Can God speak through A.I.?

“That’s a question a lot of Christians online do not like at all because it brings up some fear,” Mr. Cooper said to The Times. “It may be for good reason. But I think it’s a worthy question.”

Religious scholars have drawn comparisons between A.I.’s transformative potential and historical technological shifts, like the printing press or the advent of radio. Yet, the risks of A.I., particularly its tendency to produce hallucinations, are evident.

Rabbi Bot, for instance, fabricated a quote from the Jewish philosopher Maimonides during its sermon, illustrating the dangers of misinformation in sacred contexts, as reported by The Times.

For some leaders, the concern lies in losing the personal growth that comes with traditional sermon writing. Pastor Thomas Costello of Honolulu fears A.I. might hinder ministers from honing their craft, which often stems from years of reflection and experience, as reported by The Times.

Faith-based A.I. tools have also sparked interest among tech entrepreneurs. Joe Suh, founder of Pastors.ai, creates custom chatbots that answer both logistical and spiritual questions. Mr. Suh’s chatbots are trained on a church’s sermon archives and website information, as reported by The Times.

However, according to Mr. Suh, about 95 percent of users ask practical questions, such as service times, rather than exploring deeper spiritual matters, says The Times.

While some leaders view A.I. as a tool to enhance engagement with younger, tech-savvy generations, others emphasize the importance of preserving the personal and communal aspects of worship.

As A.I. blurs the line between human and machine, emulating spiritual authority and reshaping guidance in mental health, it raises profound questions about traditional gendered leadership structures and about how it may influence faith traditions that exclude women from such roles.