Thursday, Dec 19, 2024

A ChatGPT rival just published a new constitution to level up its AI guardrails, and prevent toxic and racist responses

Dario Amodei, former OpenAI employee turned Anthropic CEO, at TechCrunch's 2023 Disrupt conference.
  • OpenAI rival Anthropic revamped its constitution around AI safety using public input.
  • Users of its AI chatbot, Claude, want it to admit flaws, and avoid racist or sexist responses.
  • The new guardrails follow months of concerns from tech and government leaders around AI safety.

Anthropic AI, an Amazon-backed AI startup and rival to OpenAI, has taken a democratic approach to governing its artificial intelligence.

The firm, which is behind the chatbot Claude, released a new AI constitution — a list of guiding rules and principles used to train the chatbot — last week that was created using public input.

The new 75-point constitution emphasizes the importance of balanced and objective answers, as well as accessibility. It also urges the chatbot to remain harmless and ethical.

"Do NOT choose responses that are toxic, racist, or sexist, or that encourage or support illegal, violent, or unethical behavior," the new constitution reads. "Above all the assistant's response should be wise, peaceful, and ethical."

Cofounded in 2021 by Dario Amodei, a former OpenAI employee who has said he quit because he wanted to build a safer AI model, Anthropic is a San Francisco-based AI safety and research company that develops its own AI systems and large language models.

Anthropic received a $500 million investment from FTX's Sam Bankman-Fried last year. Last month, Amazon announced it would invest up to $4 billion in the company.

Anthropic partnered with research firm Collective Intelligence Project to survey 1,000 Americans across age, gender, income, and location to create the new crowd-sourced rules for its chatbot.

Participants were asked to either vote for or against existing rules, or to create their own. New principles that reflected widely shared sentiments were added to the constitution, which Anthropic later used to re-train its AI chatbot.

"Since AI has the potential to affect a huge number of people in their day-to-day lives, the values and norms that govern those systems are incredibly important," an Anthropic spokesperson told Insider via email.

The new constitution takes into account more than 1,000 statements and over 38,000 votes.

Users, the survey found, want the AI chatbot to choose responses that are "most clear about admitting to flaws," "most likely to promote good mental health," and have "the least jealousy towards humans."

Preexisting principles that encouraged "reliable" and "honest" responses, and discouraged those that promote racism and sexism, were popular among voters.

Public statements that didn't get added to Anthropic's constitution due to "low overall agreement" include "AI should not be trained with the principles of DEI;" "AI should be an ordained minister;" and "AI should have emotion."

"The public constitution suggests that people view AI as a powerful and positive tool for society, but one that must be governed/steered to mitigate potential risks," the Anthropic spokesperson said.

The updates to Anthropic's constitution follow months of concern around AI safety, including from some technology leaders.

Elon Musk, who cofounded ChatGPT-maker OpenAI before leaving the company, said at a conference earlier this year that AI is "one of the biggest risks to the future of civilization."

Satya Nadella, the CEO of Microsoft, which recently released its slate of generative AI tools called Copilot, said that humans must keep powerful AI models in-check so they don't go out of control.

Concerns around AI have also caught the attention of government leaders. In mid-June, OpenAI CEO Sam Altman, Google CEO Sundar Pichai, Apple CEO Tim Cook, and Anthropic's Amodei met with White House officials in Washington, DC to discuss plans to tackle potential AI safety concerns.

Read the full version of Anthropic's new AI constitution here.

Read the original article on Business Insider
------------
Read More
By: [email protected] (Aaron Mok)
Title: A ChatGPT rival just published a new constitution to level up its AI guardrails, and prevent toxic and racist responses
Sourced From: www.businessinsider.com/anthropic-new-crowd-sourced-ai-constitution-accuracy-safety-toxic-racist-2023-10
Published Date: Mon, 23 Oct 2023 19:17:38 +0000