sourcegraph
February 21, 2024

In 2020, researchers at the Center on Terrorism, Extremism, and Counterterrorism at the Middlebury Institute of International Studies found that GPT-3, the technology behind ChatGPT, “has an impressively deep understanding of extremist communities” and could be prompted to produce text in the style of mass shooters, fake forum posts discussing Nazism, defenses of QAnon, and even multilingual extremist texts.

OpenAI uses both machines and humans to monitor content that is fed into and produced by ChatGPT, a spokesperson said. The company relies on its human AI trainers and on feedback from users to identify and filter out toxic training data, while teaching ChatGPT to generate more informed responses.

OpenAI’s policies prohibit the use of its technology to promote dishonesty, deceive or manipulate users, or attempt to influence politics; the company offers a free moderation tool to handle content that promotes hate, self-harm, violence, or sex. But currently, the tool has limited support for languages other than English, and it doesn’t recognize political material, spam, spoofing, or malware. ChatGPT reminds users that it “may occasionally generate harmful instructions or biased content.”

Last week, OpenAI announced a separate tool to help discern when text was written by a human rather than an AI, partly to identify automated misinformation campaigns. The company warned that its tool is not entirely reliable: it accurately identified AI text only 26% of the time (while mislabeling human-written text 9% of the time) and could be circumvented. The tool also struggles with text that is shorter than 1,000 characters or written in a language other than English.

Arvind Narayanan, a professor of computer science at Princeton University, wrote on Twitter last December that he had asked ChatGPT some of the basic information security questions he poses to students on an exam. He wrote that the chatbot’s answers sounded plausible but were actually nonsense.

“The danger is that you can’t tell when something is wrong unless you already know the answer,” he wrote. “It was so disturbing that I had to look at my reference solutions to make sure I wasn’t losing my mind.”

Mitigation strategies exist (media literacy campaigns, “radioactive” data that identifies the work of generative models, government restrictions, tighter controls on users, even identity requirements for social media platforms), but many are problematic in their own ways. The researchers concluded that “there is no magic bullet that can eliminate the threat alone.”




