
AI models struggle to identify nonsense, says study


The AI models that power chatbots and other applications still have difficulty distinguishing between nonsense and natural language, according to a study released on Thursday.

The researchers at Columbia University in the United States said their work revealed the limitations of current AI models and suggested it was too early to let them loose in legal or medical settings.

They put nine AI models through their paces, firing hundreds of pairs of sentences at them and asking which were likely to be heard in everyday speech.

They asked 100 people to make the same judgment on pairs of sentences like: "A buyer can own a genuine product also / One versed in circumference of highschool I rambled."

The research, published in the journal Nature Machine Intelligence, then weighed the AI answers against the human answers and found dramatic differences.

Sophisticated models like GPT-2, an earlier version of the model that powers the viral chatbot ChatGPT, generally matched the human answers.
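In broad terms, such a comparison amounts to asking a language model which of two sentences it assigns higher probability. The sketch below is a minimal illustration of that idea using the publicly available GPT-2 model via the Hugging Face transformers library; it is not the study's actual protocol, and the sentence pair is simply borrowed from the example above.

```python
# Minimal sketch (not the study's protocol): ask GPT-2 which of two
# sentences it considers more probable, using Hugging Face transformers.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def total_log_prob(sentence: str) -> float:
    """Total log-probability GPT-2 assigns to a sentence."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        # .loss is the mean negative log-likelihood over the predicted tokens
        mean_nll = model(ids, labels=ids).loss.item()
    # multiply by the number of predicted tokens to undo the averaging
    return -mean_nll * (ids.size(1) - 1)

pair = (
    "A buyer can own a genuine product also",
    "One versed in circumference of highschool I rambled",
)
scores = {s: total_log_prob(s) for s in pair}
print(max(scores, key=scores.get))  # the sentence GPT-2 rates as more natural
```

On a pair like this, GPT-2 would be expected to side with the human majority; the study's finding is that every model nonetheless disagrees with people on some pairs.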

Other simpler models did less well.

But the researchers highlighted that all the models made mistakes.

"Every model exhibited blind spots, labeling some sentences as meaningful that thought were gibberish," said psychology professor Christopher Baldassano, an author of the report.

"That should give us pause about the extent to which we want AI systems making , at least for now."

Tal Golan, another of the paper's authors, told AFP that the models were "an exciting technology that can complement human productivity dramatically".

However, he argued that "letting these models replace human decision-making in domains such as law, medicine, or student evaluation may be premature".

Among the pitfalls, he said, was the possibility that people might intentionally exploit the blind spots to manipulate the models.

AI models burst into public consciousness with the release of ChatGPT last year, which has since been credited with passing various exams and has been touted as a possible aid to doctors, lawyers and other professionals.

More information: Testing the limits of natural language models for predicting human language judgements, Nature Machine Intelligence (2023). DOI: 10.1038/s42256-023-00718-1, www.nature.com/articles/s42256-023-00718-1


© 2023 AFP

