The ‘strawberrry’ problem: How to overcome AI’s limitations


By now, large language models (LLMs) like ChatGPT and Claude have become household names around the globe. Many people have started worrying that AI is coming for their jobs, so it is ironic to see almost all LLM-based systems flounder at a simple task: counting the number of “r”s in the word “strawberry.” They are not failing exclusively on the letter “r”; other examples include counting “m”s in “mammal” and “p”s in “hippopotamus.” In this article, I will break down the reason for these failures and offer a simple workaround.

LLMs are powerful AI systems trained on vast amounts of text to understand and generate human-like language. They excel at tasks like answering questions, translating languages, summarizing content and even producing creative writing, by predicting and constructing coherent responses based on the input they receive. LLMs are designed to recognize patterns in text, which allows them to handle a wide range of language-related tasks with impressive accuracy.

Despite their prowess, failing to count the number of “r”s in the word “strawberry” is a reminder that LLMs are not capable of “thinking” like humans. They do not process the information we feed them the way a human would.

Conversation with ChatGPT and Claude about the number of “r”s in strawberry.

Almost all of today’s high-performance LLMs are built on transformers. This deep learning architecture does not ingest text directly as its input. Instead, it uses a process called tokenization, which transforms the text into numerical representations, or tokens. Some tokens might be full words (like “monkey”), while others might be parts of a word (like “mon” and “key”). Each token is like a code that the model understands. By breaking everything down into tokens, the model can better predict the next token in a sentence.

LLMs do not memorize words; they try to understand how these tokens fit together in different ways, which makes them good at guessing what comes next. In the case of the word “hippopotamus,” the model might see the tokens “hip,” “pop,” “o” and “tamus,” and never know that the word “hippopotamus” is made of the letters “h”, “i”, “p”, “p”, “o”, “p”, “o”, “t”, “a”, “m”, “u”, “s”.
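To make this concrete, here is a toy sketch of how a greedy sub-word tokenizer might split “hippopotamus” into pieces, leaving the model with opaque token IDs rather than letters. This is not a real tokenizer; the vocabulary and token IDs are invented for illustration.

```python
# Hypothetical sub-word vocabulary mapping pieces to token IDs.
vocab = {"hip": 101, "pop": 102, "o": 103, "tamus": 104}

def toy_tokenize(word: str, vocab: dict) -> list:
    """Greedily match the longest vocabulary piece at each position."""
    tokens = []
    i = 0
    while i < len(word):
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab:
                tokens.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no vocabulary piece matches at position {i}")
    return tokens

print(toy_tokenize("hippopotamus", vocab))  # [101, 102, 103, 104]
```

The model only ever sees the list of IDs on the last line; the individual letters, and therefore the letter counts, are invisible to it.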

A model architecture that could look directly at individual letters without tokenizing them might not have this problem, but for today’s transformer architectures, it is not computationally feasible.

Further, consider how LLMs generate output text: they predict what the next word will be based on the previous input and output tokens. While this works for producing contextually aware, human-like text, it is not suited to simple tasks like counting letters. When asked for the number of “r”s in the word “strawberry,” LLMs are purely predicting the answer based on the structure of the input sentence.
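A minimal sketch of this prediction step follows; the candidate tokens and their probabilities are invented for illustration, not taken from any real model.

```python
# Toy illustration of next-token prediction. The model scores candidate
# tokens for the position after "...'r's in 'strawberry' is", and greedy
# decoding simply takes the most likely one. Probabilities are invented.
candidate_probs = {"2": 0.55, "3": 0.35, "4": 0.10}

# Greedy decoding: argmax over the distribution.
prediction = max(candidate_probs, key=candidate_probs.get)
print(prediction)  # "2"
```

The point is that the answer is whichever token is statistically likeliest in context; no counting ever happens.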

Here’s a workaround

While LLMs may not be able to “think” or logically reason, they are adept at understanding structured text. A splendid example of structured text is computer code, in any of many programming languages. If we ask ChatGPT to use Python to count the number of “r”s in “strawberry,” it will most likely get the correct answer. When LLMs need to do counting or any other task that may require logical reasoning or arithmetic computation, the broader software can be designed so that the prompts ask the LLM to use a programming language to process the input query.
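The Python that a model writes when prompted this way typically boils down to a deterministic string count, along these lines (a minimal sketch):

```python
# Count letters deterministically instead of asking the model to guess.
def count_letter(word: str, letter: str) -> int:
    """Return how many times `letter` occurs in `word`."""
    return word.count(letter)

print(count_letter("strawberry", "r"))    # 3
print(count_letter("mammal", "m"))        # 3
print(count_letter("hippopotamus", "p"))  # 3
```

Because the counting is executed by the Python interpreter rather than predicted token-by-token, the answer is exact every time.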


Conclusion

A simple letter-counting experiment exposes a fundamental limitation of LLMs like ChatGPT and Claude. Despite their impressive capabilities in generating human-like text, writing code and answering any question thrown at them, these AI models cannot yet “think” like a human. The experiment reveals the models for what they are: pattern-matching predictive algorithms, not “intelligence” capable of understanding or reasoning. However, knowing in advance what type of prompts works well can alleviate the problem to some extent. As the integration of AI into our lives increases, recognizing its limitations is crucial for responsible usage and realistic expectations of these models.

Chinmay Jog is a senior machine learning engineer at Pangiam.

