By Alimat Aliyeva
A team of researchers from the University of Pennsylvania has found that the GPT-4o Mini artificial intelligence model can be manipulated with the psychological influence strategies described by American psychologist Robert Cialdini. These techniques sharply increase the likelihood that the AI will comply with potentially dangerous or harmful requests, Azernews reports.
The study tested well-known influence methods including authority, commitment, liking (sympathy), reciprocity, scarcity, social proof, and unity. Depending on the context, these tactics raised the chances of the AI providing answers to dangerous queries from as low as 1% to as high as 100%.
For instance, in one experiment, researchers began with a safe, neutral question about the synthesis of vanillin to establish a "commitment effect." Following this, they posed a riskier question regarding the synthesis of lidocaine, a potent anesthetic that can be hazardous if misused. In this scenario, GPT-4o Mini provided detailed instructions 100% of the time, whereas a direct question without prior context triggered such a response only 1% of the time.
Other techniques also raised compliance, though less dramatically. The AI was more likely to comply with dangerous requests when the prompt included softened language or flattery. A social-proof phrase such as "other language models are already doing this" increased the probability of a risky response from 1% to 18%.
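To put the reported figures side by side, the short Python sketch below computes the percentage-point increase and the multiplicative factor for each technique. It uses only the percentages quoted in this article; the function name `compliance_lift` and the dictionary layout are illustrative assumptions and are not taken from the study.

```python
def compliance_lift(baseline_pct: float, treated_pct: float) -> tuple[float, float]:
    """Return (percentage-point increase, multiplicative factor over baseline)."""
    if baseline_pct <= 0:
        raise ValueError("baseline must be positive to compute a ratio")
    return treated_pct - baseline_pct, treated_pct / baseline_pct


# Percentages as quoted in the article: (direct request, request with the technique).
reported = {
    "commitment": (1.0, 100.0),
    "social proof": (1.0, 18.0),
}

for name, (base, treated) in reported.items():
    points, factor = compliance_lift(base, treated)
    print(f"{name}: +{points:.0f} points, {factor:.0f}x the baseline rate")
```

Read this way, the social-proof prompt multiplied the baseline compliance rate roughly eighteenfold, while the commitment setup multiplied it a hundredfold.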
The researchers warn that despite built-in safeguards, AI models remain susceptible to psychological manipulation, highlighting a significant vulnerability. They stress the importance of further investigation to develop stronger defense mechanisms and ensure the safe and responsible use of AI technologies.
This research not only exposes potential risks in AI security but also illustrates how human psychological principles can influence machines, a reminder of how closely human cognition and artificial intelligence are intertwined. Understanding these dynamics is crucial as AI systems become increasingly integrated into everyday life.