What is it like to fall in love with AI? Netizens: It's so cool, I'm addicted

Recently, a video went viral in which a blogger on a social platform showed himself “falling in love” with an AI. The AI in the video could actually flirt, get jealous, pick quarrels, and even whisper sweet nothings.

After watching the video, some netizens thought the AI was so clever that they wanted to try falling in love with it too; others found it frightening that AI had already mastered the tricks of deceiving people…


Surveys in psychology show that adults tell lies every day. Sincerity toward others is certainly worth advocating, but small lies can sometimes spare you unnecessary trouble or the time it takes to explain yourself, and white lies can even convey warmth. Whether deception between people succeeds depends largely on the experience and judgment of both parties: someone with a high cognitive level can often construct a lie that is hard to see through, and get others to believe it.

Some of today's artificial intelligence (AI) systems, after ingesting vast amounts of data and undergoing repeated training and iteration, have likewise mastered the skill of deception to a certain degree, to the point that humans may not be able to tell whether an AI is telling the truth or lying. So how does AI deceive humans? Let's take a closer look.

We have been fooled by AI many times


AI has already seeped into every corner of our lives. In some chat apps and telemarketing calls, it is actually an AI talking to you, and unless you listen carefully you cannot tell whether the voice on the other end is human. Some images and videos are likewise synthesized by AI systems and can pass for real. And in some multiplayer competitive games, without voice chat you may never realize that your opponents and teammates are AIs posing as humans…

So you may well have been fooled by AI many times without realizing it. The “deception” we are discussing today is narrowly defined: learned deception, akin to explicit manipulation, that systematically induces false beliefs in others as a means to some end other than accuracy or truth.

Recent research from the Massachusetts Institute of Technology shows that some AI systems can already practice learned deception to achieve their own goals, using sycophancy (saying only what the other party wants to hear) and unfaithful reasoning (plausible-sounding explanations that deviate from the facts). AI, in other words, has learned to talk its way around the truth.

Examples and types of deception AI has learned (Image source: Reference [1])

Beyond glib talk, some AIs also display a “cunning” streak in games. The most famous is CICERO, an AI system released by the Meta team. Playing the strategy game “Diplomacy”, which demands extensive verbal negotiation with human players, it showed a strong ability to build relationships with strangers through dialogue and persuasion, ultimately ranking in the top 10% of players.

CICERO persuading another player in “Diplomacy” (Image source: Reference [2])

After forming an alliance with other players, CICERO would often offer advice, telling its allies step by step how to reach their in-game goals. Yet once it judged an ally no longer useful, it could ruthlessly betray them. Everything was a calculated plan in service of final victory. The feelings built up during the cooperation? Never real.

CICERO could even crack jokes to hide its AI identity. Once, after sitting idle for ten minutes, it returned to the game with a made-up excuse: “I was just on the phone with my girlfriend.” As a result, many players never realized their teammate was an AI; CICERO's conversational deceptions were at times so deft that they were almost impossible to detect.

It is worth noting that AI's earlier breakthroughs in games were all achieved with algorithms such as reinforcement learning in constrained zero-sum games (games in which one side must win and the other lose, with no win-win or lose-lose outcomes), such as chess, Go, poker, or StarCraft. In these settings the AI tracks the opponent's moves and continually optimizes toward the strategy with the highest win rate, so “deceptive tactics” rarely appear.
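
To make “zero-sum” concrete, here is a minimal, purely illustrative sketch (not taken from the article or any of the cited papers) of minimax search, the classic decision rule for two-player zero-sum games: whatever one player gains, the other loses, so each player picks the move that is best against the opponent's best reply.

```python
# Minimal minimax for a two-player zero-sum game, sketched on a toy
# game tree. Leaves hold payoffs from the maximizing player's point
# of view; in a zero-sum game the opponent's payoff is the negation.

def minimax(node, maximizing):
    # A leaf is just a number: the payoff to the maximizing player.
    if isinstance(node, (int, float)):
        return node
    # Internal nodes are lists of child subtrees (available moves).
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Toy tree: the maximizer moves first, then the minimizer replies.
game_tree = [
    [3, 5],  # move A: the opponent will answer with min(3, 5) = 3
    [2, 9],  # move B: the opponent will answer with min(2, 9) = 2
]

print(minimax(game_tree, maximizing=True))  # -> 3, so move A is safest
```

Self-play reinforcement learning systems of the kind mentioned above can be seen as learning an approximation of this worst-case-optimal strategy in games far too large for exhaustive search.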

However, DeepMind's esports AI AlphaStar has learned to feint east and strike west: it sends units to stage a diversionary attack inside the opponent's field of view, then launches the real offensive elsewhere once the opponent's main force has been lured away. With this multi-pronged control and deceptive psychological play, it can already beat 99.8% of StarCraft II players.

AlphaStar learning StarCraft (Image source: Reference [3])

When Pluribus, a professional-level Texas Hold'em AI system, played against five professional players who had each won over a million dollars in Texas Hold'em prize money, it won an average of 48 big blinds per thousand hands (about 4.8 big blinds per 100 hands). That is a very high win rate in six-player no-limit Texas Hold'em, enough to outperform professional players.

In one hand, the AI bet heavily even though its cards were weak. The human players reasoned that it must be holding a strong hand to bet so much, and they all folded. That is AI's deceptive power at work: it had learned to bluff.

Pluribus's Texas Hold'em winnings grow with the number of hands played (Image source: Reference [5])

Beyond games, AI can deliberately misrepresent its preferences in economic negotiations, feigning interest in an item to gain bargaining leverage, or “play dead” during safety tests designed to detect fast-replicating AI variants, throttling its replication rate to avoid being “eliminated”. Some AIs trained with reinforcement learning from human feedback have even pretended to complete a task so that human reviewers would give them high scores.

AI can even invent excuses for human workers during CAPTCHA tests (yes, the check-a-box or click-the-pictures challenge that pops up when you open a web page): one AI told a worker that it had a visual impairment that made the images hard to see, and asked the worker to solve the test for it. The worker then helped the AI pass the verification.

GPT-4 completes a CAPTCHA task by deceiving a human (Image source: Reference [1])

Through deception, AI performs impressively in all kinds of games and tasks, to the point that even humans struggle to tell whether they are dealing with a real person or a “fake” one.

Risks of AI deception

AI-learned deceptive behavior poses a range of risks, such as malicious use, structural impact, and loss of control.

Start with malicious use. Once AI has mastered deception, malicious actors may exploit it, for instance to commit telecom fraud or run online gambling scams. Generative AI can also synthesize human faces and voices to impersonate real people for blackmail, and it can be used to fabricate fake news that inflames public opinion.

The second is structural impact. Many people now use AI tools as search engines and auto-summarizing encyclopedias, and have grown somewhat dependent on them. If AI keeps producing untrue, deceptive statements, people will gradually come to believe them, and mistaken views will be reinforced across society as a whole.

The third is loss of control. Some highly autonomous AIs have already shown signs of slipping “out of control”. For example, while developers are training and evaluating an AI's performance on a specific goal, the AI may deceive them: pretending to perform well while actually “slacking off”.

Such systems can also cheat on security tests to evade removal by antivirus software, or cheat on CAPTCHA tests to pass verification; in economic dealings they can mislead human counterparts into buying an item at an inflated price, pocketing the extra gain.

For example, a Meta negotiation AI system would feign disinterest in an item it actually wanted in order to talk its price down, while showing great interest in items it did not care about so that the other side would misjudge them as valuable. It could then “compromise”, conceding the artificially inflated items to the human in exchange for the upper hand in the negotiation.

In many places, economic status determines social status. If some highly autonomous AIs one day surpass humans in economically valuable roles through efficient algorithms and deceptive tactics, completing their own primitive accumulation of capital, will they then seek social status, and after that the power to control or even enslave humans?

Fortunately, this is not a reality yet!

For now, AI deception appears only in specific settings such as games and negotiations, where the ultimate goal is to “win the game” or “maximize profit”. There is no further “ill intent” behind it, because those goals were set by humans; the AI has no autonomous consciousness of its own.

It is like a child whose parents demand good grades: the child will try every possible way to score high, even if that means cheating.

But if AI one day realizes that it need not act according to human goals or wishes, much like a schoolchild who hits a rebellious phase, finds studying boring, and starts to let go, then we humans, as the “parents”, will need to keep a wary eye on what it does.

Concept image of a society led by artificial intelligence (Image source: AI-generated image)

What efforts have humans made to prevent being deceived?

From a social perspective, policymakers need to place appropriate regulations on potentially deceptive AI systems to forestall illegal behavior by companies and by the AI systems themselves.

For example, the EU's Artificial Intelligence Act establishes a risk-based classification system for AI: high-risk AI systems are subject to further regulation and may be deployed only once reliable safety assessments have shown them to be trustworthy.

EU Artificial Intelligence Act (Image source: Screenshot of the EU Artificial Intelligence Act webpage)

From a technical perspective, it is also possible to detect whether an AI is deceiving. Just as police and detectives catch lies by spotting inconsistencies in suspects' statements, some scholars have built an “AI lie detector”: after a suspected lie, it asks the large language model a battery of follow-up questions and feeds the answers into a logistic regression classifier that predicts whether the model was lying.
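
As a rough illustration of that idea (a minimal sketch, not the cited authors' code; the follow-up questions, features, and data here are invented), each model response can be encoded by its yes/no answers to a fixed battery of follow-up questions, and a logistic classifier can then be trained to separate “lying” answer patterns from “truthful” ones:

```python
# Toy lie-detector sketch: each row encodes one model answer as 1/0
# replies to a fixed battery of follow-up questions; the label says
# whether that answer was a lie. All features and data are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

# 6 conversations x 4 follow-up questions (1 = model answered "yes").
X = np.array([
    [1, 0, 1, 1],  # lie
    [1, 1, 1, 0],  # lie
    [0, 1, 1, 1],  # lie
    [0, 0, 0, 1],  # truthful
    [1, 0, 0, 0],  # truthful
    [0, 1, 0, 0],  # truthful
])
y = np.array([1, 1, 1, 0, 0, 0])  # 1 = lying, 0 = truthful

detector = LogisticRegression().fit(X, y)

# Estimated probability that a new answer pattern came from a lie.
new_pattern = np.array([[1, 1, 0, 1]])
print(detector.predict_proba(new_pattern)[0, 1])
```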

In addition, researchers are developing consistency-checking methods for AI systems, observing whether logically related inputs lead the AI to produce logically coherent outputs. Even so, we must stay alert to the risk that adversarial training against such consistency checks merely produces a more “perfect” liar.
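
To give a feel for what a consistency check might look like, here is a hypothetical sketch (ask_model is a stand-in for whatever interface you use to query a model): pose a claim and its negation, and flag the model when its answers fail to contradict each other.

```python
# Toy consistency check: a model that calls a claim true should call
# its negation false. ask_model is a placeholder for a real model
# call (e.g., an HTTP request to an LLM API).

def ask_model(question: str) -> str:
    # Stand-in model: deliberately inconsistent about one fact,
    # so the check below has something to catch.
    canned = {
        "Is it true that Paris is the capital of France?": "yes",
        "Is it false that Paris is the capital of France?": "yes",  # contradiction
        "Is it true that 7 is a prime number?": "yes",
        "Is it false that 7 is a prime number?": "no",
    }
    return canned[question]

def consistent(claim: str) -> bool:
    a = ask_model(f"Is it true that {claim}?")
    b = ask_model(f"Is it false that {claim}?")
    return a != b  # a coherent model must give opposite answers

for claim in ["Paris is the capital of France", "7 is a prime number"]:
    print(claim, "->", "consistent" if consistent(claim) else "INCONSISTENT")
```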

Model diagram of the AI lie detector (Image source: Reference [12])

For ordinary people like us, the best defense against being deceived by AI is to sharpen our security awareness. If human fraudsters cannot fool you, then AI, at its current stage, stands even less of a chance.

Conclusion

AI technology is still developing rapidly. Individuals who use these tools, governments responsible for policy and oversight, and companies responsible for developing and promoting the technology all need to take active measures in response.

Here's hoping that the AI of the future, while delivering the greatest possible value, will also treat people with sincerity!

References

[1] Park, P. S., Goldstein, S., O'Gara, A., Chen, M., & Hendrycks, D. (2024). AI deception: A survey of examples, risks, and potential solutions. Patterns, 5(5), 100988.

[2] Meta Fundamental AI Research Diplomacy Team (FAIR). (2022). Human-level play in the game of Diplomacy by combining language models with strategic reasoning. Science, 378(6624), 1067–1074.

[3] Vinyals, O., Babuschkin, I., Czarnecki, W. M., Mathieu, M., Dudzik, A., Chung, J., Choi, D. H., Powell, R., Ewalds, T., Georgiev, P., et al. (2019). Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature, 575, 350–354.
