
BusinessDay

IBM computer holds its own against human debaters


An artificial intelligence system built by IBM has taken on two humans in a formal debating competition and come up only slightly short of conjuring arguments that a human audience found more persuasive.

The demonstration of Big Blue’s latest AI lacked the clear drama — and the result — of its previous “man versus machine” stunts, including the Watson system that beat the top human champions on the question-and-answer television game show Jeopardy in 2011, and Deep Blue, which conquered world chess champion Garry Kasparov in 1997.

But it was a graphic demonstration of how a set of technologies at the frontier of AI could be combined to challenge humans in a realm where they might have thought they still had a big lead over machines. It was also a sign that computers are venturing deep into subjective human territory where there are no straightforward answers or clear winners.

Unlike Watson, which IBM took years to develop into a commercial system, its latest AI — called Debater — could also have a far more immediate impact on the company’s fortunes.

“We’re interested in enterprises and governments; our goal is to help humans in decision-making,” said Arvind Krishna, IBM’s director of research.

By assembling arguments out of large bodies of information, the system could help people address important choices, he added. “Should we drill for oil in west Africa? Should we let our food supply have antibiotics in it? There are no right or wrong answers, but we want there to be an informed debate.”

Monday’s debate was the culmination of six years of work by IBM researchers in Tel Aviv, a process that was launched in the wake of the Jeopardy victory.

“Argumentation is one of the defining features of what it means to be human,” said Chris Reed, a professor of computer science and philosophy at the University of Dundee, who was in the audience. “To see so many pieces of the puzzle coming together here is really impressive.”

In two debates, the IBM system was matched against an experienced human debater and then judged by an invited audience. IBM said the system had not been given foreknowledge of the topics — whether government support of space exploration, and telemedicine, are good things — or been trained on them, but instead assembled its arguments essentially in real time, searching a body of “hundreds of millions” of newspaper articles for evidence.
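
In rough outline, that real-time step is a retrieval problem: given a debate motion, pull from a very large text collection the passages most likely to serve as evidence. The short Python sketch below illustrates the idea with a toy corpus and a crude relevance-plus-evidence heuristic; the scoring rule and cue words are illustrative assumptions, not a description of IBM's actual pipeline.

    # Toy sketch of topic-driven evidence retrieval. The corpus, the cue words
    # and the scoring heuristic are illustrative assumptions, not IBM's method.
    import math
    import re

    def tokenize(text):
        # Lowercase word tokens only; numbers and punctuation are ignored.
        return re.findall(r"[a-z']+", text.lower())

    def score(sentence, topic_tokens):
        # Crude relevance: topic-term overlap, boosted when the sentence
        # mentions a study or survey (a weak proxy for "evidence").
        tokens = tokenize(sentence)
        overlap = sum(1 for t in tokens if t in topic_tokens)
        boost = 2 if re.search(r"\b(study|survey|research)\b", sentence, re.I) else 0
        return overlap / math.sqrt(len(tokens) + 1) + boost

    def retrieve_evidence(topic, corpus, k=2):
        # Return the k sentences most likely to support an argument on the topic.
        topic_tokens = set(tokenize(topic))
        return sorted(corpus, key=lambda s: score(s, topic_tokens), reverse=True)[:k]

    corpus = [
        "A 2017 survey found broad public support for space exploration funding.",
        "The local team won its third match of the season on Saturday.",
        "Experts argue that telemedicine expands access to care in rural areas.",
    ]
    print(retrieve_evidence("government support of space exploration", corpus))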

The computer was represented on stage by a slim black slab, like a mini obelisk from the movie 2001: A Space Odyssey, which spoke in a smooth female voice. It also drew on jokes that were programmed in advance and tactics that included suggesting its adversary was lying — techniques IBM said were intended to make its presentation more accessible to human listeners and, at times, to distract from the fact that it did not have a strong argument to draw on, just as a human debater might.

In each debate, the audience judged the machine’s arguments less impressive than those of its human adversary. But the margin was not large, and in both cases the computer outperformed its human opponent in presenting a wider body of relevant information.

Among the most impressive aspects of the demonstrations, said Prof Reed, were the system’s “polarity” — almost all its statements were ones that supported its case; the use of correct grammar; a “relatively natural” organisational structure for the information it presented; and the use of procatalepsis, a rhetorical technique that involves identifying and disproving a rival argument before it has even been made.
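
The “polarity” behaviour Prof Reed highlights amounts to stance filtering: before a statement is used, check that it argues for the system’s own side and discard it otherwise. The sketch below shows the idea with a hand-made cue-word lexicon; a real system would rely on a trained stance classifier rather than this simplified rule.

    # Toy sketch of the "polarity" filter: keep only statements whose stance
    # matches the side being argued. The cue-word lexicon is a made-up
    # stand-in for a trained stance classifier.
    PRO_CUES = {"benefits", "improves", "expands", "supports"}
    CON_CUES = {"harms", "wastes", "risks", "undermines"}

    def stance(sentence):
        # Return "pro", "con" or "neutral" from crude cue-word counts.
        text = sentence.lower()
        pro = sum(cue in text for cue in PRO_CUES)
        con = sum(cue in text for cue in CON_CUES)
        if pro > con:
            return "pro"
        if con > pro:
            return "con"
        return "neutral"

    def filter_by_side(candidates, side="pro"):
        # Drop statements that would argue against the system's own case.
        return [s for s in candidates if stance(s) == side]

    candidates = [
        "Telemedicine expands access to care and improves outcomes.",
        "Telemedicine risks undermining the doctor-patient relationship.",
    ]
    print(filter_by_side(candidates, side="pro"))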

But Debater also occasionally jumbled together assertions in support of its case without presenting a coherent line of reasoning. And it offered occasional unsourced assertions that drew laughter — for instance, declaring that higher government spending on space was “more important than better healthcare”.

The system has not been designed to judge the reliability of the information it draws on to make its arguments, said Noam Slonim, the researcher who led the project. But he said the system could still be used as a weapon against fake news, since if it provided full transparency about the sources it was drawing on, humans would have the chance to judge the strength of its case.