Researchers in Japan have tried to build an artificially intelligent system to make people laugh – but, surprise, surprise, the jokes it told were terrible.
The “Neural Joking Machine” (NJM) was created by computer scientists from Tokyo Denki University and the National Institute of Advanced Industrial Science and Technology to see if humor could be automatically generated and studied academically.
“Laughter is a special, higher-order function that only humans possess,” they wrote in a paper emitted online this week. It’s something that is difficult to quantitatively measure, but they gave it a shot anyway.
First, they collected training data by downloading pairs of images and witty captions from Bokete, a Japanese website where netizens submit amusing pictures and quips and rank them by awarding virtual stars. The more giggles and guffaws a photo induces, the more stars it deserves. The dataset, dubbed BoketeDB, contains 999,571 funny captions for 70,981 images.
Image captioning is a popular area of research in AI. It combines computer vision and natural language processing, and is a useful way of probing what machines see in a way that is understandable to humans. The researchers used a model based on Google's Show and Tell, made up of a convolutional neural network to process images and a long short-term memory network to generate text.
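For readers curious about the general shape of such an encoder-decoder captioner, here is a minimal sketch in plain Python. It is not the researchers' code: `encode_image`, `decoder_step`, and the tiny word table are hypothetical stand-ins for a trained convolutional network and a long short-term memory network, and a real LSTM would also carry a hidden state from word to word, which this toy omits.

```python
# Toy illustration only: both "networks" below are hand-written stand-ins,
# not trained models, and all names here are hypothetical.

def encode_image(pixels):
    """Stand-in for the CNN: squash an image into a small feature vector."""
    flat = [p for row in pixels for p in row]
    return [sum(flat) / len(flat)]  # a single feature: mean brightness

def decoder_step(features, prev_word):
    """Stand-in for one LSTM step: score candidate next words."""
    brightness = features[0]
    table = {
        "<start>": {"a": 1.0},
        "a": {"bright": brightness, "dark": 1.0 - brightness},
        "bright": {"picture": 1.0},
        "dark": {"picture": 1.0},
        "picture": {"<end>": 1.0},
    }
    return table[prev_word]

def generate_caption(pixels, max_len=10):
    """Greedy decoding: pick the top-scoring word until <end> appears."""
    features = encode_image(pixels)
    words, prev = [], "<start>"
    for _ in range(max_len):
        scores = decoder_step(features, prev)
        prev = max(scores, key=scores.get)
        if prev == "<end>":
            break
        words.append(prev)
    return " ".join(words)

light = generate_caption([[0.9, 0.8], [0.7, 1.0]])
dim = generate_caption([[0.1, 0.2], [0.0, 0.1]])
```

The point of the sketch is the division of labour: one component turns pixels into numbers, the other turns those numbers into words one at a time.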
After training on the pairs of images and captions, the NJM tried to come up with captions for new images from a range of 30 themes, including “people”, “two or more people”, “animals”, “landscape”, “inorganics”, and “illustrations”.
The team then used questionnaires to ask people to rank captions from three sources: a human, the NJM, and STAIR, another neural network captioning system. STAIR was trained on MS COCO, a dataset containing 330,000 images with five captions each that is commonly used to benchmark image captioning models. The researchers included STAIR as a baseline and translated its captions from English into Japanese.
The results from the 16 questionnaires showed that the NJM performed markedly worse than humans. Human-written captions were considered funny 67.99 per cent of the time, compared to 22.59 per cent for the NJM, and just 9.41 per cent for STAIR.
“These results suggest that captions generated by the NJM are less funny than those generated by humans. However, the NJM is ranked much higher than STAIR caption,” the paper stated.
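As a rough illustration of how percentages like these could be tallied, here is a minimal sketch. It assumes each response simply names the source whose caption a respondent ranked first; how the paper actually scored its questionnaires is not described here, and the sample responses are invented.

```python
from collections import Counter

def share_of_top_ranks(responses):
    """Return, per source, the percentage of responses that ranked it first."""
    counts = Counter(responses)
    total = sum(counts.values())
    return {source: round(100 * n / total, 2) for source, n in counts.items()}

# Invented sample data: ten hypothetical respondents, not the study's results.
sample = ["human"] * 7 + ["njm"] * 2 + ["stair"] * 1
shares = share_of_top_ranks(sample)
```

With the invented sample, `shares` maps each source to its slice of first-place votes, which is the same kind of three-way split the study reports.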
Given the training data, it could very well be a case of garbage in, garbage out.
In other studies, researchers have toyed with the idea of teaching machines sarcasm in the hope that it’ll be useful for chatbots and digital assistants. But the results there weren’t really promising either. It’ll be a long while yet before Alexa, Siri, or Google Home can make us laugh – so what else are they really good for, apart from being excellent spyware? ®