The all-seeing AI
6 Jan 2015 by Evoluted New Media
Stephen Hawking and other high-profile scientists have recently issued stark warnings over the threat posed to humanity by advanced artificial intelligence. But aside from being a source of existential angst, the development of AI technology can give us deep insights into our own on-board computer.
Artificial intelligence has made great strides recently on the problem of visual object recognition. The advent of efficient training algorithms for deep neural networks has yielded unprecedented object-recognition performance, reaching levels comparable to humans on certain tasks.
During the last few decades, several models have been developed to extract visual features from images for object-recognition tasks. Some were inspired by the hierarchical structure of the primate visual system; others were engineered from scratch. The models differ in how they learn: so-called 'supervised' models learn to recognise objects by processing many images paired with category labels; 'unsupervised' models are trained on large sets of natural images without any labels; and hand-engineered models (fixed feature extractors) do not learn from images at all. The models also differ in depth: some have a deep hierarchical structure consisting of several layers of processing, while others are shallow, with only one or two. Hand-engineering features – sophisticated feature extractors that identify abstract patterns optimal for object recognition – has been a major source of difficulty in computer vision. The new deep neural networks sidestep this problem: trained on millions of images, they learn complex and highly abstract representations automatically from data.
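The supervised/unsupervised distinction above can be sketched in a few lines. This toy example (not any model from the study) uses 2-D points as stand-in 'images': the supervised learner is a simple perceptron that uses the category labels to adjust its weights, while the unsupervised learner never sees a label and can only learn the structure of the data itself – here, just a cluster centre.

```python
def train_supervised(data, labels, epochs=20, lr=0.1):
    """Perceptron: nudges its weights using the category labels."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(data, labels):
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = y - pred                  # needs the label y
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

def train_unsupervised(data):
    """Label-free: learns only the mean of the data (a cluster centre)."""
    n = len(data)
    return [sum(p[0] for p in data) / n, sum(p[1] for p in data) / n]

data = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
labels = [0, 0, 1, 1]
w, b = train_supervised(data, labels)
centre = train_unsupervised(data)
```

The labels let the supervised learner carve the data into the categories we care about; the unsupervised learner can only summarise whatever structure the data happens to have.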
The human brain is a massively parallel processing unit – billions of neurons fire simultaneously to create waves of cortical computation. The advent of cheap parallel processors, known as graphics processing units (GPUs), unlocked new possibilities for neural networks, which can include hundreds of millions of connections between their nodes. In 2012, a group of researchers led by Geoff Hinton trained a large neural network on multiple GPUs using over a million labelled images from a huge image dataset, and it won that year's ImageNet competition – a large-scale object-recognition contest.
Learning the kind of complicated functions that can represent high-level abstractions in vision may require a hierarchical architecture with many layers. This is known as a deep architecture. These new deep models for vision share certain features with the primate visual system. They process images in parallel using many computational units and several stages of representation. Like humans and nonhuman primates, they learn to categorise things through extensive experience.
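A 'deep architecture' in the sense above is simply several stacked layers, each transforming the previous layer's output into a more abstract representation. A minimal sketch follows, with invented fixed weights (real networks learn theirs from millions of images):

```python
def relu(v):
    """Rectifier nonlinearity: negative activations are set to zero."""
    return [max(0.0, x) for x in v]

def layer(inputs, weights, biases):
    """One fully connected layer: weighted sums followed by ReLU."""
    return relu([
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ])

def deep_forward(x, params):
    """Pass the input through every layer in turn - the deep hierarchy."""
    for weights, biases in params:
        x = layer(x, weights, biases)
    return x

# Three stages of representation: 3 inputs -> 2 units -> 2 units -> 1 output.
params = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.0]),
]
output = deep_forward([1.0, 2.0, 3.0], params)
```

Each layer re-represents its input; stacking many such stages is what lets deep networks build the highly abstract features that shallow one- or two-layer models cannot.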
Despite these recent accomplishments in computer vision, there are still several ways in which humans outperform machines. Nevertheless, recent computer vision systems based on deep neural networks have shown that the age of unchallenged human supremacy might be drawing to a close. These systems might also help us better understand how the human brain recognises things. In a new study in PLoS Computational Biology, we investigated the internal representation of a deep-learning vision system in comparison to visual representations in biological brains. We found that the deep computer-vision model’s representations are very similar to biological brain representations, more so than those of a wide range of earlier computer-vision systems. The more similar a model representation was to the high-level visual brain representation, the better the model performed at object categorisation.
Previous studies have shown that the patterns of neural responses to object images are clustered according to semantic categories. The strongest categorical division appears to be that between animates and inanimates. Animate objects are images of living things, such as human or animal faces and bodies. Inanimate objects are images of non-living things, such as plants, fruits, or man-made objects. Within the animates, faces and bodies form separate sub-clusters. In this study, we found that several models were good at distinguishing human faces from other objects, and a few of them were also good at distinguishing animate objects from the rest. Most importantly, however, we found that models not trained with many category-labelled images failed to form the animate/inanimate distinction, which is very prominent in the human brain representation. The deep neural network had been trained by supervision with over a million category-labelled images and came closest to explaining the brain representation. Intensive training with large sets of labelled images might be necessary to model the brain.
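The categorical clustering described above can be illustrated with a toy calculation. The response patterns below are invented, not the study's data: the two animate patterns resemble each other more than they resemble the inanimate ones, so within-category dissimilarities come out smaller than between-category ones – exactly the signature of an animate/inanimate division.

```python
def dissimilarity(a, b):
    """Euclidean distance between two response patterns."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# Invented response patterns for four images (one value per "neuron").
patterns = {
    "face":  [1.0, 0.9, 0.1],
    "body":  [0.9, 1.0, 0.2],
    "fruit": [0.1, 0.2, 1.0],
    "tool":  [0.2, 0.1, 0.9],
}
animate = {"face", "body"}

within, between = [], []
names = list(patterns)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        d = dissimilarity(patterns[a], patterns[b])
        same_category = (a in animate) == (b in animate)
        (within if same_category else between).append(d)
```

Categorical clustering means every within-category distance is smaller than every between-category distance, which is what comparing `max(within)` against `min(between)` checks here.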
We further showed that even combining unsupervised model features, by assigning appropriate weights to each of them, did not improve their similarity to the brain representation. On the other hand, by combining features from different stages of the deep supervised model we could fully explain the brain representation. This suggests that the deep supervised model, unlike the unsupervised models, contained the features necessary to explain the brain representation; they just needed to be combined appropriately.
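The reweighting idea can be sketched as a small least-squares fit: learn one weight per model layer so that the weighted sum of the layers' dissimilarity patterns matches a target 'brain' pattern. All numbers below are invented for illustration; the study's actual fitting procedure differs in detail.

```python
def sq_error(pred, target):
    """Sum of squared differences between two dissimilarity patterns."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))

def fit_weights(layers, target, steps=2000, lr=0.01):
    """Gradient descent on w to minimise ||sum_i w_i*layer_i - target||^2."""
    w = [0.0] * len(layers)
    n = len(target)
    for _ in range(steps):
        pred = [sum(wi * layer[j] for wi, layer in zip(w, layers))
                for j in range(n)]
        for i, layer in enumerate(layers):
            grad = sum(2 * (pred[j] - target[j]) * layer[j]
                       for j in range(n))
            w[i] -= lr * grad
    return w

brain = [1.0, 0.2, 0.8, 0.1]      # target dissimilarity pattern
layer_a = [1.0, 0.0, 0.0, 0.0]    # each layer explains part of the pattern
layer_b = [0.0, 0.2, 0.8, 0.1]

w = fit_weights([layer_a, layer_b], brain)
combined = [w[0] * a + w[1] * b for a, b in zip(layer_a, layer_b)]
```

Neither layer alone matches the target, but the appropriately weighted combination does – the point of the paragraph above is that this worked for the deep supervised model's stages but not for the unsupervised features.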
These findings suggest that biological visual object recognition, formerly impossible to mimic in artificial systems, is moving into the realm of processes that can be approximately replicated with computers. Deep learning models are likely to inspire a new generation of specific computational simulations of the human brain. Moreover, artificial neural networks were originally inspired by the brain; modelling biological brains more closely than current engineering approaches do therefore promises further advances in computer vision and artificial intelligence.
Deep learning is being applied in many domains and has boosted the performance of popular software in image recognition, data mining, speech recognition, and language translation. The next generation of artificial intelligence looks set to work more similarly to the human brain than any previous technology. And like us, these systems may see the world in categories that help them reach their goals.
Authors:
Nikolaus Kriegeskorte, Principal Investigator at the Medical Research Council's Cognition and Brain Sciences Unit in Cambridge.
Seyed Khaligh-Razavi, PhD researcher at the MRC Cognition and Brain Sciences Unit, University of Cambridge.