In Essence, Deep Learning Is Really Just Math


Deep learning is rapidly ‘eating’ artificial intelligence. But let’s not mistake this ascendant form of artificial intelligence for anything more than it really is. The famous author Arthur C. Clarke wrote, “Any sufficiently advanced technology is indistinguishable from magic.” And deep learning is certainly an advanced technology—it can identify objects and faces in photos, recognize spoken words, translate from one language to another, and even beat the top humans at the ancient game of Go. But it’s far from magic.

As companies like Google, Facebook, and Microsoft continue to push this technology into everyday online services—and the world continues to marvel at AlphaGo, Google’s Go-playing super-machine—pundits often describe deep learning as an imitation of the human brain. But it’s really just simple math executed on an enormous scale.

In particular, deep learning is a class of algorithmic methods for ‘tuning’ neural networks based on data. What does that mean? Well, a neural network is a computer program, loosely inspired by the structure of the brain, which consists of a large number of very simple interconnected elements. Each element takes numeric inputs and computes a simple function (for example, a weighted sum) over those inputs. The elements are far simpler than neurons, and the number of elements and their interconnections is several orders of magnitude smaller than the number of neurons and synapses in the brain. Deep learning merely adjusts the strength of the connections in such networks so that the outputs better match the data.
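
To make that concrete, here is a minimal sketch in Python, not drawn from the article itself, of one such element and of the kind of weight adjustment that ‘tuning’ refers to. The function names, the choice of nonlinearity, and the learning rate are all illustrative assumptions rather than any particular system’s implementation.

    # A minimal sketch of one "simple interconnected element": it takes
    # numeric inputs, multiplies each by a connection weight, sums them,
    # and passes the total through a simple nonlinearity.
    def element(inputs, weights, bias=0.0):
        total = bias + sum(x * w for x, w in zip(inputs, weights))
        return max(0.0, total)  # one common, simple choice (ReLU)

    # "Tuning" means nudging the connection weights so the element's output
    # moves closer to the desired value; this is a hypothetical single step.
    def nudge(weights, inputs, error, learning_rate=0.01):
        return [w + learning_rate * error * x for w, x in zip(weights, inputs)]

    # Example: one element with two inputs, nudged after seeing one data point.
    w = [0.5, -0.2]
    out = element([1.0, 2.0], w)
    w = nudge(w, [1.0, 2.0], error=1.0 - out)

A full deep network stacks many layers of such elements and repeats this kind of adjustment over millions of examples, but the arithmetic inside each step is no more exotic than this.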

Deep learning is a subfield of machine learning, which is a vibrant research area in artificial intelligence, or AI. Abstractly, machine learning is an approach to approximating functions based on a collection of data points. For example, given the sequence “2, 4, 6,…”, a machine might predict that the 4th element of the sequence is 8, and that the 5th is 10, by hypothesizing that the sequence captures the behavior of the function 2 times X, where X is the position of the element in the sequence. This paradigm is quite general. It has been highly successful in applications ranging from self-driving cars and speech recognition to anticipating airfare fluctuations and much more.
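
As a rough illustration of the “2, 4, 6,…” example, the following Python sketch fits the one-parameter hypothesis f(X) = a × X to the first three data points and then predicts the next two. The least-squares fit is an assumed choice of method for the illustration, not something specified in the article.

    # Positions and observed values from the sequence "2, 4, 6,..."
    xs = [1, 2, 3]
    ys = [2, 4, 6]

    # Fit f(X) = a * X by least squares; here a comes out to exactly 2.0.
    a = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

    print(a * 4)  # predicted 4th element: 8.0
    print(a * 5)  # predicted 5th element: 10.0

Deep learning does the same kind of thing, except that the function being approximated has millions of parameters instead of one, and the data points are photos, audio clips, or game positions rather than a short sequence of numbers.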

Read the source article at wired.com