As a boy, I wanted to maximize my impact on the world, so I decided I would build a self-improving AI that could learn to become much smarter than I am. That would allow me to retire and let AIs solve all of the problems that I could not solve myself—and also colonize the universe in a way infeasible for humans, expanding the realm of intelligence.
So I studied mathematics and computers. My very ambitious 1987 diploma thesis described the first concrete research on meta-learning programs, which not only learn to solve a few problems but also learn to improve their own learning algorithms, restricted only by the limits of computability, to achieve super-intelligence through recursive self-improvement.
I am still working on this, but now many more people are interested. Why? Because the methods we’ve created on the way to this goal are now permeating the modern world—available to half of humankind, used billions of times per day.
As of August 2017, the five most valuable public companies in existence are Apple, Google, Microsoft, Facebook and Amazon. All of them are heavily using the deep-learning neural networks developed in my labs in Germany and Switzerland since the early 1990s, in particular the Long Short-Term Memory network, or LSTM, described in several papers with my colleagues Sepp Hochreiter, Felix Gers, Alex Graves and other brilliant students and postdocs funded by European taxpayers.
In the beginning, such an LSTM is stupid. It knows nothing. But it can learn through experience. It is loosely inspired by the human cortex, each of whose more than 15 billion neurons is connected to 10,000 other neurons on average. Input neurons feed the rest with data (sound, vision, pain). Output neurons trigger muscles. Thinking neurons are hidden in between. All learn by changing the connection strengths defining how strongly neurons influence each other.
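To make "learning by changing connection strengths" concrete, here is a minimal sketch of a single artificial neuron that starts out knowing nothing and learns the logical OR function from examples. It uses the classic perceptron update rule, which is a deliberately simple stand-in and not how LSTMs are trained; the function names, learning rate and training data are illustrative assumptions.

```python
def step(x):
    """Fire (1) if the summed input is positive, otherwise stay silent (0)."""
    return 1 if x > 0 else 0


def predict(weights, bias, inputs):
    """Weighted sum of the inputs followed by the threshold."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_neuron(samples, epochs=20, rate=0.1):
    """Perceptron rule: nudge each connection strength in proportion to the error."""
    weights = [0.0, 0.0]  # the neuron starts with no knowledge
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


# Hypothetical training data: the truth table of logical OR.
OR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

weights, bias = train_neuron(OR_SAMPLES)
predictions = [predict(weights, bias, inputs) for inputs, _ in OR_SAMPLES]
print(predictions)
```

Nothing in the neuron is programmed to compute OR. The behavior emerges only from repeated small corrections to the two weights and the bias.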
Things are similar for our LSTM, an artificial recurrent neural network (RNN), which outperforms previous methods in numerous applications. LSTM learns to control robots, analyze images, summarize documents, recognize videos and handwriting, run chat bots, predict diseases and click rates and stock markets, compose music, and much more. LSTM has become a basis of much of what’s now called deep learning, especially for sequential data (note that most real-world data is sequential).
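What makes an LSTM suited to sequential data is a memory cell guarded by gates that decide what to forget, what to store and what to reveal at each time step. The following is a minimal sketch of one such step for a single unit with scalar input and state, in the commonly used form with a forget gate. The parameter values are arbitrary, untrained assumptions chosen only for illustration, and the names `lstm_step`, `PARAMS` and `run` are hypothetical.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM cell.

    x is the current input, h_prev the previous output, c_prev the previous
    memory cell content. params maps each gate name to a tuple
    (input weight, recurrent weight, bias).
    """
    def gate(name, squash):
        w_x, w_h, b = params[name]
        return squash(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)       # how much old memory to keep
    i = gate("input", sigmoid)        # how much new information to store
    o = gate("output", sigmoid)       # how much of the memory to reveal
    g = gate("candidate", math.tanh)  # the new information itself

    c = f * c_prev + i * g            # updated memory cell
    h = o * math.tanh(c)              # updated output
    return h, c


# Arbitrary, untrained parameters (assumptions for illustration only).
PARAMS = {
    "forget": (0.5, 0.1, 1.0),
    "input": (0.6, -0.2, 0.0),
    "output": (0.3, 0.4, 0.0),
    "candidate": (1.0, 0.5, 0.0),
}


def run(sequence, params=PARAMS):
    """Feed a sequence through the cell, carrying h and c from step to step."""
    h, c = 0.0, 0.0
    outputs = []
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
        outputs.append(h)
    return outputs


outputs = run([1.0, 0.0, -1.0, 0.5])
print(outputs)
```

In a real system the weights would be learned from data rather than fixed by hand, and each layer would contain many such units operating on vectors.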
Read the source article at Scientific American.