“What we’re interested in is teaching machines what can happen in a particular setting,” Carl Vondrick, a Ph.D. student in computer science, told Digital Trends. “For example, we wanted a machine to recognize what happens on a beach. We want it to know that waves are going to crash, people are going to play in the water — these are all things it’s very difficult to teach a machine. The reason is that it would be very time-consuming for a person to sit down and write rules to explain everything that can happen in any given scenario. What we wanted to do was to teach them from watching massive amounts of video instead.”
Whether it’s guessing which songs you want to listen to or which ads you should be shown, modern AI is increasingly focused on predicting the future. But there’s an enormous gulf between those kinds of applications and looking at a scene and guessing what will happen next.
That’s what researchers at the Massachusetts Institute of Technology have done, with a new paper revealing their ability not just to look at still images and guess what will happen next, something we’ve covered in the past, but to actually generate video of it.