Developers know a lot about the machine learning (ML) systems they create and manage; that’s a given. However, non-developers also need a high-level understanding of the types of systems. Artificial neural networks and expert systems are the two classical key classes. With advances in computing performance, software capabilities, and algorithm complexity, analytical algorithms can arguably be said to have joined the other two. This article is an overview of the three types.
Artificial Neural Networks
The ML architecture getting most of the press is the artificial neural network (ANN), often discussed in the form of the convolutional neural network (CNN). Strictly speaking, the CNN is one form of ANN, the one almost always discussed in academic circles and conferences, but the two are close enough, when compared with the other two methods discussed below (expert systems and analytics), to be treated together here.
The concept of ANNs traces back to 1943, when Warren McCulloch and Walter Pitts first defined a model for brain activity based on mathematics and logic. The limits of computing at that time meant ANNs did not have an impact on business until this century. Faster, cheaper compute and advances in networking and parallel computing enabled performance improvements and helped push ANNs to the ML forefront.
The ANN is a type of deep learning, a way for computers to leverage understanding of how a brain works. “Leverage” is an important word, as ANN developers do not mean to imitate the brain but to use some general ideas of how the brain works. According to one textbook, “The modern term ‘deep learning’ goes beyond the neuroscientific perspective on the current breed of machine learning models. It appeals to a more general principle of learning multiple levels of composition, which can be applied in machine learning frameworks that are not necessarily neurally inspired.” (Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. 2016. Deep Learning. The MIT Press)
The ANN is a group of nodes, each in a layer. ANNs include an input layer, an output layer, and any number of hidden layers that process information. Nodes within each layer are identical and process the information passed to them by the previous layer. For instance, in vision, one layer might identify edges of objects by locating gradients, while more advanced layers build on top of that to recognize more complex concepts.
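To make the edge example concrete, here is a minimal sketch in plain Python of locating edges by measuring the change in brightness between neighboring pixels. The function names, the tiny image, and the threshold are illustrative assumptions. A real vision layer learns its filters from data rather than using a fixed difference like this one.

```python
def horizontal_gradient(image):
    """Return the difference between each pixel and its right-hand neighbor."""
    return [
        [row[x + 1] - row[x] for x in range(len(row) - 1)]
        for row in image
    ]


def edge_columns(image, threshold=0.5):
    """Per row, list the column indices where the brightness jump exceeds threshold."""
    return [
        [x for x, g in enumerate(row) if abs(g) > threshold]
        for row in horizontal_gradient(image)
    ]


# Hypothetical 3x5 grayscale image: dark on the left, bright on the right.
image = [
    [0.0, 0.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0, 1.0],
]
edges = edge_columns(image)  # the jump sits between columns 1 and 2 in every row
```

A later layer could take `edges` as its input and look for patterns among the edges, which is the sense in which layers build on one another.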
What passes to each node is not one piece of information from a previous node. It is an array of data that is then manipulated, usually using statistical and probabilistic methods, to understand the information and then pass information to another layer until reaching the output layer.
The ANN figure above shows an example of forward propagation, or a feed-forward system. Each layer provides information to the next, which applies its own statistical and other analytical methods to push a result forward to the next step. Back-propagation (not shown) pushes knowledge learned in later layers back down the network to help adjust parameters in earlier layers.
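The two directions can be sketched in a few lines of plain Python. This is a toy network under stated assumptions: two inputs, two hidden nodes, one output, sigmoid activations, no bias terms, and a squared-error loss. The weights, learning rate, and function names are made up for illustration and do not come from any particular library.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Feed-forward pass: input -> hidden layer -> single output node."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output


def backprop_step(x, target, w_hidden, w_out, lr=0.5):
    """One gradient-descent update on the loss 0.5 * (output - target) ** 2."""
    hidden, output = forward(x, w_hidden, w_out)
    # Error signal at the output node (chain rule through the sigmoid).
    delta_out = (output - target) * output * (1.0 - output)
    # Push that error back to each hidden node through its outgoing weight.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(w_out, hidden)]
    new_w_out = [w - lr * delta_out * h for w, h in zip(w_out, hidden)]
    new_w_hidden = [
        [w - lr * dh * xi for w, xi in zip(ws, x)]
        for ws, dh in zip(w_hidden, delta_hidden)
    ]
    return new_w_hidden, new_w_out


# Hypothetical training example and starting weights.
x, target = [1.0, 0.0], 1.0
w_hidden = [[0.1, -0.2], [0.4, 0.3]]
w_out = [0.2, -0.1]

_, before = forward(x, w_hidden, w_out)
w_hidden, w_out = backprop_step(x, target, w_hidden, w_out)
_, after = forward(x, w_hidden, w_out)  # output has moved toward the target
```

The forward pass only reads the weights. The backward step is where the error measured at the output is used to adjust the weights of the earlier layer.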
Array processing leads to another term used in deep learning: tensor. A tensor is simply a multi-dimensional array. A spreadsheet table is a version of a two-dimensional array, or tensor. As ANNs constantly work with complex arrays, and tensor is the more mathematical term, that word has carried more weight. That is the origin of the name of the open source product TensorFlow, a software library of routines used in machine learning applications.
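As a rough illustration of "multi-dimensional array," the sketch below represents tensors as nested Python lists and reports their dimensions. The `shape` helper is a hypothetical function written for this example and assumes every sub-list at a given depth has the same length. Libraries such as TensorFlow provide their own tensor types, which are not used here.

```python
def shape(tensor):
    """Return the dimensions of a regular nested-list tensor, outermost first."""
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        tensor = tensor[0] if tensor else None
    return tuple(dims)


scalar = 7.0                                    # zero dimensions
vector = [1.0, 2.0, 3.0]                        # one dimension
table = [[1, 2, 3], [4, 5, 6]]                  # two dimensions, like a spreadsheet
# Three dimensions, e.g. two tiny images of 3 rows by 4 columns each.
images = [[[0.0] * 4 for _ in range(3)] for _ in range(2)]

shapes = [shape(t) for t in (scalar, vector, table, images)]
```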
Expert Systems
Logic was a major AI focus during the 1980s. Rules-based systems would take input data, apply a series of rules, and hopefully reach a conclusion. “Hopefully” marks the key difference between a heuristic and a deterministic algorithm. When a deterministic algorithm is run, you will get a solution. Heuristics are “rules of thumb”: rules that can make predictions but offer no certainty. Heuristic algorithms often use probabilistic methods when applying rules and reporting results.
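The "hopefully" can be shown with a small sketch. Each rule of thumb below carries a confidence value, and the function may return no conclusion at all when no rule is confident enough. The rules, labels, confidence numbers, and cutoff are all invented for illustration.

```python
def classify_message(text, rules, min_confidence=0.6):
    """Apply rule-of-thumb rules; return (label, confidence) or None.

    None means no rule matched with enough confidence to justify a conclusion.
    """
    best = None
    lowered = text.lower()
    for keyword, label, confidence in rules:
        if keyword in lowered and (best is None or confidence > best[1]):
            best = (label, confidence)
    if best is None or best[1] < min_confidence:
        return None
    return best


# Hypothetical rules: (keyword to look for, predicted label, confidence).
RULES = [
    ("free money", "spam", 0.9),
    ("meeting", "work", 0.7),
    ("hello", "personal", 0.4),
]

confident = classify_message("FREE MONEY inside", RULES)
too_weak = classify_message("hello there", RULES)
no_match = classify_message("quarterly figures attached", RULES)
```

Here `confident` is a prediction with a stated confidence, while `too_weak` and `no_match` are both `None`: the rules ran, but no conclusion was reached.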
Two terms relevant to expert systems are forward chaining and backward chaining. Forward chaining starts from evidence and drives to a conclusion. Backward chaining starts with a conclusion and then checks to see if the evidence supports that conclusion. Think about cause and effect.
Forward Chaining and Backward Chaining
In a modern hospital, monitors constantly track many patient data points. Based on changes in such things as temperature and heart rate, an expert system can forward-chain to conclude that a patient is getting better or worse. The machine learning subsystem can then provide the information necessary for the system to notify a staff member of a pending problem predicted by the existing data.
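A minimal sketch of forward chaining for a scenario like this one follows. The fact names and rules are hypothetical placeholders chosen for readability, not clinical guidance, and a production system would work with measured values rather than ready-made labels.

```python
def forward_chain(facts, rules):
    """Fire every rule whose premises are all known until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical rules: (set of required facts, fact to conclude).
RULES = [
    (frozenset({"temperature_rising", "heart_rate_rising"}), "possible_infection"),
    (frozenset({"possible_infection", "blood_pressure_falling"}), "patient_worsening"),
    (frozenset({"patient_worsening"}), "notify_staff"),
]

observations = {"temperature_rising", "heart_rate_rising", "blood_pressure_falling"}
derived = forward_chain(observations, RULES)
```

The engine starts only from the evidence in `observations` and keeps adding conclusions until it reaches `notify_staff`. With less evidence, for example temperature alone, no rule fires and nothing is concluded.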
MYCIN, an early expert system, would instead ask about a person’s physical symptoms and then make predictions about the bacteria that might have caused them. That is an example of backward chaining.
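Backward chaining can be sketched the same way, starting from a candidate conclusion and working back to the evidence. The rules and fact names below are hypothetical and are not MYCIN's actual rule base. The real system also asked the user questions and attached certainty values, both of which this sketch leaves out.

```python
def backward_chain(goal, facts, rules, _seen=frozenset()):
    """Return True if goal is a known fact or provable from rules that conclude it."""
    if goal in facts:
        return True
    if goal in _seen:  # already trying to prove this goal: avoid circular rules
        return False
    seen = _seen | {goal}
    for premises, conclusion in rules:
        if conclusion == goal and all(
            backward_chain(p, facts, rules, seen) for p in premises
        ):
            return True
    return False


# Hypothetical rules: (required facts or sub-goals, conclusion).
RULES = [
    (("fever", "elevated_white_cell_count"), "infection_suspected"),
    (("infection_suspected", "culture_shows_rods"), "organism_x_likely"),
]

answers = {"fever", "elevated_white_cell_count", "culture_shows_rods"}
supported = backward_chain("organism_x_likely", answers, RULES)
unsupported = backward_chain("organism_x_likely", {"fever"}, RULES)
```

The search begins with the hypothesis `organism_x_likely` and checks whether each premise can be established, which is the reverse of the evidence-first direction used in forward chaining.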
Read the source article at Forbes.