Artificial Intelligence (AI) seems to have a straightforward definition: whenever a program or a machine can perform a task without human intervention, we call it artificially intelligent. But, like all cutting-edge technology, it’s not that simple. In fact, AI is an umbrella term covering a number of disciplines, with Machine Learning (ML) being one of the most popular.
In this blog series, I’ll explain what AI actually means, and then move on to its little sister, ML. This first article will also illustrate ML with an example from medical diagnostics. The second article will focus on confusingly similar “imposters”, which many would label as artificially intelligent, although they’re not. In the last article, I’ll finish the series with an overview of the challenges regarding Machine Learning.
Generally speaking, something is said to be artificially intelligent if it can react appropriately and optimally to new input over time, particularly complex input. A common, if extreme, example would be a robot whose responses are indistinguishable from a human’s. This behavior is also not static: just as humans grow and adapt as the world changes, so must the machines. A hallmark of this adaptation is that it requires no manual tuning. An artificially intelligent robot would learn how to react in the 21st century through experience, rather than through a 21st-century update patch.
Machine learning is a subset of artificial intelligence concerned with the math and computation behind that behavior. It is the algorithm, or set of algorithms, that determines which reaction is appropriate and optimal given a set of inputs. Importantly, the algorithm is also in charge of updating or tuning its responses based on new input and experiences. Machine learning algorithms can be used to answer questions like “Does this person have this disease?” or “What is the optimal move to make in this chess game?”
Let’s consider the disease example from above and assume we’re trying to determine if someone has the flu. An enormous number of factors could go into making that determination. Some simple factors might be age, location of pain, and duration of symptoms, while more complicated ones include genetic disease history and other biomarkers. The algorithm typically starts off with some commonsense approach. For this example, let’s say that our starting rule is:
People who have a cough, congestion, and runny nose have the flu.
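Written as code, this starting rule might look like the minimal sketch below. The function name `starting_rule` and the symptom strings are illustrative assumptions, not part of any real diagnostic system.

```python
def starting_rule(symptoms: set[str]) -> str:
    """Hand-written starting rule: cough + congestion + runny nose means flu."""
    required = {"cough", "congestion", "runny nose"}
    # The rule fires only when all three symptoms are present.
    return "flu" if required <= symptoms else "not flu"


print(starting_rule({"cough", "congestion", "runny nose"}))
print(starting_rule({"cough", "fever"}))
```

Nothing is learned here yet: the rule is fixed by hand, which is exactly what the learning step will later replace.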
So far this is the thought process a doctor might take. But you might ask, “Couldn’t this person just have the common cold?” So we bring more factors into the decision, such as whether the person has had a flu shot (which would point toward a cold diagnosis) or whether they also have a fever (which would point toward a flu diagnosis). Eventually we’ll be left with all of the input information about the patient (whether it was used to make a diagnosis or not) and the diagnosis itself, which is the output information. Gather up these inputs and outputs from doctors around the world and we have ourselves a large, representative sample data set.
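One way to picture such a data set is as a list of records, each holding a patient’s inputs and the doctor’s output. The sketch below is only an illustration: the `PatientRecord` class, its field names, and the two sample patients are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    # Inputs: everything recorded about the patient, used or not.
    cough: bool
    congestion: bool
    runny_nose: bool
    fever: bool
    had_flu_shot: bool
    age: int
    # Output: the doctor's diagnosis, "flu" or "cold".
    diagnosis: str


def to_features(record: PatientRecord) -> list[float]:
    """Turn the inputs into numbers a learning algorithm can work with."""
    return [
        float(record.cough),
        float(record.congestion),
        float(record.runny_nose),
        float(record.fever),
        float(record.had_flu_shot),
        float(record.age),
    ]


def to_label(record: PatientRecord) -> int:
    """Encode the output: 1 for flu, 0 for anything else."""
    return 1 if record.diagnosis == "flu" else 0


# Two invented patients standing in for data gathered from many doctors.
records = [
    PatientRecord(True, True, True, True, False, 34, "flu"),
    PatientRecord(True, True, True, False, True, 52, "cold"),
]
X = [to_features(r) for r in records]  # inputs
y = [to_label(r) for r in records]     # outputs
```

Keeping the inputs (`X`) and outputs (`y`) separate is the usual convention, because the algorithm’s job is to find a formula that maps one to the other.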
The machine learning algorithm can use that data set to train itself (or “learn”) by gradually tuning the underlying decision-making formula over and over again, trying different combinations of relevant factors. Ideally, it stops the tuning process when the underlying formula can correctly reproduce each historical diagnosis given the patient’s information. In practice, training usually stops once accuracy stops improving, because matching every single case exactly can mean the formula has memorized the data rather than learned from it. This leads us to two key benefits:
If the data set is big enough, the formula can accurately diagnose new cases as well.
It may uncover factors, or combinations of factors, that weren’t intuitively linked to the diagnosis, and use them to make better diagnoses than doctors could.
The second point assumes that the doctors were making the correct diagnosis to begin with. If the doctors were incorrect, and the decision-making formula was derived from bad data, then we would expect the formula to perform at best about as well as the average doctor, but not better. This could be alleviated in cases where there are tests to determine if a diagnosis was correct, such as a blood test.
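The tuning loop described above can be sketched in a few lines. This is a toy illustration rather than a real diagnostic model: it uses logistic regression trained by gradient descent (one common choice among many), the six patients are invented, and every patient is assumed to already have a cough, congestion, and a runny nose, so only fever and flu-shot status vary.

```python
import math

# Made-up training data: (fever, had_flu_shot) -> 1 for flu, 0 for cold.
patients = [
    ((1.0, 0.0), 1),
    ((1.0, 1.0), 1),
    ((0.0, 1.0), 0),
    ((0.0, 0.0), 0),
    ((1.0, 0.0), 1),
    ((0.0, 1.0), 0),
]


def predict_probability(weights, bias, features):
    """The 'formula': a weighted sum of factors squashed into a 0..1 flu probability."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def all_correct(weights, bias, data):
    """True when the formula reproduces every historical diagnosis."""
    return all(
        (predict_probability(weights, bias, x) >= 0.5) == bool(label)
        for x, label in data
    )


def train(data, learning_rate=0.5, max_rounds=1000):
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(max_rounds):
        # Stopping rule from the text: quit once every past case is matched.
        if all_correct(weights, bias, data):
            break
        for features, label in data:
            # Nudge each weight in the direction that shrinks the error.
            error = predict_probability(weights, bias, features) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


weights, bias = train(patients)
print("learned weights:", weights, "bias:", bias)
print("flu probability, fever and no flu shot:",
      predict_probability(weights, bias, (1.0, 0.0)))
```

Note that the loop can only ever agree with the diagnoses it is given. If those labels are wrong, the formula learns the same mistakes, which is why confirmed results such as a blood test make better training outputs. Real projects also usually check the formula against cases it has never seen instead of relying only on the historical ones.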
To bring it all together: we used a machine (program) to learn (tune itself) from the actions of others (input data set), and the result could ultimately serve as an artificially intelligent (mathematically optimal) diagnosis tool.
In the next article, we’ll look at some of the common AI imposters and dive deeper into why they’re not quite artificially intelligent.