Machine Learning 101


AI & Machine Learning Defined

On a basic level, we can distinguish AI and ML as follows:

AI is a larger idea that aims to construct intelligent machines (or computer programs) that can mimic human thinking capabilities and behavior without being manually driven or coordinated by humans.

Machine learning is a subset of AI that allows machines to learn from data without being explicitly programmed. In machine learning we use data as an input to automatically create computer programs.

What are the different types of machine learning?

Machine learning has been around for more than 60 years, but only recently has it attracted wide public interest.

Machine learning is commonly divided into five types:

  • supervised learning,
  • unsupervised learning,
  • reinforcement learning,
  • semi-supervised learning and
  • active learning.

Supervised machine learning

In supervised machine learning, a system is given a set of inputs together with the desired outputs (a training dataset) and uses them to produce a trained model that can predict outputs for new inputs.

One of the most common supervised machine-learning techniques is classification, which takes a set of data points and assigns each one to one of two or more categories based on features extracted from the data points. Algorithms used for classification include Support Vector Machines, Decision Trees, Random Forests, Logistic Regression, k-Nearest Neighbors, Naïve Bayes, Gradient Boosted Trees (like XGBoost) and related techniques.
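As a rough illustration of classification, here is a minimal k-Nearest Neighbors sketch in plain Python. The training data, the feature meanings (height and weight), and the function name knn_predict are all made up for this example; a real project would normally use a library rather than hand-written code.

```python
import math
from collections import Counter

# Hypothetical training dataset: (height_cm, weight_kg) -> label
training_data = [
    ((150.0, 50.0), "small"),
    ((155.0, 55.0), "small"),
    ((160.0, 58.0), "small"),
    ((180.0, 85.0), "large"),
    ((185.0, 90.0), "large"),
    ((190.0, 95.0), "large"),
]

def knn_predict(point, data, k=3):
    """Return the majority label among the k training points closest to `point`."""
    by_distance = sorted(data, key=lambda item: math.dist(point, item[0]))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

prediction_small = knn_predict((152.0, 52.0), training_data)
prediction_large = knn_predict((188.0, 92.0), training_data)
print(prediction_small, prediction_large)  # small large
```

The "training" here is simply storing the labeled examples; the prediction for a new point comes from the labels of its closest neighbors.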

Unsupervised machine learning

In unsupervised machine learning, there are inputs but no desired outputs; the system finds patterns in the input data on its own. Common forms of unsupervised learning include data clustering (e.g. k-means, DBSCAN, and network connectivity clustering).
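To make clustering concrete, the following is a minimal k-means sketch in plain Python. The points and the choice of starting centroids are invented for the example, and the function name kmeans is only illustrative. Notice that no labels are supplied anywhere.

```python
import math

# Hypothetical unlabeled 2-D points forming two loose groups
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.0)]

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-centering."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty)
        centroids = [
            tuple(sum(coords) / len(cluster) for coords in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Assumption: start from two of the data points as initial centroids
centroids, clusters = kmeans(points, [points[0], points[3]])
print(clusters)
```

The algorithm discovers the two groups purely from the distances between points.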

Reinforcement learning

Reinforcement learning is a trial-and-error process where a computer attempts to complete a task in a number of ways and receives feedback on its performance. After a number of iterative attempts the computer gets closer and closer to meeting its objective and “learns” the series of decisions and actions required to complete it.
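One well-known way to implement this trial-and-error loop is Q-learning. The sketch below is a toy setup invented for this article: an agent in a five-cell corridor earns a reward only when it reaches the last cell. The constants and function names are assumptions chosen for illustration.

```python
import random

random.seed(0)  # fixed seed so repeated runs behave the same

N_STATES = 5                 # corridor cells 0..4; cell 4 is the goal
ACTIONS = ["left", "right"]
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # learning rate, discount, exploration rate

# Q-table: estimated long-term reward for each (state, action) pair
q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]

def step(state, action):
    """Move one cell; the reward is 1.0 only when the goal cell is reached."""
    nxt = max(0, state - 1) if action == "left" else state + 1
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward

def choose_action(state):
    """Explore at random sometimes (and on ties); otherwise take the best-known action."""
    if random.random() < EPSILON or q[state]["left"] == q[state]["right"]:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[state][a])

for episode in range(200):
    state = 0
    while state != N_STATES - 1:
        action = choose_action(state)
        nxt, reward = step(state, action)
        # Feedback: nudge the estimate toward reward + discounted future value
        best_next = 0.0 if nxt == N_STATES - 1 else max(q[nxt].values())
        q[state][action] += ALPHA * (reward + GAMMA * best_next - q[state][action])
        state = nxt

# The learned policy: best action in each non-goal cell
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print(policy)  # ['right', 'right', 'right', 'right']
```

Nobody tells the agent that "right" is correct; it works that out from the rewards it receives over many attempts.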

Semi-supervised machine learning

Semi-supervised machine learning is an extension of supervised machine learning. It is different in that it only requires a small subset of input data to be labeled, and it uses these labels to automatically create rules for labeling additional training data.
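One simple technique in this family is self-training (also called pseudo-labeling). The sketch below uses made-up one-dimensional data and a hypothetical helper named self_train: it starts with two labeled points and repeatedly labels the unlabeled point it is most confident about, here meaning the one closest to an already labeled point.

```python
# Hypothetical data: only two points come with labels
labeled_seed = [(1.0, "low"), (10.0, "high")]
unlabeled = [2.0, 3.5, 4.0, 7.0, 8.5, 9.0]

def self_train(labeled, unlabeled):
    """Grow the labeled set by copying the label of the nearest labeled point."""
    labeled = list(labeled)
    remaining = list(unlabeled)
    while remaining:
        # Most confident guess: the unlabeled point nearest to any labeled point
        x, (_, label) = min(
            ((u, known) for u in remaining for known in labeled),
            key=lambda pair: abs(pair[0] - pair[1][0]),
        )
        labeled.append((x, label))
        remaining.remove(x)
    return labeled

result = dict(self_train(labeled_seed, unlabeled))
print(result)
```

In this toy run the points 2.0, 3.5 and 4.0 end up labeled "low" and 7.0, 8.5 and 9.0 end up labeled "high", even though a person labeled only two points.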

Active learning

Active learning is a special case of supervised machine learning in which the learning system can ask a person to provide the correct label for selected new data points.
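A common strategy for choosing what to ask is uncertainty sampling: query the data point the current model is least sure about. The following sketch is a toy version with invented data. The ask_human function stands in for a real person, and the code assumes the pool is sorted, with its first point "ham" and its last point "spam".

```python
def ask_human(x):
    """Stand-in for a human annotator; this rule is hidden from the learner."""
    return "spam" if x >= 6.2 else "ham"  # hypothetical ground truth

def active_learn(pool, ask):
    """Learn a ham/spam boundary while asking for as few labels as possible."""
    # Start with labels for the two extremes only
    labeled = {pool[0]: ask(pool[0]), pool[-1]: ask(pool[-1])}
    queries = 0
    while True:
        highest_ham = max(x for x, y in labeled.items() if y == "ham")
        lowest_spam = min(x for x, y in labeled.items() if y == "spam")
        boundary = (highest_ham + lowest_spam) / 2
        # Only points between the two known classes are still uncertain
        uncertain = [x for x in pool if highest_ham < x < lowest_spam]
        if not uncertain:
            return boundary, queries
        # Ask about the point closest to the current decision boundary
        query = min(uncertain, key=lambda x: abs(x - boundary))
        labeled[query] = ask(query)
        queries += 1

pool = [0.5, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 9.5]
boundary, queries = active_learn(pool, ask_human)
print(boundary, queries)  # 6.5 3
```

Of the nine points that were unlabeled at the start, only three needed a human answer to pin down the boundary.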

What are some examples of how ML is used in the real world?

Real-world applications of ML include:

  • A virtual assistant like Amazon Alexa that learns from a user’s voice commands and phone habits to get smarter over time and, acting as a recommendation engine, provide tailored information.

  • An audio translator that can work without any human input by analyzing audio from one language, converting it into textual format and then translating it into another language.

  • A robot that can manage a warehouse, keep track of inventory, and automatically send orders to distribution centers.

  • A video game non-player character that can follow a player in the game and help them complete their missions.

  • A cybersecurity system that notifies an incident response team of potential intrusion activity by applying anomaly detection algorithms to network traffic and web logs of user interactions.
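To give a feel for the anomaly detection idea in the last item, here is a very simple sketch that flags values far from the average (a z-score test). The request counts are invented, the threshold of 2.0 is an arbitrary assumption, and production systems typically use more robust methods.

```python
import statistics

# Hypothetical requests-per-minute counts taken from a web log
requests_per_minute = [52, 48, 50, 51, 49, 47, 53, 50, 250, 51]

def find_anomalies(values, threshold=2.0):
    """Return (index, value) pairs lying more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # all values identical, nothing stands out
    return [(i, v) for i, v in enumerate(values) if abs(v - mean) / stdev > threshold]

anomalies = find_anomalies(requests_per_minute)
print(anomalies)  # [(8, 250)]
```

The burst of 250 requests stands out from the usual traffic of about 50 and would be the minute worth alerting on.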

What are some risks and possible disadvantages to using AI and ML?

Despite its benefits, ML also has some disadvantages.

The first one is that all algorithms risk being heavily biased by the data used to train them and the perspective of their programmers. For example, a computer vision machine learning algorithm that was trained using only photos of young people may incorrectly categorize older people as non-human, a different animal species, or “unknown.” Naive algorithms can unintentionally lead to forms of discrimination against groups of people.
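One practical way to spot this kind of bias is to measure accuracy separately for each group instead of relying on a single overall number. The sketch below uses invented evaluation records that mirror the example above; the group names and the helper accuracy_by_group are hypothetical.

```python
from collections import defaultdict

# Hypothetical evaluation records: (age_group, true_label, predicted_label)
records = [
    ("young", "person", "person"),
    ("young", "person", "person"),
    ("young", "person", "person"),
    ("young", "person", "person"),
    ("old", "person", "person"),
    ("old", "person", "unknown"),
    ("old", "person", "unknown"),
    ("old", "person", "animal"),
]

def accuracy_by_group(records):
    """Return the fraction of correct predictions for each group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, predicted in records:
        total[group] += 1
        correct[group] += int(truth == predicted)
    return {group: correct[group] / total[group] for group in total}

group_accuracy = accuracy_by_group(records)
print(group_accuracy)  # {'young': 1.0, 'old': 0.25}
```

The overall accuracy in this toy data is 62.5%, which hides the fact that the model is right every time for one group and only a quarter of the time for the other.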

A second disadvantage is that there’s no guarantee that an AI system will act ethically or morally in a given situation. Human morality and ethics are complex concepts, not easily encoded in AI and machine learning algorithms. As we increase the use of AI and machine learning in human decision-making processes, we open ourselves to unethical decisions made by algorithms with no accountability.

I want to learn about machine learning. Where do I start?

If you want to learn general concepts and you like watching videos, I can recommend this 23-minute YouTube video from AWS, “Introduction to Machine Learning.”

If you are a software engineer, a great place to start is Google’s Machine Learning Crash Course.