The two most common categories of tasks in machine learning are supervised learning and unsupervised learning.
- A task is a specific objective for our algorithms.
- Once we have picked the right task, the algorithms that solve it can be swapped in and out.
- In fact, we should always try multiple algorithms, because we most likely won’t know in advance which one will perform best on our dataset.
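To make the idea of swapping algorithms concrete, here is a minimal sketch in plain Python. The dataset, the three toy algorithms (`fit_mean`, `fit_line`, `fit_nearest`), and the choice of mean squared error on a held-out set are all assumptions made for illustration. The task (regression) stays fixed while the algorithm changes.

```python
# Toy regression task: predict y from x. The numbers are made up for illustration.
train = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
test = [(1.5, 3.0), (3.5, 7.0), (4.5, 9.0)]


def fit_mean(data):
    """Baseline: always predict the mean of the training targets."""
    mean_y = sum(y for _, y in data) / len(data)
    return lambda x: mean_y


def fit_line(data):
    """Simple linear regression using the closed-form least-squares solution."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in data)
             / sum((x - mean_x) ** 2 for x, _ in data))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_nearest(data):
    """1-nearest-neighbor: predict the target of the closest training point."""
    return lambda x: min(data, key=lambda point: abs(point[0] - x))[1]


def mean_squared_error(model, data):
    """Average squared difference between predictions and true targets."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


# Same task, same data, three interchangeable algorithms.
algorithms = {"mean": fit_mean, "line": fit_line, "nearest": fit_nearest}
scores = {name: mean_squared_error(fit(train), test)
          for name, fit in algorithms.items()}
best = min(scores, key=scores.get)

for name, mse in scores.items():
    print(f"{name}: test MSE = {mse:.3f}")
print("best:", best)
```

Because every algorithm is scored on the same held-out data, the comparison shows which one suits this particular dataset. On a different dataset the ranking could easily change.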
Supervised learning includes tasks for “labeled” data (i.e. we have a target variable).
- In practice, it’s often used as an advanced form of predictive modeling.
- Each observation must be labeled with a “correct answer.”
- Only then can we build a predictive model, because we must tell the algorithm what’s “correct” while training it (hence, “supervising” it).
- Regression is the task of modeling continuous target variables.
- Classification is the task of modeling categorical (a.k.a. “class”) target variables.
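The difference between the two supervised tasks comes down to the type of target variable. The sketch below uses a hypothetical labeled dataset and a simple k-nearest-neighbors approach (one of many possible algorithms, chosen here only for brevity). It predicts a continuous target by averaging and a categorical target by majority vote.

```python
from collections import Counter

# Hypothetical labeled observations: (hours_studied, exam_score, outcome).
# exam_score is a continuous target; outcome is a categorical target.
observations = [
    (1.0, 35.0, "fail"),
    (2.0, 45.0, "fail"),
    (3.0, 55.0, "fail"),
    (4.0, 65.0, "pass"),
    (5.0, 75.0, "pass"),
    (6.0, 85.0, "pass"),
]


def nearest(hours, k=3):
    """Return the k observations whose feature value is closest to `hours`."""
    return sorted(observations, key=lambda row: abs(row[0] - hours))[:k]


def predict_score(hours, k=3):
    """Regression: average the continuous target of the k nearest neighbors."""
    neighbors = nearest(hours, k)
    return sum(row[1] for row in neighbors) / len(neighbors)


def predict_outcome(hours, k=3):
    """Classification: majority vote over the class labels of the neighbors."""
    neighbors = nearest(hours, k)
    return Counter(row[2] for row in neighbors).most_common(1)[0][0]


score = predict_score(5.5)      # a number
outcome = predict_outcome(5.5)  # a class label
print(score, outcome)
```

Both functions rely on the “correct answers” stored with each observation, which is what makes the learning supervised.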
Unsupervised learning includes tasks for “unlabeled” data (i.e. you do not have a target variable).
- In practice, it’s often used either as a form of automated data analysis or automated signal extraction.
- Unlabeled data has no predetermined “correct answer.”
- Instead, you let the algorithm learn patterns directly from the data (without “supervision”).
- Clustering is the most common unsupervised learning task, and it’s for finding groups within your data.
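As an illustration of clustering, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. The points, the choice of two clusters, and the starting centroids are assumptions made for the example. Note that no target variable appears anywhere: the groups come from the data alone.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)),
            key=lambda i: squared_distance(p, centroids[i]))
        for p in points
    ]


def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    for _ in range(iterations):
        labels = assign(points, centroids)
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                # New centroid is the coordinate-wise mean of its members.
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                # Keep the old centroid if no point was assigned to it.
                new_centroids.append(old)
        centroids = new_centroids
    return centroids, assign(points, centroids)


# Unlabeled, made-up 2D data with two visually obvious groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

# Assumption: start from two of the data points as initial centroids.
centroids, labels = kmeans(points, centroids=[points[0], points[3]])
print(centroids)
print(labels)
```

In practice the number of clusters and the starting centroids both affect the result, so it is common to run the algorithm several times with different settings.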