Key Machine Learning Terminology

When you study machine learning, it helps to use its terminology precisely and consistently.

Model – a set of patterns learned from data.

Algorithm – a specific ML process used to train a model.

Training data – the dataset from which the algorithm learns the model.

Test data – a separate dataset, not used during training, for reliably evaluating model performance.

Features – variables (columns) in the dataset used to train the model.

Target variable – the specific variable you’re trying to predict.

Observations – data points (rows) in the dataset.

For example, let’s say you have a dataset of 150 primary school students, and you wish to predict their Height based on their Age, Gender, and Weight…

  • You have 150 observations…
  • 1 target variable (Height)…
  • 3 features (Age, Gender, Weight)…
  • You might then separate your dataset into two subsets:
    1. Set of 120 used to train several models (training set)
    2. Set of 30 held out to evaluate and compare those models (test set)
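
The example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the student records are randomly generated (not real data), the helper names (make_student, fit_mean_baseline, predict, mean_absolute_error) are hypothetical, and the "algorithm" is a deliberately trivial baseline that always predicts the average height seen in training.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

FEATURES = ["Age", "Gender", "Weight"]  # 3 features (input columns)
TARGET = "Height"                       # 1 target variable (what we predict)


def make_student():
    """Return one made-up observation (row); the values are not real data."""
    age = random.randint(6, 11)
    return {
        "Age": age,
        "Gender": random.choice(["F", "M"]),
        "Weight": round(random.uniform(18.0, 45.0), 1),
        # Assumed, invented relation between age and height (cm) plus noise.
        "Height": round(85.0 + 5.5 * age + random.gauss(0.0, 4.0), 1),
    }


# 150 observations (rows)
observations = [make_student() for _ in range(150)]

# Shuffle a copy, then split into training data (120) and test data (30).
shuffled = random.sample(observations, k=len(observations))
train_set, test_set = shuffled[:120], shuffled[120:]


def fit_mean_baseline(rows):
    """Algorithm: learn a model from training data (here, just the mean height)."""
    heights = [row[TARGET] for row in rows]
    return {"mean_height": sum(heights) / len(heights)}


def predict(model, row):
    """Model: this baseline ignores the features and predicts the learned mean."""
    return model["mean_height"]


def mean_absolute_error(model, rows):
    """Average absolute gap between predicted and actual target values."""
    errors = [abs(predict(model, row) - row[TARGET]) for row in rows]
    return sum(errors) / len(errors)


model = fit_mean_baseline(train_set)             # train on the training set
test_mae = mean_absolute_error(model, test_set)  # evaluate on the test set

print(len(observations), len(train_set), len(test_set))  # prints: 150 120 30
print(f"Baseline test error (MAE, cm): {test_mae:.1f}")
```

A real project would swap the baseline for an algorithm that actually uses Age, Gender, and Weight; the split into training and test data would stay the same.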