Introduction to Machine Learning

What is machine learning?

Machine learning is a subset of artificial intelligence (AI) in which computers learn from data to make predictions and decisions without being explicitly programmed for each task. Instead of writing complex, case-specific code, we create a model that learns from data and finds patterns that help it make predictions.

Figure: Traditional programming vs. machine learning
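As a rough illustration of this contrast, the sketch below puts a hand-written rule next to a rule whose threshold is chosen from labeled examples. The toy spam data, the link-count feature, and both function names are assumptions made up for this example.

```python
# Hypothetical toy data: (number of links in an email, is the email spam?).
examples = [(0, False), (1, False), (2, False), (6, True), (8, True), (9, True)]


def rule_based_is_spam(link_count):
    """Traditional programming: a person picks the threshold by hand."""
    return link_count > 3


def learn_threshold(data):
    """Machine learning, heavily simplified: pick the threshold from the data.

    Tries the midpoint between each pair of neighbouring link counts and keeps
    the one that classifies the most training examples correctly.
    """
    counts = sorted(count for count, _ in data)
    candidates = [(a + b) / 2 for a, b in zip(counts, counts[1:])]

    def correct_predictions(threshold):
        return sum((count > threshold) == label for count, label in data)

    return max(candidates, key=correct_predictions)


threshold = learn_threshold(examples)
print("learned threshold:", threshold)
print("prediction for an email with 7 links:", 7 > threshold)
```

If the examples change, the learned threshold changes with them, whereas the hand-written rule stays the same until someone edits the code.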

Types of machine learning

1. Supervised machine learning

This type of learning requires features (input data) and labels (output data). The model learns the relationship between the inputs and the outputs so that it can predict the output for new inputs.

Types of supervised machine learning:

Classification: predicts discrete values (categories).

Examples:

Spam prediction: predict whether an email is spam or not.
Image classification: e.g. identify the animal shown in an image.

This image shows how classification works.

[Image: "How to Know Which Machine Learning Algorithms to Use: Techniques in Machine Learning", PostIndustria]

Suppose you want to build a model that predicts the name of a shape:

The features can be the number of sides, the number of corners, the lengths of the sides, and the angles at the corners of each shape.
The labels will be the shapes' names.
Training step: The model learns from the data, finding patterns that relate the features to the labels. For example, it will learn that squares have 4 sides and 4 corners, that all their sides have the same length, and that all their corners are right angles. Triangles, on the other hand, can have different combinations of side lengths, but all of them have 3 sides and 3 corners.
Prediction step: After the training step, the model can predict a shape's name according to the patterns it found.
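The training and prediction steps above can be sketched with a 1-nearest-neighbor classifier, one of the simplest classification methods. This is only an illustration under stated assumptions: the training data is made up, and each shape is described by just two features (the number of sides and the ratio of its longest to its shortest side), which is a simplification of the features listed above.

```python
# Hypothetical training data.
# Features: (number of sides, longest side / shortest side). Label: shape name.
training_data = [
    ((3, 1.0), "triangle"),
    ((3, 1.7), "triangle"),
    ((4, 1.0), "square"),
    ((4, 1.5), "rectangle"),
    ((4, 2.0), "rectangle"),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict_shape(features):
    """Return the label of the closest training example (1-nearest neighbor).

    "Training" here simply means storing the labeled examples; the prediction
    step compares a new shape with every stored example.
    """
    _, label = min(
        training_data,
        key=lambda example: squared_distance(example[0], features),
    )
    return label


print(predict_shape((4, 1.8)))  # a four-sided shape with clearly unequal sides
print(predict_shape((3, 1.2)))  # a three-sided shape
```

A model this small can only return labels it has seen, and it will mislabel shapes that fall between its examples, which is why real projects need far more training data.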

Regression: predicts continuous values.

Examples:

House price prediction
Weather forecasting

Suppose you want to build a model that predicts a house's price:

The features can be the number of rooms, the number of floors, the area of the house, its latitude and longitude, and so on.
The labels will be the houses' prices.
Training step: The model learns from the data, finding relationships between the features and the prices. For example, it may find that larger houses tend to have higher prices and that the number of floors is an important factor in the price of a house. The model tries to find the equation that best describes the relationship between features and outcomes.
Prediction step: After the training step, the model can predict the price of a house according to the relationship it found.
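To make "finding the most suitable equation" concrete, here is a minimal sketch of simple linear regression with a single feature, fitted by ordinary least squares. The areas and prices are invented and deliberately lie on a straight line so that the result is easy to check by hand; real house data is noisier and has many more features.

```python
# Hypothetical training data: house area in square metres and price.
areas = [50, 80, 100, 120, 150]
prices = [70_000, 100_000, 120_000, 140_000, 170_000]


def fit_line(xs, ys):
    """Ordinary least squares for the equation y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Training step: find the equation that best fits the data.
slope, intercept = fit_line(areas, prices)

# Prediction step: apply the equation to a house the model has not seen.
predicted_price = slope * 110 + intercept
print(f"price = {slope:.0f} * area + {intercept:.0f}")
print(f"predicted price for 110 square metres: {predicted_price:.0f}")
```

With several features, such as rooms, floors, and location, the same idea extends to multiple linear regression, where each feature gets its own coefficient.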

2. Unsupervised machine learning

In this learning type, we need an unlabeled dataset (a dataset that only contains features). Unsupervised learning models aim to find hidden patterns using only input data.

Clustering: this unsupervised learning technique groups data points with similar patterns into clusters.

Uses of unsupervised machine learning:

Recommendation systems: you can use clustering to recommend books, movies, restaurants and much more.

Customer segmentation: unsupervised learning is a good option for dividing customers with similar characteristics into groups, for example by demographic or geographic attributes.
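As an illustration of clustering applied to customer segmentation, the following sketch implements a bare-bones k-means in plain Python. The customer data (age and yearly spending) is invented, and the starting centroids are simply the first k points so that the result is reproducible; real implementations usually choose the starting centroids randomly and scale the features first.

```python
# Hypothetical customers: (age in years, yearly spending in thousands).
customers = [(25, 20), (27, 22), (24, 18), (55, 80), (58, 85), (60, 78)]


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=10):
    """Group points into k clusters; returns (labels, centroids)."""
    centroids = list(points[:k])  # assumption: deterministic start for the demo
    labels = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(point, centroids[c]))
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


labels, centroids = k_means(customers, k=2)
print("cluster of each customer:", labels)
print("cluster centres:", centroids)
```

No labels were given to the algorithm; the two groups (younger customers who spend less, older customers who spend more) emerge from the features alone.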

3. Reinforcement learning

In this learning type, an agent interacts with an environment by taking actions. It receives rewards for correct actions and penalties for wrong ones. The agent learns from this experience and aims to maximize its total reward.
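One of the simplest settings that shows this loop is a multi-armed bandit with an epsilon-greedy agent, sketched below. Everything here is an assumption chosen for illustration: the environment has three possible actions with made-up success probabilities, a success gives a reward of +1, and a failure gives a penalty of -1.

```python
import random


def run_bandit(success_probabilities, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: returns (value estimate per action, total reward)."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    n_actions = len(success_probabilities)
    pulls = [0] * n_actions
    value_estimates = [0.0] * n_actions
    total_reward = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(n_actions)
        else:
            # Exploit: take the action that currently looks best.
            action = max(range(n_actions), key=lambda a: value_estimates[a])
        # The environment answers with a reward (+1) or a penalty (-1).
        success = rng.random() < success_probabilities[action]
        reward = 1 if success else -1
        # Learn from experience: update the running average for this action.
        pulls[action] += 1
        value_estimates[action] += (reward - value_estimates[action]) / pulls[action]
        total_reward += reward
    return value_estimates, total_reward


estimates, total_reward = run_bandit([0.2, 0.5, 0.8])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated value of each action:", [round(v, 2) for v in estimates])
print("action the agent prefers:", best_action)
print("total reward collected:", total_reward)
```

The agent is never told which action is best. It discovers this by balancing exploration (trying actions at random) against exploitation (repeating what has worked so far).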

How to learn machine learning?

Choose a programming language: e.g. Python or R
Statistics: it helps us understand and analyze data
Linear algebra: to help us understand the theory behind different topics in machine learning
Data preprocessing: before giving data to a model, we should make sure it is clean, because effective models depend on clean data. Common steps in this phase include:
- Handling missing data
- Removing duplicated data
- Handling outliers
Data visualization: it's necessary to understand the data before the machine learning step. Visualizations and graphs help us understand the data better and find important patterns. They can also help in the data preprocessing phase, for example by revealing missing data or outliers.
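The three preprocessing steps listed above (missing data, duplicated data, outliers) can be sketched in plain Python as follows. The records are invented, and the specific choices, filling gaps with the median and treating ages outside 0 to 120 as outliers, are assumptions for this example; the right strategy depends on the dataset.

```python
from statistics import median

# Hypothetical raw records: (customer id, age). None marks a missing age.
raw_rows = [(1, 34), (2, None), (3, 29), (3, 29), (4, 31), (5, 240), (6, 38)]

# Removing duplicated data: drop repeated rows but keep the original order.
unique_rows = list(dict.fromkeys(raw_rows))

# Handling missing data: fill missing ages with the median of the known ages.
known_ages = [age for _, age in unique_rows if age is not None]
fill_value = median(known_ages)
filled_rows = [
    (customer_id, age if age is not None else fill_value)
    for customer_id, age in unique_rows
]

# Handling outliers: drop ages outside a plausible, hand-chosen range.
clean_rows = [(customer_id, age) for customer_id, age in filled_rows if 0 <= age <= 120]

print(clean_rows)
```

The median is used instead of the mean because it is barely affected by the extreme value 240, which is still in the data when the gap is filled.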

Understand machine learning algorithms: There are different methods to solve the same problem. Having a background in different algorithms can help you try various options and choose the best one to solve the problem.

Examples of classification algorithms:

Decision tree
Random forest
Support vector machine (SVM)
Logistic regression
K-nearest neighbors
Naive Bayes

Examples of regression algorithms:

Decision tree
Random forest
Support vector regression (SVR)
Linear regression
Lasso regression
Ridge regression

Examples of clustering algorithms:

K-means
DBSCAN
Hierarchical clustering

Build projects: building projects helps you enhance your technical skills, so keep building projects and publish them on Kaggle. Kaggle is a large community of data scientists and machine learning engineers. On this platform, you can also find datasets for your projects, take courses, and participate in competitions.

How to build a machine learning project?

Define the problem: understand the problem and gather suitable data. You can get data from different sources such as Kaggle and Google Dataset Search.
Data preprocessing: make sure that your data is clean before you use it to train the model.
Build the model: it's important to decide which learning type is the most appropriate. Some problems require classification, others require regression, and so on. After choosing the most suitable learning type, you can try a related algorithm to build a model that solves the problem.
Evaluate the model: use metrics suited to the task to check that your model is effective and performs well on data it did not see during training.
Improve your model: this can be done by hyperparameter tuning. In this step, you adjust the settings that control how the model learns (its hyperparameters) to improve its performance.
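The build, evaluate, and improve steps above can be sketched end to end with a small k-nearest-neighbors classifier. The data points are invented, accuracy is used as the metric, and the number of neighbors k is the hyperparameter being tuned. Keeping a separate validation set is an assumption of this sketch, though it is a common way to check a model on data it was not trained on.

```python
from collections import Counter

# Hypothetical, already cleaned data: ((feature 1, feature 2), label).
# The last training point is deliberately noisy.
train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
    ((6.0, 6.0), "B"), ((6.5, 7.0), "B"), ((7.0, 6.5), "B"),
    ((5.5, 5.5), "A"),
]
validation = [
    ((1.2, 1.4), "A"), ((2.2, 1.8), "A"),
    ((6.2, 6.4), "B"), ((5.6, 5.7), "B"),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(point, k):
    """Build the model: majority vote among the k nearest training points."""
    nearest = sorted(train, key=lambda item: squared_distance(item[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(k):
    """Evaluate the model: share of validation points predicted correctly."""
    correct = sum(predict(point, k) == label for point, label in validation)
    return correct / len(validation)


# Improve the model: try several values of k and keep the best one.
accuracy_by_k = {k: accuracy(k) for k in (1, 3, 5)}
best_k = max(accuracy_by_k, key=accuracy_by_k.get)
print("validation accuracy for each k:", accuracy_by_k)
print("chosen k:", best_k)
```

With k = 1 the model is misled by the single noisy training point, while larger values of k smooth that noise out. This is the kind of effect hyperparameter tuning is meant to find.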