Table 1 Comparison of different supervised learning machine learning methods

From: Healthcare predictive analytics using machine learning and deep learning techniques: a survey

Method

Advantages

Disadvantages

Linear regression

• Linear regression models are easy to understand for beginners

• Training linear regression models is fast, even on large datasets

• Linear regression models can forecast, classify, and predict

• Regularization can reduce overfitting

• Linearity: Linear regression models assume a linear relationship between the independent and dependent variables, so they cannot capture nonlinear relationships

• It is not recommended for most practical applications, as it greatly oversimplifies real-world problems

Logistic regression

• Excellent performance with small datasets

• Its output is interpretable as probability

• The data must satisfy the model's assumptions

• It only offers linear solutions

Decision trees

• They can manage categorical characteristics

• There are few parameters to tune

• They perform well with large feature-count datasets

• The interpretability of the ensemble is questionable

Random forest

• Even with noisy or imbalanced data, random forest can achieve high accuracy

• Robustness to overfitting: random forest generalizes well to new data

• Interpretability: Random forest models are easy to understand

• Random forest scales to large datasets

• Computational complexity: Random forest training is computationally expensive, especially for large datasets

• Sensitivity to hyperparameters: random forest performance can be sensitive to hyperparameters

Support vector machine

• High-dimensional space for input

• Few irrelevant features

• Document vectors are sparse

• Data collection is time-consuming

K-nearest neighbors

• Simple algorithm

• The user must specify the number of neighbors

• Relatively high computational complexity

Naive Bayes

• Simple and straightforward method

• It combines effectiveness and reasonable precision

• Used primarily when the size of the training set is smaller

• It assumes the conditional independence of linguistic features
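
Two of the K-nearest neighbors entries above (the user must specify the number of neighbors, and prediction is computationally costly) can be illustrated with a minimal sketch. This is not taken from the survey: the function name knn_predict, the toy points, and the labels are all hypothetical, and only the Python standard library is assumed.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k):
    """Classify `query` by majority vote among its k nearest training points.

    Hypothetical helper for illustration only; `k` is the user-chosen
    number of neighbors mentioned in the table.
    """
    if not 1 <= k <= len(train_points):
        raise ValueError("k must be between 1 and the number of training points")
    # Every prediction measures the distance to every training point,
    # which is why the cost per query grows with the size of the dataset.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Majority vote over the k closest points.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy, made-up data: three "low" points near (1, 1), two "high" points near (3, 3).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (3.0, 3.0), (3.2, 2.9)]
labels = ["low", "low", "low", "high", "high"]

pred_k1 = knn_predict(points, labels, (2.6, 2.6), k=1)
pred_k5 = knn_predict(points, labels, (2.6, 2.6), k=5)
print(pred_k1, pred_k5)
```

With this toy data the same query is labeled differently for k=1 and k=5, because with k=5 every training point votes and the larger class wins. That is one reason the choice of k is usually tuned on held-out data rather than fixed in advance.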