Healthcare predictive analytics using machine learning and deep learning techniques: a survey

Badawy, Mohammed; Ramadan, Nagy; Hefny, Hesham Ahmed

doi:10.1186/s43067-023-00108-y

Journal of Electrical Systems and Information Technology

Table 1 Comparison of different supervised machine learning methods

Method	Advantages	Disadvantages
Linear regression	• Linear regression models are easy for beginners to understand • Training is fast, even on large datasets • Linear regression models can forecast, classify, and predict • Overfitting can be reduced through regularization	• Linearity: linear regression models assume a linear relationship between the independent and dependent variables, so they cannot capture nonlinear relationships • It is not recommended for most practical applications, as it greatly simplifies real-world problems
Logistic regression	• Excellent performance with small datasets • Its output is interpretable as a probability	• The data must satisfy the model's assumptions • It only offers linear solutions
Decision trees	• They can handle categorical features • There are few parameters to tune • They perform well on datasets with many features	• The interpretability of the ensemble is questionable
Random forest	• Even with noisy or imbalanced data, random forest can achieve high accuracy • Robustness to overfitting: random forest generalizes well to new data • Interpretability: random forest models are easy to understand • Random forest scales to large datasets	• Computational complexity: training is computationally expensive, especially for large datasets • Sensitivity to hyperparameters: performance can depend strongly on hyperparameter settings
Support vector machine	• Handles high-dimensional input spaces • Works well when few features are irrelevant • Works well with sparse document vectors	• Data collection is time-consuming
K-nearest neighbors	• Simple algorithm	• The user must specify the number of neighbors • Relatively high computational complexity
Naive Bayes	• Simple and straightforward method • It combines effectiveness and reasonable precision	• Used primarily when the training set is small • It assumes the conditional independence of linguistic features
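
The two K-nearest neighbors drawbacks listed in the table (a user-chosen number of neighbors, and the computational cost) can be illustrated with a minimal sketch. This is not code from the survey: the function name `knn_predict`, the toy points, and the labels "low" and "high" are all hypothetical and chosen only for illustration. The sketch uses Euclidean distance and a simple majority vote, which is one common formulation among several.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    if not 1 <= k <= len(train_points):
        raise ValueError("k must be between 1 and the number of training points")
    # Each prediction measures the distance from the query to every training
    # point, so the per-query cost grows with the size of the training set.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Count the labels of the k closest points and return the most frequent one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two features per sample, two classes.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (3.0, 3.0), (3.2, 2.9), (2.0, 2.0)]
train_labels = ["low", "low", "low", "high", "high", "high"]
query = (1.9, 1.9)

# The same query receives a different label depending on the chosen k.
for k in (1, 3):
    print(k, knn_predict(train_points, train_labels, query, k))
```

With this toy data the single nearest neighbor is labeled "high", while the three nearest neighbors vote "low", so the prediction depends on a value the user has to pick. In practice, k is usually selected by validation on held-out data rather than fixed by hand.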
