Description
Section 1: Parametric Methods

Chapter 1: An Introduction to Simple Linear Regression
Chapter goal: Introduces the reader to parametric methods and the underlying assumptions of regression.
Subtopics:
• Regression assumptions.
• Detecting missing values.
• Descriptive analysis.
• Understand correlation.
  o Plot Pearson correlation matrix.
• Determine covariance.
  o Plot covariance matrix.
• Create and reshape arrays.
• Split data into training and test data.
• Normalize data.
• Find best hyper-parameters for a model.
• Build your own model.
• Review model performance.
  o Mean Absolute Error.
  o Mean Squared Error.
  o Root Mean Squared Error.
  o R-squared.
  o Plotting actual values vs. predicted values.
• Residual diagnosis.
  o Normal Q-Q plot.
  o Cook’s D influence plot.
  o Plotting predicted values vs. residual values.
  o Plotting fitted values vs. residual values.
  o Plotting leverage values vs. residual values.
  o Plotting fitted values vs. studentized residual values.
  o Plotting leverage values vs. studentized residual values.

Chapter 2: Advanced Parametric Methods
Chapter goal: Highlights methods of dealing with the problem of under-fitting and over-fitting.
Subtopics:
• Issue of multi-collinearity.
• Explore methods of dealing with the problem of under-fitting and over-fitting.
• Understand Ridge, RidgeCV and Lasso regression models.
• Find best hyper-parameters for a model.
• Build regularized models.
• Compare performance of different regression methods.
  o Mean Absolute Error.
  o Mean Squared Error.
  o Root Mean Squared Error.
  o R-squared.
  o Plotting actual values vs. predicted values.

Chapter 3: Time Series Analysis
Chapter goal: Covers a model for identifying trends and patterns in sequential data and how to forecast a series.
• What is time series analysis?
• Underlying assumptions of time series analysis.
• Different types of time series analysis models.
• The ARIMA model.
• Test of stationarity.
  o Conduct an Augmented Dickey-Fuller (ADF) test.
• Test of white noise.
• Test of correlation.
  o Plot lag plot.
  o Plot lag vs. autocorrelation plot.
  o Plot ACF.
  o Plot PACF.
• Understand trends and seasonality.
  o Plot seasonal components.
• Smooth a time series using moving average, standard deviation and exponential techniques.
  o Plot smoothed time series.
• Determine rate of return and rolling rate of return.
• Determine parameters of ARIMA model.
• Build ARIMA model.
• Forecast ARIMA.
  o Plot forecast.
• Residual diagnosis.

Chapter 4: High Quality Time Series
Chapter goal: Explores Prophet for better series forecasting.
• Difference between statsmodels and Prophet.
• Understand components in Prophet.
• Data preprocessing.
• Develop a model using Prophet.
• Forecast a series.
  o Plot forecast.
  o Plot seasonal components.
• Evaluate model performance using Prophet.

Chapter 5: Logistic Regression
Chapter goal: Introduces the reader to logistic regression, a powerful classification model.
Subtopics:
• Find missing values.
• Understand correlation.
  o Plotting Pearson correlation matrix.
• Determine covariance.
  o Plotting covariance matrix.
• PCA for dimension reduction.
  o Plotting scree plot.
• Normalize data.
• Hyper-parameter tuning.
• Create a pipeline.
• Develop a Logit model.
• Model evaluation.
  o Tabulate classification report.
  o Tabulate confusion matrix.
  o Plot ROC curve.
  o Find AUC.
  o Plot Precision-Recall curve.
  o Find APS.
  o Plot learning curve.

Chapter 6: Dimension Reduction and Multivariate Analysis Using Linear Discriminant
Chapter goal: Discusses the difference between linear discriminant analysis and logistic regression, and how linear discriminant analysis can be used for purposes other than classification.
Subtopics:
• Difference between logistic regression and discriminant analysis.
• Purpose of discriminant analysis.
• Model fitting.
• Model evaluation.
  o Tabulate classification report.
  o Tabulate confusion matrix.
  o Plot ROC curve.
  o Find AUC.
  o Plot Precision-Recall curve.
  o Find APS.
  o Plot learning curve.

Section 2: Ensemble Methods

Chapter 7: Finding Hyperplanes Using Support Vector Machine
Chapter goal: Highlights ways of finding hyperplanes using a linear support vector machine (SVM), including its pros and cons.
• Understand support vector machine.
• Find hyperplanes using SVM.
• Scenarios in which SVM performs better.
• Disadvantages of SVM.
• Model fitting.
• Model evaluation.
  o Tabulate classification report.
  o Tabulate confusion matrix.
  o Plot ROC curve.
  o Find AUC.
  o Plot Precision-Recall curve.
  o Find APS.
  o Plot learning curve.

Chapter 8: Classification Using Decision Tree
Chapter goal: Explores how decision trees are formed and how to visualize them.
Subtopics:
• Discussing entropy.
• Information gain.
• Structure of decision trees.
• Visualizing decision trees.
• Model fitting.
• Model evaluation.
  o Tabulating classification report.
  o Tabulating confusion matrix.
  o Plotting ROC curve.
  o Finding AUC.
  o Plotting Precision-Recall curve.
  o Finding APS.
  o Plotting learning curve.

Chapter 9: Back to the Classic
Chapter goal: Gives an overview of the classical Naive Bayes algorithm and explains why it is still relevant to this day.
Subtopics:
• The Naive Bayes theorem.
• Unpacking Gaussian Naive Bayes.
• Model fitting.
• Hyper-parameter tuning.
• Create a pipeline.
• Model evaluation.
  o Tabulate classification report.
  o Tabulate confusion matrix.
  o Plot ROC curve.
  o Find AUC.
  o Plot Precision-Recall curve.
  o Find APS.
  o Plot learning curve.

Section 3: Non-Parametric Methods

Chapter 10: Finding Similarities and Dissimilarities Using Cluster Analysis
Chapter goal: Explains clustering and explores three main clustering algorithms (K-Means, Agglomerative and DBSCAN).
• An introduction to cluster analysis.
• Types of clustering algorithms.
• Normalize data.
• Dimension reduction using PCA.
  o Finding number of components.
• Find number of clusters.
  o Elbow curve.
• Clustering using K-Means.
• Fit K-Means model.
• Plot K-Means clusters.
• Clustering using Agglomerative algorithm.
  o Techniques of calculating similarities/dissimilarities.
• Fit Agglomerative.
• Plot Agglomerative clusters.
• Clustering using Density-Based Spatial Clustering of Applications with Noise (DBSCAN).
• Fit DBSCAN.
• Plot DBSCAN clusters.

Chapter 11: Survival Analysis
Chapter goal: Provides an overview of survival analysis (a method commonly used in the medical and insurance industries) by detailing a commonly used estimator, the Kaplan-Meier fitter.
Subtopics:
• Create a survival table.
• The survival function.
• An introduction to the Kaplan-Meier estimator.
• Finding confidence intervals.
• Tabulating cumulative density estimates.
• Tabulating survival function estimates.
• Plotting survival curve.
• Plotting cumulative density.
• Model evaluation.

Chapter 12: Neural Networks
Chapter goal: Discusses the fundamentals of neural networks and ways of optimizing networks for better accuracy.
Subtopics:
• Forward propagation.
• Backward propagation.
• Forward pass.
• Backward pass.
• Cost function.
• Gradient.
• The vanishing gradient problem.
• Other functions.
• Optimizing networks.
• Bernoulli Restricted Boltzmann Machine.
• Multi-Layer Perceptron.
• Regularizing networks.
• Dropping layers.
• Model evaluation.
  o Tabulate classification report.
  o Tabulate confusion matrix.
  o Plot ROC curve.
  o Find AUC.
  o Plot Precision-Recall curve.
  o Find APS.
  o Plot training and validation loss across epochs.
  o Plot training and validation accuracy across epochs.

Chapter 13: Driverless AI Using H2O
Chapter goal: Covers a new library that helps organizations accelerate their adoption of AI.
• How H2O works.
• Data processing.
• Model training.
• Model evaluation.
• AutoML.




