Machine Learning with R Course

Details

Machine learning is a branch of computer science that uses algorithms to help artificially intelligent systems learn and adapt. That learning process requires data and a programming language to process it. R is a language built for data science, with many packages for machine learning that are ready for you to use.

In this course you will learn:

Key concepts and terms for machine learning
To remove noise in data via smoothing
To assess the effectiveness of your model with cross-validation
To integrate model development using the caret package
Useful algorithms for important tasks
To apply machine learning to real-world applications
How to work with large data sets

Delivery Methods

Delivery Method	Duration
Classroom: Instructor-led classroom-based training. Classes are scheduled at various conference centres in the Sandton area or at your premises. Stationery and printed manuals or online resources are included. Refreshments, including two tea breaks and a cooked lunch, are provided. Training runs from 9 am to 4 pm.	Days
Live Virtual Training: This course is delivered live via Microsoft Teams. You will be able to see and hear the instructor, view their whiteboard, and ask questions or communicate in the chat. Attend from the comfort of your own home or private office.	Days

Discounts Available

Information may change without notice.

Audience

Data scientists
Programmers
Business analysts
Engineers
Scientists

Pre-Requisites

Leading Training's Introduction to R Programming or equivalent knowledge

Course Outline / Curriculum

Introduction to machine learning

Notation
An example
Exercises
Evaluation metrics
- Training and test sets
- Overall accuracy
- The confusion matrix
- Sensitivity and specificity
- Balanced accuracy and F1 score
- Prevalence matters in practice
- ROC and precision-recall curves
- The loss function
Exercises
Conditional probabilities and expectations
- Conditional probabilities
- Conditional expectations
- Conditional expectation minimizes the squared loss function
Exercises
Case study: is it a 2 or a 7?

Smoothing

Bin smoothing
Kernels
Local weighted regression (loess)
- Fitting parabolas
- Beware of default smoothing parameters
Connecting smoothing to machine learning
Exercises

Cross validation

Motivation with k-nearest neighbors
- Over-training
- Over-smoothing
- Picking the k in kNN
Mathematical description of cross validation
K-fold cross validation
Exercises
Bootstrap
Exercises

The caret package

The caret train function
Cross validation
Example: fitting with loess

Examples of algorithms

Linear regression
- The predict function
Exercises
Logistic regression
- Generalized linear models
- Logistic regression with more than one predictor
Exercises
k-nearest neighbors
Exercises
Generative models
- Naive Bayes
- Controlling prevalence
- Quadratic discriminant analysis
- Linear discriminant analysis
- Connection to distance
Case study: more than three classes
Exercises
Classification and regression trees (CART)
- The curse of dimensionality
- CART motivation
- Regression trees
- Classification (decision) trees
Random forests
Exercises

Machine learning in practice

Preprocessing
k-nearest neighbor and random forest
Variable importance
Visual assessments
Ensembles
Exercises

Large datasets

Matrix algebra
- Notation
- Converting a vector to a matrix
- Row and column summaries
- apply
- Filtering columns based on summaries
- Indexing with matrices
- Binarizing the data
- Vectorization for matrices
- Matrix algebra operations
Exercises
Distance
- Euclidean distance
- Distance in higher dimensions
- Euclidean distance example
- Predictor space
- Distance between predictors
Exercises
Dimension reduction
- Preserving distance
- Linear transformations (advanced)
- Orthogonal transformations (advanced)
- Principal component analysis
- Iris example
- MNIST example
Exercises
Recommendation systems
- Movielens data
- Recommendation systems as a machine learning challenge
- Loss function
- A first model
- Modeling movie effects
- User effects
Exercises
Regularization
- Motivation
- Penalized least squares
- Choosing the penalty terms
Exercises
Matrix factorization
- Factors analysis
- Connection to SVD and PCA
Exercises

Clustering

Hierarchical clustering
k-means
Heatmaps
Filtering features
Exercises

Schedule Dates and Booking

There are currently no scheduled dates.
