Booking options
£134.99
On-Demand course
7 hours 42 minutes
All levels
The underlying patterns in your data hold vital insights; unearth them with cutting-edge clustering and classification techniques in R
This course is a complete guide to both supervised and unsupervised learning in R. It covers all the main aspects of practical data science, so you should not need other courses or books on R-based data science to get started. In this age of big data, companies across the globe use R to sift through the avalanche of information at their disposal. By becoming proficient in unsupervised and supervised learning in R, you can give your company a competitive edge and take your career to the next level.
Over the course of her research, the author realized that almost all the R data science courses and books out there do not take account of the multidimensional nature of the topic. This course gives you a robust grounding in the main aspects of machine learning: clustering and classification. Unlike other R instructors, the author digs deep into R's machine learning features and gives you a one-of-a-kind grounding in data science. You will go all the way from reading and cleaning data to implementing powerful machine learning algorithms and evaluating their performance in R.
The following topics will be covered:
• A full introduction to the R framework for data science
• Data structures and reading data into R, including CSV, Excel, and HTML data
• How to pre-process and clean data (removing NAs/no data) and visualize it
• Machine learning, supervised learning, and unsupervised learning in R
• Model building and selection, and much more
The course will help you implement methods using real data obtained from different sources. Many courses use made-up data that does not empower students to implement R-based data science in real life. After taking this course, you'll be able to use data science packages such as caret to work with real data in R. You'll also understand concepts such as unsupervised learning, dimension reduction, and supervised learning.
All the code and supporting files for this course are available at https://github.com/PacktPublishing/Clustering-and-Classification-with-Machine-Learning-in-R
Read data into the R environment from different sources
Carry out basic data pre-processing and wrangling in RStudio
Implement unsupervised/clustering techniques such as K-means clustering
Implement dimension reduction techniques (PCA) and feature selection
Implement supervised learning techniques/classification such as Random Forests
Evaluate model performance and learn the best practices for evaluating machine learning model accuracy
This course is for students interested in getting started with data science applications in the RStudio environment, students who want to learn how to implement unsupervised learning on real data, and anyone with prior exposure to R who wants to get started with practical data science.
Every video is packed with hands-on instructions and clear explanations. Real data is used throughout to demonstrate how to apply these techniques to your own data.
* Provides in-depth training in everything you need to know to get started with practical R data science
* Jargon-free and suitable for people who have a non-mathematical background
* In-depth coverage of the latest unsupervised and supervised techniques
Minerva Singh is a PhD graduate from Cambridge University where she specialized in Tropical Ecology. She is also a part-time Data Scientist. As part of her research, she must carry out extensive data analysis, including spatial data analysis. For this purpose, she prefers to use a combination of freeware tools: R, QGIS, and Python. She does most of her spatial data analysis work using R and QGIS. Apart from being free, these are very powerful tools for data visualization, processing, and analysis. She also holds an MPhil degree in Geography and Environment from Oxford University. She has honed her statistical and data analysis skills through several MOOCs, including The Analytics Edge and Statistical. In addition to spatial data analysis, she is also proficient in statistical analysis, machine learning, and data mining.
1. Introduction to the Course
1. Welcome to Clustering & Classification with Machine Learning in R
2. Installing R and R Studio
2. Read in Data from Different Sources in R
1. Read in CSV & Excel Data
2. Read in Unzipped Folder
3. Read in Online CSV
4. Read in Googlesheets
5. Read in Data from Online HTML Tables - Part 1
6. Read in Data from Online HTML Tables - Part 2
7. Read Data from a Database
3. Data Pre-processing and Visualization
1. Remove Missing Values
2. More Data Cleaning
3. Introduction to dplyr for Data Summarizing - Part 1
4. Introduction to dplyr for Data Summarizing - Part 2
5. Exploratory Data Analysis (EDA): Basic Visualizations with R
6. More Exploratory Data Analysis with xda
7. Data Exploration & Visualization With dplyr & ggplot2
8. Associations Between Quantitative Variables - Theory
9. Testing for Correlation
10. Evaluate the Relation Between Nominal Variables
11. Cramer's V for Examining the Strength of Association Between Nominal Variables
4. Machine Learning for Data Science
1. How is Machine Learning Different from Statistical Data Analysis?
2. What is Machine Learning (ML) About? Some Theoretical Pointers
5. Unsupervised Learning in R
1. K-Means Clustering
2. Other Ways of Selecting Cluster Numbers
3. Fuzzy K-Means Clustering
4. Weighted K-Means
5. Partitioning Around Medoids (PAM)
6. Hierarchical Clustering in R
7. Expectation-Maximization (EM) in R
8. DBSCAN Clustering in R
9. Cluster a Mixed Dataset
10. Should We Even Do Clustering?
11. Assess Clustering Performance
12. Which Clustering Algorithm to Choose?
6. Feature/Dimension Reduction
1. Dimension Reduction - Theory
2. Principal Component Analysis (PCA)
3. More on PCA
4. Multidimensional Scaling
5. Singular Value Decomposition (SVD)
7. Feature Selection to Select the Most Relevant Predictors
1. Removing Highly Correlated Predictor Variables
2. Variable Selection Using LASSO Regression
3. Variable Selection with FSelector
4. Boruta Analysis for Feature Selection
8. Supervised Learning Theory
1. Some Basic Supervised Learning Concepts
2. Pre-processing for Supervised Learning
9. Supervised Learning: Classification
1. What are GLMs?
2. Logistic Regression Models as Binary Classifiers
3. Binary Classifier with PCA
4. Some Pointers on Evaluating Accuracy
5. Obtain Binary Classification Accuracy Metrics
6. More on Binary Accuracy Measures
7. Linear Discriminant Analysis
8. Our Multi-class Classification Problem
9. Classification Trees
10. More on Classification Tree Visualization
11. Classification with Party Package
12. Decision Trees
13. Random Forest (RF) Classification
14. Examine Individual Variable Importance for Random Forests
15. GBM Classification
16. Support Vector Machines (SVM) for Classification
17. More SVM for Classification
18. Variable Importance in SVM Modelling with rminer
10. Additional Lectures
1. Fuzzy C-Means Clustering