Duration 4 Days 24 CPD hours This course is intended for The workshop is designed for data scientists who currently use Python or R to work with smaller datasets on a single machine and who need to scale up their analyses and machine learning models to large datasets on distributed clusters. Data engineers and developers with some knowledge of data science and machine learning may also find this workshop useful. Overview Overview of data science and machine learning at scale Overview of the Hadoop ecosystem Working with HDFS data and Hive tables using Hue Introduction to Cloudera Data Science Workbench Overview of Apache Spark 2 Reading and writing data Inspecting data quality Cleansing and transforming data Summarizing and grouping data Combining, splitting, and reshaping data Exploring data Configuring, monitoring, and troubleshooting Spark applications Overview of machine learning in Spark MLlib Extracting, transforming, and selecting features Building and evaluating regression models Building and evaluating classification models Building and evaluating clustering models Cross-validating models and tuning hyperparameters Building machine learning pipelines Deploying machine learning models Spark, Spark SQL, and Spark MLlib PySpark and sparklyr Cloudera Data Science Workbench (CDSW) Hue This workshop covers data science and machine learning workflows at scale using Apache Spark 2 and other key components of the Hadoop ecosystem. The workshop emphasizes the use of data science and machine learning methods to address real-world business challenges. Using scenarios and datasets from a fictional technology company, students discover insights to support critical business decisions and develop data products to transform the business. The material is presented through a sequence of brief lectures, interactive demonstrations, extensive hands-on exercises, and discussions. The Apache Spark demonstrations and exercises are conducted in Python (with PySpark) and R (with sparklyr) using the Cloudera Data Science Workbench (CDSW) environment. The workshop is designed for data scientists who currently use Python or R to work with smaller datasets on a single machine and who need to scale up their analyses and machine learning models to large datasets on distributed clusters. Data engineers and developers with some knowledge of data science and machine learning may also find this workshop useful. Overview of data science and machine learning at scaleOverview of the Hadoop ecosystemWorking with HDFS data and Hive tables using HueIntroduction to Cloudera Data Science WorkbenchOverview of Apache Spark 2Reading and writing dataInspecting data qualityCleansing and transforming dataSummarizing and grouping dataCombining, splitting, and reshaping dataExploring dataConfiguring, monitoring, and troubleshooting Spark applicationsOverview of machine learning in Spark MLlibExtracting, transforming, and selecting featuresBuilding and evauating regression modelsBuilding and evaluating classification modelsBuilding and evaluating clustering modelsCross-validating models and tuning hyperparametersBuilding machine learning pipelinesDeploying machine learning models Additional course details: Nexus Humans Cloudera Data Scientist Training training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you're stepping into the realm of professional skills or a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success. While we feel this is the best course for the Cloudera Data Scientist Training course and one of our Top 10 we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.
Duration 3 Days 18 CPD hours This course is intended for This course is geared for Python-experienced attendees who wish to be equipped with the skills you need to use pandas to ensure the veracity of your data, visualize it for effective decision-making, and reliably reproduce analyses across multiple datasets. Overview Working in a hands-on learning environment, guided by our expert team, attendees will learn to: Understand how data analysts and scientists gather and analyze data Perform data analysis and data wrangling using Python Combine, group, and aggregate data from multiple sources Create data visualizations with pandas, matplotlib, and seaborn Apply machine learning (ML) algorithms to identify patterns and make predictions Use Python data science libraries to analyze real-world datasets Use pandas to solve common data representation and analysis problems Build Python scripts, modules, and packages for reusable analysis code Perform efficient data analysis and manipulation tasks using pandas Apply pandas to different real-world domains with the help of step-by-step demonstrations Get accustomed to using pandas as an effective data exploration tool. Data analysis has become a necessary skill in a variety of domains where knowing how to work with data and extract insights can generate significant value. Geared for data team members with incoming Python scripting experience, Hands-On Data Analysis with Pandas will show you how to analyze your data, get started with machine learning, and work effectively with Python libraries often used for data science, such as pandas, NumPy, matplotlib, seaborn, and scikit-learn. Using real-world datasets, you will learn how to use the powerful pandas library to perform data wrangling to reshape, clean, and aggregate your data. Then, you will be able to conduct exploratory data analysis by calculating summary statistics and visualizing the data to find patterns. In the concluding lessons, you will explore some applications of anomaly detection, regression, clustering, and classification using scikit-learn to make predictions based on past data. Students will leave the course armed with the skills required to use pandas to ensure the veracity of their data, visualize it for effective decision-making, and reliably reproduce analyses across multiple datasets. Introduction to Data Analysis Fundamentals of data analysis Statistical foundations Setting up a virtual environment Working with Pandas DataFrames Pandas data structures Bringing data into a pandas DataFrame Inspecting a DataFrame object Grabbing subsets of the data Adding and removing data Data Wrangling with Pandas What is data wrangling? Collecting temperature data Cleaning up the data Restructuring the data Handling duplicate, missing, or invalid data Aggregating Pandas DataFrames Database-style operations on DataFrames DataFrame operations Aggregations with pandas and numpy Time series Visualizing Data with Pandas and Matplotlib An introduction to matplotlib Plotting with pandas The pandas.plotting subpackage Plotting with Seaborn and Customization Techniques Utilizing seaborn for advanced plotting Formatting Customizing visualizations Financial Analysis - Bitcoin and the Stock Market Building a Python package Data extraction with pandas Exploratory data analysis Technical analysis of financial instruments Modeling performance Rule-Based Anomaly Detection Simulating login attempts Exploratory data analysis Rule-based anomaly detection Getting Started with Machine Learning in Python Learning the lingo Exploratory data analysis Preprocessing data Clustering Regression Classification Making Better Predictions - Optimizing Models Hyperparameter tuning with grid search Feature engineering Ensemble methods Inspecting classification prediction confidence Addressing class imbalance Regularization Machine Learning Anomaly Detection Exploring the data Unsupervised methods Supervised methods Online learning The Road Ahead Data resources Practicing working with data Python practice
Duration 3 Days 18 CPD hours This course is intended for Senior Executives CIOs and CTOs Business Intelligence Executives Marketing Executives Data & Business Analytics Specialists Innovation Specialists & Entrepreneurs Academics, and other people interested in Big Data Overview More specifically, BDAW addresses advanced big data architecture topics, including, data formats, transformation, real-time, batch and machine learning processing, scalability, fault tolerance, security and privacy, minimizing the risk of an unsound architecture and technology selection. Big Data Architecture Workshop (BDAW) is a learning event that addresses advanced big data architecture topics. BDAW brings together technical contributors into a group setting to design and architect solutions to a challenging business problem. The workshop addresses big data architecture problems in general, and then applies them to the design of a challenging system. Throughout the highly interactive workshop, students apply concepts to real-world examples resulting in detailed synergistic discussions. The workshop is conducive for students to learn techniques for architecting big data systems, not only from Cloudera?s experience but also from the experiences of fellow students. Workshop Application Use Cases Oz Metropolitan Architectural questions Team activity: Analyze Metroz Application Use Cases Application Vertical Slice Definition Minimizing risk of an unsound architecture Selecting a vertical slice Team activity: Identify an initial vertical slice for Metroz Application Processing Real time, near real time processing Batch processing Data access patterns Delivery and processing guarantees Machine Learning pipelines Team activity: identify delivery and processing patterns in Metroz, characterize response time requirements, identify Machine Learning pipelines Application Data Three V?s of Big Data Data Lifecycle Data Formats Transforming Data Team activity: Metroz Data Requirements Scalable Applications Scale up, scale out, scale to X Determining if an application will scale Poll: scalable airport terminal designs Hadoop and Spark Scalability Team activity: Scaling Metroz Fault Tolerant Distributed Systems Principles Transparency Hardware vs. Software redundancy Tolerating disasters Stateless functional fault tolerance Stateful fault tolerance Replication and group consistency Fault tolerance in Spark and Map Reduce Application tolerance for failures Team activity: Identify Metroz component failures and requirements Security and Privacy Principles Privacy Threats Technologies Team activity: identify threats and security mechanisms in Metroz Deployment Cluster sizing and evolution On-premise vs. Cloud Edge computing Team activity: select deployment for Metroz Technology Selection HDFS HBase Kudu Relational Database Management Systems Map Reduce Spark, including streaming, SparkSQL and SparkML Hive Impala Cloudera Search Data Sets and Formats Team activity: technologies relevant to Metroz Software Architecture Architecture artifacts One platform or multiple, lambda architecture Team activity: produce high level architecture, selected technologies, revisit vertical slice Vertical Slice demonstration Additional course details: Nexus Humans Big Data Architecture Workshop training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you're stepping into the realm of professional skills or a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success. While we feel this is the best course for the Big Data Architecture Workshop course and one of our Top 10 we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.
Duration 3 Days 18 CPD hours This course is intended for Data Wrangling with Python takes a practical approach to equip beginners with the most essential data analysis tools in the shortest possible time. It contains multiple activities that use real-life business scenarios for you to practice and apply your new skills in a highly relevant context. Overview By the end of this course, you will be confident in using a diverse array of sources to extract, clean, transform, and format your data efficiently. In this course you will start with the absolute basics of Python, focusing mainly on data structures. Then you will delve into the fundamental tools of data wrangling like NumPy and Pandas libraries. You'll explore useful insights into why you should stay away from traditional ways of data cleaning, as done in other languages, and take advantage of the specialized pre-built routines in Python.This combination of Python tips and tricks will also demonstrate how to use the same Python backend and extract/transform data from an array of sources including the Internet, large database vaults, and Excel financial tables. To help you prepare for more challenging scenarios, you'll cover how to handle missing or wrong data, and reformat it based on the requirements from the downstream analytics tool. The course will further help you grasp concepts through real-world examples and datasets. Introduction to Data Structure using Python Python for Data Wrangling Lists, Sets, Strings, Tuples, and Dictionaries Advanced Operations on Built-In Data Structure Advanced Data Structures Basic File Operations in Python Introduction to NumPy, Pandas, and Matplotlib NumPy Arrays Pandas DataFrames Statistics and Visualization with NumPy and Pandas Using NumPy and Pandas to Calculate Basic Descriptive Statistics on the DataFrame Deep Dive into Data Wrangling with Python Subsetting, Filtering, and Grouping Detecting Outliers and Handling Missing Values Concatenating, Merging, and Joining Useful Methods of Pandas Get Comfortable with a Different Kind of Data Sources Reading Data from Different Text-Based (and Non-Text-Based) Sources Introduction to BeautifulSoup4 and Web Page Parsing Learning the Hidden Secrets of Data Wrangling Advanced List Comprehension and the zip Function Data Formatting Advanced Web Scraping and Data Gathering Basics of Web Scraping and BeautifulSoup libraries Reading Data from XML RDBMS and SQL Refresher of RDBMS and SQL Using an RDBMS (MySQL/PostgreSQL/SQLite) Application in real life and Conclusion of course Applying Your Knowledge to a Real-life Data Wrangling Task An Extension to Data Wrangling
Duration 4 Days 24 CPD hours This course is intended for This course is designed for data analysts, business intelligence specialists, developers, system architects, and database administrators. Overview Skills gained in this training include:The features that Pig, Hive, and Impala offer for data acquisition, storage, and analysisThe fundamentals of Apache Hadoop and data ETL (extract, transform, load), ingestion, and processing with HadoopHow Pig, Hive, and Impala improve productivity for typical analysis tasksJoining diverse datasets to gain valuable business insightPerforming real-time, complex queries on datasets Cloudera University?s four-day data analyst training course focusing on Apache Pig and Hive and Cloudera Impala will teach you to apply traditional data analytics and business intelligence skills to big data. Hadoop Fundamentals The Motivation for Hadoop Hadoop Overview Data Storage: HDFS Distributed Data Processing: YARN, MapReduce, and Spark Data Processing and Analysis: Pig, Hive, and Impala Data Integration: Sqoop Other Hadoop Data Tools Exercise Scenarios Explanation Introduction to Pig What Is Pig? Pig?s Features Pig Use Cases Interacting with Pig Basic Data Analysis with Pig Pig Latin Syntax Loading Data Simple Data Types Field Definitions Data Output Viewing the Schema Filtering and Sorting Data Commonly-Used Functions Processing Complex Data with Pig Storage Formats Complex/Nested Data Types Grouping Built-In Functions for Complex Data Iterating Grouped Data Multi-Dataset Operations with Pig Techniques for Combining Data Sets Joining Data Sets in Pig Set Operations Splitting Data Sets Pig Troubleshoot & Optimization Troubleshooting Pig Logging Using Hadoop?s Web UI Data Sampling and Debugging Performance Overview Understanding the Execution Plan Tips for Improving the Performance of Your Pig Jobs Introduction to Hive & Impala What Is Hive? What Is Impala? Schema and Data Storage Comparing Hive to Traditional Databases Hive Use Cases Querying with Hive & Impala Databases and Tables Basic Hive and Impala Query Language Syntax Data Types Differences Between Hive and Impala Query Syntax Using Hue to Execute Queries Using the Impala Shell Data Management Data Storage Creating Databases and Tables Loading Data Altering Databases and Tables Simplifying Queries with Views Storing Query Results Data Storage & Performance Partitioning Tables Choosing a File Format Managing Metadata Controlling Access to Data Relational Data Analysis with Hive & Impala Joining Datasets Common Built-In Functions Aggregation and Windowing Working with Impala How Impala Executes Queries Extending Impala with User-Defined Functions Improving Impala Performance Analyzing Text and Complex Data with Hive Complex Values in Hive Using Regular Expressions in Hive Sentiment Analysis and N-Grams Conclusion Hive Optimization Understanding Query Performance Controlling Job Execution Plan Bucketing Indexing Data Extending Hive SerDes Data Transformation with Custom Scripts User-Defined Functions Parameterized Queries Choosing the Best Tool for the Job Comparing MapReduce, Pig, Hive, Impala, and Relational Databases Which to Choose?
Duration 2 Days 12 CPD hours This course is intended for If you are a data analyst, data scientist, or a business analyst who wants to get started with using Python and machine learning techniques to analyze data and predict outcomes, this book is for you. Basic knowledge of computer programming and data analytics is a must. Familiarity with mathematical concepts such as algebra and basic statistics will be useful. Overview By the end of this course, you will have the skills you need to confidently use various machine learning algorithms to perform detailed data analysis and extract meaningful insights from data. This course is designed to give you practical guidance on industry-standard data analysis and machine learning tools in Python, with the help of realistic data. The course will help you understand how you can use pandas and Matplotlib to critically examine a dataset with summary statistics and graphs, and extract the insights you seek to derive. You will continue to build on your knowledge as you learn how to prepare data and feed it to machine learning algorithms, such as regularized logistic regression and random forest, using the scikit-learn package. You?ll discover how to tune the algorithms to provide the best predictions on new and unseen data. As you delve into later sections, you?ll be able to understand the working and output of these algorithms and gain insight into not only the predictive capabilities of the models but also their reasons for making these predictions. Data Exploration and Cleaning Python and the Anaconda Package Management System Different Types of Data Science Problems Loading the Case Study Data with Jupyter and pandas Data Quality Assurance and Exploration Exploring the Financial History Features in the Dataset Activity 1: Exploring Remaining Financial Features in the Dataset Introduction to Scikit-Learn and Model Evaluation Introduction Model Performance Metrics for Binary Classification Activity 2: Performing Logistic Regression with a New Feature and Creating a Precision-Recall Curve Details of Logistic Regression and Feature Exploration Introduction Examining the Relationships between Features and the Response Univariate Feature Selection: What It Does and Doesn't Do Building Cloud-Native Applications Activity 3: Fitting a Logistic Regression Model and Directly Using the Coefficients The Bias-Variance Trade-off Introduction Estimating the Coefficients and Intercepts of Logistic Regression Cross Validation: Choosing the Regularization Parameter and Other Hyperparameters Activity 4: Cross-Validation and Feature Engineering with the Case Study Data Decision Trees and Random Forests Introduction Decision trees Random Forests: Ensembles of Decision Trees Activity 5: Cross-Validation Grid Search with Random Forest Imputation of Missing Data, Financial Analysis, and Delivery to Client Introduction Review of Modeling Results Dealing with Missing Data: Imputation Strategies Activity 6: Deriving Financial Insights Final Thoughts on Delivering the Predictive Model to the Client
Duration 3 Days 18 CPD hours This course is intended for This course is aimed at anyone who wants to harness the power of data analytics in their organization including: Business Analysts, Data Analysts, Reporting and BI professionals Analytics professionals and Data Scientists who would like to learn Python Overview This course teaches delegates with no prior programming or data analytics experience how to perform data manipulation, data analysis and data visualization in Python. Mastery of these techniques and how to apply them to business problems will allow delegates to immediately add value in their workplace by extracting valuable insight from company data to allow better, data-driven decisions. Outcome: After attending this course, delegates will: Be able to write effective Python code Know how to access their data from a variety of sources using Python Know how to identify and fix data quality using Python Know how to manipulate data to create analysis ready data Know how to analyze and visualize data to drive data driven decisioning across your organization Becoming a world class data analytics practitioner requires mastery of the most sophisticated data analytics tools. These programming languages are some of the most powerful and flexible tools in the data analytics toolkit. From business questions to data analytics, and beyond For data analytics tasks to affect business decisions they must be driven by a business question. This section will formally outline how to move an analytics project through key phases of development from business question to business solution. Delegates will be able: to describe and understand the general analytics process. to describe and understand the different types of analytics can be used to derive data driven solutions to business to apply that knowledge to their business context Basic Python Programming Conventions This section will cover the basics of writing R programs. Topics covered will include: What is Python? Using Anaconda Writing Python programs Expressions and objects Functions and arguments Basic Python programming conventions Data Structures in Python This section will look at the basic data structures that Python uses and accessing data in Python. Topics covered will include: Vectors Arrays and matrices Factors Lists Data frames Loading .csv files into Python Connecting to External Data This section will look at loading data from other sources into Python. Topics covered will include: Loading .csv files into a pandas data frame Connecting to and loading data from a database into a panda data frame Data Manipulation in Python This section will look at how Python can be used to perform data manipulation operations to prepare datasets for analytics projects. Topics covered will include: Filtering data Deriving new fields Aggregating data Joining data sources Connecting to external data sources Descriptive Analytics and Basic Reporting in Python This section will explain how Python can be used to perform basic descriptive. Topics covered will include: Summary statistics Grouped summary statistics Using descriptive analytics to assess data quality Using descriptive analytics to created business report Using descriptive analytics to conduct exploratory analysis Statistical Analysis in Python This section will explain how Python can be used to created more interesting statistical analysis. Topics covered will include: Significance tests Correlation Linear regressions Using statistical output to create better business decisions. Data Visualisation in Python This section will explain how Python can be used to create effective charts and visualizations. Topics covered will include: Creating different chart types such as bar charts, box plots, histograms and line plots Formatting charts Best Practices Hints and Tips This section will go through some best practice considerations that should be adopted of you are applying Python in a business context.
Duration 3 Days 18 CPD hours This course is intended for Data Science for Marketing Analytics is designed for developers and marketing analysts looking to use new, more sophisticated tools in their marketing analytics efforts. It'll help if you have prior experience of coding in Python and knowledge of high school level mathematics. Some experience with databases, Excel, statistics, or Tableau is useful but not necessary. Overview By the end of this course, you will be able to build your own marketing reporting and interactive dashboard solutions. The course starts by teaching you how to use Python libraries, such as pandas and Matplotlib, to read data from Python, manipulate it, and create plots, using both categorical and continuous variables. Then, you'll learn how to segment a population into groups and use different clustering techniques to evaluate customer segmentation.As you make your way through the course, you'll explore ways to evaluate and select the best segmentation approach, and go on to create a linear regression model on customer value data to predict lifetime value. In the concluding sections, you'll gain an understanding of regression techniques and tools for evaluating regression models, and explore ways to predict customer choice using classification algorithms. Finally, you'll apply these techniques to create a churn model for modeling customer product choices. Data Preparation and Cleaning Data Models and Structured Data pandas Data Manipulation Data Exploration and Visualization Identifying the Right Attributes Generating Targeted Insights Visualizing Data Unsupervised Learning: Customer Segmentation Customer Segmentation Methods Similarity and Data Standardization k-means Clustering Choosing the Best Segmentation Approach Choosing the Number of Clusters Different Methods of Clustering Evaluating Clustering Predicting Customer Revenue Using Linear Regression Understanding Regression Feature Engineering for Regression Performing and Interpreting Linear Regression Other Regression Techniques and Tools for Evaluation Evaluating the Accuracy of a Regression Model Using Regularization for Feature Selection Tree-Based Regression Models Supervised Learning: Predicting Customer Churn Classification Problems Understanding Logistic Regression Creating a Data Science Pipeline Fine-Tuning Classification Algorithms Support Vector Machine Decision Trees Random Forest Preprocessing Data for Machine Learning Models Model Evaluation Performance Metrics Modeling Customer Choice Understanding Multiclass Classification Class Imbalanced Data Additional course details: Nexus Humans Data Science for Marketing Analytics training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you're stepping into the realm of professional skills or a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success. While we feel this is the best course for the Data Science for Marketing Analytics course and one of our Top 10 we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.
Duration 2 Days 12 CPD hours This course is intended for This course is aimed at anyone currently working with data who is interested in using data visualisation to more effectively communicate their results. Overview At completion, delegates will understand how data visualisations can be best used to communicate actionable insights from data and be competent with the tools required to do it. Visualising data, and analytics results, is one of the most effective ways to achieve this. This course will cover the theory of data visualisation along with practical skills for creating compelling visualisations from data. Course Outline The use of analytics, statistics and data science in business has grown massively in recent years. Harnessing the power of data is opening actionable insights in diverse industries from banking to horse breeding. The companies doing this most successfully understand that using sophisticated analytics approaches to unlock insights from data is only half the job. Communicating these insights to all of the different parts of an organisation is just as important as doing the actual analysis. Visualising data, and analytics results, is one of the most effective ways to achieve this. This course will cover the theory of data visualisation along with practical skills for creating compelling visualisations from data. To attend this course delegates should be competent in the use of data analysis tools such as reporting tools, spreadsheet software or business intelligence tools. The course will explore the following topics through a series of interactive workshop sessions: Fundamentals of data visualisation Data characteristics & dimensions Mapping visual encodings to data dimensions Colour theory Graphical perception & communication Interaction design Visualisation different characteristics of data: trends, comparisons, correlations, maps, networks, hierarchies, text Designing effective dashboards
Duration 3 Days 18 CPD hours This course is intended for Before taking this course delegates should already be familiar with basic analytics techniques, comfortable with basic data manipulation tools such as spreadsheets and databases and already familiar with at least one programming language Overview This course teaches delegates who are already familiar with analytics techniques and at least one programming language how to effectively use the programming language for three tasks: data manipulation and preparation, statistical analysis and advanced analytics (including predictive modelling and segmentation). Mastery of these techniques will allow delegates to immediately add value in their work place by extracting valuable insight from company data to allow better, data-driven decisions. Outcomes: After completing the course, delegates will be capable of writing production-ready R code to perform advanced analytics tasks enabling their organisations make better, data-driven decisions. Becoming a world class data analytics practitioner requires mastery of the most sophisticated data analytics tools. These programming languages are some of the most powerful and flexible tools in the data analytics toolkit. Topic 1 Intro to our chosen language Topic 2 Basic programming conventions Topic 3 Data structures Topic 4 Accessing data Topic 5 Descriptive statistics Topic 6 Data visualisation Topic 7 Statistical analysis Topic 8 Advanced data manipulation Topic 9 Advanced analytics ? predictive modelling Topic 10 Advanced analytics ? segmentation