Duration 2 Days 12 CPD hours This course is intended for Business Analysts, Technical Managers, and Programmers Overview This intensive training course helps students learn the practical aspects of the R programming language. The course is supplemented by many hands-on labs which allow attendees to immediately apply their theoretical knowledge in practice. Over the past few years, R has been steadily gaining popularity with business analysts, statisticians and data scientists as a tool of choice for conducting statistical analysis of data as well as supervised and unsupervised machine learning. What is R ? What is R? ? Positioning of R in the Data Science Space ? The Legal Aspects ? Microsoft R Open ? R Integrated Development Environments ? Running R ? Running RStudio ? Getting Help ? General Notes on R Commands and Statements ? Assignment Operators ? R Core Data Structures ? Assignment Example ? R Objects and Workspace ? Printing Objects ? Arithmetic Operators ? Logical Operators ? System Date and Time ? Operations ? User-defined Functions ? Control Statements ? Conditional Execution ? Repetitive Execution ? Repetitive execution ? Built-in Functions ? Summary Introduction to Functional Programming with R ? What is Functional Programming (FP)? ? Terminology: Higher-Order Functions ? A Short List of Languages that Support FP ? Functional Programming in R ? Vector and Matrix Arithmetic ? Vector Arithmetic Example ? More Examples of FP in R ? Summary Managing Your Environment ? Getting and Setting the Working Directory ? Getting the List of Files in a Directory ? The R Home Directory ? Executing External R commands ? Loading External Scripts in RStudio ? Listing Objects in Workspace ? Removing Objects in Workspace ? Saving Your Workspace in R ? Saving Your Workspace in RStudio ? Saving Your Workspace in R GUI ? Loading Your Workspace ? Diverting Output to a File ? Batch (Unattended) Processing ? Controlling Global Options ? Summary R Type System and Structures ? The R Data Types ? System Date and Time ? Formatting Date and Time ? Using the mode() Function ? R Data Structures ? What is the Type of My Data Structure? ? Creating Vectors ? Logical Vectors ? Character Vectors ? Factorization ? Multi-Mode Vectors ? The Length of the Vector ? Getting Vector Elements ? Lists ? A List with Element Names ? Extracting List Elements ? Adding to a List ? Matrix Data Structure ? Creating Matrices ? Creating Matrices with cbind() and rbind() ? Working with Data Frames ? Matrices vs Data Frames ? A Data Frame Sample ? Creating a Data Frame ? Accessing Data Cells ? Getting Info About a Data Frame ? Selecting Columns in Data Frames ? Selecting Rows in Data Frames ? Getting a Subset of a Data Frame ? Sorting (ordering) Data in Data Frames by Attribute(s) ? Editing Data Frames ? The str() Function ? Type Conversion (Coercion) ? The summary() Function ? Checking an Object's Type ? Summary Extending R ? The Base R Packages ? Loading Packages ? What is the Difference between Package and Library? ? Extending R ? The CRAN Web Site ? Extending R in R GUI ? Extending R in RStudio ? Installing and Removing Packages from Command-Line ? Summary Read-Write and Import-Export Operations in R ? Reading Data from a File into a Vector ? Example of Reading Data from a File into A Vector ? Writing Data to a File ? Example of Writing Data to a File ? Reading Data into A Data Frame ? Writing CSV Files ? Importing Data into R ? Exporting Data from R ? Summary Statistical Computing Features in R ? Statistical Computing Features ? Descriptive Statistics ? Basic Statistical Functions ? Examples of Using Basic Statistical Functions ? Non-uniformity of a Probability Distribution ? Writing Your Own skew and kurtosis Functions ? Generating Normally Distributed Random Numbers ? Generating Uniformly Distributed Random Numbers ? Using the summary() Function ? Math Functions Used in Data Analysis ? Examples of Using Math Functions ? Correlations ? Correlation Example ? Testing Correlation Coefficient for Significance ? The cor.test() Function ? The cor.test() Example ? Regression Analysis ? Types of Regression ? Simple Linear Regression Model ? Least-Squares Method (LSM) ? LSM Assumptions ? Fitting Linear Regression Models in R ? Example of Using lm() ? Confidence Intervals for Model Parameters ? Example of Using lm() with a Data Frame ? Regression Models in Excel ? Multiple Regression Analysis ? Summary Data Manipulation and Transformation in R ? Applying Functions to Matrices and Data Frames ? The apply() Function ? Using apply() ? Using apply() with a User-Defined Function ? apply() Variants ? Using tapply() ? Adding a Column to a Data Frame ? Dropping A Column in a Data Frame ? The attach() and detach() Functions ? Sampling ? Using sample() for Generating Labels ? Set Operations ? Example of Using Set Operations ? The dplyr Package ? Object Masking (Shadowing) Considerations ? Getting More Information on dplyr in RStudio ? The search() or searchpaths() Functions ? Handling Large Data Sets in R with the data.table Package ? The fread() and fwrite() functions from the data.table Package ? Using the Data Table Structure ? Summary Data Visualization in R ? Data Visualization ? Data Visualization in R ? The ggplot2 Data Visualization Package ? Creating Bar Plots in R ? Creating Horizontal Bar Plots ? Using barplot() with Matrices ? Using barplot() with Matrices Example ? Customizing Plots ? Histograms in R ? Building Histograms with hist() ? Example of using hist() ? Pie Charts in R ? Examples of using pie() ? Generic X-Y Plotting ? Examples of the plot() function ? Dot Plots in R ? Saving Your Work ? Supported Export Options ? Plots in RStudio ? Saving a Plot as an Image ? Summary Using R Efficiently ? Object Memory Allocation Considerations ? Garbage Collection ? Finding Out About Loaded Packages ? Using the conflicts() Function ? Getting Information About the Object Source Package with the pryr Package ? Using the where() Function from the pryr Package ? Timing Your Code ? Timing Your Code with system.time() ? Timing Your Code with System.time() ? Sleeping a Program ? Handling Large Data Sets in R with the data.table Package ? Passing System-Level Parameters to R ? Summary Lab Exercises Lab 1 - Getting Started with R Lab 2 - Learning the R Type System and Structures Lab 3 - Read and Write Operations in R Lab 4 - Data Import and Export in R Lab 5 - k-Nearest Neighbors Algorithm Lab 6 - Creating Your Own Statistical Functions Lab 7 - Simple Linear Regression Lab 8 - Monte-Carlo Simulation (Method) Lab 9 - Data Processing with R Lab 10 - Using R Graphics Package Lab 11 - Using R Efficiently
Duration 2 Days 12 CPD hours This course is intended for Audience: Data Scientists, Software Developers, IT Architects, and Technical Managers. Participants should have the general knowledge of statistics and programming Also familiar with Python Overview ? NumPy, pandas, Matplotlib, scikit-learn ? Python REPLs ? Jupyter Notebooks ? Data analytics life-cycle phases ? Data repairing and normalizing ? Data aggregation and grouping ? Data visualization ? Data science algorithms for supervised and unsupervised machine learning Covers theoretical and technical aspects of using Python in Applied Data Science projects and Data Logistics use cases. Python for Data Science ? Using Modules ? Listing Methods in a Module ? Creating Your Own Modules ? List Comprehension ? Dictionary Comprehension ? String Comprehension ? Python 2 vs Python 3 ? Sets (Python 3+) ? Python Idioms ? Python Data Science ?Ecosystem? ? NumPy ? NumPy Arrays ? NumPy Idioms ? pandas ? Data Wrangling with pandas' DataFrame ? SciPy ? Scikit-learn ? SciPy or scikit-learn? ? Matplotlib ? Python vs R ? Python on Apache Spark ? Python Dev Tools and REPLs ? Anaconda ? IPython ? Visual Studio Code ? Jupyter ? Jupyter Basic Commands ? Summary Applied Data Science ? What is Data Science? ? Data Science Ecosystem ? Data Mining vs. Data Science ? Business Analytics vs. Data Science ? Data Science, Machine Learning, AI? ? Who is a Data Scientist? ? Data Science Skill Sets Venn Diagram ? Data Scientists at Work ? Examples of Data Science Projects ? An Example of a Data Product ? Applied Data Science at Google ? Data Science Gotchas ? Summary Data Analytics Life-cycle Phases ? Big Data Analytics Pipeline ? Data Discovery Phase ? Data Harvesting Phase ? Data Priming Phase ? Data Logistics and Data Governance ? Exploratory Data Analysis ? Model Planning Phase ? Model Building Phase ? Communicating the Results ? Production Roll-out ? Summary Repairing and Normalizing Data ? Repairing and Normalizing Data ? Dealing with the Missing Data ? Sample Data Set ? Getting Info on Null Data ? Dropping a Column ? Interpolating Missing Data in pandas ? Replacing the Missing Values with the Mean Value ? Scaling (Normalizing) the Data ? Data Preprocessing with scikit-learn ? Scaling with the scale() Function ? The MinMaxScaler Object ? Summary Descriptive Statistics Computing Features in Python ? Descriptive Statistics ? Non-uniformity of a Probability Distribution ? Using NumPy for Calculating Descriptive Statistics Measures ? Finding Min and Max in NumPy ? Using pandas for Calculating Descriptive Statistics Measures ? Correlation ? Regression and Correlation ? Covariance ? Getting Pairwise Correlation and Covariance Measures ? Finding Min and Max in pandas DataFrame ? Summary Data Aggregation and Grouping ? Data Aggregation and Grouping ? Sample Data Set ? The pandas.core.groupby.SeriesGroupBy Object ? Grouping by Two or More Columns ? Emulating the SQL's WHERE Clause ? The Pivot Tables ? Cross-Tabulation ? Summary Data Visualization with matplotlib ? Data Visualization ? What is matplotlib? ? Getting Started with matplotlib ? The Plotting Window ? The Figure Options ? The matplotlib.pyplot.plot() Function ? The matplotlib.pyplot.bar() Function ? The matplotlib.pyplot.pie () Function ? Subplots ? Using the matplotlib.gridspec.GridSpec Object ? The matplotlib.pyplot.subplot() Function ? Hands-on Exercise ? Figures ? Saving Figures to File ? Visualization with pandas ? Working with matplotlib in Jupyter Notebooks ? Summary Data Science and ML Algorithms in scikit-learn ? Data Science, Machine Learning, AI? ? Types of Machine Learning ? Terminology: Features and Observations ? Continuous and Categorical Features (Variables) ? Terminology: Axis ? The scikit-learn Package ? scikit-learn Estimators ? Models, Estimators, and Predictors ? Common Distance Metrics ? The Euclidean Metric ? The LIBSVM format ? Scaling of the Features ? The Curse of Dimensionality ? Supervised vs Unsupervised Machine Learning ? Supervised Machine Learning Algorithms ? Unsupervised Machine Learning Algorithms ? Choose the Right Algorithm ? Life-cycles of Machine Learning Development ? Data Split for Training and Test Data Sets ? Data Splitting in scikit-learn ? Hands-on Exercise ? Classification Examples ? Classifying with k-Nearest Neighbors (SL) ? k-Nearest Neighbors Algorithm ? k-Nearest Neighbors Algorithm ? The Error Rate ? Hands-on Exercise ? Dimensionality Reduction ? The Advantages of Dimensionality Reduction ? Principal component analysis (PCA) ? Hands-on Exercise ? Data Blending ? Decision Trees (SL) ? Decision Tree Terminology ? Decision Tree Classification in Context of Information Theory ? Information Entropy Defined ? The Shannon Entropy Formula ? The Simplified Decision Tree Algorithm ? Using Decision Trees ? Random Forests ? SVM ? Naive Bayes Classifier (SL) ? Naive Bayesian Probabilistic Model in a Nutshell ? Bayes Formula ? Classification of Documents with Naive Bayes ? Unsupervised Learning Type: Clustering ? Clustering Examples ? k-Means Clustering (UL) ? k-Means Clustering in a Nutshell ? k-Means Characteristics ? Regression Analysis ? Simple Linear Regression Model ? Linear vs Non-Linear Regression ? Linear Regression Illustration ? Major Underlying Assumptions for Regression Analysis ? Least-Squares Method (LSM) ? Locally Weighted Linear Regression ? Regression Models in Excel ? Multiple Regression Analysis ? Logistic Regression ? Regression vs Classification ? Time-Series Analysis ? Decomposing Time-Series ? Summary Lab Exercises Lab 1 - Learning the Lab Environment Lab 2 - Using Jupyter Notebook Lab 3 - Repairing and Normalizing Data Lab 4 - Computing Descriptive Statistics Lab 5 - Data Grouping and Aggregation Lab 6 - Data Visualization with matplotlib Lab 7 - Data Splitting Lab 8 - k-Nearest Neighbors Algorithm Lab 9 - The k-means Algorithm Lab 10 - The Random Forest Algorithm
Duration 3 Days 18 CPD hours This course is intended for This course is aimed at anyone who wants to harness the power of data analytics in their organization including: Business Analysts, Data Analysts, Reporting and BI professionals Analytics professionals and Data Scientists who would like to learn Python Overview This course teaches delegates with no prior programming or data analytics experience how to perform data manipulation, data analysis and data visualization in Python. Mastery of these techniques and how to apply them to business problems will allow delegates to immediately add value in their workplace by extracting valuable insight from company data to allow better, data-driven decisions. Outcome: After attending this course, delegates will: Be able to write effective Python code Know how to access their data from a variety of sources using Python Know how to identify and fix data quality using Python Know how to manipulate data to create analysis ready data Know how to analyze and visualize data to drive data driven decisioning across your organization Becoming a world class data analytics practitioner requires mastery of the most sophisticated data analytics tools. These programming languages are some of the most powerful and flexible tools in the data analytics toolkit. From business questions to data analytics, and beyond For data analytics tasks to affect business decisions they must be driven by a business question. This section will formally outline how to move an analytics project through key phases of development from business question to business solution. Delegates will be able: to describe and understand the general analytics process. to describe and understand the different types of analytics can be used to derive data driven solutions to business to apply that knowledge to their business context Basic Python Programming Conventions This section will cover the basics of writing R programs. Topics covered will include: What is Python? Using Anaconda Writing Python programs Expressions and objects Functions and arguments Basic Python programming conventions Data Structures in Python This section will look at the basic data structures that Python uses and accessing data in Python. Topics covered will include: Vectors Arrays and matrices Factors Lists Data frames Loading .csv files into Python Connecting to External Data This section will look at loading data from other sources into Python. Topics covered will include: Loading .csv files into a pandas data frame Connecting to and loading data from a database into a panda data frame Data Manipulation in Python This section will look at how Python can be used to perform data manipulation operations to prepare datasets for analytics projects. Topics covered will include: Filtering data Deriving new fields Aggregating data Joining data sources Connecting to external data sources Descriptive Analytics and Basic Reporting in Python This section will explain how Python can be used to perform basic descriptive. Topics covered will include: Summary statistics Grouped summary statistics Using descriptive analytics to assess data quality Using descriptive analytics to created business report Using descriptive analytics to conduct exploratory analysis Statistical Analysis in Python This section will explain how Python can be used to created more interesting statistical analysis. Topics covered will include: Significance tests Correlation Linear regressions Using statistical output to create better business decisions. Data Visualisation in Python This section will explain how Python can be used to create effective charts and visualizations. Topics covered will include: Creating different chart types such as bar charts, box plots, histograms and line plots Formatting charts Best Practices Hints and Tips This section will go through some best practice considerations that should be adopted of you are applying Python in a business context.
Duration 3 Days 18 CPD hours This course is intended for Data Science for Marketing Analytics is designed for developers and marketing analysts looking to use new, more sophisticated tools in their marketing analytics efforts. It'll help if you have prior experience of coding in Python and knowledge of high school level mathematics. Some experience with databases, Excel, statistics, or Tableau is useful but not necessary. Overview By the end of this course, you will be able to build your own marketing reporting and interactive dashboard solutions. The course starts by teaching you how to use Python libraries, such as pandas and Matplotlib, to read data from Python, manipulate it, and create plots, using both categorical and continuous variables. Then, you'll learn how to segment a population into groups and use different clustering techniques to evaluate customer segmentation.As you make your way through the course, you'll explore ways to evaluate and select the best segmentation approach, and go on to create a linear regression model on customer value data to predict lifetime value. In the concluding sections, you'll gain an understanding of regression techniques and tools for evaluating regression models, and explore ways to predict customer choice using classification algorithms. Finally, you'll apply these techniques to create a churn model for modeling customer product choices. Data Preparation and Cleaning Data Models and Structured Data pandas Data Manipulation Data Exploration and Visualization Identifying the Right Attributes Generating Targeted Insights Visualizing Data Unsupervised Learning: Customer Segmentation Customer Segmentation Methods Similarity and Data Standardization k-means Clustering Choosing the Best Segmentation Approach Choosing the Number of Clusters Different Methods of Clustering Evaluating Clustering Predicting Customer Revenue Using Linear Regression Understanding Regression Feature Engineering for Regression Performing and Interpreting Linear Regression Other Regression Techniques and Tools for Evaluation Evaluating the Accuracy of a Regression Model Using Regularization for Feature Selection Tree-Based Regression Models Supervised Learning: Predicting Customer Churn Classification Problems Understanding Logistic Regression Creating a Data Science Pipeline Fine-Tuning Classification Algorithms Support Vector Machine Decision Trees Random Forest Preprocessing Data for Machine Learning Models Model Evaluation Performance Metrics Modeling Customer Choice Understanding Multiclass Classification Class Imbalanced Data Additional course details: Nexus Humans Data Science for Marketing Analytics training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you're stepping into the realm of professional skills or a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success. While we feel this is the best course for the Data Science for Marketing Analytics course and one of our Top 10 we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.
Duration 3 Days 18 CPD hours This course is intended for This course is aimed at anyone who wants to harness the power of data analytics in their organization. Overview After completing this course delegates will be capable of writing effective R code to manipulate, analyse and visualise data to enable their organisations make better, data-driven decisions. This course teaches delegates with no prior programming or data analytics experience how to perform data manipulation, data analysis and data visualisation in R. Course Outline Becoming a world class data analytics practitioner requires mastery of the most sophisticated data analytics tools. The R programming language is one of the most powerful and flexible tools in the data analytics toolkit. This course teaches delegates with no prior programming or data analytics experience how to perform data manipulation, data analysis and data visualisation in R. Mastery of these techniques will allow delegates to immediately add value in their work place by extracting valuable insight from company data to allow better, data-driven decisions. The course will explore the following topics through a series of interactive workshop sessions: What is R? Basic R programming conventions Data structures in R Accessing data in R Descriptive statistics in R Statistical analysis in R Data manipulation in R Data visualisation in R Additional course details: Nexus Humans Beginning Data Analytics With R training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you're stepping into the realm of professional skills or a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success. While we feel this is the best course for the Beginning Data Analytics With R course and one of our Top 10 we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.
Duration 3 Days 18 CPD hours This course is intended for The primary audience for this course are Application Consultants, Developers, Developer Consultants, and Technology Consultants. Overview Define Data ServicesDefine Source and Target ConnectionsTrace, Validate, and Debug Data Services JobsUse Data Services TransformsImplement Change Data Capture in Data Services In this course, students will learn how to define data services, source, and target connections, as well as use data services transforms and implement change in data capture within data services. Data Services Defining Data Services Source and Target Metadata Defining Datastores in Data Services Defining a Data Services Flat File Format Batch Job Creation Creating Batch Jobs Batch Job Troubleshooting Writing Comments with Descriptions and Annotations Validating and Tracing Jobs Debugging Data Flows Auditing Data Flows Functions, Scripts, and Variables Using Built-In Functions Using Variables, Parameters, and Scripts Platform Transforms Using Platform Transforms Using the Map Operation Transform Using the Validation Transform Using the Merge Transform Using the Case Transform Using the SQL Transform Error Handling Setting Up Error Handling Changes in Data Capturing Changes in Data Using Source-Based Change Data Capture (CDC) Using Target-Based Change Data Capture (CDC) Data Services (Integrator) Platform Transforms Using Data Services (Integrator) Platform Transforms Using the Pivot Transform Using the Data Transfer Transform Additional course details: Nexus Humans DS10 SAP Data Services - Platform and Transforms training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you're stepping into the realm of professional skills or a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success. While we feel this is the best course for the DS10 SAP Data Services - Platform and Transforms course and one of our Top 10 we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.
Duration 2 Days 12 CPD hours This course is intended for This course is suited to marketeers, business analysts, and researchers who are interested in increasing their statistical knowledge. Overview After attending this course, delegates will understand how statistics can be used to provide valuable insight into their business, and be able to apply statistical methods to solve business problems. On returning to work delegates will immediately be able to make a difference to the way that their organisations make decisions. This course covers the statistical methods that analysts need to move from simple reporting on business problems to extracting insight to solve business problems. Course Outline The course will explore the following topics through a series of lectures and workshops: Summary statistics for both continuous data and categorical data Using and reporting confidence intervals Using hypothesis tests to answer business questions Using correlations to explore data relationships Simple prediction models Analysing categorical data Additional course details: Nexus Humans Data-driven Business Using Statistical Analysis training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you're stepping into the realm of professional skills or a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success. While we feel this is the best course for the Data-driven Business Using Statistical Analysis course and one of our Top 10 we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.
Duration 4 Days 24 CPD hours This course is intended for This class is intended for experienced developers who are responsible for managing big data transformations including: Extracting, loading, transforming, cleaning, and validating data. Designing pipelines and architectures for data processing. Creating and maintaining machine learning and statistical models. Querying datasets, visualizing query results and creating reports Overview Design and build data processing systems on Google Cloud Platform. Leverage unstructured data using Spark and ML APIs on Cloud Dataproc. Process batch and streaming data by implementing autoscaling data pipelines on Cloud Dataflow. Derive business insights from extremely large datasets using Google BigQuery. Train, evaluate and predict using machine learning models using TensorFlow and Cloud ML. Enable instant insights from streaming data Get hands-on experience with designing and building data processing systems on Google Cloud. This course uses lectures, demos, and hand-on labs to show you how to design data processing systems, build end-to-end data pipelines, analyze data, and implement machine learning. This course covers structured, unstructured, and streaming data. Introduction to Data Engineering Explore the role of a data engineer. Analyze data engineering challenges. Intro to BigQuery. Data Lakes and Data Warehouses. Demo: Federated Queries with BigQuery. Transactional Databases vs Data Warehouses. Website Demo: Finding PII in your dataset with DLP API. Partner effectively with other data teams. Manage data access and governance. Build production-ready pipelines. Review GCP customer case study. Lab: Analyzing Data with BigQuery. Building a Data Lake Introduction to Data Lakes. Data Storage and ETL options on GCP. Building a Data Lake using Cloud Storage. Optional Demo: Optimizing cost with Google Cloud Storage classes and Cloud Functions. Securing Cloud Storage. Storing All Sorts of Data Types. Video Demo: Running federated queries on Parquet and ORC files in BigQuery. Cloud SQL as a relational Data Lake. Lab: Loading Taxi Data into Cloud SQL. Building a Data Warehouse The modern data warehouse. Intro to BigQuery. Demo: Query TB+ of data in seconds. Getting Started. Loading Data. Video Demo: Querying Cloud SQL from BigQuery. Lab: Loading Data into BigQuery. Exploring Schemas. Demo: Exploring BigQuery Public Datasets with SQL using INFORMATION_SCHEMA. Schema Design. Nested and Repeated Fields. Demo: Nested and repeated fields in BigQuery. Lab: Working with JSON and Array data in BigQuery. Optimizing with Partitioning and Clustering. Demo: Partitioned and Clustered Tables in BigQuery. Preview: Transforming Batch and Streaming Data. Introduction to Building Batch Data Pipelines EL, ELT, ETL. Quality considerations. How to carry out operations in BigQuery. Demo: ELT to improve data quality in BigQuery. Shortcomings. ETL to solve data quality issues. Executing Spark on Cloud Dataproc The Hadoop ecosystem. Running Hadoop on Cloud Dataproc. GCS instead of HDFS. Optimizing Dataproc. Lab: Running Apache Spark jobs on Cloud Dataproc. Serverless Data Processing with Cloud Dataflow Cloud Dataflow. Why customers value Dataflow. Dataflow Pipelines. Lab: A Simple Dataflow Pipeline (Python/Java). Lab: MapReduce in Dataflow (Python/Java). Lab: Side Inputs (Python/Java). Dataflow Templates. Dataflow SQL. Manage Data Pipelines with Cloud Data Fusion and Cloud Composer Building Batch Data Pipelines visually with Cloud Data Fusion. Components. UI Overview. Building a Pipeline. Exploring Data using Wrangler. Lab: Building and executing a pipeline graph in Cloud Data Fusion. Orchestrating work between GCP services with Cloud Composer. Apache Airflow Environment. DAGs and Operators. Workflow Scheduling. Optional Long Demo: Event-triggered Loading of data with Cloud Composer, Cloud Functions, Cloud Storage, and BigQuery. Monitoring and Logging. Lab: An Introduction to Cloud Composer. Introduction to Processing Streaming Data Processing Streaming Data. Serverless Messaging with Cloud Pub/Sub Cloud Pub/Sub. Lab: Publish Streaming Data into Pub/Sub. Cloud Dataflow Streaming Features Cloud Dataflow Streaming Features. Lab: Streaming Data Pipelines. High-Throughput BigQuery and Bigtable Streaming Features BigQuery Streaming Features. Lab: Streaming Analytics and Dashboards. Cloud Bigtable. Lab: Streaming Data Pipelines into Bigtable. Advanced BigQuery Functionality and Performance Analytic Window Functions. Using With Clauses. GIS Functions. Demo: Mapping Fastest Growing Zip Codes with BigQuery GeoViz. Performance Considerations. Lab: Optimizing your BigQuery Queries for Performance. Optional Lab: Creating Date-Partitioned Tables in BigQuery. Introduction to Analytics and AI What is AI?. From Ad-hoc Data Analysis to Data Driven Decisions. Options for ML models on GCP. Prebuilt ML model APIs for Unstructured Data Unstructured Data is Hard. ML APIs for Enriching Data. Lab: Using the Natural Language API to Classify Unstructured Text. Big Data Analytics with Cloud AI Platform Notebooks What's a Notebook. BigQuery Magic and Ties to Pandas. Lab: BigQuery in Jupyter Labs on AI Platform. Production ML Pipelines with Kubeflow Ways to do ML on GCP. Kubeflow. AI Hub. Lab: Running AI models on Kubeflow. Custom Model building with SQL in BigQuery ML BigQuery ML for Quick Model Building. Demo: Train a model with BigQuery ML to predict NYC taxi fares. Supported Models. Lab Option 1: Predict Bike Trip Duration with a Regression Model in BQML. Lab Option 2: Movie Recommendations in BigQuery ML. Custom Model building with Cloud AutoML Why Auto ML? Auto ML Vision. Auto ML NLP. Auto ML Tables.
Duration 2.25 Days 13.5 CPD hours This course is intended for The job roles best suited to the material in this course are: sales personnel, accountants, administrators, auditors, lab assistants, office job positions. Overview Work with functions. Work with lists. Analyze data. Visualize data with charts. Use PivotTables and PivotCharts. Work with multiple worksheets and workbooks. Share and protect workbooks. Automate workbook functionality. Use Lookup functions and formula auditing. Forecast data. Create sparklines and map data This course provides the knowledge to create advanced workbooks and worksheets that can deepen the understanding of organizational intelligence. The ability to analyze massive amounts of data, extract actionable information from it and present that information to decision makers. In addition this course will give you the ability to collaborate with colleagues, automate complex or repetitive tasks and use conditional logic to construct and apply elaborate formulas and functions which will allow you to work through a lot of data and generate the answers that your organisation needs. WORKING WITH FUNCTIONS Topic A: Work with Ranges Topic B: Use Specialized Functions Topic C: Work with Logical Functions Topic D: Work with Date and Time Functions Topic E: Work with Text Functions WORKING WITH LISTS Topic A: Sort Data Topic B: Filter Data Topic C: Query Data with Database Functions Topic D: Outline and Subtotal Data ANALYZING DATA Topic A: Create and Modify Tables Topic B: Apply Intermediate Conditional Formatting Topic C: Apply Advanced Conditional Formatting VISUALIZING DATA WITH CHARTS Topic A: Create Charts Topic B: Modify and Format Charts Topic C: Use Advanced Chart Features USING PIVOTTABLES AND PIVOTCHARTS Topic A: Create a PivotTable Topic B: Analyze PivotTable Data Topic C: Present Data with PivotCharts Topic D: Filter Data by Using Timelines and Slicers WORKING WITH MULTIPLE WORKSHEETS AND WORKBOOKS Topic A: Use Links and External References Topic B: Use 3-D References Topic C: Consolidate Data SHARING AND PROTECTING WORKBOOKS Topic A: Collaborate on a Workbook Topic B: Protect Worksheets and Workbooks AUTOMATING WORKBOOK FUNCTIONALITY Topic A: Apply Data Validation Topic B: Search for Invalid Data and Formulas with Errors Topic C: Work with Macros USING LOOKUP FUNCTIONS AND FORMULAS AUDITING Topic A: Use Lookup Functions Topic B: Trace Cells Topic C: Watch and Evaluate Formulas FORECASTING DATA Topic A: Determine Potential Outcomes Using Data Tables Topic B: Determine Potential Outcomes Using Scenarios Topic C: Use the Goal Seek Feature Topic D: Forecast Data Trends CREATING SPARKLINES AND MAPPING DATA Topic A: Create Sparklines Topic B: Map Data
Duration 2 Days 12 CPD hours This course is intended for This course is aimed at anyone currently working with data who is interested in using data visualisation to more effectively communicate their results. Overview At completion, delegates will understand how data visualisations can be best used to communicate actionable insights from data and be competent with the tools required to do it. Visualising data, and analytics results, is one of the most effective ways to achieve this. This course will cover the theory of data visualisation along with practical skills for creating compelling visualisations from data. Course Outline The use of analytics, statistics and data science in business has grown massively in recent years. Harnessing the power of data is opening actionable insights in diverse industries from banking to horse breeding. The companies doing this most successfully understand that using sophisticated analytics approaches to unlock insights from data is only half the job. Communicating these insights to all of the different parts of an organisation is just as important as doing the actual analysis. Visualising data, and analytics results, is one of the most effective ways to achieve this. This course will cover the theory of data visualisation along with practical skills for creating compelling visualisations from data. To attend this course delegates should be competent in the use of data analysis tools such as reporting tools, spreadsheet software or business intelligence tools. The course will explore the following topics through a series of interactive workshop sessions: Fundamentals of data visualisation Data characteristics & dimensions Mapping visual encodings to data dimensions Colour theory Graphical perception & communication Interaction design Visualisation different characteristics of data: trends, comparisons, correlations, maps, networks, hierarchies, text Designing effective dashboards