
191 Data Skills courses

Advanced Tableau

By Nexus Human

Duration: 2.5 Days, 15 CPD hours

This course is intended for: Those with a basic understanding of Tableau who want to master its advanced features.

Overview: The goal of this course is to present essential Tableau concepts and advanced functionality to help you better prepare and analyze data. This course will use Tableau Hyper, Tableau Prep and more.

Getting Up to Speed - a Review of the Basics
Connecting Tableau to your data; Connecting to Tableau Server; Connecting to saved data sources; Measure Names and Measure Values; Three essential Tableau concepts; Exporting data to other devices; Summary

All About Data - Getting Your Data Ready
Data mining and knowledge discovery process models; CRISP-DM

All About Data - Joins, Blends, and Data Structures
Introduction to joins; Introduction to complex joins; Exercise: observing join culling; Introduction to join calculations; Introduction to spatial joins; Introduction to unions; Understanding data blending; Order of operations; No dimensions from a secondary source; Introduction to scaffolding; Introduction to data structures; Exercise: adjusting the data structure for different questions; Summary

Table Calculations
A definition and two questions; Introduction to functions; Directional and non-directional table calculations; Application of functions; Summary

Level of Detail Calculations
Building playgrounds; Playground I: FIXED and EXCLUDE; Playground II: INCLUDE; Practical application; Exercise: practical FIXED; Exercise: practical INCLUDE; Exercise: practical EXCLUDE; Summary

Beyond the Basic Chart Types
Improving popular visualizations; Custom background images; Tableau extensions; Summary

Mapping
Extending Tableau's mapping capabilities without leaving Tableau; Extending Tableau mapping with other technology; Exercise: connecting to a WMS server; Exploring the TMS file; Exploring Mapbox; Accessing different maps with a dashboard; Creating custom polygons; Converting shape files for Tableau; Exercise: polygons for Texas; Heatmaps; Summary

Tableau for Presentations
Getting the best images out of Tableau; From Tableau to PowerPoint; Embedding Tableau in PowerPoint; Animating Tableau; Story points and dashboards for presentations; Summary

Visualization Best Practices and Dashboard Design
Visualization design theory; Formatting rules; Color rules; Visualization type rules; Compromises; Keeping visualizations simple; Dashboard design; Dashboard layout; Sheet selection; Summary

Advanced Analytics
Self-service Analytics; Use case - Self-service Analytics; Use case - Geo-spatial Analytics; Summary

Improving Performance
Understanding the performance-recording dashboard; Exercise: exploring performance recording in Tableau Desktop; Performance-recording dashboard events; Behind the scenes of the performance-recording dashboard; Hardware and on-the-fly techniques; Hardware considerations; On-the-fly techniques; Single Data Source > Joining > Blending; Three ways Tableau connects to data; Using referential integrity when joining; Advantages of blending; Efficiently working with data sources; Tuning data sources; Working efficiently with large data sources; Intelligent extracts; Understanding the Tableau data extract; Constructing an extract for optimal performance; Exercise: summary aggregates for improved performance; Optimizing extracts; Exercise: materialized calculations; Using filters wisely; Extract filter performance; Data source filter performance; Context filters; Dimension and measure filters; Table-calculation filters; Efficient calculations; Boolean/Numbers > Date > String; Additional performance considerations; Avoid overcrowding a dashboard; Fixing dashboard sizing; Setting expectations; Summary

Additional course details: The Nexus Human Advanced Tableau training program is a workshop that combines sessions, lessons, and masterclasses designed to move your learning forward. This immersive bootcamp-style experience includes interactive lectures, hands-on labs, and collaborative hackathons, all designed to reinforce fundamental concepts. Guided by seasoned coaches, each session offers insights and practical skills for honing your expertise. Whether you're new to professional skills or a seasoned professional, this comprehensive course aims to equip you with the knowledge necessary for success. While we feel this is the best Advanced Tableau course and one of our Top 10, we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.

Advanced Tableau
Delivered Online, Flexible Dates
Price on Enquiry

0G09A IBM Advanced Statistical Analysis Using IBM SPSS Statistics (v25)

By Nexus Human

Duration: 2 Days, 12 CPD hours

This course is intended for: Anyone who works with IBM SPSS Statistics and wants to learn advanced statistical procedures to better answer research questions.

Overview: Introduction to advanced statistical analysis; Group variables: Factor Analysis and Principal Components Analysis; Group similar cases: Cluster Analysis; Predict categorical targets with Nearest Neighbor Analysis; Predict categorical targets with Discriminant Analysis; Predict categorical targets with Logistic Regression; Predict categorical targets with Decision Trees; Introduction to Survival Analysis; Introduction to Generalized Linear Models; Introduction to Linear Mixed Models

This course provides an application-oriented introduction to advanced statistical methods available in IBM SPSS Statistics. Students will review a variety of advanced statistical techniques and discuss situations in which each technique would be used, the assumptions made by each method, how to set up the analysis, and how to interpret the results. This includes a broad range of techniques for predicting variables, as well as methods to cluster variables and cases.

Introduction to advanced statistical analysis
Taxonomy of models; Overview of supervised models; Overview of models to create natural groupings

Group variables: Factor Analysis and Principal Components Analysis
Factor Analysis basics; Principal Components basics; Assumptions of Factor Analysis; Key issues in Factor Analysis; Improve the interpretability; Use Factor and component scores

Group similar cases: Cluster Analysis
Cluster Analysis basics; Key issues in Cluster Analysis; K-Means Cluster Analysis; Assumptions of K-Means Cluster Analysis; TwoStep Cluster Analysis; Assumptions of TwoStep Cluster Analysis

Predict categorical targets with Nearest Neighbor Analysis
Nearest Neighbor Analysis basics; Key issues in Nearest Neighbor Analysis; Assess model fit

Predict categorical targets with Discriminant Analysis
Discriminant Analysis basics; The Discriminant Analysis model; Core concepts of Discriminant Analysis; Classification of cases; Assumptions of Discriminant Analysis; Validate the solution

Predict categorical targets with Logistic Regression
Binary Logistic Regression basics; The Binary Logistic Regression model; Multinomial Logistic Regression basics; Assumptions of Logistic Regression procedures; Testing hypotheses

Predict categorical targets with Decision Trees
Decision Trees basics; Validate the solution; Explore CHAID; Explore CRT; Comparing Decision Trees methods

Introduction to Survival Analysis
Survival Analysis basics; Kaplan-Meier Analysis; Assumptions of Kaplan-Meier Analysis; Cox Regression; Assumptions of Cox Regression

Introduction to Generalized Linear Models
Generalized Linear Models basics; Available distributions; Available link functions

Introduction to Linear Mixed Models
Linear Mixed Models basics; Hierarchical Linear Models; Modeling strategy; Assumptions of Linear Mixed Models

0G09A IBM Advanced Statistical Analysis Using IBM SPSS Statistics (v25)
Delivered Online, Flexible Dates
Price on Enquiry

Designing and Building Big Data Applications

By Nexus Human

Duration: 4 Days, 24 CPD hours

This course is intended for: Developers, engineers, and architects who want to use Hadoop and related tools to solve real-world problems.

Overview: Skills learned in this course include: Creating a data set with Kite SDK; Developing custom Flume components for data ingestion; Managing a multi-stage workflow with Oozie; Analyzing data with Crunch; Writing user-defined functions for Hive and Impala; Indexing data with Cloudera Search.

Cloudera University's four-day course for designing and building Big Data applications prepares you to analyze and solve real-world problems using Apache Hadoop and associated tools in the enterprise data hub (EDH).

Introduction Application Architecture Scenario Explanation Understanding the Development Environment Identifying and Collecting Input Data Selecting Tools for Data Processing and Analysis Presenting Results to the User Defining & Using Datasets Metadata Management What is Apache Avro? Avro Schemas Avro Schema Evolution Selecting a File Format Performance Considerations Using the Kite SDK Data Module What is the Kite SDK? Fundamental Data Module Concepts Creating New Data Sets Using the Kite SDK Loading, Accessing, and Deleting a Data Set Importing Relational Data with Apache Sqoop What is Apache Sqoop? Basic Imports Limiting Results Improving Sqoop's Performance Sqoop 2 Capturing Data with Apache Flume What is Apache Flume?
Basic Flume Architecture Flume Sources Flume Sinks Flume Configuration Logging Application Events to Hadoop Developing Custom Flume Components Flume Data Flow and Common Extension Points Custom Flume Sources Developing a Flume Pollable Source Developing a Flume Event-Driven Source Custom Flume Interceptors Developing a Header-Modifying Flume Interceptor Developing a Filtering Flume Interceptor Writing Avro Objects with a Custom Flume Interceptor Managing Workflows with Apache Oozie The Need for Workflow Management What is Apache Oozie? Defining an Oozie Workflow Validation, Packaging, and Deployment Running and Tracking Workflows Using the CLI Hue UI for Oozie Processing Data Pipelines with Apache Crunch What is Apache Crunch? Understanding the Crunch Pipeline Comparing Crunch to Java MapReduce Working with Crunch Projects Reading and Writing Data in Crunch Data Collection API Functions Utility Classes in the Crunch API Working with Tables in Apache Hive What is Apache Hive? Accessing Hive Basic Query Syntax Creating and Populating Hive Tables How Hive Reads Data Using the RegexSerDe in Hive Developing User-Defined Functions What are User-Defined Functions? Implementing a User-Defined Function Deploying Custom Libraries in Hive Registering a User-Defined Function in Hive Executing Interactive Queries with Impala What is Impala? Comparing Hive to Impala Running Queries in Impala Support for User-Defined Functions Data and Metadata Management Understanding Cloudera Search What is Cloudera Search? Search Architecture Supported Document Formats Indexing Data with Cloudera Search Collection and Schema Management Morphlines Indexing Data in Batch Mode Indexing Data in Near Real Time Presenting Results to Users Solr Query Syntax Building a Search UI with Hue Accessing Impala through JDBC Powering a Custom Web Application with Impala and Search

Designing and Building Big Data Applications
Delivered Online, Flexible Dates
Price on Enquiry

Hands-on Data Analysis with Pandas (TTPS4878)

By Nexus Human

Duration: 3 Days, 18 CPD hours

This course is intended for: Python-experienced attendees who want the skills to use pandas to ensure the veracity of their data, visualize it for effective decision-making, and reliably reproduce analyses across multiple datasets.

Overview: Working in a hands-on learning environment, guided by our expert team, attendees will learn to: Understand how data analysts and scientists gather and analyze data; Perform data analysis and data wrangling using Python; Combine, group, and aggregate data from multiple sources; Create data visualizations with pandas, matplotlib, and seaborn; Apply machine learning (ML) algorithms to identify patterns and make predictions; Use Python data science libraries to analyze real-world datasets; Use pandas to solve common data representation and analysis problems; Build Python scripts, modules, and packages for reusable analysis code; Perform efficient data analysis and manipulation tasks using pandas; Apply pandas to different real-world domains with the help of step-by-step demonstrations; Get accustomed to using pandas as an effective data exploration tool.

Data analysis has become a necessary skill in a variety of domains where knowing how to work with data and extract insights can generate significant value. Geared toward data team members with prior Python scripting experience, Hands-On Data Analysis with Pandas will show you how to analyze your data, get started with machine learning, and work effectively with Python libraries often used for data science, such as pandas, NumPy, matplotlib, seaborn, and scikit-learn. Using real-world datasets, you will learn how to use the pandas library to perform data wrangling to reshape, clean, and aggregate your data. Then, you will be able to conduct exploratory data analysis by calculating summary statistics and visualizing the data to find patterns.
In the concluding lessons, you will explore some applications of anomaly detection, regression, clustering, and classification using scikit-learn to make predictions based on past data. Students will leave the course armed with the skills required to use pandas to ensure the veracity of their data, visualize it for effective decision-making, and reliably reproduce analyses across multiple datasets. Introduction to Data Analysis Fundamentals of data analysis Statistical foundations Setting up a virtual environment Working with Pandas DataFrames Pandas data structures Bringing data into a pandas DataFrame Inspecting a DataFrame object Grabbing subsets of the data Adding and removing data Data Wrangling with Pandas What is data wrangling? Collecting temperature data Cleaning up the data Restructuring the data Handling duplicate, missing, or invalid data Aggregating Pandas DataFrames Database-style operations on DataFrames DataFrame operations Aggregations with pandas and numpy Time series Visualizing Data with Pandas and Matplotlib An introduction to matplotlib Plotting with pandas The pandas.plotting subpackage Plotting with Seaborn and Customization Techniques Utilizing seaborn for advanced plotting Formatting Customizing visualizations Financial Analysis - Bitcoin and the Stock Market Building a Python package Data extraction with pandas Exploratory data analysis Technical analysis of financial instruments Modeling performance Rule-Based Anomaly Detection Simulating login attempts Exploratory data analysis Rule-based anomaly detection Getting Started with Machine Learning in Python Learning the lingo Exploratory data analysis Preprocessing data Clustering Regression Classification Making Better Predictions - Optimizing Models Hyperparameter tuning with grid search Feature engineering Ensemble methods Inspecting classification prediction confidence Addressing class imbalance Regularization Machine Learning Anomaly Detection Exploring the data Unsupervised methods 
Supervised methods Online learning The Road Ahead Data resources Practicing working with data Python practice

Hands-on Data Analysis with Pandas (TTPS4878)
Delivered Online, Flexible Dates
Price on Enquiry

B6159 IBM Cognos Analytics - Author Reports Advanced (v11.0.x)

By Nexus Human

Duration 2 Days 12 CPD hours This course is intended for Report Authors Overview Create query models Create reports based on query relationships Introduction to dimensional data Introduction to dimensional data in reports Dimensional report context Focus your dimensional data Calculations and dimensional functions Create advanced dynamic reports This offering teaches Professional Report Authors about advanced report building techniques using relational data models, dimensional data, and ways of enhancing, customizing, managing, and distributing professional reports. The course builds on topics presented in the Fundamentals course. Activities will illustrate and reinforce key concepts during this learning activity. Create query models Build a query and connect it to a report Answer a business question by referencing data in a separate query Create reports based on query relationships Create join relationships between queries Combine data containers based on relationships from different queries Create a report comparing the percentage of change Introduction to dimensional reporting concepts Examine data sources and model types Describe the dimensional approach to queries Apply report authoring styles Introduction to dimensional data in reports Use members to create reports Identify sets and tuples in reports Use query calculations and set definitions Dimensional report context Examine dimensional report members Examine dimensional report measures Use the default measure to create a summarized column in a report Focus your dimensional data Focus your report by excluding members of a defined set Compare the use of the filter() function to a detail filter Filter dimensional data using slicers Calculations and dimensional functions Examine dimensional functions Show totals and exclude members Create a percent of base calculation Create advanced dynamic reports Use query macros Control report output using a query macro Create a dynamic growth report Create a report that 
displays summary data before detailed data and uses singletons to summarize information Design effective prompts Create a prompt that allows users to select conditional formatting values Create a prompt that provides users a choice between different filters Create a prompt to let users choose a column sort order Create a prompt to let users select a display type Examine the report specification Examine report specification flow Identify considerations when modifying report specifications Customize reporting objects Distribute reports Burst a report to email recipients by using a data item Burst a list report to the IBM Cognos Analytics portal by using a burst table Burst a crosstab report to the IBM Cognos Analytics portal by using a burst table and a master detail relationship Enhance user interaction with HTML Create interactive reports using HTML Include additional information with tooltips Send emails using links in a report Introduction to IBM Cognos Active Reports Examine Active Report controls and variables Create a simple Active Report using Static and Data-driven controls Change filtering and selection behavior in a report Create interaction between multiple controls and variables Active Report charts and decks Create an Active Report with a Data deck Use Master detail relationships with Decks Optimize Active Reports Create an Active Report with new visualizations

B6159 IBM Cognos Analytics - Author Reports Advanced (v11.0.x)
Delivered Online, Flexible Dates
Price on Enquiry

B6061 IBM Cognos Analytics - Author Reports with Multidimensional Data (V11.0)

By Nexus Human

Duration: 2 Days, 12 CPD hours

This course is intended for: Report authors working with dimensional data sources.

Through interactive demonstrations & exercises, participants will learn how to author reports that navigate & manipulate dimensional data structures using the specific dimensional functions & features available in IBM Cognos Analytics.

Introduction to Dimensional Concepts Identify different data sources and models Investigate the OLAP dimensional structure Identify dimensional data items and expressions Differentiate the IBM Cognos Analytics query language from SQL and MDX Differentiate relational and dimensional report authoring styles Introduction to Dimensional Data in Reports Work with members Identify sets and tuples in IBM Cognos Analytics Dimensional Report Context Understand the purpose of report context Understand how data is affected by default and root members Focus Your Dimensional Data Compare dimensional queries to relational queries Explain the importance of filtering dimensional queries Evaluate different filtering techniques Filter based on dimensions and members Filter based on measure values Filter using a slicer Calculations & Dimensional Functions Use IBM Cognos Analytics dimensional functions to create sets and tuples Perform arithmetic operations in OLAP queries Identify coercion errors and rules Functions for Navigating Dimensional Hierarchies Navigate dimensional data using family functions Relative Functions Navigate dimensional data using relative functions Navigate dimensional data using relative time functions Advanced Drilling Techniques & Member Sets Understand default drill-up and drill-down functionality Identify cases when you need to override default drilling behavior Configure advanced drilling behavior to support sophisticated use cases Define member sets to support advanced drilling Define member sets to support functions Set Up Drill-Through Reports Navigate from a specific report to a target report Drill down to
greater detail and then navigate to target report Navigate between reports created using different data sources End-to-End Workshop Review concepts covered throughout the course

B6061 IBM Cognos Analytics - Author Reports with Multidimensional Data (V11.0)
Delivered Online, Flexible Dates
Price on Enquiry

B6158 IBM Cognos Analytics - Author Reports Fundamentals (v11.0.x)

By Nexus Human

Duration: 3 Days, 18 CPD hours

This course is intended for: Report Authors

Overview: What is IBM Cognos Analytics - Reporting; Examine dimensionally modelled and dimensional data sources; Examine personal data sources and data modules; Examine List reports; Aggregate measure/fact data; Use shared dimensions to create multi-fact queries; Add repeated information to reports; Create crosstab reports; Create complex crosstab reports; Format, sort, and aggregate data in a crosstab report; Create discontinuous crosstab reports; Create Visualization reports; Add business logic to reports using IBM Cognos Analytics - Reporting; Focus reports using filters; Focus reports using prompts; Augment reports using calculations; Extend report functionality in IBM Cognos Analytics - Reporting; Customize reports with conditional formatting; Conditionally format one crosstab measure based on another; Drill-through definitions; Enhance the report layout; Use additional report building techniques

This offering provides Business and Professional Authors with an introduction to report building techniques using relational data models. Techniques to enhance, customize, and manage professional reports will be explored. Activities will illustrate and reinforce key concepts during this learning opportunity.

What is IBM Cognos Analytics - Reporting?
Create a simple list report Create a report from a dimensionally modeled relational data source Examine personal data sources and data modules Upload personal data Upload custom images Use navigation paths Create a report from a personal data source Examine list reports Group data in a list Format columns in a list Include headers and footers in a list Enhance a list report Aggregate measure/fact data Identify differences in aggregation Explore data aggregation Use shared dimensions to create multi-fact queries Create a multi-fact query in a list report Add repeated information to reports Create a mailing list report Create crosstab reports Add measures to a crosstab Data sources for a crosstab Create a simple crosstab report Create complex crosstab reports Add items as peers Create crosstab nodes and crosstab members Create a complex crosstab report Format, sort, and aggregate data in a crosstab Sort, format, and aggregate a crosstab report Create discontinuous crosstab reports Present unrelated items using a discontinuous crosstab Create a visualization report Create and format a visualization report Create a report that uses a Map visualization Show the same data graphically and numerically Focus reports using filters Apply filters to a report Apply a detail filter on fact data in a report Apply a summary filter to a report Focus reports using prompts Create a prompt by adding a parameter Add a value prompt to a report Add a Select & search prompt to a report Create a cascading prompt Augment reports using calculations Add calculations to a report Display prompt selections in the report title Customize reports with conditional formatting Create a multilingual report Highlight exceptional data and conditionally render a column Drill-through definitions Let users navigate to related data in IBM Cognos Analytics Enhance report layout Create a report structured on data items Create a condensed list report Use additional report building techniques Section a report 
and reuse objects within the same report Reuse layout components in a different report Explore options for reports that contain no data

B6158 IBM Cognos Analytics - Author Reports Fundamentals (v11.0.x)
Delivered Online, Flexible Dates
Price on Enquiry

Data Wrangling with Python

By Nexus Human

Duration: 3 Days, 18 CPD hours

This course is intended for: Beginners. Data Wrangling with Python takes a practical approach to equipping beginners with the most essential data analysis tools in the shortest possible time. It contains multiple activities that use real-life business scenarios for you to practice and apply your new skills in a highly relevant context.

Overview: By the end of this course, you will be confident in using a diverse array of sources to extract, clean, transform, and format your data efficiently.

In this course you will start with the absolute basics of Python, focusing mainly on data structures. Then you will delve into the fundamental tools of data wrangling, such as the NumPy and Pandas libraries. You'll learn why you should avoid traditional ways of data cleaning, as done in other languages, and instead take advantage of the specialized pre-built routines in Python. This combination of Python tips and tricks will also demonstrate how to use the same Python backend to extract and transform data from an array of sources including the Internet, large database vaults, and Excel financial tables. To help you prepare for more challenging scenarios, you'll cover how to handle missing or wrong data, and reformat it based on the requirements of the downstream analytics tool. The course will further help you grasp concepts through real-world examples and datasets.
Introduction to Data Structures Using Python Python for Data Wrangling Lists, Sets, Strings, Tuples, and Dictionaries Advanced Operations on Built-In Data Structures Advanced Data Structures Basic File Operations in Python Introduction to NumPy, Pandas, and Matplotlib NumPy Arrays Pandas DataFrames Statistics and Visualization with NumPy and Pandas Using NumPy and Pandas to Calculate Basic Descriptive Statistics on the DataFrame Deep Dive into Data Wrangling with Python Subsetting, Filtering, and Grouping Detecting Outliers and Handling Missing Values Concatenating, Merging, and Joining Useful Methods of Pandas Get Comfortable with Different Kinds of Data Sources Reading Data from Different Text-Based (and Non-Text-Based) Sources Introduction to BeautifulSoup4 and Web Page Parsing Learning the Hidden Secrets of Data Wrangling Advanced List Comprehension and the zip Function Data Formatting Advanced Web Scraping and Data Gathering Basics of Web Scraping and BeautifulSoup libraries Reading Data from XML RDBMS and SQL Refresher of RDBMS and SQL Using an RDBMS (MySQL/PostgreSQL/SQLite) Application in Real Life and Conclusion of Course Applying Your Knowledge to a Real-life Data Wrangling Task An Extension to Data Wrangling

Data Wrangling with Python
Delivered Online, Flexible Dates
Price on Enquiry

Data Science Projects with Python

By Nexus Human

Duration: 2 Days, 12 CPD hours

This course is intended for: Data analysts, data scientists, and business analysts who want to get started with using Python and machine learning techniques to analyze data and predict outcomes. Basic knowledge of computer programming and data analytics is a must. Familiarity with mathematical concepts such as algebra and basic statistics will be useful.

Overview: By the end of this course, you will have the skills you need to confidently use various machine learning algorithms to perform detailed data analysis and extract meaningful insights from data.

This course is designed to give you practical guidance on industry-standard data analysis and machine learning tools in Python, with the help of realistic data. The course will help you understand how you can use pandas and Matplotlib to critically examine a dataset with summary statistics and graphs, and extract the insights you seek to derive. You will continue to build on your knowledge as you learn how to prepare data and feed it to machine learning algorithms, such as regularized logistic regression and random forest, using the scikit-learn package. You'll discover how to tune the algorithms to provide the best predictions on new and unseen data. As you delve into later sections, you'll be able to understand the working and output of these algorithms and gain insight into not only the predictive capabilities of the models but also their reasons for making these predictions.
Data Exploration and Cleaning Python and the Anaconda Package Management System Different Types of Data Science Problems Loading the Case Study Data with Jupyter and pandas Data Quality Assurance and Exploration Exploring the Financial History Features in the Dataset Activity 1: Exploring Remaining Financial Features in the Dataset Introduction to Scikit-Learn and Model Evaluation Introduction Model Performance Metrics for Binary Classification Activity 2: Performing Logistic Regression with a New Feature and Creating a Precision-Recall Curve Details of Logistic Regression and Feature Exploration Introduction Examining the Relationships between Features and the Response Univariate Feature Selection: What It Does and Doesn't Do Building Cloud-Native Applications Activity 3: Fitting a Logistic Regression Model and Directly Using the Coefficients The Bias-Variance Trade-off Introduction Estimating the Coefficients and Intercepts of Logistic Regression Cross Validation: Choosing the Regularization Parameter and Other Hyperparameters Activity 4: Cross-Validation and Feature Engineering with the Case Study Data Decision Trees and Random Forests Introduction Decision trees Random Forests: Ensembles of Decision Trees Activity 5: Cross-Validation Grid Search with Random Forest Imputation of Missing Data, Financial Analysis, and Delivery to Client Introduction Review of Modeling Results Dealing with Missing Data: Imputation Strategies Activity 6: Deriving Financial Insights Final Thoughts on Delivering the Predictive Model to the Client

Data Science Projects with Python
Delivered Online, Flexible Dates
Price on Enquiry

Cloudera Data Analyst Training - Using Pig, Hive, and Impala with Hadoop

By Nexus Human

Duration 4 Days 24 CPD hours This course is intended for data analysts, business intelligence specialists, developers, system architects, and database administrators. Overview Skills gained in this training include: the features that Pig, Hive, and Impala offer for data acquisition, storage, and analysis; the fundamentals of Apache Hadoop and data ETL (extract, transform, load), ingestion, and processing with Hadoop; how Pig, Hive, and Impala improve productivity for typical analysis tasks; joining diverse datasets to gain valuable business insight; and performing real-time, complex queries on datasets. Cloudera University's four-day data analyst training course focusing on Apache Pig and Hive and Cloudera Impala will teach you to apply traditional data analytics and business intelligence skills to big data. Hadoop Fundamentals The Motivation for Hadoop Hadoop Overview Data Storage: HDFS Distributed Data Processing: YARN, MapReduce, and Spark Data Processing and Analysis: Pig, Hive, and Impala Data Integration: Sqoop Other Hadoop Data Tools Exercise Scenarios Explanation Introduction to Pig What Is Pig? Pig's Features Pig Use Cases Interacting with Pig Basic Data Analysis with Pig Pig Latin Syntax Loading Data Simple Data Types Field Definitions Data Output Viewing the Schema Filtering and Sorting Data Commonly-Used Functions Processing Complex Data with Pig Storage Formats Complex/Nested Data Types Grouping Built-In Functions for Complex Data Iterating Grouped Data Multi-Dataset Operations with Pig Techniques for Combining Data Sets Joining Data Sets in Pig Set Operations Splitting Data Sets Pig Troubleshooting & Optimization Troubleshooting Pig Logging Using Hadoop's Web UI Data Sampling and Debugging Performance Overview Understanding the Execution Plan Tips for Improving the Performance of Your Pig Jobs Introduction to Hive & Impala What Is Hive? What Is Impala? 
Schema and Data Storage Comparing Hive to Traditional Databases Hive Use Cases Querying with Hive & Impala Databases and Tables Basic Hive and Impala Query Language Syntax Data Types Differences Between Hive and Impala Query Syntax Using Hue to Execute Queries Using the Impala Shell Data Management Data Storage Creating Databases and Tables Loading Data Altering Databases and Tables Simplifying Queries with Views Storing Query Results Data Storage & Performance Partitioning Tables Choosing a File Format Managing Metadata Controlling Access to Data Relational Data Analysis with Hive & Impala Joining Datasets Common Built-In Functions Aggregation and Windowing Working with Impala How Impala Executes Queries Extending Impala with User-Defined Functions Improving Impala Performance Analyzing Text and Complex Data with Hive Complex Values in Hive Using Regular Expressions in Hive Sentiment Analysis and N-Grams Conclusion Hive Optimization Understanding Query Performance Controlling Job Execution Plan Bucketing Indexing Data Extending Hive SerDes Data Transformation with Custom Scripts User-Defined Functions Parameterized Queries Choosing the Best Tool for the Job Comparing MapReduce, Pig, Hive, Impala, and Relational Databases Which to Choose?

Cloudera Data Analyst Training - Using Pig, Hive, and Impala with Hadoop
Delivered Online, Flexible Dates
Price on Enquiry