Duration 4 Days 24 CPD hours This course is intended for The workshop is designed for data scientists who currently use Python or R to work with smaller datasets on a single machine and who need to scale up their analyses and machine learning models to large datasets on distributed clusters. Data engineers and developers with some knowledge of data science and machine learning may also find this workshop useful. Overview Overview of data science and machine learning at scale Overview of the Hadoop ecosystem Working with HDFS data and Hive tables using Hue Introduction to Cloudera Data Science Workbench Overview of Apache Spark 2 Reading and writing data Inspecting data quality Cleansing and transforming data Summarizing and grouping data Combining, splitting, and reshaping data Exploring data Configuring, monitoring, and troubleshooting Spark applications Overview of machine learning in Spark MLlib Extracting, transforming, and selecting features Building and evaluating regression models Building and evaluating classification models Building and evaluating clustering models Cross-validating models and tuning hyperparameters Building machine learning pipelines Deploying machine learning models Spark, Spark SQL, and Spark MLlib PySpark and sparklyr Cloudera Data Science Workbench (CDSW) Hue This workshop covers data science and machine learning workflows at scale using Apache Spark 2 and other key components of the Hadoop ecosystem. The workshop emphasizes the use of data science and machine learning methods to address real-world business challenges. Using scenarios and datasets from a fictional technology company, students discover insights to support critical business decisions and develop data products to transform the business. The material is presented through a sequence of brief lectures, interactive demonstrations, extensive hands-on exercises, and discussions. 
The Apache Spark demonstrations and exercises are conducted in Python (with PySpark) and R (with sparklyr) using the Cloudera Data Science Workbench (CDSW) environment. Additional course details: Nexus Humans Cloudera Data Scientist Training training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. 
Whether you're stepping into the realm of professional skills or a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success. While we feel this is the best course for the Cloudera Data Scientist Training course and one of our Top 10 we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.
Duration 5 Days 30 CPD hours This course is intended for This course is intended for Database Administrators, Database Developers, BI professionals, and Business reporting users. Overview Upon successful completion of this course, students will be able to run queries and retrieve results, perform conditional searches, and retrieve data from multiple tables. Before starting this course, make sure you meet at least one of the following prerequisites: Basic knowledge of the Microsoft Windows operating system and its core functionality. Basic working knowledge of relational databases. In this course, students will gain a good understanding of the Transact-SQL language. They will be able to create queries, sort and filter data, and execute procedures with T-SQL. Course Outline 1. Introduction to Microsoft SQL Server 2. Introduction to T-SQL Querying 3. Writing SELECT Queries 4. Querying Multiple Tables 5. Sorting and Filtering Data 6. Working with SQL Server Data Types 7. Using DML to Modify Data 8. Using Built-In Functions 9. Grouping and Aggregating Data 10. Using Subqueries 11. Using Table Expressions 12. Using Set Operators 13. Using Window Ranking, Offset, and Aggregate Functions 14. Pivoting and Grouping Sets 15. Executing Stored Procedures 16. Programming with T-SQL
Duration 2.5 Days 15 CPD hours This course is intended for This course is intended for those with a basic understanding of Tableau who want to pursue mastery of the advanced features. Overview The goal of this course is to present essential Tableau concepts and its advanced functionalities to help better prepare and analyze data. This course will use Tableau Hyper, Tableau Prep and more. Getting Up to Speed - a Review of the Basics Connecting Tableau to your data Connecting to Tableau Server Connecting to saved data sources Measure Names and Measure Values Three essential Tableau concepts Exporting data to other devices Summary All About Data - Getting Your Data Ready Data mining and knowledge discovery process models CRISP-DM All About Data - Joins, Blends, and Data Structures Introduction to joins Introduction to complex joins Exercise: observing join culling Introduction to join calculations Introduction to spatial joins Introduction to unions Understanding data blending Order of operations No dimensions from a secondary source Introduction to scaffolding Introduction to data structures Exercise: adjusting the data structure for different questions Summary Table Calculations A definition and two questions Introduction to functions Directional and non-directional table calculations Application of functions Summary Level of Detail Calculations Building playgrounds Playground I: FIXED and EXCLUDE Playground II: INCLUDE Practical application Exercise: practical FIXED Exercise: practical INCLUDE Exercise: practical EXCLUDE Summary Beyond the Basic Chart Types Improving popular visualizations Custom background images Tableau extensions Summary Mapping Extending Tableau's mapping capabilities without leaving Tableau Extending Tableau mapping with other technology Exercise: connecting to a WMS server Exploring the TMS file Exploring 
Mapbox Accessing different maps with a dashboard Creating custom polygons Converting shape files for Tableau Exercise: polygons for Texas Heatmaps Summary Tableau for Presentations Getting the best images out of Tableau From Tableau to PowerPoint Embedding Tableau in PowerPoint Animating Tableau Story points and dashboards for Presentations Summary Visualization Best Practices and Dashboard Design Visualization design theory Formatting rules Color rules Visualization type rules Compromises Keeping visualizations simple Dashboard design Dashboard layout Sheet selection Summary Advanced Analytics Self-service Analytics Use case - Self-service Analytics Use case - Geo-spatial Analytics Summary Improving Performance Understanding the performance-recording dashboard Exercise: exploring performance recording in Tableau Desktop Performance-recording dashboard events Behind the scenes of the performance-recording dashboard Hardware and on-the-fly techniques Hardware considerations On-the-fly techniques Single Data Source > Joining > Blending Three ways Tableau connects to data Using referential integrity when joining Advantages of blending Efficiently working with data sources Tuning data sources Working efficiently with large data sources Intelligent extracts Understanding the Tableau data extract Constructing an extract for optimal performance Exercise: summary aggregates for improved performance Optimizing extracts Exercise: materialized calculations Using filters wisely Extract filter performance Data source filter performance Context filters Dimension and measure filters Table-calculation filters Efficient calculations Boolean/Numbers > Date > String Additional performance considerations Avoid overcrowding a dashboard Fixing dashboard sizing Setting expectations Summary Additional course details: Nexus Humans Advanced Tableau training program is a 
workshop.
Duration 2 Days 12 CPD hours This course is intended for Anyone who works with IBM SPSS Statistics and wants to learn advanced statistical procedures to be able to better answer research questions. Overview Introduction to advanced statistical analysis Group variables: Factor Analysis and Principal Components Analysis Group similar cases: Cluster Analysis Predict categorical targets with Nearest Neighbor Analysis Predict categorical targets with Discriminant Analysis Predict categorical targets with Logistic Regression Predict categorical targets with Decision Trees Introduction to Survival Analysis Introduction to Generalized Linear Models Introduction to Linear Mixed Models This course provides an application-oriented introduction to advanced statistical methods available in IBM SPSS Statistics. Students will review a variety of advanced statistical techniques and discuss situations in which each technique would be used, the assumptions made by each method, how to set up the analysis, and how to interpret the results. This includes a broad range of techniques for predicting variables, as well as methods to cluster variables and cases. 
Introduction to advanced statistical analysis Taxonomy of models Overview of supervised models Overview of models to create natural groupings Group variables: Factor Analysis and Principal Components Analysis Factor Analysis basics Principal Components basics Assumptions of Factor Analysis Key issues in Factor Analysis Improve the interpretability Use Factor and component scores Group similar cases: Cluster Analysis Cluster Analysis basics Key issues in Cluster Analysis K-Means Cluster Analysis Assumptions of K-Means Cluster Analysis TwoStep Cluster Analysis Assumptions of TwoStep Cluster Analysis Predict categorical targets with Nearest Neighbor Analysis Nearest Neighbor Analysis basics Key issues in Nearest Neighbor Analysis Assess model fit Predict categorical targets with Discriminant Analysis Discriminant Analysis basics The Discriminant Analysis model Core concepts of Discriminant Analysis Classification of cases Assumptions of Discriminant Analysis Validate the solution Predict categorical targets with Logistic Regression Binary Logistic Regression basics The Binary Logistic Regression model Multinomial Logistic Regression basics Assumptions of Logistic Regression procedures Testing hypotheses Predict categorical targets with Decision Trees Decision Trees basics Validate the solution Explore CHAID Explore CRT Comparing Decision Trees methods Introduction to Survival Analysis Survival Analysis basics Kaplan-Meier Analysis Assumptions of Kaplan-Meier Analysis Cox Regression Assumptions of Cox Regression Introduction to Generalized Linear Models Generalized Linear Models basics Available distributions Available link functions Introduction to Linear Mixed Models Linear Mixed Models basics Hierarchical Linear Models Modeling strategy Assumptions of Linear Mixed Models Additional course details: Nexus Humans 0G09A IBM Advanced Statistical Analysis Using IBM SPSS Statistics (v25) training program is a workshop that presents an invigorating mix of sessions, lessons, 
and masterclasses.
Duration 4 Days 24 CPD hours This course is intended for This course is best suited to developers, engineers, and architects who want to use Hadoop and related tools to solve real-world problems. Overview Skills learned in this course include: Creating a data set with Kite SDK; Developing custom Flume components for data ingestion; Managing a multi-stage workflow with Oozie; Analyzing data with Crunch; Writing user-defined functions for Hive and Impala; Indexing data with Cloudera Search. Cloudera University's four-day course for designing and building Big Data applications prepares you to analyze and solve real-world problems using Apache Hadoop and associated tools in the enterprise data hub (EDH). Introduction Application Architecture Scenario Explanation Understanding the Development Environment Identifying and Collecting Input Data Selecting Tools for Data Processing and Analysis Presenting Results to the User Defining & Using Datasets Metadata Management What is Apache Avro? Avro Schemas Avro Schema Evolution Selecting a File Format Performance Considerations Using the Kite SDK Data Module What is the Kite SDK? Fundamental Data Module Concepts Creating New Data Sets Using the Kite SDK Loading, Accessing, and Deleting a Data Set Importing Relational Data with Apache Sqoop What is Apache Sqoop? Basic Imports Limiting Results Improving Sqoop's Performance Sqoop 2 Capturing Data with Apache Flume What is Apache Flume? 
Basic Flume Architecture Flume Sources Flume Sinks Flume Configuration Logging Application Events to Hadoop Developing Custom Flume Components Flume Data Flow and Common Extension Points Custom Flume Sources Developing a Flume Pollable Source Developing a Flume Event-Driven Source Custom Flume Interceptors Developing a Header-Modifying Flume Interceptor Developing a Filtering Flume Interceptor Writing Avro Objects with a Custom Flume Interceptor Managing Workflows with Apache Oozie The Need for Workflow Management What is Apache Oozie? Defining an Oozie Workflow Validation, Packaging, and Deployment Running and Tracking Workflows Using the CLI Hue UI for Oozie Processing Data Pipelines with Apache Crunch What is Apache Crunch? Understanding the Crunch Pipeline Comparing Crunch to Java MapReduce Working with Crunch Projects Reading and Writing Data in Crunch Data Collection API Functions Utility Classes in the Crunch API Working with Tables in Apache Hive What is Apache Hive? Accessing Hive Basic Query Syntax Creating and Populating Hive Tables How Hive Reads Data Using the RegexSerDe in Hive Developing User-Defined Functions What are User-Defined Functions? Implementing a User-Defined Function Deploying Custom Libraries in Hive Registering a User-Defined Function in Hive Executing Interactive Queries with Impala What is Impala? Comparing Hive to Impala Running Queries in Impala Support for User-Defined Functions Data and Metadata Management Understanding Cloudera Search What is Cloudera Search? Search Architecture Supported Document Formats Indexing Data with Cloudera Search Collection and Schema Management Morphlines Indexing Data in Batch Mode Indexing Data in Near Real Time Presenting Results to Users Solr Query Syntax Building a Search UI with Hue Accessing Impala through JDBC Powering a Custom Web Application with Impala and Search
Duration 2 Days 12 CPD hours This course is intended for This course is relevant to anyone who needs to work with and understand data including: Business Analysts, Data Analysts, Reporting and BI professionals Marketing and Digital Marketing professionals Digital, Web, e-Commerce, Social media and Mobile channel professionals Business managers who need to interpret analytical output to inform managerial decisions Overview This course will cover the basic theory of data visualization along with practical skills for creating compelling visualizations, reports and dashboards from data using Tableau. Outcome: After attending this course delegates will understand - How to move from business questions to great data visualizations and beyond How to apply the fundamentals of data visualization to create informative charts How to choose the right visualization type for the job at hand How to design and develop basic dashboards in Tableau that people will love to use by doing the following: Reading data sources into Tableau Setting up the roles and data types for your analysis Creating new data fields using a range of calculation types Creating the following types of charts - cross tabs, pie and bar charts, geographic maps, dual axis and combo charts, heat maps, highlight tables, tree maps and scatter plots Creating Dashboards that delight using all of the features available in Tableau. The use of analytics, statistics and data science in business has grown massively in recent years. Harnessing the power of data is opening actionable insights in diverse industries from banking to tourism. From Business Questions to Data Visualisation and Beyond The first step in any data analysis project is to move from a business question to data analysis and then on to a complete solution. This section will examine this conversion emphasizing: The use of data visualization to address a business need The data analytics process - 
from business questions to developed dashboards Introduction to Tableau - Part 1 In this section, the main functionality of Tableau will be explained including: Selecting and loading your data Defining data item properties Create basic calculations including basic arithmetic calculations, custom aggregations and ratios, date math, and quick table calculations Creating basic visualizations Creating a basic dashboard Introduction to Tableau - Part 2 In this section, the main functionality of Tableau will be explained including: Selecting and loading your data Defining data item properties Create basic calculations including basic arithmetic calculations, custom aggregations and ratios, date math, and quick table calculations Creating basic visualizations Creating a basic dashboard Key Components of Good Data Visualisation and The Visualisation Zoo In this section the following topics will be covered: Colour theory Graphical perception & communication Choosing the right chart for the right job Data Exploration with Tableau Exploring data to answer business questions is one of the key uses of applying good data visualization techniques within Tableau. In this section we will apply the data visualization theory from the previous section within Tableau to uncover trends within the data to answer specific business questions. The types of charts that will be covered are: Cross Tabs Pie and bar charts Geographic maps Dual axis and combo charts with different mark types Heat maps Highlight tables Tree maps Scatter plots Introduction to Building Dashboards with Tableau In this section, we will implement the full process from business question to final basic dashboard in Tableau: Introduction to good dashboard design Building dashboards in Tableau
Duration 2 Days 12 CPD hours This course is intended for Experienced DataStage developers seeking training in more advanced DataStage job techniques and who seek techniques for working with complex types of data resources. Overview Use Connector stages to read from and write to database tables Handle SQL errors in Connector stages Use Connector stages with multiple input links Use the File Connector stage to access Hadoop HDFS data Optimize jobs that write to database tables Use the Unstructured Data stage to extract data from Excel spreadsheets Use the Data Masking stage to mask sensitive data processed within a DataStage job Use the Hierarchical stage to parse, compose, and transform XML data Use the Schema Library Manager to import and manage XML schemas Use the Data Rules stage to validate fields of data within a DataStage job Create custom data rules for validating data Design a job that processes a star schema data warehouse with Type 1 and Type 2 slowly changing dimensions This course is designed to introduce you to advanced parallel job data processing techniques in DataStage v11.5. In this course you will develop data techniques for processing different types of complex data resources including relational data, unstructured data (Excel spreadsheets), and XML data. In addition, you will learn advanced techniques for processing data, including techniques for masking data and techniques for validating data using data rules. Finally, you will learn techniques for updating data in a star schema data warehouse using the DataStage SCD (Slowly Changing Dimensions) stage. Even if you are not working with all of these specific types of data, you will benefit from this course by learning advanced DataStage job design techniques, techniques that go beyond those utilized in the DataStage Essentials course. 
Accessing databases Connector stage overview - Use Connector stages to read from and write to relational tables - Working with the Connector stage properties Connector stage functionality - Before / After SQL - Sparse lookups - Optimize insert/update performance Error handling in Connector stages - Reject links - Reject conditions Multiple input links - Designing jobs using Connector stages with multiple input links - Ordering records across multiple input links File Connector stage - Read and write data to Hadoop file systems Demonstration 1: Handling database errors Demonstration 2: Parallel jobs with multiple Connector input links Demonstration 3: Using the File Connector stage to read and write HDFS files Processing unstructured data Using the Unstructured Data stage in DataStage jobs - Extract data from an Excel spreadsheet - Specify a data range for data extraction in an Unstructured Data stage - Specify document properties for data extraction. Demonstration 1: Processing unstructured data Data masking Using the Data Masking stage in DataStage jobs - Data masking techniques - Data masking policies - Applying policies for masquerading context-aware data types - Applying policies for masquerading generic data types - Repeatable replacement - Using reference tables - Creating custom reference tables Demonstration 1: Data masking Using data rules Introduction to data rules - Using the Data Rules Editor - Selecting data rules - Binding data rule variables - Output link constraints - Adding statistics and attributes to the output information Use the Data Rules stage to validate foreign key references in source data Create custom data rules Demonstration 1: Using data rules Processing XML data Introduction to the Hierarchical stage - Hierarchical stage Assembly editor - Use the Schema Library Manager to import and manage XML schemas Composing XML data - Using the HJoin step to create parent-child relationships between input lists - Using the Composer step Writing 
Hierarchical data to a relational table Using the Regroup step Consuming XML data - Using the XML Parser step - Propagating columns Transforming XML data - Using the Aggregate step - Using the Sort step - Using the Switch step - Using the H-Pivot step Demonstration 1: Importing XML schemas Demonstration 2: Compose hierarchical data Demonstration 3: Consume hierarchical data Demonstration 4: Transform hierarchical data Updating a star schema database Surrogate keys - Design a job that creates and updates a surrogate key source key file from a dimension table Slowly Changing Dimensions (SCD) stage - Star schema databases - SCD stage Fast Path pages - Specifying purpose codes - Dimension update specification - Design a job that processes a star schema database with Type 1 and Type 2 slowly changing dimensions Demonstration 1: Build a parallel job that updates a star schema database with two dimensions Additional course details: Nexus Humans KM423 IBM InfoSphere DataStage v11.5 - Advanced Data Processing training program is a workshop.
Duration 2 Days 12 CPD hours This course is intended for New users of IBM SPSS Statistics Users who want to refresh their knowledge about IBM SPSS Statistics Anyone who is considering purchasing IBM SPSS Statistics Overview Introduction to IBM SPSS Statistics Review basic concepts in IBM SPSS Statistics Identify the steps in the research process Review basic analyses Use Help Reading data and defining metadata Overview of data sources Read from text files Read data from Microsoft Excel Read data from databases Define variable properties Selecting cases for analyses Select cases for analyses Run analyses for groups Apply report authoring styles Transforming variables Compute variables Recode values of categorical and scale variables Create a numeric variable from a string variable Using functions to transform variables Use statistical functions Use logical functions Use missing value functions Use conversion functions Use system variables Use the Date and Time Wizard Setting the unit of analysis Remove duplicate cases Create aggregate datasets Restructure datasets Merging data files Add cases from one dataset to another Add variables from one dataset to another Enrich a dataset with aggregated information Summarizing individual variables Define levels of measurement Summarizing categorical variables Summarizing scale variables Describing the relationship between variables Choose the appropriate procedure Summarize the relationship between categorical variables Summarize the relationship between a scale and a categorical variable Creating presentation ready tables with Custom Tables Identify table layouts Create tables for variables with shared categories Create tables for multiple response questions Customizing pivot tables Perform Automated Output Modification Customize pivot tables Use table templates Export pivot tables to other applications Working with syntax Use syntax to automate analyses Create, edit, and run syntax Shortcuts in the Syntax Editor Controlling 
the IBM SPSS Statistics environment Set options for output Set options for variables display Set options for default working folders This course guides students through the fundamentals of using IBM SPSS Statistics for typical data analysis. Students will learn the basics of reading data, data definition, data modification, data analysis, and presentation of analytical results. In addition to the fundamentals, students will learn shortcuts that will help them save time. This course uses the IBM SPSS Statistics Base; one section presents an add-on module, IBM SPSS Custom Tables. Additional course details: Nexus Humans 0G53BG IBM SPSS Statistics Essentials (V26) training program is a workshop.
Duration 2 Days 12 CPD hours This course is intended for This course is intended for those who need to use Tableau Desktop to build complex visuals and dashboards to present information or to monitor data about their organization. Overview Upon completion of this course, participants will be able to: Select the best method to work with multiple data sources. Create complex visuals using calculations and parameters. Apply best practices to improve the layout and aesthetics of dashboards. This course enables participants to create complex visualizations and to combine them into interactive dashboards to share with others using Tableau Desktop. The Data Data Interpreter Data Joins Same Database Cross Databases Spatial Join New! Data Blending New Union Custom SQL Tableau Extract TDE Hyper Clipboard Database Changes Automatic Updates Calculations Regular Calculations Quick Table Calculations Table Calculations Level of Detail (LOD) Expressions Complex Visualizations Custom Background Map Web Map Servers Dual Maps Bar in Bar Graph Bullet Graph Pareto Chart Sparkline Report Top N Within a Category Report Waterfall Chart Funnel Chart Pattern Analysis using the Path Shelf Building Better Dashboards Best Practices for Design Best Practices for Performance Creating a Template Workbook Using Layout Containers Dashboard Extenders New! Generating A Performance Summary
Duration 5 Days 30 CPD hours This course is intended for This intermediate and beyond level course is geared for experienced technical professionals in various roles, such as developers, data analysts, data engineers, software engineers, and machine learning engineers who want to leverage Scala and Spark to tackle complex data challenges and develop scalable, high-performance applications across diverse domains. Practical programming experience is required to participate in the hands-on labs. Overview Working in a hands-on learning environment led by our expert instructor, you'll: Develop a basic understanding of Scala and Apache Spark fundamentals, enabling you to confidently create scalable and high-performance applications. Learn how to process large datasets efficiently, helping you handle complex data challenges and make data-driven decisions. Gain hands-on experience with real-time data streaming, allowing you to manage and analyze data as it flows into your applications. Acquire practical knowledge of machine learning algorithms using Spark MLlib, empowering you to create intelligent applications and uncover hidden insights. Master graph processing with GraphX, enabling you to analyze and visualize complex relationships in your data. Discover generative AI technologies using GPT with Spark and Scala, opening up new possibilities for automating content generation and enhancing data analysis. Embark on a journey to master the world of big data with our immersive course on Scala and Spark! Mastering Scala with Apache Spark for the Modern Data Enterprise is a five-day, hands-on course designed to provide you with the essential skills and tools to tackle complex data projects using the Scala programming language and Apache Spark, a high-performance data processing engine. 
Mastering these technologies will enable you to perform a wide range of tasks, from data wrangling and analytics to machine learning and artificial intelligence, across various industries and applications. Guided by our expert instructor, you'll explore the fundamentals of Scala programming and Apache Spark while gaining valuable hands-on experience with Spark programming, RDDs, DataFrames, Spark SQL, and data sources. You'll also explore Spark Streaming, performance optimization techniques, and the integration of popular external libraries, tools, and cloud platforms like AWS, Azure, and GCP. Machine learning enthusiasts will delve into Spark MLlib, covering basics of machine learning algorithms, data preparation, feature extraction, and various techniques such as regression, classification, clustering, and recommendation systems. Introduction to Scala Brief history and motivation Differences between Scala and Java Basic Scala syntax and constructs Scala's functional programming features Introduction to Apache Spark Overview and history Spark components and architecture Spark ecosystem Comparing Spark with other big data frameworks Basics of Spark Programming SparkContext and SparkSession Resilient Distributed Datasets (RDDs) Transformations and Actions Working with DataFrames Spark SQL and Data Sources Spark SQL library and its advantages Structured and semi-structured data sources Reading and writing data in various formats (CSV, JSON, Parquet, Avro, etc.) 
Data manipulation using SQL queries Basic RDD Operations Creating and manipulating RDDs Common transformations and actions on RDDs Working with key-value data Basic DataFrame and Dataset Operations Creating and manipulating DataFrames and Datasets Column operations and functions Filtering, sorting, and aggregating data Introduction to Spark Streaming Overview of Spark Streaming Discretized Stream (DStream) operations Windowed operations and stateful processing Performance Optimization Basics Best practices for efficient Spark code Broadcast variables and accumulators Monitoring Spark applications Integrating External Libraries and Tools, Spark Streaming Using popular external libraries, such as Hadoop and HBase Integrating with cloud platforms: AWS, Azure, GCP Connecting to data storage systems: HDFS, S3, Cassandra, etc. Introduction to Machine Learning Basics Overview of machine learning Supervised and unsupervised learning Common algorithms and use cases Introduction to Spark MLlib Overview of Spark MLlib MLlib's algorithms and utilities Data preparation and feature extraction Linear Regression and Classification Linear regression algorithm Logistic regression for classification Model evaluation and performance metrics Clustering Algorithms Overview of clustering algorithms K-means clustering Model evaluation and performance metrics Collaborative Filtering and Recommendation Systems Overview of recommendation systems Collaborative filtering techniques Implementing recommendations with Spark MLlib Introduction to Graph Processing Overview of graph processing Use cases and applications of graph processing Graph representations and operations Introduction to Spark GraphX Overview of GraphX Creating and transforming graphs Graph algorithms in GraphX Big Data Innovation! 
Using GPT and Generative AI Technologies with Spark and Scala Overview of generative AI technologies Integrating GPT with Spark and Scala Practical applications and use cases Bonus Topics / Time Permitting Introduction to Spark NLP Overview of Spark NLP Preprocessing text data Text classification and sentiment analysis Putting It All Together Work on a capstone project that integrates multiple aspects of the course, including data processing, machine learning, graph processing, and generative AI technologies.