
196 Data Skills courses in Cardiff delivered Online

Visualizing Data Designing Informative Graphics

By Compete High

Overview: Visualizing Data: Designing Informative Graphics

Welcome to 'Visualizing Data: Designing Informative Graphics,' a comprehensive course designed to equip you with the skills needed to create compelling and informative visualizations from raw data. In today's data-driven world, the ability to effectively communicate insights through visualizations is crucial for professionals across various industries.

Module 1: Introduction to Data Visualization
In this module, you'll embark on your journey into the world of data visualization. Understand the importance of data visualization, its applications, and the fundamental principles behind creating impactful visuals.

Module 2: Choosing the Right Visualization Types
Discover the diverse range of visualization types available and learn how to select the most suitable ones for different data sets and objectives. Gain insights into when to use bar charts, line graphs, scatter plots, and more.

Module 3: Data Preparation and Cleaning
Master the art of preparing and cleaning data for visualization. Learn essential techniques to ensure data accuracy, completeness, and consistency, laying a solid foundation for effective visualization creation.

Module 4: Design Principles for Effective Visualizations
Unlock the secrets of designing visually appealing and informative graphics. Explore principles such as color theory, typography, layout, and visual hierarchy to create engaging and user-friendly visualizations.

Module 5: Basic Charts and Graphs
Dive into the world of basic charts and graphs, including bar charts, pie charts, histograms, and line graphs. Understand how to construct these fundamental visualizations accurately to convey your message effectively.

Module 6: Advanced Charts and Graphs
Take your visualization skills to the next level with advanced chart types such as heatmaps, treemaps, and network diagrams. Explore complex data structures and learn to visualize them in a clear and intuitive manner.

By the end of this course, you'll have the knowledge and confidence to transform raw data into visually compelling stories that drive understanding and decision-making. Whether you're a data analyst, business professional, or aspiring data visualization expert, 'Visualizing Data: Designing Informative Graphics' is your gateway to mastering the art of data visualization. Don't miss out on this opportunity to elevate your skills and make a lasting impact with your data presentations. Enroll now and embark on your journey towards becoming a proficient data visualization practitioner!

Course Curriculum
Module 1: Introduction to Data Visualization
Module 2: Choosing the Right Visualization Types
Module 3: Data Preparation and Cleaning
Module 4: Design Principles for Effective Visualizations
Module 5: Basic Charts and Graphs
Module 6: Advanced Charts and Graphs

Delivered Online On Demand, 6 hours
£5

Data Analysis Basics

By Compete High

Overview

With the ever-increasing demand for Data Analysis in personal and professional settings, this online training aims to educate, nurture, and upskill individuals so they stay ahead of the curve, whatever their level of expertise in Data Analysis may be.

Learning about Data Analysis, or keeping up to date with it, can be confusing and even daunting. That's not the case with this course from Compete High. We understand the different requirements of the wide variety of people looking to build skills in Data Analysis, so we've developed this online training to cater to learners with different goals in mind. The course materials are prepared in consultation with experts in the field, and the information on Data Analysis is updated regularly so that learners don't fall behind current trends.

The self-paced online format of this Compete High Data Analysis Basics course lets you learn whenever and however you wish, without the scheduling constraints and inconveniences of physical classes. The easy-to-grasp, bite-sized lessons are designed to help you memorise and retain the material. On top of that, you have the opportunity to receive a certificate after successfully completing the course!

Instead of searching for hours, enrol right away on this Data Analysis Basics course from Compete High and accelerate your career in the right direction with expert-outlined lessons and a guarantee of success in the long run.

Who is this course for?

While we do not discourage anyone from taking this Data Analysis Basics course or impose any restrictions on this online training, people meeting any of the following criteria will benefit the most from it:
• Anyone looking for the basics of Data Analysis
• Jobseekers in the relevant domains
• Anyone with basic knowledge of or intermediate expertise in Data Analysis
• Anyone looking for a certificate of completion for online training on this topic
• Students of Data Analysis, or anyone with an academic knowledge gap to bridge
• Anyone with a general interest or curiosity

Career Path

This Data Analysis Basics course smooths your way up the career ladder with relevant information, skills, and an online certificate of achievement. After successfully completing the course, you can expect to move one significant step closer to achieving your professional goals, whether that's securing the job you desire, getting the promotion you deserve, or setting up the business of your dreams.

Course Curriculum
Module 01: Introduction to Data Analysis and its Applications
Module 02: Probability and Probability Distributions
Module 03: Decision Making and Factors to Account For
Module 04: Data Mining
Module 05: Optimization and Situation Modelling

Delivered Online On Demand, 5 hours
£4.99

Big Data Analytics

By Compete High

Overview

With the ever-increasing demand for Big Data Analytics in personal and professional settings, this online training aims to educate, nurture, and upskill individuals so they stay ahead of the curve, whatever their level of expertise in Big Data Analytics may be.

Learning about Big Data Analytics, or keeping up to date with it, can be confusing and even daunting. That's not the case with this course from Compete High. We understand the different requirements of the wide variety of people looking to build skills in Big Data Analytics, so we've developed this online training to cater to learners with different goals in mind. The course materials are prepared in consultation with experts in the field, and the information on Big Data Analytics is updated regularly so that learners don't fall behind current trends.

The self-paced online format of this Compete High Big Data Analytics course lets you learn whenever and however you wish, without the scheduling constraints and inconveniences of physical classes. The easy-to-grasp, bite-sized lessons are designed to help you memorise and retain the material. On top of that, you have the opportunity to receive a certificate after successfully completing the course!

Instead of searching for hours, enrol right away on this Big Data Analytics course from Compete High and accelerate your career in the right direction with expert-outlined lessons and a guarantee of success in the long run.

Who is this course for?

While we do not discourage anyone from taking this Big Data Analytics course or impose any restrictions on this online training, people meeting any of the following criteria will benefit the most from it:
• Anyone looking for the basics of Big Data Analytics
• Jobseekers in the relevant domains
• Anyone with basic knowledge of or intermediate expertise in Big Data Analytics
• Anyone looking for a certificate of completion for online training on this topic
• Students of Big Data Analytics, or anyone with an academic knowledge gap to bridge
• Anyone with a general interest or curiosity

Career Path

This Big Data Analytics course smooths your way up the career ladder with relevant information, skills, and an online certificate of achievement. After successfully completing the course, you can expect to move one significant step closer to achieving your professional goals, whether that's securing the job you desire, getting the promotion you deserve, or setting up the business of your dreams.

Course Curriculum
Module 1: Introduction to Big Data
Module 2: Hadoop and MapReduce
Module 3: NoSQL Databases
Module 4: Data Storage and Retrieval
Module 5: Data Processing with Spark
Module 6: Data Analysis with Hadoop and Pig

Delivered Online On Demand, 6 hours
£4.99

Diploma in Data Analytics In Tableau

By Compete High

Overview

With the ever-increasing demand for Tableau in personal and professional settings, this online training aims to educate, nurture, and upskill individuals so they stay ahead of the curve, whatever their level of expertise in Tableau may be.

Learning about Tableau, or keeping up to date with it, can be confusing and even daunting. That's not the case with this course from Compete High. We understand the different requirements of the wide variety of people looking to build skills in Tableau, so we've developed this online training to cater to learners with different goals in mind. The course materials are prepared in consultation with experts in the field, and the information on Tableau is updated regularly so that learners don't fall behind current trends.

The self-paced online format of this Compete High Diploma in Data Analytics In Tableau course lets you learn whenever and however you wish, without the scheduling constraints and inconveniences of physical classes. The easy-to-grasp, bite-sized lessons are designed to help you memorise and retain the material. On top of that, you have the opportunity to receive a certificate after successfully completing the course!

Instead of searching for hours, enrol right away on this Diploma in Data Analytics In Tableau course from Compete High and accelerate your career in the right direction with expert-outlined lessons and a guarantee of success in the long run.

Who is this course for?

While we do not discourage anyone from taking this Diploma in Data Analytics In Tableau course or impose any restrictions on this online training, people meeting any of the following criteria will benefit the most from it:
• Anyone looking for the basics of Tableau
• Jobseekers in the relevant domains
• Anyone with basic knowledge of or intermediate expertise in Tableau
• Anyone looking for a certificate of completion for online training on this topic
• Students of Tableau, or anyone with an academic knowledge gap to bridge
• Anyone with a general interest or curiosity

Career Path

This Diploma in Data Analytics In Tableau course smooths your way up the career ladder with relevant information, skills, and an online certificate of achievement. After successfully completing the course, you can expect to move one significant step closer to achieving your professional goals, whether that's securing the job you desire, getting the promotion you deserve, or setting up the business of your dreams.

Course Curriculum
Module 01: Data Analytics
Module 02: Why Use Tableau for Data Analytics
Module 03: Getting Started With Tableau
Module 04: Tableau Data Source (TDS)
Module 05: Tableau Worksheets
Module 06: Tableau Calculations
Module 07: Tableau Sort & Filters
Module 08: Tableau Charts
Module 09: Tableau Advanced

Delivered Online On Demand, 1 hour
£4.99

Practical Data Science with Amazon SageMaker

By Nexus Human

Duration: 1 day (6 CPD hours)

This course is intended for
A technical audience at an intermediate level.

Overview
Using Amazon SageMaker, this course teaches you how to:
• Prepare a dataset for training
• Train and evaluate a machine learning model
• Automatically tune a machine learning model
• Prepare a machine learning model for production
• Think critically about machine learning model results

In this course, learn how to solve a real-world use case with machine learning and produce actionable results using Amazon SageMaker. This course teaches you how to use Amazon SageMaker to cover the different stages of the typical data science process, from analyzing and visualizing a data set, to preparing the data and feature engineering, down to the practical aspects of model building, training, tuning and deployment.

Day 1
• Business problem: Churn prediction
• Load and display the dataset
• Assess features and determine which Amazon SageMaker algorithm to use
• Use Amazon SageMaker to train, evaluate, and automatically tune the model
• Deploy the model
• Assess relative cost of errors

Additional course details:
Nexus Human's Practical Data Science with Amazon SageMaker training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you're new to professional skills or a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success.

While we feel this is the best course for Practical Data Science with Amazon SageMaker and one of our Top 10, we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.

Delivered Online, Flexible Dates
Price on Enquiry

Introduction to Writing SQL Queries (TTSQL003)

By Nexus Human

Duration: 3 days (18 CPD hours)

This course is intended for
This is an introductory-level course appropriate for those who are developing applications using relational databases, or who are using SQL to extract and analyze data from databases and need to use the full power of SQL queries.

Overview
This course combines expert lecture, real-world demonstrations and group discussions with machine-based practical labs and exercises. Working in a hands-on learning environment led by our expert practitioner, attendees will learn to:
• Maximize the potential of SQL to build powerful, complex and robust SQL queries
• Query multiple tables with inner joins, outer joins and self joins
• Construct recursive common table expressions
• Summarize data using aggregation and grouping
• Execute analytic functions to calculate ranks
• Build simple and correlated subqueries
• Thoroughly test SQL queries to avoid common errors
• Select the most efficient solution to complex SQL problems

A company's success hinges on responsible, accurate database management. Organizations rely on highly available data to complete all sorts of tasks, from creating marketing reports and invoicing customers to setting financial goals. Data professionals like analysts, developers and architects are tasked with creating, optimizing, managing and analyzing data from databases, with little room for error. When databases aren't built or maintained correctly, it's easy to mishandle or lose valuable data. Our SQL Programming and Database Training Series provides students with the skills they require to develop, analyze and maintain data in correctly structured, modern and secure databases. SQL is the cornerstone of all relational database operations. In this hands-on course, you learn to exploit the full potential of the SELECT statement to write robust queries using the best query method for your application, test your queries, and avoid common errors and pitfalls. It also teaches alternative solutions to given problems, enabling you to choose the most efficient solution in each situation.

Introduction: Quick Tools Review: Introduction to SQL and its development environments; Using SQL*PLUS; Using SQL Developer
Using the SQL SELECT Statement: Capabilities of the SELECT statement; Arithmetic expressions and NULL values in the SELECT statement; Column aliases; Use of concatenation operator, literal character strings, alternative quote operator, and the DISTINCT keyword; Use of the DESCRIBE command
Restricting and Sorting Data: Limiting the Rows; Rules of precedence for operators in an expression; Substitution Variables; Using the DEFINE and VERIFY command
Single-Row Functions: Describe the differences between single row and multiple row functions; Manipulate strings with character function in the SELECT and WHERE clauses; Manipulate numbers with the ROUND, TRUNC and MOD functions; Perform arithmetic with date data; Manipulate dates with the date functions
Conversion Functions and Expressions: Describe implicit and explicit data type conversion; Use the TO_CHAR, TO_NUMBER, and TO_DATE conversion functions; Nest multiple functions; Apply the NVL, NULLIF, and COALESCE functions to data; Decode/Case Statements
Using the Group Functions and Aggregated Data: Group Functions; Creating Groups of Data; Having Clause; Cube/Rollup Clause
SQL Joins and Join Types: Introduction to JOINS; Types of Joins; Natural join; Self-join; Non equijoins; OUTER join
Using Subqueries: Introduction to Subqueries; Single Row Subqueries; Multiple Row Subqueries
Using the SET Operators: Set Operators; UNION and UNION ALL operator; INTERSECT operator; MINUS operator; Matching the SELECT statements
Using Data Manipulation Language (DML) statements: Data Manipulation Language; Database Transactions; Insert; Update; Delete; Merge
Using Data Definition Language (DDL): Data Definition Language; Create; Alter; Drop
Data Dictionary Views: Introduction to Data Dictionary; Describe the Data Dictionary Structure; Using the Data Dictionary views; Querying the Data Dictionary Views; Dynamic Performance Views
Creating Sequences, Synonyms, Indexes: Creating sequences; Creating synonyms; Creating indexes; Index Types
Creating Views: Creating Views; Altering Views; Replacing Views
Managing Schema Objects: Managing constraints; Creating and using temporary tables; Creating and using external tables
Retrieving Data Using Subqueries: Retrieving Data by Using a Subquery as Source; Working with Multiple-Column subqueries; Correlated Subqueries; Non-Correlated Subqueries
Using Subqueries to Manipulate Data: Using the Check Option; Subqueries in Updates and Deletes; In-line Views
Data Control Language (DCL): System privileges; Creating a role; Object privileges; Revoking object privileges
Manipulating Data: Overview of the Explicit Default Feature; Using multitable INSERTs; Using the MERGE statement; Tracking Changes in Data

Delivered Online, Flexible Dates
Price on Enquiry

Mastering Scala with Apache Spark for the Modern Data Enterprise (TTSK7520)

By Nexus Human

Duration: 5 days (30 CPD hours)

This course is intended for
This intermediate and beyond level course is geared for experienced technical professionals in various roles, such as developers, data analysts, data engineers, software engineers, and machine learning engineers who want to leverage Scala and Spark to tackle complex data challenges and develop scalable, high-performance applications across diverse domains. Practical programming experience is required to participate in the hands-on labs.

Overview
Working in a hands-on learning environment led by our expert instructor you'll:
• Develop a basic understanding of Scala and Apache Spark fundamentals, enabling you to confidently create scalable and high-performance applications.
• Learn how to process large datasets efficiently, helping you handle complex data challenges and make data-driven decisions.
• Gain hands-on experience with real-time data streaming, allowing you to manage and analyze data as it flows into your applications.
• Acquire practical knowledge of machine learning algorithms using Spark MLlib, empowering you to create intelligent applications and uncover hidden insights.
• Master graph processing with GraphX, enabling you to analyze and visualize complex relationships in your data.
• Discover generative AI technologies using GPT with Spark and Scala, opening up new possibilities for automating content generation and enhancing data analysis.

Embark on a journey to master the world of big data with our immersive course on Scala and Spark! Mastering Scala with Apache Spark for the Modern Data Enterprise is a five-day, hands-on course designed to provide you with the essential skills and tools to tackle complex data projects using the Scala programming language and Apache Spark, a high-performance data processing engine. Mastering these technologies will enable you to perform a wide range of tasks, from data wrangling and analytics to machine learning and artificial intelligence, across various industries and applications.

Guided by our expert instructor, you'll explore the fundamentals of Scala programming and Apache Spark while gaining valuable hands-on experience with Spark programming, RDDs, DataFrames, Spark SQL, and data sources. You'll also explore Spark Streaming, performance optimization techniques, and the integration of popular external libraries, tools, and cloud platforms like AWS, Azure, and GCP. Machine learning enthusiasts will delve into Spark MLlib, covering basics of machine learning algorithms, data preparation, feature extraction, and various techniques such as regression, classification, clustering, and recommendation systems.

Introduction to Scala: Brief history and motivation; Differences between Scala and Java; Basic Scala syntax and constructs; Scala's functional programming features
Introduction to Apache Spark: Overview and history; Spark components and architecture; Spark ecosystem; Comparing Spark with other big data frameworks
Basics of Spark Programming: SparkContext and SparkSession; Resilient Distributed Datasets (RDDs); Transformations and Actions; Working with DataFrames
Spark SQL and Data Sources: Spark SQL library and its advantages; Structured and semi-structured data sources; Reading and writing data in various formats (CSV, JSON, Parquet, Avro, etc.); Data manipulation using SQL queries
Basic RDD Operations: Creating and manipulating RDDs; Common transformations and actions on RDDs; Working with key-value data
Basic DataFrame and Dataset Operations: Creating and manipulating DataFrames and Datasets; Column operations and functions; Filtering, sorting, and aggregating data
Introduction to Spark Streaming: Overview of Spark Streaming; Discretized Stream (DStream) operations; Windowed operations and stateful processing
Performance Optimization Basics: Best practices for efficient Spark code; Broadcast variables and accumulators; Monitoring Spark applications
Integrating External Libraries and Tools, Spark Streaming: Using popular external libraries, such as Hadoop and HBase; Integrating with cloud platforms: AWS, Azure, GCP; Connecting to data storage systems: HDFS, S3, Cassandra, etc.
Introduction to Machine Learning Basics: Overview of machine learning; Supervised and unsupervised learning; Common algorithms and use cases
Introduction to Spark MLlib: Overview of Spark MLlib; MLlib's algorithms and utilities; Data preparation and feature extraction
Linear Regression and Classification: Linear regression algorithm; Logistic regression for classification; Model evaluation and performance metrics
Clustering Algorithms: Overview of clustering algorithms; K-means clustering; Model evaluation and performance metrics
Collaborative Filtering and Recommendation Systems: Overview of recommendation systems; Collaborative filtering techniques; Implementing recommendations with Spark MLlib
Introduction to Graph Processing: Overview of graph processing; Use cases and applications of graph processing; Graph representations and operations
Introduction to Spark GraphX: Overview of GraphX; Creating and transforming graphs; Graph algorithms in GraphX
Big Data Innovation! Using GPT and Generative AI Technologies with Spark and Scala: Overview of generative AI technologies; Integrating GPT with Spark and Scala; Practical applications and use cases
Bonus Topics / Time Permitting: Introduction to Spark NLP; Overview of Spark NLP; Preprocessing text data; Text classification and sentiment analysis
Putting It All Together: Work on a capstone project that integrates multiple aspects of the course, including data processing, machine learning, graph processing, and generative AI technologies.

Delivered Online, Flexible Dates
Price on Enquiry

Python for Data Science: Hands-on Technical Overview (TTPS4873)

By Nexus Human

Duration: 2 days (12 CPD hours)

This course is intended for
This introductory-level course is intended for Business Analysts and Data Analysts (or anyone else in the data science realm) who are already comfortable working with numerical data in Excel or other spreadsheet environments. No prior programming experience is required, and a browser is the only tool necessary for the course.

Overview
This course is approximately 50% hands-on, combining expert lecture, real-world demonstrations and group discussions with machine-based practical labs and exercises. Our engaging instructors and mentors are highly experienced practitioners who bring years of current 'on-the-job' experience into every classroom. Throughout the hands-on course, students will learn to leverage Python scripting for data science (to a basic level) using the most current and efficient skills and techniques. Working in a hands-on learning environment, guided by our expert team, attendees will learn about and explore (to a basic level):
• How to work with Python interactively in web notebooks
• The essentials of Python scripting
• Key concepts necessary to enter the world of Data Science via Python

This course introduces data analysts and business analysts (as well as anyone interested in Data Science) to the Python programming language, as it's often used in Data Science in web notebooks. The goal of this course is to provide students with a baseline understanding of core concepts that can serve as a platform of knowledge to follow up with more in-depth training and real-world practice.

An Overview of Python: Why Python?; Python in the Shell; Python in Web Notebooks (iPython, Jupyter, Zeppelin); Demo: Python, Notebooks, and Data Science
Getting Started: Using variables; Builtin functions; Strings; Numbers; Converting among types; Writing to the screen; Command line parameters
Flow Control: About flow control; White space; Conditional expressions; Relational and Boolean operators; While loops; Alternate loop exits
Sequences, Arrays, Dictionaries and Sets: About sequences; Lists and list methods; Tuples; Indexing and slicing; Iterating through a sequence; Sequence functions, keywords, and operators; List comprehensions; Generator Expressions; Nested sequences; Working with Dictionaries; Working with Sets
Working with files: File overview; Opening a text file; Reading a text file; Writing to a text file; Reading and writing raw (binary) data
Functions: Defining functions; Parameters; Global and local scope; Nested functions; Returning values
Essential Demos: Sorting; Exceptions; Importing Modules; Classes; Regular Expressions
The standard library: Math functions; The string module
Dates and times: Working with dates and times; Translating timestamps; Parsing dates from text; Formatting dates; Calendar data
Python and Data Science: Data Science Essentials; Pandas Overview; NumPy Overview; SciKit Overview; MatPlotLib Overview; Working with Python in Data Science

Additional course details:
Nexus Human's Python for Data Science: Hands-on Technical Overview (TTPS4873) training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you're new to professional skills or a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success.

While we feel this is the best course for Python for Data Science: Hands-on Technical Overview (TTPS4873) and one of our Top 10, we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.

Delivered Online, Flexible Dates
Price on Enquiry

KM204 IBM InfoSphere DataStage Essentials (v11.5)

By Nexus Human

Duration: 4 days (24 CPD hours)

This course is intended for
Project administrators and ETL developers responsible for data extraction and transformation using DataStage.

Overview
• Describe the uses of DataStage and the DataStage workflow
• Describe the Information Server architecture and how DataStage fits within it
• Describe the Information Server and DataStage deployment options
• Use the Information Server Web Console and the DataStage Administrator client to create DataStage users and to configure the DataStage environment
• Import and export DataStage objects to a file
• Import table definitions for sequential files and relational tables
• Design, compile, run, and monitor DataStage parallel jobs
• Design jobs that read and write to sequential files
• Describe the DataStage parallel processing architecture
• Design jobs that combine data using joins and lookups
• Design jobs that sort and aggregate data
• Implement complex business logic using the DataStage Transformer stage
• Debug DataStage jobs using the DataStage PX Debugger

This course enables the project administrators & developers to acquire the skills necessary to develop parallel jobs in DataStage. Students will learn to create parallel jobs that access sequential & relational data, and combine and transform the data.

Course Outline
• Introduction to DataStage
• Deployment
• DataStage Administration
• Work with Metadata
• Create Parallel Jobs
• Access Sequential Data
• Partitioning and Collecting Algorithms
• Combine Data
• Group Processing Stages
• Transformer Stage
• Repository Functions
• Work with Relational Data
• Control Jobs

KM204 IBM InfoSphere DataStage Essentials (v11.5)
Delivered Online, Flexible Dates
Price on Enquiry

KM423 IBM InfoSphere DataStage v11.5 - Advanced Data Processing

By Nexus Human

Duration: 2 Days, 12 CPD hours

This course is intended for experienced DataStage developers seeking training in more advanced DataStage job techniques and who seek techniques for working with complex types of data resources.

Overview
- Use Connector stages to read from and write to database tables
- Handle SQL errors in Connector stages
- Use Connector stages with multiple input links
- Use the File Connector stage to access Hadoop HDFS data
- Optimize jobs that write to database tables
- Use the Unstructured Data stage to extract data from Excel spreadsheets
- Use the Data Masking stage to mask sensitive data processed within a DataStage job
- Use the Hierarchical stage to parse, compose, and transform XML data
- Use the Schema Library Manager to import and manage XML schemas
- Use the Data Rules stage to validate fields of data within a DataStage job
- Create custom data rules for validating data
- Design a job that processes a star schema data warehouse with Type 1 and Type 2 slowly changing dimensions

This course is designed to introduce you to advanced parallel job data processing techniques in DataStage v11.5. In this course you will develop data techniques for processing different types of complex data resources including relational data, unstructured data (Excel spreadsheets), and XML data. In addition, you will learn advanced techniques for processing data, including techniques for masking data and techniques for validating data using data rules. Finally, you will learn techniques for updating data in a star schema data warehouse using the DataStage SCD (Slowly Changing Dimensions) stage. Even if you are not working with all of these specific types of data, you will benefit from this course by learning advanced DataStage job design techniques, techniques that go beyond those utilized in the DataStage Essentials course.
Accessing databases
Connector stage overview
- Use Connector stages to read from and write to relational tables
- Working with the Connector stage properties
Connector stage functionality
- Before / After SQL
- Sparse lookups
- Optimize insert/update performance
Error handling in Connector stages
- Reject links
- Reject conditions
Multiple input links
- Designing jobs using Connector stages with multiple input links
- Ordering records across multiple input links
File Connector stage
- Read and write data to Hadoop file systems
Demonstration 1: Handling database errors
Demonstration 2: Parallel jobs with multiple Connector input links
Demonstration 3: Using the File Connector stage to read and write HDFS files

Processing unstructured data
Using the Unstructured Data stage in DataStage jobs
- Extract data from an Excel spreadsheet
- Specify a data range for data extraction in an Unstructured Data stage
- Specify document properties for data extraction
Demonstration 1: Processing unstructured data

Data masking
Using the Data Masking stage in DataStage jobs
- Data masking techniques
- Data masking policies
- Applying policies for masquerading context-aware data types
- Applying policies for masquerading generic data types
- Repeatable replacement
- Using reference tables
- Creating custom reference tables
Demonstration 1: Data masking

Using data rules
Introduction to data rules
- Using the Data Rules Editor
- Selecting data rules
- Binding data rule variables
- Output link constraints
- Adding statistics and attributes to the output information
Use the Data Rules stage to validate foreign key references in source data
Create custom data rules
Demonstration 1: Using data rules

Processing XML data
Introduction to the Hierarchical stage
- Hierarchical stage Assembly editor
- Use the Schema Library Manager to import and manage XML schemas
Composing XML data
- Using the HJoin step to create parent-child relationships between input lists
- Using the Composer step
Writing hierarchical data to a relational table
Using the Regroup step
Consuming XML data
- Using the XML Parser step
- Propagating columns
Transforming XML data
- Using the Aggregate step
- Using the Sort step
- Using the Switch step
- Using the H-Pivot step
Demonstration 1: Importing XML schemas
Demonstration 2: Compose hierarchical data
Demonstration 3: Consume hierarchical data
Demonstration 4: Transform hierarchical data

Updating a star schema database
Surrogate keys
- Design a job that creates and updates a surrogate key source key file from a dimension table
Slowly Changing Dimensions (SCD) stage
- Star schema databases
- SCD stage Fast Path pages
- Specifying purpose codes
- Dimension update specification
- Design a job that processes a star schema database with Type 1 and Type 2 slowly changing dimensions
Demonstration 1: Build a parallel job that updates a star schema database with two dimensions

Additional course details: Nexus Humans KM423 IBM InfoSphere DataStage v11.5 - Advanced Data Processing training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you are just starting to build professional skills or are a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success. While we feel this is the best course for KM423 IBM InfoSphere DataStage v11.5 - Advanced Data Processing, and one of our Top 10, we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.

KM423 IBM InfoSphere DataStage v11.5 - Advanced Data Processing
Delivered Online, Flexible Dates
Price on Enquiry