3273 Engineering courses

R Programming for Data Science (v1.0)

By Nexus Human

Duration: 5 Days
30 CPD hours
This course is intended for:
This course is designed for students who want to learn the R programming language, particularly students who want to leverage R for data analysis and data science tasks in their organization. The course is also designed for students with an interest in applying statistics to real-world problems. A typical student in this course should have several years of experience with computing technology, along with a proficiency in at least one other programming language.
Overview:
In this course, you will use R to perform common data science tasks. You will:
• Set up an R development environment and execute simple code.
• Perform operations on atomic data types in R, including characters, numbers, and logicals.
• Perform operations on data structures in R, including vectors, lists, and data frames.
• Write conditional statements and loops.
• Structure code for reuse with functions and packages.
• Manage data by loading and saving datasets, manipulating data frames, and more.
• Analyze data through exploratory analysis, statistical analysis, and more.
• Create and format data visualizations using base R and ggplot2.
• Create simple statistical models from data.
In our data-driven world, organizations need the right tools to extract valuable insights from their data. The R programming language is one of the tools at the forefront of data science. Its robust set of packages and statistical functions makes it a powerful choice for analyzing data, manipulating data, performing statistical tests on data, and creating predictive models from data. Likewise, R is notable for its strong data visualization tools, enabling you to create high-quality, highly customizable graphs and plots. This course will teach you the fundamentals of programming in R to get you started. It will also teach you how to use R to perform common data science tasks and achieve data-driven results for the business.
Lesson 1: Setting Up R and Executing Simple Code
  Topic A: Set Up the R Development Environment
  Topic B: Write R Statements
Lesson 2: Processing Atomic Data Types
  Topic A: Process Characters
  Topic B: Process Numbers
  Topic C: Process Logicals
Lesson 3: Processing Data Structures
  Topic A: Process Vectors
  Topic B: Process Factors
  Topic C: Process Data Frames
  Topic D: Subset Data Structures
Lesson 4: Writing Conditional Statements and Loops
  Topic A: Write Conditional Statements
  Topic B: Write Loops
Lesson 5: Structuring Code for Reuse
  Topic A: Define and Call Functions
  Topic B: Apply Loop Functions
  Topic C: Manage R Packages
Lesson 6: Managing Data in R
  Topic A: Load Data
  Topic B: Save Data
  Topic C: Manipulate Data Frames Using Base R
  Topic D: Manipulate Data Frames Using dplyr
  Topic E: Handle Dates and Times
Lesson 7: Analyzing Data in R
  Topic A: Examine Data
  Topic B: Explore the Underlying Distribution of Data
  Topic C: Identify Missing Values
Lesson 8: Visualizing Data in R
  Topic A: Plot Data Using Base R Functions
  Topic B: Plot Data Using ggplot2
  Topic C: Format Plots in ggplot2
  Topic D: Create Combination Plots
Lesson 9: Modeling Data in R
  Topic A: Create Statistical Models in R
  Topic B: Create Machine Learning Models in R

R Programming for Data Science (v1.0)
Delivered Online
Flexible Dates
Price on Enquiry

Hands-on Predictive Analytics with Python (TTPS4879)

By Nexus Human

Duration: 3 Days
18 CPD hours
This course is intended for:
This course is geared for Python-experienced attendees who wish to learn and use basic machine learning algorithms and concepts. Students should have skills at least equivalent to the Python for Data Science courses we offer.
Overview:
Working in a hands-on learning environment, guided by our expert team, attendees will learn to:
• Understand the main concepts and principles of predictive analytics
• Use the Python data analytics ecosystem to implement end-to-end predictive analytics projects
• Explore advanced predictive modeling algorithms, with an emphasis on theory with intuitive explanations
• Learn to deploy a predictive model's results as an interactive application
• Learn about the stages involved in producing complete predictive analytics solutions
• Understand how to define a problem, propose a solution, and prepare a dataset
• Use visualizations to explore relationships and gain insights into the dataset
• Learn to build regression and classification models using scikit-learn
• Use Keras to build powerful neural network models that produce accurate predictions
• Learn to serve a model's predictions as a web application
Predictive analytics is an applied field that employs a variety of quantitative methods using data to make predictions. It involves much more than just throwing data onto a computer to build a model. This course provides practical coverage to help you understand the most important concepts of predictive analytics. Using practical, step-by-step examples, we build predictive analytics solutions while using cutting-edge Python tools and packages. Hands-on Predictive Analytics with Python is a three-day, hands-on course that guides students through a step-by-step approach to defining problems and identifying relevant data. Students will learn how to perform data preparation, explore and visualize relationships, and build, tune, evaluate, and deploy models. Each stage has relevant practical examples and efficient Python code. You will work with models such as KNN, Random Forests, and neural networks using the most important libraries in Python's data science stack: NumPy, Pandas, Matplotlib, Seaborn, Keras, Dash, and so on. In addition to hands-on code examples, you will find intuitive explanations of the inner workings of the main techniques and algorithms used in predictive analytics.
The Predictive Analytics Process
  Technical requirements; What is predictive analytics?; Reviewing important concepts of predictive analytics; The predictive analytics process; A quick tour of Python's data science stack
Problem Understanding and Data Preparation
  Technical requirements; Understanding the business problem and proposing a solution; Practical project - diamond prices; Practical project - credit card default
Dataset Understanding - Exploratory Data Analysis
  Technical requirements; What is EDA?; Univariate EDA; Bivariate EDA; Introduction to graphical multivariate EDA
Predicting Numerical Values with Machine Learning
  Technical requirements; Introduction to ML; Practical considerations before modeling; MLR; Lasso regression; KNN; Training versus testing error
Predicting Categories with Machine Learning
  Technical requirements; Classification tasks; Credit card default dataset; Logistic regression; Classification trees; Random forests; Training versus testing error; Multiclass classification; Naive Bayes classifiers
Introducing Neural Nets for Predictive Analytics
  Technical requirements; Introducing neural network models; Introducing TensorFlow and Keras; Regressing with neural networks; Classification with neural networks; The dark art of training neural networks
Model Evaluation
  Technical requirements; Evaluation of regression models; Evaluation for classification models; The k-fold cross-validation
Model Tuning and Improving Performance
  Technical requirements; Hyperparameter tuning; Improving performance
Implementing a Model with Dash
  Technical requirements; Model communication and/or deployment phase; Introducing Dash; Implementing a predictive model as a web application
Additional course details: Nexus Human's Hands-on Predictive Analytics with Python (TTPS4879) training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you're stepping into the realm of professional skills or are a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success. While we feel this is the best course for the Hands-on Predictive Analytics with Python (TTPS4879) course and one of our Top 10, we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.

Hands-on Predictive Analytics with Python (TTPS4879)
Delivered Online
Flexible Dates
Price on Enquiry

CITB Temp Wks Sup'visor (1 Day)(On-Site)

4.9(182)

By You Can Do It .Training

This course is designed to provide training for those undertaking the role of temporary works...

CITB Temp Wks Sup'visor (1 Day)(On-Site)
Delivered In-Person in Stoke on Trent or UK Wide
Flexible Dates
Price on Enquiry

CITB Temporary Works Coordinator On-Site

4.9(182)

By You Can Do It .Training

This course is designed to assist those on site who have responsibility for managing all forms of...

CITB Temporary Works Coordinator On-Site
Delivered In-Person in Stoke on Trent or UK Wide
Flexible Dates
Price on Enquiry

Google Cloud Platform Big Data and Machine Learning Fundamentals

By Nexus Human

Duration: 1 Day
6 CPD hours
This course is intended for:
This class is intended for the following:
• Data analysts, data scientists, and business analysts getting started with Google Cloud Platform.
• Individuals responsible for designing pipelines and architectures for data processing, creating and maintaining machine learning and statistical models, querying datasets, visualizing query results and creating reports.
• Executives and IT decision makers evaluating Google Cloud Platform for use by data scientists.
Overview:
This course teaches students the following skills:
• Identify the purpose and value of the key Big Data and Machine Learning products in the Google Cloud Platform.
• Use Cloud SQL and Cloud Dataproc to migrate existing MySQL and Hadoop/Pig/Spark/Hive workloads to Google Cloud Platform.
• Employ BigQuery and Cloud Datalab to carry out interactive data analysis.
• Train and use a neural network using TensorFlow.
• Employ ML APIs.
• Choose between different data processing products on the Google Cloud Platform.
This course introduces participants to the Big Data and Machine Learning capabilities of Google Cloud Platform (GCP). It provides a quick overview of the Google Cloud Platform and a deeper dive into its data processing capabilities.
Introducing Google Cloud Platform
  Google Platform Fundamentals Overview. Google Cloud Platform Big Data Products.
Compute and Storage Fundamentals
  CPUs on demand (Compute Engine). A global filesystem (Cloud Storage). CloudShell. Lab: Set up an Ingest-Transform-Publish data processing pipeline.
Data Analytics on the Cloud
  Stepping-stones to the cloud. Cloud SQL: your SQL database on the cloud. Lab: Importing data into CloudSQL and running queries. Spark on Dataproc. Lab: Machine Learning Recommendations with Spark on Dataproc.
Scaling Data Analysis
  Fast random access. Datalab. BigQuery. Lab: Build machine learning dataset.
Machine Learning
  Machine Learning with TensorFlow. Lab: Carry out ML with TensorFlow. Pre-built models for common needs. Lab: Employ ML APIs.
Data Processing Architectures
  Message-oriented architectures with Pub/Sub. Creating pipelines with Dataflow. Reference architecture for real-time and batch data processing.
Summary
  Why GCP? Where to go from here. Additional Resources.
Additional course details: Nexus Human's Google Cloud Platform Big Data and Machine Learning Fundamentals training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you're stepping into the realm of professional skills or are a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success. While we feel this is the best course for the Google Cloud Platform Big Data and Machine Learning Fundamentals course and one of our Top 10, we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.

Google Cloud Platform Big Data and Machine Learning Fundamentals
Delivered Online
Flexible Dates
Price on Enquiry

Data Warehousing on AWS

By Nexus Human

Duration: 3 Days
18 CPD hours
This course is intended for:
• Database architects
• Database administrators
• Database developers
• Data analysts and scientists
Overview:
This course is designed to teach you how to:
• Discuss the core concepts of data warehousing, and the intersection between data warehousing and big data solutions
• Launch an Amazon Redshift cluster and use the components, features, and functionality to implement a data warehouse in the cloud
• Use other AWS data and analytic services, such as Amazon DynamoDB, Amazon EMR, Amazon Kinesis, and Amazon S3, to contribute to the data warehousing solution
• Architect the data warehouse
• Identify performance issues, optimize queries, and tune the database for better performance
• Use Amazon Redshift Spectrum to analyze data directly from an Amazon S3 bucket
• Use Amazon QuickSight to perform data analysis and visualization tasks against the data warehouse
Data Warehousing on AWS introduces you to concepts, strategies, and best practices for designing a cloud-based data warehousing solution using Amazon Redshift, the petabyte-scale data warehouse in AWS. This course demonstrates how to collect, store, and prepare data for the data warehouse by using other AWS services such as Amazon DynamoDB, Amazon EMR, Amazon Kinesis, and Amazon S3. Additionally, this course demonstrates how to use Amazon QuickSight to perform analysis on your data.
Module 1: Introduction to Data Warehousing
  Relational databases; Data warehousing concepts; The intersection of data warehousing and big data; Overview of data management in AWS; Hands-on lab 1: Introduction to Amazon Redshift
Module 2: Introduction to Amazon Redshift
  Conceptual overview; Real-world use cases; Hands-on lab 2: Launching an Amazon Redshift cluster
Module 3: Launching clusters
  Building the cluster; Connecting to the cluster; Controlling access; Database security; Load data; Hands-on lab 3: Optimizing database schemas
Module 4: Designing the database schema
  Schemas and data types; Columnar compression; Data distribution styles; Data sorting methods
Module 5: Identifying data sources
  Data sources overview; Amazon S3; Amazon DynamoDB; Amazon EMR; Amazon Kinesis Data Firehose; AWS Lambda Database Loader for Amazon Redshift; Hands-on lab 4: Loading real-time data into an Amazon Redshift database
Module 6: Loading data
  Preparing Data; Loading data using COPY; Concurrent write operations; Troubleshooting load issues; Hands-on lab 5: Loading data with the COPY command
Module 7: Writing queries and tuning for performance
  Amazon Redshift SQL; User-Defined Functions (UDFs); Factors that affect query performance; The EXPLAIN command and query plans; Workload Management (WLM); Hands-on lab 6: Configuring workload management
Module 8: Amazon Redshift Spectrum
  Amazon Redshift Spectrum; Configuring data for Amazon Redshift Spectrum; Amazon Redshift Spectrum Queries; Hands-on lab 7: Using Amazon Redshift Spectrum
Module 9: Maintaining clusters
  Audit logging; Performance monitoring; Events and notifications; Lab 8: Auditing and monitoring clusters; Resizing clusters; Backing up and restoring clusters; Resource tagging and limits and constraints; Hands-on lab 9: Backing up, restoring and resizing clusters
Module 10: Analyzing and visualizing data
  Power of visualizations; Building dashboards; Amazon QuickSight editions and feature

Data Warehousing on AWS
Delivered Online
Flexible Dates
Price on Enquiry

From Data to Insights with Google Cloud Platform

By Nexus Human

Duration 3 Days 18 CPD hours This course is intended for Data Analysts, Business Analysts, Business Intelligence professionals Cloud Data Engineers who will be partnering with Data Analysts to build scalable data solutions on Google Cloud Platform Overview This course teaches students the following skills: Derive insights from data using the analysis and visualization tools on Google Cloud Platform Interactively query datasets using Google BigQuery Load, clean, and transform data at scale Visualize data using Google Data Studio and other third-party platforms Distinguish between exploratory and explanatory analytics and when to use each approach Explore new datasets and uncover hidden insights quickly and effectively Optimizing data models and queries for price and performance Want to know how to query and process petabytes of data in seconds? Curious about data analysis that scales automatically as your data grows? Welcome to the Data Insights course! This four-course accelerated online specialization teaches course participants how to derive insights through data analysis and visualization using the Google Cloud Platform. The courses feature interactive scenarios and hands-on labs where participants explore, mine, load, visualize, and extract insights from diverse Google BigQuery datasets. The courses also cover data loading, querying, schema modeling, optimizing performance, query pricing, and data visualization. This specialization is intended for the following participants: Data Analysts, Business Analysts, Business Intelligence professionals Cloud Data Engineers who will be partnering with Data Analysts to build scalable data solutions on Google Cloud Platform To get the most out of this specialization, we recommend participants have some proficiency with ANSI SQL. 
Introduction to Data on the Google Cloud Platform
  Highlight Analytics Challenges Faced by Data Analysts; Compare Big Data On-Premises vs on the Cloud; Learn from Real-World Use Cases of Companies Transformed through Analytics on the Cloud; Navigate Google Cloud Platform Project Basics; Lab: Getting started with Google Cloud Platform
Big Data Tools Overview
  Walkthrough Data Analyst Tasks, Challenges, and Introduce Google Cloud Platform Data Tools; Demo: Analyze 10 Billion Records with Google BigQuery; Explore 9 Fundamental Google BigQuery Features; Compare GCP Tools for Analysts, Data Scientists, and Data Engineers; Lab: Exploring Datasets with Google BigQuery
Exploring your Data with SQL
  Compare Common Data Exploration Techniques; Learn How to Code High Quality Standard SQL; Explore Google BigQuery Public Datasets; Visualization Preview: Google Data Studio; Lab: Troubleshoot Common SQL Errors
Google BigQuery Pricing
  Walkthrough of a BigQuery Job; Calculate BigQuery Pricing: Storage, Querying, and Streaming Costs; Optimize Queries for Cost; Lab: Calculate Google BigQuery Pricing
Cleaning and Transforming your Data
  Examine the 5 Principles of Dataset Integrity; Characterize Dataset Shape and Skew; Clean and Transform Data using SQL; Clean and Transform Data using a new UI: Introducing Cloud Dataprep; Lab: Explore and Shape Data with Cloud Dataprep
Storing and Exporting Data
  Compare Permanent vs Temporary Tables; Save and Export Query Results; Performance Preview: Query Cache; Lab: Creating new Permanent Tables
Ingesting New Datasets into Google BigQuery
  Query from External Data Sources; Avoid Data Ingesting Pitfalls; Ingest New Data into Permanent Tables; Discuss Streaming Inserts; Lab: Ingesting and Querying New Datasets
Data Visualization
  Overview of Data Visualization Principles; Exploratory vs Explanatory Analysis Approaches; Demo: Google Data Studio UI; Connect Google Data Studio to Google BigQuery; Lab: Exploring a Dataset in Google Data Studio
Joining and Merging Datasets
  Merge Historical Data Tables with UNION; Introduce Table Wildcards for Easy Merges; Review Data Schemas: Linking Data Across Multiple Tables; Walkthrough JOIN Examples and Pitfalls; Lab: Join and Union Data from Multiple Tables
Advanced Functions and Clauses
  Review SQL Case Statements; Introduce Analytical Window Functions; Safeguard Data with One-Way Field Encryption; Discuss Effective Sub-query and CTE design; Compare SQL and JavaScript UDFs; Lab: Deriving Insights with Advanced SQL Functions
Schema Design and Nested Data Structures
  Compare Google BigQuery vs Traditional RDBMS Data Architecture; Normalization vs Denormalization: Performance Tradeoffs; Schema Review: The Good, The Bad, and The Ugly; Arrays and Nested Data in Google BigQuery; Lab: Querying Nested and Repeated Data
More Visualization with Google Data Studio
  Create Case Statements and Calculated Fields; Avoid Performance Pitfalls with Cache considerations; Share Dashboards and Discuss Data Access considerations
Optimizing for Performance
  Avoid Google BigQuery Performance Pitfalls; Prevent Hotspots in your Data; Diagnose Performance Issues with the Query Explanation map; Lab: Optimizing and Troubleshooting Query Performance
Advanced Insights
  Introducing Cloud Datalab; Cloud Datalab Notebooks and Cells; Benefits of Cloud Datalab
Data Access
  Compare IAM and BigQuery Dataset Roles; Avoid Access Pitfalls; Review Members, Roles, Organizations, Account Administration, and Service Accounts

From Data to Insights with Google Cloud Platform
Delivered Online
Flexible Dates
Price on Enquiry

Building Batch Data Analytics Solutions on AWS

By Nexus Human

Duration 1 Days 6 CPD hours This course is intended for This course is intended for: Data platform engineers Architects and operators who build and manage data analytics pipelines Overview In this course, you will learn to: Compare the features and benefits of data warehouses, data lakes, and modern data architectures Design and implement a batch data analytics solution Identify and apply appropriate techniques, including compression, to optimize data storage Select and deploy appropriate options to ingest, transform, and store data Choose the appropriate instance and node types, clusters, auto scaling, and network topology for a particular business use case Understand how data storage and processing affect the analysis and visualization mechanisms needed to gain actionable business insights Secure data at rest and in transit Monitor analytics workloads to identify and remediate problems Apply cost management best practices In this course, you will learn to build batch data analytics solutions using Amazon EMR, an enterprise-grade Apache Spark and Apache Hadoop managed service. You will learn how Amazon EMR integrates with open-source projects such as Apache Hive, Hue, and HBase, and with AWS services such as AWS Glue and AWS Lake Formation. The course addresses data collection, ingestion, cataloging, storage, and processing components in the context of Spark and Hadoop. You will learn to use EMR Notebooks to support both analytics and machine learning workloads. You will also learn to apply security, performance, and cost management best practices to the operation of Amazon EMR. 
Module A: Overview of Data Analytics and the Data Pipeline Data analytics use cases Using the data pipeline for analytics Module 1: Introduction to Amazon EMR Using Amazon EMR in analytics solutions Amazon EMR cluster architecture Interactive Demo 1: Launching an Amazon EMR cluster Cost management strategies Module 2: Data Analytics Pipeline Using Amazon EMR: Ingestion and Storage Storage optimization with Amazon EMR Data ingestion techniques Module 3: High-Performance Batch Data Analytics Using Apache Spark on Amazon EMR Apache Spark on Amazon EMR use cases Why Apache Spark on Amazon EMR Spark concepts Interactive Demo 2: Connect to an EMR cluster and perform Scala commands using the Spark shell Transformation, processing, and analytics Using notebooks with Amazon EMR Practice Lab 1: Low-latency data analytics using Apache Spark on Amazon EMR Module 4: Processing and Analyzing Batch Data with Amazon EMR and Apache Hive Using Amazon EMR with Hive to process batch data Transformation, processing, and analytics Practice Lab 2: Batch data processing using Amazon EMR with Hive Introduction to Apache HBase on Amazon EMR Module 5: Serverless Data Processing Serverless data processing, transformation, and analytics Using AWS Glue with Amazon EMR workloads Practice Lab 3: Orchestrate data processing in Spark using AWS Step Functions Module 6: Security and Monitoring of Amazon EMR Clusters Securing EMR clusters Interactive Demo 3: Client-side encryption with EMRFS Monitoring and troubleshooting Amazon EMR clusters Demo: Reviewing Apache Spark cluster history Module 7: Designing Batch Data Analytics Solutions Batch data analytics use cases Activity: Designing a batch data analytics workflow Module B: Developing Modern Data Architectures on AWS Modern data architectures

Building Batch Data Analytics Solutions on AWS
Delivered Online
Flexible Dates
Price on Enquiry

Building Data Analytics Solutions Using Amazon Redshift

By Nexus Human

Duration 1 Days 6 CPD hours This course is intended for This course is intended for data warehouse engineers, data platform engineers, and architects and operators who build and manage data analytics pipelines. Completed either AWS Technical Essentials or Architecting on AWS Completed Building Data Lakes on AWS Overview In this course, you will learn to: Compare the features and benefits of data warehouses, data lakes, and modern data architectures Design and implement a data warehouse analytics solution Identify and apply appropriate techniques, including compression, to optimize data storage Select and deploy appropriate options to ingest, transform, and store data Choose the appropriate instance and node types, clusters, auto scaling, and network topology for a particular business use case Understand how data storage and processing affect the analysis and visualization mechanisms needed to gain actionable business insights Secure data at rest and in transit Monitor analytics workloads to identify and remediate problems Apply cost management best practices In this course, you will build a data analytics solution using Amazon Redshift, a cloud data warehouse service. The course focuses on the data collection, ingestion, cataloging, storage, and processing components of the analytics pipeline. You will learn to integrate Amazon Redshift with a data lake to support both analytics and machine learning workloads. You will also learn to apply security, performance, and cost management best practices to the operation of Amazon Redshift. Module A: Overview of Data Analytics and the Data Pipeline Data analytics use cases Using the data pipeline for analytics Module 1: Using Amazon Redshift in the Data Analytics Pipeline Why Amazon Redshift for data warehousing? 
Overview of Amazon Redshift Module 2: Introduction to Amazon Redshift Amazon Redshift architecture Interactive Demo 1: Touring the Amazon Redshift console Amazon Redshift features Practice Lab 1: Load and query data in an Amazon Redshift cluster Module 3: Ingestion and Storage Ingestion Interactive Demo 2: Connecting your Amazon Redshift cluster using a Jupyter notebook with Data API Data distribution and storage Interactive Demo 3: Analyzing semi-structured data using the SUPER data type Querying data in Amazon Redshift Practice Lab 2: Data analytics using Amazon Redshift Spectrum Module 4: Processing and Optimizing Data Data transformation Advanced querying Practice Lab 3: Data transformation and querying in Amazon Redshift Resource management Interactive Demo 4: Applying mixed workload management on Amazon Redshift Automation and optimization Interactive demo 5: Amazon Redshift cluster resizing from the dc2.large to ra3.xlplus cluster Module 5: Security and Monitoring of Amazon Redshift Clusters Securing the Amazon Redshift cluster Monitoring and troubleshooting Amazon Redshift clusters Module 6: Designing Data Warehouse Analytics Solutions Data warehouse use case review Activity: Designing a data warehouse analytics workflow Module B: Developing Modern Data Architectures on AWS Modern data architectures

Building Data Analytics Solutions Using Amazon Redshift
Delivered Online
Flexible Dates
Price on Enquiry

Building Data Lakes on AWS

By Nexus Human

Duration: 1 Day
6 CPD hours
This course is intended for:
• Data platform engineers
• Solutions architects
• IT professionals
Overview:
In this course, you will learn to:
• Apply data lake methodologies in planning and designing a data lake
• Articulate the components and services required for building an AWS data lake
• Secure a data lake with appropriate permissions
• Ingest, store, and transform data in a data lake
• Query, analyze, and visualize data within a data lake
In this course, you will learn how to build an operational data lake that supports analysis of both structured and unstructured data. You will learn the components and functionality of the services involved in creating a data lake. You will use AWS Lake Formation to build a data lake, AWS Glue to build a data catalog, and Amazon Athena to analyze data. The course lectures and labs further your learning with the exploration of several common data lake architectures.
Module 1: Introduction to data lakes
  Describe the value of data lakes; Compare data lakes and data warehouses; Describe the components of a data lake; Recognize common architectures built on data lakes
Module 2: Data ingestion, cataloging, and preparation
  Describe the relationship between data lake storage and data ingestion; Describe AWS Glue crawlers and how they are used to create a data catalog; Identify data formatting, partitioning, and compression for efficient storage and query; Lab 1: Set up a simple data lake
Module 3: Data processing and analytics
  Recognize how data processing applies to a data lake; Use AWS Glue to process data within a data lake; Describe how to use Amazon Athena to analyze data in a data lake
Module 4: Building a data lake with AWS Lake Formation
  Describe the features and benefits of AWS Lake Formation; Use AWS Lake Formation to create a data lake; Understand the AWS Lake Formation security model; Lab 2: Build a data lake using AWS Lake Formation
Module 5: Additional Lake Formation configurations
  Automate AWS Lake Formation using blueprints and workflows; Apply security and access controls to AWS Lake Formation; Match records with AWS Lake Formation FindMatches; Visualize data with Amazon QuickSight; Lab 3: Automate data lake creation using AWS Lake Formation blueprints; Lab 4: Data visualization using Amazon QuickSight
Module 6: Architecture and course review
  Post course knowledge check; Architecture review; Course review
Additional course details: Nexus Human's Building Data Lakes on AWS training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you're stepping into the realm of professional skills or are a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success. While we feel this is the best course for the Building Data Lakes on AWS course and one of our Top 10, we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.

Building Data Lakes on AWS
Delivered Online
Flexible Dates
Price on Enquiry