
116 Data Engineering courses

Building Batch Data Analytics Solutions on AWS

By Nexus Human

Duration: 1 day (6 CPD hours)

This course is intended for:
• Data platform engineers
• Architects and operators who build and manage data analytics pipelines

Overview

In this course, you will learn to:
• Compare the features and benefits of data warehouses, data lakes, and modern data architectures
• Design and implement a batch data analytics solution
• Identify and apply appropriate techniques, including compression, to optimize data storage
• Select and deploy appropriate options to ingest, transform, and store data
• Choose the appropriate instance and node types, clusters, auto scaling, and network topology for a particular business use case
• Understand how data storage and processing affect the analysis and visualization mechanisms needed to gain actionable business insights
• Secure data at rest and in transit
• Monitor analytics workloads to identify and remediate problems
• Apply cost management best practices

In this course, you will learn to build batch data analytics solutions using Amazon EMR, an enterprise-grade Apache Spark and Apache Hadoop managed service. You will learn how Amazon EMR integrates with open-source projects such as Apache Hive, Hue, and HBase, and with AWS services such as AWS Glue and AWS Lake Formation. The course addresses data collection, ingestion, cataloging, storage, and processing components in the context of Spark and Hadoop. You will learn to use EMR Notebooks to support both analytics and machine learning workloads. You will also learn to apply security, performance, and cost management best practices to the operation of Amazon EMR.

Module A: Overview of Data Analytics and the Data Pipeline
• Data analytics use cases
• Using the data pipeline for analytics

Module 1: Introduction to Amazon EMR
• Using Amazon EMR in analytics solutions
• Amazon EMR cluster architecture
• Interactive Demo 1: Launching an Amazon EMR cluster
• Cost management strategies

Module 2: Data Analytics Pipeline Using Amazon EMR: Ingestion and Storage
• Storage optimization with Amazon EMR
• Data ingestion techniques

Module 3: High-Performance Batch Data Analytics Using Apache Spark on Amazon EMR
• Apache Spark on Amazon EMR use cases
• Why Apache Spark on Amazon EMR
• Spark concepts
• Interactive Demo 2: Connect to an EMR cluster and perform Scala commands using the Spark shell
• Transformation, processing, and analytics
• Using notebooks with Amazon EMR
• Practice Lab 1: Low-latency data analytics using Apache Spark on Amazon EMR

Module 4: Processing and Analyzing Batch Data with Amazon EMR and Apache Hive
• Using Amazon EMR with Hive to process batch data
• Transformation, processing, and analytics
• Practice Lab 2: Batch data processing using Amazon EMR with Hive
• Introduction to Apache HBase on Amazon EMR

Module 5: Serverless Data Processing
• Serverless data processing, transformation, and analytics
• Using AWS Glue with Amazon EMR workloads
• Practice Lab 3: Orchestrate data processing in Spark using AWS Step Functions

Module 6: Security and Monitoring of Amazon EMR Clusters
• Securing EMR clusters
• Interactive Demo 3: Client-side encryption with EMRFS
• Monitoring and troubleshooting Amazon EMR clusters
• Demo: Reviewing Apache Spark cluster history

Module 7: Designing Batch Data Analytics Solutions
• Batch data analytics use cases
• Activity: Designing a batch data analytics workflow

Module B: Developing Modern Data Architectures on AWS
• Modern data architectures

Delivered Online, Flexible Dates
Price on Enquiry

Building Data Analytics Solutions Using Amazon Redshift

By Nexus Human

Duration: 1 day (6 CPD hours)

This course is intended for data warehouse engineers, data platform engineers, and architects and operators who build and manage data analytics pipelines.

Prerequisites:
• Completed either AWS Technical Essentials or Architecting on AWS
• Completed Building Data Lakes on AWS

Overview

In this course, you will learn to:
• Compare the features and benefits of data warehouses, data lakes, and modern data architectures
• Design and implement a data warehouse analytics solution
• Identify and apply appropriate techniques, including compression, to optimize data storage
• Select and deploy appropriate options to ingest, transform, and store data
• Choose the appropriate instance and node types, clusters, auto scaling, and network topology for a particular business use case
• Understand how data storage and processing affect the analysis and visualization mechanisms needed to gain actionable business insights
• Secure data at rest and in transit
• Monitor analytics workloads to identify and remediate problems
• Apply cost management best practices

In this course, you will build a data analytics solution using Amazon Redshift, a cloud data warehouse service. The course focuses on the data collection, ingestion, cataloging, storage, and processing components of the analytics pipeline. You will learn to integrate Amazon Redshift with a data lake to support both analytics and machine learning workloads. You will also learn to apply security, performance, and cost management best practices to the operation of Amazon Redshift.

Module A: Overview of Data Analytics and the Data Pipeline
• Data analytics use cases
• Using the data pipeline for analytics

Module 1: Using Amazon Redshift in the Data Analytics Pipeline
• Why Amazon Redshift for data warehousing?
• Overview of Amazon Redshift

Module 2: Introduction to Amazon Redshift
• Amazon Redshift architecture
• Interactive Demo 1: Touring the Amazon Redshift console
• Amazon Redshift features
• Practice Lab 1: Load and query data in an Amazon Redshift cluster

Module 3: Ingestion and Storage
• Ingestion
• Interactive Demo 2: Connecting your Amazon Redshift cluster using a Jupyter notebook with Data API
• Data distribution and storage
• Interactive Demo 3: Analyzing semi-structured data using the SUPER data type
• Querying data in Amazon Redshift
• Practice Lab 2: Data analytics using Amazon Redshift Spectrum

Module 4: Processing and Optimizing Data
• Data transformation
• Advanced querying
• Practice Lab 3: Data transformation and querying in Amazon Redshift
• Resource management
• Interactive Demo 4: Applying mixed workload management on Amazon Redshift
• Automation and optimization
• Interactive Demo 5: Amazon Redshift cluster resizing from the dc2.large to ra3.xlplus cluster

Module 5: Security and Monitoring of Amazon Redshift Clusters
• Securing the Amazon Redshift cluster
• Monitoring and troubleshooting Amazon Redshift clusters

Module 6: Designing Data Warehouse Analytics Solutions
• Data warehouse use case review
• Activity: Designing a data warehouse analytics workflow

Module B: Developing Modern Data Architectures on AWS
• Modern data architectures

Delivered Online, Flexible Dates
Price on Enquiry

Building Data Lakes on AWS

By Nexus Human

Duration: 1 day (6 CPD hours)

This course is intended for:
• Data platform engineers
• Solutions architects
• IT professionals

Overview

In this course, you will learn to:
• Apply data lake methodologies in planning and designing a data lake
• Articulate the components and services required for building an AWS data lake
• Secure a data lake with appropriate permissions
• Ingest, store, and transform data in a data lake
• Query, analyze, and visualize data within a data lake

In this course, you will learn how to build an operational data lake that supports analysis of both structured and unstructured data. You will learn the components and functionality of the services involved in creating a data lake. You will use AWS Lake Formation to build a data lake, AWS Glue to build a data catalog, and Amazon Athena to analyze data. The course lectures and labs further your learning by exploring several common data lake architectures.

Module 1: Introduction to data lakes
• Describe the value of data lakes
• Compare data lakes and data warehouses
• Describe the components of a data lake
• Recognize common architectures built on data lakes

Module 2: Data ingestion, cataloging, and preparation
• Describe the relationship between data lake storage and data ingestion
• Describe AWS Glue crawlers and how they are used to create a data catalog
• Identify data formatting, partitioning, and compression for efficient storage and query
• Lab 1: Set up a simple data lake

Module 3: Data processing and analytics
• Recognize how data processing applies to a data lake
• Use AWS Glue to process data within a data lake
• Describe how to use Amazon Athena to analyze data in a data lake

Module 4: Building a data lake with AWS Lake Formation
• Describe the features and benefits of AWS Lake Formation
• Use AWS Lake Formation to create a data lake
• Understand the AWS Lake Formation security model
• Lab 2: Build a data lake using AWS Lake Formation

Module 5: Additional Lake Formation configurations
• Automate AWS Lake Formation using blueprints and workflows
• Apply security and access controls to AWS Lake Formation
• Match records with AWS Lake Formation FindMatches
• Visualize data with Amazon QuickSight
• Lab 3: Automate data lake creation using AWS Lake Formation blueprints
• Lab 4: Data visualization using Amazon QuickSight

Module 6: Architecture and course review
• Post course knowledge check
• Architecture review
• Course review

Additional course details:

The Nexus Human Building Data Lakes on AWS training program is a workshop that combines sessions, lessons, and masterclasses. This bootcamp-style experience includes interactive lectures, hands-on labs, and collaborative hackathons, all designed to reinforce fundamental concepts. Each session is guided by experienced coaches and offers practical skills for building your expertise. Whether you are new to professional skills or a seasoned professional, this course aims to equip you with the knowledge you need.

While we feel this is the best course for Building Data Lakes on AWS, and one of our Top 10, we encourage you to read the course outline to make sure it is the right content for you.

Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.

Delivered Online, Flexible Dates
Price on Enquiry

AWS Building Data Lakes on AWS

By Nexus Human

Duration: 1 day (6 CPD hours)

This course is intended for:
• Data platform engineers
• Solutions architects
• IT professionals

Overview

In this course, you will learn to:
• Apply data lake methodologies in planning and designing a data lake
• Articulate the components and services required for building an AWS data lake
• Secure a data lake with appropriate permissions
• Ingest, store, and transform data in a data lake
• Query, analyze, and visualize data within a data lake

In this course, you will learn how to build an operational data lake that supports analysis of both structured and unstructured data. You will learn the components and functionality of the services involved in creating a data lake. You will use AWS Lake Formation to build a data lake, AWS Glue to build a data catalog, and Amazon Athena to analyze data. The course lectures and labs further your learning by exploring several common data lake architectures.

Introduction to data lakes
• Describe the value of data lakes
• Compare data lakes and data warehouses
• Describe the components of a data lake
• Recognize common architectures built on data lakes

Data ingestion, cataloging, and preparation
• Describe the relationship between data lake storage and data ingestion
• Describe AWS Glue crawlers and how they are used to create a data catalog
• Identify data formatting, partitioning, and compression for efficient storage and query
• Lab 1: Set up a simple data lake

Data processing and analytics
• Recognize how data processing applies to a data lake
• Use AWS Glue to process data within a data lake
• Describe how to use Amazon Athena to analyze data in a data lake

Building a data lake with AWS Lake Formation
• Describe the features and benefits of AWS Lake Formation
• Use AWS Lake Formation to create a data lake
• Understand the AWS Lake Formation security model
• Lab 2: Build a data lake using AWS Lake Formation

Additional Lake Formation configurations
• Automate AWS Lake Formation using blueprints and workflows
• Apply security and access controls to AWS Lake Formation
• Match records with AWS Lake Formation FindMatches
• Visualize data with Amazon QuickSight
• Lab 3: Automate data lake creation using AWS Lake Formation blueprints
• Lab 4: Data visualization using Amazon QuickSight

Architecture and course review
• Post course knowledge check
• Architecture review
• Course review

Delivered Online, Flexible Dates
Price on Enquiry

CertNexus Data Ethics for Business Professionals (DEBIZ)

By Nexus Human

Duration: 1 day (6 CPD hours)

This course is intended for:

This course is designed for business leaders and decision makers, including C-level executives, project and product managers, HR leaders, Marketing and Sales leaders, and technical sales consultants, who have a vested interest in the representation of ethical values in technology solutions. Other individuals who want to know more about data ethics are also candidates for this course. This course is also designed to assist learners in preparing for the CertNexus DEBIZ™ (Exam DEB-110) credential.

The power of extracting value from data using Artificial Intelligence, Data Science and Machine Learning exposes the learning differences between humans and machines. Humans can apply ethical principles throughout the decision-making process to avoid discrimination, societal harm, and marginalization, and to maintain and even enhance acceptable norms. Machines make decisions autonomously. So how do we train them to apply ethical principles as they learn from the decisions they make? This course introduces business professionals and consumers of technology to the core concepts of ethical principles, how they can be applied to emerging data-driven technologies, and the impact on an organization that ignores the ethical use of technology.

Introduction to Data Ethics
• Defining Data Ethics
• The Case for Data Ethics
• Identifying Ethical Issues
• Improving Ethical Data Practices

Ethical Principles
• Ethical Frameworks
• Data Privacy
• Accountability
• Transparency and Explainability
• Human-Centered Values and Fairness
• Inclusive Growth, Sustainable Development, and Well-Being
• Applying Ethical Principles to Emerging Technology
• Improving Ethical Data Practices

Sources of Ethical Risk
• Mitigating Bias
• Mitigating Discrimination
• Safety and Security
• Mitigating Negative Outputs
• Data Surveillance
• Assessing Risk
• Ethical Risks in sharing data
• Applying professional critical judgement

Business Considerations
• Data Legislation
• Impact of Social and Behavioral Effects
• Trustworthiness
• Impact on Business Reputation
• Organizational Values and the Data Value Chain
• Building a Data Ethics Culture/Code of Ethics
• Balancing organizational goals with Ethical Practice

Additional course details:

The Nexus Human CertNexus Data Ethics for Business Professionals (DEBIZ) training program is a workshop that combines sessions, lessons, and masterclasses. This bootcamp-style experience includes interactive lectures, hands-on labs, and collaborative hackathons, all designed to reinforce fundamental concepts. Each session is guided by experienced coaches and offers practical skills for building your expertise. Whether you are new to professional skills or a seasoned professional, this course aims to equip you with the knowledge you need.

While we feel this is the best course for CertNexus Data Ethics for Business Professionals (DEBIZ), and one of our Top 10, we encourage you to read the course outline to make sure it is the right content for you.

Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.

Delivered Online, Flexible Dates
Price on Enquiry

CertNexus Certified Data Science Practitioner (CDSP)

By Nexus Human

Duration: 5 days (30 CPD hours)

This course is intended for:

This course is designed for business professionals who leverage data to address business issues. The typical student in this course will have several years of experience with computing technology, including some aptitude in computer programming. However, there is not necessarily a single organizational role that this course targets. A prospective student might be a programmer looking to expand their knowledge of how to guide business decisions by collecting, wrangling, analyzing, and manipulating data through code; or a data analyst with a background in applied math and statistics who wants to take their skills to the next level; or any number of other data-driven situations. Ultimately, the target student is someone who wants to learn how to more effectively extract insights from their work and leverage that insight in addressing business issues, thereby bringing greater value to the business.

Overview

In this course, you will learn to:
• Use data science principles to address business issues.
• Apply the extract, transform, and load (ETL) process to prepare datasets.
• Use multiple techniques to analyze data and extract valuable insights.
• Design a machine learning approach to address business issues.
• Train, tune, and evaluate classification models.
• Train, tune, and evaluate regression and forecasting models.
• Train, tune, and evaluate clustering models.
• Finalize a data science project by presenting models to an audience, putting models into production, and monitoring model performance.

For a business to thrive in our data-driven world, it must treat data as one of its most important assets. Data is crucial for understanding where the business is and where it's headed. Not only can data reveal insights, it can also inform by guiding decisions and influencing day-to-day operations. This calls for a robust workforce of professionals who can analyze, understand, manipulate, and present data within an effective and repeatable process framework. In other words, the business world needs data science practitioners. This course will enable you to bring value to the business by putting data science concepts into practice.

Addressing Business Issues with Data Science
• Topic A: Initiate a Data Science Project
• Topic B: Formulate a Data Science Problem

Extracting, Transforming, and Loading Data
• Topic A: Extract Data
• Topic B: Transform Data
• Topic C: Load Data

Analyzing Data
• Topic A: Examine Data
• Topic B: Explore the Underlying Distribution of Data
• Topic C: Use Visualizations to Analyze Data
• Topic D: Preprocess Data

Designing a Machine Learning Approach
• Topic A: Identify Machine Learning Concepts
• Topic B: Test a Hypothesis

Developing Classification Models
• Topic A: Train and Tune Classification Models
• Topic B: Evaluate Classification Models

Developing Regression Models
• Topic A: Train and Tune Regression Models
• Topic B: Evaluate Regression Models

Developing Clustering Models
• Topic A: Train and Tune Clustering Models
• Topic B: Evaluate Clustering Models

Finalizing a Data Science Project
• Topic A: Communicate Results to Stakeholders
• Topic B: Demonstrate Models in a Web App
• Topic C: Implement and Test Production Pipelines

Delivered Online, Flexible Dates
Price on Enquiry
