
19 Hadoop courses in Nottingham delivered Live Online

H006 IBM Spectrum Scale Advanced Administration for Linux

By Nexus Human

Duration: 3 Days, 18 CPD hours

This advanced course is for IT professionals tasked with administering a Spectrum Scale system. It covers installing, configuring and monitoring a Spectrum Scale cluster.

Topics include:

Migrating to IBM Spectrum Scale 4.2
Spectrum Scale 4.2 GUI
Multi-cluster
Clustered NFS
Cluster Export Services for multi-protocol support
SMB protocol support
NFS support in CES; Ganesha overview and performance
Active File Management
AFM-based Disaster Recovery (DR) and asynchronous DR
Planning an LTFS and GPFS environment for data archiving
File Placement Optimizer
IBM GPFS-FPO and integration with the GPFS Hadoop connector (see the sketch below)
IBM Spectrum Scale Call Home
Monitoring and performance tuning
Flash Cache metadata migration
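The GPFS Hadoop connector topic above is where Spectrum Scale meets the Hadoop ecosystem: analytics engines address Spectrum Scale in place of HDFS. As a hedged illustration only (this is not IBM course material), the Scala/Spark sketch below reads from a hypothetical GPFS mount point by its POSIX path; a real deployment would typically go through the connector's Hadoop-compatible interface instead:

import org.apache.spark.sql.SparkSession

// Hedged sketch: Spark reading data that lives on a Spectrum Scale (GPFS)
// filesystem. The mount point /gpfs/fs1 and the log file are hypothetical.
object GpfsReadSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("gpfs-read-sketch")
      .master("local[*]")
      .getOrCreate()

    // Read via the POSIX path of the hypothetical GPFS mount.
    val lines = spark.read.textFile("file:///gpfs/fs1/logs/sample.log")
    println(s"lines = ${lines.count()}")
    spark.stop()
  }
}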

H006 IBM Spectrum Scale Advanced Administration for Linux
Delivered Online, Flexible Dates
Price on Enquiry

Oracle Data Integrator 19c Configuration and Administration (TTOR30319)

By Nexus Human

Duration: 3 Days, 18 CPD hours

This intermediate-level, hands-on course is geared for experienced administrators, analysts, architects, data scientists, database administrators and implementers.

This course is approximately 50% hands-on, combining expert lecture, real-world demonstrations and group discussions with machine-based practical labs and exercises. Working in a hands-on learning environment led by our Oracle Certified expert facilitator, students will learn how to:

Administer ODI resources and set up security with ODI
Apply ODI Topology concepts for data integration
Describe ODI Model concepts
Describe the architecture of Oracle Data Integrator
Design ODI Mappings, Procedures, Packages, and Load Plans to perform ELT data transformations
Explore and audit data, and enforce data quality with ODI
Implement Changed Data Capture with ODI

Oracle Data Integrator is a comprehensive data integration platform that covers all data integration requirements, from high-volume, high-performance batch loads to event-driven integration processes and SOA-enabled data services. Oracle Data Integrator's Extract, Load, Transform (E-LT) architecture leverages disparate RDBMS engines to process and transform the data, an approach that optimizes performance and scalability and lowers overall solution costs. Throughout this course participants will explore how to centralize data across databases, perform integration, design ODI Mappings, and set up ODI security. In addition, Oracle Data Integrator can interact with the various tools of the Hadoop ecosystem, allowing administrators and data scientists to farm out map-reduce operations from established relational databases to Hadoop. They can also read the results of complex Big Data analysis back into the relational world for further processing.
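ODI generates and orchestrates this E-LT pushdown itself through its graphical Mappings and Knowledge Modules rather than hand-written code. As a minimal, engine-neutral sketch of the E-LT pattern itself (land the data first, then transform it inside the engine with set-based SQL), assuming a local Spark session and a hypothetical orders_raw.csv file:

import org.apache.spark.sql.SparkSession

object EltSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("elt-pattern-sketch")
      .master("local[*]")
      .getOrCreate()

    // Extract + Load: land the raw data in the engine unchanged.
    // The path is hypothetical; any CSV with id, amount, status columns works.
    spark.read.option("header", "true").csv("/tmp/orders_raw.csv")
      .createOrReplaceTempView("orders_raw")

    // Transform: push the work into the engine as set-based SQL,
    // rather than row-by-row in a separate transformation server.
    val cleaned = spark.sql(
      """SELECT id, CAST(amount AS DOUBLE) AS amount
        |FROM orders_raw
        |WHERE status = 'COMPLETE'""".stripMargin)

    cleaned.write.mode("overwrite").parquet("/tmp/orders_clean")
    spark.stop()
  }
}

The full course outline follows.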
Introduction to Integration and Administration - Oracle Data Integrator: Introduction; Oracle Data Integrator Repositories
Administering ODI Repositories - Create and connect to the Master Repository; Export and import the Master Repository; Create, connect, and set a password to the Work Repository
ODI Topology Concepts - ODI Topology: Overview; Data Servers and Physical Schemas; Defining Topology; Agents in Topology; Planning a Topology
Describing the Physical and Logical Architecture - Topology Navigator; Creating Physical Architecture; Creating Logical Architecture
Setting Up a New ODI Project - ODI Projects; Using Folders; Understanding Knowledge Modules; Exporting and Importing Objects; Using Markers
Oracle Data Integrator Model Concepts - Understanding the Relational Model; Understanding Reverse-Engineering; Creating Models
Organizing ODI Models and Creating ODI Datastores - Organizing Models; Creating Datastores; Constraints in ODI; Creating Keys and References; Creating Conditions; Exploring Your Data; Constructing Business Rules
ODI Mapping Concepts - ODI Mappings; Expressions, Join, Filter, Lookup, Sets, and Others; Behind the Rules; Staging Area and Execution Location; Understanding Knowledge Modules; Mappings: Overview
Designing Mappings - Multiple Sources and Joins; Filtering Data; Overview of the Flow in an ODI Mapping; Selecting a Staging Area; Configuring Expressions; Execution Location; Selecting a Knowledge Module
Mappings: Monitoring and Troubleshooting - Monitoring Mappings; Working with Errors
Designing Mappings: Advanced Topics 1 - Working with Business Rules; Using Variables; Datasets and Sets; Using Sequences
Designing Mappings: Advanced Topics 2 - Partitioning; Configuring Reusable Mappings; Using User Functions; Substitution Methods; Modifying Knowledge Modules
Using ODI Procedures - Procedures: Overview; Creating a Blank Procedure; Adding Commands; Adding Options; Running a Procedure
Using ODI Packages - Packages: Overview; Executing a Package; Review of Package Steps; Model, Submodel, and Datastore Steps; Variable Steps; Controlling the Execution Path
Step-by-Step Debugger - Starting a Debug Session; New Functions; Menu Bar Icons
Managing ODI Scenarios - Scenarios; Managing Scenarios; Preparing for Deployment
Using Load Plans - What are Load Plans?; The Load Plan editor; Load Plan step sequence; Defining restart behavior
Enforcing Data Quality with ODI - Data Quality; Business Rules for Data Quality; Enforcing Data Quality with ODI
Working with Changed Data Capture - CDC with ODI; CDC implementations with ODI; CDC implementation techniques; Journalizing; Results of CDC
Advanced ODI Administration - Setting Up ODI Security; Managing ODI Reports; ODI Integration with Java

Oracle Data Integrator 19c Configuration and Administration (TTOR30319)
Delivered Online, Flexible Dates
Price on Enquiry

Data Engineering on Google Cloud

By Nexus Human

Duration: 4 Days, 24 CPD hours

This class is intended for experienced developers who are responsible for managing big data transformations, including: extracting, loading, transforming, cleaning, and validating data; designing pipelines and architectures for data processing; creating and maintaining machine learning and statistical models; and querying datasets, visualizing query results and creating reports.

Overview
Design and build data processing systems on Google Cloud Platform.
Leverage unstructured data using Spark and ML APIs on Cloud Dataproc.
Process batch and streaming data by implementing autoscaling data pipelines on Cloud Dataflow.
Derive business insights from extremely large datasets using Google BigQuery.
Train, evaluate and predict with machine learning models using TensorFlow and Cloud ML.
Enable instant insights from streaming data.

Get hands-on experience with designing and building data processing systems on Google Cloud. This course uses lectures, demos, and hands-on labs to show you how to design data processing systems, build end-to-end data pipelines, analyze data, and implement machine learning. It covers structured, unstructured, and streaming data.

Introduction to Data Engineering - Explore the role of a data engineer; Analyze data engineering challenges; Intro to BigQuery; Data Lakes and Data Warehouses; Demo: Federated Queries with BigQuery; Transactional Databases vs Data Warehouses; Website Demo: Finding PII in your dataset with DLP API; Partner effectively with other data teams; Manage data access and governance; Build production-ready pipelines; Review a GCP customer case study; Lab: Analyzing Data with BigQuery
Building a Data Lake - Introduction to Data Lakes; Data Storage and ETL options on GCP; Building a Data Lake using Cloud Storage; Optional Demo: Optimizing cost with Google Cloud Storage classes and Cloud Functions; Securing Cloud Storage; Storing All Sorts of Data Types; Video Demo: Running federated queries on Parquet and ORC files in BigQuery; Cloud SQL as a relational Data Lake; Lab: Loading Taxi Data into Cloud SQL
Building a Data Warehouse - The modern data warehouse; Intro to BigQuery; Demo: Query TB+ of data in seconds; Getting Started; Loading Data; Video Demo: Querying Cloud SQL from BigQuery; Lab: Loading Data into BigQuery; Exploring Schemas; Demo: Exploring BigQuery Public Datasets with SQL using INFORMATION_SCHEMA; Schema Design; Nested and Repeated Fields; Demo: Nested and repeated fields in BigQuery; Lab: Working with JSON and Array data in BigQuery; Optimizing with Partitioning and Clustering; Demo: Partitioned and Clustered Tables in BigQuery; Preview: Transforming Batch and Streaming Data
Introduction to Building Batch Data Pipelines - EL, ELT, ETL; Quality considerations; How to carry out operations in BigQuery; Demo: ELT to improve data quality in BigQuery; Shortcomings; ETL to solve data quality issues
Executing Spark on Cloud Dataproc - The Hadoop ecosystem; Running Hadoop on Cloud Dataproc; GCS instead of HDFS (see the sketch after this outline); Optimizing Dataproc; Lab: Running Apache Spark jobs on Cloud Dataproc
Serverless Data Processing with Cloud Dataflow - Cloud Dataflow; Why customers value Dataflow; Dataflow Pipelines; Lab: A Simple Dataflow Pipeline (Python/Java); Lab: MapReduce in Dataflow (Python/Java); Lab: Side Inputs (Python/Java); Dataflow Templates; Dataflow SQL
Manage Data Pipelines with Cloud Data Fusion and Cloud Composer - Building Batch Data Pipelines visually with Cloud Data Fusion; Components; UI Overview; Building a Pipeline; Exploring Data using Wrangler; Lab: Building and executing a pipeline graph in Cloud Data Fusion; Orchestrating work between GCP services with Cloud Composer; Apache Airflow Environment; DAGs and Operators; Workflow Scheduling; Optional Long Demo: Event-triggered loading of data with Cloud Composer, Cloud Functions, Cloud Storage, and BigQuery; Monitoring and Logging; Lab: An Introduction to Cloud Composer
Introduction to Processing Streaming Data - Processing Streaming Data
Serverless Messaging with Cloud Pub/Sub - Cloud Pub/Sub; Lab: Publish Streaming Data into Pub/Sub
Cloud Dataflow Streaming Features - Cloud Dataflow Streaming Features; Lab: Streaming Data Pipelines
High-Throughput BigQuery and Bigtable Streaming Features - BigQuery Streaming Features; Lab: Streaming Analytics and Dashboards; Cloud Bigtable; Lab: Streaming Data Pipelines into Bigtable
Advanced BigQuery Functionality and Performance - Analytic Window Functions; Using WITH Clauses; GIS Functions; Demo: Mapping Fastest Growing Zip Codes with BigQuery GeoViz; Performance Considerations; Lab: Optimizing your BigQuery Queries for Performance; Optional Lab: Creating Date-Partitioned Tables in BigQuery
Introduction to Analytics and AI - What is AI?; From Ad-hoc Data Analysis to Data-Driven Decisions; Options for ML models on GCP
Prebuilt ML Model APIs for Unstructured Data - Unstructured Data is Hard; ML APIs for Enriching Data; Lab: Using the Natural Language API to Classify Unstructured Text
Big Data Analytics with Cloud AI Platform Notebooks - What's a Notebook; BigQuery Magic and Ties to Pandas; Lab: BigQuery in Jupyter Labs on AI Platform
Production ML Pipelines with Kubeflow - Ways to do ML on GCP; Kubeflow; AI Hub; Lab: Running AI models on Kubeflow
Custom Model Building with SQL in BigQuery ML - BigQuery ML for Quick Model Building; Demo: Train a model with BigQuery ML to predict NYC taxi fares; Supported Models; Lab Option 1: Predict Bike Trip Duration with a Regression Model in BQML; Lab Option 2: Movie Recommendations in BigQuery ML
Custom Model Building with Cloud AutoML - Why AutoML?; AutoML Vision; AutoML NLP; AutoML Tables
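One idea from the Dataproc module, "GCS instead of HDFS", lends itself to a short sketch: the same Spark code that reads hdfs:// paths reads gs:// paths once the GCS connector is on the classpath, as it is by default on Dataproc clusters. A minimal Scala word count, assuming a hypothetical bucket:

import org.apache.spark.sql.SparkSession

// Sketch only: the bucket and file names are hypothetical. Submitted to a
// Dataproc cluster, gs:// paths resolve through the preinstalled GCS connector.
object GcsWordCount {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("gcs-wordcount").getOrCreate()
    import spark.implicits._

    val lines = spark.read.textFile("gs://example-bucket/input/notes.txt")
    val counts = lines.rdd
      .flatMap(_.split("\\s+"))      // tokenize on whitespace
      .map(word => (word, 1L))
      .reduceByKey(_ + _)            // aggregate counts per word

    counts.toDF("word", "count")
      .write.mode("overwrite").parquet("gs://example-bucket/output/wordcounts")
    spark.stop()
  }
}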

Data Engineering on Google Cloud
Delivered Online, Flexible Dates
Price on Enquiry

KM423 IBM InfoSphere DataStage v11.5 - Advanced Data Processing

By Nexus Human

Duration: 2 Days, 12 CPD hours

This course is intended for experienced DataStage developers who want more advanced job design techniques and techniques for working with complex types of data resources.

Overview
Use Connector stages to read from and write to database tables
Handle SQL errors in Connector stages
Use Connector stages with multiple input links
Use the File Connector stage to access Hadoop HDFS data
Optimize jobs that write to database tables
Use the Unstructured Data stage to extract data from Excel spreadsheets
Use the Data Masking stage to mask sensitive data processed within a DataStage job
Use the Hierarchical stage to parse, compose, and transform XML data
Use the Schema Library Manager to import and manage XML schemas
Use the Data Rules stage to validate fields of data within a DataStage job
Create custom data rules for validating data
Design a job that processes a star schema data warehouse with Type 1 and Type 2 slowly changing dimensions

This course introduces advanced parallel job data processing techniques in DataStage v11.5. You will develop techniques for processing different types of complex data resources, including relational data, unstructured data (Excel spreadsheets), and XML data. In addition, you will learn advanced techniques for processing data, including techniques for masking data and for validating data using data rules. Finally, you will learn techniques for updating data in a star schema data warehouse using the DataStage SCD (Slowly Changing Dimensions) stage. Even if you are not working with all of these specific types of data, you will benefit from the advanced job design techniques in this course, techniques that go beyond those covered in the DataStage Essentials course.

Accessing databases
Connector stage overview - Use Connector stages to read from and write to relational tables; Working with the Connector stage properties
Connector stage functionality - Before/After SQL; Sparse lookups; Optimizing insert/update performance
Error handling in Connector stages - Reject links; Reject conditions
Multiple input links - Designing jobs using Connector stages with multiple input links; Ordering records across multiple input links
File Connector stage - Read and write data to Hadoop file systems
Demonstration 1: Handling database errors
Demonstration 2: Parallel jobs with multiple Connector input links
Demonstration 3: Using the File Connector stage to read and write HDFS files

Processing unstructured data
Using the Unstructured Data stage in DataStage jobs - Extract data from an Excel spreadsheet; Specify a data range for data extraction; Specify document properties for data extraction
Demonstration 1: Processing unstructured data

Data masking
Using the Data Masking stage in DataStage jobs - Data masking techniques; Data masking policies; Applying policies for masquerading context-aware data types; Applying policies for masquerading generic data types; Repeatable replacement; Using reference tables; Creating custom reference tables
Demonstration 1: Data masking

Using data rules
Introduction to data rules - Using the Data Rules Editor; Selecting data rules; Binding data rule variables; Output link constraints; Adding statistics and attributes to the output information
Use the Data Rules stage to validate foreign key references in source data
Create custom data rules
Demonstration 1: Using data rules

Processing XML data
Introduction to the Hierarchical stage - Hierarchical stage Assembly editor; Using the Schema Library Manager to import and manage XML schemas
Composing XML data - Using the HJoin step to create parent-child relationships between input lists; Using the Composer step; Writing hierarchical data to a relational table; Using the Regroup step
Consuming XML data - Using the XML Parser step; Propagating columns
Transforming XML data - Using the Aggregate step; Using the Sort step; Using the Switch step; Using the H-Pivot step
Demonstration 1: Importing XML schemas
Demonstration 2: Compose hierarchical data
Demonstration 3: Consume hierarchical data
Demonstration 4: Transform hierarchical data

Updating a star schema database
Surrogate keys - Design a job that creates and updates a surrogate key source key file from a dimension table
Slowly Changing Dimensions (SCD) stage - Star schema databases; SCD stage Fast Path pages; Specifying purpose codes; Dimension update specification; Design a job that processes a star schema database with Type 1 and Type 2 slowly changing dimensions (the Type 2 pattern is sketched in code after this outline)
Demonstration 1: Build a parallel job that updates a star schema database with two dimensions

Additional course details: Nexus Humans KM423 IBM InfoSphere DataStage v11.5 - Advanced Data Processing training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you're new to professional skills or a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success. While we feel this is the best course for the KM423 IBM InfoSphere DataStage v11.5 - Advanced Data Processing course and one of our Top 10, we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.
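The final module's Type 2 slowly changing dimension logic is configured in the DataStage SCD stage rather than written by hand, but the pattern itself (expire the current row version, insert a new one) is worth seeing in code. A hedged, engine-neutral sketch in Spark/Scala, with invented table and column names:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Engine-neutral sketch of a Type 2 SCD update (not DataStage code):
// current rows whose tracked attribute changed are expired, and a new row
// version is inserted. All names and dates here are hypothetical; a full
// implementation would also carry forward the unchanged rows.
object ScdType2Sketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("scd2").master("local[*]").getOrCreate()
    import spark.implicits._

    val dim = Seq((1, "Alice", "Leeds", "2020-01-01", null: String, true))
      .toDF("cust_id", "name", "city", "eff_from", "eff_to", "is_current")
    val updates = Seq((1, "Alice", "Nottingham")).toDF("cust_id", "name", "city")

    // Current dimension rows whose city changed in the incoming feed.
    val changed = dim.filter($"is_current").alias("d")
      .join(updates.alias("u"), "cust_id")
      .filter($"d.city" =!= $"u.city")

    // Type 2: expire the old version...
    val expired = changed.select($"cust_id", $"d.name", $"d.city", $"eff_from",
      lit("2024-01-01").as("eff_to"), lit(false).as("is_current"))
    // ...and insert the new version with a fresh effective date.
    val inserted = changed.select($"cust_id", $"u.name", $"u.city",
      lit("2024-01-01").as("eff_from"), lit(null).cast("string").as("eff_to"),
      lit(true).as("is_current"))

    expired.union(inserted).show()
    spark.stop()
  }
}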

KM423 IBM InfoSphere DataStage v11.5 - Advanced Data Processing
Delivered Online, Flexible Dates
Price on Enquiry

KM404 IBM InfoSphere Advanced DataStage - Parallel Framework (v11.5)

By Nexus Human

Duration: 3 Days, 18 CPD hours

This course is intended for experienced DataStage developers seeking training in more advanced DataStage job techniques and an understanding of the parallel framework architecture. Students will develop a deeper understanding of the DataStage architecture, including the DataStage development and runtime environments.

Introduction to the Parallel Framework Architecture
Describe the parallel processing architecture
Describe pipeline and partition parallelism (a Spark analogy is sketched after this outline)
Describe the role of the configuration file
Design a job that creates robust test data

Compiling & Executing Jobs
Describe the main parts of the configuration file
Describe the compile process and the OSH that the compilation process generates
Describe the role and the main parts of the Score
Describe the job execution process

Partitioning & Collecting Data
Understand how partitioning works in the framework
View partitioners in the Score
Select partitioning algorithms
Generate sequences of numbers (surrogate keys) in a partitioned, parallel environment

Sorting Data
Sort data in the parallel framework
Find inserted sorts in the Score
Reduce the number of inserted sorts
Optimize fork-join jobs
Use Sort stages to determine the last row in a group
Describe sort key and partitioner key logic in the parallel framework

Buffering in Parallel Jobs
Describe how buffering works in parallel jobs
Tune buffers in parallel jobs
Avoid buffer contentions

Parallel Framework Data Types
Describe virtual data sets
Describe schemas
Describe data type mappings and conversions
Describe how external data is processed
Handle nulls
Work with complex data

Reusable Components
Create a schema file
Read a sequential file using a schema
Describe Runtime Column Propagation (RCP)
Enable and disable RCP
Create and use shared containers

Balanced Optimization
Enable Balanced Optimization functionality in Designer
Describe the Balanced Optimization workflow
List the different Balanced Optimization options
Push stage processing to a data source
Push stage processing to a data target
Optimize a job accessing the Hadoop HDFS file system
Understand the limitations of Balanced Optimization

Additional course details: Nexus Humans KM404 IBM InfoSphere Advanced DataStage - Parallel Framework (v11.5) training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you're new to professional skills or a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success. While we feel this is the best course for the KM404 IBM InfoSphere Advanced DataStage - Parallel Framework (v11.5) course and one of our Top 10, we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.
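DataStage's parallel framework is configured through the Designer and the configuration file rather than in code, but the parallelism the course opens with can be sketched by analogy. A hedged Spark/Scala illustration (not DataStage code) of partition parallelism: rows are hash-partitioned by key so each partition is processed independently, while the chained transformations stream through pipeline-style:

import org.apache.spark.sql.SparkSession

object PartitionParallelismSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("partition-sketch")
      .master("local[4]")
      .getOrCreate()
    import spark.implicits._

    // Hash-partition on the key, 4 ways: the analogue of partition parallelism.
    val rows = (1 to 1000).map(i => (i % 10, i)).toDF("key", "value")
    val doubled = rows
      .repartition(4, $"key")
      .mapPartitions(it => it.map(r => (r.getInt(0), r.getInt(1) * 2)))

    println(s"partitions = ${doubled.rdd.getNumPartitions}")
    spark.stop()
  }
}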

KM404 IBM InfoSphere Advanced DataStage - Parallel Framework (v11.5)
Delivered Online, Flexible Dates
Price on Enquiry

Google Cloud Platform Big Data and Machine Learning Fundamentals

By Nexus Human

Duration: 1 Day, 6 CPD hours

This class is intended for data analysts, data scientists and business analysts getting started with Google Cloud Platform; individuals responsible for designing pipelines and architectures for data processing, creating and maintaining machine learning and statistical models, querying datasets, visualizing query results and creating reports; and executives and IT decision makers evaluating Google Cloud Platform for use by data scientists.

Overview
This course teaches students the following skills:
Identify the purpose and value of the key Big Data and Machine Learning products in the Google Cloud Platform.
Use Cloud SQL and Cloud Dataproc to migrate existing MySQL and Hadoop/Pig/Spark/Hive workloads to Google Cloud Platform.
Employ BigQuery and Cloud Datalab to carry out interactive data analysis.
Train and use a neural network using TensorFlow.
Employ ML APIs.
Choose between different data processing products on the Google Cloud Platform.

This course introduces participants to the Big Data and Machine Learning capabilities of Google Cloud Platform (GCP). It provides a quick overview of the Google Cloud Platform and a deeper dive into the data processing capabilities.

Introducing Google Cloud Platform - Google Platform Fundamentals Overview; Google Cloud Platform Big Data Products
Compute and Storage Fundamentals - CPUs on demand (Compute Engine); A global filesystem (Cloud Storage); CloudShell; Lab: Set up an Ingest-Transform-Publish data processing pipeline
Data Analytics on the Cloud - Stepping-stones to the cloud; Cloud SQL: your SQL database on the cloud; Lab: Importing data into Cloud SQL and running queries; Spark on Dataproc; Lab: Machine Learning Recommendations with Spark on Dataproc (the technique is sketched after this outline)
Scaling Data Analysis - Fast random access; Datalab; BigQuery; Lab: Build a machine learning dataset
Machine Learning - Machine Learning with TensorFlow; Lab: Carry out ML with TensorFlow; Pre-built models for common needs; Lab: Employ ML APIs
Data Processing Architectures - Message-oriented architectures with Pub/Sub; Creating pipelines with Dataflow; Reference architecture for real-time and batch data processing
Summary - Why GCP?; Where to go from here; Additional resources

Additional course details: Nexus Humans Google Cloud Platform Big Data and Machine Learning Fundamentals training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you're new to professional skills or a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success. While we feel this is the best course for the Google Cloud Platform Big Data and Machine Learning Fundamentals course and one of our Top 10, we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.
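The "Machine Learning Recommendations with Spark on Dataproc" lab above centers on collaborative filtering. As a minimal sketch of that technique (not the course's actual lab code), here is Spark MLlib's ALS trained on a few invented ratings:

import org.apache.spark.ml.recommendation.ALS
import org.apache.spark.sql.SparkSession

// Sketch only: collaborative filtering with alternating least squares (ALS).
// The user/item IDs and ratings below are invented for illustration.
object AlsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("als-sketch").master("local[*]").getOrCreate()
    import spark.implicits._

    val ratings = Seq(
      (0, 10, 4.0f), (0, 11, 1.0f),
      (1, 10, 5.0f), (1, 12, 2.0f),
      (2, 11, 3.0f), (2, 12, 4.0f)
    ).toDF("userId", "itemId", "rating")

    val model = new ALS()
      .setUserCol("userId").setItemCol("itemId").setRatingCol("rating")
      .setRank(8).setMaxIter(5).setRegParam(0.1)
      .fit(ratings)

    // Top-2 item recommendations per user.
    model.recommendForAllUsers(2).show(truncate = false)
    spark.stop()
  }
}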

Google Cloud Platform Big Data and Machine Learning Fundamentals
Delivered Online, Flexible Dates
Price on Enquiry

Big Data Architecture Workshop

By Nexus Human

Duration: 3 Days, 18 CPD hours

This course is intended for senior executives, CIOs and CTOs, business intelligence executives, marketing executives, data and business analytics specialists, innovation specialists and entrepreneurs, academics, and other people interested in Big Data.

Overview
BDAW addresses advanced big data architecture topics, including data formats, transformation, real-time, batch and machine learning processing, scalability, fault tolerance, security and privacy, minimizing the risk of an unsound architecture, and technology selection.

Big Data Architecture Workshop (BDAW) is a learning event that addresses advanced big data architecture topics. BDAW brings together technical contributors in a group setting to design and architect solutions to a challenging business problem. The workshop addresses big data architecture problems in general, and then applies them to the design of a challenging system. Throughout the highly interactive workshop, students apply concepts to real-world examples, resulting in detailed synergistic discussions. The workshop is conducive for students to learn techniques for architecting big data systems, not only from Cloudera's experience but also from the experiences of fellow students.

Workshop Application Use Cases - Oz Metropolitan; Architectural questions; Team activity: Analyze Metroz application use cases
Application Vertical Slice - Definition; Minimizing the risk of an unsound architecture; Selecting a vertical slice; Team activity: Identify an initial vertical slice for Metroz
Application Processing - Real-time and near-real-time processing; Batch processing; Data access patterns; Delivery and processing guarantees; Machine learning pipelines; Team activity: Identify delivery and processing patterns in Metroz, characterize response time requirements, identify machine learning pipelines
Application Data - The three V's of Big Data; Data lifecycle; Data formats; Transforming data; Team activity: Metroz data requirements
Scalable Applications - Scale up, scale out, scale to X; Determining if an application will scale; Poll: scalable airport terminal designs; Hadoop and Spark scalability; Team activity: Scaling Metroz
Fault Tolerant Distributed Systems - Principles; Transparency; Hardware vs. software redundancy; Tolerating disasters; Stateless functional fault tolerance; Stateful fault tolerance; Replication and group consistency; Fault tolerance in Spark and MapReduce (sketched after this outline); Application tolerance for failures; Team activity: Identify Metroz component failures and requirements
Security and Privacy - Principles; Privacy; Threats; Technologies; Team activity: Identify threats and security mechanisms in Metroz
Deployment - Cluster sizing and evolution; On-premise vs. cloud; Edge computing; Team activity: Select deployment for Metroz
Technology Selection - HDFS; HBase; Kudu; Relational database management systems; MapReduce; Spark, including streaming, SparkSQL and SparkML; Hive; Impala; Cloudera Search; Data sets and formats; Team activity: Technologies relevant to Metroz
Software Architecture - Architecture artifacts; One platform or multiple; Lambda architecture; Team activity: Produce a high-level architecture, select technologies, revisit the vertical slice; Vertical slice demonstration

Additional course details: Nexus Humans Big Data Architecture Workshop training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you're new to professional skills or a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success. While we feel this is the best course for the Big Data Architecture Workshop course and one of our Top 10, we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.
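Two workshop threads, delivery/processing guarantees and fault tolerance in Spark, meet in structured streaming's checkpointing: a restarted query resumes from its last committed offsets and state instead of reprocessing or losing data. A minimal sketch, assuming a hypothetical checkpoint path and using Spark's built-in rate test source:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Sketch only: a windowed count over a synthetic stream. The checkpoint
// location is what makes the query recoverable after a driver restart.
object CheckpointedStreamSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("checkpointed-stream")
      .master("local[*]")
      .getOrCreate()

    val counts = spark.readStream
      .format("rate").option("rowsPerSecond", "5").load()
      .groupBy(window(col("timestamp"), "10 seconds"))
      .count()

    counts.writeStream
      .outputMode("complete")
      .format("console")
      .option("checkpointLocation", "/tmp/bdaw-checkpoint") // hypothetical path; enables recovery
      .start()
      .awaitTermination()
  }
}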

Big Data Architecture Workshop
Delivered Online, Flexible Dates
Price on Enquiry

Cisco Designing the FlexPod Solution (FPDESIGN)

By Nexus Human

Duration: 2 Days, 12 CPD hours

This course is designed for post-sales audiences and is aimed at channel partners, customer network engineers and administrators whose interest is focused on designing a scalable infrastructure with the FlexPod.

Overview
Upon completing this course, you will be able to:
Describe the FlexPod data center solutions and architecture
Identify FlexPod workload sizing and technical specifications
Describe the FlexPod deployment and management strategies

The goal of this course is to evaluate the FlexPod solution design process against contemporary data center challenges. The course provides a comprehensive understanding of the reconnaissance and analytics needed to assess computing solution performance characteristics and requirements. In addition, this course describes the hardware components of the FlexPod and the process for selecting proper hardware for a given set of requirements.

FlexPod Data Center Solutions and Architecture
Describe data center elements
Identify data center business challenges
Identify data center environmental challenges
Identify data center technical challenges
Describe the data center consolidation trend
Describe the FlexPod solution
Identify the benefits of FlexPod
Describe FlexPod platforms
Describe FlexPod validated and supported designs
Identify the supported Cisco UCS components
Identify the supported Cisco Nexus switch components
Identify the supported NetApp storage components

FlexPod Workload Sizing and Technical Specifications
Describe FlexPod performance characteristics
Describe server virtualization performance characteristics
Describe desktop virtualization performance characteristics
Describe reconnaissance and analysis tools
Describe the process for deploying analysis tools
Configure the Microsoft MAP Toolkit
Identify FlexPod design components
Describe FlexPod sizing considerations
Employ the Cisco UCS Application Sizer
Employ the Cisco UCS VXI Resource Comparison tool
Describe the NetApp Solution Builder Sizing tool

FlexPod Deployment and Management Strategies
Describe key FlexPod LAN features
Describe key FlexPod SAN features
Identify FlexPod server provisioning features
List FlexPod high availability features
Describe supported FlexPod SAN features
Describe FlexPod virtual storage tiering features
Identify Cisco FlexPod validated designs
Identify FlexPod data center with VMware vSphere 5.1
Identify FlexPod data center with VMware vSphere 5.1 with Cisco Nexus 7000
Identify FlexPod data center with the Microsoft Private Cloud Enterprise Design Guide
Identify FlexPod Select with Cloudera's Distribution including Apache Hadoop (CDH)
Identify FlexPod Cisco Nexus 7000 and NetApp MetroCluster for multisite deployment
Identify data center operations and management challenges
Describe FlexPod validated management solutions
Describe Cisco UCS Director turnkey solutions
Identify Cisco UCS Director management types
Describe Cisco UCS Director automation
Describe self-service provisioning and reporting
Identify the customer challenges and goals
Describe the workload analysis
Describe the component selection process
Review the selected components
Analyze the solution

Additional course details: Nexus Humans Cisco Designing the FlexPod Solution (FPDESIGN) training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you're new to professional skills or a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success. While we feel this is the best course for the Cisco Designing the FlexPod Solution (FPDESIGN) course and one of our Top 10, we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.

Cisco Designing the FlexPod Solution (FPDESIGN)
Delivered Online, Flexible Dates
Price on Enquiry

Mastering Scala with Apache Spark for the Modern Data Enterprise (TTSK7520)

By Nexus Human

Duration: 5 Days, 30 CPD hours

This intermediate-and-beyond level course is geared for experienced technical professionals in various roles, such as developers, data analysts, data engineers, software engineers, and machine learning engineers who want to leverage Scala and Spark to tackle complex data challenges and develop scalable, high-performance applications across diverse domains. Practical programming experience is required to participate in the hands-on labs.

Overview
Working in a hands-on learning environment led by our expert instructor you'll:
Develop a basic understanding of Scala and Apache Spark fundamentals, enabling you to confidently create scalable and high-performance applications.
Learn how to process large datasets efficiently, helping you handle complex data challenges and make data-driven decisions.
Gain hands-on experience with real-time data streaming, allowing you to manage and analyze data as it flows into your applications.
Acquire practical knowledge of machine learning algorithms using Spark MLlib, empowering you to create intelligent applications and uncover hidden insights.
Master graph processing with GraphX, enabling you to analyze and visualize complex relationships in your data.
Discover generative AI technologies using GPT with Spark and Scala, opening up new possibilities for automating content generation and enhancing data analysis.

Embark on a journey to master the world of big data with our immersive course on Scala and Spark! Mastering Scala with Apache Spark for the Modern Data Enterprise is a five-day, hands-on course designed to provide you with the essential skills and tools to tackle complex data projects using the Scala programming language and Apache Spark, a high-performance data processing engine. Mastering these technologies will enable you to perform a wide range of tasks, from data wrangling and analytics to machine learning and artificial intelligence, across various industries and applications.

Guided by our expert instructor, you'll explore the fundamentals of Scala programming and Apache Spark while gaining valuable hands-on experience with Spark programming, RDDs, DataFrames, Spark SQL, and data sources. You'll also explore Spark Streaming, performance optimization techniques, and the integration of popular external libraries, tools, and cloud platforms like AWS, Azure, and GCP. Machine learning enthusiasts will delve into Spark MLlib, covering the basics of machine learning algorithms, data preparation, feature extraction, and various techniques such as regression, classification, clustering, and recommendation systems.

Introduction to Scala - Brief history and motivation; Differences between Scala and Java; Basic Scala syntax and constructs; Scala's functional programming features
Introduction to Apache Spark - Overview and history; Spark components and architecture; The Spark ecosystem; Comparing Spark with other big data frameworks
Basics of Spark Programming - SparkContext and SparkSession; Resilient Distributed Datasets (RDDs); Transformations and actions; Working with DataFrames (see the sketch after this outline)
Spark SQL and Data Sources - The Spark SQL library and its advantages; Structured and semi-structured data sources; Reading and writing data in various formats (CSV, JSON, Parquet, Avro, etc.); Data manipulation using SQL queries
Basic RDD Operations - Creating and manipulating RDDs; Common transformations and actions on RDDs; Working with key-value data
Basic DataFrame and Dataset Operations - Creating and manipulating DataFrames and Datasets; Column operations and functions; Filtering, sorting, and aggregating data
Introduction to Spark Streaming - Overview of Spark Streaming; Discretized Stream (DStream) operations; Windowed operations and stateful processing
Performance Optimization Basics - Best practices for efficient Spark code; Broadcast variables and accumulators; Monitoring Spark applications
Integrating External Libraries, Tools and Platforms - Using popular external libraries such as Hadoop and HBase; Integrating with cloud platforms: AWS, Azure, GCP; Connecting to data storage systems: HDFS, S3, Cassandra, etc.
Introduction to Machine Learning Basics - Overview of machine learning; Supervised and unsupervised learning; Common algorithms and use cases
Introduction to Spark MLlib - Overview of Spark MLlib; MLlib's algorithms and utilities; Data preparation and feature extraction
Linear Regression and Classification - The linear regression algorithm; Logistic regression for classification; Model evaluation and performance metrics
Clustering Algorithms - Overview of clustering algorithms; K-means clustering; Model evaluation and performance metrics
Collaborative Filtering and Recommendation Systems - Overview of recommendation systems; Collaborative filtering techniques; Implementing recommendations with Spark MLlib
Introduction to Graph Processing - Overview of graph processing; Use cases and applications of graph processing; Graph representations and operations
Introduction to Spark GraphX - Overview of GraphX; Creating and transforming graphs; Graph algorithms in GraphX
Big Data Innovation! Using GPT and Generative AI Technologies with Spark and Scala - Overview of generative AI technologies; Integrating GPT with Spark and Scala; Practical applications and use cases
Bonus Topics (time permitting): Introduction to Spark NLP - Overview of Spark NLP; Preprocessing text data; Text classification and sentiment analysis
Putting It All Together - Work on a capstone project that integrates multiple aspects of the course, including data processing, machine learning, graph processing, and generative AI technologies
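As a taste of the course's core workflow (a sketch, not course lab material): build an RDD, promote data to a DataFrame, and query it with Spark SQL, all in a few lines of Scala:

import org.apache.spark.sql.SparkSession

// Minimal, self-contained sketch of the RDD -> DataFrame -> Spark SQL flow.
// The sample records are invented for illustration.
object SparkBasicsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("spark-basics-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // RDD basics: the transformation (filter) is lazy; the action (count) runs it.
    val numbers = spark.sparkContext.parallelize(1 to 100)
    println(s"evens = ${numbers.filter(_ % 2 == 0).count()}")

    // DataFrame + Spark SQL: register a view and query it declaratively.
    val sales = Seq(("books", 12.5), ("books", 7.0), ("games", 30.0))
      .toDF("category", "amount")
    sales.createOrReplaceTempView("sales")

    spark.sql(
      """SELECT category, SUM(amount) AS total
        |FROM sales
        |GROUP BY category
        |ORDER BY total DESC""".stripMargin).show()

    spark.stop()
  }
}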

Mastering Scala with Apache Spark for the Modern Data Enterprise (TTSK7520)
Delivered Online, Flexible Dates
Price on Enquiry