118 Big Data Analytics courses

Data Analyst (Data Analytics) - CPD Certified

By Training Tale

Data Analyst (Data Analytics) - CPD Certified Have you ever wondered how companies get insights from massive volumes of data to stay competitive and make wise decisions? If so, then participate in our exclusive Data Analytics Course. This Data Analytics Course covers the fundamentals of data and statistics and introduces data analytics. How to get data and where to find it is explained in the Data Analytics Course. Moreover, this Data Analytics Course covers data cleansing, preprocessing, and exploratory data analysis (EDA). Additionally, the Data Analytics Course provides an introduction to Python and Excel for data analytics. This thorough Data Analytics Course includes lessons on data wrangling with Pandas (Python) and data visualisation using Matplotlib and Seaborn (Python). Enrol in our Data Analytics Course to study the fundamentals of statistical analysis and machine learning. Special Offers of this Data Analyst (Data Analytics) Course Data Analyst (Data Analytics) Course includes a FREE PDF Certificate.
Lifetime access to this Data Analyst (Data Analytics) Course Instant access to this Data Analyst (Data Analytics) Course Get FREE Tutor Support from Monday to Friday in this Data Analyst (Data Analytics) Course Courses are included in this Data Analyst (Data Analytics) Course Course 01: Cyber Security Course 02: GDPR Course 03: Business Administration [ Note: Free PDF certificate as soon as completing the Data Analyst (Data Analytics) Course] Course Curriculum of Data Analyst (Data Analytics) - CPD Certified Module 1: Introduction to Data Analytics Module 2: Basics of Data and Statistics Module 3: Data Collection and Sources Module 4: Data Cleaning and Preprocessing Module 5: Exploratory Data Analysis (EDA) Module 6: Introduction to Excel for Data Analytics Module 7: Introduction to Python for Data Analytics Module 8: Data Wrangling with Pandas (Python) Module 9: Data visualisation with Matplotlib and Seaborn (Python) Module 10: Introduction to Basic Statistical Analysis Module 11: Introduction to Machine Learning Module 12: Capstone Project - Exploratory Data Analysis Assessment Method After completing each module of the Data Analyst (Data Analytics) Course, you will find automated MCQ quizzes. To unlock the next module, you need to complete the quiz task and get at least 60% marks. Certification After completing the MCQ/Assignment assessment for this Data Analyst (Data Analytics) course, you will be entitled to a Certificate of Completion from Training Tale. The certificate is in PDF format, which is completely free to download. A printed version is also available upon request. It will also be sent to you through a courier for £13.99. Who is this course for? Data Analyst (Data Analytics) - CPD Certified For business professionals, entrepreneurs, or anybody else looking to have a thorough grasp of data analysis in a commercial setting, this Data Analytics Course is ideal. 
Requirements There are no specific requirements for Data Analyst (Data Analytics) Course because it does not require any advanced knowledge or skills. Career path Data Analyst (Data Analytics) - CPD Certified This Data Analytics Course will assist you in obtaining positions as a business analyst, marketing analyst, data analyst, and in related fields. Certificates Certificate of completion Digital certificate - Included

Data Analyst (Data Analytics) - CPD Certified
Delivered Online On Demand 18 hours
£12

Data Science and Data Analytics with Python

By Xpert Learning

About Course Data Science and Data Analytics with Python: A Comprehensive Course for Beginners Unlock the power of data and gain insights that drive informed decisions with this comprehensive course on data science and data analytics with Python. This course is designed for beginners of all skill levels, with no prior programming experience required. You will learn the essential skills to embark on your data-driven journey, including: Data manipulation with NumPy and Pandas Data visualization with Matplotlib and Seaborn Statistical analysis with Python Machine learning and artificial intelligence You will also gain hands-on experience with real-world data projects, allowing you to apply your newfound knowledge to solve real-world problems. By the end of this course, you will be able to: Understand the fundamentals of data science and data analytics Apply Python to manipulate, visualize, and analyze data Use Python to build machine learning and artificial intelligence models Solve real-world data problems This course is the perfect launchpad for your data science journey. Whether you are looking to pivot your career, enhance your skill set, or simply quench your curiosity, this course will give you the foundation you need to succeed. Enroll today and start exploring the fascinating world of data science together! What Will You Learn? 
Understand the fundamentals of data science and data analytics Apply Python to manipulate, visualize, and analyze data Use Python to build machine learning and artificial intelligence models Solve real-world data problems Course Content Introduction to Python Data Science Introduction to Python Data Science Environment Setup Data Cleaning Packages Working with the Numpy package Working with Pandas Data science package Data Visualization Packages Working with Matplotlib Data Science package (Part - 1) Working with Matplotlib Data Science (Part - 2) A course by Uditha Bandara Microsoft Most Valuable Professional (MVP) Requirements Beginner-level knowledge of working with data. Programming knowledge not required. Audience Beginners with no prior programming experience Anyone interested in learning data science and data analytics

Data Science and Data Analytics with Python
Delivered Online On Demand
£9.99

Learn MySQL from scratch for Data Science and Analytics

By Xpert Learning

A course by Sekhar Metla IT Industry Expert Requirements No prior technical experience is required! All you need is a computer! No SQL experience needed. You will learn everything you need to know. No software is required in advance of the course (all software used in the course is free). Audience Beginner SQL, Data Science and Analytics - developers curious about SQL Career Anyone who wants to generate new income streams Anyone who works with data analytics, or databases! Anyone who wants to become a Business Intelligence developer Anyone who wants to start their own business or become freelance Anyone who wants to become a Data Science developer If you work in: marketing, finance, accounting, operations, sales, manufacturing, healthcare, financial services, or any other industry/function that collects information Someone who wants to learn skills that give them the potential to earn near SIX figures!

Learn MySQL from scratch for Data Science and Analytics
Delivered Online On Demand 6 hours
£9.99

Data Analyst: Data Analysis in Excel

By IOMH - Institute of Mental Health

Overview of Data Analyst: Data Analysis in Excel Join our Data Analyst: Data Analysis in Excel course and discover your hidden skills, setting you on a path to success in this area. Get ready to improve your skills and achieve your biggest goals. The Data Analyst: Data Analysis in Excel course has everything you need to get a great start in this sector. Improving and moving forward is key to getting ahead personally. The Data Analyst: Data Analysis in Excel course is designed to teach you the important stuff quickly and well, helping you to get off to a great start in the field. So, what are you looking for? Enrol now! Get a Quick Look at The Course Content: This Data Analyst: Data Analysis in Excel Course will help you to learn: Learn strategies to boost your workplace efficiency. Hone your skills to help you advance your career. Acquire a comprehensive understanding of various topics and tips. Learn in-demand skills that are in high demand among UK employers This course covers the topic you must know to stand against the tough competition. The future is truly yours to seize with this Data Analyst: Data Analysis in Excel. Enrol today and complete the course to achieve a certificate that can change your career forever. Details Perks of Learning with IOMH One-To-One Support from a Dedicated Tutor Throughout Your Course. Study Online - Whenever and Wherever You Want. Instant Digital/ PDF Certificate. 100% Money Back Guarantee. 12 Months Access. Process of Evaluation After studying the course, an MCQ exam or assignment will test your skills and knowledge. You have to get a score of 60% to pass the test and get your certificate. Certificate of Achievement Certificate of Completion - Digital / PDF Certificate After completing the Data Analyst: Data Analysis in Excel course, you can order your CPD Accredited Digital / PDF Certificate for £5.99.  Certificate of Completion - Hard copy Certificate You can get the CPD Accredited Hard Copy Certificate for £12.99. 
Shipping Charges: Inside the UK: £3.99 International: £10.99 Who Is This Course for? This Data Analyst: Data Analysis in Excel is suitable for anyone aspiring to start a career in relevant field; even if you are new to this and have no prior knowledge, this course is going to be very easy for you to understand.  On the other hand, if you are already working in this sector, this course will be a great source of knowledge for you to improve your existing skills and take them to the next level.  This course has been developed with maximum flexibility and accessibility, making it ideal for people who don't have the time to devote to traditional education. Requirements You don't need any educational qualification or experience to enrol in the Data Analyst: Data Analysis in Excel course. Do note: you must be at least 16 years old to enrol. Any internet-connected device, such as a computer, tablet, or smartphone, can access this online course. Career Path The certification and skills you get from this Data Analyst: Data Analysis in Excel Course can help you advance your career and gain expertise in several fields, allowing you to apply for high-paying jobs in related sectors. 
Course Curriculum Modifying a Worksheet Insert, Delete, and Adjust Cells, Columns, and Rows 00:10:00 Search for and Replace Data 00:09:00 Use Proofing and Research Tools 00:07:00 Working with Lists Sort Data 00:10:00 Filter Data 00:10:00 Query Data with Database Functions 00:09:00 Outline and Subtotal Data 00:09:00 Analyzing Data Apply Intermediate Conditional Formatting 00:07:00 Apply Advanced Conditional Formatting 00:05:00 Visualizing Data with Charts Create Charts 00:13:00 Modify and Format Charts 00:12:00 Use Advanced Chart Features 00:12:00 Using PivotTables and PivotCharts Create a PivotTable 00:13:00 Analyze PivotTable Data 00:12:00 Present Data with PivotCharts 00:07:00 Filter Data by Using Timelines and Slicers 00:11:00 Working with Multiple Worksheets and Workbooks Use Links and External References 00:12:00 Use 3-D References 00:06:00 Consolidate Data 00:05:00 Using Lookup Functions and Formula Auditing Use Lookup Functions 00:12:00 Trace Cells 00:09:00 Watch and Evaluate Formulas 00:08:00 Automating Workbook Functionality Apply Data Validation 00:13:00 Search for Invalid Data and Formulas with Errors 00:04:00 Work with Macros 00:18:00 Creating Sparklines and Mapping Data Create Sparklines 00:07:00 Map Data 00:07:00 Forecasting Data Determine Potential Outcomes Using Data Tables 00:08:00 Determine Potential Outcomes Using Scenarios 00:09:00 Use the Goal Seek Feature 00:04:00 Forecasting Data Trends 00:05:00

Data Analyst: Data Analysis in Excel
Delivered Online On Demand 4 hours 43 minutes
£11

Learn Data Science with Python, JavaScript, and Microsoft SQL

By Xpert Learning

A course by Sekhar Metla IT Industry Expert Requirements No programming experience needed. You will learn everything you need to know. No software is required in advance of the course (all software used in the course is free). No prior knowledge is required; you will learn from the basics. Audience Beginner JavaScript, Python and MSSQL developers curious about data science development Anyone who wants to generate new income streams Anyone who wants to build websites Anyone who wants to become financially independent Anyone who wants to start their own business or become freelance Anyone who wants to become a Full stack web developer

Learn Data Science with Python, JavaScript, and Microsoft SQL
Delivered Online On Demand 22 hours
£9.99

GDPR Data Protection Law [Updated 2023]

5.0(1)

By Empower UK Employment Training

GDPR Data Protection Law [Updated 2023] Stay ahead in compliance with our updated 2023 GDPR Data Protection Law course. Equip yourself with the latest in GDPR Data Protection standards. Secure your organisation's future with comprehensive GDPR Data Protection knowledge. Learning Outcomes: Navigate the Introduction to GDPR for compliance. Uphold the Principles of GDPR in data management. Ensure Lawful Basis for Processing personal data. Defend the Rights of Data Subject under GDPR. Differentiate roles of Data Controller and Processor. More Benefits: LIFETIME access Device Compatibility Free Workplace Management Toolkit Key Modules from GDPR Data Protection Law [Updated 2023]: Introduction to GDPR: Familiarise yourself with the GDPR's scope and its impact on GDPR Data Protection practices. Principles of GDPR: Grasp the key GDPR principles that underpin effective GDPR Data Protection strategies. Lawful Basis for Processing: Understand the legal grounds for processing personal data within GDPR Data Protection frameworks. Rights of Data Subject: Recognise the rights individuals hold over their data, a cornerstone of GDPR Data Protection. Data Controller and Data Processor: Define and distinguish between the responsibilities of data controllers and processors under GDPR Data Protection laws. Data Protection by Design and by Default: Implement GDPR Data Protection requirements throughout your data processing activities. Security of Data: Master the security measures required to protect data in line with GDPR Data Protection guidelines. Data Breaches: Learn how to effectively manage and report data breaches in accordance with GDPR Data Protection procedures. Workplace and GDPR: Apply GDPR Data Protection policies within your organisational processes and workplace culture. Transferring Data Outside of EEA: Navigate the complexities of transferring data internationally under GDPR Data Protection rules. 
Exemptions: Identify the exemptions within GDPR Data Protection law and how they may apply to certain data processing scenarios.

GDPR Data Protection Law [Updated 2023]
Delivered Online On Demand 2 hours 42 minutes
£5

Introduction to Hadoop Administration (TTDS6503)

By Nexus Human

Duration 3 Days 18 CPD hours This course is intended for This is an introductory-level course designed to teach experienced systems administrators how to install, maintain, monitor, troubleshoot, optimize, and secure Hadoop. Previous Hadoop experience is not required. Overview Working in an engaging, hands-on learning environment, guided by our expert team, attendees will learn to: Understand the benefits of distributed computing Understand the Hadoop architecture (including HDFS and MapReduce) Define administrator participation in Big Data projects Plan, implement, and maintain Hadoop clusters Deploy and maintain additional Big Data tools (Pig, Hive, Flume, etc.) Plan, deploy and maintain HBase on a Hadoop cluster Monitor and maintain hundreds of servers Pinpoint performance bottlenecks and fix them Apache Hadoop is an open source framework for creating reliable and distributable compute clusters. Hadoop provides an excellent platform (with other related frameworks) to process large unstructured or semi-structured data sets from multiple sources to dissect, classify, learn from and make suggestions for business analytics, decision support, and other advanced forms of machine intelligence. This is an introductory-level, hands-on lab-intensive course geared for the administrator (new to Hadoop) who is charged with maintaining a Hadoop cluster and its related components. You will learn how to install, maintain, monitor, troubleshoot, optimize, and secure Hadoop.
Introduction Hadoop history and concepts Ecosystem Distributions High level architecture Hadoop myths Hadoop challenges (hardware / software) Planning and installation Selecting software and Hadoop distributions Sizing the cluster and planning for growth Selecting hardware and network Rack topology Installation Multi-tenancy Directory structure and logs Benchmarking HDFS operations Concepts (horizontal scaling, replication, data locality, rack awareness) Nodes and daemons (NameNode, Secondary NameNode, HA Standby NameNode, DataNode) Health monitoring Command-line and browser-based administration Adding storage and replacing defective drives MapReduce operations Parallel computing before MapReduce: compare HPC versus Hadoop administration MapReduce cluster loads Nodes and Daemons (JobTracker, TaskTracker) MapReduce UI walk through MapReduce configuration Job config Job schedulers Administrator view of MapReduce best practices Optimizing MapReduce Fool proofing MR: what to tell your programmers YARN: architecture and use Advanced topics Hardware monitoring System software monitoring Hadoop cluster monitoring Adding and removing servers and upgrading Hadoop Backup, recovery, and business continuity planning Cluster configuration tweaks Hardware maintenance schedule Oozie scheduling for administrators Securing your cluster with Kerberos The future of Hadoop

Introduction to Hadoop Administration (TTDS6503)
Delivered Online Flexible Dates
Price on Enquiry

Cloudera Training for Apache HBase

By Nexus Human

Duration 4 Days 24 CPD hours This course is intended for This course is appropriate for developers and administrators who intend to use HBase. Overview Skills learned on the course include: the use cases and usage occasions for HBase, Hadoop, and RDBMS; using the HBase shell to directly manipulate HBase tables; designing optimal HBase schemas for efficient data storage and recovery; how to connect to HBase using the Java API, configure the HBase cluster, and administer an HBase cluster; and best practices for identifying and resolving performance bottlenecks. Cloudera University's four-day training course for Apache HBase enables participants to store and access massive quantities of multi-structured data and perform hundreds of thousands of operations per second. Introduction to Hadoop & HBase What Is Big Data? Introducing Hadoop Hadoop Components What Is HBase? Why Use HBase? Strengths of HBase HBase in Production Weaknesses of HBase HBase Tables HBase Concepts HBase Table Fundamentals Thinking About Table Design The HBase Shell Creating Tables with the HBase Shell Working with Tables Working with Table Data HBase Architecture Fundamentals HBase Regions HBase Cluster Architecture HBase and HDFS Data Locality HBase Schema Design General Design Considerations Application-Centric Design Designing HBase Row Keys Other HBase Table Features Basic Data Access with the HBase API Options to Access HBase Data Creating and Deleting HBase Tables Retrieving Data with Get Retrieving Data with Scan Inserting and Updating Data Deleting Data More Advanced HBase API Features Filtering Scans Best Practices HBase Coprocessors HBase on the Cluster How HBase Uses HDFS Compactions and Splits HBase Reads & Writes How HBase Writes Data How HBase Reads Data Block Caches for Reading HBase Performance Tuning Column Family Considerations Schema Design Considerations Configuring for Caching Dealing with Time Series and Sequential Data Pre-Splitting Regions HBase Administration and Cluster Management HBase 
Daemons ZooKeeper Considerations HBase High Availability Using the HBase Balancer Fixing Tables with hbck HBase Security HBase Replication & Backup HBase Replication HBase Backup MapReduce and HBase Clusters Using Hive & Impala with HBase Using Hive and Impala with HBase Appendix A: Accessing Data with Python and Thrift Thrift Usage Working with Tables Getting and Putting Data Scanning Data Deleting Data Counters Filters Appendix B: OpenTSDB

Cloudera Training for Apache HBase
Delivered Online Flexible Dates
Price on Enquiry

Designing and Building Big Data Applications

By Nexus Human

Duration 4 Days 24 CPD hours This course is intended for This course is best suited to developers, engineers, and architects who want to use Hadoop and related tools to solve real-world problems. Overview Skills learned in this course include: creating a data set with Kite SDK; developing custom Flume components for data ingestion; managing a multi-stage workflow with Oozie; analyzing data with Crunch; writing user-defined functions for Hive and Impala; and indexing data with Cloudera Search. Cloudera University's four-day course for designing and building Big Data applications prepares you to analyze and solve real-world problems using Apache Hadoop and associated tools in the enterprise data hub (EDH). Introduction Application Architecture Scenario Explanation Understanding the Development Environment Identifying and Collecting Input Data Selecting Tools for Data Processing and Analysis Presenting Results to the User Defining & Using Datasets Metadata Management What is Apache Avro? Avro Schemas Avro Schema Evolution Selecting a File Format Performance Considerations Using the Kite SDK Data Module What is the Kite SDK? Fundamental Data Module Concepts Creating New Data Sets Using the Kite SDK Loading, Accessing, and Deleting a Data Set Importing Relational Data with Apache Sqoop What is Apache Sqoop? Basic Imports Limiting Results Improving Sqoop's Performance Sqoop 2 Capturing Data with Apache Flume What is Apache Flume? 
Basic Flume Architecture Flume Sources Flume Sinks Flume Configuration Logging Application Events to Hadoop Developing Custom Flume Components Flume Data Flow and Common Extension Points Custom Flume Sources Developing a Flume Pollable Source Developing a Flume Event-Driven Source Custom Flume Interceptors Developing a Header-Modifying Flume Interceptor Developing a Filtering Flume Interceptor Writing Avro Objects with a Custom Flume Interceptor Managing Workflows with Apache Oozie The Need for Workflow Management What is Apache Oozie? Defining an Oozie Workflow Validation, Packaging, and Deployment Running and Tracking Workflows Using the CLI Hue UI for Oozie Processing Data Pipelines with Apache Crunch What is Apache Crunch? Understanding the Crunch Pipeline Comparing Crunch to Java MapReduce Working with Crunch Projects Reading and Writing Data in Crunch Data Collection API Functions Utility Classes in the Crunch API Working with Tables in Apache Hive What is Apache Hive? Accessing Hive Basic Query Syntax Creating and Populating Hive Tables How Hive Reads Data Using the RegexSerDe in Hive Developing User-Defined Functions What are User-Defined Functions? Implementing a User-Defined Function Deploying Custom Libraries in Hive Registering a User-Defined Function in Hive Executing Interactive Queries with Impala What is Impala? Comparing Hive to Impala Running Queries in Impala Support for User-Defined Functions Data and Metadata Management Understanding Cloudera Search What is Cloudera Search? Search Architecture Supported Document Formats Indexing Data with Cloudera Search Collection and Schema Management Morphlines Indexing Data in Batch Mode Indexing Data in Near Real Time Presenting Results to Users Solr Query Syntax Building a Search UI with Hue Accessing Impala through JDBC Powering a Custom Web Application with Impala and Search

Designing and Building Big Data Applications
Delivered Online Flexible Dates
Price on Enquiry

Mastering Scala with Apache Spark for the Modern Data Enterprise (TTSK7520)

By Nexus Human

Duration 5 Days 30 CPD hours This course is intended for This intermediate and beyond level course is geared for experienced technical professionals in various roles, such as developers, data analysts, data engineers, software engineers, and machine learning engineers who want to leverage Scala and Spark to tackle complex data challenges and develop scalable, high-performance applications across diverse domains. Practical programming experience is required to participate in the hands-on labs. Overview Working in a hands-on learning environment led by our expert instructor you'll: Develop a basic understanding of Scala and Apache Spark fundamentals, enabling you to confidently create scalable and high-performance applications. Learn how to process large datasets efficiently, helping you handle complex data challenges and make data-driven decisions. Gain hands-on experience with real-time data streaming, allowing you to manage and analyze data as it flows into your applications. Acquire practical knowledge of machine learning algorithms using Spark MLlib, empowering you to create intelligent applications and uncover hidden insights. Master graph processing with GraphX, enabling you to analyze and visualize complex relationships in your data. Discover generative AI technologies using GPT with Spark and Scala, opening up new possibilities for automating content generation and enhancing data analysis. Embark on a journey to master the world of big data with our immersive course on Scala and Spark! Mastering Scala with Apache Spark for the Modern Data Enterprise is a five-day, hands-on course designed to provide you with the essential skills and tools to tackle complex data projects using the Scala programming language and Apache Spark, a high-performance data processing engine. 
Mastering these technologies will enable you to perform a wide range of tasks, from data wrangling and analytics to machine learning and artificial intelligence, across various industries and applications. Guided by our expert instructor, you'll explore the fundamentals of Scala programming and Apache Spark while gaining valuable hands-on experience with Spark programming, RDDs, DataFrames, Spark SQL, and data sources. You'll also explore Spark Streaming, performance optimization techniques, and the integration of popular external libraries, tools, and cloud platforms like AWS, Azure, and GCP. Machine learning enthusiasts will delve into Spark MLlib, covering basics of machine learning algorithms, data preparation, feature extraction, and various techniques such as regression, classification, clustering, and recommendation systems. Introduction to Scala Brief history and motivation Differences between Scala and Java Basic Scala syntax and constructs Scala's functional programming features Introduction to Apache Spark Overview and history Spark components and architecture Spark ecosystem Comparing Spark with other big data frameworks Basics of Spark Programming SparkContext and SparkSession Resilient Distributed Datasets (RDDs) Transformations and Actions Working with DataFrames Spark SQL and Data Sources Spark SQL library and its advantages Structured and semi-structured data sources Reading and writing data in various formats (CSV, JSON, Parquet, Avro, etc.) 
Data manipulation using SQL queries Basic RDD Operations Creating and manipulating RDDs Common transformations and actions on RDDs Working with key-value data Basic DataFrame and Dataset Operations Creating and manipulating DataFrames and Datasets Column operations and functions Filtering, sorting, and aggregating data Introduction to Spark Streaming Overview of Spark Streaming Discretized Stream (DStream) operations Windowed operations and stateful processing Performance Optimization Basics Best practices for efficient Spark code Broadcast variables and accumulators Monitoring Spark applications Integrating External Libraries and Tools, Spark Streaming Using popular external libraries, such as Hadoop and HBase Integrating with cloud platforms: AWS, Azure, GCP Connecting to data storage systems: HDFS, S3, Cassandra, etc. Introduction to Machine Learning Basics Overview of machine learning Supervised and unsupervised learning Common algorithms and use cases Introduction to Spark MLlib Overview of Spark MLlib MLlib's algorithms and utilities Data preparation and feature extraction Linear Regression and Classification Linear regression algorithm Logistic regression for classification Model evaluation and performance metrics Clustering Algorithms Overview of clustering algorithms K-means clustering Model evaluation and performance metrics Collaborative Filtering and Recommendation Systems Overview of recommendation systems Collaborative filtering techniques Implementing recommendations with Spark MLlib Introduction to Graph Processing Overview of graph processing Use cases and applications of graph processing Graph representations and operations Introduction to Spark GraphX Overview of GraphX Creating and transforming graphs Graph algorithms in GraphX Big Data Innovation! 
Using GPT and Generative AI Technologies with Spark and Scala Overview of generative AI technologies Integrating GPT with Spark and Scala Practical applications and use cases Bonus Topics / Time Permitting Introduction to Spark NLP Overview of Spark NLP Preprocessing text data Text classification and sentiment analysis Putting It All Together Work on a capstone project that integrates multiple aspects of the course, including data processing, machine learning, graph processing, and generative AI technologies.

Mastering Scala with Apache Spark for the Modern Data Enterprise (TTSK7520)
Delivered Online Flexible Dates
Price on Enquiry