This course will show you why Hadoop is one of the best tools to work with big data. With the help of some real-world data sets, you will learn how to use Hadoop and its distributed technologies, such as Spark, Flink, Pig, and Flume, to store, analyze, and scale big data.
This is a hands-on comprehensive course for beginners and in just two hours, you will learn the fundamentals of the Hadoop Ecosystem and its three main building blocks. This course will prepare you to start learning more about big data and to implement Hadoop components in Azure Cloud using HDInsight.
Duration 5 Days 30 CPD hours This course is intended for The primary audience for this course is database professionals who need to fulfil a Business Intelligence Developer role. They will need to focus on hands-on work creating BI solutions including Data Warehouse implementation, ETL, and data cleansing. Overview Create sophisticated SSIS packages for extracting, transforming, and loading data Use containers to efficiently control repetitive tasks and transactions Configure packages to dynamically adapt to environment changes Use Data Quality Services to cleanse data Successfully troubleshoot packages Create and Manage the SSIS Catalog Deploy, configure, and schedule packages Secure the SSIS Catalog SQL Server Integration Services is the Community Courseware version of 20767CC Implementing a SQL Data Warehouse. This five-day instructor-led course is intended for IT professionals who need to learn how to use SSIS to build, deploy, maintain, and secure Integration Services projects and packages, and to use SSIS to extract, transform, and load data to and from SQL Server. This course is similar to the retired Course 20767-C: Implementing a SQL Data Warehouse but focuses more on building packages, rather than the entire data warehouse design and implementation. Prerequisites Working knowledge of T-SQL and SQL Server Agent jobs is helpful, but not required. Basic knowledge of the Microsoft Windows operating system and its core functionality. Working knowledge of relational databases. Some experience with database design. 1 - SSIS Overview Import/Export Wizard Exporting Data with the Wizard Common Import Concerns Quality Checking Imported/Exported Data 2 - Working with Solutions and Projects Working with SQL Server Data Tools Understanding Solutions and Projects Working with the Visual Studio Interface 3 - Basic Control Flow Working with Tasks Understanding Precedence Constraints Annotating Packages Grouping Tasks Package and Task Properties Connection Managers Favorite Tasks 4 - Common Tasks Analysis Services Processing Data Profiling Task Execute Package Task Execute Process Task Expression Task File System Task FTP Task Hadoop Task Script Task Introduction Send Mail Task Web Service Task XML Task 5 - Data Flow Sources and Destinations The Data Flow Task The Data Flow SSIS Toolbox Working with Data Sources SSIS Data Sources Working with Data Destinations SSIS Data Destinations 6 - Data Flow Transformations Transformations Configuring Transformations 7 - Making Packages Dynamic Features for Making Packages Dynamic Package Parameters Project Parameters Variables SQL Parameters Expressions in Tasks Expressions in Connection Managers After Deployment How It All Fits Together 8 - Containers Sequence Containers For Loop Containers Foreach Loop Containers 9 - Troubleshooting and Package Reliability Understanding MaximumErrorCount Breakpoints Redirecting Error Rows Logging Event Handlers Using Checkpoints Transactions 10 - Deploying to the SSIS Catalog The SSIS Catalog Deploying Projects Working with Environments Executing Packages in SSMS Executing Packages from the Command Line Deployment Model Differences 11 - Installing and Administering SSIS Installing SSIS Upgrading SSIS Managing the SSIS Catalog Viewing Built-in SSIS Reports Managing SSIS Logging and Operation Histories Automating Package Execution 12 - Securing the SSIS Catalog Principals Securables Grantable Permissions Granting Permissions Configuring Proxy Accounts Additional course details: Nexus Humans 55321 SQL Server Integration Services training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you're stepping into the realm of professional skills or a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success. While we feel this is the best course for the 55321 SQL Server Integration Services course and one of our Top 10 we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.
Duration 3 Days 18 CPD hours This course is intended for This is an introductory-level course designed to teach experienced systems administrators how to install, maintain, monitor, troubleshoot, optimize, and secure Hadoop. Previous Hadoop experience is not required. Overview Working within in an engaging, hands-on learning environment, guided by our expert team, attendees will learn to: Understand the benefits of distributed computing Understand the Hadoop architecture (including HDFS and MapReduce) Define administrator participation in Big Data projects Plan, implement, and maintain Hadoop clusters Deploy and maintain additional Big Data tools (Pig, Hive, Flume, etc.) Plan, deploy and maintain HBase on a Hadoop cluster Monitor and maintain hundreds of servers Pinpoint performance bottlenecks and fix them Apache Hadoop is an open source framework for creating reliable and distributable compute clusters. Hadoop provides an excellent platform (with other related frameworks) to process large unstructured or semi-structured data sets from multiple sources to dissect, classify, learn from and make suggestions for business analytics, decision support, and other advanced forms of machine intelligence. This is an introductory-level, hands-on lab-intensive course geared for the administrator (new to Hadoop) who is charged with maintaining a Hadoop cluster and its related components. You will learn how to install, maintain, monitor, troubleshoot, optimize, and secure Hadoop. Introduction Hadoop history and concepts Ecosystem Distributions High level architecture Hadoop myths Hadoop challenges (hardware / software) Planning and installation Selecting software and Hadoop distributions Sizing the cluster and planning for growth Selecting hardware and network Rack topology Installation Multi-tenancy Directory structure and logs Benchmarking HDFS operations Concepts (horizontal scaling, replication, data locality, rack awareness) Nodes and daemons (NameNode, Secondary NameNode, HA Standby NameNode, DataNode) Health monitoring Command-line and browser-based administration Adding storage and replacing defective drives MapReduce operations Parallel computing before MapReduce: compare HPC versus Hadoop administration MapReduce cluster loads Nodes and Daemons (JobTracker, TaskTracker) MapReduce UI walk through MapReduce configuration Job config Job schedulers Administrator view of MapReduce best practices Optimizing MapReduce Fool proofing MR: what to tell your programmers YARN: architecture and use Advanced topics Hardware monitoring System software monitoring Hadoop cluster monitoring Adding and removing servers and upgrading Hadoop Backup, recovery, and business continuity planning Cluster configuration tweaks Hardware maintenance schedule Oozie scheduling for administrators Securing your cluster with Kerberos The future of Hadoop
Duration 4 Days 24 CPD hours This course is intended for This course is best suited to systems administrators and IT managers. Overview Skills gained in this training include:Determining the correct hardware and infrastructure for your clusterProper cluster configuration and deployment to integrate with the data centerConfiguring the FairScheduler to provide service-level agreements for multiple users of a clusterBest practices for preparing and maintaining Apache Hadoop in productionTroubleshooting, diagnosing, tuning, and solving Hadoop issues Cloudera University?s four-day administrator training course for Apache Hadoop provides participants with a comprehensive understanding of all the steps necessary to operate and maintain a Hadoop cluster. The Case for Apache Hadoop Why Hadoop? Core Hadoop Components Fundamental Concepts HDFS HDFS Features Writing and Reading Files NameNode Memory Considerations Overview of HDFS Security Using the Namenode Web UI Using the Hadoop File Shell Getting Data into HDFS Ingesting Data from External Sources with Flume Ingesting Data from Relational Databases with Sqoop REST Interfaces Best Practices for Importing Data YARN & MapReduce What Is MapReduce? Basic MapReduce Concepts YARN Cluster Architecture Resource Allocation Failure Recovery Using the YARN Web UI MapReduce Version 1 Planning Your Hadoop Cluster General Planning Considerations Choosing the Right Hardware Network Considerations Configuring Nodes Planning for Cluster Management Hadoop Installation and Initial Configuration Deployment Types Installing Hadoop Specifying the Hadoop Configuration Performing Initial HDFS Configuration Performing Initial YARN and MapReduce Configuration Hadoop Logging Installing and Configuring Hive, Impala, and Pig Hive Impala Pig Hadoop Clients What is a Hadoop Client? Installing and Configuring Hadoop Clients Installing and Configuring Hue Hue Authentication and Authorization Cloudera Manager The Motivation for Cloudera Manager Cloudera Manager Features Express and Enterprise Versions Cloudera Manager Topology Installing Cloudera Manager Installing Hadoop Using Cloudera Manager Performing Basic Administration Tasks Using Cloudera Manager Advanced Cluster Configuration Advanced Configuration Parameters Configuring Hadoop Ports Explicitly Including and Excluding Hosts Configuring HDFS for Rack Awareness Configuring HDFS High Availability Hadoop Security Why Hadoop Security Is Important Hadoop?s Security System Concepts What Kerberos Is and How it Works Securing a Hadoop Cluster with Kerberos Managing and Scheduling Jobs Managing Running Jobs Scheduling Hadoop Jobs Configuring the FairScheduler Impala Query Scheduling Cluster Maintainence Checking HDFS Status Copying Data Between Clusters Adding and Removing Cluster Nodes Rebalancing the Cluster Cluster Upgrading Cluster Monitoring & Troubleshooting General System Monitoring Monitoring Hadoop Clusters Common Troubleshooting Hadoop Clusters Common Misconfigurations Additional course details: Nexus Humans Cloudera Administrator Training for Apache Hadoop training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you're stepping into the realm of professional skills or a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success. While we feel this is the best course for the Cloudera Administrator Training for Apache Hadoop course and one of our Top 10 we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.
Course Overview The SQL, NoSQL, Big Data and Hadoop Level 4 course is designed to provide aspiring data engineers with the skills to fast track their career. It will introduce you to key database and data engineering concepts, taking you through the different classifications of databases and software. You will learn how to build a data-driven organisation step-by-step, best practices for data analysis, how to use Elasticsearch as a search engine, and much more. This course is open to everyone and can be studied on a part-time or full-time basis. It is ideal for anyone looking to work in this field or gain a better understanding of Hadoop from a database perspective. Fast track your career by enrolling today and learn tips and shortcuts from experienced industry professionals. This best selling SQL, NoSQL, Big Data and Hadoop Level 4 has been developed by industry professionals and has already been completed by hundreds of satisfied students. This in-depth SQL, NoSQL, Big Data and Hadoop Level 4 is suitable for anyone who wants to build their professional skill set and improve their expert knowledge. The SQL, NoSQL, Big Data and Hadoop Level 4 is CPD-accredited, so you can be confident you're completing a quality training course will boost your CV and enhance your career potential. The SQL, NoSQL, Big Data and Hadoop Level 4 is made up of several information-packed modules which break down each topic into bite-sized chunks to ensure you understand and retain everything you learn. After successfully completing the SQL, NoSQL, Big Data and Hadoop Level 4, you will be awarded a certificate of completion as proof of your new skills. If you are looking to pursue a new career and want to build your professional skills to excel in your chosen field, the certificate of completion from the SQL, NoSQL, Big Data and Hadoop Level 4 will help you stand out from the crowd. You can also validate your certification on our website. We know that you are busy and that time is precious, so we have designed the SQL, NoSQL, Big Data and Hadoop Level 4 to be completed at your own pace, whether that's part-time or full-time. Get full course access upon registration and access the course materials from anywhere in the world, at any time, from any internet-enabled device. Our experienced tutors are here to support you through the entire learning process and answer any queries you may have via email.
Overview This comprehensive course on SQL NoSQL Big Data and Hadoop will deepen your understanding on this topic. After successful completion of this course you can acquire the required skills in this sector. This SQL NoSQL Big Data and Hadoop comes with accredited certification from CPD, which will enhance your CV and make you worthy in the job market. So enrol in this course today to fast track your career ladder. How will I get my certificate? At the end of the course there will be an online written test, which you can take either during or after the course. After successfully completing the test you will be able to order your certificate, these are included in the price. Who is This course for? There is no experience or previous qualifications required for enrolment on this SQL NoSQL Big Data and Hadoop. It is available to all students, of all academic backgrounds. Requirements Our SQL NoSQL Big Data and Hadoop is fully compatible with PC's, Mac's, Laptop, Tablet and Smartphone devices. This course has been designed to be fully compatible with tablets and smartphones so you can access your course on Wi-Fi, 3G or 4G. There is no time limit for completing this course, it can be studied in your own time at your own pace. Career Path Learning this new skill will help you to advance in your career. It will diversify your job options and help you develop new techniques to keep up with the fast-changing world. This skillset will help you to- Open doors of opportunities Increase your adaptability Keep you relevant Boost confidence And much more! Course Curriculum 14 sections • 130 lectures • 22:34:00 total length •Introduction: 00:07:00 •Building a Data-driven Organization - Introduction: 00:04:00 •Data Engineering: 00:06:00 •Learning Environment & Course Material: 00:04:00 •Movielens Dataset: 00:03:00 •Introduction to Relational Databases: 00:09:00 •SQL: 00:05:00 •Movielens Relational Model: 00:15:00 •Movielens Relational Model: Normalization vs Denormalization: 00:16:00 •MySQL: 00:05:00 •Movielens in MySQL: Database import: 00:06:00 •OLTP in RDBMS: CRUD Applications: 00:17:00 •Indexes: 00:16:00 •Data Warehousing: 00:15:00 •Analytical Processing: 00:17:00 •Transaction Logs: 00:06:00 •Relational Databases - Wrap Up: 00:03:00 •Distributed Databases: 00:07:00 •CAP Theorem: 00:10:00 •BASE: 00:07:00 •Other Classifications: 00:07:00 •Introduction to KV Stores: 00:02:00 •Redis: 00:04:00 •Install Redis: 00:07:00 •Time Complexity of Algorithm: 00:05:00 •Data Structures in Redis : Key & String: 00:20:00 •Data Structures in Redis II : Hash & List: 00:18:00 •Data structures in Redis III : Set & Sorted Set: 00:21:00 •Data structures in Redis IV : Geo & HyperLogLog: 00:11:00 •Data structures in Redis V : Pubsub & Transaction: 00:08:00 •Modelling Movielens in Redis: 00:11:00 •Redis Example in Application: 00:29:00 •KV Stores: Wrap Up: 00:02:00 •Introduction to Document-Oriented Databases: 00:05:00 •MongoDB: 00:04:00 •MongoDB Installation: 00:02:00 •Movielens in MongoDB: 00:13:00 •Movielens in MongoDB: Normalization vs Denormalization: 00:11:00 •Movielens in MongoDB: Implementation: 00:10:00 •CRUD Operations in MongoDB: 00:13:00 •Indexes: 00:16:00 •MongoDB Aggregation Query - MapReduce function: 00:09:00 •MongoDB Aggregation Query - Aggregation Framework: 00:16:00 •Demo: MySQL vs MongoDB. Modeling with Spark: 00:02:00 •Document Stores: Wrap Up: 00:03:00 •Introduction to Search Engine Stores: 00:05:00 •Elasticsearch: 00:09:00 •Basic Terms Concepts and Description: 00:13:00 •Movielens in Elastisearch: 00:12:00 •CRUD in Elasticsearch: 00:15:00 •Search Queries in Elasticsearch: 00:23:00 •Aggregation Queries in Elasticsearch: 00:23:00 •The Elastic Stack (ELK): 00:12:00 •Use case: UFO Sighting in ElasticSearch: 00:29:00 •Search Engines: Wrap Up: 00:04:00 •Introduction to Columnar databases: 00:06:00 •HBase: 00:07:00 •HBase Architecture: 00:09:00 •HBase Installation: 00:09:00 •Apache Zookeeper: 00:06:00 •Movielens Data in HBase: 00:17:00 •Performing CRUD in HBase: 00:24:00 •SQL on HBase - Apache Phoenix: 00:14:00 •SQL on HBase - Apache Phoenix - Movielens: 00:10:00 •Demo : GeoLife GPS Trajectories: 00:02:00 •Wide Column Store: Wrap Up: 00:05:00 •Introduction to Time Series: 00:09:00 •InfluxDB: 00:03:00 •InfluxDB Installation: 00:07:00 •InfluxDB Data Model: 00:07:00 •Data manipulation in InfluxDB: 00:17:00 •TICK Stack I: 00:12:00 •TICK Stack II: 00:23:00 •Time Series Databases: Wrap Up: 00:04:00 •Introduction to Graph Databases: 00:05:00 •Modelling in Graph: 00:14:00 •Modelling Movielens as a Graph: 00:10:00 •Neo4J: 00:04:00 •Neo4J installation: 00:08:00 •Cypher: 00:12:00 •Cypher II: 00:19:00 •Movielens in Neo4J: Data Import: 00:17:00 •Movielens in Neo4J: Spring Application: 00:12:00 •Data Analysis in Graph Databases: 00:05:00 •Examples of Graph Algorithms in Neo4J: 00:18:00 •Graph Databases: Wrap Up: 00:07:00 •Introduction to Big Data With Apache Hadoop: 00:06:00 •Big Data Storage in Hadoop (HDFS): 00:16:00 •Big Data Processing : YARN: 00:11:00 •Installation: 00:13:00 •Data Processing in Hadoop (MapReduce): 00:14:00 •Examples in MapReduce: 00:25:00 •Data Processing in Hadoop (Pig): 00:12:00 •Examples in Pig: 00:21:00 •Data Processing in Hadoop (Spark): 00:23:00 •Examples in Spark: 00:23:00 •Data Analytics with Apache Spark: 00:09:00 •Data Compression: 00:06:00 •Data serialization and storage formats: 00:20:00 •Hadoop: Wrap Up: 00:07:00 •Introduction Big Data SQL Engines: 00:03:00 •Apache Hive: 00:10:00 •Apache Hive : Demonstration: 00:20:00 •MPP SQL-on-Hadoop: Introduction: 00:03:00 •Impala: 00:06:00 •Impala : Demonstration: 00:18:00 •PrestoDB: 00:13:00 •PrestoDB : Demonstration: 00:14:00 •SQL-on-Hadoop: Wrap Up: 00:02:00 •Data Architectures: 00:05:00 •Introduction to Distributed Commit Logs: 00:07:00 •Apache Kafka: 00:03:00 •Confluent Platform Installation: 00:10:00 •Data Modeling in Kafka I: 00:13:00 •Data Modeling in Kafka II: 00:15:00 •Data Generation for Testing: 00:09:00 •Use case: Toll fee Collection: 00:04:00 •Stream processing: 00:11:00 •Stream Processing II with Stream + Connect APIs: 00:19:00 •Example: Kafka Streams: 00:15:00 •KSQL : Streaming Processing in SQL: 00:04:00 •KSQL: Example: 00:14:00 •Demonstration: NYC Taxi and Fares: 00:01:00 •Streaming: Wrap Up: 00:02:00 •Database Polyglot: 00:04:00 •Extending your knowledge: 00:08:00 •Data Visualization: 00:11:00 •Building a Data-driven Organization - Conclusion: 00:07:00 •Conclusion: 00:03:00 •Assignment -SQL NoSQL Big Data and Hadoop: 00:00:00
Duration 4 Days 24 CPD hours This course is intended for Hadoop Developers Overview Through instructor-led discussion and interactive, hands-on exercises, participants will navigate the Hadoop ecosystem, learning topics such as:How data is distributed, stored, and processed in a Hadoop clusterHow to use Sqoop and Flume to ingest dataHow to process distributed data with Apache SparkHow to model structured data as tables in Impala and HiveHow to choose the best data storage format for different data usage patternsBest practices for data storage This training course is the best preparation for the challenges faced by Hadoop developers. Participants will learn to identify which tool is the right one to use in a given situation, and will gain hands-on experience in developing using those tools. Course Outline Introduction Introduction to Hadoop and the Hadoop Ecosystem Hadoop Architecture and HDFS Importing Relational Data with Apache Sqoop Introduction to Impala and Hive Modeling and Managing Data with Impala and Hive Data Formats Data Partitioning Capturing Data with Apache Flume Spark Basics Working with RDDs in Spark Writing and Deploying Spark Applications Parallel Programming with Spark Spark Caching and Persistence Common Patterns in Spark Data Processing Spark SQL and DataFrames Conclusion Additional course details: Nexus Humans Developer Training for Spark and Hadoop training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you're stepping into the realm of professional skills or a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success. While we feel this is the best course for the Developer Training for Spark and Hadoop course and one of our Top 10 we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.
Register on the SQL NoSQL Big Data and Hadoop today and build the experience, skills and knowledge you need to enhance your professional development and work towards your dream job. Study this course through online learning and take the first steps towards a long-term career. The course consists of a number of easy to digest, in-depth modules, designed to provide you with a detailed, expert level of knowledge. Learn through a mixture of instructional video lessons and online study materials. Receive online tutor support as you study the course, to ensure you are supported every step of the way. Get a digital certificate as a proof of your course completion. The SQL NoSQL Big Data and Hadoop is incredibly great value and allows you to study at your own pace. Access the course modules from any internet-enabled device, including computers, tablet, and smartphones. The course is designed to increase your employability and equip you with everything you need to be a success. Enrol on the now and start learning instantly! What You Get With The SQL NoSQL Big Data and Hadoop Receive a e-certificate upon successful completion of the course Get taught by experienced, professional instructors Study at a time and pace that suits your learning style Get instant feedback on assessments 24/7 help and advice via email or live chat Get full tutor support on weekdays (Monday to Friday) Course Design The course is delivered through our online learning platform, accessible through any internet-connected device. There are no formal deadlines or teaching schedules, meaning you are free to study the course at your own pace. You are taught through a combination of Video lessons Online study materials Certification Upon successful completion of the course, you will be able to obtain your course completion e-certificate free of cost. Print copy by post is also available at an additional cost of £9.99 and PDF Certificate at £4.99. Who Is This Course For: The course is ideal for those who already work in this sector or are an aspiring professional. This course is designed to enhance your expertise and boost your CV. Learn key skills and gain a professional qualification to prove your newly-acquired knowledge. Requirements: The online training is open to all students and has no formal entry requirements. To study the SQL NoSQL Big Data and Hadoop, all your need is a passion for learning, a good understanding of English, numeracy, and IT skills. You must also be over the age of 16. Course Content Section 01: Introduction Introduction 00:07:00 Building a Data-driven Organization - Introduction 00:04:00 Data Engineering 00:06:00 Learning Environment & Course Material 00:04:00 Movielens Dataset 00:03:00 Section 02: Relational Database Systems Introduction to Relational Databases 00:09:00 SQL 00:05:00 Movielens Relational Model 00:15:00 Movielens Relational Model: Normalization vs Denormalization 00:16:00 MySQL 00:05:00 Movielens in MySQL: Database import 00:06:00 OLTP in RDBMS: CRUD Applications 00:17:00 Indexes 00:16:00 Data Warehousing 00:15:00 Analytical Processing 00:17:00 Transaction Logs 00:06:00 Relational Databases - Wrap Up 00:03:00 Section 03: Database Classification Distributed Databases 00:07:00 CAP Theorem 00:10:00 BASE 00:07:00 Other Classifications 00:07:00 Section 04: Key-Value Store Introduction to KV Stores 00:02:00 Redis 00:04:00 Install Redis 00:07:00 Time Complexity of Algorithm 00:05:00 Data Structures in Redis : Key & String 00:20:00 Data Structures in Redis II : Hash & List 00:18:00 Data structures in Redis III : Set & Sorted Set 00:21:00 Data structures in Redis IV : Geo & HyperLogLog 00:11:00 Data structures in Redis V : Pubsub & Transaction 00:08:00 Modelling Movielens in Redis 00:11:00 Redis Example in Application 00:29:00 KV Stores: Wrap Up 00:02:00 Section 05: Document-Oriented Databases Introduction to Document-Oriented Databases 00:05:00 MongoDB 00:04:00 MongoDB Installation 00:02:00 Movielens in MongoDB 00:13:00 Movielens in MongoDB: Normalization vs Denormalization 00:11:00 Movielens in MongoDB: Implementation 00:10:00 CRUD Operations in MongoDB 00:13:00 Indexes 00:16:00 MongoDB Aggregation Query - MapReduce function 00:09:00 MongoDB Aggregation Query - Aggregation Framework 00:16:00 Demo: MySQL vs MongoDB. Modeling with Spark 00:02:00 Document Stores: Wrap Up 00:03:00 Section 06: Search Engines Introduction to Search Engine Stores 00:05:00 Elasticsearch 00:09:00 Basic Terms Concepts and Description 00:13:00 Movielens in Elastisearch 00:12:00 CRUD in Elasticsearch 00:15:00 Search Queries in Elasticsearch 00:23:00 Aggregation Queries in Elasticsearch 00:23:00 The Elastic Stack (ELK) 00:12:00 Use case: UFO Sighting in ElasticSearch 00:29:00 Search Engines: Wrap Up 00:04:00 Section 07: Wide Column Store Introduction to Columnar databases 00:06:00 HBase 00:07:00 HBase Architecture 00:09:00 HBase Installation 00:09:00 Apache Zookeeper 00:06:00 Movielens Data in HBase 00:17:00 Performing CRUD in HBase 00:24:00 SQL on HBase - Apache Phoenix 00:14:00 SQL on HBase - Apache Phoenix - Movielens 00:10:00 Demo : GeoLife GPS Trajectories 00:02:00 Wide Column Store: Wrap Up 00:04:00 Section 08: Time Series Databases Introduction to Time Series 00:09:00 InfluxDB 00:03:00 InfluxDB Installation 00:07:00 InfluxDB Data Model 00:07:00 Data manipulation in InfluxDB 00:17:00 TICK Stack I 00:12:00 TICK Stack II 00:23:00 Time Series Databases: Wrap Up 00:04:00 Section 09: Graph Databases Introduction to Graph Databases 00:05:00 Modelling in Graph 00:14:00 Modelling Movielens as a Graph 00:10:00 Neo4J 00:04:00 Neo4J installation 00:08:00 Cypher 00:12:00 Cypher II 00:19:00 Movielens in Neo4J: Data Import 00:17:00 Movielens in Neo4J: Spring Application 00:12:00 Data Analysis in Graph Databases 00:05:00 Examples of Graph Algorithms in Neo4J 00:18:00 Graph Databases: Wrap Up 00:07:00 Section 10: Hadoop Platform Introduction to Big Data With Apache Hadoop 00:06:00 Big Data Storage in Hadoop (HDFS) 00:16:00 Big Data Processing : YARN 00:11:00 Installation 00:13:00 Data Processing in Hadoop (MapReduce) 00:14:00 Examples in MapReduce 00:25:00 Data Processing in Hadoop (Pig) 00:12:00 Examples in Pig 00:21:00 Data Processing in Hadoop (Spark) 00:23:00 Examples in Spark 00:23:00 Data Analytics with Apache Spark 00:09:00 Data Compression 00:06:00 Data serialization and storage formats 00:20:00 Hadoop: Wrap Up 00:07:00 Section 11: Big Data SQL Engines Introduction Big Data SQL Engines 00:03:00 Apache Hive 00:10:00 Apache Hive : Demonstration 00:20:00 MPP SQL-on-Hadoop: Introduction 00:03:00 Impala 00:06:00 Impala : Demonstration 00:18:00 PrestoDB 00:13:00 PrestoDB : Demonstration 00:14:00 SQL-on-Hadoop: Wrap Up 00:02:00 Section 12: Distributed Commit Log Data Architectures 00:05:00 Introduction to Distributed Commit Logs 00:07:00 Apache Kafka 00:03:00 Confluent Platform Installation 00:10:00 Data Modeling in Kafka I 00:13:00 Data Modeling in Kafka II 00:15:00 Data Generation for Testing 00:09:00 Use case: Toll fee Collection 00:04:00 Stream processing 00:11:00 Stream Processing II with Stream + Connect APIs 00:19:00 Example: Kafka Streams 00:15:00 KSQL : Streaming Processing in SQL 00:04:00 KSQL: Example 00:14:00 Demonstration: NYC Taxi and Fares 00:01:00 Streaming: Wrap Up 00:02:00 Section 13: Summary Database Polyglot 00:04:00 Extending your knowledge 00:08:00 Data Visualization 00:11:00 Building a Data-driven Organization - Conclusion 00:07:00 Conclusion 00:03:00 Resources Resources - SQL NoSQL Big Data And Hadoop 00:00:00