Information on the risks and practical advice to address them
TSC's eBooks, whitepapers, and reports cover some of the most important risks in information and cyber security: risks that constantly challenge the professionals who work tirelessly to reduce them, both for their organisations and for home users.
Duration 5 Days 30 CPD hours This course is intended for: Red Hat Certified System Administrators (RHCSA) who want to learn how to provision and configure IdM technologies across both Linux and Windows applications; identity management specialists or engineers; access management specialists or engineers; web application developers; DevOps specialists. Overview As a result of attending this course, you will gain an understanding of the architecture of an identity management realm and trusted relationships using both Red Hat Enterprise Linux Identity Management and Microsoft Active Directory. You will be able to create, manage, and troubleshoot user management structures, security policies, local and remote secure access methods, and implementation technologies such as Kerberos, PKI, and certificates. You should be able to demonstrate these skills: Create and manage a scalable, resilient Identity Management realm, including both Linux and Microsoft Windows clients and servers. Create and manage secure access configurations, including managing and troubleshooting Kerberos, certificate servers, and access control policies. Integrate IdM as the back end for other major enterprise tools in the Red Hat portfolio, including Satellite Server and Tower. This course will empower you with the skills to configure and manage IdM, the comprehensive Identity Management solution bundled with Red Hat® Enterprise Linux. You will master the most requested Red Hat Identity Management (IdM) capabilities, including Active Directory trusts, multi-product federation, configuration management with Ansible, integrated certificate management, single sign-on, one-time passwords, and cybersecurity policy conformance. This course covers the same material as RH362, but includes the Red Hat Certified Specialist in Identity Management exam (EX362). Install Red Hat Identity Management Describe and install Red Hat Identity Management (IdM). 
Centralize Identity Management Explain the IdM server services, explore IdM clients access methods, and install an IdM client. Authenticate identities with Kerberos Define the Kerberos protocol and configure services for Kerberos authentication. Integrate IdM with Active Directory Create a trust relationship with Active Directory. Control user access Configure users for authorized access to services and resources. Manage a public key infrastructure Manage certificate authorities, certificates, and storing secrets. Maintain IdM operations Troubleshoot and recover Identity Management. Integrate Red Hat products with IdM Configure major services to share the IdM authentication database. Install scalable IdM Construct a resilient and scalable Identity Management topology.
Duration 2 Days 12 CPD hours This course is intended for IBM SPSS Statistics users who want to familiarize themselves with the statistical capabilities of IBM SPSS Statistics Base. Anyone who wants to refresh their knowledge and statistical experience. Overview Introduction to statistical analysis Describing individual variables Testing hypotheses Testing hypotheses on individual variables Testing on the relationship between categorical variables Testing on the difference between two group means Testing on differences between more than two group means Testing on the relationship between scale variables Predicting a scale variable: Regression Introduction to Bayesian statistics Overview of multivariate procedures This course provides an application-oriented introduction to the statistical component of IBM SPSS Statistics. Students will review several statistical techniques and discuss situations in which they would use each technique, how to set up the analysis, and how to interpret the results. This includes a broad range of techniques for exploring and summarizing data, as well as investigating and testing relationships. Students will gain an understanding of when and why to use these various techniques and how to apply them with confidence, interpret their output, and graphically display the results. 
Introduction to statistical analysis Identify the steps in the research process Identify measurement levels Describing individual variables Chart individual variables Summarize individual variables Identify the normal distribution Identify standardized scores Testing hypotheses Principles of statistical testing One-sided versus two-sided testing Type I, type II errors and power Testing hypotheses on individual variables Identify population parameters and sample statistics Examine the distribution of the sample mean Test a hypothesis on the population mean Construct confidence intervals Tests on a single variable Testing on the relationship between categorical variables Chart the relationship Describe the relationship Test the hypothesis of independence Assumptions Identify differences between the groups Measure the strength of the association Testing on the difference between two group means Chart the relationship Describe the relationship Test the hypothesis of two equal group means Assumptions Testing on differences between more than two group means Chart the relationship Describe the relationship Test the hypothesis of all group means being equal Assumptions Identify differences between the group means Testing on the relationship between scale variables Chart the relationship Describe the relationship Test the hypothesis of independence Assumptions Treatment of missing values Predicting a scale variable: Regression Explain linear regression Identify unstandardized and standardized coefficients Assess the fit Examine residuals Include 0-1 independent variables Include categorical independent variables Introduction to Bayesian statistics Bayesian statistics and classical test theory The Bayesian approach Evaluate a null hypothesis Overview of Bayesian procedures in IBM SPSS Statistics Overview of multivariate procedures Overview of supervised models Overview of models to create natural groupings
Duration 2 Days 12 CPD hours This course is intended for Customers and systems operators that want to learn fundamental AOS concepts and navigate Prism on AHV. Junior IT administrators and business leaders who manage Nutanix clusters in the datacenter and want a formal, hands-on, detailed introduction to Nutanix datacenter administration. The Nutanix Hybrid Cloud Fundamentals course introduces you to the products, capabilities, and technologies that serve as the foundation of Nutanix's Hybrid Cloud solution. Begin by exploring the history of this technology space, including different types of clouds, and how on-prem and public infrastructures came together to create hybrid operating models. Then, delve deeper into essential Nutanix products (AOS, AHV, and Prism) while discussing how these products were designed to solve business challenges. Conclude by discussing certain fundamental aspects involved in operating the Nutanix Hybrid Cloud, such as cluster updates, managing virtual machines, reporting and performance metrics, and more. Module 1: Introduction Describe course terminology, such as three-tier architecture, hyperconverged architecture, and public, private, and hybrid clouds. Module 2: Understanding AOS Concepts Describe self-healing architecture Describe replication factor Describe Nutanix multicloud solutions Module 3: Understanding Cluster Management Concepts Explain Prism Element features and benefits Explain Prism Central features and benefits Manage the Image Repository Upgrade the hypervisor and AOS on a cluster Describe Life Cycle Manager. 
Module 4: Understanding Storage Concepts Define a storage pool and storage container Identify components of AOS Distributed Storage Identify space-saving technologies Module 5: Managing VMs Create and manage virtual machines (VMs) Add a VM to a category Describe Acropolis Dynamic Scheduler (ADS) Describe data locality Module 6: Monitoring VMs and Cluster Health Use metrics to identify performance issues Measure VM performance using Nutanix tools: Health dashboard, Analysis dashboard, Alerts dashboard Use the Support Portal and Insights Module 7: Understanding Data Protection Concepts Describe how to enable data protection on a VM Define a retention policy Define Nutanix Mine Identify the different types of replication targets
Duration 4 Days 24 CPD hours This course is intended for anyone who wants to create applications or modules to automate and simplify common tasks with Perl. Overview Working in an engaging, hands-on learning environment, guided by our expert practitioner, students will learn to: Create a working script that gets input from the command line, the keyboard, or a file Use arrays to store and process data from files Create formatted reports Use regular expressions Use the appropriate types of variables and data structures Refactor duplicate code into subroutines and modules What is available in the standard library Use shortcuts and defaults, and what they replace Introduction to Perl Programming Essentials is an introductory-level, practical, hands-on Perl scripting training course that guides students from the basics of writing and running Perl scripts to using more advanced features such as file operations, report writing, the use of regular expressions, working with binary data files, and using the extensive functionality of the standard Perl library. Students will immediately be able to use Perl to complete tasks in the real world. Session: An Overview of Perl What is Perl? 
Perl is compiled and interpreted Perl Advantages and Disadvantages Downloading and Installing Perl Which version of Perl Getting Help Session: Creating and running Perl Programs Structure of a Perl program Running a Perl script Checking syntax and warnings Execution of scripts under Unix and Windows Session: Basic Data and I/O Numeric and Text literals Math operators and expressions Scalar variables Default values Writing to standard output Command line arguments Reading from the standard input Session: Logic and Loops About flow control The if statement and Boolean values Using unless and elsif Statement modifiers warn() and die() The conditional construct Using while loop and its variants Using the for loop Exiting from loops Session: Lists and Arrays The list data type Accessing array elements Creating arrays List interpolation Arrays and memory Counting elements Iterating through an array List evaluation Slices and ranges Session: Reading and writing text files File I/O Overview Opening a file Reading text files Writing to a text file Arrays and file I/O Using the <> operator Session: List functions Growing and shrinking arrays The split() function Splitting on whitespace Assigning to literal lists The join() function The sort() function Alternate sort keys Reversing an array Session: Formatting output Using sprintf() and printf() Report formatting overview Defining report formats The write() function Advanced filehandle magic Session: Hashes Hash overview Creating hashes Hash attributes Traversing a hash Testing for existence of elements Deleting hash elements Session: References What is a reference? 
The two ways to create references References to existing data References to anonymous data Dereferencing scalar, array, and hash references Dereferencing elements of arrays and hashes Multidimensional arrays and other data structures Session: Text and Regular Expressions String length The substr() function The index() and rindex() functions String replication Pattern matching and substitution Regular expressions Session: Raw file and data access Opening and closing raw (binary) files Reading raw data Using seek() and tell() Writing raw data Raw data manipulation with pack() and unpack() Session: Subroutines and variable scope Understanding packages Package and Lexical variables Localizing builtin variables Declaring and calling subroutines Calling subroutines Passing parameters and returning values Session: Working with the operating system Determining current OS Environment variables Running external programs User identification Trapping signals File test operators Working with files Time of day Session: Shortcuts and defaults Understanding $_ shift() with no array specified Text file processing Using grep() and Using map() Command-line options for file processing Session: Data wrangling Quoting in Perl Evaluating arrays Understanding qw( ) Getting more out of the <> operator Read ranges of lines Using m//g in scalar context The /o modifier Working with embedded newlines Making REs more readable Perl data conversion Session: Using the Perl Library The Perl library Old-style library files Perl modules Modules bundled with Perl A selection of modules Getting modules from ActiveState Getting modules from CPAN Using Getopt::Long Session: Some Useful Tools Sending and receiving files with Net::FTP Using File::Find to search for files and directories Grabbing a Web page Some good places to find scripts Perl man pages for more information Zipping and unzipping files
Duration 2 Days 12 CPD hours This introductory-level course is intended for software developers, project managers, and IT professionals seeking to enhance their understanding and practical skills in version control and collaboration using GitLab. It's also well-suited for those transitioning from another version control system to GitLab, or those responsible for the software development lifecycle within their organization. Whether you are an individual looking to boost your proficiency or a team leader aiming to drive productivity and collaboration, this course will provide the necessary expertise to make the most of GitLab's capabilities. Overview This course combines engaging instructor-led presentations and useful demonstrations with valuable hands-on labs and engaging group activities. Throughout the course you'll: Gain a firm understanding of the fundamentals of Git and GitLab, setting a solid foundation for advanced concepts. Learn to effectively manage and track changes in your code, ensuring a clean and reliable codebase. Discover ways to streamline your daily tasks with aliases, stashing, and other GitLab workflow optimization techniques. Develop skills in creating, merging, and synchronizing branches, enabling seamless collaboration and version control. Equip yourself with the knowledge to use Git as a powerful debugging tool, saving time and effort when troubleshooting issues. Understand the basics of continuous integration and continuous deployment (CI/CD) in GitLab, helping you automate the software delivery process. Immerse yourself in the dynamic world of GitLab, a leading web-based platform for version control and collaboration, through our intensive two-day course, GitLab Quick Start. Version control systems, such as GitLab, are the backbone of modern software development, enabling teams to work cohesively and maintain a structured workflow. 
By mastering GitLab, you can improve efficiency, encourage collaboration, and ensure accuracy and reliability within your projects, adding significant value to your organization. Throughout the course you'll explore various aspects of GitLab, starting from the fundamental principles of source code management to advanced concepts like rebasing and continuous integration/deployment. Key topics covered include Git and GitLab basics, reviewing and editing commit history, mastering GitFlow and GitLab Flow, branching and merging strategies, and understanding remote repositories. You'll also learn how to utilize Git as a debugging tool and explore the power of GitLab's built-in CI/CD capabilities. The core value of this course lies in its practical application. You'll learn how to effectively manage changes in code with GitLab, allowing you to maintain audit trails, create reproducible software, and seamlessly move from another version control system. Then you'll learn how to enhance your workflow efficiency using aliases for common commands, saving changes for later use, and ignoring build artifacts. You'll also explore GitLab's CI/CD, which will enable you to automate your software delivery process. The hands-on labs will walk you through creating, merging, and synchronizing remote branches, configuring Git, troubleshooting using Git as a debugging tool, and setting up GitLab Runner for CI/CD. Each lab is designed to simulate real-world projects, offering you first-hand experience in managing and contributing to a version control system like GitLab. 
Introduction to Source Code Management The Core Principles of Change Management The Power to Undo Changes Audit Trails and Investigations Reproducible Software Changing code-hosting platform Moving from another version control system Git and GitLab Introduction and Basics Introduction to Git GitFlow GitLab Flow Trees and Commits Configuring Git Adding, Renaming, and Removing Files Reviewing and Editing the Commit History Reviewing the Commit History Revision Shortcuts Fixing Mistakes Improving Your Daily Workflow Simplifying Common Commands with Aliases Ignoring Build Artifacts Saving Changes for Later Use (Stashing) Branching Branching Basics Listing Differences Between Branches Visualizing Branches Deleting Branches Tagging Merging Merging Basics Merge Conflicts Merging Remote Branches Remote Repositories Remote Repositories Synchronizing Objects with Remotes Tracking Branches Centralizing and Controlling Access Introduction to GitLab Git Repositories on GitLab Daily Workflow Reviewing Branching and Merging Branch Review Merging Basics Rebasing Rebasing Basics Rebasing with Local Branches Rebasing with Remote Branches Interactive Rebasing Squashing Commits Getting Out of Trouble Git as a Debugging Tool Using the Blame Command to See File History Performing a Binary Search Continuous Integration / Continuous Deployment (CI/CD) How to install GitLab Runner Adding to our example project Breaking down .gitlab-ci.yml Adding .gitlab-ci.yml to our example project Deconstructing an advanced .gitlab-ci.yml file GitLab CI/CD web UI Optional: Resetting Trees Introduction to Resetting Resetting Branch Pointers Resetting Branches and the Index Resetting the Working Directory Making Good Use of the Reset Command Optional: More on Improving Your Daily Workflow Interactively Staging Changes Optional: Including External Repositories Submodules Subtrees Choosing Between Submodules and Subtrees Workflow Management Branch Management
Duration 5 Days 30 CPD hours This course is intended for technical professionals who require the skills to administer IBM MQ. Overview After completing this course, you should be able to: Describe the IBM MQ deployment options Create and manage queue managers, queues, and channels Use the IBM MQ sample programs and utilities to test the IBM MQ network Configure distributed queuing Configure MQ client connections to a queue manager Define and administer a queue manager cluster Administer Java Message Service (JMS) in MQ Implement basic queue manager restart and recovery procedures Use IBM MQ troubleshooting tools to identify the cause of a problem in the IBM MQ network Manage IBM MQ security Monitor the activities and performance of an IBM MQ system This course is also available as a self-paced virtual (e-learning) course, IBM MQ V9.1 System Administration (ZM156G). This option does not require any travel. This course teaches you how to customize, operate, administer, and monitor IBM MQ on-premises on distributed operating systems. The course covers configuration, day-to-day administration, problem recovery, security management, and performance monitoring. In addition to the instructor-led lectures, the hands-on exercises provide practical experience with distributed queuing, working with MQ clients, and implementing clusters and publish/subscribe messaging. You also learn how to implement authorization, authentication, and encryption, and you learn how to monitor performance. 
Introducing IBM MQ Exercise Getting started with IBM MQ Working with IBM MQ administration tools Exercise Working with IBM MQ administration tools Configuring distributed queuing Exercise Implementing distributed queuing Managing clients and client connections Exercise Connecting an IBM MQ client Advanced IBM MQ client features Working with queue manager clusters Exercise Implementing a basic cluster Publish/subscribe messaging Exercise Configuring publish/subscribe message queuing Implementing basic security in IBM MQ Exercise Controlling access to IBM MQ Securing IBM MQ channels with TLS Exercise Securing channels with TLS Authenticating channels and connections Exercise Implementing connection authentication Supporting JMS with IBM MQ Diagnosing problems Running an IBM MQ trace Backing up and restoring IBM MQ messages and object definitions Using a media image to restore a queue Backing up and restoring IBM MQ object definitions High availability Monitoring and configuring IBM MQ for performance Monitoring IBM MQ for performance Monitoring resources with the IBM MQ Console Additional course details: Nexus Humans WM156G IBM MQ V9.1 System Administration (using Windows for labs) training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you're stepping into the realm of professional skills or a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success. 
While we feel this is the best course for the WM156G IBM MQ V9.1 System Administration (using Windows for labs) course and one of our Top 10 we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.
Duration 2 Days 12 CPD hours This course is intended for Business Analysts, Technical Managers, and Programmers Overview This intensive training course helps students learn the practical aspects of the R programming language. The course is supplemented by many hands-on labs which allow attendees to immediately apply their theoretical knowledge in practice. Over the past few years, R has been steadily gaining popularity with business analysts, statisticians and data scientists as a tool of choice for conducting statistical analysis of data as well as supervised and unsupervised machine learning.
What is R: What is R?; Positioning of R in the Data Science Space; The Legal Aspects; Microsoft R Open; R Integrated Development Environments; Running R; Running RStudio; Getting Help; General Notes on R Commands and Statements; Assignment Operators; R Core Data Structures; Assignment Example; R Objects and Workspace; Printing Objects; Arithmetic Operators; Logical Operators; System Date and Time; Operations; User-defined Functions; Control Statements; Conditional Execution; Repetitive Execution; Repetitive execution; Built-in Functions; Summary
Introduction to Functional Programming with R: What is Functional Programming (FP)?; Terminology: Higher-Order Functions; A Short List of Languages that Support FP; Functional Programming in R; Vector and Matrix Arithmetic; Vector Arithmetic Example; More Examples of FP in R; Summary
Managing Your Environment: Getting and Setting the Working Directory; Getting the List of Files in a Directory; The R Home Directory; Executing External R commands; Loading External Scripts in RStudio; Listing Objects in Workspace; Removing Objects in Workspace; Saving Your Workspace in R; Saving Your Workspace in RStudio; Saving Your Workspace in R GUI; Loading Your Workspace; Diverting Output to a File; Batch (Unattended) Processing; Controlling Global Options; Summary
R Type System and Structures: The R Data Types; System Date and Time; Formatting Date and Time; Using the mode() Function; R Data Structures; What is the Type of My Data Structure?; Creating Vectors; Logical Vectors; Character Vectors; Factorization; Multi-Mode Vectors; The Length of the Vector; Getting Vector Elements; Lists; A List with Element Names; Extracting List Elements; Adding to a List; Matrix Data Structure; Creating Matrices; Creating Matrices with cbind() and rbind(); Working with Data Frames; Matrices vs Data Frames; A Data Frame Sample; Creating a Data Frame; Accessing Data Cells; Getting Info About a Data Frame; Selecting Columns in Data Frames; Selecting Rows in Data Frames; Getting a Subset of a Data Frame; Sorting (ordering) Data in Data Frames by Attribute(s); Editing Data Frames; The str() Function; Type Conversion (Coercion); The summary() Function; Checking an Object's Type; Summary
Extending R: The Base R Packages; Loading Packages; What is the Difference between Package and Library?; Extending R; The CRAN Web Site; Extending R in R GUI; Extending R in RStudio; Installing and Removing Packages from Command-Line; Summary
Read-Write and Import-Export Operations in R: Reading Data from a File into a Vector; Example of Reading Data from a File into A Vector; Writing Data to a File; Example of Writing Data to a File; Reading Data into A Data Frame; Writing CSV Files; Importing Data into R; Exporting Data from R; Summary
Statistical Computing Features in R: Statistical Computing Features; Descriptive Statistics; Basic Statistical Functions; Examples of Using Basic Statistical Functions; Non-uniformity of a Probability Distribution; Writing Your Own skew and kurtosis Functions; Generating Normally Distributed Random Numbers; Generating Uniformly Distributed Random Numbers; Using the summary() Function; Math Functions Used in Data Analysis; Examples of Using Math Functions; Correlations; Correlation Example; Testing Correlation Coefficient for Significance; The cor.test() Function; The cor.test() Example; Regression Analysis; Types of Regression; Simple Linear Regression Model; Least-Squares Method (LSM); LSM Assumptions; Fitting Linear Regression Models in R; Example of Using lm(); Confidence Intervals for Model Parameters; Example of Using lm() with a Data Frame; Regression Models in Excel; Multiple Regression Analysis; Summary
Data Manipulation and Transformation in R: Applying Functions to Matrices and Data Frames; The apply() Function; Using apply(); Using apply() with a User-Defined Function; apply() Variants; Using tapply(); Adding a Column to a Data Frame; Dropping A Column in a Data Frame; The attach() and detach() Functions; Sampling; Using sample() for Generating Labels; Set Operations; Example of Using Set Operations; The dplyr Package; Object Masking (Shadowing) Considerations; Getting More Information on dplyr in RStudio; The search() or searchpaths() Functions; Handling Large Data Sets in R with the data.table Package; The fread() and fwrite() functions from the data.table Package; Using the Data Table Structure; Summary
Data Visualization in R: Data Visualization; Data Visualization in R; The ggplot2 Data Visualization Package; Creating Bar Plots in R; Creating Horizontal Bar Plots; Using barplot() with Matrices; Using barplot() with Matrices Example; Customizing Plots; Histograms in R; Building Histograms with hist(); Example of using hist(); Pie Charts in R; Examples of using pie(); Generic X-Y Plotting; Examples of the plot() function; Dot Plots in R; Saving Your Work; Supported Export Options; Plots in RStudio; Saving a Plot as an Image; Summary
Using R Efficiently: Object Memory Allocation Considerations; Garbage Collection; Finding Out About Loaded Packages; Using the conflicts() Function; Getting Information About the Object Source Package with the pryr Package; Using the where() Function from the pryr Package; Timing Your Code; Timing Your Code with system.time(); Timing Your Code with System.time(); Sleeping a Program; Handling Large Data Sets in R with the data.table Package; Passing System-Level Parameters to R; Summary
Lab Exercises: Lab 1 - Getting Started with R; Lab 2 - Learning the R Type System and Structures; Lab 3 - Read and Write Operations in R; Lab 4 - Data Import and Export in R; Lab 5 - k-Nearest Neighbors Algorithm; Lab 6 - Creating Your Own Statistical Functions; Lab 7 - Simple Linear Regression; Lab 8 - Monte-Carlo Simulation (Method); Lab 9 - Data Processing with R; Lab 10 - Using R Graphics Package; Lab 11 - Using R Efficiently
Duration 4 Days 24 CPD hours This course is intended for data analysts, business intelligence specialists, developers, system architects, and database administrators. Overview Skills gained in this training include: The features that Pig, Hive, and Impala offer for data acquisition, storage, and analysis; the fundamentals of Apache Hadoop and data ETL (extract, transform, load), ingestion, and processing with Hadoop; how Pig, Hive, and Impala improve productivity for typical analysis tasks; joining diverse datasets to gain valuable business insight; performing real-time, complex queries on datasets. Cloudera University's four-day data analyst training course focusing on Apache Pig and Hive and Cloudera Impala will teach you to apply traditional data analytics and business intelligence skills to big data. Hadoop Fundamentals The Motivation for Hadoop Hadoop Overview Data Storage: HDFS Distributed Data Processing: YARN, MapReduce, and Spark Data Processing and Analysis: Pig, Hive, and Impala Data Integration: Sqoop Other Hadoop Data Tools Exercise Scenarios Explanation Introduction to Pig What Is Pig? Pig's Features Pig Use Cases Interacting with Pig Basic Data Analysis with Pig Pig Latin Syntax Loading Data Simple Data Types Field Definitions Data Output Viewing the Schema Filtering and Sorting Data Commonly-Used Functions Processing Complex Data with Pig Storage Formats Complex/Nested Data Types Grouping Built-In Functions for Complex Data Iterating Grouped Data Multi-Dataset Operations with Pig Techniques for Combining Data Sets Joining Data Sets in Pig Set Operations Splitting Data Sets Pig Troubleshooting & Optimization Troubleshooting Pig Logging Using Hadoop's Web UI Data Sampling and Debugging Performance Overview Understanding the Execution Plan Tips for Improving the Performance of Your Pig Jobs Introduction to Hive & Impala What Is Hive? What Is Impala? 
Schema and Data Storage Comparing Hive to Traditional Databases Hive Use Cases Querying with Hive & Impala Databases and Tables Basic Hive and Impala Query Language Syntax Data Types Differences Between Hive and Impala Query Syntax Using Hue to Execute Queries Using the Impala Shell Data Management Data Storage Creating Databases and Tables Loading Data Altering Databases and Tables Simplifying Queries with Views Storing Query Results Data Storage & Performance Partitioning Tables Choosing a File Format Managing Metadata Controlling Access to Data Relational Data Analysis with Hive & Impala Joining Datasets Common Built-In Functions Aggregation and Windowing Working with Impala How Impala Executes Queries Extending Impala with User-Defined Functions Improving Impala Performance Analyzing Text and Complex Data with Hive Complex Values in Hive Using Regular Expressions in Hive Sentiment Analysis and N-Grams Conclusion Hive Optimization Understanding Query Performance Controlling Job Execution Plan Bucketing Indexing Data Extending Hive SerDes Data Transformation with Custom Scripts User-Defined Functions Parameterized Queries Choosing the Best Tool for the Job Comparing MapReduce, Pig, Hive, Impala, and Relational Databases Which to Choose?