Duration 2 Days 12 CPD hours This course is intended for professionals including: Anyone starting or leading a move towards increased reliability. Anyone interested in modern IT leadership and organizational change approaches. Business Managers, Business Stakeholders, Change Agents, Consultants, DevOps Practitioners, IT Directors, IT Managers, IT Team Leaders, Product Owners, Scrum Masters, Software Engineers, Site Reliability Engineers, System Integrators, and Tool Providers will benefit from this course. Overview The learning objectives for the SRE Foundation course include a practical understanding of: The history of SRE and its emergence at Google. The inter-relationship of SRE with DevOps and other popular frameworks. The underlying principles behind SRE. Service Level Objectives (SLOs) and their user focus. Service Level Indicators (SLIs) and the modern monitoring landscape. Error budgets and the associated error budget policies. Toil and its effect on an organization's productivity. Some practical steps that can help to eliminate toil. Observability as something to indicate the health of a service. SRE tools. Automation techniques and the importance of security. Anti-fragility, our approach to failure and failure testing. The organizational impact that introducing SRE brings. The SRE (Site Reliability Engineering) Foundation course is an introduction to the principles & practices that enable an organization to reliably and economically scale critical services. Introducing a site-reliability dimension requires organizational re-alignment, a new focus on engineering & automation, and the adoption of a range of new working paradigms. This course prepares you for the SRE Foundation (SREF) certification. Course Introduction Course Goals Course Agenda SRE Principles & Practices What is Site Reliability Engineering? SRE & DevOps: What is the Difference? 
SRE Principles & Practices Service Level Objectives & Error Budgets Service Level Objectives (SLOs) Error Budgets Error Budget Policies Reducing Toil What is Toil? Why is Toil Bad? Doing Something About Toil Monitoring & Service Level Indicators Service Level Indicators (SLIs) Monitoring Observability SRE Tools & Automation Automation Defined Automation Focus Hierarchy of Automation Types Secure Automation Automation Tools Anti-Fragility & Learning from Failure Why Learn from Failure Benefits of Anti-Fragility Shifting the Organizational Balance Organizational Impact of SRE Why Organizations Embrace SRE Patterns for SRE Adoption On-Call Necessities Blameless Post-Mortems SRE & Scale SRE, Other Frameworks, The Future SRE & Other Frameworks The Future Exam Preparations Exam Requirements, Question Weighting, and Terminology List Sample Exam Review Additional course details: The Nexus Humans Site Reliability Engineering (SRE) Foundation (DevOps Institute) training program is a workshop that presents a mix of sessions, lessons, and masterclasses. This immersive bootcamp-style experience includes interactive lectures, hands-on labs, and collaborative hackathons, all designed to reinforce fundamental concepts. Guided by seasoned coaches, each session offers insights and practical skills for honing your expertise. Whether you're new to the field or a seasoned professional, this course aims to equip you with the knowledge necessary for success. While we feel this is the best course for Site Reliability Engineering (SRE) Foundation (DevOps Institute) and one of our Top 10, we encourage you to read the course outline to make sure it is the right content for you. 
Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.
Duration 3 Days 18 CPD hours This course is intended for professionals including: Anyone focused on large-scale service scalability and reliability. Anyone interested in modern IT leadership and organizational change approaches. Business Managers, Business Stakeholders, Change Agents, Consultants, DevOps Practitioners, IT Directors, IT Managers, IT Team Leaders, Product Owners, Scrum Masters, Software Engineers, Site Reliability Engineers, System Integrators, Tool Providers. Overview After completing this course, students will have learned: A practical view of how to successfully implement a flourishing SRE culture in your organization. The underlying principles of SRE and an understanding of what it is not in terms of anti-patterns, and how to become aware of them in order to avoid them. The organizational impact of introducing SRE. Acing the art of SLIs and SLOs in a distributed ecosystem and extending the usage of Error Budgets beyond the normal to innovate and avoid risks. Building security and resilience by design in a distributed, zero-trust environment. How to implement full-stack observability and distributed tracing and bring about an Observability-driven development culture. Curating data using AI to move from reactive to proactive and predictive incident management, and how to use DataOps to build clean data lineage. Why Platform Engineering is so important in building consistency and predictability of SRE culture. Implementing practical Chaos Engineering. Major incident response responsibilities for an SRE based on the incident command framework, and examples of the anatomy of unmanaged incidents. A perspective on why SRE can be considered the purest implementation of DevOps. The SRE Execution model. Understanding the SRE role and why reliability is everyone's problem. 
SRE success story learnings. This course introduces a range of practices for advancing service reliability engineering through a mixture of automation, organizational ways of working and business alignment. It is tailored for those focused on large-scale service scalability and reliability. SRE Anti-patterns Rebranding Ops or DevOps or Dev as SRE Users notice an issue before you do Measuring until my Edge False positives are worse than no alerts Configuration management trap for snowflakes The Dogpile: Mob incident response Point fixing Production Readiness Gatekeeper Fail-Safe really? SLO is a Proxy for Customer Happiness Define SLIs that meaningfully measure the reliability of a service from a user's perspective Defining System boundaries in a distributed ecosystem for defining correct SLIs Use error budgets to help your team have better discussions and make better data-driven decisions Overall, Reliability is only as good as the weakest link on your service graph Error thresholds when 3rd party services are used Building Secure and Reliable Systems SRE and their role in Building Secure and Reliable systems Design for Changing Architecture Fault tolerant Design Design for Security Design for Resiliency Design for Scalability Design for Performance Design for Reliability Ensuring Data Security and Privacy Full-Stack Observability Modern Apps are Complex & Unpredictable Slow is the new down Pillars of Observability Implementing Synthetic and End user monitoring Observability driven development Distributed Tracing What happens to Monitoring? Instrumenting using Libraries and Agents Platform Engineering and AIOps Taking a Platform Centric View solves Organizational scalability challenges such as fragmentation, inconsistency and unpredictability. 
How do you use AIOps to improve Resiliency? How can DataOps help you in the journey? A simple recipe to implement AIOps Indicative measurement of AIOps SRE & Incident Response Management SRE Key Responsibilities towards incident response DevOps & SRE and ITIL OODA and SRE Incident Response Closed Loop Remediation and the Advantages Swarming - Food for Thought AI/ML for better incident management Chaos Engineering Navigating Complexity Chaos Engineering Defined Quick Facts about Chaos Engineering Chaos Monkey Origin Story Who is adopting Chaos Engineering Myths of Chaos Chaos Engineering Experiments GameDay Exercises Security Chaos Engineering Chaos Engineering Resources SRE is the Purest form of DevOps Key Principles of SRE SREs help increase Reliability across the product spectrum Metrics for Success Selection of Target areas SRE Execution Model Culture and Behavioral Skills are key SRE Case study Post-class assignments/exercises Non-abstract Large Scale Design (after Day 1) Engineering Instrumentation - Instrumenting Gremlin (after Day 2)
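The outline item "Reliability is only as good as the weakest link on your service graph" can be illustrated with a little arithmetic. The sketch below is not taken from the course material: it assumes a request needs every dependency in a chain to succeed and that failures are independent, in which case the overall availability is the product of the individual availabilities and can never exceed the weakest one. The service names and figures are hypothetical.

```python
from math import prod


def serial_availability(availabilities):
    """Availability of a request path that needs every dependency to succeed.

    Assumes failures are independent, which is a simplification.
    """
    values = list(availabilities)
    if not values:
        raise ValueError("at least one dependency is required")
    if any(not 0.0 <= a <= 1.0 for a in values):
        raise ValueError("each availability must be between 0 and 1")
    return prod(values)


if __name__ == "__main__":
    # Hypothetical dependency chain, including one third-party service.
    chain = {"frontend": 0.9995, "api": 0.999, "third_party_payments": 0.995}
    overall = serial_availability(chain.values())
    weakest = min(chain, key=chain.get)
    print(f"overall availability: {overall:.4f}, weakest link: {weakest}")
```

Under these assumptions the third-party dependency caps what the whole path can promise, which is one reason error thresholds for external services need separate attention.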
ITIL® 4 Specialist: High Velocity IT: In-House Training The ITIL® 4 Specialist: High-Velocity IT module is part of the Managing Professional stream for ITIL® 4. Candidates need to pass the related certification exam for working towards the Managing Professional (MP) designation. This course is based on the ITIL® 4 Specialist: High-Velocity IT exam specifications from AXELOS. With the help of ITIL® 4 concepts and terminology, exercises, and examples included in the course, candidates acquire the relevant knowledge required to pass the certification exam. This module addresses the specifics of digital transformation and helps organizations to evolve towards a convergence of business and technology, or to establish a new digital organization. It was designed to enable practitioners to explore the ways in which digital organizations and digital operating models function in high-velocity environments. Working practices such as Agile and Lean, and technical practices and technologies such as Cloud, Automation, and Automatic Testing are included. What You Will Learn At the end of this course, participants will be able to: Understand concepts regarding the high-velocity nature of the digital enterprise, including the demand it places on IT. Understand the digital product lifecycle in terms of the ITIL operating model. Understand the importance of the ITIL guiding principles and other fundamental concepts for delivering high-velocity IT. Know how to contribute to achieving value with digital products. 
Course Introduction Let's Get to Know Each Other Course Learning Objectives Target Audience Characteristics ITIL® 4 Certification Scheme Course Components Course Agenda Module-End Exercises Exam Details Introduction to High-Velocity IT High-Velocity IT Digital Technology Digital Organizations Digital Transformation High-Velocity IT Approaches Relevance of High-Velocity IT Approaches High-Velocity IT Approaches in Detail High-Velocity IT Operating Models Introduction ITIL® Perspective High-Velocity IT Aspects High-Velocity IT Applications ITIL® Building Blocks for High-Velocity IT Digital Product Lifecycle Service Value Streams Four Dimensions of Service Management ITIL® Management Practices High-Velocity IT Culture Key Behavior Patterns ITIL® Guiding Principles Supporting Models and Concepts for Purpose Ethics Design Thinking Supporting Models and Concepts for People Reconstructing for Service Agility Safety Culture Stress Prevention Supporting Models and Concepts for Progress Working in Complex Environments Lean Culture ITIL® Continual Improvement Model High-Velocity IT Objectives and Techniques High-Velocity IT Objectives High-Velocity IT Techniques Techniques for Valuable Investments Prioritization Techniques Minimum Viable Products and Services Product / Service Ownership A/B Testing Techniques for Fast Developments Basic Concepts Related to Fast Development Infrastructure as Code Reviews Continual Business Analysis Continuous Integration / Continuous Delivery (CI/CD) Continuous Testing Kanban Techniques for Resilient Operations Introduction to Resilient Operations Technical Debt Chaos Engineering Definition of Done Version Control Algorithmic IT Operations ChatOps Site Reliability Engineering (SRE) Techniques for Co-created Value Basic Concepts of Co-created Value Service Experience Techniques for Assured Conformance DevOps Audit Defense Toolkit DevSecOps Peer Review
ITIL® 4 Specialist: High Velocity IT: Virtual In-House Training The ITIL® 4 Specialist: High-Velocity IT module is part of the Managing Professional stream for ITIL® 4. Candidates need to pass the related certification exam for working towards the Managing Professional (MP) designation. This course is based on the ITIL® 4 Specialist: High-Velocity IT exam specifications from AXELOS. With the help of ITIL® 4 concepts and terminology, exercises, and examples included in the course, candidates acquire the relevant knowledge required to pass the certification exam. This module addresses the specifics of digital transformation and helps organizations to evolve towards a convergence of business and technology, or to establish a new digital organization. It was designed to enable practitioners to explore the ways in which digital organizations and digital operating models function in high-velocity environments. Working practices such as Agile and Lean, and technical practices and technologies such as Cloud, Automation, and Automatic Testing are included. What You Will Learn At the end of this course, participants will be able to: Understand concepts regarding the high-velocity nature of the digital enterprise, including the demand it places on IT. Understand the digital product lifecycle in terms of the ITIL operating model. Understand the importance of the ITIL guiding principles and other fundamental concepts for delivering high-velocity IT. Know how to contribute to achieving value with digital products. 
Course Introduction Let's Get to Know Each Other Course Learning Objectives Target Audience Characteristics ITIL® 4 Certification Scheme Course Components Course Agenda Module-End Exercises Exam Details Introduction to High-Velocity IT High-Velocity IT Digital Technology Digital Organizations Digital Transformation High-Velocity IT Approaches Relevance of High-Velocity IT Approaches High-Velocity IT Approaches in Detail High-Velocity IT Operating Models Introduction ITIL® Perspective High-Velocity IT Aspects High-Velocity IT Applications ITIL® Building Blocks for High-Velocity IT Digital Product Lifecycle Service Value Streams Four Dimensions of Service Management ITIL® Management Practices High-Velocity IT Culture Key Behavior Patterns ITIL® Guiding Principles Supporting Models and Concepts for Purpose Ethics Design Thinking Supporting Models and Concepts for People Reconstructing for Service Agility Safety Culture Stress Prevention Supporting Models and Concepts for Progress Working in Complex Environments Lean Culture ITIL® Continual Improvement Model High-Velocity IT Objectives and Techniques High-Velocity IT Objectives High-Velocity IT Techniques Techniques for Valuable Investments Prioritization Techniques Minimum Viable Products and Services Product / Service Ownership A/B Testing Techniques for Fast Developments Basic Concepts Related to Fast Development Infrastructure as Code Reviews Continual Business Analysis Continuous Integration / Continuous Delivery (CI/CD) Continuous Testing Kanban Techniques for Resilient Operations Introduction to Resilient Operations Technical Debt Chaos Engineering Definition of Done Version Control Algorithmic IT Operations ChatOps Site Reliability Engineering (SRE) Techniques for Co-created Value Basic Concepts of Co-created Value Service Experience Techniques for Assured Conformance DevOps Audit Defense Toolkit DevSecOps Peer Review
Duration 3 Days 18 CPD hours This course is intended for the following job roles: Cloud architects, administrators, and SysOps personnel. Cloud developers and DevOps personnel. Overview This course teaches participants the following skills: Plan and implement a well-architected logging and monitoring infrastructure. Define Service Level Indicators (SLIs) and Service Level Objectives (SLOs). Create effective monitoring dashboards and alerts. Monitor, troubleshoot, and improve Google Cloud infrastructure. Analyze and export Google Cloud audit logs. Find production code defects, identify bottlenecks, and improve performance. Optimize monitoring costs. This course teaches you techniques for monitoring, troubleshooting, and improving infrastructure and application performance in Google Cloud. Guided by the principles of Site Reliability Engineering (SRE), and using a combination of presentations, demos, hands-on labs, and real-world case studies, attendees gain experience with full-stack monitoring, real-time log management and analysis, debugging code in production, tracing application performance bottlenecks, and profiling CPU and memory usage. 
Introduction to Google Cloud Monitoring Tools Understand the purpose and capabilities of Google Cloud operations-focused components: Logging, Monitoring, Error Reporting, and Service Monitoring Understand the purpose and capabilities of Google Cloud application performance management focused components: Debugger, Trace, and Profiler Avoiding Customer Pain Construct a monitoring base on the four golden signals: latency, traffic, errors, and saturation Measure customer pain with SLIs Define critical performance measures Create and use SLOs and SLAs Achieve developer and operation harmony with error budgets Alerting Policies Develop alerting strategies Define alerting policies Add notification channels Identify types of alerts and common uses for each Construct and alert on resource groups Manage alerting policies programmatically Monitoring Critical Systems Choose best practice monitoring project architectures Differentiate Cloud IAM roles for monitoring Use the default dashboards appropriately Build custom dashboards to show resource consumption and application load Define uptime checks to track aliveness and latency Configuring Google Cloud Services for Observability Integrate logging and monitoring agents into Compute Engine VMs and images Enable and utilize Kubernetes Monitoring Extend and clarify Kubernetes monitoring with Prometheus Expose custom metrics through code, and with the help of OpenCensus Advanced Logging and Analysis Identify and choose among resource tagging approaches Define log sinks (inclusion filters) and exclusion filters Create metrics based on logs Define custom metrics Link application errors to Logging using Error Reporting Export logs to BigQuery Monitoring Network Security and Audit Logs Collect and analyze VPC Flow logs and Firewall Rules logs Enable and monitor Packet Mirroring Explain the capabilities of Network Intelligence Center Use Admin Activity audit logs to track changes to the configuration or metadata of resources Use Data Access audit logs to track accesses or changes to user-provided resource data Use System Event audit logs to track GCP administrative actions Managing Incidents Define incident management roles and communication channels Mitigate incident impact Troubleshoot root causes Resolve incidents Document incidents in a post-mortem process Investigating Application Performance Issues Debug production code to correct code defects Trace latency through layers of service interaction to eliminate performance bottlenecks Profile and identify resource-intensive functions in an application Optimizing the Costs of Monitoring Analyze resource utilization cost for monitoring-related components within Google Cloud Implement best practices for controlling the cost of monitoring within Google Cloud
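As a rough illustration of the SLO and error-budget topics in the outline above, the sketch below computes how much of an error budget remains for a request-based SLI. It is not part of the course material and does not use any Google Cloud API; the function name, the 99.9% target, and the request counts are hypothetical values chosen for illustration.

```python
def error_budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Return the fraction of the error budget still unspent (negative if overspent)."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    # The error budget is the share of requests allowed to fail under the SLO.
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures


if __name__ == "__main__":
    # Hypothetical 30-day window: 1,000,000 requests, 400 failures, 99.9% SLO.
    remaining = error_budget_remaining(0.999, 1_000_000, 400)
    print(f"Error budget remaining: {remaining:.1%}")
```

A negative result means the budget is overspent, which is the kind of signal an error budget policy might use to pause feature releases in favor of reliability work.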
Master the art of self-hosting WordPress on Linux with our comprehensive video course, designed to empower technical professionals to fully control their web presence.
With this 2-in-1 course, you will get access to AWS Technical Essentials and AWS Certified Solutions Architect - Associate certification exam content.
Prepare for the AWS Certified Solutions Architect - Associate (SAA-C03) exam. Learn about the AWS Management Console, S3 buckets, instances, database services, cloud security, costs associated with AWS, Amazon Elastic Compute Cloud (EC2), Amazon Virtual Private Cloud (VPC), Amazon Simple Storage Service (S3), and Amazon Elastic Block Store (EBS).
The course will provide a comprehensive overview of Consul and its capabilities, including deploying a single data center, registering services using service discovery, and accessing Consul Key/Value (KV). It is designed for individuals who possess basic terminal skills and have an understanding of application and data center/cloud networking architectures for running applications.
Duration 3 Days 18 CPD hours This course is intended for Cluster administrators (Junior systems administrators, junior cloud administrators) interested in deploying additional clusters to meet increasing demands from their organizations. Cluster engineers (Senior systems administrators, senior cloud administrators, cloud engineers) interested in the planning and design of OpenShift clusters to meet performance and reliability of different workloads and in creating work books for these installations. Site reliability engineers (SREs) interested in deploying test bed clusters to validate new settings, updates, customizations, operational procedures, and responses to incidents. Overview Validate infrastructure prerequisites for an OpenShift cluster. Run the OpenShift installer with custom settings. Describe and monitor each stage of the OpenShift installation process. Collect troubleshooting information during an ongoing installation, or after a failed installation. Complete the configuration of cluster services in a newly installed cluster. Installing OpenShift on a cloud, virtual, or physical infrastructure. Red Hat OpenShift Installation Lab (DO322) teaches essential skills for installing an OpenShift cluster in a range of environments, from proof of concept to production, and how to identify customizations that may be required because of the underlying cloud, virtual, or physical infrastructure. This course is based on Red Hat OpenShift Container Platform 4.6.
1 - Introduction to container technology: Describe how software can run in containers orchestrated by Red Hat OpenShift Container Platform.
2 - Create containerized services: Provision a server using container technology.
3 - Manage containers: Manipulate prebuilt container images to create and manage containerized services.
4 - Manage container images: Manage the life cycle of a container image from creation to deletion.
5 - Create custom container images: Design and code a Dockerfile to build a custom container image.
6 - Deploy containerized applications on OpenShift: Deploy single container applications on OpenShift Container Platform.
7 - Troubleshoot containerized applications: Troubleshoot a containerized application deployed on OpenShift.
8 - Deploy and manage applications on an OpenShift cluster: Use various application packaging methods to deploy applications to an OpenShift cluster, then manage their resources.
9 - Design containerized applications for OpenShift: Select a containerization method for an application and create a container to run on an OpenShift cluster.
10 - Publish enterprise container images: Create an enterprise registry and publish container images to it.
11 - Build applications: Describe the OpenShift build process, then trigger and manage builds.
12 - Customize source-to-image (S2I) builds: Customize an existing S2I base image and create a new one.
13 - Create applications from OpenShift templates: Describe the elements of a template and create a multicontainer application template.
14 - Manage application deployments: Monitor application health and implement various deployment methods for cloud-native applications.
15 - Perform comprehensive review: Create and deploy cloud-native applications on OpenShift.