Duration 3 Days 18 CPD hours This course is intended for This course is geared for experienced Python users who wish to learn and use basic machine learning algorithms and concepts. Students should have skills at least equivalent to the Python for Data Science courses we offer. Overview Working in a hands-on learning environment, guided by our expert team, attendees will learn to: Understand the main concepts and principles of predictive analytics. Use the Python data analytics ecosystem to implement end-to-end predictive analytics projects. Explore advanced predictive modeling algorithms with an emphasis on theory and intuitive explanations. Learn to deploy a predictive model's results as an interactive application. Learn about the stages involved in producing complete predictive analytics solutions. Understand how to define a problem, propose a solution, and prepare a dataset. Use visualizations to explore relationships and gain insights into the dataset. Learn to build regression and classification models using scikit-learn. Use Keras to build powerful neural network models that produce accurate predictions. Learn to serve a model's predictions as a web application. Predictive analytics is an applied field that employs a variety of quantitative methods using data to make predictions. It involves much more than just throwing data onto a computer to build a model. This course provides practical coverage to help you understand the most important concepts of predictive analytics. Using practical, step-by-step examples, we build predictive analytics solutions while using cutting-edge Python tools and packages. Hands-on Predictive Analytics with Python is a three-day, hands-on course that guides students through a step-by-step approach to defining problems and identifying relevant data. Students will learn how to perform data preparation, explore and visualize relationships, and build, tune, evaluate, and deploy models.
Each stage has relevant practical examples and efficient Python code. You will work with models such as KNN, Random Forests, and neural networks using the most important libraries in Python's data science stack: NumPy, pandas, Matplotlib, Seaborn, Keras, Dash, and so on. In addition to hands-on code examples, you will find intuitive explanations of the inner workings of the main techniques and algorithms used in predictive analytics. The Predictive Analytics Process Technical requirements What is predictive analytics? Reviewing important concepts of predictive analytics The predictive analytics process A quick tour of Python's data science stack Problem Understanding and Data Preparation Technical requirements Understanding the business problem and proposing a solution Practical project - diamond prices Practical project - credit card default Dataset Understanding - Exploratory Data Analysis Technical requirements What is EDA? Univariate EDA Bivariate EDA Introduction to graphical multivariate EDA Predicting Numerical Values with Machine Learning Technical requirements Introduction to ML Practical considerations before modeling MLR Lasso regression KNN Training versus testing error Predicting Categories with Machine Learning Technical requirements Classification tasks Credit card default dataset Logistic regression Classification trees Random forests Training versus testing error Multiclass classification Naive Bayes classifiers Introducing Neural Nets for Predictive Analytics Technical requirements Introducing neural network models Introducing TensorFlow and Keras Regressing with neural networks Classification with neural networks The dark art of training neural networks Model Evaluation Technical requirements Evaluation of regression models Evaluation for classification models The k-fold cross-validation Model Tuning and Improving Performance Technical requirements Hyperparameter tuning Improving performance Implementing a Model with Dash Technical requirements 
Model communication and/or deployment phase Introducing Dash Implementing a predictive model as a web application Additional course details: The Nexus Humans Hands-on Predictive Analytics with Python (TTPS4879) training program is a workshop that combines sessions, lessons, and masterclasses designed to move your learning forward. This immersive bootcamp-style experience includes interactive lectures, hands-on labs, and collaborative hackathons, all designed to reinforce fundamental concepts. Guided by seasoned coaches, each session offers insights and practical skills for honing your expertise. Whether you're new to professional skills or a seasoned professional, this comprehensive course aims to equip you with the knowledge you need to succeed. While we feel this is the best course for Hands-on Predictive Analytics with Python (TTPS4879) and one of our Top 10, we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.
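The Model Evaluation module above lists k-fold cross-validation. As a rough illustration of the idea only, here is a minimal standard-library sketch of how a dataset's row indices can be split into k train/test folds. The function names `k_fold_indices` and `cross_val_mean` are hypothetical and chosen for this example; the course itself works with scikit-learn, whose own utilities are not reproduced here.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) pairs for k contiguous folds.

    Hypothetical helper for illustration; it does not shuffle the data.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        # Every index outside the current test fold is used for training.
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_val_mean(scores: List[float]) -> float:
    """Average the per-fold scores into a single performance estimate."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(scores) / len(scores)
```

With 10 samples and k=3, the test folds have sizes 4, 3 and 3, and each sample appears in exactly one test fold. In practice a model would be fitted on each training split and scored on the matching test split before averaging.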
Duration 5 Days 30 CPD hours This course is intended for This intermediate and beyond level course is geared for experienced technical professionals in various roles, such as developers, data analysts, data engineers, software engineers, and machine learning engineers who want to leverage Scala and Spark to tackle complex data challenges and develop scalable, high-performance applications across diverse domains. Practical programming experience is required to participate in the hands-on labs. Overview Working in a hands-on learning environment led by our expert instructor, you'll: Develop a basic understanding of Scala and Apache Spark fundamentals, enabling you to confidently create scalable and high-performance applications. Learn how to process large datasets efficiently, helping you handle complex data challenges and make data-driven decisions. Gain hands-on experience with real-time data streaming, allowing you to manage and analyze data as it flows into your applications. Acquire practical knowledge of machine learning algorithms using Spark MLlib, empowering you to create intelligent applications and uncover hidden insights. Master graph processing with GraphX, enabling you to analyze and visualize complex relationships in your data. Discover generative AI technologies using GPT with Spark and Scala, opening up new possibilities for automating content generation and enhancing data analysis. Embark on a journey to master the world of big data with our immersive course on Scala and Spark! Mastering Scala with Apache Spark for the Modern Data Enterprise is a five-day, hands-on course designed to provide you with the essential skills and tools to tackle complex data projects using the Scala programming language and Apache Spark, a high-performance data processing engine. 
Mastering these technologies will enable you to perform a wide range of tasks, from data wrangling and analytics to machine learning and artificial intelligence, across various industries and applications. Guided by our expert instructor, you'll explore the fundamentals of Scala programming and Apache Spark while gaining valuable hands-on experience with Spark programming, RDDs, DataFrames, Spark SQL, and data sources. You'll also explore Spark Streaming, performance optimization techniques, and the integration of popular external libraries, tools, and cloud platforms like AWS, Azure, and GCP. Machine learning enthusiasts will delve into Spark MLlib, covering basics of machine learning algorithms, data preparation, feature extraction, and various techniques such as regression, classification, clustering, and recommendation systems. Introduction to Scala Brief history and motivation Differences between Scala and Java Basic Scala syntax and constructs Scala's functional programming features Introduction to Apache Spark Overview and history Spark components and architecture Spark ecosystem Comparing Spark with other big data frameworks Basics of Spark Programming SparkContext and SparkSession Resilient Distributed Datasets (RDDs) Transformations and Actions Working with DataFrames Spark SQL and Data Sources Spark SQL library and its advantages Structured and semi-structured data sources Reading and writing data in various formats (CSV, JSON, Parquet, Avro, etc.) 
Data manipulation using SQL queries Basic RDD Operations Creating and manipulating RDDs Common transformations and actions on RDDs Working with key-value data Basic DataFrame and Dataset Operations Creating and manipulating DataFrames and Datasets Column operations and functions Filtering, sorting, and aggregating data Introduction to Spark Streaming Overview of Spark Streaming Discretized Stream (DStream) operations Windowed operations and stateful processing Performance Optimization Basics Best practices for efficient Spark code Broadcast variables and accumulators Monitoring Spark applications Integrating External Libraries and Tools, Spark Streaming Using popular external libraries, such as Hadoop and HBase Integrating with cloud platforms: AWS, Azure, GCP Connecting to data storage systems: HDFS, S3, Cassandra, etc. Introduction to Machine Learning Basics Overview of machine learning Supervised and unsupervised learning Common algorithms and use cases Introduction to Spark MLlib Overview of Spark MLlib MLlib's algorithms and utilities Data preparation and feature extraction Linear Regression and Classification Linear regression algorithm Logistic regression for classification Model evaluation and performance metrics Clustering Algorithms Overview of clustering algorithms K-means clustering Model evaluation and performance metrics Collaborative Filtering and Recommendation Systems Overview of recommendation systems Collaborative filtering techniques Implementing recommendations with Spark MLlib Introduction to Graph Processing Overview of graph processing Use cases and applications of graph processing Graph representations and operations Introduction to Spark GraphX Overview of GraphX Creating and transforming graphs Graph algorithms in GraphX Big Data Innovation! 
Using GPT and Generative AI Technologies with Spark and Scala Overview of generative AI technologies Integrating GPT with Spark and Scala Practical applications and use cases Bonus Topics / Time Permitting Introduction to Spark NLP Overview of Spark NLP Preprocessing text data Text classification and sentiment analysis Putting It All Together Work on a capstone project that integrates multiple aspects of the course, including data processing, machine learning, graph processing, and generative AI technologies.
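The outline above covers transformations, actions, and working with key-value data on RDDs. As a loose, plain-Python analogy of that pattern (this is not Spark or Scala code and does not use any Spark API), the sketch below builds (word, 1) pairs lazily and then combines values per key. The names `reduce_by_key` and `word_count` are hypothetical helpers written for this illustration.

```python
from typing import Callable, Dict, Hashable, Iterable, Tuple, TypeVar

V = TypeVar("V")


def reduce_by_key(
    pairs: Iterable[Tuple[Hashable, V]],
    combine: Callable[[V, V], V],
) -> Dict[Hashable, V]:
    """Merge all values that share a key using the given combine function."""
    result: Dict[Hashable, V] = {}
    for key, value in pairs:
        # First occurrence stores the value; later ones are folded in.
        result[key] = combine(result[key], value) if key in result else value
    return result


def word_count(lines: Iterable[str]) -> Dict[Hashable, int]:
    """Count words across lines using a key-value aggregation."""
    # The generator is lazy: no pair is produced until reduce_by_key consumes it,
    # loosely mirroring how a transformation is deferred until an action runs.
    pairs = ((word.lower(), 1) for line in lines for word in line.split())
    return reduce_by_key(pairs, lambda a, b: a + b)
```

For example, `word_count(["big data", "Big ideas"])` groups the two spellings of "big" under one key. A distributed engine additionally partitions the pairs across machines, which this single-process sketch does not attempt.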
Excel 2016 Formulas and Functions Course Overview: This course is designed to provide learners with a comprehensive understanding of Excel 2016 formulas and functions. By covering essential tools and techniques, the course enables learners to confidently navigate the Excel environment and utilise formulas for data analysis, calculations, and reporting. With a focus on both fundamental and advanced functions, learners will gain the skills needed to streamline their tasks and improve efficiency in the workplace. The course’s practical value lies in its application across various industries, allowing professionals to enhance their data handling and reporting capabilities, making it an invaluable addition to any skill set. Course Description: The "Excel 2016 Formulas and Functions" course covers the core concepts of using Excel formulas and functions to manage and analyse data. It starts with the basics, such as SUM, AVERAGE, and IF functions, before advancing to more complex tools like VLOOKUP and conditional formatting. Through each module, learners will explore how to use these features to automate calculations, create dynamic reports, and manipulate datasets. The course offers a structured approach to mastering Excel, with clear guidance on how to use each function effectively in a business context. Learners will come away with the knowledge to work more efficiently in Excel, enabling them to make informed data-driven decisions and enhance productivity. Excel 2016 Formulas and Functions Curriculum: Module 01: Getting Started with Microsoft Excel 2016 Module 02: Basic Formulas and Functions Module 03: Formulas and Functions Activities (See full curriculum) Who is this course for? 
Individuals seeking to improve their Excel skills Professionals aiming to enhance their data analysis capabilities Beginners with an interest in data management and analysis Any other individuals looking to gain proficiency in Excel Career Path: Data Analyst Administrative Assistant Financial Analyst Business Analyst Project Coordinator
Course Overview This "Microsoft Power BI - Master Power BI in 90 Minutes!" course offers a concise yet comprehensive introduction to Power BI, empowering learners to quickly create and manage data dashboards. The course is designed for both newcomers and those looking to enhance their skills, covering essential topics to help users understand and utilise Power BI for data analysis and reporting. Upon completion, learners will gain the ability to design dynamic dashboards, manipulate data, and visualise trends, providing valuable insights for decision-making in various business environments. Course Description This course takes learners through Power BI’s core functionalities, starting with an introduction to the platform before advancing to building basic and sophisticated dashboards. Learners will explore key aspects such as data import, filtering, and the creation of interactive visuals. By working with live data, participants will develop an understanding of how to generate real-time reports, enhancing their ability to analyse data and derive insights. The course offers both beginner and advanced concepts, ensuring that all participants can apply what they’ve learned to real-world data scenarios. Upon completing the course, learners will feel confident in their ability to utilise Power BI to its full potential in any business context. Course Modules Module 01: Power BI - Introduction Module 02: Your First Power BI Dashboard Module 03: Your Advanced Power BI Dashboard with Real Live Data Module 04: Course Bonuses (See full curriculum) Who is this course for? Individuals seeking to enhance their data analysis skills. Professionals aiming to improve data visualisation and reporting capabilities. Beginners with an interest in business intelligence or data management. Anyone looking to gain insights into data-driven decision-making. Career Path Business Intelligence Analyst Data Analyst Data Visualisation Specialist Marketing Data Analyst Financial Analyst
Duration 2 Days 12 CPD hours This course is intended for Audience: Data Scientists, Software Developers, IT Architects, and Technical Managers. Participants should have general knowledge of statistics and programming and be familiar with Python. Overview NumPy, pandas, Matplotlib, scikit-learn. Python REPLs. Jupyter Notebooks. Data analytics life-cycle phases. Data repairing and normalizing. Data aggregation and grouping. Data visualization. Data science algorithms for supervised and unsupervised machine learning. Covers theoretical and technical aspects of using Python in Applied Data Science projects and Data Logistics use cases.
Python for Data Science: Using Modules. Listing Methods in a Module. Creating Your Own Modules. List Comprehension. Dictionary Comprehension. String Comprehension. Python 2 vs Python 3. Sets (Python 3+). Python Idioms. Python Data Science "Ecosystem". NumPy. NumPy Arrays. NumPy Idioms. pandas. Data Wrangling with pandas' DataFrame. SciPy. Scikit-learn. SciPy or scikit-learn? Matplotlib. Python vs R. Python on Apache Spark. Python Dev Tools and REPLs. Anaconda. IPython. Visual Studio Code. Jupyter. Jupyter Basic Commands. Summary.
Applied Data Science: What is Data Science? Data Science Ecosystem. Data Mining vs. Data Science. Business Analytics vs. Data Science. Data Science, Machine Learning, AI? Who is a Data Scientist? Data Science Skill Sets Venn Diagram. Data Scientists at Work. Examples of Data Science Projects. An Example of a Data Product. Applied Data Science at Google. Data Science Gotchas. Summary.
Data Analytics Life-cycle Phases: Big Data Analytics Pipeline. Data Discovery Phase. Data Harvesting Phase. Data Priming Phase. Data Logistics and Data Governance. Exploratory Data Analysis. Model Planning Phase. Model Building Phase. Communicating the Results. Production Roll-out. Summary.
Repairing and Normalizing Data: Repairing and Normalizing Data. Dealing with the Missing Data. Sample Data Set. Getting Info on Null Data. Dropping a Column. Interpolating Missing Data in pandas. Replacing the Missing Values with the Mean Value. Scaling (Normalizing) the Data. Data Preprocessing with scikit-learn. Scaling with the scale() Function. The MinMaxScaler Object. Summary.
Descriptive Statistics Computing Features in Python: Descriptive Statistics. Non-uniformity of a Probability Distribution. Using NumPy for Calculating Descriptive Statistics Measures. Finding Min and Max in NumPy. Using pandas for Calculating Descriptive Statistics Measures. Correlation. Regression and Correlation. Covariance. Getting Pairwise Correlation and Covariance Measures. Finding Min and Max in pandas DataFrame. Summary.
Data Aggregation and Grouping: Data Aggregation and Grouping. Sample Data Set. The pandas.core.groupby.SeriesGroupBy Object. Grouping by Two or More Columns. Emulating SQL's WHERE Clause. Pivot Tables. Cross-Tabulation. Summary.
Data Visualization with matplotlib: Data Visualization. What is matplotlib? Getting Started with matplotlib. The Plotting Window. The Figure Options. The matplotlib.pyplot.plot() Function. The matplotlib.pyplot.bar() Function. The matplotlib.pyplot.pie() Function. Subplots. Using the matplotlib.gridspec.GridSpec Object. The matplotlib.pyplot.subplot() Function. Hands-on Exercise. Figures. Saving Figures to File. Visualization with pandas. Working with matplotlib in Jupyter Notebooks. Summary.
Data Science and ML Algorithms in scikit-learn: Data Science, Machine Learning, AI? Types of Machine Learning. Terminology: Features and Observations. Continuous and Categorical Features (Variables). Terminology: Axis. The scikit-learn Package. scikit-learn Estimators. Models, Estimators, and Predictors. Common Distance Metrics. The Euclidean Metric. The LIBSVM format. Scaling of the Features. The Curse of Dimensionality. Supervised vs Unsupervised Machine Learning. Supervised Machine Learning Algorithms. Unsupervised Machine Learning Algorithms. Choose the Right Algorithm. Life-cycles of Machine Learning Development. Data Split for Training and Test Data Sets. Data Splitting in scikit-learn. Hands-on Exercise. Classification Examples. Classifying with k-Nearest Neighbors (SL). k-Nearest Neighbors Algorithm. The Error Rate. Hands-on Exercise. Dimensionality Reduction. The Advantages of Dimensionality Reduction. Principal component analysis (PCA). Hands-on Exercise. Data Blending. Decision Trees (SL). Decision Tree Terminology. Decision Tree Classification in Context of Information Theory. Information Entropy Defined. The Shannon Entropy Formula. The Simplified Decision Tree Algorithm. Using Decision Trees. Random Forests. SVM. Naive Bayes Classifier (SL). Naive Bayesian Probabilistic Model in a Nutshell. Bayes Formula. Classification of Documents with Naive Bayes. Unsupervised Learning Type: Clustering. Clustering Examples. k-Means Clustering (UL). k-Means Clustering in a Nutshell. k-Means Characteristics. Regression Analysis. Simple Linear Regression Model. Linear vs Non-Linear Regression. Linear Regression Illustration. Major Underlying Assumptions for Regression Analysis. Least-Squares Method (LSM). Locally Weighted Linear Regression. Regression Models in Excel. Multiple Regression Analysis. Logistic Regression. Regression vs Classification. Time-Series Analysis. Decomposing Time-Series. Summary.
Lab Exercises: Lab 1 - Learning the Lab Environment. Lab 2 - Using Jupyter Notebook. Lab 3 - Repairing and Normalizing Data. Lab 4 - Computing Descriptive Statistics. Lab 5 - Data Grouping and Aggregation. Lab 6 - Data Visualization with matplotlib. Lab 7 - Data Splitting. Lab 8 - k-Nearest Neighbors Algorithm. Lab 9 - The k-means Algorithm. Lab 10 - The Random Forest Algorithm.
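The outline above lists information entropy and the Shannon entropy formula in the context of decision tree classification. As a minimal sketch of that formula, H = -sum(p_i * log2(p_i)) over the class proportions p_i, the standard-library code below computes the entropy of a list of labels and the information gain of a binary split. The function names are hypothetical and chosen for this example; the course materials may present the calculation differently.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def shannon_entropy(labels: Sequence[Hashable]) -> float:
    """Entropy in bits of the class distribution observed in labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    # Only observed classes are counted, so every proportion is > 0.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(
    parent: Sequence[Hashable],
    left: Sequence[Hashable],
    right: Sequence[Hashable],
) -> float:
    """Entropy reduction achieved by splitting parent into left and right."""
    n = len(parent)
    if n == 0:
        return 0.0
    # Weight each child's entropy by its share of the parent's samples.
    weighted = (len(left) / n) * shannon_entropy(left) + (
        len(right) / n
    ) * shannon_entropy(right)
    return shannon_entropy(parent) - weighted
```

A balanced two-class list such as `["a", "a", "b", "b"]` has an entropy of 1 bit, and a split that separates the two classes perfectly recovers that full bit as information gain. A simplified decision tree algorithm would pick, at each node, the split with the highest gain.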
Excel Vlookup, Xlookup, Match and Index Course Overview: This comprehensive course covers essential Excel functions such as VLOOKUP, XLOOKUP, MATCH, and INDEX, which are integral for efficient data management and analysis. Learners will gain a clear understanding of how to use these functions to simplify complex data tasks, enhance productivity, and improve decision-making. Throughout the course, students will master how to search, match, and retrieve data from large datasets, preparing them for real-world scenarios in finance, marketing, HR, and more. The course is designed to equip learners with the necessary skills to perform advanced Excel functions with confidence, contributing to their professional growth and data analysis expertise. Course Description: In this course, learners will explore the powerful functions of Excel, including VLOOKUP, XLOOKUP, MATCH, and INDEX, enabling them to perform efficient data searches, cross-referencing, and information retrieval. The course includes step-by-step lessons on how to apply these functions to real-world datasets, making it highly relevant for anyone working with large volumes of data. Learners will become proficient in building dynamic spreadsheets that streamline decision-making processes and improve data accuracy. Additionally, this course emphasises problem-solving techniques, empowering individuals to handle complex data-related tasks with ease. By the end of the course, learners will have a strong command of these Excel functions, boosting their data management and analytical capabilities. Excel Vlookup, Xlookup, Match and Index Curriculum: Module 01: Excel VLOOKUP Module 02: Excel XLOOKUP Module 03: Excel MATCH Module 04: Excel INDEX Module 05: Advanced VLOOKUP Techniques Module 06: Combining VLOOKUP, MATCH, and INDEX Module 07: Practical Applications of XLOOKUP (See full curriculum) Who is this course for? Individuals seeking to enhance their Excel skills for data analysis. 
Professionals aiming to improve their data management capabilities. Beginners with an interest in learning advanced Excel functions. Anyone looking to improve their problem-solving abilities in data-heavy tasks. Career Path: Data Analyst Financial Analyst Marketing Analyst HR Specialist Business Intelligence Specialist Excel Expert for Administrative or Management Roles
Duration 2 Days 12 CPD hours This course is intended for Business Analysts, Technical Managers, and Programmers. Overview This intensive training course helps students learn the practical aspects of the R programming language. The course is supplemented by many hands-on labs which allow attendees to immediately apply their theoretical knowledge in practice. Over the past few years, R has been steadily gaining popularity with business analysts, statisticians and data scientists as a tool of choice for conducting statistical analysis of data as well as supervised and unsupervised machine learning.
What is R: What is R? Positioning of R in the Data Science Space. The Legal Aspects. Microsoft R Open. R Integrated Development Environments. Running R. Running RStudio. Getting Help. General Notes on R Commands and Statements. Assignment Operators. R Core Data Structures. Assignment Example. R Objects and Workspace. Printing Objects. Arithmetic Operators. Logical Operators. System Date and Time. Operations. User-defined Functions. Control Statements. Conditional Execution. Repetitive Execution. Built-in Functions. Summary.
Introduction to Functional Programming with R: What is Functional Programming (FP)? Terminology: Higher-Order Functions. A Short List of Languages that Support FP. Functional Programming in R. Vector and Matrix Arithmetic. Vector Arithmetic Example. More Examples of FP in R. Summary.
Managing Your Environment: Getting and Setting the Working Directory. Getting the List of Files in a Directory. The R Home Directory. Executing External R commands. Loading External Scripts in RStudio. Listing Objects in Workspace. Removing Objects in Workspace. Saving Your Workspace in R. Saving Your Workspace in RStudio. Saving Your Workspace in R GUI. Loading Your Workspace. Diverting Output to a File. Batch (Unattended) Processing. Controlling Global Options. Summary.
R Type System and Structures: The R Data Types. System Date and Time. Formatting Date and Time. Using the mode() Function. R Data Structures. What is the Type of My Data Structure? Creating Vectors. Logical Vectors. Character Vectors. Factorization. Multi-Mode Vectors. The Length of the Vector. Getting Vector Elements. Lists. A List with Element Names. Extracting List Elements. Adding to a List. Matrix Data Structure. Creating Matrices. Creating Matrices with cbind() and rbind(). Working with Data Frames. Matrices vs Data Frames. A Data Frame Sample. Creating a Data Frame. Accessing Data Cells. Getting Info About a Data Frame. Selecting Columns in Data Frames. Selecting Rows in Data Frames. Getting a Subset of a Data Frame. Sorting (ordering) Data in Data Frames by Attribute(s). Editing Data Frames. The str() Function. Type Conversion (Coercion). The summary() Function. Checking an Object's Type. Summary.
Extending R: The Base R Packages. Loading Packages. What is the Difference between Package and Library? Extending R. The CRAN Web Site. Extending R in R GUI. Extending R in RStudio. Installing and Removing Packages from Command-Line. Summary.
Read-Write and Import-Export Operations in R: Reading Data from a File into a Vector. Example of Reading Data from a File into a Vector. Writing Data to a File. Example of Writing Data to a File. Reading Data into a Data Frame. Writing CSV Files. Importing Data into R. Exporting Data from R. Summary.
Statistical Computing Features in R: Statistical Computing Features. Descriptive Statistics. Basic Statistical Functions. Examples of Using Basic Statistical Functions. Non-uniformity of a Probability Distribution. Writing Your Own skew and kurtosis Functions. Generating Normally Distributed Random Numbers. Generating Uniformly Distributed Random Numbers. Using the summary() Function. Math Functions Used in Data Analysis. Examples of Using Math Functions. Correlations. Correlation Example. Testing Correlation Coefficient for Significance. The cor.test() Function. The cor.test() Example. Regression Analysis. Types of Regression. Simple Linear Regression Model. Least-Squares Method (LSM). LSM Assumptions. Fitting Linear Regression Models in R. Example of Using lm(). Confidence Intervals for Model Parameters. Example of Using lm() with a Data Frame. Regression Models in Excel. Multiple Regression Analysis. Summary.
Data Manipulation and Transformation in R: Applying Functions to Matrices and Data Frames. The apply() Function. Using apply(). Using apply() with a User-Defined Function. apply() Variants. Using tapply(). Adding a Column to a Data Frame. Dropping a Column in a Data Frame. The attach() and detach() Functions. Sampling. Using sample() for Generating Labels. Set Operations. Example of Using Set Operations. The dplyr Package. Object Masking (Shadowing) Considerations. Getting More Information on dplyr in RStudio. The search() or searchpaths() Functions. Handling Large Data Sets in R with the data.table Package. The fread() and fwrite() functions from the data.table Package. Using the Data Table Structure. Summary.
Data Visualization in R: Data Visualization. Data Visualization in R. The ggplot2 Data Visualization Package. Creating Bar Plots in R. Creating Horizontal Bar Plots. Using barplot() with Matrices. Using barplot() with Matrices Example. Customizing Plots. Histograms in R. Building Histograms with hist(). Example of using hist(). Pie Charts in R. Examples of using pie(). Generic X-Y Plotting. Examples of the plot() function. Dot Plots in R. Saving Your Work. Supported Export Options. Plots in RStudio. Saving a Plot as an Image. Summary.
Using R Efficiently: Object Memory Allocation Considerations. Garbage Collection. Finding Out About Loaded Packages. Using the conflicts() Function. Getting Information About the Object Source Package with the pryr Package. Using the where() Function from the pryr Package. Timing Your Code. Timing Your Code with system.time(). Sleeping a Program. Handling Large Data Sets in R with the data.table Package. Passing System-Level Parameters to R. Summary.
Lab Exercises: Lab 1 - Getting Started with R. Lab 2 - Learning the R Type System and Structures. Lab 3 - Read and Write Operations in R. Lab 4 - Data Import and Export in R. Lab 5 - k-Nearest Neighbors Algorithm. Lab 6 - Creating Your Own Statistical Functions. Lab 7 - Simple Linear Regression. Lab 8 - Monte-Carlo Simulation (Method). Lab 9 - Data Processing with R. Lab 10 - Using R Graphics Package. Lab 11 - Using R Efficiently.
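The outline above includes the least-squares method and simple linear regression, which the course fits in R with lm(). To keep the examples in this catalogue in one language, the sketch below shows the underlying closed-form calculation in plain Python instead: slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x). It is an illustration of the formulas only, not a reproduction of lm(), and the name `fit_line` is hypothetical.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Return (intercept, slope) of the least-squares line through the points."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sxx: spread of x around its mean; Sxy: co-variation of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope
```

For points that lie exactly on y = 1 + 2x, such as x = 1, 2, 3, 4 and y = 3, 5, 7, 9, the function returns an intercept of 1 and a slope of 2. Confidence intervals and significance tests, which the outline also mentions, are outside the scope of this sketch.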
Duration 4 Days 24 CPD hours This course is intended for This class is intended for experienced developers who are responsible for managing big data transformations including: Extracting, loading, transforming, cleaning, and validating data. Designing pipelines and architectures for data processing. Creating and maintaining machine learning and statistical models. Querying datasets, visualizing query results and creating reports. Overview Design and build data processing systems on Google Cloud Platform. Leverage unstructured data using Spark and ML APIs on Cloud Dataproc. Process batch and streaming data by implementing autoscaling data pipelines on Cloud Dataflow. Derive business insights from extremely large datasets using Google BigQuery. Train, evaluate and predict using machine learning models using TensorFlow and Cloud ML. Enable instant insights from streaming data. Get hands-on experience with designing and building data processing systems on Google Cloud. This course uses lectures, demos, and hands-on labs to show you how to design data processing systems, build end-to-end data pipelines, analyze data, and implement machine learning. This course covers structured, unstructured, and streaming data. Introduction to Data Engineering Explore the role of a data engineer. Analyze data engineering challenges. Intro to BigQuery. Data Lakes and Data Warehouses. Demo: Federated Queries with BigQuery. Transactional Databases vs Data Warehouses. Website Demo: Finding PII in your dataset with DLP API. Partner effectively with other data teams. Manage data access and governance. Build production-ready pipelines. Review GCP customer case study. Lab: Analyzing Data with BigQuery. Building a Data Lake Introduction to Data Lakes. Data Storage and ETL options on GCP. Building a Data Lake using Cloud Storage. Optional Demo: Optimizing cost with Google Cloud Storage classes and Cloud Functions. Securing Cloud Storage. Storing All Sorts of Data Types. 
Video Demo: Running federated queries on Parquet and ORC files in BigQuery. Cloud SQL as a relational Data Lake. Lab: Loading Taxi Data into Cloud SQL. Building a Data Warehouse The modern data warehouse. Intro to BigQuery. Demo: Query TB+ of data in seconds. Getting Started. Loading Data. Video Demo: Querying Cloud SQL from BigQuery. Lab: Loading Data into BigQuery. Exploring Schemas. Demo: Exploring BigQuery Public Datasets with SQL using INFORMATION_SCHEMA. Schema Design. Nested and Repeated Fields. Demo: Nested and repeated fields in BigQuery. Lab: Working with JSON and Array data in BigQuery. Optimizing with Partitioning and Clustering. Demo: Partitioned and Clustered Tables in BigQuery. Preview: Transforming Batch and Streaming Data. Introduction to Building Batch Data Pipelines EL, ELT, ETL. Quality considerations. How to carry out operations in BigQuery. Demo: ELT to improve data quality in BigQuery. Shortcomings. ETL to solve data quality issues. Executing Spark on Cloud Dataproc The Hadoop ecosystem. Running Hadoop on Cloud Dataproc. GCS instead of HDFS. Optimizing Dataproc. Lab: Running Apache Spark jobs on Cloud Dataproc. Serverless Data Processing with Cloud Dataflow Cloud Dataflow. Why customers value Dataflow. Dataflow Pipelines. Lab: A Simple Dataflow Pipeline (Python/Java). Lab: MapReduce in Dataflow (Python/Java). Lab: Side Inputs (Python/Java). Dataflow Templates. Dataflow SQL. Manage Data Pipelines with Cloud Data Fusion and Cloud Composer Building Batch Data Pipelines visually with Cloud Data Fusion. Components. UI Overview. Building a Pipeline. Exploring Data using Wrangler. Lab: Building and executing a pipeline graph in Cloud Data Fusion. Orchestrating work between GCP services with Cloud Composer. Apache Airflow Environment. DAGs and Operators. Workflow Scheduling. Optional Long Demo: Event-triggered Loading of data with Cloud Composer, Cloud Functions, Cloud Storage, and BigQuery. Monitoring and Logging. 
Lab: An Introduction to Cloud Composer. Introduction to Processing Streaming Data Processing Streaming Data. Serverless Messaging with Cloud Pub/Sub Cloud Pub/Sub. Lab: Publish Streaming Data into Pub/Sub. Cloud Dataflow Streaming Features Cloud Dataflow Streaming Features. Lab: Streaming Data Pipelines. High-Throughput BigQuery and Bigtable Streaming Features BigQuery Streaming Features. Lab: Streaming Analytics and Dashboards. Cloud Bigtable. Lab: Streaming Data Pipelines into Bigtable. Advanced BigQuery Functionality and Performance Analytic Window Functions. Using With Clauses. GIS Functions. Demo: Mapping Fastest Growing Zip Codes with BigQuery GeoViz. Performance Considerations. Lab: Optimizing your BigQuery Queries for Performance. Optional Lab: Creating Date-Partitioned Tables in BigQuery. Introduction to Analytics and AI What is AI? From Ad-hoc Data Analysis to Data Driven Decisions. Options for ML models on GCP. Prebuilt ML model APIs for Unstructured Data Unstructured Data is Hard. ML APIs for Enriching Data. Lab: Using the Natural Language API to Classify Unstructured Text. Big Data Analytics with Cloud AI Platform Notebooks What's a Notebook. BigQuery Magic and Ties to Pandas. Lab: BigQuery in Jupyter Labs on AI Platform. Production ML Pipelines with Kubeflow Ways to do ML on GCP. Kubeflow. AI Hub. Lab: Running AI models on Kubeflow. Custom Model building with SQL in BigQuery ML BigQuery ML for Quick Model Building. Demo: Train a model with BigQuery ML to predict NYC taxi fares. Supported Models. Lab Option 1: Predict Bike Trip Duration with a Regression Model in BQML. Lab Option 2: Movie Recommendations in BigQuery ML. Custom Model building with Cloud AutoML Why Auto ML? Auto ML Vision. Auto ML NLP. Auto ML Tables.
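The outline above includes a lab titled "MapReduce in Dataflow (Python/Java)". For readers unfamiliar with the pattern, the following is a minimal sketch of the map, shuffle, and reduce steps in plain Python. It is not taken from the course materials and does not use the Dataflow or Apache Beam APIs; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in every input line."""
    pairs = []
    for line in lines:
        for word in line.lower().split():
            pairs.append((word, 1))
    return pairs


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group the emitted values by key, as a shuffle step would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the grouped values to get one count per word."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Chain the three phases into a single word-count job."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    print(word_count(["big data", "Big query"]))
```

In a distributed runner the map and reduce steps would run in parallel across workers; here they run sequentially in one process, which is enough to show how the data moves between phases.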
Course Overview This comprehensive course offers a deep dive into three essential technologies for data science: Python, JavaScript, and Microsoft SQL. Learners will gain foundational knowledge and practical skills in each of these key areas, which are crucial for handling data, creating interactive websites, and working with databases. By the end of the course, students will be proficient in writing Python code for data analysis, creating dynamic web content with JavaScript, and managing data with Microsoft SQL. The course is designed to equip learners with the technical skills needed to succeed in data science, making it a valuable investment for anyone looking to excel in this growing field. Course Description In this course, learners will explore the core principles of Python, JavaScript, and Microsoft SQL, all tailored to the needs of data science professionals. The curriculum covers Python’s data structures, functions, and libraries essential for data analysis, while JavaScript introduces students to web development skills, including client-side validation and data visualisation. The Microsoft SQL section focuses on data management, including filtering, joining, and structuring queries. Learners will develop a solid understanding of these technologies, which will enable them to manipulate data, automate processes, and design interactive applications. The course also includes real-world applications, ensuring learners are well-prepared for future opportunities in data science and web development. 
Course Modules: Module 01: JavaScript Getting Started Module 02: JavaScript Fundamentals Module 03: JavaScript Strings Module 04: JavaScript Operators Module 05: JavaScript Conditional Statements Module 06: JavaScript Control Flow Statements Module 07: JavaScript Functions Module 08: Data Visualization (Google Charts) Module 09: JavaScript Error Handling Module 10: JavaScript Client-Side Validations Module 11: Python Introduction Module 12: Python Basic Module 13: Python Strings Module 14: Python Operators Module 15: Python Data Structures Module 16: Python Conditional Statements Module 17: Python Control Flow Statements Module 18: Python Core Games Module 19: Python Functions Module 20: Python Args, KW Args for Data Science Module 21: Python Project Module 22: Publish Your Website for Live Module 23: MS SQL Statements Module 24: MS SQL Filtering Data Module 25: MS SQL Functions Module 26: MS SQL Joins Module 27: MS SQL Advanced Commands Module 28: MS SQL Structure and Keys Module 29: MS SQL Queries Module 30: MS SQL Structure Queries Module 31: MS SQL Constraints Module 32: MS SQL Backup and Restore Who is this course for? Individuals seeking to enhance their skills in data science. Professionals aiming to expand their knowledge in programming and database management. Beginners with an interest in Python, JavaScript, and SQL. Anyone looking to enter the field of data science or web development. Career Path Data Scientist Web Developer Database Administrator Data Analyst Front-End Developer Full Stack Developer Data Engineer
Duration 3 Days 18 CPD hours This course is intended for This intermediate and beyond-level course is geared for experienced Python developers looking to delve into the exciting field of Natural Language Processing. It is ideally suited for roles such as data analysts, data scientists, machine learning engineers, or anyone working with text data and seeking to extract valuable insights from it. If you're in a role where you're tasked with analyzing customer sentiment, building chatbots, or dealing with large volumes of text data, this course will provide you with practical, hands-on skills that you can apply right away. Overview This course combines engaging instructor-led presentations and useful demonstrations with valuable hands-on labs and engaging group activities. Throughout the course you'll: Master the fundamentals of Natural Language Processing (NLP) and understand how it can help in making sense of text data for valuable insights. Develop the ability to transform raw text into a structured format that machines can understand and analyze. Discover how to collect data from the web and navigate through semi-structured data, opening up a wealth of data sources for your projects. Learn how to implement sentiment analysis and topic modeling to extract meaning from text data and identify trends. Gain proficiency in applying machine learning and deep learning techniques to text data for tasks such as classification and prediction. Learn to analyze text sentiment, train emotion detectors, and interpret the results, providing a way to gauge public opinion or understand customer feedback. The Hands-on Natural Language Processing (NLP) Boot Camp is an immersive, three-day course that serves as your guide to building machines that can read and interpret human language. NLP is a unique interdisciplinary field, blending computational linguistics with artificial intelligence to help machines understand, interpret, and generate human language. 
In an increasingly data-driven world, NLP skills provide a competitive edge, enabling the development of sophisticated projects such as voice assistants, text analyzers, chatbots, and so much more. Our comprehensive curriculum covers a broad spectrum of NLP topics. Beginning with an introduction to NLP and feature extraction, the course moves to the hands-on development of text classifiers, exploration of web scraping and APIs, before delving into topic modeling, vector representations, text manipulation, and sentiment analysis. Half of your time is dedicated to hands-on labs, where you'll experience the practical application of your knowledge, from creating pipelines and text classifiers to web scraping and analyzing sentiment. These labs serve as a microcosm of real-world scenarios, equipping you with the skills to efficiently process and analyze text data. Time permitting, you'll also explore modern tools like Python libraries, the OpenAI GPT-3 API, and TensorFlow, using them in a series of engaging exercises. By the end of the course, you'll have a well-rounded understanding of NLP, and will leave equipped with the practical skills and insights that you can immediately put to use, helping your organization gain valuable insights from text data, streamline business processes, and improve user interactions with automated text-based systems. You'll be able to process and analyze text data effectively, implement advanced text representations, apply machine learning algorithms for text data, and build simple chatbots. 
Launch into the Universe of Natural Language Processing The journey begins: Unravel the layers of NLP Navigating through the history of NLP Merging paths: Text Analytics and NLP Decoding language: Word Sense Disambiguation and Sentence Boundary Detection First steps towards an NLP Project Unleashing the Power of Feature Extraction Dive into the vast ocean of Data Types Purification process: Cleaning Text Data Excavating knowledge: Extracting features from Texts Drawing connections: Finding Text Similarity through Feature Extraction Engineer Your Text Classifier The new era of Machine Learning and Supervised Learning Architecting a Text Classifier Constructing efficient workflows: Building Pipelines for NLP Projects Ensuring continuity: Saving and Loading Models Master the Art of Web Scraping and API Usage Stepping into the digital world: Introduction to Web Scraping and APIs The great heist: Collecting Data by Scraping Web Pages Navigating through the maze of Semi-Structured Data Unearth Hidden Themes with Topic Modeling Embark on the path of Topic Discovery Decoding algorithms: Understanding Topic-Modeling Algorithms Dialing the right numbers: Key Input Parameters for LSA Topic Modeling Tackling complexity with Hierarchical Dirichlet Process (HDP) Delving Deep into Vector Representations The Geometry of Language: Introduction to Vectors in NLP Text Manipulation: Generation and Summarization Playing the creator: Generating Text with Markov Chains Distilling knowledge: Understanding Text Summarization and Key Input Parameters for TextRank Peering into the future: Recent Developments in Text Generation and Summarization Solving real-world problems: Addressing Challenges in Extractive Summarization Riding the Wave of Sentiment Analysis Unveiling emotions: Introduction to Sentiment Analysis Tools Demystifying the Textblob library Preparing the canvas: Understanding Data for Sentiment Analysis Training your own emotion detectors: Building Sentiment Models Optional: 
Capstone Project Apply the skills learned throughout the course. Define the problem and gather the data. Conduct exploratory data analysis for text data. Carry out preprocessing and feature extraction. Select and train a model. Evaluate the model and interpret the results. Bonus Chapter: Generative AI and NLP Introduction to Generative AI and its role in NLP. Overview of Generative Pretrained Transformer (GPT) models. Using GPT models for text generation and completion. Applying GPT models for improving autocomplete features. Use cases of GPT in question answering systems and chatbots. Bonus Chapter: Advanced Applications of NLP with GPT Fine-tuning GPT models for specific NLP tasks. Using GPT for sentiment analysis and text classification. Role of GPT in Named Entity Recognition (NER). Application of GPT in developing advanced chatbots. Ethics and limitations of GPT and generative AI technologies.
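The NLP outline lists "Generating Text with Markov Chains" among its text-manipulation topics. As a rough illustration of what that technique involves, here is a minimal first-order, word-level sketch in plain Python. It is not course material; the helper names `build_chain` and `generate`, and the fixed `seed` parameter used for reproducibility, are assumptions made for this example.

```python
import random
from collections import defaultdict
from typing import Dict, List


def build_chain(text: str) -> Dict[str, List[str]]:
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return dict(chain)


def generate(chain: Dict[str, List[str]], start: str, length: int, seed: int = 0) -> str:
    """Walk the chain from `start`, picking each next word at random."""
    rng = random.Random(seed)  # seeded so repeated runs give the same text
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: this word never had a successor
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


if __name__ == "__main__":
    corpus = "the cat sat on the mat and the cat slept on the sofa"
    print(generate(build_chain(corpus), start="the", length=8))
```

Because repeated followers are stored as duplicates in each list, `rng.choice` samples the next word in proportion to how often it followed the current word in the training text, which is the defining property of a first-order Markov chain.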