Booking options
£67.99
On-Demand course
19 hours 26 minutes
All levels
This course offers an immersive experience in data analysis, guiding you from initial setup with Python and Pandas, through series and DataFrame manipulation, to advanced data visualization techniques. Perfect for enhancing your data handling and analysis skills.
This course begins with the essentials, introducing you to Anaconda and Jupyter Lab setup for Python and Pandas. You'll gain foundational knowledge in Python before diving into Pandas for data analysis. The focus then shifts to Series and DataFrame structures, providing you with the skills to manage and manipulate data effectively. Further, the course covers handling dates and times, and performing various file input and output operations, essential for real-world data analysis tasks. Advanced sections delve into data visualization using Matplotlib, enabling you to create impactful charts and graphs. You'll also explore advanced Pandas options and settings, enhancing your data manipulation capabilities.
By the end of this course, you'll have a comprehensive understanding of data analysis techniques. You'll be equipped to handle complex datasets, perform detailed analysis, and present data visually, opening doors to advanced data analysis and manipulation in professional settings.
Understand and utilize Python's basic and advanced data types.
Master Series and DataFrame operations in Pandas.
Handle complex data types like dates and times.
Perform input and output operations with different file types.
Create and customize various types of data visualizations.
Optimize data analysis with advanced Pandas settings and functions.
Ideal for data analysts, aspiring data scientists, and professionals keen on mastering data manipulation and analysis. This course is a perfect fit for those with basic Python knowledge looking to delve deep into data analytics using Pandas. Whether you're aiming to enhance your skillset for professional growth or apply data analysis techniques in your current role, this course offers a comprehensive learning path from Python basics to advanced data handling and visualization techniques in Pandas.
The course adopts a hands-on, step-by-step approach, starting with Python and Pandas setup and progressing through data manipulation, visualization, and advanced features. It emphasizes practical application, with examples and exercises to solidify understanding and skill development.
Detailed step-by-step guidance on the installation and setup of key data analysis tools.
Comprehensive coverage of Pandas functionalities, from basics to advanced techniques.
Focus on real-world applications, enhancing your ability to analyze complex datasets.
https://github.com/PacktPublishing/Data-Analysis-with-Pandas-and-Python
Boris Paskhaver is a New York City-based software engineer, author, and Udemy instructor with a unique journey into tech. Graduating from NYU in 2013 with a degree in Business Economics and Marketing, he initially worked in various roles, including business analyst and data analyst, at several companies. His coding journey began accidentally while building projects with Python and JavaScript, leading him to passionately pursue programming. Without formal computer science education, Boris completed App Academy's full-stack web development bootcamp, diving headfirst into web development. As an instructor, Boris focuses on creating comprehensive, easy-to-understand courses, addressing the challenges he faced learning to code. He's driven by the intersection of technology and education, aiming to make programming accessible to all. Boris brings this passion to his teaching, helping others unlock the potential of coding.
1. Installation and Setup
This section sets the stage with an introduction to the course's framework, then leads you through installing the Anaconda distribution, a Python and R data science platform. It focuses on setting up the Python environment using Anaconda Navigator and familiarizes you with the Jupyter Lab interface. Key skills taught include executing code cells, importing libraries, and understanding start-up and shutdown processes in Python environments, forming a solid base for efficient data analysis.
1. Introduction to the Course This video provides an overview of the course, setting expectations and outlining the learning journey ahead. |
2. macOS - Download and Install the Anaconda Distribution Learn how to download and install the Anaconda distribution, a popular Python and R data science platform, on macOS. |
3. Windows - Download and Install the Anaconda Distribution This tutorial guides you through the installation process of the Anaconda distribution on a Windows operating system. |
4. Use Anaconda Navigator to Create a New Environment Discover how to use Anaconda Navigator to create and manage your Python environments for data analysis. |
5. Unpack Course Materials + The Startup and Shutdown Process Unpack the course materials provided and understand the startup and shutdown processes in your new Python environment. |
6. Intro to the Jupyter Lab Interface Get introduced to the Jupyter Lab interface, a key tool for Python programming and data analysis. |
7. Code Cell Execution Learn about executing code cells within Jupyter Lab, a fundamental aspect of interactive Python programming. |
8. Import Libraries into Jupyter Lab Understand how to import necessary Python libraries into Jupyter Lab for data analysis. |
2. Python Crash Course
This section offers a comprehensive Python crash course. It covers the essentials of Python programming, including comments, basic data types, operators, variables, functions, string methods, and core data structures like lists and dictionaries. You'll also gain insight into Python classes and navigating libraries in Jupyter Lab, crucial for effective data management.
1. Comments This video explains how to use comments in Python, an essential practice for making your code readable and maintainable. |
2. Basic Data Types Dive into Python's basic data types, foundational knowledge for any aspiring Python programmer in this session. |
3. Operators In this video, we will explore Python's operators to perform various operations on data. |
4. Variables Learn all about defining and using variables in Python, a key concept in programming in this video. |
5. Built-in Functions Discover Python's built-in functions that are readily available for various tasks in this tutorial. |
6. Custom Functions Understand how to create and use custom functions in Python to encapsulate reusable code logic. |
7. String Methods Explore various string methods in Python that are vital for text data manipulation. |
8. Lists Learn about lists in Python, a versatile data structure for storing collections of items. |
9. Index Positions and Slicing Master the techniques of indexing and slicing in Python to access and modify data in lists. |
10. Dictionaries Dive into dictionaries, a Python data structure that stores data in key-value pairs. |
11. Classes Get an introduction to Python classes, a fundamental concept of object-oriented programming. |
12. Navigating Libraries using Jupyter Lab Learn how to navigate and utilize various Python libraries within the Jupyter Lab environment. |
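The crash-course topics above can be illustrated with a few lines of plain Python. This is a minimal sketch rather than course material: the `average` function and all sample values are hypothetical.

```python
# A custom function that wraps reusable logic.
def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# A list stores an ordered collection of items.
scores = [72, 88, 95, 61]

# Slicing: positions start at 0 and the stop position is excluded.
first_two = scores[:2]

# A negative index counts from the end of the list.
last = scores[-1]

# A dictionary stores key-value pairs.
capitals = {"France": "Paris", "Japan": "Tokyo"}
capital = capitals.get("Japan")

# String methods return a new string; the original is unchanged.
shout = "pandas".upper()

mean_score = average(scores)
print(first_two, last, capital, shout, mean_score)
```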
3. Series
Focusing on Pandas Series, this section delves into creating and manipulating Series objects from lists and dictionaries. It introduces Series methods and attributes, covering topics like data import using `pd.read_csv`, element inspection with head and tail methods, value sorting, and inclusion checks. The section also covers advanced techniques like overwriting values, Series copying, math operations, broadcasting, and applying functions, vital for proficient data analysis in Pandas.
1. Create a Series Object from a List Discover how to create a Pandas Series object from a Python list, a basic operation in data analysis. |
2. Create a Series Object from a Dictionary Learn to create a Pandas Series object from a dictionary, allowing for labeled data manipulation. |
3. Intro to Series Methods Get introduced to various methods available for Pandas Series objects, enhancing your data manipulation skills. |
4. Intro to Attributes Understand the attributes of Pandas Series, which provide information about the data structure. |
5. Parameters and Arguments Explore the parameters and arguments used in Pandas methods for more controlled data operations. |
6. Import Series with the pd.read_csv Function Learn how to import data into a Pandas Series using the pd.read_csv function, a common task in data analysis. |
7. The head and tail Methods Discover how to use the head and tail methods in Pandas to quickly inspect the beginning and end of a Series. |
8. Passing Series to Python Built-In Functions Understand how to pass Pandas Series objects to Python's built-in functions for various computations. |
9. Check for Inclusion with Python's in Keyword Learn to check for the inclusion of elements in a Pandas Series using Python's in keyword. |
10. The sort_values Method Master the sort_values method in Pandas to sort data in a Series. |
11. The sort_index Method Understand how to use the sort_index method to sort a Pandas Series by its index. |
12. Extract Series Values by Index Position Learn how to extract values from a Pandas Series based on their index positions. |
13. Extract Series Values by Index Label Discover how to retrieve data from a Pandas Series using index labels. |
14. The get Method Get familiar with the get method for safely retrieving values from a Pandas Series. |
15. Overwrite a Series Value Learn how to overwrite values in a Pandas Series, an essential skill for data cleaning. |
16. The copy Method Understand the importance of the copy method in Pandas for creating independent copies of data. |
17. Math Methods on Series Objects Explore various mathematical methods available on Pandas Series for data analysis. |
18. Broadcasting Get an introduction to broadcasting in Pandas, which allows for operations on all elements of a Series. |
19. The value_counts Method Learn about the value_counts method, a powerful tool for summarizing categorical data in a Series. |
20. The apply Method Discover how to use the apply method to apply a function to each element in a Series. |
21. The map Method Understand the map method in Pandas for transforming each element in a Series. |
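To give a feel for the Series operations listed above, here is a minimal sketch. The fruit prices are made-up sample data, not taken from the course materials.

```python
import pandas as pd

# Hypothetical data: a Series built from a dictionary (keys become index labels).
prices = pd.Series({"apple": 1.20, "banana": 0.50, "cherry": 3.00})

# Python's `in` keyword checks the index labels, not the values.
has_apple = "apple" in prices
has_half_as_label = 0.50 in prices
has_half_as_value = 0.50 in prices.values

# sort_values returns a new Series; the original order is untouched.
descending = prices.sort_values(ascending=False)

# get returns a fallback instead of raising an error for a missing label.
missing = prices.get("durian", default=None)

# Broadcasting applies one scalar operation to every element.
doubled = prices * 2

# map transforms each element with a function.
labels = prices.map(lambda p: "cheap" if p < 1 else "pricey")

print(descending)
print(labels)
```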
4. DataFrames I: Introduction
This section provides an introduction to Pandas DataFrames, beginning with a comparison between Series and DataFrames. It covers essential DataFrame manipulation techniques, including column selection, addition, and the value_counts method for data analysis. The section further delves into handling missing values through methods like row dropping and the fillna function. Additionally, it explores the astype method for data type conversion and concludes with in-depth sorting techniques, including value and index-based sorting and the rank method, laying a solid foundation in DataFrame operations.
1. Methods and Attributes between Series and DataFrames This video compares the methods and attributes of Series and DataFrames, highlighting their similarities and differences. |
2. Differences between Shared Methods Explore the nuances and differences in shared methods between Series and DataFrames in Pandas. |
3. Select One Column from a DataFrame Learn how to select a single column from a DataFrame, a fundamental skill in data frame manipulation. |
4. Select Multiple Columns from a DataFrame Understand the techniques to select multiple columns from a DataFrame, enhancing your data analysis capabilities. |
5. Add New Column to DataFrame Discover how to add new columns to a DataFrame, a key step in enriching your data set. |
6. A Review of the value_counts Method Review the value_counts method and its applications in analyzing DataFrame columns. |
7. Drop DataFrame Rows with Missing Values Learn strategies to drop rows with missing values in a DataFrame, crucial for data cleaning. |
8. Fill in Missing Values with the fillna Method Master the use of the fillna method to fill in missing values in a DataFrame. |
9. The astype Method I This video introduces the astype method in Pandas, focusing on its basic usage for converting data types within a DataFrame. |
10. The astype Method II A deeper exploration into the astype method, covering more complex scenarios and best practices for type conversion in Pandas. |
11. Sort a DataFrame with the sort_values Method I Learn the fundamentals of the sort_values method, including sorting DataFrames based on one or more columns. |
12. Sort a DataFrame with the sort_values Method II Expands on the sort_values method, exploring advanced options and techniques for sorting DataFrames in Pandas. |
13. Sort DataFrame with the sort_index Method Learn how to sort DataFrames by their index using the sort_index method. |
14. Rank Series Values with the rank Method Explore the rank method to rank values within a DataFrame column. |
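The missing-value, type-conversion, and sorting steps in this section might look like the following. It is a small sketch on a hypothetical table of names, teams, and scores.

```python
import pandas as pd

# Hypothetical data with missing values in two columns.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cal", "Dee"],
    "team": ["red", None, "blue", "red"],
    "score": [88.0, 92.0, None, 75.0],
})

# fillna replaces missing values with a chosen placeholder.
df["team"] = df["team"].fillna("unassigned")

# dropna removes rows that have a missing value in the listed columns.
scored = df.dropna(subset=["score"])

# astype converts a column's data type; integer conversion needs the nulls gone first.
scored = scored.astype({"score": "int64"})

# sort_values orders the rows by a column; rank assigns positions to the values.
ranked = scored.sort_values("score", ascending=False)
ranked["rank"] = ranked["score"].rank(ascending=False).astype("int64")

print(ranked)
```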
5. DataFrames II: Filtering Data
This section dives into advanced data filtering within Pandas DataFrames, starting with an introduction to the dataset and memory optimization. It covers row filtering based on various conditions using the logical operators AND (&) and OR (|), and the isin method for value-based filtering. It also addresses handling null values with isnull and notnull, range filtering using between, and identifying duplicates with duplicated. It concludes with removing duplicate rows using drop_duplicates and finding unique values, equipping learners with vital skills for efficient data cleaning and preparation in Pandas.
1. This Module's Dataset + Memory Optimization Get introduced to the dataset used in this module and learn techniques for memory optimization. |
2. Filter a DataFrame Based on a Condition Learn to filter rows in a DataFrame based on specific conditions. |
3. Filter with More than One Condition (AND - &) Understand how to apply multiple filter conditions to a DataFrame using logical operators. This session focuses on the AND (&) operator. |
4. Filter with More than One Condition (OR - |) Understand how to apply multiple filter conditions to a DataFrame using logical operators. This session focuses on the OR (|) operator. |
5. The isin Method Discover the isin method to filter DataFrame rows based on a list of values. |
6. The isnull and notnull Methods Learn to identify and handle null values in a DataFrame using isnull and notnull. |
7. The between Method Explore the between method to filter DataFrame rows within a certain range. |
8. The duplicated Method Understand how to identify duplicated rows in a DataFrame using the duplicated method. |
9. The drop_duplicates Method Learn the process of removing duplicate rows from a DataFrame with drop_duplicates. |
10. The unique and nunique Methods Explore how to find unique values and count them in DataFrame columns. |
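The filtering techniques above can be sketched as follows. The employee table is hypothetical sample data, not the module's dataset.

```python
import pandas as pd

# Hypothetical data: one missing salary and one fully duplicated row.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cal", "Dee", "Ana"],
    "dept": ["sales", "ops", "sales", "hr", "sales"],
    "salary": [50000, 62000, 71000, None, 50000],
})

# Each condition is wrapped in parentheses because & and | bind more
# tightly than comparison operators such as == and >.
sales_high = df[(df["dept"] == "sales") & (df["salary"] > 60000)]
ops_or_hr = df[(df["dept"] == "ops") | (df["dept"] == "hr")]

# isin is a shorter way to write several OR conditions on one column.
same_ops_or_hr = df[df["dept"].isin(["ops", "hr"])]

# between includes both endpoints by default; missing values never match.
mid_band = df[df["salary"].between(50000, 65000)]

# notnull keeps only the rows where a value is present.
has_salary = df[df["salary"].notnull()]

# duplicated flags repeats of an earlier row; drop_duplicates removes them.
n_duplicates = int(df.duplicated().sum())
deduped = df.drop_duplicates()

# nunique counts the distinct values in a column.
n_depts = df["dept"].nunique()

print(sales_high, ops_or_hr, deduped, sep="\n\n")
```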
6. DataFrames III: Data Extraction
This section focuses on data extraction techniques in DataFrames. It starts with dataset familiarization and delves into DataFrame structuring using set_index and reset_index. Key skills taught include retrieving data by index position or label using iloc and loc, overwriting data values, and renaming index labels or columns for clarity. The section also addresses deleting rows or columns, creating random samples, and extracting specific rows with nsmallest and nlargest. It culminates with conditional filtering and function application across DataFrame rows or columns, enhancing data manipulation proficiency.
1. This Module's Dataset Introduces the dataset used in this module for practicing data extraction. |
2. The set_index and reset_index Methods Learn how to set and reset index in a DataFrame, an important aspect of DataFrame structuring. |
3. Retrieve Rows by Index Position with iloc Accessor Master retrieving rows by index position using the iloc accessor in Pandas. |
4. Retrieve Rows by Index Label with loc Accessor Discover how to access DataFrame rows by index labels using the loc accessor. |
5. Second Arguments to loc and iloc Accessors Understand the use of second arguments in loc and iloc for more precise data retrieval. |
6. Overwrite Value in a DataFrame This tutorial covers how to overwrite individual values in a DataFrame, essential for updating data entries or correcting errors. |
7. Overwrite Multiple Values in a DataFrame Explores techniques for bulk updating or modifying multiple values simultaneously in a DataFrame, crucial for efficient data management. |
8. Rename Index Labels or Columns in a DataFrame Discover how to rename index labels or columns in a DataFrame for better data clarity. |
9. Delete Rows or Columns from a DataFrame Understand how to delete rows or columns from a DataFrame, an essential data cleaning skill. |
10. Create Random Sample with the sample Method Learn to create a random sample from a DataFrame using the sample method. |
11. The nsmallest and nlargest Methods Explore the nsmallest and nlargest methods to extract specific rows based on column values. |
12. Filtering with the where Method Master the where method for conditional filtering in DataFrames. |
13. The apply Method with DataFrames Learn the use of the apply method to execute functions across DataFrame rows or columns. |
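As an illustration of the extraction tools covered in this section, consider this minimal sketch. The film titles, years, and gross figures are invented for the example.

```python
import pandas as pd

# Hypothetical data; set_index turns the title column into the row labels.
df = pd.DataFrame({
    "title": ["Alpha", "Bravo", "Charlie", "Delta"],
    "year": [1999, 2005, 2012, 2020],
    "gross": [120.5, 310.0, 95.2, 410.8],
}).set_index("title")

# iloc retrieves by position; loc retrieves by label.
first_row = df.iloc[0]
# The second argument to loc selects the column.
bravo_gross = df.loc["Bravo", "gross"]

# Assigning through loc overwrites a single value.
df.loc["Charlie", "gross"] = 100.0

# rename relabels columns (or index labels) without touching the data.
df = df.rename(columns={"gross": "gross_musd"})

# nlargest returns the rows with the highest values in a column.
top_two = df.nlargest(2, "gross_musd")

# drop removes rows or columns and returns a new DataFrame.
no_year = df.drop(columns=["year"])

# where keeps the original shape and blanks out rows that fail the condition.
recent = df.where(df["year"] > 2000)

# apply with axis="columns" passes each row to the function.
df["decade"] = df.apply(lambda row: int(row["year"]) // 10 * 10, axis="columns")

print(df)
```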
7. Working with Text Data
This section is dedicated to handling text data in Pandas, beginning with dataset introduction and progressing to common string methods for text manipulation. It covers filtering DataFrame rows using string methods and applying these methods to DataFrame indices and columns. Learners will explore the split method, including its expand and n parameters, for detailed text analysis. The section provides practical exercises for applying these string methods, essential for those working with textual data in data analysis.
1. This Module's Dataset Introduces the dataset used in this module, focusing on handling text data in Pandas. |
2. Common String Methods Demonstrates the use of common string methods in Pandas for text data manipulation. |
3. Filtering with String Methods Teaches how to filter DataFrame rows using string methods for text data analysis. |
4. String Methods on Index and Columns Explores the application of string methods on DataFrame indices and columns for data organization. |
5. The split Method Covers the split method to divide text data into multiple parts for detailed analysis. |
6. More Practice with Splits Provides additional exercises and examples to master the split method in various scenarios. |
7. The expand and n Parameters of the split Method Explains the use of expand and n parameters in the split method for customized text splitting. |
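The string methods described in this section might be applied like this. It is a rough sketch using made-up names; the column label and values are hypothetical.

```python
import pandas as pd

# Hypothetical data with inconsistent casing and stray whitespace.
df = pd.DataFrame({"Full Name ": ["  DOE, Jane ", "roe, RICHARD", "Poe, Edgar Allan"]})

# String methods also work on the column labels themselves.
df.columns = df.columns.str.strip().str.lower().str.replace(" ", "_")

# Chain string methods to clean every value in a column.
df["full_name"] = df["full_name"].str.strip().str.title()

# A string method that returns booleans can be used to filter rows.
starts_with_p = df[df["full_name"].str.startswith("P")]

# n=1 limits the split to the first separator; expand=True returns
# the pieces as separate columns instead of a single column of lists.
df[["last", "first"]] = df["full_name"].str.split(", ", n=1, expand=True)

print(df)
```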
8. MultiIndex
In this section, learners are introduced to the advanced concept of MultiIndex in Pandas, essential for complex data structuring. The section walks through creating a MultiIndex for sophisticated data grouping and teaches extracting values from different levels. It covers sorting MultiIndex DataFrames, renaming index labels, and methods like transpose, stack, and unstack for DataFrame transformation. The section concludes with an in-depth look at the pivot and melt methods and creating pivot tables, equipping learners with advanced data structuring techniques.
1. Intro to the MultiIndex Module This video offers an introduction to MultiIndex in Pandas, setting the stage for complex data structuring. |
2. Create a MultiIndex Guides through creating a MultiIndex in Pandas, a technique for advanced data grouping. |
3. Extract Index Level Values Demonstrates how to extract values from different levels of a MultiIndex for detailed data analysis. |
4. Rename Index Labels Teaches how to rename index labels in a MultiIndex DataFrame for clarity and readability. |
5. The sort_index Method on a MultiIndex DataFrame Shows how to sort a MultiIndex DataFrame using the sort_index method for organized data presentation. |
6. Extract Rows from a MultiIndex DataFrame Covers techniques to extract specific rows from a MultiIndex DataFrame, enhancing data access. |
7. The transpose Method Introduces the transpose method to rearrange DataFrame rows and columns for different perspectives. |
8. The stack Method Explains the stack method to reshape DataFrame columns into a MultiIndex on rows. |
9. The unstack Method Demonstrates the unstack method for pivoting level values from the index to the columns. |
10. The pivot Method This video covers the pivot method to reshape and reorganize data in a DataFrame based on column values. |
11. The melt Method Teaches the melt method for transforming DataFrames into a format with one or more identifier variables. |
12. The pivot_table Method Introduces the pivot_table method for creating a spreadsheet-style pivot table as a DataFrame. |
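The reshaping ideas in this section can be sketched on a tiny table. The regional revenue figures are hypothetical, and the example covers only a subset of the methods listed.

```python
import pandas as pd

# Hypothetical data in "long" form: one row per region and quarter.
sales = pd.DataFrame({
    "region": ["east", "east", "west", "west"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "revenue": [100, 120, 90, 150],
})

# Passing a list to set_index creates a two-level MultiIndex.
indexed = sales.set_index(["region", "quarter"]).sort_index()

# get_level_values extracts the labels of one index level.
regions = indexed.index.get_level_values("region")

# A tuple selects one row by its labels at every level.
east_q2 = indexed.loc[("east", "Q2"), "revenue"]

# unstack moves an index level into the columns.
wide = indexed["revenue"].unstack("quarter")

# pivot builds the same wide layout straight from the columns.
pivoted = sales.pivot(index="region", columns="quarter", values="revenue")

# melt goes the other way, from wide back to long.
melted = pivoted.reset_index().melt(
    id_vars="region", var_name="quarter", value_name="revenue"
)

# pivot_table aggregates while reshaping.
totals = sales.pivot_table(values="revenue", index="region", aggfunc="sum")

print(wide)
print(totals)
```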
9. GroupBy
This section explores Pandas' GroupBy functionality, essential for data aggregation. It begins with an overview and practical use of the groupby method for grouping data, then moves to retrieving specific groups and applying various aggregation methods. The section also covers grouping by multiple columns, using the agg method for complex operations, and concludes with techniques for iterating through group data, providing a thorough understanding of data grouping and aggregation.
1. Intro to the GroupBy Module Provides an overview of the GroupBy functionality in Pandas, essential for aggregating data. |
2. The groupby Method Demonstrates the use of the groupby method for grouping data based on some criteria. |
3. Retrieve A Group with the get_group Method Shows how to retrieve a specific group from a grouped DataFrame using the get_group method. |
4. Methods on the GroupBy Object Explores various methods available on GroupBy objects for different types of data aggregation. |
5. Grouping by Multiple Columns Teaches how to group data by multiple columns for more complex data analyses. |
6. The agg Method Covers the agg method to apply one or more operations over the grouped data. |
7. Iterating through Groups Provides techniques for iterating through groups in a GroupBy object for individual data processing. |
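A short sketch of the grouping workflow described above, using invented company data:

```python
import pandas as pd

# Hypothetical data: companies tagged with a sector.
df = pd.DataFrame({
    "sector": ["tech", "tech", "retail", "retail", "energy"],
    "company": ["A", "B", "C", "D", "E"],
    "revenue": [200, 300, 150, 50, 400],
    "employees": [20, 40, 30, 10, 25],
})

# groupby splits the rows into one group per distinct sector.
by_sector = df.groupby("sector")

# get_group returns the rows of a single group as a DataFrame.
tech = by_sector.get_group("tech")

# Aggregation methods collapse each group to one value.
total_revenue = by_sector["revenue"].sum()

# agg applies a different operation to each column.
summary = by_sector.agg({"revenue": "sum", "employees": "mean"})

# Iterating yields (group name, group DataFrame) pairs.
largest = {
    sector: group.nlargest(1, "revenue")["company"].iloc[0]
    for sector, group in by_sector
}

print(summary)
print(largest)
```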
10. Merging DataFrames
This section is dedicated to DataFrame merging, a crucial aspect of dataset combination in Pandas. It introduces merging techniques, focusing on the pd.concat function for DataFrame concatenation. The section covers various join types, including left, inner, and full-outer joins, and parameter usage for precise column matching. It also explains merging DataFrames by indexes and concludes with the join method, offering a comprehensive guide to DataFrame merging strategies.
1. Intro to the Merging DataFrames Module Provides an introduction to techniques for merging DataFrames, a key aspect of combining datasets in Pandas. |
2. The pd.concat Function I The first of two sessions on pd.concat, demonstrating how to use the function to concatenate DataFrames along a particular axis. |
3. The pd.concat Function II The second of two sessions on pd.concat, continuing the demonstration of concatenating DataFrames along a particular axis. |
4. Left Joins Explores the concept of left joins, showing how to merge DataFrames with a focus on keys from the left frame. |
5. The left_on and right_on Parameters Teaches the use of left_on and right_on parameters in DataFrame joins for specific column matching. |
6. Inner Joins I The first of two sessions on inner joins, detailing how to combine DataFrames based on the intersection of keys. |
7. Inner Joins II The second of two sessions on inner joins, continuing the discussion of combining DataFrames based on the intersection of keys. |
8. Full-Outer Joins This video explains full-outer joins, a method to merge DataFrames including all keys from both frames. |
9. Merging by Indexes with the left_index and right_index Parameters This video shows how to merge DataFrames using indexes as keys with left_index and right_index parameters. |
10. The join Method Introduces the join method for merging DataFrames, a simpler alternative to using the merge function. |
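The join types in this section differ mainly in which keys survive the merge. The sketch below shows this on two hypothetical tables, `customers` and `orders`, whose key columns deliberately have different names.

```python
import pandas as pd

# Hypothetical data: customer 1 has two orders, customers 2 and 3 have none,
# and one order refers to a customer (4) missing from the customers table.
customers = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["Ana", "Ben", "Cal"]})
orders = pd.DataFrame({
    "order_id": [10, 11, 12],
    "cust": [1, 1, 4],
    "total": [25.0, 40.0, 15.0],
})

# pd.concat stacks DataFrames; ignore_index renumbers the rows.
stacked = pd.concat([customers, customers], ignore_index=True)

# left_on / right_on name the key column on each side.
keys = {"left_on": "customer_id", "right_on": "cust"}

# Left join: every customer is kept, matched or not.
left = customers.merge(orders, how="left", **keys)

# Inner join: only keys present in both tables.
inner = customers.merge(orders, how="inner", **keys)

# Full-outer join: all keys from both tables.
outer = customers.merge(orders, how="outer", **keys)

# join matches on the index of both DataFrames.
by_index = customers.set_index("customer_id").join(
    orders.set_index("cust"), how="inner"
)

print(len(left), len(inner), len(outer), len(by_index))
```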
11. Working with Dates and Times
This section covers handling dates and times in Pandas, starting with an overview and a review of Python's datetime. It teaches using Timestamp and DatetimeIndex objects, creating date ranges with pd.date_range, and accessing date-time properties via the dt attribute. The section also delves into time-based arithmetic with DateOffset, specialized date offsets, and timedeltas, enhancing skills in managing time series data.
1. Intro to the Working with Dates and Times Module and Review of Python's datetime Introduces the module on working with dates and times in Pandas, along with a review of Python's datetime. |
2. The Timestamp and DatetimeIndex Objects Covers the use of Timestamp and DatetimeIndex objects in Pandas for date-time data manipulation. |
3. Create Range of Dates with pd.date_range Function Demonstrates creating a range of dates using the pd.date_range function, crucial for time series data. |
4. The dt Attribute Explores the dt attribute, allowing access to date and time properties of a Pandas Series. |
5. Selecting Rows from a DataFrame with DatetimeIndex Teaches techniques for selecting rows based on date-time indexes in a DataFrame. |
6. The DateOffset Object Introduces the DateOffset object to perform time-based arithmetic operations on dates. |
7. Specialized Date Offsets Covers specialized date offsets in Pandas for handling complex date-time manipulations. |
8. Timedeltas Explains the concept of timedeltas in Pandas, used for representing durations of time. |
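The date and time tools from this section can be sketched as follows. The dates and visit counts are arbitrary sample values.

```python
import pandas as pd

# date_range builds a DatetimeIndex; here five consecutive days.
days = pd.date_range(start="2024-01-29", periods=5, freq="D")
df = pd.DataFrame({"visits": [10, 12, 9, 15, 11]}, index=days)

# The dt attribute exposes date properties on a Series of datetimes.
day_names = pd.Series(days).dt.day_name()

# A partial date string selects every row in that month.
january = df.loc["2024-01"]

# DateOffset does calendar-aware arithmetic: one month after 31 January
# lands on the last valid day of February.
shifted = pd.Timestamp("2024-01-31") + pd.DateOffset(months=1)

# Specialized offsets such as MonthEnd roll forward to a calendar boundary.
month_end = pd.Timestamp("2024-02-10") + pd.offsets.MonthEnd()

# Subtracting two Timestamps yields a Timedelta (a duration).
elapsed = pd.Timestamp("2024-03-01") - pd.Timestamp("2024-02-01")

print(day_names.tolist(), shifted, month_end, elapsed)
```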
12. Input and Output
This section provides a complete guide on Pandas' input and output operations. It demonstrates exporting DataFrames to CSV, a standard data sharing format, and guides on using the openpyxl library for Excel file interactions. The section shows practical ways to import and export Excel files in Pandas, essential for diverse data analysis tasks.
1. Intro to the Input and Output Module Provides an overview of input and output operations in Pandas, essential for data exchange. |
2. Export DataFrame to CSV File Demonstrates how to export a DataFrame to a CSV file, a common data sharing format. |
3. Install openpyxl Library to Read and Write Excel Files Guides on installing the openpyxl library, enabling reading and writing Excel files in Pandas. |
4. Import Excel File into pandas Shows how to import data from an Excel file into Pandas, a frequent operation in data analysis. |
5. Export Excel File from pandas This video teaches the process of exporting Pandas DataFrames to Excel files, useful for data reporting and sharing. |
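A CSV round trip might look like the sketch below. It writes to an in-memory buffer instead of a file on disk so that it runs anywhere; the city data is made up. Excel files follow the same pattern with `to_excel` and `pd.read_excel` once openpyxl is installed, which is why they are not exercised here.

```python
import io

import pandas as pd

# Hypothetical data to export.
df = pd.DataFrame({"city": ["Lima", "Oslo"], "population_m": [10.1, 0.7]})

# index=False leaves the row labels out of the exported file.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

# read_csv accepts a file path or any file-like object.
restored = pd.read_csv(io.StringIO(csv_text))

print(csv_text)
print(restored)
```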
13. Visualization
This section focuses on data visualization with Pandas and Matplotlib in Python. Starting with installing Matplotlib, it teaches creating charts directly from Pandas data. The section introduces the `plot` method for basic line plots, followed by enhancing visuals using templates for improved aesthetics. Learners will explore crafting bar charts for comparative analysis and pie charts for proportional data representation. This section equips learners with practical skills to effectively convey data insights through visual storytelling.
1. Install matplotlib Library for Visualization Guides on installing the Matplotlib library, a powerful tool for creating a wide range of static, animated, and interactive visualizations in Python. |
2. The plot Method Introduces the plot method in Pandas for basic line plots, enabling quick and easy visualization of data series. |
3. Modifying Plot Aesthetics with Templates Demonstrates how to modify plot aesthetics using templates to enhance the visual appeal of data representations. |
4. Bar Charts In this session we will cover the creation of bar charts in Pandas, useful for comparing different groups or tracking changes over time. |
5. Pie Charts This video teaches how to make pie charts, which are effective for showing proportions of a whole in a visually intuitive way. |
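As a rough sketch of the charts covered here, the example below draws a bar chart and a pie chart from one Series. It assumes that "templates" refers to Matplotlib's built-in style sheets (here `ggplot`), which the listing does not confirm; the sales figures are invented, and the non-interactive `Agg` backend is chosen only so the script runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: units sold per region.
sales = pd.Series([120, 95, 140], index=["north", "south", "west"], name="units")

# Apply a built-in style sheet to change the overall look of the plots.
plt.style.use("ggplot")

fig, (ax_bar, ax_pie) = plt.subplots(1, 2, figsize=(8, 3))

# The plot method chooses the chart type through the kind parameter.
sales.plot(kind="bar", ax=ax_bar, title="Units by region")
sales.plot(kind="pie", ax=ax_pie, title="Share of units")

fig.tight_layout()

# Each bar and each pie wedge is stored as a patch on its axes.
n_bars = len(ax_bar.patches)
n_wedges = len(ax_pie.patches)
bar_title = ax_bar.get_title()

plt.close(fig)
```

In Jupyter Lab the figure renders inline, so the backend line and the `plt.close` call are not needed there.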
14. Options and Settings
In this section, learners explore various options and settings in Pandas for customized data analysis. The section demonstrates altering Pandas settings using attributes and functions for a flexible analysis environment and discusses the precision option, crucial for accurate data display.
1. Introduction to the Options and Settings Module This session provides an overview of various options and settings in Pandas, allowing for customization of Pandas' behavior and output. |
2. Changing Options with Attributes Shows how to change Pandas options using attributes, offering a way to adjust settings to suit different analysis needs. |
3. Changing Options with Functions Explores changing Pandas options using functions, which can provide more flexibility and control over the data analysis environment. |
4. The precision Option Discusses the precision option in Pandas, which controls the output display precision of floating-point numbers, important for data clarity and readability. |
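The two ways of changing options, and the effect of the precision option, can be sketched like this. The values are arbitrary; note that precision changes only how numbers are displayed, not what is stored.

```python
import pandas as pd

# Option 1: set an option through attribute access.
pd.options.display.max_rows = 20

# Option 2: set and read options through functions.
pd.set_option("display.precision", 2)
current = pd.get_option("display.precision")

df = pd.DataFrame({"x": [3.14159, 2.71828]})

# The printed form is rounded to two decimal places...
rendered = repr(df)

# ...but the stored value keeps its full precision.
stored = df.loc[0, "x"]

print(rendered)
print(stored)

# reset_option restores the library defaults.
pd.reset_option("display.precision")
pd.reset_option("display.max_rows")
```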
15. Conclusion
The final section summarizes the entire course, emphasizing the key concepts and techniques in data analysis with Pandas and Python. It consolidates the comprehensive skill set developed, preparing learners to effectively handle data analysis in real-world scenarios.
1. Conclusion This video wraps up the course by summarizing key concepts and techniques learned, reinforcing the comprehensive skill set acquired in data analysis with Pandas and Python. |