Python Training , study and exam guide
1 Introduction to Python
1.1 What is Python?
1.2 History of Python
1.3 Features of Python
1.4 Python Applications
1.5 Setting up the Python Environment
1.6 Running Your First Python Program
2 Python Basics
2.1 Python Syntax and Indentation
2.2 Variables and Data Types
2.2 1 Numbers
2.2 2 Strings
2.2 3 Lists
2.2 4 Tuples
2.2 5 Sets
2.2 6 Dictionaries
2.3 Operators
2.3 1 Arithmetic Operators
2.3 2 Comparison Operators
2.3 3 Logical Operators
2.3 4 Assignment Operators
2.3 5 Membership Operators
2.3 6 Identity Operators
2.4 Input and Output
2.4 1 Input Function
2.4 2 Output Function
2.5 Comments
2.5 1 Single-line Comments
2.5 2 Multi-line Comments
3 Control Flow
3.1 Conditional Statements
3.1 1 If Statement
3.1 2 If-Else Statement
3.1 3 Elif Statement
3.1 4 Nested If Statements
3.2 Loops
3.2 1 For Loop
3.2 2 While Loop
3.2 3 Nested Loops
3.3 Loop Control Statements
3.3 1 Break Statement
3.3 2 Continue Statement
3.3 3 Pass Statement
4 Functions
4.1 Defining Functions
4.2 Function Arguments
4.2 1 Positional Arguments
4.2 2 Keyword Arguments
4.2 3 Default Arguments
4.2 4 Variable-length Arguments
4.3 Return Statement
4.4 Lambda Functions
4.5 Scope of Variables
4.5 1 Local Variables
4.5 2 Global Variables
4.6 Recursion
5 Data Structures
5.1 Lists
5.1 1 List Operations
5.1 2 List Methods
5.1 3 List Comprehensions
5.2 Tuples
5.2 1 Tuple Operations
5.2 2 Tuple Methods
5.3 Sets
5.3 1 Set Operations
5.3 2 Set Methods
5.4 Dictionaries
5.4 1 Dictionary Operations
5.4 2 Dictionary Methods
5.5 Advanced Data Structures
5.5 1 Stacks
5.5 2 Queues
5.5 3 Linked Lists
6 Modules and Packages
6.1 Importing Modules
6.2 Creating Modules
6.3 Standard Library Modules
6.3 1 Math Module
6.3 2 Random Module
6.3 3 DateTime Module
6.4 Creating Packages
6.5 Installing External Packages
7 File Handling
7.1 Opening and Closing Files
7.2 Reading from Files
7.2 1 read()
7.2 2 readline()
7.2 3 readlines()
7.3 Writing to Files
7.3 1 write()
7.3 2 writelines()
7.4 File Modes
7.5 Working with CSV Files
7.6 Working with JSON Files
8 Exception Handling
8.1 Try and Except Blocks
8.2 Handling Multiple Exceptions
8.3 Finally Block
8.4 Raising Exceptions
8.5 Custom Exceptions
9 Object-Oriented Programming (OOP)
9.1 Classes and Objects
9.2 Attributes and Methods
9.3 Constructors and Destructors
9.4 Inheritance
9.4 1 Single Inheritance
9.4 2 Multiple Inheritance
9.4 3 Multilevel Inheritance
9.5 Polymorphism
9.6 Encapsulation
9.7 Abstraction
10 Working with Libraries
10.1 NumPy
10.1 1 Introduction to NumPy
10.1 2 Creating NumPy Arrays
10.1 3 Array Operations
10.2 Pandas
10.2 1 Introduction to Pandas
10.2 2 DataFrames and Series
10.2 3 Data Manipulation
10.3 Matplotlib
10.3 1 Introduction to Matplotlib
10.3 2 Plotting Graphs
10.3 3 Customizing Plots
10.4 Scikit-learn
10.4 1 Introduction to Scikit-learn
10.4 2 Machine Learning Basics
10.4 3 Model Training and Evaluation
11 Web Development with Python
11.1 Introduction to Web Development
11.2 Flask Framework
11.2 1 Setting Up Flask
11.2 2 Routing
11.2 3 Templates
11.2 4 Forms and Validation
11.3 Django Framework
11.3 1 Setting Up Django
11.3 2 Models and Databases
11.3 3 Views and Templates
11.3 4 Forms and Authentication
12 Final Exam Preparation
12.1 Review of Key Concepts
12.2 Practice Questions
12.3 Mock Exams
12.4 Exam Tips and Strategies
10 2 3 Data Manipulation Explained

10 2 3 Data Manipulation Explained

Key Concepts

Data manipulation in Python involves several key concepts:

1. Filtering Data

Filtering data involves selecting specific rows or columns based on certain conditions. This is useful for isolating relevant information.

Example:

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}
df = pd.DataFrame(data)

# Filtering rows where Age is greater than 30
filtered_df = df[df['Age'] > 30]
print(filtered_df)
    

Analogy: Think of filtering as picking out specific fruits from a basket based on their color or size.

2. Sorting Data

Sorting data involves arranging rows based on the values of one or more columns. This helps in understanding the data better.

Example:

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}
df = pd.DataFrame(data)

# Sorting by Age in ascending order
sorted_df = df.sort_values(by='Age')
print(sorted_df)
    

Analogy: Sorting is like arranging books on a shelf alphabetically by their titles.

3. Grouping Data

Grouping data involves splitting the data into groups based on some criteria and applying a function to each group.

Example:

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}
df = pd.DataFrame(data)

# Grouping by City and calculating the mean age
grouped_df = df.groupby('City')['Age'].mean()
print(grouped_df)
    

Analogy: Grouping is like categorizing items in a store by their type and calculating the average price for each category.

4. Aggregating Data

Aggregating data involves applying statistical functions to summarize the data. Common functions include sum, mean, count, etc.

Example:

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}
df = pd.DataFrame(data)

# Calculating the total and average age
total_age = df['Age'].sum()
average_age = df['Age'].mean()
print("Total Age:", total_age)
print("Average Age:", average_age)
    

Analogy: Aggregating is like counting the total number of items in a shopping cart and calculating the average price per item.

5. Merging and Joining Data

Merging and joining data involves combining two or more datasets based on common columns. This is useful for integrating data from different sources.

Example:

import pandas as pd

data1 = {'Name': ['Alice', 'Bob', 'Charlie'],
         'Age': [25, 30, 35]}
df1 = pd.DataFrame(data1)

data2 = {'Name': ['Alice', 'Bob', 'David'],
         'City': ['New York', 'Los Angeles', 'Houston']}
df2 = pd.DataFrame(data2)

# Merging dataframes on the 'Name' column
merged_df = pd.merge(df1, df2, on='Name')
print(merged_df)
    

Analogy: Merging is like combining two spreadsheets by matching entries based on a common column, such as a customer ID.

Putting It All Together

By understanding and using these concepts effectively, you can manipulate data efficiently for various analytical tasks.

Example:

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}
df = pd.DataFrame(data)

# Filtering and sorting
filtered_sorted_df = df[df['Age'] > 30].sort_values(by='Age')

# Grouping and aggregating
grouped_df = df.groupby('City')['Age'].agg(['mean', 'sum'])

# Merging with another dataset
data2 = {'Name': ['Alice', 'Bob', 'David'],
         'City': ['New York', 'Los Angeles', 'Houston']}
df2 = pd.DataFrame(data2)
merged_df = pd.merge(df, df2, on='Name')

print("Filtered and Sorted Data:\n", filtered_sorted_df)
print("Grouped and Aggregated Data:\n", grouped_df)
print("Merged Data:\n", merged_df)