
1.2.3 Basic Operations with Pandas

Information

In this page, you will find the content of the section in both video and text formats. Videos are interactive and contain embedded content (explanations, links or exercises) throughout their playback.

At the end of this page, you have a link to the Jupyter/Colab notebook where you can practice the theory from this section.


Basic Operations with Pandas

Welcome back to our Pandas course module. In this section, we will cover the basic operations in Pandas that underpin most data manipulation and analysis tasks: reading and writing data, selecting and indexing data, filtering, and modifying data. Let's get started!

Reading and Writing Data

One of the most common tasks you'll perform with Pandas is reading data from various file formats and writing data back to these formats. Pandas supports multiple file types, but we'll focus on CSV (Comma Separated Values) files, as they are widely used.

To read a CSV file into a Pandas DataFrame, use the read_csv function:

# Reading a CSV file
import pandas as pd

# Uncomment the lines below and replace 'path_to_your_file.csv' with the actual path to your CSV file
# df = pd.read_csv('path_to_your_file.csv')
# print(df.head())  # Display the first five rows of the DataFrame (the default for head)

Similarly, to write a DataFrame to a CSV file, use the DataFrame's to_csv method. Passing index=False leaves the row index out of the file:

# Writing a DataFrame to a CSV file
# Uncomment the line below and replace 'path_to_your_output_file.csv' with the desired output file path
# df.to_csv('path_to_your_output_file.csv', index=False)
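
If you have no CSV file at hand, the read and write steps can be tried in memory. The following is a minimal sketch, not part of the course notebook: the sample names, ages, and cities are invented for illustration, and it relies on to_csv returning the CSV text as a string when no path is given and on read_csv accepting a file-like object.

```python
import io

import pandas as pd

# Hypothetical sample data, invented for this illustration.
df = pd.DataFrame(
    {
        "Name": ["Ana", "Luis", "Marta"],
        "Age": [23, 31, 27],
        "City": ["Vigo", "Madrid", "Vigo"],
    }
)

# With no path argument, to_csv returns the CSV text instead of writing a file.
# index=False leaves the row index out of the output.
csv_text = df.to_csv(index=False)

# read_csv accepts file-like objects as well as file paths.
df_roundtrip = pd.read_csv(io.StringIO(csv_text))

print(csv_text)
print(df_roundtrip.head())
```

Replacing the StringIO object with a real file path gives the same read and write workflow on disk.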

Selecting and Indexing Data

Pandas provides several ways to select and index data within a DataFrame. The two primary ones are the .loc and .iloc indexers, which are used with square brackets rather than called like functions.

  • .loc: Selects data by labels (index and column names).
  • .iloc: Selects data by integer location (position).

Here’s how you can use these indexers:

# Selecting a single column using .loc
# Uncomment the lines below to run the example after creating the DataFrame df
# print(df.loc[:, 'Name'])

# Selecting multiple columns using .loc
# print(df.loc[:, ['Name', 'Age']])

# Selecting rows by integer position using .iloc
# print(df.iloc[0])  # First row
# print(df.iloc[0:2])  # First two rows
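
The difference between labels and positions is easiest to see when the index is not the default 0, 1, 2. The sketch below assumes a small, invented DataFrame with a string index; note that a .loc slice includes its end label, while an .iloc slice excludes its end position.

```python
import pandas as pd

# Hypothetical sample data with a string index, invented for this illustration.
df = pd.DataFrame(
    {"Name": ["Ana", "Luis", "Marta"], "Age": [23, 31, 27]},
    index=["a", "b", "c"],
)

# Label-based: the end label "b" is included in the slice.
by_label = df.loc["a":"b", "Name"]

# Position-based: the end position 2 is excluded from the slice.
by_position = df.iloc[0:2, 0]

print(by_label.tolist())     # ['Ana', 'Luis']
print(by_position.tolist())  # ['Ana', 'Luis']

# A single cell, addressed by label and by position.
print(df.loc["c", "Age"])  # 27
print(df.iloc[2, 1])       # 27
```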

Filtering Data

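Filtering narrows a dataset down to the rows that meet specific conditions. You can filter rows on one or more conditions using Boolean indexing: a comparison on a column produces a Series of True/False values, and indexing the DataFrame with that Series keeps only the rows marked True.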

For example, to filter rows where the age is greater than 23:

# Filtering rows based on a condition
# Uncomment the lines below to run the example after creating the DataFrame df
# df_filtered = df[df['Age'] > 23]
# print(df_filtered)
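
To see what happens inside the square brackets, it can help to look at the Boolean mask on its own. This is a minimal sketch on invented sample data; the variable name mask is only illustrative.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
df = pd.DataFrame(
    {"Name": ["Ana", "Luis", "Marta"], "Age": [23, 31, 27]}
)

# The comparison is evaluated element by element and yields a Boolean Series.
mask = df["Age"] > 23
print(mask.tolist())  # [False, True, True]

# Indexing with the mask keeps only the rows where it is True.
df_filtered = df[mask]
print(df_filtered["Name"].tolist())  # ['Luis', 'Marta']
```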

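You can also combine multiple conditions using the element-wise operators & (and), | (or), and ~ (not). Wrap each condition in parentheses, because these operators bind more tightly than comparisons such as > and ==: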

# Filtering rows based on multiple conditions
# Uncomment the lines below to run the example after creating the DataFrame df
# df_filtered = df[(df['Age'] > 23) & (df['City'] == 'Vigo')]
# print(df_filtered)
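
As a rough illustration of how the three logical operators behave, the sketch below applies each one to the same invented sample data. The variable names are hypothetical and chosen only for readability.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
df = pd.DataFrame(
    {
        "Name": ["Ana", "Luis", "Marta"],
        "Age": [23, 31, 27],
        "City": ["Vigo", "Madrid", "Vigo"],
    }
)

# AND: both conditions must hold.
older_and_vigo = df[(df["Age"] > 23) & (df["City"] == "Vigo")]

# OR: at least one condition must hold.
older_or_vigo = df[(df["Age"] > 23) | (df["City"] == "Vigo")]

# NOT: invert a condition.
not_vigo = df[~(df["City"] == "Vigo")]

print(older_and_vigo["Name"].tolist())  # ['Marta']
print(older_or_vigo["Name"].tolist())   # ['Ana', 'Luis', 'Marta']
print(not_vigo["Name"].tolist())        # ['Luis']
```

Python's own and, or, and not keywords do not work here, because they expect a single truth value rather than a Series of them.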

Modifying Data

Pandas allows you to modify your data efficiently. This includes updating values, adding new columns, and dropping unnecessary ones.

To update values in a DataFrame, you can assign new values directly:

# Updating a column's values
# Uncomment the lines below to run the example after creating the DataFrame df
# df['Age'] = df['Age'] + 1
# print(df)

Adding new columns is straightforward:

# Adding a new column
# Uncomment the lines below to run the example after creating the DataFrame df
# df['Country'] = 'Spain'
# print(df)
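
Updates do not have to touch every row, and a new column can be computed from an existing one. The following sketch, on invented sample data, updates only the rows that match a condition by combining .loc with a Boolean mask, then derives a hypothetical AgeInMonths column.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
df = pd.DataFrame(
    {"Name": ["Ana", "Luis", "Marta"], "Age": [23, 31, 27]}
)

# Add one year only where the age is above 25; other rows are left unchanged.
df.loc[df["Age"] > 25, "Age"] += 1
print(df["Age"].tolist())  # [23, 32, 28]

# A new column computed from an existing one (the column name is illustrative).
df["AgeInMonths"] = df["Age"] * 12
print(df["AgeInMonths"].tolist())  # [276, 384, 336]
```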

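To drop columns or rows, use the drop method. By default it returns a new DataFrame rather than modifying the original, which is why the result is assigned back to df: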

# Dropping a column
# Uncomment the lines below to run the example after creating the DataFrame df
# df = df.drop('Country', axis=1)  # axis=1 indicates columns
# print(df)

# Dropping rows by index label
# df = df.drop([0, 1], axis=0)  # axis=0 indicates rows; removes the rows labeled 0 and 1
# print(df)
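
As an alternative to the axis argument, drop also accepts the columns and index keywords, which some readers find easier to follow. The sketch below uses invented sample data and also shows that the original DataFrame is left untouched unless the result is assigned back.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
df = pd.DataFrame(
    {
        "Name": ["Ana", "Luis", "Marta"],
        "Age": [23, 31, 27],
        "Country": ["Spain", "Spain", "Spain"],
    }
)

# Equivalent to df.drop('Country', axis=1).
without_country = df.drop(columns=["Country"])

# Equivalent to df.drop([0, 1], axis=0): removes the rows labeled 0 and 1.
without_first_two = df.drop(index=[0, 1])

print(list(without_country.columns))     # ['Name', 'Age']
print(without_first_two.index.tolist())  # [2]

# drop returned new objects, so df still has all its columns and rows.
print(df.shape)  # (3, 3)
```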

Summary

In this section, we covered the fundamental operations you'll need to work effectively with Pandas. These skills form the foundation for more advanced data analysis and manipulation tasks. Practice these operations with your own datasets to become more comfortable with them.

Next, we will explore data analysis and manipulation techniques in Pandas, which will help you derive meaningful insights from your data. Stay tuned!

Practice

Below, you have a link to the Jupyter/Colab notebook where you can practice the theory from this section:

Basic Operations in Pandas
