The Tech Platform

Nov 10, 2022, 4 min read

How to Select Rows and Columns in Pandas?

This article explains how to select rows and columns in pandas using [], loc[], and iloc[].

Indexing in pandas means selecting particular rows and columns of data from a DataFrame. You can select all rows and some of the columns, some of the rows and all columns, or some of each. Indexing is also known as subset selection.
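The three kinds of subset selection described above can be sketched as follows. This is a minimal illustration, assuming a small made-up DataFrame with the default integer row labels; the variable names are hypothetical, and loc[] and iloc[] are the indexers introduced in the rest of this article.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration
df = pd.DataFrame(
    {
        "Name": ["Stuti", "Saumya", "Aaditya", "Seema"],
        "Age": [28, 32, 25, 32],
        "City": ["Varanasi", "Delhi", "Mumbai", "Delhi"],
    }
)

# All rows, a subset of the columns
all_rows_some_cols = df.loc[:, ["Name", "City"]]

# A subset of the rows (positions 0 and 1), all columns
some_rows_all_cols = df.iloc[0:2, :]

# A subset of the rows and a subset of the columns
some_rows_some_cols = df.loc[[0, 2], ["Name", "Age"]]

print(all_rows_some_cols.shape)   # (4, 2)
print(some_rows_all_cols.shape)   # (2, 3)
print(some_rows_some_cols.shape)  # (2, 2)
```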

loc[] and iloc[] differ in how they identify the rows and columns to select from a pandas DataFrame:

  • loc[] selects rows and columns by name (label).

  • iloc[] selects rows and columns by integer position, counting from zero.
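To make the label versus position distinction concrete, here is a small sketch. It assumes a made-up DataFrame whose row labels are strings, so that a label and a position cannot be confused; the data and variable names are illustrative only.

```python
import pandas as pd

# Hypothetical data with string row labels, used only for illustration
df = pd.DataFrame(
    {"Age": [28, 32, 25], "City": ["Varanasi", "Delhi", "Mumbai"]},
    index=["Stuti", "Saumya", "Aaditya"],
)

# loc[] looks up the row and column by label
by_label = df.loc["Saumya", "City"]

# iloc[] looks up the same cell by zero-based position (row 1, column 1)
by_position = df.iloc[1, 1]

print(by_label)     # Delhi
print(by_position)  # Delhi

# Slices behave differently: loc[] includes the end label,
# iloc[] excludes the end position
label_slice = df.loc["Stuti":"Saumya"]
position_slice = df.iloc[0:1]

print(len(label_slice))     # 2
print(len(position_slice))  # 1
```

Both indexers reach the same cell here; they only differ in how the cell is addressed.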

One of the main advantages of a pandas DataFrame is its ease of use, as you can see when you use the loc[] or iloc[] indexers to select or filter rows or columns. They are among the most commonly used DataFrame attributes. Let’s look at how each one is used.

Create DataFrame

Create the DataFrame with the column names "Name", "Age", "City", and "Salary".

# import pandas
import pandas as pd

# List of tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]

# Create a DataFrame object from the list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])

# Show the dataframe
df

Output:

Select Column by Name using []

The [] operator selects a column by its name. Passing a single name returns a Series; passing a list of names returns a DataFrame.
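As a quick sketch of what [] returns, the snippet below compares a single column name with a list of names. The two-row DataFrame and the missing "Country" column are made up for the illustration.

```python
import pandas as pd

# Hypothetical two-row DataFrame, used only for illustration
df = pd.DataFrame({"Name": ["Stuti", "Seema"], "City": ["Varanasi", "Delhi"]})

# A single column name returns a Series
one_col = df["City"]

# A list of column names returns a DataFrame, even for one column
one_col_df = df[["City"]]

print(type(one_col).__name__)     # Series
print(type(one_col_df).__name__)  # DataFrame

# Asking for a column that does not exist raises KeyError
try:
    df["Country"]
    missing_column = False
except KeyError:
    missing_column = True

print(missing_column)  # True
```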

Example:

Select Single Column

# import pandas
import pandas as pd

# List of tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]

# Create a DataFrame object from the list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])

# Use the [] operator to select a single column
result = df["City"]

# Show the result (a Series)
result


 
Output:

Select Multiple Columns

# import pandas
import pandas as pd

# List of tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]

# Create a DataFrame object from the list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])

# Use the [] operator to select multiple columns
result = df[["Name", "Age", "Salary"]]

# Show the dataframe
result
 

Output:

Select Rows by Name using loc

The loc[] indexer selects data by the labels of rows or columns. It can select a subset of rows, a subset of columns, or both at once.
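The sketch below illustrates a few forms loc[] accepts, using the employee data from this article with "Name" as the index. The variable names are chosen for the illustration. Note that this data contains repeated names, so a single label can match several rows.

```python
import pandas as pd

employees = [
    ("Stuti", 28, "Varanasi", 20000),
    ("Saumya", 32, "Delhi", 25000),
    ("Aaditya", 25, "Mumbai", 40000),
    ("Saumya", 32, "Delhi", 35000),
    ("Saumya", 32, "Delhi", 30000),
    ("Saumya", 32, "Mumbai", 20000),
    ("Aaditya", 40, "Dehradun", 24000),
    ("Seema", 32, "Delhi", 70000),
]
df = pd.DataFrame(employees, columns=["Name", "Age", "City", "Salary"])
df = df.set_index("Name")

# Row labels and column labels together: loc[rows, columns]
subset = df.loc[["Stuti", "Seema"], ["City", "Salary"]]

# A single cell, addressed by row label and column label
salary = df.loc["Seema", "Salary"]

# A label that occurs more than once returns every matching row
saumya_rows = df.loc["Saumya"]

# A boolean mask filters rows; the column list narrows the result
delhi = df.loc[df["City"] == "Delhi", ["Age", "Salary"]]

print(subset.shape)      # (2, 2)
print(salary)            # 70000
print(len(saumya_rows))  # 4
print(len(delhi))        # 4
```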

Example:

Select Single Row

# import pandas
import pandas as pd

# List of tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]

# Create a DataFrame object from the list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])

# Set the 'Name' column as the index of the DataFrame
df.set_index("Name", inplace=True)

# Use .loc[] to select a single row by its label
result = df.loc["Stuti"]

# Show the result (a Series, because the label "Stuti" is unique)
result

Output:

Select Multiple Rows

# import pandas
import pandas as pd

# List of tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]

# Create a DataFrame object from the list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])

# Set the 'Name' column as the index of the DataFrame
df.set_index("Name", inplace=True)

# Use .loc[] to select multiple rows by their labels
result = df.loc[["Stuti", "Seema"]]

# Show the dataframe
result

Output:

Select Rows by Index using iloc

The iloc[] indexer selects data by position. It is similar to loc[], but it addresses rows and columns by zero-based integer position instead of by label.
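As a minimal sketch of position-based selection, the snippet below uses a shortened version of the employee data from this article. The variable names are chosen for the illustration.

```python
import pandas as pd

employees = [
    ("Stuti", 28, "Varanasi", 20000),
    ("Saumya", 32, "Delhi", 25000),
    ("Aaditya", 25, "Mumbai", 40000),
    ("Seema", 32, "Delhi", 70000),
]
df = pd.DataFrame(employees, columns=["Name", "Age", "City", "Salary"])

# A slice of positions: rows 0 and 1 (the end position is excluded)
first_two = df.iloc[0:2]

# A negative position counts from the end: the last row, as a Series
last_row = df.iloc[-1]

# A single cell: row position 2, column position 0
cell = df.iloc[2, 0]

# Lists of row and column positions: rows 0 and 3, columns 0 and 3
block = df.iloc[[0, 3], [0, 3]]

print(len(first_two))       # 2
print(last_row["Name"])     # Seema
print(cell)                 # Aaditya
print(list(block.columns))  # ['Name', 'Salary']
```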

Example:

Select Single Row

# import pandas
import pandas as pd

# List of tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]

# Create a DataFrame object from the list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])

# Use .iloc[] to select a single row by position (the third row)
result = df.iloc[2]

# Show the result (a Series)
result
 

Output:

Select Multiple Rows

# import pandas
import pandas as pd

# List of tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]

# Create a DataFrame object from the list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])

# Use .iloc[] to select multiple rows by position
result = df.iloc[[2, 3, 5]]

# Show the dataframe
result

Output:


 

Resource: geeksforgeeks.org
