13 Important Pandas Functions Used in Data Science

Python is one of the most widely used languages for data analysis and data science. It is easy to learn, has a large online community of learners and instructors, and has some very powerful data-centric libraries. Pandas is one of the most important of these libraries for data analysis and data science.


Pandas is a widely used Python data analysis library. It provides many functions and methods that speed up the data analysis process. What makes Pandas so popular is its functionality, flexibility, and simple syntax.


1. read_csv()

The read_csv() function reads a comma-separated values (CSV) file into a Pandas DataFrame. All you need to do is pass the path of the file you want to read. It can also read files that use a delimiter other than a comma, such as | or tab.

import pandas as pd
data_1 = pd.read_csv(r'C:\Users\ABC\Desktop\blog_dataset.csv')

The data has now been read from the data source into a Pandas DataFrame. Change the path so that it points to the file you want to read.
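Reading files that use a delimiter other than a comma can be sketched with the sep parameter of read_csv(). This is a minimal illustration, not part of the original dataset: the sample text is made up, and io.StringIO stands in for a file on disk so the snippet runs without any real path.

```python
import io

import pandas as pd

# Hypothetical pipe-delimited text; io.StringIO stands in for a file on disk.
pipe_text = "Name|Age|City\nAlam|29|Indore\nRohit|23|New Delhi\n"
pipe_df = pd.read_csv(io.StringIO(pipe_text), sep="|")

# The same rows, this time separated by tabs.
tab_text = "Name\tAge\tCity\nAlam\t29\tIndore\nRohit\t23\tNew Delhi\n"
tab_df = pd.read_csv(io.StringIO(tab_text), sep="\t")

print(pipe_df)
# Both calls should produce the same DataFrame.
print(pipe_df.equals(tab_df))
```

With a real file, you would pass its path in place of the io.StringIO object and keep the same sep argument.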


The to_csv() function does the opposite of read_csv(): it writes the data in a Pandas DataFrame or Series to a CSV file. read_csv() and to_csv() are among the most used functions in Pandas because they move data between files and DataFrames, so they are important to know.
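A small round trip shows how the two functions mirror each other. This is a sketch under a few assumptions: the two-row DataFrame is illustrative, to_csv() is called without a path so that it returns the CSV text as a string instead of writing a file, and index=False is used to leave the row index out of the output.

```python
import io

import pandas as pd

# Illustrative two-row DataFrame (not the blog dataset itself).
df = pd.DataFrame({"Name": ["Alam", "Rohit"], "Age": [29, 23]})

# Without a path argument, to_csv() returns the CSV text as a string.
# index=False keeps the row index out of the output.
csv_text = df.to_csv(index=False)
print(csv_text)

# Reading the text back with read_csv() should recover the same data.
round_trip = pd.read_csv(io.StringIO(csv_text))
print(round_trip.equals(df))
```

To write to disk instead, pass a file path as the first argument, for example df.to_csv('output.csv', index=False), where output.csv is a placeholder name.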


2. head()

head(n) returns the first n rows of a dataset. By default, df.head() returns the first 5 rows of the DataFrame. If you want more or fewer rows, pass n as an integer.

data_1.head(6)

Output:

     Name  Age       City           State         DOB  Gender  City temp  Salary
0    Alam   29     Indore  Madhya Pradesh  20-11-1991    Male       35.5   50000
1   Rohit   23  New Delhi           Delhi  19-09-1997    Male       39.0   85000
2   Bimla   35     Rohtak         Haryana  09-01-1985  Female       39.7   20000
3   Rahul   25    Kolkata     West Bengal  19-09-1995    Male       36.5   40000
4  Chaman   32    Chennai      Tamil Nadu  12-03-1988    Male       41.1   65000
5   Vivek   38   Gurugram         Haryana  22-06-1982    Male       38.9   35000


As expected, the first 6 rows (indexed 0 to 5) are returned.

tail() is similar to head() and returns the last n rows of a dataset. head() and tail() give you a quick look at your dataset and let you check whether the data has been read into the DataFrame properly.
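The behaviour of tail() can be sketched on a cut-down version of the data. The DataFrame below is an assumption made for illustration: it reuses only the Name and Age values from the six rows shown earlier, so the snippet runs without the CSV file.

```python
import pandas as pd

# Cut-down stand-in for data_1: only Name and Age from the six rows shown earlier.
df = pd.DataFrame({
    "Name": ["Alam", "Rohit", "Bimla", "Rahul", "Chaman", "Vivek"],
    "Age": [29, 23, 35, 25, 32, 38],
})

# tail(2) returns the last two rows and keeps their original index labels.
last_two = df.tail(2)
print(last_two)

# Called without an argument, tail() returns 5 rows, just like head().
print(len(df.tail()))
```

Because the index labels are preserved, the rows returned by tail(2) here are still labelled 4 and 5, which makes it easy to see where they sit in the full dataset.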

