7 Steps to Mastering SQL for Data Science



SQL is a standard language for storing, manipulating, and retrieving data in databases. It is used in MySQL, SQL Server, MS Access, Oracle, Sybase, Informix, Postgres, and other database systems.


SQL is used to communicate with a database. According to ANSI (American National Standards Institute), it is the standard language for relational database management systems. SQL statements are used to perform tasks such as updating data in a database or retrieving data from it.


Step 1: SQL Basics

As a data scientist, you will mostly read from databases and analyze data to fit your use case. You generally don’t need to create databases or modify existing ones, because most companies have a separate team for this.


If you have no prior SQL knowledge whatsoever, start with this tutorial to understand what an RDBMS (relational database management system) is.

Next, learn to read an ERD (entity-relationship diagram). An ERD is a structural diagram used to visualize the tables in a database and the relationships between them. As a data scientist, when extracting data from different tables, you will often need to refer to an ER diagram to understand how the tables relate to each other.
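To make the idea of a relationship between tables concrete, here is a minimal sketch using Python's built-in sqlite3 module (my choice for illustration, not something the tutorials require). The customers and orders tables and their columns are hypothetical. The foreign key on orders.customer_id is the kind of link an ERD would draw as a line between the two tables.

```python
import sqlite3

# In-memory database with two hypothetical tables, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER REFERENCES customers(customer_id),"
    " amount REAL)"
)

# SQLite's PRAGMA foreign_key_list reports each foreign key of a table.
# Columns 2, 3 and 4 of each row hold the referenced table, the local
# column, and the referenced column.
relationships = [
    (row[3], row[2], row[4])
    for row in conn.execute("PRAGMA foreign_key_list(orders)")
]

for local_col, ref_table, ref_col in relationships:
    print(f"orders.{local_col} -> {ref_table}.{ref_col}")
```

In a real project you would read this information off the ERD rather than query it, but the relationship it describes is the same.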


After that, you can immediately start learning how to query data in SQL. I highly recommend following along with these tutorials by W3Schools to learn the following commands: SELECT, IN, WHERE, BETWEEN, AND, OR, NOT, LIKE.
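If you want to try a few of these commands before opening the tutorials, the sketch below runs them against a small in-memory SQLite database through Python's built-in sqlite3 module. The customers table, its rows, and the names helper are all made up for illustration. Other database systems may differ in details such as whether LIKE is case-sensitive.

```python
import sqlite3

# Hypothetical table and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, country TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("Alice", "Germany", 34),
        ("Bob", "Mexico", 27),
        ("Carla", "Germany", 45),
        ("Dmitri", "Norway", 19),
    ],
)


def names(query):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(query)]


# SELECT picks columns; WHERE keeps only the rows that match a condition.
in_germany = names(
    "SELECT name FROM customers WHERE country = 'Germany' ORDER BY name"
)

# IN matches any value from a list.
in_list = names(
    "SELECT name FROM customers WHERE country IN ('Mexico', 'Norway') ORDER BY name"
)

# BETWEEN matches a range, including both endpoints.
in_range = names(
    "SELECT name FROM customers WHERE age BETWEEN 27 AND 34 ORDER BY name"
)

# LIKE matches a pattern; % stands for any sequence of characters.
starts_with_c = names(
    "SELECT name FROM customers WHERE name LIKE 'C%' ORDER BY name"
)

print(in_germany, in_list, in_range, starts_with_c)
```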


WHERE

The WHERE clause filters records, returning only those that satisfy a condition. It can be combined with the AND, OR, and NOT operators.


AND and OR

The AND and OR operators are used to filter records based on more than one condition:

  • The AND operator displays a record if all the conditions separated by AND are TRUE.

  • The OR operator displays a record if any of the conditions separated by OR is TRUE.
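The difference between the two operators is easiest to see side by side. This is a minimal sketch using Python's built-in sqlite3 module with a hypothetical customers table; the same WHERE clauses should work in any of the database systems mentioned earlier.

```python
import sqlite3

# Hypothetical table and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, country TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("Alice", "Germany", 34),
        ("Bob", "Mexico", 27),
        ("Carla", "Germany", 45),
        ("Dmitri", "Norway", 19),
    ],
)

# AND: a row is returned only if every condition is TRUE.
both = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM customers "
        "WHERE country = 'Germany' AND age > 40 ORDER BY name"
    )
]

# OR: a row is returned if at least one condition is TRUE.
either = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM customers "
        "WHERE country = 'Germany' OR age < 20 ORDER BY name"
    )
]

print(both)
print(either)
```

With these rows, AND keeps only the one customer who is in Germany and over 40, while OR keeps everyone in Germany plus anyone under 20.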


NOT

The NOT operator displays a record if the condition is not TRUE, that is, if it evaluates to FALSE.
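Here is a small sketch of NOT in action, again using Python's built-in sqlite3 module and a hypothetical customers table. One row has a missing (NULL) country on purpose: comparing NULL with a value yields neither TRUE nor FALSE, so NOT does not bring that row back. This is a detail worth keeping in mind when working with incomplete data.

```python
import sqlite3

# Hypothetical table and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [
        ("Alice", "Germany"),
        ("Bob", "Mexico"),
        ("Dmitri", "Norway"),
        ("Eve", None),  # None is stored as NULL: the country is unknown
    ],
)

# NOT inverts the condition: keep the rows where country = 'Germany' is FALSE.
not_germany = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM customers WHERE NOT country = 'Germany' ORDER BY name"
    )
]

# Eve is absent: NULL = 'Germany' is unknown, and NOT unknown is still unknown.
print(not_germany)
```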