Getting started with SQL

Data Science for Everyone

Author

Bongani Ncube

SELECT statement

SELECT all with *

SELECT all columns FROM the regions table

SELECT * FROM regions;

SELECT all columns FROM the employees table

SELECT * FROM employees;

SELECT all columns FROM the jobs table

SELECT * FROM ____;

SELECTing a subset

  • you can achieve this by defining the columns to be SELECTed e.g

SELECT col_1,col_2,col_3 FROM my_table;

SELECT email columns FROM the employees table

SELECT email FROM employees;

SELECT employee_id,first_name,last_name columns FROM the employees table

SELECT ___,___,____ FROM employees;

SELECT employee_id,first_name,last_name,email,salary columns FROM the employees table

SELECT __,__,__,__,___ FROM employees;

SELECT distinct items in SQL

  • Often your results will include many duplicate values. If you want to SELECT all the unique values FROM a column, you can use the DISTINCT keyword. for instance if you have the following database table:

notice that SELECT * FROM employees; returns 40 rows but adding distinct first_name reduces the number of rows because many people have the same first name . using distinct is not a good idea though.

SELECT distinct first_name FROM employees;

filtering using WHERE in SQL

  • WHERE is a filtering clause

  • In SQL, the WHERE keyword allows you to filter based on both text and numeric values in a table. There are a few different comparison operators you can use:

  • You can build up your WHERE queries by combining multiple conditions with the AND keyword.

SELECT all rows where first_name is John FROM employees table

SELECT * FROM employees
where first_name='___';

SELECT all rows where salary is above10000 FROM employees table

SELECT * FROM employees
where salary > _____;

SELECT all rows where salary is above10000 AND department_id is 9 FROM employees table

SELECT COUNT(*) FROM employees
WHERE salary > _____ AND ______;

COUNT and AS alias in SQL

  • The COUNT statement lets you count then returning the number of rows in one or more columns.
  • we use AS to rename default name
  • COUNT(*) tells you how many rows are in a table

how many employees do we have

SELECT COUNT(*) FROM employees AS num_people;
  • the above query calculates the number of rows in the data and finds 40 rows

we can use COUNT(DISTINCT column) to get the number of unique rows

SELECT COUNT(distinct first_name) FROM employees;

the query above will calculate rows with distinct names and finds them to be 37 instead of 40

We can use COUNT with WHERE

  • We can do this to get the number of terms after filtering
  • how many people are male?

count all rows where salary is above10000 FROM employees table

SELECT COUNT(*) FROM employees
WHERE salary > _____;

Aggregate functions in SQL

Find the MIN() OF salary

SELECT MIN(salary) FROM employees;

Find the MIN() OF salary and aliase it as min_salary

SELECT MIN(salary) AS ____ FROM employees;

Find the MIN(),MAX(),AVG(),STDEV() OF salary

SELECT MIN(salary) AS ___,
       MAX(salary) AS ___,
       AVG(______) AS ___,
FROM employees;