SQL 101
Welcome N1NJ10 , this time we will not discuss about security stuff but about database stuff specific SQL server , I will try to talk about basics SQL 101
SQL is a programming language that deals with the different DBMSs to do things in databases
SQL can be classified in a lot of ways but the most common one is: -
Data Definition Language (DDL): Queries that create, modify objects in the database.
Data Manipulation Language (DML): Queries that help us deal with managing and manipulating data in the database.
Data Control Language (DCL): Queries that help us deal with controls, rights, and permission in the database system.
Transaction Control Language (TCL): Queries used for managing and controlling the transactions in a database to maintain consistency.
In this article we will focus on DDL and DML
First of all, we should know some database concepts.
What is a primary key ?
Primary key is a crucial thing for important database concept (maintaining data integrity) which means maintaining the accuracy, consistency, and reliability of data.
Primary key is a column or a group of columns that UNIQUELY identifying a row in a table.
The primary key must be unique, not null, immutable (not changeable) and as minimum as possible.
Think of the primary key as a representator of the table to other tables
For example employees’ id in employee table is a primary key
What is the difference between primary key and unique key?
The primary key is used to uniquely identify each record in a table and is typically the main identifier for the table. It has additional characteristics and system-level behaviors, such as non-nullability and automatic indexing. On the other hand, a unique key enforces uniqueness but allows null values and is not restricted to being the primary identifier for a table.
What is a foreign key?
The foreign key is a reference for the primary key in another table which establish a relations between tables.
Think of the foreign key as an ambassador of the table that he came from.
Now let’s dig in SQL syntax.
SQL IS NOT CASE SENSITIVE For example
will retrieve the same result as
but it’s preferred to type the keywords in upper case to ease the code readability.
in this article we will use Holywell SQL guide, this guide separates the query to several lines “each keyword in a separate line “
for example
note “the semicolon (;) is required in most DBMSs to separate the queries from each other which allow more than one SQL statement to be executed in one call”
Let’s start with the DDL.
the DDL commands are:
CREATE
This command creates database objects “Database, Table, Index …… etc.”
syntax:
SQL datatypes are:
Numeric Types:
INT or INTEGER: Integer values (whole numbers).
FLOAT: Floating-point numbers (decimal numbers).
DECIMAL or NUMERIC: Fixed-point numbers with a specified precision and scale.
BIGINT: Large integer values.
SMALLINT: Small integer values.
TINYINT: Tiny integer values.
Character String Types:
CHAR: Fixed-length character string.
VARCHAR: Variable-length character string.
TEXT: Variable-length character string for storing large amounts of text.
NCHAR: Fixed-length Unicode character string.
NVARCHAR: Variable-length Unicode character string.
NTEXT: Variable-length Unicode character string for storing large amounts of text.
Date and Time Types:
DATE: Date values (year, month, day).
TIME: Time values (hour, minute, second).
TIMESTAMP: Date and time values, including fractional seconds.
DATETIME: Date and time values with a precision of milliseconds.
SMALLDATETIME: Date and time values with a precision of one minute.
we want now to implement the primary key, foreign key, unique key.
we will use this syntax:
In the previous query the primary key is column 1.
In the previous query the unique key is column2.
“Note you cannot implement in MS SQL SERVER without the CONSTARINT keyword which give you the chance to name your key”
In the previous query column 3 in table 1 is a foreign key reference to column 4 in table 2.
DROP
This command PERMANENTLY DELETE database objects “Database, Table, Index …… etc.”
syntax:
ALTER
The ALTER command is used to add, delete, or modify columns in existing tables.
syntax:
This code added a new column to the table.
This code dropped a column from the table.
This code renamed a column.
This code modified the column’s datatype.
“please note that there is difference between the different DBMSs in this code the previous code is for MS SQL SERVER”
“This one is for MySQL”
Now remember the first two tables we created?
We want column 3 to automatically change after the change of column 4 in table 2.
We will use option called CASCADE
CASCADE is basically option that automatically changes the foreign key after the change of the primary key “either update or delete”
In the previous two queries we dropped the foreign key because we want to respecify the foreign key to be up to date with the primary key
and we added the foreign key once again but this time with CASCADE option which let the two keys to be up to date
TRUNCATE
The TRUNCATE command deletes all the data inside the table but not the table itself and this is the main difference between TRUNICATE and DROP
Before we go into the DML part we will know two important commands in the TCL.
COMMIT vs ROLLBACK
COMMIT: The “commit” command is used to permanently save the changes made within a transaction. It signifies the successful completion of a transaction and makes the changes visible to other users or processes. Once a commit is issued, the changes become durable and cannot be rolled back.
ROLLBACK: The “rollback” command is used to undo or cancel the changes made within a transaction. It reverts the database to its state before the transaction started, discarding any modifications made within the transaction. Rollback is typically used when an error occurs or when the transaction needs to be aborted.
In Terms of Syntax the COMMIT and ROLLBACK are considered as independent queries.
First of all, we should know that the DML commands are very useful in data cleaning, data extraction.
To make things interesting we will take the journey of the DML by facing some problems and trying to get out of it.
Suppose we have this humble table “Customers” in our database, and we want to select the two columns ContactName , City
First of all we should analyze the plain English text
The two columns we want are ContactName , City and we want from the table Customers.
So, the query should look like this.
Nice we now retrieved the wanted columns
Now if we want to retrieve all the columns in our table, we should use the asterisk ( * ) after the SELECT command.
The query will look like this:
We now retrieved all the columns in the table, good.
Suppose we want to know the unique values in the country column.
What we should do is using the DISTINCT keyword alongside the SELECT one to retrieve the unique values.
The query should look like this:
The result retrieved only 21 record “note that the total records are 91.”
Now we have another problem.
We need to know the cities in Norway that are stored in our table
We will use the where clause to fulfill our condition
Only one city in the table from Norway which is Staven.
Now imagine that you want the cities in Venezuela and the CustomerID is even.
We will use logical operator (AND operator) and the modulo operator (%)
The query will look like this:
The only costumer with even ID is in Barquisimeto Venezuela.
Now imagine that you want to retrieve the customer names in Spain or the ID is odd.
We will use the two logical operators (OR , NOT) and we will use the modulo operator (%)
The query will look like this:
The result consists of all the names of the odd IDs and the names in Spain.
Imagine we want to order the previous result ascending or descending.
We will use the keyword ORDER BY with the ASC or DESC clauses.
Let’s try to retrieve the first 5 results in the previous query
well this depend on the DBMS you are using
if you use MS SQL SERVER you use TOP clause alongside the SELECT statement:
but if you are in MySQL you use the LIMIT keyword:
however, the result is still the same.
now we want the names that starts with letter ‘f’ for example.
we should use the LIKE operator and the SQL WILDCARDS
The syntax:
And these are examples for the patterns :
WHERE CustomerName LIKE ‘f%’ → Finds any values that start with “f”
WHERE CustomerName LIKE ‘%f’ → Finds any values that end with “f”
WHERE CustomerName LIKE ‘%or%’ → Finds any values that have “or” in any position
WHERE CustomerName LIKE ‘_r%’ → Finds any values that have “r” in the second position
WHERE CustomerName LIKE ‘a_%’ → Finds any values that start with “a” and are at least 2 characters in length
WHERE CustomerName LIKE ‘a__%’ → Finds any values that start with “a” and are at least 3 characters in length
WHERE ContactName LIKE ‘a%o’ → Finds any values that start with “a” and ends with “o”
And these are the wildcards
‘%’ (Represents zero or more characters) → bl% finds bl, black, blue, and blob
‘_’ (Represents a single character) → h_t finds hot, hat, and hit
‘[]’ (Represents any single character within the brackets) → h[oa]t finds hot and hat, but not hit
‘^’ (Represents any character not in the brackets) → h[^oa]t finds hit, but not hot and hat
‘-’(Represents any single character within the specified range) → c[a-b]t finds cat and cbt
Let’s get back to our example we want to find the names that starts with ‘f’
we can retrieve with a name also which called Alias and the clause for it is AS
Imagine that we want to put new values to the table.
We use the INSERT command as follows:
The result will be like this:
VERY IMPORTANT NOTE “in INSERT command notice the order of the values and don’t forget any value.”
Let’s try to update this row because we entered the wrong postal code.
We will use the UPDATE command to modify the postal code as follows:
On second thought we want to remove this row from the table
We will use the DELETE command:
The syntax:
please note that without the where clause this query will delete all rows from the column “just like TRUNCATE”
DELETE FROM vs TRUNCATE
The TRUNCATE statement is a DDL (Data Definition Language) operation used to quickly and efficiently remove all rows from a table. When you truncate a table, the operation is not logged in the transaction log, and the space used by the table’s data pages is deallocated. TRUNCATE is a non-recoverable operation, meaning you cannot roll it back. It is often faster than DELETE for removing all data from a table.
on the other hand
The DELETE statement is a DML (Data Manipulation Language) operation used to remove specific rows from a table. When you use DELETE * FROM table_name
, it deletes all rows in the table, effectively achieving the same result as TRUNCATE. However, unlike TRUNCATE, DELETE is a logged operation, and each deleted row is recorded in the transaction log. DELETE can be rolled back, and it allows you to specify conditions to delete specific rows based on criteria.
Now let’s move on to another table “Products.”
Now we want to know the minimum, maximum price in the table.
We will use some functions called aggregate functions.
Now we want to know the number of the products.
We should count a unique entity in the products which is the ID.
So we should use the COUNT() function on the ProductID column.
The result is:
we also want to know the average price of the products
so we will use AVG() function on the price column
let me introduce you to another table in our database
OrderDetails table
Now we want to know the number of the quantities sold.
We will use the SUM() function on the Quantity column.
let’s get back to “Customers” table.
we want to know the Customer IDs in countries Mexico, Sweden, Switzerland
we will use the IN operator.
let’s go to the “Products” table
we want to know the product name of the products where their price is between 20 and 25.
we will use the between operator.
note that between operator is INCLUSIVE which means that it includes the first and the last value.
Now imagine we want to know the date of the orders made by the costumers with their names .
But hold on these two information are in different tables.
The solution of this problem is the JOINS
Here are the different types of the JOINs in SQL:
(INNER) JOIN
: Returns records that have matching values in both tablesLEFT (OUTER) JOIN
: Returns all records from the left table, and the matched records from the right table.RIGHT (OUTER) JOIN
: Returns all records from the right table, and the matched records from the left table.FULL (OUTER) JOIN
: Returns all records when there is a match in either left or right table.
NOTE If the INNER JOIN and LEFT JOIN produce the same result, it means that there are no non-matching rows between the two tables based on the join condition. In other words, all the rows from the left table in the LEFT JOIN have a matching row in the right table.
Even if the result is the same in a specific scenario, it’s still recommended to choose the join type (INNER JOIN or LEFT JOIN) based on the semantics and data relationships to ensure clarity and convey the intended meaning of the query.
The same thing applies for the RIGHT JOIN
Back to our problem
In this problem we should use the inner join to retreive all the names that has order date
First we must find the common column
The common column is the CustomerID
Now we have all the data we need, Let’s code
Please note that you must use the aliases you made for the tables otherwise your query will generate errors.
Let me introduce you to a new table which is the “Suppliers” table
Now we want to know all the cities recorded in our database with no duplicates.
we will use the UNION keyword.
Note if you want the duplicates just add ALL to UNION
If we want to know the common cities in the two tables
we will use INTERSECT clause
if we want to know the cities in the Customers table and not in the Suppliers
we will use EXCEPT clause.
Back to the Customers table.
We want to know how many customers in each country.
we should now use keyword called GROUP BY
The GROUP BY statement groups rows that have the same values into summary rows
it is often used with aggregate functions (COUNT()
, MAX()
, MIN()
, SUM()
, AVG()
) to group the result-set by one or more columns.
Let’s get back to our example.
Note, if you want to add condition to the group by use HAVING clause rather than WHERE clause because WHERE don’t accept any agg. functions.
For example:
Now we want the postal code of the customers that ordered from the employee with the ID = 7
We will use something called subquery
Subqueries, also known as nested queries or sub selects, are queries embedded within another query in SQL. A subquery is used to retrieve data from one or more tables and use that result within the context of another query.
Note “Subqueries take some time to be executed.”
Let’s code!
Note, we can use the Subqueries in SELECT , WHERE , HAVING , FROM keywords
Finally there is something we want to know which is the sequence of the keywords
which keyword executed the first ?
The SQL , is a key skill across every area of data science. Jobs that require SQL knowledge include data analyst, business intelligence developer, data engineer, data architect and software engineer.
I enjoyed the article , And again thanks to my friend
Beshoy Qulta without him I can’t write article or even interested in SQL
Last updated