SQL 101

Welcome N1NJ10 , this time we will not discuss about security stuff but about database stuff specific SQL server , I will try to talk about basics SQL 101

SQL is a programming language that deals with the different DBMSs to do things in databases

SQL can be classified in a lot of ways but the most common one is: -

Data Definition Language (DDL): Queries that create, modify objects in the database.

Data Manipulation Language (DML): Queries that help us deal with managing and manipulating data in the database.

Data Control Language (DCL): Queries that help us deal with controls, rights, and permission in the database system.

Transaction Control Language (TCL): Queries used for managing and controlling the transactions in a database to maintain consistency.

In this article we will focus on DDL and DML

First of all, we should know some database concepts.

What is a primary key ?

Primary key is a crucial thing for important database concept (maintaining data integrity) which means maintaining the accuracy, consistency, and reliability of data.

Primary key is a column or a group of columns that UNIQUELY identifying a row in a table.

The primary key must be unique, not null, immutable (not changeable) and as minimum as possible.

Think of the primary key as a representator of the table to other tables

For example employees’ id in employee table is a primary key

What is the difference between primary key and unique key?

The primary key is used to uniquely identify each record in a table and is typically the main identifier for the table. It has additional characteristics and system-level behaviors, such as non-nullability and automatic indexing. On the other hand, a unique key enforces uniqueness but allows null values and is not restricted to being the primary identifier for a table.

What is a foreign key?

The foreign key is a reference for the primary key in another table which establish a relations between tables.

Think of the foreign key as an ambassador of the table that he came from.

Now let’s dig in SQL syntax.

SQL IS NOT CASE SENSITIVE For example

select * from table_name;

will retrieve the same result as

SELECT * FROM table_name;

but it’s preferred to type the keywords in upper case to ease the code readability.

in this article we will use Holywell SQL guide, this guide separates the query to several lines “each keyword in a separate line “

for example

SELECT * 
FROM table_name;

note “the semicolon (;) is required in most DBMSs to separate the queries from each other which allow more than one SQL statement to be executed in one call”

Let’s start with the DDL.

the DDL commands are:

CREATE

This command creates database objects “Database, Table, Index …… etc.”

syntax:

CREATE DATABASE db_name ;

CREATE TABLE table1 (column1_name datatype , column2_name datatype ...... etc.);

SQL datatypes are:

Numeric Types:

INT or INTEGER: Integer values (whole numbers).
FLOAT: Floating-point numbers (decimal numbers).
DECIMAL or NUMERIC: Fixed-point numbers with a specified precision and scale.
BIGINT: Large integer values.
SMALLINT: Small integer values.
TINYINT: Tiny integer values.

Character String Types:

CHAR: Fixed-length character string.
VARCHAR: Variable-length character string.
TEXT: Variable-length character string for storing large amounts of text.
NCHAR: Fixed-length Unicode character string.
NVARCHAR: Variable-length Unicode character string.
NTEXT: Variable-length Unicode character string for storing large amounts of text.

Date and Time Types:

DATE: Date values (year, month, day).
TIME: Time values (hour, minute, second).
TIMESTAMP: Date and time values, including fractional seconds.
DATETIME: Date and time values with a precision of milliseconds.
SMALLDATETIME: Date and time values with a precision of one minute.

we want now to implement the primary key, foreign key, unique key.

we will use this syntax:

CREATE TABLE table1 (column1 datatype, column2 datatype, column3 datatype 
PRIMARY KEY (column1));

In the previous query the primary key is column 1.

CREATE TABLE table1 (column1 datatype, column2 datatype, column3 datatype 
PRIMARY KEY (column1)
CONSTRAINT UQ_column UNIQUE (column2));

In the previous query the unique key is column2.

“Note you cannot implement in MS SQL SERVER without the CONSTARINT keyword which give you the chance to name your key”

CREATE TABLE table1 (column1 INT, column2 VARCHAR(15), column3 INT,
PRIMARY KEY (column1),
CONSTRAINT UQ_column UNIQUE (column2),
CONSTRAINT FK_key FORIEGN KEY(column3) 
REFERENCES table2 (column4));

CREATE TABLE table2 (column4 INT , column5 DATETIME, 
PRIMARY KEY (column4));

In the previous query column 3 in table 1 is a foreign key reference to column 4 in table 2.

DROP

This command PERMANENTLY DELETE database objects “Database, Table, Index …… etc.”

syntax:

DROP DATABASE db_name;

DROP TABLE table_name;

ALTER

The ALTER command is used to add, delete, or modify columns in existing tables.

syntax:

ALTER TABLE table_name 
ADD new_column_name datatype;

This code added a new column to the table.

ALTER TABLE table_name
DROP COLUMN column_name;

This code dropped a column from the table.

ALTER TABLE table_name
RENAME COLUMN old_name TO new_name;

This code renamed a column.

ALTER TABLE table_name
ALTER COLUMN column_name datatype;

This code modified the column’s datatype.

“please note that there is difference between the different DBMSs in this code the previous code is for MS SQL SERVER”

ALTER TABLE table_name
MODIFY COLUMN column_name datatype;

“This one is for MySQL”

Now remember the first two tables we created?

We want column 3 to automatically change after the change of column 4 in table 2.

We will use option called CASCADE

CASCADE is basically option that automatically changes the foreign key after the change of the primary key “either update or delete”

ALTER TABLE table1
DROP CONSTRAINT FK_key;

ALTER TABLE table1
ADD CONSTRAINT FK_key FOREIGN KEY (column3)
    REFERENCES table2 (column4)
    ON UPDATE CASCADE
    ON DELETE CASCADE;

In the previous two queries we dropped the foreign key because we want to respecify the foreign key to be up to date with the primary key

and we added the foreign key once again but this time with CASCADE option which let the two keys to be up to date

TRUNCATE

The TRUNCATE command deletes all the data inside the table but not the table itself and this is the main difference between TRUNICATE and DROP

TRUNCATE TABLE table_name;

Before we go into the DML part we will know two important commands in the TCL.

COMMIT vs ROLLBACK

COMMIT: The “commit” command is used to permanently save the changes made within a transaction. It signifies the successful completion of a transaction and makes the changes visible to other users or processes. Once a commit is issued, the changes become durable and cannot be rolled back.

ROLLBACK: The “rollback” command is used to undo or cancel the changes made within a transaction. It reverts the database to its state before the transaction started, discarding any modifications made within the transaction. Rollback is typically used when an error occurs or when the transaction needs to be aborted.

In Terms of Syntax the COMMIT and ROLLBACK are considered as independent queries.

First of all, we should know that the DML commands are very useful in data cleaning, data extraction.

To make things interesting we will take the journey of the DML by facing some problems and trying to get out of it.

Suppose we have this humble table “Customers” in our database, and we want to select the two columns ContactName , City

First of all we should analyze the plain English text

The two columns we want are ContactName , City and we want from the table Customers.

So, the query should look like this.

SELECT ContactName , City 
FROM Customers;

Nice we now retrieved the wanted columns

Now if we want to retrieve all the columns in our table, we should use the asterisk ( * ) after the SELECT command.

The query will look like this:

SELECT * 
FROM Customers;

We now retrieved all the columns in the table, good.

Suppose we want to know the unique values in the country column.

What we should do is using the DISTINCT keyword alongside the SELECT one to retrieve the unique values.

The query should look like this:

SELECT DISTINCT Country
FROM Customers;

The result retrieved only 21 record “note that the total records are 91.”

Now we have another problem.

We need to know the cities in Norway that are stored in our table

We will use the where clause to fulfill our condition

SELECT City
FROM Customers
WHERE Country = 'Norway';

Only one city in the table from Norway which is Staven.

Now imagine that you want the cities in Venezuela and the CustomerID is even.

We will use logical operator (AND operator) and the modulo operator (%)

The query will look like this:

SELECT city 
FROM Customers 
WHERE Country = 'Venezuela' AND CustomerID % 2 = 0;

The only costumer with even ID is in Barquisimeto Venezuela.

Now imagine that you want to retrieve the customer names in Spain or the ID is odd.

We will use the two logical operators (OR , NOT) and we will use the modulo operator (%)

The query will look like this:

SELECT CustomerName
FROM Customers
WHERE Country = 'Spain' OR NOT CustomerID % 2 = 0;

The result consists of all the names of the odd IDs and the names in Spain.

Imagine we want to order the previous result ascending or descending.

We will use the keyword ORDER BY with the ASC or DESC clauses.

SELECT CustomerName
FROM Customers
WHERE Country = 'Spain' OR NOT CustomerID % 2 = 0
ORDER BY CustomerName ASC; -- for ascending order

SELECT CustomerName
FROM Customers
WHERE Country = 'Spain' OR NOT CustomerID % 2 = 0
ORDER BY CustomerName DESC; -- for descending order

Let’s try to retrieve the first 5 results in the previous query

well this depend on the DBMS you are using

if you use MS SQL SERVER you use TOP clause alongside the SELECT statement:

SELECT TOP 5  CustomerName
FROM Customers
WHERE Country = 'Spain' OR NOT CustomerID % 2 = 0
ORDER BY CustomerName DESC;

but if you are in MySQL you use the LIMIT keyword:

SELECT CustomerName
FROM Customers
WHERE Country = 'Spain' OR NOT CustomerID % 2 = 0
ORDER BY CustomerName DESC
LIMIT 5;

however, the result is still the same.

now we want the names that starts with letter ‘f’ for example.

we should use the LIKE operator and the SQL WILDCARDS

The syntax:

SELECT column1, column2, ...
FROM table_name
WHERE columnN LIKE pattern;

And these are examples for the patterns :

WHERE CustomerName LIKE ‘f%’ → Finds any values that start with “f”

WHERE CustomerName LIKE ‘%f’ → Finds any values that end with “f”

WHERE CustomerName LIKE ‘%or%’ → Finds any values that have “or” in any position

WHERE CustomerName LIKE ‘_r%’ → Finds any values that have “r” in the second position

WHERE CustomerName LIKE ‘a_%’ → Finds any values that start with “a” and are at least 2 characters in length

WHERE CustomerName LIKE ‘a__%’ → Finds any values that start with “a” and are at least 3 characters in length

WHERE ContactName LIKE ‘a%o’ → Finds any values that start with “a” and ends with “o”

And these are the wildcards

‘%’ (Represents zero or more characters) → bl% finds bl, black, blue, and blob

‘_’ (Represents a single character) → h_t finds hot, hat, and hit

‘[]’ (Represents any single character within the brackets) → h[oa]t finds hot and hat, but not hit

‘^’ (Represents any character not in the brackets) → h[^oa]t finds hit, but not hot and hat

‘-’(Represents any single character within the specified range) → c[a-b]t finds cat and cbt

Let’s get back to our example we want to find the names that starts with ‘f’

we can retrieve with a name also which called Alias and the clause for it is AS

SELECT CustomerName AS 'Customers start with f'
FROM Customers
WHERE CustomerName LIKE 'f%' ;

Imagine that we want to put new values to the table.

We use the INSERT command as follows:

INSERT INTO Customers (CustomerName, ContactName, Address, City, PostalCode, Country)
VALUES ('Ahmed', 'Ahmed Mohamed', 'El dokki', 'Cairo', '56987', 'Egypt');

The result will be like this:

VERY IMPORTANT NOTE “in INSERT command notice the order of the values and don’t forget any value.”

Let’s try to update this row because we entered the wrong postal code.

We will use the UPDATE command to modify the postal code as follows:

UPDATE Customers
SET PostalCode = '12611'
WHERE CustomerID = 92;

On second thought we want to remove this row from the table

We will use the DELETE command:

The syntax:

DELETE FROM table_name 
WHERE condition;

please note that without the where clause this query will delete all rows from the column “just like TRUNCATE”

DELETE FROM Customers
WHERE CustomerID = 92;

DELETE FROM vs TRUNCATE

The TRUNCATE statement is a DDL (Data Definition Language) operation used to quickly and efficiently remove all rows from a table. When you truncate a table, the operation is not logged in the transaction log, and the space used by the table’s data pages is deallocated. TRUNCATE is a non-recoverable operation, meaning you cannot roll it back. It is often faster than DELETE for removing all data from a table.

on the other hand

The DELETE statement is a DML (Data Manipulation Language) operation used to remove specific rows from a table. When you use DELETE * FROM table_name, it deletes all rows in the table, effectively achieving the same result as TRUNCATE. However, unlike TRUNCATE, DELETE is a logged operation, and each deleted row is recorded in the transaction log. DELETE can be rolled back, and it allows you to specify conditions to delete specific rows based on criteria.

Now let’s move on to another table “Products.”

Now we want to know the minimum, maximum price in the table.

We will use some functions called aggregate functions.

SELECT MIN(Price) AS 'The smallest price'
FROM Products; -- for the minimum

SELECT MAX(Price) AS 'The largest price'
FROM Products; -- for the maximum

Now we want to know the number of the products.

We should count a unique entity in the products which is the ID.

So we should use the COUNT() function on the ProductID column.

SELECT COUNT(ProductID) AS 'products number'
FROM Products;

The result is:

we also want to know the average price of the products

so we will use AVG() function on the price column

SELECT AVG(Price) AS 'Price Average' 
FROM Products;

let me introduce you to another table in our database

OrderDetails table

Now we want to know the number of the quantities sold.

We will use the SUM() function on the Quantity column.

SELECT SUM(Quantity) AS 'Quantities sold'
FROM OrderDetails;

let’s get back to “Customers” table.

we want to know the Customer IDs in countries Mexico, Sweden, Switzerland

we will use the IN operator.

SELECT CustomerID , Country
FROM Customers
WHERE country IN ('Mexico','Sweden','Switzerland')

let’s go to the “Products” table

we want to know the product name of the products where their price is between 20 and 25.

we will use the between operator.

note that between operator is INCLUSIVE which means that it includes the first and the last value.

SELECT ProductName FROM Products
WHERE Price BETWEEN 20 AND 25;

Now imagine we want to know the date of the orders made by the costumers with their names .

But hold on these two information are in different tables.

The solution of this problem is the JOINS

Here are the different types of the JOINs in SQL:

(INNER) JOIN: Returns records that have matching values in both tables
LEFT (OUTER) JOIN: Returns all records from the left table, and the matched records from the right table.
RIGHT (OUTER) JOIN: Returns all records from the right table, and the matched records from the left table.
FULL (OUTER) JOIN: Returns all records when there is a match in either left or right table.

NOTE If the INNER JOIN and LEFT JOIN produce the same result, it means that there are no non-matching rows between the two tables based on the join condition. In other words, all the rows from the left table in the LEFT JOIN have a matching row in the right table.

Even if the result is the same in a specific scenario, it’s still recommended to choose the join type (INNER JOIN or LEFT JOIN) based on the semantics and data relationships to ensure clarity and convey the intended meaning of the query.

The same thing applies for the RIGHT JOIN

Back to our problem

In this problem we should use the inner join to retreive all the names that has order date

First we must find the common column

The common column is the CustomerID

Now we have all the data we need, Let’s code

SELECT o.OrderID AS 'order ID', c.CustomerName AS 'Customer Name' , o.OrderDate AS 'Order date'
FROM Orders AS o -- left table
INNER JOIN  Customers AS c -- right table 
ON o.CustomerID=c.CustomerID; -- common column

Please note that you must use the aliases you made for the tables otherwise your query will generate errors.

Let me introduce you to a new table which is the “Suppliers” table

Now we want to know all the cities recorded in our database with no duplicates.

we will use the UNION keyword.

SELECT City FROM Customers
UNION
SELECT City FROM Suppliers
ORDER BY City;

Note if you want the duplicates just add ALL to UNION

SELECT City FROM Customers
UNION ALL
SELECT City FROM Suppliers
ORDER BY City;

If we want to know the common cities in the two tables

we will use INTERSECT clause

SELECT City FROM Customers
INTERSECT 
SELECT City FROM Suppliers
ORDER BY City;

if we want to know the cities in the Customers table and not in the Suppliers

we will use EXCEPT clause.

SELECT City FROM Customers
EXCEPT 
SELECT City FROM Suppliers
ORDER BY City;

Back to the Customers table.

We want to know how many customers in each country.

we should now use keyword called GROUP BY

The GROUP BY statement groups rows that have the same values into summary rows

it is often used with aggregate functions (COUNT(), MAX(), MIN(), SUM(), AVG()) to group the result-set by one or more columns.

Let’s get back to our example.

SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country;

Note, if you want to add condition to the group by use HAVING clause rather than WHERE clause because WHERE don’t accept any agg. functions.

For example:

SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country
HAVING COUNT(CustomerID) > 5;

Now we want the postal code of the customers that ordered from the employee with the ID = 7

We will use something called subquery

Subqueries, also known as nested queries or sub selects, are queries embedded within another query in SQL. A subquery is used to retrieve data from one or more tables and use that result within the context of another query.

Note “Subqueries take some time to be executed.”

Let’s code!

SELECT PostalCode AS 'postal code'
FROM Customers
WHERE CustomerID  IN (SELECT CustomerID FROM Orders WHERE EmployeeID = 7);

Note, we can use the Subqueries in SELECT , WHERE , HAVING , FROM keywords

Finally there is something we want to know which is the sequence of the keywords

which keyword executed the first ?

The SQL , is a key skill across every area of data science. Jobs that require SQL knowledge include data analyst, business intelligence developer, data engineer, data architect and software engineer.

I enjoyed the article , And again thanks to my friend

Beshoy Qulta without him I can’t write article or even interested in SQL

PreviousTHE NTLM/NTLMV2 AUTHENTICATION PROTOCOLS NextPOISONING & SNIFFING LAB

Last updated 1 year ago