Introduction

SQL, which is short for Structured Query Language, is a database computer language designed for accessing and manipulating databases. This guide will provide a quick introduction to SQL covering basic topics to give you an overview of the language and its capabilities.

RDBMS Concepts

RDBMS stands for Relational Database Management System. This is the basis for SQL, and for modern database systems like MS-SQL Server, IBM DB2, Oracle, MySQL, and Microsoft Access. It is a structuring of data into database tables, records and fields. Each database table has rows of data consisting of one or more fields. The records stored in each table is data that has been logically grouped together, like all customer information or all order information. This organization of data into specific tables is called normalization.

What is Normalization?

Database normalization is the process of efficiently organizing data in a database. This process eliminates redundant data, for example, storing the same data in more than one table and ensures that data dependencies make sense. The benefits of this are saving space in database storage and making sure that data is stored in a logical form resulting in faster operations.

Imagine that you are responsible for keeping track of all the books being checked out of a library. You could use a single table to track all the critical information such as name, address, phone, book title and due date to name a few. This single table meets the basic need to keep track of who has checked out which book, but does have some serious flaws in terms of efficiency, space required, and maintenance time.

For example, as voracious reader Bob checks out more books over time, you will have to re-enter all of his contact information for every book each time. To re-enter Bob’s contact information wastes time, and increases the opportunity for error. Also, when an update is necessary like a change of address or phone number, each of Bob’s records must be located and corrected. If one of Bob’s records has a different phone number from the rest, is it a correction, a record overlooked during the last update, or a data-entry mistake?

Many of these issues are avoided by dividing the information into multiple tables with the goal of having “a place for everything, and everything in its place.” Each piece of information should appear just once, simplifying data maintenance and decreasing the storage space required.

Schema

A schema is a logical database object holder. A database schema of a database system is its structure described in a formal language supported by the database management system. In simpler terms, schema is a container for database objects. It's a convention to help you better organize your tables.

Syntax

SQL follows a set of rules and guidelines called Syntax. Statement, which are like directions to follow, start with keywords like SELECT, INSERT, UPDATE, DELETE, ALTER, DROP, CREATE, USE, and SHOW to name just a few. All statements end with a semicolon (;). SQL is case insensitive, meaning that the keyword SELECT or select, will have the same meaning in SQL statements. The standard practice and what you see most often is uppercase for keywords.

There are many popular RDBMS available to work with. MS SQL Server, MySQL, Oracle and MS Access are some of the most popular. Although (in theory) SQL is standardized, in practice each vendor has their own "extensions" (variations) on the language so it is always best to review the documentation for your RDBMS. For example, MySQL is case sensitive with table naming so if you are working with MySQL you will need to use table names in your SQL statements as they exist in the database.

SQL SELECT Statement

The following SQL statement selects a number of fields from a given table. The result would be all the records from that table showing just the fields in the SELECT statement. This is useful if you have 20 fields in a record and only want the information from 5 of them.

SELECT Column1, Column2, ...ColumnN FROM table_name;

SQL WHERE Clause

The WHERE clause allows you to use a condition on your query statement. Maybe you only want customers for a certain zip code.

SELECT column1, column2....columnN FROM table_name WHERE CONDITION;

Notice that you can have a statement written all on one line or on multiple lines to make reading and understanding easier. The statement is not finished until it reaches the semicolon.

Create Database

The SQL CREATE DATABASE statement is used to create a new SQL database. Make sure you have the admin privilege before creating any database.

Syntax

CREATE DATABASE DatabaseName;

The database name should always be unique within the schema (your RDBMS).

Example

If you want to create a new database <testDB>, then the CREATE DATABASE statement would be as shown below −

CREATE DATABASE testDB;
Drop Database

The SQL DROP DATABASE statement is used to drop (delete) an existing database. Make sure you have the admin privilege before dropping any database.

NOTE -Be careful before using this operation. Dropping an existing database will result in loss of complete information stored in the database.

Syntax

DROP DATABASE DatabaseName;

Example

If you want to delete an existing database <testDB>, then the DROP DATABASE statement would be as shown below −

DROP DATABASE testDB;
Select Database

If you have more than one database in your database system you will need to select the desired database to perform any actions on it.

The SQL USE statement is used to select any existing database in the SQL schema (your RDBMS).

Syntax

USE DatabaseName;
Create Table

Creating a table in your database involves naming the table and defining its columns and each column's data type.

The data type is a guideline for SQL to understand what type of data is expected inside of each column, and it also identifies how SQL will interact with the stored data. Note: Data types might have different names in different RDBMS.

The SQL CREATE TABLE statement is used to create a new table.

Syntax

CREATE TABLE table_name( column1 datatype, column12 datatype, column3 datatype, ..... columnN datatype, PRIMARY KEY( one or more columns )

CREATE TABLE is the keyword telling the database system what you want to do. In this case, you want to create a new table. The unique name or identifier for the table follows the CREATE TABLE statement.

Then in brackets comes the list defining each column in the table and what sort of data type it is. The syntax becomes clearer with the following example.

Example

The following code block is an example, which creates a CUSTOMERS table with an ID as a primary key and NOT NULL are the constraints showing that these fields cannot be NULL while creating records in this table −

CREATE TABLE CUSTOMERS( ID INT NOT NULL, NAME VARCHAR (20) NOT NULL, AGE INT NOT NULL, ADDRESS CHAR (25) , SALARY DECIMAL (18, 2), PRIMARY KEY (ID) );

The PRIMARY KEY constraint uniquely identifies each record in a table. Primary keys must contain UNIQUE values, and cannot contain NULL values. A table can have only one primary key, which may consist of single or multiple fields. Most times the primary key field will be set to auto-increment.

Auto-increment allows a unique number to be generated automatically when a new record is inserted into a table.

A FOREIGN KEY is a key used to link two tables together. A FOREIGN KEY is a field (or collection of fields) in one table that refers to the PRIMARY KEY in another table.

The table containing the foreign key is called the child table, and the table containing the matching primary key is called the referenced or parent table.

For information on the available SQL data types visit this reference.

Drop Table

The SQL DROP TABLE statement is used to remove a table and all of it's data.

NOTE − You should be very careful while using this command because once a table is deleted then all the information available in that table will also be lost forever.

Syntax

DROP TABLE table_name;
Insert Query

The SQL INSERT INTO Statement is used to add new rows of data to a table in the database.

Syntax

There are two basic syntaxes of the INSERT INTO statement which are shown below.

Here, column1, column2, column3,...columnN are the names of the columns in the table into which you want to insert the data.

INSERT INTO TABLE_NAME (column1, column2, column3,...columnN) VALUES (value1, value2, value3,...valueN);

You may not need to specify the column(s) name in the SQL query if you are adding values for all the columns of the table. But make sure the order of the values is in the same order as the columns in the table.

INSERT INTO TABLE_NAME VALUES (value1,value2,value3,...valueN);

Example

Lets add three records to the CUSTOMERS table.

INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY) VALUES (1, 'Ramesh', 32, 'Ahmedabad', 2000.00 ); INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY) VALUES (2, 'Khilan', 25, 'Delhi', 1500.00 ); INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY) VALUES (3, 'kaushik', 23, 'Kota', 2000.00 );

You can also do it this way -

INSERT INTO CUSTOMERS VALUES (7, 'Muffy', 24, 'Indore', 10000.00 );

Populate one table using another table

You can populate the data into a table through the select statement over another table; provided the other table has a set of fields, which are required to populate the first table.

Here is the syntax −

INSERT INTO first_table_name [(column1, column2, ... columnN)] SELECT column1, column2, ...columnN FROM second_table_name [WHERE condition];
Select Query

The SQL SELECT statement is used to fetch the data from a database table which returns this data in the form of a result table. These result tables are called result-sets.

Syntax

SELECT column1, column2, columnN FROM table_name;

Here, column1, column2... are the fields of a table whose values you want to fetch. If you want to fetch all the fields available in the field, then you can use the following syntax.

SELECT * FROM table_name;

NOTE - You should only use the * if you really need all of the fields returned. In a large table with several fields it contributes to inefficiency that can be avoided by specifying the required columns.

Where Clause

The SQL WHERE clause is used to specify a condition while fetching the data from a single table or by joining with multiple tables. If the given condition is satisfied, then it returns a specific value from the table. You should use the WHERE clause to filter the records and fetching only the necessary records.

The WHERE clause is not only used in the SELECT statement, but it is also used in the UPDATE, DELETE statement, etc., which we will examine in the subsequent chapters.

Syntax

SELECT column1, column2, columnN FROM table_name WHERE [condition]

You can specify a condition using the comparison or logical operators like >, <, =, LIKE, NOT, etc.

Example

The following code is an example which would fetch the ID, Name and Salary fields from the CUSTOMERS table, where the salary is greater than 2000 −

SELECT ID, NAME, SALARY FROM CUSTOMERS WHERE SALARY > 2000;

The result would be -

+----+----------+----------+ | ID | NAME | SALARY | +----+----------+----------+ | 4 | Chaitali | 6500.00 | | 5 | Hardik | 8500.00 | | 6 | Komal | 4500.00 | | 7 | Muffy | 10000.00 | +----+----------+----------+

The following query is an example, which would fetch the ID, Name and Salary fields from the CUSTOMERS table for a customer with the name Hardik.

Here, it is important to note that all the strings should be given inside single quotes (''). Whereas, numeric values should be given without any quote as in the above example.

SELECT ID, NAME, SALARY FROM CUSTOMERS WHERE NAME = 'Hardik';

This would produce -

+----+----------+----------+ | ID | NAME | SALARY | +----+----------+----------+ | 5 | Hardik | 8500.00 | +----+----------+----------+
AND/OR Clauses

The SQL AND & OR operators are used to combine multiple conditions to narrow data in an SQL statement. These two operators are called as the conjunctive operators.

These operators provide a means to make multiple comparisons with different operators in the same SQL statement.

The AND Operator

The AND operator allows the existence of multiple conditions in an SQL statement's WHERE clause.

Syntax

The basic syntax of the AND operator with a WHERE clause is as follows −

SELECT column1, column2, columnN FROM table_name WHERE [condition1] AND [condition2]...AND [conditionN];

You can combine N number of conditions using the AND operator. For an action to be taken by the SQL statement, whether it be a transaction or a query, all conditions separated by the AND must be TRUE.

Example

Consider the CUSTOMERS table having the following records −

+----+----------+-----+-----------+----------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+-----------+----------+ | 1 | Ramesh | 32 | Ahmedabad | 2000.00 | | 2 | Khilan | 25 | Delhi | 1500.00 | | 3 | kaushik | 23 | Kota | 2000.00 | | 4 | Chaitali | 25 | Mumbai | 6500.00 | | 5 | Hardik | 27 | Bhopal | 8500.00 | | 6 | Komal | 22 | MP | 4500.00 | | 7 | Muffy | 24 | Indore | 10000.00 | +----+----------+-----+-----------+----------+

Following is an example, which would fetch the ID, Name and Salary fields from the CUSTOMERS table, where the salary is greater than 2000 and the age is less than 25 years −

SELECT ID, NAME, SALARY FROM CUSTOMERS WHERE SALARY > 2000 AND age < 25;

This would produce the following result −

+----+-------+----------+ | ID | NAME | SALARY | +----+-------+----------+ | 6 | Komal | 4500.00 | | 7 | Muffy | 10000.00 | +----+-------+----------+

The OR Operator

The OR operator is used to combine multiple conditions in an SQL statement's WHERE clause.

Syntax

The basic syntax of the OR operator with a WHERE clause is as follows −

SELECT column1, column2, columnN FROM table_name WHERE [condition1] OR [condition2]...OR [conditionN];

You can combine N number of conditions using the OR operator. For an action to be taken by the SQL statement, whether it be a transaction or query, the only any ONE of the conditions separated by the OR must be TRUE.

Example

Consider the CUSTOMERS table having the following records −

+----+----------+-----+-----------+----------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+-----------+----------+ | 1 | Ramesh | 32 | Ahmedabad | 2000.00 | | 2 | Khilan | 25 | Delhi | 1500.00 | | 3 | kaushik | 23 | Kota | 2000.00 | | 4 | Chaitali | 25 | Mumbai | 6500.00 | | 5 | Hardik | 27 | Bhopal | 8500.00 | | 6 | Komal | 22 | MP | 4500.00 | | 7 | Muffy | 24 | Indore | 10000.00 | +----+----------+-----+-----------+----------+

The following code block has a query, which would fetch the ID, Name and Salary fields from the CUSTOMERS table, where the salary is greater than 2000 OR the age is less than 25 years.

SELECT ID, NAME, SALARY FROM CUSTOMERS WHERE SALARY > 2000 OR age < 25;

This would produce the following result −

+----+----------+----------+ | ID | NAME | SALARY | +----+----------+----------+ | 3 | kaushik | 2000.00 | | 4 | Chaitali | 6500.00 | | 5 | Hardik | 8500.00 | | 6 | Komal | 4500.00 | | 7 | Muffy | 10000.00 | +----+----------+----------+
Update Query

The SQL UPDATE Query is used to modify the existing records in a table. You can use the WHERE clause with the UPDATE query to update the selected rows, otherwise all the rows would be affected.

Syntax

The basic syntax of the UPDATE query with a WHERE clause is as follows −

UPDATE table_name SET column1 = value1, column2 = value2...., columnN = valueN WHERE [condition];

You can combine N number of conditions using the AND or the OR operators.

Example

Consider the CUSTOMERS table having the following records −

+----+----------+-----+-----------+----------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+-----------+----------+ | 1 | Ramesh | 32 | Ahmedabad | 2000.00 | | 2 | Khilan | 25 | Delhi | 1500.00 | | 3 | kaushik | 23 | Kota | 2000.00 | | 4 | Chaitali | 25 | Mumbai | 6500.00 | | 5 | Hardik | 27 | Bhopal | 8500.00 | | 6 | Komal | 22 | MP | 4500.00 | | 7 | Muffy | 24 | Indore | 10000.00 | +----+----------+-----+-----------+----------+

The following query will update the ADDRESS for a customer whose ID number is 6 in the table.

UPDATE CUSTOMERS SET ADDRESS = 'Pune' WHERE ID = 6;

Now, the CUSTOMERS table would have the following records −

+----+----------+-----+-----------+----------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+-----------+----------+ | 1 | Ramesh | 32 | Ahmedabad | 2000.00 | | 2 | Khilan | 25 | Delhi | 1500.00 | | 3 | kaushik | 23 | Kota | 2000.00 | | 4 | Chaitali | 25 | Mumbai | 6500.00 | | 5 | Hardik | 27 | Bhopal | 8500.00 | | 6 | Komal | 22 | Pune | 4500.00 | | 7 | Muffy | 24 | Indore | 10000.00 | +----+----------+-----+-----------+----------+

If you want to modify all the ADDRESS and the SALARY column values in the CUSTOMERS table, you do not need to use the WHERE clause as the UPDATE query would be enough as shown in the following code block.

UPDATE CUSTOMERS SET ADDRESS = 'Pune', SALARY = 1000.00;

Now, CUSTOMERS table would have the following records −

+----+----------+-----+---------+---------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+---------+---------+ | 1 | Ramesh | 32 | Pune | 1000.00 | | 2 | Khilan | 25 | Pune | 1000.00 | | 3 | kaushik | 23 | Pune | 1000.00 | | 4 | Chaitali | 25 | Pune | 1000.00 | | 5 | Hardik | 27 | Pune | 1000.00 | | 6 | Komal | 22 | Pune | 1000.00 | | 7 | Muffy | 24 | Pune | 1000.00 | +----+----------+-----+---------+---------+
Delete Query

The SQL DELETE Query is used to delete the existing records from a table. You can use the WHERE clause with a DELETE query to delete the selected rows, otherwise all the records would be deleted.

Syntax

The basic syntax of the UPDATE query with a WHERE clause is as follows −

DELETE FROM table_name WHERE [condition];

You can combine N number of conditions using AND or OR operators.

Example

Consider the CUSTOMERS table having the following records −

+----+----------+-----+-----------+----------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+-----------+----------+ | 1 | Ramesh | 32 | Ahmedabad | 2000.00 | | 2 | Khilan | 25 | Delhi | 1500.00 | | 3 | kaushik | 23 | Kota | 2000.00 | | 4 | Chaitali | 25 | Mumbai | 6500.00 | | 5 | Hardik | 27 | Bhopal | 8500.00 | | 6 | Komal | 22 | MP | 4500.00 | | 7 | Muffy | 24 | Indore | 10000.00 | +----+----------+-----+-----------+----------+

The following code has a query, which will DELETE a customer, whose ID is 6.

DELETE FROM CUSTOMERS WHERE ID = 6;

Now, the CUSTOMERS table would have the following records.

+----+----------+-----+-----------+----------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+-----------+----------+ | 1 | Ramesh | 32 | Ahmedabad | 2000.00 | | 2 | Khilan | 25 | Delhi | 1500.00 | | 3 | kaushik | 23 | Kota | 2000.00 | | 4 | Chaitali | 25 | Mumbai | 6500.00 | | 5 | Hardik | 27 | Bhopal | 8500.00 | | 7 | Muffy | 24 | Indore | 10000.00 | +----+----------+-----+-----------+----------+

If you want to DELETE all the records from the CUSTOMERS table, you do not need to use the WHERE clause and the DELETE query would be as follows −

DELETE FROM CUSTOMERS;

Now, the CUSTOMERS table would not have any records.

Like Clause

The SQL LIKE clause is used to compare a value to similar values using wildcard operators. There are two wildcards used in conjunction with the LIKE operator.

  • The percent sign (%)
  • The underscore (_)

The percent sign represents zero, one or multiple characters. The underscore represents a single number or character. These symbols can be used in combinations.

Syntax

The basic syntax of % and _ is as follows −

SELECT FROM table_name WHERE column LIKE 'XXXX%' or SELECT FROM table_name WHERE column LIKE '%XXXX%' or SELECT FROM table_name WHERE column LIKE 'XXXX_' or SELECT FROM table_name WHERE column LIKE '_XXXX' or SELECT FROM table_name WHERE column LIKE '_XXXX_'

You can combine N number of conditions using AND or OR operators. Here, XXXX could be any numeric or string value.

  • WHERE SALARY LIKE '200%'
    Finds any values that start with 200.
  • WHERE SALARY LIKE '%200%'
    Finds any values that have 200 in any position.
  • WHERE SALARY LIKE '_00%'
    Finds any values that have 00 in the second and third positions.
  • WHERE SALARY LIKE '2_%_%'
    Finds any values that start with 2 and are at least 3 characters in length.
  • WHERE SALARY LIKE '%2'
    Finds any values that end with 2.
  • WHERE SALARY LIKE '_2%3'
    Finds any values that have a 2 in the second position and end with a 3.
  • WHERE SALARY LIKE '2___3'
    Finds any values in a five-digit number that start with 2 and end with 3.

Let's take a real example, consider the CUSTOMERS table having the records as shown below.

+----+----------+-----+-----------+----------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+-----------+----------+ | 1 | Ramesh | 32 | Ahmedabad | 2000.00 | | 2 | Khilan | 25 | Delhi | 1500.00 | | 3 | kaushik | 23 | Kota | 2000.00 | | 4 | Chaitali | 25 | Mumbai | 6500.00 | | 5 | Hardik | 27 | Bhopal | 8500.00 | | 6 | Komal | 22 | MP | 4500.00 | | 7 | Muffy | 24 | Indore | 10000.00 | +----+----------+-----+-----------+----------+

Following is an example, which would display all the records from the CUSTOMERS table, where the SALARY starts with 200.

SELECT * FROM CUSTOMERS WHERE SALARY LIKE '200%';

This would produce the following result −

+----+----------+-----+-----------+----------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+-----------+----------+ | 1 | Ramesh | 32 | Ahmedabad | 2000.00 | | 3 | kaushik | 23 | Kota | 2000.00 | +----+----------+-----+-----------+----------+
Top Clause

The SQL TOP clause is used to fetch a TOP N number or X percent records from a table.

Note − Not all databases support the TOP clause. For example MySQL supports the LIMIT clause to fetch limited number of records while Oracle uses the ROWNUM command to fetch a limited number of records.

Syntax

The basic syntax of the TOP clause with a SELECT statement would be as follows.

SELECT TOP number|percent column_name(s) FROM table_name WHERE [condition];

Example

Consider the CUSTOMERS table having the following records −

+----+----------+-----+-----------+----------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+-----------+----------+ | 1 | Ramesh | 32 | Ahmedabad | 2000.00 | | 2 | Khilan | 25 | Delhi | 1500.00 | | 3 | kaushik | 23 | Kota | 2000.00 | | 4 | Chaitali | 25 | Mumbai | 6500.00 | | 5 | Hardik | 27 | Bhopal | 8500.00 | | 6 | Komal | 22 | MP | 4500.00 | | 7 | Muffy | 24 | Indore | 10000.00 | +----+----------+-----+-----------+----------+

The following query is an example on the SQL server, which would fetch the top 3 records from the CUSTOMERS table.

SELECT TOP 3 * FROM CUSTOMERS;

This would produce the following result −

+----+---------+-----+-----------+---------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+---------+-----+-----------+---------+ | 1 | Ramesh | 32 | Ahmedabad | 2000.00 | | 2 | Khilan | 25 | Delhi | 1500.00 | | 3 | kaushik | 23 | Kota | 2000.00 | +----+---------+-----+-----------+---------+
Order By

The SQL ORDER BY clause is used to sort the data in ascending or descending order, based on one or more columns. Some databases sort the query results in an ascending order by default.

Syntax

The basic syntax of the ORDER BY clause is as follows −

SELECT column-list FROM table_name [WHERE condition] [ORDER BY column1, column2, .. columnN] [ASC | DESC];

You can use more than one column in the ORDER BY clause. Make sure whatever column you are using to sort that column should be in the column-list.

Example

Consider the CUSTOMERS table having the following records −

+----+----------+-----+-----------+----------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+-----------+----------+ | 1 | Ramesh | 32 | Ahmedabad | 2000.00 | | 2 | Khilan | 25 | Delhi | 1500.00 | | 3 | kaushik | 23 | Kota | 2000.00 | | 4 | Chaitali | 25 | Mumbai | 6500.00 | | 5 | Hardik | 27 | Bhopal | 8500.00 | | 6 | Komal | 22 | MP | 4500.00 | | 7 | Muffy | 24 | Indore | 10000.00 | +----+----------+-----+-----------+----------+

The following code block has an example, which would sort the result in an ascending order by the NAME and the SALARY −

SELECT * FROM CUSTOMERS ORDER BY NAME, SALARY;

This would produce the following result −

+----+----------+-----+-----------+----------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+-----------+----------+ | 4 | Chaitali | 25 | Mumbai | 6500.00 | | 5 | Hardik | 27 | Bhopal | 8500.00 | | 3 | kaushik | 23 | Kota | 2000.00 | | 2 | Khilan | 25 | Delhi | 1500.00 | | 6 | Komal | 22 | MP | 4500.00 | | 7 | Muffy | 24 | Indore | 10000.00 | | 1 | Ramesh | 32 | Ahmedabad | 2000.00 | +----+----------+-----+-----------+----------+

The following code block has an example, which would sort the result in the descending order by NAME.

SELECT * FROM CUSTOMERS ORDER BY NAME DESC;

This would produce the following result −

+----+----------+-----+-----------+----------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+-----------+----------+ | 1 | Ramesh | 32 | Ahmedabad | 2000.00 | | 7 | Muffy | 24 | Indore | 10000.00 | | 6 | Komal | 22 | MP | 4500.00 | | 2 | Khilan | 25 | Delhi | 1500.00 | | 3 | kaushik | 23 | Kota | 2000.00 | | 5 | Hardik | 27 | Bhopal | 8500.00 | | 4 | Chaitali | 25 | Mumbai | 6500.00 | +----+----------+-----+-----------+----------+
Group By

The SQL GROUP BY clause is used in collaboration with the SELECT statement to arrange identical data into groups. This GROUP BY clause follows the WHERE clause in a SELECT statement and precedes the ORDER BY clause.

Syntax

The basic syntax of a GROUP BY clause is shown in the following code block. The GROUP BY clause must follow the conditions in the WHERE clause and must precede the ORDER BY clause if one is used.

SELECT column1, column2 FROM table_name WHERE [ conditions ] GROUP BY column1, column2 ORDER BY column1, column2

Example

Consider the CUSTOMERS table having the following records −

+----+----------+-----+-----------+----------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+-----------+----------+ | 1 | Ramesh | 32 | Ahmedabad | 2000.00 | | 2 | Khilan | 25 | Delhi | 1500.00 | | 3 | kaushik | 23 | Kota | 2000.00 | | 4 | Chaitali | 25 | Mumbai | 6500.00 | | 5 | Hardik | 27 | Bhopal | 8500.00 | | 6 | Komal | 22 | MP | 4500.00 | | 7 | Muffy | 24 | Indore | 10000.00 | +----+----------+-----+-----------+----------+

If you want to know the total amount of the salary on each customer, then the GROUP BY query would be as follows.

SELECT NAME, SUM(SALARY) FROM CUSTOMERS GROUP BY NAME;

This would produce the following result −

+----------+-------------+ | NAME | SUM(SALARY) | +----------+-------------+ | Chaitali | 6500.00 | | Hardik | 8500.00 | | kaushik | 2000.00 | | Khilan | 1500.00 | | Komal | 4500.00 | | Muffy | 10000.00 | | Ramesh | 2000.00 | +----------+-------------+

Now, let us look at a table where the CUSTOMERS table has the following records with duplicate names −

+----+----------+-----+-----------+----------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+-----------+----------+ | 1 | Ramesh | 32 | Ahmedabad | 2000.00 | | 2 | Ramesh | 25 | Delhi | 1500.00 | | 3 | kaushik | 23 | Kota | 2000.00 | | 4 | kaushik | 25 | Mumbai | 6500.00 | | 5 | Hardik | 27 | Bhopal | 8500.00 | | 6 | Komal | 22 | MP | 4500.00 | | 7 | Muffy | 24 | Indore | 10000.00 | +----+----------+-----+-----------+----------+

Now again, if you want to know the total amount of salary on each customer, then the GROUP BY query would be as follows −

SELECT NAME, SUM(SALARY) FROM CUSTOMERS GROUP BY NAME;

This would produce the following result −

+---------+-------------+ | NAME | SUM(SALARY) | +---------+-------------+ | Hardik | 8500.00 | | kaushik | 8500.00 | | Komal | 4500.00 | | Muffy | 10000.00 | | Ramesh | 3500.00 | +---------+-------------+
Distinct

The SQL DISTINCT keyword is used in conjunction with the SELECT statement to eliminate all the duplicate records and fetching only unique records.

There may be a situation when you have multiple duplicate records in a table. While fetching such records, it makes more sense to fetch only those unique records instead of fetching duplicate records.

Syntax

The basic syntax of DISTINCT keyword to eliminate the duplicate records is as follows −

SELECT DISTINCT column1, column2,.....columnN FROM table_name WHERE [condition]

Example

Consider the CUSTOMERS table having the following records −

+----+----------+-----+-----------+----------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+-----------+----------+ | 1 | Ramesh | 32 | Ahmedabad | 2000.00 | | 2 | Khilan | 25 | Delhi | 1500.00 | | 3 | kaushik | 23 | Kota | 2000.00 | | 4 | Chaitali | 25 | Mumbai | 6500.00 | | 5 | Hardik | 27 | Bhopal | 8500.00 | | 6 | Komal | 22 | MP | 4500.00 | | 7 | Muffy | 24 | Indore | 10000.00 | +----+----------+-----+-----------+----------+

First, let us see how the following SELECT query returns the duplicate salary records.

SELECT SALARY FROM CUSTOMERS ORDER BY SALARY;

This would produce the following result, where the salary (2000) is coming twice which is a duplicate record from the original table.

+----------+ | SALARY | +----------+ | 1500.00 | | 2000.00 | | 2000.00 | | 4500.00 | | 6500.00 | | 8500.00 | | 10000.00 | +----------+

Now, let us use the DISTINCT keyword with the above SELECT query and then see the result.

SELECT DISTINCT SALARY FROM CUSTOMERS ORDER BY SALARY;

This would produce the following result where we do not have any duplicate entry.

+----------+ | SALARY | +----------+ | 1500.00 | | 2000.00 | | 4500.00 | | 6500.00 | | 8500.00 | | 10000.00 | +----------+
Conclusion

This concludes the basic introduction to SQL language. There is still a lot more that you can do with SQL. To learn further and cover more advanced topics I would recomend the following resources.