Now that we have been familiar with Create, Alter and Drop – Database and Table, it is time to be familiar with T-SQL statements. Let’s start this chapter with the definition of T-SQL followed by a detailed discussion on T-SQL’s SELECT statement.
Transact SQL (T-SQL)
Transact SQL (T-SQL) is a proprietary extension of SQL (structured Query Language) by Microsoft and Sybase to communicate with Microsoft SQL Server. T-SQL has support for procedural programming using control flow (if-else, Case), loop (while), cursor, recursion and local variable, etc. It has many inbuilt functions to support string operation, date and time processing, mathematical calculation, aggregate processing and many more.
In short, T-SQL is capable to deal all kinds of SQL Server requirements like; creating database, adding tables, creating relationship, constraints, managing users and roles, creating a backup, restore database etc.
T-SQL Statements are the basic constructs of T-SQL Programming and in this chapter, we are going to discuss basic T-SQL statements which are as below:
- SELECT – Used to fetch data from a table or multiple tables.
- INSERT – Used to insert rows in an existing table. (Would be covered in the next chapter)
- UPDATE – Used to update row(s) in a table. (Would be covered in the next chapter)
- DELETE – Used to delete row(s) from a table. (Would be covered in the next chapter)
The SELECT statement is used to fetch data from database tables in the form of result set which is composed of rows and columns. It can fetch data from a single table, or from multiple tables using JOIN, UNION, INTERSECTION and EXCEPT. It also enables a way to define the columns (attributes) and rows (tuples) to be retrieved.
We have already downloaded and attached AdventureWorks2014 database in chapter SQL Server Tutorial – Prerequisite. If not downloaded yet, go to the chapter and follow the given instructions.
The SELECT statement is composed of multiple parts and a typical SELECT has below syntax:
GROUP BY <column-list>
ORDER BY <column-list>
<column-list> is the comma separated list of required columns in the output. E.g. SELECT [column-name-1], [column-name-2], [column-name-3] ……[column-name-n].
<schema-name>.<table-name> <schema-name> is the name of the schema and <table-name> is the name of the table. Mind the DOT (.) operator between schema name and table name. E.g. FROM [HumanResources].[Employee].
<filter> Filter is used to include only those rows from the table which are satisfying the given filter criteria. We can have multiple filters combined with AND and OR operators. E.g. WHERE [column-name-1] = ‘some value’ AND [column-name-2] = ‘some value’ AND [column-name-3] = ‘some value’.
<aggregate-filter> Aggregate filter applies to aggregated data unlike “WHERE” which filters the data while reading from the table. We can use aggregated filters with the aggregated result set. E.g. HAVING COUNT(column-name-1) > 10.
Lets have a deeper look on each clause which can be used with the SELECT statement from above syntax.
SELECT clause is used to specify the columns to be retrieved from the database. The simplest way of getting a data using SELECT clause is as:
SELECT 1 AS [DummyColumn]
In the above query, we are retrieving constant value 1 as a single column and single row result set. “AS” keyword has been used to name the column as [DummyColumn].
We are not using any other clause except SELECT from the above SELECT statement syntax. We can retrieve a constant or variable value as above in SQL Server.
To get multiple values in multiple columns:
--Query 1 SELECT 1 AS [Column1], 2 AS [Column2], 1 + 2 AS [Column3] --Query 2 SELECT 'X' AS [Column X], 'Y' AS [Column Y], 'X' + 'Y' AS [Column X + Y]
Mind the square brackets surrounding the name of the columns. Enclose column names in between square brackets if the column name contains any space or is a SQL keyword otherwise using square brackets is optional.
Commenting code in T-SQL
In T-SQL, we can use double hyphens (–) to comment a single line of code (as we did in the above example code) and to comment multiple line at once we can simply enclose the codes between (/* and */) as:
/* Commented code goes here */ SELECT 'Welcome to t-sql programming' AS [Column1]
The FROM clause is used to specify the name of the object from which data needs to be retrieved. The object name can be the name of a table, derived table, common table expression (CTE), view or function name.
To extract all data from all columns of Product table of production schema:
SELECT * FROM Production.Product
We have used asterisks (*) to get all columns in above query. Using * is not a recommended way to retrieve data from a table as it extracts all columns with their data which is not always needed.
We should use the name of each column explicitly in the query even if all the columns are required in the result set because if a column name gets changed, we get the error at the time of query execution which does not occur in case of asterisk (*). As asterisk fetches all the current column names of the table, any change in column names would lead to an error at application level.
To fetch selected columns from a table, use below query:
SELECT ProductID, Name, ProductNumber, MakeFlag, FinishedGoodsFlag, Color, SafetyStockLevel, ReorderPoint, StandardCost, ListPrice FROM Production.Product
Remember, we can rename any column in output result set using AS clause.
WHERE clause provides a way to put a filter to the table at row level while reading the data. Using appropriate WHERE clause speeds up the query by reducing the amount of data to be read from the table. Generally, we don’t need to extract all the data from a table. We need few columns which satisfy few given filter conditions.
--To get the list of all black color products SELECT ProductID, Name, ProductNumber, Color, SafetyStockLevel, ReorderPoint, StandardCost, ListPrice FROM Production.Product WHERE Color = 'Black' --To get the list of all black color products with list price greater than 50 SELECT ProductID, Name, ProductNumber, Color, SafetyStockLevel, ReorderPoint, StandardCost, ListPrice FROM Production.Product WHERE Color = 'Black' AND ListPrice > 50 --To get the list of all black or blue color products SELECT ProductID, Name, ProductNumber, Color, SafetyStockLevel, ReorderPoint, StandardCost, ListPrice FROM Production.Product WHERE Color = 'Black' OR Color = 'Blue'
We can combine multiple logical conditions with AND and OR logical operators as we did above. We will cover logical operators in more detail later in this tutorial.
GROUP BY clause
GROUP BY clause is used to summarize the result set by one or more columns. It aggregates the data as per the available values in columns. Lets have a few sample queries to make it clear:
--Query 1 --ListPrice total for all Black and Blue color products SELECT Color, SUM(ListPrice) AS TotalListPrice FROM Production.Product WHERE Color = 'Black' OR Color = 'Blue' GROUP BY Color --Query 2 --Size wise ListPrice total for all Black and Blue color products SELECT Color, Size, SUM(ListPrice) AS TotalListPrice FROM Production.Product WHERE Color = 'Black' OR Color = 'Blue' GROUP BY Color, Size
In the above example, in Query 1, ListPrice is summed and color column values are grouped to summarize the outcome of query by color. We get only two rows, one for black and another for blue color, though we have many rows of black and blue color in the Product table. The result set has been aggregated by color which we have in the group by clause. In Query 2, the result set is grouped by Color and Size whereas ListPrice is summed.
We cannot have a non aggregated column with an aggregated column in select list. Either we can have all aggregated columns or there should not be an aggregation on any column. For example, if we have column1 and column2 in the select list and grouping the result set by column1 only, it would raise an error. We can resolve the error either by putting both columns column1 and column2 in the group by clause or by removing both columns from the group by clause.
HAVING clause is used to put a filter on aggregated or grouped data. If we want to list only those colors whose sum of ListPrice is more than 3000, use HAVING clause.
Having clause is more resource intensive than WHERE clause as we need to read all the data, aggregate it, then apply a filter unlike WHERE clause which filters the data at row level and reduces the I/O by reading less data. But, sometimes we need to filter aggregated date instead of row level as like below query.
SELECT Color, Size, SUM(ListPrice) AS TotalListPrice FROM Production.Product WHERE Color = 'Black' OR Color = 'Blue' GROUP BY Color, Size HAVING SUM(ListPrice) > 3000
Above query outcomes all the rows grouped by color and size, which have the sum of ListPrice greater than 3000.
ORDER BY clause
ORDER BY clause is used to sort the output. We can use single or multiple columns to order the result set. Have a look on below sample queries:
--Order the result set by TotalListPrice in ascending order SELECT Color, Size, SUM(ListPrice) AS TotalListPrice FROM Production.Product WHERE Color = 'Black' OR Color = 'Blue' GROUP BY Color, Size HAVING SUM(ListPrice) > 3000 ORDER BY TotalListPrice ASC --Order the result set by TotalListPrice in descending order SELECT Color, Size, SUM(ListPrice) AS TotalListPrice FROM Production.Product WHERE Color = 'Black' OR Color = 'Blue' GROUP BY Color, Size HAVING SUM(ListPrice) > 3000 ORDER BY TotalListPrice DESC --Order the result set by TotalListPrice in dscending then by Size in descending and then by Color ascending SELECT Color, Size, SUM(ListPrice) AS TotalListPrice FROM Production.Product WHERE Color = 'Black' OR Color = 'Blue' GROUP BY Color, Size HAVING SUM(ListPrice) > 3000 ORDER BY TotalListPrice DESC, Size DESC, Color ASC --List of products order by product name, default in ascending order SELECT Name FROM Production.Product ORDER BY Name --List of products order by product ID, default in ascending order SELECT ProductID, Name AS ProductName FROM Production.Product ORDER BY ProductID --List of products order by color in descending order then by product name in ascending order by default SELECT Color, Name AS ProductName FROM Production.Product ORDER BY Color DESC, ProductName
To sort the result set with a column in ascending or descending order, we use ASC and DESC keywords with ORDER BY clause which defaults to ascending (ASC). That is, if we omit the ASC or DESC keyword after ORDER BY, by default, it is ASC.
SELECT statement can have FROM, WHERE, GROUP BY, HAVING and ORDER BY clauses or it can just have a few of them but in below specified order only. If a clause is not required, we can skip it in SELECT statement, but we cannot change the order in which they are used. We can correlate this with a stair in which we can jump some steps, but we cannot change the order of the steps of the stair. Have a look on below image:
Thank you for reading and share this to your friends if you like this tutorial series. Please rate this post and share your valuable suggestion in the comment section if you like.