Archives: SQL Server Tutorials

These SQL Server tutorials are written step by step, from novice to advanced level.


Python use case – Import zipped file without unzipping it in SSIS and SQL Server – SQL Server 2017

Import zipped CSV file without unzipping it in SSIS using SQL Server 2017

SQL Server Integration Services (SSIS) is one of the most popular ETL tools. It has many built-in components that can be used to automate enterprise ETL (Extract, Transform, and Load) processes. If we need a custom component that is not available in SSIS, we can create it by writing our own C# code in a Script Task or Script Component.

In this post, we are going to explore how we can read a zipped CSV file and load it into SQL Server without unzipping it, using SSIS along with SQL Server 2017. Reading a zipped file directly (without unzipping it) saves the time otherwise needed to write the extracted text file to the physical disk and then read it back. As of now, we don’t have any built-in component in …
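The idea of reading a CSV member straight out of an archive can be sketched in plain Python. This is only an illustration under stated assumptions, not the post's SSIS package: the helper `read_zipped_csv`, the member name `people.csv`, and the sample rows are hypothetical, and the archive is built in memory so the sketch is self-contained.

```python
import csv
import io
import zipfile


def read_zipped_csv(zip_source, member_name):
    """Return the rows of one CSV member of a zip archive as a list of dicts."""
    with zipfile.ZipFile(zip_source) as archive:
        # archive.open() returns a binary stream that is decompressed on the fly,
        # so the extracted file is never written to disk.
        with archive.open(member_name) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


# Hypothetical sample archive, created in memory for the demonstration.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("people.csv", "PersonID,FullName\n1,Smith\n2,Adam\n")
buffer.seek(0)

rows = read_zipped_csv(buffer, "people.csv")
print(rows)
```

`zip_source` can also be a path to a `.zip` file on disk; the rows returned here would still have to be loaded into a SQL Server table in a separate step.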


Find and Delete all duplicate rows but keep one

In this post “Find and Delete all duplicate rows but keep one”, we are going to discuss how we can find and delete all the duplicate rows of a table except one. Assume that we have a table named dbo.tbl_Sample which has four columns – EmpId, EmpName, Age, and City. This table has some duplicate data (identical in all four columns), and every copy except one row needs to be deleted. To demonstrate this, let’s create the dummy table with some sample data.

Here is the code to create the dummy table with sample data:

IF OBJECT_ID('dbo.tbl_Sample') IS NOT NULL
	DROP TABLE dbo.tbl_Sample
GO

CREATE TABLE dbo.tbl_Sample
(
	EmpId INT,
	EmpName VARCHAR(256),
	Age TINYINT,
	City VARCHAR(50)
)
GO

INSERT INTO dbo.tbl_Sample
(EmpId, EmpName, Age, City)
VALUES(1, 'Smith', 30, 'New York'),
(1, 'Smith', 30, 'New York'),
(1, 'Smith', 30, 'New York'),
(2, 'Adam', 35, 'California'),
(2, 'Adam', 35, 
…
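The "keep one row per duplicate group" logic can be tried out without a SQL Server instance. The sketch below uses Python's built-in sqlite3 module as a stand-in, so it is not the T-SQL from the post: it relies on SQLite's implicit `rowid`, which SQL Server does not have (in T-SQL, a common alternative is to number the rows with `ROW_NUMBER()` partitioned by all four columns and delete those numbered above 1). The sample rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_Sample (EmpId INTEGER, EmpName TEXT, Age INTEGER, City TEXT)"
)
# Illustrative rows: three copies of one employee and two of another.
conn.executemany(
    "INSERT INTO tbl_Sample VALUES (?, ?, ?, ?)",
    [
        (1, "Smith", 30, "New York"),
        (1, "Smith", 30, "New York"),
        (1, "Smith", 30, "New York"),
        (2, "Adam", 35, "California"),
        (2, "Adam", 35, "California"),
    ],
)

# Within each group of identical rows, keep the row with the smallest rowid
# and delete every other copy.
conn.execute(
    """
    DELETE FROM tbl_Sample
    WHERE rowid NOT IN (
        SELECT MIN(rowid)
        FROM tbl_Sample
        GROUP BY EmpId, EmpName, Age, City
    )
    """
)

remaining = conn.execute(
    "SELECT EmpId, EmpName, Age, City FROM tbl_Sample ORDER BY EmpId"
).fetchall()
print(remaining)
conn.close()
```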

Import CSV file into SQL Server using T-SQL query

Sometimes, we need to read an external CSV file using a T-SQL query in SQL Server. In such scenarios we cannot use the Import and Export Wizard, because we need the result set in the middle of a batch of other queries. Instead, we can use the BULK INSERT command, which imports a data file directly into a SQL Server table.

Let’s have a look at the sample CSV file which we want to import into a SQL table.

Sample CSV File

To download the sample CSV file, click here. The CSV file above uses a comma as the column delimiter and contains 6 columns:

PersonID – Stores the Id of the person.

FullName – Stores the full name of the person.

PreferredName – Stores the preferred name of the person.

SearchName – Stores …


Python use case – Convert rows into comma separated values in a column – SQL Server 2017

In this post, we are going to learn how we can leverage Python in SQL Server to generate comma-separated values.

If we want to combine all the values of a single column, it is fairly easy, as we can use the COALESCE function to do that. Here is a reference to the already existing post. But what if we need the comma-separated values in one column alongside other columns? In that scenario, the COALESCE approach does not work.

We can get comma-separated values in a column along with other columns using a FOR XML PATH query wrapped inside a subquery, but there we also need to take care of XML-encoded characters: < and > come back as &lt; and &gt;.

Now, with Python’s integration with SQL Server 2017, this can be achieved easily and efficiently, as we do not have to rely on subqueries and …
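A minimal sketch of the grouping step in plain Python is given here. It is not the post's script: the helper `combine_values` and the (City, EmpName) sample rows are hypothetical, and wiring it into SQL Server is left out.

```python
from collections import defaultdict


def combine_values(rows, separator=", "):
    """Group (key, value) pairs by key and join each group's values into one string."""
    grouped = defaultdict(list)
    for key, value in rows:
        grouped[key].append(value)
    # Dictionaries preserve insertion order, so keys appear in first-seen order.
    return [(key, separator.join(values)) for key, values in grouped.items()]


# Hypothetical (City, EmpName) rows.
rows = [
    ("New York", "Smith"),
    ("New York", "John"),
    ("California", "Adam"),
]

combined = combine_values(rows)
print(combined)
```

Because the values are joined as ordinary strings, characters such as < and > pass through unchanged, so no decoding step is needed afterwards.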


Python use case – Dynamic UNPIVOT using pandas – SQL Server 2017

In this post, we are going to learn how we can leverage the power of Python’s pandas module in SQL Server 2017. pandas is an open source Python library that provides the data frame, a data structure similar to a SQL table, with support for vectorized operations for high performance. To know more about pandas, you can click here.

Let’s discuss the problem we face while using the SQL UNPIVOT clause, especially when we have a large number of columns. We can use the UNPIVOT clause in SQL Server to convert columns into row values and normalize the output result set. To use UNPIVOT, we need to specify each column name as a fixed value while writing the T-SQL query. This becomes tedious if we need to specify a large number of columns in the UNPIVOT clause. Also, if the column names are not fixed (dynamic in nature), …
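To make "dynamic" concrete, here is a small standard-library sketch in which the columns to unpivot are discovered at run time instead of being listed by hand. It is a plain-Python stand-in, not the post's pandas code (in pandas, `pandas.melt` plays this role for data frames); the `unpivot` helper, the output column names `ColumnName` and `Value`, and the sample data are assumptions.

```python
def unpivot(rows, id_columns):
    """Turn every non-id column of each row into its own (name, value) row."""
    result = []
    for row in rows:
        ids = {col: row[col] for col in id_columns}
        for col, value in row.items():
            # Any column that is not an identifier is unpivoted, whatever its name.
            if col not in id_columns:
                result.append({**ids, "ColumnName": col, "Value": value})
    return result


# Hypothetical wide-format rows; Q1 and Q2 are never named in unpivot().
sales = [
    {"Product": "A", "Q1": 10, "Q2": 20},
    {"Product": "B", "Q1": 5, "Q2": 7},
]

long_rows = unpivot(sales, id_columns=["Product"])
for item in long_rows:
    print(item)
```

Adding a Q3 column to the input would produce extra output rows without any change to the function, which is the property a hand-written UNPIVOT column list lacks.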


Handling special characters in Hive (using encoding properties)

If we read a text file that contains non-English characters into a Hive table without using the appropriate text encoding, these characters might be loaded as junk symbols (like boxes – �). To get these characters in their original form, we need to use the correct character encoding. In this post “Handling special characters in Hive (using encoding properties)”, we are going to learn how we can read special characters in Hive using the encoding properties available with the TBLPROPERTIES clause.

To demonstrate it, we will be using a dummy text file which is in ANSI text encoding format and contains Spanish characters. Also, we will be using the Microsoft Azure cloud platform to instantiate an on-demand HDInsight cluster that makes it easy to write Hive queries. We will upload the dummy text file to Azure Data Lake Storage and then we will …
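The junk-symbol effect is not specific to Hive and can be reproduced in a few lines of Python. The sketch assumes that "ANSI" means the Windows-1252 code page (`cp1252`), which is the usual meaning on Western European Windows systems but is not stated in the post; the sample text is made up.

```python
# Hypothetical Spanish sample text.
text = "Niño, señor, José"

# Bytes as they would be stored in an ANSI (assumed Windows-1252) text file.
ansi_bytes = text.encode("cp1252")

# Decoding with the wrong encoding replaces each non-ASCII character
# with the U+FFFD replacement character.
wrong = ansi_bytes.decode("utf-8", errors="replace")

# Decoding with the encoding the file was written in restores the text.
right = ansi_bytes.decode("cp1252")

print(wrong)
print(right)
```

Declaring the file's real encoding to the reader, rather than converting the file, is the same remedy the post applies on the Hive side.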


Skip header and footer rows in Hive

In this post “Skip header and footer rows in Hive”, we are going to learn how we can ignore a few header and footer records in Hive without temporarily loading or reading these records into another table or a view. If you want to read more about Hive, visit my post “Preserve Hive metastore in Azure HDInsight”, which explains Hive QL in detail.

Skip header and footer records in Hive

We can ignore N rows from the top and the bottom of a text file, without loading that file in Hive, using the TBLPROPERTIES clause. The TBLPROPERTIES clause provides various features which can be set as per our need. In this scenario, it can be used to handle files that are generated with additional header and footer records. Let’s have a look at the sample file below:

Sample text file

Now assume that we …
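In Hive, the table properties used for this are `skip.header.line.count` and `skip.footer.line.count`. Their effect can be mimicked in plain Python, as in the sketch below; the helper name `skip_header_footer` and the sample lines are hypothetical and are not taken from the post's sample file.

```python
def skip_header_footer(lines, header_count, footer_count):
    """Return lines without the first header_count and last footer_count entries."""
    # Compute the end index explicitly: a slice ending at -0 would be empty.
    end = len(lines) - footer_count
    return lines[header_count:end]


# Hypothetical file content: two header lines, three data rows, one footer line.
lines = [
    "Report generated by export job",
    "Id,Name",
    "1,Smith",
    "2,Adam",
    "3,John",
    "Total rows: 3",
]

data_rows = skip_header_footer(lines, header_count=2, footer_count=1)
print(data_rows)
```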


Preserve Hive metastore in Azure HDInsight

In this blog “Preserve Hive metastore in Azure HDInsight”, we are going to learn how we can preserve the Hive metadata while working with the Azure HDInsight services. Microsoft Azure HDInsight is an on-demand, managed, open source Big Data analytics service for enterprises. We can provision a cluster on demand in a few minutes, perform the computations, and then shut it down to avoid charges, so we pay only for what we use. You can visit this link to know more about Azure HDInsight.

What is Hive?

Apache Hive provides a SQL-like Big Data query language which is used as an abstraction over MapReduce jobs. A Hive query is converted into an equivalent MapReduce job without the need to write low-level code. This increases the productivity of a developer to a great extent. If you want to read more about Hive …


Connecting Python 3 to SQL Server 2017 using pyodbc

In this post “Connecting Python 3 to SQL Server 2017 using pyodbc”, we are going to learn how we can connect Python 3 to SQL Server 2017 to execute SQL queries. We can adjust the settings to connect to other versions of SQL Server as well. If you are interested in knowing more about Python and why you should learn it, visit our post “Why Python and how to use it in SQL Server 2017”.

What is pyodbc?

pyodbc is an open source Python module that implements the DB API 2.0 specification. It provides a convenient interface for connecting to any database that accepts an ODBC connection. To use the pyodbc module, we first need to install it. Click here for more information on pyodbc.

pip install pyodbc module

We can use the pip install command to install the pyodbc module in Python 3 on a Windows machine. Before executing the …
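Once pyodbc is installed, a connection could look roughly like the sketch below. It rests on several assumptions that the post does not confirm: the ODBC driver name (`ODBC Driver 17 for SQL Server`), Windows authentication (`Trusted_Connection=yes`), and the server and database names, all of which are placeholders to adjust for your environment. The two helper functions are hypothetical.

```python
def build_connection_string(server, database, driver="ODBC Driver 17 for SQL Server"):
    """Assemble an ODBC connection string that uses Windows authentication."""
    return (
        f"DRIVER={{{driver}}};"
        f"SERVER={server};"
        f"DATABASE={database};"
        "Trusted_Connection=yes;"
    )


def fetch_server_version(connection_string):
    """Connect, run SELECT @@VERSION, and return the version text."""
    # Imported here so that build_connection_string() can be used
    # even on a machine where pyodbc is not installed.
    import pyodbc

    connection = pyodbc.connect(connection_string)
    try:
        cursor = connection.cursor()
        cursor.execute("SELECT @@VERSION;")
        return cursor.fetchone()[0]
    finally:
        connection.close()


# Placeholder server and database names.
connection_string = build_connection_string("localhost", "master")
print(connection_string)
# With a reachable SQL Server instance:
# print(fetch_server_version(connection_string))
```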


Free C# Entity Generator or C# Class Generator

C# Entity Generator or Class Generator Tool

A few years back, I created a tool called “C# Entity Generator”. Though it was created for an older version of Visual Studio, it can still be useful. I am sharing this tool here so that anyone can download and use it for free. It does not require any installation; it is a copy-and-paste utility.

C# Entity Generator is a tool which can be used to generate C# Entity Layer classes without writing a single line of code. If we are using a three-tier architecture in our applications and frequently need to create entities to map the output of relational queries, this tool can be very useful. It increases the developer’s productivity, especially if the entity contains a large set of properties. We only need to define once the data types, their prefixes as per the naming convention guidelines, and their …