Get error column name in Data Flow Task in SSIS

How to get error column name and error description in Data Flow Task in SSIS

During execution of an SSIS package, when a bad row comes in the data flow task, the task gets failed. However, most of the components (source, transformation, and destination) in the data flow task exposes an error output path which can be configured to redirect these invalid records to the error output path and then we can log these redirected bad records using a destination component. Once all these bad records get logged, we can investigate the root cause of the errors, fix them, and then these bad rows can be reintroduced again.

The error output path in the data flow task contains two four byte int columns, ErrorColumn and ErrorCode, which represent the lineage id of the error column and the error code respectively. The values in both these columns are numeric which make it hard to understand the root cause of the error. Also, SSIS does not provide a direct way to extract the error description and error column name out of the box. To get the error description and error column name in the data flow task in SSIS, we need to use the script component as a transformation component.

In previous versions of SSIS, though it was a bit tricky, but we could have been able to extract the error description using script component. However, extracting the column name from the lineage id was far from the simple.

SQL Server 2016 has introduced a method named “GetIdentificationStringByID” of interface “IDTSComponentMetaData130” which helps us to find the name of the error column from the lineage id of the column. We can use this method inside a script component in a data flow task to extract the name of the column name from the column id.

Get error column name in SSIS – Step by Step

To demonstrate this, we are going to extract all the records from the table “tbl_SourceEmpTable”, then we will transform the DOB column, which contains the date of birth of the employee as a string value, into DB_DATE datatype and will insert the records in the table “tbl_DestinationEmpTable”. Also, we will configure the derived column’s error output to log the invalid rows with error column name and error description in the table “package_InvalidRecords”.

We can use below script to create the test database and the source table “tbl_SourceEmpTable” with sample rows:

--Create database TestDB
IF EXISTS(SELECT 'X' FROM sys.databases WHERE NAME LIKE 'TestDB')
 DROP DATABASE TestDB
CREATE DATABASE TestDB
GO

--Start using TestDB
USE TestDB
GO

--Create source table
IF(OBJECT_ID('dbo.tbl_SourceEmpTable') IS NOT NULL)
DROP TABLE dbo.tbl_SourceEmpTable
GO
CREATE TABLE dbo.tbl_SourceEmpTable
(
ID INT,
EmpName VARCHAR(100),
DOB VARCHAR(50)
)
GO

--Add sample rows
INSERT INTO dbo.tbl_SourceEmpTable
VALUES(1, 'John', '01/01/1980'),
(2, 'Mark', '05/08/1978'),
(3, 'Mery', '08/10/1972'),
(4, 'Clark', '01/20/1990'),
(5, 'Michael', 'NA')
GO

Have a look at the source table data below:

Source table with sample rows
Source table with sample rows

Above, we can see that the DOB column is of varchar(50) data type and contains a string “NA” in the last row which will be failed in case we will convert this value into date data type.

Now, create the destination table “tbl_DestinationEmpTable” below:

IF(OBJECT_ID('dbo.tbl_DestinationEmpTable') IS NOT NULL)
DROP TABLE dbo.tbl_DestinationEmpTable
GO
CREATE TABLE dbo.tbl_DestinationEmpTable
(
ID INT,
EmpName VARCHAR(100),
DOB DATE
)
GO

Finally, add the table “package_InvalidRecords” to store the error details for the package:

IF(OBJECT_ID('dbo.package_InvalidRecords') IS NOT NULL)
DROP TABLE dbo.package_InvalidRecords
GO
CREATE TABLE dbo.package_InvalidRecords
(
rowID INT IDENTITY(1, 1) NOT NULL PRIMARY KEY,
packageName VARCHAR(512) NULL,
taskName VARCHAR(512) NULL,
errorCode INT NULL,
errorDescription VARCHAR(4000) NULL,
errorColumn INT NULL,
errorColumnName VARCHAR(512),
errorLogDate DATETIME NULL,
packageExecutionDate DATETIME NULL
)

Now that we have created the source, destination, and the error log tables, next, we will create an SSIS package and will configure the package to log the error column name and error description in a table named “package_InvalidRecords”.

Let’s follow below steps:

  • Open visual studio and create a new Integrated services project.
  • Rename the “Package1.dtsx” to “Get Error Column Name.dtsx”.
  • Add a data flow task to the package “Get Error Column Name.dtsx”.
  • Create an OLEDB connection to connect to the TestDB database.
  • Open the data flow task and add an OLEDB source to read the data from table “tbl_SourceEmpTable”. Choose SQL Command as data access mode in the OLEDB source and in SQL command text, use the query “SELECT ID, EmpName, DOB FROM tbl_SourceEmpTable”. The source component should look like below:
OLEDB Source component
OLEDB Source component
  • Now, add a data conversion component in the pipeline and configure this as below:
Data conversion component
Data conversion component
  • Add a destination component and configure it to insert the data in the table “tbl_DestinationEmpTable”.
  • Add a script component as a transformation component in the data flow task and map the data conversion error output to the script component. Add errorColumnName and errorDescription column as output columns in “Output 0” as below:
Add output columns
Add output columns
  • Now, click on the “Edit Script” button of the script component and inside the method “Input0_ProcessInputRow”, add below lines of code:
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
Row.errorDescription = this.ComponentMetaData.GetErrorDescription(Row.ErrorCode);

try
{
var componentMetaData130 = this.ComponentMetaData as IDTSComponentMetaData130;
if (componentMetaData130 != null)
{
Row.errorColumnName = componentMetaData130.GetIdentificationStringByID(Row.ErrorColumn);
}
}
catch(Exception ex)
{
Row.errorColumnName = "Unable to fetch column name";
}
}
  • Next, add a derived column transformation to add some audit details which could be useful during error logging as below:
Add audit columns using derived column transformation
Add audit columns using derived column transformation
  • Finally, add another OLEDB destination component and configure it to log the error rows into the table “package_InvalidRecords”. The package should look like below:
Get error column name in Data Flow Task in SSIS - Package
Get error column name in Data Flow Task in SSIS – Package
  • Now, execute the package and see the final output which should be as below:
Package after execution
Package after execution

 

I have uploaded the full visual studio solution which can be downloaded and used in case you want to have hands on experience.

To download demo solution, click here.

Thanks for the reading and please do share and share your input on this post.

Rate This
[Total: 2 Average: 5]

2 thoughts on “Get error column name in Data Flow Task in SSIS”

  1. Hi, I am stuck in 2008R2 and the C# doesn’t quite drop in. Is there something else I have to change?

    Cheers,
    Dave

Leave a Comment

Your email address will not be published. Required fields are marked *


The reCAPTCHA verification period has expired. Please reload the page.

This site uses Akismet to reduce spam. Learn how your comment data is processed.