Saturday, September 26, 2015

How to use Merge/Merge Join Transformation Without a SORT Transformation

The Merge/Merge Join transformation needs to be preceded by a Sort transformation. However, it is good SSIS practice to sort the data at the source instead, since Sort transformations can slow down the ETL process when dealing with large volumes of data.

Assuming that the data source is an OLE DB Source, a simple ORDER BY clause in the SQL statement of the source will sort the data at the source itself.
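For instance, assuming a hypothetical employee table keyed on emp_id, the source query could look like this (table and column names are illustrative, not from a real schema):

```sql
-- ORDER BY in the OLE DB Source query sorts the rows in the database,
-- so no Sort transformation is needed in the data flow
SELECT emp_id, emp_name, emp_location
FROM dbo.employee
ORDER BY emp_id;
```

The ORDER BY column(s) must match the columns flagged with SortKeyPosition in the Advanced Editor, in the same order.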

In order to bypass the Sort transformation, two properties need to be configured on the source:
IsSorted
SortKeyPosition

These two properties are configured through the Advanced Editor of the source.




Once the IsSorted property is set to True, configure the SortKeyPosition. To do so, go to the output columns.

Select the column on which the data is sorted in the source and set its SortKeyPosition to 1.


This configuration tells the data flow engine that the data has already been sorted by emp_id at the source.






How to use a Cache Transform with Excel/CSV Source File

In order to compare the data from two Excel files, a Lookup transformation is used. Add a data flow task with an Excel source.



Configure the Cache Transform for the Excel source.



Set the index position of the source column that will be looked up to 1.

Configure the column mappings between the source Excel file and the cache.


The Cache Connection Manager is now configured to be used in the Lookup transformation. The same procedure is followed for a CSV source file.

Tuesday, September 22, 2015

Using OLE DB Command to insert new records & update Existing Records in SQL SERVER Database Table

The OLE DB Command transformation runs an SQL statement for each row in a data flow. It is used to execute an SQL statement that inserts, updates, or deletes rows in a database table.

Package Requirement: For a given source file, if the record exists in the database table, update it with the new values. If not, insert the new records.

Source File:
Lookup/Destination Table:
The package should insert records 4 and 5 and update record 3 with the new emp_location.

Package Design:

Configuring the text file source..


The Data Conversion transformation converts the source data types to the column data types of the lookup reference table.




Lookup Transformation:





Configure the OLE DB Command for matching rows (update):
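The update statement for the match output might be sketched as follows; the OLE DB Command uses positional ? parameters, and the table and column names (emp_details, emp_id, emp_location) are illustrative, based on the example:

```sql
-- Runs once per matched row: Param_0 maps to the new emp_location,
-- Param_1 to the emp_id of the row being updated
UPDATE dbo.emp_details
SET emp_location = ?
WHERE emp_id = ?;
```

The parameters are then mapped by position in the Column Mappings tab.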



Map the parameters to the fields in the source file




Configure the OLE DB Command for non-matching rows (insert):
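The insert statement for the no-match output could be sketched as follows (again with illustrative table and column names):

```sql
-- Runs once per unmatched row: the ? parameters map positionally
-- to the source columns emp_id, emp_name, emp_location
INSERT INTO dbo.emp_details (emp_id, emp_name, emp_location)
VALUES (?, ?, ?);
```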


 




Tuesday, September 15, 2015

Archival of Data Files using Control Tasks

In a real-time scenario there can be multiple data files that are used to load data into staging tables and then need to be archived. Archival of data files can be done using two control flow tasks:
Foreach Loop Container
File System Task

Source Files:
Package: First, let's create a user variable file_name. Now add a Foreach Loop Container to the package design.

The next step is configuring the Foreach Loop container. Since we will loop through files, select the Foreach File Enumerator.
Select the source folder and file type. Since we only require the file name, select "Name only". If the source directory contains subfolders with files that need to be archived, select "Traverse subfolders".
In the variable mappings, select the user variable file_name with index 0. On each iteration of the loop, the name of the current file is stored in the user variable.

After configuring the Foreach Loop container, let's add a File System Task.
Configuration of File System Task:
Create a File Connection Manager for Destination Directory

If the DelayValidation property of the File System Task is set to False, the task shows the error "Variable User::file_name is used as a source or destination and is empty." It receives this error because the variable file_name gets its value only when the Foreach loop executes.

Hence set the DelayValidation property to True.
Before execution of package
Source Directory:
Destination Directory:





 


Monday, August 24, 2015

How do you eliminate special characters from a text source data file?

Source File: Comma Delimited CSV file

The file has special characters that need to be eliminated before loading into the database or applying any further transformations. This can be done with the Text Qualifier feature of the flat file connection manager: with the qualifier set to #, a value such as #New York# is read as New York.






Change the Text Qualifier to #.

The data is now cleaned of the special character #.

Package design that checks for the existence of the data file: only if it exists does the package execute; if it is not found, the package exits.

Using the same package as in the earlier post:

http://simplyssis.blogspot.com/2015/08/how-to-use-begin-tran-in-ssis-rollback.html

Let's now add a requirement that the package checks for the existence of the data file. If the data file exists, it executes the tasks; if not, it exits.

To check for the existence of the data file, we use the Script Task and a user variable holding the full name (name and location) of the data file.


The C# code to check for the existence of the file is:

// Read the full file path from the SSIS variable and test for the file;
// succeed the task if the file exists, fail it otherwise.
if (System.IO.File.Exists(Dts.Variables["User::file_name"].Value.ToString()))
{
    Dts.TaskResult = (int)ScriptResults.Success;
}
else
{
    Dts.TaskResult = (int)ScriptResults.Failure;
}


Then connect the Script Task with a precedence constraint on Success to the BEGIN TRAN SQL command. If the Script Task doesn't find the data file, it fails and the following tasks are not executed.

Also change the DelayValidation property of the transaction container to True. If this is set to False, the entire package fails at the pre-execute stage and even the Script Task won't execute. The DelayValidation property delays validation to the execution stage.
 

If the Script Task finds the data file, the following tasks are executed.


Sunday, August 23, 2015

How to use "BEGIN TRAN" in SSIS-ROLLBACK on failure/COMMIT on success.

Scenario: Truncate the table before loading the data. However, if the load fails, roll back the truncate; if the load succeeds, commit the transaction.

Data Source: Excel File

OLEDB Destination:
A normal package to truncate case_details and load the data from the source file would look like this. However, in the event the load fails, the truncate has still been executed, leaving the table empty.

Hence a normal package with just a SQL Command task and a Data Flow Task will not work.

In such a situation, where we need to roll back or commit a set of tasks, BEGIN TRAN and a container need to be used.
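As a sketch, the three Execute SQL Tasks could run statements like these (case_details is the table from the example; the exact statements depend on the package):

```sql
-- Execute SQL Task before the container: open the transaction
-- and truncate the target table inside it
BEGIN TRAN;
TRUNCATE TABLE dbo.case_details;

-- Execute SQL Task on the container's Success constraint:
COMMIT TRAN;

-- Execute SQL Task on the container's Failure constraint:
ROLLBACK TRAN;
```

For the COMMIT/ROLLBACK to apply to the transaction opened in the first task, all three tasks must run on the same OLE DB connection.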

Resolved Package:

The first task is a SQL Command task (BEGIN TRAN) connected to the container.

The transaction container has a precedence constraint on Success to the SQL Command task (COMMIT), and a precedence constraint on Failure to the SQL Command task (ROLLBACK).

For the successful execution of the package, set the RetainSameConnection property of the OLE DB Connection Manager to True.

On Execution:
If the load is successful, the transaction commits.

If the load fails, the transaction rolls back.