Saturday, March 19, 2016

How to delete Duplicate records from a SQL Server table with Identity Column


Scenario: A package with a Data Flow task runs multiple times and inserts duplicate records into the destination table. The destination table is a transactional table with a numeric identity column, transaction_id, that is populated for each record inserted. Although the records are duplicates, each row is still unique because of the identity property of the transaction_id column.
 
The following approach deletes the duplicate records while keeping the identity values assigned to the rows inserted by the first load.

The package loads data into the destination table TBL_Transaction_Details as below.
 

After the package execution, the data in the table looks like this:


The rows from Transaction_Id 6 onwards are duplicates and need to be deleted.
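Before deleting anything, it can help to confirm which rows are duplicated. The query below is a minimal sketch that assumes the seven non-identity columns of TBL_Transaction_Details define a duplicate; the aliases First_transaction_id and Copies are illustrative names, not part of the original table.

```sql
-- List every group of identical rows that occurs more than once,
-- together with the id of its first-loaded row and the number of copies.
SELECT [Customer_NAME]
      ,[Customer_address]
      ,[Transaction_amount]
      ,[Transaction_discount]
      ,[discount_percentage]
      ,[New_customer]
      ,[Transaction_Fees]
      ,MIN(transaction_id) AS First_transaction_id  -- the row that should be kept
      ,COUNT(*)            AS Copies                -- illustrative alias
  FROM [dbo].[TBL_Transaction_Details]
 GROUP BY [Customer_NAME]
      ,[Customer_address]
      ,[Transaction_amount]
      ,[Transaction_discount]
      ,[discount_percentage]
      ,[New_customer]
      ,[Transaction_Fees]
HAVING COUNT(*) > 1;
```

If this query returns no rows, the table has no duplicates on those columns and nothing needs to be deleted.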

A CTE that uses ROW_NUMBER() partitioned by the duplicated column values and ordered by transaction_id numbers the rows within each group of duplicates. Deleting every row numbered greater than 1 removes the duplicate transactions while keeping the ids of the first load.

-- Number the rows within each group of identical column values.
-- The row with the lowest transaction_id in each group gets 1.
WITH CTE_Transaction_Details ([Customer_NAME]
      ,[Customer_address]
      ,[Transaction_amount]
      ,[Transaction_discount]
      ,[discount_percentage]
      ,[New_customer]
      ,[Transaction_Fees]
      ,Count_duplicate)
AS
(
SELECT [Customer_NAME]
      ,[Customer_address]
      ,[Transaction_amount]
      ,[Transaction_discount]
      ,[discount_percentage]
      ,[New_customer]
      ,[Transaction_Fees]
      ,ROW_NUMBER() OVER(PARTITION BY [Customer_NAME]
                                     ,[Customer_address]
                                     ,[Transaction_amount]
                                     ,[Transaction_discount]
                                     ,[discount_percentage]
                                     ,[New_customer]
                                     ,[Transaction_Fees]
                         ORDER BY transaction_id) AS Count_duplicate
  FROM [dbo].[TBL_Transaction_Details]
)
-- Rows numbered 2 and above are later copies of the first-loaded row.
DELETE
FROM CTE_Transaction_Details
WHERE Count_duplicate > 1;

GO

Because the CTE selects from a single table, deleting from the CTE deletes the matching rows from the underlying source table.
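Since the delete acts directly on the source table, one cautious option is to run it inside a transaction first and check the result before committing. The following is a sketch of that idea rather than part of the original solution; the Rows_deleted alias is illustrative, and the script rolls back by default so a trial run changes nothing.

```sql
BEGIN TRANSACTION;

-- Same technique as above, with the CTE reduced to the columns it needs.
WITH CTE_Transaction_Details AS
(
    SELECT transaction_id
          ,ROW_NUMBER() OVER(PARTITION BY [Customer_NAME]
                                         ,[Customer_address]
                                         ,[Transaction_amount]
                                         ,[Transaction_discount]
                                         ,[discount_percentage]
                                         ,[New_customer]
                                         ,[Transaction_Fees]
                             ORDER BY transaction_id) AS Count_duplicate
      FROM [dbo].[TBL_Transaction_Details]
)
DELETE
FROM CTE_Transaction_Details
WHERE Count_duplicate > 1;

-- Number of rows removed by the DELETE above.
SELECT @@ROWCOUNT AS Rows_deleted;

-- Inspect what is left before deciding.
SELECT *
  FROM [dbo].[TBL_Transaction_Details]
 ORDER BY transaction_id;

-- Trial run: undo the delete. Replace with COMMIT TRANSACTION to keep it.
ROLLBACK TRANSACTION;
```

Once the row count and the remaining rows look right, the same script can be rerun with COMMIT TRANSACTION in place of ROLLBACK TRANSACTION.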

After the above code is executed, the table no longer contains the duplicate data.




Friday, March 4, 2016

Error SSIS Variable (Cannot assign value to variable): The result of the expression cannot be written to the property. The expression was evaluated, but cannot be set on the property.

I recently encountered an error while dynamically setting the value of a connection string in a parent package that executes child packages in a For Each loop. The parent (master) package retrieves the names of the child packages from a database table and passes each value to a connection string variable (user::packagename) in the For Each Loop container. Each item enumerated by the loop is the package name for the Execute Package Task inside the container.

However, the package failed at runtime with the error "The result of the expression “@variablename” on property  cannot be written to the property. The expression was evaluated, but cannot be set on the property."

When a Script Task was used to debug the issue, it correctly showed all the package names in the loop, which proved that the value was being written to the variable correctly. However, when the Script Task was replaced by the Execute Package Task, the package failed with the above error. After much troubleshooting, I found that the string variable user::packagename must be initialised to an existing connection manager in the parent package before the package is executed. Once that was done, the parent package worked perfectly, executing the child packages with the dynamic connection string variable.