This project has moved. For the latest updates, please go here.

Multiple Hash handling of integer values

Oct 18, 2013 at 9:12 PM
Hi, I'm curious how the MD5 hash generator handles integer values.

I'm able to recreate hashes with the safe null handling enabled for any string values, but if I have an integer, which I would cast as a varchar, as part of the string, I am unable to match the values.

In the following example, all columns are string, except AcctDetail, which is a smallint. If I only hash the previous columns, I get the correct hash value which matches the multiple hash generator, but when I include the integer column I do not match. Any help is appreciated. Thanks.

update int.MD5Test
set MD5Test = HASHBYTES('MD5'
,cast(ltrim(rtrim(isnull(Account, '')))
+ltrim(rtrim(isnull(AcctDescription, ''))
+isnull(ltrim(rtrim([AcctStatus])),''))
+isnull(cast([AcctDetail] as varchar(10)),'') as varchar(max))  
+case when [Account] is null then 'Y' else 'N' + cast(len(Account) as varchar(5)) end                       
+case when [AcctDescription] is null then 'Y' else 'N' + cast(len([AcctDescription]) as varchar(5)) end
+case when [AcctStatus] is null then 'Y' else 'N' + cast(len(AcctStatus) as varchar(5))  end
+case when [AcctDetail] is null then 'Y' else 'N' + cast(len(AcctDetail) as varchar(5))  end
)
Oct 22, 2013 at 4:18 PM
Does anyone have any info on how to handle integer values to replicate what is being done with the hash task in ssis?
Coordinator
Oct 23, 2013 at 4:13 AM
It is possible.

Multiple Hash is taking the data from SSIS's data pump, and converting it into byte arrays, appending the Y/N + length, and then hashing.
The Length's are only done on variable length columns:
DataType.DT_IMAGE
DataType.DT_NTEXT
DataType.DT_STR
DataType.DT_TEXT
DataType.DT_WSTR

I haven't tried the query below, as I don't have a suitable set of test data at hand, but it may work.
It really depends on the data types that were in the SSIS package, as to what the AcctDetail will need to be cast as.
update int.MD5Test 
set MD5Test = HASHBYTES('MD5' ,
ISNULL(CAST(ltrim(rtrim(Account)) AS VARBINARY(MAX)), CAST('' AS VARBINARY(1))) +
ISNULL(CAST(ltrim(rtrim(AcctDescription)) AS VARBINARY(MAX)), CAST('' AS VARBINARY(1))) +
ISNULL(CAST(ltrim(rtrim([AcctStatus])) AS VARBINARY(MAX)), CAST('' AS VARBINARY(1))) +
ISNULL(CAST([AcctDetail] AS BINARY(4)), CAST('' AS VARBINARY(1))) +
CAST(
     CASE WHEN Account IS NULL THEN 'Y' ELSE 'N' + CAST(LEN(Account) AS VARCHAR(5)) END +
     CASE WHEN AcctDescription IS NULL THEN 'Y' ELSE 'N' + CAST(LEN(AcctDescription) AS VARCHAR(5)) END +
     CASE WHEN [AcctStatus] IS NULL THEN 'Y' ELSE 'N' + CAST(LEN([AcctStatus]) AS VARCHAR(5)) END +
     CASE WHEN [AcctDetail] IS NULL THEN 'Y' ELSE 'N' END 
 AS VARBINARY(MAX))
)
Marked as answer by kmartin on 7/8/2014 at 6:16 AM