THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Eric Johnson

My Thoughts on SQL Server and Database Related Technologies.

Data Conversion in SSIS

When it comes to converting data from one type to another in SSIS, you have a few choices of methods. We are going to look at a few options and discuss the differences. With all of these, we are assuming that the input data type is compatible with the new output data type. In other words, we are not converting the word “one” to the number 1, this would require fancier logic. So first up, there is the appropriately named Data Conversion tranformation. This task will take one or more input columns and create new columns with the desired data type. It is simple to use, just select your input column or columns, name your output column and chose the new data type, as shown.

clip_image002

You can also use the Derived Column transformation to convert data. This transformation is also pretty simple, you select an input column and then chose whether to replace this column or create a new output column. Then you need to apply an expression in order to come up with the new data for the output column. This expression can be a cast statement as shown.

clip_image004

As you can see, we have created a new column with a new data type, functionally the same as the Data Conversion transformation. Once thing to note, you can also choose to replace a column but when you do this, SSIS wants the new column to maintain the original data type. As you can see in the next screen shot, even though we have applied to DT_I8 cast, the data is still a four-byte signed integer. This is just something to be aware of.

clip_image006

So why use one over the other? Well the Data Conversion transformation will just take an input, convert the type and provide a new output column. If you use the Derived Column transformation you get to apply an expression to the data, which allows you to do more complex conversions or perform additional manipulations on the data. In fact, using IF...THEN logic in an expression, you could convert the word “one” to the number 1 with a Derived Column transformation.

By no means, are these the only two methods for conversion, but they are the two most common transformations used to convert data. You can also build custom script transformations or even use CAST or CONVERT statements in your source T-SQL if you so want. In the end, the method you chose will depend on your specific requirements, but it is good practice to understand all the options before making a call on which one is the best.

Published Friday, October 01, 2010 1:43 PM by ejohnson2010
Filed under: ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

jonmcrawford said:

Do these only convert one column at a time? Or can you (I assume) select the input columns and perform multiple transformations?

October 1, 2010 3:40 PM
 

Todd McDermid said:

Both transforms can convert multiple columns at a time.

The Data Conversion transform is a higher-performing transform, since it doesn't have to interpret an SSIS expression.  It also doesn't have to do as many COM to .Net transitions.

I would also suggest you consider implementing something like this best practice (http://toddmcdermid.blogspot.com/2010/07/conversion-between-unicode-and-non.html) in order to avoid using the "original" (inappropriately typed) column later in your flow instead of your "converted" (appropriately typed) column.

Oh - and for those times when you have to convert a LOT of columns, and/or you don't like doing the best practice I mentioned in the last paragraph - you may want to look at the Replacing Data Conversion component: http://toddmcdermid.blogspot.com/2010/08/convert-several-columns-in-ssis-with.html

Sorry to hijack your blog post ;)

October 1, 2010 5:58 PM
 

deepal said:

liked it!!!!

December 6, 2011 6:58 AM
 

Jack said:

"As you can see in the next screen shot, even though we have applied to DT_I8 cast, the data is still a four-byte signed integer.".

That is not "just something to be aware of", it's another bug in a very buggy tool.

July 11, 2013 11:15 AM

Leave a Comment

(required) 
(required) 
Submit

About ejohnson2010

Eric (Microsoft SQL MVP) is the co-founder of Consortio Services, and the primary Database Technologies Consultant. His background in Information Technology is diverse, ranging from operating systems and hardware to specialized applications and development. He has even done his fair share of work on networks. Since IT is really just a way to support business processes, he has also acquired his MBA. All in all, he has 10 years of experience with IT, a great amount of which has been working with Microsoft SQL Server. Eric has managed and designed databases of all shapes and sizes. He has delivered numerous SQL Server training classes and webcasts as well as presentations at national technology conferences. Most recently, he presented at TechMentor on SQL Server 2005 Replication, Reporting Services, and Integration Services. In addition, he is active in the local SQL Server Community, serving as the President of the Colorado Springs SQL Server Users Group.
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement