SSIS Quick Tip: Data Conversion Transformation or Cast in Source?

After having a great conversation tonight at the JSSUG meeting, I am once again reminded of the time I spent learning SSIS.  There are lots of subtleties and nuances that, if you’re not careful, will lead to poor performance.  As such, I’m going to attempt to put out some very quick tips on the subject.  I’m not sure of the regularity of these posts, or of their usefulness, but hopefully they’ll be helpful to someone.  Enjoy!

Setup

When given the task of importing data which is of a different data type in the destination than it is in the source the new developer might jump to use the data conversion task to perform this option.  Is this wrong?  Well, no, not really.  It is, after all, a tool that you can use out of the box and it will get the job done.  However, without some of the knowledge of how data buffers are built and maintained behind the scenes you might not realize that this can lead to less than optimal performance.

Example 1: Data Conversion

DataConversion

This data flow uses a very simple data source, a data conversion transformation and a row count (as a terminator).  When run, it generates 1409 records per buffer (with a 1 MB max buffer size) as seen below.

DataConversionBuffer

Example 2: Cast in Query

DataCast

This is an even simpler data flow that uses cast within the data source to return the data with the desired data types.  When run, it generates 1659 records per buffer (with a 1 MB max buffer size) as seen below.

DataCastBuffer

Outcome

Does this performance improvement outweigh the virtual self-documentation of a well named and formatted data flow with the proper components?  Without the component someone who hasn’t looked at the package in detail might not realize that conversions were done to get the data to match up.  However, this type of information can be conveyed in the description of the source and / or in an annotation next to the source for all to see.   With the information conveyed in this fashion, it should be clear that using the cast will give you a performance benefit over using the transformation.  So, we should probably shelve the transformation unless if we are in a situation where we can’t return the data with the type we would like for the destination (ex. flat files or stored procedures).

Advertisements
This entry was posted in SSIS and tagged , . Bookmark the permalink.

One Response to SSIS Quick Tip: Data Conversion Transformation or Cast in Source?

  1. Devin Knight says:

    Good post. I like the side by side comparison. Makes it a lot easier to understand.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s