SSIS Quick Tip: Size Matters

I imagine that other people have some of the same issues with enterprise data models which I have had in the past.  It seems that no one wants to be the person who might limit someone’s artistic abilities to write a good name, give a good description or otherwise capture some important piece of information.  And besides, this column is variable length, so it really isn’t going to matter how big the field is because it is only going to take up the space necessary to store it, right?

Well, that might be true (to a point) for databases in general.  However, SSIS is strongly tied to Metadata.  And, since it is possible that a column might be 4000 characters long, it has to plan for such when sizing out the data buffers.

So, for a bit of a contrived example we will take a look at the Email Address table coming from adventure works as is, and as if it had been created as an nvarchar(4000).  This may seem silly, but I have certainly seen many examples where a name or description column was set to either the max  length (4000 / 8000) or just max when there really was no need for that.  Usually this would yield a column that when profiled would have a max length of 40 or 50, sometimes a few hundred, and rarely over a thousand let alone 4000 / 8000 or enough to necessitate a max length field (a.k.a. lob data).

Large Data

ColumnSize.LargeColumn ColumnSize.LargeColumnDataViewer

Here, as described above, we have as an example the email address table with the email address column blown out to  nvarchar 4000.  We haven’t changed the data any, we’ve only modified the metadata.  So, we know that the maximum size of any value will be fifty characters in length.  But, the SSIS engine can’t know this… We’ve told it that the data can be up to 4000 characters long.  Since that is the max size, that is how wide it will have to build each of the data buffers to handle.

When we look at the output, using the default buffer size and max buffer rows, we get 1300 Records per buffer.

Sized Data

For the properly sized data we have a column that is nvarchar(50).  When we run this data flow, we see that we are getting 9725 records per buffer under the default buffer settings.

ColumnSize.SizedColumn ColumnSize.SizedColumnDataViewer

Action

Now, this is still a bit contrived because we already know the proper size for the data.  However, you have the ability to perform a data profile on you base data, particularly when you see large width data columns in an extract.

If you notice a large disparity between the size of the column and the size that data actually uses, you should attempt to determine if there is a need to actually have that size data.  If not, you can at a minimum cast / convert the column out to the appropriate size in the transfer, and size it appropriately on the destination.  If you have access or influence to make the changes on the source system all the better.

Advertisements
This entry was posted in SQL, SSIS and tagged , . Bookmark the permalink.

3 Responses to SSIS Quick Tip: Size Matters

  1. Arthur says:

    A very good point Eric, I hate to see so many examples in the wild of such inflated table fields, too! Another perk in this post is using the Data Viewer to see the buffer size, smart, I did not even think of this before!

  2. David Stein says:

    Great Article, quick question for you. Presumably if you don’t need unicode support, converting a data flow from DT_WSTR to DT_STR would allow more records per buffer as well. What do you think?

    • Eric Wisdahl says:

      Absolutely, positively, yes that is correct. However, I’ve recently started to run into issues where I wish that I had the foresight to jump the gun and go with uni-code in the first place. Some projects which I have been working will now be a fair amount more complicated due to refactoring original designs, procedures, packages, etc. However, if you know beyond a shadow of doubt that a column or data store will only ever need non-uni-code data, go for it.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s