A few days ago I was alerted to a peculiarity of SSIS’s Script Component that I believe people need to be aware of. Its a peculiarity that can rear its head when your script component is synchronous and has multiple outputs. Here’s an example of a dataflow that contains such a component:
Some things to note about this dataflow that you can’t tell from this screenshot:
- The source component will create a 1-row,1-column dataset with the value “1” in it
- Even though it has three outputs “SCR Send data to multiple outputs” is still a synchronous component*
*If you don’t understand how a synchronous component can have multiple outputs and/or have never heard of an Exclusion Group then you should go and read my blog post from 2005 Multiple outputs from a synchronous script transform – the blog post you’re reading right now won’t make much sense without it! If you have ever used the Multicast component then it shouldn’t surprise you that a synchronous component can have multiple outputs – because that’s a pretty good definition of what the Multicast component does!
Here’s the important code inside “SCR Send data to multiple outputs”:
- increments the incoming value and puts it into Output0
- increments the value again and puts it into Output1
- increments the value again and puts it into Output2
Given that out starting value is “1” you might expect that our three outputs would contain the values “2”, “3” & “4” respectively but in fact that is not the case. Here is what we actually see in those three outputs when we execute the dataflow:
Each of our outputs has the same value “4”. Why is that? The trick is in understanding something fundamental about synchronous components, they only ever output the same number of rows as are input. The fact that in the data flow above it appears as though three rows have been output is simply an illusion that is best explained by former SSIS Development Manager Kirk Haselden who left the following comment on my blog post that I linked to earlier:
The additional or duplicate rows are an illusion. The Dataflow Task actually tracks what buffer columns and rows are visible to the downstream transforms, but doesn't copy any buffers or rows. It simply "exposes" them with row and column views.
Truly, the synchronous outputs only send the same number of columns* as they receive on their inputs.
- Kirk Haselden, 15th March 2006
*This is a typo. Kirk meant to say “rows” not “columns”
In our case the script component has simply incremented the same value three times and we are looking at that same incremented value of “4” in each of our outputs.
I have produced a short video that demonstrates this behaviour using the dataflow pictured above. Embedding videos here on SQLBlog is however a fiddly experience so for ease I’ll just direct you to view it on Vimeo at Multiple Outputs from a synchronous script component.
Hope this helps. Any questions please put them in the comments.