SQLblog.com – The SQL Server Blog Spot on the Web

Andy Leonard

Andy Leonard is a Data Philosopher at Enterprise Data & Analytics; an SSIS trainer, consultant, and developer; a Business Intelligence Markup Language (Biml) developer and BimlHero; a SQL Server database and data warehouse developer; and a community mentor, engineer, and farmer. He is a co-author of SQL Server Integration Services Design Patterns, and author of Managing Geeks – A Journey of Leading by Doing and the Stairway to Integration Services.

  • Microsoft is Listening

    There’s a back-story to my post titled SSIS 2016 CTP 3 and Data Flow Column LineageID. I’ll get to the story in a minute, but first I’d like to say that every company has excellent customer service. You don’t believe me? Ask them, “Company, do you have excellent customer service?” They will all answer, “Yes. Yes we do.”

    Microsoft has been engaging the software development community and MVPs for years – decades, even. As a former SQL Server MVP, I was repeatedly told Microsoft is listening and that my thoughts and suggestions were welcome. So I shared my thoughts and suggestions, like many of my colleagues in the community. Most of our suggestions were not implemented. Many were marked “Works as designed” or “Will not fix” at Microsoft Connect. Granted, no one can implement every feature or even correct every single bug. It’s a) not possible and b) not cost-effective to do so. But after quite a bit of time making requests and up-voting the excellent requests of others – and then seeing disappointing responses – I (and many others) realized some other force was driving the agenda for Microsoft’s development teams and overriding our suggestions. Many became jaded as a result. I will confess some jading occurred on my part.

    Recently, I and others began hearing fresh calls for thoughts and suggestions. My first (somewhat jaded) thought was, “That’s nice. That’s real nice.” (…which is a reference to an old Southern US joke that I will classify as “unkind”…)

    The Story

    About the time SQL Server 2016 CTP 3.1 was released, I read about the features I covered in the post titled SSIS 2016 CTP 3 and Data Flow Column LineageID. As I worked through some demo packages, however, I found some things did not appear to work – at least not yet; not in the Visual Studio 2015 SSIS template. I communicated with the SSIS team and was pleasantly surprised to receive an email stating they would attempt to work the changes into CTP 3.3. I was optimistic, if cautiously so.

    After SQL Server 2016 CTP 3.3 was released, I received a follow-up email from the team member who had addressed the concerns I identified – apologizing for the long wait and pointing to a detailed blog post describing the implementation of the updated features. </Applause!>

    Conclusion

    It’s one thing to say you’re listening to your community. It’s a different thing to actually listen to your community. One is some words. The other is some action.

    I’m happy to report Microsoft is, in fact, listening. I couldn’t be happier. Thank you!

    :{>

  • SSIS 2016 CTP 3 and Data Flow Column LineageID

    Back When We Used to Carve Our Own Chips Out of Wood…

    Back in the old days (2005), SSIS Data Flow LineageIDs were created at design-time. Here’s a screenshot of an SSIS 2005 package’s XML:

    [Image: SSIS2016Lineage03]

    When an error occurred, the ID property of the column in error was supplied to the ErrorColumn column of the Error Output, as shown here:

    [Image: SSIS2016Lineage04]

    Although it wasn’t simple, it was possible to use the value of the ID supplied in the ErrorColumn value to identify the offending column in a Data Flow Task. It was a manual process that involved:

    • Opening the package’s XML
    • Finding the column ID value shown in the ErrorColumn field of the Error Output
    • Reading the LineageID value for that same column
    • Tracing the LineageID back to its original assignment in the Data Flow Task:

    [Image: SSIS2016Lineage05]

    The LineageID “49” maps to the “name” column in the image at the top of this post. The “name” column raised an error because someone (me) made the destination column way too small to hold any name values.
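    For reference, the design-time lineage metadata in the package XML looked roughly like this (a from-memory sketch; the element and attribute names here are approximate, not copied from a real 2005 package):

    <outputColumn id="49" name="name" lineageId="49" />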

    Back in the day, some folks came up with some pretty clever ways to automate identifying the name of the column causing SSIS Data Flow errors.

    In Later Versions of SSIS…

    The LineageID property of Data Flow columns changed in later versions of SSIS. In SSIS 2012, the LineageID property is still there, but the value looks more like an SSIS package path than its SSIS 2005 counterpart:

    [Image: SSIS2016Lineag1a]

    The LineageID attribute in SSIS 2016 CTP 3.3, however, appears differently:

    [Image: SSIS2016Lineage_LineageID]

    A New Beginning…

    SSIS 2016 CTP 3.3 offers a solution. First, there are now two new columns in the SSIS Data Flow Component Error Output: ErrorCode – Description and ErrorColumn – Description:

    [Image: SSIS2016Lineage06a]

    The new columns provide extremely useful (plain language) error metadata that will, in my opinion, greatly reduce the amount of time required to identify data-related load failures in the Data Flow Task.

    But that’s not all. If you configure a log to capture the DiagnosticEx event, you will receive a message that provides the Data Flow column ID. To have a look, add a new log to an SSIS package that contains a configured Data Flow Task. On the Details tab, select the DiagnosticEx event:

    [Image: SSIS2016LineageDiagnosticEx]

    When the package runs, a DiagnosticEx event will record XML describing the DTS:PipelineColumnMap. Viewing the XML in either a text file or SQL Server is no fun, but if you copy the XML and paste it into an XML file in Visual Studio (SSDT), you can format it nicely, as shown below:

    [Image: SSIS2016Lineage07a]
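    The message shape is roughly as follows (a hypothetical sketch of the DTS:PipelineColumnMap; consult Bo Fan's post, linked below, for the authoritative format):

    <DTS:PipelineColumnMap xmlns:DTS="www.microsoft.com/SqlServer/Dts">
      <DTS:Pipeline DTS:Path="\Package\Data Flow Task">
        <DTS:Column DTS:ID="13"
                    DTS:IdentificationString="OLE DB Source.Outputs[OLE DB Source Output].Columns[name]" />
      </DTS:Pipeline>
    </DTS:PipelineColumnMap>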

    It’s possible to call a new method (GetIdentificationStringByID) on the ComponentMetadata class in a Script Component, passing it the LineageID of a Data Flow column, and get the IdentificationString of the column. You can learn more about the GetIdentificationStringByID method – and everything I’ve written in this post – by reading Bo Fan’s (excellent) blog post on the subject here.
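    Here is a minimal sketch of what that call might look like in an SSIS 2016 Script Component attached to an error output. The ErrorColumnName output column is an assumption I added for illustration; see Bo Fan's post for the authoritative example:

    // minimal sketch; assumes the generated ScriptMain with
    // using Microsoft.SqlServer.Dts.Pipeline.Wrapper; at the top,
    // and a user-added string output column named ErrorColumnName
    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        // the *130 metadata interface exposes GetIdentificationStringByID
        IDTSComponentMetaData130 md = this.ComponentMetaData as IDTSComponentMetaData130;
        if (md != null && Row.ErrorColumn > 0)
        {
            // translate the lineage ID in ErrorColumn into a readable
            // column identification string
            Row.ErrorColumnName = md.GetIdentificationStringByID(Row.ErrorColumn);
        }
    }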

    :{>

  • Congratulations to BimlHero Cathrine Wilhelmsen!

    [Image: CathrineBimlHero]

    I’m happy to pass along the news that my friend Cathrine Wilhelmsen (blog | @cathrinew) has been recognized as a BimlHero! Congratulations, Cathrine!

    You can learn more about Biml from Cathrine and Scott Currie at SQLBits in May 2016. On 4 May, Scott and Cathrine deliver a full day of Biml training titled Learn Biml Today: Start Using It Tomorrow. Note: at the time of this writing I am also listed as a co-presenter, but I will, unfortunately, be unable to attend SQLBits this year. :( But go anyway! See Scott and Cathrine!

    :{>

  • Data Integration is the Foundation

    Unless you live under a rock, you’ve seen the buzz about Data Lakes, Big Data, Data Mining, Cloud-tech, and Machine Learning. I watch and read the reports from two perspectives: as an engineer and as a consultant.

    As a Consultant

    If you watch CNBC, you won’t hear discussions about ETL Incremental Load or Slowly Changing Dimensions Design Patterns. You will hear them using words like “cloud” and “big data,” though. That means people who watch and respect the people on CNBC are going to hire consultants who are knowledgeable about cloud technology and Big Data.

    As an Engineer

    I started working with computers in 1975. Since that time, I believe I’ve witnessed about one major paradigm shift per decade. I believe I am now witnessing two at the same time: 1) a revolution in Machine Learning and all the things it touches (which includes Big Data and Data Lakes); and 2) the Cloud. These two are combining in some very interesting ways. Data Lakes and Big Data appliances and systems are the sources for many systems; Machine Learning and Data Mining solutions are but two of their consumers. At the same time, much of this technology and storage is either migrating to the Cloud or is being built there (and in some cases, only there). But all of this awesome technology depends on something…

    Data

    In order for Machine Learning or Data Mining to work, there has to be data in the Data Lake or in the Big Data appliance or system. Without data, the Data Lake is dry. Without data, there’s no “Big” in Big Data. How do these solutions acquire data?

    It Depends

    Some of these new systems have access to data locally. But many of them – most, if I may be so bold – require data to be rounded up from myriad sources. Hence my claim that data integration is the foundation for these new solutions.

    What is Data Integration and Why is it Important?

    Data integration is the collection of data from myriad, disparate sources into a single repository (or a minimal number of repositories). It’s “shipping” the data from where it is to someplace “nearer.” Why is this important? Internet connection speeds are awesome these days. I have – literally – 20,000 times more bandwidth than when I first connected to the internet. But modern internet connection speeds are hundreds to millions of times slower than the networks running inside data centers. Computing power – measured in cycles or flops – is certainly required to perform today’s magic with Machine Learning. But what if the servers must wait hours (or longer) for data instead of milliseconds? The magic happens in slow-motion. In slow-motion, magic doesn’t look awesome at all.

    Trust me, speed matters.

    Data integration is the foundation on which most of these systems depend. Some important questions to consider:

    • Are you getting the most out of your enterprise data integration?
    • Could your enterprise benefit from faster access to data – perhaps even near real-time business intelligence?
    • How can you improve your enterprise data integration solutions?

    :{>

    Learn more:

    Enterprise Data & Analytics
    Stairway to Integration Services
    IESSIS1: Immersion Event on Learning SQL Server Integration Services
    EnterpriseDNA Training

  • Deploying a Single Package First in SSIS 2016

    I gathered recently with some friends who are geeks to discuss geeky stuff. As is often the case, there were a lot of “what if” questions, and we answered most of them by firing up laptops and VMs for testing. I had some trouble with my VM, which runs the December preview of SSDT in Visual Studio 2015 on Windows Server 2016 Technical Preview 4, so I was unable to answer one curious question until later. When I did answer it, I grabbed some screenshots and decided to blog about it. And here we are. :)

    The question was:

    “What Happens if Your First SSIS Project Deployment is a Single-Package Deployment?”

    To test, I created a new SSIS project named “DeploymentTest1” and added three simple SSIS packages. I right-clicked Package3.dtsx and clicked Deploy Package. The Integration Services Deployment Wizard started, as expected. Since I was deploying to a pristine Catalog, I created a new Catalog Folder named “Deployment”. There were no projects in my catalog, and I was curious how the Deployment Wizard would handle this.

    Once the folder was created I saw a new button: “New project…”. I couldn’t click the OK button until I created a Catalog Project to contain my SSIS package:

    [Image: Cat2016Dep_1]

    So I created a new Catalog Project. Just to see if I could do it, I named the Catalog Project something different (“Deployment_Testing_1”) from my SSIS project (“DeploymentTest1”):

    [Image: Cat2016Dep_2]

    I have to admit I was a little surprised that it worked:

    [Image: Cat2016Dep_3]

    Once I clicked the OK button, the Integration Services Deployment Wizard displayed a three-part Path, “/SSISDB/<FolderName>/<CatalogProjectName>” instead of the usual two-part Path (“/SSISDB/<FolderName>”):

    [Image: Cat2016Dep_4]

    Once deployed, Package3 appeared content, if alone, in the SSIS Catalog node of SSMS:

    [Image: Cat2016Dep_5]

    I wondered what might happen if I now deployed the SSIS Project from SQL Server Data Tools – Business Intelligence Edition (SSDT-BI), so I tried it. When the Integration Services Deployment Wizard reached the Select Destination step, the Catalog Project I created when deploying Package3 was remembered:

    [Image: Cat2016Dep_6]

    This makes perfect sense. I just didn’t know what would happen. Hence, testing.

    Once the SSIS project was deployed from SSDT-BI, Package3 was no longer alone. And that was a good thing.

    [Image: Cat2016Dep_7]

    Still, I was curious how SSIS 2016 tracks the target Catalog Project. So Kent Bradshaw and I conducted a wee bit of testing. Here’s what we learned. There are three new attributes stored inside the SSIS 2016 version of the <SSIS Project Name>.dtproj.user file:

    [Image: Cat2016Dep_8]

    ServerName, PathOnServer, and Run64BitRuntime aren’t found in previous versions of the file (at least the couple we checked). Again, this makes perfect sense with single-package deployment. It especially makes sense if you must create a new Catalog Project to perform a single-package deployment first.
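    For the curious, the relevant fragment looks roughly like this (the three element names come from our testing; the surrounding nesting and the values shown here are illustrative):

    <Options>
      <ServerName>localhost</ServerName>
      <PathOnServer>/SSISDB/Deployment/Deployment_Testing_1</PathOnServer>
      <Run64BitRuntime>true</Run64BitRuntime>
    </Options>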

    We also tested deploying the SSIS Project as a whole first, and SSIS 2016 appears to work just like SSIS 2012 and 2014.

    Conclusion

    As we chatted some more, Kent and I realized this decoupling of the SSIS Project name and the Catalog Project name has the potential to confuse developers not familiar with the new functionality. While we really like the idea of being able to deploy a single package to perform a quick fix, we realize this opens the door to heretofore unseen issues. It’s not just coupling between Catalog Projects and SSIS Projects in development; this potentially decouples Catalog Projects from source control. That has Kent and me concerned. How does one know an SSIS Project has been deployed to Production under a different Catalog Project name? We don’t know the answer to that question at this time.

    For now, we’re going to suggest SSIS developers adhere to a practice of naming Catalog Projects with the same name as the SSIS Project, even though we’ve proven it’s not technically necessary. We won’t call it a “best practice” – at least not yet. But we will call it an “Andy and Kent practice.” In the event someone doesn’t adhere to our suggestion, it’s possible to update the <SSIS Project Name>.dtproj.user file and correct this for future development of the SSIS project. (We tested!) If you make this change, do yourself a favor and check it into source control right afterwards.

    As always, I welcome your thoughts in the Comments (although comments are moderated and may not appear for a day).

    Thoughts?

    :{>

  • The Recording for DILM DevOps: SSIS Frameworks + Design Patterns is available!

    The recording for the Enterprise Data & Analytics webinar “DILM DevOps: SSIS Frameworks + Design Patterns” is available here. It’s free! But registration is required.

    Enjoy!

    :{>

  • More Training from Andy Leonard!

    First, thanks to everyone who has attended one of my free webinars! I enjoy delivering training and hope that it shows.

    Where can you learn about upcoming training from me? At the Enterprise Data & Analytics Training page, of course.

    Upcoming training

    January 20 – Free Webcast: DILM DevOps: SSIS Frameworks + Design Patterns
    January 27 – SSIS Data Flow Performance Tips
    February 3 – Introducing SSIS Framework Community Edition
    February 10 – Easy SSIS Migration with SSIS Catalog Compare

    Recordings of Past Presentations

    Biml 103: Automating SSIS Design Patterns with Biml
    SSIS Catalog Compare and DILM
    Biml 101
    Introduction to SSIS Frameworks

    In-Person SSIS Training Classes

    May 2-6 – In-Person Class: Immersion Event on SQL Server Integration Services – Register Now
    September 19-22 – In-Person Class: Advanced Immersion Event on SQL Server Integration Services – with Tim Mitchell! Register Now

    Check the Enterprise Data & Analytics Training page early and often – it’s a great way to keep up with free webinars and non-free (but valuable!) training from me!

    :{>

  • Biml 103 – Automating SSIS Design Patterns with Biml Recording is up!

    The Biml 103 webinar “Automating SSIS Design Patterns with Biml” recording is now available. Registration is required, but it’s free.

    Enjoy!

    :{>

  • “Where Can I Learn Enough C# to Use Biml?”

    An attendee at today’s Biml 103 webinar (Automating SSIS Design Patterns with Biml – this link takes you to register to view the recording) asked a very good question in a follow-up email: “Where can I learn enough C# to use Biml?”

    The answer is the C# Primer at BimlScript.com!

    [Image: BimlScriptCsharp]

    Read, then test your knowledge! How cool is that?

    Create a BimlScript.com account today. It’s free! And then get started learning Biml!

    :{>

  • Introduction to SSIS Frameworks Recording Now Available!

    The recording for the webinar "Introduction to SSIS Frameworks" is now available! (Registration required)

    :{>

  • SSIS Catalog Compare and DILM Recording is now available!

    You can now view the recording from the SSIS Catalog Compare and DILM presentation here. I’m trying something new here, so please bear with me. This link takes you to a registration page. After you register, you can view the recording.

    Enjoy!

    :{>

  • SSISPerformance.com

    During a delivery of Advanced SSIS Training (by Tim Mitchell and Andy Leonard), Michael Smith suggested an idea for a site where SSIS performance benchmarks could be created and recorded. Thus, SSISPerformance.com was born. I was so struck by the idea that I purchased the domain while Tim was delivering a couple of modules on Advanced SSIS!

    The site is designed by an engineer for engineers. The idea? Create repeatable tests that can be executed by data integration developers who use SSIS (SQL Server Integration Services) along with a site to store and present results for comparison.

    I created a test SSIS project that loads National Provider Identifier (NPI) data known as the National Plan and Provider Enumeration System (NPPES). NPPES is collected and maintained by the US Centers for Medicare and Medicaid Services (CMS) and is a list of US healthcare providers. How can you use this?

    You can find the test procedure here. It includes links to the source data and the SSIS project named NPIData. The test project contains an SSIS package that creates a database and the two tables required for testing purposes. Instructions are included in the test procedure. Please let me know if you have any questions.

    You can view my test conditions and results here.

    I welcome your feedback.

    :{>

  • ETL Instrumentation in 2016

    A long time ago in a galaxy far, far away… I wrote about ETL Instrumentation.

    Ok, it wasn’t that long ago, and it wasn’t far away. But things have changed some since I wrote that series, and I would like to spend some time updating the material. I’m typing this post at the end of 2015, so I’m using SQL Server 2016 CTP 3.2 running on Windows Server 2016 Technical Preview 4. The SSDT-BI IDE I’m using is the December preview running in Visual Studio 2015. I downloaded the AdventureWorks 2016 CTP 3 Sample database scripts from Codeplex. Finally (or first…), I build VMs for testing purposes and I believe you should too. You can spin up a VM in Azure pretty quickly these days. To do so, start at the Azure Portal. For local instances, I use Oracle VirtualBox to build VMs. You can also use VMWare or Hyper-V for local virtualization. Pick a hypervisor, local or in the cloud, and build a VM.

    As in the original article, let’s begin by creating a database to host our instrumentation data:

    use master
    go

    print 'SSISRunTimeMetrics database'
    if not exists(select name
                  from master.sys.databases
                  where name = 'SSISRunTimeMetrics')
    begin
      print ' - Creating SSISRunTimeMetrics database'
      Create Database SSISRunTimeMetrics
      print ' - SSISRunTimeMetrics database created'
    end
    Else
    print ' - SSISRunTimeMetrics database already exists.'
    go

    Next, let’s build a schema and a table to hold our metrics:

    use SSISRunTimeMetrics
    go

    -- create ssis schema...
    declare @sql varchar(255)

    print ''
    print 'ssis Schema'
    if not exists(select name
                  from sys.schemas
                  where name = 'ssis')
    begin
      print ' - Creating ssis Schema'
      set @sql = 'Create Schema ssis'
      exec(@sql)
      print ' - ssis Schema created'
    end
    Else
      print ' - ssis Schema already exists.'

    -- create RunTimeMetrics table...
    print ''
    print 'ssis.RunTimeMetrics Table'
    if not exists(select s.name + '.' + t.name
                  from sys.tables t
                  inner join sys.schemas s on s.schema_id = t.schema_id
                  where t.name = 'RunTimeMetrics'
                   and s.name = 'ssis')
    begin
      print ' - Create ssis.RunTimeMetrics Table'
      Create Table ssis.RunTimeMetrics
      (id int identity(1,1)
      ,packageStartDateTime datetime null
      ,packageEndDateTime datetime null
      ,packageName varchar(255) null
      ,packageVersion varchar(25) null
      ,packageStatus varchar(25) null)
      print ' - ssis.RunTimeMetrics Table created'
    end
    Else
    print ' - ssis.RunTimeMetrics Table already exists.'
    go

    We’ll depart from the original article here – but just a little – and create a stored procedure named usp_RecordManualPackageStart in the ssis schema:

    use SSISRunTimeMetrics
    go

    print ''
    print 'ssis.usp_RecordManualPackageStart Stored Procedure'
    if exists(select s.name + '.' + p.name
              from sys.procedures p
              inner join sys.schemas s on s.schema_id = p.schema_id
              where p.name = 'usp_RecordManualPackageStart'
               and s.name = 'ssis')
    begin
      print ' - Dropping ssis.usp_RecordManualPackageStart Stored Procedure'
      Drop Procedure ssis.usp_RecordManualPackageStart
      print ' - ssis.usp_RecordManualPackageStart Stored Procedure dropped'
    end
    print ' - Creating ssis.usp_RecordManualPackageStart Stored Procedure'
    go
     

    Create Procedure ssis.usp_RecordManualPackageStart
      @packageStartDateTime datetime = null
    ,@packageName varchar(255) = null
    ,@packageVersion varchar(25) = null
    As
    begin
      -- set @packageStartDateTime default...
      declare @Now datetime
      set @Now = GetDate()

      if (@packageStartDateTime Is Null)
       set @packageStartDateTime = @Now

      -- insert the run time metrics data...
      insert into ssis.RunTimeMetrics
       (packageStartDateTime
       ,packageName
       ,packageVersion
       ,packageStatus)
      values
      (@packageStartDateTime
      ,@packageName
      ,@packageVersion
      ,'Started')

    end
    go

    print ' - ssis.usp_RecordManualPackageStart Stored Procedure created'
    go

    We make this change in order to support SSIS packages executed from the file system or the MSDB database, both of which remain supported in SSIS 2016 (and for which I nod with approval to the Microsoft SSIS Development Team. Options. Always give me options, please!).
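    For example, such packages can still be executed with dtexec (the package path and server name below are illustrative):

    rem execute a package from the file system
    dtexec /FILE "C:\SSIS\SSISRunTimeMetrics.dtsx"

    rem execute a package stored in MSDB
    dtexec /SQL "\SSISRunTimeMetrics" /SERVER "localhost"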

    Building the SSIS Test Project

    Open SQL Server Data Tools – Business Intelligence (SSDT-BI) and create a new project. When the New Project dialog displays, select Business Intelligence Projects from the Project types list and Integration Services Project from the Templates list. Name the project SSISRunTimeMetrics and click the OK button:

    [Image: ETLInst2016_1_0]

    One Glitch…

    When I click the OK button, I get an error that looks like this:

    [Image: ETLInst2016_1_1]

    To work around this error (which I suspect is a beta issue), I click OK and then right-click the solution in Solution Explorer. I then hover over “Add” and click “Existing Project…”:

    [Image: ETLInst2016_1_2]

    I navigate to the solution folder and locate and select the project file:

    [Image: ETLInst2016_1_3]

    When I click the Open button, the project appears as part of the solution:

    [Image: ETLInst2016_1_4]

    Adding a new package is easy:

    [Image: ETLInst2016_1_5]

    After it’s added, I rename the new package to SSISRunTimeMetrics.dtsx:

    [Image: ETLInst2016_1_6]

    Annotation…

    I still like to add annotations to my SSIS packages (although I would like to be able to change the color of the font…):

    [Image: ETLInst2016_1_7]

    I can't say enough about good documentation. You never know when someone else will have to open your source code to troubleshoot.

    PackageVersion: One Way

    Right-click in the white-space of the Control Flow canvas and click Variables. Add a variable named PackageVersion. Make sure the Scope is the SSISRunTimeMetrics package and make the Data Type String:

    [Image: ETLInst2016_1_8]

    Click the ellipsis for the Expression and add the following SSIS Expression Language to automate the variable value:

    (DT_WSTR,4) @[System::VersionMajor] + "." +
    (DT_WSTR,4) @[System::VersionMinor] + "." +
    (DT_WSTR,4) @[System::VersionBuild] + "." +
    (DT_WSTR,4) Year(@[System::CreationDate]) +
    ((Month(@[System::CreationDate]) < 10) ? "0" : "") +
    (DT_WSTR,2) Month(@[System::CreationDate]) +
    ((Day(@[System::CreationDate]) < 10) ? "0" : "") +
    (DT_WSTR,2) Day(@[System::CreationDate])

    [Image: ETLInst2016_1_9]

    Once entered, the variable window will display as shown here:

    [Image: ETLInst2016_1_10]

    I manually update the System::VersionMajor and System::VersionMinor variable values. System::VersionBuild is automatically incremented each time you save the package. I rarely parse the System::CreationDate as shown in this expression. I wish SSIS packages had a System::ModifiedDate property. If they did, I would parse that date as shown here. But they don’t, so I usually store the modified date in the System::VersionComments property manually, which alters my expression thus:

    (DT_WSTR,4) @[System::VersionMajor] + "." +
    (DT_WSTR,4) @[System::VersionMinor] + "." +
    (DT_WSTR,4) @[System::VersionBuild] + "." +
    @[System::VersionComments]

    I just have to remember to manually update the value of System::VersionComments in the package properties:

    [Image: ETLInst2016_1_11]

    I labeled this section “PackageVersion: One Way” because there are other ways to track the version of an SSIS package in SSIS 2016. The VersionGUID package property shown above is another way, and this property is read-only from the SSDT-BI IDE (although editable from the dtsx file). How you manage SSIS package versions is not nearly as important as tracking them somehow.

    Add a Container

    Add a Sequence Container and change the name to "SEQ Step 1 – Do Some Stuff":

    [Image: ETLInst2016_1_12]

    Sequence Containers are nice for several reasons:

    • They're aesthetic. You can break up chunks of functionality logically... how cool!
    • When troubleshooting, you can execute the contents of a container from the right-click context menu.
    • You can isolate chunks of functionality using transactions by container.

    Add an Execute SQL Task

    Next, add an Execute SQL Task and name it "SQL Log Manual Start of Package Execution":

    [Image: ETLInst2016_1_13]

    Double-click the Execute SQL Task to edit it. Change the ConnectionType property to ADO.Net. Click Connection and select "<New connection...>":

    [Image: ETLInst2016_1_14]

    When the Configure ADO.NET Connection Manager dialog displays, click the "New..." button:

    [Image: ETLInst2016_1_15]

    I really like the “New Connection” functionality in SSIS. It picks the correct Provider for the option selected in the Execute SQL Task. Configure the connection to match where you built the SSISRunTimeMetrics database and then click the OK button:

    [Image: ETLInst2016_1_16]

    When you return to the Configure ADO.NET Connection Manager window, it should now display your connection in the Data Connections list:

    [Image: ETLInst2016_1_17]

    Click OK to return to the Execute SQL Task Editor. Click the ellipsis in the SQLStatement property to open the Enter SQL Query dialog. Enter the name of the ssis.usp_RecordManualPackageStart stored procedure:

    [Image: ETLInst2016_1_18]

    Click the OK button to return to the Execute SQL Task Editor. Set the IsQueryStoredProcedure property (available for ADO.Net connection types) from False to True:

    [Image: ETLInst2016_1_19]

    What’s nice about using ADO.Net and the IsQueryStoredProcedure property is parameter configuration, which comes next. Click on the Parameter Mapping page:

    [Image: ETLInst2016_1_20]

    Click the Add button and add the following three parameter mappings:

    Variable Name: System::StartTime
    Direction: Input
    Data Type: DateTime
    Parameter Name: packageStartDateTime
    Parameter Size: -1

    Variable Name: System::PackageName
    Direction: Input
    Data Type: String
    Parameter Name: packageName
    Parameter Size: -1

    Variable Name: User::PackageVersion
    Direction: Input
    Data Type: String
    Parameter Name: packageVersion
    Parameter Size: -1

    [Image: ETLInst2016_1_21]

    Click the OK button to proceed.

    Add a Script Task

    Drag a Script Task into the sequence container and rename it “SCR Output Package Version”:

    [Image: ETLInst2016_1_22]

    Open the Script Task Editor and click the ellipsis in the ReadOnlyVariables property. Add the User::PackageVersion variable:

    [Image: ETLInst2016_1_23]

    Click the Edit Script button to open the VSTA Editor, and enter the following code in the public void Main() method:

    public void Main()
    {
        string packageVersion = Dts.Variables["User::PackageVersion"].Value.ToString();
        string msg = "Version: " + packageVersion;
        bool fireAgain = true;

        Dts.Events.FireInformation(1001, "SSISRunTimeMetrics", msg, "", 0, ref fireAgain);

        Dts.TaskResult = (int)ScriptResults.Success;
    }

    Close the VSTA Editor and click the OK button to close the Script Task Editor. Your SSIS package should appear as shown here:

    [Image: ETLInst2016_1_24]

    Let’s Test It!

    Execute the package in the SSDT-BI debugger. Your results should match those shown here (Success!):

    [Image: ETLInst2016_1_25]

    Click on the Progress tab to locate the Information message we generated in the Script Task:

    [Image: ETLInst2016_1_26]

    Voilà! It works.

    Checking for Data

    Now let’s check the database to see if our execution was logged. Open SSMS, connect to your SSISRunTimeMetrics database, and execute a query to check the contents of the ssis.RunTimeMetrics table:

    select * from ssis.RunTimeMetrics

    [Image: ETLInst2016_1_27]

    This is a good sign. Our data is being logged and shows the package started execution.

    Logging Execution Completion

    It's time to add a procedure to log the end of package execution, but we have a problem: How do we identify the row we just added to the ssis.RunTimeMetrics table? Think about it. We just inserted a row, and since that's the only row in the table it's pretty simple to identify. Later, we will have lots of rows in the table in varying states. We need a method to positively identify the row we're working with.

    We could use the last row inserted - apply Max(id) criteria - but what happens when our metrics table grows to include multiple packages? One package could start while another was executing, and we'd update the wrong row.

    We require a means to positively identify the row when added to the table. Such a means exists. If we modify the ssis.usp_RecordManualPackageStart stored procedure we can have it return the value of the inserted ID. The OUTPUT clause gives us access to the inserted and deleted virtual tables. Modify the stored procedure by adding the following OUTPUT clause as shown:

    output inserted.id as 'Inserted_ID'

    Execute the Transact-SQL to drop and recreate the ssis.usp_RecordManualPackageStart stored procedure:

    print 'ssis.usp_RecordManualPackageStart Stored Procedure'
    if exists(select s.name + '.' + p.name
              from sys.procedures p
              inner join sys.schemas s on s.schema_id = p.schema_id
              where p.name = 'usp_RecordManualPackageStart'
               and s.name = 'ssis')
    begin
      print ' - Dropping ssis.usp_RecordManualPackageStart Stored Procedure'
      Drop Procedure ssis.usp_RecordManualPackageStart
      print ' - ssis.usp_RecordManualPackageStart Stored Procedure dropped'
    end
    print ' - Creating ssis.usp_RecordManualPackageStart Stored Procedure'
    go 

    Create Procedure ssis.usp_RecordManualPackageStart
      @packageStartDateTime datetime = null
     ,@packageName varchar(255) = null
     ,@packageVersion varchar(25) = null
    As
    begin
      -- set @packageStartDateTime default...
      declare @Now datetime
      set @Now = GetDate()

      if (@packageStartDateTime Is Null)
       set @packageStartDateTime = @Now

      -- insert the run time metrics data, returning the new id
      -- to the caller via the OUTPUT clause...
      insert into ssis.RunTimeMetrics
       (packageStartDateTime
       ,packageName
       ,packageVersion
       ,packageStatus)
      output inserted.id as 'Inserted_ID'
      values
      (@packageStartDateTime
      ,@packageName
      ,@packageVersion
      ,'Started')

    end
    go

    print ' - ssis.usp_RecordManualPackageStart Stored Procedure created'
    go

    Add another Int32 Data Type variable named PackageInstanceId:

    [Image: ETLInst2016_1_28]

    Open the editor for the “SQL Log Manual Start of Package Execution” Execute SQL Task. Change the ResultSet property to “Single row”:

    [Image: ETLInst2016_1_29]

    Click the Result Set page and add a new resultset. Name it “0” (ordinals work well) and assign the results of this query to the PackageInstanceId variable you just created. This will push the results of the OUTPUT clause in the stored procedure (which will include the id of the newly inserted row) into the PackageInstanceId variable. You can then access it later in the package to update the precise row:

    [Image: ETLInst2016_1_30]
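    If you want to see exactly what the Execute SQL Task will receive, you can exercise the modified procedure directly in SSMS first (the parameter values below are made up for illustration):

    use SSISRunTimeMetrics
    go

    -- returns a single-row result set containing Inserted_ID,
    -- courtesy of the OUTPUT clause added above
    exec ssis.usp_RecordManualPackageStart
      @packageName = 'SSISRunTimeMetrics'
     ,@packageVersion = '1.0.1.20151231'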

    Next, edit the Script Task by adding PackageInstanceId to the list of ReadOnlyVariables:

    [Image: ETLInst2016_1_31]

    Then edit the script in public void Main() to read:

    public void Main()
    {
        string packageVersion = Dts.Variables["User::PackageVersion"].Value.ToString();
        int packageInstanceId = Convert.ToInt32(Dts.Variables["User::PackageInstanceId"].Value);
        string msg = "Version: " + packageVersion + " PackageInstanceId: " + packageInstanceId.ToString();
        bool fireAgain = true;

        Dts.Events.FireInformation(1001, "SSISRunTimeMetrics", msg, "", 0, ref fireAgain);

        Dts.TaskResult = (int)ScriptResults.Success;
    }

    Next, add a new stored procedure named ssis.usp_RecordManualPackageEnd to the SSISRunTimeMetrics database using the following Transact-SQL script:
    use SSISRunTimeMetrics
    go

    print ''
    print 'ssis.usp_RecordManualPackageEnd Stored Procedure'
    if exists(select s.name + '.' + p.name
              from sys.procedures p
              inner join sys.schemas s on s.schema_id = p.schema_id
              where p.name = 'usp_RecordManualPackageEnd'
              and s.name = 'ssis')
    begin
      print ' - Dropping ssis.usp_RecordManualPackageEnd Stored Procedure'
      Drop Procedure ssis.usp_RecordManualPackageEnd
      print ' - ssis.usp_RecordManualPackageEnd Stored Procedure dropped'
    end
    print ' - Creating ssis.usp_RecordManualPackageEnd Stored Procedure'
    go

    Create Procedure ssis.usp_RecordManualPackageEnd
      @id int
    ,@packageStatus varchar(25) = 'Succeeded'
    As
    begin

      -- set @packageEndDateTime default...
      declare @packageEndDateTime datetime = GetDate()

      -- update the run time metrics data...
      update ssis.RunTimeMetrics
       Set packageEndDateTime = @packageEndDateTime
          ,packageStatus = @packageStatus
      where id = @id

    end

    go

    print ' - ssis.usp_RecordManualPackageEnd Stored Procedure created'
    go
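    Before wiring up the task, you can smoke-test the new procedure in SSMS (the id value assumes the first row inserted during testing):

    use SSISRunTimeMetrics
    go

    -- close out the row created by the Start procedure
    exec ssis.usp_RecordManualPackageEnd @id = 1, @packageStatus = 'Succeeded'

    select * from ssis.RunTimeMetrics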

    Now add a new Execute SQL Task named "SQL Log Manual End of Package Execution":

    [Image: ETLInst2016_1_32]

    Open the Execute SQL Task Editor and set the ConnectionType to ADO.Net and the Connection to your SSISRunTimeMetrics connection manager. Edit the SQLStatement property, changing it to ssis.usp_RecordManualPackageEnd, and set the IsQueryStoredProcedure property to True. Navigate to the Parameter Mapping page and add a mapping for PackageInstanceId as shown here:

    [Image: ETLInst2016_1_33]

    Testing!

    That’s all we need to conduct our next test, so let’s roll (as my friend Chris Yates [blog] says)!

    [Image: ETLInst2016_1_34]

    The package execution succeeds – that’s a good start. Let’s look at the Progress tab:

    [Image: ETLInst2016_1_35]

    Our information message tells us we created PackageInstanceId 2 – another good sign. Next, let’s look at the metadata in our database:

    [Image: ETLInst2016_1_36]

    w00t! Very cool: our metadata shows us when the package started and ended!

    Conclusion

    “In an age of the SSIS Catalog, why would one ever employ this kind of metadata collection, Andy?” That’s a fair question. The SSIS Catalog is an awesome data integration execution, logging, and externalization engine. There are a handful of use cases, though, where enterprises may opt to continue executing SSIS packages from the file system or the MSDB database. Perhaps the biggest reason to do so is that this is the way the enterprise currently executes SSIS. When SSDT-BI converts pre-Catalog-era (2005, 2008, 2008 R2) SSIS packages to the current version, it imports these packages into a “Package Deployment Model” SSIS Project. This allows developers to upgrade their SSIS projects to SSIS 2016 (and enjoy many benefits for doing so) while continuing to execute SSIS packages from the file system. Kudos to the Microsoft SSIS Development Team for this backwards compatibility!

    If you do not have an SSIS Catalog collecting this kind of metadata for you, creating a database similar to SSISRunTimeMetrics is a valid solution. Yes, you have to manually add ETL instrumentation to your SSIS packages, but I believe the cost and overhead are outweighed by the value of collecting this data integration execution metadata.
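    Once a few executions land in the table, the payoff is simple duration reporting. A minimal sketch:

    use SSISRunTimeMetrics
    go

    -- execution counts and average durations (in seconds) per package
    select packageName
          ,Count(*) as executions
          ,Avg(DateDiff(ss, packageStartDateTime, packageEndDateTime)) as avgSeconds
    from ssis.RunTimeMetrics
    where packageStatus = 'Succeeded'
    group by packageName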

    As always, I welcome your thoughts on the matter!

    :{>

  • SSIS Design Patterns and BIML: A Day of Intelligent Data Integration

    I’m honored to deliver a brand new precon (the PowerPoint slides still have that new slide smell) at SQL Saturday #477 – Atlanta – BI Edition! The precon is titled SSIS Design Patterns and Biml: A Day of Intelligent Data Integration.

    Abstract

    Join Andy Leonard, founder of Enterprise Data & Analytics, SSIS coach and trainer, author, and consultant, for a day of training focused on intelligent data integration. Andy is an experienced SSIS consultant, but he also led an enterprise team of 40 ETL developers during projects that spanned 2.5 years. And delivered.

    The target audience for this course is data integration developers and architects who want to learn more about SSIS performance, DevOps, execution, and automation.

    Attendees will learn:
    - a holistic approach to data integration design.
    - a methodology for enterprise data integration that spans development through operational support.
    - how automation changes everything. Including data integration with SSIS.

    Topics include:
    1. SSIS Design Patterns
       Data Flow Performance
       ETL Instrumentation
    2. Executing SSIS in the Enterprise
       The SSIS Catalog - the good, the bad, and the ugly.
    3. Custom SSIS Execution Frameworks
    4. DevOps and SSIS
       A (Condensed) Sprint in the Life of a Data Integration Solution
       Version Control and SSIS
    5. Business Intelligence Markup Language
       A Brief Introduction to Biml in BimlStudio
    6. SSIS Design Patterns + Biml
       Putting the pieces together.
    7. SSIS Design Patterns + Biml + Custom SSIS Execution Frameworks
       Executing the new combinations.
    8. SSIS Design Patterns + Biml + Custom SSIS Execution Frameworks => DevOps
       Enterprise-class data integration with SSIS.

    Register today!

    I’m also presenting Using BIML as an SSIS Design Patterns Engine at SQL Saturday #477 – Atlanta – BI Edition! Right after my presentation, Scott Currie (the guy who created Mist and BimlOnline, and who shepherds Biml) is presenting SSIS Unit and Integration Testing with the Open Source BEST project!

    As if that’s not enough, Patrick LeBlanc and Adam Saxton are also delivering a precon titled Microsoft BI in a Day. There are tons of really smart speakers lined up to present (plus me)! It’s going to be a great event!!

    You can register for SQL Saturday #477 – Atlanta – BI Edition here. I hope to see you there!

    :{>

  • Presenting “Use Biml to Automate SSIS Design Patterns” at SQL Saturday-Nashville!

    I’m honored to present “Use Biml to Automate SSIS Design Patterns” at SQL Saturday – Nashville 16 Jan 2016!

    Abstract:

    Perhaps you’ve heard the buzz about Business Intelligence Markup Language (Biml) and wondered, “Why is everyone so excited about Biml?” Occasionally, a new technology emerges that changes everything. For SSIS developers, Biml is one such technology. Biml is a powerful solution for rapidly creating SSIS packages and solutions, and SSIS Design Patterns support various load profiles. Combined, Biml and SSIS Design Patterns offer a compelling solution for automating enterprise data integration.
    
    In this session, you will learn:
    - How to rapidly build multiple SSIS packages using Biml
    - How to apply different SSIS Design Patterns using Biml
    - How to automate SSIS Design Patterns using Biml

    Register today! I hope to see you there!

    :{>
