Archive

Posts Tagged ‘SSIS’

Loading SAS files with SSIS

February 8, 2012 23 comments

Hi

I’ve been given an assignment to load data from a SAS Institute data file to SQL Server using SSIS. How hard can that be I thought to my self. All I needed to do was to find the oledb provider for SAS, install it on my machine, and use it within SSIS, a no brainer!

Well first it took some time for me to actually find the oledb provider. You see SAS Institute provide it for free, which is great, but they kinda keep it in a secret place. If you google “sas oledb provider” the first result is this, which is a general documation spot for the SAS Oledb Provider, which is a great source for content regarding how to use the provider, but where to download it? Well you need to click on some links, only to get to a page that tells you where you can download it, but guess what, the part that tells you where to download it from is NOT a link, it’s URL indeed, but it’s not marked as a link, and very hard to find in all the other text on the page. Unfortunately I’m not a reader, I’m a scanner (we’ll get back to that later also). When I need to read something fast, which is almost always, I scan the text looking for known patterns instead of actually reading the entire content. So when I look for a link for downloading the file, I tend to not read any irrelevant text, which in this case was the link, disguising as normal text. But I’m a good person, so I’ll help you guys with that, here is the direct link to the download page, if you need to download it. You’ll need to register and confirm by an email first.

Now that I got the provider, I opened SSIS and did the following:

  1. Created anew connection manager to my SAS file using the “Native OLE DB\SAS Local Data Provider 9.3”
  2. In the Server or FileName I wrote the full path of the file name (which we’ll later go back to)
  3. Pressing Test Connection to make sure it works and pressed OK
  4. Added a new OLE DB Source to my data flow task
  5. In the OLE DB Source Editor I tried to pick the table in the SAS file, only to get an error message with the following content: “fn =ydedopn; context =sasej9lib.dll.”

So I thought I’ll google that, and guess what, one result only, to a linked in post of a guy asking for help with the same problem, no one with a solution.

I thought well let’s try and isolate the issue, so I started windows console project, to test to connectivity logic using an OleDb provider to the SAS file. I’ve spent 2 hours on this, trying with all kind of different ways to make it work, but either I got an empty data table, or the same error message as in SSIS. Frustrated I kept on going back and forward between the SAS website documentation, looking for a hint, and there, just before I was to give up, I saw the problem!

In the documentation it states that the datasource property is “A physical directory that contains the SAS data set that you want to access with the connection.” How did I missed that? (damn you scanners!) I shouldn’t have used the file name, but the file directory! The provider treats the folder as the “database” and each file is a “table” in the “database”!

Changing that simple fact, I was able to select a table from the Ole Db Source Editor, and beside some page code warnings which I’ll look into later, everything works as is should.

So the conclusions are:

  • don’t scan the documentation, read it! (this conclusion is pointed to me only)
  • when pointing to a SAS Institute files use only the directory in the data source!

I hope this will help others

/Rafi

Categories: Uncategorized Tags: ,

Catch Custom Events with SSIS Event Handler

January 5, 2012 6 comments

Background

When I started working for PensionDanmark summer 2011, we started looking into using SSIS as the main ETL tool for the Datawarehouse team. One of the main requirements were to be able to bring SSIS into the enterprise scale, and to be able to design and implement a pattern that will be able to cover a wide range of ETL related tasks though out the enterprise.

I knew that SSIS is lacking some of these enterprise needs, and when I saw Andy Leonard approach using a framework, I decided to try and implement one similar in PensionDanmark. What really got my attention was the ability to have a meta data driven approach to the execution of SSIS, and the ability to centralize the settings of logging, and by that to allow multiple child package to be simple and yet to inherit a complex set of events logging, that can be maintained easily, without effecting the child packages.

After the implementation of the first version of the framework, we had a request from the users to be able to report on non standard events. I thought that it should be quite simple, as SSIS has a custom event, and SSIS Script Task Component has the ability to fire custom events. But I was wrong to think it was that simple, because the SSIS event handler has no implementation for the custom event.

The problem description is like this: the frameworks central logging feature is build on the fact that the event handlers of the parent package (the framework package) is able to catch events that are fired in the child package too. But there is no event handler for the custom event, so what can we do?

A Solution

What we need is a workaround, that will allow us to catch a custom event, fired in the child package, using an event handler of the parent package. I started to look into the types of events which have an event handler, and to see weather I could hijack one, and make it act as a custom event handler, and than I saw it, the perfect victim for the task,  the VariableChanged event handler!

Why is it perfect?

  • I’ve never used it in production in my years developing SSIS packages.
  • The event is fully controlled by the developer. That is the SSIS developer chooses to enable this event on a variable. This is something we cannot do with any other event!
  • I believe that it’s by far a event that one is willing to live without for the price of having custom event in a centralize manner. I know I would :-).

Let’s take a look on how we actually doing it:

Here is the parent package, it’s made simple, in order to better illustrate the event catch scenario. An SSIS Framework looks of course more complex than this simple parent package.

The parent package

The parent package has a script component in the OnVariableValueChanged. With that script we will need to catch the variables which are available for this type of event, and use them to log the real event that is fired by the child package.

The Parent Package Event Handler

This is how the script looks like:

'The follwing variables are accesable from this type of event handler:
'System::TaskName
'System::SourceName
'System::VariableDescription
'System::VariableID
'System::VariableName
'System::VariableValue

Public Sub Main()
    Dim S As String

    'Building a string that proof that the values of the variables
    ' is originated from the event fires in the child package
    S = "VariableName: " & Dts.Variables("System::VariableName") _
      .Value.ToString() & vbCrLf _
      & "VariableID: " & Dts.Variables("System::VariableID") _
      .Value.ToString() & vbCrLf _
      & "VariableDescription: " & Dts.Variables("System::VariableDescription") _
      .Value.ToString() & vbCrLf _
      & "VariableValue: " & Dts.Variables("System::VariableValue") _
      .Value.ToString() & vbCrLf _
      & "TaskName: " & Dts.Variables("System::TaskName") _
      .Value.ToString() & vbCrLf _
      & "SourceName: " & Dts.Variables("System::SourceName") _
      .Value.ToString()

    'Showing the string value as a message box
    MsgBox(S)

    Dts.TaskResult = ScriptResults.Success
End Sub

Note I’m not telling you here how to log the event, but just showing which variables are available for usage, and how to manipulate this variables from the event fire method

Here is a screen shot of the child package:

The child package

The first script task is there just to proof the fact that the event is not fired, unless we set the event on a variable.

The second script task is where the magic is done, let’s look into the code:

Public Sub Main()

        Dts.Events().FireCustomEvent("OnVariableValueChanged", _
                                     "", _
                                     New Object(3) _
                                        { _
                                        "First Value", _
                                        "Second Value", _
                                        "Third Value", _
                                        "Forth Value"}, _
                                     "", _
                                     False)

        Dts.TaskResult = ScriptResults.Success
    End Sub

What we are doing here is this, we fire a custom event, of type OnVariableValueChanged. The FireCustomEvent can take an object, as the third parameter, and this object is later used by the SSIS Event Handler. I’ve doing some testing, and found out that at least 4 items can be included in this object, and that this 4 objects correspond to 4 SSIS system variables, which are accessible in the SSIS Event Handler. All 4 are strings, so we must use strings, to prevent errors.

To illustrate that, I’m sending 4 text values, let’s look into what happens when we ran the parent package:

Running the parent package

Let’s zoom in:

The message box

In real life, we will use this 4 strings to send information about the custom event, it’s source, level, severity what ever is relevant for the specific system requirements.

The entire project is available for download.

Categories: Uncategorized Tags: ,