icrosoft introduced a whole new way to migrate and transform data when it packaged Data Transformation Services (DTS) with SQL Server 7.0. Although many developers use DTS for simple transformations, they often hesitate to use it for more complex, mission-critical, data-transformation tasks. This article will show you how to move beyond simple data transformations and implement custom code in DTS.
What you need:
SQL Server 7.0
Microsoft Internet Explorer 5.0
|
|
DTS lets you use the full capabilities of an ActiveX scripting language to enhance and customize your data migrations. This versatility can be invaluable when dealing with large, complex data transformations. DTS accepts scripts written in Microsoft Visual Basic Scripting Edition (VBScript), JScript, or PerlScript, provided you have the scripting library for the appropriate scripting language installed on your machine. Microsoft Internet Explorer 5.0 installs the libraries for both VBScript and JScript by default.
For this article, I've created a simple DTS package that illustrates how to create and use ActiveX scripts within DTS. This package imports information on publishing companies from the Publishers table in Access' Biblio database into a new table in SQL Server's Northwind database. You can download this code project here.
You can use active scripts in three different locations within a DTS package: in a transformation, a workflow, or an ActiveX scripting task. Before looking at the specifics of these three scripting locations, you should understand how to use global variables within DTS. DTS supports using global variables to pass data or object references between different scripts within your DTS package. You can declare and initialize global variables either when you create your DTS package or dynamically when you run the package.
Select Package | Properties from the DTS Designer menu to create global variables at design time. The DTS Package Properties dialog box appears. Click on the Global Variables tab, which displays all global variables, their datatypes, and initial values. Add a new global variable by clicking on the New button and specifying the variable's name, datatype, and initial value. In this article's sample code, I created one global variable called RecordsImported and gave it an initial value of zero. Use this variable to track how many records are imported into SQL Server successfully. Use the DTSGlobalVariable object in your ActiveX script to create a new global variable or to access existing global variables dynamically at run time.
| |  |
Figure 1 Use ActiveX Scripts During Trans-formations. Click here.
|
Modify Column Transformations
ActiveX scripts used within a transformation let you modify a column transformation between the source and destination. To modify the ActiveX script you'll use during a transformation, right-click on a transformation object in the DTS Designer and select the Properties menu. Then click on the Transformations tab in the Data Transformation Properties dialog box (see Figure 1).
There are two types of transformations between columns: a column copy and an ActiveX script. A column copy ports the data in the source column exactly as-is into the destination column. When you first access the Transformations tab after creating a transformation, the tab shows that by default DTS has set up a straight column copy between the source and the destination objects.
In the sample DTS package, you'll import only the records with a PubID value greater than 500. To do this, implement a many-to-many relationship between the source and destination columns. First, delete the default transformations DTS created by clicking on the transformation arrow, then clicking on the Delete button. Now you can create new ActiveX transformations. Creating an ActiveX transformation between columns requires several steps. First, highlight the desired column or columns from the Source Table and Destination Table listboxes. Then select ActiveX Script from the New Transformation drop-down listbox and click on the New button. The ActiveX Script Transformation Properties dialog box appears (see Figure 2).
The default code in the Script textbox shows you how to program a straight column copy between the source and destination column(s). You can modify the code in this window to do anything the scripting language allows. Add a simple If...Then...End If conditional to your transformation script to perform the import only if the PubID's value is greater than 500. For each insertion, update the global record count variable by one, so each record has the correct value (see Listing 1).
The return code for the script is important because it tells DTS what to do next. The constant DTSTransformStat_OK (one) tells DTS to continue with the import. The constant DTSTransformStat_SkipRow (two) tells DTS not to import the row. A number of other constants are accepted as return codes for a transformation (see Table 1). Refer to the Microsoft help files for more information about these and other return codes.
By default, the Main function contains the entire transformation. However, you can create as many functions as needed to complete a complex transformation. You need to enter the name of the function you want to use as the startup function in the Entry Function textbox if you choose to add additional functions. The Parse button allows you to check your code for syntax errors. The Test button executes the migration and puts the results into a temporary file. This lets you see exactly how your data will look after it has been imported, without actually modifying your final destination data. Despite these features, this window is hard to use. You might find it easier to create long, complex scripts outside DTS (for example, in Notepad or Visual InterDev) and import them into the ActiveX Script Transformation Properties window by clicking on the Browse button.
Setting up a many-to-many relationship between the source and destination columns puts you in a good position to enhance the default transformation code by adding additional functions and complex logic. However, you could also have implemented the code example used in this sample application by leaving all the default column transformations alone and modifying only the transformation for the PubID column. In this example, you'd only need to delete and re-create the transformation for the PubID. You'd use this code in the PubID transformation:
Function Main()
'Copy row if id is greater than 500
If DTSSource("PubID").Value > 500 Then
DTSDestination("PubID") = DTSSource("PubID")
DTSGlobalVariables ("RecordsImported").Value = _
DTSGlobalVariables("RecordsImported").Value + 1
Main = DTSTransformStat_OK
Else
Main = DTSTransformStat_SkipRow
End If
End Function
Use ActiveX With Workflow Operations
You can also use ActiveX scripting as a part of a workflow object. If you include an ActiveX script in a workflow object, the script is executed before the task. This gives you the flexibility to do things such as determine whether to cancel the task or initialize global variables. To create an ActiveX script that executes before a workflow operation, right-click on a transformation object and select Workflow Properties. The Workflow Properties dialog box appears (see Figure 3).
Click on the Options tab and check the "Use ActiveX script" checkbox. Clicking on the Properties button takes you to the Workflow ActiveX Script Properties dialog box, which is almost identical to the ActiveX Script Transformation Properties dialog box, except you can't test your code. The source and destination columns in scope aren't listed in the Workflow ActiveX Script Properties box because that information isn't applicable here. Before attempting to execute the import into SQL Server, use the ActiveX script in the workflow object to make sure the Biblio database exists in the correct file path (see Listing 2). A workflow script has three possible return values, represented by the constants DTSStepScriptResult_ExecuteTask (zero), DTSStepScriptResult_DontExecuteTask (one), and DTSStepScriptResult_RetryLater (two).
The third possible location for an ActiveX script in a DTS package is the scripting task, a separate task that executes any ActiveX script. To add a scripting task to your package, select ActiveX Script Task from the DTS Designer's Task menu. The ActiveX Script Task Properties dialog box appears, which is almost identical to the Workflow ActiveX Script Properties dialog box. Here you can create an ActiveX script to perform any additional tasks you might need in your DTS package. The sample code includes an ActiveX task that creates a log file and populates it with the total number of records the Biblio database successfully inserted into SQL Server (see Listing 3). The total number of records inserted into SQL Server is stored in the global variable you created at design time. ActiveX tasks have only two possible return values: DTSTaskExecResult_Success (zero) and DTSTaskExecResult_Failure (one).
Some caveats apply when incorporating ActiveX scripts in your DTS package. First, using ActiveX scripts slows down your package's execution time. DTS packages are slow anyway, so you might want to limit the number of ActiveX scripts you use if performance time is crucial to your operation.
However, you can speed up the scripts you do use. Microsoft recommends using VBScript in DTS packages for the fastest results. JScript is generally the second-fastest scripting language within DTS, followed by PerlScript. Although DTS supports PerlScript, it's not recommended. Microsoft has issued a disclaimer stating that ActiveX scripts and transformations have been fully tested only with VBScript and JScript. Refer to columns by their ordinal values instead of by their names to give an additional performance boost to an ActiveX script used within a transformation. You should define constants whose names are based on the corresponding fields for the ordinal values if you choose this route. Doing so will help your code read better.
Enable Transactions
You'll need more transactions as your DTS packages become more complex. By default, a workflow step in a DTS package cancels its own execution if an error occurs anywhere within that workflow step. However, unless you've set up a precedence to stop or modify package execution if a step fails, other steps in the DTS package continue to execute. In addition, any steps that occurred before the error remain committed to your data destination. Transactions allow you to group steps and commit or cancel them all if an error occurs within any of the transaction steps.
DTS packages enable transactions by default. To check the settings for a package's transactions, select Package/Properties from the DTS Designer's main menu and click on the Advanced tab. The default setting for the transaction's isolation level is Read Committed. This means modifications made by other transactions can't be seen until those transactions have committed. You can choose several other options for the value of the isolation level property (see Table 2).
To incorporate a workflow step in a transaction, right-click on the transformation object and select the Workflow Properties menu. The transaction settings are on the Options tab of the Workflow Properties dialog box. To include a transformation in a transaction, check the Join Transaction If Present checkbox. You can also choose to commit or roll back the transaction when the transformation has completed or failed successfully.
Before implementing transactions in your DTS package, note that these transactions only work if you're using a data provider that supports transactions. In addition, a package can only execute one transaction at a time. Finally, DTS doesn't support multiple, asynchronous transactions.
DTS provides robust, built-in functionality. Combining this functionality with the ability to customize ActiveX scripts within a package gives you a lot of flexibility. You can extend this flexibility by implementing custom tasks created in any development language that supports COM. To create a custom task, first develop the COM component and compile it into a DLL. You can use your object within DTS as long as it supports either the CustomTasks or IDTSCustomTask object interface. To register a custom task in DTS, choose Task | Register Custom Task in the DTS Designer. After you register the custom task, it appears in the DTS Designer toolbar along with the seven built-in tasks. (See Resources for information about creating and incorporating custom tasks.)
You can use ActiveX scripts in your DTS project to access any external COM object with the CreateObject command, so implementing a custom task might seem redundant. However, if you have rules and procedures unique to your business, implementing a custom task allows you to reuse and maintain these processes within multiple DTS packages easily. Implementing custom tasks, instead of using an ActiveX script to reference an external COM object, also reduces your DTS package's complexity. This can boost performance.
You should now have some ideas on how to customize your DTS packages. Next, you might want to do some research. DTS is still relatively new, so you might have trouble finding good documentation and answers to questions. Don't let this deter you from exploring what DTS has to offer. Taking the time to understand how you can use DTS to handle mission-critical tasks can really give your business a boost.
Mary Hooke is the President of Distinctive Development, a software consulting
firm located in New York City. Distinctive Development specializes in the
design and development of interactive web applications. In addition, Mary is
a regular author for Visual Basic Programmer's Journal. Contact Mary at mhooke@DistinctiveDevelopment.com.