You can get error column names in SSIS with two script tasks/components - no custom DLLs, or Components. Clone the repository and pry into it yourself!.
For a full BIML treatment, see this repository.
When I saw the output Error columns in SSIS’s Dataflow Task (DFT) for the first time, I thought I had surely misconfigured something, or not followed a certain step. I mean, I was working with actual data and columns that had human-readable names - but I was getting integers in these Error Code/Column columns…
But like I said, I figured it was operator error, and so I searched my way into hours of disappointment. Microsoft was kind enough to enhance the error description, as described here. But that still didn’t get a column name.
When it came to getting the column name, all signs pointed back to that LineageID - it is the link between a column name at a certain point in the pipeline, and the integer SSIS is using to identify that column. In the existing answer set on how to grab LineageIDs, I found a web of hacks that not only didn’t work, but seemed to be pointing in opposite directions (are my LineageIDs in my dtsx, and are they ints
or text?). So here are a couple of important takeaways if you deviate from the solution I will provide later.
LineageIDs
by reading the dtsx or trying a Convert.ToIntXX
call will end in disappointment because LineageIDs in the dtsx are now stored as textual path references.Package.Executables
, and explore them later in the DFT for some Just-In-Time LineageID lookup fun (you’ll also likely need a reference to the ManagedDTS assembly).I realize that parts of this solution may not work for everyone, say those people with a parent package with a million child packages, and a billion columns. But it’s a proof of concept, and I’m sure you’ll find a way to make this work at runtime in your own situation. A lot of credit here goes to the method outlined by Dougbert in this post.
The nice parts of this solution are:
[![CDATA]]
tags around the entire codeblocks for the SCT and SC)Here’s the basic algorithm we’ll follow in the code.
Dictionary<int, string>
variable (let’s call it lineageIds
) to store our LineageIDslineageIds
’s ParentExecutables
searching for Dataflows
to extract the LineageIDslineageIds
Dictionary to extract both the name of the error column and error descriptionThe Script Task to get the LineageIDs is straight forward. There is a section of code writing out to a temp file to show the proof of concept. Just follow these steps before implementing the code.
User::execsObj
(your Executables object) and User::lineageIds
(your Dictionary<int,string>
).Microsoft.SqlServer.DTSPipelineWrap.dll
for the MainPipe
objectNow for the code.
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using System.Windows.Forms;
using System.Collections.Generic;
using System.IO;
[Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute]
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{
Dictionary<int, string> lineageIds = null;
public void Main()
{
// Grab the executables so we have to something to iterate over, and initialize our lineageIDs list
// Why the executables? Well, SSIS won't let us store a reference to the Package itself...
Dts.Variables["User::execsObj"].Value = ((Package)Dts.Variables["User::execsObj"].Parent).Executables;
Dts.Variables["User::lineageIds"].Value = new Dictionary<int, string>();
lineageIds = (Dictionary<int, string>)Dts.Variables["User::lineageIds"].Value;
Executables execs = (Executables)Dts.Variables["User::execsObj"].Value;
ReadExecutables(execs);
// Just proof of concept to see the results before you dedicate your time to the solution
// Delete this code in your actual implementation
using (StreamWriter writetext = new StreamWriter(@"C:\temp\write.txt", true))
{
foreach (var kvp in lineageIds)
writetext.WriteLine(kvp.Key + " : " + kvp.Value);
}
Dts.TaskResult = (int)ScriptResults.Success;
}
private void ReadExecutables(Executables executables)
{
foreach (Executable pkgExecutable in executables)
{
if (object.ReferenceEquals(pkgExecutable.GetType(), typeof(Microsoft.SqlServer.Dts.Runtime.TaskHost)))
{
TaskHost pkgExecTaskHost = (TaskHost)pkgExecutable;
if (pkgExecTaskHost.CreationName.StartsWith("SSIS.Pipeline"))
{
ProcessDataFlowTask(pkgExecTaskHost);
}
}
else if (object.ReferenceEquals(pkgExecutable.GetType(), typeof(Microsoft.SqlServer.Dts.Runtime.ForEachLoop)))
{
// Recurse into FELCs
ReadExecutables(((ForEachLoop)pkgExecutable).Executables);
}
}
}
private void ProcessDataFlowTask(TaskHost currentDataFlowTask)
{
MainPipe currentDataFlow = (MainPipe)currentDataFlowTask.InnerObject;
foreach (IDTSComponentMetaData100 currentComponent in currentDataFlow.ComponentMetaDataCollection)
{
// Get the inputs in the component.
foreach (IDTSInput100 currentInput in currentComponent.InputCollection)
foreach (IDTSInputColumn100 currentInputColumn in currentInput.InputColumnCollection)
lineageIds.Add(currentInputColumn.ID, currentInputColumn.Name);
// Get the outputs in the component.
foreach (IDTSOutput100 currentOutput in currentComponent.OutputCollection)
foreach (IDTSOutputColumn100 currentoutputColumn in currentOutput.OutputColumnCollection)
lineageIds.Add(currentoutputColumn.ID, currentoutputColumn.Name);
}
}
enum ScriptResults
{
Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
};
}
I place this task below my transformations to catch all of the errors through a Union All. This doesn’t require any special assemblies, but does need ReadOnly access to lineageIDs
.
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.Collections.Generic;
[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
Dictionary<int, string> lineageIds = (Dictionary<int, string>)Variables.lineageIds;
int? colNum = Row.ErrorColumn;
if (colNum.HasValue && (lineageIds != null))
{
if (lineageIds.ContainsKey(colNum.Value))
Row.ErrorColumnName = lineageIds[colNum.Value];
else
Row.ErrorColumnName = "Row error";
}
Row.ErrorDescription = this.ComponentMetaData.GetErrorDescription(Row.ErrorCode);
}
}
What I like about this solution is I don’t have to run any scripts/packages to prepopulate fields - I can just grab the in-memory package Executables, iterate over them, and get what I need. Again, this is an amalgamation of existing ideas, with most of the inspiration coming from Dougbert.
Here’s the nice output I get in SSMS when I need to see where I’m erring!
Clone the repository and pry into it yourself!
For a full BIML treatment, see this repository.