Working with SqlBulkCopy

Friday, February 8, 2013

The SqlBulkCopy class is invaluable for moving large amounts of data into SQL Server from a .NET application. I've been working with the class recently and taking notes on a few quirks and features.

Async Bulk Copy

Like most other areas of the .NET framework, SqlBulkCopy provides await-able methods for C# 5. In fact, the number of awaits in a bulk copy operation is remarkable (3 awaits across 4 async calls):

public async Task Execute()
{
    using(var sourceConnection = new SqlConnection(_sourceConnectionString))
    using(var targetConnection = new SqlConnection(_targetConnectionString))
    using(var sourceCommand = new SqlCommand(_selectQuery, sourceConnection))
    using(var bulkCopy = new SqlBulkCopy(targetConnection))
    {
        await Task.WhenAll(sourceConnection.OpenAsync(),
                           targetConnection.OpenAsync());

        var reader = await sourceCommand.ExecuteReaderAsync();

        bulkCopy.DestinationTableName = _targetTable;
        await bulkCopy.WriteToServerAsync(reader);
    }
}

How Many Rows?

An obvious statistic you'll want to know about is how many total records have moved into the target table. Unfortunately, SqlBulkCopy makes this difficult. Although the class does have a SqlRowsCopied event, the event doesn't fire at the end of execution, but only on every N number of records (where N is configurable). Thus, if N is set to 50,000, you'll know when you copy 50,0000 records and 100,000 records, but won't know if you copied 29,000 more records after the first 50,000.

bulkCopy.NotifyAfter = 50000;
bulkCopy.SqlRowsCopied += (sender, args) =>
{
    Console.WriteLine(args.RowsCopied);
};

There are several workarounds, as suggested in a StackOverflow post:

- Use reflection to dig out the value of _rowsCopied (a private field in a SqlBulkCopy object).

- Execute a SELECT COUNT on the target table after the bulk copy completes (possibly not accurate).

- Provide your own wrapper/adapter for IDataReader and count the number of MoveNext calls with your own code. This is slightly painful since you can't inherit from SqlDataReader (because of an internal constructor), and need to implement dozens of methods on IDataReader (just forwarding calls to the inner reader). It's scenarios like this where I wish C# had an easy way to delegate member invocations (like Groovy). 

Data Type Issues

SqlBulkCopy is particular about matching columns in the source and destination, so chances are you won't be able to bulk copy without configuring a column mapping for the operation.  SqlBulkCopy is also particular about data types, and here there is limited flexibility in code. For example, if you want to copy a NULL value into a textual destination column, you'd think it would this easy:

SELECT Column1, NULL as Column2, Column3 FROM Table

This gives a runtime error, however:

System.InvalidOperationException: The locale id '0' of the source column and the locale id '1033' of the destination column do not match.

The only way I found to avoid the error is to make sure the NULL value is a "textual" NULL value (where the CAST type doesn't have to exactly match the destination column type, i.e. for this example the destination could be nchar(10), but the query still works):

SELECT Column1, CAST(NULL as varchar(1)) as Column2, Column3 FROM Table

Numbers can also present some difficulties. If I want to copy a decimal(20,7) column to a decimal(20,7) column there is no problem, but if I want to aggregate a decimal(20,7) column and place the result into a decimal(20,7) column, there will be an exception. In other words, the following query:

SELECT Column1, SUM(aColumn) as Column2 ...

Will throw an InvalidOperationException (The given value of type SqlDecimal from the data source cannot be converted to type decimal of the specified target column.). I believe this is because SUM takes a decimal(20,7) and returns a decimal(38,7). Again, the only work around I found was an explicit convert in the query to keep the data types as decimal(20,7).

Hope that helps anyone working with SqlBulkCopy.


Comments
Maris Friday, February 8, 2013
To use SqlBulkCopy you also have to set permissions correctly in SQL Server. http://msdn.microsoft.com/en-us/library/ms188365.aspx
Anonymous Monday, February 11, 2013
You can (I think) get around the NULL problem by passing in an explicit column mapping with the SqlBulkCopyColumnMapping object. If you don't do that, SqlBulkCopy tries to guess the data types you're tossing in, which doesn't work when you're just passing in a NULL.
gravatar Scott Monday, February 11, 2013
I did have to configure (conventionally) column mappings for every copy operation to avoid identity columns, computed columns, and out of order columns. Unfortunately, I never found a way to specify datatypes in the column map, which would have made this easier.
Comments are now closed.
by K. Scott Allen K.Scott Allen
My Pluralsight Courses
The Podcast!