When you use parallel bulk copy, IDENTITY columns can cause a bottleneck.
As bcp reads in the data, the utility both generates the values of the IDENTITY column and updates the IDENTITY column’s maximum value for each row. This extra work can erode the performance gain that you expect from parallel bulk copy. To avoid this bottleneck, explicitly specify the IDENTITY starting point for each session.
On the command line, use -g id_start_value to specify the IDENTITY starting point for a session.
The -g parameter instructs Adaptive Server to generate a sequence of IDENTITY column values for the bcp session without checking and updating the maximum value of the table’s IDENTITY column for each row. Instead of checking, Adaptive Server updates the maximum value at the end of each batch.
bcp [-gid_start_value]
bcp mydb..bigtable in file1 -g100
bcp mydb..bigtable in file2 -g200
bcp mydb..bigtable in file3 -g300
bcp mydb..bigtable in file4 -g400
Know how many rows are in the input files and what the highest existing value is. Use this information to set the starting values with the -g parameter and generate ranges that do not overlap.
In the example above, the starting values are 100 apart, so if any file contains more than 100 rows, its identity values run into the range of the next session, creating duplicate identity values.
Verify that no one else is inserting data that can produce conflicting IDENTITY values.
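The range planning described above can be sketched in a few lines of Python. This is only an illustration, not part of bcp: the function name plan_identity_starts, the file names, and the row counts are hypothetical, and the sketch assumes that each loaded row consumes exactly one IDENTITY value, so that a file with N rows needs a range of N consecutive values.

```python
def plan_identity_starts(row_counts, highest_existing):
    """Return one -g starting value per input file.

    Assumption: each row consumes exactly one IDENTITY value, so the
    next session can start right after the previous file's last row.
    """
    starts = []
    next_start = highest_existing + 1  # first value not yet used in the table
    for count in row_counts:
        starts.append(next_start)
        next_start += count  # reserve `count` values for this file
    return starts


# Hypothetical input files and their row counts (for example, from a line count).
files = {"file1": 100, "file2": 250, "file3": 80, "file4": 120}

# Hypothetical highest IDENTITY value currently in the table.
starts = plan_identity_starts(list(files.values()), highest_existing=99)

# Print one bcp command line per session, using the -g syntax shown above.
for name, start in zip(files, starts):
    print(f"bcp mydb..bigtable in {name} -g{start}")
```

Deriving each starting value from the actual row counts, rather than using fixed steps of 100, keeps the ranges from overlapping when the files differ in size.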