Loading data into a partitioned table using parallel bcp lets you direct the data to a particular partition in the table.
Before you run parallel bulk copy, the table should already be partitioned and located on its intended segment.
You should drop all indexes, so that you do not experience failures due to index deadlocks.
Use alter table...disable trigger so that bcp uses fast, minimally logged bulk copy instead of slow bulk copy, which is fully logged.
You may also want to set the database option trunc log on chkpt to keep the log from filling up during large loads.
To divide the input data among the partitions, you can either use operating system commands to split the input file into separate files and copy each one, or use the -F (first row) and -L (last row) command-line flags for bcp on a single file.
Whichever method you choose, be sure that the number of rows sent to each partition is approximately the same.
Here is an example using separate files:
    bcp mydb..huge_tab:1 in bigfile1
    bcp mydb..huge_tab:2 in bigfile2
    ...
    bcp mydb..huge_tab:10 in bigfile10
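The separate-file approach could be scripted roughly as follows. This is only a sketch under stated assumptions: the function name split_for_partitions is hypothetical, each row is assumed to occupy exactly one line of the input file (newline row terminator, no embedded newlines), and the output files are named by appending the partition number to a prefix, as in the example above. Because the chunk size is rounded up, the last file may hold fewer rows than the others, and a small input may produce fewer files than partitions.

```shell
# Hypothetical helper: split INFILE into up to NPARTS files named PREFIX1, PREFIX2, ...
# Assumes one row per line; chunk size is the row count divided by NPARTS, rounded up.
split_for_partitions() {
    infile=$1
    nparts=$2
    prefix=$3

    total=$(wc -l < "$infile")
    per=$(( (total + nparts - 1) / nparts ))

    awk -v per="$per" -v prefix="$prefix" '
        {
            n = int((NR - 1) / per) + 1      # partition number for this row
            out = prefix n
            if (out != prev) {               # moved on to the next chunk
                if (prev != "") close(prev)
                prev = out
            }
            print > out
        }
    ' "$infile"
}
```

For example, split_for_partitions bigfile 10 bigfile would write bigfile1, bigfile2, and so on. Each file can then be copied into its partition with a separate bcp session, for instance by starting each bcp command in the background and waiting for all of them to finish.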
This example uses the first row and last row command-line arguments on a single file:
    bcp mydb..huge_tab:1 in bigfile -F1 -L100000
    bcp mydb..huge_tab:2 in bigfile -F100001 -L200000
    ...
    bcp mydb..huge_tab:10 in bigfile -F900001 -L1000000
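The row ranges for -F and -L can be computed rather than typed by hand. The sketch below only prints the bcp command lines; it does not run them. The function name print_range_commands is hypothetical, and the total row count is assumed to be known in advance. Rows are divided as evenly as possible, with the chunk size rounded up, so the last partition may receive slightly fewer rows.

```shell
# Hypothetical helper: print one bcp command per partition with -F/-L row ranges.
# Arguments: table name, input file, total row count, number of partitions.
print_range_commands() {
    table=$1
    infile=$2
    total=$3
    nparts=$4

    per=$(( (total + nparts - 1) / nparts ))   # rows per partition, rounded up
    i=1
    while [ "$i" -le "$nparts" ]; do
        first=$(( (i - 1) * per + 1 ))
        last=$(( i * per ))
        if [ "$last" -gt "$total" ]; then
            last=$total                        # clamp the final range
        fi
        if [ "$first" -le "$total" ]; then     # skip partitions with no rows left
            echo "bcp $table:$i in $infile -F$first -L$last"
        fi
        i=$(( i + 1 ))
    done
}
```

Calling print_range_commands mydb..huge_tab bigfile 1000000 10 prints the ten commands abbreviated in the example above, from -F1 -L100000 through -F900001 -L1000000.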
If you have space to split the file into multiple files, copying from separate files is much faster than using the first row and last row command-line arguments, because bcp must parse each line of the input file when you use -F and -L. This parsing can be very slow, almost negating the benefits of parallel copying.