Bulk loading data using the LOAD TABLE statement

The LOAD TABLE statement efficiently imports data from a text or binary file into an existing database table. The statement loads data into column indexes created automatically or defined by users.

The permissions needed to execute a LOAD TABLE statement are set on the server command line, using the -gl option. Sybase recommends the -gl all setting, which allows any user to load or unload a table. This is the default setting set by start_iq. If -gl all is set, you must be the owner of the table, have DBA authority, or have ALTER permission, in order to use the LOAD TABLE statement. You also need INSERT permission on the table.

For a description of input file error handling during loads, see the description of the ON FILE ERROR load option in the LOAD TABLE statement in Reference: Statements and Options.

LOAD TABLE supports loading of large object (LOB) data. See Unstructured Data Analytics in Sybase IQ.

Using command files to load data

To load large amounts of data, most users create command files. To create a command file, follow the instructions in Chapter 2, “Using Interactive SQL (dbisql),”in the Utility Guide.

Transaction processing and LOAD TABLE

When you issue the LOAD TABLE statement for an IQ table, a savepoint occurs automatically before the data is loaded. If the load completes successfully, Sybase IQ releases the savepoint. If the load fails, the transaction rolls back to the savepoint. This approach gives you flexibility in committing transactions. For example, if you issue two LOAD TABLE commands, you can ensure that either both commands commit or neither commits.

When you issue LOAD TABLE for a catalog store table, there is no automatic savepoint. If the load succeeds, it commits automatically. If the load fails, it rolls back. You cannot roll back a successful load of a catalog store table.

For more information on transaction processing, see Chapter 10, “Transactions and Versioning”.

Integrity constraints and LOAD TABLE

LOAD TABLE allows you to control load behavior when integrity constraints are violated and to selectively log information about the violations. You can specify whether to ignore UNIQUE, NULL, DATA VALUE, and/or FOREIGN KEY constraint violations that occur during a load and the maximum number of violations to ignore before initiating a rollback. You can also direct the load to log information about specific types of integrity constraint violations both per violation in a message log and per row in a row log.

For information on the contents and format of the message and row logs, see “Logging integrity constraint violations”.

Simple LOAD TABLE example

The following statement loads the data from the file dept.txt into all columns of the department table. This example assumes that no explicit data conversion is needed, and that the width of input columns matches the width of columns in the Departments table. The flat file dept.txt must exist at the specified location.

LOAD TABLE Departments
( DepartmentID, DepartmentName, DepartmentHeadID )
FROM '/d1/MILL1/dept.txt'

Specifying files to load

You specify one or more files from which to load data. In the FROM clause, you specify each filename-string, and separate multiple strings by commas.

The files are read one at a time, and processed in a left-to-right order as specified in the FROM clause. Any SKIP or LIMIT value only applies in the beginning of the load, not for each file.

If a load cannot complete, for example due to insufficient memory, the entire load transaction is rolled back.

filename-string The filename-string is passed to the server as a string. The string is therefore subject to the same formatting requirements as other SQL strings. In particular:

Named pipes

The file specification can be a named pipe. When you load from a named pipe (or FIFO) on Windows, the program writing to the pipe must close the pipe in a special way. The pipe writer must call FlushFileBuffers( ) and then DisconnectNamedPipe( ). (If the program does not, Sybase IQ reports an exception from hos_io::Read( ).) This issues a PIPE_NOT_CONNECTED error, which notifies Sybase IQ that the pipe was shut down in an orderly manner rather than an uncontrolled disconnect. See Microsoft documentation for details on these calls.

Specifying table-wide format options

You can specify several options that describe the format of input data.

FORMAT option You can specify a default format for table columns, which applies if you omit the column-spec. The same formats that can appear in the column-spec can appear here. If you also omit the FORMAT load option, the file is assumed to be binary.

For a detailed description of the binary format used by Sybase IQ to produce data files that can be read by the LOAD TABLE statement using the FORMAT BINARY and BINARY column specification clauses, see “Using binary load format”.

DELIMITED BY option If you omit a column delimiter in the column-spec definition, the default column delimiter character is a comma. You can specify an alternative column delimiter by providing a single ASCII character or the hexadecimal representation. In particular, to specify tab-delimited values use the hexadecimal ASCII code of the tab character (9), as follows:

...DELIMITED BY '\x09' ...

To use the newline character as a delimiter, you can specify either the special combination '\n' or its ASCII value '\x0a'.

NoteAlthough the delimiter-string in the column-spec may be a string of up to four characters, the DELIMITED BY option allows only a single ASCII character or its hexadecimal representation.

STRIP The STRIP clause specifies whether unquoted values should have trailing blanks stripped off before they are inserted. The LOAD TABLE command accepts the following STRIP keywords:

With STRIP turned on (the default), trailing blanks are stripped from values before they are inserted. This is effective only for VARCHAR data. To turn the STRIP option off, the clause is as follows:

... STRIP OFF ...

Trailing blanks are stripped only for unquoted strings. Quoted strings retain their trailing blanks. If you do not require blank sensitivity, you can use the FILLER option as an alternative to be more specific in the number of bytes to strip, instead of all the trailing spaces. It is more efficient for Sybase IQ to have this option off, and it adheres to the ANSI standard when dealing with trailing blanks. (CHAR data is always padded, so the STRIP option only affects VARCHAR data.)

The STRIP option applies only to variable-length non-binary data and does not apply to ASCII fixed-width inserts. For example, assume the following schema:

CREATE TABLE t( c1 VARCHAR(3) );
LOAD TABLE t( c1 ',' ) ........ STRIP RTRIM    // trailing blanks trimmed

LOAD TABLE t( c1 ',' ) ........ STRIP OFF      // trailing blanks not trimmed

LOAD TABLE t( c1 ASCII(3) ) ... STRIP RTRIM    // trailing blanks not trimmed
LOAD TABLE t( c1 ASCII(3) ) ... STRIP OFF      // trailing blanks trimmed

LOAD TABLE t( c1 BINARY ) ..... STRIP RTRIM    // trailing blanks trimmed
LOAD TABLE t( c1 BINARY ) ..... STRIP OFF      // trailing blanks trimmed

Trailing blanks are always trimmed from binary data.

QUOTES option The QUOTES parameter is optional and the default is ON. With QUOTES turned on, LOAD TABLE expects input strings to be enclosed in quote characters. The quote character is either an apostrophe (single quote) or a quotation mark (double quote). The first such character encountered in a string is treated as the quote character for the string. String data must be terminated with a matching quote.

With QUOTES ON, column or row delimiter characters can be included in the column value. Leading and ending quote characters are assumed not to be part of the value and are excluded from the loaded data value.

With QUOTES OFF, Sybase IQ does not strip off apostrophes (single quotes) or quotation marks (double quotes). When it encounters these characters in your input file, it treats them as part of the data. With QUOTES OFF, you cannot include column delimiter characters in column values.

For syntax and usage details, see LOAD TABLE statement in Chapter 1, “SQL Statements,” in Reference: Statements and Options.

LOAD TABLE QUOTES option example Consider a table defined as:

CREATE TABLE t1 (c1 INT, c2 VARCHAR(20), c3 VARCHAR(20))

with the following input data:

1, apple , fruit1	 ,
2, “banana” , “fruit2”,
3, “ pear ”, “ fruit3 ”,

The result of loading this data is displayed by running the following query:

SELECT c1, c2, c3, LENGTH(c2), LENGTH(c3) FROM t1

Given the values of the QUOTES and STRIP options used during the LOAD TABLE command, the following table displays the result of the query above, with each result enclosed by ‘<’ and ‘>’:

    LOAD TABLE        options

Results of SELECT c1, c2, c3, LENGTH(c2), LENGTH(c3) FROM t1

QUOTES

STRIP

c1

c2

c3

length(c2)

length(c3)

ON

RTRIM

<1>

<apple>

<fruit1>

<5>

<6>

<2>

<banana>

<fruit2>

<6>

<6>

<3>

< pear >

< fruit3 >

<6>

<8>

ON

OFF

<1>

<apple >

<fruit1 >

<6>

<7>

<2>

<banana>

<fruit2>

<6>

<6>

<3>

< pear >

< fruit3 >

<6>

<8>

OFF

RTRIM

<1>

< apple>

< fruit1>

<6>

<7>

<2>

< “banana”>

< “fruit2”>

<9>

<9>

<3>

< “ pear ”>

< “ fruit3 ”>

<9>

<11>

OFF

OFF

<1>

< apple >

< fruit1 >

<7>

<8>

<2>

< “banana” >

< “fruit2”>

<10>

<9>

<3>

< “ pear ”>

< “ fruit3 ”>

<9>

<11>

Notes on the results:

Specifying load options

You can specify a wide range of load options that tell Sybase IQ how to interpret and process the input file and what to do when errors occur.

You can specify load options in any order. For details of all options, see LOAD TABLE statement in Reference: Statements and Options.

FILLER option example The following is a Windows example:

LOAD TABLE nn
    (l_orderkey,
    l_quantity ASCII(PREFIX 2),
    FILLER(2),
FROM 'C:\\iq\archive\\mill.txt'
BYTE ORDER LOW

LIMIT option example In the following Windows example, no rows are skipped and up to 1,000,000 rows are inserted.

LOAD TABLE lineitem
    (l_shipmode ASCII(15),
    l_quantity ASCII(8),
    FILLER(30))
FROM 'C:\\iq\archive\\mill.txt'
PREVIEW ON
LIMIT 1000000

ROW DELIMITED BY option example The following Windows example sets the column delimiter for the l_orderkey column to tab, and the row delimiter to newline (\x0a) followed by carriage return (\x0d):

LOAD TABLE mm
    (l_orderkey '\x09',
    l_quantity ASCII(4),
    FILLER(6),
    l_shipdate DATE('YYYY/MM/DD'))
FROM 'C:\\iq\\archive\\mill.txt'
ROW DELIMITED BY '\x0a\x0d'

SKIP option example In this UNIX example, Sybase IQ reads 9,000 rows from the input file, skips the first 5,000, and loads the next 4,000. If there are only 8,000 rows in the input file, then only 3,000 rows are loaded.

LOAD TABLE lineitem(
    l_shipmode ASCII(15),
    l_quantity ASCII(8),
FILLER(30))
FROM '/d1/MILL1/tt.t'
LIMIT 4000
SKIP 5000
PREVIEW ON

LOAD TABLE adds rows

The LOAD TABLE statement appends the contents of the file to the existing rows of the table; it does not replace the existing rows in the table, unless you specify the START ROW ID load option.

To empty an existing table, use the TRUNCATE TABLE statement to remove all the rows.