The LOAD TABLE statement efficiently imports data from a text or binary file into an existing database table. The statement loads data into column indexes created automatically or defined by users.
The permissions needed to execute a LOAD TABLE statement are set on the server command line, using the -gl option. Sybase recommends the -gl all setting, which allows any user to load or unload a table. This is the default setting set by start_iq. If -gl all is set, you must be the owner of the table, have DBA authority, or have ALTER permission, in order to use the LOAD TABLE statement. You also need INSERT permission on the table.
For a description of input file error handling during loads, see the description of the ON FILE ERROR load option in the LOAD TABLE statement in Reference: Statements and Options.
LOAD TABLE supports loading of large object (LOB) data. See Unstructured Data Analytics in Sybase IQ.
To load large amounts of data, most users create command files. To create a command file, follow the instructions in Chapter 2, “Using Interactive SQL (dbisql),”in the Utility Guide.
When you issue the LOAD TABLE statement for an IQ table, a savepoint occurs automatically before the data is loaded. If the load completes successfully, Sybase IQ releases the savepoint. If the load fails, the transaction rolls back to the savepoint. This approach gives you flexibility in committing transactions. For example, if you issue two LOAD TABLE commands, you can ensure that either both commands commit or neither commits.
When you issue LOAD TABLE for a catalog store table, there is no automatic savepoint. If the load succeeds, it commits automatically. If the load fails, it rolls back. You cannot roll back a successful load of a catalog store table.
For more information on transaction processing, see Chapter 10, “Transactions and Versioning”.
LOAD TABLE allows you to control load behavior when integrity constraints are violated and to selectively log information about the violations. You can specify whether to ignore UNIQUE, NULL, DATA VALUE, and/or FOREIGN KEY constraint violations that occur during a load and the maximum number of violations to ignore before initiating a rollback. You can also direct the load to log information about specific types of integrity constraint violations both per violation in a message log and per row in a row log.
For information on the contents and format of the message and row logs, see “Logging integrity constraint violations”.
The following statement loads the data from the file dept.txt into all columns of the department table. This example assumes that no explicit data conversion is needed, and that the width of input columns matches the width of columns in the Departments table. The flat file dept.txt must exist at the specified location.
LOAD TABLE Departments ( DepartmentID, DepartmentName, DepartmentHeadID ) FROM '/d1/MILL1/dept.txt'
You specify one or more files from which to load data. In the FROM clause, you specify each filename-string, and separate multiple strings by commas.
The files are read one at a time, and processed in a left-to-right order as specified in the FROM clause. Any SKIP or LIMIT value only applies in the beginning of the load, not for each file.
If a load cannot complete, for example due to insufficient memory, the entire load transaction is rolled back.
filename-string The filename-string is passed to the server as a string. The string is therefore subject to the same formatting requirements as other SQL strings. In particular:
If a backslash (\) precedes
the characters n
, x
,
or \
it is considered
an escape character. For this reason, to indicate directory paths
in Windows systems, you must represent the backslash character by
two backslashes if the next character is any of those listed. (It
is always safe to double the backslashes.) Therefore, the statement
to load data from the file c:\newinput.dat into
the employee table is:
LOAD TABLE employees FROM 'c:\\newinput.dat' ...
For server-side loading (LOAD TABLE ... USING FILE), the path name is relative to the database server, not to the client application. If you are running the statement on a database server on some other computer, the directory name refers to directories on the server machine, not on the client machine. The input file for the load must be on the server machine.
For client-side data loading (LOAD TABLE ... USING CLIENT FILE), the path name must be relative to the client application. The directory name refers to directories on the client machine.
The file specification can be a named pipe. When you load from a named pipe (or FIFO) on Windows, the program writing to the pipe must close the pipe in a special way. The pipe writer must call FlushFileBuffers( ) and then DisconnectNamedPipe( ). (If the program does not, Sybase IQ reports an exception from hos_io::Read( ).) This issues a PIPE_NOT_CONNECTED error, which notifies Sybase IQ that the pipe was shut down in an orderly manner rather than an uncontrolled disconnect. See Microsoft documentation for details on these calls.
You can specify several options that describe the format of input data.
FORMAT option You can specify a default format for table columns, which applies if you omit the column-spec. The same formats that can appear in the column-spec can appear here. If you also omit the FORMAT load option, the file is assumed to be binary.
For a detailed description of the binary format used by Sybase IQ to produce data files that can be read by the LOAD TABLE statement using the FORMAT BINARY and BINARY column specification clauses, see “Using binary load format”.
DELIMITED BY option If you omit a column delimiter in the column-spec definition, the default column delimiter character is a comma. You can specify an alternative column delimiter by providing a single ASCII character or the hexadecimal representation. In particular, to specify tab-delimited values use the hexadecimal ASCII code of the tab character (9), as follows:
...DELIMITED BY '\x09' ...
To use the newline character as a delimiter, you can specify either the special combination '\n' or its ASCII value '\x0a'.
Although the delimiter-string in the column-spec may be a string of up to four characters, the DELIMITED BY option allows only a single ASCII character or its hexadecimal representation.
STRIP The STRIP clause specifies whether unquoted values should have trailing blanks stripped off before they are inserted. The LOAD TABLE command accepts the following STRIP keywords:
With STRIP turned on (the default), trailing blanks are stripped from values before they are inserted. This is effective only for VARCHAR data. To turn the STRIP option off, the clause is as follows:
... STRIP OFF ...
Trailing blanks are stripped only for unquoted strings. Quoted strings retain their trailing blanks. If you do not require blank sensitivity, you can use the FILLER option as an alternative to be more specific in the number of bytes to strip, instead of all the trailing spaces. It is more efficient for Sybase IQ to have this option off, and it adheres to the ANSI standard when dealing with trailing blanks. (CHAR data is always padded, so the STRIP option only affects VARCHAR data.)
The STRIP option applies only to variable-length non-binary data and does not apply to ASCII fixed-width inserts. For example, assume the following schema:
CREATE TABLE t( c1 VARCHAR(3) ); LOAD TABLE t( c1 ',' ) ........ STRIP RTRIM // trailing blanks trimmed LOAD TABLE t( c1 ',' ) ........ STRIP OFF // trailing blanks not trimmed LOAD TABLE t( c1 ASCII(3) ) ... STRIP RTRIM // trailing blanks not trimmed LOAD TABLE t( c1 ASCII(3) ) ... STRIP OFF // trailing blanks trimmed LOAD TABLE t( c1 BINARY ) ..... STRIP RTRIM // trailing blanks trimmed LOAD TABLE t( c1 BINARY ) ..... STRIP OFF // trailing blanks trimmed
Trailing blanks are always trimmed from binary data.
QUOTES option The QUOTES parameter is optional and the default is ON. With QUOTES turned on, LOAD TABLE expects input strings to be enclosed in quote characters. The quote character is either an apostrophe (single quote) or a quotation mark (double quote). The first such character encountered in a string is treated as the quote character for the string. String data must be terminated with a matching quote.
With QUOTES ON, column or row delimiter characters can be included in the column value. Leading and ending quote characters are assumed not to be part of the value and are excluded from the loaded data value.
With QUOTES OFF, Sybase IQ does not strip off apostrophes (single quotes) or quotation marks (double quotes). When it encounters these characters in your input file, it treats them as part of the data. With QUOTES OFF, you cannot include column delimiter characters in column values.
For syntax and usage details, see LOAD TABLE statement in Chapter 1, “SQL Statements,” in Reference: Statements and Options.
LOAD TABLE QUOTES option example Consider a table defined as:
CREATE TABLE t1 (c1 INT, c2 VARCHAR(20), c3 VARCHAR(20))
with the following input data:
1, apple , fruit1 , 2, “banana” , “fruit2”, 3, “ pear ”, “ fruit3 ”,
The result of loading this data is displayed by running the following query:
SELECT c1, c2, c3, LENGTH(c2), LENGTH(c3) FROM t1
Given the values of the QUOTES and STRIP options used during the LOAD TABLE command, the following table displays the result of the query above, with each result enclosed by ‘<’ and ‘>’:
LOAD TABLE options |
Results of SELECT c1, c2, c3, LENGTH(c2), LENGTH(c3) FROM t1 |
|||||
---|---|---|---|---|---|---|
QUOTES |
STRIP |
c1 |
c2 |
c3 |
length(c2) |
length(c3) |
ON |
RTRIM |
<1> |
<apple> |
<fruit1> |
<5> |
<6> |
<2> |
<banana> |
<fruit2> |
<6> |
<6> |
||
<3> |
< pear > |
< fruit3 > |
<6> |
<8> |
||
ON |
OFF |
<1> |
<apple > |
<fruit1 > |
<6> |
<7> |
<2> |
<banana> |
<fruit2> |
<6> |
<6> |
||
<3> |
< pear > |
< fruit3 > |
<6> |
<8> |
||
OFF |
RTRIM |
<1> |
< apple> |
< fruit1> |
<6> |
<7> |
<2> |
< “banana”> |
< “fruit2”> |
<9> |
<9> |
||
<3> |
< “ pear ”> |
< “ fruit3 ”> |
<9> |
<11> |
||
OFF |
OFF |
<1> |
< apple > |
< fruit1 > |
<7> |
<8> |
<2> |
< “banana” > |
< “fruit2”> |
<10> |
<9> |
||
<3> |
< “ pear ”> |
< “ fruit3 ”> |
<9> |
<11> |
Notes on the results:
With QUOTES ON and STRIP RTRIM, both leading space and trailing space for the non-enclosed field c2 row 1 are trimmed.
With QUOTES ON and STRIP OFF, only the leading space for the non-enclosed field c2 row 1 is trimmed.
With QUOTES OFF and STRIP RTRIM, only the trailing space for the non-enclosed field c2 row 1 is trimmed.
With QUOTES OFF and STRIP OFF, neither leading space nor trailing space for the non-enclosed field c2 row 1 is trimmed.
With QUOTES ON and STRIP RTRIM, both leading space and trailing space within quotes for the enclosed fields c2 and c3 row 3 are NOT trimmed.
You can specify a wide range of load options that tell Sybase IQ how to interpret and process the input file and what to do when errors occur.
You can specify load options in any order. For details of all options, see LOAD TABLE statement in Reference: Statements and Options.
FILLER option example The following is a Windows example:
LOAD TABLE nn (l_orderkey, l_quantity ASCII(PREFIX 2), FILLER(2), FROM 'C:\\iq\archive\\mill.txt' BYTE ORDER LOW
LIMIT option example In the following Windows example, no rows are skipped and up to 1,000,000 rows are inserted.
LOAD TABLE lineitem (l_shipmode ASCII(15), l_quantity ASCII(8), FILLER(30)) FROM 'C:\\iq\archive\\mill.txt' PREVIEW ON LIMIT 1000000
ROW DELIMITED BY option example The following Windows example sets the column delimiter for the l_orderkey column to tab, and the row delimiter to newline (\x0a) followed by carriage return (\x0d):
LOAD TABLE mm (l_orderkey '\x09', l_quantity ASCII(4), FILLER(6), l_shipdate DATE('YYYY/MM/DD')) FROM 'C:\\iq\\archive\\mill.txt' ROW DELIMITED BY '\x0a\x0d'
SKIP option example In this UNIX example, Sybase IQ reads 9,000 rows from the input file, skips the first 5,000, and loads the next 4,000. If there are only 8,000 rows in the input file, then only 3,000 rows are loaded.
LOAD TABLE lineitem( l_shipmode ASCII(15), l_quantity ASCII(8), FILLER(30)) FROM '/d1/MILL1/tt.t' LIMIT 4000 SKIP 5000 PREVIEW ON
The LOAD TABLE statement appends the contents of the file to the existing rows of the table; it does not replace the existing rows in the table, unless you specify the START ROW ID load option.
To empty an existing table, use the TRUNCATE TABLE statement to remove all the rows.