Configuring multiple writers for loading data

Sybase ETL supports the multiple writer functionality that is available in Sybase IQ 15.0. This functionality allows you to add multiple writers for loading data into an IQ database. Multiple writers allow parallel loading of Sybase IQ tables and is faster than sequential loading. You can use multiple writers if you have:

To use the multiple writer functionality, you must have these permissions in the target IQ database:

Object name

Type

Required permission

ETL_MULTIPLEX_STATE

Table

create

ETL_MULTIPLEX_VERSION

Table

create

sp_iqstatistic

Stored Procedure

execute

NoteETL creates two tables ETL_MULTIPLEX_STATE and ETL_MULTIPLEX_VERSION in the IQ database. Each row in the ETL_MULTIPLEX_STATE table signifies an IQ writer selected by an ETL gridnode, which is removed automatically after each execution. In case, gridnode crashes due to unexpected error, you must manually clean the data in this table.

You can set the required permissions using Sybase Central:

  1. Go to Sybase Central and connect to the Sybase IQ 15.0 server.

  2. Expand Users & Groups and then select the user for whom you want to set the create table permission.

  3. Right-click the user and select Properties.

  4. Select the Authorities tab and check the Resource option to give the user permission to create database objects in the IQ database.

  5. Select the Permissions tab and then select the Procedures & Functions option to see a list of all the available permissions.

  6. Select sp_iqstatistics and click on the corresponding Execute column to give the user permission to execute the stored procedure in the IQ database.

  7. Click OK to save the settings.

NoteTo support multiplex execution, you must install the SQL Anywhere 11 ODBC driver on the same machine as ETL Development and ETL Server.

You must configure the writers to be used for multiplex execution by defining them in the IQMultiplex.ini file.

StepsDefining preferred writers in the IQMultiplex.ini file

  1. Navigate to the etc directory in the installation folder and use a text editor to open the IQMultiplex.ini file.

  2. Add one section for each multiplex group you want to use. By default, the IQMultiplex.ini file is empty. If you do not specify anything, the grid engine uses the internal default values.

    A sample section is as follows:

    [dbsybase15+iq15m+sample] 
    /* This is the section name */
    Enabled=true
    Workload=OperationsWaiting
    MinimalUpdateInterval=60
    SelectWriterTimeout=6
    MostIdleNode=select NAME from TEST_CUSTOMER_NODE
    order by WORKLOAD asc
    MostBusyNode=select NAME from TEST_CUSTOMER_NODE
    order by WORKLOAD desc
    

    You must provide the section name, which includes the coordinator’s interface name, host name, and database name, separated by a plus (+) sign. Do not use colon (:), hash (#), or equal sign (=) characters. The other properties for each group are optional and discussed in Table 5-1.

  3. Save and close the file.

Table 5-1: Multiplex group optional properties

Name

Type

Value

Description

Enabled

boolean

True (default) or False

Enable or disable this feature in the IQ server database.

Workload

text

OperationsWaiting (default)

Specify which row of the result of EXEC sp_iqstatistics should be used to take as workload.The dispatcher executes the stored procedure EXEC sp_iqstatistics to query the writer workload. The query result set returns the operation status name in the second column, and the status value in the fourth column. The dispatcher finds the row for which the second column matches the value of the Workload option you specify, and then uses the fourth column of that row as the final workload value.

MinimalUpdateInterval

integer

Greater than zero (0), default value is 6

The minimal interval in seconds, for the dispatcher to refresh writer information by querying the coordinator.

SelectWriterTimeout

integer

Greater than equal to zero (0), default value is 0

The number of seconds the dispatcher should wait, if all the writers are selected and not released. When you specify 0, the dispatcher waits indefinitely. In case of a timeout, an error is generated.

MostIdleNode

SQL

Empty by default

The first column of the first row returned by the SQL execution should be a writer node name. The dispatcher assumes the returned writer as the most idle node in the multiplex, and uses it as the target for the next table load request.

For example, this SQL script creates a custom dispatch table:

DROP TABLE TEST_CUSTOMER_NODE; CREATE TABLE TEST_CUSTOMER_NODE(         NAME varchar(100),           WORKLOAD int /*must be integer*/ ); INSERT INTO TEST_CUSTOMER_NODE (NAME,WORKLOAD) VALUES('iq15w1',78); INSERT INTO TEST_CUSTOMER_NODE (NAME,WORKLOAD) VALUES('iq15w2',34); INSERT INTO TEST_CUSTOMER_NODE (NAME,WORKLOAD) VALUES('iq15w3',12); /* iq15w1-w3 are writers*/

To obtain the most idle node from the table, add this SQL query in the IQMultiplex.ini file:

Select NAME from TEST_CUSTOMER_NODE order by WORKLOAD asc

iq15w3 is returned as the most idle node.

MostBusyNode

SQL

Empty by default

The first column of first row returned by the SQL execution should be a writer node name. The dispatcher assumes the returned writer as the busiest node in the multiplex, and delays using it as a load table target.

For example, this SQL script creates a custom dispatch table:

DROP TABLE TEST_CUSTOMER_NODE; CREATE TABLE TEST_CUSTOMER_NODE(         NAME varchar(100),           WORKLOAD int /*must be integer*/ ); INSERT INTO TEST_CUSTOMER_NODE (NAME,WORKLOAD) VALUES('iq15w1',78); INSERT INTO TEST_CUSTOMER_NODE (NAME,WORKLOAD) VALUES('iq15w2',34); INSERT INTO TEST_CUSTOMER_NODE (NAME,WORKLOAD) VALUES('iq15w3',12); /* iq15w1-w3 are writers*/

To obtain the most busy node from the custom dispatch table, add this SQL query in the IQMultiplex.ini file:

Select NAME from TEST_CUSTOMER_NODE order by WORKLOAD desc

iq15w1 is returned as the most busy node.