File CSV Input Adapter

Adapter type: dsv_in. The File CSV Input adapter reads a file in Event Stream Processor delimited format.

Use this adapter to poll new data appended to the data file. The file does not require a header. If the file includes a header, it specifies the field names.

Sample record formats for the data file:
1. hasHeader=true
delimiter=,
expectStreamNameOpcode=false

Ts,ItemID,Price,Quantity,WarehouseZipCode,DeliveryZipCode
2004/06/17 10:00:00.000000,SKU1276532,50.00,1,10012,94086
2004/06/17 10:00:05.000000,SKU6723143,23.00,2,10012,94043

2. expectStreamNameOpcode=true
delimiter=,

Trades_in,i,2004/06/17 10:00:00.000000,SKU1276532,50.00,1,10012,94086
Trades_in,i,2004/06/17 10:00:05.000000,SKU6723143,23.00,2,10012,94043

3. expectStreamNameOpcode=false
timestampFormat=%Y/%m/%d %H:%M:%S
delimiter=,

2004/06/17 10:00:00.000000,SKU1276532,50.00,1,10012,94086
2004/06/17 10:00:05.000000,SKU6723143,23.00,2,10012,94043
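
The three layouts above differ only in whether a header line is present and whether each record begins with a stream name and opcode. The following Python sketch models that layout; it is an illustration, not the adapter's implementation. The function name parse_records, the dictionary keys, and the positional names col0, col1, and so on are hypothetical.

```python
import csv
import io

def parse_records(text, delimiter=",", has_header=False,
                  expect_stream_name_opcode=False):
    """Split delimited text into dicts; a model of the file layout only."""
    rows = [r for r in csv.reader(io.StringIO(text), delimiter=delimiter) if r]
    # With hasHeader=true, the first line supplies the field names.
    header = rows.pop(0) if has_header and rows else None
    records = []
    for row in rows:
        record = {}
        if expect_stream_name_opcode:
            # The first two fields carry the stream name and the opcode.
            record["stream"], record["opcode"] = row[0], row[1]
            row = row[2:]
        # Without a header, fall back to positional names (hypothetical).
        names = header or [f"col{i}" for i in range(len(row))]
        record["fields"] = dict(zip(names, row))
        records.append(record)
    return records

sample = (
    "Trades_in,i,2004/06/17 10:00:00.000000,SKU1276532,50.00,1,10012,94086\n"
    "Trades_in,i,2004/06/17 10:00:05.000000,SKU6723143,23.00,2,10012,94043\n"
)
parsed = parse_records(sample, expect_stream_name_opcode=True)
print(parsed[0]["stream"], parsed[0]["opcode"], parsed[0]["fields"]["col1"])
# prints: Trades_in i SKU1276532
```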

This adapter supports schema discovery in normal mode only. If you use the CCL ATTACH ADAPTER statement to attach an adapter, you must supply the adapter type.

The File CSV Input adapter operates in two loading modes: dynamic and normal. In normal mode, the adapter reads records from a single source file until there are no more records or you stop the process. In dynamic mode, the adapter reads records from the files in a source directory on a first-come, first-served basis. The adapter selects files whose names match either the fileRegex parameter or the filePattern parameter. fileRegex takes a POSIX standard regular expression, such as [a-z]{4}\.csv. filePattern takes a wildcard character sequence that is supported by the native operating system, such as *.csv. For example, if the pattern is *.csv, the adapter searches for .csv files only. After reading each file, the adapter checks the directory for new files and continues processing. If there is an error in opening or reading a file, the adapter skips it.
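
The file selection described above can be modeled in a few lines of Python. This is a sketch under stated assumptions, not the adapter's implementation: select_files is a hypothetical name, Python's re module stands in for a POSIX regular expression engine (the two agree for the simple expression shown), and matching the whole file name rather than a substring is an assumption. The precedence of fileRegex over filePattern follows the property descriptions in the table below.

```python
import fnmatch
import re

def select_files(names, file_pattern="*.csv", file_regex=None):
    """Return the names a dynamic-mode scan would consider (sketch only)."""
    if file_regex:
        # fileRegex takes precedence over filePattern when both are set.
        compiled = re.compile(file_regex)
        # Assumption: the expression must match the entire file name.
        return [n for n in names if compiled.fullmatch(n)]
    # Wildcard matching, case-sensitive on every platform in this sketch.
    return [n for n in names if fnmatch.fnmatchcase(n, file_pattern)]

names = ["abcd.csv", "trades.csv", "notes.txt", "ab12.csv"]
print(select_files(names))
# prints: ['abcd.csv', 'trades.csv', 'ab12.csv']
print(select_files(names, file_regex=r"[a-z]{4}\.csv"))
# prints: ['abcd.csv']
```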

In dynamic mode, if a file is overwritten or updated after the adapter processes it and the reprocess parameter is set to false, the file is not processed again. If the removeAfterProcess parameter is set to false, the adapter does not delete the file after processing it. Both reprocess and removeAfterProcess can be set to false simultaneously, but only one can be set to true at a time.
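
The constraint on these two parameters can be checked before a project is deployed. The following Python sketch shows one way to express the rule; validate_dynamic_flags is a hypothetical helper, not part of the product.

```python
def validate_dynamic_flags(reprocess=False, remove_after_process=False):
    """Reject the one combination the adapter does not allow (sketch only)."""
    if reprocess and remove_after_process:
        raise ValueError(
            "reprocess and removeAfterProcess cannot both be true"
        )
    return {"reprocess": reprocess, "removeAfterProcess": remove_after_process}

# Allowed: both false, or exactly one true.
print(validate_dynamic_flags())
print(validate_dynamic_flags(reprocess=True))

# Not allowed: both true.
try:
    validate_dynamic_flags(reprocess=True, remove_after_process=True)
except ValueError as err:
    print("rejected:", err)
```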

Property Label Property ID Type Description
Directory dir string (Required) Specify the absolute path to the data files you want the adapter to read. For example, <username>/<folder name>.

Default value is "." (current directory in which the Server is running).

Use a forward slash for both UNIX and Windows paths.

File (in Directory) file tables (Dependent required) File to read. Set only in normal mode; in dynamic mode, the value is ignored. No default value.
Stream name, opcode expected expectStreamNameOpcode boolean (Required) If true, the adapter interprets the first two fields as stream name and opcode respectively. Messages with unmatched stream names are discarded.

When you are using schema discovery on this adapter with this property enabled, two columns for the stream name and opcode are created in the schema. Manually remove these two columns from your schema.

Default value is false.
Field Count fieldCount uint (Optional) Count of fields in CSV file, if different from the value for the source stream. Default value is 0.
Repeat Date Field Count repeatCount int (Optional) Number of times the input data is repeated. If set to -1, the input data is repeated indefinitely. Default value is 0.
Note: You can use this parameter to test a continuous streaming source.
Repeat Data Field Name repeatField string (Optional) On each repeat, increment the value in this field. Default value is a hyphen (-).
  • If repeatCount has a nonzero value, specify the stream column name.
  • If the repeatField is a key column in the stream, ensure there are no duplicates when specifying multiple rows in the input file.
  • If the adapter is attached to a window, the repeatField must be a key column.
Delimiter delimiter string (Advanced) Symbol used to separate columns. Default value is a comma ( , ).
Has Header hasHeader boolean (Advanced) Determines whether the first line of the file contains the description of the fields. Default value is false.
Directory (runtime) runtimeDir runtimeDirectory (Advanced) Location of the data files at runtime, if the value is different from the location defined at discovery time. No default value.
File Pattern filePattern string (Advanced) In dynamic mode, the pattern used to look up files in the directory. In normal mode, the pattern used to look up files for discovery.

If both filePattern and fileRegex are specified, filePattern is ignored and fileRegex is used.

filePattern allows wildcard characters supported by the native operating system. Default value is *.csv.
Regular Expression fileRegex string (Optional) In dynamic mode, a POSIX standard regular expression used to find matching file names in the directory. If both filePattern and fileRegex are specified, filePattern is ignored and fileRegex is used. No default value.
Poll Period (seconds) pollperiod uint (Advanced) In normal mode, specifies the period for polling a file to check for new records.

pollperiod is ignored in dynamic mode.

Default value is 0.
Directory Poll Period dirPollPeriod uint (Advanced) In dynamic mode, specifies the period for polling the directory to check for new files. Default value is 60 seconds.

Convert to Safe Opcodes safeOps boolean (Advanced) Converts the opcodes INSERT and UPDATE to UPSERT, and converts DELETE to SAFEDELETE. Default value is false.
Skip Deletes skipDels boolean (Advanced) Skips the rows with opcodes DELETE or SAFEDELETE. Default value is false.
Date Format dateFormat string (Advanced) Format string for parsing date values. Default value is %Y-%m-%dT%H:%M:%S.
Timestamp Format timestampFormat string (Advanced) Format string for parsing timestamp values. Default value is %Y-%m-%dT%H:%M:%S.
Block Size blockSize int (Advanced) Number of records to block into one pseudotransaction. Default value is 1.

Use Envelopes useEnvelopes boolean (Advanced) Specify the block type the adapter uses to pass data to the engine. If you specify a blockSize property greater than zero, by default, the adapter packages rows into transaction blocks to send to the engine. To get the adapter to package rows into envelope blocks instead, set this property to true. Default value is false.

Field Mapping permutation permutation Mapping between Event Stream Processor and external fields, for example: <esp_columnname>=<database_columnname>:<esp_columnname>=<database_columnname>. No default value.

PropertySet propertyset string (Advanced) Specifies the name of the property set from the project configuration file. If you specify the same properties in the project configuration file and the ATTACH ADAPTER statement, the values in the property set override the values defined in the ATTACH ADAPTER statement. No default value.

Dynamic Loading dynamicMode boolean (Optional) Set to true to enable dynamic loading mode, which causes the adapter to read records from files as they arrive in the directory. Records are processed one after another on a first-come, first-served basis. Default value is false.
Remove File After Processing removeAfterProcess boolean (Optional) In dynamic mode, if removeAfterProcess is set to true, the file is removed from the directory after the adapter processes it. You cannot set this property to true if reprocess is also set to true. Default value is false.
Reprocess File reprocess boolean (Optional) In dynamic mode, if reprocess is set to true, the file is processed again after being updated or overwritten. You cannot set this property to true if removeAfterProcess is also set to true. Default value is false.
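
The effect of the safeOps and skipDels properties on incoming rows can be illustrated with a short Python sketch. It models the documented behavior only and is not the adapter's implementation. The function name apply_opcode_options is hypothetical, and spelling out the opcode names in full (rather than the single-letter codes that appear in data files, such as i) is an assumption made for readability.

```python
# Conversions applied when safeOps is true.
SAFE_OPCODES = {"INSERT": "UPSERT", "UPDATE": "UPSERT", "DELETE": "SAFEDELETE"}
# Opcodes dropped when skipDels is true.
DELETE_OPCODES = {"DELETE", "SAFEDELETE"}

def apply_opcode_options(rows, safe_ops=False, skip_dels=False):
    """Filter and convert (opcode, data) pairs; a sketch of the two options."""
    result = []
    for opcode, data in rows:
        if skip_dels and opcode in DELETE_OPCODES:
            continue  # skipDels: drop delete rows entirely
        if safe_ops:
            opcode = SAFE_OPCODES.get(opcode, opcode)  # safeOps: convert
        result.append((opcode, data))
    return result

rows = [("INSERT", "SKU1276532"), ("UPDATE", "SKU1276532"), ("DELETE", "SKU6723143")]
print(apply_opcode_options(rows, safe_ops=True))
# prints: [('UPSERT', 'SKU1276532'), ('UPSERT', 'SKU1276532'), ('SAFEDELETE', 'SKU6723143')]
print(apply_opcode_options(rows, skip_dels=True))
# prints: [('INSERT', 'SKU1276532'), ('UPDATE', 'SKU1276532')]
```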
Known limitations:
  • In normal mode only, when polling, you can append to the file, but cannot overwrite or replace it. The stream names in the file rows are ignored and all data is sent to the same stream.
  • For discovery to work correctly, set the delimiter character and the header presence flag to match the actual data.
  • Do not mix files with different delimiters or files with and without headers in the same directory. Files with wrong delimiters or headers are incorrectly discovered.
Related reference
Adapter Support for Schema Discovery