File Input Transporter Module Configuration

The File Input transporter reads data from local files, wraps the data as a string, and sends it to the next module specified in the adapter configuration file. Set values for this transporter in the adapter configuration file.

The File Input transporter supports schema discovery.
XML Element Description
Dir

Type: string

(Required) Specify the absolute path to the data files that you want the adapter to read. For example, <username>/<foldername>. No default value.

To read files from a Hadoop file system, specify an HDFS folder URI instead of a local file system folder. For example, hdfs://<hdfsserver>:9000/<foldername>/<subfoldername>/<leaffoldername>.

To use Hadoop, download the binaries for Hadoop version 1.2.1 from http://hadoop.apache.org. Copy the hadoop-core.jar file (for example, hadoop-core-1.2.1.jar for version 1.2.1) to %ESP_HOME%\adapters\framework\libj. Ensure that you use a stable version rather than a beta.

Use a forward slash for both UNIX and Windows paths.

File

Type: string

(Required) Specify the file you want the adapter to read, or a regex pattern that filters the files in the directory specified in the Dir element. See the DynamicMode element. No default value.

AccessMode

Type: string

(Required) Specify an access mode:
  • rowBased – the adapter reads one text line at a time.
  • Streaming – the adapter reads a preconfigured size of bytes into a buffer.
No default value.
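
As an illustration, the three required elements might appear together as follows. This is a minimal sketch: the enclosing FileInputTransporterParameters element and the directory and file names are assumptions, not values taken from this reference.

```xml
<!-- Hypothetical wrapper element; check your adapter configuration schema for the actual name -->
<FileInputTransporterParameters>
  <!-- Hypothetical absolute path, written with forward slashes -->
  <Dir>/home/espuser/data</Dir>
  <!-- Hypothetical file name -->
  <File>input.csv</File>
  <!-- Read one text line at a time -->
  <AccessMode>rowBased</AccessMode>
</FileInputTransporterParameters>
```
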

DynamicMode

Type: string

(Advanced) Specify a dynamic mode:
  • Static – the adapter reads the file specified in the Dir and File elements.
  • dynamicFile – the adapter reads the file specified in the Dir and File elements and keeps polling for newly appended content. The polling period is specified in the PollingPeriod element.
  • dynamicPath – the adapter polls for new files in the directory specified in the Dir element. The File element acts as a regex pattern that selects which files to read.
The default value is Static. If DynamicMode is set to dynamicPath and you leave the File element empty, the adapter reads all the files in the specified directory.

An example regex pattern is ".*\.txt", which selects only files that end with ".txt". In regex patterns, you must place an escape character, "\", before metacharacters to match them literally.
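
A directory-polling configuration that uses this pattern might look like the following sketch. The enclosing element name, the directory, and the polling period of 5 seconds are assumptions chosen for illustration. PollingPeriod is set to a positive value because its default of 0 turns off polling.

```xml
<!-- Hypothetical wrapper element; check your adapter configuration schema for the actual name -->
<FileInputTransporterParameters>
  <!-- Hypothetical directory to watch -->
  <Dir>/home/espuser/data</Dir>
  <!-- Regex pattern: select only files that end with ".txt" -->
  <File>.*\.txt</File>
  <AccessMode>rowBased</AccessMode>
  <!-- Poll the directory for new files that match the pattern -->
  <DynamicMode>dynamicPath</DynamicMode>
  <!-- Illustrative value, in seconds -->
  <PollingPeriod>5</PollingPeriod>
</FileInputTransporterParameters>
```
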

PollingPeriod

Type: integer

(Advanced) Define the period, in seconds, to poll the specified file or directory. Set this element only if the value of the DynamicMode element is set to dynamicFile or dynamicPath.

The default value is 0. A value of 0 or less turns off polling.
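
For example, to keep reading content appended to a single file, the polling period could be combined with the dynamicFile mode as sketched here. The enclosing element name, the path, the file name, and the 10-second period are assumptions for illustration only.

```xml
<!-- Hypothetical wrapper element; check your adapter configuration schema for the actual name -->
<FileInputTransporterParameters>
  <!-- Hypothetical directory and file name -->
  <Dir>/home/espuser/logs</Dir>
  <File>events.log</File>
  <AccessMode>rowBased</AccessMode>
  <!-- Keep polling the file for newly appended content -->
  <DynamicMode>dynamicFile</DynamicMode>
  <!-- Poll every 10 seconds; 0 or a negative value would turn polling off -->
  <PollingPeriod>10</PollingPeriod>
</FileInputTransporterParameters>
```
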

RemoveAfterProcess

Type: boolean

(Optional) If this element is set to true, the adapter removes the file from the directory after processing it. This element takes effect if the value of the DynamicMode element is set to dynamicPath, and is ignored if it is set to dynamicFile.

The default value is false.

ScanDepth

Type: integer

(Optional) Specify the depth of the schema discovery, that is, the number of rows the adapter reads when discovering the input data schema.

The default value is 3.
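
The optional elements can be added to a directory-polling configuration as in the sketch below. The enclosing element name, the directory, and the specific values (a 5-second polling period and a scan depth of 10 rows) are assumptions for illustration, not recommended settings.

```xml
<!-- Hypothetical wrapper element; check your adapter configuration schema for the actual name -->
<FileInputTransporterParameters>
  <!-- Hypothetical directory to watch -->
  <Dir>/home/espuser/data</Dir>
  <File>.*\.txt</File>
  <AccessMode>rowBased</AccessMode>
  <DynamicMode>dynamicPath</DynamicMode>
  <PollingPeriod>5</PollingPeriod>
  <!-- Remove each file after it is processed; takes effect because DynamicMode is dynamicPath -->
  <RemoveAfterProcess>true</RemoveAfterProcess>
  <!-- Read 10 rows during schema discovery instead of the default -->
  <ScanDepth>10</ScanDepth>
</FileInputTransporterParameters>
```
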