Configure the File/Hadoop JSON Input adapter by specifying values for the ESP connector, formatter, and transporter modules in the adapter configuration file.
XML Element | Description |
---|---|
Log4jProperty |
Type: string (Optional) Specify a full path to the log4j.properties logging file you wish to use. The default value is $ESP_HOME/adapters/framework/config/log4j.properties. |
The File Input transporter reads data from local files, wraps the data as strings, and sends it to the next module specified in the adapter configuration file. Set values for this transporter in the adapter configuration file.
XML Element | Description |
---|---|
Module |
(Required) Element containing all information for this module. It contains a type attribute for specifying the module type. For example, transporter. |
InstanceName |
Type: string (Required) Instance name of the specific module you want to use. For example, MyInputTransporter. |
Name |
Type: string (Required) Name of the module as defined in the modulesdefine.xml file. For example, <TransporterType>InputTransporter. |
Next |
Type: string (Required) Instance name of the module that follows this one. |
BufferMaxSize |
Type: integer (Advanced) Capacity of the buffer queue between this module and the next. The default value is 10240. |
Parameters |
(Required) Element containing the FileInputTransporterParameters element. |
FileInputTransporterParameters |
(Required) Element containing elements for the File Input transporter. |
Dir |
Type: string (Required) Specify the absolute path to the data files that you want the adapter to read. For example, <username>/<foldername>. No default value. To use Hadoop system files, use an HDFS folder URI instead of a local file system folder. For example, hdfs://<hdfsserver>:9000/<foldername>/<subfoldername>/<leaffoldername>. To use Hadoop, download the binaries for Hadoop version 1.2.1 from http://hadoop.apache.org, then copy the hadoop-core.jar file (for example, hadoop-core-1.2.1.jar for version 1.2.1) to %ESP_HOME%\adapters\framework\libj. Ensure you use a stable version rather than a beta. Use a forward slash for both UNIX and Windows paths. |
File |
Type: string (Required) Specify the file you want the adapter to read, or a regex pattern that filters the files in a given directory. See the DynamicMode element. No default value. |
AccessMode |
Type: string (Required) Specify an access mode:
|
DynamicMode |
Type: string (Advanced) Specify a dynamic mode:
An example regex pattern is ".*\.txt", which selects only files that end with ".txt". In regex patterns, you must place an escape character, "\", before metacharacters to match them literally in the pattern string. |
PollingPeriod |
Type: integer (Advanced) Define the period, in seconds, at which to poll the specified file or directory. Set this element only if the value of the DynamicMode element is set to dynamicFile or dynamicPath. The default value is 0, which, like any negative value, turns off polling. |
RemoveAfterProcess |
Type: boolean (Optional) If this element is set to true, the file is removed from the directory after the adapter processes it. This element takes effect only if the value of the DynamicMode element is set to dynamicPath; it is ignored if DynamicMode is set to dynamicFile. The default value is false. |
ScanDepth |
Type: integer (Optional) Specify the depth of the schema discovery. The adapter reads the number of rows specified by this element value when discovering the input data schema. The default value is 3. |
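As a rough illustration of how the elements above nest, a File Input transporter entry might look like the following sketch. The instance names, the FileInputTransporter module name, the directory, the regex, and the polling period are assumptions chosen for illustration. The AccessMode value is left as a placeholder because its valid values are not listed in this section, and the element order simply follows the table, so check it against your adapter schema.

```xml
<!-- Sketch only: names, paths, and values are illustrative assumptions. -->
<Module type="transporter">
  <InstanceName>MyInputTransporter</InstanceName>
  <!-- Assumed name; use the name defined in modulesdefine.xml. -->
  <Name>FileInputTransporter</Name>
  <!-- Hypothetical instance name of the module that follows this one. -->
  <Next>MyJsonStreamFormatter</Next>
  <BufferMaxSize>10240</BufferMaxSize>
  <Parameters>
    <FileInputTransporterParameters>
      <!-- Hypothetical directory; forward slashes on UNIX and Windows. -->
      <Dir>/home/espuser/jsondata</Dir>
      <!-- Regex selecting files that end with ".json". -->
      <File>.*\.json</File>
      <!-- Placeholder: replace with a valid access mode. -->
      <AccessMode>ACCESS_MODE_PLACEHOLDER</AccessMode>
      <DynamicMode>dynamicPath</DynamicMode>
      <!-- Poll the directory every 5 seconds. -->
      <PollingPeriod>5</PollingPeriod>
      <RemoveAfterProcess>false</RemoveAfterProcess>
      <ScanDepth>3</ScanDepth>
    </FileInputTransporterParameters>
  </Parameters>
</Module>
```

For Hadoop input, the Dir value would instead be an HDFS folder URI of the form shown in the Dir description.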
The JSON Stream to JSON String formatter reads data from InputStream, splits it into standalone JSON message strings, and sends these message strings to the next module that is configured in the adapter configuration file.
XML Element | Description |
---|---|
Module |
(Required) Element containing all information for this module. It contains a type attribute for specifying the module type. For example, formatter. |
InstanceName |
Type: string (Required) Instance name of the specific module you want to use. For example, MyInputTransporter. |
Name |
Type: string (Required) Name of the module as defined in the modulesdefine.xml file. For example, <TransporterType>InputTransporter. |
Next |
Type: string (Required) Instance name of the module that follows this one. |
BufferMaxSize |
Type: integer (Advanced) Capacity of the buffer queue between this module and the next. The default value is 10240. |
Parameters |
(Required) Element containing the JsonStreamToJsonStringFormatterParameters element. |
JsonStreamToJsonStringFormatterParameters |
(Required) Element containing elements for the JSON Stream to JSON String formatter. |
CharsetName |
Type: string (Optional) Specify the name of a supported charset. The default value is US-ASCII. |
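A minimal sketch of this formatter's module entry, built from the elements in the table above, could look like this. The instance names and the JsonStreamToJsonStringFormatter module name are assumptions (the module name is inferred from the parameters element and should be checked against modulesdefine.xml), and UTF-8 is only an example charset.

```xml
<!-- Sketch only: instance and module names are illustrative assumptions. -->
<Module type="formatter">
  <InstanceName>MyJsonStreamFormatter</InstanceName>
  <!-- Assumed name; use the name defined in modulesdefine.xml. -->
  <Name>JsonStreamToJsonStringFormatter</Name>
  <!-- Hypothetical instance name of the module that follows this one. -->
  <Next>MyJsonStringFormatter</Next>
  <BufferMaxSize>10240</BufferMaxSize>
  <Parameters>
    <JsonStreamToJsonStringFormatterParameters>
      <!-- Overrides the US-ASCII default. -->
      <CharsetName>UTF-8</CharsetName>
    </JsonStreamToJsonStringFormatterParameters>
  </Parameters>
</Module>
```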
The JSON String to ESP formatter translates JSON strings to AepRecord objects.
XML Element | Description |
---|---|
Module |
(Required) Element containing all information for this module. It contains a type attribute for specifying the module type. For example, formatter. |
InstanceName |
Type: string (Required) Instance name of the specific module you want to use. For example, MyInputTransporter. |
Name |
Type: string (Required) Name of the module as defined in the modulesdefine.xml file. For example, <TransporterType>InputTransporter. |
Next |
Type: string (Required) Instance name of the module that follows this one. |
BufferMaxSize |
Type: integer (Advanced) Capacity of the buffer queue between this module and the next. The default value is 10240. |
Parameters |
(Required) Element containing the JsonStringToEspFormatterParameters element. |
JsonStringToEspFormatterParameters |
(Required) Element containing the JSON String to ESP formatter elements. |
ColumnMappings |
(Required) Element containing the ColsMapping element. |
ColsMapping |
Type: complexType (Required) Element which contains the Column element. You can have multiple ColsMapping elements if you are using an ESPMultiStreamPublisher. This element has two attributes:
|
Column |
Type: string (Required) Specify a JSONPath expression for the JSON data that you want to map to columns of an ESP stream. This expression is matched to the value specified in the rootpath attribute of the ColsMapping element, if applicable. You can have multiple Column elements. There are two types of JSON data: array or object.
For example, suppose you have the following JSON data about a person:
{ "firstName": "John", "lastName": "Smith", "phoneNumbers": [ { "type": "home", "number": "212 555-1234" }, { "type": "fax", "number": "646 555-4567" } ], "friends": [ ["female1","female2","female3"], ["male1","male2","male3"] ] }
You could get the individual's first name by using the JSONPath expression firstName. If you want the first phone number, specify phoneNumbers[0].number as the JSONPath expression. In both cases, you do not need to specify a rootpath value. If you want the numbers and types of all phone numbers, specify phoneNumbers as the rootpath value, and number and type as the JSONPath expressions in the Column elements. You can also specify * for Column, which indicates that you want all the data in an array (one that does not contain key-value data). For example, if you want all the female friends data, specify friends[0] for rootpath and * for Column. The first <Column/> element and its value are mapped to the first column of an ESP stream, the second <Column/> element and its value are mapped to the second column of an ESP stream, and so on. |
DateFormat |
Type: string (Advanced) The format string for parsing date values. For example, yyyy-MM-dd'T'HH:mm:ss. |
TimestampFormat |
Type: string (Advanced) Format string for parsing timestamp values. For example, yyyy-MM-dd'T'HH:mm:ss.SSS. |
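The phone number mapping described for the Column element can be sketched as the following module entry. The instance names and the JsonStringToEspFormatter module name are assumptions, as is the position of DateFormat and TimestampFormat relative to ColumnMappings. Only the rootpath attribute of ColsMapping is shown, because the second attribute is not named in this section.

```xml
<!-- Sketch only: instance and module names are illustrative assumptions. -->
<Module type="formatter">
  <InstanceName>MyJsonStringFormatter</InstanceName>
  <!-- Assumed name; use the name defined in modulesdefine.xml. -->
  <Name>JsonStringToEspFormatter</Name>
  <!-- Hypothetical instance name of the module that follows this one. -->
  <Next>MyEspPublisher</Next>
  <BufferMaxSize>10240</BufferMaxSize>
  <Parameters>
    <JsonStringToEspFormatterParameters>
      <ColumnMappings>
        <!-- One row per entry in the phoneNumbers array. -->
        <ColsMapping rootpath="phoneNumbers">
          <!-- Mapped to the first column of the ESP stream. -->
          <Column>number</Column>
          <!-- Mapped to the second column of the ESP stream. -->
          <Column>type</Column>
        </ColsMapping>
      </ColumnMappings>
      <DateFormat>yyyy-MM-dd'T'HH:mm:ss</DateFormat>
      <TimestampFormat>yyyy-MM-dd'T'HH:mm:ss.SSS</TimestampFormat>
    </JsonStringToEspFormatterParameters>
  </Parameters>
</Module>
```

With the sample JSON data from the Column description, this mapping would be expected to produce one row per phone number, with the number in the first column and the type in the second.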
The ESP Publisher module obtains data from a transporter or formatter module and publishes it to an ESP project.
XML Element | Description |
---|---|
Module |
(Required) Element containing all information for this module. It contains a type attribute for specifying the module type. For example, formatter. |
InstanceName |
Type: string (Required) Instance name of the specific module you want to use. For example, MyInputTransporter. |
Name |
Type: string (Required) Name of the module as defined in the modulesdefine.xml file. For example, <TransporterType>InputTransporter. |
BufferMaxSize |
Type: integer (Advanced) Capacity of the buffer queue between this module and the next. The default value is 10240. |
Parameters |
(Required) Element containing the EspPublisherParameters element. |
EspPublisherParameters |
(Required) Element containing elements for the ESP publisher. |
ProjectName |
Type: string (Required if adapter is running in standalone mode; optional if it is running in managed mode) Name of the ESP project to which the adapter is connected. For example, EspProject2. This is the same project tag that you specify later in the adapter configuration file in the Name element within the Event Stream Processor (EspProjects) element. If you are starting the adapter with the ESP project to which it is attached (that is, running the adapter in managed mode), you need not set this element as the adapter automatically detects the project name. |
StreamName |
Type: string (Required if adapter is running in standalone mode; optional if it is running in managed mode) Name of the ESP stream to which the adapter publishes data. If you are starting the adapter with the ESP project to which it is attached (that is, running the adapter in managed mode), you need not set this element as the adapter automatically detects the stream name. |
MaxPubPoolSize |
Type: positive integer (Optional) Maximum size of the record pool. Record pooling, also referred to as block or batch publishing, allows for faster publication since there is less overall resource cost in publishing multiple records together, compared to publishing records individually. Record pooling is disabled if this value is 1. The default value is 256. |
MaxPubPoolTime |
Type: positive integer (Optional) Maximum period of time, in milliseconds, for which records are pooled before being published. If not set, pooling time is unlimited and the pooling strategy is governed by MaxPubPoolSize. No default value. |
UseTransactions |
Type: boolean (Optional) If set to true, pooled messages are published to Event Stream Processor in transactions. If set to false, they are published in envelopes. The default value is false. |
SafeOps |
Type: boolean (Advanced) Converts the opcodes INSERT and UPDATE to UPSERT, and converts DELETE to SAFEDELETE. The default value is false. |
SkipDels |
Type: boolean (Advanced) Skips the rows with opcodes DELETE or SAFEDELETE. The default value is false. |
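A publisher entry for standalone mode, assembled from the elements in the table above, might look like the sketch below. The module type espconnector, the EspPublisher module name, the instance name, the stream name, and the pool time are all assumptions for illustration. No Next element appears because the table does not list one for this module.

```xml
<!-- Sketch only: type, names, and values are illustrative assumptions. -->
<Module type="espconnector">
  <InstanceName>MyEspPublisher</InstanceName>
  <!-- Assumed name; use the name defined in modulesdefine.xml. -->
  <Name>EspPublisher</Name>
  <BufferMaxSize>10240</BufferMaxSize>
  <Parameters>
    <EspPublisherParameters>
      <!-- Must match a Name inside the EspProjects element. -->
      <ProjectName>EspProject2</ProjectName>
      <!-- Hypothetical stream name. -->
      <StreamName>PhoneNumbersStream</StreamName>
      <!-- Publish once 256 records are pooled or after 100 ms. -->
      <MaxPubPoolSize>256</MaxPubPoolSize>
      <MaxPubPoolTime>100</MaxPubPoolTime>
      <!-- false: pooled records are published in envelopes. -->
      <UseTransactions>false</UseTransactions>
      <SafeOps>false</SafeOps>
      <SkipDels>false</SkipDels>
    </EspPublisherParameters>
  </Parameters>
</Module>
```

In managed mode, ProjectName and StreamName could be omitted, since the adapter detects them automatically.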
Event Stream Processor elements configure communication between Event Stream Processor and the File/Hadoop JSON Input adapter.
XML Element | Description |
---|---|
EspProjects |
(Required) Element containing elements for connecting to Event Stream Processor. |
EspProject |
(Required) Element containing the Name and Uri elements. Specifies information for the ESP project to which the adapter is connected. |
Name |
Type: string (Required) Specifies the unique project tag of the ESP project which the EspConnector (publisher/subscriber) module references. |
Uri |
Type: string (Required) Specifies the full project URI used to connect to the ESP project. For example, esp://localhost:19011/ws1/p1. |
Security |
(Required) Element containing all the authentication elements below. Specifies details for the authentication method used for Event Stream Processor. |
User |
Type: string (Required) Specifies the user name required to log in to Event Stream Processor (see AuthType). No default value. |
Password |
Type: string (Required) Specifies the password required to log in to Event Stream Processor (see AuthType). Includes an "encrypted" attribute indicating whether the Password value is encrypted. The default value of this attribute is false. If set to true, the password value is decrypted using RSAKeyStore and RSAKeyStorePassword. |
AuthType |
Type: string (Required) Method used to authenticate to the Event Stream Processor. Valid values are:
If the adapter is operated as a Studio plug-in, AuthType is overridden by the Authentication Mode Studio start-up parameter. |
RSAKeyStore |
Type: string (Dependent required) Specifies the location of the RSA keystore, which is used to decrypt the password value. Required if AuthType is set to server_rsa, or the encrypted attribute for Password is set to true, or both. |
RSAKeyStorePassword |
Type: string (Dependent required) Specifies the keystore password, which is used to decrypt the password value. Required if AuthType is set to server_rsa, or the encrypted attribute for Password is set to true, or both. |
KerberosKDC |
Type: string (Dependent required) Specifies host name of Kerberos key distribution center. Required if AuthType is set to kerberos. |
KerberosRealm |
Type: string (Dependent required) Specifies the Kerberos realm setting. Required if AuthType is set to kerberos. |
KerberosService |
Type: string (Dependent required) Specifies the Kerberos principal name that identifies an Event Stream Processor cluster. Required if AuthType is set to kerberos. |
KerberosTicketCache |
Type: string (Dependent required) Specifies the location of the Kerberos ticket cache file. Required if AuthType is set to kerberos. |
EncryptionAlgorithm |
Type: string (Optional) Used when the encrypted attribute for Password is set to true. If left blank, RSA is used as default. |
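To show how these connection elements might fit together, here is a sketch of an EspProjects block that uses server_rsa authentication. The user name, password, keystore path, and keystore password are placeholders, and placing Security inside EspProject is an assumption about the nesting. The Kerberos elements are omitted because they are required only when AuthType is set to kerberos.

```xml
<!-- Sketch only: credentials and paths are placeholder assumptions. -->
<EspProjects>
  <EspProject>
    <!-- Project tag referenced by ProjectName in the publisher. -->
    <Name>EspProject2</Name>
    <Uri>esp://localhost:19011/ws1/p1</Uri>
    <Security>
      <User>espuser</User>
      <!-- encrypted="false": the value is stored as plain text. -->
      <Password encrypted="false">PASSWORD_PLACEHOLDER</Password>
      <AuthType>server_rsa</AuthType>
      <!-- Required because AuthType is server_rsa. -->
      <RSAKeyStore>/path/to/keystore.jks</RSAKeyStore>
      <RSAKeyStorePassword>KEYSTORE_PASSWORD_PLACEHOLDER</RSAKeyStorePassword>
    </Security>
  </EspProject>
</EspProjects>
```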