Regular Expressions: Read From Socket Using a Regular Expression Adapter

The Read from Socket Using a Regular Expression adapter opens a TCP connection to the address specified by the Host and Port properties. Once the connection is established, the adapter reads from it line by line, matches each line against the given regular expression, and produces a tuple from the sub-expressions.

The adapter uses POSIX regular expression syntax. Use parentheses, "(...)", to denote sub-expressions. The order of the sub-expressions must match the order of the columns in the tuple descriptor. See the description of the Regular Expressions: Read From File Using a Regular Expression Adapter for more details about regular expressions.
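
As a rough illustration of how sub-expressions map onto fields, the following Python sketch pairs each captured group with a field name by position. The line format, the pattern, and the field names (Symbol, Volume, Price) are hypothetical, and Python's `re` module is used here in place of the adapter's POSIX engine, so only constructs common to both appear in the pattern.

```python
import re

# Hypothetical input format and field names, chosen only for illustration.
# The pattern sticks to constructs that POSIX extended syntax also supports.
PATTERN = re.compile(r"^([A-Z]+) ([0-9]+) ([0-9]+\.[0-9]+)$")
FIELDS = ["Symbol", "Volume", "Price"]


def line_to_tuple(line):
    """Return a dict of field name -> captured text, or None if the line does not match."""
    match = PATTERN.match(line.rstrip("\r\n"))
    if match is None:
        return None
    # groups() lists sub-expressions left to right, so the i-th group
    # pairs with the i-th field name.
    return dict(zip(FIELDS, match.groups()))
```

With this sketch, a line such as `IBM 100 84.25` yields Symbol, Volume, and Price values in that order; swapping two names in `FIELDS` without reordering the sub-expressions would assign the captured text to the wrong columns.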

If a connection is lost during adapter execution, the adapter attempts to reconnect.
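
The read-and-reconnect behavior can be approximated in a few lines of Python. This is only a sketch of the general technique, not the adapter's implementation: the `retry_delay` and `max_attempts` parameters are hypothetical and do not correspond to adapter properties, and the adapter's actual retry policy is not described here.

```python
import socket
import time


def read_lines(host, port, retry_delay=1.0, max_attempts=None):
    """Yield text lines from host:port, reconnecting whenever the connection ends.

    retry_delay and max_attempts are hypothetical knobs for this sketch;
    max_attempts=None means keep trying indefinitely.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            with socket.create_connection((host, port)) as conn:
                # makefile() provides a file-like view that can be iterated by line.
                with conn.makefile("r", encoding="utf-8", newline="\n") as stream:
                    for line in stream:
                        yield line.rstrip("\r\n")
        except OSError:
            # Connection refused or lost: fall through and try again.
            pass
        time.sleep(retry_delay)
```

Each yielded line would then be matched against the regular expression; a line cut off by a dropped connection is yielded as received, which a real implementation might prefer to discard.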

| Property Name (screen) | Property Name (Attach Adapter) | Type | Description |
|---|---|---|---|
| Host | Host | String | The host name or IP address of the data source. |
| Port | Port | Integer (Min: 1, Max: 65535) | The port number of the data source. |
| Regular expression | RegEx | String | The regular expression to use when parsing rows. |
| Fields | Fields | String | A comma-separated list of fields corresponding to sub-expressions. |
| Timestamp column format | TimestampColumnFormat | String | The format of the timestamp columns (for example, YYYY/MM/DD HH24:MI:SS.FF). If no timestamp format is specified, the adapter assumes that the timestamp is represented as a number of microseconds from 00:00:00 Jan 1, 1970 UTC/GMT. Note that this format specifier applies to all input columns of type TIMESTAMP, not only to the row timestamp column. This means that ALL columns of type TIMESTAMP must be formatted the same way; you cannot specify independent formats for each TIMESTAMP column. For more information, see Reading, Writing, and Converting Timestamps. |
| Ignore Mismatch | IgnoreMismatch | Boolean | If true, a line that does not match the regular expression is ignored. If false, a line that does not match the regular expression raises an error and the adapter is stopped. |
| Log Mismatch | LogMismatch | Boolean | If true, lines that do not match the regular expression are logged. If false, mismatched lines are dropped silently. This option is used only when the IgnoreMismatch option is set to true. |
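
The interaction between IgnoreMismatch and LogMismatch can be sketched as follows. The function name, the `MismatchError` class, and the logger name are hypothetical stand-ins; the sketch also assumes the pattern is applied from the start of each line, which the property descriptions do not spell out.

```python
import logging
import re

logger = logging.getLogger("regex_adapter_sketch")  # hypothetical logger name


class MismatchError(Exception):
    """Hypothetical stand-in for the error the adapter raises on a mismatch."""


def parse_lines(lines, pattern, ignore_mismatch, log_mismatch):
    """Yield sub-expression tuples for matching lines, handling mismatches per the flags."""
    regex = re.compile(pattern)
    for line in lines:
        match = regex.match(line)
        if match is not None:
            yield match.groups()
        elif not ignore_mismatch:
            # IgnoreMismatch = false: stop at the first line that does not match.
            raise MismatchError(f"line does not match: {line!r}")
        elif log_mismatch:
            # IgnoreMismatch = true, LogMismatch = true: record the line and skip it.
            logger.warning("skipping mismatched line: %r", line)
        # IgnoreMismatch = true, LogMismatch = false: skip the line silently.
```

Note that `log_mismatch` is consulted only on the branch where `ignore_mismatch` is true, mirroring the rule that LogMismatch has no effect otherwise.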

Note: Each input stream has a property (see the stream's Properties tab in Studio) that specifies whether to use the current server timestamp instead of the row timestamp set by the adapter. If this stream property is set to true, it overrides any row timestamp set by the adapter.
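
The two timestamp representations described for the Timestamp column format property can be compared with a short Python sketch. The `strptime` pattern below is an assumed equivalent of the example format YYYY/MM/DD HH24:MI:SS.FF, not something the adapter accepts, and treating formatted values as UTC is likewise an assumption made for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Assumed Python equivalent of the example format YYYY/MM/DD HH24:MI:SS.FF.
EXAMPLE_STRPTIME = "%Y/%m/%d %H:%M:%S.%f"


def parse_timestamp(text, strptime_format=None):
    """Convert a captured timestamp string to an aware datetime.

    With no format, the text is read as microseconds since
    00:00:00 Jan 1, 1970 UTC, matching the documented default.
    """
    if strptime_format is None:
        return EPOCH + timedelta(microseconds=int(text))
    # Assumption for this sketch: formatted values are interpreted as UTC.
    return datetime.strptime(text, strptime_format).replace(tzinfo=timezone.utc)
```

Because a single format applies to every TIMESTAMP column, a sketch like this would call `parse_timestamp` with the same `strptime_format` argument for all such columns in a row.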