Solution

When RepAgent shuts down because you did not define a large object (LOB) datatype column such a image, text, or unitext column correctly in the replication definition, you must purge corrupt data from affected queues and correct the replication definition before resuming replication.

This scenario creates table t3 and the corresponding replication that incorrectly defines the datatypes in the table resulting in error 32044 and the shut down of RepAgent. The scenario shows how to purge the queues and recover from the error.

  1. Create the tbl3 table in the primary and replicate Adaptive Server databases:
    create table tbl3
    (
    p_key int not null, 
    char_col char(10), 
    unitext_fld1 text null, 
    unitext_fld2 text null
    )
    Remember: The datatype of columns unitext_fld1 and unitext_fld2 in the schema for tbl3 is text and the columns are not compressed LOB data columns.
  2. Create the repdef_tbl3 replication definition for the tbl3 table in the pdb1 primary database of the PDS data server:
    create replication definition repdef_tbl3 
    with primary at PDS.pdb1 
    ( p_key int,
      char_col char(10), 
      unitext_fld1 image, 
      unitext_fld2 image
     )
    primary key (p_key)
    searchable columns (p_key) 
    always_replicate (unitext_fld1,unitext_fld2)
     go
    Attention: Instead of text, the replication definition is incorrect and defines the datatypes of columns unitext_fld1 and unitext_fld2 as image. The two columns are not compressed LOB data columns and therefore the replication definition does not need to define the columns as image dataype columns.
  3. Insert data into tbl3 at pdb1:
    insert into tbl3 values (5,'cc','aa','bb')
    go
    The RepAgent shuts down because of the colmn datatype mismatch between the table schema and the replication definition and you see error messages in the log files of the :
    • Primary Replication Server –
      E. 2013/01/23 18:38:44. ERROR #32044 REP AGENT(PDS.pdb1) - /nrm/nrm.c(11165)
              The datatype ('4') for 'tbl3.unitext_fld1' is not convertible to the required type ('5').
    • Primary Adaptive Server–
      00:0003:00000:00037:2013/01/23 18:38:44.15 server  RepAgent(4): Error in passthru packet: 
      'distribute 4 0x000000000000234b00003dbe000600003dbe00030000a14f013343db00000000,
      6 0x000000000000234b00037265706c325f313233363874646232 applied owner =~"$dbo ~"%tbl3.
      ~!-rs_writetext append first last changed with log textlen =2 ~$-unitext_fld1=~.!!#aa '.
      00:0003:00000:00037:2013/01/23 18:38:44.15 server  
      RepAgent(4): Received the following error message from the Replication Server:
      Msg 32044. The datatype ('4') for 'tbl3.unitext_fld1' is not convertible to the required type ('5')..
  4. Obtain the inbound queue number for pdb1 in PDS. At Replication Server enter:
    admin who
    You see:
    29 SQM        Awaiting Message     102:1 PDS.pdb1
    27 SQM        Awaiting Message     102:0 PDS.pdb1
       REP AGENT  Down                 PDS.pdb1
    The inbound queue number is 102.
  5. Dump the contents of queue 102 to the client machine where you are issuing the sysadmin dump_queue command. The first open transaction after RepAgent shuts down contains the corrupt data. At Replication Server enter:
    sysadmin dump_queue,102,1,-1,-1,-1,client
    go
    In the Replication Server log, you see:
    102           1           0          15           0         240         102
            Jan 23 2013  6:38PM 0x000000000000234b00003dbe000300003dbe00030000a14f0133
            43db0000000000000000 sa        _ins                           
            0x0000000000000000000000000000000000000000000000000000000000000000000f0000
            4 0x000000000000234b00037265706c325f31323336387464623200000066000000000000
            00000000000000000000000000000000000000000000000000000000000000000000000000
            00000000000000000000000000000000000000000000000000000000000000000000000000
            0000000000000000000000
            0        1100          19           0
            begin transaction                                                                                                                                                                                                                                        
            102           1           0          15           1         484         102
            Jan  1 1900 12:00AM
            0x000000000000234b00003dbe000400003dbe00030000a14f013343db0000000000000000
            NULL         NULL                           
            0x0000000000000000000000000000000000000000000000000000000000000000000f0001
            2101248
            0x000000000000234b00037265706c325f3132333638746462320000006600000000000000
            00000000000000000000000000000000000000000000000000000000000000000000000000
            00000000000000000000000000000000000000000000000000000000000000000000000000
            00000000000000000000 
            0        1100         104           0
            insert into dbo.tbl3 (p_key, char_col, unitext_fld1, unitext_fld2)
            values (5, 'cc        ', 0x20, 0x20)                 
    Note this transaction is interrupted. only begin transaction and insert.
    
  6. Purge the first open transaction from the 102 inbound queue. At Replication Server enter:
    sysadmin purge_first_open,102,1
    go
  7. Dump the queue contents again to check if there are any error messages due to the continued presence of corrupt data in the queues:
    sysadmin dump_queue,102,1,-1,-1,-1,client
    go
  8. Purge the queue until there are no error 32044 messages in the dump from the queue.
    sysadmin purge_first_open,102,1
    go
  9. Correct the replication definition by changing the datatypes of the unitext_fld1 and unitext_fld2 columns to text to match the table schema:
    alter replication definition tbl3
    alter columns with unitext_fld1 text
    go
    alter replication definition tbl3
    alter columns with unitext_fld2 text
    go
  10. Enable autocorrection for the DSI thread:
    set autocorrection on
    for replication_definition
    with replicate at data_server.database
  11. Retrieve the current database generation ID.
    At the Adaptive Server primary database, enter:
    dbcc gettrunc
    go
    You see a single row
    secondary trunc page secondary trunc state dbrepstat generation id 
    -------------------- --------------------- --------- ------------- 
                   15836                     1       231             1
    
    database id database name ltl version
    ----------- ------------- -----------
              4 pdb1          760
    
  12. Instruct Adaptive Server to increment the database generation number in the log for the database where RepAgent has shut down
    Since the value for the "generation id" column output from dbcc gettrunc is 1, set the new ID value to 2:
    dbcc settrunc('ltm', 'gen_id', 2)
    go
  13. Restart RepAgent
  14. Disable autocorrection after successful replication of the data that was previously corrupted:
    set autocorrection off
    for replication_definition
    with replicate at data_server.database