R12.2 Apps DBA

ADOP Fails Because Remote Execution Failed

On: version 12.2.4, Online Patching (OP)
When attempting to run adop phase=prepare, fs_clone, or abort in a multi-node deployment,
the following error occurs.
ERROR
-----------------------
  Executing "adop" action in all nodes
  [ERROR] Node: "erpdb" Status: "failed"
  [STATEMENT] Node: "erpp1" Status: "success"
  [ERROR] Node: "erpp2" Status: "failed"
  [ERROR] Remote execution failed on Node: "erpp2".
  [ERROR] Remote execution failed on Node: "erpdb".
  [EVENT] [START 2015/08/30 17:45:19] Checking if adop can continue with available nodes
  [EVENT] Calling: /engn101/erpprod/fs2/EBSapps/appl/ad/12.0.0/patch/115/bin/txkADOPEvalSrvStatus.pl
  [STATEMENT] Output directory: /engn101/erpprod/fs_ne/EBSapps/log/adop/16/prepare_20150830_164022/ERPPROD_erpp1
 [UNEXPECTED]Error occurred while executing "perl /engn101/erpprod/fs2/EBSapps/appl/ad/12.0.0/patch/115/bin/txkADOPEvalSrvStatus.pl -contextfile=/engn101/erpprod/fs2/inst/apps/ERPPROD_erpp1/appl/admin/ERPPROD_erpp1.xml -patchcontextfile=/engn101/erpprod/fs1/inst/apps/ERPPROD_erpp1/appl/admin/ERPPROD_erpp1.xml -phase=prepare -promptmsg=hide -nodelist=erpp1 -console=off -sessionid=16 -timestamp=20150830_164022 -outdir=/engn101/erpprod/fs_ne/EBSapps/log/adop/16/prepare_20150830_164022/ERPPROD_erpp1"
  [UNEXPECTED]Error occurred while Checking if adop can continue with available nodes using command: "perl /engn101/erpprod/fs2/EBSapps/appl/ad/12.0.0/patch/115/bin/txkADOPEvalSrvStatus.pl -contextfile=/engn101/erpprod/fs2/inst/apps/ERPPROD_erpp1/appl/admin/ERPPROD_erpp1.xml -patchcontextfile=/engn101/erpprod/fs1/inst/apps/ERPPROD_erpp1/appl/admin/ERPPROD_erpp1.xml -phase=prepare -promptmsg=hide -nodelist=erpp1 -console=off -sessionid=16 -timestamp=20150830_164022 -outdir=/engn101/erpprod/fs_ne/EBSapps/log/adop/16/prepare_20150830_164022/ERPPROD_erpp1".
  Previous tasks have failed or are incomplete on nodes: ",erpp2,erpdb"
  [UNEXPECTED]Unable to continue with other completed nodes: "erpp1".
  [UNEXPECTED]Error: While executing "adop" on all other nodes.
  [UNEXPECTED]Execution failed when performing remote action.
remote_execution_result.xml:
     [END   2015/08/30 17:40:51] adzdoptl.pl runadop phase=prepare - Completed Successfully
     Log file: /engn101/erpprod/fs_ne/EBSapps/log/adop/16/adop_erpdb_20150830_164837.log
     adop exiting with status = 0 (Success)
     Worker count determination...stty: tcgetattr: Not a typewriter :failed</ResultString>
     <Status>failed</Status>
     <TraceFile/>
  </Node>
  <Node>
     <Name>erpp1</Name>
     <ProcessName>erpp1_patchpool-1-thread-2_erpp1_patch</ProcessName>
     <ResultString>
Please wait. Validating credentials...
Execute SYSTEM command : df -k /engn101/erpprod/fs1
Validation successful. All expected nodes are listed in ADOP_VALID_NODES table.
Worker count determination...
stty: tcgetattr: Not a typewriter
.end err out.
</ResultString>
     <Status>success</Status>
     <TraceFile>NoTraceFile</TraceFile>
  </Node>
  <Node>
     <Name>erpp2</Name>
     <ProcessName>nx1aerpp2_patchpool-1-thread-3_erpp2_patch</ProcessName>
     <ResultString> Please wait. Validating credentials...
     Execute SYSTEM command : df -k /engn101/erpprod/fs1 Validation successful.
     All expected nodes are listed in ADOP_VALID_NODES table.[EVENT]
......
     adop exiting with status = 0 (Success)
     Worker count determination...stty: tcgetattr: Not a typewriter :failed
     </ResultString>
     <Status>failed</Status>
     <TraceFile/>
  </Node>
</NODES>
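The pattern in the excerpt above (a node marked failed even though its ResultString reports "adop exiting with status = 0 (Success)" followed by the stty message) can be spotted with a small script. This is a minimal sketch, not part of adop: the function name false_failures is hypothetical, and it assumes the file keeps each tag on its own line as in the excerpt.

```shell
# Hypothetical helper: print nodes whose <Status> is "failed" although the
# ResultString shows a successful adop exit plus the stty message.
# Assumes one tag per line, as in the excerpt above.
false_failures() {
  awk '
    /<Node>/ { name = ""; ok = 0; stty = 0 }          # new node: reset state
    /<Name>/ { name = $0
               sub(/.*<Name>/, "", name)              # strip text before the name
               sub(/<\/Name>.*/, "", name) }          # strip text after the name
    /adop exiting with status = 0 \(Success\)/ { ok = 1 }
    /stty: tcgetattr: Not a typewriter/        { stty = 1 }
    /<Status>failed<\/Status>/ { if (ok && stty) print name }
  ' "$1"
}
```

Usage, with the path left as a placeholder because it depends on the adop session log directory:

```shell
false_failures <path to>/remote_execution_result.xml
```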
STEPS
-----------------------
The issue can be reproduced at will with the following steps:
1. Log in to the primary node as applmgr
2. Run adop phase=prepare
CAUSE
In a multi-node, non-shared environment, ADOP reports the remote execution status of a slave node as failed because of "stty: tcgetattr: Not a typewriter :failed",
even though the task completed successfully.
The stty command emits this error, and as a result a failed status is returned.
SSH to the remote nodes had not been enabled using txkRunSSHSetup.pl.
After enabling SSH via txkRunSSHSetup.pl, the issue is resolved.
This is explained in the following bug:
Bug 18138059 - QREP1224:B1:REMOTE EXECUTION ON SLAVE NODE RETURN FAILED STATE IN MN NON SHARED
SOLUTION
Re-enable ssh from the primary node to all secondary nodes using txkRunSSHSetup.pl.
For example, a basic command to enable ssh would be:
$ perl $AD_TOP/patch/115/bin/txkRunSSHSetup.pl enablessh -contextfile=<CONTEXT_FILE> -hosts=h1,h2,h3
To verify ssh operation:
$ perl $AD_TOP/patch/115/bin/txkRunSSHSetup.pl verifyssh -contextfile=<CONTEXT_FILE> -hosts=h1,h2,h3 \
-invalidnodefile=<filename to report ssh verification failures>
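As an independent sanity check alongside verifyssh, passwordless ssh from the primary node can also be tested directly. The sketch below is not a replacement for txkRunSSHSetup.pl: the function name check_passwordless_ssh and the SSH_BIN override are hypothetical, and it assumes a standard OpenSSH client.

```shell
# Hypothetical helper: report which hosts accept key-based ssh without a prompt.
# SSH_BIN may be overridden (for example in tests); it defaults to ssh.
check_passwordless_ssh() {
  for host in "$@"; do
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    if "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$host" true \
         </dev/null >/dev/null 2>&1; then
      echo "$host ok"
    else
      echo "$host FAILED"
    fi
  done
}
```

For example, check_passwordless_ssh h1 h2 h3 prints one line per host. Any host reported as FAILED is a candidate for re-running enablessh.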
