Friday, August 21, 2015

adop phase=prepare failed with ERROR: Script failed, exit code 255 while creating WLS domain


Issue: 
We ran the adop prepare phase on our EBS 12.2 Patch file system, and it failed with the errors below in the FSCloneApplyAppsTier_08200431.log file.

START: Creating new WLS domain.
Running /u01/app/EBSDB/fs2/FMW_Home/oracle_common/bin/pasteConfig.sh -javaHome /u01/app/EBSDB/fs2/EBSapps/comn/util/jdk64 -al /u01/app/EBSDB/fs1/EBSapps/comn/adopclone_ebsdb01/FMW/WLS/EBSdomain.jar -tdl /u01/app/EBSDB/fs2/FMW_Home/user_projects/domains/EBS_domain_EBSDB -tmw /u01/app/EBSDB/fs2/FMW_Home -mpl /u01/app/EBSDB/fs1/EBSapps/comn/adopclone_ebsdb01/FMW/WLS/plan/moveplan.xml -ldl /u01/app/EBSDB/fs1/inst/apps/EBSDB_ebsdb01/admin/log/clone/wlsT2PApply -silent true -debug true -domainAdminPassword /u01/app/EBSDB/fs1/EBSapps/comn/adopclone_ebsdb01/FMW/tempinfo.txt
Script Executed in 1602 milliseconds, returning status 255
ERROR: Script failed, exit code 255



This output did not give enough detail, so we reviewed the CLONE2015-08-20_04-31-47AM.log and CLONE2015-08-20_04-31-47AM.error log files under the $ADOP_LOG_HOME/22/<prepare..>/<CONTEXT_NAME>/ directory. These provided the additional information below, which helped in resolving the issue.

CLONE2015-08-20_04-31-47AM.error:

SEVERE : Aug 20, 2015 4:31:49 AM - ERROR - CLONE-20372   Server port validation failed.
SEVERE : Aug 20, 2015 4:31:49 AM - CAUSE - CLONE-20372   Ports of following servers - AdminServer(7004) - are not available.  <-- This is the Run file system port; the Patch file system should be using 7005
SEVERE : Aug 20, 2015 4:31:49 AM - ACTION - CLONE-20372   Provide valid free ports.
oracle.as.t2p.exceptions.FMWT2PPasteConfigException: PasteConfig failed. Make sure that the move plan and the values specified in moveplan are correct.
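
CLONE-20372 reports that a port is "not available". As a rough, independent check, the sketch below tests whether a TCP port can be bound on the application server host. The function name port_is_free is hypothetical, and this is only an approximation of whatever validation the clone utility performs internally.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Bind failed: something else holds the port (or it is restricted).
            return False
        return True


if __name__ == "__main__":
    # 7004 and 7005 are the AdminServer ports mentioned in this post.
    for candidate in (7004, 7005):
        state = "free" if port_is_free(candidate) else "in use"
        print(f"port {candidate}: {state}")
```

If 7004 shows as in use while the Run file system AdminServer is up, that is expected; the real problem is that the Patch file system move plan is asking for that port at all.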


CLONE2015-08-20_04-31-47AM.log:

FINE : Aug 20, 2015 4:31:49 AM - [J2EEGenericValidationUtil:validateAndGetServerPort] Listen Port(7805) of server forms-c4ws_server1 is valid and free.
FINE : Aug 20, 2015 4:31:49 AM - [J2EEGenericValidationUtil:validateServerConfig] Mapping of ports vs servers from moveplan = {AdminServer=[7004], forms-c4ws_server1=[7805], forms_server1=[7405], oacore_server1=[7205], oafm_server1=[7605]}
FINE : Aug 20, 2015 4:31:49 AM - [J2EEGenericValidationUtil:validateServerConfig] Mapping of temporary ports assigned vs servers = {}
FINE : Aug 20, 2015 4:31:49 AM - [J2EEGenericValidationUtil:validateServerConfig] map_serverName_vs_actualPort_vs_TempPort = {}


The port mapping in the log above confirms that every port except the AdminServer port was picked up correctly: AdminServer still shows 7004, the Run file system port.
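
When the CLONE log is long, it can help to pull the server-to-port mapping out of the "Mapping of ports vs servers from moveplan" line programmatically. The following is a minimal sketch that assumes the line format shown in the log excerpt above; parse_port_mapping is a hypothetical helper, not part of any Oracle tooling.

```python
import re


def parse_port_mapping(log_line):
    """Extract {server_name: [ports]} from a 'Mapping of ports vs servers' line."""
    match = re.search(r"from moveplan = \{(.*?)\}", log_line)
    if match is None:
        return {}
    mapping = {}
    # Each entry looks like: oacore_server1=[7205]
    for name, ports in re.findall(r"([\w-]+)=\[([\d,\s]*)\]", match.group(1)):
        mapping[name] = [int(p) for p in ports.split(",") if p.strip()]
    return mapping


SAMPLE = (
    "FINE : Aug 20, 2015 4:31:49 AM - "
    "[J2EEGenericValidationUtil:validateServerConfig] "
    "Mapping of ports vs servers from moveplan = "
    "{AdminServer=[7004], forms-c4ws_server1=[7805], forms_server1=[7405], "
    "oacore_server1=[7205], oafm_server1=[7605]}"
)

if __name__ == "__main__":
    for server, ports in parse_port_mapping(SAMPLE).items():
        print(server, ports)
```

In this environment the managed servers all end in 5 on the Patch file system, so an AdminServer entry ending in 4 stands out immediately.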

Additional information noticed in FSCloneApplyAppsTier_08200431.log:
m_LstnAddrPrtsDBMap size = 6
Total no. of configGroup = 6
 configGroup Type name = SERVER_CONFIG
configProperty id = Server1
Count for NodeIterator nextNode = 3


Input for AdminServer Listen Address OR Port is NULL
Moveplan won't be updated for AdminServer Listen Address
Moveplan won't be updated for AdminServer Listen Port
+++++++++++++++++++++++++
configProperty id = Server2
Count for NodeIterator nextNode = 3
Input for oacore_server1 Listen Address OR Port is NULL
 Updating oacore_server1 Listen Address to ushtc0app02.jacobs.com
 Updating oacore_server1 Listen Port to 7205

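The AdminServer lines above ("Moveplan won't be updated for AdminServer ...") confirm that adop was unable to update the AdminServer port to the correct value as part of its preparation, whereas oacore_server1 was updated normally.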
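
The same check can be scripted against the FSCloneApplyAppsTier log. This is a small sketch that assumes the two message formats seen in the excerpt above ("Moveplan won't be updated for ..." and "Updating ... to ..."); the function name and return shape are illustrative only.

```python
import re

SKIPPED = re.compile(r"Moveplan won't be updated for (\S+) Listen (Address|Port)")
UPDATED = re.compile(r"Updating (\S+) Listen (Address|Port) to (\S+)")


def summarize_moveplan_updates(log_text):
    """Return (skipped, updated) summaries keyed by server name."""
    skipped, updated = {}, {}
    for line in log_text.splitlines():
        hit = SKIPPED.search(line)
        if hit:
            # Record which property (Address/Port) was left untouched.
            skipped.setdefault(hit.group(1), set()).add(hit.group(2))
            continue
        hit = UPDATED.search(line)
        if hit:
            updated.setdefault(hit.group(1), {})[hit.group(2)] = hit.group(3)
    return skipped, updated


SAMPLE_LOG = """\
Input for AdminServer Listen Address OR Port is NULL
Moveplan won't be updated for AdminServer Listen Address
Moveplan won't be updated for AdminServer Listen Port
Input for oacore_server1 Listen Address OR Port is NULL
 Updating oacore_server1 Listen Address to ushtc0app02.jacobs.com
 Updating oacore_server1 Listen Port to 7205
"""

if __name__ == "__main__":
    skipped, updated = summarize_moveplan_updates(SAMPLE_LOG)
    print("not updated:", skipped)
    print("updated:", updated)
```

Any server that appears in the "not updated" summary is a candidate for the kind of stale port seen here.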


Fix:
Upon further investigation, we found that the Listen Address value for AdminServer on the Run file system had been removed manually.

1) Log in to the WebLogic console of the Run file system. Click Lock & Edit, then navigate to Domain Structure -> Servers and select AdminServer.
2) On the Configuration screen, set the Listen Address value to the application server hostname (FQDN).
3) Bounce all the application services.
4) Run adop phase=prepare.

Workaround:
If you want the Listen Address value to remain null, run fs_clone for the EBS Apps first, then proceed with the adop prepare phase.

Tuesday, August 11, 2015

ADOP Fails with "Unable to find appltop_id for the host from database"

Issue:


Our EBS 12.2 application was recently cloned from PROD and uses a shared application tier. A user requested a patch, and when we tried to apply it, adop failed with errors. The error details are below.

[UNEXPECTED]Unable to find appltop_id for the host oraap02 from database
[UNEXPECTED]Unable to find appltop_id for the host oraap02 from database

[EVENT]     [START 2015/04/10 02:23:17] Verify SSH
Logfile location /u01/app/fs1/inst/apps/EBSR12_shared/logs/appl/rgf/TXK/verifyssh.log
xml output = /u01/app/fs1/inst/apps/EBSR12_shared/logs/appl/rgf/TXK/out.xml
[EVENT]     [END   2015/04/10 02:23:20] Verify SSH
All nodes have ssh connectivity. Continuing...
[EVENT]     [START 2015/04/10 02:23:20] Checking the DB parameter value
[EVENT]     [END   2015/04/10 02:23:23] Checking the DB parameter value
  [UNEXPECTED]Unable to find appltop_id for the host oraap02 from database
  [UNEXPECTED]Invalid appltop id: "".
  [UNEXPECTED]Unrecoverable error occured. Exiting the current session.

  Log file: /u01/app/fs_ne/EBSapps/log/adop/adop_20150410_022145.log


The above error does not clearly indicate what was causing the issue. I started by reviewing and validating the tables below, and everything looked good.

FND_NODES
FND_OAM_CONTEXT_FILES
AD_APPL_TOPS
AD_ADOP_SESSIONS


Then, as a first attempt, I ran autoconfig on Node 1 "oraapp01" and restarted adop, this time with the fs_clone option. It printed the same errors again (pointing to node 2), but it went ahead with fs_clone.

[EVENT]     [START 2015/04/10 22:54:51] Checking the DB parameter value
[EVENT]     [END   2015/04/10 22:54:54] Checking the DB parameter value
    [UNEXPECTED]Unable to find appltop_id for the host oraapp02 from database
    [UNEXPECTED]Unable to find appltop_id for the host oraapp02 from database

However, fs_clone then failed at the validation phase, and the log file <ADPLOGHOME>/fs_clone_20150410_224816/EBSR12_oraap01/remote_execution_result_20150410_225611.xml showed the details below.



[ERROR]: Either the value of the context variable s_shared_file_system in the run file system is not consistent between the file system and the database or the APPL_TOP name across nodes is not set correctly.


Solution:

This made me look at the APPL_TOP name again. I reviewed the context files on both application tier nodes (primary and secondary) and found that the context variable s_atName differed between the two nodes.

For a shared application tier, the APPL_TOP name should be the same across all application tier nodes.
1) Modified the context variable s_atName on the secondary node to the master application node name.
2) Executed autoconfig and bounced services.
3) adop then completed successfully.
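
To catch this kind of mismatch before running adop, the s_atName values from each node's context file can be compared. The sketch below assumes that the context file is XML in which each variable is an element carrying an oa_var attribute; the element names in the sample strings and the helper names are hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET


def read_context_variable(xml_text, oa_var):
    """Return the text of the first element whose oa_var attribute matches."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        if element.get("oa_var") == oa_var:
            return (element.text or "").strip()
    return None


def find_mismatches(contexts, oa_var="s_atName"):
    """contexts maps node name -> context file XML text.

    Returns {node: value} when the nodes disagree, otherwise an empty dict.
    """
    values = {node: read_context_variable(text, oa_var)
              for node, text in contexts.items()}
    return values if len(set(values.values())) > 1 else {}


# Illustrative stand-ins for the two nodes' context files (not real file content).
PRIMARY = '<oa_context><APPL_TOP_NAME oa_var="s_atName">oraapp01</APPL_TOP_NAME></oa_context>'
SECONDARY = '<oa_context><APPL_TOP_NAME oa_var="s_atName">oraapp02</APPL_TOP_NAME></oa_context>'

if __name__ == "__main__":
    mismatches = find_mismatches({"primary": PRIMARY, "secondary": SECONDARY})
    print(mismatches or "s_atName is consistent across nodes")
```

In practice, the XML text would be read from each node's context file (the file that $CONTEXT_FILE points to) rather than from inline strings.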