Friday, August 21, 2015

adop phase=prepare failed with ERROR: Script failed, exit code 255 while creating WLS domain

We were running the adop prepare phase against the Patch file system of our EBS 12.2 instance, and it failed with the errors below in the FSCloneApplyAppsTier_08200431.log file.

START: Creating new WLS domain.
Running /u01/app/EBSDB/fs2/FMW_Home/oracle_common/bin/ -javaHome /u01/app/EBSDB/fs2/EBSapps/comn/util/jdk64 -al /u01/app/EBSDB/fs1/EBSapps/comn/adopclone_ebsdb01/FMW/WLS/EBSdomain.jar -tdl /u01/app/EBSDB/fs2/FMW_Home/user_projects/domains/EBS_domain_EBSDB -tmw /u01/app/EBSDB/fs2/FMW_Home -mpl /u01/app/EBSDB/fs1/EBSapps/comn/adopclone_ebsdb01/FMW/WLS/plan/moveplan.xml -ldl /u01/app/EBSDB/fs1/inst/apps/EBSDB_ebsdb01/admin/log/clone/wlsT2PApply -silent true -debug true -domainAdminPassword /u01/app/EBSDB/fs1/EBSapps/comn/adopclone_ebsdb01/FMW/tempinfo.txt
Script Executed in 1602 milliseconds, returning status 255
ERROR: Script failed, exit code 255

The output above didn't give enough information, so I had to review the CLONE2015-08-20_04-31-47AM.log and CLONE2015-08-20_04-31-47AM.error log files under the $ADOP_LOG_HOME/22/<prepare..>/<CONTEXT_NAME>/ directory. These provided the additional information below, which helped in resolving the issue.


SEVERE : Aug 20, 2015 4:31:49 AM - ERROR - CLONE-20372   Server port validation failed.
SEVERE : Aug 20, 2015 4:31:49 AM - CAUSE - CLONE-20372   Ports of following servers - AdminServer(7004) - are not available.  <-- This is the Run file system port; it should be looking for 7005 instead
SEVERE : Aug 20, 2015 4:31:49 AM - ACTION - CLONE-20372   Provide valid free ports. PasteConfig failed. Make sure that the move plan and the values specified in moveplan are correct.

CLONE2015-08-20_04-31-47AM.log:FINE : Aug 20, 2015 4:31:49 AM - [J2EEGenericValidationUtil:validateAndGetServerPort] Listen Port(7805) of server forms-c4ws_server1 is valid and free.
FINE : Aug 20, 2015 4:31:49 AM - [J2EEGenericValidationUtil:validateServerConfig] Mapping of ports vs servers from moveplan = {AdminServer=[7004], forms-c4ws_server1=[7805], forms_server1=[7405], oacore_server1=[7205], oafm_server1=[7605]}
FINE : Aug 20, 2015 4:31:49 AM - [J2EEGenericValidationUtil:validateServerConfig] Mapping of temporary ports assigned vs servers = {}
FINE : Aug 20, 2015 4:31:49 AM - [J2EEGenericValidationUtil:validateServerConfig] map_serverName_vs_actualPort_vs_TempPort = {}

The port mapping above confirms that every port except the AdminServer port was picked up correctly.
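To make this kind of check repeatable, the "Mapping of ports vs servers from moveplan" line can be parsed and compared against the ports already used by the Run file system. The following is only a minimal sketch: the function names are made up for illustration, and the set of Run file system ports is an assumption (only 7004 is confirmed by the log above).

```python
import re


def parse_port_map(log_line):
    """Extract {server: [ports]} from a 'Mapping of ports vs servers' log line."""
    match = re.search(r"from moveplan = \{(.*?)\}", log_line)
    if match is None:
        return {}
    return {
        name: [int(p) for p in ports.split(",") if p.strip()]
        for name, ports in re.findall(r"(\w[\w-]*)=\[([^\]]*)\]", match.group(1))
    }


def servers_on_run_ports(port_map, run_ports):
    """Return the servers whose moveplan port matches a Run file system port."""
    return sorted(
        name for name, ports in port_map.items()
        if any(p in run_ports for p in ports)
    )


line = ("[J2EEGenericValidationUtil:validateServerConfig] Mapping of ports vs "
        "servers from moveplan = {AdminServer=[7004], forms-c4ws_server1=[7805], "
        "forms_server1=[7405], oacore_server1=[7205], oafm_server1=[7605]}")

port_map = parse_port_map(line)
# Assumption: 7004 is the only Run file system port we know of from the log.
run_ports = {7004}
print(servers_on_run_ports(port_map, run_ports))
```

With the log line from this incident, only AdminServer is reported, which matches the CLONE-20372 error.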

Additional information noticed in FSCloneApplyAppsTier_08200431.log:
m_LstnAddrPrtsDBMap size = 6
Total no. of configGroup = 6
 configGroup Type name = SERVER_CONFIG
configProperty id = Server1
Count for NodeIterator nextNode = 3

Input for AdminServer Listen Address OR Port is NULL
Moveplan won't be updated for AdminServer Listen Address
Moveplan won't be updated for AdminServer Listen Port
configProperty id = Server2
Count for NodeIterator nextNode = 3
Input for oacore_server1 Listen Address OR Port is NULL
 Updating oacore_server1 Listen Address to
 Updating oacore_server1 Listen Port to 7205

The "Moveplan won't be updated for AdminServer" entries above confirm that adop was unable to update the AdminServer port to the correct value as part of its preparation.
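When the log is long, the servers that adop skipped can be pulled out with a small script instead of reading it by eye. This is a rough sketch that assumes the message wording is exactly as in the excerpt above; the function name is hypothetical.

```python
import re

# Matches lines such as: Moveplan won't be updated for AdminServer Listen Port
SKIP_PATTERN = re.compile(
    r"Moveplan won't be updated for (\S+) Listen (Address|Port)"
)


def skipped_moveplan_updates(log_text):
    """Return {server: [fields]} for every moveplan update that was skipped."""
    skipped = {}
    for server, field in SKIP_PATTERN.findall(log_text):
        skipped.setdefault(server, []).append(field)
    return skipped


# Excerpt copied from FSCloneApplyAppsTier_08200431.log above.
sample = """\
Input for AdminServer Listen Address OR Port is NULL
Moveplan won't be updated for AdminServer Listen Address
Moveplan won't be updated for AdminServer Listen Port
Input for oacore_server1 Listen Address OR Port is NULL
 Updating oacore_server1 Listen Address to
 Updating oacore_server1 Listen Port to 7205
"""

print(skipped_moveplan_updates(sample))
```

In practice the text would be read from the FSCloneApplyAppsTier log file rather than from an inline string.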

Upon further investigation, we found that the Listen Address value for the AdminServer on the Run file system had been removed manually. To fix it:

1) Log in to the WebLogic console of the Run file system. Click Lock & Edit, then navigate to Domain Structure -> Servers and select AdminServer.
2) On the Configuration screen, set the Listen Address value to the application server hostname (FQDN).
3) Bounce all the application services.
4) Run adop phase=prepare.

If you want the Listen Address value to stay null, run fs_clone for the EBS Apps first and then proceed with the adop prepare phase.

Tuesday, August 11, 2015

ADOP Fails with "Unable to find appltop_id for the host from database"


Our EBS 12.2 application was recently cloned from PROD and uses a shared application tier. A user asked us to apply a patch, and when we tried, adop failed with errors. The error details are below.

[UNEXPECTED]Unable to find appltop_id for the host oraap02 from database
[UNEXPECTED]Unable to find appltop_id for the host oraap02 from database

[EVENT]     [START 2015/04/10 02:23:17] Verify SSH
Logfile location /u01/app/fs1/inst/apps/EBSR12_shared/logs/appl/rgf/TXK/verifyssh.log
xml output = /u01/app/fs1/inst/apps/EBSR12_shared/logs/appl/rgf/TXK/out.xml
[EVENT]     [END   2015/04/10 02:23:20] Verify SSH
All nodes have ssh connectivity. Continuing...
[EVENT]     [START 2015/04/10 02:23:20] Checking the DB parameter value
[EVENT]     [END   2015/04/10 02:23:23] Checking the DB parameter value
  [UNEXPECTED]Unable to find appltop_id for the host oraap02 from database
  [UNEXPECTED]Invalid appltop id: "".
  [UNEXPECTED]Unrecoverable error occured. Exiting the current session.

  Log file: /u01/app/fs_ne/EBSapps/log/adop/adop_20150410_022145.log

The error above doesn't give a clear indication of what was causing the issue. I reviewed and validated the related database tables, and everything looked fine.


As a first attempt, I ran AutoConfig on node 1 ("oraapp01") and ran adop again, this time with the fs_clone option. It printed the same errors (pointing to node 2), but this time it went ahead with fs_clone.

[EVENT]     [START 2015/04/10 22:54:51] Checking the DB parameter value
[EVENT]     [END   2015/04/10 22:54:54] Checking the DB parameter value
    [UNEXPECTED]Unable to find appltop_id for the host oraapp02 from database
    [UNEXPECTED]Unable to find appltop_id for the host oraapp02 from database

However, fs_clone then failed in the validation phase, and the log file <ADPLOGHOME>/fs_clone_20150410_224816/EBSR12_oraap01/remote_execution_result_20150410_225611.xml showed the details below.

[ERROR]: Either the value of the context variable s_shared_file_system in the run file system is not consistent between the file system and the database or the APPL_TOP name across nodes is not set correctly.


This made me look at the APPL_TOP name again. I reviewed the context files on both application tier nodes (primary and secondary) and found that the context variable s_atName was different on the two nodes.

For a shared application tier, the APPL_TOP name should be the same across all the application tier nodes.
1) Modified the context variable s_atName on the secondary node to match the master application node name.
2) Executed AutoConfig and bounced the services.
3) After that, adop completed successfully.
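The s_atName comparison can also be scripted, so that a mismatch is caught before running adop. The sketch below relies on context variables being stored in the context file as XML elements carrying an oa_var attribute. The XML fragments, node labels, and values are simplified stand-ins invented for illustration; a real check would read each node's full context file from disk.

```python
import xml.etree.ElementTree as ET


def context_value(context_xml, oa_var):
    """Return the text of the first element whose oa_var attribute matches."""
    root = ET.fromstring(context_xml)
    for element in root.iter():
        if element.get("oa_var") == oa_var:
            return (element.text or "").strip()
    return None


def mismatched_nodes(contexts, oa_var="s_atName"):
    """Return {node: value} if the variable differs across nodes, else {}."""
    values = {node: context_value(xml, oa_var) for node, xml in contexts.items()}
    return values if len(set(values.values())) > 1 else {}


# Hypothetical, heavily trimmed context file fragments for two nodes.
contexts = {
    "node1": '<oa_context><APPL_TOP_NAME oa_var="s_atName">primary_node'
             '</APPL_TOP_NAME></oa_context>',
    "node2": '<oa_context><APPL_TOP_NAME oa_var="s_atName">secondary_node'
             '</APPL_TOP_NAME></oa_context>',
}

print(mismatched_nodes(contexts))
```

An empty result means the value is the same on every node, which is what a shared application tier expects for s_atName.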