Troubleshooting Weblogic Authentication with the Truss Command

Recently, we faced a situation where a Weblogic admin server was failing to start but was leaving very little in the logs to explain why, so we had to dig a bit deeper to find the cause.

This is the error message as it appeared in the weblogic log:

<Jun 22, 2010 11:53:43 PM CDT> <Notice> <Security> <BEA-090082> <Security initializing using security realm myrealm.>
<Jun 22, 2010 11:53:43 PM CDT> <Critical> <Security> <BEA-090403> <Authentication for user recycler denied>
<Jun 22, 2010 11:53:43 PM CDT> <Critical> <WebLogicServer> <BEA-000386> <Server subsystem failed. Reason: weblogic.security.SecurityInitializationException: Authentication for user recycler denied
weblogic.security.SecurityInitializationException: Authentication for user recycler denied
at weblogic.security.service.CommonSecurityServiceManagerDelegateImpl.doBootAuthorization(Unknown Source)
at weblogic.security.service.CommonSecurityServiceManagerDelegateImpl.initialize(Unknown Source)
at weblogic.security.service.SecurityServiceManager.initialize(Unknown Source)
at weblogic.security.SecurityService.start(SecurityService.java:141)
at weblogic.t3.srvr.SubsystemRequest.run(SubsystemRequest.java:64)
Truncated. see log file for complete stacktrace
>
<Jun 22, 2010 11:53:44 PM CDT> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to FAILED>
<Jun 22, 2010 11:53:44 PM CDT> <Error> <WebLogicServer> <BEA-000383> <A critical service failed. The server will shut itself down>
<Jun 22, 2010 11:53:44 PM CDT> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to FORCE_SHUTTING_DOWN>

So the admin server was dying in under 30 seconds, because the user it starts as (recycler) could not be authenticated.

So we validated that the user existed, was correctly configured in the admin server, had the proper permissions, etc.  We couldn’t find anything wrong with the user or any reason why it wasn’t being authenticated.  To make matters even more confusing, the failure only happened when the start command was run locally on the host; when we ran the same start command remotely, via SSH, the server started without issue…

We needed more information than the logs were providing, so we busted out the Solaris ‘truss’ command to see the system calls and library calls being made by the weblogic process.  The problem became apparent very quickly.  Here is a sample of the truss output:

/2:     lwp_cond_signal(0x02259130)                     = 0
/46:    lwp_cond_wait(0x02259130, 0x02259118, 0x00000000, 0) = 0
/2:     write(1, " < J u n   2 3 ,   2 0 1".., 117)     = 117
/2:     write(8, " # # # # < J u n   2 3 ,".., 188)     = 188
/2:     stat64("/apps/web/weblogic/ratesvc/WLS10SP3/admin/weblogic/ldap/LDAPExecuteRequest.class", 0xFE4FDC30) Err#2 ENOENT
/2:     llseek(258, 3041622, SEEK_SET)                  = 3041622
/2:     read(258, " P K030414\0\0\0\b\01C84".., 30)     = 30

Err#2 ENOENT means “no such file or directory”, and we verified that this directory did not exist.  So we had found our problem, but we had already double-checked all of the configs and everything was correct.  Why was it looking in a nonexistent directory?
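
When the truss capture runs to thousands of lines, it can help to pull out just the failed lookups. The following is a minimal sketch, not what we actually ran at the time: the function name enoent_paths and the file name truss.out are made up for illustration, and it assumes the capture was saved as plain text with ordinary double quotes around the path arguments.

```shell
#!/bin/sh
# Hypothetical helper: list each distinct path whose system call failed with
# ENOENT in a saved truss capture. Assumes lines shaped like:
#   /2:     stat64("/some/path", 0xFE4FDC30) Err#2 ENOENT
enoent_paths() {
    # Keep only the ENOENT lines, extract the first double-quoted argument,
    # then de-duplicate the resulting paths.
    grep 'Err#2 ENOENT' "$1" |
        sed -n 's/^[^"]*"\([^"]*\)".*$/\1/p' |
        sort -u
}

# Example usage (truss.out is an assumed file name, <pid> a placeholder):
#   truss -f -o truss.out -p <pid>
#   enoent_paths truss.out
```

Not every ENOENT is a problem, since a process may probe several locations before finding a file, so the list is a starting point for questions rather than a verdict.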

The final clue was that the server started without issue when started remotely, via SSH, but broke when started locally on the host.  The difference?  When a command is run remotely via SSH, the shell is not a login shell, so the user’s profile is not sourced.  So we looked at the user’s profile and found the problem.  Multiple Weblogic domains were running under this one user, which is typically not an issue, but in this case someone had placed variables that were specific to one domain in the shared profile.  We confirmed the confused environment by running ‘pargs -e’ on the PID: many of the variables were wrong for that process, because they were being passed to every domain we tried to start under that user.  It’s a bad practice, and this is one reason why…
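
To see exactly which variables differ between the two ways of starting the server, the two environments can be dumped to files and compared. This is a rough sketch under stated assumptions: the function names are hypothetical, and the ‘pargs -e’ output is assumed to prefix each entry with "envp[N]: " after a header line that begins with the PID.

```shell
#!/bin/sh
# Hypothetical helper: reduce an environment dump to sorted NAME=value lines.
# Accepts plain `env` output, or `pargs -e` output (assumed to look like
# "envp[0]: NAME=value" after a header line starting with the PID).
normalize_env() {
    sed -e 's/^envp\[[0-9]*\]: //' "$1" |
        grep '^[A-Za-z_][A-Za-z0-9_]*=' |
        sort
}

# Show entries that differ between two dumps. Lines only in the first dump
# are printed flush left; lines only in the second are indented with a tab.
env_diff() {
    a=$(mktemp) && b=$(mktemp) || return 1
    normalize_env "$1" > "$a"
    normalize_env "$2" > "$b"
    comm -3 "$a" "$b"
    rm -f "$a" "$b"
}

# Example usage (file names and <pid> are placeholders):
#   env > login.env              # from a local login session
#   pargs -e <pid> > proc.env    # from the misbehaving process
#   env_diff login.env proc.env
```

Any variable that shows up on only one side, or with two different values, is a candidate for the kind of cross-domain leakage described above.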

So we moved the domain-specific variables from the user’s profile to a properties file that is sourced by the start script of the domain, and now only the process that needs them gets them.
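
The idea can be sketched roughly as follows. This is not our actual start script: the function name start_domain, the file name domain.properties and the variable DOMAIN_NAME are all placeholders, and the properties file is assumed to contain plain shell-style NAME=value lines.

```shell
#!/bin/sh
# Hypothetical start wrapper: load one domain's settings from its own
# properties file instead of from the shared user profile.
# Usage: start_domain /absolute/path/to/domain.properties command [args...]
start_domain() (
    # The function body runs in a subshell, so nothing set here reaches the
    # calling shell or any other domain started from it.
    props=$1
    shift
    set -a               # export every variable the properties file defines
    . "$props" || exit 1
    set +a
    exec "$@"            # run the real start command with that environment
)

# Example (paths are assumptions):
#   start_domain /path/to/domainA/domain.properties ./startWebLogic.sh
```

Because the variables only exist inside the subshell that launches the domain, a second domain started from the same login session never sees them.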

So when you’re knee-deep in a problem with very little information to go on, don’t forget the “trussty” truss command (sorry…)
