Issue:
When starting Fusion or ZooKeeper, the following error about the Log4j shutdown hook may appear:
$ bin/zookeeper start
Starting zookeeper.....
Successfully started zookeeper on port 9983 (process ID 24448)
$ bin/zookeeper status
zookeeper is running on port 9983 (process ID 24448)
2017-11-17 16:41:21,562 agent service-monitor ERROR Unable to register shutdown hook because JVM is shutting down. java.lang.IllegalStateException: Cannot add new shutdown hook as this is not started. Current state: STOPPED
at org.apache.logging.log4j.core.util.DefaultShutdownCallbackRegistry.addShutdownCallback(DefaultShutdownCallbackRegistry.java:113)
at org.apache.logging.log4j.core.impl.Log4jContextFactory.addShutdownCallback(Log4jContextFactory.java:273)
at org.apache.logging.log4j.core.LoggerContext.setUpShutdownHook(LoggerContext.java:256)
at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:216)
at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:145)
at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:41)
at org.apache.logging.log4j.LogManager.getContext(LogManager.java:182)
at org.apache.logging.log4j.spi.AbstractLoggerAdapter.getContext(AbstractLoggerAdapter.java:103)
at org.apache.logging.slf4j.Log4jLoggerFactory.getContext(Log4jLoggerFactory.java:43)
at org.apache.logging.log4j.spi.AbstractLoggerAdapter.getLogger(AbstractLoggerAdapter.java:42)
at org.apache.logging.slf4j.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:29)
at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:358)
at org.zeroturnaround.exec.stream.slf4j.Slf4jStream.ofCaller(Slf4jStream.java:85)
at org.zeroturnaround.process.UnixProcess.isAlive(UnixProcess.java:26)
at com.lucidworks.apollo.supervisor.ProcessLivenessDetector.getState(ProcessLivenessDetector.java:46)
at com.lucidworks.apollo.supervisor.LocalRunningService$MonitorThread.run(LocalRunningService.java:791)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
What is a shutdown hook?
A Java shutdown hook is a construct that lets developers register code to run when the Java Virtual Machine (JVM) is shutting down.
In the Log4j library, the shutdown hook ensures proper cleanup and resource release for the logging framework when your Java application shuts down. Disabling the shutdown hook prevents Log4j from performing its usual cleanup when the application exits. This avoids the error above, which is raised when Log4j tries to register its hook while the JVM is already shutting down.
Environment:
Fusion 4
Resolution:
ZooKeeper may report a successful start, but the error appears as soon as you check its status.
To resolve this issue, edit the agent-log4j2.xml file on each server, then restart the Agent process on each of them.
- First, check whether ZooKeeper is actually running by sending it the 'ruok' command. A healthy server replies 'imok'.
  echo ruok | nc localhost 9983
- Open the agent-log4j2.xml file, located in the $FUSION_HOME/conf directory, and add shutdownHook="disable" to the Configuration element (it should be the second line in the file):
  <Configuration status="WARN" monitorInterval="120" shutdownHook="disable">
- Restart the agent:
  bin/fusion restart agent
- After making this change on all of the servers, restart ZooKeeper. If the error persists, check whether a stale AgentMain process is still running. If such a process exists, remove it.
- To find and remove the AgentMain process, follow these steps:
  - Run 'jps' to identify the AgentMain Java process ID. Alternatively, use 'ps -ef | grep AgentMain' to locate the process.
  - Kill the process:
    kill -9 <pid>
    Example: if 'jps' lists AgentMain with process ID 57623, the command is 'kill -9 57623'.
- After removing the AgentMain process, attempt to start ZooKeeper or Fusion again. It should start without the previous error.
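The configuration edit and the AgentMain lookup described in the steps above can be sketched as two small shell helpers. This is a minimal sketch and not part of Fusion: the function names disable_shutdown_hook and agent_pids are hypothetical, and the edit assumes the opening Configuration tag sits on a single line, as in the example above. Review the result on one server before applying it everywhere.

```shell
# Sketch only: these helper names are hypothetical and not part of Fusion.

# Add shutdownHook="disable" to the opening <Configuration ...> tag of a
# Log4j 2 config file. Does nothing if the attribute is already present.
# Assumes the opening tag is on a single line.
disable_shutdown_hook() {
    config_file=$1

    # Skip the edit when the attribute already exists.
    if grep -q '<Configuration[^>]*shutdownHook=' "$config_file"; then
        return 0
    fi

    # Keep a backup copy next to the original before changing anything.
    cp -p "$config_file" "$config_file.bak" || return 1

    tmp_file=$(mktemp) || return 1
    # Insert the attribute just before the closing '>' of the opening tag.
    if sed 's/\(<Configuration[^>]*\)>/\1 shutdownHook="disable">/' \
        "$config_file" > "$tmp_file"; then
        # Write back in place so ownership and permissions are preserved.
        cat "$tmp_file" > "$config_file"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp_file"
    return "$status"
}

# Read `jps` output on stdin and print the PID of each AgentMain process.
agent_pids() {
    awk '$2 == "AgentMain" { print $1 }'
}
```

With these defined, running disable_shutdown_hook against the path to agent-log4j2.xml applies the edit and leaves a .bak copy beside the file, and 'jps | agent_pids' prints only the AgentMain process IDs. The kill step is deliberately left manual so that each process ID can be checked first.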
NOTE: There is an internal JIRA, APOLLO-11615, opened to address this issue.