Summary
The Synchrony service does not restart and logs the following error:
2022-09-28 12:45:57,805 ERROR [Caesium-1-3] [troubleshooting.healthcheck.concurrent.SupportHealthCheckProcess] lambda$getCompletedStatuses$0 Health check 'Collaborative Editing Mode' failed with severity 'major': 'Exception during health check invocation java.lang.IllegalStateException: Base URL misconfigured: <null>'
Environment
Diagnosis
The Synchrony service does not start and logs the following error:
2022-09-28 12:45:57,805 ERROR [Caesium-1-3] [troubleshooting.healthcheck.concurrent.SupportHealthCheckProcess] lambda$getCompletedStatuses$0 Health check 'Collaborative Editing Mode' failed with severity 'major': 'Exception during health check invocation java.lang.IllegalStateException: Base URL misconfigured: <null>'
In addition, the application.xml file shows the following value for the application base URL:
<server.base.url>N/A</server.base.url>
Cause
The Confluence Synchrony service depends on the base URL. If the Confluence base URL is missing or not configured correctly on the General Configuration page, this issue can occur.
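To confirm this cause quickly, the server.base.url property can be read out of application.xml and checked for a usable value. The following is a minimal sketch: the function names are made up for illustration, and it assumes the property sits on a single line as shown in the Diagnosis section.

```shell
# Print the value of <server.base.url> from an application.xml read on stdin.
# Assumes the element is on one line, as in the excerpt above.
server_base_url() {
  sed -n 's:.*<server\.base\.url>\(.*\)</server\.base\.url>.*:\1:p' | head -n 1
}

# Succeed only if the value looks like an http(s) URL.
# Values such as "N/A" or an empty string are rejected.
base_url_looks_valid() {
  case "$1" in
    http://?*|https://?*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, base_url_looks_valid "$(server_base_url < application.xml)" fails for the N/A value shown above.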
Solution
According to the logs, the Confluence base URL is not configured correctly. To fix this, follow the steps below:
- Go to the Confluence General Configuration page and set the correct base URL.
- If the base URL is already configured, save the same value again.
- Once the base URL is updated, restart the Synchrony service from the Collaborative editing page.
- If the above steps don’t fix the issue, retrieve the base URL stored in the database (bandana table) with the following query:
select bandanavalue from bandana where bandanakey = 'atlassian.confluence.settings' ;
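The query returns the whole settings value, so the base URL still has to be picked out of it. A rough sketch of that step follows. It assumes the stored value is XML containing a baseUrl element on a single line, which should be verified against your own query output. The function name is hypothetical.

```shell
# Print the contents of the first <baseUrl> element found on stdin.
# Assumption: the bandana settings value is XML with a <baseUrl> element.
bandana_base_url() {
  sed -n 's:.*<baseUrl>\(.*\)</baseUrl>.*:\1:p' | head -n 1
}
```

Save the query result to a file and run bandana_base_url < result.txt. If nothing is printed, either no base URL is stored or the assumption about the format does not hold.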
The problem
When I try to start the Confluence I always get this exception:
30-Aug-2017 08:44:03.053 SEVERE [main] org.apache.catalina.core.StandardServer.await StandardServer.await: create[localhost:8091]:
java.net.BindException: Address already in use (Bind failed)
Here are the log and the server.xml:
- catalina.log
- server.xml
My confluence version: confluence-6.3.1
What have I noticed so far
When I start the confluence it spawns a process (at 08:41 AM)
conflue+ 5430 264 19.6 4935920 1606444 pts/0 Sl 08:41 7:24 /opt/atlassian/confluence/jre//bin/java -Djava.util.logging.config.file=/opt/atlassian/confluence/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Djdk.tls.ephemeralDHKeySize=2048 -Djava.protocol.handler.pkgs=org.apache.catalina.webresources -Dconfluence.context.path= -Datlassian.plugins.startup.options= -Dorg.apache.tomcat.websocket.DEFAULT_BUFFER_SIZE=32768 -Dsynchrony.enable.xhr.fallback=true -Xms1024m -Xmx1024m -XX:+UseG1GC -Datlassian.plugins.enable.wait=300 -Djava.awt.headless=true -XX:G1ReservePercent=20 -Xloggc:/opt/atlassian/confluence/logs/gc-2017-08-30_08-41-24.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=2M -XX:-PrintGCDetails -XX:+PrintGCDateStamps -XX:-PrintTenuringDistribution -Djava.endorsed.dirs=/opt/atlassian/confluence/endorsed -classpath /opt/atlassian/confluence/bin/bootstrap.jar:/opt/atlassian/confluence/bin/tomcat-juli.jar -Dcatalina.base=/opt/atlassian/confluence -Dcatalina.home=/opt/atlassian/confluence -Djava.io.tmpdir=/opt/atlassian/confluence/temp org.apache.catalina.startup.Bootstrap start
Which uses the 8090 port
netstat -nap |grep :::80
tcp6 0 0 :::8090 :::* LISTEN 5430/java
Then it spawns another process (at 08:43 AM)
conflue+ 5430 264 19.6 4935920 1606444 pts/0 Sl 08:41 7:24 /opt/atlassian/confluence/jre//bin/java -Djava.util.logging.config.file=/opt/atlassian/confluence/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Djdk.tls.ephemeralDHKeySize=2048 -Djava.protocol.handler.pkgs=org.apache.catalina.webresources -Dconfluence.context.path= -Datlassian.plugins.startup.options= -Dorg.apache.tomcat.websocket.DEFAULT_BUFFER_SIZE=32768 -Dsynchrony.enable.xhr.fallback=true -Xms1024m -Xmx1024m -XX:+UseG1GC -Datlassian.plugins.enable.wait=300 -Djava.awt.headless=true -XX:G1ReservePercent=20 -Xloggc:/opt/atlassian/confluence/logs/gc-2017-08-30_08-41-24.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=2M -XX:-PrintGCDetails -XX:+PrintGCDateStamps -XX:-PrintTenuringDistribution -Djava.endorsed.dirs=/opt/atlassian/confluence/endorsed -classpath /opt/atlassian/confluence/bin/bootstrap.jar:/opt/atlassian/confluence/bin/tomcat-juli.jar -Dcatalina.base=/opt/atlassian/confluence -Dcatalina.home=/opt/atlassian/confluence -Djava.io.tmpdir=/opt/atlassian/confluence/temp org.apache.catalina.startup.Bootstrap start
conflue+ 5756 82.4 8.0 4665924 658816 pts/0 Sl 08:43 0:40 /opt/atlassian/confluence/jre/bin/java -classpath /opt/atlassian/confluence/temp/1.0.0-release-confluence_6.1-a1ab321e.jar:/opt/atlassian/confluence/confluence/WEB-INF/lib/postgresql-42.1.1.jar -Xss2048k -Xmx1g synchrony.core sql
Which uses the 8091 port
tcp6 0 0 :::8090 :::* LISTEN 5430/java
tcp6 0 0 :::8091 :::* LISTEN 5756/java
And a few moments later it fails to start. If I kill the synchrony process (the second one) the confluence starts up correctly but I can’t edit the documents because I get a 502 error on the synchrony-proxy/heartbeat url.
So what should I do? Can I put the synchrony to another port? Or the synchrony should start faster and free up the port? What is the expected behaviour?
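To see which process holds a given port at the moment the bind fails, the netstat output above can be filtered by port. This is only a sketch. It assumes the column layout shown in the netstat lines above, and the function name is made up.

```shell
# Read `netstat -nap` output on stdin and print the "PID/program" column
# for the socket listening on the given port.
# Assumes the layout: proto recv-q send-q local foreign state pid/program
listener_on_port() {
  awk -v port="$1" '$6 == "LISTEN" && $4 ~ (":" port "$") { print $7 }'
}
```

Usage: netstat -nap | listener_on_port 8091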
Problem
So, I have an issue quite often when restarting confluence on several servers (one running Ubuntu server 16.10 and the other running Amazon Linux): synchrony either doesn’t start or doesn’t die (when stopping the previous confluence instance). Things look fine until you go create or edit a page… you just get an eternally spinning progress circle.
Checking General Configuration → Collaborative editing causes continual error boxes to be displayed (when I restart my instance and it happens again I’ll put a picture here).
Solution
Well, I haven’t actually found a solution or workaround that I’m confident in yet, but I can usually get things up and running correctly by one of the following.
Restarting confluence
Try restarting confluence by:
sudo service confluence restart
Stopping confluence, running stop-synchrony.sh, and waiting a bit
On a few occasions restarting confluence didn’t work. Next I would try:
sudo service confluence stop
sudo /opt/atlassian/confluence/bin/stop-confluence.sh
sudo /opt/atlassian/confluence/bin/synchrony/stop-synchrony.sh
wait 30 seconds, then
sudo service confluence start
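Instead of a fixed 30 seconds, the wait could poll until nothing is listening on the synchrony port any more. Here’s a rough sketch of that idea. The helper names are mine, and port 8091 is assumed to be the synchrony port on your box.

```shell
# Poll once per second until the given check command fails, i.e. until the
# thing it checks for is gone. Gives up after $1 seconds and returns 1.
wait_until_stopped() {
  local timeout=$1; shift
  local waited=0
  while "$@"; do
    [ "$waited" -ge "$timeout" ] && return 1
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Hypothetical check: succeeds while something still listens on port 8091.
synchrony_port_in_use() {
  netstat -nlt 2>/dev/null | grep -q ':8091 '
}
```

After the stop commands, something like wait_until_stopped 60 synchrony_port_in_use && sudo service confluence start would then replace the manual wait.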
References
- https://jira.atlassian.com/browse/CONFSERVER-43741
Problem
Synchrony process won’t start either with Confluence or manually.
The following error appears in the atlassian-confluence.log
ERROR [Long running task: Restart Synchrony Task] [plugins.synchrony.tasks.AbstractConfigLongRunningTask] runInternal An error occurred when running a Synchrony ConfigLongRunningTask
-- url: /rest/synchrony-interop/restart | referer: http://mycompany.com/admin/confluence-collaborative-editor-plugin/configure.action | traceId: 4e06227767fccf2b | userName: XXXXX
java.lang.NumberFormatException: For input string: ""
Cause
Synchrony PID file has been corrupted.
Resolution
The PID file is created when starting Confluence, in case it doesn’t already exist. The steps below will fix the issue:
- Stop Confluence
- Remove the file <confluence-install>/temp/synchrony.pid
- Start Confluence
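Before removing the file, it can help to confirm that it really is corrupted. The NumberFormatException for an empty input string suggests the file exists but holds no number. The sketch below shows one way to check. The function names are illustrative, and the path to synchrony.pid is passed in by the caller.

```shell
# Succeed if the file exists and contains a line consisting only of digits.
pid_file_is_valid() {
  [ -f "$1" ] && grep -Eqx '[0-9]+' "$1"
}

# Remove the PID file only when it exists and fails the check above.
# Run this while Confluence is stopped.
remove_corrupt_pid_file() {
  if [ -f "$1" ] && ! pid_file_is_valid "$1"; then
    rm -f -- "$1"
  fi
}
```

Call remove_corrupt_pid_file with the full path to synchrony.pid after stopping Confluence, then start Confluence again.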
Last modified on Jun 14, 2019
Notes on the RCA of a Confluence failure
- Confluence failure RCA (Root Cause Analysis)
- Problem
- Root Cause
- Fault trigger reason
- Verify the service process corresponding to port 8091
- About Synchrony
- Verify the start method of the Synchrony service process
- Start the Synchrony service and observe whether Confluence is connected
- Q&A
Confluence failure RCA (Root Cause Analysis)
This article records a Confluence troubleshooting case that I handled for a customer last year. I also hope to use this opportunity to exchange experience with others who use Confluence.
Sensitive details in this article have been anonymized.
Problem
The user’s monitoring system raised an alarm that Confluence could not be accessed and the home page would not open. During the investigation, our engineers restarted Confluence (stop-confluence.sh/start-confluence.sh) and found that the process and port did not return to normal.
Root Cause
Fault trigger reason
Check the log file catalina.out. At 07-Jul-2020 15:10:49.691 an out-of-memory (OOM) error appears, and the WebSocket Connection Manager stops because of it. The log then shows Exception in thread "synchronyProxyFilter-74905" java.lang.OutOfMemoryError: Java heap space, along with many other exceptions. From this we determined that the component providing the WebSocket service on port 8091 had run out of memory.
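A search along the following lines can locate the OOM entries in the log. It is a minimal sketch, and the function name is made up for illustration.

```shell
# Print every line of the given log file that mentions
# java.lang.OutOfMemoryError, prefixed with its line number.
oom_lines() {
  grep -n 'java\.lang\.OutOfMemoryError' "$1"
}
```

For example, oom_lines catalina.out | head shows the earliest occurrences, which usually point to the thread that failed first.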
Verify the service process corresponding to port 8091
Based on these results, the service has to be restored by getting the WebSocket on port 8091 working again. To do that, we first need to verify which application provides the service on this port.
The command netstat -tunlp | grep 8091 gives the PID of the process bound to the port. The results are as follows:
With that PID, the command ps -ef | grep java gives us the following four pieces of information:
1) The process of PID 17511 is Synchrony
2) The PPID (parent process) of PID 17511 is 16748
3) The process of PPID 16748 is the main Confluence program
4) The JVM of the Synchrony process is configured with a maximum memory of 1GB, and the JVM of the main process of Confluence is 8GB.
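The heap limit in point 4 comes from the -Xmx option on each process’s command line. A small helper such as the one below can extract it. This is a sketch, and the function name is hypothetical.

```shell
# Print the value of the last -Xmx option on a java command line.
# The command line is passed as a single string argument.
max_heap_of() {
  printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^-Xmx//p' | tail -n 1
}
```

For a running process, the command line can be supplied with max_heap_of "$(ps -o args= -p 17511)".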
About Synchrony
Synchrony is the engine that provides collaborative editing to Confluence. It is a service that allows real-time synchronization of arbitrary data models.
1) How Synchrony works
- Confluence uses appId and appSecret to communicate with the Synchrony service.
- JSON Web Token (JWT) provides connection details to the client.
- When the Confluence editor is initialized, Synchrony Javascript will be loaded into the browser.
- Synchrony opens a WebSocket session through the JWT and the contentId of the page being edited.
- WebSocket connections allow multiple clients to stay in sync.
Therefore, the content data of the page is stored on the Synchrony service, which serves as the source of truth for the page content.
Verify the start method of the Synchrony service process
To make Synchrony work again, the corresponding startup script must be found.
Running the command grep -R Synchrony ./* in /data/atlassian/confluence/bin locates the corresponding startup files.
This established several points:
1) Synchrony is started by ./synchrony/start-synchrony.sh and closed by ./synchrony/stop-synchrony.sh.
2) No Synchrony startup script was found in start-confluence.sh.
3) No Synchrony startup script was found in catalina.sh.
4) There is a startup script associated with /synchrony/start-synchrony.sh in /etc/init.d/.
Start the Synchrony service and observe whether Confluence is connected
1) Before restarting Synchrony, the log shows a large number of failed connections:
2) After restarting Synchrony, the log shows a large number of successful connections:
3) Failure analysis
In summary, Synchrony’s OOM caused the access problems in the Confluence front end.
At the same time, because Synchrony’s startup script is separate from Confluence’s startup script, the first 30 minutes of debugging focused on the Confluence startup problem and overlooked the root cause, which was the WebSocket on port 8091.
Q&A
Q: Why does a prompt page appear on the front end during debugging: BootstrapException: Unable to bootstrap application: failed to find config at: /data/atlassian/application-data/confluence/confluence.cfg.xml?
A: This problem was caused by using the wrong startup script to start Confluence during debugging, which left confluence.cfg.xml owned by root. When Confluence was then restarted with the correct script, it ran as a user with lower privileges than root and could not overwrite the file.
For example, the startup script /data/atlassian/confluence/bin/startup.sh does not call the user environment file user.sh and does not set CONF_USER, which triggers the problem. This is not the root cause of the failure, however.
The correct way to start confluence is:
systemctl start confluence
or /etc/init.d/confluence start
or cd /data/atlassian/confluence/bin/ and then run start-confluence.sh
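To catch the ownership problem described in this answer before restarting, the owner of confluence.cfg.xml can be compared with the expected user. The sketch below assumes GNU coreutils stat. The function names are illustrative.

```shell
# Print the owning user name of a file (GNU coreutils stat).
owner_of() {
  stat -c '%U' -- "$1"
}

# Succeed if the file ($2) is owned by the given user ($1).
owned_by() {
  [ "$(owner_of "$2")" = "$1" ]
}
```

For example, owned_by "$CONF_USER" /data/atlassian/application-data/confluence/confluence.cfg.xml fails when the file has become owned by root.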
Q: Given the business interruption caused by this single-node failure, is high availability or a cluster deployment feasible?
A: Refer to the official documentation: https://confluence.atlassian.com/doc/set-up-a-synchrony-cluster-for-confluence-data-center-958779073.html






