Diagnostics to Collect for a Hung or Blocked Java DSP Client.jar Process (KBA10850)
Applicable Delphix Versions
- This note currently applies to all Continuous Data engine versions from 6.0 onwards
Date                          Release
Jan 25, 2024                  19.0.0.0
Dec 20, 2023 | Jan 10, 2024   18.0.0.0 | 18.0.0.1
Nov 21, 2023                  17.0.0.0
Oct 18, 2023                  16.0.0.0
Sep 21, 2023                  15.0.0.0
Aug 24, 2023                  14.0.0.0
Jul 24, 2023                  13.0.0.0
Jun 21, 2023                  12.0.0.0
May 25, 2023                  11.0.0.0
Apr 13, 2023                  10.0.0.0
Mar 13, 2023 | Mar 20, 2023   9.0.0.0 | 9.0.0.1
Feb 13, 2023                  8.0.0.0
Jan 12, 2023                  7.0.0.0
Releases Prior to 2023
Major Release   All Sub Releases
6.0             6.0.0.0, 6.0.1.0, 6.0.1.1, 6.0.2.0, 6.0.2.1, 6.0.3.0, 6.0.3.1, 6.0.4.0, 6.0.4.1, 6.0.4.2, 6.0.5.0, 6.0.6.0, 6.0.6.1, 6.0.7.0, 6.0.8.0, 6.0.8.1, 6.0.9.0, 6.0.10.0, 6.0.10.1, 6.0.11.0, 6.0.12.0, 6.0.12.1, 6.0.13.0, 6.0.13.1, 6.0.14.0, 6.0.15.0, 6.0.16.0, 6.0.17.0, 6.0.17.1, 6.0.17.2
Gathering Diagnostics for a Blocked Java DSP Client on a Source or Target Host
Support may request this information to troubleshoot a job that is not progressing because of an unresponsive, hung, or blocked DSP Java client on the host.
Prerequisites
Jobs such as Snapsync were initiated but are not progressing.
Resolution
Complete the following procedures on the target host while a Snapsync job hang is actually occurring, or at the next Snapsync hang. These steps assume that the DSP client.jar Java processes have not been killed on the host before the diagnostics are collected.
- From the target/server host:
- Gather Memory and CPU information
(hostname; date) > <hostname>.out
ulimit -a >> <hostname>.out
free -h >> <hostname>.out
top -b -n 1 >> <hostname>.out
If you are using AIX, use the following:
(hostname; date) > <hostname>.out
ulimit -a >> <hostname>.out
prtconf >> <hostname>.out
svmon >> <hostname>.out
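The Linux and AIX command sets above can be combined into one small script. The following is a minimal sketch, not Delphix tooling: the function names `run_section` and `collect_sysinfo` and the `== ... ==` section headers are made up for illustration, and the script assumes a POSIX shell. Each command is skipped with a note if it is not installed on the host.

```shell
#!/bin/sh
# Hypothetical helper: print a header, then run the command if it exists.
run_section() {
    echo "== $* =="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" || echo "($1 exited with an error)"
    else
        echo "($1 not found on this host)"
    fi
}

# Hypothetical helper: write memory and CPU diagnostics to the file in $1,
# choosing the AIX or Linux command set based on uname.
collect_sysinfo() {
    out=$1
    {
        run_section hostname
        run_section date
        run_section ulimit -a
        if [ "$(uname -s)" = "AIX" ]; then
            run_section prtconf
            run_section svmon
        else
            run_section free -h
            # Batch mode with one iteration, so top exits on its own.
            run_section top -b -n 1
        fi
    } > "$out" 2>&1
}
```

For example, `collect_sysinfo "$(hostname).out"` writes everything to a single file named after the host.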
- Collect the following ps listings
(date; hostname) > <hostname>_ps_output.out
ps -ef >> <hostname>_ps_output.out
- Obtain a jstack (and pstack if the tool exists) of all dsp client processes, assuming that none of the DSP client.jar processes have been killed.
- Get listing via ps
ps -ef | grep client.jar | grep dsp > <hostname>_delphix_dsp.out
Example of what you see when you issue ps -ef | grep client.jar | grep dsp:
delphix+ 28329 1 0 Nov13 ? 00:02:31 /delphix/toolkit/MDMAR/Delphix_COMMON_ae8f5d539287_cdcd110c3e5c_5_host/java/jdk/bin/java -ea -XX:-UseVMInterruptibleIO -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/delphix/toolkit/MDMAR -Dcom.ibm.tools.attach.enable=no -javaagent:/delphix/toolkit/MDMAR/Delphix_COMMON_ae8f5d539287_cdcd110c3e5c_5_host/client/dsp/libs/com.delphix.common/agent-1.0.0.jar -Ddelphix.host.os=unix -Ddelphix.toolkit.base.dir=/delphix/toolkit/MDMAR -Ddelphix.max.worker=16 -Djava.io.tmpdir=/delphix/toolkit/MDMAR/Delphix_ae8f5d539287_cdcd110c3e5c_5_host/tmp -jar /delphix/toolkit/MDMAR/Delphix_COMMON_ae8f5d539287_cdcd110c3e5c_host/client/dsp/client.jar
- Execute jstack on each Delphix DSP client.jar process.
To run jstack:
<path to COMMON TOOLKIT>/java/jdk/bin/jstack -l <pid> > <pid>_jstack.out
example:
/delphix/toolkit/MDMAR/Delphix_COMMON_ae8f5d539287_cdcd110c3e5c_host/java/jdk/bin/jstack -l 28329 > 28329_jstack.out
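When several DSP client processes are running, the jstack command for each one can be derived from the ps listing. The sketch below assumes that the java binary appears in the listing as a path ending in `/bin/java` and that jstack sits next to it, as in the example above. The function name `dsp_jstack_commands` is hypothetical. It only prints the commands, so they can be reviewed before anything is run.

```shell
#!/bin/sh
# Hypothetical helper: read "ps -ef" lines on stdin and print one jstack
# command per DSP client.jar process found.
dsp_jstack_commands() {
    awk '/client\.jar/ && /dsp/ {
        for (i = 1; i <= NF; i++) {
            # The field ending in /bin/java is the JVM used by this process.
            if ($i ~ /\/bin\/java$/) {
                jstack = $i
                sub(/java$/, "jstack", jstack)   # .../bin/java -> .../bin/jstack
                # $2 is the PID column of ps -ef.
                printf "%s -l %s > %s_jstack.out\n", jstack, $2, $2
                break
            }
        }
    }'
}
```

Running `ps -ef | dsp_jstack_commands` prints the commands. After checking them, the same output can be piped to `sh` to execute them.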
If you are using AIX, use the following:
ps -ef | grep client.jar | grep dsp
delphix 28442732 1 0 Apr 08 - 15:48 /home/delphix/toolkit/Delphix_COMMON_378afb0bc5a8_973cb425e665_host/java/jdk/bin/java -ea -XX:-UseVMInterruptibleIO -Xms512m -Xmx512m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/home/delphix/toolkit -Dio.netty.allocator.maxOrder=7 -Djdk.tls.acknowledgeCloseNotify=true -Dconnector -Dcom.ibm.tools.attach.enable=no -javaagent:/home/delphix/toolkit/Delphix_COMMON_378afb0bc5a8_973cb425e665_host/client/dsp/libs/com.delphix.common/agent-1.0.0.jar -Ddelphix.host.os=unix -Ddelphix.toolkit.base.dir=/home/delphix/toolkit -Ddelphix.max.worker=16 -Djava.io.tmpdir=/home/delphix/toolkit/Delphix_378afb0bc5a8_973cb425e665_7_host/tmp -jar /home/delphix/toolkit/Delphix_COMMON_378afb0bc5a8_973cb425e665_host/client/dsp/client.jar
On AIX, jstack may not exist on the host. To get equivalent output, execute the following as root or using sudo:
Send signal 3 (SIGQUIT) to the process to create the jstack-like javacore file:
root@aix101-14:/-> kill -3 28442732
Result:
root@aix101-14:/-> ls -lt /home/delphix/toolkit
-rw-r--r-- 1 delphix delphix 1236119 Apr 15 08:23 javacore.20240415.082327.28442732.0001.txt
The javacore text file is the file needed.
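If the signal has been sent more than once, several javacore files may exist for the same process. The following sketch picks the newest one. It assumes that the files follow the `javacore.<date>.<time>.<pid>.<sequence>.txt` naming seen in the listing above and that they are written to the directory passed in. The function name `latest_javacore` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified javacore file for a
# given PID. $1 = directory to search, $2 = PID of the DSP client process.
latest_javacore() {
    # ls -t sorts newest first; head keeps only the first entry.
    ls -t "$1"/javacore.*."$2".*.txt 2>/dev/null | head -n 1
}
```

For the example above, `latest_javacore /home/delphix/toolkit 28442732` would print the path of the javacore file to send to Support. Nothing is printed if no matching file exists.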
- Take the native stack of each DSP client process using pstack. This step is optional on AIX, or if pstack/gstack does not exist:
pstack <pid> > /tmp/pstack_<pid>_dsp.out
or, on some platforms:
gstack <pid> > /tmp/gstack_<pid>_dsp.out
- Take an strace of each DSP client process. Omit this if strace does not exist or does not work on the host. strace stays attached to the process until it is interrupted (for example, with Ctrl+C):
strace -p <pid> -o /tmp/strace_<pid>_dsp.out
or, with forks, if the command works:
strace -p <pid> -f -o /tmp/strace_<pid>_dsp.out
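Because pstack and gstack are not present on every platform, a script can check which one is available before taking the native stack. This is a minimal sketch covering only the pstack/gstack choice. The function names `pick_stack_tool` and `capture_native_stack` are hypothetical, and the output path follows the naming used in the commands above.

```shell
#!/bin/sh
# Hypothetical helper: print the first native-stack tool found in PATH.
# Returns non-zero if neither pstack nor gstack is installed.
pick_stack_tool() {
    for tool in pstack gstack; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    return 1
}

# Hypothetical helper: take the native stack of the process given in $1,
# or report that the step was skipped.
capture_native_stack() {
    pid=$1
    if tool=$(pick_stack_tool); then
        "$tool" "$pid" > "/tmp/${tool}_${pid}_dsp.out"
    else
        echo "neither pstack nor gstack found; skipping native stack for $pid" >&2
    fi
}
```

For example, `capture_native_stack 28329` writes `/tmp/pstack_28329_dsp.out` or `/tmp/gstack_28329_dsp.out`, depending on which tool the host provides.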
- If the non-progressing Snapsync is of an Oracle database, check whether a Bequeath connection to the database (CDB or VDB) is possible.
i) Connect to the host/node as the Delphix OS user (for example, 'delphix_os').
ii) Set the environment variables ORACLE_HOME and ORACLE_SID for the database (CDB or VDB):
export ORACLE_HOME=<Oracle home>
export ORACLE_SID=<DB_SID>
iii) Then invoke sqlplus, for example:
$ORACLE_HOME/bin/sqlplus / as sysdba
SQL*Plus: Release 19.0.0.0.0 - Production on Tue Jan 4 11:00:52 2022
Version 19.3.0.0.0
Copyright (c) 1982, 2019, Oracle. All rights reserved.
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
SQL>
- Get a list and pstack of Bequeath processes, if the target database is an Oracle database.
This is optional, but Support may ask for it. Take the pstack only for the Bequeath process(es) owned by the Delphix OS user. This section may not apply to AIX if pstack is not available; Support may provide other commands to obtain similar information.
- To get the list of Bequeath processes:
ps -ef | grep -i beq >> <hostname>_beq_ps.out
- Gather the pstack of the BEQ process for each process/pid owned by the Delphix OS user:
pstack <BEQ_pid> > /tmp/BEQ_<BEQ_pid>_pstack.out
- Take an strace of each BEQ process owned by the Delphix OS user, if strace is available:
strace -p <BEQ_pid> -o /tmp/BEQ_<BEQ_pid>_strace.out
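To restrict the pstack and strace steps to Bequeath processes owned by the Delphix OS user, the ps listing can be filtered by its first column. The sketch below is illustrative only: the function name `beq_pids_for_user` is hypothetical, and it assumes the user name appears in full in the ps output. Long user names may be truncated by ps (as with "delphix+" in the earlier example), in which case the truncated form has to be passed instead.

```shell
#!/bin/sh
# Hypothetical helper: read "ps -ef" lines on stdin and print the PIDs of
# processes owned by the user in $1 whose command line mentions "beq".
beq_pids_for_user() {
    # The pattern is written as [b]eq so that this awk process does not
    # match its own command line when it shows up in the ps listing.
    awk -v user="$1" '$1 == user && tolower($0) ~ /[b]eq/ { print $2 }'
}

# Example use (commented out; requires pstack on the host):
# for pid in $(ps -ef | beq_pids_for_user delphix_os); do
#     pstack "$pid" > "/tmp/BEQ_${pid}_pstack.out"
# done
```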
- Provide the following to Support:
- The Support Bundle
- All the output files (.out and .txt) from the previous steps
- Delphix Toolkit Connector logs from the target host, for example:
tar cvf /tmp/Connectorlogs.tar <Delphix toolkit directory path>/Delphix_662429985e19_29b9a0c46581_13_host/log/
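The .out and .txt files from the earlier steps can be packaged the same way. The following is a rough sketch under the assumption that the files were collected into one directory. Files written elsewhere (for example, under /tmp) need a second call. The function name `bundle_outputs` is hypothetical, and the tar file path must be absolute because the function changes directory before archiving.

```shell
#!/bin/sh
# Hypothetical helper: archive every .out and .txt file in a directory.
# $1 = directory holding the collected files
# $2 = absolute path of the tar file to create
bundle_outputs() {
    src=$1
    dest=$2
    (
        cd "$src" || exit 1
        # Build the file list, skipping patterns that matched nothing.
        set --
        for f in ./*.out ./*.txt; do
            if [ -e "$f" ]; then
                set -- "$@" "$f"
            fi
        done
        if [ "$#" -eq 0 ]; then
            echo "no .out or .txt files found in $src" >&2
            exit 1
        fi
        tar cf "$dest" "$@"
    )
}
```

For example, `bundle_outputs /tmp /tmp/dsp_diagnostics.tar` collects the pstack, gstack, and strace output files written under /tmp.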