Delphix Java Process May Stop Responding When Endpoint Security Software Is Used (KBA9465)
KBA
KBA# 9465
Applicable Delphix Versions
Major Release: 6.0
All Sub Releases: 6.0.7.0, 6.0.8.0, 6.0.8.1, 6.0.9.0, 6.0.10.0, 6.0.10.1, 6.0.11.0, 6.0.12.0, 6.0.12.1, 6.0.13.0, 6.0.13.1, 6.0.14.0, 6.0.15.0
Issue
The Delphix client Java process (DSP) on Linux/Unix hosts has been observed to stop responding when endpoint security software blocks it. This may be caused by either deliberate endpoint security policies or bugs in the endpoint software.
Prerequisites
- Delphix version 6.0.7.0 or above.
- Linux/Unix host running endpoint security software.
Details
The Delphix Continuous Data platform uses client-side Java processes to perform various operations (such as SnapSync, provisioning, and monitoring) on source and target hosts. Endpoint security software may stop these Java processes from operating as expected by preventing the dynamic loading of libraries or by blocking network port access.
This behaviour has specifically been observed with "Cortex XDR Endpoint Protection by PaloAlto" and "Dynatrace". In these cases the endpoint software blocks the Delphix client from loading the required Oracle JDBC libraries, and Delphix jobs (provision, delete, refresh) become unresponsive.
Example identifying the issue
This example shows Cortex XDR blocking the loading of Oracle JDBC libraries.
In the Delphix GUI, the issue appears as a job (delete, provision, etc.) that does not progress. This is typically observed very early in the job lifecycle.
To identify whether Cortex is running on the source/target host, ps can be used to check for the following processes:
ps -ef | grep traps
root 374887 1298 0 22:26 ? 00:00:00 /opt/traps/bin/pmd
root 374901 374887 0 22:26 ? 00:00:00 /opt/traps/bin/dypd -- 15
root 1234 54321 0 22:26 ? 00:00:00 /opt/traps/analyzerd/analyzerd 281 283 285
Any of these processes confirms that Cortex endpoint security software is installed on the host.
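The process check above can be scripted. The following is a minimal sketch, assuming the Cortex binaries are installed under /opt/traps/ as in the example output. The function name find_traps_processes is hypothetical and is not part of any Delphix or Cortex tooling. It filters ps -ef style output read from standard input, so it can also be run against saved output.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of `ps -ef` style input whose command
# path is under /opt/traps/ (the install path seen in the example above).
# Reads stdin; exit status is 0 if at least one line matched, 1 otherwise.
find_traps_processes() {
    grep -E '(^|[[:space:]])/opt/traps/'
}

# Usage on a live host:
#   ps -ef | find_traps_processes && echo "Cortex processes found"
```

Matching on the install path rather than on the word "traps" keeps the grep command itself out of the results.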
The presence of the software does not guarantee issues will be encountered. To confirm whether the Delphix client process is blocked, tools such as gdb, pstack, and ldd can be used.
- Find the Delphix client process:
[root@host ~]# ps -ef | grep java | grep Del
oracle 1549745 1 99 23:08 ? 00:00:06 /work/Delphix_COMMON_531509320050_0e3b14394b1f_3_host/java/jdk/bin/java -ea -XX:-UseVMInterruptibleIO -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/work -Dio.netty.allocator.maxOrder=7 -Djdk.tls.acknowledgeCloseNotify=true -Dcom.ibm.tools.attach.enable=no -javaagent:/work/Delphix_COMMON_531509320050_0e3b14394b1f_3_host/client/dsp/libs/com.delphix.common/agent-1.0.0.jar -Ddelphix.host.os=unix -Ddelphix.toolkit.base.dir=/work -Ddelphix.max.worker=16 -Djava.io.tmpdir=/work/Delphix_531509320050_0e3b14394b1f_3_host/tmp -jar /work/Delphix_COMMON_531509320050_0e3b14394b1f_3_host/client/dsp/client.jar
- Collect a pstack of the process:
pstack 1549745 > /tmp/pstack.1549745
If a Delphix thread is waiting for Cortex software, one or more threads will have a stack trace such as the following:
Thread 50 (Thread 0x7f7ce2098700 (LWP 28975)):
#0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1 0x00007f7ddfe675f3 in _L_lock_892 () from /lib64/libpthread.so.0
#2 0x00007f7ddfe674d7 in __pthread_mutex_lock (mutex=0x7f7de029c930) at pthread_mutex_lock.c:82
#3 0x00007f7ddf7d490b in __dl_iterate_phdr (callback=0x7f7ddefc3080, data=0x7f7ce2093b60) at dl-iteratephdr.c:42
Of particular interest here are the threads stuck in __lll_lock_wait.
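On a process with many threads, scanning the dump by eye is slow. The following is a minimal sketch for counting affected threads, assuming the dump has one stack frame per line with the frame number (#0, #1, ...) as the first field, as pstack and gdb normally print it. The function name count_lock_wait_threads is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count the threads in a pstack/gdb-style dump whose
# innermost frame (#0) is __lll_lock_wait. Takes the dump file path as $1.
count_lock_wait_threads() {
    awk '$1 == "#0" && /__lll_lock_wait/ { n++ } END { print n + 0 }' "$1"
}

# Usage, with the file collected above:
#   count_lock_wait_threads /tmp/pstack.1549745
```

A non-zero count is only an indicator. Threads can wait briefly on a lock during normal operation, so it may help to collect two dumps some seconds apart and check whether the same threads are still waiting.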
- Use lsof to check for Cortex libraries loaded against the Java process:
lsof -p 1549745
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 1550146 oracle cwd DIR 8,4 129 1013737 /home/oracle
java 1550146 oracle rtd DIR 8,4 247 128 /
java 1550146 oracle txt REG 8,4 8640 205176828 /work/Delphix_COMMON_531509320050_0e3b14394b1f_3_host/java/jdk/bin/java
java 1550146 oracle mem REG 8,4 51656 2509 ....
java 1550146 oracle mem REG 8,4 34208 1940 /opt/traps/lib/libmodule64.so
This shows that the Cortex libraries have been loaded against the Delphix client software.
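The lsof output for a Java process can run to hundreds of lines. The following is a minimal sketch that keeps only the Cortex entries, assuming the libraries live under /opt/traps/ as in the example and that the file paths contain no spaces. The function name list_traps_libraries is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: from `lsof -p <pid>` style input on stdin, print the
# NAME column (last field) of every entry located under /opt/traps/.
list_traps_libraries() {
    awk '$NF ~ /^\/opt\/traps\// { print $NF }'
}

# Usage on a live host, with the PID of the Delphix client process:
#   lsof -p 1549745 | list_traps_libraries
```

Empty output means no file under /opt/traps/ is open or mapped in the process. This does not by itself rule out interference from other endpoint security products.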
Solution
The issue is caused by third-party security software blocking the Delphix client software from operating correctly. To resolve it:
- Configure the security software to exempt Delphix/Java processes from the endpoint security policy.
- Work with the third-party software vendor to verify that the installed endpoint software is working as expected.