Sunday, April 29, 2012

Freaky Leaks

In my last post, I described problems with sending ICMP packets in Java. As the lesser of the evils, we instead spawned a new process that used the command line ping. We did this using java.lang.ProcessBuilder. But when we had to fire off hundreds of requests per second, we ran into problems.

Firstly, on a Sun box it was slow. Firing up a new process was expensive as (so I am told) SunOS must initiate the memory for ping every time.

[Apparently, you can hope to improve this by setting the sticky bit on the file (eg, with chmod 1755 /tmp/test.sh). The sticky bit is "one of the status flags on a file that tells UNIX to load a copy of the file into the page file the first time it is executed. This is done for programs that are commonly used so the bytes are available quickly" (Unix Unleashed p1270). This is quite an old book and the Linux chmod man pages say "on some older systems, the bit saves  the program’s text image on the swap device so it will load more quickly when run".]

As it turned out, our software was to run on Linux, which was much faster at repeatedly starting new processes. However, under heavy load (but non-deterministically) we started seeing java.io.IOExceptions saying Bad file descriptor and Too many open files.

Sure enough, running lsof -p PID showed lots of open pipes for our Java process. Hitting the Garbage Collection button on JConsole seemed to help but didn't solve the mystery.

We were draining the stream from the Process (as outlined in a previous blog post) using a Stream Gobbler, and we closed it after use. By trial and error, we looked at the other streams associated with this Process - the error and output streams. Although we never asked the Process for references to them, we discovered that these need closing too. Why - we have no idea.

[The reason forcing Garbage Collection helped was that the particular implementation of the abstract Process class for our environment (UNIXProcess) creates a java.io.FileInputStream that has a finalize() method that closes an associated Channel (if there is one) as well as calling the native method to close the stream.]
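The fix described above can be sketched roughly as follows. This is a minimal illustration rather than our production code: the class name ProcessStreams and the method runAndClose are made up for the example, try-with-resources needs Java 7 or later, and the ping flags in main assume a Linux-style ping.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.List;

public class ProcessStreams {

    /** Runs a command, returns its combined output, and closes all three pipes. */
    static String runAndClose(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Merge stderr into stdout so that a single reader drains both.
        builder.redirectErrorStream(true);
        Process process = builder.start();
        // try-with-resources closes the child's stdin, stderr and stdout pipes,
        // even if reading throws, instead of leaving them to the finalizer.
        try (OutputStream stdin = process.getOutputStream();
             InputStream stderr = process.getErrorStream();
             BufferedReader stdout = new BufferedReader(
                     new InputStreamReader(process.getInputStream()))) {
            StringBuilder output = new StringBuilder();
            String line;
            while ((line = stdout.readLine()) != null) {
                output.append(line).append('\n');
            }
            process.waitFor();
            return output.toString();
        } finally {
            process.destroy();
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumption: a Linux-style ping that accepts -c <count>.
        System.out.print(runAndClose(Arrays.asList("ping", "-c", "1", "127.0.0.1")));
    }
}
```

The point is only that every stream the Process owns is closed explicitly, whether or not the calling code ever read from or wrote to it.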

This is one of those conditions where increasing the memory actually makes things worse. With a bigger heap, garbage collection runs less often, so the finalizers that close these streams run later. Without the garbage collector reaping these references, the process eventually hits its limit for open files (see /proc/PID/limits).
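To watch for this kind of leak from inside the JVM, one rough option on Linux is to count the entries under /proc/self/fd. The class and method names below are made up for illustration, and the count may be off by one because listing the directory briefly opens a descriptor itself.

```java
import java.io.File;

public class OpenFileCount {

    /** Returns the number of descriptors this JVM holds, or -1 if /proc is unavailable. */
    static int openDescriptors() {
        // Linux only: each entry in /proc/self/fd is one open descriptor.
        String[] entries = new File("/proc/self/fd").list();
        return entries == null ? -1 : entries.length;
    }

    public static void main(String[] args) {
        System.out.println("open file descriptors: " + openDescriptors());
    }
}
```

Logging this number before and after a burst of process launches shows whether pipes are piling up without having to run lsof by hand.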




Ping Things

Ping seems a very simple command but it's surprisingly hard to implement in Java.

The JavaDocs do warn you that:

"A typical implementation will use ICMP ECHO REQUESTs if the privilege can be obtained, otherwise it will try to establish a TCP connection on port 7 (Echo) of the destination host."

And sure enough, java.net.InetAddress.isReachable behaves differently depending on who is calling it. Take this code:

            InetAddress inetAddress = InetAddress.getByName(host);
            int timeout = 5000;
            System.out.println(inetAddress.isReachable(timeout));


and run it a few times as root while watching the traffic between the host on which it runs (192.168.0.6) and the remote host (192.168.0.3). It looks something like this:

[root@vmwareFedoraII Java]# tcpdump -c5 -nn host 192.168.0.6 and 192.168.0.3
.

.
16:44:29.197962 IP 192.168.0.6 > 192.168.0.3: ICMP echo request, id 2772, seq 1, length 44
16:44:29.200776 IP 192.168.0.3 > 192.168.0.6: ICMP echo reply, id 2772, seq 1, length 44

.
.

(The -c5 flag for tcpdump means exit after receiving 5 packets that meet the filter criteria and the -nn flag means don't convert addresses or port numbers to names.)

This is the same as running the ping command which produces network traffic that looks like this:

[root@vmwareFedoraII Java]# tcpdump -c5 -nn host 192.168.0.6 and 192.168.0.3
.

.
16:46:53.276925 IP 192.168.0.6 > 192.168.0.3: ICMP echo request, id 5899, seq 1, length 64
16:46:53.277078 IP 192.168.0.3 > 192.168.0.6: ICMP echo reply, id 5899, seq 1, length 64
.

.
.

This is true even if the user who is pinging is not root. This is because the ping binary is owned by root and has its setuid bit set:

[henryp@vmwareFedoraII ~]$ ls -l `which ping`
-rwsr-xr-x 1 root root 42072 2009-07-26 13:22 /bin/ping


However, running the Java code as a non-root user produces network traffic that looks like this:

[root@vmwareFedoraII Java]# tcpdump -c5 -nn host 192.168.0.6 and 192.168.0.3
.

.
16:39:48.290445 IP 192.168.0.6.56469 > 192.168.0.3.7: Flags [S], seq 890450676, win 5840, options [mss 1460,sackOK,TS val 1556799 ecr 0,nop,wscale 6], length 0
16:39:48.290660 IP 192.168.0.3.7 > 192.168.0.6.56469: Flags [R.], seq 0, ack 890450677, win 0, length 0

.
.

True to the JavaDocs, this time a TCP SYN was sent to port 7 on the remote host. The reply is flagged [R.], which is a TCP reset: nothing is actually listening on port 7 and the remote kernel refused the connection. Luckily, a refusal is still a response, so it shows that the host is up. The ping command does not need a process to be listening on any given port. Indeed, "ICMP doesn't use ports as TCP and UDP do." (Linux Network Administrator's Guide, Olaf Kirch and Terry Dawson).
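As a rough illustration of the kind of TCP probe seen in this trace (a sketch, not the JDK's actual implementation), the code below attempts a connection and treats both a completed handshake and a refused connection as signs of life. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class TcpProbe {

    /** Returns true if the host answered the SYN, either by accepting or by refusing. */
    static boolean hostResponds(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;  // handshake completed: something is listening on the port
        } catch (ConnectException e) {
            // Heuristic: this usually means "connection refused" (a reset came back),
            // so the host is up, although it can also signal an OS-level connect timeout.
            return true;
        } catch (IOException e) {
            return false; // timed out, no route to host, unknown host, and so on
        }
    }

    public static void main(String[] args) {
        // Port 7 is the Echo port mentioned in the JavaDocs.
        System.out.println(hostResponds("192.168.0.3", 7, 5000));
    }
}
```

A host behind a firewall that silently drops the SYN would show up as unreachable here, which is one reason a TCP probe is not a full substitute for ICMP.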

Why the difference? Well, ping is a very powerful command. I'm told by somebody much better at UNIX than me that ICMP packets are not buffered by the kernel and so go straight to the hardware without being throttled. This can cause havoc on a network if abused.

"Sometimes people with too much time on their hands attempt to maliciously disrupt the network access of a user by generating large numbers of ICMP messages. This is commonly called ping flooding." (ibid).

Ping flooding can be achieved with the -f flag but if you run it as a non-root user, you see:

[henryp@vmwareFedoraII ~]$ ping -f 192.168.0.3
PING 192.168.0.3 (192.168.0.3) 56(84) bytes of data.
ping: cannot flood; minimal interval, allowed for user, is 200ms


The impact of such an attack can be minimized by using iptables (try running iptables -L INPUT -n as root to see what rules you have on your Linux box). These rules can be configured to drop ICMP packets if they become too numerous.

[What are these ICMP packets anyway? From RFC 792:

"The Internet Protocol is not designed to be absolutely reliable.  The purpose of these control messages is to provide feedback about problems in the communication environment, not to make IP reliable.  There are still no guarantees that a datagram will be delivered or a control message will be returned.  Some datagrams may still be undelivered without any report of their loss.  The higher level protocols that use IP must implement their own reliability procedures if reliable communication is required. 

"The ICMP messages typically report errors in the processing of datagrams.  To avoid the infinite regress of messages about messages etc., no ICMP messages are sent about ICMP messages.  Also ICMP messages are only sent about errors in handling fragment zero of fragemented datagrams.  (Fragment zero has the fragment offeset equal zero)."]

So, if you're using Java but can't run as root, how can you ping a remote host that may not respond on port 7 at all (for example, because a firewall silently drops the packet)? Well, what first appeared to us as a cunning plan was to send a UDP packet to the remote host, not caring if something was listening on the port, since:

"One rule of UDP is that if it receives a UDP datagram and the destination port does not correspond to a port that some process has in use, UDP responds with an ICMP port unreachable." TCP-IP Illustrated, Section 6.5 (Kevin R Fall, W. Richard Stevens).

Clever, but how do you pick up the returned ICMP packet in Java?
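One partial answer, which we did not pursue, is that a connected DatagramSocket may surface the ICMP error as a java.net.PortUnreachableException. The JavaDocs explicitly say there is no guarantee that the exception will be thrown, so the following is only a sketch under that caveat, with hypothetical class and method names.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.PortUnreachableException;

public class UdpProbe {

    /** Returns true only if an ICMP port unreachable was reported for the datagram. */
    static boolean portUnreachable(String host, int port, int timeoutMillis) {
        try (DatagramSocket socket = new DatagramSocket()) {
            InetAddress address = InetAddress.getByName(host);
            // The socket must be connected for the ICMP error to be reported at all.
            socket.connect(address, port);
            socket.setSoTimeout(timeoutMillis);
            byte[] payload = {0};
            socket.send(new DatagramPacket(payload, payload.length, address, port));
            // Block until a reply, an ICMP error or the timeout, whichever comes first.
            socket.receive(new DatagramPacket(new byte[1], 1));
            return false; // a real UDP reply arrived, so the port is in use
        } catch (PortUnreachableException e) {
            return true;  // the remote host answered with ICMP port unreachable
        } catch (IOException e) {
            return false; // timeout or other failure: nothing can be concluded
        }
    }

    public static void main(String[] args) {
        // Port 33434 is an arbitrary, probably unused port chosen for illustration.
        System.out.println(portUnreachable("192.168.0.3", 33434, 2000));
    }
}
```

A true result means the host is up and the port is closed. A false result is ambiguous, since the ICMP message may have been dropped, filtered or simply not reported by the platform.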

In the end, we used java.lang.ProcessBuilder to launch the ping command. And that's where our problems began...