7

You can control the maximum heap size in Java using the -Xmx option.

We are experiencing some weird behavior on Windows with this switch. We run some very beefy servers (think 196 GB RAM). The Windows version is Windows Server 2008 R2.

Java version is 1.6.0_18, 64-bit (obviously).

Anyway, we were having some weird bugs where processes were quitting with out-of-memory exceptions even though the process was using much less memory than specified by the -Xmx setting.

So we wrote a simple program that allocates a 1 GB byte array each time the Enter key is pressed, and initializes the byte array to random values (to prevent any memory compression, etc.).
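Roughly, the test program looks like this (a simplified sketch; the class and variable names are illustrative, not our exact code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.Scanner;

public class AllocTest {
    public static void main(String[] args) {
        List<byte[]> blocks = new ArrayList<byte[]>(); // keep every block reachable
        Random random = new Random();
        Scanner in = new Scanner(System.in);
        while (true) {
            System.out.print("Press Enter to allocate another 1 GB block...");
            in.nextLine();
            byte[] block = new byte[1024 * 1024 * 1024]; // 1 GB
            random.nextBytes(block); // random contents, so nothing can be compressed or shared
            blocks.add(block);
            System.out.println("Allocated " + blocks.size() + " GB so far.");
        }
    }
}
```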

Basically, what's happening is that if we run the program with -Xmx35000m (roughly 35 GB), we get an out-of-memory exception when we hit 25 GB of process space (measured with the Windows Task Manager). We hit this after allocating 24 GB worth of 1 GB blocks, by the way, so that checks out.

Simply specifying a larger value for the -Xmx option lets the program get to larger amounts of RAM before failing.

So, what is going on? Is -Xmx just "off"? By the way, we need to specify -Xmx55000m to get a 35 GB process space...

Any ideas on what is going on?

Is there a bug in the Windows JVM?

Is it safe to simply set the -Xmx option bigger, even though there is a disconnect between the -Xmx option and what is going on process-wise?

2
  • unless you tune NewRatio, you basically won't use the young gen with such huge objects. Try the test with significantly smaller byte[] blocks and the results will improve
    – bestsss
    Commented Mar 8, 2011 at 0:06
  • No change when the block size was set to 1 KB and 10 KB respectively, so I don't think the size of the objects is the issue
    – SvrGuy
    Commented Mar 8, 2011 at 0:44

5 Answers

8

Theory #1

When you request a 35 GB heap using -Xmx35000m, you are actually asking for the total space used for the heap to be 35 GB. But that total consists of the Tenured Object space (for objects that survive multiple GC cycles), the Eden space for newly created objects, and other spaces into which objects will be copied during garbage collection.

The issue is that some of these spaces are not, and cannot be, used for allocating new objects. So in effect, you "lose" a significant percentage of your 35 GB to overheads.

There are various -XX options that can be used to tweak the sizes of the respective spaces. You might try fiddling with them to see if they make a difference. Refer to this document for more information. (The commonly used GC tuning options are listed in section 8. The -XX:NewSize option looks promising ...)
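To see how much of the requested heap the JVM believes it can actually use, you can ask it directly (a minimal sketch, not from the question). Runtime.maxMemory() typically reports somewhat less than the -Xmx value, since (for instance) one of the survivor spaces is not counted:

```java
public class MaxHeapCheck {
    public static void main(String[] args) {
        // Run with e.g.: java -Xmx35000m MaxHeapCheck
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.printf("Max usable heap: %.2f GB%n",
                maxBytes / (1024.0 * 1024 * 1024));
    }
}
```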


Theory #2

This might be happening because you are allocating huge objects. IIRC, objects above a certain size can be allocated directly into the Tenured Object space. In your (highly artificial) benchmark, this might result in the JVM not putting stuff into the Eden space, and therefore being able to use less of the total heap space than is normal.

As an experiment, try changing your benchmark to allocate lots of small objects, and see if it manages to use more of the available space before OOME-ing.
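A minimal sketch of such a variant (illustrative only; the block size and reporting are arbitrary choices of mine):

```java
import java.util.ArrayList;
import java.util.List;

public class SmallAllocTest {
    public static void main(String[] args) {
        List<byte[]> blocks = new ArrayList<byte[]>(); // keep everything reachable
        try {
            while (true) {
                blocks.add(new byte[64 * 1024]); // 64 KB blocks instead of 1 GB
            }
        } catch (OutOfMemoryError e) {
            int count = blocks.size();
            blocks.subList(0, 16).clear(); // free ~1 MB so the println below can run
            System.out.println("OOME after ~" + (count / 16) + " MB allocated");
        }
    }
}
```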


Here are some other theories that I would discount:

  • "You are running into OS-imposed limits." I would discount this, since you said that you can get significantly greater memory utilization by increasing the -Xmx... setting.

  • "The Windows task manager is reporting bogus numbers." I would discount this because the numbers reported roughly match the 25Gb that you think your application had managed to allocate.

  • "You are losing space to other things; e.g. the permgen heap." AFAIK, the permgen heap size is controlled and accounted independently of the "normal" heaps. Other non-heap memory usage is either a constant (for the app) or dependent on the app doing specific things.

  • "You are suffering from heap fragmentation." All of the JVM garbage collectors are "copying collectors", and this family of collectors has the property that heap nodes are automatically compacted.

  • "JVM bug on Windows." Highly unlikely. There must be tens of thousands of 64bit Java on Windows installations that maximize the heap size. Someone else would have noticed ...


Finally, if you are NOT doing this because your application requires you to allocate memory in huge chunks, and hang onto it "for ever" ... there's a good chance that you are chasing shadows. A "normal" large-memory application doesn't do this kind of thing, and the JVM is tuned for normal applications ... not anomalous ones.

And if your application really does behave this way, the pragmatic solution is to just set the -Xmx... option larger, and only worry if you start running into OS-level issues.

1
  • @bestsss - apparently not; see the response to your comment on the question.
    – Stephen C
    Commented Mar 8, 2011 at 3:39
2

To get a feeling for what exactly you are measuring, you should use some different tools:

  1. the Windows Task Manager (I only know Windows XP, but I heard rumours that the Task Manager has improved since then.)
  2. procexp and vmmap from Sysinternals
  3. jconsole from the JVM (you are using the Sun/Oracle HotSpot JVM, aren't you?)

Now you should answer the following questions:

  • What does jconsole say about the used heap size? How does that differ from procexp?
  • Does the value from procexp change if you fill the byte arrays with non-zero numbers instead of keeping them at 0?
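If you want the numbers jconsole reports but from inside the process, you can query them with the standard java.lang.management API (a sketch; class name is my own):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapReport {
    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();
        // Compare these numbers with what procexp/vmmap show for the same process.
        System.out.println("Heap used/committed/max: " + heap.getUsed()
                + " / " + heap.getCommitted() + " / " + heap.getMax());
        System.out.println("Non-heap used/committed: " + nonHeap.getUsed()
                + " / " + nonHeap.getCommitted());
    }
}
```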
2
  • +1 for Java native tools, although I prefer VisualVM (part of the JDK) over jconsole.
    – user330315
    Commented Mar 7, 2011 at 23:18
  • Oh yes, I should really make myself comfortable with all of the native HotSpot tools. Commented Mar 8, 2011 at 7:03
2

Did you try turning on verbose GC output to find out why the last allocation fails? Is it because the OS fails to allocate a heap beyond 25 GB for the native JVM process, or is it because the GC is hitting some sort of limit on the maximum memory it can manage? I would recommend you also connect to the command-line process using jconsole to see what the status of the heap is just before the allocation failure. Tools like the Sysinternals Process Explorer might also give better details as to where the failure is occurring, if it is in the JVM process.
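For example, an invocation along these lines (the GC flags are standard Sun JDK 6 options; the class name and log path are placeholders):

```
java -Xmx35000m -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log AllocTest
```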

Since the process is dying at 25 GB and you have a generational collector, maybe the rest of the generations are consuming 10 GB. I would recommend you install JDK 1.6.0_24 and use jvisualvm with the VisualGC plugin to see what the GC is doing, especially factoring in the sizes of all the generations to see how the 35 GB heap is being chopped up into different regions by the GC / VM memory manager.

See this link if you are not familiar with generational GC: http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html#generation_sizing.total_heap

3
  • [GC 11539579K->11534560K(12440896K), 0.0411290 secs]
    [GC 11534560K->11534528K(12440896K), 0.0311573 secs]
    [Full GC 11534528K->11534499K(11950336K), 0.0139061 secs]
    [GC 11534499K->11534499K(12440896K), 0.0317842 secs]
    [Full GC 11534499K->11534475K(11977280K), 0.0139566 secs]
    – SvrGuy
    Commented Mar 7, 2011 at 23:22
  • The above is the verbose GC output with -Xmx18000m set. The process bails at (roughly) 11 GB of memory usage.
    – SvrGuy
    Commented Mar 7, 2011 at 23:23
  • Are you using -XX:+PrintGCDetails for your output, or just -XX:+PrintGC? Can you share the output of -XX:+PrintGCDetails?
    – ams
    Commented Mar 7, 2011 at 23:34
0

I assume this has to do with fragmentation of the heap. The free memory is probably not available as a single contiguous free area, and when you try to allocate a large block, this fails because the requested memory cannot be allocated in a single piece.

2
  • 1
    Normally I would suggest that too, but SvrGuy mentioned that he allocated the memory in 1 GB blocks. And even then, there should be a full garbage collection with heap compaction, after which he should be able to use the full heap again. Maybe tracing the garbage collector (-verbose:gc -Xloggc) helps to track down the real cause. Commented Mar 7, 2011 at 23:13
  • The reason we wrote the test program the way we did was to avoid any possibility of heap fragmentation. 1 GB blocks, and all objects are reachable, so no garbage collection is being done.
    – SvrGuy
    Commented Mar 7, 2011 at 23:17
0

The memory displayed by the Windows Task Manager is the total memory allocated to the process, which includes memory for code, stack, perm gen and heap. The memory you measure using your test program is the amount of heap the JVM makes available to running Java programs. Naturally, the total memory allocated to the JVM by Windows should be greater than what the JVM makes available to your program as heap memory.

1
  • This is true, but does not explain a 30-50% spread. The code is approx. 500 MB, not 15 GB.
    – SvrGuy
    Commented Mar 7, 2011 at 23:16
