How are Java objects laid out in memory?

Why this topic?

A database query is, at bottom, computation, and computation requires data to be deserialized into objects in memory. How those objects are laid out and how much memory they cost therefore matters a great deal, especially for memory-hungry operators such as group-by, join, and count-distinct.

Objects containing only primitive fields

Test 1 :

ClassLayout layout = ClassLayout.parseClass(ClassA.class);
System.out.println(layout.toPrintable());

Output :

woo.demo.java.ClassA object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 12 (object header) N/A
12 4 int ClassA.f1 N/A
16 4 int ClassA.f2 N/A
20 4 (loss due to the next object alignment)
Instance size: 24 bytes (estimated, the sample instance is not available)
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
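
The numbers in this output can be reproduced by hand. The following is a minimal sketch of that arithmetic, not a call into the JVM: the 12-byte header and the 8-byte alignment are constants copied from the output above, and the names InstanceSizeSketch, alignUp and instanceSize are made up for illustration. It also ignores any internal gaps that field alignment may introduce, so it only fits simple cases like ClassA.

```java
// Sketch only: reproduces the arithmetic behind the JOL output for ClassA.
// The constants are copied from that output, not queried from the JVM.
class InstanceSizeSketch {
    static final int HEADER_BYTES = 12; // object header size reported by JOL
    static final int ALIGNMENT = 8;     // instances are padded to a multiple of 8 bytes

    // Rounds size up to the next multiple of alignment.
    static int alignUp(int size, int alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    // Header plus the given field sizes, padded to the alignment.
    // Ignores internal padding between fields (an assumption of this sketch).
    static int instanceSize(int... fieldSizes) {
        int size = HEADER_BYTES;
        for (int fieldSize : fieldSizes) {
            size += fieldSize;
        }
        return alignUp(size, ALIGNMENT);
    }

    public static void main(String[] args) {
        // ClassA has two int fields of 4 bytes each: 12 + 4 + 4 = 20, padded to 24.
        System.out.println(instanceSize(4, 4)); // prints 24
    }
}
```

The 4 bytes between 20 and 24 are exactly the "loss due to the next object alignment" line in the output.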

Object containing object fields

Test 2 :

Output :

woo.demo.java.ClassB object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 12 (object header) N/A
12 4 int ClassB.f8 N/A
16 4 ClassA ClassB.fa N/A
20 4 (loss due to the next object alignment)
Instance size: 24 bytes (estimated, the sample instance is not available)
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

Conclusion 1: ClassB has an object field, but the field holds only a 4-byte reference to the ClassA object, not the object itself. The size of a ClassB instance is therefore fixed, and the JVM knows how much memory to allocate when a ClassB object is created.
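
One way to see that the field is a reference rather than an embedded object is to let two ClassB objects point at the same ClassA. The sketch below is an illustration only: the field names follow the JOL output, but the class bodies and the ClassB constructor are assumptions, since the original class definitions are not shown in this post.

```java
// Sketch only: class bodies are assumed; field names follow the JOL output.
class ReferenceFieldSketch {
    static class ClassA {
        int f1;
        int f2;
        ClassA(int f1, int f2) { this.f1 = f1; this.f2 = f2; }
    }

    static class ClassB {
        int f8;
        ClassA fa; // holds a reference, not an embedded ClassA
        ClassB(int f8, ClassA fa) { this.f8 = f8; this.fa = fa; }
    }

    // True when both ClassB objects point at the same ClassA instance.
    static boolean sharesTarget(ClassB x, ClassB y) {
        return x.fa == y.fa;
    }

    public static void main(String[] args) {
        ClassA shared = new ClassA(1, 1);
        ClassB b1 = new ClassB(8, shared);
        ClassB b2 = new ClassB(8, shared);
        b1.fa.f1 = 42;                            // mutate through b1
        System.out.println(sharesTarget(b1, b2)); // prints true
        System.out.println(b2.fa.f1);             // prints 42, visible through b2
    }
}
```

If ClassA were stored inline, each ClassB would own a separate copy and the change made through b1 would not be visible through b2.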

An array is a special kind of object: it carries a length field, and its size is not fixed by its type but depends on that length.

Array of primitive data

What about object array?

Test 3 :

ClassA[] arr = {new ClassA(1, 1), new ClassA(1, 1)};
System.out.println(GraphLayout.parseInstance(arr).toPrintable());

Output :

[Lwoo.demo.java.ClassA; object externals:
ADDRESS SIZE TYPE PATH VALUE
76bf5def0 24 [Lwoo.demo.java.ClassA; [(object), (object)]
76bf5df08 24 woo.demo.java.ClassA 0
76bf5df20 24 woo.demo.java.ClassA 1

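From Test 3, the shallow size of arr is 24 bytes. The array header (12-byte object header plus a 4-byte length field) takes 16 of them, which leaves only 8 bytes for the two elements, so each element must be a 4-byte reference. The array stores references to the two ClassA objects, and the objects themselves live elsewhere on the heap.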

In C, an array can hold structs inline as its elements. Java has no equivalent: an array of objects always holds references.
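
The sizes in the Test 3 output can be checked with a little arithmetic. This is a rough sketch under stated assumptions: a 16-byte array header, 4-byte references and 8-byte alignment, all taken from the JOL outputs in this post rather than queried from the JVM. The names ArraySizeSketch, shallowArraySize and objectArrayTotal are hypothetical.

```java
// Sketch only: constants are taken from the JOL outputs in this post.
class ArraySizeSketch {
    static final int ARRAY_HEADER_BYTES = 16; // 12-byte object header + 4-byte length
    static final int REFERENCE_BYTES = 4;     // reference size seen in the layouts
    static final int ALIGNMENT = 8;

    // Size of the array object itself, padded to a multiple of 8 bytes.
    static long shallowArraySize(int length, int elementBytes) {
        long raw = ARRAY_HEADER_BYTES + (long) length * elementBytes;
        return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // Array of references plus the objects it points to,
    // assuming every element is a distinct, non-null object.
    static long objectArrayTotal(int length, int elementInstanceBytes) {
        return shallowArraySize(length, REFERENCE_BYTES)
                + (long) length * elementInstanceBytes;
    }

    public static void main(String[] args) {
        System.out.println(shallowArraySize(2, REFERENCE_BYTES)); // prints 24
        System.out.println(objectArrayTotal(2, 24));              // prints 72
    }
}
```

The 72 bytes match the three 24-byte rows in the GraphLayout output: one for the array and one for each ClassA element.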

What happens when a ClassB object is created in a method

Two kinds of reference

What about boxed types?

Test 4 :

Output :

java.lang.Integer object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 12 (object header) N/A
12 4 int Integer.value N/A
Instance size: 16 bytes (estimated, the sample instance is not available)
Space losses: 0 bytes internal + 0 bytes external = 0 bytes total

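Conclusion 2: Primitive types are much more memory-efficient than boxed types. An int takes 4 bytes, while an Integer takes 16 bytes, plus the 4-byte reference needed to reach it.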
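
To get a feeling for what this means at scale, here is a small back-of-the-envelope sketch. The 16-byte Integer and the 4-byte reference come from the layouts above. The class and method names are made up, and the estimate assumes every boxed value is a separate object, which is not always true because small Integer values can be shared through the Integer cache.

```java
// Sketch only: per-element cost of storing an int directly versus boxed.
class BoxingCostSketch {
    static final int INT_BYTES = Integer.BYTES;   // 4
    static final int BOXED_INSTANCE_BYTES = 16;   // Integer instance size from JOL
    static final int REFERENCE_BYTES = 4;         // reference pointing at the Integer

    // Bytes needed per element when the value is boxed.
    static int boxedPerElement() {
        return REFERENCE_BYTES + BOXED_INSTANCE_BYTES;
    }

    // Extra bytes paid for boxing n values, assuming n distinct Integer objects.
    static long extraBytes(long n) {
        return n * (boxedPerElement() - INT_BYTES);
    }

    public static void main(String[] args) {
        System.out.println(boxedPerElement());     // prints 20
        System.out.println(extraBytes(1_000_000)); // prints 16000000
    }
}
```

Under these assumptions a boxed element costs five times as much as a primitive one.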

How about Java collections?

Collections cannot hold primitives, so any primitive value stored in a collection is boxed.

Test 5 :

Output :

java.util.ArrayList object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 12 (object header) N/A
12 4 int AbstractList.modCount N/A
16 4 int ArrayList.size N/A
20 4 Object[] ArrayList.elementData N/A
Instance size: 24 bytes (estimated, the sample instance is not available)
Space losses: 0 bytes internal + 0 bytes external = 0 bytes total

A two-element int array takes 24 bytes, but an ArrayList holding two elements takes 64 bytes.

Conclusion 3: Arrays are much more memory-efficient than collections.

How about String?

Test 6:

Output:

java.lang.String object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 12 (object header) N/A
12 4 char[] String.value N/A
16 4 int String.hash N/A
20 4 (loss due to the next object alignment)
Instance size: 24 bytes (estimated, the sample instance is not available)
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

Consider a scenario: a character sequence is read from disk and a String is created from it.

How much memory does the String cost?

The character sequence abc, encoded as ASCII or UTF-8, takes 3 bytes on disk. The String built from it takes 48 bytes in memory.

Test 7:

Output:

java.lang.String object externals:
ADDRESS SIZE TYPE PATH VALUE
76bf6eb08 24 java.lang.String (object)
76bf6eb20 24 [C .value [a, b, c]

Conclusion 4: A String takes much more space in memory than the same character sequence takes on disk.
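
The 3 versus 48 comparison can be sketched in code. The on-disk size below is measured with the standard getBytes call. The in-memory size is only an estimate built from the layouts in this post: a 24-byte String instance plus a char[] with a 16-byte header and 2 bytes per character. That matches a JVM where String.value is a char[], as in the output above. Newer JVMs store Latin-1 strings in a byte[], so the numbers there will differ. The class and method names are hypothetical.

```java
// Sketch only: compares UTF-8 size on disk with an estimated heap size.
class StringCostSketch {
    static final int STRING_INSTANCE_BYTES = 24; // from the String layout above
    static final int ARRAY_HEADER_BYTES = 16;    // 12-byte header + 4-byte length
    static final int ALIGNMENT = 8;

    // Bytes the text occupies when written as UTF-8.
    static int bytesOnDisk(String s) {
        return s.getBytes(java.nio.charset.StandardCharsets.UTF_8).length;
    }

    // Estimated heap bytes, assuming the value is stored as a char[] (2 bytes per char).
    static long estimatedBytesInMemory(String s) {
        long rawArray = ARRAY_HEADER_BYTES + 2L * s.length();
        long paddedArray = (rawArray + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
        return STRING_INSTANCE_BYTES + paddedArray;
    }

    public static void main(String[] args) {
        System.out.println(bytesOnDisk("abc"));            // prints 3
        System.out.println(estimatedBytesInMemory("abc")); // prints 48
    }
}
```

For short strings the fixed overhead of two object headers dominates, which is why the ratio is so large here.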

Why is a reference 4 bytes?

A 4-byte reference that addressed individual bytes could reach only 4 GB of memory. The JVM can obviously run with a larger heap, so what happens?

We saw that objects are padded at the end: the JVM guarantees that every object's size is a multiple of 8 bytes.

A Java reference addresses objects, not arbitrary bytes, so a 4-byte reference can address 8 × 4 GB = 32 GB of memory. This may be the main difference between a Java reference and a C pointer.

What if the JVM heap is larger than 32 GB? In that case the JVM widens references to 8 bytes.
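
The 32 GB figure follows from shifting a 32-bit value by 3 bits, since objects start on 8-byte boundaries. The sketch below is a simplified model of that idea, not the JVM's actual code: the heapBase parameter and the encode and decode helpers are made up for illustration.

```java
// Sketch only: a simplified model of 4-byte references scaled by object alignment.
class CompressedReferenceSketch {
    static final int ALIGNMENT_SHIFT = 3; // 8-byte alignment = 2^3

    // Bytes reachable with a 32-bit reference that counts 8-byte units.
    static long addressableBytes() {
        return (1L << 32) << ALIGNMENT_SHIFT;
    }

    // Turns an 8-byte-aligned address into a 32-bit reference (hypothetical helper).
    static int encode(long heapBase, long address) {
        return (int) ((address - heapBase) >>> ALIGNMENT_SHIFT);
    }

    // Turns the 32-bit reference back into a full address (hypothetical helper).
    static long decode(long heapBase, int compressed) {
        // Mask to treat the int as unsigned before shifting.
        return heapBase + ((compressed & 0xFFFFFFFFL) << ALIGNMENT_SHIFT);
    }

    public static void main(String[] args) {
        System.out.println(addressableBytes() / (1L << 30)); // prints 32 (GiB)
        long base = 0x1000_0000L;
        long address = base + 24; // an aligned offset inside the heap
        System.out.println(decode(base, encode(base, address)) == address); // prints true
    }
}
```

The scheme only works because the low 3 bits of every object address are zero, so dropping them loses no information.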

In summary

  1. Data loaded from disk into memory needs more space than it took on disk.
  2. Primitive types and arrays are more memory-efficient than boxed types and collections.
  3. Object layout in Java is more restrictive than in C.

This work is licensed under a Creative Commons Attribution 4.0 International License. When redistributing, please include the original link.
