Java堆内存

来源: 朱小厮

链接: http://blog.csdn.net/u013256816/article/details/50764532

Java 中的堆是 JVM 所管理的最大的一块内存空间,主要用于存放各种类的实例对象。

在 Java 中,堆被划分成两个不同的区域:新生代 ( Young )、老年代 ( Old )。新生代 ( Young ) 又被划分为三个区域:Eden、From Survivor、To Survivor。

这样划分的目的是为了使 JVM 能够更好的管理堆内存中的对象,包括内存的分配以及回收。

堆的内存模型大致为:

新生代:Young Generation,主要用来存放新生的对象。

老年代:Old Generation或者称作Tenured Generation,主要存放应用程序声明周期长的内存对象。

永久代:(方法区,不属于java堆,另一个别名为“非堆Non-Heap”但是一般查看PrintGCDetails都会带上PermGen区)是指内存的永久保存区域,主要存放Class和Meta的信息,Class在被 Load的时候被放入PermGen space区域. 它和和存放Instance的Heap区域不同,GC(Garbage Collection)不会在主程序运行期对PermGen space进行清理,所以如果你的应用会加载很多Class的话,就很可能出现PermGen space错误。

堆大小 = 新生代 + 老年代。其中,堆的大小可以通过参数 –Xms、-Xmx 来指定。

默认的,新生代 ( Young ) 与老年代 ( Old ) 的比例的值为 1:2 ( 该值可以通过参数 –XX:NewRatio 来指定 ),即:新生代 ( Young ) = 1/3 的堆空间大小。老年代 ( Old ) = 2/3 的堆空间大小。其中,新生代 ( Young ) 被细分为 Eden 和 两个 Survivor 区域,这两个 Survivor 区域分别被命名为 from 和 to,以示区分。

默认的,Edem : from : to = 8 : 1 : 1 ( 可以通过参数 –XX:SurvivorRatio 来设定 ),即: Eden = 8/10 的新生代空间大小,from = to = 1/10 的新生代空间大小。

JVM 每次只会使用 Eden 和其中的一块 Survivor 区域来为对象服务,所以无论什么时候,总是有一块 Survivor 区域是空闲着的。

因此,新生代实际可用的内存空间为 9/10 ( 即90% )的新生代空间。

回收方法区(附加补充)

很多人认为方法区(或者HotSpot虚拟机中的永久代[PermGen])是没有垃圾收集的,java虚拟机规范中确实说过可以不要求虚拟机在方法区实现垃圾收集,而且在方法去中进行垃圾收集的“性价比”一般比较低:在堆中,尤其是在新生代中,常规应用进行一次垃圾收集一般可以回收70%-95%的空间,而永久代的垃圾收集效率远低于此。

永久代的垃圾收集主要回收两部分内容:废弃的常量和无用的类。

1. 废弃的常量:回收废弃常量与回收java堆中的对象非常类似。以常量池字面量的回收为例,加入一个字符串“abc”已经进入了常量池中,但是当前系统没有任何一个String对象是叫做”abc”的,换句话说,就是有任何String对象应用常量池中的”abc”常量,也没有其他地方引用了这个字面量,如果这时发生内存回收,而且必要的话,这个“abc”常量就会被系统清理出常量池。常量池中的其他类(接口)、方法、字段的符号引用也与此类似。(注:jdk1.7及其之后的版本已经将常量池从永久代中移出)

2. 无用的类:类需要同时满足下面3个条件才能算是“无用的类”:

  • 该类所有的实例都已经被回收,也就是java堆中不存在该类的任何实例。
  • 加载该类的ClassLoader已经被回收
  • 该类对应的java.lang.Class对象没有在任何地方被引用,无法在任何地方通过反射访问该类的方法。

虚拟机可以对满足上述3个条件的无用类进行回收,这里说的仅仅是”可以“,而并不和对象一样,不使用了就必然会回收。是否对类进行回收,HotSpot虚拟机提供了-Xnoclassgc(关闭CLASS的垃圾回收功能,就是虚拟机加载的类,即便是不使用,没有实例也不会回收。)参数进行控制。

在大量使用反射、动态代理、CGlib等ByteCode框架、动态生成JSP以及OSGi这类频繁自定义ClassLoader的场景都需要虚拟机具备类卸载的功能,以保证永久代不会溢出。

GC

Java 中的堆也是 GC 收集垃圾的主要区域。GC 分为两种:Minor GC、Full GC ( 或称为 Major GC )。

Minor GC 是发生在新生代中的垃圾收集动作,所采用的是复制算法。

新生代几乎是所有 Java 对象出生的地方,即 Java 对象申请的内存以及存放都是在这个地方。Java 中的大部分对象通常不需长久存活,具有朝生夕灭的性质。

当一个对象被判定为 “死亡” 的时候,GC 就有责任来回收掉这部分对象的内存空间。新生代是 GC 收集垃圾的频繁区域。

当对象在 Eden ( 包括一个 Survivor 区域,这里假设是 from 区域 ) 出生后,在经过一次 Minor GC 后,如果对象还存活,并且能够被另外一块 Survivor 区域所容纳( 上面已经假设为 from 区域,这里应为 to 区域,即 to 区域有足够的内存空间来存储 Eden 和 from 区域中存活的对象 ),则使用复制算法将这些仍然还存活的对象复制到另外一块 Survivor 区域 ( 即 to 区域 ) 中,然后清理所使用过的 Eden 以及 Survivor 区域 ( 即 from 区域 ),并且将这些对象的年龄设置为1,以后对象在 Survivor 区每熬过一次 Minor GC,就将对象的年龄 + 1,当对象的年龄达到某个值时 ( 默认是 15 岁,可以通过参数 -XX:MaxTenuringThreshold 来设定 ),这些对象就会成为老年代。

但这也不是一定的,对于一些较大的对象 ( 即需要分配一块较大的连续内存空间 ) 则是直接进入到老年代。虚拟机提供了一个-XX:PretenureSizeThreshold参数,令大于这个设置值的对象直接在老年代分配。这样做的目的是避免在Eden区及两个Survivor区之间发生大量的内存复制(新生代采用复制算法收集内存)。

为了能够更好的适应不同的程序的内存状况,虚拟机并不是永远地要求对象的年龄必须达到了MaxTenuringThreshold才能晋升老年代,如果在Survivor空间中相同年龄所有对象大小的总和大于Survivor空间的一半,年龄大于或等于该年龄的对象可以直接进入老年代,无需等到MaxTenuringThreshold中要求的年龄。

Full GC 是发生在老年代的垃圾收集动作,所采用的是“标记-清除”或者“标记-整理”算法。

现实的生活中,老年代的人通常会比新生代的人 “早死”。堆内存中的老年代(Old)不同于这个,老年代里面的对象几乎个个都是在 Survivor 区域中熬过来的,它们是不会那么容易就 “死掉” 了的。因此,Full GC 发生的次数不会有 Minor GC 那么频繁,并且做一次 Full GC 要比进行一次 Minor GC 的时间更长。

在发生MinorGC之前,虚拟机会先检查老年代最大可用的连续空间是否大于新生代所有对象总空间,如果这个条件成立,那么MinorGC可以确保是安全的。如果不成立,则虚拟机会查看HandlePromotionFailure设置值是否允许担保失败。如果允许,那么会继续检查老年代最大可用的连续空间是否大于历次晋升到老年代对象的平均大小,如果大于,将尝试这进行一次MinorGC,尽管这次MinorGC是有风险的;如果小于,或者HandlePromptionFailure设置不允许冒险,那这是也要改为进行一次FullGC.

另外,标记-清除算法收集垃圾的时候会产生许多的内存碎片 ( 即不连续的内存空间 ),此后需要为较大的对象分配内存空间时,若无法找到足够的连续的内存空间,就会提前触发一次 GC 的收集动作。

GC日志

首先看一下如下代码:

package jvm;

 

public class PrintGCDetails

{

public static void main(String[] args)

{

Object obj = new Object();

System.gc();

System.out.println();

obj = new Object();

obj = new Object();

System.gc();

System.out.println();

}

}

设置JVM参数为-XX:+PrintGCDetails,执行结果如下:

[GC [PSYoungGen: 1019K->568K(28672K)] 1019K->568K(92672K), 0.0529244 secs] [Times: user=0.00 sys=0.00, real=0.06 secs]

{博主自定义注解:[GC [新生代: MinorGC前新生代内存使用->MinorGC后新生代内存使用(新生代总的内存大小)] MinorGC前JVM堆内存使用的大小->MinorGC后JVM堆内存使用的大小(堆的可用内存大小), MinorGC总耗时] [Times: 用户耗时=0.00 系统耗时=0.00, 实际耗时=0.06 secs] }

[Full GC [PSYoungGen: 568K->0K(28672K)] [ParOldGen: 0K->478K(64000K)] 568K->478K(92672K) [PSPermGen: 2484K->2483K(21504K)], 0.0178331 secs] [Times: user=0.01 sys=0.00, real=0.02 secs]

{博主自定义注解:[Full GC [PSYoungGen: 568K->0K(28672K)] [老年代: FullGC前老年代内存使用->FullGC后老年代内存使用(老年代总的内存大小)] FullGC前JVM堆内存使用的大小->FullGC后JVM堆内存使用的大小(堆的可用内存大小) [永久代: 2484K->2483K(21504K)], 0.0178331 secs] [Times: user=0.01 sys=0.00, real=0.02 secs]}

 

[GC [PSYoungGen: 501K->64K(28672K)] 980K->542K(92672K), 0.0005080 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]

[Full GC [PSYoungGen: 64K->0K(28672K)] [ParOldGen: 478K->479K(64000K)] 542K->479K(92672K) [PSPermGen: 2483K->2483K(21504K)], 0.0133836 secs] [Times: user=0.05 sys=0.00, real=0.01 secs]

 

Heap

PSYoungGen      total 28672K, used 1505K [0x00000000e0a00000, 0x00000000e2980000, 0x0000000100000000)

eden space 25088K, 6% used [0x00000000e0a00000,0x00000000e0b78690,0x00000000e2280000)

from space 3584K, 0% used [0x00000000e2600000,0x00000000e2600000,0x00000000e2980000)

to   space 3584K, 0% used [0x00000000e2280000,0x00000000e2280000,0x00000000e2600000)

ParOldGen       total 64000K, used 479K [0x00000000a1e00000, 0x00000000a5c80000, 0x00000000e0a00000)

object space 64000K, 0% used [0x00000000a1e00000,0x00000000a1e77d18,0x00000000a5c80000)

PSPermGen       total 21504K, used 2492K [0x000000009cc00000, 0x000000009e100000, 0x00000000a1e00000)

object space 21504K, 11% used [0x000000009cc00000,0x000000009ce6f2d0,0x000000009e100000)

注:你可以用JConsole或者Runtime.getRuntime().maxMemory(),Runtime.getRuntime().totalMemory(), Runtime.getRuntime().freeMemory()来查看Java中堆内存的大小。

再看一个例子:

package jvm;

public class PrintGCDetails2

{

/**

* -Xms60m -Xmx60m -Xmn20m -XX:NewRatio=2 ( 若 Xms = Xmx, 并且设定了 Xmn,

* 那么该项配置就不需要配置了 ) -XX:SurvivorRatio=8 -XX:PermSize=30m -XX:MaxPermSize=30m

* -XX:+PrintGCDetails

*/

public static void main(String[] args)

{

new PrintGCDetails2().doTest();

}

 

public void doTest()

{

Integer M = new Integer(1024 * 1024 * 1); // 单位, 兆(M)

byte[] bytes = new byte[1 * M]; // 申请 1M 大小的内存空间

bytes = null; // 断开引用链

System.gc(); // 通知 GC 收集垃圾

System.out.println();

bytes = new byte[1 * M]; // 重新申请 1M 大小的内存空间

bytes = new byte[1 * M]; // 再次申请 1M 大小的内存空间

System.gc();

System.out.println();

}

}

运行结果:

[GC [PSYoungGen: 2007K->568K(18432K)] 2007K->568K(59392K), 0.0059377 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]

[Full GC [PSYoungGen: 568K->0K(18432K)] [ParOldGen: 0K->479K(40960K)] 568K->479K(59392K) [PSPermGen: 2484K->2483K(30720K)], 0.0223249 secs] [Times: user=0.01 sys=0.00, real=0.02 secs]

 

[GC [PSYoungGen: 3031K->1056K(18432K)] 3510K->1535K(59392K), 0.0140169 secs] [Times: user=0.05 sys=0.00, real=0.01 secs]

[Full GC [PSYoungGen: 1056K->0K(18432K)] [ParOldGen: 479K->1503K(40960K)] 1535K->1503K(59392K) [PSPermGen: 2486K->2486K(30720K)], 0.0119497 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]

 

Heap

PSYoungGen      total 18432K, used 163K [0x00000000fec00000, 0x0000000100000000, 0x0000000100000000)

eden space 16384K, 1% used [0x00000000fec00000,0x00000000fec28ff0,0x00000000ffc00000)

from space 2048K, 0% used [0x00000000ffe00000,0x00000000ffe00000,0x0000000100000000)

to   space 2048K, 0% used [0x00000000ffc00000,0x00000000ffc00000,0x00000000ffe00000)

ParOldGen       total 40960K, used 1503K [0x00000000fc400000, 0x00000000fec00000, 0x00000000fec00000)

object space 40960K, 3% used [0x00000000fc400000,0x00000000fc577e10,0x00000000fec00000)

PSPermGen       total 30720K, used 2493K [0x00000000fa600000, 0x00000000fc400000, 0x00000000fc400000)

object space 30720K, 8% used [0x00000000fa600000,0x00000000fa86f4f0,0x00000000fc400000)

从打印结果可以看出,堆中新生代的内存空间为 18432K ( 约 18M ),eden 的内存空间为 16384K ( 约 16M),from / to survivor 的内存空间为 2048K ( 约 2M)。

这里所配置的 Xmn 为 20M,也就是指定了新生代的内存空间为 20M,可是从打印的堆信息来看,新生代怎么就只有 18M 呢? 另外的 2M 哪里去了?

别急,是这样的。新生代 = eden + from + to = 16 + 2 + 2 = 20M,可见新生代的内存空间确实是按 Xmn 参数分配得到的。

而且这里指定了 SurvivorRatio = 8,因此,eden = 8/10 的新生代空间 = 8/10 * 20 = 16M。from = to = 1/10 的新生代空间 = 1/10 * 20 = 2M。

堆信息中新生代的 total 18432K 是这样来的: eden + 1 个 survivor = 16384K + 2048K = 18432K,即约为 18M。

因为 jvm 每次只是用新生代中的 eden 和 一个 survivor,因此新生代实际的可用内存空间大小为所指定的 90%。

因此可以知道,这里新生代的内存空间指的是新生代可用的总的内存空间,而不是指整个新生代的空间大小。

另外,可以看出老年代的内存空间为 40960K ( 约 40M ),堆大小 = 新生代 + 老年代。因此在这里,老年代 = 堆大小 – 新生代 = 60 – 20 = 40M。

最后,这里还指定了 PermSize = 30m,PermGen 即永久代 ( 方法区 ),它还有一个名字,叫非堆,主要用来存储由 jvm 加载的类文件信息、常量、静态变量等。

附:JVM常用参数

-XX:+<option> 启用选项

-XX:-<option>不启用选项

-XX:<option>=<number>

-XX:<option>=<string>

堆设置

-Xms :初始堆大小

-Xmx :最大堆大小

-Xmn:新生代大小。通常为 Xmx 的 1/3 或 1/4。新生代 = Eden + 2 个 Survivor 空间。实际可用空间为 = Eden + 1 个 Survivor,即 90%

-XX:NewSize=n :设置年轻代大小

-XX:NewRatio=n: 设置年轻代和年老代的比值。如:为3,表示年轻代与年老代比值为1:3,年轻代占整个年轻代年老代和的1/4

-XX:SurvivorRatio=n :年轻代中Eden区与两个Survivor区的比值。注意Survivor区有两个。如:3,表示Eden:Survivor=3:2,一个Survivor区占整个年轻代的1/5

-XX:PermSize=n 永久代(方法区)的初始大小

-XX:MaxPermSize=n :设置永久代大小

-Xss 设定栈容量;对于HotSpot来说,虽然-Xoss参数(设置本地方法栈大小)存在,但实际上是无效的,因为在HotSpot中并不区分虚拟机和本地方法栈。

-XX:PretenureSizeThreshold (该设置只对Serial和ParNew收集器生效) 可以设置进入老生代的大小限制

-XX:MaxTenuringThreshold=1(默认15)垃圾最大年龄 如果设置为0的话,则年轻代对象不经过Survivor区,直接进入年老代. 对于年老代比较多的应用,可以提高效率.如果将此值设置为一个较大值,则年轻代对象会在Survivor区进行多次复制,这样可以增加对象再年轻代的存活 时间,增加在年轻代即被回收的概率

该参数只有在串行GC时才有效.

收集器设置

-XX:+UseSerialGC :设置串行收集器

-XX:+UseParallelGC :设置并行收集器

-XX:+UseParallelOldGC :设置并行年老代收集器

-XX:+UseConcMarkSweepGC :设置并发收集器

垃圾回收统计信息

-XX:+PrintHeapAtGC GC的heap详情

-XX:+PrintGCDetails GC详情

-XX:+PrintGCTimeStamps 打印GC时间信息

-XX:+PrintTenuringDistribution 打印年龄信息等

-XX:+HandlePromotionFailure 老年代分配担保(true or false)

-Xloggc:gc.log 指定日志的位置

并行收集器设置

-XX:ParallelGCThreads=n :设置并行收集器收集时使用的CPU数。并行收集线程数。

-XX:MaxGCPauseMillis=n :设置并行收集最大暂停时间

-XX:GCTimeRatio=n :设置垃圾回收时间占程序运行时间的百分比。公式为1/(1+n)

并发收集器设置

-XX:+CMSIncrementalMode :设置为增量模式。适用于单CPU情况。

-XX:ParallelGCThreads=n :设置并发收集器年轻代收集方式为并行收集时,使用的CPU数。并行收集线程数。

其他

-XX:PermSize=10M和-XX:MaxPermSize=10M限制方法区大小。

-XX:MaxDirectMemorySize=10M指定DirectMemory(直接内存)容量,如果不指定,则默认与JAVA堆最大值(-Xmx指定)一样。

-XX:+HeapDumpOnOutOfMemoryError 可以让虚拟机在出现内存溢出异常时Dump出当前的内存堆转储快照(.hprof文件)以便时候进行分析(比如Eclipse Memory Analysis)。

参考资料:

1. 《java 虚拟机–新生代与老年代GC》

2. 《Java 堆内存》

3. 《深入理解Java虚拟机》周志明著

Java Memory Model

Jakob Jenkov

Source: http://tutorials.jenkov.com/java-concurrency/java-memory-model.html

The Java memory model specifies how the Java virtual machine works with the computer’s memory (RAM). The Java virtual machine is a model of a whole computer so this model naturally includes a memory model – AKA the Java memory model.

It is very important to understand the Java memory model if you want to design correctly behaving concurrent programs. The Java memory model specifies how and when different threads can see values written to shared variables by other threads, and how to synchronize access to shared variables when necessary.

The original Java memory model was insufficient, so the Java memory model was revised in Java 1.5. This version of the Java memory model is still in use in Java 8.

The Internal Java Memory Model

The Java memory model used internally in the JVM divides memory between thread stacks and the heap. This diagram illustrates the Java memory model from a logic perspective:

The Java Memory Model From a Logic Perspective

Each thread running in the Java virtual machine has its own thread stack. The thread stack contains information about what methods the thread has called to reach the current point of execution. I will refer to this as the “call stack”. As the thread executes its code, the call stack changes.

The thread stack also contains all local variables for each method being executed (all methods on the call stack). A thread can only access it’s own thread stack. Local variables created by a thread are invisible to all other threads than the thread who created it. Even if two threads are executing the exact same code, the two threads will still create the local variables of that code in each their own thread stack. Thus, each thread has its own version of each local variable.

All local variables of primitive types ( boolean, byte, short, char, int, long, float, double) are fully stored on the thread stack and are thus not visible to other threads. One thread may pass a copy of a pritimive variable to another thread, but it cannot share the primitive local variable itself.

The heap contains all objects created in your Java application, regardless of what thread created the object. This includes the object versions of the primitive types (e.g. Byte, Integer, Long etc.). It does not matter if an object was created and assigned to a local variable, or created as a member variable of another object, the object is still stored on the heap.

Here is a diagram illustrating the call stack and local variables stored on the thread stacks, and objects stored on the heap:

The Java Memory Model showing where local variables and objects are stored in memory.

A local variable may be of a primitive type, in which case it is totally kept on the thread stack.

A local variable may also be a reference to an object. In that case the reference (the local variable) is stored on the thread stack, but the object itself if stored on the heap.

An object may contain methods and these methods may contain local variables. These local variables are also stored on the thread stack, even if the object the method belongs to is stored on the heap.

An object’s member variables are stored on the heap along with the object itself. That is true both when the member variable is of a primitive type, and if it is a reference to an object.

Static class variables are also stored on the heap along with the class definition.

Objects on the heap can be accessed by all threads that have a reference to the object. When a thread has access to an object, it can also get access to that object’s member variables. If two threads call a method on the same object at the same time, they will both have access to the object’s member variables, but each thread will have its own copy of the local variables.

Here is a diagram illustrating the points above:

The Java Memory Model showing references from local variables to objects, and from object to other objects.

Two threads have a set of local variables. One of the local variables (Local Variable 2) point to a shared object on the heap (Object 3). The two threads each have a different reference to the same object. Their references are local variables and are thus stored in each thread’s thread stack (on each). The two different references point to the same object on the heap, though.

Notice how the shared object (Object 3) has a reference to Object 2 and Object 4 as member variables (illustrated by the arrows from Object 3 to Object 2 and Object 4). Via these member variable references in Object 3 the two threads can access Object 2 and Object 4.

The diagram also shows a local variable which point to two different objects on the heap. In this case the references point to two different objects (Object 1 and Object 5), not the same object. In theory both threads could access both Object 1 and Object 5, if both threads had references to both objects. But in the diagram above each thread only has a reference to one of the two objects.

So, what kind of Java code could lead to the above memory graph? Well, code as simple as the code below:

public class MyRunnable implements Runnable() {

    public void run() {
        methodOne();
    }

    public void methodOne() {
        int localVariable1 = 45;

        MySharedObject localVariable2 =
            MySharedObject.sharedInstance;

        //... do more with local variables.

        methodTwo();
    }

    public void methodTwo() {
        Integer localVariable1 = new Integer(99);

        //... do more with local variable.
    }
}

public class MySharedObject {

    //static variable pointing to instance of MySharedObject

    public static final MySharedObject sharedInstance =
        new MySharedObject();


    //member variables pointing to two objects on the heap

    public Integer object2 = new Integer(22);
    public Integer object4 = new Integer(44);

    public long member1 = 12345;
    public long member1 = 67890;
}

If two threads were executing the run() method then the diagram shown earlier would be the outcome. The run() method calls methodOne() and methodOne() calls methodTwo().

methodOne() declares a primitive local variable (localVariable1 of type int) and an local variable which is an object reference (localVariable2).

Each thread executing methodOne() will create its own copy of localVariable1 and localVariable2 on their respective thread stacks. The localVariable1 variables will be completely separated from each other, only living on each thread’s thread stack. One thread cannot see what changes another thread makes to its copy of localVariable1.

Each thread executing methodOne() will also create their own copy of localVariable2. However, the two different copies of localVariable2 both end up pointing to the same object on the heap. The code sets localVariable2 to point to an object referenced by a static variable. There is only one copy of a static variable and this copy is stored on the heap. Thus, both of the two copies of localVariable2 end up pointing to the same instance of MySharedObject which the static variable points to. The MySharedObject instance is also stored on the heap. It corresponds to Object 3 in the diagram above.

Notice how the MySharedObject class contains two member variables too. The member variables themselves are stored on the heap along with the object. The two member variables point to two other Integer objects. These Integer objects correspond to Object 2 and Object 4 in the diagram above.

Notice also how methodTwo() creates a local variable named localVariable1. This local variable is an object reference to an Integer object. The method sets the localVariable1 reference to point to a new Integer instance. The localVariable1 reference will be stored in one copy per thread executing methodTwo(). The two Integer objects instantiated will be stored on the heap, but since the method creates a new Integer object every time the method is executed, two threads executing this method will create separate Integer instances. The Integer objects created inside methodTwo() correspond to Object 1 and Object 5 in the diagram above.

Notice also the two member variables in the class MySharedObject of type long which is a primitive type. Since these variables are member variables, they are still stored on the heap along with the object. Only local variables are stored on the thread stack.

Hardware Memory Architecture

Modern hardware memory architecture is somewhat different from the internal Java memory model. It is important to understand the hardware memory architecture too, to understand how the Java memory model works with it. This section describes the common hardware memory architecture, and a later section will describe how the Java memory model works with it.

Here is a simplified diagram of modern computer hardware architecture:

Modern hardware memory architecture.

A modern computer often has 2 or more CPUs in it. Some of these CPUs may have multiple cores too. The point is, that on a modern computer with 2 or more CPUs it is possible to have more than one thread running simultaneously. Each CPU is capable of running one thread at any given time. That means that if your Java application is multithreaded, one thread per CPU may be running simultaneously (concurrently) inside your Java application.

Each CPU contains a set of registers which are essentially in-CPU memory. The CPU can perform operations much faster on these registers than it can perform on variables in main memory. That is because the CPU can access these registers much faster than it can access main memory.

Each CPU may also have a CPU cache memory layer. In fact, most modern CPUs have a cache memory layer of some size. The CPU can access its cache memory much faster than main memory, but typically not as fast as it can access its internal registers. So, the CPU cache memory is somewhere in between the speed of the internal registers and main memory. Some CPUs may have multiple cache layers (Level 1 and Level 2), but this is not so important to know to understand how the Java memory model interacts with memory. What matters is to know that CPUs can have a cache memory layer of some sort.

A computer also contains a main memory area (RAM). All CPUs can access the main memory. The main memory area is typically much bigger than the cache memories of the CPUs.

Typically, when a CPU needs to access main memory it will read part of main memory into its CPU cache. It may even read part of the cache into its internal registers and then perform operations on it. When the CPU needs to write the result back to main memory it will flush the value from its internal register to the cache memory, and at some point flush the value back to main memory.

The values stored in the cache memory is typically flushed back to main memory when the CPU needs to store something else in the cache memory. The CPU cache can have data written to part of its memory at a time, and flush part of its memory at a time. It does not have to read / write the full cache each time it is updated. Typically the cache is updated in smaller memory blocks called “cache lines”. One or more cache lines may be read into the cache memory, and one or mor cache lines may be flushed back to main memory again.

Bridging The Gap Between The Java Memory Model And The Hardware Memory Architecture

As already mentioned, the Java memory model and the hardware memory architecture are different. The hardware memory architecture does not distinguish between thread stacks and heap. On the hardware, both the thread stack and the heap are located in main memory. Parts of the thread stacks and heap may sometimes be present in CPU caches and in internal CPU registers. This is illustrated in this diagram:

The division of thread stack and heap among CPU internal registers, CPU cache and main memory.

When objects and variables can be stored in various different memory areas in the computer, certain problems may occur. The two main problems are:

  • Visibility of thread updates (writes) to shared variables.
  • Race conditions when reading, checking and writing shared variables.

Both of these problems will be explained in the following sections.

Visibility of Shared Objects

If two or more threads are sharing an object, without the proper use of either volatile declarations or synchronization, updates to the shared object made by one thread may not be visible to other threads.

Imagine that the shared object is initially stored in main memory. A thread running on CPU one then reads the shared object into its CPU cache. There it makes a change to the shared object. As long as the CPU cache has not been flushed back to main memory, the changed version of the shared object is not visible to threads running on other CPUs. This way each thread may end up with its own copy of the shared object, each copy sitting in a different CPU cache.

The following diagram illustrates the sketched situation. One thread running on the left CPU copies the shared object into its CPU cache, and changes its count variable to 2. This change is not visible to other threads running on the right CPU, because the update to count has not been flushed back to main memory yet.

Visibility Issues in the Java Memory Model.

To solve this problem you can use Java’s volatile keyword. The volatile keyword can make sure that a given variable is read directly from main memory, and always written back to main memory when updated.

Race Conditions

If two or more threads share an object, and more than one thread updates variables in that shared object, race conditions may occur.

Imagine if thread A reads the variable count of a shared object into its CPU cache. Imagine too, that thread B does the same, but into a different CPU cache. Now thread A adds one to count, and thread B does the same. Now var1 has been incremented two times, once in each CPU cache.

If these increments had been carried out sequentially, the variable count would be been incremented twice and had the original value + 2 written back to main memory.

However, the two increments have been carried out concurrently without proper synchronization. Regardless of which of thread A and B that writes its updated version of count back to main memory, the updated value will only be 1 higher than the original value, despite the two increments.

This diagram illustrates an occurrence of the problem with race conditions as described above:

Race Condition Issues in the Java Memory Model.

To solve this problem you can use a Java synchronized block. A synchronized block guarantees that only one thread can enter a given critical section of the code at any given time. Synchronized blocks also guarantee that all variables accessed inside the synchronized block will be read in from main memory, and when the thread exits the synchronized block, all updated variables will be flushed back to main memory again, regardless of whether the variable is declared volatile or not.