Lecture 20: Hands-On Practice: Lesser-Known Bytecode Instructions · The Java Virtual Machine Explained Simply

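This lesson is a hands-on case study: lesser-known bytecode instructions. We look at how several common Java language features are compiled, and use them to see bytecode at work. Java has a great many features, so we only cover the ones you run into most often. javap is the tool of choice here: even complicated concepts show their true form in its output, and seeing them there leaves a lasting impression. This lesson contains a lot of code. All of the examples can be found in the repository, and it is worth running them yourself.

#### Exception Handling

In the previous lesson, you may have noticed that the bytecode generated for synchronized contains two monitorexit instructions. The second one guarantees that the monitor is released on every exceptional path as well. This brings us to the exception handling mechanism of Java bytecode, shown in the figure below.

![](https://img.kancloud.cn/be/f7/bef7abbb373b291729d85b4d370627e5_757x277.jpg)

If you know the Java language, the exception hierarchy above will be familiar. Error and RuntimeException are unchecked exceptions, meaning the compiler does not require a catch clause for them. All other exceptions are checked and must be handled explicitly by the programmer.

#### Exception Table

When an exception occurs, Java builds the exception stack trace from the Java execution stack. Recall the stack frames from Lesson 02: obtaining the stack trace only requires walking through them. This operation is, however, far more expensive than ordinary operations. Java logging frameworks usually print the full error information to the log, so when exceptions are very frequent, performance suffers noticeably. Let us look again at the bytecode generated in the previous lesson:

```
void doLock();
    descriptor: ()V
    flags:
    Code:
      stack=2, locals=3, args_size=1
         0: aload_0
         1: getfield      #3                  // Field lock:Ljava/lang/Object;
         4: dup
         5: astore_1
         6: monitorenter
         7: getstatic     #4                  // Field java/lang/System.out:Ljava/io/PrintStream;
        10: ldc           #8                  // String lock
        12: invokevirtual #6                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        15: aload_1
        16: monitorexit
        17: goto          25
        20: astore_2
        21: aload_1
        22: monitorexit
        23: aload_2
        24: athrow
        25: return
      Exception table:
         from    to  target type
             7    17    20   any
            20    23    20   any
```

The compiled bytecode carries an Exception table, and each row in it is one exception handler:

* from: the bytecode index where the protected range starts (inclusive)
* to: the bytecode index where the protected range ends (exclusive)
* target: the bytecode index where the handler starts
* type: the exception type the handler catches (any matches every type)

In other words, whenever an exception of a matching type is thrown between from and to, execution jumps to the position given by target.

#### finally

When reading files, we usually close the stream in a finally block to avoid leaking the underlying resource. With that scenario in mind, let us analyze the exception table of the following code.

```
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

public class A {
    public void read() {
        InputStream in = null;
        try {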
            in = new FileInputStream("A.java");
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } finally {
            if (null != in) {
                try {
                    in.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}
```

The code above catches a FileNotFoundException, and then catches an IOException inside the finally block. When we inspect the bytecode, we find something interesting: IOException appears no fewer than three times.

```
Exception table:
   from    to  target type
      17    21    24   Class java/io/IOException
       2    12    32   Class java/io/FileNotFoundException
      42    46    49   Class java/io/IOException
       2    12    57   any
      32    37    57   any
      63    67    70   Class java/io/IOException
```

The Java compiler organizes the bytecode of finally in a rather naive way. It copies the finally block and appends it to the end of the normal execution path of the try block and of each catch block. It then places one more copy at the exit of the path taken when any other exception is thrown. This is also why the method below returns normally instead of throwing an exception; the answer is in its bytecode.

```
// B.java
public int read() {
    try {
        int a = 1 / 0;
        return a;
    } finally {
        return 1;
    }
}
```

Below is the bytecode of this method. When the exception is thrown (by idiv at offset 2), execution jumps straight to offset 8. There the exception is stored into a local variable by astore_3 and never rethrown, and the method returns 1.

```
      stack=2, locals=4, args_size=1
         0: iconst_1
         1: iconst_0
         2: idiv
         3: istore_1
         4: iload_1
         5: istore_2
         6: iconst_1
         7: ireturn
         8: astore_3
         9: iconst_1
        10: ireturn
      Exception table:
         from    to  target type
             0     6     8   any
```

#### Boxing and Unboxing

When you first learn Java, automatic boxing and unboxing can be quite confusing. Java has 8 primitive types, but because the language is object-oriented, each of them also has a corresponding wrapper type, for example int and Integer. A wrapper value can be null, and in many situations the two kinds of value can be assigned to each other. Let us use the following code to observe this at the bytecode level:

```
public class Box {
    public Integer cal() {
        Integer a = 1000;
        int b = a * 10;
        return b;
    }
}
```

This is a simple piece of code. It first creates the number 1000 as a wrapper type, multiplies it by 10, and returns the result, but the intermediate calculation uses the primitive type int.

```
public java.lang.Integer cal();
    descriptor: ()Ljava/lang/Integer;
    flags: ACC_PUBLIC
    Code:
      stack=2, locals=3, args_size=1
         0: sipush        1000
         3: invokestatic  #2
                                              // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
         6: astore_1
         7: aload_1
         8: invokevirtual #3                  // Method java/lang/Integer.intValue:()I
        11: bipush        10
        13: imul
        14: istore_2
        15: iload_2
        16: invokestatic  #2                  // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
        19: areturn
```

From the bytecode we can see that the assignment uses Integer.valueOf, and that the multiplication calls Integer.intValue to obtain the primitive value. When the method returns, Integer.valueOf is used once more to wrap the result. This is how automatic boxing and unboxing are implemented under the hood. There is, however, a trap at the Java level here, so let us follow Integer.valueOf a little further.

```
@HotSpotIntrinsicCandidate
public static Integer valueOf(int i) {
    if (i >= IntegerCache.low && i <= IntegerCache.high)
        return IntegerCache.cache[i + (-IntegerCache.low)];
    return new Integer(i);
}
```

IntegerCache caches the Integer objects between low and high (-128 to 127 by default), and the upper bound can be changed with -XX:AutoBoxCacheMax. Below is a classic interview question. What do you think this code prints?

```
public class BoxCacheError {
    public static void main(String[] args) {
        Integer n1 = 123;
        Integer n2 = 123;
        Integer n3 = 128;
        Integer n4 = 128;

        // == on wrapper types compares object references, not values
        System.out.println(n1 == n2);
        System.out.println(n3 == n4);
    }
}
```

When I run java BoxCacheError, the output is true, false. When I add the flag and run java -XX:AutoBoxCacheMax=256 BoxCacheError, the output is true, true. The cache is the reason: 123 always falls inside the cached range, so n1 and n2 refer to the same object, while 128 is only cached once the upper bound has been raised.

#### Array Access

We all know that the length of an array is available directly through its .length property, yet nowhere in Java can you find a definition of an array class. Take int[] as an example: calling getClass (a method of the Object class) shows that its concrete type is [I. In fact, arrays are an object type built into the JVM, and this type also inherits from Object. Let us use the following code to observe how an array is created and accessed.

```
public class ArrayDemo {
    int getValue() {
        int[] arr = new int[]{
                1111, 2222, 3333, 4444
        };
        return arr[2];
    }

    int getLength(int[] arr) {
        return arr.length;
    }
}
```

First, the bytecode of the getValue method.

```
int getValue();
    descriptor: ()I
    flags:
    Code:
      stack=4, locals=2, args_size=1
         0: iconst_4
         1: newarray       int
         3: dup
         4: iconst_0
         5: sipush        1111
         8: iastore
         9: dup
        10: iconst_1
        11: sipush        2222
        14: iastore
        15: dup
        16: iconst_2
        17: sipush        3333
        20: iastore
        21: dup
        22: iconst_3
        23: sipush        4444
        26: iastore
        27: astore_1
        28: aload_1
        29: iconst_2
        30: iaload
        31: ireturn
```

The code that creates the array is compiled into a newarray instruction. The initial contents of the array are compiled, in order, into a series of instructions that store each element:

* sipush pushes a short constant onto the top of the stack;
* iastore stores the int on top of the stack into the given array at the given index.

To support the different element types, there is a whole family of instructions for storing from the operand stack into an array: bastore, castore, sastore, iastore, lastore, fastore, dastore, and aastore.

The array element is accessed by the instructions at offsets 28 to 30:

* aload_1 pushes the reference held in local variable slot 1 (the second slot) onto the top of the stack, which here is the newly created array;
* iconst_2 pushes the int value 2 onto the top of the stack;
* iaload pushes the value at the given index of the int array onto the top of the stack.

Note that this code may throw an ArrayIndexOutOfBoundsException at run time, but because it is an unchecked exception, we are not required to provide an exception handler for it.

Next, the bytecode of getLength:

```
int getLength(int[]);
    descriptor: ([I)I
    flags:
    Code:
      stack=1, locals=2, args_size=2
         0: aload_1
         1: arraylength
         2: ireturn
```

As you can see, obtaining the length of an array is done by the bytecode instruction arraylength.

#### foreach

Both Java arrays and Lists can be traversed with the foreach statement. Typical code looks like this:

```
import java.util.List;

public class ForDemo {
    void loop(int[] arr) {
        for (int i : arr) {
            System.out.println(i);
        }
    }

    void loop(List<Integer> arr) {
        for (int i : arr) {
            System.out.println(i);
        }
    }
}
```

Although the two loops look identical at the language level, they are implemented differently. First, the bytecode for traversing the array:

```
void loop(int[]);
    descriptor: ([I)V
    flags:
    Code:
      stack=2, locals=6, args_size=2
         0: aload_1
         1: astore_2
         2: aload_2
         3: arraylength
         4: istore_3
         5: iconst_0
         6: istore        4
         8: iload         4
        10: iload_3
        11: if_icmpge     34
        14: aload_2
        15: iload         4
        17: iaload
        18: istore        5
        20: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
        23: iload         5
        25: invokevirtual #3
                                              // Method java/io/PrintStream.println:(I)V
        28: iinc          4, 1
        31: goto          8
        34: return
```

It is easy to see that the compiler rewrites the loop into the traditional index-based form, that is, for (int i = 0; i < length; i++). The bytecode for the List version is as follows:

```
void loop(java.util.List<java.lang.Integer>);
    Code:
       0: aload_1
       1: invokeinterface #4,  1            // InterfaceMethod java/util/List.iterator:()Ljava/util/Iterator;
       6: astore_2
       7: aload_2
       8: invokeinterface #5,  1            // InterfaceMethod java/util/Iterator.hasNext:()Z
      13: ifeq          39
      16: aload_2
      17: invokeinterface #6,  1            // InterfaceMethod java/util/Iterator.next:()Ljava/lang/Object;
      22: checkcast     #7                  // class java/lang/Integer
      25: invokevirtual #8                  // Method java/lang/Integer.intValue:()I
      28: istore_3
      29: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
      32: iload_3
      33: invokevirtual #3                  // Method java/io/PrintStream.println:(I)V
      36: goto          7
      39: return
```

Here the list is actually traversed through an iterator: the loop calls Iterator.hasNext() and Iterator.next(), and each element is unboxed with Integer.intValue(). With a decompiler such as jd-gui, you can see the code that is effectively generated:

```
void loop(List<Integer> paramList) {
    for (Iterator<Integer> iterator = paramList.iterator(); iterator.hasNext(); ) {
        int i = ((Integer)iterator.next()).intValue();
        System.out.println(i);
    }
}
```

#### Annotations

Annotations are used very widely in Java, and the Spring framework in particular owes its revival to them. In development, annotations serve to constrain data and define standards. You can think of them as conventions for the code that help us write convenient, fast, and concise programs. So where is annotation information stored? We use two Java files to look at one of the possible cases.

**MyAnnotation.java**

```
public @interface MyAnnotation {
}
```

**AnnotationDemo.java**

```
@MyAnnotation
public class AnnotationDemo {
    @MyAnnotation
    public void test(@MyAnnotation int a) {
    }
}
```

Now let us look at the bytecode information.

```
{
  public AnnotationDemo();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Object."<init>":()V
         4: return
      LineNumberTable:
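        line 2: 0

  public void test(int);
    descriptor: (I)V
    flags: ACC_PUBLIC
    Code:
      stack=0, locals=2, args_size=2
         0: return
      LineNumberTable:
        line 6: 0
    RuntimeInvisibleAnnotations:
      0: #11()
    RuntimeInvisibleParameterAnnotations:
      0:
        0: #11()
}
SourceFile: "AnnotationDemo.java"
RuntimeInvisibleAnnotations:
  0: #11()
```

As you can see, both the class annotation and the method annotation are stored in a structure called RuntimeInvisibleAnnotations, while the parameter annotation is stored in RuntimeInvisibleParameterAnnotations. They are "invisible" because MyAnnotation declares no retention policy, so it defaults to RetentionPolicy.CLASS: the annotation is recorded in the class file but is not available through reflection at run time.

#### Summary

In this lesson we briefly covered some problems that come up often in everyday work and analyzed how they work at the bytecode level: exception handling, the execution order of finally blocks, hidden boxing and unboxing, and the implementation behind the foreach syntactic sugar.

Java has far too many features to list them all here, but every one of them can be examined in the same simple way. Think of this lesson as a starting point that offers one approach to learning.

You can also think about the performance and complexity involved. As we saw, hidden boxing and unboxing cause many redundant bytecode instructions to be generated. Does this cost performance? It certainly does, but there is no need to agonize over it. The bytecode you look at may run to thousands of lines and seem intimidating, yet almost every instruction executes in a matter of nanoseconds. Java's countless frameworks, the JDK included, do not restrict their code in order to optimize this kind of cost. Understand the principle, but do not lose sight of what matters: avoiding a single Java thread context switch, for example, gains you more than optimizing away several thousand boxing and unboxing operations.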
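
As a quick way to check two behaviors from this lesson yourself (a return inside finally discarding a pending exception, and the Integer cache affecting ==), here is a minimal sketch. The class name BytecodeBehaviorDemo and its helper methods are made up for illustration and are not taken from the course repository. The division uses a variable rather than the literal 1 / 0 only to avoid a compiler lint warning.

```java
public class BytecodeBehaviorDemo {

    // Same shape as B.read() in the lesson: the return in finally
    // discards the ArithmeticException thrown by the division.
    @SuppressWarnings("finally")
    public static int readWithFinallyReturn() {
        try {
            int zero = 0;
            int a = 1 / zero; // throws ArithmeticException
            return a;
        } finally {
            return 1; // replaces the pending exception with a normal return
        }
    }

    // Boxes the same int twice (via Integer.valueOf) and compares references.
    // Guaranteed true for values in -128..127; outside that range the result
    // depends on the cache size (for example -XX:AutoBoxCacheMax).
    public static boolean sameBoxedObject(int value) {
        Integer first = value;
        Integer second = value;
        return first == second;
    }

    // Compares boxed values with equals, which does not depend on the cache.
    public static boolean equalBoxedValue(int value) {
        Integer first = value;
        Integer second = value;
        return first.equals(second);
    }

    public static void main(String[] args) {
        System.out.println(readWithFinallyReturn()); // prints 1
        System.out.println(sameBoxedObject(123));    // prints true
        System.out.println(equalBoxedValue(128));    // prints true
    }
}
```

The sketch deliberately does not print sameBoxedObject(128), because its result changes with the JVM flags discussed in the boxing section. Comparing wrapper values with equals sidesteps that dependency.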