Java Virtual Machine Support for Non-Java Languages

Introduction

The Java SE platform enables the development of applications with the following features:

The Java SE platform also provides robust support with respect to the following areas (and more):

Oracle's HotSpot JVM also offers the following tools and features:

The Java SE 7 platform enables non-Java languages to exploit the infrastructure and potential performance optimizations of the JVM. The key mechanism is the invokedynamic instruction, which simplifies the implementation of compilers and runtime systems for dynamically typed languages on the JVM.

Static and Dynamic Typing

A programming language is statically typed if it performs type checking at compile time. Type checking is the process of verifying that a program is type safe. A program is type safe if the arguments of all of its operations are of the correct type.

Java is a statically typed language. All type information for class and instance variables, method parameters, return values, and other variables is available when a program is compiled. The compiler for the Java programming language uses this type information to produce strongly typed bytecode, which can then be efficiently executed by the JVM at runtime.

The following example of a Hello World program demonstrates static typing. Every variable and parameter is declared with an explicit type (String[], String, and Date).

import java.util.Date;

public class HelloWorld {
    public static void main(String[] argv) {
        String hello = "Hello ";
        Date currDate = new Date();
        for (String a : argv) {
            System.out.println(hello + a);
            System.out.println("Today's date is: " + currDate);
        }
    }
}

A programming language is dynamically typed if it performs type checking at runtime. JavaScript and Ruby are examples of dynamically typed languages. These languages verify at runtime, rather than at compile time, that values in an application conform to expected types. These languages typically do not have any type information available at compile time. The type of an object can be determined only at runtime. Hence, in the past, it was difficult to efficiently implement them on the JVM.

The following is an example of the Hello World program written in the Ruby programming language:

#!/usr/bin/env ruby
require 'date'

hello = "Hello "
currDate = DateTime.now
ARGV.each do |a|
  puts hello + a
  puts "Date and time: " + currDate.to_s
end

Note that every name is introduced without a type declaration. Also, the main program is not located inside a holder type (such as the Java class HelloWorld). The Ruby equivalent of the Java for loop is the each method, which belongs to the dynamic type of the variable ARGV. The body of the loop is contained in a block called a closure, a common feature in dynamic languages.

Statically Typed Languages Are Not Necessarily Strongly Typed Languages

A programming language that features strong typing specifies restrictions on the types of values supplied to its operations. If a computer language implements strong typing, it prevents the execution of an operation if its arguments have the wrong type. Conversely, a language that features weak typing would implicitly convert (or cast) arguments of an operation if those arguments have wrong or incompatible types.

Statically typed programming languages can employ strong typing or weak typing. Similarly, dynamically typed languages can also apply strong typing or weak typing. For example, the Ruby programming language is dynamically typed and strongly typed. Once a variable has been initialized with a value of some type, the Ruby programming language will not implicitly convert the variable into another data type. The Ruby programming language would not allow the following:

a = "40"
b = a + 2

In this example, the Ruby programming language will not implicitly cast the number 2, which has a Fixnum type (Integer in Ruby 2.4 and later), to a string. It raises a TypeError instead.
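For contrast, a statically typed language can still apply an implicit conversion to the analogous expression. The following minimal sketch (the class name ConcatDemo and the method concat are chosen only for illustration) shows that Java's string concatenation operator converts the int operand to a String rather than rejecting it:

```java
class ConcatDemo {
    // With a String on one side, Java's + converts the int operand to a String.
    static String concat() {
        String a = "40";
        return a + 2;
    }

    public static void main(String[] args) {
        System.out.println(concat()); // prints 402
    }
}
```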

The Challenge of Compiling Dynamically Typed Languages

Consider the following dynamically typed method, addtwo, that adds any two numbers (which can be of any numeric type) and returns the sum:

def addtwo(a, b)
  a + b
end

Suppose your organization is implementing a compiler and runtime system for the programming language in which the method addtwo is written. In a strongly typed language, whether typed statically or dynamically, the behavior of + (the addition operator) depends on the types of the operands. A compiler for a statically-typed language chooses which implementation of + is appropriate based on the static types of a and b. For example, a Java compiler implements + with the iadd JVM instruction if the types of a and b are int. When the operand types are not statically known, the addition operator has to be compiled to a method call instead, because the JVM's iadd instruction requires statically known operand types.

In contrast, a compiler for a dynamically-typed language must defer the choice until runtime. The statement a + b is compiled as the method call +(a, b), where + is the method name. (Note that a method named + is permitted in the JVM but not in the Java programming language.) Suppose then that the runtime system for the dynamically-typed language is able to identify that a and b are variables of integer type. The runtime system would prefer to call an implementation of + that is specialized for integer types rather than arbitrary object types.

The challenge of compiling dynamically typed languages is how to implement a runtime system that can choose the most appropriate implementation of a method or function after the program has been compiled. Treating all variables as objects of Object type would not work efficiently; the Object class does not contain a method named +.
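To make the cost concrete, the following is a minimal sketch of the approach a runtime system might fall back on without custom linkage: every operand is passed as Object, and the implementation of + is chosen by inspecting the operand types on every call. The class DynamicAdd and its add method are hypothetical and serve only as an illustration.

```java
class DynamicAdd {
    // Chooses an implementation of + by inspecting operand types at run time.
    static Object add(Object a, Object b) {
        if (a instanceof Integer && b instanceof Integer) {
            // Integer-specialized path.
            return ((Integer) a) + ((Integer) b);
        }
        if (a instanceof Number && b instanceof Number) {
            // Generic numeric path: fall back to double arithmetic.
            return ((Number) a).doubleValue() + ((Number) b).doubleValue();
        }
        throw new IllegalArgumentException("unsupported operand types");
    }

    public static void main(String[] args) {
        System.out.println(add(40, 2));  // prints 42
        System.out.println(add(1.5, 2)); // prints 3.5
    }
}
```

The type tests are repeated on each call, even when a given call site only ever sees Integer operands.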

Java SE 7 introduces the invokedynamic instruction that enables the runtime system to customize the linkage between a call site and a method implementation. In this example, the invokedynamic call site is +. An invokedynamic call site is linked to a method by means of a bootstrap method, which is a method specified by the compiler for the dynamically-typed language that is called once by the JVM to link the site. Assuming the compiler emitted an invokedynamic instruction that invokes +, and assuming that the runtime system knows about the method adder(Integer,Integer), the runtime can link the invokedynamic call site to the adder method as follows:

IntegerOps.java

class IntegerOps {

  public static Integer adder(Integer x, Integer y) {
    return x + y;
  }
}

Example.java

import java.util.*;
import java.lang.invoke.*;
import static java.lang.invoke.MethodType.*;
import static java.lang.invoke.MethodHandles.*;

class Example {

  public static CallSite mybsm(
    MethodHandles.Lookup callerClass, String dynMethodName, MethodType dynMethodType)
    throws Throwable {

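    // Look up IntegerOps.adder(Integer, Integer). findStatic takes the
    // declaring class and the simple method name as separate arguments.
    MethodHandle mh =
      callerClass.findStatic(
        IntegerOps.class,
        "adder",
        MethodType.methodType(Integer.class, Integer.class, Integer.class));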

    if (!dynMethodType.equals(mh.type())) {
      mh = mh.asType(dynMethodType);
    }

    return new ConstantCallSite(mh);
  }
}

In this example, the IntegerOps class belongs to the library that accompanies the dynamic language's runtime system.

The method Example.mybsm is a bootstrap method that links the invokedynamic call site to the adder method.

The object callerClass is a lookup object, which is a factory for creating method handles.

The method MethodHandles.Lookup.findStatic (called from the callerClass lookup object) creates a static method handle for the method adder.

Note: This bootstrap method links an invokedynamic call site only to the code defined in the adder method, and it assumes that the arguments given to the invokedynamic call site will be Integer objects. A bootstrap method requires additional code to properly link invokedynamic call sites to the appropriate code to execute if the parameters of the bootstrap method (in this example, callerClass, dynMethodName, and dynMethodType) vary.

The classes java.lang.invoke.MethodHandles and java.lang.invoke.MethodHandle contain various methods that create method handles based on existing method handles. This example calls the method asType if the method type of the method handle mh does not match the method type specified by the parameter dynMethodType. This enables the bootstrap method to link invokedynamic call sites to Java methods whose method types do not exactly match.

The ConstantCallSite instance returned by the bootstrap method represents a call site to be associated with a distinct invokedynamic instruction. The target for a ConstantCallSite instance is permanent and can never be changed. In this case, there is only one Java method, adder, that is a candidate for executing the call site. Note that the target does not have to be a Java method. If several such methods were available to the runtime system, each handling different argument types, the bootstrap method mybsm could dynamically select the correct one based on the dynMethodType argument.
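A bootstrap method that selects among several targets could look roughly like the following sketch. The class SelectingBootstrap and the methods addIntegers, addStrings, and link are hypothetical. Because the Java language has no syntax for emitting an invokedynamic instruction with a user-supplied bootstrap method, the sketch calls the bootstrap method directly and invokes the resulting call site through dynamicInvoker.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class SelectingBootstrap {
    // Hypothetical runtime library with two implementations of +.
    public static Integer addIntegers(Integer x, Integer y) { return x + y; }
    public static String addStrings(String x, String y) { return x + y; }

    // Bootstrap method: picks a target from the call site's first parameter type.
    public static CallSite bsm(MethodHandles.Lookup lookup, String name, MethodType type)
            throws ReflectiveOperationException {
        boolean strings = type.parameterType(0) == String.class;
        Class<?> operand = strings ? String.class : Integer.class;
        String target = strings ? "addStrings" : "addIntegers";
        MethodHandle mh = lookup.findStatic(SelectingBootstrap.class, target,
                MethodType.methodType(operand, operand, operand));
        return new ConstantCallSite(mh.asType(type));
    }

    // Simulates linking and then invoking a call site named + with the given type.
    static Object link(MethodType type, Object a, Object b) throws Throwable {
        CallSite site = bsm(MethodHandles.lookup(), "+", type);
        return site.dynamicInvoker().invoke(a, b);
    }

    public static void main(String[] args) throws Throwable {
        MethodType ints = MethodType.methodType(Integer.class, Integer.class, Integer.class);
        MethodType strs = MethodType.methodType(String.class, String.class, String.class);
        System.out.println(link(ints, 40, 2));     // prints 42
        System.out.println(link(strs, "40", "2")); // prints 402
    }
}
```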

The invokedynamic Instruction

The invokedynamic instruction simplifies and potentially improves implementations of compilers and runtime systems for dynamic languages on the JVM. The invokedynamic instruction does this by allowing the language implementer to define custom linkage behavior. This contrasts with other JVM instructions such as invokevirtual, in which linkage behavior specific to Java classes and interfaces is hard-wired by the JVM.

Each instance of an invokedynamic instruction is called a dynamic call site. A dynamic call site is originally in an unlinked state, which means that there is no method specified for the call site to invoke. As previously mentioned, a dynamic call site is linked to a method by means of a bootstrap method. A dynamic call site's bootstrap method is a method specified by the compiler for the dynamically-typed language that is called once by the JVM to link the site. The object returned from the bootstrap method permanently determines the call site's behavior.

The invokedynamic instruction contains a constant pool index (in the same format as for the other invoke instructions). This constant pool index references a CONSTANT_InvokeDynamic entry. This entry specifies the bootstrap method (a CONSTANT_MethodHandle entry), the name of the dynamically linked method, and the argument types and return type of the call to the dynamically linked method.

The following is an example of an invokedynamic instruction. In this example, the runtime system links the dynamic call site specified by this invokedynamic instruction (which is +, the addition operator) to the IntegerOps.adder method by using the bootstrap method Example.mybsm. The methods adder and mybsm are defined in the section The Challenge of Compiling Dynamically Typed Languages (line breaks have been added for clarity):

invokedynamic   InvokeDynamic
  REF_invokeStatic:
    Example.mybsm:
      "(Ljava/lang/invoke/MethodHandles/Lookup;
        Ljava/lang/String;
        Ljava/lang/invoke/MethodType;)
      Ljava/lang/invoke/CallSite;":
    +:
      "(Ljava/lang/Integer;
        Ljava/lang/Integer;)
      Ljava/lang/Integer;";

Note: The bytecode examples in these sections use the syntax of the ASM Java bytecode manipulation and analysis framework.
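The quoted strings in the listing are method descriptors. As a small illustration of how such a descriptor relates to the MethodType object that a bootstrap method receives, the following sketch converts between the two forms. The class DescriptorDemo and its parse method are hypothetical; passing null as the class loader makes fromMethodDescriptorString use the system class loader.

```java
import java.lang.invoke.MethodType;

class DescriptorDemo {
    // Converts a JVM method descriptor string into a MethodType.
    static MethodType parse(String descriptor) {
        return MethodType.fromMethodDescriptorString(descriptor, null);
    }

    public static void main(String[] args) {
        MethodType t = parse("(Ljava/lang/Integer;Ljava/lang/Integer;)Ljava/lang/Integer;");
        // The parsed type equals the one built from class literals.
        System.out.println(t.equals(
                MethodType.methodType(Integer.class, Integer.class, Integer.class)));
        // Converting back yields the original descriptor string.
        System.out.println(t.toMethodDescriptorString());
    }
}
```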

Invoking a dynamically linked method with the invokedynamic instruction involves the following steps:

  1. Defining the Bootstrap Method
  2. Specifying Constant Pool Entries
  3. Using the invokedynamic Instruction

1. Defining the Bootstrap Method

At runtime, when the JVM first encounters an invokedynamic instruction, it calls the bootstrap method. This method links the name specified by the invokedynamic instruction with the code that should be executed (the target method), which is referenced by a method handle. If the JVM executes the same invokedynamic instruction again, it does not call the bootstrap method; it automatically calls the linked method handle.
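The link-once behavior described above can be imitated in plain Java. The following is only a simulation under stated assumptions: the class LinkOnceDemo, its bootstrapCalls counter, and the execute method are hypothetical stand-ins for what the JVM does internally, because the Java language has no syntax for emitting an invokedynamic instruction with a user-supplied bootstrap method.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class LinkOnceDemo {
    static int bootstrapCalls = 0; // counts how often the bootstrap method runs
    static CallSite linked;        // null while the simulated call site is unlinked

    public static Integer adder(Integer x, Integer y) { return x + y; }

    static CallSite bsm(MethodHandles.Lookup lookup, String name, MethodType type)
            throws ReflectiveOperationException {
        bootstrapCalls++;
        MethodHandle mh = lookup.findStatic(LinkOnceDemo.class, "adder", type);
        return new ConstantCallSite(mh);
    }

    // Stands in for executing one invokedynamic instruction.
    static Object execute(Integer a, Integer b) throws Throwable {
        if (linked == null) { // first execution: link through the bootstrap method
            MethodType type =
                    MethodType.methodType(Integer.class, Integer.class, Integer.class);
            linked = bsm(MethodHandles.lookup(), "+", type);
        }
        // Later executions go straight to the linked method handle.
        return linked.dynamicInvoker().invoke(a, b);
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(execute(40, 2)); // prints 42
        System.out.println(execute(1, 2));  // prints 3
        System.out.println(bootstrapCalls); // prints 1
    }
}
```

The sketch ignores concurrency, which the JVM handles itself for real call sites.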

The bootstrap method's return type must be java.lang.invoke.CallSite. A CallSite object represents the linked state of an invokedynamic instruction and the method handle to which it is linked.

The bootstrap method takes three or more parameters:

  1. A MethodHandles.Lookup object, which is a factory for creating method handles in the context of the invokedynamic instruction.
  2. A String object, the method name mentioned in the dynamic call site.
  3. A MethodType object, the resolved type signature of the dynamic call site.
  4. Optionally, one or more additional static arguments to the invokedynamic instruction. These arguments, drawn from the constant pool, are intended to help language implementers safely and compactly encode additional metadata useful to the bootstrap method. In principle, the name and extra arguments are redundant since each call site could be given its own unique bootstrap method. However, such a practice is likely to produce large class files and constant pools.

See the section The Challenge of Compiling Dynamically Typed Languages for an example of a bootstrap method.
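As an illustration of the optional fourth parameter, the following sketch shows a bootstrap method that takes one extra static argument, a String naming the implementation to link. The class StaticArgBootstrap, the methods adder, multiplier, and call, and the use of the extra argument as a method name are all assumptions made for this example. In a class file, the extra argument would come from the constant pool; here the bootstrap method is called directly from Java code.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class StaticArgBootstrap {
    public static Integer adder(Integer x, Integer y) { return x + y; }
    public static Integer multiplier(Integer x, Integer y) { return x * y; }

    // The fourth parameter is an extra static argument (hypothetical usage:
    // the name of the static method that should become the call site target).
    public static CallSite bsm(MethodHandles.Lookup lookup, String name, MethodType type,
                               String implName) throws ReflectiveOperationException {
        MethodHandle mh = lookup.findStatic(StaticArgBootstrap.class, implName, type);
        return new ConstantCallSite(mh);
    }

    // Simulates linking a call site with the given static argument, then invoking it.
    static Object call(String implName, Integer a, Integer b) throws Throwable {
        MethodType type = MethodType.methodType(Integer.class, Integer.class, Integer.class);
        CallSite site = bsm(MethodHandles.lookup(), "op", type, implName);
        return site.dynamicInvoker().invoke(a, b);
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(call("adder", 40, 2));     // prints 42
        System.out.println(call("multiplier", 6, 7)); // prints 42
    }
}
```

One bootstrap method serves both call sites, which is the compactness benefit the list item describes.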

2. Specifying Constant Pool Entries

As mentioned previously, the invokedynamic instruction contains a reference to an entry in the constant pool with the tag CONSTANT_InvokeDynamic. This entry contains references to other entries in the constant pool and references to attributes. This section briefly describes constant pool entries used by the invokedynamic instruction. For more information, see the java.lang.invoke package documentation and The Java Virtual Machine Specification.

Example Constant Pool

The following is an excerpt from the constant pool for the class Example, which contains the bootstrap method Example.mybsm that links the method + with the Java method adder:

    class #159; // #47
    Utf8 "adder"; // #83
    Utf8 "(Ljava/lang/Integer;Ljava/lang/Integer;)Ljava/lang/Integer;"; // #84
    Utf8 "mybsm"; // #87
    Utf8 "(Ljava/lang/invoke/MethodHandles/Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;)
      Ljava/lang/invoke/CallSite;"; // #88
    Utf8 "Example"; // #159
    Utf8 "+"; // #166

    // ...

    NameAndType #83 #84; // #228
    Method #47 #228; // #229
    MethodHandle 6b #229; // #230
    NameAndType #87 #88; // #231
    Method #47 #231; // #232
    MethodHandle 6b #232; // #233
    NameAndType #166 #84; // #234
    Utf8 "BootstrapMethods"; // #235
    InvokeDynamic 0s #234; // #236

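The constant pool entry for the invokedynamic instruction in this example (entry #236) contains three values: the CONSTANT_InvokeDynamic tag, the unsigned short value 0, and the constant pool index #234, which refers to the CONSTANT_NameAndType entry holding the method name + and its descriptor.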

The value 0 refers to the first bootstrap method specifier in the array of specifiers stored in the BootstrapMethods attribute. Bootstrap method specifiers are not in the constant pool table; they are contained in this separate array of specifiers. Each bootstrap method specifier contains an index to a CONSTANT_MethodHandle constant pool entry, which is the bootstrap method itself.
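The layout of such an entry can be sketched as a small decoder. The JVM specification defines a CONSTANT_InvokeDynamic entry as a one-byte tag with the value 18, followed by a two-byte index into the array of bootstrap method specifiers and a two-byte constant pool index of a NameAndType entry. The class InvokeDynamicEntry and the hand-built byte array below are hypothetical and only mirror the entry "InvokeDynamic 0s #234" from the excerpt.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class InvokeDynamicEntry {
    static final int CONSTANT_INVOKE_DYNAMIC = 18; // tag value from the JVM specification

    final int bootstrapMethodAttrIndex; // index into the bootstrap method specifier array
    final int nameAndTypeIndex;         // constant pool index of a NameAndType entry

    InvokeDynamicEntry(int bootstrapMethodAttrIndex, int nameAndTypeIndex) {
        this.bootstrapMethodAttrIndex = bootstrapMethodAttrIndex;
        this.nameAndTypeIndex = nameAndTypeIndex;
    }

    // Decodes one entry: a tag byte followed by two big-endian unsigned shorts.
    static InvokeDynamicEntry read(byte[] raw) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw));
        int tag = in.readUnsignedByte();
        if (tag != CONSTANT_INVOKE_DYNAMIC) {
            throw new IOException("unexpected tag " + tag);
        }
        int bsmIndex = in.readUnsignedShort();
        int natIndex = in.readUnsignedShort();
        return new InvokeDynamicEntry(bsmIndex, natIndex);
    }

    public static void main(String[] args) throws IOException {
        // Hand-built bytes: tag 18, specifier index 0, NameAndType index 234 (0x00EA).
        byte[] raw = {18, 0x00, 0x00, 0x00, (byte) 0xEA};
        InvokeDynamicEntry e = read(raw);
        System.out.println(e.bootstrapMethodAttrIndex + " #" + e.nameAndTypeIndex); // prints 0 #234
    }
}
```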

The following is an excerpt from the same constant pool that shows the BootstrapMethods attribute, which contains the array of bootstrap method specifiers:

  [3] { // Attributes

    // ...

    Attr(#235, 6) { // BootstrapMethods at 0x0F63
      [1] { // bootstrap_methods
        {  //  bootstrap_method
          #233; // bootstrap_method_ref
          [0] { // bootstrap_arguments
          }  //  bootstrap_arguments
        }  //  bootstrap_method
      }
    } // end BootstrapMethods
  } // Attributes

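The constant pool entry for the method handle of the bootstrap method mybsm (entry #233) contains three values: the CONSTANT_MethodHandle tag, the unsigned byte value 6, and the constant pool index #232, which refers to the Method entry for Example.mybsm.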

The value 6 is the subtag REF_invokeStatic. See the next section, 3. Using the invokedynamic Instruction, for more information about this subtag.
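The same subtag is visible from Java code. The following sketch assumes Java SE 8 or later, where the MethodHandleInfo interface and Lookup.revealDirect are available; it looks up a static method handle and reads back its reference kind. The class ReferenceKindDemo and the method kindOfAdder are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandleInfo;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class ReferenceKindDemo {
    public static Integer adder(Integer x, Integer y) { return x + y; }

    // Returns the reference kind of a direct method handle for adder.
    static int kindOfAdder() throws ReflectiveOperationException {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodHandle mh = lookup.findStatic(ReferenceKindDemo.class, "adder",
                MethodType.methodType(Integer.class, Integer.class, Integer.class));
        // revealDirect cracks a direct method handle into its symbolic parts.
        return lookup.revealDirect(mh).getReferenceKind();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        int kind = kindOfAdder();
        // Prints the numeric kind and its name: 6 invokeStatic
        System.out.println(kind + " " + MethodHandleInfo.referenceKindToString(kind));
    }
}
```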

3. Using the invokedynamic Instruction

The following bytecode uses the invokedynamic instruction to call the bootstrap method mybsm, which links the dynamic call site (+, the addition operator) to the method adder. This example uses the + method to add the numbers 40 and 2 (line breaks have been inserted for clarity):

bipush          40;
invokestatic    Method java/lang/Integer.valueOf:"(I)Ljava/lang/Integer;";
iconst_2;
invokestatic    Method java/lang/Integer.valueOf:"(I)Ljava/lang/Integer;";
invokedynamic   InvokeDynamic
  REF_invokeStatic:
    Example.mybsm:
      "(Ljava/lang/invoke/MethodHandles/Lookup;
        Ljava/lang/String;
        Ljava/lang/invoke/MethodType;)
      Ljava/lang/invoke/CallSite;":
    +:
      "(Ljava/lang/Integer;
        Ljava/lang/Integer;)
      Ljava/lang/Integer;";

The first four instructions put the integers 40 and 2 on the stack and box them in the java.lang.Integer wrapper type. The fifth instruction invokes a dynamic method. This instruction refers to a constant pool entry with a CONSTANT_InvokeDynamic tag:

REF_invokeStatic:
  Example.mybsm:
    "(Ljava/lang/invoke/MethodHandles/Lookup;
      Ljava/lang/String;
      Ljava/lang/invoke/MethodType;)
    Ljava/lang/invoke/CallSite;":
  +:
    "(Ljava/lang/Integer;
      Ljava/lang/Integer;)
    Ljava/lang/Integer;";

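Four bytes follow the CONSTANT_InvokeDynamic tag in this entry: the first two bytes are an index that selects a bootstrap method specifier (here, the one for Example.mybsm) in the BootstrapMethods attribute, and the next two bytes are the constant pool index of a CONSTANT_NameAndType entry that supplies the method name (+) and the method descriptor.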

In this example, the dynamic call site is presented with boxed integer values, which exactly match the type of the eventual target, the adder method. In practice, the argument and return types do not need to exactly match. For example, the invokedynamic instruction could pass either or both of its operands on the JVM stack as primitive int values. Either or both operands could also be untyped Object values. The invokedynamic instruction could also receive its result as a primitive int value, or an untyped Object value. In any case, the dynMethodType argument to mybsm will accurately describe the method type required by the invokedynamic instruction.

Independently, the adder method could also have been given primitive or untyped arguments or return values. The bootstrap method is responsible for making up any difference between the dynMethodType and the type of the adder method. As shown in the code, this is easily done with an asType call on the target method.
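The adaptations described above can be tried directly with asType. The following sketch, in which the class AsTypeDemo and its helper methods are hypothetical, adapts a handle for adder(Integer, Integer) first to a call site that uses primitive int values and then to one that uses untyped Object values:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class AsTypeDemo {
    public static Integer adder(Integer x, Integer y) { return x + y; }

    static MethodHandle adderHandle() throws ReflectiveOperationException {
        return MethodHandles.lookup().findStatic(AsTypeDemo.class, "adder",
                MethodType.methodType(Integer.class, Integer.class, Integer.class));
    }

    // Call site that passes and receives primitive int values:
    // asType inserts the boxing and unboxing conversions.
    static int addInts(int a, int b) throws Throwable {
        MethodHandle mh = adderHandle()
                .asType(MethodType.methodType(int.class, int.class, int.class));
        return (int) mh.invokeExact(a, b);
    }

    // Call site that passes and receives untyped Object values:
    // asType inserts casts from Object to Integer.
    static Object addObjects(Object a, Object b) throws Throwable {
        MethodHandle mh = adderHandle().asType(MethodType.genericMethodType(2));
        return (Object) mh.invokeExact(a, b);
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(addInts(40, 2));    // prints 42
        System.out.println(addObjects(40, 2)); // prints 42
    }
}
```

If an untyped argument is not actually an Integer, the cast inserted by asType fails with a ClassCastException when the handle is invoked.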

Copyright © 1993, 2011, Oracle and/or its affiliates. All rights reserved.