As part of the JVM series, this post explores the object model in the JVM.

All code in this post is based on jdk9-b94.

What does the new keyword actually do

When you create an object with new in Java, it helps to know what actually happens beneath that statement. In the interpreter, the _new method of InterpreterRuntime is invoked.

// interpreterRuntime.cpp

IRT_ENTRY(void, InterpreterRuntime::_new(JavaThread* thread, ConstantPool* pool, int index))
  Klass* k = pool->klass_at(index, CHECK);
  InstanceKlass* klass = InstanceKlass::cast(k);

  // Make sure we are not instantiating an abstract klass
  klass->check_valid_for_instantiation(true, CHECK);

  // Make sure klass is initialized
  klass->initialize(CHECK);

  // At this point the class may not be fully initialized
  // because of recursive initialization. If it is fully
  // initialized & has_finalized is not set, we rewrite
  // it into its fast version (Note: no locking is needed
  // here since this is an atomic byte write and can be
  // done more than once).
  //
  // Note: In case of classes with has_finalized we don't
  //       rewrite since that saves us an extra check in
  //       the fast version which then would call the
  //       slow version anyway (and do a call back into
  //       Java).
  //       If we have a breakpoint, then we don't rewrite
  //       because the _breakpoint bytecode would be lost.
  oop obj = klass->allocate_instance(CHECK);
  thread->set_vm_result(obj);
IRT_END

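The klass evidently takes most of the responsibility for allocating the instance. We will dig into it, but later on. After allocation, the constructor is called to initialize the raw object. For a plain new expression this is a separate invokespecial of the constructor; the reflective path below shows both steps in one place.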

// jvm.cpp

JVM_ENTRY(jobject, JVM_NewInstanceFromConstructor(JNIEnv *env, jobject c, jobjectArray args0))
  JVMWrapper("JVM_NewInstanceFromConstructor");
  oop constructor_mirror = JNIHandles::resolve(c);
  objArrayHandle args(THREAD, objArrayOop(JNIHandles::resolve(args0)));
  oop result = Reflection::invoke_constructor(constructor_mirror, args, CHECK_NULL);
  jobject res = JNIHandles::make_local(env, result);
  if (JvmtiExport::should_post_vm_object_alloc()) {
    JvmtiExport::post_vm_object_alloc(JavaThread::current(), result);
  }
  return res;
JVM_END

JVM_NewInstanceFromConstructor delegates to Reflection, so let's see what the latter does.

// reflection.cpp

oop Reflection::invoke_constructor(oop constructor_mirror, objArrayHandle args, TRAPS) {
  oop mirror             = java_lang_reflect_Constructor::clazz(constructor_mirror);
  int slot               = java_lang_reflect_Constructor::slot(constructor_mirror);
  bool override          = java_lang_reflect_Constructor::override(constructor_mirror) != 0;
  objArrayHandle ptypes(THREAD, objArrayOop(java_lang_reflect_Constructor::parameter_types(constructor_mirror)));

  InstanceKlass* klass = InstanceKlass::cast(java_lang_Class::as_Klass(mirror));
  Method* m = klass->method_with_idnum(slot);
  if (m == NULL) {
    THROW_MSG_0(vmSymbols::java_lang_InternalError(), "invoke");
  }
  methodHandle method(THREAD, m);
  assert(method->name() == vmSymbols::object_initializer_name(), "invalid constructor");

  // Make sure klass gets initialize
  klass->initialize(CHECK_NULL);

  // Create new instance (the receiver)
  klass->check_valid_for_instantiation(false, CHECK_NULL);
  Handle receiver = klass->allocate_instance_handle(CHECK_NULL);

  // Ignore result from call and return receiver
  invoke(klass, method, receiver, override, ptypes, T_VOID, args, false, CHECK_NULL);
  return receiver();
}

And Klass appears yet again. An oop is returned once the constructor has run; it is a managed pointer to an object.
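The two steps in invoke_constructor (allocate a raw receiver, then run the constructor on it) have a rough analogue in plain C++: operator new followed by placement new. The sketch below is only an analogy, not HotSpot code; Point, allocate_raw, construct_point and destroy_point are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for a Java class with a two-argument constructor.
struct Point {
  int x;
  int y;
  Point(int x_, int y_) : x(x_), y(y_) {}
};

// Step 1: obtain raw, unconstructed storage (compare allocate_instance_handle).
inline void* allocate_raw(std::size_t bytes) {
  return ::operator new(bytes);
}

// Step 2: run the constructor on the raw storage (compare invoke(...)).
inline Point* construct_point(void* raw, int x, int y) {
  assert(raw != nullptr);
  return new (raw) Point(x, y);
}

// Tear down in reverse order: destructor first, then release the storage.
inline void destroy_point(Point* p) {
  p->~Point();
  ::operator delete(p);
}
```

As in invoke_constructor, the pointer handed back after construction is the same one that came out of the allocation step; the constructor only fills in the storage.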

oop

oop is a type alias for oopDesc*, as we can see in oopsHierarchy.hpp.

// oopsHierarchy.hpp

typedef class oopDesc*                            oop;
typedef class   instanceOopDesc*            instanceOop;
typedef class   arrayOopDesc*                    arrayOop;
typedef class     objArrayOopDesc*            objArrayOop;
typedef class     typeArrayOopDesc*            typeArrayOop;

Here comes oopDesc, the top-level base class of the object classes. The {name}Desc classes describe the format of Java objects so that their fields can be accessed from C++. No virtual functions are allowed.

class oopDesc {
 private:
  volatile markOop _mark;
  union _metadata {
    Klass*      _klass;
    narrowKlass _compressed_klass;
  } _metadata;
  // remaining members omitted
};

So far, oop and Klass are linked together. Still, we want to figure out what these fields are for.

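  • _mark, the mark word, which holds GC state, locking state, the identity hash code and so on.
  • _metadata, a union holding the pointer to the object's Klass: either a full Klass* or, when compressed class pointers are used on a 64-bit JVM, a narrowKlass.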
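To make the header layout concrete, here is a minimal sketch of a two-word header with the same shape as oopDesc, plus one way a klass pointer could be squeezed into 32 bits. The Toy* names, encode_klass and decode_klass are hypothetical, and the (address - base) >> shift scheme is an assumption made for illustration rather than a quote of the HotSpot implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical klass type; aligned to 8 bytes so a shift of 3 loses no bits.
struct alignas(8) ToyKlass { const char* name; };

typedef std::uint32_t ToyNarrowKlass;

// Toy header: a mark word followed by a union, mirroring the shape of oopDesc.
struct ToyOopDesc {
  std::uintptr_t mark;
  union {
    ToyKlass*      klass;             // full-width pointer
    ToyNarrowKlass compressed_klass;  // 32-bit encoded pointer
  } metadata;
};

// Assumed encoding: distance from a base address, scaled down by the alignment.
inline ToyNarrowKlass encode_klass(const ToyKlass* k, std::uintptr_t base, int shift) {
  std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(k);
  assert(addr >= base);
  std::uintptr_t delta = addr - base;
  assert((delta & ((std::uintptr_t(1) << shift) - 1)) == 0);  // must be aligned
  assert((delta >> shift) <= UINT32_MAX);                     // must fit in 32 bits
  return static_cast<ToyNarrowKlass>(delta >> shift);
}

// Inverse of encode_klass: scale back up and add the base.
inline ToyKlass* decode_klass(ToyNarrowKlass n, std::uintptr_t base, int shift) {
  return reinterpret_cast<ToyKlass*>(base + (static_cast<std::uintptr_t>(n) << shift));
}
```

Only one member of the union is meaningful at a time, which is why the real header needs a VM-wide setting to decide how to read it.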

Other methods include:

  • inline void* oopDesc::field_base(int offset) const { return (void*)&((char*)this)[offset]; } returns the address that lies offset bytes from the start of the object.
  • inline jint* oopDesc::int_field_addr(int offset) const { return (jint*)field_base(offset); } returns the address of the int field at that offset.
  • inline jint oopDesc::int_field(int offset) const { return *int_field_addr(offset);} reads the int value stored there.
  • inline jint oopDesc::int_field_acquire(int offset) const { return OrderAccess::load_acquire(int_field_addr(offset));} reads the int field with acquire memory ordering; no lock is taken.
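The accessors above boil down to pointer arithmetic from the start of the object. The following sketch replays them on a toy object; ToyObject, its header and count members and int_field_put are hypothetical, and the acquire load uses the GCC/Clang __atomic_load_n builtin as a stand-in for OrderAccess::load_acquire, which is an assumption about the toolchain rather than what HotSpot does internally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct ToyObject {
  std::uintptr_t header[2];  // stand-in for _mark and _metadata
  std::int32_t   count;      // an "instance field" living after the header

  // Address that lies `offset` bytes from the start of the object.
  void* field_base(std::size_t offset) const {
    return (void*)&((char*)this)[offset];
  }

  // Same address, typed as a pointer to a 32-bit int.
  std::int32_t* int_field_addr(std::size_t offset) const {
    assert(offset % alignof(std::int32_t) == 0);
    return (std::int32_t*)field_base(offset);
  }

  // Plain load of the int field.
  std::int32_t int_field(std::size_t offset) const {
    return *int_field_addr(offset);
  }

  // Plain store to the int field.
  void int_field_put(std::size_t offset, std::int32_t value) {
    *int_field_addr(offset) = value;
  }

  // Load with acquire ordering: later reads cannot be reordered before it.
  // No lock is involved. (GCC/Clang builtin, assumed available.)
  std::int32_t int_field_acquire(std::size_t offset) const {
    return __atomic_load_n(int_field_addr(offset), __ATOMIC_ACQUIRE);
  }
};
```

In this toy the offset comes from offsetof(ToyObject, count); in the JVM the offsets are computed when the class is laid out and are stored with the class metadata.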

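The oop hierarchy follows the indentation of the typedefs above: instanceOopDesc and arrayOopDesc derive from oopDesc, while objArrayOopDesc and typeArrayOopDesc derive from arrayOopDesc.

As for the layout in memory, an object starts with its header (_mark followed by _metadata), and its fields follow the header.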

Klass

A Klass provides:

  1. the language-level class object (method dictionary etc.)
  2. the VM dispatch behavior for the object

Both functions are combined into one C++ class.

Remember that InterpreterRuntime::_new calls allocate_instance on InstanceKlass to obtain the oop? (invoke_constructor goes through allocate_instance_handle, which wraps the same call in a handle.)

// instanceKlass.cpp

instanceOop InstanceKlass::allocate_instance(TRAPS) {
  bool has_finalizer_flag = has_finalizer(); // Query before possible GC
  int size = size_helper();  // Query before forming handle.

  instanceOop i;

  i = (instanceOop)CollectedHeap::obj_allocate(this, size, CHECK_NULL);
  if (has_finalizer_flag && !RegisterFinalizersAtInit) {
    i = register_finalizer(i, CHECK_NULL);
  }
  return i;
}

As its name implies, CollectedHeap allocates the object on the heap.

A CollectedHeap is an implementation of a java heap for HotSpot. This is an abstract class: there may be many different kinds of heaps. This class defines the functions that a heap must implement, and contains infrastructure common to all heaps.

// collectedHeap.inline.hpp

oop CollectedHeap::obj_allocate(Klass* klass, int size, TRAPS) {
  debug_only(check_for_valid_allocation_state());
  assert(!Universe::heap()->is_gc_active(), "Allocation during gc not allowed");
  assert(size >= 0, "int won't convert to size_t");
  HeapWord* obj = common_mem_allocate_init(klass, size, CHECK_NULL);
  post_allocation_setup_obj(klass, obj, size);
  NOT_PRODUCT(Universe::heap()->check_for_bad_heap_word_value(obj, size));
  return (oop)obj;
}
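obj_allocate does two things: it gets zeroed memory of the requested size, and it then installs the header so the memory becomes a valid object. A minimal sketch of that sequence with a bump-pointer heap might look like the following. ToyHeap, ToyHeader, ToyKlass, HeapWordT and the constants are hypothetical, the mark value 1 is a placeholder, and returning nullptr on exhaustion is a simplification: this toy neither triggers a GC nor throws an error.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>

struct ToyKlass { const char* name; };

// Two-word header, standing in for _mark and _metadata.
struct ToyHeader {
  std::uintptr_t mark;
  ToyKlass*      klass;
};

typedef std::uintptr_t HeapWordT;                 // hypothetical word type
const std::size_t    kToyHeapWords  = 1024;       // capacity of the toy heap
const std::size_t    kHeaderWords   = sizeof(ToyHeader) / sizeof(HeapWordT);
const std::uintptr_t kPrototypeMark = 1;          // placeholder initial mark

class ToyHeap {
 public:
  ToyHeap() : _top(0) {}

  // size_in_words covers the whole object, header included.
  ToyHeader* obj_allocate(ToyKlass* klass, int size_in_words) {
    assert(size_in_words >= 0);  // mirrors "int won't convert to size_t"
    std::size_t words = static_cast<std::size_t>(size_in_words);
    if (words < kHeaderWords || words > kToyHeapWords - _top) {
      return nullptr;  // simplification: no GC, no exception
    }
    // Step 1: bump the pointer and zero the memory.
    HeapWordT* mem = &_storage[_top];
    _top += words;
    std::memset(mem, 0, words * sizeof(HeapWordT));
    // Step 2: install the header (mark word and klass pointer).
    return new (mem) ToyHeader{kPrototypeMark, klass};
  }

 private:
  HeapWordT   _storage[kToyHeapWords];
  std::size_t _top;  // index of the next free word
};
```

Installing the klass pointer is what turns a zeroed block of words into something the rest of the VM can interpret as an object.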

Why bother splitting an object's representation into oop and Klass? As the comments in the HotSpot source explain,

One reason for the oop/klass dichotomy in the implementation is that we don’t want a C++ vtbl pointer in every object. Thus, normal oops don’t have any virtual functions. Instead, they forward all “virtual” functions to their klass, which does have a vtbl and does the C++ dispatch depending on the object’s actual type.
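The forwarding described in that comment can be sketched in a few lines. In this toy version the object type has no virtual functions and simply hands "virtual" questions to its klass, which does the C++ dispatch. ToyOop, ToyKlass, ToyInstanceKlass and ToyArrayKlass are hypothetical names, and the size computations are invented for illustration.

```cpp
#include <cassert>
#include <type_traits>

struct ToyOop;

// The klass side carries the vtbl pointer and the virtual functions.
struct ToyKlass {
  virtual ~ToyKlass() {}
  virtual int oop_size(const ToyOop* obj) const = 0;
};

// The object side: no virtual functions, so no per-object vtbl pointer.
struct ToyOop {
  ToyKlass* klass;
  int       length;  // only meaningful for array-like objects

  // Looks like a method of the object, but the klass decides the answer.
  int size() const {
    assert(klass != nullptr);
    return klass->oop_size(this);
  }
};

// Plain instances: the size is fixed per class.
struct ToyInstanceKlass : ToyKlass {
  int instance_words;
  explicit ToyInstanceKlass(int words) : instance_words(words) {}
  int oop_size(const ToyOop*) const override { return instance_words; }
};

// Arrays: the size depends on the length stored in the object itself.
struct ToyArrayKlass : ToyKlass {
  int header_words;
  explicit ToyArrayKlass(int words) : header_words(words) {}
  int oop_size(const ToyOop* obj) const override {
    return header_words + obj->length;
  }
};

static_assert(!std::is_polymorphic<ToyOop>::value, "objects carry no vtbl pointer");
static_assert(std::is_polymorphic<ToyKlass>::value, "klasses do the C++ dispatch");
```

The cost of the vtbl pointer is paid once per class rather than once per object, which matters when a heap holds millions of small objects.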

Reference