Java Hiccups

Wed Nov 07 14:01:00 EST 2018

Tags: java
  1. Java Hiccups
  2. Bitwise Operators
  3. Java Grab Bag 2
  4. Java Travelogue: The Care and Feeding of Locales
  5. More Notes on Filesystem and Charset Portability

To take a break from the doom-and-gloom of my last post, I figured it'd be good to dust off a post idea I've had in my drafts for a while: common hiccups that Java developers - particularly those coming from a Domino background - run into. This is sort of a grab bag of non-obvious concepts that are easy to assume incorrectly about, whether because of the way other languages work or the behavior of the lotus.domino API specifically.

So, roughly in order of complexity:

import Is Just For Cleanliness

In many languages, in particular C/C++/Objective-C, the natural equivalent of Java's import statement has a massive effect, physically grafting files into your source. In Java, though, import is really just for developer convenience. At runtime, there's no difference between having written this:

1
2
3
4
5
import java.util.ArrayList;
import java.util.List;

/* snip */
List<String> foo = new ArrayList<>();

...or this:

1
java.util.List<java.lang.String> foo = new java.util.ArrayList<>();

If you import a class but never use it in a file, it won't have any effect on the runtime behavior of the class. It's just used by the compiler to clarify what you mean when you use a base class name without reference.

Incidentally, as seen here, classes within the java.lang package (but not subpackages) are auto-imported, so it's as if each Java file has an invisible import java.lang.* at the top.

Compilation Doesn't Bake In Libraries

This is related, and is also an area where Java differs from some other environments. With C et al, you have the option to statically link referenced external libraries - which is to say, grab their contents at build time and put them into your compiled result such that they may as well be part of your program. Java doesn't do this: every time you reference a class or method, it's really just storing the equivalent of the string name of that class, which is then resolved at runtime.

This is why it's very easy to run into a ClassNotFoundException: you can compile code with some classes present on the class path, but then run it in a system where they're not present. The Java runtime doesn't pre-check whether all the required classes are available when it starts running, so you only find out when it hits that line of code.

Different Java-based environments deal with this in different ways. Standard (non-Domino) web apps deal with this by including dependency jars inside the WEB-INF/lib folder during the packaging phase of building. OSGi, XPages's framework of choice, has a whole dependency mechanism where you can specify bundles or packages with version ranges, in the hope of bringing some order to the chaos, with mixed success.

Primitives Are A Thing

Though the term "primitive" means a built-in data type generally, here I'm specifically talking about things like int and double. For historical and performance reasons, Java has a conceptual and practical distinction between the objects that you deal with most of the time and the primitive types used mostly for number storage. Namely: byte, char, short, int, long, float, double, and boolean. Unlike object references, there is no concept of a "null" value with these that would cause a NullPointerException. Referring to a variable with one of these types will always contain some value if the code compiles, even if it's just the 0/false default for an object property when not otherwise initialized.

Each of these types has a corresponding "boxed" object version, generally with the un-abbreviated name capitalized, such as Byte and Integer.

The distinction used to be harsher than it is today, thanks to autoboxing. Autoboxing is a compile-time behavior that will automatically convert between the primitive types and their object holders as necessary, allowing this type of code, which would otherwise be illegal:

1
2
Object i = 3;
int j = new Integer(4);

Autoboxing is mildly inefficient, so it's good to know that it exists, but you don't normally need to lose sleep over it.

All Object Variables Are Pointers

In a language like C++, there is a distinction between a variable that "is" an object vs. one that is a pointer to an object somewhere in memory. In Java, however, the former doesn't exist: an object variable is only ever a pointer. This has a couple implications. For one, this code only deals one object:

1
2
3
4
5
SomeClass foo = new SomeClass();
SomeClass bar = foo;
bar.setName("hi");
foo.setName("hello");
bar.getName(); // Will be "hello"

This is also why Java is picky about not referencing object variables until they've been initialized to at least something, so this generates a compile-time error:

1
2
Object foo;
foo.getClass();

Unfortunately, unlike some languages, Java has no language-level support for enforcing the distinction between a null object reference and a non-null one, which is why NullPointerExceptions are so prevalent.

Another implication of this leads into its own hiccup common to LotusScript programmers:

Strictly Speaking, All Method Arguments Are "By Value", But...

All method parameters in Java are "by value" in the LotusScript sense, the fact that all object variables are pointers means that the "value" you're passing to the method for an object parameter is always a reference. Java has no mechanism to pass a reference to a primitive type, nor does it have a mechanism to implicitly duplicate an object when passing it to a method.

Not only is this a bit conceptually confusing at first, but it's also a potential trap for bad programming practices. It's very easy to write a method that performs modifications on objects passed in as parameters, and this is often the right thing to do. However, since the language doesn't have any syntax mechanism for broadcasting this behavior, it's up to you as the programmer to either write the method name in such a way that it's obvious what's going to happen or clearly state it in the documentation if it's something that's going to be used outside the current file.

Casting Objects Doesn't Do Anything

By "casting", I'm referring to something like this:

1
RichTextItem body = (RichTextItem)doc.getFirstItem("SomeItem");

The (RichTextItem) is a cast, and it's yet another area that diverges from some other languages. What casting an object in Java means is that you're going to refer to an object by a different class or interface than the one it's been previously referred to as. It has some runtime implications, but the thing to keep in mind is that it's about choosing the name for something that exists as opposed to changing an object into a different class.

So, for example, the RichTextItem idiom above exists because the Document#getFirstItem method returns an object that's called a Item, but, when the item in the document is rich text, it will actually return a RichTextItem object. RichTextItem is a subclass of Item, and so it's legal to refer to such an object either as RichTextItem, Item, Base (the common interface for all Notes objects), or Object (the common superclass of all objects). In a situation like this, you have to cast the object because you're going to refer to it as a more-specific type of object than the one the method says it returns.

If you do this and the object is not actually of the type you're trying to cast it to (in this case, if it's a plain text item or MIME, most commonly), you'll end up with a ClassCastException, because the cast is enforced at runtime. But, success or fail, the cast will not actually affect the object itself in any way - it will continue on being whatever it was already, regardless of name.

Casting Primitives Does Do Something

For better or for worse, performing a cast on a primitive type does have the possibility of creating a new value. For example:

1
2
int foo = Integer.MAX_VALUE;
short bar = (short)foo;

Because int can hold more data than a short, this case creates a new value based on chopping off the highest-value bits of the internal binary representation of foo. (As a side note, because of the fun way computers deal with numbers, foo is 2147483647, while bar is -1.)

Normally, this behavior doesn't matter too much, since, if you have a method that takes, say, an int and you have a long, you can safely cast it down since it'll likely be a tiny value anyway. It's important to know that it can happen, though, and this behavior is very important when, for example, native C libraries that use unsigned values, which do not exist in Java as such.

Java Has Only A Limited Concept of Immutability

"Immutability" refers to the inability to change the value of an entity once it's created. It's come to the fore as a concept recently because working with immutable objects sidesteps a lot of issues with asynchronous programming. Java, unfortunately, doesn't really have any language-level support for immutable objects in the sense that, for example, Swift does.

Java throws a bit of a curve ball in this area with the final keyword, which means that a variable can't be reassigned after being first initialized. This means that you can't do things like this:

1
2
3
4
5
final int foo = 3;
foo = 4; // compiler error

final SomeClass bar = new SomeClass();
bar = new SomeClass(); // compiler error

This, on the other hand, is entirely legal:

1
2
3
final SomeClass foo = new SomeClass();
foo.setName("hi");
foo.setName("hello");

This is because the only thing blocked from changing here is the value of foo-the-reference, but the object it's referencing can be changed at will.

An object can be made effectively immutable, though, by means of making its outward-facing methods not change any of the internal state. This is used commonly for "value" classes, such as the aforementioned Integer. Though the language doesn't do anything to guarantee that the Integer class doesn't allow mutation, the class is written in such a way that it has no inlet for it.

Because of the value of immutable objects, they're used commonly in the core Java classes and in third-party libraries, particularly newer ones. However, since the language can't tell you if an object is immutable, you have to be on the lookout for whether a given method modifies the existing object in-place or returns a new object reflecting the change. This comes up frequently with Strings, which are immutable in Java. This is something I've seen commonly:

1
2
3
String foo = " hello ";
foo.trim();
System.out.println(foo);

That code will print " hello ", with the leading and trailing spaces (though without the quotes). This is because the String#trim method, like all "changing" methods on String, leaves the original value intact but returns a new String object reflecting the expected value.

This is just something you have to be on the lookout for, especially since this pattern isn't even consistently applied within the core Java classes. The Date class, for example, is infamously bad in a lot of ways, and one of those ways is that it has mutation methods.

Generics In Java Are Weird

A "generic" refers in this case to a class that is declared as being associated with one or more other types that can be defined after the fact. The prototypical example of this is a collection class, like List<String> foo. In this case, the List interface is generic and lets you specify the type of object you expect to find within it, in this case String.

Generics, unfortunately, were added after Java's initial release, and they bear the marks of it. Unlike languages like C++, Java generics are largely syntactic sugar, meant to replace things like:

1
String someString = (String)aListIKnowHasStrings.get(0);

...with this:

1
String someString = aListDeclaredWithStrings.get(0);

However, under the covers, a List only ever really knows it contains Objects, and the second form just transparently shims in a (String) cast at runtime. That's why you can do something like this:

1
((List<Object>)(List<?>)aListDeclaredWithStrings).add(new NotAStringObject());

That line will not only compile, but it will execute without issue at runtime. It's only later, when you try to extract the value to a String variable, that you'll hit a ClassCastException.

Some generic information is retained at runtime, depending on how it's used, but for the most part it's best to think of it as just a syntax nicety. This behavior is endless trouble, but something we have to live with.

Garbage Collection Is Automatic, But Resource Management Isn't Necessarily

This is one of the main things that bites Domino developers as they learn about Java. One of the early things they learn when switching from LotusScript to Java is that now you have to worry about the .recycle() method on your objects, or else you'll have trouble. This leads to two misapprehensions: that Java in general requires "recycling" for every object, and that recycling with Domino objects is about memory in the same way that a Java OutOfMemoryError is.

Unfortunately, the reason that recycle() exists at all requires delving into some nitty-gritty aspects of the Java environment, but I first want to reinforce that Java uses automatic garbage collection at all times to watch for and delete objects that are no longer used. That "no longer used" bit glosses over a bit, but take this as an example:

1
2
3
4
5
6
7
8
9
public void foo() {
  String a = "hello";
  String b = " there";
  return a + b;
}
public void bar() {
  String message = foo();
  System.out.println(message);
}

There are three objects in action here, but, by the time the code reaches the System.out.println line, a and b are no longer used and will be slated for automatic garbage collection. You as a programmer do not need to worry about them.

The lotus.domino objects, though, are trouble. I think it's best to not think of recycle() in terms of "memory" but instead think of the objects as "open resources", in the same way that you might open a network connection or a stream to a file on the filesystem. Unlike object memory, network resources are not necessarily automatically closed by Java - there are some affordances with syntax and the concept of "finalizers", but, in general, the responsibility for closing a resource lies with the programmer.

There are a few grab-bag notes to do with these objects:

  • Different lotus.domino objects refer to different kind of backing resources, which is why problems will sometimes manifest as complaints about memory (when they refer primarily to C-side structures in Domino's native memory, separate from Java) and sometimes as complaints about handles (database and document references, generally)
  • Recycling a "parent" object recycles all of its children, but the relationship is not always clear. Importantly, DateTime objects are children of the ancestor Session, even when you retrieve them from a Document, and so they can linger for a long time
  • Agents and XPages both mitigate and conceal the need for recycling by automatically closing the auto-generated Session(s) at the end of the agent execution or page request. In practice, you only really need to worry about recycling if you're, for example, looping over a large view

I may make another one of these posts in the future, and hopefully this goes a little way to clearing up some common misconceptions.

Commenter Photo

Dan Sickles - Wed Nov 07 18:40:08 EST 2018

Java (8) does have an Optional type but it's incomplete and a bit awkward compared to most of the referenced Option implementations. Also, there's no Either in Java for handling disjoint types (T OR Exception for example)

 

Commenter Photo

Jesse Gallagher - Wed Nov 07 18:42:44 EST 2018

Indeed. I'm a fan of using Optional when I can, but it's a shame that it wasn't there from JDK 1. I also like nullness annotations, but they're also not language-enforced, not to mention extremely balkanized.

New Comment