Computing Thoughts Bruce Eckel's Programming Blog

On Java 8 And The Concurrent Python Developer Retreat

For many years I’ve been getting requests for some kind of sequel to Thinking in Java, 4th Edition. Over two years ago, I finally decided to “pull together something quickly.” After all, how much time have I spent writing about the language? I should be getting pretty fast by now.

Self-delusion knows no bounds. No matter how many books I write, every one seems to take longer than the previous ones, not shorter. I hope this is because I’m getting more meticulous.

Another factor is that Java 8 is a dramatic departure from previous versions of Java. It has pulled a major rabbit out of a hat with the introduction of lambdas and functional programming—perhaps not as pure as you expect from a real functional language, but a huge step forward nonetheless. Along with CompletableFutures and Streams, Java 8 is a radical improvement in the experience of programming with Java.

I’ve rolled the book out very slowly; for the last couple of months it’s been in beta to make sure there were no glitches in the delivery or reading experience, but it is now officially released.

You can find it at www.OnJava8.com.

This book is far too large to publish as a single print volume, and my intent has always been to only publish it as an eBook. Color syntax highlighting for code listings is, alone, worth the cost of admission. Searchability, font resizing or text-to-voice for the vision-impaired, the fact you can always keep it with you—there are so many benefits to eBooks it’s hard to name them all.

Anyone buying this book needs a computer to run the programs and write code, and the eBook reads nicely on a computer (I was also surprised to discover it reads tolerably well on a phone). However, the best reading experience is on a tablet computer. Tablets are inexpensive enough you can now buy one for less than you’d pay for an equivalent print version of this book (which, note, does not exist). It’s much easier to read a tablet in bed (for example) than trying to manage the pages of a physical book, especially one this big. When working at your computer, you don’t have to hold the pages open when using a tablet at your side. It might feel different at first, but I think you’ll find the benefits far outweigh the discomfort of adapting.

I’ve done the research, and Google Play Books provides a very nice reading experience on every platform, including Linux and iOS devices. As an experiment, I’ve decided to try publishing exclusively through Google Play Books.

The Concurrent Python Developer Retreat

I will be attending GopherCon in Denver with my friend Luciano Ramalho (author of Fluent Python), and afterwards he’ll be coming with me to my home in Crested Butte, Colorado, where we’ll be holding the Concurrent Python Developer Retreat, July 16-19, 2017. The retreat will focus on all aspects of concurrency in Python, as part of the development work for my book project Concurrent Python. This will be a free ebook, available now and throughout development (it might also turn into a print book if the interest is there). The target audience is people who know Python but don’t know anything about concurrency, so if that’s you, please consider joining the retreat. You can find out more, and register, here.

Pycon 2017

This year I repeated my strategy of “Don’t go to any recorded sessions,” and also did more volunteering (one of these was at the green room, and again one of the session chairs failed to show up so you’ll see me introducing three of the talks).

Dinners

In the past I’ve kind of stumbled into dinner groups, but not always. This time I made more of an effort and was rewarded with dinner friends every night. I think, however, that I’d like to make more of an effort to pre-plan these next year somehow, perhaps using some online tool. Admittedly this could reduce the chances for serendipity, but once a core of people are going to dinner it’s not usually hard to add others.

I wonder if there’s already an app for setting up group dinners. One hopeful development: I discovered that someone I’ve started to get to know is a web developer, and we’ve agreed to do some experiments. Over time I’ve been approached by any number of people who want to build web projects, but these are invariably people who have figured out how to configure a technology (often WordPress, which I’ve come to loathe) and recommend that I use that particular technology. In this case, I’ve got a real programmer who can adapt to my arcane needs. As a result, I’ve started imagining all kinds of new experiments to ask him to create, and one of these could be a dinner planner app for conferences. The possibilities for this and other ideas are very exciting.

Hunting for Self-Guided Tutorials

For years, many people in Crested Butte have expressed to me an interest in learning to program, while declaring they know nothing about it. Perhaps the idea gestated long enough, or some other experience triggered this, but I recently began formulating a free class for new programmers that relies completely on self-guided tutorials. The benefit of this is that people do not get driven away by the class going either too fast or too slow; by its nature it adapts to the learning experience of each individual. In addition, once anyone has solved one of the steps, they can help their classmates solve that step (indeed, they are probably better at it, being closer to “beginner’s mind”).

The site is EveryoneProgram.com and might still be rough when you look at it, but the content is basically there, including links to the tutorials.

While at Pycon I talked to various folks who are in education to try to see if there were other, better tutorials and ended up with the ones you see on the site. Of course it’s an experiment so we’ll adjust it depending on how things work and new discoveries.

One of my goals is to create new coaches from the people that attend the class. There are other folks in town who can help and take over if I’m traveling, as well.

Open Spaces

Although I attended more open spaces than the ones described here, these were the ones I found most engaging.

Inclusion

This was convened by a young woman who had a poor experience in a workshop. From the context, I don’t think it was one of the pre-conference tutorials; I think it had happened a while ago, but it had left a mark and she needed to discuss it. Others in the session shared similar experiences.

I have not found these kinds of discussions at other conferences. One of the unique things about Pycon is that it (intentionally) has a very high percentage of women attendees and speakers. Guido’s stated goal is 50% and it seems like that number is fairly close in terms of speakers, and I might guess the women attendees could be as high as 35% or more. Minorities of various kinds are also becoming more represented. I wonder if one reason Python’s popularity keeps rising is its inclusiveness.

What Confuses You About Concurrency?

I held this session to gather more ideas to support the development of Concurrent Python. You can find my notes from this session here by scrolling down to that heading.

Improve Your Writing

One person asked me directly for this session (and ended up not being able to make it). I had actually submitted a session proposal for something similar, but it wasn’t accepted. We had about 10 people.

To keep it simple I gave three core ideas I think make the biggest impact on someone’s writing:

  1. Less is More. The more words you have, the more work your reader must do. Shorter sentences are easier to read and understand. Multiple editing passes are usually required to achieve short sentences.

  2. Active Voice. You can find some examples here, although these are rather basic. Perhaps we need another category of “super-active” or “direct” voice, as it’s usually possible to make your sentences even more immediate, although it takes even more work to get there. Active voice helps a lot with point 1, because active sentences are almost always shorter than passive sentences.

  3. Read It Aloud. This is a trick I learned from a friend many years ago. When you read silently, your mind tends to skip and skim. When you read every single word out loud, it forces you to scrutinize your prose. Problems jump out at you and you’re surprised you missed such glaring errors. It takes a lot of time (especially for book chapters) but it pays off.

I tried to keep the description of these points short and take questions. Afterward there was general Q&A and discussion.

Starting Teal Organizations

I never know what’s going to happen when I hold one of these sessions. Usually I get curious folks and it gives me practice in explaining the concepts, which you can find at Reinventing Business.

The turnout was larger than I expected, perhaps 8 folks, and we had a stimulating discussion. At this point I’m trying to discover the details, things not covered in the Reinventing Organizations book, such as bringing people into and out of the organization, payment, investing, and things like that. Frederic Laloux (the author) says these topics are different from one Teal organization to another and chooses not to cover them for that reason. But if you’re starting such an organization, you need those structures, even if you just use them as a starting point. I think I can make an important contribution to this field by discovering, collecting and passing on such information.

While I was in Portland I was able to attend a separate one-day open-spaces event around non-violent communication, and there I also held a “Starting Teal Organizations” session, which produced more good discussion.

Sprint days

Last year was my first experience sprinting (if you don’t include what was arguably the very first sprint, coached by Jim Fulton at Zope Corp in Virginia after one of the D.C. Pycons, but I can’t remember much from that one; I do know he tried to teach us Git). That year I went to many different projects and contributed a little to each one, primarily either documentation or testing their onboarding process, but in a few cases I added code. I learned a great deal through this sampling process, but this year I ended up spending all three days (and wishing I had the fourth) on a single project: Beeware.

I’ve been experimenting with and using a small decorator framework for building command-line applications, much like (for example) Click although mine is much simpler. I have a friend or two who would like some automation, so I’ve been imagining that a command-line system would work for them.

Then a friend pointed out that most people have no idea what the shell is, and are unfamiliar with command-line applications. At Pycon it occurred to me that it might be possible to use the same decorator approach, but instead of producing a command-line option, the decorator would insert a menu item in a windowed program, and thus be easier and more familiar for the vast majority of users, while at the same time making the creation of such a program far easier. I went to the Beeware booth and asked the creator, Russell Keith-Magee (whom I had met and spent time with at PyCaribbean), if he thought this was possible and he said yes.

It turns out his answer was a bit … premature. When you go to the Beeware Site, it can be hard to discover where to start. There’s a reason for that: this is a collection of projects, so you must first (A) Know that and (B) Decide what project fits your needs. And even then, some parts of the project are still more visionary than completely fleshed out, so you won’t always find a tutorial or example of what you’re looking for.

I quite like the vision of BeeWare: make Python work everywhere. I think it will get there. We even had some discussions about funding so Russell (and ideally other programmers) could work on it full time, and get it there faster. But be warned that at this writing the vision is only partly there.

The sub-project that met my needs is called Toga, with Docs here. The first thing I discovered was that there were menus for the Mac and I think Linux, but not for Windows. Thus, my effort was spent on learning the Toga architecture and figuring out how to add menus for Windows apps. This was tremendously educational but I’m still not close to creating my “decorators for menus.” However, I’d rather continue working on Beeware until it gets to the point where I can, rather than building something from scratch. This way I’ll produce results that work across platforms, by taking advantage of the Beeware architecture.

Virtual Environments Revisited

Setting up the virtual environment for this project was one of my learning experiences. When you install a package that you want to modify and test, it turns out you must make it editable using the -e flag, as in pip install -e . (the trailing dot names the current directory). Otherwise your changes won’t be used, and only the original installation will ever be seen when you run your program. Without knowing this I experienced some frustration until Russell walked me through it. If I hadn’t been doing this in a sprint, I probably would have given up.

In the process, I ended up making multiple installations and building the virtual environment multiple times. I also realized that I sometimes hesitate to build a virtual environment because I must look up and re-learn the configuration commands, and running the activate script is always a bother. If only we could get the computer to do annoying, repetitive things for us! Here’s a Windows batch file that does the trick (anyone with basic shell skills can create the equivalent for Mac/Linux):

@echo off
rem venv.bat

rem Works if you're outside the starting directory:
if defined VIRTUAL_ENV (
  deactivate.bat
) else (
  rem Only works inside starting directory, otherwise
  rem creates a new virtual environment:
  if exist virtualenv (
    virtualenv\Scripts\activate.bat
  ) else (
    python -m venv virtualenv
    virtualenv\Scripts\activate.bat
  )
)

I’m choosing to call my virtual environment directory virtualenv, and so can use that directory’s existence to indicate whether there’s an installed virtual environment.

You can place venv.bat somewhere in your Windows PATH and use it anytime you want to create and use virtual environments. Note that for it to work correctly, you must call it from the directory where the virtualenv subdirectory will be created or already exists.

activate.bat sets VIRTUAL_ENV and deactivate.bat clears it, so a defined VIRTUAL_ENV means that the virtual environment has been activated. This way, running venv just turns it on if it’s off, and off if it’s on, so that’s two fewer commands I need to remember.

If the virtual environment doesn’t exist, it is created and activated. But while working on Toga, I needed the special -e editable installations in that new environment, so I created a local venv.bat and added the following lines after the virtual environment creation and activation:

  cd src\core
  pip install -e .
  cd ..\winforms
  pip install -e .
  cd ..\..
  pip install -e .

Notes

The rest of the time I sought out experiences and interactions. And this is a place where the smart phone really has enhanced my life: whenever anyone said something interesting, I pulled out the phone and entered it in a list in Google Keep. I used to write notes and then set them aside and forget them, but entering them in Keep and knowing they will show up on all my other devices seems to really make the key difference.

Here are my notes, in no particular order:

The Concurrent Python Project

I’ve started working on a big, ambitious Python project: a book called Concurrent Python which assumes you know Python but that you don’t know anything about concurrency. I considered writing yet another introductory book, but realized there are already plenty of good ones and that I wouldn’t contribute much there. Concurrency, however, is a topic I’ve struggled with over the years and I know I could add some value to that discussion. It also presents opportunities for much more interesting training, conferences, speaking and consulting (these days I’m far more invested in discovering stimulating experiences, instead of just any experiences).

The first chapter has mostly been taken from the Concurrency chapter in On Java 8, which is now available in beta (as in “nearly finished”) form but still requires a little more work. The Atomic Kotlin book has also inserted itself in my schedule. So don’t expect a lot of progress right now, although I am thinking about it and watching for ideas, tools and libraries which I’m capturing in the 00_Notes.md file – feel free to make pull requests or add issues if you think something belongs there.

PyCaribbean Keynote On Youtube

I gave the closing keynote at PyCaribbean, the Python conference held in Puerto Rico. It’s called Science is What Works, and you can see it on YouTube. The slides are available here.

I think a closing keynote should provide perspective and look at things from a higher point of view. By then, the audience is full of technical information and, I believe, looking for relief rather than more code.

Even though I was a physics major as an undergraduate, I’ve begun to realize over the last ten years or so that I didn’t understand what science was, and that’s probably why I was only a mediocre physics student, at best. I thought that science was about “the truth,” so I had a very hard time when the teacher manipulated models, throwing away higher-order terms and arguing that a billiard-balls-and-springs model that produced a specific heat within an order of magnitude was “pretty darn good.”

Had I understood that science is just coming up with a model that fits the data, I wouldn’t have been stuck on ideas of the truth. That is not to say that science doesn’t disprove models, because it certainly does—indeed, that’s all you can know with any certainty, and falsifiability is a requirement for any theory. A model works until it doesn’t, and then you must either make adjustments or throw it away altogether and come up with a new one.

One way to think about science is that it is built on doubt, down to the point that we don’t even bother trying to believe whether our models are true or not. They just fit the data…so far. Whereas what came before the scientific revolution was belief without evidence, and doubt had to be crushed lest the whole operation topple.

Once you understand that science is just models representing parts of the world, you can start observing how those models are created. Some models are purely observational: cell theory in biology (“all organisms are made of cells”) required us to look at every living thing we could get our hands on, through a microscope. In physics, we’re fond of equations, but we have a limited set of approaches to use in order to formulate those equations, because our brains can only deal with so much complexity. The algorithms produced through machine learning have no such limitations, which may produce breakthroughs in science that we have up until now been unable to achieve.

In this presentation, I briefly look at a lot of different sciences and see how they work and when they don’t, and finally ask the question of whether computer science is actually a science (aspects of it certainly seem to be, while other parts are clearly not).

I did learn something important from seeing the video. I used one of Google’s standard slide formats (including color choices), and the video is just a single camera pointed at both me and the screen. This is certainly the easiest way to capture a presentation, and I can’t count the number of videos I’ve watched where they had hired a “professional” who kept pointing the camera at the speaker when the speaker was describing code, and at the screen when there was nothing interesting going on there. In the future, though, I will plan for one-static-camera capture and ensure that the font and background have enough contrast, and of course that the text is large enough. In the past this wasn’t even an option because pointing a camera at a video projection would cause all kinds of interference; technology has improved to the point where this is no longer an issue.

Constructors Are Not Thread-Safe

When you imagine the construction process, it can be easy to think that it’s thread-safe. After all, no one can even see the new object before it finishes initialization, so how could there be contention over that object? Indeed, the Java Language Specification (JLS) confidently states:

“There is no practical need for a constructor to be synchronized, because it would lock the object under construction, which is normally not made available to other threads until all constructors for the object have completed their work.”

Unfortunately, object construction is as vulnerable to shared-memory concurrency problems as anything else. The mechanisms can be more subtle, however.

Consider the automatic creation of a unique identifier for each object using a static field. To test different implementations, we’ll start with an interface:

// HasID.java

public interface HasID {
  int getID();
}

Then implement that interface in an obvious way:

// StaticIDField.java

public class StaticIDField implements HasID {
  private static int counter = 0;
  private int id = counter++;
  public int getID() { return id; }
}

This is about as simple and innocuous a class as you can imagine. It doesn’t even have an explicit constructor to cause problems. To see what happens when we make multiple concurrent tasks that create these objects, here’s a test harness:

// IDChecker.java
import java.util.*;
import java.util.function.*;
import java.util.stream.*;
import java.util.concurrent.*;
import com.google.common.collect.Sets;

public class IDChecker {
  public static int SIZE = 100_000;
  static class MakeObjects
  implements Supplier<List<Integer>> {
    private Supplier<HasID> gen;
    public MakeObjects(Supplier<HasID> gen) {
      this.gen = gen;
    }
    @Override
    public List<Integer> get() {
      return
        Stream.generate(gen)
          .limit(SIZE)
          .map(HasID::getID)
          .collect(Collectors.toList());
    }
  }
  public static void test(Supplier<HasID> gen) {
    CompletableFuture<List<Integer>>
      groupA = CompletableFuture
        .supplyAsync(new MakeObjects(gen)),
      groupB = CompletableFuture
        .supplyAsync(new MakeObjects(gen));
    groupA.thenAcceptBoth(groupB, (a, b) -> {
      System.out.println(
        Sets.intersection(
          Sets.newHashSet(a),
          Sets.newHashSet(b)).size());
    }).join();
  }
}

The MakeObjects class is a Supplier with a get() that produces a List<Integer>. This List is generated by extracting the id from each HasID object. The test() method creates two parallel CompletableFutures that run MakeObjects suppliers, then takes the results of each and uses the Guava library Sets.intersection() to find out how many ids are common between the two List<Integer> (Guava is much faster than using retainAll()).
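If you don’t have Guava on your classpath, the same idea can be sketched with only the JDK. This is a minimal illustration, not a replacement for IDChecker: the class name CommonIdsSketch and its helper methods are made up for this example, the lists are small fixed ranges rather than generated ids, and it uses thenCombine() to return the count instead of printing it inside thenAcceptBoth().

```java
// CommonIdsSketch.java
// Hypothetical JDK-only sketch of the "count the common ids" step.
import java.util.*;
import java.util.concurrent.*;
import java.util.stream.*;

class CommonIdsSketch {
  // Counts the values present in both lists using retainAll():
  static int countCommon(List<Integer> a, List<Integer> b) {
    Set<Integer> common = new HashSet<>(a);
    common.retainAll(new HashSet<>(b));
    return common.size();
  }
  // Builds the list of ints in [from, to) as boxed Integers:
  static List<Integer> range(int from, int to) {
    return IntStream.range(from, to)
      .boxed()
      .collect(Collectors.toList());
  }
  public static void main(String[] args) {
    CompletableFuture<List<Integer>>
      groupA = CompletableFuture
        .supplyAsync(() -> range(0, 5)),
      groupB = CompletableFuture
        .supplyAsync(() -> range(3, 8));
    // thenCombine() waits for both results, then merges them:
    int common = groupA
      .thenCombine(groupB, CommonIdsSketch::countCommon)
      .join();
    System.out.println(common);
  }
}
/* Output:
2
*/
```

Only 3 and 4 appear in both ranges, so the count is 2.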

Now we can test the StaticIDField:

// TestStaticIDField.java

public class TestStaticIDField {
  public static void main(String[] args) {
    IDChecker.test(StaticIDField::new);
  }
}
/* Output:
47643
*/

That’s a rather large number of duplicates. Clearly, a plain static int is not safe to use for construction. Let’s make it thread-safe using an AtomicInteger:

// GuardedIDField.java
import java.util.concurrent.atomic.*;

public class GuardedIDField implements HasID {
  private static AtomicInteger counter =
    new AtomicInteger();
  private int id = counter.getAndAdd(1);
  public int getID() { return id; }
  public static void main(String[] args) {
    IDChecker.test(GuardedIDField::new);
  }
}
/* Output:
0
*/
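The duplicates come from counter++ being a read, an add and a write, which two threads can interleave. Here’s a stripped-down sketch of that effect, with no constructors and no Guava involved. The class LostUpdateSketch and its fields are invented for this illustration. The plain total is not shown in the output because it can differ on every run.

```java
// LostUpdateSketch.java
// Hypothetical sketch: unguarded ++ versus AtomicInteger.
import java.util.concurrent.atomic.AtomicInteger;

class LostUpdateSketch {
  static final int PER_THREAD = 100_000;
  static int plain = 0; // Unguarded shared state
  static final AtomicInteger atomic = new AtomicInteger();

  // Runs the same task on two threads and waits for both:
  static void runTwice(Runnable task) {
    Thread t1 = new Thread(task), t2 = new Thread(task);
    t1.start();
    t2.start();
    try {
      t1.join();
      t2.join();
    } catch(InterruptedException e) {
      throw new RuntimeException(e);
    }
  }
  public static void main(String[] args) {
    runTwice(() -> {
      for(int n = 0; n < PER_THREAD; n++) {
        plain++;                  // Read, add, write
        atomic.getAndIncrement(); // One indivisible step
      }
    });
    // Can come up short because increments get lost:
    System.out.println("plain:  " + plain);
    // Always 2 * PER_THREAD, which is 200000:
    System.out.println("atomic: " + atomic.get());
  }
}
```

Whenever the plain total is below 200000, two threads read the same value and both wrote back the same result, which is exactly how two objects end up with the same id.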

Constructors have an even more subtle way to share state, through constructor arguments:

// SharedConstructorArgument.java
import java.util.concurrent.atomic.*;

interface SharedArg {
  int get();
}

class Unsafe implements SharedArg {
  private int i = 0;
  public int get() { return i++; }
}

class Safe implements SharedArg {
  private static AtomicInteger counter =
    new AtomicInteger();
  public int get() {
    return counter.getAndAdd(1);
  }
}

class SharedUser implements HasID {
  private final int id;
  public SharedUser(SharedArg sa) {
    id = sa.get();
  }
  @Override
  public int getID() { return id; }
}

public class SharedConstructorArgument {
  public static void main(String[] args) {
    Unsafe unsafe = new Unsafe();
    IDChecker.test(() -> new SharedUser(unsafe));
    Safe safe = new Safe();
    IDChecker.test(() -> new SharedUser(safe));
  }
}
/* Output:
47747
0
*/

Here, the SharedUser constructors share the same argument. Even though SharedUser is using its argument in a completely innocent and reasonable fashion, the way the constructor is called causes collisions. SharedUser cannot even know it is being used this way, much less control it!

synchronized constructors are not supported by the language, but it’s possible to create your own using a synchronized block. Although the JLS states that “… it would lock the object under construction”, this is not true—the constructor is effectively a static method, so a synchronized constructor would actually lock through the class object. We can reproduce this by creating our own static object and locking on that:

// SynchronizedConstructor.java
import java.util.concurrent.atomic.*;

class SyncConstructor implements HasID {
  private final int id;
  private static Object constructorLock = new Object();
  public SyncConstructor(SharedArg sa) {
    synchronized(constructorLock) {
      id = sa.get();
    }
  }
  @Override
  public int getID() { return id; }
}

public class SynchronizedConstructor {
  public static void main(String[] args) {
    Unsafe unsafe = new Unsafe();
    IDChecker.test(() -> new SyncConstructor(unsafe));
  }
}
/* Output:
0
*/

The shared use of the Unsafe class is now safe.

An alternate approach is to make the constructors private (thus preventing inheritance) and provide a static Factory Method to produce new objects:

// SynchronizedFactory.java
import java.util.concurrent.atomic.*;

class SyncFactory implements HasID {
  private final int id;
  private SyncFactory(SharedArg sa) {
    id = sa.get();
  }
  @Override
  public int getID() { return id; }
  public static synchronized
  SyncFactory factory(SharedArg sa) {
    return new SyncFactory(sa);
  }
}

public class SynchronizedFactory {
  public static void main(String[] args) {
    Unsafe unsafe = new Unsafe();
    IDChecker.test(() ->
      SyncFactory.factory(unsafe));
  }
}
/* Output:
0
*/

By synchronizing the static Factory Method you lock on the class object during construction.
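You can confirm which lock a static synchronized method holds by asking Thread.holdsLock(). The following is a small sketch with a made-up class name (ClassLockSketch); it only demonstrates the locking behavior of static synchronized methods and doesn’t involve construction.

```java
// ClassLockSketch.java
// Hypothetical sketch: which monitor does a static
// synchronized method hold?

class ClassLockSketch {
  // A static synchronized method acquires the monitor of
  // the Class object for the duration of the call:
  static synchronized boolean insideSynchronized() {
    return Thread.holdsLock(ClassLockSketch.class);
  }
  // An unsynchronized static method holds no such lock:
  static boolean outsideSynchronized() {
    return Thread.holdsLock(ClassLockSketch.class);
  }
  public static void main(String[] args) {
    System.out.println(insideSynchronized());
    System.out.println(outsideSynchronized());
  }
}
/* Output:
true
false
*/
```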

These examples emphasize how insidiously difficult it is to detect and manage shared state in concurrent Java programs. Even if you take the “share nothing” strategy, it’s remarkably easy for accidental sharing to take place.

A Canonical equals() For Java

Even with the help of Java 7’s Objects.equals(), the equals() method is often written in a verbose and messy fashion. This article shows how you can write a succinct equals() in a format that allows easy checking with visual inspection.

When you create a new class, it automatically inherits from class Object. If you don’t override equals(), you’ll get Object’s equals() method. By default this compares addresses, so only if you are comparing the exact same object will you get true. The default case is the “most discriminating.”

// DefaultComparison.java

class DefaultComparison {
  private int i, j, k;
  public DefaultComparison(int i, int j, int k) {
    this.i = i;
    this.j = j;
    this.k = k;
  }
  public static void main(String[] args) {
    DefaultComparison
      a = new DefaultComparison(1, 2, 3),
      b = new DefaultComparison(1, 2, 3);
    System.out.println(a == a);
    System.out.println(a == b);
  }
}
/* Output:
true
false
*/
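The listing above uses ==. As a quick check that the inherited equals() behaves the same way, here’s a minimal sketch; DefaultEqualsSketch is a made-up class that deliberately does not override equals().

```java
// DefaultEqualsSketch.java
// Hypothetical sketch: the inherited equals() is identity.

class DefaultEqualsSketch {
  private final int i;
  DefaultEqualsSketch(int i) { this.i = i; }
  public static void main(String[] args) {
    DefaultEqualsSketch
      a = new DefaultEqualsSketch(1),
      b = new DefaultEqualsSketch(1);
    // No override, so Object.equals() is used, and it
    // returns true only for the very same object:
    System.out.println(a.equals(a));
    System.out.println(a.equals(b));
  }
}
/* Output:
true
false
*/
```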

Normally you’ll want to relax this restriction. Typically, if two objects are the same type and have fields with identical values, you’ll consider those objects equal, but there may also be fields that you don’t want to include in the equals() comparison. This is part of the class design process.

A proper equals() must satisfy the following five conditions:

  1. Reflexive: For any x, x.equals(x) should return true.

  2. Symmetric: For any x and y, x.equals(y) should return true if and only if y.equals(x) returns true.

  3. Transitive: For any x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.

  4. Consistent: For any x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the object is modified.

  5. For any non-null x, x.equals(null) should return false.
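The five conditions can be spot-checked in code. This sketch uses String, whose equals() is known to be well-behaved; the class ContractSketch and the method obeysContract() are hypothetical names for this illustration. It tests only the objects you hand it, so a true result is evidence, not proof, that a class obeys the contract.

```java
// ContractSketch.java
// Hypothetical sketch: spot-checking the equals() contract.

class ContractSketch {
  // Checks the five conditions for the given objects only:
  static boolean
  obeysContract(Object x, Object y, Object z) {
    boolean reflexive = x.equals(x);
    boolean symmetric = x.equals(y) == y.equals(x);
    // "If x equals y and y equals z, then x equals z":
    boolean transitive =
      !(x.equals(y) && y.equals(z)) || x.equals(z);
    // Repeated calls must agree:
    boolean consistent = x.equals(y) == x.equals(y);
    boolean nonNull = !x.equals(null);
    return reflexive && symmetric && transitive &&
      consistent && nonNull;
  }
  public static void main(String[] args) {
    // Three distinct String objects with the same value:
    System.out.println(obeysContract(
      new String("Monty"),
      new String("Monty"),
      new String("Monty")));
  }
}
/* Output:
true
*/
```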

Here are the tests that satisfy those conditions and determine whether the object you’re comparing yourself to (which we’ll call here the rval) is equal to this object:

  1. If the rval is null, it’s not equal.

  2. If the rval is this (you’re comparing yourself to yourself), the two objects are equal.

  3. If the rval is not the same class or subclass, the two objects are not equal.

  4. If all the above checks pass, then you must decide which fields in the rval are important (and consistent), and compare those.

Java 7 introduced the Objects class to help with this process, which we use to write a better equals().
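Before using it, it helps to see what Objects.equals() does on its own. The sketch below (ObjectsEqualsSketch is a made-up name) shows that it is null-safe and that primitive arguments are autoboxed, which means a double is compared with Double.equals() and not with ==.

```java
// ObjectsEqualsSketch.java
// Hypothetical sketch: behavior of Objects.equals().
import java.util.Objects;

class ObjectsEqualsSketch {
  public static void main(String[] args) {
    String s = null;
    // Null-safe: no NullPointerException on either side:
    System.out.println(Objects.equals(s, null));
    System.out.println(Objects.equals(s, "Monty"));
    System.out.println(Objects.equals("Monty", s));
    // Primitives are autoboxed, then compared via equals():
    System.out.println(Objects.equals(1, 1));
    System.out.println(Objects.equals(3.14, 3.14));
    // Boxed comparison differs from == for NaN:
    System.out.println(Double.NaN == Double.NaN);
    System.out.println(
      Objects.equals(Double.NaN, Double.NaN));
  }
}
/* Output:
true
false
false
true
true
false
true
*/
```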

The following examples compare different versions of the Equality class. To prevent duplicate code we’ll build the examples using the Factory Method. The EqualityFactory interface simply provides a make() method to produce an Equality object, so a different EqualityFactory can produce a different subtype of Equality:

// EqualityFactory.java
import java.util.*;

interface EqualityFactory {
  Equality make(int i, String s, double d);
}

Now we’ll define Equality containing three fields (all of which we consider important during comparison) and an equals() method that fulfills the four checks described above. The constructor displays its type name to ensure we are performing the tests we think we are:

// Equality.java
import java.util.*;

public class Equality {
  protected int i;
  protected String s;
  protected double d;
  public Equality(int i, String s, double d) {
    this.i = i;
    this.s = s;
    this.d = d;
    System.out.println("made 'Equality'");
  }
  @Override
  public boolean equals(Object rval) {
    if(rval == null)
      return false;
    if(rval == this)
      return true;
    if(!(rval instanceof Equality))
      return false;
    Equality other = (Equality)rval;
    if(!Objects.equals(i, other.i))
      return false;
    if(!Objects.equals(s, other.s))
      return false;
    if(!Objects.equals(d, other.d))
      return false;
    return true;
  }
  public void
  test(String descr, String expected, Object rval) {
    System.out.format("-- Testing %s --%n" +
      "%s instanceof Equality: %s%n" +
      "Expected %s, got %s%n",
      descr, descr, rval instanceof Equality,
      expected, equals(rval));
  }
  public static void testAll(EqualityFactory eqf) {
    Equality
      e = eqf.make(1, "Monty", 3.14),
      eq = eqf.make(1, "Monty", 3.14),
      neq = eqf.make(99, "Bob", 1.618);
    e.test("null", "false", null);
    e.test("same object", "true", e);
    e.test("different type", "false", new Integer(99));
    e.test("same values", "true", eq);
    e.test("different values", "false", neq);
  }
  public static void main(String[] args) {
    testAll( (i, s, d) -> new Equality(i, s, d));
  }
}
/* Output:
made 'Equality'
made 'Equality'
made 'Equality'
-- Testing null --
null instanceof Equality: false
Expected false, got false
-- Testing same object --
same object instanceof Equality: true
Expected true, got true
-- Testing different type --
different type instanceof Equality: false
Expected false, got false
-- Testing same values --
same values instanceof Equality: true
Expected true, got true
-- Testing different values --
different values instanceof Equality: true
Expected false, got false
*/

testAll() performs comparisons against every kind of object we expect to encounter: null, the same object, an object of a different type, an object with the same values, and an object with different values. It creates the Equality objects using the factory.

In main(), notice the simplicity of the call to testAll(). Because EqualityFactory has a single abstract method, a lambda expression can serve as the implementation of make().
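
To make the lambda less mysterious, here is a minimal sketch of three interchangeable ways to supply such a factory. Widget and WidgetFactory are hypothetical stand-ins I made up so the listing compiles on its own; they are not part of the book’s examples:

```java
// FactoryForms.java
// Sketch only: Widget and WidgetFactory are hypothetical
// stand-ins for Equality and EqualityFactory.

class FactoryForms {
  // Builds one Widget through the factory and describes it
  static String build(WidgetFactory wf) {
    return wf.make(1, "Monty", 3.14).toString();
  }
  public static void main(String[] args) {
    // Anonymous inner class, the pre-Java 8 form:
    System.out.println(build(new WidgetFactory() {
      @Override
      public Widget make(int i, String s, double d) {
        return new Widget(i, s, d);
      }
    }));
    // Lambda expression:
    System.out.println(
      build((i, s, d) -> new Widget(i, s, d)));
    // Constructor reference:
    System.out.println(build(Widget::new));
  }
}

interface WidgetFactory {
  Widget make(int i, String s, double d);
}

class Widget {
  private final int i;
  private final String s;
  private final double d;
  Widget(int i, String s, double d) {
    this.i = i;
    this.s = s;
    this.d = d;
  }
  @Override
  public String toString() {
    return "Widget(" + i + ", " + s + ", " + d + ")";
  }
}
/* Output:
Widget(1, Monty, 3.14)
Widget(1, Monty, 3.14)
Widget(1, Monty, 3.14)
*/
```

All three arguments produce the same kind of factory object. The lambda and the constructor reference just remove the ceremony.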

The above equals() method is annoyingly verbose, and it turns out we can simplify it into a canonical form. Observe:

  1. The instanceof check eliminates the need to test for null, because null instanceof anything is always false.

  2. The comparison to this is redundant. A correctly written equals() will work properly with self-comparison.

Because && is a short-circuiting comparison, it quits and produces false the first time it encounters a failure. So, by chaining the checks together with &&, we can write equals() much more succinctly:

// SuccinctEquality.java
import java.util.*;

public class SuccinctEquality extends Equality {
  public SuccinctEquality(int i, String s, double d) {
    super(i, s, d);
    System.out.println("made 'SuccinctEquality'");
  }
  @Override
  public boolean equals(Object rval) {
    return rval instanceof SuccinctEquality &&
      Objects.equals(i, ((SuccinctEquality)rval).i) &&
      Objects.equals(s, ((SuccinctEquality)rval).s) &&
      Objects.equals(d, ((SuccinctEquality)rval).d);
  }
  public static void main(String[] args) {
    Equality.testAll( (i, s, d) ->
      new SuccinctEquality(i, s, d));
  }
}
/* Output:
made 'Equality'
made 'SuccinctEquality'
made 'Equality'
made 'SuccinctEquality'
made 'Equality'
made 'SuccinctEquality'
-- Testing null --
null instanceof Equality: false
Expected false, got false
-- Testing same object --
same object instanceof Equality: true
Expected true, got true
-- Testing different type --
different type instanceof Equality: false
Expected false, got false
-- Testing same values --
same values instanceof Equality: true
Expected true, got true
-- Testing different values --
different values instanceof Equality: true
Expected false, got false
*/

For each SuccinctEquality, the base-class constructor is called before the derived-class constructor. The output shows that we still get the correct result. You can tell that short-circuiting happens because, without it, both the null test and the “different type” test would throw exceptions further down the list of comparisons in equals(): the “different type” test would fail with a ClassCastException at the cast, and the null test would fail with a NullPointerException when a field of the (successfully cast) null reference is accessed.
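
If you want to see what the instanceof guard protects against, here is a small sketch that removes it. UnguardedCast is a hypothetical class invented for this demonstration, and unguardedEquals() is deliberately broken:

```java
// UnguardedCast.java
// Sketch only: UnguardedCast is a hypothetical class.

class UnguardedCast {
  private final int i;
  UnguardedCast(int i) { this.i = i; }
  // Deliberately broken: casts without checking the type first
  boolean unguardedEquals(Object rval) {
    return i == ((UnguardedCast)rval).i;
  }
  // instanceof goes first, so && never reaches the cast
  // when rval is null or some other type
  boolean guardedEquals(Object rval) {
    return rval instanceof UnguardedCast &&
      i == ((UnguardedCast)rval).i;
  }
  // Reports the result, or the name of the exception thrown
  static String attempt(Object rval) {
    try {
      return String.valueOf(
        new UnguardedCast(1).unguardedEquals(rval));
    } catch(RuntimeException e) {
      return e.getClass().getSimpleName();
    }
  }
  public static void main(String[] args) {
    System.out.println(attempt(null));
    System.out.println(attempt("a String"));
    System.out.println(attempt(new UnguardedCast(1)));
    UnguardedCast uc = new UnguardedCast(1);
    System.out.println(uc.guardedEquals(null));
    System.out.println(uc.guardedEquals("a String"));
  }
}
/* Output:
NullPointerException
ClassCastException
true
false
false
*/
```

Casting null is legal, so the null case only fails once the field is dereferenced. The wrong-type case fails at the cast itself.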

Objects.equals() shines when you compose your new class using another class:

// ComposedEquality.java
import java.util.*;

class Part {
  String ss;
  double dd;
  public Part(String ss, double dd) {
    this.ss = ss;
    this.dd = dd;
  }
  @Override
  public boolean equals(Object rval) {
    return rval instanceof Part &&
      Objects.equals(ss, ((Part)rval).ss) &&
      Objects.equals(dd, ((Part)rval).dd);
  }
}

public class ComposedEquality extends SuccinctEquality {
  Part part;
  public ComposedEquality(int i, String s, double d) {
    super(i, s, d);
    part = new Part(s, d);
    System.out.println("made 'ComposedEquality'");
  }
  @Override
  public boolean equals(Object rval) {
    return rval instanceof ComposedEquality &&
      super.equals(rval) &&
      Objects.equals(part, ((ComposedEquality)rval).part);
  }
  public static void main(String[] args) {
    Equality.testAll( (i, s, d) ->
      new ComposedEquality(i, s, d));
  }
}
/* Output:
made 'Equality'
made 'SuccinctEquality'
made 'ComposedEquality'
made 'Equality'
made 'SuccinctEquality'
made 'ComposedEquality'
made 'Equality'
made 'SuccinctEquality'
made 'ComposedEquality'
-- Testing null --
null instanceof Equality: false
Expected false, got false
-- Testing same object --
same object instanceof Equality: true
Expected true, got true
-- Testing different type --
different type instanceof Equality: false
Expected false, got false
-- Testing same values --
same values instanceof Equality: true
Expected true, got true
-- Testing different values --
different values instanceof Equality: true
Expected false, got false
*/

Notice the call to super.equals()—no need to reinvent it (plus you don’t always have access to all necessary parts of a base class).
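
One practical reason Objects.equals() works well for composed fields is that it tolerates null on either side, which a direct call such as part.equals(other.part) does not. Here is a minimal sketch, using a hypothetical Label class whose only field may be null:

```java
// NullFieldCompare.java
// Sketch only: Label is a hypothetical class.
import java.util.Objects;

class NullFieldCompare {
  public static void main(String[] args) {
    Label
      none1 = new Label(null),
      none2 = new Label(null),
      some = new Label("Monty");
    System.out.println(none1.equals(none2));
    System.out.println(none1.equals(some));
    System.out.println(some.equals(none1));
  }
}

class Label {
  private final String text; // May be null
  Label(String text) { this.text = text; }
  @Override
  public boolean equals(Object rval) {
    // Objects.equals(a, b) is true when both are null, false
    // when exactly one is null, and otherwise a.equals(b)
    return rval instanceof Label &&
      Objects.equals(text, ((Label)rval).text);
  }
  @Override
  public int hashCode() {
    return Objects.hashCode(text); // 0 for a null field
  }
}
/* Output:
true
false
false
*/
```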

Equality Across Subtypes

Inheritance suggests that objects of two different subtypes can be “the same” when they are upcast. Suppose you have a collection of Pet objects. This collection will naturally accept subtypes of Pet; in this example, Dogs and Pigs. Each Pet has a name and a size, as well as a unique internal id number.

We define equals() and hashCode() using the canonical form via the Objects class, but we only define them in the base class Pet, and we do not include the unique id number in either one. From the standpoint of equals(), this means we only care if something is a Pet, not whether it is a specific type of Pet:

// SubtypeEquality.java
import java.util.*;

enum Size { SMALL, MEDIUM, LARGE }

class Pet {
  private static int counter = 0;
  private final int id = counter++;
  private final String name;
  private final Size size;
  public Pet(String name, Size size) {
    this.name = name;
    this.size = size;
  }
  @Override
  public boolean equals(Object rval) {
    return rval instanceof Pet &&
      // Objects.equals(id, ((Pet)rval).id) && // [1]
      Objects.equals(name, ((Pet)rval).name) &&
      Objects.equals(size, ((Pet)rval).size);
  }
  @Override
  public int hashCode() {
    return Objects.hash(name, size);
    // return Objects.hash(name, size, id);  // [2]
  }
  @Override
  public String toString() {
    return String.format("%s[%d]: %s %s %x",
      getClass().getSimpleName(), id,
      name, size, hashCode());
  }
}

class Dog extends Pet {
  public Dog(String name, Size size) {
    super(name, size);
  }
}

class Pig extends Pet {
  public Pig(String name, Size size) {
    super(name, size);
  }
}

public class SubtypeEquality {
  public static void main(String[] args) {
    Set<Pet> pets = new HashSet<>();
    pets.add(new Dog("Ralph", Size.MEDIUM));
    pets.add(new Pig("Ralph", Size.MEDIUM));
    pets.forEach(System.out::println);
  }
}
/* Output:
Dog[0]: Ralph MEDIUM a752aeee
*/

If we are just thinking about types, it does make sense—sometimes—to only consider the classes from the standpoint of their base type, which is the foundation of the Liskov Substitution Principle. This code fits nicely with that principle because the derived types don’t add any extra functionality (methods) that isn’t in the base class. The derived types only differ in behavior, not in interface (which of course is not the general case).

But when we provide two different object types with identical data and place them in a HashSet<Pet>, only one of these objects survives. This emphasizes that equals() is not a perfectly mathematical concept but (at least partially) a mechanical one. hashCode() and equals() must be defined hand-in-hand in order to allow types to work properly in a hashed data structure.

In the example, both the Dog and Pig hash to the same bucket in the HashSet. At this point, the HashSet falls back to equals() to differentiate the objects, but equals() also declares the objects to be the same. The HashSet doesn’t add the Pig because it’s already got an identical object.
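
The hash-first, equals()-second sequence can be probed with a sketch like the following. Probe is a hypothetical class whose hash code is handed in by the caller so that collisions can be arranged on purpose. The third case breaks the hashCode() contract, so its result is what I’d expect from OpenJDK’s HashSet rather than anything the specification promises:

```java
// HashThenEquals.java
// Sketch only: Probe is a hypothetical class whose hash code
// is supplied by the caller so collisions can be arranged.
import java.util.*;

class HashThenEquals {
  // Adds both probes to a fresh HashSet; returns how many survive
  static int survivors(Probe a, Probe b) {
    Set<Probe> set = new HashSet<>();
    set.add(a);
    set.add(b);
    return set.size();
  }
  public static void main(String[] args) {
    // Same hash, equals() true: treated as a duplicate
    System.out.println(survivors(
      new Probe("Ralph", 1), new Probe("Ralph", 1)));
    // Same hash, equals() false: both are kept
    System.out.println(survivors(
      new Probe("Ralph", 1), new Probe("Bob", 1)));
    // Different hash, equals() true: the contract is broken
    // and the "duplicate" is stored anyway (OpenJDK behavior)
    System.out.println(survivors(
      new Probe("Ralph", 1), new Probe("Ralph", 2)));
  }
}

class Probe {
  private final String name;
  private final int hash;
  Probe(String name, int hash) {
    this.name = name;
    this.hash = hash;
  }
  @Override
  public boolean equals(Object rval) {
    return rval instanceof Probe &&
      Objects.equals(name, ((Probe)rval).name);
  }
  @Override
  public int hashCode() { return hash; }
}
/* Output:
1
2
2
*/
```

The first case is what happens to the Dog and the Pig: matching hash codes send the set to equals(), and equals() reports a duplicate.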

We can still make the example work by forcing uniqueness on otherwise identical objects. Here, each Pet already has a unique id, so you can either uncomment line [1] in equals() or switch to line [2] in hashCode(). In the canonical form you would do both, involving all “unchanging” fields in both operations. The fields must be unchanging so that equals() and hashCode() don’t produce different values between storing an object in a hashed data structure and retrieving it. I put “unchanging” in quotes because you must evaluate whether modification might happen.
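
As a rough sketch of the “do both” variation, here is a hypothetical TaggedPet (a cut-down cousin of Pet, invented for this listing) that includes its id in both equals() and hashCode():

```java
// UniquePets.java
// Sketch only: TaggedPet is a hypothetical variation of Pet
// that includes the unique id in equals() and hashCode().
import java.util.*;

class UniquePets {
  // Creates one TaggedPet per name; returns the set's size
  static int distinctCount(String... names) {
    Set<TaggedPet> pets = new HashSet<>();
    for(String name : names)
      pets.add(new TaggedPet(name));
    return pets.size();
  }
  public static void main(String[] args) {
    System.out.println(distinctCount("Ralph", "Ralph"));
  }
}

class TaggedPet {
  private static int counter = 0;
  private final int id = counter++;
  private final String name;
  TaggedPet(String name) { this.name = name; }
  @Override
  public boolean equals(Object rval) {
    return rval instanceof TaggedPet &&
      Objects.equals(id, ((TaggedPet)rval).id) &&
      Objects.equals(name, ((TaggedPet)rval).name);
  }
  @Override
  public int hashCode() {
    return Objects.hash(name, id);
  }
}
/* Output:
2
*/
```

Because no two objects share an id, no two are ever equal. Whether that is what you want depends on what “the same” should mean for your type.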

Side note: in hashCode(), if you are only working with a single field, use Objects.hashCode() and if you are using multiple fields use Objects.hash().
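
The two helpers are not interchangeable, as this small sketch suggests. The wrapper names single() and combined() are mine; the Objects calls are the standard library ones. Objects.hash() treats its arguments as an array, so even a lone field is folded through the 31-multiplier:

```java
// HashHelpers.java
// Sketch only: single() and combined() are hypothetical
// wrappers around the two java.util.Objects methods.
import java.util.Objects;

class HashHelpers {
  // The field's own hashCode(), or 0 if the field is null
  static int single(Object field) {
    return Objects.hashCode(field);
  }
  // Combines the fields the way Arrays.hashCode() does
  static int combined(Object... fields) {
    return Objects.hash(fields);
  }
  public static void main(String[] args) {
    System.out.println(single("a"));
    System.out.println(single(null));
    System.out.println(combined("a"));
    System.out.println(combined("a", 1));
  }
}
/* Output:
97
0
128
3969
*/
```

Note that single("a") and combined("a") differ (97 versus 31 + 97), so pick one approach for a class and stay with it.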

We can also solve the issue by following the standard form and defining equals() in the subclasses (but still not including the unique id):

// SubtypeEquality2.java
import java.util.*;

class Dog2 extends Pet {
  public Dog2(String name, Size size) {
    super(name, size);
  }
  @Override
  public boolean equals(Object rval) {
    return rval instanceof Dog2 &&
      super.equals(rval);
  }
}

class Pig2 extends Pet {
  public Pig2(String name, Size size) {
    super(name, size);
  }
  @Override
  public boolean equals(Object rval) {
    return rval instanceof Pig2 &&
      super.equals(rval);
  }
}

public class SubtypeEquality2 {
  public static void main(String[] args) {
    Set<Pet> pets = new HashSet<>();
    pets.add(new Dog2("Ralph", Size.MEDIUM));
    pets.add(new Pig2("Ralph", Size.MEDIUM));
    pets.forEach(System.out::println);
  }
}
/* Output:
Dog2[0]: Ralph MEDIUM a752aeee
Pig2[1]: Ralph MEDIUM a752aeee
*/

Notice that the hashCode()s are identical, but because the objects are no longer equals(), both now appear in the HashSet. Also, super.equals() means we don’t need access to the private fields in the base class.

One way to look at this is to say that Java separates substitutability from the definition of equals() and hashCode(). We can still place Dogs and Pigs into a Set<Pet> regardless of how equals() and hashCode() are defined, but the objects won’t behave correctly in hashed data structures unless those methods are defined with hashed structures in mind. Unfortunately, equals() is not only used in conjunction with hashCode(). This complicates things when you try to avoid defining it for specific classes, and it’s why it’s worth following the canonical form. However, this is further complicated because there are times when you don’t need to define either method.
