Notes on Ousterhout's Philosophy of Software Design
In A Philosophy of Software Design, John Ousterhout discusses approaches to software interface design that, in his experience, lead to simple and maintainable code. I've collected a few musings and reactions here.
Chapter 4: Modules should be deep
I think there are some neat insights in this chapter, although the language and examples took me a little time to mull over. In particular, "modules" are a key part of the argument, but that term has a lot of specific meanings in different contexts. Ousterhout explains that by using the term "module" he could be talking about classes, subsystems, or services. He does not say "function," but he does later refer to shallow methods. However, in my opinion, some of the arguments break down at the scale of methods or functions.
I would restate (and embellish) the theme that resonated with me as follows:
- software consists of abstractions
- abstractions provide value in one of two ways:
  - they bundle functionality in a way that is semantically atomic at the level of a particular abstraction
  - they hide details in order to make operations at a given level generalizable
- deep modules add more functionality than shallow modules relative to their interface (surface area)
- shallow modules aren't shallow because they use underlying abstractions, but because they add very little on top of those abstractions.
The reason that I had some trouble with Ousterhout's discussion is that in some subsequent examples, he makes the case that a module is too shallow, but my instinct for what constitutes "enough" functionality to justify a module is evidently lower. For example, in chapter 4 he cautions against factoring methods into smaller methods because it creates "large numbers of shallow [...] methods." Outside of extremely performance-sensitive paths, I don't personally see much difference between a "shallow" internal method and a value binding within a method that captures the result of some shallow operation.
Maybe absolute "depth" isn't the right measure, but instead the aspect ratio: the quotient of depth to interface "width", such that the wider an interface, the more functionality is required to justify it. With this revised measure, shallow modules with extremely constrained interfaces may still be worthwhile when they crystallize some human idea or way of understanding better than the underlying abstraction does.
Java classitis
Ousterhout asserts that the usual approach to reading data from a file in Java exemplifies bad interface design. To recap, the typical approach from his example looks like
FileInputStream fileStream = new FileInputStream(path);
BufferedInputStream bufferedStream = new BufferedInputStream(fileStream);
ObjectInputStream objectStream = new ObjectInputStream(bufferedStream);
// Do something with `objectStream`, and never touch `fileStream` and
// `bufferedStream` again.
// [...]
objectStream.close();
Ousterhout's objections centre on the first two lines. (The third emphasizes the verbosity of the API, but I don't think it makes an additional point, so I'll ignore it from here on.) In particular, he notes that
- the majority of the time, the programmer wants to buffer the input for performance, so the second line is required
- the ceremony of the interface exposes unnecessary complexity to the programmer
Ousterhout recommends that the common use of an interface be as simple as possible, and argues that the Java input stream interface fails at that.
I don't disagree with making interfaces simple in the common case, but I do have
a somewhat different view of the interface above. In fact, rather than being an
example of bad design, I think it's pretty nice. (Incidentally, Rust and Go, both implemented by teams of talented designers who had a decade plus to learn from the Java example, end up doing more or less the same thing.) It cleanly separates two different things:
reading bytes from a file, and buffering those bytes for efficiency. It layers
these together to get the necessary behaviour. The advantage of composition in
this case is that the individual parts are reusable. The programmer could
provide a different kind of unbuffered InputStream to BufferedInputStream
and get buffering without needing to reimplement it.
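For instance (a sketch of my own, not from the book), the same BufferedInputStream can buffer bytes arriving over a socket, with no file involved:

import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;

class BufferedSocketInput {
    // BufferedInputStream doesn't care where the bytes come from; the same
    // buffering logic wraps a socket stream as readily as a file stream.
    static InputStream open(String host, int port) throws IOException {
        Socket socket = new Socket(host, port);
        return new BufferedInputStream(socket.getInputStream());
    }
}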
However, there's merit to the assertion that making the programmer handle this in the default case increases cognitive load. Ousterhout mentions that in some cases a programmer may need to avoid buffering, and that it should be possible to disable it. In Java (the language of his example), I can think of two obvious ways of doing this:
1. implement a void setBuffering(boolean) method that enables or disables buffering. (This is roughly what Haskell does via BufferMode and what Racket does with file-stream-buffer-mode.) Personally, I don't think this is great because it creates some new design problems where before there were none: what happens if the buffering mode is changed midway through using a stream? And when exactly should a buffer be allocated, assuming that we default to buffering but may disable it?
2. add an additional constructor for FileInputStream that accepts an option to disable buffering, defaulting to using buffering (sketched below). This is similar to the approach in Python, but it (like the previous suggestion) gives up the existing design's flexibility that permits any type of InputStream to be buffered.
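Neither of these exists in java.io. As a rough sketch of option (2), invented entirely for illustration, a file stream with a constructor flag might look like:

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical: not part of the Java standard library.
class MaybeBufferedFileInput extends InputStream {
    private final InputStream inner;

    MaybeBufferedFileInput(String path) throws FileNotFoundException {
        this(path, true);  // buffering is the default
    }

    MaybeBufferedFileInput(String path, boolean buffered) throws FileNotFoundException {
        InputStream raw = new FileInputStream(path);
        this.inner = buffered ? new BufferedInputStream(raw) : raw;
    }

    public int read() throws IOException {
        return inner.read();
    }

    public void close() throws IOException {
        inner.close();
    }
}

Note that the flag only helps file input; a socket stream would still need BufferedInputStream, which is exactly the flexibility the composition-based design preserves.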
Yet I still think it's valuable for the underlying implementation to use the original API, keeping file reading and buffering as separate concerns. (It would be nice if BufferedInputStream could also be relied on to mark within the type system that an input source is buffered, such that consumers of the type could require a buffered source. Sadly, that's not the case, as passing the bufferedStream to the next constructor doesn't retain this information. That's easier with option (2).) A third approach that doesn't require buy-in from
the maintainers of the Java standard library is to observe that this is
ultimately about constructors, and constructors are nothing but functions. We
know that we can compose functions f and g such that (f ∘ g) = λx →
f(g(x)). Java may not provide the syntax to compose class constructors, but I
guess we can manually write
class FileInputOps {
    // Compose the two constructors into one convenience function:
    // open the file, then wrap it in a buffer.
    static BufferedInputStream bufferedFileInput(String path) throws FileNotFoundException {
        FileInputStream fileStream = new FileInputStream(path);
        return new BufferedInputStream(fileStream);
    }
}
or even
class SimpleInputStream extends InputStream {
    private final BufferedInputStream bufferedStream;

    SimpleInputStream(String path) throws FileNotFoundException {
        FileInputStream fileStream = new FileInputStream(path);
        bufferedStream = new BufferedInputStream(fileStream);
    }

    public int read() throws IOException {
        return bufferedStream.read();
    }

    // InputStream's default close() is a no-op, so without this override
    // a SimpleInputStream would leak the underlying file handle.
    public void close() throws IOException {
        bufferedStream.close();
    }
}
The underlying input streams remain part of the public API, although relegated
to special use cases, and the common case uses SimpleInputStream and saves a
line of boilerplate.
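The common case then shrinks to:

InputStream input = new SimpleInputStream(path);
// Do something with `input`...
input.close();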
Chapter 7: Passthrough values
As functions or objects become nested, values required by a procedure low in the
hierarchy may need to be passed through multiple levels of abstraction. For
example, a CLI argument that controls the behaviour of a deeply buried component
ends up as a parameter in all intermediate levels between that component and
main.
Ousterhout recommends a pattern employing a context object that combines the values being passed downward. So rather than
(define (foo a b c)
  (bar a b c))

(define (bar a b c)
  (begin
    (do-something-with a)
    (baz b c)))

(define (baz b c)
  (do-something-else-with b c))
you'd have
(struct ctx [a b c])

(define (foo context)
  (bar context))

(define (bar context)
  (begin
    (do-something-with (ctx-a context))
    (baz context)))

(define (baz context)
  (do-something-else-with (ctx-b context) (ctx-c context)))
Is that better? Maybe sometimes, as the hierarchy deepens and the number of context values grows, and particularly when the context is immutable within a scope. It still has downsides though:
- the context itself is still a passthrough value (or it becomes global within some scope). I'm not sure that's a big deal, except that
- everything in the context is still visible in the intermediate layers. For example, baz in the example above now has access to a. That might result in unintentional dependencies and coupling.
Nevertheless, I've seen this basic approach used in a lot of places, e.g.:
- Go's context package
- gRPC-java's Context
- Haskell's Reader, which does essentially the same thing as an explicit context parameter (using a monad, so you're in the club), but encodes the passthrough in the type rather than the parameter list
Chapter 9
Doesn't the example in listing 9.2 contain a bug? That is, based on the code shown, it looks like the flow of execution always passes through the labeled logging statements, even if there was no error to log.
(A typo in a code block is no big deal, but it's ironic that the thesis of this
one is that goto is a good idea in this situation. However, goto is frowned
upon precisely because this sort of ad hoc control flow makes it easier to write
bugs like this. Furthermore, it looks like the code ostensibly being fixed could
have been improved by deduplicating with a simple utility method... I think that
in this book that might be considered too "shallow," which is a shame because
it's just a more structured way of solving exactly the problem that the goto
imperfectly solves.)
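Concretely, the kind of utility method I have in mind might look like this (a sketch of my own in Java with invented names, not the book's listing):

class ErrorReturns {
    // Deduplicate the log-and-fail step that the goto labels jump to.
    static int fail(String message) {
        System.err.println("error: " + message);
        return -1;
    }

    static int readConfig(String path) {
        if (path == null) {
            return fail("no config path given");
        }
        if (path.isEmpty()) {
            return fail("config path is empty");
        }
        // ... happy path ...
        return 0;
    }
}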
Chapter 10
I love the strategy of defining errors out of existence. I also enjoyed this chapter because, rather than relying on examples from student projects that seem like strawmen, it calls out two real examples from professionally designed projects: Tcl's unset command, and Windows' handling of file deletion. In both cases there are elegant ways to redefine behaviour such that
- the component fulfills the basic requirements of the problem
- the interface is narrower
In other words, the functionality is essentially the same, but the interface is smaller, so the aspect ratio (using the measure proposed in the commentary on chapter 4) is larger.
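A small-scale instance of the same move from Java's standard library (my example, not the book's): java.nio.file.Files pairs delete, which throws if the file is already missing, with deleteIfExists, which defines that error out of existence.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class Cleanup {
    static void ensureGone(Path path) throws IOException {
        // Files.delete(path) throws NoSuchFileException when the file is
        // already absent; deleteIfExists treats "already gone" as success,
        // narrowing the interface callers have to reason about.
        Files.deleteIfExists(path);
    }
}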