Monotonic Collections: a middle ground between immutable and fully mutable
This post covers several topics around collections (sets, lists, maps/dictionaries, queues, etc) that I’d like to see someone explore more fully. To my knowledge, there are many alternative collection libraries for Java and for many other languages, but I’m not aware of any that provide support for monotonic collections. What is a monotonic collection, I hear you ask? Well, I’m about to answer that. Jesus, give me a moment.
It’s become popular, in the JVM ecosystem at least, for collections libraries to provide parallel class hierarchies for mutable and immutable collections: Set vs MutableSet, List vs MutableList, etc. I think this probably originated with Scala, and has been copied by Kotlin, and various alternative collection libraries, e.g. Eclipse Collections, Guava, etc. There are plenty of articles out there on the benefits and drawbacks of each type. But the gulf between fully immutable and fully mutable objects is enormous: they are polar opposites, with wildly different properties, performance profiles, and gotchas. I’m interested in exploring the space between these two extremes. (Actually, I’m interested in someone else exploring it, hence this post). One such point is the idea of monotonic collections, and I’ll now explain what that means.
By monotonic I mean here logical monotonicity: the idea that any information that is entailed by some set of logical formulas is also entailed by any superset of those formulas. For a collection data structure, I would formulate that as follows:
If any (non-negated) predicate is true of the collection at time t, then it is also true of the collection at any time t’ > t.
For example, if c is a collection and c.contains(x) returns true at some point in time, then it must always return true from then onwards.
To make this concrete, a MonotonicList (say) would have an append operation, but not insert, delete, or replace operations. More subtly, monotonic collections cannot have any aggregate operations: i.e., operations that report statistics/summary information on the collection as a whole. For example, you cannot have a size method, as the size will change as new items are added (and thus the predicate c.size() == n can become false). You can have (as I understand it) map and filter operations, but not a reduce/fold.
So why are monotonic collections an important category to look at? Firstly, monotonic collections can have some of the same benefits as immutable data structures, such as simplified concurrency. Secondly, monotonic collections are interesting because they can be (relatively) easily made distributed, per the CALM principle: Consistency as Logical Monotonicity (insecure link, sorry). This says that monotonic collections are strongly eventually consistent without any need for coordination protocols. Providing such collections would thus somewhat simplify making distributed systems.
Class hierarchies and mutability
Interestingly, Kotlin decided to make their mutable collection classes sub-types of the immutable ones: MutableList is a sub-type of List, etc. (They also decided to make the arrows go the other way from normal in their inheritance diagram, crazy kids). This makes sense in one way: mutable structures offer more operations than immutable ones. But it seems backwards from my point of view: it says that all mutable collections are immutable, which is logically false. (But then they don’t include the word Immutable in the super types). It also means that consumers of a List can’t actually assume it is immutable: it may change underneath them. Guava seems to make the opposite decision: ImmutableList extends the built-in (mutable) List type, probably for convenience. Both options seem to have drawbacks.
I think the way to resolve this is to entirely separate the read-only view of a collection from the means to update it. On the view-side, we would have a class hierarchy consisting of ImmutableList, which inherits from MonotonicList, which inherits from the general List. On the mutation side, we’d have a ListAppender and ListUpdater classes, where the latter extends the former. Creating a mutable or monotonic list would return a pair of the read-only list view, and the mutator object, something like the following (pseudocode):
ImmutableList<T> list = ImmutableList.of(....); // normalPair<MonotonicList<T>, ListAppender<T>> mono = MonotonicList.of(...);Pair<List<T>, ListUpdater<T>> mut = List.of(...);
The type hierarchies would look something like the following:
interface List<E> { void forEach(Consumer<E> action); ImmutableList<E> snapshot();}interface MonotonicList<E> extends List<E> { boolean contains(E element); // Positive version of isEmpty(): boolean containsAnything(); <T> MonotonicList<T> map(Function<E, T> f); MonotonicList<E> filter(Predicate<E> p);}interface ImmutableList<E> extends MonotonicList<E> { int size(); <T> T reduce(BiFunction<E, T, T> f, T initial);}interface ListAppender<E> { void append(E element);}interface ListUpdater<E> extends ListAppender<E> { E remove(int index); E replace(int index, E newValue); void insert(int index, E newValue);}
This seems to satisfy allowing the natural sub-type relationships between types on both sides of the divide. It’s a sort of CQRS at the level of data structures, but it seems to solve the issue that the inheritance direction for read-only consumers is the inverse of the natural hierarchy for mutating producers. (This has a relationship to covariant/contravariant subtypes, but I’m buggered if I’m looking that stuff up again on my free time).
Anyway, these thoughts are obviously pretty rough, but maybe some inklings of ideas if anyone is looking for an interesting project to work on.
#dataStructures #FunctionalProgramming #immutability #Java #programming