Spliterator Explained: The Engine Behind Streams

Post author: Noel Kamphoa
Post published: January 6, 2026

This entry is part 10 of 10 in the series Modern Java Features (Java 8+)

Introduction

Behind every Java stream lies a lesser-known but fundamental component: the Spliterator. While most developers interact only with high-level stream operations, the performance, correctness, and parallel behavior of streams are largely determined by how data is split and traversed.

This article explains what a Spliterator is, why it exists, and how it powers both sequential and parallel streams. Understanding this mechanism completes the mental model of the Java Stream API.

“Streams describe computations; spliterators define how data is traversed.”

1. Why Spliterator Exists

Before Java 8, iteration relied on Iterator, which supports only sequential traversal. This model does not scale well for parallel execution.

Java 8 introduced the Spliterator (a "splittable iterator") to close this gap. A Spliterator can:

Traverse elements sequentially, as an Iterator does
Split itself into multiple pieces for parallel processing
Estimate the size of the remaining elements
Describe characteristics of the data source (sorted, distinct, sized, etc.)

Every Collection in Java has a default Spliterator implementation, which you can access via the spliterator() method:

List<String> names = List.of("Alice", "Bob", "Charlie");
Spliterator<String> spliterator = names.spliterator();
spliterator.forEachRemaining(System.out::println);

2. The Four Key Responsibilities

The Spliterator interface defines four key methods.

boolean tryAdvance(Consumer<? super T> action);
Spliterator<T> trySplit();
long estimateSize();
int characteristics();

2.1. Traversal: tryAdvance()

This is the core iteration method. It takes a Consumer and applies it to the next element if one exists:

int count = 0;

List<String> names = new ArrayList<>(List.of("Alice", "Bob", "Charlie"));
Spliterator<String> sp = names.spliterator();

while (sp.tryAdvance(name -> System.out.println(name))) {
    count++;
}

System.out.println("Processed elements: " + count);

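An Iterator needs two calls per element: hasNext() to check and next() to fetch, and next() throws NoSuchElementException when no elements remain. tryAdvance() folds both steps into a single call and simply returns false when traversal is complete, which fits the functional style of the Stream API.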
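
To make the difference between the two traversal protocols concrete, the sketch below drains the same list once with an Iterator and once with tryAdvance(). The class and method names (TraversalComparison, drainWithIterator, drainWithSpliterator) are illustrative only and do not come from the JDK.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;

public class TraversalComparison {

    // Iterator protocol: two calls per element (hasNext, then next).
    static List<String> drainWithIterator(List<String> source) {
        List<String> out = new ArrayList<>();
        Iterator<String> it = source.iterator();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    // Spliterator protocol: one call per element. The boolean result
    // reports whether an element was handed to the consumer.
    static List<String> drainWithSpliterator(List<String> source) {
        List<String> out = new ArrayList<>();
        Spliterator<String> sp = source.spliterator();
        while (sp.tryAdvance(out::add)) {
            // nothing to do here: the consumer already stored the element
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> names = List.of("Alice", "Bob", "Charlie");
        System.out.println(drainWithIterator(names));    // [Alice, Bob, Charlie]
        System.out.println(drainWithSpliterator(names)); // [Alice, Bob, Charlie]
    }
}
```

Both loops visit the same elements in the same order; only the calling convention differs.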

2.2. Splitting: trySplit()

This is where the magic of parallel streams happens. When Java needs to process elements concurrently, it calls trySplit() to divide the work:

List<String> names = List.of("Alice", "Bob", "Charlie", "Diana");
Spliterator<String> rest = names.spliterator();

// trySplit() hands a prefix of the elements to a new spliterator
// (or returns null if it cannot split); "rest" keeps the remainder.
// Each spliterator can then be traversed by a different thread.
Spliterator<String> prefix = rest.trySplit();

System.out.println("Split-off prefix:");
if (prefix != null) {
    prefix.forEachRemaining(System.out::println);
}

System.out.println("Remaining elements:");
rest.forEachRemaining(System.out::println);

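The quality of splitting determines parallel efficiency. ArrayList’s spliterator divides its index range neatly in half in constant time, while LinkedList’s has no random access: it must walk the nodes and copy a batch of elements into an array to split them off.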
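
The difference can be observed by splitting each list once and comparing the size estimates of the two halves. This is a minimal sketch: SplitSizes and splitOnce are illustrative names, and the LinkedList numbers depend on the JDK's internal batching, so no output is promised for it here.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedList;
import java.util.List;
import java.util.Spliterator;

public class SplitSizes {

    // Returns {size of the split-off part, size left in the original}.
    static long[] splitOnce(Collection<Integer> source) {
        Spliterator<Integer> rest = source.spliterator();
        Spliterator<Integer> prefix = rest.trySplit();
        long prefixSize = (prefix == null) ? 0 : prefix.estimateSize();
        return new long[] { prefixSize, rest.estimateSize() };
    }

    public static void main(String[] args) {
        List<Integer> data = List.of(1, 2, 3, 4, 5, 6, 7, 8);

        long[] array = splitOnce(new ArrayList<>(data));
        long[] linked = splitOnce(new LinkedList<>(data));

        // ArrayList splits its index range at the midpoint.
        System.out.println("ArrayList : " + array[0] + " + " + array[1]);
        // LinkedList copies a batch into an array, so the split is
        // usually uneven for small lists.
        System.out.println("LinkedList: " + linked[0] + " + " + linked[1]);
    }
}
```

An uneven split means one thread receives most of the work, which is one reason LinkedList is a poor source for parallel streams.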

2.3. Estimation: estimateSize()

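Returns an estimate of the number of remaining elements, or Long.MAX_VALUE if the size is unknown, infinite, or too expensive to compute. For a SIZED spliterator the value is exact. The stream framework uses it to decide how far to split and how to size buffers: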

List<String> names = List.of("Alice", "Bob", "Charlie");
Spliterator<String> sp = names.spliterator();

System.out.println("Estimated size before traversal: " + sp.estimateSize()); // 3

sp.tryAdvance(System.out::println);

System.out.println("Estimated size after one element: " + sp.estimateSize()); // 2
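
The estimate is only exact when the source knows its size. As a small sketch of the other case, the example below wraps a plain Iterator with Spliterators.spliteratorUnknownSize(), which yields a spliterator without the SIZED characteristic. UnknownSizeDemo and unsized are illustrative names.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;

public class UnknownSizeDemo {

    // Wraps an iterator whose length is not known in advance.
    static Spliterator<String> unsized(Iterator<String> it) {
        return Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED);
    }

    public static void main(String[] args) {
        List<String> names = List.of("Alice", "Bob", "Charlie");

        Spliterator<String> sized = names.spliterator();
        Spliterator<String> unknown = unsized(names.iterator());

        // getExactSizeIfKnown() returns estimateSize() for SIZED
        // spliterators and -1 otherwise.
        System.out.println(sized.getExactSizeIfKnown());              // 3
        System.out.println(unknown.getExactSizeIfKnown());            // -1
        System.out.println(unknown.estimateSize() == Long.MAX_VALUE); // true
    }
}
```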

2.4. Characteristics: characteristics()

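Returns a bitmask of flags describing the data source’s properties:

ORDERED: Elements have a defined encounter order (List, arrays)
SORTED: Elements follow a defined sort order (TreeSet)
DISTINCT: No duplicates (Set implementations)
SIZED: Known exact size (arrays, most collections)
NONNULL: The source guarantees that no element is null
CONCURRENT: The source can be safely modified by multiple threads during traversal
IMMUTABLE: The source cannot be structurally modified during traversal
SUBSIZED: All child Spliterators, whether direct or indirect, will be SIZED

These characteristics allow stream operations to optimize themselves. For example, sorted() can skip the sort when the source is already SORTED in natural order, distinct() has nothing to filter on a DISTINCT source, and toArray() can allocate the result array up front for a SIZED source.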
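
Because characteristics() returns a plain int, each flag is tested with a bitwise AND. The helper below is a hypothetical utility (CharacteristicsDecoder and describe are not JDK names) that turns the bitmask into a readable string using the eight constants defined on Spliterator.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Spliterator;
import java.util.StringJoiner;
import java.util.TreeSet;

public class CharacteristicsDecoder {

    // Flag value -> readable name, in a fixed display order.
    private static final Map<Integer, String> FLAGS = new LinkedHashMap<>();
    static {
        FLAGS.put(Spliterator.ORDERED, "ORDERED");
        FLAGS.put(Spliterator.DISTINCT, "DISTINCT");
        FLAGS.put(Spliterator.SORTED, "SORTED");
        FLAGS.put(Spliterator.SIZED, "SIZED");
        FLAGS.put(Spliterator.NONNULL, "NONNULL");
        FLAGS.put(Spliterator.IMMUTABLE, "IMMUTABLE");
        FLAGS.put(Spliterator.CONCURRENT, "CONCURRENT");
        FLAGS.put(Spliterator.SUBSIZED, "SUBSIZED");
    }

    // Lists the names of all flags set in the spliterator's bitmask.
    static String describe(Spliterator<?> sp) {
        int bits = sp.characteristics();
        StringJoiner names = new StringJoiner(" | ");
        FLAGS.forEach((flag, name) -> {
            if ((bits & flag) != 0) {
                names.add(name);
            }
        });
        return names.toString();
    }

    public static void main(String[] args) {
        List<Integer> values = List.of(3, 1, 2);
        System.out.println(describe(new ArrayList<>(values).spliterator()));
        System.out.println(describe(new TreeSet<>(values).spliterator()));
    }
}
```

The hasCharacteristics(int) method performs the same bitwise test for a single flag or a combination of flags.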

Default Spliterator characteristics in the JDK

Every Java collection exposes a Spliterator, which describes how elements can be traversed and split:

ArrayList: ordered and size-aware

List<String> list = new ArrayList<>(List.of("Alice", "Bob", "Charlie"));
Spliterator<String> sp = list.spliterator();

System.out.println("ORDERED   : " + sp.hasCharacteristics(Spliterator.ORDERED));   // true
System.out.println("SORTED    : " + sp.hasCharacteristics(Spliterator.SORTED));    // false
System.out.println("DISTINCT  : " + sp.hasCharacteristics(Spliterator.DISTINCT));  // false
System.out.println("SIZED     : " + sp.hasCharacteristics(Spliterator.SIZED));     // true
System.out.println("SUBSIZED  : " + sp.hasCharacteristics(Spliterator.SUBSIZED));  // true

HashMap (keys): distinct but unordered

Map<String, Integer> map = new HashMap<>();
map.put("Alice", 30);
map.put("Bob", 25);
map.put("Charlie", 35);

Spliterator<String> sp = map.keySet().spliterator();

System.out.println("ORDERED   : " + sp.hasCharacteristics(Spliterator.ORDERED));   // false
System.out.println("SORTED    : " + sp.hasCharacteristics(Spliterator.SORTED));    // false
System.out.println("DISTINCT  : " + sp.hasCharacteristics(Spliterator.DISTINCT));  // true
System.out.println("SIZED     : " + sp.hasCharacteristics(Spliterator.SIZED));     // true

TreeSet: ordered because it is sorted

Set<Integer> treeSet = new TreeSet<>(Set.of(3, 1, 2));
Spliterator<Integer> sp = treeSet.spliterator();

System.out.println("ORDERED   : " + sp.hasCharacteristics(Spliterator.ORDERED));   // true
System.out.println("SORTED    : " + sp.hasCharacteristics(Spliterator.SORTED));    // true
System.out.println("DISTINCT  : " + sp.hasCharacteristics(Spliterator.DISTINCT));  // true
System.out.println("SIZED     : " + sp.hasCharacteristics(Spliterator.SIZED));     // true

HashSet: distinct but neither ordered nor sorted

Set<Integer> hashSet = new HashSet<>(Set.of(3, 1, 2));
Spliterator<Integer> sp = hashSet.spliterator();

System.out.println("ORDERED   : " + sp.hasCharacteristics(Spliterator.ORDERED));   // false
System.out.println("SORTED    : " + sp.hasCharacteristics(Spliterator.SORTED));    // false
System.out.println("DISTINCT  : " + sp.hasCharacteristics(Spliterator.DISTINCT));  // true
System.out.println("SIZED     : " + sp.hasCharacteristics(Spliterator.SIZED));     // true

3. How Streams Use Spliterators

When you create a stream from a collection:

List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
Stream<Integer> stream = numbers.stream();

Here’s what happens internally:

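The collection’s spliterator() method provides the source Spliterator
Intermediate operations (map, filter, etc.) are recorded as pipeline stages; no element is traversed yet
For parallel streams, the terminal operation calls trySplit() repeatedly to divide the work among threads
The terminal operation pulls elements from each Spliterator and pushes them through the stages, in bulk via forEachRemaining() or one at a time via tryAdvance() for short-circuiting operations such as findFirst()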
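
The first step can be reproduced by hand: StreamSupport.stream() builds a stream directly from a Spliterator, which is what the default Collection.stream() implementation is documented to do. The sketch below compares both routes; StreamFromSpliterator and its method names are illustrative.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

public class StreamFromSpliterator {

    // The usual route: let the collection create the stream.
    static List<Integer> doubledViaStream(List<Integer> numbers) {
        return numbers.stream()
                .map(n -> n * 2)
                .collect(Collectors.toList());
    }

    // The manual route: obtain the spliterator and hand it to StreamSupport.
    // The boolean selects a sequential (false) or parallel (true) stream.
    static List<Integer> doubledViaSpliterator(List<Integer> numbers, boolean parallel) {
        return StreamSupport.stream(numbers.spliterator(), parallel)
                .map(n -> n * 2)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(doubledViaStream(numbers));             // [2, 4, 6, 8, 10]
        System.out.println(doubledViaSpliterator(numbers, false)); // [2, 4, 6, 8, 10]
        System.out.println(doubledViaSpliterator(numbers, true));  // [2, 4, 6, 8, 10]
    }
}
```

The parallel variant returns the same list because the source is ORDERED and the collector preserves encounter order.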

The quality of splitting directly impacts performance.

“Good splitting leads to good parallelism.”

4. Writing a Custom Spliterator (When Needed)

In day-to-day Java development, you will rarely manipulate a Spliterator directly.
Standard collections, arrays, and I/O utilities already expose well-designed spliterators, and the Stream API builds on them transparently.

That said, a custom Spliterator becomes useful when you want to stream data that is not stored in a collection, but still has:

a clear traversal order,
a known size,
and no reason to be fully loaded into memory.

A common real-world example is streaming a numeric range or identifiers produced on the fly, such as database IDs or batch numbers.

Minimal real-world example: streaming an ID range

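import java.util.Spliterator;
import java.util.function.Consumer;

// Streams the inclusive range [start, end] without materializing it.
class IdRangeSpliterator implements Spliterator<Long> {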

    private long current;
    private final long end;

    IdRangeSpliterator(long start, long end) {
        this.current = start;
        this.end = end;
    }

    @Override
    public boolean tryAdvance(Consumer<? super Long> action) {
        if (current > end) {
            return false;
        }
        action.accept(current++);
        return true;
    }

    @Override
    public Spliterator<Long> trySplit() {
        long remaining = end - current;
        if (remaining < 10) {
            return null; // too small to split
        }
        long mid = current + remaining / 2;
        // Hand off the prefix [current, mid]; this instance keeps [mid + 1, end]
        Spliterator<Long> split = new IdRangeSpliterator(current, mid);
        current = mid + 1;
        return split;
    }

    @Override
    public long estimateSize() {
        return end - current + 1;
    }

    @Override
    public int characteristics() {
        return ORDERED | SIZED | SUBSIZED | NONNULL | IMMUTABLE;
    }
}

Usage:

Spliterator<Long> spliterator = new IdRangeSpliterator(1, 100);

StreamSupport.stream(spliterator, false)
        .forEach(System.out::println);
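
The usage above is sequential. As a sketch of the parallel case, the example below passes true to StreamSupport.stream() and also splits the range once by hand. It repeats a condensed copy of IdRangeSpliterator so that it compiles on its own; IdRangeParallelDemo and parallelSum are illustrative names.

```java
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.stream.StreamSupport;

public class IdRangeParallelDemo {

    // Condensed copy of the article's IdRangeSpliterator.
    static class IdRangeSpliterator implements Spliterator<Long> {
        private long current;
        private final long end;

        IdRangeSpliterator(long start, long end) {
            this.current = start;
            this.end = end;
        }

        @Override
        public boolean tryAdvance(Consumer<? super Long> action) {
            if (current > end) return false;
            action.accept(current++);
            return true;
        }

        @Override
        public Spliterator<Long> trySplit() {
            long remaining = end - current;
            if (remaining < 10) return null;
            long mid = current + remaining / 2;
            Spliterator<Long> split = new IdRangeSpliterator(current, mid);
            current = mid + 1;
            return split;
        }

        @Override
        public long estimateSize() {
            return end - current + 1;
        }

        @Override
        public int characteristics() {
            return ORDERED | SIZED | SUBSIZED | NONNULL | IMMUTABLE;
        }
    }

    // Sums the range with a parallel stream; the framework calls
    // trySplit() to distribute sub-ranges across worker threads.
    static long parallelSum(long start, long end) {
        return StreamSupport.stream(new IdRangeSpliterator(start, end), true)
                .mapToLong(Long::longValue)
                .sum();
    }

    public static void main(String[] args) {
        Spliterator<Long> rest = new IdRangeSpliterator(1, 100);
        Spliterator<Long> prefix = rest.trySplit();
        System.out.println(prefix.estimateSize() + " + " + rest.estimateSize()); // 50 + 50
        System.out.println(parallelSum(1, 100)); // 5050
    }
}
```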

Why this example matters

No collection is created
The stream remains ordered and size-aware
The logic stays simple and predictable

An Iterator could traverse the data, but only a Spliterator can also describe its size, its ordering, and how to split it, which is what allows the Stream API to work efficiently, especially for parallel execution.

Write a custom Spliterator only when you need to describe a data source, not just iterate over it.

Most applications will never need one, but understanding this mechanism clarifies how streams handle ordering, sizing, and parallelism under the hood.

Note: The above example could be replaced by LongStream.rangeClosed.
It is intentionally simplified to illustrate how a Spliterator works.
In real systems, the same structure is used when the data source is lazy, paginated, or external.
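
For a paginated source, one possible shape is sketched below. It extends Spliterators.AbstractSpliterator, which only requires tryAdvance() and supplies a batching trySplit(). The page loader (an IntFunction that returns an empty list once the pages run out) is a hypothetical stand-in for a database or HTTP call, and PagedSpliteratorDemo, PagedSpliterator and readAll are illustrative names.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.function.IntFunction;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

public class PagedSpliteratorDemo {

    // Loads pages lazily, one at a time, only when elements are requested.
    static final class PagedSpliterator<T> extends Spliterators.AbstractSpliterator<T> {
        private final IntFunction<List<T>> fetchPage; // hypothetical page loader
        private Iterator<T> currentPage = Collections.emptyIterator();
        private int nextPage = 0;
        private boolean exhausted = false;

        PagedSpliterator(IntFunction<List<T>> fetchPage) {
            // Total size is unknown up front, so report Long.MAX_VALUE and omit SIZED.
            super(Long.MAX_VALUE, Spliterator.ORDERED);
            this.fetchPage = fetchPage;
        }

        @Override
        public boolean tryAdvance(Consumer<? super T> action) {
            while (!currentPage.hasNext()) {
                if (exhausted) return false;
                List<T> page = fetchPage.apply(nextPage++);
                if (page.isEmpty()) {   // assumption: an empty page marks the end
                    exhausted = true;
                    return false;
                }
                currentPage = page.iterator();
            }
            action.accept(currentPage.next());
            return true;
        }
    }

    // Collects every element from every page into one list.
    static List<String> readAll(IntFunction<List<String>> loader) {
        return StreamSupport.stream(new PagedSpliterator<>(loader), false)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> pages = List.of(List.of("a", "b"), List.of("c"));
        System.out.println(readAll(i -> i < pages.size() ? pages.get(i) : List.of())); // [a, b, c]
    }
}
```

Because the stream is lazy, a short-circuiting operation such as limit() or findFirst() stops requesting pages once it has enough elements.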

5. Spliterator vs Iterator: Key Differences

While both interfaces traverse elements, their responsibilities differ.

Feature	Iterator	Spliterator
Traversal	Sequential only / `hasNext()` / `next()`	Sequential within each instance, parallel across splits / `tryAdvance(Consumer)`
Parallel streams	No splitting support	Yes via `trySplit()`
Bulk operations	`forEachRemaining(Consumer)` (default method since Java 8)	`forEachRemaining(Consumer)`
Metadata	None	Size estimation, characteristics

“Iterator walks; Spliterator divides and conquers.”

Conclusion

The Spliterator is the silent engine of the Java Stream API. It defines how data is traversed, how work is divided, and how parallel execution scales. While most developers never implement one, understanding spliterators explains why streams behave the way they do.

You can find the complete code of this article here on GitHub.
