POPL 2017
Sun 15 - Sat 21 January 2017
Wed 18 Jan 2017 16:55 - 17:20 at Auditorium - Compiler Optimisation Chair(s): Andrew Myers

Streams have (re-)emerged as a mainstream data processing paradigm. Widely-used stream libraries are now available for virtually all modern OO and functional languages, from Java to C# to Scala to OCaml to Haskell. In all past work, however, stream support comes with several shortcomings, in both expressiveness and performance. For instance, the well-optimized and immensely popular Java 8 streams do not support a zip operator and are still many tens of times slower than hand-written loops.

We present the first approach that encodes the full generality of stream processing and eliminates all overheads, via the use of staging. It is based on an unusually rich semantic model of stream interaction. We support zipping, nesting (or flat-mapping), sub- ranging, filtering, mapping—and any combination—of finite or infinite streams. Our model captures idiosyncrasies that a programmer uses in optimizing stream pipelines, such as rate differences and the choice of a “for” vs. “while” loops. Hence our approach delivers hand-written–like code, but automatically. It explicitly avoids the reliance on black-box optimizers and sufficiently-smart compilers, offering highest, guaranteed and portable performance.

Our approach relies on high-level concepts that are then readily mapped to an implementation. Accordingly, we have two diverse implementations: an OCaml stream library, staged via MetaOCaml, and a Scala library for the JVM, staged via LMS. In both cases, we derive libraries richer and simultaneously many tens of times faster than past work. We greatly exceed in performance the standard stream libraries available in Java, Scala and OCaml, including the well-optimized Java 8 streams.