A River of Bytes: Notes from the second day of PNWScala 2014

Here are some notes from the second and last day of the very successful PNWScala 2014 conference.

Adding Tree and Tree: Distributed Decision Tree Learning - Avi Bryant (Stripe)

This was about the Brushfire framework for learning decision trees, built on Hadoop via Scalding and Algebird. Generality, modularity and composability seem to have been very carefully thought through. The high-level approach is based on the PLANET paper. The code will soon be available at http://github.com/stripe/brushfire.

What's new since Programming in Scala - Marconi Lanna (Originate)

A guided tour of language features added since the last edition of the book. Some notable ones:

App trait (2.9)
Range foreach optimization (2.10)
Parallel collections (2.9)
Generalized try catch finally with reusable exception handling via PartialFunction
Try [almost] monad (2.10)
Implicit classes (2.10)
Value classes (2.10)
Extension methods (2.10)
String interpolation (2.10) and custom interpolators
Futures and promises (2.10, 2.9.3)
Dynamic trait (2.10)
Akka actors
Modularization of advanced language features
Reflection, macros and quasiquotes
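A few of the features in this list fit in one small program. This is only my own sketch, not code from the talk, and every name in it (FeatureTour, CelsiusOps, shout, parseOrZero, add) is made up for illustration. It uses nothing beyond the standard library.

```scala
import scala.util.Try

object FeatureTour {
  // Implicit class + value class (2.10): an extension method on Double
  // that usually avoids allocating a wrapper object.
  implicit class CelsiusOps(val degrees: Double) extends AnyVal {
    def toFahrenheit: Double = degrees * 9 / 5 + 32
  }

  // Custom string interpolator (2.10): an extension method on StringContext.
  implicit class ShoutInterpolator(val sc: StringContext) extends AnyVal {
    def shout(args: Any*): String = sc.s(args: _*).toUpperCase
  }

  // Reusable exception handling: catch accepts a PartialFunction value.
  val badNumberAsZero: PartialFunction[Throwable, Int] = {
    case _: NumberFormatException => 0
  }
  def parseOrZero(s: String): Int = try s.toInt catch badNumberAsZero

  // Try (2.10) in a for comprehension; the first failure short-circuits.
  def parse(s: String): Try[Int] = Try(s.toInt)
  def add(a: String, b: String): Try[Int] =
    for (x <- parse(a); y <- parse(b)) yield x + y

  def main(args: Array[String]): Unit = {
    val lang = "scala"
    println(s"Boiling point: ${100.0.toFahrenheit} F")
    println(shout"hello $lang")
    println(parseOrZero("not a number"))
    println(add("20", "22"))
    println(add("20", "oops").isFailure)
  }
}
```

The same PartialFunction value can be passed to any number of try blocks, which is what makes the handler reusable.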

One Year of Akka - Ryan Tanner (Conspire)

A "view from the trenches", describing adoption of Scala and Akka at an early-stage startup, and dealing with scaling issues. We were reminded that "Akka won't save you from building a monolith" and that it's easy to end up with a tightly coupled system. Additional advice included "pull, don't push", as described in the Akka work pulling pattern and more specifically in a post on the Conspire blog. The latter is the last of a series of five posts on this whole effort, and all five seem very much worth reading. Like some other members of the audience, I was surprised to hear that Conspire was in the process of making a turn away from Akka clustering (but not Akka actors) and planning to introduce Kafka.

Hands-on Scala.js - Li Haoyi (Dropbox)

Lots of live coding in this talk demonstrated that Scala.js appears to deliver on its promise of unifying server side and portable browser side programming in a single, strongly typed language with decent performance. The story gets even stronger when ScalaTags is included, providing an interface to the DOM. Example projects started very simple but got quite complex. There was also a project showing common code on the browser and server, and, for a grand finale, an example where communication between browser and server code was type checked. A very engaging presentation.

Unruly Creatures: Strategies for dealing with Real Numbers - Erik Osheim (Typelevel)

Starting with a "primitive math blooper reel", this talk motivated and explained the Spire library, which provides various advanced and well behaved representations of numbers.
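For a flavor of what such bloopers look like, here are a few well-known ones. They are my own picks and not necessarily the ones shown in the talk. The sketch uses only the standard library (Spire itself is not used here), and the object and method names are made up.

```scala
object PrimitiveBloopers {
  // Binary floating point cannot represent 0.1 or 0.2 exactly.
  def floatSumIsExact: Boolean = 0.1 + 0.2 == 0.3

  // Int arithmetic silently wraps around on overflow.
  def increment(n: Int): Int = n + 1

  // abs of the smallest Int has no positive counterpart, so it stays negative.
  def absOfMin: Int = math.abs(Int.MinValue)

  // An arbitrary-precision decimal type behaves as expected for this sum.
  def decimalSumIsExact: Boolean =
    BigDecimal("0.1") + BigDecimal("0.2") == BigDecimal("0.3")

  def main(args: Array[String]): Unit = {
    println(floatSumIsExact)
    println(increment(Int.MaxValue) == Int.MinValue)
    println(absOfMin < 0)
    println(decimalSumIsExact)
  }
}
```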

What every (Scala) programmer should know about category theory - Gabriel Claramunt

I've watched people try to give variants of this talk for three decades and it hasn't gotten any easier, especially in front of an audience with varied interests and backgrounds. It's quite a bit more compelling with Scala than it was with Standard ML. The bigger problem is that Scala has been successful to a large degree because it hasn't just been pitched to people who have learned or are willing to learn category theory. Most Scala programmers will never learn it and that's mostly a good thing. But knowing it does yield some insight into Scala, so this talk remains worth giving, and perhaps the Scala variant is more relevant than those in the past. Best line: "I came for the abstraction, stayed for the composition."

It may be worth checking out "Category Theory Applied to Functional Programming."

I'm still very interested in the question "what shared conceptual model do all Scala programmers need?", but my starting point is that it probably isn't category theory. It may be a simplified version that explains what a monad is (and isn't) and why it matters.

Building a Better Future: Advanced Error Handling for Concurrent Programming with Scalaz and Shapeless - Jean-Rémi Desjardins and Eddie Carlson (Whitepages)

The last of several good discussions of error handling, this time in the context of futures. Almost anybody who has used futures a lot has at some point needed to collect multiple futures into a single one. Then they learned the hard way that Future.sequence doesn't do quite what they want: it returns the first error in traversal order of the sequence, rather than in temporal order, and thus doesn't "fail fast" as is usually desired. See, for example, this discussion on Stack Overflow. A lot of this solution was over my head as I haven't used either scalaz or shapeless, but the key ingredients were scalaz.Applicative, shapeless.HList and the HList sequencing features of shapeless-contrib. I'm hoping the slides get posted as this is an important problem.
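To make the problem concrete, here is a minimal fail-fast variant written with only the standard library. It is not the scalaz/shapeless solution from the talk, which also handles heterogeneous result types. The names FailFast and sequenceFailFast are my own.

```scala
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._

object FailFast {
  /** Like Future.sequence, but fails as soon as any input future fails. */
  def sequenceFailFast[A](futures: Seq[Future[A]])(
      implicit ec: ExecutionContext): Future[Seq[A]] = {
    val result = Promise[Seq[A]]()
    // Whichever failure arrives first in time completes the promise.
    futures.foreach(_.failed.foreach(e => result.tryFailure(e)))
    // If nothing fails, complete with the ordinary sequenced result.
    Future.sequence(futures).onComplete(r => result.tryComplete(r))
    result.future
  }

  def main(args: Array[String]): Unit = {
    import ExecutionContext.Implicits.global
    val never = Promise[Int]().future // stands in for a very slow future
    val bad = Future.failed[Int](new IllegalStateException("boom"))
    // Future.sequence(Seq(never, bad)) would wait on `never` before
    // noticing `bad`; the fail-fast version reports the failure promptly.
    val outcome = Await.ready(sequenceFailFast(Seq(never, bad)), 5.seconds).value
    println(outcome)
  }
}
```

The try-prefixed Promise methods matter here: several callbacks may race to complete the promise, and only the first one should win.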

Composing Project Archetypes with SBT AutoPlugins - Mark Schaake (Allen Institute for Artificial Intelligence)

How to solve "Multiple Build Maintenance Hell" (MBMH) in an organization with lots of sbt projects. The solution described is essentially to define shared, versioned plugins based on the AutoPlugin concept introduced in sbt 0.13.5, described in this tutorial. Each plugin covers one facet of a project (a command line tool, a web service, ...) and plugins can depend on each other using "requires". The specific plugins defined have been open sourced. The approach seems intuitively right, but somebody asked how a developer could be sure to avoid accidentally overriding plugin behavior. This seems like an interesting problem, as sbt seems to be an area where developers often cudgel their code into submission without knowing what they're doing, and the first thing that "works" tends to get checked in (and not looked at again until something breaks).
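A rough sketch of what two such plugins might look like follows. These are not the plugins from the talk: CoreSettingsPlugin, CliToolPlugin and the cliName key are names I invented, and the settings are placeholders. The code would live in a plugin project built against sbt 0.13.5 or later, so it does not run on its own.

```scala
import sbt._
import sbt.Keys._

// Hypothetical base plugin: settings every project in the organization shares.
object CoreSettingsPlugin extends AutoPlugin {
  // Build on sbt's standard JVM settings.
  override def requires: Plugins = plugins.JvmPlugin

  override def projectSettings: Seq[Setting[_]] = Seq(
    scalacOptions ++= Seq("-deprecation", "-feature")
  )
}

// Hypothetical facet plugin for command line tools.
object CliToolPlugin extends AutoPlugin {
  // Enabling this plugin also pulls in CoreSettingsPlugin.
  override def requires: Plugins = CoreSettingsPlugin

  // Keys in autoImport become visible in build.sbt without an import.
  object autoImport {
    val cliName = settingKey[String]("Name of the generated command line tool")
  }
  import autoImport._

  override def projectSettings: Seq[Setting[_]] = Seq(
    cliName := name.value
  )
}
```

A project would then opt in from its build.sbt with something like lazy val tool = project.enablePlugins(CliToolPlugin), and could still override cliName locally, which is exactly where the accidental-override question comes from.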

Sunday, November 16, 2014