In the last few weeks, a bit of discussion of John Shutt's
Fexpr-based programming language, Kernel, has sprung up in
various circles. I read Shutt's PhD thesis, implemented a couple
of toy Kernel-inspired interpreters (in Javascript and
Racket), and became simultaneously excited about the simplicity,
hygiene and power of the approach and apprehensive about the strong
possibility that there was something fundamental I was
overlooking. After all, the theory of Fexprs is
trivial.
Calling Kernel's Fexprs "trivial" doesn't seem quite right to me;
perhaps calling them "inscrutable" would be a better fit.
Three serious problems with Kernel and $vau
After a discussion with a friend earlier this week, I think it's
important to note three serious problems with Kernel and the
underlying $vau calculi Shutt proposes:
It's not possible even to compute the free variables of a term in
$vau-calculus in general. This makes compilation,
automatic refactoring, and cross-referencing impossible in general.
There are too many contexts that tell terms apart; contextual
equivalence is too fine. Consider these two terms:
($lambda (f) (f (+ 3 4)))
($lambda (f) (f (+ 4 3)))
These two terms are not contextually equivalent, because if you
pass an operative as the argument to the
function, the operative can inspect the syntactic structure of the
program.
This over-fine contextual equivalence is a problem in practice, not
just a theoretical concern. Consider this following definition of a
foldr function:
($define! foldr
($lambda (c n xs)
($if (null? xs)
n
(c (car xs) (foldr c n (cdr xs))))))
See how c has control over whether and when the recursive case is
evaluated, because it is not required to evaluate its arguments!
Compare this to the same definition where the last line is replaced
with
($let ((ys (foldr c n (cdr xs)))) (c (car xs) ys))
By introducing the intermediate $let, we take control over the
recursion away from c, forcing evaluation of the recursive case.
To show concretely why this is a problem, the call (foldr $and #t
some-list) will either be a short-circuiting "and" or not,
depending on an implementation detail of the foldr function!
Taken together, we see that purely local reasoning about Kernel
programs is impossible; or at least, getting useful results from such
reasoning is impossible in general. Only whole-program analysis can
give us enough stability to let us reach useful conclusions about our
programs.
On the other hand, it's possible to mitigate these problems somewhat.
First of all, it might be possible to use online partial evaluation to
compute the various static properties of terms we're used to from the
lambda-calculus in many (but certainly not all)
situations. Undecidable situations might be able to be warned
about. This might make the language stable enough to work in practice.
Secondly, by shifting one's perspective, the inequivalence of the two
variants of foldr given above might be able to be seen as a
feature: applicatives, after all, are not operatives in
Kernel. Personally, I find it unsettling, but that could be my Scheme
heritage showing. Documentation for Kernel applicatives would need
to specify what possible computations any higher-order parameters
might receive as arguments. The foldr given above can be
simultaneously CPS and non-CPS, when given operative and applicative
arguments, respectively.
Both these possible mitigations are speculative and weak.
Reasons to be cheerful
I think that the core idea of separating argument evaluation from
argument transmission (or function call; I'm unsure what to call it)
is a very powerful one. Consider a distributed system: evaluation of
arguments has to happen on the sending side, and transmission of the
results of evaluation to the receiving side (where instantiation of
the function body happens) is a separate operation.
I also think the simplicity of implementation of a Kernel-style Fexpr
based system is worth paying attention to, especially given that it
makes writing hygienic "macros" effortless, and in the process handily
avoids the thorny problem of exactly what "hygiene" is in the first
place. Achieving such ease-of-use takes heavy-duty macrology in
languages like Scheme that lack first-class macros or Fexprs.
Having our cake and eating it, while also enjoying improved security
Consider securing Kernel in the same way that a capability-based
approach can secure Scheme. To secure Kernel as it stands, each
applicative has to guard against possibly receiving some operative as
an argument.
To make it easier to secure, we might make the system signal an error
when an attempt is made to pass an operative as an argument to an
applicative.
This makes Fexprs/operatives/"macros" second-class, just as they are
in Scheme. I conjecture that doing so has the following effects:
it makes contextual equivalence useful again, making the two
variants of foldr given above indistinguishable, but
it simultaneously makes the system exactly equivalent in power to
Scheme with, say, a syntax-case macro system.
So by doing this, we've given up most of Kernel's approach, seemingly
in exchange for, well, familiar old Scheme. We haven't lost
everything, though: we've retained the simplicity and power of
Kernel's approach to hygiene.
Conclusions
Without quite a bit of further research, I don't think Kernel-style
languages can be used to build secure or efficiently compiled
systems. Making changes necessary to build a secure variant of Kernel
may destroy many of the properties that make the language interesting
in the first place, but doing so might leave us in a familiar setting
seen through new eyes: Scheme.