Writing Atom plugins in Haskell using ghcjs

Posted on February 14, 2015

Atom is an editor developed by the people behind GitHub. Written in JavaScript, it is expressly designed to be hackable, and hackable it is; I find writing Atom plugins a rather pleasant experience, even though this means writing CoffeeScript. Nonetheless, one of the things that initially attracted me to Atom is the existence of ghcjs, and hence the possibility of writing Atom plugins in Haskell. Why bother? Well, Haskell versus JavaScript, need I say more? Moreover, it means we can use almost any existing Haskell library in our Atom plugins; for instance, my Cabal extension calls into the Cabal library to parse .cabal files.

If you hadn’t heard of it before, ghcjs is a truly impressive tour de force by Luite Stegeman. It compiles Haskell code to JavaScript, using ghc as a frontend; since ghcjs plugs into ghc’s pipeline at a rather late stage (STG), this means that the entire Haskell language is supported. Moreover, ghcjs comes with a full concurrent Haskell runtime so that almost anything you can do in Haskell is supported. (Did I mention how impressive it is?)

In this blogpost I will explain how you can use ghcjs to write Atom plugins (partly) in Haskell. Most of the information in this blogpost should, however, also be useful in any other predominantly JavaScript application where you want to use some Haskell.

If this is your first time writing an Atom plugin, you will find enough information in this tutorial to get by, but you might want to do the official Create your first package tutorial first.

Getting started

Let’s first create a new package. Start Atom, bring up the command palette (⌘⇧P), select “Generate Package”, and pick a path. In this tutorial we will call our new package “reverse”, so pick a path /path/to/reverse that ends in reverse.

Once the new package is created, remove reverse-view.coffee from the lib/ directory, and delete any reference to ReverseView, reverseView or modalPanel in reverse.coffee (the serialize method can go completely). We will not be needing this or any other view in this tutorial.

Reload the window (command palette, “Reload”), open the dev console (menu View, Developer, Toggle Developer Tools) and execute the “toggle” command for your new package (command palette, “Reverse: Toggle”). You should see

Reverse was toggled!

on your console.

Adding some functionality

Not unlike the ascii-art package from Create your first package, we will implement some functionality that allows the user to select some text, run our new command, and our package will reverse the text the user selected.

We will implement the actual reversing in Haskell, but before we get there let’s first implement a command that just replaces the current selection with some placeholder text. Replace the definition of toggle with

  toggle: ->
    editor    = atom.workspace.activePaneItem
    selection = editor.getSelection()
    selected  = selection.getText()

    console.log "Selected", selected
    selection.insertText "PLACEHOLDER"

Reload your window again, select some text, and execute toggle (command palette, “Reverse: Toggle”). You should see your selected text on the console and the text should have been replaced with the PLACEHOLDER text.

Calling Haskell

I assume that you have ghcjs installed; if not, follow the instructions on the GitHub page. It is entirely painless, if a little time consuming.

Create a new subdirectory /path/to/reverse/hs in your reverse package and create a file Reverse.hs containing

module Main (main) where

main :: IO ()
main = putStrLn "Dummy main"

The dummy main function we will leave here for two reasons: it will allow us to check that the basic functionality is working, and moreover if we omit main, ghcjs will skip linking our code which is not what we want. At the time of writing ghcjs has only rudimentary support for calling into Haskell from JavaScript; it works (as we shall see), but it requires some tricks. The module export clause states that we intend to export main (and will be useful later to avoid functions being stripped out of the generated code because the linker thinks that they are unused).

When we call ghcjs on our Haskell module it creates a bunch of files. If we just call

# ghcjs Reverse.hs

then it will create a new directory Reverse.jsexe and we should be able to run our dummy main function using node (provided that you have node.js in your path):

# node Reverse.jsexe/all.js
Dummy main

However, when we want to use the generated Haskell code in Atom we need to make it available as a Node module. This is a little tricky, because Atom restricts the use of eval, and the ghcjs generated code doesn’t play nice with the node vm module.

We will therefore wrap the generated code in some code of our own. Create a new file called Exports.js. This file is going to contain the functionality that we want to export from Haskell to JavaScript:

global["main"] = function() {
  h$run(h$mainZCMainzimain);
}

Then create a Makefile in the reverse/hs directory with

HaskellReverse.js: Reverse.hs Exports.js
	ghcjs -DGHCJS_BROWSER Reverse.hs
	echo "(function(global) {" >HaskellReverse.js
	cat Reverse.jsexe/{rts,lib,out}.js Exports.js >>HaskellReverse.js
	echo "})(exports);" >>HaskellReverse.js

.PHONY: clean
clean:
	rm -rf Reverse.jsexe
	rm -f HaskellReverse.{js,min.js}
	rm -f *.js_{hi,o}

We call ghcjs with the -DGHCJS_BROWSER argument; this strips some node.js specific code from the generated JavaScript, which is currently necessary when trying to load the generated code into Atom. Then the Makefile sandwiches the generated code with our header and footer, so that after running make, HaskellReverse.js should look something like

(function(global) {

  // Generated code

  global["main"] = function() {
    h$run(h$mainZCMainzimain);
  }

})(exports);

The code generated by ghcjs assumes the presence of a global object; our wrapper redefines this global object to be our module exports. The other difference between our wrapped code and the all.js module generated by ghcjs is that we call h$run instead of h$main; this is necessary because h$main (at least in the current release of ghcjs) shuts down the process after it returns, causing the editor to crash.
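To see why passing exports in as global works, here is a minimal sketch of the same wrapping trick that runs on its own in node. Everything in it is a stand-in: fakeExports plays the role of the module's exports object, and the h$answer definition is a hypothetical placeholder for the generated code, not something ghcjs emits.

```javascript
// Stand-in for the `exports` object that Node hands to every module.
var fakeExports = {};

(function(global) {
  // Hypothetical placeholder for the ghcjs generated code, which attaches
  // its definitions to whatever object is called `global` in its scope.
  global["h$answer"] = 42;

  // Stand-in for Exports.js: our entry point is attached to the same object.
  global["main"] = function() {
    return "Dummy main";
  };
})(fakeExports);

// Everything the wrapped code attached to `global` is now on fakeExports.
console.log(Object.keys(fakeExports));
```

Inside the wrapper the name global is just a function parameter, so it shadows the real global object and nothing leaks out of the module.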

You should be able to test HaskellReverse.js directly in the Atom JavaScript console:

> HaskellReverse = require('/path/to/reverse/hs/HaskellReverse.js');
Object {ArrayBuffer: function, ...}

> HaskellReverse.main()
undefined
Dummy main

The undefined value here is the return value of h$run. We see the output of h$run (probably) after we see the return value of h$run, because h$run is asynchronous. More on this below.
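The ordering seen in the console can be reproduced without ghcjs. The sketch below uses a hypothetical fakeRun that defers its work with setTimeout; this only mimics the observable effect (return first, output later) and says nothing about how the ghcjs scheduler is actually implemented.

```javascript
var events = [];

// Hypothetical stand-in for h$run: schedule the action, return immediately.
function fakeRun(action) {
  setTimeout(action, 0);
  return undefined;
}

var returned = fakeRun(function() {
  // Runs only after the current piece of JavaScript has finished.
  events.push("Dummy main");
  console.log(events.join(", "));
});

// Recorded before the scheduled action has had a chance to run.
events.push("undefined (return value)");
```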

Passing information from JavaScript to Haskell

As a first attempt to actually do something slightly more interesting in Haskell, let’s attempt to pass a string from the JavaScript world to the Haskell world and print it to the console from Haskell. Replace the contents of Reverse.hs with

module Main (main, reverseJSRef) where

import GHCJS.Foreign
import GHCJS.Prim

foreign import javascript unsafe "console.log($1);"
  consoleLog :: JSRef a -> IO ()

main :: IO ()
main = putStrLn "Dummy main"

reverseJSRef :: JSString -> IO ()
reverseJSRef input = consoleLog input

Hopefully the syntax for the javascript foreign import is self-explanatory. JSRef is the main datatype that models JavaScript values Haskell-side. The phantom type argument (a) indicates what kind of value it is; JSString is just a type alias for JSRef instantiated with a particular type argument.

Let’s try this out. Add the following function to Exports.js

global["reverse"] = function(input) {
  var action = h$c2( h$ap1_e
                   , h$mainZCMainzireverseJSRef
                   , h$c1(h$ghcjszmprimZCGHCJSziPrimziJSRef_con_e, input)
                   )
  h$run(action);
}

I did mention that support for exporting Haskell functions to JavaScript was still somewhat rudimentary. We construct an action by calling low-level ghcjs runtime functions to wrap the input in a Haskell JSRef constructor and then apply the function reverseJSRef to it. We then run this action with h$run, as before. If you are wondering where the name h$mainZCMainzireverseJSRef comes from:

  1. All ghcjs generated functions are prefixed with h$
  2. mainZCMainzireverseJSRef is the Z-encoding of main:Main.reverseJSRef, which is function reverseJSRef in module Main of package main
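The two rules above can be sketched as a small helper that predicts the symbol name for a given function. The names zEncode and ghcjsSymbol are hypothetical, and the table covers only a handful of Z-encoding escapes (enough for the names used in this post), so treat it as an illustration rather than a complete encoder.

```javascript
// Partial Z-encoding table: only the escapes needed for names in this post.
var Z_ENCODING = {
  "z": "zz", "Z": "ZZ",   // the escape characters themselves are doubled
  ".": "zi", ":": "ZC",
  "-": "zm", "_": "zu"
};

// Encode a name character by character, leaving unlisted characters alone.
function zEncode(name) {
  return name.split("").map(function(c) {
    return Object.prototype.hasOwnProperty.call(Z_ENCODING, c) ? Z_ENCODING[c] : c;
  }).join("");
}

// Build the h$-prefixed symbol for package:Module.identifier.
function ghcjsSymbol(pkg, mod, ident) {
  return "h$" + zEncode(pkg + ":" + mod + "." + ident);
}

console.log(ghcjsSymbol("main", "Main", "reverseJSRef"));
console.log(ghcjsSymbol("ghcjs-prim", "GHCJS.Prim", "JSRef"));
```

The second call shows where the prefix of h$ghcjszmprimZCGHCJSziPrimziJSRef_con_e in the wrapper comes from; the _con_e suffix is added by ghcjs and is not part of the Z-encoding.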

Let’s try the new wrapper. Run make again, reload your window (if you don’t, then requiring the module a second time won’t reload it), and do

> HaskellReverse = require('/path/to/reverse/hs/HaskellReverse.js');
Object {...}

> HaskellReverse.reverse("Hello")
undefined
Hello

Again, we see Hello printed after undefined because we are doing an asynchronous call. Incidentally, since so far we are just calling console.log from reverseJSRef, we can also pass other kinds of JavaScript values:

> HaskellReverse.reverse({ 'someField': "hi" })
undefined
Object {someField: "hi"}

Returning information from Haskell to JavaScript

Since we are doing an asynchronous call into Haskell, we need to invoke a callback when we have completed computing the result. We change our wrapper function to

global["reverse"] = function(input, callback) {
  var action = h$c3( h$ap2_e
                   , h$mainZCMainzireverseJSRef
                   , h$c1(h$ghcjszmprimZCGHCJSziPrimziJSRef_con_e, input)
                   , h$c1(h$ghcjszmprimZCGHCJSziPrimziJSRef_con_e, callback)
                   )
  h$run(action);
}

This is almost the same as before, except that we get the additional callback argument that we need to wrap (note also the change from ap1 to ap2). On the Haskell side we need an additional import to allow us to invoke this callback:

foreign import javascript safe "$1($2);"
  invokeCallback :: JSFun (JSRef a -> IO ()) -> JSRef a -> IO ()

JSFun is another type alias for JSRef; the type argument to JSFun is intended to indicate what kind of function it is. (I’m not sure why this isn’t predefined.) The safe keyword means that if the JavaScript code that we call throws an exception, this exception will be translated to a JSException Haskell-side that we can catch like any other exception.

We can now implement reverse fully:

reverseJSRef :: JSString -> JSFun (JSString -> IO ()) -> IO ()
reverseJSRef input callback =
    invokeCallback callback $ reverseJSString input
  where
    reverseJSString :: JSString -> JSString
    reverseJSString = toJSString . reverseString . fromJSString

    reverseString :: String -> String
    reverseString = reverse

Let’s add this functionality into our package. Add the necessary import to reverse.coffee:

HaskellReverse = require '../hs/HaskellReverse.js'

and replace the placeholder call to insertText with

HaskellReverse.reverse selected, (reversed) ->
  selection.insertText reversed

and try it out!

Asynchronous and synchronous calls

Now that we have basic functionality working, let’s get slightly more sophisticated. So far all our calls have been asynchronous. This is useful for two reasons. First, the ghcjs runtime makes sure that the execution of asynchronously called Haskell code won’t block the editor (JavaScript is single threaded). Second, various Haskell features are only available when the code is executing asynchronously. Although we can call Haskell synchronously (and we shall see an example shortly), we have to be careful to avoid these features; in particular, we have to avoid the possibility of blocking.

Suppose that we want to keep running our main Haskell code asynchronously, but we want to make sure that on the Haskell side requests are handled in order and sequentially; for our simple reverse package ordering doesn’t really matter but of course there are lots of examples where it does. One approach is to start a Haskell “server” thread that listens on some channel for requests and executes them one by one; to submit a new request we just put it on the channel.

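We can add the following data type and two functions to our Haskell code (the data declaration uses GADT syntax, so enable the GADTs extension, and import Control.Concurrent, Control.Exception and Control.Monad):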

data Request where
    Reverse :: JSString -> JSFun (JSString -> IO ()) -> Request

server :: Chan Request -> IO ()
server chan = forever $ do
    request <- readChan chan
    handle logJSException $ case request of
      Reverse input callback ->
        reverseJSRef input callback
  where
    logJSException :: JSException -> IO ()
    logJSException = print

startServer :: JSFun (JSRef a -> IO ()) -> IO ()
startServer callback = do
    chan      <- newChan
    serverTid <- forkIO $ server chan
    handle    <- newObj
    reverse   <- syncCallback2 AlwaysRetain False $ \input callback ->
                   writeChan chan $ Reverse input callback
    shutdown  <- syncCallback AlwaysRetain False $ do
                   killThread serverTid
                   release reverse
    setProp "reverse"  reverse  handle
    setProp "shutdown" shutdown handle
    invokeCallback callback handle

Function server reads requests from channel chan and executes them, catching and outputting any exception that the JavaScript code may throw. Function startServer creates a new channel, starts the server thread, and then constructs a JavaScript object with two fields reverse and shutdown, both containing JavaScript functions; reverse puts the reverse request onto the channel, and shutdown terminates the server thread and releases any resources retained by reverse. Since the Haskell runtime cannot keep track of references to these callbacks, we indicate that the callbacks can never be garbage collected (AlwaysRetain) until we explicitly release them.
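For readers more at home on the JavaScript side, the shape of this server can be sketched in plain JavaScript. This is only a rough analogue under simplifying assumptions: startFakeServer is a hypothetical name, an array stands in for the Chan, the reversing is done inline, and requests are drained synchronously, whereas the Haskell server runs in its own thread.

```javascript
function startFakeServer() {
  var queue    = [];     // plays the role of the Chan
  var running  = true;   // cleared by shutdown, loosely like killThread
  var draining = false;  // guarantees a single consumer, like the one server thread

  // Handle queued requests one at a time, in the order they were submitted.
  function drain() {
    if (draining) return;
    draining = true;
    while (running && queue.length > 0) {
      var request = queue.shift();
      try {
        request.callback(request.input.split("").reverse().join(""));
      } catch (e) {
        console.log(e);  // like logJSException: report the error and carry on
      }
    }
    draining = false;
  }

  // The object handed back to the caller, with the same two fields.
  return {
    reverse: function(input, callback) {
      queue.push({ input: input, callback: callback });
      drain();
    },
    shutdown: function() {
      running = false;
      queue.length = 0;
    }
  };
}

var results = [];
var fakeServer = startFakeServer();
fakeServer.reverse("abc", function(r) {
  results.push(r);
  // Submitted while "abc" is still being handled; it is queued, not nested.
  fakeServer.reverse("def", function(r2) { results.push(r2); });
});
fakeServer.reverse("ghi", function(r) { results.push(r); });
fakeServer.shutdown();
fakeServer.reverse("ignored", function(r) { results.push(r); });
console.log(results);
```

The point of the design is the same in both languages: callers only ever enqueue, and a single consumer decides when each request runs.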

In our JavaScript wrapper we now no longer need reverse, but replace it with a wrapper for startServer:

global["startServer"] = function() {
  var result   = undefined;
  var callback = function(server) { result = server }
  var action   = h$c2( h$ap1_e
                     , h$mainZCMainzistartServer
                     , h$c1(h$ghcjszmprimZCGHCJSziPrimziJSRef_con_e, callback)
                     )
  h$runSync(action, false);
  return result;
}

This is similar to before, except that we now do a synchronous call; when h$runSync returns the Haskell code is guaranteed to have completed and hence the callback that we provide (which writes to our local result variable) will have been called. The second argument to h$runSync indicates what we want to happen if the Haskell code does attempt to block: if we pass false, the code will throw an exception; if we pass true, it becomes an asynchronous call instead.
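The reason the local result variable is safe to return can be shown with two hypothetical runners, one synchronous and one deferred. Neither is the ghcjs runtime; fakeRunSync and fakeRunAsync are stand-ins that only differ in whether the action has finished by the time they return.

```javascript
// Hypothetical stand-in for a synchronous run: the action completes before we return.
function fakeRunSync(action) { action(); }

// Hypothetical stand-in for an asynchronous run: the action runs some time later.
function fakeRunAsync(action) { setTimeout(action, 0); }

// Same structure as the startServer wrapper, parameterised over the runner.
function startServerWith(run) {
  var result   = undefined;
  var callback = function(server) { result = server; };
  run(function() {
    // Stand-in for the Haskell side handing back its handle object.
    callback({ reverse: function() {}, shutdown: function() {} });
  });
  return result;
}

var syncServer  = startServerWith(fakeRunSync);   // handle object is available
var asyncServer = startServerWith(fakeRunAsync);  // still undefined at this point
console.log(typeof syncServer, typeof asyncServer);
```

With the deferred runner the wrapper returns before the callback fires, which is why startServer has to be a synchronous call.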

To start using the server in our Atom plugin, we only need a few small modifications in reverse.coffee:

{CompositeDisposable} = require 'atom'
HaskellReverse        = require '../hs/HaskellReverse.js'

module.exports = Reverse =
  subscriptions: null
  haskellServer: null

  activate: (state) ->
    # Events subscribed to in atom's system can be easily cleaned up with a CompositeDisposable
    @subscriptions = new CompositeDisposable

    # Register command that toggles this view
    @subscriptions.add atom.commands.add 'atom-workspace', 'reverse:toggle': => @toggle()

    # Start the Haskell request server
    @haskellServer = HaskellReverse.startServer()

  deactivate: ->
    @subscriptions.dispose()
    @haskellServer.shutdown()

  toggle: ->
    editor    = atom.workspace.activePaneItem
    selection = editor.getSelection()
    selected  = selection.getText()

    @haskellServer.reverse selected, (reversed) ->
      selection.insertText reversed

Minimization

With the above infrastructure in place, adding further functionality is easy; ghcjs is so good that most libraries from Hackage Just Work™. For example, in my Atom Cabal extension I wanted to be able to parse Cabal files, but I didn’t want to write a parser, and I didn’t need to! With ghcjs we can just reuse the existing Haskell parser for Cabal files. Rephrased in the context of this blog post, we can add an additional request type

data Request where
    Reverse   :: JSString -> JSFun (JSString -> IO ()) -> Request
    ReadCabal :: JSString -> JSFun (JSRef a  -> IO ()) -> Request

with corresponding handler

readCabalFile :: JSString -> JSFun (JSRef a -> IO ()) -> IO ()
readCabalFile input callback = do
    case parsePackageDescription (fromJSString input) of
      ParseFailed err ->
        invokeCallback callback nullRef
      ParseOk _warnings gpkg -> do
        let pkg = package (packageDescription gpkg)
        obj <- newObj
        setProp "name"    (toJSString . display $ pkgName    pkg) obj
        setProp "version" (toJSString . display $ pkgVersion pkg) obj
        invokeCallback callback obj

(this example just extracts the package name and version from the Cabal file); in the extension we can use this in the same way that we use reverse:

fs.readFile cabalFilePath, {encoding: 'utf8'}, (err, data) =>
  @haskellServer.readCabal data, (result) ->
    console.log result

Rather amazing. (For people not so familiar with CoffeeScript, note the use of the fat arrow so that we have the right object bound to this inside the callback.) So where’s the catch? There isn’t really one, except perhaps in the size of the generated code; the generated code for the reverse server is about 1 MB, going up to 2.5 MB when we add the Cabal example. We can use minification to address this to some extent; following the instructions on the ghcjs wiki we can add the following target to our Makefile:

HaskellReverse.min.js: Reverse.hs Exports.js
	ghcjs -DGHCJS_BROWSER Reverse.hs
	echo "(function(global) {" >HaskellReverse.min.js
	ccjs Reverse.jsexe/{rts,lib,out}.js Exports.js \
	  --compilation_level=ADVANCED_OPTIMIZATIONS \
	  >>HaskellReverse.min.js
	echo "})(exports);" >>HaskellReverse.min.js

Note that for minimization to work correctly, Closure Compiler (ccjs) needs to know what we need to export; that’s why it’s important that we compile the generated code together with our Exports.js but not including the wrapper code that we generate. Minimization is also the reason that we use

global["startServer"] = ...

rather than

global.startServer = ...

because if we write the latter ccjs will rename our functions. Minimization helps, but the minimized code is still pretty big; 237 kB for the basic reverse example and 714 kB once we use the Cabal library. Oh well, not a huge deal.
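At runtime, and without minification, the two spellings are interchangeable, as the small sketch below shows (quoted and dotted are hypothetical stand-ins for the exports object). The difference only matters to Closure Compiler in advanced mode, which leaves quoted property names alone but may rename dotted ones.

```javascript
var quoted = {};
var dotted = {};

// Quoted property name: treated as an external name, kept as-is by the minifier.
quoted["startServer"] = function() { return "started"; };

// Dotted property name: identical at runtime, but eligible for renaming.
dotted.startServer = function() { return "started"; };

console.log(quoted.startServer() === dotted["startServer"]());
```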

Wrapping up

Hopefully this tutorial will have given you the necessary tools you need to write Atom plugins in Haskell. When you publish your package, simply make sure that the file you import from CoffeeScript (in this example HaskellReverse.js or HaskellReverse.min.js) is checked into your repository.

References

  1. The best introduction to ghcjs is still Luite’s original blogpost GHCJS introduction – Concurrent Haskell in the browser.
  2. You’ll probably want to refer to GHCJS.Prim from the ghcjs-prim package, as well as GHCJS.Foreign, GHCJS.Types, GHCJS.Concurrent and GHCJS.Marshal from ghcjs-base.
  3. The ghcjs GitHub and corresponding wiki contain more information, including installation instructions and a useful page on deployment.