Comprehensive Haskell Sandboxes, Revisited

Posted on March 9, 2015 (last updated April 2, 2015)

For the last few years I’ve been using a system of symlinks as Haskell sandboxes. This worked well, but has one important drawback: symlinks are basically user-global state, and hence it is not possible to have multiple sandboxes active at the same time. However, as Haskell tools are getting better and better we can actually simplify the entire setup dramatically. Here’s a summary of my new approach:

  1. No global GHC installed. Instead, different versions of ghc are installed in

    ~/opt/ghc/<version>

    and we activate a version of GHC by setting our PATH accordingly.

  2. We require that Cabal sandboxes are always used by adding

    require-sandbox: True

    to our global cabal config (~/.cabal/config). This way a global ~/.ghc directory will only contain our ghci history (and no packages), and the ~/.cabal directory will only contain your global configuration file and the Hackage package index, nothing else. Any settings that you want to be true for all sandboxes in your system can be added to this global config file (in my case, none); any other settings can be added to a project-specific cabal.config file.

  3. Use different sandboxes and different build directories for different ghc versions.

Details below.

Making it easy to pick a ghc version

I use a script ~/opt/bin/activate.sh

#!/bin/bash

# from https://wiki.archlinux.org/index.php/Color_Bash_Prompt
Color_Off='\e[0m'       # Text Reset
Red='\e[0;31m'          # Red
Green='\e[0;32m'        # Green

# Strip current ghc from PATH (if any)
# Uses http://unix.stackexchange.com/questions/108873/removing-a-directory-from-path

if [ "${ACTIVE_GHC}" != "" ]
then
  OLD_DIR=~/opt/ghc/${ACTIVE_GHC}/bin
  export PATH=$(echo "$PATH" | sed "s@${OLD_DIR}:@@g")
fi

# Add new ghc to path (unless specified 'none')

if [ "$1" != "none" ]
then
  NEW_DIR=~/opt/ghc/$1/bin
  export PATH=${NEW_DIR}:$PATH
fi

# Specify file for sandbox, set prompt, and remember active GHC version

export PS1="\n[${Red}$1 ${Green}\w${Color_Off}]\n# "
export CABAL_SANDBOX_CONFIG=./cabal.sandbox.config-$1
export ACTIVE_GHC=$1

and then add an alias to my ~/.profile:

function CompleteActivate() {
  local ghcs=`ls ~/opt/ghc`
  local word=${COMP_WORDS[COMP_CWORD]}
  COMPREPLY=(`compgen -W "none $ghcs" -- "$word"`)
}
complete -F CompleteActivate activate
alias activate="source activate.sh"

This gives me autocompletion for activate, as well as a pretty prompt which looks something like

[ghc-7.8.4 ~/current/working/directory]
#
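
The sed call in activate.sh only removes an entry that is followed by a colon, which always holds for directories the script itself prepended. If you ever need to strip an entry from an arbitrary position in PATH, a more general helper might look like the sketch below. The name strip_from_path is hypothetical and not part of the script above.

```shell
# Hypothetical helper: print the colon-separated list $1 with every entry
# equal to $2 removed, regardless of where that entry appears.
strip_from_path() (
  # The function body is a subshell, so the IFS and globbing changes
  # below do not leak into the calling shell.
  IFS=':'
  set -f
  result=""
  for entry in $1; do
    [ "$entry" = "$2" ] && continue
    result="${result:+$result:}$entry"
  done
  printf '%s\n' "$result"
)
```

It could then replace the sed line with something like export PATH=$(strip_from_path "$PATH" "$OLD_DIR").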

Installing ghc

When installing from a binary distribution, just specify the appropriate --prefix option when calling configure.

To install from source, either do the same as for a binary distribution, or make the in-place compiler available directly (useful when hacking on ghc). To do that, configure as normal (without specifying a prefix) and then make the in-place compiler available by creating a directory ~/opt/ghc/<version>/bin and symlinking the binaries from inplace/bin:

mkdir -p ~/opt/ghc/<version>/bin
cd ~/opt/ghc/<version>/bin
ln -s ~/path/to/ghc/inplace/bin/* .
mv ghc-stage2 ghc
cp ghc ghc-<version>

I personally find it helpful to create symlinks such as ~/opt/ghc/bootstrap-7.8 which point to whatever compiler I use to bootstrap ghc 7.8. Remember that you need to have alex and happy installed in order to build ghc (see below), as well as autoconf and automake.

Unless you explicitly use different sandboxes for different ghc versions everywhere (see below), it is probably a good idea to make sure that all available installations of ghc have distinct version numbers. In the rare case that you want to have multiple installations of the same version (for hacking purposes), the easiest solution is to just manually modify the ghc version number.

Note on OSX. If you are on OSX and are installing an older version of ghc (pre 7.8), remember that you need to use gcc-4.9:

./configure --with-gcc=gcc-4.9 --prefix=/Users/e/opt/ghc/7.4.2

where gcc-4.9 can be installed through homebrew:

brew install homebrew/versions/gcc49

You can also use brew to install autoconf and automake.

Note on brew. I don’t like having brew install stuff globally either. Fortunately, this is not necessary; although it’s officially not recommended, I find having brew installed in ~/homebrew works perfectly. You can follow the alternative installation guidelines and do

mkdir homebrew && curl -L https://github.com/Homebrew/homebrew/tarball/master | tar xz --strip 1 -C homebrew

You will also want to create a local directory for brew to compile its packages:

mkdir ~/Library/Caches/Homebrew

and then add ~/homebrew/bin to your PATH.

ghc-version independent tools

To install tools such as alex, happy, hasktags or darcs which are independent of the GHC version, you could create a general sandbox for utilities in ~/opt/util:

cabal sandbox init --sandbox .
cabal install alex happy hasktags

and then add ~/opt/util/bin to your PATH. Don’t install ghc-version specific tools here though.
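
Since several directories end up being added to PATH from ~/.profile, re-sourcing the profile would add them a second time. A small guard avoids that. This is only a sketch, and prepend_to_path is a hypothetical name.

```shell
# Hypothetical helper: prepend directory $1 to PATH unless it is already there.
prepend_to_path() {
  case ":$PATH:" in
    *":$1:"*) ;;              # already present, leave PATH alone
    *) PATH="$1:$PATH" ;;
  esac
  export PATH
}

# The utilities sandbox described above.
prepend_to_path "$HOME/opt/util/bin"
```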

Building a package with different ghc versions

The official way to switch between ghc versions when using sandboxes relies on overwriting the sandbox configuration file each time, which re-introduces global state (and therefore, for example, makes parallel builds impossible); moreover, although the sandbox stores the installed packages in a subdirectory per ghc version, it does not do this for the bin/ directory. For both of these reasons I prefer to use different sandboxes for different GHC versions, which is why my activate script also sets the CABAL_SANDBOX_CONFIG file to a different file for different GHC versions. You will still need to make sure to manually specify a different sandbox directory though:

activate 7.8.4
cabal sandbox init --sandbox ./.cabal-sandbox/7.8.4

activate 7.10
cabal sandbox init --sandbox ./.cabal-sandbox/7.10

(If you wanted to, you could also set things up to have the “active” sandbox appear in your bash prompt.)
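
Because the activate script exports ACTIVE_GHC, the sandbox directory can be derived from it rather than typed by hand. The function below is a sketch of that idea. Its name, init_versioned_sandbox, is hypothetical, and it assumes the sandbox layout used in the commands above.

```shell
# Hypothetical helper: create a sandbox whose directory is named after the
# GHC version selected by the activate script (read from ACTIVE_GHC).
init_versioned_sandbox() {
  if [ -z "${ACTIVE_GHC:-}" ] || [ "$ACTIVE_GHC" = "none" ]; then
    echo "init_versioned_sandbox: activate a GHC version first" >&2
    return 1
  fi
  cabal sandbox init --sandbox "./.cabal-sandbox/${ACTIVE_GHC}"
}
```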

For hacking on the package locally (rather than just installing stuff), you will still need to manually set the dist dir so that we have a different dist directory for different versions of ghc:

cabal configure --builddir=dist/7.8.4
cabal build     --builddir=dist/7.8.4

Sadly we cannot set builddir any other way right now; it would be nicer if we could specify it in a project local cabal.config file and then use different config files for different GHC versions. Note that setting builddir for install --only is currently broken.
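
Until builddir can be set in a config file, one workaround is a thin shell wrapper that adds the flag for you. The sketch below assumes ACTIVE_GHC has been set by the activate script. The name vcabal is hypothetical, and the wrapper is only meant for configure and build, not install.

```shell
# Hypothetical wrapper: run a cabal subcommand with a builddir that depends
# on the active GHC version, e.g. "vcabal build" -> --builddir=dist/7.8.4
vcabal() {
  if [ -z "${ACTIVE_GHC:-}" ] || [ "$ACTIVE_GHC" = "none" ]; then
    echo "vcabal: no GHC version is active" >&2
    return 1
  fi
  local subcommand="$1"
  shift
  cabal "$subcommand" --builddir="dist/${ACTIVE_GHC}" "$@"
}
```

With this in ~/.profile, vcabal configure followed by vcabal build would stand in for the two commands above.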

Copying sandboxes

Sandboxes are great but they take time to build (as well as disk space), which can be annoying. Wouldn’t it be great if we could somehow seed or initialize a sandbox by cloning or copying another? If that were possible we could set up pre-built sandboxes for packages we commonly use, or containing hard-to-build packages such as the gtk bindings (see below), or indeed packages from curated package collections such as the Haskell Platform or Stackage/LTS Haskell.

Unfortunately, unlike my symlink-based sandboxes, Cabal sandboxes are not relocatable. There are some issues about this on the Cabal issue tracker (Construct a sandbox from another package repository, Make Cabal sandboxes relocatable), but it’s a thorny problem which probably won’t be solved very soon. [Edit: this may not be true anymore]

But all is not lost. A typical Cabal sandbox contains two directories (as well as some other stuff):

lib/
x86_64-osx-ghc-7.8.4-packages.conf.d/

The former contains the actual packages themselves; we can’t copy those. However, the latter (whose name depends on the compiler and platform) contains the actual “package database” which is used by ghc to find out which packages are available. The package database contains one file per installed package, describing where the package is, its dependencies, etc. It turns out that we can copy these.

Example

Suppose we create a new sandbox in ~/opt/sandboxes/hakyll:

cabal sandbox init --sandbox .
cabal install hakyll

Then ~/opt/sandboxes/hakyll/x86_64-osx-ghc-7.8.4-packages.conf.d contains files such as

x86_64-osx-ghc-7.8.4.20141229-packages.conf.d/aeson-0.8.0.2-f8f36f492a4d7eb03f8bc6cd8cd76f9f.conf

(one for each package that was installed), and ~/opt/sandboxes/hakyll/lib contains the actual packages themselves. If we now have a project that needs hakyll and its dependencies, we can seed its sandbox by creating a new sandbox and copying over the package DB from our shared sandbox:

cabal sandbox init
cp ~/opt/sandboxes/hakyll/x86_64-osx-ghc-7.8.4.20141229-packages.conf.d/* \
         ./.cabal-sandbox/x86_64-osx-ghc-7.8.4.20141229-packages.conf.d/

A package database contains one additional file package.cache which is a binary cache of the package database; we have to recreate this:

cabal sandbox hc-pkg recache

and we’re good to go! (Note that cabal sandbox hc-pkg is a wrapper around ghc-pkg that points it to the right package database.) Now all packages that are installed in the ~/opt/sandboxes/hakyll sandbox are available here too. Moreover, we can make changes to our new sandbox if we wish; we can add new packages, but also rebuild existing ones. Any package we build in our new sandbox will be stored in the local .cabal-sandbox/lib directory and get an updated entry in the package DB; the rest will remain in the shared sandbox.
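
The copy-and-recache steps can be wrapped in a small function. This is a sketch under a few assumptions: the name seed_package_db is hypothetical, both package DB directories are passed in explicitly (their names depend on compiler and platform), and it is run from the project directory that owns the new sandbox. Unlike the cp above, it copies only the .conf files, since package.cache is regenerated anyway.

```shell
# Hypothetical helper: seed the package DB of the current sandbox ($2) with
# the entries from a shared sandbox's package DB ($1), then rebuild the cache.
seed_package_db() {
  local shared_db="$1"
  local local_db="$2"
  if [ ! -d "$shared_db" ] || [ ! -d "$local_db" ]; then
    echo "seed_package_db: both package DB directories must exist" >&2
    return 1
  fi
  # Copy the per-package descriptions; package.cache is recreated below.
  cp "$shared_db"/*.conf "$local_db"/ || return 1
  cabal sandbox hc-pkg recache
}
```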

Note that if you do things this way, you should regard the shared sandbox as immutable, much like the global package DB. It’s okay to add packages to it, but as soon as you rebuild any of the packages in it (or even remove them) all derived sandboxes will break.

Tool support

There are a number of tools that aim to make package management for Haskell more convenient and that support cloning sandboxes:

Building ghcjs

Building ghcjs in a sandbox works fine, just follow the standard installation instructions. However, unless you are using GHC 7.10 or up, building the ghcjs boot libraries does not. The problem is that ghcjs-boot requires Cabal version 1.22, and it’s not enough if this is available in the local sandbox. You will get an error such as

# ghcjs-boot --dev
program ghcjs found at /Users/e/opt/ghcjs/.cabal-sandbox/bin/ghcjs
...
fatal: program /path/to/ghc returned a nonzero exit code
fatal: GHC program /path/to/ghc does not have a Cabal library that supports GHCJS
(note that the Cabal library is not the same as the cabal-install program, you need a compatible version for both)

This is a known issue. For now the best way around this (that I know of) is to simply make Cabal 1.22 available in the global package DB. One way to do this is to build Cabal in its own sandbox, copy the resulting .conf file to the global package DB, and run ghc-pkg recache (you can find the location of the global package DB by running ghc-pkg list). Having this package available globally should not cause any problems when building other packages.

Note that ghcjs-boot will also fail if you try to use an in-place install of ghc.

GTK sandbox

(This section is OSX specific.) I mentioned above that installing the gtk packages can be a bit more difficult. Actually, these days it’s not so bad anymore. First, make sure you install the XQuartz X11 window server (be patient when it says “Running package scripts”), and add this line to your ~/.profile:

export PKG_CONFIG_PATH=/opt/X11/lib/pkgconfig

Then use homebrew to install the GTK+ and poppler libraries:

brew install gtk+
brew install poppler

Then create a new sandbox (I use ~/opt/sandboxes/gtk) and install the gtk libraries:

cabal install gtk2hs-buildtools
cabal install gtk
cabal install poppler --with-gcc=gcc-4.9 --extra-include-dirs=/path/to/homebrew/include

The gtk2hs-buildtools are necessary to build the Haskell GTK bindings (but not thereafter). The installation of poppler is a bit non-standard; we need to use gcc-4.9 to work around an issue with the buildtools on OSX, and pass the explicit path to homebrew’s include path due to an issue in poppler itself. (You could omit poppler completely if you don’t need it; it is a PDF rendering engine which is an independent part of the GTK+ infrastructure.)

Once this sandbox is built however we can install gtk applications very easily. For instance, we can create a new sandbox, copy the package DB from the gtk sandbox, and then install ThreadScope by just doing

cabal install threadscope

(Make sure to start the X11 server before starting threadscope or you will get a “Cannot initialize GUI” error message.)

GHC 6.12.3 on OSX

If you are very serious about backwards compatibility you might want to install GHC 6.12.3. Sadly, the download page for 6.12.3 does not provide a binary distribution for this version, only a distribution package (.pkg file). You could attempt to build from sources using a later version of ghc; I tried with 7.0.4 but got stuck halfway; it might be possible but it’s not trivial.

Fortunately, not all is lost. Download GHC-6.12.3-i386.pkg and unpack it in a temporary directory:

tar xvfz ~/path/to/GHC-6.12.3-i386.pkg

Amongst other things, this will create a file ghc.pkg/Payload; unpack that too. Then create a new directory for this version of GHC (I use ~/opt/ghc/6.12.3) and copy GHC.framework/Versions/612/usr/{bin,lib,share} to that directory.

Finally, we need to fix up some paths (both in the scripts in bin/ and in the package configuration files), add some flags to use gcc rather than llvm, and add some flags to suppress linker warnings. You can apply relocate-6.12.3.diff to do that, but note that this patch assumes that you copied GHC to /Users/e/opt/ghc/6.12.3, so you will probably want to replace that string everywhere in the diff with whatever location you are using. After applying the patch, just run ghc-pkg recache to update the package database, and you’re good to go. You will still get the occasional warning from cabal about -no-link being deprecated; just ignore them.
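
Replacing the install location in the diff is a one-line sed job. The sketch below wraps it in a filter. The name retarget_diff is hypothetical, and it assumes the new location contains no @, & or backslash characters, since those would confuse sed.

```shell
# Hypothetical helper: read a diff on stdin and rewrite the install location
# it assumes (/Users/e/opt/ghc/6.12.3) to the directory given as $1.
retarget_diff() {
  sed "s@/Users/e/opt/ghc/6.12.3@$1@g"
}
```

For example, retarget_diff "$HOME/opt/ghc/6.12.3" < relocate-6.12.3.diff > relocate-local.diff produces a patch for your own location, which you can then apply in place of the original.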