Friday, June 24, 2011

EclipseFP: the way forward??

This is not the usual "hey I managed to get some code to work, check it out" blog post. Bear with me as I'm just thinking out loud. This is about EclipseFP and how we can move it forwards.

At the moment EclipseFP works pretty well for simple, smallish Haskell projects. People have noted some performance issues on large projects, and part of it is the huge memory footprint of GHC when used through its API. After a couple of hours of working on a yesod project (so we're talking quasi quotation, template haskell, etc), the scion server takes 1Gb of memory... Restarting brings it back to reasonable levels, which seems to point to memory leaks in the GHC API.
Maybe more critically, people are now asking for features that get increasingly harder to implement using Scion:
- static GHC flags like -threaded (Nominolo's working on a version of Scion that could handle these)
- C or HSC sources (or other kind of stuff requiring a preprocessor)
- Having executables or test suites referencing the library component directly in the Cabal file

The main thing with these is that Cabal handles them fine, so basically typing cabal build at the command prompt solves these issues, which undermines the usefulness of having an IDE in the first place... But they require more than just calling the GHC API: building and linking a library in-place, calling a c compiler, hsc2hs, etc...
Of course Cabal publishes an API. So for example the code used by Cabal to build an in-place version of the library I could access, in theory, from Scion. But then Cabal calls the ghc executable, so the Cabal API gives me no easy way to get back errors from GHC, etc...

What does accessing the GHC API gives us? An AST, that can be generated from an unsaved file (so you don't need to save your haskell source file to see your errors, our up to date outline, etc). All the rest (syntax highlighting, jump to definition, etc) could probably be done from the AST.

So I'm wondering now if scrapping scion totally would not be a good idea. A rough outline of how things could work would be:
- create a temp folder in which we will work, copy in it all the project files (sources, etc)
- use cabal to build in that folder
- parse cabal output to get errors and their location
- unsaved files in the EclipseFP editor could have their unsaved contents copied into that folder (that folder represent the current state of the project, which may not be what is really saved on disk)
- have only a simple Haskell executable that given a module file, loads it using all the built files in the temp folder, and return the AST in some easy to parse format. Firing and getting the result would ensure GHC cannot grab loads of memory and not release it
- the IDE code then gets the AST and does all it needs to it
- the AST could even be saved in that temp folder in some format, so that the Java code would only request it when the original file is more recent, which would allow easy parsing of dependent modules AST (for example to retrieve all the symbols exported, etc)

There are a lot of ugly bits in there, maybe, but that would solve my problems... I suppose all of that could be written in Haskell and we could keep a similar enough API to scion, so that the Java code wouldn't need change too much...

For the moment I'm not going to reach into things anyway, but if anybody has some feedback, I'll gladly take any opinion on board!