Monday, October 16, 2006

Getting ANTLR to run on Cygwin with GCJ

I've been trying to get ANTLR 2.7.4 to run under Cygwin with GCJ. After trying to piece information together, I've finally found a recipe that makes it work.

For those who don't know ANTLR: it's a replacement for the lex/yacc combination. Unlike yacc, which generates bottom-up (LALR) parsers, ANTLR generates top-down (LL) parsers. There is a lot of information available on the ANTLR website. If you're interested in parser generation, you should certainly have a look. My goal is to write an ANTLR-compatible grammar for VHDL.

Back to the topic: ANTLR comes with a standard ./configure script. It runs fine, but the compilation that follows fails with an error related to AWT in the antlr/debug/misc directory of the source tree. AWT (the Abstract Window Toolkit) is Java's standard GUI library.

Some googling turned up a page that shows how to work around the error.

I tried this but, probably because some things in my path weren't set correctly, it didn't work very well either. I was able to create a libantlr.so file (see step 4 of that page), but step 5 broke down. Step 5 relies on a file called antlr/Tool.class, which doesn't exist in the ./antlr directory but lives inside the antlr.jar file instead (a jar file is a ZIP-format archive that bundles .class files).
In the end, I got it to work by un-jarring all the files and simply bypassing the creation of a shared library.
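
To see where a class such as antlr/Tool.class actually lives, a small helper along these lines can be useful. This is only a sketch: the name locate_class is made up for this example, and it assumes the jar tool is installed (jar tf lists an archive's contents without extracting anything).

```shell
#!/bin/sh
# locate_class CLASSFILE JARFILE
# Prints "file" if CLASSFILE exists on disk, "jar" if it is only listed
# inside JARFILE, and "missing" otherwise.
# (locate_class is a hypothetical helper, not part of ANTLR.)
locate_class() {
    class_file=$1
    jar_file=$2
    if [ -f "$class_file" ]; then
        echo "file"
    elif [ -f "$jar_file" ] && jar tf "$jar_file" 2>/dev/null | grep -qxF "$class_file"; then
        echo "jar"
    else
        echo "missing"
    fi
}

# Example, run from basedir:
#   locate_class antlr/Tool.class antlr.jar
```

If this reports "jar", the class files still have to be extracted before a tool that expects them on disk can find them.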

Here's what I did (basedir is the directory that contains ./configure):

  1. Install gcj (through the setup.exe on cygwin.com)
  2. Download the tar file of the latest version of ANTLR (2.7.4 in my case) and untar it in a local directory.
  3. Run ./configure --prefix=<install dir>. We do not need configure to build the antlr executable, but we do need it to build the C++ support libraries. Do NOT run make after configure!
  4. Delete AWT dependent code:
    cd basedir/antlr/debug
    rm -fr misc

  5. cd basedir
  6. Un-jar all the files in antlr.jar, back into their original positions.
    jar xfv antlr.jar
  7. Compile everything into an executable
    gcj --main=antlr.Tool `find antlr -name "*.class"` -o cantlr
    If all is well, you will see a bunch of warnings on your screen that, I assume, you can safely ignore. The end result is a file called cantlr.exe.
  8. Test the executable by running it (./cantlr). You should see a bunch of lines with program information.
  9. Move this file to a directory in your path (e.g. /usr/local/bin)
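
The steps above can be collected into one function, roughly as sketched below. The names build_cantlr, run and DRY_RUN are made up for this example and are not part of ANTLR or GCJ. With DRY_RUN=1 (the default here) the commands are only printed, so you can check them before running anything for real.

```shell
#!/bin/sh
# Sketch of steps 3-9 as one function. build_cantlr, run and DRY_RUN are
# hypothetical names introduced for this example.
DRY_RUN=${DRY_RUN:-1}

# run CMD [ARGS...]: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# build_cantlr BASEDIR PREFIX
build_cantlr() {
    basedir=$1
    prefix=$2
    run cd "$basedir" || return 1
    run ./configure --prefix="$prefix" || return 1   # step 3: no make afterwards
    run rm -fr antlr/debug/misc || return 1          # step 4: drop AWT-dependent code
    run jar xfv antlr.jar || return 1                # step 6: unpack the .class files
    # Step 7: compile all .class files into one native executable.
    if [ "$DRY_RUN" = 1 ]; then
        echo '+ gcj --main=antlr.Tool $(find antlr -name "*.class") -o cantlr'
    else
        gcj --main=antlr.Tool $(find antlr -name "*.class") -o cantlr || return 1
    fi
    run ./cantlr                                     # step 8: smoke test
    run mv cantlr.exe /usr/local/bin/                # step 9: on Cygwin the binary is cantlr.exe
}

# Example:
#   build_cantlr "$HOME/antlr-2.7.4" /usr/local            # prints the commands
#   DRY_RUN=0 build_cantlr "$HOME/antlr-2.7.4" /usr/local  # runs them
```

Note that a real (DRY_RUN=0) run changes the current directory to basedir, just like following the steps by hand.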

At this point, we have created the main executable that converts a .g grammar file into a set of Java, C# or C++ files. These newly generated files, however, rely on base classes that are also part of the ANTLR distribution. For C++, we need to build a library that contains the compiled base classes.


  1. cd basedir/lib/cpp
  2. Build the C++ library. This step won't work if you didn't run ./configure earlier.
    make
  3. If everything went fine, the ./src directory will contain a file called libantlr.a. Now install the library and include files into the location given by --prefix during the ./configure step.
    make install
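
As a sketch, the library build can be wrapped so that make install only runs once libantlr.a really exists. The function name build_cpp_runtime is made up for this example; the paths are the ones mentioned in the steps above.

```shell
#!/bin/sh
# build_cpp_runtime BASEDIR
# Builds and installs the C++ support library. The body runs in a subshell
# (note the parentheses), so the caller's working directory is unchanged.
# (build_cpp_runtime is a hypothetical name used for this sketch.)
build_cpp_runtime() (
    cd "$1/lib/cpp" || exit 1
    make || exit 1
    # The build is expected to leave the static library in ./src.
    if [ ! -s src/libantlr.a ]; then
        echo "src/libantlr.a was not built" >&2
        exit 1
    fi
    make install
)

# Example:
#   build_cpp_runtime "$HOME/antlr-2.7.4"
```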

The next step is to test whether we now have a fully working system.

  1. cd basedir/examples/cpp/calc
  2. The standard Makefile will not know where to find the antlr executable. As a workaround, just generate all files by invoking cantlr manually.
    cantlr calc.g
    A set of C++ files will now be generated.
  3. make
  4. If everything went fine, you will now see a fresh set of executables!
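
The same test can be scripted with a quick check that cantlr really is on the path first. Again a sketch only: require_tool and build_calc_example are names invented for this example.

```shell
#!/bin/sh
# require_tool NAME: succeed only if NAME can be found on the PATH.
# (require_tool and build_calc_example are hypothetical helper names.)
require_tool() {
    command -v "$1" >/dev/null 2>&1 || {
        echo "$1 not found on PATH" >&2
        return 1
    }
}

# build_calc_example BASEDIR
# Runs in a subshell so the caller's working directory is unchanged.
build_calc_example() (
    require_tool cantlr || exit 1
    require_tool make || exit 1
    cd "$1/examples/cpp/calc" || exit 1
    cantlr calc.g || exit 1   # generate the C++ files by hand
    make                      # then let the Makefile compile them
)

# Example:
#   build_calc_example "$HOME/antlr-2.7.4"
```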

This is how I got the system to work for me using GCJ and Cygwin. I'm pretty sure that the same procedure will also work on Solaris or Linux.