Saturday, January 16, 2010

Compiler in Java

In my second semester at George Mason University, I took the course "CS-540 Language Processors". In this course, we had to do four projects. Can you guess what they were? They were the four phases of compiler construction:
1) Lexical Analyzer
2) Syntax Analyzer
3) Semantic Analyzer
4) Code Generator

The compiler was developed for a subset of the Tiger language. Tiger is a small, imperative language with integer and string variables, arrays, records, and nested functions. Its syntax resembles that of some functional languages.

In this post, I'm going to discuss the Lexical and Syntax Analyzer. Both of these phases were part of the second project. They were developed using the JFlex tool. JFlex is a lexical analyzer generator for Java that is designed to work together with the LALR parser generator CUP by Scott Hudson, and with BYacc/J, the Java modification of Berkeley Yacc by Bob Jamison. For detailed information about JFlex and its specification, click here.

The reason I'm not discussing the first project is that it was a small one whose aim was to help students get familiar with JFlex. In order to work with JFlex, one has to provide a lexical specification of the language in a format that JFlex accepts. This specification, as mentioned in the link above, consists of three parts, divided by %%:
a) user code
b) options and declarations
c) lexical rules
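
To make the three parts concrete, here is a minimal sketch of a JFlex specification. It is not the project's actual specification: the class name TigerLexer, the macros, and the handful of rules are made up for illustration, and %standalone is used only so that the generated class gets its own main method and can be run without a parser.

```
/* a) user code: copied verbatim to the top of the generated .java file */

%%

/* b) options and declarations */
%class TigerLexer
%standalone
%line
%column

Identifier = [a-zA-Z][a-zA-Z0-9_]*
Integer    = [0-9]+
WhiteSpace = [ \t\r\n]+

%%

/* c) lexical rules: a regular expression followed by a Java action */
"let"          { System.out.println("LET"); }
{Identifier}   { System.out.println("ID(" + yytext() + ")"); }
{Integer}      { System.out.println("INT(" + yytext() + ")"); }
{WhiteSpace}   { /* skip */ }
[^]            { System.err.println("Illegal character: " + yytext()); }
```

A scanner meant to feed a parser would return token values from its actions instead of printing them.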

From this specification, JFlex generates a .java file with one class that contains the code for the scanner. The class has a constructor that takes a java.io.Reader from which the input is read, and a method yylex() that runs the scanner and is called by the parser to get the next token from the input.
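
The sketch below illustrates how a parser or driver pulls tokens from such a scanner. It is a hand-written stand-in, not JFlex output: TinyScanner, the token codes, and the tokenize helper are hypothetical names, and only the overall shape (a constructor over a java.io.Reader, repeated calls to yylex() until end of input, yytext() for the matched text) mirrors what a generated scanner offers.

```java
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class ScannerSketch {
    // Hypothetical token codes, made up for this sketch.
    static final int EOF = 0, ID = 1, INT = 2, SYMBOL = 3;

    /** Hand-written stand-in for a generated scanner class. */
    static class TinyScanner {
        private final Reader in;
        private int next;          // one character of lookahead, -1 at end of input
        private String text = "";  // text of the most recently matched token

        TinyScanner(Reader in) throws IOException {
            this.in = in;
            this.next = in.read();
        }

        String yytext() { return text; }

        /** Skips whitespace, matches one token, and returns its code. */
        int yylex() throws IOException {
            while (next != -1 && Character.isWhitespace(next)) {
                next = in.read();
            }
            if (next == -1) {
                return EOF;
            }
            StringBuilder sb = new StringBuilder();
            int kind;
            if (Character.isLetter(next)) {
                kind = ID;
                while (next != -1 && (Character.isLetterOrDigit(next) || next == '_')) {
                    sb.append((char) next);
                    next = in.read();
                }
            } else if (Character.isDigit(next)) {
                kind = INT;
                while (next != -1 && Character.isDigit(next)) {
                    sb.append((char) next);
                    next = in.read();
                }
            } else {
                kind = SYMBOL;     // any other single character
                sb.append((char) next);
                next = in.read();
            }
            text = sb.toString();
            return kind;
        }
    }

    static String name(int kind) {
        switch (kind) {
            case ID:  return "ID";
            case INT: return "INT";
            default:  return "SYMBOL";
        }
    }

    /** Plays the role of the parser: asks for tokens until EOF. */
    static List<String> tokenize(Reader in) {
        List<String> tokens = new ArrayList<>();
        try {
            TinyScanner scanner = new TinyScanner(in);
            for (int t = scanner.yylex(); t != EOF; t = scanner.yylex()) {
                tokens.add(name(t) + "(" + scanner.yytext() + ")");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            // Read from the file whose path is given as the first argument.
            try (Reader file = new FileReader(args[0])) {
                tokenize(file).forEach(System.out::println);
            }
        } else {
            tokenize(new StringReader("var x := 42")).forEach(System.out::println);
        }
    }
}
```

Run without arguments, it scans a built-in sample string. Given a file path as the first argument, it scans that file instead.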

Click here to download my second project, which is a text file. Download JFlex and run it over this text file to generate a .java file. Create a new Java Application project and include the newly generated file in it. The generated scanner reads its input from a file, so provide the full path of the test file, including the filename, as a parameter when running the project.
