In the course project, we will implement a compiler for an object-oriented language called MiniJava into LLVM assembly/IR. The project is adapted from materials kindly provided by the staff of the compiler classes at the University of Athens and the University of Washington. We also thank Hila Peleg and Oren Ish-Shalom for consulting on the project design.
MiniJava is (almost exactly) a subset of Java, defined in the Appendix of Appel and Palsberg's Modern Compiler Implementation in Java, 2nd edition and described on the MiniJava web site.
Compared to full Java, a MiniJava program consists of a single class containing a static main
method followed by zero or more other classes. The main method includes a single statement (including a block).
A
can contain a field of type B
only when B
is defined later in the file. When class B extends A
, the definition of A
must come before that of B
.There are no static methods (except for main
) or static variables.
There are no global functions or variables.
int
, boolean
, int[]
and reference types (classes).
Classes can contain fields of these basic types or of other classes. Classes can contain methods with arguments of basic or class types.MiniJava supports single inheritance but not interfaces. It does not support function overloading. Methods overriding is supported: a method defined in a parent class can be redefined in a subclass, but it is an error if it exists with other argument types or return type in the parent.
public
and all fields protected
. A class method cannot access fields of another class, with the exception of its superclasses, but can call methods of another class. A class's own methods can be called via this
: for example, this.foo(5)
calls the object's own foo
method, a.foo(5)
calls the foo
method of object a
.
x
shadows a field x
of the surrounding class.All methods are value-returning and must end with a
single return
statement.
The main
method cannot access its argv
arguments.
Only default constructors are supported; the new
operator calls a default void constructor which initializes all fields to zero.
System.out.println(...);
"
is a statement and can only print integers - it is not a
normal method call, and does not refer to a static method of a
class.if
, while
, and
assignment. if
statements are always followed by else
.new A()
, o.f(arg1, ...)
, arr[i]
and arr.length
.o.f(arg1, ...)
expressions, o
must be either a local variable, a field, a new A()
, or this
, but more complex expressions for o
are disallowed. For example, code such as a.g().f()
is an error. (This restriction is special to our course, and is not present in the original MiniJava.)-3
is invalid, but 0 - 3
is valid.
/* comments */
are not nested, i.e.,
you need to implement /* */
comments, but the
first */
terminates any open comment regardless
of how many /*
sequences have appeared before it,
as in standard Java. There are also single-line //
comments.
In our project we will be using a specific way to represent programs an abstract syntax tree (AST).
In Java code, we represent an AST using the class hierarchy provided here.
Your code will process an XML representation of the AST. (See the XML tutorial.)
Here are some example MiniJava programs and XML representations of their AST.
The following XML schema defines the span of possible ASTs in XML.
To ensure that an XML file represents a well-formed AST,
validate the file against the schema using (on Linux):
xmllint --schema schema/ast.xsd yourCoolAst.xml --noout
The provided solution starter kit (see more details below) provides code to translate from an XML to Java objects and vice versa.
To read an AST from XML and print it as a Java program, use
java -jar mjavac.jar unmarhsal print yourCoolAst.xml out.java
There is also code for writing an AST to XML; for example,
java -jar mjavac.jar unmarhsal marshal yourCoolAst.xml out.xml
reads the AST from the XML file and outputs the same AST to another XML file. This may not seem so useful (it's not), but some of the exercises ask you to output to XML an AST that you will generate, so the code that does the marshaling will be useful to you.
In your solutions, you may modify the AST classes as you desire, but you must preserve the same XML interface. If you want to add additional fields, you might find the @XmlTransient
annotation helpful.
The provided build settings (in build.xml
) are intended for Ubuntu 18.04 and Java 11.
It works using Apache Ant, you need to have it installed (e.g. sudo apt install ant
).
Compile the project by running ant
(and ant clean
for cleaning), which creates a JAR file called mjavac.jar
. Execute by java -jar mjavac.jar ...
.
To install Ant on Ubuntu, you can run sudo apt install ant
.
Your build might run into problems in the generate
phase - as a temporary workaround you can remove (by editing build.xml
) the dependency of the compile
task on generate
.
Also note that defining a class of name Symbol
leads to name clashes with CUP's class of the same name.
On the "nova" university server, the problem seems to be related to a preexisting old installation of JFlex. Fortunately, there is a simple workaround (due to the ant -lib ./tools
(instead of simply ant
).
Note that you need to use Java 11 for this; you can download a .tar.gz
file and "install" it locally as explained e.g. here (from bash
).
Submit all your code in a zip file.
ant
in this directory, and this must generate a JAR file called mjavac.jar
.
ids.txt
that contains the ID numbers of the students in the group, each in a different line.
Your code must compile on a standard Ubuntu 18.04 distribution. The grade on a project that doesn't compile is 0. We will try to notify you when your submission doesn't compile and allow you to resubmit (on the expense of the grace days), but we can't promise to do that - make sure your code compiles.
One student from each group should submit the solution via the course's Moodle (under "HW submission").