Tuesday, September 8, 2009

This will all end in tears

So this semester Mississippi State is offering a course in Compilers. For the uninitiated, a compiler is what translates a programmer's source code into machine code that can be executed by a computer. MSU hasn't offered this course in about five years, and may not offer it again for some time, so I jumped on the chance to take it while I could.

Today we received our first assignment: implement a Lexical Analyzer Generator. A lexical analyzer scans a string of characters and groups them into tokens, labeling each piece of the input with a category. For example, a lexical analyzer for English would produce tokens for nouns, verbs, adjectives, etc. So the sentence "I drove to the store." would get tokenized as something like PRONOUN VERB PREPOSITION ARTICLE NOUN. This stream of tokens could then be input into a syntax analyzer to check whether the sentence conforms to English grammar. Now, English being the mess it is, this isn't always an easy task. Programming languages, on the other hand, are typically members of a much simpler class of languages known as the "context free" languages.
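
As a rough illustration of the idea, here is a minimal Python sketch of a toy lexical analyzer for that example sentence. The LEXICON table, the tokenize function, and the token names are all made up for this post. A real analyzer generated by lex works from regular-expression rules rather than a word list, so treat this as a sketch of the concept and not of how lex does it.

```python
import re

# Hypothetical mini-lexicon: just enough words for the example sentence.
LEXICON = {
    "i": "PRONOUN",
    "drove": "VERB",
    "to": "PREPOSITION",
    "the": "ARTICLE",
    "store": "NOUN",
}


def tokenize(sentence):
    """Return a list of (token_type, lexeme) pairs for the sentence."""
    tokens = []
    # Pull out runs of letters and single punctuation marks; skip whitespace.
    for lexeme in re.findall(r"[A-Za-z]+|[^\sA-Za-z]", sentence):
        if lexeme.isalpha():
            # Words not in the lexicon are labeled UNKNOWN instead of failing.
            token_type = LEXICON.get(lexeme.lower(), "UNKNOWN")
        else:
            token_type = "PUNCTUATION"
        tokens.append((token_type, lexeme))
    return tokens


if __name__ == "__main__":
    for token_type, lexeme in tokenize("I drove to the store."):
        print(token_type, repr(lexeme))
```

The list of token types (with the lexemes dropped) is what a syntax analyzer would consume next. Note that this sketch also emits a token for the trailing period, which the sentence example in the text leaves out.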

The traditional lexical analyzer generator is lex. Our assignment is to reimplement lex (or, more accurately, its modern incarnation, flex). To help me organize my thoughts for this assignment, I will be posting commentary on lex/flex and our own implementation, Luthor. For anyone not interested in Formal Languages or Computer Science, this might be pretty boring, but you never know.
