Structure of the LEX program
In this article, we will cover the basic concepts of LEX and YACC programs in compiler design and the structure of a LEX program.
Introduction to LEX:
Lex and Yacc are tools designed for writers of compilers and interpreters.
They help us write programs that transform structured input. In programs with structured input, two tasks occur again and again:
- Dividing the input into meaningful units (tokens).
- Establishing or discovering the relationship among the tokens.
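To make the first task concrete, here is a minimal hand-written sketch in C of dividing input into tokens, which is the job Lex automates. The token kinds and the names next_token and struct token are hypothetical, chosen only for this illustration; they are not part of Lex.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds, chosen only for this illustration. */
enum token_kind { TOK_NUMBER, TOK_WORD, TOK_SYMBOL, TOK_END };

struct token {
    enum token_kind kind;
    char text[32];   /* copy of the matched characters, truncated if longer */
};

/* Read one token starting at *cursor and advance *cursor past it. */
struct token next_token(const char **cursor)
{
    struct token tok = { TOK_END, "" };
    const char *p = *cursor;

    while (isspace((unsigned char)*p))   /* whitespace only separates tokens */
        p++;

    if (*p == '\0') {                    /* nothing left: report end of input */
        *cursor = p;
        return tok;
    }

    const char *start = p;
    if (isdigit((unsigned char)*p)) {
        tok.kind = TOK_NUMBER;
        while (isdigit((unsigned char)*p))
            p++;
    } else if (isalpha((unsigned char)*p)) {
        tok.kind = TOK_WORD;
        while (isalpha((unsigned char)*p))
            p++;
    } else {
        tok.kind = TOK_SYMBOL;           /* any other single character */
        p++;
    }

    size_t len = (size_t)(p - start);
    if (len >= sizeof tok.text)
        len = sizeof tok.text - 1;
    memcpy(tok.text, start, len);
    tok.text[len] = '\0';

    *cursor = p;
    return tok;
}
```

Calling next_token repeatedly on the string "x = 42" should yield a word, a symbol, a number, and then the end marker. The second task, working out how those tokens relate to each other, is what a parser generated by Yacc handles.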
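Two rules to remember about Lex:
- Lex will always match the longest possible token (measured in number of characters).
Ex: Input: abc
Then [a-z]+ matches abc rather than a, ab, or bc.
- If two or more possible tokens are of the same length, then the token whose regular expression is defined first in the lex specification is favored.
Ex:
[a-z]+ {printf("Hello");}
hit {printf("World");}
Input: hit Output: Hello
Both patterns match all three characters of hit, so the rule listed first wins. (Written as [hit], the second pattern would be a character class that matches only a single h, i, or t, and the longest-match rule alone would decide.)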
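The two rules can be sketched in plain C as a small selection loop. This is only an illustration of the rule-selection idea, not how Lex is implemented (Lex compiles the patterns into a state machine). The names struct rule, pick_rule, match_lowercase_run, and match_hit are hypothetical; the two matchers stand in for the pattern [a-z]+ and for a rule that matches the literal text hit, and the range check assumes an ASCII-compatible character set.

```c
#include <stddef.h>
#include <string.h>

/* A rule pairs a matcher with a label. The matcher returns how many
   characters it accepts at the start of the input (0 means no match). */
struct rule {
    size_t (*match)(const char *input);
    const char *label;
};

/* Stands in for the pattern [a-z]+ (assumes ASCII-style ordering). */
size_t match_lowercase_run(const char *input)
{
    size_t n = 0;
    while (input[n] >= 'a' && input[n] <= 'z')
        n++;
    return n;
}

/* Stands in for a rule that matches the literal text hit. */
size_t match_hit(const char *input)
{
    return strncmp(input, "hit", 3) == 0 ? 3 : 0;
}

/* Return the index of the winning rule, or -1 if no rule matches.
   The length of the winning match is stored in *matched_len. */
int pick_rule(const struct rule *rules, size_t count,
              const char *input, size_t *matched_len)
{
    int best = -1;
    size_t best_len = 0;

    for (size_t i = 0; i < count; i++) {
        size_t len = rules[i].match(input);
        if (len > best_len) {   /* strictly longer only, so a tie keeps the earlier rule */
            best = (int)i;
            best_len = len;
        }
    }
    if (matched_len != NULL)
        *matched_len = best_len;
    return best;
}
```

With the lowercase-run rule listed first, the input hit selects index 0 because both rules match three characters and the earlier one is kept. On the input hits, the lowercase-run rule wins regardless of order, because it matches four characters against three.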
The Structure of LEX:
%{
Definition section
%}
%%
Rules section
%%
User Subroutine section
The Definition section is the place to define macros and include C header files. It is also possible to write any C code here, which will be copied verbatim into the generated source file. This C code is bracketed with %{ and %}.
The Rules section is the most important section. Each rule is made up of two parts: a pattern and an action, separated by whitespace. Patterns are simply regular expressions, and actions are C code. When the lexer that lex generates sees text in the input matching a given pattern, it executes the associated action. The section is bracketed with %% and %%.
The User Subroutine section is where all the required procedures are defined. It contains main along with any other C statements and functions, all of which are copied verbatim to the generated source file. These typically include code called by the rules in the rules section.
Sample LEX program to recognize numbers
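%{
#include <stdio.h>
%}
%%
[0-9]+ { printf("Saw an integer: %s\n", yytext); /* yytext holds the matched text */ }
. { ; /* ignore any other character */ }
%%
int main(void)
{
    printf("Enter some input that consists of an integer number\n");
    yylex();
    return 0;
}
int yywrap(void)
{
    return 1; /* 1 tells the lexer there is no more input */
}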
Output:
Running Lex program:
[student@localhost ~]$ lex 1a.l
[student@localhost ~]$ cc lex.yy.c
[student@localhost ~]$ ./a.out
Enter some input that consists of an integer number
hello 2345
Saw an integer: 2345
Explanation:
The first line runs lex over the lex specification and generates a file, lex.yy.c, which contains the C code for the lexer.
The second line compiles the C file into an executable, a.out.
The third line runs the executable.
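To get a feel for what the generated lexer does with the sample specification, here is a rough hand-written approximation in C. It is a sketch only and is not the code that lex writes to lex.yy.c. The function name report_integers and its output buffer are hypothetical; the buffer is used instead of printing directly so that the result is easy to inspect. The sketch also skips every non-digit character, newlines included.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Scan `input`, appending one "Saw an integer: N" line to `out` for every
   run of digits and skipping every other character. Returns the number of
   integers found. `out` must point to at least `out_size` bytes. */
int report_integers(const char *input, char *out, size_t out_size)
{
    int found = 0;
    size_t used = 0;

    if (out_size > 0)
        out[0] = '\0';

    while (*input != '\0') {
        if (isdigit((unsigned char)*input)) {
            const char *start = input;
            while (isdigit((unsigned char)*input))   /* take the longest run, like [0-9]+ */
                input++;
            int len = (int)(input - start);
            if (used < out_size) {
                int n = snprintf(out + used, out_size - used,
                                 "Saw an integer: %.*s\n", len, start);
                if (n > 0)
                    used += (size_t)n;   /* may pass out_size if truncated; later writes are then skipped */
            }
            found++;
        } else {
            input++;   /* plays the role of the rule `. { ; }` */
        }
    }
    return found;
}
```

Given the input hello 2345 from the run above, this function should report one integer and leave the line Saw an integer: 2345 in the buffer.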
Summary:
This article discussed the structure of a LEX program: the Definition section, the Rules section, and the User Subroutine section, along with a sample program and how to run it.