--------- COP C-like compiler for NXT lego ROBOT -------

28th of August 2007 – v1.2

 

 

 

New :

-        partial enrichment  of the “programming guide

-        optimized management of library, only functions that are called are generated

-        enrichment of the sound library

-        several bugs are fixed, see : 11 – Fixed bugs

-        all error messages are now translated in English

-        entire translation of the GUI in English

 

Go to the table of contents :

Go to Download section :

 

_________________________________________________________________________________________________________________________________________________________

Introduction :

 

Find in this WEB page, explanations about a C-like COP compiler dedicated to the NXT Lego ROBOT.

I have started this project 6 months ago, and  I have not enough time to work on it, so it is still an ongoing project J

 

To be quite honest, I have not built any robots, and my son ( 5 years old ) looks at the mountain of pieces and asks :“Daddy, when are you building this robot ?”  and I always answer : “When this compiler is finished !”. I guess he doesn’t understand why.

 

Sources of this compiler are free, that is to say that anybody can reuse, and modify this compiler. I don’t know if this compiler can be useful for other people ...

 

To help other developers, this web page intends to give enough information to make this compiler a reusable source.

 

This WEB page and compiler are beta-versions, many remaining issues are to be completed, especially :

-        a complete specification of code patterns

-        lot of bugs to be discovered !!! J

-        automated generation of code to allocate memory for arrays

-        a complete library

-        a library dedicated to mathematic function to be developed

-        and so many other stuffs like that

 

Find below explanations concerning this compiler and files to be downloaded.

 

Do not hesitate to send me an email to reports bugs of the compiler and or any comments on this web page to :

 

__________________________________________________________________________________________________________________________________________________________

 

Below a list of favourite links :

 

Web sites :

 

Lego WEB site :

http://mindstorms.lego.com/Overview/NXT_Software.aspx

Extensions of NXT :

 http://www.philohome.com/nxt.htm

National Instrument tool kit :

http://www.ni.com/academic/mindstorms/

A Web site where you can purchase NXT robots :

http://www.robopolis.com/produit/2474/20/Robots-programmables/MINDSTORMS-NXT-V41.php

NBC compiler, which is the code generated by COP compiler :

http://bricxcc.sourceforge.net/nbc/

An assembly language fot NXT :

http://users.ece.utexas.edu/~valvano/assmbly/syntax.htm

More sensors for NXT :

http://www.mindsensors.com/index.php?module=pagemaster&PAGE_user_op=view_page&PAGE_id=57&MMN_position=24:24

Robots developed for computer science training course, in the RENNES university (IFSIC, I am a former student of IFSIC ;-)) :

http//www.irisa.fr/caps/people/puaut/Mindstorm/TPMindstorm.html#TP4

Bricx Command Center 3.3 :

http://bricxcc.sourceforge.net/nbc/

NBC Debugger :

http://nxtasy.org/2006/11/29/nbc-debugger-for-nxt/

Other tools :

http://www.lysator.liu.se/~nisse/lego-nxt-dev/

Debugger for NXT :

http://www.sorosy.com/lego/nxtdbg/

 

 

Forum :

 

        French Forum : http://www.vieartificielle.com/forum/

        English/American Forum : http://thenxtstep.com/forums.html

        French WEB site dedicated to LEGO  : http://www.freelug.org/

English/American Forum : http://forums.nxtasy.org

Lego Forum : http://messageboards.lego.com/ShowForum.aspx?ForumID=1042

POB-Technology, a company dedicated to robots : http://pob-technology.com/

__________________________________________________________________________________________________________________________________________________________

 

 

1- What is COP compiler ?

2- Installation guide

3- User manual

4- Grammar implemented in COP compiler

5 –Programming guide

        5.1 Task

        5.2 Function

        5.3 Types and structures

        5.4 Expression

5.5 Variable

5.6 Assigment

5.7 If

        5.8 For, continue and break

        5.9 While

        5.10 Switch

5.11 Return

5.12 Arrsize

        5.13 Arrinit

        5.14 Arrsubset

        5.15 Arrbuild

        5.16 Precedes

        5.17 Label and Goto

        5.18 Inline

        5.19 Mutex, acquire & Release

6- How COP compiler works

-        6.1 Principles

-        6.2 Data model

-        6.3 Lexical parsing

-        6.4 Syntax parsing

1.   Principles

2.   Top-Down analysis

3.   Bottom-Up analysis

4.   Comments on the syntax parsing in the COP compiler

1.   Example of the for loop

2.   Dedicated grammar for operators

3.   Mapping between rules and C functions

-        6.5 Code generation

1.   Principles :

2.   Mapping between C structures and code generation functions  :

3.   Code generation patterns :

7 - libraries

8 – Examples of COP sources

9 – Errors 

10 – Download section : sources, files and executable  

11 – Fixed bugs :

12 - Annexes

 

__________________________________________________________________________________________________________________________________________________________

 

1 – What is COP compiler :

 

For X-mas, I got a Lego NXT robot. As I am a lazy guy, I do not want to code NBC instructions. So I decided to write my own C-like compiler which generates NBC codes. For people which are familiar with the C language syntax, I guess they won’t be destabilized by COP compiler.

 

This compiler is fully open, it means that you can enrich it in term of functions or data or enrich its library as well, C sources are downloadable.

 

As I am not a very good programmer, the C code is not commented ( few comments are in French ) but in section 5, I try to give enough information for a better understanding. Moreover, I have fixed a huge number of bugs, but I am quite sure there are still lot of  bugs to be discovered and fixed. Do not hesitate to report bugs.

 

This compiler is compiled with lcc compiler ( http://www.cs.virginia.edu/~lcc-win32/ ) and use an external library (dislin : http://www.mps.mpg.de/dislin/).

 

To complete this introduction, as you can read ( see ) , I am French and English is not my native language, so do not hesitate to report me all  misunderstandings due to my poor English.

 

__________________________________________________________________________________________________________________________________________________________

 

2- Installation guide

 

The installation is quite simple :

 

1) Create a directory named “COP” and copy “cop.exe” in this directory ( see executable file  )

2) Create a sub-directory named “source”

3) Create a sub-directory  of “source” named “lib” and copy all library files in this directory (see 6 - Libraries )

4) If you want to use COP source examples, copy these files in the “source” directory ( see 7 - Example of COP sources )

5) Create you own COP sources in the “source” directory.

 

The directory tree must look as below :

 

COP --+

            + cop.exe

            +  source -- +

        + COP files eg. essai_librairie_def.cop.txt

                                + lib -- +

                                           + display.cop.lib

                                           + file.cop.lib

                                            + motor.cop.lib

                                            + sensor.cop.lib

                                            + sound.cop.lib

        + string.cop.lib

                                           + system.cop.lib 

 

Or download the package : 10 – Download section : sources, files and executable

__________________________________________________________________________________________________________________________________________________________

 

3- User manual

 

The use of COP compiler is quite simple.

The only thing you have to do is to execute cop.exe.

 

Below this window will be displayed :

 

 

The selection of the “exit” menu  will end the program.

The selection of the “help” menu will show you the current version of COP compiler.

The selection of file will offer you a dedicated window to select the cop file you want to compile.

 

The following options are offered :

            - Syntax Parsing traces : if selected, the entire syntax parsing is traced in the text window.

            - Syntax tree, types and variables : if selected, the entire syntax tree which is the result of the parsing step is displayed in the text window.

            - Code generation traces : If selected, traces will be added in the target .nbc file.

            - Automatic generation mutex : If selected, mutexes are automatically declared and invoked to protect accesses to all functions.

 

To compile the only thing you have to do, is to press the “OK” button.

 

 

 _____________________________________

 COP Compiler v.1.2 / 28 th August 2007

 Author : Olivier POURCHER

 _____________________________________

 

 

 

 PASS #1 : pre-processing ( defines lexical parsing of define completed

 Lexical parsing of define completed : D:\DELTA\lcc1\source\essai_sound.cop.txt - 68 lines

 Lexical parsing of define completed : lib\display.cop.lib.txt - 123 lines

 Lexical parsing of define completed : lib\file.cop.lib.txt - 118 lines

 Lexical parsing of define completed : lib\motor.cop.lib.txt - 115 lines

 Lexical parsing of define completed : lib\sensor.cop.lib.txt - 181 lines

 Lexical parsing of define completed : lib\sound.cop.lib.txt - 154 lines

 Lexical parsing of define completed : lib\string.cop.lib.txt - 55 lines

 Lexical parsing of define completed : lib\system.cop.lib.txt - 74 lines

 PASS #1 : Successfully completed / 117 'defines' parsed

 

 

 

 

 PASS #2 : Lexical parsing of libraries and module to be compiled

 Lexical parsing completed : lib\display.cop.lib.txt - 123 lines

 Lexical parsing completed : lib\file.cop.lib.txt - 118 lines

 Lexical parsing completed : lib\motor.cop.lib.txt - 115 lines

 Lexical parsing completed : lib\sensor.cop.lib.txt - 181 lines

 Lexical parsing completed : lib\sound.cop.lib.txt - 154 lines

 Lexical parsing completed : lib\string.cop.lib.txt - 55 lines

 Lexical parsing completed : lib\system.cop.lib.txt - 74 lines

 PASS #2 : Successfully completed

 PASS #2 : Lexical parsing completed : D:\DELTA\lcc1\source\essai_sound.cop.txt  - 68 lines

 

 

 

 PASS #3 : Syntax parsing of libraries and module to be compiled

 End of parsing of global variables

 -------> module lib\file.cop.lib.txt line 31 ::

[Warning Error #SX0675: Types not compatible in expression ] token = )

 -------> module lib\file.cop.lib.txt line 39 ::

[Warning Error #SX0675: Types not compatible in expression ] token = )

 End of parsing of functions and tasks

 compiled line = 68

 # of types =12

 # of variables = 89

 # of functions = 47

 # of tasks = 1

 PASS #3 : Successfully completed

 PASS #3 : Syntax parsing completed D:\DELTA\lcc1\source\essai_sound.cop.txt - total compiled lines = 888

 

 

 

 PASS #4 : Code generation of libraries (only those called) and module to be compiled

 Code generation completed D:\DELTA\lcc1\source\essai_sound.cop.txt

 PASS #4 : Successfully completed

 

 

 Press a key to terminate D:\DELTA\lcc1\source\essai_sound.cop.txt

 

__________________________________________________________________________________________________________________________________________________________

 

4- Grammar implemented in COP compiler

 

Find below a description of the grammar of the COP language.

 

<Program> ->  [ <define declaration> | <struct declaration> | <variable declarations>| <mutex declarations>   | <function> | <task> ]0..n

<define declaration> -> define  <ident>  <text>

 

<struct declaration> -> struct { < field definition >1..n };

<field definition> -> <type> <list of fields>

<list of fields> -> <field> , <list of fields> | <field> ;

<field> -> [*]0|1 <ident> [ [<integer>] ] 0..n

 

<variable declarations> -> <type> <list of variables>

<list of variables> -> <variable> , <list of variables> | <field> ;

<variable> -> [*]0|1 <ident> [ [<integer>] ] 0..n [= <init>]0|1

<init> -> <numeric value> | <string>

<type> -> char | schar | byte | int | sint | long | slong | <ident type>

<list of mutex> -> <mutex> , <list of mutex> | <mutex> ;

< mutex> -> mutex <ident>

 

<ident type> -> <generic ident >

<ident > -> <generic ident >

 

<function> -> <ident> ( <list of parameters> )  < body > |

          < return value > <ident> ( <list of parameters> | void  ) < body >

<list of parameters>  -> <list of parameters> ,<parameter> | <parameter>

<parameter> -> [ mod ] 0|1  <type> [*]0|1 <ident>

<return value> -> void | <type>

<task> -> task  <ident> ( ) < body >

 

<body> -> {  [<variable declarations>] 0|1  [<instruction>]0..n  }

<instruction> -> <assignment> |

                         <for loop> |

                        <do while loop> |

                        <if then else> |

                        <switch> |

                        <goto> |

                        <label> |

                        <continue> |

                        <break> |

                         <exit> |

<return> |

<acquire> |

<release> |

<precedes> |

<arrinit> |

<arrsize> |

<arrbuild> |

<arrsubset> |

<inline> |

< function call> ;

 

<assignment> -> [ [*( <ident> + <expr> ) | <ident assign> ] [=  <expr> ; ] 0|1    |  <ident> ++; | <ident> -- ;

<for loop> -> for ( [ <affectation> ] 0..1   ;  [<expr>] 0..1  ; <affectation> ] 0..1   ) <body instruction>

<do while loop> -> do {  [<instruction>]0..n  } while ( <expr>   ) ;

<if then else> -> if ( <expr> ) <body instruction> [ else <body instruction>  ] 0..1 

<switch> -> switch ( <expr> ) { [< switch body>]0..1  }

<switch body> -> case <expr> :  [<body instruction>] 0..1  [break ; ] 0..1 |

    default  : [<body instruction>] 0..1 

<goto> -> goto  <ident> ;

<label> -> <ident> :

<continue> -> continue ;

<break> -> break ;

 <exit> -> exit ( ) ; 

<return> -> return ( expr ) ;

<acquire> -> acquire ( <ident> ) ;

<release> -> release ( <ident> ) ;

<precedes>-> precedes <list of tasks> ;

<list of tasks> -> <list of tasks> , <ident> | <ident>

<arrinit> -> arrsubset ( <ident>, <expr>, <expr> ) ;

<arrsize> -> arrssize ( <ident>, <ident> ) ;

<arrbuild> -> arrsbuild (  <list of arrays> ) ;

<list of arrays> -> <list of arrays> , <ident> | <ident>

<arrsubset> -> arrsubset ( <ident>, <ident>, <expr>, <expr> ) ;

<inline> -> inline “<text>” ;

 

<body instruction> -> {  [<instruction>]0..n  } |

<instruction>

 

 

<expr>  -> [ <expr rec ou>  ] 0..1  

<expr rec ou>  -> <expr rec et>  | <expr rec et>  || <expr rec ou> 

<expr rec et>  -> <expr rec comp>  | <expr rec comp>  && <expr rec et> 

<expr rec comp>  -> <expr rec arith>  | <expr rec arith> [ ==| != |  >|  <|  <=|  >=  ] <expr rec comp> 

<expr rec arith>  -> <expr rec>  | <expr rec>  [  + | - |  | ] <expr rec arith> 

<expr rec>  -> <expr rec shr shl>  | <expr rec shr shl>  [  & | /  | *  | % ]  <expr rec> 

<expr rec shr shl>  -> <expr rec not>  | <expr rec not >  [  >> | <<  ]   <expr rec shr shl > 

<expr rec not>  -> <expr rec neg>  | ! <expr rec neg >

<expr rec neg>  -> <expr rec atom>  | -  <expr rec atom >

<expr rec atom>  -> <atom> | <var func> | (<expr rec ou>   )

<var func> - > <ident assign>| <function call>

 

<ident assign>-> <ident>  |

                           <ident> ++ ; |

                           <ident>  --  ; |

   <ident> [ . <ident assign> ] 1..n   | 

                          <ident> [ [<expr> ] ] 1..n  

 

<function call> -> <ident> (  [< list of function parameters>]0..1   )

< list of function parameters> -> < list of function parameters> , <expr> | <expr>

 

<atom> -> <numeric value> | <string>

<generic ident > ->  [a..z] | [A..Z] [[a..z] | [A..Z] | [0..9] | _ ] 0..n

<string> -> [all ASCII code] 0..n

<numeric value> -> <binary>  | <hexa>  | <decimal>

<binary> -> 0b [0|1] 1..n

<hexa> -> 0x [ [0|1] | [ [A..F] | [a..f] ] ] 1..n

<decimal>-> [0..9]  [0..9] 1..n

<text> -> [all ASCII code] 0..n

 

__________________________________________________________________________________________________________________________________________________________

 

5- Programming guide

 

5.1 Task

 

Downloadable example

 

A COP program contains at least one task which is called ‘main’. This is the entry point of the program. Several tasks can be defined and launched, see precedes.

These tasks can be executed in parallel.

 

Below an example of program :

 

task tache1()

 {

 }

 

 

task tache2()

 {

 }

 

task main()

 {

            precedes tache1,tache2;

 }

 

5.2 Function

 

downloadable example

 

A COP program can contain several functions. Some of these functions are defined in libraries, eg. COP_wait() and cannot be double defined.

As they are not really  local variables  ( they are emulated by the COP compiler ), mutex can be automatically generated at complie time to have a one and only one call at a given time.

 

A function is composed of 4 parts : < return value > <ident> ( <list of parameters>) < body >

 

            - type of returned value, if exists :

                        - the type can be void if no value are returned

                        - the type can be an atomic ‘type’ or structure

            - identifier of the function

            - parameters

                        - parameters can be passed by value or can be modified  ( mod )

            - body of the function

                        - the body contains a set of instruction and local variables ( see variables )

 

Example1 :

 

void ffff()

 {

 }

 

task main()

 {

            ffff();

 }

 

example 2 :

 

int abcdefgh ()

 {

            return(1);

 }

 

task main()

 {

            int v;

           

            v=abcsdefgh();

 }

 

example 3 :

 

 

struct COORD

 {

            int x;

            int y;    

 };

 

struct LINE

 {

            COORD ext1;

            COORD ext2;

 };

 

int ff(mod int g, int k,mod int x,mod COORD z)

 {

            int a,b,c,d:

 

            g=1;

            x = 0;

            z.x=1;

            z.y=2;

 }

 

task main()

 {

            int a,b,c;

            LINE line1;

            COORD point;

           

            c=ff(a,b,line.ext1.x,point);       

 }

5.3 Types and structures

 

downloadable example :

 

Cop program can define data types and structures.

 

Atoms :

            byte     unsigned 8 bits

            char     unsigned 8 bits

            schar   signed 8 bits

            int        unsigned 16 bits

            sint      signed 16 bits

            long     unsigned 32 bits

            slong   signed 32 bits

 

<type> -> char | schar | byte | int | sint | long | slong | <ident type>

 

Nota : float is not available, this type is not supported by the NXT. I am writing a library dedicated to “float an mathematic functions ( cos, sin, … ).

Nota : bollean type doesn’t exist, but can be replaced by a byte variable, 0 is ‘FALSE’, all other values (>0) are ‘TRUE’.

 

Pointers:

Pointers are used to handle arrays which are passed as ‘reference’ parameters in function.

 

<parameter> -> [ mod ] 0|1  <type> [*]0|1 <ident>

 

Arrays :

Arrays are scalar types. Arrays can be applied to atom or structures. Arrays can be multi-dimensional, COP compiler support up to 10 dimensions.

These dimensions are used to address the exact memory location.

Strings are defined as arrays of  char, eg.  char String[10]. Strings must be terminated by a null  byte.

 

<variable> -> [*]0|1 <ident> [ [<integer>] ] 0..n [= <init>]0|1

 

Nota : If arrays are passed by ‘reference’ in a called, they must be then used as pointer in the called function.

 

Structures :

A structure is set of fields which are gathered into a one and only one structure. A structure can contains, atoms, arrays, structures or arrays of structures.

 

<struct declaration> -> struct { < field definition >1..n };

<field definition> -> <type> <list of fields>

<list of fields> -> <field> , <list of fields> | <field> ;

<field> -> [*]0|1 <ident> [ [<integer>] ] 0..n

 

Example 1 :

 

struct COORD

 {

            int x;

            int y;    

 };

 

struct LINE

 {

            COORD ext1;

            COORD ext2;

 };

 

LINE line[10];

 

 

void assign(mod int d,int s)

 {

            d=s;

 }

 

void init_line(mod LINE *l,int index, int x1, int y1,int x2,int y2)

 {

            LINE v;

 

            assign(v.ext1.x,x1);     

            assign(v.ext1.y,y1);

           

            v.ext2.x=x2;

            v.ext2.y=y2;

 

            *(l+index)=v;  

 }

 

 

task main()

 {

            int x;

            int valeur;

            char buffer[10];

            LINE v;

 

            inline "arrinit globale_line,0 , 10";

           

COP_DrawText(1,1,"INIT");

            COP_Wait(1000);

 

            x=0;

 

            do

              {

                        init_line(line,x,1,x,60,60-x);

                        x++;

              }

            while(x<10);

           

            for(x=0;x<10;x++)

             {

                        v=line[x];

 

                        COP_LineOut(v.ext1.x, v.ext1.y, v.ext2.x, v.ext2.y, 1);

                        COP_Wait(1000);

             }

           

            COP_DrawText(1,1,"FIN");

            COP_Wait(1000);

 

            COP_LineOut(1, 1, 20, 20, 1);

            COP_Wait(5000);

 }

5.4 Expression

 

downloadable example

 

to be completed

5.5 Variable

to be completed

5.6 Assigment

to be completed

5.7 If

 

downloadable example

 

The if instruction follows this pattern: if ( <expr> ) <body instruction> [ else <body instruction>  ] 0..1 

 

example 1 :

 

if (a==b) k=0;

 

example 2 :

 

if (a==b)

{

k=0;

d=1;

            }         

 

example 3 :

 

if (a==b)

{

k=0;

d=1;

            }

else

            {

k=1;

d=0;

            }         

5.8 For, continue and break

 

downloadable example

 

The for  instruction follows this pattern: for ( [ <affectation> ] 0..1   ;  [<expr>] 0..1  ; <affectation> ] 0..1   ) <body instruction>

The for loop is divided into 3 :

-         initial value

-         logical expression which stops the loop if equal to false, eg. if i =5, (i<4) is evaluated as FALSE

-         incrementation, this assignement change values of variable which are part of the logical expression

 

The ‘break’ instruction ‘jumps’ to next instruction which follows the for loop.

The ‘continue’ instruction ‘jumps’ to the beginning of the for loop but executes the incremental instructions.

 

example 1 :

 

task main()

 {

            int a;

            char chaine[10];

            int inv;

           

            for(a=0;a<10;a++)

             {        

                        if(inv==1)

                         {

                                   inv=1-inv;

                                   continue;

                         }

 

                        inv=1-inv;

             }        

 

            /* test 2 */

 

            for(a=0;;a++)

             {                                           

                        if(a>5) break;

             }

 

 

            /* test 3 */

 

            a=0;

            for(;;)

             {        

                                  

                        if(a>5)  break;

 

                        a++;

             }

5.9 While

 

downloadable example

 

The do/while  instruction follows this pattern: <do while loop> -> do {  [<instruction>]0..n  } while ( <expr>   ) ;

The while loop evaluate a logical expression which stops the loop if equal to FALSE, eg. x<10

The ‘break’ instruction ‘jumps’ to next instruction which follows the for do/while loop.

The ‘continue’ instruction ‘jumps’ to the beginning of the do/while loop but executes the incremental instructions.

 

example 1 :

 

task main()

 {

            int x;

            int valeur;

            char buffer[10];

           

            COP_DrawText(1,1,"AFFICHAGE");

        COP_Wait(1000);

 

            x=0;

 

            do

              {

                        COP_NumtoString(x,buffer);

                        COP_DrawText(1,1,buffer);

                        COP_Wait(1000);

                        x++;

              }

            while(x<10);

 

            COP_DrawText(1,1,"FIN");

        COP_Wait(1000);

 }

5.10 Switch

 

downloadable example

 

5.11 Return

 

downloadable example

 

return is use to return value ( amazing ! ). Return can only be used in functions which return values.

return can contain complex expression, but the return type must comply with the declared type of the function.

 

example 1 :

 

int f(int a,int x,int b)

 {

            return(a*x+b);

 }

 

task main()

 {

            int x;

            int valeur;

            char buffer[10];

           

            COP_DrawText(1,1,"AFFICHAGE");

            COP_Wait(1000);

 

            x=0;

 

            do

              {

                        valeur=f(2,x,3);

                        COP_NumtoString(valeur,buffer);

                        COP_DrawText(1,1,buffer);

                        COP_Wait(1000);

                        x++;

              }

            while(x<10);

 

COP_DrawText(1,1,"FIN");

            COP_Wait(1000);

 }

5.12 Arrsize

to be completed

5.13 Arrinit

to be completed

5.14 Arrsubset

to be completed

5.15 Arrbuild

to be completed

5.16 Precedes

downloadable example

to be completed

5.17 Label and Goto

 

downloadable example

 

label and goto are used to jump to a given instruction of the program.

God of programmers never uses the goto instruction, using goto is strictly forbidden ( even in the toilet J).

Obviously , to program a goto, a label must be declared in the code.

 

<goto> -> goto  <ident> ;

<label> -> <ident> :

 

example 1 :

 

/**/

/**/

/**/

/*******************************/

/* this function opens file in */

/* read or write mode          */

/*******************************/

 

byte COP_Fopen(char *name,char *mode)

 {

            inline "dseg segment";

            inline "FO TFileOpen";

            inline "expr_g_byte_tab_COP_Fopen byte[]";

            inline "expr_d_byte_tab_COP_Fopen byte[]";

            inline "expr_g_byte_COP_Fopen byte";

            inline "expr_d_byte_COP_Fopen byte";

            inline "expr_byte_tab_COP_Fopen byte[]";

            inline "inc_COP_Fopen word";

           

            inline "dseg ends";

 

            if(mode=="w")

             {        

                        inline "mov FO.Filename, COP_Fopen_name";

                        inline "mov FO.Length, 100";

                        inline "syscall FileOpenWrite, FO";

                        inline "mov expr_byte_COP_Fopen, FO.FileHandle";

                        goto end_fopen;

             }

            if(mode=="r")

             {

                        inline "mov FO.Filename, COP_Fopen_name";

                        inline "mov FO.Length, 100";

                        inline "syscall FileOpenRead, FO";

                        inline "mov expr_byte_COP_Fopen, FO.FileHandle";

             }

    end_fopen:

 }

 

5.18 Inline

 

downloadable example

 

inline is a very simple and powerful instruction. It can be used to directly low level nbc instructions.

inline’ is also massively used in libraries to make the link between COP functions and nbc libraries.

 

<inline> -> inline “<text>” ;

 

Nota : the text between the two quotes is not parsed.

 

example 1 :

 

/**/

/**/

/**/

/*******************************/

/* this function opens file in */

/* read or write mode          */

/*******************************/

 

byte COP_Fopen(char *name,char *mode)

 {

            inline "dseg segment";

            inline "FO TFileOpen";

            inline "expr_g_byte_tab_COP_Fopen byte[]";

            inline "expr_d_byte_tab_COP_Fopen byte[]";

            inline "expr_g_byte_COP_Fopen byte";

            inline "expr_d_byte_COP_Fopen byte";

            inline "expr_byte_tab_COP_Fopen byte[]";

            inline "inc_COP_Fopen word";

           

            inline "dseg ends";

 

            if(mode=="w")

             {        

                        inline "mov FO.Filename, COP_Fopen_name";

                        inline "mov FO.Length, 100";

                        inline "syscall FileOpenWrite, FO";

                        inline "mov expr_byte_COP_Fopen, FO.FileHandle";

                        goto end_fopen;

             }

            if(mode=="r")

             {

                        inline "mov FO.Filename, COP_Fopen_name";

                        inline "mov FO.Length, 100";

                        inline "syscall FileOpenRead, FO";

                        inline "mov expr_byte_COP_Fopen, FO.FileHandle";

             }

    end_fopen:

 }

5.19 Mutex, acquire & Release

downloadable example

to be completed

 

__________________________________________________________________________________________________________________________________________________________

 

6- How COP compiler works

 

6.1 Principles

C files :

 

The C code is made of 4 C files (see 9 – Download section : sources, files and executable) :

            - cop.c : this is the heart of COP compiler, this module contains the entry point of the program ( main ) and the calls of the four passes.

            - lexical.c : this module focuses on the lexical parsing pass and pre-processing pass

            - syntaxique.c : this module focuses on the syntax parsing pass

            - genere_code.c : this module focuses on the code generation pass

            - delta_encrier.h : this header contains C structures and defines used by COP compiler.

 

4 passes compiler :

 

The COP compiler is a 4 passes compiler :

            - 1) pre-processing pass, which identifies all defines

            - 2) a lexical parsing pass

            - 3) a syntax parsing pass

            - 4) a code generation pass

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


Above, a schema which illustrates flows and main functions of the COP compilers.

 

The first pass, pre-processing, parses all files and stores all defines ( #define <ident> text ).

ð     it generates a table of defines

ð     file : lexical.c

ð     entry point : cop.c :

o       pre-processing : analyse_lexicale_define()

The second pass, lexical parsing, parses all files and identifies a flow of lexem ( +, -, <ident>, for, if, else  … ), if an <ident> is detected and if this <ident> is a ‘define’, it is replaced by its value.

ð     it generates a table of lexems

ð     file : lexical.c

ð     entry point : cop.c : analyse_lexicale()

The third pass, syntax parsing, parses the list of lexems according to the grammar rules specified in the chapter 4- Grammar implemented in COP compiler.

ð     it generates table of structures, table of variables, table of functions, table of tasks, tables of instructions. Those table can be considered as a syntax tree.

ð     file : syntaxique.c

ð     entry point : cop.c : analyse_syntaxique()

The fourth pass, code generation, generates the NBC codes by analysing the syntax tree.

            =>  it generates a NBC target file

ð     file : genere_code.c

ð     entry point : generation_de_code()

 

In the next chapters, main data, and each pass is explained in detailed.

6.2 Data Model

 

Below the data model of the COP compiler.

 

 

 

This table comments the data model :

 

Structure

Array/variable

Index

Semantic

Display/debug function

PROGRAM

PROGRAM program;

 

This variable contains the Variable_globales array.

The principle is that even if we are facing to a local variable of a function or a task, variables are stored in the same table. To find the relevant variable, first, we always try to find it in the function or task and if it is not defined at this level, we assume that it is a global variable.

void analyse_syntaxique(void)

 

printf("\n LISTE DES VARIABLES ");

for(i=0;i<program.nb_vg;i++)

{

printf("\n\n#################################\n VARIABLE %d",i);

affiche_variable(i);

}

TACHE

TACHE tab_tache[MAX_TACHE];

int nb_tache;

This table contains the list of task that are defined in the program.

void analyse_syntaxique(void)

for(i=0;i<nb_tache;i++)

 {

printf("\n\n#################################\n TACHE %d '%s' nb_instruction %d",i,tab_tache[i].ident_tache,tab_tache[i].nb_instruction);

affichage_instruction(tab_tache[i].liste_instruction);

 }

FONCTION

FONCTION tab_fonction[MAX_FONCTION];

int nb_fonction;

This table contains the list of function that are defined in the program

void analyse_syntaxique(void)

printf("\n LISTE DES INSTRUCTIONS ");

 for(i=0;i<nb_fonction;i++)

 {

printf("\n\n#################################\n FONCTION %d '%s' nb_instruction %d ",i,tab_fonction[i].ident_fonction,tab_fonction[i].nb_instruction);

affichage_instruction(tab_fonction[i].liste_instruction);

}

VARIABLE_GLOBALE

See PROGRAM

 

This structure models variable, both global and local variables.

void affiche_variable(int index)

TYPE_VARIABLE

TYPE_VARIABLE tab_type_variable[MAX_TYPE_VARIABLE];

int nb_tab_type_variable;

This structure models types of variables. A type of variable can be :

   - atom ( #define ATOME 1 )

   - array ( #define TABLEAU 2 )

   - array of structure ( #define TABLEAU_STRUCTURE 4 )

   - structure ( #define STRUCTURE 2)

   - field of a structure ( #define CHAMP 5 )

This is a recursive structure.

The void init_type_variable(char *prof_bloc) function defines atom types such as int, char, byte, long, sint, char, sbyte, slong …

void affiche_type(TYPE_VARIABLE *type,int prof)

INSTRUCTION

INSTRUCTION tab_instruction[MAX_INSTRUCTION];

int nb_instruction;

 

This structure models a list of atomic instructions such as an assignment, a for loop …

void affichage_instruction(INSTRUCTION *inst)

INSTRUCTION_ELT

 

 

This union models an atomic instructions such as an assignment, a for loop, an if, …

 

SEXPRESSION

 

 

This structure models binary expression, in other words a node of a binary tree which contains an operator and 2 or 1 argument. Example, < +, sub-exp1,sup-exp2> or <not,sub-exp1,null > or a leaf (feuille in French ).

void affichage_recursif(SEXPRESSION *exp)

FEUILLE

 

 

This union models leaf of tree, which be really a leaf such an integer, a string … but can also be a call of function which contains arguments which are also modelled as expression trees ( eg. f(1+2,3*4 ) ) … but can also be an array variable which contains dimensions which are also modelled as expression trees ( tab[2][f(3-2)][tab1[j*3]] …

See void affichage_recursif(SEXPRESSION *exp)

SATOME

 

 

This structure models a real leaf of an expression tree, it means that it cannot be decomposed. Eg. integer value = 1968

 

SPOINTEUR

 

 

This structure models indirection of a pointer.

 

SVARIABLE

 

 

This structure models variables which can be simple variable such as i or j, or scalar variable such as array ( t[2][i] ) or structures ( a.b.c ).

 

SFONCTION

 

 

This structure models call of function such as  f(1+2,3*4 )

 

SAFFECTATION

 

 

This structure models an assignment of  a variable.

See void affichage_instruction(INSTRUCTION *inst)

SIF

 

 

This structure models the « if » instruction.

void affiche_inst_if(INSTRUCTION *inst)

SFOR

 

 

This structure models the « for » instruction.

void affiche_inst_for(INSTRUCTION *inst)

SBLOC

 

 

 

 

SSWITCH

 

 

This structure models the « switch » instruction.

See void affichage_instruction(INSTRUCTION *inst)

SDO_REPEAT

 

 

This structure models the « do while » instruction.

void affiche_inst_do_while(INSTRUCTION *inst)

SRETURN

 

 

This structure models the « return » instruction.

See void affichage_instruction(INSTRUCTION *inst)

SARRSIZE

 

 

This structure models the « arrsize » instruction.

See void affichage_instruction(INSTRUCTION *inst)

SARRINIT

 

 

This structure models the « arrinit » instruction.

See void affichage_instruction(INSTRUCTION *inst)

SARRSUBSET

 

 

This structure models the « arrsubset » instruction.

See void affichage_instruction(INSTRUCTION *inst)

SARRBUILD

 

 

This structure models the « arrbuild » instruction.

See void affichage_instruction(INSTRUCTION *inst)

SPRECEDE

 

 

This structure models the « precede » instruction.

See void affichage_instruction(INSTRUCTION *inst)

TAB_MUTEX

TAB_MUTEX tab_mutex[MAX_MUTEX];

int nb_mutex;

This structure models mutex ( semaphore ).

void analyse_syntaxique(void)

  for(i=0;i<nb_mutex;i++)

{

printf("\n\n#################################\n MUTEX %d \n ident ='%s' module '%s' ligne '%d' ",

                                                                                              i                tab_mutex[i].ident,

                                                                                              tab_mutex[i].module,

                                                                                              tab_mutex[i].ligne);

}

TAB_DEFINE

TAB_DEFINE tab_define[MAX_DEFINE];

int nb_define;

This structure models defines that are processed during the lexical analysis pre-processing. See Pre-processing pass :

 

ECHANT

ECHANT *Techant;

long nb_echant;

This structure models a lexem, it means an elementary token of the COP language, such as LFOR, LBEGIN (“{“ ), LEND (“=” ), see Lexical parsing :

 

 

6.3 Lexical parsing

 

Pre-processing pass :

 

 

 

 

 

 

 

 

 

 

 

 

 


The entry point is the lexical.c :: analyse_lexicales_define() function.This function calls the lecture_donnees_define() function.

The lecture_donnees_define() function opens the file to be parsed, calls a line after line parsing function named analyse_fichier_ligne_a_ligne_define(), and then closes the file.

The analyse_fichier_ligne_a_ligne_define(), parses line after line the file and store defines in the tab_define[] array ( nb_define is the index ) by invoking the ajout_define() function ( ajout = add in English ).

Lexical parsing :

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


The lexical parsing is far more complex.

Before commenting in detail the C code, below a summary of the lexical parsing principle.

The lexical parsing analyses a flow of ASCII characters and generates a flow of lexems.

 

A lexem is :

-  a keyword  :  for, task , while, define, break … are keywords

- an integer : 123, 0xfffff, 0b0110110 … are integers

- a string : « hello world » … is a string

- an ident : variable_identifier, i, j … are idents

- separators : , ; { } ( ) … are seprators

- operators : + - / * == != .. are operators

All ASCII characters between /* ignored ASCII charecters and */  are ignored. /* */ is used for comments.

All ASCII such as CarriageReturn, Line Feed, Tab … are ignored.

 

In theory, lexems to be recognized can me modelled as a state machine. The principle is to recognize the longer string as possible.

Below a simplified state machine for the COP language :

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


In this state machine, there are states and arrows between states. “Start” is the initial state. LBEGIN, LEND, LENTIER … are “end” states. The current ASCII character is the key used to determine the relevant transition between two states. Actually, in COP compiler, this state machine is hard-coded.

 

The entry point is the lexical.c :: analyse_lexicales() function.This function calls the lecture_donnees() function.

The lecture_donnees() function opens the file to be parsed, calls a line after line parsing function named analyse_fichier_ligne_a_ligne(), and then closes the file.

The analyse_fichier_ligne_a_ligne(), parses line after line the file, skips comments ( /* */ ) and recognized lexems ans stores them in the Techant[] array ( nb_echant is the index ).

 

Example for LBEGIN which is the lexem attached to { :

                                   if(buff[index]=='{')

                                                           {

#ifdef TESTLEX

                                                                       printf(" { [%d] ",nb_echant);

#endif

                                                                       Techant[nb_echant].type_lexeme=LBEGIN;

                                                                       strcpy(Techant[nb_echant].lexeme,"{");

                                                                       Techant[nb_echant].num_ligne=ligne_lu;

                                                                       strcpy(Techant[nb_echant].module,nom_fic);

                                                                       nb_echant++;

                                                               index++;

                                                                       goto fin;

                                                           }

Example for LCOMPEG which is the lexem attached to == :

 

/* LEG         31 /* = */

/* LCOMPEG     20 /* == */

 

 

                                               if(buff[index]=='=')

                                                           {

                                                                       if(buff[index+1]=='=')

                                                                        {

#ifdef TESTLEX

                                                                                  printf(" == [%d] ",nb_echant);

#endif

                                                                                  Techant[nb_echant].type_lexeme=LCOMPEG;

                                                                                  strcpy(Techant[nb_echant].lexeme,"==");

                                                                                  Techant[nb_echant].num_ligne=ligne_lu;

                                                                                  strcpy(Techant[nb_echant].module,nom_fic);

                                                                                  index=index+2;

                                                                                  nb_echant++;

                                                                                  goto fin;

                                                                       }

                                                                       else

                                                                       {

#ifdef TESTLEX

                                                                                  printf(" = [%d] ",nb_echant);

#endif

                                                                                  Techant[nb_echant].type_lexeme=LEG;

                                                                                  strcpy(Techant[nb_echant].lexeme,"=");

                                                                                  Techant[nb_echant].num_ligne=ligne_lu;

                                                                                  strcpy(Techant[nb_echant].module,nom_fic);

                                                                                  index++;

                                                                                  nb_echant++;

                                                                                  goto fin;

                                                                       }

                                                  }

 

Example for LIF which is the lexem attached to if :

 

 

                                               if(         ((buff[index]>='a')&&(buff[index]<='z')) ||

                                                           ((buff[index]>='A')&&(buff[index]<='Z'))

                                                 )

                                                           {

                                                                       char chaine_tempo[1000];

                                                                       int k;

 

                                                                       k=0;

                                                                       for(;(

                                                                            ((buff[index]>='a')&&(buff[index]<='z')) ||

                                                                                   ((buff[index]>='A')&&(buff[index]<='Z')) ||

                                                                            ((buff[index]>='0')&&(buff[index]<='9')) ||

                                                                             (buff[index]=='_'))

                                                                           ;)

                                                                                  {

                                                                                              chaine_tempo[k]=buff[index];

                                                                                              k++;index++;

                                                                                  }

 

                                                                       chaine_tempo[k]='\0';

 

/* LIF         21 /* if */

 

                                                                       if(strcmp("if",chaine_tempo)==0)

                                                                          {

#ifdef TESTLEX

                                                                       printf(" IF %s [%d] ",chaine_tempo,nb_echant);

#endif

                                                                                              Techant[nb_echant].type_lexeme=LIF;

                                                                                              strcpy(Techant[nb_echant].lexeme,chaine_tempo);

                                                                                              Techant[nb_echant].num_ligne=ligne_lu;

                                                                                              strcpy(Techant[nb_echant].module,nom_fic);

                                                                                              nb_echant++;

                                                                                              goto fin;

                                                                                  }/*if*/

 

In this compiler, the following list of lexems are implemented :

 

#define LBEGIN      0 /*{ */

#define LEND        1 /* } */

#define LFOR        2 /* for */

#define LWHILE      3 /* while */

#define LDO         4 /* do */

#define LTASK       5 /* task */

#define LSUBROUTINE 6 /* subroutine */

#define LIDENT      7 /*identifier */

#define LENTIER     8 /* int */

#define LCHAR       24 /* char */

#define OCTET       26 /* byte */

#define LFLOTTANT   25 /* float */

#define CHAINE      9 /* chaine = string*/

#define LPV         10 /* ; */

#define LV          11 /* , */

#define LPLUS       12 /* + */

#define LMOINS      13 /* - */

#define LMUL        14 /* * */

#define LDIV        15 /* / */

#define LEG         31 /* = */

#define LOP         32 /* ( */

#define LCL         33 /* ) */

#define LCOMPINF    16 /* < */

#define LCOMPSUP    17 /* > */

#define LCOMPINFEG  18 /* <= */

#define LCOMPSUPEG  19 /* >= */

#define LCOMPEG     20 /* == */

#define LIF         21 /* if */

#define LELSE                        22 /* else */

#define LRETURN     23 /* return */

#define LCONTINUE   27 /* continue */

#define LBREAK      28 /* break */

#define LSWITCH     29 /* switch */

#define LCASE       30 /* case */

#define LDP         34  /* : */

#define LET         35 /* & */

#define LOU         36 /* | */

#define LCOMPET     37 /* && */

#define LCOMPOU     38 /* || */

#define LNON        39 /* ! */

#define LADDR       40 /* & */

#define LCROP       41 /*[*/

#define LCRCL       42 /* ] */

#define LCHIFFREENT 43 /* 123 */

#define LCHIFFRREEL 44 /*123.52 */

#define LOPCOMM     45 /* /* */

#define LCLCOMM     46 /* */

#define LPOINT      47  /* . */

#define LSTRUCT     48 /* struct */

#define LEXIT       49 /* exit */

#define LVOID       50 /* LVOID */

#define LINC        51 /* ++ */

#define LDEC        52 /* -- */

#define LGOTO       53 /* goto */

#define LCOMPDIFF   54 /* != */

#define LINDIRECT   55 /* -> */

#define LSIZEOF     56 /* sizeof */

#define LDEFAULT    57 /* default */

#define LMODULO     58 /* % */

#define LSCHAR      59 /* schar */

#define LSINT       60 /* sint */

#define LLONG       61 /* long */

#define LSLONG      62 /* slong */

#define LSHL        63 /* << */

#define LSHR        64 /* >> */

#define LPRECEDES  65 /* precedes */

#define LMODIFY                 66 /* mod */

#define LINLINE                    67 /* inline */

#define LNEG             68 /* moins unaire */

#define LMUTEX                   69 /* semaphore mutex */

#define LACQUIRE    70 /* acquire */

#define LRELEASE     71 /* release */

#define LARRSIZE     72 /* arrsize */

#define LARRINIT      73 /* arrinit */

#define LARRSUBSET           74 /* arrsubset */

#define LARRBUILD  75 /* arrbuild */

6.4 Syntax parsing

 

Principles :

 

The syntax  parsing is far far more complex than the two previous steps …

Before commenting in detail the C code, below a summary of the parsing  principles.

 

The syntax parsing  analyses a flow of lexems character and generates a syntax tree.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


In theory, there are different types of languages, some of them can be modelled by using state machine, eg. LALR(2) state machine.

Smart programmers use yacc tool ( yet another compiler compiler ) to generate automatically state machines.

 

 

 

As I am not a smart guy, I have chosen to program my own syntax parser by using a simple “functional” parser. In a nutshell, a functional parser associates a function per “grammar rules”.

The stack which saves contexts is handled by the compiler itself and is managed by the micro-processor during the execution.

 

Below an example of syntax parsing of the “for” instruction : <for loop> -> for ( [ <affectation> ] 0..1   ;  [<expr>] 0..1  ; <affectation> ] 0..1   ) <body instruction>

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


Example :

 

<bloc> -> { <sint> }

<sint> -> <inst> | <inst> <sinit>

<inst> -> <if>  | <assign>  | <func call>

<assign> -> <ident> = <exp>;

<func call> -> <ident> ();

<if> -> if <expr> <bloc>

<exp> -> <comp> | <func call exp>  | 1

<comp> -> <ident> == <ident>

<func call exp> -> <ident> ()

 

Top-down analysis :

 

The top-down approach is based on the fact that the grammar rule to be recognized is chosen first, in other words, before having read all the tokens. That’s why in some cases, backtrack is required if ambiguous choice between several grammar rules occur.

If we consider a top-down approach , this grammar is modelled as follow :

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


Example of parsing :

 

{

            if ( i=j) {g();k=1;}

}

 

Below the stack at each step of the parsing process :

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

;

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

;

 

 

 

1

1

 

 

 

 

 

 

 

 

 

 

 

 

 

 

)

)

 

 

=

=

=

 

 

 

 

 

 

 

 

 

 

 

 

=

(

(

(

 

Ident

“k”

Ident

“k”

Ident

“k”

Ident

“k”

 

 

 

 

 

 

 

 

 

 

 

Ident

“g”

Ident

“g”

Ident

“g”

Ident

“g”

Ident

“g”

inst

inst

inst

inst

inst

 

 

 

 

 

 

ident
”j”

ident
”j”

 

 

{

{

{

{

{

{

{

{

{

{

{

 

 

 

 

 

==

==

==

 

)

)

)

)

)

)

)

)

)

)

)

)

 

 

 

 

ident “i”

ident “i”

ident “i”

ident “i”

exp

exp

exp

exp

exp

exp

exp

exp

exp

exp

exp

exp

exp

 

 

(

(

(

(

(

(

(

(

(

(

(

(

(

(

(

(

(

(

(

 

if

if

if

if

if

if

if

if

if

if

if

if

if

if

if

if

if

if

if

if

{

{

{

{

{

{

{

{

{

{

{

{

{

{

{

{

{

{

{

 

{

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

inst

}

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

inst

sinst

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

{

{

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

)

)

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

exp

exp

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(

(

}

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

if

if

inst

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

{

{

{

bloc

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Bottom-up  analysis :

 

The bottom-up  approach is based on the fact that the grammar rule to be recognized is reduced after a complete recognition of the right part of the “grammar rule”, e.g <A> -> <atome> () is reduced only if <atom> and () have been previously recognized. In some cases, there are conflicts between rules to be reduced or reduce/shift (reading ).

 

Example :

 

<bloc> -> { <sint> }

<sinst> -> <inst> | <inst> <sinst>

<inst> -> <if>  | <assign>  | <func call>

<assign> -> <ident> = <exp>;

<func call> -> <ident> ();

<if> -> if <expr> <bloc>

<exp> -> <comp> | <func call exp>

<comp> -> <ident> == <ident>

<func call exp> -> <ident> ()

 

If we consider a bottom-up approach , this grammar is modelled by an LR(0), as follow : ( I’ve processed this state machine by hand, and it is not very easy … )

 

 

table of the LR(0) parser :

 

S : Shift : Read the current token ( lexem )

R : Reduce

 

State

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

<bloc>

 

 

 

R

<bloc>->{<sinst>}

R

<inst>-><assign>

R

<inst>-><if>

R

<inst>-><func call>

 

 

 

 

R

<func call>-><ident>();

S/26

 

 

R

<exp>-<comp>

<exp>-><func call exp>

 

 

 

<comp>-><ident> == <ident>

<func call exp>-> <ident>()

 

R

<sinst>-><inst><sinst>

R

<assign>-><ident>=<expr>;

R

<if>->if<expr><bloc>

<assign>

 

S/5

 

 

S/5

 

 

 

 

 

 

 

 

 

<if>

 

S/6

 

 

S/6

 

 

 

 

 

 

 

 

 

<sinst>

 

S/3

 

 

S/24

 

 

 

 

 

 

 

 

 

<inst>

 

S/9

 

 

S/ 9 (

<sint> -> .<inst> <sint>

)

R/9 (<sint> -> <inst>.)

Conflict

Use ‘}’to make the right choice

 

 

 

 

 

 

 

 

 

<func call>

 

S/7

 

 

S/7

 

 

 

 

 

 

 

 

 

<comp>

 

 

 

 

 

 

 

 

S/16

S/16

 

 

 

 

<func call exp>

 

 

 

 

 

 

 

 

S/17

S/17

 

 

 

 

<expr>

 

 

 

 

 

 

 

 

S/23

S/13

 

 

 

 

<ident>

 

S/8

 

 

S/8

 

 

 

S/18

S/18

 

S/21

 

 

(

 

 

 

S/10

 

 

 

 

 

 

S/20

 

 

 

)

 

 

 

 

 

S/11

 

 

 

 

 

 

S/22

 

{

S/2

 

 

 

 

 

 

S/2

 

 

 

 

 

 

}

 

 

S/4

 

 

 

 

 

 

 

 

 

 

 

;

 

 

 

 

 

 

S/12

 

 

 

 

 

 

S/25

=

 

 

 

S/14

 

 

 

 

 

 

 

 

 

 

==

 

 

 

 

 

 

 

 

 

 

S/19

 

 

 

if

 

S/15

 

 

S/15

 

 

 

 

 

 

 

 

 

 

This type of  state machine engenders conflicts :

-         Shift / Reduce conflicts

-         Reduce/Reduce conflicts

 

To solve conflicts, there is a simple method which consists in determining the next token to be read in order to choose the relevant action to be processed. For example to choose between “shitfing” <inst>  in the rule <sint> -> .<inst> <sint>  or reducing  <sint> -> <inst>., you may say that if the following token is “}” you reduce else you shift. But in some cases this method do not solve the problem, more powerful state machines must then be used.

The Parsing Algorithm

The LR parsing algorithm now works as follows:

1.     The stack is initialized with [0]. The current state will always be the state that is at the top of the stack.

2.     Given the current state and the current terminal on the input stream an action is looked up in the action table. There are four cases:

o       a shift sn:

§       the current terminal is removed from the input stream

§       the state n is pushed onto the stack and becomes the current state

o       a reduce rm:

§       the number m is written to the output stream

§       for every symbol in the right-hand side of rule m a state is removed from the stack

§       given the state that is then on top of the stack and the left-hand side of rule m a new state is looked up in the goto table and made the new current state by pushing it onto the stack

o       an accept: the string is accepted

o       no action: a syntax error is reported

3.     Step 2 is repeated until either the string is accepted or a syntax error is reported.

Below an execution of the algorithm according to the table of action.

 

Example of parsing :

 

{

            if i=j {g();k=f();}

}

 

 Reduction

 

 

 

 

 

<comp> -> <ident> == <ident>

 

 

<exp> -> <comp>

 

 

 

 

 

 

<func call exp> -> <ident> ()

 

 

<inst> -> <func call>

 

 

Token

{

if

Ident

==

Ident

{

{

{

{

{

Ident

(

)

;

;

Ident

Ident

Ident

State

1

2

15

18

19

21

15

16

15

13

2

8

10

11

12

2

7

2

Action

S/2

S/15

S/18

S/19

S/21

R

S/16

R

S/13

S/2

S/8

S/10

S/11

S/12

R

S/7

R

R

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

;12

;12

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

);11

);11

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(;10

(;10

(;10

 

 

 

 

 

 

 

 

Ident-
j;
19

Ident-
j;
19

 

 

 

 

 

Ident-g;8

Ident-g;8

Ident-g;8

Ident-g;8

<fun call>;12

<fun call>;12

<inst>;7

 

 

 

 

==;18

==;18

==;18

 

 

 

 

{;2

{;2

{;2

{;2

{;2

{;2

{;2

{;2

 

 

 

Ident- I;15

Ident-
I;
15

Ident-
I;
15

Ident-
I;
15

<comp>;15

<comp>;15

<exp>;15

<exp>;13

<exp>;13

<exp>;13

<exp>;13

<exp>;13

<exp>;13

<exp>;13

<exp>;13

<exp>;13

 

 

If;2

If;2

If;2

If;2

If;2

If;2

If;2

If;2

If;2

If;2

If;2

If;2

If;2

If;2

If;2

If;2

If;2

 

{;1

{;1

{;1

{;1

{;1

{;1

{;1

{;1

{;1

{;1

{;1

{;1

{;1

{;1

{;1

{;1

{;1

{;1

 

 

Reduction

 

 

 

 

 

 

<func call exp> -> <ident> ()

 

<exp> -> <func call exp> 

 

 

<assign> -> <ident> = <exp>;

 

<inst> -> <assign> 

<sinst> -> <inst>

 

Token

Ident

Ident

=

Ident

(

)

;

;

;

;

;

}

}

}

}

}

State

2

9

8

14

18

20

22

14

17

14

23

25

9

5

9

9

Action

S/9

S/8

S/14

S/18

S/20

S/22

R

S/17

R

S/23

S25

R

S/5

R

R

S/24

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

);20

);20

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(;18

(;18

(;18

 

 

 

 

;;25

 

 

 

 

 

 

 

 

Ident;14

Ident;14

Ident;14

Ident;14

<Func all exp>;14

<Func all exp>;14

<exp>;14

<exp>;14

<exp>;14

 

 

 

 

 

 

 

=;8

=;8

=;8

=;8

=;8

=;8

=;8

=;8

=;8

=;8

 

 

 

 

 

 

Ident;9

Ident;9

Ident;9

Ident;9

Ident;9

Ident;9

Ident;9

Ident;9

Ident;9

Ident;9

Ident;9

<assign>;9

<assign>;9

<inst>;9

<sinst>;9

 

<inst>;2

<inst>;2

<inst>;2

<inst>;2

<inst>;2

<inst>;2

<inst>;2

<inst>;2

<inst>;2

<inst>;2

<inst>;2

<inst>;2

<inst>;2

<inst>;2

<inst>;2

<inst>;2

 

{;2

{;2

{;2

{;2

{;2

{;2

{;2

{;2

{;2

{;2

{;2

{;2

{;2

{;2

{;2

{;2

 

<exp>;13

<exp>;13

<exp>;13

<exp>;13

<exp>;13

<exp>;13

<exp>;13

<exp>;13

<exp>;13

<exp>;13

<exp>;13

<exp>;13

<exp>;13

<exp>;13

<exp>;13

<exp>;13

 

If;2

If;2

If;2

If;2

If;2

If;2

If;2

If;2

If;2

If;2

If;2

If;2

If;2

If;2

If;2

If;2

 

{;1

{;1

{;1

{;1

{;1

{;1

{;1

{;1

{;1

{;1

{;1

{;1

{;1

{;1

{;1

{;1

 

Reduction

<sinst> -> <inst> <sinst>

 

 

 

<bloc> -> { <sint> }

 

 

<if> -> if <expr> <bloc>

 

 

<inst> -> <if> 

 

<sinst> -> <inst>

 

 

<bloc> -> { <sint> }

 

 

 

Token

}

}

}

}

}

}

}

}

}

}

}

void

 

 

 

 

State

2

2

3

4

13

26

2

2

2

9

2

3

 

 

 

 

Action

R

S/3

S/4

R

S/26

R

S/6

R

S/9

R

S/3

S/4

R