Program Construction

PROGRAM CONSTRUCTION

The compilation process

Compilers ▪ A compiler is a computer program (or a
set of programs) that transforms source code written in a programming language (the source language) into another computer language (the target language), with the latter often having a binary form known as object code.

Interpreters ▪ An interpreter is a computer program or a
tool that executes programming instructions. ▪ An interpreter may either execute the source code directly or converts the source to an intermediate code and execute it directly or execute precompiled code produced by a compiler (some interpreter systems include a compiler for this task). ▪ Languages like Perl, Python, MATLAB and Ruby are examples of programming languages that use an intermediate code. ▪ Languages like Java and BASIC first compile the source to an intermediate code called bytecode and then interpret it.

Compilers vs Interpreters ▪ A complier converts the high level
instruction into machine language while an interpreter converts the high level instruction into an intermediate form. ▪ Before execution, the entire program is executed by the compiler whereas after translating the first line, an interpreter then executes it and so on. ▪ List of errors is created by the compiler after the compilation process while an interpreter stops translating after the first error. ▪ An independent executable file is created by the compiler whereas interpreter is required by an interpreted program each time.

Assemblers ▪ An assembler is software or a tool that
translates Assembly language to machine code. So, an assembler is a type of a compiler and the source code is written in Assembly language. ▪ Assembly is a human readable language but it typically has a one to one relationship with the corresponding machine code. Therefore an assembler is said to perform isomorphic (one to one mapping) translation. ▪ Advanced assemblers provide additional features that support program development and debugging processes. For example, the type of assemblers called macro assemblers provides a macro facility.

Assemblers ▪ When we look at Little Man Computer (LMC)
you will see some assembly instructions like these.

Interpreters vs Assemblers ▪ An assembler can be considered a
special type of compiler, which only translates Assembly language to machine code. ▪ Interpreters are tools that execute instruction written in some language. Interpreter systems may include a compiler to pre-compile code before interpretation, but an interpreter cannot be called a special type of a compiler. ▪ Assemblers produce an object code, which might have to be linked using linker programs in order to run on a machine, but most interpreters can complete the execution of a program by themselves.

Interpreters vs Assemblers ▪ An assembler will typically do a
one to one translation, but this is not true for most interpreters. Because Assembly language has a one to one mapping with machine code, an assembler may be used for producing code that runs very efficiently for occasions in which performance is very important (for e.g. graphics engines, embedded systems with limited hardware resources compared to a personal computer like microwaves, washing machines, etc.) ▪ On the other hand, interpreters are used when you need high portability. For example, the same Java bytecode can be run on different platforms by using the appropriate interpreter (JVM).

Lexical Analysis ▪ Languages are a potentially infinite set of
strings (sometimes called sentences, which are a sequence of symbols from a given alphabet). The tokens of each sentence are ordered according to some structure. ▪ Lexical Syntactical Analysis (LSA) only deals with the front-end of the compiler, the front-end of a compiler only analyses the program, it does not produce code. ▪ From source code (Python for example) lexical analysis produces tokens, the words in the language and these are checked to ensure they conform to the rules of the language. ▪ Literal table produced – contains all strings, constants and identifiers for variables in the program. ▪ Error handler catches errors generated from a poorly formed line for instance.

Lexical Analysis ▪ A lexical analyser also removes all whitespace
and comments ▪ Tokenisation process takes the input tokens and puts them through a keyword recogniser, numeric recogniser, string recogniser and output based on disambiguating rules – rules include reserve words which cannot be used as identifiers, such as print, if, while, etc.

Syntax Analysis ▪ A symbol table is an important data
structure created and maintained by compilers in order to store information about the occurrence of various entities such as variable names, function names, objects, classes, interfaces, etc. ▪ A symbol table may serve the following purposes depending upon the language in hand: – To store the names of all entities in a structured form at one place. – To verify if a variable has been declared – To implement type checking, by verifying assignments and expressions in the source code are semantically correct. – To determine the scope of a name (remember binding). ▪ A symbol table is simply a table which can be either linear or a hash table. It maintains an entry for each name in the following format:

Syntax Analysis ▪ Syntax analysis, or parsing, is needed to
determine if the series of tokens given are appropriate in a language - that is, whether or not the sentence has the right shape/form. ▪ In syntactic analysis, context free grammars are used to construct parse trees to show the structure of the sentence, but they often contain redundant information due to implicit definitions (e.g., an assignment always has an assignment operator in it, so we can imply that), so syntax trees, which are compact representations are used instead

Context Free Grammars ▪ <expression> --> number ▪ <expression> -->
( <expression> ) ▪ <expression> --> <expression> + <expression> ▪ <expression> --> <expression> - <expression> ▪ <expression> --> <expression> * <expression> ▪ <expression> --> <expression> / <expression>

Semantic Analysis ▪ Semantic analysis checks whether the parse tree
constructed follows the rules of language. For example, assignment of values is between compatible data types, and adding string to an integer. ▪ Also, the semantic analyzer keeps track of identifiers, their types and expressions; whether identifiers are declared before use or not etc. ▪ The semantic analyzer produces an annotated syntax tree as an output.

Intermediate code generation ▪ After semantic analysis the compiler generates
an intermediate code of the source code for the target machine. It represents a program for some abstract machine. ▪ It is in between the high-level language and the machine language. ▪ This intermediate code should be generated in such a way that it makes it easier to be translated into the target machine code.

Code optimization ▪ The next phase performs code optimization on
the intermediate code. ▪ Optimization can be assumed as something that removes unnecessary code lines, and arranges the sequence of statements in order to speed up the program execution without wasting resources (CPU, memory).

Code generation ▪ In this phase, the code generator takes
the optimized representation of the intermediate code and maps it to the target machine language. ▪ The code generator translates the intermediate code into a sequence of (generally) re- locatable machine code. ▪ Sequence of instructions of machine code performs the task as the intermediate code would do.

Symbol Table ▪ A symbol table is an important data
structure created and maintained by compilers in order to store information about the occurrence of various entities such as variable names, function names, objects, classes, interfaces, etc. ▪ A symbol table may serve the following purposes depending upon the language in hand: – To store the names of all entities in a structured form at one place. – To verify if a variable has been declared – To implement type checking, by verifying assignments and expressions in the source code are semantically correct. – To determine the scope of a name (remember binding). ▪ A symbol table is simply a table which can be either linear or a hash table. It maintains an entry for each name in the following format:

Program Construction

Program Construction

AllenHeard

More Decks by AllenHeard

Other Decks in Education

Featured

Transcript

PROGRAM CONSTRUCTION

The compilation process

Compilers ▪ A compiler is a computer program (or a

Interpreters ▪ An interpreter is a computer program or a

Compilers vs Interpreters ▪ A complier converts the high level

Assemblers ▪ An assembler is software or a tool that

Assemblers ▪ When we look at Little Man Computer (LMC)

Interpreters vs Assemblers ▪ An assembler can be considered a

Interpreters vs Assemblers ▪ An assembler will typically do a

Lexical Analysis ▪ Languages are a potentially infinite set of

Lexical Analysis ▪ A lexical analyser also removes all whitespace

Syntax Analysis ▪ A symbol table is an important data

Syntax Analysis ▪ Syntax analysis, or parsing, is needed to

Context Free Grammars ▪ <expression> --> number ▪ <expression> -->

Semantic Analysis ▪ Semantic analysis checks whether the parse tree

Intermediate code generation ▪ After semantic analysis the compiler generates

Code optimization ▪ The next phase performs code optimization on

Code generation ▪ In this phase, the code generator takes

Symbol Table ▪ A symbol table is an important data