set of programs) that transforms source code written in a programming language (the source language) into another computer language (the target language), with the latter often having a binary form known as object code.
tool that executes programming instructions. ▪ An interpreter may either execute the source code directly or converts the source to an intermediate code and execute it directly or execute precompiled code produced by a compiler (some interpreter systems include a compiler for this task). ▪ Languages like Perl, Python, MATLAB and Ruby are examples of programming languages that use an intermediate code. ▪ Languages like Java and BASIC first compile the source to an intermediate code called bytecode and then interpret it.
instruction into machine language while an interpreter converts the high level instruction into an intermediate form. ▪ Before execution, the entire program is executed by the compiler whereas after translating the first line, an interpreter then executes it and so on. ▪ List of errors is created by the compiler after the compilation process while an interpreter stops translating after the first error. ▪ An independent executable file is created by the compiler whereas interpreter is required by an interpreted program each time.
translates Assembly language to machine code. So, an assembler is a type of a compiler and the source code is written in Assembly language. ▪ Assembly is a human readable language but it typically has a one to one relationship with the corresponding machine code. Therefore an assembler is said to perform isomorphic (one to one mapping) translation. ▪ Advanced assemblers provide additional features that support program development and debugging processes. For example, the type of assemblers called macro assemblers provides a macro facility.
special type of compiler, which only translates Assembly language to machine code. ▪ Interpreters are tools that execute instruction written in some language. Interpreter systems may include a compiler to pre-compile code before interpretation, but an interpreter cannot be called a special type of a compiler. ▪ Assemblers produce an object code, which might have to be linked using linker programs in order to run on a machine, but most interpreters can complete the execution of a program by themselves.
one to one translation, but this is not true for most interpreters. Because Assembly language has a one to one mapping with machine code, an assembler may be used for producing code that runs very efficiently for occasions in which performance is very important (for e.g. graphics engines, embedded systems with limited hardware resources compared to a personal computer like microwaves, washing machines, etc.) ▪ On the other hand, interpreters are used when you need high portability. For example, the same Java bytecode can be run on different platforms by using the appropriate interpreter (JVM).
strings (sometimes called sentences, which are a sequence of symbols from a given alphabet). The tokens of each sentence are ordered according to some structure. ▪ Lexical Syntactical Analysis (LSA) only deals with the front-end of the compiler, the front-end of a compiler only analyses the program, it does not produce code. ▪ From source code (Python for example) lexical analysis produces tokens, the words in the language and these are checked to ensure they conform to the rules of the language. ▪ Literal table produced – contains all strings, constants and identifiers for variables in the program. ▪ Error handler catches errors generated from a poorly formed line for instance.
and comments ▪ Tokenisation process takes the input tokens and puts them through a keyword recogniser, numeric recogniser, string recogniser and output based on disambiguating rules – rules include reserve words which cannot be used as identifiers, such as print, if, while, etc.
structure created and maintained by compilers in order to store information about the occurrence of various entities such as variable names, function names, objects, classes, interfaces, etc. ▪ A symbol table may serve the following purposes depending upon the language in hand: – To store the names of all entities in a structured form at one place. – To verify if a variable has been declared – To implement type checking, by verifying assignments and expressions in the source code are semantically correct. – To determine the scope of a name (remember binding). ▪ A symbol table is simply a table which can be either linear or a hash table. It maintains an entry for each name in the following format:
determine if the series of tokens given are appropriate in a language - that is, whether or not the sentence has the right shape/form. ▪ In syntactic analysis, context free grammars are used to construct parse trees to show the structure of the sentence, but they often contain redundant information due to implicit definitions (e.g., an assignment always has an assignment operator in it, so we can imply that), so syntax trees, which are compact representations are used instead
constructed follows the rules of language. For example, assignment of values is between compatible data types, and adding string to an integer. ▪ Also, the semantic analyzer keeps track of identifiers, their types and expressions; whether identifiers are declared before use or not etc. ▪ The semantic analyzer produces an annotated syntax tree as an output.
an intermediate code of the source code for the target machine. It represents a program for some abstract machine. ▪ It is in between the high-level language and the machine language. ▪ This intermediate code should be generated in such a way that it makes it easier to be translated into the target machine code.
the intermediate code. ▪ Optimization can be assumed as something that removes unnecessary code lines, and arranges the sequence of statements in order to speed up the program execution without wasting resources (CPU, memory).
the optimized representation of the intermediate code and maps it to the target machine language. ▪ The code generator translates the intermediate code into a sequence of (generally) re- locatable machine code. ▪ Sequence of instructions of machine code performs the task as the intermediate code would do.
structure created and maintained by compilers in order to store information about the occurrence of various entities such as variable names, function names, objects, classes, interfaces, etc. ▪ A symbol table may serve the following purposes depending upon the language in hand: – To store the names of all entities in a structured form at one place. – To verify if a variable has been declared – To implement type checking, by verifying assignments and expressions in the source code are semantically correct. – To determine the scope of a name (remember binding). ▪ A symbol table is simply a table which can be either linear or a hash table. It maintains an entry for each name in the following format: