What Does Compiler Mean?
A compiler is a software program that translates source code written in a high-level programming language into a lower-level form, typically machine code closer to the “bare metal” of the hardware, that the computer can execute directly. The developer writes the source code in a high-level language, and the compiler translates it into lower-level object code to make the result “digestible” to the processor.
Formally, the output of compilation is called object code or sometimes an object module. The object code is machine code that the processor can execute one instruction at a time, usually after a linker has combined it with other object modules into an executable program.
Compilers are needed because of the way that a traditional processor executes object code. The processor uses logic gates to route signals through its circuitry, manipulating binary high and low signals to drive the computer’s arithmetic logic unit. But that’s not how a human programmer builds the code: unlike this basic, binary machine language, the initial high-level code consists of variables, commands, functions, calls, methods and other assorted constructs expressed in a mixture of arithmetic and lexical syntax. All of that needs to be put into a form that the computer can understand in order to execute the program.
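To make that gap concrete, the sketch below hand-translates the high-level expression `(2 + 3) * 4` into instructions for a made-up stack machine and runs them one at a time. The instruction names (`PUSH`, `ADD`, `MUL`) and the `run` function are assumptions invented for illustration; they do not correspond to any real processor's instruction set.

```python
# Hypothetical low-level program for the high-level expression (2 + 3) * 4.
# The instruction set is invented for this illustration.
PROGRAM = [
    ("PUSH", 2),
    ("PUSH", 3),
    ("ADD", None),
    ("PUSH", 4),
    ("MUL", None),
]


def run(program):
    """Execute the instructions one at a time on a simple operand stack."""
    stack = []
    for opcode, operand in program:
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


print(run(PROGRAM))  # 20
```

A single readable expression becomes five small instructions, each of which does one simple thing. Producing such a sequence automatically is the compiler's job.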
A compiler executes four major steps:
- Scanning: The scanner reads the source code one character at a time and keeps track of the line on which each character appears.
- Lexical Analysis: A program called a lexical analyzer groups the sequence of characters in the source code into meaningful units known as tokens (such as keywords, identifiers, operators and literals), according to the rules of the language. The lexical analyzer uses a symbol table to store the words in the source code that correspond to the tokens generated.
- Syntactic Analysis: In this step, the compiler parses the tokens created during lexical analysis to determine whether they appear in the proper order for their usage. The rules that define the correct order of keywords and other tokens are called the syntax of the language. The compiler has to check the source code to ensure syntactic accuracy.
- Semantic Analysis: This step consists of several intermediate steps. First, the structure of tokens is checked against the rules of the language that go beyond grammar, such as whether variables are declared before they are used and whether types match. The meaning of the token structure is then interpreted by the parser and analyzer to generate an intermediate code, from which the object code is produced.
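The four steps above can be sketched for a tiny, made-up language that only supports assignments such as `total = price * (2 + tax)`. Everything here is an assumption for illustration: the token rules in `TOKEN_SPEC`, the `Parser` class, the tuple-based tree, and `check_names`, which stands in for just one small part of semantic analysis and uses a plain set in place of a real symbol table.

```python
import re

# Token rules for a tiny, hypothetical expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("NEWLINE", r"\n"),
    ("SKIP", r"[ \t]+"),
    ("MISMATCH", r"."),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Scanning and lexical analysis: return (kind, text, line) tuples."""
    line, tokens = 1, []
    for match in TOKEN_RE.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "NEWLINE":
            line += 1
        elif kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {text!r} on line {line}")
        elif kind != "SKIP":
            tokens.append((kind, text, line))
    return tokens


class Parser:
    """Syntactic analysis: check token order and build a tree of tuples."""

    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        last_line = self.tokens[-1][2] if self.tokens else 1
        return ("EOF", "<end>", last_line)

    def expect(self, kind, text=None):
        tok = self.peek()
        if tok[0] != kind or (text is not None and tok[1] != text):
            raise SyntaxError(f"expected {text or kind}, got {tok[1]!r} on line {tok[2]}")
        self.pos += 1
        return tok

    def assignment(self):  # assignment := IDENT '=' expr
        name = self.expect("IDENT")[1]
        self.expect("OP", "=")
        value = self.expr()
        self.expect("EOF")
        return ("assign", name, value)

    def binary(self, operand, ops):
        node = operand()
        while self.peek()[0] == "OP" and self.peek()[1] in ops:
            op = self.expect("OP")[1]
            node = (op, node, operand())
        return node

    def expr(self):  # expr := term (('+' | '-') term)*
        return self.binary(self.term, ("+", "-"))

    def term(self):  # term := factor (('*' | '/') factor)*
        return self.binary(self.factor, ("*", "/"))

    def factor(self):  # factor := NUMBER | IDENT | '(' expr ')'
        kind, text, line = self.peek()
        if kind == "NUMBER":
            self.pos += 1
            return ("num", int(text))
        if kind == "IDENT":
            self.pos += 1
            return ("var", text, line)
        if kind == "OP" and text == "(":
            self.pos += 1
            node = self.expr()
            self.expect("OP", ")")
            return node
        raise SyntaxError(f"unexpected {text!r} on line {line}")


def check_names(node, declared):
    """Semantic analysis (one small part): every variable read must be declared."""
    tag = node[0]
    if tag == "var":
        if node[1] not in declared:
            raise NameError(f"undeclared variable {node[1]!r} on line {node[2]}")
    elif tag == "assign":
        check_names(node[2], declared)
    elif tag != "num":
        check_names(node[1], declared)
        check_names(node[2], declared)


tree = Parser(tokenize("total = price * (2 + tax)")).assignment()
check_names(tree, declared={"price", "tax"})
print(tree)
```

An input such as `x = + 1` passes the lexer, because every character belongs to a valid token, but fails in the parser, because the tokens are in the wrong order. An input such as `x = y` is syntactically fine and fails only in `check_names` when `y` has not been declared.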
The object code includes instructions that represent the processor action for a corresponding token when encountered in the program. Finally, the entire code is analyzed to check whether any optimizations are possible. If optimizations can be performed, the appropriately modified instructions are inserted in the object code to generate the final object code, which is saved in a file.
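As a rough illustration of code generation and optimization, the sketch below takes an already-parsed expression tree, applies one simple optimization (constant folding, chosen here only as an example; the text does not name a specific optimization), and emits instructions for a hypothetical stack machine. The tree shape, the opcode names and the functions `fold` and `generate` are all assumptions made for this sketch.

```python
import operator

# Hypothetical mapping from source operators to evaluation functions and opcodes.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}


def fold(node):
    """Optimization: replace constant subexpressions with their computed value."""
    if node[0] in ("num", "var"):
        return node
    op, left, right = node[0], fold(node[1]), fold(node[2])
    if left[0] == "num" and right[0] == "num":
        return ("num", OPS[op](left[1], right[1]))
    return (op, left, right)


def generate(node, out=None):
    """Code generation: emit instructions in post-order for a stack machine."""
    if out is None:
        out = []
    if node[0] == "num":
        out.append(("PUSH", node[1]))
    elif node[0] == "var":
        out.append(("LOAD", node[1]))
    else:
        generate(node[1], out)
        generate(node[2], out)
        out.append((OPCODES[node[0]], None))
    return out


# Tree for the expression: price * (2 + 3)
tree = ("*", ("var", "price"), ("+", ("num", 2), ("num", 3)))

unoptimized = generate(tree)
optimized = generate(fold(tree))
print(len(unoptimized), len(optimized))  # 5 3
```

Without folding, the generated code adds 2 and 3 every time the program runs. With folding, the compiler does that addition once at compile time and emits a single `PUSH 5` instead.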