Compilers — Semantic Analysis

Kevin Da Silva
Jan 20, 2022

Hello again, friend of a friend. Today we are going to talk about what is, in my opinion, the most complex part of compiler theory. If you are not familiar with the previous compilation phases, take a look at part 1 and part 2 first.

What is semantic analysis?

Semantic analysis is responsible for guaranteeing that our statements have the correct meaning. To make that concrete, let's see how a human language works:

The cat barks.

This sentence is lexically correct (only valid English words) and syntactically correct (all words are in their correct position), but there's a missing detail:

A cat doesn't bark, it meows, which means our sentence is meaningless, and that's the problem that semantic analysis solves.

In general, in a programming language, if I have something like this:

5 / "hi"

We will get an error in the semantic analysis phase because we are trying to perform an arithmetic operation with a string value, which makes no sense to our compiler, to mathematics, or to ourselves.
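To make that check a bit more tangible, here is a minimal sketch of how a type check for arithmetic could look. The module name `TypeCheckSketch` and the `{operator, left, right}` expression shape are assumptions made up for this example; they are not part of the interpreter we build later.

```elixir
defmodule TypeCheckSketch do
  # Hypothetical expression shape: a literal, or {operator, left, right}.
  @arithmetic [:+, :-, :*, :/]

  # Returns {:ok, type} for a well-typed expression or {:error, message}.
  def type_of(value) when is_number(value), do: {:ok, :number}
  def type_of(value) when is_binary(value), do: {:ok, :string}

  def type_of({op, left, right}) when op in @arithmetic do
    # Both operands must be numbers for an arithmetic operator.
    with {:ok, :number} <- type_of(left),
         {:ok, :number} <- type_of(right) do
      {:ok, :number}
    else
      {:ok, other} -> {:error, "operator #{op} expects numbers, got a #{other}"}
      {:error, _} = error -> error
    end
  end
end

IO.inspect(TypeCheckSketch.type_of({:/, 5, "hi"}))
# {:error, "operator / expects numbers, got a string"}
IO.inspect(TypeCheckSketch.type_of({:*, 2, {:+, 1, 3}}))
# {:ok, :number}
```

The important part is that the error is found by looking at the types alone, before anything is actually divided.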

The same goes for scopes

number = 32
function fn() {
  number = 2
  return number*2 // should output 4 instead of 64
}

Semantic analysis also keeps track of variables, values, and scopes to make sure we are referring to the correct value in each scope.

number*2 // throws an error because number is not yet defined
number = 32

It also guarantees that we only use variables that have already been declared.
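As a rough sketch of the "declared before use" rule, the check can be done by walking the statements in order and remembering which names were declared. The `{:assign, name}` and `{:use, name}` statement shapes and the `ScopeCheckSketch` module are hypothetical, and nested scopes are left out to keep the example short.

```elixir
defmodule ScopeCheckSketch do
  # Hypothetical statements: {:assign, name} declares a variable,
  # {:use, name} reads one. Statements are checked in source order.
  def check(statements) do
    statements
    |> Enum.reduce_while(MapSet.new(), fn
      {:assign, name}, declared ->
        {:cont, MapSet.put(declared, name)}

      {:use, name}, declared ->
        if MapSet.member?(declared, name) do
          {:cont, declared}
        else
          # Stop at the first use of an undeclared name.
          {:halt, {:error, "#{name} is not defined"}}
        end
    end)
    |> case do
      {:error, _} = error -> error
      _declared -> :ok
    end
  end
end

IO.inspect(ScopeCheckSketch.check([{:use, :number}, {:assign, :number}]))
# {:error, "number is not defined"}
IO.inspect(ScopeCheckSketch.check([{:assign, :number}, {:use, :number}]))
# :ok
```

A real language would keep one such set of names per scope, so that a variable declared inside a function does not leak outside of it.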

And remember:

Semantic analysis is all about meaning, about making sure our programs are able to express their intentions in a concise and unambiguous way.

Performing semantic analysis in our interpreter

Considering that our program is just an interpreter for arithmetic Lisp operations, we will have only one thing to validate in our semantic analysis process.

We have to check for division by zero, because we need to make sure our interpreter is not gonna fail when processing a division statement.

But first…

We are going to need a new module

lisp-eval/lib/lisp_eval/semantic_analysis.ex

defmodule SemanticAnalysis do  

end

And for this module we will define a single function, with one clause per operator plus one clause for numbers:

defmodule SemanticAnalysis do
  # Each clause consumes one prefix expression from the front of the token
  # list and returns {value, remaining_tokens}.

  def semantic_analysis(["*" | tail]) do
    {v1, rest} = semantic_analysis(tail)
    {v2, rest2} = semantic_analysis(rest)
    {v1 * v2, rest2}
  end

  def semantic_analysis(["/" | tail]) do
    {v1, rest} = semantic_analysis(tail)
    {v2, rest2} = semantic_analysis(rest)

    # Compare with == so that a float divisor of 0.0 is rejected too;
    # the pattern 0 would only match the integer zero.
    if v2 == 0 do
      raise "Error division by zero"
    else
      {v1 / v2, rest2}
    end
  end

  def semantic_analysis(["-" | tail]) do
    {v1, rest} = semantic_analysis(tail)
    {v2, rest2} = semantic_analysis(rest)
    {v1 - v2, rest2}
  end

  def semantic_analysis(["+" | tail]) do
    {v1, rest} = semantic_analysis(tail)
    {v2, rest2} = semantic_analysis(rest)
    {v1 + v2, rest2}
  end

  # Anything else is expected to be a string holding an integer.
  def semantic_analysis([head | tail]), do: {round(String.to_integer(head)), tail}
end

After the syntactic analysis processed our program, we got something like this:

program = ["*", "5", "5"]

Then we just have to call:

SemanticAnalysis.semantic_analysis(program)

First we match the operator. Then we call semantic_analysis to get the first numeric value and the remainder of the program, and call it again on that remainder to get the second numeric value. After that, we just do:

v1 operation v2

And return it in a tuple containing the result and the remainder of the program.
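To see how the remainder of the program is threaded from one call to the next, here is a condensed sketch of the same idea applied to a nested expression. `PrefixEvalSketch` and its `operator/1` helper are made up for this illustration and are not the article's module; division is left out on purpose to keep the focus on the recursion.

```elixir
defmodule PrefixEvalSketch do
  # Hypothetical helper: maps an operator token to a two-argument function,
  # or nil when the token is not an operator.
  defp operator("+"), do: &(&1 + &2)
  defp operator("-"), do: &(&1 - &2)
  defp operator("*"), do: &(&1 * &2)
  defp operator(_), do: nil

  # Returns {value, remaining_tokens}.
  def eval([token | tail]) do
    case operator(token) do
      nil ->
        # Not an operator, so it must be a number literal.
        {String.to_integer(token), tail}

      fun ->
        {v1, rest} = eval(tail)   # first operand, plus what is left
        {v2, rest2} = eval(rest)  # second operand, read from what was left
        {fun.(v1, v2), rest2}
    end
  end
end

# (+ 1 (* 2 3)) flattened into prefix tokens
IO.inspect(PrefixEvalSketch.eval(["+", "1", "*", "2", "3"]))
# {7, []}

# Tokens that do not belong to the expression are handed back untouched
IO.inspect(PrefixEvalSketch.eval(["*", "5", "5", "9"]))
# {25, ["9"]}
```

The second element of the tuple is what lets the caller continue reading exactly where the inner expression ended.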

And if the symbol is not an operator, it is a string containing a numerical value, so we just turn this string into a number with String.to_integer. The round call leaves an integer unchanged, so the value stays an integer; it is the / operator that always returns a float in Elixir, which is how a division can produce a floating-point result (for example, 3/2 = 1.5).

Also notice that in the division step we check whether v2 (our divisor) is zero. If it is, we raise an error ourselves, because division by zero is not defined in mathematics and would otherwise fail with an arithmetic error.
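One detail to watch when checking the divisor in Elixir: a literal 0 pattern only matches the integer zero, while a nested division such as `/ 0 3` produces the float 0.0. The sketch below contrasts the two checks and shows a variant that returns a tagged tuple instead of raising. `DivisorCheckSketch` and its functions are hypothetical helpers written for this example.

```elixir
defmodule DivisorCheckSketch do
  # The literal pattern 0 only matches the integer zero.
  def zero_by_pattern?(0), do: true
  def zero_by_pattern?(_), do: false

  # The == comparison treats 0 and 0.0 as equal.
  def zero_by_comparison?(value), do: value == 0

  # Returns a tagged tuple instead of raising.
  def safe_div(_dividend, divisor) when divisor == 0, do: {:error, :division_by_zero}
  def safe_div(dividend, divisor), do: {:ok, dividend / divisor}
end

IO.inspect(DivisorCheckSketch.zero_by_pattern?(0.0))
# false
IO.inspect(DivisorCheckSketch.zero_by_comparison?(0.0))
# true
IO.inspect(DivisorCheckSketch.safe_div(3, 2))
# {:ok, 1.5}
IO.inspect(DivisorCheckSketch.safe_div(3, 0.0))
# {:error, :division_by_zero}
```

Whether to raise or to return an error tuple is a design choice; raising keeps the interpreter code shorter, while a tuple lets the caller decide how to report the problem.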

And voilà, our semantic analysis is ready.

And here we are at the end of one more article. I think it's worth pointing out that semantic analysis is a very simple step for simple languages (JSON, arithmetic operators), but its complexity grows quickly with the number of rules our programs need to follow (a full programming language like Common Lisp, for example, has many more rules to follow in order to guarantee meaning).

And that's understandable, because semantic analysis is responsible for the meaning of the program and is the last step before code generation.

Whew, this one was a bit shorter than the others, because the semantic analysis of our interpreter is very simple. If you wanna take a look at something more complex, here's the semantic analysis of the klang language.

Thank you for reading this far, and I hope to see you in the next article. Bye!

Kevin Da Silva

I'm a back-end developer, functional programming lover, fascinated by computer science and languages. From the south part of Brazil to the world