The process of compilation with C

Our computer is capable of doing many things, but to make them happen we need to talk to it in a language it can understand. The problem is that, at the bottom, what a computer really understands is binary code: a language made of combinations of 0s and 1s. That is a harsh language to write in if we want to communicate anything. Here is where programming languages kick in: they simplify our communication with the computer. However, the computer keeps reading 0s and 1s, and that is the point of our article today: how that actually happens.

That process is called compilation, and the compiler is responsible for doing that labor for us. One of the most popular compilers is the GNU Compiler Collection (GCC), which turns the source files we write into files the computer can run. Giving a computer a few orders directly in machine language (binary code) may not be that complicated, yet writing entire programs that way quickly becomes unmanageable.

Here is where Dennis Ritchie, the creator of the C programming language, comes in. C was one of the first programming languages to have a real impact on how later languages were designed, and it improved the capacity of developers to create programs in a much easier way than before. Although the language was developed between 1972 and 1973, it was not until the 1980s that it gained its popularity. Nowadays many commonly used languages are based on some of the features of C, making it one of the oldest languages still in wide use.

We can think of a programming language in much the same way as a human language: each one has its own grammar and rules for writing it correctly and comprehensibly. But learning to write directly in the language the computer can read is really hard and impractical if we want to express complex ideas. To do that we need a sort of translator that converts our language into machine code so the computer can understand our ideas correctly. Yet for the right ideas to be translated, we need to make sure we follow the rules of the language.

As we already said, the compiler is responsible for making your code machine-readable, and we are going to explore a bit of how the process works.

First of all, we need to consider that the process of compilation is subdivided into four stages: the pre-processor, the compiler, the assembler, and last but not least the linker, which work in that order. We can think of these stages as different departments of an office, each working to get the task done in the most efficient way.

The pre-processor is the first department to deal with the code we have written. It is the first step in translating all the content, and it provides the other departments with the information they need to do their work. It is in charge of inserting into our file the contents of the header files we included. We may think of header files as references needed for our code to make sense, or as dictionaries that declare the words we are using in the code we are compiling. However, they don’t contain the functionality of what we are using; libraries are the ones that store it, and they will appear much later in the process.

In the C language, headers need to be processed before the code that uses them. A header found in most C files is “<stdio.h>” (standard input and output), which declares functions such as “printf”, and it needs to be included before the code that calls them.
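A minimal sketch of what including a header looks like. The function name “format_greeting” is hypothetical and only there for illustration. The “#include” line is what the pre-processor replaces with the contents of the header, and that is where the declaration of “snprintf” comes from.

```c
#include <stdio.h>   /* replaced by the pre-processor with the contents of stdio.h */

/*
 * Hypothetical helper: writes "Hello, <name>!" into buffer.
 * snprintf is declared in <stdio.h>; its actual code lives in the C library.
 * Returns the number of characters the full greeting needs.
 */
int format_greeting(char *buffer, size_t size, const char *name)
{
    return snprintf(buffer, size, "Hello, %s!", name);
}
```

Without the “#include” line the compiler would not know what “snprintf” is, even though the library that contains it is still there.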

The other task assigned to this department is to get rid of all the comments in our code. Comments aren’t part of the program; they are notes we write to make our code more understandable for us and others. Although they are really useful for us, they mean nothing to the computer, so the pre-processor strips them out before the later stages see the code.
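As an illustration, here is a small sketch with both comment styles that C accepts. The function name “seconds_in_minutes” is hypothetical. After the pre-processor stage, only the function itself remains.

```c
/* Block comment: a note for humans, removed by the pre-processor. */
int seconds_in_minutes(int minutes)
{
    // Line comment: also removed before the compiler stage sees the code.
    return minutes * 60;   /* only the statement on this line survives */
}
```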

When the pre-processor is done, the expanded and clean code is ready to pass to the next step, the compiler.

Here is where the language starts to change into something closer to what the computer’s own processor can handle, although at this point it is still human-readable. This department is in charge of decoding the individual words and symbols and associating them with their correct meaning in order to translate them into assembly code. This type of code looks more like a series of instructions that the processor needs to carry out in order to run our program.
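To get a feeling for that translation, consider a function as small as this one (the name “add” is just an example). The comment describes, in plain words, the kind of steps the assembly code would spell out. The exact instructions depend on the processor and the GCC version, so none are shown here.

```c
/* Hypothetical example: adds two integers. */
int add(int a, int b)
{
    /* In assembly this becomes a short series of processor instructions:
       take a, take b, add them, and hand the sum back to the caller. */
    return a + b;
}
```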

The assembler is the department that reads assembly code, so it is the one that can do something with it. Its labor is to translate it into binary, the actual machine code. We are almost there on making the program machine-readable!

The problem with the assembler is that it doesn’t have the resources to link the binary code to the libraries needed. The code that the assembler produces is called object code, which is machine code that only contains data about our own program. In order to run our program, we need the information from the associated libraries. That is where our fellow linker department comes in.
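To picture what is missing from object code, consider this sketch (the function name “announce” is hypothetical). The assembler can produce the machine code for “announce” itself, but “puts” is only declared in “<stdio.h>”. Its code lives in the C library, so the object file just records that it still needs something called “puts”.

```c
#include <stdio.h>

/* Hypothetical example: prints a message and reports whether it worked. */
int announce(const char *message)
{
    /* puts() is defined in the C library, not in this file, so the object
       code for this file only holds an unresolved reference to it. */
    return puts(message) >= 0;
}
```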

The linker is the only department capable of connecting our program with previously created code. In this context, those packs of code are called libraries, and they are incredibly useful in the process of creating a program.

<< There are two types of libraries, static and dynamic, yet we won’t go any further into that concept because it is beyond the purpose of this article. >>

The linker department has what it takes to create a unified program out of many separate object files and libraries, resolving the symbolic references as it goes along. We can think of it as the meeting room, where all the files and existing libraries are brought together to consolidate a consistent program.
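A sketch of the kind of symbolic reference the linker resolves. The two parts would normally live in separate files (the names “area.c” and “room.c” are hypothetical), each turned into its own object file. They are shown together here only so the example stays self-contained.

```c
/* ---- area.c (hypothetical): defines the function ---- */
int rectangle_area(int width, int height)
{
    return width * height;
}

/* ---- room.c (hypothetical): only knows the function's declaration ---- */
int rectangle_area(int width, int height);

int room_area(void)
{
    /* In room.o this call is just a reference to the name rectangle_area;
       the linker connects it to the code that came from area.o. */
    return rectangle_area(3, 4);
}
```

Neither object file is a complete program on its own. It is the linker that puts the call and the definition together.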

By now we should have an executable program, and the only thing we provided was the code we wrote. The rest of the service was provided by The Compiler Office. As amazing as it sounds, that’s not the only service the GCC compiler provides: we also have access to the code produced at any stage of the process.

First of all, to begin this process, we need to call the office by running the “gcc” command in our terminal followed by the name of our source file, “program-script.c”.

This will compile it and give the executable a default name (“a.out”). If we want a custom name, we need to add the flag “-o” followed by the custom file name.

As we mentioned before, we can stop the process at any stage. In order to stop it after the pre-processor stage we need to run the “gcc” command with the flag “-E”. Notice we won’t need to add an executable file name, because the process never gets to that stage.

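This will give us back the pre-processed code, without the comments and with all the header information added. By default it is printed on the terminal; to keep it we can save it in a file, which by convention gets the “.i” extension. This command could be useful if we want to know the content of the header files we added.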

The compiler takes all the information in our pre-processed code, selects what is useful, and writes it in assembly code. Taking a look at the file at this stage could also be useful. This time the flag “-S” does the work, leaving the assembly code in a file with the “.s” extension.

At this point, the code we get by asking for the assembler’s output won’t be as readable as before, although we can still ask for it by running “gcc” with the flag “-c”, which leaves the object code in a file with the “.o” extension.
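The three stopping points described above can be tried on a file as small as this one. This is only a sketch: it assumes the file is saved under the example name “program-script.c”, and the function “double_it” is a hypothetical placeholder. Since none of these commands reaches the linker, the file does not even need a “main” function.

```c
/*
 * program-script.c
 *
 * Stop after the pre-processor (output goes to the terminal):
 *     gcc -E program-script.c
 * Stop after the compiler (writes program-script.s):
 *     gcc -S program-script.c
 * Stop after the assembler (writes program-script.o):
 *     gcc -c program-script.c
 */

/* Hypothetical placeholder so every stage has something to process. */
int double_it(int value)
{
    return value * 2;
}
```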

In the last stage, the file has already been processed by the linker, and that’s the actual executable file. Of course, we can open and read the file, but it will be in binary code; that’s the actual code that the computer reads.

As we go along with the process, the resulting files become more difficult to understand, and the distance grows even more if we compare them with our spoken language. However, remember that computers were made by humans: at some point in history, engineers could only communicate with their computers by using machine language. We should feel lucky about the way we communicate with our computers nowadays. It used to be much harder.
