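A C/C++ source file goes through four steps to become an executable: preprocessing, compilation, assembly, and linking. The word "compile" is commonly used loosely to refer to all four steps together.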

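  • Preprocessing
    In C/C++ source files, directives that begin with "#" are called preprocessor directives, for example the include directive "#include", the macro definition directive "#define", and the conditional compilation directives "#if" and "#ifdef". Preprocessing inserts the included files into the source file, expands macros, and selects the code to keep according to the conditional compilation directives. The result is written to a ".i" file for further processing.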

  • Compilation
    Compilation translates the C/C++ code (for example the ".i" file above) into assembly code.

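  • Assembly
    Assembly translates the assembly code produced by the compilation step into machine code in a specific object file format. On Linux this is normally an ELF object file (OBJ file). "Disassembly" is the reverse operation: converting machine code back into assembly code.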

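  • Linking
    Linking combines the OBJ file produced in the previous step with the OBJ files and library files of the system libraries, producing an executable that can run on a specific platform.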
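
The preprocessing step described above can be illustrated with a small sketch. The macro names BUFFER_SIZE, SQUARE, and VERBOSE and the function scaled_square are made up for this illustration. The comments describe what the preprocessor is expected to leave behind when the file is run through gcc -E.

```c
#include <stdio.h>

/* Hypothetical macros, defined only for this illustration. */
#define BUFFER_SIZE 16
#define SQUARE(x) ((x) * (x))
#define VERBOSE

/* After preprocessing, SQUARE(n) reads ((n) * (n)), BUFFER_SIZE reads 16,
 * and only the #ifdef branch below remains, because VERBOSE is defined. */
int scaled_square(int n)
{
#ifdef VERBOSE
    printf("squaring %d\n", n);
#else
    printf("quiet mode\n");
#endif
    return SQUARE(n) % BUFFER_SIZE;
}
```

Because the sketch has no main function, it can be preprocessed with -E or compiled with -c, but it cannot be linked into an executable on its own.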

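The compiler applies one or more of these four steps to each input file. The suffix of a source file indicates the language it is written in, and the suffix determines the compiler's default action.
File suffix table: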

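Suffix  Type
.c      C source file
.h      Header file (pulled in by the preprocessor)
.cpp    C++ source file
.i      Preprocessed C file
.ii     Preprocessed C++ file
.s      Assembly language source file
.o      Object file
.a      Static library (Linux)
.so     Shared (dynamic) library (Linux)
.lib    Static library (Windows)
.dll    Dynamic-link library (Windows)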
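
To make the idea that the suffix selects the default action concrete, here is a minimal sketch. The function first_stage_for is a hypothetical helper written for this article, not part of gcc, and it is only a simplified model of the table above, not gcc's actual decision logic.

```c
#include <string.h>

/* Hypothetical helper, not part of gcc: maps a file name to the first of the
 * four steps that would apply to it, following the suffix table above.
 * ".h" is left out because headers are normally pulled in via #include
 * rather than passed on the command line. */
const char *first_stage_for(const char *filename)
{
    const char *dot = strrchr(filename, '.');
    if (dot == NULL)
        return "unknown";
    if (strcmp(dot, ".c") == 0 || strcmp(dot, ".cpp") == 0)
        return "preprocess";   /* source files start at step 1 */
    if (strcmp(dot, ".i") == 0 || strcmp(dot, ".ii") == 0)
        return "compile";      /* already preprocessed */
    if (strcmp(dot, ".s") == 0)
        return "assemble";     /* already compiled to assembly */
    if (strcmp(dot, ".o") == 0 || strcmp(dot, ".a") == 0 ||
        strcmp(dot, ".so") == 0 || strcmp(dot, ".lib") == 0 ||
        strcmp(dot, ".dll") == 0)
        return "link";         /* object files and libraries go to the linker */
    return "unknown";
}
```

For instance, first_stage_for("helloworld.i") yields "compile", which matches the fact that a ".i" file does not need to be preprocessed again.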

gcc usage:

gcc --help

Usage: gcc [options] file...
Options:
-pass-exit-codes Exit with highest error code from a phase.
--help Display this information.
--target-help Display target specific command line options.
--help={common|optimizers|params|target|warnings|[^]{joined|separate|undocumented}}[,...].
Display specific types of command line options.
(Use '-v --help' to display command line options of sub-processes).
--version Display compiler version information.
-dumpspecs Display all of the built in spec strings.
-dumpversion Display the version of the compiler.
-dumpmachine Display the compiler's target processor.
-print-search-dirs Display the directories in the compiler's search path.
-print-libgcc-file-name Display the name of the compiler's companion library.
-print-file-name=<lib> Display the full path to library <lib>.
-print-prog-name=<prog> Display the full path to compiler component <prog>.
-print-multiarch Display the target's normalized GNU triplet, used as
a component in the library path.
-print-multi-directory Display the root directory for versions of libgcc.
-print-multi-lib Display the mapping between command line options and
multiple library search directories.
-print-multi-os-directory Display the relative path to OS libraries.
-print-sysroot Display the target libraries directory.
-print-sysroot-headers-suffix Display the sysroot suffix used to find headers.
-Wa,<options> Pass comma-separated <options> on to the assembler.
-Wp,<options> Pass comma-separated <options> on to the preprocessor.
-Wl,<options> Pass comma-separated <options> on to the linker.
-Xassembler <arg> Pass <arg> on to the assembler.
-Xpreprocessor <arg> Pass <arg> on to the preprocessor.
-Xlinker <arg> Pass <arg> on to the linker.
-save-temps Do not delete intermediate files.
-save-temps=<arg> Do not delete intermediate files.
-no-canonical-prefixes Do not canonicalize paths when building relative
prefixes to other gcc components.
-pipe Use pipes rather than intermediate files.
-time Time the execution of each subprocess.
-specs=<file> Override built-in specs with the contents of <file>.
-std=<standard> Assume that the input sources are for <standard>.
--sysroot=<directory> Use <directory> as the root directory for headers
and libraries.
-B <directory> Add <directory> to the compiler's search paths.
-v Display the programs invoked by the compiler.
-### Like -v but options quoted and commands not executed.
-E Preprocess only; do not compile, assemble or link.
-S Compile only; do not assemble or link.
-c Compile and assemble, but do not link.
-o <file> Place the output into <file>.
-pie Create a position independent executable.
-shared Create a shared library.
-x <language> Specify the language of the following input files.
Permissible languages include: c c++ assembler none
'none' means revert to the default behavior of
guessing the language based on the file's extension.

Options starting with -g, -f, -m, -O, -W, or --param are automatically
passed on to the various sub-processes invoked by gcc. In order to pass
other options on to these processes the -W<letter> options must be used.

For bug reporting instructions, please see:
<http://gcc.gnu.org/bugs.html>.

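Commonly used options:

Option   Meaning
-v       Print the compiler version and the commands gcc runs at each stage
-E       Preprocess only; do not compile, assemble, or link
-S       Compile only; do not assemble or link
-c       Compile and assemble, but do not link
-o file  Write the output to file
-static  Link statically: use static libraries and do not link against shared libraries
-shared  Produce a shared library (dynamic library file). Linking against shared libraries is already the default and needs no option: a static library is used only when no shared library of the same name is found
-Ldir    Add dir to the list of directories searched for libraries
-lname   Link against the library named libname.a (static) or libname.so (shared). If both exist, the shared library is chosen by default, and the static one when -static is given
-fPIC    Generate position-independent code (PIC), which uses relative addressing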
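
As a sketch of how -fPIC, -shared, -L, and -l fit together, the hypothetical source file below could be built into a library. The file name mymath.c, the function add, the library name libmymath, and the program app.c are all assumptions made for illustration. The commands in the comment use only the options listed above plus the standard ar tool.

```c
/* mymath.c: a hypothetical one-function library used only for illustration.
 *
 * Possible build commands on Linux (file and library names are assumptions):
 *   gcc -fPIC -c -o mymath.o mymath.c      compile to position-independent object code
 *   gcc -shared -o libmymath.so mymath.o   produce the shared library
 *   ar rcs libmymath.a mymath.o            or archive a static library instead
 *   gcc -o app app.c -L. -lmymath          link a program against libmymath
 */
int add(int a, int b)
{
    return a + b;
}
```

If both libmymath.so and libmymath.a are present in the searched directory, the last command picks the shared library unless -static is given.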

The following example (helloworld.c) walks through the gcc compilation process:

#include <stdio.h>

#define TRUE 1
#define FALSE 0

#define DEBUG_ENABLE

int main(void){
    int i = 0;
    if(i == TRUE){
        printf("hello\n");
    }else{
#ifdef DEBUG_ENABLE
        printf("i = %d\n",i);
#endif
        printf("hello world\n");
    }
    return 0;
}

Preprocessing

gcc -E -o helloworld.i helloworld.c

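Open helloworld.i (here in Sublime Text; only the end of the file is shown). The included files have been inserted into the source, the macros have been expanded, and the conditional compilation directives have already selected the code: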

__attribute__((__cdecl__)) __attribute__((__nothrow__)) int putw (int, FILE *);

# 2 "helloworld.c" 2

# 8 "helloworld.c"
int main(void){
    int i = 0;
    if(i == 1){
        printf("hello\n");
    }else{

        printf("i = %d\n",i);

        printf("hello world\n");
    }
    return 0;
}

Compilation

gcc -S -o helloworld.s helloworld.i

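The generated assembly code is as follows (opened in Sublime Text). Note that the two printf calls whose argument is a plain string ending in a newline appear as calls to _puts, with the newline dropped from the string constants: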

    .file   "helloworld.c"
.def ___main; .scl 2; .type 32; .endef
.section .rdata,"dr"
LC0:
.ascii "hello\0"
LC1:
.ascii "i = %d\12\0"
LC2:
.ascii "hello world\0"
.text
.globl _main
.def _main; .scl 2; .type 32; .endef
_main:
LFB10:
.cfi_startproc
pushl %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp
.cfi_def_cfa_register 5
andl $-16, %esp
subl $32, %esp
call ___main
movl $0, 28(%esp)
cmpl $1, 28(%esp)
jne L2
movl $LC0, (%esp)
call _puts
jmp L3
L2:
movl 28(%esp), %eax
movl %eax, 4(%esp)
movl $LC1, (%esp)
call _printf
movl $LC2, (%esp)
call _puts
L3:
movl $0, %eax
leave
.cfi_restore 5
.cfi_def_cfa 4, 4
ret
.cfi_endproc
LFE10:
.ident "GCC: (MinGW.org GCC-6.3.0-1) 6.3.0"
.def _puts; .scl 2; .type 32; .endef
.def _printf; .scl 2; .type 32; .endef

Assembly

gcc -c -o helloworld.o helloworld.s

The .o file is binary. Opened in a hex editor (WinHex here), its contents look like this:
[Screenshot: 捕获.JPG]

Linking

gcc -o helloworld helloworld.o

The result is helloworld.exe. Running it (here in the Notepad++ console) gives:

helloworld
helloworld
Process started (PID=15044) >>>
i = 0
hello world
<<< Process finished (PID=15044). (Exit code 0)

During compilation, linking is always the final step unless the "-E", "-S", or "-c" option is used, or a compiler error stops the process early.
For example:

gcc  helloworld.c
gcc -o helloworld helloworld.c

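Both commands run all the way through the linking step. The first writes the executable under gcc's default name (a.out on Linux, a.exe with MinGW), and the second names it helloworld.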