Golang Internals, Part 4: Object Files and Function Metadata
Today, we’ll take a closer look at the Func structure and discuss a few details on how garbage collection works in Go.
This post is a continuation of “Golang Internals, Part 3: The Linker and Go Object Files” and uses the same sample program. So, we strongly advise that you read the previous part before moving forward.
The structure of function metadata
The main idea behind relocations should be clear from Part 3. Now let’s take a look at the Func structure of the main function.
Func: &goobj.Func{
    Args:    0,
    Frame:   8,
    Leaf:    false,
    NoSplit: false,
    Var:     { },
    PCSP:    goobj.Data{Offset:255, Size:7},
    PCFile:  goobj.Data{Offset:263, Size:3},
    PCLine:  goobj.Data{Offset:267, Size:7},
    PCData:  {
        {Offset:276, Size:5},
    },
    FuncData: {
        {
            Sym:    goobj.SymID{Name:"gclocals·3280bececceccd33cb74587feedb1f9f", Version:0},
            Offset: 0,
        },
        {
            Sym:    goobj.SymID{Name:"gclocals·3280bececceccd33cb74587feedb1f9f", Version:0},
            Offset: 0,
        },
    },
    File: {"/home/adminone/temp/test.go"},
},
You can think of this structure as function metadata that the compiler emits into the object file and that the Go runtime later consumes. This article explains the exact format and meaning of the different fields in Func. Now, we will try to show you how this metadata is used in the runtime.
Inside the runtime package, this metadata is mapped onto the following struct.
type _func struct {
    entry   uintptr // start pc
    nameoff int32   // function name

    args  int32 // in/out args size
    frame int32 // legacy frame size; use pcsp if possible

    pcsp      int32
    pcfile    int32
    pcln      int32
    npcdata   int32
    nfuncdata int32
}
You can see that not all the information from the object file has been mapped directly. Some of the fields are only used by the linker. Still, the most interesting ones here are the pcsp, pcfile, and pcln fields, which are used when a program counter is translated into a stack pointer, file name, and line number, respectively.
This is required, for example, when a panic occurs. At that exact moment, the runtime only knows the program counter of the assembly instruction that triggered the panic. So, the runtime uses that counter to obtain the current file, line number, and full stack trace. The file and line number are resolved directly, using the pcfile and pcln fields. The stack trace is resolved recursively, using pcsp: the stack pointer information for the current program counter lets the runtime locate the caller's frame, and the same lookup is then repeated for the caller.
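The same translation can be observed from ordinary Go code through the public runtime API. The following is a minimal sketch, not the runtime's internal panic path: the helper name where is made up for illustration, and it relies only on runtime.Caller, runtime.FuncForPC, and the Name and FileLine methods of runtime.Func.

```go
package main

import (
	"fmt"
	"runtime"
)

// where is a hypothetical helper that reports the function name, file,
// and line of its caller, starting from nothing but a program counter.
//
//go:noinline
func where() (name, file string, line int) {
	// runtime.Caller(1) yields a program counter inside our caller.
	pc, _, _, ok := runtime.Caller(1)
	if !ok {
		return "", "", 0
	}
	// FuncForPC finds the function metadata that covers pc.
	fn := runtime.FuncForPC(pc)
	if fn == nil {
		return "", "", 0
	}
	// FileLine translates the program counter into a file and line.
	file, line = fn.FileLine(pc)
	return fn.Name(), file, line
}

func main() {
	name, file, line := where()
	fmt.Printf("%s at %s:%d\n", name, file, line)
}
```

The printed file and line depend on where the program is built, so no fixed output is shown here.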
Now that we have a program counter, the question is, how do we get a corresponding line number? To answer it, you need to look through assembly code and understand how line numbers are stored in the object file.
0x001a 00026 (test.go:4) MOVQ $1,(SP)
0x0022 00034 (test.go:4) PCDATA $0,$0
0x0022 00034 (test.go:4) CALL ,runtime.printint(SB)
0x0027 00039 (test.go:5) ADDQ $8,SP
0x002b 00043 (test.go:5) RET ,
We can see that program counters from 26 to 38 inclusive correspond to line number 4 and counters from 39 to next_function_program_counter - 1 correspond to line number 5. For space efficiency, it is enough to store the following map.
26 - 4
39 - 5
…
This is almost exactly what the compiler does. The pcln field points to the offset in this map that corresponds to the first program counter of the current function. Knowing this offset and also the offset of the first program counter of the next function, the runtime can search the entries in between to find the line number that corresponds to the given program counter.
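To make the lookup concrete, here is a conceptual sketch of the simplified map above. The pcLine type and the lineForPC function are assumptions made for illustration. The table that the real toolchain emits is stored in a more compact encoded form, so this shows the idea rather than the runtime's actual code.

```go
package main

import (
	"fmt"
	"sort"
)

// pcLine is a hypothetical table entry: starting at program counter pc,
// instructions belong to the given source line.
type pcLine struct {
	pc   uintptr
	line int
}

// lineForPC returns the line that covers pc, or 0 if pc lies before
// the first entry. The table must be sorted by pc.
func lineForPC(table []pcLine, pc uintptr) int {
	// Find the first entry that starts after pc; the entry just
	// before it is the one that covers pc.
	i := sort.Search(len(table), func(i int) bool { return table[i].pc > pc })
	if i == 0 {
		return 0
	}
	return table[i-1].line
}

func main() {
	// The two entries from the listing above.
	table := []pcLine{{26, 4}, {39, 5}}
	for _, pc := range []uintptr{26, 34, 38, 39, 43} {
		fmt.Printf("pc %d -> line %d\n", pc, lineForPC(table, pc))
	}
}
```

Storing only the first program counter of each range keeps the table small: every instruction between two entries inherits the line of the earlier entry.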
In Go, this idea is generalized. Not only a line number or stack pointer, but any integer value can be mapped to a program counter. This is done via the PCDATA instruction. Suppose the linker encounters the following instruction.
0x0022 00034 (test.go:4) PCDATA $0,$0
The linker does not generate any machine instructions for it. Instead, it stores the second argument of this instruction in a map keyed by the current program counter, while the first argument indicates which map is used. Thanks to this first argument, it is easy to add new maps whose meaning is known to the compiler and runtime but is opaque to the linker.
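The behavior described above can be sketched as a tiny, hypothetical assembler pass. The instr and pcValue types, the assemble function, and the instruction sizes (derived from the offsets in the earlier listing) are all assumptions for illustration and do not mirror the real linker's data structures.

```go
package main

import "fmt"

// instr is a hypothetical, heavily simplified instruction.
type instr struct {
	op    string // "PCDATA" or a real opcode such as "MOVQ"
	size  int    // encoded size in bytes; unused for PCDATA
	table int    // PCDATA only: which map to update
	value int    // PCDATA only: value that holds from this pc onward
}

// pcValue records that value applies from pc until the next entry.
type pcValue struct {
	pc    int
	value int
}

// assemble walks the program, advancing pc for real instructions and
// recording a map entry for every PCDATA pseudo-instruction.
func assemble(start int, prog []instr) (end int, tables map[int][]pcValue) {
	tables = make(map[int][]pcValue)
	pc := start
	for _, in := range prog {
		if in.op == "PCDATA" {
			// No machine code is emitted, so pc does not move.
			tables[in.table] = append(tables[in.table], pcValue{pc, in.value})
			continue
		}
		pc += in.size
	}
	return pc, tables
}

func main() {
	prog := []instr{
		{op: "MOVQ", size: 8},
		{op: "PCDATA", table: 0, value: 0},
		{op: "CALL", size: 5},
		{op: "ADDQ", size: 4},
		{op: "RET", size: 1},
	}
	end, tables := assemble(26, prog)
	fmt.Println("end pc:", end)
	fmt.Println("table 0:", tables[0])
}
```

Because the pass treats the table number as an opaque key, a new kind of map needs no changes here, which matches the point that the linker does not have to understand what the values mean.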
How a garbage collector uses function metadata
The last thing that still needs to be clarified in function metadata is the FuncData array. It contains information necessary for garbage collection. Go uses a mark-and-sweep garbage collector (GC) that operates in two stages. During the first stage (mark), it traverses all objects that are still in use and marks them as reachable. All the unmarked objects are freed during the second (sweep) stage.
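The two stages can be illustrated with a toy heap. This is only a sketch of the general mark-and-sweep idea under simplified assumptions: the object type and the mark, sweep, and collect functions are invented for illustration, and Go's real collector is considerably more sophisticated.

```go
package main

import "fmt"

// object is a toy heap object that may point to other objects.
type object struct {
	name   string
	refs   []*object
	marked bool
}

// mark flags obj and everything reachable from it.
func mark(obj *object) {
	if obj == nil || obj.marked {
		return // already visited, which also handles cycles
	}
	obj.marked = true
	for _, r := range obj.refs {
		mark(r)
	}
}

// sweep keeps the marked objects and clears their marks for the next
// cycle; unmarked objects are dropped from the heap.
func sweep(heap []*object) []*object {
	var live []*object
	for _, obj := range heap {
		if obj.marked {
			obj.marked = false
			live = append(live, obj)
		}
	}
	return live
}

// collect runs one mark stage from the roots, then one sweep stage.
func collect(roots, heap []*object) []*object {
	for _, r := range roots {
		mark(r)
	}
	return sweep(heap)
}

func main() {
	c := &object{name: "c"}
	b := &object{name: "b", refs: []*object{c}}
	a := &object{name: "a", refs: []*object{b}}
	d := &object{name: "d"}    // nothing points to d
	c.refs = append(c.refs, a) // a cycle: a -> b -> c -> a

	live := collect([]*object{a}, []*object{a, b, c, d})
	for _, obj := range live {
		fmt.Println("live:", obj.name)
	}
}
```

In this sketch, a, b, and c survive because they are reachable from the root a, while d is swept.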
So, the garbage collector starts by looking for reachable objects in several known locations, such as global variables, processor registers, stack frames, and pointers in objects that have already been reached. However, if you think about it carefully, looking for pointers in stack frames is far from a trivial task. When the runtime is performing garbage collection, how does it tell whether a variable on the stack is a pointer or belongs to a non-pointer type? This is where FuncData comes into play.
For each function, the compiler creates two variables. One contains a bitmap vector for the arguments area of the stack frame. The other contains a bitmap for the rest of the frame, which holds the local variables defined in the function, including those of pointer types. Each of these variables tells the garbage collector where exactly in the stack frame the pointers are located, and that information is enough for it to do its job.
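As a rough illustration of how such a bitmap can be read, the sketch below uses one bit per stack word, with the least significant bit first. This layout and the pointerSlots function are assumptions made for the example. The encoding that Go actually uses differs and has changed between releases.

```go
package main

import "fmt"

// pointerSlots returns the indexes of the frame words that the bitmap
// marks as pointers. Bit i (least significant bit first within each
// byte) describes word i of the frame. This layout is hypothetical.
func pointerSlots(bitmap []byte, nwords int) []int {
	var slots []int
	for i := 0; i < nwords && i/8 < len(bitmap); i++ {
		if bitmap[i/8]&(1<<(uint(i)%8)) != 0 {
			slots = append(slots, i)
		}
	}
	return slots
}

func main() {
	// A hypothetical frame of five words in which words 1 and 3
	// hold pointers: binary 01010 is 0x0A.
	locals := []byte{0x0A}
	fmt.Println(pointerSlots(locals, 5)) // prints [1 3]
}
```

With this information, a collector only needs to follow the words at the reported indexes and can skip every other word in the frame.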
It is also worth mentioning that, like PCDATA, FUNCDATA is generated by a Go pseudo-assembly instruction.
0x001a 00026 (test.go:3) FUNCDATA $0,gclocals·3280bececceccd33cb74587feedb1f9f+0(SB)
The first argument of this instruction indicates whether this is function data for the arguments area or the local variables area. The second one is a reference to a hidden variable that contains a GC mask.
In the upcoming posts, we will investigate the Go bootstrap process, which is the key to understanding how the Go runtime works.
Further reading
- Golang Internals, Part 1: Main Concepts and Project Structure
- Golang Internals, Part 2: Diving Into the Go Compiler
- Golang Internals, Part 3: The Linker, Object Files, and Relocations
- Golang Internals, Part 5: the Runtime Bootstrap Process
- Golang Internals, Part 6: Bootstrapping and Memory Allocator Initialization