Go: Unsafe
As with every language, Go has its odd parts and gotchas. If you’re a web developer using Go for back-end services,
there is a high probability that you’ve managed to avoid a majority of these odd parts of the language. One of these
odd parts, which I don’t have a ton of experience using, is the unsafe package. This package gives us the ability to
sidestep Go’s type system and do more C-like operations. I know, I know, somehow in these posts I always manage to find
a way to mention C. Either way, I think providing some examples in C might give us better insight into why
this package exists and how to use it. Enough of this intro, let’s get started.
C Pointers
Nowadays, most of our PCs have 64-bit CPUs in them; however, it is important to mention that this was not always the case. Why bother even mentioning this? Whoa, slow down there skipper, what does it mean when we have a 32-bit vs 64-bit CPU? Well, two major things are the addressable memory space and the default word size of the CPU. The term word just refers to the natural unit of data that a processor works with, typically related to the register size. Normally, 32-bit CPUs have 32-bit (4-byte) words and 64-bit CPUs have 64-bit (8-byte) words.
Well, okay Nick, I’m still not really sure why we should care about this?
Alright, let me slap ya with a bit of C enlightenment and Go knowledge. When you create a variable of type int (in this
example we’ll just say we are talking about C), what is the size of that variable? Well, instead of just
talking about it, let’s write a C program that will tell us. In C, we have the built-in sizeof operator, which tells us
the size in bytes of a particular type or variable.
#include <stdio.h>
int main()
{
int a;
printf("%zu\n", sizeof(a));
return 0;
}
./a.out
4
The CPUs that I’m using on my machines are all x64 Intel chips. We wrote an example C program that creates an integer
variable as an int, and it looks like the default size for an integer variable is 4 bytes on my hardware. Alright,
what about the size of an int * (integer pointer)?
#include <stdio.h>
int main()
{
int *a;
printf("%zu\n", sizeof(a));
return 0;
}
./a.out
8
Interesting, it appears that the size of an integer pointer is 8 bytes on my hardware. To compile these programs, I’ve
been using clang, and by default the compiler targets the architecture of the machine you are currently using unless
specified otherwise. We can actually compile a 32-bit program instead of a 64-bit program by adding the -m32 flag, in
order to see that the pointer size does indeed change based on the bit size of the program that we are compiling.
#include <stdio.h>
int main()
{
int *a;
printf("%zu\n", sizeof(a));
return 0;
}
clang -o main-32 -m32 main.c
./main-32
4
The Void *
In C, there is a lot of trust between the language and the programmer that is using the language. We have the ability
to take any type and cast it to a different type without any strong type checking like we might have with Go’s type
assertions. An interesting thing that you can do in C is cast a pointer to a void * (void pointer), but what
does it mean to have a variable that points to the type void? Well, it basically means that you have an untyped
pointer that you can’t dereference.
#include <stdio.h>
int main() {
void *a;
int b = 128;
a = &b;
*a = 256;
return 0;
}
clang main.c -o main-64
main.c:7:6: error: incomplete type 'void' is not assignable
*a = 256;
~~ ^
1 error generated.
Well, why is this useful, Nick, you smooth brain crayon eater?
It’s used to represent a pointer to an abstract type that can be explicitly or implicitly converted to another pointer
type. This can be useful when we want to give some control to the consumer of an API that we might have created. For
instance, take a look at the function prototype for malloc: its return type is void *. This is why the result of malloc
can be assigned to any object pointer type; C performs the conversion implicitly, while C++ requires an explicit cast.
Holy Func, that’s unsafe
In the simplest terms, the Go unsafe package gives us, as programmers, more control of our programs by sidestepping
the safety of the Go type system. Just like the sizeof operator in C, Go offers a similar operation as unsafe.Sizeof.
package main
import (
"fmt"
"unsafe"
)
func main() {
var x int
size := unsafe.Sizeof(x)
fmt.Printf("%d\n", size)
}
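To extend the program above to pointer types, here is a minimal sketch that prints the size of an int, a *int, and a uintptr next to the architecture the binary was built for. The helper name sizes is my own and not part of any library.

```go
package main

import (
	"fmt"
	"runtime"
	"unsafe"
)

// sizes reports the byte size of an int, a pointer to int, and a uintptr
// for whatever architecture this program was compiled for.
// (sizes is a hypothetical helper written for this example.)
func sizes() (intSize, ptrSize, uintptrSize uintptr) {
	var x int
	var p *int
	var u uintptr
	return unsafe.Sizeof(x), unsafe.Sizeof(p), unsafe.Sizeof(u)
}

func main() {
	i, p, u := sizes()
	fmt.Printf("GOARCH:  %s\n", runtime.GOARCH)
	fmt.Printf("int:     %d bytes\n", i)
	fmt.Printf("*int:    %d bytes\n", p)
	fmt.Printf("uintptr: %d bytes\n", u)
}
```

On an amd64 build I would expect 8 bytes for all three. Note the difference from the C programs earlier: the C int stayed at 4 bytes on my x64 machine, while Go’s int follows the word size of the target.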
Once again, the Go compiler allows us to choose the target architecture that we are compiling for. If you’re on an
x64 machine, you can target x86 (32-bit, GOARCH=386) or x64 (64-bit, GOARCH=amd64), which will change the default
size of our int and pointer types. Other topics that are important when we’re dealing with lower level
code are alignment, structure padding, and structure packing. Let’s start with an example in C, and then we will
recreate a similar example in Go.
#include <stdint.h>
#include <stdio.h>
typedef struct _a {
int32_t FieldOne;
int64_t FieldTwo;
} A;
int main() {
A a = {0};
printf("%zu\n", sizeof(a));
return 0;
}
Before compiling this code, let’s take a look at the structure we have defined. The structure A contains two fields,
an int32_t (32-bit signed integer) and an int64_t (64-bit signed integer), so that should be 4 bytes plus 8 bytes for
a total of 12 bytes for this structure, right? Well, it turns out that the compiler will choose the
optimal data alignment for the compilation target, because it is usually more efficient
to store and load values from memory when they are properly aligned.
If we compile the C code above and run it, we should see the value 16 as output; therefore, an additional 4 bytes
of padding were added to this structure, between FieldOne and FieldTwo, so that FieldTwo starts on an 8-byte boundary.
We could actually force the compiler to pack our structure instead of adding padding, but the way to do this is
compiler and platform dependent. For brevity, I will just show what it looks like using Linux and clang to pack our
structure into 12 bytes instead of 16 bytes.
#include <stdint.h>
#include <stdio.h>
typedef struct __attribute__((packed)) {
int32_t FieldOne;
int64_t FieldTwo;
} A;
int main() {
A a = {0};
printf("%zu\n", sizeof(a));
return 0;
}
A similar Go example will show us that the Go compiler does the same thing and adds padding to our structure. The below
program should output the value 16 as well. I don’t believe that Go directly supports structure packing, but there
are ways around it that require manually writing to byte buffers, which we can talk about another time.
package main
import (
"fmt"
"unsafe"
)
type A struct {
FieldOne int32
FieldTwo int64
}
func main() {
var a A
size := unsafe.Sizeof(a)
fmt.Printf("%d\n", size)
}
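As a rough sketch of the byte-buffer workaround mentioned above, the standard encoding/binary package writes struct fields one after another with no padding in between. The helper name pack and the choice of little-endian byte order are my own assumptions for this example, not something the Go runtime does for you.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"unsafe"
)

type A struct {
	FieldOne int32
	FieldTwo int64
}

// pack serializes a field by field with no padding between the fields.
// Little-endian byte order is an arbitrary choice for this sketch.
func pack(a A) ([]byte, error) {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, a); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	a := A{FieldOne: 1, FieldTwo: 2}
	packed, err := pack(a)
	if err != nil {
		fmt.Println("pack failed:", err)
		return
	}
	fmt.Printf("in memory: %d bytes\n", unsafe.Sizeof(a))
	fmt.Printf("packed:    %d bytes\n", len(packed))
	fmt.Printf("bytes:     % x\n", packed)
}
```

The packed form is 12 bytes: 4 for FieldOne followed immediately by 8 for FieldTwo. Keep in mind this is a serialized copy, so unlike the packed C struct you cannot address its fields directly.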
In Go, we can also get the alignment of a structure or of fields within a structure using unsafe.Alignof. This
reports the required alignment of its argument’s type. In our example below, we’ve defined a type A that contains
three fields.
package main
import (
"fmt"
"unsafe"
)
// total size: 64-bit (32 bytes) | 32-bit (16 bytes)
type A struct {
X bool // 64-bit: 1 byte & Alignof=1, 32-bit: 1 byte & Alignof=1
Y int16 // 64-bit: 2 bytes & Alignof=2, 32-bit: 2 bytes & Alignof=2
Z []int // 64-bit: 24 bytes & Alignof=8, 32-bit: 12 bytes & Alignof=4
}
func main() {
var a A
fmt.Printf("size: %d\n", unsafe.Sizeof(a))
fmt.Printf("alignment a: %d\n", unsafe.Alignof(a))
fmt.Printf("alignment a.X: %d\n", unsafe.Alignof(a.X))
fmt.Printf("alignment a.Y: %d\n", unsafe.Alignof(a.Y))
fmt.Printf("alignment a.Z: %d\n", unsafe.Alignof(a.Z))
}
Looking at our program, we can see that unsafe.Alignof just gives us the byte boundary of a particular type or field
within that type. In addition to getting the field alignments, the unsafe.Offsetof function can be called with a field
selector x.f to compute the offset of field f relative to the start of its enclosing struct x, accounting for any
padding.
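To make the offsets concrete, here is a small sketch that prints the offset of each field of the same three-field struct. The helper name offsets is hypothetical.

```go
package main

import (
	"fmt"
	"unsafe"
)

type A struct {
	X bool
	Y int16
	Z []int
}

// offsets returns the byte offset of each field from the start of A.
// (offsets is a hypothetical helper written for this example.)
func offsets() (x, y, z uintptr) {
	var a A
	return unsafe.Offsetof(a.X), unsafe.Offsetof(a.Y), unsafe.Offsetof(a.Z)
}

func main() {
	x, y, z := offsets()
	fmt.Printf("offset a.X: %d\n", x)
	fmt.Printf("offset a.Y: %d\n", y)
	fmt.Printf("offset a.Z: %d\n", z)
}
```

On a 64-bit build I would expect 0, 2, and 8: one byte of padding after X so that Y lands on a 2-byte boundary, and four bytes of padding after Y so that Z lands on an 8-byte boundary.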
This segues nicely to uintptr and unsafe.Pointer. The Sizeof, Alignof, and Offsetof functions we have used so far all
return a uintptr, which is another built-in type: an unsigned integer wide enough to hold the platform-dependent
pointer size we discussed earlier. However, it just gives us an integer value that could potentially represent an
address in memory. We can use both together to perform pointer arithmetic to, for example, change the value of a field.
package main
import (
"fmt"
"unsafe"
)
type A struct {
X bool
Y int16
Z []int
}
func main() {
var a A
a.X = true
a.Y = 128
fmt.Printf("size: %d\n", unsafe.Sizeof(a))
fmt.Printf("alignment a: %d\n", unsafe.Alignof(a))
fmt.Printf("alignment a.X: %d\n", unsafe.Alignof(a.X))
fmt.Printf("alignment a.Y: %d\n", unsafe.Alignof(a.Y))
fmt.Printf("alignment a.Z: %d\n", unsafe.Alignof(a.Z))
fmt.Printf("a.Y: %d\n", a.Y)
var ptr *int16
ptr = (*int16)(unsafe.Pointer(
uintptr(unsafe.Pointer(&a)) +
unsafe.Offsetof(a.Y)))
*ptr = 256
fmt.Printf("ptr: %d\n", *ptr)
fmt.Printf("a.Y: %d\n", a.Y)
}
Because Go is a garbage collected language, we have to be careful of particular pitfalls that we might encounter when working with this kind of code. It might seem safe to do the following, storing the computed address in a temporary uintptr variable:
package main
import (
"fmt"
"unsafe"
)
type A struct {
X bool
Y int16
Z []int
}
func main() {
var a A
a.X = true
a.Y = 128
fmt.Printf("size: %d\n", unsafe.Sizeof(a))
fmt.Printf("alignment a: %d\n", unsafe.Alignof(a))
fmt.Printf("alignment a.X: %d\n", unsafe.Alignof(a.X))
fmt.Printf("alignment a.Y: %d\n", unsafe.Alignof(a.Y))
fmt.Printf("alignment a.Z: %d\n", unsafe.Alignof(a.Z))
fmt.Printf("a.Y: %d\n", a.Y)
var tmp uintptr
tmp = uintptr(
unsafe.Pointer(uintptr(unsafe.Pointer(&a)) +
unsafe.Offsetof(a.Y)))
var ptr *int16
ptr = (*int16)(unsafe.Pointer(tmp))
*ptr = 256
fmt.Printf("ptr: %d\n", *ptr)
fmt.Printf("a.Y: %d\n", a.Y)
}
The reason this code is incorrect is that some garbage collectors move variables around in memory to reduce
fragmentation or bookkeeping. When a variable is moved, all pointers that hold the address of the old
location must be updated to point to the new one. From the perspective of Go’s garbage collector, an unsafe.Pointer is
a pointer; therefore, its value must change as the variable moves, but a uintptr is just a number. In other words,
the above code hides a pointer from the garbage collector in the non-pointer variable tmp, and by the time the next
statement executes to convert tmp to an unsafe.Pointer, the variable a could have been moved and the
number in tmp would no longer be the address &a.Y. The next statement, which writes to that address, would then clobber
an arbitrary memory location with the value 256. Go’s current collector does not move heap objects, but goroutine
stacks can be relocated when they grow, so the hazard is real.
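One way to avoid the temporary uintptr entirely is unsafe.Add, which has been part of the unsafe package since Go 1.17 and keeps the value an unsafe.Pointer for the whole computation. The following is a minimal sketch under that assumption; the helper name setY is hypothetical.

```go
package main

import (
	"fmt"
	"unsafe"
)

type A struct {
	X bool
	Y int16
	Z []int
}

// setY writes v into a.Y through pointer arithmetic without ever storing
// the intermediate address in a uintptr variable.
// (setY is a hypothetical helper written for this example.)
func setY(a *A, v int16) {
	p := unsafe.Add(unsafe.Pointer(a), unsafe.Offsetof(a.Y))
	*(*int16)(p) = v
}

func main() {
	a := A{X: true, Y: 128}
	setY(&a, 256)
	fmt.Printf("a.Y: %d\n", a.Y)
}
```

Because p is an unsafe.Pointer from start to finish, the runtime always sees it as a pointer and there is no window in which the address lives only as a number.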
Because an unsafe.Pointer value gives us the equivalent of a void * in C, we can convert between types much as
you would in C. In the example below, when we do a type conversion from a float64 to an int, the fractional part is
truncated and only the whole number is kept; however, when we convert a pointer to the float64 into an unsafe.Pointer
and then cast it to a *uint64, we keep the float64 binary representation and dump it out as a
uint64. Technically, Go doesn’t have type casting, but I’m just going to go with this terminology for now when
we’re converting from our unsafe.Pointer to another type.
package main
import (
"fmt"
"unsafe"
)
func main() {
var a float64
a = 3.14
var b uint64
fmt.Printf("a: %f\n", a)
fmt.Printf("int(a) conversion: %d\n", int(a))
b = Float64bits(a)
fmt.Printf("pointer cast: %d\n", b)
}
func Float64bits(f float64) uint64 {
return *((*uint64)(unsafe.Pointer(&f)))
}
go run main.go
a: 3.140000
int(a) conversion: 3
pointer cast: 4614253070214989087
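The standard library offers math.Float64bits and math.Float64frombits for this kind of reinterpretation, so here is a small sketch that checks the pointer cast against them. The helper names bitsViaUnsafe, sameBits, and roundTrip are my own.

```go
package main

import (
	"fmt"
	"math"
	"unsafe"
)

// bitsViaUnsafe reinterprets the bytes of f as a uint64.
func bitsViaUnsafe(f float64) uint64 {
	return *(*uint64)(unsafe.Pointer(&f))
}

// sameBits reports whether the pointer cast agrees with math.Float64bits.
func sameBits(f float64) bool {
	return bitsViaUnsafe(f) == math.Float64bits(f)
}

// roundTrip converts f to its bit pattern and back to a float64.
func roundTrip(f float64) float64 {
	return math.Float64frombits(bitsViaUnsafe(f))
}

func main() {
	a := 3.14
	fmt.Printf("unsafe bits: %#x\n", bitsViaUnsafe(a))
	fmt.Printf("math bits:   %#x\n", math.Float64bits(a))
	fmt.Printf("same:        %t\n", sameBits(a))
	fmt.Printf("round trip:  %v\n", roundTrip(a))
}
```

Outside of experiments like this one, the math functions are probably the better choice, since they give the same result without importing unsafe.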
Hello Operator: syscalls
The unsafe package in Go is useful when we need to interface with lower level APIs or C code. In this example we will
make Unix system calls in a C program and then in its equivalent Go program. These lower level calls
in Go typically require the usage of the unsafe package.
#include <fcntl.h> // For open() and O_RDONLY
#include <stdint.h> // For uint32_t
#include <stdio.h> // For printf() and perror()
#include <sys/ioctl.h> // For ioctl()
#include <sys/socket.h> // For struct sockaddr and sa_family_t
#include <unistd.h> // For close()
#include <linux/vm_sockets.h> // For IOCTL_VM_SOCKETS_GET_LOCAL_CID
int main() {
// Open the vsock device
int fd = open("/dev/vsock", O_RDONLY);
if (fd < 0) {
perror("Failed to open /dev/vsock");
return 1;
}
// Variable to hold the local CID
uint32_t cid;
// Get the local CID using ioctl
if (ioctl(fd, IOCTL_VM_SOCKETS_GET_LOCAL_CID, &cid) < 0) {
perror("Failed to get local CID");
close(fd);
return 1;
}
// Print the CID
printf("CID: %u\n", cid);
// Close the file descriptor
close(fd);
return 0;
}
package main
import (
"fmt"
"os"
"syscall"
"unsafe"
)
const IOCTL_VM_SOCKETS_GET_LOCAL_CID = 0x7B9 // _IO(7, 0xb9) in <linux/vm_sockets.h>
func main() {
// Open the vsock device
fd, err := os.OpenFile("/dev/vsock", os.O_RDONLY, 0)
if err != nil {
fmt.Fprintf(os.Stderr, "Failed to open /dev/vsock: %v\n", err)
return
}
defer fd.Close()
// Variable to hold the local CID
var cid uint32
// Perform ioctl syscall
_, _, errno := syscall.Syscall(
syscall.SYS_IOCTL,
fd.Fd(),
IOCTL_VM_SOCKETS_GET_LOCAL_CID,
uintptr(unsafe.Pointer(&cid)),
)
// Check for errors
if errno != 0 {
fmt.Fprintf(os.Stderr, "Failed to get local CID: %v\n", errno)
return
}
// Print the CID
fmt.Printf("CID: %d\n", cid)
}
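The constant 0x7B9 in the Go program has to be spelled out by hand because Go cannot expand the C macro from the header. As far as I can tell, the header defines the request as _IO(7, 0xb9), and the sketch below reproduces that arithmetic. The bit layout is an assumption that holds for x86 and ARM Linux (a few architectures encode the direction bits differently), and the helper names ioc and ioNone are hypothetical.

```go
package main

import "fmt"

// Field layout used by the Linux _IOC macro on x86 and ARM. This is an
// assumption of the sketch; a few architectures use a different layout.
const (
	nrShift   = 0  // request number
	typeShift = 8  // "magic" type byte
	sizeShift = 16 // size of the argument
	dirShift  = 30 // transfer direction
	dirNone   = 0  // _IOC_NONE: no data transfer encoded
)

// ioc mirrors the C macro _IOC(dir, type, nr, size).
func ioc(dir, typ, nr, size uintptr) uintptr {
	return dir<<dirShift | size<<sizeShift | typ<<typeShift | nr<<nrShift
}

// ioNone mirrors the C macro _IO(type, nr).
func ioNone(typ, nr uintptr) uintptr {
	return ioc(dirNone, typ, nr, 0)
}

func main() {
	fmt.Printf("%#x\n", ioNone(7, 0xb9)) // prints 0x7b9
}
```

If you would rather not maintain constants like this yourself, the golang.org/x/sys/unix package ships generated values for many ioctl requests.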
SIGTERM
Well, at the very least I hope that this demystified a little bit of one of the odd parts of the Go ecosystem and gave you some insight into C. I will leave you with one fun program that does some pointer arithmetic in Go. Stay curious, friends.
package main
import (
"fmt"
"unsafe"
)
func main() {
b := []uint32{1, 2, 3, 4}
for i := 0; i < len(b); i++ {
fmt.Printf("%d ",
*(*uint32)(unsafe.Pointer(uintptr(unsafe.Pointer(&b[0])) +
uintptr(i)*unsafe.Sizeof(b[0]))))
}
}