Starting from Golang's Garbage Collection (Part 2)

猪小花1号 2018-08-28 15:49

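4 Golang garbage collection parameters

4.1 Triggering GC

GC is triggered in one of two ways: when 2 minutes have passed since the last collection, or when heap usage reaches a threshold. With GOGC=100, that threshold is twice the heap size that remained after the previous GC.

 # Trigger GC once the heap has grown to twice the size it had after the previous GC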
export GOGC=100
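
The same setting can also be changed from inside a program. The sketch below uses debug.SetGCPercent from the standard runtime/debug package; the helper heapGoalMB is a hypothetical name introduced here, and its formula is a simplification of the threshold described above (the real pacer takes more factors into account).

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// heapGoalMB is an illustrative helper (not part of the runtime).
// It approximates the heap size that triggers the next GC:
// the live heap after the previous GC plus gogc percent of it.
func heapGoalMB(liveMB, gogc int) int {
	return liveMB + liveMB*gogc/100
}

func main() {
	// SetGCPercent has the same effect as the GOGC environment variable
	// and returns the previous setting.
	old := debug.SetGCPercent(50)
	defer debug.SetGCPercent(old)
	fmt.Println("previous GOGC:", old)

	fmt.Println(heapGoalMB(4, 100)) // 4 MB live heap, GOGC=100: prints 8
	fmt.Println(heapGoalMB(4, 50))  // 4 MB live heap, GOGC=50: prints 6
}
```

A lower GOGC makes collections more frequent and keeps the heap smaller; a higher value trades memory for fewer collections.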

4.2 Viewing GC information

export GODEBUG=gctrace=1

With this setting the runtime prints gctrace information for every collection.

For example:

gc 1 @0.008s 6%: 0.071+2.0+0.080 ms clock, 0.21+0.22/1.9/1.9+0.24 ms cpu, 4->4->3 MB, 5 MB goal, 4 P
# command-line-arguments
gc 1 @0.001s 16%: 0.071+3.3+0.060 ms clock, 0.21+0.17/2.9/0.36+0.18 ms cpu, 4->4->4 MB, 5 MB goal, 4 P
gc 2 @0.016s 8%: 0.020+6.0+0.070 ms clock, 0.082+0.094/3.9/2.2+0.28 ms cpu, 8->9->8 MB, 9 MB goal, 4 P
gc 3 @0.046s 7%: 0.019+7.3+0.062 ms clock, 0.076+0.089/7.1/7.0+0.24 ms cpu, 14->16->14 MB, 17 MB goal, 4 P
gc 4 @0.092s 8%: 0.015+24+0.10 ms clock, 0.060+0.10/24/0.75+0.42 ms cpu, 25->27->24 MB, 28 MB goal, 4 P

For the meaning of each field, see the golang doc.

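5 How to improve GC performance

Golang's GC algorithm is fixed: users cannot choose which algorithm to use, nor can they tune things such as the ratio between the young and old generations as in Java. At the time of writing, Golang exposes only one GC-related configuration parameter, GOGC, which sets the condition that triggers a collection.

For now, the only thing we can do to improve GC efficiency is to produce less garbage. Strictly speaking, then, "improving GC performance" is not quite the right title for this chapter. The rest of it discusses how to reduce garbage in Golang and what to watch out for.

5.1 Memory allocation in Golang

See the official Frequently Asked Questions (FAQ):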

How do I know whether a variable is allocated on the heap or the stack?

From a correctness standpoint, you don't need to know. Each variable in Go exists as long as there are references to it. The storage location chosen by the implementation is irrelevant to the semantics of the language.

The storage location does have an effect on writing efficient programs. When possible, the Go compilers will allocate variables that are local to a function in that function's stack frame. However, if the compiler cannot prove that the variable is not referenced after the function returns, then the compiler must allocate the variable on the garbage-collected heap to avoid dangling pointer errors. Also, if a local variable is very large, it might make more sense to store it on the heap rather than the stack.

In the current compilers, if a variable has its address taken, that variable is a candidate for allocation on the heap. However, a basic escape analysis recognizes some cases when such variables will not live past the return from the function and can reside on the stack.

An example gives a more intuitive picture:

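package main

// foo returns the address of its local variable x, so x must outlive
// the call and is moved to the heap.
func foo() *int {
    var x int
    return &x
}
// bar's new(int) never leaves the function, so it can stay on the stack.
func bar() int {
    x := new(int)
    *x = 1
    return *x
}
// big shows how the size of a slice affects where its backing array lives.
func big() {
    x := make([]int, 0, 20)    // small constant size: stays on the stack
    y := make([]int, 0, 20000) // too large for the stack: escapes to the heap

    n := 10                // a variable, not a constant expression
    z := make([]int, 0, n) // size not known at compile time: escapes to the heap
    _, _, _ = x, y, z      // use the slices so that the file compiles
}

func main() {
}

# go build -gcflags='-m -l' test.go

./test.go:7:12: &x escapes to heap
./test.go:6:9: moved to heap: x
./test.go:11:13: bar new(int) does not escape
./test.go:18:14: make([]int, 0, 20000) escapes to heap
./test.go:21:14: make([]int, 0, n) escapes to heap
./test.go:17:14: big make([]int, 0, 20) does not escape

The wording of these diagnostics, the column numbers, and some of the escape decisions vary across Go versions.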
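
The effect of escape analysis can also be observed at run time. The following sketch counts heap allocations per call with testing.AllocsPerRun from the standard library. The names sink, escapes, staysLocal and allocsPerCall are illustrative, and the //go:noinline directives are there so that inlining does not change the result. With the standard gc toolchain I would expect one allocation per call for the escaping variable and none for the local one.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps the returned pointer reachable so the allocation is not optimized away.
var sink *int

//go:noinline
func escapes() *int {
	var x int
	return &x // x outlives the call, so it is allocated on the heap
}

//go:noinline
func staysLocal() int {
	x := new(int) // never leaves the function, so it can live on the stack
	*x = 1
	return *x
}

// allocsPerCall reports the average number of heap allocations per call of f.
func allocsPerCall(f func()) float64 {
	return testing.AllocsPerRun(1000, f)
}

func main() {
	fmt.Println("escapes:   ", allocsPerCall(func() { sink = escapes() }))
	fmt.Println("staysLocal:", allocsPerCall(func() { _ = staysLocal() }))
}
```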

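5.2 The sync.Pool object pool

sync.Pool exists mainly to reuse objects: it cuts the time spent allocating memory and also eases the pressure on the GC. It is, however, only a temporary object pool, because the objects it holds can be reclaimed by the GC. Stateful objects such as database connections therefore cannot be managed with sync.Pool.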

use sync.Pool if you frequently allocate many objects of the same type and you want to save some allocation and garbage collection overhead. However, in the current implementation, any unreferenced sync.Pool objects are removed at each garbage collection cycle, so you can't use this as a long-lived free-list of objects. If you want a free-list that maintains objects between GC runs, you'll still have to build that yourself. This is only to reuse allocated objects between garbage collection cycles.

sync.Pool has two main methods:

func (p *Pool) Get() interface{} {
    ...
}
func (p *Pool) Put(x interface{}) {
    ...
}

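Get requests an object from the temporary pool, and Put returns an object that is no longer in use so that it can be reused later. If the pool has no object available when Get is called, Get returns nil, unless the New func of the sync.Pool has been set: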

type Pool struct {

    ...

    // New optionally specifies a function to generate
    // a value when Get would otherwise return nil.
    // It may not be changed concurrently with calls to Get.
    New func() interface{}
}

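Also, we must not assume anything about the value of an object obtained from the pool: it may have been freshly created by New, or it may have been modified by some goroutine and then put back.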
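
A minimal sketch of this Get/Put cycle, assuming a pool of bytes.Buffer values (bufPool and greet are illustrative names, not taken from any library). The Reset call is what protects against leftover state from a previous user:

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers; New is called when the pool is empty.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// greet builds a string in a pooled buffer and returns the buffer afterwards.
func greet(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a reused buffer may still hold a previous caller's data
	defer bufPool.Put(buf)

	buf.WriteString("hello, ")
	buf.WriteString(name)
	return buf.String() // String copies the bytes, so the result is safe after Put
}

func main() {
	fmt.Println(greet("gopher")) // hello, gopher
	fmt.Println(greet("world"))  // hello, world
}
```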

A fairly good example of using sync.Pool:

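package pool // illustrative package name; the original snippet has no package clause

import "sync"

// DEFAULT_SYNC_POOL is the shared pool. Call NewPool before using Alloc or Free.
var DEFAULT_SYNC_POOL *SyncPool

// NewPool creates the shared pool with size classes 5, 10, 20, ... that stay within 30000 elements.
func NewPool() *SyncPool {
    DEFAULT_SYNC_POOL = NewSyncPool(
        5,     // minSize
        30000, // maxSize
        2,     // factor
    )
    return DEFAULT_SYNC_POOL
}

// Alloc returns an empty slice from the shared pool.
func Alloc(size int) []int64 {
    return DEFAULT_SYNC_POOL.Alloc(size)
}

// Free returns mem to the shared pool for later reuse.
func Free(mem []int64) {
    DEFAULT_SYNC_POOL.Free(mem)
}

// SyncPool is a sync.Pool-based slab allocation memory pool.
type SyncPool struct {
    classes     []sync.Pool // one pool per size class
    classesSize []int       // chunk size of each class, in ascending order
    minSize     int
    maxSize     int
}

// NewSyncPool builds the size classes minSize, minSize*factor, ... up to maxSize.
// factor must be greater than 1, otherwise the loops below never terminate.
func NewSyncPool(minSize, maxSize, factor int) *SyncPool {
    n := 0
    for chunkSize := minSize; chunkSize <= maxSize; chunkSize *= factor {
        n++
    }
    pool := &SyncPool{
        classes:     make([]sync.Pool, n),
        classesSize: make([]int, n),
        minSize:     minSize,
        maxSize:     maxSize,
    }
    n = 0
    for chunkSize := minSize; chunkSize <= maxSize; chunkSize *= factor {
        pool.classesSize[n] = chunkSize
        // Pass chunkSize as an argument so that each New func captures its own size.
        pool.classes[n].New = func(size int) func() interface{} {
            return func() interface{} {
                buf := make([]int64, size)
                return &buf
            }
        }(chunkSize)
        n++
    }
    return pool
}

// Alloc returns an empty slice from the smallest class that can hold size elements.
// Requests that no class can satisfy fall back to a plain make.
func (pool *SyncPool) Alloc(size int) []int64 {
    if size <= pool.maxSize {
        for i := 0; i < len(pool.classesSize); i++ {
            if pool.classesSize[i] >= size {
                mem := pool.classes[i].Get().(*[]int64)
                // Reset the length to 0: the buffer may hold a previous user's data.
                return (*mem)[:0]
            }
        }
    }
    return make([]int64, 0, size)
}

// Free puts mem into the largest class whose chunk size fits within cap(mem),
// so that Alloc never hands out a buffer smaller than its class promises.
// Buffers larger than maxSize or smaller than minSize are left to the GC.
func (pool *SyncPool) Free(mem []int64) {
    size := cap(mem)
    if size > pool.maxSize {
        return
    }
    for i := len(pool.classesSize) - 1; i >= 0; i-- {
        if pool.classesSize[i] <= size {
            pool.classes[i].Put(&mem)
            return
        }
    }
}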


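There is also an open-source, general-purpose object pool implementation for Golang, Go Commons Pool, for anyone interested in studying it; it is not covered further here.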

5.3 append

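Let's first look at the basic usage of append.

nums := make([]int, 0, 10)

This creates a slice with len=0 and cap=10; the underlying array has room for 10 elements. Until data has been appended, nums[index] cannot be used directly.

nums := make([]int, 5, 10)

This creates a slice with len=5 and cap=10; the underlying array again has room for 10 elements. Even without any append, nums[index] can be used directly for index in [0,4]. An append starts storing at index=5.

nums := make([]int, 5)

If no capacity is specified, cap equals len.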
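
The three forms of make can be compared directly with a small sketch; describe is an illustrative helper name:

```go
package main

import "fmt"

// describe reports the length and capacity of s.
func describe(s []int) string {
	return fmt.Sprintf("len=%d cap=%d", len(s), cap(s))
}

func main() {
	a := make([]int, 0, 10)
	b := make([]int, 5, 10)
	c := make([]int, 5)
	fmt.Println(describe(a)) // len=0 cap=10
	fmt.Println(describe(b)) // len=5 cap=10
	fmt.Println(describe(c)) // len=5 cap=5

	// append stores the new element at index len(b), which is 5 here.
	b = append(b, 42)
	fmt.Println(b[5], describe(b)) // 42 len=6 cap=10
}
```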

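nums = append(nums, 10)

An append may move the slice's underlying array to a new address, so nums must be reassigned from the return value. Whether the address changes depends on whether there is still spare capacity for the new data. If there is none, a larger block has to be allocated (roughly cap*2 for small slices; the exact growth factor depends on the Go version and the slice size) and the existing data copied over.

So when using append, it is best to set a reasonable cap up front, that is, to pre-allocate a suitably sized block for the use case at hand. This avoids repeated, needless reallocation and reduces the pressure on the GC.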
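
To make the cost visible, the sketch below counts how often append has to move to a larger backing array while n elements are added. countReallocs is an illustrative name, and the count for the slice grown on demand depends on the Go version, so no exact figure is claimed for it.

```go
package main

import "fmt"

// countReallocs appends n elements to a slice that starts with capacity
// initialCap and counts how many times the capacity changes, that is,
// how many times a new backing array is allocated.
func countReallocs(n, initialCap int) int {
	s := make([]int, 0, initialCap)
	reallocs := 0
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, i)
		if cap(s) != before {
			reallocs++
		}
	}
	return reallocs
}

func main() {
	fmt.Println("preallocated:   ", countReallocs(1000, 1000)) // 0: capacity was sufficient
	fmt.Println("grown on demand:", countReallocs(1000, 0))    // several reallocations
}
```

Every reallocation leaves the old backing array behind as garbage, which is why pre-sizing matters for slices that grow large.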

Memory spikes and excessive GC pressure caused by append deserve particular attention.

References

1 https://en.wikipedia.org/wiki/Garbage_collection_(computer_science)
2 http://legendtkl.com/2017/04/28/golang-gc/
3 https://github.com/golang/proposal/blob/master/design/17503-eliminate-rescan.md
4 https://lengzzz.com/note/gc-in-golang
5 https://making.pusher.com/golangs-real-time-gc-in-theory-and-practice/
6 https://blog.twitch.tv/gos-march-to-low-latency-gc-a6fa96f06eb7
7 https://golang.org/doc/faq
8 《垃圾回收的算法与实现》, written and edited by 中村成杨 and 相川光


Related reading:

Starting from Golang's Garbage Collection (Part 1)


This article comes from the NetEase Practitioner Community and is published with the authorization of its author, 李岚清.