github.com/goccy/go-json @v0.10.6 sqlite

2,074 symbols 5,252 edges 125 files 210 documented · 10%

README

go-json

Fast JSON encoder/decoder compatible with encoding/json for Go

Roadmap

* version ( expected release date )

* v0.9.0
 |
 | while maintaining compatibility with encoding/json, we will add convenient APIs
 |
 v
* v1.0.0

We are accepting requests for features that will be implemented between v0.9.0 and v1.0.0. If there is an API you need, please submit your issue here.

Features

Drop-in replacement of encoding/json
Fast ( See Benchmark section )
Flexible customization with options
Coloring the encoded string
Can propagate context.Context to MarshalJSON or UnmarshalJSON
Can dynamically filter the fields of the structure type-safely

Installation

go get github.com/goccy/go-json

How to use

Replace import statement from encoding/json to github.com/goccy/go-json

-import "encoding/json"
+import "github.com/goccy/go-json"

JSON library comparison

name	encoder	decoder	compatible with `encoding/json`
encoding/json	yes	yes	N/A
json-iterator/go	yes	yes	partial
easyjson	yes	yes	no
gojay	yes	yes	no
segmentio/encoding/json	yes	yes	partial
jettison	yes	no	no
simdjson-go	no	yes	no
goccy/go-json	yes	yes	yes

json-iterator/go isn't compatible with encoding/json in many ways (e.g. https://github.com/json-iterator/go/issues/229 ), and it hasn't been supported for a long time.
segmentio/encoding/json supports the encoder APIs well, but some decoder APIs, such as Token ( streaming decode ), are not supported.

Other libraries

jingo

I tried the benchmark but it didn't work. Also, it seems to panic when it receives an unexpected value because there is no error handling...

ffjson

Benchmarking gave very slow results. It seems to assume that the user will use the buffer pool properly. Also, development seems to have already stopped.

Benchmarks

$ cd benchmarks
$ go test -bench .

Encode

Decode

Fuzzing

go-json-fuzz is the repository for fuzzing tests. If you run the tests in that repository and find a bug, please commit the corpus to go-json-fuzz and report the issue to go-json.

How it works

go-json is very fast at both encoding and decoding compared to other libraries. Speed is easier to achieve with automatic code generation or with a dedicated interface, but go-json deliberately sticks to compatibility with encoding/json and its simple interface. Despite this, we are developing it with the aim of being the fastest library.

Here, we explain the various speed-up techniques implemented by go-json.

Basic technique

The techniques listed here are the ones used by most of the libraries listed above.

Buffer reuse

Since the only value required for the result of json.Marshal(interface{}) ([]byte, error) is []byte, the only value that must be allocated during encoding is the return value []byte .

Also, as the number of allocations increases, the performance will be affected, so the number of allocations should be kept as low as possible when creating []byte.

Therefore, a common technique is to reuse the buffer from the previous encoding via sync.Pool, which reduces the number of times a new buffer must be allocated.

Finally, if you allocate a buffer exactly as long as the result and copy the contents into it, in theory you only need to allocate once per call.

type buffer struct {
    data []byte
}

var bufPool = sync.Pool{
    New: func() interface{} {
        return &buffer{data: make([]byte, 0, 1024)}
    },
}

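buf := bufPool.Get().(*buffer)
data := encode(buf.data) // reuse buf.data

// Copy the result into a right-sized slice for the caller.
newBuf := make([]byte, len(data))
copy(newBuf, data)

// Keep the (possibly grown) buffer for the next encoding.
buf.data = data
bufPool.Put(buf)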
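
The fragment above can be turned into a small runnable program. This is only a sketch of the pooling pattern: `encodeInts` and `marshalInts` are hypothetical names standing in for a real encoder and are not part of go-json.

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
)

// buffer wraps the scratch slice so that the pool stores a pointer.
type buffer struct {
	data []byte
}

var bufPool = sync.Pool{
	New: func() interface{} {
		return &buffer{data: make([]byte, 0, 1024)}
	},
}

// encodeInts is a hypothetical stand-in for a real encoder: it appends
// a JSON array of integers to dst and returns the extended slice.
func encodeInts(dst []byte, vs []int) []byte {
	dst = append(dst, '[')
	for i, v := range vs {
		if i > 0 {
			dst = append(dst, ',')
		}
		dst = strconv.AppendInt(dst, int64(v), 10)
	}
	return append(dst, ']')
}

// marshalInts encodes into a pooled scratch buffer, then copies the
// result into a slice of exactly the right size for the caller.
func marshalInts(vs []int) []byte {
	buf := bufPool.Get().(*buffer)
	data := encodeInts(buf.data[:0], vs) // start from length 0, keep capacity

	out := make([]byte, len(data))
	copy(out, data)

	buf.data = data // keep any growth for the next caller
	bufPool.Put(buf)
	return out
}

func main() {
	fmt.Println(string(marshalInts([]int{1, 2, 3}))) // prints [1,2,3]
}
```

The copy matters: returning `data` directly would hand the caller a slice that the next `Get` from the pool could overwrite.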

Elimination of reflection

As you know, the reflection operation is very slow.

Therefore, using the fact that the address position where the type information is stored is fixed for each binary ( we call this typeptr ), we can use the address in the type information to call a pre-built optimized process.

For example, you can get the address to the type information from interface{} as follows and you can use that information to call a process that does not have reflection.

To process without reflection, pass a pointer (unsafe.Pointer) to where the value is stored.


type emptyInterface struct {
    typ unsafe.Pointer
    ptr unsafe.Pointer
}

var typeToEncoder = map[uintptr]func(unsafe.Pointer)([]byte, error){}

func Marshal(v interface{}) ([]byte, error) {
    iface := (*emptyInterface)(unsafe.Pointer(&v))
    typeptr := uintptr(iface.typ)
    if enc, exists := typeToEncoder[typeptr]; exists {
        return enc(iface.ptr)
    }
    ...
}

※ In reality, typeToEncoder can be referenced by multiple goroutines, so exclusive control is required.
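
One simple way to add that exclusive control is a sync.RWMutex around the map. The sketch below is an assumption for illustration, not go-json's actual code: `register`, `marshal`, `typeptrOf`, `user` and `encodeUser` are hypothetical names, and the `emptyInterface` layout is an implementation detail of the standard Go toolchain rather than a language guarantee.

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
	"unsafe"
)

// emptyInterface mirrors the runtime layout of interface{}:
// a type descriptor pointer followed by a data pointer.
type emptyInterface struct {
	typ unsafe.Pointer
	ptr unsafe.Pointer
}

type encoderFunc func(unsafe.Pointer) ([]byte, error)

var (
	mu            sync.RWMutex
	typeToEncoder = map[uintptr]encoderFunc{}
)

// typeptrOf returns the address of the type descriptor of v.
func typeptrOf(v interface{}) uintptr {
	return uintptr((*emptyInterface)(unsafe.Pointer(&v)).typ)
}

// register stores an encoder for the dynamic type of sample.
func register(sample interface{}, enc encoderFunc) {
	mu.Lock()
	typeToEncoder[typeptrOf(sample)] = enc
	mu.Unlock()
}

// marshal looks up the encoder by typeptr under a read lock.
func marshal(v interface{}) ([]byte, error) {
	iface := (*emptyInterface)(unsafe.Pointer(&v))
	mu.RLock()
	enc, ok := typeToEncoder[uintptr(iface.typ)]
	mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("no encoder registered for %T", v)
	}
	return enc(iface.ptr)
}

type user struct{ ID int }

// encodeUser receives the *user stored in the interface's data word.
func encodeUser(p unsafe.Pointer) ([]byte, error) {
	u := (*user)(p)
	b := strconv.AppendInt([]byte(`{"id":`), int64(u.ID), 10)
	return append(b, '}'), nil
}

func main() {
	register((*user)(nil), encodeUser)
	out, err := marshal(&user{ID: 7})
	fmt.Println(string(out), err)
}
```

A read lock keeps concurrent lookups cheap, but every lookup still pays for the lock and the map access.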

Unique speed-up technique

Encoder

Do not escape arguments of `Marshal`

json.Marshal and json.Unmarshal receive an interface{} value and determine its type dynamically in order to process it. Normally you need the reflect library to determine the type dynamically, but because reflect.Type is defined as an interface, calling a method of reflect.Type causes the argument passed to reflect to escape.

Therefore, the arguments for Marshal and Unmarshal are always escaped to the heap. However, go-json can use the feature of reflect.Type while avoiding escaping.

reflect.Type is defined as interface, but in reality reflect.Type is implemented only by the structure rtype defined in the reflect package. For this reason, to date reflect.Type is the same as *reflect.rtype.

Therefore, by directly handling *reflect.rtype, which is an implementation of reflect.Type, it is possible to avoid escaping because it changes from interface to using struct.

The technique for working with *reflect.rtype directly from go-json is implemented at rtype.go

Also, the same technique is cut out as a library ( https://github.com/goccy/go-reflect )

Initially this feature was the default behavior of go-json. But after careful testing, I found that when a large value is passed to json.Marshal() and the argument cannot be allocated on the stack, it is not properly escaped to the heap (a bug in the Go compiler).

Therefore, this feature will be provided as an option until this issue is resolved.

To use it, call the variant with the NoEscape suffix, such as MarshalNoEscape()

Encoding using opcode sequence

I explained that you can use typeptr to call a pre-built process from type information.

In other libraries, this dedicated process is implemented as a function call, for example through an anonymous function, but function calls are inherently slow and should be avoided as much as possible.

Therefore, go-json adopted an instruction-based execution model, which is also used to implement virtual machines for programming languages.

The first time a type is encoded, go-json creates the opcode ( instruction ) sequence required to encode it. From the second time onward, it uses typeptr to get the cached opcode sequence and encodes based on it. An example of the opcode sequence is shown below.

json.Marshal(struct{
    X int `json:"x"`
    Y string `json:"y"`
}{X: 1, Y: "hello"})

When encoding a structure like the one above, create a sequence of opcodes like this:

- opStructFieldHead ( `{` )
- opStructFieldInt ( `"x": 1,` )
- opStructFieldString ( `"y": "hello"` )
- opStructEnd ( `}` )
- opEnd

※ Each operation writes the characters shown on its right.

In addition, each opcode is managed by the following structure ( Pseudo code ).

type opType int
const (
    opStructFieldHead opType = iota
    opStructFieldInt
    opStructFieldString
    opStructEnd
    opEnd
)
type opcode struct {
    op opType
    key []byte
    offset uintptr // field offset from the start of the struct
    next *opcode
}

The process of encoding using the opcode sequence is roughly implemented as follows.

func encode(code *opcode, b []byte, p unsafe.Pointer) ([]byte, error) {
    for {
        switch code.op {
        case opStructFieldHead:
            b = append(b, '{')
            code = code.next
        case opStructFieldInt:
            b = append(b, code.key...)
            b = appendInt(b, *(*int)(unsafe.Pointer(uintptr(p)+code.offset)))
            code = code.next
        case opStructFieldString:
            b = append(b, code.key...)
            b = appendString(b, *(*string)(unsafe.Pointer(uintptr(p)+code.offset)))
            code = code.next
        case opStructEnd:
            b = append(b, '}')
            code = code.next
        case opEnd:
            goto END
        }
    }
END:
    return b, nil
}

In this way, one large switch-case walks the linked list of opcodes to encode the value, which avoids unnecessary function calls.
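
The pseudo code above can be made runnable with a small compile step. The following is a minimal sketch under stated assumptions, not go-json's compiler: the opcode names, `compile`, `run`, `sample` and `encodeSample` are hypothetical, only int and string fields are handled, `json` tag options such as `omitempty` are ignored, and strconv.AppendQuote stands in for real JSON string escaping.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
	"unsafe"
)

type opType int

const (
	opStructHead opType = iota
	opFieldInt
	opFieldString
	opStructEnd
	opEnd
)

type opcode struct {
	op     opType
	key    []byte  // pre-rendered `"name":`, with a leading comma after the first field
	offset uintptr // field offset from the start of the struct
	next   *opcode
}

// compile walks a struct type once with reflect and emits a linked list of opcodes.
func compile(t reflect.Type) (*opcode, error) {
	head := &opcode{op: opStructHead}
	cur := head
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		name := f.Tag.Get("json")
		if name == "" {
			name = f.Name
		}
		key := strconv.Quote(name) + ":"
		if i > 0 {
			key = "," + key
		}
		code := &opcode{key: []byte(key), offset: f.Offset}
		switch f.Type.Kind() {
		case reflect.Int:
			code.op = opFieldInt
		case reflect.String:
			code.op = opFieldString
		default:
			return nil, fmt.Errorf("unsupported field type %s", f.Type)
		}
		cur.next = code
		cur = code
	}
	cur.next = &opcode{op: opStructEnd, next: &opcode{op: opEnd}}
	return head, nil
}

// run executes the opcode list against the value at p without using reflect.
func run(code *opcode, b []byte, p unsafe.Pointer) []byte {
	for {
		switch code.op {
		case opStructHead:
			b = append(b, '{')
		case opFieldInt:
			b = append(b, code.key...)
			b = strconv.AppendInt(b, int64(*(*int)(unsafe.Pointer(uintptr(p) + code.offset))), 10)
		case opFieldString:
			b = append(b, code.key...)
			b = strconv.AppendQuote(b, *(*string)(unsafe.Pointer(uintptr(p) + code.offset)))
		case opStructEnd:
			b = append(b, '}')
		case opEnd:
			return b
		}
		code = code.next
	}
}

type sample struct {
	X int    `json:"x"`
	Y string `json:"y"`
}

// encodeSample compiles on every call for brevity; a real encoder
// would cache the opcode list per type.
func encodeSample(v *sample) (string, error) {
	code, err := compile(reflect.TypeOf(*v))
	if err != nil {
		return "", err
	}
	return string(run(code, nil, unsafe.Pointer(v))), nil
}

func main() {
	out, err := encodeSample(&sample{X: 1, Y: "hello"})
	fmt.Println(out, err) // prints {"x":1,"y":"hello"} <nil>
}
```

Reflection is used only in `compile`; the hot loop in `run` touches nothing but the opcode list and raw memory.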

Opcode sequence optimization

One of the advantages of encoding using the opcode sequence is the ease of optimization. The opcode sequence mentioned above is actually converted into the following optimized operations and used.

- opStructFieldHeadInt ( `{"x": 1,` )
- opStructEndString ( `"y": "hello"}` )
- opEnd

It has been reduced from 5 opcodes to 3 opcodes! Reducing the number of opcodes means reducing the number of switch-case branches taken. In other words, the closer the number of operations is to 1, the faster the processing can be performed.

go-json speeds up encoding both by reducing the number of opcodes, as above, and by preparing opcodes with optimized paths.
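
A fusion pass of this kind can be sketched as a single walk over the linked list. The two rewrite rules below cover only the example from this section and are an assumption for illustration; `optimize`, `build` and `ops` are hypothetical helpers, and go-json's real optimizer is not reproduced here.

```go
package main

import "fmt"

type opType int

const (
	opStructFieldHead opType = iota
	opStructFieldInt
	opStructFieldString
	opStructEnd
	opEnd
	// Fused opcodes produced by the optimizer.
	opStructFieldHeadInt
	opStructEndString
)

type opcode struct {
	op   opType
	key  string
	next *opcode
}

// optimize fuses adjacent opcode pairs in place:
// Head+Int becomes HeadInt, and String+End becomes EndString.
func optimize(head *opcode) *opcode {
	for code := head; code != nil && code.next != nil; code = code.next {
		next := code.next
		switch {
		case code.op == opStructFieldHead && next.op == opStructFieldInt:
			code.op, code.key, code.next = opStructFieldHeadInt, next.key, next.next
		case code.op == opStructFieldString && next.op == opStructEnd:
			code.op, code.next = opStructEndString, next.next
		}
	}
	return head
}

// build returns the unoptimized five-opcode sequence from the example.
func build() *opcode {
	end := &opcode{op: opEnd}
	structEnd := &opcode{op: opStructEnd, next: end}
	y := &opcode{op: opStructFieldString, key: "y", next: structEnd}
	x := &opcode{op: opStructFieldInt, key: "x", next: y}
	return &opcode{op: opStructFieldHead, next: x}
}

// ops flattens the list so it can be inspected.
func ops(head *opcode) []opType {
	var out []opType
	for c := head; c != nil; c = c.next {
		out = append(out, c.op)
	}
	return out
}

func main() {
	fmt.Println(len(ops(build())), len(ops(optimize(build())))) // prints 5 3
}
```

Each fused opcode needs its own case in the encoder's switch, so the saving in branches is paid for with a larger opcode set.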

Change recursive call from CALL to JMP

Recursive processing is required during encoding if the type is defined recursively as follows:

type T struct {
    X int
    U *U
}

type U struct {
    T *T
}

b, err := json.Marshal(&T{
    X: 1,
    U: &U{
        T: &T{
            X: 2,
        },
    },
})
fmt.Println(string(b)) // {"X":1,"U":{"T":{"X":2,"U":null}}}

In go-json, recursive processing is processed by the operation type of opStructFieldRecursive.

In this operation, after acquiring the opcode sequence used for recursive processing, the function is not called recursively. Instead, the operation saves the values it needs itself and moves to the next operation.

The technique of implementing recursive processing with the JMP operation while avoiding the CALL operation is a famous technique for implementing a high-speed virtual machine.

For more details, please refer to the article ( available in Japanese only ).
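
The idea can be sketched with an explicit stack of saved frames. This is an illustrative assumption, not the implementation of opStructFieldRecursive: the opcode names, `frame`, `compileNode`, `run` and `encodeNode` are hypothetical, and the sketch does not detect cyclic values.

```go
package main

import (
	"fmt"
	"strconv"
	"unsafe"
)

type opType int

const (
	opHead opType = iota
	opInt
	opRecursive
	opEnd
)

type opcode struct {
	op     opType
	key    []byte
	offset uintptr
	next   *opcode
	jmp    *opcode // opRecursive only: head of the sequence to jump to
}

// frame is what a CALL would have kept on the goroutine stack.
type frame struct {
	ret *opcode        // opcode to resume at after the nested value
	p   unsafe.Pointer // base pointer of the enclosing value
}

type node struct {
	X    int
	Next *node
}

// compileNode builds the opcode sequence for node by hand.
func compileNode() *opcode {
	end := &opcode{op: opEnd}
	rec := &opcode{op: opRecursive, key: []byte(`,"Next":`), offset: unsafe.Offsetof(node{}.Next), next: end}
	x := &opcode{op: opInt, key: []byte(`"X":`), offset: unsafe.Offsetof(node{}.X), next: rec}
	head := &opcode{op: opHead, next: x}
	rec.jmp = head // the nested value reuses the same sequence
	return head
}

func run(head *opcode, p unsafe.Pointer) []byte {
	var b []byte
	var stack []frame
	code := head
	for {
		switch code.op {
		case opHead:
			b = append(b, '{')
			code = code.next
		case opInt:
			b = append(b, code.key...)
			b = strconv.AppendInt(b, int64(*(*int)(unsafe.Pointer(uintptr(p) + code.offset))), 10)
			code = code.next
		case opRecursive:
			b = append(b, code.key...)
			child := *(*unsafe.Pointer)(unsafe.Pointer(uintptr(p) + code.offset))
			if child == nil {
				b = append(b, "null"...)
				code = code.next
				break
			}
			// Save the return point and jump instead of calling run again.
			stack = append(stack, frame{ret: code.next, p: p})
			p, code = child, code.jmp
		case opEnd:
			b = append(b, '}')
			if len(stack) == 0 {
				return b
			}
			top := stack[len(stack)-1]
			stack = stack[:len(stack)-1]
			p, code = top.p, top.ret
		}
	}
}

func encodeNode(n *node) string {
	if n == nil {
		return "null"
	}
	return string(run(compileNode(), unsafe.Pointer(n)))
}

func main() {
	fmt.Println(encodeNode(&node{X: 1, Next: &node{X: 2}}))
	// prints {"X":1,"Next":{"X":2,"Next":null}}
}
```

Nesting depth grows the `stack` slice rather than the goroutine stack, and the loop never leaves `run`.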

Dispatch by typeptr from map to slice

When retrieving the data cached from the type information by typeptr, we usually use map. Map requires exclusive control, so use sync.Map for a naive implementation.

However, this is slow, so it's a good idea to use the atomic package for exclusive control as implemented by segmentio/encoding/json ( https://github.com/segmentio/encoding/blob/master/json/codec.go#L41-L55 ).

This implementation speeds up the get at the cost of a slower set, but it works well because of the nature of the library: it encodes the same type many times, so gets far outnumber sets.
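
A copy-on-write map behind an atomic value is one way to get this trade-off. The sketch below is an assumption for illustration and is not copied from segmentio/encoding/json: `cache`, `codec`, `get` and `set` are hypothetical names, and writers are serialized with a mutex for simplicity.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// codec is a hypothetical placeholder for whatever is cached per type.
type codec struct {
	name string
}

type cache struct {
	mu sync.Mutex   // serializes writers only
	m  atomic.Value // holds a map[uintptr]*codec that is never mutated after Store
}

// get is lock-free: it loads the current map snapshot and reads from it.
func (c *cache) get(typeptr uintptr) (*codec, bool) {
	m, _ := c.m.Load().(map[uintptr]*codec)
	v, ok := m[typeptr] // reading from a nil map is allowed
	return v, ok
}

// set copies the whole map, adds the entry, and publishes the copy.
// It is O(n) per call, which is acceptable when sets are rare.
func (c *cache) set(typeptr uintptr, v *codec) {
	c.mu.Lock()
	defer c.mu.Unlock()
	old, _ := c.m.Load().(map[uintptr]*codec)
	next := make(map[uintptr]*codec, len(old)+1)
	for k, e := range old {
		next[k] = e
	}
	next[typeptr] = v
	c.m.Store(next)
}

func main() {
	c := &cache{}
	c.set(0x1000, &codec{name: "int"})
	v, ok := c.get(0x1000)
	fmt.Println(v.name, ok) // prints int true
}
```

Readers never block, but each get still goes through a map lookup.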

However, as a result of profiling, I noticed that runtime.mapaccess2 accounts for a significant percentage of the execution time. So I wondered whether I could change the lookup from map to slice.

There is an API named typelinks defined in the runtime package that the reflect package uses internally. This allows you to get all the type information defined in the binary at runtime.

The fact that all type information can be acquired means that by constructing slices in advance with the acquired total number of type information, it is possible to look up with the value of typeptr without worrying about out-of-range access.

However, if there is too much type information, it will use a lot of memory, so by default we only use this optimization if the slice size fits within 2 MiB.

If this approach is not available, it will fall back to the atomic based process described above.
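
The slice lookup with a fallback can be sketched as follows. Everything here is an assumption for illustration: `dispatcher`, `codeSet` and `newDispatcher` are hypothetical, the slice is indexed by the distance of typeptr from the lowest type address, and the address bounds are passed in by hand because obtaining them from typelinks is not shown. The sketch is not safe for concurrent use.

```go
package main

import (
	"fmt"
	"unsafe"
)

// maxSliceBytes is the 2 MiB budget described above.
const maxSliceBytes = 2 << 20

// codeSet is a hypothetical placeholder for the cached per-type data.
type codeSet struct {
	name string
}

type dispatcher struct {
	base     uintptr              // lowest type address
	slots    []*codeSet           // nil when the address range exceeds the budget
	fallback map[uintptr]*codeSet // used only when slots is nil
}

// newDispatcher assumes minAddr <= maxAddr and that every typeptr
// passed to get or set lies within [minAddr, maxAddr].
func newDispatcher(minAddr, maxAddr uintptr) *dispatcher {
	d := &dispatcher{base: minAddr}
	n := maxAddr - minAddr + 1
	if n*unsafe.Sizeof((*codeSet)(nil)) <= maxSliceBytes {
		d.slots = make([]*codeSet, n)
	} else {
		d.fallback = map[uintptr]*codeSet{}
	}
	return d
}

func (d *dispatcher) usesSlice() bool { return d.slots != nil }

func (d *dispatcher) get(typeptr uintptr) *codeSet {
	if d.slots != nil {
		return d.slots[typeptr-d.base] // plain index, no hashing
	}
	return d.fallback[typeptr]
}

func (d *dispatcher) set(typeptr uintptr, c *codeSet) {
	if d.slots != nil {
		d.slots[typeptr-d.base] = c
		return
	}
	d.fallback[typeptr] = c
}

func main() {
	small := newDispatcher(1000, 1999)
	small.set(1500, &codeSet{name: "T"})
	fmt.Println(small.usesSlice(), small.get(1500).name) // prints true T

	large := newDispatcher(0, 1<<24)
	fmt.Println(large.usesSlice()) // prints false
}
```

In this sketch the map stands in for the atomic based fallback; the point is only that the fast path becomes a subtraction and an index.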

If you want to know more, please refer to the implementation [here](https://github.com/goccy/go-json/blob/master/internal/runt

Extension points exported contracts — how you extend this code

Marshaler (Interface)

Marshaler is the interface implemented by types that can marshal themselves into valid JSON. [20 implementers]

json.go

Code (Interface)

(no doc) [14 implementers]

internal/encoder/code.go

Decoder (Interface)

(no doc) [19 implementers]

internal/decoder/type.go

Issue290 (Interface)

(no doc)

encode_test.go

EncodeOptionFunc (FuncType)

(no doc)

option.go

UserAddressResolver (FuncType)

(no doc)

test/example/example_query_test.go

MarshalerContext (Interface)

MarshalerContext is the interface implemented by types that can marshal themselves into valid JSON with context.Context. [20 …

json.go

AnonymousCode (Interface)

(no doc) [3 implementers]

internal/encoder/code.go

Core symbols most depended-on inside this repo

load

called by 336

internal/encoder/vm_color_indent/util.go

load

called by 336

internal/encoder/vm_indent/util.go

load

called by 335

internal/encoder/vm_color/util.go

load

called by 335

internal/encoder/vm/util.go

appendStructKey

called by 223

internal/encoder/vm_color_indent/util.go

appendStructKey

called by 223

internal/encoder/vm_color/util.go

appendStructKey

called by 223

internal/encoder/vm/util.go

appendStructKey

called by 223

internal/encoder/vm_indent/util.go

Shape

Function 976

Method 551

Struct 455

TypeAlias 74

Interface 14

FuncType 4

Languages

Go 100%

Modules by API surface

decode_test.go 220 symbols

encode_test.go 175 symbols

internal/encoder/code.go 89 symbols

benchmarks/encode_test.go 89 symbols

internal/encoder/compiler.go 69 symbols

internal/runtime/rtype.go 64 symbols

internal/decoder/path.go 56 symbols

internal/encoder/encoder.go 54 symbols

internal/encoder/vm_color/util.go 47 symbols

internal/encoder/vm_color_indent/util.go 46 symbols

benchmarks/decode_test.go 43 symbols

internal/encoder/vm/util.go 40 symbols

Used by 112 indexed graphs manifest dependencies, hub-wide

github.com/1Panel-dev/1Panel

github.com/AlistGo/alist

github.com/AnalogJ/scrutiny

github.com/Billionmail/BillionMail

github.com/ConnectAI-E/feishu-openai

github.com/Jrohy/trojan

github.com/MHSanaei/3x-ui

github.com/OpenListTeam/OpenList

github.com/OpenNHP/opennhp

github.com/QuantumNous/new-api

… +102 more

Dependencies from manifests, versioned

github.com/francoispqt/gojay v1.2.13 · 1×

github.com/goccy/go-json v0.0.0-0001010100000 · 1×

github.com/json-iterator/go v1.1.10 · 1×

github.com/mailru/easyjson v0.0.0-2019031214324 · 1×

github.com/modern-go/concurrent v0.0.0-2018030601264 · 1×

github.com/modern-go/reflect2 v1.0.1 · 1×

github.com/pquerna/ffjson v0.0.0-2019093013402 · 1×

github.com/segmentio/encoding v0.2.4 · 1×

github.com/stretchr/testify v1.7.0 · 1×

github.com/valyala/fastjson v1.6.3 · 1×

github.com/wI2L/jettison v0.7.1 · 1×

For agents

$ claude mcp add go-json \
  -- python -m otcore.mcp_server <graph>
