Author: jinzhongjia

  • Zig Msgpack

    Zig Msgpack

    MessagePack is an efficient binary serialization format. It lets you exchange data among multiple languages like JSON. But it’s faster and smaller. Small integers are encoded into a single byte, and typical short strings require only one extra byte in addition to the strings themselves.

    It is fast, and it is used in projects such as Neovim and Redis. Neovim uses it as its remote RPC protocol.

    MessagePack spec : Github

    Zig Msgpack : Github

    Theory

    The protocol works as follows: every value starts with a one-byte marker that indicates the type of the data that follows. If the type has a variable length, the marker is followed by a few bytes that give the length, and then by the data itself.

    Here is a very simple schematic for reading a simple type:

    The types supported:

    Nil, Bool, Int, Float, Str, Bin, Array, Map, Ext (Predefined timestamp types)
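    The marker-then-length-then-data layout described above can be sketched for short strings. This is a minimal illustration rather than zig-msgpack's own code: encodeShortStr and decodeShortStr are hypothetical names, and only the fixstr (0xa0 to 0xbf) and str 8 (0xd9) markers from the MessagePack spec are handled.

```zig
const std = @import("std");

/// Encodes `s` as a MessagePack str into `out` and returns the written bytes.
/// Only the two shortest forms are covered: fixstr (marker 0xa0..0xbf, length
/// in the low 5 bits) and str 8 (marker 0xd9 followed by one length byte).
/// Hypothetical helper, not part of zig-msgpack.
fn encodeShortStr(out: []u8, s: []const u8) error{ NoSpace, TooLong }![]const u8 {
    const len = std.math.cast(u8, s.len) orelse return error.TooLong;
    const header_len: usize = if (len <= 31) 1 else 2;
    if (out.len < header_len + s.len) return error.NoSpace;
    if (len <= 31) {
        out[0] = 0xa0 | len; // marker and length share a single byte
    } else {
        out[0] = 0xd9; // marker for str 8
        out[1] = len; // explicit length byte
    }
    var i: usize = 0;
    while (i < s.len) : (i += 1) out[header_len + i] = s[i];
    return out[0 .. header_len + s.len];
}

/// Reads the header of a short str and returns the string bytes that follow.
/// Hypothetical helper, not part of zig-msgpack.
fn decodeShortStr(data: []const u8) error{ Truncated, NotAShortStr }![]const u8 {
    if (data.len == 0) return error.Truncated;
    var start: usize = 1;
    var len: usize = 0;
    switch (data[0]) {
        0xa0...0xbf => len = data[0] & 0x1f,
        0xd9 => {
            if (data.len < 2) return error.Truncated;
            start = 2;
            len = data[1];
        },
        else => return error.NotAShortStr,
    }
    if (data.len < start + len) return error.Truncated;
    return data[start .. start + len];
}

test "a short string costs one extra byte" {
    var buf: [64]u8 = undefined;
    const encoded = try encodeShortStr(&buf, "zig");
    try std.testing.expectEqualSlices(u8, &[_]u8{ 0xa3, 'z', 'i', 'g' }, encoded);
    try std.testing.expectEqualStrings("zig", try decodeShortStr(encoded));
}
```

    Run it with zig test. The three-byte string takes four bytes on the wire, matching the "one extra byte" claim from the introduction.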

    Usage

    Follow the README.md on GitHub to add this package to your project.

    zig-msgpack provides a generic function called Pack that builds the type used for reading and writing.

    For example:

    const bufferType = std.io.FixedBufferStream([]u8);
    
    const pack = msgpack.Pack(
        *bufferType,
        *bufferType,
        bufferType.WriteError,
        bufferType.ReadError,
        bufferType.write,
        bufferType.read,
    );

    msgpack.Pack returns a type, which we call pack here.

    We use FixedBufferStream as both the write context and the read context. Pack accepts six parameters:

    fn Pack(
        comptime WriteContext: type,
        comptime ReadContext: type,
        comptime WriteError: type,
        comptime ReadError: type,
        comptime writeFn: fn (context: WriteContext, bytes: []const u8) WriteError!usize,
        comptime readFn: fn (context: ReadContext, arr: []u8) ReadError!usize,
    ) type

    This looks very similar to std.io.GenericWriter and std.io.GenericReader; the design of zig-msgpack follows them here.
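    To make the pattern concrete, here is a minimal sketch of a function that takes a context type, an error set, and a write function at comptime and returns a new type. NilWriter and SliceSink are hypothetical names invented for this illustration; they are not part of zig-msgpack or the standard library, and the sketch covers only the write side.

```zig
const std = @import("std");

/// Hypothetical miniature of the pattern: builds a writer type from a
/// context type, an error set, and a write function, all known at comptime.
fn NilWriter(
    comptime WriteContext: type,
    comptime WriteError: type,
    comptime writeFn: fn (context: WriteContext, bytes: []const u8) WriteError!usize,
) type {
    return struct {
        context: WriteContext,

        const Self = @This();

        pub fn init(context: WriteContext) Self {
            return Self{ .context = context };
        }

        /// Writes the one-byte MessagePack nil marker (0xc0) through writeFn
        /// and returns the number of bytes written.
        pub fn writeNil(self: Self) WriteError!usize {
            return writeFn(self.context, &[_]u8{0xc0});
        }
    };
}

/// Hypothetical write context: appends bytes to a caller-provided slice.
const SliceSink = struct {
    buf: []u8,
    pos: usize = 0,

    const Error = error{NoSpaceLeft};

    fn write(self: *SliceSink, bytes: []const u8) Error!usize {
        if (self.buf.len - self.pos < bytes.len) return error.NoSpaceLeft;
        var i: usize = 0;
        while (i < bytes.len) : (i += 1) self.buf[self.pos + i] = bytes[i];
        self.pos += bytes.len;
        return bytes.len;
    }
};

test "a type assembled from a context and a write function" {
    var storage = [_]u8{0} ** 4;
    var sink = SliceSink{ .buf = &storage };
    const Writer = NilWriter(*SliceSink, SliceSink.Error, SliceSink.write);
    const w = Writer.init(&sink);
    try std.testing.expectEqual(@as(usize, 1), try w.writeNil());
    try std.testing.expectEqual(@as(u8, 0xc0), storage[0]);
}
```

    Because the write function is a comptime parameter, the returned struct stores only the context, and the call to writeFn is resolved at compile time.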

    The pack type contains the methods for reading and writing MessagePack values. Call pack.init(write_context, read_context) to get an instance.

    Because Zig has no dedicated string type (a string is just a slice of bytes), zig-msgpack defines a separate type, Str:

    pub const Str = struct {
        str: []const u8,
        pub fn value(self: Str) []const u8 {
            return self.str;
        }
    };
    
    /// this is for encode str in struct
    pub fn wrapStr(str: []const u8) Str {
        return Str{ .str = str };
    }

    For the MessagePack Bin type:

    pub const Bin = struct {
        bin: []u8,
        pub fn value(self: Bin) []u8 {
            return self.bin;
        }
    };
    
    /// this is wrapping for bin
    pub fn wrapBin(bin: []u8) Bin {
        return Bin{ .bin = bin };
    }

    For the MessagePack Ext type:

    pub const EXT = struct {
        type: i8,
        data: []u8,
    };
    
    /// t is type, data is data
    pub fn wrapEXT(t: i8, data: []u8) EXT {
        return EXT{
            .type = t,
            .data = data,
        };
    }

    zig-msgpack provides several ways to write and read values. Parameters are strictly type-checked so that no errors other than read failures can occur at runtime.

    For example, we write two bool values and read them back:

    var arr: [0xffff_f]u8 = std.mem.zeroes([0xffff_f]u8);
    var write_buffer = std.io.fixedBufferStream(&arr);
    var read_buffer = std.io.fixedBufferStream(&arr);
    var p = pack.init(
        &write_buffer,
        &read_buffer,
    );
    
    const test_val_1 = false;
    const test_val_2 = true;
    
    try p.write(.{ .bool = test_val_1 });
    try p.write(.{ .bool = test_val_2 });
    
    var val_1 = try p.read(allocator);
    defer val_1.free(allocator);
    
    var val_2 = try p.read(allocator);
    defer val_2.free(allocator);
    
    try std.testing.expect(val_1.bool == test_val_1);
    try std.testing.expect(val_2.bool == test_val_2);

    In short, every read result is converted into the Payload type, from which we obtain the value. This lets us read data whose structure is not known in advance.
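    The idea of inspecting a value whose kind is only known at runtime can be sketched with a tagged union. MiniPayload below is a hypothetical, cut-down stand-in written for this illustration; the real Payload type in zig-msgpack has more variants and its own helper methods.

```zig
const std = @import("std");

/// Hypothetical, cut-down stand-in for a payload type: a tagged union whose
/// active tag records which kind of value was decoded.
const MiniPayload = union(enum) {
    nil: void,
    bool: bool,
    uint: u64,
    str: []const u8,
};

/// Returns the name of the active variant.
fn kindName(p: MiniPayload) []const u8 {
    return switch (p) {
        .nil => "nil",
        .bool => "bool",
        .uint => "uint",
        .str => "str",
    };
}

/// Returns the string length, or null when the payload is not a string.
fn strLen(p: MiniPayload) ?usize {
    return switch (p) {
        .str => |s| s.len,
        else => null,
    };
}

test "inspect a value of unknown kind" {
    const a = MiniPayload{ .bool = true };
    const b = MiniPayload{ .str = "hi" };
    // Direct field access works when the active tag is already known.
    try std.testing.expect(a.bool);
    // A switch handles the case where it is not.
    try std.testing.expectEqualStrings("bool", kindName(a));
    try std.testing.expectEqual(@as(?usize, 2), strLen(b));
    try std.testing.expectEqual(@as(?usize, null), strLen(a));
}
```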

    For more examples, check out the unit tests of zig-msgpack.

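  • Idempotency

    Idempotency

    What is idempotency?

    In programming, an idempotent operation is one whose effect is the same no matter how many times it is executed: running it any number of times has the same effect as running it once.

    Idempotence is a concept from mathematics and computer science. In mathematics, a unary operation is idempotent if applying it twice to any element gives the same result as applying it once.

    An idempotent function or method is one that can be called repeatedly with the same arguments and always produces the same result. Repeating such a call does not change the system state any further, so there is no need to worry that repeated execution will alter the system.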
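    A small sketch can make the distinction concrete. The Order type and its two methods are hypothetical names chosen for this illustration: one operation is idempotent, the other is not.

```zig
const std = @import("std");

/// Hypothetical order record used only to contrast the two kinds of operation.
const Order = struct {
    paid: bool = false,
    notifications_sent: u32 = 0,

    /// Idempotent: the state after N calls equals the state after one call.
    fn markPaid(self: *Order) void {
        self.paid = true;
    }

    /// Not idempotent: every call changes the state again.
    fn sendNotification(self: *Order) void {
        self.notifications_sent += 1;
    }
};

test "repeating an idempotent call changes nothing further" {
    var order = Order{};
    order.markPaid();
    order.markPaid();
    try std.testing.expect(order.paid);

    order.sendNotification();
    order.sendNotification();
    try std.testing.expectEqual(@as(u32, 2), order.notifications_sent);
}
```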

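    What is interface idempotency?

    HTTP/1.1 defines idempotency. It states that requesting a resource once or many times should have the same result for the resource itself (problems such as network timeouts aside). In other words, the first request has a side effect on the resource, but subsequent identical requests have no further side effect.

    Having no side effect here means that repeated requests do not corrupt the result or produce anything unpredictable. That is, any number of executions has the same effect on the resource as a single execution.

    Why do we need idempotency?

    Normally an interface call returns its result and the request is not submitted twice. Problems arise in situations such as the following:

    1. Repeated form submission from the front end: a user fills in a form and submits it, but because of network fluctuations the success response often does not arrive in time. The user assumes the submission failed and keeps clicking the submit button, which produces duplicate form requests.
    2. Malicious vote stuffing: for example, in a voting feature, if a user repeatedly submits votes for the same candidate, the interface receives the duplicated votes and the result ends up seriously at odds with reality.
    3. Retries after an interface timeout: many HTTP client tools enable retry-on-timeout by default. Third-party callers in particular add retry mechanisms to guard against failures caused by network fluctuations and timeouts, so a single request may be submitted several times.
    4. Repeated message consumption: when MQ middleware is used and an error in the middleware prevents the consumption acknowledgement from being committed in time, the message is consumed again.

    The biggest advantage of idempotency is that the interface can guarantee that repeating any operation is safe, which removes the unknown problems that retries and similar behavior would otherwise cause in the system.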
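    One common way to make an interface safe against the duplicates listed above is to attach a unique request ID to each logical request and skip IDs that were already applied. The following is a minimal in-memory sketch under that assumption; Account, deposit, and the fixed-size ID log are hypothetical and stand in for whatever storage a real service would use.

```zig
const std = @import("std");

/// Hypothetical account that remembers which request IDs it has applied.
const Account = struct {
    balance: i64 = 0,
    applied: [8]u64 = [_]u64{0} ** 8,
    applied_count: usize = 0,

    fn wasApplied(self: *const Account, request_id: u64) bool {
        var i: usize = 0;
        while (i < self.applied_count) : (i += 1) {
            if (self.applied[i] == request_id) return true;
        }
        return false;
    }

    /// Applies the deposit at most once per request ID and returns the
    /// balance. A retry with an already-seen ID leaves the balance unchanged.
    fn deposit(self: *Account, request_id: u64, amount: i64) error{LogFull}!i64 {
        if (self.wasApplied(request_id)) return self.balance;
        if (self.applied_count == self.applied.len) return error.LogFull;
        self.applied[self.applied_count] = request_id;
        self.applied_count += 1;
        self.balance += amount;
        return self.balance;
    }
};

test "a retried request is applied only once" {
    var account = Account{};
    try std.testing.expectEqual(@as(i64, 100), try account.deposit(1, 100));
    // The client timed out and retried with the same request ID.
    try std.testing.expectEqual(@as(i64, 100), try account.deposit(1, 100));
    // A different request ID is a new operation.
    try std.testing.expectEqual(@as(i64, 150), try account.deposit(2, 50));
}
```

    The sketch is single-threaded. A real service would also have to make the check and the update atomic, which is where the serialization cost discussed next comes from.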

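    What does introducing idempotency cost the system?

    Idempotency simplifies client-side logic and prevents problems such as duplicate submissions, but it adds logical complexity and cost on the server side, mainly:

    1. Work that ran in parallel has to run serially, which lowers execution efficiency.
    2. Extra business logic is needed to enforce idempotency, which complicates the business functionality.

    So weigh whether idempotency is really needed and analyze the actual business scenario. Unless the business specifically requires it, an interface generally does not need to be made idempotent.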

  • Assembly in Zig

    Recently I wanted to write a kernel in Zig, which naturally requires some assembly.

    When a computer boots, the early setup has to be done in assembly before we can enter protected mode.

    Calling assembly from Zig

    Separate File

    We just need to declare the assembly function with the extern keyword.

    As an example, we will use Zig to call an assembly routine that prints the familiar “Hello, World!”.

    First, Zig has its own support for assembly, but currently only for AT&T syntax; the more popular Intel syntax is poorly supported.

    Zig currently uses LLVM to parse assembly, and it may get its own assembler in the future.

    Now we configure build.zig for assembly:

    Only the part between the ///////////////// markers matters here.

    const std = @import("std");
    
    // Although this function looks imperative, note that its job is to
    // declaratively construct a build graph that will be executed by an external
    // runner.
    pub fn build(b: *std.Build) void {
        // Standard target options allows the person running `zig build` to choose
        // what target to build for. Here we do not override the defaults, which
        // means any target is allowed, and the default is native. Other options
        // for restricting supported target set are available.
        const target = b.standardTargetOptions(.{});
    
        // Standard optimization options allow the person running `zig build` to select
        // between Debug, ReleaseSafe, ReleaseFast, and ReleaseSmall. Here we do not
        // set a preferred release mode, allowing the user to decide how to optimize.
        const optimize = b.standardOptimizeOption(.{});
    
        const exe = b.addExecutable(.{
            .name = "zig-prac",
            // In this case the main source file is merely a path, however, in more
            // complicated build scripts, this could be a generated file.
            .root_source_file = .{ .path = "src/main.zig" },
            .target = target,
            .optimize = optimize,
        });
    
        /////////////////
        // Note: this one call is all that is needed to add the assembly file to the build
        exe.addAssemblyFile("./src/hello.s");
        ////////////////
    
        // This declares intent for the executable to be installed into the
        // standard location when the user invokes the "install" step (the default
        // step when running `zig build`).
        exe.install();
    
        // This *creates* a RunStep in the build graph, to be executed when another
        // step is evaluated that depends on it. The next line below will establish
        // such a dependency.
        const run_cmd = exe.run();
    
        // By making the run step depend on the install step, it will be run from the
        // installation directory rather than directly from within the cache directory.
        // This is not necessary, however, if the application depends on other installed
        // files, this ensures they will be present and in the expected location.
        run_cmd.step.dependOn(b.getInstallStep());
    
        // This allows the user to pass arguments to the application in the build
        // command itself, like this: `zig build run -- arg1 arg2 etc`
        if (b.args) |args| {
            run_cmd.addArgs(args);
        }
    
        // This creates a build step. It will be visible in the `zig build --help` menu,
        // and can be selected like this: `zig build run`
        // This will evaluate the `run` step rather than the default, which is "install".
        const run_step = b.step("run", "Run the app");
        run_step.dependOn(&run_cmd.step);
    
        // Creates a step for unit testing.
        const exe_tests = b.addTest(.{
            .root_source_file = .{ .path = "src/main.zig" },
            .target = target,
            .optimize = optimize,
        });
    
        // Similar to creating the run step earlier, this exposes a `test` step to
        // the `zig build --help` menu, providing a way for the user to request
        // running the unit tests.
        const test_step = b.step("test", "Run unit tests");
        test_step.dependOn(&exe_tests.step);
    }

    Then, we write the main.zig:

    const std = @import("std");
    
    extern fn hello_world(?[*:0]const u8) void;
    
    const msg: [:0]const u8 = "Hello World!\n";
    
    pub fn main() void {
        hello_world(msg.ptr);
    }

    Then hello.s:

    Note: the file name must match *.s!

    .globl hello_world
    # global function, expose the hello_world
    .type hello_world, @function
    # tell compiler, we define a function
    .section .text
    hello_world:
      mov $4, %eax
      mov $1, %ebx
      mov %edi, %ecx
      # get parameter from register edi, you can learn more on x86-64 abi document
      mov $0xd, %edx
      # the length of string
      int $0x80
      # system call
      ret

    We can also write another version of main.zig:

    Note the builtin @ptrToInt, which converts a pointer to an integer (usize). In Zig 0.11 and later this builtin is named @intFromPtr.

    const std = @import("std");
    
    extern fn hello_world(usize) void;
    
    const msg = "Hello World!\n";
    
    pub fn main() void {
        hello_world(@ptrToInt(msg));
    }

    Global Assembly

    When an assembly expression occurs in a container level comptime block, this is global assembly.

    This kind of assembly has different rules than inline assembly. First, volatile is not valid because all global assembly is unconditionally included. Second, there are no inputs, outputs, or clobbers. All global assembly is concatenated verbatim into one long string and assembled together. There are no template substitution rules regarding % as there are in inline assembly expressions.

    const std = @import("std");
    
    comptime {
        asm (
            \\.globl hello_world
            \\.type hello_world, @function
            \\.section .text
            \\hello_world:
            \\  mov $4, %eax
            \\  mov $1, %ebx
            \\  mov %edi, %ecx
            \\  mov $0xd, %edx
            \\  int $0x80
            \\  ret
        );
    }
    
    // extern fn hello_world(?[*:0]const u8) void;
    extern fn hello_world(usize) void;
    
    // const msg: [:0]const u8 = "Hello World!\n";
    const msg = "Hello World!\n";
    
    pub fn main() void {
        // hello_world(msg.ptr);
        hello_world(@ptrToInt(msg));
    }

    Summary

    Then we can just run zig build run, and “Hello World!” appears on the screen.

    Calling Zig from assembly

    To be completed.

    Inline assembly in zig

    pub fn syscall1(number: usize, arg1: usize) usize {
        return asm volatile ("syscall"
            : [ret_reference] "={rax}" (-> usize),
            : [number_reference] "{rax}" (number),
              [arg1_reference] "{rdi}" (arg1),
            : "rcx", "r11"
        );
    }

    In this code, syscall1 is a wrapper function around an assembly instruction.

    Inline assembly is an expression which returns a value; the asm keyword begins the expression.

    volatile is an optional modifier that tells Zig this inline assembly expression has side-effects. Without volatile, Zig is allowed to delete the inline assembly code if the result is unused.

    syscall is the assembly instruction to execute.

    After the first colon comes the output section. ret_reference is the name of the output, and "={rax}" is the output constraint string. In this example, the constraint string means “the result value of this inline assembly instruction is whatever is in rax”. Next comes either a value binding or -> followed by a type, here (-> usize). The type is the result type of the inline assembly expression. If it is a value binding, the %[ret_reference] syntax can be used in the assembly string to refer to the register bound to the value.

    After the second colon comes the input section. number_reference and arg1_reference are the names of the inputs; they could be used in the assembly string to refer to the operands. Here the registers rax and rdi are loaded with the values of number and arg1.

    After the third colon comes the list of clobbers. These declare a set of registers whose values will not be preserved by the execution of this assembly code. These do not include output or input registers. The special clobber value of “memory” means that the assembly writes to arbitrary undeclared memory locations, not only the memory pointed to by a declared indirect output. In this example we list rcx and r11 because it is known the kernel syscall does not preserve these registers.

    Note: for now inline assembly in Zig is limited, and only a limited set of constraints works. So avoid constraints unless you really need them.

    Reference

    Zig Assembly