Zig HTTP Chunked Responses

Playing around with Zig, trying to get my arms around calling REST APIs and parsing the JSON. So, I figured why not use the GitHub REST API, since it is pretty straightforward: request my SSH keys and parse the resulting JSON.

const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    var client = std.http.Client{ .allocator = allocator };
    defer client.deinit();

    var header_buffer: [4096]u8 = undefined;

    const uri = try std.Uri.parse("https://api.github.com/users/sc68cal/keys");
    var req = try client.open(std.http.Method.GET, uri, .{ .server_header_buffer = &header_buffer });
    defer req.deinit();
    try req.send();
    try req.finish();
    try req.wait();

    std.debug.print("Status={}\n", .{req.response.status});
    std.debug.print("Size={}\n", .{req.response.content_length.?});

    // Allocate a buffer sized from the Content-Length header
    const resp_body = try allocator.alloc(u8, req.response.content_length.?);
    defer allocator.free(resp_body);

    const readall_size = try req.readAll(resp_body);
    std.debug.print("Readall size={d}\n", .{readall_size});

    std.debug.print("Response Body={s}\n", .{resp_body});

    const parsed = try std.json.parseFromSlice(std.json.Value, allocator, resp_body, .{});
    defer parsed.deinit();
    std.debug.print("{any}", .{parsed});
}

The problem is that the response is chunked, and we only get part of it:

Status=http.Status.ok
Size=840
Readall size=840
Response Body=[{"id":87865257,"key":"ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCbpTSPb+68wHVGqxb/0SAtPO7X1oHd3UZAwsJbfjz5CNdmxEb3Cyrec0tzn7b7lCZf3bWvD6rxUKVJogEheA0YTmlwNYa4wVUIcRxXzDTV5HtLRZqUK43BCfbTYPC50GTjaP4WVnBVZb6WB+rXRmZwG7svOy34Pg6Fi6zVAwdujwbfWwrp4gO47qvwt78Ot7qCrzp8MDwn46DLb+2YxVHO02MxrNda/XrAHvFgROIIJ+gMUG0IGf/KAl/LmjNI4hvOTr3CXuarsPnJWSUmulW0Dqyg0VlD0Wm2B8KbWYeR5/OvQ3+7ADXJgilZ0qOc1xk1XEoFrEEkWHM9HElihlb3"},{"id":102758427,"key":"ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCi3nWuy3grCiCLqCsszQ/zGa1IU1mlIwFB4/TOryMiDYMj764Bvn0O4C2wDKpjS5t17CKg24kxSxKob6cYiVnxQtIByyH1r3w0TWjDh+p3/A9YdZNXnY/SIBU5fiq3HNB+Vt12hz+107yzhJMWGc79MYIkCGKEAvnvQgTsv7AKHlvSPkd4lI1hEZnT6GxUQ8SFzFcnrQClumrjS/VxpNS1P/J958KNsGRe/lS1CPJGr3dh4GhVSMLX2x8qIifvEgHzipk+tmPmIJQCnvlvTZ4gUzGZBp65BLZrDBgzEv5tSY331bBF40iDJi/h8Y2a1IesYsXVnUwysGp1HrotybeWDtUDK4VkT8lbZSclhiIRXwwStPwXCRg
error: UnexpectedEndOfInput
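One way to confirm the chunking (assuming the Zig 0.12-era std.http.Client API, where the response struct exposes a transfer_encoding field; field names may differ in other versions) is to print it right after req.wait() in the program above:

```zig
// Assumption: std.http.Client.Response.transfer_encoding exists in this
// Zig version; should print ".chunked" for a chunked response.
std.debug.print("Transfer-Encoding={}\n", .{req.response.transfer_encoding});
```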

Calling readAll a second time results in a return value of 250:

Readall size 2nd call=250

So, we can’t really rely on Content-Length from the HTTP response to allocate the buffer.

Calling readAll in a loop until the return is 0 could work, but we’d have to grow our buffer each time.

Readall size 2nd call=250
Readall size 3rd call=0

That requires mucking with memory, copying things from a temporary buffer into a main buffer and then resizing the buffer each time. I fiddled with it a bit because I wanted to try doing it at a low level, but honestly the while loop got a bit complicated and I started to get a little annoyed.
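For what it’s worth, the shape of the loop I was fiddling with looked roughly like this (a sketch only, reusing req and allocator from the program above, and letting an ArrayList do the buffer growing rather than resizing by hand):

```zig
// Sketch: read fixed-size chunks until read() returns 0 (end of stream),
// appending each chunk to a growable ArrayList.
var body = std.ArrayList(u8).init(allocator);
defer body.deinit();

var chunk: [4096]u8 = undefined;
while (true) {
    const n = try req.read(&chunk);
    if (n == 0) break; // nothing left to read
    try body.appendSlice(chunk[0..n]);
}
// body.items now holds the full response
```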

After googling a bit, I found others who are using readAllAlloc, where you pass in an allocator and it does the dirty work of allocating the memory and growing it to contain the entire response.

    const response = try req.reader().readAllAlloc(allocator, std.math.maxInt(usize));
    defer allocator.free(response);
    std.debug.print("{s}", .{response});
[{"id":87865257,"key":"ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCbpTSPb+68wHVGqxb/0SAtPO7X1oHd3UZAwsJbfjz5CNdmxEb3Cyrec0tzn7b7lCZf3bWvD6rxUKVJogEheA0YTmlwNYa4wVUIcRxXzDTV5HtLRZqUK43BCfbTYPC50GTjaP4WVnBVZb6WB+rXRmZwG7svOy34Pg6Fi6zVAwdujwbfWwrp4gO47qvwt78Ot7qCrzp8MDwn46DLb+2YxVHO02MxrNda/XrAHvFgROIIJ+gMUG0IGf/KAl/LmjNI4hvOTr3CXuarsPnJWSUmulW0Dqyg0VlD0Wm2B8KbWYeR5/OvQ3+7ADXJgilZ0qOc1xk1XEoFrEEkWHM9HElihlb3"},{"id":102758427,"key":"ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCi3nWuy3grCiCLqCsszQ/zGa1IU1mlIwFB4/TOryMiDYMj764Bvn0O4C2wDKpjS5t17CKg24kxSxKob6cYiVnxQtIByyH1r3w0TWjDh+p3/A9YdZNXnY/SIBU5fiq3HNB+Vt12hz+107yzhJMWGc79MYIkCGKEAvnvQgTsv7AKHlvSPkd4lI1hEZnT6GxUQ8SFzFcnrQClumrjS/VxpNS1P/J958KNsGRe/lS1CPJGr3dh4GhVSMLX2x8qIifvEgHzipk+tmPmIJQCnvlvTZ4gUzGZBp65BLZrDBgzEv5tSY331bBF40iDJi/h8Y2a1IesYsXVnUwysGp1HrotybeWDtUDK4VkT8lbZSclhiIRXwwStPwXCRg7aTtb8oUZq2+hpAhqllV9M8HQ4faXKvOxmOB+RI3cHfLhBbKtpgksZAoPu4tERiobUKXqr7zwIHzdBzuFCZp87t2Iyzlc4r1J1ydfu0AgqX2h4S7D4mZHHiz8kVYFX4Ynp5CPHPQkbS8="},{"id":114074180,"key":"ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAINN7wCTSwuReuTBykxjGny7rNmKnjgDvRyX1AJBmbu5D"}]

Using the fetch API we can shrink this down further:

const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    var client = std.http.Client{ .allocator = allocator };
    defer client.deinit();

    var header_buffer: [4096]u8 = undefined;
    // fetch() grows this list to hold the full (de-chunked) response body
    var resp_list = std.ArrayList(u8).init(allocator);
    defer resp_list.deinit();

    _ = try client.fetch(
        .{
            .server_header_buffer = &header_buffer,
            .location = .{ .url = "https://api.github.com/users/sc68cal/keys" },
            .response_storage = .{ .dynamic = &resp_list },
        },
    );

    std.debug.print("{s}", .{resp_list.items});
}

So, step 1 is done where we properly retrieve from the REST API.

The thing I keep thinking of is: “With Python, yes, it is a lot slower than Zig, but man, it takes a lot more manual work in Zig if you’re not aware of time-saving API calls like fetch, or don’t know that there’s a reader you can instantiate and then call readAllAlloc on.”

I am still determined to keep plugging away at Zig in order to learn the new language, but it is a thought that I keep coming back to.

Let’s see how difficult parsing the JSON is and accessing keys & values.

Since the GitHub REST API for SSH keys is very basic and just has an id and key field, we can create a struct in Zig that represents it. Having id be an unsigned 64-bit integer is probably a bit of overkill, but hey, why not splurge a little.

const SshKey = struct {
    id: u64,
    key: []u8,
};

We then use std.json.parseFromSlice and pass it []SshKey: since the REST API response will be a JSON array of items, we want the parsed result to be a slice of those items as well. The return from parseFromSlice is a std.json.Parsed, which has a value field that contains the result.

    const json_parsed = try std.json.parseFromSlice(
        []SshKey,
        allocator,
        resp_list.items,
        .{},
    );
    defer json_parsed.deinit();

    for (json_parsed.value) |item| {
        std.debug.print("ID: {d}\n", .{item.id});
        std.debug.print("Key:\n{s}\n", .{item.key});
    }

So, overall the code is fairly compact. I am trying to keep perspective because I am forcing myself to use only the standard library, yet I am comparing it to Python with requests installed. I am sure if I limited myself to Python and only calls to urllib it would be more than a one-liner that you can do with requests.

I’m still skeptical about the supposed advantage of strong typing. In my short experience with Zig, the compiler does catch obvious mistakes where you are calling a function incorrectly, but I have seen more utility in the built-in test functionality that Zig gives you, which you can embed right next to the code. Even that isn’t really a game changer, since with Python you write lots of unit tests, and they would uncover the same mistakes at the same phase of development. So, while you can call a function with the wrong kind of arguments in Python and there’s no compiler to catch it while you’re writing it, running the unit tests for a Python project occupies the same part of development that compiling does in Zig. It’s a different route to the same result.
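To show what I mean by embedded tests, here is a minimal, self-contained example (the add function is just an illustration, not from the program above):

```zig
const std = @import("std");

fn add(a: i32, b: i32) i32 {
    return a + b;
}

// Tests sit right next to the code and run with `zig test file.zig`
test "add adds two integers" {
    try std.testing.expectEqual(@as(i32, 5), add(2, 3));
}
```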

Perhaps, then, writing code in Zig will result in web applications and web clients that run faster and scale better than Python? Although those applications are far more often limited by network and disk I/O than by raw compute. Still, if, all things being equal, the only thing I can optimize for is compute, then it’s worth moving to Zig for more of my coding.

We’ll see. Zig is definitely still growing on me. While it may take significantly more time for me to write something in Zig compared to Python, code spends a lot longer being run than being developed, and the speed advantages of Zig are very clear.