Make framing less error prone and not require buffered IO #2

@daniel-j-h

At the moment we implement framing by size-prefixing graph chunk messages with a varint.

file header  # fixed size
graph header  # fixed size
size0
chunk0
size1
chunk1
..
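A minimal sketch of the writing side of this layout, for illustration only. It assumes the varint is a protobuf-style base-128 varint, which the issue does not state; `encode_varint` and `write_frames` are hypothetical names, and the two fixed-size headers are left out.

```python
import io


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (assumed encoding)."""
    if n < 0:
        raise ValueError("varint must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)  # high bit clear: last byte of the varint
            return bytes(out)


def write_frames(stream, chunks) -> None:
    """Write every chunk prefixed with its length as a varint."""
    for chunk in chunks:
        stream.write(encode_varint(len(chunk)))
        stream.write(chunk)


# Hypothetical payloads; a real file would start with the two headers.
buf = io.BytesIO()
write_frames(buf, [b"abc", b""])
framed = buf.getvalue()
```

Note that the empty chunk contributes exactly one byte (its size prefix) to the stream.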

This is the recommended way to implement framing; there's just one issue right now, which is how we detect the end of the chunks:

  • we do not store the number of chunks, neither in the file header nor in the graph header
  • to check if there are still chunks in the file, we peek ahead and read four bytes; the idea was that we will either hit EOF and be done, or find a varint that tells us the size of the next chunk to decode

This approach has the following downsides:

  • if the file contains an empty chunk, its varint is a single byte and no payload follows. If such a chunk is at the very end of the file, the four-byte peek hits EOF even though a valid chunk remains
  • we need buffered IO (e.g. a file or BufferedReader) to support peeking without consuming bytes; it would be great if we could avoid that
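The first downside can be reproduced with a small sketch. `has_more_chunks_by_peeking` is a hypothetical stand-in for the check described above, not the actual library code, and the byte string assumes a single-byte varint size prefix per chunk.

```python
import io


def has_more_chunks_by_peeking(reader: io.BufferedReader) -> bool:
    """Hypothetical check: peek four bytes and treat a short peek as EOF."""
    # peek() may return fewer or more bytes than requested.
    return len(reader.peek(4)) >= 4


# Two frames: a 3-byte chunk (size 3) followed by an empty chunk (size 0).
data = b"\x03abc\x00"
reader = io.BufferedReader(io.BytesIO(data))

reader.read(4)  # consume the first frame: one size byte plus three payload bytes

# Only the size byte of the empty chunk is left, so the peek comes up short.
remaining = reader.peek(4)
more = has_more_chunks_by_peeking(reader)  # False, although a frame remains
```

The trailing empty chunk is silently dropped, and the check only works at all because the stream is wrapped in a `BufferedReader`.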

We need to figure out how we can change our approach and iterate some more. This really only came up when implementing the Python lib. Maybe we just encode how many chunks there are in the graph header. Or we use a fixed-size prefix instead of a varint.
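As a rough sketch of the fixed-size prefix idea, assuming a 4-byte little-endian unsigned length (the width and byte order are assumptions, not decisions made in this issue): the reader needs only plain `read()` calls, and an empty trailing chunk is no longer ambiguous. All names here are hypothetical.

```python
import io
import struct

PREFIX = struct.Struct("<I")  # assumed: 4-byte little-endian unsigned length


def read_exact(stream, n: int) -> bytes:
    """Read up to n bytes, looping over short reads; shorter only at EOF."""
    parts = []
    remaining = n
    while remaining > 0:
        part = stream.read(remaining)
        if not part:
            break  # EOF
        parts.append(part)
        remaining -= len(part)
    return b"".join(parts)


def write_chunks(stream, chunks) -> None:
    """Write every chunk prefixed with its fixed-size length."""
    for chunk in chunks:
        stream.write(PREFIX.pack(len(chunk)))
        stream.write(chunk)


def read_chunks(stream):
    """Yield chunks until EOF falls exactly on a frame boundary."""
    while True:
        header = read_exact(stream, PREFIX.size)
        if not header:
            return  # clean EOF: no more frames
        if len(header) < PREFIX.size:
            raise EOFError("truncated size prefix")
        (size,) = PREFIX.unpack(header)
        chunk = read_exact(stream, size)
        if len(chunk) < size:
            raise EOFError("truncated chunk")
        yield chunk


buf = io.BytesIO()
write_chunks(buf, [b"abc", b""])
buf.seek(0)
decoded = list(read_chunks(buf))
```

The trade-off is three extra bytes per small chunk compared to a single-byte varint. Storing the chunk count in the graph header would keep the varint prefix but requires knowing the count before the header is written.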
