Snappy compressed format description
-Last revised: 2011-05-16
+Last revised: 2011-10-05
This is not a formal specification, but should suffice to explain most
stored as a little-endian varint. Varints consist of a series of bytes,
where the lower 7 bits are data and the upper bit is set iff there are
more bytes to be read. In other words, an uncompressed length of 64 would
-be stored as 0x40, and an uncompressed length of 2097151 (0x1FFFFF)
-would be stored as 0xFF 0xFF 0x7F.
+be stored as 0x40, and an uncompressed length of 2097150 (0x1FFFFE)
+would be stored as 0xFE 0xFF 0x7F.
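
As an illustration of the varint rule above, here is a minimal Python sketch. The function names `encode_varint` and `decode_varint` are hypothetical and are not part of any Snappy API. It reproduces the two examples from the revised text.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a little-endian base-128 varint."""
    if n < 0:
        raise ValueError("varints encode non-negative integers only")
    out = bytearray()
    while True:
        low7 = n & 0x7F          # lower 7 bits carry data
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # upper bit set: more bytes follow
        else:
            out.append(low7)         # upper bit clear: last byte
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode a varint starting at buf[pos]; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


print(encode_varint(64).hex())       # 40
print(encode_varint(2097150).hex())  # feff7f
```

A real decoder would also want to reject lengths that do not fit in 32 bits; that check is left out here to keep the sketch short.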
2. The compressed stream itself
00: Literal
01: Copy with 1-byte offset
10: Copy with 2-byte offset
- 11: Copy with 3-byte offset
+ 11: Copy with 4-byte offset
The interpretation of the upper six bits is element-dependent.
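
The split of a tag byte into element type (lower two bits) and the element-dependent upper six bits could be sketched as follows. `ELEMENT_TYPES` and `split_tag` are names invented for this illustration, and the sketch assumes the revised table, in which 11 means a 4-byte offset.

```python
# Element types keyed by the lower two bits of the tag byte.
ELEMENT_TYPES = {
    0b00: "literal",
    0b01: "copy with 1-byte offset",
    0b10: "copy with 2-byte offset",
    0b11: "copy with 4-byte offset",
}


def split_tag(tag: int) -> tuple[str, int]:
    """Return (element type, upper six bits) for a single tag byte."""
    if not 0 <= tag <= 0xFF:
        raise ValueError("tag must be a single byte")
    return ELEMENT_TYPES[tag & 0x03], tag >> 2


print(split_tag(0xF0))  # ('literal', 60)
```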
- For literals up to and including 60 bytes in length, the upper
six bits of the tag byte contain (len-1). The literal follows
immediately thereafter in the bytestream.
- - For longer literals, the length is stored after the tag byte,
+ - For longer literals, the (len-1) value is stored after the tag byte,
little-endian. The upper six bits of the tag byte describe how
many bytes are used for the length; 60, 61, 62 or 63 for
1-4 bytes, respectively. The literal itself follows after the
little-endian 16-bit integer in the two bytes following the tag byte.
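
The literal length rules (tag type 00) can be illustrated with a short Python sketch. `parse_literal_header` is a hypothetical helper, not a function from the Snappy library. It follows the revised wording, in which the bytes after the tag hold (len-1) rather than len.

```python
def parse_literal_header(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Parse a literal tag at buf[pos].

    Returns (literal length, position of the first literal byte).
    """
    if pos >= len(buf):
        raise ValueError("no tag byte to read")
    tag = buf[pos]
    if tag & 0x03 != 0b00:
        raise ValueError("not a literal element")
    upper = tag >> 2
    pos += 1
    if upper < 60:
        # Short literal: the upper six bits hold (len-1) directly.
        return upper + 1, pos
    # Long literal: 60, 61, 62, 63 mean 1, 2, 3, 4 length bytes follow.
    nbytes = upper - 59
    if pos + nbytes > len(buf):
        raise ValueError("truncated literal length")
    # The stored value is (len-1), little-endian.
    length = int.from_bytes(buf[pos:pos + nbytes], "little") + 1
    return length, pos + nbytes


print(parse_literal_header(bytes([0xEC])))        # (60, 1)
print(parse_literal_header(bytes([0xF0, 0x3C])))  # (61, 2)
```

Storing (len-1) in both forms means a zero-length literal cannot be expressed, and the largest value in each width is one byte longer than it would be if len were stored directly.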
-2.2.3. Copy with 4-byte offsets (11)
+2.2.3. Copy with 4-byte offset (11)
These are like the copies with 2-byte offsets (see previous subsection),
except that the offset is stored as a 32-bit integer instead of a