Tinkering with CompressionLib (Part 2)

16 Jul

After thinking (and writing) a bit about the new CompressionLib available in OS X 10.11 (and iOS 9), I’ve continued to play around with it.

Last time I looked at the _buffer functions and how they could be used. I remarked, somewhat offhandedly, that the inability to get at the expected uncompressed size of an archive is a limiting factor. The _buffer functions are simply not meant to be used that way. I do, however, still yearn for a CompressionLib-style way to get at the archive’s metadata (assuming the given archive format supports it, of course). I’ll have to give this some more thought.
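In the meantime, one common workaround is to record the uncompressed size yourself, in front of the compressed bytes. Here’s a minimal sketch of that idea in plain C. To be clear, this is not part of CompressionLib: the helper names (write_size_header, read_size_header) and the 8-byte little-endian layout are hypothetical choices made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical layout: the first 8 bytes of the blob hold the
// uncompressed length, least significant byte first.
enum { SIZE_HEADER_BYTES = 8 };

// Writes `uncompressed_size` into the first 8 bytes of `dst`.
// Returns the number of bytes written, or 0 if `dst` is too small.
size_t write_size_header(uint8_t *dst, size_t dst_capacity,
                         uint64_t uncompressed_size) {
    if (dst_capacity < SIZE_HEADER_BYTES) {
        return 0;
    }
    for (int i = 0; i < SIZE_HEADER_BYTES; i++) {
        dst[i] = (uint8_t)(uncompressed_size >> (8 * i));
    }
    return SIZE_HEADER_BYTES;
}

// Reads the header back into `*uncompressed_size`.
// Returns 1 on success, 0 if `src` is too short to contain a header.
int read_size_header(const uint8_t *src, size_t src_size,
                     uint64_t *uncompressed_size) {
    if (src_size < SIZE_HEADER_BYTES) {
        return 0;
    }
    uint64_t value = 0;
    for (int i = 0; i < SIZE_HEADER_BYTES; i++) {
        value |= (uint64_t)src[i] << (8 * i);
    }
    *uncompressed_size = value;
    return 1;
}
```

The compressed payload would start right after the header, so a reader can size its destination buffer before decoding. The obvious downside is that the result is no longer a plain LZFSE (or zlib, etc.) stream, so only code that knows about the header can read it.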

I largely ignored the stream functions in my first post. This time around I’m going to take a much closer look at them and show how they can be used with NSData. Objective-C is getting some love today.

So, without further ado, let’s dive in.

Say you have an NSData object that contains some raw data, maybe level data for a game, or some other model data that doesn’t change frequently, and you would like to compress it. This is about the simplest case of using compression, but it’s also quite possibly the most common.

Here’s how we can use CompressionLib to compress an NSData object:

// the stream used for compression and its status
compression_stream stream;
compression_status status;

// we want to compress so we use the ENCODE option
compression_stream_operation op = COMPRESSION_STREAM_ENCODE;

// COMPRESSION_STREAM_FINALIZE is used to indicate that no further input will be added. Since we have the entire input data we can finalize right away
compression_stream_flags flags = COMPRESSION_STREAM_FINALIZE;

// we want to use the super awesome LZFSE algorithm
compression_algorithm algorithm = COMPRESSION_LZFSE;

// init the stream for compression
status = compression_stream_init(&stream, op, algorithm);
if (status == COMPRESSION_STATUS_ERROR) {
    // FIXME: Shame on you for not handling this error properly
}

// setup the stream's source
NSData *inputData = // get some data
stream.src_ptr    = inputData.bytes;
stream.src_size   = inputData.length;

// setup the stream's output buffer
// we use a temporary buffer to store data as it's compressed
size_t   dstBufferSize = 4096;
uint8_t *dstBuffer     = malloc(dstBufferSize);
stream.dst_ptr         = dstBuffer;
stream.dst_size        = dstBufferSize;
// and we store the aggregated output in a mutable data object
NSMutableData *outputData = [NSMutableData new];

do {
    // try to compress some data
    status = compression_stream_process(&stream, flags);

    switch (status) {
        case COMPRESSION_STATUS_OK:
            // Going to call _process at least once more
            if (stream.dst_size == 0) {
                // Output buffer is full...
                // Write out to outputData
                [outputData appendBytes:dstBuffer length:dstBufferSize];
                // Re-use dstBuffer
                stream.dst_ptr = dstBuffer;
                stream.dst_size = dstBufferSize;
            }
            break;

        case COMPRESSION_STATUS_END:
            // We are done, just write out the dstBuffer if there's anything in it
            if (stream.dst_ptr > dstBuffer) {
                [outputData appendBytes:dstBuffer length:stream.dst_ptr - dstBuffer];
            }
            break;

        case COMPRESSION_STATUS_ERROR:
            // FIXME: Eat your vegetables, handle your errors.
            break;

        default:
            break;
    }
} while (status == COMPRESSION_STATUS_OK);

// We're done with the stream so free it, along with the temporary buffer
compression_stream_destroy(&stream);
free(dstBuffer);
// Finally we get our compressed data
NSData *compressedData = [outputData copy];

When you want to decompress the data you only need to change two variables:

// set the stream operation to decode instead of encode
compression_stream_operation op = COMPRESSION_STREAM_DECODE;

// set the flags to 0, we don't need any flags here
compression_stream_flags flags = 0;

The work involved in setting up a stream for compression / decompression is as minimal as it can be. Like the _buffer APIs, the _stream APIs are very clean and easy to use. In fact, the only stumbling block I ran into was the need to reset the dst_ptr and dst_size during processing to reuse the buffer. I had expected the reuse to be implied and didn’t see anything in the header about this.
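To make that reset step concrete, here’s a toy model of the drain-and-reset pattern in plain C. It is only a sketch: toy_stream and toy_process are hypothetical stand-ins that copy bytes instead of compressing them, and they mimic just one behavior of the real stream, namely that each call advances dst_ptr and shrinks dst_size. The field names mirror compression_stream; everything else is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { TOY_OK, TOY_END } toy_status;

// Hypothetical stand-in for compression_stream (same field names).
typedef struct {
    const uint8_t *src_ptr;
    size_t         src_size;
    uint8_t       *dst_ptr;
    size_t         dst_size;
} toy_stream;

// Copies as much input as fits in the output, then advances both
// pointers. Returns TOY_END once all input has been consumed.
static toy_status toy_process(toy_stream *s) {
    size_t n = s->src_size < s->dst_size ? s->src_size : s->dst_size;
    if (n > 0) {
        memcpy(s->dst_ptr, s->src_ptr, n);
    }
    s->src_ptr += n;  s->src_size -= n;
    s->dst_ptr += n;  s->dst_size -= n;
    return s->src_size == 0 ? TOY_END : TOY_OK;
}

// Pushes `src` through a tiny 4-byte staging buffer into `out`.
// Returns the number of bytes written, or 0 if `out` is too small.
size_t toy_drain(const uint8_t *src, size_t src_size,
                 uint8_t *out, size_t out_capacity) {
    if (out_capacity < src_size) {
        return 0;
    }
    uint8_t staging[4];
    toy_stream s = { src, src_size, staging, sizeof staging };
    size_t written = 0;
    toy_status status;
    do {
        status = toy_process(&s);
        if (status == TOY_END || s.dst_size == 0) {
            // Flush whatever landed in the staging buffer...
            size_t produced = (size_t)(s.dst_ptr - staging);
            memcpy(out + written, staging, produced);
            written += produced;
            // ...and rewind it. The stream never does this for us.
            s.dst_ptr  = staging;
            s.dst_size = sizeof staging;
        }
    } while (status == TOY_OK);
    return written;
}
```

If the two rewind lines are left out, dst_size stays at 0 after the first full buffer, so the toy stream can never make progress again. That is the same trap I fell into with the real thing.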

With CompressionLib coming standard in the latest SDKs, it seems only natural that NSData should support compression and decompression. Compressing some of your app’s data can go a long way toward saving valuable space on the user’s device. Not to mention the performance and energy benefits of using compressed data during network transactions. So while I was playing with CompressionLib, that’s exactly what I did…

Here’s a small category on NSData that adds two public methods:

// Returns an NSData object created by compressing the receiver using the given compression algorithm.
- (NSData *)lam_compressedDataUsingCompression:(LAMCompression)compression;

// Returns an NSData object created by uncompressing the receiver using the given compression algorithm.
- (NSData *)lam_uncompressedDataUsingCompression:(LAMCompression)compression;

To compress some data you just call:

NSData *compressedData = [someUncompressedData lam_compressedDataUsingCompression:LAMCompressionLZFSE];

And to uncompress some data:

NSData *uncompressedData = [someCompressedData lam_uncompressedDataUsingCompression:LAMCompressionLZFSE];

Here’s the repo.