channel#

The compressed::channel<T> represents a single channel which is either part of a compressed::image<T> or standalone. It stores its data as a compressed buffer split up into smaller chunks. This allows for very memory efficient storage and modification as we only ever need to decompress one chunk at a time for which we can reuse the same memory buffer.
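The idea of chunk-wise compressed storage can be sketched in a few lines. This is only an illustration under stated assumptions: it uses zlib from the Python standard library in place of the blosc2 backend, and the names `CHUNK_ELEMS`, `compress_chunks`, `modify_in_place` and `decompress_all` are hypothetical helpers, not part of the library.

```python
import zlib
from array import array

CHUNK_ELEMS = 4  # hypothetical and tiny, so the example is easy to follow


def compress_chunks(values, typecode="f"):
    """Split `values` into fixed-size chunks and compress each one separately."""
    chunks = []
    for start in range(0, len(values), CHUNK_ELEMS):
        raw = array(typecode, values[start:start + CHUNK_ELEMS]).tobytes()
        chunks.append(zlib.compress(raw))
    return chunks


def modify_in_place(chunks, fn, typecode="f"):
    """Decompress, modify and recompress the chunks one after another.

    Only a single chunk is ever held in decompressed form. For simplicity
    this sketch creates a fresh buffer per chunk instead of reusing one.
    """
    for i, blob in enumerate(chunks):
        buffer = array(typecode)
        buffer.frombytes(zlib.decompress(blob))
        for j in range(len(buffer)):
            buffer[j] = fn(buffer[j])
        chunks[i] = zlib.compress(buffer.tobytes())


def decompress_all(chunks, typecode="f"):
    """Decompress every chunk and concatenate the results into one list."""
    out = array(typecode)
    for blob in chunks:
        out.frombytes(zlib.decompress(blob))
    return list(out)


chunks = compress_chunks([float(v) for v in range(10)])
modify_in_place(chunks, lambda v: v * 2.0)
result = decompress_all(chunks)
```

Ten values with four elements per chunk give three chunks, the last of which is smaller than the others. That mirrors the behaviour described for the last chunk of a channel.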

Iterating over a channel#

The C++ class exposes methods for handling this decompression/recompression for you. It allows for iteration over a channel almost like you would iterate over a regular std::vector<T> without having to worry about the decompression or recompression.

compressed::channel<T> my_channel = ...;

// This will iterate through all the chunks of the channel, decompressing and recompressing
// them, reusing the same internal buffer. On dereference you get a chunk
// span which is a std::span<T> with additional metadata for e.g. finding the x and y coordinate
for (compressed::container::chunk_span<T> chunk : my_channel)
{
    // std::views::enumerate (C++23) yields (index, reference) tuples by value, so we
    // bind with auto&&. The pixel still refers to the element inside the chunk.
    for (auto&& [index, pixel] : std::views::enumerate(chunk))
    {
        // The index here is local to the chunk itself, we will take care
        // of computing its global coordinate.
        auto x = chunk.x(index);
        auto y = chunk.y(index);

        // You can now modify the pixel in-place
        pixel = static_cast<T>(x) * y;
    }
}
// Note that the chunk_span is only valid as long as the iterator is alive!

import numpy as np
import compressed_image as compressed

channel: compressed.Channel = ...

# All chunks except the last one are guaranteed to be the same size, so a
# single buffer can be reused for them.
buffer = np.ndarray((channel.chunk_elems(),), dtype=channel.dtype)
for i in range(channel.num_chunks() - 1):
    channel.get_chunk(i, buffer)
    buffer[:] = i
    channel.set_chunk(i, buffer)

# The last chunk may be smaller, so let get_chunk allocate a fitting array for it.
last = channel.num_chunks() - 1
final_buffer = channel.get_chunk(last)
final_buffer[:] = last
channel.set_chunk(last, final_buffer)
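
How an index local to a chunk relates to a global pixel coordinate can be sketched as follows. This assumes a row-major scanline layout and that every chunk before the current one holds the full number of elements. `chunk_coordinates` is a hypothetical helper used only for illustration; whether chunk.x() and chunk.y() are implemented this way is an assumption.

```python
def chunk_coordinates(chunk_index, local_index, chunk_elems, width):
    """Map an index local to a chunk onto global (x, y) pixel coordinates."""
    # Offset of the element counted from the first pixel of the channel.
    global_index = chunk_index * chunk_elems + local_index
    return global_index % width, global_index // width


# A 4 pixel wide image stored with 5 elements per chunk: chunk 1 covers the
# global indices 5..9 and therefore starts in the middle of scanline 1.
coords = [chunk_coordinates(1, i, chunk_elems=5, width=4) for i in range(5)]
```

Because the chunk size here is not a multiple of the width, the chunk starts and ends in the middle of a scanline, which is the partial-scanline case.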

Lazy channels#

The compressed::channel<T> additionally has support for lazy channels. These can be created via e.g.

  • compressed::channel<T>::zeros

  • compressed::channel<T>::zeros_like

  • compressed::channel<T>::full

  • compressed::channel<T>::full_like

Lazy channels are channels which store a single value per-chunk until that chunk is filled. This is especially efficient for largely sparse data and is used extensively by the cryptomatte-api.

When using lazily generated channels, it is usually advised to not iterate over them as shown above (although it is entirely valid) as that will store explicit data on each of the chunks making the further memory savings negligible. Instead, it is recommended to explicitly set the chunks that will contain data.

Take, for example, a mask channel which covers only a small portion of the image. Rather than explicitly storing compressed data for the whole image (which will not compress away entirely and carries a non-trivial overhead), we can store just a single value for every chunk outside of the mask area.

auto lazy_channel = compressed::channel<T>::zeros(2048 /*width*/, 2048 /*height*/);

// Generate a buffer which will hold the chunk data, we never have to allocate the whole image
std::vector<T> buffer(lazy_channel.chunk_elems());

// Iterate over all the chunks and modify only specific ones.
for (size_t i = 0; i < lazy_channel.num_chunks(); ++i)
{
    if (/* some arbitrary condition is true */)
    {
        // Note that we need to take a subspan of the buffer here. All chunks within a
        // channel are guaranteed to be the same size, except for the last chunk which
        // may be smaller, we do not pad up to the chunk size so we need to ensure
        // we do not try to set too much data.
        std::span<T> chunk_span(buffer.begin(), buffer.begin() + lazy_channel.chunk_elems(i));

        // Setting this to a span which does not have the size compressed::channel<T>::chunk_elems(index)
        // would raise an exception here.
        lazy_channel.set_chunk(chunk_span, i);
    }
}

Note

Even though you may provide your own chunk size, we internally ensure that this is always a multiple of sizeof(T) so you can always safely convert from chunk_size -> chunk_elems.

import numpy as np
import compressed_image as compressed

channel = compressed.Channel.zeros(np.float16, width = 2048, height = 2048)

# Now we can iterate over all the chunks and only modify specific ones
for chunk_idx in range(channel.num_chunks()):
    if some_condition:  # placeholder for your own per-chunk condition
        decompressed = channel.get_chunk(chunk_idx)

        # Do some modifications
        decompressed[:] = 100

        # Now ensure we set back the chunk, as otherwise the data will not update!
        channel.set_chunk(chunk_idx, decompressed)
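
To make the idea of lazy storage more concrete, here is a toy model of it. It is a sketch under assumptions, not the library's implementation: `LazyChunks` and all of its members are hypothetical, data is kept in plain Python lists, and nothing is compressed.

```python
class LazyChunks:
    """Toy model: each chunk is either a single fill value or explicit data."""

    def __init__(self, num_elems, chunk_elems, fill_value=0):
        self.num_elems = num_elems
        self._chunk_elems = chunk_elems
        num_chunks = -(-num_elems // chunk_elems)  # ceiling division
        # Lazy state: one fill value per chunk instead of a full buffer.
        self._chunks = [fill_value] * num_chunks

    def num_chunks(self):
        return len(self._chunks)

    def chunk_elems(self, index):
        # The last chunk may be smaller than the others.
        start = index * self._chunk_elems
        return min(self._chunk_elems, self.num_elems - start)

    def get_chunk(self, index):
        stored = self._chunks[index]
        if isinstance(stored, list):
            return list(stored)
        # Expand the single stored value only when it is requested.
        return [stored] * self.chunk_elems(index)

    def set_chunk(self, index, data):
        if len(data) != self.chunk_elems(index):
            raise ValueError("data must match the size of the chunk")
        # From now on this chunk stores explicit data.
        self._chunks[index] = list(data)

    def explicit_chunks(self):
        """Number of chunks that hold explicit data."""
        return sum(isinstance(chunk, list) for chunk in self._chunks)


lazy = LazyChunks(num_elems=10, chunk_elems=4)
lazy.set_chunk(2, [7, 7])
```

After the call to set_chunk only one of the three chunks holds explicit data, while the other two still cost a single value each. Iterating over every chunk and writing it back would turn all of them into explicit data, which is why setting only the required chunks is recommended.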

Memory layout#

A compressed::channel<T> is internally stored as chunks of scanline data, with each chunk representing n scanlines (this may also include partial scanlines, although in most cases it will be aligned to scanlines). If we visualize this, a chunk could therefore look like this:

../_images/chunked_image.jpg

As the chunk size grows or the image shrinks, each chunk covers more of the image's vertical extent (and less as the chunk size shrinks or the image grows). This is important to know because, unlike e.g. tiled images, accessing a vertical slice of the image requires decompressing the entire image.

The chunk size is additionally capped: if a channel takes up fewer bytes than chunk_size, the chunk size is adjusted to width * height * sizeof(T). There is therefore no need to modify the chunk size if you only plan on compressing smaller images.
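
The layout rules above can be worked through with some arithmetic. The sketch below assumes that chunk_size is already a multiple of the element size and that the channel is not empty. `chunk_layout` and `chunks_for_column` are hypothetical helpers, not library functions.

```python
def chunk_layout(width, height, itemsize, chunk_size=4 * 1024 * 1024):
    """Derive the chunk layout of a channel from its shape and element size."""
    total_elems = width * height
    # A channel smaller than chunk_size is stored as a single, smaller chunk.
    chunk_size = min(chunk_size, total_elems * itemsize)
    chunk_elems = chunk_size // itemsize
    num_chunks = -(-total_elems // chunk_elems)  # ceiling division
    return {
        "chunk_size": chunk_size,
        "chunk_elems": chunk_elems,
        "num_chunks": num_chunks,
        # The last chunk holds whatever is left over and is not padded.
        "last_chunk_elems": total_elems - (num_chunks - 1) * chunk_elems,
        # A fractional value means that chunks contain partial scanlines.
        "scanlines_per_chunk": chunk_elems / width,
    }


def chunks_for_column(x, width, height, chunk_elems):
    """Indices of all chunks that hold at least one pixel of column x."""
    return sorted({(y * width + x) // chunk_elems for y in range(height)})


large = chunk_layout(2048, 2048, itemsize=4)   # e.g. 32-bit float data
small = chunk_layout(64, 64, itemsize=1)       # fits into a single chunk
odd = chunk_layout(100, 30, itemsize=2, chunk_size=2048)
column = chunks_for_column(0, 2048, 2048, large["chunk_elems"])
```

For the 2048x2048 example a single column touches all four chunks, which illustrates why reading a vertical slice means decompressing the whole channel.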

Block size#

You may notice the constructor of channels taking a block_size parameter. This parameter controls the size of blocks within chunks. The compressed data is stored in 3 levels with the blocks being the lowest level. It goes channels > chunks > blocks.

While the user can set the block size, they cannot extract just a single block of data from a chunk, and it is also not transparent where a block starts or ends.

The main thing to worry about when it comes to block size is knowing that:

A, it is the smallest unit and is what will be compressed in the end (chunks simply hold collections of blocks).

B, it should roughly fit into the L1 cache of your CPU for better data throughput.

These implementation details are usually handled well by the default, compressed::s_default_blocksize.
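
To get a feeling for the three levels, the following sketch counts chunks per channel and blocks per chunk using the default sizes from the signatures on this page. `hierarchy` is a hypothetical helper, and the block count is only an estimate since blosc2 may adjust the block size internally.

```python
DEFAULT_BLOCK_SIZE = 32_768      # bytes, default block_size on this page
DEFAULT_CHUNK_SIZE = 4_194_304   # bytes, default chunk_size on this page


def hierarchy(width, height, itemsize,
              chunk_size=DEFAULT_CHUNK_SIZE, block_size=DEFAULT_BLOCK_SIZE):
    """Return (number of chunks, blocks per full chunk) for a channel."""
    total_bytes = width * height * itemsize
    chunk_size = min(chunk_size, total_bytes)          # capped for small images
    num_chunks = -(-total_bytes // chunk_size)         # ceiling division
    blocks_per_chunk = -(-chunk_size // block_size)    # ceiling division
    return num_chunks, blocks_per_chunk


float_channel = hierarchy(2048, 2048, itemsize=4)
tiny_channel = hierarchy(64, 64, itemsize=1)
```

With the defaults, a 2048x2048 channel of 4-byte elements is split into 4 chunks of 128 blocks each, while a 64x64 channel of 1-byte elements fits into a single chunk holding a single block.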

Channel Struct#

template<typename T>
struct channel : public std::ranges::view_interface<channel<T>>#

Public Types

using value_type = T#
using iterator = channel_iterator<T>#
using const_iterator = channel_iterator<const T>#

Public Functions

inline channel(channel &&other)#
inline channel &operator=(channel &&other)#
channel(const channel&) = delete#
channel &operator=(const channel&) = delete#
inline channel()#

Default ctor, ensures the schunk and compression/decompression contexts are always initialized into valid states. This will not generate a valid channel however and the ctor taking data or the static functions zeros and full are preferred.

inline channel(const std::span<const T> data, size_t width, size_t height, enums::codec compression_codec = enums::codec::lz4, uint8_t compression_level = 9, size_t block_size = s_default_blocksize, size_t chunk_size = s_default_chunksize)#

Initialize the channel with the given data.

Parameters:
  • data – The span of input data to be compressed.

  • width – The width of the image channel.

  • height – The height of the image channel.

  • compression_codec – The compression codec to be used (default is lz4).

  • compression_level – The compression level (default is 9).

  • block_size – The size of the blocks stored inside the chunks, defaults to 32KB which is enough to comfortably fit into the L1 cache of most modern CPUs. If you know your CPU can handle larger blocks feel free to increase this number, although this may not increase performance.

  • chunk_size – The size of each individual chunk, defaults to 4MB which is enough to hold a 2048x2048 channel of 8-bit data. For optimal performance this should be no larger than the size of the images you usually expect to compress, although increasing it might give better compression ratios. Must be a multiple of sizeof(T).

inline channel(blosc2::schunk_var<T> schunk, size_t width, size_t height, enums::codec compression_codec = enums::codec::lz4, uint8_t compression_level = 9)#

Initialize the channel with the given super-chunk.

Parameters:
  • schunk – The initialized super-chunk.

  • width – The width of the image channel.

  • height – The height of the image channel.

  • compression_codec – The compression codec to be used.

  • compression_level – The compression level (default is 9).

  • block_size – The size of the blocks stored inside the chunks, defaults to 32KB which is enough to comfortably fit into the L1 cache of most modern CPUs. If you know your CPU can handle larger blocks feel free to increase this number, although this may not increase performance.

  • chunk_size – The size of each individual chunk, defaults to 4MB which is enough to hold a 2048x2048 channel of 8-bit data. For optimal performance this should be no larger than the size of the images you usually expect to compress, although increasing it might give better compression ratios. Must be a multiple of sizeof(T).

inline iterator begin()#

Returns an iterator pointing to the beginning of the compressed data.

Returns:

An iterator to the beginning of the compressed data.

inline iterator end()#

Returns an iterator pointing to the end of the compressed data.

Returns:

An iterator to the end of the compressed data.

inline blosc2::context_raw_ptr compression_context()#

Retrieve a view to the compression context. In most cases users will not have to modify this.

Returns:

A pointer to the compression context.

inline blosc2::context_raw_ptr decompression_context()#

Retrieve a view to the decompression context. In most cases users will not have to modify this.

Returns:

A pointer to the decompression context.

inline void update_nthreads(size_t nthreads, size_t block_size = s_default_blocksize)#

Update the number of threads used internally by c-blosc2 for compression and decompression.

Parameters:
  • nthreads – The number of threads to use for compression and decompression.

  • block_size – The block size to compress to

inline size_t width() const noexcept#

The channel width.

Returns:

The width of the channel.

inline size_t height() const noexcept#

The channel height.

Returns:

The height of the channel.

inline enums::codec compression() const noexcept#

Retrieve the compression codec used.

Returns:

The compression codec.

inline uint8_t compression_level() const noexcept#

Retrieve the compression level used.

Returns:

The compression level (typically from 1-9).

inline size_t compressed_bytes() const#

Retrieve the compressed data size.

Returns:

The size of the compressed data in bytes.

inline size_t uncompressed_size() const#

Retrieve the uncompressed data size.

Returns:

The size of the uncompressed data in elements.

inline size_t num_chunks() const#

Retrieve the total number of chunks the channel stores.

Returns:

The number of chunks.

inline size_t block_size() const#

Retrieve the block size (in bytes) of the channel.

The internal blosc2 implementation reserves the right to change this value on compression, so this may not be the value you initially set.

Returns:

The block size (in bytes).

inline size_t chunk_size() const noexcept#

Retrieve the chunk size (in bytes) of the channel.

This is the size of every chunk except for the last one. The last chunk may be smaller, so to accurately capture its size you should use the overload taking a size_t chunk index.

Returns:

The chunk size (in bytes).

inline size_t chunk_elems() const#
inline size_t chunk_size(size_t chunk_index) const#

Retrieve the chunk size (in bytes) of the channel at the given chunk index.

Throws:

std::out_of_range – if the chunk index is invalid

Returns:

The chunk size (in bytes) at index chunk_index.

inline size_t chunk_elems(size_t chunk_index) const#
inline void get_chunk(std::span<T> buffer, size_t chunk_idx) const#

Retrieves and decompresses a chunk of data into the provided buffer.

This function retrieves the chunk at the given index from the internal schunk, decompresses it using the current decompression context, and stores the result in buffer.

Parameters:
  • buffer – A span representing the destination buffer to store the decompressed data. Must be large enough to hold one chunk of decompressed data.

  • chunk_idx – The index of the chunk to retrieve.

Throws:

std::runtime_error – if the internal schunk pointer is not initialized.

inline void set_chunk(std::span<T> buffer, size_t chunk_idx)#

Compresses and sets a chunk of data from the provided buffer at the specified index.

This function compresses the data in the provided buffer using the current compression context and writes it into the internal schunk at the given index.

Parameters:
  • buffer – A span representing the source data to be compressed and stored.

  • chunk_idx – The index of the chunk to overwrite or set with the compressed data.

Throws:

std::runtime_error – if the internal schunk pointer is not initialized.

inline std::vector<T> get_decompressed() const#

Get the decompressed data as a vector.

Throws:

std::runtime_error – if the internal schunk pointer is not initialized.

Returns:

A vector containing the decompressed data.

inline bool operator==(const channel<T> &other) const noexcept#

Equality operators, compares pointers to check for equality.

Public Static Functions

static inline channel zeros(size_t width, size_t height, enums::codec compression_codec = enums::codec::lz4, uint8_t compression_level = 9, size_t block_size = s_default_blocksize, size_t chunk_size = s_default_chunksize)#

Create a channel filled with zeros.

Generates a lazy channel which stores only a single value of type T per chunk. A chunk is only replaced by a compressed buffer once it is explicitly set, e.g. via set_chunk. This is especially memory efficient and should be the preferred way of generating an empty channel of which only some parts will be filled (e.g. sparse cryptomatte loading).

Parameters:
  • width – The width of the image channel.

  • height – The height of the image channel.

  • compression_codec – The compression codec to be used.

  • compression_level – The compression level (default is 9).

  • block_size – The size of the blocks stored inside the chunks, defaults to 32KB which is enough to comfortably fit into the L1 cache of most modern CPUs.

  • chunk_size – The size of each individual chunk, defaults to 4MB. Should be no larger than the expected image size for optimal performance and must be a multiple of sizeof(T).

Returns:

A channel instance with all values initialized to zero.

static inline channel zeros_like(const channel &other)#

Create a zero-initialized channel with the same shape and compression parameters as another channel.

Generates a lazy channel which stores only a single value of type T per chunk. A chunk is only replaced by a compressed buffer once it is explicitly set, e.g. via set_chunk. This is especially memory efficient and should be the preferred way of generating an empty channel of which only some parts will be filled (e.g. sparse cryptomatte loading).

Parameters:

other – The reference channel from which to copy shape and compression settings.

Returns:

A new channel instance with the same dimensions and compression settings as other, filled with zeros.

static inline channel full(size_t width, size_t height, T fill_value, enums::codec compression_codec = enums::codec::lz4, uint8_t compression_level = 9, size_t block_size = s_default_blocksize, size_t chunk_size = s_default_chunksize)#

Create a channel filled with a specific value.

Generates a lazy channel which stores only a single value of type T per chunk. A chunk is only replaced by a compressed buffer once it is explicitly set, e.g. via set_chunk. This is especially memory efficient and should be the preferred way of generating a uniformly filled channel of which only some parts will be overwritten (e.g. sparse cryptomatte loading).

Parameters:
  • width – The width of the image channel.

  • height – The height of the image channel.

  • fill_value – The value to fill the channel with.

  • compression_codec – The compression codec to be used.

  • compression_level – The compression level (default is 9).

  • block_size – The size of the blocks stored inside the chunks, defaults to 32KB.

  • chunk_size – The size of each individual chunk, defaults to 4MB. Should be no larger than the expected image size for optimal performance and must be a multiple of sizeof(T).

Returns:

A channel instance with all values initialized to fill_value.

static inline channel full_like(const channel &other, T fill_value)#

Create a channel filled with a specific value and the same shape and compression settings as another channel.

Generates a lazy channel which stores only a single value of type T per chunk. A chunk is only replaced by a compressed buffer once it is explicitly set, e.g. via set_chunk. This is especially memory efficient and should be the preferred way of generating a uniformly filled channel of which only some parts will be overwritten (e.g. sparse cryptomatte loading).

Parameters:
  • other – The reference channel from which to copy shape and compression settings.

  • fill_value – The value to fill the channel with.

Returns:

A new channel instance filled with fill_value and the same dimensions and compression settings as other.

class compressed_image.Channel#

A dynamically-typed compressed image channel with support for lazy-storage.

Provides compressed image data with access to shape, compression settings, and conversion to/from numpy arrays.

Supports the following np.dtypes as fill values:
  • np.float16

  • np.float32

  • np.uint8

  • np.int8

  • np.uint16

  • np.int16

  • np.uint32

  • np.int32

The data is stored as compressed chunks rather than as one large compressed array. Only parts of the data need to be decompressed and recompressed at a time, which makes operations very memory efficient.

__init__(self: compressed_image.lib64.compressed_image.Channel, data: numpy.ndarray, width: typing.SupportsInt, height: typing.SupportsInt, compression_codec: compressed_image.lib64.compressed_image.Codec = <Codec.lz4: 1>, compression_level: typing.SupportsInt = 9, block_size: typing.SupportsInt = 32768, chunk_size: typing.SupportsInt = 4194304) None#

Initialize a compressed channel from a numpy array with the given compression settings. This numpy array should be 1- or 2-dimensional with its overall size matching width * height.

Typically you do not need to modify any of the defaults for compression_codec, compression_level, block_size and chunk_size.

Parameters:
  • data – The input numpy array, must be 1- or 2-dimensional.

  • width – Width of the channel.

  • height – Height of the channel.

  • compression_codec – Compression codec to use (default: lz4).

  • compression_level – Compression level (default: 9).

  • block_size – Block size for compression (default: 32_768).

  • chunk_size – Chunk size for compression (default: 4_194_304).

block_size(self: compressed_image.lib64.compressed_image.Channel) int#
Returns:

Block size used for compression.

chunk_elems(*args, **kwargs)#

Overloaded function.

  1. chunk_elems(self: compressed_image.lib64.compressed_image.Channel) -> int

Returns:

Number of elements in a single chunk

  2. chunk_elems(self: compressed_image.lib64.compressed_image.Channel, chunk_index: typing.SupportsInt) -> int

Parameters:

chunk_index – Index of the chunk.

Returns:

Number of elements in the chunk at chunk_index.

chunk_size(*args, **kwargs)#

Overloaded function.

  1. chunk_size(self: compressed_image.lib64.compressed_image.Channel) -> int

Returns:

Chunk size (bytes) used for compression.

  2. chunk_size(self: compressed_image.lib64.compressed_image.Channel, chunk_index: typing.SupportsInt) -> int

Parameters:

chunk_index – Index of the chunk.

Returns:

Size of the specified chunk (bytes).

compressed_bytes(self: compressed_image.lib64.compressed_image.Channel) int#
Returns:

Size of the compressed data in bytes.

compression(self: compressed_image.lib64.compressed_image.Channel) compressed_image.lib64.compressed_image.Codec#
Returns:

The compression codec used.

compression_level(self: compressed_image.lib64.compressed_image.Channel) int#
Returns:

The compression level used.

property dtype#
Returns:

The numpy dtype of the underlying data.

static full(dtype: object, fill_value: object, width: typing.SupportsInt, height: typing.SupportsInt, compression_codec: compressed_image.lib64.compressed_image.Codec = <Codec.lz4: 1>, compression_level: typing.SupportsInt = 9, block_size: typing.SupportsInt = 32768, chunk_size: typing.SupportsInt = 4194304) compressed_image.lib64.compressed_image.Channel#

Create a new lazy channel initialized with the fill value. This is very efficient as it stores only a single value per chunk. Compressed data is only stored for a chunk once it is explicitly set via set_chunk.

Parameters:
  • dtype – numpy dtype for the data.

  • fill_value – The fill value for the data, may be a float or an integer

  • width – Image width.

  • height – Image height.

  • compression_codec – Compression codec.

  • compression_level – Compression level.

  • block_size – Block size for compression.

  • chunk_size – Chunk size for compression.

Returns:

A new compressed_image.Channel.

static full_like(other: compressed_image.lib64.compressed_image.Channel, fill_value: object) compressed_image.lib64.compressed_image.Channel#

Create a new channel with the same shape and dtype as another, filled with fill_value.

Parameters:
  • other – Another compressed_image.Channel to mimic.

  • fill_value – The fill value for the data, may be a float or an integer

Returns:

A new compressed_image.Channel.

get_chunk(*args, **kwargs)#

Overloaded function.

  1. get_chunk(self: compressed_image.lib64.compressed_image.Channel, chunk_index: typing.SupportsInt) -> numpy.ndarray

Get the decompressed data for a chunk. This represents a sub-part of the image which may be aligned to scanlines, but it doesn’t have to be. To compute the starting coordinate of a chunk you should query

height, width = channel.shape
start_x = chunk_index * channel.chunk_elems() % width
start_y = chunk_index * channel.chunk_elems() // width

From there you can compute the full extents of the chunk.

Parameters:

chunk_index – Index of the chunk to decompress.

Returns:

1D numpy array containing decompressed data.

  2. get_chunk(self: compressed_image.lib64.compressed_image.Channel, chunk_index: typing.SupportsInt, array: numpy.ndarray) -> None

Get the decompressed data for a chunk. This represents a sub-part of the image which may be aligned to scanlines, but it doesn’t have to be. To compute the starting coordinate of a chunk you should query

height, width = channel.shape
start_x = chunk_index * channel.chunk_elems() % width
start_y = chunk_index * channel.chunk_elems() // width

From there you can compute the full extents of the chunk.

This overload allows you to reuse a buffer rather than having to keep setting up a new one. To use it you should do something along the lines of the code example below. A channel's chunks are guaranteed to be the same size for all chunks except the last one, so we can reuse the same buffer for them.

buffer = np.ndarray((channel.chunk_elems(),), dtype=channel.dtype)
for i in range(channel.num_chunks() - 1):
    channel.get_chunk(i, buffer)
    # Modify chunk
    channel.set_chunk(i, buffer)

final_buffer = channel.get_chunk(channel.num_chunks() - 1)
# Modify chunk
channel.set_chunk(channel.num_chunks() - 1, final_buffer)

Parameters:
  • chunk_index – Index of the chunk to decompress.

  • array – The 1D numpy array to extract the data to, must be exactly chunk_elems(chunk_index) in size.

Returns:

None. The decompressed data is written into array.
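
The extents of a chunk follow from its starting offset and its number of elements. The sketch below assumes that offsets are counted in elements and that the data is laid out row-major. `chunk_extents` is a hypothetical helper; with the bindings, the inputs would presumably come from chunk_elems(), chunk_elems(chunk_index) and shape.

```python
def chunk_extents(chunk_index, chunk_elems, elems_in_chunk, width):
    """Return the (x, y) coordinates of the first and last pixel of a chunk.

    chunk_elems is the element count of a full chunk, elems_in_chunk the
    element count of this particular chunk (smaller for the last chunk).
    """
    first = chunk_index * chunk_elems
    last = first + elems_in_chunk - 1
    return (first % width, first // width), (last % width, last // width)


# A 4x3 image (12 elements) stored with 5 elements per chunk has three
# chunks holding 5, 5 and 2 elements.
middle = chunk_extents(1, chunk_elems=5, elems_in_chunk=5, width=4)
final = chunk_extents(2, chunk_elems=5, elems_in_chunk=2, width=4)
```

Here the middle chunk starts at x=1 on scanline 1 and ends at x=1 on scanline 2, so it is not aligned to scanlines.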

get_decompressed(self: compressed_image.lib64.compressed_image.Channel) numpy.ndarray#
Returns:

The full decompressed data as a 2D numpy array.

num_chunks(self: compressed_image.lib64.compressed_image.Channel) int#
Returns:

Number of chunks in the compressed channel.

set_chunk(self: compressed_image.lib64.compressed_image.Channel, chunk_index: SupportsInt, array: numpy.ndarray) None#

Replace a chunk’s contents with a new array. This array must hold exactly channel.chunk_elems(chunk_index) elements.

Parameters:
  • chunk_index – Index of the chunk to update. Must be less than num_chunks().

  • array – 1D numpy array to set onto the chunk.

property shape#
Returns:

Tuple of (height, width).

uncompressed_size(self: compressed_image.lib64.compressed_image.Channel) int#
Returns:

Number of elements in the uncompressed array.

update_nthreads(self: compressed_image.lib64.compressed_image.Channel, nthreads: SupportsInt, block_size: SupportsInt = 32768) None#

Update the number of threads used for compression/decompression as controlled by blosc2. This does not limit the number of threads for the rest of the compressed_image library.

Parameters:
  • nthreads – Number of threads to use.

  • block_size – Optional block size override.

static zeros(dtype: object, width: typing.SupportsInt, height: typing.SupportsInt, compression_codec: compressed_image.lib64.compressed_image.Codec = <Codec.lz4: 1>, compression_level: typing.SupportsInt = 9, block_size: typing.SupportsInt = 32768, chunk_size: typing.SupportsInt = 4194304) compressed_image.lib64.compressed_image.Channel#

Create a new channel filled with zeros. This is very efficient as it stores only a single value per chunk. Compressed data is only stored for a chunk once it is explicitly set via set_chunk.

Parameters:
  • dtype – numpy dtype for the data.

  • width – Image width.

  • height – Image height.

  • compression_codec – Compression codec.

  • compression_level – Compression level.

  • block_size – Block size for compression.

  • chunk_size – Chunk size for compression.

Returns:

A new compressed_image.Channel.

static zeros_like(other: compressed_image.lib64.compressed_image.Channel) compressed_image.lib64.compressed_image.Channel#

Create a new channel with the same shape and dtype as another, filled with zeros.

Parameters:

other – Another compressed_image.Channel to mimic.

Returns:

A new compressed_image.Channel.