Releases · MyrtleSoftware/vollo-sdk · GitHub

12 Jun 15:02

acairncross

Vollo SDK 28.0.0 Latest

MMIO optimisations
Speed up model compilation
Optimise pointwise operations
Support partial updates with inputs that have more than 65536 elements
Support for transposed dynamic weight matmuls
Add functions to check if a model was compiled with generate_state_reset = True
- vollo_rt_model_is_resettable in Vollo RT C/C++ API
- vollo_rt.VolloRTContext.model_is_resettable in Vollo RT Python bindings
- vollo_compiler.Program.model_is_resettable in vollo-compiler
Add vollo_rt_add_device in Vollo RT C/C++ API as a more versatile version of vollo_rt_add_accelerator
- vollo_rt.VolloRTContext.add_device in Vollo RT Python bindings
Handle SIGINT in long-running vollo_compiler functions
Support for non-streaming torch.nn.Conv1d
Support torch.sum/torch.Tensor.sum over multiple dimensions
Support passing dims as variable-length arguments to torch.Tensor.permute
Fix handling of exponentiation on infinities
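
For reference, the IEEE 754 limits of exponentiation at the infinities can be checked on the host with Python's standard library. This is only a sketch of the reference values a fix like the one above would presumably target. It runs on the CPU, not on Vollo, and the release notes do not say which operations or inputs were affected.

```python
import math

INF = float("inf")

# Reference values computed on the host CPU (not by Vollo).
cases = {
    "exp(+inf)": math.exp(INF),    # grows without bound: +inf
    "exp(-inf)": math.exp(-INF),   # decays to zero: 0.0
    "2 ** +inf": 2.0 ** INF,       # +inf
    "2 ** -inf": 2.0 ** -INF,      # 0.0
}

for name, value in cases.items():
    print(f"{name} = {value}")
```

Comparing a compiled model's output against values like these on inputs containing infinities is one way to confirm the behaviour after upgrading.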

28 Apr 12:18

basile-henry

Vollo SDK 27.1.0

State reset with vollo_rt_add_reset_job (for models compiled with generate_state_reset = True)
Support for non-zero LSTM initial state
Support GELU (exact and tanh approximation)
Faster SiLU
Improve operator fusion in some cases
Fix a bug handling strides in slices
Fix the HwConfig info reported by vollo-tool read-hw-config for Artena bitstreams
Standardize naming of vollo_compiler.Config presets
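
The two GELU variants mentioned above differ only slightly. The sketch below writes out the standard definitions in plain Python (the exact form uses the error function, the approximation uses tanh) to show how close they are. The function names gelu_exact and gelu_tanh are illustrative and are not part of the Vollo SDK.

```python
import math

def gelu_exact(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))

# Largest gap between the two variants over a few sample inputs.
xs = [-3.0, -1.0, 0.0, 0.5, 2.0]
max_gap = max(abs(gelu_exact(x) - gelu_tanh(x)) for x in xs)
print(f"max |exact - tanh| over samples: {max_gap:.2e}")
```

In PyTorch the variant is selected with the approximate argument of torch.nn.GELU ('none' or 'tanh').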

16 Apr 15:34

alex-pay

Vollo SDK 27.0.1

Reduce tensor RAM usage in certain models
Add experimental vollo_rt_prepare_raw_buffer_output_completion and vollo_rt_check_raw_buffer_output_completion functions to Vollo RT for completion detection from a different thread to the one holding the vollo_rt_context

02 Apr 15:57

acairncross

Vollo SDK 27.0.0

Add support for Silicom Artena
IO optimization when using MMIO
Add experimental support for specifying which cores to allocate PyTorch operations to using vollo_torch.CorePartition
Add support for specifying which cores to allocate models to in a multi-model program by passing core_indices to vollo_compiler.ProgramBuilder.add_nnir
Optimize sigmoid and SiLU activation functions
Improve spaced latency for some stateful models that use dynamic weights
Reduce tensor RAM usage of state in stateful models
vollo_torch.Fp8Weights now errors if used on operations which require bf16 weights, such as dynamic weights
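
As a reminder of what the optimized activations compute, here is a minimal reference sketch of sigmoid and SiLU in plain Python. It is a host-side definition for comparison only and says nothing about how Vollo implements them. The function names are illustrative.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # x < 0, so z is in (0, 1) and cannot overflow
    return z / (1.0 + z)

def silu(x: float) -> float:
    """SiLU (also called swish): x * sigmoid(x)."""
    return x * sigmoid(x)

for x in (-1000.0, -1.0, 0.0, 1.0, 1000.0):
    print(f"x={x:>8}: sigmoid={sigmoid(x):.6f} silu={silu(x):.6f}")
```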

09 Mar 09:34

acairncross

Vollo SDK 26.2.0

Support for FP8 (E4M3) weights on Versal devices using vollo_torch.Fp8Weights context manager
Support for torch.exp, torch.exp2 at FP32 precision using vollo_torch.Fp32Activations context manager
Support for matmul operations where both inputs are dynamic (non-constant) tensors, when using allow_dynamic_weights
Optimize accumulations on Versal devices, improving performance of layers such as LayerNorm and RMSNorm, and Linear layers with small output features
Add support for multiple state tensors in vollo_torch.nn.Scan
Add allow_unserializable flag to vollo_compiler.NNIR.to_program for testing programs which can't be serialized
Fix multi-model programs that use dynamic weights

06 Feb 15:50

alex-pay

Vollo SDK 26.1.2

Optimize handling of biases in Linear layers when using allow_dynamic_weights
Speed up model compilation
Add random_seeds argument to vollo_compiler.NNIR.to_program

05 Feb 15:02

acairncross

Vollo SDK 26.1.1

Fix V80LL initialization bitstream so that the V80LL memory can be flashed over JTAG
Optimize handling of biases in Linear layers when using allow_dynamic_weights
Add support for multiple outputs to vollo_torch.nn.Scan
Speed up loading .vollo programs

30 Jan 12:44

acairncross

Vollo SDK 26.1.0

Fix DMA bug introduced on V80 in Vollo SDK 26.0.0
Add Alveo V80LL bitstream and vollo_compiler.Config.v80ll_c6b32 hardware config
Add support for Linear layers where the contracted dimension is not the data dimension via the allow_dynamic_weights flag for vollo_compiler.NNIR.to_program
Add support for multiple inputs to vollo_torch.nn.Scan
Add support for indexing with negative indices in: torch.stack, torch.sum, torch.permute, torch.squeeze, torch.unsqueeze
Add support for torch.nn.functional.linear
Add optional bias argument to vollo_torch.nn.PaddedConv1d
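
Negative indices in the operations listed above follow the usual PyTorch convention of counting dimensions from the end. A minimal sketch of that convention in plain Python, using a hypothetical helper normalize_dim that is not part of the SDK:

```python
def normalize_dim(dim: int, ndim: int) -> int:
    """Map a possibly negative dimension index onto the range [0, ndim)."""
    if not -ndim <= dim < ndim:
        raise IndexError(f"dim {dim} out of range for {ndim} dimensions")
    return dim % ndim

# For a 3-D tensor, -1 refers to the last dimension and -3 to the first.
print(normalize_dim(-1, 3))  # 2
print(normalize_dim(-3, 3))  # 0

# torch.stack and torch.unsqueeze insert a new dimension, so in PyTorch
# their indices wrap against ndim + 1 rather than ndim.
print(normalize_dim(-1, 3 + 1))  # 3
```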

28 Jan 11:25

alex-pay

Vollo SDK 26.0.2

Update example/partial_update.c to allow multiple inputs and mixed precision
Fix bug in FP32/multi-input partial updates
Speed up model compilation

22 Jan 14:18

acairncross

Vollo SDK 26.0.1

Make vollo-tool license use the system's CA certificates
Fix bug in FP32 partial updates
