Releases: MyrtleSoftware/vollo-sdk

Vollo SDK 28.0.0

12 Jun 15:02
5489d87

  • MMIO optimisations
  • Speed up model compilation
  • Optimise pointwise operations
  • Support partial updates with inputs that have more than 65536 elements
  • Support for transposed dynamic weight matmuls
  • Add functions to check whether a model was compiled with generate_state_reset = True
    • vollo_rt_model_is_resettable in Vollo RT C/C++ API
    • vollo_rt.VolloRTContext.model_is_resettable in Vollo RT Python bindings
    • vollo_compiler.Program.model_is_resettable in vollo-compiler
  • Add vollo_rt_add_device in Vollo RT C/C++ API as a more versatile version of vollo_rt_add_accelerator
    • vollo_rt.VolloRTContext.add_device in Vollo RT Python bindings
  • Handle SIGINT in long-running vollo_compiler functions
  • Support for non-streaming torch.nn.Conv1d
  • Support torch.sum/torch.Tensor.sum over multiple dimensions
  • Support passing dims to torch.Tensor.permute as variable-length arguments
  • Fix handling of exponentiation on infinities
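
The last item above does not describe the earlier behaviour. As a reference point only, the values that exponentiation is conventionally expected to produce at the infinities (IEEE 754 semantics) can be checked in plain Python. This is a sketch of the reference semantics, not of Vollo's implementation, and all variable names are illustrative.

```python
import math

INF = float("inf")

# Natural exponential at the infinities: e**(+inf) is +inf, e**(-inf) is 0.
exp_pos = math.exp(INF)
exp_neg = math.exp(-INF)

# Base-2 exponentiation follows the same pattern.
exp2_pos = 2.0 ** INF
exp2_neg = 2.0 ** -INF

print(exp_pos, exp_neg, exp2_pos, exp2_neg)
```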

Vollo SDK 27.1.0

28 Apr 12:18
9687cc4

  • State reset with vollo_rt_add_reset_job (for models compiled with generate_state_reset = True)
  • Support for non-zero LSTM initial state
  • Support GELU (exact and tanh approximation)
  • Faster SiLU
  • Improve operator fusion in some cases
  • Fix a bug handling strides in slices
  • Fix HwConfig info reported by vollo-tool read-hw-config for Artena bitstreams
  • Standardize naming of vollo_compiler.Config presets
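
For readers unfamiliar with the two GELU variants named above, the following sketch writes out the standard definitions in plain Python, alongside SiLU. It assumes "exact" and "tanh approximation" mean the usual formulas (the ones PyTorch selects with approximate='none' and approximate='tanh' in torch.nn.GELU). The release notes do not say how Vollo computes them, and the function names here are illustrative.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def silu(x: float) -> float:
    """SiLU: x * sigmoid(x). Fine for moderate x; exp(-x) overflows for very negative x."""
    return x / (1.0 + math.exp(-x))


# Compare the three functions at a few sample points.
for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"{x:+.1f}  exact={gelu_exact(x):+.6f}  "
          f"tanh={gelu_tanh(x):+.6f}  silu={silu(x):+.6f}")
```

The two GELU variants agree closely over this range, so the choice between them is mostly about matching the reference model exactly.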

Vollo SDK 27.0.1

16 Apr 15:34
3953be9

  • Reduce tensor RAM usage in certain models
  • Add experimental vollo_rt_prepare_raw_buffer_output_completion and vollo_rt_check_raw_buffer_output_completion
    functions to Vollo RT for detecting completion from a thread other than the one holding the vollo_rt_context

Vollo SDK 27.0.0

02 Apr 15:57
1a742b3

  • Add support for Silicom Artena
  • IO optimization when using MMIO
  • Add experimental support for specifying which cores to allocate PyTorch operations to using vollo_torch.CorePartition
  • Add support for specifying which cores to allocate models to in a multi-model program by passing core_indices to vollo_compiler.ProgramBuilder.add_nnir
  • Optimize sigmoid and SiLU activation functions
  • Improve spaced latency for some stateful models that use dynamic weights
  • Reduce tensor RAM usage of state in stateful models
  • vollo_torch.Fp8Weights now errors if used on operations which require bf16 weights, such as dynamic weights

Vollo SDK 26.2.0

09 Mar 09:34
4320994

  • Support for FP8 (E4M3) weights on Versal devices using vollo_torch.Fp8Weights context manager
  • Support for torch.exp, torch.exp2 at FP32 precision using vollo_torch.Fp32Activations context manager
  • Support for matmul operations where both inputs are dynamic (non-constant) tensors, when using allow_dynamic_weights
  • Optimize accumulations on Versal devices, improving performance of layers such as LayerNorm and RMSNorm, and Linear layers with small output features
  • Add support for multiple state tensors in vollo_torch.nn.Scan
  • Add allow_unserializable flag to vollo_compiler.NNIR.to_program for testing programs which can't be serialized
  • Fix multi-model programs that use dynamic weights
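
As background for the FP8 (E4M3) item above, the sketch below decodes a single E4M3 byte in plain Python. It assumes the widely used OCP 8-bit floating point E4M3 encoding: 1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits, no infinities, and one NaN bit pattern per sign. The release notes do not state which E4M3 variant Vollo uses, and decode_e4m3 is a hypothetical helper, not part of the SDK.

```python
import math


def decode_e4m3(byte: int) -> float:
    """Decode one byte as FP8 E4M3 (assumed OCP variant: bias 7, no infinities)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in the range 0..255")
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0x0F
    mantissa = byte & 0x07
    # All-ones exponent and mantissa is the only NaN pattern in this variant.
    if exponent == 0x0F and mantissa == 0x07:
        return math.nan
    # Subnormals: no implicit leading one, fixed exponent of 1 - bias.
    if exponent == 0:
        return sign * (mantissa / 8.0) * 2.0 ** -6
    # Normal numbers: implicit leading one.
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)


# Largest finite value, one, and the smallest positive subnormal.
print(decode_e4m3(0x7E), decode_e4m3(0x38), decode_e4m3(0x01))
```

With only 3 mantissa bits, weights stored this way are coarse, which fits the later 27.0.0 note that operations needing bf16 weights are rejected under vollo_torch.Fp8Weights.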

Vollo SDK 26.1.2

06 Feb 15:50
124f527

  • Optimize handling of biases in Linear layers when using allow_dynamic_weights
  • Speed up model compilation
  • Add random_seeds argument to vollo_compiler.NNIR.to_program

Vollo SDK 26.1.1

05 Feb 15:02
7c5a606

  • Fix V80LL initialization bitstream so that the V80LL memory can be flashed over JTAG
  • Optimize handling of biases in Linear layers when using allow_dynamic_weights
  • Add support for multiple outputs to vollo_torch.nn.Scan
  • Speed up loading .vollo programs

Vollo SDK 26.1.0

30 Jan 12:44
8a8f971

  • Fix DMA bug introduced on V80 in Vollo SDK 26.0.0
  • Add Alveo V80LL bitstream and vollo_compiler.Config.v80ll_c6b32 hardware config
  • Add support for Linear layers where the contracted dimension is not the data dimension via the allow_dynamic_weights flag for vollo_compiler.NNIR.to_program
  • Add support for multiple inputs to vollo_torch.nn.Scan
  • Add support for indexing with negative indices in: torch.stack, torch.sum, torch.permute, torch.squeeze, torch.unsqueeze
  • Add support for torch.nn.functional.linear
  • Add optional bias argument to vollo_torch.nn.PaddedConv1d

Vollo SDK 26.0.2

28 Jan 11:25
8a8f971

  • Update example/partial_update.c to allow multiple inputs and mixed precision
  • Fix bug in FP32/multi-input partial updates
  • Speed up model compilation

Vollo SDK 26.0.1

22 Jan 14:18
400c4eb

  • Make vollo-tool license use the system's CA certificates
  • Fix bug in FP32 partial updates