Commits · 4.3.0.rc1 · artificial-intelligence / ethos-u / Vela

May 15, 2025

MLBEDSW-10683: Port 1x1 IFM resize for Ethos-U55/U65 · fddbad1d

William Isaksson authored Apr 10, 2025 and

Fredrik Svedberg committed May 15, 2025



Adds support for 1x1 IFM resizes to Regor.

Change-Id: Ia16bc65748ac5518291e1e4a78cd3b6350c7b0d5
Signed-off-by: William Isaksson <william.isaksson@arm.com>


C++20 compatibility fix for regor logging · a37d6a4c

Limin Tang authored May 08, 2025 and

Tim Hall committed May 15, 2025

The logging dependency fmt::format takes an fstring as input by default, which must be a
compile-time constant expression when built with C++20 (enforced by the C++20 specifier
consteval). To make logging work with C++20, the string input must first be converted
to a runtime-evaluated string expression via fmt::runtime.

Also fix minor bugs in tflite_model_semantics.cpp that cause the build to fail with C++20.

Signed-off-by: Limin Tang <limintang@meta.com>


MLBEDSW-10776 Don't serialize intermediates vector if empty · 6fa0bbad

Jacob Bohlin authored May 15, 2025

Most operations do not have any intermediate tensors. In this case there
is no need to create a vector in the flatbuffer file, as even empty vectors
contain a 4-byte length header. With this patch the intermediates vector
is only created if any intermediate tensors are present.

Change-Id: I505ebe8c17a577eee2361050d18715207555409d
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
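
The size saving described above can be illustrated with a minimal Python sketch. It is a simplified model of a length-prefixed vector, not the FlatBuffers API or the Regor writer; `encode_vector` and `encode_intermediates` are hypothetical names used only for illustration.

```python
import struct


def encode_vector(values):
    """Encode int32 values as a 4-byte little-endian length plus the elements."""
    header = struct.pack("<I", len(values))
    body = b"".join(struct.pack("<i", v) for v in values)
    return header + body


def encode_intermediates(intermediates):
    """Return the encoded vector, or None when there are no intermediates."""
    if not intermediates:
        return None  # field stays absent, so no length header is written
    return encode_vector(intermediates)
```

In this model an empty vector still costs the 4-byte header, whereas leaving the field absent costs nothing for the vector itself.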


May 14, 2025

MLBEDSW-9326: Revert FlatBuffers version · 5c42e8cc

William Isaksson authored May 14, 2025



Reverts version of FlatBuffers to v24.3.25.

This caused no changes to the schema-generated files.

Change-Id: I3791384206d5eb8fac7f37e505d386b5ea8e594b
Signed-off-by: William Isaksson <william.isaksson@arm.com>


MLBEDSW-9408: Add full REDUCE MIN/MAX/SUM decomposition · 981cc442

Johan Gunnarsson authored May 09, 2025



This implements full decomposition of TOSA REDUCE MIN/MAX/ANY/ALL
of the reduced axis.

* Extend decomposition to do blockwise reduce operations in the
  reduced axis.
* Add ReduceSum/ReduceMinMax to constraints.
* Move reshaping of reduce ops into decomposition.
* Move creating a reduce ops kernel to ConvertAttributes.
* Remove RewriteReduceMinMaxAnyAll.

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: Ic0873dab1c3c5344045590d1c11724986b896120


MLBEDSW-10656: Fix missing header compilation failure · 1bfdfd3e

Tim Hall authored May 08, 2025 and

Fredrik Svedberg committed May 14, 2025



 - Included the missing unordered_map header.

Change-Id: Ife560a9645526fc251e18db317719710070e9d82
Signed-off-by: Tim Hall <tim.hall@arm.com>


May 13, 2025

MLBEDSW-10776 Revert graph traversal change · 319b4610

Jacob Bohlin authored May 12, 2025 and

Rickard Bolin committed May 13, 2025



The graph traversal was modified in MLBEDSW-8926 to traverse tensor
writers left-to-right instead of right-to-left. This is required to
ensure correct execution order for LSTM.

This change reverts the graph traversal back to right-to-left in the
general case and left-to-right will only be used on graphs which contain
persistent tensors, in order to target LSTM operators.

Change-Id: Ibe13af0cf952450cff253ff2e44ee8b96068583a
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
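
The order selection described above could be sketched as follows. This is only an illustration in Python, assuming that the list of writers is ordered left to right; `writer_visit_order` and its parameters are hypothetical and do not come from the Regor sources.

```python
def writer_visit_order(writers, has_persistent_tensors):
    """Return the order in which a tensor's writers are visited."""
    if has_persistent_tensors:
        # Graphs with persistent tensors (the LSTM case): left-to-right.
        return list(writers)
    # General case: right-to-left.
    return list(reversed(writers))
```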


MLBEDSW-10635: ReinterpretCast operator and cast to int64 · 498f957b

Max Bergfelt authored Apr 10, 2025 and

Johan Alfvén committed May 13, 2025

Implemented a ReinterpretCast operator, which can be used to reinterpret tensors with different data types and sizes. Additionally added support for casts to int64, which are not supported in hardware, by replacing the operation with 4 sequential cast and reinterpret operations.

Change-Id: Ie7032cd5384c17a766dd17034cd59871bb1a833d
Signed-off-by: Max Bergfelt <max.bergfelt@arm.com>
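
The commit message does not list the four operations, so the following Python sketch only illustrates the general idea of reinterpreting memory as a different type and of building an int64 from 32-bit words. The helper names and the little-endian, low-word-first layout are assumptions made for this example.

```python
import struct


def reinterpret(values, src_fmt, dst_fmt):
    """Reinterpret the raw little-endian bytes of `values` as another type."""
    raw = struct.pack(f"<{len(values)}{src_fmt}", *values)
    count = len(raw) // struct.calcsize(f"<{dst_fmt}")
    return list(struct.unpack(f"<{count}{dst_fmt}", raw))


def widen_int32_to_int64(values):
    """Widen int32 to int64 by emitting a low word and a sign word."""
    words = []
    for v in values:
        words.append(v)        # low 32 bits (assumed to come first)
        words.append(v >> 31)  # high 32 bits: 0 for v >= 0, -1 for v < 0
    return reinterpret(words, "i", "q")
```

Here `reinterpret` changes neither the bytes nor their order; only the element type and element count differ between input and output.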


MLBEDSW-10779 Supported ops checks: Reject negative zero points for unsigned · e6783a07

Jacob Bohlin authored May 13, 2025 and Johan Alfvén committed May 13, 2025

Change-Id: I19c16a6a06278d3a0301d763482e0ab9aaee4472
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>

May 12, 2025

MLBEDSW-10756 Fix issue where HLCParameters were copied incorrectly · 133a265c

Jacob Bohlin authored May 12, 2025

Change-Id: I4a40cf56a37f10e95b52f93d2b60851bc38f5aaf
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>

MLBEDSW-10681 Implement zero-point correction required for LSTM int16 · e989a504

Jacob Bohlin authored May 07, 2025 and Fredrik Svedberg committed May 12, 2025

Change-Id: I7bcef5bc787a7e00d7dee820a79bc451e1a97494
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>

MLBEDSW-10781: Fix transpose with LUT assert · bdc6dc3d

Philip Hall authored May 12, 2025 and

Fredrik Svedberg committed May 12, 2025



Ethos-U55 transpose with LUT generates an assert as a result
of default-initialised values in the block config.
This commit defers the LUT insertion until after a suitable
block config is available.

Signed-off-by: Philip Hall <philip.hall@arm.com>
Change-Id: If2137253696d26f996605ec9bf2fab1a1c69c229


May 09, 2025

MLBEDSW-10533: Don't fuse transpose to an op with OFM slice · 72c32a33

Johan Gunnarsson authored May 06, 2025



* Don't fuse transpose to an op with OFM slice
* Also, when fusing a transpose, the primary op should inherit the
  fused op's OFM slice. Otherwise we might end up with different
  shapes on OFM slice and OFM and in that case OFM slice shape will
  be used later on.

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: Idb25cc3a53f0b52dcc59cda1aefa31d9d19a850f


May 08, 2025

MLBEDSW-8926 Fix WIN32 build issue · 96e9a932

Jacob Bohlin authored May 06, 2025 and

Fredrik Svedberg committed May 08, 2025



The problem was a code snippet that assumed `uintptr_t` was more than 32 bits wide.

Change-Id: I3bac0b0760b0f111d6ec6a9a1e955e0c19f3398c
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>


MLBEDSW-10764 Invalidate buffer hash when writing new values · c6b4acd4

Jacob Bohlin authored May 07, 2025 and Fredrik Svedberg committed May 08, 2025

Change-Id: Icdef37a712c708cc75418f8aa4cd8033f3f70d5c
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>

MLBEDSW-10756 Fix chaining issue with fusing transpose with activation · c932624f

Jacob Bohlin authored May 02, 2025 and Fredrik Svedberg committed May 08, 2025

Change-Id: Iec466c83c207822c48b2c1746fe39f46f4541a72
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>

MLBEDSW-10631: Convert non-constant int4 tensors to int8 · 3406966a

Rickard Bolin authored May 08, 2025



As with int48 tensors, we should unpack the constant tensors and
change the data type for the non-constant ones.

Change-Id: I9c3c69bd2d1e41bfa32e959c35325b1280f626e2
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
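
Unpacking a constant int4 tensor into int8 values might look like the Python sketch below. The packing convention (two values per byte, low nibble first, two's complement) is an assumption for illustration and is not stated in the commit; `unpack_int4` is a hypothetical helper.

```python
def unpack_int4(packed):
    """Unpack bytes holding two signed 4-bit values each into a list of ints."""
    values = []
    for byte in packed:
        # Assumed order: low nibble first, then high nibble.
        for nibble in (byte & 0x0F, byte >> 4):
            # Sign-extend the 4-bit two's complement value.
            values.append(nibble - 16 if nibble >= 8 else nibble)
    return values
```

Every unpacked value lies in the range -8 to 7 and therefore fits in an int8 element.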


May 07, 2025

MLBEDSW-10631: Extend type check to cover unpacked 48bit values · 393de444

Rickard Bolin authored May 07, 2025

Change-Id: I2758f97c6cc328aedf38edbc8017afacdf3a0fdb
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>

MLBEDSW-10240: Implement missing transpose LUT fusing · f72c9110

Philip Hall authored Apr 23, 2025 and

Fredrik Svedberg committed May 07, 2025



For Ethos-U55 it is, in some cases, possible to fuse
a LUT onto the transpose.

Signed-off-by: Philip Hall <philip.hall@arm.com>
Change-Id: I4dfce085d83b78d80ae52f815757082058555c21


MLBEDSW-10749: MakeMemCopy and MakeTransposeOp tensor dtype correction · 8cac2b9a

Max Bergfelt authored May 02, 2025

Fixed incorrect usage of the SchedulerConnection dtype in the MakeTransposeOp function when creating Transpose ops for Depthwise Conv2D decomposition. Also reworked the MakeMemCopy function in a similar way.

Change-Id: I4858b8810840a6082afcc0859a997d04a0fac60b
Signed-off-by: Max Bergfelt <max.bergfelt@arm.com>


May 06, 2025

MLBEDSW-10754: Make kernel object usage more consistent · a25debfc

Philip Hall authored May 01, 2025 and

Fredrik Svedberg committed May 06, 2025



This commit rationalises the use of the kernel object to
make it simpler to use the unit kernel, and ensures that
called functions have the correct kernel constness to
prevent accidental modification.

Signed-off-by: Philip Hall <philip.hall@arm.com>
Change-Id: I27c63feb8e876f359b434a916bed50f41c97f411


MLBEDSW-10657: Align MAC count calculation with Vela · 1756fef8

Johan Gunnarsson authored May 05, 2025



Don't include subops that are activations in the performance summary.

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: I9e1e97281c862d3e3ce873aac89f18ff57fea46c


May 05, 2025

MLBEDSW-9758 TOSA MaxPool decomposition · 160abf94

Bjorn Davidsson authored Nov 06, 2024 and

Rickard Bolin committed May 05, 2025

Add support for stride > 3 and large FM by adding a decomposition function
for MaxPool, calling into the block and stride decomposition functions.

Change-Id: Id0e68c49dd89a807108f59fcd79cfe1b54d47e97
Signed-off-by: Björn Davidsson <bjoern.davidsson@arm.com>


MLBEDSW-9759: Decompose batch for AvgPool · ab36eec4

Bjorn Davidsson authored Nov 20, 2024 and

Rickard Bolin committed May 05, 2025



- Add decomposition for AvgPool, handling batch > 1.
- Convert padding to offsets for AvgPool and MaxPool
  decomposition.

Change-Id: I47faaaddedb0295abc084e3e966daa53817c2586
Signed-off-by: Björn Davidsson <bjoern.davidsson@arm.com>


May 02, 2025

MLBEDSW-10747: Fix input tensor check in TOSA reader · cea0885a

Rickard Bolin authored May 02, 2025 and

Fredrik Svedberg committed May 02, 2025



Reader expected all operators to have input tensors, but CONST operator
does not.

Change-Id: I696d9a36cbc179988ded61f9b020a46b40df89ef
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>


MLBEDSW-9760 TOSA Conv3D Decomposition · bbbeb030

Fredrik Svedberg authored Apr 30, 2025



Final part of TOSA Conv3D decomposition - bias broadcast
decomposition of large tensors.

Change-Id: Ib2e278266f431fbfddffd7aef850a32fd63c3e17
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>


MLBEDSW-8926 Port LSTM to Regor · 92013dad

Jacob Bohlin authored Apr 25, 2025



* Ported lowering of TFLite::UnidirectionalSequenceLstm to Regor.
* Added reading of TFLite intermediate tensors. Added a new
  TensorUsage::Intermediate for these tensors.
* Added logic to allocate tensors which point to the same buffer to the
  same address, enabling this to be controlled in GraphIR.
* Added an optional Tag to the Buffer hash function in order to
  differentiate between multiple empty buffers which stem from different
  TFLite variable tensors.
* Added missing rescaling for Sigmoid and Tanh when fused with
  Elementwise Add, Sub or Mul.
* Added some limitations to persistent tensors:
  - They are now required to be in linear format.
  - They cannot share memory with non-persistent tensors.
* Made a small modification to graph traversal so that partial writes
  are processed in the order they are added to the graph.
* Added supported operator checks for UnidirectionalSequenceLstm.

Change-Id: I6bd08822a41dca48b3aa8091b07747327b37d68f
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>


MLBEDSW-9291: Support basic resize bilinear for Ethos-U55/U65 · a3fc80da

William Isaksson authored Mar 18, 2025 and

Fredrik Svedberg committed May 02, 2025



- Adds limited support for resize bilinear with align corners and half
  pixel centers both set to false, and for the analogous TOSA case.

Change-Id: I013759e0cb23f3ebc9037d3b41b0f449256a673a
Signed-off-by: William Isaksson <william.isaksson@arm.com>


May 01, 2025

MLBEDSW-9142 Add support for fusing activation functions to chains · a64a8eff

Jacob Bohlin authored Apr 23, 2025 and

Johan Alfvén committed May 01, 2025



* Allow fusing multiple activations to chains as long as they are not
  consecutive.
* Allow fusing of int16 Sigmoid and Tanh.
* Utilize all 8 LUT slots rather than reusing slot 0 every time.
* Extended live-range fusing to allow reusing IFM/OFM across opgroups.

Change-Id: Idce158551b23f4ea3824e662a045759d4e609a60
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
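
One way to use all eight LUT slots instead of always picking slot 0 is a simple round-robin allocator, sketched below in Python. The round-robin policy and the `LutSlotAllocator` class are assumptions for illustration; the commit does not describe the actual allocation strategy.

```python
class LutSlotAllocator:
    """Hand out LUT slot indices in round-robin order (hypothetical policy)."""

    def __init__(self, num_slots=8):
        self.num_slots = num_slots
        self._next = 0

    def allocate(self):
        """Return the next slot index and advance, wrapping after the last slot."""
        slot = self._next
        self._next = (self._next + 1) % self.num_slots
        return slot
```

With eight slots, consecutive LUTs land in slots 0 through 7 before slot 0 is reused.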


Apr 30, 2025

MLBEDSW-10631: Modify CONST error check · 4472430d

Rickard Bolin authored Apr 30, 2025



The CONST operator error check assumed an exact size match between the
buffer and the OFM tensor storage size in bytes. That failed because TOSA
tensors are 8-byte aligned. Changed to check that the buffer has enough
elements to fill the OFM instead.

Change-Id: Ice7ddc04707d81e27f5f26317ab12fe5640008d9
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
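
The difference between the two checks can be shown with a small Python sketch. The function names are hypothetical, and the example assumes that alignment padding is simply appended to the buffer.

```python
def align_up(size, alignment):
    """Round size up to the next multiple of alignment."""
    return (size + alignment - 1) // alignment * alignment


def exact_size_check(buffer_bytes, ofm_elements, element_bytes):
    """Old-style check: buffer must match the OFM storage size exactly."""
    return buffer_bytes == ofm_elements * element_bytes


def enough_elements_check(buffer_bytes, ofm_elements, element_bytes):
    """New-style check: buffer must hold at least as many elements as the OFM."""
    return buffer_bytes // element_bytes >= ofm_elements
```

For example, three int16 elements need 6 bytes, but an 8-byte aligned buffer holds 8 bytes. The exact check rejects that buffer while the element-count check accepts it.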


Apr 29, 2025

MLBEDSW-10662 Fix RewriteFullyConnected · 34eec942

Fredrik Svedberg authored Apr 25, 2025



Added a check so that FullyConnected-like Conv2Ds are not rewritten if
they have padding.

Change-Id: Id9273c38d4ffb2270326e41662f8c791666dd78f
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>


MLBEDSW-10744: Fix DepthwiseConv2D with batch + depth multiplier · da7b49c2

Johan Gunnarsson authored Apr 29, 2025 and

Johan Alfvén committed Apr 29, 2025

DepthwiseConv2d with batch and depth multiplier is decomposed into
multiple non-batched DepthwiseConv2d followed by Transpose, then a
MemoryCopy to copy the result into the right place in the original
OFM. This last MemoryCopy didn't have the correct IFM shape.

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: I7fe23dbd0bbbb3aca6885153e76a2d3fc8ad7724


MLBEDSW-10594: Fix OFM quantization for decomped DepthwiseConv2D · 72e09fda

Johan Gunnarsson authored Apr 28, 2025 and

Fredrik Svedberg committed Apr 29, 2025

When DepthwiseConv2D has depth multiplier >1, it's decomposed into
one op per depthwise multiplier. Weights and biases are sliced as
expected, but the per-channel OFM quantization also must be sliced.

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: Ie7134987de9bb77dad99eb2322a2c8615e3e5471
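
Slicing per-channel values for one decomposed op might look like the Python sketch below. It assumes the common depthwise layout where output channel `ic * depth_multiplier + m` belongs to input channel `ic` and multiplier index `m`; the layout and the helper name are assumptions, not taken from the commit.

```python
def slice_per_channel(values, depth_multiplier, m):
    """Pick the per-channel entries that belong to multiplier index m."""
    if not 0 <= m < depth_multiplier:
        raise ValueError("multiplier index out of range")
    # Under the assumed layout, every depth_multiplier-th entry starting at m.
    return values[m::depth_multiplier]
```

The same stride applies to weights, biases and per-channel OFM scales, so all three stay aligned after slicing.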


MLBEDSW-10635: Added DataType field to SchedulerConnection · f74d4f0c

Max Bergfelt authored Apr 25, 2025

Added support for storing the DataType on the SchedulerConnection so that different operators can interpret the memory of a tensor as a different type than the one set on the tensor.

Change-Id: Ie8e51030460390ef3a0681d5f06134c7a5acf017
Signed-off-by: Max Bergfelt <max.bergfelt@arm.com>


MLBEDSW-10631: Rewrite CONST op · 72878c5b

Rickard Bolin authored Apr 11, 2025



TOSA CONST operators do not have any input tensors, only an attribute
containing a constant tensor and an output tensor. Until now, we've
simply ignored the CONST operators in the TOSA reader since we're only
interested in the constant output tensor. This causes issues in the case
of a single CONST operator network, where ignoring CONST results in an
optimized network without any operators, which our raw mode writer
cannot handle.

This patch replaces CONST operators with IDENTITY operators, which get
cleaned up during RemoveReshape to either be removed completely or be
converted to a memory copy when it is the last operator remaining in the
network.

Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: Icf18f0bad27a1220073574ffa2626f8838b4df69


Apr 28, 2025

MLBEDSW-9758: Fix negative offsets in DecomposeForStrides · 14d6e95d

Alexander Bengtsson authored Apr 24, 2025 and

Alexander Bengtsson committed Apr 28, 2025



Fix handling of negative offsets when rejecting slices in stride
decomposition.
- Treat negative offsets as TOP/LEFT padding.
- Adjust coordinate with IfmStride to find first positive coordinate.
- Determine whether first positive coordinate results in a volume.

Change-Id: I2ecd5a5547f13f9efbc280339ab0e076bdce67f9
Signed-off-by: Alexander Bengtsson <Alexander.Bengtsson@arm.com>
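
The coordinate adjustment listed above can be sketched in one dimension as follows. This Python example is an interpretation of the commit message, not the Regor code: it treats coordinate 0 as valid, and the function names are hypothetical.

```python
def first_valid_coordinate(offset, stride):
    """Step a negative offset forward by whole strides until it is >= 0."""
    if offset >= 0:
        return offset
    steps = (-offset + stride - 1) // stride  # ceil(-offset / stride)
    return offset + steps * stride


def slice_has_volume(offset, stride, ifm_extent):
    """A slice contributes data only if its first valid coordinate is inside the IFM."""
    return first_valid_coordinate(offset, stride) < ifm_extent
```

For instance, an offset of -3 with stride 2 visits -3, -1, 1, so the first coordinate inside the feature map is 1.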


Apr 25, 2025

MLBEDSW-10022: Add a unit test for passthrough · e652266f

Johan Gunnarsson authored Apr 08, 2025 and

Fredrik Svedberg committed Apr 25, 2025



This tests passthrough using the following steps:

1. Generate a TFLite network as a flatbuffer.
2. Pass the flatbuffer to tflite_reader to obtain the GraphIR.
3. Pass the GraphIR to tflite_writer to generate a flatbuffer
   again.
4. Compare it with the flatbuffer from step 1. The contents should
   be identical.

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: I25e6924829845f33aaf24bea665099b819116e67
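
The four steps above follow a general round-trip test pattern, sketched here in Python. `read_model` and `write_model` are JSON-based stand-ins for tflite_reader and tflite_writer, used only so that the example is self-contained; they are not the real interfaces.

```python
import json


def write_model(graph):
    """Stand-in writer: serialise a graph dict to deterministic bytes."""
    return json.dumps(graph, sort_keys=True, separators=(",", ":")).encode()


def read_model(data):
    """Stand-in reader: parse serialised bytes back into a graph dict."""
    return json.loads(data.decode())


def passthrough_is_lossless(original):
    """Read the serialised model, write it again, and compare byte for byte."""
    return write_model(read_model(original)) == original
```

A byte-for-byte comparison is the strictest form of this test, since it also catches changes in field order or padding that a structural comparison would miss.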


MLBEDSW-10656: Change tflite_writer to copy tables using reflection · 88d7a53f

Johan Gunnarsson authored Apr 10, 2025 and

Fredrik Svedberg committed Apr 25, 2025



This changes tflite_writer so that it copies passthrough tables
using flatbuffer minireflection. Tables affected by this change are:

* tensor.quantization
* tensor.sparsity
* tensor.variant_tensors
* operator.builtin_options
* operator.builtin_options_2

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: Ia197694a09d5eb97480f303919afd209e8221332


MLBEDSW-10723: MLCE: Add support for the Log operator in Regor · 037f02e9

Johan Alfvén authored Apr 21, 2025



- Implemented Log operator with the generic LUT implementation
- Update SUPPORTED_OPS.md and vela.py to include the LOG operator
- In tflite_graph_optimiser.cpp:
  - Change the variable type of `prevLutResult` from float to int16_t.
  - Clamp the intermediate LUT value and explicitly cast it to int16_t,
    fixing precision and conversion issues in the LUT generation.

Change-Id: I8c877cb4f3c3c2da2eb82369fdd0269e938c4fe5
Signed-off-by: Johan Alfvén <johan.alfven@arm.com>
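
Clamping an intermediate LUT value before converting it to a 16-bit integer could be sketched as below. This is Python for illustration only; the rounding mode is an assumption, and `to_int16` is a hypothetical helper rather than code from tflite_graph_optimiser.cpp.

```python
INT16_MIN = -32768
INT16_MAX = 32767


def to_int16(value):
    """Round a float and clamp it into the int16 range."""
    rounded = int(round(value))  # rounding mode is an assumption
    return max(INT16_MIN, min(INT16_MAX, rounded))
```

Clamping before the conversion avoids out-of-range values wrapping around or producing undefined results when they are narrowed to 16 bits.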


MLBEDSW-10740: Add leading dimension decomposition for Resize · 677651ad

Alexander Bengtsson authored Apr 24, 2025 and Alexander Bengtsson committed Apr 25, 2025

Change-Id: Ibb1b00160e50fc0edacb7e1e18b1d6c437832b0c
Signed-off-by: Alexander Bengtsson <Alexander.Bengtsson@arm.com>