Commits · main · Limin Tang / Vela

May 27, 2025

MLBEDSW-10847: Update changelog and TensorFlow version documentation · 96223317
Rickard Bolin authored May 20, 2025
```
Change-Id: I1d363affe9a5d0028c4e8d041d8b79031bf75649
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
```
96223317

MLBEDSW-10859: Update email address in SECURITY.md · 4b39db34

Rickard Bolin authored May 22, 2025 and

Fredrik Svedberg committed May 27, 2025



Security issues should now be reported to psirt@arm.com

Change-Id: Ic7881695cf3ff210839fd474178df380324cf5c6
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>

4b39db34

May 22, 2025

MLBEDSW-10867 Bump min versions of setuptools and lxml · 235d6a81

Jacob Bohlin authored May 22, 2025



Change-Id: I8797e734ad00bf8a598b2524c20ef9ad6973b6e5
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>

235d6a81

May 21, 2025

MLBEDSW-10837: Improve Ethos-U85 block config search ratio · 68f1aa43

Philip Hall authored May 20, 2025 and

Fredrik Svedberg committed May 21, 2025



OFMs with extreme aspects (like 1:200) end up using
a pathological scaling ratio during the block config
search.
 - This commit changes the scaling ratio to be more
   conservative, thus refining more accurately (at the
   cost of speed).
 - Fixed issue where the last, smallest-possible
   configuration was not selectable.

Signed-off-by: Philip Hall <philip.hall@arm.com>
Change-Id: I20eed6268543453267c76f057b1a84b7cbadc1c5

68f1aa43

MLBEDSW-10840: MLCE: Add support for RELU_0_TO_1 · 3a9736a4

Johan Alfvén authored May 20, 2025



- Add support for RELU_0_TO_1 for both Vela and Regor
- Updated SUPPORTED_OPS

Change-Id: Icd8a3bb35008c20d8cd2118c0e11cefcfcbe1cd9
Signed-off-by: Johan Alfvén <johan.alfven@arm.com>

3a9736a4

May 20, 2025

MLBEDSW-10839: Fix Access cycles regression · 6787061d

William Isaksson authored May 20, 2025 and

Rickard Bolin committed May 20, 2025



Fixes regression from refactoring of access cycles.

Change-Id: I0fc4aec7a4d3129c13967171b8a798ec2cdb23cf
Signed-off-by: William Isaksson <william.isaksson@arm.com>

6787061d

MLBEDSW-10809 Fix Ethos-U55 RSCG assert · cd96dfdc

Fredrik Svedberg authored May 19, 2025



LUT size was not accounted for when selecting block configuration
for operations with fused LUT activations on Ethos-U55.

Change-Id: Ic9b070f9949d8eb9385299aeb2d24960bd16147c
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>

cd96dfdc

MLBEDSW-10708: Fix setting Rescale IFM/OFM tensor dtype to unsigned · 1cbc7bf4

Max Bergfelt authored Apr 15, 2025

Made sure the dtype of a tensor is changed to unsigned in GraphIR optimizer when the rescale operator has the attribute input_unsigned/output_unsigned set to true.

Change-Id: I568638ef0e93252b55a2e748c2ed64d35a208977
Signed-off-by: Max Bergfelt <max.bergfelt@arm.com>

1cbc7bf4

May 19, 2025

MLBEDSW-10826: Fix the axis attribute adjustment · b8066ce8

Johan Gunnarsson authored May 19, 2025 and

Fredrik Svedberg committed May 19, 2025



This axis attribute refers to the IFM rank. There are ops where
IFM rank is not equal to OFM rank, such as Reduce Min/Max with
keep_dims = false. The correct way is to use IFM rank as the base
for this adjustment.
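
A minimal sketch of why the base rank matters, assuming the adjustment normalises a possibly negative axis and then offsets it for a shape that is left-padded to a fixed rank. The function name `adjust_axis` and the padded rank of 4 are hypothetical and are not taken from the Regor sources.

```python
def adjust_axis(axis: int, base_rank: int, padded_rank: int = 4) -> int:
    """Map an axis given relative to a tensor of rank base_rank onto the
    same tensor left-padded with unit dimensions up to padded_rank."""
    if base_rank > padded_rank:
        raise ValueError("base_rank must not exceed padded_rank")
    if not -base_rank <= axis < base_rank:
        raise ValueError("axis out of range for the given rank")
    if axis < 0:
        axis += base_rank  # normalise negative axes
    return axis + (padded_rank - base_rank)


# Reduce over axis 1 of an IFM with shape [8, 16, 32] and keep_dims = false:
# the IFM has rank 3, the OFM has rank 2.
ifm_based = adjust_axis(1, base_rank=3)  # offset derived from the IFM rank
ofm_based = adjust_axis(1, base_rank=2)  # offset derived from the OFM rank
```

In this sketch the two results differ (2 versus 3), so basing the offset on the OFM rank would select the wrong dimension of the padded IFM.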

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: I5aab7efef84b720fd15c8735f5c88df76ac24952

b8066ce8

MLBEDSW-9392: Improve Access Cycle Estimation for Ethos-U85 · 59a7c001

William Isaksson authored Mar 13, 2025



Improves the algorithm for estimating access cycles drastically.

Change-Id: I468fb71e373ed6f779eac6370d725466809481f8
Signed-off-by: William Isaksson <william.isaksson@arm.com>

59a7c001

May 16, 2025

MLBEDSW-10801: Removed Ethos-U55/65 Argmax overflow constraint · 174555a5

Max Bergfelt authored May 15, 2025

Removed supported operator check for argmax width and height overflow as it has been solved by max pool decomposition.

Change-Id: I12c45a333ea48a8cdb69ef27023ba71a428401b6
Signed-off-by: Max Bergfelt <max.bergfelt@arm.com>

174555a5

May 15, 2025

MLBEDSW-10683: Port 1x1 IFM resize for Ethos-U55/U65 · fddbad1d

William Isaksson authored Apr 10, 2025 and

Fredrik Svedberg committed May 15, 2025



Adds support for 1x1 IFM resizes to Regor.

Change-Id: Ia16bc65748ac5518291e1e4a78cd3b6350c7b0d5
Signed-off-by: William Isaksson <william.isaksson@arm.com>

fddbad1d

C++20 compatibility fix for regor logging · a37d6a4c

Limin Tang authored May 08, 2025 and

Tim Hall committed May 15, 2025

The logging dependency fmt::format takes a format string as input by default, which must be
a compile-time constant expression when built with C++20 (enforced by the C++20 consteval
specifier). To make the logging work with C++20, the string input must first be converted
to a runtime-evaluated format string via fmt::runtime.

Also fix minor bugs in tflite_model_semantics.cpp that cause the build to fail with C++20.

Signed-off-by: Limin Tang <limintang@meta.com>

a37d6a4c

MLBEDSW-10776 Don't serialize intermediates vector if empty · 6fa0bbad

Jacob Bohlin authored May 15, 2025

Most operations do not have any intermediate tensors. In this case there
is no need to create a vector in the flatbuffer file as even empty vectors
contain a 4 byte length header. With this patch the intermediates vector
is only created if any intermediate tensors are present.
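
The cost of an empty vector can be illustrated with a small sketch. It assumes only the basic FlatBuffers vector layout (a 32-bit little-endian element count followed by the elements) and ignores alignment and the table field that references the vector. The helper names are hypothetical and this is not the serializer used in the repository.

```python
import struct


def encode_int32_vector(values):
    """Encode values as a length-prefixed vector: uint32 count, then int32 items."""
    header = struct.pack("<I", len(values))
    body = b"".join(struct.pack("<i", v) for v in values)
    return header + body


def encode_intermediates(values):
    """Return the encoded vector, or None so the caller can omit the field."""
    return encode_int32_vector(values) if values else None


empty_cost = len(encode_int32_vector([]))  # bytes spent on an empty vector
```

Returning `None` for the empty case mirrors the patch's intent: the field is simply left unset, so no length header is written.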

Change-Id: I505ebe8c17a577eee2361050d18715207555409d
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>

6fa0bbad

May 14, 2025

MLBEDSW-9326: Revert FlatBuffers version · 5c42e8cc

William Isaksson authored May 14, 2025



Reverts version of FlatBuffers to v24.3.25.

There were no changes to the schema generated files due to this.

Change-Id: I3791384206d5eb8fac7f37e505d386b5ea8e594b
Signed-off-by: William Isaksson <william.isaksson@arm.com>

5c42e8cc

MLBEDSW-9408: Add full REDUCE MIN/MAX/SUM decomposition · 981cc442

Johan Gunnarsson authored May 09, 2025



This implements full decomposition of TOSA REDUCE MIN/MAX/ANY/ALL
along the reduced axis.

* Extend decomposition to do blockwise reduce operations in the
  reduced axis.
* Add ReduceSum/ReduceMinMax to constraints.
* Move reshaping of reduce ops into decomposition.
* Move creating a reduce ops kernel to ConvertAttributes.
* Remove RewriteReduceMinMaxAnyAll.

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: Ic0873dab1c3c5344045590d1c11724986b896120

981cc442

MLBEDSW-10656: Fix missing header compilation failure · 1bfdfd3e

Tim Hall authored May 08, 2025 and

Fredrik Svedberg committed May 14, 2025



 - Included the missing unordered_map

Change-Id: Ife560a9645526fc251e18db317719710070e9d82
Signed-off-by: Tim Hall <tim.hall@arm.com>

1bfdfd3e

May 13, 2025

MLBEDSW-10776 Revert graph traversal change · 319b4610

Jacob Bohlin authored May 12, 2025 and

Rickard Bolin committed May 13, 2025



The graph traversal was modified in MLBEDSW-8926 to traverse tensor
writers left-to-right instead of right-to-left. This is required to
ensure correct execution order for LSTM.

This change reverts the graph traversal back to right-to-left in the
general case and left-to-right will only be used on graphs which contain
persistent tensors, in order to target LSTM operators.
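
The selection rule described above can be sketched as follows. The function name and the boolean flag are hypothetical; the real traversal in Regor operates on graph objects rather than plain lists.

```python
def writer_visit_order(writers, graph_has_persistent_tensors):
    """Return the order in which a tensor's writers are visited.

    Right-to-left is the general case; left-to-right is used only for
    graphs that contain persistent tensors (the LSTM case).
    """
    if graph_has_persistent_tensors:
        return list(writers)
    return list(reversed(writers))
```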

Change-Id: Ibe13af0cf952450cff253ff2e44ee8b96068583a
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>

319b4610

MLBEDSW-10635: ReinterpretCast operator and cast to int64 · 498f957b

Max Bergfelt authored Apr 10, 2025 and

Johan Alfvén committed May 13, 2025

Implemented a ReinterpretCast operator which can be used to reinterpret tensors with different data types and sizes. Additionally added support for the non-hardware-supported cast to int64 by replacing the operation with 4 sequential cast and reinterpret operations.
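
As an illustration of what a reinterpret cast does, the sketch below views the same raw bytes as a different element type and count. The commit does not spell out its four steps, so `widen_int32_to_int64` is only one plausible way to build int64 values from 32-bit words (an assumption, not the sequence used in Regor). Both function names are hypothetical.

```python
import struct


def reinterpret(values, src_fmt, dst_fmt):
    """Reinterpret the little-endian bytes of values (struct code src_fmt)
    as elements of struct code dst_fmt, without converting the data."""
    raw = b"".join(struct.pack("<" + src_fmt, v) for v in values)
    dst_size = struct.calcsize("<" + dst_fmt)
    count, remainder = divmod(len(raw), dst_size)
    if remainder:
        raise ValueError("byte size is not a multiple of the target type")
    return [
        struct.unpack_from("<" + dst_fmt, raw, i * dst_size)[0]
        for i in range(count)
    ]


def widen_int32_to_int64(values):
    """Pair each int32 with a sign-extension word, then view pairs as int64."""
    words = []
    for v in values:
        words.append(v)                    # low 32 bits
        words.append(-1 if v < 0 else 0)   # high 32 bits (sign extension)
    return reinterpret(words, "i", "q")
```

The element count changes with the element size: two int32 words become one int64 value, which is why the operator has to handle tensors of different sizes.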

Change-Id: Ie7032cd5384c17a766dd17034cd59871bb1a833d
Signed-off-by: Max Bergfelt <max.bergfelt@arm.com>

498f957b

MLBEDSW-10779 Supported ops checks: Reject negative zero points for unsigned · e6783a07
Jacob Bohlin authored May 13, 2025 and Johan Alfvén committed May 13, 2025
```
Change-Id: I19c16a6a06278d3a0301d763482e0ab9aaee4472
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
```
e6783a07

May 12, 2025

MLBEDSW-10756 Fix issue where HLCParameters were copied incorrectly · 133a265c
Jacob Bohlin authored May 12, 2025
```
Change-Id: I4a40cf56a37f10e95b52f93d2b60851bc38f5aaf
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
```
133a265c
MLBEDSW-10681 Implement zero-point correction required for LSTM int16 · e989a504
Jacob Bohlin authored May 07, 2025 and Fredrik Svedberg committed May 12, 2025
```
Change-Id: I7bcef5bc787a7e00d7dee820a79bc451e1a97494
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
```
e989a504

MLBEDSW-10781: Fix transpose with LUT assert · bdc6dc3d

Philip Hall authored May 12, 2025 and

Fredrik Svedberg committed May 12, 2025



Ethos-U55 transpose with LUT generates an assert as a result
of default-initialised values in the block config.
This commit defers the LUT insertion until after a suitable
block config is available.

Signed-off-by: Philip Hall <philip.hall@arm.com>
Change-Id: If2137253696d26f996605ec9bf2fab1a1c69c229

bdc6dc3d

May 09, 2025

MLBEDSW-10533: Don't fuse transpose to an op with OFM slice · 72c32a33

Johan Gunnarsson authored May 06, 2025



* Don't fuse transpose to an op with OFM slice
* Also, when fusing a transpose, the primary op should inherit the
  fused op's OFM slice. Otherwise we might end up with different
  shapes on OFM slice and OFM and in that case OFM slice shape will
  be used later on.

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: Idb25cc3a53f0b52dcc59cda1aefa31d9d19a850f

72c32a33

May 08, 2025

MLBEDSW-8926 Fix WIN32 build issue · 96e9a932

Jacob Bohlin authored May 06, 2025 and

Fredrik Svedberg committed May 08, 2025



The problem was a code snippet assuming that `uintptr_t` was wider than 32 bits.

Change-Id: I3bac0b0760b0f111d6ec6a9a1e955e0c19f3398c
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>

96e9a932

MLBEDSW-10764 Invalidate buffer hash when writing new values · c6b4acd4
Jacob Bohlin authored May 07, 2025 and Fredrik Svedberg committed May 08, 2025
```
Change-Id: Icdef37a712c708cc75418f8aa4cd8033f3f70d5c
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
```
c6b4acd4
MLBEDSW-10756 Fix chaining issue with fusing transpose with activation · c932624f
Jacob Bohlin authored May 02, 2025 and Fredrik Svedberg committed May 08, 2025
```
Change-Id: Iec466c83c207822c48b2c1746fe39f46f4541a72
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
```
c932624f

MLBEDSW-10631: Convert non-constant int4 tensors to int8 · 3406966a

Rickard Bolin authored May 08, 2025



As with int48 tensors, we should unpack the constant tensors and
change the data type for the non-constant ones.
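
A minimal sketch of unpacking a constant int4 buffer into int8-range values. It assumes two signed 4-bit values per byte with the first element in the low nibble; that packing order and the function name `unpack_int4` are assumptions for illustration, not taken from the commit.

```python
def unpack_int4(packed, count):
    """Unpack count signed 4-bit values from packed bytes (low nibble first)."""
    if count > 2 * len(packed):
        raise ValueError("not enough packed data for the requested count")
    values = []
    for byte in packed:
        for nibble in (byte & 0x0F, byte >> 4):
            # Sign-extend the 4-bit two's complement value.
            values.append(nibble - 16 if nibble >= 8 else nibble)
    return values[:count]  # drop the padding nibble for odd counts
```

Non-constant tensors have no data to unpack at compile time, so for them only the declared data type changes.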

Change-Id: I9c3c69bd2d1e41bfa32e959c35325b1280f626e2
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>

3406966a

May 07, 2025

MLBEDSW-10631: Extend type check to cover unpacked 48bit values · 393de444
Rickard Bolin authored May 07, 2025
```
Change-Id: I2758f97c6cc328aedf38edbc8017afacdf3a0fdb
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
```
393de444

MLBEDSW-10240: Implement missing transpose LUT fusing · f72c9110

Philip Hall authored Apr 23, 2025 and

Fredrik Svedberg committed May 07, 2025



For Ethos-U55 it is, in some cases, possible to fuse
a LUT onto the transpose.

Signed-off-by: Philip Hall <philip.hall@arm.com>
Change-Id: I4dfce085d83b78d80ae52f815757082058555c21

f72c9110

MLBEDSW-10749: MakeMemCopy and MakeTransposeOp tensor dtype correction · 8cac2b9a

Max Bergfelt authored May 02, 2025

Fixed incorrect usage of SchedulerConnection dtype in the MakeTransposeOp function for creating Transpose ops for Depthwise Conv2D decomposition. Also reworked the MakeMemCopy function in a similar way.

Change-Id: I4858b8810840a6082afcc0859a997d04a0fac60b
Signed-off-by: Max Bergfelt <max.bergfelt@arm.com>

8cac2b9a

May 06, 2025

MLBEDSW-10754: Make kernel object usage more consistent · a25debfc

Philip Hall authored May 01, 2025 and

Fredrik Svedberg committed May 06, 2025



This commit rationalises the use of the kernel object to
make it simpler to use the unit kernel, and ensures that
called functions have the correct kernel constness to
prevent accidental modification.

Signed-off-by: Philip Hall <philip.hall@arm.com>
Change-Id: I27c63feb8e876f359b434a916bed50f41c97f411

a25debfc

MLBEDSW-10657: Align MAC count calculation with Vela · 1756fef8

Johan Gunnarsson authored May 05, 2025



Don't include subops that are activations in the performance summary.

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: I9e1e97281c862d3e3ce873aac89f18ff57fea46c

1756fef8

May 05, 2025

MLBEDSW-9758 TOSA MaxPool decomposition · 160abf94

Bjorn Davidsson authored Nov 06, 2024 and

Rickard Bolin committed May 05, 2025

Add support for stride > 3 and large FM by adding a decomposition function
for MaxPool, calling into the block and stride decomposition functions.

Change-Id: Id0e68c49dd89a807108f59fcd79cfe1b54d47e97
Signed-off-by: Björn Davidsson <bjoern.davidsson@arm.com>

160abf94

MLBEDSW-9759: Decompose batch for AvgPool · ab36eec4

Bjorn Davidsson authored Nov 20, 2024 and

Rickard Bolin committed May 05, 2025



- Add decomposition for AvgPool, handling batch > 1.
- Convert padding to offsets for AvgPool and MaxPool
  decomposition.

Change-Id: I47faaaddedb0295abc084e3e966daa53817c2586
Signed-off-by: Björn Davidsson <bjoern.davidsson@arm.com>

ab36eec4

May 02, 2025

MLBEDSW-10747: Fix input tensor check in TOSA reader · cea0885a

Rickard Bolin authored May 02, 2025 and

Fredrik Svedberg committed May 02, 2025



Reader expected all operators to have input tensors, but CONST operator
does not.

Change-Id: I696d9a36cbc179988ded61f9b020a46b40df89ef
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>

cea0885a

MLBEDSW-9760 TOSA Conv3D Decomposition · bbbeb030

Fredrik Svedberg authored Apr 30, 2025



Final part of TOSA Conv3D decomposition - bias broadcast
decomposition of large tensors.

Change-Id: Ib2e278266f431fbfddffd7aef850a32fd63c3e17
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>

bbbeb030

MLBEDSW-8926 Port LSTM to Regor · 92013dad

Jacob Bohlin authored Apr 25, 2025



* Ported lowering of TFLite::UnidirectionalSequenceLstm to Regor.
* Added reading of TFLite intermediate tensors. Added a new
  TensorUsage::Intermediate for these tensors.
* Added logic to allocate tensors which point to the same buffer to the
  same address, enabling this to be controlled in GraphIR.
* Added an optional Tag to the Buffer hash function in order to
  differentiate between multiple empty buffers which stem from different
  TFLite variable tensors.
* Added missing rescaling for Sigmoid and Tanh when fused with
  Elementwise Add, Sub or Mul.
* Added some limitations to persistent tensors:
  - They are now required to be in linear format.
  - They can not share memory with non-persistent tensors.
* Made a small modification to graph traversal so that partial writes
  are processed in the order they are added to the graph.
* Added supported operator checks for UnidirectionalSequenceLstm.

Change-Id: I6bd08822a41dca48b3aa8091b07747327b37d68f
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>

92013dad

MLBEDSW-9291: Support basic resize bilinear for Ethos-U55/U65 · a3fc80da

William Isaksson authored Mar 18, 2025 and

Fredrik Svedberg committed May 02, 2025



- Adds limited support for resize bilinear with align corners and half
  pixel centers both set to false, and analogously for TOSA.

Change-Id: I013759e0cb23f3ebc9037d3b41b0f449256a673a
Signed-off-by: William Isaksson <william.isaksson@arm.com>

a3fc80da

May 01, 2025

MLBEDSW-9142 Add support for fusing activation functions to chains · a64a8eff

Jacob Bohlin authored Apr 23, 2025 and

Johan Alfvén committed May 01, 2025



* Allow fusing multiple activations to chains as long as they are not
  consecutive.
* Allow fusing of int16 Sigmoid and Tanh.
* Utilize all 8 LUT slots rather than reusing slot 0 every time.
* Extended live-range fusing to allow reusing IFM/OFM across opgroups.
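
One simple way to spread LUTs over all slots is round-robin assignment, sketched below. The commit only states that all 8 slots are used; the round-robin policy and the class name `LutSlotAllocator` are assumptions for illustration.

```python
class LutSlotAllocator:
    """Hand out LUT slot indices in round-robin order."""

    def __init__(self, num_slots=8):
        if num_slots <= 0:
            raise ValueError("num_slots must be positive")
        self._num_slots = num_slots
        self._next = 0

    def allocate(self):
        """Return the next slot index, wrapping after the last slot."""
        slot = self._next
        self._next = (self._next + 1) % self._num_slots
        return slot


allocator = LutSlotAllocator()
first_nine = [allocator.allocate() for _ in range(9)]
```

Compared with always writing slot 0, consecutive LUTs land in different slots, so a LUT that is still in use is not overwritten until the allocator wraps around.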

Change-Id: Idce158551b23f4ea3824e662a045759d4e609a60
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>

a64a8eff