Commits · 3.12.0 · artificial-intelligence / ethos-u / Vela

May 21, 2024

MLBEDSW-9089: Update release notes · 7abd82a6

Rickard Bolin authored May 20, 2024



- Add release notes for 3.12.0
- Update README with supported TensorFlow version

Change-Id: Ia95955e9fbe824d1dc62e3960a9ed43fa3689de5
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>

7abd82a6

MLBEDSW-9090: Restrict third-party dependencies in pyproject.toml · 3955765d

Alexander Bengtsson authored May 20, 2024 and

Rickard Bolin committed May 21, 2024



- Restrict third-party dependency versions based upon compatibility and
security concerns.

Change-Id: Ic50eb987f012df8c0c46ef0ee51d907b7609edae
Signed-off-by: Alexander Bengtsson <Alexander.Bengtsson@arm.com>

3955765d

May 20, 2024

MLBEDSW-9088: Update to concat grouping patch · 722f4bfe

Johan Alfvén authored May 20, 2024



 - Fix performance regression caused by the concat grouping fix.
 - If no cpu op interferes, there is no need to group the
avg pool ops. Keep the old compiler behavior for that use case.

Change-Id: I6476585d7dedff0b9edd8b9c300a71c181496cf1
Signed-off-by: Johan Alfven <johan.alfven@arm.com>

722f4bfe

May 16, 2024

MLBEDSW-8561: Striding support in H/W for StridedSlice · be78a053

Rickard Bolin authored Jan 31, 2024



Change-Id: Ie6f39d9c4125f7c16d27621de47cd76143c2e636
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>

3.12.0.rc1

be78a053

May 15, 2024

MLBEDSW-9067: MLCE: Group Avgpool ops for concat · 89146856

Johan Alfvén authored May 13, 2024



 - Concat is implemented by several avgpool ops, all of them
writing to the same ofm but with a slice offset. If a compiled
network contains cpu fallbacks, the avgpool ops might end up
running in different custom ops. This works fine as long as the
runtime provides the same scratch area. If not, the output from
the concat might be corrupt.

 - This fix adds an extra step to the pass packing so that all
avgpool ops for a concat are grouped together and run within the
same custom op in order to prevent possible corruption.

Change-Id: I343e08d7b4046f969b3d9ec3479db6490cbe4170
Signed-off-by: Johan Alfven <johan.alfven@arm.com>

89146856
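
The grouping idea in MLBEDSW-9067 above can be illustrated with a minimal sketch. It assumes ops are represented as plain `(op_name, ofm_name)` pairs; the function name and the data layout are hypothetical and do not reflect Vela's actual pass packing code.

```python
from collections import defaultdict


def group_ops_by_ofm(ops):
    """Group op names by the output tensor (ofm) they write to.

    `ops` is a list of (op_name, ofm_name) pairs; both are hypothetical
    stand-ins for the compiler's own objects.
    """
    groups = defaultdict(list)
    for op_name, ofm_name in ops:
        groups[ofm_name].append(op_name)
    return dict(groups)


# Two avgpool ops implement one concat and share the same ofm, with a
# cpu fallback op listed between them.
ops = [
    ("avgpool_0", "concat_out"),
    ("cpu_op", "cpu_out"),
    ("avgpool_1", "concat_out"),
]
groups = group_ops_by_ofm(ops)
```

Ops that end up in the same group are the ones that would need to run within the same custom op so that they see the same scratch area.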

Apr 24, 2024

MLBEDSW-8969: Enable weight buffering for fully connected with batch shape · f4937000

Johan Alfvén authored Apr 20, 2024



 - Fully connected with batch shape will use the weights
more than once. Models with this type of fully connected op
will benefit from weight buffering.
 - If a fully connected op with this shape is detected, it is
changed to a conv2d and the normal weight buffering
flow will be used.

Change-Id: I272741a32390e036d5e04bd5af41d4538162e86e
Signed-off-by: Johan Alfven <johan.alfven@arm.com>

f4937000

MLBEDSW-8973: MLCE: Fix assert in build pass links · f9194e11

Johan Alfvén authored Apr 22, 2024



 - The assert in build pass links triggered because a concat
op is split into several avgpool ops which run in different
custom ops. The code did not expect the pass to have a
dependency on itself.
 - Fixed the assert to handle this special case

Change-Id: Id03b1145b19c25bf967a1061aa5ecf559b3bc1cc
Signed-off-by: Johan Alfven <johan.alfven@arm.com>

f9194e11

Apr 23, 2024

Reformat code to align with precommit · bab7f28a

Per Astrand authored Apr 22, 2024 and

Johan Alfvén committed Apr 23, 2024



Signed-off-by: Per Åstrand <per.astrand@arm.com>
Change-Id: Idc6f6959bfc7eabce2f5b6e0d4935d292dcf6618

bab7f28a

Apr 12, 2024

Reshape weights from TOSA to Vela expected format · 92240e79

Per Astrand authored Mar 25, 2024



Reshape the weight for depthwise conv2d and set the
depth_multiplier attribute on the operation.

Signed-off-by: Per Åstrand <per.astrand@arm.com>
Change-Id: I3b73988fa8c4e0cbe2430874cefe6d002885ec89

92240e79

Fuse rescales into Add and Conv2d operation · 931613df

Per Astrand authored Mar 21, 2024



Remove the upscale to int32 before and after the add operation.
Re-enable fusing of conv2d and rescale that was removed earlier.

Signed-off-by: Per Åstrand <per.astrand@arm.com>
Change-Id: I5e7d9bd99bb3925588b507824d8eb3e6642cc7f0

931613df

Apr 05, 2024

Fix various pre-commit errors · 31947ad1

Johan Alfvén authored Apr 04, 2024



Change-Id: I8e584a036036f35a8883b2a4884cb2d54e675e39
Signed-off-by: Johan Alfven <johan.alfven@arm.com>

31947ad1

MLBEDSW-8885: MLCE: Fix assert in verify_subgraph_health · abed3c27

Johan Alfvén authored Apr 04, 2024



 - The assert triggered because the tensor consumer list did
not contain the expected operators.
 - The problem happened because a concat op was split into two
avgpool ops and these two ops run in separate subgraphs with
a cpu node in between. Since the avgpool ops share the same
output tensor this caused some corruption to the tensor consumer
list when the last subgraph was traversed.
 - The fix is to ignore ops that do not belong in the subgraph's
set of operators (the pass list) when updating the consumers.

Change-Id: I4d94b54c77001f6447bec31ec62daeebc9b104f9
Signed-off-by: Johan Alfven <johan.alfven@arm.com>

abed3c27

Apr 04, 2024

MLBEDSW-8886: Regression: Output diff on LSTM · 190b63a6

Johan Alfvén authored Apr 04, 2024



 - Fix regression caused by too strict constraints on
SplitSpliceRead causing output diff for LSTM.
 - As long as the SplitSpliceRead shape fits within the
consumer ifm shape it is ok to move the read.

Change-Id: Ia6f508f99638c3aedccc7fd9f31405527bb64f87
Signed-off-by: Johan Alfven <johan.alfven@arm.com>

190b63a6
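
A minimal sketch of the relaxed constraint described in MLBEDSW-8886 above, assuming shapes are plain lists of integers and that "fits within" means every dimension of the read shape is no larger than the matching consumer ifm dimension. Both function names are hypothetical; this is not the check as implemented in Vela.

```python
def shapes_equal(read_shape, consumer_ifm_shape):
    # Strict variant: the read may only move if the shapes are identical.
    return list(read_shape) == list(consumer_ifm_shape)


def fits_within(read_shape, consumer_ifm_shape):
    # Relaxed variant: the read may move if it fits inside the consumer
    # ifm shape in every dimension (assumes equal rank).
    if len(read_shape) != len(consumer_ifm_shape):
        return False
    return all(r <= c for r, c in zip(read_shape, consumer_ifm_shape))


read_shape = [1, 1, 1, 64]
consumer_ifm_shape = [1, 1, 2, 64]

strict_ok = shapes_equal(read_shape, consumer_ifm_shape)
relaxed_ok = fits_within(read_shape, consumer_ifm_shape)
```

In this example the strict comparison rejects the move while the relaxed one allows it.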

Apr 03, 2024

MLBEDSW-8875: MLCE: Update criteria when to move SplitSpliceRead to consumer · 7647b0fe

Johan Alfvén authored Apr 02, 2024



 - When possible, a read slice from a split or stride is moved to
the following op. The problem in this case was that the following
op was a Maxpool op (from Softmax). The Maxpool op is using a
different input shape compared to the original Softmax op, and
this input shape was then changed when the read slice was applied
to the Maxpool op.
 - The result is a faulty Maxpool op with an output diff.
 - The fix is to prevent moving the slice read when the consumer
input shape differs from the Split/Stride ofm shape

Change-Id: I649d89c38645fa51c20c3602954e2b8af9372076
Signed-off-by: Johan Alfven <johan.alfven@arm.com>

7647b0fe

MLBEDSW-8873: MLCE: Update LUT index calculation · 55d90dd1

Johan Alfvén authored Apr 02, 2024



 - A network containing several softmax operators caused an
output diff
 - The problem was that the code that detects if the LUT is
already in internal SRAM calculated everything correctly except
for which LUT index to use.
 - The code should use the slot_size and not the LUT size when
calculating the index, which fixes this problem.
 - Updated unit tests

Change-Id: I07686651a883ccbba7c173e7191eb21f9ff15bf5
Signed-off-by: Johan Alfven <johan.alfven@arm.com>

55d90dd1
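
To illustrate why the divisor matters in MLBEDSW-8873 above, here is a minimal sketch. It assumes the LUT area is divided into fixed-size slots and that a LUT may span several slots; the function name and all sizes are hypothetical values chosen for the example, not Vela's constants.

```python
def lut_index(offset_in_lut_area, slot_size):
    # The index is counted in slots, regardless of how many slots a
    # single LUT occupies.
    return offset_in_lut_area // slot_size


SLOT_SIZE = 256  # hypothetical slot size in bytes
LUT_SIZE = 1024  # hypothetical LUT that spans four slots
offset = 1024    # second LUT, placed directly after a first 1024-byte LUT

index_by_slot = lut_index(offset, SLOT_SIZE)
index_by_lut_size = offset // LUT_SIZE
```

Dividing by the LUT size gives a different index from dividing by the slot size as soon as a LUT is larger than one slot, which is the kind of mismatch the commit describes.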

Apr 02, 2024

MLBEDSW-8672: Add ext_key tracking · e4d2f218

William Isaksson authored Feb 10, 2024 and

Tim Hall committed Apr 02, 2024



- Add ext_key tracking.
- Fix debug db cmd offsets being off by 4.

Change-Id: Ib109a15a0a2c44d08021c3b1bc3bcc067240ac5c
Signed-off-by: William Isaksson <william.isaksson@arm.com>

e4d2f218

Mar 12, 2024

MLBEDSW-8725: Remove scales & biases from --verbose-weights · f697eac9

Alexander Bengtsson authored Feb 23, 2024 and

Alexander Bengtsson committed Mar 12, 2024

Remove scales and biases from encoded weights size. This aligns better
with original weights size (which only represents the weight tensor)

Change-Id: I5aabf61385d8fdf150764c45e04ba4388c6a63f0
Signed-off-by: Alexander Bengtsson <Alexander.Bengtsson@arm.com>

f697eac9

Mar 07, 2024

TOSA fixes · b90666d9

Oscar Andersson authored Feb 29, 2024



- Fix TOSA imports
- Handle weights connected to Identity nodes
- Scaling info was missing in Fully Connected
- Disable rescaling fusing for conv-like ops
- Explicit scaling was missing for conv-like ops
- Handle Const->Identity->Transpose chains
- Handle Const->Identity->Reshape chains

Change-Id: I063af1f187b6b56105ccf5e8e8b2eb0d3a39dd3b
Signed-off-by: Oscar Andersson <oscar.andersson@arm.com>

b90666d9

Mar 06, 2024

MLBEDSW-8749: MLCE: Output diff on strided slice · 9341bf4b

Johan Alfvén authored Mar 05, 2024



 - When possible, a read slice from a split or stride is moved to
the following op. The problem in this case was that the following
op was an elementwise op where the ifm needed to be broadcast,
which is not supported.
 - The result is a faulty elementwise op with an output diff.
 - The fix is to prevent moving the slice read to the elementwise op
if broadcasting is needed.

Change-Id: I89928c217510a822f91f051fd1ad6e34040c19de
Signed-off-by: Johan Alfven <johan.alfven@arm.com>

9341bf4b
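
The guard described in MLBEDSW-8749 above can be sketched as follows, assuming shapes are plain lists of equal rank and that an ifm needs broadcasting whenever it differs from the ofm shape. The function names are hypothetical and are not taken from the Vela sources.

```python
def needs_broadcast(ifm_shape, ofm_shape):
    # An elementwise input needs broadcasting when any dimension differs
    # from the corresponding output dimension.
    return list(ifm_shape) != list(ofm_shape)


def can_move_slice_read(ifm_shape, ofm_shape):
    # Only move the slice read when no broadcasting is required.
    return not needs_broadcast(ifm_shape, ofm_shape)


same_shape_ok = can_move_slice_read([1, 8, 8, 16], [1, 8, 8, 16])
broadcast_ok = can_move_slice_read([1, 1, 1, 16], [1, 8, 8, 16])
```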

Feb 28, 2024

Fix stats writer exception when op has no tensors · f19fd2f0

Simon Hollis authored Feb 27, 2024



Signed-off-by: Simon Hollis <simon.hollis@meta.com>
Change-Id: I5553802afaa3faaa2548aece7a3e0e1530021765

f19fd2f0

Feb 27, 2024

Modifications of rescale to enable basic form quantized network support. · 78b9412b

Rob Elliott authored Jan 25, 2024 and

Tim Hall committed Feb 27, 2024



Minor fixes for TOSA 0.80.0 and 0.80.1 field naming following from
the 0.2 to 0.8 conversion.

Change-Id: I2ac1b3ac1ec60cf765edf54030cd2338bf001289
Signed-off-by: Rob Elliott <Robert.Elliott@arm.com>

78b9412b

Feb 19, 2024

MLBEDSW-8704: Update release notes · cc82c36d

Rickard Bolin authored Feb 19, 2024



- Added release information

Change-Id: I6d6d80460658d444d52d0abb17a2cb42954f992c
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>

cc82c36d

Feb 09, 2024

MLBEDSW-8674: int16 VectorProduct should use Natural rounding · 33683086

Johan Alfvén authored Feb 08, 2024



 - Fixed output diff for FullyConnect int16
 - The problem was that the wrong rounding mode was used
 - Reference uses Natural rounding for FullyConnect int16

Change-Id: I209313b6f89fed01678a448a935d5f6904b41057
Signed-off-by: Johan Alfven <johan.alfven@arm.com>

33683086
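
As a rough illustration of why the rounding mode in MLBEDSW-8674 above can change results, the sketch below contrasts two ways to round a fixed-point value after a right shift. It assumes "Natural" rounding means round to nearest with ties going towards positive infinity, and it uses round-half-away-from-zero as the contrasting mode; both the interpretation and the function names are assumptions, not Vela or reference kernel code.

```python
def round_natural(value, shift):
    # Round to nearest, ties towards +infinity: add half, then use an
    # arithmetic (floor) right shift. Requires shift >= 1.
    return (value + (1 << (shift - 1))) >> shift


def round_half_away_from_zero(value, shift):
    # Round to nearest, ties away from zero. Requires shift >= 1.
    half = 1 << (shift - 1)
    if value >= 0:
        return (value + half) >> shift
    return -((-value + half) >> shift)


# -3 with shift 1 represents -1.5, an exact tie.
natural_result = round_natural(-3, 1)
away_result = round_half_away_from_zero(-3, 1)
```

The two modes agree for positive ties and differ only for negative ties, which is enough to produce an output diff against a reference.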

Feb 06, 2024

MLBEDSW-8620: Fix MirrorPad supported ops check · 646314ef

Rickard Bolin authored Jan 31, 2024



Change-Id: I1458009f4b92c1a599efa3a63d6768148e55606d
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>

646314ef

Jan 30, 2024

MLBEDSW-8491: Add support for Mirror pad · fdbb072d

Rickard Bolin authored Sep 05, 2023



Change-Id: I3c13118e14195a5fb8e522a38b205b75fb07b74b
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>

fdbb072d

MLBEDSW-8569: MLCE: Reported number of CPU ops are wrong · 014bc283

Johan Alfvén authored Jan 25, 2024



 - A Pack op is implemented by several AvgPool ops. Depending
on the number of CPU ops and the graph topology, the AvgPool ops
could end up in different nodes. One of these nodes
had the Pack output referenced to it but the other node did not.
As a result the full graph was not traversed when calculating CPU
ops.
 - The compiled network works as intended but the number of
reported CPU ops was wrong.
 - Added a new method that extracts the ops using the passes in
the sub graphs, which fixes the problem.

Change-Id: Ie88ebd4669783559258ae763737a4c7f86c905f8
Signed-off-by: Johan Alfven <johan.alfven@arm.com>

014bc283

Jan 26, 2024

MLBEDSW-8575 Tests fails on conv networks · cbec599c

Fredrik Svedberg authored Jan 25, 2024



Fixed a problem where the compiler incorrectly called the
mlw_codec to create an empty weight stream for the second
weight core.
Also added code to the mlw_codec to detect this as a value
error rather than a memory error.

Change-Id: I463846cecb1178f8fbf04dc3e39bd6965cb8ddfc
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>

cbec599c

vela: Remove unnecessary code from architecture allocator · 1c8f92d0

Tim Hall authored Jan 25, 2024



 - Small improvement that reduces compilation time

Change-Id: I9e5cd58674f719f5dedeb30ea42787dc996a22d6
Signed-off-by: Tim Hall <tim.hall@arm.com>

1c8f92d0

Revert "MLBEDSW-8468: overlaps_ranges does not treat the live range end time as inclusive" · b982898c

Tim Hall authored Jan 19, 2024



This reverts commit dbe4df4ccddafac9cbc345a4a03a42c241248e88.

 - The previous patch had a mostly negative effect on performance

Change-Id: I4003d50b07de9c63d9001ceb0a3a0bc966c0b861
Signed-off-by: Tim Hall <tim.hall@arm.com>

b982898c

vela: Remove dead code from register command stream · e4d0dbfa

Tim Hall authored Jan 16, 2024



 - Removed the unused function get_block_config_for_npu_op()

Change-Id: If36e4fe65286c4e13e127473d20971a1b6eaa94b
Signed-off-by: Tim Hall <tim.hall@arm.com>

e4d0dbfa

Jan 24, 2024

MLBEDSW-8568 Fix mlw_codec memory handling · c222f8cb

Fredrik Svedberg authored Jan 12, 2024



Added missing memory allocation checks to mlw_codec.

Change-Id: I20c04d5d9c934b9c715a2b2049705f853d90825a
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>

c222f8cb

Jan 18, 2024

CONV ops int16 tests failed after TensorFlow update · 56e5f0c2

William Isaksson authored Jan 10, 2024 and

Rickard Bolin committed Jan 18, 2024

Adds support for setting the accumulator type using the quantized_bias_type attribute

Change-Id: Ibde1149143b510a1c650a5a037d3ab92d878d7cd
Signed-off-by: William Isaksson <william.isaksson@arm.com>

56e5f0c2

Jan 16, 2024

MLBEDSW-8468: overlaps_ranges does not treat the live range end time as inclusive · 84fe2f60

Tim Hall authored Dec 19, 2023



 - The issue is that live range start and end times are inclusive
but the function to calculate if two ranges overlap treats them as
exclusive
 - The fix is to change the comparison to be inclusive

Change-Id: Iab5ceec7be2a5fdf0d6ecef81509a88c74e7108c
Signed-off-by: Tim Hall <tim.hall@arm.com>

84fe2f60
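
The difference between the two comparisons in MLBEDSW-8468 above can be sketched as follows, assuming live ranges are plain integer start and end times. The function names are hypothetical and this is not the body of Vela's `overlaps_ranges`.

```python
def overlaps_exclusive(a_start, a_end, b_start, b_end):
    # Treats end times as exclusive: ranges that only touch at an end
    # point are reported as not overlapping.
    return a_start < b_end and b_start < a_end


def overlaps_inclusive(a_start, a_end, b_start, b_end):
    # Treats end times as inclusive: ranges that share an end point
    # are reported as overlapping.
    return a_start <= b_end and b_start <= a_end


# The first range is last live at time 4, the second is first live at
# time 4.
touching = (0, 4, 4, 8)
exclusive_result = overlaps_exclusive(*touching)
inclusive_result = overlaps_inclusive(*touching)
```

Only the inclusive comparison reports the shared time step as an overlap.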

Dec 22, 2023

MLBEDSW-8497: [MLCE] Avoid modifying FC with dynamic weights · 37dbca2a

Johan Alfvén authored Dec 21, 2023



 - If an npu op is followed by a convolution op with dynamic
weights, the optimized file ends up containing a duplicated
tensor called _cpu.
 - Another problem is that an empty bias tensor is added
in the reader.
 - The fix is to ignore these cpu ops both in the reader
and the writer.

Change-Id: I476b4f6062e26cca4ba589df694a99ef79b0f6d4
Signed-off-by: Johan Alfven <johan.alfven@arm.com>

37dbca2a

Dec 20, 2023

MLBEDSW-8157: Update to TensorFlow 2.15 · f4a511ff

William Isaksson authored Nov 22, 2023 and

Rickard Bolin committed Dec 20, 2023



Updates to TensorFlow 2.15. No StableHLO operators were added to Vela since these are subject to change and have almost no runtime support.

- FlatBuffers version was unchanged.

Change-Id: I9a506a2dcc2e0bc2498742e857bbb6d69b19ac1b
Signed-off-by: William Isaksson <william.isaksson@arm.com>
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>

f4a511ff

Dec 19, 2023

MLBEDSW-8467: verbose-allocation memory usage is incorrect · d2e03c61

Tim Hall authored Dec 19, 2023 and

Tim Hall committed Dec 19, 2023



 - The issue was that the peak memory usage was only evaluated at the
start of the tensor's lifetime and not across its whole lifetime
 - The fix is to look for the maximum usage between start and end

Change-Id: Iff4f390f3a017f1df0f8933796fa5282db7870db
Signed-off-by: Tim Hall <tim.hall@arm.com>

d2e03c61
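
A minimal sketch of the change described in MLBEDSW-8467 above, assuming memory usage is available as a list with one entry per time step and that a tensor's lifetime is an inclusive `[start, end]` range. The function names and the numbers are hypothetical.

```python
def peak_at_start(usage_per_step, start, end):
    # Behaviour described as the issue: only the first time step of the
    # lifetime is inspected.
    return usage_per_step[start]


def peak_over_lifetime(usage_per_step, start, end):
    # Behaviour described as the fix: maximum usage over the whole
    # inclusive range [start, end].
    return max(usage_per_step[start : end + 1])


usage = [100, 250, 400, 150]  # hypothetical usage in bytes per time step
old_value = peak_at_start(usage, 0, 2)
new_value = peak_over_lifetime(usage, 0, 2)
```

With these numbers the start-only evaluation misses the peak that occurs later in the tensor's lifetime.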

Nov 21, 2023

MLBEDSW-7871: Document new error types in API · 6165283b

William Isaksson authored Aug 07, 2023



- Documents Legality requirements of CMD1 payloads

- Fixes a miss in the command stream checks.

Signed-off-by: William Isaksson <william.isaksson@arm.com>
Change-Id: I9b33dedfa66650fa3100f61fd158a385818b4d52

6165283b

Nov 16, 2023

MLBEDSW-8109: Update release notes · 8cb3c360

Tim Hall authored Nov 16, 2023



 - Added release information
 - Modified SUPPORTED_OPS.md version info

Change-Id: I3ead55db45c84821c426645e488dfb765166d20f
Signed-off-by: Tim Hall <tim.hall@arm.com>

8cb3c360

MLBEDSW-8240: Document reference comparison point · 5aa9ae24

Tim Hall authored Nov 16, 2023



 - Updated TensorFlow Support section

Change-Id: Ic2551f44e7dfa996a5dcc8840d480b7985415a0a
Signed-off-by: Tim Hall <tim.hall@arm.com>

5aa9ae24

MLBEDSW-8280: Update PyPI homepage link · 2742947d

Tim Hall authored Nov 16, 2023



 - Changed homepage link from cgit to gitiles
 - Clarified tensor alignment is in Bytes

Change-Id: I9fd912c17d61f9add11493e031bbb620271c68eb
Signed-off-by: Tim Hall <tim.hall@arm.com>

2742947d