Commits · 4.1.0.rc1 · artificial-intelligence / ethos-u / Vela

Nov 14, 2024

MLBEDSW-9784: Constant propagation pass · 3efd8b62

Mauricio Briceno authored Oct 23, 2024 and

Johan Alfvén committed Nov 14, 2024



- Add support for constant propagation via separate graphir pass set
- Add RewriteFunctions attribute to select traversal direction
- Constant propagation pass traversal direction is forward
- Fixed test/util CreateGraph to not include constant tensors as Graph
  IO
- This change introduces support for TOSA::LOGICAL_LEFT_SHIFT only
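
As a rough illustration of what folding a LOGICAL_LEFT_SHIFT with constant inputs could look like, here is a minimal sketch. The `Tensor` struct and `FoldLogicalLeftShift` are hypothetical stand-ins and not the names used in the graphir pass; broadcasting and other data types are ignored.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a graph tensor: it is constant when it carries data.
struct Tensor
{
    std::optional<std::vector<int32_t>> data;
    bool IsConstant() const { return data.has_value(); }
};

// Folds out = a << b when both inputs are constant and equally sized.
// Returns std::nullopt when the operation cannot be folded.
std::optional<Tensor> FoldLogicalLeftShift(const Tensor &a, const Tensor &b)
{
    if ( !a.IsConstant() || !b.IsConstant() || a.data->size() != b.data->size() ) return std::nullopt;
    std::vector<int32_t> out(a.data->size());
    for ( std::size_t i = 0; i < out.size(); i++ )
    {
        const uint32_t shift = uint32_t((*b.data)[i]);
        if ( shift > 31 ) return std::nullopt;  // out-of-range shift: leave the op in the graph
        // Shift as unsigned so that negative values do not trigger undefined behaviour
        out[i] = int32_t(uint32_t((*a.data)[i]) << shift);
    }
    return Tensor{std::move(out)};
}
```

A forward traversal suits this kind of pass because a folded result can itself become a constant input to later operations.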

Change-Id: I7d63f3c6b11c715fc76ec37f79e85e6a75f6aa87
Signed-off-by: Mauricio Briceno <mauricio.briceno@arm.com>


MLBEDSW-9133: Output diff on MEAN op · e4a98830

Johan Alfvén authored Nov 12, 2024



 - A MEAN op with IFM rank two and reduce in C dimension caused an
output diff
 - The reason was that the intermediate tensor for calculating the sum
had the wrong shape because the reduceAxis shape had been padded to 4D
 - The fix is to use the original reduceAxis shape when calculating the
shape for the sum tensor
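
The shape issue can be sketched as follows, assuming reduceAxis is modelled as a mask with one entry per IFM axis (1 marks a reduced axis). `SumShape` is a hypothetical helper, not the function changed by this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the shape of the intermediate sum tensor: the IFM shape with every
// reduced axis collapsed to 1. The mask must have the same rank as the IFM,
// so a mask that has been padded to 4D cannot be used with a rank-2 IFM.
std::vector<int> SumShape(const std::vector<int> &ifmShape, const std::vector<int> &reduceAxis)
{
    assert(ifmShape.size() == reduceAxis.size());
    std::vector<int> shape(ifmShape.size());
    for ( std::size_t i = 0; i < shape.size(); i++ )
    {
        shape[i] = reduceAxis[i] ? 1 : ifmShape[i];
    }
    return shape;
}
```

For a rank-2 IFM of shape [8, 16] reduced in C, the original mask [0, 1] gives a sum shape of [8, 1].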

Change-Id: I144adfe07f697fecba6f7237e8b216295654f8ae
Signed-off-by: Johan Alfven <johan.alfven@arm.com>


Nov 13, 2024

MLBEDSW-9912: Make sure weight buffering cost is cleared · 3ce02f61

Johan Alfvén authored Nov 13, 2024



 - Fixed a problem where the final scheduler cost contained weight
buffer info when it should not. This caused tensor allocation to
fail due to over-allocation
 - The reason is that after optimizing the cascades, the final call to
ProposeScheduleBuffering decided that it would not need any buffering
 - However, the code to clear the weight buffering cost was missing,
so the memory size went over the limit
 - Added code to clear the weight buffering cost when it is not needed
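
A minimal sketch of the idea, using a hypothetical `OpCost` struct and `ApplyBufferingDecision` helper rather than the scheduler's real types:

```cpp
#include <cstdint>
#include <optional>

// Hypothetical, simplified per-op scheduler cost.
struct OpCost
{
    std::optional<int64_t> bufferedWeightBytes;  // set only while weights are buffered
    int64_t otherBytes = 0;
    int64_t TotalBytes() const { return otherBytes + bufferedWeightBytes.value_or(0); }
};

// Applies the outcome of a buffering proposal to the cost.
void ApplyBufferingDecision(OpCost &cost, bool needsBuffering, int64_t bufferBytes)
{
    if ( needsBuffering )
    {
        cost.bufferedWeightBytes = bufferBytes;
    }
    else
    {
        // Without this reset a stale buffer size keeps inflating the total
        cost.bufferedWeightBytes.reset();
    }
}
```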

Change-Id: I92db5c6199c99ca1ae4a21b62dfd17f2cae6ac9b
Signed-off-by: Johan Alfven <johan.alfven@arm.com>


Nov 12, 2024

MLBEDSW-9715: Fix to not recount weight buffers · dd00a882

William Isaksson authored Oct 16, 2024



Removes contribution from previously buffered weights for an op to the
buffering limit when proposing new weight buffering for the same
op.
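
The accounting can be sketched like this. `FitsBufferingLimit` and its parameters are hypothetical names chosen for illustration.

```cpp
#include <cstdint>

// Returns true if a new weight buffer of newBytes fits within limitBytes.
// usedBytes is the memory currently accounted for, including any buffer
// previously proposed for this same op (prevBytes), which the new one replaces.
bool FitsBufferingLimit(int64_t usedBytes, int64_t prevBytes, int64_t newBytes, int64_t limitBytes)
{
    const int64_t usedWithoutOp = usedBytes - prevBytes;  // do not count the old buffer twice
    return usedWithoutOp + newBytes <= limitBytes;
}
```

With 900 bytes in use, of which 200 belong to the op's earlier buffer, a new 250-byte buffer fits a 1000-byte limit; without the subtraction it would be rejected.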

Change-Id: If3ca6f6e359c29bc69f26c994503375cd353fba9
Signed-off-by: William Isaksson <william.isaksson@arm.com>


Nov 11, 2024

build: Import Catch2 via dependency infra · 183b1a22

Mauricio Briceno authored Nov 09, 2024 and

Johan Alfvén committed Nov 11, 2024



- This ensures produced artifacts are built with the same flags
- For example, Valgrind can now parse debug symbols properly

Change-Id: I798c0048890f546d2c2ec6a964ffa0f052de0840
Signed-off-by: Mauricio Briceno <mauricio.briceno@arm.com>


MLBEDSW-9898: Enable fusing rescales to ReduceSum · 0b013846

Johan Gunnarsson authored Nov 11, 2024



Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: Ia61c19749958ea271385d5efcdc3682428303ab5


MLBEDSW-9899: Enable fusing rescales with unit scale · a9d495dd

Johan Gunnarsson authored Nov 11, 2024



Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: Ie1504e5eba860785cfdf9fd09f508bd9c81e9026


MLBEDSW-9696: Initial support for TOSA TransposeConv2D · d12a62a1

Alexander Bengtsson authored Oct 22, 2024 and

Alexander Bengtsson committed Nov 11, 2024



- Support TOSA TransposeConv2D with kernel strides 1 or 2
  by adding support for output-padding.
- Simplify DecomposeTransposeConv2D and break out TFLite-specific padding
  to the TFLite reader.
- Refactor InitializeSlice into TensorSlice.Initialize
- Strides larger than 2 will require further decomposition (MLBEDSW-9761)
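
For orientation, the sketch below shows how output padding enters the output size of a transposed convolution, following the relation the TOSA specification gives for TRANSPOSE_CONV2D as I understand it. `TransposeConvOutputSize` is a hypothetical helper, not part of the decomposition code.

```cpp
// Output size along one spatial axis of a transposed convolution:
//   out = (in - 1) * stride + outPadBefore + outPadAfter + kernel
// The output padding values may be negative, which crops the output.
int TransposeConvOutputSize(int inputSize, int stride, int kernelSize, int outPadBefore, int outPadAfter)
{
    return (inputSize - 1) * stride + outPadBefore + outPadAfter + kernelSize;
}
```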

Change-Id: I54143461a5bc677ca7eefd91b9005dbdc7b924ec
Signed-off-by: Alexander Bengtsson <Alexander.Bengtsson@arm.com>


Nov 08, 2024

MLBEDSW-9889: MLCE: Update ConvertToInterpolatingLUT16 to use float · 88c49220

Johan Alfvén authored Nov 08, 2024



 - Update ConvertToInterpolatingLUT16 to use float to match reference
and avoid rounding issues
 - Update Exp int8 and int16 to use float to match reference
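
As a simplified sketch of float-based table generation (not the interpolating scheme of ConvertToInterpolatingLUT16 itself), the hypothetical `BuildLut16` below samples a function at evenly spaced points and quantizes each result to int16 with all arithmetic in float:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Builds an int16 table of func sampled at `entries` evenly spaced points in
// [inputMin, inputMax]. Every intermediate value is kept in float.
template<typename Func>
std::vector<int16_t> BuildLut16(Func func, float inputMin, float inputMax, float outputScale, int entries)
{
    assert(entries >= 2);
    std::vector<int16_t> lut(entries);
    const float step = (inputMax - inputMin) / float(entries - 1);
    for ( int i = 0; i < entries; i++ )
    {
        const float x = inputMin + float(i) * step;
        const float q = std::round(func(x) / outputScale);
        // Saturate to the int16 range before converting
        lut[i] = int16_t(std::clamp(q, -32768.0f, 32767.0f));
    }
    return lut;
}
```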

Change-Id: Ia0b6912d665d243b2dab12b002c418d04e2e124d
Signed-off-by: Johan Alfven <johan.alfven@arm.com>


MLBEDSW-9779 Fix Transpose attributes · 1f408d3d

Fredrik Svedberg authored Nov 08, 2024



Fixed Transpose attributes not always initialized for supported
Transpose operations.

Change-Id: I205d4920f8da16d9e48c07946e21c66d564e888e
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>


MLBEDSW-9889: MLCE: RSQRT int16 produces output diff · 050792c2

Johan Alfvén authored Nov 08, 2024



 - Update LUT table generation for RSQRT int16 to use
float to match reference and avoid rounding issues

Change-Id: I1d9af2a2f682dccadc918736f9e40a2e3e1c4986
Signed-off-by: Johan Alfven <johan.alfven@arm.com>


MLBEDSW-9892: MLCE: SOFTMAX int16 produces output diff · aa2389c0

Johan Alfvén authored Nov 08, 2024



 - Softmax int16 should use double precision in ElementwiseMulScale
in order to match reference and avoid rounding errors
 - Update ElementwiseMulScale to also support double, default
behavior is float
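
One way to make the precision selectable while keeping float as the default is a template parameter, sketched below. `MulRescale` is a hypothetical name, and the formula (product of the input scales divided by the output scale) is an assumption about what the rescale computes.

```cpp
// Combined rescale factor for an elementwise multiply, computed in the
// precision given by T. Callers that need to match a double-precision
// reference can request MulRescale<double>.
template<typename T = float>
T MulRescale(float ifm1Scale, float ifm2Scale, float ofmScale)
{
    return T(ifm1Scale) * T(ifm2Scale) / T(ofmScale);
}
```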

Change-Id: I207f95c273c66d99d994df2aa9cde144a01f80fe
Signed-off-by: Johan Alfven <johan.alfven@arm.com>


Nov 07, 2024

MLBEDSW-9777 Fix ReshapeOptions optional · 20ee1dbf

Fredrik Svedberg authored Nov 05, 2024



The ReshapeOptions attribute is optional for RESHAPE operations.

Change-Id: I607282bdf3871dd364c96a762674378920268725
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>


MLBEDSW-9865: MLCE: Add constraint on Softmax · 7e1e27bb

Johan Alfvén authored Nov 07, 2024



 - Lowering of softmax can only be done as long as the product of
IFM width and height is within the maximum allowed size for a tensor dimension
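
A check of this kind might look like the sketch below. `CanLowerSoftmax` and the limit `kMaxDimSize` are assumptions made for illustration; the real limit depends on the architecture.

```cpp
#include <cstdint>

// Assumed upper bound for a single tensor dimension (illustrative value only).
constexpr int64_t kMaxDimSize = 65536;

// Softmax lowering is only allowed when height * width fits in one dimension.
// 64-bit arithmetic avoids overflow in the product.
bool CanLowerSoftmax(int64_t ifmHeight, int64_t ifmWidth, int64_t maxDimSize = kMaxDimSize)
{
    return ifmHeight > 0 && ifmWidth > 0 && ifmHeight * ifmWidth <= maxDimSize;
}
```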

Change-Id: Ifeacd3a8a8a9d7b1922a32b0475256d93a16fc8d
Signed-off-by: Johan Alfven <johan.alfven@arm.com>


MLBEDSW-9769 Update RescalePerChannel for FULLY_CONNECTED · 673d493e

Johan Alfvén authored Nov 06, 2024



 - Update scale calculations to match reference for FULLY_CONNECTED
 - Refactor duplicate code and move RescalePerChannel to ethos_u_scaling

Change-Id: Ib0fc05fcbdcb124bbe5cba39c253db1f8f084c63
Signed-off-by: Johan Alfven <johan.alfven@arm.com>


Nov 06, 2024

MLBEDSW-9884: Reset all chaining registers before primary operation · 8e92d71f

Alexander Bengtsson authored Nov 06, 2024



- Remove StartChaining() and replace with ClearChainingRegisters()

Change-Id: Ib2904ef6eb93f378699cab221e1c80542c46652d
Signed-off-by: Alexander Bengtsson <Alexander.Bengtsson@arm.com>


MLBEDSW-9835 TOSA: Rescale fails for per channel scaling · 4ac32727

Fredrik Svedberg authored Nov 04, 2024



Added support for per channel scaling for Ethos-U55/65.

Change-Id: I9d1962ec6061b149abb5ccebaba53fec5cd37333
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>


MLBEDSW-9859: MLCE: Update constraint class to reject float · b2eaacae

Johan Alfvén authored Nov 04, 2024



 - Operators with float input or output are not supported and should be
set to OpType::Passthrough. This did not happen because the architecture
constraint class was missing the float constraint
 - Update constraint class to reject datatype float
 - Use passthrough options in writer since passthrough ops do not
have attribute data
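
The float rejection can be sketched as follows. The enums are reduced, hypothetical versions (only `OpType::Passthrough` is named in the commit message), and `Classify` is not the constraint class's real interface.

```cpp
#include <vector>

enum class DataType { Int8, Int16, Int32, Float32 };
enum class OpType { Native, Passthrough };  // Native is a placeholder for any supported op

bool IsFloat(DataType type)
{
    return type == DataType::Float32;
}

// An operator with any float input or output is left for the CPU (Passthrough).
OpType Classify(const std::vector<DataType> &inputs, const std::vector<DataType> &outputs)
{
    for ( DataType type : inputs )
    {
        if ( IsFloat(type) ) return OpType::Passthrough;
    }
    for ( DataType type : outputs )
    {
        if ( IsFloat(type) ) return OpType::Passthrough;
    }
    return OpType::Native;
}
```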

Change-Id: I7f2a8e18155a1c4c41761f4e8ff01ee9c8761298
Signed-off-by: Johan Alfven <johan.alfven@arm.com>


MLBEDSW-9776 Add check before accessing pads · fe9d34f8

Bjorn Davidsson authored Nov 05, 2024 and

Johan Alfvén committed Nov 06, 2024



Check array length before reading pads in
TFLiteGraphOptimiser::ConvertPad
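
A minimal sketch of such a guard, assuming the pads arrive as a flat list of (before, after) pairs for an NHWC tensor. `Padding` and `ReadPads` are hypothetical names, not the code in ConvertPad.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

struct Padding
{
    int32_t top, bottom, left, right;
};

// Reads H and W padding from a flat NHWC pad list of 8 values.
// Returns std::nullopt instead of reading out of bounds when the list is short.
std::optional<Padding> ReadPads(const std::vector<int32_t> &pads)
{
    if ( pads.size() < 8 ) return std::nullopt;
    return Padding{pads[2], pads[3], pads[4], pads[5]};
}
```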

Change-Id: I0237f3466eb038e4ab39ec4b1e5ee23d2a78eec0
Signed-off-by: Björn Davidsson <bjoern.davidsson@arm.com>


Nov 04, 2024

MLBEDSW-9536: Add support for transpose with large axis · 968818d2

Johan Gunnarsson authored Oct 26, 2024



* Use DecomposeLargeAxis, but with untransposed OFM shape (ie. the
  IFM shape). Transpose the OFM offset and shape in
  DecomposeLargeAxis.
* Don't pad shapes in RearrangeTranspose because then they won't
  match the perm attribute.
* At the end of decomposition, there can now be multiple ops for
  each swap, so we must look for all ops using the first and last
  tensor when adjusting the tensor and quantization.

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: I3bf3579c247faef42ddd83ae09b803a4de35910d


MLBEDSW-9827 Rename OpType::DepthwiseConv2DBias · bd9820da

Fredrik Svedberg authored Oct 29, 2024



Rename OpType::DepthwiseConv2DBias to OpType::DepthwiseConv2D.

Change-Id: I6ab44d50339933689d77586f551ed773b6f97102
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>


Nov 01, 2024

MLBEDSW-9836: Fix GenerateActivation assert · 046e1151

Johan Gunnarsson authored Oct 31, 2024



Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: I430d3d6cbc973ea48cb0cf2601f48972815d424c


MLBEDSW-9834: Fix BufferReader assert in ConvertBool8Tensors · ceb879b2

Johan Gunnarsson authored Oct 31, 2024



Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: Ib887c2700d0be5d28875ad252d18a564464a4646


MLBEDSW-9722: Fix BufferReader assert in FixupDilationGT2 · 215ce42d

Johan Gunnarsson authored Oct 30, 2024



Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: I72db4a56a31c6887623d687529b0628f83e98a20


Oct 30, 2024

MLBEDSW-8789 Handle depthwise convolution with depth multiplier > 1 · 54a37db0

Fredrik Svedberg authored Jun 17, 2024



Set depth multiplier in GraphBuilder.
Add depth multiplier decomposition in DecomposeDepthwiseConv2D.

Change-Id: Ic1408dadba6e9587bff725308d17b5778d184be4
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>


Oct 29, 2024

MLBEDSW-9795: Update accelerator configuration documentation · 4a00adce

Rickard Bolin authored Oct 28, 2024



Add Ethos-U85 to the accelerator configuration documentation

Change-Id: Iad1bb5f6d183730f6c268f143530674c1625a01d
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>


Oct 28, 2024

MLBEDSW-9774: Fix to not accept first weight buffering scheme when preBuffering depth is 16 · c2c65026

William Isaksson authored Oct 22, 2024 and

Fredrik Svedberg committed Oct 28, 2024

Current behaviour gives up on double buffering tensors if the pre-buffering slice happens to be the minimal one. This patch makes sure that we also try to reduce the bufferingDepth before giving up in such a situation.

Change-Id: I6f52ea7b74d18863668c533f0d37bd9b6d05deb8
Signed-off-by: William Isaksson <william.isaksson@arm.com>


Oct 22, 2024

MLBEDSW-9762: Remove redundant consumers/producers management · a42c49b6

Johan Gunnarsson authored Oct 21, 2024 and

Rickard Bolin committed Oct 22, 2024



The consumers and producers are managed automatically by the
SchedulerOperation class.

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: I68ec801cbabbcf1dfc9e134a8f3339e7cad39118


Oct 21, 2024

MLBEDSW-9764: Optimize 1x1 Conv2D as FullyConnected · 3696cf32

Johan Alfvén authored Oct 18, 2024



 - Batched Conv2D with kernel 1x1 can be optimized in the same way
as FullyConnected
 - Add condition to RewriteFullyConnected to detect the Conv2D case
 - Moved the condition for when to buffer FullyConnected to the scheduler
 - Extend CanSubdivide with optype FullyConnected
 - Set AxisOrder in TOSA reader for Conv2D

Change-Id: I8ecc9c5bb8700a00e18d54cb8cc209ab70d01f45
Signed-off-by: Johan Alfven <johan.alfven@arm.com>


MLBEDSW-9745 Fix rounding mode replaced · 4541858f

Fredrik Svedberg authored Oct 21, 2024



Original rounding mode of operation was replaced in
GraphIrOptimiser::RewriteDepthwise.

Change-Id: Ifd3d0f37af802bddf2b730eb509a2def29159bf5
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>


MLBEDSW-9766 Restore support for 4-bit weights · 45f981ce

Bjorn Davidsson authored Oct 18, 2024 and

Fredrik Svedberg committed Oct 21, 2024



Add limited support for 4-bit data to the BufferView class,
and update the code for repacking 4-bit constant weight data
as 8-bit.
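
Repacking 4-bit data as 8-bit can be sketched as below. `UnpackInt4` is a hypothetical helper; it assumes signed two's complement nibbles with the first value stored in the low nibble, which may not match every producer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Unpacks up to `count` signed 4-bit values (low nibble first) into one int8 each.
std::vector<int8_t> UnpackInt4(const std::vector<uint8_t> &packed, std::size_t count)
{
    std::vector<int8_t> out;
    out.reserve(count);
    for ( std::size_t i = 0; i < count && i / 2 < packed.size(); i++ )
    {
        const uint8_t byte = packed[i / 2];
        const int nibble = (i % 2 == 0) ? (byte & 0x0F) : (byte >> 4);
        // Sign-extend the 4-bit two's complement value to 8 bits
        out.push_back(int8_t(nibble >= 8 ? nibble - 16 : nibble));
    }
    return out;
}
```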

Change-Id: I015b059348868bf21272c99c7b425dcf3f2c1a1d
Signed-off-by: Björn Davidsson <bjoern.davidsson@arm.com>


MLBEDSW-9740 Fuse Rescale to binary op fails · b4537e37

Fredrik Svedberg authored Oct 18, 2024



Changed order of input and output rescale fusing since the extra
checks for input fusing on binary elementwise operations expect
fusing to be done in this order.

Change-Id: I5d796383de90a1477240bc45a94f5c2f71616530
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>


MLBEDSW-9763: Fix TOSA PAD regression · 8431aadc

Johan Gunnarsson authored Oct 18, 2024 and

Johan Alfvén committed Oct 21, 2024



* Add support for reshaping shapes that are used to represent
  offsets. They may have zeros in them, which were clamped to ones
  when reshaping.
* Extend unit tests.

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: Ib0a0796d36a7e6ff2d3cd1af97441dce07b53907


Oct 18, 2024

MLBEDSW-9743: Add Ethos-U85 HW-operation for ArgMax · 3296f32d

Alexander Bengtsson authored Oct 17, 2024 and

Alexander Bengtsson committed Oct 18, 2024



- Handle ArgMax-specific datatype mappings by introducing EthosU85NpuOp::ArgMax.

Change-Id: I0de682a2e28d2d70a67107973c26fc60bba0a50d
Signed-off-by: Alexander Bengtsson <Alexander.Bengtsson@arm.com>


MLBEDSW-9748: Remove OpType::Conv2DBias · 2d83daf2

Johan Alfvén authored Oct 17, 2024



 - Replace all OpType::Conv2DBias with OpType::Conv2D
since there is no different handling in the code base
 - Remove OpType::Conv2DBias

Change-Id: Id444491c87b2a1ea3a9a6c5066b5406531cc025a
Signed-off-by: Johan Alfven <johan.alfven@arm.com>


MLBEDSW-9750: Fix ReshapeTo3D with shapes less than 3D · e0c21ffe

Johan Gunnarsson authored Oct 17, 2024 and

Johan Alfvén committed Oct 18, 2024



* Fix asserts.
* Add unit tests.

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: Ibdcb9a157fca8e32ed59e03469aca2d14d5f220d


MLBEDSW-9694: Refactor unit-test helpers into common header · aff28cd9

Alexander Bengtsson authored Oct 07, 2024 and

Alexander Bengtsson committed Oct 18, 2024

Change-Id: I6c977a1b810de949e51204fa5f75782f3cbb8fb6
Signed-off-by: Alexander Bengtsson <Alexander.Bengtsson@arm.com>

Oct 17, 2024

MLBEDSW-9742 Batched conv2d produces output diff · c903b931

Johan Alfvén authored Oct 17, 2024



 - An output diff occurred because decomposition was not done for
convolutions when batch was greater than 1
 - Add check to CanRunOnHardware checking for batch for conv
 - Use SliceShape instead of full shape in DecomposeConv2D and
DecomposeDepthwiseConv2D

Change-Id: Ic932ad329327beea75df8e76b48d737e53aa3cb0
Signed-off-by: Johan Alfven <johan.alfven@arm.com>


MLBEDSW-9666: Read pad values in correct data type · bf0d7a27

Rickard Bolin authored Oct 11, 2024



- The code assumed padValues were int32, but they can also be int64
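
Reading the values according to their declared type could be sketched like this. `DataType` and `ReadPadValues` are hypothetical names; the buffer is assumed to hold tightly packed values in native byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

enum class DataType { Int32, Int64 };

// Decodes a raw pad buffer as int32 or int64 and widens every value to int64.
std::vector<int64_t> ReadPadValues(const std::vector<uint8_t> &raw, DataType type)
{
    const std::size_t elemSize = (type == DataType::Int64) ? sizeof(int64_t) : sizeof(int32_t);
    std::vector<int64_t> values(raw.size() / elemSize);
    for ( std::size_t i = 0; i < values.size(); i++ )
    {
        if ( type == DataType::Int64 )
        {
            int64_t v = 0;
            std::memcpy(&v, raw.data() + i * elemSize, sizeof(v));
            values[i] = v;
        }
        else
        {
            int32_t v = 0;
            std::memcpy(&v, raw.data() + i * elemSize, sizeof(v));
            values[i] = v;
        }
    }
    return values;
}
```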

Change-Id: I359fdf6f9c4fbb23926669c9f9207da44af18fa3
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>


MLBEDSW-8903: Add support for 8D transpose · d4c88248

Johan Gunnarsson authored Oct 08, 2024 and

Johan Alfvén committed Oct 17, 2024



* Remove transpose constraint for Ethos-U85.
* Add decomposition of transpose operations.
* Refactor the ReshapeTo3D functions.
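
One generic way to decompose a transpose is to break its permutation into a sequence of two-axis swaps, sketched below. This is an illustration of the general technique under that assumption, not the decomposition the commit implements; `DecomposeIntoSwaps` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Returns axis swaps (i, j) that, applied in order to the identity axis
// order 0..n-1, produce the permutation perm.
std::vector<std::pair<int, int>> DecomposeIntoSwaps(const std::vector<int> &perm)
{
    std::vector<int> current(perm.size());
    std::iota(current.begin(), current.end(), 0);
    std::vector<std::pair<int, int>> swaps;
    for ( std::size_t i = 0; i < perm.size(); i++ )
    {
        // Find where the axis wanted at position i currently sits
        auto it = std::find(current.begin() + std::ptrdiff_t(i), current.end(), perm[i]);
        assert(it != current.end());  // perm must be a valid permutation
        const std::size_t j = std::size_t(it - current.begin());
        if ( j != i )
        {
            std::swap(current[i], current[j]);
            swaps.emplace_back(int(i), int(j));
        }
    }
    return swaps;
}
```

A permutation of n axes needs at most n - 1 swaps with this scheme, so an 8D transpose would take at most seven.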

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: I4bf7a69b65b71312c34eed4ae16d7d899383d134
