Commits · 4.1.0 · artificial-intelligence / ethos-u / Vela

Nov 27, 2024

MLBEDSW-10037: Update release notes · a08fc187

Rickard Bolin authored Nov 26, 2024



Updated release notes and added wheel information to README

Change-Id: Id6e01c6cc901855984e07ff4fe3024c385c6a57b
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>

a08fc187

Nov 21, 2024

MLBEDSW-9476: Passthrough operations with shapeless tensors · 015f9714

Alexander Bengtsson authored Nov 21, 2024 and

Alexander Bengtsson committed Nov 21, 2024



- Shapes for non-constant tensors cannot be inferred at compile time.
  Operations with non-constant shapeless tensors are pushed to CPU by
  setting Passthrough.

Change-Id: Id960c82de9e1905b63adbc830113f03d5db92842
Signed-off-by: Alexander Bengtsson <Alexander.Bengtsson@arm.com>

015f9714
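
The rule in 015f9714 can be sketched as a simple predicate. This is a minimal illustration only: the `(shape, is_constant)` tuple representation and the name `needs_passthrough` are hypothetical and do not reflect Vela's internal tensor types.

```python
def needs_passthrough(tensors):
    """Return True if any tensor is non-constant and has no shape.

    `tensors` is an iterable of (shape, is_constant) pairs, where a
    missing shape is represented by None (a hypothetical encoding).
    """
    return any(shape is None and not is_constant
               for shape, is_constant in tensors)
```

An op for which this returns True would be marked Passthrough and left to the CPU.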

Nov 19, 2024

MLBEDSW-9780: Handle shapeless tensors in debug database · fcd11b9b

Johan Gunnarsson authored Nov 18, 2024 and

Rickard Bolin committed Nov 19, 2024



Store output tensor shape as empty strings.

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: I4d173f395ab64cf59086531af9a761e9f677ce5d

fcd11b9b

MLBEDSW-9780: Sum all performance reports · b54925d5

Johan Gunnarsson authored Nov 18, 2024 and

Rickard Bolin committed Nov 19, 2024



In the case of multiple subgraphs, return the sum of the performance
report of each subgraph.

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: Id098b068dd548bfb1ecafda20dad4adedc46deca

b54925d5
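
A minimal sketch of the summing described in b54925d5. `PerformanceReport` and its fields `npu_cycles` and `macs` are hypothetical names chosen for illustration; the real report structure is not shown in this log.

```python
from dataclasses import dataclass, fields


@dataclass
class PerformanceReport:
    # Hypothetical counters, not Vela's actual fields.
    npu_cycles: int = 0
    macs: int = 0


def sum_reports(reports):
    """Return one report whose counters are the sums over all subgraphs."""
    total = PerformanceReport()
    for report in reports:
        for f in fields(PerformanceReport):
            setattr(total, f.name,
                    getattr(total, f.name) + getattr(report, f.name))
    return total
```

With no subgraphs the result is an all-zero report, so callers do not need a special case.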

MLBEDSW-9780: Add placeholder tensors to unconnected ops · 3c4f0bbb

Johan Gunnarsson authored Nov 14, 2024 and

Rickard Bolin committed Nov 19, 2024



The TFLite CallOnce and TFLite AssignVariable ops have no OFM and are
therefore not reachable when traversing the graph starting from
the graph output tensors, but we still need to keep them in the
graph so we can pass them to CPU. This change discovers such
unconnected ops when loading the graph, adds placeholder
output tensors to them, and stores the placeholder tensors in both
the graph outputs and a separate list. Later, when writing out the
graph, these placeholder tensors are removed again.

Do the same for ops without input tensors.

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: I68c963c8b9c447527eb16ef415b85ac0253f9ff6

3c4f0bbb
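
The add-then-remove pattern from 3c4f0bbb might look roughly like the sketch below. It covers only the no-output case, and every name in it (`Op`, `add_placeholders`, `remove_placeholders`, the `placeholder_out_` prefix) is hypothetical rather than taken from the Vela sources.

```python
class Op:
    def __init__(self, name, inputs=(), outputs=()):
        self.name = name
        self.inputs = list(inputs)    # names of consumed tensors
        self.outputs = list(outputs)  # names of produced tensors


def add_placeholders(ops, graph_outputs):
    """Give every op without outputs a placeholder output tensor.

    The placeholder is also appended to the graph outputs so that the
    op becomes reachable from them. Returns the placeholder names so
    they can be removed again later.
    """
    placeholders = []
    for op in ops:
        if not op.outputs:
            name = f"placeholder_out_{op.name}"
            op.outputs.append(name)
            graph_outputs.append(name)
            placeholders.append(name)
    return placeholders


def remove_placeholders(ops, graph_outputs, placeholders):
    """Undo add_placeholders before the graph is written out."""
    drop = set(placeholders)
    for op in ops:
        op.outputs = [t for t in op.outputs if t not in drop]
    graph_outputs[:] = [t for t in graph_outputs if t not in drop]
```

Keeping the placeholders in a separate list is what makes the removal step cheap and exact.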

MLBEDSW-7940: Document pip install options for C++ files. · e6c6b4e0

Alexander Bengtsson authored Nov 18, 2024 and

Alexander Bengtsson committed Nov 19, 2024



- Extend Advanced Installation for Developers with build-options for
  C++ files.

Change-Id: Iae25e953a0770913b84c1c9d016555a29b603042
Signed-off-by: Alexander Bengtsson <Alexander.Bengtsson@arm.com>

e6c6b4e0

MLBEDSW-9647: Hide cycle estimates in summary behind a flag · 5d156fc6

Rickard Bolin authored Nov 11, 2024



- Hide cycle estimates behind a new flag, --verbose-cycle-estimate
- Update documentation to include the new flag
- Add notes highlighting that the estimates are only estimates and that
  for more accurate numbers the software model is suggested.

Change-Id: Id1342b71cca3ee11dead9a6fdf084fd2b5d030cb
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>

5d156fc6

MLBEDSW-9905: Fix commit causing compilation breakage · 3586030f

Philip Hall authored Nov 11, 2024 and

Johan Alfvén committed Nov 19, 2024



A previous commit introduced compilation failures for
non-GCC compilers.

 - MSVC refused to fit a float lambda into a double shaped hole
   without conversion complaints. This fixes the function
   signatures to allow compilation to complete.
 - Removed unnecessary stub lambdas.
 - Compilers complained heavily about the new mix of implicit
   and explicit float conversions in adjusted code. This commit
   rationalises those conversions.
 - Removed redundant RoundAwayZero template function that was
   not selecting the correct codepath in the float conversion
   mix.
 - Cleaned up other, not directly related, float conversion
   warnings.

Signed-off-by: Philip Hall <philip.hall@arm.com>
Change-Id: I029c0cad8652a9e0f3579c42948b8b8a53d8d093

3586030f

MLBEDSW-9954: Use const-ref parameters in test/util · 536681a6

Alexander Bengtsson authored Nov 18, 2024 and

Alexander Bengtsson committed Nov 19, 2024



Change-Id: Ibee1a815f464cb9fcd8f7e8da8ba6b889225c228
Signed-off-by: Alexander Bengtsson <Alexander.Bengtsson@arm.com>

536681a6

MLBEDSW-9960: Storage shape must be used in ReusableIFM check · a3384a6d

Johan Alfvén authored Nov 18, 2024



 - Assert triggered in FuseRanges on size criteria for input and output
 - Even if tensor connection shapes are equal for ifm and ofm, the actual
storage shape can differ due to removal of operators like RESHAPE, and
when this is the case the ranges cannot be fused
 - Using the storage shape in ReusableIFM fixes this issue

Change-Id: I1db918e80edcf807c34670eac209d5853706548e
Signed-off-by: Johan Alfven <johan.alfven@arm.com>

a3384a6d

Nov 18, 2024

MLBEDSW-9964: Fused RESCALE removes graph output · bd938a83

Johan Alfvén authored Nov 18, 2024



 - If the input tensor to a RESCALE op is also a graph output tensor, it is
not possible to fuse this RESCALE op with another op

Change-Id: Id615d5ca05ecb09e73cd009ee62cdf315777c618
Signed-off-by: Johan Alfven <johan.alfven@arm.com>

bd938a83
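
The condition in bd938a83 reduces to a membership test. The sketch below is illustrative only; `can_fuse_rescale` and the use of tensor names as strings are assumptions, not Vela's API.

```python
def can_fuse_rescale(rescale_input, graph_outputs):
    """Return False when the RESCALE input is itself a graph output.

    Fusing in that case would remove a tensor that the graph must
    still produce, so the RESCALE has to stay a separate op.
    """
    return rescale_input not in graph_outputs
```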

MLBEDSW-9780: Fix FixupDilationGT2 · 165ef7b2

Johan Gunnarsson authored Nov 12, 2024 and

Johan Alfvén committed Nov 18, 2024



The calls to WithDilation() had their width and height parameters in
the wrong order. The Point2i constructor takes W, H, not H, W.

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: I20eb2d16120e2e1ff376cad5befe817df46d3111

165ef7b2
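
The class of bug fixed in 165ef7b2 is easy to reproduce with any positional (W, H) type. The Python stand-ins below are hypothetical: `Point2i` here is a plain named tuple and `with_dilation` only mimics the role of WithDilation().

```python
from typing import NamedTuple


class Point2i(NamedTuple):
    x: int  # width
    y: int  # height


def with_dilation(dilation):
    # Hypothetical stand-in: records the dilation per axis.
    return {"dilation_w": dilation.x, "dilation_h": dilation.y}


dilation_h, dilation_w = 4, 2
wrong = with_dilation(Point2i(dilation_h, dilation_w))  # H, W: swapped
right = with_dilation(Point2i(dilation_w, dilation_h))  # W, H: as expected
```

The swap goes unnoticed whenever the dilation is the same in both directions, which is why such bugs tend to surface late.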

MLBEDSW-9780: Sort CPU ops with early producers as early · 1a68b9dd

Johan Gunnarsson authored Nov 11, 2024 and

Johan Alfvén committed Nov 18, 2024



In practice this will affect TFLite VarHandle, TFLite ReadVariable
and TFLite CallOnce.

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: I3b94e99d0e7a6feaeb661d5e8ea5b8d4d0d260b1

1a68b9dd

MLBEDSW-9941: Abort with error for TOSA input with CPU fallback · d6c6ada6

Alexander Bengtsson authored Nov 13, 2024 and

Alexander Bengtsson committed Nov 18, 2024



- If input is provided via GraphAPI, the graph must map fully to
  hardware.

Change-Id: I22931d12cc8e6aa9e0a154d1a453821c0012fea5
Signed-off-by: Alexander Bengtsson <Alexander.Bengtsson@arm.com>

d6c6ada6

MLBEDSW-9476 Handle WhileOptions · 4b918415

Fredrik Svedberg authored Nov 15, 2024



Implemented passthrough of WhileOptions.

Change-Id: I05f729033c9749b62f18b9bfa180d56394f2cb78
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>

4b918415

MLBEDSW-9476 Preserve subgraph names · 5cf0c66b

Fredrik Svedberg authored Nov 15, 2024



Added code to preserve subgraph names in output file.

Change-Id: Ia1f07420133692c361d31853ed113a1ca34b12c3
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>

5cf0c66b

Nov 15, 2024

MLBEDSW-8783 Decompose CONV_2D with large strides · 70b82a27

Bjorn Davidsson authored Oct 09, 2024



Add support for CONV_2D with stride > 3:

* Decompose into blocks that fit in accumulator RAM.
  This also adds support for FM dimensions > 64K.
* Add checks for support of accumulator control to architecture.
* Decompose ops with large strides to multiple 1x1-kernel ops, all adding to the same
  accumulators.

Change-Id: I083a5ca112d14019f0317813564e9d3309a69bd1
Signed-off-by: Björn Davidsson <bjoern.davidsson@arm.com>

70b82a27
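
The arithmetic behind the last bullet of 70b82a27 can be checked with a small single-channel model: a strided convolution equals the sum of one 1x1-kernel op per kernel position, each reading the input at an offset and adding into a shared accumulator. This is a sketch of the identity only, assuming no padding; it does not model the blocking for accumulator RAM or the hardware accumulator control, and the function names are hypothetical.

```python
def conv2d(ifm, weights, stride):
    """Reference single-channel 2D convolution without padding."""
    kh, kw = len(weights), len(weights[0])
    oh = (len(ifm) - kh) // stride + 1
    ow = (len(ifm[0]) - kw) // stride + 1
    return [[sum(ifm[y * stride + ky][x * stride + kx] * weights[ky][kx]
                 for ky in range(kh) for kx in range(kw))
             for x in range(ow)] for y in range(oh)]


def conv2d_decomposed(ifm, weights, stride):
    """Same result from 1x1-kernel ops sharing one accumulator."""
    kh, kw = len(weights), len(weights[0])
    oh = (len(ifm) - kh) // stride + 1
    ow = (len(ifm[0]) - kw) // stride + 1
    acc = [[0] * ow for _ in range(oh)]
    for ky in range(kh):
        for kx in range(kw):
            w = weights[ky][kx]  # the 1x1 kernel for this position
            for y in range(oh):
                for x in range(ow):
                    # Strided read of the IFM, offset by (ky, kx).
                    acc[y][x] += ifm[y * stride + ky][x * stride + kx] * w
    return acc
```

Each inner pass touches only one weight, so the stride limit of the original kernel no longer applies to the individual ops.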

Nov 14, 2024

MLBEDSW-9780: Handle tensors with variably sized datatypes · 0a8c8ad0

Johan Gunnarsson authored Nov 08, 2024



Tensors with variably sized datatypes (i.e. String, Resource or
Variant), can't have a non-empty BufferView because we don't know
the size of each element.

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: I35c986e0521c6e4a8192ba8e88c127692b483d97

0a8c8ad0

MLBEDSW-9780: Handle a few more TFLite options · b6ab73d6

Johan Gunnarsson authored Nov 08, 2024



Add support for reading and writing the following options tables:

* CallOnceOptions
* VarHandleOptions
* ReadVariableOptions
* AssignVariableOptions

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: Iaea0c79bdead53e29613e96b650ab7d3c9de97b5

b6ab73d6

MLBEDSW-9780: Passthrough all unknown TFLite ops · 7cd09ea1

Johan Gunnarsson authored Nov 11, 2024



There can be networks that contain operators that we don't know
about (i.e. not listed in tflite_mapping.cpp). Pass such operators
to CPU.

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: I37bd67d440ec6a5d4aafad3d6f17c7f050050cf2

7cd09ea1
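
The fallback in 7cd09ea1 amounts to a table lookup with a default. The table contents and the names `KNOWN_OPS` and `map_op` below are illustrative assumptions; the real mapping lives in tflite_mapping.cpp and is not reproduced here.

```python
# Illustrative subset only, not the real mapping table.
KNOWN_OPS = {"CONV_2D": "Conv2D", "RESHAPE": "Reshape"}


def map_op(tflite_op_name):
    """Map a TFLite operator name to an internal op type.

    Anything missing from the table becomes Passthrough, meaning the
    operator is left for the CPU instead of being rejected.
    """
    return KNOWN_OPS.get(tflite_op_name, "Passthrough")
```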

MLBEDSW-9891: Only output one OfflineMemoryAllocation per network · 37844eae

Johan Gunnarsson authored Nov 08, 2024



Before this patch, one OfflineMemoryAllocation buffer was serialised
per subgraph, containing all tensors in that subgraph. This is not
correct. There should be only one OfflineMemoryAllocation and it
should contain all tensors in all subgraphs.

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: I43f42fba65584dbec24b2a0c787c15fe4135e8a0

37844eae

MLBEDSW-9778: Don't assume all tensors have shapes · 0d3bed0e

Johan Gunnarsson authored Nov 11, 2024



Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: I528691b492e50aac012b8ff3eabc8c57d41bf399

0d3bed0e

MLBEDSW-9784: Constant propagation pass · 3efd8b62

Mauricio Briceno authored Oct 23, 2024 and

Johan Alfvén committed Nov 14, 2024



- Add support for constant propagation via separate graphir pass set
- Add RewriteFunctions attribute to select traversal direction
- Constant propagation pass traversal direction is forward
- Fixed test/util CreateGraph to not include constant tensors as Graph
  IO
- This change introduces support for TOSA::LOGICAL_LEFT_SHIFT only

Change-Id: I7d63f3c6b11c715fc76ec37f79e85e6a75f6aa87
Signed-off-by: Mauricio Briceno <mauricio.briceno@arm.com>

3efd8b62

MLBEDSW-9133: Output diff on MEAN op · e4a98830

Johan Alfvén authored Nov 12, 2024



 - A MEAN op with IFM rank two and reduce in C dimension caused an
output diff
 - The reason was that the intermediate tensor for calculating the sum
had the wrong shape because the reduceAxis shape had been padded to 4D
 - The fix is to use the original reduceAxis shape when calculating the
shape for the sum tensor

Change-Id: I144adfe07f697fecba6f7237e8b216295654f8ae
Signed-off-by: Johan Alfven <johan.alfven@arm.com>

e4a98830

Nov 13, 2024

MLBEDSW-9912: Make sure weight buffering cost is cleared · 3ce02f61

Johan Alfvén authored Nov 13, 2024



 - Fixed a problem where the final scheduler cost contained stale
weight buffer information. This caused the tensor allocation to fail
due to over-allocation
 - The reason is that after optimizing the cascades, the final call to
ProposeScheduleBuffering decided that no buffering was needed
 - However, due to missing code the weight buffering cost was not
cleared, so the memory size went over the limit
 - Added code to clear the weight buffering cost when it is not needed

Change-Id: I92db5c6199c99ca1ae4a21b62dfd17f2cae6ac9b
Signed-off-by: Johan Alfven <johan.alfven@arm.com>

3ce02f61

Nov 12, 2024

MLBEDSW-9715: Fix to not recount weight buffers · dd00a882

William Isaksson authored Oct 16, 2024



Removes contribution from previously buffered weights for an op to the
buffering limit when proposing new weight buffering for the same
op.

Change-Id: If3ca6f6e359c29bc69f26c994503375cd353fba9
Signed-off-by: William Isaksson <william.isaksson@arm.com>

dd00a882
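
A minimal sketch of the accounting change in dd00a882: when a new buffering proposal is made for an op, that op's earlier reservation is excluded from the total checked against the limit, because the new proposal replaces it. The function name, the dictionary of reservations and the byte sizes are all hypothetical.

```python
def can_buffer(op, new_size, buffered, limit):
    """Check whether `new_size` bytes of weights fit for `op`.

    `buffered` maps op name to the bytes currently reserved for
    weight buffering. The op's own previous reservation is left out
    so it is not counted twice.
    """
    used_by_others = sum(size for name, size in buffered.items()
                         if name != op)
    return used_by_others + new_size <= limit
```

Without the `name != op` filter, re-proposing a buffer for `conv1` below would be rejected even though it fits.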

Nov 11, 2024

build: Import Catch2 via dependency infra · 183b1a22

Mauricio Briceno authored Nov 09, 2024 and

Johan Alfvén committed Nov 11, 2024



- This ensures produced artifacts are built with the same flags
- For example valgrind can now parse debug symbols properly

Change-Id: I798c0048890f546d2c2ec6a964ffa0f052de0840
Signed-off-by: Mauricio Briceno <mauricio.briceno@arm.com>

183b1a22

MLBEDSW-9898: Enable fusing rescales to ReduceSum · 0b013846

Johan Gunnarsson authored Nov 11, 2024



Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: Ia61c19749958ea271385d5efcdc3682428303ab5

0b013846

MLBEDSW-9899: Enable fusing rescales with unit scale · a9d495dd

Johan Gunnarsson authored Nov 11, 2024



Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: Ie1504e5eba860785cfdf9fd09f508bd9c81e9026

a9d495dd

MLBEDSW-9696: Initial support for TOSA TransposeConv2D · d12a62a1

Alexander Bengtsson authored Oct 22, 2024 and

Alexander Bengtsson committed Nov 11, 2024



- Support TOSA TransposeConv2D with kernel strides 1 or 2
  by adding support for output-padding.
- Simplify DecomposeTransposeConv2D and breakout TFLite-specific padding
  to the TFLite reader.
- Refactor InitializeSlice into TensorSlice.Initialize
- Strides larger than 2 will require further decomposition (MLBEDSW-9761)

Change-Id: I54143461a5bc677ca7eefd91b9005dbdc7b924ec
Signed-off-by: Alexander Bengtsson <Alexander.Bengtsson@arm.com>

d12a62a1

Nov 08, 2024

MLBEDSW-9889: MLCE: Update ConvertToInterpolatingLUT16 to use float · 88c49220

Johan Alfvén authored Nov 08, 2024



 - Update ConvertToInterpolatingLUT16 to use float to match reference
and avoid rounding issues
 - Update Exp int8 and int16 to use float to match reference

Change-Id: Ia0b6912d665d243b2dab12b002c418d04e2e124d
Signed-off-by: Johan Alfven <johan.alfven@arm.com>

88c49220

MLBEDSW-9779 Fix Transpose attributes · 1f408d3d

Fredrik Svedberg authored Nov 08, 2024



Fixed Transpose attributes not always being initialized for supported
Transpose operations.

Change-Id: I205d4920f8da16d9e48c07946e21c66d564e888e
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>

1f408d3d

MLBEDSW-9889: MLCE: RSQRT int16 produces output diff · 050792c2

Johan Alfvén authored Nov 08, 2024



 - Update LUT table generation for RSQRT int16 to use
float to match reference and avoid rounding issues

Change-Id: I1d9af2a2f682dccadc918736f9e40a2e3e1c4986
Signed-off-by: Johan Alfven <johan.alfven@arm.com>

050792c2

MLBEDSW-9892: MLCE: SOFTMAX int16 produces output diff · aa2389c0

Johan Alfvén authored Nov 08, 2024



 - Softmax int16 should use double precision in ElementwiseMulScale
in order to match reference and avoid rounding errors
 - Update ElementwiseMulScale to also support double, default
behavior is float

Change-Id: I207f95c273c66d99d994df2aa9cde144a01f80fe
Signed-off-by: Johan Alfven <johan.alfven@arm.com>

aa2389c0
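
Why the choice between float and double matters in aa2389c0 can be seen by rounding a scale value to single precision. The sketch below does not reproduce ElementwiseMulScale; it only shows, with the standard `struct` module, the representation error that a float32 scale carries relative to a double. The value 0.1 is an arbitrary example.

```python
import struct


def to_float32(value):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


scale = 0.1                    # not exactly representable in binary
scale_f32 = to_float32(scale)  # single precision keeps fewer bits
error = abs(scale_f32 - scale)
```

When such a scale is later quantised to an integer multiplier and shift, an error of this size can be enough to change the last bit and produce an output diff against a double-precision reference.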

Nov 07, 2024

MLBEDSW-9777 Fix ReshapeOptions optional · 20ee1dbf

Fredrik Svedberg authored Nov 05, 2024



The ReshapeOptions attribute is optional for RESHAPE operations.

Change-Id: I607282bdf3871dd364c96a762674378920268725
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>

20ee1dbf

MLBEDSW-9865: MLCE: Add constraint on Softmax · 7e1e27bb

Johan Alfvén authored Nov 07, 2024



 - Lowering of softmax can only be done as long as the product of
IFM width and height is within max allowed size for tensor dimension

Change-Id: Ifeacd3a8a8a9d7b1922a32b0475256d93a16fc8d
Signed-off-by: Johan Alfven <johan.alfven@arm.com>

7e1e27bb
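
The constraint added in 7e1e27bb can be sketched as a single comparison. The limit of 65536 is an assumption made for illustration, as are the names `MAX_TENSOR_DIM` and `can_lower_softmax`; the actual maximum depends on the target architecture.

```python
MAX_TENSOR_DIM = 65536  # assumed limit, for illustration only


def can_lower_softmax(ifm_width, ifm_height, max_dim=MAX_TENSOR_DIM):
    """Return True if width * height fits in one tensor dimension."""
    return ifm_width * ifm_height <= max_dim
```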

MLBEDSW-9769 Update RescalePerChannel for FULLY_CONNECTED · 673d493e

Johan Alfvén authored Nov 06, 2024



 - Update scale calculations to match reference for FULLY_CONNECTED
 - Refactor duplicate code and move RescalePerChannel to ethos_u_scaling

Change-Id: Ib0fc05fcbdcb124bbe5cba39c253db1f8f084c63
Signed-off-by: Johan Alfven <johan.alfven@arm.com>

673d493e

Nov 06, 2024

MLBEDSW-9884: Reset all chaining registers before primary operation · 8e92d71f

Alexander Bengtsson authored Nov 06, 2024



- Remove StartChaining() and replace with ClearChainingRegisters()

Change-Id: Ib2904ef6eb93f378699cab221e1c80542c46652d
Signed-off-by: Alexander Bengtsson <Alexander.Bengtsson@arm.com>

8e92d71f

MLBEDSW-9835 TOSA: Rescale fails for per channel scaling · 4ac32727

Fredrik Svedberg authored Nov 04, 2024



Added support for per channel scaling for Ethos-U55/65.

Change-Id: I9d1962ec6061b149abb5ccebaba53fec5cd37333
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>

4ac32727

MLBEDSW-9859: MLCE: Update constraint class to reject float · b2eaacae

Johan Alfvén authored Nov 04, 2024



 - Operators with float input or output are not supported and should be
set to OpType::Passthrough. This does not happen since the architecture
constraint class is missing the float constraint
 - Update constraint class to reject datatype float
 - Use passthrough options in writer since passthrough ops do not
have attribute data
Change-Id: I7f2a8e18155a1c4c41761f4e8ff01ee9c8761298
Signed-off-by: Johan Alfven <johan.alfven@arm.com>

b2eaacae