- May 21, 2024
-
-
Rickard Bolin authored
- Add release notes for 3.12.0 - Update README with supported TensorFlow version Change-Id: Ia95955e9fbe824d1dc62e3960a9ed43fa3689de5 Signed-off-by:
Rickard Bolin <rickard.bolin@arm.com>
-
- Restrict third-party dependency versions based upon compatibility and security concerns. Change-Id: Ic50eb987f012df8c0c46ef0ee51d907b7609edae Signed-off-by:
Alexander Bengtsson <Alexander.Bengtsson@arm.com>
-
- May 20, 2024
-
-
Johan Alfvén authored
- Fix performance regression caused by the concat grouping fix.
- If no CPU op interferes, there is no need to group the avg pool ops. Keep the old compiler behavior for that use case.
Change-Id: I6476585d7dedff0b9edd8b9c300a71c181496cf1 Signed-off-by:
Johan Alfven <johan.alfven@arm.com>
-
- May 16, 2024
-
-
Rickard Bolin authored
Change-Id: Ie6f39d9c4125f7c16d27621de47cd76143c2e636 Signed-off-by:
Rickard Bolin <rickard.bolin@arm.com>
-
- May 15, 2024
-
-
Johan Alfvén authored
- Concat is implemented by several avgpool ops, all of them writing to the same ofm but with a slice offset. If a compiled network contains CPU fallbacks, the avgpool ops might end up running in different custom ops. This works fine as long as the runtime provides the same scratch area. If not, the output from the concat might be corrupt.
- This fix adds an extra step to the pass packing so that all avgpool ops for a concat are grouped together and run within the same custom op in order to prevent possible corruption.
Change-Id: I343e08d7b4046f969b3d9ec3479db6490cbe4170 Signed-off-by:
Johan Alfven <johan.alfven@arm.com>
-
- Apr 24, 2024
-
-
Johan Alfvén authored
- A fully connected op with a batch shape uses its weights more than once. Models with this type of fully connected op will benefit from weight buffering.
- If a fully connected op with this shape is detected, it is changed to a conv2d and the normal weight buffering flow is used.
Change-Id: I272741a32390e036d5e04bd5af41d4538162e86e Signed-off-by:
Johan Alfven <johan.alfven@arm.com>
-
Johan Alfvén authored
- Assert in build pass links because a concat op is split into several avgpool ops which run in different custom ops. The code did not expect the pass to have a dependency on itself.
- Fixed the assert to handle this special case.
Change-Id: Id03b1145b19c25bf967a1061aa5ecf559b3bc1cc Signed-off-by:
Johan Alfven <johan.alfven@arm.com>
-
- Apr 23, 2024
-
-
Signed-off-by:
Per Åstrand <per.astrand@arm.com> Change-Id: Idc6f6959bfc7eabce2f5b6e0d4935d292dcf6618
-
- Apr 12, 2024
-
-
Per Astrand authored
Reshape the weight for depthwise conv2d and set the depth_multiplier attribute on the operation. Signed-off-by:
Per Åstrand <per.astrand@arm.com> Change-Id: I3b73988fa8c4e0cbe2430874cefe6d002885ec89
-
Per Astrand authored
Remove the upscale to int32 before and after the add operation. Re-enable fusing of conv2d and rescale that was removed earlier. Signed-off-by:
Per Åstrand <per.astrand@arm.com> Change-Id: I5e7d9bd99bb3925588b507824d8eb3e6642cc7f0
-
- Apr 05, 2024
-
-
Johan Alfvén authored
Change-Id: I8e584a036036f35a8883b2a4884cb2d54e675e39 Signed-off-by:
Johan Alfven <johan.alfven@arm.com>
-
Johan Alfvén authored
- An assert was triggered because the tensor consumer list did not contain the expected operators.
- The problem happened because a concat op was split into two avgpool ops and these two ops run in separate subgraphs with a CPU node in between. Since the avgpool ops share the same output tensor, the tensor consumer list was corrupted when the last subgraph was traversed.
- The fix is to ignore ops that do not belong in the subgraph's set of operators (the pass list) when updating the consumers.
Change-Id: I4d94b54c77001f6447bec31ec62daeebc9b104f9 Signed-off-by:
Johan Alfven <johan.alfven@arm.com>
-
- Apr 04, 2024
-
-
Johan Alfvén authored
- Fix regression caused by too strict constraints on SplitSpliceRead causing output diff for LSTM. - As long as the SplitSpliceRead shape fits within the consumer ifm shape it is ok to move the read. Change-Id: Ia6f508f99638c3aedccc7fd9f31405527bb64f87 Signed-off-by:
Johan Alfven <johan.alfven@arm.com>
-
- Apr 03, 2024
-
-
Johan Alfvén authored
- When possible, a read slice from a split or stride is moved to the following op. The problem in this case was that the following op was a Maxpool op (from Softmax). The Maxpool op is using a different input shape compared to the original Softmax op, and this input shape was then changed when the read slice was applied to the Maxpool op. - The result is a faulty Maxpool op with an output diff. - The fix is to prevent moving the slice read when the consumer input shape differs from the Split/Stride ofm shape Change-Id: I649d89c38645fa51c20c3602954e2b8af9372076 Signed-off-by:
Johan Alfven <johan.alfven@arm.com>
-
Johan Alfvén authored
- A network containing several softmax operators caused an output diff.
- The problem was that the code that detects if the LUT is already in internal SRAM calculated everything correctly except for which LUT index to use.
- The code should use the slot_size and not the LUT size when calculating the index, which fixes this problem.
- Updated unit tests.
Change-Id: I07686651a883ccbba7c173e7191eb21f9ff15bf5 Signed-off-by:
Johan Alfven <johan.alfven@arm.com>
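The index calculation described in this commit can be illustrated with a small sketch. This is not Vela's actual code: `SLOT_SIZE`, `slot_index`, and the byte values below are hypothetical and only show why dividing an offset by the LUT size instead of the slot size can select the wrong slot.

```python
# Hypothetical values for illustration only; they are not taken from Vela.
SLOT_SIZE = 256  # assumed size in bytes of one LUT slot in internal SRAM


def slot_index(offset: int, divisor: int = SLOT_SIZE) -> int:
    """Map a byte offset in the LUT area to an index by integer division."""
    return offset // divisor


# A LUT that spans two slots (512 bytes) placed at byte offset 1024.
lut_size = 512
offset = 1024

correct_index = slot_index(offset)          # divides by the slot size
wrong_index = slot_index(offset, lut_size)  # divides by the LUT size
```

With these assumed numbers the two divisors give different indices, which is the kind of mismatch the commit message describes.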
-
- Apr 02, 2024
-
-
- Add ext_key tracking. - Fix debug db cmd offsets being off by 4. Change-Id: Ib109a15a0a2c44d08021c3b1bc3bcc067240ac5c Signed-off-by:
William Isaksson <william.isaksson@arm.com>
-
- Mar 12, 2024
-
-
Remove scales and biases from encoded weights size. This aligns better with original weights size (which only represents the weight tensor) Change-Id: I5aabf61385d8fdf150764c45e04ba4388c6a63f0 Signed-off-by:
Alexander Bengtsson <Alexander.Bengtsson@arm.com>
-
- Mar 07, 2024
-
-
Oscar Andersson authored
- Fix TOSA imports
- Handle weights connected to Identity nodes
- Scaling info was missing in Fully Connected
- Disable rescaling fusing for conv-like ops
- Explicit scaling was missing for conv-like ops
- Handle Const->Identity->Transpose chains
- Handle Const->Identity->Reshape chains
Change-Id: I063af1f187b6b56105ccf5e8e8b2eb0d3a39dd3b Signed-off-by:
Oscar Andersson <oscar.andersson@arm.com>
-
- Mar 06, 2024
-
-
Johan Alfvén authored
- When possible, a read slice from a split or stride is moved to the following op. The problem in this case was that the following op was an elementwise op whose ifm needed to be broadcast, which is not supported.
- The result is a faulty elementwise op with an output diff.
- The fix is to prevent moving the slice read to the elementwise op if broadcasting is needed.
Change-Id: I89928c217510a822f91f051fd1ad6e34040c19de Signed-off-by:
Johan Alfven <johan.alfven@arm.com>
-
- Feb 28, 2024
-
-
Simon Hollis authored
Signed-off-by:
Simon Hollis <simon.hollis@meta.com> Change-Id: I5553802afaa3faaa2548aece7a3e0e1530021765
-
- Feb 27, 2024
-
-
Minor fixes for TOSA 0.80.0 and 0.80.1 field naming following from the 0.2 to 0.8 conversion. Change-Id: I2ac1b3ac1ec60cf765edf54030cd2338bf001289 Signed-off-by:
Rob Elliott <Robert.Elliott@arm.com>
-
- Feb 19, 2024
-
-
Rickard Bolin authored
- Added release information Change-Id: I6d6d80460658d444d52d0abb17a2cb42954f992c Signed-off-by:
Rickard Bolin <rickard.bolin@arm.com>
-
- Feb 09, 2024
-
-
Johan Alfvén authored
- Fixed output diff for FullyConnect int16 - Problem was that wrong rounding mode was used - Reference uses Natural rounding for FullyConnect int16 Change-Id: I209313b6f89fed01678a448a935d5f6904b41057 Signed-off-by:
Johan Alfven <johan.alfven@arm.com>
-
- Feb 06, 2024
-
-
Rickard Bolin authored
Change-Id: I1458009f4b92c1a599efa3a63d6768148e55606d Signed-off-by:
Rickard Bolin <rickard.bolin@arm.com>
-
- Jan 30, 2024
-
-
Rickard Bolin authored
Change-Id: I3c13118e14195a5fb8e522a38b205b75fb07b74b Signed-off-by:
Rickard Bolin <rickard.bolin@arm.com>
-
Johan Alfvén authored
- A Pack op is implemented by several AvgPool ops. Depending on the number of CPU ops and the graph topology, the AvgPool ops could end up in different nodes. One of these nodes had the Pack output referenced to it but the other node did not. As a result the full graph was not traversed when calculating CPU ops.
- The compiled network works as intended but the reported number of CPU ops was wrong.
- Added a new method that extracts the ops using the passes in the subgraphs, which fixes the problem.
Change-Id: Ie88ebd4669783559258ae763737a4c7f86c905f8 Signed-off-by:
Johan Alfven <johan.alfven@arm.com>
-
- Jan 26, 2024
-
-
Fredrik Svedberg authored
Fixed a problem where the compiler incorrectly called the mlw_codec to create an empty weight stream for the second weight core. Also added code to the mlw_codec to detect this as a value error rather than a memory error. Change-Id: I463846cecb1178f8fbf04dc3e39bd6965cb8ddfc Signed-off-by:
Fredrik Svedberg <fredrik.svedberg@arm.com>
-
Tim Hall authored
- Small improvement that reduces compilation time Change-Id: I9e5cd58674f719f5dedeb30ea42787dc996a22d6 Signed-off-by:
Tim Hall <tim.hall@arm.com>
-
Tim Hall authored
This reverts commit dbe4df4ccddafac9cbc345a4a03a42c241248e88. - The previous patch had a mostly negative effect on performance Change-Id: I4003d50b07de9c63d9001ceb0a3a0bc966c0b861 Signed-off-by:
Tim Hall <tim.hall@arm.com>
-
Tim Hall authored
- Removed the unused function get_block_config_for_npu_op() Change-Id: If36e4fe65286c4e13e127473d20971a1b6eaa94b Signed-off-by:
Tim Hall <tim.hall@arm.com>
-
- Jan 24, 2024
-
-
Fredrik Svedberg authored
Added missing memory allocation checks to mlw_codec. Change-Id: I20c04d5d9c934b9c715a2b2049705f853d90825a Signed-off-by:
Fredrik Svedberg <fredrik.svedberg@arm.com>
-
- Jan 18, 2024
-
-
Adds support for setting the accumulator type using the quantized_bias_type attribute Change-Id: Ibde1149143b510a1c650a5a037d3ab92d878d7cd Signed-off-by:
William Isaksson <william.isaksson@arm.com>
-
- Jan 16, 2024
-
-
Tim Hall authored
- The issue is that live range start and end times are inclusive but the function that calculates whether two ranges overlap treats them as exclusive.
- The fix is to change the comparison to be inclusive.
Change-Id: Iab5ceec7be2a5fdf0d6ecef81509a88c74e7108c Signed-off-by:
Tim Hall <tim.hall@arm.com>
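A minimal sketch of the difference between the two comparisons, assuming live ranges are pairs of integer time steps. The function names are hypothetical and are not the ones used in Vela.

```python
def overlaps_exclusive(start_a: int, end_a: int, start_b: int, end_b: int) -> bool:
    """Overlap test that treats the end points as exclusive."""
    return start_a < end_b and start_b < end_a


def overlaps_inclusive(start_a: int, end_a: int, start_b: int, end_b: int) -> bool:
    """Overlap test that treats start and end times as inclusive."""
    return start_a <= end_b and start_b <= end_a


# Range A is live during steps 0..3 and range B during steps 3..5.
# Both are live at step 3, so they must not share memory.
touching_exclusive = overlaps_exclusive(0, 3, 3, 5)
touching_inclusive = overlaps_inclusive(0, 3, 3, 5)
```

Only the inclusive test reports the shared time step 3 as an overlap. Ranges that are truly disjoint, such as 0..2 and 3..5, are reported as non-overlapping by both.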
-
- Dec 22, 2023
-
-
Johan Alfvén authored
- If an NPU op is followed by a convolution op with dynamic weights, the optimized file ends up containing a duplicated tensor called _cpu.
- Another problem is that an empty bias tensor is added in the reader.
- The fix is to ignore these CPU ops both in the reader and the writer.
Change-Id: I476b4f6062e26cca4ba589df694a99ef79b0f6d4 Signed-off-by:
Johan Alfven <johan.alfven@arm.com>
-
- Dec 20, 2023
-
-
Updates to TensorFlow 2.15. No StableHLO operators were added to Vela since these are subject to change and have almost no runtime support. - FlatBuffers version was unchanged. Change-Id: I9a506a2dcc2e0bc2498742e857bbb6d69b19ac1b Signed-off-by:
William Isaksson <william.isaksson@arm.com> Signed-off-by:
Rickard Bolin <rickard.bolin@arm.com>
-
- Dec 19, 2023
-
-
- The issue was that the peak memory usage was only evaluated at the start of the tensor's lifetime and not across its whole lifetime - The fix is to look for the maximum usage between start and end Change-Id: Iff4f390f3a017f1df0f8933796fa5282db7870db Signed-off-by:
Tim Hall <tim.hall@arm.com>
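The change can be sketched as follows, assuming memory usage is available as one value per time step and a tensor's lifetime is an inclusive `[start, end]` interval. The function names and the sample numbers are hypothetical, not taken from Vela.

```python
from typing import Sequence


def usage_at_start(usage_per_step: Sequence[int], start: int, end: int) -> int:
    """Old behavior as described: only look at the first step of the lifetime."""
    return usage_per_step[start]


def peak_usage_over_lifetime(usage_per_step: Sequence[int], start: int, end: int) -> int:
    """Fixed behavior as described: maximum usage between start and end, inclusive."""
    return max(usage_per_step[start : end + 1])


# Hypothetical usage in bytes at time steps 0, 1, 2 and 3.
usage = [10, 40, 25, 5]
at_start = usage_at_start(usage, 0, 2)
peak = peak_usage_over_lifetime(usage, 0, 2)
```

Here the usage at the start of the lifetime is lower than the usage one step later, so evaluating only the first step underestimates the peak.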
-
- Nov 21, 2023
-
-
William Isaksson authored
- Documents Legality requirements of CMD1 payloads - Fixes a miss in the command stream checks. Signed-off-by:
William Isaksson <william.isaksson@arm.com> Change-Id: I9b33dedfa66650fa3100f61fd158a385818b4d52
-
- Nov 16, 2023
-
-
Tim Hall authored
- Added release information - Modified SUPPORTED_OPS.md version info Change-Id: I3ead55db45c84821c426645e488dfb765166d20f Signed-off-by:
Tim Hall <tim.hall@arm.com>
-
Tim Hall authored
- Updated TensorFlow Support section Change-Id: Ic2551f44e7dfa996a5dcc8840d480b7985415a0a Signed-off-by:
Tim Hall <tim.hall@arm.com>
-
Tim Hall authored
- Changed homepage link from cgit to gitiles
- Clarified that tensor alignment is in bytes
Change-Id: I9fd912c17d61f9add11493e031bbb620271c68eb Signed-off-by:
Tim Hall <tim.hall@arm.com>
-