  Jul 19, 2021
    • MLBEDSW-4812: Deep speech performance block config update · daed1529
      Tim Hall authored
      
      
      Deep speech was exhibiting poor performance in its first three
      layers due to poor SHRAM utilisation.
      
       - Given a choice between multiple identical-cost block configs,
         the allocator was choosing the first one it encountered. This
         commit biases the choice towards blocks with a larger IFM
         fetch area to improve SHRAM utilisation.
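
      To illustrate the tie-breaking rule described above, here is a
      minimal Python sketch; the BlockConfig fields and the
      ifm_fetch_area() helper are illustrative assumptions, not Vela's
      actual internals.

          from dataclasses import dataclass

          @dataclass
          class BlockConfig:
              width: int
              height: int
              depth: int
              cost: int  # estimated cycle cost of this block shape

              def ifm_fetch_area(self) -> int:
                  # IFM area fetched per block; a larger fetch makes
                  # better use of SHRAM (illustrative metric).
                  return self.width * self.height

          def choose_block_config(candidates):
              # Primary key: lowest cost. Secondary key (the bias this
              # commit adds): larger IFM fetch area wins a cost tie.
              return min(candidates, key=lambda c: (c.cost, -c.ifm_fetch_area()))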
      
      Signed-off-by: Tim Hall <tim.hall@arm.com>
      Change-Id: I2ff18a13444b8812cb451a606ff692bf290e7d20
  Jul 08, 2021
    • MLBEDSW-4838 Added basic TOSA support. · 8f1f9aaa
      Patrik Gustavsson authored
      
      
      Added basic TOSA support, enabling Vela to read and compile a
      .tosa file corresponding to CONV2D + Rescale + Clamp, and to
      write it to an optimized .tflite file.

      The optimized .tflite file will, in this case, hold a command
      stream where the Rescale and Clamp have been fused into the
      CONV2D.

      The optimized .tflite file is not output from Vela.
      
      - Added support to read a .tosa file into Vela's internal
        structure.
          - Added tosa_reader.py, tosa_mapper.py and helper files
            stored under tosa/
          - Support for this is limited to ~10 ops

      - Added reader_util.py for functions common to TOSA and TFLite

      - Added tosa_graph_optimiser.py
          - Added support to fuse Rescale into the convolution
          - Modified handling of padding

      - Added support to fuse Clamp into the previous op (see the
        sketch after this list)

      - Added graph_optimiser_util.py
          - Moved functions common to TOSA/TFLite graph optimisation
            into this file

      - Renamed graph_optimiser.py to tflite_graph_optimiser.py

      - Added a separate tosa_supported_operators.py

      - Added supported_operator_util.py for functions common to
        TOSA/TFLite
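
      As a rough illustration of the Clamp fusion mentioned above,
      here is a minimal sketch; the Tensor/Op classes and attribute
      names are illustrative assumptions, not Vela's internal
      representation.

          from dataclasses import dataclass, field
          from typing import List, Optional

          @dataclass
          class Tensor:
              name: str
              producer: Optional["Op"] = None
              consumers: List["Op"] = field(default_factory=list)

          @dataclass
          class Op:
              kind: str
              ifm: Optional[Tensor] = None
              ofm: Optional[Tensor] = None
              act_min: float = float("-inf")
              act_max: float = float("inf")

          def fuse_clamp(clamp: Op, graph: List[Op]) -> None:
              # Only fuse when the clamp is the sole consumer of its input
              producer = clamp.ifm.producer
              if producer is None or len(clamp.ifm.consumers) != 1:
                  return
              # Fold the clamp bounds into the producer's activation range
              producer.act_min = max(producer.act_min, clamp.act_min)
              producer.act_max = min(producer.act_max, clamp.act_max)
              # Rewire so the producer writes the clamp's output directly
              producer.ofm = clamp.ofm
              clamp.ofm.producer = producer
              graph.remove(clamp)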
      
      Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
      Change-Id: Ic3c540504ec8c5eb4771397fdc6882050ecf33ab
  Jul 05, 2021
    • MLBEDSW-3890 handling scratch tensor · 6f4955aa
      Samuel Panijel authored and Tim Hall committed
      
      
      vela: Possible issue with handling the scratch tensor on a
      non-ethosu custom op

      Fixing a case where a tensor input name ends with "scratch".
      Four test cases pass with this change:
      1) non-optimized tflite - input tensor name is _split_1_scratch
      2) optimized tflite - input tensor name is _split_1_scratch
      3) optimized tflite - input tensor name is _split_1_scratch and
         the custom operation name is non_ethus_u
      4) non-optimized tflite - input tensor name is _split_1_scratch_fast
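
      A minimal sketch of the kind of check involved; the function,
      the op-name test and the constant below are assumptions for
      illustration, not the actual Vela fix.

          # Treat a tensor as scratch only for Vela's own custom op,
          # not merely because its name happens to end in "scratch".
          ETHOSU_CUSTOM_OP = "ethos-u"  # illustrative constant

          def is_scratch_tensor(custom_op_name: str, tensor_name: str) -> bool:
              if custom_op_name != ETHOSU_CUSTOM_OP:
                  # On other custom ops a "*scratch" name is an ordinary input
                  return False
              return tensor_name.endswith(("_scratch", "_scratch_fast"))

          # An input named "_split_1_scratch" on a non-ethosu custom op
          # must not be claimed as the scratch tensor:
          assert not is_scratch_tensor("non_ethus_u", "_split_1_scratch")
          assert is_scratch_tensor("ethos-u", "_split_1_scratch_fast")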
      
      Change-Id: Ia515805825b7f9a646607c5075b7ea3a0cf6aad8
      Signed-off-by: Samuel Panijel <samuel.panijel@arm.com>
  Jun 17, 2021
    • Block config optimisation for 256/512 configurations · 3016157e
      Tim Hall authored
      
      
       - The 256 and 512 configuration variants execute 1D convolutions
         in an optimised manner relative to their 2x2 microblock
         dimensions. This commit takes that into account to improve
         Conv1D throughput on these configurations.
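
      A rough sketch of the special-casing this implies; the MAC
      threshold and the candidate block shapes below are assumptions
      for illustration only.

          # For Conv1D (kernel height 1) on 256/512 MAC configurations,
          # allow width-oriented block shapes instead of shapes tied to
          # the 2x2 microblock; all values here are illustrative.
          def candidate_block_shapes(num_macs: int, kernel_height: int):
              if kernel_height == 1 and num_macs >= 256:
                  # Height-1 blocks keep the wide MAC array busy on 1D data
                  return [(1, 32), (1, 64), (2, 32)]  # (height, width)
              # Otherwise keep block dimensions aligned to the microblock
              return [(2, 16), (4, 8), (8, 8)]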
      
      Signed-off-by: Tim Hall <tim.hall@arm.com>
      Change-Id: I6ecdf6e4a219e356327b22f8393f50ee8817af23
    • vela: Improve block configuration and weight buffering algorithm · 789e6f3a
      Tim Hall authored
      
      
       - Updated block config selection to take into account partial
         IFM fetches at the edges of non-whole OFM block data.
       - Changed the scheduler depth slicing for the networks in
         MLBEDSW-4637 to improve buffering. Buffering larger depth
         slices helps general performance.
       - Fixed a bug where opt_max_schedule was always fitted to SRAM,
         which prevented the optimisation step from running in some
         cases.
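
      To make the first point concrete, here is a toy traffic model
      that clamps edge blocks instead of assuming every OFM block
      fetches a whole IFM block; the names and the model itself are
      simplified stand-ins, not Vela's scheduler.

          # Estimate IFM traffic over an OFM, with partial fetches for
          # the non-whole blocks at the right and bottom edges.
          def ifm_traffic(ofm_w: int, ofm_h: int, blk_w: int, blk_h: int,
                          ifm_bytes_per_ofm_element: float) -> float:
              traffic = 0.0
              for y in range(0, ofm_h, blk_h):
                  for x in range(0, ofm_w, blk_w):
                      w = min(blk_w, ofm_w - x)  # partial block at right edge
                      h = min(blk_h, ofm_h - y)  # partial block at bottom edge
                      traffic += w * h * ifm_bytes_per_ofm_element
              return traffic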
      
      Signed-off-by: Tim Hall <tim.hall@arm.com>
      Change-Id: I97642c5adec3bb684b1daabf2b81574c27d4eef2
  Jun 08, 2021
    • MLBEDSW-4602: Fix Deepspeech scale & bias reuse issue. · d784af7e
      Tim Hall authored
      
      
       - Deepspeech reuses identical weights and biases throughout
         the network. Since biases are now interleaved with weights,
         there is a scaling issue when the IFM scales differ between
         operations using the same weight and scale tensor.

       - This commit uses interleaved weights/scales on their first
         use but separates the scales into source memory on subsequent
         uses (if the IFM scale is different).
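
      A simplified sketch of that reuse rule; the cache layout and the
      encode callbacks are hypothetical, not Vela's weight compressor.

          # Reuse an interleaved weight/scale stream only when the IFM
          # scale matches its first use; otherwise keep the weights but
          # source the scales separately. Names are illustrative.
          _cache = {}  # weight tensor id -> (first ifm_scale, stream)

          def get_encoded(weight_id, ifm_scale, encode_interleaved, encode_scales):
              """Return (weight_stream, separate_scale_stream_or_None)."""
              if weight_id not in _cache:
                  stream = encode_interleaved(weight_id, ifm_scale)
                  _cache[weight_id] = (ifm_scale, stream)
                  return stream, None  # scales are interleaved in-stream
              first_scale, stream = _cache[weight_id]
              if ifm_scale == first_scale:
                  return stream, None  # identical scaling, safe to reuse
              # Different IFM scale: reuse weights, re-encode scales only
              return stream, encode_scales(weight_id, ifm_scale)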
      
      Signed-off-by: Tim Hall <tim.hall@arm.com>
      Change-Id: I7aae163438160a919cae04e235966e75355a6148