  1. Apr 30, 2025
    • Rework cache management · 5fbe2998
      Jonny Svärd authored
      
      
      Simplify the cache management by moving most of the logic out of the
      driver itself and into user-overridable weak functions for cache
      flush/clean and invalidation.
      
      The driver now calls the cache flush/clean before each NPU command
      stream with a full list of base pointers/base addresses and their sizes.
      This allows full freedom to implement any desired logic for cache
      coherence management outside of the driver. This changes the function
      prototype for the flush and invalidate functions.
      
      As there's no longer a need to keep a bitmask of which base
      pointers/addresses to flush/clean/invalidate, those functions have been
      removed.
      
      To guarantee that the driver works in all cases and doesn't get affected
      by potential speculative loads, the cache invalidation call has been
      moved to after the NPU has finished.
      
      Due to the backwards-incompatible changes to the function prototypes,
      the driver version has been bumped to 1.0.0.
      
      Change-Id: Ibfd755876842edc911fecebf34fa80c22f287ca4
      Signed-off-by: Jonny Svärd <jonny.svaerd@arm.com>
      25.05-rc1
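The "full list of base addresses and their sizes" interface described in this commit can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function name `demo_flush_dcache`, its parameter list, the 32-byte cache line, and the recording stub `demo_clean_range` are hypothetical and are not taken from the driver's headers. On a real Cortex-M target the stub would typically be replaced by a CPU cache maintenance call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHE_LINE 32u /* assumed cache line size in bytes */
#define DEMO_MAX_LOG 8

/* Recording stub standing in for a real cache clean operation. */
typedef struct {
    uint64_t start;
    uint64_t end;
} demo_range_t;

static demo_range_t demo_log[DEMO_MAX_LOG];
static int demo_log_count;

static void demo_clean_range(uint64_t start, uint64_t end)
{
    if (demo_log_count < DEMO_MAX_LOG) {
        demo_log[demo_log_count].start = start;
        demo_log[demo_log_count].end   = end;
        demo_log_count++;
    }
}

/* Hypothetical user override: receives every base address and its size,
 * and decides on its own which regions to clean. This sketch cleans all
 * non-empty regions, widened to whole cache lines. */
void demo_flush_dcache(const uint64_t *base_addr, const size_t *base_addr_size, int num_base_addr)
{
    const uint64_t line_mask = ~(uint64_t)(DEMO_CACHE_LINE - 1u);

    assert(base_addr != NULL && base_addr_size != NULL);

    for (int i = 0; i < num_base_addr; i++) {
        if (base_addr_size[i] == 0) {
            continue; /* nothing to clean for an empty region */
        }
        uint64_t start = base_addr[i] & line_mask;
        uint64_t end   = (base_addr[i] + base_addr_size[i] + DEMO_CACHE_LINE - 1u) & line_mask;
        demo_clean_range(start, end);
    }
}
```

Because the callback sees all regions at once, an application could equally skip read-only regions or merge adjacent ranges; the policy lives outside the driver, which is the point of the rework.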
  2. Jan 07, 2025
  3. Dec 16, 2024
  4. Oct 29, 2024
    • Cache optimizations · e79714ad
      Jonny Svärd authored
      
      
      Add comments about ethosu_flush_dcache() being deprecated and not
      recommended to be implemented. Cache coherency for regions that are
      shared by the CPU and NPU is to be handled by the application before an
      inference is invoked, as the driver will otherwise do it for every
      invocation, hurting performance.
      
      Remove cache flush/clean and invalidation calls for all base pointers
      and instead add a cache flush/clean and invalidation base pointer mask.
      By default, this mask marks only the scratch base pointer (tensor arena)
      for both flush/clean and invalidation. The scratch base pointer is the
      only one containing RW data shared between the CPU and NPU.
      
      For the typical case, cache invalidation is only required on the
      scratch/tensor arena base pointer, as that contains the OFM data.
      All other base pointers are either read-only or, when dedicated SRAM
      mode is used, point to fast memory that is only meant to be used by
      the NPU, so no cache coherency issues exist.
      
      Add a helper function to allow the cache masks to be modified for
      advanced use cases. The cache masks for flush and invalidate are both
      8-bit masks, where bit 0 corresponds to base pointer 0, bit 1
      corresponds to base pointer 1, and so on.
      
      Correct the documentation, which previously stated that the addresses
      passed to the cache functions need to be 16-byte aligned; they need to
      be 32-byte aligned (or aligned to the cache line size of the CPU).
      
      Invalidation of the complete cache is no longer supported as this is
      potentially dangerous, especially in async use cases where the CPU might
      be doing other things while the NPU is running. base_addr_size is now
      required to be set for all invoke calls, or an assert will trigger.
      
      Change-Id: Ica665ebfb84329ec5e56c224859516036fc08d2c
      Signed-off-by: Jonny Svärd <jonny.svaerd@arm.com>
      3 tags
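The mask layout and alignment rule from this commit can be sketched as follows. All names are hypothetical, and the choice of index 1 for the scratch base pointer is an assumption made only for illustration; the commit message does not state which index the scratch pointer uses. Note that the later "Rework cache management" commit removed these masks again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NUM_BASE_PTRS 8   /* one bit per base pointer in an 8-bit mask */
#define DEMO_SCRATCH_INDEX 1u  /* assumed index of the scratch/tensor arena */
#define DEMO_CACHE_LINE 32u    /* assumed cache line size in bytes */

/* Bit n of the mask corresponds to base pointer n. */
static uint8_t demo_mask_for(unsigned index)
{
    assert(index < DEMO_NUM_BASE_PTRS);
    return (uint8_t)(1u << index);
}

/* True if the mask selects the given base pointer for cache maintenance. */
static bool demo_mask_selects(uint8_t mask, unsigned index)
{
    return (mask & demo_mask_for(index)) != 0;
}

/* Default described in the commit: only the scratch base pointer is marked. */
static uint8_t demo_default_mask(void)
{
    return demo_mask_for(DEMO_SCRATCH_INDEX);
}

/* Addresses handed to cache functions must be cache-line aligned. */
static bool demo_is_line_aligned(uintptr_t addr)
{
    return (addr % DEMO_CACHE_LINE) == 0;
}
```

For example, a mask of `0x05` would select base pointers 0 and 2 but not base pointer 1.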
  5. Dec 19, 2023
    • Support timeout for interrupt semaphore · a2732ecd
      Jonny Svärd authored
      
      
      Introduce the ETHOSU_INFERENCE_TIMEOUT CMake variable to set an
      arbitrary timeout value that will be sent as an argument to
      ethosu_semaphore_take() for the interrupt semaphore. This adds
      the ability to set a timeout for an inference. (Defaults to
      no timeout/wait forever.)
      
      Implement a placeholder mutex for the baremetal example and add
      error checks for the mutex_create() call.
      
      Change-Id: Ia74391620340a27c23dc3d15f9ba742c674c8bfa
      Signed-off-by: Jonny Svärd <jonny.svaerd@arm.com>
      2 tags
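A semaphore take with a timeout argument, as described in this commit, might look roughly like the polling sketch below. This is not the driver's implementation: the type `demo_sem_t`, the function `demo_semaphore_take`, the use of a poll count as the timeout unit, and the `DEMO_WAIT_FOREVER` sentinel are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "no timeout, wait forever". */
#define DEMO_WAIT_FOREVER UINT64_MAX

typedef struct {
    volatile unsigned count; /* incremented by the interrupt handler */
} demo_sem_t;

/* Returns 0 when the semaphore was taken, -1 when the timeout expired.
 * The timeout is counted in polling iterations in this sketch; a real
 * port would more likely use RTOS ticks or a hardware timer. */
int demo_semaphore_take(demo_sem_t *sem, uint64_t timeout)
{
    uint64_t polls = 0;

    assert(sem != NULL);

    while (sem->count == 0) {
        if (timeout != DEMO_WAIT_FOREVER && polls >= timeout) {
            return -1; /* inference did not finish in time */
        }
        polls++;
        /* A real implementation would sleep or wait for an event here. */
    }

    sem->count--;
    return 0;
}
```

A caller that receives `-1` can then treat the inference as failed instead of blocking indefinitely, which is the behaviour a finite timeout is meant to enable.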
  6. Nov 04, 2022
  7. May 25, 2022
  8. Apr 12, 2022
    • Update the documentation · 2f9c333a
      Davide Grohmann authored
      - Improve build section by describing new build options
      - Add a section about driver APIs and basic usage examples
      - Add a section about mutexes and semaphores
      - Add a section about begin/end inference callbacks
      - Add a brief section about driver implementation design
      - Fix markdown title/subtitles structure
      - Fix typos and reword some passages
      
      Also add .gitignore
      
      Change-Id: I7216a2b72b0dfaa605620f4344da205235339ddb
  9. Feb 23, 2021
  10. Nov 16, 2020
    • Flush and invalidate data caches · 3c8afcca
      Per Astrand authored
      Implement a weakly linked function to handle the data cache.
      If the device implements a data cache, the function should be
      overridden with a device-specific implementation of the
      flush/invalidate functions to make sure that the cache is properly
      maintained with regard to the NPU DMA transactions.
      
      Change-Id: I175644ef37bee62cc77d789d2b7bc3073e72ea5a
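The weak-linkage pattern this commit refers to can be sketched as below, assuming a GCC- or Clang-compatible toolchain that supports `__attribute__((weak))`. The function names and signatures are hypothetical stand-ins, not the driver's own symbols, and the call counter exists only so the sketch has an observable effect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Counts how often the default (weak) implementations were used. */
static int demo_default_calls;

/* Weak default: used only if the application does not provide its own
 * non-weak definition of the same symbol in another translation unit.
 * A device without a data cache can leave this default in place. */
__attribute__((weak)) void demo_weak_flush_dcache(uint32_t *p, size_t bytes)
{
    assert(bytes == 0 || p != NULL);
    demo_default_calls++;
}

__attribute__((weak)) void demo_weak_invalidate_dcache(uint32_t *p, size_t bytes)
{
    assert(bytes == 0 || p != NULL);
    demo_default_calls++;
}
```

An application targeting a CPU with a data cache would define `demo_weak_flush_dcache` and `demo_weak_invalidate_dcache` without the weak attribute in its own source file; the linker then picks those definitions over the defaults.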
  11. Oct 23, 2020
  12. Sep 01, 2020
    • Update README.md · ea1f593c
      Kristofer Jonsson authored
      Add information on how to contribute and how to report security incidents.
      
      Change-Id: I7946e66b30c4e338ffa5a279b5d769a764c34f0f
  13. May 05, 2020