  2. Jul 16, 2020
    • Add missing aliases for SVE 0.0 moves. · f48172ba
      Jacob Bramley authored
      Change-Id: I608c610da7de42328ed3984dabb56cf1401d7a15
    • Fix and enable CanTakeSVEMovprfx. · b9616b36
      Jacob Bramley authored
      Change-Id: I9afc03fb9e11546b9e6caf04497339bf45b285b6
    • Support more than 64 CPU features. · 8c4ceb6a
      Jacob Bramley authored
      After recent patches, we have exactly 64 CPU features. This patch makes
      the mechanism flexible so that we can support more features in the
      future.
      
      Several operators on CPUFeatures had to be re-written as part of this,
      so this patch replaces the default-argument implementations with a more
      flexible template-based approach, which can accept more than four
      features. Existing usage remains unaffected.
      
      Change-Id: If91a3adb62669aa827464e857a90eb93a64db7a6
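A multi-word feature mask of this kind can be sketched as follows (an illustrative sketch only, not VIXL's actual `CPUFeatures` implementation; the class and member names are hypothetical):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Once feature indices can exceed 63, a single uint64_t mask no longer
// suffices, so the set is backed by an array of 64-bit words.
class FeatureSet {
 public:
  static constexpr std::size_t kMaxFeatures = 128;

  // Set the bit for `feature` in the word that holds it.
  void Combine(int feature) {
    bits_[feature / 64] |= (uint64_t{1} << (feature % 64));
  }

  // Test the bit for `feature`.
  bool Has(int feature) const {
    return ((bits_[feature / 64] >> (feature % 64)) & 1) != 0;
  }

 private:
  std::array<uint64_t, kMaxFeatures / 64> bits_{};  // Zero-initialised.
};
```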
    • Fix CPUFeature iterator behaviour. · caa40eec
      Jacob Bramley authored
      The `++` operators should return iterators, not values.
      
      This also updates tests to match, and makes wider use of C++11
      range-based `for` loops, where they simplify code.
      
      Change-Id: I2c8ef422e851d6b16c8de2890ae16fc69817a738
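The convention the fix follows is standard C++: pre-increment returns a reference to the iterator itself, and post-increment returns a copy of its old value. A minimal sketch (the class is hypothetical, not VIXL's iterator):

```cpp
// A toy iterator showing the idiomatic increment operators.
class CountingIterator {
 public:
  explicit CountingIterator(int pos) : pos_(pos) {}

  // Pre-increment: advance, then return *this by reference.
  CountingIterator& operator++() {
    ++pos_;
    return *this;
  }

  // Post-increment: return a copy of the iterator as it was before.
  CountingIterator operator++(int) {
    CountingIterator old = *this;
    ++pos_;
    return old;
  }

  int operator*() const { return pos_; }

 private:
  int pos_;
};
```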
    • Add an example that dumps CPU feature information. · 28ff5975
      Jacob Bramley authored
      For debugging purposes, it's useful to see what CPU features VIXL sees.
      
      Change-Id: I6a501ee8c11e50252db713d7d295d84db0f2aee2
    • Add support for AT_HWCAP2. · 31d432b2
      Jacob Bramley authored
      Change-Id: I3c893a6c1e3b25756999025a21ae310e5b3e199c
    • Add CPUFeatures up to Armv8.6. · 3d8d3942
      Jacob Bramley authored
      This adds support for all relevant features described in the latest
      Armv8.6 XML.
      
      Note that this removes the CPUFeatures::All() part of the
      `API_CPUFeatures_format` test. It added little value to the test, and
      was a burden to update when new features are added.
      
      Change-Id: I276a0970be94c3adf2d0100874df0b82c7424a9b
  3. Jul 13, 2020
    • Emit pairs of add/sub for larger immediates · 960606b6
      Martyn Capewell authored and Jacob Bramley committed
      For immediates between 12 and 24 bits in size, a pair of add or sub instructions
      can be used instead of mov, avoiding the need to allocate a temporary.
      
      Change-Id: I114b4667dcc1bda094652e01d88069d012249dca
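The splitting idea can be sketched as follows: A64 ADD/SUB take a 12-bit immediate, optionally shifted left by 12, so an immediate of 13 to 24 significant bits decomposes into two such chunks. The helper names below are illustrative, not VIXL's:

```cpp
#include <cstdint>

// True when `imm` needs two ADDs: it fits in 24 bits but has significant
// bits both in the low 12-bit field and above it (so neither a plain nor
// a shifted single 12-bit immediate is enough).
bool CanEncodeAsAddPair(uint64_t imm) {
  return (imm >> 24) == 0 && (imm >> 12) != 0 && (imm & 0xfff) != 0;
}

// Split `imm` into the two chunks an ADD pair would use: the low 12 bits,
// and the high 12 bits already positioned for an LSL #12 encoding.
void SplitAddImmediate(uint64_t imm, uint64_t* lo12, uint64_t* hi12_shifted) {
  *lo12 = imm & 0xfff;
  *hi12_shifted = imm & 0xfff000;
}
```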
    • Use segments in SVE indexed fmul simulation · 4635261c
      Martyn Capewell authored
      The value used for the second operand in indexed multiplies differs for each
      segment (128-bit part) of a vector, but the simulator wasn't doing this for
      FP multiplies. Fix and update the tests.
      
      Change-Id: I9cc37ebef9d216243a23bedebea256826e1016cb
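The per-segment behaviour can be illustrated with a small standalone model (assuming 32-bit lanes, so four lanes per 128-bit segment; this is a sketch, not the simulator's code):

```cpp
#include <cstddef>

// For an indexed multiply, lane i of zd takes its second operand from
// within lane i's own 128-bit segment of zm: zm[segment_base + index].
void IndexedFmulS(const float* zn, const float* zm, float* zd,
                  std::size_t lane_count, std::size_t index) {
  const std::size_t kLanesPerSegment = 4;  // 128 bits / 32-bit lanes.
  for (std::size_t i = 0; i < lane_count; i++) {
    std::size_t segment_base = (i / kLanesPerSegment) * kLanesPerSegment;
    zd[i] = zn[i] * zm[segment_base + index];
  }
}
```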
  4. Jul 06, 2020
    • Fix numerous issues related to CAS* instructions. · 3eb24e95
      Jacob Bramley authored
      1. There was no test for 64-bit CASP.
      
      2. The tests had some faulty code for obtaining aligned pointers. The
         natural alignment is sufficient anyway, so this patch removes the
         broken alignment code, and varies the addresses used to strengthen
         the test slightly.
      
         For the new CASP test, this patch uses the C++11 `alignas` specifier.
      
      3. The simulation of CASP variants accessed memory in the wrong order.
         With this patch, the first-specified register in each pair accesses
         the lowest address.
      
      4. We now check that `rs` and `rt` have the same format. Likewise for
         `rs2` and `rt2` in the CASP variants.
      
      5. Register trace is improved: the `rs` (and `rs2`) update is traced as
         a memory read so we should suppress the log on the register write.
         This is what we do for normal loads.
      
      Change-Id: I213c4b3de32305a8072fdc45357b67cbbf85ba9c
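The C++11 `alignas` usage mentioned in point 2 can be sketched like this (the struct name is illustrative, not from the patch; a 64-bit CASP operates on a naturally-aligned 16-byte register pair):

```cpp
#include <cstdint>

// A 16-byte-aligned buffer for a pair of 64-bit values, as a CASP test
// target. `alignas` guarantees the alignment at the type level.
struct alignas(16) CaspPair64 {
  uint64_t data[2];
};
```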
    • Make assembler more strict about SVE prefetch arguments · 102e7a5e
      Martyn Capewell authored and Jacob Bramley committed
      Add assertions to the assembler to prevent the use of unsupported addressing
      modes for prfb/h/w/d.
      
      Change-Id: Ie12991eb2e29661eb266fc495e9164246371d10e
  5. Jul 03, 2020
    • Use PgLow8 rather than Pg<12, 10>. · ebc3b8f5
      Jacob Bramley authored
      This is just a clean-up. We have the helper, so we should use it.
      
      Change-Id: I8ee2c7929aef6ad737d7079eee62ffe3f7618857
    • Always assert that 'pg' does not have a lane size. · 7b5819c3
      Jacob Bramley authored
      We did this for PgLow8, but not for 4-bit 'pg' fields.
      
      In practice, we plan to relax this in the future, permitting lane sizes
      where they match the rest of the instruction, but this patch makes our
      checks consistent in the meantime.
      
      Change-Id: Ie791027f217eabab305dbd22b8c0e77926c9d3b8
  6. Jul 02, 2020
    • Disallow x31/xzr for SVE prefetch scalar offset register · ecca4b1c
      Martyn Capewell authored
      The architecture disallows rm = x31/xzr for prefetch, so assert this in the
      assembler.
      
      Change-Id: I26e14688bde624d38eee40167fb3ada88acaaec7
    • Fix simulation of FCMNE. · 4606adc3
      Jacob Bramley authored
      FCMNE can return true when the comparison is unordered.
      
      Change-Id: Ic1fa9a83cd9bde23faf2b13b69d3a7e9d1426a12
    • Require an immediate (0.0) for compare-with-zero instructions. · 5a5e71f3
      Jacob Bramley authored
      This matches conventions elsewhere in the API, and allows for immediate
      synthesis. Immediate synthesis is not included in this patch.
      
      Change-Id: If4bdc9cfd9d4bb83a9c015ef363291c1ff08a64a
    • Prefer to use 'rd' as a scratch. · a8461cf9
      Jacob Bramley authored
      This is generally useful, but in particular reduces scratch register
      pressure in code sequences using ComputeAddress. For example:
      
          MemOperand addr(...);
          UseScratchRegisterScope temps(&masm);
          Register computed = temps.AcquireX();
          __ ComputeAddress(computed, addr);
      
      Before this patch, that sequence usually required two scratch registers;
      one for `computed`, and one for immediate synthesis inside
      `ComputeAddress`. With this patch, the same code sequence only needs one
      scratch register.
      
      Change-Id: I9c93e6cab51bdacf36046d4d770dc81d1a65a34c
    • Fix CPURegister::GetArchitecturalName(). · 32f8fe13
      Jacob Bramley authored
      The `code_` field is a `uint8_t`, which is treated by stream formatters
      as a `char`. This caused strange output from error messages in test
      failure.
      
      Change-Id: I16302e6bbd8977bb376d28c7b7cb2091f9891aba
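This bug class is easy to reproduce in isolation (a minimal sketch, not VIXL's actual formatting code): streaming a `uint8_t` selects the character overload, so a register code such as 11 is emitted as a control character rather than as "11". Casting to a wider integer selects the numeric overload.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Buggy: the uint8_t is streamed through the char overload.
std::string FormatCodeWrong(uint8_t code) {
  std::ostringstream ss;
  ss << code;
  return ss.str();
}

// Fixed: widening the value first prints it as a number.
std::string FormatCodeRight(uint8_t code) {
  std::ostringstream ss;
  ss << static_cast<unsigned>(code);
  return ss.str();
}
```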
    • Fix simulation of FTSMUL. · dfb93b5e
      Jacob Bramley authored
      We tried to set the sign bit before multiplying, but this produced the
      wrong result when the input is already negative.
      
      Change-Id: I7b44070409ca265b1fae34792ebe43e5e53ce646
    • Fix the `sve_fmla_fmls` test. · 8caa873b
      Jacob Bramley authored
      The `zn` value used to compute the reference result for the
      `zd == zn == zm` case must also be passed through
      `FPSegmentPatternHelper`.
      
      Change-Id: Ic49b5e2500e9bb5fa46821cbe447bcdf891c813e
  7. Jul 01, 2020
    • Fix simulation of BRKNS. · a3d6110c
      Jacob Bramley authored
      In this instruction, `pg` should be ignored when setting flags.
      
      Also, simplify the test.
      
      Change-Id: I9b8a73bdd0aaaebbecbccd1c446e17cd9d38ce8f
    • Make the 'sve_punpk' test VL-agnostic. · 3980b742
      Jacob Bramley authored
      Change-Id: I18195d5f20ce8aaa9a5b89a91cb58bf8801852fc
    • Update FPCR test. · 7c8c1f0f
      Jacob Bramley authored
      The test checks that writes to RES0 fields are ignored. This is fragile,
      because those fields are sometimes allocated as the architecture is
      updated.
      
      This patch updates the RES0 field specification, but also runs that part
      of the test only on simulator builds.
      
      Change-Id: Ia667e32519f9e2639f0e46f34426c9a51d21a0e0
  8. Jun 30, 2020
    • Merge branch 'sve' · df01bce3
      Jacob Bramley authored
      Change-Id: I44f3cb4607ef593e1b54eed13fd5ad7ae7ef9cd6
    • [sve] Restore LaneSize to predicate logical operations. · 75892bd1
      Martyn Capewell authored
      In predicated operations, a lane size is required to use the governing predicate
      correctly, and to reject instructions that don't support the lane size.
      
      Consequently, this patch restores the lane size requirement on predicated
      logical operations with predicate operands.
      
      Change-Id: Ida32cc412a88c09454533dd8a5f12f46632d9750
  13. Jun 22, 2020
    • [sve] Complete remaining gather loads. · a5112344
      Martyn Capewell authored and TatWai Chong committed
      Implement remaining 64-bit gather loads including unpacking, unscaled and
      scaled offset form.
      
      Change-Id: I208de1fabfe40f7095f9848c3ebf9de82a5f7416
    • [sve] Fix the index specifier decoding error in the gather load helper. · cd3f6c5e
      TatWai Chong authored
      In the simulation of the scalar-plus-vector form of gather loads,
      the helper hasn't considered shift specifiers in the decoding, so
      64-bit unscaled/scaled offset forms haven't been generated and tested.
      
      Change-Id: If4539de5a1b4e6760780fdbaefd56dc84dd47413
    • Remove some unnecessary casts in `LoadStoreMemOperand`. · cb0cfc31
      Jacob Bramley authored
      The casts appear dangerous, since they would shorten the value in an
      implementation-defined way. They weren't actually dangerous, because
      each case was guarded by an `IsImm...` helper, but they are
      unnecessary, and removing them makes the code easier to audit.
      
      Also clarify some variable names in related functions.
      
      Change-Id: Ib565965e3e6c4c0683c79d68eac51771d9e8b667
  14. Jun 19, 2020
    • [sve] Relax the lane size restriction of register in MacroAssembler. · 50ef1718
      TatWai Chong authored
      We decided to accept registers without lane sizes for MacroAssembler
      operations where the lane size doesn't matter. For example, we don't
      check the lane size for a bitwise `and` of vectors, like
      `__ And(z0, z1, z2)`.
      
      We should still check that, if users provide lane sizes, they are all
      the same _and_ valid for the instruction. For example, since `rdffr`
      requires B-sized lanes, passing any other lane size is invalid:
      although the instruction is essentially just a move that writes every
      bit to the destination register, other lane sizes would implicitly
      clear bits in the destination.
      
      A call like `__ And(z0.VnB(), z1.VnH(), z2.VnS())` is invalid because
      its lane sizes are inconsistent.
      
      Change-Id: I7cf57fd174c9bc906f90b0710fec6739cc103448
    • Use Register for macro assembler ldpsw · 7b9a5f12
      Martyn Capewell authored and Jacob Bramley committed
      Ldpsw extends words to X register-sized destinations, so should only accept
      Register, rather than CPURegister.
      
      Change-Id: I4e876165413d374d334fd12eedcd6b90419c1d95