# Changelog

All notable changes to the Arm RAN Acceleration Library (ArmRAL) project will be
documented in this file.

## [Unreleased]

### Added

### Changed

### Deprecated

### Removed

### Fixed

### Security

## [25.01] - 2025-01-23

### Added

- Added the functions `armral_turbo_decode_batch` and
  `armral_turbo_decode_batch_noalloc`. These functions implement a maximum a
  posteriori (MAP) algorithm to decode the output of the LTE Turbo encoding
  scheme on a batch of encoded data.

- Added the function `armral_turbo_decode_batch_noalloc_buffer_size` which
  returns the size of buffer required for `armral_turbo_decode_batch_noalloc`.

### Changed

- Updated all copyright headers, and the text in
  [LICENSE.md](https://gitlab.arm.com/networking/ral/-/blob/main/LICENSE.md),
  to include the `BSD-3-Clause` SPDX License Identifier.

- Improved Neon and SVE performance of `armral_fft_execute_cf32` and
  `armral_fft_execute_cs16`.

- The LTE Turbo coding Additive White Gaussian Noise (AWGN) simulation now
  supports the decoding of batches of data, using `armral_turbo_decode_batch`.
  The number of batches is specified using the flag `-b <n>`.

- FFT lengths up to 42012 are now supported, although lengths greater
  than 4096 are mostly untested.

### Removed

- Unused FFT kernels have been removed.

### Fixed

- Improved error correction of LDPC decoding (`armral_ldpc_decode_block`) in
  the presence of channel noise. The function now uses 16-bit signed integers
  internally rather than 8-bit signed integers. This may result in decreased
  performance.

- The arguments to the function `armral_turbo_decode_block_noalloc_buffer_size`
  have been changed to remove the unused second argument, `max_iter`.

- When planning FFTs with an unsupported length, `armral_fft_create_plan_cf32`
  and `armral_fft_create_plan_cs16` now return `ARMRAL_ARGUMENT_ERROR`.
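
As an illustration of the LDPC decoding change above, the following sketch compares saturating accumulation in 8-bit and 16-bit signed integers. It assumes that the improvement comes from intermediate values saturating less often, which this changelog does not state. The helpers `demo_sat_add_i8` and `demo_sat_add_i16` are hypothetical and do not reflect the internals of `armral_ldpc_decode_block`.

```c
#include <stdint.h>

/* Hypothetical helper (not an ArmRAL function): add two 8-bit values and
 * clamp the result to the int8_t range instead of wrapping. */
int8_t demo_sat_add_i8(int8_t a, int8_t b) {
  int16_t sum = (int16_t)(a + b);
  if (sum > INT8_MAX) {
    return INT8_MAX;
  }
  if (sum < INT8_MIN) {
    return INT8_MIN;
  }
  return (int8_t)sum;
}

/* Hypothetical helper (not an ArmRAL function): the same operation with a
 * 16-bit range, which leaves far more headroom before clamping. */
int16_t demo_sat_add_i16(int16_t a, int16_t b) {
  int32_t sum = (int32_t)a + (int32_t)b;
  if (sum > INT16_MAX) {
    return INT16_MAX;
  }
  if (sum < INT16_MIN) {
    return INT16_MIN;
  }
  return (int16_t)sum;
}
```

With inputs 100 and 100, the 8-bit helper clamps at 127 while the 16-bit helper keeps the full value 200. The wider type costs more memory traffic per value, which is consistent with the performance note in the entry above.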

## [24.10] - 2024-10-17

### Added

- Added the function `armral_turbo_perm_idx_init` which generates all
  permutation indices used in the permutation step of LTE Turbo decoding.

- Added the function `armral_cmplx_matmul_i16_noalloc` which multiplies two
  matrices of complex Q15 values using a 64-bit Q32.31 accumulator. This
  function does not call any system memory allocators, unlike the existing
  `armral_cmplx_matmul_i16` function.
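
Each element of a matrix product is a dot product, so the Q15 arithmetic with a Q32.31 accumulator described above can be sketched on a single complex dot product. This is a minimal sketch under stated assumptions, not the ArmRAL implementation: the type `demo_cq15` and both functions are hypothetical, and the truncating conversion back to Q15 is an assumption.

```c
#include <stdint.h>

/* Hypothetical stand-in for a complex Q15 sample; ArmRAL's own type is not
 * reproduced here. */
typedef struct {
  int16_t re;
  int16_t im;
} demo_cq15;

/* Convert a Q32.31 accumulator to Q15: drop 16 fractional bits (truncating
 * toward zero, an assumption) and saturate to the int16_t range. */
int16_t demo_q31_to_q15(int64_t acc) {
  int64_t v = acc / 65536;
  if (v > INT16_MAX) {
    return INT16_MAX;
  }
  if (v < INT16_MIN) {
    return INT16_MIN;
  }
  return (int16_t)v;
}

/* Dot product of two complex Q15 vectors, accumulated in 64 bits. Each
 * Q15 x Q15 product has 30 fractional bits, so it is doubled to reach the
 * 31 fractional bits of a Q32.31 accumulator. */
demo_cq15 demo_cq15_dot(const demo_cq15 *a, const demo_cq15 *b, uint32_t n) {
  int64_t acc_re = 0;
  int64_t acc_im = 0;
  for (uint32_t i = 0; i < n; ++i) {
    int64_t ar = a[i].re;
    int64_t ai = a[i].im;
    int64_t br = b[i].re;
    int64_t bi = b[i].im;
    acc_re += 2 * (ar * br - ai * bi);
    acc_im += 2 * (ar * bi + ai * br);
  }
  demo_cq15 out = {demo_q31_to_q15(acc_re), demo_q31_to_q15(acc_im)};
  return out;
}
```

For example, 0.5 is 16384 in Q15, and multiplying it by itself gives 8192, which is 0.25 in Q15. Because the accumulator is 64 bits wide, saturation only happens once, at the final conversion.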

### Changed

- The interfaces for `armral_turbo_decode_block` and
  `armral_turbo_decode_block_noalloc` now have an additional argument. They now
  include the option to supply a user-allocated buffer which, if used, must be
  initialized with permutation indices by calling
  `armral_turbo_perm_idx_init`. This buffer can then be reused in subsequent
  calls to the Turbo decoding functions and will improve their performance by
  removing the need to compute the indices on each call. If the buffer is not
  initialized and a null pointer is passed instead, the functions will recompute
  the permutation indices on every call.

- Improved performance of `armral_fft_execute_cf32` and
  `armral_fft_execute_cs16`. Cases which were calculated using recursive calls
  to Rader's algorithm are now calculated using Bluestein's algorithm.
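
The reuse pattern for the permutation index buffer described above can be sketched as follows. The functions `demo_perm_idx_init` and `demo_permute` are hypothetical stand-ins, not the signatures of `armral_turbo_perm_idx_init` or the Turbo decoding functions, and the buffer layout is an assumption. The indices use a quadratic permutation polynomial of the kind used by the LTE Turbo interleaver.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an index-initialization routine: fills
 * buf[0..k-1] with pi(i) = (f1 * i + f2 * i * i) mod k. */
void demo_perm_idx_init(uint16_t *buf, uint32_t k, uint32_t f1, uint32_t f2) {
  for (uint32_t i = 0; i < k; ++i) {
    buf[i] = (uint16_t)(((uint64_t)f1 * i + (uint64_t)f2 * i * i) % k);
  }
}

/* Hypothetical stand-in for a routine that needs the permutation. If
 * perm_idx is NULL the indices are recomputed on every call; otherwise the
 * caller's precomputed buffer is read instead. */
void demo_permute(const uint8_t *in, uint8_t *out, uint32_t k, uint32_t f1,
                  uint32_t f2, const uint16_t *perm_idx) {
  for (uint32_t i = 0; i < k; ++i) {
    uint32_t src;
    if (perm_idx != NULL) {
      src = perm_idx[i];
    } else {
      src = (uint32_t)(((uint64_t)f1 * i + (uint64_t)f2 * i * i) % k);
    }
    out[i] = in[src];
  }
}
```

A caller would initialize the buffer once, for example with the illustrative values `k = 40`, `f1 = 3`, and `f2 = 10`, and then pass it to every subsequent call. Passing a null pointer gives the same result but repeats the index computation each time.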

### Fixed

- Fixed performance regressions in the SVE versions of the following routines:

  - `armral_cmplx_vecdot_f32`
  - `armral_cmplx_vecmul_f32_2`

## [24.07] - 2024-07-18

### Added

- CMake option `ARMRAL_ENABLE_WEXTRA` to add the compiler flag `-Wextra` when
  building the library and tests.

### Changed

- Documentation is now installed by the `make install` target, if it has been
  built.

- Improved performance of `armral_cmplx_matmul_f32`. For complex 32-bit floating
  point matrix multiplication, we recommend you use this function for all
  cases. This function calls existing optimized special cases with minimal
  overhead and has new optimizations for larger cases.

- Improved performance of `armral_turbo_decode_block` and
  `armral_turbo_decode_block_noalloc`. These functions now operate internally on
  16-bit integer values rather than 16-bit or 32-bit floating point values.

- The following functions now use unsigned integers in their interfaces to
  represent the lengths of vectors and the dimensions of matrices:

  - `armral_cmplx_vecdot_f32`
  - `armral_cmplx_vecdot_f32_2`
  - `armral_cmplx_vecdot_i16`
  - `armral_cmplx_vecdot_i16_2`
  - `armral_cmplx_vecdot_i16_32bit`
  - `armral_cmplx_vecdot_i16_2_32bit`
  - `armral_cmplx_vecmul_f32`
  - `armral_cmplx_vecmul_f32_2`
  - `armral_cmplx_vecmul_i16`
  - `armral_cmplx_vecmul_i16_2`
  - `armral_corr_coeff_i16`
  - `armral_svd_cf32`
  - `armral_svd_cf32_noalloc`
  - `armral_svd_cf32_noalloc_buffer_size`

- Renamed `armral_cmplx_mat_mult_aah_f32` to `armral_cmplx_matmul_aah_f32`.
  All arguments are in the same order and have the same meaning.

- Replaced `armral_cmplx_mat_mult_ahb_f32` with `armral_cmplx_matmul_ahb_f32`.
  Note that the meanings of the parameters `m`, `n`, and `k` differ between the
  old function and the new; a call to the old function of the form

    `armral_cmplx_mat_mult_ahb_f32(dim1, dim2, dim3, a, b, c);`

    becomes

    `armral_cmplx_matmul_ahb_f32(dim2, dim3, dim1, a, b, c);`

- Replaced `armral_cmplx_mat_mult_i16` with `armral_cmplx_matmul_i16`.  Note
  that the meanings of the parameters `m`, `n`, and `k` differ between the old
  function and the new; a call to the old function of the form

    `armral_cmplx_mat_mult_i16(dim1, dim2, dim3, a, b, c);`

    becomes

    `armral_cmplx_matmul_i16(dim1, dim3, dim2, a, b, c);`

- Replaced `armral_cmplx_mat_mult_i16_32bit` with
  `armral_cmplx_matmul_i16_32bit`.  Note that the meanings of the parameters
  `m`, `n`, and `k` differ between the old function and the new; a call to the
  old function of the form

    `armral_cmplx_mat_mult_i16_32bit(dim1, dim2, dim3, a, b, c);`

    becomes

    `armral_cmplx_matmul_i16_32bit(dim1, dim3, dim2, a, b, c);`

- Replaced `armral_cmplx_mat_mult_f32` with `armral_cmplx_matmul_f32`.  Note that
  the meanings of the parameters `m`, `n`, and `k` differ between the old
  function and the new; a call to the old function of the form

    `armral_cmplx_mat_mult_f32(dim1, dim2, dim3, a, b, c);`

    becomes

    `armral_cmplx_matmul_f32(dim1, dim3, dim2, a, b, c);`
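
The argument reorderings listed above for the replaced matrix multiplication functions can be summarized in a small sketch. The type `demo_dims` and both helpers are hypothetical and exist only to make the two mappings explicit; the use of `unsigned` for the dimensions is also an assumption.

```c
/* Hypothetical helper type (not part of ArmRAL): the three leading dimension
 * arguments in the order the new function expects them. */
typedef struct {
  unsigned first;
  unsigned second;
  unsigned third;
} demo_dims;

/* Mapping for armral_cmplx_matmul_i16, armral_cmplx_matmul_i16_32bit and
 * armral_cmplx_matmul_f32: old (dim1, dim2, dim3) -> new (dim1, dim3, dim2). */
demo_dims demo_reorder_matmul(unsigned dim1, unsigned dim2, unsigned dim3) {
  demo_dims d = {dim1, dim3, dim2};
  return d;
}

/* Mapping for armral_cmplx_matmul_ahb_f32:
 * old (dim1, dim2, dim3) -> new (dim2, dim3, dim1). */
demo_dims demo_reorder_matmul_ahb(unsigned dim1, unsigned dim2,
                                  unsigned dim3) {
  demo_dims d = {dim2, dim3, dim1};
  return d;
}
```

For an old call with dimensions `(2, 3, 4)`, the first mapping yields `(2, 4, 3)` and the second yields `(3, 4, 2)`. The matrix arguments `a`, `b`, and `c` keep their positions in all four cases.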

### Fixed

- Corrected documentation for `armral_cmplx_mat_inverse_batch_f32` and
  `armral_cmplx_mat_inverse_batch_f32_pa` to clarify that these functions have
  no restriction on batch sizes.

## [24.04] - 2024-04-19

### Added

- Makefile target `bench_excel_summary` to run the benchmarks and create an
  Excel spreadsheet containing the results.

### Changed

- Moved `license_terms/BSD-3-Clause.txt` and
  `license_terms/third_party_licenses.txt` to
  [LICENSE.md](https://gitlab.arm.com/networking/ral/-/blob/main/LICENSE.md) and
  [THIRD_PARTY_LICENSES.md](https://gitlab.arm.com/networking/ral/-/blob/main/THIRD_PARTY_LICENSES.md)
  respectively.

- Extended `armral_cmplx_pseudo_inverse_direct_f32` and
  `armral_cmplx_pseudo_inverse_direct_f32_noalloc` to compute the regularized
  pseudo-inverse of a single complex 32-bit matrix of size `M-by-N` for the case
  where `M == 1` and/or `N == 1`.

- Improved SVE2 performance of `armral_turbo_decode_block` and
  `armral_turbo_decode_block_noalloc`.

- Improved SVE2 performance of `armral_ldpc_encode_block` and
  `armral_ldpc_encode_block_noalloc`.

## [24.01] - 2024-01-19

### Changed

- Extended `armral_cmplx_pseudo_inverse_direct_f32` and
  `armral_cmplx_pseudo_inverse_direct_f32_noalloc` to compute the regularized
  pseudo-inverse of a single complex 32-bit matrix of size `M-by-N` for cases
  where `M > N` in addition to the cases where `M <= N`.

- Improved performance of `armral_turbo_decode_block` and
  `armral_turbo_decode_block_noalloc`.

- Improved SVE2 performance of `armral_seq_generator`, for the cases when
  `sequence_len` is not a multiple of 64.

### Fixed

- LDPC block encoding (`armral_ldpc_encode_block`), rate matching
  (`armral_ldpc_rate_matching`) and rate recovery (`armral_ldpc_rate_recovery`),
  and the corresponding channel simulator, now support the insertion and removal
  of filler bits as described in the 3GPP Technical Specification (TS) 38.212.
  From [@Suraj4g5g](https://gitlab.arm.com/Suraj4g5g).

## [23.10] - 2023-10-06

### Changed

- Extended the `sequence_len` parameter of `armral_seq_generator` to `uint32_t`.
  From [@Suraj4g5g](https://gitlab.arm.com/Suraj4g5g).

- Added parameter `i_bil` to `armral_polar_rate_matching` and
  `armral_polar_rate_recovery` to enable or disable bit interleaving. From
  [@Suraj4g5g](https://gitlab.arm.com/Suraj4g5g).

- Added parameter `nref` to `armral_ldpc_rate_matching` and
  `armral_ldpc_rate_recovery` to enable the functions to be used with a soft
  buffer size. From [@Suraj4g5g](https://gitlab.arm.com/Suraj4g5g).

- Improved Neon performance of Polar block decoding
  (`armral_polar_decode_block`) for list sizes 1, 2, 4 and 8.

- Improved Neon performance of LDPC block decoding (`armral_ldpc_decode_block`
  and `armral_ldpc_decode_block_noalloc`).

- Simulation programs are now built by default and are tested by the `make check`
  target.

## [23.07] - 2023-07-07

### Added

- New function to compute the regularized pseudo-inverse of a single complex
  32-bit floating-point matrix (`armral_cmplx_pseudo_inverse_direct_f32`).

- New function to compute the multiplication of a complex 32-bit floating-point
  matrix with its conjugate transpose (`armral_cmplx_mat_mult_aah_f32`).

- New function to compute the complex 32-bit floating-point multiplication of
  the conjugate transpose of a matrix with a matrix
  (`armral_cmplx_mat_mult_ahb_f32`).

- Variants of existing functions which take a pre-allocated buffer rather than
  performing memory allocations internally. For functions where the buffer size
  is not easily calculated from the input parameters, helper functions to
  calculate the required size have been provided.

- Neon-optimized implementation of batched complex 32-bit floating-point
  matrix-vector multiplication (`armral_cmplx_mat_vec_mult_batch_f32`).

- SVE2-optimized implementation of complex 32-bit floating-point general matrix
  inverse for matrices of size `2x2`, `3x3` and `4x4`
  (`armral_cmplx_mat_inverse_f32`).

### Changed

- Improved Neon and SVE2 performance of Mu Law compression
  (`armral_mu_law_compr_8bit`, `armral_mu_law_compr_9bit`, and
  `armral_mu_law_compr_14bit`).

- Improved Neon performance of 8-bit block float compression
  (`armral_block_float_compr_8bit`).

- Improved SVE2 performance of 9-bit block scaling decompression
  (`armral_block_scaling_decompr_9bit`).

- Improved SVE2 performance of 14-bit block scaling decompression
  (`armral_block_scaling_decompr_14bit`).

- Improved SVE2 performance of 8-bit and 12-bit block float compression
  (`armral_block_float_compr_8bit` and `armral_block_float_compr_12bit`).

- Moved the definition of the symbol rate out of the `ebn0_to_snr` function
  (`simulation/awgn/awgn.cpp`) so that it is now a parameter that gets passed in
  by each of the simulation programs.

- Updated the `convolutional_awgn` simulation program to use OpenMP
  (`simulation/convolutional_awgn/convolutional_awgn.cpp`).

- Updated simulation programs to accept a path to write graphs to, instead of
  auto-generating filenames.

- Added the maximum number of iterations to the output of the Turbo simulation
  program (`simulation/turbo_awgn/turbo_error_rate.py`).

- Updated formatting of labels in simulation graph legends.
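
For context on the `ebn0_to_snr` change above, the textbook relationship between Eb/N0 and SNR can be sketched as follows. This is a minimal sketch, not the code in `simulation/awgn/awgn.cpp`: the helper `demo_ebn0_to_snr_linear` is hypothetical, it works in linear units rather than decibels, and it assumes that the noise bandwidth equals the symbol rate.

```c
/* Hypothetical helper (not the ebn0_to_snr function from
 * simulation/awgn/awgn.cpp). Under the assumption that the noise bandwidth
 * equals the symbol rate, the per-symbol SNR is Eb/N0 multiplied by the
 * number of information bits carried by each symbol. All ratios are linear,
 * not in decibels. */
double demo_ebn0_to_snr_linear(double ebn0_linear, unsigned bits_per_symbol,
                               double code_rate) {
  return ebn0_linear * (double)bits_per_symbol * code_rate;
}
```

For example, with a linear Eb/N0 of 2.0, 4 bits per symbol, and a code rate of 0.5, each symbol carries 2 information bits and the SNR is 4.0. Under this assumption the spectral efficiency is at most `bits_per_symbol`, which is reached when the code rate is 1.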

### Fixed

- Removed bandwidth scaling in all simulation programs so that the maximum
  spectral efficiency does not exceed the number of bits per symbol.

- Convolutional decoding algorithm
  (`armral_tail_biting_convolutional_decode_block`) now returns correct results
  for input lengths greater than 255.

- Test file for convolutional decoding (`test/ConvCoding/decoding/main.cpp`) is
  updated so that the tests pass as expected for input lengths which are not a
  multiple of 4.

- Neon block float decompression functions (`armral_block_float_decompr_8bit`,
  `armral_block_float_decompr_9bit`, `armral_block_float_decompr_12bit`, and
  `armral_block_float_decompr_14bit`) now truncate values before storing rather
  than rounding them. This means the Neon implementations of these functions now
  have the same behavior as the SVE implementations.

- Neon block scaling decompression functions
  (`armral_block_scaling_decompr_8bit`, `armral_block_scaling_decompr_9bit`, and
  `armral_block_scaling_decompr_14bit`) now truncate values before storing
  rather than rounding them. This means the Neon implementations of these
  functions now have the same behavior as the SVE implementations.

## [23.04] - 2023-04-21

### Added

- Cyclic Redundancy Check (CRC) attachment function
  (`armral_polar_crc_attachment`) for Polar codes, described in section 5.2.1 of
  the 3GPP Technical Specification (TS) 38.212.

- CRC function to check the validity of the output(s) of Polar decoding
  (`armral_check_crc_polar`).

- New simulation program `modulation_awgn` which plots the error rate versus
  Eb/N0 (or signal-to-noise ratio (SNR)) of taking a hard demodulation decision
  for data sent over a noisy channel with no forward error correction.

- Added a field called `snr` to the JSON output of all simulation programs,
  which stores the signal-to-noise ratio.

- Added a flag called `x-unit` to all plotting scripts which allows the user to
  choose whether Eb/N0 or SNR is plotted on the x-axis.

- Added CRC attachment and check in Polar codes simulation.

### Changed

- Updated
  [license terms](https://gitlab.arm.com/networking/ral/-/blob/main/license_terms/BSD-3-Clause.txt)
  to BSD-3-Clause.

- Updated Polar decoding (`armral_polar_decode_block`) to accept a list size of
  8.

- LDPC decoding (`armral_ldpc_decode_block`) can optionally make use of attached
  CRC information to terminate iteration early in the case that a match is
  found.

- Improved Neon performance of tail-biting convolutional encoder for LTE
  (`armral_tail_biting_convolutional_encode_block`).

- Improved Neon performance of tail-biting convolutional decoder for LTE
  (`armral_tail_biting_convolutional_decode_block`).

### Fixed

- Calculation of the encoded data length in the LDPC simulation program
  (`armral/simulation/ldpc_awgn/ldpc_error_rate.py`) is updated to match that
  used in ArmRAL.

- Graphs generated from results of simulation programs in the simulation
  directory no longer plot Shannon limits and theoretical maxima versus block
  error rates. Shannon limits and theoretical maxima continue to be plotted for
  bit error rates.

## [23.01] - 2023-01-27

### Added

- Rate matching for Turbo coding (`armral_turbo_rate_matching`). This implements
  the operations in section 5.1.4.1 of the 3GPP Technical Specification (TS)
  36.212.

- Rate recovery for Turbo coding (`armral_turbo_rate_recovery`). This implements
  the inverse operations of rate matching. Rate matching is described in section
  5.1.4.1 of the 3GPP Technical Specification (TS) 36.212.

- Tail-biting convolutional encoder for LTE
  (`armral_tail_biting_convolutional_encode_block`).

- Tail-biting convolutional decoder for LTE
  (`armral_tail_biting_convolutional_decode_block`).

- Scrambling for Physical Uplink Control Channels (PUCCH) formats 2, 3 and 4,
  Physical Downlink Shared Channel (PDSCH), Physical Downlink Control Channel
  (PDCCH), and Physical Broadcast Channel (PBCH) (`armral_scramble_code_block`).
  This covers scrambling as described in 3GPP Technical Specification (TS)
  38.211, sections 6.3.2.5.1, 6.3.2.6.1, 7.3.1.1, 7.3.2.3, and 7.3.3.1.

- Simulation program for LTE tail-biting convolutional coding
  (`armral/simulation/convolutional_awgn`).

- Python script that allows users to draw the data rates of each modulation and
  compare them to the capacity of the AWGN channel
  (`armral/simulation/capacity/capacity.py`).

- SVE2-optimized implementation of complex 32-bit floating point matrix-vector
  multiplication (`armral_cmplx_mat_vec_mult_f32`).

- SVE2-optimized implementation of 14-bit block scaling decompression
  (`armral_block_scaling_decompr_14bit`).

### Changed

- Modified error rate Python scripts (under `armral/simulation`) to use Eb/N0 as
  x-axis (instead of the SNR) and to show the Shannon limits.

- Added Turbo rate matching and recovery to the Turbo simulation program
  (`armral/simulation/turbo_awgn/turbo_awgn.cpp`).

- Improved Neon performance of block-float decompression for 9-bit and 14-bit
  block-float representations (`armral_block_float_decompr_9bit` and
  `armral_block_float_decompr_14bit`).

- Improved Neon performance of complex 32-bit floating point matrix-vector
  multiplication (`armral_cmplx_mat_vec_mult_f32`).

- Improved Neon performance of Gold sequence generator (`armral_seq_generator`).

- Improved Neon performance of general matrix inversion
  (`armral_cmplx_mat_inverse_f32`).

- Improved Neon performance of batched general matrix inversion
  (`armral_cmplx_mat_inverse_batch_f32`).

### Fixed

- Documentation of the interface for Polar rate recovery
  (`armral_polar_rate_recovery`) updated to reflect how the parameters are used
  in the implementation.

## [22.10] - 2022-10-07

### Added

- SVE2-optimized implementations of `2x2` and `4x4` matrix multiplication
  functions where in-phase and quadrature components are separated
  (`armral_cmplx_mat_mult_2x2_f32_iq` and `armral_cmplx_mat_mult_4x4_f32_iq`).

### Changed

- The program to evaluate the error-correction performance of Polar coding in
  the presence of additive white Gaussian noise (AWGN) located in
  `simulation/polar_awgn` is updated to no longer take the length of a code
  block as a parameter.

- Improved the Neon and SVE2 performance of LDPC encoding for a single code
  block (`armral_ldpc_encode_block`).

- Improved the Neon performance of Turbo decoding for a single code block
  (`armral_turbo_decode_block`).

- Improved the Neon performance of Turbo encoding for a single code block
  (`armral_turbo_encode_block`).

- Improved the Neon performance of 32-bit floating point general matrix
  inversion (`armral_cmplx_mat_inverse_f32`).

- Improved the Neon performance of 32-bit floating point batch general matrix
  inversion (`armral_cmplx_mat_inverse_batch_f32` and
  `armral_cmplx_mat_inverse_batch_f32_pa`).

### Fixed

- The Turbo coding simulation program now builds when performing an SVE build of
  the library.

## [22.07] - 2022-07-15

### Added

- SVE2-optimized implementation of equalization with four subcarriers
  (`armral_solve_*x*_4sc_f32`).

- Matrix-vector multiplication functions for batches of 32-bit complex
  floating-point matrices and vectors (`armral_cmplx_mat_vec_mult_batch_f32` and
  `armral_cmplx_mat_vec_mult_batch_f32_pa`).

- LTE Turbo encoding function (`armral_turbo_encode_block`) that implements the
  encoding scheme defined in section 5.1.3.2 of the 3GPP Technical Specification
  (TS) 36.212 "Multiplexing and channel coding".

- LTE Turbo decoding function (`armral_turbo_decode_block`) that implements a
  maximum a posteriori (MAP) algorithm to return a hard decision (either 0 or 1)
  for each output bit.

- Functions to perform rate matching and rate recovery for Polar coding. These
  implement the specification in section 5.4.1 of the 3GPP Technical Specification
  (TS) 38.212.

- Functions to perform rate matching and rate recovery for LDPC coding. These
  implement the specification in section 5.4.2 of the 3GPP Technical
  Specification (TS) 38.212.

- Utilities to simulate the error correction performance for Polar, LDPC and
  Turbo coding over a noisy channel.

### Changed

- Renamed the Polar encoding and decoding functions to
  `armral_polar_encode_block` and `armral_polar_decode_block`.

- Improved the Neon and SVE2 performance of 16-QAM modulation
  (`armral_modulation` with `armral_modulation_type` set to `ARMRAL_MOD_16QAM`).

- Improved the SVE2 performance of Mu law compression and decompression
  (`armral_mu_law_compr_*` and `armral_mu_law_decompr_*`).

- Improved the SVE2 performance of block float compression and decompression
  (`armral_block_float_compr_*` and `armral_block_float_decompr_*`).

- Improved the SVE2 performance of 8-bit block scaling compression
  (`armral_block_scaling_compr_8bit`).

- Improved the performance of 32-bit floating-point and 16-bit fixed-point
  complex valued FFTs (`armral_fft_execute_cf32` and `armral_fft_execute_cs16`)
  with large prime factors.

## [22.04] - 2022-04-08

### Added

- SVE2-optimized implementations of batched 16-bit fixed-point matrix-vector
  multiplication with 64-bit and 32-bit fixed-point accumulator
  (`armral_cmplx_mat_vec_mult_batch_i16`,
  `armral_cmplx_mat_vec_mult_batch_i16_pa`,
  `armral_cmplx_mat_vec_mult_batch_i16_32bit`,
  `armral_cmplx_mat_vec_mult_batch_i16_32bit_pa`).

- SVE2-optimized implementation of complex 32-bit floating-point singular value
  decomposition (`armral_svd_cf32`).

- SVE2-optimized implementations of complex 32-bit floating-point Hermitian
  matrix inversion for a single matrix or a batch of matrices of size `3x3`
  (`armral_cmplx_hermitian_mat_inverse_f32` and
  `armral_cmplx_hermitian_mat_inverse_batch_f32`).

- SVE2-optimized implementations of 9-bit and 14-bit Mu law compression
  (`armral_mu_law_compr_9bit` and `armral_mu_law_compr_14bit`).

- SVE2-optimized implementations of 9-bit and 14-bit Mu law decompression
  (`armral_mu_law_decompr_9bit` and `armral_mu_law_decompr_14bit`).

- Complex 32-bit floating-point general matrix inversion for matrices of size
  `2x2`, `3x3`, `4x4`, `8x8`, and `16x16` (`armral_cmplx_mat_inverse_f32`).

### Changed

- Improved the performance of batched 16-bit fixed-point matrix-vector
  multiplication with a 64-bit fixed-point accumulator
  (`armral_cmplx_mat_vec_mult_batch_i16` and
  `armral_cmplx_mat_vec_mult_batch_i16_pa`).

- Improved the performance of batched 16-bit fixed-point matrix-vector
  multiplication with a 32-bit fixed-point accumulator
  (`armral_cmplx_mat_vec_mult_batch_i16_32bit` and
  `armral_cmplx_mat_vec_mult_batch_i16_32bit_pa`).

- Improved the performance of 14-bit block float compression
  (`armral_block_float_compr_14bit`).

- Improved the performance of 14-bit block scaling compression
  (`armral_block_scaling_compr_14bit`).

- Improved the performance of 14-bit Mu law compression
  (`armral_mu_law_compr_14bit`).

- Improved the performance of complex 32-bit floating-point singular value
  decomposition (`armral_svd_cf32`). The input matrix now needs to be stored in
  column-major order. Output matrices are also returned in column-major order.

- Improved the performance of complex 32-bit floating-point Hermitian matrix
  inversion for a single matrix or a batch of matrices of size `3x3`
  (`armral_cmplx_hermitian_mat_inverse_f32` and
  `armral_cmplx_hermitian_mat_inverse_batch_f32`).

- Improved the performance of Polar list decoding (`armral_polar_decoder`) with
  list size 4. Performance for list size 1 is slightly reduced, but list size 4
  gives much better error correction.

- Added restrictions on the number of matrices and vectors in the batch for the
  functions that perform batched matrix-vector multiplications in fixed-point
  precision (`armral_cmplx_mat_vec_mult_batch_i16`,
  `armral_cmplx_mat_vec_mult_batch_i16_pa`,
  `armral_cmplx_mat_vec_mult_batch_i16_32bit`,
  `armral_cmplx_mat_vec_mult_batch_i16_32bit_pa`).

- The function to perform fixed-point complex matrix-matrix multiplication with
  a 64-bit accumulator (`armral_cmplx_mat_mult_i16`) now narrows from the 64-bit
  accumulator to a 32-bit intermediate value, and then to the 16-bit result
  using truncating narrowing operations instead of rounding operations. This
  matches the behavior in the fixed-point complex matrix-matrix multiplication
  with a 32-bit accumulator.

- The function to perform fixed-point complex matrix-vector multiplication with
  a 64-bit accumulator (`armral_cmplx_mat_vec_mult_i16`) now narrows from the
  64-bit accumulator to a 32-bit intermediate value, and then to the 16-bit
  result using truncating narrowing operations instead of rounding
  operations. This matches the behavior in the fixed-point complex matrix-vector
  multiplication with a 32-bit accumulator.
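
The difference between truncating and rounding narrowing, as described in the
last two entries, can be illustrated with a scalar sketch. This is not the
ArmRAL implementation: the function names, the caller-supplied shift amounts,
and the saturation applied at each stage are assumptions made for illustration
only.

```c
#include <stdint.h>

/* Portable floor(v / 2^n). Avoids relying on the implementation-defined
   result of right-shifting a negative signed value. */
static int64_t shr_floor(int64_t v, unsigned n) {
  return v >= 0 ? v >> n : ~((~v) >> n);
}

/* Clamp to the int32_t range (saturation is an assumption of this sketch). */
static int32_t sat_i32(int64_t v) {
  if (v > INT32_MAX) return INT32_MAX;
  if (v < INT32_MIN) return INT32_MIN;
  return (int32_t)v;
}

/* Clamp to the int16_t range (saturation is an assumption of this sketch). */
static int16_t sat_i16(int64_t v) {
  if (v > INT16_MAX) return INT16_MAX;
  if (v < INT16_MIN) return INT16_MIN;
  return (int16_t)v;
}

/* Truncating narrowing: 64-bit accumulator -> 32-bit intermediate -> 16-bit
   result. The low bits are discarded at each stage. The shift amounts are
   hypothetical parameters, not values taken from the library. */
int16_t narrow_truncate(int64_t acc, unsigned shift1, unsigned shift2) {
  int32_t mid = sat_i32(shr_floor(acc, shift1));
  return sat_i16(shr_floor(mid, shift2));
}

/* Rounding narrowing: adds half of the discarded range before each shift.
   Requires shift1 >= 1 and shift2 >= 1, and acc must leave headroom for the
   rounding constant. */
int16_t narrow_round(int64_t acc, unsigned shift1, unsigned shift2) {
  int64_t half1 = (int64_t)1 << (shift1 - 1);
  int64_t half2 = (int64_t)1 << (shift2 - 1);
  int32_t mid = sat_i32(shr_floor(acc + half1, shift1));
  return sat_i16(shr_floor((int64_t)mid + half2, shift2));
}
```

For example, with hypothetical shifts of 7 and 8 bits, an accumulator value of
49151 narrows to 1 when truncating and to 2 when rounding, so results can
differ by one least significant bit between the two behaviors.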