diff --git a/.gitlab/merge_request_templates/Bugfix.md b/.gitlab/merge_request_templates/Bugfix.md
index bea5b43ffcb0e90257d1d9c5be5a55e492e9b455..e7bce9ba05de1ef04d06cc4c072ce72cda0eb2f4 100644
--- a/.gitlab/merge_request_templates/Bugfix.md
+++ b/.gitlab/merge_request_templates/Bugfix.md
@@ -8,6 +8,7 @@ If an [Issue](https://gitlab.arm.com/networking/ral/-/issues) already exists for
 
 * [] [Contribution meets RAL's licence terms](https://gitlab.arm.com/networking/ral/-/blob/main/CONTRIBUTING.md#user-content-licensing-information)
 * [] [Documentation updated](https://gitlab.arm.com/networking/ral/-/blob/main/CONTRIBUTING.md#user-content-documentation)
+* [] ["Unreleased" section of the Changelog updated](https://gitlab.arm.com/networking/ral/-/blob/main/CHANGELOG.md#unreleased)
 * [] [`clang-format` and `clang-tidy` run and changes included (C/C++ code)](https://gitlab.arm.com/networking/ral/-/blob/main/CONTRIBUTING.md#user-content-cc-code-style)
 * [] [`flake8` run and changes included (Python code)](https://gitlab.arm.com/networking/ral/-/blob/main/CONTRIBUTING.md#user-content-python-code-style)
 * [] Commit message includes information on how to reproduce the issue(s)
diff --git a/.gitlab/merge_request_templates/Default.md b/.gitlab/merge_request_templates/Default.md
index 656f1d80e2e2caf224e129453f83608c379fadb6..7e12eddf190a0e6aae4d5bc03acca22a34754550 100644
--- a/.gitlab/merge_request_templates/Default.md
+++ b/.gitlab/merge_request_templates/Default.md
@@ -12,6 +12,7 @@ If this Merge Request addresses an [Issue](https://gitlab.arm.com/networking/ral
 * [] [New functions adhere to RAL's naming scheme](https://gitlab.arm.com/networking/ral/-/blob/main/CONTRIBUTING.md#user-content-function-naming)
 * [] [Contribution conforms to RAL's directory structure](https://gitlab.arm.com/networking/ral/-/blob/main/CONTRIBUTING.md#user-content-directory-structure)
 * [] [Documentation updated](https://gitlab.arm.com/networking/ral/-/blob/main/CONTRIBUTING.md#user-content-documentation)
+* [] ["Unreleased" section of the Changelog updated](https://gitlab.arm.com/networking/ral/-/blob/main/CHANGELOG.md#unreleased)
 * [] [`clang-format` and `clang-tidy` run and changes included (C/C++ code)](https://gitlab.arm.com/networking/ral/-/blob/main/CONTRIBUTING.md#user-content-cc-code-style)
 * [] [`flake8` run and changes included (Python code)](https://gitlab.arm.com/networking/ral/-/blob/main/CONTRIBUTING.md#user-content-python-code-style)
 * [] [Tests added or updated](https://gitlab.arm.com/networking/ral/-/blob/main/CONTRIBUTING.md#user-content-writing-tests)
diff --git a/.gitlab/merge_request_templates/Documentation.md b/.gitlab/merge_request_templates/Documentation.md
index 249b11b56a0a0b22ac0ff6609d5768771bf7ea31..ab2485305a5ffe4233143bb71ab77f426fa778fc 100644
--- a/.gitlab/merge_request_templates/Documentation.md
+++ b/.gitlab/merge_request_templates/Documentation.md
@@ -9,6 +9,7 @@ If this Merge Request addresses an [Issue](https://gitlab.arm.com/networking/ral
 ## Checklist
 
 * [] [Contribution meets RAL's licence terms](https://gitlab.arm.com/networking/ral/-/blob/main/CONTRIBUTING.md#user-content-licensing-information)
+* [] ["Unreleased" section of the Changelog updated](https://gitlab.arm.com/networking/ral/-/blob/main/CHANGELOG.md#unreleased)
 * [] [`make docs` target runs successfully](https://gitlab.arm.com/networking/ral/-/blob/main/README.md?ref_type=heads#user-content-documentation)
 * [] [`clang-format` and `clang-tidy` run and changes included (C/C++ code)](https://gitlab.arm.com/networking/ral/-/blob/main/CONTRIBUTING.md#user-content-cc-code-style)
 * [] [`flake8` run and changes included (Python code)](https://gitlab.arm.com/networking/ral/-/blob/main/CONTRIBUTING.md#user-content-python-code-style)
diff --git a/CHANGELOG.md b/CHANGELOG.md
new file mode 100644
index 0000000000000000000000000000000000000000..ee10db9dbc912adc24907fd24fb240426fd99ab3
--- /dev/null
+++ b/CHANGELOG.md
@@ -0,0 +1,423 @@
+# Changelog
+
+All notable changes to the Arm RAN Acceleration Library (ArmRAL) project will be
+documented in this file.
+
+## [Unreleased]
+
+### Added
+
+### Changed
+
+### Deprecated
+
+### Removed
+
+### Fixed
+
+### Security
+
+## [24.01] - 2024-01-19
+
+### Changed
+- Extended `armral_cmplx_pseudo_inverse_direct_f32` and
+`armral_cmplx_pseudo_inverse_direct_f32_noalloc` to compute the regularized
+pseudo-inverse of a single complex 32-bit matrix of size `M-by-N` for cases
+where `M > N` in addition to the cases where `M <= N`.
+
+- Improved performance of `armral_turbo_decode_block` and
+`armral_turbo_decode_block_noalloc`.
+
+- Improved SVE2 performance of `armral_seq_generator` for cases where
+`sequence_len` is not a multiple of 64.
+
+### Fixed
+- LDPC block encoding (`armral_ldpc_encode_block`), rate matching
+(`armral_ldpc_rate_matching`) and rate recovery (`armral_ldpc_rate_recovery`),
+and the corresponding channel simulator, now support the insertion and removal
+of filler bits as described in the 3GPP Technical Specification (TS) 38.212.
+From [@Suraj4g5g](https://gitlab.arm.com/Suraj4g5g).
+
+## [23.10] - 2023-10-06
+
+### Changed
+- Extended the `sequence_len` parameter of `armral_seq_generator` to `uint32_t`.
+From [@Suraj4g5g](https://gitlab.arm.com/Suraj4g5g).
+
+- Added parameter `i_bil` to `armral_polar_rate_matching` and
+`armral_polar_rate_recovery` to enable or disable bit interleaving. From
+[@Suraj4g5g](https://gitlab.arm.com/Suraj4g5g).
+
+- Added parameter `nref` to `armral_ldpc_rate_matching` and
+`armral_ldpc_rate_recovery` to enable the functions to be used with a soft
+buffer size. From [@Suraj4g5g](https://gitlab.arm.com/Suraj4g5g).
+
+- Added parameter nref to `armral_ldpc_rate_matching` and
+`armral_ldpc_rate_recovery` to enable the functions to be used with a soft
+buffer size. From [@Suraj4g5g](https://gitlab.arm.com/Suraj4g5g).
+
+- Improved Neon performance of Polar block decoding
+(`armral_polar_decode_block`) for list sizes 1, 2, 4 and 8.
+
+- Improved Neon performance of LDPC block decoding (`armral_ldpc_decode_block`
+and `armral_ldpc_decode_block_noalloc`).
+
+- Simulation programs are now built by default and are tested by the
+`make check` target.
+
+## [23.07] - 2023-07-07
+
+### Added
+- New function to compute the regularized pseudo-inverse of a single complex
+32-bit floating-point matrix (`armral_cmplx_pseudo_inverse_direct_f32`).
+
+- New function to compute the multiplication of a complex 32-bit floating-point
+matrix with its conjugate transpose (`armral_cmplx_mat_mult_aah_f32`).
+
+- New function to compute the complex 32-bit floating-point multiplication of
+the conjugate transpose of a matrix with a matrix
+(`armral_cmplx_mat_mult_ahb_f32`).
+
+- Variants of existing functions which take a pre-allocated buffer rather than
+performing memory allocations internally. For functions where the buffer size is
+not easily calculated from the input parameters, helper functions to calculate
+the required size have been provided.
+
+- Neon-optimized implementation of batched complex 32-bit floating-point
+matrix-vector multiplication (`armral_cmplx_mat_vec_mult_batch_f32`).
+
+- SVE2-optimized implementation of complex 32-bit floating-point general matrix
+inverse for matrices of size `2x2`, `3x3` and `4x4`
+(`armral_cmplx_mat_inverse_f32`).
+
+### Changed
+- Improved Neon and SVE2 performance of Mu Law compression
+(`armral_mu_law_compr_8bit`, `armral_mu_law_compr_9bit`, and
+`armral_mu_law_compr_14bit`).
+
+- Improved Neon performance of 8-bit block float compression
+(`armral_block_float_compr_8bit`).
+
+- Improved SVE2 performance of 9-bit block scaling decompression
+(`armral_block_scaling_decompr_9bit`).
+
+- Improved SVE2 performance of 14-bit block scaling decompression
+(`armral_block_scaling_decompr_14bit`).
+
+- Improved SVE2 performance of 8-bit and 12-bit block float compression
+(`armral_block_float_compr_8bit` and `armral_block_float_compr_12bit`).
+
+- Moved the definition of the symbol rate out of the `ebn0_to_snr` function
+(`simulation/awgn/awgn.cpp`) so that it is now a parameter that gets passed in
+by each of the simulation programs.
+
+- Updated the `convolutional_awgn` simulation program to use OpenMP
+(`simulation/convolutional_awgn/convolutional_awgn.cpp`).
+
+- Updated simulation programs to accept a path to write graphs to, instead of
+auto-generating filenames.
+
+- Added the maximum number of iterations to the output of the Turbo simulation
+program (`simulation/turbo_awgn/turbo_error_rate.py`).
+
+- Updated formatting of labels in simulation graph legends.
+
+### Fixed
+- Removed bandwidth scaling in all simulation programs so that the maximum
+spectral efficiency does not exceed the number of bits per symbol.
+
+- Convolutional decoding algorithm
+(`armral_tail_biting_convolutional_decode_block`) now returns correct results
+for input lengths greater than 255.
+
+- Test file for convolutional decoding (`test/ConvCoding/decoding/main.cpp`) is
+updated so that the tests pass as expected for input lengths which are not a
+multiple of 4.
+
+- Neon block float decompression functions (`armral_block_float_decompr_8bit`,
+`armral_block_float_decompr_9bit`, `armral_block_float_decompr_12bit`, and
+`armral_block_float_decompr_14bit`) now truncate values before storing rather
+than rounding them. This means the Neon implementations of these functions now
+have the same behavior as the SVE implementations.
+
+- Neon block scaling decompression functions
+(`armral_block_scaling_decompr_8bit`, `armral_block_scaling_decompr_9bit`, and
+`armral_block_scaling_decompr_14bit`) now truncate values before storing rather
+than rounding them. This means the Neon implementations of these functions now
+have the same behavior as the SVE implementations.
+
+## [23.04] - 2023-04-21
+
+### Added
+- Cyclic Redundancy Check (CRC) attachment function
+(`armral_polar_crc_attachment`) for Polar codes, described in section 5.2.1 of
+the 3GPP Technical Specification (TS) 38.212.
+
+- CRC function to check the validity of the output(s) of Polar decoding
+(`armral_check_crc_polar`).
+
+- New simulation program `modulation_awgn` which plots the error rate versus
+Eb/N0 (or signal-to-noise ratio (SNR)) of taking a hard demodulation decision
+for data sent over a noisy channel with no forward error correction.
+
+- Added a field called `snr` to the JSON output of all simulation programs,
+which stores the signal-to-noise ratio.
+
+- Added a flag called `x-unit` to all plotting scripts which allows the user to
+choose whether Eb/N0 or SNR is plotted on the x-axis.
+
+- Added CRC attachment and check in Polar codes simulation.
+
+### Changed
+
+- Updated
+[license terms](https://gitlab.arm.com/networking/ral/-/blob/main/license_terms/BSD-3-Clause.txt)
+to BSD-3-Clause.
+
+- Updated Polar decoding (`armral_polar_decode_block`) to accept a list size of
+8.
+
+- LDPC decoding (`armral_ldpc_decode_block`) can optionally make use of attached
+CRC information to terminate iteration early in the case that a match is found.
+
+- Improved Neon performance of tail-biting convolutional encoder for LTE
+(`armral_tail_biting_convolutional_encode_block`).
+
+- Improved Neon performance of tail-biting convolutional decoder for LTE
+(`armral_tail_biting_convolutional_decode_block`).
+
+### Fixed
+- Calculation of the encoded data length in the LDPC simulation program
+(`armral/simulation/ldpc_awgn/ldpc_error_rate.py`) is updated to match that used
+in Arm RAN Acceleration Library.
+
+- Graphs generated from results of simulation programs in the simulation
+directory no longer plot Shannon limits and theoretical maxima versus block
+error rates. Shannon limits and theoretical maxima continue to be plotted for
+bit error rates.
+
+## [23.01] - 2023-01-27
+
+### Added
+- Rate matching for Turbo coding (`armral_turbo_rate_matching`). This implements
+the operations in section 5.1.4.1 of the 3GPP Technical Specification (TS)
+36.212.
+
+- Rate recovery for Turbo coding (`armral_turbo_rate_recovery`). This implements
+the inverse operations of rate matching. Rate matching is described in section
+5.1.4.1 of the 3GPP Technical Specification (TS) 36.212.
+
+- Tail-biting convolutional encoder for LTE
+(`armral_tail_biting_convolutional_encode_block`).
+
+- Tail-biting convolutional decoder for LTE
+(`armral_tail_biting_convolutional_decode_block`).
+
+- Scrambling for Physical Uplink Control Channels (PUCCH) formats 2, 3 and 4,
+Physical Downlink Shared Channel (PDSCH), Physical Downlink Control Channel
+(PDCCH), and Physical Broadcast Channel (PBCH) (`armral_scramble_code_block`).
+This covers scrambling as described in 3GPP Technical Specification (TS) 38.211,
+sections 6.3.2.5.1, 6.3.2.6.1, 7.3.1.1, 7.3.2.3, and 7.3.3.1.
+
+- Simulation program for LTE tail-biting convolutional coding
+(`armral/simulation/convolutional_awgn`).
+
+- Python script that allows users to draw the data rates of each modulation and
+compare them to the capacity of the AWGN channel
+(`armral/simulation/capacity/capacity.py`).
+
+- SVE2-optimized implementation of complex 32-bit floating-point matrix-vector
+multiplication (`armral_cmplx_mat_vec_mult_f32`).
+
+- SVE2-optimized implementation of 14-bit block scaling decompression
+(`armral_block_scaling_decompr_14bit`).
+
+### Changed
+- Modified error rate Python scripts (under `armral/simulation`) to use Eb/N0 as
+x-axis (instead of the SNR) and to show the Shannon limits.
+
+- Added Turbo rate matching and recovery to the Turbo simulation program
+(`armral/simulation/turbo_awgn/turbo_awgn.cpp`).
+
+- Improved Neon performance of block-float decompression for 9-bit and 14-bit
+block-float representations (`armral_block_float_decompr_9bit` and
+`armral_block_float_decompr_14bit`).
+
+- Improved Neon performance of complex 32-bit floating-point matrix-vector
+multiplication (`armral_cmplx_mat_vec_mult_f32`).
+
+- Improved Neon performance of Gold sequence generator (`armral_seq_generator`).
+
+- Improved Neon performance of general matrix inversion
+(`armral_cmplx_mat_inverse_f32`).
+
+- Improved Neon performance of batched general matrix inversion
+(`armral_cmplx_mat_inverse_batch_f32`).
+
+### Fixed
+- Documentation of the interface for Polar rate recovery
+(`armral_polar_rate_recovery`) updated to reflect how the parameters are used
+in the implementation.
+
+## [22.10] - 2022-10-07
+
+### Added
+- SVE2-optimized implementations of `2x2` and `4x4` matrix multiplication
+functions where in-phase and quadrature components are separated
+(`armral_cmplx_mat_mult_2x2_f32_iq` and `armral_cmplx_mat_mult_4x4_f32_iq`).
+
+### Changed
+- The program that evaluates the error-correction performance of Polar coding
+in the presence of additive white Gaussian noise (AWGN), located in
+`simulation/polar_awgn`, no longer takes the length of a code block as a
+parameter.
+
+- Improved the Neon and SVE2 performance of LDPC encoding for a single code
+block (`armral_ldpc_encode_block`).
+
+- Improved the Neon performance of Turbo decoding for a single code block
+(`armral_turbo_decode_block`).
+
+- Improved the Neon performance of Turbo encoding for a single code block
+(`armral_turbo_encode_block`).
+
+- Improved the Neon performance of 32-bit floating-point general matrix
+inversion (`armral_cmplx_mat_inverse_f32`).
+
+- Improved the Neon performance of 32-bit floating-point batch general matrix
+inversion (`armral_cmplx_mat_inverse_batch_f32` and
+`armral_cmplx_mat_inverse_batch_f32_pa`).
+
+### Fixed
+- The Turbo coding simulation program now builds when performing an SVE build of
+the library.
+
+## [22.07] - 2022-07-15
+
+### Added
+- SVE2-optimized implementation of equalization with four subcarriers
+(`armral_solve_*x*_4sc_f32`).
+
+- Matrix-vector multiplication functions for batches of 32-bit complex
+floating-point matrices and vectors (`armral_cmplx_mat_vec_mult_batch_f32` and
+`armral_cmplx_mat_vec_mult_batch_f32_pa`).
+
+- LTE Turbo encoding function (`armral_turbo_encode_block`) that implements the
+encoding scheme defined in section 5.1.3.2 of the 3GPP Technical Specification
+(TS) 36.212 "Multiplexing and channel coding".
+
+- LTE Turbo decoding function (`armral_turbo_decode_block`) that implements a
+maximum a posteriori (MAP) algorithm to return a hard decision (either 0 or 1)
+for each output bit.
+
+- Functions to perform rate matching and rate recovery for Polar coding. These
+implement the specification in section 5.4.1 of the 3GPP Technical Specification
+(TS) 38.212.
+
+- Functions to perform rate matching and rate recovery for LDPC coding. These
+implement the specification in section 5.4.2 of the 3GPP Technical
+Specification (TS) 38.212.
+
+- Utilities to simulate the error-correction performance for Polar, LDPC and
+Turbo coding over a noisy channel.
+
+### Changed
+- Renamed the Polar encoding and decoding functions to
+`armral_polar_encode_block` and `armral_polar_decode_block`.
+
+- Improved the Neon and SVE2 performance of 16-QAM modulation
+(`armral_modulation` with `armral_modulation_type` set to `ARMRAL_MOD_16QAM`).
+
+- Improved the SVE2 performance of Mu law compression and decompression
+(`armral_mu_law_compr_*` and `armral_mu_law_decompr_*`).
+
+- Improved the SVE2 performance of block float compression and decompression
+(`armral_block_float_compr_*` and `armral_block_float_decompr_*`).
+
+- Improved the SVE2 performance of 8-bit block scaling compression
+(`armral_block_scaling_compr_8bit`).
+
+- Improved the performance of 32-bit floating-point and 16-bit fixed-point
+complex valued FFTs (`armral_fft_execute_cf32` and `armral_fft_execute_cs16`)
+with large prime factors.
+
+## [22.04] - 2022-04-08
+
+### Added
+- SVE2-optimized implementations of batched 16-bit fixed-point matrix-vector
+multiplication with 64-bit and 32-bit fixed-point accumulator
+(`armral_cmplx_mat_vec_mult_batch_i16`,
+`armral_cmplx_mat_vec_mult_batch_i16_pa`,
+`armral_cmplx_mat_vec_mult_batch_i16_32bit`,
+`armral_cmplx_mat_vec_mult_batch_i16_32bit_pa`).
+
+- SVE2-optimized implementation of complex 32-bit floating-point singular value
+decomposition (`armral_svd_cf32`).
+
+- SVE2-optimized implementations of complex 32-bit floating-point Hermitian
+matrix inversion for a single matrix or a batch of matrices of size `3x3`
+(`armral_cmplx_hermitian_mat_inverse_f32` and
+`armral_cmplx_hermitian_mat_inverse_batch_f32`).
+
+- SVE2-optimized implementations of 9-bit and 14-bit Mu law compression
+(`armral_mu_law_compr_9bit` and `armral_mu_law_compr_14bit`).
+
+- SVE2-optimized implementations of 9-bit and 14-bit Mu law decompression
+(`armral_mu_law_decompr_9bit` and `armral_mu_law_decompr_14bit`).
+
+- Complex 32-bit floating-point general matrix inversion for matrices of size
+`2x2`, `3x3`, `4x4`, `8x8`, and `16x16` (`armral_cmplx_mat_inverse_f32`).
+
+### Changed
+- Improved the performance of batched 16-bit fixed-point matrix-vector
+multiplication with 64-bit fixed-point accumulator
+(`armral_cmplx_mat_vec_mult_batch_i16` and
+`armral_cmplx_mat_vec_mult_batch_i16_pa`).
+
+- Improved the performance of batched 16-bit fixed-point matrix-vector
+multiplication with 32-bit fixed-point accumulator
+(`armral_cmplx_mat_vec_mult_batch_i16_32bit` and
+`armral_cmplx_mat_vec_mult_batch_i16_32bit_pa`).
+
+- Improved the performance of 14-bit block float compression
+(`armral_block_float_compr_14bit`).
+
+- Improved the performance of 14-bit block scaling compression
+(`armral_block_scaling_compr_14bit`).
+
+- Improved the performance of 14-bit Mu law compression
+(`armral_mu_law_compr_14bit`).
+
+- Improved the performance of complex 32-bit floating-point singular value
+decomposition (`armral_svd_cf32`). The input matrix now needs to be stored in
+column-major order. Output matrices are also returned in column-major order.
+
+- Improved the performance of complex 32-bit floating-point Hermitian matrix
+inversion for a single matrix or a batch of matrices of size `3x3`
+(`armral_cmplx_hermitian_mat_inverse_f32` and
+`armral_cmplx_hermitian_mat_inverse_batch_f32`).
+
+- Improved the performance of Polar list decoding (`armral_polar_decoder`) with
+list size 4. The performance for list size 1 is slightly reduced, but list
+size 4 gives much better error correction.
+
+- Added restrictions on the number of matrices and vectors in the batch for the
+functions that perform batched matrix-vector multiplications in fixed-point
+precision (`armral_cmplx_mat_vec_mult_batch_i16`,
+`armral_cmplx_mat_vec_mult_batch_i16_pa`,
+`armral_cmplx_mat_vec_mult_batch_i16_32bit`,
+`armral_cmplx_mat_vec_mult_batch_i16_32bit_pa`).
+
+- The function to perform fixed-point complex matrix-matrix multiplication with
+a 64-bit accumulator (`armral_cmplx_mat_mult_i16`) now narrows from the 64-bit
+accumulator to a 32-bit intermediate value, and then to the 16-bit result using
+truncating narrowing operations instead of rounding operations. This matches the
+behavior in the fixed-point complex matrix-matrix multiplication with a 32-bit
+accumulator.
+
+- The function to perform fixed-point complex matrix-vector multiplication with
+a 64-bit accumulator (`armral_cmplx_mat_vec_mult_i16`) now narrows from the
+64-bit accumulator to a 32-bit intermediate value, and then to the 16-bit result
+using truncating narrowing operations instead of rounding operations. This
+matches the behavior in the fixed-point complex matrix-vector multiplication
+with a 32-bit accumulator.