- Apr 27, 2023
-
-
The new sysctl sched_pelt_multiplier allows a user to set a clock multiplier to x2 or x4 (x1 being the default). This clock multiplier artificially speeds up PELT ramp up/down similarly to use a faster half-life than the default 32ms. - x1: 32ms half-life - x2: 16ms half-life - x4: 8ms half-life Internally, a new clock is created: rq->clock_task_mult. It sits in the clock hierarchy between rq->clock_task and rq->clock_pelt. [peterz: Use sched_feat()] Signed-off-by:
Vincent Donnefort <vincent.donnefort@arm.com> Signed-off-by:
Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org>
-
The cpufreq cooling devices have been removed by 52e3b2ca ("arm64: dts: qcom: sdm845: Remove cpufreq cooling devices for CPU thermal zones") but the per-CPU thermal zones remain. Per monitor_thermal_zone(), we get one polling workqueue per TZ, which is a lot of noise considering the only time those thermal zones will affect the system will be upon crossing a critical trip point. Remove the per-CPU thermal zones and let the cluster-wide ones handle critical temperature trip points. Signed-off-by:
Valentin Schneider <valentin.schneider@arm.com>
-
This is a temporary hack. Currently trace points for PELT are only triggered when the PELT metrics consumed by the scheduler are actually updated, i.e. util_avg. This means no updates if no 1 ms boundary is being crossed by the update. When reconstructing the PELT signal based on this data, the peak PELT value can therefore be up to 1 ms worth of PELT accumulation off (23 in absolute terms). This leads to a discrepancy that causes test cases to fail. This patch ensures that trace events are always emitted even if the metrics haven't been updated which should allow accurate reconstruction of the PELT signals.
-
Signed-off-by:
Ionela Voinescu <ionela.voinescu@arm.com>
-
arm and arm64: Add Debug per_cpu maps access Add Prove Locking Add Scheduler statistics arm: Add kernel .config support and /proc/config.gz arm64: Add Scheduler debugging Add Ftrace Signed-off-by:
Dietmar Eggemann <dietmar.eggemann@arm.com>
-
arm and arm64: Add Cgroups (+ FAIR_GROUP_SCHED and FREEZER) Add Uclamp support for tasks and taskgroups Add CpuFreq governors and make schedutil default arm: Add Cpuset support Add Scheduler autogroups Add DIE (SCHED_MC) sched domain level Add Energy Model Signed-off-by:
Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by:
Ionela Voinescu <ionela.voinescu@arm.com> [@Ionela: cpufreq governor enablement for both arm and arm64]
-
This reverts commit 5d777b18. [ionela.voinescu@arm.com: modify capacity of current CPU only]
-
These configs are causing instability on Juno boards.
-
For Arm: Add ARM vexpress-spc cpufreq driver Add ARM Big.Little cpuidle driver Add Sensor Vexpress Disable CONFIG_HARDEN_BRANCH_HISTORY
-
-
- Apr 24, 2023
-
-
Deepak Kumar Mishra authored
-
- Apr 21, 2023
-
-
Aaron Thompson authored
Have local_clock() return sched_clock() if sched_clock_init() has not yet run. sched_clock_cpu() has this check but it was not included in the new noinstr implementation of local_clock(). The effect can be seen on x86 with CONFIG_PRINTK_TIME enabled, for instance. scd->clock quickly reaches the value of TICK_NSEC and that value is returned until sched_clock_init() runs. dmesg without this patch: [ 0.000000] kvm-clock: ... [ 0.000002] kvm-clock: ... [ 0.000672] clocksource: ... [ 0.001000] tsc: ... [ 0.001000] e820: ... [ 0.001000] e820: ... ... [ 0.001000] ..TIMER: ... [ 0.001000] clocksource: ... [ 0.378956] Calibrating delay loop ... [ 0.379955] pid_max: ... dmesg with this patch: [ 0.000000] kvm-clock: ... [ 0.000001] kvm-clock: ... [ 0.000675] clocksource: ... [ 0.002685] tsc: ... [ 0.003331] e820: ... [ 0.004190] e820: ... ... [ 0.421939] ..TIMER: ... [ 0.422842] clocksource: ... [ 0.424582] Calibrating delay loop ... [ 0.425580] pid_max: ... Fixes: 776f2291 ("sched/clock: Make local_clock() noinstr") Signed-off-by:
Aaron Thompson <dev@aaront.org> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20230413175012.2201-1-dev@aaront.org
-
Zhaohui Shi authored
Commit 95158a89 ("sched,rt: Use the full cpumask for balancing") allows find_lock_lowest_rq() to pick a task with migration disabled. The purpose of the commit is to push the current running task on the CPU that has the migrate_disable() task away. However, there is a race which allows a migrate_disable() task to be migrated. Consider: CPU0 CPU1 push_rt_task check is_migration_disabled(next_task) task not running and migration_disabled == 0 find_lock_lowest_rq(next_task, rq); _double_lock_balance(this_rq, busiest); raw_spin_rq_unlock(this_rq); double_rq_lock(this_rq, busiest); <<wait for busiest rq>> <wakeup> task become running migrate_disable(); <context out> deactivate_task(rq, next_task, 0); set_task_cpu(next_task, lowest_rq->cpu); WARN_ON_ONCE(is_migration_disabled(p)); Fixes: 95158a89 ("sched,rt: Use the full cpumask for balancing") Signed-off-by:
Schspa Shi <schspa@gmail.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by:
Steven Rostedt (Google) <rostedt@goodmis.org> Reviewed-by:
Dietmar Eggemann <dietmar.eggemann@arm.com> Reviewed-by:
Valentin Schneider <vschneid@redhat.com> Tested-by:
Dwaine Gonyier <dgonyier@redhat.com>
-
Mathieu Desnoyers authored
Introduce per-mm/cpu current concurrency id (mm_cid) to fix a PostgreSQL sysbench regression reported by Aaron Lu. Keep track of the currently allocated mm_cid for each mm/cpu rather than freeing them immediately on context switch. This eliminates most atomic operations when context switching back and forth between threads belonging to different memory spaces in multi-threaded scenarios (many processes, each with many threads). The per-mm/per-cpu mm_cid values are serialized by their respective runqueue locks. Thread migration is handled by introducing invocation to sched_mm_cid_migrate_to() (with destination runqueue lock held) in activate_task() for migrating tasks. If the destination cpu's mm_cid is unset, and if the source runqueue is not actively using its mm_cid, then the source cpu's mm_cid is moved to the destination cpu on migration. Introduce a task-work executed periodically, similarly to NUMA work, which delays reclaim of cid values when they are unused for a period of time. Keep track of the allocation time for each per-cpu cid, and let the task work clear them when they are observed to be older than SCHED_MM_CID_PERIOD_NS and unused. This task work also clears all mm_cids which are greater or equal to the Hamming weight of the mm cidmask to keep concurrency ids compact. Because we want to ensure the mm_cid converges towards the smaller values as migrations happen, the prior optimization that was done when context switching between threads belonging to the same mm is removed, because it could delay the lazy release of the destination runqueue mm_cid after it has been replaced by a migration. Removing this prior optimization is not an issue performance-wise because the introduced per-mm/per-cpu mm_cid tracking also covers this more specific case. Fixes: af7f588d ("sched: Introduce per-memory-map concurrency ID") Reported-by:
Aaron Lu <aaron.lu@intel.com> Signed-off-by:
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by:
Aaron Lu <aaron.lu@intel.com> Link: https://lore.kernel.org/lkml/20230327080502.GA570847@ziqianlu-desk2/
-
Peter Zijlstra authored
Sync with the urgent patches; in particular: a53ce18c ("sched/fair: Sanitize vruntime of entity being migrated") Signed-off-by:
Peter Zijlstra <peterz@infradead.org>
-
- Apr 20, 2023
-
-
Rafael J. Wysocki authored
* pm-cpufreq: cpufreq: tegra194: add OPP support and set bandwidth cpufreq: qcom-cpufreq-hw: Revert adding cpufreq qos dt-bindings: cpufreq: cpufreq-qcom-hw: Add QCM2290 dt-bindings: cpufreq: cpufreq-qcom-hw: Sanitize data per compatible dt-bindings: cpufreq: cpufreq-qcom-hw: Allow just 1 frequency domain cpufreq: Add SM7225 to cpufreq-dt-platdev blocklist cpufreq: qcom-cpufreq-hw: fix double IO unmap and resource release on exit cpufreq: mediatek: Raise proc and sram max voltage for MT7622/7623 cpufreq: mediatek: raise proc/sram max voltage for MT8516 cpufreq: mediatek: fix KP caused by handler usage after regulator_put/clk_put cpufreq: mediatek: fix passing zero to 'PTR_ERR' cpufreq: Use of_property_present() for testing DT property presence cpufreq: qcom-hw: Simplify counting frequency domains kbuild, cpufreq: remove MODULE_LICENSE in non-modules kbuild, cpufreq: tegra124: remove MODULE_LICENSE in non-modules dt-bindings: cpufreq: qcom-hw: add a compatible for sa8775p
-
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pmRafael J. Wysocki authored
Pull cpufreq/ARM updates for 6.4 from Viresh Kumar: "- Add opp and bandwidth support to tegra194 cpufreq driver (Sumit Gupta). - Use of_property_present() for testing DT property presence (Rob Herring). - Remove MODULE_LICENSE in non-modules (Nick Alcock). - Add SM7225 to cpufreq-dt-platdev blocklist (Luca Weiss). - Optimizations and fixes for qcom-cpufreq-hw driver (Krzysztof Kozlowski, Konrad Dybcio, and Bjorn Andersson). - DT binding updates for qcom-cpufreq-hw driver (Konrad Dybcio and Bartosz Golaszewski). - Updates and fixes for mediatek driver (Jia-Wei Chang and AngeloGioacchino Del Regno)."
-
Rafael J. Wysocki authored
* thermal-intel: thermal: intel: int340x: Add DLVR support for RFIM control
-
Rafael J. Wysocki authored
* acpi-soc: ACPI: LPSS: Add 80862289 ACPI _HID for second PWM controller on Cherry Trail * acpi-bus: ACPI: bus: Ensure that notify handlers are not running after removal ACPI: bus: Add missing braces to acpi_sb_notify()
-
Rafael J. Wysocki authored
* pm-cpufreq: cpufreq: use correct unit when verify cur freq
-
Rafael J. Wysocki authored
* pm-sleep: platform/x86/intel/pmc: core: Report duration of time in HW sleep state platform/x86/intel/pmc: core: Always capture counters on suspend platform/x86/amd: pmc: Report duration of time in hw sleep state PM: Add sysfs files to represent time spent in hardware sleep state
-
Mario Limonciello authored
intel_pmc_core displays a warning when the module parameter `warn_on_s0ix_failures` is set and a suspend didn't get to a HW sleep state. Report this to the standard kernel reporting infrastructure so that userspace software can query after the suspend cycle is done. Reviewed-by:
Hans de Goede <hdegoede@redhat.com> Suggested-by:
Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by:
David E. Box <david.e.box@linux.intel.com> Signed-off-by:
Mario Limonciello <mario.limonciello@amd.com> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Mario Limonciello authored
Currently counters are only captured during suspend when the warn_on_s0ix_failures module parameter is set. In order to relay this counter information to the kernel reporting infrastructure adjust it so that the counters are always captured. warn_on_s0ix_failures will be utilized solely for messaging by the driver instead. Reviewed-by:
Hans de Goede <hdegoede@redhat.com> Reviewed-by:
David E. Box <david.e.box@linux.intel.com> Signed-off-by:
Mario Limonciello <mario.limonciello@amd.com> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Mario Limonciello authored
amd_pmc displays a warning when a suspend didn't get to the deepest state and a dynamic debugging message with the duration if it did. Rather than logging to dynamic debugging the duration spent in the deepest state, report this to the standard kernel reporting infrastructure so that userspace software can query after the suspend cycle is done. Reviewed-by:
Hans de Goede <hdegoede@redhat.com> Signed-off-by:
Mario Limonciello <mario.limonciello@amd.com> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Mario Limonciello authored
Userspace can't easily discover how much of a sleep cycle was spent in a hardware sleep state without using kernel tracing and vendor specific sysfs or debugfs files. To make this information more discoverable, introduce 3 new sysfs files: 1) The time spent in a hw sleep state for last cycle. 2) The time spent in a hw sleep state since the kernel booted 3) The maximum time that the hardware can report for a sleep cycle. All of these files will be present only if the system supports s2idle. Reviewed-by:
Hans de Goede <hdegoede@redhat.com> Signed-off-by:
Mario Limonciello <mario.limonciello@amd.com> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- Apr 18, 2023
-
-
Sanjay Chandrashekara authored
cpufreq_verify_current_freq checks() if the frequency returned by the hardware has a slight delta with the valid frequency value last set and returns "policy->cur" if the delta is within "1 MHz". In the comparison, "policy->cur" is in "kHz" but it's compared against HZ_PER_MHZ. So, the comparison range becomes "1 GHz". Fix this by comparing against KHZ_PER_MHZ instead of HZ_PER_MHZ. Fixes: f55ae08c ("cpufreq: Avoid unnecessary frequency updates due to mismatch") Signed-off-by:
Sanjay Chandrashekara <sanjayc@nvidia.com> [ sumit gupta: Commit message update ] Signed-off-by:
Sumit Gupta <sumitg@nvidia.com> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Hans de Goede authored
On some Cherry Trail devices the second PWM controller uses 80862289 as ACPI _HID, rather then using 80862288 as is done for both controllers on most models. Add the missing 80862289 ACPI _HID, note this uses its own lpss_device_desc, without ".setup = bsw_pwm_setup" so that the pwm_lookup is not added for it. On devices where both controllers use the 80862288 _HID bsw_pwm_setup() does a UID check to avoid registering the lookup for the second controller but that will not work here. Adding the missing id fixes the second PWM controller no longer working after the entire LPSS1 island has been in D3 at least once, which causes the contents of the LPSS private registers to get lost. Adding the _HID makes acpi_lpss restore these when the controller moves from D3 to D0. Reviewed-by:
Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by:
Hans de Goede <hdegoede@redhat.com> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Rafael J. Wysocki authored
Currently, acpi_device_remove_notify_handler() may return while the notify handler being removed is still running which may allow the module holding that handler to be torn down prematurely. Address this issue by making acpi_device_remove_notify_handler() wait for the handling of all the ACPI events in progress to complete before returning. Fixes: 5894b0c4 ("ACPI / scan: Move bus operations and notification routines to bus.c") Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Rafael J. Wysocki authored
As per the kernel coding style. No functional impact. Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Rafael J. Wysocki authored
* pm-opp: OPP: Move required opps configuration to specialized callback OPP: Handle all genpd cases together in _set_required_opps() opp: Use of_property_present() for testing DT property presence
-
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pmRafael J. Wysocki authored
Pull OPP updates for 6.4 from Viresh Kumar: "- Use of_property_present() for testing DT property presence (Rob Herring). - Add set_required_opps() callback to the 'struct opp_table', to make the code paths cleaner (Viresh Kumar)." * tag 'opp-updates-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: OPP: Move required opps configuration to specialized callback OPP: Handle all genpd cases together in _set_required_opps() opp: Use of_property_present() for testing DT property presence
-
Srinivas Pandruvada authored
Add support for DLVR (Digital Linear Voltage Regulator) attributes, which can be used to control RFIM. Here instead of "fivr" another directory "dlvr" is created with DLVR attributes: /sys/bus/pci/devices/0000:00:04.0/dlvr ├── dlvr_freq_mhz ├── dlvr_freq_select ├── dlvr_hardware_rev ├── dlvr_pll_busy ├── dlvr_rfim_enable └── dlvr_spread_spectrum_pct └── dlvr_control_mode └── dlvr_control_lock Attributes dlvr_freq_mhz (RO): Current DLVR PLL frequency in MHz. dlvr_freq_select (RW): Sets DLVR PLL clock frequency. dlvr_hardware_rev (RO): DLVR hardware revision. dlvr_pll_busy (RO): PLL can't accept frequency change when set. dlvr_rfim_enable (RW): 0: Disable RF frequency hopping, 1: Enable RF frequency hopping. dlvr_control_mode (RW): Specifies how frequencies are spread. 0: Down spread, 1: Spread in Center. dlvr_control_lock (RW): 1: future writes are ignored. dlvr_spread_spectrum_pct (RW) A write to this register updates the DLVR spread spectrum percent value. Signed-off-by:
Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw: Subject edits ] Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Sumit Gupta authored
Add support to use OPP table from DT in Tegra194 cpufreq driver. Tegra SoC's receive the frequency lookup table (LUT) from BPMP-FW. Cross check the OPP's present in DT against the LUT from BPMP-FW and enable only those DT OPP's which are present in LUT also. The OPP table in DT has CPU Frequency to bandwidth mapping where the bandwidth value is per MC channel. DRAM bandwidth depends on the number of MC channels which can vary as per the boot configuration. This per channel bandwidth from OPP table will be later converted by MC driver to final bandwidth value by multiplying with number of channels before sending the request to BPMP-FW. If OPP table is not present in DT, then use the LUT from BPMP-FW directy as the CPU frequency table and not do the DRAM frequency scaling which is same as the current behavior. Now, as the CPU Frequency table is being controlling through OPP table in DT. Keeping fewer entries in the table will create less frequency steps and can help to scale fast to high frequencies when required. Signed-off-by:
Sumit Gupta <sumitg@nvidia.com> Signed-off-by:
Viresh Kumar <viresh.kumar@linaro.org>
-
- Apr 17, 2023
-
-
Rafael J. Wysocki authored
* pm-core: PM: core: Remove unnecessary (void *) conversions * pm-devfreq: PM / devfreq: exynos-ppmu: Use devm_platform_get_and_ioremap_resource() PM / devfreq: exynos: Use of_property_present() for testing DT property presence PM / devfreq: exyos-bus: drop of_match_ptr for ID table PM / devfreq: Remove "select SRCU" * pm-tools: PM: tools: sleepgraph: Recognize "CPU killed" messages pm-graph: Update to v5.11
-
Rafael J. Wysocki authored
* pm-cpufreq: cpufreq: amd-pstate: Make varaiable mode_state_machine static cpufreq: drivers with target_index() must set freq_table cpufreq: pmac32: Use of_property_read_bool() for boolean properties cpufreq: Fix typo in the ARM_BRCMSTB_AVS_CPUFREQ Kconfig entry cpufreq: warn about invalid vals to scaling_max/min_freq interfaces Documentation: cpufreq: amd-pstate: Update amd_pstate status sysfs for guided cpufreq: amd-pstate: Add guided mode control support via sysfs cpufreq: amd-pstate: Add guided autonomous mode Documentation: cpufreq: amd-pstate: Move amd_pstate param to alphabetical order ACPI: CPPC: Add auto select register read/write support ACPI: CPPC: Add min and max perf register writing support cpufreq: intel_pstate: Enable HWP IO boost for all servers * pm-cpuidle: cpuidle: Use of_property_present() for testing DT property presence
-
Rafael J. Wysocki authored
* acpi-video: ACPI: video: Remove desktops without backlight DMI quirks ACPI: video: Remove register_backlight_delay module option and code * acpi-misc: ACPI: Replace irqdomain.h include with struct declarations fpga: lattice-sysconfig-spi: Add explicit include for of.h tpm: atmel: Add explicit include for of.h virtio-mmio: Add explicit include for of.h pata: ixp4xx: Add explicit include for of.h ata: pata_macio: Add explicit include of irqdomain.h serial: 8250_tegra: Add explicit include for of.h net: rfkill-gpio: Add explicit include for of.h staging: iio: resolver: ad2s1210: Add explicit include for of.h iio: adc: ad7292: Add explicit include for of.h * acpi-utils: ACPI: utils: Fix acpi_evaluate_dsm_typed() redefinition error * acpi-docs: ACPI: docs: Update the pm_profile sysfs attribute documentation
-
Rafael J. Wysocki authored
* acpi-apei: ACPI: APEI: EINJ: warn on invalid argument when explicitly indicated by platform ACPI: APEI: EINJ: Add CXL error types * acpi-properties: ACPI: property: Refactor acpi_data_prop_read_single() * acpi-sbs: ACPI: SBS: Fix handling of Smart Battery Selectors ACPI: EC: Fix oops when removing custom query handlers ACPI: EC: Limit explicit removal of query handlers to custom query handlers * acpi-thermal: ACPI: thermal: Replace ternary operator with min_t()
-
Rafael J. Wysocki authored
* acpi-processor: ACPI: processor: Fix evaluating _PDC method when running as Xen dom0 ACPI: cpufreq: Use platform devices to load ACPI PPC and PCC drivers ACPI: processor: Check for null return of devm_kzalloc() in fch_misc_setup() * acpi-pm: ACPI: s2idle: Log when enabling wakeup IRQ fails * acpi-tables: ACPI: VIOT: Initialize the correct IOMMU fwspec ACPI: SPCR: Amend indentation ACPI: SPCR: Prefix error messages with FW_BUG * acpi-sysfs: ACPI: sysfs: Enable ACPI sysfs support for CCEL records
-
Rafael J. Wysocki authored
* acpica: (32 commits) ACPICA: Update version to 20230331 ACPICA: add os specific support for Zephyr RTOS ACPICA: ACPICA: check null return of ACPI_ALLOCATE_ZEROED in acpi_db_display_objects ACPICA: acpi_resource_irq: Replace 1-element arrays with flexible array ACPICA: acpi_madt_oem_data: Fix flexible array member definition ACPICA: acpi_dmar_andd: Replace 1-element array with flexible array ACPICA: acpi_pci_routing_table: Replace fixed-size array with flex array member ACPICA: struct acpi_resource_dma: Replace 1-element array with flexible array ACPICA: Introduce ACPI_FLEX_ARRAY ACPICA: struct acpi_nfit_interleave: Replace 1-element array with flexible array ACPICA: actbl2: Replace 1-element arrays with flexible arrays ACPICA: actbl1: Replace 1-element arrays with flexible arrays ACPICA: struct acpi_resource_vendor: Replace 1-element array with flexible array ACPICA: Avoid undefined behavior: load of misaligned address ACPICA: Avoid undefined behavior: member access within misaligned address ACPICA: Avoid undefined behavior: member access within misaligned address ACPICA: Avoid undefined behavior: member access within misaligned address ACPICA: Avoid undefined behavior: member access within misaligned address ACPICA: Avoid undefined behavior: member access within null pointer ACPICA: Avoid undefined behavior: applying zero offset to null pointer ...
-