We don't want the phone to be unusably hot (and breaking the first law of thermodynamics is not allowed) so this issue is to track the efforts to try and control the heat produced by the phone.
So the goal is that the user can stress all CPUs, and we still put them into sleep states when the temperature rises above a given threshold?
@guido.gunther @angus.ainslie in case you already have experience with this, I'd be happy for hints before I look into it, even if it's only which parts of the stack are involved. Thanks.
So currently open review question: why don't we have the cpufreq cooling device on 5.2? That should be fixed anyways, even if it won't have much impact.
1st Test without these changes: Run stress -c 4 and verify that the CPUs heat up to 90°C and the system shuts down. That's the easy part.
Test with 60 °C target temperature, in order to hit it earlier than 80, for testing:
--- a/arch/arm64/boot/dts/freescale/imx8mq.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx8mq.dtsi
@@ -221,7 +221,7 @@
 			trips {
 				cpu_alert: cpu-alert {
-					temperature = <80000>;
+					temperature = <60000>;
 					hysteresis = <2000>;
 					type = "passive";
 				};
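To check that a devicetree change like this actually reached the running kernel, the trip points can be read back from sysfs. This is a minimal sketch, assuming the kernel exposes the usual `trip_point_N_temp`, `trip_point_N_type` and `trip_point_N_hyst` files in the zone directory; the helper name `read_trip_points` is made up for illustration, and which zone is the CPU zone on the devkit is an assumption.

```python
from pathlib import Path


def read_trip_points(zone_dir):
    """Return (type, temp_celsius, hyst_celsius) for each trip point of a zone.

    zone_dir is a thermal zone directory such as
    /sys/class/thermal/thermal_zone0 (assumed to be the CPU zone).
    """
    zone = Path(zone_dir)
    trips = []
    for temp_file in sorted(zone.glob("trip_point_*_temp")):
        # File names look like trip_point_<index>_temp.
        index = temp_file.name.split("_")[2]
        trip_type = (zone / f"trip_point_{index}_type").read_text().strip()
        # sysfs reports millidegrees Celsius, like the devicetree values.
        temp_mc = int(temp_file.read_text())
        hyst_file = zone / f"trip_point_{index}_hyst"
        hyst_mc = int(hyst_file.read_text()) if hyst_file.exists() else 0
        trips.append((trip_type, temp_mc / 1000, hyst_mc / 1000))
    return trips
```

With the patch applied, `read_trip_points("/sys/class/thermal/thermal_zone0")` should include a passive trip at 60.0 with a hysteresis of 2.0, if that zone is indeed the CPU zone.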
Note: "cooling_device0" can be identified by its type file, which reads thermal-idle-0.
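Since the cooling device numbering may differ between kernels, it can be safer to look the device up by its type rather than rely on the index. A small sketch, assuming the standard `type`, `cur_state` and `max_state` files under `/sys/class/thermal/cooling_device*`; the function names are hypothetical.

```python
from pathlib import Path


def find_cooling_device(type_name, base="/sys/class/thermal"):
    """Return the first cooling_device* directory whose type matches, or None."""
    for dev in sorted(Path(base).glob("cooling_device*")):
        type_file = dev / "type"
        if type_file.is_file() and type_file.read_text().strip() == type_name:
            return dev
    return None


def read_state(dev):
    """Return (cur_state, max_state) of a cooling device directory."""
    cur = int((dev / "cur_state").read_text())
    max_state = int((dev / "max_state").read_text())
    return cur, max_state
```

For example, `find_cooling_device("thermal-idle-0")` would return the directory of the idle-injection device if one is registered, and `read_state` on that directory shows how hard it is currently throttling.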
Result:
Behaviour is as expected at my place (without a fan). 60 °C is hard to hold.
Comparison result:
The same code, but with the unmodified devicetree description (80°C target temperature):
So 80°C is of course easier to hold than 60°C. Remember, stress -c 4 is running throughout these tests.
TL;DR: thermal-idle works as an out-of-tree solution and is indeed effective in cooling under load. It should be possible to hold 80°C under a "load average" of 4, i.e. full CPU load.
@martin.kepplinger during these tests that you used to produce these plots, what was the ambient temperature? Did you have your heatsink facing down towards a table or pointing upwards towards the ceiling/sky or at some angle?
@martin.kepplinger when you run these sorts of thermal tests please keep the heatsink facing down towards your table with the display facing up. Since the phone is going to have the SoC completely enclosed (with some heatsinking inside), there won't be convection with the environment like you can get with the heatsink fins on the dev kit. The best way to reproduce such a scenario with the dev kit is by making the heatsink point downward (it will get hotter because of this).
Also, please record your ambient temperature when you run thermal tests.
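To keep the test conditions together with the measurements, a run could be logged to a CSV file that carries the ambient temperature and the devkit orientation in every row. This is only a sketch: the function names and column names are invented here, the ambient value is entered by hand, and the zone path passed in is assumed to be the SoC zone.

```python
import csv
import time
from pathlib import Path


def read_temp_celsius(zone_dir):
    """Read a thermal zone's temperature; sysfs reports millidegrees Celsius."""
    return int((Path(zone_dir) / "temp").read_text()) / 1000


def log_run(zone_dir, out_csv, ambient_celsius, orientation,
            samples, interval_s=1.0):
    """Sample the zone temperature and record the test conditions per row."""
    with open(out_csv, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["elapsed_s", "soc_temp_c", "ambient_c", "orientation"])
        start = time.monotonic()
        for i in range(samples):
            writer.writerow([
                round(time.monotonic() - start, 1),
                read_temp_celsius(zone_dir),
                ambient_celsius,
                orientation,
            ])
            if i < samples - 1:
                time.sleep(interval_s)
```

A call such as `log_run("/sys/class/thermal/thermal_zone0", "run.csv", 23.5, "heatsink-down", samples=600)` would then run alongside `stress -c 4`, and the resulting file can be plotted and compared across runs.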
Thanks for the tip, Eric. The purpose of this test was to show that the driver works; at least for that it's been good enough.
I did a test run with the current imx8-linux-next-dev branch based on 5.3-rc2: Ambient temp: 23.5 °C. Devkit with the display facing up. stress -c 4 heats it up to 80°C. The thermal driver kicks in and does idle-injection on the CPU. The temperature never goes beyond 82 °C.
Note: This is in no way a situation we want the final phone to be in! The CPU makes up for everything else pulling way too much power (see the open issues), and gets up to 99% slower in the process. It only shows that the thermal driver works. This test (stress -c 4) currently makes the phone (lockscreen) unusably slow. A few minutes after stopping "stress -c 4", the CPU still hasn't been able to cool down and stays at 80°C with 99% idle-injection. The throttling itself is important, but right now it results in a really bad user experience.