mail archive of the barebox mailing list
 help / color / mirror / Atom feed
* [PATCH v2 1/3] ARM: optee-early: drop superfluous sync_caches_for_execution
@ 2025-06-10 20:13 Ahmad Fatoum
  2025-06-10 20:13 ` [PATCH v2 2/3] ARM: Cortex-A9: invalidate caches in early lowlevel init Ahmad Fatoum
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Ahmad Fatoum @ 2025-06-10 20:13 UTC (permalink / raw)
  To: barebox; +Cc: Rouven Czerwinski, lst, fpg, Fabian Pflug, Ahmad Fatoum

We expect start_optee_early() to be called with caches disabled, so there
is no point in calling sync_caches_for_execution inside.

Therefore let's verify the assumption that caches are indeed disabled
and drop the useless call.

Co-authored-by: Fabian Pflug <f.pflug@pengutronix.de>
Signed-off-by: Fabian Pflug <f.pflug@pengutronix.de>
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
v1 -> v2:
  - panic instead of retuning error code (Rouven)
---
 arch/arm/lib32/optee-early.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/arm/lib32/optee-early.c b/arch/arm/lib32/optee-early.c
index 735d829c99fb..c9959b5e41b8 100644
--- a/arch/arm/lib32/optee-early.c
+++ b/arch/arm/lib32/optee-early.c
@@ -7,7 +7,9 @@
  */
 #include <asm/cache.h>
 #include <asm/setjmp.h>
+#include <asm/system.h>
 #include <tee/optee.h>
+#include <linux/bug.h>
 #include <debug_ll.h>
 #include <string.h>
 
@@ -19,6 +21,9 @@ int start_optee_early(void *fdt, void *tee)
 	struct optee_header *hdr;
 	int ret;
 
+	/* We expect this function to be called with data caches disabled */
+	BUG_ON(get_cr() & CR_C);
+
 	hdr = tee;
 	ret = optee_verify_header(hdr);
 	if (ret < 0)
@@ -30,7 +35,6 @@ int start_optee_early(void *fdt, void *tee)
 	/* We use setjmp/longjmp here because OP-TEE clobbers most registers */
 	ret = setjmp(tee_buf);
 	if (ret == 0) {
-		sync_caches_for_execution();
 		tee_start(0, 0, fdt);
 		longjmp(tee_buf, 1);
 	}
-- 
2.39.5




^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v2 2/3] ARM: Cortex-A9: invalidate caches in early lowlevel init
  2025-06-10 20:13 [PATCH v2 1/3] ARM: optee-early: drop superfluous sync_caches_for_execution Ahmad Fatoum
@ 2025-06-10 20:13 ` Ahmad Fatoum
  2025-06-10 20:13 ` [PATCH v2 3/3] ARM: i.MX6Q: drop duplicate arm_early_mmu_cache_invalidate Ahmad Fatoum
  2025-06-11  7:10 ` [PATCH v2 1/3] ARM: optee-early: drop superfluous sync_caches_for_execution Sascha Hauer
  2 siblings, 0 replies; 4+ messages in thread
From: Ahmad Fatoum @ 2025-06-10 20:13 UTC (permalink / raw)
  To: barebox; +Cc: Rouven Czerwinski, lst, fpg, Ahmad Fatoum

The optee-early code was initially added for i.MX6UL. Trying to naively
enable it on an i.MX6Q boards was observed to cause spurious hangs on
return from OP-TEE to barebox.

Quoting Lucas[1]:

  The real issue with the Cortex A9 caches is that the tags aren't
  cleared on power-up, so some sets/ways may end up in "valid" state if
  not explicitly invalidated. Thus any write to memory may get stuck in
  the cache, even if caching is disabled, as this knob only turns off
  allocation in the cache, but doesn't prevent updates of such bogus
  valid lines. If you then proceed to invalidate the cache, you may
  discard data that has not yet reached DRAM.

This issue did likely not affect the original i.MX6UL, Quoting Lucas
again[2]:

  > How do we know we only need this for Cortex-A9 though?
  > Couldn't e.g. the Cortex-A8 also be affected?

  We can't be 100% sure without specific knowledge about each SoC
  integration. Both the Cortex A8 [1] and Cortex A15 [2] TRMs define a
  reset sequence that mandates the straps to be set in such a way that
  the processor will clear all L1 and L2 memory arrays on power-on reset.

  The only odd one where the TRM doesn't even mention memory arrays in
  the reset sequence is the Cortex A9 [3], which pretty much lines up
  with the number of SoCs where we have seen issues due to uninitialized
  cache content.

Therefore, let's call arm_early_mmu_cache_invalidate() very early in the
low level init. We don't have a common Cortex-A9 init and the locations
touched here were determined by a grep for the Cortex-A9 errata that are
already being worked around by imx6_cpu_lowlevel_init().

[1]: https://lore.barebox.org/barebox/569963942cf35755dfdf34b240c350986fda4727.camel@pengutronix.de/
[2]: https://lore.barebox.org/barebox/d6a0be9631286122e56fdfd87b9911c310554baf.camel@pengutronix.de/

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
v1 -> v2:
  - replace too late invalidation in start_optee_early with very early
    invalidation in the lowlevel init.
---
 arch/arm/mach-imx/cpu_init.c                 | 2 ++
 arch/arm/mach-socfpga/cpu_init.c             | 2 ++
 arch/arm/mach-tegra/tegra_maincomplex_init.c | 2 ++
 arch/arm/mach-zynq/cpu_init.c                | 2 ++
 4 files changed, 8 insertions(+)

diff --git a/arch/arm/mach-imx/cpu_init.c b/arch/arm/mach-imx/cpu_init.c
index aebbd3defaec..e9f42945528e 100644
--- a/arch/arm/mach-imx/cpu_init.c
+++ b/arch/arm/mach-imx/cpu_init.c
@@ -14,6 +14,7 @@
 #include <mach/imx/imx9-regs.h>
 #include <mach/imx/trdc.h>
 #include <io.h>
+#include <asm/cache.h>
 #include <asm/syscounter.h>
 #include <asm/system.h>
 
@@ -36,6 +37,7 @@ void imx6_cpu_lowlevel_init(void)
 {
 	arm_cpu_lowlevel_init();
 
+	arm_early_mmu_cache_invalidate();
 	enable_arm_errata_742230_war();
 	enable_arm_errata_743622_war();
 	enable_arm_errata_751472_war();
diff --git a/arch/arm/mach-socfpga/cpu_init.c b/arch/arm/mach-socfpga/cpu_init.c
index 73b69c34c56f..f10cd468da96 100644
--- a/arch/arm/mach-socfpga/cpu_init.c
+++ b/arch/arm/mach-socfpga/cpu_init.c
@@ -2,11 +2,13 @@
 
 #include <common.h>
 #include <asm/barebox-arm-head.h>
+#include <asm/cache.h>
 #include <asm/errata.h>
 #include <mach/socfpga/init.h>
 
 void arria10_cpu_lowlevel_init(void)
 {
+	arm_early_mmu_cache_invalidate();
 	enable_arm_errata_794072_war();
 	enable_arm_errata_845369_war();
 }
diff --git a/arch/arm/mach-tegra/tegra_maincomplex_init.c b/arch/arm/mach-tegra/tegra_maincomplex_init.c
index 2a2272a99fbc..e4cc3e780cbe 100644
--- a/arch/arm/mach-tegra/tegra_maincomplex_init.c
+++ b/arch/arm/mach-tegra/tegra_maincomplex_init.c
@@ -19,6 +19,7 @@
 #include <asm/barebox-arm-head.h>
 #include <asm/barebox-arm.h>
 #include <asm/errata.h>
+#include <asm/cache.h>
 #include <mach/tegra/lowlevel.h>
 #include <mach/tegra/tegra20-pmc.h>
 #include <mach/tegra/tegra20-car.h>
@@ -30,6 +31,7 @@ void tegra_maincomplex_entry(char *fdt)
 	u32 reg = 0;
 
 	arm_cpu_lowlevel_init();
+	arm_early_mmu_cache_invalidate();
 
 	chiptype = tegra_get_chiptype();
 
diff --git a/arch/arm/mach-zynq/cpu_init.c b/arch/arm/mach-zynq/cpu_init.c
index cc7b8d1142a9..f26e2947fd6a 100644
--- a/arch/arm/mach-zynq/cpu_init.c
+++ b/arch/arm/mach-zynq/cpu_init.c
@@ -2,6 +2,7 @@
 
 #include <common.h>
 #include <asm/barebox-arm-head.h>
+#include <asm/cache.h>
 #include <asm/errata.h>
 #include <mach/zynq/init.h>
 
@@ -9,6 +10,7 @@ void zynq_cpu_lowlevel_init(void)
 {
 	arm_cpu_lowlevel_init();
 
+	arm_early_mmu_cache_invalidate();
 	enable_arm_errata_761320_war();
 	enable_arm_errata_794072_war();
 	enable_arm_errata_845369_war();
-- 
2.39.5




^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v2 3/3] ARM: i.MX6Q: drop duplicate arm_early_mmu_cache_invalidate
  2025-06-10 20:13 [PATCH v2 1/3] ARM: optee-early: drop superfluous sync_caches_for_execution Ahmad Fatoum
  2025-06-10 20:13 ` [PATCH v2 2/3] ARM: Cortex-A9: invalidate caches in early lowlevel init Ahmad Fatoum
@ 2025-06-10 20:13 ` Ahmad Fatoum
  2025-06-11  7:10 ` [PATCH v2 1/3] ARM: optee-early: drop superfluous sync_caches_for_execution Sascha Hauer
  2 siblings, 0 replies; 4+ messages in thread
From: Ahmad Fatoum @ 2025-06-10 20:13 UTC (permalink / raw)
  To: barebox; +Cc: Rouven Czerwinski, lst, fpg, Ahmad Fatoum

Now that we call arm_early_mmu_cache_invalidate() in
imx6_cpu_lowlevel_init(), it's fine to drop it from the entry points
that already call the latter function.

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
v1 -> v2:
  - new patch
---
 arch/arm/boards/freescale-mx6-sabrelite/lowlevel.c | 5 -----
 arch/arm/boards/guf-santaro/lowlevel.c             | 3 ---
 arch/arm/boards/tqma6x/lowlevel.c                  | 5 -----
 3 files changed, 13 deletions(-)

diff --git a/arch/arm/boards/freescale-mx6-sabrelite/lowlevel.c b/arch/arm/boards/freescale-mx6-sabrelite/lowlevel.c
index f1ed31ccad6e..856f050c0888 100644
--- a/arch/arm/boards/freescale-mx6-sabrelite/lowlevel.c
+++ b/arch/arm/boards/freescale-mx6-sabrelite/lowlevel.c
@@ -9,7 +9,6 @@
 #include <io.h>
 #include <mach/imx/debug_ll.h>
 #include <mach/imx/esdctl.h>
-#include <asm/cache.h>
 
 extern char __dtb_imx6q_sabrelite_start[];
 
@@ -33,8 +32,6 @@ ENTRY_FUNCTION(start_imx6q_sabrelite, r0, r1, r2)
 {
 	imx6_cpu_lowlevel_init();
 
-	arm_early_mmu_cache_invalidate();
-
 	relocate_to_current_adr();
 	setup_c();
 	barrier();
@@ -64,8 +61,6 @@ ENTRY_FUNCTION(start_imx6dl_sabrelite, r0, r1, r2)
 {
 	imx6_cpu_lowlevel_init();
 
-	arm_early_mmu_cache_invalidate();
-
 	relocate_to_current_adr();
 	setup_c();
 	barrier();
diff --git a/arch/arm/boards/guf-santaro/lowlevel.c b/arch/arm/boards/guf-santaro/lowlevel.c
index 72401eb32c69..870af0643401 100644
--- a/arch/arm/boards/guf-santaro/lowlevel.c
+++ b/arch/arm/boards/guf-santaro/lowlevel.c
@@ -5,7 +5,6 @@
 #include <io.h>
 #include <asm/barebox-arm-head.h>
 #include <asm/barebox-arm.h>
-#include <asm/cache.h>
 #include <mach/imx/generic.h>
 #include <mach/imx/imx6-regs.h>
 #include <debug_ll.h>
@@ -40,8 +39,6 @@ ENTRY_FUNCTION(start_imx6q_guf_santaro, r0, r1, r2)
 
 	arm_setup_stack(0x00920000);
 
-	arm_early_mmu_cache_invalidate();
-
 	setup_uart();
 
 	relocate_to_current_adr();
diff --git a/arch/arm/boards/tqma6x/lowlevel.c b/arch/arm/boards/tqma6x/lowlevel.c
index 6e9c9bed0bf8..38d3f3fec0df 100644
--- a/arch/arm/boards/tqma6x/lowlevel.c
+++ b/arch/arm/boards/tqma6x/lowlevel.c
@@ -9,7 +9,6 @@
 #include <asm/barebox-arm-head.h>
 #include <asm/barebox-arm.h>
 #include <asm/sections.h>
-#include <asm/cache.h>
 #include <asm/mmu.h>
 #include <mach/imx/imx6.h>
 
@@ -28,8 +27,6 @@ ENTRY_FUNCTION_WITHSTACK(start_imx6q_mba6x, 0x00920000, r0, r1, r2)
 		putc_ll('a');
 	}
 
-	arm_early_mmu_cache_invalidate();
-
 	fdt = __dtb_imx6q_mba6x_start + get_runtime_offset();
 
 	barebox_arm_entry(0x10000000, SZ_1G, fdt);
@@ -47,8 +44,6 @@ ENTRY_FUNCTION_WITHSTACK(start_imx6dl_mba6x, 0x00920000, r0, r1, r2)
 		putc_ll('a');
 	}
 
-	arm_early_mmu_cache_invalidate();
-
 	fdt = __dtb_imx6dl_mba6x_start + get_runtime_offset();
 
 	barebox_arm_entry(0x10000000, SZ_512M, fdt);
-- 
2.39.5




^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2 1/3] ARM: optee-early: drop superfluous sync_caches_for_execution
  2025-06-10 20:13 [PATCH v2 1/3] ARM: optee-early: drop superfluous sync_caches_for_execution Ahmad Fatoum
  2025-06-10 20:13 ` [PATCH v2 2/3] ARM: Cortex-A9: invalidate caches in early lowlevel init Ahmad Fatoum
  2025-06-10 20:13 ` [PATCH v2 3/3] ARM: i.MX6Q: drop duplicate arm_early_mmu_cache_invalidate Ahmad Fatoum
@ 2025-06-11  7:10 ` Sascha Hauer
  2 siblings, 0 replies; 4+ messages in thread
From: Sascha Hauer @ 2025-06-11  7:10 UTC (permalink / raw)
  To: barebox, Ahmad Fatoum; +Cc: Rouven Czerwinski, lst, fpg, Fabian Pflug


On Tue, 10 Jun 2025 22:13:18 +0200, Ahmad Fatoum wrote:
> We expect start_optee_early() to be called with caches disabled, so there
> is no point in calling sync_caches_for_execution inside.
> 
> Therefore let's verify the assumption that caches are indeed disabled
> and drop the useless call.
> 
> 
> [...]

Applied, thanks!

[1/3] ARM: optee-early: drop superfluous sync_caches_for_execution
      https://git.pengutronix.de/cgit/barebox/commit/?id=02f4e6b02730 (link may not be stable)
[2/3] ARM: Cortex-A9: invalidate caches in early lowlevel init
      https://git.pengutronix.de/cgit/barebox/commit/?id=2754c3aba9d1 (link may not be stable)
[3/3] ARM: i.MX6Q: drop duplicate arm_early_mmu_cache_invalidate
      https://git.pengutronix.de/cgit/barebox/commit/?id=6db7f695218f (link may not be stable)

Best regards,
-- 
Sascha Hauer <s.hauer@pengutronix.de>




^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2025-06-11  7:10 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-06-10 20:13 [PATCH v2 1/3] ARM: optee-early: drop superfluous sync_caches_for_execution Ahmad Fatoum
2025-06-10 20:13 ` [PATCH v2 2/3] ARM: Cortex-A9: invalidate caches in early lowlevel init Ahmad Fatoum
2025-06-10 20:13 ` [PATCH v2 3/3] ARM: i.MX6Q: drop duplicate arm_early_mmu_cache_invalidate Ahmad Fatoum
2025-06-11  7:10 ` [PATCH v2 1/3] ARM: optee-early: drop superfluous sync_caches_for_execution Sascha Hauer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox