mail archive of the barebox mailing list
* [PATCH v2 00/34] ARM: MMU rework
@ 2023-05-17  9:03 Sascha Hauer
  2023-05-17  9:03 ` [PATCH v2 01/34] ARM: remove unused membase argument Sascha Hauer
                   ` (33 more replies)
  0 siblings, 34 replies; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

The goal of this series is to map the SDRAM used for OP-TEE as
non-executable, because otherwise the instruction prefetcher might
speculate into the OP-TEE area. This is currently not possible because
we use 1MiB (AArch32) or 1GiB (AArch64) sections, which are too coarse
for that.
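
To put a number on "too coarse", here is a stand-alone sketch; the sizes
and the end-of-SDRAM address are assumptions for illustration, not values
from any particular board:

#include <stdio.h>

#define SZ_4K		0x00001000UL
#define SZ_1M		0x00100000UL
#define SZ_1G		0x40000000UL
#define OPTEE_SIZE	(32 * SZ_1M)	/* assumed OP-TEE carve-out */

int main(void)
{
	unsigned long endmem = 0xc0000000UL;	/* assumed end of SDRAM */
	unsigned long optee  = endmem - OPTEE_SIZE;
	unsigned long block  = optee & ~(SZ_1G - 1);

	/* A single 1GiB block covering the OP-TEE base also covers this
	 * much normal memory, so it cannot be made non-executable alone. */
	printf("collateral with 1GiB sections: %lu MiB\n",
	       (optee - block) / SZ_1M);

	/* With 4KiB second-level pages the region maps exactly. */
	printf("4KiB pages needed for OP-TEE:  %lu\n", OPTEE_SIZE / SZ_4K);

	return 0;
}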

With this series we start using two-level page tables in the early MMU
setup as well.

Overall the MMU code is more consolidated now; we no longer
differentiate between early and non-early MMU setup. Consequently, the
CONFIG_MMU_EARLY option is gone and early MMU setup is always done when
the MMU is enabled.

One nice side effect of this series is that the Rockchip RK3568 boards
now start about a second faster. On these boards the early MMU setup was
previously skipped because the memory start was not sufficiently
aligned.

Changes since v1:
- integrate review feedback from Ahmad
- rework arm_mem_* functions

Sascha Hauer (34):
  ARM: remove unused membase argument
  ARM: remove unused define
  ARM: rename __arm_mem_scratch to arm_mem_scratch
  ARM: put scratch mem area below OP-TEE
  ARM: add arm_mem_optee()
  ARM: make arm_mem_scratch() a static inline function
  ARM: define stack base consistently
  ARM: move arm_mem_scratch_get() lower for consistency
  ARM: drop cache function initialization
  ARM: Add _32 suffix to aarch32 specific filenames
  ARM: cpu.c: remove unused include
  ARM: mmu-common.c: use common mmu include
  ARM: mmu32: rename mmu.h to mmu_32.h
  ARM: mmu: implement MAP_FAULT
  ARM: mmu64: Use arch_remap_range where possible
  ARM: mmu32: implement zero_page_*()
  ARM: i.MX: Drop HAB workaround
  ARM: Move early MMU after malloc initialization
  ARM: mmu: move dma_sync_single_for_device to extra file
  ARM: mmu: merge mmu-early_xx.c into mmu_xx.c
  ARM: mmu: alloc 64k for early page tables
  ARM: mmu32: create alloc_pte()
  ARM: mmu64: create alloc_pte()
  ARM: mmu: drop ttb argument
  ARM: mmu: always do MMU initialization early when MMU is enabled
  ARM: mmu32: Assume MMU is on
  ARM: mmu32: Fix pmd_flags_to_pte() for ARMv4/5/6
  ARM: mmu32: Add pte_flags_to_pmd()
  ARM: mmu32: add get_pte_flags, get_pmd_flags
  ARM: mmu32: move functions into c file
  ARM: mmu32: read TTB value from register
  ARM: mmu32: Use pages for early MMU setup
  ARM: mmu32: Skip reserved ranges during initialization
  ARM: mmu64: Use two level pagetables in early code

 arch/arm/Makefile                             |   5 +-
 arch/arm/boards/raspberry-pi/lowlevel.c       |   2 +-
 arch/arm/cpu/Kconfig                          |   3 +-
 arch/arm/cpu/Makefile                         |  21 +-
 arch/arm/cpu/{cache.c => cache_32.c}          |  85 +++--
 arch/arm/cpu/cache_64.c                       |   5 -
 arch/arm/cpu/cpu.c                            |   2 -
 arch/arm/cpu/dma_32.c                         |  20 ++
 arch/arm/cpu/dma_64.c                         |  16 +
 arch/arm/cpu/entry.c                          |   2 +-
 arch/arm/cpu/{entry_ll.S => entry_ll_32.S}    |   0
 .../arm/cpu/{exceptions.S => exceptions_32.S} |   0
 .../arm/cpu/{interrupts.c => interrupts_32.c} |   0
 arch/arm/cpu/{lowlevel.S => lowlevel_32.S}    |   0
 arch/arm/cpu/mmu-common.c                     |  13 +-
 arch/arm/cpu/mmu-early.c                      |  71 -----
 arch/arm/cpu/mmu-early_64.c                   |  93 ------
 arch/arm/cpu/{mmu.c => mmu_32.c}              | 298 +++++++++++-------
 arch/arm/cpu/{mmu.h => mmu_32.h}              |  20 --
 arch/arm/cpu/mmu_64.c                         | 109 ++++---
 arch/arm/cpu/{setupc.S => setupc_32.S}        |   0
 arch/arm/cpu/sm.c                             |   3 +-
 .../arm/cpu/{smccc-call.S => smccc-call_32.S} |   0
 arch/arm/cpu/start.c                          |  21 +-
 arch/arm/cpu/uncompress.c                     |  11 +-
 arch/arm/include/asm/barebox-arm.h            |  60 ++--
 arch/arm/include/asm/cache.h                  |   2 -
 arch/arm/include/asm/mmu.h                    |   3 +-
 arch/arm/mach-imx/atf.c                       |  12 +-
 arch/arm/mach-imx/xload-common.c              |   2 +-
 common/Kconfig                                |   9 -
 drivers/hab/habv4.c                           |  10 +-
 include/mach/rockchip/bootrom.h               |   2 +-
 include/mmu.h                                 |   1 +
 34 files changed, 414 insertions(+), 487 deletions(-)
 rename arch/arm/cpu/{cache.c => cache_32.c} (89%)
 create mode 100644 arch/arm/cpu/dma_32.c
 create mode 100644 arch/arm/cpu/dma_64.c
 rename arch/arm/cpu/{entry_ll.S => entry_ll_32.S} (100%)
 rename arch/arm/cpu/{exceptions.S => exceptions_32.S} (100%)
 rename arch/arm/cpu/{interrupts.c => interrupts_32.c} (100%)
 rename arch/arm/cpu/{lowlevel.S => lowlevel_32.S} (100%)
 delete mode 100644 arch/arm/cpu/mmu-early.c
 delete mode 100644 arch/arm/cpu/mmu-early_64.c
 rename arch/arm/cpu/{mmu.c => mmu_32.c} (67%)
 rename arch/arm/cpu/{mmu.h => mmu_32.h} (75%)
 rename arch/arm/cpu/{setupc.S => setupc_32.S} (100%)
 rename arch/arm/cpu/{smccc-call.S => smccc-call_32.S} (100%)

-- 
2.39.2

* [PATCH v2 01/34] ARM: remove unused membase argument
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17 12:45   ` Ahmad Fatoum
  2023-05-17  9:03 ` [PATCH v2 02/34] ARM: remove unused define Sascha Hauer
                   ` (32 subsequent siblings)
  33 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

The functions determining the different memory locations for the stack,
early malloc, TTB and OP-TEE all take a membase argument which is
unused, as all locations depend only on the end of memory. Remove this
unused argument.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
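As a quick illustration of the new calling convention, every early region
is now derived from the end of RAM alone (a sketch only; the includes are
assumed to match the barebox tree):

#include <common.h>
#include <asm/barebox-arm.h>

static void __maybe_unused show_early_layout(unsigned long membase,
					     unsigned long memsize)
{
	unsigned long endmem = membase + memsize;

	/* these helpers used to take (membase, endmem); membase was ignored */
	pr_debug("stack top: 0x%08lx\n", arm_mem_stack_top(endmem));
	pr_debug("ttb:       0x%08lx\n", arm_mem_ttb(endmem));
	pr_debug("ramoops:   0x%08lx\n", arm_mem_ramoops(endmem));
}
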
 arch/arm/boards/raspberry-pi/lowlevel.c |  2 +-
 arch/arm/cpu/entry.c                    |  2 +-
 arch/arm/cpu/start.c                    |  6 ++---
 arch/arm/cpu/uncompress.c               |  6 ++---
 arch/arm/include/asm/barebox-arm.h      | 29 ++++++++++---------------
 5 files changed, 20 insertions(+), 25 deletions(-)

diff --git a/arch/arm/boards/raspberry-pi/lowlevel.c b/arch/arm/boards/raspberry-pi/lowlevel.c
index 742f177dec..fd11fe53e0 100644
--- a/arch/arm/boards/raspberry-pi/lowlevel.c
+++ b/arch/arm/boards/raspberry-pi/lowlevel.c
@@ -42,7 +42,7 @@ static void copy_vc_fdt(void *dest, void *src, unsigned long max_size)
  * this FDT there. We fetch it from there later in rpi_devices_init().
  */
 #define rpi_stack_top(memsize) \
-	arm_mem_stack_top(BCM2835_SDRAM_BASE, BCM2835_SDRAM_BASE + memsize - VIDEOCORE_FDT_SZ)
+	arm_mem_stack_top(BCM2835_SDRAM_BASE + memsize - VIDEOCORE_FDT_SZ)
 
 static inline void start_raspberry_pi(unsigned long memsize, void *fdt,
 								void *vc_fdt)
diff --git a/arch/arm/cpu/entry.c b/arch/arm/cpu/entry.c
index b863af5757..dc264c8771 100644
--- a/arch/arm/cpu/entry.c
+++ b/arch/arm/cpu/entry.c
@@ -40,5 +40,5 @@ void NAKED __noreturn barebox_arm_entry(unsigned long membase,
 					unsigned long memsize, void *boarddata)
 {
 	__barebox_arm_entry(membase, memsize, boarddata,
-			    arm_mem_stack_top(membase, membase + memsize));
+			    arm_mem_stack_top(membase + memsize));
 }
diff --git a/arch/arm/cpu/start.c b/arch/arm/cpu/start.c
index be303514c2..62b2054dd6 100644
--- a/arch/arm/cpu/start.c
+++ b/arch/arm/cpu/start.c
@@ -111,7 +111,7 @@ static inline unsigned long arm_mem_boarddata(unsigned long membase,
 
 unsigned long arm_mem_ramoops_get(void)
 {
-	return arm_mem_ramoops(0, arm_stack_top);
+	return arm_mem_ramoops(arm_stack_top);
 }
 EXPORT_SYMBOL_GPL(arm_mem_ramoops_get);
 
@@ -163,12 +163,12 @@ __noreturn __no_sanitize_address void barebox_non_pbl_start(unsigned long membas
 
 	arm_membase = membase;
 	arm_endmem = endmem;
-	arm_stack_top = arm_mem_stack_top(membase, endmem);
+	arm_stack_top = arm_mem_stack_top(endmem);
 	arm_barebox_size = barebox_size;
 	malloc_end = barebox_base;
 
 	if (IS_ENABLED(CONFIG_MMU_EARLY)) {
-		unsigned long ttb = arm_mem_ttb(membase, endmem);
+		unsigned long ttb = arm_mem_ttb(endmem);
 
 		if (IS_ENABLED(CONFIG_PBL_IMAGE)) {
 			arm_set_cache_functions();
diff --git a/arch/arm/cpu/uncompress.c b/arch/arm/cpu/uncompress.c
index 65de87f109..abaf36b68c 100644
--- a/arch/arm/cpu/uncompress.c
+++ b/arch/arm/cpu/uncompress.c
@@ -82,13 +82,13 @@ void __noreturn barebox_pbl_start(unsigned long membase, unsigned long memsize,
 	pr_debug("memory at 0x%08lx, size 0x%08lx\n", membase, memsize);
 
 	if (IS_ENABLED(CONFIG_MMU_EARLY)) {
-		unsigned long ttb = arm_mem_ttb(membase, endmem);
+		unsigned long ttb = arm_mem_ttb(endmem);
 		pr_debug("enabling MMU, ttb @ 0x%08lx\n", ttb);
 		mmu_early_enable(membase, memsize - OPTEE_SIZE, ttb);
 	}
 
-	free_mem_ptr = arm_mem_early_malloc(membase, endmem);
-	free_mem_end_ptr = arm_mem_early_malloc_end(membase, endmem);
+	free_mem_ptr = arm_mem_early_malloc(endmem);
+	free_mem_end_ptr = arm_mem_early_malloc_end(endmem);
 
 	pr_debug("uncompressing barebox binary at 0x%p (size 0x%08x) to 0x%08lx (uncompressed size: 0x%08x)\n",
 			pg_start, pg_len, barebox_base, uncompressed_len);
diff --git a/arch/arm/include/asm/barebox-arm.h b/arch/arm/include/asm/barebox-arm.h
index 0cf4549cd7..2e0d8dc9a7 100644
--- a/arch/arm/include/asm/barebox-arm.h
+++ b/arch/arm/include/asm/barebox-arm.h
@@ -78,39 +78,34 @@ static inline const void *arm_mem_scratch_get(void)
 	return (const void *)__arm_mem_scratch(arm_mem_endmem_get());
 }
 
-#define arm_mem_stack_top(membase, endmem) ((endmem) - SZ_64K - OPTEE_SIZE)
+#define arm_mem_stack_top(endmem) ((endmem) - SZ_64K - OPTEE_SIZE)
 
-static inline unsigned long arm_mem_stack(unsigned long membase,
-					  unsigned long endmem)
+static inline unsigned long arm_mem_stack(unsigned long endmem)
 {
-	return arm_mem_stack_top(membase, endmem) - STACK_SIZE;
+	return arm_mem_stack_top(endmem) - STACK_SIZE;
 }
 
-static inline unsigned long arm_mem_ttb(unsigned long membase,
-					unsigned long endmem)
+static inline unsigned long arm_mem_ttb(unsigned long endmem)
 {
-	endmem = arm_mem_stack(membase, endmem);
+	endmem = arm_mem_stack(endmem);
 	endmem = ALIGN_DOWN(endmem, ARM_TTB_SIZE) - ARM_TTB_SIZE;
 
 	return endmem;
 }
 
-static inline unsigned long arm_mem_early_malloc(unsigned long membase,
-						 unsigned long endmem)
+static inline unsigned long arm_mem_early_malloc(unsigned long endmem)
 {
-	return arm_mem_ttb(membase, endmem) - SZ_128K;
+	return arm_mem_ttb(endmem) - SZ_128K;
 }
 
-static inline unsigned long arm_mem_early_malloc_end(unsigned long membase,
-						     unsigned long endmem)
+static inline unsigned long arm_mem_early_malloc_end(unsigned long endmem)
 {
-	return arm_mem_ttb(membase, endmem);
+	return arm_mem_ttb(endmem);
 }
 
-static inline unsigned long arm_mem_ramoops(unsigned long membase,
-					    unsigned long endmem)
+static inline unsigned long arm_mem_ramoops(unsigned long endmem)
 {
-	endmem = arm_mem_ttb(membase, endmem);
+	endmem = arm_mem_ttb(endmem);
 #ifdef CONFIG_FS_PSTORE_RAMOOPS
 	endmem -= CONFIG_FS_PSTORE_RAMOOPS_SIZE;
 	endmem = ALIGN_DOWN(endmem, SZ_4K);
@@ -123,7 +118,7 @@ static inline unsigned long arm_mem_barebox_image(unsigned long membase,
 						  unsigned long endmem,
 						  unsigned long size)
 {
-	endmem = arm_mem_ramoops(membase, endmem);
+	endmem = arm_mem_ramoops(endmem);
 
 	if (IS_ENABLED(CONFIG_RELOCATABLE)) {
 		return ALIGN_DOWN(endmem - size, SZ_1M);
-- 
2.39.2

* [PATCH v2 02/34] ARM: remove unused define
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
  2023-05-17  9:03 ` [PATCH v2 01/34] ARM: remove unused membase argument Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17 12:45   ` Ahmad Fatoum
  2023-05-17  9:03 ` [PATCH v2 03/34] ARM: rename __arm_mem_scratch to arm_mem_scratch Sascha Hauer
                   ` (31 subsequent siblings)
  33 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

__ARM_SETUP_STACK isn't used anywhere. Remove it.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/include/asm/barebox-arm.h | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/arch/arm/include/asm/barebox-arm.h b/arch/arm/include/asm/barebox-arm.h
index 2e0d8dc9a7..3a0c3d7d40 100644
--- a/arch/arm/include/asm/barebox-arm.h
+++ b/arch/arm/include/asm/barebox-arm.h
@@ -130,10 +130,6 @@ static inline unsigned long arm_mem_barebox_image(unsigned long membase,
 	}
 }
 
-#ifndef CONFIG_CPU_64
-#define __ARM_SETUP_STACK(name, stack_top) if (stack_top) arm_setup_stack(stack_top)
-#endif
-
 /*
  * Unlike ENTRY_FUNCTION, this can be used to setup stack for a C entry
  * point on both ARM32 and ARM64. ENTRY_FUNCTION on ARM64 can only be used
-- 
2.39.2

* [PATCH v2 03/34] ARM: rename __arm_mem_scratch to arm_mem_scratch
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
  2023-05-17  9:03 ` [PATCH v2 01/34] ARM: remove unused membase argument Sascha Hauer
  2023-05-17  9:03 ` [PATCH v2 02/34] ARM: remove unused define Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17 12:46   ` Ahmad Fatoum
  2023-05-17  9:03 ` [PATCH v2 04/34] ARM: put scratch mem area below OP-TEE Sascha Hauer
                   ` (30 subsequent siblings)
  33 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

There are several arm_mem_* macros/functions and only one of them has
leading underscores. Remove the underscores for consistency.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/include/asm/barebox-arm.h | 4 ++--
 arch/arm/mach-imx/atf.c            | 6 +++---
 arch/arm/mach-imx/xload-common.c   | 2 +-
 include/mach/rockchip/bootrom.h    | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/arm/include/asm/barebox-arm.h b/arch/arm/include/asm/barebox-arm.h
index 3a0c3d7d40..f446044be6 100644
--- a/arch/arm/include/asm/barebox-arm.h
+++ b/arch/arm/include/asm/barebox-arm.h
@@ -71,11 +71,11 @@ static inline void arm_fixup_vectors(void)
 
 void *barebox_arm_boot_dtb(void);
 
-#define __arm_mem_scratch(endmem) ((endmem) - SZ_32K)
+#define arm_mem_scratch(endmem) ((endmem) - SZ_32K)
 
 static inline const void *arm_mem_scratch_get(void)
 {
-	return (const void *)__arm_mem_scratch(arm_mem_endmem_get());
+	return (const void *)arm_mem_scratch(arm_mem_endmem_get());
 }
 
 #define arm_mem_stack_top(endmem) ((endmem) - SZ_64K - OPTEE_SIZE)
diff --git a/arch/arm/mach-imx/atf.c b/arch/arm/mach-imx/atf.c
index 92820d9392..c5e6817aad 100644
--- a/arch/arm/mach-imx/atf.c
+++ b/arch/arm/mach-imx/atf.c
@@ -137,7 +137,7 @@ __noreturn void imx8mm_load_and_start_image_via_tfa(void)
 	void *endmem = (void *)MX8M_DDR_CSD1_BASE_ADDR +
 		imx8m_barebox_earlymem_size(32);
 
-	imx8m_save_bootrom_log(__arm_mem_scratch(endmem));
+	imx8m_save_bootrom_log(arm_mem_scratch(endmem));
 	imx8mm_load_bl33(bl33);
 
 	if (IS_ENABLED(CONFIG_FIRMWARE_IMX8MM_OPTEE))
@@ -185,7 +185,7 @@ __noreturn void imx8mp_load_and_start_image_via_tfa(void)
 	void *endmem = (void *)MX8M_DDR_CSD1_BASE_ADDR +
 		imx8m_barebox_earlymem_size(32);
 
-	imx8m_save_bootrom_log(__arm_mem_scratch(endmem));
+	imx8m_save_bootrom_log(arm_mem_scratch(endmem));
 	imx8mp_load_bl33(bl33);
 
 	if (IS_ENABLED(CONFIG_FIRMWARE_IMX8MP_OPTEE))
@@ -234,7 +234,7 @@ __noreturn void imx8mn_load_and_start_image_via_tfa(void)
 	void *endmem = (void *)MX8M_DDR_CSD1_BASE_ADDR +
 		imx8m_barebox_earlymem_size(16);
 
-	imx8m_save_bootrom_log(__arm_mem_scratch(endmem));
+	imx8m_save_bootrom_log(arm_mem_scratch(endmem));
 	imx8mn_load_bl33(bl33);
 
 	if (IS_ENABLED(CONFIG_FIRMWARE_IMX8MN_OPTEE))
diff --git a/arch/arm/mach-imx/xload-common.c b/arch/arm/mach-imx/xload-common.c
index 0d3e6be1b1..03eb2ef109 100644
--- a/arch/arm/mach-imx/xload-common.c
+++ b/arch/arm/mach-imx/xload-common.c
@@ -26,7 +26,7 @@ struct imx_scratch_space *__imx8m_scratch_space(int ddr_buswidth)
 	ulong endmem = MX8M_DDR_CSD1_BASE_ADDR +
 		imx8m_barebox_earlymem_size(ddr_buswidth);
 
-	return (void *)__arm_mem_scratch(endmem);
+	return (void *)arm_mem_scratch(endmem);
 }
 
 #define HDR_SIZE	512
diff --git a/include/mach/rockchip/bootrom.h b/include/mach/rockchip/bootrom.h
index 96eb147ae4..5b999fc606 100644
--- a/include/mach/rockchip/bootrom.h
+++ b/include/mach/rockchip/bootrom.h
@@ -15,7 +15,7 @@ static inline void rockchip_store_bootrom_iram(ulong membase,
                                                ulong memsize,
                                                const void *iram)
 {
-	void *dst = (void *)__arm_mem_scratch(membase + memsize);
+	void *dst = (void *)arm_mem_scratch(membase + memsize);
 	memcpy(dst, iram, sizeof(struct rockchip_scratch_space));
 }
 
-- 
2.39.2

* [PATCH v2 04/34] ARM: put scratch mem area below OP-TEE
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (2 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 03/34] ARM: rename __arm_mem_scratch to arm_mem_scratch Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17 12:48   ` Ahmad Fatoum
  2023-05-17  9:03 ` [PATCH v2 05/34] ARM: add arm_mem_optee() Sascha Hauer
                   ` (29 subsequent siblings)
  33 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

We want to reserve memory for OP-TEE at the end of the available SDRAM,
so move the scratch area below OP-TEE rather than above it.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/include/asm/barebox-arm.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/barebox-arm.h b/arch/arm/include/asm/barebox-arm.h
index f446044be6..6e6606d005 100644
--- a/arch/arm/include/asm/barebox-arm.h
+++ b/arch/arm/include/asm/barebox-arm.h
@@ -71,14 +71,14 @@ static inline void arm_fixup_vectors(void)
 
 void *barebox_arm_boot_dtb(void);
 
-#define arm_mem_scratch(endmem) ((endmem) - SZ_32K)
+#define arm_mem_scratch(endmem) ((endmem) - OPTEE_SIZE - SZ_32K)
 
 static inline const void *arm_mem_scratch_get(void)
 {
 	return (const void *)arm_mem_scratch(arm_mem_endmem_get());
 }
 
-#define arm_mem_stack_top(endmem) ((endmem) - SZ_64K - OPTEE_SIZE)
+#define arm_mem_stack_top(endmem) ((endmem) - SZ_64K)
 
 static inline unsigned long arm_mem_stack(unsigned long endmem)
 {
-- 
2.39.2

* [PATCH v2 05/34] ARM: add arm_mem_optee()
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (3 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 04/34] ARM: put scratch mem area below OP-TEE Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17 12:53   ` Ahmad Fatoum
  2023-05-17  9:03 ` [PATCH v2 06/34] ARM: make arm_mem_scratch() a static inline function Sascha Hauer
                   ` (28 subsequent siblings)
  33 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

We have several functions/macros named arm_mem_* returning the
different addresses for early memory locations. Add one for OP-TEE as
well.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/include/asm/barebox-arm.h | 5 +++++
 arch/arm/mach-imx/atf.c            | 6 +++---
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/barebox-arm.h b/arch/arm/include/asm/barebox-arm.h
index 6e6606d005..8ab1e90e94 100644
--- a/arch/arm/include/asm/barebox-arm.h
+++ b/arch/arm/include/asm/barebox-arm.h
@@ -71,6 +71,11 @@ static inline void arm_fixup_vectors(void)
 
 void *barebox_arm_boot_dtb(void);
 
+static inline unsigned long arm_mem_optee(unsigned long endmem)
+{
+	return endmem - OPTEE_SIZE;
+}
+
 #define arm_mem_scratch(endmem) ((endmem) - OPTEE_SIZE - SZ_32K)
 
 static inline const void *arm_mem_scratch_get(void)
diff --git a/arch/arm/mach-imx/atf.c b/arch/arm/mach-imx/atf.c
index c5e6817aad..659798b95f 100644
--- a/arch/arm/mach-imx/atf.c
+++ b/arch/arm/mach-imx/atf.c
@@ -141,7 +141,7 @@ __noreturn void imx8mm_load_and_start_image_via_tfa(void)
 	imx8mm_load_bl33(bl33);
 
 	if (IS_ENABLED(CONFIG_FIRMWARE_IMX8MM_OPTEE))
-		imx8m_load_and_start_optee_via_tfa(imx8mm, endmem - OPTEE_SIZE, bl33);
+		imx8m_load_and_start_optee_via_tfa(imx8mm, arm_mem_optee(endmem), bl33);
 	else
 		imx8mm_load_and_start_tfa(imx8mm_bl31_bin);
 }
@@ -189,7 +189,7 @@ __noreturn void imx8mp_load_and_start_image_via_tfa(void)
 	imx8mp_load_bl33(bl33);
 
 	if (IS_ENABLED(CONFIG_FIRMWARE_IMX8MP_OPTEE))
-		imx8m_load_and_start_optee_via_tfa(imx8mp, endmem - OPTEE_SIZE, bl33);
+		imx8m_load_and_start_optee_via_tfa(imx8mp, arm_mem_optee(endmem), bl33);
 	else
 		imx8mp_load_and_start_tfa(imx8mp_bl31_bin);
 }
@@ -238,7 +238,7 @@ __noreturn void imx8mn_load_and_start_image_via_tfa(void)
 	imx8mn_load_bl33(bl33);
 
 	if (IS_ENABLED(CONFIG_FIRMWARE_IMX8MN_OPTEE))
-		imx8m_load_and_start_optee_via_tfa(imx8mn, endmem - OPTEE_SIZE, bl33);
+		imx8m_load_and_start_optee_via_tfa(imx8mn, arm_mem_optee(endmem), bl33);
 	else
 		imx8mn_load_and_start_tfa(imx8mn_bl31_bin);
 }
-- 
2.39.2

* [PATCH v2 06/34] ARM: make arm_mem_scratch() a static inline function
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (4 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 05/34] ARM: add arm_mem_optee() Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17 12:53   ` Ahmad Fatoum
  2023-05-17  9:03 ` [PATCH v2 07/34] ARM: define stack base consistently Sascha Hauer
                   ` (27 subsequent siblings)
  33 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

Most other arm_mem_* helpers are functions, so convert arm_mem_scratch()
into a function as well.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/include/asm/barebox-arm.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/barebox-arm.h b/arch/arm/include/asm/barebox-arm.h
index 8ab1e90e94..139ecce06d 100644
--- a/arch/arm/include/asm/barebox-arm.h
+++ b/arch/arm/include/asm/barebox-arm.h
@@ -76,7 +76,10 @@ static inline unsigned long arm_mem_optee(unsigned long endmem)
 	return endmem - OPTEE_SIZE;
 }
 
-#define arm_mem_scratch(endmem) ((endmem) - OPTEE_SIZE - SZ_32K)
+static inline unsigned long arm_mem_scratch(unsigned long endmem)
+{
+	return arm_mem_optee(endmem) - SZ_32K;
+}
 
 static inline const void *arm_mem_scratch_get(void)
 {
-- 
2.39.2

* [PATCH v2 07/34] ARM: define stack base consistently
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (5 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 06/34] ARM: make arm_mem_scratch() a static inline function Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17 12:55   ` Ahmad Fatoum
  2023-05-17  9:03 ` [PATCH v2 08/34] ARM: move arm_mem_scratch_get() lower for consistency Sascha Hauer
                   ` (26 subsequent siblings)
  33 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

The different arm_mem_* functions follow the pattern of taking the base
of the region above and subtracting the size of the current region.
Follow this pattern for getting the stack base as well. While at it,
move arm_mem_stack_top() lower in the file so that all functions
following said pattern sit next to each other.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
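For reference, the resulting chain of end-of-memory carve-outs, each
derived from the one above it (a sketch using the helpers as they look
after this patch; the pr_info calls and includes are assumed to match
the barebox tree):

#include <common.h>
#include <asm/barebox-arm.h>

static void __maybe_unused dump_arm_mem_layout(unsigned long endmem)
{
	pr_info("optee:   0x%08lx\n", arm_mem_optee(endmem));	/* endmem  - OPTEE_SIZE */
	pr_info("scratch: 0x%08lx\n", arm_mem_scratch(endmem));	/* optee   - SZ_32K */
	pr_info("stack:   0x%08lx\n", arm_mem_stack(endmem));	/* scratch - STACK_SIZE */
	pr_info("ttb:     0x%08lx\n", arm_mem_ttb(endmem));	/* below the stack, ARM_TTB_SIZE aligned */
	pr_info("ramoops: 0x%08lx\n", arm_mem_ramoops(endmem));	/* below the ttb */
}
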
 arch/arm/include/asm/barebox-arm.h | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/barebox-arm.h b/arch/arm/include/asm/barebox-arm.h
index 139ecce06d..8d8c102081 100644
--- a/arch/arm/include/asm/barebox-arm.h
+++ b/arch/arm/include/asm/barebox-arm.h
@@ -86,11 +86,9 @@ static inline const void *arm_mem_scratch_get(void)
 	return (const void *)arm_mem_scratch(arm_mem_endmem_get());
 }
 
-#define arm_mem_stack_top(endmem) ((endmem) - SZ_64K)
-
 static inline unsigned long arm_mem_stack(unsigned long endmem)
 {
-	return arm_mem_stack_top(endmem) - STACK_SIZE;
+	return arm_mem_scratch(endmem) - STACK_SIZE;
 }
 
 static inline unsigned long arm_mem_ttb(unsigned long endmem)
@@ -122,6 +120,11 @@ static inline unsigned long arm_mem_ramoops(unsigned long endmem)
 	return endmem;
 }
 
+static inline unsigned long arm_mem_stack_top(unsigned long endmem)
+{
+	return arm_mem_stack(endmem) + STACK_SIZE;
+}
+
 static inline unsigned long arm_mem_barebox_image(unsigned long membase,
 						  unsigned long endmem,
 						  unsigned long size)
-- 
2.39.2

* [PATCH v2 08/34] ARM: move arm_mem_scratch_get() lower for consistency
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (6 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 07/34] ARM: define stack base consistently Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17 12:57   ` Ahmad Fatoum
  2023-05-17  9:03 ` [PATCH v2 09/34] ARM: drop cache function initialization Sascha Hauer
                   ` (25 subsequent siblings)
  33 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

The different arm_mem_* functions all follow the same pattern of taking
the base address of the region above minus the size of the current
region, and with the exception of arm_mem_scratch_get() they are all
placed below each other. arm_mem_scratch_get() doesn't fit into this
row, so move it lower.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/include/asm/barebox-arm.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm/include/asm/barebox-arm.h b/arch/arm/include/asm/barebox-arm.h
index 8d8c102081..f5a74b4746 100644
--- a/arch/arm/include/asm/barebox-arm.h
+++ b/arch/arm/include/asm/barebox-arm.h
@@ -81,11 +81,6 @@ static inline unsigned long arm_mem_scratch(unsigned long endmem)
 	return arm_mem_optee(endmem) - SZ_32K;
 }
 
-static inline const void *arm_mem_scratch_get(void)
-{
-	return (const void *)arm_mem_scratch(arm_mem_endmem_get());
-}
-
 static inline unsigned long arm_mem_stack(unsigned long endmem)
 {
 	return arm_mem_scratch(endmem) - STACK_SIZE;
@@ -125,6 +120,11 @@ static inline unsigned long arm_mem_stack_top(unsigned long endmem)
 	return arm_mem_stack(endmem) + STACK_SIZE;
 }
 
+static inline const void *arm_mem_scratch_get(void)
+{
+	return (const void *)arm_mem_scratch(arm_mem_endmem_get());
+}
+
 static inline unsigned long arm_mem_barebox_image(unsigned long membase,
 						  unsigned long endmem,
 						  unsigned long size)
-- 
2.39.2

* [PATCH v2 09/34] ARM: drop cache function initialization
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (7 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 08/34] ARM: move arm_mem_scratch_get() lower for consistency Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17  9:03 ` [PATCH v2 10/34] ARM: Add _32 suffix to aarch32 specific filenames Sascha Hauer
                   ` (24 subsequent siblings)
  33 siblings, 0 replies; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List; +Cc: Ahmad Fatoum

Currently a call to arm_set_cache_functions() is needed before the
cache maintenance functions can be used. Drop this call and instead pick
the correct functions lazily on first use.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
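The pattern used below is plain lazy initialization; in isolation it
looks like this (all names invented for illustration, the real code
selects the functions based on cpu_architecture()):

#include <stdio.h>

struct cache_ops {
	void (*flush)(void);
};

static void v7_flush(void) { puts("v7 flush"); }

static const struct cache_ops *cache_ops(void)
{
	static const struct cache_ops v7 = { .flush = v7_flush };
	static const struct cache_ops *ops;

	if (!ops)
		ops = &v7;	/* resolved once, on first use */

	return ops;
}

int main(void)
{
	cache_ops()->flush();	/* no separate init call needed */
	return 0;
}
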
 arch/arm/cpu/cache.c         | 83 +++++++++++++++++-------------------
 arch/arm/cpu/cache_64.c      |  5 ---
 arch/arm/cpu/mmu-early.c     |  2 -
 arch/arm/cpu/mmu.c           |  2 -
 arch/arm/cpu/start.c         |  4 +-
 arch/arm/include/asm/cache.h |  2 -
 6 files changed, 41 insertions(+), 57 deletions(-)

diff --git a/arch/arm/cpu/cache.c b/arch/arm/cpu/cache.c
index 24a02c68f3..4202406d0d 100644
--- a/arch/arm/cpu/cache.c
+++ b/arch/arm/cpu/cache.c
@@ -17,8 +17,6 @@ struct cache_fns {
 	void (*mmu_cache_flush)(void);
 };
 
-struct cache_fns *cache_fns;
-
 #define DEFINE_CPU_FNS(arch) \
 	void arch##_dma_clean_range(unsigned long start, unsigned long end);	\
 	void arch##_dma_flush_range(unsigned long start, unsigned long end);	\
@@ -41,50 +39,13 @@ DEFINE_CPU_FNS(v5)
 DEFINE_CPU_FNS(v6)
 DEFINE_CPU_FNS(v7)
 
-void __dma_clean_range(unsigned long start, unsigned long end)
-{
-	if (cache_fns)
-		cache_fns->dma_clean_range(start, end);
-}
-
-void __dma_flush_range(unsigned long start, unsigned long end)
-{
-	if (cache_fns)
-		cache_fns->dma_flush_range(start, end);
-}
-
-void __dma_inv_range(unsigned long start, unsigned long end)
-{
-	if (cache_fns)
-		cache_fns->dma_inv_range(start, end);
-}
-
-#ifdef CONFIG_MMU
-
-void __mmu_cache_on(void)
-{
-	if (cache_fns)
-		cache_fns->mmu_cache_on();
-}
-
-void __mmu_cache_off(void)
+static struct cache_fns *cache_functions(void)
 {
-	if (cache_fns)
-		cache_fns->mmu_cache_off();
-}
+	static struct cache_fns *cache_fns;
 
-void __mmu_cache_flush(void)
-{
 	if (cache_fns)
-		cache_fns->mmu_cache_flush();
-	if (outer_cache.flush_all)
-		outer_cache.flush_all();
-}
-
-#endif
+		return cache_fns;
 
-int arm_set_cache_functions(void)
-{
 	switch (cpu_architecture()) {
 #ifdef CONFIG_CPU_32v4T
 	case CPU_ARCH_ARMv4T:
@@ -113,9 +74,45 @@ int arm_set_cache_functions(void)
 		while(1);
 	}
 
-	return 0;
+	return cache_fns;
+}
+
+void __dma_clean_range(unsigned long start, unsigned long end)
+{
+	cache_functions()->dma_clean_range(start, end);
+}
+
+void __dma_flush_range(unsigned long start, unsigned long end)
+{
+	cache_functions()->dma_flush_range(start, end);
+}
+
+void __dma_inv_range(unsigned long start, unsigned long end)
+{
+	cache_functions()->dma_inv_range(start, end);
+}
+
+#ifdef CONFIG_MMU
+
+void __mmu_cache_on(void)
+{
+	cache_functions()->mmu_cache_on();
+}
+
+void __mmu_cache_off(void)
+{
+	cache_functions()->mmu_cache_off();
 }
 
+void __mmu_cache_flush(void)
+{
+	cache_functions()->mmu_cache_flush();
+	if (outer_cache.flush_all)
+		outer_cache.flush_all();
+}
+
+#endif
+
 /*
  * Early function to flush the caches. This is for use when the
  * C environment is not yet fully initialized.
diff --git a/arch/arm/cpu/cache_64.c b/arch/arm/cpu/cache_64.c
index cb7bc0945c..3a30296128 100644
--- a/arch/arm/cpu/cache_64.c
+++ b/arch/arm/cpu/cache_64.c
@@ -6,11 +6,6 @@
 #include <asm/cache.h>
 #include <asm/system_info.h>
 
-int arm_set_cache_functions(void)
-{
-	return 0;
-}
-
 /*
  * Early function to flush the caches. This is for use when the
  * C environment is not yet fully initialized.
diff --git a/arch/arm/cpu/mmu-early.c b/arch/arm/cpu/mmu-early.c
index 0d528b9b9c..4895911cdb 100644
--- a/arch/arm/cpu/mmu-early.c
+++ b/arch/arm/cpu/mmu-early.c
@@ -28,8 +28,6 @@ void mmu_early_enable(unsigned long membase, unsigned long memsize,
 {
 	ttb = (uint32_t *)_ttb;
 
-	arm_set_cache_functions();
-
 	set_ttbr(ttb);
 
 	/* For the XN bit to take effect, we can't be using DOMAIN_MANAGER. */
diff --git a/arch/arm/cpu/mmu.c b/arch/arm/cpu/mmu.c
index 6388e1bf14..78dd05577a 100644
--- a/arch/arm/cpu/mmu.c
+++ b/arch/arm/cpu/mmu.c
@@ -414,8 +414,6 @@ void __mmu_init(bool mmu_on)
 {
 	struct memory_bank *bank;
 
-	arm_set_cache_functions();
-
 	if (cpu_architecture() >= CPU_ARCH_ARMv7) {
 		pte_flags_cached = PTE_FLAGS_CACHED_V7;
 		pte_flags_wc = PTE_FLAGS_WC_V7;
diff --git a/arch/arm/cpu/start.c b/arch/arm/cpu/start.c
index 62b2054dd6..4841ee6043 100644
--- a/arch/arm/cpu/start.c
+++ b/arch/arm/cpu/start.c
@@ -170,9 +170,7 @@ __noreturn __no_sanitize_address void barebox_non_pbl_start(unsigned long membas
 	if (IS_ENABLED(CONFIG_MMU_EARLY)) {
 		unsigned long ttb = arm_mem_ttb(endmem);
 
-		if (IS_ENABLED(CONFIG_PBL_IMAGE)) {
-			arm_set_cache_functions();
-		} else {
+		if (!IS_ENABLED(CONFIG_PBL_IMAGE)) {
 			pr_debug("enabling MMU, ttb @ 0x%08lx\n", ttb);
 			arm_early_mmu_cache_invalidate();
 			mmu_early_enable(membase, memsize - OPTEE_SIZE, ttb);
diff --git a/arch/arm/include/asm/cache.h b/arch/arm/include/asm/cache.h
index b63776a74a..261c30129a 100644
--- a/arch/arm/include/asm/cache.h
+++ b/arch/arm/include/asm/cache.h
@@ -18,8 +18,6 @@ static inline void icache_invalidate(void)
 #endif
 }
 
-int arm_set_cache_functions(void);
-
 void arm_early_mmu_cache_flush(void);
 void arm_early_mmu_cache_invalidate(void);
 
-- 
2.39.2

* [PATCH v2 10/34] ARM: Add _32 suffix to aarch32 specific filenames
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (8 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 09/34] ARM: drop cache function initialization Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17  9:03 ` [PATCH v2 11/34] ARM: cpu.c: remove unused include Sascha Hauer
                   ` (23 subsequent siblings)
  33 siblings, 0 replies; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List; +Cc: Ahmad Fatoum

Several files in arch/arm/cpu/ have 32-bit and 64-bit versions. The
64-bit versions have a _64 suffix, but the 32-bit versions have none.
This can be confusing, as one doesn't know whether a file is
32-bit-specific or common code.

Add a _32 suffix to the 32-bit files to avoid this confusion.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
 arch/arm/Makefile                             |  5 ++++-
 arch/arm/cpu/Makefile                         | 20 +++++++++----------
 arch/arm/cpu/{cache.c => cache_32.c}          |  0
 arch/arm/cpu/{entry_ll.S => entry_ll_32.S}    |  0
 .../arm/cpu/{exceptions.S => exceptions_32.S} |  0
 .../arm/cpu/{interrupts.c => interrupts_32.c} |  0
 arch/arm/cpu/{lowlevel.S => lowlevel_32.S}    |  0
 arch/arm/cpu/{mmu-early.c => mmu-early_32.c}  |  0
 arch/arm/cpu/{mmu.c => mmu_32.c}              |  0
 arch/arm/cpu/{setupc.S => setupc_32.S}        |  0
 .../arm/cpu/{smccc-call.S => smccc-call_32.S} |  0
 11 files changed, 14 insertions(+), 11 deletions(-)
 rename arch/arm/cpu/{cache.c => cache_32.c} (100%)
 rename arch/arm/cpu/{entry_ll.S => entry_ll_32.S} (100%)
 rename arch/arm/cpu/{exceptions.S => exceptions_32.S} (100%)
 rename arch/arm/cpu/{interrupts.c => interrupts_32.c} (100%)
 rename arch/arm/cpu/{lowlevel.S => lowlevel_32.S} (100%)
 rename arch/arm/cpu/{mmu-early.c => mmu-early_32.c} (100%)
 rename arch/arm/cpu/{mmu.c => mmu_32.c} (100%)
 rename arch/arm/cpu/{setupc.S => setupc_32.S} (100%)
 rename arch/arm/cpu/{smccc-call.S => smccc-call_32.S} (100%)

diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index a506f1e3a3..cb88c7b330 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -78,10 +78,13 @@ endif
 ifeq ($(CONFIG_CPU_V8), y)
 KBUILD_CPPFLAGS += $(CFLAGS_ABI) $(arch-y) $(tune-y)
 KBUILD_AFLAGS   += -include asm/unified.h
-export S64 = _64
+export S64_32 = 64
+export S64 = 64
 else
 KBUILD_CPPFLAGS += $(CFLAGS_ABI) $(arch-y) $(tune-y) $(CFLAGS_THUMB2)
 KBUILD_AFLAGS   += -include asm/unified.h -msoft-float $(AFLAGS_THUMB2)
+export S64_32 = 32
+export S32 = 32
 endif
 
 # Machine directory name.  This list is sorted alphanumerically
diff --git a/arch/arm/cpu/Makefile b/arch/arm/cpu/Makefile
index 7674c1464c..fef2026da5 100644
--- a/arch/arm/cpu/Makefile
+++ b/arch/arm/cpu/Makefile
@@ -2,15 +2,15 @@
 
 obj-y += cpu.o
 
-obj-$(CONFIG_ARM_EXCEPTIONS) += exceptions$(S64).o interrupts$(S64).o
-obj-$(CONFIG_MMU) += mmu$(S64).o mmu-common.o
-obj-pbl-y += lowlevel$(S64).o
-obj-pbl-$(CONFIG_MMU) += mmu-early$(S64).o
+obj-$(CONFIG_ARM_EXCEPTIONS) += exceptions_$(S64_32).o interrupts_$(S64_32).o
+obj-$(CONFIG_MMU) += mmu_$(S64_32).o mmu-common.o
+obj-pbl-y += lowlevel_$(S64_32).o
+obj-pbl-$(CONFIG_MMU) += mmu-early_$(S64_32).o
 obj-pbl-$(CONFIG_CPU_32v7) += hyp.o
 AFLAGS_hyp.o :=-Wa,-march=armv7-a -Wa,-mcpu=all
 AFLAGS_hyp.pbl.o :=-Wa,-march=armv7-a -Wa,-mcpu=all
 
-obj-y += start.o entry.o entry_ll$(S64).o
+obj-y += start.o entry.o entry_ll_$(S64_32).o
 KASAN_SANITIZE_start.o := n
 
 pbl-$(CONFIG_CPU_64) += head_64.o
@@ -18,7 +18,7 @@ pbl-$(CONFIG_CPU_64) += head_64.o
 pbl-$(CONFIG_BOARD_ARM_GENERIC_DT) += board-dt-2nd.o
 pbl-$(CONFIG_BOARD_ARM_GENERIC_DT_AARCH64) += board-dt-2nd-aarch64.o
 
-obj-pbl-y += setupc$(S64).o cache$(S64).o
+obj-pbl-y += setupc_$(S64_32).o cache_$(S64_32).o
 
 obj-$(CONFIG_ARM_PSCI_CLIENT) += psci-client.o
 
@@ -35,9 +35,9 @@ endif
 
 obj-$(CONFIG_ARM_PSCI) += psci.o
 obj-$(CONFIG_ARM_PSCI_OF) += psci-of.o
-obj-pbl-$(CONFIG_ARM_SMCCC) += smccc-call$(S64).o
-AFLAGS_smccc-call$(S64).o :=-Wa,-march=armv$(if $(S64),8,7)-a
-AFLAGS_smccc-call$(S64).pbl.o :=-Wa,-march=armv$(if $(S64),8,7)-a
+obj-pbl-$(CONFIG_ARM_SMCCC) += smccc-call_$(S64_32).o
+AFLAGS_smccc-call_$(S64_32).o :=-Wa,-march=armv$(if $(S64),8,7)-a
+AFLAGS_smccc-call_$(S64_32).pbl.o :=-Wa,-march=armv$(if $(S64),8,7)-a
 obj-$(CONFIG_ARM_SECURE_MONITOR) += sm.o sm_as.o
 AFLAGS_sm_as.o		:=-Wa,-march=armv7-a
 
@@ -52,7 +52,7 @@ obj-pbl-$(CONFIG_CPU_64v8) += cache-armv8.o
 AFLAGS_cache-armv8.o       :=-Wa,-march=armv8-a
 AFLAGS-cache-armv8.pbl.o   :=-Wa,-march=armv8-a
 
-pbl-y += entry.o entry_ll$(S64).o
+pbl-y += entry.o entry_ll_$(S64_32).o
 pbl-y += uncompress.o
 pbl-$(CONFIG_ARM_ATF) += atf.o
 
diff --git a/arch/arm/cpu/cache.c b/arch/arm/cpu/cache_32.c
similarity index 100%
rename from arch/arm/cpu/cache.c
rename to arch/arm/cpu/cache_32.c
diff --git a/arch/arm/cpu/entry_ll.S b/arch/arm/cpu/entry_ll_32.S
similarity index 100%
rename from arch/arm/cpu/entry_ll.S
rename to arch/arm/cpu/entry_ll_32.S
diff --git a/arch/arm/cpu/exceptions.S b/arch/arm/cpu/exceptions_32.S
similarity index 100%
rename from arch/arm/cpu/exceptions.S
rename to arch/arm/cpu/exceptions_32.S
diff --git a/arch/arm/cpu/interrupts.c b/arch/arm/cpu/interrupts_32.c
similarity index 100%
rename from arch/arm/cpu/interrupts.c
rename to arch/arm/cpu/interrupts_32.c
diff --git a/arch/arm/cpu/lowlevel.S b/arch/arm/cpu/lowlevel_32.S
similarity index 100%
rename from arch/arm/cpu/lowlevel.S
rename to arch/arm/cpu/lowlevel_32.S
diff --git a/arch/arm/cpu/mmu-early.c b/arch/arm/cpu/mmu-early_32.c
similarity index 100%
rename from arch/arm/cpu/mmu-early.c
rename to arch/arm/cpu/mmu-early_32.c
diff --git a/arch/arm/cpu/mmu.c b/arch/arm/cpu/mmu_32.c
similarity index 100%
rename from arch/arm/cpu/mmu.c
rename to arch/arm/cpu/mmu_32.c
diff --git a/arch/arm/cpu/setupc.S b/arch/arm/cpu/setupc_32.S
similarity index 100%
rename from arch/arm/cpu/setupc.S
rename to arch/arm/cpu/setupc_32.S
diff --git a/arch/arm/cpu/smccc-call.S b/arch/arm/cpu/smccc-call_32.S
similarity index 100%
rename from arch/arm/cpu/smccc-call.S
rename to arch/arm/cpu/smccc-call_32.S
-- 
2.39.2

* [PATCH v2 11/34] ARM: cpu.c: remove unused include
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (9 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 10/34] ARM: Add _32 suffix to aarch32 specific filenames Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17  9:03 ` [PATCH v2 12/34] ARM: mmu-common.c: use common mmu include Sascha Hauer
                   ` (22 subsequent siblings)
  33 siblings, 0 replies; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List; +Cc: Ahmad Fatoum

cpu.c doesn't use anything from mmu.h, so drop its inclusion.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
 arch/arm/cpu/cpu.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/arm/cpu/cpu.c b/arch/arm/cpu/cpu.c
index 5b79dd2a8f..cacd442b28 100644
--- a/arch/arm/cpu/cpu.c
+++ b/arch/arm/cpu/cpu.c
@@ -18,8 +18,6 @@
 #include <asm/cache.h>
 #include <asm/ptrace.h>
 
-#include "mmu.h"
-
 /**
  * Enable processor's instruction cache
  */
-- 
2.39.2

* [PATCH v2 12/34] ARM: mmu-common.c: use common mmu include
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (10 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 11/34] ARM: cpu.c: remove unused include Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17  9:03 ` [PATCH v2 13/34] ARM: mmu32: rename mmu.h to mmu_32.h Sascha Hauer
                   ` (21 subsequent siblings)
  33 siblings, 0 replies; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List; +Cc: Ahmad Fatoum

mmu-common.c needs things from mmu-common.h, but not from mmu.h, so
include the former instead.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
 arch/arm/cpu/mmu-common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/cpu/mmu-common.c b/arch/arm/cpu/mmu-common.c
index 488a189f1c..e6cc3b974f 100644
--- a/arch/arm/cpu/mmu-common.c
+++ b/arch/arm/cpu/mmu-common.c
@@ -11,7 +11,7 @@
 #include <asm/system.h>
 #include <asm/barebox-arm.h>
 #include <memory.h>
-#include "mmu.h"
+#include "mmu-common.h"
 
 void dma_sync_single_for_cpu(dma_addr_t address, size_t size,
 			     enum dma_data_direction dir)
-- 
2.39.2

* [PATCH v2 13/34] ARM: mmu32: rename mmu.h to mmu_32.h
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (11 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 12/34] ARM: mmu-common.c: use common mmu include Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17  9:03 ` [PATCH v2 14/34] ARM: mmu: implement MAP_FAULT Sascha Hauer
                   ` (20 subsequent siblings)
  33 siblings, 0 replies; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List; +Cc: Ahmad Fatoum

mmu.h is AArch32-specific, so rename it to mmu_32.h, matching the C
files which have already been renamed.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
 arch/arm/cpu/cache_32.c          | 2 +-
 arch/arm/cpu/mmu-early_32.c      | 2 +-
 arch/arm/cpu/mmu_32.c            | 2 +-
 arch/arm/cpu/{mmu.h => mmu_32.h} | 0
 arch/arm/cpu/sm.c                | 3 +--
 5 files changed, 4 insertions(+), 5 deletions(-)
 rename arch/arm/cpu/{mmu.h => mmu_32.h} (100%)

diff --git a/arch/arm/cpu/cache_32.c b/arch/arm/cpu/cache_32.c
index 4202406d0d..0ac50c4d9a 100644
--- a/arch/arm/cpu/cache_32.c
+++ b/arch/arm/cpu/cache_32.c
@@ -6,7 +6,7 @@
 #include <asm/cache.h>
 #include <asm/system_info.h>
 
-#include "mmu.h"
+#include "mmu_32.h"
 
 struct cache_fns {
 	void (*dma_clean_range)(unsigned long start, unsigned long end);
diff --git a/arch/arm/cpu/mmu-early_32.c b/arch/arm/cpu/mmu-early_32.c
index 4895911cdb..07c5917e6a 100644
--- a/arch/arm/cpu/mmu-early_32.c
+++ b/arch/arm/cpu/mmu-early_32.c
@@ -9,7 +9,7 @@
 #include <asm/cache.h>
 #include <asm-generic/sections.h>
 
-#include "mmu.h"
+#include "mmu_32.h"
 
 static uint32_t *ttb;
 
diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
index 78dd05577a..8ec21ee1d2 100644
--- a/arch/arm/cpu/mmu_32.c
+++ b/arch/arm/cpu/mmu_32.c
@@ -18,7 +18,7 @@
 #include <asm/system_info.h>
 #include <asm/sections.h>
 
-#include "mmu.h"
+#include "mmu_32.h"
 
 #define PTRS_PER_PTE		(PGDIR_SIZE / PAGE_SIZE)
 #define ARCH_MAP_WRITECOMBINE	((unsigned)-1)
diff --git a/arch/arm/cpu/mmu.h b/arch/arm/cpu/mmu_32.h
similarity index 100%
rename from arch/arm/cpu/mmu.h
rename to arch/arm/cpu/mmu_32.h
diff --git a/arch/arm/cpu/sm.c b/arch/arm/cpu/sm.c
index f5a1edbd4f..53f5142b63 100644
--- a/arch/arm/cpu/sm.c
+++ b/arch/arm/cpu/sm.c
@@ -19,8 +19,7 @@
 #include <linux/arm-smccc.h>
 #include <asm-generic/sections.h>
 #include <asm/secure.h>
-
-#include "mmu.h"
+#include "mmu_32.h"
 
 static unsigned int read_id_pfr1(void)
 {
-- 
2.39.2

* [PATCH v2 14/34] ARM: mmu: implement MAP_FAULT
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (12 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 13/34] ARM: mmu32: rename mmu.h to mmu_32.h Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17  9:03 ` [PATCH v2 15/34] ARM: mmu64: Use arch_remap_range where possible Sascha Hauer
                   ` (19 subsequent siblings)
  33 siblings, 0 replies; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

MAP_FAULT can be used to map the zero page as faulting.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
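Usage sketch (this is how the zero page is handled later in this series;
the PAGE_SIZE and include details are assumed):

#include <common.h>
#include <mmu.h>

static void zero_page_example(void)
{
	/* make accesses to the zero page fault, catching NULL dereferences */
	arch_remap_range(0x0, PAGE_SIZE, MAP_FAULT);

	/* map it back when something at address 0 must be accessed,
	 * e.g. the i.MX boot ROM */
	arch_remap_range(0x0, PAGE_SIZE, MAP_CACHED);
}
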
 arch/arm/cpu/mmu_32.c | 4 ++++
 arch/arm/cpu/mmu_64.c | 3 +++
 include/mmu.h         | 1 +
 3 files changed, 8 insertions(+)

diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
index 8ec21ee1d2..a1ecc49f03 100644
--- a/arch/arm/cpu/mmu_32.c
+++ b/arch/arm/cpu/mmu_32.c
@@ -178,6 +178,10 @@ int arch_remap_range(void *start, size_t size, unsigned flags)
 		pte_flags = pte_flags_uncached;
 		pgd_flags = pgd_flags_uncached;
 		break;
+	case MAP_FAULT:
+		pte_flags = 0x0;
+		pgd_flags = 0x0;
+		break;
 	case ARCH_MAP_WRITECOMBINE:
 		pte_flags = pte_flags_wc;
 		pgd_flags = pgd_flags_wc;
diff --git a/arch/arm/cpu/mmu_64.c b/arch/arm/cpu/mmu_64.c
index f43ac9a121..a22e0c81ab 100644
--- a/arch/arm/cpu/mmu_64.c
+++ b/arch/arm/cpu/mmu_64.c
@@ -154,6 +154,9 @@ int arch_remap_range(void *_start, size_t size, unsigned flags)
 	case MAP_UNCACHED:
 		attrs = attrs_uncached_mem();
 		break;
+	case MAP_FAULT:
+		attrs = 0x0;
+		break;
 	default:
 		return -EINVAL;
 	}
diff --git a/include/mmu.h b/include/mmu.h
index 2e23853df3..2326cb215a 100644
--- a/include/mmu.h
+++ b/include/mmu.h
@@ -4,6 +4,7 @@
 
 #define MAP_UNCACHED	0
 #define MAP_CACHED	1
+#define MAP_FAULT	2
 
 /*
  * Depending on the architecture the default mapping can be
-- 
2.39.2

* [PATCH v2 15/34] ARM: mmu64: Use arch_remap_range where possible
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (13 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 14/34] ARM: mmu: implement MAP_FAULT Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17  9:03 ` [PATCH v2 16/34] ARM: mmu32: implement zero_page_*() Sascha Hauer
                   ` (18 subsequent siblings)
  33 siblings, 0 replies; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List; +Cc: Ahmad Fatoum

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
 arch/arm/cpu/mmu_64.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arm/cpu/mmu_64.c b/arch/arm/cpu/mmu_64.c
index a22e0c81ab..0639d0f1ce 100644
--- a/arch/arm/cpu/mmu_64.c
+++ b/arch/arm/cpu/mmu_64.c
@@ -174,12 +174,12 @@ static void mmu_enable(void)
 
 void zero_page_access(void)
 {
-	create_sections(0x0, 0x0, PAGE_SIZE, CACHED_MEM);
+	arch_remap_range(0x0, PAGE_SIZE, MAP_CACHED);
 }
 
 void zero_page_faulting(void)
 {
-	create_sections(0x0, 0x0, PAGE_SIZE, 0x0);
+	arch_remap_range(0x0, PAGE_SIZE, MAP_FAULT);
 }
 
 /*
@@ -201,17 +201,17 @@ void __mmu_init(bool mmu_on)
 	pr_debug("ttb: 0x%p\n", ttb);
 
 	/* create a flat mapping */
-	create_sections(0, 0, 1UL << (BITS_PER_VA - 1), attrs_uncached_mem());
+	arch_remap_range(0, 1UL << (BITS_PER_VA - 1), MAP_UNCACHED);
 
 	/* Map sdram cached. */
 	for_each_memory_bank(bank) {
 		struct resource *rsv;
 
-		create_sections(bank->start, bank->start, bank->size, CACHED_MEM);
+		arch_remap_range((void *)bank->start, bank->size, MAP_CACHED);
 
 		for_each_reserved_region(bank, rsv) {
-			create_sections(resource_first_page(rsv), resource_first_page(rsv),
-					resource_count_pages(rsv), attrs_uncached_mem());
+			arch_remap_range((void *)resource_first_page(rsv),
+					 resource_count_pages(rsv), MAP_UNCACHED);
 		}
 	}
 
-- 
2.39.2

* [PATCH v2 16/34] ARM: mmu32: implement zero_page_*()
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (14 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 15/34] ARM: mmu64: Use arch_remap_range where possible Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17  9:03 ` [PATCH v2 17/34] ARM: i.MX: Drop HAB workaround Sascha Hauer
                   ` (17 subsequent siblings)
  33 siblings, 0 replies; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

We have functions to make the zero page accessible and to make it
faulting again. Implement them for AArch32 as well.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/cpu/Kconfig      |  3 ++-
 arch/arm/cpu/mmu-common.c | 11 +++++++++++
 arch/arm/cpu/mmu_32.c     |  5 ++---
 arch/arm/cpu/mmu_64.c     | 10 ----------
 4 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/arch/arm/cpu/Kconfig b/arch/arm/cpu/Kconfig
index 26f07043fe..40dd35833a 100644
--- a/arch/arm/cpu/Kconfig
+++ b/arch/arm/cpu/Kconfig
@@ -11,6 +11,7 @@ config CPU_32
 	select HAVE_MOD_ARCH_SPECIFIC
 	select HAS_DMA
 	select HAVE_PBL_IMAGE
+	select ARCH_HAS_ZERO_PAGE
 
 config CPU_64
 	bool
@@ -19,6 +20,7 @@ config CPU_64
 	select HAVE_PBL_MULTI_IMAGES
 	select HAS_DMA
 	select ARCH_WANT_FRAME_POINTERS
+	select ARCH_HAS_ZERO_PAGE
 
 # Select CPU types depending on the architecture selected. This selects
 # which CPUs we support in the kernel image, and the compiler instruction
@@ -92,7 +94,6 @@ config CPU_V8
 	select ARM_EXCEPTIONS
 	select GENERIC_FIND_NEXT_BIT
 	select ARCH_HAS_STACK_DUMP
-	select ARCH_HAS_ZERO_PAGE
 
 config CPU_XSC3
         bool
diff --git a/arch/arm/cpu/mmu-common.c b/arch/arm/cpu/mmu-common.c
index e6cc3b974f..02f512c2c6 100644
--- a/arch/arm/cpu/mmu-common.c
+++ b/arch/arm/cpu/mmu-common.c
@@ -11,6 +11,7 @@
 #include <asm/system.h>
 #include <asm/barebox-arm.h>
 #include <memory.h>
+#include <zero_page.h>
 #include "mmu-common.h"
 
 void dma_sync_single_for_cpu(dma_addr_t address, size_t size,
@@ -57,6 +58,16 @@ void dma_free_coherent(void *mem, dma_addr_t dma_handle, size_t size)
 	free(mem);
 }
 
+void zero_page_access(void)
+{
+	arch_remap_range(0x0, PAGE_SIZE, MAP_CACHED);
+}
+
+void zero_page_faulting(void)
+{
+	arch_remap_range(0x0, PAGE_SIZE, MAP_FAULT);
+}
+
 static int mmu_init(void)
 {
 	if (list_empty(&memory_banks)) {
diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
index a1ecc49f03..7b31938ecd 100644
--- a/arch/arm/cpu/mmu_32.c
+++ b/arch/arm/cpu/mmu_32.c
@@ -9,6 +9,7 @@
 #include <init.h>
 #include <mmu.h>
 #include <errno.h>
+#include <zero_page.h>
 #include <linux/sizes.h>
 #include <asm/memory.h>
 #include <asm/barebox-arm.h>
@@ -362,7 +363,6 @@ static int set_vector_table(unsigned long adr)
 static void create_zero_page(void)
 {
 	struct resource *zero_sdram;
-	u32 *zero;
 
 	zero_sdram = request_sdram_region("zero page", 0x0, PAGE_SIZE);
 	if (zero_sdram) {
@@ -372,8 +372,7 @@ static void create_zero_page(void)
 		 */
 		pr_debug("zero page is in SDRAM area, currently not supported\n");
 	} else {
-		zero = arm_create_pte(0x0, pte_flags_uncached);
-		zero[0] = 0;
+		zero_page_faulting();
 		pr_debug("Created zero page\n");
 	}
 }
diff --git a/arch/arm/cpu/mmu_64.c b/arch/arm/cpu/mmu_64.c
index 0639d0f1ce..c7c16b527b 100644
--- a/arch/arm/cpu/mmu_64.c
+++ b/arch/arm/cpu/mmu_64.c
@@ -172,16 +172,6 @@ static void mmu_enable(void)
 	set_cr(get_cr() | CR_M | CR_C | CR_I);
 }
 
-void zero_page_access(void)
-{
-	arch_remap_range(0x0, PAGE_SIZE, MAP_CACHED);
-}
-
-void zero_page_faulting(void)
-{
-	arch_remap_range(0x0, PAGE_SIZE, MAP_FAULT);
-}
-
 /*
  * Prepare MMU for usage enable it.
  */
-- 
2.39.2

* [PATCH v2 17/34] ARM: i.MX: Drop HAB workaround
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (15 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 16/34] ARM: mmu32: implement zero_page_*() Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17 13:01   ` Ahmad Fatoum
  2023-05-17  9:03 ` [PATCH v2 18/34] ARM: Move early MMU after malloc initialization Sascha Hauer
                   ` (16 subsequent siblings)
  33 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

The i.MX HAB code on i.MX6 has to jump into the ROM, which happens to
start at 0x0. To make that possible we used to map the ROM cached and
jump to it before the MMU was initialized. Instead, remap the ROM as
needed in the HAB code itself so that we can safely jump into the ROM
with the MMU enabled.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/cpu/mmu-early_32.c |  7 -------
 drivers/hab/habv4.c         | 10 +++++++++-
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/arm/cpu/mmu-early_32.c b/arch/arm/cpu/mmu-early_32.c
index 07c5917e6a..94bde44c9b 100644
--- a/arch/arm/cpu/mmu-early_32.c
+++ b/arch/arm/cpu/mmu-early_32.c
@@ -58,12 +58,5 @@ void mmu_early_enable(unsigned long membase, unsigned long memsize,
 	/* maps main memory as cachable */
 	map_region(membase, memsize, PMD_SECT_DEF_CACHED);
 
-	/*
-	 * With HAB enabled we call into the ROM code later in imx6_hab_get_status().
-	 * Map the ROM cached which has the effect that the XN bit is not set.
-	 */
-	if (IS_ENABLED(CONFIG_HABV4) && IS_ENABLED(CONFIG_ARCH_IMX6))
-		map_region(0x0, SZ_1M, PMD_SECT_DEF_CACHED);
-
 	__mmu_cache_on();
 }
diff --git a/drivers/hab/habv4.c b/drivers/hab/habv4.c
index ca26773bf8..e8c7d3264d 100644
--- a/drivers/hab/habv4.c
+++ b/drivers/hab/habv4.c
@@ -11,6 +11,9 @@
 #include <hab.h>
 #include <init.h>
 #include <types.h>
+#include <mmu.h>
+#include <zero_page.h>
+#include <linux/sizes.h>
 #include <linux/arm-smccc.h>
 #include <asm/cache.h>
 
@@ -616,12 +619,17 @@ static int init_imx6_hab_get_status(void)
 		/* can happen in multi-image builds and is not an error */
 		return 0;
 
+	arch_remap_range(0x0, SZ_1M, MAP_CACHED);
+
 	/*
 	 * Nobody will check the return value if there were HAB errors, but the
 	 * initcall will fail spectaculously with a strange error message.
 	 */
 	imx6_hab_get_status();
 
+	zero_page_faulting();
+	arch_remap_range((void *)PAGE_SIZE, SZ_1M - PAGE_SIZE, MAP_UNCACHED);
+
 	return 0;
 }
 
@@ -630,7 +638,7 @@ static int init_imx6_hab_get_status(void)
  * which will no longer be accessible when the MMU sets the zero page to
  * faulting.
  */
-postconsole_initcall(init_imx6_hab_get_status);
+postmmu_initcall(init_imx6_hab_get_status);
 
 int imx28_hab_get_status(void)
 {
-- 
2.39.2

* [PATCH v2 18/34] ARM: Move early MMU after malloc initialization
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (16 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 17/34] ARM: i.MX: Drop HAB workaround Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17  9:03 ` [PATCH v2 19/34] ARM: mmu: move dma_sync_single_for_device to extra file Sascha Hauer
                   ` (15 subsequent siblings)
  33 siblings, 0 replies; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List; +Cc: Ahmad Fatoum

Initialize the MMU after malloc so that we can use malloc in the
MMU code, for example to allocate memory for page tables.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
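What the reordering enables, as a sketch (the helper name here is
invented; a real alloc_pte() is introduced later in this series):

#include <common.h>
#include <xfuncs.h>

/* page tables can now simply come from malloc space, naturally aligned */
static void *alloc_table_sketch(void)
{
	return xmemalign(PAGE_SIZE, PAGE_SIZE);
}
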
 arch/arm/cpu/start.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/arm/cpu/start.c b/arch/arm/cpu/start.c
index 4841ee6043..87207822a0 100644
--- a/arch/arm/cpu/start.c
+++ b/arch/arm/cpu/start.c
@@ -167,16 +167,6 @@ __noreturn __no_sanitize_address void barebox_non_pbl_start(unsigned long membas
 	arm_barebox_size = barebox_size;
 	malloc_end = barebox_base;
 
-	if (IS_ENABLED(CONFIG_MMU_EARLY)) {
-		unsigned long ttb = arm_mem_ttb(endmem);
-
-		if (!IS_ENABLED(CONFIG_PBL_IMAGE)) {
-			pr_debug("enabling MMU, ttb @ 0x%08lx\n", ttb);
-			arm_early_mmu_cache_invalidate();
-			mmu_early_enable(membase, memsize - OPTEE_SIZE, ttb);
-		}
-	}
-
 	if (boarddata) {
 		uint32_t totalsize = 0;
 		const char *name;
@@ -226,6 +216,16 @@ __noreturn __no_sanitize_address void barebox_non_pbl_start(unsigned long membas
 
 	mem_malloc_init((void *)malloc_start, (void *)malloc_end - 1);
 
+	if (IS_ENABLED(CONFIG_MMU_EARLY)) {
+		unsigned long ttb = arm_mem_ttb(endmem);
+
+		if (!IS_ENABLED(CONFIG_PBL_IMAGE)) {
+			pr_debug("enabling MMU, ttb @ 0x%08lx\n", ttb);
+			arm_early_mmu_cache_invalidate();
+			mmu_early_enable(membase, memsize - OPTEE_SIZE, ttb);
+		}
+	}
+
 	if (IS_ENABLED(CONFIG_BOOTM_OPTEE))
 		of_add_reserve_entry(endmem - OPTEE_SIZE, endmem - 1);
 
-- 
2.39.2




^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v2 19/34] ARM: mmu: move dma_sync_single_for_device to extra file
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (17 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 18/34] ARM: Move early MMU after malloc initialization Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17  9:03 ` [PATCH v2 20/34] ARM: mmu: merge mmu-early_xx.c into mmu_xx.c Sascha Hauer
                   ` (14 subsequent siblings)
  33 siblings, 0 replies; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

The next patch merges the mmu.c files with their corresponding
mmu-early.c files. Before doing that, move the functions which can't
be compiled for the PBL out into separate files.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/cpu/Makefile |  1 +
 arch/arm/cpu/dma_32.c | 20 ++++++++++++++++++++
 arch/arm/cpu/dma_64.c | 16 ++++++++++++++++
 arch/arm/cpu/mmu_32.c | 18 ------------------
 arch/arm/cpu/mmu_64.c | 13 -------------
 5 files changed, 37 insertions(+), 31 deletions(-)
 create mode 100644 arch/arm/cpu/dma_32.c
 create mode 100644 arch/arm/cpu/dma_64.c

diff --git a/arch/arm/cpu/Makefile b/arch/arm/cpu/Makefile
index fef2026da5..cd5f36eb49 100644
--- a/arch/arm/cpu/Makefile
+++ b/arch/arm/cpu/Makefile
@@ -4,6 +4,7 @@ obj-y += cpu.o
 
 obj-$(CONFIG_ARM_EXCEPTIONS) += exceptions_$(S64_32).o interrupts_$(S64_32).o
 obj-$(CONFIG_MMU) += mmu_$(S64_32).o mmu-common.o
+obj-$(CONFIG_MMU) += dma_$(S64_32).o
 obj-pbl-y += lowlevel_$(S64_32).o
 obj-pbl-$(CONFIG_MMU) += mmu-early_$(S64_32).o
 obj-pbl-$(CONFIG_CPU_32v7) += hyp.o
diff --git a/arch/arm/cpu/dma_32.c b/arch/arm/cpu/dma_32.c
new file mode 100644
index 0000000000..a66aa26b9b
--- /dev/null
+++ b/arch/arm/cpu/dma_32.c
@@ -0,0 +1,20 @@
+#include <dma.h>
+#include <asm/mmu.h>
+
+void dma_sync_single_for_device(dma_addr_t address, size_t size,
+				enum dma_data_direction dir)
+{
+	/*
+	 * FIXME: This function needs a device argument to support non 1:1 mappings
+	 */
+
+	if (dir == DMA_FROM_DEVICE) {
+		__dma_inv_range(address, address + size);
+		if (outer_cache.inv_range)
+			outer_cache.inv_range(address, address + size);
+	} else {
+		__dma_clean_range(address, address + size);
+		if (outer_cache.clean_range)
+			outer_cache.clean_range(address, address + size);
+	}
+}
diff --git a/arch/arm/cpu/dma_64.c b/arch/arm/cpu/dma_64.c
new file mode 100644
index 0000000000..b4ae736c9b
--- /dev/null
+++ b/arch/arm/cpu/dma_64.c
@@ -0,0 +1,16 @@
+#include <dma.h>
+#include <asm/mmu.h>
+#include <asm/cache.h>
+
+void dma_sync_single_for_device(dma_addr_t address, size_t size,
+                                enum dma_data_direction dir)
+{
+	/*
+	 * FIXME: This function needs a device argument to support non 1:1 mappings
+	 */
+
+	if (dir == DMA_FROM_DEVICE)
+		v8_inv_dcache_range(address, address + size - 1);
+	else
+		v8_flush_dcache_range(address, address + size - 1);
+}
diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
index 7b31938ecd..10f447874c 100644
--- a/arch/arm/cpu/mmu_32.c
+++ b/arch/arm/cpu/mmu_32.c
@@ -494,21 +494,3 @@ void *dma_alloc_writecombine(size_t size, dma_addr_t *dma_handle)
 {
 	return dma_alloc_map(size, dma_handle, ARCH_MAP_WRITECOMBINE);
 }
-
-void dma_sync_single_for_device(dma_addr_t address, size_t size,
-				enum dma_data_direction dir)
-{
-	/*
-	 * FIXME: This function needs a device argument to support non 1:1 mappings
-	 */
-
-	if (dir == DMA_FROM_DEVICE) {
-		__dma_inv_range(address, address + size);
-		if (outer_cache.inv_range)
-			outer_cache.inv_range(address, address + size);
-	} else {
-		__dma_clean_range(address, address + size);
-		if (outer_cache.clean_range)
-			outer_cache.clean_range(address, address + size);
-	}
-}
diff --git a/arch/arm/cpu/mmu_64.c b/arch/arm/cpu/mmu_64.c
index c7c16b527b..9150de1676 100644
--- a/arch/arm/cpu/mmu_64.c
+++ b/arch/arm/cpu/mmu_64.c
@@ -241,16 +241,3 @@ void dma_flush_range(void *ptr, size_t size)
 
 	v8_flush_dcache_range(start, end);
 }
-
-void dma_sync_single_for_device(dma_addr_t address, size_t size,
-                                enum dma_data_direction dir)
-{
-	/*
-	 * FIXME: This function needs a device argument to support non 1:1 mappings
-	 */
-
-	if (dir == DMA_FROM_DEVICE)
-		v8_inv_dcache_range(address, address + size - 1);
-	else
-		v8_flush_dcache_range(address, address + size - 1);
-}
-- 
2.39.2




^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v2 20/34] ARM: mmu: merge mmu-early_xx.c into mmu_xx.c
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (18 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 19/34] ARM: mmu: move dma_sync_single_for_device to extra file Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17  9:03 ` [PATCH v2 21/34] ARM: mmu: alloc 64k for early page tables Sascha Hauer
                   ` (13 subsequent siblings)
  33 siblings, 0 replies; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

The code will be further consolidated, so move it together for easier
code sharing.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/cpu/Makefile       |  4 +-
 arch/arm/cpu/mmu-early_32.c | 62 -------------------------
 arch/arm/cpu/mmu-early_64.c | 93 -------------------------------------
 arch/arm/cpu/mmu_32.c       | 50 ++++++++++++++++++++
 arch/arm/cpu/mmu_64.c       | 76 ++++++++++++++++++++++++++++++
 5 files changed, 128 insertions(+), 157 deletions(-)
 delete mode 100644 arch/arm/cpu/mmu-early_32.c
 delete mode 100644 arch/arm/cpu/mmu-early_64.c

diff --git a/arch/arm/cpu/Makefile b/arch/arm/cpu/Makefile
index cd5f36eb49..0e4fa69229 100644
--- a/arch/arm/cpu/Makefile
+++ b/arch/arm/cpu/Makefile
@@ -3,10 +3,10 @@
 obj-y += cpu.o
 
 obj-$(CONFIG_ARM_EXCEPTIONS) += exceptions_$(S64_32).o interrupts_$(S64_32).o
-obj-$(CONFIG_MMU) += mmu_$(S64_32).o mmu-common.o
+obj-$(CONFIG_MMU) += mmu-common.o
+obj-pbl-$(CONFIG_MMU) += mmu_$(S64_32).o
 obj-$(CONFIG_MMU) += dma_$(S64_32).o
 obj-pbl-y += lowlevel_$(S64_32).o
-obj-pbl-$(CONFIG_MMU) += mmu-early_$(S64_32).o
 obj-pbl-$(CONFIG_CPU_32v7) += hyp.o
 AFLAGS_hyp.o :=-Wa,-march=armv7-a -Wa,-mcpu=all
 AFLAGS_hyp.pbl.o :=-Wa,-march=armv7-a -Wa,-mcpu=all
diff --git a/arch/arm/cpu/mmu-early_32.c b/arch/arm/cpu/mmu-early_32.c
deleted file mode 100644
index 94bde44c9b..0000000000
--- a/arch/arm/cpu/mmu-early_32.c
+++ /dev/null
@@ -1,62 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-
-#include <common.h>
-#include <asm/mmu.h>
-#include <errno.h>
-#include <linux/sizes.h>
-#include <asm/memory.h>
-#include <asm/system.h>
-#include <asm/cache.h>
-#include <asm-generic/sections.h>
-
-#include "mmu_32.h"
-
-static uint32_t *ttb;
-
-static inline void map_region(unsigned long start, unsigned long size,
-			      uint64_t flags)
-
-{
-	start = ALIGN_DOWN(start, SZ_1M);
-	size  = ALIGN(size, SZ_1M);
-
-	create_sections(ttb, start, start + size - 1, flags);
-}
-
-void mmu_early_enable(unsigned long membase, unsigned long memsize,
-		      unsigned long _ttb)
-{
-	ttb = (uint32_t *)_ttb;
-
-	set_ttbr(ttb);
-
-	/* For the XN bit to take effect, we can't be using DOMAIN_MANAGER. */
-	if (cpu_architecture() >= CPU_ARCH_ARMv7)
-		set_domain(DOMAIN_CLIENT);
-	else
-		set_domain(DOMAIN_MANAGER);
-
-	/*
-	 * This marks the whole address space as uncachable as well as
-	 * unexecutable if possible
-	 */
-	create_flat_mapping(ttb);
-
-	/*
-	 * There can be SoCs that have a section shared between device memory
-	 * and the on-chip RAM hosting the PBL. Thus mark this section
-	 * uncachable, but executable.
-	 * On such SoCs, executing from OCRAM could cause the instruction
-	 * prefetcher to speculatively access that device memory, triggering
-	 * potential errant behavior.
-	 *
-	 * If your SoC has such a memory layout, you should rewrite the code
-	 * here to map the OCRAM page-wise.
-	 */
-	map_region((unsigned long)_stext, _etext - _stext, PMD_SECT_DEF_UNCACHED);
-
-	/* maps main memory as cachable */
-	map_region(membase, memsize, PMD_SECT_DEF_CACHED);
-
-	__mmu_cache_on();
-}
diff --git a/arch/arm/cpu/mmu-early_64.c b/arch/arm/cpu/mmu-early_64.c
deleted file mode 100644
index d1f4a046bb..0000000000
--- a/arch/arm/cpu/mmu-early_64.c
+++ /dev/null
@@ -1,93 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-
-#include <common.h>
-#include <dma-dir.h>
-#include <init.h>
-#include <mmu.h>
-#include <errno.h>
-#include <linux/sizes.h>
-#include <asm/memory.h>
-#include <asm/pgtable64.h>
-#include <asm/barebox-arm.h>
-#include <asm/system.h>
-#include <asm/cache.h>
-#include <memory.h>
-#include <asm/system_info.h>
-
-#include "mmu_64.h"
-
-static void create_sections(void *ttb, uint64_t virt, uint64_t phys,
-			    uint64_t size, uint64_t attr)
-{
-	uint64_t block_size;
-	uint64_t block_shift;
-	uint64_t *pte;
-	uint64_t idx;
-	uint64_t addr;
-	uint64_t *table;
-
-	addr = virt;
-
-	attr &= ~PTE_TYPE_MASK;
-
-	table = ttb;
-
-	while (1) {
-		block_shift = level2shift(1);
-		idx = (addr & level2mask(1)) >> block_shift;
-		block_size = (1ULL << block_shift);
-
-		pte = table + idx;
-
-		*pte = phys | attr | PTE_TYPE_BLOCK;
-
-		if (size < block_size)
-			break;
-
-		addr += block_size;
-		phys += block_size;
-		size -= block_size;
-	}
-}
-
-#define EARLY_BITS_PER_VA 39
-
-void mmu_early_enable(unsigned long membase, unsigned long memsize,
-		      unsigned long ttb)
-{
-	int el;
-
-	/*
-	 * For the early code we only create level 1 pagetables which only
-	 * allow for a 1GiB granularity. If our membase is not aligned to that
-	 * bail out without enabling the MMU.
-	 */
-	if (membase & ((1ULL << level2shift(1)) - 1))
-		return;
-
-	memset((void *)ttb, 0, GRANULE_SIZE);
-
-	el = current_el();
-	set_ttbr_tcr_mair(el, ttb, calc_tcr(el, EARLY_BITS_PER_VA), MEMORY_ATTRIBUTES);
-	create_sections((void *)ttb, 0, 0, 1UL << (EARLY_BITS_PER_VA - 1),
-			attrs_uncached_mem());
-	create_sections((void *)ttb, membase, membase, memsize, CACHED_MEM);
-	tlb_invalidate();
-	isb();
-	set_cr(get_cr() | CR_M);
-}
-
-void mmu_early_disable(void)
-{
-	unsigned int cr;
-
-	cr = get_cr();
-	cr &= ~(CR_M | CR_C);
-
-	set_cr(cr);
-	v8_flush_dcache_all();
-	tlb_invalidate();
-
-	dsb();
-	isb();
-}
diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
index 10f447874c..12fe892400 100644
--- a/arch/arm/cpu/mmu_32.c
+++ b/arch/arm/cpu/mmu_32.c
@@ -494,3 +494,53 @@ void *dma_alloc_writecombine(size_t size, dma_addr_t *dma_handle)
 {
 	return dma_alloc_map(size, dma_handle, ARCH_MAP_WRITECOMBINE);
 }
+
+static uint32_t *ttb;
+
+static inline void map_region(unsigned long start, unsigned long size,
+			      uint64_t flags)
+
+{
+	start = ALIGN_DOWN(start, SZ_1M);
+	size  = ALIGN(size, SZ_1M);
+
+	create_sections(ttb, start, start + size - 1, flags);
+}
+
+void mmu_early_enable(unsigned long membase, unsigned long memsize,
+		      unsigned long _ttb)
+{
+	ttb = (uint32_t *)_ttb;
+
+	set_ttbr(ttb);
+
+	/* For the XN bit to take effect, we can't be using DOMAIN_MANAGER. */
+	if (cpu_architecture() >= CPU_ARCH_ARMv7)
+		set_domain(DOMAIN_CLIENT);
+	else
+		set_domain(DOMAIN_MANAGER);
+
+	/*
+	 * This marks the whole address space as uncachable as well as
+	 * unexecutable if possible
+	 */
+	create_flat_mapping(ttb);
+
+	/*
+	 * There can be SoCs that have a section shared between device memory
+	 * and the on-chip RAM hosting the PBL. Thus mark this section
+	 * uncachable, but executable.
+	 * On such SoCs, executing from OCRAM could cause the instruction
+	 * prefetcher to speculatively access that device memory, triggering
+	 * potential errant behavior.
+	 *
+	 * If your SoC has such a memory layout, you should rewrite the code
+	 * here to map the OCRAM page-wise.
+	 */
+	map_region((unsigned long)_stext, _etext - _stext, PMD_SECT_DEF_UNCACHED);
+
+	/* maps main memory as cachable */
+	map_region(membase, memsize, PMD_SECT_DEF_CACHED);
+
+	__mmu_cache_on();
+}
diff --git a/arch/arm/cpu/mmu_64.c b/arch/arm/cpu/mmu_64.c
index 9150de1676..55ada960c5 100644
--- a/arch/arm/cpu/mmu_64.c
+++ b/arch/arm/cpu/mmu_64.c
@@ -241,3 +241,79 @@ void dma_flush_range(void *ptr, size_t size)
 
 	v8_flush_dcache_range(start, end);
 }
+
+static void early_create_sections(void *ttb, uint64_t virt, uint64_t phys,
+				  uint64_t size, uint64_t attr)
+{
+	uint64_t block_size;
+	uint64_t block_shift;
+	uint64_t *pte;
+	uint64_t idx;
+	uint64_t addr;
+	uint64_t *table;
+
+	addr = virt;
+
+	attr &= ~PTE_TYPE_MASK;
+
+	table = ttb;
+
+	while (1) {
+		block_shift = level2shift(1);
+		idx = (addr & level2mask(1)) >> block_shift;
+		block_size = (1ULL << block_shift);
+
+		pte = table + idx;
+
+		*pte = phys | attr | PTE_TYPE_BLOCK;
+
+		if (size < block_size)
+			break;
+
+		addr += block_size;
+		phys += block_size;
+		size -= block_size;
+	}
+}
+
+#define EARLY_BITS_PER_VA 39
+
+void mmu_early_enable(unsigned long membase, unsigned long memsize,
+		      unsigned long ttb)
+{
+	int el;
+
+	/*
+	 * For the early code we only create level 1 pagetables which only
+	 * allow for a 1GiB granularity. If our membase is not aligned to that
+	 * bail out without enabling the MMU.
+	 */
+	if (membase & ((1ULL << level2shift(1)) - 1))
+		return;
+
+	memset((void *)ttb, 0, GRANULE_SIZE);
+
+	el = current_el();
+	set_ttbr_tcr_mair(el, ttb, calc_tcr(el, EARLY_BITS_PER_VA), MEMORY_ATTRIBUTES);
+	early_create_sections((void *)ttb, 0, 0, 1UL << (EARLY_BITS_PER_VA - 1),
+			attrs_uncached_mem());
+	early_create_sections((void *)ttb, membase, membase, memsize, CACHED_MEM);
+	tlb_invalidate();
+	isb();
+	set_cr(get_cr() | CR_M);
+}
+
+void mmu_early_disable(void)
+{
+	unsigned int cr;
+
+	cr = get_cr();
+	cr &= ~(CR_M | CR_C);
+
+	set_cr(cr);
+	v8_flush_dcache_all();
+	tlb_invalidate();
+
+	dsb();
+	isb();
+}
-- 
2.39.2
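
The AArch64 early code merged above only creates level-1 block mappings
and bails out when membase is not aligned to a level-1 block. The block
sizes per level can be worked out with a small standalone sketch; the
arithmetic assumes the usual 4KiB translation granule (12 bits of page
offset, 9 bits resolved per level), which is what level2shift() computes
in this series.

#include <stdio.h>

int main(void)
{
	int level;

	for (level = 0; level <= 3; level++) {
		/* 4KiB granule: 12 bits of page offset, 9 bits per level */
		int shift = 12 + 9 * (3 - level);
		unsigned long long block = 1ULL << shift;

		printf("level %d: shift %2d, block size %llu KiB\n",
		       level, shift, block >> 10);
	}

	return 0;
}

A level-1 block is 1GiB, which is why mmu_early_enable() above refuses
to enable the MMU when the start of memory is not 1GiB aligned; the
later patches in this series add further levels to get down to 4KiB.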




^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v2 21/34] ARM: mmu: alloc 64k for early page tables
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (19 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 20/34] ARM: mmu: merge mmu-early_xx.c into mmu_xx.c Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17 13:03   ` Ahmad Fatoum
  2023-05-17  9:03 ` [PATCH v2 22/34] ARM: mmu32: create alloc_pte() Sascha Hauer
                   ` (12 subsequent siblings)
  33 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

This is a preparation for using two-level page tables in the PBL.
To do that we need a way to allocate page tables in the PBL. As malloc
is not available there, increase the area we use for the TTB to
make some space available for page tables.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/cpu/mmu_32.c              | 6 ++++++
 arch/arm/include/asm/barebox-arm.h | 8 ++------
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
index 12fe892400..4050d96846 100644
--- a/arch/arm/cpu/mmu_32.c
+++ b/arch/arm/cpu/mmu_32.c
@@ -24,6 +24,12 @@
 #define PTRS_PER_PTE		(PGDIR_SIZE / PAGE_SIZE)
 #define ARCH_MAP_WRITECOMBINE	((unsigned)-1)
 
+/*
+ * We have a 4GiB address space split into 1MiB sections, with each
+ * section header taking 4 bytes
+ */
+#define ARM_TTB_SIZE	(SZ_4G / SZ_1M * sizeof(u32))
+
 static uint32_t *ttb;
 
 /*
diff --git a/arch/arm/include/asm/barebox-arm.h b/arch/arm/include/asm/barebox-arm.h
index f5a74b4746..eb31ca2788 100644
--- a/arch/arm/include/asm/barebox-arm.h
+++ b/arch/arm/include/asm/barebox-arm.h
@@ -23,11 +23,7 @@
 #include <asm/reloc.h>
 #include <linux/stringify.h>
 
-/*
- * We have a 4GiB address space split into 1MiB sections, with each
- * section header taking 4 bytes
- */
-#define ARM_TTB_SIZE	(SZ_4G / SZ_1M * sizeof(u32))
+#define ARM_EARLY_PAGETABLE_SIZE	SZ_64K
 
 void __noreturn barebox_arm_entry(unsigned long membase, unsigned long memsize, void *boarddata);
 
@@ -89,7 +85,7 @@ static inline unsigned long arm_mem_stack(unsigned long endmem)
 static inline unsigned long arm_mem_ttb(unsigned long endmem)
 {
 	endmem = arm_mem_stack(endmem);
-	endmem = ALIGN_DOWN(endmem, ARM_TTB_SIZE) - ARM_TTB_SIZE;
+	endmem = ALIGN_DOWN(endmem, ARM_EARLY_PAGETABLE_SIZE) - ARM_EARLY_PAGETABLE_SIZE;
 
 	return endmem;
 }
-- 
2.39.2
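
For a rough feel of what the 64KiB buys, here is a self-contained
back-of-the-envelope check. The sizes are the ones used elsewhere in
this series (a 16KiB AArch32 first-level table of 4096 four-byte section
entries, 1KiB second-level tables of 256 entries, 4KiB AArch64 table
granules), and it assumes the first-level table sits at the start of the
area.

#include <assert.h>

int main(void)
{
	const unsigned long early   = 64 * 1024;  /* ARM_EARLY_PAGETABLE_SIZE */
	const unsigned long ttb32   = 4096 * 4;   /* 4096 section entries of 4 bytes */
	const unsigned long pte32   = 256 * 4;    /* 256 page entries of 4 bytes */
	const unsigned long granule = 4096;       /* AArch64 table granule */

	/* AArch32: the first-level table plus up to 48 second-level tables */
	assert((early - ttb32) / pte32 == 48);

	/* AArch64: 16 table granules for all levels together */
	assert(early / granule == 16);

	return 0;
}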




^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v2 22/34] ARM: mmu32: create alloc_pte()
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (20 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 21/34] ARM: mmu: alloc 64k for early page tables Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17 13:07   ` Ahmad Fatoum
  2023-05-17  9:03 ` [PATCH v2 23/34] ARM: mmu64: " Sascha Hauer
                   ` (11 subsequent siblings)
  33 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

This is a preparation for using two-level page tables in the PBL.
To do that we need a way to allocate page tables in the PBL. As malloc
is not available there, implement a function that allocates a page
table from the area in which we also place the TTB.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/cpu/mmu_32.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
index 4050d96846..a82382ad1e 100644
--- a/arch/arm/cpu/mmu_32.c
+++ b/arch/arm/cpu/mmu_32.c
@@ -76,6 +76,27 @@ static bool pgd_type_table(u32 pgd)
 	return (pgd & PMD_TYPE_MASK) == PMD_TYPE_TABLE;
 }
 
+#define PTE_SIZE       (PTRS_PER_PTE * sizeof(u32))
+
+#ifdef __PBL__
+static uint32_t *alloc_pte(void)
+{
+	static unsigned int idx = 3;
+
+	idx++;
+
+	if (idx * PTE_SIZE >= ARM_EARLY_PAGETABLE_SIZE)
+		return NULL;
+
+	return (void *)ttb + idx * PTE_SIZE;
+}
+#else
+static uint32_t *alloc_pte(void)
+{
+	return xmemalign(PTE_SIZE, PTE_SIZE);
+}
+#endif
+
 static u32 *find_pte(unsigned long adr)
 {
 	u32 *table;
@@ -125,8 +146,7 @@ static u32 *arm_create_pte(unsigned long virt, uint32_t flags)
 
 	virt = ALIGN_DOWN(virt, PGDIR_SIZE);
 
-	table = xmemalign(PTRS_PER_PTE * sizeof(u32),
-			  PTRS_PER_PTE * sizeof(u32));
+	table = alloc_pte();
 
 	if (!ttb)
 		arm_mmu_not_initialized_error();
-- 
2.39.2
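
The PBL variant above is essentially a bump allocator over the fixed
early pagetable area. Below is a generic, self-contained sketch of that
idea; it is an illustration only, the sizes and the start offset are
made up for the example and do not reproduce the exact numbers from the
patch.

#include <stddef.h>
#include <stdint.h>

#define AREA_SIZE	(64 * 1024)	/* fixed early pagetable area */
#define FIRST_LEVEL	(16 * 1024)	/* first-level table at the start */
#define TABLE_SIZE	1024		/* one second-level table */

static uint8_t pt_area[AREA_SIZE] __attribute__((aligned(16 * 1024)));
static size_t pt_used = FIRST_LEVEL;

/* Hand out consecutive TABLE_SIZE chunks after the first-level table */
static void *alloc_table(void)
{
	void *ret;

	if (pt_used + TABLE_SIZE > AREA_SIZE)
		return NULL;		/* area exhausted */

	ret = pt_area + pt_used;
	pt_used += TABLE_SIZE;

	return ret;
}

int main(void)
{
	int n = 0;

	while (alloc_table())
		n++;

	return n == 48 ? 0 : 1;	/* 48 tables fit behind the first level */
}

The AArch64 variant in the following patch works the same way, just
handing out 4KiB granules instead of 1KiB second-level tables.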




^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v2 23/34] ARM: mmu64: create alloc_pte()
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (21 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 22/34] ARM: mmu32: create alloc_pte() Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17 13:15   ` Ahmad Fatoum
  2023-05-17 13:17   ` Ahmad Fatoum
  2023-05-17  9:03 ` [PATCH v2 24/34] ARM: mmu: drop ttb argument Sascha Hauer
                   ` (10 subsequent siblings)
  33 siblings, 2 replies; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

This is a preparation for using two-level page tables in the PBL.
To do that we need a way to allocate page tables in the PBL. As malloc
is not available there, implement a function that allocates a page
table from the area in which we also place the TTB.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/cpu/mmu_64.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/arch/arm/cpu/mmu_64.c b/arch/arm/cpu/mmu_64.c
index 55ada960c5..3cc5b14a46 100644
--- a/arch/arm/cpu/mmu_64.c
+++ b/arch/arm/cpu/mmu_64.c
@@ -32,7 +32,20 @@ static void set_table(uint64_t *pt, uint64_t *table_addr)
 	*pt = val;
 }
 
-static uint64_t *create_table(void)
+#ifdef __PBL__
+static uint64_t *alloc_pte(void)
+{
+	static unsigned int idx;
+
+	idx++;
+
+	if (idx * GRANULE_SIZE >= ARM_EARLY_PAGETABLE_SIZE)
+		return NULL;
+
+	return (void *)ttb + idx * GRANULE_SIZE;
+}
+#else
+static uint64_t *alloc_pte(void)
 {
 	uint64_t *new_table = xmemalign(GRANULE_SIZE, GRANULE_SIZE);
 
@@ -41,6 +54,7 @@ static uint64_t *create_table(void)
 
 	return new_table;
 }
+#endif
 
 static __maybe_unused uint64_t *find_pte(uint64_t addr)
 {
@@ -81,7 +95,7 @@ static void split_block(uint64_t *pte, int level)
 	/* level describes the parent level, we need the child ones */
 	levelshift = level2shift(level + 1);
 
-	new_table = create_table();
+	new_table = alloc_pte();
 
 	for (i = 0; i < MAX_PTE_ENTRIES; i++) {
 		new_table[i] = old_pte | (i << levelshift);
@@ -183,7 +197,7 @@ void __mmu_init(bool mmu_on)
 	if (mmu_on)
 		mmu_disable();
 
-	ttb = create_table();
+	ttb = alloc_pte();
 	el = current_el();
 	set_ttbr_tcr_mair(el, (uint64_t)ttb, calc_tcr(el, BITS_PER_VA),
 			  MEMORY_ATTRIBUTES);
-- 
2.39.2




^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v2 24/34] ARM: mmu: drop ttb argument
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (22 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 23/34] ARM: mmu64: " Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17 13:23   ` Ahmad Fatoum
  2023-05-17  9:03 ` [PATCH v2 25/34] ARM: mmu: always do MMU initialization early when MMU is enabled Sascha Hauer
                   ` (9 subsequent siblings)
  33 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

No need to pass the TTB to the MMU code; the MMU code can itself call
arm_mem_ttb() to get the desired base.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/cpu/mmu_32.c      |  9 +++++----
 arch/arm/cpu/mmu_64.c      |  8 +++++---
 arch/arm/cpu/start.c       | 11 +++--------
 arch/arm/cpu/uncompress.c  |  7 ++-----
 arch/arm/include/asm/mmu.h |  3 +--
 5 files changed, 16 insertions(+), 22 deletions(-)

diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
index a82382ad1e..bef4a01670 100644
--- a/arch/arm/cpu/mmu_32.c
+++ b/arch/arm/cpu/mmu_32.c
@@ -533,10 +533,11 @@ static inline void map_region(unsigned long start, unsigned long size,
 	create_sections(ttb, start, start + size - 1, flags);
 }
 
-void mmu_early_enable(unsigned long membase, unsigned long memsize,
-		      unsigned long _ttb)
+void mmu_early_enable(unsigned long membase, unsigned long memsize)
 {
-	ttb = (uint32_t *)_ttb;
+	ttb = (uint32_t *)arm_mem_ttb(membase, membase + memsize);
+
+	pr_debug("enabling MMU, ttb @ 0x%p\n", ttb);
 
 	set_ttbr(ttb);
 
@@ -566,7 +567,7 @@ void mmu_early_enable(unsigned long membase, unsigned long memsize,
 	map_region((unsigned long)_stext, _etext - _stext, PMD_SECT_DEF_UNCACHED);
 
 	/* maps main memory as cachable */
-	map_region(membase, memsize, PMD_SECT_DEF_CACHED);
+	map_region(membase, memsize - OPTEE_SIZE, PMD_SECT_DEF_CACHED);
 
 	__mmu_cache_on();
 }
diff --git a/arch/arm/cpu/mmu_64.c b/arch/arm/cpu/mmu_64.c
index 3cc5b14a46..d32eecf144 100644
--- a/arch/arm/cpu/mmu_64.c
+++ b/arch/arm/cpu/mmu_64.c
@@ -292,10 +292,12 @@ static void early_create_sections(void *ttb, uint64_t virt, uint64_t phys,
 
 #define EARLY_BITS_PER_VA 39
 
-void mmu_early_enable(unsigned long membase, unsigned long memsize,
-		      unsigned long ttb)
+void mmu_early_enable(unsigned long membase, unsigned long memsize)
 {
 	int el;
+	unsigned long ttb = arm_mem_ttb(membase + memsize);
+
+	pr_debug("enabling MMU, ttb @ 0x%08lx\n", ttb);
 
 	/*
 	 * For the early code we only create level 1 pagetables which only
@@ -311,7 +313,7 @@ void mmu_early_enable(unsigned long membase, unsigned long memsize,
 	set_ttbr_tcr_mair(el, ttb, calc_tcr(el, EARLY_BITS_PER_VA), MEMORY_ATTRIBUTES);
 	early_create_sections((void *)ttb, 0, 0, 1UL << (EARLY_BITS_PER_VA - 1),
 			attrs_uncached_mem());
-	early_create_sections((void *)ttb, membase, membase, memsize, CACHED_MEM);
+	early_create_sections((void *)ttb, membase, membase, memsize - OPTEE_SIZE, CACHED_MEM);
 	tlb_invalidate();
 	isb();
 	set_cr(get_cr() | CR_M);
diff --git a/arch/arm/cpu/start.c b/arch/arm/cpu/start.c
index 87207822a0..165d2d94e6 100644
--- a/arch/arm/cpu/start.c
+++ b/arch/arm/cpu/start.c
@@ -216,14 +216,9 @@ __noreturn __no_sanitize_address void barebox_non_pbl_start(unsigned long membas
 
 	mem_malloc_init((void *)malloc_start, (void *)malloc_end - 1);
 
-	if (IS_ENABLED(CONFIG_MMU_EARLY)) {
-		unsigned long ttb = arm_mem_ttb(endmem);
-
-		if (!IS_ENABLED(CONFIG_PBL_IMAGE)) {
-			pr_debug("enabling MMU, ttb @ 0x%08lx\n", ttb);
-			arm_early_mmu_cache_invalidate();
-			mmu_early_enable(membase, memsize - OPTEE_SIZE, ttb);
-		}
+	if (IS_ENABLED(CONFIG_MMU_EARLY) && !IS_ENABLED(CONFIG_PBL_IMAGE)) {
+		arm_early_mmu_cache_invalidate();
+		mmu_early_enable(membase, memsize);
 	}
 
 	if (IS_ENABLED(CONFIG_BOOTM_OPTEE))
diff --git a/arch/arm/cpu/uncompress.c b/arch/arm/cpu/uncompress.c
index abaf36b68c..e471dd87f9 100644
--- a/arch/arm/cpu/uncompress.c
+++ b/arch/arm/cpu/uncompress.c
@@ -81,11 +81,8 @@ void __noreturn barebox_pbl_start(unsigned long membase, unsigned long memsize,
 
 	pr_debug("memory at 0x%08lx, size 0x%08lx\n", membase, memsize);
 
-	if (IS_ENABLED(CONFIG_MMU_EARLY)) {
-		unsigned long ttb = arm_mem_ttb(endmem);
-		pr_debug("enabling MMU, ttb @ 0x%08lx\n", ttb);
-		mmu_early_enable(membase, memsize - OPTEE_SIZE, ttb);
-	}
+	if (IS_ENABLED(CONFIG_MMU_EARLY))
+		mmu_early_enable(membase, memsize);
 
 	free_mem_ptr = arm_mem_early_malloc(endmem);
 	free_mem_end_ptr = arm_mem_early_malloc_end(endmem);
diff --git a/arch/arm/include/asm/mmu.h b/arch/arm/include/asm/mmu.h
index fd8e93f7a3..9d2fdcf365 100644
--- a/arch/arm/include/asm/mmu.h
+++ b/arch/arm/include/asm/mmu.h
@@ -56,8 +56,7 @@ void __dma_clean_range(unsigned long, unsigned long);
 void __dma_flush_range(unsigned long, unsigned long);
 void __dma_inv_range(unsigned long, unsigned long);
 
-void mmu_early_enable(unsigned long membase, unsigned long memsize,
-		      unsigned long ttb);
+void mmu_early_enable(unsigned long membase, unsigned long memsize);
 void mmu_early_disable(void);
 
 #endif /* __ASM_MMU_H */
-- 
2.39.2




^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v2 25/34] ARM: mmu: always do MMU initialization early when MMU is enabled
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (23 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 24/34] ARM: mmu: drop ttb argument Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17 13:29   ` Ahmad Fatoum
  2023-05-17  9:03 ` [PATCH v2 26/34] ARM: mmu32: Assume MMU is on Sascha Hauer
                   ` (8 subsequent siblings)
  33 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

Drop the CONFIG_MMU_EARLY option and make early MMU initialization
the default.

Doing so allows for some simplifications in the MMU code as we have
fewer code paths to care and think about.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/cpu/start.c      | 2 +-
 arch/arm/cpu/uncompress.c | 2 +-
 common/Kconfig            | 9 ---------
 3 files changed, 2 insertions(+), 11 deletions(-)

diff --git a/arch/arm/cpu/start.c b/arch/arm/cpu/start.c
index 165d2d94e6..2e987ec41d 100644
--- a/arch/arm/cpu/start.c
+++ b/arch/arm/cpu/start.c
@@ -216,7 +216,7 @@ __noreturn __no_sanitize_address void barebox_non_pbl_start(unsigned long membas
 
 	mem_malloc_init((void *)malloc_start, (void *)malloc_end - 1);
 
-	if (IS_ENABLED(CONFIG_MMU_EARLY) && !IS_ENABLED(CONFIG_PBL_IMAGE)) {
+	if (IS_ENABLED(CONFIG_MMU) && !IS_ENABLED(CONFIG_PBL_IMAGE)) {
 		arm_early_mmu_cache_invalidate();
 		mmu_early_enable(membase, memsize);
 	}
diff --git a/arch/arm/cpu/uncompress.c b/arch/arm/cpu/uncompress.c
index e471dd87f9..a481c4634d 100644
--- a/arch/arm/cpu/uncompress.c
+++ b/arch/arm/cpu/uncompress.c
@@ -81,7 +81,7 @@ void __noreturn barebox_pbl_start(unsigned long membase, unsigned long memsize,
 
 	pr_debug("memory at 0x%08lx, size 0x%08lx\n", membase, memsize);
 
-	if (IS_ENABLED(CONFIG_MMU_EARLY))
+	if (IS_ENABLED(CONFIG_MMU))
 		mmu_early_enable(membase, memsize);
 
 	free_mem_ptr = arm_mem_early_malloc(endmem);
diff --git a/common/Kconfig b/common/Kconfig
index ac3df75acb..c6008f125b 100644
--- a/common/Kconfig
+++ b/common/Kconfig
@@ -185,15 +185,6 @@ config MMU
 	  to enable the data cache which depends on the MMU. See Documentation/mmu.txt
 	  for further information.
 
-config MMU_EARLY
-	bool "Enable MMU early"
-	depends on ARM
-	depends on MMU
-	default y
-	help
-	  This enables the MMU during early startup. This speeds up things during startup
-	  of barebox, but may lead to harder to debug code. If unsure say yes here.
-
 config HAVE_CONFIGURABLE_TEXT_BASE
 	bool
 
-- 
2.39.2




^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v2 26/34] ARM: mmu32: Assume MMU is on
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (24 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 25/34] ARM: mmu: always do MMU initialization early when MMU is enabled Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17 13:36   ` Ahmad Fatoum
  2023-05-17  9:03 ` [PATCH v2 27/34] ARM: mmu32: Fix pmd_flags_to_pte() for ARMv4/5/6 Sascha Hauer
                   ` (7 subsequent siblings)
  33 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

As we now always enable the MMU during early initialization, we can
safely assume that the MMU is already enabled in __mmu_init() and
drop the code path which enables the MMU.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/cpu/mmu_32.c | 47 +++++++++----------------------------------
 1 file changed, 10 insertions(+), 37 deletions(-)

diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
index bef4a01670..7cd732580e 100644
--- a/arch/arm/cpu/mmu_32.c
+++ b/arch/arm/cpu/mmu_32.c
@@ -24,12 +24,6 @@
 #define PTRS_PER_PTE		(PGDIR_SIZE / PAGE_SIZE)
 #define ARCH_MAP_WRITECOMBINE	((unsigned)-1)
 
-/*
- * We have a 4GiB address space split into 1MiB sections, with each
- * section header taking 4 bytes
- */
-#define ARM_TTB_SIZE	(SZ_4G / SZ_1M * sizeof(u32))
-
 static uint32_t *ttb;
 
 /*
@@ -457,38 +451,19 @@ void __mmu_init(bool mmu_on)
 		pte_flags_uncached = PTE_FLAGS_UNCACHED_V4;
 	}
 
-	if (mmu_on) {
+	/* Clear unpredictable bits [13:0] */
+	ttb = (uint32_t *)(get_ttbr() & ~0x3fff);
+
+	if (!request_sdram_region("ttb", (unsigned long)ttb, SZ_16K))
 		/*
-		 * Early MMU code has already enabled the MMU. We assume a
-		 * flat 1:1 section mapping in this case.
+		 * This can mean that:
+		 * - the early MMU code has put the ttb into a place
+		 *   which we don't have inside our available memory
+		 * - Somebody else has occupied the ttb region which means
+		 *   the ttb will get corrupted.
 		 */
-		/* Clear unpredictable bits [13:0] */
-		ttb = (uint32_t *)(get_ttbr() & ~0x3fff);
-
-		if (!request_sdram_region("ttb", (unsigned long)ttb, SZ_16K))
-			/*
-			 * This can mean that:
-			 * - the early MMU code has put the ttb into a place
-			 *   which we don't have inside our available memory
-			 * - Somebody else has occupied the ttb region which means
-			 *   the ttb will get corrupted.
-			 */
-			pr_crit("Critical Error: Can't request SDRAM region for ttb at %p\n",
+		pr_crit("Critical Error: Can't request SDRAM region for ttb at %p\n",
 					ttb);
-	} else {
-		ttb = xmemalign(ARM_TTB_SIZE, ARM_TTB_SIZE);
-
-		set_ttbr(ttb);
-
-		/* For the XN bit to take effect, we can't be using DOMAIN_MANAGER. */
-		if (cpu_architecture() >= CPU_ARCH_ARMv7)
-			set_domain(DOMAIN_CLIENT);
-		else
-			set_domain(DOMAIN_MANAGER);
-
-		create_flat_mapping(ttb);
-		__mmu_cache_flush();
-	}
 
 	pr_debug("ttb: 0x%p\n", ttb);
 
@@ -499,8 +474,6 @@ void __mmu_init(bool mmu_on)
 				PMD_SECT_DEF_CACHED);
 		__mmu_cache_flush();
 	}
-
-	__mmu_cache_on();
 }
 
 /*
-- 
2.39.2




^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v2 27/34] ARM: mmu32: Fix pmd_flags_to_pte() for ARMv4/5/6
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (25 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 26/34] ARM: mmu32: Assume MMU is on Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17 13:39   ` Ahmad Fatoum
  2023-05-17  9:03 ` [PATCH v2 28/34] ARM: mmu32: Add pte_flags_to_pmd() Sascha Hauer
                   ` (6 subsequent siblings)
  33 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

pmd_flags_to_pte() assumed the ARMv7 page table format. On ARMv4/5/6
this has the effect that random bit values end up in the access
permission bits. It still works because the domain is configured as
manager in the DACR and thus the access permissions are ignored by the
MMU.
Nevertheless, fix this and take the CPU architecture into account when
translating the bits. Don't bother to translate the access permission
bits though; just hardcode them as PTE_SMALL_AP_UNO_SRW.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/cpu/mmu_32.c | 27 ++++++++++++++++-----------
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
index 7cd732580e..4abaab7d87 100644
--- a/arch/arm/cpu/mmu_32.c
+++ b/arch/arm/cpu/mmu_32.c
@@ -167,17 +167,22 @@ static u32 pmd_flags_to_pte(u32 pmd)
 		pte |= PTE_BUFFERABLE;
 	if (pmd & PMD_SECT_CACHEABLE)
 		pte |= PTE_CACHEABLE;
-	if (pmd & PMD_SECT_nG)
-		pte |= PTE_EXT_NG;
-	if (pmd & PMD_SECT_XN)
-		pte |= PTE_EXT_XN;
-
-	/* TEX[2:0] */
-	pte |= PTE_EXT_TEX((pmd >> 12) & 7);
-	/* AP[1:0] */
-	pte |= ((pmd >> 10) & 0x3) << 4;
-	/* AP[2] */
-	pte |= ((pmd >> 15) & 0x1) << 9;
+
+	if (cpu_architecture() >= CPU_ARCH_ARMv7) {
+		if (pmd & PMD_SECT_nG)
+			pte |= PTE_EXT_NG;
+		if (pmd & PMD_SECT_XN)
+			pte |= PTE_EXT_XN;
+
+		/* TEX[2:0] */
+		pte |= PTE_EXT_TEX((pmd >> 12) & 7);
+		/* AP[1:0] */
+		pte |= ((pmd >> 10) & 0x3) << 4;
+		/* AP[2] */
+		pte |= ((pmd >> 15) & 0x1) << 9;
+	} else {
+		pte |= PTE_SMALL_AP_UNO_SRW;
+	}
 
 	return pte;
 }
-- 
2.39.2




^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v2 28/34] ARM: mmu32: Add pte_flags_to_pmd()
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (26 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 27/34] ARM: mmu32: Fix pmd_flags_to_pte() for ARMv4/5/6 Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17 13:43   ` Ahmad Fatoum
  2023-05-17  9:03 ` [PATCH v2 29/34] ARM: mmu32: add get_pte_flags, get_pmd_flags Sascha Hauer
                   ` (5 subsequent siblings)
  33 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/cpu/mmu_32.c | 35 +++++++++++++++++++++++++++++------
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
index 4abaab7d87..0af89ac39c 100644
--- a/arch/arm/cpu/mmu_32.c
+++ b/arch/arm/cpu/mmu_32.c
@@ -187,30 +187,53 @@ static u32 pmd_flags_to_pte(u32 pmd)
 	return pte;
 }
 
+static u32 pte_flags_to_pmd(u32 pte)
+{
+	u32 pmd = 0;
+
+	if (pte & PTE_BUFFERABLE)
+		pmd |= PMD_SECT_BUFFERABLE;
+	if (pte & PTE_CACHEABLE)
+		pmd |= PMD_SECT_CACHEABLE;
+
+	if (cpu_architecture() >= CPU_ARCH_ARMv7) {
+		if (pte & PTE_EXT_NG)
+			pmd |= PMD_SECT_nG;
+		if (pte & PTE_EXT_XN)
+			pmd |= PMD_SECT_XN;
+
+		/* TEX[2:0] */
+		pmd |= ((pte >> 6) & 7) << 12;
+		/* AP[1:0] */
+		pmd |= ((pte >> 4) & 0x3) << 10;
+		/* AP[2] */
+		pmd |= ((pte >> 9) & 0x1) << 15;
+	} else {
+		pmd |= PMD_SECT_AP_WRITE | PMD_SECT_AP_READ;
+	}
+
+	return pmd;
+}
+
 int arch_remap_range(void *start, size_t size, unsigned flags)
 {
 	u32 addr = (u32)start;
 	u32 pte_flags;
-	u32 pgd_flags;
 
 	BUG_ON(!IS_ALIGNED(addr, PAGE_SIZE));
 
 	switch (flags) {
 	case MAP_CACHED:
 		pte_flags = pte_flags_cached;
-		pgd_flags = PMD_SECT_DEF_CACHED;
 		break;
 	case MAP_UNCACHED:
 		pte_flags = pte_flags_uncached;
-		pgd_flags = pgd_flags_uncached;
 		break;
 	case MAP_FAULT:
 		pte_flags = 0x0;
-		pgd_flags = 0x0;
 		break;
 	case ARCH_MAP_WRITECOMBINE:
 		pte_flags = pte_flags_wc;
-		pgd_flags = pgd_flags_wc;
 		break;
 	default:
 		return -EINVAL;
@@ -228,7 +251,7 @@ int arch_remap_range(void *start, size_t size, unsigned flags)
 			 * replace it with a section
 			 */
 			chunk = PGDIR_SIZE;
-			*pgd = addr | pgd_flags;
+			*pgd = addr | pte_flags_to_pmd(pte_flags) | PMD_TYPE_SECT;
 			dma_flush_range(pgd, sizeof(*pgd));
 		} else {
 			unsigned int num_ptes;
-- 
2.39.2
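
To make the bit shuffling in this and the previous patch easier to
follow, here is a self-contained worked example of the ARMv7 direction.
It mirrors the ARMv7 branch of pmd_flags_to_pte() under a hypothetical
name; the bit positions are those of the ARMv7 short-descriptor format
(section descriptor vs. small page) and the constants are defined
locally so the snippet compiles on its own.

#include <assert.h>
#include <stdint.h>

#define PMD_SECT_BUFFERABLE	(1u << 2)
#define PMD_SECT_CACHEABLE	(1u << 3)
#define PMD_SECT_XN		(1u << 4)
#define PMD_SECT_nG		(1u << 17)

#define PTE_BUFFERABLE		(1u << 2)
#define PTE_CACHEABLE		(1u << 3)
#define PTE_EXT_XN		(1u << 0)
#define PTE_EXT_NG		(1u << 11)

static uint32_t pmd_flags_to_pte_v7(uint32_t pmd)
{
	uint32_t pte = 0;

	if (pmd & PMD_SECT_BUFFERABLE)
		pte |= PTE_BUFFERABLE;
	if (pmd & PMD_SECT_CACHEABLE)
		pte |= PTE_CACHEABLE;
	if (pmd & PMD_SECT_nG)
		pte |= PTE_EXT_NG;
	if (pmd & PMD_SECT_XN)
		pte |= PTE_EXT_XN;

	pte |= ((pmd >> 12) & 0x7) << 6;  /* TEX[2:0]: section [14:12] -> page [8:6] */
	pte |= ((pmd >> 10) & 0x3) << 4;  /* AP[1:0]:  section [11:10] -> page [5:4] */
	pte |= ((pmd >> 15) & 0x1) << 9;  /* AP[2]:    section [15]    -> page [9]   */

	return pte;
}

int main(void)
{
	/* cacheable, bufferable, non-executable section, AP[1:0] = 0b11, TEX = 0b001 */
	uint32_t pmd = PMD_SECT_CACHEABLE | PMD_SECT_BUFFERABLE | PMD_SECT_XN |
		       (0x3u << 10) | (0x1u << 12);

	uint32_t pte = pmd_flags_to_pte_v7(pmd);

	assert(pte == (PTE_CACHEABLE | PTE_BUFFERABLE | PTE_EXT_XN |
		       (0x3u << 4) | (0x1u << 6)));

	return 0;
}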




^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v2 29/34] ARM: mmu32: add get_pte_flags, get_pmd_flags
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (27 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 28/34] ARM: mmu32: Add pte_flags_to_pmd() Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17 13:46   ` Ahmad Fatoum
  2023-05-17  9:03 ` [PATCH v2 30/34] ARM: mmu32: move functions into c file Sascha Hauer
                   ` (4 subsequent siblings)
  33 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

The MMU code has several variables containing the PTE/PMD flag values
for the different mapping types. These variables only contain the
correct values after they have been initialized, which makes the code
a bit hard to follow when it is used in both PBL and barebox proper.

Instead of using variables, calculate the values when they are needed.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/cpu/mmu_32.c | 82 +++++++++++++++++++++----------------------
 1 file changed, 41 insertions(+), 41 deletions(-)

diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
index 0af89ac39c..829139574c 100644
--- a/arch/arm/cpu/mmu_32.c
+++ b/arch/arm/cpu/mmu_32.c
@@ -57,11 +57,6 @@ static inline void tlb_invalidate(void)
  * PTE flags to set cached and uncached areas.
  * This will be determined at runtime.
  */
-static uint32_t pte_flags_cached;
-static uint32_t pte_flags_wc;
-static uint32_t pte_flags_uncached;
-static uint32_t pgd_flags_wc;
-static uint32_t pgd_flags_uncached;
 
 #define PTE_MASK ((1 << 12) - 1)
 
@@ -215,29 +210,48 @@ static u32 pte_flags_to_pmd(u32 pte)
 	return pmd;
 }
 
-int arch_remap_range(void *start, size_t size, unsigned flags)
+static uint32_t get_pte_flags(int map_type)
+{
+	if (cpu_architecture() >= CPU_ARCH_ARMv7) {
+		switch (map_type) {
+		case MAP_CACHED:
+			return PTE_FLAGS_CACHED_V7;
+		case MAP_UNCACHED:
+			return PTE_FLAGS_UNCACHED_V7;
+		case ARCH_MAP_WRITECOMBINE:
+			return PTE_FLAGS_WC_V7;
+		case MAP_FAULT:
+		default:
+			return 0x0;
+		}
+	} else {
+		switch (map_type) {
+		case MAP_CACHED:
+			return PTE_FLAGS_CACHED_V4;
+		case MAP_UNCACHED:
+		case ARCH_MAP_WRITECOMBINE:
+			return PTE_FLAGS_UNCACHED_V4;
+		case MAP_FAULT:
+		default:
+			return 0x0;
+		}
+	}
+}
+
+static uint32_t get_pmd_flags(int map_type)
+{
+	return pte_flags_to_pmd(get_pte_flags(map_type));
+}
+
+int arch_remap_range(void *start, size_t size, unsigned map_type)
 {
 	u32 addr = (u32)start;
-	u32 pte_flags;
+	u32 pte_flags, pmd_flags;
 
 	BUG_ON(!IS_ALIGNED(addr, PAGE_SIZE));
 
-	switch (flags) {
-	case MAP_CACHED:
-		pte_flags = pte_flags_cached;
-		break;
-	case MAP_UNCACHED:
-		pte_flags = pte_flags_uncached;
-		break;
-	case MAP_FAULT:
-		pte_flags = 0x0;
-		break;
-	case ARCH_MAP_WRITECOMBINE:
-		pte_flags = pte_flags_wc;
-		break;
-	default:
-		return -EINVAL;
-	}
+	pte_flags = get_pte_flags(map_type);
+	pmd_flags = pte_flags_to_pmd(pte_flags);
 
 	while (size) {
 		const bool pgdir_size_aligned = IS_ALIGNED(addr, PGDIR_SIZE);
@@ -251,7 +265,7 @@ int arch_remap_range(void *start, size_t size, unsigned flags)
 			 * replace it with a section
 			 */
 			chunk = PGDIR_SIZE;
-			*pgd = addr | pte_flags_to_pmd(pte_flags) | PMD_TYPE_SECT;
+			*pgd = addr | pmd_flags | PMD_TYPE_SECT;
 			dma_flush_range(pgd, sizeof(*pgd));
 		} else {
 			unsigned int num_ptes;
@@ -309,7 +323,7 @@ void *map_io_sections(unsigned long phys, void *_start, size_t size)
 	unsigned long start = (unsigned long)_start, sec;
 
 	for (sec = start; sec < start + size; sec += PGDIR_SIZE, phys += PGDIR_SIZE)
-		ttb[pgd_index(sec)] = phys | pgd_flags_uncached;
+		ttb[pgd_index(sec)] = phys | get_pmd_flags(MAP_UNCACHED);
 
 	dma_flush_range(ttb, 0x4000);
 	tlb_invalidate();
@@ -350,9 +364,9 @@ static void create_vector_table(unsigned long adr)
 		vectors = xmemalign(PAGE_SIZE, PAGE_SIZE);
 		pr_debug("Creating vector table, virt = 0x%p, phys = 0x%08lx\n",
 			 vectors, adr);
-		arm_create_pte(adr, pte_flags_uncached);
+		arm_create_pte(adr, get_pte_flags(MAP_UNCACHED));
 		pte = find_pte(adr);
-		*pte = (u32)vectors | PTE_TYPE_SMALL | pte_flags_cached;
+		*pte = (u32)vectors | PTE_TYPE_SMALL | get_pte_flags(MAP_CACHED);
 	}
 
 	arm_fixup_vectors();
@@ -465,20 +479,6 @@ void __mmu_init(bool mmu_on)
 {
 	struct memory_bank *bank;
 
-	if (cpu_architecture() >= CPU_ARCH_ARMv7) {
-		pte_flags_cached = PTE_FLAGS_CACHED_V7;
-		pte_flags_wc = PTE_FLAGS_WC_V7;
-		pgd_flags_wc = PGD_FLAGS_WC_V7;
-		pgd_flags_uncached = PGD_FLAGS_UNCACHED_V7;
-		pte_flags_uncached = PTE_FLAGS_UNCACHED_V7;
-	} else {
-		pte_flags_cached = PTE_FLAGS_CACHED_V4;
-		pte_flags_wc = PTE_FLAGS_UNCACHED_V4;
-		pgd_flags_wc = PMD_SECT_DEF_UNCACHED;
-		pgd_flags_uncached = PMD_SECT_DEF_UNCACHED;
-		pte_flags_uncached = PTE_FLAGS_UNCACHED_V4;
-	}
-
 	/* Clear unpredictable bits [13:0] */
 	ttb = (uint32_t *)(get_ttbr() & ~0x3fff);
 
-- 
2.39.2




^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v2 30/34] ARM: mmu32: move functions into c file
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (28 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 29/34] ARM: mmu32: add get_pte_flags, get_pmd_flags Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17 13:48   ` Ahmad Fatoum
  2023-05-17  9:03 ` [PATCH v2 31/34] ARM: mmu32: read TTB value from register Sascha Hauer
                   ` (3 subsequent siblings)
  33 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

Move create_flat_mapping() and create_sections() into the C file
rather than keeping them as static inline functions in the header file.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/cpu/mmu_32.c | 19 +++++++++++++++++++
 arch/arm/cpu/mmu_32.h | 20 --------------------
 2 files changed, 19 insertions(+), 20 deletions(-)

diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
index 829139574c..0762bd55a3 100644
--- a/arch/arm/cpu/mmu_32.c
+++ b/arch/arm/cpu/mmu_32.c
@@ -318,6 +318,25 @@ int arch_remap_range(void *start, size_t size, unsigned map_type)
 	return 0;
 }
 
+static void create_sections(uint32_t *ttb, unsigned long first,
+			    unsigned long last, unsigned int flags)
+{
+	unsigned long ttb_start = pgd_index(first);
+	unsigned long ttb_end = pgd_index(last) + 1;
+	unsigned int i, addr = first;
+
+	for (i = ttb_start; i < ttb_end; i++) {
+		ttb[i] = addr | flags;
+		addr += PGDIR_SIZE;
+	}
+}
+
+static void create_flat_mapping(uint32_t *ttb)
+{
+	/* create a flat mapping using 1MiB sections */
+	create_sections(ttb, 0, 0xffffffff, attrs_uncached_mem());
+}
+
 void *map_io_sections(unsigned long phys, void *_start, size_t size)
 {
 	unsigned long start = (unsigned long)_start, sec;
diff --git a/arch/arm/cpu/mmu_32.h b/arch/arm/cpu/mmu_32.h
index 1499b70dd6..607d9e8608 100644
--- a/arch/arm/cpu/mmu_32.h
+++ b/arch/arm/cpu/mmu_32.h
@@ -56,20 +56,6 @@ static inline void set_domain(unsigned val)
 	asm volatile ("mcr  p15,0,%0,c3,c0,0" : : "r"(val) /*:*/);
 }
 
-static inline void
-create_sections(uint32_t *ttb, unsigned long first,
-		unsigned long last, unsigned int flags)
-{
-	unsigned long ttb_start = pgd_index(first);
-	unsigned long ttb_end = pgd_index(last) + 1;
-	unsigned int i, addr = first;
-
-	for (i = ttb_start; i < ttb_end; i++) {
-		ttb[i] = addr | flags;
-		addr += PGDIR_SIZE;
-	}
-}
-
 #define PMD_SECT_DEF_UNCACHED (PMD_SECT_AP_WRITE | PMD_SECT_AP_READ | PMD_TYPE_SECT)
 #define PMD_SECT_DEF_CACHED (PMD_SECT_WB | PMD_SECT_DEF_UNCACHED)
 
@@ -83,10 +69,4 @@ static inline unsigned long attrs_uncached_mem(void)
 	return flags;
 }
 
-static inline void create_flat_mapping(uint32_t *ttb)
-{
-	/* create a flat mapping using 1MiB sections */
-	create_sections(ttb, 0, 0xffffffff, attrs_uncached_mem());
-}
-
 #endif /* __ARM_MMU_H */
-- 
2.39.2




^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v2 31/34] ARM: mmu32: read TTB value from register
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (29 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 30/34] ARM: mmu32: move functions into c file Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17 13:58   ` Ahmad Fatoum
  2023-05-17  9:03 ` [PATCH v2 32/34] ARM: mmu32: Use pages for early MMU setup Sascha Hauer
                   ` (2 subsequent siblings)
  33 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

Instead of relying on a variable for the location of the TTB, which we
have to initialize in both PBL and barebox proper, just read the value
back from the hardware register.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/cpu/mmu_32.c | 41 ++++++++++++++++++++---------------------
 1 file changed, 20 insertions(+), 21 deletions(-)

diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
index 0762bd55a3..785b20c7fd 100644
--- a/arch/arm/cpu/mmu_32.c
+++ b/arch/arm/cpu/mmu_32.c
@@ -24,7 +24,11 @@
 #define PTRS_PER_PTE		(PGDIR_SIZE / PAGE_SIZE)
 #define ARCH_MAP_WRITECOMBINE	((unsigned)-1)
 
-static uint32_t *ttb;
+static inline uint32_t *get_ttb(void)
+{
+	/* Clear unpredictable bits [13:0] */
+	return (uint32_t *)(get_ttbr() & ~0x3fff);
+}
 
 /*
  * Do it the simple way for now and invalidate the entire
@@ -77,7 +81,7 @@ static uint32_t *alloc_pte(void)
 	if (idx * PTE_SIZE >= ARM_EARLY_PAGETABLE_SIZE)
 		return NULL;
 
-	return (void *)ttb + idx * PTE_SIZE;
+	return get_ttb() + idx * PTE_SIZE;
 }
 #else
 static uint32_t *alloc_pte(void)
@@ -89,9 +93,7 @@ static uint32_t *alloc_pte(void)
 static u32 *find_pte(unsigned long adr)
 {
 	u32 *table;
-
-	if (!ttb)
-		arm_mmu_not_initialized_error();
+	uint32_t *ttb = get_ttb();
 
 	if (!pgd_type_table(ttb[pgd_index(adr)]))
 		return NULL;
@@ -130,6 +132,7 @@ void dma_inv_range(void *ptr, size_t size)
  */
 static u32 *arm_create_pte(unsigned long virt, uint32_t flags)
 {
+	uint32_t *ttb = get_ttb();
 	u32 *table;
 	int i, ttb_idx;
 
@@ -137,9 +140,6 @@ static u32 *arm_create_pte(unsigned long virt, uint32_t flags)
 
 	table = alloc_pte();
 
-	if (!ttb)
-		arm_mmu_not_initialized_error();
-
 	ttb_idx = pgd_index(virt);
 
 	for (i = 0; i < PTRS_PER_PTE; i++) {
@@ -247,6 +247,7 @@ int arch_remap_range(void *start, size_t size, unsigned map_type)
 {
 	u32 addr = (u32)start;
 	u32 pte_flags, pmd_flags;
+	uint32_t *ttb = get_ttb();
 
 	BUG_ON(!IS_ALIGNED(addr, PAGE_SIZE));
 
@@ -318,9 +319,10 @@ int arch_remap_range(void *start, size_t size, unsigned map_type)
 	return 0;
 }
 
-static void create_sections(uint32_t *ttb, unsigned long first,
-			    unsigned long last, unsigned int flags)
+static void create_sections(unsigned long first, unsigned long last,
+			    unsigned int flags)
 {
+	uint32_t *ttb = get_ttb();
 	unsigned long ttb_start = pgd_index(first);
 	unsigned long ttb_end = pgd_index(last) + 1;
 	unsigned int i, addr = first;
@@ -331,15 +333,16 @@ static void create_sections(uint32_t *ttb, unsigned long first,
 	}
 }
 
-static void create_flat_mapping(uint32_t *ttb)
+static inline void create_flat_mapping(void)
 {
 	/* create a flat mapping using 1MiB sections */
-	create_sections(ttb, 0, 0xffffffff, attrs_uncached_mem());
+	create_sections(0, 0xffffffff, attrs_uncached_mem());
 }
 
 void *map_io_sections(unsigned long phys, void *_start, size_t size)
 {
 	unsigned long start = (unsigned long)_start, sec;
+	uint32_t *ttb = get_ttb();
 
 	for (sec = start; sec < start + size; sec += PGDIR_SIZE, phys += PGDIR_SIZE)
 		ttb[pgd_index(sec)] = phys | get_pmd_flags(MAP_UNCACHED);
@@ -497,9 +500,7 @@ static void vectors_init(void)
 void __mmu_init(bool mmu_on)
 {
 	struct memory_bank *bank;
-
-	/* Clear unpredictable bits [13:0] */
-	ttb = (uint32_t *)(get_ttbr() & ~0x3fff);
+	uint32_t *ttb = get_ttb();
 
 	if (!request_sdram_region("ttb", (unsigned long)ttb, SZ_16K))
 		/*
@@ -517,7 +518,7 @@ void __mmu_init(bool mmu_on)
 	vectors_init();
 
 	for_each_memory_bank(bank) {
-		create_sections(ttb, bank->start, bank->start + bank->size - 1,
+		create_sections(bank->start, bank->start + bank->size - 1,
 				PMD_SECT_DEF_CACHED);
 		__mmu_cache_flush();
 	}
@@ -541,8 +542,6 @@ void *dma_alloc_writecombine(size_t size, dma_addr_t *dma_handle)
 	return dma_alloc_map(size, dma_handle, ARCH_MAP_WRITECOMBINE);
 }
 
-static uint32_t *ttb;
-
 static inline void map_region(unsigned long start, unsigned long size,
 			      uint64_t flags)
 
@@ -550,12 +549,12 @@ static inline void map_region(unsigned long start, unsigned long size,
 	start = ALIGN_DOWN(start, SZ_1M);
 	size  = ALIGN(size, SZ_1M);
 
-	create_sections(ttb, start, start + size - 1, flags);
+	create_sections(start, start + size - 1, flags);
 }
 
 void mmu_early_enable(unsigned long membase, unsigned long memsize)
 {
-	ttb = (uint32_t *)arm_mem_ttb(membase, membase + memsize);
+	uint32_t *ttb = (uint32_t *)arm_mem_ttb(membase + memsize);
 
 	pr_debug("enabling MMU, ttb @ 0x%p\n", ttb);
 
@@ -571,7 +570,7 @@ void mmu_early_enable(unsigned long membase, unsigned long memsize)
 	 * This marks the whole address space as uncachable as well as
 	 * unexecutable if possible
 	 */
-	create_flat_mapping(ttb);
+	create_flat_mapping();
 
 	/*
 	 * There can be SoCs that have a section shared between device memory
-- 
2.39.2
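
The underlying idea is that TTBR0 always knows where the table lives, so
nothing needs to be tracked in a variable. A minimal sketch of such a
read on ARMv7 is shown below; it assumes the usual CP15 c2 encoding for
TTBR0 (barebox wraps this in get_ttbr()), and the masking of bits [13:0]
is the same as in the patch.

#include <stdint.h>

/*
 * Read TTBR0 and strip the attribute/reserved bits [13:0] to recover
 * the 16KiB-aligned translation table base (ARMv7, CP15 c2, c0, 0).
 */
static inline uint32_t *ttb_from_ttbr0(void)
{
	uint32_t ttbr;

	asm volatile("mrc p15, 0, %0, c2, c0, 0" : "=r"(ttbr));

	return (uint32_t *)(ttbr & ~0x3fffu);
}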




^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v2 32/34] ARM: mmu32: Use pages for early MMU setup
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (30 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 31/34] ARM: mmu32: read TTB value from register Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17 14:21   ` Ahmad Fatoum
  2023-05-17  9:03 ` [PATCH v2 33/34] ARM: mmu32: Skip reserved ranges during initialization Sascha Hauer
  2023-05-17  9:03 ` [PATCH v2 34/34] ARM: mmu64: Use two level pagetables in early code Sascha Hauer
  33 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

Up to now we use 1MiB sections to set up the page tables in the PBL.
There are two places where this leads to problems. The first is
OP-TEE: we have to map the OP-TEE area with PTE_EXT_XN to prevent the
instruction prefetcher from speculating into that area, and with the
current section mapping we have to align OPTEE_SIZE to 1MiB
boundaries. The second problem comes with the SRAM the PBL might be
running from. This SRAM has to be mapped executable, but at the same
time the surrounding areas should be mapped non-executable, which is
not always possible with 1MiB mapping granularity.

We now have everything in place to use two-level page tables from the
PBL, so use arch_remap_range() for the problematic cases.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/cpu/mmu_32.c | 31 +++++++------------------------
 1 file changed, 7 insertions(+), 24 deletions(-)

diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
index 785b20c7fd..705d27a045 100644
--- a/arch/arm/cpu/mmu_32.c
+++ b/arch/arm/cpu/mmu_32.c
@@ -111,8 +111,10 @@ void dma_flush_range(void *ptr, size_t size)
 	unsigned long end = start + size;
 
 	__dma_flush_range(start, end);
+#ifndef __PBL__
 	if (outer_cache.flush_range)
 		outer_cache.flush_range(start, end);
+#endif
 }
 
 void dma_inv_range(void *ptr, size_t size)
@@ -120,8 +122,10 @@ void dma_inv_range(void *ptr, size_t size)
 	unsigned long start = (unsigned long)ptr;
 	unsigned long end = start + size;
 
+#ifndef __PBL__
 	if (outer_cache.inv_range)
 		outer_cache.inv_range(start, end);
+#endif
 	__dma_inv_range(start, end);
 }
 
@@ -542,16 +546,6 @@ void *dma_alloc_writecombine(size_t size, dma_addr_t *dma_handle)
 	return dma_alloc_map(size, dma_handle, ARCH_MAP_WRITECOMBINE);
 }
 
-static inline void map_region(unsigned long start, unsigned long size,
-			      uint64_t flags)
-
-{
-	start = ALIGN_DOWN(start, SZ_1M);
-	size  = ALIGN(size, SZ_1M);
-
-	create_sections(start, start + size - 1, flags);
-}
-
 void mmu_early_enable(unsigned long membase, unsigned long memsize)
 {
 	uint32_t *ttb = (uint32_t *)arm_mem_ttb(membase + memsize);
@@ -572,21 +566,10 @@ void mmu_early_enable(unsigned long membase, unsigned long memsize)
 	 */
 	create_flat_mapping();
 
-	/*
-	 * There can be SoCs that have a section shared between device memory
-	 * and the on-chip RAM hosting the PBL. Thus mark this section
-	 * uncachable, but executable.
-	 * On such SoCs, executing from OCRAM could cause the instruction
-	 * prefetcher to speculatively access that device memory, triggering
-	 * potential errant behavior.
-	 *
-	 * If your SoC has such a memory layout, you should rewrite the code
-	 * here to map the OCRAM page-wise.
-	 */
-	map_region((unsigned long)_stext, _etext - _stext, PMD_SECT_DEF_UNCACHED);
-
 	/* maps main memory as cachable */
-	map_region(membase, memsize - OPTEE_SIZE, PMD_SECT_DEF_CACHED);
+	arch_remap_range((void *)membase, memsize - OPTEE_SIZE, MAP_CACHED);
+	arch_remap_range((void *)membase + memsize - OPTEE_SIZE, OPTEE_SIZE, MAP_UNCACHED);
+	arch_remap_range(_stext, PAGE_ALIGN(_etext - _stext), MAP_CACHED);
 
 	__mmu_cache_on();
 }
-- 
2.39.2




^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v2 33/34] ARM: mmu32: Skip reserved ranges during initialization
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (31 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 32/34] ARM: mmu32: Use pages for early MMU setup Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  2023-05-17 14:43   ` Ahmad Fatoum
  2023-05-17  9:03 ` [PATCH v2 34/34] ARM: mmu64: Use two level pagetables in early code Sascha Hauer
  33 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

The early MMU code now uses pages to map the OP-TEE area
non-executable. This mapping would be overwritten with sections in
barebox proper. Refrain from doing so by using arch_remap_range() and
bypassing the reserved areas.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/cpu/mmu_32.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
index 705d27a045..47711bed35 100644
--- a/arch/arm/cpu/mmu_32.c
+++ b/arch/arm/cpu/mmu_32.c
@@ -522,9 +522,17 @@ void __mmu_init(bool mmu_on)
 	vectors_init();
 
 	for_each_memory_bank(bank) {
-		create_sections(bank->start, bank->start + bank->size - 1,
-				PMD_SECT_DEF_CACHED);
-		__mmu_cache_flush();
+		struct resource *rsv;
+		resource_size_t pos;
+
+		pos = bank->start;
+
+		for_each_reserved_region(bank, rsv) {
+			arch_remap_range((void *)pos, rsv->start - pos, MAP_CACHED);
+			pos = rsv->end + 1;
+		}
+
+		arch_remap_range((void *)pos, bank->start + bank->size - pos, MAP_CACHED);
 	}
 }
 
-- 
2.39.2
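
The loop above is a simple interval walk: map everything in a bank
cached except the reserved regions. Here is a self-contained sketch of
the same walk with plain integers standing in for struct resource; the
addresses are invented for the example and the reserved list is assumed
to be sorted and non-overlapping.

#include <stdio.h>

struct range {
	unsigned long start, end;	/* end is inclusive, as with struct resource */
};

static void map_cached(unsigned long start, unsigned long size)
{
	if (size)
		printf("map cached: 0x%08lx + 0x%08lx\n", start, size);
}

static void map_bank(struct range bank, const struct range *rsv, int nrsv)
{
	unsigned long pos = bank.start;
	int i;

	for (i = 0; i < nrsv; i++) {
		map_cached(pos, rsv[i].start - pos);	/* gap before the reservation */
		pos = rsv[i].end + 1;			/* continue behind it */
	}

	map_cached(pos, bank.end - pos + 1);		/* tail of the bank */
}

int main(void)
{
	struct range bank = { 0x40000000, 0x7fffffff };	/* 1GiB of SDRAM */
	struct range rsv[] = {
		{ 0x7fe00000, 0x7fffffff },		/* e.g. OP-TEE at the end */
	};

	map_bank(bank, rsv, 1);

	return 0;
}

The reserved regions themselves are simply skipped, so they keep
whatever mapping the early code gave them.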




^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v2 34/34] ARM: mmu64: Use two level pagetables in early code
  2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
                   ` (32 preceding siblings ...)
  2023-05-17  9:03 ` [PATCH v2 33/34] ARM: mmu32: Skip reserved ranges during initialization Sascha Hauer
@ 2023-05-17  9:03 ` Sascha Hauer
  33 siblings, 0 replies; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17  9:03 UTC (permalink / raw)
  To: Barebox List

So far we used 1GiB-sized sections in the early MMU setup. This has
the disadvantage that we can't use the MMU in early code when we
require a finer granularity. Rockchip, for example, keeps TF-A code
in low memory, so the early code just skipped MMU initialization there.
Also we can't properly map the OP-TEE space at the end of SDRAM
non-executable.

With this patch we now use two-level page tables and can map with 4KiB
granularity.

The MMU setup in barebox proper changes as well. Instead of disabling
the MMU for reconfiguration we can now keep the MMU enabled and just
add the mappings for SDRAM banks not known to the early code.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 arch/arm/cpu/mmu_64.c | 97 +++++++++----------------------------------
 1 file changed, 20 insertions(+), 77 deletions(-)

diff --git a/arch/arm/cpu/mmu_64.c b/arch/arm/cpu/mmu_64.c
index d32eecf144..2f9b5098a3 100644
--- a/arch/arm/cpu/mmu_64.c
+++ b/arch/arm/cpu/mmu_64.c
@@ -22,7 +22,10 @@
 
 #include "mmu_64.h"
 
-static uint64_t *ttb;
+static uint64_t *get_ttb(void)
+{
+	return (uint64_t *)get_ttbr(current_el());
+}
 
 static void set_table(uint64_t *pt, uint64_t *table_addr)
 {
@@ -42,7 +45,7 @@ static uint64_t *alloc_pte(void)
 	if (idx * GRANULE_SIZE >= ARM_EARLY_PAGETABLE_SIZE)
 		return NULL;
 
-	return (void *)ttb + idx * GRANULE_SIZE;
+	return get_ttb() + idx * GRANULE_SIZE;
 }
 #else
 static uint64_t *alloc_pte(void)
@@ -63,7 +66,7 @@ static __maybe_unused uint64_t *find_pte(uint64_t addr)
 	uint64_t idx;
 	int i;
 
-	pte = ttb;
+	pte = get_ttb();
 
 	for (i = 0; i < 4; i++) {
 		block_shift = level2shift(i);
@@ -112,6 +115,7 @@ static void split_block(uint64_t *pte, int level)
 static void create_sections(uint64_t virt, uint64_t phys, uint64_t size,
 			    uint64_t attr)
 {
+	uint64_t *ttb = get_ttb();
 	uint64_t block_size;
 	uint64_t block_shift;
 	uint64_t *pte;
@@ -121,9 +125,6 @@ static void create_sections(uint64_t virt, uint64_t phys, uint64_t size,
 	uint64_t type;
 	int level;
 
-	if (!ttb)
-		arm_mmu_not_initialized_error();
-
 	addr = virt;
 
 	attr &= ~PTE_TYPE_MASK;
@@ -192,37 +193,23 @@ static void mmu_enable(void)
 void __mmu_init(bool mmu_on)
 {
 	struct memory_bank *bank;
-	unsigned int el;
-
-	if (mmu_on)
-		mmu_disable();
-
-	ttb = alloc_pte();
-	el = current_el();
-	set_ttbr_tcr_mair(el, (uint64_t)ttb, calc_tcr(el, BITS_PER_VA),
-			  MEMORY_ATTRIBUTES);
 
-	pr_debug("ttb: 0x%p\n", ttb);
-
-	/* create a flat mapping */
-	arch_remap_range(0, 1UL << (BITS_PER_VA - 1), MAP_UNCACHED);
-
-	/* Map sdram cached. */
 	for_each_memory_bank(bank) {
 		struct resource *rsv;
+		resource_size_t pos;
 
-		arch_remap_range((void *)bank->start, bank->size, MAP_CACHED);
+		pos = bank->start;
 
 		for_each_reserved_region(bank, rsv) {
-			arch_remap_range((void *)resource_first_page(rsv),
-					 resource_count_pages(rsv), MAP_UNCACHED);
+			arch_remap_range((void *)pos, rsv->start - pos, MAP_CACHED);
+			pos = rsv->end + 1;
 		}
+
+		arch_remap_range((void *)pos, bank->start + bank->size - pos, MAP_CACHED);
 	}
 
 	/* Make zero page faulting to catch NULL pointer derefs */
 	zero_page_faulting();
-
-	mmu_enable();
 }
 
 void mmu_disable(void)
@@ -256,42 +243,6 @@ void dma_flush_range(void *ptr, size_t size)
 	v8_flush_dcache_range(start, end);
 }
 
-static void early_create_sections(void *ttb, uint64_t virt, uint64_t phys,
-				  uint64_t size, uint64_t attr)
-{
-	uint64_t block_size;
-	uint64_t block_shift;
-	uint64_t *pte;
-	uint64_t idx;
-	uint64_t addr;
-	uint64_t *table;
-
-	addr = virt;
-
-	attr &= ~PTE_TYPE_MASK;
-
-	table = ttb;
-
-	while (1) {
-		block_shift = level2shift(1);
-		idx = (addr & level2mask(1)) >> block_shift;
-		block_size = (1ULL << block_shift);
-
-		pte = table + idx;
-
-		*pte = phys | attr | PTE_TYPE_BLOCK;
-
-		if (size < block_size)
-			break;
-
-		addr += block_size;
-		phys += block_size;
-		size -= block_size;
-	}
-}
-
-#define EARLY_BITS_PER_VA 39
-
 void mmu_early_enable(unsigned long membase, unsigned long memsize)
 {
 	int el;
@@ -299,24 +250,16 @@ void mmu_early_enable(unsigned long membase, unsigned long memsize)
 
 	pr_debug("enabling MMU, ttb @ 0x%08lx\n", ttb);
 
-	/*
-	 * For the early code we only create level 1 pagetables which only
-	 * allow for a 1GiB granularity. If our membase is not aligned to that
-	 * bail out without enabling the MMU.
-	 */
-	if (membase & ((1ULL << level2shift(1)) - 1))
-		return;
+	el = current_el();
+	set_ttbr_tcr_mair(el, ttb, calc_tcr(el, BITS_PER_VA), MEMORY_ATTRIBUTES);
 
 	memset((void *)ttb, 0, GRANULE_SIZE);
 
-	el = current_el();
-	set_ttbr_tcr_mair(el, ttb, calc_tcr(el, EARLY_BITS_PER_VA), MEMORY_ATTRIBUTES);
-	early_create_sections((void *)ttb, 0, 0, 1UL << (EARLY_BITS_PER_VA - 1),
-			attrs_uncached_mem());
-	early_create_sections((void *)ttb, membase, membase, memsize - OPTEE_SIZE, CACHED_MEM);
-	tlb_invalidate();
-	isb();
-	set_cr(get_cr() | CR_M);
+	arch_remap_range(0, 1UL << (BITS_PER_VA - 1), MAP_UNCACHED);
+	arch_remap_range((void *)membase, memsize - OPTEE_SIZE, MAP_CACHED);
+	arch_remap_range((void *)membase + memsize - OPTEE_SIZE, OPTEE_SIZE, MAP_FAULT);
+
+	mmu_enable();
 }
 
 void mmu_early_disable(void)
-- 
2.39.2




^ permalink raw reply	[flat|nested] 68+ messages in thread
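
As background for the 4KiB granularity mentioned in patch 34/34: with the
4KiB translation granule used here, every level of table resolves 9 address
bits (512 entries of 8 bytes), giving block/page sizes of 1GiB, 2MiB and 4KiB
at levels 1, 2 and 3. A minimal sketch of the per-level index calculation
(background only, not code from the series; the function name is made up):

	/* index of @addr within the table at @level, 4KiB granule */
	static unsigned int level_index(uint64_t addr, int level)
	{
		unsigned int shift = 12 + 9 * (3 - level);	/* 39, 30, 21, 12 */

		return (addr >> shift) & 0x1ff;			/* 512 entries per table */
	}

This is also why the old 1GiB-blocks-only early mapping required membase to be
1GiB aligned, while the new code can descend to smaller blocks where needed.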

* Re: [PATCH v2 01/34] ARM: remove unused membase argument
  2023-05-17  9:03 ` [PATCH v2 01/34] ARM: remove unused membase argument Sascha Hauer
@ 2023-05-17 12:45   ` Ahmad Fatoum
  0 siblings, 0 replies; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 12:45 UTC (permalink / raw)
  To: Sascha Hauer, Barebox List

On 17.05.23 11:03, Sascha Hauer wrote:
> The functions determining the different memory locations for stack,
> early malloc, ttb and op-tee all take a membase argument which is
> unused as all locations depend on the end of memory. Remove this unused
> argument.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>

Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de>

> ---
>  arch/arm/boards/raspberry-pi/lowlevel.c |  2 +-
>  arch/arm/cpu/entry.c                    |  2 +-
>  arch/arm/cpu/start.c                    |  6 ++---
>  arch/arm/cpu/uncompress.c               |  6 ++---
>  arch/arm/include/asm/barebox-arm.h      | 29 ++++++++++---------------
>  5 files changed, 20 insertions(+), 25 deletions(-)
> 
> diff --git a/arch/arm/boards/raspberry-pi/lowlevel.c b/arch/arm/boards/raspberry-pi/lowlevel.c
> index 742f177dec..fd11fe53e0 100644
> --- a/arch/arm/boards/raspberry-pi/lowlevel.c
> +++ b/arch/arm/boards/raspberry-pi/lowlevel.c
> @@ -42,7 +42,7 @@ static void copy_vc_fdt(void *dest, void *src, unsigned long max_size)
>   * this FDT there. We fetch it from there later in rpi_devices_init().
>   */
>  #define rpi_stack_top(memsize) \
> -	arm_mem_stack_top(BCM2835_SDRAM_BASE, BCM2835_SDRAM_BASE + memsize - VIDEOCORE_FDT_SZ)
> +	arm_mem_stack_top(BCM2835_SDRAM_BASE + memsize - VIDEOCORE_FDT_SZ)
>  
>  static inline void start_raspberry_pi(unsigned long memsize, void *fdt,
>  								void *vc_fdt)
> diff --git a/arch/arm/cpu/entry.c b/arch/arm/cpu/entry.c
> index b863af5757..dc264c8771 100644
> --- a/arch/arm/cpu/entry.c
> +++ b/arch/arm/cpu/entry.c
> @@ -40,5 +40,5 @@ void NAKED __noreturn barebox_arm_entry(unsigned long membase,
>  					unsigned long memsize, void *boarddata)
>  {
>  	__barebox_arm_entry(membase, memsize, boarddata,
> -			    arm_mem_stack_top(membase, membase + memsize));
> +			    arm_mem_stack_top(membase + memsize));
>  }
> diff --git a/arch/arm/cpu/start.c b/arch/arm/cpu/start.c
> index be303514c2..62b2054dd6 100644
> --- a/arch/arm/cpu/start.c
> +++ b/arch/arm/cpu/start.c
> @@ -111,7 +111,7 @@ static inline unsigned long arm_mem_boarddata(unsigned long membase,
>  
>  unsigned long arm_mem_ramoops_get(void)
>  {
> -	return arm_mem_ramoops(0, arm_stack_top);
> +	return arm_mem_ramoops(arm_stack_top);
>  }
>  EXPORT_SYMBOL_GPL(arm_mem_ramoops_get);
>  
> @@ -163,12 +163,12 @@ __noreturn __no_sanitize_address void barebox_non_pbl_start(unsigned long membas
>  
>  	arm_membase = membase;
>  	arm_endmem = endmem;
> -	arm_stack_top = arm_mem_stack_top(membase, endmem);
> +	arm_stack_top = arm_mem_stack_top(endmem);
>  	arm_barebox_size = barebox_size;
>  	malloc_end = barebox_base;
>  
>  	if (IS_ENABLED(CONFIG_MMU_EARLY)) {
> -		unsigned long ttb = arm_mem_ttb(membase, endmem);
> +		unsigned long ttb = arm_mem_ttb(endmem);
>  
>  		if (IS_ENABLED(CONFIG_PBL_IMAGE)) {
>  			arm_set_cache_functions();
> diff --git a/arch/arm/cpu/uncompress.c b/arch/arm/cpu/uncompress.c
> index 65de87f109..abaf36b68c 100644
> --- a/arch/arm/cpu/uncompress.c
> +++ b/arch/arm/cpu/uncompress.c
> @@ -82,13 +82,13 @@ void __noreturn barebox_pbl_start(unsigned long membase, unsigned long memsize,
>  	pr_debug("memory at 0x%08lx, size 0x%08lx\n", membase, memsize);
>  
>  	if (IS_ENABLED(CONFIG_MMU_EARLY)) {
> -		unsigned long ttb = arm_mem_ttb(membase, endmem);
> +		unsigned long ttb = arm_mem_ttb(endmem);
>  		pr_debug("enabling MMU, ttb @ 0x%08lx\n", ttb);
>  		mmu_early_enable(membase, memsize - OPTEE_SIZE, ttb);
>  	}
>  
> -	free_mem_ptr = arm_mem_early_malloc(membase, endmem);
> -	free_mem_end_ptr = arm_mem_early_malloc_end(membase, endmem);
> +	free_mem_ptr = arm_mem_early_malloc(endmem);
> +	free_mem_end_ptr = arm_mem_early_malloc_end(endmem);
>  
>  	pr_debug("uncompressing barebox binary at 0x%p (size 0x%08x) to 0x%08lx (uncompressed size: 0x%08x)\n",
>  			pg_start, pg_len, barebox_base, uncompressed_len);
> diff --git a/arch/arm/include/asm/barebox-arm.h b/arch/arm/include/asm/barebox-arm.h
> index 0cf4549cd7..2e0d8dc9a7 100644
> --- a/arch/arm/include/asm/barebox-arm.h
> +++ b/arch/arm/include/asm/barebox-arm.h
> @@ -78,39 +78,34 @@ static inline const void *arm_mem_scratch_get(void)
>  	return (const void *)__arm_mem_scratch(arm_mem_endmem_get());
>  }
>  
> -#define arm_mem_stack_top(membase, endmem) ((endmem) - SZ_64K - OPTEE_SIZE)
> +#define arm_mem_stack_top(endmem) ((endmem) - SZ_64K - OPTEE_SIZE)
>  
> -static inline unsigned long arm_mem_stack(unsigned long membase,
> -					  unsigned long endmem)
> +static inline unsigned long arm_mem_stack(unsigned long endmem)
>  {
> -	return arm_mem_stack_top(membase, endmem) - STACK_SIZE;
> +	return arm_mem_stack_top(endmem) - STACK_SIZE;
>  }
>  
> -static inline unsigned long arm_mem_ttb(unsigned long membase,
> -					unsigned long endmem)
> +static inline unsigned long arm_mem_ttb(unsigned long endmem)
>  {
> -	endmem = arm_mem_stack(membase, endmem);
> +	endmem = arm_mem_stack(endmem);
>  	endmem = ALIGN_DOWN(endmem, ARM_TTB_SIZE) - ARM_TTB_SIZE;
>  
>  	return endmem;
>  }
>  
> -static inline unsigned long arm_mem_early_malloc(unsigned long membase,
> -						 unsigned long endmem)
> +static inline unsigned long arm_mem_early_malloc(unsigned long endmem)
>  {
> -	return arm_mem_ttb(membase, endmem) - SZ_128K;
> +	return arm_mem_ttb(endmem) - SZ_128K;
>  }
>  
> -static inline unsigned long arm_mem_early_malloc_end(unsigned long membase,
> -						     unsigned long endmem)
> +static inline unsigned long arm_mem_early_malloc_end(unsigned long endmem)
>  {
> -	return arm_mem_ttb(membase, endmem);
> +	return arm_mem_ttb(endmem);
>  }
>  
> -static inline unsigned long arm_mem_ramoops(unsigned long membase,
> -					    unsigned long endmem)
> +static inline unsigned long arm_mem_ramoops(unsigned long endmem)
>  {
> -	endmem = arm_mem_ttb(membase, endmem);
> +	endmem = arm_mem_ttb(endmem);
>  #ifdef CONFIG_FS_PSTORE_RAMOOPS
>  	endmem -= CONFIG_FS_PSTORE_RAMOOPS_SIZE;
>  	endmem = ALIGN_DOWN(endmem, SZ_4K);
> @@ -123,7 +118,7 @@ static inline unsigned long arm_mem_barebox_image(unsigned long membase,
>  						  unsigned long endmem,
>  						  unsigned long size)
>  {
> -	endmem = arm_mem_ramoops(membase, endmem);
> +	endmem = arm_mem_ramoops(endmem);
>  
>  	if (IS_ENABLED(CONFIG_RELOCATABLE)) {
>  		return ALIGN_DOWN(endmem - size, SZ_1M);

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 02/34] ARM: remove unused define
  2023-05-17  9:03 ` [PATCH v2 02/34] ARM: remove unused define Sascha Hauer
@ 2023-05-17 12:45   ` Ahmad Fatoum
  0 siblings, 0 replies; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 12:45 UTC (permalink / raw)
  To: Sascha Hauer, Barebox List

On 17.05.23 11:03, Sascha Hauer wrote:
> __ARM_SETUP_STACK isn't used anywhere. Remove it.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>

Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de>

> ---
>  arch/arm/include/asm/barebox-arm.h | 4 ----
>  1 file changed, 4 deletions(-)
> 
> diff --git a/arch/arm/include/asm/barebox-arm.h b/arch/arm/include/asm/barebox-arm.h
> index 2e0d8dc9a7..3a0c3d7d40 100644
> --- a/arch/arm/include/asm/barebox-arm.h
> +++ b/arch/arm/include/asm/barebox-arm.h
> @@ -130,10 +130,6 @@ static inline unsigned long arm_mem_barebox_image(unsigned long membase,
>  	}
>  }
>  
> -#ifndef CONFIG_CPU_64
> -#define __ARM_SETUP_STACK(name, stack_top) if (stack_top) arm_setup_stack(stack_top)
> -#endif
> -
>  /*
>   * Unlike ENTRY_FUNCTION, this can be used to setup stack for a C entry
>   * point on both ARM32 and ARM64. ENTRY_FUNCTION on ARM64 can only be used

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 03/34] ARM: rename __arm_mem_scratch to arm_mem_scratch
  2023-05-17  9:03 ` [PATCH v2 03/34] ARM: rename __arm_mem_scratch to arm_mem_scratch Sascha Hauer
@ 2023-05-17 12:46   ` Ahmad Fatoum
  0 siblings, 0 replies; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 12:46 UTC (permalink / raw)
  To: Sascha Hauer, Barebox List

On 17.05.23 11:03, Sascha Hauer wrote:
> There are different arm_mem_* macros/functions and only one of them
> has leading underscores. Remove them for consistency.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>

Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de>

> ---
>  arch/arm/include/asm/barebox-arm.h | 4 ++--
>  arch/arm/mach-imx/atf.c            | 6 +++---
>  arch/arm/mach-imx/xload-common.c   | 2 +-
>  include/mach/rockchip/bootrom.h    | 2 +-
>  4 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm/include/asm/barebox-arm.h b/arch/arm/include/asm/barebox-arm.h
> index 3a0c3d7d40..f446044be6 100644
> --- a/arch/arm/include/asm/barebox-arm.h
> +++ b/arch/arm/include/asm/barebox-arm.h
> @@ -71,11 +71,11 @@ static inline void arm_fixup_vectors(void)
>  
>  void *barebox_arm_boot_dtb(void);
>  
> -#define __arm_mem_scratch(endmem) ((endmem) - SZ_32K)
> +#define arm_mem_scratch(endmem) ((endmem) - SZ_32K)
>  
>  static inline const void *arm_mem_scratch_get(void)
>  {
> -	return (const void *)__arm_mem_scratch(arm_mem_endmem_get());
> +	return (const void *)arm_mem_scratch(arm_mem_endmem_get());
>  }
>  
>  #define arm_mem_stack_top(endmem) ((endmem) - SZ_64K - OPTEE_SIZE)
> diff --git a/arch/arm/mach-imx/atf.c b/arch/arm/mach-imx/atf.c
> index 92820d9392..c5e6817aad 100644
> --- a/arch/arm/mach-imx/atf.c
> +++ b/arch/arm/mach-imx/atf.c
> @@ -137,7 +137,7 @@ __noreturn void imx8mm_load_and_start_image_via_tfa(void)
>  	void *endmem = (void *)MX8M_DDR_CSD1_BASE_ADDR +
>  		imx8m_barebox_earlymem_size(32);
>  
> -	imx8m_save_bootrom_log(__arm_mem_scratch(endmem));
> +	imx8m_save_bootrom_log(arm_mem_scratch(endmem));
>  	imx8mm_load_bl33(bl33);
>  
>  	if (IS_ENABLED(CONFIG_FIRMWARE_IMX8MM_OPTEE))
> @@ -185,7 +185,7 @@ __noreturn void imx8mp_load_and_start_image_via_tfa(void)
>  	void *endmem = (void *)MX8M_DDR_CSD1_BASE_ADDR +
>  		imx8m_barebox_earlymem_size(32);
>  
> -	imx8m_save_bootrom_log(__arm_mem_scratch(endmem));
> +	imx8m_save_bootrom_log(arm_mem_scratch(endmem));
>  	imx8mp_load_bl33(bl33);
>  
>  	if (IS_ENABLED(CONFIG_FIRMWARE_IMX8MP_OPTEE))
> @@ -234,7 +234,7 @@ __noreturn void imx8mn_load_and_start_image_via_tfa(void)
>  	void *endmem = (void *)MX8M_DDR_CSD1_BASE_ADDR +
>  		imx8m_barebox_earlymem_size(16);
>  
> -	imx8m_save_bootrom_log(__arm_mem_scratch(endmem));
> +	imx8m_save_bootrom_log(arm_mem_scratch(endmem));
>  	imx8mn_load_bl33(bl33);
>  
>  	if (IS_ENABLED(CONFIG_FIRMWARE_IMX8MN_OPTEE))
> diff --git a/arch/arm/mach-imx/xload-common.c b/arch/arm/mach-imx/xload-common.c
> index 0d3e6be1b1..03eb2ef109 100644
> --- a/arch/arm/mach-imx/xload-common.c
> +++ b/arch/arm/mach-imx/xload-common.c
> @@ -26,7 +26,7 @@ struct imx_scratch_space *__imx8m_scratch_space(int ddr_buswidth)
>  	ulong endmem = MX8M_DDR_CSD1_BASE_ADDR +
>  		imx8m_barebox_earlymem_size(ddr_buswidth);
>  
> -	return (void *)__arm_mem_scratch(endmem);
> +	return (void *)arm_mem_scratch(endmem);
>  }
>  
>  #define HDR_SIZE	512
> diff --git a/include/mach/rockchip/bootrom.h b/include/mach/rockchip/bootrom.h
> index 96eb147ae4..5b999fc606 100644
> --- a/include/mach/rockchip/bootrom.h
> +++ b/include/mach/rockchip/bootrom.h
> @@ -15,7 +15,7 @@ static inline void rockchip_store_bootrom_iram(ulong membase,
>                                                 ulong memsize,
>                                                 const void *iram)
>  {
> -	void *dst = (void *)__arm_mem_scratch(membase + memsize);
> +	void *dst = (void *)arm_mem_scratch(membase + memsize);
>  	memcpy(dst, iram, sizeof(struct rockchip_scratch_space));
>  }
>  

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 04/34] ARM: put scratch mem area below OP-TEE
  2023-05-17  9:03 ` [PATCH v2 04/34] ARM: put scratch mem area below OP-TEE Sascha Hauer
@ 2023-05-17 12:48   ` Ahmad Fatoum
  2023-05-17 13:14     ` Sascha Hauer
  0 siblings, 1 reply; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 12:48 UTC (permalink / raw)
  To: Sascha Hauer, Barebox List

On 17.05.23 11:03, Sascha Hauer wrote:
> We want to reserve memory for OP-TEE at the end of available SDRAM,
> so move the scratch area below OP-TEE and not above.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
> ---
>  arch/arm/include/asm/barebox-arm.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/include/asm/barebox-arm.h b/arch/arm/include/asm/barebox-arm.h
> index f446044be6..6e6606d005 100644
> --- a/arch/arm/include/asm/barebox-arm.h
> +++ b/arch/arm/include/asm/barebox-arm.h
> @@ -71,14 +71,14 @@ static inline void arm_fixup_vectors(void)
>  
>  void *barebox_arm_boot_dtb(void);
>  
> -#define arm_mem_scratch(endmem) ((endmem) - SZ_32K)
> +#define arm_mem_scratch(endmem) ((endmem) - OPTEE_SIZE - SZ_32K)
>  
>  static inline const void *arm_mem_scratch_get(void)
>  {
>  	return (const void *)arm_mem_scratch(arm_mem_endmem_get());
>  }
>  
> -#define arm_mem_stack_top(endmem) ((endmem) - SZ_64K - OPTEE_SIZE)
> +#define arm_mem_stack_top(endmem) ((endmem) - SZ_64K)

I don't understand why you drop OPTEE_SIZE here. Wouldn't the stack
now eat into the OP-TEE region?

>  
>  static inline unsigned long arm_mem_stack(unsigned long endmem)
>  {

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 05/34] ARM: add arm_mem_optee()
  2023-05-17  9:03 ` [PATCH v2 05/34] ARM: add arm_mem_optee() Sascha Hauer
@ 2023-05-17 12:53   ` Ahmad Fatoum
  0 siblings, 0 replies; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 12:53 UTC (permalink / raw)
  To: Sascha Hauer, Barebox List

On 17.05.23 11:03, Sascha Hauer wrote:
> We have several functions/macros named arm_mem_* returning the different
> addresses for early memory locations. Add one for OP-Tee as well.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
> ---
>  arch/arm/include/asm/barebox-arm.h | 5 +++++
>  arch/arm/mach-imx/atf.c            | 6 +++---
>  2 files changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm/include/asm/barebox-arm.h b/arch/arm/include/asm/barebox-arm.h
> index 6e6606d005..8ab1e90e94 100644
> --- a/arch/arm/include/asm/barebox-arm.h
> +++ b/arch/arm/include/asm/barebox-arm.h
> @@ -71,6 +71,11 @@ static inline void arm_fixup_vectors(void)
>  
>  void *barebox_arm_boot_dtb(void);
>  
> +static inline unsigned long arm_mem_optee(unsigned long endmem)
> +{
> +	return endmem - OPTEE_SIZE;
> +}

I'd prefer to return OPTEE_SIZE ? endmem - OPTEE_SIZE : 0;
That way we are a bit more robust against future broken code.
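
Spelled out, the suggested variant would look roughly like this (a sketch of
the review suggestion only, not code from the posted series):

	static inline unsigned long arm_mem_optee(unsigned long endmem)
	{
		/* no OP-TEE region reserved at all -> don't hand out an address */
		return OPTEE_SIZE ? endmem - OPTEE_SIZE : 0;
	}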

With that addressed:

Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de>

> +
>  #define arm_mem_scratch(endmem) ((endmem) - OPTEE_SIZE - SZ_32K)
>  
>  static inline const void *arm_mem_scratch_get(void)
> diff --git a/arch/arm/mach-imx/atf.c b/arch/arm/mach-imx/atf.c
> index c5e6817aad..659798b95f 100644
> --- a/arch/arm/mach-imx/atf.c
> +++ b/arch/arm/mach-imx/atf.c
> @@ -141,7 +141,7 @@ __noreturn void imx8mm_load_and_start_image_via_tfa(void)
>  	imx8mm_load_bl33(bl33);
>  
>  	if (IS_ENABLED(CONFIG_FIRMWARE_IMX8MM_OPTEE))
> -		imx8m_load_and_start_optee_via_tfa(imx8mm, endmem - OPTEE_SIZE, bl33);
> +		imx8m_load_and_start_optee_via_tfa(imx8mm, arm_mem_optee(endmem), bl33);
>  	else
>  		imx8mm_load_and_start_tfa(imx8mm_bl31_bin);
>  }
> @@ -189,7 +189,7 @@ __noreturn void imx8mp_load_and_start_image_via_tfa(void)
>  	imx8mp_load_bl33(bl33);
>  
>  	if (IS_ENABLED(CONFIG_FIRMWARE_IMX8MP_OPTEE))
> -		imx8m_load_and_start_optee_via_tfa(imx8mp, endmem - OPTEE_SIZE, bl33);
> +		imx8m_load_and_start_optee_via_tfa(imx8mp, arm_mem_optee(endmem), bl33);
>  	else
>  		imx8mp_load_and_start_tfa(imx8mp_bl31_bin);
>  }
> @@ -238,7 +238,7 @@ __noreturn void imx8mn_load_and_start_image_via_tfa(void)
>  	imx8mn_load_bl33(bl33);
>  
>  	if (IS_ENABLED(CONFIG_FIRMWARE_IMX8MN_OPTEE))
> -		imx8m_load_and_start_optee_via_tfa(imx8mn, endmem - OPTEE_SIZE, bl33);
> +		imx8m_load_and_start_optee_via_tfa(imx8mn, arm_mem_optee(endmem), bl33);
>  	else
>  		imx8mn_load_and_start_tfa(imx8mn_bl31_bin);
>  }

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 06/34] ARM: make arm_mem_scratch() a static inline function
  2023-05-17  9:03 ` [PATCH v2 06/34] ARM: make arm_mem_scratch() a static inline function Sascha Hauer
@ 2023-05-17 12:53   ` Ahmad Fatoum
  0 siblings, 0 replies; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 12:53 UTC (permalink / raw)
  To: Sascha Hauer, Barebox List

On 17.05.23 11:03, Sascha Hauer wrote:
> Most other arm_mem_* are functions, convert arm_mem_scratch to a
> function as well.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>

Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de>

> ---
>  arch/arm/include/asm/barebox-arm.h | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/include/asm/barebox-arm.h b/arch/arm/include/asm/barebox-arm.h
> index 8ab1e90e94..139ecce06d 100644
> --- a/arch/arm/include/asm/barebox-arm.h
> +++ b/arch/arm/include/asm/barebox-arm.h
> @@ -76,7 +76,10 @@ static inline unsigned long arm_mem_optee(unsigned long endmem)
>  	return endmem - OPTEE_SIZE;
>  }
>  
> -#define arm_mem_scratch(endmem) ((endmem) - OPTEE_SIZE - SZ_32K)
> +static inline unsigned long arm_mem_scratch(unsigned long endmem)
> +{
> +	return arm_mem_optee(endmem) - SZ_32K;
> +}
>  
>  static inline const void *arm_mem_scratch_get(void)
>  {

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 07/34] ARM: define stack base consistently
  2023-05-17  9:03 ` [PATCH v2 07/34] ARM: define stack base consistently Sascha Hauer
@ 2023-05-17 12:55   ` Ahmad Fatoum
  0 siblings, 0 replies; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 12:55 UTC (permalink / raw)
  To: Sascha Hauer, Barebox List

On 17.05.23 11:03, Sascha Hauer wrote:
> The different arm_mem_* functions have the pattern that they take
> the region above it and subtract the size of the current region. Follow
> the pattern for getting the stack base as well. While at it move
> arm_mem_stack_top() lower in the file so that we have all functions
> following said pattern below each other.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>

Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de>

> ---
>  arch/arm/include/asm/barebox-arm.h | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm/include/asm/barebox-arm.h b/arch/arm/include/asm/barebox-arm.h
> index 139ecce06d..8d8c102081 100644
> --- a/arch/arm/include/asm/barebox-arm.h
> +++ b/arch/arm/include/asm/barebox-arm.h
> @@ -86,11 +86,9 @@ static inline const void *arm_mem_scratch_get(void)
>  	return (const void *)arm_mem_scratch(arm_mem_endmem_get());
>  }
>  
> -#define arm_mem_stack_top(endmem) ((endmem) - SZ_64K)
> -
>  static inline unsigned long arm_mem_stack(unsigned long endmem)
>  {
> -	return arm_mem_stack_top(endmem) - STACK_SIZE;
> +	return arm_mem_scratch(endmem) - STACK_SIZE;
>  }
>  
>  static inline unsigned long arm_mem_ttb(unsigned long endmem)
> @@ -122,6 +120,11 @@ static inline unsigned long arm_mem_ramoops(unsigned long endmem)
>  	return endmem;
>  }
>  
> +static inline unsigned long arm_mem_stack_top(unsigned long endmem)
> +{
> +	return arm_mem_stack(endmem) + STACK_SIZE;
> +}
> +
>  static inline unsigned long arm_mem_barebox_image(unsigned long membase,
>  						  unsigned long endmem,
>  						  unsigned long size)

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 08/34] ARM: move arm_mem_scratch_get() lower for consistency
  2023-05-17  9:03 ` [PATCH v2 08/34] ARM: move arm_mem_scratch_get() lower for consistency Sascha Hauer
@ 2023-05-17 12:57   ` Ahmad Fatoum
  0 siblings, 0 replies; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 12:57 UTC (permalink / raw)
  To: Sascha Hauer, Barebox List

On 17.05.23 11:03, Sascha Hauer wrote:
> The different arm_mem_* functions all follow the same pattern of
> taking the base address of the upper region minus the size of the
> current region and with the exception of arm_mem_scratch_get() they
> are all below each other. arm_mem_scratch_get() doesn't fit into
> this row, so move it lower.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>

Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de>

> ---
>  arch/arm/include/asm/barebox-arm.h | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm/include/asm/barebox-arm.h b/arch/arm/include/asm/barebox-arm.h
> index 8d8c102081..f5a74b4746 100644
> --- a/arch/arm/include/asm/barebox-arm.h
> +++ b/arch/arm/include/asm/barebox-arm.h
> @@ -81,11 +81,6 @@ static inline unsigned long arm_mem_scratch(unsigned long endmem)
>  	return arm_mem_optee(endmem) - SZ_32K;
>  }
>  
> -static inline const void *arm_mem_scratch_get(void)
> -{
> -	return (const void *)arm_mem_scratch(arm_mem_endmem_get());
> -}
> -
>  static inline unsigned long arm_mem_stack(unsigned long endmem)
>  {
>  	return arm_mem_scratch(endmem) - STACK_SIZE;
> @@ -125,6 +120,11 @@ static inline unsigned long arm_mem_stack_top(unsigned long endmem)
>  	return arm_mem_stack(endmem) + STACK_SIZE;
>  }
>  
> +static inline const void *arm_mem_scratch_get(void)
> +{
> +	return (const void *)arm_mem_scratch(arm_mem_endmem_get());
> +}
> +
>  static inline unsigned long arm_mem_barebox_image(unsigned long membase,
>  						  unsigned long endmem,
>  						  unsigned long size)

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 17/34] ARM: i.MX: Drop HAB workaround
  2023-05-17  9:03 ` [PATCH v2 17/34] ARM: i.MX: Drop HAB workaround Sascha Hauer
@ 2023-05-17 13:01   ` Ahmad Fatoum
  0 siblings, 0 replies; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 13:01 UTC (permalink / raw)
  To: Sascha Hauer, Barebox List

On 17.05.23 11:03, Sascha Hauer wrote:
> The i.MX HAB code on i.MX6 has to jump into ROM which happens to start
> at 0x0. To make that possible we used to map the ROM cached and jumped
> to it before the MMU is initialized. Instead, remap the ROM as needed
> in the HAB code so that we can safely jump into ROM with MMU enabled.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>

Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de>

> ---
>  arch/arm/cpu/mmu-early_32.c |  7 -------
>  drivers/hab/habv4.c         | 10 +++++++++-
>  2 files changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm/cpu/mmu-early_32.c b/arch/arm/cpu/mmu-early_32.c
> index 07c5917e6a..94bde44c9b 100644
> --- a/arch/arm/cpu/mmu-early_32.c
> +++ b/arch/arm/cpu/mmu-early_32.c
> @@ -58,12 +58,5 @@ void mmu_early_enable(unsigned long membase, unsigned long memsize,
>  	/* maps main memory as cachable */
>  	map_region(membase, memsize, PMD_SECT_DEF_CACHED);
>  
> -	/*
> -	 * With HAB enabled we call into the ROM code later in imx6_hab_get_status().
> -	 * Map the ROM cached which has the effect that the XN bit is not set.
> -	 */
> -	if (IS_ENABLED(CONFIG_HABV4) && IS_ENABLED(CONFIG_ARCH_IMX6))
> -		map_region(0x0, SZ_1M, PMD_SECT_DEF_CACHED);
> -
>  	__mmu_cache_on();
>  }
> diff --git a/drivers/hab/habv4.c b/drivers/hab/habv4.c
> index ca26773bf8..e8c7d3264d 100644
> --- a/drivers/hab/habv4.c
> +++ b/drivers/hab/habv4.c
> @@ -11,6 +11,9 @@
>  #include <hab.h>
>  #include <init.h>
>  #include <types.h>
> +#include <mmu.h>
> +#include <zero_page.h>
> +#include <linux/sizes.h>
>  #include <linux/arm-smccc.h>
>  #include <asm/cache.h>
>  
> @@ -616,12 +619,17 @@ static int init_imx6_hab_get_status(void)
>  		/* can happen in multi-image builds and is not an error */
>  		return 0;
>  
> +	arch_remap_range(0x0, SZ_1M, MAP_CACHED);
> +
>  	/*
>  	 * Nobody will check the return value if there were HAB errors, but the
>  	 * initcall will fail spectaculously with a strange error message.
>  	 */
>  	imx6_hab_get_status();
>  
> +	zero_page_faulting();
> +	arch_remap_range((void *)PAGE_SIZE, SZ_1M - PAGE_SIZE, MAP_UNCACHED);
> +
>  	return 0;
>  }
>  
> @@ -630,7 +638,7 @@ static int init_imx6_hab_get_status(void)
>   * which will no longer be accessible when the MMU sets the zero page to
>   * faulting.
>   */
> -postconsole_initcall(init_imx6_hab_get_status);
> +postmmu_initcall(init_imx6_hab_get_status);
>  
>  int imx28_hab_get_status(void)
>  {

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 21/34] ARM: mmu: alloc 64k for early page tables
  2023-05-17  9:03 ` [PATCH v2 21/34] ARM: mmu: alloc 64k for early page tables Sascha Hauer
@ 2023-05-17 13:03   ` Ahmad Fatoum
  0 siblings, 0 replies; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 13:03 UTC (permalink / raw)
  To: Sascha Hauer, Barebox List

On 17.05.23 11:03, Sascha Hauer wrote:
> This is a preparation for using two level page tables in the PBL.
> To do that we need a way to allocate page tables in PBL. As malloc
> is not available in PBL, increase the area we use for the TTB to
> make some space available for page tables.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>

Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de>

> ---
>  arch/arm/cpu/mmu_32.c              | 6 ++++++
>  arch/arm/include/asm/barebox-arm.h | 8 ++------
>  2 files changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
> index 12fe892400..4050d96846 100644
> --- a/arch/arm/cpu/mmu_32.c
> +++ b/arch/arm/cpu/mmu_32.c
> @@ -24,6 +24,12 @@
>  #define PTRS_PER_PTE		(PGDIR_SIZE / PAGE_SIZE)
>  #define ARCH_MAP_WRITECOMBINE	((unsigned)-1)
>  
> +/*
> + * We have a 4GiB address space split into 1MiB sections, with each
> + * section header taking 4 bytes
> + */
> +#define ARM_TTB_SIZE	(SZ_4G / SZ_1M * sizeof(u32))
> +
>  static uint32_t *ttb;
>  
>  /*
> diff --git a/arch/arm/include/asm/barebox-arm.h b/arch/arm/include/asm/barebox-arm.h
> index f5a74b4746..eb31ca2788 100644
> --- a/arch/arm/include/asm/barebox-arm.h
> +++ b/arch/arm/include/asm/barebox-arm.h
> @@ -23,11 +23,7 @@
>  #include <asm/reloc.h>
>  #include <linux/stringify.h>
>  
> -/*
> - * We have a 4GiB address space split into 1MiB sections, with each
> - * section header taking 4 bytes
> - */
> -#define ARM_TTB_SIZE	(SZ_4G / SZ_1M * sizeof(u32))
> +#define ARM_EARLY_PAGETABLE_SIZE	SZ_64K
>  
>  void __noreturn barebox_arm_entry(unsigned long membase, unsigned long memsize, void *boarddata);
>  
> @@ -89,7 +85,7 @@ static inline unsigned long arm_mem_stack(unsigned long endmem)
>  static inline unsigned long arm_mem_ttb(unsigned long endmem)
>  {
>  	endmem = arm_mem_stack(endmem);
> -	endmem = ALIGN_DOWN(endmem, ARM_TTB_SIZE) - ARM_TTB_SIZE;
> +	endmem = ALIGN_DOWN(endmem, ARM_EARLY_PAGETABLE_SIZE) - ARM_EARLY_PAGETABLE_SIZE;
>  
>  	return endmem;
>  }

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread
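
The sizes in patch 21/34 work out as follows (plain arithmetic based on the
quoted code): the first-level table covers 4GiB in 1MiB sections with one
4-byte descriptor each,

	ARM_TTB_SIZE             = SZ_4G / SZ_1M * sizeof(u32) = 4096 * 4 = 16KiB
	ARM_EARLY_PAGETABLE_SIZE = 64KiB
	left for page tables     = 64KiB - 16KiB = 48KiB

so raising the reserved area to 64KiB keeps the 16KiB first-level table and
leaves 48KiB for the second-level tables that the PBL allocates in the
following patches.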

* Re: [PATCH v2 22/34] ARM: mmu32: create alloc_pte()
  2023-05-17  9:03 ` [PATCH v2 22/34] ARM: mmu32: create alloc_pte() Sascha Hauer
@ 2023-05-17 13:07   ` Ahmad Fatoum
  0 siblings, 0 replies; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 13:07 UTC (permalink / raw)
  To: Sascha Hauer, Barebox List

On 17.05.23 11:03, Sascha Hauer wrote:
> This is a preparation for using two level page tables in the PBL.
> To do that we need a way to allocate page tables in PBL. As malloc
> is not available in PBL, implement a function to allocate a page table
> from the area we also place the TTB.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
> ---
>  arch/arm/cpu/mmu_32.c | 24 ++++++++++++++++++++++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
> index 4050d96846..a82382ad1e 100644
> --- a/arch/arm/cpu/mmu_32.c
> +++ b/arch/arm/cpu/mmu_32.c
> @@ -76,6 +76,27 @@ static bool pgd_type_table(u32 pgd)
>  	return (pgd & PMD_TYPE_MASK) == PMD_TYPE_TABLE;
>  }
>  
> +#define PTE_SIZE       (PTRS_PER_PTE * sizeof(u32))
> +
> +#ifdef __PBL__
> +static uint32_t *alloc_pte(void)
> +{
> +	static unsigned int idx = 3;

Can you add a comment explaining the choice of initial index?

> +
> +	idx++;

I know it's quite a contrived example, but if one calls alloc_pte()
often enough, it will eventually start returning non-NULL pointers
again after having returned NULL before.

> +
> +	if (idx * PTE_SIZE >= ARM_EARLY_PAGETABLE_SIZE)
> +		return NULL;
> +
> +	return (void *)ttb + idx * PTE_SIZE;

To address the above point, just replace idx with idx++ (and 3 with 4?).

> +}
> +#else
> +static uint32_t *alloc_pte(void)
> +{
> +	return xmemalign(PTE_SIZE, PTE_SIZE);
> +}
> +#endif
> +
>  static u32 *find_pte(unsigned long adr)
>  {
>  	u32 *table;
> @@ -125,8 +146,7 @@ static u32 *arm_create_pte(unsigned long virt, uint32_t flags)
>  
>  	virt = ALIGN_DOWN(virt, PGDIR_SIZE);
>  
> -	table = xmemalign(PTRS_PER_PTE * sizeof(u32),
> -			  PTRS_PER_PTE * sizeof(u32));
> +	table = alloc_pte();
>  
>  	if (!ttb)
>  		arm_mmu_not_initialized_error();

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread
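
Taking the two remarks above together, the shape Ahmad suggests -- test first,
post-increment in the return -- keeps the allocator permanently exhausted once
the area is used up instead of theoretically wrapping around. A rough sketch
(FIRST_FREE_IDX is a hypothetical placeholder; as noted above, the real code
should grow a comment explaining which slots the first-level table itself
occupies, presumably the ones skipped by the initial index):

	static uint32_t *alloc_pte(void)
	{
		/* FIRST_FREE_IDX: skip the slots used by the first-level table */
		static unsigned int idx = FIRST_FREE_IDX;

		if (idx * PTE_SIZE >= ARM_EARLY_PAGETABLE_SIZE)
			return NULL;

		return (void *)ttb + idx++ * PTE_SIZE;
	}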

* Re: [PATCH v2 04/34] ARM: put scratch mem area below OP-TEE
  2023-05-17 12:48   ` Ahmad Fatoum
@ 2023-05-17 13:14     ` Sascha Hauer
  2023-05-17 15:50       ` Ahmad Fatoum
  0 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17 13:14 UTC (permalink / raw)
  To: Ahmad Fatoum; +Cc: Barebox List

On Wed, May 17, 2023 at 02:48:43PM +0200, Ahmad Fatoum wrote:
> On 17.05.23 11:03, Sascha Hauer wrote:
> > We want to reserve memory for OP-TEE at the end of available SDRAM,
> > so move the scratch area below OP-TEE and not above.
> > 
> > Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
> > ---
> >  arch/arm/include/asm/barebox-arm.h | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/arm/include/asm/barebox-arm.h b/arch/arm/include/asm/barebox-arm.h
> > index f446044be6..6e6606d005 100644
> > --- a/arch/arm/include/asm/barebox-arm.h
> > +++ b/arch/arm/include/asm/barebox-arm.h
> > @@ -71,14 +71,14 @@ static inline void arm_fixup_vectors(void)
> >  
> >  void *barebox_arm_boot_dtb(void);
> >  
> > -#define arm_mem_scratch(endmem) ((endmem) - SZ_32K)
> > +#define arm_mem_scratch(endmem) ((endmem) - OPTEE_SIZE - SZ_32K)
> >  
> >  static inline const void *arm_mem_scratch_get(void)
> >  {
> >  	return (const void *)arm_mem_scratch(arm_mem_endmem_get());
> >  }
> >  
> > -#define arm_mem_stack_top(endmem) ((endmem) - SZ_64K - OPTEE_SIZE)
> > +#define arm_mem_stack_top(endmem) ((endmem) - SZ_64K)
> 
> I don't understand why you drop OPTEE_SIZE here. Wouldn't the stack
> now eat into the OP-TEE region?

I mistakenly thought that arm_mem_stack_top() is calculated based on the
region above it, namely arm_mem_scratch(), but really it's calculated
based on endmem directly.

Indeed it's wrong like this, it should be:

#define arm_mem_stack_top(endmem) (arm_mem_scratch(endmem) - SZ_64K)

I just stumbled upon the SZ_64K here. I followed the value back to 2016
and found 75c96bd2459e ("ARM: Do not use last 64KiB of address space for
barebox"). I had a board that time that has SDRAM at the very end of the
32bit address space. On that board it happened that we overwrite parts
of the lowlevel memory with the vector table. It seems that has been
lost over time as now we put the scratch space and possibly parts of
OP-TEE into the last 64k.
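
For reference, the downward layout that the arm_mem_*() helpers from patches
04-08 as posted produce (sizes taken from the code quoted earlier in this
thread; the last-64KiB reservation from 75c96bd2459e is indeed not part of it
any more):

	endmem                                          end of SDRAM
	  - OPTEE_SIZE   -> arm_mem_optee(endmem)       OP-TEE
	  - SZ_32K       -> arm_mem_scratch(endmem)     scratch area
	  - STACK_SIZE   -> arm_mem_stack(endmem)       stack, top == arm_mem_scratch()
	  - TTB area     -> arm_mem_ttb(endmem)         early page tables (aligned down)
	  - SZ_128K      -> arm_mem_early_malloc(endmem)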

Sascha

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 23/34] ARM: mmu64: create alloc_pte()
  2023-05-17  9:03 ` [PATCH v2 23/34] ARM: mmu64: " Sascha Hauer
@ 2023-05-17 13:15   ` Ahmad Fatoum
  2023-05-17 13:17   ` Ahmad Fatoum
  1 sibling, 0 replies; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 13:15 UTC (permalink / raw)
  To: Sascha Hauer, Barebox List

On 17.05.23 11:03, Sascha Hauer wrote:
> This is a preparation for using two level page tables in the PBL.
> To do that we need a way to allocate page tables in PBL. As malloc
> is not available in PBL, implement a function to allocate a page table
> from the area we also place the TTB.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>

Acked-by: Ahmad Fatoum <a.fatoum@pengutronix.de>

> ---
>  arch/arm/cpu/mmu_64.c | 20 +++++++++++++++++---
>  1 file changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm/cpu/mmu_64.c b/arch/arm/cpu/mmu_64.c
> index 55ada960c5..3cc5b14a46 100644
> --- a/arch/arm/cpu/mmu_64.c
> +++ b/arch/arm/cpu/mmu_64.c
> @@ -32,7 +32,20 @@ static void set_table(uint64_t *pt, uint64_t *table_addr)
>  	*pt = val;
>  }
>  
> -static uint64_t *create_table(void)
> +#ifdef __PBL__
> +static uint64_t *alloc_pte(void)
> +{
> +	static unsigned int idx;
> +
> +	idx++;
> +
> +	if (idx * GRANULE_SIZE >= ARM_EARLY_PAGETABLE_SIZE)
> +		return NULL;
> +
> +	return (void *)ttb + idx * GRANULE_SIZE;
> +}
> +#else
> +static uint64_t *alloc_pte(void)
>  {
>  	uint64_t *new_table = xmemalign(GRANULE_SIZE, GRANULE_SIZE);
>  
> @@ -41,6 +54,7 @@ static uint64_t *create_table(void)
>  
>  	return new_table;
>  }
> +#endif
>  
>  static __maybe_unused uint64_t *find_pte(uint64_t addr)
>  {
> @@ -81,7 +95,7 @@ static void split_block(uint64_t *pte, int level)
>  	/* level describes the parent level, we need the child ones */
>  	levelshift = level2shift(level + 1);
>  
> -	new_table = create_table();
> +	new_table = alloc_pte();
>  
>  	for (i = 0; i < MAX_PTE_ENTRIES; i++) {
>  		new_table[i] = old_pte | (i << levelshift);
> @@ -183,7 +197,7 @@ void __mmu_init(bool mmu_on)
>  	if (mmu_on)
>  		mmu_disable();
>  
> -	ttb = create_table();
> +	ttb = alloc_pte();
>  	el = current_el();
>  	set_ttbr_tcr_mair(el, (uint64_t)ttb, calc_tcr(el, BITS_PER_VA),
>  			  MEMORY_ATTRIBUTES);

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 23/34] ARM: mmu64: create alloc_pte()
  2023-05-17  9:03 ` [PATCH v2 23/34] ARM: mmu64: " Sascha Hauer
  2023-05-17 13:15   ` Ahmad Fatoum
@ 2023-05-17 13:17   ` Ahmad Fatoum
  1 sibling, 0 replies; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 13:17 UTC (permalink / raw)
  To: Sascha Hauer, Barebox List

On 17.05.23 11:03, Sascha Hauer wrote:
> This is a preparation for using two level page tables in the PBL.
> To do that we need a way to allocate page tables in PBL. As malloc
> is not available in PBL, implement a function to allocate a page table
> from the area we also place the TTB.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
> ---
>  arch/arm/cpu/mmu_64.c | 20 +++++++++++++++++---
>  1 file changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm/cpu/mmu_64.c b/arch/arm/cpu/mmu_64.c
> index 55ada960c5..3cc5b14a46 100644
> --- a/arch/arm/cpu/mmu_64.c
> +++ b/arch/arm/cpu/mmu_64.c
> @@ -32,7 +32,20 @@ static void set_table(uint64_t *pt, uint64_t *table_addr)
>  	*pt = val;
>  }
>  
> -static uint64_t *create_table(void)
> +#ifdef __PBL__
> +static uint64_t *alloc_pte(void)
> +{
> +	static unsigned int idx;
> +
> +	idx++;
> +
> +	if (idx * GRANULE_SIZE >= ARM_EARLY_PAGETABLE_SIZE)
> +		return NULL;
> +
> +	return (void *)ttb + idx * GRANULE_SIZE;
> +}
> +#else
> +static uint64_t *alloc_pte(void)
>  {
>  	uint64_t *new_table = xmemalign(GRANULE_SIZE, GRANULE_SIZE);
>  
> @@ -41,6 +54,7 @@ static uint64_t *create_table(void)

Nit: There's a memset(new_table, 0, GRANULE_SIZE); inside here, which doesn't
exist in the 32-bit MMU implementation and which can be skipped if
we opencode the memset in __mmu_init.

>  
>  	return new_table;
>  }
> +#endif
>  
>  static __maybe_unused uint64_t *find_pte(uint64_t addr)
>  {
> @@ -81,7 +95,7 @@ static void split_block(uint64_t *pte, int level)
>  	/* level describes the parent level, we need the child ones */
>  	levelshift = level2shift(level + 1);
>  
> -	new_table = create_table();
> +	new_table = alloc_pte();
>  
>  	for (i = 0; i < MAX_PTE_ENTRIES; i++) {
>  		new_table[i] = old_pte | (i << levelshift);
> @@ -183,7 +197,7 @@ void __mmu_init(bool mmu_on)
>  	if (mmu_on)
>  		mmu_disable();
>  
> -	ttb = create_table();
> +	ttb = alloc_pte();
>  	el = current_el();
>  	set_ttbr_tcr_mair(el, (uint64_t)ttb, calc_tcr(el, BITS_PER_VA),
>  			  MEMORY_ATTRIBUTES);

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 24/34] ARM: mmu: drop ttb argument
  2023-05-17  9:03 ` [PATCH v2 24/34] ARM: mmu: drop ttb argument Sascha Hauer
@ 2023-05-17 13:23   ` Ahmad Fatoum
  0 siblings, 0 replies; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 13:23 UTC (permalink / raw)
  To: Sascha Hauer, Barebox List

On 17.05.23 11:03, Sascha Hauer wrote:
> No need to pass ttb to the MMU code, the MMU code can itself call
> arm_mem_ttb() to get the desired base.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
> ---
>  arch/arm/cpu/mmu_32.c      |  9 +++++----
>  arch/arm/cpu/mmu_64.c      |  8 +++++---
>  arch/arm/cpu/start.c       | 11 +++--------
>  arch/arm/cpu/uncompress.c  |  7 ++-----
>  arch/arm/include/asm/mmu.h |  3 +--
>  5 files changed, 16 insertions(+), 22 deletions(-)
> 
> diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
> index a82382ad1e..bef4a01670 100644
> --- a/arch/arm/cpu/mmu_32.c
> +++ b/arch/arm/cpu/mmu_32.c
> @@ -533,10 +533,11 @@ static inline void map_region(unsigned long start, unsigned long size,
>  	create_sections(ttb, start, start + size - 1, flags);
>  }
>  
> -void mmu_early_enable(unsigned long membase, unsigned long memsize,
> -		      unsigned long _ttb)
> +void mmu_early_enable(unsigned long membase, unsigned long memsize)
>  {
> -	ttb = (uint32_t *)_ttb;
> +	ttb = (uint32_t *)arm_mem_ttb(membase, membase + memsize);

This commit breaks bisection, because v2 changes the arm_mem_ttb() prototype.
> +	pr_debug("enabling MMU, ttb @ 0x%p\n", ttb);
>  
>  	set_ttbr(ttb);
>  
> @@ -566,7 +567,7 @@ void mmu_early_enable(unsigned long membase, unsigned long memsize,
>  	map_region((unsigned long)_stext, _etext - _stext, PMD_SECT_DEF_UNCACHED);
>  
>  	/* maps main memory as cachable */
> -	map_region(membase, memsize, PMD_SECT_DEF_CACHED);
> +	map_region(membase, memsize - OPTEE_SIZE, PMD_SECT_DEF_CACHED);

(y)

>  
>  	__mmu_cache_on();
>  }
> diff --git a/arch/arm/cpu/mmu_64.c b/arch/arm/cpu/mmu_64.c
> index 3cc5b14a46..d32eecf144 100644
> --- a/arch/arm/cpu/mmu_64.c
> +++ b/arch/arm/cpu/mmu_64.c
> @@ -292,10 +292,12 @@ static void early_create_sections(void *ttb, uint64_t virt, uint64_t phys,
>  
>  #define EARLY_BITS_PER_VA 39
>  
> -void mmu_early_enable(unsigned long membase, unsigned long memsize,
> -		      unsigned long ttb)
> +void mmu_early_enable(unsigned long membase, unsigned long memsize)
>  {
>  	int el;
> +	unsigned long ttb = arm_mem_ttb(membase + memsize);
> +
> +	pr_debug("enabling MMU, ttb @ 0x%08lx\n", ttb);
>  
>  	/*
>  	 * For the early code we only create level 1 pagetables which only
> @@ -311,7 +313,7 @@ void mmu_early_enable(unsigned long membase, unsigned long memsize,
>  	set_ttbr_tcr_mair(el, ttb, calc_tcr(el, EARLY_BITS_PER_VA), MEMORY_ATTRIBUTES);
>  	early_create_sections((void *)ttb, 0, 0, 1UL << (EARLY_BITS_PER_VA - 1),
>  			attrs_uncached_mem());
> -	early_create_sections((void *)ttb, membase, membase, memsize, CACHED_MEM);
> +	early_create_sections((void *)ttb, membase, membase, memsize - OPTEE_SIZE, CACHED_MEM);
>  	tlb_invalidate();
>  	isb();
>  	set_cr(get_cr() | CR_M);
> diff --git a/arch/arm/cpu/start.c b/arch/arm/cpu/start.c
> index 87207822a0..165d2d94e6 100644
> --- a/arch/arm/cpu/start.c
> +++ b/arch/arm/cpu/start.c
> @@ -216,14 +216,9 @@ __noreturn __no_sanitize_address void barebox_non_pbl_start(unsigned long membas
>  
>  	mem_malloc_init((void *)malloc_start, (void *)malloc_end - 1);
>  
> -	if (IS_ENABLED(CONFIG_MMU_EARLY)) {
> -		unsigned long ttb = arm_mem_ttb(endmem);
> -
> -		if (!IS_ENABLED(CONFIG_PBL_IMAGE)) {
> -			pr_debug("enabling MMU, ttb @ 0x%08lx\n", ttb);
> -			arm_early_mmu_cache_invalidate();
> -			mmu_early_enable(membase, memsize - OPTEE_SIZE, ttb);
> -		}
> +	if (IS_ENABLED(CONFIG_MMU_EARLY) && !IS_ENABLED(CONFIG_PBL_IMAGE)) {
> +		arm_early_mmu_cache_invalidate();
> +		mmu_early_enable(membase, memsize);
>  	}
>  
>  	if (IS_ENABLED(CONFIG_BOOTM_OPTEE))
> diff --git a/arch/arm/cpu/uncompress.c b/arch/arm/cpu/uncompress.c
> index abaf36b68c..e471dd87f9 100644
> --- a/arch/arm/cpu/uncompress.c
> +++ b/arch/arm/cpu/uncompress.c
> @@ -81,11 +81,8 @@ void __noreturn barebox_pbl_start(unsigned long membase, unsigned long memsize,
>  
>  	pr_debug("memory at 0x%08lx, size 0x%08lx\n", membase, memsize);
>  
> -	if (IS_ENABLED(CONFIG_MMU_EARLY)) {
> -		unsigned long ttb = arm_mem_ttb(endmem);
> -		pr_debug("enabling MMU, ttb @ 0x%08lx\n", ttb);
> -		mmu_early_enable(membase, memsize - OPTEE_SIZE, ttb);
> -	}
> +	if (IS_ENABLED(CONFIG_MMU_EARLY))
> +		mmu_early_enable(membase, memsize);
>  
>  	free_mem_ptr = arm_mem_early_malloc(endmem);
>  	free_mem_end_ptr = arm_mem_early_malloc_end(endmem);
> diff --git a/arch/arm/include/asm/mmu.h b/arch/arm/include/asm/mmu.h
> index fd8e93f7a3..9d2fdcf365 100644
> --- a/arch/arm/include/asm/mmu.h
> +++ b/arch/arm/include/asm/mmu.h
> @@ -56,8 +56,7 @@ void __dma_clean_range(unsigned long, unsigned long);
>  void __dma_flush_range(unsigned long, unsigned long);
>  void __dma_inv_range(unsigned long, unsigned long);
>  
> -void mmu_early_enable(unsigned long membase, unsigned long memsize,
> -		      unsigned long ttb);
> +void mmu_early_enable(unsigned long membase, unsigned long memsize);
>  void mmu_early_disable(void);
>  
>  #endif /* __ASM_MMU_H */

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 25/34] ARM: mmu: always do MMU initialization early when MMU is enabled
  2023-05-17  9:03 ` [PATCH v2 25/34] ARM: mmu: always do MMU initialization early when MMU is enabled Sascha Hauer
@ 2023-05-17 13:29   ` Ahmad Fatoum
  0 siblings, 0 replies; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 13:29 UTC (permalink / raw)
  To: Sascha Hauer, Barebox List

On 17.05.23 11:03, Sascha Hauer wrote:
> Drop the CONFIG_MMU_EARLY and make early MMU initialization the default.
> 
> Doing so allows for some simplifications in the MMU code as we have
> fewer code paths to care and think about.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>

Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de>



> ---
>  arch/arm/cpu/start.c      | 2 +-
>  arch/arm/cpu/uncompress.c | 2 +-
>  common/Kconfig            | 9 ---------
>  3 files changed, 2 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/arm/cpu/start.c b/arch/arm/cpu/start.c
> index 165d2d94e6..2e987ec41d 100644
> --- a/arch/arm/cpu/start.c
> +++ b/arch/arm/cpu/start.c
> @@ -216,7 +216,7 @@ __noreturn __no_sanitize_address void barebox_non_pbl_start(unsigned long membas
>  
>  	mem_malloc_init((void *)malloc_start, (void *)malloc_end - 1);
>  
> -	if (IS_ENABLED(CONFIG_MMU_EARLY) && !IS_ENABLED(CONFIG_PBL_IMAGE)) {
> +	if (IS_ENABLED(CONFIG_MMU) && !IS_ENABLED(CONFIG_PBL_IMAGE)) {
>  		arm_early_mmu_cache_invalidate();
>  		mmu_early_enable(membase, memsize);
>  	}
> diff --git a/arch/arm/cpu/uncompress.c b/arch/arm/cpu/uncompress.c
> index e471dd87f9..a481c4634d 100644
> --- a/arch/arm/cpu/uncompress.c
> +++ b/arch/arm/cpu/uncompress.c
> @@ -81,7 +81,7 @@ void __noreturn barebox_pbl_start(unsigned long membase, unsigned long memsize,
>  
>  	pr_debug("memory at 0x%08lx, size 0x%08lx\n", membase, memsize);
>  
> -	if (IS_ENABLED(CONFIG_MMU_EARLY))
> +	if (IS_ENABLED(CONFIG_MMU))
>  		mmu_early_enable(membase, memsize);
>  
>  	free_mem_ptr = arm_mem_early_malloc(endmem);
> diff --git a/common/Kconfig b/common/Kconfig
> index ac3df75acb..c6008f125b 100644
> --- a/common/Kconfig
> +++ b/common/Kconfig
> @@ -185,15 +185,6 @@ config MMU
>  	  to enable the data cache which depends on the MMU. See Documentation/mmu.txt
>  	  for further information.
>  
> -config MMU_EARLY
> -	bool "Enable MMU early"
> -	depends on ARM
> -	depends on MMU
> -	default y
> -	help
> -	  This enables the MMU during early startup. This speeds up things during startup
> -	  of barebox, but may lead to harder to debug code. If unsure say yes here.
> -
>  config HAVE_CONFIGURABLE_TEXT_BASE
>  	bool
>  

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 26/34] ARM: mmu32: Assume MMU is on
  2023-05-17  9:03 ` [PATCH v2 26/34] ARM: mmu32: Assume MMU is on Sascha Hauer
@ 2023-05-17 13:36   ` Ahmad Fatoum
  0 siblings, 0 replies; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 13:36 UTC (permalink / raw)
  To: Sascha Hauer, Barebox List

On 17.05.23 11:03, Sascha Hauer wrote:
> As we now always enable the MMU during early initialization we can
> safely assume that the MMU is already enabled in __mmu_init() and
> drop the code path which enables the MMU.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
> ---
>  arch/arm/cpu/mmu_32.c | 47 +++++++++----------------------------------
>  1 file changed, 10 insertions(+), 37 deletions(-)
> 
> diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
> index bef4a01670..7cd732580e 100644
> --- a/arch/arm/cpu/mmu_32.c
> +++ b/arch/arm/cpu/mmu_32.c
> @@ -24,12 +24,6 @@
>  #define PTRS_PER_PTE		(PGDIR_SIZE / PAGE_SIZE)
>  #define ARCH_MAP_WRITECOMBINE	((unsigned)-1)
>  
> -/*
> - * We have a 4GiB address space split into 1MiB sections, with each
> - * section header taking 4 bytes
> - */
> -#define ARM_TTB_SIZE	(SZ_4G / SZ_1M * sizeof(u32))
> -
>  static uint32_t *ttb;
>  
>  /*
> @@ -457,38 +451,19 @@ void __mmu_init(bool mmu_on)
>  		pte_flags_uncached = PTE_FLAGS_UNCACHED_V4;
>  	}
>  
> -	if (mmu_on) {
> +	/* Clear unpredictable bits [13:0] */
> +	ttb = (uint32_t *)(get_ttbr() & ~0x3fff);
> +
> +	if (!request_sdram_region("ttb", (unsigned long)ttb, SZ_16K))
>  		/*
> -		 * Early MMU code has already enabled the MMU. We assume a
> -		 * flat 1:1 section mapping in this case.
> +		 * This can mean that:
> +		 * - the early MMU code has put the ttb into a place
> +		 *   which we don't have inside our available memory
> +		 * - Somebody else has occupied the ttb region which means
> +		 *   the ttb will get corrupted.
>  		 */
> -		/* Clear unpredictable bits [13:0] */
> -		ttb = (uint32_t *)(get_ttbr() & ~0x3fff);
> -
> -		if (!request_sdram_region("ttb", (unsigned long)ttb, SZ_16K))
> -			/*
> -			 * This can mean that:
> -			 * - the early MMU code has put the ttb into a place
> -			 *   which we don't have inside our available memory
> -			 * - Somebody else has occupied the ttb region which means
> -			 *   the ttb will get corrupted.
> -			 */
> -			pr_crit("Critical Error: Can't request SDRAM region for ttb at %p\n",
> +		pr_crit("Critical Error: Can't request SDRAM region for ttb at %p\n",
>  					ttb);
> -	} else {
> -		ttb = xmemalign(ARM_TTB_SIZE, ARM_TTB_SIZE);
> -
> -		set_ttbr(ttb);
> -
> -		/* For the XN bit to take effect, we can't be using DOMAIN_MANAGER. */
> -		if (cpu_architecture() >= CPU_ARCH_ARMv7)
> -			set_domain(DOMAIN_CLIENT);
> -		else
> -			set_domain(DOMAIN_MANAGER);
> -
> -		create_flat_mapping(ttb);
> -		__mmu_cache_flush();
> -	}
>  
>  	pr_debug("ttb: 0x%p\n", ttb);
>  
> @@ -499,8 +474,6 @@ void __mmu_init(bool mmu_on)
>  				PMD_SECT_DEF_CACHED);
>  		__mmu_cache_flush();
>  	}
> -
> -	__mmu_cache_on();

I guess it's ok to drop this, but some assurance in the commit message
would be nice.

>  }
>  
>  /*

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 27/34] ARM: mmu32: Fix pmd_flags_to_pte() for ARMv4/5/6
  2023-05-17  9:03 ` [PATCH v2 27/34] ARM: mmu32: Fix pmd_flags_to_pte() for ARMv4/5/6 Sascha Hauer
@ 2023-05-17 13:39   ` Ahmad Fatoum
  0 siblings, 0 replies; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 13:39 UTC (permalink / raw)
  To: Sascha Hauer, Barebox List

On 17.05.23 11:03, Sascha Hauer wrote:
> pmd_flags_to_pte() assumed ARMv7 page table format. This has the effect
> that random bit values end up in the access permission bits. This works

                                                              ^ for older CPUs.

> because the domain is configured as manager in the DACR and thus the

                                                          ^ for non-ARMv7

> access permissions are ignored by the MMU.
> Nevertheless fix this and take the cpu architecture into account when
> translating the bits. Don't bother to translate the access permission
> bits though, just hardcode them as PTE_SMALL_AP_UNO_SRW.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>

Apart from that:

Acked-by: Ahmad Fatoum <a.fatoum@pengutronix.de>

> ---
>  arch/arm/cpu/mmu_32.c | 27 ++++++++++++++++-----------
>  1 file changed, 16 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
> index 7cd732580e..4abaab7d87 100644
> --- a/arch/arm/cpu/mmu_32.c
> +++ b/arch/arm/cpu/mmu_32.c
> @@ -167,17 +167,22 @@ static u32 pmd_flags_to_pte(u32 pmd)
>  		pte |= PTE_BUFFERABLE;
>  	if (pmd & PMD_SECT_CACHEABLE)
>  		pte |= PTE_CACHEABLE;
> -	if (pmd & PMD_SECT_nG)
> -		pte |= PTE_EXT_NG;
> -	if (pmd & PMD_SECT_XN)
> -		pte |= PTE_EXT_XN;
> -
> -	/* TEX[2:0] */
> -	pte |= PTE_EXT_TEX((pmd >> 12) & 7);
> -	/* AP[1:0] */
> -	pte |= ((pmd >> 10) & 0x3) << 4;
> -	/* AP[2] */
> -	pte |= ((pmd >> 15) & 0x1) << 9;
> +
> +	if (cpu_architecture() >= CPU_ARCH_ARMv7) {
> +		if (pmd & PMD_SECT_nG)
> +			pte |= PTE_EXT_NG;
> +		if (pmd & PMD_SECT_XN)
> +			pte |= PTE_EXT_XN;
> +
> +		/* TEX[2:0] */
> +		pte |= PTE_EXT_TEX((pmd >> 12) & 7);
> +		/* AP[1:0] */
> +		pte |= ((pmd >> 10) & 0x3) << 4;
> +		/* AP[2] */
> +		pte |= ((pmd >> 15) & 0x1) << 9;
> +	} else {
> +		pte |= PTE_SMALL_AP_UNO_SRW;
> +	}
>  
>  	return pte;
>  }

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 28/34] ARM: mmu32: Add pte_flags_to_pmd()
  2023-05-17  9:03 ` [PATCH v2 28/34] ARM: mmu32: Add pte_flags_to_pmd() Sascha Hauer
@ 2023-05-17 13:43   ` Ahmad Fatoum
  2023-05-17 14:44     ` Sascha Hauer
  0 siblings, 1 reply; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 13:43 UTC (permalink / raw)
  To: Sascha Hauer, Barebox List

On 17.05.23 11:03, Sascha Hauer wrote:
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
> ---
>  arch/arm/cpu/mmu_32.c | 35 +++++++++++++++++++++++++++++------
>  1 file changed, 29 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
> index 4abaab7d87..0af89ac39c 100644
> --- a/arch/arm/cpu/mmu_32.c
> +++ b/arch/arm/cpu/mmu_32.c
> @@ -187,30 +187,53 @@ static u32 pmd_flags_to_pte(u32 pmd)
>  	return pte;
>  }
>  
> +static u32 pte_flags_to_pmd(u32 pte)
> +{
> +	u32 pmd = 0;
> +
> +	if (pte & PTE_BUFFERABLE)
> +		pmd |= PMD_SECT_BUFFERABLE;
> +	if (pte & PTE_CACHEABLE)
> +		pmd |= PMD_SECT_CACHEABLE;
> +
> +	if (cpu_architecture() >= CPU_ARCH_ARMv7) {
> +		if (pte & PTE_EXT_NG)
> +			pmd |= PMD_SECT_nG;
> +		if (pte & PTE_EXT_XN)
> +			pmd |= PMD_SECT_XN;

Note that at least <asm/pgtable.h> claims these bits are v6+:

#define PTE_EXT_XN		(1 << 0)	/* v6 */
#define PMD_SECT_nG		(1 << 17)	/* v6 */


> +
> +		/* TEX[2:0] */
> +		pmd |= ((pte >> 6) & 7) << 12;
> +		/* AP[1:0] */
> +		pmd |= ((pte >> 4) & 0x3) << 10;
> +		/* AP[2] */
> +		pmd |= ((pte >> 9) & 0x1) << 15;
> +	} else {
> +		pmd |= PMD_SECT_AP_WRITE | PMD_SECT_AP_READ;
> +	}
> +
> +	return pmd;
> +}
> +
>  int arch_remap_range(void *start, size_t size, unsigned flags)
>  {
>  	u32 addr = (u32)start;
>  	u32 pte_flags;
> -	u32 pgd_flags;
>  
>  	BUG_ON(!IS_ALIGNED(addr, PAGE_SIZE));
>  
>  	switch (flags) {
>  	case MAP_CACHED:
>  		pte_flags = pte_flags_cached;
> -		pgd_flags = PMD_SECT_DEF_CACHED;
>  		break;
>  	case MAP_UNCACHED:
>  		pte_flags = pte_flags_uncached;
> -		pgd_flags = pgd_flags_uncached;
>  		break;
>  	case MAP_FAULT:
>  		pte_flags = 0x0;
> -		pgd_flags = 0x0;
>  		break;
>  	case ARCH_MAP_WRITECOMBINE:
>  		pte_flags = pte_flags_wc;
> -		pgd_flags = pgd_flags_wc;
>  		break;
>  	default:
>  		return -EINVAL;
> @@ -228,7 +251,7 @@ int arch_remap_range(void *start, size_t size, unsigned flags)
>  			 * replace it with a section
>  			 */
>  			chunk = PGDIR_SIZE;
> -			*pgd = addr | pgd_flags;
> +			*pgd = addr | pte_flags_to_pmd(pte_flags) | PMD_TYPE_SECT;
>  			dma_flush_range(pgd, sizeof(*pgd));
>  		} else {
>  			unsigned int num_ptes;

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 29/34] ARM: mmu32: add get_pte_flags, get_pmd_flags
  2023-05-17  9:03 ` [PATCH v2 29/34] ARM: mmu32: add get_pte_flags, get_pmd_flags Sascha Hauer
@ 2023-05-17 13:46   ` Ahmad Fatoum
  0 siblings, 0 replies; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 13:46 UTC (permalink / raw)
  To: Sascha Hauer, Barebox List

On 17.05.23 11:03, Sascha Hauer wrote:
> The mmu code has several variables containing the pte/pmd values for
> different mapping types. These variables only contain the correct values
> after initializing them which makes it a bit hard to follow when the
> code is used in both PBL and barebox proper.
> 
> Instead of using variables calculate the values when they are needed.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>

Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de>

> ---
>  arch/arm/cpu/mmu_32.c | 82 +++++++++++++++++++++----------------------
>  1 file changed, 41 insertions(+), 41 deletions(-)
> 
> diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
> index 0af89ac39c..829139574c 100644
> --- a/arch/arm/cpu/mmu_32.c
> +++ b/arch/arm/cpu/mmu_32.c
> @@ -57,11 +57,6 @@ static inline void tlb_invalidate(void)
>   * PTE flags to set cached and uncached areas.
>   * This will be determined at runtime.
>   */
> -static uint32_t pte_flags_cached;
> -static uint32_t pte_flags_wc;
> -static uint32_t pte_flags_uncached;
> -static uint32_t pgd_flags_wc;
> -static uint32_t pgd_flags_uncached;
>  
>  #define PTE_MASK ((1 << 12) - 1)
>  
> @@ -215,29 +210,48 @@ static u32 pte_flags_to_pmd(u32 pte)
>  	return pmd;
>  }
>  
> -int arch_remap_range(void *start, size_t size, unsigned flags)
> +static uint32_t get_pte_flags(int map_type)
> +{
> +	if (cpu_architecture() >= CPU_ARCH_ARMv7) {
> +		switch (map_type) {
> +		case MAP_CACHED:
> +			return PTE_FLAGS_CACHED_V7;
> +		case MAP_UNCACHED:
> +			return PTE_FLAGS_UNCACHED_V7;
> +		case ARCH_MAP_WRITECOMBINE:
> +			return PTE_FLAGS_WC_V7;
> +		case MAP_FAULT:
> +		default:
> +			return 0x0;
> +		}
> +	} else {
> +		switch (map_type) {
> +		case MAP_CACHED:
> +			return PTE_FLAGS_CACHED_V4;
> +		case MAP_UNCACHED:
> +		case ARCH_MAP_WRITECOMBINE:
> +			return PTE_FLAGS_UNCACHED_V4;
> +		case MAP_FAULT:
> +		default:
> +			return 0x0;
> +		}
> +	}
> +}
> +
> +static uint32_t get_pmd_flags(int map_type)
> +{
> +	return pte_flags_to_pmd(get_pte_flags(map_type));
> +}
> +
> +int arch_remap_range(void *start, size_t size, unsigned map_type)
>  {
>  	u32 addr = (u32)start;
> -	u32 pte_flags;
> +	u32 pte_flags, pmd_flags;
>  
>  	BUG_ON(!IS_ALIGNED(addr, PAGE_SIZE));
>  
> -	switch (flags) {
> -	case MAP_CACHED:
> -		pte_flags = pte_flags_cached;
> -		break;
> -	case MAP_UNCACHED:
> -		pte_flags = pte_flags_uncached;
> -		break;
> -	case MAP_FAULT:
> -		pte_flags = 0x0;
> -		break;
> -	case ARCH_MAP_WRITECOMBINE:
> -		pte_flags = pte_flags_wc;
> -		break;
> -	default:
> -		return -EINVAL;
> -	}
> +	pte_flags = get_pte_flags(map_type);
> +	pmd_flags = pte_flags_to_pmd(pte_flags);
>  
>  	while (size) {
>  		const bool pgdir_size_aligned = IS_ALIGNED(addr, PGDIR_SIZE);
> @@ -251,7 +265,7 @@ int arch_remap_range(void *start, size_t size, unsigned flags)
>  			 * replace it with a section
>  			 */
>  			chunk = PGDIR_SIZE;
> -			*pgd = addr | pte_flags_to_pmd(pte_flags) | PMD_TYPE_SECT;
> +			*pgd = addr | pmd_flags | PMD_TYPE_SECT;
>  			dma_flush_range(pgd, sizeof(*pgd));
>  		} else {
>  			unsigned int num_ptes;
> @@ -309,7 +323,7 @@ void *map_io_sections(unsigned long phys, void *_start, size_t size)
>  	unsigned long start = (unsigned long)_start, sec;
>  
>  	for (sec = start; sec < start + size; sec += PGDIR_SIZE, phys += PGDIR_SIZE)
> -		ttb[pgd_index(sec)] = phys | pgd_flags_uncached;
> +		ttb[pgd_index(sec)] = phys | get_pmd_flags(MAP_UNCACHED);
>  
>  	dma_flush_range(ttb, 0x4000);
>  	tlb_invalidate();
> @@ -350,9 +364,9 @@ static void create_vector_table(unsigned long adr)
>  		vectors = xmemalign(PAGE_SIZE, PAGE_SIZE);
>  		pr_debug("Creating vector table, virt = 0x%p, phys = 0x%08lx\n",
>  			 vectors, adr);
> -		arm_create_pte(adr, pte_flags_uncached);
> +		arm_create_pte(adr, get_pte_flags(MAP_UNCACHED));
>  		pte = find_pte(adr);
> -		*pte = (u32)vectors | PTE_TYPE_SMALL | pte_flags_cached;
> +		*pte = (u32)vectors | PTE_TYPE_SMALL | get_pte_flags(MAP_CACHED);
>  	}
>  
>  	arm_fixup_vectors();
> @@ -465,20 +479,6 @@ void __mmu_init(bool mmu_on)
>  {
>  	struct memory_bank *bank;
>  
> -	if (cpu_architecture() >= CPU_ARCH_ARMv7) {
> -		pte_flags_cached = PTE_FLAGS_CACHED_V7;
> -		pte_flags_wc = PTE_FLAGS_WC_V7;
> -		pgd_flags_wc = PGD_FLAGS_WC_V7;
> -		pgd_flags_uncached = PGD_FLAGS_UNCACHED_V7;
> -		pte_flags_uncached = PTE_FLAGS_UNCACHED_V7;
> -	} else {
> -		pte_flags_cached = PTE_FLAGS_CACHED_V4;
> -		pte_flags_wc = PTE_FLAGS_UNCACHED_V4;
> -		pgd_flags_wc = PMD_SECT_DEF_UNCACHED;
> -		pgd_flags_uncached = PMD_SECT_DEF_UNCACHED;
> -		pte_flags_uncached = PTE_FLAGS_UNCACHED_V4;
> -	}
> -
>  	/* Clear unpredictable bits [13:0] */
>  	ttb = (uint32_t *)(get_ttbr() & ~0x3fff);
>  

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 30/34] ARM: mmu32: move functions into c file
  2023-05-17  9:03 ` [PATCH v2 30/34] ARM: mmu32: move functions into c file Sascha Hauer
@ 2023-05-17 13:48   ` Ahmad Fatoum
  0 siblings, 0 replies; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 13:48 UTC (permalink / raw)
  To: Sascha Hauer, Barebox List

On 17.05.23 11:03, Sascha Hauer wrote:
> Move create_flat_mapping() and create_sections() into the c file
> rather than having them as static inline functions in the header file.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>

Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de>

> ---
>  arch/arm/cpu/mmu_32.c | 19 +++++++++++++++++++
>  arch/arm/cpu/mmu_32.h | 20 --------------------
>  2 files changed, 19 insertions(+), 20 deletions(-)
> 
> diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
> index 829139574c..0762bd55a3 100644
> --- a/arch/arm/cpu/mmu_32.c
> +++ b/arch/arm/cpu/mmu_32.c
> @@ -318,6 +318,25 @@ int arch_remap_range(void *start, size_t size, unsigned map_type)
>  	return 0;
>  }
>  
> +static void create_sections(uint32_t *ttb, unsigned long first,
> +			    unsigned long last, unsigned int flags)
> +{
> +	unsigned long ttb_start = pgd_index(first);
> +	unsigned long ttb_end = pgd_index(last) + 1;
> +	unsigned int i, addr = first;
> +
> +	for (i = ttb_start; i < ttb_end; i++) {
> +		ttb[i] = addr | flags;
> +		addr += PGDIR_SIZE;
> +	}
> +}
> +
> +static void create_flat_mapping(uint32_t *ttb)
> +{
> +	/* create a flat mapping using 1MiB sections */
> +	create_sections(ttb, 0, 0xffffffff, attrs_uncached_mem());
> +}
> +
>  void *map_io_sections(unsigned long phys, void *_start, size_t size)
>  {
>  	unsigned long start = (unsigned long)_start, sec;
> diff --git a/arch/arm/cpu/mmu_32.h b/arch/arm/cpu/mmu_32.h
> index 1499b70dd6..607d9e8608 100644
> --- a/arch/arm/cpu/mmu_32.h
> +++ b/arch/arm/cpu/mmu_32.h
> @@ -56,20 +56,6 @@ static inline void set_domain(unsigned val)
>  	asm volatile ("mcr  p15,0,%0,c3,c0,0" : : "r"(val) /*:*/);
>  }
>  
> -static inline void
> -create_sections(uint32_t *ttb, unsigned long first,
> -		unsigned long last, unsigned int flags)
> -{
> -	unsigned long ttb_start = pgd_index(first);
> -	unsigned long ttb_end = pgd_index(last) + 1;
> -	unsigned int i, addr = first;
> -
> -	for (i = ttb_start; i < ttb_end; i++) {
> -		ttb[i] = addr | flags;
> -		addr += PGDIR_SIZE;
> -	}
> -}
> -
>  #define PMD_SECT_DEF_UNCACHED (PMD_SECT_AP_WRITE | PMD_SECT_AP_READ | PMD_TYPE_SECT)
>  #define PMD_SECT_DEF_CACHED (PMD_SECT_WB | PMD_SECT_DEF_UNCACHED)
>  
> @@ -83,10 +69,4 @@ static inline unsigned long attrs_uncached_mem(void)
>  	return flags;
>  }
>  
> -static inline void create_flat_mapping(uint32_t *ttb)
> -{
> -	/* create a flat mapping using 1MiB sections */
> -	create_sections(ttb, 0, 0xffffffff, attrs_uncached_mem());
> -}
> -
>  #endif /* __ARM_MMU_H */

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 31/34] ARM: mmu32: read TTB value from register
  2023-05-17  9:03 ` [PATCH v2 31/34] ARM: mmu32: read TTB value from register Sascha Hauer
@ 2023-05-17 13:58   ` Ahmad Fatoum
  2023-05-17 14:39     ` Sascha Hauer
  0 siblings, 1 reply; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 13:58 UTC (permalink / raw)
  To: Sascha Hauer, Barebox List

On 17.05.23 11:03, Sascha Hauer wrote:
> Instead of relying on a variable for the location of the TTB which we
> have to initialize in both PBL and barebox proper, just read the value
> back from the hardware register.

Why not initialize on first call to get_ttb()? 

> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
> ---
>  arch/arm/cpu/mmu_32.c | 41 ++++++++++++++++++++---------------------
>  1 file changed, 20 insertions(+), 21 deletions(-)
> 
> diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
> index 0762bd55a3..785b20c7fd 100644
> --- a/arch/arm/cpu/mmu_32.c
> +++ b/arch/arm/cpu/mmu_32.c
> @@ -24,7 +24,11 @@
>  #define PTRS_PER_PTE		(PGDIR_SIZE / PAGE_SIZE)
>  #define ARCH_MAP_WRITECOMBINE	((unsigned)-1)
>  
> -static uint32_t *ttb;
> +static inline uint32_t *get_ttb(void)
> +{
> +	/* Clear unpredictable bits [13:0] */
> +	return (uint32_t *)(get_ttbr() & ~0x3fff);
> +}
>  
>  /*
>   * Do it the simple way for now and invalidate the entire
> @@ -77,7 +81,7 @@ static uint32_t *alloc_pte(void)
>  	if (idx * PTE_SIZE >= ARM_EARLY_PAGETABLE_SIZE)
>  		return NULL;
>  
> -	return (void *)ttb + idx * PTE_SIZE;
> +	return get_ttb() + idx * PTE_SIZE;
>  }
>  #else
>  static uint32_t *alloc_pte(void)
> @@ -89,9 +93,7 @@ static uint32_t *alloc_pte(void)
>  static u32 *find_pte(unsigned long adr)
>  {
>  	u32 *table;
> -
> -	if (!ttb)
> -		arm_mmu_not_initialized_error();
> +	uint32_t *ttb = get_ttb();
>  
>  	if (!pgd_type_table(ttb[pgd_index(adr)]))
>  		return NULL;
> @@ -130,6 +132,7 @@ void dma_inv_range(void *ptr, size_t size)
>   */
>  static u32 *arm_create_pte(unsigned long virt, uint32_t flags)
>  {
> +	uint32_t *ttb = get_ttb();
>  	u32 *table;
>  	int i, ttb_idx;
>  
> @@ -137,9 +140,6 @@ static u32 *arm_create_pte(unsigned long virt, uint32_t flags)
>  
>  	table = alloc_pte();
>  
> -	if (!ttb)
> -		arm_mmu_not_initialized_error();
> -
>  	ttb_idx = pgd_index(virt);
>  
>  	for (i = 0; i < PTRS_PER_PTE; i++) {
> @@ -247,6 +247,7 @@ int arch_remap_range(void *start, size_t size, unsigned map_type)
>  {
>  	u32 addr = (u32)start;
>  	u32 pte_flags, pmd_flags;
> +	uint32_t *ttb = get_ttb();
>  
>  	BUG_ON(!IS_ALIGNED(addr, PAGE_SIZE));
>  
> @@ -318,9 +319,10 @@ int arch_remap_range(void *start, size_t size, unsigned map_type)
>  	return 0;
>  }
>  
> -static void create_sections(uint32_t *ttb, unsigned long first,
> -			    unsigned long last, unsigned int flags)
> +static void create_sections(unsigned long first, unsigned long last,
> +			    unsigned int flags)
>  {
> +	uint32_t *ttb = get_ttb();
>  	unsigned long ttb_start = pgd_index(first);
>  	unsigned long ttb_end = pgd_index(last) + 1;
>  	unsigned int i, addr = first;
> @@ -331,15 +333,16 @@ static void create_sections(uint32_t *ttb, unsigned long first,
>  	}
>  }
>  
> -static void create_flat_mapping(uint32_t *ttb)
> +static inline void create_flat_mapping(void)
>  {
>  	/* create a flat mapping using 1MiB sections */
> -	create_sections(ttb, 0, 0xffffffff, attrs_uncached_mem());
> +	create_sections(0, 0xffffffff, attrs_uncached_mem());
>  }
>  
>  void *map_io_sections(unsigned long phys, void *_start, size_t size)
>  {
>  	unsigned long start = (unsigned long)_start, sec;
> +	uint32_t *ttb = get_ttb();
>  
>  	for (sec = start; sec < start + size; sec += PGDIR_SIZE, phys += PGDIR_SIZE)
>  		ttb[pgd_index(sec)] = phys | get_pmd_flags(MAP_UNCACHED);
> @@ -497,9 +500,7 @@ static void vectors_init(void)
>  void __mmu_init(bool mmu_on)
>  {
>  	struct memory_bank *bank;
> -
> -	/* Clear unpredictable bits [13:0] */
> -	ttb = (uint32_t *)(get_ttbr() & ~0x3fff);
> +	uint32_t *ttb = get_ttb();
>  
>  	if (!request_sdram_region("ttb", (unsigned long)ttb, SZ_16K))
>  		/*
> @@ -517,7 +518,7 @@ void __mmu_init(bool mmu_on)
>  	vectors_init();
>  
>  	for_each_memory_bank(bank) {
> -		create_sections(ttb, bank->start, bank->start + bank->size - 1,
> +		create_sections(bank->start, bank->start + bank->size - 1,
>  				PMD_SECT_DEF_CACHED);
>  		__mmu_cache_flush();
>  	}
> @@ -541,8 +542,6 @@ void *dma_alloc_writecombine(size_t size, dma_addr_t *dma_handle)
>  	return dma_alloc_map(size, dma_handle, ARCH_MAP_WRITECOMBINE);
>  }
>  
> -static uint32_t *ttb;
> -
>  static inline void map_region(unsigned long start, unsigned long size,
>  			      uint64_t flags)
>  
> @@ -550,12 +549,12 @@ static inline void map_region(unsigned long start, unsigned long size,
>  	start = ALIGN_DOWN(start, SZ_1M);
>  	size  = ALIGN(size, SZ_1M);
>  
> -	create_sections(ttb, start, start + size - 1, flags);
> +	create_sections(start, start + size - 1, flags);
>  }
>  
>  void mmu_early_enable(unsigned long membase, unsigned long memsize)
>  {
> -	ttb = (uint32_t *)arm_mem_ttb(membase, membase + memsize);
> +	uint32_t *ttb = (uint32_t *)arm_mem_ttb(membase + memsize);
>  
>  	pr_debug("enabling MMU, ttb @ 0x%p\n", ttb);
>  
> @@ -571,7 +570,7 @@ void mmu_early_enable(unsigned long membase, unsigned long memsize)
>  	 * This marks the whole address space as uncachable as well as
>  	 * unexecutable if possible
>  	 */
> -	create_flat_mapping(ttb);
> +	create_flat_mapping();
>  
>  	/*
>  	 * There can be SoCs that have a section shared between device memory

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 32/34] ARM: mmu32: Use pages for early MMU setup
  2023-05-17  9:03 ` [PATCH v2 32/34] ARM: mmu32: Use pages for early MMU setup Sascha Hauer
@ 2023-05-17 14:21   ` Ahmad Fatoum
  2023-05-22  8:14     ` Sascha Hauer
  0 siblings, 1 reply; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 14:21 UTC (permalink / raw)
  To: Sascha Hauer, Barebox List

On 17.05.23 11:03, Sascha Hauer wrote:
> Up to now we use 1MiB sections to set up the page tables in PBL. There
> are two places where this leads to problems. First is OP-TEE, we have
> to map the OP-TEE area with PTE_EXT_XN to prevent the instruction
> prefetcher from speculating into that area. With the current section
> mapping we have to align OPTEE_SIZE to 1MiB boundaries. The second
> problem comes with SRAM where the PBL might be running. This SRAM has
> to be mapped executable, but at the same time we should map the
> surrounding areas non executable which is not always possible with
> 1MiB mapping granularity.
> 
> We now have everything in place to use two level page tables from PBL,
> so use arch_remap_range() for the problematic cases.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
> ---
>  arch/arm/cpu/mmu_32.c | 31 +++++++------------------------
>  1 file changed, 7 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
> index 785b20c7fd..705d27a045 100644
> --- a/arch/arm/cpu/mmu_32.c
> +++ b/arch/arm/cpu/mmu_32.c
> @@ -111,8 +111,10 @@ void dma_flush_range(void *ptr, size_t size)
>  	unsigned long end = start + size;
>  
>  	__dma_flush_range(start, end);
> +#ifndef __PBL__
>  	if (outer_cache.flush_range)
>  		outer_cache.flush_range(start, end);
> +#endif

Meh. I see why this is ok (L2X0 is currently initialized in an initcall), but this
#ifdef looks a bit too fragile. Perhaps we could do this in <asm/mmu.h> instead?

#ifdef __PBL__
/* Existing platforms with non-architected outer cache initialize it
 * outside PBL and new ones will likely only have architected caches,
 * so we provide a dummy here
 */
static __maybe_unused struct outer_cache_fns outer_cache;
#else
extern struct outer_cache_fns outer_cache;
#endif

>  }
>  
>  void dma_inv_range(void *ptr, size_t size)
> @@ -120,8 +122,10 @@ void dma_inv_range(void *ptr, size_t size)
>  	unsigned long start = (unsigned long)ptr;
>  	unsigned long end = start + size;
>  
> +#ifndef __PBL__
>  	if (outer_cache.inv_range)
>  		outer_cache.inv_range(start, end);
> +#endif
>  	__dma_inv_range(start, end);
>  }
>  
> @@ -542,16 +546,6 @@ void *dma_alloc_writecombine(size_t size, dma_addr_t *dma_handle)
>  	return dma_alloc_map(size, dma_handle, ARCH_MAP_WRITECOMBINE);
>  }
>  
> -static inline void map_region(unsigned long start, unsigned long size,
> -			      uint64_t flags)
> -
> -{
> -	start = ALIGN_DOWN(start, SZ_1M);
> -	size  = ALIGN(size, SZ_1M);
> -
> -	create_sections(start, start + size - 1, flags);
> -}
> -
>  void mmu_early_enable(unsigned long membase, unsigned long memsize)
>  {
>  	uint32_t *ttb = (uint32_t *)arm_mem_ttb(membase + memsize);
> @@ -572,21 +566,10 @@ void mmu_early_enable(unsigned long membase, unsigned long memsize)
>  	 */
>  	create_flat_mapping();
>  
> -	/*
> -	 * There can be SoCs that have a section shared between device memory
> -	 * and the on-chip RAM hosting the PBL. Thus mark this section
> -	 * uncachable, but executable.
> -	 * On such SoCs, executing from OCRAM could cause the instruction
> -	 * prefetcher to speculatively access that device memory, triggering
> -	 * potential errant behavior.
> -	 *
> -	 * If your SoC has such a memory layout, you should rewrite the code
> -	 * here to map the OCRAM page-wise.
> -	 */
> -	map_region((unsigned long)_stext, _etext - _stext, PMD_SECT_DEF_UNCACHED);
> -
>  	/* maps main memory as cachable */
> -	map_region(membase, memsize - OPTEE_SIZE, PMD_SECT_DEF_CACHED);
> +	arch_remap_range((void *)membase, memsize - OPTEE_SIZE, MAP_CACHED);
> +	arch_remap_range((void *)membase + memsize - OPTEE_SIZE, OPTEE_SIZE, MAP_UNCACHED);
> +	arch_remap_range(_stext, PAGE_ALIGN(_etext - _stext), MAP_CACHED);

Rest looks fine. With above point addressed:

Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de>

>  
>  	__mmu_cache_on();
>  }

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 31/34] ARM: mmu32: read TTB value from register
  2023-05-17 13:58   ` Ahmad Fatoum
@ 2023-05-17 14:39     ` Sascha Hauer
  2023-05-19  6:53       ` Ahmad Fatoum
  0 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17 14:39 UTC (permalink / raw)
  To: Ahmad Fatoum; +Cc: Barebox List

On Wed, May 17, 2023 at 03:58:01PM +0200, Ahmad Fatoum wrote:
> On 17.05.23 11:03, Sascha Hauer wrote:
> > Instead of relying on a variable for the location of the TTB which we
> > have to initialize in both PBL and barebox proper, just read the value
> > back from the hardware register.
> 
> Why not initialize on first call to get_ttb()?

get_ttb() doesn't have access to endmem, which we would need to get the
address for the ttb.

Also, we have the value in the hardware register, so why not use it?

Sascha


-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 33/34] ARM: mmu32: Skip reserved ranges during initialization
  2023-05-17  9:03 ` [PATCH v2 33/34] ARM: mmu32: Skip reserved ranges during initialization Sascha Hauer
@ 2023-05-17 14:43   ` Ahmad Fatoum
  2023-05-17 14:55     ` Sascha Hauer
  0 siblings, 1 reply; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 14:43 UTC (permalink / raw)
  To: Sascha Hauer, Barebox List

On 17.05.23 11:03, Sascha Hauer wrote:
> The early MMU code now uses pages to map the OP-TEE area non executable.
> This mapping is overwritten with sections in barebox proper. Refrain
> from doing so by using arch_remap_range() and bypassing reserved areas.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
> ---
>  arch/arm/cpu/mmu_32.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
> index 705d27a045..47711bed35 100644
> --- a/arch/arm/cpu/mmu_32.c
> +++ b/arch/arm/cpu/mmu_32.c
> @@ -522,9 +522,17 @@ void __mmu_init(bool mmu_on)
>  	vectors_init();

Took me a while to parse the below code, so I think a comment may be apt:

/*
 * Early mmu init will have mapped everything but the initial memory area
 * (excluding final OPTEE_SIZE bytes) uncached. We have now discovered
 * all memory banks, so let's map all pages, excluding reserved memory areas,
 * cacheable and executable.
 */

>  
>  	for_each_memory_bank(bank) {
> -		create_sections(bank->start, bank->start + bank->size - 1,
> -				PMD_SECT_DEF_CACHED);
> -		__mmu_cache_flush();
> +		struct resource *rsv;
> +		resource_size_t pos;
> +
> +		pos = bank->start;
> +
> +		for_each_reserved_region(bank, rsv) {
> +			arch_remap_range((void *)pos, rsv->start - pos, MAP_CACHED);
> +			pos = rsv->end + 1;
> +		}
> +
> +		arch_remap_range((void *)pos, bank->start + bank->size - pos, MAP_CACHED);

I am a bit bothered by the asymmetry here: Reserved regions in the extra memory banks
will be initially uncached (because outside initmem), so this loop does the correct thing.

For reserved regions within the initial memory, only the OP-TEE region would be uncached,
everything else would just be requested, but is still mapped cacheable.

IMO, that's surprising behavior.

Cheers,
Ahmad

>  	}
>  }
>  

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 28/34] ARM: mmu32: Add pte_flags_to_pmd()
  2023-05-17 13:43   ` Ahmad Fatoum
@ 2023-05-17 14:44     ` Sascha Hauer
  0 siblings, 0 replies; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17 14:44 UTC (permalink / raw)
  To: Ahmad Fatoum; +Cc: Barebox List

On Wed, May 17, 2023 at 03:43:51PM +0200, Ahmad Fatoum wrote:
> On 17.05.23 11:03, Sascha Hauer wrote:
> > Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
> > ---
> >  arch/arm/cpu/mmu_32.c | 35 +++++++++++++++++++++++++++++------
> >  1 file changed, 29 insertions(+), 6 deletions(-)
> > 
> > diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
> > index 4abaab7d87..0af89ac39c 100644
> > --- a/arch/arm/cpu/mmu_32.c
> > +++ b/arch/arm/cpu/mmu_32.c
> > @@ -187,30 +187,53 @@ static u32 pmd_flags_to_pte(u32 pmd)
> >  	return pte;
> >  }
> >  
> > +static u32 pte_flags_to_pmd(u32 pte)
> > +{
> > +	u32 pmd = 0;
> > +
> > +	if (pte & PTE_BUFFERABLE)
> > +		pmd |= PMD_SECT_BUFFERABLE;
> > +	if (pte & PTE_CACHEABLE)
> > +		pmd |= PMD_SECT_CACHEABLE;
> > +
> > +	if (cpu_architecture() >= CPU_ARCH_ARMv7) {
> > +		if (pte & PTE_EXT_NG)
> > +			pmd |= PMD_SECT_nG;
> > +		if (pte & PTE_EXT_XN)
> > +			pmd |= PMD_SECT_XN;
> 
> Note that at least,  <asm/pgtable.h> claims these bits are v6+:
> 
> #define PTE_EXT_XN		(1 << 0)	/* v6 */
> #define PMD_SECT_nG		(1 << 17)	/* v6 */

Yes, I know. We only seem to use them on ARMv7+ though, so I decided to
honour these flags on ARMv7+ only.

Sascha

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 33/34] ARM: mmu32: Skip reserved ranges during initialization
  2023-05-17 14:43   ` Ahmad Fatoum
@ 2023-05-17 14:55     ` Sascha Hauer
  2023-05-17 15:56       ` Ahmad Fatoum
  0 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-17 14:55 UTC (permalink / raw)
  To: Ahmad Fatoum; +Cc: Barebox List

On Wed, May 17, 2023 at 04:43:39PM +0200, Ahmad Fatoum wrote:
> On 17.05.23 11:03, Sascha Hauer wrote:
> > The early MMU code now uses pages to map the OP-TEE area non executable.
> > This mapping is overwritten with sections in barebox proper. Refrain
> > from doing so by using arch_remap_range() and bypassing reserved areas.
> > 
> > Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
> > ---
> >  arch/arm/cpu/mmu_32.c | 14 +++++++++++---
> >  1 file changed, 11 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
> > index 705d27a045..47711bed35 100644
> > --- a/arch/arm/cpu/mmu_32.c
> > +++ b/arch/arm/cpu/mmu_32.c
> > @@ -522,9 +522,17 @@ void __mmu_init(bool mmu_on)
> >  	vectors_init();
> 
> Took me a while to parse the below code, so I think a comment may be apt:

That's just fair; it likely took me even longer to write it ;)

> 
> /*
>  * Early mmu init will have mapped everything but the initial memory area
>  * (excluding final OPTEE_SIZE bytes) uncached. We have now discovered
>  * all memory banks, so let's map all pages, excluding reserved memory areas,
>  * cacheable and executable.
>  */
> 
> >  
> >  	for_each_memory_bank(bank) {
> > -		create_sections(bank->start, bank->start + bank->size - 1,
> > -				PMD_SECT_DEF_CACHED);
> > -		__mmu_cache_flush();
> > +		struct resource *rsv;
> > +		resource_size_t pos;
> > +
> > +		pos = bank->start;
> > +
> > +		for_each_reserved_region(bank, rsv) {
> > +			arch_remap_range((void *)pos, rsv->start - pos, MAP_CACHED);
> > +			pos = rsv->end + 1;
> > +		}
> > +
> > +		arch_remap_range((void *)pos, bank->start + bank->size - pos, MAP_CACHED);
> 
> I am a bit bothered by the asymmetry here: Reserved regions in the extra memory banks
> will be initially uncached (because outside initmem), so this loop does the correct thing.
> 
> For reserved regions within the initial memory, only the OP-TEE region would be uncached,
> everything else would just be requested, but is still mapped cacheable.
> 
> IMO, that's surprising behavior.

I agree. What would you suggest? Map all reserved regions uncached?

Sascha

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 04/34] ARM: put scratch mem area below OP-TEE
  2023-05-17 13:14     ` Sascha Hauer
@ 2023-05-17 15:50       ` Ahmad Fatoum
  0 siblings, 0 replies; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 15:50 UTC (permalink / raw)
  To: Sascha Hauer; +Cc: Barebox List

On 17.05.23 15:14, Sascha Hauer wrote:
> On Wed, May 17, 2023 at 02:48:43PM +0200, Ahmad Fatoum wrote:
>> On 17.05.23 11:03, Sascha Hauer wrote:
>>> We want to reserve memory for OP-TEE at the end of available SDRAM,
>>> so move the scratch area below OP-TEE and not above.
>>>
>>> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
>>> ---
>>>  arch/arm/include/asm/barebox-arm.h | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/arm/include/asm/barebox-arm.h b/arch/arm/include/asm/barebox-arm.h
>>> index f446044be6..6e6606d005 100644
>>> --- a/arch/arm/include/asm/barebox-arm.h
>>> +++ b/arch/arm/include/asm/barebox-arm.h
>>> @@ -71,14 +71,14 @@ static inline void arm_fixup_vectors(void)
>>>  
>>>  void *barebox_arm_boot_dtb(void);
>>>  
>>> -#define arm_mem_scratch(endmem) ((endmem) - SZ_32K)
>>> +#define arm_mem_scratch(endmem) ((endmem) - OPTEE_SIZE - SZ_32K)
>>>  
>>>  static inline const void *arm_mem_scratch_get(void)
>>>  {
>>>  	return (const void *)arm_mem_scratch(arm_mem_endmem_get());
>>>  }
>>>  
>>> -#define arm_mem_stack_top(endmem) ((endmem) - SZ_64K - OPTEE_SIZE)
>>> +#define arm_mem_stack_top(endmem) ((endmem) - SZ_64K)
>>
>> I don't understand why you drop OPTEE_SIZE here. Wouldn't the stack
>> now eat into the OP-TEE region?
> 
> I accidentally thought that arm_mem_stack_top() is calculated based on the
> region above it, namely arm_mem_scratch(), but really it's calculated
> based on endmem directly.
> 
> Indeed it's wrong like this, it should be:
> 
> #define arm_mem_stack_top(endmem) (arm_mem_scratch(endmem) - SZ_64K)
> 
> I just stumbled upon the SZ_64K here. I followed the value back to 2016
> and found 75c96bd2459e ("ARM: Do not use last 64KiB of address space for
> barebox"). I had a board that time that has SDRAM at the very end of the
> 32bit address space. On that board it happened that we overwrite parts
> of the lowlevel memory with the vector table. It seems that has been
> lost over time as now we put the scratch space and possibly parts of
> OP-TEE into the last 64k.

Scratch space at -SZ_32K leaves 32K for IVT, which should be enough, no?
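
To make the layout concrete, here is a hypothetical worked example; the
endmem and OPTEE_SIZE values below are purely illustrative, not taken from
the patches:

/*
 * Assuming endmem = 0xc0000000 and OPTEE_SIZE = SZ_32M (0x02000000):
 *
 *   OP-TEE start: endmem - OPTEE_SIZE               = 0xbe000000
 *   scratch:      arm_mem_scratch(endmem)
 *                 = endmem - OPTEE_SIZE - SZ_32K    = 0xbdff8000
 *   stack top:    arm_mem_scratch(endmem) - SZ_64K  = 0xbdfe8000
 *
 * With the unmodified "#define arm_mem_stack_top(endmem) ((endmem) - SZ_64K)"
 * the stack top would be at 0xbfff0000, i.e. inside the OP-TEE region,
 * which is what the corrected definition avoids.
 */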

> 
> Sascha
> 

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 33/34] ARM: mmu32: Skip reserved ranges during initialization
  2023-05-17 14:55     ` Sascha Hauer
@ 2023-05-17 15:56       ` Ahmad Fatoum
  0 siblings, 0 replies; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-17 15:56 UTC (permalink / raw)
  To: Sascha Hauer; +Cc: Barebox List

On 17.05.23 16:55, Sascha Hauer wrote:
> On Wed, May 17, 2023 at 04:43:39PM +0200, Ahmad Fatoum wrote:
>> I am a bit bothered by the asymmetry here: Reserved regions in the extra memory banks
>> will be initially uncached (because outside initmem), so this loop does the correct thing.
>>
>> For reserved regions within the initial memory, only the OP-TEE region would be uncached,
>> everything else would just be requested, but is still mapped cacheable.
>>
>> IMO, that's surprising behavior.
> 
> I agree. What would you suggest. Map all reserved regions uncached?

That would be my preference, yes. Then, as a future step, we could elect to parse the
reserved entries in PBL to be able to handle OP-TEE or whatever even in the middle of RAM.
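
A minimal sketch of what that could look like in the __mmu_init() bank loop,
reusing the arch_remap_range()/for_each_reserved_region() interfaces from the
patch (this is a hypothetical follow-up, not part of the posted series):

for_each_memory_bank(bank) {
	struct resource *rsv;
	resource_size_t pos = bank->start;

	for_each_reserved_region(bank, rsv) {
		/* cacheable mapping up to the start of the reserved region */
		arch_remap_range((void *)pos, rsv->start - pos, MAP_CACHED);
		/* the reserved region itself stays uncached */
		arch_remap_range((void *)rsv->start, rsv->end - rsv->start + 1,
				 MAP_UNCACHED);
		pos = rsv->end + 1;
	}

	/* remainder of the bank is cacheable */
	arch_remap_range((void *)pos, bank->start + bank->size - pos, MAP_CACHED);
}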

Cheers,
Ahmad

> 
> Sascha
> 

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 31/34] ARM: mmu32: read TTB value from register
  2023-05-17 14:39     ` Sascha Hauer
@ 2023-05-19  6:53       ` Ahmad Fatoum
  2023-05-19  7:44         ` Sascha Hauer
  0 siblings, 1 reply; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-19  6:53 UTC (permalink / raw)
  To: Sascha Hauer; +Cc: Barebox List

On 17.05.23 16:39, Sascha Hauer wrote:
> On Wed, May 17, 2023 at 03:58:01PM +0200, Ahmad Fatoum wrote:
>> On 17.05.23 11:03, Sascha Hauer wrote:
>>> Instead of relying on a variable for the location of the TTB which we
>>> have to initialize in both PBL and barebox proper, just read the value
>>> back from the hardware register.
>>
>> Why not initialize on first call to get_ttb()?
> 
> get_ttb() doesn't have access to endmem which we would need to get the
> address for the ttb.
> 
> Also we have the value in the hardware register, why not use it?

I meant initialization using the hardware register.

> 
> Sascha
> 
> 

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 31/34] ARM: mmu32: read TTB value from register
  2023-05-19  6:53       ` Ahmad Fatoum
@ 2023-05-19  7:44         ` Sascha Hauer
  2023-05-19  7:52           ` Ahmad Fatoum
  0 siblings, 1 reply; 68+ messages in thread
From: Sascha Hauer @ 2023-05-19  7:44 UTC (permalink / raw)
  To: Ahmad Fatoum; +Cc: Barebox List

On Fri, May 19, 2023 at 08:53:34AM +0200, Ahmad Fatoum wrote:
> On 17.05.23 16:39, Sascha Hauer wrote:
> > On Wed, May 17, 2023 at 03:58:01PM +0200, Ahmad Fatoum wrote:
> >> On 17.05.23 11:03, Sascha Hauer wrote:
> >>> Instead of relying on a variable for the location of the TTB which we
> >>> have to initialize in both PBL and barebox proper, just read the value
> >>> back from the hardware register.
> >>
> >> Why not initialize on first call to get_ttb()?
> > 
> > get_ttb() doesn't have access to endmem which we would need to get the
> > address for the ttb.
> > 
> > Also we have the value in the hardware register, why not use it?
> 
> I meant initialization using the hardware register.

You mean something like:

static uint32_t *ttb;

static inline uint32_t *get_ttb(void)
{
	if (!ttb)
		ttb = (uint32_t *)(get_ttbr() & ~0x3fff);

	return ttb;
}

If yes, I don't know what this is good for. If no, please explain; I
don't seem to understand what you mean.

Sascha

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 31/34] ARM: mmu32: read TTB value from register
  2023-05-19  7:44         ` Sascha Hauer
@ 2023-05-19  7:52           ` Ahmad Fatoum
  0 siblings, 0 replies; 68+ messages in thread
From: Ahmad Fatoum @ 2023-05-19  7:52 UTC (permalink / raw)
  To: Sascha Hauer; +Cc: Barebox List

On 19.05.23 09:44, Sascha Hauer wrote:
> On Fri, May 19, 2023 at 08:53:34AM +0200, Ahmad Fatoum wrote:
>> On 17.05.23 16:39, Sascha Hauer wrote:
>>> On Wed, May 17, 2023 at 03:58:01PM +0200, Ahmad Fatoum wrote:
>>>> On 17.05.23 11:03, Sascha Hauer wrote:
>>>>> Instead of relying on a variable for the location of the TTB which we
>>>>> have to initialize in both PBL and barebox proper, just read the value
>>>>> back from the hardware register.
>>>>
>>>> Why not initialize on first call to get_ttb()?
>>>
>>> get_ttb() doesn't have access to endmem which we would need to get the
>>> address for the ttb.
>>>
>>> Also we have the value in the hardware register, why not use it?
>>
>> I meant initialization using the hardware register.
> 
> You mean something like:
> 
> static uint32_t *ttb;
> 
> static inline uint32_t *get_ttb(void)
> {
> 	if (!ttb)
> 		ttb = (uint32_t *)(get_ttbr() & ~0x3fff);
> 
> 	return ttb;
> }
> 
> If yes, I don't know what this is good for. If no, please explain, I
> don't seem to understand what you mean.

That's what I mean, yes (but with ttb's scope limited to get_ttb()).

There are instances where get_ttb is called in a loop. Having a static
variable would mimic more closely the code we have before. I see now that
the Cortex-A9 TRM lists MRS cycle time as single cycle, so the static
variable seems indeed unnecessary.
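
For the record, the variant meant above (a function-local static, filled from
the hardware register on first call and ultimately judged unnecessary) would
look roughly like this:

static inline uint32_t *get_ttb(void)
{
	static uint32_t *ttb;

	/* Clear unpredictable bits [13:0] on first use, reuse the cached value after that */
	if (!ttb)
		ttb = (uint32_t *)(get_ttbr() & ~0x3fff);

	return ttb;
}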

> 
> Sascha
> 

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 32/34] ARM: mmu32: Use pages for early MMU setup
  2023-05-17 14:21   ` Ahmad Fatoum
@ 2023-05-22  8:14     ` Sascha Hauer
  0 siblings, 0 replies; 68+ messages in thread
From: Sascha Hauer @ 2023-05-22  8:14 UTC (permalink / raw)
  To: Ahmad Fatoum; +Cc: Barebox List

On Wed, May 17, 2023 at 04:21:32PM +0200, Ahmad Fatoum wrote:
> On 17.05.23 11:03, Sascha Hauer wrote:
> > Up to now we use 1MiB sections to set up the page tables in PBL. There
> > are two places where this leads to problems. First is OP-TEE, we have
> > to map the OP-TEE area with PTE_EXT_XN to prevent the instruction
> > prefetcher from speculating into that area. With the current section
> > mapping we have to align OPTEE_SIZE to 1MiB boundaries. The second
> > problem comes with SRAM where the PBL might be running. This SRAM has
> > to be mapped executable, but at the same time we should map the
> > surrounding areas non executable which is not always possible with
> > 1MiB mapping granularity.
> > 
> > We now have everything in place to use two level page tables from PBL,
> > so use arch_remap_range() for the problematic cases.
> > 
> > Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
> > ---
> >  arch/arm/cpu/mmu_32.c | 31 +++++++------------------------
> >  1 file changed, 7 insertions(+), 24 deletions(-)
> > 
> > diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
> > index 785b20c7fd..705d27a045 100644
> > --- a/arch/arm/cpu/mmu_32.c
> > +++ b/arch/arm/cpu/mmu_32.c
> > @@ -111,8 +111,10 @@ void dma_flush_range(void *ptr, size_t size)
> >  	unsigned long end = start + size;
> >  
> >  	__dma_flush_range(start, end);
> > +#ifndef __PBL__
> >  	if (outer_cache.flush_range)
> >  		outer_cache.flush_range(start, end);
> > +#endif
> 
> Meh. I see why this is ok (L2X0 currently initialized in initcall), but this
> #ifdef looks a bit too fragile. Perhaps, we could do this in <asm/mmu.h> instead?
> 
> #ifdef __PBL__
> /* Existing platforms with non-architected outer cache initialize it
>  * outside PBL and new ones will likely only have architected caches,
>  * so we provide a dummy here
>  */
> static __maybe_unused struct outer_cache_fns outer_cache;
> #else
> extern struct outer_cache_fns outer_cache;
> #endif

Ok.

Sascha

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |



^ permalink raw reply	[flat|nested] 68+ messages in thread

end of thread, other threads:[~2023-05-22  8:16 UTC | newest]

Thread overview: 68+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-17  9:03 [PATCH v2 00/34] ARM: MMU rework Sascha Hauer
2023-05-17  9:03 ` [PATCH v2 01/34] ARM: remove unused membase argument Sascha Hauer
2023-05-17 12:45   ` Ahmad Fatoum
2023-05-17  9:03 ` [PATCH v2 02/34] ARM: remove unused define Sascha Hauer
2023-05-17 12:45   ` Ahmad Fatoum
2023-05-17  9:03 ` [PATCH v2 03/34] ARM: rename __arm_mem_scratch to arm_mem_scratch Sascha Hauer
2023-05-17 12:46   ` Ahmad Fatoum
2023-05-17  9:03 ` [PATCH v2 04/34] ARM: put scratch mem area below OP-TEE Sascha Hauer
2023-05-17 12:48   ` Ahmad Fatoum
2023-05-17 13:14     ` Sascha Hauer
2023-05-17 15:50       ` Ahmad Fatoum
2023-05-17  9:03 ` [PATCH v2 05/34] ARM: add arm_mem_optee() Sascha Hauer
2023-05-17 12:53   ` Ahmad Fatoum
2023-05-17  9:03 ` [PATCH v2 06/34] ARM: make arm_mem_scratch() a static inline function Sascha Hauer
2023-05-17 12:53   ` Ahmad Fatoum
2023-05-17  9:03 ` [PATCH v2 07/34] ARM: define stack base consistently Sascha Hauer
2023-05-17 12:55   ` Ahmad Fatoum
2023-05-17  9:03 ` [PATCH v2 08/34] ARM: move arm_mem_scratch_get() lower for consistency Sascha Hauer
2023-05-17 12:57   ` Ahmad Fatoum
2023-05-17  9:03 ` [PATCH v2 09/34] ARM: drop cache function initialization Sascha Hauer
2023-05-17  9:03 ` [PATCH v2 10/34] ARM: Add _32 suffix to aarch32 specific filenames Sascha Hauer
2023-05-17  9:03 ` [PATCH v2 11/34] ARM: cpu.c: remove unused include Sascha Hauer
2023-05-17  9:03 ` [PATCH v2 12/34] ARM: mmu-common.c: use common mmu include Sascha Hauer
2023-05-17  9:03 ` [PATCH v2 13/34] ARM: mmu32: rename mmu.h to mmu_32.h Sascha Hauer
2023-05-17  9:03 ` [PATCH v2 14/34] ARM: mmu: implement MAP_FAULT Sascha Hauer
2023-05-17  9:03 ` [PATCH v2 15/34] ARM: mmu64: Use arch_remap_range where possible Sascha Hauer
2023-05-17  9:03 ` [PATCH v2 16/34] ARM: mmu32: implement zero_page_*() Sascha Hauer
2023-05-17  9:03 ` [PATCH v2 17/34] ARM: i.MX: Drop HAB workaround Sascha Hauer
2023-05-17 13:01   ` Ahmad Fatoum
2023-05-17  9:03 ` [PATCH v2 18/34] ARM: Move early MMU after malloc initialization Sascha Hauer
2023-05-17  9:03 ` [PATCH v2 19/34] ARM: mmu: move dma_sync_single_for_device to extra file Sascha Hauer
2023-05-17  9:03 ` [PATCH v2 20/34] ARM: mmu: merge mmu-early_xx.c into mmu_xx.c Sascha Hauer
2023-05-17  9:03 ` [PATCH v2 21/34] ARM: mmu: alloc 64k for early page tables Sascha Hauer
2023-05-17 13:03   ` Ahmad Fatoum
2023-05-17  9:03 ` [PATCH v2 22/34] ARM: mmu32: create alloc_pte() Sascha Hauer
2023-05-17 13:07   ` Ahmad Fatoum
2023-05-17  9:03 ` [PATCH v2 23/34] ARM: mmu64: " Sascha Hauer
2023-05-17 13:15   ` Ahmad Fatoum
2023-05-17 13:17   ` Ahmad Fatoum
2023-05-17  9:03 ` [PATCH v2 24/34] ARM: mmu: drop ttb argument Sascha Hauer
2023-05-17 13:23   ` Ahmad Fatoum
2023-05-17  9:03 ` [PATCH v2 25/34] ARM: mmu: always do MMU initialization early when MMU is enabled Sascha Hauer
2023-05-17 13:29   ` Ahmad Fatoum
2023-05-17  9:03 ` [PATCH v2 26/34] ARM: mmu32: Assume MMU is on Sascha Hauer
2023-05-17 13:36   ` Ahmad Fatoum
2023-05-17  9:03 ` [PATCH v2 27/34] ARM: mmu32: Fix pmd_flags_to_pte() for ARMv4/5/6 Sascha Hauer
2023-05-17 13:39   ` Ahmad Fatoum
2023-05-17  9:03 ` [PATCH v2 28/34] ARM: mmu32: Add pte_flags_to_pmd() Sascha Hauer
2023-05-17 13:43   ` Ahmad Fatoum
2023-05-17 14:44     ` Sascha Hauer
2023-05-17  9:03 ` [PATCH v2 29/34] ARM: mmu32: add get_pte_flags, get_pmd_flags Sascha Hauer
2023-05-17 13:46   ` Ahmad Fatoum
2023-05-17  9:03 ` [PATCH v2 30/34] ARM: mmu32: move functions into c file Sascha Hauer
2023-05-17 13:48   ` Ahmad Fatoum
2023-05-17  9:03 ` [PATCH v2 31/34] ARM: mmu32: read TTB value from register Sascha Hauer
2023-05-17 13:58   ` Ahmad Fatoum
2023-05-17 14:39     ` Sascha Hauer
2023-05-19  6:53       ` Ahmad Fatoum
2023-05-19  7:44         ` Sascha Hauer
2023-05-19  7:52           ` Ahmad Fatoum
2023-05-17  9:03 ` [PATCH v2 32/34] ARM: mmu32: Use pages for early MMU setup Sascha Hauer
2023-05-17 14:21   ` Ahmad Fatoum
2023-05-22  8:14     ` Sascha Hauer
2023-05-17  9:03 ` [PATCH v2 33/34] ARM: mmu32: Skip reserved ranges during initialization Sascha Hauer
2023-05-17 14:43   ` Ahmad Fatoum
2023-05-17 14:55     ` Sascha Hauer
2023-05-17 15:56       ` Ahmad Fatoum
2023-05-17  9:03 ` [PATCH v2 34/34] ARM: mmu64: Use two level pagetables in early code Sascha Hauer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox