* [PATCH 0/4] ARM: mmu: ensure PTE is written at once
From: Ahmad Fatoum @ 2025-05-21 17:41 UTC
To: barebox
Writing the PTE with a normal, non-volatile access is a bad idea, because
a speculative page-table walk may already consume a partially written entry.

Use WRITE_ONCE to turn this into a volatile write via a new set_pte
helper. The extra helper is useful for type enforcement and for future
extension once we start to consistently apply break-before-make[1].

[1]: https://lore.kernel.org/all/87a5btrcyr.wl-maz@kernel.org/
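To illustrate, the mechanical part of the change is small; a minimal sketch
of the arm32 variant from patch 3 (simplified, not the full diff):

  static void set_pte(uint32_t *pt, uint32_t val)
  {
          /* a single volatile store the compiler cannot tear, drop or merge */
          WRITE_ONCE(*pt, val);
  }

  /* before: a plain store the compiler is free to split or defer */
  ttb[ttb_idx] = (unsigned long)table | PMD_TYPE_TABLE;
  /* after: */
  set_pte(&ttb[ttb_idx], (unsigned long)table | PMD_TYPE_TABLE);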
Ahmad Fatoum (4):
arch: sync READ_ONCE/WRITE_ONCE with Linux
ARM64: mmu: document granule size calculation
ARM: mmu: ensure PTE is written at once
WIP: break before make and don't invalidate uncached regions being
remapped
Documentation/devel/porting.rst | 2 +
arch/arm/cpu/mmu_32.c | 247 +++++++++++++++++++++++-----
arch/arm/cpu/mmu_64.c | 38 ++++-
arch/arm/include/asm/barrier.h | 8 +
arch/kvx/include/asm/barrier.h | 2 +
arch/mips/include/asm/barrier.h | 8 +
arch/openrisc/include/asm/barrier.h | 8 +
arch/powerpc/include/asm/barrier.h | 8 +
arch/sandbox/include/asm/barrier.h | 8 +
arch/x86/include/asm/barrier.h | 8 +
include/asm-generic/barrier.h | 24 +++
include/asm-generic/rwonce.h | 87 ++++++++++
include/linux/bitops.h | 2 +-
include/linux/compiler.h | 94 -----------
include/linux/compiler_types.h | 46 ++++++
include/linux/list.h | 1 +
16 files changed, 449 insertions(+), 142 deletions(-)
create mode 100644 arch/arm/include/asm/barrier.h
create mode 100644 arch/mips/include/asm/barrier.h
create mode 100644 arch/openrisc/include/asm/barrier.h
create mode 100644 arch/powerpc/include/asm/barrier.h
create mode 100644 arch/sandbox/include/asm/barrier.h
create mode 100644 arch/x86/include/asm/barrier.h
create mode 100644 include/asm-generic/barrier.h
create mode 100644 include/asm-generic/rwonce.h
--
2.39.5
* [PATCH 1/4] arch: sync READ_ONCE/WRITE_ONCE with Linux
From: Ahmad Fatoum @ 2025-05-21 17:41 UTC
To: barebox; +Cc: Ahmad Fatoum
In preparation for using READ_ONCE/WRITE_ONCE in the MMU code, let's
sync the implementation with Linux.

This removes ACCESS_ONCE, which has only a single user in barebox
outside of scripts/ anyway.
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
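A conversion note, not part of the diff: reads of ACCESS_ONCE(x) become
READ_ONCE(x) and writes of the form ACCESS_ONCE(x) = val become
WRITE_ONCE(x, val). The bitops.h hunk below shows the read case; the write
case (hypothetical here) would look like:

  old = READ_ONCE(*ptr);     /* was: old = ACCESS_ONCE(*ptr); */
  WRITE_ONCE(*ptr, new);     /* was: ACCESS_ONCE(*ptr) = new; */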
Documentation/devel/porting.rst | 2 +
arch/arm/include/asm/barrier.h | 8 +++
arch/kvx/include/asm/barrier.h | 2 +
arch/mips/include/asm/barrier.h | 8 +++
arch/openrisc/include/asm/barrier.h | 8 +++
arch/powerpc/include/asm/barrier.h | 8 +++
arch/sandbox/include/asm/barrier.h | 8 +++
arch/x86/include/asm/barrier.h | 8 +++
include/asm-generic/barrier.h | 24 ++++++++
include/asm-generic/rwonce.h | 87 ++++++++++++++++++++++++++
include/linux/bitops.h | 2 +-
include/linux/compiler.h | 94 -----------------------------
include/linux/compiler_types.h | 46 ++++++++++++++
include/linux/list.h | 1 +
14 files changed, 211 insertions(+), 95 deletions(-)
create mode 100644 arch/arm/include/asm/barrier.h
create mode 100644 arch/mips/include/asm/barrier.h
create mode 100644 arch/openrisc/include/asm/barrier.h
create mode 100644 arch/powerpc/include/asm/barrier.h
create mode 100644 arch/sandbox/include/asm/barrier.h
create mode 100644 arch/x86/include/asm/barrier.h
create mode 100644 include/asm-generic/barrier.h
create mode 100644 include/asm-generic/rwonce.h
diff --git a/Documentation/devel/porting.rst b/Documentation/devel/porting.rst
index e3904eea067c..9dab2a301f2a 100644
--- a/Documentation/devel/porting.rst
+++ b/Documentation/devel/porting.rst
@@ -488,6 +488,8 @@ Your architecture needs to implement following headers:
Only if ``HAS_DMA`` is selected by the architecture.
- ``<asm/io.h>``
Defines I/O memory and port accessors
+ - ``<asm/barrier.h>``
+ Can normally just include ``<asm-generic/barrier.h>``
- ``<asm/mmu.h>``
- ``<asm/string.h>``
- ``<asm/swab.h>``
diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
new file mode 100644
index 000000000000..5eb06065cd9f
--- /dev/null
+++ b/arch/arm/include/asm/barrier.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef _ASM_BARRIER_H
+#define _ASM_BARRIER_H
+
+#include <asm-generic/barrier.h>
+
+#endif
diff --git a/arch/kvx/include/asm/barrier.h b/arch/kvx/include/asm/barrier.h
index 616b5f90a230..d08b4a98a62c 100644
--- a/arch/kvx/include/asm/barrier.h
+++ b/arch/kvx/include/asm/barrier.h
@@ -6,6 +6,8 @@
#ifndef _ASM_KVX_BARRIER_H
#define _ASM_KVX_BARRIER_H
+#include <asm-generic/barrier.h>
+
/* fence is sufficient to guarantee write ordering */
#define wmb() __builtin_kvx_fence()
diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
new file mode 100644
index 000000000000..5eb06065cd9f
--- /dev/null
+++ b/arch/mips/include/asm/barrier.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef _ASM_BARRIER_H
+#define _ASM_BARRIER_H
+
+#include <asm-generic/barrier.h>
+
+#endif
diff --git a/arch/openrisc/include/asm/barrier.h b/arch/openrisc/include/asm/barrier.h
new file mode 100644
index 000000000000..5eb06065cd9f
--- /dev/null
+++ b/arch/openrisc/include/asm/barrier.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef _ASM_BARRIER_H
+#define _ASM_BARRIER_H
+
+#include <asm-generic/barrier.h>
+
+#endif
diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h
new file mode 100644
index 000000000000..5eb06065cd9f
--- /dev/null
+++ b/arch/powerpc/include/asm/barrier.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef _ASM_BARRIER_H
+#define _ASM_BARRIER_H
+
+#include <asm-generic/barrier.h>
+
+#endif
diff --git a/arch/sandbox/include/asm/barrier.h b/arch/sandbox/include/asm/barrier.h
new file mode 100644
index 000000000000..5eb06065cd9f
--- /dev/null
+++ b/arch/sandbox/include/asm/barrier.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef _ASM_BARRIER_H
+#define _ASM_BARRIER_H
+
+#include <asm-generic/barrier.h>
+
+#endif
diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
new file mode 100644
index 000000000000..5eb06065cd9f
--- /dev/null
+++ b/arch/x86/include/asm/barrier.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef _ASM_BARRIER_H
+#define _ASM_BARRIER_H
+
+#include <asm-generic/barrier.h>
+
+#endif
diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
new file mode 100644
index 000000000000..fbfb67f3087f
--- /dev/null
+++ b/include/asm-generic/barrier.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Generic barrier definitions.
+ *
+ * It should be possible to use these on really simple architectures,
+ * but it serves more as a starting point for new ports.
+ *
+ * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ */
+#ifndef __ASM_GENERIC_BARRIER_H
+#define __ASM_GENERIC_BARRIER_H
+
+#ifndef __ASSEMBLY__
+
+#include <linux/compiler.h>
+#include <asm-generic/rwonce.h>
+
+#ifndef nop
+#define nop() asm volatile ("nop")
+#endif
+
+#endif /* !__ASSEMBLY__ */
+#endif /* __ASM_GENERIC_BARRIER_H */
diff --git a/include/asm-generic/rwonce.h b/include/asm-generic/rwonce.h
new file mode 100644
index 000000000000..53aad037902c
--- /dev/null
+++ b/include/asm-generic/rwonce.h
@@ -0,0 +1,87 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Prevent the compiler from merging or refetching reads or writes. The
+ * compiler is also forbidden from reordering successive instances of
+ * READ_ONCE and WRITE_ONCE, but only when the compiler is aware of some
+ * particular ordering. One way to make the compiler aware of ordering is to
+ * put the two invocations of READ_ONCE or WRITE_ONCE in different C
+ * statements.
+ *
+ * These two macros will also work on aggregate data types like structs or
+ * unions.
+ *
+ * Their two major use cases are: (1) Mediating communication between
+ * process-level code and irq/NMI handlers, all running on the same CPU,
+ * and (2) Ensuring that the compiler does not fold, spindle, or otherwise
+ * mutilate accesses that either do not require ordering or that interact
+ * with an explicit memory barrier or atomic instruction that provides the
+ * required ordering.
+ */
+#ifndef __ASM_GENERIC_RWONCE_H
+#define __ASM_GENERIC_RWONCE_H
+
+#ifndef __ASSEMBLY__
+
+#include <linux/compiler_types.h>
+
+/*
+ * Yes, this permits 64-bit accesses on 32-bit architectures. These will
+ * actually be atomic in some cases (namely Armv7 + LPAE), but for others we
+ * rely on the access being split into 2x32-bit accesses for a 32-bit quantity
+ * (e.g. a virtual address) and a strong prevailing wind.
+ */
+#define compiletime_assert_rwonce_type(t) \
+ compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
+ "Unsupported access size for {READ,WRITE}_ONCE().")
+
+/*
+ * Use __READ_ONCE() instead of READ_ONCE() if you do not require any
+ * atomicity. Note that this may result in tears!
+ */
+#ifndef __READ_ONCE
+#define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x))
+#endif
+
+#define READ_ONCE(x) \
+({ \
+ compiletime_assert_rwonce_type(x); \
+ __READ_ONCE(x); \
+})
+
+#define __WRITE_ONCE(x, val) \
+do { \
+ *(volatile typeof(x) *)&(x) = (val); \
+} while (0)
+
+#define WRITE_ONCE(x, val) \
+do { \
+ compiletime_assert_rwonce_type(x); \
+ __WRITE_ONCE(x, val); \
+} while (0)
+
+static __no_sanitize_or_inline
+unsigned long __read_once_word_nocheck(const void *addr)
+{
+ return __READ_ONCE(*(unsigned long *)addr);
+}
+
+/*
+ * Use READ_ONCE_NOCHECK() instead of READ_ONCE() if you need to load a
+ * word from memory atomically but without telling KASAN/KCSAN. This is
+ * usually used by unwinding code when walking the stack of a running process.
+ */
+#define READ_ONCE_NOCHECK(x) \
+({ \
+ compiletime_assert(sizeof(x) == sizeof(unsigned long), \
+ "Unsupported access size for READ_ONCE_NOCHECK()."); \
+ (typeof(x))__read_once_word_nocheck(&(x)); \
+})
+
+static __no_kasan_or_inline
+unsigned long read_word_at_a_time(const void *addr)
+{
+ return *(unsigned long *)addr;
+}
+
+#endif /* __ASSEMBLY__ */
+#endif /* __ASM_GENERIC_RWONCE_H */
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index 44602ad5af84..b6500e4630b3 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -210,7 +210,7 @@ static inline void assign_bit(long nr, volatile unsigned long *addr, bool value)
typeof(*ptr) old, new; \
\
do { \
- old = ACCESS_ONCE(*ptr); \
+ old = READ_ONCE(*ptr); \
new = (old & ~mask) | bits; \
} while (cmpxchg(ptr, old, new) != old); \
\
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 6654c164f594..e7dc9ba0a590 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -168,26 +168,6 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
#include <linux/types.h>
-#define __READ_ONCE_SIZE \
-({ \
- switch (size) { \
- case 1: *(__u8 *)res = *(volatile __u8 *)p; break; \
- case 2: *(__u16 *)res = *(volatile __u16 *)p; break; \
- case 4: *(__u32 *)res = *(volatile __u32 *)p; break; \
- case 8: *(__u64 *)res = *(volatile __u64 *)p; break; \
- default: \
- barrier(); \
- __builtin_memcpy((void *)res, (const void *)p, size); \
- barrier(); \
- } \
-})
-
-static __always_inline
-void __read_once_size(const volatile void *p, void *res, int size)
-{
- __READ_ONCE_SIZE;
-}
-
#ifdef CONFIG_KASAN
/*
* We can't declare function 'inline' because __no_sanitize_address confilcts
@@ -200,80 +180,6 @@ void __read_once_size(const volatile void *p, void *res, int size)
# define __no_kasan_or_inline __always_inline
#endif
-static __no_kasan_or_inline
-void __read_once_size_nocheck(const volatile void *p, void *res, int size)
-{
- __READ_ONCE_SIZE;
-}
-
-static __always_inline void __write_once_size(volatile void *p, void *res, int size)
-{
- switch (size) {
- case 1: *(volatile __u8 *)p = *(__u8 *)res; break;
- case 2: *(volatile __u16 *)p = *(__u16 *)res; break;
- case 4: *(volatile __u32 *)p = *(__u32 *)res; break;
- case 8: *(volatile __u64 *)p = *(__u64 *)res; break;
- default:
- barrier();
- __builtin_memcpy((void *)p, (const void *)res, size);
- barrier();
- }
-}
-
-/*
- * Prevent the compiler from merging or refetching reads or writes. The
- * compiler is also forbidden from reordering successive instances of
- * READ_ONCE and WRITE_ONCE, but only when the compiler is aware of some
- * particular ordering. One way to make the compiler aware of ordering is to
- * put the two invocations of READ_ONCE or WRITE_ONCE in different C
- * statements.
- *
- * These two macros will also work on aggregate data types like structs or
- * unions. If the size of the accessed data type exceeds the word size of
- * the machine (e.g., 32 bits or 64 bits) READ_ONCE() and WRITE_ONCE() will
- * fall back to memcpy(). There's at least two memcpy()s: one for the
- * __builtin_memcpy() and then one for the macro doing the copy of variable
- * - '__u' allocated on the stack.
- *
- * Their two major use cases are: (1) Mediating communication between
- * process-level code and irq/NMI handlers, all running on the same CPU,
- * and (2) Ensuring that the compiler does not fold, spindle, or otherwise
- * mutilate accesses that either do not require ordering or that interact
- * with an explicit memory barrier or atomic instruction that provides the
- * required ordering.
- */
-
-#define __READ_ONCE(x, check) \
-({ \
- union { typeof(x) __val; char __c[1]; } __u; \
- if (check) \
- __read_once_size(&(x), __u.__c, sizeof(x)); \
- else \
- __read_once_size_nocheck(&(x), __u.__c, sizeof(x)); \
- __u.__val; \
-})
-#define READ_ONCE(x) __READ_ONCE(x, 1)
-
-/*
- * Use READ_ONCE_NOCHECK() instead of READ_ONCE() if you need
- * to hide memory access from KASAN.
- */
-#define READ_ONCE_NOCHECK(x) __READ_ONCE(x, 0)
-
-static __no_kasan_or_inline
-unsigned long read_word_at_a_time(const void *addr)
-{
- return *(unsigned long *)addr;
-}
-
-#define WRITE_ONCE(x, val) \
-({ \
- union { typeof(x) __val; char __c[1]; } __u = \
- { .__val = (__force typeof(x)) (val) }; \
- __write_once_size(&(x), __u.__c, sizeof(x)); \
- __u.__val; \
-})
-
#endif /* __KERNEL__ */
/**
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index d208004b3989..87ab117aca13 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -222,6 +222,28 @@ struct ftrace_likely_data {
/* Are two types/vars the same type (ignoring qualifiers)? */
#define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
+/*
+ * __unqual_scalar_typeof(x) - Declare an unqualified scalar type, leaving
+ * non-scalar types unchanged.
+ */
+/*
+ * Prefer C11 _Generic for better compile-times and simpler code. Note: 'char'
+ * is not type-compatible with 'signed char', and we define a separate case.
+ */
+#define __scalar_type_to_expr_cases(type) \
+ unsigned type: (unsigned type)0, \
+ signed type: (signed type)0
+
+#define __unqual_scalar_typeof(x) typeof( \
+ _Generic((x), \
+ char: (char)0, \
+ __scalar_type_to_expr_cases(char), \
+ __scalar_type_to_expr_cases(short), \
+ __scalar_type_to_expr_cases(int), \
+ __scalar_type_to_expr_cases(long), \
+ __scalar_type_to_expr_cases(long long), \
+ default: (x)))
+
/* Is this type a native word size -- useful for atomic operations */
#define __native_word(t) \
(sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \
@@ -356,6 +378,30 @@ struct ftrace_likely_data {
*/
#define noinline_for_stack noinline
+/*
+ * Sanitizer helper attributes: Because using __always_inline and
+ * __no_sanitize_* conflict, provide helper attributes that will either expand
+ * to __no_sanitize_* in compilation units where instrumentation is enabled
+ * (__SANITIZE_*__), or __always_inline in compilation units without
+ * instrumentation (__SANITIZE_*__ undefined).
+ */
+#ifdef __SANITIZE_ADDRESS__
+/*
+ * We can't declare function 'inline' because __no_sanitize_address conflicts
+ * with inlining. Attempt to inline it may cause a build failure.
+ * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67368
+ * '__maybe_unused' allows us to avoid defined-but-not-used warnings.
+ */
+# define __no_kasan_or_inline __no_sanitize_address notrace __maybe_unused
+# define __no_sanitize_or_inline __no_kasan_or_inline
+#else
+# define __no_kasan_or_inline __always_inline
+#endif
+
+#ifndef __no_sanitize_or_inline
+#define __no_sanitize_or_inline __always_inline
+#endif
+
/* code that can't be instrumented at all */
#define noinstr \
noinline notrace __no_sanitize_address __no_stack_protector
diff --git a/include/linux/list.h b/include/linux/list.h
index 35b3f573806a..b90ea3e125d0 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -7,6 +7,7 @@
#include <linux/stddef.h>
#include <linux/poison.h>
#include <linux/const.h>
+#include <asm/barrier.h>
/*
* Simple doubly linked list implementation.
--
2.39.5
* [PATCH 2/4] ARM64: mmu: document granule size calculation
From: Ahmad Fatoum @ 2025-05-21 17:41 UTC
To: barebox; +Cc: Ahmad Fatoum
Copy a nice-looking ASCII table from U-Boot's cache_v8.c to make the
granule_size() calculation easier to follow.
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
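As a cross-check of the table (my summary, not part of the patch): with a 4k
granule each level resolves 9 bits, so for levels 1-3 a level-n entry spans
1 << (12 + 9 * (3 - n)) bytes, matching the switch below:

  /* hypothetical equivalent of granule_size() for levels 1-3 */
  static uint64_t granule_size_4k(int level)
  {
          return 1ULL << (12 + 9 * (3 - level)); /* 3 -> 4K, 2 -> 2M, 1 -> 1G */
  }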
arch/arm/cpu/mmu_64.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/arch/arm/cpu/mmu_64.c b/arch/arm/cpu/mmu_64.c
index bc1a44d0a7b8..d9b0b74d7314 100644
--- a/arch/arm/cpu/mmu_64.c
+++ b/arch/arm/cpu/mmu_64.c
@@ -176,6 +176,24 @@ static void create_sections(uint64_t virt, uint64_t phys, uint64_t size,
static size_t granule_size(int level)
{
+ /*
+ * With 4k page granule, a virtual address is split into 4 lookup parts
+ * spanning 9 bits each:
+ *
+ * _______________________________________________
+ * | | | | | | |
+ * | 0 | Lv0 | Lv1 | Lv2 | Lv3 | off |
+ * |_______|_______|_______|_______|_______|_______|
+ * 63-48 47-39 38-30 29-21 20-12 11-00
+ *
+ * mask page size
+ *
+ * Lv0: FF8000000000 --
+ * Lv1: 7FC0000000 1G
+ * Lv2: 3FE00000 2M
+ * Lv3: 1FF000 4K
+ * off: FFF
+ */
switch (level) {
default:
case 0:
--
2.39.5
* [PATCH 3/4] ARM: mmu: ensure PTE is written at once
From: Ahmad Fatoum @ 2025-05-21 17:41 UTC
To: barebox; +Cc: Ahmad Fatoum
Writing the PTE with a normal, non-volatile access is a bad idea, because
a speculative page-table walk may already consume a partially written entry.

Use WRITE_ONCE to turn this into a volatile write via a new set_pte
helper. The extra helper is useful for type enforcement and for future
extension once we start to consistently apply break-before-make[1].

[1]: https://lore.kernel.org/all/87a5btrcyr.wl-maz@kernel.org/
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
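For context, the break-before-make sequence the TODOs refer to would wrap
each update roughly as follows (sketch only, ptep/new_val are placeholders;
the helpers are the ones visible in the arm32 diff):

  set_pte(ptep, 0);                      /* break: install an invalid (fault) entry */
  dma_flush_range(ptep, sizeof(*ptep));  /* make the faulting entry visible to the walker */
  tlb_invalidate();                      /* drop any cached translation */
  set_pte(ptep, new_val);                /* make: install the new entry */
  dma_flush_range(ptep, sizeof(*ptep));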
arch/arm/cpu/mmu_32.c | 40 +++++++++++++++++++++++++++++-----------
arch/arm/cpu/mmu_64.c | 20 +++++++++++++-------
2 files changed, 42 insertions(+), 18 deletions(-)
diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
index ec6bd27da4e1..9f50194c7c2b 100644
--- a/arch/arm/cpu/mmu_32.c
+++ b/arch/arm/cpu/mmu_32.c
@@ -65,6 +65,11 @@ static bool pgd_type_table(u32 pgd)
#define PTE_SIZE (PTRS_PER_PTE * sizeof(u32))
+static void set_pte(uint32_t *pt, uint32_t val)
+{
+ WRITE_ONCE(*pt, val);
+}
+
#ifdef __PBL__
static uint32_t *alloc_pte(void)
{
@@ -141,13 +146,14 @@ static u32 *arm_create_pte(unsigned long virt, unsigned long phys,
ttb_idx = pgd_index(virt);
for (i = 0; i < PTRS_PER_PTE; i++) {
- table[i] = phys | PTE_TYPE_SMALL | flags;
+ set_pte(&table[i], phys | PTE_TYPE_SMALL | flags);
virt += PAGE_SIZE;
phys += PAGE_SIZE;
}
dma_flush_range(table, PTRS_PER_PTE * sizeof(u32));
- ttb[ttb_idx] = (unsigned long)table | PMD_TYPE_TABLE;
+ // TODO break-before-make missing
+ set_pte(&ttb[ttb_idx], (unsigned long)table | PMD_TYPE_TABLE);
dma_flush_range(&ttb[ttb_idx], sizeof(u32));
return table;
@@ -264,14 +270,17 @@ static void __arch_remap_range(void *_virt_addr, phys_addr_t phys_addr, size_t s
if (size >= PGDIR_SIZE && pgdir_size_aligned &&
IS_ALIGNED(phys_addr, PGDIR_SIZE) &&
!pgd_type_table(*pgd)) {
+ u32 val;
/*
* TODO: Add code to discard a page table and
* replace it with a section
*/
chunk = PGDIR_SIZE;
- *pgd = phys_addr | pmd_flags;
+ val = phys_addr | pmd_flags;
if (map_type != MAP_FAULT)
- *pgd |= PMD_TYPE_SECT;
+ val |= PMD_TYPE_SECT;
+ // TODO break-before-make missing
+ set_pte(pgd, val);
dma_flush_range(pgd, sizeof(*pgd));
} else {
unsigned int num_ptes;
@@ -310,10 +319,15 @@ static void __arch_remap_range(void *_virt_addr, phys_addr_t phys_addr, size_t s
}
for (i = 0; i < num_ptes; i++) {
- pte[i] = phys_addr + i * PAGE_SIZE;
- pte[i] |= pte_flags;
+ u32 val;
+
+ val = phys_addr + i * PAGE_SIZE;
+ val |= pte_flags;
if (map_type != MAP_FAULT)
- pte[i] |= PTE_TYPE_SMALL;
+ val |= PTE_TYPE_SMALL;
+
+ // TODO break-before-make missing
+ set_pte(&pte[i], val);
}
dma_flush_range(pte, num_ptes * sizeof(u32));
@@ -350,7 +364,7 @@ static void create_sections(unsigned long first, unsigned long last,
unsigned int i, addr = first;
for (i = ttb_start; i < ttb_end; i++) {
- ttb[i] = addr | flags;
+ set_pte(&ttb[i], addr | flags);
addr += PGDIR_SIZE;
}
}
@@ -366,8 +380,10 @@ void *map_io_sections(unsigned long phys, void *_start, size_t size)
unsigned long start = (unsigned long)_start, sec;
uint32_t *ttb = get_ttb();
- for (sec = start; sec < start + size; sec += PGDIR_SIZE, phys += PGDIR_SIZE)
- ttb[pgd_index(sec)] = phys | get_pmd_flags(MAP_UNCACHED);
+ for (sec = start; sec < start + size; sec += PGDIR_SIZE, phys += PGDIR_SIZE) {
+ // TODO break-before-make missing
+ set_pte(&ttb[pgd_index(sec)], phys | get_pmd_flags(MAP_UNCACHED));
+ }
dma_flush_range(ttb, 0x4000);
tlb_invalidate();
@@ -410,7 +426,9 @@ static void create_vector_table(unsigned long adr)
vectors, adr);
arm_create_pte(adr, adr, get_pte_flags(MAP_UNCACHED));
pte = find_pte(adr);
- *pte = (u32)vectors | PTE_TYPE_SMALL | get_pte_flags(MAP_CACHED);
+ // TODO break-before-make missing
+ set_pte(pte, (u32)vectors | PTE_TYPE_SMALL |
+ get_pte_flags(MAP_CACHED));
}
arm_fixup_vectors();
diff --git a/arch/arm/cpu/mmu_64.c b/arch/arm/cpu/mmu_64.c
index d9b0b74d7314..efd788c93e7b 100644
--- a/arch/arm/cpu/mmu_64.c
+++ b/arch/arm/cpu/mmu_64.c
@@ -31,12 +31,14 @@ static uint64_t *get_ttb(void)
return (uint64_t *)get_ttbr(current_el());
}
+static void set_pte(uint64_t *pt, uint64_t val)
+{
+ WRITE_ONCE(*pt, val);
+}
+
static void set_table(uint64_t *pt, uint64_t *table_addr)
{
- uint64_t val;
-
- val = PTE_TYPE_TABLE | (uint64_t)table_addr;
- *pt = val;
+ set_pte(pt, PTE_TYPE_TABLE | (uint64_t)table_addr);
}
#ifdef __PBL__
@@ -114,14 +116,16 @@ static void split_block(uint64_t *pte, int level)
for (i = 0; i < MAX_PTE_ENTRIES; i++) {
- new_table[i] = old_pte | (i << levelshift);
+ set_pte(&new_table[i], old_pte | (i << levelshift));
/* Level 3 block PTEs have the table type */
if ((level + 1) == 3)
new_table[i] |= PTE_TYPE_TABLE;
}
- /* Set the new table into effect */
+ /* Set the new table into effect
+ * TODO: break-before-make missing
+ */
set_table(pte, new_table);
}
@@ -157,7 +161,9 @@ static void create_sections(uint64_t virt, uint64_t phys, uint64_t size,
IS_ALIGNED(phys, block_size)) {
type = (level == 3) ?
PTE_TYPE_PAGE : PTE_TYPE_BLOCK;
- *pte = phys | attr | type;
+
+ /* TODO: break-before-make missing */
+ set_pte(pte, phys | attr | type);
addr += block_size;
phys += block_size;
size -= block_size;
--
2.39.5
* [PATCH 4/4] WIP: break before make and don't invalidate uncached regions being remapped
From: Ahmad Fatoum @ 2025-05-21 17:41 UTC
To: barebox; +Cc: Ahmad Fatoum
This needs to be split up and cleaned up, but I include it anyway to show
what still needs to be done, here for arm32; the same applies to arm64 as
well.
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
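The net effect on the remap path, condensed from the hunks below: instead of
invalidating the whole range after remapping it uncached, we now write back
only those pages that were actually mapped cacheable before touching the
tables:

  int arch_remap_range(void *virt_addr, phys_addr_t phys_addr, size_t size, unsigned map_type)
  {
          if (map_type != MAP_CACHED)
                  flush_cacheable_pages(virt_addr, size);

          __arch_remap_range(virt_addr, phys_addr, size, map_type);

          return 0;
  }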
arch/arm/cpu/mmu_32.c | 253 +++++++++++++++++++++++++++++++++---------
1 file changed, 201 insertions(+), 52 deletions(-)
diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
index 9f50194c7c2b..d4ed298ac64f 100644
--- a/arch/arm/cpu/mmu_32.c
+++ b/arch/arm/cpu/mmu_32.c
@@ -70,6 +70,45 @@ static void set_pte(uint32_t *pt, uint32_t val)
WRITE_ONCE(*pt, val);
}
+static void set_pte_range(uint32_t *virt, phys_addr_t phys,
+ size_t count, uint32_t flags,
+ bool break_before_make)
+{
+ bool made = false;
+
+ if (!break_before_make)
+ goto write_attrs;
+
+ if ((flags & PTE_TYPE_MASK) == PTE_TYPE_FAULT)
+ phys = 0;
+
+ for (int i = 0; i < count; i++) {
+ if (READ_ONCE(virt[i]) == ((phys + i * PAGE_SIZE) | flags))
+ continue;
+ set_pte(&virt[i], PTE_TYPE_FAULT);
+ made = true;
+ }
+
+ if (made) {
+ dma_flush_range(virt, count * sizeof(u32));
+ tlb_invalidate();
+ } else {
+ break_before_make = false;
+ }
+
+write_attrs:
+ for (int i = 0; i < count; i++, phys += PAGE_SIZE)
+ set_pte(&virt[i], phys | flags);
+
+ dma_flush_range(virt, count * sizeof(u32));
+
+#if 0
+ pr_notice("%s(0x%08x+0x%zx -> 0x%08x, flags=0x%x%s)\n", __func__,
+ (unsigned)virt, count, phys, flags,
+ made ? " [BBM]" : break_before_make ? " [BBM, but unneeded]" : "");
+#endif
+}
+
#ifdef __PBL__
static uint32_t *alloc_pte(void)
{
@@ -89,30 +128,47 @@ static uint32_t *alloc_pte(void)
}
#endif
-static u32 *find_pte(unsigned long adr)
+static u32 *__find_pte(uint32_t *ttb, unsigned long adr, int *level)
{
+ u32 *pgd = (u32 *)&ttb[pgd_index(adr)];
u32 *table;
- uint32_t *ttb = get_ttb();
- if (!pgd_type_table(ttb[pgd_index(adr)]))
- return NULL;
+ if (!pgd_type_table(*pgd)) {
+ *level = 1;
+ return pgd;
+ }
+
+ *level = 2;
/* find the coarse page table base address */
- table = (u32 *)(ttb[pgd_index(adr)] & ~0x3ff);
+ table = (u32 *)(*pgd & ~0x3ff);
/* find second level descriptor */
return &table[(adr >> PAGE_SHIFT) & 0xff];
}
+static u32 *find_pte(unsigned long adr)
+{
+ int level;
+ u32 *pte = __find_pte(get_ttb(), adr, &level);
+
+ return level == 2 ? pte : NULL;
+}
+
+static void dma_flush_range_end(unsigned long start, unsigned long end)
+{
+ __dma_flush_range(start, end);
+
+ if (outer_cache.flush_range)
+ outer_cache.flush_range(start, end);
+}
+
void dma_flush_range(void *ptr, size_t size)
{
unsigned long start = (unsigned long)ptr;
unsigned long end = start + size;
- __dma_flush_range(start, end);
-
- if (outer_cache.flush_range)
- outer_cache.flush_range(start, end);
+ dma_flush_range_end(start, end);
}
void dma_inv_range(void *ptr, size_t size)
@@ -132,11 +188,11 @@ void dma_inv_range(void *ptr, size_t size)
* Not yet exported, but may be later if someone finds use for it.
*/
static u32 *arm_create_pte(unsigned long virt, unsigned long phys,
- uint32_t flags)
+ uint32_t flags, bool break_before_make)
{
uint32_t *ttb = get_ttb();
u32 *table;
- int i, ttb_idx;
+ int ttb_idx;
virt = ALIGN_DOWN(virt, PGDIR_SIZE);
phys = ALIGN_DOWN(phys, PGDIR_SIZE);
@@ -145,16 +201,11 @@ static u32 *arm_create_pte(unsigned long virt, unsigned long phys,
ttb_idx = pgd_index(virt);
- for (i = 0; i < PTRS_PER_PTE; i++) {
- set_pte(&table[i], phys | PTE_TYPE_SMALL | flags);
- virt += PAGE_SIZE;
- phys += PAGE_SIZE;
- }
- dma_flush_range(table, PTRS_PER_PTE * sizeof(u32));
+ set_pte_range(table, phys, PTRS_PER_PTE, PTE_TYPE_SMALL | flags,
+ break_before_make);
- // TODO break-before-make missing
- set_pte(&ttb[ttb_idx], (unsigned long)table | PMD_TYPE_TABLE);
- dma_flush_range(&ttb[ttb_idx], sizeof(u32));
+ set_pte_range(&ttb[ttb_idx], (unsigned long)table, 1,
+ PMD_TYPE_TABLE, break_before_make);
return table;
}
@@ -243,6 +294,22 @@ static uint32_t get_pte_flags(int map_type)
}
}
+static const char *map_type_tostr(int map_type)
+{
+ switch (map_type) {
+ case MAP_CACHED:
+ return "CACHED";
+ case MAP_UNCACHED:
+ return "UNCACHED";
+ case ARCH_MAP_WRITECOMBINE:
+ return "WRITECOMBINE";
+ case MAP_FAULT:
+ return "FAULT";
+ default:
+ return "<unknown>";
+ }
+}
+
static uint32_t get_pmd_flags(int map_type)
{
return pte_flags_to_pmd(get_pte_flags(map_type));
@@ -250,6 +317,7 @@ static uint32_t get_pmd_flags(int map_type)
static void __arch_remap_range(void *_virt_addr, phys_addr_t phys_addr, size_t size, unsigned map_type)
{
+ bool mmu_on;
u32 virt_addr = (u32)_virt_addr;
u32 pte_flags, pmd_flags;
uint32_t *ttb = get_ttb();
@@ -262,6 +330,13 @@ static void __arch_remap_range(void *_virt_addr, phys_addr_t phys_addr, size_t s
size = PAGE_ALIGN(size);
+ mmu_on = get_cr() & CR_M;
+
+ pr_info("[MMU %s]remapping 0x%08x+0x%zx: phys 0x%08lx, type %s\n",
+ get_cr() & CR_M ? " ON" : "OFF",
+ virt_addr, size, (ulong)phys_addr,
+ map_type_tostr(map_type));
+
while (size) {
const bool pgdir_size_aligned = IS_ALIGNED(virt_addr, PGDIR_SIZE);
u32 *pgd = (u32 *)&ttb[pgd_index(virt_addr)];
@@ -270,22 +345,20 @@ static void __arch_remap_range(void *_virt_addr, phys_addr_t phys_addr, size_t s
if (size >= PGDIR_SIZE && pgdir_size_aligned &&
IS_ALIGNED(phys_addr, PGDIR_SIZE) &&
!pgd_type_table(*pgd)) {
- u32 val;
+ u32 flags;
/*
* TODO: Add code to discard a page table and
* replace it with a section
*/
chunk = PGDIR_SIZE;
- val = phys_addr | pmd_flags;
+ flags = pmd_flags;
if (map_type != MAP_FAULT)
- val |= PMD_TYPE_SECT;
- // TODO break-before-make missing
- set_pte(pgd, val);
- dma_flush_range(pgd, sizeof(*pgd));
+ flags |= PMD_TYPE_SECT;
+ set_pte_range(pgd, phys_addr, 1, flags, mmu_on);
} else {
unsigned int num_ptes;
u32 *table = NULL;
- unsigned int i;
+ u32 flags;
u32 *pte;
/*
* We only want to cover pages up until next
@@ -313,24 +386,16 @@ static void __arch_remap_range(void *_virt_addr, phys_addr_t phys_addr, size_t s
* create a new page table for it
*/
table = arm_create_pte(virt_addr, phys_addr,
- pmd_flags_to_pte(*pgd));
+ pmd_flags_to_pte(*pgd), mmu_on);
pte = find_pte(virt_addr);
BUG_ON(!pte);
}
- for (i = 0; i < num_ptes; i++) {
- u32 val;
+ flags = pte_flags;
+ if (map_type != MAP_FAULT)
+ flags |= PTE_TYPE_SMALL;
- val = phys_addr + i * PAGE_SIZE;
- val |= pte_flags;
- if (map_type != MAP_FAULT)
- val |= PTE_TYPE_SMALL;
-
- // TODO break-before-make missing
- set_pte(&pte[i], val);
- }
-
- dma_flush_range(pte, num_ptes * sizeof(u32));
+ set_pte_range(pte, phys_addr, num_ptes, flags, mmu_on);
}
virt_addr += chunk;
@@ -345,12 +410,99 @@ static void early_remap_range(u32 addr, size_t size, unsigned map_type)
__arch_remap_range((void *)addr, addr, size, map_type);
}
+static size_t granule_size(int level)
+{
+ switch (level) {
+ default:
+ case 1:
+ return PGDIR_SIZE;
+ case 2:
+ return PAGE_SIZE;
+ }
+}
+
+static bool pte_is_cacheable(uint32_t pte, int level)
+{
+ return (level == 2 && (pte & PTE_CACHEABLE)) ||
+ (level == 1 && (pte & PMD_SECT_CACHEABLE));
+}
+
+/**
+ * flush_cacheable_pages - Flush only the cacheable pages in a region
+ * @start: Starting virtual address of the range.
+ * @end: Ending virtual address of the range.
+ *
+ * This function walks the page table and flushes the data caches for the
+ * specified range only if the memory is marked as normal cacheable in the
+ * page tables. If a non-cacheable or non-normal page is encountered,
+ * it's skipped.
+ */
+static void flush_cacheable_pages(void *start, size_t size)
+{
+ u32 flush_start = ~0UL, flush_end = ~0UL;
+ u32 region_start, region_end;
+ size_t block_size;
+ u32 *ttb;
+
+ region_start = PAGE_ALIGN_DOWN((ulong)start);
+ region_end = PAGE_ALIGN(region_start + size);
+
+ ttb = get_ttb();
+
+ /*
+ * TODO: This loop could be made more optimal by inlining the page walk,
+ * so we need not restart address translation from the top every time.
+ *
+ * The hope is that with the page tables being cached and the
+ * windows being remapped being small, the overhead compared to
+ * actually flushing the ranges isn't too significant.
+ */
+ for (u32 addr = region_start; addr < region_end; addr += block_size) {
+ int level;
+ u32 *pte = __find_pte(ttb, addr, &level);
+
+ block_size = granule_size(level);
+
+ if (!pte || !pte_is_cacheable(*pte, level))
+ continue;
+
+ if (flush_end == addr) {
+ /*
+ * While it's safe to flush the whole block_size,
+ * it's unnecessary time waste to go beyond region_end.
+ */
+ flush_end = min(flush_end + block_size, region_end);
+ continue;
+ }
+
+ /*
+ * We don't have a previous contiguous flush area to append to.
+ * If we recorded any area before, let's flush it now
+ */
+ if (flush_start != ~0U) {
+ pr_notice("flushing %x-%x\n", flush_start, flush_end);
+ dma_flush_range_end(flush_start, flush_end);
+ }
+
+ /* and start the new contiguous flush area with this page */
+ flush_start = addr;
+ flush_end = min(flush_start + block_size, region_end);
+ }
+
+ /* The previous loop won't flush the last cached range, so do it here */
+ if (flush_start != ~0UL) {
+ pr_notice("flushing %x-%x\n", flush_start, flush_end);
+ dma_flush_range_end(flush_start, flush_end);
+ }
+}
+
int arch_remap_range(void *virt_addr, phys_addr_t phys_addr, size_t size, unsigned map_type)
{
- __arch_remap_range(virt_addr, phys_addr, size, map_type);
- if (map_type == MAP_UNCACHED)
- dma_inv_range(virt_addr, size);
+ if (map_type != MAP_CACHED)
+ flush_cacheable_pages(virt_addr, size);
+
+ __arch_remap_range(virt_addr, phys_addr, size, map_type);
return 0;
}
@@ -377,13 +529,11 @@ static inline void create_flat_mapping(void)
void *map_io_sections(unsigned long phys, void *_start, size_t size)
{
- unsigned long start = (unsigned long)_start, sec;
+ unsigned long start = (unsigned long)_start;
uint32_t *ttb = get_ttb();
- for (sec = start; sec < start + size; sec += PGDIR_SIZE, phys += PGDIR_SIZE) {
- // TODO break-before-make missing
- set_pte(&ttb[pgd_index(sec)], phys | get_pmd_flags(MAP_UNCACHED));
- }
+ set_pte_range(&ttb[pgd_index(start)], phys, size / PGDIR_SIZE,
+ get_pmd_flags(MAP_UNCACHED), true);
dma_flush_range(ttb, 0x4000);
tlb_invalidate();
@@ -424,11 +574,10 @@ static void create_vector_table(unsigned long adr)
vectors = xmemalign(PAGE_SIZE, PAGE_SIZE);
pr_debug("Creating vector table, virt = 0x%p, phys = 0x%08lx\n",
vectors, adr);
- arm_create_pte(adr, adr, get_pte_flags(MAP_UNCACHED));
+ arm_create_pte(adr, adr, get_pte_flags(MAP_UNCACHED), true);
pte = find_pte(adr);
- // TODO break-before-make missing
- set_pte(pte, (u32)vectors | PTE_TYPE_SMALL |
- get_pte_flags(MAP_CACHED));
+ set_pte_range(pte, (u32)vectors, 1, PTE_TYPE_SMALL |
+ get_pte_flags(MAP_CACHED), true);
}
arm_fixup_vectors();
--
2.39.5
* Re: (subset) [PATCH 0/4] ARM: mmu: ensure PTE is written at once
From: Sascha Hauer @ 2025-05-26 13:51 UTC
To: barebox, Ahmad Fatoum
On Wed, 21 May 2025 19:41:00 +0200, Ahmad Fatoum wrote:
> Writing the PTE with a normal, non-volatile access is a bad idea, because
> a speculative page-table walk may already consume a partially written entry.
>
> Use WRITE_ONCE to turn this into a volatile write via a new set_pte
> helper. The extra helper is useful for type enforcement and for future
> extension once we start to consistently apply break-before-make[1].
>
> [...]
Applied, thanks!
[1/4] arch: sync READ_ONCE/WRITE_ONCE with Linux
https://git.pengutronix.de/cgit/barebox/commit/?id=cacff83edeef (link may not be stable)
[2/4] ARM64: mmu: document granule size calculation
https://git.pengutronix.de/cgit/barebox/commit/?id=ebe6d3b8b732 (link may not be stable)
[3/4] ARM: mmu: ensure PTE is written at once
https://git.pengutronix.de/cgit/barebox/commit/?id=cc909e5d084e (link may not be stable)
Best regards,
--
Sascha Hauer <s.hauer@pengutronix.de>