[PATCH v2 1/2] mci: core: import Linux logic for higher preferred erase size

mail archive of the barebox mailing list
 help / color / mirror / Atom feed

* [PATCH v2 1/2] mci: core: import Linux logic for higher preferred erase size
@ 2025-02-17 11:33 Ahmad Fatoum
  2025-02-17 11:33 ` [PATCH v2 2/2] mci: core: reset ERASE_GRP_DEF on startup Ahmad Fatoum
  2025-02-17 11:58 ` [PATCH v2 1/2] mci: core: import Linux logic for higher preferred erase size Sascha Hauer
  0 siblings, 2 replies; 3+ messages in thread
From: Ahmad Fatoum @ 2025-02-17 11:33 UTC (permalink / raw)
  To: barebox; +Cc: Ahmad Fatoum

As a comment in the file notes, doing too small a granularity for erases
has considerable effect on performance:

  > Example Samsung eMMC 8GTF4:
  >
  >   time erase /dev/mmc2.part_of_512m # 1024 trims
  >   time: 2849ms
  >
  >   time erase /dev/mmc2.part_of_512m # single trim
  >   time: 56ms

This was deemed acceptable at first, because 3 seconds is still
tolerable.

On a SkyHigh S40004, an erase of the whole 3728 MiB ended up
taking longer than 400s in barebox, but only 4s in Linux, which
dwarfs the time actually needed for writing.

Linux has some rather complicated logic to compute a higher erase size
granularity, which still fits in the max busy timeout that a controller
may require. Until that's support in barebox, we import a simpler
heuristic that Linux uses to compute

  /sys/class/mmc_host/*/*/preferred_erase_size

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
v1 -> v2:
  - replace wrong 11 bit right shift with more descriptive and correct
    division by SZ_1M (Sascha)
---
 drivers/mci/mci-core.c | 105 ++++++++++++++++++++++++++---------------
 include/mci.h          |   1 +
 2 files changed, 69 insertions(+), 37 deletions(-)

diff --git a/drivers/mci/mci-core.c b/drivers/mci/mci-core.c
index 7ec2643b8d7f..13306d78dcfe 100644
--- a/drivers/mci/mci-core.c
+++ b/drivers/mci/mci-core.c
@@ -1774,6 +1774,70 @@ static int mci_startup_mmc(struct mci *mci)
 	return ret >= MMC_BUS_WIDTH_1 ? 0 : ret;
 }
 
+static void mci_init_erase(struct mci *card)
+{
+	if (!IS_ENABLED(CONFIG_MCI_ERASE))
+		return;
+
+	/* TODO: While it's possible to clear many erase groups at once
+	 * and it greatly improves throughput, drivers need adjustment:
+	 *
+	 * Many drivers hardcode a maximal wait time before aborting
+	 * the wait for R1b and returning -ETIMEDOUT. With long
+	 * erases/trims, we are bound to run into this timeout, so for now
+	 * we just split into sufficiently small erases that are unlikely
+	 * to trigger the timeout.
+	 *
+	 * What Linux does and what we should be doing in barebox is:
+	 *
+	 *  - add a struct mci_cmd::busy_timeout member that drivers should
+	 *    use instead of hardcoding their own timeout delay. The busy
+	 *    timeout length can be calculated by the MCI core after
+	 *    consulting the appropriate CSD/EXT_CSD/SSR registers.
+	 *
+	 *  - add a struct mci_host::max_busy_timeout member, where drivers
+	 *    can indicate the maximum timeout they are able to support.
+	 *    The MCI core will never set a busy_timeout that exceeds this
+	 *    value.
+	 *
+	 *  Example Samsung eMMC 8GTF4:
+	 *
+	 *    time erase /dev/mmc2.part_of_512m # 1024 trims
+	 *    time: 2849ms
+	 *
+	 *    time erase /dev/mmc2.part_of_512m # single trim
+	 *    time: 56ms
+	 */
+	if (IS_SD(card) && card->ssr.au) {
+		card->pref_erase = card->ssr.au;
+	} else if (card->erase_grp_size) {
+		unsigned int sz;
+
+		sz = card->capacity / SZ_1M;
+		if (sz < 128)
+			card->pref_erase = 512 * 1024 / 512;
+		else if (sz < 512)
+			card->pref_erase = 1024 * 1024 / 512;
+		else if (sz < 1024)
+			card->pref_erase = 2 * 1024 * 1024 / 512;
+		else
+			card->pref_erase = 4 * 1024 * 1024 / 512;
+		if (card->pref_erase < card->erase_grp_size)
+			card->pref_erase = card->erase_grp_size;
+		else {
+			sz = card->pref_erase % card->erase_grp_size;
+			if (sz)
+				card->pref_erase += card->erase_grp_size - sz;
+		}
+	} else {
+		card->pref_erase = 0;
+		return;
+	}
+
+	dev_add_param_uint32_fixed(&card->dev, "preferred_erase_size",
+				   card->pref_erase * 512, "%u");
+}
+
 /**
  * Scan the given host interfaces and detect connected MMC/SD cards
  * @param mci MCI instance
@@ -1903,6 +1967,8 @@ static int mci_startup(struct mci *mci)
 	/* we setup the blocklength only one times for all accesses to this media  */
 	err = mci_set_blocklen(mci, mci->read_bl_len);
 
+	mci_init_erase(mci);
+
 	mci_part_add(mci, mci->capacity, 0,
 			mci->cdevname, NULL, 0, true,
 			MMC_BLK_DATA_AREA_MAIN);
@@ -2080,7 +2146,7 @@ static int mci_sd_erase(struct block_device *blk, sector_t from,
 	struct mci *mci = part->mci;
 	sector_t i = 0;
 	unsigned arg;
-	sector_t blk_max, to = from + blkcnt;
+	sector_t to = from + blkcnt;
 	int rc;
 
 	mci_blk_part_switch(part);
@@ -2106,45 +2172,10 @@ static int mci_sd_erase(struct block_device *blk, sector_t from,
 	/* 'from' and 'to' are inclusive */
 	to -= 1;
 
-	/* TODO: While it's possible to clear many erase groups at once
-	 * and it greatly improves throughput, drivers need adjustment:
-	 *
-	 * Many drivers hardcode a maximal wait time before aborting
-	 * the wait for R1b and returning -ETIMEDOUT. With long
-	 * erases/trims, we are bound to run into this timeout, so for now
-	 * we just split into sufficiently small erases that are unlikely
-	 * to trigger the timeout.
-	 *
-	 * What Linux does and what we should be doing in barebox is:
-	 *
-	 *  - add a struct mci_cmd::busy_timeout member that drivers should
-	 *    use instead of hardcoding their own timeout delay. The busy
-	 *    timeout length can be calculated by the MCI core after
-	 *    consulting the appropriate CSD/EXT_CSD/SSR registers.
-	 *
-	 *  - add a struct mci_host::max_busy_timeout member, where drivers
-	 *    can indicate the maximum timeout they are able to support.
-	 *    The MCI core will never set a busy_timeout that exceeds this
-	 *    value.
-	 *
-	 *  Example Samsung eMMC 8GTF4:
-	 *
-	 *    time erase /dev/mmc2.part_of_512m # 1024 trims
-	 *    time: 2849ms
-	 *
-	 *    time erase /dev/mmc2.part_of_512m # single trim
-	 *    time: 56ms
-	 */
-
-	if (IS_SD(mci) && mci->ssr.au)
-		blk_max = mci->ssr.au;
-	else
-		blk_max = mci->erase_grp_size;
-
 	while (i < blkcnt) {
 		sector_t blk_r;
 
-		blk_r = min(blkcnt - i, blk_max);
+		blk_r = min_t(blkcnt_t, blkcnt - i, mci->pref_erase);
 
 		rc =  mci_block_erase(mci, from + i, blk_r, arg);
 		if (rc)
diff --git a/include/mci.h b/include/mci.h
index 1e3757027406..15fc0f22088f 100644
--- a/include/mci.h
+++ b/include/mci.h
@@ -647,6 +647,7 @@ struct mci {
 	/** currently used data block length for write accesses */
 	unsigned write_bl_len;
 	unsigned erase_grp_size;
+	unsigned pref_erase;	/**< preferred erase granularity in blocks */
 	uint64_t capacity;	/**< Card's data capacity in bytes */
 	int ready_for_use;	/** true if already probed */
 	int dsr_imp;		/**< DSR implementation state from CSD */
-- 
2.39.5




^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH v2 2/2] mci: core: reset ERASE_GRP_DEF on startup
  2025-02-17 11:33 [PATCH v2 1/2] mci: core: import Linux logic for higher preferred erase size Ahmad Fatoum
@ 2025-02-17 11:33 ` Ahmad Fatoum
  2025-02-17 11:58 ` [PATCH v2 1/2] mci: core: import Linux logic for higher preferred erase size Sascha Hauer
  1 sibling, 0 replies; 3+ messages in thread
From: Ahmad Fatoum @ 2025-02-17 11:33 UTC (permalink / raw)
  To: barebox; +Cc: Ahmad Fatoum

EXT_CSD register EXT_CSD_ERASE_GROUP_DEF loses its value after power
cycles and needs to be reset. Quoting the eMMC 5.1 standard[1]:

  After every power-up, when the host uses a device that partition(s)
  are configured, it must set the ERASE_GROUP_DEF bit to high before
  issuing read, write, erase and write protect commands, because this
  bit is reset after power-up. Otherwise, these may not work correctly
  and it may leave the stored data in an unknown state.

[1]: JESD84-B51

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
 drivers/mci/mci-core.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/mci/mci-core.c b/drivers/mci/mci-core.c
index 13306d78dcfe..e6e23b566a44 100644
--- a/drivers/mci/mci-core.c
+++ b/drivers/mci/mci-core.c
@@ -1218,11 +1218,24 @@ static void mci_extract_block_lengths_from_csd(struct mci *mci)
 }
 
 /**
- * Extract erase group size
+ * Configure erase group size
  * @param mci MCI instance
  */
-static void mci_extract_erase_group_size(struct mci *mci)
+static void mci_configure_erase_group_size(struct mci *mci)
 {
+	/* Enable ERASE_GRP_DEF. This bit is lost after a reset or power off.
+	 * This needs to happen even if we don't have erase support compiled-in.
+	 */
+	if (!IS_SD(mci) && mci->version >= MMC_VERSION_4_3) { 
+		int err;
+
+		err = mci_switch(mci, EXT_CSD_ERASE_GROUP_DEF, 1);
+		if (err)
+			dev_warn(&mci->dev, "failed erase group switch\n");
+		else
+			mci->ext_csd[EXT_CSD_ERASE_GROUP_DEF] = 1;
+	}
+
 	if (!IS_ENABLED(CONFIG_MCI_ERASE) ||
 	    !(UNSTUFF_BITS(mci->csd, 84, 12) & CCC_ERASE))
 		return;
@@ -1954,7 +1967,7 @@ static int mci_startup(struct mci *mci)
 	dev_info(&mci->dev, "detected %s card version %s\n", IS_SD(mci) ? "SD" : "MMC",
 		mci_version_string(mci));
 	mci_extract_card_capacity_from_csd(mci);
-	mci_extract_erase_group_size(mci);
+	mci_configure_erase_group_size(mci);
 
 	if (IS_SD(mci))
 		err = mci_startup_sd(mci);
-- 
2.39.5




^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v2 1/2] mci: core: import Linux logic for higher preferred erase size
  2025-02-17 11:33 [PATCH v2 1/2] mci: core: import Linux logic for higher preferred erase size Ahmad Fatoum
  2025-02-17 11:33 ` [PATCH v2 2/2] mci: core: reset ERASE_GRP_DEF on startup Ahmad Fatoum
@ 2025-02-17 11:58 ` Sascha Hauer
  1 sibling, 0 replies; 3+ messages in thread
From: Sascha Hauer @ 2025-02-17 11:58 UTC (permalink / raw)
  To: barebox, Ahmad Fatoum


On Mon, 17 Feb 2025 12:33:54 +0100, Ahmad Fatoum wrote:
> As a comment in the file notes, doing too small a granularity for erases
> has considerable effect on performance:
> 
>   > Example Samsung eMMC 8GTF4:
>   >
>   >   time erase /dev/mmc2.part_of_512m # 1024 trims
>   >   time: 2849ms
>   >
>   >   time erase /dev/mmc2.part_of_512m # single trim
>   >   time: 56ms
> 
> [...]

Applied, thanks!

[1/2] mci: core: import Linux logic for higher preferred erase size
      https://git.pengutronix.de/cgit/barebox/commit/?id=cffa32cbd510 (link may not be stable)
[2/2] mci: core: reset ERASE_GRP_DEF on startup
      https://git.pengutronix.de/cgit/barebox/commit/?id=b2c8b27dd73d (link may not be stable)

Best regards,
-- 
Sascha Hauer <s.hauer@pengutronix.de>




^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2025-02-17 12:30 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-02-17 11:33 [PATCH v2 1/2] mci: core: import Linux logic for higher preferred erase size Ahmad Fatoum
2025-02-17 11:33 ` [PATCH v2 2/2] mci: core: reset ERASE_GRP_DEF on startup Ahmad Fatoum
2025-02-17 11:58 ` [PATCH v2 1/2] mci: core: import Linux logic for higher preferred erase size Sascha Hauer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox