mail archive of the barebox mailing list
 help / color / mirror / Atom feed
* [PATCH RFC 00/12] ARM64: layerscape: make LS1046 DMA coherent
@ 2023-02-21  8:05 Ahmad Fatoum
  2023-02-21  8:05 ` [PATCH RFC 01/12] usb: dwc3: populate parent of xHCI dev Ahmad Fatoum
                   ` (11 more replies)
  0 siblings, 12 replies; 15+ messages in thread
From: Ahmad Fatoum @ 2023-02-21  8:05 UTC (permalink / raw)
  To: barebox; +Cc: lst

Upstream DT changes /soc of NXP Layerscape LS1046A to be dma-coherent.
This means that:

 1) Linux v6.1 expects bootloader to configure DMA masters to snoop caches
 2) bootloader needs to skip cache maintenance when talking to DMA
    master when it has set snoop bits

This series does that and thereby restores USB functionality when
booting Linux v6.1 with barebox. For older kernels, dma-coherent is
fixed up into kernel DT, so newer barebox versions can boot both
kernels.

This series still needs some more testing, but it's fit for review IMO.

One thing I wasn't sure about is whether I need memory barriers in
dma_sync_single. I don't see the kernel using any though in the same
situation.

Let me know what you think.

Ahmad Fatoum (12):
  usb: dwc3: populate parent of xHCI dev
  usb: xhci: pass physical device to DMA API
  net: rtl8169: pass physical device for DMA API
  firmware: zynqmp-fpga: pass physical device to DMA API
  net: designware: eqos: pass physical device to DMA API
  x86: select OF_DMA_DEFAULT_COHERENT
  dma: define CONFIG_OF_DMA_COHERENCY
  RISC-V: StarFive: J7100: set /soc/dma-noncoherent
  of: populate new device_d::dma_coherent attribute
  dma: fix dma_sync when not all device DMA is equally coherent
  dma: provide of_dma_coherent_fixup helper
  ARM64: layerscape: configure all DMA masters to be cache-coherent

 arch/arm/Kconfig                            |  1 +
 arch/arm/mach-layerscape/Makefile           |  1 +
 arch/arm/mach-layerscape/dma-coherent.c     | 20 +++++++++++
 arch/arm/mach-layerscape/lowlevel-ls1046a.c | 10 +++---
 arch/riscv/Kconfig.socs                     |  1 +
 arch/riscv/dts/jh7100.dtsi                  |  1 +
 arch/x86/Kconfig                            |  1 +
 commands/devinfo.c                          |  5 +++
 drivers/base/resource.c                     |  6 ++--
 drivers/dma/Kconfig                         | 10 ++++++
 drivers/dma/Makefile                        |  1 +
 drivers/dma/map.c                           |  6 ++--
 drivers/dma/of_fixups.c                     | 16 +++++++++
 drivers/firmware/zynqmp-fpga.c              |  7 ++--
 drivers/net/designware_eqos.c               | 13 ++++---
 drivers/net/rtl8169.c                       |  9 ++---
 drivers/of/platform.c                       |  3 ++
 drivers/usb/dwc3/host.c                     |  8 ++---
 drivers/usb/host/xhci-ring.c                |  8 ++---
 drivers/usb/host/xhci.c                     |  6 +++-
 include/driver.h                            | 40 +++++++++++++++++++--
 include/of_address.h                        |  8 +++++
 include/soc/fsl/immap_lsch2.h               |  7 ++++
 23 files changed, 155 insertions(+), 33 deletions(-)
 create mode 100644 arch/arm/mach-layerscape/dma-coherent.c
 create mode 100644 drivers/dma/of_fixups.c

-- 
2.30.2




^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH RFC 01/12] usb: dwc3: populate parent of xHCI dev
  2023-02-21  8:05 [PATCH RFC 00/12] ARM64: layerscape: make LS1046 DMA coherent Ahmad Fatoum
@ 2023-02-21  8:05 ` Ahmad Fatoum
  2023-02-21  8:05 ` [PATCH RFC 02/12] usb: xhci: pass physical device to DMA API Ahmad Fatoum
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Ahmad Fatoum @ 2023-02-21  8:05 UTC (permalink / raw)
  To: barebox; +Cc: lst, Ahmad Fatoum

Reparent xHCIs instantiated from DWC3 controllers to their parents
instead of them being direct children of the bus. Apart from improving
devinfo/drvinfo output, this should introduce no functional change.

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
 drivers/base/resource.c |  6 ++++--
 drivers/usb/dwc3/host.c |  8 ++++----
 include/driver.h        | 12 ++++++++++--
 3 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/drivers/base/resource.c b/drivers/base/resource.c
index 3725c79eb9d4..0d6f200a9d51 100644
--- a/drivers/base/resource.c
+++ b/drivers/base/resource.c
@@ -57,13 +57,15 @@ int device_add_resource(struct device *dev, const char *resname,
 	return device_add_resources(dev, &res, 1);
 }
 
-struct device *add_generic_device(const char* devname, int id, const char *resname,
+struct device *add_child_device(struct device *parent,
+		const char* devname, int id, const char *resname,
 		resource_size_t start, resource_size_t size, unsigned int flags,
 		void *pdata)
 {
 	struct device *dev;
 
 	dev = device_alloc(devname, id);
+	dev->parent = parent;
 	dev->platform_data = pdata;
 	device_add_resource(dev, resname, start, size, flags);
 
@@ -71,7 +73,7 @@ struct device *add_generic_device(const char* devname, int id, const char *resna
 
 	return dev;
 }
-EXPORT_SYMBOL(add_generic_device);
+EXPORT_SYMBOL(add_child_device);
 
 struct device *add_generic_device_res(const char* devname, int id,
 		struct resource *res, int nb, void *pdata)
diff --git a/drivers/usb/dwc3/host.c b/drivers/usb/dwc3/host.c
index a3a27db6fb64..09ed01fc569e 100644
--- a/drivers/usb/dwc3/host.c
+++ b/drivers/usb/dwc3/host.c
@@ -24,13 +24,13 @@ int dwc3_host_init(struct dwc3 *dwc)
 		return PTR_ERR(io);
 	}
 
-	dwc->xhci = add_generic_device("xHCI", DEVICE_ID_DYNAMIC, NULL,
-				       io->start, resource_size(io),
-				       IORESOURCE_MEM, NULL);
+	dwc->xhci = add_child_device(dev, "xHCI", DEVICE_ID_DYNAMIC, NULL,
+				     io->start, resource_size(io),
+				     IORESOURCE_MEM, NULL);
 	if (!dwc->xhci) {
 		dev_err(dev, "Failed to register xHCI device\n");
 		return -ENODEV;
 	}
-	
+
 	return 0;
 }
diff --git a/include/driver.h b/include/driver.h
index f53668711b81..7e25c060280b 100644
--- a/include/driver.h
+++ b/include/driver.h
@@ -269,13 +269,21 @@ int device_add_resource(struct device *dev, const char *resname,
 
 int device_add_data(struct device *dev, const void *data, size_t size);
 
+struct device *add_child_device(struct device *parent,
+				const char* devname, int id, const char *resname,
+				resource_size_t start, resource_size_t size, unsigned int flags,
+				void *pdata);
+
 /*
  * register a generic device
  * with only one resource
  */
-struct device *add_generic_device(const char* devname, int id, const char *resname,
+static inline struct device *add_generic_device(const char* devname, int id, const char *resname,
 		resource_size_t start, resource_size_t size, unsigned int flags,
-		void *pdata);
+		void *pdata)
+{
+	return add_child_device(NULL, devname, id, resname, start, size, flags, pdata);
+}
 
 /*
  * register a generic device
-- 
2.30.2




^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH RFC 02/12] usb: xhci: pass physical device to DMA API
  2023-02-21  8:05 [PATCH RFC 00/12] ARM64: layerscape: make LS1046 DMA coherent Ahmad Fatoum
  2023-02-21  8:05 ` [PATCH RFC 01/12] usb: dwc3: populate parent of xHCI dev Ahmad Fatoum
@ 2023-02-21  8:05 ` Ahmad Fatoum
  2023-02-21  8:05 ` [PATCH RFC 03/12] net: rtl8169: pass physical device for " Ahmad Fatoum
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Ahmad Fatoum @ 2023-02-21  8:05 UTC (permalink / raw)
  To: barebox; +Cc: lst, Ahmad Fatoum

The xHCI device is just a child of the DWC3 with no DT node assigned.
As such, it lacks all DMA settings that may be set in the DT.
Fix this by using the hardware device instead in case the xHCI lacks
its own DT node.

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
 drivers/usb/host/xhci-ring.c | 8 ++++----
 drivers/usb/host/xhci.c      | 6 +++++-
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 60764222af3d..0abe8a67a392 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -594,7 +594,7 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe,
 		memcpy(bounce, buffer, length);
 	}
 
-	map = addr = dma_map_single(ctrl->dev, bounce, length, direction);
+	map = addr = dma_map_single(ctrl->host.hw_dev, bounce, length, direction);
 
 	dev_dbg(&udev->dev, "pipe=0x%lx, buffer=%p, length=%d\n",
 		pipe, buffer, length);
@@ -740,7 +740,7 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe,
 	record_transfer_result(udev, event, length);
 	xhci_acknowledge_event(ctrl);
 
-	dma_unmap_single(ctrl->dev, map, length, direction);
+	dma_unmap_single(ctrl->host.hw_dev, map, length, direction);
 
 	if (usb_pipein(pipe))
 		memcpy(buffer, bounce, length);
@@ -895,7 +895,7 @@ int xhci_ctrl_tx(struct usb_device *udev, unsigned long pipe,
 	if (length > 0) {
 		if (req->requesttype & USB_DIR_IN)
 			field |= TRB_DIR_IN;
-		map = buf_64 = dma_map_single(ctrl->dev, buffer, length, direction);
+		map = buf_64 = dma_map_single(ctrl->host.hw_dev, buffer, length, direction);
 
 		trb_fields[0] = lower_32_bits(buf_64);
 		trb_fields[1] = upper_32_bits(buf_64);
@@ -947,7 +947,7 @@ int xhci_ctrl_tx(struct usb_device *udev, unsigned long pipe,
 
 	/* Invalidate buffer to make it available to usb-core */
 	if (length > 0)
-		dma_unmap_single(ctrl->dev, map, length, direction);
+		dma_unmap_single(ctrl->host.hw_dev, map, length, direction);
 
 	if (GET_COMP_CODE(le32_to_cpu(event->trans_event.transfer_len))
 			== COMP_SHORT_TX) {
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 55f51ca95192..930bb4b6c608 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -1359,7 +1359,11 @@ int xhci_register(struct xhci_ctrl *ctrl)
 	 */
 	host->no_desc_before_addr = true;
 
-	host->hw_dev = dev;
+	/*
+	 * If xHCI doesn't have our own DT node, it'll be a child of a
+	 * physical USB host controller device that should be used for DMA
+	 */
+	host->hw_dev = dev_of_node(dev) ? dev : dev->parent;
 	host->submit_int_msg = xhci_submit_int_msg;
 	host->submit_control_msg = xhci_submit_control_msg;
 	host->submit_bulk_msg = xhci_submit_bulk_msg;
-- 
2.30.2




^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH RFC 03/12] net: rtl8169: pass physical device for DMA API
  2023-02-21  8:05 [PATCH RFC 00/12] ARM64: layerscape: make LS1046 DMA coherent Ahmad Fatoum
  2023-02-21  8:05 ` [PATCH RFC 01/12] usb: dwc3: populate parent of xHCI dev Ahmad Fatoum
  2023-02-21  8:05 ` [PATCH RFC 02/12] usb: xhci: pass physical device to DMA API Ahmad Fatoum
@ 2023-02-21  8:05 ` Ahmad Fatoum
  2023-02-21  8:05 ` [PATCH RFC 04/12] firmware: zynqmp-fpga: pass physical device to " Ahmad Fatoum
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Ahmad Fatoum @ 2023-02-21  8:05 UTC (permalink / raw)
  To: barebox; +Cc: lst, Denis Orlov, Ahmad Fatoum

It shouldn't matter for now, but DMA API should always be called for the
physical device, i.e. the struct device underlying the struct pci_device.
This the Ethernet device interface parent, so use that instead.

Cc: Denis Orlov <denorl2009@gmail.com>
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
 drivers/net/rtl8169.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/rtl8169.c b/drivers/net/rtl8169.c
index ffc4ef238b0b..cbcd065980e5 100644
--- a/drivers/net/rtl8169.c
+++ b/drivers/net/rtl8169.c
@@ -216,6 +216,7 @@ static void __set_rx_mode(struct rtl8169_priv *priv)
 
 static void rtl8169_init_ring(struct rtl8169_priv *priv)
 {
+	struct eth_device *edev = &priv->edev;
 	int i;
 
 	priv->cur_rx = priv->cur_tx = 0;
@@ -223,13 +224,13 @@ static void rtl8169_init_ring(struct rtl8169_priv *priv)
 	priv->tx_desc = dma_alloc_coherent(NUM_TX_DESC * sizeof(struct bufdesc),
 					   &priv->tx_desc_phys);
 	priv->tx_buf = malloc(NUM_TX_DESC * PKT_BUF_SIZE);
-	priv->tx_buf_phys = dma_map_single(&priv->edev.dev, priv->tx_buf,
+	priv->tx_buf_phys = dma_map_single(edev->parent, priv->tx_buf,
 					   NUM_TX_DESC * PKT_BUF_SIZE, DMA_TO_DEVICE);
 
 	priv->rx_desc = dma_alloc_coherent(NUM_RX_DESC * sizeof(struct bufdesc),
 					   &priv->rx_desc_phys);
 	priv->rx_buf = malloc(NUM_RX_DESC * PKT_BUF_SIZE);
-	priv->rx_buf_phys = dma_map_single(&priv->edev.dev, priv->rx_buf,
+	priv->rx_buf_phys = dma_map_single(edev->parent, priv->rx_buf,
 					   NUM_RX_DESC * PKT_BUF_SIZE, DMA_FROM_DEVICE);
 
 	for (i = 0; i < NUM_RX_DESC; i++) {
@@ -479,13 +480,13 @@ static void rtl8169_eth_halt(struct eth_device *edev)
 
 	pci_clear_master(priv->pci_dev);
 
-	dma_unmap_single(&edev->dev, priv->tx_buf_phys, NUM_TX_DESC * PKT_BUF_SIZE,
+	dma_unmap_single(edev->parent, priv->tx_buf_phys, NUM_TX_DESC * PKT_BUF_SIZE,
 			 DMA_TO_DEVICE);
 	free(priv->tx_buf);
 	dma_free_coherent((void *)priv->tx_desc, priv->tx_desc_phys,
 			  NUM_TX_DESC * sizeof(struct bufdesc));
 
-	dma_unmap_single(&edev->dev, priv->rx_buf_phys, NUM_RX_DESC * PKT_BUF_SIZE,
+	dma_unmap_single(edev->parent, priv->rx_buf_phys, NUM_RX_DESC * PKT_BUF_SIZE,
 			 DMA_FROM_DEVICE);
 	free(priv->rx_buf);
 	dma_free_coherent((void *)priv->rx_desc, priv->rx_desc_phys,
-- 
2.30.2




^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH RFC 04/12] firmware: zynqmp-fpga: pass physical device to DMA API
  2023-02-21  8:05 [PATCH RFC 00/12] ARM64: layerscape: make LS1046 DMA coherent Ahmad Fatoum
                   ` (2 preceding siblings ...)
  2023-02-21  8:05 ` [PATCH RFC 03/12] net: rtl8169: pass physical device for " Ahmad Fatoum
@ 2023-02-21  8:05 ` Ahmad Fatoum
  2023-02-21  8:05 ` [PATCH RFC 05/12] net: designware: eqos: " Ahmad Fatoum
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Ahmad Fatoum @ 2023-02-21  8:05 UTC (permalink / raw)
  To: barebox; +Cc: lst, Michael Tretter, Ahmad Fatoum

The manager device is just a child of the physical device with no
DT node assigned. As such, it lacks all DMA settings that may be
set in the DT. Fix this by using the hardware device instead.

Cc: Michael Tretter <m.tretter@pengutronix.de>
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
 drivers/firmware/zynqmp-fpga.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/firmware/zynqmp-fpga.c b/drivers/firmware/zynqmp-fpga.c
index f4e8456843e5..806bce6174ef 100644
--- a/drivers/firmware/zynqmp-fpga.c
+++ b/drivers/firmware/zynqmp-fpga.c
@@ -197,6 +197,7 @@ static void zynqmp_fpga_show_header(const struct device *dev,
 static int fpgamgr_program_finish(struct firmware_handler *fh)
 {
 	struct fpgamgr *mgr = container_of(fh, struct fpgamgr, fh);
+	struct device *hw_dev = mgr->dev.parent;
 	u32 *buf_aligned;
 	u32 buf_size;
 	u32 *body;
@@ -254,9 +255,9 @@ static int fpgamgr_program_finish(struct firmware_handler *fh)
 		memcpy((u32 *)buf_aligned, body, body_length);
 	buf_aligned[body_length / sizeof(*buf_aligned)] = body_length;
 
-	addr = dma_map_single(&mgr->dev, buf_aligned,
+	addr = dma_map_single(hw_dev, buf_aligned,
 			      body_length + sizeof(buf_size), DMA_TO_DEVICE);
-	if (dma_mapping_error(&mgr->dev, addr)) {
+	if (dma_mapping_error(hw_dev, addr)) {
 		status = -EFAULT;
 		goto err_free_dma;
 	}
@@ -267,7 +268,7 @@ static int fpgamgr_program_finish(struct firmware_handler *fh)
 		buf_size = addr + body_length;
 
 	status = mgr->eemi_ops->fpga_load((u64)addr, buf_size, flags);
-	dma_unmap_single(&mgr->dev, addr, body_length + sizeof(buf_size),
+	dma_unmap_single(hw_dev, addr, body_length + sizeof(buf_size),
 			 DMA_TO_DEVICE);
 	if (status < 0)
 		dev_err(&mgr->dev, "unable to load fpga\n");
-- 
2.30.2




^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH RFC 05/12] net: designware: eqos: pass physical device to DMA API
  2023-02-21  8:05 [PATCH RFC 00/12] ARM64: layerscape: make LS1046 DMA coherent Ahmad Fatoum
                   ` (3 preceding siblings ...)
  2023-02-21  8:05 ` [PATCH RFC 04/12] firmware: zynqmp-fpga: pass physical device to " Ahmad Fatoum
@ 2023-02-21  8:05 ` Ahmad Fatoum
  2023-02-21  8:05 ` [PATCH RFC 06/12] x86: select OF_DMA_DEFAULT_COHERENT Ahmad Fatoum
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Ahmad Fatoum @ 2023-02-21  8:05 UTC (permalink / raw)
  To: barebox; +Cc: lst, Ahmad Fatoum

The Ethernet interface device is just a child of the physical device with
no DT node assigned. As such, it lacks all DMA settings that may be
set in the DT. Fix this by using the hardware device instead.

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
 drivers/net/designware_eqos.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/net/designware_eqos.c b/drivers/net/designware_eqos.c
index ee5a10a007e9..2e2a1cf8bff3 100644
--- a/drivers/net/designware_eqos.c
+++ b/drivers/net/designware_eqos.c
@@ -687,7 +687,6 @@ static void eqos_stop(struct eth_device *edev)
 static int eqos_send(struct eth_device *edev, void *packet, int length)
 {
 	struct eqos *eqos = edev->priv;
-	struct device *dev = &eqos->netdev.dev;
 	struct eqos_desc *tx_desc;
 	dma_addr_t dma;
 	u32 des3;
@@ -697,8 +696,8 @@ static int eqos_send(struct eth_device *edev, void *packet, int length)
 	eqos->tx_currdescnum++;
 	eqos->tx_currdescnum %= EQOS_DESCRIPTORS_TX;
 
-	dma = dma_map_single(dev, packet, length, DMA_TO_DEVICE);
-	if (dma_mapping_error(dev, dma))
+	dma = dma_map_single(edev->parent, packet, length, DMA_TO_DEVICE);
+	if (dma_mapping_error(edev->parent, dma))
 		return -EFAULT;
 
 	tx_desc->des0 = (unsigned long)dma;
@@ -717,7 +716,7 @@ static int eqos_send(struct eth_device *edev, void *packet, int length)
 				  !(des3 & EQOS_DESC3_OWN),
 				  100 * USEC_PER_MSEC);
 
-	dma_unmap_single(dev, dma, length, DMA_TO_DEVICE);
+	dma_unmap_single(edev->parent, dma, length, DMA_TO_DEVICE);
 
 	if (ret == -ETIMEDOUT)
 		eqos_dbg(eqos, "TX timeout\n");
@@ -764,7 +763,7 @@ static int eqos_recv(struct eth_device *edev)
 
 static int eqos_init_resources(struct eqos *eqos)
 {
-	struct device *dev = eqos->netdev.parent;
+	struct eth_device *edev = &eqos->netdev;
 	int ret = -ENOMEM;
 	void *descs;
 	void *p;
@@ -785,8 +784,8 @@ static int eqos_init_resources(struct eqos *eqos)
 		struct eqos_desc *rx_desc = &eqos->rx_descs[i];
 		dma_addr_t dma;
 
-		dma = dma_map_single(dev, p, EQOS_MAX_PACKET_SIZE, DMA_FROM_DEVICE);
-		if (dma_mapping_error(dev, dma)) {
+		dma = dma_map_single(edev->parent, p, EQOS_MAX_PACKET_SIZE, DMA_FROM_DEVICE);
+		if (dma_mapping_error(edev->parent, dma)) {
 			ret = -EFAULT;
 			goto err_free_rx_bufs;
 		}
-- 
2.30.2




^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH RFC 06/12] x86: select OF_DMA_DEFAULT_COHERENT
  2023-02-21  8:05 [PATCH RFC 00/12] ARM64: layerscape: make LS1046 DMA coherent Ahmad Fatoum
                   ` (4 preceding siblings ...)
  2023-02-21  8:05 ` [PATCH RFC 05/12] net: designware: eqos: " Ahmad Fatoum
@ 2023-02-21  8:05 ` Ahmad Fatoum
  2023-02-21  8:05 ` [PATCH RFC 07/12] dma: define CONFIG_OF_DMA_COHERENCY Ahmad Fatoum
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Ahmad Fatoum @ 2023-02-21  8:05 UTC (permalink / raw)
  To: barebox; +Cc: lst, Ahmad Fatoum

We don't DT instantiation of HW devices on x86, but on the off-chance we
do that in future, ensure DMA coherency settings are correct.

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
 arch/x86/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index bd6a94bd0b0e..5688e5317d21 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -5,6 +5,7 @@ config X86
 	select HAS_KALLSYMS
 	select HAS_DMA
 	select GENERIC_FIND_NEXT_BIT
+	select OF_DMA_DEFAULT_COHERENT
 	default y
 
 config ARCH_TEXT_BASE
-- 
2.30.2




^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH RFC 07/12] dma: define CONFIG_OF_DMA_COHERENCY
  2023-02-21  8:05 [PATCH RFC 00/12] ARM64: layerscape: make LS1046 DMA coherent Ahmad Fatoum
                   ` (5 preceding siblings ...)
  2023-02-21  8:05 ` [PATCH RFC 06/12] x86: select OF_DMA_DEFAULT_COHERENT Ahmad Fatoum
@ 2023-02-21  8:05 ` Ahmad Fatoum
  2023-02-21  8:05 ` [PATCH RFC 08/12] RISC-V: StarFive: J7100: set /soc/dma-noncoherent Ahmad Fatoum
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Ahmad Fatoum @ 2023-02-21  8:05 UTC (permalink / raw)
  To: barebox; +Cc: lst, Ahmad Fatoum

Some architectures are either exclusively cache-coherent or not, but
some others can have only some devices that snoop the bus, while the
rest doesn't. Provide a new CONFIG_OF_DMA_COHERENCY symbol for selection
on such platforms.

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
 drivers/dma/Kconfig | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index 46b9b90d8231..d96fcd0f845a 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -7,4 +7,14 @@ config MXS_APBH_DMA
 	select STMP_DEVICE
 	help
 	  Experimental!
+
+config OF_DMA_COHERENCY
+	bool "Respect device tree DMA coherency settings" if COMPILE_TEST
+	depends on HAS_DMA
+	help
+	  For most platforms supported, either all DMA is coherent or it isn't.
+	  Platforms that have DMA masters of mixed coherency or that differ
+	  from the architecture default will select this option to parse
+	  DMA coherency out of the DT.
+
 endmenu
-- 
2.30.2




^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH RFC 08/12] RISC-V: StarFive: J7100: set /soc/dma-noncoherent
  2023-02-21  8:05 [PATCH RFC 00/12] ARM64: layerscape: make LS1046 DMA coherent Ahmad Fatoum
                   ` (6 preceding siblings ...)
  2023-02-21  8:05 ` [PATCH RFC 07/12] dma: define CONFIG_OF_DMA_COHERENCY Ahmad Fatoum
@ 2023-02-21  8:05 ` Ahmad Fatoum
  2023-02-21  8:05 ` [PATCH RFC 09/12] of: populate new device_d::dma_coherent attribute Ahmad Fatoum
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Ahmad Fatoum @ 2023-02-21  8:05 UTC (permalink / raw)
  To: barebox; +Cc: lst

With upcoming changes, cache handling will be skipped on RISC-V, because
arch is cache-coherent by default. StarFive JH7100 has non-coherent DMA
masters though, so note that in the DT.
---
 arch/riscv/Kconfig.socs    | 1 +
 arch/riscv/dts/jh7100.dtsi | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/riscv/Kconfig.socs b/arch/riscv/Kconfig.socs
index 0f03637a66bc..c0cac2ca6816 100644
--- a/arch/riscv/Kconfig.socs
+++ b/arch/riscv/Kconfig.socs
@@ -87,6 +87,7 @@ config SOC_STARFIVE_JH7100
 	bool
 	select SOC_STARFIVE_JH71XX
 	select SIFIVE_L2
+	select OF_DMA_COHERENCY
 	help
 	  Unlike JH7110 and later, CPU on the JH7100 are not cache-coherent
 	  with respect to DMA masters like GMAC and DW MMC controller.
diff --git a/arch/riscv/dts/jh7100.dtsi b/arch/riscv/dts/jh7100.dtsi
index e3990582af97..b11801553bf7 100644
--- a/arch/riscv/dts/jh7100.dtsi
+++ b/arch/riscv/dts/jh7100.dtsi
@@ -212,6 +212,7 @@
 		#clock-cells = <1>;
 		compatible = "simple-bus";
 		ranges;
+		dma-noncoherent;
 
 		intram0: sram@18000000 {
 			compatible = "mmio-sram";
-- 
2.30.2




^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH RFC 09/12] of: populate new device_d::dma_coherent attribute
  2023-02-21  8:05 [PATCH RFC 00/12] ARM64: layerscape: make LS1046 DMA coherent Ahmad Fatoum
                   ` (7 preceding siblings ...)
  2023-02-21  8:05 ` [PATCH RFC 08/12] RISC-V: StarFive: J7100: set /soc/dma-noncoherent Ahmad Fatoum
@ 2023-02-21  8:05 ` Ahmad Fatoum
  2023-02-27 10:41   ` Sascha Hauer
  2023-02-21  8:05 ` [PATCH RFC 10/12] dma: fix dma_sync when not all device DMA is equally coherent Ahmad Fatoum
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 15+ messages in thread
From: Ahmad Fatoum @ 2023-02-21  8:05 UTC (permalink / raw)
  To: barebox; +Cc: lst, Ahmad Fatoum

So far, whether device DMA is coherent was a one-time global decision.
This is insufficient, because some platforms:

 - are cache coherent, while the architecture isn't in general, e.g. ARM
   with CONFIG_MMU=y assumes non-coherent DMA, but LS1046A is coherent

 - have a mix of devices that snoop caches and devices that don't
   (StarFive JH7100).

To enable dev_dma_(map|unmap)_single to take the correct device-specific
action with regards to cache maintenance, provide dev_is_dma_coherent()
with semantics similar to what Linux provides.

This admittedly looks a bit ugly, but we will refrain from making
dma_coherent defined independent of ifdef CONFIG_ARCH_HAS_NONCOHERENT_DMA:
On platforms that are cache-coherent, we will want dev->dma_coherent to
be true. Yet not all code allocating devices uses dev_alloc(), so adding
a global toggle is a bit too much refactoring effort for now.

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
 commands/devinfo.c    |  5 +++++
 drivers/of/platform.c |  3 +++
 include/driver.h      | 28 ++++++++++++++++++++++++++++
 3 files changed, 36 insertions(+)

diff --git a/commands/devinfo.c b/commands/devinfo.c
index 2487786c7101..be28d02028c3 100644
--- a/commands/devinfo.c
+++ b/commands/devinfo.c
@@ -50,6 +50,7 @@ static int do_devinfo(int argc, char *argv[])
 	struct param_d *param;
 	int i;
 	int first;
+	int coherent;
 	struct resource *res;
 
 	if (argc == 1) {
@@ -82,6 +83,10 @@ static int do_devinfo(int argc, char *argv[])
 		if (dev->bus)
 			printf("Bus: %s\n", dev->bus->name);
 
+		coherent = dev_is_dma_coherent(dev);
+		if (coherent >= 0)
+			printf("DMA Coherent: %s\n", coherent ? "true" : "false");
+
 		if (dev->info)
 			dev->info(dev);
 
diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index 0982873446a6..1bd5d41ba226 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -128,6 +128,9 @@ static void of_dma_configure(struct device *dev, struct device_node *np)
 	}
 
 	dev->dma_offset = offset;
+#ifdef CONFIG_OF_DMA_COHERENCY
+	dev->dma_coherent = of_dma_is_coherent(np);
+#endif
 }
 
 /**
diff --git a/include/driver.h b/include/driver.h
index 7e25c060280b..f6301d954bb5 100644
--- a/include/driver.h
+++ b/include/driver.h
@@ -8,6 +8,7 @@
 
 #include <linux/list.h>
 #include <linux/ioport.h>
+#include <linux/bitops.h>
 #include <of.h>
 #include <filetype.h>
 
@@ -43,6 +44,13 @@ struct device {
 	 * something like eth0 or nor0. */
 	int id;
 
+	/*! This particular device is dma coherent, even if the
+         * architecture supports non-coherent devices.
+	 */
+#ifdef CONFIG_OF_DMA_COHERENCY
+	bool dma_coherent:1;
+#endif
+
 	struct resource *resource;
 	int num_resources;
 
@@ -652,6 +660,26 @@ static inline struct device_node *dev_of_node(struct device *dev)
 	return IS_ENABLED(CONFIG_OFDEVICE) ? dev->of_node : NULL;
 }
 
+#ifdef CONFIG_OF_DMA_COHERENCY
+static inline int __dev_is_dma_coherent(struct device *dev)
+{
+	return dev->dma_coherent;
+}
+#else
+static inline bool __dev_is_dma_coherent(struct device *dev)
+{
+	return IS_ENABLED(CONFIG_OF_DMA_DEFAULT_COHERENT);
+}
+#endif
+
+static inline int dev_is_dma_coherent(struct device *dev)
+{
+	if (!dev_of_node(dev))
+		return -EINVAL;
+
+	return __dev_is_dma_coherent(dev);
+}
+
 static inline void *dev_get_priv(const struct device *dev)
 {
 	return dev->priv;
-- 
2.30.2




^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH RFC 10/12] dma: fix dma_sync when not all device DMA is equally coherent
  2023-02-21  8:05 [PATCH RFC 00/12] ARM64: layerscape: make LS1046 DMA coherent Ahmad Fatoum
                   ` (8 preceding siblings ...)
  2023-02-21  8:05 ` [PATCH RFC 09/12] of: populate new device_d::dma_coherent attribute Ahmad Fatoum
@ 2023-02-21  8:05 ` Ahmad Fatoum
  2023-02-21  8:05 ` [PATCH RFC 11/12] dma: provide of_dma_coherent_fixup helper Ahmad Fatoum
  2023-02-21  8:05 ` [PATCH RFC 12/12] ARM64: layerscape: configure all DMA masters to be cache-coherent Ahmad Fatoum
  11 siblings, 0 replies; 15+ messages in thread
From: Ahmad Fatoum @ 2023-02-21  8:05 UTC (permalink / raw)
  To: barebox; +Cc: lst, Ahmad Fatoum

The LS1046A features a cache-coherent interconnect and the drivers
configure the hardware appropriately, e.g. setting the FMan PRAM_MODE_GLOBAL
bit, so the written Ethernet Controllers snoop caches.

Yet, we use the standard arm64 cache maintenance routines when the MMU
is enabled and thus risk memory corruption if CPU prefetches receive buffers
in the time window between dma_map_single() cleaning them to
Point-of-Coherency and dma_unmap_single() invalidating them[1].

To properly solve this issue, we need to consult the newly added per-device
dma coherent attribute to decide whether to do manual cache maintenance.

[1]: https://lore.kernel.org/all/a5d6cc26-cd23-7c31-f56e-f6d535ea39b0@arm.com/

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
 drivers/dma/map.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/dma/map.c b/drivers/dma/map.c
index 114c0f7db3bd..be0ee258cc59 100644
--- a/drivers/dma/map.c
+++ b/drivers/dma/map.c
@@ -25,7 +25,8 @@ dma_addr_t dma_map_single(struct device *dev, void *ptr, size_t size,
 {
 	unsigned long addr = (unsigned long)ptr;
 
-	dma_sync_single_for_device(addr, size, dir);
+	if (dev_is_dma_coherent(dev) <= 0)
+		dma_sync_single_for_device(addr, size, dir);
 
 	return cpu_to_dma(dev, ptr);
 }
@@ -35,5 +36,6 @@ void dma_unmap_single(struct device *dev, dma_addr_t dma_addr, size_t size,
 {
 	unsigned long addr = (unsigned long)dma_to_cpu(dev, dma_addr);
 
-	dma_sync_single_for_cpu(addr, size, dir);
+	if (dev_is_dma_coherent(dev) <= 0)
+		dma_sync_single_for_cpu(addr, size, dir);
 }
-- 
2.30.2




^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH RFC 11/12] dma: provide of_dma_coherent_fixup helper
  2023-02-21  8:05 [PATCH RFC 00/12] ARM64: layerscape: make LS1046 DMA coherent Ahmad Fatoum
                   ` (9 preceding siblings ...)
  2023-02-21  8:05 ` [PATCH RFC 10/12] dma: fix dma_sync when not all device DMA is equally coherent Ahmad Fatoum
@ 2023-02-21  8:05 ` Ahmad Fatoum
  2023-02-21  8:05 ` [PATCH RFC 12/12] ARM64: layerscape: configure all DMA masters to be cache-coherent Ahmad Fatoum
  11 siblings, 0 replies; 15+ messages in thread
From: Ahmad Fatoum @ 2023-02-21  8:05 UTC (permalink / raw)
  To: barebox; +Cc: lst, Ahmad Fatoum

Upstream DT changing dma-coherency setting can break booting newer
kernels. TO allows newer bareboxes to both boot old and new kernels add
a new DT fixup helper that may be called from board code after enabling
cache coherence.

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
 drivers/dma/Makefile    |  1 +
 drivers/dma/of_fixups.c | 16 ++++++++++++++++
 include/of_address.h    |  8 ++++++++
 3 files changed, 25 insertions(+)
 create mode 100644 drivers/dma/of_fixups.c

diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile
index 39829cab5008..a3c1d52e05c3 100644
--- a/drivers/dma/Makefile
+++ b/drivers/dma/Makefile
@@ -1,3 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0-only
 obj-$(CONFIG_MXS_APBH_DMA)	+= apbh_dma.o
 obj-$(CONFIG_HAS_DMA)		+= map.o
+obj-$(CONFIG_OFTREE)		+= of_fixups.o
diff --git a/drivers/dma/of_fixups.c b/drivers/dma/of_fixups.c
new file mode 100644
index 000000000000..27dc24e287ee
--- /dev/null
+++ b/drivers/dma/of_fixups.c
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <of.h>
+#include <of_address.h>
+
+int of_dma_coherent_fixup(struct device_node *root, void *data)
+{
+	struct device_node *soc;
+
+	soc = of_find_node_by_path_from(root, "/soc");
+	if (!soc)
+		return -ENOENT;
+
+	of_property_write_bool(soc, "dma-noncoherent", false);
+	return of_property_write_bool(soc, "dma-coherent", true);
+}
diff --git a/include/of_address.h b/include/of_address.h
index 4e5faf6f7783..4f425e0803eb 100644
--- a/include/of_address.h
+++ b/include/of_address.h
@@ -62,6 +62,8 @@ extern int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr,
 
 extern bool of_dma_is_coherent(struct device_node *np);
 
+extern int of_dma_coherent_fixup(struct device_node *np, void *unused);
+
 #else /* CONFIG_OFTREE */
 
 static inline u64 of_translate_address(struct device_node *dev,
@@ -115,6 +117,12 @@ static inline bool of_dma_is_coherent(struct device_node *np)
 {
 	return false;
 }
+
+static inline int of_dma_coherent_fixup(struct device_node *np, void *unused)
+{
+	return 0;
+}
+
 #endif /* CONFIG_OFTREE */
 
 #ifdef CONFIG_OF_PCI
-- 
2.30.2




^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH RFC 12/12] ARM64: layerscape: configure all DMA masters to be cache-coherent
  2023-02-21  8:05 [PATCH RFC 00/12] ARM64: layerscape: make LS1046 DMA coherent Ahmad Fatoum
                   ` (10 preceding siblings ...)
  2023-02-21  8:05 ` [PATCH RFC 11/12] dma: provide of_dma_coherent_fixup helper Ahmad Fatoum
@ 2023-02-21  8:05 ` Ahmad Fatoum
  11 siblings, 0 replies; 15+ messages in thread
From: Ahmad Fatoum @ 2023-02-21  8:05 UTC (permalink / raw)
  To: barebox; +Cc: lst, Ahmad Fatoum

Upstream device tree now has /soc/dma-coherent, which breaks USB in
Linux v6.1 when kernel is booted with barebox. Fix this by setting
fixing up cache coherency setting into kernel DT whenever barebox
DT has /soc/dma-coherent.

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
 arch/arm/Kconfig                            |  1 +
 arch/arm/mach-layerscape/Makefile           |  1 +
 arch/arm/mach-layerscape/dma-coherent.c     | 20 ++++++++++++++++++++
 arch/arm/mach-layerscape/lowlevel-ls1046a.c | 10 ++++++----
 include/soc/fsl/immap_lsch2.h               |  7 +++++++
 5 files changed, 35 insertions(+), 4 deletions(-)
 create mode 100644 arch/arm/mach-layerscape/dma-coherent.c

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 8183f6d54686..0296e227b9ad 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -115,6 +115,7 @@ config ARCH_LAYERSCAPE
 	select HW_HAS_PCI
 	select OFTREE
 	select OFDEVICE
+	select OF_DMA_COHERENCY
 
 config ARCH_MVEBU
 	bool "Marvell EBU platforms"
diff --git a/arch/arm/mach-layerscape/Makefile b/arch/arm/mach-layerscape/Makefile
index 58d3ea820aa3..05965e113bde 100644
--- a/arch/arm/mach-layerscape/Makefile
+++ b/arch/arm/mach-layerscape/Makefile
@@ -4,6 +4,7 @@ obj- := __dummy__.o
 lwl-y += lowlevel.o errata.o
 lwl-$(CONFIG_ARCH_LS1046) += lowlevel-ls1046a.o
 obj-y += icid.o
+obj-y += dma-coherent.o
 obj-pbl-y += boot.o
 pbl-y += xload-qspi.o xload.o
 obj-$(CONFIG_ARCH_LAYERSCAPE_PPA) += ppa.o ppa-entry.o
diff --git a/arch/arm/mach-layerscape/dma-coherent.c b/arch/arm/mach-layerscape/dma-coherent.c
new file mode 100644
index 000000000000..c7bb1cffb080
--- /dev/null
+++ b/arch/arm/mach-layerscape/dma-coherent.c
@@ -0,0 +1,20 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <io.h>
+#include <init.h>
+#include <of_address.h>
+
+static int layerscape_of_fixup_dma_coherent(void)
+{
+	struct device_node *soc;
+
+	soc = of_find_node_by_path("/soc");
+	if (!soc)
+		return -ENOENT;
+
+	if (!of_property_read_bool(soc, "dma-coherent"))
+		return 0;
+
+	return of_register_fixup(of_dma_coherent_fixup, NULL);
+}
+coredevice_initcall(layerscape_of_fixup_dma_coherent);
diff --git a/arch/arm/mach-layerscape/lowlevel-ls1046a.c b/arch/arm/mach-layerscape/lowlevel-ls1046a.c
index 32f825ec2575..320f791164ca 100644
--- a/arch/arm/mach-layerscape/lowlevel-ls1046a.c
+++ b/arch/arm/mach-layerscape/lowlevel-ls1046a.c
@@ -227,11 +227,13 @@ void ls1046a_init_lowlevel(void)
 	set_cntfrq(25000000);
 	syscnt_enable(IOMEM(LSCH2_SYS_COUNTER_ADDR));
 
-	/* Make SEC reads and writes snoopable */
+	/* Make DMA master reads and writes snoopable */
 	setbits_be32(&scfg->snpcnfgcr, SCFG_SNPCNFGCR_SECRDSNP |
-		SCFG_SNPCNFGCR_SECWRSNP |
-		SCFG_SNPCNFGCR_SATARDSNP |
-		SCFG_SNPCNFGCR_SATAWRSNP);
+		SCFG_SNPCNFGCR_SECWRSNP | SCFG_SNPCNFGCR_USB1RDSNP |
+		SCFG_SNPCNFGCR_USB1WRSNP | SCFG_SNPCNFGCR_USB2RDSNP |
+		SCFG_SNPCNFGCR_USB2WRSNP | SCFG_SNPCNFGCR_USB3RDSNP |
+		SCFG_SNPCNFGCR_USB3WRSNP | SCFG_SNPCNFGCR_SATARDSNP |
+		SCFG_SNPCNFGCR_SATAWRSNP | SCFG_SNPCNFGCR_EDMASNP);
 
 	/*
 	 * Enable snoop requests and DVM message requests for
diff --git a/include/soc/fsl/immap_lsch2.h b/include/soc/fsl/immap_lsch2.h
index 1b74c77908d1..969ce93736f9 100644
--- a/include/soc/fsl/immap_lsch2.h
+++ b/include/soc/fsl/immap_lsch2.h
@@ -247,6 +247,13 @@ struct ccsr_gur {
 #define SCFG_SNPCNFGCR_SECWRSNP		0x40000000
 #define SCFG_SNPCNFGCR_SATARDSNP	0x00800000
 #define SCFG_SNPCNFGCR_SATAWRSNP	0x00400000
+#define SCFG_SNPCNFGCR_USB1RDSNP	0x00200000
+#define SCFG_SNPCNFGCR_USB1WRSNP	0x00100000
+#define SCFG_SNPCNFGCR_EDMASNP		0x00020000
+#define SCFG_SNPCNFGCR_USB2RDSNP	0x00008000
+#define SCFG_SNPCNFGCR_USB2WRSNP	0x00010000
+#define SCFG_SNPCNFGCR_USB3RDSNP	0x00002000
+#define SCFG_SNPCNFGCR_USB3WRSNP	0x00004000
 
 /* RGMIIPCR bit definitions*/
 #define SCFG_RGMIIPCR_EN_AUTO		BIT(3)
-- 
2.30.2




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC 09/12] of: populate new device_d::dma_coherent attribute
  2023-02-21  8:05 ` [PATCH RFC 09/12] of: populate new device_d::dma_coherent attribute Ahmad Fatoum
@ 2023-02-27 10:41   ` Sascha Hauer
  2024-01-11  7:05     ` Ahmad Fatoum
  0 siblings, 1 reply; 15+ messages in thread
From: Sascha Hauer @ 2023-02-27 10:41 UTC (permalink / raw)
  To: Ahmad Fatoum; +Cc: barebox, lst

On Tue, Feb 21, 2023 at 09:05:21AM +0100, Ahmad Fatoum wrote:
> So far, whether device DMA is coherent was a one-time global decision.
> This is insufficient, because some platforms:
> 
>  - are cache coherent, while the architecture isn't in general, e.g. ARM
>    with CONFIG_MMU=y assumes non-coherent DMA, but LS1046A is coherent
> 
>  - have a mix of devices that snoop caches and devices that don't
>    (StarFive JH7100).
> 
> To enable dev_dma_(map|unmap)_single to take the correct device-specific
> action with regards to cache maintenance, provide dev_is_dma_coherent()
> with semantics similar to what Linux provides.
> 
> This admittedly looks a bit ugly, but we will refrain from making
> dma_coherent defined independent of ifdef CONFIG_ARCH_HAS_NONCOHERENT_DMA:
> On platforms that are cache-coherent, we will want dev->dma_coherent to
> be true. Yet not all code allocating devices uses dev_alloc(), so adding
> a global toggle is a bit too much refactoring effort for now.
> 
> Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
> ---
>  commands/devinfo.c    |  5 +++++
>  drivers/of/platform.c |  3 +++
>  include/driver.h      | 28 ++++++++++++++++++++++++++++
>  3 files changed, 36 insertions(+)
> 
> diff --git a/commands/devinfo.c b/commands/devinfo.c
> index 2487786c7101..be28d02028c3 100644
> --- a/commands/devinfo.c
> +++ b/commands/devinfo.c
> @@ -50,6 +50,7 @@ static int do_devinfo(int argc, char *argv[])
>  	struct param_d *param;
>  	int i;
>  	int first;
> +	int coherent;
>  	struct resource *res;
>  
>  	if (argc == 1) {
> @@ -82,6 +83,10 @@ static int do_devinfo(int argc, char *argv[])
>  		if (dev->bus)
>  			printf("Bus: %s\n", dev->bus->name);
>  
> +		coherent = dev_is_dma_coherent(dev);
> +		if (coherent >= 0)
> +			printf("DMA Coherent: %s\n", coherent ? "true" : "false");

'>=' doesn't seem to be quite right here.

> +
>  		if (dev->info)
>  			dev->info(dev);
>  
> diff --git a/drivers/of/platform.c b/drivers/of/platform.c
> index 0982873446a6..1bd5d41ba226 100644
> --- a/drivers/of/platform.c
> +++ b/drivers/of/platform.c
> @@ -128,6 +128,9 @@ static void of_dma_configure(struct device *dev, struct device_node *np)
>  	}
>  
>  	dev->dma_offset = offset;
> +#ifdef CONFIG_OF_DMA_COHERENCY
> +	dev->dma_coherent = of_dma_is_coherent(np);
> +#endif
>  }
>  
>  /**
> diff --git a/include/driver.h b/include/driver.h
> index 7e25c060280b..f6301d954bb5 100644
> --- a/include/driver.h
> +++ b/include/driver.h
> @@ -8,6 +8,7 @@
>  
>  #include <linux/list.h>
>  #include <linux/ioport.h>
> +#include <linux/bitops.h>
>  #include <of.h>
>  #include <filetype.h>
>  
> @@ -43,6 +44,13 @@ struct device {
>  	 * something like eth0 or nor0. */
>  	int id;
>  
> +	/*! This particular device is dma coherent, even if the
> +         * architecture supports non-coherent devices.
> +	 */
> +#ifdef CONFIG_OF_DMA_COHERENCY
> +	bool dma_coherent:1;
> +#endif

I think this patch could be easier when we add the dma_coherent field
unconditionally.

> +
>  	struct resource *resource;
>  	int num_resources;
>  
> @@ -652,6 +660,26 @@ static inline struct device_node *dev_of_node(struct device *dev)
>  	return IS_ENABLED(CONFIG_OFDEVICE) ? dev->of_node : NULL;
>  }
>  
> +#ifdef CONFIG_OF_DMA_COHERENCY
> +static inline int __dev_is_dma_coherent(struct device *dev)
> +{
> +	return dev->dma_coherent;
> +}
> +#else
> +static inline bool __dev_is_dma_coherent(struct device *dev)
> +{
> +	return IS_ENABLED(CONFIG_OF_DMA_DEFAULT_COHERENT);
> +}
> +#endif
> +
> +static inline int dev_is_dma_coherent(struct device *dev)
> +{
> +	if (!dev_of_node(dev))
> +		return -EINVAL;
> +
> +	return __dev_is_dma_coherent(dev);
> +}

This should return bool. Either a device is DMA coherent or not.

You seem to assume that a device is not cache coherent when you don't
know:

-       dma_sync_single_for_device(addr, size, dir);
+       if (dev_is_dma_coherent(dev) <= 0)
+               dma_sync_single_for_device(addr, size, dir);

This assumption seems to make sense as an unnecessary cache synchronisation
shouldn't hurt, right?

Sascha

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC 09/12] of: populate new device_d::dma_coherent attribute
  2023-02-27 10:41   ` Sascha Hauer
@ 2024-01-11  7:05     ` Ahmad Fatoum
  0 siblings, 0 replies; 15+ messages in thread
From: Ahmad Fatoum @ 2024-01-11  7:05 UTC (permalink / raw)
  To: Sascha Hauer; +Cc: barebox, lst

Hello Sascha,

(apologies for the late answer. Device was only debricked recently).

On 27.02.23 11:41, Sascha Hauer wrote:
>> +		coherent = dev_is_dma_coherent(dev);
>> +		if (coherent >= 0)
>> +			printf("DMA Coherent: %s\n", coherent ? "true" : "false");
> 
> '>=' doesn't seem to be quite right here.

Negative error code means here that it's unknown whether device is coherent or not.
I have reworked this for v2 to make dev_is_dma_coherent return a boolean.

>> +	/*! This particular device is dma coherent, even if the
>> +         * architecture supports non-coherent devices.
>> +	 */
>> +#ifdef CONFIG_OF_DMA_COHERENCY
>> +	bool dma_coherent:1;
>> +#endif
> 
> I think this patch could be easier when we add the dma_coherent field
> unconditionally.

There's an overhead to determining coherency from the DT, which I would like to
skip if we don't require that support, because all devices are either coherent
or aren't. This is still the case in v2, but dma_coherent is now defined
unconditionally with a DEV_DMA_COHERENCY_DEFAULT value if no OF walk happens.

>> +static inline int dev_is_dma_coherent(struct device *dev)
>> +{
>> +	if (!dev_of_node(dev))
>> +		return -EINVAL;
>> +
>> +	return __dev_is_dma_coherent(dev);
>> +}
> 
> This should return bool. Either a device is DMA coherent or not.

Or we don't know, but in that case we can fall back to the architecture
default. Fixed in v2.

> 
> You seem to assume that a device is not cache coherent when you don't
> know:
> 
> -       dma_sync_single_for_device(addr, size, dir);
> +       if (dev_is_dma_coherent(dev) <= 0)
> +               dma_sync_single_for_device(addr, size, dir);

Yes, but when printing the attribute in devinfo, I didn't want to print

DMA-Coherent: false

when actually we don't know. For v2, I only print it now when the value
differs from the default for non-OF devices.

> This assumption seems to make sense as an unnecessary cache synchronisation
> shouldn't hurt, right?

It hurts when the device turns out to be coherent. In that case, the device
may have written to the cache and we invalidate the cache thinking that
the device has written to memory. With v2 and dev_is_dma_coherent becoming
a boolean, the code is now much more readable IMO.

Thanks for the review.

Cheers,
Ahmad



> 
> Sascha
> 

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2024-01-11  7:07 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-21  8:05 [PATCH RFC 00/12] ARM64: layerscape: make LS1046 DMA coherent Ahmad Fatoum
2023-02-21  8:05 ` [PATCH RFC 01/12] usb: dwc3: populate parent of xHCI dev Ahmad Fatoum
2023-02-21  8:05 ` [PATCH RFC 02/12] usb: xhci: pass physical device to DMA API Ahmad Fatoum
2023-02-21  8:05 ` [PATCH RFC 03/12] net: rtl8169: pass physical device for " Ahmad Fatoum
2023-02-21  8:05 ` [PATCH RFC 04/12] firmware: zynqmp-fpga: pass physical device to " Ahmad Fatoum
2023-02-21  8:05 ` [PATCH RFC 05/12] net: designware: eqos: " Ahmad Fatoum
2023-02-21  8:05 ` [PATCH RFC 06/12] x86: select OF_DMA_DEFAULT_COHERENT Ahmad Fatoum
2023-02-21  8:05 ` [PATCH RFC 07/12] dma: define CONFIG_OF_DMA_COHERENCY Ahmad Fatoum
2023-02-21  8:05 ` [PATCH RFC 08/12] RISC-V: StarFive: J7100: set /soc/dma-noncoherent Ahmad Fatoum
2023-02-21  8:05 ` [PATCH RFC 09/12] of: populate new device_d::dma_coherent attribute Ahmad Fatoum
2023-02-27 10:41   ` Sascha Hauer
2024-01-11  7:05     ` Ahmad Fatoum
2023-02-21  8:05 ` [PATCH RFC 10/12] dma: fix dma_sync when not all device DMA is equally coherent Ahmad Fatoum
2023-02-21  8:05 ` [PATCH RFC 11/12] dma: provide of_dma_coherent_fixup helper Ahmad Fatoum
2023-02-21  8:05 ` [PATCH RFC 12/12] ARM64: layerscape: configure all DMA masters to be cache-coherent Ahmad Fatoum

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox