mail archive of the barebox mailing list
From: Ahmad Fatoum <a.fatoum@pengutronix.de>
To: Sascha Hauer <s.hauer@pengutronix.de>
Cc: BAREBOX <barebox@lists.infradead.org>,
	"Claude Sonnet 4.5" <noreply@anthropic.com>
Subject: Re: [PATCH 06/19] elf: implement elf_load_inplace()
Date: Tue, 6 Jan 2026 09:18:39 +0100
Message-ID: <a2b90e6a-be82-4ace-9f40-2a73b31e7cf7@pengutronix.de>
In-Reply-To: <aVw-cJ1f-B17q2j8@pengutronix.de>

On 1/5/26 23:42, Sascha Hauer wrote:
> On Mon, Jan 05, 2026 at 02:37:13PM +0100, Ahmad Fatoum wrote:
>> Hi,
>>
>> On 1/5/26 12:26 PM, Sascha Hauer wrote:
>>> Implement elf_load_inplace() to apply dynamic relocations to an ELF binary
>>> that is already loaded in memory. Unlike elf_load(), this function does not
>>> allocate memory or copy segments - it only modifies the existing image in
>>> place.
>>>
>>> This is useful for self-relocating loaders or when the ELF has been loaded
>>> by external means (e.g., firmware or another bootloader).
>>
>> Nice. This is more elegant than what I came up with (compressing every
>> segment separately). :)
>>
>>> For ET_DYN (position-independent) binaries, the relocation offset is
>>> calculated relative to the first executable PT_LOAD segment (.text section),
>>> taking into account the difference between the segment's virtual address
>>> and its file offset.
>>
>> While this may be true for barebox proper, I think in the general case,
>> we need to iterate over all segments and take the lowest address.
>> The first executable PT_LOAD shouldn't have a special significance in
>> this case.
>>
>>>
>>> The entry point is also adjusted to point to the relocated image.
>>>
>>> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
>>> Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
>>> ---
>>>  common/elf.c  | 152 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>  include/elf.h |   8 ++++
>>>  2 files changed, 160 insertions(+)
>>>
>>> diff --git a/common/elf.c b/common/elf.c
>>> index fc2949c285ebb0c0740c68c551926da8d0bb8637..565b283b694773727ef77917cfd8c1d4ee83a8d1 100644
>>> --- a/common/elf.c
>>> +++ b/common/elf.c
>>> @@ -531,3 +531,155 @@ void elf_close(struct elf_image *elf)
>>>  
>>>  	free(elf);
>>>  }
>>> +
>>> +static void *elf_find_dynamic_inplace(struct elf_image *elf)
>>
>> const void *
>>
>>> +{
>>> +	void *buf = elf->hdr_buf;
>>> +	void *phdr = buf + elf_hdr_e_phoff(elf, buf);
>>> +	int i;
>>> +
>>> +	for (i = 0; i < elf_hdr_e_phnum(elf, buf); i++) {
>>> +		if (elf_phdr_p_type(elf, phdr) == PT_DYNAMIC) {
>>> +			u64 offset = elf_phdr_p_offset(elf, phdr);
>>> +			/* For in-place binary, PT_DYNAMIC is at hdr_buf + offset */
>>> +			return elf->hdr_buf + offset;
>>> +		}
>>> +		phdr += elf_size_of_phdr(elf);
>>> +	}
>>> +
>>> +	return NULL;  /* No PT_DYNAMIC segment */
>>> +}
>>> +
>>> +/**
>>> + * elf_load_inplace() - Apply dynamic relocations to an ELF binary in place
>>> + * @elf: ELF image previously opened with elf_open_binary()
>>> + *
>>> + * This function applies dynamic relocations to an ELF binary that is already
>>> + * loaded at its target address in memory. Unlike elf_load(), this does not
>>> + * allocate memory or copy segments - it only modifies the existing image.
>>> + *
>>> + * This is useful for self-relocating loaders or when the ELF has been loaded
>>> + * by external means (e.g., loaded by firmware or another bootloader).
>>> + *
>>> + * The ELF image must have been previously opened with elf_open_binary().
>>> + *
>>> + * For ET_DYN (position-independent) binaries, the relocation offset is
>>> + * calculated relative to the first executable PT_LOAD segment (.text section).
>>> + *
>>> + * For ET_EXEC binaries, no relocation is applied as they are expected to
>>> + * be at their link-time addresses.
>>> + *
>>> + * Returns: 0 on success, negative error code on failure
>>> + */
>>> +int elf_load_inplace(struct elf_image *elf)
>>> +{
>>> +	void *dyn_seg;
>>> +	void *buf, *phdr;
>>> +	void *elf_buf;
>>> +	int i, ret;
>>> +
>>> +	buf = elf->hdr_buf;
>>> +	elf_buf = elf->hdr_buf;
>>> +
>>> +	/*
>>> +	 * First pass: Clear BSS segments (p_memsz > p_filesz).
>>> +	 * This must be done before relocations as uninitialized data
>>> +	 * must be zeroed per C standard.
>>> +	 */
>>> +	phdr = buf + elf_hdr_e_phoff(elf, buf);
>>> +	for (i = 0; i < elf_hdr_e_phnum(elf, buf); i++) {
>>> +		if (elf_phdr_p_type(elf, phdr) == PT_LOAD) {
>>> +			u64 p_offset = elf_phdr_p_offset(elf, phdr);
>>> +			u64 p_filesz = elf_phdr_p_filesz(elf, phdr);
>>> +			u64 p_memsz = elf_phdr_p_memsz(elf, phdr);
>>> +
>>> +			/* Clear BSS (uninitialized data) */
>>> +			if (p_filesz < p_memsz) {
>>> +				void *bss_start = elf_buf + p_offset + p_filesz;
>>> +				size_t bss_size = p_memsz - p_filesz;
>>> +				memset(bss_start, 0x00, bss_size);
>>
>> How can this be done in-place? If this is padding to get to a page
>> boundary, I would assume it's already zero. If it goes beyond a page,
>> this will overwrite follow-up segments, wouldn't it?
>>
>> I also find bss_ an unfortunate name here as this is applied to all segments.
> 
> The code basically implements this (from
> https://refspecs.linuxbase.org/elf/gabi4+/ch5.pheader.html):
> 
> PT_LOAD
>     The array element specifies a loadable segment, described by
>     p_filesz and p_memsz. The bytes from the file are mapped to the
>     beginning of the memory segment. If the segment's memory size
>     (p_memsz) is larger than the file size (p_filesz), the ``extra''
>     bytes are defined to hold the value 0 and to follow the segment's
>     initialized area
> 
> You are right, this is done for all segments, but de facto the segment
> containing the bss section is the only one where p_filesz < p_memsz
> is actually used, so "Clear bss segments" doesn't sound too wrong to me.
> 
> I experimented a bit. When I change the linker file to move the bss
> section before the data section, the bss section really shows up as
> zeroes in the ELF file and p_filesz becomes p_memsz for all segments.
> Only when the bss section is at the very end of the binary is it no
> longer part of the ELF file; it may then cover uninitialized memory,
> which we really have to memset.

Ok.
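
For reference, the gABI rule quoted above (bytes between p_filesz and
p_memsz read as zero) can be sketched for an image that already sits in
memory roughly as below. This is only an illustration, not the code from
the patch: sketch_zero_tail() and its parameters are made-up names, and
it assumes the caller has already resolved where the segment starts in
memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: zero the "extra" bytes of one PT_LOAD segment
 * that is already in memory at seg_start, i.e. the range from p_filesz
 * up to p_memsz. Returns the number of bytes cleared, which is 0 when
 * the segment has no such tail (p_memsz <= p_filesz).
 */
static size_t sketch_zero_tail(void *seg_start, uint64_t p_filesz,
			       uint64_t p_memsz)
{
	size_t len;

	assert(seg_start != NULL);

	if (p_memsz <= p_filesz)
		return 0;

	len = (size_t)(p_memsz - p_filesz);
	memset((unsigned char *)seg_start + p_filesz, 0, len);

	return len;
}
```

Note that the sketch does nothing to protect whatever follows the
segment in memory, so it is only safe when the tail does not overlap
another segment, which matches the "bss at the very end" case described
above.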

>>> +	if (elf->type == ET_DYN) {
>>> +		u64 text_vaddr = 0;
>>> +		u64 text_offset = 0;
>>> +		bool found_text = false;
>>> +
>>> +		/* Find first executable PT_LOAD segment (.text) */
>>
>> As mentioned, we should rather get the lowest address across all segments.
> 
> Not sure if this is true. I didn't manage to generate a binary where the
> text section is not the first one. When I try to move for example the
> data section before the text section then I end up with no executable
> segment at all. Something weird is happening there, I haven't fully
> understood how the generation of segments from sections works.

Quoting your link:

An executable or shared object file's base address (on platforms that support
the concept) is calculated during execution from three values:
the virtual memory load address, the maximum page size, and the lowest virtual
address of a program's loadable segment. To compute the base address,
one determines the memory address associated with the lowest p_vaddr value for
a PT_LOAD segment.
This address is truncated to the nearest multiple of the maximum page size.
The corresponding p_vaddr value itself is also truncated to the nearest
multiple of the maximum page size.
The base address is the difference between the truncated memory address and the
truncated p_vaddr value.
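
As a rough sketch of what the quoted rule amounts to (not taken from the
patch; struct sketch_phdr, SKETCH_PT_LOAD and both functions are
hypothetical stand-ins for the barebox accessors, and the page size is
assumed to be a power of two):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PT_LOAD 1u	/* same numeric value as PT_LOAD */

/* Hypothetical, simplified program header: only the fields used here. */
struct sketch_phdr {
	uint32_t p_type;
	uint64_t p_vaddr;
};

/*
 * Return the lowest p_vaddr of any PT_LOAD entry, independent of its
 * position in the table or its flags. Returns UINT64_MAX if the table
 * has no PT_LOAD entry.
 */
static uint64_t sketch_lowest_vaddr(const struct sketch_phdr *ph, size_t n)
{
	uint64_t lowest = UINT64_MAX;
	size_t i;

	for (i = 0; i < n; i++) {
		if (ph[i].p_type == SKETCH_PT_LOAD && ph[i].p_vaddr < lowest)
			lowest = ph[i].p_vaddr;
	}

	return lowest;
}

/*
 * Base address as described in the quoted text: truncate both the
 * memory address of the lowest segment and its p_vaddr to the page
 * size, then take the difference.
 */
static uint64_t sketch_base_address(uint64_t load_addr,
				    uint64_t lowest_vaddr,
				    uint64_t page_size)
{
	uint64_t mask;

	/* page_size must be a non-zero power of two */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	mask = ~(page_size - 1);

	return (load_addr & mask) - (lowest_vaddr & mask);
}
```

Here load_addr is meant to be the address at which the segment with the
lowest p_vaddr actually resides, so the result is the offset to add to
link-time addresses when applying relocations.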

>>> +	/* Apply architecture-specific relocations */
>>> +	ret = elf_apply_relocations(elf, dyn_seg);
>>
>> Should I upstream my relocate_image changes that reuse the
>> relocate_to_current_adr() code and you rebase on top?
> 
> Show the patch and we'll see.

Just sent it.

Cheers,
Ahmad

> 
> Sascha
> 


-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |


