Bootloader Erased when Application is programmed

If you change your linker configuration file (normally called LinkerScript.ld) so that the ROM memory region leaves out the portion of program space used by your bootloader, the ELF loader will not erase or overwrite the excluded portions of ROM/FLASH.

Here’s what the MEMORY definition inside LinkerScript.ld looks like by default for the STM32F091RC (128K FLASH, 32K RAM):

  RAM (rwx)     : ORIGIN = 0x20000000, LENGTH = 32K
  ROM (rx)      : ORIGIN = 0x08000000, LENGTH = 128K

The modified version below shows what you’d need to do if you wanted to reserve the first 32K of program FLASH space for a bootloader:

  RAM (rwx)     : ORIGIN = 0x20000000, LENGTH = 32K
  ROM (rx)      : ORIGIN = 0x08008000, LENGTH = 96K

Note how the ROM start address was increased by 0x8000 from 0x0800_0000 to 0x0800_8000, and the LENGTH reduced from 128K to 96K.
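The arithmetic behind that change can be sketched in a few lines of C. This is only an illustration: the `flash_region_t` type, the `KIB` macro and the `app_region` helper are hypothetical names made up for this example, not part of the toolchain or of any ST library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type, for illustration only. */
typedef struct {
    uint32_t origin;  /* start address of the region */
    uint32_t length;  /* size of the region in bytes */
} flash_region_t;

#define KIB(n) ((uint32_t)(n) * 1024u)

/* Returns the region left for the application after reserving
 * boot_size bytes at the start of flash for a bootloader. */
static flash_region_t app_region(uint32_t flash_base,
                                 uint32_t flash_size,
                                 uint32_t boot_size)
{
    flash_region_t r;

    assert(boot_size <= flash_size);   /* the reserve must fit in flash */
    r.origin = flash_base + boot_size; /* ORIGIN moves up by the reserve */
    r.length = flash_size - boot_size; /* LENGTH shrinks by the same amount */
    return r;
}
```

With the numbers from this post, `app_region(0x08000000, KIB(128), KIB(32))` gives an origin of 0x08008000 and a length of 96K, which matches the modified ROM line.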

When you start a debug or run session, or use the “Target” option in the Project Explorer right-click context menu to program your code, and that code was built using the modified MEMORY definition shown above, the first 32K of program FLASH should be left unchanged.

At the section level, you can exclude a section from being loaded/flashed by adding the NOLOAD modifier to its declaration in the linker configuration file. The example below shows what I did to allocate a 2K page of program FLASH for non-volatile parameter variable storage:

  RAM (rwx)     : ORIGIN = 0x20000000, LENGTH = 32K
  ROM (rx)      : ORIGIN = 0x08000000, LENGTH = 126K
  P_FLASH (rw)  : ORIGIN = 0x0801F800, LENGTH = 2K

. . . 

  /* Non-volatile parameter data in FLASH
     Note - from GNU toolchain documentation:
     The (NOLOAD) directive will mark a section to not be loaded at run time.
     The linker will process the section normally, but will mark it so that
     a program loader will not load it into memory. */

  .param_flash (NOLOAD) :
  {
    . = ALIGN(0x800);
    __param_flash_start = .;
    . = ALIGN(4);
    __param_flash_end = .;
  } >P_FLASH
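A quick way to sanity-check a layout like this one is to confirm that the ROM region ends exactly where P_FLASH begins and that P_FLASH starts on a page boundary. The C sketch below does that check. The helper names are hypothetical, and the 2K page size is taken from this post rather than from a device header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FLASH_PAGE_SIZE 0x800u  /* 2K page, as stated in the post */

/* True if addr sits on a flash page boundary. */
static bool is_page_aligned(uint32_t addr)
{
    return (addr % FLASH_PAGE_SIZE) == 0u;
}

/* True if region B starts exactly where region A ends
 * (no gap and no overlap between the two). */
static bool regions_adjacent(uint32_t a_origin, uint32_t a_length,
                             uint32_t b_origin)
{
    return (a_origin + a_length) == b_origin;
}
```

For the MEMORY definition above, 0x08000000 + 126K comes to 0x0801F800, so both checks pass, and P_FLASH plus its 2K length ends at 0x08020000, the top of the 128K device.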

In the C source, you’d declare a variable to be located in this section like this:

param_t param_flash __attribute__ ((section (".param_flash"),used));

I won’t go into more detail on how to create NV variable storage; there’s more to it than what I’ve shown here. I bring it up only to demonstrate how one can use the NOLOAD section modifier to instruct the linker/loader to exclude otherwise ROMable sections from a program load.
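One consequence worth illustrating: because a NOLOAD section is not written when the program is loaded, the variable holds whatever is already in that FLASH page, which on a freshly erased STM32F0 part should read as all 0xFF bytes. Code that uses it therefore needs some way to tell valid data from an erased page. The sketch below shows one simple approach. The `param_t` layout, the `PARAM_MAGIC` value and the additive checksum are assumptions made up for this example; the real `param_t` is not shown in this post.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PARAM_MAGIC 0x50415241u  /* hypothetical marker value ("PARA") */

/* Hypothetical layout; the real param_t is not shown in the post. */
typedef struct {
    uint32_t magic;      /* PARAM_MAGIC when the block has been written */
    uint32_t baud_rate;  /* example parameter */
    uint32_t checksum;   /* magic + baud_rate, wrapping modulo 2^32 */
} param_t;

/* Simple additive check, chosen only to keep the example short. */
static uint32_t param_checksum(const param_t *p)
{
    return p->magic + p->baud_rate;
}

/* True only if the block carries the marker and a matching checksum.
 * An erased page (all 0xFF) fails the marker comparison. */
static bool param_is_valid(const param_t *p)
{
    return p->magic == PARAM_MAGIC && p->checksum == param_checksum(p);
}
```

On the target you would call `param_is_valid(&param_flash)` at startup and fall back to defaults when it returns false. A real design would likely use a stronger check such as a CRC.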