

Eliminate malloc support for bare metal platform?

This is not a direct answer to your question, but you may still find it useful for trimming some of the bloat from gcc’s libc.

Add these linker flags to your project configuration:
Project -> Properties -> C/C++ Build -> Settings -> MCU GCC Linker -> Miscellaneous -> Linker flags

-specs=nosys.specs -specs=nano.specs

This won’t get rid of malloc() in and of itself, but it will cut libc’s footprint by a few KB, because nano.specs links against the size-optimized newlib-nano instead of the full newlib.

You can change the amount of RAM reserved for the stack and the heap by editing the linker script (*.ld) for your project. I don’t recall the specific symbol names, but they are defined right at the top of the file and are easy to spot.
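
To illustrate why the heap size matters for malloc(): newlib’s malloc() gets its memory through an sbrk-style hook (`_sbrk` on bare metal), so once the heap region is used up, every further request fails. Below is a minimal host-side sketch of that idea, not the actual SW4STM32 code. The names `SKETCH_HEAP_SIZE`, `sketch_heap`, `sketch_brk` and `sketch_sbrk` are made up for illustration; on a real target the region would be delimited by linker-script symbols (in ST-generated scripts I believe they are usually called `_Min_Heap_Size` and `_Min_Stack_Size`, but treat that as an assumption and check your own .ld file).

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the heap size configured in the linker script.
 * On a real target the heap would be bounded by linker symbols instead. */
#define SKETCH_HEAP_SIZE 64u

static uint8_t sketch_heap[SKETCH_HEAP_SIZE]; /* simulated heap region */
static size_t sketch_brk = 0;                 /* bytes handed out so far */

/* sbrk-style hook: move the break by incr bytes.
 * Returns the previous break on success, or (void *)-1 with
 * errno set to ENOMEM if the request does not fit in the region. */
void *sketch_sbrk(ptrdiff_t incr)
{
    if (incr < 0) {
        /* Shrinking: cannot give back more than was handed out. */
        if ((size_t)(-incr) > sketch_brk) {
            errno = ENOMEM;
            return (void *)-1;
        }
    } else if ((size_t)incr > SKETCH_HEAP_SIZE - sketch_brk) {
        /* Growing: request would run past the end of the heap region. */
        errno = ENOMEM;
        return (void *)-1;
    }

    void *prev = &sketch_heap[sketch_brk];
    sketch_brk = (size_t)((ptrdiff_t)sketch_brk + incr);
    return prev;
}
```

With a hook like this, shrinking the heap region to zero would make every allocation fail at run time, which is a cheap way to catch stray malloc() calls even though the function itself is still linked in.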