How to debug multi-threading problems

FatFS has some requirements for re-entrant use (e.g. FF_FS_REENTRANT) and constraints on exactly which functions can be called concurrently.
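For reference, here is a minimal sketch of the re-entrancy options in ffconf.h. The option names are as I recall them from FatFs R0.13/R0.14; older releases use the _FS_REENTRANT spelling and R0.15 drops FF_SYNC_t, so check the ffconf.h that ships with your project. The mutex type assumes CMSIS-RTOS v2 and is only an example, as are the values.

```c
/* ffconf.h excerpt (sketch; values are illustrative, not recommendations) */
#define FF_FS_REENTRANT 1            /* 1 = FatFs takes a per-volume lock around each call */
#define FF_FS_TIMEOUT   1000         /* how long to wait for that lock, in OS ticks */
#define FF_SYNC_t       osMutexId_t  /* assumption: CMSIS-RTOS v2 mutex handle type */
```

With this enabled, FatFs also expects the port to supply the lock functions (ff_cre_syncobj, ff_req_grant, ff_rel_grant and ff_del_syncobj in those versions).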

Also, from memory, the HAL USB code has its own requirements around being called concurrently.

Debugging concurrent stuff is not trivial - for example you write “after several times running this thread” - does that mean you create and destroy a new thread each time? Or that the thread is blocked, waiting for your message?
The former can lead to memory issues; the latter is how I’d do it personally.
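To make the second option concrete, here is a rough sketch of a thread that is created once and then sits blocked until a message arrives. It uses POSIX threads so it can be run on a PC; on the target you would use your RTOS's queue instead (a FreeRTOS or CMSIS-RTOS message queue, for instance). All names here (msg_queue_t, worker_main, run_demo, MSG_STOP) are made up for illustration, and the counter stands in for the real file or USB work.

```c
#include <pthread.h>
#include <stddef.h>

#define QUEUE_LEN 8
#define MSG_STOP  (-1)   /* hypothetical sentinel telling the worker to exit */

/* Fixed-size message queue guarded by one mutex and two condition variables. */
typedef struct {
    int items[QUEUE_LEN];
    size_t head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t not_empty;
    pthread_cond_t not_full;
} msg_queue_t;

typedef struct {
    msg_queue_t queue;
    int handled;         /* written by the worker only, read after join */
} worker_t;

static void queue_init(msg_queue_t *q) {
    q->head = q->tail = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

/* Blocks the sender while the queue is full. */
static void queue_send(msg_queue_t *q, int msg) {
    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_LEN)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->items[q->tail] = msg;
    q->tail = (q->tail + 1) % QUEUE_LEN;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Blocks the caller until a message is available. */
static int queue_receive(msg_queue_t *q) {
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    int msg = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_LEN;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return msg;
}

/* Long-lived worker: created once, then blocked in queue_receive() between jobs. */
static void *worker_main(void *arg) {
    worker_t *w = arg;
    for (;;) {
        int msg = queue_receive(&w->queue);
        if (msg == MSG_STOP)
            break;
        w->handled++;    /* stand-in for the real work */
    }
    return NULL;
}

/* Starts one worker, sends n messages, stops it; returns how many were handled. */
int run_demo(int n) {
    worker_t w = { .handled = 0 };
    pthread_t tid;
    queue_init(&w.queue);
    if (pthread_create(&tid, NULL, worker_main, &w) != 0)
        return -1;
    for (int i = 0; i < n; i++)
        queue_send(&w.queue, i);   /* payloads are >= 0, so never MSG_STOP */
    queue_send(&w.queue, MSG_STOP);
    pthread_join(tid, NULL);
    pthread_cond_destroy(&w.queue.not_full);
    pthread_cond_destroy(&w.queue.not_empty);
    pthread_mutex_destroy(&w.queue.lock);
    return w.handled;
}
```

The thread and its stack are allocated exactly once, so running the job many times does not allocate or free anything. If this thread is also the only one that touches the file system, the question of what may be called concurrently mostly goes away.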

You write “the program stops running” - does it crash or is your thread just blocked for some reason?