Best way to create a generic library

If you’re asking about creating a target-generic static library based on the ST (CubeMX) HAL, I’d say that doing so would be difficult if not impossible. However, take this assertion with a grain of salt, as I can’t say that I’ve had a lot of direct experience using the Cube HAL. From what I’ve seen, the Cube HAL code makes liberal use of conditional compilation (#ifdef / #if defined() directives) that changes the code generated at build time depending on the selected target device. Once a library is compiled that way, only the code for the selected device is in it, which would preclude using it as a static library intended to be usable on every STM32 variant.
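
To make the conditional compilation point concrete, here is a minimal sketch. It is not Cube HAL code: the DEMO_TARGET_F0 / DEMO_TARGET_F3 macros and the function name are hypothetical, and the returned values are illustrative rather than taken from any datasheet. It only shows that the preprocessor picks one branch at build time, so the resulting object code contains nothing for the other target.

```c
#include <stdint.h>

/* DEMO_TARGET_F0 / DEMO_TARGET_F3 are hypothetical stand-ins for the device
   define a real project passes to the compiler. Default to F0 so the sketch
   builds on its own. */
#if !defined(DEMO_TARGET_F0) && !defined(DEMO_TARGET_F3)
#define DEMO_TARGET_F0
#endif

/* Number of ADC instances the library was built for. The values are
   illustrative only, not taken from a datasheet. */
uint32_t demo_adc_instance_count(void)
{
#if defined(DEMO_TARGET_F0)
    return 1u;   /* only this branch reaches the compiler on an F0 build */
#else
    return 4u;   /* an F3 build compiles this branch instead */
#endif
}
```

Built with no extra defines, demo_adc_instance_count() returns 1. Rebuilding with -DDEMO_TARGET_F3 changes the answer, but a library archive that has already been built cannot change it.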

What I have had a moderate amount of experience with is creating my own libraries for some of the STM32’s peripherals, and that experience inclines me to say that while one might be able to create a truly device-independent static library for a given peripheral, say, the ADC, doing so would be difficult and the end result would be inefficient in both code size and speed. While I will admit that ST has done a fair job keeping the peripherals across the STM32 product family consistent, there are device-to-device variations - not to mention that different parts have different subsets and numbers of peripherals. The ADC I mentioned earlier is one example of a peripheral that changes significantly depending on the target device. The F0 devices (those that have an ADC to begin with) provide ‘basic’ functionality, whereas the ADC on, say, an M4-core device like the F3/F4 has a more advanced, and different, architecture. Since I was using the ADC on both F0 and F3-based projects, I started with the intention of designing an ADC support library that would run on both targets, but after noting the significant differences between the F0 and F3 ADCs, I eventually opted to create separate libraries for the two different ADC types.
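
For comparison, here is a rough sketch of what the device-independent alternative tends to look like: a table of function pointers chosen at runtime. Everything in it is hypothetical (the demo_ names, the family enum, and the step counts standing in for real ADC start-up code). The thing to notice is that both implementations are linked into every build, and every call goes through a pointer, which is where the size and speed cost comes from.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical family identifiers; a real library would have to work this
   out from the hardware at startup. */
typedef enum { DEMO_FAMILY_F0, DEMO_FAMILY_F3 } demo_family_t;

/* One set of driver entry points per ADC type. */
typedef struct {
    uint32_t (*startup_steps)(void);  /* stand-in for real init code */
} demo_adc_ops_t;

static uint32_t demo_f0_startup_steps(void) { return 2u; } /* illustrative */
static uint32_t demo_f3_startup_steps(void) { return 3u; } /* illustrative */

/* Both implementations are always linked in, whichever chip we run on. */
static const demo_adc_ops_t demo_adc_ops[] = {
    [DEMO_FAMILY_F0] = { demo_f0_startup_steps },
    [DEMO_FAMILY_F3] = { demo_f3_startup_steps },
};

/* Pick the driver for the detected family; NULL if the family is unknown. */
const demo_adc_ops_t *demo_adc_select(demo_family_t family)
{
    size_t count = sizeof demo_adc_ops / sizeof demo_adc_ops[0];
    if ((size_t)family >= count) {
        return NULL;
    }
    return &demo_adc_ops[family];
}
```

A caller would do something like demo_adc_select(DEMO_FAMILY_F3)->startup_steps() after checking for NULL. With two ADC types this is manageable; with every variation across the whole STM32 range the table and the dead code grow quickly.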

Another example: A while ago I wrote a general-purpose USART library (for async “RS232-like” streaming I/O) that I took pains to make work on practically every STM32 variant in existence at the time. This was difficult enough to do even in a “build for a specific target device” mode. Every STM32 device seems to have a different number and combination of available U(S)ARTs. If I had attempted to make the code figure out at runtime what target it was running on and configure itself accordingly - that is, constructed it so it could be built as a target-agnostic static library - it would have been an order of magnitude more difficult to design and code, and would have bloated the code size by at least 2x. I was able to create a common codebase that (I think) will work on most/all STM32 devices, but to do this I had to make liberal use of conditional compilation (#if / #ifdef). The code has to be built with the target device known in advance and, once built, will only run properly on the target it was configured for.
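
As a rough illustration of the “common codebase, built per target” approach described above, the sketch below builds a table of available USARTs with #if. It is not the actual library: DEMO_USART_MASK and the demo_ functions are made up for the example, and the default mask (USART1 and USART2 present) is an arbitrary assumption so the code compiles on its own.

```c
#include <stddef.h>

/* Hypothetical per-target setting: bit n-1 set means USARTn is present.
   A real build would derive this from the device define. The default here
   (USART1 + USART2) is arbitrary. The mask must not be 0, since C does not
   allow an empty array initializer. */
#ifndef DEMO_USART_MASK
#define DEMO_USART_MASK 0x03u
#endif

/* Only the entries for ports present on the target are compiled in. */
static const char *const demo_usart_names[] = {
#if DEMO_USART_MASK & 0x01u
    "USART1",
#endif
#if DEMO_USART_MASK & 0x02u
    "USART2",
#endif
#if DEMO_USART_MASK & 0x04u
    "USART3",
#endif
};

/* Number of ports this particular build knows about. */
size_t demo_usart_count(void)
{
    return sizeof demo_usart_names / sizeof demo_usart_names[0];
}

/* Name of the port at a given table index, or NULL if out of range. */
const char *demo_usart_name(size_t index)
{
    return index < demo_usart_count() ? demo_usart_names[index] : NULL;
}
```

With the default mask, demo_usart_count() is 2 and the table holds USART1 and USART2 only. The source is shared across targets, but each compiled object is tied to one mask, which is exactly why the result cannot be shipped as a single target-agnostic static library.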