If turning on optimization causes a problem, it most likely comes from missing synchronization in your code. GCC optimizes quite aggressively, so when programming devices that run in parallel with the CPU, such as a DMA controller, you should be careful about two points:
- check that all registers and DMA descriptors are correctly declared as volatile
- check that, where ordering matters (for example two writes that must occur in a definite order, one of them being to ordinary memory), you have placed a memory barrier between them, using the CMSIS __DMB() intrinsic
The LED code most probably does not have this kind of problem, as it uses a simple register-based interface. However, because the DMA controller runs in parallel with the CPU, you must be absolutely sure that, at the moment you start the transfer, all the other information is already visible to the DMA controller.
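To make the two points concrete, here is a minimal sketch of the pattern, not code for your actual chip. The descriptor layout, the `DMA_CTRL` register, the `DMA_CTRL_START` bit and the `DMA_BARRIER()` wrapper are all hypothetical names. On a Cortex-M target the wrapper would be the CMSIS `__DMB()` intrinsic, and the GCC builtin fence is only there so the sketch also builds on a PC.

```c
#include <stdint.h>

/* Hypothetical wrapper: on a Cortex-M target, define this as the CMSIS
   __DMB() intrinsic. The GCC builtin below is only a host-side stand-in. */
#ifndef DMA_BARRIER
#define DMA_BARRIER() __atomic_thread_fence(__ATOMIC_SEQ_CST)
#endif

/* Hypothetical descriptor layout; the real one comes from the datasheet.
   volatile keeps the compiler from dropping or merging these accesses. */
typedef struct {
    volatile uint32_t src;    /* source address */
    volatile uint32_t dst;    /* destination address */
    volatile uint32_t count;  /* number of bytes to transfer */
} dma_desc_t;

/* Stand-in for the real control register, which on hardware would be a
   volatile access to a fixed, memory-mapped address. */
static volatile uint32_t fake_dma_ctrl;
#define DMA_CTRL        fake_dma_ctrl
#define DMA_CTRL_START  1u

static dma_desc_t desc;

void dma_start(const void *src, void *dst, uint32_t count)
{
    /* 1. Fill in the descriptor (ordinary memory read by the DMA engine). */
    desc.src   = (uint32_t)(uintptr_t)src;
    desc.dst   = (uint32_t)(uintptr_t)dst;
    desc.count = count;

    /* 2. Make sure the descriptor writes are complete before going on. */
    DMA_BARRIER();

    /* 3. Only now start the transfer. */
    DMA_CTRL = DMA_CTRL_START;
}
```

The point of the ordering is that the barrier sits between the last descriptor write and the register write that starts the transfer, so the controller never sees a half-written descriptor. If source or destination buffers are filled through non-volatile pointers, those writes also need to come before the barrier.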
Hope this helps,