FPU going too slow
What I have done in similar situations:
Option 1) Go in with the debugger and open the disassembly window on the routine.
Option 2) Some compilers have a ‘compile to assembly’ option. Turn it on and get the assembly version of your code.
Option 3) Find a disassembler and disassemble the compiled object code.
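As a rough sketch of the last two options, assuming a GCC or Clang style toolchain (the file name dot.c and the routine dot are made up for illustration, they are not from the original question):

```c
/* dot.c: hypothetical floating-point routine, used only as an example.
 *
 * Assuming a GCC or Clang style toolchain:
 *   cc -O2 -S dot.c     writes the compiler's assembly output to dot.s
 *   cc -O2 -c dot.c     builds the object file dot.o
 *   objdump -d dot.o    disassembles the compiled object file
 */
#include <stddef.h>

/* Multiply-accumulate over two arrays: typical FPU-bound inner loop. */
double dot(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}
```

Other compilers usually have an equivalent switch under a different name, so check the manual for yours.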
Now that you have the assembly version of the ‘C’ routine, review it. In every case I have seen, a huge amount of time is wasted moving values into and out of registers.
But to start with, #ifdef out your ‘C’ routine and substitute the assembly version (as-is, no mods). Compile it and make sure it works.
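One way the #ifdef swap might look. The macro USE_ASM_SCALE, the routine scale_sum, and the helper names are all hypothetical; the assembly itself would live in a separate file that gets linked in when the macro is defined:

```c
#include <stddef.h>

#ifdef USE_ASM_SCALE
/* Assembly version, pasted unmodified from the compiler output into its
 * own file and linked in. Only the prototype is visible here. */
double scale_sum(const double *v, size_t n, double k);
#else
/* Original C version, used whenever USE_ASM_SCALE is not defined. */
double scale_sum(const double *v, size_t n, double k)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += v[i] * k;
    return sum;
}
#endif

/* Reference copy of the C routine that is never #ifdef'ed out, so the
 * swapped-in version always has something to be checked against. */
static double scale_sum_ref(const double *v, size_t n, double k)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += v[i] * k;
    return sum;
}

/* Returns 1 if the active routine agrees with the reference, else 0. */
int scale_sum_matches_ref(const double *v, size_t n, double k)
{
    return scale_sum(v, n, k) == scale_sum_ref(v, n, k);
}
```

An exact comparison is reasonable while the assembly is still the compiler's own output. Once you start reordering floating-point operations by hand, rounding can differ slightly and a small tolerance is probably the better check.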
Now, start by removing the unnecessary loads and stores. Often after every instruction the intermediate value is copied from the temporary register into memory, then out of memory and back into the (sometimes same) register.
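The same pattern can often be seen from the ‘C’ side. In this sketch (function names are made up), the first routine accumulates through a pointer, so the compiler generally has to assume the pointer may alias the input and store and reload the running total on every pass. The second keeps the total in a local variable that can stay in a register:

```c
#include <stddef.h>

/* Accumulates directly in memory. Because out could point into v, the
 * compiler typically emits a load and a store of *out per iteration. */
void sum_via_memory(const double *v, size_t n, double *out)
{
    *out = 0.0;
    for (size_t i = 0; i < n; i++)
        *out += v[i];
}

/* Accumulates in a local, which the compiler can keep in a register.
 * The result is written to memory once, after the loop. */
void sum_via_register(const double *v, size_t n, double *out)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += v[i];
    *out = acc;
}
```

Comparing the assembly listings of the two is a quick way to see what a redundant load/store pair looks like on your particular processor.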
Using the assembly manual, you can calculate how much processor time each assembly instruction takes.
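The arithmetic is just a sum of (times executed) x (cycles per instruction). A small sketch of that bookkeeping follows; the struct and function names are made up, and the cycle counts you feed it have to come from the manual for your own processor:

```c
#include <stddef.h>

/* One row per instruction that appears in the routine's listing. */
struct insn_cost {
    const char *mnemonic;  /* instruction name as shown in the listing */
    unsigned count;        /* how many times it executes per call */
    unsigned cycles;       /* cycles per execution, taken from the manual */
};

/* Sum of count * cycles over the whole table. */
unsigned long total_cycles(const struct insn_cost *table, size_t n)
{
    unsigned long total = 0;
    for (size_t i = 0; i < n; i++)
        total += (unsigned long)table[i].count * table[i].cycles;
    return total;
}

/* Convert a cycle count to microseconds for a given clock in MHz. */
double cycles_to_microseconds(unsigned long cycles, double clock_mhz)
{
    return (double)cycles / clock_mhz;
}
```

Run it once on the original listing and once on the trimmed one to estimate the saving before you measure. On processors that pipeline or overlap instructions, the simple sum is only a rough estimate.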
In the past I have gotten much more speed out of an operation by doing this.
After that, look to see if you can substitute one assembly instruction for several others. Sometimes the compiler defaults to generic instructions and does not take advantage of specialized ones.
Make sure you comment the code well.
Good Luck.
-Matt