That code looks like a dog's dinner. I can see no reason whatsoever for the fdiv / fabs operations; x*x is never negative for any real x anyway.
The inlined version further up is pretty much what I'd write. The only thing in the longer version that makes immediate sense is the test for the sum of the squares being zero and skipping the square root in that case. However, that's a pretty pointless optimisation in my mind. I know that x*x tends to zero quickly for small x, but in reality how often is it likely to occur that the sum is fully zero?
Incidentally, I thought returning a double in fp0 was legal under the existing ABI. What's with the d0/d1 return?
The compiler's default is that floating-point return values always go in d0 and d1. The reason is that software floating point can work that way.
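As a rough illustration of why two data registers are needed: a double is 64 bits, so it has to be split into two 32-bit halves. The sketch below assumes an IEEE 754 double and assumes d0 carries the high word and d1 the low word; the struct and function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the d0/d1 register pair.
   Assumption: d0 = high 32 bits, d1 = low 32 bits. */
struct reg_pair {
    uint32_t d0;
    uint32_t d1;
};

/* Split a double into the two 32-bit halves a register pair would hold. */
static struct reg_pair double_to_pair(double value)
{
    uint64_t bits;
    struct reg_pair r;

    memcpy(&bits, &value, sizeof bits);   /* reinterpret without aliasing issues */
    r.d0 = (uint32_t)(bits >> 32);
    r.d1 = (uint32_t)(bits & 0xFFFFFFFFu);
    return r;
}

/* Reassemble the double on the caller's side. */
static double pair_to_double(struct reg_pair r)
{
    uint64_t bits = ((uint64_t)r.d0 << 32) | r.d1;
    double value;

    memcpy(&value, &bits, sizeof value);
    return value;
}
```

No FPU instruction is involved in either direction, which is the point: a soft-float library can produce and consume the value purely with integer registers.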
The Linux 68k ELF GCC uses the FPU for return values. I can easily switch that on too (I got an answer on the GCC mailing list about what needs changing), but then you must be sure that all libs are FPU versions, and all libs need to be recompiled.
But when such a big step is taken and all libs have to be rebuilt anyway, it would be better to change the AOS GCC so that it uses the ELF format. That makes it easier to have the newest binutils.
LTO can also work with ELF.
On Aminet there is source code for an ELF loader patch for LoadSeg from PowerUP. I think this can easily be used for any CPU, because ELF segments contain only code and address offset corrections, which have nothing to do with the CPU code.
So a patch for LoadSeg can then handle the ELF format and execute it.
But I don't think that gives more speed in real-world apps. To see whether an optimisation is useful, it's best to look in a profiler at which functions cost the most time and optimise those.
Here I found the function for hypot in the GCC source, gcc/builtins.c:
static tree
fold_builtin_hypot (location_t loc, tree fndecl,
                    tree arg0, tree arg1, tree type)
{
  tree res, narg0, narg1;

  if (!validate_arg (arg0, REAL_TYPE)
      || !validate_arg (arg1, REAL_TYPE))
    return NULL_TREE;

  /* Calculate the result when the argument is a constant.  */
  if ((res = do_mpfr_arg2 (arg0, arg1, type, mpfr_hypot)))
    return res;

  /* If either argument to hypot has a negate or abs, strip that off.
     E.g. hypot(-x,fabs(y)) -> hypot(x,y).  */
  narg0 = fold_strip_sign_ops (arg0);
  narg1 = fold_strip_sign_ops (arg1);
  if (narg0 || narg1)
    {
      return build_call_expr_loc (loc, fndecl, 2, narg0 ? narg0 : arg0,
                                  narg1 ? narg1 : arg1);
    }

  /* If either argument is zero, hypot is fabs of the other.  */
  if (real_zerop (arg0))
    return fold_build1_loc (loc, ABS_EXPR, type, arg1);
  else if (real_zerop (arg1))
    return fold_build1_loc (loc, ABS_EXPR, type, arg0);

  /* hypot(x,x) -> fabs(x)*sqrt(2).  */
  if (flag_unsafe_math_optimizations
      && operand_equal_p (arg0, arg1, OEP_PURE_SAME))
    {
      const REAL_VALUE_TYPE sqrt2_trunc
        = real_value_truncate (TYPE_MODE (type), dconst_sqrt2 ());
      return fold_build2_loc (loc, MULT_EXPR, type,
                              fold_build1_loc (loc, ABS_EXPR, type, arg0),
                              build_real (type, sqrt2_trunc));
    }

  return NULL_TREE;
}