Topic: GCC 4.5.0 AmigaOS 68k Compiler, Cygwin Hosted (Read 7274 times)

bernd_afa · « **on:** May 11, 2010, 01:33:27 PM »

Here you can download GCC 4.5.0. It contains the C and C++ compilers.
http://amiga.sourceforge.net

The archive also contains docs and the additional backend sources, so you can build any GCC 68k version from the official GCC sources with only 1-2 minutes of typing/copying work.

The C++ includes and libs can be downloaded from another link.

Karlos · « **Reply #1 on:** May 12, 2010, 12:09:50 AM »

Nice

When you say it's "without C++", do you mean it doesn't include any precompiled standard C++ libraries for m68k, or something else?

Cosmos · « **Reply #2 on:** May 12, 2010, 04:45:11 AM »

Is "-mregparm" fixed and working in this new release?

bernd_afa · « **Reply #3 on:** May 12, 2010, 11:30:23 AM »

Quote from: Cosmos;557913

Is "-mregparm" fixed and working in this new release?

These versions are clean GCC builds from the official source. They don't contain the unofficial AOS hacks, which are buggy and so were never added to the GCC main source. If somebody wants to add these features, he can do that (which is of course a lot of work). I don't need them, so I don't want to spend time on that.

-mregparm is obsolete in newer compilers, because the short functions that would get a speedup from -mregparm are inlined by GCC at -O3, or can be declared as

__attribute__((always_inline)), which is supported since GCC 3.1.

GCC versions newer than 2.x can inline across a linked lib, and that is faster than using -mregparm.

To see which functions are worth optimizing to noticeably speed up a program, you can use gprof with ixemul V63.

bernd_afa · « **Reply #4 on:** May 12, 2010, 11:36:53 AM »

>When you say it's "without C++", do you mean it doesn't include any precompiled standard C++ libraries for m68k, or something else?

The archives are split into the compiler executables (which contain both C and C++) and the includes + libs.

The split was done because somebody else may want to compile GCC for another platform, so the compiler archive does not need to contain the includes and libs.

http://amiga.sourceforge.net/phps/logger.php?download=include-20090222.lha

By the way, the C++ libs/includes can be used without problems with compilers from 3.4.0 up to the newest.

Piru · « **Reply #5 on:** May 12, 2010, 11:57:20 AM »

Quote from: bernd_afa;557946

GCC versions newer than 2.x can inline across a linked lib.

Could you elaborate a bit? GCC 3 certainly didn't do LTO, if that's what you mean. As far as I can tell it was only added in 4.5.

Piru · « **Reply #6 on:** May 12, 2010, 12:14:34 PM »

Quote from: bernd_afa;557946

These versions are clean GCC builds from the official source. They don't contain the unofficial AOS hacks, which are buggy and so were never added to the GCC main source.

I'm not so sure that was the reason they weren't included. When GCC went through massive changes, there just wasn't anyone around to port them, and they weren't seen as important enough to keep around.

In fact, now that I think about it: Were they ever even included in the main gcc tree? I remember gcc "amiga" patchsets from long long ago.

bernd_afa · « **Reply #7 on:** May 12, 2010, 02:27:29 PM »

Quote from: Piru;557951

Could you elaborate a bit? GCC 3 certainly didn't do LTO, if that's what you mean. As far as I can tell it was only added in 4.5.

LTO is optimizing over the whole code.
The compiler's optimizer sees the whole code of main and lib and can optimize across both. I have not tested whether LTO works on 4.5; maybe it needs a special linker.
It is hard to find a real-world example that optimizes better through this.

Maybe for PPC CPUs, with their many registers and worse out-of-order execution, it gives more of a speedup.

But what has worked since GCC 3 is inlining a lib function, so for short functions you need no calls and so need not pass the parameters on the stack or in registers. -mregparm is then obsolete and gives no speedup.

I did tests with the ixemul libc.a some time ago.

The source code in libc.a is this:

__attribute__((always_inline)) inline double hypot(double x, double y)
{
    return sqrt(x*x + y*y);
}

This is the main source; it creates an ixemul program (because it only links with -ldebug):

#include <math.h>
main(int argv, int argc)
{
   double value;
   value = hypot(argc, argv);
   kprintf("%f\n", value);
}

I compiled with -O1.

When I load the compiled exe in an asm debugger it looks like this. You can see that no call is made and everything is inlined.
_main
+0000(10E33D22): LINK A5,#-$48
+0004(10E33D26): MOVEM.L D2-D3/A6,-(A7)
+0008(10E33D2A): BSR.L ___main ;10E340
+000E(10E33D30): FMOVE.L $C(A5),FP0
+0014(10E33D36): FMOVE.L 8(A5),FP1
+001A(10E33D3C): FMUL.X FP0,FP0
+001E(10E33D40): FMUL.X FP1,FP1
+0022(10E33D44): FADD.X FP1,FP0
+0026(10E33D48): FSQRT.X FP0
+002A(10E33D4C): FMOVE.D FP0,-(A7)
+002E(10E33D50): PEA _moncontrol+4(PC) ;10E33C84
+0032(10E33D54): BSR.L _kprintf ;10E33F74

bernd_afa · « **Reply #8 on:** May 12, 2010, 02:49:33 PM »

Quote from: Piru;557955

I'm not so sure that was the reason they weren't included. When GCC went through massive changes, there just wasn't anyone around to port them, and they weren't seen as important enough to keep around.

In fact, now that I think about it: Were they ever even included in the main gcc tree? I remember gcc "amiga" patchsets from long long ago.

No, they were never added; they are not in the 68k tree.
The register-to-variable feature on AOS is only a hack to stay more compatible with SAS/C. It does not work in C++ compile mode.

GCC has its own way to bind a register to a variable; only the programs need to change. An SDI header that handles this for MUI programs is also in the includes.

Other programs can be changed easily; StormMesa for GCC also uses the official GCC register-to-variable feature.
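
For reference, the official GCC mechanism is presumably the "explicit register variable" extension. A minimal sketch, assuming GCC: the macro `IN_REG` and the function `add_in_d0` are hypothetical names for this example (they are not taken from the SDI headers), and on non-68k targets the register binding is simply dropped.

```c
/* Bind a local variable to a named register using GCC's explicit
   register variable syntax. IN_REG is a hypothetical helper macro;
   on targets other than 68k it expands to nothing. */
#if defined(__GNUC__) && defined(__m68k__)
#define IN_REG(name) __asm__(name)
#else
#define IN_REG(name)
#endif

long add_in_d0(long a, long b)
{
    /* On 68k GCC this asks the compiler to keep sum in data register d0. */
    register long sum IN_REG("d0") = a + b;
    return sum;
}
```

The GCC manual describes local register variables mainly as a way to hand values to inline asm in a specific register, so this is a sketch of the syntax rather than a drop-in replacement for SAS/C style register parameters.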

And the regparm thing doesn't work well; I never got a program running with it.

Maybe this indicates that regparm is not needed because it gives no measurable speedup. As far as I know, x86 is the only target that officially has the regparm switch.

Curiously, x86 has few registers; regparm would maybe suit RISC better. Or does PPC always use regparm now?
I only remember the WarpUp and PowerUp days with GCC 2.xx. There I saw the parameters being put on the stack.

Piru · « **Reply #9 on:** May 12, 2010, 04:44:38 PM »

Quote from: bernd_afa;557990

LTO is optimizing over the whole code.

I know what LTO does. It wasn't included in gcc3.

Quote

I have not tested whether LTO works on 4.5; maybe it needs a special linker.

According to documentation you don't.

Quote

But what has worked since GCC 3 is inlining a lib function, so for short functions you need no calls and so need not pass the parameters on the stack or in registers.

I did tests with the ixemul libc.a some time ago.

The source code in libc.a is this:

__attribute__((always_inline)) inline double hypot(double x, double y)
{
    return sqrt(x*x + y*y);
}

This is the main source; it creates an ixemul program (because it only links with -ldebug):

#include <math.h>
main(int argv, int argc)
{
   double value;
   value = hypot(argc, argv);
   kprintf("%f\n", value);
}

I compiled with -O1.

When I load the compiled exe in an asm debugger it looks like this. You can see that no call is made and everything is inlined.
_main
+0000(10E33D22): LINK A5,#-$48
+0004(10E33D26): MOVEM.L D2-D3/A6,-(A7)
+0008(10E33D2A): BSR.L ___main ;10E340
+000E(10E33D30): FMOVE.L $C(A5),FP0
+0014(10E33D36): FMOVE.L 8(A5),FP1
+001A(10E33D3C): FMUL.X FP0,FP0
+001E(10E33D40): FMUL.X FP1,FP1
+0022(10E33D44): FADD.X FP1,FP0
+0026(10E33D48): FSQRT.X FP0
+002A(10E33D4C): FMOVE.D FP0,-(A7)
+002E(10E33D50): PEA _moncontrol+4(PC) ;10E33C84
+0032(10E33D54): BSR.L _kprintf ;10E33F74

How does that work exactly? If you have a function call to external function (and you don't use LTO), how can it replace the jump to the subroutine with other code?

If you mean you have to introduce all the inline functions in the headers then this is absolutely nothing new over gcc 2.x.

Piru · « **Reply #10 on:** May 12, 2010, 04:54:45 PM »

Quote from: bernd_afa;557997

Curiously, x86 has few registers; regparm would maybe suit RISC better. Or does PPC always use regparm now?
I only remember the WarpUp and PowerUp days with GCC 2.xx. There I saw the parameters being put on the stack.

IIRC both use System V ABI, although WarpUP breaks it by having the applications reuse r2. There's a fixed system on how the parameters are passed. See 3-18 Parameter Passing.

bernd_afa · « **Reply #11 on:** May 13, 2010, 11:39:56 AM »

Quote from: Piru;558016

I know what LTO does. It wasn't included in gcc3.

According to documentation you don't.

How does that work exactly? If you have a function call to external function (and you don't use LTO), how can it replace the jump to the subroutine with other code?

If you mean you have to introduce all the inline functions in the headers then this is absolutely nothing new over gcc 2.x.

I have now looked at the -S output GCC produces and, oops, I see why it works: math.h includes math-68881.h, which contains this function:

hypot (_CONST double x, _CONST double y)
{
    return sqrt (x*x + y*y);
}

So I need to play with the new LTO feature and test the example when I compile with -m68060.

I also see that when I compile with -m68060 the libc function is not used; it seems a built-in GCC function is used instead. The asm code looks very different from the listing above and contains many more instructions and a bad division. Maybe the old hypot code of ixemul is buggy, or this GCC code is badly written and slow.

Maybe somebody with good knowledge of C math functions can say more.

+0000(10FEF1A6): MOVE.L A5,-(A7)
+0002(10FEF1A8): MOVEA.L A7,A5
+0004(10FEF1AA): FMOVEM.X FP2-FP3,-(A7)
+0008(10FEF1AE): FDMOVE.D 8(A5),FP3
+000E(10FEF1B4): FDMOVE.D $10(A5),FP2
+0014(10FEF1BA): FDABS.X FP3,FP1
+0018(10FEF1BE): FDABS.X FP2,FP0
+001C(10FEF1C2): FDADD.X FP0,FP1
+0020(10FEF1C6): FBNE _hypot+$34 ;10FEF1DA
+0024(10FEF1CA): FMOVE.D FP1,-(A7)
+0028(10FEF1CE): MOVE.L (A7)+,D0
+002A(10FEF1D0): MOVE.L (A7)+,D1
+002C(10FEF1D2): FMOVEM.X (A7)+,FP2-FP3
+0030(10FEF1D6): UNLK A5
+0032(10FEF1D8): RTS
+0034(10FEF1DA): FDDIV.X FP1,FP3
+0038(10FEF1DE): FDDIV.X FP1,FP2
+003C(10FEF1E2): FDMUL.X FP3,FP3
+0040(10FEF1E6): FDMUL.X FP2,FP2
+0044(10FEF1EA): FDADD.X FP2,FP3
+0048(10FEF1EE): FDSQRT.X FP3,FP0
+004C(10FEF1F2): FDMUL.X FP0,FP1
+0050(10FEF1F6): FMOVE.D FP1,-(A7)
+0054(10FEF1FA): MOVE.L (A7)+,D0
+0056(10FEF1FC): MOVE.L (A7)+,D1
+0058(10FEF1FE): FMOVEM.X (A7)+,FP2-FP3
+005C(10FEF202): UNLK A5
+005E(10FEF204): RTS

Karlos · « **Reply #12 on:** May 13, 2010, 10:23:12 PM »

That code looks like a dog's dinner, I can see no reason whatsoever for the fdiv / fabs operations. x*x is always positive for all real x anyway.

The inlined version further up is pretty much what I'd write. The only thing in the longer version that makes immediate sense is the test for the sum of the squares being zero and skipping the square root in that case. However, that's a pretty pointless optimisation in my mind. I know that x*x tends to zero quickly for small x, but in reality how often is it likely to occur that the sum is fully zero?

Incidentally, I thought returning a double in fp0 was legal under the existing ABI. What's with the d0/d1 return?

bernd_afa · « **Reply #13 on:** May 14, 2010, 03:07:51 PM »

Quote from: Karlos;558354

That code looks like a dog's dinner, I can see no reason whatsoever for the fdiv / fabs operations. x*x is always positive for all real x anyway.

The inlined version further up is pretty much what I'd write. The only thing in the longer version that makes immediate sense is the test for the sum of the squares being zero and skipping the square root in that case. However, that's a pretty pointless optimisation in my mind. I know that x*x tends to zero quickly for small x, but in reality how often is it likely to occur that the sum is fully zero?

Incidentally, I thought returning a double in fp0 was legal under the existing ABI. What's with the d0/d1 return?

By default the compiler always returns floating-point values in d0 and d1. The reason is that software float can work that way.
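
The d0/d1 return matches the end of the earlier listing (FMOVE.D FP1,-(A7) followed by two MOVE.L (A7)+ into D0 and D1). As a rough model in C, assuming a 64-bit IEEE double; `struct reg_pair` and `double_to_regs` are hypothetical names for this illustration:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the two 32-bit registers that carry a double. */
struct reg_pair {
    uint32_t d0;   /* high word: sign, exponent, top of the mantissa */
    uint32_t d1;   /* low word: rest of the mantissa */
};

struct reg_pair double_to_regs(double value)
{
    uint64_t bits;
    struct reg_pair r;

    /* Copy the bit pattern; memcpy avoids aliasing problems. */
    memcpy(&bits, &value, sizeof bits);
    r.d0 = (uint32_t)(bits >> 32);
    r.d1 = (uint32_t)(bits & 0xFFFFFFFFu);
    return r;
}
```

For example, 1.0 has the bit pattern 0x3FF0000000000000, so d0 would hold 0x3FF00000 and d1 would hold 0. Because only integer registers are involved, the same convention works whether the value was computed by an FPU or by software float routines.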

The Linux 68k ELF GCC uses the FPU for return values. I could easily switch that on too (I got an answer on the GCC ML about what needs to change), but then it must be certain that all libs are FPU versions, and all libs need recompiling.

But when such a big step is taken and all libs need to be rebuilt, it is better to change AOS GCC to use the ELF format, so that it is easier to have the newest binutils.
LTO can also work with ELF.

On Aminet there is source code for an ELF loader patch for LoadSeg from PowerUP. I think this can easily be used for any CPU, because ELF segments contain only code and address-offset corrections, which have nothing to do with the CPU's code.

So a patch for LoadSeg can then handle the ELF format and execute it.

But I don't think that gives more speed in real-world apps. To see whether an optimization is useful, it is best to look at a profiler to see which functions cost the most time and optimize those.

Here is the hypot function I found in the GCC source, gcc/builtins.c:

static tree
fold_builtin_hypot (location_t loc, tree fndecl,
                    tree arg0, tree arg1, tree type)
{
  tree res, narg0, narg1;

  if (!validate_arg (arg0, REAL_TYPE)
      || !validate_arg (arg1, REAL_TYPE))
    return NULL_TREE;

  /* Calculate the result when the argument is a constant.  */
  if ((res = do_mpfr_arg2 (arg0, arg1, type, mpfr_hypot)))
    return res;

  /* If either argument to hypot has a negate or abs, strip that off.
     E.g. hypot(-x,fabs(y)) -> hypot(x,y).  */
  narg0 = fold_strip_sign_ops (arg0);
  narg1 = fold_strip_sign_ops (arg1);
  if (narg0 || narg1)
    {
      return build_call_expr_loc (loc, fndecl, 2, narg0 ? narg0 : arg0,
                                  narg1 ? narg1 : arg1);
    }

  /* If either argument is zero, hypot is fabs of the other.  */
  if (real_zerop (arg0))
    return fold_build1_loc (loc, ABS_EXPR, type, arg1);
  else if (real_zerop (arg1))
    return fold_build1_loc (loc, ABS_EXPR, type, arg0);

  /* hypot(x,x) -> fabs(x)*sqrt(2).  */
  if (flag_unsafe_math_optimizations
      && operand_equal_p (arg0, arg1, OEP_PURE_SAME))
    {
      const REAL_VALUE_TYPE sqrt2_trunc
        = real_value_truncate (TYPE_MODE (type), dconst_sqrt2 ());
      return fold_build2_loc (loc, MULT_EXPR, type,
                              fold_build1_loc (loc, ABS_EXPR, type, arg0),
                              build_real (type, sqrt2_trunc));
    }

  return NULL_TREE;
}

Piru · « **Reply #14 on:** May 14, 2010, 03:37:15 PM »

Quote from: bernd_afa;558498

By default the compiler always returns floating-point values in d0 and d1. The reason is that software float can work that way.

The Linux 68k ELF GCC uses the FPU for return values. I could easily switch that on too (I got an answer on the GCC ML about what needs to change), but then it must be certain that all libs are FPU versions, and all libs need recompiling.

But when such a big step is taken and all libs need to be rebuilt, it is better to change AOS GCC to use the ELF format, so that it is easier to have the newest binutils.
LTO can also work with ELF.

Well, that makes absolutely no sense whatsoever.

Care to explain in simple terms why the binaries would need to be ELF all of a sudden?
