amiga.org
     
All times are GMT -6.

68k AGA AROS + UAE => winner! (forum: Amiga Software Issues and Discussion)
04-20-2004, 02:43 PM   #106
bloodline
Master Sock Abuser
Points: 38,946, Level: 100
Activity: 28%
Join Date: Mar 2002
Location: London, UK
Posts: 11,890
Blog Entries: 3
Re: 68k AGA AROS + UAE => winner!

Quote:
I have set up WinUAE now with AIAB, its absolutely fantastic and its
the identical ROM of my A1200 so its the real AmigaOS,

so the XP machine is like a huge graphics card accelerator for my ROM,

It has the exact same quality as the Windows XP environment,
as its basically an XP app, with emulated Picasso screen,
the speed tests were absolutely staggering: it filled an entire screen
with characters instantly, it was like rain in a thunderstorm,
it said equivalent to 1662 MHz 020 and 5234 MHz FPU,
100 x speedup of FPU!, 30 x speed up of CPU (I think)
361 MIPS, 673 MFlops, I havent run Sysinfo yet,
Sounds like you made the right hardware choice for your needs :-)

Quote:
AROS should be much faster still, they say power is an aphrodisiac,
I think once any developer starts coding on AROS they are going to become
totally hooked,
Exactly! :-D
__________________
My iPhone Game: Puny Humans -
http://itunes.apple.com/gb/app/puny-...362230281?mt=8
04-20-2004, 03:44 PM   #107
whoosh777
Too much caffeine
Points: 5,031, Level: 45
Activity: 0%
Join Date: Jun 2003
Posts: 114
Re: 68k AGA AROS + UAE => winner!


WinUAE vs A1200 Sysinfo speed comparison,

Windows XP, CPU = 2.600 GHz Intel Celeron,
256 MB RAM, 64MB shared memory Intel graphics card

WinUAE via AIAB Picasso emulation,
AGA A1200: 50MHz 68030 + MMU, 50MHz 68882 FPU,

CPU: 972 MIPS vs 9.01 MIPS : 108 x faster
FPU: 331 MFlops vs 1.33 MFlops : 249 x faster
Dhrystones: 931824 vs 8632 : 108 x ratio,

so CPU is some 100 x as fast, and FPU is 250 x as fast,

there you have it, certified FUD free,

AIAB comes with a different speed program Sysspeed,
that said: 361 MIPS, 673 MFlops, but its not fair
to compare Sysspeed on WinUAE with Sysinfo on A1200,
so I will try a Sysspeed comparison also,


You can tell WinUAE to treat a particular XP directory as
a hard disk, floppies are via .adf files which you can
create via my trackdisk program earlier, set
device to be trackdisk.device

Once I've got myself more organised I'll do my own
speed comparison of AROS vs WinUAE, but not via
Sysinfo etc but via a "real" program of some sort,

test programs are a bit artificial,
I will try and compare progs that run entirely in RAM
to remove disk speed from the comparison,



04-20-2004, 04:00 PM   #108
Piru
Banned
Points: 30,457, Level: 100
Activity: 69%
Join Date: Aug 2002
Location: Helsinki, Finland
Posts: 6,946
Re: 68k AGA AROS + UAE => winner!

@whoosh777

Quote:
Quote:
No. It will just crash due to your program writing to zeropage.
this problem is easily fixed by:

1. moving the vbr away from the start of memory, as in fact appears to happen if you run "cpu fastrom", Sysinfo shows where the VBR is,

2. the OS should then write protect the first page of memory after the pointer to ExecBase has been set up in position 4,
...except that device read access can use DMA to write to memory (KS 1.x trackdisk.device always uses blitter to decode, KS 2.x+ trackdisk.device doesn't since !(TypeOfMem(0) & MEMF_CHIP), dunno about mfm.device). If DMA access is used, it is not captured by MMU. Reading to address 0 could happily overwrite whatever data is at address 0 and forward, regardless of MMU. However, if the zeropage is mapped to fastmem, the fastmem copy will not be trashed (so ExecBase ptr would remain valid). Anything in low chipmem would be trashed however (readlen > chipmem start address), including the chipmem MemHeader, leading to swift crash (next memory allocation/deallocation).

Quote:
Now when my prog tries to read from disk or file to position 0 the MMU will intercept this and a "first page write violation" requester should come up, "click to remove task",
Might not happen if the read access uses DMA (see above).

Even if the app crashes, there is no way to "remove task" safely, since there is no resource tracking or separated address spaces. When task/process crashes under AmigaOS the safest thing you can do is make it Wait(0) (suspend).

Now, it might be just my twisted personality, but IMO it would make much more sense to check for allocation failure rather than worry about all this.
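
A minimal sketch of that check in portable C, assuming malloc() in place of AllocMem() and made-up helpers (read_track(), fake_read(), always_fail()); it is not the trackdisk program discussed here. The point is only that the failure path returns before anything is written through the pointer:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Allocator signature, so a failing allocator can be swapped in for testing. */
typedef void *(*alloc_fn)(size_t);

/* Hypothetical stand-in for a disk read: fills the buffer with a pattern. */
static void fake_read(unsigned char *buf, size_t len)
{
    memset(buf, 0xA5, len);
}

/*
 * Allocate a buffer and "read" into it.
 * Returns the buffer on success, or NULL if the allocation failed;
 * in the failure case nothing is ever written through the pointer.
 */
unsigned char *read_track(alloc_fn alloc, size_t len)
{
    unsigned char *buf = alloc(len);

    if (buf == NULL) {
        fprintf(stderr, "allocation of %lu bytes failed\n", (unsigned long)len);
        return NULL;            /* bail out before touching address 0 */
    }
    fake_read(buf, len);
    return buf;
}

/* Allocator that always fails, to exercise the error path. */
void *always_fail(size_t len)
{
    (void)len;
    return NULL;
}
```

Calling read_track(malloc, 16) returns a filled buffer that the caller frees; calling read_track(always_fail, 16) returns NULL without writing anywhere.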

Quote:
printf is dos.library IO:
SAS C 650 and 68k gcc *both* implement printf() via dos.library Write(),
True. dos.library VPrintf() calls VFPrintf() for Output() filehandle. VFPrintf() calls RawDoFmt() with FPutC() putchproc that will eventually use Write() to write the chars. How much is written at a time depends on the filehandle buffering mode (BUF_LINE == Write() is used when newline is reached, BUF_FULL == Write() is used when buffer fills up or BUF_NONE == Write() is used for every char). See dos/SetVBuf().
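
To illustrate how the buffering mode changes the number of low-level writes, here is a small simulation in portable C. It is only a sketch: the names (MODE_NONE, MODE_LINE, MODE_FULL, struct writer, count_writes) are invented for the example, the 8-byte buffer is deliberately tiny, and this is not how dos.library is actually implemented.

```c
#include <stddef.h>

/* Modelled loosely on BUF_NONE / BUF_LINE / BUF_FULL; names are made up. */
enum buf_mode { MODE_NONE, MODE_LINE, MODE_FULL };

#define BUF_SIZE 8   /* deliberately tiny so full-buffer flushes are visible */

struct writer {
    enum buf_mode mode;
    char buf[BUF_SIZE];
    size_t used;
    unsigned writes;   /* number of simulated low-level Write() calls */
};

void writer_init(struct writer *w, enum buf_mode mode)
{
    w->mode = mode;
    w->used = 0;
    w->writes = 0;
}

/* Stand-in for the expensive call down into the filesystem. */
void writer_flush(struct writer *w)
{
    if (w->used > 0) {
        w->writes++;
        w->used = 0;
    }
}

/* Buffer one character, flushing according to the mode. */
void writer_putc(struct writer *w, char c)
{
    w->buf[w->used++] = c;
    if (w->mode == MODE_NONE ||
        (w->mode == MODE_LINE && c == '\n') ||
        w->used == BUF_SIZE)
        writer_flush(w);
}

/* Write a whole string, flush the remainder, return the Write() count. */
unsigned count_writes(enum buf_mode mode, const char *s)
{
    struct writer w;

    writer_init(&w, mode);
    while (*s)
        writer_putc(&w, *s++);
    writer_flush(&w);
    return w.writes;
}
```

For the six-character string "ab\ncd\n" this model makes six writes unbuffered, two when line-buffered and one when fully buffered.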

Quote:
this is how I would implement PutStr() :

int PutStr( UBYTE *str )
{
    int len ;

    len = strlen( str ) ;
    if( len==Write( Output() , str , len ) )return( 0 ) ;
    else return( -1 ) ;
}

ooh that was difficult
But it also lacks all local buffering. It doesn't handle BUF_NONE and BUF_LINE properly. With such methods all buffering would be left to the lower levels (filesystem and device driver), through several APIs. The "closer" the buffering is to the caller, the faster it is. Also, when the buffering is local it has much better knowledge of the actual buffer usage, so further optimizations are possible that would not be possible at a lower level.

Quote:
my prog tells me they havent used any of Write(), FPuts(), VFPrintf(),
Your snoop program probably only tells you if something calls dos.library thru vectors. It doesn't detect dos.library calling itself directly via bsr or jsr. Also it misses direct DosPacket I/O to filehandler (for example ixemul uses it).
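
A toy model of why a vector patch misses internal calls, in portable C. Everything here is hypothetical (struct toy_lib, snoop_write, app_puts, lib_internal_puts); it only imitates the idea of a library jump table patched in the spirit of SetFunction(), not the real exec or dos.library mechanics.

```c
/* A toy "library": a table of function pointers (the jump vectors). */
struct toy_lib {
    int (*write_fn)(const char *s);
};

static int calls_seen;          /* what the snoop patch counts */

/* The real implementation behind the vector: returns the string length. */
static int real_write(const char *s)
{
    int n = 0;

    while (s[n] != '\0')
        n++;
    return n;
}

static struct toy_lib lib = { real_write };

/* Snoop wrapper installed over the vector. */
static int snoop_write(const char *s)
{
    calls_seen++;
    return real_write(s);
}

void install_snoop(void)
{
    calls_seen = 0;
    lib.write_fn = snoop_write;
}

/* External caller: goes through the vector, so the snoop sees it. */
int app_puts(const char *s)
{
    return lib.write_fn(s);
}

/* Internal caller: jumps straight to the implementation (like a direct
   bsr), so the patched vector is never consulted. */
int lib_internal_puts(const char *s)
{
    return real_write(s);
}

int snoop_count(void)
{
    return calls_seen;
}
```

After install_snoop(), a call to app_puts() raises the count, while a call to lib_internal_puts() does the same work without the snoop ever noticing.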

dos.library, the SAS/C libc, GCC ixemul and GCC libnix libc all use similar buffering methods: fgetc for reading data and fputc for writing.
04-20-2004, 05:44 PM   #109
Hammer
VIP / Donor
Points: 11,529, Level: 70
Activity: 20%
Join Date: Mar 2002
Location: NSW, Oz
Posts: 1,992
Re: 68k AGA AROS + UAE => winner!

Quote:
Embarrassingly for the senior IT technician who ordered the changes, the newer machines ran the existing x86 version of the software not only slower than the Alpha, they ran it at a speed that was only marginally greater than running the x86 version under FX32 on the alpha.
"Figure 1 shows the relative performance on the Byte Benchmark of a 200 MHz Pentium Pro and a 500 MHz Alpha running DIGITAL FX!32. For this benchmark, the Alpha running DIGITAL FX!32 provides about the same performance as a 200 MHz Pentium Pro" (1)

References
1. http://www.usenix.org/publications/library/proceedings/usenix-nt97/full_papers/chernoff/chernoff.pdf
2. http://www.hotchips.org/archive/hc9/hc9pres_pdf/hc97_4b_rubin_1up.pdf
04-20-2004, 06:06 PM   #110
Karlos
Sockologist
Points: 50,135, Level: 100
Activity: 10%
Join Date: Nov 2002
Location: Barishabaad, Sardistan
Posts: 16,646
Blog Entries: 18
Re: 68k AGA AROS + UAE => winner!

@Hammer

I have no idea about the Bytemark benchmarks, but they aren't relevant to the point I was making. All I can tell you is the software in question was for spectra prediction and molecular energy calculations. There were several different platform builds available at the time, but it was stated that future versions would be x86 only. The existing x86/NT version ran very poorly on the real Pentium-II compared to the same revision for Alpha, which managed to run the same x86 version at around 80% of the speed at which the Pentium-II did. The Alpha native version on Alpha compared to the x86 version on Pentium-II was about 2x faster. That's all I can tell you.

It seems to me in hindsight that the x86 build of that version of the application must have been very poor.

Hell, I didn't write the stupid thing, I just had to use it.
__________________
OCA
This isn't SCSI... This is SATA!!!
I have CDO. It's like OCD except all the letters are in ascending order. The way they should be.
Core2 Quad Q9450 2.66GHz / X48T / 4GB DDR3 / nVidia GTX275 / Linux x64, AROS, Win64
A1XE 800MHz / 512MB / Radeon 9200 / OS4.1
A1200T BPPC 240MHz / 256MB / Permedia 2 / OS 3.1 - OS3.9, OS4
A1200T Apollo 1240 28MHz / 32MB / Mediator1200 / Voodoo 3000 / OS3.9
A1200D Apollo 1240 25MHz (ejector seat ROM edition) / 32MB
04-20-2004, 06:27 PM   #111
Hammer
VIP / Donor
Points: 11,529, Level: 70
Activity: 20%
Join Date: Mar 2002
Location: NSW, Oz
Posts: 1,992
Re: 68k AGA AROS + UAE => winner!

@Karlos
Note that there are two versions of Alpha @~266Mhz

1. Model 21066 (2 issue superpipeline, EV4 core)
2. Model 21164 (4 issue superpipeline**, on chip ~96KB L2 cache, EV5 core). **2 integer pipelines, 2 floating-point pipelines.

The Model 21064 was released around March 1992 and sports a 128-bit bus interface.

The first Pentium II variant to receive an on-chip L2 cache was the Celeron 300A (Mendocino).

Other related references (x86 compiler improvements)
http://www.aceshardware.com/Spades/read.php?article_id=40000191
04-20-2004, 07:02 PM   #112
Karlos
Sockologist
Points: 50,135, Level: 100
Activity: 10%
Join Date: Nov 2002
Location: Barishabaad, Sardistan
Posts: 16,646
Blog Entries: 18
Re: 68k AGA AROS + UAE => winner!

@Hammer

Interesting. The Alpha machine was there in 1995 and wasn't new then. I'm reasonably sure it was a 21164. When was it released?
04-20-2004, 07:39 PM   #113
Hammer
VIP / Donor
Points: 11,529, Level: 70
Activity: 20%
Join Date: Mar 2002
Location: NSW, Oz
Posts: 1,992
Re: 68k AGA AROS + UAE => winner!

@Karlos

"FMUL and FDIV that are still 'not pipelined' in the Pentium III. "(3).

Reference.
3. http://www.heise.de/ct/english/99/16/092/

Quote:
Interesting. The Alpha machine was there in 1995 and wasn't new then. I'm reasonably sure it was a 21164. When was it released?
Around 1994 for the Alpha Model 21164 (the industry's first 300 MHz processor)(4). The ~266 MHz variant could be a slightly cheaper version.

Reference
4. http://vt100.net/timeline/1994-3.html
04-21-2004, 08:04 AM   #114
whoosh777
Too much caffeine
Points: 5,031, Level: 45
Activity: 0%
Join Date: Jun 2003
Posts: 114
Re: 68k AGA AROS + UAE => winner!

@bloodline

if I have understood you right AROS understands IDE
and AmigaOS so I think I will set up an IDE AmigaOS
drive for AROS, I have to think a bit how to go about
this.

I installed my first hardware yesterday: I opened up
the box and installed the Fast SCSI-2 PCI interface,

this was totally painless, the box internally is
very well engineered, lots of space, slots wherever
you look: 3 PCI slots with one used by the modem,
PCI modem sounds a bit extravagant, so I have 1 PCI
slot free now,

I suppose an unused slot is a wasted slot in some sense,

I noticed a similar but different slot which must be
the AGP slot for Gfx card??

Also 2 Simm slots with one in use, which must be the
DDR RAM, 256 MB currently,

There is a free IDE-drive place which I may use for
AROS or I will reuse the current drive: but I would
have to transfer its contents first: where to???

there are also 1 unused floppy + 1 unused optical drive slots.

I figured out most of it just by looking eg the existing
PCI slot puzzled me, is that a drive? until I noticed that the phone socket was part of it, aha! its a modem!

its great how the SCSI socket is on the back of the case,
so conveniently thought out,

I have already downloaded my first Spyware/adware,
it seems I should have read the EULA, Norton located
the 3 problem files,

The security risk on this machine worries me a bit,
this is why I want to use external drives:
not just to prevent viruses but to prevent MS spying on
my system. This system is so complicated that they
could do anything they want and you wouldnt be any
the wiser. The spyware info says that spyware already
can raid your system for passwords, address books etc

So I want multiple unconnected environments:
drives X,Y,Z,... where 2 of these are never simultaneously
on and communication is done via ascii only floppy disks
or something like that. I can vet the floppies via my
A1200,

I want my own programming work never to be on an online
machine,

I have to say that buying this PC is the best decision
I have made in the last 10 years, I feel so excited about
it, I have no regrets and am glad I made the break with
Hyperion + Eyetech. The word for me which sums up
those 2 companies is "contempt".

The PC market is so dynamic that its frightening, there is something very liberating about
this machine. Also I want to remain with the Amiga,
when I switch over to my A1200 I can move so much
faster and painlessly. There is something very
direct about the Amiga paradigm, you have immediate
access to everything, on the PC you have to go through
a maze to access anything.

Anyway I cannot wait to find out directly how fast
AROS is, ie how much faster than WinUAE,
if its just 2 x as fast that would make it 216 x as
fast as my A1200,

each 1% speed up would be another A1200!

at that point I will then be able to quantify the
decision of little endian gcc, big endian gcc
I can also evaluate via Amithlon's big endian gcc,
but thats further down the line,

Once I am set up with gcc I will start on this,
I think I have to sort out the AROS drive first,
because at the moment when I boot up AROS I just
get the CD's icon + RAM icon, so all my work
will evaporate after the session!

:so I can only try out progs that I write in the session,

whats the best way to get files to and from AROS?
because it looks tricky currently,


the original idea of this thread I think is fully
valid, because WinUAE is no use for non Amiga people:
they would have to buy Cloanto to use it:

an AROS kickstart ROM would enable everyone to use
WinUAE for free as well as publicise your project,
furthermore they would be able to use the full
Windows XP through UAE. So people could use
scsi + internet until you have the native
versions up and running (I think),

you mentioned about UAE's GUI, WinUAE's GUI is a
total pain, it has 3 rows of buttons and everytime
you click a button all the other buttons rearrange,
so its very difficult to be certain you've gone
through all the options, its like one of those
sliding square puzzles,



04-21-2004, 06:40 PM   #115
whoosh777
Too much caffeine
Points: 5,031, Level: 45
Activity: 0%
Join Date: Jun 2003
Posts: 114
Re: 68k AGA AROS + UAE => winner!

@Piru,

I dont want to get too bogged down in the discussion of trackdisk.c,

its just a small utility,

you insert disk, type shell command, wait a little while,
binary ready,

its not a big issue, I am worried that you make such heavy weather
over such a non issue,

but on this occasion I'll look at some of the points you raise,

>this problem is easily fixed by:

>1. moving the vbr away from the start of memory, as in fact appears to happen if you run "cpu fastrom", Sysinfo shows where the VBR is,

>2. the OS should then write protect the first page of memory after the pointer to ExecBase has been set up in position 4,



>>...except that device read access can use DMA to write to memory
>>(KS 1.x trackdisk.device always uses blitter to decode,
>>KS 2.x+ trackdisk.device doesn't since !(TypeOfMem(0) & MEMF_CHIP),
>>dunno about mfm.device). If DMA access is used, it is not captured
>>by MMU.

ok that bypasses 2. of my idea but only for OS1,
1. I think is sound except for 68000 which doesnt have a VBR,

so the idea is sound on a 68030 MMU machine, 1. is sound on 68020 upwards,
and 2. is sound on MMU machines OS2 upwards,

I think we see here several design flaws: putting the vectors right at start
of memory: the whole situation is ridiculous from a logical POV:

AllocMem() returns 0 on failure,
but 0 *is* a valid memory address,

then putting the highest system data here,

we have here a recipe for disaster: memory allocation failure
returning a valid memory pointer to the most important system data,

guess what the cat-in-the-hat will do here??

really memory allocation failure should return a pointer to non memory which
would cause an exception on *all* CPUs,

preferably a pointer to the middle of a large non memory
region,

-1 may also be a clever way to do it because firstly it is probably beyond
many address spaces, even if it is in the address space it would often cause a
memory alignment exception, non byte accesses would cause exceptions,

I havent used the blitter for a long time but what happens if you give
the blitter an odd address, do you get an exception or just a horrible crash??

lastly not sure about this but it may cause an
exception for wrapping around memory ie after 11111....111111 comes 000...000000
(only if you write from the start),


>>Reading to address 0 could happily overwrite whatever data is at
>>address 0 and forward, regardless of MMU. However, if the zeropage is
>>mapped to fastmem, the fastmem copy will not be trashed
>>(so ExecBase ptr would remain valid). Anything in low chipmem
>>would be trashed however (readlen > chipmem start address),
>>including the chipmem MemHeader, leading to swift crash
>>(next memory allocation/deallocation).

they should have put the memory headers at the end of the blocks,

:memory writes tend to move upwards through memory which makes the
start of memory regions vulnerable to trashing,

as I said I can turn all the criticisms of my code around into
criticisms of the system design, both h/w and OS,

I think we also see here one of the flaws of in-place memory datastructures,

(ie where the memory regions are self describing data structures),

it may be better to put memory datastructures away from the memory itself
and in write protection,

if its in place put it at the top of the regions,

with exec's struct Library's the jump vectors precede the Library which
is a clever idea,


IMO a well designed system is idiot proof,

dont blame the user blame the designer,


In the Modula3 language you can write code that never checks for errors,
errors always cause lightweight language level exceptions so it produces very
clean code eg instead of

[code]
if( y==0 ){ printf("error in y"); exit(20); }
if( z & ~(MEMF_CLEAR | MEMF_FAST | .... ) )
{
    printf("error: undefined memory flag" ) ; exit(20);
}

t = AllocMem( x/y , z ) ;

if( t==0 ){ printf("memory allocation failed" ) ; exit(20); }
[/code]

you would just write:

[code]

t = AllocMem( x/y , z ) ;

[/code]

failure at any of the 3 places would cause the appropriate exception,
so you get much more transparent programs,

its a long time since I used m3 but I think if you dont put any
exception handlers the program will probably just quit with untrapped exception,

>Even if the app crashes, there is no way to "remove task" safely,
>since there is no resource tracking or separated address spaces.

if you had just memory tracking then you could remove the tasks
allocated memory + code + stack,

>When task/process crashes under AmigaOS the safest thing you can do is make it
> Wait(0) (suspend).

thats what I meant, or remove permanently the struct Task from the task queues,

>But it also lacks all local buffering. It doesn't handle BUF_NONE and
>BUF_LINE properly.

you are clouding the issue with implementation specifics,

>With such methods all buffering would be left at the
>lower level (filesystem and device driver), thru several APIs.

no no no, I am standing at the opposite side:

you------prog------API-----API--------h/w-----me

from my POV through 0 API's, from your POV lots of them,

I am looking at it from the h/w POV, you from the programmer or user POV,

speed is determined by the h/w,

>The "closer" the buffering is to caller, the faster
>it is.

correction: closer buffering is to the h/w the faster it is,

see CPU caches for example,

caller caching of memory is going to be very slow and complicated,
h/w caching (ie lowest level possible) is much faster than uncached
which is why its done!

thats why people dont like emulating MMUs, s/w emulation of MMU,
= very very very slow,

VM is a caching mechanism: RAM space becomes an L1-cache (IYSWIM)
for the virtual space, Hard disk = L2-cache,
virtual space itself doesnt exist hence its "virtual",

VM has to be done by s/w however it is MMU-exception driven,
so its as near to the h/w as you can get namely exceptions,

imagine if the programmer had to do this deliberately in s/w,


>Also when the buffering is local it has much better knowledge of
>the actual buffer usage, so further optimizations are possible
>that would not be at lower level.

I totally disagree, buffers should be dynamically allocated at the
lowest level, preferably not by the programmer, in fact maybe even not
by the filesystem but by a lower level still, though integrating the
filesystem with the lowest level maybe is the best approach,

just as memory caching is done transparently by a very low level of
the CPU, exactly the same applies to everything,

imagine if programmers had to do RAM caching themselves,

Also choice of caching algorithm can make a huge difference this is part of
why I dont want it done by the programmer,

if its not done by the programmer then it can be retargetted ie reimplemented,

caches arent everything, filesystem design is equally important at
determining speed,

a well designed system would be really fast even if the programmer
doesnt do any buffering,

the subroutine call overhead of fputc( fgetc(infp) , outfp) should be
quite tiny because this is such a tiny loop it should entirely be in the
memory cache,

(with fgetc() copying a byte from a low level buffer and fputc() to a low
level buffer)

note that not only will the instructions of this be entirely in the
instruction caches but the file buffer arrays will also be entirely in
data caches, so all round very fast,
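
For reference, a minimal sketch of the fputc( fgetc(infp) , outfp ) loop in standard C. The function name copy_stream is made up for the example; the only assumption is an ordinary C library whose stdio streams are buffered by default, so most iterations only move a byte between memory buffers.

```c
#include <stdio.h>

/*
 * Copy everything from `in` to `out` one character at a time.
 * stdio buffers both streams, so the underlying read or write only
 * happens when a buffer runs empty or fills up.
 * Returns the number of bytes copied, or -1 on a read or write error.
 */
long copy_stream(FILE *in, FILE *out)
{
    long n = 0;
    int c;                      /* int, not char, so EOF is distinguishable */

    while ((c = fgetc(in)) != EOF) {
        if (fputc(c, out) == EOF)
            return -1;
        n++;
    }
    return ferror(in) ? -1 : n;
}
```

How fast this really is depends on the library's buffer size and on the filesystem underneath, which is exactly the point under debate here.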


I think AmigaOS filesystem is much more efficient than that of Windows XP,

its a good filesystem, but it could be a lot better,

>Your snoop program probably only tells you if something calls dos.library
>thru vectors. It doesn't detect dos.library calling itself directly
>via bsr or jsr.

very bad design if it bypasses the jump vectors, what happens if someone
retargets the jump vectors?

inconsistent behaviour thats what,

someone tried to save a few clock cycles instead of

bsr x
....
x: jmp y

they've done

bsr y

ie ruined the retargettable design to save 1 asm instruction,

you should never trash clean design just to save 1 asm instruction,

the above stops someone fixing some bug in x:


>Also it misses direct DosPacket I/O to filehandler (for example ixemul uses it).

as you point out if it bypasses the jump vector my prog wont see it,
however if the snoop prog sees something though its for real,

If you run my program trackdisk.c you will see that the OS in fact
does sensible buffering: you get a sequence of dots rapidly drawn
followed by a wait, followed by the next sequence,

so the OS is sensibly not writing to the hardware till the cylinder or track
is full,

I think one difference between my perspective and yours is I am an empiricist,
I look at what happens when a system is used or run,

whereas you are following the deductive path where you look at how its
implemented and work from there,


both paths have their own value and both have limitations,

I do use deduction but not on specifics but based on
past observations and experience,

so eg I know that mfm.device i/o is using sensible buffering
just by watching my program output,

also if copying a partition takes much much longer than formatting it
then I know that the filesystem is to blame, I dont even need to
know what filesystem it is,

Here is an example of bad design:

with Memacs if I look at a 5 meg file, it takes 10 years to load,
the silly program appears to be loading the entire file into ram,
with sensible design it should just load the visible part of the
file almost instantly,

another example:

if someone binmails me a 1 meg file attachment,
YAM takes a zillion years to load it,
this means either the email datastructure is badly designed
or YAM is badly designed,

a well designed email structure with 2 large attached files would have say bytes:

#define SENDER -1
#define MESSAGE -2
#define ATTACHMENT -3

SENDER int_sizeof(next_string) "xyz@uvw.com"
MESSAGE int_sizeof( next_string)"hello,\nbye\nxyz"
ATTACHMENT NAME int_sizeof( name ) "story" int_sizeof(file)
"A long story: once upon a time, last week, "
......
[approx 1 meg here]
......
"and lived happily ever after"
ATTACHMENT NAME int_sizeof( next_string ) "another_story" int_sizeof(file)
"Another long story: once upon a time, last week, "
......
[approx 1 meg here]
......
"and they also lived happily ever after"


then YAM only needs to load SENDER + MESSAGE, but ATTACHMENT it reads in
the name then skips the contents to the next attachment reads name
etc printing out:

sender: xyz@uvw
message:
hello,
bye
xyz
attachment: story 1meg
attachment: another_story 1 meg
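
The skipping idea can be sketched in portable C. The byte layout is an assumption made for the example (one tag byte, then a 4-byte big-endian length, then the data, with the NAME marker left out), and struct reader, read_field, skip_field and list_mail are invented names; this says nothing about how YAM or real mail formats work.

```c
#include <stddef.h>
#include <stdint.h>

/* Tag values as in the post; the byte layout is an assumption. */
#define SENDER     (-1)
#define MESSAGE    (-2)
#define ATTACHMENT (-3)

struct reader {
    const unsigned char *data;
    size_t size;
    size_t pos;
    size_t bytes_read;   /* payload bytes actually looked at */
};

/* Read a 4-byte big-endian length. */
static int read_u32(struct reader *r, uint32_t *out)
{
    const unsigned char *p = r->data + r->pos;

    if (r->size - r->pos < 4)
        return -1;
    *out = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    r->pos += 4;
    return 0;
}

/* Skip a length-prefixed field without touching its bytes. */
static int skip_field(struct reader *r)
{
    uint32_t len;

    if (read_u32(r, &len) != 0 || r->size - r->pos < len)
        return -1;
    r->pos += len;
    return 0;
}

/* Consume a length-prefixed field, counting the bytes touched. */
static int read_field(struct reader *r)
{
    uint32_t len;

    if (read_u32(r, &len) != 0 || r->size - r->pos < len)
        return -1;
    r->bytes_read += len;
    r->pos += len;
    return 0;
}

/* Read sender, message and attachment names; skip attachment contents.
   Returns the attachment count, or -1 if the data is malformed. */
int list_mail(struct reader *r)
{
    int attachments = 0;

    while (r->pos < r->size) {
        int tag = (signed char)r->data[r->pos++];

        switch (tag) {
        case SENDER:
        case MESSAGE:
            if (read_field(r) != 0) return -1;
            break;
        case ATTACHMENT:
            if (read_field(r) != 0) return -1;   /* name */
            if (skip_field(r) != 0) return -1;   /* contents, not read */
            attachments++;
            break;
        default:
            return -1;
        }
    }
    return attachments;
}
```

With a file instead of a memory buffer, skip_field would become a seek, so the cost of listing the mail would depend on the names and lengths rather than on the size of the attachments.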

this should happen quite fast, if it doesnt then there
must be something wrong with the filesystem, a well designed filesystem
should allow you to random access any part quite quickly,
so eg the file must *not* be solely a linked list of blocks

but be say an initial block giving the positions of
the data blocks, if the first block isnt large enough
then the tree needs more levels, so always the first block
should span the entire file,

this way it will take log_blocksize( filesize ) to random
access the file. So if blocksize is 512 ints then
its log_512( filesize ) reads, ie
2 reads to locate from 262144 byte file,
3 reads from 134000000 byte file,
ie approx 3 reads to get to any position of a 130 meg file,
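
The level counts above can be checked with a few lines of C. This is only a sketch of the arithmetic, with an invented function name (index_levels) and the assumption that each index block holds `fanout` entries; it computes ceil(log_fanout(filesize)) using integers.

```c
#include <assert.h>

/* Number of index levels needed so that one top block spans `filesize`
   units, when each index block holds `fanout` entries. */
unsigned index_levels(unsigned long filesize, unsigned long fanout)
{
    unsigned levels = 0;
    unsigned long span = 1;     /* units reachable with `levels` levels */

    assert(fanout >= 2);        /* a fanout below 2 would never terminate */

    while (span < filesize) {
        /* guard against overflow before multiplying */
        if (span > (unsigned long)-1 / fanout)
            return levels + 1;
        span *= fanout;
        levels++;
    }
    return levels;
}
```

With a fanout of 512 this gives 2 levels for 262144 and 3 levels for 134000000, matching the figures above.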

anyway YAM should load a big binmail real fast,

it doesnt, it can take several minutes on my A1200,

something somewhere is wrongly designed:

either YAM or email-datastructures or the filesystem,

I dont know which,
04-21-2004, 06:47 PM   #116
Karlos
Sockologist
Points: 50,135, Level: 100
Activity: 10%
Join Date: Nov 2002
Location: Barishabaad, Sardistan
Posts: 16,646
Blog Entries: 18
Re: 68k AGA AROS + UAE => winner!

@whoosh777

So anyway, how's the big-endian memory emulation idea progressing? Somewhere in the chip / trackdisk / aros install posts I lost where you were up to.

Are you still planning to attempt it?
04-21-2004, 09:36 PM   #117
whoosh777
Too much caffeine
Points: 5,031, Level: 45
Activity: 0%
Join Date: Jun 2003
Posts: 114
Re: 68k AGA AROS + UAE => winner!


@bloodline

/*
TCP/IP stacks are hard, but Wez is working on one.

Did you not use Datatypes on your Amiga?
*/

no, I've not gone too deeply into graphics as I find it too
open ended and arbitrary, though its very satisfying to look at,

same reason I didnt become an artist,

so when I have needed graphics I've coded it directly eg
iffparse.library for iff files,

or used 3rd party code for a specific picture file,

/*
You seem hung up on SCSI. SCSI is a very old idea. Hard drives and
CD/DVD drives all go on the IDE bus, all other devices go on the
USB or the Firewire bus.
AROS can use CD/DVD drives, because AROS has an ide.device.
SCSI is dead, no normal computer equipment use it
*/

I bought a USB external drive today so I see what you mean:

£40 for 120 Gig, £60 for the case though, probably 10 x the capacity of
an external SCSI for that price,

I think USB outdoes SCSI, but SCSI has advantages (not price) over IDE:

with IDE my machine only has 2 slots,

for me personally I have complete docs for coding cross platform SCSI,
but maybe I can learn to code USB,

/*
The thing I hate most about all GUI's is "auto rise",
that is when you single click on a window and it is brought to the
front of the display. I hate it. AROS does not do this, and I like it that way
*/

I agree entirely,

single click should activate and double click "rise",

and I think its not just subjective but "auto rise" is a deficient concept
in that its impossible to "activate" without "rise"


Question: can you outline what is involved in communicating with a
hardware device on the PC?

(thinking about AROS drivers)

explain it in general terms understandable to someone only familiar with
the classic 68k Amiga,

how do you communicate with a generic PC h/w unit?

alternatively select a specific h/w unit and explain how this
is controlled directly in assembler,

eg are there specific absolute memory addresses for say PCI socket 1,
and is there some kind of structuring to the reads + writes,

eg for a write might it be:
write data length to $12340
write pointer to data to $12344
set bit 1 of $12348 to begin the write,
interrupt 123 happens on write completion,

and eg the data itself will be structured in a way
private to that specific device,
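
To make the hypothetical above concrete, here is a small simulation in C. It only models the made-up register scheme from this post (the addresses $12340 to $12348 and "interrupt 123" are invented examples, and struct fake_regs, driver_write and fake_device_poll are invented names); it does not describe how any real PCI, IDE or USB device is programmed.

```c
#include <stddef.h>
#include <stdint.h>

/* Register block laid out like the hypothetical example. On real hardware
   this would be a volatile pointer to a fixed address; here it is an
   ordinary struct so the sketch runs anywhere. */
struct fake_regs {
    volatile uint32_t length;    /* "$12340": number of bytes to write  */
    const void *volatile data;   /* "$12344": pointer to the data       */
    volatile uint32_t control;   /* "$12348": bit 1 starts the transfer */
};

#define CTRL_START (1u << 1)

static unsigned completions;     /* stands in for "interrupt 123" */
static uint32_t last_length;

/* Simulated device: acts when the start bit is set, then clears it. */
static void fake_device_poll(struct fake_regs *regs)
{
    if (regs->control & CTRL_START) {
        last_length = regs->length;
        regs->control &= ~CTRL_START;
        completions++;           /* a real device would raise an interrupt */
    }
}

/* Driver side: program the registers in the order described in the post. */
void driver_write(struct fake_regs *regs, const void *buf, uint32_t len)
{
    regs->length = len;
    regs->data = buf;
    regs->control |= CTRL_START;
    fake_device_poll(regs);      /* in hardware this happens on its own */
}

unsigned completed_writes(void)
{
    return completions;
}

uint32_t last_write_length(void)
{
    return last_length;
}
```

The volatile qualifiers are there because, with real memory-mapped registers, the compiler must not reorder or drop the stores; in this simulation they make no practical difference.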


I can see that all PCs have the same structure:

some PCI sockets, AGP socket(s), SIMM sockets, maybe IDE socket(s),
maybe USB sockets,

and then a lot of h/w is via these sockets,
you mentioned IDE bus and USB bus, presumably the bus amounts to
an initial socket??

so I presume each socket has an assembler level API,

Old 04-22-2004, 03:05 AM   #118
bloodline
Master Sock Abuser
Join Date: Mar 2002
Location: London, UK
Posts: 11,890
Blog Entries: 3
Default Re: 68k AGA AROS + UAE => winner!

Quote:
/*
TCP/IP stacks are hard, but Wez is working on one.

Did you not use Datatypes on your Amiga?
*/

no, I've not gone too deeply into graphics as I find it too
open ended and arbitrary, though its very satisfying to look at,

same reason I didnt become an artist,

so when I have needed graphics I've coded it directly eg
iffparse.library for iff files,

or used 3rd party code for a specific picture file,
From now on, you will use Datatypes if you need to support any file type. :-D

Quote:
/*
You seem hung up on SCSI. SCSI is a very old idea. Hard drives and
CD/DVD drives all go on the IDE bus, all other devices go on the
USB or the Firewire bus.
AROS can use CD/DVD drives, because AROS has an ide.device.
SCSI is dead, no normal computer equipment use it
*/

I bought a USB external drive today so I see what you mean:

£40 for 120 Gig, £60 for the case though, probably 10 x the capacity of
an external SCSI for that price,

I think USB outdoes SCSI, but SCSI has advantages (not price) over IDE:

with IDE my machine only has 2 slots,

for me personally I have complete docs for coding cross platform SCSI,
but maybe I can learn to code USB,
If you have 2 IDE slots then you can use 4 devices.
That's right, each IDE slot can support 2 devices. If you buy an IDE cable it has three plugs on it: one for the motherboard and one for each of the devices the slot can support. If you do put more than one device on a single IDE slot, one device must be set to Master and the other to Slave. This is done using some jumpers at the back of the device.

Learn to use the USB, it is much better than SCSI.

Quote:
Question: can you outline what is involved in communicating with a
hardware device on the PC?

(thinking about AROS drivers)

explain it in general terms understandable to someone only familiar with
the classic 68k Amiga,

how do you communicate with a generic PC h/w unit?

alternatively select a specific h/w unit and explain how this
is controlled directly in assembler,

eg are there specific absolute memory addresses for say PCI socket 1,
and is there some kind of structuring to the reads + writes,

eg for a write might it be:
write data length to $12340
write pointer to data to $12344
set bit 1 of $12348 to begin the write,
interrupt 123 happens on write completion,

and eg the data itself will be structured in a way
private to that specific device,


I can see that all PCs have the same structure:

some PCI sockets, AGP socket(s), SIMM sockets, maybe IDE socket(s),
maybe USB sockets,

and then a lot of h/w is via these sockets,
you mentioned IDE bus and USB bus, presumably the bus amounts to
an initial socket??

so I presume each socket has an assembler level API,
Ok... firstly we don't use any ASM in the drivers, since you will find the same network card/graphics card/sound card in an x86 machine, a PPC machine or whatever, so the driver should be portable across CPU types.
You do not program the PCI slots yourself. As a driver you request information about the attached hardware from the PCI drivers (which I think is the pci.library in AROS). You use the PCI drivers for all access to devices on the PCI bus, and since almost everything in the computer is attached via that bus, this is the thing you need to learn about. Everything hangs off the PCI bus, which is the bridge between the CPU and the hardware. The PCI drivers will provide you with any information you need about the attached hardware.
I'm not sure how good the AROS PCI driver docs are, but it's a pretty standard implementation.
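
For the curious, here is a rough sketch of what a PCI driver does underneath on a classic x86 PC. It is not the AROS PCI API, which is not shown in this thread. It only builds the 32-bit address word used by the legacy PCI "configuration mechanism #1", where the word is written to I/O port 0xCF8 and the selected register is then read through port 0xCFC. The function name `pci_config_address` is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Build the value written to the CONFIG_ADDRESS port (0xCF8) to select
   one configuration register of one PCI device. The function name is
   hypothetical; the bit layout is the standard mechanism #1 layout. */
static uint32_t pci_config_address(uint8_t bus, uint8_t dev,
                                   uint8_t func, uint8_t reg)
{
    assert(dev < 32);   /* device number is 5 bits   */
    assert(func < 8);   /* function number is 3 bits */
    return 0x80000000u                /* bit 31: enable            */
         | ((uint32_t)bus  << 16)     /* bits 23-16: bus number    */
         | ((uint32_t)dev  << 11)     /* bits 15-11: device number */
         | ((uint32_t)func << 8)      /* bits 10-8: function       */
         | (reg & 0xFCu);             /* bits 7-2: dword register  */
}
```

So there are no fixed memory addresses per socket: the PCI driver walks the bus/device/function numbers, reads each device's vendor and device ID from register 0, and then tells your driver where that card's own registers have been mapped.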
Old 04-22-2004, 04:06 AM   #119
Piru
Banned
Join Date: Aug 2002
Location: Helsinki, Finland
Posts: 6,946
Default Re: 68k AGA AROS + UAE => winner!

@whoosh777

Quote:
I havent used the blitter for a long time but what happens if you give the blitter an odd address, do you get an exception or just a horrible crash??
No exception. The lowest bit of the address is ignored, so you would get odd_address - 1.

Quote:
as I said I can turn all the criticisms of my code around into criticisms of the system design, both h/w and OS,
I am sure of that. However, I would still recommend just testing against NULL return as documented.

Quote:
Quote:
Even if the app crashes, there is no way to "remove task" safely, since there is no resource tracking or separated address spaces.
if you had just memory tracking then you could remove the tasks allocated memory + code + stack,
I'm afraid that won't work. The memory allocated by the task can still be in use by other tasks / processes. Also the program seglist could be used by interrupts, hooks or other processes. This is why you can't free the task memory or unload the seglist.

Quote:
Quote:
With such methods all buffering would be left at the lower level (filesystem and device driver), thru several APIs.
... Lots of stuff comparing RAM L1/L2 cache and medium cache ...

correction: closer buffering is to the h/w the faster it is,

see CPU caches for example
I'm afraid this comparison is totally unfair, as typically the medium is tens to hundreds of times slower than memory, not to mention L1/L2. It certainly makes no sense at all to locally cache memory!

Quote:
Quote:
Also when the buffering is local it has much better knowlege of the actual buffer usage, so further optimizations are possible that would not be at lower level.
I totally disagree, buffers should be dynamically allocated at the lowest level, preferably not by the programmer, in fact maybe even not by the filesystem but by a lower level still, though integrating the filesystem with the lowest level maybe is the best approach
I totally disagree with you, see below for an example.

Quote:
Also choice of caching algorithm can make a huge difference this is part of why I dont want it done by the programmer,

if its not done by the programmer then it can be retargetted ie reimplemented
But the programmer does not need to do it, libc does it for him/her.

Quote:
caches arent everything, filesystem design is equally important at determining speed, a well designed system would be really fast even if the programmer doesnt do any buffering, the subroutine call overhead of fputc( fgetc(infp) , outfp) should be quite tiny because this is such a tiny loop it should entirely be in the memory cache, (with fgetc() copying a byte from a low level buffer and fputc() to a low level buffer) note that not only will the instructions of this be entirely in the instruction caches but the file buffer arrays will also be entirely in data caches, so all round very fast,

I think AmigaOS filesystem is much more efficient than that of Windows XP, its a good filesystem, but it could be a lot better,
Well, apparently you don't know how complex the two APIs are, and how much overhead they cause. Maybe a simple example will clear it up for you.
Let's imagine a simple fputc without any buffering, except at the exec device driver level (which you say will be the most efficient):

- fputc calls Write(fh, &ch, 1); to write the char
- Write calls DoPkt with ACTION_WRITE
- DoPkt sets up a DosPacket with ACTION_WRITE and parameters for the write
- DoPkt sends the DosPacket to the filesystem MsgPort with PutMsg() and Wait()s for the reply
- filesystem wakes up from Wait() and fetches the DosPacket with GetMsg()
- filesystem determines the DosPacket is ACTION_WRITE and processes it
- filesystem ACTION_WRITE updates the current block of the file
- filesystem ACTION_WRITE uses DoIO CMD_WRITE (or CMD_WRITE64, or HD_SCSICMD etc) to send an IORequest to the exec device driver
- DoIO uses the device driver DEV_BEGINIO vector to send the IORequest
- The device driver DEV_BEGINIO links the IORequest to the device task for processing, and returns
- DoIO's WaitIO call waits for the IO to finish
- The device task will process the IORequest, see that it's CMD_WRITE (or whatever), and update buffers (and perhaps do the actual IO).
- When the IO is finished the IORequest will return (ReplyMsg)
- DoIO's WaitIO call wakes up, and DoIO returns
- filesystem checks for an IO error and io_Actual to see that the write was successful
- filesystem ACTION_WRITE sets dp_Res1 to 1 (written 1 byte) and dp_Res2 to 0 and returns the DosPacket to the caller (DoPkt) with PutMsg()
- DoPkt's Wait wakes up, and DoPkt fetches the reply DosPacket with GetMsg(), moves dp_Res1 to d0 and dp_Res2 to pr_Result2 (IoErr), and returns
- Write returns with 1 byte written
- fputc returns with 1 byte written

The above sequence is for writing single byte without local buffering. It involves several task switches (scheduling) and waiting. In all it is very very time consuming and will kill the performance, regardless of caches.

Now, if you put the caching to filesystem the sequence gets a lot shorter:

- fputc calls Write(fh, &ch, 1); to write the char
- Write calls DoPkt with ACTION_WRITE
- DoPkt sets up a DosPacket with ACTION_WRITE and parameters for the write
- DoPkt sends the DosPacket to the filesystem MsgPort with PutMsg() and Wait()s for the reply
- filesystem wakes up from Wait() and fetches the DosPacket with GetMsg()
- filesystem determines the DosPacket is ACTION_WRITE and processes it
- filesystem ACTION_WRITE updates the current block of the file in cache
- filesystem ACTION_WRITE sets dp_Res1 to 1 (written 1 byte) and dp_Res2 to 0 and returns the DosPacket to the caller with PutMsg()
- DoPkt's Wait wakes up, and DoPkt fetches the reply DosPacket with GetMsg(), moves dp_Res1 to d0 and dp_Res2 to pr_Result2 (IoErr), and returns
- Write returns with 1 byte written
- fputc returns with 1 byte written

It still involves several task switches (scheduling) and waiting.

Now, let's put the cache in fputc:

- fputc puts the char to local buffer
- fputc return with 1 byte written

No task switching is involved. Depending on the buffering mode, only filling up the buffer or a linefeed will cause an actual flush of the cache (Write).
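
The difference between the first and the last sequence can be sketched in plain C. This is only an illustration under stated assumptions: `slow_write` is a stand-in that counts calls instead of doing a real dos.library Write() with its DosPacket round trip, and `out_stream`, `buffered_putc`, `unbuffered_putc` and the 512-byte buffer size are names and values invented for the example, not anything from a real libc.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_SIZE 512  /* arbitrary buffer size chosen for illustration */

struct out_stream {
    char   buf[BUF_SIZE];
    size_t used;         /* bytes currently held in buf          */
    size_t write_calls;  /* how many times the slow path was hit */
};

/* Stand-in for Write(): on a real Amiga every call costs a packet
   round trip to the filesystem task. Here it only counts the calls. */
static void slow_write(struct out_stream *s, const char *data, size_t len)
{
    (void)data;
    (void)len;
    s->write_calls++;
}

/* Push whatever is buffered down to the slow path. */
static void stream_flush(struct out_stream *s)
{
    if (s->used > 0) {
        slow_write(s, s->buf, s->used);
        s->used = 0;
    }
}

/* Cache in fputc: store the char locally, flush only when full. */
static int buffered_putc(struct out_stream *s, int ch)
{
    assert(s->used < BUF_SIZE);
    s->buf[s->used++] = (char)ch;
    if (s->used == BUF_SIZE)
        stream_flush(s);
    return ch;
}

/* No local cache: every single char goes through the slow path. */
static int unbuffered_putc(struct out_stream *s, int ch)
{
    char c = (char)ch;
    slow_write(s, &c, 1);
    return ch;
}
```

Writing 2048 chars this way hits the slow path 2048 times unbuffered and 4 times buffered. Only full buffering is shown; a line-buffered mode would also flush on a linefeed.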

If you still fail to see my point, I can't really help it.

Quote:
so the OS is sensibly not writing to the hardware till the cylinder or track is full
Only because you write full tracks. If you did small writes, it would rewrite the same track several times.
Old 04-22-2004, 05:29 PM   #120
whoosh777
Too much caffeine
Join Date: Jun 2003
Posts: 114
Default Re: 68k AGA AROS + UAE => winner!

Quote:
Karlos wrote:
@whoosh777

So anyway, how's the big-endian memory emulation idea progressing? Somewhere in the chip / trackdisk / aros install posts I lost where you were up to.

Are you still planning to attempt it?
I never said I would attempt it, though its a
possibility, I'm ruling nothing in ruling nothing out
IYSWIM,

at the moment my time is being taken up sorting out
the h/w of my system,

I bought a KVM switch: 2 PCs to share monitor+keybd +mouse,
took maybe 2 hours to figure out the cabling as I have
lots of cables but not all the right ones,

today I bought the right cables + converters but
some were not in stock, then I found that my
PC cannot use the mouse if there are too many
extensions: the A1200 has no problem whatsoever,
the PC cannot cope,

I may have to buy a USB mouse, I want the PC to join
the A1200 in another room with lengths of cables
connecting them to the room I program in, I do this
to have total silence.

I also bought an external USB drive: £40 for 120Gig
and £60 for the case (I think), (see earlier posting
for the correct price),

I want to see if I can externalise the IDE drive,
then I will boot up in different environments,

once the h/w is reasonably set up then I will
look further into AROS. I havent figured out how
I will transfer data to and from AROS,
I wonder if its possible to impose AmigaDOS onto
the PC floppies, ie treat them as small hard disks,
then I could transfer data to and
from the A1200,

if UAE is integrated successfully into little endian
AROS then I may not bother with a big endian AROS,
its completely a practical matter,


anyway the very first program I will write is
"hello world",

#include <stdio.h>

int main( int argc, char **argv )
{
    printf("hello world\n");
    return 0;
}

I'm dreading what flaw Piru will find, I cant see any
but he will find something, maybe it should be
char *argv[]?? or should I check for an error from
printf eg a shell may not be open,
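
On those two worries: in a parameter list `char **argv` and `char *argv[]` mean exactly the same thing in C, and printf does report failure by returning a negative value. A cautious variant could look like the sketch below. The helper name `say_hello` and the choice of returning 1 on failure are assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Write the greeting to the given stream.
   Returns 0 on success, 1 if the text could not be written. */
static int say_hello(FILE *out)
{
    assert(out != NULL);
    if (fprintf(out, "hello world\n") < 0)
        return 1;            /* formatted output failed */
    if (fflush(out) == EOF)
        return 1;            /* buffered data could not be flushed */
    return 0;
}
```

main then reduces to `return say_hello(stdout);`, so a failed write shows up as a non-zero return code instead of being silently ignored.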

the difficulty with "hello world" being setting up the compiler,
after that I will try some small but useful programs
little utilities I've written out of necessity,

dabbling with AROS itself is much further down
the line,

I like to move gradually, if you recall at the
start of the thread I hadnt got a PC and
wasnt ready to look at the AROS sites, but now
I've got the PC + visited the sites and installed
+ tried out AROS,

BTW I hope AROS doesn't reimplement OS4 as that would be
in bad taste I think. The open nature of AROS makes
it eternal in some sense, so I think like Linux and
GNU it will gradually succeed, if you visit www.gnu.org
(or is it ftp.gnu.org) it's worth reading the interview
there with Richard Stallman on how
GNU began, I think he started it with GNU emacs,

so it began with just GNU emacs, I don't know if
that's a reimplementation of a Unix emacs,
or if it was a brand new program,


the selling point of AmigaOS is its so direct,
both for users and developers, its so straightforward
to write programs, also it generally doesnt
pretend to be bigger than it is,
no gratuitous bloat to make it look like several Gig,

my A1200 has a 1/2 meg ROM, I know this by watching
a memory meter and running CPU fastrom,

you can run an A1200 in 2 Meg so the diskloaded libraries
must be under 2 meg, thus it fits in 2.5 Meg, probably
quite a bit less,

IMO the extra functionality that Windows XP has
cannot be more than a few Meg, so I can easily
believe XP to be say less than 5 Meg,

a really interesting project will be to "measure" XP,

1 Meg of OS is a huge amount of stuff, so I think
MS bloated up their OS with 100's of megs of non
OS material masquerading as OS,

IMO also Windows XP looks like the work of a few
dozen developers at most, it has that feel,

maybe 10 developers writing the programs and
15 writing the OS,

so if AROS has 11 developers I think they're doing well,
wasnt Linux done mainly by 1 developer initially??
of course they will hype it up
as man centuries of work, but note that
25 developers working for 4 years == 1 man century,

anyway for me getting involved with AROS itself
is quite a bit later on, first I want to try out
some programs that use the current API rather than
work on the API itself. What is really interesting
is that if you locate OS bugs you can actually fix
them yourself!

no more workarounds of OS bugs,
