Author Topic: MFM decode  (Read 3185 times)

Offline orange (Topic starter)

  • Hero Member
  • *****
  • Join Date: Dec 2003
  • Posts: 2794
MFM decode
« on: December 17, 2017, 05:36:04 PM »
does 'rawread' dump MFM encoded data? if so, how to decode it with given sync, track size, etc..? (not a standard 880K diskette)
Better sorry than worry.
 

guest11527

  • Guest
Re: MFM decode
« Reply #1 on: December 17, 2017, 05:56:15 PM »
Quote from: orange;834180
does 'rawread' dump MFM encoded data?
Yes.
Quote from: orange;834180
if so, how to decode it with given sync, track size, etc..? (not a standard 880K diskette)
That, of course, depends on the format. MFM is a 1:2 encoding: every data bit is encoded as two bits. In particular, the filler bit between two data bits is 1 if and only if both data bits are zero; otherwise the filler bit is 0. How the data is laid out, how the data bits are spread, and what the track and sector headers look like is entirely a decision of the format.

The format of the trackdisk.device is described in the RKRM Hardware; you will find all the information there. In this format, the 512 bytes of payload data are separated into two 256-byte groups (odd and even bits), which is a rather atypical layout, but it allows fast decoding with the blitter. In most other formats, each filler bit is interleaved with the data bit that follows it.
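
As a rough illustration of the filler-bit rule, here is a minimal C sketch that encodes one byte into 16 MFM bits, with each filler (clock) bit placed in front of its data bit. The function name `mfm_encode_byte` and the `prev_bit` parameter are made up for this example, and it shows the interleaved layout, not the odd/even layout of trackdisk.device.

```c
#include <stdint.h>

/* Encode one data byte (MSB first) into 16 MFM bits.
 * prev_bit is the last data bit that preceded this byte (0 or 1).
 * Each data bit is preceded by a clock bit, which is 1 only if
 * both the previous and the current data bit are 0.
 */
uint16_t mfm_encode_byte(uint8_t data, unsigned prev_bit)
{
    uint16_t out = 0;
    int i;

    for (i = 7; i >= 0; i--) {
        unsigned bit = (data >> i) & 1u;
        unsigned clock = (prev_bit == 0 && bit == 0) ? 1u : 0u;

        out = (uint16_t)((out << 2) | (clock << 1) | bit);
        prev_bit = bit;
    }
    return out;
}
```

With this rule, a 0x00 byte following a 0 bit comes out as 0xAAAA, and 0xA1 comes out as 0x44A9. The 0x4489 sync word differs from that by exactly one missing clock bit.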
 

Offline orange (Topic starter)

Re: MFM decode
« Reply #2 on: December 17, 2017, 07:00:51 PM »
thanks Thomas.
« Last Edit: December 17, 2017, 07:41:33 PM by orange »
 

Offline olsen

Re: MFM decode
« Reply #3 on: December 18, 2017, 12:31:07 PM »
Quote from: orange;834180
does 'rawread' dump MFM encoded data? if so, how to decode it with given sync, track size, etc..? (not a standard 880K diskette)

Documented MFM decoding code is pretty scarce (I've seen my share, and I still can't believe that the people who wrote it trusted their own code!). You might want to look into TrackSalve, although it is somewhat old. "TrackSalve" is a patch for trackdisk.device by Dirk Reisig, last updated in 1990, which fixes Kickstart 1.x-specific bugs that had already been fixed in Kickstart 2.0 by that time. "TrackSalve" covers just about everything. Bonus: it uses the blitter for encoding/decoding.
 

Offline olsen

Re: MFM decode
« Reply #4 on: December 18, 2017, 01:43:23 PM »
Quote from: Thomas Richter;834181
Yes.

That, of course, depends on the format. MFM is a 1:2 encoding: every data bit is encoded as two bits. In particular, the filler bit between two data bits is 1 if and only if both data bits are zero; otherwise the filler bit is 0. How the data is laid out, how the data bits are spread, and what the track and sector headers look like is entirely a decision of the format.

The format of the trackdisk.device is described in the RKRM Hardware; you will find all the information there. In this format, the 512 bytes of payload data are separated into two 256-byte groups (odd and even bits), which is a rather atypical layout, but it allows fast decoding with the blitter. In most other formats, each filler bit is interleaved with the data bit that follows it.

I just double-checked: the disk format documentation ended up in "Appendix C" of the 3rd edition "Devices" ROM Kernel Reference Manual. It seems that it was originally part of the 1st edition "Libraries & Devices" ROM Kernel Reference manual in "Appendix L", but you can't find that version online.

Anyway, here's what I found, from way back (1985):

Code: [Select]
COMMODORE-AMIGA DISK FORMAT

The following are details about how the bits on the Commodore-Amiga disk
are actually written.

Gross Data Organization:

3 1/2 inch disk
double-sided
80 cylinders/160 tracks


Per-track Organization:

Nulls written as a gap, then 11 sectors of data.
No gaps written between sectors.


Per-sector Organization:

All data is MFM encoded.  This is the pre-encoded contents
of each sector:

two bytes of 00 data        (MFM = AAAA each)
two bytes of A1*            ("standard sync byte" -- MFM
                             encoded A1 without a clock pulse)
                            (MFM = 4489 each)
one byte  of format-byte    (Amiga 1.0 format = FF)
one byte  of track number
one byte  of sector number
one byte  of sectors until end of write (NOTE 1)

    [above 4 bytes treated as one longword
     for purposes of MFM encoding]

16  bytes of OS recovery info (NOTE 2)
    [treated as a block of 16 bytes for encoding]
four bytes of header checksum
    [treated as a longword for encoding]
four bytes of data-area checksum
    [treated as a longword for encoding]
512 bytes of data
    [treated as a block of 512 bytes for encoding]

NOTES:

   NOTE 1.  
   The track number and sector number are constant for each
   particular sector.  However, the sector offset byte changes
   each time we rewrite the track.

   The Amiga does a full track read starting at a random
   position on the track and going for slightly more
   than a full track read to assure that all data gets into the
   buffer.  The data buffer is examined to determine where the
   first sector of data begins as compared to the start of the
   buffer.  The track data is block moved to the beginning of
   the buffer so as to align some sector with the first location
   in the buffer.

   Because we start reading at a random spot, the read data may
   be divided into three chunks: a series of sectors, the track
   gap, and another series of sectors.  The sector offset
   value tells the disk software how many more
   sectors remain before the gap.  From this the software can
   figure out the buffer memory location of the last byte
   of legal data in the buffer.  It can then search past the gap
   for the next sync byte and, having found it, can block move
   the rest of the disk data so that all 11 sectors of data are
   contiguous.    

   Example:

   first-ever write of the track from a buffer like this:

   <GAP> |sector0|sector1|sector2|.....|sector10|

   sector offset values:

             11      10       9    .....     1

   (if I find this one at the start of my read buffer,
    then I know there are this many more sectors
    with no intervening gaps before I hit a gap).


   sample read of this track:

   <junk>|sector9|sector10|<gap>|sector0|...|sector8|<junk>

   value of 'sectors till end of write':

              2       1            11    ...    3

   result of track realigning:

   <GAP>|sector9|sector10|sector0|...|sector8|

   new sectors till end of write:

            11      10       9     ...    1

   so that when the track is rewritten, the sector offsets
   are adjusted to match the way the data was written.


NOTE 2. This is operating systems dependent data and relates
to how AmigaDos assigns sectors to files.

Reserved for future use.



GENERAL:

When data is MFM encoded, the encoding is performed on
the basis of a data block-size.  In the sector encoding
described above, there are bytes individually encoded;
three segments of 4 bytes of data each, treated as
longwords; one segment of 16 bytes treated as a block; two
segments of longwords for the header and data checksums;
and the data area of 512 bytes treated as a block.

When the data is encoded, the odd bits are encoded first,
then the even bits of the block.  

(Make a block of bytes formed from all odd bits of the block,
encode as MFM.

Make a block of bytes formed from all even bits of the block,
encode as MFM.   Even bits are shifted left one bit position
before being encoded.)



SOURCE CODE FOR DATA ENCODE/DECODE

/* WORD is the 16-bit integer type from <exec/types.h>. */

decodeBlock( mfmbuffer, userbuffer, numwords )
WORD *mfmbuffer;    /* the encoded data */
WORD *userbuffer;   /* where to put the decoded data */
int numwords;       /* the number of WORDS of data (not bytes) */
{
    WORD *oddptr, *evenptr, oddbits, evenbits;

    oddptr = mfmbuffer;

    /* the even region starts right after the odd one */
    evenptr = &mfmbuffer[numwords];

    while( numwords-- > 0 ) {
        /* mask off the mfm clock bits, and shift the word */
        oddbits = ((*oddptr++ << 1) & 0xAAAA);

        /* even bits are already in the right place.  Just mask off clock */
        evenbits = ((*evenptr++) & 0x5555);

        /* recombine the two sections */
        *userbuffer++ = oddbits | evenbits;
    }
}

encodeBlock( mfmbuffer, userbuffer, numwords )
WORD *mfmbuffer;    /* where to put the encoded data */
WORD *userbuffer;   /* the user data, before encoding */
int numwords;       /* the number of WORDS of data (not bytes) */
{
    WORD *oddptr, *evenptr;
    WORD *ubuf;
    int i;          /* loop counter (declaration missing in the original listing) */

    oddptr = mfmbuffer;

    /* the even region starts right after the odd one */
    evenptr = &mfmbuffer[numwords];

    /* mfmencode (not shown in this listing) takes one word of
     * data and correctly sets the clock bits
     */

    /* encode the odd bits */
    for( ubuf = userbuffer, i = numwords; i > 0; i-- ) {
        *oddptr++ = mfmencode( (*ubuf++ >> 1) & 0x5555 );
    }

    /* encode the even bits */
    for( ubuf = userbuffer, i = numwords; i > 0; i-- ) {
        *evenptr++ = mfmencode( *ubuf++ & 0x5555 );
    }
}
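
The listing above calls `mfmencode`, which is not shown. Below is a minimal sketch of what such a helper might look like, assuming the data bits already sit in the 0x5555 positions of the word. The name `mfm_fill_clocks` and its `prev_bit` parameter (the last data bit of the preceding word) are additions for this example; the 1985 listing passes a single argument. `encode_block` and `decode_block` are likewise hypothetical modern-C restatements, included only to show that the odd/even split round-trips.

```c
#include <stdint.h>

/* Set the MFM clock bits (0xAAAA positions) of a word whose data
 * bits are in the 0x5555 positions. A clock bit is 1 only if the
 * data bits on both sides of it are 0.
 */
uint16_t mfm_fill_clocks(uint16_t data, unsigned prev_bit)
{
    uint32_t d = data & 0x5555u;
    uint32_t before = (d >> 1) | ((prev_bit & 1u) << 15); /* data bit ahead of each clock */
    uint32_t after = d << 1;                              /* data bit behind each clock */
    uint32_t clocks = ~(before | after) & 0xAAAAu;

    return (uint16_t)(d | clocks);
}

/* Encode numwords data words into 2 * numwords MFM words:
 * all odd bits first, then all even bits.
 */
void encode_block(uint16_t *mfm, const uint16_t *user, int numwords,
                  unsigned prev_bit)
{
    int i;

    for (i = 0; i < numwords; i++) {
        uint16_t bits = (uint16_t)((user[i] >> 1) & 0x5555);

        mfm[i] = mfm_fill_clocks(bits, prev_bit);
        prev_bit = bits & 1u;
    }
    for (i = 0; i < numwords; i++) {
        uint16_t bits = (uint16_t)(user[i] & 0x5555);

        mfm[numwords + i] = mfm_fill_clocks(bits, prev_bit);
        prev_bit = bits & 1u;
    }
}

/* Inverse of encode_block: drop the clock bits and recombine. */
void decode_block(const uint16_t *mfm, uint16_t *user, int numwords)
{
    int i;

    for (i = 0; i < numwords; i++) {
        uint16_t odd = (uint16_t)((mfm[i] << 1) & 0xAAAA);
        uint16_t even = (uint16_t)(mfm[numwords + i] & 0x5555);

        user[i] = (uint16_t)(odd | even);
    }
}
```

Passing `prev_bit` along is a design choice of this sketch: without it, the first clock bit of each word cannot be set correctly when the preceding word ended in a 1 bit.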

Documentation on how the sector header and data area checksums are calculated remains elusive, I'm afraid.
« Last Edit: December 18, 2017, 01:47:55 PM by olsen »
 

guest11527

  • Guest
Re: MFM decode
« Reply #5 on: December 18, 2017, 02:13:07 PM »
Quote from: olsen;834189
Documentation on how the sector header and data area checksums are calculated remains elusive, I'm afraid.

For German readers, there is the Databecker "Floppybuch" for the Amiga which contains this information. I'm in general pretty careful with second sources, especially Databecker (you find a lot of nonsense in these books), but this one is pretty complete (but also contains nonsense you better filter out).
 

Offline olsen

Re: MFM decode
« Reply #6 on: December 18, 2017, 03:04:28 PM »
Quote from: Thomas Richter;834190
For German readers, there is the Databecker "Floppybuch" for the Amiga which contains this information. I'm in general pretty careful with second sources, especially Databecker (you find a lot of nonsense in these books), but this one is pretty complete (but also contains nonsense you better filter out).


According to the "TrackSalve" source code, the checksums are calculated over the MFM-encoded header and sector data, respectively.

I think that the checksum algorithm works as follows:

Code: [Select]
ULONG
checksum(const ULONG * encoded_words,int num_words)
{
    const ULONG mask = 0x55555555;
    ULONG sum;

    sum = 0;

    while(num_words-- > 0)
        sum ^= (*encoded_words++);

    sum = ((sum >> 1) & mask) ^ (sum & mask);

    return(sum);
}


The XOR operation is quite handy here, I suppose, since it works regardless of whether the MFM fill bits are present or not. This is not the case for the IBM PC floppy disk format, which uses CRC values.
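
One way to see why XOR tolerates the fill bits: XORing the MFM longwords and masking off the clock positions gives the same value as XORing the decoded longwords and then folding the odd bits onto the even bits (the same fold as the last step of the function above, applied here to decoded data). The sketch below only demonstrates that equivalence. Whether trackdisk.device computes the checksum exactly this way is an assumption, and both function names are invented for the example.

```c
#include <stdint.h>

#define MFM_DATA_MASK 0x55555555u

/* XOR of the MFM-encoded longwords, clock bits masked off. */
uint32_t checksum_encoded(const uint32_t *mfm, int num_longs)
{
    uint32_t sum = 0;

    while (num_longs-- > 0)
        sum ^= *mfm++;

    return sum & MFM_DATA_MASK;
}

/* XOR of the decoded longwords, odd bits folded onto even bits. */
uint32_t checksum_decoded(const uint32_t *data, int num_longs)
{
    uint32_t sum = 0;

    while (num_longs-- > 0)
        sum ^= *data++;

    return ((sum >> 1) & MFM_DATA_MASK) ^ (sum & MFM_DATA_MASK);
}
```

For N decoded longwords, `checksum_encoded` is called with the 2*N MFM longwords (odd half and even half) and `checksum_decoded` with the N decoded ones; the clock bits can hold anything without changing the result.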

It might be worth looking up the old Amiga 68k NetBSD/Linux kernel floppy driver code for reference.
 

Offline olsen

Re: MFM decode
« Reply #7 on: December 18, 2017, 04:03:24 PM »
Quote from: orange;834180
does 'rawread' dump MFM encoded data? if so, how to decode it with given sync, track size, etc..? (not a standard 880K diskette)


If I understand this correctly, you can tell trackdisk.device in TD_RAWREAD mode to start reading as soon as it finds the sync pattern of your choice. This should save you the trouble of finding the beginning of the sector, which can be shifted by 1..15 bits.

You do need to know the sector size that is going to be used, though. In the standard Amiga format you'll encounter 32 bytes of header data, including the four bytes of sync pattern which introduce the header, in addition to the 512 bytes of sector data. In this format you'll need (32+512) * 2 = 1088 bytes worth of memory to read the MFM-encoded data of one sector.
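
The sizes can be cross-checked against the sector layout from the 1985 format description quoted earlier in the thread. The struct, field and constant names below are invented for this sketch and apply to the standard format only; a non-standard disk will need different numbers.

```c
#include <stdint.h>

/* Pre-encoding sector layout as listed in the 1985 format
 * description. All fields are byte arrays so that the compiler
 * inserts no padding.
 */
struct amiga_sector_raw {
    uint8_t zero[2];            /* two bytes of 00 (MFM: AAAA each) */
    uint8_t sync[2];            /* two A1 sync bytes (MFM: 4489 each) */
    uint8_t format;             /* FF for the Amiga 1.0 format */
    uint8_t track;
    uint8_t sector;
    uint8_t sectors_to_gap;     /* "sectors until end of write" */
    uint8_t os_recovery[16];
    uint8_t header_checksum[4];
    uint8_t data_checksum[4];
    uint8_t data[512];
};

enum {
    SECTOR_RAW_BYTES = sizeof(struct amiga_sector_raw), /* 32 + 512 */
    SECTOR_MFM_BYTES = SECTOR_RAW_BYTES * 2,            /* 1:2 encoding */
    TRACK_MFM_BYTES  = SECTOR_MFM_BYTES * 11            /* 11 sectors, gap excluded */
};
```

A raw read buffer has to be larger than `TRACK_MFM_BYTES`, since the read starts at a random position and also picks up the track gap.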
 

Offline orange (Topic starter)

Re: MFM decode
« Reply #8 on: December 18, 2017, 08:42:07 PM »
what does ' length 12656/4' mean in output?
how long is the header, what is its format?
thanks.


edit: is it like this:

OFFSET              Count TYPE   Description
0000h                   8 byte   'UAE-1ADF'
0008h                   4 byte   trackcount
000Ch                   4 byte   0=amigados 1=raw mfm
0010h                   4 byte   tracklength
0014h                   4 byte   tracklength in bits
0018h                   4 byte   0=amigados 1=raw mfm
...





I just can't find '0xAAAA AAAA 4489 4489'   :(

( http://lclevy.free.fr/adflib/adf_info.html#p23 )
« Last Edit: December 18, 2017, 11:15:50 PM by orange »
 

Offline olsen

Re: MFM decode
« Reply #9 on: December 19, 2017, 08:44:14 AM »
Quote from: orange;834196
what does ' length 12656/4' mean in output?
how long is the header, what is its format?
thanks.


edit: is it like this:

OFFSET              Count TYPE   Description
0000h                   8 byte   'UAE-1ADF'
0008h                   4 byte   trackcount
000Ch                   4 byte   0=amigados 1=raw mfm
0010h                   4 byte   tracklength
0014h                   4 byte   tracklength in bits
0018h                   4 byte   0=amigados 1=raw mfm
...
Shrug... this does not look like anything I would expect to find on a standard Amiga formatted floppy disk. Are you sure you are looking for MFM data? If this is the data structure layout, I would expect it to be a container format, not the contents.


Quote
I just can't find '0xAAAA AAAA 4489 4489'   :(

( http://lclevy.free.fr/adflib/adf_info.html#p23 )

You may not be able to see this pattern in the encoded MFM data at all. The thing is, this is a bit pattern, not a byte pattern. It can start in the MFM bit stream at virtually any position in the track buffer, but usually it's somewhere near the beginning of the buffer.

So, how do you find the bit position where it starts? The key is the 0xAAAA pattern, which either shows up as 0xAAAA in the MFM bit stream (if the header starts at an even bit position), or as 0x5555 (if it starts at an odd bit position).

The first step to decoding is to find out where the 0xAAAA bit pattern shows up. Because it covers 32 bits, you should be able to find it by looking for any two consecutive bytes which either read as 0xAA or as 0x55.

Code: [Select]
UWORD * mfm_buffer;
int mfm_buffer_size, i;
int num_words = mfm_buffer_size / sizeof(*mfm_buffer);
UWORD pattern;
int word_position = -1;

for(i = 0 ; i < num_words ; i++)
{
   if (mfm_buffer[i] == 0xAAAA)
   {
      pattern = 0xAAAA;
      word_position = i;
      break;
   }
   else if (mfm_buffer[i] == 0x5555)
   {
      pattern = 0x5555;
      word_position = i;
      break;
   }
}

/* Skip the pattern if it shows up again, which happens
 * if it started at the very first bit of the byte.
 */
if(word_position != -1 && word_position + 1 < num_words && mfm_buffer[word_position+1] == pattern)
  word_position++;

If these two bytes are part of a sector header, then they should be followed by two 0x4489 bit patterns in the next 0..14 bits. You need to figure out which bit position they show up at.

Code: [Select]
if(word_position != -1 && word_position + 1 < num_words)
{
   int bit_position = -1;
   ULONG match;

   match = (((ULONG)mfm_buffer[word_position]) << 16) | mfm_buffer[word_position+1];

   for(i = 0 ; i < 15 ; i++)
   {
      if(((match << i) & 0xFFFF0000) == 0x44890000)
      {
          bit_position = i;
          break;
      }
   }
}

At this point you should be able to tell if you found the byte and bit positions of the first 0x4489 sync bit pattern. The next step would be to check if the first 0x4489 pattern you found is followed by another one. If that's the case, you can begin to read the individual words, shift them as needed and reconstruct both the sector header and sector data in their MFM-encoded forms.

Please note that in production code the task of finding the sync words is usually table-driven and does not run in a loop which shifts bits around ;)
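
For experimenting on a PC with a dumped track, a simpler (and slower) alternative sketch is to shift the whole buffer through a 32-bit window one bit at a time and stop when the window equals two consecutive sync words. This does not depend on locating the 0xAAAA run first. The function name and return convention are invented for this example, and it assumes the buffer holds the bit stream as bytes, most significant bit first.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the bit offset (counted from the start of buf) of the
 * first bit after a 0x4489 0x4489 sync pair, or -1 if no such
 * pair is found.
 */
long find_sync_pair(const uint8_t *buf, size_t len)
{
    uint32_t window = 0;
    size_t bit;

    for (bit = 0; bit < len * 8; bit++) {
        unsigned b = (buf[bit / 8] >> (7 - bit % 8)) & 1u;

        window = (window << 1) | b;

        /* only compare once 32 real bits have been shifted in */
        if (bit >= 31 && window == 0x44894489u)
            return (long)(bit + 1);
    }
    return -1;
}
```

On a buffer that starts with the bytes AA AA AA AA 44 89 44 89, the function returns 64, so the MFM-encoded sector info begins at byte 8. A result that is not a multiple of 8 tells you by how many bits the rest of the data has to be shifted.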
« Last Edit: December 19, 2017, 08:48:48 AM by olsen »
 

Offline orange (Topic starter)

Re: MFM decode
« Reply #10 on: December 19, 2017, 08:56:34 AM »
Quote from: olsen;834214
Shrug... this does not look like anything I would expect to find on a standard Amiga formatted floppy disk. Are you sure you are looking for MFM data? If this is the data structure layout, I would expect it to be a container format, not the contents.

that is an output of rawread command, it writes 'extended' ADF format, at least when raw tracks are present.

Quote
You may not be able to see this pattern in the encoded MFM data at all. The thing is, this is a bit pattern, not a byte pattern. It can start in the MFM bit stream at virtually any position in the track buffer, but usually it's somewhere near the beginning of the buffer.

thanks. I was searching at bit-level, but will try again. perhaps rawread removes the sync?

Quote
So, how do you find the bit position where it starts? The key is the 0xAAAA pattern, which either shows up as 0xAAAA in the MFM bit stream (if the header starts at an even bit position), or as 0x5555 (if it starts at an odd bit position).

The first step to decoding is to find out where the 0xAAAA bit pattern shows up. Because it covers 32 bits, you should be able to find it by looking for any two consecutive bytes which either read as 0xAA or as 0x55.
...

thanks. will try.

I've tried encoding 'DOS' to MFM data (after splitting to odd and even bits?bytes?), but can't find the bit pattern in input.
« Last Edit: December 19, 2017, 09:20:34 AM by orange »
 

Offline olsen

Re: MFM decode
« Reply #11 on: December 19, 2017, 10:46:15 AM »
Quote from: orange;834215
that is an output of rawread command, it writes 'extended' ADF format, at least when raw tracks are present.
OK, so this is a container format after all.

Quote
thanks. I was searching at bit-level, but will try again. perhaps rawread removes the sync?
I don't know how the "rawread" command works (any pointers to the source code?), but if it uses the standard Amiga MFM encoded format, then it could drop the sync patterns because they are redundant in this container. Mind you, the sector header and sector data would still have to be preserved in properly-shifted form.

Quote
I've tried encoding 'DOS' to MFM data (after splitting to odd and even bits?bytes?), but can't find the bit pattern in input.
The odd and the even bits are encoded separately and stored separately (512 bytes apart). The encoding of the first bit of "DOS" may vary, depending upon the bit which preceded it. "D" = binary 01000100, which comes out as odd=0000 and even=1010 prior to encoding. That could be encoded either as odd=10101010 or odd=00101010 depending upon the preceding bit (sector data checksum) and even=01000100. So there's already a bit of ambiguity here.
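
The "D" example can be checked with a few lines of C. This sketch splits a byte into its odd and even bits and MFM-encodes each 4-bit group; the helper names are invented for the example, and `prev_bit` stands for whatever data bit precedes the group on disk.

```c
#include <stdint.h>

/* Bits 7, 5, 3, 1 of a byte, packed into a 4-bit value. */
unsigned odd_bits(uint8_t byte)
{
    return ((byte >> 4) & 8u) | ((byte >> 3) & 4u) |
           ((byte >> 2) & 2u) | ((byte >> 1) & 1u);
}

/* Bits 6, 4, 2, 0 of a byte, packed into a 4-bit value. */
unsigned even_bits(uint8_t byte)
{
    return ((byte >> 3) & 8u) | ((byte >> 2) & 4u) |
           ((byte >> 1) & 2u) | (byte & 1u);
}

/* MFM-encode four data bits into eight bits (clock, data, ...).
 * prev_bit is the data bit that came before the group.
 */
uint8_t mfm_encode_nibble(unsigned nibble, unsigned prev_bit)
{
    unsigned out = 0;
    int i;

    for (i = 3; i >= 0; i--) {
        unsigned bit = (nibble >> i) & 1u;
        unsigned clock = (prev_bit == 0 && bit == 0) ? 1u : 0u;

        out = (out << 2) | (clock << 1) | bit;
        prev_bit = bit;
    }
    return (uint8_t)out;
}
```

For 'D' this gives odd = 0000 and even = 1010; the odd group encodes to 10101010 or 00101010 depending on `prev_bit`, and the even group to 01000100 either way, because its first data bit is 1.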
 

Offline orange (Topic starter)

Re: MFM decode
« Reply #12 on: December 19, 2017, 01:32:03 PM »
ok, thanks.
finally found the problem.
I was using:
 $bitdata = unpack "b*",$data;
instead of
 $bitdata = unpack "B*",$data;

in perl :/
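
For anyone hitting the same thing: the two Perl templates differ only in bit order within each byte. "b" lists the bits from least significant to most significant, "B" from most significant to least. MFM patterns such as 4489 are written MSB first, so only the "B" form shows them as expected. A small C sketch of the two orderings (the function name is invented for this example):

```c
#include <stddef.h>
#include <stdint.h>

/* Write the bits of buf as '0'/'1' characters into out, which must
 * have room for len * 8 + 1 chars. A non-zero msb_first mimics
 * Perl's "B*" template, zero mimics "b*".
 */
void bit_string(const uint8_t *buf, size_t len, int msb_first, char *out)
{
    size_t i;
    int k;

    for (i = 0; i < len; i++) {
        for (k = 0; k < 8; k++) {
            int shift = msb_first ? 7 - k : k;

            *out++ = (char)('0' + ((buf[i] >> shift) & 1));
        }
    }
    *out = '\0';
}
```

For the two bytes 44 89, the MSB-first string is 0100010010001001, while the LSB-first string is 0010001010010001, which no longer contains the sync pattern in recognizable form.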
 

Offline kolla

Re: MFM decode
« Reply #13 on: December 19, 2017, 03:08:21 PM »
Quote from: orange;834223

finally found the problem.
...
perl :/


Indeed :D
B5D6A1D019D5D45BCC56F4782AC220D8B3E2A6CC
---
A3000/060CSPPC+CVPPC/128MB + 256MB BigRAM/Deneb USB
A4000/CS060/Mediator4000Di/Voodoo5/128MB
A1200/Blz1260/IndyAGA/192MB
A1200/Blz1260/64MB
A1200/Blz1230III/32MB
A1200/ACA1221
A600/V600v2/Subway USB
A600/Apollo630/32MB
A600/A6095
CD32/SX32/32MB/Plipbox
CD32/TF328
A500/V500v2
A500/MTec520
CDTV
MiSTer, MiST, FleaFPGAs and original Minimig
Peg1, SAM440 and Mac minis with MorphOS
 

Offline olsen

Re: MFM decode
« Reply #14 on: December 19, 2017, 03:29:05 PM »
Quote from: orange;834223
ok, thanks.
finally found the problem.
I was using:
 $bitdata = unpack "b*",$data;
instead of
 $bitdata = unpack "B*",$data;

in perl :/

Oh well... give 'C' a try, please. Only a fraction of the expressiveness that leads Perl users to their doom, but the same degree of catastrophic errors easily triggered by a mere single wrong character that is abstrusely difficult to spot ;)