balrog changed the topic of #discferret to: DiscFerret magnetic disc analyser project || http://www.discferret.com || Current versions: Microcode 002C, API v1.7r1 (mcode 002E in dev) || DFI import support in Disk-Utilities: http://git.io/SabCYA || Current work: disc write support, decoders, planning Merlin2 || logs at https://freenode.irclog.whitequark.org/discferret
dustinm`_ has quit [Quit: Leaving]
dustinm` has joined #discferret
philpem has quit [Ping timeout: 250 seconds]
Ralf has joined #discferret
philpem has joined #discferret
* whitequark knocks on wood
<whitequark> anyone here alive?
<whitequark> balrog? Lord_Nightmare? philpem? curious what you think about my Oct 19 question
<Lord_Nightmare> sorry, was traveling all day yesterday
<Lord_Nightmare> was the question in here, or on twitter?
<Lord_Nightmare> oh, i see.
<Lord_Nightmare> [04:15:59] <whitequark> so i was wondering about the dfiv2 bitstream format
<Lord_Nightmare> [04:17:50] <whitequark> it seems strange to spend so much effort on the precise timing (down to 128 samples) of the index transition (which is actually far less precise due to the mechanical nature of the index sensor), only to completely lose the sign
<Lord_Nightmare> [04:16:53] <whitequark> it doesn't store the polarity of the index pulse, only that it changed
<Lord_Nightmare> [04:18:00] <whitequark> er, less than 128 samples*
<Lord_Nightmare> [04:21:13] <whitequark> i spent some time today drafting dfiv3 spec (as i've long been planning to), and thought about using a much simpler scheme: bit 7 is the logical value of INDEX line, bits 0-6 indicate time until current transition, or 0x7f if there's been 128 samples without one
<Lord_Nightmare> i think dfiv1 did something similar with bit 7, and i know for sure the catweasel format uses bit 7 exactly the way you explain
<Lord_Nightmare> the issue i have is if a track has large swaths of unformatted areas, and for some reason the drive does not inject any flux 'noise' into those areas
<Lord_Nightmare> you could have the index pulse half a revolution off or something
<Lord_Nightmare> some particularly nasty apple2 disks do stuff like this (choplifter, i think?) where half of each track isn't formatted at all
<Lord_Nightmare> and the formatted part 'rotates' around the disk, with the software expecting the drive head to see it within a certain fraction of a revolution
<Lord_Nightmare> if the drive has no transitions at all for the entire half a rotation, the index pulse update, on the first actual transition, could be very far off of where it really happened
<Lord_Nightmare> for mfm disks this doesn't matter
<Lord_Nightmare> but for bizarre copy protected disks, it does.
<whitequark> Lord_Nightmare: but I address that, no?
<whitequark> if there are no transitions at all for half a rotation, you would get a stream of 0x7f 0x7f 0x7f ... 0xff (index) 0x7f 0x7f
<whitequark> so the maximum uncertainty is still around 2 bits at 25 MHz, definitely much less than half a revolution
<Lord_Nightmare> ah so the farthest it can be off is 128 bits
<Lord_Nightmare> that's not so bad
<Lord_Nightmare> 128 flux time slots i mean
<whitequark> 128 samples
<Lord_Nightmare> yes
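A minimal sketch of how the draft byte scheme discussed above could be decoded. The layout is an assumption taken from the chat, not a published spec: bit 7 carries the logical INDEX level, bits 0-6 carry the number of samples elapsed before a flux transition, and 0x7f in bits 0-6 means 0x7f samples passed with no transition. The function name and the choice to timestamp an INDEX edge at the end of its byte's window are hypothetical.

```python
def decode_dfiv3_draft(stream):
    """Decode the draft byte scheme into absolute sample times.

    Assumed layout (not a published spec): bit 7 = logical INDEX level,
    bits 0-6 = samples elapsed before a flux transition, and 0x7f in
    bits 0-6 = 0x7f samples elapsed with no transition.
    """
    now = 0              # running time in samples
    transitions = []     # sample times of flux transitions
    index_edges = []     # (sample time, new INDEX level) pairs
    last_level = None
    for byte in stream:
        level, delta = byte >> 7, byte & 0x7F
        now += delta
        if delta != 0x7F:
            transitions.append(now)
        if last_level is not None and level != last_level:
            # The edge happened somewhere inside this byte's window,
            # so its position is only known to within 0x7f samples.
            index_edges.append((now, level))
        last_level = level
    return transitions, index_edges


# The unformatted-track case from the chat: no flux for a long stretch,
# with INDEX asserted during one window.
flux, edges = decode_dfiv3_draft(bytes([0x7F, 0x7F, 0xFF, 0x7F, 0x10]))
print(flux, edges)
```

Even with no flux transitions at all, the INDEX level is still sampled once per byte, which is why the worst-case error stays bounded by one window rather than by the length of the unformatted area.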
<Lord_Nightmare> philpem: you around?
<Lord_Nightmare> whitequark: here's an interesting question: if dfiv3 breaks up track images by 'from the rising edge of index on rotation 1 to the falling edge of index on rotation 2', do we store the time in samples between the rising edges of the index pulses as track metadata (i.e. disk rotation speed), or do we let the client deal with calculating that?
<Lord_Nightmare> also if we store 5 images of a track, do we store them all end-to-end so the area where index is active is not duplicated, or do we store them as separate IFF blocks, with the area where index was active on rotation 2 duplicated at the end of the 'track image' of rotation 1 and the beginning of the 'track image' of rotation 2 ?
<Lord_Nightmare> i see pros and cons of doing it either way
<Lord_Nightmare> obviously the sample rate needs to be stored with every track image, the drive RPM could be optionally stored, but assuming index works, can be derived from the index pulses in the data and the sample rate
<Lord_Nightmare> however i have at least one drive with no index sensor, where you'd have to manually store the RPM and manually figure out the track splice by autocorrelation
<Lord_Nightmare> the kodak 3.3mb drive has no index sensor
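For drives that do have an index sensor, the derivation mentioned above (RPM from the index pulses and the sample rate) is a one-line calculation. This is a sketch only; the function name is hypothetical, and the 25 MHz rate is just the figure used earlier in the discussion.

```python
def rpm_from_index(samples_per_rev, sample_rate_hz):
    """Rotation speed from the sample count between two leading INDEX edges."""
    return 60.0 * sample_rate_hz / samples_per_rev


# 5,000,000 samples between index edges at 25 MHz is one revolution
# every 0.2 s, i.e. a nominal 300 RPM drive.
print(rpm_from_index(5_000_000, 25_000_000))
```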
<whitequark> Lord_Nightmare: IMO the entire contiguous read should be stored as one unbroken data stream
<whitequark> no matter what the index pulses are, how many revolutions there were, or how long it is
<pjustice> Stuff I read as two passes caused me some issues, as some software didn't understand and tried to make very long tracks.
<whitequark> we can always split the contiguous read using a simple tool, but it's more annoying to go back
<whitequark> also, one advantage to using the contiguous read is that it's easy to train a PLL on that, even if it doesn't lock right away
<whitequark> or maybe one part of the disc slows down for some reason (mechanical friction), if you have the entire read, you can correct for that easily
<whitequark> not so much if it's split into sectors
<pjustice> I'm not talking about sectors, I'm talking about asking the tool to pull two revolutions of each track.
<whitequark> oh sorry, i mentioned two distinct points, only the first one is a direct response to you
<pjustice> ok
<whitequark> with the proposed format, splitting multi-revolution image to single-revolution is 2 lines of C
<whitequark> so if this ever comes up, it doesn't seem like it would be hard to deal with
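A rough illustration of the split described above, written in Python rather than C. It assumes the draft layout from the chat (bit 7 of each byte is the logical INDEX level, with 1 meaning active) and treats every 0-to-1 change of that bit as the start of a revolution. The function name is hypothetical, and partial revolutions before the first edge and after the last one are simply dropped.

```python
def split_revolutions(stream):
    """Split a contiguous read into whole revolutions."""
    # A revolution starts at each byte where bit 7 goes from 0 to 1.
    starts = [i for i in range(1, len(stream))
              if stream[i] & 0x80 and not stream[i - 1] & 0x80]
    return [stream[a:b] for a, b in zip(starts, starts[1:])]


read = bytes([0x10, 0x90, 0x10, 0x20, 0x90, 0x11, 0x12, 0x95, 0x05])
revs = split_revolutions(read)
print(revs)
```

Going the other way (rejoining separately stored revolutions) would need the overlap around the index pulse to be reconciled, which is the asymmetry whitequark points out.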
<Lord_Nightmare> so what is an optimal number of revolutions to store? 5?
<Lord_Nightmare> for wabash disks where you want the head on the track for as few revolutions as possible, maybe 2 or 3
<whitequark> the current default in glasgow is 2
<Lord_Nightmare> should be definable by a commandline argument
<whitequark> and you can override it
<whitequark> actually, for wabash disks, i would do something more interesting
<whitequark> so one of the reasons i want dfiv3 is to be able to store all 3 of flux data, demodulated data, and sector data, right?
<whitequark> i'm thinking that i would add some gateware to do mfm decoding *in parallel* to gateware that streams flux info
<whitequark> and then you can add a condition that if you've got every sector with valid CRC, move on right away
<whitequark> otherwise try again up to defined number of attempts
<whitequark> in fact, in this case, you don't even necessarily need to wait for the index pulse at all
<whitequark> if it decodes perfectly, you can do 1 revolution per track
<whitequark> (well, slightly more, to account for head movement)
<Lord_Nightmare> yes
<whitequark> Lord_Nightmare: in practice, on my sample of floppies, and if i constrain the NCO quite tightly to the expected frequency, 2 revolutions works very well for the outer 70% of the drive, and then the inner 30% could have some issues
<whitequark> so it might make sense to vary that with track, too
<Lord_Nightmare> you could also say 'do a max of 2 revolutions, then move on' and go back and do the track over at the end from seeking the other direction
<whitequark> yeah, a number of strategies
<whitequark> storing decoded sectors in the dfiv3 file itself also allows any tool to easily restart the process where it stopped
<Lord_Nightmare> so each track image iff 'block' (i'm assuming dfiv3 will be an IFF/RIFF format) will have a sub-block for the decoded sectors?
<whitequark> it didn't occur to me to call it IFF (I call it TLV), but I looked it up and IFF is identical to what I came up with, with the exception of padding size
<whitequark> so yes
<whitequark> yes re IFF/RIFF, that is
<Lord_Nightmare> DISK
<Lord_Nightmare> SIDE
<Lord_Nightmare> |
<Lord_Nightmare> TRAK
<Lord_Nightmare> |
<Lord_Nightmare> |
<Lord_Nightmare> SECT
<Lord_Nightmare> |
<Lord_Nightmare> SECT
<Lord_Nightmare> ...
<Lord_Nightmare> SECT
<Lord_Nightmare> |
<Lord_Nightmare> SECT
<Lord_Nightmare> TRAK
<Lord_Nightmare> |
<Lord_Nightmare> something like that?
<whitequark> hmm, I had a non-hierarchical format in mind, and here is a good reason why
<whitequark> I am thinking a dfiv3 file could function as a "workspace", as a sort of an append-only data recovery log
<whitequark> first you extract and append (to an empty file) the flux data
<whitequark> then you demodulate each track and append it again
<Lord_Nightmare> TRAK would hold the flux image of the track, and also the number of revolutions, and would also hold any track-specific metadata (samplerate, rpm if needed, etc); the SECT subsections would contain the decoded sectors and their metadata (size, decode type, crc, whether crc is valid, etc)
<whitequark> then you decode the sectors and append them
<Lord_Nightmare> aha, so you'd instead do:
<Lord_Nightmare> DFI3
<Lord_Nightmare> <raw portion follows>
<Lord_Nightmare> DISK
<Lord_Nightmare> |
<Lord_Nightmare> |
<Lord_Nightmare> SIDE
<Lord_Nightmare> |
<Lord_Nightmare> TRAK
<Lord_Nightmare> |
<Lord_Nightmare> TRAK
<Lord_Nightmare> ...
<Lord_Nightmare> TRAK
<Lord_Nightmare> |
<Lord_Nightmare> TRAK
<Lord_Nightmare> SIDE
<Lord_Nightmare> ...
<Lord_Nightmare> TRAK
<Lord_Nightmare> <decoded block, which is appended later, follows, either under the DISK hierarchy or appended to the end of the file after the DISK block>
<Lord_Nightmare> DCOD
<Lord_Nightmare> |
<Lord_Nightmare> SIDE
<Lord_Nightmare> etc
<whitequark> hmm
<Lord_Nightmare> that way you only have to rebuild the 'DCOD' hierarchy when redoing/continuing decoding, and the 'DISK' hierarchy remains static and is the raw flux images per track
<pjustice> If you did it flat, but included a session id or some such in each block as well as c/h/s, you could construct a hierarchy when you wanted one, but treat the thing as log-structured write-once.
<whitequark> ^ pjustice gets what I want
<whitequark> so here's one thing I want to solve
<pjustice> Which I like from the archival perspective.
<whitequark> suppose I am streaming tracks and sectors from glasgow, and then USB times out (latency spike larger than its buffer), or just someone trips on the cable
<whitequark> or something crashes, you get the idea
<whitequark> if the file is completely flat, I can do fflush() once I wrote a FLUX; SECT...SECT for each track
<whitequark> and be done with it
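A sketch of the flat, append-only write path described above, under stated assumptions: each chunk is a 4-byte tag, a 4-byte big-endian length, and the payload, with no padding (the chat only says the real layout is IFF-like apart from padding size). The helper names `append_chunk`, `commit_track`, and `scan_chunks` are hypothetical. One flush is issued per track, after the FLUX chunk and its SECT chunks, and the reader stops cleanly at a truncated tail.

```python
import io
import struct


def append_chunk(f, tag, payload):
    """Append one tag-length-value chunk (assumed layout, no padding)."""
    f.write(tag + struct.pack(">I", len(payload)) + payload)


def commit_track(f, flux, sectors):
    """Write FLUX; SECT...SECT for one track, then flush once."""
    append_chunk(f, b"FLUX", flux)
    for sector in sectors:
        append_chunk(f, b"SECT", sector)
    f.flush()


def scan_chunks(f):
    """Linear scan from the start, ignoring an incomplete final chunk."""
    f.seek(0)
    while True:
        header = f.read(8)
        if len(header) < 8:
            return
        tag, size = struct.unpack(">4sI", header)
        payload = f.read(size)
        if len(payload) < size:
            return
        yield tag, payload


log = io.BytesIO()
commit_track(log, b"flux-bytes", [b"sector-0", b"sector-1"])
log.write(b"FLUX\x00\x00")  # simulate a crash in the middle of a header
tags = [tag for tag, _ in scan_chunks(log)]
print(tags)
```

Python's `flush()` plays the role of `fflush()` here: it hands the data to the OS. Surviving an OS crash or power loss would additionally need something like `os.fsync()` on a real file.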
<Lord_Nightmare> ok, so you have a FLUX block per track, and once that's imaged, you immediately start grabbing another from glasgow
<whitequark> yes, and from another EP I have SECT blocks per sector coming
<whitequark> or maybe not coming if the glasgow can't decode the disk
<Lord_Nightmare> or do you do two threads, one imaging each FLUX block and another decoding the previously read one on another core to make SECT blocks?
<whitequark> and I just save both FLUX and SECT as soon as an entire chunk arrives
<whitequark> you could decode on another core too, which would be similar format-wise
<pjustice> I'd suggest imposing only a minimum of ordering on the blocks. Maybe a start-of-session block that includes session id, date, time, operator comments, and then any sequence of flux, sect, whatever.
<whitequark> yep, that is what I want
<Lord_Nightmare> diskblitz from the applesauce project wanted to change the a2r(flux) and woz(cleaned flux guaranteed to decode) formats to allow session based editing stuff recently, but changed his mind at some point. might be interesting to talk to him too
<Lord_Nightmare> pjustice: i'm trying to understand still
<whitequark> Lord_Nightmare: think of it as a sort of "knowledge base" for this specific disk
<Lord_Nightmare> so you'd have the global DFI3 scope, under that a SESSION block with a timestamp, then under that have FLUX blocks, each of which stores side and (presumed) track number as metadata, as well as optionally decoded SECT blocks produced during that session?
<whitequark> when your tool learns a new fact about the disk, it dumps it to the end of file right away
<Lord_Nightmare> so you can add more SESSION (SESN?) blocks later
<whitequark> yes
<Lord_Nightmare> which can add re-images of tracks, and different/better decoding of SECT blocks
<whitequark> yep
<Lord_Nightmare> so it stores a history of every change
<whitequark> yep
<pjustice> You can have a logical hierarchy of sorts, but within the file you have only a minimum of block ordering and a flat sequence of events, basically.
<whitequark> and it never actually *modifies* the data in the file, so you can't *lose* it even if your OS crashes or whatever
<whitequark> it only appends, like a proper database
<whitequark> each time a tool starts, it does a complete linear scan
<whitequark> well, if it wants to know what's already there
<Lord_Nightmare> that could get very slow if the file gets very large, and as it builds the 'best' decoding/flux images for each track
<Lord_Nightmare> i guess you could make some sort of extra file alongside the main file which is a quick index to the 'final/best' decodings of each thing
<pjustice> You could consider writing, at the end of each session, an index block, and a fixed-size "start of last index" block.
<Lord_Nightmare> that will make the file even larger, but has advantages too since it makes reading much faster
<whitequark> if making it very fast on spinning rust is important, yes, we could add an index
<whitequark> if people will mostly use this on SSDs then it won't be slow in the first place
<Lord_Nightmare> i.e. the last 32 or 64 bit word at the very end of the file is the offset of the final INDX block
<whitequark> since you are doing, pessimistically, on the order of 10k pointer dereferences in an mmapped file
<Lord_Nightmare> and the INDX block has a list of all the important offsets as of the end of the previous session
<pjustice> If we're talking about floppies, I can't see the size of the file being that big a deal, even on spinning rust or slower machines.
<Lord_Nightmare> .dfi files in dfiv2 can be 60+mb
<pjustice> And?
<Lord_Nightmare> and applesauce .a2r files can be even larger
<pjustice> If hard drives are involved, then maybe it's a bigger issue.
<Lord_Nightmare> adding a new INDX block at the end is a good idea
<Lord_Nightmare> so you can fully play back the entire file history of every session
<pjustice> and don't make the final pointer freestanding; make it a block type, just fixed length, so that reading it is seek to end - pointerblock size, read pointerblock size
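A sketch of the end-of-session index plus fixed-length pointer block described above. Everything about the layout is assumed for illustration: chunks are a 4-byte tag plus a 4-byte big-endian length, the INDX payload is a plain list of 64-bit offsets, and the pointer block uses the made-up tag `IPTR` with a single 64-bit offset. Reading it is exactly "seek to end minus pointer block size, read pointer block size".

```python
import io
import struct

# Hypothetical fixed-length pointer block: tag, payload length (always 8),
# and the file offset of the most recent INDX chunk. 16 bytes in total.
PTR = struct.Struct(">4sIQ")


def end_session(f, offsets):
    """Append an INDX chunk followed by the fixed-size pointer block."""
    f.seek(0, io.SEEK_END)
    indx_at = f.tell()
    body = b"".join(struct.pack(">Q", o) for o in offsets)
    f.write(b"INDX" + struct.pack(">I", len(body)) + body)
    f.write(PTR.pack(b"IPTR", 8, indx_at))
    f.flush()


def load_index(f):
    """Return the latest index, or None if the caller must rescan."""
    f.seek(-PTR.size, io.SEEK_END)
    tag, size, indx_at = PTR.unpack(f.read(PTR.size))
    if tag != b"IPTR" or size != 8:
        return None
    f.seek(indx_at)
    tag, size = struct.unpack(">4sI", f.read(8))
    if tag != b"INDX":
        return None
    return [o for (o,) in struct.iter_unpack(">Q", f.read(size))]


log = io.BytesIO(b"FLUX\x00\x00\x00\x02ab")
end_session(log, [0])
first = load_index(log)
end_session(log, [0, 42])
second = load_index(log)
print(first, second)
```

Because the pointer block is an ordinary chunk, a linear scan can skip it like any other, and a file whose last session ended without one simply falls back to the scan.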
<whitequark> no objection to INDX
<Lord_Nightmare> you could also make a 'cooker' which grabs only the most recent/relevant blocks from the file to make a cooked version which will discard all the bad redundant data
<whitequark> yes, compaction
<Lord_Nightmare> as another file
<pjustice> And sort them into a sane order
<Lord_Nightmare> if the INDX block is missing or corrupt, you can manually re-read the entire file and rebuild it
<Lord_Nightmare> there's a lot of redundancy
<whitequark> yep
<Lord_Nightmare> since there's manu INDX blocks, one per session, you can manually find the previous INDX block if one exists, and use that as a shortcut when rebuilding the INDX for the final session (which presumably had a corrupt INDX)
<Lord_Nightmare> there's MANY INDX blocks
<pjustice> Truncate from the last sane session-end, if needed.
<Lord_Nightmare> i'm assuming each INDX contains the entire file index
<whitequark> yes. this is basically how EOCD works in zip
<Lord_Nightmare> though, you COULD have each INDX just have an index for this session only, and a pointer to the previous INDX
<Lord_Nightmare> but that's a bit slower to read, but probably not too bad
<Lord_Nightmare> unless you have thousands of sessions
<pjustice> Can't imagine that's going to be the case.
<pjustice> Another case for multiple sessions is trying different flux decoders against the raw flux, which suggests that there should be a way to record which decoder did a FLUX->SECT conversion.
<pjustice> Also, sadly, second system syndrome? :)
<whitequark> this doesn't seem all that complicated to me
<whitequark> it's a format you could make a tool for in one evening
<Lord_Nightmare> there's disks where there are two sector 0s on track 0, one in FM format (tells the trs-80 to display a short message 'SWITCH THE DAMN DISKETTE CARD INTO MFM MODE DUMMY' or something like that)
<Lord_Nightmare> and one in MFM format (the actual boot sector)
<whitequark> so SECT would store track, head, modulation, sector format
<whitequark> sector number, name of decoder, whether the CRC matched
<whitequark> possibly a link back to the DCOD chunk from which it came
<Lord_Nightmare> one very nasty format is the format used by the canon cat on track 0 side 0 (cat only uses side 0 of each disk)
<Lord_Nightmare> it stores sector -1 (mfm sector id 0x00, which is technically sector -1 or sector 255) TEN TIMES, i believe all 10 copies are the same
<Lord_Nightmare> this is the diskette ID number
<Lord_Nightmare> normally sector 0 is stored as sector 0x01 in the address mark header in MFM
<Lord_Nightmare> and karsten scheibler's CWTOOL utility is sadly hard-coded around this assumption, which turns out to be wrong for canon cat disks
<Lord_Nightmare> so it cannot image them
<whitequark> so let's define the sector number as whatever is in the MFM header
<whitequark> and then a separate LBA mapping table or something
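A sketch of a SECT record carrying the fields listed above, with the sector number kept as the raw value from the address mark header. The class, field names, and the selection rule are all hypothetical. Including the modulation in the lookup key keeps the FM and MFM "sector 0" of the TRS-80 example apart, and the selection function shows one way a "cooker" could pick a single record per sector: the latest one with a valid CRC, or the latest attempt if none decoded cleanly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sect:
    track: int
    head: int
    modulation: str   # e.g. "FM" or "MFM"
    sector_id: int    # raw value from the address mark header
    decoder: str      # name of the decoder that produced this record
    crc_ok: bool
    data: bytes = b""


def best_sectors(records):
    """Pick one record per sector from records given in file order."""
    best = {}
    for rec in records:
        key = (rec.track, rec.head, rec.modulation, rec.sector_id)
        old = best.get(key)
        # A later record wins unless it would replace a good CRC with a bad one.
        if old is None or rec.crc_ok or not old.crc_ok:
            best[key] = rec
    return [best[k] for k in sorted(best)]


history = [
    Sect(0, 0, "FM", 0, "dec-a", True),
    Sect(0, 0, "MFM", 0, "dec-a", False),
    Sect(0, 0, "MFM", 0, "dec-b", True),
    Sect(0, 0, "MFM", 0, "dec-c", False),
]
picked = [(s.modulation, s.decoder) for s in best_sectors(history)]
print(picked)
```

A mapping from these raw header IDs to logical block numbers would live in a separate table, as suggested above, so that formats like the Canon Cat's repeated sector ID 0x00 do not need special cases in the record itself.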
<balrog> huh what are y'all up to?
<balrog> whitequark: if you can, I also suggest reading the A2R spec (from applesauce project)
<balrog> since that's probably the most professional/friendly other group doing floppy dislk
<balrog> disk*
<philpem> @whitequark the index pulse sign?
<philpem> just reading this and trying to get my head around it... there is no sign, the DF stores the current counter value when it hits a trigger event
<pjustice> I've been wondering about this too, actually. I thought the only thing any of the systems cared about was pulse timing.
<whitequark> er, "sign" is a bad way to describe it
<whitequark> I just mean the level on \INDEX
<whitequark> i.e. recording both leading and trailing edge as separate events
<whitequark> or, are you saying that in effect the trailing edge of the index pulse is never important?
<whitequark> (I think I said "sign" where I meant "polarity" for some reason, oops)
<pjustice> I can see both edges being useful, I just thought you were talking about which direction the change went.
<whitequark> so right now, in dfiv2, you don't actually know the polarity of the index pulse a priori
<whitequark> at least, not as specified
<pjustice> Fill in the hardware details for me: I thought the normal level and the "hole going by level" were standardized, and would be logic levels.
<pjustice> Or does Shugart fall apart even on such basic things?
<whitequark> sure, but then you have to guess polarity from timings
<whitequark> which seems icky
<pjustice> Ok, I see.
<pjustice> Thanks for cluex4.
<whitequark> balrog: I did look at A2R
<whitequark> balrog: it's the same basic idea, but not quite as advanced
<whitequark> for example, i want to have one format that would be A2R+WOZ
<cr1901_modern> philpem: cross-post from ##yamahasynths, where we talk about floppy disk preservation now, but...
<cr1901_modern> Do you have the spoons to elaborate re: what SPS did to you?
<cr1901_modern> >(4:29:20 PM) Lord_Nightmare: ask philpem about what they did to him
coalhot_ has joined #discferret
coalhot has quit [Ping timeout: 240 seconds]
balrog has quit [Ping timeout: 250 seconds]
balrog has joined #discferret
coalhot_ has quit [Ping timeout: 268 seconds]
coalhot has joined #discferret
<philpem> @whitequark the leading/active edge qualifies the index (I think the 765 datasheet says this - if not then maybe the WD1770)
<philpem> @cr1901_modern: oh god, please... I've only just finished repressing that memory...
<cr1901_modern> philpem: Ack, no problem. No worries
<whitequark> philpem: aha
<philpem> I mean, the little uncontested bits... they have contracts with a bunch of UK computer museums and got really angry when I started talking to them about discferret
<whitequark> if only the leading edge is ever useful, then the dfiv2 format is obviously sufficient and superior to what I suggest
<philpem> there was a huge argument, EAB (think it was EAB, some amiga forum anyway) went... rather mad.
<philpem> ask jason scott about his encounters with them!
<philpem> (KF/SPS)
<whitequark> i vaguely recall him warning me about KF
<philpem> @whitequark I actually haven't read through all of what you said in the scrollback yet, sorry
<whitequark> that was a bit surreal
<cr1901_modern> A long time ago, there was a product from "Device Side Data" for reading 5.25" floppies. Anyone familiar w/ them?
<cr1901_modern> I think it might even predate KryoFlux a bit
<cr1901_modern> anyways, I didn't know about all this drama back when I learned about KF... didn't realize it was even more of a shitshow than MAME
<philpem> hah. they're really controlling and act accordingly... that's about the size of it.
<whitequark> i think no less than three people independently contacted me in private to warn about KF
<whitequark> it was ... something
<whitequark> it's just fuckin floppies man
<philpem> I usually keep quiet about it, it's not worth the drama these days
<philpem> also kinda have to if I'm making what is effectively a competing product
<philpem> I really want to do what Applesauce does (that cool swirly spiral disk content display) but for MFM formats...
<cr1901_modern> https://dallibraries.atlassian.net/wiki/spaces/DFL/pages/522256385/Device+Side+Data+FC5025+floppy+controller Wonder what actual chip is being used for the "5025 FDC"
<Lord_Nightmare> philpem: you should talk to diskblitz about it; i think the reason he keeps his stuff closed source is also because of SPS
<Lord_Nightmare> i think they asked him to 'donate' his software code to them! the sheer gall of them
<Lord_Nightmare> maybe it was more like "we're making an offer you can't refuse" sort of things
<Lord_Nightmare> well, fuck SPS
<Lord_Nightmare> with a rake
<philpem> @Lord_Nightmare: yikes
<philpem> yeah that sounds like what they did with me
<Lord_Nightmare> his software, which is free (but closed source) i think is getting to the point of being superior to SPS's stuff
<Lord_Nightmare> unfortunately there's no windows/linux port, and it's written in swift/objc so it isn't portable at the moment
<Lord_Nightmare> but he's open to discussing stuff
<Lord_Nightmare> the actual applesauce hardware/device is also fairly generic internally, it's a teensy/arm running the show, and the actual interface between the teensy and the mac-side applesauce software is fairly simple, and semi-documented, and will eventually be open spec
<Lord_Nightmare> people are already working on a replacement backboard riser for the applesauce to allow it to connect to a minishugart PC floppy drive interface
<pjustice> whitequark, has anyone done anything with your glasgow and 8" drives?
<Lord_Nightmare> I have one of those dbit http://www.dbit.com/fdadap.html minishugart-to-8" interface boards with the little MCU on it which shows the current track number and correctly asserts the TG43 pin which needs to go active on tracks over 43 in order to activate special write precompensation stuff inside the 8" drive
<Lord_Nightmare> the Tandon 8" drive i have however has its own MCU on it which i think automatically generates that signal internally
<Lord_Nightmare> balrog: did we ever dump that MCU? is it even possible to dump it?
<Lord_Nightmare> i think it might be a 4-bitter
<Lord_Nightmare> and a Ferranti ULA
<balrog> hey sorry
<balrog> been hacking away on yahogroup archival crap
<balrog> yahoogroup*
<Lord_Nightmare> that's more important for the moment
<pjustice> Yeah, I have one too.
<pjustice> That and my DF board is how I've read 8" floppies.
<Lord_Nightmare> that reminds me, did anyone ever make a generic fpga replacement for ferranti ulas, in the bbc tube and etc stuff?
<Lord_Nightmare> the actual way the ULA is laid out internally is known, it's all programmed with 2 mask layers, the other 4? layers are static
<Lord_Nightmare> it's basically a mask programmed fixed gate array
<Lord_Nightmare> where you have a grid of something like 128x128 NAND/NOR gates which you can route and connect together to build other stuff from, like latches etc
<Lord_Nightmare> it should be possible, just from the netlist of how the gates are connected together, to auto-generate a verilog or vhdl file for an fpga to replace the ula with exactly
Ralf has quit [Quit: Leaving]