<gregdavill>
actually, implementing the FLASH layer on top looks reasonably straightforward. I should be able to re-use all the logic in applet.memory._25x
<gregdavill>
Just add some commands to put the ECP5 into spi_background mode before running anything, maybe just some trailing commands if the user wants to force a REFRESH/check status.
<gregdavill>
Then override Memory25xInterface._command() to funnel requests through the JTAG interface, instead of SPI.
<whitequark>
gregdavill: yup, something like it! I try to make applets reusable if possible
<whitequark>
we might perhaps need to tidy it up a bit (like add an abstract base class or something) but that's the basic idea.
<gregdavill>
Did you want me to make a separate PR for program.ecp5_flash? Or have them together?
<whitequark>
definitely separate
<whitequark>
also, jsyk, i might be able to review your PRs soon, but i also might have to delay that until about 30th
<whitequark>
no further though
<gregdavill>
No worries. There is no hurry here, I just wanted to start the process. So these additions didn't just sit locally and bit-rot.
<whitequark>
so we're on the same page, great!
<gregdavill>
So that was pretty easy! Glasgow has just uploaded a bitstream into the FLASH on ECP5 via JTAG! \o/
andresmanelli has joined #glasgow
<andresmanelli>
Hello all, hope you're ok :)
<andresmanelli>
After a successful glasgow test (thanks for the support), I started to code the API for the SX1272 LoRa PHY
<andresmanelli>
I know for the moment only one applet can be run at a time .. and I was hoping to hook up two PHYs to test TX/RX
<andresmanelli>
Is there any way of having to SPIMaster instances and say two threads or something similar ? I don't know much of the low level code yet
<andresmanelli>
having two **
<andresmanelli>
Another possibility would be to assemble another glasgow (have the parts), but I'm not sure if I can choose which glasgow to use when running the software
<andresmanelli>
Any hints ? Thanks !
<ZirconiumX>
andresmanelli: There aren't really "threads", as such. If you want to do two things at once, you just do them on the same clock cycle
<andresmanelli>
Yeah I know, I mean, if an API blocks waiting for some register flag, how to interleave requests to use the other SPI bus ? I don't know if I'm explaining well
<ZirconiumX>
Have two independent FSMs, one for each SPI bus
<ZirconiumX>
That's probably about how you'll get two separate threads
<andresmanelli>
You say at the FPGA level
<andresmanelli>
Not at applet level
<ZirconiumX>
I would run this on the FPGA, yes
<whitequark>
andresmanelli: multi-applet support is a planned feature and a lot of infra is ready for it
<whitequark>
unfortunately, a lot more isn't
<whitequark>
if you can just use two of them, you can pass --serial with no issues
<whitequark>
that's what i often do
<whitequark>
in general, i've been very busy with other things in the last few months, so i'm not sure exactly when multi-applet support will arrive
<whitequark>
however, if you need two of them at most, you can hack it together a bit
<whitequark>
you can call .claim_interface() twice in the applet code (well, no more than twice) and it'll work fine
<whitequark>
you will effectively get two completely independent pipes to your gateware and could do whatever you want with them
<whitequark>
and they should share USB bandwidth quite efficiently too
<whitequark>
on the host, you'll just have `await spi1.transfer()` and `await spi2.transfer()`.
<andresmanelli>
Hey there whitequark, yes, I saw the open issue for multi applet ! I'd love to see a glasgow compose ish behaviour :)
<andresmanelli>
That's interesting
<andresmanelli>
One question though, the two calls will be sequential anyway right ? It solves my problem but just to be sure
<andresmanelli>
I mean there will always be only one run() that gets executed sequentially for the current applet
<whitequark>
andresmanelli: the two calls would be concurrent with each other
<whitequark>
well, it's a bit subtle, let me go into detail
<whitequark>
in case of SPI, .transfer() waits for a response, so if you just do `await spi1.transfer()` it would stop the progress on everything else until it's done
<whitequark>
you can use `asyncio.wait()` to run two transfers concurrently
<whitequark>
in that case, they'll transfer the data into the FPGA-side FIFOs in a bursty interleaved way, and the SPI cores will happily churn it out in parallel on the FPGA
<andresmanelli>
Now that's what I'm talking about :D
<andresmanelli>
I have to read a little bit more how asyncio works , not really a python developer. But that will probably work for my use case
<andresmanelli>
Thanks for the tip, I'll try that out and see if I can get both PHYs talking to each other
<whitequark>
andresmanelli: what kind of languages do you use?
<whitequark>
there are asyncio equivalents in most of them, maybe I can suggest an analogy
<andresmanelli>
Mostly C, a little Ada, learning Python. Maybe JS has something similar with promises ?
<whitequark>
any embedded C?
<andresmanelli>
Mostly embedded yes
<whitequark>
ok so you know how if you need to do something in response to interrupts, you write the function as an FSM so it can be "suspended" while waiting for another event?
<whitequark>
(write the ISR that is)
<andresmanelli>
Yes
<whitequark>
conceptually, if you write `async def` in Python, it is going to do a similar conversion, and when you write `await`, it's similar to switching to the next state in that FSM, remembering what you're waiting for so you know when to wake up, and returning
<whitequark>
think of an asyncio Python program as a huge collection of FSMs like that that all run concurrently, except you have a lot of syntactic sugar that makes writing them more pleasant, and you also can't get interrupted in the middle of normal Python code*, only on await points
<whitequark>
(* Unix signals can do that, but you really want to avoid that)
<whitequark>
so when the `spi1.transfer()` function does `await self.lower.write(...)`, it will write out the bytes to the USB FIFO, remember where it was, and return to the scheduler. then the other one gets an opportunity to do the same thing
<whitequark>
once the bytes are all in the FIFO, they get woken up, do `await self.lower.read(len)`, remember where they were, return to scheduler again
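The write/read hand-off described above can be sketched in a few lines. This is only an illustration: `fake_transfer` is a hypothetical stand-in, not Glasgow's real `transfer()`, and `asyncio.sleep(0)` is assumed here as a placeholder for `await self.lower.write(...)` and `await self.lower.read(len)`.

```python
import asyncio


async def fake_transfer(name, data, log):
    # Hypothetical stand-in for an SPI transfer(); not Glasgow's real code.
    log.append(f"{name}: write")
    await asyncio.sleep(0)  # placeholder for `await self.lower.write(...)`
    log.append(f"{name}: read")
    await asyncio.sleep(0)  # placeholder for `await self.lower.read(len)`
    log.append(f"{name}: done")
    return data


async def main():
    log = []
    # Each task is its own "FSM"; the scheduler resumes them in turn.
    t1 = asyncio.create_task(fake_transfer("spi1", b"\x01", log))
    t2 = asyncio.create_task(fake_transfer("spi2", b"\x02", log))
    await asyncio.wait([t1, t2])
    return log


if __name__ == "__main__":
    for line in asyncio.run(main()):
        print(line)
```

The log comes out interleaved (spi1 write, spi2 write, spi1 read, spi2 read, ...), since each coroutine only gives up control at its `await` points.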
<electronic_eel>
when you do spi1.transfer and spi2.transfer - how does the data of each transfer get to the correct part of the gateware? is there already some kind of marking in the packets? or is it done via different usb endpoints on the fx2?
<whitequark>
different USB EP
<andresmanelli>
So `asyncio.wait` would do the wait for two independent transfers ?
<whitequark>
that's why there is a limitation of 2 interfaces
<andresmanelli>
In their own FSMs lets say
<whitequark>
andresmanelli: yes, it's a utility function that checks *both* events instead of just one, since the `await` syntax itself doesn't let you wait for more than one event
<andresmanelli>
Nice
<whitequark>
and it works exactly like you are saying. specifically, it waits until both the FSM of `spi1.transfer()` and `spi2.transfer()` go to one of their final states
<whitequark>
(either returns or raises an exception; you can choose how you prioritize those, too)
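A minimal sketch of waiting on two transfers at once, and of choosing how exceptions are prioritized via `return_when`. The `transfer` coroutine, its delays, and the simulated error are all made up for illustration; only `asyncio.wait()` and its `return_when` constants are the real standard-library API. Note that on Python 3.11 and later `asyncio.wait()` expects tasks rather than bare coroutines.

```python
import asyncio


async def transfer(delay, data, fail=False):
    # Hypothetical stand-in for spi.transfer(); the delay fakes bus time.
    await asyncio.sleep(delay)
    if fail:
        raise IOError("simulated bus error")
    return data


async def wait_both(return_when):
    t1 = asyncio.create_task(transfer(0.05, b"\x01"))
    t2 = asyncio.create_task(transfer(0.01, b"\x02", fail=True))
    done, pending = await asyncio.wait({t1, t2}, return_when=return_when)
    summary = (t1 in done, t2 in done)
    # Cancel whatever is still running, then collect every outcome so that
    # no exception is left unretrieved.
    for task in pending:
        task.cancel()
    await asyncio.gather(t1, t2, return_exceptions=True)
    return summary


if __name__ == "__main__":
    print(asyncio.run(wait_both(asyncio.ALL_COMPLETED)))
    print(asyncio.run(wait_both(asyncio.FIRST_EXCEPTION)))
```

With `ALL_COMPLETED` both tasks reach a final state before `wait()` returns; with `FIRST_EXCEPTION` it returns as soon as the failing transfer raises, leaving the slower one pending.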
<whitequark>
electronic_eel: the problem with using four EP is that this cuts down on buffer size and impacts performance considerably
<whitequark>
on linux, it is not an issue to add a second configuration
<whitequark>
but on windows, you literally cannot switch configuration on the fly, not even in a kernel mode driver
<electronic_eel>
yes, also the fx2 doesn't have endless endpoints you could use
<whitequark>
you HAVE to write an inf file that says which configuration to use
<andresmanelli>
Well great, that should work for my applet, I'll try. Thank you whitequark for the explanation !
<whitequark>
which... completely defeats the purpose of having configurations in the first place?
<hl>
jesus christ, windows is that bad?
<andresmanelli>
Have to go now, but will get in touch with my results, and hopefully a PR at least for the arch before the applet is ready
<whitequark>
i think we'll have to switch to using alternate settings and add interlock on the FX2 that prevents you from choosing altsettings that would have conflicting EP numbers
* hl
adds it to the list
<electronic_eel>
yeah, windows and usb drivers, fun and joy forever
<whitequark>
hl: they actually tell you on MSDN that you can't do it even with a kernel mode driver
<whitequark>
"limitation of KMDF"
<hl>
what if you use WDM? :/
<whitequark>
in WDM too
<hl>
jesus
<hl>
whitequark: currently I'm writing a Windows filesystem driver targeting NT4 and later
<whitequark>
i mean, i would be overjoyed to be proven wrong
andresmanelli has quit [Remote host closed the connection]
<whitequark>
so please do recheck
<whitequark>
because this seems batshit insane to me and i really hope i'm just misreading something
<hl>
whitequark: this is part of a harebrained scheme of mine to be the first person to ever get windows booting with C: as a network filesystem
<hl>
well, suffice to say the above has inspired me to try
<electronic_eel>
so putting a target applet number into the first byte of each packet is the better solution than lots of different usb eps
<electronic_eel>
the same solution as from this company from glasgow
<electronic_eel>
(the city)
<hl>
oh right, and then there's USB3 STREAMS
<whitequark>
electronic_eel: yes :/
<whitequark>
we're going to end up with that at some point. i hate it
<whitequark>
it will be quite inefficient to do in python, too
<whitequark>
right now it's approaching zero-copy, packetizing 512-byte chunks in pure python would mean we'll likely no longer saturate USB2 with python code
<whitequark>
and also take a massive latency hit
<whitequark>
*511-byte
<electronic_eel>
packetizing is currently done in the kernel, or libusb?
<whitequark>
kernel
<whitequark>
might even be the HCI
<whitequark>
well, we can still let kernel/HCI packetize stuff, what we have to do is to insert a byte every 511 bytes of a data stream
<whitequark>
and do it really quickly
<whitequark>
which is not something python is good at
<whitequark>
i could try to do more of the things with memoryview() we currently do, but i am not very hopeful
<whitequark>
as it is, saturating USB2 with python code already requires a rather beefy machine, i don't really want to make it worse
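For reference, a rough sketch of what "insert a byte every 511 bytes" could look like with `memoryview()`. The framing here (one applet-number byte at the start of each 512-byte packet) is only the idea being discussed, not an existing Glasgow protocol, and `add_applet_headers` is a hypothetical name.

```python
def add_applet_headers(data, applet_id, packet_size=512):
    """Prefix every (packet_size - 1)-byte chunk of `data` with `applet_id`.

    Hypothetical framing for illustration; not an existing Glasgow format.
    """
    payload = packet_size - 1  # 511 data bytes per 512-byte packet
    src = memoryview(data)
    n_packets = (len(src) + payload - 1) // payload
    # Allocate the output once; slices of a memoryview don't copy.
    out = bytearray(len(src) + n_packets)
    dst = memoryview(out)
    for i in range(n_packets):
        chunk = src[i * payload:(i + 1) * payload]
        start = i * packet_size
        dst[start] = applet_id
        dst[start + 1:start + 1 + len(chunk)] = chunk
    return bytes(out)


if __name__ == "__main__":
    framed = add_applet_headers(bytes(1024), applet_id=2)
    print(len(framed))
```

Even with a single preallocated buffer, this still runs a Python-level loop once per packet, which is the overhead being worried about here.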
<electronic_eel>
can we be sure that the packets are cut by the kernel at the correct positions?
<whitequark>
yes
<electronic_eel>
or is there some codepath where the kernel decides to use smaller packets, like for sharing bandwidth?
<whitequark>
things would get seriously broken if that would not be done
<whitequark>
it's actually guaranteed by the spec
<whitequark>
i think in the past there were actually some issues with that, mostly around ZLPs, but in general, we can rely on that inasmuch as we can rely on anything that has USB in it
<whitequark>
even further: we have to rely on it, and we already do rely on it to an extent
<whitequark>
if the kernel chose to use differently sized packets, the performance cliff would be so severe that you'd get single MB/s transfer speed at best
<whitequark>
the thing is that USB uses "full sized packet" to indicate "more data to come", which is also the reason ZLPs exist
<electronic_eel>
ah, right
<whitequark>
and if e.g. glasgow responds with a 511-byte packet, the HCI will actually stop polling it for a while
<whitequark>
whereas if it gives a 512 byte packet back, the HCI may poll again even in the same microframe (IIRC this is not required but good controllers do it; I experimented a bit with this a while ago but may forget the details)
<whitequark>
(if you want the details to be 100% correct ask ktemkin)
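The "full-sized packet means more data to come" rule can be illustrated with a small helper. It is a simplified model of how a bulk transfer splits into packets, assuming a 512-byte maximum packet size and a receiver that does not know the transfer length in advance; `packet_lengths` is a hypothetical name, not part of any USB library.

```python
def packet_lengths(n, max_packet=512):
    """Return the packet sizes used to send `n` bytes as one transfer.

    Simplified model: a short packet ends the transfer, so a transfer whose
    length is an exact multiple of `max_packet` needs a trailing
    zero-length packet (ZLP) to mark the end.
    """
    full, rest = divmod(n, max_packet)
    lengths = [max_packet] * full
    # rest > 0: an ordinary short packet; rest == 0: a ZLP.
    lengths.append(rest)
    return lengths


if __name__ == "__main__":
    for n in (100, 1023, 1024):
        print(n, packet_lengths(n))
```

So a 1023-byte transfer ends on a 511-byte short packet, while a 1024-byte one needs two full packets followed by a ZLP.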
<sorear>
you only need to insert byte numbers in the rare case where more than 2 applets are running, though?
<whitequark>
sorear: with the current architecture, yes
<whitequark>
but the problem here is the case of, what if i want to interface with my glasgow *not* from the python vertically integrated framework?
<whitequark>
glasgow currently lacks an "ABI" and the more special cases we add, the worse it will get
<whitequark>
what i'm thinking of is to write a Rust server that abstracts the awful innards of USB and say "if you want to use Glasgow from not-Python, you just have to use that".
<whitequark>
(and Python would get faster with it, but it's not required)
<electronic_eel>
I think the server thing is a good idea, because the other program can't easily find out the internal applet numbers and where to send packets destined for other applets
<whitequark>
yes, exactly
<whitequark>
the other idea would be to stuff some JSON to self-describe the applets into the device itself
<whitequark>
but... it has so little memory that it is a serious problem of where it will be stored
<whitequark>
sure, you can use a more efficient serialization, compress it, but it never goes away
<electronic_eel>
later with ethernet-enabled glasgows, the server would go away and the applets just have tcp sockets on glasgow
<whitequark>
what if the applet has a fuckton of metadata?
<whitequark>
yes, precisely that is the plan
<electronic_eel>
I don't like the metadata idea, because what if you want to have two non-python programs running at the same time - how do they coordinate access?
<electronic_eel>
that would make the ABI much more complex
<whitequark>
indeed
<whitequark>
it has a lot of flaws
<sorear>
how does any of this work without nmigen?
<electronic_eel>
I think the other program doesn't create its own gateware, but uses a nmigen-created gateware that is already loaded onto glasgow
<electronic_eel>
maybe it would make sense to add some kind of control channel to the server, where the other program can request an applet with specific configuration and the server then calls python to create and load it
<electronic_eel>
this control channel could also be implemented directly in python of course
<whitequark>
sorear: you configure it via the Python framework, but then use in some other way