<sb0>
is anyone using misoc CSRStorage.atomic_write or can I change the semantics?
<sb0>
of course, llvm has to use the opposite write order from the misoc csr functions. 50% chance, and murphy's law does the rest...
<davidc__>
I think it's used in a project I've worked on, but it wouldn't be a big deal to fix it (the code is in asm). I'm sure there's a compiler out there that does the opposite of LLVM, so there's probably no good answer :)
<sb0>
whitequark: does llvm on or1k require that 64-bit locations be aligned to 64-bit boundaries?
<whitequark>
sb0: nope
<whitequark>
or1k abi does not require any 64 bit alignment anywhere iirc
<sb0>
davidc__: right, so i'll just keep the modification contained in this part of artiq
<sb0>
but if I use the two commented-out lines instead, there are strange bugs
<sb0>
whitequark: and do you confirm that llvm stores the lower word first and then the higher one? it's not me misreading the asm code?
<whitequark>
let me look up the codegen for that
<whitequark>
it's in the legalization code
<sb0>
btw this now-pinning doesn't seem to have any impact on performance, at least not for test_pulse_rate
<whitequark>
sb0: ok, I looked it up in the type legalizer, and no, LLVM does not in fact impose any specific ordering on the two stores. sorry for the misinformation earlier.
<whitequark>
you will need to split it manually.
<sb0>
whitequark: sounds rather complicated to do without performance issues. how hard would it be to enforce ordering in llvm?
<whitequark>
sb0: this is a patch in the very core of llvm's target-independent backend
<whitequark>
it will never be upstreamed
<whitequark>
we can add a hack in the or1k backend, which, if i was the or1k upstream, i would have also never accepted
<whitequark>
what are the performance issues here exactly?
<whitequark>
if you split a 64-bit store into two 32-bit stores, you do exactly the same thing as the legalizer
<whitequark>
just slightly earlier in the pipeline
<whitequark>
like, you do store, trunc, another store, this is precisely what the legalizer generates in the DAG
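In Rust terms, the pattern being described looks roughly like this (a sketch with made-up pointer names; the point is that nothing here fixes the order of the two halves):

    // The legalizer's split, spelled out: a truncating store of the
    // low word and a shift+trunc store of the high word. With plain
    // stores like these, LLVM may emit the two halves in either order.
    unsafe fn split_store(lo: *mut u32, hi: *mut u32, value: u64) {
        lo.write(value as u32);           // trunc + store
        hi.write((value >> 32) as u32);   // shift + trunc + store
    }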
<sb0>
right now I'm doing things like telling llvm to do "now = now + 1", with "now" being pinned to the CSRs
<whitequark>
you'll get identical DAG *except* it will have correct ordering
<whitequark>
what is "pinned to CSRs"?
<sb0>
api!(now = csr::rtio::NOW_HI_ADDR as *const _),
<whitequark>
that carries an explicit load and store
<whitequark>
either in rust or in the compiler
<whitequark>
each access
<whitequark>
so you aren't actually losing anything if you define a #[inline(always)] helper function
<whitequark>
if you make them volatile then yes, it would inhibit some optimizations
<whitequark>
you want to make it atomic, gimme a sec
<whitequark>
ok, so what you want is to use atomic loads and stores with the "monotonic" ordering
<sb0>
I want to do atomicity using the write ordering, to avoid losing performance
<whitequark>
llvm optimizes atomic loads and stores.
<whitequark>
(it is a common misconception that compilers are not allowed to optimize them; that only applies to volatile)
<sb0>
the atomicity solves the problem of the kernel being interrupted in the middle of a "now" store, and leaving a borked value for the next kernel
<whitequark>
if you use the monotonic ordering, you should see the exact same machine code generated, but with the correct ordering
<sb0>
okay, how do I use that?
<whitequark>
doing this in LLVM for regular loads/stores is an awful idea in any case
<whitequark>
you need to split the store into two manually
<whitequark>
it generates a function call because or1k has no 64-bit atomic stores
<whitequark>
(and it doesn't know how to do them with ll/sc, which would be a wrong thing anyway)
<whitequark>
the function call would be a syscall that tells the kernel to do it atomically
<whitequark>
you need to split them AND to use atomics because it's legal for the compiler to reload two stores to two addresses it knows are different
<whitequark>
as an optimization
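Putting that together, a minimal sketch of the combination being described, with hypothetical NOW_HI/NOW_LO addresses (the real ones come from the generated csr module); Rust's Ordering::Relaxed lowers to LLVM's "monotonic":

    use core::sync::atomic::{AtomicU32, Ordering};

    // Hypothetical addresses for the two halves of `now`; the real
    // ones come from the generated csr module.
    const NOW_HI: *const AtomicU32 = 0xe000_0000usize as *const AtomicU32;
    const NOW_LO: *const AtomicU32 = 0xe000_0004usize as *const AtomicU32;

    #[inline(always)]
    unsafe fn set_now(value: u64) {
        // Two 32-bit atomic stores with Relaxed ("monotonic") ordering:
        // still optimizable, but neither store can tear. The order shown
        // (high word first) is illustrative; use whatever order the
        // CSR's atomic_write semantics expect.
        (*NOW_HI).store((value >> 32) as u32, Ordering::Relaxed);
        (*NOW_LO).store(value as u32, Ordering::Relaxed);
    }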
<sb0>
how do I convert between i64 and 2xi32 efficiently? will it optimize the shift/trunc operations?
<whitequark>
yes
<whitequark>
if not, it's a bug that should be trivially fixable
<whitequark>
but it should
<whitequark>
or the code would be just horrible
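The conversions in question are just shifts and truncations on register values, e.g. (sketch):

    // i64 <-> two 32-bit halves; LLVM folds the shifts and truncations
    // into the surrounding code, with no memory traffic.
    fn split(v: u64) -> (u32, u32) {
        ((v >> 32) as u32, v as u32)   // (high, low)
    }

    fn join(hi: u32, lo: u32) -> u64 {
        ((hi as u64) << 32) | lo as u64
    }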
<sb0>
btw why would the compiler reload non-atomic stores?
<whitequark>
reorder?
<sb0>
hm
<sb0>
if it can reorder writes arbitrarily, we'd have problems with mmio
<whitequark>
no
<whitequark>
those writes are volatile
<whitequark>
but volatile writes cannot ever be optimized
<whitequark>
i'm trying to give you atomics here because atomics *can* be optimized
<whitequark>
e.g. it should be able to remove dead stores
<whitequark>
i'm not sure that it will do everything we want, but if you make it volatile, it will *definitely* kill performance
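In sketch form, the difference being drawn (whether the optimizer actually removes a given dead atomic store depends on the backend):

    use core::ptr;
    use core::sync::atomic::{AtomicU32, Ordering};

    unsafe fn contrast(reg: *mut u32, flag: &AtomicU32) {
        // Volatile: both writes must be emitted, in this order, even
        // though the first is immediately overwritten.
        ptr::write_volatile(reg, 1);
        ptr::write_volatile(reg, 2);

        // Relaxed atomic: the first store is dead, and the compiler
        // is allowed to delete it.
        flag.store(1, Ordering::Relaxed);
        flag.store(2, Ordering::Relaxed);
    }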
<GitHub-m-labs>
[artiq] dhslichter commented on issue #1007: I agree with @jordens above. For now, we use the `_mu` methods as a workaround so this particular bug doesn't seem to be a major issue. I am much more concerned about *compilation* slowness than this issue. https://github.com/m-labs/artiq/issues/1007#issuecomment-436901529
<sb0>
whitequark: not sure that it will do everything we want <<< what kind of issue will there be?
<whitequark>
sb0: it might e.g. coalesce loads/stores worse
<whitequark>
i will need to look at specific examples
<whitequark>
it will not necessarily happen; i have never tried this specific thing before
<whitequark>
this is just in the realm of possibilities and needs to be checked
<sb0>
whitequark: so, lower performance than it could be, but no bugs?
<cr1901_modern>
sb0: Well, you can override any and all parts of the build scripts in a more flexible manner than Vivado.
<cr1901_modern>
or ISE
<cr1901_modern>
Basically I'm asking: what did you have in mind for a build system overhaul? B/c I implemented changes in the icestorm backend w/ the understanding (at least in rjo's eyes) that this is how the backends should all evolve
<whitequark>
cr1901_modern: you mean the templates?
<whitequark>
i don't think that's how they should be
<whitequark>
if i was redesigning them, i'd make each "step" a python function
<whitequark>
nextpnr and arachne would subclass the basic icestorm backend
<whitequark>
then there would be an overarching function that collects all the "steps"
<whitequark>
this actually allows you to override things in a sane way and also lets the language help you, e.g. via @abstractmethod
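A rough Python sketch of that shape (all names and command lines are invented for illustration):

    from abc import ABCMeta, abstractmethod

    class IceStormBackend(metaclass=ABCMeta):
        # Each build stage is an overridable method returning the
        # command for that stage.
        def steps(self):
            return [self.synthesize, self.pnr, self.pack]

        def synthesize(self, name):
            return "yosys -p 'synth_ice40 -json {0}.json' {0}.v".format(name)

        @abstractmethod
        def pnr(self, name):
            pass

        def pack(self, name):
            return "icepack {0}.asc {0}.bin".format(name)

    class NextpnrBackend(IceStormBackend):
        def pnr(self, name):
            return "nextpnr-ice40 --json {0}.json --asc {0}.asc".format(name)

    def build_script(backend, name):
        # The overarching function that collects all the steps.
        return "\n".join(step(name) for step in backend.steps())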
<whitequark>
this is just more stringly typed nonsense...
<whitequark>
i mean, it *is* better
<whitequark>
it's just not nearly better enough for an overhaul
<whitequark>
anyway, you can see some traces of that in my changes
<whitequark>
e.g. look at what i did to _build_script and _run_script
<whitequark>
there's no magic in _run_script anymore, you can just make a build tree and later run it with `bash` elsewhere, even on a different machine (subject to some constraints)
<whitequark>
actually, what i would like to see in a migen build system overhaul is remoting.
<cr1901_modern>
I would prefer to see makefiles or ninja scripts generated instead of bash/cmd scripts
<whitequark>
right now it has some silly dependencies on the `build` machine, like all the `if sys.platform` statements
<whitequark>
so you can't e.g. run migen on windows but remote to a linux machine with vivado
<whitequark>
(makefiles) what's the point?
<whitequark>
first, this adds a dependency (make or ninja respectively)
<whitequark>
second, toolchains do not maintain dependency information
<whitequark>
and in the case of verilog, you will have to extract all the preprocessor statements to construct the complete tree
<whitequark>
this can be done with verilator (iirc), which is *yet another* dependency
<cr1901_modern>
whitequark: Tbh, I don't remember the reason I think this. I just remember thinking it at some point. So just forget that I said it.
<whitequark>
and in the end you always resynth and replace the entire thing on any change anyway
<cr1901_modern>
anyways remote would be nice
<whitequark>
i had a branch with remoting, actually
<whitequark>
but it did not work well at all and i gave up
<whitequark>
it was long ago anyhow
<cr1901_modern>
Anyways I'd like parity of all backends' interfaces to the extent that's reasonable.
<whitequark>
yes, _run_script and _build_script should be factored out
<whitequark>
think you can do that?
<whitequark>
that would be great
<cr1901_modern>
right now creating the build script for Vivado is different from icestorm is different from ISE is different from trellis
<whitequark>
even if it's just for two backends
<cr1901_modern>
sure, lemme finish eating and I'll do that
<cr1901_modern>
whitequark: I can't focus on this right now, but re: adding Linux support to diamond, please make sure the Windows part doesn't contradict this: http://ix.io/1rj5
<cr1901_modern>
I had intended to add Linux support at some point, and this was supposed to be a large comment indicating how screwed up the docs are
<whitequark>
>Set $bindir env variable, run diamond_env, and then
<whitequark>
# invoke pnmainc.
<whitequark>
that's what I am doing
<whitequark>
and on windows i don't source it
<cr1901_modern>
2. On Windows, add Diamond binary directory to the path, and run pnmainc.
<cr1901_modern>
Doesn't look like you do, I'll work on that if you don't get to it in a bit
<cr1901_modern>
I can test on Windoze anyway
<cr1901_modern>
I may even add that comment block b/c I don't want to have gone through that frustration for nothing
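For reference, the two invocation styles from that comment, sketched as generated-script fragments (install paths hypothetical):

    # Unix: set $bindir, source diamond_env, then invoke pnmainc.
    export bindir=/usr/local/diamond/3.10_x64/bin/lin64
    . ${bindir}/diamond_env
    pnmainc top.tcl

    REM Windows (cmd.exe): add the Diamond binary directory to PATH.
    set PATH=C:\lscc\diamond\3.10_x64\bin\nt64;%PATH%
    pnmainc top.tcl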
<cr1901_modern>
whitequark: I want to push trellis now; I think refactoring _build_script/_run_script should be its own PR
<cr1901_modern>
whitequark: Also I want to massage the diamond backend while I'm at it. It should be possible to refactor it in such a way that I can also hoist its _build_script out into build/tools.py >>
<cr1901_modern>
If you want to try your proposed build overhaul, perhaps now would be a decent time (on the FOSS plus Diamond backend)?
<GitHub-m-labs>
artiq/new c990b5e Sebastien Bourdeauducq: Merge remote-tracking branch 'origin/master' into new
<cr1901_modern>
whitequark: btw, did you happen to install Diamond into /opt/Diamond? Current convention is "if toolchain_path is None, set to installer defaults" for proprietary toolchains. On Diamond, this is /usr/local/Diamond
<cr1901_modern>
whitequark: I am uncertain why, but it's producing forward slash here
<cr1901_modern>
Very much makes perfect sense and definitely not surprising
<cr1901_modern>
whitequark: Whether forward or backslash is used depends on which shell (bash or cmd.exe) I'm running when mingw python is invoked. But in the context of cmd.exe I exclusively need backslash.
<cr1901_modern>
since we use the native command interpreter when running the script
<whitequark>
mh, ok
<cr1901_modern>
specifically the rules seem to be (for mingw python): if interactive REPL and cmd.exe running, os.path.sep == "\\", else "/". Yea, makes sense. /s But whatever, it's Windoze's fault for not pushing "/" in the DOS 2.x days lol
<cr1901_modern>
(fwiw, I think Windoze is actually pretty good w/ paths nowadays; the APIs don't care which slash you use. Just cmd.exe internals insist on backwards compat)
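One way to sidestep the ambiguity when emitting cmd.exe scripts (a suggestion, not what any backend currently does) is to ask for Windows path rules explicitly rather than relying on os.path:

    import ntpath  # Windows path semantics regardless of host platform

    # os.path.join may produce either separator under mingw python;
    # ntpath.join always uses backslashes.
    ntpath.join("build", "top.bit")   # -> r"build\top.bit"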
<rjo>
bb-m-labs: force build --props=package=artiq-board,artiq_target=kasli,artiq_variant=opticlock artiq-board
<bb-m-labs>
build forced [ETA 17m03s]
<bb-m-labs>
I'll give a shout when the build finishes