Degi has quit [Ping timeout: 255 seconds]
Degi has joined ##openfpga
X-Scale` has joined ##openfpga
X-Scale has quit [Ping timeout: 260 seconds]
X-Scale` is now known as X-Scale
Lord_Nightmare has joined ##openfpga
X-Scale` has joined ##openfpga
X-Scale has quit [Ping timeout: 260 seconds]
X-Scale` is now known as X-Scale
Bike has joined ##openfpga
zng has quit [Quit: ZNC 1.7.2 - https://znc.in]
zng has joined ##openfpga
rohitksingh has joined ##openfpga
emeb has left ##openfpga [##openfpga]
lopsided98 has quit [Remote host closed the connection]
ym has quit [Remote host closed the connection]
lopsided98 has joined ##openfpga
emeb_mac has joined ##openfpga
genii has quit [Quit: Morning comes early.... GO LEAFS GO!]
rohitksingh has quit [Ping timeout: 268 seconds]
rohitksingh has joined ##openfpga
Maylay has quit [Ping timeout: 240 seconds]
Degi has quit [Ping timeout: 256 seconds]
Degi has joined ##openfpga
Maylay has joined ##openfpga
Bike has quit [Quit: Lost terminal]
genii has joined ##openfpga
genii has quit [Quit: Morning comes early.... GO LEAFS GO!]
____ has joined ##openfpga
_whitelogger has joined ##openfpga
emeb_mac has quit [Quit: Leaving.]
OmniMancer has joined ##openfpga
<____> Is anyone familiar with a 16-bit hyperbus? Gowin uses one for the internal connection to whatever that on-chip thing they call PSRAM is.
<____> Afaik the Cypress spec talks about 8 bits only.
<tnt> Doesn't the gowin doc have info ?
<tnt> Also, does that mean the connections to it have io buffers / io ffs and you need to do the normal clock-to-out / input setup-hold time analysis ?
<____> Gowin doc is pretty limited, and is mostly about their custom hyperbus core interface, not about the raw interface. Or maybe I missed something.
<____> The PSRAM is not a primitive. It is magically connected somewhere at the synthesis stage, if you give the corresponding top level ports the corresponding magical names.
<tnt> Looking at the doc, it really just looks like you have a classic PSRAM chip connected to some IO of the FPGAs.
<tnt> instead of being broken out to pads, they go to another die in the package.
<____> So, i guess that constraint-wise, the PSRAM ports are treated the same way as any other port.
<tnt> But the doc I'm reading shows the data width as 8 bits, not 16.
<____> For my specific part it is specified that the PSRAM width is 8 bit, but the DQ is 16 bit.
<____> Gowin's core user interface uses 64-bit ports and a minimum 16-byte burst. I suppose there is some muxing magic involved.
<____> Well, I guess this would require some hacking with internal LA.
<tnt> Well if psram is x8 and dq is x16 you just have two psram dies in parallel.
<tnt> so you'd have two rwds as well.
<tnt> and CS / CK / CK_n are shared.
<tnt> Actually looking at table 5-2 of IP UG525, it's two completely independent psram dies ...
<____> Wow, that actually seems true: every one of them is doubled, even CK and CS.
<____> Thanks
<tnt> I'm kind of curious why there is a differentiation between PSRAM and HyperRAM in that document.
<tnt> (like in Table 2-3)
<tnt> of DS861
<____> The interface is hyperbus for both of them, and IPUG525 says that there's no difference. Maybe it's about marketing?
<tnt> yup maybe. I guess maybe the psram is some custom silicon while hyperram is "official". It might not support the same configuration register options.
emily has quit [Quit: killed]
eddyb has quit [Quit: killed]
promach3 has quit [Quit: killed]
swedishhat[m] has quit [Quit: killed]
jfng has quit [Quit: killed]
indefini[m] has quit [Quit: killed]
nrossi has quit [Quit: killed]
henriknj has quit [Quit: killed]
john_k[m] has quit [Quit: killed]
omnitechnomancer has quit [Quit: killed]
scream has quit [Quit: killed]
xobs has quit [Quit: killed]
indefini[m] has joined ##openfpga
<azonenberg> fffuuuuuu i just spent the last 4 hours chasing a bug caused by copying code from another project and not patching up one net name
<azonenberg> Which led to my I2C IP not having a clock
<azonenberg> Aaaaand the ONE file in the entire project without `default_nettype none was, you guessed it, the top level
<tnt> that sounds way too familiar
<azonenberg> My coding style requires it but i don't automatically enforce it
<azonenberg> i really need to find a standardized system for creating new projects that avoids some of these issues
<tnt> Can yosys default be changed ?
<azonenberg> I'm actually using vivado for this, and i'm not sure about the yosys default
<tnt> What annoys me is that this is not per-file and so when I use lattice or xilinx IPs, it often breaks :/
<azonenberg> i have thought about using the yosys parser to write a linter that enforces my style guidelines though
<azonenberg> Yes. My general solution to this is simple, don't use third party IP :P
<azonenberg> the only xilinx IP i use with any degree of regularity is the ILA, and the design i had this problem on is actually a testbench for the latest version of my own ILA
<tnt> heh, sure but not always an option ... (I often do changes / additions to existing projects, so rewriting the whole thing is not viable :p)
<azonenberg> with a view towards eventually discontinuing use of the vivado ILA permanently
<azonenberg> after all, not much sense having code buildable with f/oss tools if you depend on non-free blobs
<azonenberg> that can't be compiled except with vivado
<tnt> I end up having the default_nettype wrapped in `ifdef and then I use iverilog as a syntax checker.
<azonenberg> well i want to do more
<q3k> same, iverilog for linting
<azonenberg> i want to do things like alerting on a flag which is set to 1 in a state machine but never cleared to 0
<azonenberg> lack of default_nettype none
<q3k> azonenberg | i want to do things like alerting on a flag which is set to 1 in a state machine but never cleared to 0
<q3k> that will require some level of formal verification in order to be done well
<q3k> probably
<azonenberg> initially it would be quite simple
<q3k> or you can just detect when a line gets turned into a constant driver
<azonenberg> within one always block, if you see assignments to 1'b1 only
<azonenberg> and never to any other value
<azonenberg> that's a warning
<azonenberg> because you probably intended this to be a single cycle pulse and forgot to add the default-zero
<azonenberg> (this has bit me a lot)
<azonenberg> it should be quite easy to do at the AST level, harder on synthesized logic
<q3k> doing it at AST level sounds like just tons of false positives (ie. code that's broken that passes your simplistic checks)
<azonenberg> that's false negatives
<azonenberg> false positives is warnings for logic that doesn't meet the filter
<q3k> depends how you look at it, that's why i added the (explanation)
<azonenberg> or logic that meets the filter but isn't broken
<azonenberg> my general rule is that a linter/warning tool needs to have an extremely low false positive rate to be useful, or people ignore the spam
<q3k> it sounds like you're trying to write a 'go vet' but for verilog
<q3k> ie. something a bit smarter than a linter
<azonenberg> while false negatives are bad, halting problem says you can't catch all bugs
<azonenberg> any bugs i catch are better than none
<azonenberg> other rules i want to enforce: no mixing <= and = in one always block
<q3k> which is interesting, because both go and verilog are braindead languages, and you write tools around them to discover bugs, instead of making the language less braindead
<azonenberg> no latches in always_comb blocks
<q3k> so that's a quaint parallel.
<azonenberg> no use of numbered module ports or synthesis constraints in comments
<azonenberg> in FPGA mode: mandatory initial value for all registers
<azonenberg> in ASIC mode: use of "initial" is an error
<azonenberg> (this can lead to sim-synthesis mismatches with asic HDL if the simulator isn't explicitly instructed to ignore initial values)
<azonenberg> I might also ban the ?: operator. Certainly nested instances of it
<azonenberg> multiple drivers from different always blocks, use of # delays
<azonenberg> statically impossible conditionals/assignments due to width mismatch
<azonenberg> (post elaboration maybe)
<azonenberg> for example reg[3:0] foo; foo <= 32;
<azonenberg> but foo <= 8 should be legal despite the unsized 8 being expanded to 32'd8 per the LRM
<azonenberg> Systemverilog fixes a lot of the things in my original list
<azonenberg> but it does still have issues :p
ZipCPU has joined ##openfpga
henriknj has joined ##openfpga
eddyb has joined ##openfpga
jfng has joined ##openfpga
swedishhat[m] has joined ##openfpga
emily has joined ##openfpga
xobs has joined ##openfpga
john_k[m] has joined ##openfpga
promach3 has joined ##openfpga
omnitechnomancer has joined ##openfpga
nrossi has joined ##openfpga
scream has joined ##openfpga
rohitksingh has quit [Ping timeout: 240 seconds]
fjullien has quit [Ping timeout: 255 seconds]
____ has quit [Quit: Nettalk6 - www.ntalk.de]
ZipCPU has quit [Ping timeout: 265 seconds]
OmniMancer has quit [Quit: Leaving.]
genii has joined ##openfpga
<lambda> azonenberg: have you tried VHDL? it doesn't have most of those problems either ;)
<azonenberg> lambda: it also looks like ada and i find it pretty much unreadable
<azonenberg> i have lots of ideas for things i want in languages, which seem to not be what anyone else wants
<azonenberg> for example, rust looks like an awesome memory safe C replacement
<azonenberg> But I want "rust++"
<lambda> fair enough, it takes some getting used to. I'm just always glad for its strictness and somewhat decent type system whenever I hear these verilog horror stories
<azonenberg> i.e. full OO on a statically memory safe, non-GC'd bare-metal-friendly platform
<azonenberg> AFAIK this does not currently exist
<lambda> there's probably always just One More Thing™, no matter how many languages there are to be honest
<azonenberg> well the single big blocker to me moving ~95% of my code to rust is the lack of proper OO
<azonenberg> structs or whatever they're called don't count
<azonenberg> in particular, my code tends to make heavy use of base classes that provide common functionality which is occasionally overridden
<azonenberg> so full inheritance, not just interfaces
<azonenberg> and sometimes even multiple inheritance, which i use a lot in jtaghal
<azonenberg> OO just fits very naturally to a lot of hardware problems like making drivers for a peripheral
<azonenberg> as well as a lot of UI stuff
<lambda> true, maybe eventually something will come along and fill that gap
Zorix has quit [Ping timeout: 248 seconds]
<anuejn> azonenberg: there is the Deref pattern in rust, which gives you something quite similar to inheritance
<anuejn> but it is a rather evil hack
<anuejn> >We do intend to add a mechanism for inheritance similar to this to Rust, but it is likely to be some time before it reaches stable Rust
<anuejn> *says
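A minimal sketch of the Deref pattern anuejn mentions, with made-up Base/Derived names: the wrapper forwards to the embedded type via auto-deref, so the base type's methods look inherited, but it is method forwarding rather than real subtyping, which is part of why it's considered a hack.

    use std::ops::Deref;

    struct Base;

    impl Base {
        fn common(&self) -> &'static str {
            "shared behaviour provided by Base"
        }
    }

    // "Derived" embeds Base and forwards to it through Deref.
    struct Derived {
        base: Base,
    }

    impl Deref for Derived {
        type Target = Base;
        fn deref(&self) -> &Base {
            &self.base
        }
    }

    fn main() {
        let d = Derived { base: Base };
        // Auto-deref makes Base's methods callable on Derived, which looks
        // like inheritance, but Derived is not a Base to the type system.
        println!("{}", d.common());
    }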
Zorix has joined ##openfpga
tlwoerner is now known as tw-eh
tw-eh is now known as tlwoerner
<q3k> azonenberg: i mean, it's a different pattern. if you're applying java-style OO to rust you'll feel extremely constrained
<q3k> azonenberg: you can implement all these things, just thinking differently, in the system that rust gives you
<q3k> azonenberg: and generally, IMO, end up with easier to grok code
<q3k> azonenberg: (multiple inheritance is evil)
<q3k> i had a very similar issue when I was trying to port some of my old C++ code (which was similar to your C-with-classes C++) to rust
<q3k> just took a while to adjust
<azonenberg> q3k: i mean, you can also write full java style OO in C
<azonenberg> using get/set functions, virtual functions, etc. Doesn't make it a good idea, or the right tool for the job
<q3k> my point was more that if you let go of patterns from other languages you usually end up more productive
<azonenberg> And this is why i like C++, it doesnt really enforce much in the way of patterns/paradigms on you
<azonenberg> you can go full C imperative, you can go full java OO, you can do C-with-classes
<azonenberg> you can even do functional stuff up to a point
<q3k> i mean, that's still one single pattern on a spectrum
<q3k> rust doesn't enforce that much either
<azonenberg> its not things that rust enforces that i complain about, so much as the lack of syntax for various things
<azonenberg> Like inheritance
<q3k> that's not syntax, it's different semantics
<azonenberg> I also remember not being super happy with the way it handled object lifetimes
<azonenberg> i forget the specifics but it seemed like the default type for arguments etc was readonly reference or something and that led to lots of annoyance if you actually wanted to make a copy of something
<q3k> i think you just need to spend more time with the language
<azonenberg> oh actually no, i think it had to do with a pass-by-writable-reference turning into a transfer of ownership?
<q3k> it's a complex little thing and it just takes time
<q3k> you can definitely pass a mutable borrow without transferring ownership
<q3k> so i'm still not sure what you mean
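A tiny sketch of what q3k means by passing a mutable borrow without transferring ownership (the names are illustrative): the callee mutates through the &mut for the duration of the call, and the caller still owns the value afterwards.

    fn bump(counter: &mut u32) {
        // The callee can mutate through the borrow...
        *counter += 1;
    }

    fn main() {
        let mut counter = 0u32;
        bump(&mut counter); // ...but ownership never leaves main.
        bump(&mut counter);
        println!("{}", counter); // prints 2
    }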
<azonenberg> i havent actually used it much. I just remember spending most of my time fighting the tools to express what should have been simple concepts
<q3k> the borrow checker will piss you off at first but it's a tool for correctness that will find issues with your code
<q3k> same as a type checker will piss you off if you're not used to complex static typing
<azonenberg> my recollection was that the borrow checker was overly paranoid and would complain about things that were obviously safe and could be easily statically verified as safe
<q3k> also not sure when you last used rust, but some new lifetime elisions have been introduced that make more simple use cases easier to express without annotating everything
<azonenberg> That could have been part of it. It was a while ago
<q3k> generally that's the borrow checker's job, to be paranoid
<q3k> if you're sure you're right then use unsafe {}
<azonenberg> At which point i've lost the whole benefit of using a memory safe language :p
<q3k> i mean, pick one - memory safety or not
<q3k> you can't both complain about the borrow checker being paranoid and that you want memory safety
<azonenberg> Paranoid meaning it's shooting at shadows rather than finding actual problems
<q3k> i would need to see some concrete examples
<azonenberg> yeah i havent touched it in a while
<azonenberg> But IMO a well designed memory safe language stays out of your way and doesn't let you do anything stupid
<q3k> if you read TRPL it generally shows you how to do things in a way that appease the borrow checker
<azonenberg> But also doesn't force you to explicitly say everything you want
<q3k> so do you have any examples of such a language?
<azonenberg> No, i dont think it exists :p
<q3k> there might be a reason for that
<azonenberg> Lol
<q3k> i mean, rust is not perfect, i don't even like rust that much
<q3k> but that borrow checker is usually right
<azonenberg> My interpretation of paranoid was "no false negatives but a ton of false positives"
<azonenberg> if the false positive rate could be reduced a lot i'd be more happy
<azonenberg> One thing i recall not obviously seeing in rust was the ability to create a memory safe mapping for a region of absolute physical memory
<azonenberg> for example a memory mapped packet buffer for an ethernet IP
<q3k> i'm not well versed in embedded rust, but i'm sure you can find examples on how to do that in redox
<q3k> also a lot of times you can just 'cheat' by using Arc/Rc when dealing with lifetimes
<q3k> and that's generally fine in my book, reference counts are cheap
<emily> 16:57 <azonenberg> my recollection was that the borrow checker was overly paranoid and would complain about things that were obviously safe and could be easily statically verified as safe
<emily> i'm sure if you sent an easy hundred-line patch to make the compiler statically verify these patterns as safe it would be accepted (but in reality it's almost certainly not so easy)
<azonenberg> hmm i think it might have been the "exactly one mutable reference" rule i had trouble with?
<azonenberg> how are you supposed to have global state you can change from two places at once with proper synchronization?
rohitksingh has joined ##openfpga
<azonenberg> e.g. a fifo you can pop from one of two worker threads
<q3k> you wrap it in a Mutex
<q3k> or you use an existing FIFO construct that allows for two ends across different threads
<sensille> yeah, you really have to let go of the notion that you can do everything yourself in safe rust
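A minimal sketch of the Mutex-wrapped FIFO q3k and sensille describe, using a plain VecDeque (names are illustrative): Arc gives both workers shared ownership, the Arc/Rc "cheat" mentioned earlier, and the Mutex serializes the pops, which is what satisfies the borrow rules across threads.

    use std::collections::VecDeque;
    use std::sync::{Arc, Mutex};
    use std::thread;

    fn main() {
        // Shared FIFO: Arc for shared ownership, Mutex for exclusive access.
        let fifo = Arc::new(Mutex::new((0..10).collect::<VecDeque<u32>>()));

        let workers: Vec<_> = (0..2)
            .map(|id| {
                let fifo = Arc::clone(&fifo);
                thread::spawn(move || loop {
                    // Lock, pop one item, and release the lock at the end
                    // of the statement before doing any work on the item.
                    let item = fifo.lock().unwrap().pop_front();
                    match item {
                        Some(item) => println!("worker {} got {}", id, item),
                        None => break,
                    }
                })
            })
            .collect();

        for w in workers {
            w.join().unwrap();
        }
    }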
oter has quit [Quit: Textual IRC Client: www.textualapp.com]
Stary has quit [Quit: ZNC - http://znc.in]
Stary has joined ##openfpga
fjullien has joined ##openfpga
_franck_ has quit [Ping timeout: 272 seconds]
Lord_Nightmare has quit [Quit: ZNC - http://znc.in]
<azonenberg> q3k: yeah the fifo was just an example
<azonenberg> i was envisioning more complex data structures not in a library, like traversing some complex directed graph
<azonenberg> and possibly modifying state at various nodes
<q3k> i mean, you have to prove to the borrow checker that you can do this thread safely
<q3k> either by writing unsafe code that just tells it 'fuck off i know what i'm doing', or using existing code that does that
<q3k> with the simplest example being sync::RwLock for instance
<azonenberg> Also, this is a bit more living-on-the-edge
<azonenberg> but on most platforms you can do lock-free data structures by taking advantage of the fact that byte/word sized operations are inherently atomic
<azonenberg> Without a need to explicitly mutex
<azonenberg> A language that recognizes that e.g. incrementing a uint32 from two threads at once is safe would be nice
<q3k> rust also has this
<azonenberg> obviously this might be OK or not depending on the specific ISA, but you get the idea of the sort of low-level stuff that matters to me as an embedded guy
<q3k> via explicit atomics
<emily> rust is about "correct by construction", not "you do whatever and it tries its damnedest to prove that it's not broken yet"
<emily> because not broken yet can become broken in the future
<emily> but yes you can achieve the same result here just by declaring you explicitly want and depend on that property
<emily> rather than just partying on it and having it at most documented in a comment
<azonenberg> yeah my point is, i want to be able to do low-level performance stuff while also being sure it will work
<q3k> but rust lets you do that
<q3k> you just have to prove the compiler that it's safe
<azonenberg> But this is a bit of a moot point for now, because most of the important stuff i've done lately has been in HDL
<q3k> it's just you can't rely on UB like accessing u32s across threads
<q3k> you have to explicitly say 'i want an atomic u32'
<q3k> and your platform might as well implement it using a simple u32 that is accessed using normal load/store instructions
<q3k> bonus: your code is then actually portable
<azonenberg> Fair enough, my point is more that i would prefer such stuff to not *be* UB
<q3k> it's not in rust
<azonenberg> e.g. language semantics that incrementing an int always wraps mod 2^32
<azonenberg> re HDL, this allows a lot of correctness properties to be ensured just by the thing compiling, like only writing to a register in one always block
<azonenberg> or pointers not being used outside the bram that they're intended to go to
<azonenberg> and you can use formal for higher level properties
<q3k> you should take a look at bluespec now that it's open
<TD-Linux> in fact, AtomicU32 in Rust is defined to wrap at mod 2^32 :)
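A small sketch of the explicit-atomics point (plain std, nothing embedded-specific assumed): two threads increment a shared AtomicU32, and fetch_add is defined to wrap modulo 2^32 on overflow, which is the behaviour TD-Linux refers to.

    use std::sync::atomic::{AtomicU32, Ordering};
    use std::thread;

    // A shared counter with no mutex; fetch_add wraps modulo 2^32 on overflow.
    static COUNTER: AtomicU32 = AtomicU32::new(0);

    fn main() {
        let handles: Vec<_> = (0..2)
            .map(|_| {
                thread::spawn(|| {
                    for _ in 0..1_000 {
                        COUNTER.fetch_add(1, Ordering::Relaxed);
                    }
                })
            })
            .collect();

        for h in handles {
            h.join().unwrap();
        }
        println!("{}", COUNTER.load(Ordering::Relaxed)); // 2000
    }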
<azonenberg> Does rust have arbitrarily sized "int", btw, or did they kill that?
<azonenberg> that's one of the biggest complaints i have with C :p
<azonenberg> I'm thinking of enforcing a C++ coding style that only allows stdint numeric types
<q3k> there's a bigint crate
<azonenberg> i.e. you can never say int, you have to explicitly say int32
<azonenberg> no i mean, "integer of unknown platform dependent size"
<azonenberg> should IMO not be a type
<TD-Linux> azonenberg, in general yes, the normal types are like "i32" and "u32". the biggest exception is "usize"
<q3k> yes, isize and usize
<q3k> which are explicitly arch specific
<TD-Linux> equivalent to size_t
<azonenberg> well ok having a size be dependent on address length makes sense for pointers or array indexes
<azonenberg> But only for that purpose
<q3k> otherwise you have {u,i}{8,16,32,64,128}
<azonenberg> (and it's something you'd never use in serialization)
<TD-Linux> rust doesn't have structs that you can dd to disk and it makes me a bit sad :(
<azonenberg> yes exactly
<TD-Linux> btw there is ##bluespec but no activity on it in a while
<azonenberg> my ideal embedded language would statically enforce a bunch of safety properties but also let you take advantage of low level stuff
<azonenberg> for example having well defined bitfields and packed structs with explicit endianness and msb-lsb ordering
<azonenberg> basically allow a struct to be declared either as optimized for efficient access with arbitrary platform dependent alignment etc, or serializable
<azonenberg> in the latter case it would have well defined in memory representation
<azonenberg> i.e. all floats are ieee754, all multibyte fields big endian, all struct members consecutive with no padding, etc
<qu1j0t3> is there an Embedded Rust working group that azonenberg could join
<azonenberg> better yet, allow a struct to simply be declared and used, by default optimized for in memory use
<azonenberg> but have some sort of serialize property you could apply to it
<levi> Yes, but it's generally working at a higher level than the kinds of things being discussed right now.
<azonenberg> i.e. you could have Foo bar / Serializable Foo bar
<azonenberg> and cast between them, shuffling bits as needed
<azonenberg> But you could also do something like
<azonenberg> Serializable Foo bar [at 0x40030800]
<azonenberg> in order to memory map a SFR as a struct
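For comparison, a rough sketch of the serializable-struct half of that wish list in today's Rust (the PacketHeader layout is made up): #[repr(C, packed)] pins the field order with no padding, and the encoder spells out big-endian byte order field by field rather than reinterpreting the struct as raw bytes. The memory-mapped register half comes up again in the MMIO discussion further down.

    // Hypothetical wire-format header: #[repr(C, packed)] removes padding,
    // so the layout is exactly the declared fields back to back (8 bytes).
    #[repr(C, packed)]
    struct PacketHeader {
        magic: u32,
        length: u16,
        flags: u8,
        version: u8,
    }

    fn serialize(h: &PacketHeader) -> [u8; 8] {
        let mut out = [0u8; 8];
        // Explicit big-endian encoding, field by field; fields are read by
        // value, so no reference to an unaligned packed field is created.
        out[0..4].copy_from_slice(&h.magic.to_be_bytes());
        out[4..6].copy_from_slice(&h.length.to_be_bytes());
        out[6] = h.flags;
        out[7] = h.version;
        out
    }

    fn main() {
        let h = PacketHeader { magic: 0xDEAD_BEEF, length: 64, flags: 0x01, version: 2 };
        println!("{:02x?}", serialize(&h));
    }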
<levi> Rust is *usable* for embedded, but definitely wasn't designed as "embedded-first".
<azonenberg> Exactly
<azonenberg> I want an embedded-first OO language that provides as many safety properties as reasonably practical without compromising the core mission
<azonenberg> Not a safe language shoehorned into embedded, which is what rust looks like
Asu has joined ##openfpga
<azonenberg> bonus points if you can figure out how to implement ownership semantics in a way that's compatible with ping-pong DMA buffers etc
<azonenberg> so you can change ownership of a block of memory to hardware, then when you get an interrupt mark it as usable by the app again
<levi> I don't think it's really "shoehorned" any more than C or C++ are.
<azonenberg> But enforce bounds checking when the app is using it
<azonenberg> levi: yes, C/C++ are not ideal for embedded either
<azonenberg> e.g. UB around struct packing, bitfield ordering, and endianness when casting complex data types to a byte*
<azonenberg> Hence why i want a language that is explicitly designed to support memory mapped peripherals in a well defined fashion
<levi> The embedded working group has come up with some pretty interesting tools with that regard, but they're definitely not to everyone's liking.
<qu1j0t3> there's some prior art, since operating systems have been written in HLLs for about 70 years now
<qu1j0t3> well, 60+
<levi> PL/I, Ada, and BLISS had a lot of interesting ways to specify low-level details that no one really looks at when designing languages these days.
<azonenberg> qu1j0t3: yes, and the vast majority are C/C++ and just assume the UB
<qu1j0t3> no
<qu1j0t3> there have been many languages used. lots of prior art.
<qu1j0t3> C wasn't used until 1973-odd
<azonenberg> well ok i dont have a ton of exposure to REALLY old OSes
<qu1j0t3> right, but could make an interesting survey
<qu1j0t3> Per Brinch Hansen's "Classic Operating Systems" mentions some, but definitely not all
<azonenberg> my OS knowledge is windows, linux, uc/os, freertos, vxworks, and probably a few other more obscure ones
<levi> There was also some interesting research on bitdata types from the Haskell-OS community that no one has tried to do much with beyond experiments.
<azonenberg> There is of course a totally different tangent
<qu1j0t3> DEC had BLISS
<azonenberg> which is to design the hardware for safe access and not require all of this BS
<azonenberg> like, in Antikernel the memory manager is in hardware and operates at a page level
<levi> That was a common approach back in the 60s, actually.
<azonenberg> and enforces single-owner semantics per PID (no multithreading, one thread per process)
<azonenberg> this means that a peripheral can malloc a page on its own with no software interaction
<levi> Through the 70s-80s to some degree too, when everything was made with bit-slice ICs and microcode.
<azonenberg> DMA some data to it, then chown it to your app and send you a pointer
<azonenberg> or you can fill a buffer with data, flush cache, chown it to the NIC
<azonenberg> no races possible because you forfeit your own rights to the page as you do so
<levi> The lowest-level language available on the Burroughs B5000 series machines was a variant of Algol.
<levi> It had a descriptor-based memory management scheme as well; not super familiar with how it worked though.
<azonenberg> also there were no SFRs in antikernel
<emily> 18:09 <azonenberg> e.g. language semantics that incrementing an int always wraps mod 2^32
<azonenberg> Memory mapping was only used for bulk data transfer
<emily> this is already the case in rust because i32 is the default-inferred integer type btw
<emily> (sorry for resurrecting the old topic)
<azonenberg> and the basic primitive for control plane activity was either an RPC or an interrupt (request-response or unidirectional "FYI" with optional data attached)
<levi> Rust also only supports 2's complement signed integer semantics, so that's helpful.
<azonenberg> in the RPC case you had explicit handshaking
<emily> also re all the rust stuff: you can, fundamentally, just use unsafe if you know what you're doing is safe
<azonenberg> levi: no unsigned?
<emily> but it does help to internalize the rust mindset enough that you know how to do things without just sprinkling it everywhere
<azonenberg> thats another thing i disliked about java
<levi> It has unsigned, but doesn't support other kinds of signed integers.
<azonenberg> makes doing crypto, ECC, etc hard
<emily> it has unsigned
<azonenberg> oh ok
<azonenberg> so unsigned or twos complement. That's a sane choice
<emily> c'mon, it's a systems language, of course it has unsigned :p
<azonenberg> i've seen a lot of idiocy from language designers over the years sooo... :p
<emily> sudden flashbacks of those Java OSes
<azonenberg> Exactly
<azonenberg> Reminds me of the programming languages class i took as an undergrad
<azonenberg> we had to do a project in SALSA, an actor-based language built on top of Java
<azonenberg> meant for distributed systems
<azonenberg> i chose a password cracker and demonstrated beautiful scaling from 1 to 32 nodes of the department x86 cluster
<azonenberg> what i did not send the professor, out of fear for my grade (this was his pet project, descended from his dissertation) were the results for my C++/sse2 implementation that ran faster on one core of my laptop than salsa on 32 nodes of the cluster
<azonenberg> Or the CUDA implementation that, on a handful of GPUs in my living room, ran faster than linear scaling of SALSA extrapolated to the #1 top500 system at the time
<levi> Yeah; fortunately the people writing code for top500 machines aren't doing it in Java-based languages.
<levi> Mostly in C and Fortran still, unless things have changed drastically recently.
<azonenberg> yeah that sounds right. LAMMPS is a mix of C++ and FORTRAN, that was the last large HPC codebase i did much with
<azonenberg> back as an undergrad i never got access to more than half a rack of BlueGene/L
<azonenberg> it's funny, my RTX 2080 Ti probably has more flops than that
<azonenberg> Let's see, 2.9 Tflops/rack in coprocessor mode, so 1.45 Tflops/midplane, a 2080 Ti can do 14.2 Tflops
<azonenberg> So my GPU is actually equivalent to just shy of five racks of BG/L lol
<q3k> azonenberg: i mean, you care more in distributed systems than plain number crunching
<q3k> just because it was a bad choice for your usecase doesn't mean it's not a valuable tool
<azonenberg> well yes, my point was more to illustrate just how obsolete BG/L is by modern standards
<azonenberg> (not that a half rack was that much even back then compared to the size of the whole... 16 rack? system)
<q3k> do people actually use SALSA to do number crunching on bluegenes? i would expect to see a lot of c++/fortran with MPI
<q3k> your example doesn't really tell me anything about the power of your GPU vs a bluegene cluster, just that with a poor choice of tool just throwing more compute at a problem ain't gonna solve things
rohitksingh has quit [Ping timeout: 256 seconds]
<levi> I think SALSA is mostly an academic proof-of-concept sort of language, and is probably only really used in academic environments to teach distributed computing concepts.
<q3k> yeah, that's kind of my point (and it's IMO a good goal for a language)
<levi> I would also expect all production code on scientific compute clusters to be in C/C++/Fortran with MPI, mostly because scientists.
<q3k> i'm just not ready to go from 'i wrote this slow SALSA implementation of X' to 'my gpu is more powerful than a bluegene cluster'
<azonenberg> q3k: i was comparing peak flops in that example
<azonenberg> and i'm comparing a bluegene from 2008 to a modern GPU
<azonenberg> and no i dont think anyone uses salsa on it
<q3k> 'Or the CUDA implementation that [...] ran faster than linear scaling of SALSA [on bluegene]'
<azonenberg> i was extrapolating to roadrunner actually
<azonenberg> which i dont think was a bluegene, it was #1 on top500 at the time
<azonenberg> and my point was more to show how slow salsa was :p
<q3k> and that's fine that it's slow?
<q3k> like you started with azonenberg | i've seen a lot of idiocy from language designers over the years sooo... :p
<q3k> and i really am not sure what sort of idiocy you're talking about
Jybz has joined ##openfpga
Jybz has quit [Remote host closed the connection]
rohitksingh has joined ##openfpga
<levi> Looking at the Blue Gene/L architecture, it seems Java was a particularly poor choice for scaling performance-oriented numeric code on that platform.
<levi> And having a modern high-end desktop GPU outperform 2 racks of it in peak flops doesn't sound ridiculous either. The actual top-performing Blue Gene/L systems were around ~100 racks.
<q3k> especially for things like pure bruteforce, where a GPU just fits better than a bunch of CPUs
rohitksingh has quit [Ping timeout: 255 seconds]
emeb has joined ##openfpga
<adamgreig> azonenberg: i'm late to the party [because the rust embedded wg meeting is on right now too] but rust does have a lot of things you just mentioned on embedded
<adamgreig> packed structs that live at some memory address are absolutely a thing and is how most embedded devices do MMIO
<adamgreig> including thread-safe "ownership" thereof, so that only one thing can modify at a time in a way that can be safely moved between threads/interrupt handlers/etc
<adamgreig> even clever type level systems to automatically set the system interrupt floor based on current context to ensure no pre-emption on shared resources etc
<adamgreig> a lot of systems for extremely easy struct serialisation, and you can have structs with C layout semantics, or specifically packed, etc
<TD-Linux> oh are the HALs packed struct backed now?
<adamgreig> when were they not?
<TD-Linux> maybe they always were. I mean it's not obvious when I'm writing something like
<TD-Linux> rcc.apb1enr.write(|w| { w.pwren().bit(true) });
<adamgreig> that's a method-based api on top of a packed struct that lives in memory
<adamgreig> it compiles down to a normal read-modify-write or write to the relevant register address
<TD-Linux> yeah, I guess I assumed the backing was a direct mmio write
<adamgreig> though actually that might change due to a hilarious llvm issue, heh
<TD-Linux> link?
<adamgreig> technically llvm is allowed to insert reads to any dereferenceable reference, so you might end up with an unexpected read action on a mmio which is read sensitive
<adamgreig> only way around is to never construct a reference to mmio, so only do explicit pointer-based atomic reads
<adamgreig> that's sort of irrelevant though
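A sketch of the reference-free access pattern adamgreig describes (the addresses and register names are invented, not a real device): keep MMIO behind raw pointers and use read_volatile / write_volatile, so no Rust reference to a read-sensitive register ever exists for the compiler to speculate a load through. It is a compile-only sketch; running it against these addresses on a host would fault.

    use core::ptr;

    // Hypothetical read-sensitive peripheral; addresses are illustrative only.
    const FIFO_DATA: usize = 0x4000_1000; // reading pops the hardware FIFO
    const FIFO_CTRL: usize = 0x4000_1004;

    fn fifo_enable() {
        // Write through a raw pointer: no &mut u32 to the register is ever
        // created, so there is nothing "dereferenceable" for LLVM to
        // speculate a read through.
        unsafe { ptr::write_volatile(FIFO_CTRL as *mut u32, 1) };
    }

    fn fifo_pop() -> u32 {
        // Exactly one volatile load per call, never elided or reordered
        // relative to other volatile accesses.
        unsafe { ptr::read_volatile(FIFO_DATA as *const u32) }
    }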
mumptai has joined ##openfpga
rohitksingh has joined ##openfpga
rohitksingh has quit [Ping timeout: 240 seconds]
<tnt> So, correct me if I'm wrong, but there is no way in the ice40 (using a single pll, target is up5k) to have two clock outputs of the same frequency with a dynamically variable phase shift between them.
<tnt> Ok, nobody corrected me so ... damnit.
<levi> Wouldn't take that as definitive evidence; no one's said anything.
<kc8apf> Finally sorted out all the dependencies to build jtaghal. C++ dependencies are such a mess
Asu has quit [Quit: Konversation terminated!]
<tnt> So in the python wrapper, doing PyErr_SetString(...) isn't enough to raise an exception, the function actually needs to return ...
<tpw_rules> tnt: have you consulted the manual
<tnt> tpw_rules: the manual ?
<tpw_rules> lattice tech note 1251
<tnt> oh of the pll you mean ?
<tpw_rules> "iCE40 sysCLOCK PLL Design and Usage Guide"
<tnt> Well it doesn't even list the two PLL outputs ...
<tnt> and yeah, obviously I looked at it, and I couldn't find any way to do what I want.
<tpw_rules> then it must not be possible
<tpw_rules> it says tho "The PLL provides two optional fine delay adjustment blocks that control the delay of the PLLOUT output relative to the input reference clock, to an external feedback signal, or relative to the selected quadrant phase shifted clock."
<tnt> what's possible is way different than what's in the lattice docs ...
<tpw_rules> i'm not sure what you're proposing
<tnt> well in _this_ case nothing. But my point is that not being in the lattice docs is not exactly enough for me to conclude something is not possible. (like for instance outputting two clocks 90 deg apart. Nowhere in TN1251 will it tell you that you can do it, but you can ...)
<tpw_rules> maybe that's the wrong note
<tpw_rules> because i remember finding out that fact from a tech note and using it
<tnt> the icecube UI allows you to do that.
<tnt> The FPGA library reference also references the two PLL outputs.
<tpw_rules> there is a document on it: FPGA-TN-02052-1.0
mumptai has quit [Remote host closed the connection]
<tpw_rules> hmm, maybe not?
<tpw_rules> page 101
<tpw_rules> it looks like you can do it, as long as the pll's frequency is the same as the input frequency
<tnt> yeah, using bypass mode. Not an option, I need a synthesized clock.
<tpw_rules> the 2F variety might be able to do it
<tpw_rules> page 104
<tnt> lattice download ... 37 min left ...
<tnt> I don't understand how their website can be so bad.
<tpw_rules> it went okay for me, but yeah it is amazing
<tnt> But anyway, I know the 2F variant. And unless I'm missing something, I don't see how to achieve that. (which is why I asked here in the first place in case I missed something or misread ...)
<tpw_rules> so the dynamicdelay delays both pins the same?
<tnt> GENCLK or GENCLK_HALF don't have the dynamic delay applied. SHIFTREG_{0,90} have the delay applied but are 1:4 of the frequency of the GENCLK output. And there are only two dynamic delay controls: 1 in the feedback path to control phase wrt the input clock, and 1 in the output path.
<tpw_rules> and the output goes to both A and B
<tnt> So the "best" I could possibly do is output shiftreg_0 and then genclk_half and post divide genclk_half in the fabric.
<tnt> but in my case genclk_half would be 280 MHz which is a tad high for a UP5k.
<tpw_rules> alright. sorry for wasting your time
<tnt> np, appreciate having a second pair of eyes on it.
renze has quit [Quit: Spaceserver reboot?!]
renze has joined ##openfpga
Bike has joined ##openfpga
unixb0y has quit [Ping timeout: 258 seconds]