kori has quit [Read error: Connection reset by peer]
kori has joined #forth
tabemann has joined #forth
jsoft has joined #forth
<tabemann>
hey guys
<Croran>
Hi Tabemann
X-Scale` has joined #forth
X-Scale has quit [Ping timeout: 258 seconds]
X-Scale` is now known as X-Scale
tabemann has quit [Ping timeout: 240 seconds]
boru` has joined #forth
boru has quit [Disconnected by services]
boru` is now known as boru
proteus-guy has joined #forth
tabemann has joined #forth
tp has quit [Read error: Connection reset by peer]
tp has joined #forth
tp has quit [Changing host]
tp has joined #forth
rdrop-exit has joined #forth
<tabemann>
wahoo - I fixed my problems with loops!
<tabemann>
it turned out that the PC is offset by 4 in Thumb-2
<rdrop-exit>
kudos!
<tabemann>
so I was miscalculating my jump offsets
<tp>
ahh
<tp>
thats because 4 8 * = 32
<tp>
?
<tabemann>
that's because the processor does instruction lookahead
<tabemann>
so the PC is always the current instruction's address plus 4
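The offset arithmetic described here can be sketched as follows. This is a minimal illustration and not zeptoforth's actual code; the function names are made up, and only the 16-bit unconditional branch (encoding T2 of B) is shown.

```python
def thumb_branch_offset(branch_addr: int, target_addr: int) -> int:
    """Offset to encode for a Thumb branch at branch_addr jumping to target_addr."""
    pc = branch_addr + 4  # in Thumb state the PC reads as instruction address + 4
    return target_addr - pc


def encode_b_t2(branch_addr: int, target_addr: int) -> int:
    """Encode a 16-bit unconditional branch (B, encoding T2) as one halfword."""
    offset = thumb_branch_offset(branch_addr, target_addr)
    if offset % 2 != 0:
        raise ValueError("branch target must be halfword aligned")
    if not -2048 <= offset <= 2046:
        raise ValueError("target out of range for a 16-bit branch")
    imm11 = (offset >> 1) & 0x7FF  # offset is stored in halfwords, 11 bits, signed
    return 0xE000 | imm11
```

Forgetting the +4 makes every jump land two halfwords away from where it was meant to go, which matches the kind of loop bug mentioned above.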
<tp>
I know I have to add 4 to the pc when reading 32 bit words
<tp>
oops
<tp>
I mean add 4 to the address!
<rdrop-exit>
that's how PCs usually work, they point to the next instruction while the current instruction is executed
<tp>
I have no idea what my PC is doing
<tabemann>
rdrop-exit: for Thumb-2, though, instructions can be 16 or 32 bit
<tabemann>
so with 16-bit instructions, it actually points to the instruction *after* the next instruction
<rdrop-exit>
that sounds wonky, when is it adjusted back?
<rdrop-exit>
(i.e. at what point in the instruction cycle?)
<tabemann>
the reason why is that it fetches 16 bits, and then fetches another 16 bits, and decides whether the instruction is to be 16-bit or 32-bit
<tabemann>
the reason why it does two fetches is that 32 bit instructions are only 16 bit aligned
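For reference, the 16-bit versus 32-bit decision can be made from the first halfword alone: in the Thumb-2 encoding, a first halfword whose top five bits are 0b11101, 0b11110 or 0b11111 starts a 32-bit instruction. A small sketch, with a hypothetical function name:

```python
def thumb_instruction_size(first_halfword: int) -> int:
    """Return the instruction size in bytes, judged from its first halfword."""
    top5 = (first_halfword >> 11) & 0x1F
    # These three prefixes mark the first half of a 32-bit Thumb-2 instruction.
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2
```

How a given core actually schedules its fetches is an implementation detail; the sketch only covers the encoding rule.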
<rdrop-exit>
aha
<rdrop-exit>
thanks, my Google-fu didn't bring up a quick diagram of the ARM instruction cycle
<rdrop-exit>
actually, my duckduckgo-fu
<rdrop-exit>
thumb seems a hack
<tabemann>
thumb is annoying
<tp>
probably is, as it started life as the acorn
<tabemann>
the thing about thumb is it isn't consistent
<rdrop-exit>
the way RISC-V handles this aspect is much better
<tabemann>
like why in hell are instructions that set the status register typically 16 bit and instructions that do not are typically 32 bit
<tp>
the good thing about Forth is it makes thumb skills unnecessary usually
<tabemann>
I can't decide whether to implement inlining or fix string handling next
<tp>
apparently ALL internal ARM instructions are 32 bit anyway
<rdrop-exit>
what's fix string handling?
<tabemann>
rdrop-exit: words like .( and ." are horrifically broken in zeptoforth ATM
<rdrop-exit>
ah, string literals
<tp>
gtg, back in a few hours, cya folks!
<tabemann>
see ya tp
<rdrop-exit>
ciao tp!
<rdrop-exit>
IIRC "standard" Forth doesn't provide explicit inlining related words
<tabemann>
the thing is that RAM to Flash calls in zeptoforth are really expensive
<rdrop-exit>
Some Forths do implicit inlining in their optimizer, I prefer explicit inlining
<tabemann>
so a significant gain can be obtained through inlining common builtin words
<tabemann>
a single call from RAM to Flash is 10 bytes
<rdrop-exit>
inlining can also be useful for factoring return stack related code
<tabemann>
it'd also make it cheaper processing-time-wise, as you say, because then I don't need to push and pop return addresses
<rdrop-exit>
might be cheaper for you to load code from flash into ram at startup
<tabemann>
rdrop-exit: yes, Flash to Flash calls are far cheaper, only being 4 bytes
<rdrop-exit>
explicit inlining is very simple to implement, I've never bothered with implicit inlining
<tabemann>
I could probably implement implicit inlining pretty easily
<rdrop-exit>
implicit requires too much special handling
<tabemann>
when compiling a word, check to see if any calls are made in the word, or if the word is over a certain size
<tabemann>
if neither are true, set a flag in the word
<tabemann>
when the word is finalized
<tabemann>
then, when compiling that word into another word
<tabemann>
check for that flag
<tabemann>
if it is set
<tabemann>
strip the push {lr} and pop {pc} instructions from the start and end of the word
<tabemann>
and insert it into the word being compiled
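The steps above can be sketched roughly like this. It is a toy model under stated assumptions, not zeptoforth's implementation: a word is a list of 16-bit halfwords, and `makes_calls` and `max_halfwords` are hypothetical stand-ins for the checks made while the word is compiled.

```python
PUSH_LR = 0xB500  # Thumb encoding of push {lr}
POP_PC = 0xBD00   # Thumb encoding of pop {pc}


def inlinable_body(code, makes_calls, max_halfwords=8):
    """Return the halfwords to copy inline, or None if the word must be called."""
    if makes_calls or len(code) < 2 or len(code) > max_halfwords:
        return None
    if code[0] != PUSH_LR or code[-1] != POP_PC:
        return None
    body = code[1:-1]  # strip the push {lr} prologue and pop {pc} epilogue
    # A second pop {pc} pattern suggests an early exit, so refuse to inline.
    # This is conservative: it can also match half of a 32-bit instruction.
    if POP_PC in body:
        return None
    return body
```

The early-exit check is the weak point of the implicit approach: the compiler has to guess what the word does with the return stack, and the guess here simply errs on the side of calling.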
<rdrop-exit>
you're assuming the word doesn't do anything special with the return stack
<tabemann>
that's true
<rdrop-exit>
if the word has early exits for example
<tabemann>
I'm better off only doing that if an inline flag is set explicitly by the user
<rdrop-exit>
implicit requires understanding what the word is up to
<rdrop-exit>
explicit doesn't
<tabemann>
with explicit I'll probably still have a check for whether any calls are made
<rdrop-exit>
it's just you explicitly saying this word is for inlining rather than calling
<rdrop-exit>
much simpler
<rdrop-exit>
I use ;inline
<rdrop-exit>
e.g. : foo ... ;inline
<rdrop-exit>
that's me explicitly indicating to the compiler that this word should be inlined at compile time
<rdrop-exit>
the compiler doesn't require any smarts, I told it what I want
<rdrop-exit>
COMPILE, takes care of the actual inlining
<rdrop-exit>
it's up to me to make sure I know what I want
<rdrop-exit>
and that it makes sense
<rdrop-exit>
If the inline-able word is not compile-only then it should have a ret at the end that doesn't get inlined
<rdrop-exit>
that way you can interpret it normally
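A toy model of the explicit scheme described above, assuming a word is just a list of named operations ending in "ret". The `Word` class and `compile_comma` function are illustrative names, not any particular Forth's internals.

```python
from dataclasses import dataclass, field


@dataclass
class Word:
    name: str
    body: list = field(default_factory=list)  # operations, ending in "ret"
    inline: bool = False                       # set by a ;inline style terminator


def compile_comma(target: Word, word: Word) -> None:
    """Model of COMPILE, : copy inline-flagged words, compile a call otherwise."""
    if word.inline:
        target.body.extend(word.body[:-1])  # everything except the final ret
    else:
        target.body.append(f"call {word.name}")
```

Because the flagged word keeps its own ret, it can still be executed normally from the interpreter; only the copy made by `compile_comma` leaves the ret out.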
<rdrop-exit>
I find implicit inlining has too many gotchas which require too much smarts to be built into the optimizer/compiler, and then you need workarounds when the smarts get in the way of what you actually need
<rdrop-exit>
But that's just my personal take on inlining in Forth
<tabemann>
back
<rdrop-exit>
wb :)
<tabemann>
yeah, I'm just going to do explicit
<rdrop-exit>
I posted a description of one approach to explicit inlining on reddit a while back
<rdrop-exit>
the most difficult part of x11 so far is figuring out what's unofficially deprecated
<tabemann>
back
<tabemann>
like server-side font rendering
<rdrop-exit>
WB
<tabemann>
apparently everyone just renders their fonts client-side these days, and then sends over the rendered fonts as images
<tabemann>
you'd better familiarize yourself with the likes of freetype2
<tabemann>
(which, if you're doing this all in forth, requires writing either an FFI layer, or a separate process that offloads font rendering)
<rdrop-exit>
I'm planning on using my own raster font, won't need to deal with that, i'll just be pushing pixmaps to the server with the font already rendered
<rdrop-exit>
X won't have anything to do with my fonts
<rdrop-exit>
my window will be fixed-sized so no rescaling of fonts required
<rdrop-exit>
X will only receive pixels to put on the screen
<rdrop-exit>
I'm not planning on using any libraries, just the x11 wire protocol
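To give a feel for what "just the wire protocol" involves, here is a sketch of the first message a client sends after connecting, the connection setup request. The function name is made up, and the sketch only builds the bytes; opening the socket and parsing the server's reply are left out.

```python
import struct


def x11_setup_request(auth_name: bytes = b"", auth_data: bytes = b"") -> bytes:
    """Build the client's connection setup request, little-endian byte order."""

    def pad4(data: bytes) -> bytes:
        # Variable-length fields are padded to a multiple of 4 bytes.
        return data + b"\x00" * (-len(data) % 4)

    header = struct.pack(
        "<BxHHHHxx",
        0x6C,            # byte order: 0x6C = least significant byte first
        11,              # protocol major version
        0,               # protocol minor version
        len(auth_name),  # length of authorization protocol name
        len(auth_data),  # length of authorization protocol data
    )
    return header + pad4(auth_name) + pad4(auth_data)
```

With no authorization this is a fixed 12-byte message, which is easy to produce from Forth without any library support.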
<rdrop-exit>
my needs are simple enough that I think I can get away with ignoring big chunks of the X ecosystem
<rdrop-exit>
I do need to figure out which protocol extensions I need though, probably the Sync one, the newer keyboard one, and if performance is too slow I might have no choice but to look into the shared memory extension
<rdrop-exit>
apparently the double buffering extension is deprecated, but there are simpler ways of accomplishing its purpose
<rdrop-exit>
I'm spending most of my time figuring out what can or should be ignored, so much cruft
<tabemann>
yess!! inlining works!
<rdrop-exit>
bravo! :)
<rdrop-exit>
late lunch, catch you later :)
rdrop-exit has quit [Quit: Lost terminal]
WickedShell has quit [Remote host closed the connection]
tp has quit [Read error: Connection reset by peer]
tp has joined #forth
tp has joined #forth
tp has quit [Changing host]
nonlinear has joined #forth
dddddd has quit [Ping timeout: 255 seconds]
dys has joined #forth
<veltas>
tabemann: That's how Z80's PC works as well for relative jumping
<veltas>
The relative offset is referred to as 'e', but the stored value in machine code is e-2
<veltas>
And the range of e is -126 to 129
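The Z80 arithmetic works out as in this small sketch (the function name is illustrative): the byte stored after the opcode is e-2 as a signed 8-bit value, which is where the -126 to 129 range for e comes from.

```python
def encode_jr_displacement(e: int) -> int:
    """Return the displacement byte stored after a Z80 relative jump opcode."""
    stored = e - 2  # the PC has already advanced past the 2-byte instruction
    if not -128 <= stored <= 127:
        raise ValueError("e must be between -126 and 129")
    return stored & 0xFF  # two's complement byte
```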
rdrop-exit has joined #forth
<rdrop-exit>
veltas, it seems ARM thumb works differently from what tabemann described
<rdrop-exit>
something about the intermixing of 16-bit and 32-bit instructions
<veltas>
Right
<veltas>
I thought thumb was all 16-bit instructions
<rdrop-exit>
I have just about zero ARM knowledge
<veltas>
Me too but that is part of my approx. 0
<tp>
hey guys
<rdrop-exit>
I remember reading in the RISC-V docs that ARM's thumb is a separate ISA, while with RISC-V 16 bits is just an optional extension
<tp>
arm uses 32 bits internally always
<rdrop-exit>
hello Forth Master Technician (tm)!
<tp>
hey rdrop-exit, Zen Forth Guru!
<tp>
welcome back veltas!
<tp>
but thumb goes thru a code converter so that while the user code is thumb, it all winds up as 32 bit arm before executing in the cpu
<tp>
unlike risc-v you dont get a choice of 32 or 16 bit with cortex-m0 at least
<veltas>
I mean the instruction sizes
<rdrop-exit>
So if I understood correctly with ARM you're either in thumb mode or regular mode, while with RISC-V the compressed instructions are just extra instructions added to the base ISA
<veltas>
ARM is like 32-bit PowerPC in that everything is done with 32-bit words, right?
<veltas>
Going to work!
<tp>
rdrop-exit, with cortex-m0 there is no 'regular' mode, it's thumb or nothing
<tp>
even tho deep in the guts of the cpu, it's all 32 bit
<tp>
i think m3 is the same
<rdrop-exit>
some chips may be limited to one ISA or the other, but the point I think is that it's not a superset/subset relation
<rdrop-exit>
here's a quote from some of the RISC-V lit:
<rdrop-exit>
"
<rdrop-exit>
RV32I instructions are indistinguishable in RV32IC. Thumb-2 is actually a separate ISA with 16-bit instructions plus most but not all of ARMv7. For example, Compare and Branch on Zero is in Thumb-2 but not ARMv7, and vice versa for Reverse Subtract with Carry."
<tp>
Cortex-M use the 32/16 bit Thumb2 instruction set, except the M0/M0+ which use almost pure Thumb1 16 bit instructions with just a few system management 32 bit instructions. Choose Thumb2 or ARMv7-M for both. They don’t support original ARM instructions at all.
<rdrop-exit>
the point I think is that there is no superset/subset relation between non-thumb and thumb ISAs
<tp>
i think risc-v probably has a lot of modern advantages but it's still new
<tp>
rdrop-exit, agreed
<tp>
for instance Mecrisp-Stellaris on STM32F103 (cortex-m3) is faster than mecrisp-quintus on the gd32VF103 risc-v
<rdrop-exit>
yes, RV is definitely still wet behind the ears
<tp>
my GCD benchmark is the same for cortex-m3 at 75 MHz and risc-v at 104 MHz
<tp>
which surprised me
<tp>
by the same token, at 75 MHz, cortex-m3 is 3.5x faster than cortex-m0 at the same speed with the same benchmark
<tp>
I'm not one for benchmarks anyway as I have all the speed I'll ever need with an m0 at 8 MHz
<rdrop-exit>
I would assume your ARM optimizer is more mature than your RV one
<rdrop-exit>
or is this a pure assembly benchmark?
<tp>
thats what Im assuming also
<tp>
no, theyre all Forth benchmarks
<tp>
mecrisp-quintus is fairly new compared to the arm versions
<rdrop-exit>
makes sense
<tp>
the arm versions are years old in fact
<tp>
the dodgy risc-v doc isnt helping either
<tp>
in comparison the STM/ARM documentation is a masterpiece of comprehensive and easy to read tech info
<X-Scale>
If you want to get a feel of the earlier ARM and Thumb, check this precious little doc from 1996
<rdrop-exit>
I've only read the standards for RV, not the docs of a particular implementation.
<rdrop-exit>
I found the standards to be fairly legible as far as such things go.
<tp>
I'm now forming the opinion that the Chinese Gigadevice company who make the GD32VF103 have copied a lot of the ARM doc
<tp>
and they have been quite slack about it
<rdrop-exit>
For peripherals that's not surprising
<tp>
for instance the GD32VF103 (risc-v) doc just copies stuff from the GD32F103 (cortex-m3) doc verbatim in some cases, yet they are utterly different ISAs
<tp>
thats what I assumed at first also
<tp>
I mean the GD32F103 is STM32F103 'compatible'
<tp>
both are M3 cores
<tp>
the GD32VF103 is a risc-v core
<tp>
they even have the same pinouts
<tp>
thanks X-Scale !
<tp>
I think all this copying will set GD back years because of all the confusion
<tp>
customers will just tire of all the wrong info and wasted time
<rdrop-exit>
It may be an RV core, but I imagine they're trying to ease adoption for current ARM users by making the rest as close as possible to what you'd expect from existing ARM-based products
<tp>
i dont think thats the reason
<rdrop-exit>
(by they, I mean Gigadevice)
<rdrop-exit>
cheaper for them too, if they make them as close as they can
<tp>
I think GD 'cloned' the STM32F103 peripherals so they could make immediate profits
<tp>
i mean if you clone a chip, mark it as the chip that you cloned, you can sell it immediately into established markets
<tp>
if you make your own version with your own part numbering the profits will take years longer to be realised
<tp>
as the market adopts it or not etc
<tp>
sadly I think GD had no Zen masters on the board because they didnt seem to realise that everything has two sides
<tp>
quick profits have resulted in a market that has little trust of GD
<rdrop-exit>
sure, that's the case with their ARM clone, but the RV is a different CPU
<tp>
thats right ... except ....
<rdrop-exit>
drum roll
<tp>
again to realise quick profits, GD used the same 'compatible' peripherals they used in their STM32F103 fakes
<tp>
I agree the GD32VF103 is a huge improvement, a chance for GD to go 'legit'
<tp>
my new stm32f103 diagnostics binary found a GD32F103 in a Chinese 'blue pill' board recently
<rdrop-exit>
neat
<tp>
the chip has poor quality markings but they are of a STM32F103
<tp>
so a blatant fake
<tp>
someone is relabelling GD32F103's in China as STM32F103's
<tp>
and putting them in 'blue pill' boards, but the Western buyers are now wise to this stuff
<rdrop-exit>
hopefully the RV market will mature, so that we see more than just clones of ARM-based products with the CPU switched out
<tp>
yeah, it's time for Chinese boards with their own Chinese MCUs, no fakes
<tp>
their prices are good, the chips seem fine
<tp>
the doc sucks, but that can improve
xek__ has joined #forth
<tp>
I also found out that Russian interests have licensed some ARM cores
<tp>
theyre making low volume variants legally, mainly rad hardened versions!
<tp>
so silicon on sapphire I guess
gravicappa has quit [Ping timeout: 265 seconds]
<rdrop-exit>
gotta walk the dogs, catch you later :)