<whitequark>
I'm using a type system with types dependent on both types and values, but I place an additional restriction that no type variable should survive optimization
<whitequark>
i.e. every type must only depend on constants.
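(A hypothetical sketch of what that could look like in practice; `Unsigned' and the constants below are invented for illustration, not the compiler's actual API:)

    WIDTH = 16                        # a compile-time constant

    def Unsigned(width)
      # a class (i.e. a type) parameterized by a value
      Class.new do
        define_singleton_method(:width) { width }
      end
    end

    Counter = Unsigned(WIDTH)         # fine: WIDTH folds to a constant
    # Counter = Unsigned(gets.to_i)   # rejected: `width' would survive
    #                                 # optimization as a type variable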
<micahjohnston_>
interesting choice of language to build this on top of
<micahjohnston_>
but my interest is very piqued
<whitequark>
micahjohnston_: the rationale was very simple
<whitequark>
Ruby is quite mature, and features extensive metaprogramming facilities which are somewhat widely accepted
<whitequark>
I don't really know another language satisfying these conditions
<whitequark>
(if we only look at the TIOBE top 20.)
<micahjohnston_>
well, they're all runtime, so if your type system stuff allows it all to be compile time, I am definitely interested
<micahjohnston_>
or else it doesn't really seem like a systems language
<whitequark>
heh. optimizing metaprogramming to be fast at runtime is hard. I don't like hard problems, so I simply avoid them
<whitequark>
the compilation consists of two phases.
<whitequark>
in the first phase, the compiler essentially embeds a Ruby interpreter and a more or less regular Ruby runtime with all the well-known dynamism.
<whitequark>
of course, the semantics of the target are completely preserved; we don't need yet another autotools problem.
<whitequark>
in the second one, the compiler begins analysis of the resulting image (think of it as a Smalltalk virtual image) from several defined entry points
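(For instance, a made-up but ordinary piece of load-time metaprogramming that phase one would fully evaluate:)

    # phase one: the embedded interpreter runs this like any Ruby runtime
    class Port
      def initialize
        @state = {}
      end

      %w[rx tx err].each do |name|
        define_method(name)       { @state[name] }
        define_method("#{name}=") { |v| @state[name] = v }
      end
    end

    # by the time phase two analyzes the image from its entry points,
    # Port simply *has* rx/tx/err accessors; no define_method is left to run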
<whitequark>
the actual runtime semantics are quite restricted, featuring very little metaprogramming. There is however a reason this will still be very convenient, and it's quite simple.
<whitequark>
real-world, production-quality Ruby programs aren't very dynamic after they load all their dependencies. Even more strongly, most call sites are monomorphic
<whitequark>
(which is the main reason they can be efficiently optimized by runtimes like JRuby)
<whitequark>
There are however some call sites which are inherently megamorphic, such as internal `yield' in Array#each.
<whitequark>
CLR solves this by adding generic classes; the JVM doesn't try to solve this problem per se, as it, very unfortunately, has type erasure, but it can improve their performance by inlining these methods into their immediate callers
<whitequark>
after which in most cases the call site becomes monomorphic
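(Concretely, in plain Ruby:)

    xs = [1, 2, 3]
    ys = %w[a b]

    # the single `yield' inside Array#each sees *every* block ever passed
    # to #each, so that one call site is inherently megamorphic:
    xs.each { |x| puts x + 1 }      # block A
    ys.each { |y| puts y.upcase }   # block B

    # once #each is inlined into each caller, every inlined copy of
    # `yield' sees exactly one block, i.e. becomes monomorphic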
<micahjohnston_>
oh yeah, I've heard that about ruby programs, that most dynamism happens in setup
<micahjohnston_>
that's a really cool idea, creating an image by running fully privileged code
<micahjohnston_>
kind of reminds me of something in paws, actually
<micahjohnston_>
the interpretive preprocessor
<whitequark>
Ole Agesen achieved some very impressive results applying this to Self programs
<micahjohnston_>
fully privileged paws code is run on the ast
<micahjohnston_>
kind of lisp macros but like
<whitequark>
yea
<micahjohnston_>
not so segregated
<micahjohnston_>
I guess
<whitequark>
ruby's metaprogramming can sometimes be thought of as a poor man's macros
<micahjohnston_>
oh cool, there seems to be a lot of awesome research about self and slate and those
<micahjohnston_>
that I'm not too familiar with
<whitequark>
Ole Agesen's thesis is really awesome. It also features an extensive overview of prior art.
<whitequark>
Ok. So.
<whitequark>
Back to bidi-DFA.
<whitequark>
First I should explain one thing which is, I think, somewhat unusual for compilers in general.
<whitequark>
In my compiler, optimizations (at least some of them) are not optional, but on the contrary, are actually guaranteed.
<whitequark>
I.e. whatever you write inside `if false; end' will not only be stripped from the resulting binary
<whitequark>
but the compiler will explicitly not analyze that code and thus will skip any semantic errors in it.
<whitequark>
Given that SCCP [sparse conditional constant propagation] is also mandatory, this allows for writing code which is guaranteed to compile to a very simple instruction sequence in the common case [with a constant operand], but gracefully degrades to a more complex one in the uncommon case.
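(A sketch of the pattern this enables; `delay' and its helpers are invented for illustration:)

    # with dead code elimination and SCCP guaranteed, delay(100) must
    # compile down to just the cheap path; the branch survives only
    # when `usec' is not a compile-time constant
    def delay(usec)
      if usec < 1000        # folds to a constant whenever `usec' does
        busy_wait(usec)     # hypothetical simple instruction sequence
      else
        timer_wait(usec)    # hypothetical more complex path
      end
    end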
<whitequark>
The key word here is `guaranteed'. If I ever have a spec, it will require and specify this behavior.
<micahjohnston_>
oh that's cool
<whitequark>
("a spec" as in "a formal separate document describing the language and runtime.")
<micahjohnston_>
I really like this idea
<micahjohnston_>
I have a feeling that elliott would get over his dislike of staticness for this idea too
<whitequark>
hm, what's next...
<whitequark>
The types in the type system are just Ruby classes, and thus values. In the interpreted phase, you can freely manipulate them however you want.
<whitequark>
Want to construct a type that depends on some computation, say a piece of code reading an external file? No prob.
<whitequark>
That's just Ruby code.
<whitequark>
This allows implementing behavior which would otherwise have to be part of the language in the stdlib, in pure Ruby.
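(For example, a hypothetical phase-one snippet; the file name and its layout are invented:)

    require 'json'

    # phase one is plain Ruby, so constructing a type may involve I/O
    fields = JSON.parse(File.read("uart_regs.json"))  # e.g. {"baud"=>16, "ctrl"=>8}

    UartRegs = Class.new do
      fields.each do |name, width|
        attr_accessor name
        define_singleton_method("#{name}_width") { width }
      end
    end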
<micahjohnston_>
are you adopting ruby's syntax, warts and all?
<whitequark>
In particular I have a 100 LOC implementation of a DSL for atomically setting bitfields in registers which looks like this: "PERIA.regfoo = { a: 10, b: :bar }"
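(I can only guess at the implementation, but roughly what such an assignment could expand to; every register and field constant below is invented:)

    # hypothetical expansion of `PERIA.regfoo = { a: 10, b: :bar }':
    # with the field layout constant, SCCP folds all masks and shifts,
    # leaving a single atomic read-modify-write
    old = PERIA.read(REGFOO)
    new = (old & ~(A_MASK | B_MASK)) | (10 << A_SHIFT) | (BAR << B_SHIFT)
    PERIA.write(REGFOO, new)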
<micahjohnston_>
that's hardcore
<whitequark>
you won't believe how many hoops you'd need to jump through to achieve the same result in C++
<whitequark>
and I'm not sure if you even can do that at all, much less portably
<whitequark>
in C you can't.
<whitequark>
Ada features this as syntax.
<whitequark>
Oh also, have I mentioned that you can write compiler plugins which operate on both HIR [high-level intermediate representation] and LIR [low-level intermediate representation]?
<micahjohnston_>
I like the arbitrary precision integer types
<micahjohnston_>
oh that sounds very nice
<whitequark>
The former is equivalent to macros; the latter allows for things like precise edge profiling.
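(Purely as illustration, one possible shape for a LIR edge-profiling plugin; this pass API is invented, not the compiler's actual one:)

    class EdgeProfiler
      def run(func)
        func.basic_blocks.each do |bb|
          bb.successors.each do |succ|
            # split the CFG edge bb -> succ and bump a per-edge counter there
            func.insert_on_edge(bb, succ) do |builder|
              builder.increment_counter("#{bb.name}->#{succ.name}")
            end
          end
        end
      end
    end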
<micahjohnston_>
this sounds like it could be easy to write programs that already run fast
<micahjohnston_>
like, competing with C speed without effort
<micahjohnston_>
idk
<whitequark>
pretty much.
<whitequark>
you *have* to compete with C to gain anything in embedded :)
<micahjohnston_>
oh also, the debugger/analyzer alexgordon was talking about could easily be implemented within the language or as a compiler plugin, it seems
<whitequark>
yea, debugger would be a compiler plugin
<whitequark>
there is some quite complicated machinery required to make it work with LLVM over remote targets which possibly execute code from ROM and don't have JTAG
<whitequark>
but back to typeclasses!
<whitequark>
as I've said, I need to perform bidirectional DFA to infer all types
<whitequark>
the most obvious reason to have backwards DFA is to infer a block's type signature from its usage.
<whitequark>
I'm however going to extend that concept and add something I'd call "ad-hoc typeclasses".
<whitequark>
So, if a certain method receives a variable `a' and calls `a.b(10)' and then `a.c()', then, in the absence of more specific info about a's type (essentially, when a's type is a free type variable), the inferencer can deduce a partial type by going backwards from usage to definition
<whitequark>
Something along the lines of {b<Integer,*>; c<*>}
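(In code, using plain Ruby; the comment repeats the notation above:)

    def frob(a)
      a.b(10)   # a must respond to b(Integer)
      a.c       # ...and to c()
    end
    # inferred ad-hoc typeclass for a: {b<Integer,*>; c<*>}

    # any class structurally satisfying that partial type is acceptable:
    class Gadget
      def b(n); puts n * 2; end
      def c;    puts "c";   end
    end
    frob(Gadget.new)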
<micahjohnston_>
nice
<micahjohnston_>
so what exactly is bidirectional dfa?
<micahjohnston_>
you define a dfa to analyze the type of an expression, so you can run it backwards?
<whitequark>
The rationale for this behavior is as follows: sometimes the type inferencer cannot construct a single fully qualified type in just one iteration
<whitequark>
so it performs several and combines the results
<whitequark>
micahjohnston_: um, no
<micahjohnston_>
I love static duck typing things like this
<whitequark>
bidirectional dataflow analysis is a kind of DFA which propagates type information in both directions: through use-def edges and def-use ones
<micahjohnston_>
oh, data flow analysis
<micahjohnston_>
thought you meant automata, haha
<whitequark>
there's some recent-ish research on how to perform it efficiently: Static Single Information form
<micahjohnston_>
yeah right now I'm writing a functional reactive language so I convert ASTs to data flow graphs and then analyze/compile them
<whitequark>
if in SSA you have phi nodes which "join" different definitions, then in SSI you also have sigma nodes which "split" them prior to control flow divergence
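(A tiny example of what sigma nodes buy; Ruby source with the SSI form sketched in comments:)

    x = rand(-5..5)
    if x > 0
      y = x      # this arm sees a version of x known to be positive
    else
      y = -x     # this arm sees a version known to be non-positive
    end

    # SSI (pseudo-IR):
    #   x2, x3 = sigma(x1)      ; split x1 where control flow diverges
    #   then:  y1 = x2          ; x2 carries the fact x2 > 0
    #   else:  y2 = neg(x3)     ; x3 carries the fact x3 <= 0
    #   merge: y3 = phi(y1, y2) ; the usual SSA join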
<micahjohnston_>
hmmm
<whitequark>
that's details though
<micahjohnston_>
I don't quite remember how the whole phi thing works
<micahjohnston_>
I was going to compile to llvm once but then I didn't :P
* whitequark
shrugs
<purr>
¯\(º_o)/¯
<micahjohnston_>
haha
<whitequark>
you can just use stack slots and then do mem2reg
<whitequark>
llvm will run dominance frontier calculation for you
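(For instance, in standard LLVM IR; the comments show what the mem2reg pass does to it:)

    define i32 @abs(i32 %x) {
    entry:
      %slot = alloca i32          ; a stack slot instead of SSA values
      %neg = icmp slt i32 %x, 0
      br i1 %neg, label %then, label %else
    then:
      %m = sub i32 0, %x
      store i32 %m, i32* %slot
      br label %done
    else:
      store i32 %x, i32* %slot
      br label %done
    done:
      ; mem2reg deletes %slot and rewrites this load as
      ;   %r = phi i32 [ %m, %then ], [ %x, %else ]
      ; placing the phi exactly at the stores' dominance frontier
      %r = load i32, i32* %slot
      ret i32 %r
    }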
<micahjohnston_>
I know some of those words
<whitequark>
sigh
<micahjohnston_>
sorry I am a compiler-theory philistine :P
<micahjohnston_>
ah I understand dominance frontiers now
<micahjohnston_>
thanks wikipedia
<micahjohnston_>
<3 graph theory
<purr>
Let it be known that micahjohnston_ hearts graph theory.
<micahjohnston_>
is mem2reg an optimization that takes llvm code using memory and makes it use registers?
<whitequark>
yea
<micahjohnston_>
cool
<whitequark>
spares you from having to construct ssa
<micahjohnston_>
oh nice
<micahjohnston_>
so in llvm "registers" are the ssa names?
<whitequark>
yes
<whitequark>
I think it's just the transform which is named weirdly
<micahjohnston_>
ok nice
<micahjohnston_>
so an SSA graph is basically a data flow graph, right?
<whitequark>
actually you can more or less throw any crap which passes verification at llvm and it will make sense of it :3
<micahjohnston_>
like, data flow where nodes are statements?
<micahjohnston_>
well, instructions
<whitequark>
micahjohnston_: combined CFG+DFG
<whitequark>
you also have basic blocks
<micahjohnston_>
what's DFG?
<whitequark>
data flow graph
<micahjohnston_>
ahh ok
<micahjohnston_>
control flow graph + dataflow graph
<whitequark>
well you don't *have* to build a DFG there, but it's convenient, LLVM does it and so do I
<micahjohnston_>
I thought you were saying context free grammar at first, haha
<micahjohnston_>
man there is a lot of acronym overloading in this field
<whitequark>
sure thing
<micahjohnston_>
ok so you use phi at dominance frontiers?
<whitequark>
well kinda, placing phi nodes at dominance frontiers is optimal in terms of time/space
<micahjohnston_>
ok
<micahjohnston_>
this is cool
<micahjohnston_>
well, I'm off to dinner
<micahjohnston_>
be back in a bit probably
<whitequark>
'k
<whitequark>
though it's 6AM here
* whitequark
zzzz
<micahjohnston_>
oh, we're just about 12 hours apart :P
<Nuck>
I have a theory
<Nuck>
That elliott brought whitequark in to bring back micahjohnston from his not-programming.
<micahjohnston_>
lol
<purr>
lol
<micahjohnston_>
I have been programming a lot recently
<micahjohnston_>
but it's just been a game in C++ and a level editor that runs in the browser
<micahjohnston_>
no language dev
alexgordon has quit [Quit: Computer has gone to sleep.]
bulldog98 has joined #elliottcable
bulldog98_ has quit [Ping timeout: 248 seconds]
<purr>
<elliottcable> oh gods eboyjr is a progress bar
sephr has quit [Quit: Leaving]
<purr>
<malia> Jenna is jenna and your friend and that is cool
bulldog98 has quit [Remote host closed the connection]
<purr>
<Nuck> $20 on devyn turning out gay because of elliottcable
alexgordon has joined #elliottcable
<Nuck>
I really hope Google and Tesla team up to make the ultimate self-driving electric car
<Nuck>
Cause that'd be REALLY cool
<Nuck>
And I can't imagine a better team
<purr>
<devyn> yes, dad, I am paws
<whitequark>
$ git commit --amen
<alexgordon>
this may be the ugliest code I've ever written