avsm has joined #mirage
<avsm> noddy: does that explanation of strong/weak make sense?
<avsm> noddy: we need some way to expose the options i think, with a default "pick the best" choice
<cebka> avsm: why not use something like yarrow (or fortuna) to generate secure PRNG from all entropy sources available?
<avsm> this is entropy, not a prng
<avsm> could expose that as the difference, hm
<avsm> there's a fortuna in the tls stack that could be pulled out for the prng
djs55 has joined #mirage
<noddy> avsm: this being *entropy* is why i asked
<noddy> we need an environmental noise source here
<noddy> in particular, no noise source can be blocking or depleting
<noddy> and we do the best anyone knows how to do with whatever entropy is really in there
<avsm> Yes, but we might be stuck with a self-seeded source due to no other entropy
<avsm> These aren't quite noise sources
<avsm> any single noise source must fundamentally block sometimes (i.e. no interrupts are happening)
<noddy> honestly, i'd just coalesce the two into a single one and say "the environmental noise on xen does not carry a lot of entropy, yet"
<avsm> hm, no, I'm not happy with that — we're going to have 2 separate methods for some time (>1 year)
<avsm> a split-channel one that's external, and a self-seeded one that does the best it can
<noddy> yes and no; if you are for ex. hashing noise into accumulators, you read-reset the rolling hash and never block
<avsm> until the former goes upstream
<avsm> noddy: right, but we need to split that logic out into the Random device then
<noddy> so again - the entropy device is not expected to provide strong random, right?
<avsm> and then Entropy can never block
<noddy> what it should give is *noise*, and noise varies in quality depending on the platform
<avsm> yes, but what supplies random numbers then?
<noddy> your own state-of-the-art PRNG
<avsm> Fortuna is embedded in TLS atm isnt it?
<noddy> in nocrypto
<avsm> what i'm saying is — where is it for 2.0 :)
<noddy> the thing is, these are cryptographically distinct things
<noddy> entropy and rng
<noddy> in an entropy device, you expect noise, not random numbers
<noddy> in a random, you expect random numbers
<avsm> I realise that — but in the absence of a noise source on Xen, we're sort of stuck and have to provide something
<noddy> i thought that was the whole point of an "entropy" device; to expose noise as opposed to prng
<avsm> right now, it errors out
<avsm> so reverting to Random.self_init temporarily (i.e. the current behaviour), which is then seeded upstream by Fortuna seems reasonable to unblock it
<noddy> ok, what i'm trying to say: given the desired semantics, i think we can coalesce weak/strong back b/c the entropy, as opposed to rng, is always somewhere in between the two
<noddy> as in, with *entropy*, you have no guarantees on the actual distribution of output
<noddy> but anyone implementing a prng needs to know your sole guarantee - that it's environmental
<avsm> Isn't it worth distinguishing between external entropy and self-discovered ones though?
<noddy> they are both environment-based
<noddy> you can say "our xen noise is weak" but fundamentally they are the same
<avsm> no, one's channeled in from outside, the other one is observed from within
<avsm> and quite possibly clocked or otherwise aligned on something we can't observe
<avsm> but also our interface isn't quite right then, since we do block until we can get the right amount of entropy
<noddy> so - why does freebsd have a symlink /dev/urandom -> /dev/random but linux has two completely different things there?
<avsm> well, that's not entropy; it's a strong prng
<noddy> well /dev/random on linux is entropy
<noddy> and all the prngs involved are crypto-strong
<noddy> on both
<noddy> still, linux makes the distinction between noise and the endless stream
<noddy> but freebsd does not; why?
<avsm> no, /dev/random mixes on linux too i think
<noddy> it mixes, that's all
<noddy> it has running hash of noise
<noddy> a hash of noise still has the same amount of actual entropy, modulo size, as the noise itself
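A minimal sketch of the "running hash of noise" idea, with the read-reset behaviour noddy described earlier. All names here are hypothetical, and the standard library's `Digest` (MD5) is only a stand-in for whatever hash a real accumulator would use; this is not the mirage-entropy or nocrypto API.

```ocaml
(* Hypothetical accumulator; Digest (MD5) is a stand-in hash, chosen only
   because it ships with the OCaml standard library. *)
type acc = { mutable state : string }

let create () = { state = "" }

(* Fold a new measurement into the running hash. Never blocks. *)
let add acc sample = acc.state <- Digest.string (acc.state ^ sample)

(* Read-reset: hand out whatever has accumulated and start over.
   With no samples since the last read this returns the empty string. *)
let read_reset acc =
  let out = acc.state in
  acc.state <- "";
  out
```

The output is never larger than one digest no matter how many samples went in, which is the "modulo size" caveat: hashing compresses the noise, it does not add unpredictability.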
<avsm> my key point though, is that we have two things: crypto strong entropy, and really bad and quite deterministic entropy in xen
<avsm> i'd really like to expose that a unikernel might have a poor source
<noddy> my point: there is only one kind of entropy
<noddy> and several kinds of pseudorandom numbers
<noddy> entropy being environmental noise
<noddy> or something derived from it
<avsm> i dont see how you can call observed entropy from interrupt counters in the same class as random noise though?
<noddy> entropy is also never crypto strong because before mixing, it's statistically skewed
<avsm> yep, that's true
<noddy> it is
<noddy> it is random
<noddy> 'cept it's skewed
<noddy> and you need much more for the same unpredictability
<noddy> but basically noise is noise
<noddy> and some channels are noisier
<avsm> yeah, i agree with that; but then we have the wrong interface entirely in Entropy
<avsm> since it blocks
<noddy> it doesn't actually
<noddy> exactly because of this
<noddy> ( it uses urandom, not random, to emulate opportunistic noise on unix )
<avsm> argh, but then this is no longer entropy
<noddy> it is: anything derived from noise is at least as noisy as the noise
<avsm> and so the xen self_init is reasonable
<noddy> it has *more* properties we disregard
<noddy> so the use case:
<noddy> tls has a really effing good mixing thing for deriving strong crypto rng from _some_ noise
<noddy> and other people might have the same
<noddy> and now somehow the platform needs to expose the pure noise
<noddy> of some quality
<avsm> right, and how do we distinguish that quality?
<noddy> without imposing other rng properties on it, because some people need just noise
<avsm> yep, agreed
<noddy> we say "the best we have"
<noddy> for example, fortuna gracefully recovers in environment with low trickle of entropy
<avsm> but the point of exposing it is that if the unikernel needs to have top-class noise, then we can abort at build time
<noddy> the property is that it takes a little longer, but even with interrupts, it converges onto a "good" state
<avsm> this implies that the xen interrupt source needs to go through a mixer though
<noddy> you don't ever need "top-class" noise if you know how to build a prng
<noddy> yup
<noddy> and our mixer is fortuna
<avsm> so in other words, we need to merge the prng and entropy sources
<avsm> they aren't standalone at all
<noddy> i think the opposite
<noddy> we need the pure entropy
<noddy> and the prng
<noddy> and nocrypto in particular needs just the entropy
<noddy> just the noise from the environment
<avsm> we don't have such a noise device at the moment...
<avsm> we have:
<noddy> timers
<avsm> Unix: /dev/random
<noddy> ( urandom )
<avsm> Xen: interrupt, timers
<noddy> aha
<avsm> Xen: split channel
<avsm> (which proxies /dev/[u]random)
<noddy> and as long as we agree on the desired semantics of that device, these are ok implementations of alleged "noise"
<noddy> it's the blurb i'm worried about really: what is intended there?
<avsm> with the existing Entropy interface, Xen *must* have fortuna to provide noise
<noddy> and i'm saying we need it to intend to provide noise measurements, better or worse
<noddy> you can provide noise to whoever
<avsm> it has no polling interface to peek "give me 5 bytes if available"
<noddy> you have to have it to convert noise to random numbers
<noddy> yes
<noddy> and we have a system that intentionally does not rely on underlying platform's ability to generate crypto-strong rngs
<avsm> so in other words, we have to merge the Xen noise with the prng
<noddy> and intentionally just needs noise
<noddy> i think these are two complete separate issues here
<noddy> some people need random numbers
<noddy> some people mix their own but need the environment
<noddy> ... so i was quite happy with the entropy device, but then we need it to remain an *entropy* device and not become a prng device
<noddy> ( given that a well-seeded prng such as urandom *is* of sufficient entropy to play that role when available )
<avsm> From a code perspective, what do you want the Xen entropy device to expose?
<avsm> Sample interrupts, feed that into a fortuna, and then reexpose that as Entropy, right?
<noddy> *some* amount of bytes, without blocking, that are somewhat unpredictable
<noddy> nono
<noddy> entropy does not read fortuna
<noddy> just the measurements
<noddy> just *some* bytes read from interrupts and timers
<avsm> but the interface is broken then!
<noddy> why?
<noddy> why do you think that?
<avsm> entropy len must return the full len
<avsm> it has to support a short read
<noddy> that's why i asked about the semantics
<noddy> we can agree it needs to return only as much as it has too
<noddy> as long as the contract is noise, not prng
<avsm> in that case even something as simple as useconds from gettimeofday would work in the xen timer
<noddy> that is one of possible sources, indeed
<noddy> and one of the suggested sources in the fortuna documents too
<noddy> ( as long as there are other ofc )
<noddy> usecs from interrupts
<noddy> usecs from latencies
<noddy> *any* usecs we can get our hands on, really :)
<noddy> if any of those events contains mere two unpredictable bits, that's good
<noddy> and the prng recovers
<avsm> hm, the interrupt one can only be used once per interrupt
<noddy> still good
<avsm> so i guess on wakeup, we set a ref
<avsm> that can be Noned by the polling noise
<noddy> for ex, yeah
<noddy> so the interface can also be "*up to* N bytes of noise, as available *now*"
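A rough model of that contract, combining it with the "set a ref on wakeup, None it on poll" idea. Everything here is an assumption for illustration: the names `last_irq`, `on_interrupt` and `entropy` are hypothetical, and `Bytes` stands in for `Cstruct`.

```ocaml
(* Hypothetical model of a non-blocking, short-read noise source. *)

(* Set by the (simulated) interrupt handler, cleared by the poller. *)
let last_irq : int option ref = ref None

let on_interrupt usecs = last_irq := Some usecs

(* Return *up to* [n] bytes of noise that are available right now.
   Never blocks; returns an empty buffer when nothing new was observed. *)
let entropy n =
  match !last_irq with
  | None -> Bytes.empty
  | Some usecs ->
    last_irq := None;                          (* each sample is used once *)
    let buf = Bytes.create 4 in
    Bytes.set_int32_le buf 0 (Int32.of_int usecs);
    Bytes.sub buf 0 (min (max n 0) 4)          (* short read *)
```

A caller asking for 16 bytes gets at most 4 here, and gets 0 if no interrupt has fired since the previous call; the length of the returned buffer is the only signal of how much was available.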
<avsm> i'll fix the Unix interface to return a shorter cstruct, first
<avsm> i'm tempted to leave the Xen one doing self_init for now
<avsm> to unblock the end-to-end build (with a printf noting it's weak)
<noddy> self_init is the old random thingie with time?
<avsm> y
<noddy> :(
<avsm> why?
<noddy> we basically need a couple of self_inits per sec, continuously
<avsm> yeah, that's fine too
<noddy> and i'd rather read the bytes directly
<noddy> but the basic idea, still: entropy can make no statistical guarantees, the only property it has is that *some* bits are truly unpredictable even to a computationally-unbounded adversary
<noddy> that's what i mean by "entropy"
<avsm> can do that later on (read bytes directly). without an end to end build it's very hard to develop
<avsm> is there a nocrypto/asn/x509 cut in opam btw?
<avsm> that would make it easier to build the stack itself
<noddy> there is the overlay currently
* avsm needs to test talex5's patch, and conduit for vbmithr too...
<noddy> we plan to make a cut, i think, today
<noddy> look: it's not really urgent
<noddy> i just wanted to chime in before we desync on what it means
<avsm> yeah, it's important
<noddy> well we have urandom for now
<avsm> not on xen...
<noddy> is the overlay ok for you now?
<avsm> overlay?
<avsm> self_init, you mean?
<noddy> we have an opam overlay
<noddy> opam overlay
<noddy> in mirleft/
<noddy> for quick builds
<noddy> also opam files in all projects for direct pins with git-opam
<avsm> yep overlay fine for now
<noddy> cool
<noddy> ok, we agree on definition? "entropy" is something not entirely predictable even for computationally unbounded adversary. but its unpredictable space can be very small too.
<avsm> i think this works on unix then:
<avsm> let entropy { fd = fd } len =
<avsm>   let r = Cstruct.create len in
<avsm>   Lwt_cstruct.(complete (read fd) r) >|= fun () ->
<avsm>   `Ok r
<avsm> so this never returns short, but the interface supports short reads
<noddy> i never noticed the `complete` :D
<avsm> the slightly subtle thing in Lwt_cstruct is that it leaves the rest of the cstruct untouched
<avsm> so the old entropy would have had a lot of zeros potentially in the cstruct ;-)
<noddy> *maybe* use Cstruct.sub to signal how much there was
<noddy> but then again, you can't ever measure entropy since the property is relative
<avsm> i dont think /dev/urandom will ever return short in normal use anyway
<noddy> urandom won't, ever
<noddy> was thinking about xen equivalent
<noddy> this code is imho spot on
<avsm> it may return short if interrupted by a signal, but that's hopefully marginal
<avsm> xen right now is just doing the self_init, for later improvement
<avsm> we have all the right hooks to do it now i think
<noddy> i can live with that and a small warning
<noddy> btw what is the init sequence of a mirage unikernel?
<avsm> yep, look at mirage-entropy and merge if ok?
<noddy> and the teardown?
<avsm> no teardown
<noddy> init?
<noddy> i want 'a Lwt.t to fire *before* start?
<avsm> init is just call the start function in the Job
<avsm> got to be done in a device connect
<noddy> hmhmhm
<avsm> what is it?
<noddy> so for _correct_ use, we need to spin a timer ring and periodically read from the entropy, right?
<noddy> TLS needs to, that is
<noddy> so i was wondering how to avoid having the "init_now ()" in there
<noddy> and just fire the ring in the background
<noddy> without having a race condition too
<noddy> there needs to be at least one successful reseed before the stack starts
<noddy> and it needs to continue in the background
<noddy> how?
<avsm> the stack would just block in connect() until the reseed finishes, and background the thread
<avsm> like the tcp stack does for its timers
<noddy> i was afraid you'd say so :D
<noddy> actually not so bad since it's global
<noddy> Tls.t does not have to sync
<noddy> only connects need to wait for a lwt waiter
<noddy> one single, global waiter
<noddy> yeah, could work
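A stdlib-only sketch of the "one single, global waiter" idea: every connect registers a continuation that runs once the first reseed has completed, while later reseeds carry on in the background. The names are hypothetical, and a real stack would presumably use an Lwt thread and wakener instead of the explicit callback list assumed here.

```ocaml
(* Hypothetical model of a global "first reseed done" gate. *)
let seeded = ref false
let waiting : (unit -> unit) list ref = ref []

(* Called by each connect: run [k] as soon as the first reseed is done. *)
let on_seeded k =
  if !seeded then k () else waiting := k :: !waiting

(* Called by the background reseed loop after every reseed.
   Only the first call releases the queued connects. *)
let reseed_done () =
  if not !seeded then begin
    seeded := true;
    let ks = List.rev !waiting in
    waiting := [];
    List.iter (fun k -> k ()) ks
  end
```

Because the gate is global, individual TLS sessions need no synchronisation of their own: connects that arrive after the first reseed run immediately.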
<avsm> should revert us back
<noddy> re: look at mirage-entropy and merge: where? can't see PRs or sth?
<avsm> noddy: github/mirage/mirage-entropy
<avsm> oh are you in the admin list
<avsm> noddy: should have merge access now
<noddy> woo :)
<jerith> avsm: I really enjoyed the podcast you were on recently.
<jerith> I'm most of the way through Real World OCaml now. :-)
<avsm> jerith: glad to hear it! I haven't actually listened to that podcast myself :-)
<avsm> noddy: im going to cut a entropy 0.1.5 then
<avsm> and mir 1.2.0
<avsm> noddy: that's good, but terribly inefficient
<noddy> why?
<avsm> a kernel really ought to spend most of its time sleeping
<avsm> in this case, it wakes up once per second, guaranteed
<avsm> if your phone does that, you arent making many calls...
<avsm> is there a way to apply backpressure so that it stops gathering entropy after it has enough?
<avsm> for a while
<noddy> no probs - we can either make that 10, 60, whatever (on unix)
<avsm> i.e. if noone's reading it, don't generate it
<noddy> ... or make more intricate sliding window
<avsm> nope, it needs to do nothing if noone's consuming entropy
<noddy> aha, there is problem with that
<avsm> yeah, sliding window perhaps
<noddy> entropy is very, very sparse
<noddy> so if we don't remember it, we waste it
<noddy> think timings
<noddy> this is the *actual* entropy model of fortuna, straight from schneier
<noddy> i didn't want to force it because it's kinda fortuna-specific
<avsm> yes that's fine
<avsm> but
<noddy> .. b/c no estimators and blah
<avsm> after an hour of gathering entropy, and noone consuming it
<avsm> there's no point in reseeding
<noddy> there is
<noddy> we don't know the actual amount of entropy
<noddy> and the entire prng is absolutely designed with constant seeding in mind
<avsm> only if someones consuming it
<noddy> with that, it can provide recovery guarantees no matter the actual quality
<avsm> this is the equivalent of spinning
<avsm> it's just not workable
<noddy> well
<noddy> that is what fortuna was designed for
<avsm> linux or freebsd themselves aren't constantly seeding, they can sleep for long periods too
<noddy> and without that, we can write "fortuna" all over
<noddy> but we cheat and don't really provide that.
<avsm> really? so freebsd wakes up every second?
<noddy> freebsd uses yarrow
<noddy> yarrow has estimators
<noddy> fortuna has no estimators
<noddy> hence, continuity
<noddy> the idea is just to opportunistically gather whatever measurements are available, right?
<noddy> 1. sec is a stub
<noddy> it can slide or whatever b/c on unix, we *do* have noise
<noddy> the point is xen
<noddy> on xen, every interrupt handler could hand off its timing to the pool this way
<avsm> right, i just think it needs to hit a fixed point so that a kernel idle for (e.g.) a day isn't constantly reseeding pointlessly
<noddy> spin is a red herring, right? it's there to *mock* the actual event-based logic
<noddy> delete the unix module and think about xen
<noddy> nobody is spinning
<noddy> it's up to the drivers to wakeup
<noddy> and hand off
<noddy> ( and btw i'm writing a spin anyway as we speak because, you know... no continuity, no fortuna )
<noddy> with a more sophisticated device model unix can wait on something external to happen too
<noddy> but on xen, that is builtin.
<noddy> have an interrupt? give the entropy, save the world.
<noddy> that's what other kernels do. when they wake up, they save a bit of randomness this generated.
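For context, the Fortuna design (Ferguson and Schneier) gets by without estimators by spreading incoming events over 32 pools and draining them at different rates: each source hands its events to the pools in round-robin order, and pool i contributes to reseed number r only when 2^i divides r. The sketch below shows just that scheduling rule; the function names are hypothetical, and it assumes a 64-bit OCaml so that `1 lsl 31` does not overflow.

```ocaml
(* Hypothetical helpers illustrating Fortuna's pool scheduling only;
   no hashing or key generation is modelled here. *)
let num_pools = 32

(* Each source spreads its events over the pools in round-robin order. *)
let pool_for_event event_counter = event_counter mod num_pools

(* Pool i takes part in reseed number r (r >= 1) when 2^i divides r. *)
let pools_for_reseed r =
  List.filter (fun i -> r mod (1 lsl i) = 0) (List.init num_pools Fun.id)
```

Pool 0 is used on every reseed while higher pools are used exponentially less often, so they keep accumulating measurements for longer. That is why samples handed off by drivers are worth keeping even when nobody is consuming random numbers at that moment.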
<avsm> but why can't we key off the entropy pools being consumed?
<noddy> because we don't want to waste entropy
<avsm> yes, it's unix that would spin
<noddy> yes
<avsm> hm, there's no way to avoid that
<noddy> and we can _not_ do that
<noddy> we can even not loop
<noddy> ... to a point...
<noddy> thing about xen/arm
<noddy> * think
<noddy> driver wakes up, provides a little bit of precious entropy
<avsm> yes, Xen is fine — it's all event driven
<noddy> yeah
<noddy> and on unix it can be an hour between too
<noddy> b/c the "noise" is known to be high quality
<avsm> so can't we trigger the Unix read from the event too
<avsm> it should read urandom on every kqueue exit
<avsm> just like xen does
<noddy> yes, we can
<avsm> that avoids the sleep
<noddy> this is just a stub
<noddy> to show the desired semantics
<avsm> yeah, i like this much more than the existing one
<noddy> we can really read every once-in-a-blue-moon on unix
<noddy> again, urandom is hi-entropy
<noddy> but this is the general thing
<noddy> taking care of your raspberry pi too
<avsm> aye, i like it
<noddy> also this is the actual correct thing
<avsm> replace the 1.2.0 entropy with this?
<avsm> yep, agreed
<noddy> you need to help me - how do i sync reading on unix?
<noddy> as in, what do i replace sleep with?
<avsm> we need some event hooks i think
<noddy> we can parameterize the module on something else
<noddy> but what?
<noddy> on unix, all we need is something that fires *sometimes*, and when there is activity
<avsm> i think we need to create a new
<avsm> for Lwt_unix
<avsm> that will do something on `event`
<avsm> for now, a sleep should be fine
<avsm> we dont care a huge amount about unix efficiency
<avsm> brb, about an hour
IbnFirnas has joined #mirage
avsm has quit [Quit: Leaving.]
avsm has joined #mirage
agarwal1975 has joined #mirage
tlockney is now known as tlockney_away
thomasga has joined #mirage
djs55 has joined #mirage
<noddy> thomasga: ping
<thomasga> noddy: pong
<noddy> you are right, "Entropy" would name-clash
<noddy> but
<hannes> morning
<noddy> i am not sure what having multiple would even mean
<noddy> hannes: morning :)
<thomasga> well, seems that you already have 2 implementations for unix
<noddy> just one
<thomasga> weak and strong ?
<noddy> that got pulled out
<thomasga> (just quickly read some of the patches)
<thomasga> ha
<noddy> check out half a day of discussion above. currently they are unique.
<thomasga> well, I don't know then … but maybe you want to have a network stack with no entropy at all
<thomasga> and one with some entropy
<thomasga> and you want to test the two stacks in the same program
<thomasga> but I haven't followed the discussion at all, so I trust your choices :p
<noddy> i am actually slightly confused by the tool itself and how to provide those identifiers
<noddy> there are two providers: `Entropy_xen` and `Entropy_unix (OS.Time)`
<noddy> in the second case, there is a need to make an identifier for this application. this is where i get confused.. for every separate application, do you need a separate one for each of the two providers?
<thomasga> Name.of_key <something unique to the implementation> ~base:<prefix of the identifier>
<noddy> aha way just reading that module
<noddy> was
<thomasga> ideally you instantiate the functor once
<thomasga> and you use the 'module_name' everywhere
<thomasga> (and you add some 'module_name = Functor(Arg1)(Arg2)(…)' in the configure function)
<thomasga> (as you've already done)
<noddy> looking at Console, in
<thomasga> you just need to make the 'module_name' a bit more unique
<noddy> it uses its `t`-type to make a key
<noddy> but who provides this `t` (the string)?
<thomasga> ha, that's because there is only one Console implementation
<thomasga> but you can have multiple consoles
<thomasga> (with different connect parameters)
<thomasga> the "t" string is provided by
<thomasga> (if you use the default console, it's "0", but you can have an other console id)
<thomasga> anyway, need to run
<noddy> ( need to run too )
<thomasga> "Entropy_unix_instance" that's the new module name provided by mirage-entropy ?
<thomasga> ha I see, you said there is one implem anyway
<thomasga> so yes, why not
<noddy> that's just the temp name the functor application (if there is one) will be bound to
<thomasga> you can keep "Entropy" if that's really the case … if not, just use "Name.of_key"
<noddy> module Entropy_unix_instance = Entropy_unix (OS.Time)
<thomasga> well, in that case, keep "Entropy"
<thomasga> anyway, I've cherry-picked the discussion, as you have "type t = unit" you cannot have any name clash anyway
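A toy version of the functor application being discussed, to show why binding it once to a single module name is enough when `type t = unit`. The `TIME` signature, `Fake_time` and the `sample` function are assumptions made up for this sketch; the real `Entropy_unix` and `OS.Time` have their own interfaces.

```ocaml
(* Hypothetical stand-ins for OS.Time and Entropy_unix. *)
module type TIME = sig
  val now_usecs : unit -> int
end

module Entropy_unix (T : TIME) = struct
  type t = unit                        (* no per-instance state to clash *)
  let connect () : t = ()
  (* Take the low byte of the current time as one noise measurement. *)
  let sample (() : t) = T.now_usecs () land 0xff
end

(* A fixed clock so the example is deterministic. *)
module Fake_time = struct
  let now_usecs () = 0x1234
end

(* Instantiate the functor once and use this name everywhere. *)
module Entropy = Entropy_unix (Fake_time)
```

Since every value of type `Entropy.t` is the same `()`, two uses of the module cannot disagree about which device they hold.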
<thomasga> gn
thomasga has quit [Quit: Leaving.]
avsm has quit [Quit: Leaving.]
djs55 has quit [Quit: Leaving.]
