amiloradovsky has quit [Remote host closed the connection]
amiloradovsky has joined #ocaml
jbrown has joined #ocaml
amiloradovsky has quit [Remote host closed the connection]
mfp has joined #ocaml
amiloradovsky has joined #ocaml
nkly has quit [Read error: Connection reset by peer]
nkly has joined #ocaml
narimiran has quit [Ping timeout: 240 seconds]
narimiran has joined #ocaml
amiloradovsky has quit [Remote host closed the connection]
amiloradovsky has joined #ocaml
andreas303 has quit [Remote host closed the connection]
andreas303 has joined #ocaml
FreeBirdLjj has joined #ocaml
FreeBirdLjj has quit [Ping timeout: 268 seconds]
Louisono has quit [Remote host closed the connection]
Haudegen has quit [Quit: Bin weg.]
raver has quit [Quit: Gateway shutdown]
raver has joined #ocaml
ggole has joined #ocaml
mxns has joined #ocaml
nullcone has quit [Quit: Connection closed for inactivity]
Haudegen has joined #ocaml
mxns has quit [Ping timeout: 264 seconds]
<d_bot>
<mbacarella> is there an ocaml module that thinly wraps the linux syscall interface, completely bypassing libc?
mxns has joined #ocaml
amiloradovsky has quit [Remote host closed the connection]
amiloradovsky has joined #ocaml
borne has quit [Ping timeout: 260 seconds]
<mrvn>
No, ocaml does not have its own libc.
<mrvn>
that's the job of libc.
nkly has quit [Ping timeout: 256 seconds]
borne has joined #ocaml
tryte has quit [Ping timeout: 240 seconds]
<d_bot>
<mbacarella> i meant: does such a module exist anywhere, not is it included in the ocaml standard library
<d_bot>
<ggole> There's some library that adds various bits of POSIX that aren't in `Unix`
<d_bot>
<ggole> Called, uh, extunix or something
<d_bot>
<ggole> It almost certainly doesn't bypass libc though
<d_bot>
<mbacarella> not looking for posix. looking for linux 😛
<d_bot>
<ggole> Then I'm not aware of anything
<d_bot>
<ggole> Why do you need something like that?
<mrvn>
extunix just adds bindings for some more libc calls
<mrvn>
If you want a syscall that isn't in unix or extunix you should add bindings for the respective libc function; it deals with differences between kernel versions for you. Libc also has a syscall function you could bind. What you really should not do is make raw syscalls yourself.
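A sketch of the suggestion above: bind libc's generic syscall(2) wrapper instead of issuing raw syscalls. This assumes the ctypes and ctypes-foreign opam packages are available, and the syscall number 39 (SYS_getpid) is specific to x86-64 Linux. Binding a variadic C function at a fixed arity like this leans on the common calling convention, so treat it as illustrative rather than portable.

```ocaml
open Ctypes
open Foreign

(* long syscall(long number, ...) -- bound here at arity 1 only *)
let syscall1 = foreign "syscall" (long @-> returning long)

(* Assumption: x86-64 Linux syscall number for getpid *)
let sys_getpid = 39

let getpid () : int =
  Signed.Long.to_int (syscall1 (Signed.Long.of_int sys_getpid))
```

Note this still goes through libc's syscall() entry point, which is exactly what mrvn recommends over hand-rolled assembly stubs.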
<d_bot>
<ulrikstrid> Is it possible to check if a process is a parent? (or rather, if it has any children)
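One possible answer to ulrikstrid's question (my own sketch, not from the thread): on Linux you can scan /proc and compare each process's parent pid (field 4 of /proc/&lt;pid&gt;/stat) against the pid in question. Linux-only; the /proc layout is assumed.

```ocaml
(* Does [parent] currently have any child processes? *)
let has_children (parent : int) : bool =
  let is_pid s =
    s <> "" && String.for_all (fun c -> c >= '0' && c <= '9') s
  in
  let ppid_of stat_line =
    (* /proc/<pid>/stat is "pid (comm) state ppid ...". The comm field may
       contain spaces, so parse starting from the last ')'. *)
    let i = String.rindex stat_line ')' in
    let rest =
      String.sub stat_line (i + 2) (String.length stat_line - i - 2)
    in
    match String.split_on_char ' ' rest with
    | _state :: ppid :: _ -> int_of_string ppid
    | _ -> -1
  in
  Array.exists
    (fun entry ->
      is_pid entry
      && (try
            let ic = open_in ("/proc/" ^ entry ^ "/stat") in
            let line = input_line ic in
            close_in ic;
            ppid_of line = parent
          with _ -> false))
    (Sys.readdir "/proc")
```

Processes can appear and vanish during the scan, so the answer is inherently a snapshot.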
<companion_cube>
@mbacarella: if you want linux syscalls you probably need to write specific bindings
<companion_cube>
and make a library out of it :p
<companion_cube>
(and call it io_uring or something like that for the internet points)
andreas303 has quit [Remote host closed the connection]
cantstanya has quit [Ping timeout: 240 seconds]
andreas31 has joined #ocaml
cantstanya has joined #ocaml
borne has quit [Ping timeout: 260 seconds]
tryte has joined #ocaml
hnOsmium0001 has joined #ocaml
<d_bot>
<mbacarella> @ggole strong emotional feeling that the less C code in my tech stack the better. i'm not quite ready to link my application against mirage. but i maybe am ready to delete libc
<Armael>
AFAIU the standard ocaml runtime already uses libc
<d_bot>
<mbacarella> yeah :/
<theblatte>
lots of C in OCaml :p
<d_bot>
<mbacarella> yeah :/
<Armael>
but I guess the mirage ones don't
<Drup>
I'm not sure there is much of a distance between "delete libc" and "use mirage"
<Armael>
so I wonder how hard it would be to repurpose one of the mirage runtimes to directly use the linux apis
<companion_cube>
how does mirage allocate memory?
<mrvn>
Armael: as much work as it is to write a new libc.
<Armael>
I guess you have all the memory (from the unikernel point of view)
<Armael>
mrvn: only for the features that are actually used by the runtime…
<d_bot>
<mbacarella> isn't this how docker works? the linux kernel interface is actually pretty stable so you can run entirely different binary releases of distributions inside of them
<mrvn>
Armael: well, if you only want the runtime and not unix, extunix, core, batteries then it's just ~20 functions.
<Armael>
yeah that's my point
<mrvn>
Mostly the memory and thread handling.
<companion_cube>
Armael: you still need to get the memory from somewhere?
<companion_cube>
xen, the rump kernel, whatever?
<companion_cube>
and besides I thought that nowadays OCaml called malloc directly :s
<mrvn>
But what would be the point of writing your own malloc/free instead of using libc?
<def>
companion_cube: OCaml GC is not bad at managing memory.
<def>
in a VM you just see a virtual RAM bar
<companion_cube>
I mean ok, but it must use *something* to get the memory, right?
<companion_cube>
hu, ok
<Armael>
it's like a VM, you just have memory
<mrvn>
companion_cube: at boot it gets a memory map and then it hands out pages to the GC.
<hannes>
"how does mirageos allocate" - take a look at https://github.com/mirage/ocaml-freestanding -- this contains: (a) nolibc (b) openlibm (c) ocaml source (well, it depends on the ocaml compiler source code and builds the ocaml runtime)
<companion_cube>
ah so nolibc is what does this job of providing pages?
<hannes>
(a) includes a dlmalloc which we use to allocate memory. a mirageos unikernel gets its memory range at startup time; there's no way it can request more (or balloon down to less) memory -- nevertheless the dlmalloc serves as the allocator (+free) for the ocaml runtime -- and the ocaml runtime allocates via the GC, or directly (bigarray)
<mrvn>
ocaml-rpi implements just the bare minimum to run ocaml code on a RPi baremetal. Only the functions the runtime itself uses are there, and many are just stubs (math functions, basically)
<companion_cube>
I see, so it ships with a custom C malloc
<companion_cube>
that makes sense
<mrvn>
One problem with baremetal / mirage is that the GC wants to allocate new heaps. There is no flag to tell it "This is your memory, this is all you will ever get"
<mrvn>
it won't do compaction in place.
<hannes>
and the above-mentioned nolibc does mostly the same: provide symbols so that the ocaml runtime can be linked (without unresolved symbols), and most of them are just stubs that error out or return an error code.
<mrvn>
hannes: I figure the openlibm fills in a lot of functions where I have just dummy stubs.
<hannes>
mrvn: yes, and there are multiple call-sites of malloc in the ocaml runtime, thus it would be a non-trivial change (but very much appreciated)
motherfsck has joined #ocaml
<mrvn>
malloc, free, calloc, realloc are all used
<hannes>
mrvn: nice work on the bare-metal rpi4. it may be worth considering reusing ocaml-freestanding -- or is there something missing in there?
<mrvn>
hannes: no idea.
<mrvn>
hannes: My code predates ocaml-freestanding and I just added every symbol the linker complained about till it worked.
<mrvn>
Implemented those that actually got called.
<hannes>
ic :)
<companion_cube>
so the actual memory map, and such, are managed by xen?
<mrvn>
hannes: When I looked into it mirage would not run on a Raspberry Pi because the RPi1 has no hardware VM support.
<companion_cube>
:o
<hannes>
mrvn: that is true, it changed with the rpi4 which now has kvm support afaik.
<mrvn>
Also starting with RPi2 there is SMP. I should try ocaml-multicore on that.
<mrvn>
The alternative would be to run 4 ocaml runtimes on it, one per core, and add some IPC mechanism.
Jesin has quit [Ping timeout: 268 seconds]
Louisono has joined #ocaml
Louisono has quit [Ping timeout: 245 seconds]
Tuplanolla has joined #ocaml
nullcone has joined #ocaml
Haudegen has quit [Ping timeout: 240 seconds]
Haudegen has joined #ocaml
borne has joined #ocaml
Haudegen has quit [Client Quit]
nkly has joined #ocaml
narimiran has quit [Ping timeout: 240 seconds]
andreas31 has quit [Remote host closed the connection]
andreas31 has joined #ocaml
amiloradovsky has quit [Ping timeout: 264 seconds]
reynir has quit [Ping timeout: 240 seconds]
reynir has joined #ocaml
Louisono has joined #ocaml
mxns has quit [Ping timeout: 256 seconds]
mxns has joined #ocaml
mxns has quit [Ping timeout: 272 seconds]
Louisono has quit [Ping timeout: 245 seconds]
<sadiq>
mrvn, ocaml-multicore's support for ARM is probably broken for now.
<sadiq>
it's on the todo-list to bring it back to parity with x86-64 but it hasn't kept up with the recent changes
<mrvn>
It would be nice if the GC could allocate data on different heaps depending on whether something is shared between threads or not.
<companion_cube>
multicore will promote to a different heap if it's shared, right?
<companion_cube>
(although of course, promote != allocate)
<mrvn>
more than just plain heaps being generational?
<companion_cube>
I think (but don't quote me on that) that there's a major heap for shared objects
<mrvn>
Does it move objects out of the minor heap when you share them?
<companion_cube>
ah, seems like the major heap is entirely shared
<companion_cube>
"[…] objects are promoted to the shared heap whenever
<companion_cube>
another thread actually tries to access them."
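A minimal illustration of the sharing being discussed, assuming an OCaml 5 / multicore compiler with the Domain module: a ref cell allocated by one domain is mutated from a spawned domain, and the runtime takes care of promoting the object to the shared heap when the other domain touches it; the program just sees ordinary mutation.

```ocaml
(* Mutate a parent-domain ref from a child domain; Domain.join gives the
   happens-before edge that makes the final read race-free. *)
let incr_in_other_domain (r : int ref) : int =
  Domain.join (Domain.spawn (fun () -> incr r; !r))
```

The promotion is invisible at the source level; it only shows up as GC behaviour.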
amiloradovsky has joined #ocaml
olle has quit [Ping timeout: 246 seconds]
<sadiq>
mrvn, companion_cube: to complicate matters there are two minor collectors.
mxns has joined #ocaml
<companion_cube>
oh, are you one of the coauthors?
<sadiq>
yes
<companion_cube>
nice :)
<sadiq>
one where minor heaps are all shared (parallel) and one where they're exclusive to a domain (concurrent)
<companion_cube>
is it a compile-time choice?
<sadiq>
no, only parallel will be upstreamed
<sadiq>
that's because concurrent requires changes to the C API and breaks most existing OCaml code
<companion_cube>
but does parallel slow down sequential code?
<mrvn>
Think of something building a list and then returning List.rev at the end. The compiler should be able to prove that the temp list is unshared and allocate it on a per-core heap.
<companion_cube>
(rather: does it slow it down a lot?)
<sadiq>
and actually, the parallel minor collector outperforms it
<sadiq>
no, the parallel minor is actually faster for sequential code
<companion_cube>
oh nice.
<sadiq>
it doesn't need a read barrier
<sadiq>
and for minor collections at least, what you end up running is almost identical to trunk ocaml
<companion_cube>
that's good.
<sadiq>
that linked paper has a bunch of benchmarks that compare against trunk
<mrvn>
sadiq: if you have sequential code in multicore can it run the code on one core and GC on the other?
<sadiq>
(and we've done a lot of optimisation work since the paper was published too)
<companion_cube>
will multicore remain a separate switch for a long time? or would it become the default?
<companion_cube>
I htink it kind of half-killed flambda1 that it was in separate switches
<sadiq>
mrvn, heh so the answer is it depends on what GC work you're doing. The minor collector requires all the domains to be stopped. The major collector does not.
mxns has quit [Ping timeout: 265 seconds]
<sadiq>
companion_cube, not speaking as ocaml labs or the ocaml developers - I think the intention is OCaml 5 will have multicore there by default - we won't maintain two separate GCs and runtimes.
<octachron>
The plan is to have OCaml 5 with multicore, while keeping 4 up-to-date with bug fixes for a time
<companion_cube>
sadiq: thank god
<sadiq>
multicore has been designed to have as little impact on existing code as possible
Haudegen has joined #ocaml
<sadiq>
mrvn, so if you have multiple domains (multicore's fat threads) then even if one is idle it will do marking/sweeping work from the major heap.
<sadiq>
it will also do some of the minor heap promotion work if there's a minor collection.
tane has joined #ocaml
<sadiq>
there's a bit of complexity in there because a domain in multicore is really two threads: a main one and a backup thread. The backup thread is able to jump in and participate in a minor or major collection if the main thread has relinquished the ocaml runtime lock.
Anarchos has joined #ocaml
<companion_cube>
so that a long call to C doesn't block other domains?
narimiran has joined #ocaml
<mrvn>
sadiq: for the sequential case I would like to only do the required stop-the-world work in the main thread. Let the other core(s) do all the parallel work exclusively.
amiloradovsky has quit [Ping timeout: 268 seconds]
Jesin has joined #ocaml
<sadiq>
companion_cube, correct
mxns has joined #ocaml
<sadiq>
mrvn, the stop-the-world has to happen on all domains. Other domains can do major work though.
<mrvn>
sadiq: the point would be for the thread running actual code to not do GC work.
mxns has quit [Ping timeout: 260 seconds]
andreas31 has quit [Remote host closed the connection]
<sadiq>
mrvn, ah - that's difficult. The parallel collector needs everything stopped to promote safely.
<mrvn>
sadiq: sure. some things need stop-the-world. that's unavoidable. I'm just thinking of marking and sweeping and such.
<sadiq>
hrm. I don't think there are any knobs at the moment for that but it sounds like it wouldn't be too hard to add.
andreas31 has joined #ocaml
<mrvn>
The paper describes concurrent minor collection with private minor heaps. The main challenge is listed as the read barrier. Have you considered running a process on each core so each has its own page table? Put the major heap in shared memory and the minor heaps not. Access to the minor heap of another core would then segfault (which is hopefully rare).
ransom has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<sadiq>
mrvn, herding kids time, will respond a bit later.
ransom has joined #ocaml
<mrvn>
sadiq: Ok. When you come back I have another question: Do you create one domain per hardware core and then distribute threads as fibres across them automatically? Or does code have to create domains and fibres itself?
<companion_cube>
I think scheduling fibers will be done in userland
<mrvn>
companion_cube: that's my understanding too. But existing code just creates threads. Does that then create a new domain per thread? Or do all threads become fibres in a single domain? Or does it create domains till it hits the number of cores and then create fibres?
<companion_cube>
iirc there are plans to map Thread to domains
<companion_cube>
fibers are a new thing
<companion_cube>
(or is it to have all Thread.t live inside a domain? I forgot)
<mrvn>
all threads living in one domain would have a big performance impact for anything that releases the interpreter lock. It would even deadlock existing code that relies on ocaml running in parallel with C calls.
berberman_ has quit [Ping timeout: 240 seconds]
<zozozo>
from what I understand, it'd be the job of a lib (the equivalent of lwt, for instance) to create the adequate number of domains; then in the user code you'd (in the end) use effects, and the lib that created the domains would schedule everything on the domains it created (because the effects allow the user code to not worry about the scheduler)
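A toy version of the "library owns the domains" shape described above, assuming an OCaml 5 compiler with the Domain module: run a list of jobs in parallel and join for the results. A real scheduler would cap the domain count at Domain.recommended_domain_count and multiplex many fibres per domain via effects; this sketch spawns one domain per job and omits all of that.

```ocaml
(* Spawn every job in its own domain, then collect results in order. *)
let run_jobs (jobs : (unit -> 'a) list) : 'a list =
  let domains = List.map (fun job -> Domain.spawn job) jobs in
  List.map Domain.join domains
```

User code never touches Domain directly in this scheme; it only hands closures (or, with effects, fibres) to the library.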
berberman has joined #ocaml
<companion_cube>
I can't imagine Lwt creating more than one domain (outside of lwt_preemptive)
<zozozo>
yeah, not for lwt, but an equivalent for multicore
<companion_cube>
you can't just parallelize user code that was not designed for it
<companion_cube>
yeah we'll have new schedulers and stuff
<zozozo>
I know, once again, it'd be the equivalent of lwt, but adapted to multicore
<mrvn>
companion_cube: anything async in Lwt shouldn't care
<companion_cube>
oh really? it's full of shared state
<companion_cube>
I can create a local ref, start something in Lwt.async that will use this ref
<companion_cube>
and it works if >>= are well placed
<companion_cube>
but wouldn't work with full preemption
<mrvn>
companion_cube: the code already doesn't know when the result comes back. In your example if you start two somethings in Lwt.async they can run in any order.
<companion_cube>
I'm more talking about data races.
<companion_cube>
lwt tells you that sections between binds are, in effect, atomic
<mrvn>
can't parallelize the >>=
<companion_cube>
if you remove that, a lot of code might become magically broken
<mrvn>
nod
<companion_cube>
I mean there could be a Lwt.async_in_any_domain or something like that, going forward :)
mmohammadi9812 has quit [Quit: Quit]
<mrvn>
Personally I think having more GCs than cores becomes pointless.
<companion_cube>
depends a lot on what you do, I think
<mrvn>
companion_cube: as soon as you hit a "stop-the-world" you trigger a ton of context switches and rushes to finish the rest of each sweep.
<companion_cube>
ah right :s
<companion_cube>
so even preemptive threads should not use 1 domain each…
<mrvn>
except (there always is an except :) the thread might release the runtime lock, call a C function and block. Then the core would be idle.
<mrvn>
I think that's where the backup thread for each domain comes in
berberman has quit [Read error: Connection reset by peer]
berberman_ has joined #ocaml
mxns has joined #ocaml
<sadiq>
mrvn, heh, just trying to check which questions didn't get answered.
<sadiq>
yes, number of domains should map to number of physical cores
<sadiq>
a domain in itself is more than one thread though (as there's a backup thread)
<sadiq>
also just to make things a little bit more complicated (but also more compatible) there's a systhreads implementation, which means you can have multiple _other_ threads per domain.
<sadiq>
only one of those threads can hold the runtime lock at any point though
<sadiq>
so if you have a single domain it works the same way as systhreads
<mrvn>
sadiq: "other threads per domain" is nice. That means the thread library can automatically create one domain per core and do an N:M mapping.
osa1 has quit [Quit: osa1]
<flux>
so what happens with blocking OS calls?
<mrvn>
flux: they block the thread
<flux>
or domain?
<mrvn>
flux: not if they correctly release the runtime lock
<mrvn>
"only one of those threads can hold the runtime lock"
<sadiq>
yes, as long as the code is correct and drops the runtime lock then yea it won't block the domain
<sadiq>
this is to retain compatibility with existing code that makes heavy use of systhreads for blocking IO
<mrvn>
if it doesn't drop the runtime lock then it is already broken now.
<companion_cube>
when you say "systhread" you mean `thread`, right? not the old systhread thingie?
<companion_cube>
Thread.t, that is
<sadiq>
I may have got my terminology slightly wrong, yes.
<sadiq>
let me check.
<companion_cube>
:)
<mrvn>
companion_cube: the old systhreads would be fibers now
<companion_cube>
what about the old posix threads
<sadiq>
looking at the PR diff I'm still not sure
<mrvn>
from the above I gather they still exist and are Thread.t
<companion_cube>
heh
<sadiq>
I'd probably go with whatever the modern useful one is
<companion_cube>
I think there's an issue
<companion_cube>
about how to update the Thread implementation
Anarchos has quit [Quit: Vision[0.10.3]: i've been blurred!]
<sadiq>
(I didn't write this - it's engil's work mostly)
<Armael>
afaiu there's a compatibility layer for Thread.t? with a N:M mapping into domains
<sadiq>
companion_cube, yes, it's Thread.t
<mrvn>
companion_cube: from the above I gather threads remain just threads but get associated with a domain.
<sadiq>
correct
<sadiq>
we didn't want to change that because there are projects that use many many threads for blocking IO
<companion_cube>
ok, that's good indeed.
<mrvn>
sadiq: Is that a fixed mapping or is there a queue and any domain that becomes idle picks up a new thread?
<sadiq>
mrvn, I _think_ the mapping is fixed when you create the Thread
<companion_cube>
I suppose all threads could live inside one single domain
<sadiq>
(and is created by you Thread.create'ing on a given Domain)
<companion_cube>
oh, that's nice!
<companion_cube>
so you could still use multiple cores with Thread, with minor changes
<mrvn>
sadiq: it should at least use work stealing so domains don't go unused.