<hcarty>
rixed_: Do you want only unique entries, or are duplicated ok?
<rixed_>
hcarty: unique would be find
<rixed_>
s/find/nice (??!)
<rixed_>
maybe if Random.choice could also return the original enum without the chosen item, I could repeat N times
<rixed_>
speed is not required, though
<hcarty>
rixed_: One challenge is that enums are consumable. So they are difficult to work with when you want anything other than the head element
<rixed_>
hcarty: Originally I had a list, but I converted it to an enum hoping I would have more functions available
<hcarty>
rixed_: You could use a list (or set or anything ordered I suppose), sort/order using a random function
<hcarty>
rixed_: Then take the first N elements from the result.
<kerneis>
be extremely careful about sorting with a random function
<kerneis>
depending on your sort algorithm, the result might be far from random
<rixed_>
kerneis: yeah, I suppose with some algorithms the sort may not even terminate
<rixed_>
Or I can transform my original enum to a cyclic one and skip at random... no, wouldn't be unique
<rixed_>
Same idea but safer: use shuffle on the original enum and take the first N.
<kerneis>
rixed_: why don't you copy the source of Random.choice and modify it?
<kerneis>
but yes, shuffle then take would do the same
<rixed_>
BTW, any method to build a list from an enum (without getting the No_more_elements exception)?
adrien is now known as Camarade_Tux
Camarade_Tux is now known as Camarade_Tux_
Camarade_Tux_ is now known as Camarade_Tux__
Camarade_Tux__ is now known as adr1en
adr1en is now known as adrien_
adrien_ is now known as adrien
<sgnb>
kerneis: bad practice
<hcarty>
rixed_: List.of_enum. Most (many?) Batteries modules have something similar.
ygrek has quit [Ping timeout: 250 seconds]
<rixed_>
List.of_enum raises the aforementioned exception every time I use it. I suppose this is related to the consumable nature of enums...
StepanKuzmin has quit [Read error: Connection reset by peer]
<thelema>
rixed_: very odd - List.of_enum should return an empty list even if there's no elements in the list.
<thelema>
err, elements in the enum
<thelema>
are you sure it's raising the exception?
<thelema>
also, let choose n e = Random.shuffle e |> Array.enum |> Enum.take n
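A minimal sketch of the same "shuffle, then take the first N" idea using only the standard library, for readers without Batteries at hand. It works on a list rather than an enum, and the names `shuffle_array` and `choose_by_shuffle` are made up for this illustration. The shuffle is a plain Fisher-Yates pass, which avoids the pitfalls of sorting with a random comparison function.

```ocaml
(* Fisher-Yates shuffle on a copy of the input array. *)
let shuffle_array (a : 'a array) : 'a array =
  let a = Array.copy a in
  for i = Array.length a - 1 downto 1 do
    let j = Random.int (i + 1) in      (* 0 <= j <= i *)
    let tmp = a.(i) in
    a.(i) <- a.(j);
    a.(j) <- tmp
  done;
  a

(* Shuffle the whole input, then keep the first n elements
   (or all of them if the input is shorter than n). *)
let choose_by_shuffle (n : int) (l : 'a list) : 'a list =
  let a = shuffle_array (Array.of_list l) in
  let k = max 0 (min n (Array.length a)) in
  Array.to_list (Array.sub a 0 k)
```

This costs a full copy and a full shuffle even when N is small, which is acceptable when speed is not a concern.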
StepanKuzmin has joined #ocaml
RAW has joined #ocaml
<thelema>
You could write a function that did one pass through the enum and chose or discarded each element with the correct probability, depending on the number of elements left in the enum and the number of elements still needed in the choice. I think this might be statistically valid...
impy has quit [Ping timeout: 276 seconds]
<thelema>
let keep need have = if need >= have then true else Random.float have <= need
<kerneis>
thelema: I thought about that but you might get unlucky and have too few elements
<kerneis>
hmm, no you are right
<thelema>
If you want to contribute the final function, I'll add it to batteries, as I've wanted random sampling from an enum too
<kerneis>
thelema: Random.float need <= have (and Random.int would work too I guess)
<thelema>
and yes, int is better. I was thinking I'd need more float arithmetic, but in the end I didn't
<kerneis>
oh, I thought "have" was how many you still have in your enum
<thelema>
yes, have is Enum.count, need = number left to sample
<kerneis>
so if you need 1 and have 100 left, you should pick a number between 0 and 99 and check if it is 1
<thelema>
less than 1, yes
<thelema>
Random.int have < need
<kerneis>
ok, I'm tired
<kerneis>
you are right
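A rough sketch of the one-pass idea as it ended up (`Random.int have < need`), written against a plain list so that the count is known up front. `sample_known_count` is a hypothetical name, and this is only an illustration of the technique, not the function that was proposed for Batteries.

```ocaml
(* Keep the current element with probability need/have.
   [have] is the number of elements left, [need] the number still wanted. *)
let keep (need : int) (have : int) : bool =
  need >= have || Random.int have < need

(* One pass over a list whose length is known in advance.
   Stops as soon as enough elements have been chosen. *)
let sample_known_count (n : int) (l : 'a list) : 'a list =
  let rec go need have acc = function
    | [] -> List.rev acc
    | _ when need <= 0 -> List.rev acc
    | x :: rest ->
      if keep need have
      then go (need - 1) (have - 1) (x :: acc) rest
      else go need (have - 1) acc rest
  in
  go n (List.length l) [] l
```

Because every remaining element is kept once `need >= have`, the pass cannot come up short, which is the worry kerneis raised and then withdrew.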
<thelema>
and this is why it needs to br written once, double-checked, and then not just used
<thelema>
*ne
<thelema>
*be
<thelema>
grr, s/not//
<mfp>
does the above assume you have Enum.count?
<thelema>
mfp: yes, no way to sample from an infinite enum with uniform probability
<rixed_>
thelema: I'm trying to figure out where the No_more_elements exception comes from. I can't reproduce it in the repl, yet I can't get rid of it in my program :-/
<kerneis>
mfp: if you do not have Enum.count, I think you can't do it reliably
<mfp>
I believe it can be done without it in a single pass w/o forcing Enum.count; it can be done for N=1, at least, and it seems easy to generalize
<thelema>
mfp: please explain
<mfp>
for N=1, you pick the m-th element (replacing whatever you'd picked so far) with probability 1/m
<flux>
rixed_, are backtraces useless?
<thelema>
umm, that's not a probability distribution
<thelema>
mfp: sum (1/m) > 1.0
<mfp>
thelema: _replacing_ what you had (for N=1)
* thelema
is confused as to what that means
<mfp>
say you only want 1 element
<thelema>
ok
<mfp>
you do let elem = ref None in ...
<mfp>
then you get the elements + their indices (x, m) and with probability 1/m do elem := Some x
<mfp>
so you pick the 1st one with probability one
<mfp>
then the 2nd one replaces your pick with P 0.5
<rixed_>
mfp: looks good to me
<mfp>
3rd one 1/3 and so on; at that point, the probability of !elem = Some x_i is 1/3 for all the elements (= for all i)
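The N=1 scheme described above could be sketched like this, assuming a plain list as the source and a made-up name `pick_one`. The m-th element (counting from 1) replaces the current pick with probability 1/m, so the length never needs to be known in advance.

```ocaml
(* Single-pass pick of one element without knowing the length:
   the m-th element replaces the current pick with probability 1/m. *)
let pick_one (l : 'a list) : 'a option =
  let elem = ref None in
  List.iteri
    (fun i x ->
       let m = i + 1 in                 (* 1-based position *)
       if Random.int m = 0 then elem := Some x)
    l;
  !elem
```

The first element is always taken, since `Random.int 1` can only return 0.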
<thelema>
mfp: hmm, that seems to work... how to generalize to n elements? a set with limited size, when it's overfull, remove oldest? random? element?
<thelema>
mfp: hmm, that does seem to work.
<mfp>
something like: pick first N elements, then for m-th element (m > N), pick it with probability N/m, replacing a random element from the previous picks
<mfp>
just a hunch, gotta make sure the probabilities work out
<thelema>
I think that might work too.
* thelema
would code it up and do tests to see if elements are being chosen with uniform probability
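A sketch of the generalization mfp describes, for anyone who wants to try it: keep the first N elements, then let the m-th element (m > N) replace a uniformly chosen earlier pick with probability N/m. This is essentially the textbook reservoir sampling scheme. The name `reservoir_sample` and the use of a list as input are assumptions made for this example.

```ocaml
(* Reservoir of size n filled in one pass, length not known in advance. *)
let reservoir_sample (n : int) (l : 'a list) : 'a list =
  if n <= 0 then []
  else begin
    let picks = ref [||] in
    List.iteri
      (fun i x ->
         let m = i + 1 in               (* 1-based position *)
         if m <= n then
           (* Fill phase: the first n elements are always kept. *)
           picks := Array.append !picks [| x |]
         else if Random.int m < n then
           (* Picked with probability n/m: evict a random earlier pick. *)
           (!picks).(Random.int n) <- x)
      l;
    Array.to_list !picks
  end
```

Checking uniformity empirically would mean running this many times and counting how often each element shows up; that part is left out here.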
<rixed_>
thelema: Hmmm, the problem is that my program (which is a custom toplevel) was opening batteries_uni instead of batteries... yet I can't figure out what exactly triggered the bug.
<thelema>
rixed_: odd, batteries_uni is what should be opened for batteries 1.x in a non-threads environment
<rixed_>
in other words, calling choose defined in a module compiled with batteries, from a toplevel where batteries_uni was opened, yields the No_more_elements exception after choose n |> List.of_enum... for some reason :-/
<thelema>
different exceptions between the two modules because of module paths. a funny behavior that produces unexpected results sometimes
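A small illustration of that effect, with two made-up modules standing in for the two library entry points (this is not how Batteries is actually laid out): exceptions declared in different modules are distinct even when they share a name, so a handler written against one does not catch the other.

```ocaml
(* Two separate declarations of an exception with the same name. *)
module Packed_a = struct exception No_more_elements end
module Packed_b = struct exception No_more_elements end

(* Returns true only if [f] raises Packed_b's exception. *)
let caught_by_b (f : unit -> unit) : bool =
  try f (); false with
  | Packed_b.No_more_elements -> true
  | _ -> false
```

Raising `Packed_a.No_more_elements` inside `f` falls through to the catch-all branch, which is the kind of surprise an internal handler would hit if the exception it expects comes from the other module path.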
<thelema>
glad you fixed that. I wish I could get rid of the batteries_uni/batteries distinction
<thelema>
maybe if I put dummy threads modules in batteries_uni, they can share the same interface...
<rixed_>
thelema: would have been great to have an error when opening Batteries_uni, though.
StepanKuzmin has quit [Read error: Connection reset by peer]
<thelema>
hmm... I don't think that's possible, as Batteries includes Batteries_uni
* rixed_
is not familiar enough with modules sha1s and flags to have a clue.
<thelema>
opening batteries_uni doesn't run any code provided by batteries, it just adds identifiers to the current namespace, unlike in other languages. link-time is when modules get run
StepanKuzmin has joined #ocaml
<mfp>
thelema: "replace random element from previous picks" doesn't work :-/ the probability of the m-th being chosen when there are m+1 elements is N/m * (N-1)/N = (N-1)/m <> N/(m+1)
<rixed_>
mfp: there is also a performance problem: you are forced to scan the whole enum, so choosing 1 amongst 1000 would be slow, while thelema's idea stops once one gets chosen
<thelema>
there's two cases - the m+1 element is added to the choices and the m+1 element isn't added
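Working through those two cases, and assuming the scheme is "element m+1 is picked with probability N/(m+1) and then evicts one of the N current picks uniformly at random", an earlier pick is lost only when the new element is picked and lands on its slot. If every element seen so far is held with probability N/m, the numbers appear to come out uniform after all:

```latex
\begin{align*}
P(x_i \text{ held after } m+1 \text{ elements})
  &= \underbrace{\frac{N}{m}}_{\text{held after } m}
     \left(1 - \frac{N}{m+1}\cdot\frac{1}{N}\right) \\
  &= \frac{N}{m}\cdot\frac{m}{m+1}
   = \frac{N}{m+1}
\end{align*}
```

The factor in parentheses is the survival probability: 1 minus (new element picked) times (it replaces this particular slot).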
StepanKuzmin has quit [Remote host closed the connection]
<mfp>
rixed: but you don't want to stop --- you have to scan the whole sequence if each element can be chosen with same probability
<thelema>
if you can know the count w/o generating the whole sequence, then it's faster, otherwise not
<thelema>
luckily, enums have a flag telling them whether the count function is fast, so we can use each algorithm exactly when it's appropriate
<rixed_>
thelema: why is List.iteri prototype "(int -> 'a -> 'b) -> 'a list -> unit" instead of "(int -> 'a -> unit) -> 'a list -> unit" ?
<rixed_>
ie. why 'b ?
<thelema>
rixed_: good question... I guess it does the ignore() for the user
<thelema>
probably inherited from extlib
<rixed_>
thelema: aren't we supposed to favor compatibility with stdlib more than with extlib?
<thelema>
-> 'b is compatible with stdlib
<thelema>
and list.iteri was only recently added to stdlib, no?
<rixed_>
yes, but stdlib would check the callback returns unit while extlib won't care
<rixed_>
thelema: I was referring to Array.iteri from stdlib
<rixed_>
List.iteri is not in stdlib (3.12.0) as far as I can tell
<thelema>
well, I'll change it in v2 (can't in 1.x, as it's backwards incompatible)
<rixed_>
not a big deal, I can do it if you want
<thelema>
oh, I was assuming that it was added - there's been a number of stdlib functions like this that have been added recently.
<thelema>
there's reasons either way - (-> 'b) is potentially more convenient, (-> unit) is potentially safer in terms of making sure the user isn't losing data.
<thelema>
In this case, I think erring on safety is fine, go ahead and fix the v2 branch
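To make the trade-off concrete, here is a sketch of the two possible signatures side by side. Both functions are written from scratch for illustration (`iteri_unit` and `iteri_any` are made-up names), not copied from Batteries or extlib.

```ocaml
(* Strict version: the callback must return unit, so a caller that
   accidentally discards a useful result gets a type error. *)
let iteri_unit (f : int -> 'a -> unit) (l : 'a list) : unit =
  let rec loop i = function
    | [] -> ()
    | x :: rest -> f i x; loop (i + 1) rest
  in
  loop 0 l

(* Permissive version: any result type is accepted and explicitly ignored. *)
let iteri_any (f : int -> 'a -> 'b) (l : 'a list) : unit =
  let rec loop i = function
    | [] -> ()
    | x :: rest -> ignore (f i x); loop (i + 1) rest
  in
  loop 0 l
```

With the strict version, a caller who really wants to drop a result writes the `ignore` at the call site, which keeps the decision visible.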
<rixed_>
thelema: will do
lopex has quit [Ping timeout: 252 seconds]
<rixed_>
thelema: funny thing is, List.iteri was not ignoring f's return value. I guess we had a compilation warning there.
lopex has joined #ocaml
metasyntax|work has joined #ocaml
<thelema>
rixed_: I see the "f n h;" in List.iteri - I wonder how that got through the type-checker - maybe it's a corner case that doesn't currently trigger a warning
<rixed_>
thelema: Or this warning is disabled?
<thelema>
if so, please enable it. I know we have warning -> error turned on in the build system
<mfp>
that warning is quite recent IIRC
<thelema>
mfp: for non-unit used with ;?
<mfp>
yes
<mfp>
can't find it in Changes though
<mfp>
maybe I'm confusing w/ Lwt's >>
<flux>
thelema, btw, while it's nice during development, it's also a good idea to disable that for actual releases :)
<flux>
wish ocaml was able to survive through errors and warnings, though..
<mfp>
warning number 10 > Expression on the left-hand side of a sequence that doesn’t have type "unit" (and that is not a function, see warning number 5).
<mfp>
5 > Partially applied function: expression whose result has function type and is ignored.
<rixed_>
thelema: as far as I can tell all warnings are enabled in myocamlbuild.ml
<mfp>
hmm there's also this notice "Note that warnings 5 and 10 are not always triggered, depending on the internals of the type checker."
<rixed_>
So maybe this caveat applies: "Note that warnings 5 and 10 are not always triggered, depending on the internals of the type checker."
<flux>
thelema, let's say the user gets a new version of the compiler that has more warnings
<flux>
thelema, now he can no longer compile batteries
<rixed_>
thelema: I strongly support flux opinion.
<rixed_>
thelema: too often something breaks because the compiler authors add a new warning
<thelema>
ah. How to do this on releases only?
<flux>
I guess you have a mechanism for making releases?
<rixed_>
thelema: although with ocaml it never happened to me, it's very frequent with gcc
<thelema>
'make release' runs the tests and then calls 'git archive ...'
<flux>
it could simply be adding an empty file 'release' into the archive and detecting that during configure or something
<mfp>
isn't that only a problem if we enable all warnings instead of listing those we do want (known by the current compiler)?
<flux>
mfp, there could be bugs in warnings that get fixed in a new version
<flux>
or better diagnostics for the same issues could be developed
<thelema>
well, if someone wants to make this happen for batteries, it'll be quickly merged.
<thelema>
flux: otoh, maybe the ocaml devs have realized this is a problem, which is exactly why they have their current system of warnings, and they may guarantee that new warnings appearing is a bug
<mfp>
hmm is it unreasonable to assume a batteries release will usually follow a new OCaml shortly?
<mfp>
maybe not for a bugfix release
<thelema>
only when the ocaml release breaks batteries
<thelema>
which seems to happen all too frequently, but that's partially our fault
<mfp>
... does that include "broken build"? ;)
<mfp>
is batteries using module type of xxx yet?
<flux>
well, maybe the fix will be integrated the first time an ocaml build fails due to -warn-error ;)
<mfp>
in particular include module type of xxx to extend interfaces
<rixed_>
thelema: while we are at it, did you notice that ocaml master branch breaks batteries... badly, it seems.
<rixed_>
due to the addition of some optional parameters here and there
<thelema>
rixed_: yes, issue #151 and michael's emails to the list
<thelema>
I may make time this weekend to work on it, unless someone beats me to it.
alfa_y_omega has joined #ocaml
iratsu has quit [Ping timeout: 250 seconds]
iratsu has joined #ocaml
joewilliams_away is now known as joewilliams
agent|NaCl is now known as NaCl
sepp2k has joined #ocaml
joewilliams is now known as joewilliams_away
joewilliams_away is now known as joewilliams
ztfw has joined #ocaml
lopex has quit []
<hcarty>
mfp: Batteries does not use any 3.12-isms yet
<thelema>
hcarty: oops, somehow missed those questions. thanks.
<thelema>
we can probably start using 3.12-isms in the v2 branch, if anyone wants to start that process
<hcarty>
thelema: You're welcome. It's been on my mind recently, since it looks like it could save effort and code duplication.
<thelema>
totally agreed - those new abilities seem a very good fit for what batteries is trying to do.
rixed_ has quit [Quit: shutdown]
lopex has joined #ocaml
ulfdoz has joined #ocaml
lopexx has joined #ocaml
lopex has quit [Ping timeout: 260 seconds]
lopexx has quit []
lopex has joined #ocaml
lopexx has joined #ocaml
lopex has quit [Ping timeout: 264 seconds]
lopexx has quit [Client Quit]
lopex has joined #ocaml
larhat has quit [Quit: Leaving.]
ulfdoz has quit [Ping timeout: 260 seconds]
ulfdoz has joined #ocaml
lopex has quit [Ping timeout: 258 seconds]
avsm has quit [Quit: Leaving.]
lopex has joined #ocaml
ftrvxmtrx has quit [Quit: This computer has gone to sleep]
<flux>
I wonder, does anyone here care about ip-camera-based video monitoring/archival/etc?
<flux>
I'm contemplating beginning a related summer project for myself
othiym23 has joined #ocaml
Tobu has quit [Ping timeout: 260 seconds]
ftrvxmtrx has joined #ocaml
bobry1 has quit [Ping timeout: 255 seconds]
ygrek has joined #ocaml
alexyk has joined #ocaml
sepp2k has quit [Ping timeout: 255 seconds]
Julien_T has quit [Ping timeout: 260 seconds]
Tobu has joined #ocaml
sepp2k has joined #ocaml
Julien_Tz has joined #ocaml
Julien_Tz is now known as Julien_T
<adrien>
it's frustrating to see a pretty fast zipper implementation slowed down by the GC ='(
avsm has joined #ocaml
jamii has joined #ocaml
jamii has quit [Ping timeout: 260 seconds]
RAW is now known as impy
fraggle_ has quit [Quit: -ENOBRAIN]
<flux>
isn't ocaml gc actually pretty fast?
<thelema>
if the minor heap is too small...
fraggle_ has joined #ocaml
<adrien>
it's pretty fast but I have a lot of allocations, and I'm not sure this can be reduced
<flux>
I guess some nifty static analysis could do wonders for allocation rates
<thelema>
adrien: allocations on the minor heap are very cheap
<flux>
I had this program years ago that created a lot of garbage each frame (30 fps or 60 fps, cannot remember..), increasing the minor heap did wonders
<flux>
but cannot help thinking how little work a pooled memory allocator would need to do for that..
<thelema>
flux: just make sure your data fits in the minor heap, done.
<adrien>
top functions: 21.5%: mark_slice; 17.5%: sweep_slice
<flux>
thelema, well, it would've been easy to see statically that I threw the data away :)
<adrien>
length of the zipper here is 100_000
<flux>
iirc it was all functional code
<thelema>
flux: ah, ok.
ftrvxmtrx has quit [Read error: Connection reset by peer]
<thelema>
adrien: so make your heap 200K words
<thelema>
minor heap
<adrien>
3.12.1, it's 256k by default now (even though 'man Gc' still says 32k), and I've made it 2M words actually
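For reference, a minimal sketch of raising the minor heap size from inside a program with the standard `Gc` module. The helper name `set_minor_heap_words` is made up for this example, and 2M words is just the figure quoted above; the runtime may round the requested size, so the function returns whatever value is in effect afterwards.

```ocaml
(* Request a minor heap of [words] words and return the size
   the runtime reports afterwards. *)
let set_minor_heap_words (words : int) : int =
  Gc.set { (Gc.get ()) with Gc.minor_heap_size = words };
  (Gc.get ()).Gc.minor_heap_size

(* 2M words, as in the discussion. *)
let effective_minor_heap_words = set_minor_heap_words (2 * 1024 * 1024)
```

The same setting can usually be passed from outside through the `OCAMLRUNPARAM` environment variable (the `s` parameter), which avoids recompiling while experimenting.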
<adrien>
actually, I'm wondering if the garbage might not be created by something else, I should try with an array
ftrvxmtrx has joined #ocaml
ftrvxmtrx has quit [Quit: Leaving]
ulfdoz has quit [Ping timeout: 276 seconds]
Julien_T has quit [Ping timeout: 260 seconds]
<adrien>
well, no garbage for the array-based implementation so it's indeed all generated by my zipper implementation and not by lablgtk2 or some other layer/library
<adrien>
good night
ikaros has quit [Quit: Ex-Chat]
_andre has quit [Quit: leaving]
iratsu has quit [Quit: Leaving.]
tautologico has joined #ocaml
Cyanure has joined #ocaml
edwin has quit [Remote host closed the connection]
ftrvxmtrx has joined #ocaml
Modius has joined #ocaml
ftrvxmtrx has quit [Read error: Connection reset by peer]
ftrvxmtrx has joined #ocaml
dsheets has quit [Read error: Operation timed out]