gildor changed the topic of #ocaml to: Discussions about the OCaml programming language | http://caml.inria.fr/ | OCaml 3.12.1 http://bit.ly/nNVIVH
joewilliams_away is now known as joewilliams
Obfuscate has quit [Ping timeout: 246 seconds]
othiym23 has quit [Ping timeout: 264 seconds]
jamii has quit [Ping timeout: 258 seconds]
lopex has quit []
othiym23 has joined #ocaml
joewilliams is now known as joewilliams_away
joewilliams_away is now known as joewilliams
Obfuscate has joined #ocaml
othiym23 has quit [Quit: Linkinus - http://linkinus.com]
wagle has quit [Ping timeout: 246 seconds]
wagle has joined #ocaml
sepp2k has quit [Quit: Leaving.]
joewilliams is now known as joewilliams_away
joewilliams_away is now known as joewilliams
alexyk has quit [Quit: alexyk]
mfp has quit [Ping timeout: 250 seconds]
iratsu has joined #ocaml
mfp has joined #ocaml
NaCl is now known as agent|NaCl
joewilliams is now known as joewilliams_away
alexyk has joined #ocaml
alexyk has quit [Read error: Connection reset by peer]
alexyk has joined #ocaml
alexyk has quit [Quit: alexyk]
othiym23 has joined #ocaml
ulfdoz has joined #ocaml
Tianon has quit [Ping timeout: 260 seconds]
ankit9 has joined #ocaml
joewilliams_away is now known as joewilliams
jamii has joined #ocaml
Modius has quit [Quit: "Object-oriented design" is an oxymoron]
ulfdoz has quit [Read error: Operation timed out]
alexyk has joined #ocaml
Julien_t has quit [Ping timeout: 258 seconds]
Cyanure has joined #ocaml
vivanov has joined #ocaml
vivanov has quit [Quit: leaving]
edwin has joined #ocaml
alexyk has quit [Quit: alexyk]
jamii has quit [Ping timeout: 260 seconds]
joewilliams is now known as joewilliams_away
Cyanure has quit [Remote host closed the connection]
othiym23 has quit [Quit: Linkinus - http://linkinus.com]
ygrek has joined #ocaml
edwin has quit [Remote host closed the connection]
Obfuscate has quit [Ping timeout: 246 seconds]
bobry1 has joined #ocaml
bobry1 has quit [Client Quit]
bobry1 has joined #ocaml
Obfuscate has joined #ocaml
ftrvxmtrx has quit [Read error: Connection reset by peer]
edwin has joined #ocaml
ftrvxmtrx has joined #ocaml
ygrek has quit [Ping timeout: 250 seconds]
ftrvxmtrx has quit [Ping timeout: 264 seconds]
Obfuscate has quit [Ping timeout: 246 seconds]
blinky has joined #ocaml
ikaros has joined #ocaml
ftrvxmtrx has joined #ocaml
Obfuscate has joined #ocaml
rgrinberg_ has quit [Read error: Connection reset by peer]
StepanKuzmin has joined #ocaml
avsm has quit [Quit: Leaving.]
StepanKuzmin has quit [Read error: Connection reset by peer]
avsm has joined #ocaml
StepanKuzmin has joined #ocaml
StepanKuzmin has quit [Read error: Connection reset by peer]
StepanKuzmin has joined #ocaml
Julien_Tz has joined #ocaml
Julien_Tz is now known as Julien_T
Amorphous has quit [Ping timeout: 252 seconds]
StepanKuzmin has quit [Remote host closed the connection]
ftrvxmtrx has quit [Ping timeout: 246 seconds]
lopex has joined #ocaml
StepanKuzmin has joined #ocaml
Amorphous has joined #ocaml
ftrvxmtrx has joined #ocaml
Tianon has joined #ocaml
Tianon has quit [Changing host]
Tianon has joined #ocaml
blinky has quit [Quit: /quit]
larhat1 has quit [Ping timeout: 255 seconds]
ygrek has joined #ocaml
_andre has joined #ocaml
larhat has joined #ocaml
StepanKuzmin has quit [Read error: Connection reset by peer]
StepanKuzmin has joined #ocaml
rixed_ has joined #ocaml
<rixed_> I'm looking for a one-liner that returns a random enum of N items from another enum (using batteries)
<rixed_> Something like "a_long_enum |> choose N"
<rixed_> any idea?
<hcarty> rixed_: (Enum.filter (fun _ -> Random.bool ()))
<hcarty> oops... that doesn't meet the N criteria
<hcarty> rixed_: Do you want only unique entries, or are duplicated ok?
<rixed_> hcarty: unique would be nice
<rixed_> maybe if Random.choice could also return the original enum without the chosen item, I could repeat N times
<rixed_> speed is not required, though
<hcarty> rixed_: One challenge is that enums are consumable. So they are difficult to work with when you want anything other than the head element
<rixed_> hcarty: Originally I had a list, but I converted it to an enum hoping I would have more functions available
<hcarty> rixed_: You could use a list (or set or anything ordered I suppose), sort/order using a random function
<hcarty> rixed_: Then take the first N elements from the result.
<kerneis> be extremely careful about sorting with a random function
<kerneis> depending on your sort algorithm, the result might be far from random
<rixed_> kerneis: yeah I suppose with some algorithms the sort may not even terminate
<rixed_> Or I can transform my original enum to a cyclic one and skip at random... no, wouldn't be unique
<rixed_> Same idea but safer: use shuffle on the original enum and take the first N.
<kerneis> rixed_: why don't you copy the source of Random.choice and modify it?
<kerneis> but yes, shuffle then take would do the same
<rixed_> BTW, any method to build a list from an enum (without getting the No_more_elements exception)?
adrien is now known as Camarade_Tux
Camarade_Tux is now known as Camarade_Tux_
Camarade_Tux_ is now known as Camarade_Tux__
Camarade_Tux__ is now known as adr1en
adr1en is now known as adrien_
adrien_ is now known as adrien
<sgnb> kerneis: bad practice
<hcarty> rixed_: List.of_enum. Most (many?) Batteries modules have something similar.
ygrek has quit [Ping timeout: 250 seconds]
<rixed_> List.of_enum raises the aforementioned exception every time I use it. I suppose this is related to the consumable nature of enums...
StepanKuzmin has quit [Read error: Connection reset by peer]
<thelema> rixed_: very odd - List.of_enum should return an empty list even if there are no elements in the enum.
<thelema> are you sure it's raising the exception?
<thelema> also, let choose n e = Random.shuffle e |> Array.enum |> Enum.take n
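For readers without Batteries at hand, the shuffle-then-take idea can be sketched with the standard library alone. The name `shuffle_take` and the list-based input are illustrative assumptions, not part of Batteries; the shuffle is a plain Fisher-Yates pass over a copy of the input.

```ocaml
(* Hypothetical helper: shuffle a copy of [xs] with Fisher-Yates, then keep
   the first [n] elements (or all of them if [xs] is shorter than [n]). *)
let shuffle_take n xs =
  let a = Array.of_list xs in
  let len = Array.length a in
  for i = len - 1 downto 1 do
    (* Swap position i with a uniformly chosen position in [0, i]. *)
    let j = Random.int (i + 1) in
    let tmp = a.(i) in
    a.(i) <- a.(j);
    a.(j) <- tmp
  done;
  Array.to_list (Array.sub a 0 (max 0 (min n len)))
```

Because every element is placed by an independent uniform swap, this avoids the bias that sorting with a random comparison function can introduce.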
StepanKuzmin has joined #ocaml
RAW has joined #ocaml
<thelema> You could write a function that did one pass through the enum and chose or discarded each element with the correct probability, depending on the number of elements left in the enum and the number of elements still needed in the choice. I think this might be statistically valid...
impy has quit [Ping timeout: 276 seconds]
<thelema> let keep need have = if need >= have then true else Random.float have <= need
<kerneis> thelema: I thought about that but you might get unlucky and have too few elements
<kerneis> hmm, no you are right
<thelema> If you want to contribute the final function, I'll add it to batteries, as I've wanted random sampling from an enum too
<kerneis> thelema: Random.float need <= have (and Random.int would work too I guess)
<zorun> thelema: "(need >= have) || (Random.float have <= need)" looks prettier
<thelema> if you need 1 and have 50...
<thelema> zorun: agreed
<thelema> and yes, int is better. I was thinking I'd need more float arithmetic, but in the end I didn't
<kerneis> oh, I thought "have" was how many you still have in your enum
<thelema> yes, have is Enum.count, need = number left to sample
<kerneis> so if you need 1 and have 100 left, you should pick a number between 0 and 99 and check if it is 1
<thelema> less than 1, yes
<thelema> Random.int have < need
<kerneis> ok, I'm tired
<kerneis> you are right
<thelema> and this is why it needs to be written once, double-checked, and then just used
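The one-pass scheme settled on above (keep an element when `Random.int have < need`) can be sketched as follows. This is a minimal version over a plain list whose length is known, not the Batteries function thelema asks for; `sample_known` is a hypothetical name.

```ocaml
(* Hypothetical helper: choose [n] elements of [xs] uniformly, in one pass,
   given that the total count is known up front. *)
let sample_known n xs =
  let rec go need have xs acc =
    match xs with
    | [] -> List.rev acc
    | _ when need <= 0 -> List.rev acc       (* enough picked: stop early *)
    | x :: rest ->
      (* [have] elements remain (including x) and [need] are still wanted,
         so x is kept with probability need/have. *)
      if Random.int have < need
      then go (need - 1) (have - 1) rest (x :: acc)
      else go need (have - 1) rest acc
  in
  go n (List.length xs) xs []
```

When `need >= have` the test always succeeds, so the result never comes up short, and the original order of the chosen elements is preserved.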
<mfp> does the above assume you have Enum.count?
<thelema> mfp: yes, no way to sample from an infinite enum with uniform probability
<rixed_> thelema: I'm trying to figure out where the No_more_elements exception comes from. I can't reproduce it in the repl, yet I can't get rid of it in my program :-/
<kerneis> mfp: if you do not have Enum.count, I think you can't do it reliably
<mfp> I believe it can be done without it in a single pass w/o forcing Enum.count; it can be done for N=1, at least, and it seems easy to generalize
<thelema> mfp: please explain
<mfp> for N=1, you pick the m-th element (replacing whatever you'd picked so far) with probability 1/m
<flux> rixed_, are backtraces useless?
<thelema> umm, that's not a probability distribution
<thelema> mfp: sum (1/m) > 1.0
<mfp> thelema: _replacing_ what you had (for N=1)
* thelema is confused as to what that means
<mfp> say you only want 1 element
<thelema> ok
<mfp> you do let elem = ref None in ...
<mfp> then you get the elements + their indices (x, m) and with probability 1/m do elem := Some x
<mfp> so you pick the 1st one with probability one
<mfp> then the 2nd one replaces your pick with P 0.5
<rixed_> mfp: looks good to me
<mfp> 3rd one 1/3 and so on; at that point, the probability of !elem = Some x_i is 1/3 for all the elements (= for all i)
<thelema> mfp: hmm, that seems to work... how to generalize to n elements? a set with limited size, when it's overfull, remove oldest? random? element?
<thelema> mfp: hmm, that does seem to work.
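mfp's N=1 scheme can be written out as a short fold. This is a sketch over a list rather than a Batteries enum, and `pick_one` is a hypothetical name; the point is that the length is never consulted.

```ocaml
(* Hypothetical helper: return one uniformly chosen element without knowing
   the length in advance. The m-th element (1-based) replaces the current
   pick with probability 1/m. *)
let pick_one xs =
  let _, picked =
    List.fold_left
      (fun (m, picked) x ->
         let m = m + 1 in
         (* Random.int m = 0 happens with probability 1/m; for m = 1 it is
            certain, so the first element is always picked initially. *)
         (m, if Random.int m = 0 then Some x else picked))
      (0, None) xs
  in
  picked
```

An empty input yields `None`; otherwise every element ends up as the pick with the same probability, at the cost of always scanning the whole input.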
<mfp> something like: pick first N elements, then for m-th element (m > N), pick it with probability N/m, replacing a random element from the previous picks
<mfp> just a hunch, gotta make sure the probabilities work out
<thelema> I think that might work too.
* thelema would code it up and do tests to see if elements are being chosen with uniform probability
<rixed_> thelema: Hmmm, the problem is that my program (which is a custom toplevel) was opening batteries_uni instead of batteries... yet I can't figure out what exactly triggered the bug.
<thelema> rixed_: odd, batteries_uni is what should be opened for batteries 1.x in a non-threads environment
<rixed_> in other words, calling choose defined in a module compiled with batteries, from a toplevel where batteries_uni was opened, yields the No_more_elements exception after choose n |> List.of_enum... for some reason :-/
<thelema> different exceptions between the two modules because of module paths. a funny behavior that produces unexpected results sometimes
<thelema> glad you fixed that. I wish I could get rid of the batteries_uni/batteries distinction
<thelema> maybe if I put dummy threads modules in batteries_uni, they can share the same interface...
<rixed_> thelema: would have been great to have an error when opening Batteries_uni, though.
StepanKuzmin has quit [Read error: Connection reset by peer]
<thelema> hmm... I don't think that's possible, as Batteries includes Batteries_uni
* rixed_ is not familiar enough with modules sha1s and flags to have a clue.
<thelema> opening batteries_uni doesn't run any code provided by batteries, it just adds identifiers to the current namespace, unlike in other languages. link-time is when modules get run
StepanKuzmin has joined #ocaml
<mfp> thelema: "replace random element from previous picks" doesn't work :-/ the probability of the m-th being chosen when there are m+1 elements is N/m * (N-1)/N = (N-1)/m <> N/(m+1)
<rixed_> mfp: there is also a performance problem: you are forced to scan the whole enum, so choose 1 amongst 1000 would be slow, while thelema's idea stops once one gets chosen
<thelema> there's two cases - the m+1 element is added to the choices and the m+1 element isn't added
StepanKuzmin has quit [Remote host closed the connection]
<mfp> rixed: but you don't want to stop --- you have to scan the whole sequence if each element can be chosen with same probability
<mfp> thelema: hmm right, then it's N/m * ((N-1)/N * N/(m+1) + (m+1-N)/(m+1)) = N/(m+1)
<rixed_> mfp: not with the algo proposed by thelema if I understood correctly
<thelema> mfp: if you have 1k elements, you can simply choose the first with p(1/1000), and if you get lucky, you stop there.
<thelema> no need to scan whole list every time, only in the case that you don't choose any of the first 999 elements
<mfp> only if you know you have 1000 elements
<rixed_> mfp: yes, that's a requirement
<thelema> yes, that's the precondition of this method, you know the # of elements
<rixed_> mfp: so you must count the items, but it's not slower than scanning all of them, and you probably already know the count
<rixed_> but your solution is nicer I think, if it happens to work :-)
<mfp> rixed_: it requires more memory, though
<mfp> than an online algorithm
<mfp> http://propersubset.com/2010/04/choosing-random-elements.html has got an online algorithm for k out of n
<thelema> if you can know the count w/o generating the whole sequence, then it's faster, otherwise not
<thelema> luckily, enums have a flag telling them whether the count function is fast, so we can use each algorithm exactly when it's appropriate
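The online variant mfp describes (keep the first N elements, then let the m-th element replace a random earlier pick with probability N/m) is commonly known as reservoir sampling. A minimal sketch over a list follows; `reservoir_sample` is a hypothetical name, and nothing here is taken from Batteries or from the linked article.

```ocaml
(* Hypothetical helper: choose up to [n] elements uniformly in one pass,
   without knowing the length of [xs] in advance. *)
let reservoir_sample n xs =
  match xs with
  | [] -> [||]
  | _ when n <= 0 -> [||]
  | first :: _ ->
    (* [first] is only a placeholder used to initialise the array. *)
    let res = Array.make n first in
    let seen = ref 0 in
    List.iter
      (fun x ->
         incr seen;
         let m = !seen in
         if m <= n then res.(m - 1) <- x      (* fill the reservoir first *)
         else begin
           (* j is uniform in [0, m): it lands below n with probability
              n/m, and then names a uniformly chosen slot to evict. *)
           let j = Random.int m in
           if j < n then res.(j) <- x
         end)
      xs;
    Array.sub res 0 (min n !seen)
```

An element already in the reservoir survives the arrival of element m+1 with probability 1 - (N/(m+1)) * (1/N) = m/(m+1), so its chance of being present goes from N/m to N/(m+1), which is the invariant the discussion was checking. The trade-off is the one raised in the chat: the whole input is always scanned, but no count is needed.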
<rixed_> thelema: why is List.iteri prototype "(int -> 'a -> 'b) -> 'a list -> unit" instead of "(int -> 'a -> unit) -> 'a list ->unit" ?
<rixed_> ie. why 'b ?
<thelema> rixed_: good question... I guess it does the ignore() for the user
<thelema> probably inherited from extlib
<rixed_> thelema: aren't we supposed to favor compatibility with stdlib more than with extlib?
<thelema> -> 'b is compatible with stdlib
<thelema> and list.iteri was only recently added to stdlib, no?
<rixed_> yes, but stdlib would check the callback returns unit while extlib won't care
<rixed_> thelema: I was referring to Array.iteri from stdlib
<rixed_> List.iteri is not in stdlib (3.12.0) as far as I can tell
<thelema> well, I'll change it in v2 (can't in 1.x, as it's backwards incompatible)
<rixed_> not a big deal, I can do it if you want
<thelema> oh, I was assuming that it was added - there's been a number of stdlib functions like this that have been added recently.
<thelema> there's reasons either way - (-> 'b) is potentially more convenient, (-> unit) is potentially safer in terms of making sure the user isn't losing data.
<thelema> In this case, I think erring on safety is fine, go ahead and fix the v2 branch
<rixed_> thelema: will do
lopex has quit [Ping timeout: 252 seconds]
<rixed_> thelema: fun is, List.iteri was not ignoring f return value. I guess we had a compilation warning there.
lopex has joined #ocaml
metasyntax|work has joined #ocaml
<thelema> rixed_: I see the "f n h;" in List.iteri - I wonder how that got through the type-checker - maybe it's a corner case that doesn't currently trigger a warning
<rixed_> thelema: Or this warning is disabled?
<thelema> if so, please enable it. I know we have warning -> error turned on in the build system
<mfp> that warning is quite recent IIRC
<thelema> mfp: for non-unit used with ;?
<mfp> yes
<mfp> can't find it in Changes though
<mfp> maybe I'm confusing w/ Lwt's >>
<flux> thelema, btw, while it's nice during development, it's also a good idea to disable that for actual releases :)
<flux> wish ocaml was able to survive through errors and warnings, though..
<mfp> warning number 10 > Expression on the left-hand side of a sequence that doesn’t have type "unit" (and that is not a function, see warning number 5).
<mfp> 5 > Partially applied function: expression whose result has function type and is ignored.
<rixed_> thelema: as far as I can tell all warnings are enabled in myocamlbuild.ml
<mfp> hmm there's also this notice "Note that warnings 5 and 10 are not always triggered, depending on the internals of the type checker."
<rixed_> So maybe this caveat applies: "Note that warnings 5 and 10 are not always triggered, depending on the internals of the type checker."
<rixed_> (from ocamlc manpage)
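The two candidate signatures for an `iteri`-style function can be compared side by side. This is a sketch, not the Batteries source: the names `iteri_loose` and `iteri_strict` are made up, and the loose version uses an explicit `ignore` so that it does not depend on whether the compiler warns about a bare `f n h;`.

```ocaml
(* (int -> 'a -> 'b) -> 'a list -> unit
   The callback may return anything; the result is discarded for the user. *)
let iteri_loose f l =
  let rec loop n = function
    | [] -> ()
    | h :: t -> ignore (f n h); loop (n + 1) t
  in
  loop 0 l

(* (int -> 'a -> unit) -> 'a list -> unit
   The annotation forces the callback to return unit, so a caller who
   returns a value by accident gets a type error instead of silent loss. *)
let iteri_strict (f : int -> 'a -> unit) l =
  let rec loop n = function
    | [] -> ()
    | h :: t -> f n h; loop (n + 1) t
  in
  loop 0 l

let () =
  (* Accepted by the loose version even though the callback returns int. *)
  iteri_loose (fun i x -> i + x) [10; 20; 30];
  (* The strict version requires the caller to discard the value. *)
  iteri_strict (fun i x -> ignore (i + x)) [10; 20; 30]
```

This is the trade-off thelema describes: the loose form is more convenient, the strict form makes dropped results visible at the call site.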
<thelema> :)
<rixed_> thelema: what you said
<thelema> flux: ??? why disable compile warnings -> error on releases?
<flux> thelema, let's say the user gets a new version of the compiler that has more warnings
<flux> thelema, now he cannot anymore compile batteries
<rixed_> thelema: I strongly support flux opinion.
<rixed_> thelema: too often something breaks because the compiler authors add a new warning
<thelema> ah. How to do this on releases only?
<flux> I guess you have a mechanism for making releases?
<rixed_> thelema: although for ocaml it never happened to me, it's very frequent with gcc
<thelema> 'make release' runs the tests and then calls 'git archive ...'
<flux> it could simply be adding an empty file 'release' into the archive and detecting that during configure or something
<mfp> isn't that only a problem if we enable all warnings instead of listing those we do want (known by the current compiler)?
<flux> mfp, there could be bugs in warnings that get fixed in a new version
<flux> or better diagnostics for the same issues could be developed
<thelema> well, if someone wants to make this happen for batteries, it'll be quickly merged.
<thelema> flux: otoh, maybe the ocaml devs have realized this is a problem, which is exactly why they have their current system of warnings, and they may guarantee that new warnings appearing is a bug
<mfp> hmm is it unreasonable to assume a batteries release will usually follow a new OCaml shortly?
<mfp> maybe not for a bugfix release
<thelema> only when the ocaml release breaks batteries
<thelema> which seems to happen all too frequently, but that's partially our fault
<mfp> ... does that include "broken build"? ;)
<mfp> is batteries using module type of xxx yet?
<flux> well, maybe the fix will be integrated the first time ocaml build fails due to -warn-error ;)
<mfp> in particular include module type of xxx to extend interfaces
<rixed_> thelema: while we are at it, did you notice that ocaml master branch breaks batteries... badly, it seems.
<rixed_> due to the addition of some optional parameters here and there
<thelema> rixed_: yes, issue #151 and michael's emails to the list
<thelema> I may make time this weekend to work on it, unless someone beats me to it.
alfa_y_omega has joined #ocaml
iratsu has quit [Ping timeout: 250 seconds]
iratsu has joined #ocaml
joewilliams_away is now known as joewilliams
agent|NaCl is now known as NaCl
sepp2k has joined #ocaml
joewilliams is now known as joewilliams_away
joewilliams_away is now known as joewilliams
ztfw has joined #ocaml
lopex has quit []
<hcarty> mfp: Batteries does not use any 3.12-isms yet
<thelema> hcarty: oops, somehow missed those questions. thanks.
<thelema> we can probably start using 3.12-isms in the v2 branch, if anyone wants to start that process
<hcarty> thelema: You're welcome. It's been on my mind recently, since it looks like it could save effort and code duplication.
<thelema> totally agreed - those new abilities seem a very good fit for what batteries is trying to do.
rixed_ has quit [Quit: shutdown]
lopex has joined #ocaml
ulfdoz has joined #ocaml
lopexx has joined #ocaml
lopex has quit [Ping timeout: 260 seconds]
lopexx has quit []
lopex has joined #ocaml
lopexx has joined #ocaml
lopex has quit [Ping timeout: 264 seconds]
lopexx has quit [Client Quit]
lopex has joined #ocaml
larhat has quit [Quit: Leaving.]
ulfdoz has quit [Ping timeout: 260 seconds]
ulfdoz has joined #ocaml
lopex has quit [Ping timeout: 258 seconds]
avsm has quit [Quit: Leaving.]
lopex has joined #ocaml
ftrvxmtrx has quit [Quit: This computer has gone to sleep]
<flux> I wonder, does anyone here care about ip-camera-based video monitoring/archival/etc?
<flux> I'm contemplating beginning a related summer project for myself
othiym23 has joined #ocaml
Tobu has quit [Ping timeout: 260 seconds]
ftrvxmtrx has joined #ocaml
bobry1 has quit [Ping timeout: 255 seconds]
ygrek has joined #ocaml
alexyk has joined #ocaml
sepp2k has quit [Ping timeout: 255 seconds]
Julien_T has quit [Ping timeout: 260 seconds]
Tobu has joined #ocaml
sepp2k has joined #ocaml
Julien_Tz has joined #ocaml
Julien_Tz is now known as Julien_T
<adrien> it's frustrating to see a pretty fast zipper implementation slowed down by the GC ='(
avsm has joined #ocaml
jamii has joined #ocaml
jamii has quit [Ping timeout: 260 seconds]
RAW is now known as impy
fraggle_ has quit [Quit: -ENOBRAIN]
<flux> isn't ocaml gc actually pretty fast?
<thelema> if the minor heap is too small...
fraggle_ has joined #ocaml
<adrien> it's pretty fast but I have a lot of allocations, and I'm not sure this can be reduced
<flux> I guess some nifty static analysis could do wonders to allocation rates
<thelema> adrien: allocations on the minor heap are very cheap
<flux> I had this program years ago that created a lot of garbage each frame (30 fps or 60 fps, cannot remember..), increasing the minor heap did wonders
<flux> but cannot help thinking how little work a pooled memory allocator would need to do for that..
<thelema> flux: just make sure your data fits in the minor heap, done.
<adrien> top functions: 21.5%: mark_slice; 17.5%: sweep_slice
<flux> thelema, well, it would've been easy to see statically that I threw the data away :)
<adrien> length of the zipper here is 100_000
<flux> iirc it was all functional code
<thelema> flux: ah, ok.
ftrvxmtrx has quit [Read error: Connection reset by peer]
<thelema> adrien: so make your heap 200K words
<thelema> minor heap
<adrien> 3.12.1, it's 256k by default now (even though 'man Gc' still says 32k), and I've made it 2M words actually
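The minor heap size discussed here can be changed from inside the program through the standard `Gc` module. The sketch below is only an illustration: `set_minor_heap_words` is a hypothetical helper, and the size passed in is an arbitrary example rather than a recommendation.

```ocaml
(* Hypothetical helper: resize the minor heap; the size is given in words. *)
let set_minor_heap_words words =
  Gc.set { (Gc.get ()) with Gc.minor_heap_size = words }

let () =
  set_minor_heap_words (256 * 1024);
  let c = Gc.get () in
  Printf.printf "minor heap: %d words\n" c.Gc.minor_heap_size;
  (* quick_stat is cheap and shows how often each collector has run. *)
  let s = Gc.quick_stat () in
  Printf.printf "minor collections: %d, major collections: %d\n"
    s.Gc.minor_collections s.Gc.major_collections
```

Comparing the two collection counters before and after a workload gives a rough idea of whether data is dying in the minor heap or being promoted and handled by the major collector.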
<adrien> actually, I'm wondering if the garbage might not be created by something else, I should try with an array
ftrvxmtrx has joined #ocaml
ftrvxmtrx has quit [Quit: Leaving]
ulfdoz has quit [Ping timeout: 276 seconds]
Julien_T has quit [Ping timeout: 260 seconds]
<adrien> well, no garbage for the array-based implementation so it's indeed all generated by my zipper implementation and not by lablgtk2 or some other layer/library
<adrien> good night
ikaros has quit [Quit: Ex-Chat]
_andre has quit [Quit: leaving]
iratsu has quit [Quit: Leaving.]
tautologico has joined #ocaml
Cyanure has joined #ocaml
edwin has quit [Remote host closed the connection]
ftrvxmtrx has joined #ocaml
Modius has joined #ocaml
ftrvxmtrx has quit [Read error: Connection reset by peer]
ftrvxmtrx has joined #ocaml
dsheets has quit [Read error: Operation timed out]
ztfw has quit [Remote host closed the connection]
dsheets has joined #ocaml
Julien_Tz has joined #ocaml
Julien_Tz is now known as Julien_T
ygrek has quit [Ping timeout: 250 seconds]
Cyanure has quit [Ping timeout: 240 seconds]
lopex has quit []
Obfuscate has quit [Quit: WeeChat 0.3.5]
Obfuscate has joined #ocaml