<thelema>
mcstar: they're represented as a block composed of a tag word that has the length of the array in it plus the words of the data in the array. fast to index, fast to allocate
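To make that layout concrete, here is a small sketch that peeks at an array's block with the standard library's unsafe `Obj` module. It is for illustration only; the variable names are made up for this example.

```ocaml
(* Inspect the runtime block behind an int array.
   Obj is unsafe; use it to explore, not in production code. *)
let a = [| 10; 20; 30 |]
let block = Obj.repr a

(* The header (the "tag word") records how many fields follow it. *)
let length_from_header = Obj.size block

(* Arrays of non-float values carry tag 0. *)
let tag = Obj.tag block

let () =
  Printf.printf "size=%d tag=%d Array.length=%d\n"
    length_from_header tag (Array.length a)
```

Indexing is then just an offset from the block pointer plus a bounds check against the length stored in the header.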
<mcstar>
thelema: i wrote a prefix trie implementation in ocaml(my first program) and it turned out to be incredibly slow compared to haskell/cpp
<thelema>
mcstar: pastebin the code?
<mcstar>
i cant even load the full problem, i get segfault
<thelema>
In ocaml, mutation does have a GC cost, as it has to keep track of pointers from the old heap to the young heap
djcoin has quit [Read error: Connection reset by peer]
<thelema>
but I don't see anything obviously wrong/inefficient with the code
<mcstar>
thelema: for example, for 50K words, haskell is done in 2 secs, ocaml in 10
<mcstar>
i expected better mutable performance from a non-pure language(in contrast to haskell)
kmicinski has joined #ocaml
<mcstar>
thelema: the part which is slow(the creation of the trie) must be saturated by allocation speed, thats why i asked the question about arrays
<mcstar>
can the runtime show me some GC summary?
<thelema>
I see. OCaml's allocation speed is pretty good in general, you may be hitting a mutation limit...
<thelema>
Gc.stat();;
<adrien>
and the GC has parameters to set
<mcstar>
what does ^^ do?
<mcstar>
is that a full summary?
<mcstar>
i.e. should i execute it at the end?
Sablier has joined #ocaml
<thelema>
ocaml does allocate a pretty large amount of memory - 700MB for 50K words out of my dict.
<mcstar>
thats alright
<mcstar>
my haskell one has 2GB> heap allocation
<mcstar>
>2GB
<mcstar>
but thats for 200K
<mcstar>
thelema: so is there some switch/option to display the summary? that gc.stat only returns a struct with values
<adrien>
quite often, if you know you're going to have a burst of allocations (and no freeing for some time), it can help to tweak the GC and make it less active for a few CPU minutes
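A minimal sketch of that kind of tweak, using `Gc.get` and `Gc.set` from the standard library. The helper name `with_relaxed_gc` and the numbers are assumptions chosen for illustration, not tuned recommendations.

```ocaml
(* Run [f] with a larger minor heap and a lazier major GC,
   then restore the previous settings whether or not [f] raises. *)
let with_relaxed_gc f =
  let old = Gc.get () in
  Gc.set
    { old with
      Gc.minor_heap_size = 4 * 1024 * 1024;  (* in words, not bytes *)
      Gc.space_overhead = 200 };             (* higher = less major GC work *)
  match f () with
  | result -> Gc.set old; result
  | exception e -> Gc.set old; raise e
```

Wrapping only the allocation burst (for example, the trie construction) keeps the rest of the program on the default settings.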
<mcstar>
or is that only possible while profiling?
<adrien>
nah, you can print it, and I think the Gc module has a predefined function to print it
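The predefined printer is `Gc.print_stat`, which takes an output channel. A short sketch, assuming you want the summary at program exit; `gc_summary` is a hypothetical helper that shows how to read individual fields of the record instead.

```ocaml
(* Dump all GC counters to stderr when the program exits. *)
let () = at_exit (fun () -> Gc.print_stat stderr)

(* Or pick out individual fields from the record yourself.
   Gc.quick_stat is cheaper than Gc.stat because it does not walk the heap. *)
let gc_summary () =
  let s = Gc.quick_stat () in
  Printf.sprintf "minor collections: %d, major collections: %d, heap: %d words"
    s.Gc.minor_collections s.Gc.major_collections s.Gc.heap_words
```

Calling it at the end of the run is enough, since the counters are cumulative.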
<thelema>
mcstar: if you're allowing weird optimizations, you may try `type pftrie = pftrie array`
<zorun>
:o
<mcstar>
how would that help?
<thelema>
use arrays of 257 elements, with the last element being a pointer to one of two constant trie nodes, one being true and the other being false
<thelema>
mcstar: it would get rid of all the extra baggage for the record and the option
<mcstar>
hm
<mcstar>
no i dont get it
<mcstar>
wait
<mcstar>
yes, i dont get it, you removed my bool variable, so how can a node be true/false anymore?
<mcstar>
this doesnt typecheck in my head
<thelema>
let yes_node = <some dummy trie node>
<thelema>
let no_node = <some dummy trie node>
<mcstar>
ah, you mean compare by memory address?
<thelema>
if n.(256) == yes_node then (* is true *)
<mcstar>
ok
<mcstar>
that could work
<thelema>
let empty = let rec seed = [| seed; seed; seed; seed; seed; seed; seed; seed |] in let e = Array.concat [seed; seed; seed; seed; seed; seed; seed; seed; [|seed|]] in for i = 0 to 255 do e.(i) <- e; done; e
<thelema>
blah. that's unnecessarily complex without Obj.magic
<mcstar>
:)
<thelema>
let empty = let e = Array.make 257 (Obj.magic 0) in for i = 0 to 255 do e.(i) <- e; done; e
<thelema>
anyway, something like that for yes_node and no_node, and then empty has .(256) <- no_node
<mcstar>
but yes/no nodes dont need initialization
<thelema>
they need to have some value, unless you're planning on just Obj.magic-ing them into existence
<mcstar>
Array.create?
<mcstar>
ah k that needs a seed
<thelema>
yup, strings can be created uninitialized, but arrays have to have a value
<mcstar>
how about just using a sum type?
<mcstar>
option?
<thelema>
has more overhead
<thelema>
Just do either the seed construction or use Obj.magic
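Putting the pieces of this suggestion together, a rough sketch could look like the following. Two assumptions that are not from the discussion: `type pftrie = pftrie array` is a cyclic abbreviation that the compiler rejects without `-rectypes`, so the sketch wraps the array in a single unboxed constructor (needs OCaml 4.04 or later), which keeps the same runtime representation; and the sentinels are built with `Array.make` instead of `Obj.magic`. The names `make_node`, `insert` and `mem` are hypothetical.

```ocaml
(* A node is just an array of 257 nodes: 256 children plus one marker slot. *)
type pftrie = Node of pftrie array [@@unboxed]

let flag = 256          (* index of the true/false marker slot *)
let slots = flag + 1

(* Two distinct heap blocks, used only for their addresses. *)
let fresh_sentinel () = Node (Array.make 1 (Node [||]))
let yes_node = fresh_sentinel ()
let no_node = fresh_sentinel ()

(* Shared "no child" node: every child slot points back at itself. *)
let empty =
  let a = Array.make slots no_node in
  let e = Node a in
  for i = 0 to flag - 1 do a.(i) <- e done;
  e

(* A fresh node with no children that does not end a word. *)
let make_node () =
  let a = Array.make slots empty in
  a.(flag) <- no_node;
  Node a

let insert (Node root) word =
  let cur = ref root in
  String.iter
    (fun c ->
      let i = Char.code c in
      if (!cur).(i) == empty then (!cur).(i) <- make_node ();
      let (Node next) = (!cur).(i) in
      cur := next)
    word;
  (!cur).(flag) <- yes_node

(* Missing children lead into [empty], whose marker is [no_node]. *)
let mem (Node root) word =
  let cur = ref root in
  String.iter
    (fun c -> let (Node next) = (!cur).(Char.code c) in cur := next)
    word;
  (!cur).(flag) == yes_node
```

The root must come from `make_node ()`, never `empty` itself, since `insert` mutates the node it is given. Each node still costs 257 words, so this trades memory for fewer indirections.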
<mcstar>
so what is this Obj.magic exactly?
<mcstar>
can allocate any object?
<thelema>
mcstar: tells the type system that you know what you're doing, even if it's totally wrong.
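In other words it allocates nothing: `Obj.magic` has type `'a -> 'b` and is the identity at runtime. A tiny sketch of what that means; the function names are made up for this example.

```ocaml
(* Only the static type changes; the value is passed through untouched. *)
let code_of_char (c : char) : int = Obj.magic c
  (* happens to work because chars are represented as immediate integers *)

(* The safe equivalent from the standard library. *)
let safe_code_of_char (c : char) : int = Char.code c

(* By contrast, something like (Obj.magic 0 : string) also type-checks,
   but using the result as a string can crash the program. *)
```

That is why `Array.make 257 (Obj.magic 0)` is tolerable only if every slot is overwritten with a real node before anything reads it.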
<hcarty>
thelema: Do you think you'll provide that package file in the odb repository or send it to Jane St. to include in their repository/repositories?
<thelema>
I'll see if someone else can verify that it works to install core, and then suggest it to Jane St.
<thelema>
if they don't want it, then I'll put it in the odb repo
<hcarty>
thelema: Trying now on a relatively fresh (yesterday) ocamlbrew install
<thelema>
hcarty: thanks
lamawithonel has quit []
lamawithonel has joined #ocaml
<hcarty>
thelema: Works here - 64 bit Linux, OCaml 3.12.1 (ocamlbrew), odb.ml snapshot downloaded yesterday
<thelema>
hcarty: great.
<wmeyer``>
thelema: Hi
<thelema>
wmeyer``: hi
<wmeyer``>
thelema: I rebased the patch
<hcarty>
I have a sudden urge to serialize all of my data types as s-expressions...
<wmeyer``>
hcarty: Later you will find your data turns to code :-)
<thelema>
wmeyer``: one sec
<wmeyer``>
hcarty: It's very dangerous
<wmeyer``>
hcarty: Lately I have done some Scheme - not being particularly excited - after 2 hours the walls started to melt :-)
<thelema>
wmeyer``: sorry for making you do that work; I was in the middle of my own merge, and I added some things, so it was easier for me to just finish my merge
<thelema>
wmeyer``: have a look at the new code.
<wmeyer``>
thelema: Not at all. It's my responsibility to merge it with upstream. There was just a bogus git conflict which I fixed quickly.
BiDOrD_ has quit [Remote host closed the connection]
thomasga has quit [Ping timeout: 244 seconds]
thomasga has joined #ocaml
BiDOrD has joined #ocaml
thomasga has quit [Client Quit]
Yoric has joined #ocaml
Yoric has quit [Remote host closed the connection]
Yoric has joined #ocaml
<wmeyer``>
thelema: I got an e-mail, cheers!
djcoin has joined #ocaml
<jonafan>
what's the awesomest way to parse xml
<adrien>
jedi tricks
<djcoin>
regexp
<djcoin>
:)
<jonafan>
those are the same answer
<gnuvince>
jonafan: couple of undergrads
<jonafan>
i like that answer but none of the students here are capable
<gnuvince>
jonafan: then I'd go with an XML parser
<jonafan>
okay, but does an awesome xml parser exist for ocaml
<djcoin>
How original ! :)
emmanuelux has joined #ocaml
<jonafan>
there's a bunch on caml hump, but i don't know which one is the coolest
<hcarty>
jonafan: Word on the street is that the cool kids are using Xmlm
<hcarty>
jonafan: The hip crowd is using PXP
<jonafan>
oh boy, now i have to decide which group is more legit
bjorkintosh has joined #ocaml
<wmeyer``>
thelema: I would be able to implement the _oasis dependency parser - if you don't have much time
Anarchos has joined #ocaml
<thelema>
wmeyer``: I don't have much time, please go ahead.
<wmeyer``>
thelema: Thanks
lamawithonel has quit []
Snark has joined #ocaml
ivan\ has quit [Ping timeout: 240 seconds]
tomprince has quit [Ping timeout: 240 seconds]
ski has quit [Ping timeout: 240 seconds]
pippijn has quit [Ping timeout: 240 seconds]
ivan\ has joined #ocaml
mehitabel has quit [Ping timeout: 252 seconds]
ski has joined #ocaml
pippijn has joined #ocaml
pippijn has quit [Changing host]
pippijn has joined #ocaml
tchell has quit [Ping timeout: 252 seconds]
emmanuelux has quit [Remote host closed the connection]
<wmeyer``>
jonafan: good choice, I think one of the key features is online parsing of XML, there is a corresponding JSON library too
Yoric has joined #ocaml
<adrien>
I guess "online" is sax-like, right? or "stream"?
<jonafan>
unless it meshes well with Lwt, it's moot
<wmeyer``>
adrien: "stream" is perhaps the right term here - the idea is to parse only when it's needed and immediately reclaim everything that is no longer needed
mcstar has left #ocaml []
<wmeyer``>
jonafan: I think Daniel would be able to answer, he was mentioning Lwt integration a while ago, AFAIR. Of course, to use a non-blocking stream you need special support in xmlm
<jonafan>
well, i think it'll be okay
<wmeyer``>
jonafan: As long as you can "poll" the stream, you would be fine.
<mfp>
didn't he post to the ML about his new design allowing to integrate with Lwt and such
<mfp>
?
Snark has quit [Quit: Quitte]
<wmeyer``>
mfp: Yes, I also remember something along those lines a couple of weeks ago
<hcarty>
Is it possible to create a functor function, along the lines of: val make_map : ordered_type_module -> map_module_with_ordered_type_key
<thelema>
Kakadu: that's a good question for their mailing list.
<hcarty>
I expect that "functor function" is the wrong terminology here. I'm not sure what the correct term is.
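One way to get close to this, sketched here as a suggestion rather than an answer given in the channel, is first-class modules: an ordinary function takes a packed module and returns another. The name `make_map` follows the signature asked about above; `IntOrd` and `IntMap` are hypothetical.

```ocaml
(* An ordinary function from a packed ordered-type module
   to a packed map module keyed by that type. *)
let make_map (type k) (module O : Map.OrderedType with type t = k) =
  (module Map.Make (O) : Map.S with type key = k)

(* Usage: pack an ordered type, call the function, unpack the result. *)
module IntOrd = struct
  type t = int
  let compare (a : int) b = compare a b
end

module IntMap = (val make_map (module IntOrd) : Map.S with type key = int)

let m = IntMap.add 1 "one" IntMap.empty
```

The locally abstract type `k` is what ties the key type of the result to the type inside the argument module.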
gnuvince has quit [Ping timeout: 265 seconds]
Anarchos has quit [Quit: Vision[0.9.7-H-090423]: i've been blurred!]
dsheets has quit [Ping timeout: 250 seconds]
<gildor_>
wmeyer``: why implement a dependency parser ?
<gildor_>
for _oasis ?
dsheets has joined #ocaml
zuymanto has quit [Quit: zuymanto]
zuymanto has joined #ocaml
Kakadu has quit [Quit: Konversation terminated!]
zuymanto has quit [Client Quit]
zuymanto has joined #ocaml
zuymanto has quit [Client Quit]
zuymanto has joined #ocaml
Yoric has quit [Ping timeout: 252 seconds]
gnuvince has joined #ocaml
eni has quit [Ping timeout: 245 seconds]
eikke has joined #ocaml
zuymanto has quit [Quit: zuymanto]
zuymanto has joined #ocaml
zuymanto has quit [Client Quit]
zuymanto has joined #ocaml
zuymanto has quit [Client Quit]
kolera has joined #ocaml
<wmeyer``>
gildor_: for odb
<wmeyer``>
gildor_: parse _oasis file and get the dependencies from there
<gildor_>
wmeyer``: why not
<gildor_>
wmeyer``: however, you won't handle all options (flags, conditional and so on)
Anarchos has joined #ocaml
kolera is now known as randori
kmicinski has quit [Ping timeout: 250 seconds]
randori is now known as kolera
kmicinsk1 has joined #ocaml
kmicinsk1 has quit [Ping timeout: 265 seconds]
thomasga has joined #ocaml
jamii has quit [Read error: Operation timed out]
thomasga has quit [Quit: Leaving.]
cyphase has quit [Ping timeout: 245 seconds]
<wmeyer``>
gildor_: Initially my idea was to reuse parts of Oasis
<wmeyer``>
gildor_: And bolt it on. If you can expose it, that would be great - the problem of course is to take a tarball and get the dependencies; we should not be concerned with installing oasis
<gildor_>
wmeyer``: not an easy task in fact ;-)
<wmeyer``>
gildor_: I think I agree with that - but not entirely - oasis can help a lot with extracting partial information from _oasis files, but really we need the dependency list
<gildor_>
wmeyer``: best you can do is to copy and paste the RecDescParser
<wmeyer``>
gildor_: Ouch :-)
<wmeyer``>
gildor_: I agree, or get it from the known location!
<wmeyer``>
gildor_: As a script?
ulfdoz has joined #ocaml
<gildor_>
gildor_: I will probably get rid of this parser in fact, it is ugly and slow
<gildor_>
wmeyer``: my best proposal is that you write a good replacement (fast and clear) and that I replace RecDescParser using that
<gildor_>
i.e. something that generates the right AST and I use this AST in oasis and you use it in odb
Anarchos has quit [Quit: Vision[0.9.7-H-090423]: i've been blurred!]
<dsheets>
anybody use lwt under js_of_ocaml here? how do you force evaluation of Lwt.t? how do you integrate mainloop?
* gildor_
gtg
<wmeyer``>
gildor_: OK. I will look into that over the weekend.
cyphase has joined #ocaml
<wmeyer``>
gildor_: I think with slowness I can't help. But with ugliness - we could have a micro parsing combinator library which I can ship with RecDescParser
<gildor_>
wmeyer``: I am almost not online during the WE, send me a mail to tell me what you want to do so that we can have something great for everyone ;-)
<wmeyer``>
s/I/we :-)
<wmeyer``>
gildor_: OK, try to look at it ASAP - checking out the tree now :-)
cdidd has quit [Remote host closed the connection]
<gildor_>
wmeyer``: there are plenty of tests that parse _oasis in the oasis project, you can probably start by trying to replace the parser until you get something that passes the tests
<wmeyer``>
gildor_: OK. Thanks. BTW: How do you expand one AST to another?
<gildor_>
which AST ?
<wmeyer``>
gildor_: The AST you get from the parser to the simple & expanded AST which is the database of information
<gildor_>
so AST -> OASISTypes.package ?
<wmeyer``>
gildor_: Probably yes -- don't know much about the codebase
<gildor_>
I use a schema-based approach, you register field and value parsers in a schema and the schema is used to extract data and create section/package
<wmeyer``>
gildor_: So you evaluate on demand what you need - interesting approach
<gildor_>
have a look at OASISLibrary* and OASISAst
<gildor_>
really gtg
eikke has quit [Ping timeout: 256 seconds]
eikke has joined #ocaml
smondet has quit [Quit: ERC Version 5.3 (IRC client for Emacs)]
<wmeyer``>
gildor_: thelema Might take some time to get all this done so stay tuned!
<wmeyer``>
is it only here that the weather outside is ridiculously hot
<thelema>
wmeyer``: no rush - it might be easier to build a standalone exe inside the oasis tree that returns a csv (or otherwise) list of deps for a given _oasis file
<thelema>
wmeyer``: if that's the case, we can just install oasis for that exe before installing anything with an _oasis file that doesn't have deps listed
<wmeyer``>
thelema: Sounds like a good starting point idea
<wmeyer``>
thelema: Maybe the oasis driver should be able to give that information in general
<wmeyer``>
thelema: even going a bit further - the oasis-db protocol perhaps should give the dependencies of course
<wmeyer``>
thelema: But I would really start with something standalone
<thelema>
I wouldn't be surprised if there were some way to do this already, through some weird API that no one has ever heard of
<thelema>
One important point is that it's not sufficient to look at all targets (executables + libraries) and union their dependencies, as some targets are test-only
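A toy illustration of that point. The `target` record and the `install_deps` function are hypothetical and much simpler than OASIS's real data model; they only show test-only targets being filtered out before the union is taken.

```ocaml
(* Hypothetical, simplified model of build targets; not OASIS's types. *)
type target = {
  name : string;
  deps : string list;
  test_only : bool;   (* built only when running the test suite *)
}

(* Union of the dependencies of all targets needed for installation. *)
let install_deps targets =
  targets
  |> List.filter (fun t -> not t.test_only)
  |> List.map (fun t -> t.deps)
  |> List.concat
  |> List.sort_uniq compare

let example =
  [ { name = "mylib"; deps = [ "findlib"; "batteries" ]; test_only = false };
    { name = "mytool"; deps = [ "batteries" ]; test_only = false };
    { name = "test_mylib"; deps = [ "oUnit" ]; test_only = true } ]
```

Here `install_deps example` leaves out `oUnit`, which a naive union over all three targets would have pulled in.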
<dsheets>
what versioning system does odb use? semver?
<thelema>
dsheets: none at all.
<thelema>
dsheets: odb has no version.
<dsheets>
i mean for packages
<thelema>
well...
<dsheets>
or is package compatibility not checked?
<thelema>
dependencies give a findlib name (or executable name) and a version
kmicinsk1 has joined #ocaml
<thelema>
versions are parsed as a [Number | not-number] list
<thelema>
getting the version of an executable is currently a broken hack, but the version of a findlib package is easy to get
<thelema>
err, the dependency has an operator as well - >, >=, or =
<thelema>
comparison is done component by component in the list.
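The scheme described above can be sketched as follows. This is not odb's actual code: the type and function names are invented, and the choices that a number sorts before a non-number and that a shorter version sorts before a longer one with the same prefix are assumptions.

```ocaml
(* A version is a list of alternating numeric and non-numeric chunks. *)
type component = Num of int | Str of string

let is_digit c = c >= '0' && c <= '9'

(* "3.12.1" -> [Num 3; Str "."; Num 12; Str "."; Num 1] *)
let parse_version s =
  let n = String.length s in
  let rec go i acc =
    if i >= n then List.rev acc
    else begin
      let digits = is_digit s.[i] in
      let j = ref i in
      while !j < n && is_digit s.[!j] = digits do incr j done;
      let chunk = String.sub s i (!j - i) in
      (* int_of_string raises on absurdly long digit runs *)
      let c = if digits then Num (int_of_string chunk) else Str chunk in
      go !j (c :: acc)
    end
  in
  go 0 []

(* Compare component by component; the first difference decides. *)
let rec compare_version a b =
  match a, b with
  | [], [] -> 0
  | [], _ -> -1
  | _, [] -> 1
  | x :: xs, y :: ys ->
    let c =
      match x, y with
      | Num m, Num n -> compare m n
      | Str s, Str t -> compare s t
      | Num _, Str _ -> -1   (* assumption: numbers sort first *)
      | Str _, Num _ -> 1
    in
    if c <> 0 then c else compare_version xs ys

(* The three dependency operators mentioned: >, >= and =. *)
type op = Gt | Ge | Eq

let satisfies op ~installed ~required =
  let c = compare_version (parse_version installed) (parse_version required) in
  match op with
  | Gt -> c > 0
  | Ge -> c >= 0
  | Eq -> c = 0
```

Comparing chunks numerically is what makes 1.10 newer than 1.9, which a plain string comparison would get wrong.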