<gildor>
(and for God's sake, don't include bundled libraries that continue to live elsewhere)
<thelema>
looks like the bits I'm interested in are GPL. maybe I can get a different license
<thelema>
gildor: would you consider this code as living?
<gildor>
e.g. camlimages, camlzip, configwin, cryptgps, facile, labgl, lablgtk
<thelema>
of course, I won't repeat that mistake.
<gildor>
thelema: if it is just the code, in extlib, just go on
<thelema>
wow, specialized code for float arrays...
<thelema>
It looks like I need to make a benchmark for reading files - I can't figure out why I keep seeing [filename -> string] code that uses a resizing buffer to store chunks of data
<thelema>
plus, what's the optimal buffer size for ext3? Wouldn't it be 4K?
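For reference, the [filename -> string] pattern being questioned usually looks something like the sketch below. This is not taken from any of the libraries mentioned; `read_file_chunked` and its `chunk_size` parameter are hypothetical names, and it assumes a recent OCaml (4.08 or later, for `Fun.protect`).

```ocaml
(* Read a whole file by appending fixed-size chunks to a growing Buffer.
   [chunk_size] is the knob under discussion (1K, 4K, 16K, ...).
   Hypothetical helper, for illustration only. *)
let read_file_chunked ?(chunk_size = 4096) filename =
  let ic = open_in_bin filename in
  Fun.protect
    ~finally:(fun () -> close_in_noerr ic)
    (fun () ->
      let chunk = Bytes.create chunk_size in
      let buf = Buffer.create chunk_size in
      let rec loop () =
        (* [input] returns 0 only at end of file. *)
        let n = input ic chunk 0 chunk_size in
        if n > 0 then begin
          Buffer.add_subbytes buf chunk 0 n;
          loop ()
        end
      in
      loop ();
      Buffer.contents buf)
```

The appeal of this shape is that it works on pipes and other inputs whose length is not known up front; the cost is the extra copying as the buffer grows.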
mnabil has quit [Read error: Operation timed out]
<gildor>
I don't think buffer size is a big issue
<gildor>
thelema: ^^^
<thelema>
There's only one way to tell
<gildor>
thelema: I have tried various buffer sizes (direct file access) and didn't get noticeably better throughput
<thelema>
using Science!
<thelema>
true, it's probably not a huge deal, as the CPU load is much smaller than the drive access time
<gildor>
and 4k buffer can give you almost 130MB/s on a RAID array
<thelema>
sure. but I keep seeing 1K buffers
<thelema>
and bitstring uses 16K buffers
<gildor>
give us your benchmark results when you are done
<thelema>
of course
<thelema>
once I finish my more urgent task for the day
<mrvn>
Non threaded, lacking the enter/leave_blocking_section() around the pread.
<thelema>
no threads in my test
<mrvn>
I assumed
<thelema>
#includes?
<thelema>
well, it turns out that a lot of time *is* saved by checking the size of the file and reading exactly that much. what a surprise. http://pastebin.com/T6NG3ncn
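A minimal sketch of the "check the size and read exactly that much" approach, assuming a regular file and a recent OCaml. This is not the pastebin code; `read_file_exact` is a hypothetical name.

```ocaml
(* Read a whole regular file in one go: ask the channel for its length,
   then read exactly that many bytes into a single string.
   Hypothetical helper; it does not work on pipes or other inputs
   where [in_channel_length] is meaningless. *)
let read_file_exact filename =
  let ic = open_in_bin filename in
  Fun.protect
    ~finally:(fun () -> close_in_noerr ic)
    (fun () ->
      let len = in_channel_length ic in
      really_input_string ic len)
```

Only one string of the final size is allocated, so there is no resizing and no intermediate chunk to copy out of.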
<thelema>
mrvn: and what type is `buffer`?
joewilliams is now known as joewilliams_away
bitbckt_ has joined #ocaml
mnabil_ has joined #ocaml
eye-scuzzy has joined #ocaml
<mrvn>
I should polish the bigarray io functions for inclusion in batteries
sepp2k1 has quit [Ping timeout: 255 seconds]
_2x2l has quit [Ping timeout: 255 seconds]
mnabil has quit [Ping timeout: 265 seconds]
bitbckt has quit [Ping timeout: 265 seconds]
_2x2l has joined #ocaml
<thelema>
mrvn: sounds good to me
<mrvn>
thelema: Bigarray. any kind should do. I use uint8 with c layout.
ikaros has quit [Read error: Connection timed out]
<thelema>
mrvn: your pread completes almost as fast as mmap
<gildor>
ah yes, so indeed you will stress Gc and swap
<gildor>
my use case was more io -> process -> io
<gildor>
with little data in memory
ikaros has joined #ocaml
<thelema>
I could easily move to that model, but I worry about doing benchmarks on that and sensitivity to disk activity
<brendan>
if you want to test the ocaml-level overhead, it's probably better if the file is in the buffer cache. reading from disk will dominate everything else (though there may be differences in prefetching for read vs mmap)
<thelema>
i.e. me being able to browse the web while the test runs
<thelema>
(this is for my real test, for which I can't share as much code)
<gildor>
my test procedure was: run 3 times, take the best time
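The "run 3 times, take the best time" procedure could be sketched roughly as follows. `best_of` and its `runs` and `clock` parameters are hypothetical names. The default clock here is `Sys.time`, which measures CPU time only; for an I/O benchmark a wall-clock source such as `Unix.gettimeofday` (from the unix library) is probably what you want to pass instead.

```ocaml
(* Run [f] several times and return the shortest elapsed time.
   [clock] returns a timestamp in seconds. Hypothetical helper. *)
let best_of ?(runs = 3) ?(clock = Sys.time) f =
  if runs < 1 then invalid_arg "best_of: runs must be at least 1";
  let time_once () =
    let t0 = clock () in
    ignore (f ());
    clock () -. t0
  in
  let rec go best remaining =
    if remaining <= 0 then best
    else go (Float.min best (time_once ())) (remaining - 1)
  in
  go (time_once ()) (runs - 1)
```

Taking the minimum rather than the mean is one way to discount runs that were disturbed by other activity on the machine.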
<thelema>
well, reading the same file a gazillion times I think guarantees the file is in the buffer cache
<gildor>
and I was not able to browse the web while doing this
<gildor>
thelema: no, because I took a file bigger than memory
<thelema>
yup
<brendan>
twice will do, but only if you don't have something else evict it in between :)
* gildor
gtg
<thelema>
well, I want to compare my network protocol parser with another
ymasory has joined #ocaml
<thelema>
but I have huge pcap files, and thought that pre-reading them as a string would eliminate the disk access penalty
<thelema>
I'd be happy to mmap, but bitstring requires a string.
ftrvxmtrx has joined #ocaml
<thelema>
anyway, gotta go, back in 4 hours
<brendan>
might be better to let the kernel prefetcher take care of the read pipeline
<brendan>
you get lower latency to start and shouldn't be much slower overall
<thelema>
I approve of the lower latency, except for the disk access problem
<brendan>
well that's what the prefetcher is supposed to hide :)
<flux>
it would be extra nice if bitstring supported streaming in
<flux>
but, it would break the interface
<flux>
thus such a system would perhaps better be called bitstring2
rup has joined #ocaml
Yoric has joined #ocaml
<mrvn>
thelema: So all the delay is caused by ocaml copying buffers around and the GC cleaning up then.
<mrvn>
thelema: mmap should be zero copy as the data is never accessed while pread copies it once from kernel to user space.
<mrvn>
Unix.read copies twice and in_channel probably too.
ftrvxmtrx has quit [Ping timeout: 276 seconds]
<flux>
hmm..
<flux>
I wonder if mmapped ocaml strings would be feasible..
<flux>
basically it would work like:
<flux>
1) mmap a region
<flux>
2) mmap zero before it
<flux>
3) stick the mmapped region size into it
<flux>
4) return as ocaml string!
<flux>
but it might not work great with ocaml gc?
ygrek has joined #ocaml
ftrvxmtrx has joined #ocaml
<brendan>
is that Bigstring.map_file function doing something like that?
Associat0r has joined #ocaml
<flux>
I don't recall seeing that
<flux>
perhaps it has arrived at a later version than what I've looked at/used, or it simply reads the file before processing it
<brendan>
it looks like it to me, but I have never learned Obj.magic etc
<flux>
heh, it appears to do exactly that except in a more researched, ie. more working fashion :)
<flux>
ah, biGstring
<flux>
I read biTstring
<brendan>
yeah, bigstring looks pretty nice
<mrvn>
flux: that would have quite an overhead. A full page.
<mrvn>
flux: and you would have to make it a custom block with finalizer and then pretend it actually is a string. Not sure the two structures are compatible.
<flux>
mrvn, a full page for the integer? well, I would assume it would still be for one, say, pcap file, which would be potentially megabytes
<mrvn>
flux: a page for the GC header before the mmapped data.
<mrvn>
For strings the GC header contains the size. Are those bits free in a custom block with finalizer?
<mrvn>
Never mind. they must be free as they are the size of the block.
<mrvn>
But strings also have a length modifier at the end saying how much of the last word is string iirc. How would you get that after the mmapped data? Add another page and only allow 4k aligned string size?
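The arithmetic behind that trailing length byte, as a sketch. It assumes the usual runtime layout for strings: the block is padded up to a whole number of words with at least one padding byte, and the very last byte holds the number of padding bytes minus one. `string_layout` and `string_length_of_layout` are hypothetical names.

```ocaml
(* Layout of an OCaml string block holding [len] bytes on a machine whose
   word is [word_bytes] bytes. Returns (size in words, value stored in the
   final byte of the block). Assumes the standard string representation. *)
let string_layout ~word_bytes len =
  let words = (len + word_bytes) / word_bytes in
  let last_byte = (words * word_bytes) - 1 - len in
  (words, last_byte)

(* Inverse: recover the string length from the block size and final byte. *)
let string_length_of_layout ~word_bytes (words, last_byte) =
  (words * word_bytes) - 1 - last_byte
```

With 4-byte words, `string_layout ~word_bytes:4 4096` gives `(1025, 3)`: a page-sized string needs one more word after the mapped page, which is the snag being described.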
<mrvn>
Another snag would be that the toplevel wouldn't print the string. But I guess for strings >4k that is a good thing.
<flux>
forgot about that
<flux>
but you can map with private
<flux>
and modify the block
<flux>
which gets copied-of-write
<flux>
s/of/on/
<mrvn>
then the file needs to be 4 bytes longer than what you map or the behaviour is undefined. But you can mmap /dev/zero for the last 4 bytes and copy the last chunk into it plus the 4 bytes.
<mrvn>
I think for those cases a char bigarray is probably better suited. You can create sub arrays from it when you parse the data into more useful chunks and such.
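A small sketch of the sub-array idea, assuming a char bigarray with C layout and a recent OCaml where `Bigarray` is part of the standard library. The names `bytes_view`, `split_at`, `of_string` and `to_string` are hypothetical helpers, not part of any library discussed here.

```ocaml
type bytes_view =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Split a view into (first n bytes, rest). Both results share storage
   with [data]; nothing is copied. *)
let split_at (data : bytes_view) n : bytes_view * bytes_view =
  let total = Bigarray.Array1.dim data in
  (Bigarray.Array1.sub data 0 n, Bigarray.Array1.sub data n (total - n))

(* Copy a string into a fresh bigarray (only for building test input). *)
let of_string s : bytes_view =
  let a =
    Bigarray.Array1.create Bigarray.char Bigarray.c_layout (String.length s)
  in
  String.iteri (fun i c -> Bigarray.Array1.set a i c) s;
  a

(* Copy a view out into an ordinary string. *)
let to_string (v : bytes_view) =
  String.init (Bigarray.Array1.dim v) (fun i -> Bigarray.Array1.get v i)
```

Because `Bigarray.Array1.sub` returns a window onto the same memory, a parser can hand out header and payload slices of a large mapped capture without copying; the copy only happens if something downstream insists on a real string.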
<flux>
well, even bigarrays are unsuitable for bitstring as of now ;)
<flux>
maybe.. a functorized version of bitstring!
<brendan>
but does bigstring work with bitstring?
<flux>
I seriously doubt that
<flux>
but, I need to suspend my laptop, not much power left :)
<mrvn>
what does string have that bigarray doesn't?
<brendan>
bigstring? it looks like it provides a mmapped file as a string, but I am not good enough at ocaml to tell whether it would work with bitstring
<mrvn>
no, I mean string itself. Why does string work for bitstring but not bigarray?
tony_ has quit [Quit: Ex-Chat]
arubin has quit [Ping timeout: 240 seconds]
smerz has joined #ocaml
arubin has joined #ocaml
ftrvxmtrx has quit [Ping timeout: 255 seconds]
ftrvxmtrx has joined #ocaml
tony_ has joined #ocaml
eye-scuzzy has quit [Quit: leaving]
eye-scuzzy has joined #ocaml
agarwal1975 has joined #ocaml
thieusoai has joined #ocaml
agarwal1975 has quit [Ping timeout: 276 seconds]
agarwal1975 has joined #ocaml
edwin has quit [Remote host closed the connection]
ccasin has quit [Quit: Leaving]
<thelema>
mrvn: well, pread is super fast, if that's the case...
thieusoai has quit [Remote host closed the connection]
<thelema>
mrvn: and the str_only code allocates the right size string and really_inputs into it - there can't be that much overhead in that copying that it would take 100* more time
<mrvn>
thelema: If it is thread safe then it has to allocate a temporary buffer in the C stub, enter_blocking_section(), read into that, leave_blocking_section(), memcpy into the string.