ChanServ changed the topic of #picolisp to: PicoLisp language | Channel Log: https://irclog.whitequark.org/picolisp/ | Check also http://www.picolisp.com for more information
rob_w has quit [Quit: Leaving]
freemint has quit [Ping timeout: 256 seconds]
alexshendi has joined #picolisp
_whitelogger has joined #picolisp
orivej has quit [Ping timeout: 244 seconds]
_whitelogger has joined #picolisp
aw- has joined #picolisp
<razzy> not a lot of pain. looked much bigger in my head :] thx
_whitelogger has joined #picolisp
alexshendi has quit [Read error: Connection reset by peer]
_whitelogger_ has joined #picolisp
_whitelogger has quit [Ping timeout: 250 seconds]
orivej has joined #picolisp
_whitelogger has joined #picolisp
orivej has quit [Ping timeout: 240 seconds]
aw- has quit [Ping timeout: 250 seconds]
razzy has quit [Ping timeout: 246 seconds]
aw- has joined #picolisp
aw- has quit [Ping timeout: 250 seconds]
aw- has joined #picolisp
shpx has joined #picolisp
shpx has quit [Ping timeout: 246 seconds]
abel-normand has joined #picolisp
razzy has joined #picolisp
m_mans has joined #picolisp
<m_mans> Hi all!
<Regenaxer> Hi m_mans!
<Regenaxer> Long time :)
<m_mans> yeah...
<m_mans> Have you talked about 'create' function already?
<Regenaxer> Here in IRC?
<m_mans> yes
<Regenaxer> yes, we discussed it here in the beginning
<Regenaxer> tankf33der tested it too
<m_mans> I've just seen it, tried it first on my SSD, then on tmpfs disk
<Regenaxer> You could check the irc log
<Regenaxer> I found that the speed of disk hardware does not matter
<Regenaxer> ssd is not faster than magnetic disk
<m_mans> yes, seems so, I suppose the bottleneck is syscalls
<Regenaxer> What helps a *little* is putting .pil/tmp/ on ssd
<Regenaxer> (symbolic link)
<m_mans> ah, so it uses temp files in .pil/tmp/?
<Regenaxer> Pil DB runs mostly cached in memory, so disk speed is not critical
<Regenaxer> But a lot of RAM is helpful
<Regenaxer> yes, it creates temp files
<Regenaxer> rather large ones
<Regenaxer> It first builds the temps, then imports
<Regenaxer> the temp files are deleted while importing, so disk space is not used too much
<m_mans> how can I use another location for temp-files?
<Regenaxer> I made a symbolic link
<m_mans> ok
<Regenaxer> lrwxrwxrwx 1 abu abu 4 Dec 15 14:29 .pil/tmp -> /ssd/
<Regenaxer> m_mans, cold already?
<m_mans> hehe, already warm. It was -35 some days ago, now -9
orivej has joined #picolisp
<Regenaxer> oh :)
<Regenaxer> Here is +/- zero
<Regenaxer> last week +10 C
<m_mans> no difference, even with ram-disk for everything, ~37 sec for 1000000 objects
<Regenaxer> ok
<m_mans> Regenaxer: oh, +10 is cool :)
<Regenaxer> 'create' gets fast if the db is larger than RAM
<Regenaxer> If the DB fits mostly into RAM, create is not really needed
<m_mans> anyway, it's good that we have such a function
<Regenaxer> yeah, some things are impossible to import without
<m_mans> sadly I'm still totally occupied with my current job
<Regenaxer> :(
<Regenaxer> I'm experimenting with OpenStreetMap again
<m_mans> I hope I'll play a little with Pil on the holidays
<Regenaxer> Good
<Regenaxer> Import of whole Germany takes several days
<Regenaxer> 52 GiB data
<m_mans> oooh
<m_mans> even with 'create'?
<Regenaxer> yes
<Regenaxer> The objects and indexes are fast, about a day
<Regenaxer> The problem is the +Joints between many ways and nodes
<Regenaxer> a "way" is a list of nodes, eg a street
<Regenaxer> each node is linked to one or several ways
<Regenaxer> and one or more neighbors
<m_mans> I'm still thinking about some backend for IO-operations not using syscalls, just read-write in memory
<Regenaxer> Well, the pil DB does that
<Regenaxer> only the storage is in DB, the persistence
<Regenaxer> But you need some syscalls, eg. to synchronize and other IPC
<m_mans> I mean just several big reads and writes.
<Regenaxer> Yes, that's what 'create' does
<Regenaxer> it collects 1 mio objects for one 'commit'
<m_mans> but what does commit do? I suppose it does many low-level write operations
<Regenaxer> yes, but in a sorted way
<Regenaxer> so it may be quite contiguous
<Regenaxer> especially in a new, empty DB
<Regenaxer> The syscalls don't matter
<Regenaxer> again it is the RAM size
<m_mans> So, I mean replacing every low-level syscall (write to file) with just a memory write operation (no syscall)
<Regenaxer> OS caches disk operations
<razzy> funny old school link "garbage collecting more than once in 12 years could be harmful" http://3e8.org/pub/scheme/doc/lisp-pointers/v1i3/p17-white.pdf
<Regenaxer> OK, but then only a single process?
<m_mans> yes, of course it fits only a very specific case, import for example
<Regenaxer> I think this is not the problem
<Regenaxer> The bottleneck is disk cache
<m_mans> you could just test it in asm or in pil32 to see the difference. I'm not so fast in C programming
<Regenaxer> if the accumulated areas being accessed are bigger than the disk caches, it begins to thrash
<Regenaxer> I did many tests
<Regenaxer> it is almost only disk caches
<Regenaxer> if just one index file is bigger than RAM
<Regenaxer> and accessed in *random* order, it can't be cached by the OS
<Regenaxer> and gets *very* slow
<Regenaxer> So what 'create' does is pre-sorting all data
<Regenaxer> Then it sweeps linearly, not random
<Regenaxer> so disk caches work perfectly
<Regenaxer> That's the true bottleneck
<Regenaxer> The simple example in the ref of 'create'
<Regenaxer> 10 Mio objects take 50 min here
<Regenaxer> The naive, random way will not finish in a week I think
<Regenaxer> (not tried)
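The effect described above (random access thrashes the disk cache, a pre-sorted linear sweep does not) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not how 'create' is implemented: the LRU page cache, the function `count_cache_misses`, and the page and cache sizes are hypothetical stand-ins for the OS disk cache and a large index file.

```python
import random
from collections import OrderedDict


def count_cache_misses(keys, keys_per_page=100, cache_pages=50):
    """Count page faults of a small LRU cache when keys are touched in the given order.

    Hypothetical model: each key lives on page `key // keys_per_page`,
    and the cache holds at most `cache_pages` pages.
    """
    cache = OrderedDict()  # page number -> None, ordered from least to most recently used
    misses = 0
    for key in keys:
        page = key // keys_per_page
        if page in cache:
            cache.move_to_end(page)  # hit: mark page as most recently used
        else:
            misses += 1  # miss: the page has to be fetched from "disk"
            cache[page] = None
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict the least recently used page
    return misses


random.seed(0)
# 100,000 keys spread over 10,000 pages, far more pages than the cache can hold
keys = random.sample(range(1_000_000), 100_000)

random_misses = count_cache_misses(keys)          # insertion in arrival (random) order
sorted_misses = count_cache_misses(sorted(keys))  # pre-sorted, one linear sweep

print("misses in random order:", random_misses)
print("misses in sorted order:", sorted_misses)
```

In the sorted sweep each page is loaded at most once, so the miss count cannot exceed the number of pages touched. In random order almost every access lands on a page that has already been evicted.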
<Regenaxer> Why do Schemers worry so much about avoiding garbage collection?
<Regenaxer> I think the opposite
<Regenaxer> Small memory is better, with fast GC
<m_mans> did you try to just write a big amount of data with one write call? Is it slow too?
<Regenaxer> In pil it just takes milliseconds, and is needed anyway to clean up also non-referred and non-dirty DB objects and trees
<Regenaxer> I think it makes no difference
<Regenaxer> one big or many small, if the bottleneck is the disk cache
<Regenaxer> really, try it
<Regenaxer> it is really dramatic
<razzy> imho GC requires operations. schemers have the idea that their system is used nonstop and every operation is very valuable.
<Regenaxer> Why?
<Regenaxer> GC takes only a very little fraction of the time
<m_mans> ok, T, I could try it by myself
<m_mans> must go, bb all
<Regenaxer> ok, see you!
m_mans has left #picolisp [#picolisp]
<Regenaxer> afp
razzy has quit [Ping timeout: 250 seconds]
razzy has joined #picolisp
<tankf33der> i can try 'create' on the fastest intel xeon cpu in february
<tankf33der> intel xeon gold 6144
<razzy> maybe you could function without GC, or wait for downtime
<Regenaxer> tankf33der, great!
<Regenaxer> razzy, why worry? GC runs several times per second in pil, that's the best I think
orivej has quit [Ping timeout: 250 seconds]
<razzy> Regenaxer: if you run as a process in another OS, you are probably better off with GC running often and having a small footprint.
<Regenaxer> Other OS?
<Regenaxer> Ah, you mean non-PilOS?
<Regenaxer> Well, small memory footprint is always better
<Regenaxer> CPU-cache etc.
<razzy> if you run as the OS and have the whole memory for yourself, it is better to use most of what you have :]
<razzy> you also need to have all algorithms adjusted to that :]
<razzy> general advice would be: smaller footprint -> better .
<Regenaxer> T
<razzy> Regenaxer: "all algorithms adjusted" is a really big optimization problem :]
<Regenaxer> I have no clue what you mean
<razzy> long story, little payoff
orivej has joined #picolisp
orivej has quit [Ping timeout: 268 seconds]
abel-normand has quit [Ping timeout: 250 seconds]
m_mans has joined #picolisp
<Regenaxer> m_mans, concerning our discussion before: I think all the system calls take up together only a few seconds in a day-long import. All the time is spent inside the Linux-kernel juggling with the disk buffers
<Regenaxer> So it will not help at all to optimize simple writes
<Regenaxer> It is the fact that many places in a huge file are accessed in a random order
<razzy> pilOS is the answer?
<Regenaxer> There is no question
m_mans has quit [Quit: Leaving.]
orivej has joined #picolisp
freemint has joined #picolisp
aw- has quit [Quit: Leaving.]
<freemint> Regenaxer: How "large/redundant" is a PicoLisp DB when compared to the original file/another database file with the same data?
<Regenaxer> This depends on the number of indexes and joints you put into the model
<DKordic> freemint: [de add [Carry N] [if (= 0 Carry) N (add (>> -1 (& Carry N)) (x| Carry N))] ]
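DKordic's 'add' above builds addition from bitwise operations only: AND finds the carry bits, `(>> -1 ...)` shifts them left by one, and XOR adds without carry. A minimal Python sketch of the same recursion follows, assuming non-negative integers (with negative inputs this Python version would not terminate); the parameter names follow the original.

```python
def add(carry, n):
    """Add two non-negative integers using only AND, XOR and a left shift."""
    if carry == 0:
        return n  # nothing left to carry: n is the sum
    # (carry & n) << 1 : bit positions that produce a carry, moved one place left
    # carry ^ n        : bitwise sum that ignores the carries
    return add((carry & n) << 1, carry ^ n)


print(add(5, 7))
```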
<Regenaxer> The more involved the model, the bigger, but also faster or more powerful the system
<freemint> How about when implementing the same indexes and joints as a regular db could?
<freemint> also implementing https://wiki.openstreetmap.org/wiki/Osm2pgsql/benchmarks might be interesting for comparison
<Regenaxer> A "regular" DB has no joints
m_mans has joined #picolisp
m_mans has quit [Quit: Leaving.]
ubLIX has joined #picolisp
alexshendi has joined #picolisp
alexshendi has quit [Read error: Connection reset by peer]
<freemint> Regenaxer: but it has identifiers which you can use to build joints.
ubLIX has quit [Quit: *cackles*]
<Regenaxer> That's something different. You "join" by traversing further indexes, not direct pointers
<Regenaxer> (+List +Joint) etc
<Regenaxer> And a lot more to program for a simple selection
<Regenaxer> freemint, I was really surprised yesterday that you did not know about '@' results
<Regenaxer> This is one of the most central and often-used features in pil
<freemint> I do not use it that often. It is a feature i have not internalized yet because it is a little ambiguous what thing is stored in @
<freemint> (while (next) (setq N (add @ N))) why is @ not bound to N but bound to (next)?
<Regenaxer> This is explained in the ref
<Regenaxer> It has precise rules
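As far as the PicoLisp reference describes it, flow functions with a conditional expression (such as 'while' or 'if') store the non-NIL result of that condition in '@'. That is why '@' holds the value of (next) in the loop above and not N. A rough Python analogy, offered only as a sketch: the assignment expression `:=` plays the role of '@', and the names `total` and `at` are hypothetical.

```python
def total(*args):
    """Sum the arguments, mirroring (while (next) (setq N (add @ N)))."""
    it = iter(args)
    n = 0
    # `at` is bound to the result of the loop condition, just as '@' is
    # bound to the result of (next); the loop ends when nothing is left.
    while (at := next(it, None)) is not None:
        n = n + at
    return n


print(total(1, 2, 3))
```

One difference to keep in mind: in PicoLisp the loop would also stop at an argument that is NIL, since NIL is the false value there.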
<Regenaxer> And if you had studied some existing programs you would surely have found it
<freemint> does "existing" mean in "@/lib", or code you wrote which was not published?
<Regenaxer> Rosettacode
<Regenaxer> and yes, all libs and everything else
<freemint> shame on me ;)
<Regenaxer> Most functions use it I think
<Regenaxer> "gefühlt" most functions ;)
<freemint> what was the "which functions use this" function?
<Regenaxer> ?
<Regenaxer> Check any lib
<freemint> i think there was a function which checks for the existence of certain symbols/functions in all functions
<Regenaxer> 'who'
<Regenaxer> : (more (who '@) pp)
<freemint> I am not going to step through 132 functions
<freemint> oh my god, 'later is so neat
<Regenaxer> right, no sense to step through all, just for the idea
mtsd has joined #picolisp
mtsd has quit [Quit: WeeChat 1.6]