#picolisp on 2018-12-29 — irc logs at freenode.irclog.whitequark.org

2018-09-14 18:41 ChanServ changed the topic of #picolisp to: PicoLisp language | Channel Log: https://irclog.whitequark.org/picolisp/ | Check also http://www.picolisp.com for more information

00:41 rob_w has quit [Quit: Leaving]

01:42 freemint has quit [Ping timeout: 256 seconds]

01:49 alexshendi has joined #picolisp

02:11 _whitelogger has joined #picolisp

03:03 orivej has quit [Ping timeout: 244 seconds]

04:11 _whitelogger has joined #picolisp

04:53 aw- has joined #picolisp

05:31 <razzy> not a lot of pain. looked much bigger in my head :] thx

05:41 _whitelogger has joined #picolisp

06:18 alexshendi has quit [Read error: Connection reset by peer]

06:37 _whitelogger_ has joined #picolisp

06:39 _whitelogger has quit [Ping timeout: 250 seconds]

07:00 orivej has joined #picolisp

07:38 _whitelogger has joined #picolisp

08:13 orivej has quit [Ping timeout: 240 seconds]

08:33 aw- has quit [Ping timeout: 250 seconds]

08:34 razzy has quit [Ping timeout: 246 seconds]

08:38 aw- has joined #picolisp

08:47 aw- has quit [Ping timeout: 250 seconds]

08:55 aw- has joined #picolisp

09:09 shpx has joined #picolisp

09:21 shpx has quit [Ping timeout: 246 seconds]

10:41 abel-normand has joined #picolisp

11:15 razzy has joined #picolisp

11:27 m_mans has joined #picolisp

11:27 <m_mans> Hi all!

11:28 <Regenaxer> Hi m_mans!

11:28 <Regenaxer> Long time :)

11:28 <m_mans> yeah...

11:29 <m_mans> Have you talked about 'create' function already?

11:29 <Regenaxer> Here in IRC?

11:29 <m_mans> yes

11:29 <Regenaxer> yes, we discussed it here in the beginning

11:29 <Regenaxer> tankf33der tested it too

11:30 <m_mans> I've just seen it, tried it first on my SSD, then on tmpfs disk

11:30 <Regenaxer> You could check the irc log

11:30 <Regenaxer> I found that the speed of disk hardware does not matter

11:31 <Regenaxer> ssd is not faster than magnetic disk

11:31 <m_mans> yes, seems so, I suppose bottleneck is syscalls

11:31 <Regenaxer> What helps a *little* is putting .pil/tmp/ on ssd

11:31 <Regenaxer> (symbolic link)

11:31 <m_mans> ah, so it uses temp files in .pil/tmp/?

11:31 <Regenaxer> Pil DB runs mostly cached in memory, so disk speed is not critical

11:32 <Regenaxer> But a lot of RAM is helpful

11:32 <Regenaxer> yes, it creates temp files

11:32 <Regenaxer> rather large ones

11:33 <Regenaxer> It first builds the temps, then imports

11:34 <Regenaxer> the temp files are deleted while importing, so disk space is not used too much

11:34 <m_mans> how can I use another location for temp-files?

11:34 <Regenaxer> I made a symbolic link

11:34 <m_mans> ok

11:35 <Regenaxer> lrwxrwxrwx 1 abu abu 4 Dec 15 14:29 .pil/tmp -> /ssd/

11:36 <Regenaxer> m_mans, cold already?

11:37 <m_mans> hehe, already warm. It was -35 some days ago, now -9

11:37 orivej has joined #picolisp

11:37 <Regenaxer> oh :)

11:37 <Regenaxer> Here is +/- zero

11:37 <Regenaxer> last week +10 C

11:38 <m_mans> no difference, even with ram-disk for everything, ~37 sec for 1000000 objects

11:38 <Regenaxer> ok

11:38 <m_mans> Regenaxer: oh, +10 is cool :)

11:38 <Regenaxer> 'create' gets fast if the db is larger than RAM

11:40 <Regenaxer> If the DB fits mostly into RAM, create is not really needed

11:40 <m_mans> anyway, it's fine that we have such function

11:40 <Regenaxer> yeah, some things are impossible to import without

11:42 <m_mans> sadly I'm still totally occupied with current job

11:42 <Regenaxer> :(

11:43 <Regenaxer> I'm experimenting with OpenStreetMap again

11:43 <m_mans> I hope I'll play a little with Pil on holidays

11:43 <Regenaxer> Good

11:43 <Regenaxer> Import of whole Germany takes several days

11:44 <Regenaxer> 52 GiB data

11:45 <m_mans> oooh

11:45 <m_mans> even with 'create'?

11:45 <Regenaxer> yes

11:45 <Regenaxer> The objects and indexes are fast, about a day

11:46 <Regenaxer> The problem is the +Joints between many ways and nodes

11:46 <Regenaxer> a "way" is a list of nodes, eg a street

11:46 <Regenaxer> each node is linked to one or several ways

11:46 <Regenaxer> and one or more neighbors

11:47 <m_mans> I'm still thinking about some backend for IO-operations not using syscalls, just read-write in memory

11:47 <Regenaxer> Well, the pil DB does that

11:47 <Regenaxer> only the storage is in DB, the persistence

11:48 <Regenaxer> But you need some syscalls, eg. to synchronize and other IPC

11:48 <m_mans> I mean just several big reads and writes.

11:48 <Regenaxer> Yes, that's what 'create' does

11:48 <Regenaxer> it collects 1 million objects for one 'commit'

11:49 <m_mans> but what does commit do? I suppose it does many low-level write operations

11:49 <Regenaxer> yes, but in a sorted way

11:50 <Regenaxer> so it may be quite contiguous

11:50 <Regenaxer> especially in a new, empty DB

11:50 <Regenaxer> The syscalls don't matter

11:50 <Regenaxer> again it is the RAM size

11:50 <m_mans> So, I mean to replace every low-level syscall (write to file) with just memory write operation (no syscall)

11:50 <Regenaxer> OS caches disk operations

11:51 <razzy> funny old school link "garbage collecting more than once in 12 years could be harmful" http://3e8.org/pub/scheme/doc/lisp-pointers/v1i3/p17-white.pdf

11:51 <Regenaxer> OK, but then only a single process?

11:52 <m_mans> yes, of course it fits very specific case - import for example

11:53 <Regenaxer> I think this is not the problem

11:53 <Regenaxer> The bottleneck is disk cache

11:53 <m_mans> you could just test it in asm or in pil32 to see the difference. I'm not so fast in C programming

11:53 <Regenaxer> if the accumulated areas being accessed are bigger than the disk caches, it begins to thrash

11:54 <Regenaxer> I did many tests

11:54 <Regenaxer> it is almost only disk caches

11:54 <Regenaxer> if just one index file is bigger than RAM

11:55 <Regenaxer> and accessed in *random* order, it can't be cached by the OS

11:55 <Regenaxer> and gets *very* slow

11:55 <Regenaxer> So what 'create' does is pre-sorting all data

11:55 <Regenaxer> Then it sweeps linearly, not random

11:55 <Regenaxer> so disk caches work perfectly

11:56 <Regenaxer> That's the true bottleneck

11:56 <Regenaxer> The simple example in the ref of 'create'

11:56 <Regenaxer> 10 million objects take 50 min here

11:56 <Regenaxer> The naive, random way will not finish in a week I think

11:56 <Regenaxer> (not tried)
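
Note: the effect described above can be illustrated with a toy model. The following is a minimal Python sketch, not PicoLisp and not how 'create' is actually implemented. It assumes a hypothetical LRU page cache (the `count_page_misses` helper, the 4096-byte page size and the 8-page cache are all made up for illustration) and counts how often a page must be loaded when the same file offsets are visited in random order versus sorted order.

```python
import random
from collections import OrderedDict

def count_page_misses(offsets, page_size, cache_pages):
    """Count page loads for a sequence of file offsets, using a toy LRU cache."""
    cache = OrderedDict()          # page number -> True, oldest entry first
    misses = 0
    for off in offsets:
        page = off // page_size
        if page in cache:
            cache.move_to_end(page)        # mark as most recently used
        else:
            misses += 1                    # page must be fetched from "disk"
            cache[page] = True
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict least recently used page
    return misses

rng = random.Random(42)
# Hypothetical workload: 10,000 accesses spread over a file of about 1 MB
offsets = [rng.randrange(0, 1_000_000) for _ in range(10_000)]

random_misses = count_page_misses(offsets, page_size=4096, cache_pages=8)
sorted_misses = count_page_misses(sorted(offsets), page_size=4096, cache_pages=8)

print("random order:", random_misses, "page loads")
print("sorted order:", sorted_misses, "page loads")
```

In sorted order every page is loaded exactly once, no matter how small the cache is. In random order, with a cache much smaller than the file, almost every access causes a load.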

11:59 <Regenaxer> Why do Schemers worry so much about avoiding garbage collection?

11:59 <Regenaxer> I think the opposite

12:00 <Regenaxer> Small memory is better, with fast GC

12:01 <m_mans> did you try to just write big amount of data with one write-call? Is it slow too?

12:01 <Regenaxer> In pil it just takes milliseconds, and is needed anyway to clean up also non-referred and non-dirty DB objects and trees

12:01 <Regenaxer> I think it makes no difference

12:02 <Regenaxer> one big or many small, if the bottleneck is the disk cache

12:02 <Regenaxer> really, try it

12:02 <Regenaxer> it is really dramatic

12:03 <razzy> imho GC requires operations. schemers have the idea that their system is used nonstop and every operation is very valuable.

12:03 <Regenaxer> Why?

12:03 <Regenaxer> GC takes only a very little fraction of the time

12:03 <m_mans> ok, T, I could try it by myself

12:04 <m_mans> must go, bb all

12:04 <Regenaxer> ok, see you!

12:04 m_mans has left #picolisp [#picolisp]

12:05 <Regenaxer> afp

12:08 razzy has quit [Ping timeout: 250 seconds]

12:09 razzy has joined #picolisp

12:14 <tankf33der> i can try create on fastest intel xeon cpu in february

12:14 <tankf33der> intel xeon gold 6144

12:33 <razzy> maybe you could functionally without GC. or wait for downtime

13:04 <Regenaxer> tankf33der, great!

13:04 <Regenaxer> razzy, why worry? GC runs several times per second in pil, that's the best I think

13:34 orivej has quit [Ping timeout: 250 seconds]

13:34 <razzy> Regenaxer: if you run as a process in another OS, you are probably better off with GC running often and having a small footprint.

13:35 <Regenaxer> Other OS?

13:35 <Regenaxer> Ah, you mean non-PilOS?

13:36 <Regenaxer> Well, small memory footprint is always better

13:36 <Regenaxer> CPU-cache etc.

13:36 <razzy> if you run as the OS, if you have the whole memory for yourself, it is better to use most of what you have :]

13:37 <razzy> also you need to have all algorithms adjusted to that :]

13:39 <razzy> general advice would be: smaller footprint -> better .

13:44 <Regenaxer> T

13:44 <razzy> Regenaxer: "all algorithms adjusted" is really big optimization problem :]

13:50 <Regenaxer> I have no clue what you mean

13:54 <razzy> long story, little payoff

14:02 orivej has joined #picolisp

14:07 orivej has quit [Ping timeout: 268 seconds]

14:21 abel-normand has quit [Ping timeout: 250 seconds]

15:11 m_mans has joined #picolisp

15:21 <Regenaxer> m_mans, concerning our discussion before: I think all the system calls take up together only a few seconds in a day-long import. All the time is spent inside the Linux-kernel juggling with the disk buffers

15:22 <Regenaxer> So it will not help at all to optimize simple writes

15:22 <Regenaxer> It is the fact that many places in a huge file are accessed in a random order

15:55 <razzy> pilOS is the answer?

15:56 <Regenaxer> There is no question

16:35 m_mans has quit [Quit: Leaving.]

16:39 orivej has joined #picolisp

16:44 freemint has joined #picolisp

16:45 aw- has quit [Quit: Leaving.]

16:46 <freemint> Regenaxer: How "large/redundant" is a PicoLisp DB when compared to the original file/another database file with the same data?

16:48 <Regenaxer> This depends on the number of indexes and joints you put into the model

16:49 <DKordic> freemint: [de add [Carry N] [if (= 0 Carry) N (add (>> -1 (& Carry N)) (x| Carry N))] ]
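
Note: the one-liner above adds two numbers using only AND, XOR and a left shift. As a reading aid, here is a rough Python transliteration. It is a sketch that assumes non-negative integers only; the parameter names follow the original.

```python
def add(carry, n):
    """Add two non-negative integers using only AND, XOR and a left shift."""
    if carry == 0:
        return n
    # (carry & n) << 1 : the bits that overflow into the next position
    # carry ^ n        : the sum of each bit position, ignoring overflow
    return add((carry & n) << 1, carry ^ n)

print(add(5, 7))
```

The recursion ends once no carry bits remain, which takes at most about as many steps as the operands have bits.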

16:49 <Regenaxer> The more involved the model, the bigger, but also faster or more powerful the system

16:54 <freemint> How about when implementing the same indexes and joints as a regular db could?

17:02 <freemint> also implementing https://wiki.openstreetmap.org/wiki/Osm2pgsql/benchmarks might be interesting for comparison

17:10 <freemint> https://www.arangodb.com/2018/02/nosql-performance-benchmark-2018-mongodb-postgresql-orientdb-neo4j-arangodb/

17:20 <Regenaxer> A "regular" DB has no joints

17:39 m_mans has joined #picolisp

17:53 m_mans has quit [Quit: Leaving.]

18:14 ubLIX has joined #picolisp

18:34 alexshendi has joined #picolisp

19:02 alexshendi has quit [Read error: Connection reset by peer]

19:14 <freemint> Regenaxer: but it has identifiers which you can use to build joints.

19:34 ubLIX has quit [Quit: *cackles*]

19:39 <Regenaxer> That's something different. You "join" by traversing further indexes, not direct pointers

19:40 <Regenaxer> (+List +Joint) etc

19:40 <Regenaxer> And a lot more to program for a simple selection
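
Note: the distinction drawn here can be sketched outside of PicoLisp. The following Python example is only an analogy with made-up data (`node_by_id`, `way_node_ids`, `Node`, `Way` and `link` are all hypothetical names). It is not the pil DB API. It contrasts resolving a relation through an id index with following direct two-way object references.

```python
from dataclasses import dataclass, field

# Style 1: records refer to each other by id; every hop is an index lookup.
node_by_id = {1: "A", 2: "B", 3: "C"}       # hypothetical index: id -> node name
way_node_ids = {"Main Street": [1, 2, 3]}   # hypothetical way -> list of node ids

def nodes_via_index(way_name):
    return [node_by_id[i] for i in way_node_ids[way_name]]

# Style 2: objects hold direct references to each other, in both directions.
@dataclass
class Node:
    name: str
    ways: list = field(default_factory=list)   # back-references to ways

@dataclass
class Way:
    name: str
    nodes: list = field(default_factory=list)  # forward references to nodes

def link(way, node):
    """Keep both directions in sync, as a two-way reference would."""
    way.nodes.append(node)
    node.ways.append(way)

main = Way("Main Street")
for n in (Node("A"), Node("B"), Node("C")):
    link(main, n)

def nodes_via_pointers(way):
    return [n.name for n in way.nodes]

print(nodes_via_index("Main Street"))
print(nodes_via_pointers(main))
```

Both functions return the same names. The second one gets there without consulting any index, and each node can also reach its ways directly.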

19:41 <Regenaxer> freemint, I was really surprised yesterday that you did not know about '@' results

19:41 <Regenaxer> This is one of the most central and often-used features in pil

19:43 <freemint> I do not use it that often. It is a feature I have not internalized yet because it is a little ambiguous what thing is stored in @

19:46 <freemint> (while (next) (setq N (add @ N))) why is @ not bound to N but bound to (next)?

19:46 <Regenaxer> This is explained in the ref

19:46 <Regenaxer> It has precise rules

19:47 <Regenaxer> And if you had studied some existing programs you would surely have found it

19:49 <freemint> existing means in "@/lib" or code you wrote which was not published

19:50 <Regenaxer> Rosettacode

19:50 <Regenaxer> and yes, all libs and everything else

19:52 <freemint> shame on me ;)

19:52 <Regenaxer> Most functions use it I think

19:52 <Regenaxer> "gefühlt" (it feels like) most functions ;)

19:53 <freemint> what was the "which functions use this"

19:53 <freemint> function

19:53 <Regenaxer> ?

19:53 <Regenaxer> Check any lib

19:54 <freemint> i think there was a function which checks for the existence of certain symbols/functions in all functions

19:55 <Regenaxer> 'who'

19:56 <Regenaxer> : (more (who '@) pp)

20:13 <freemint> I am not going to step through 132 functions

20:14 <freemint> oh my god, 'later is so neat

20:19 <Regenaxer> right, no sense to step through all, just for the idea

20:48 mtsd has joined #picolisp

21:45 mtsd has quit [Quit: WeeChat 1.6]