#picolisp on 2018-02-28 — irc logs at freenode.irclog.whitequark.org

2017-10-09 12:10 ChanServ changed the topic of #picolisp to: PicoLisp language | Channel Log: https://irclog.whitequark.org/picolisp/ | Picolisp latest found at http://www.software-lab.de/down.html | check also http://www.picolisp.com for more information

00:09 freemint has quit [Ping timeout: 265 seconds]

00:56 aw- has joined #picolisp

02:53 orivej has quit [Ping timeout: 256 seconds]

07:02 orivej has joined #picolisp

07:20 freemint has joined #picolisp

08:06 mtsd has joined #picolisp

08:06 orivej has quit [Ping timeout: 256 seconds]

08:12 <freemint> Hello

08:12 <cess11_> Good morning.

08:13 <mtsd> Good morning

08:14 <beneroth> Good morning picolispers :)

08:16 <Regenaxer> Good morning freemint, cess11_, mtsd, beneroth :)

08:16 <yumaikas> and one rogue lurking Erlanger

08:16 <freemint> Good morning *

08:16 <yumaikas> morning

08:17 <Regenaxer> Hi yumaikas

08:17 <yunfan> afternoon :D

08:17 <freemint> Regenaxer, i decided i am building a mail archive

08:17 <cess11_> freemint: What kind of audience is it you will have at your presentation?

08:18 <Regenaxer> freemint, cool

08:18 <freemint> cess11_, fellow hackers who do not know about Picolisp

08:18 <Regenaxer> I used hypermail for that recently

08:18 <freemint> Regenaxer, that means i will ask you a lot how to layout the database

08:19 <Regenaxer> ok, good

08:20 <freemint> I will first do an er.l write-up, and you might critique it. Actually i have no idea how big our mail archives are but i would guess less than 100 Gigs

08:20 <freemint> I will import the .mbox format

08:20 <Regenaxer> Yes, I also always start with er.l

08:21 <yumaikas> Regenaxer: er.l?

08:21 <freemint> entity-relationship

08:21 <Regenaxer> yumaikas, it is a filename aka standard for the DB model

08:22 <yumaikas> Ah

08:22 <freemint> Regenaxer, provided a model application where he named the file containing the classes of objects in the database er.l

08:23 <Regenaxer> in fact I use the name "er.l" in all projects

08:23 <Regenaxer> (unless the class definitions are just inline in a single file)

08:25 <freemint> is there a german word for er?

08:25 <Regenaxer> good question

08:25 <Regenaxer> Just "Datenmodell"?

08:26 <freemint> "Ding-Beziehungsmodell"?

08:26 <Regenaxer> hehe

08:26 <Regenaxer> "Entitäten"

08:26 <Regenaxer> Alle meine "Entitäten" schwimmen ...

08:26 <freemint> haha

08:27 alexshendi_ has quit [Ping timeout: 256 seconds]

08:28 <freemint> (to translate for all: a pun on the german word for duck, "Ente", which sounds similar to the "Enti" in "Entitäten": all my entities are swimming)

08:30 <mtsd> :)

08:41 <freemint> Regenaxer, What do i need to know about access to bags ...

08:43 <Regenaxer> moment, on the phone

08:44 <freemint> *Swaps

08:44 <freemint> fine

08:52 orivej has joined #picolisp

09:14 <freemint> afk

09:22 freemint has quit [Ping timeout: 252 seconds]

09:39 alexshendi_ has joined #picolisp

09:47 freemint has joined #picolisp

09:48 <freemint> back

10:15 <cess11_> mbox is horrible but you'll be able to index senders and some contents fairly easily at least.

10:18 <freemint> cess11_, I see that

10:21 <freemint> (push '*questionqueue "Regenaxer , is there a way to get information how far you are into a file, to document problems when parsing?")

10:21 <cess11_> I think pil is at its most impressive when handled interactively. You could show a session where you build a DB from a fresh pil + and the mbox.

10:24 alexshendi_ has quit [Read error: Connection reset by peer]

10:24 <cess11_> There are several ways. I've on occasion used globals to keep track of 'line number' as in position in a '(+List +String), one can also use 'match, 'from or 'member to reach back into data from an offending part, and there's a kind of book keeping in the POSIX file handling that allows 'line to sort of keep track of where it is.
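
A minimal sketch of the line-counter idea cess11_ describes, assuming an mbox-style file in which every message starts with a line beginning with "From ". The file name sample.mbox and the function fromLines are made up for illustration; they are not part of the pil distribution. Reading with 'line' keeps only one line in memory at a time, so the counter also works when the file is bigger than RAM.

```picolisp
# Create a tiny sample file (hypothetical content)
(out "sample.mbox"
   (prinl "From alice@example.org Wed Feb 28 08:12:00 2018")
   (prinl "Subject: one")
   (prinl)
   (prinl "body")
   (prinl "From bob@example.org Wed Feb 28 08:13:00 2018")
   (prinl "Subject: two") )

# Hypothetical helper: return the line numbers of all "From " separator lines
(de fromLines (File)
   (make
      (in File
         (let N 0
            (until (eof)
               (inc 'N)                        # Current line number
               (when (pre? "From " (line T))   # Line read as a string
                  (link N) ) ) ) ) ) )

(fromLines "sample.mbox")  # -> (1 5)
```

The same counter N could be printed in an error message when a parser rule fails, to document where in the file the problem occurred.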

10:25 <freemint> (in "mbox" (while (there_is_text)(read-mail))) and when read-mail encounters problems it gives me an eval where i can call functions like add-rule to modify the parser?

10:25 <freemint> cess11_, the file might be bigger than RAM

10:26 <cess11_> More reason to import all of it into the DB to begin with.

10:26 <freemint> i could subdivide it ofcourse ... to insert it

10:26 <freemint> T

10:28 <cess11_> I would probably do '(make (in Mbox (until (eof)(link (line T] and put that into an object, then examine parts of that data to see ways to exfiltrate the good parts that should be copied into new objects.

10:28 <cess11_> Perhaps do it in chunks if resources are really tight, can easily be done with 'do or 'co.

10:31 <cess11_> I would not add any indexing on that field though, it could cause a DoS due to lack of resources if the import is big. Just pulling in a list of strings is easier on the machine and then you write your own little parsers and pattern matching routines that are simple and efficient.

10:32 <cess11_> Sometimes it is faster to 'chop large data sets in bigger chunks than a line at a time. You will have to fiddle around a bit to see what works best in your environment.

10:33 <freemint> keeping a line list is a good idea

10:33 <freemint> all in one go parsing is too ambitious i think

10:42 <Regenaxer> Sorry, was on phone until now

10:42 <Regenaxer> BTW, mailbox parsing is done in the mailing list handler

10:43 <Regenaxer> misc/mailing

10:43 <Regenaxer> Is in the pil distro iirc

11:04 <beneroth> Regenaxer, about binary io: (pr) and (rd) use PLIO, (wr) is raw bytes, right?

11:04 <beneroth> ah I see now

11:05 <beneroth> (rd) with argument reads raw bytes

11:05 <beneroth> nevermind

11:05 <Regenaxer> yess :)

11:07 * beneroth is implementing DIME, an outdated deprecated binary format comparable to MIME...

11:08 <Regenaxer> Where is that needed?

11:10 <beneroth> SOAP with attachments, when the server is horribly legacy stuff

11:12 <Regenaxer> oh

11:18 <freemint> Regenaxer, I thik

11:18 <freemint> *I think your code is not exactly what i am looking for but i will take it as reference

11:26 <beneroth> Input: picolisp number. Output: raw 16 bit integer. how: (let Num 257 (out "test" (wr (/ Num 256) (% Num 256))) (in "test" (rd 2))) - correct? better way?

11:27 <Regenaxer> Perhaps with >> and &

11:28 <Regenaxer> slightly faster than mul/div

11:29 <beneroth> aye, I see

11:29 <Regenaxer> If the number is not more than 16 bits, & can be omitted

11:29 <Regenaxer> all positive numbers?

11:30 <beneroth> yeah all unsigned I think

11:30 <beneroth> positive only

11:37 <freemint> like that (wr (>> 8 (& (* 255 256) NUM)))(wr (& 255 NUM)) ?

11:38 <freemint> with `(* 255 256) expanded

11:38 <Regenaxer> yes, but the & is not needed then

11:39 <beneroth> I've got now (wr (>> 8 Num) (& Num 255))

11:39 <Regenaxer> yes

11:39 <freemint> oh

11:39 <beneroth> thanks!

11:39 <Regenaxer> if the input is less than 65536

11:39 <beneroth> naturally
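
A small sketch of the variant beneroth settled on, wrapped in a helper so it can be tried from the REPL. The name wr16 and the file name test16.bin are made up; the sketch assumes the number is non-negative and less than 65536, as discussed above.

```picolisp
# Hypothetical helper: write Num (0 .. 65535) as two raw bytes, high byte first
(de wr16 (Num)
   (wr (>> 8 Num) (& Num 255)) )

(out "test16.bin" (wr16 257))   # File now holds the two bytes 01 01
(in "test16.bin" (rd 2))        # Reads the two raw bytes back as one number -> 257
```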

11:40 <beneroth> it's actually a kind of content length, an identifier string

11:40 <Regenaxer> I see

11:40 <freemint> If it is content length i would go for & honestly

11:41 <Regenaxer> Cause it may be bigger?

11:41 <freemint> Cause you parse untrusted content

11:41 <beneroth> no I write it

11:41 <beneroth> :)

11:41 <beneroth> time for lunch. away for a while - thank you guys :)

11:42 <Regenaxer> :)

11:43 <freemint> Regenaxer 'wr has a built in %

11:44 <freemint> but it is 255

11:44 <Regenaxer> Ah, right!

11:44 <freemint> (wr 33333333)

11:44 <freemint> U-> 33333333

11:44 <freemint> ! (% 33333333 255)

11:44 <freemint> ! (char "U")

11:44 <freemint> -> 85

11:44 <freemint> -> 243

11:44 <freemint> ! (% 33333333 256)

11:44 <freemint> -> 85

11:44 <freemint> *256

11:44 <freemint> Why is it %256?

11:46 <Regenaxer> yes, 'wr' uses ld (X) B # Store byte

11:46 <Regenaxer> so upper bits are ignored anyway

11:46 <Regenaxer> ie (& 255)

11:46 <freemint> it is not

11:46 <freemint> look at my test case

11:47 <freemint> it is %256

11:47 <freemint> ahh

11:47 <freemint> forget that

11:47 <freemint> %256 = %255

11:47 <freemint> *&255

11:47 <Regenaxer> : (out "a" (wr 33333333))

11:47 <Regenaxer> -> 33333333

11:47 <Regenaxer> : (hd "a")

11:47 <Regenaxer> 00000000 55

11:48 <Regenaxer> : (hex (char "U"))

11:48 <Regenaxer> -> "55"

11:48 <freemint> you are right i confused mod and and

11:48 <freemint> %256 is the same as &255

11:48 <Regenaxer> yes
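
The equivalence can be checked directly in the REPL. This sketch only covers non-negative numbers, using the value from the test case above.

```picolisp
# Remainder modulo 256, bit mask with 255, and the byte that 'wr' produced
(let N 33333333
   (list (% N 256) (& N 255) (char "U")) )
# -> (85 85 85)
```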

11:50 <freemint> Regenaxer, Can you recommend the use 'match on char lists of lines?

11:50 <Regenaxer> yes, fine

11:52 <freemint> Would you write a parser in mostly match?

11:52 <freemint> Is there a command to get all set patterns

11:53 <Regenaxer> Most frequently I use from/till

11:53 <Regenaxer> is also the fastest

11:54 <freemint> I want to build it really robust ... i can imagine that it is the fastest.

11:56 <freemint> my reasoning is simply that it would make for a cool demonstration of picolisp: when it fails to parse an email which has some weird Date, i get thrown into a shell, and i (add-pattern ...) (test-pattern) (commit-pattern)

11:56 <cess11_> The 'match patterns are lists you design and keep track of.

11:58 <freemint> cess11_, i know that they are often used that way ... but i can imagine making my code easier it there would be a get all matched patterns

11:58 <freemint> *if

11:59 <cess11_> 'mapcar and a 'match lambda. Result is a list of matches.

12:01 <freemint> : (setq @MAIL "jobsch")

12:01 <freemint> : (fill '(@MAIL .))

12:01 <freemint> -> ("jobsch")

12:01 <freemint> : (fill '(@MAIL))

12:01 <freemint> -> "jobsch"

12:01 <freemint> Segmentation fault

12:01 <cess11_> Yeah, that's a memory leak.

12:01 <freemint> 'match yields true

12:03 <cess11_> Right, forgot. 'make is better.

12:03 <cess11_> Rather mixed up use cases.

12:04 <Regenaxer> 'fill' does not keep track of circular lists

12:04 <freemint> I've noticed

12:04 <freemint> Should that be documented?

12:05 <freemint> how do you use 'make to do what match does cess11_ ?

12:05 <Regenaxer> Almost *all* functions don't handle them, as the check is expensive

12:05 <Regenaxer> only a handful does

12:05 <Regenaxer> print, length, size iirc

12:06 <cess11_> '(make (mapcar '((X)(and (match '(@A " " @B) (chop X))(link (pack @A " " @B))) L]

12:06 <freemint> your decision if it ends up in the docs

12:07 <Regenaxer> cess11_: Why 'make' with 'mapcar'? Build two lists?

12:07 <Regenaxer> freemint, document in *every* function?

12:08 aw- has quit [Quit: Leaving.]

12:08 <Regenaxer> A circular list is simply a huge (infinite) list :)

12:08 <cess11_> Ah, the lambda could return from 'pack instead of 'match or somesuch.

12:09 <Regenaxer> or use mapc

12:09 <Regenaxer> if the value is not needed

12:09 <Regenaxer> to avoid building a garbage list

12:11 <cess11_> T

12:11 <freemint> i can not get your code working

12:12 <freemint> cess11_,

12:14 <freemint> The difference between 'do and 'mapc is that it iterates over multiple lists?

12:14 <cess11_> It works as is. You need a list 'L, that's it, though pretty ugly.

12:15 <freemint> What should L look like

12:15 <freemint> List string, list of chars?

12:15 <freemint> (list of lists of?

12:17 <cess11_> Doesn't matter. Char list isn't affected by 'chop. You'll need a better pattern for 'match though, that one assumes it has several chars and a '" ".

12:18 <freemint> ok could you give me an example?

12:19 <cess11_> '(setq L '("abc" "def" "ghi"))

12:20 <freemint> yields nil

12:21 <cess11_> '(mapcar '((X)(and (match '(@A "e" @B) (chop X))(let R (pack @B) R))) L]

12:21 <cess11_> Yes, it gets no hits.
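
A sketch of one way to collect only the hits: it swaps 'mapcar' for 'extract', which drops NIL results, and binds the pattern variables locally with 'use'. The list L is the sample from above; the function name afterE is made up for illustration.

```picolisp
(setq L '("abc" "def" "ghi"))

# Hypothetical helper: for every string containing an "e",
# return the part that follows it
(de afterE (Lst)
   (extract
      '((X)
         (use (@A @B)                          # Keep pattern variables local
            (and
               (match '(@A "e" @B) (chop X))   # Match on the list of characters
               (pack @B) ) ) )                 # Return the tail as a string
      Lst ) )

(afterE L)  # -> ("f")
```

Only "def" contains an "e", so the other two strings yield NIL and are left out of the result.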

12:47 <freemint> afl (away from lenovo)

13:13 jibanes has quit [Ping timeout: 245 seconds]

13:16 jibanes has joined #picolisp

13:32 <Regenaxer> hehe

13:36 mtsd has quit [Ping timeout: 240 seconds]

13:56 <beneroth> back

13:57 <beneroth> freemint, I just looked at implementation of (wr) in @src64/io.l: shr A 4 # Normalize - I guess shr is right shift

13:58 <freemint> It drops the flag of the cell

13:58 <Regenaxer> The point is that in the end register B is stored

13:58 <Regenaxer> an implicit & FF

13:58 <beneroth> ah

13:58 <freemint> (including the sign)

13:58 orivej has quit [Ping timeout: 240 seconds]

13:59 <Regenaxer> putStdoutB -> ld (X) B # Store byte

14:00 orivej has joined #picolisp

14:01 <freemint> Regenaxer, have you ever considered pre-rendering (in to a file) certain web pages for speed?

14:02 <freemint> Regenaxer, You own the PicoLisp Youtube channel, don't you?

14:02 <Regenaxer> You mean static pages?

14:03 <Regenaxer> No, I have no youtube channel

14:04 <beneroth> freemint, I plan to do that (pre-rendering - in the past such a thing was just called 'caching)

14:04 <beneroth> the problem is to accurately track all changes to know when to invalidate the cache :)

14:05 <beneroth> only worth to optimize if you really need it, arguably

14:05 <Regenaxer> T

14:07 <freemint> T i just wanted to know whether there is a solution since i would be interested how the cache invalidation is handled

14:07 <beneroth> for web apps with up to a few 100 (< ca. 500) concurrent users, Regenaxers architecture (with form.l) is optimal I believe

14:08 <freemint> There is a PicoLisp youtube channel showing off Penti ... i think it is even linked in the wiki...

14:08 <beneroth> freemint, no off-the-shelf solution. also because it is highly app-specific. arguably the form.l architecture by Regenaxer does kinda do caching in RAM :)

14:09 <Regenaxer> Ah, a very short video showing Penti "Hello world"

14:09 orivej has quit [Ping timeout: 245 seconds]

14:09 <freemint> Yes

14:09 <freemint> Does somebody know whose youtube channel it is?

14:09 orivej has joined #picolisp

14:09 <beneroth> afaik picolisp.com, software-lab.de, the mailing list, and his personal twitter and google play accounts are managed by Regenaxer. everything else (including this IRC channel) is managed by others.

14:09 <Regenaxer> I made it very early with the first Penti version on Android

14:10 <Regenaxer> beneroth, right

14:10 <freemint> I see. Would you be interested in my presentation being recorded to be uploaded there?

14:11 * beneroth would watch it

14:11 <Regenaxer> Yes, sure!

14:11 <Regenaxer> Though it is not my channel

14:11 <beneroth> maybe a picolisp playlist would be the right thing, to also include the pilOS (pisces) vid and the froscon talk by Regenaxer, etc

14:11 <Regenaxer> I don't remember, I think my daughter uploaded the vid

14:12 <Regenaxer> The froscon talk is a bit useless, as there is nothing to see

14:12 <freemint> Do you still have the slides?

14:12 <beneroth> picolisp resources on the internet are set up a bit chaotically, though optimized for minimal bureaucracy xD

14:13 <Regenaxer> T

14:13 <beneroth> yeah it would be more useful with the slides as pdf or such...

14:13 <Regenaxer> freemint, I think there were no slides, I just presented a session on my netbook

14:13 <beneroth> ah

14:14 <freemint> then reconstructing roughly what you did would be possible ...

14:15 <freemint> the FeM (my local hacker space) does/did video for many chaos communication congresses

14:15 <Regenaxer> Let me check

14:16 <freemint> so i would know what people to ask to do the editing.

14:16 <Regenaxer> Found something

14:17 <Regenaxer> I made it with mgp

14:17 <Regenaxer> -> pdf

14:17 <Regenaxer> moment

14:17 <Regenaxer> software-lab.de/quasiconf.pdf

14:18 <Regenaxer> software-lab.de/quasiconfAug12.mgp is the source

14:19 <Regenaxer> So it is just an outline, a few notes

14:21 <freemint> mhh if i had all the content i might be able to get that into a modified CCC template or something

14:21 <Regenaxer> all content?

14:22 <freemint> all the content that should appear on the screen

14:22 <freemint> (like what you show in the shell ....

14:22 <Regenaxer> Yes, that's lost

14:22 <Regenaxer> was on the fly

14:22 <freemint> (what a browser would see ...

14:23 <freemint> but i guess it can be reconstructed well enough

14:24 <freemint> if i had that i could rerecord the screen and have some one put the video together

14:24 <freemint> just an offer

14:25 <freemint> is picolisp a trademarked name?

14:25 <Regenaxer> I don't think so

14:25 <Regenaxer> I hope it is not ;)

14:26 <Regenaxer> There is no "trade" involved I would say

14:27 <freemint> I was more asking if you did trademark it once

14:27 <Regenaxer> nada

14:29 <freemint> ok back to reading stuff and thinking about match, so i can think what data i can gain and what er.l works best for that

14:35 <Regenaxer> good

14:36 <freemint> is there a better way to patterns and conditions the to check the patterns after match for condition (being a month for example)

14:37 <freemint> *is there a better way to handle patterns and conditions than to check the patterns after match for condition (being a month for example)

14:38 <Regenaxer> hmm, depends on the data

14:38 <Regenaxer> I always start with direct parsing with 'from', 'till', 'peek', 'char', 'skip' etc.

14:39 <Regenaxer> 'head', 'match' etc. are useful only for line-structured data

14:39 <Regenaxer> eg HTTP headers

14:39 <Regenaxer> the body has no lines, so a stream parser with 'from' et. al. is better

14:40 <freemint> (or mail headers

14:40 <Regenaxer> right
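
A minimal sketch of the 'from' style of parsing applied to a header field. The file hdr.txt and the function header are made up for illustration. Note that 'from' scans the whole stream, so this naive version would also find the key inside another field name or inside the body; a real parser would need to stop at the empty line that ends the headers.

```picolisp
# Create a tiny sample message (hypothetical content)
(out "hdr.txt"
   (prinl "From: alice@example.org")
   (prinl "Subject: Hello world")
   (prinl)
   (prinl "body text") )

# Hypothetical helper: skip input up to "Key: " and return the rest of that line
(de header (File Key)
   (in File
      (when (from (pack Key ": "))
         (line T) ) ) )

(header "hdr.txt" "Subject")  # -> "Hello world"
(header "hdr.txt" "From")     # -> "alice@example.org"
```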

14:40 <Regenaxer> For some purposes 'echo' is optimal

14:40 <Regenaxer> when you want to replace patterns in a stream

14:42 <Regenaxer> eg, consider

14:42 <Regenaxer> (in "@lib/socialshareprivacy/jquery.socialshareprivacy"

14:42 <Regenaxer> (while (echo "<BASE>" "<FBTXT>" "<TWTXT>" "<G+TXT>" "<HELP>" "<PERMA>")

14:42 <Regenaxer> (casq @

14:42 <Regenaxer> ("<BASE>" (prin (baseHRef) "@lib"))

14:42 <Regenaxer> ("<TWTXT>" (prin ,"2 clicks for more data privacy: only after you click

14:42 <Regenaxer> ("<FBTXT>" (prin ,"2 clicks for more data privacy: only after you click

14:42 <Regenaxer>

14:42 <Regenaxer> ...

14:43 <Regenaxer> or "Rosetta Code/Fix code tags"

14:44 <Regenaxer> (let Lang '("ada" "awk" "c" "forth" "prolog" "python" "z80")

14:44 <Regenaxer> (while (echo "<")

14:44 <Regenaxer> (in NIL

14:44 <Regenaxer> (let S (till ">" T)

14:44 <Regenaxer> (cond

14:44 <freemint> cool an example of that should end up in doc 'echo

14:44 <Regenaxer> ((pre? "code " S) (prin "<lang" (cddddr (chop S))))

14:44 <Regenaxer> ((member S Lang) (prin "<lang " S))

14:44 <Regenaxer> ((= S "/code") (prin "</lang"))

14:44 <Regenaxer> ((and (pre? "/" S) (member (pack (cdr (chop S))) Lang))

14:44 <Regenaxer> (prin "</lang") )

14:44 <Regenaxer> (T (prin "<" S)) ) ) ) ) )

14:44 <Regenaxer> Rosetta has many such examples

14:44 <Regenaxer> Nice: Strip block comments

14:44 <Regenaxer> (in "sample.txt"

14:44 <Regenaxer> (while (echo "/*")

14:44 <Regenaxer> (out "/dev/null" (echo "*/")) ) )

14:45 <Regenaxer> I think for the ref the above examples are a bit too long

14:59 <freemint> yeah

14:59 <freemint> fun fact there are 388 different header attributes which might end up in an email

15:11 <freemint> If i remove all http related it is still 144

15:12 <freemint> *only keep strictly email related

15:25 <beneroth> :)

15:43 <cess11_> Sender and some contents will be easy, full implementation of headers and whatnot plus quirks of mbox would be nightmarish.

15:44 <cess11_> For some reason I still haven't learned that EOF Overrun might as well be lacking " as ( or ).

15:46 <beneroth> no worries, it will nag you until you do

15:47 <cess11_> Let's hope so.

15:50 <Regenaxer> cess11_, true, but the error message is too much at the lowest level to know the context ;)

15:56 <freemint> Regenaxer, The error messages could be much friendlier ...

15:56 <cess11_> Yar, it would rather be something for linting or syntax colouring to catch, but then I'd have to use those and, well...

15:57 <freemint> at a cost

15:57 <cess11_> Nah, then one might be surprised when the segfault comes.

15:57 <Regenaxer> vip helps here

15:58 <cess11_> Sure, I was lazy and on a hobby project.

15:58 <Regenaxer> It underlines strings

15:58 <cess11_> T, should use it more instead of vanilla vim.

15:58 <cess11_> But habits.

15:59 <Regenaxer> yeah, especially as some minor details are different

15:59 <Regenaxer> I got used to it, use it 100%

15:59 <Regenaxer> only for C/JS sources, or binary stuff, I use vip

16:00 <Regenaxer> 'vi' is a link to 'vip' here

16:00 <Regenaxer> "only for ... I use vim" I meant

16:05 <freemint> Regenaxer, what kind of index would you recommend on the email body?

16:06 <Regenaxer> You need a special one probably, the standard ones are not for long text

16:06 <Regenaxer> Something like in the Wiki

16:06 <Regenaxer> (class +MupIdx +index)

16:06 <Regenaxer> wiki/er.l

16:07 <freemint> what attributes does that one have?

16:07 <Regenaxer> it splits and folds the words in the text body

16:07 <Regenaxer> You have the wiki sources?

16:08 <Regenaxer> in wiki/lib.l is

16:08 <Regenaxer> (de splitWords (Lst)

16:08 <Regenaxer> (mapcar pack

16:08 <Regenaxer> (extract fold

16:08 <Regenaxer> (split Lst ~(chop "^J !,-.:;?{}")) ) ) )

16:09 <Regenaxer> then it extracts the words to index:

16:09 <Regenaxer> (de foldedWords (Mup)

16:09 <Regenaxer> (uniq

16:09 <Regenaxer> (when Mup

16:09 <Regenaxer> (filter '((W) (>= (length W) 3))

16:09 <freemint> i just downloaded it again

16:09 <Regenaxer> (splitWords (in (blob Mup 'txt) (till))) ) ) ) )

16:09 <Regenaxer> ok

16:10 <Regenaxer> Did not change in these parts I think

16:10 <Regenaxer> only last April:

16:10 <Regenaxer> (filter '((W) (>= (length W) 3))

16:10 <Regenaxer> it was 4 before

16:10 <Regenaxer> so now it indexes shorter words

16:11 <freemint> Is the wiki text and the index stored in Mup

16:11 <freemint> (many picolisp commands are around 3 words

16:11 <Regenaxer> yes, each modification is a +Mup

16:12 <freemint> what does +Mup stand for

16:12 <Regenaxer> Markup

16:13 <freemint> Can i mostly recycle this?

16:14 <Regenaxer> sure

16:18 <freemint> could i use these folded indexes to compute similarity between documents (aka detect quotes)

16:19 <Regenaxer> Good idea

16:19 rick42_ has joined #picolisp

16:19 <Regenaxer> Perhaps by counting how many words are the same
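
A rough sketch of the word-counting idea, loosely modelled on the splitWords / foldedWords code quoted above but working on plain strings instead of +Mup blobs. The names wordSet and shared and the reduced delimiter list are assumptions for illustration, not the Wiki's code.

```picolisp
# Hypothetical helper: unique folded words of at least 3 characters
(de wordSet (Str)
   (uniq
      (filter '((W) (>= (length W) 3))
         (extract
            '((L) (and L (fold (pack L))))   # Skip empty pieces, fold the rest
            (split (chop Str) " " "^J" "," ".") ) ) ) )

# Hypothetical helper: number of words two texts have in common
(de shared (A B)
   (length (sect (wordSet A) (wordSet B))) )

(shared "abb bbd dee jje" "abb bbd dde jjj")  # -> 2
```

The two sample strings share "abb" and "bbd". A raw count like this favours long texts, so a real quote detector would probably want to normalise it by the size of the smaller word set.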

16:21 <freemint> In the end each branch of the folded index ends in a list of +Mup having the same content is that right?

16:22 <Regenaxer> no, each branch points to a separate mup

16:22 <Regenaxer> There may be several mups with the same text

16:23 <freemint> if i have "abb bbd dee jje" and "abb bbd dde jjj" do they share parts of the search tree when they are folded

16:23 <freemint> Regenaxer, ouch

16:23 <freemint> i do not like that

16:23 <Regenaxer> no sharing

16:23 <freemint> why not?

16:24 <Regenaxer> It is how the indexes work

16:24 rick42 has quit [*.net *.split]

16:24 <Regenaxer> You would need a lot of searching during tree operations otherwise

16:24 rick42_ is now known as rick42

16:24 <freemint> ah

16:24 <freemint> how does that approach affect space?

16:25 <Regenaxer> And you would not save sooo much

16:25 <Regenaxer> a list also takes space

16:25 <Regenaxer> now there is key/value for each entry

16:26 <Regenaxer> It is probably not a super-high-performance fulltext index

16:26 <freemint> is there a strc

16:26 <freemint> *structure at the end of the ree

16:27 <Regenaxer> no, each node points to an object

16:27 <freemint> *tree which can be distinguished from intermediate levels

16:27 <Regenaxer> all +index subclasses

16:28 <freemint> How would a new class like +leave affect performance, when they maintain a hash of the path to them and are indexed by this hash additionally

16:28 <Regenaxer> no idea

16:28 <beneroth> freemint, pilDB index trees are BTrees

16:28 <cess11_> If you're low on hardware resources you probably want to profile the actual data.

16:28 <Regenaxer> Perhaps a research project for you?

16:29 <freemint> cess11_, I am not low on hardware

16:29 <beneroth> then don't try to do premature optimization :P

16:29 <cess11_> Well then, just rock a +Sn on a partial chunked set and see how it goes.

16:30 <beneroth> T

16:30 <freemint> you are confusing premature optimization mind games and premature optimization

16:30 <beneroth> both cost time and block you from getting to results :P

16:30 <Regenaxer> The pil db index classes were designed for relatively short values

16:30 <freemint> but generate fun

16:30 <freemint> ;P

16:30 <Regenaxer> You could use some external indexing tool

16:30 <beneroth> point

16:31 <Regenaxer> But in the Wiki it performs well :)

16:31 <freemint> That is another question i have: are +Sn more compact than character folded indexes

16:31 <Regenaxer> +Sn is only for personal (human) names

16:31 <Regenaxer> European that is

16:32 <Regenaxer> you mean +Idx

16:32 <freemint> i hear your warning but still

16:32 <Regenaxer> folding gives compacter stuff

16:32 <cess11_> Works also for other kinds of text, but not as nicely as with names.

16:32 <Regenaxer> yes

16:32 <freemint> is +Sn (through the lossy compression) more compact than text

16:33 <Regenaxer> that's true

16:33 <freemint> index

16:33 <freemint> what's true?

16:33 <Regenaxer> but the typical (+Sn +Idx) gets long

16:33 <Regenaxer> the +Idx

16:33 <Regenaxer> 17:32 <freemint> is +Sn (through the 17:32 <freemint> is +Sn (through the l

16:33 <Regenaxer> oops

16:34 <Regenaxer> True is: +Sn (through the lossy compression) more compact than text

16:34 <Regenaxer> off for a while

16:34 <Regenaxer> :)

16:35 <freemint> the +Idx gets as long as longest text, or as long as the longest shared tree between two texts

16:35 <freemint> bye

16:58 <cess11_> I'm not sure but +Idx and '(+Sn +Idx) on +String takes a fair bit of space, you'll see as soon as you do an import and have top/htop/&c. visible.

17:07 <Regenaxer> ret

17:09 <Regenaxer> +Sn produces only a single short entry, It is +Idx with all its substrings which blows it up

17:10 <freemint> so in (+Sn +Idx), +Idx works on normal strings

17:10 <Regenaxer> For non-human names I use almost always +IdxFold

17:10 <Regenaxer> yes

17:11 <freemint> i would not have assumed that

17:11 <Regenaxer> There was a document, I think we discussed it here a while ago

17:11 <Regenaxer> moment

17:12 <Regenaxer> yes: software-lab.de/doc/search

17:12 <Regenaxer> This is where I always look up if I'm not sure

17:14 <Regenaxer> The indexes produced by a value "Regen Axer"

17:14 <Regenaxer> For longer strings the size differences get more dramatic

17:14 <Regenaxer> +Key is the shortest :)

17:14 orivej has quit [Ping timeout: 240 seconds]

17:22 freemint has quit [Ping timeout: 252 seconds]

17:54 orivej has joined #picolisp

18:19 orivej has quit [Ping timeout: 256 seconds]

18:19 orivej has joined #picolisp

19:49 orivej has quit [Ping timeout: 240 seconds]

20:04 tankf33der has joined #picolisp

20:54 karswell has joined #picolisp

20:55 orivej has joined #picolisp

21:09 freemint has joined #picolisp

21:13 <freemint> "discussed"

21:39 orivej has quit [Ping timeout: 245 seconds]

21:53 orivej has joined #picolisp

22:26 orivej has quit [Ping timeout: 240 seconds]

22:44 orivej has joined #picolisp

22:56 beneroth is now known as bene|off