<Regenaxer>
The point of mis> is that it returns an error message
<Regenaxer>
Depending on the relation
<Regenaxer>
for nm it is clear anyway
<tankf33der>
ok
orivej has joined #picolisp
orivej has quit [Ping timeout: 245 seconds]
orivej has joined #picolisp
orivej has quit [Ping timeout: 240 seconds]
alexshendi has joined #picolisp
alexshendi has quit [Read error: Connection reset by peer]
orivej has joined #picolisp
jibanes has quit [Ping timeout: 272 seconds]
jibanes has joined #picolisp
orivej has quit [Ping timeout: 250 seconds]
<tankf33der>
what is the core, the key concept, for understanding picodb?
<Regenaxer>
hmm, everything about external symbols?
<tankf33der>
ok
<Regenaxer>
they are the basis
<Regenaxer>
They can exist by themselves
<Regenaxer>
above is @lib/btree.l
<Regenaxer>
also by itself
<Regenaxer>
and on top of *that* is @lib/db.l, ie. the E/R stuff
<Regenaxer>
The three layers are isolated from each other
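[Editor's note] The lowest layer can be tried on its own, without @lib/btree.l indexes or @lib/db.l entities. The following is only a minimal sketch: it assumes a 64-bit PicoLisp started with 'pil', and the file name "db" and the property 'note are made up for illustration.

```picolisp
# Editor's sketch, not from the channel: external symbols used by themselves
(pool (tmp "db"))        # throw-away single-file database (hypothetical name)
(setq X (new T))         # a bare external symbol: no class, no index
(put X 'note "hello")    # properties are stored as on any other symbol
(commit)                 # write the modified symbol to the database file
(println (ext? X))       # non-NIL: X is an external symbol
(println (get X 'note))  # the stored property
```

Nothing here touches B-trees or +Entity classes; those layers only come into play once indexes and relations are wanted.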
<tankf33der>
eh.
orivej has joined #picolisp
orivej has quit [Ping timeout: 240 seconds]
jibanes has quit [Ping timeout: 240 seconds]
jibanes has joined #picolisp
inara has quit [Quit: Leaving]
inara has joined #picolisp
orivej has joined #picolisp
orivej has quit [Ping timeout: 240 seconds]
<tankf33der>
so
<tankf33der>
i have 29M objects in the NY taxi DB now
<tankf33der>
let's import the next several months
<Regenaxer>
Good
<beneroth>
Good evening
<beneroth>
nice, tankf33der
<beneroth>
yeah I recommend reading lib/db.l too
<beneroth>
lib/btree.l is more about inner workings, no need to grok unless you want to fiddle much with db internals, I think. but lib/db.l is good to understand +Entity, the +relation classes and I found it rather easy to understand the db structures by going through (db) source
alexshendi has joined #picolisp
<beneroth>
I found this an important document, both for understanding DB and the default form.l framework: https://software-lab.de/dbui.html
<beneroth>
it's a bit dated but good write-up of the core concepts
<beneroth>
tankf33der, @lib/too.l is also recommended, contains (not loaded by default) utility functions, among them the database garbage collector (dbgc), the integrity checker (dbCheck), and functions to fix/rebuild indices/indexes. unfortunately it is not documented afaik, and the source is often rather deep stuff, so easiest is to ask here in IRC how to achieve something or how exactly those functions are called.
<Regenaxer>
Good evening beneroth
<beneroth>
I've a collection of chat logs and notes I work with, but I haven't yet taken the time to compile it into a usable documentation
<tankf33der>
beneroth: thanks
<beneroth>
you could do (pool...) (load "@lib/too.l") (dbCheck) on your database, if it returns T all is fine. else you might have some mistakes either in the schema (e/r definitions) and/or some code which wrote to the database
<beneroth>
but the risk with such mistakes is usually not very grave, worst case some query might not return all the data it should return. actual data loss cannot happen so easily unless you run (dbgc) while records aren't in any index at all.
<beneroth>
important to know is, unique index relations specified with +Key are not enforced on database level, meaning there will always only be one entry per key in the index, but no checks in (new) or so to prevent duplicates, creating a second object with the same value in a +Key property just overwrites the entry in the +Key index. you have to check yourself (e.g. with (db)) to prevent duplicates, or use form.l GUI which does such checks.
<beneroth>
often you use (genKey) for +Key properties, like id field in primary keys in traditional relational databases. then of course (genKey) checks for duplicates.
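[Editor's note] The two approaches beneroth describes, checking with (db) before creating an object and letting (genKey) pick the key, might look roughly like this. It is a sketch under assumptions: the +Person class, its relations and the helper newPerson are hypothetical, and PicoLisp is assumed to be started with 'pil' so that @lib/db.l is loaded.

```picolisp
# Editor's sketch, not from the channel
(class +Person +Entity)          # hypothetical entity class
(rel nr (+Key +Number))          # unique index, but 'new' does not check it
(rel nm (+Ref +String))          # non-unique index

(pool (tmp "db"))                # throw-away single-file database

# Hypothetical helper: create a person only if the key is not yet indexed
(de newPerson (Nr Nm)
   (unless (db 'nr '+Person Nr)
      (new! '(+Person) 'nr Nr 'nm Nm) ) )

(newPerson 7 "Alice")            # creates the object
(newPerson 7 "Bob")              # returns NIL, key 7 is already taken

# Alternative: let the index supply the next free key
(new! '(+Person) 'nr (genKey 'nr '+Person) 'nm "Carol")
```

The explicit (db) lookup matters because, as said above, a second object with the same key would otherwise silently replace the first one's entry in the +Key index.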
<beneroth>
right, Regenaxer ? :)
<beneroth>
Regenaxer, removing *Tsm mechanism is fine by me. I also never used the mode in emacs. Seems to me like something which probably was useful 20+ years ago but likely not so anymore.
<Regenaxer>
Perfect! :)
<Regenaxer>
OK, good to hear
<Regenaxer>
"Perfect" was for all above explanations :)
<beneroth>
yeah, thanks :)
<Regenaxer>
beneroth, btw, some time ago we discussed about DB modifications without locks
<Regenaxer>
I did such a thing now in 'create'
<Regenaxer>
each process writes a separate index
<Regenaxer>
so no locking is needed, only a global lock for the whole 'create' run
<beneroth>
yeah I see
<beneroth>
but well, during (create) other work on the DB is kinda blocked. I'm thinking about concurrent writes. I read up about how usual databases do this and what the different techniques are - in my view, your very simple approach is quite enough for the usual use cases/workloads for which pilDB is now mainly used, even more so with today's hardware. the alternatives are quite complex and have general overhead costs.
<beneroth>
As you know I thought about using object-level locks for concurrent writes on bigger databases where the concurrent writes are likely not overlapping (are often concurrent but usually in different areas/indices/objects), probably it would not even need any modification, just well-thought-out use of the existing overloaded (lock) variant.
<Regenaxer>
Right, as the locks are very short in time
<Regenaxer>
But for 'create' it is a big gain
<beneroth>
yes, if the programmer keeps the (dbSync) ... (commit 'upd)/(rollback) code small.
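[Editor's note] A short transaction of the kind mentioned here could be sketched as follows. The +Account class and the withdraw function are hypothetical; only (dbSync), (commit 'upd), (rollback) and put> are taken from PicoLisp itself, and 'pil' is assumed as the start script.

```picolisp
# Editor's sketch, not from the channel
(class +Account +Entity)         # hypothetical entity class
(rel nr (+Key +Number))
(rel bal (+Number))

(pool (tmp "db"))                # throw-away single-file database
(setq Acc (new! '(+Account) 'nr 1 'bal 100))

# Keep the locked section as small as possible
(de withdraw (Obj N)
   (dbSync)                                  # take the database lock
   (if (> N (get Obj 'bal))
      (nil (rollback))                       # discard and release, return NIL
      (put> Obj 'bal (- (get Obj 'bal) N))   # modify the object
      (commit 'upd)                          # write, release, notify other processes
      T ) )

(withdraw Acc 30)                # succeeds, balance becomes 70
(withdraw Acc 500)               # fails, balance stays 70
```

All decisions are made between (dbSync) and the matching (commit 'upd) or (rollback), so other processes are blocked only for that brief span.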
<beneroth>
of course.
<Regenaxer>
Many processes write in parallel
<Regenaxer>
well, the root file is still locked
<Regenaxer>
so *write* is only one process
<beneroth>
I often have DB logic code which has a lot of nested (ifns) in it, like (ifn condition (rollback) (check next condition)). maybe I could structure this a bit nicer/more readable.
<Regenaxer>
But caching up to a million modified objects without locking is the important point
<beneroth>
yeah that (create) is a nice new tool, important advancement!
<tankf33der>
now i have 113M objects, the whole year 2017, picodb still works as fast as before.
<Regenaxer>
cool
<beneroth>
awesome
<Regenaxer>
In my test I imported 1 billion objects (1e9)
<Regenaxer>
with three indexes
<Regenaxer>
A little more than a day
<tankf33der>
show me the schema of that import
<tankf33der>
and how much disk space it took
<Regenaxer>
Currently I'm experimenting with OSM data again
<tankf33der>
OSM ?
<Regenaxer>
That import is the one in the ref
<Regenaxer>
+Cls
<beneroth>
I'm currently struggling in one project with a MSSQL relational database with 340k records (the database configuration is not optimised and the database schema is a mess) ^^
<Regenaxer>
OSM is OpenStreetMap
<tankf33der>
ok
<Regenaxer>
The model in the wiki article
<tankf33der>
ok
<tankf33der>
remember one.
<tankf33der>
i've imported the whole year 2017
<Regenaxer>
Great! :)
<tankf33der>
the first 3 months took more space than the next ones
<tankf33der>
so the last several months took less disk space than the others
<Regenaxer>
ok, though in general it will get a little slower when it gets bigger
<tankf33der>
ok
<Regenaxer>
OSM is tough to import. Not so much because of the indexes, but because each node is connected to possibly many other nodes
<Regenaxer>
Currently importing all of Germany. A single XML file of 52 GiB
<beneroth>
nice!
<beneroth>
also for the distributed ERP system?
<beneroth>
*is this
<razzy>
stupid question. if you have numbers, symbols, why do you have to have pairs?
<Regenaxer>
Probably not. The data are not so useful by themselves
<beneroth>
ok
<Regenaxer>
Cons pair?
<beneroth>
razzy, pair is the simplest form of a list?
<Regenaxer>
Yeah, a cons pair combines exactly two items
<razzy>
i thought you have symbols implemented by pairs
<razzy>
maybe
<Regenaxer>
Not by pairs, but by cells
<beneroth>
underneath everything is based on a cell structure, but you should not care about that, that's implementation internals
<Regenaxer>
A cell is the storage unit
<razzy>
it is listed as special structure. for education purposes
<beneroth>
a cell is basically 2 pointers
<Regenaxer>
a symbol may be a single cell, or many cells
<razzy>
i thought a cell is address,number/data,pointer-to-another-cell
<Regenaxer>
That's correct
<Regenaxer>
as beneroth said, it is two pointers
<Regenaxer>
if we consider a shortnum a virtual pointer
<Regenaxer>
So "cell" is a physical unit, and "pair" a logical one
<Regenaxer>
A pair combines two Lisp items
<Regenaxer>
a cell not necessarily
<razzy>
"a cell not necessarily" i do not get
<Regenaxer>
It may hold raw data in
<Regenaxer>
CAR and CDR
<Regenaxer>
not necessarily another Lisp item (atom or pair)
<Regenaxer>
ie bignum cells have raw digit data
<Regenaxer>
Name cells of a symbol have the characters
<Regenaxer>
But a pair always has an atom or a pair both in its CAR and in its CDR
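[Editor's note] On the programming level the difference shows up only as "pair" versus "atom"; the cells behind a bignum or a symbol name are not visible. A small sketch, with arbitrary example values:

```picolisp
# Editor's sketch, not from the channel
(setq P (cons 1 2))       # one cell used as a pair: CAR and CDR are Lisp items
(println P)               # prints (1 . 2)
(println (pair P))        # non-NIL: P is a pair
(println (atom 12345678901234567890123))  # T: a bignum spans several cells, yet is one atom
(println (atom 'abc))     # T: a symbol is an atom as well
```

There is no Lisp-level function that hands out the raw digits or name characters stored in those internal cells as "half a pair".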
<razzy>
for now, i cannot imagine anything in CDR other than a pointer to another cell. maybe the NIL "symbol"
<razzy>
and even NIL should be in another cell :]
<beneroth>
pointers are also just integer values (memory addresses). you can also look at a cell as a box of two integer values (C integers, not numbers). so it should not be so difficult for you to imagine that one of the values in a cell is a number and not pointing to another memory address.
<beneroth>
NIL is a symbol.
<Regenaxer>
it is, see doc64/structures. NIL occupies 2 cells
<beneroth>
in picolisp, the datatype is encoded in special flags in a cell value.
<beneroth>
I gotta go. take care dudes, cu :)
<Regenaxer>
See you beneroth!
<razzy>
if you put pure number into CDR, i am not sure if you do not destroy some predictability of cell chain
<razzy>
it is nice to be certain about something in universe
<Regenaxer>
T
<Regenaxer>
"pure number" is only in CAR btw
<Regenaxer>
from doc64/structures:
<Regenaxer>
   Bignum
<Regenaxer>
   |
<Regenaxer>
   V
<Regenaxer>
+-----+-----+
<Regenaxer>
| DIG |  |  |
<Regenaxer>
+-----+--+--+
<Regenaxer>
         |
<Regenaxer>
         V
<Regenaxer>
      +-----+-----+
<Regenaxer>
      | DIG |  |  |
<Regenaxer>
      +-----+--+--+
<Regenaxer>
               |
<Regenaxer>
               V
<Regenaxer>
            +-----+-----+
<Regenaxer>
            | DIG | CNT |
<Regenaxer>
            +-----+-----+
<razzy>
this pure number could represent any data
<Regenaxer>
no
<Regenaxer>
no Lisp data at all
<razzy>
in CAR
<razzy>
no?
<razzy>
how, why?
<Regenaxer>
Lisp data are atoms or pairs, not raw bit patterns
<Regenaxer>
Please understand: There are *only* numbers, symbols and pairs
<razzy>
yes i get that. one or many CARs make atom
<Regenaxer>
one or many *cells* make atom
<Regenaxer>
cells, not pairs!
<Regenaxer>
CAR is half of a cell
<Regenaxer>
the other half is CDR
<razzy>
ok.
<Regenaxer>
A cell *may* implement a pair
<Regenaxer>
but also a symbol or a bignum
<Regenaxer>
only shortnums don't exist physically
<razzy>
and pair is (CAR . CDR)
<Regenaxer>
yep
<razzy>
which makes some problems in reasoning and playing with addresses and stuff
<Regenaxer>
What kind of problem?
<razzy>
you cannot always assume that CDR is an address
<Regenaxer>
There is no concept of an address in Lisp
<razzy>
i do not have example. just bad feeling
<Regenaxer>
Forget addresses
<razzy>
to me a symbol is the address of a cell chain
<Regenaxer>
Forget that for now
<Regenaxer>
It is not useful
<Regenaxer>
A symbol is an *atom*
<Regenaxer>
it has a value, a name and properties
<Regenaxer>
It is helpful to have the pointer structures in mind, to understand what is going on, but on the programming level they are irrelevant
<Regenaxer>
A symbol has simply those 3 components
<Regenaxer>
The name is not modifiable, only the value and the properties
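[Editor's note] The three components can be inspected directly. In this sketch the symbol Foo and the property 'color are made up for illustration.

```picolisp
# Editor's sketch, not from the channel
(setq Foo 7)               # set the value
(put 'Foo 'color 'red)     # set a property
(println (val 'Foo))       # prints 7
(println (get 'Foo 'color))  # prints red
(println (name 'Foo))      # prints "Foo"
```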
<razzy>
ok thx
<Regenaxer>
:)
<razzy>
if i have anonymous symbol, cannot i name and rename it?
<Regenaxer>
Not possible. The name cannot be modified
<Regenaxer>
You must assign the value and properties to another symbol
<Regenaxer>
Ah
<Regenaxer>
I forgot
<razzy>
not a problem, is there only human programmer reason? or is there technical one
<Regenaxer>
You *can* do it, with 'name'
<Regenaxer>
for transients
<Regenaxer>
and thus also for anonymous syms :)
<Regenaxer>
It is prohibited for internal symbols, as it would mess up the namespace
<Regenaxer>
and also external symbols, as their names are special
<Regenaxer>
But transients can be renamed
<Regenaxer>
Is still dangerous, as a transient may be in some other namespace
<Regenaxer>
which will then be inconsistent
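[Editor's note] The renaming Regenaxer refers to is the two-argument form of 'name'. The sketch below assumes the 64-bit PicoLisp of the time of this log; other versions may not accept a second argument. The name "counter" is arbitrary.

```picolisp
# Editor's sketch, not from the channel; two-argument 'name' is version-dependent
(setq X (box 7))        # anonymous symbol with value 7
(println (name X))      # prints NIL: no name yet
(name X "counter")      # give the anonymous symbol a name
(println (name X))      # prints "counter"
(println (val X))       # prints 7: the value is untouched
```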
<Regenaxer>
I don't remember a case where I did that, renaming a symbol in that way
<razzy>
yeah, shared namespaces, same problem as with shared memory
<razzy>
and external resources
<Regenaxer>
external resources yes, but namespaces are not shared
<Regenaxer>
A symbol can be in several namespaces though
<Regenaxer>
So the symbol is kind of shared perhaps :)
<razzy>
btw, no limit on number of symbols i guess
<Regenaxer>
T
<Regenaxer>
only memory
<Regenaxer>
For external symbols the logical maximum is 256 Peta objects
mtsd has joined #picolisp
<mtsd>
Good evening!
<Regenaxer>
Good evening mtsd!
<mtsd>
Hi Regenaxer!
<mtsd>
I found the reference I mentioned on the mailing list
<Regenaxer>
About *Tsm?
<mtsd>
Yes, exactly
<mtsd>
In doc/tut.html, "Auto-formatting (underlined) and double quoted strings..."
<mtsd>
Can I remove that line and send you an updated doc/tut.html file?
<Regenaxer>
Ah, indeed!
<Regenaxer>
No problem, I can do all the changes tomorrow
<Regenaxer>
I just wait a little if there are objections
<mtsd>
Good idea
<mtsd>
I need your help with one small thing
<mtsd>
The form doc, and some other docs I have been involved in, still have my old e-mail address
<mtsd>
I still check it, but the main one has changed. Ok if I update and send you the files?
<Regenaxer>
Can you send the fixed files?
<mtsd>
Absolutely
<Regenaxer>
yes, nice
<Regenaxer>
I did not modify the form doc without you
<mtsd>
It is probably time to revisit that, now that I have some more experience
<mtsd>
It is usually a good idea to have another go at your writings, after some time has passed
<Regenaxer>
Indeed
<rick42>
very wise, mtsd
orivej has joined #picolisp
<mtsd>
Hi rick42!
<mtsd>
How are you these days? :)
orivej has quit [Remote host closed the connection]
orivej has joined #picolisp
mtsd has quit [Quit: WeeChat 1.6]
alexshendi has quit [Read error: Connection reset by peer]
alexshendi has joined #picolisp
<beneroth>
re, hi guys
<beneroth>
as rick42 said, wise and good approach, mtsd :)
<beneroth>
oh mtsd is already gone :0
<beneroth>
Regenaxer, nice and well worded explanation (you made to razzy)
<beneroth>
razzy, for about the hundredth time or so: stop theorizing and caring so much about picolisp internals (pointers and stuff). stop trying to look at lisp from a C perspective, it is simply wrong (for understanding any lisp, this is not special to picolisp), it will not help you to understand it. Try to do some practical programming with picolisp! later, with some experience of how picolisp can be used for ACTUAL PRACTICAL programs, you can
<beneroth>
go back to understanding how exactly it works internally on a byte and bit level. Without the knowledge of how to actually program and think in the lisp way you will not make much sense of why the internal design is as it is. lisp programs are like onions or nested matryoshka dolls, layer on layer, every lower/inner layer consisting of smaller and simpler/more basic pieces/bricks, and every higher/outer layer being an increasing refinement and assemblage of the
<beneroth>
smaller building blocks, and the resulting application and big picture being mainly a kind of emergent structure/behaviour resulting from the combination and interplay of the increasingly smaller/more general pieces it consists of. if you look at a tiger with a microscope you only see single hairs, which don't tell you anything about the look, colour/texture or behaviour of the whole tiger, it only really makes sense to look at the single tiger hair
<beneroth>
after you have observed the whole tiger.
<beneroth>
if you're only interested in hairs (pointers, C-style programming, which is just one small step from pure assembler), then go play with C instead of lisp. C is primarily about how compiler and hardware work. lisp is primarily about abstractions and software design and how to think as a human about it, without much regard for the hardware specifics. LISP was invented to teach the theory of programming, the original first author didn't even intend it as an
<beneroth>
actual programming language but as a purely theoretical concept to teach understanding of software design.