<nisstyre> lavaflow: I think all you wanted was to multiply them together and then use that in the bayes formula
<lavaflow> it's not totally clear to me how that would work
<nisstyre> lavaflow: straight from the paper you linked, P(viagra|spam)*P(free|spam)
<nisstyre> gives you P(viagra & free | spam)
<nisstyre> which you plug into Bayes' theorem to get P(spam | viagra & free)
<nisstyre> because bayes lets you reverse a conditional
<lavaflow> hmm, yes that makes sense
<nisstyre> and as they say, it assumes independence
<nisstyre> if you don't want to do that, you can use an n-gram model
<nisstyre> where you actually look at groups of words
<nisstyre> but that is more computationally intensive
<lavaflow> independence assumption is fine for my use-case, even 2-gram would become pretty unwieldy for me to actually store.
<nisstyre> yeah it's fine for a simple model
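The steps nisstyre describes can be sketched in Racket as follows. This is only an illustration: the likelihood tables, the prior `p-spam`, and the function names are all made-up assumptions, not values from the linked paper.

```racket
#lang racket/base

;; Hypothetical per-word likelihoods, invented for illustration only.
(define p-word-given-spam (hash "viagra" 0.30 "free" 0.20))
(define p-word-given-ham  (hash "viagra" 0.001 "free" 0.05))
(define p-spam 0.5) ; assumed prior probability of spam

;; P(w1 & w2 & ... | class) under the independence assumption:
;; the product of the per-word likelihoods.
(define (likelihood table words)
  (for/product ([w (in-list words)])
    (hash-ref table w)))

;; Bayes' theorem reverses the conditional:
;; P(spam | words) = P(words | spam) P(spam) / P(words)
(define (p-spam-given words)
  (define spam-part (* (likelihood p-word-given-spam words) p-spam))
  (define ham-part  (* (likelihood p-word-given-ham words) (- 1 p-spam)))
  (/ spam-part (+ spam-part ham-part)))

(p-spam-given '("viagra" "free")) ; close to 1 with these made-up numbers
```

The denominator P(words) is expanded over the two classes, so only per-word counts need to be stored, which is what keeps the model small.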
<lavaflow> I'm nearing 2,000 tags and plan on keeping it performant up to 10,000 or so
<nisstyre> lavaflow: if you want to save on memory, you could look into bloom filters
<nisstyre> I've got a C implementation of one that should be fairly easy to read...
<nisstyre> and translate to Racket
<lavaflow> I've already got a racket implementation of bloom filters actually
<nisstyre> ah ok cool
<nisstyre> what hash are you using?
<lavaflow> I was using them previously to speed up checking to see if files were members of a given tag set. one bloom filter per tag.
<lavaflow> murmur3
<nisstyre> mine uses FNV and then takes the upper and lower 32 bits of it
<nisstyre> a 64 bit FNV that is
<nisstyre> I think murmur is just as good
<lavaflow> I found these racket FFI bindings for murmur3, they seemed sufficiently fast: https://github.com/jrslepak/murmur3
<nisstyre> oh I know that guy
<nisstyre> he used to be in ##programming a lot
<lavaflow> small world
<nisstyre> python bindings though
<nisstyre> I just used libfnv
<lavaflow> I've not cleaned up my whole bloom filter package for publishing, I mean to write some docs for it at least first, but here is the meat of it: https://gist.github.com/jgreco/69f063f3e1b058fdf867c231c68bb47e
<lavaflow> I'm pretty pleased with (make-recommended-bloom-filter n tolerance) in particular. give it the anticipated size and the desired tolerance, and it will kick out a bloom filter with the right parameters
<nisstyre> yeah
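For readers without access to the gist, here is a minimal sketch of how such parameters are usually derived. It uses the standard textbook formulas m = -n ln p / (ln 2)^2 and k = (m / n) ln 2; the name `recommended-parameters` is hypothetical and this is not necessarily how `make-recommended-bloom-filter` is written.

```racket
#lang racket/base

;; Hypothetical helper, not the code from the gist.
;; n = anticipated number of items, p = tolerated false-positive rate.
;; Returns two values: the bit count m and the hash count k.
(define (recommended-parameters n p)
  (define ln2 (log 2))
  (define m (inexact->exact (ceiling (/ (* -1 n (log p)) (* ln2 ln2)))))
  (define k (max 1 (inexact->exact (round (* (/ m n) ln2)))))
  (values m k))

(recommended-parameters 1000 0.01) ; 9586 bits and 7 hash functions
```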
<nisstyre> generally the part that has an impact on performance is memory allocation
<lavaflow> yeah I'm not doing anything fancy in that regard, I'm just using data/bit-vector and hoping it all works out. the part that concerned me the most actually was how big all the bloom filters would be when serialized into my sqlite database
<lavaflow> it didn't really turn out to be a problem though
<nisstyre> oh, yeah IDK about that
<nisstyre> but it looks fine to me
<nisstyre> you should clean up the API and publish it to http://pkgs.racket-lang.org/
<lavaflow> yeah, I've been meaning to.
<lavaflow> I'm not really sure how to get it to play well with those murmur3 FFI bindings though, it took some coaxing to get it to produce a working dynamic library on my system, and then I had to put it in my racket's /lib/ directory manually before it would find it.
<lavaflow> I'm sure there is some way to make a modern racket package that does FFI stuff cleanly, so the murmur3 package can be upgraded to that then my bloom-filter package can simply depend on it like normal and not worry about the details, but that's something I've not really looked into yet
<lavaflow> oh man, now that I look at that code again, it's pretty shoddy. I'm recomputing hashes needlessly for the kmo-hash.
<lavaflow> nisstyre: yeah that's the proper way to do it. only compute those two hashes once each then combine those results with h1+h2*i
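A minimal sketch of the h1+h2*i combination, assuming `h1` and `h2` are two already-computed hashes of the key (for example the two halves of one 64-bit hash) and `m` is the number of bits in the filter. The function name `bit-indexes` is hypothetical.

```racket
#lang racket/base

;; Derive k bit positions from only two hash values,
;; so the expensive hash functions run once each per key.
(define (bit-indexes h1 h2 k m)
  (for/list ([i (in-range k)])
    (modulo (+ h1 (* i h2)) m)))

(bit-indexes 10 3 4 16) ; '(10 13 0 3)
```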
<rnmhdn> can someone explain this to me:
<rnmhdn> > (case (car '(/))
<rnmhdn> ['(+ - *) '(1) ]
<rnmhdn> ['(/) '(2) ]
<rnmhdn> [else '(3) ])
<rnmhdn> '(3)
<rnmhdn> I found out myself sorry.
<bremner> too many ' ?
<rnmhdn> kinda
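What went wrong above: each `case` clause takes an unquoted list of datums. Writing `'(/)` reads as `(quote (/))`, so the clause matches the symbol `quote` or the list `(/)`, never the symbol `/`. A small sketch of both versions (the function names are made up for illustration):

```racket
#lang racket/base

;; Correct: the datums are listed without a quote.
(define (classify op)
  (case op
    [(+ - *) '(1)]
    [(/)     '(2)]
    [else    '(3)]))

;; The version from the log: each clause really holds the datums
;; `quote` and a list, so the symbol / falls through to else.
(define (classify-quoted op)
  (case op
    ['(+ - *) '(1)]
    ['(/)     '(2)]
    [else     '(3)]))

(classify (car '(/)))        ; '(2)
(classify-quoted (car '(/))) ; '(3)
(classify-quoted 'quote)     ; '(1), because `quote` is one of the datums
```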
<rnmhdn> any comments on this code:
<rnmhdn> it takes '( 1 2 3 3 2 1 1 1 2) and gives '( 1 2 3 2 1 2)
<bremner> can you replace your use of append with cons?
<bremner> I guess it's not a real efficiency hit here, but it does look odd to do (append (list 'thing) tail) rather than (cons 'thing tail)
<rnmhdn> yeah thank you
<rnmhdn> I fixed that
<bremner> (list-tail lst 1) is slightly surprising, but I guess it makes sense to use list-tail in both places.
<rnmhdn> yeah I thought that'd be more readable
<rnmhdn> also is this how you would approach this problem in racket?
<bremner> more or less. I'd try to avoid the call to length, since that will be expensive on long lists. I might also see if I could restructure the nested if's into a single cond
<rnmhdn> what would you do instead of length?
<rnmhdn> I don't agree so much with the if to cond conversion because I think that reduces readability
<rnmhdn> also how would you solve LIS in racket?
<rnmhdn> in polynomial time at least.
<bremner> rnmhdn: for length I would do something like (or (null? (cdr lst)) (null? (cddr lst)))
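rnmhdn's code is not in the log, so the following is a hypothetical reconstruction of the task (drop consecutive duplicates) that applies bremner's suggestions: `cons` instead of `append`, a single `cond`, and `null?` checks instead of `length`.

```racket
#lang racket/base

;; Hypothetical reconstruction, not rnmhdn's code.
(define (dedupe-adjacent lst)
  (cond
    ;; zero or one element: nothing to compare, checked without `length`
    [(or (null? lst) (null? (cdr lst))) lst]
    ;; first two elements equal: drop the first one
    [(equal? (car lst) (cadr lst)) (dedupe-adjacent (cdr lst))]
    ;; otherwise keep the head and recur on the tail
    [else (cons (car lst) (dedupe-adjacent (cdr lst)))]))

(dedupe-adjacent '(1 2 3 3 2 1 1 1 2)) ; '(1 2 3 2 1 2)
```

The `null?` checks take constant time, whereas `length` walks the whole list on every call.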
<bremner> I guess for LIS you need some storage, maybe pass down a vector. Or use a for loop and a vector
rnmhdn has quit [Ping timeout: 252 seconds]
rnmhdn has joined #racket
<rnmhdn> sorry
<rnmhdn> I was disconnected
<rnmhdn> bremner: did you say anything to me?
<bremner> two things: for length I would do something like (or (null? (cdr lst)) (null? (cddr lst))), and I guess for LIS you need some storage, maybe pass down a vector. Or use a for loop and a vector
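One way to read the "for loop and a vector" suggestion is the classic O(n^2) dynamic program for the longest increasing subsequence. This is a sketch under that assumption, returning only the length and treating "increasing" as strictly increasing; `lis-length` is a made-up name.

```racket
#lang racket/base

;; Length of the longest strictly increasing subsequence, O(n^2) time.
(define (lis-length lst)
  (define xs (list->vector lst))
  (define n (vector-length xs))
  ;; best[i] = length of the longest increasing subsequence ending at xs[i]
  (define best (make-vector n 1))
  (for* ([i (in-range n)]
         [j (in-range i)])
    (when (< (vector-ref xs j) (vector-ref xs i))
      (vector-set! best i
                   (max (vector-ref best i)
                        (add1 (vector-ref best j))))))
  ;; the answer is the largest entry, or 0 for an empty input
  (for/fold ([longest 0]) ([b (in-vector best)])
    (max longest b)))

(lis-length '(10 9 2 5 3 7 101 18)) ; 4, e.g. 2 3 7 18
```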
<pagnol> I'm looking for a function that does a right shift
<pagnol> I only found arithmetic-shift
<bremner> rudybot: init racket
<rudybot> bremner: error: with-limit: out of time
<bremner> sigh.
<bremner> pagnol: anyway, pass a negative shift
<pagnol> bremner, ouch, I thought I had read in the docs that the argument should be nonnegative
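For reference, a short example of `arithmetic-shift` in both directions. A positive second argument shifts left and a negative one shifts right, preserving the sign.

```racket
#lang racket/base

(arithmetic-shift 1 10)  ; 1024, left shift by 10
(arithmetic-shift 8 -1)  ; 4, right shift by 1
(arithmetic-shift -8 -1) ; -4, the sign is preserved
```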
<ManDay> Are there any people here?
* tilpner beep boop
<ManDay> Ah hi. I just installed racket because Guile's backtraces are complete bollocks. I'd like to just run a file like I'd do with Guile. `racket x.scm` complains about missing (module) and such, can this be bypassed simply?
<tilpner> racket -r x.scm
<tilpner> That works for a boring displayln. You may need to specify the language if you want anything more
<ManDay> the language? i do nothing fancy, rudimentary r6rs will make me happy?
<ManDay> Ok, this backtrace isn't very helpful either...
<ManDay> Why is it so hard to get a proper backtrace from scheme programs?
<ManDay> I've added (require errortrace) at the top of the source code and although the output has changed and has become more verbose, I don't even get something as simple as the line number on which the error occurred when running `racket -r x.scm`
<ManDay> (I know from guile it's happening on (cdr #f), but nothing indicates that in the racket error)
<ManDay> Oh maybe not. Maybe there is a different error here
<ManDay> I have to investigate. Seems like racket works through the program in a bit different way than Guile
<rain1> https://schemers.org anybody here have edit access to this website?
<rain1> would like to suggest some changes
<lexi-lambda> rain1: This is #racket… did you mean to ask that in #scheme?
<rain1> na
<lexi-lambda> rain1: This page says to contact Shriram if you have any feedback. https://schemers.org/welcome.shtml
<rain1> i know, that's why i asked here
<lexi-lambda> I don’t think Shriram is in this channel.
<joebobjoe> hi. doing htdp.org. why do parens need an operator inside them
<joebobjoe> why can't things like (+ 1 (2)) work?
<joebobjoe> why can't a subexpression just be a value?
<rain1> (2) is an error
<joebobjoe> yes, but why
<rain1> it's trying to call 2 as a function
<rain1> but it's not a function
<joebobjoe> why don't parens serve as simple expression grouping?
<joebobjoe> why do they call a function?
<rain1> that's what they do in lisp
<joebobjoe> is there some sort of ambiguity I'm not thinking of?
<joebobjoe> I'm new to lisp-like langs
<joebobjoe> I'm used to langs where parens can simply group expressions
<rain1> yeah
<joebobjoe> I guess it makes it so your code has a normal form?
<rain1> it's not really about having a normal form
<joebobjoe> is it an idiosyncrasy?
<rain1> I can see the connection though
<joebobjoe> I guess the reason racket doesn't allow (+ 1 (2)) is because there is never a need to use extra parens like that, ever
<joebobjoe> because the order of expression evaluation is already built in to the syntax?
<rain1> that's true
<rain1> you never have any question like 1 + 2 * 3 vs (1 + 2) * 3 vs 1 + (2 * 3)
<rain1> it's always just (+ 1 (* 2 3))
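A small illustration of rain1's point: in prefix notation the nesting itself spells out the evaluation order, so there are no precedence rules and no grouping-only parentheses.

```racket
#lang racket/base

(+ 1 (* 2 3)) ; 7, the reading of 1 + (2 * 3)
(* (+ 1 2) 3) ; 9, the reading of (1 + 2) * 3
;; (+ 1 (2)) is an error: (2) means "call 2 with no arguments",
;; and 2 is not a function.
```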
<joebobjoe> rain1: are you learning racket, too?
<rain1> it's closer to what a computer understands than math
<rain1> yeah
<joebobjoe> wait why does bsl (or racket in general idk) have different equality operators for different types?
<joebobjoe> what is wrong with (= "foo" "foo")? why do I have to use (string=? "foo" "foo")?
<aeth> = is used for numbers
<rain1> there's a lot of different equality tests
<aeth> If there was a general equality it wouldn't be called =
<aeth> it might be called e.g. equal?
<joebobjoe> I just don't understand why ='s semantics can't just be shared/implemented by all types
<ubLIX> how then would you differentiate between "1234" and 1234?
<ubLIX> just like that, i suppose
<joebobjoe> man racketeers are geniuses
<joebobjoe> just kidding
<joebobjoe> :/
<ubLIX> joebobjoe: I'm not a racketer, just a lapsed dabbler. Interesting question though. Something to do with racket not being a typed language (unless you want it to be)?
<rain1> joebobjoe: equal? works for everything
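To make the difference concrete, here is how the predicates mentioned above behave in full Racket. `=` is numeric equality, `string=?` only accepts strings, and `equal?` is the general structural test.

```racket
#lang racket/base

(= 1 1.0)               ; #t, numeric equality ignores exactness
(equal? 1 1.0)          ; #f, an exact 1 and an inexact 1.0 are different values
(string=? "foo" "foo")  ; #t
(equal? "foo" "foo")    ; #t
(equal? "1234" 1234)    ; #f, a string is never equal to a number
;; (= "foo" "foo") raises a contract error, because = expects numbers.
```

The type-specific predicates report a wrong-type argument as an error instead of quietly answering #f, which is one practical reason for having them.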
<joebobjoe> also, why is it called typed/racket not racket/typed ?
<joebobjoe> wouldn't the latter make more sense