<alva>
And parsing a few measly MB blocks everything
<alva>
So it doesn't seem to do anything clever to work around the single-threadedness/anything-CPU-bound-blocks-everything-ness of node
<jfhbrook>
yeah, that's unfortunate
prophile has joined #elliottcable
<jfhbrook>
the streaming part makes some sense because you can parse rows as you receive them
<jfhbrook>
but an async parse of an entire buffer sounds unusual
<jfhbrook>
god I hate coffeescript
<jfhbrook>
I can't tell if this looks like word salad because it's a parser or because it's coffeescript or what
<alva>
Why can I stream a file just fine without callbacks in C
<alva>
Or stdin for that matter
<jfhbrook>
what are you talking about
<alva>
Just how little I get this
<jfhbrook>
you understand that Stream is a concept specific to node, right? Like yes streams in general exist in a lot of places, but node streams are a specific thing
<alva>
Yeah
<jfhbrook>
and it's not really 100% the same
<jfhbrook>
like, evented io and all that crap
<alva>
How do people work around the CPU bound blocking thing in node? Spawn subprocesses or something?
<jfhbrook>
*sigh* yeah
<jfhbrook>
or if they're smarter, write it in C++ and write node bindings to it
<jfhbrook>
because there's support for thread pools in libuv
<jfhbrook>
but in general, you can spawn threads in native addon land
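A minimal sketch of the "spawn subprocesses" workaround mentioned above, written in Python rather than node because the idea carries over to any single-threaded event loop. The names `WORKER_SOURCE` and `parse_in_child` are hypothetical, and the worker is deliberately tiny: the CPU-bound parse runs in a child process, so the parent's event loop is free to do other work while it waits.

```python
import asyncio
import json
import sys

# Source of the hypothetical worker: it reads CSV text on stdin, does the
# CPU-bound parsing in its own process, and writes the rows back as JSON.
WORKER_SOURCE = """
import csv, json, sys
rows = list(csv.reader(sys.stdin))
json.dump(rows, sys.stdout)
"""


async def parse_in_child(text: str):
    """Parse CSV text in a child process without blocking the event loop."""
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", WORKER_SOURCE,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
    )
    # communicate() feeds stdin, closes it, and waits for the child to exit.
    out, _ = await proc.communicate(text.encode())
    if proc.returncode != 0:
        raise RuntimeError(f"worker exited with status {proc.returncode}")
    return json.loads(out)
```

Calling `asyncio.run(parse_in_child(some_text))` runs it from ordinary code. The cost is the one alva complains about later: every call pays for starting a process and serialising the data both ways.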
<alva>
I see. If I have to write C++ I might as well write the whole thing in it.
<alva>
At least if I can use C++ >= 11
<jfhbrook>
there's definitely an argument for using a different tool if you're strongly CPU bound rather than IO bound, which could look like java or possibly C++
<jfhbrook>
on the other hand, you might want to look harder at your problem and see if you're really, well
<jfhbrook>
if you really have that big of a problem parsing a CSV in a blocking way
<alva>
I'm just whining, it's my own fault for using JS for this.
<jfhbrook>
it could very well be that you just need to use more of the streaming APIs rather than trying to parse 100k lines buffered into memory all at once
<jfhbrook>
at least, for all I know
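A rough illustration of the streaming approach, assuming Python's standard `csv` module rather than any particular node library. The function name `sum_column` and the sample column name are made up for the example. Rows are parsed one at a time as they are read, so the whole file never has to sit in memory and each step does only a small amount of work.

```python
import csv
import io
from typing import TextIO


def sum_column(source: TextIO, column: str) -> float:
    """Add up one numeric column while reading the CSV row by row."""
    total = 0.0
    # DictReader pulls lines from `source` lazily, one row per iteration,
    # instead of buffering the entire input first.
    for row in csv.DictReader(source):
        total += float(row[column])
    return total


# Works the same on an open file, sys.stdin, or an in-memory buffer.
sample = io.StringIO("name,size\na,1.5\nb,2.5\n")
print(sum_column(sample, "size"))
```

The same shape applies to a 100k-line file: pass the open file object and only the running total is kept.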
<alva>
Distracting myself from my PL. Which coincidentally streams source code into the tokenizer and evaluates the resulting stream of tokens, so the program doesn't need to even fit in memory.
<jfhbrook>
I think for web stuff it's fairly unusual to run into something that's so CPU bound that node is wholly inappropriate rather than less-than-ideal
<jfhbrook>
and that includes data munging stuff
<alva>
I just parsed like 4.5 MB, didn't think it'd be an issue
<jfhbrook>
you could also just be using a shitty library, for all I know
<jfhbrook>
I was generating a streaming CSV the other day and I didn't have any real perf issues with it
<alva>
Likely. Just googled node csv
<jfhbrook>
try npm's search, and if that fails try the cli's search which is awful and slow but last I checked was sometimes more relevant
<jfhbrook>
I think this is a sturgeon's law thing for you
<jfhbrook>
combined with the fact that most data people would use like R or python or even excel for working with csv data
<jfhbrook>
like I said, I think R or python would be appropriate
<jfhbrook>
and hell if you like ruby I bet ruby can do it too
<alva>
I don't know what I like, still looking. I like C for being small, usable everywhere, and reasonably 1:1 to the resulting code. But no stdlib to speak of, so it's a PITA for small tasks.
<alva>
I'd like to, er... like a scripting language too, I did some perl6, it was interesting.
<alva>
Like a swiss army knife unfolding into more swiss army knives forever
<alva>
Maybe I can use my own once it works a bit more.
<alva>
I like Haskell a bit, when efficiency isn't a concern
<jfhbrook>
well I like javascript, but it has its quirks and sometimes there's a huge pile of shit libraries to dig through in a way that you don't see as much in some other more traditional scripting language ecosystems
<jfhbrook>
don't get me wrong, they have their weaknesses too
<alva>
Python is a mixed bag in that regard, but I do appreciate the breadth of the stdlib
<jfhbrook>
but in terms of standard obvious libraries to do things with, whether they're in the actual stdlibs (of which js's is very small)
<alva>
Rarely even have to look for libs
<jfhbrook>
well I can say that for data science you'll probably have pretty good luck with python, it's pretty popular amongst post-matlab/r science/engineer types
<jfhbrook>
so has a lot of those tools
<jfhbrook>
keep in mind neither of those things magically get rid of the blocking problem, python has a GIL so
<jfhbrook>
similar situation to node/js there
<alva>
Yeah :/
<alva>
Subprocesses leave a bad taste...
<jfhbrook>
you could also truck along until it's actually a problem that you can't solve by writing smarter code
<jfhbrook>
also keep in mind that pegging a cpu isn't a bad thing, you might have to cluster your services if that's what you're doing but that's not as bad as shelling out I don't think
<alva>
A lot of things are problematic on my devices :D
<alva>
Most of my job involves small obnoxious devices too and I guess it follows me around
<eligrey>
guise im hirin
<eligrey>
node.js devs pls
<eligrey>
im working on a project and it got really big and intimidating
<eligrey>
you need lots of PKI, WebRTC, and general JS knowledge
<eligrey>
client-side pki only (no node.js pki needed)
<eligrey>
lots of webrtc stuff for both client and server side though
<eligrey>
90% of what's left to be done is webrtc stuff
<eligrey>
jfhbrook: you'll probably be interested
<eligrey>
i'm making a p2p cdn of sorts
<eligrey>
alias [domain] to [my service's domain] and magically get your site accelerated by the users on it
<eligrey>
so for high traffic sites most requests will actually be fulfilled by your users
<jfhbrook>
oh cute
<jfhbrook>
is anyone doing that in the wild right now?
<eligrey>
nope
<jfhbrook>
does it require the users to install anything dumb?
<eligrey>
nope
<eligrey>
the closest thing is webtorrent
<eligrey>
but that's very limited and only really applicable to static files
<eligrey>
my thing is like webtorrent but for entire dynamic sites
<eligrey>
once you cname/alias to my domain every request returns the same html page with a single script
<jfhbrook>
so it does involve requiring javascript
<eligrey>
then it connects to peers to get the data, and if no peers are online, it connects to [name of my service].[current domain]
<eligrey>
yeah it uses js
<eligrey>
i have all of the client-side js finished sans webrtc and pki code
<jfhbrook>
who would use it?
<jfhbrook>
like who's your target audience?
<eligrey>
any mid-high traffic website that wants to reduce load
<jfhbrook>
like do you know of someone that's not worried about time-to-render or a11y enough to use that?
<eligrey>
it will be a free service
<eligrey>
+ a paid enterprise on-premises version
<eligrey>
jfhbrook: all of that can be handled by the content site's code
<eligrey>
i can explain in more detail under nda with you
<eligrey>
if you're interested
<jfhbrook>
not enough to sign an nda, nah
<eligrey>
quick informal nda
<eligrey>
jfhbrook: just went to see your resume and i have one big issue
<eligrey>
content-disposition: attachment omg please no
<eligrey>
i like using my browser's built-in pdf reader
<jfhbrook>
hahahaha
<jfhbrook>
is that a github thing?
<jfhbrook>
I should warn you right now, I'm on the tail end of a spiritual journey where it's looking like I'm doubling down on my current job
<alva>
I worked on the p2p stuff in Spotify when it was a thing, but we quickly found out that US people have data caps and get mad if you use their upload
<alva>
...when launching in the US, oops
<eligrey>
alva: yeah, but fortunately i won't be responsible for the users' upload
<eligrey>
the free service is well... free
<eligrey>
and the enterprise one runs on other people's servers
<eligrey>
you're exchanging your upload to help reduce the costs of a website
<eligrey>
alva: i think i may have a 'pro' tier though that gives you access to the api to opt users out of peering
<eligrey>
i guess if your site already has some kind of premium membership thing you'd want to opt out those users from peering
Rurik has joined #elliottcable
Sorella has quit [Quit: Connection closed for inactivity]