<t0m>
morning all, I'm trying to run the twitter trending hashtags example after refactoring the code a bit but I'm getting parition not defined in the twitter_wallaroo_app.py, does the declaration order matter like with node? or is this just user error?
<jonbrwn>
morning t0m: do you have the exact error you're getting? might also be easier to debug if the code is hosted somewhere where we can look at it
<t0m>
it should be line 11 in the twitter_wallaroo_app.py, i don't know if i defined partition right or if i need to switch from to_stateful to just to
<jonbrwn>
ok, I'll get back to you in a few
<t0m>
no worries at all
<SeanTAllen>
t0m: i would avoid to_stateful
<SeanTAllen>
its really for toys only
<t0m>
sounds good, thank you
<SeanTAllen>
well its not for toys
<SeanTAllen>
but the number of applications that want to_stateful as compared to a partition is probably low
<SeanTAllen>
to_stateful isnt parallel so...
<SeanTAllen>
it's slower, but it provides ordering guarantees across every message, whereas a partition only guarantees ordering per key
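To illustrate the trade-off SeanTAllen describes, here is a minimal sketch in plain Python. It is not Wallaroo's API: `partition_by_key`, the lane count, and the sample hashtags are all hypothetical. It only shows why routing by key keeps order within a key while letting different keys be processed independently.

```python
import zlib
from collections import defaultdict


def partition_by_key(messages, key_fn, num_partitions):
    """Route each message to a lane chosen by hashing its key.

    Hypothetical helper for illustration only. Messages that share a key
    always land in the same lane, in arrival order, so ordering holds per
    key. Nothing is promised about ordering between different lanes.
    """
    lanes = defaultdict(list)
    for msg in messages:
        key = key_fn(msg)
        # crc32 is used so the lane choice is stable across runs.
        lane = zlib.crc32(key.encode("utf-8")) % num_partitions
        lanes[lane].append(msg)
    return dict(lanes)


# (hashtag, sequence number) pairs in arrival order.
messages = [
    ("#pony", 1),
    ("#wallaroo", 1),
    ("#pony", 2),
    ("#wallaroo", 2),
    ("#pony", 3),
]

lanes = partition_by_key(messages, lambda m: m[0], num_partitions=4)
```

A single stateful step would correspond to `num_partitions=1`: one lane, one global order, and no parallelism.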
t0m has quit [Ping timeout: 240 seconds]
<jonbrwn>
t0m: looks like you accidentally misspelled `def partition` as `def parition`
<jonbrwn>
correcting that should fix your problem
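On the declaration-order question: a minimal sketch of how Python resolves names, using hypothetical function names since t0m's actual code isn't shown here. A module-level name is looked up when the code that uses it runs, so a function can refer to one defined further down the file, but a misspelled `def` leaves the expected name undefined.

```python
# Order: a function body may refer to a name defined later in the file,
# as long as that name exists by the time the function is called.
def uses_later():
    return defined_later("ok")


def defined_later(value):
    return value.upper()


result = uses_later()


# Typo: the def below creates `parition`, so `partition` never exists.
def parition(data):  # misspelled on purpose
    return data[0]


def build_app():
    # Stand-in for handing the partition function to the app builder.
    return partition


try:
    build_app()
    message = None
except NameError as err:
    message = str(err)
```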
vaninwagen has joined #wallaroo
<vaninwagen>
hi folks, i am looking into getting syslog messages into wallaroo. So far there are 2 viable options, both with their own difficulties.
<nisanharamati>
Hi vaninwagen. What are the options and difficulties?
<vaninwagen>
there is UDP, where according to https://tools.ietf.org/html/rfc5426#section-3 each UDP datagram must contain 1 syslog message - difficulty here is having a UDPSource for wallaroo
<vaninwagen>
what would be minimal requirements for writing such a source?
<nisanharamati>
Hmm. It sounds like you're looking through the syslog output configuration options, right?
<vaninwagen>
the other option is using TCP transport, leveraging the existing TCPSource, rsyslog supports plain TCP transport
<vaninwagen>
nisanharamati: yes, exactly :)
<nisanharamati>
for UDP you'd need to have a bridge or proxy app that receives the UDP data from syslog, and sends it to Wallaroo's input address
<nisanharamati>
It might be similar for the TCP output as well.
<vaninwagen>
my initial idea was to send from my local rsyslog directly to wallaroo
<nisanharamati>
the biggest compatibility issue for TCP I think is the format
<nisanharamati>
Wallaroo needs a length header to know how many bytes to read
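A rough sketch of the bridge idea combined with the length header, assuming the Wallaroo application's decoder expects a 4-byte big-endian length prefix (the header size is whatever the application declares, so treat that as an assumption). `frame` and `relay_one` are hypothetical helper names, not part of Wallaroo.

```python
import socket
import struct


def frame(payload: bytes) -> bytes:
    """Prefix payload with its length as a 4-byte big-endian integer.

    Assumption: the receiving decoder is configured for a 4-byte header.
    """
    return struct.pack(">I", len(payload)) + payload


def relay_one(udp_sock, tcp_sock) -> int:
    """Read one UDP datagram and forward it, framed, over a TCP stream.

    Per RFC 5426 each datagram carries one syslog message, so one
    datagram maps to exactly one framed message.
    """
    datagram, _addr = udp_sock.recvfrom(65535)
    tcp_sock.sendall(frame(datagram))
    return len(datagram)


def run_bridge(listen_addr, wallaroo_addr):
    """Forward datagrams until interrupted. Addresses are placeholders."""
    udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    udp_sock.bind(listen_addr)
    tcp_sock = socket.create_connection(wallaroo_addr)
    try:
        while True:
            relay_one(udp_sock, tcp_sock)
    finally:
        tcp_sock.close()
        udp_sock.close()
```

Something like `run_bridge(("0.0.0.0", 5140), ("127.0.0.1", 7010))` would start it; both ports are made up and would have to match the rsyslog output and the Wallaroo input address.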
<SeanTAllen>
vaninwagen: there's one other option that most folks couldn't do, but you can. you could add a UDP source to Wallaroo by writing some pony code.
<vaninwagen>
SeanTAllen: that is exactly what my question above was aiming at :)
aturley has joined #wallaroo
<SeanTAllen>
Given you know Pony, you could probably take the example of our TCPSource and write a UDPSource based on it and the standard UDP stuff in Pony
<vaninwagen>
SeanTAllen: yeah, looking into it... quite a machine! :)
<SeanTAllen>
I am most responsible for TCPSource so I can probably answer any questions you have, feel free to ask away here in general. Someone can probably help if I'm not around.
t0m has joined #wallaroo
aturley has quit [Quit: aturley]
<vaninwagen>
just to get the bigger picture, i write my UDPSource, which gets a bunch of builders upon creation and my custom syslog SourceHandler which does the decoding into some class/data-structure. Then, after decoding it i send it down the pipeline using a built runner.
<SeanTAllen>
Ya
t0m has quit [Ping timeout: 240 seconds]
<vaninwagen>
ok, the harder part seems to be keeping track of all the topology changes and the router stuff
<SeanTAllen>
well
<vaninwagen>
but hey, i can copy and paste :)
<SeanTAllen>
that should be handled in Wallaroo
<SeanTAllen>
you need to provide the builders and what not
<vaninwagen>
yes i mean all the whatnots
<SeanTAllen>
are you planning on using the Python API? If yes, there's a bit of work to support UDP there as well
<SeanTAllen>
If it's something you are interested in, we can set something up where jtfmumm could walk you through the details of what needs to be done
<vaninwagen>
would it be crazy to stick to the pony API?
<SeanTAllen>
there's no documentation and it's not supported, other than that, no, have at it if you want.
<SeanTAllen>
anything you find though, we'll be happy to help with via IRC.
<vaninwagen>
given the preferences of people in my company i'm afraid if anything, it will be the go api
<SeanTAllen>
well most of the work for the UDP source would be in Pony, then a small amount of Python or Go glue code