morning all, I'm trying to run the Twitter trending hashtags example after refactoring the code a bit, but I'm getting "partition not defined" in twitter_wallaroo_app.py. Does the declaration order matter like with Node? Or is this just user error?
morning t0m: do you have the exact error you're getting? might also be easier to debug if the code is hosted somewhere where we can look at it
it should be line 11 in twitter_wallaroo_app.py. I don't know if I defined partition right or if I need to switch from `to_stateful` to just `to`
ok, I'll get back to you in a few
no worries at all
t0m: I would avoid to_stateful
it's really for toys only
sounds good, thank you
well, it's not for toys
but the number of applications that want to_stateful rather than a partition is probably low
to_stateful isn't parallel so...
it's slower, but it provides ordering guarantees across every message, whereas a partition only provides ordering guarantees per key
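To make the ordering point concrete, here is a minimal sketch in plain Python (not the Wallaroo API). The names `route_by_key` and `lanes` are made up for illustration. It shows that once messages are split by key, order is only preserved within each key's lane, which is what allows the lanes to be processed in parallel.

```python
from collections import defaultdict

def route_by_key(messages, partition):
    """Group messages into per-key lanes, keeping arrival order inside each lane."""
    lanes = defaultdict(list)
    for msg in messages:
        lanes[partition(msg)].append(msg)
    return dict(lanes)

# Arrival order across all messages: 1, 2, 3, 4
messages = [("#a", 1), ("#b", 2), ("#a", 3), ("#b", 4)]

# Partitioning by hashtag: each lane keeps its own order, but nothing
# says whether ("#b", 2) is handled before or after ("#a", 3).
lanes = route_by_key(messages, lambda msg: msg[0])
```

A single non-partitioned state would instead see 1, 2, 3, 4 in exactly that order, at the cost of handling them one at a time.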
t0m has quit [Ping timeout: 240 seconds]
t0m: looks like you accidentally misspelled `def partition` as `def parition`
correcting that should fix your problem
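A stripped-down reproduction of that kind of error, assuming nothing about the actual application code: the function is defined under a misspelled name, so looking up the correctly spelled one fails at runtime. `lookup_partition` is a hypothetical helper used only to capture the error message.

```python
def parition(data):  # typo: intended name was `partition`
    return data[0]

def lookup_partition():
    """Try to reference `partition`, which was never defined under that name."""
    try:
        return partition  # raises NameError when this line runs
    except NameError as err:
        return str(err)

message = lookup_partition()
print(message)
```

Python resolves names when the line executes, so a function only has to exist by the time it is referenced, but it does have to exist under exactly that spelling.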
vaninwagen has joined #wallaroo
hi folks, I am looking into getting syslog messages into Wallaroo. So far there are 2 viable options, both with their own difficulties.
Hi vaninwagen. What are the options and difficulties?
there is UDP, where according to https://tools.ietf.org/html/rfc5426#section-3 each UDP datagram must contain exactly 1 syslog message. The difficulty here is having a UDPSource for Wallaroo
what would be minimal requirements for writing such a source?
Hmm. It sounds like you're looking through the syslog output configuration options, right?
the other option is using TCP transport, leveraging the existing TCPSource. rsyslog supports plain TCP transport
nisanharamati: yes, exactly :)
for UDP you'd need to have a bridge or proxy app that receives the UDP data from syslog, and sends it to Wallaroo's input address
It might be similar for the TCP output as well.
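A rough sketch of such a bridge in Python, using only the standard library. The addresses, the function names `relay_once` and `run_bridge`, and the 4-byte big-endian length prefix are all assumptions for illustration; the framing has to match whatever the decoder on the Wallaroo side is configured to read.

```python
import socket
import struct

def relay_once(udp_sock, tcp_sock, max_datagram=65535):
    """Receive one UDP datagram and forward it over TCP with a length prefix."""
    payload, _addr = udp_sock.recvfrom(max_datagram)
    # Assumed framing: 4-byte big-endian length, then the raw syslog message.
    tcp_sock.sendall(struct.pack(">I", len(payload)) + payload)
    return len(payload)

def run_bridge(listen_addr, wallaroo_addr):
    """Forward datagrams forever; both addresses are (host, port) tuples."""
    udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    udp_sock.bind(listen_addr)
    tcp_sock = socket.create_connection(wallaroo_addr)
    try:
        while True:
            relay_once(udp_sock, tcp_sock)
    finally:
        udp_sock.close()
        tcp_sock.close()

# Hypothetical usage (ports are placeholders, not Wallaroo defaults):
# run_bridge(("127.0.0.1", 5514), ("127.0.0.1", 7010))
```

Since each datagram carries one syslog message, one datagram maps to one framed message and the bridge never has to parse the syslog content itself.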
my initial idea was to send from my local rsyslog directly to wallaroo
the biggest compatibility issue for TCP I think is the format
Wallaroo needs a length header to know how many bytes to read
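For illustration, this is roughly what reading length-prefixed messages off a byte stream looks like. It is a generic sketch, not Wallaroo's actual source code, and the 4-byte big-endian header is an assumption. `read_exact` and `read_message` are made-up helper names.

```python
import io
import struct

HEADER_LEN = 4  # assumed header size in bytes

def read_exact(stream, n):
    """Read exactly n bytes or raise if the stream ends early."""
    data = b""
    while len(data) < n:
        chunk = stream.read(n - len(data))
        if not chunk:
            raise EOFError("stream ended mid-message")
        data += chunk
    return data

def read_message(stream):
    """Return the next payload, or None on a clean end of stream."""
    first = stream.read(1)
    if not first:
        return None
    header = first + read_exact(stream, HEADER_LEN - 1)
    (length,) = struct.unpack(">I", header)  # header says how many bytes follow
    return read_exact(stream, length)

# Two framed messages back to back, as they might arrive on a TCP connection.
stream = io.BytesIO(struct.pack(">I", 5) + b"hello" + struct.pack(">I", 2) + b"hi")
messages = list(iter(lambda: read_message(stream), None))
```

Plain rsyslog TCP output does not add such a header, which is why the format is the sticking point.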
vaninwagen: there's one other option that most folks couldn't do, but you can. You could add a UDP source to Wallaroo by writing some Pony code.
SeanTAllen: that is exactly what my question above was aiming at :)
aturley has joined #wallaroo
Given you know Pony, you could probably take the example of our TCPSource and write a UDPSource based on it and the standard UDP stuff in Pony
SeanTAllen: yeah, looking into it... quite a machine! :)
I am most responsible for TCPSource so I can probably answer any questions you have, feel free to ask away here in general. Someone can probably help if I'm not around.
t0m has joined #wallaroo
aturley has quit [Quit: aturley]
just to get the bigger picture: I write my UDPSource, which gets a bunch of builders upon creation, and my custom syslog SourceHandler, which does the decoding into some class/data structure. Then, after decoding, I send it down the pipeline using a built runner.
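As a sketch of the decoding step only, written in Python rather than Pony and not tied to any Wallaroo interface: the `<PRI>` prefix of a syslog message encodes facility and severity as `facility * 8 + severity`. `SyslogMessage` and `decode_syslog` are hypothetical names, and the rest of the message is deliberately left unparsed.

```python
import re
from dataclasses import dataclass

PRI_RE = re.compile(rb"^<(\d{1,3})>")

@dataclass
class SyslogMessage:
    facility: int
    severity: int
    rest: bytes  # everything after <PRI>, left undecoded in this sketch

def decode_syslog(datagram: bytes) -> SyslogMessage:
    """Split the priority value off the front of one syslog datagram."""
    match = PRI_RE.match(datagram)
    if match is None:
        raise ValueError("missing <PRI> prefix")
    pri = int(match.group(1))
    if pri > 191:  # 23 facilities * 8 + severity 7 is the maximum
        raise ValueError("priority value out of range")
    return SyslogMessage(pri // 8, pri % 8, datagram[match.end():])

decoded = decode_syslog(b"<34>1 2003-10-11T22:14:15.003Z host app - - - msg")
```

A real handler would go on to parse the timestamp, hostname and so on from `rest` before sending the result down the pipeline.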
t0m has quit [Ping timeout: 240 seconds]
ok, the harder part seems to be keeping track of all the topology changes and the router stuff
but hey, i can copy and paste :)
that should be handled in Wallaroo
you need to provide the builders and what not
yes i mean all the whatnots
are you planning on using the Python API? If yes, there's a bit of work to support UDP there as well
If it's something you are interested in, we can set something up where jtfmumm could walk you through the details of what needs to be done
would it be crazy to stick to the Pony API?
there's no documentation and it's not supported. Other than that, no, have at it if you want.
anything you find though, we'll be happy to help with via IRC.
given the preferences of people in my company, I'm afraid if anything, it will be the Go API
well most of the work for the UDP source would be in Pony, then a small amount of Python or Go glue code