<jonbrwn>
this particular bit is a demo value generator; you'd need some meaningful data for Wallaroo to ingest
<zodiac403>
Do I have to implement this code in my app or can I make Wallaroo call this?
<zodiac403>
> you'd need some meaningful data for Wallaroo to ingest
<zodiac403>
I have a script that is publishing messages on a given Topic on Redis.
<zodiac403>
But that currently is faaar separated from the Wallaroo processes...
<jonbrwn>
> I have a script that is publishing messages on a given Topic on Redis.
<jonbrwn>
are the messages in a format that the alerts_stateless application can process or is it for a different application?
<zodiac403>
Format is a simple string "foobar".
<zodiac403>
yet
<zodiac403>
> Do I have to implement this code (the redis subscribe) in my app or can I make Wallaroo call this?
<jonbrwn>
the code that generates data should be implemented outside of Wallaroo
<zodiac403>
Sorry, I'll be offline in a few minutes to catch my train...
<jonbrwn>
no worries, did that at least help in getting that bit of the Redis setup explained?
<jonbrwn>
either way, do feel free to ping us when you're available so we can help.
<zodiac403>
Not yet... How can I tell the Wallaroo Pipeline to actually call the redis_subscriber_source and e.g. pass on my TOPIC to subscribe to?
<zodiac403>
Did I miss an important part in your documentation?
<jonbrwn>
it's very possible that we didn't describe it in the best way for users to consume
<zodiac403>
Thx so far, Jonathan. I'll try to come back later...
<zodiac403>
;)
zodiac403 has quit [Quit: Page closed]
zodiac403 has joined #wallaroo
<zodiac403>
Hi @jonbrwn, are you still there?
<jonbrwn>
yes @zodiac403
<jonbrwn>
if you wanted to continue to use the `alerts_stateless` application, I'd recommend setting up an external script which uses the generator code that I linked to above to publish to your Redis topic
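A minimal sketch of what that external publisher script could look like, assuming the redis-py client, a Redis server on localhost:6379, and a made-up topic name and payload format (the generator code linked above is not reproduced here):
```python
# Hypothetical stand-in for the demo value generator: publishes fake
# "id,amount" transactions to a Redis topic so the connector has data
# to pick up. Topic name, payload format, and value ranges are made up.
import random
import time

import redis

r = redis.Redis(host="localhost", port=6379)

while True:
    transaction_id = random.randint(1, 10000)
    amount = random.uniform(1.0, 500.0)
    r.publish("mytopic", "%d,%.2f" % (transaction_id, amount))
    time.sleep(0.1)
```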
<jonbrwn>
then we can get you set up with the Redis connector source
<jonbrwn>
you'll end up having 2 separate processes: one for the redis source connector and one for the wallaroo application
<jonbrwn>
you will also need a Redis Sink Connector process if you plan to write out to Redis from Wallaroo
<zodiac403>
> "one for the redis source connector and one for the wallaroo application"
<zodiac403>
How do these 2 communicate with each other?
<jonbrwn>
when you pass the application module to the connector, it uses the port defined in the SourceConnectorConfig to send the data over to the running Wallaroo application, which loads the same module
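For concreteness, a rough sketch of how that shared configuration might look inside the application module; the `wallaroo.experimental.SourceConnectorConfig` call, its arguments, the port, and the stub encoder/decoder are assumptions and may differ between Wallaroo versions:
```python
# Sketch of the connector side of the application module (API names and
# signatures are assumptions; check the connectors docs for your version).
# Both the connector script and the Wallaroo worker import this same module,
# so they agree on the connector name, the encoder/decoder pair, and the port.
import wallaroo
import wallaroo.experimental

def encode_feed(data):
    # runs in the connector process: Redis payload -> bytes on the wire
    return data

def decode_feed(bs):
    # runs in the Wallaroo worker: bytes on the wire -> application object
    return bs

source_config = wallaroo.experimental.SourceConnectorConfig(
    "c",                  # must match the --connector argument
    encoder=encode_feed,
    decoder=decode_feed,
    port=7000)            # the port both processes use to talk to each other
```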
<zodiac403>
ok. give me some minutes...
<zodiac403>
;)
<jonbrwn>
no worries, I'll check back in a few
zodiac403 has quit [Ping timeout: 256 seconds]
zodiac403 has joined #wallaroo
<zodiac403>
Hi @jonbrwn, sorry, got a little distracted.
<zodiac403>
Now I have alerts.py and redis_subscriber_source.py in my working directory.
<jonbrwn>
what's in your redis_subscriber_source.py file?
<zodiac403>
I try to start it with the following line but that doesn't feel right. ```python redis_client.py --application-module alerts --connector c --c-topic mytopic --out 127.0.0.1:5555 --metrics 127.0.0.1:5001 --control 127.0.0.1:6000 --data 127.0.0.1:6001 --name worker-name --external 127.0.0.1:5050 --cluster-initializer --ponythreads=1 --ponynobl```
<jonbrwn>
so you'd run `python redis_subscriber_source.py ...` instead of `redis_client.py`
<jonbrwn>
and a few of those args look off
<jonbrwn>
give me a few
<zodiac403>
argh, yeah copy-paste-error in the chat
<strmpnk>
A note on the command line arguments: we usually pass the extra arguments so that your application can parse out the same fields when loaded from either the connector script or the wallaroo worker. It's worth noting you can leave many of those off, like --name, --data, --control, --external and so on. You do need --application-module, --connector, and any connector-required arguments (as well as any arguments your application script expects to parse).
<strmpnk>
The --c- prefix, in this case, is there to avoid any command line argument name conflicts between the connector-instance-specific parameters and the wallaroo ones. TL;DR: if you're not sure, it's safe to pass all of the wallaroo arguments through.
<jonbrwn>
@zodiac403 I'm stepping out for a few, @strmpnk will help you while I'm gone
<strmpnk>
zodiac403: I wrote the original connector script you're using so you can blame me if it's unclear. ;-) I'm happy to help port over the code.
<zodiac403>
thx ;)
<zodiac403>
As you can read above, I have the `alerts_stateless` sample up and running. But I feel a little lost connecting it to a Redis subscriber.
<zodiac403>
If you have a working example or a piece of docs I missed so far, I'd be happy.
<strmpnk>
Okay. So let's look at your application module. Can you paste it in a gist or pastebin? We can walk through that first.
<strmpnk>
Or I can start from the sample and we can do it together if you prefer.
<strmpnk>
Okay. It looks good. There is a minor note, depending on which version of python/machida you're running. The decoder is returning the byte-string directly. So if you're expecting a regular python string, you might want to decode that.
<strmpnk>
The main problem, which I think is the error is the port number.
<strmpnk>
--data defines an internal communication port for data between workers. It's possibly too vague of a name.
<strmpnk>
If you change that port in the source configuration to something like 7000 it should be able to listen for connections properly.
<zodiac403>
Which port do you mean?
<strmpnk>
Line 37 in the gist.
<strmpnk>
It's a connector instance specific port, which should be separate from the rest.
<strmpnk>
If you run it, it should show that it can connect (not sure it will work just yet since we need to parse data into a working format for the example and this depends on how you put things into redis).
<zodiac403>
OK. I changed the port. But how do I start it?
<zodiac403>
Is `python redis_subscriber_source.py` the correct entry point?
<strmpnk>
Okay. So you can start both processes in any order, but I usually start the worker first. You can run the worker with the same command line as in the example's readme.
<zodiac403>
Yes, I have that "... Alerts (stateless) source attempting to listen on 127.0.0.1:7000 ..."
<strmpnk>
Once that is ready, we can run the script. This should need three arguments: --application-module, --connector, and --c-topic
<strmpnk>
The application module must be in the python path so it can load it and use the proper encoder and discover the port to use.
<strmpnk>
The --connector name should match what you use in the application module, so "c" like you have.
<strmpnk>
And then the connector specific topic with the prefix.
<zodiac403>
Wait, I get a python error...
<strmpnk>
What does it say?
<zodiac403>
I call `python redis_subscriber_source.py --application-module alerts --connector c --c-topic foo`
<strmpnk>
Great. So we're not completely set just yet but before we move on, I can explain any confusion over which arguments were passed and why.
<zodiac403>
OK, I'm back. Currently I call the redis script with `python redis_subscriber_source.py --application-module alerts --connector c --c-topic foo --c-host localhost --c-port 6379 --out 127.0.0.1:5555`
<zodiac403>
Now I get some redis connection issues, but that seems to be part of my docker setup.
<zodiac403>
> ... I can explain any confusion over which arguments were passed and why.
Thanks, but that part already makes sense to me.
<strmpnk>
Ah. Yeah, docker may not be mapping the redis port to localhost. `-p` on the docker command line might help (or the equivalent if you're using other tools to set up the container)
<strmpnk>
So the next step is to adjust encode_feed to serialize what comes off the Redis topic, and then adjust decode_feed to turn those bytes back into a Transaction object on the Wallaroo side.
<strmpnk>
We have many options here. To illustrate with text: [something that writes to the topic] --> [redis] --> [connector (encoder)] --> [(decoder) wallaroo] --> [sink]
<strmpnk>
We can set up whatever format we want as long as (decoder) can turn it into a Transaction object.
<strmpnk>
So redis could use a specific format (JSON, pickled object, text format, something custom).
<strmpnk>
But in this example Transaction has two parts: the transaction id and the amount. So we can pass those however we want.
<strmpnk>
(encode)'s job is to take whatever is in redis and do any translation it needs to for (decode) to do its job.
<strmpnk>
It may not need to do anything, or it could filter/clean up the data. Up to you.
<zodiac403>
cool.
<zodiac403>
Still having some redis connection issues on this side. But that I'll figure out myself.
<strmpnk>
In your example, you have decode passing something from struct. Here, we'd need to construct a Transaction.
<zodiac403>
When the publisher triggers the encoder, which payload type (string / binary / ?) will I expect?
<zodiac403>
DECODER of course...
<strmpnk>
The encoder gets whatever write() passes. The decoder gets the same byte string that encoder returns for each item.
<strmpnk>
(Given that this is all being passed over TCP)
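Putting those pieces together, one possible encode/decode pair, assuming the publisher puts plain `id,amount` strings on the topic and that Transaction can be built from an id and an amount (the example's real class and wire format may differ):
```python
# Illustrative encoder/decoder pair for the format sketched above.
# The Transaction class and the "id,amount" payload are assumptions.
import struct

class Transaction(object):
    def __init__(self, transaction_id, amount):
        self.id = transaction_id
        self.amount = amount

def encode_feed(data):
    # connector side: Redis hands over e.g. b"42,19.99"; pack id and amount
    transaction_id, amount = data.decode("utf-8").split(",")
    return struct.pack(">If", int(transaction_id), float(amount))

def decode_feed(bs):
    # Wallaroo side: unpack the same bytes back into a Transaction object
    transaction_id, amount = struct.unpack(">If", bs)
    return Transaction(transaction_id, amount)
```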