jemc changed the topic of #ponylang to: Welcome! Please check out our Code of Conduct => https://github.com/ponylang/ponyc/blob/master/CODE_OF_CONDUCT.md | Public IRC logs are available => http://irclog.whitequark.org/ponylang | Please consider participating in our mailing lists => https://pony.groups.io/g/pony
brainproxy has joined #ponylang
acarrico has quit [Ping timeout: 268 seconds]
acarrico has joined #ponylang
jemc has quit [Ping timeout: 268 seconds]
_whitelogger has joined #ponylang
PrsPrsBK1 has joined #ponylang
PrsPrsBK has quit [Ping timeout: 250 seconds]
PrsPrsBK1 is now known as PrsPrsBK
brainproxy has quit [Quit: WeeChat 2.3]
jemc has joined #ponylang
endformationage has quit [Quit: WeeChat 2.3]
<rkallos> jemc: Makes sense. Thanks! However, if the struct's member is `Pointer[U8]`, that seems to imply that the space for the array is not (necessarily) contiguous with the struct. As I understand it, in C, an array member in a struct is contiguous with the other fields.
<jemc> gah, you're right - I forgot that detail
<jemc> so if you're dealing with a fixed-length array like that, your best bet is probably the many-`U8`-fields hack
<jemc> and if it's a dynamic-length array, well, it wouldn't be contiguous with the other struct fields and the `Pointer` approach should work
<jemc> rkallos: you *might* be able to get a tuple to work as a fixed-length array? though my brain is a bit fried at this point tonight and I'm not totally sure whether that will work correctly
<jemc> that is, something like `struct Foo let arr: (U8, U8, U8, U8)`
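The layout difference rkallos describes can be checked outside Pony. A minimal sketch using Python's `ctypes`; the struct names `InlineArray` and `WithPointer` are made up for illustration, and this says nothing about how Pony lays out a tuple field in a `struct`:

```python
import ctypes

class InlineArray(ctypes.Structure):
    # C equivalent: struct { uint8_t tag; uint8_t arr[4]; }
    _fields_ = [("tag", ctypes.c_uint8), ("arr", ctypes.c_uint8 * 4)]

class WithPointer(ctypes.Structure):
    # C equivalent: struct { uint8_t tag; uint8_t *arr; }
    _fields_ = [("tag", ctypes.c_uint8), ("arr", ctypes.POINTER(ctypes.c_uint8))]

# The four array bytes live inside the struct, directly after `tag`.
inline_size = ctypes.sizeof(InlineArray)
inline_arr_offset = InlineArray.arr.offset

# Only a pointer lives inside this struct; the bytes it points to are elsewhere.
pointer_field_size = WithPointer.arr.size
```

With an inline array the struct is exactly `1 + 4` bytes; with a pointer member the struct only holds an address, which is why a `Pointer[U8]` field does not match a C fixed-length array member.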
jemc has quit [Ping timeout: 245 seconds]
profetes has joined #ponylang
ExtraCrispy has quit [Remote host closed the connection]
srenatus has joined #ponylang
inoas has joined #ponylang
inoas has quit [Client Quit]
EEVV has joined #ponylang
<EEVV> hey, so I want to make a sort of buffer where I pop data from it (goes through parser) but sometimes the buffer contains partial data (cuts off mid way) so I needed some sort of blocking reads to address this, but now I have an idea and I want to make sure it would work. What if I made it such that the TCP connection would update my buffer by pushing data to it and at the same time I pop from it?
<EEVV> I don't know if this is possible with behaviours tho
<EEVV> otherwise I would have to build a sort of state machine which is annoying since functions essentially encode a state machine and functions are more human readable
<EEVV> with this project the client sends to me the packet length as a varint (similar to protocol buffers), but after the encryption stage the client sends the whole packet encrypted (even the length)
<SeanTAllen> there are no shared data structures protected by lock in Pony EEVV which is what you seem to be describing
<EEVV> SeanTAllen, imagine something like getting this http command `GET / HTTP/1.1` but it is prefixed with a packet length
<EEVV> and the data cuts off midway
<EEVV> so your first `received()` function would be triggered and you get `GET / HT` and second one is `TP/1.1`
endformationage has joined #ponylang
<EEVV> I don't know how to do this nicely without blocking reads
<SeanTAllen> i dont understand how you would do that with blocking reads
<SeanTAllen> what exactly are you blocking on?
<EEVV> in the parser
<EEVV> where it parses the incoming data
<SeanTAllen> sorry i dont understand
<EEVV> it does some stuff like `pop_u32()` and `pop_varint`
<EEVV> takes it from the data array that is given by `received` in TCPConnection
<SeanTAllen> where is "a blocking read" in that?
<EEVV> ok so for example
<Candle> I enable something along those lines with https://github.com/CandleCandle/text-reader
<EEVV> `pop_varint` can take like 7 bytes max but it is a variable amount
<EEVV> and for me it can cut off midway the data
<Candle> But then I almost force a state machine on the user.
<EEVV> if it cuts off I just get an error because my array ran out of elements
<SeanTAllen> i dont understand what you want to do EEVV
<SeanTAllen> you want to do what? wait for 7 bytes of data?
<EEVV> no
palmhead has joined #ponylang
<EEVV> for example let's say it is supposed to send me 7 bytes of data
<SeanTAllen> work stuff. back later.
<EEVV> but for some reason it sends me 3 bytes then 4 bytes
<EEVV> when my function takes these bytes and tries to turn them into a data structure it outputs an error because the first array which has 3 bytes doesn't have enough information to reconstruct the data structure
<EEVV> Candle, not really because sometimes I don't know what I can expect
<EEVV> for example the client would encrypt the whole packet (with packet length) and send it, there is no way for me to know how much to expect
<EEVV> in C I did a sort of leapfrogging buffer approach where when you try to read outside of the buffer it waits for the connection to send in the required data (usually in multiples of 128 bytes)
<EEVV> let me try to give a good example
<Candle> Right, ok, in text-reader, I look-ahead into the buffer, which allows me to know if there's a full line to read.
<EEVV> Candle I will give you an example and you will tell me if you had to solve this problem too
<EEVV> say for example you are receiving a string, the format is like this: first byte is the length (max length = 255) the rest is the string. For example you want to receive 05 68 65 6c 6c 6f
<EEVV> which is the string in bytes for `hello` (prefixed with 1 byte as length)
<EEVV> but when you actually receive you get two different calls to the `received` function one is `05 68 65` the other is `6c 6c 6f`
<EEVV> effectively the packet is split in half but you must join it somehow
<Candle> Yes, I solve something along those lines.
<EEVV> so for me I would have lets say a `pop_string` function that tried to parse the bytes of a string
<EEVV> but since the string size exceeds my data array I get an error...
<EEVV> in C instead of getting an error I would wait for more bytes
<EEVV> but here I am lost
<aturley> EEVV most of the code i've written (and seen from others) uses some sort of state machine to keep track of the state between receiving chunks of data.
<EEVV> that's the annoying part :(
<aturley> buffer until you have what you need, then send that data off for processing.
<EEVV> with a blocking read the state doesn't have to be saved and restored by hand because you don't "go" anywhere else
<EEVV> aturley, not possible
<Candle> EEVV: in pop_string you'd read the first byte, then look at the data array size, if there's not enough, then yes, go into a "waiting for more data" state and hopefully call TCPConnectionNotify/#expect with the difference between what you have and what you need.
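The check Candle describes can be sketched in a few lines. This is an illustration in Python, not Pony API: `pop_string` here is a hypothetical helper that returns `None` for "waiting for more data" and consumes nothing until the whole string has arrived, using the one-length-byte format from EEVV's example.

```python
def pop_string(buf):
    """Return the next length-prefixed string, or None if it is not all here yet."""
    if len(buf) < 1:
        return None                  # not even the length byte has arrived
    length = buf[0]
    if len(buf) < 1 + length:
        return None                  # wait for more data; nothing is consumed
    payload = bytes(buf[1:1 + length])
    del buf[:1 + length]             # consume only once the whole string is present
    return payload.decode("ascii")

buf = bytearray()
results = []
# The two chunks stand in for two separate `received` calls.
for chunk in (bytes.fromhex("056865"), bytes.fromhex("6c6c6f")):
    buf += chunk
    while True:
        s = pop_string(buf)
        if s is None:
            break
        results.append(s)
```

After the first chunk nothing is produced; after the second, `results` holds the single string `hello`. The inner loop also covers the case where one chunk carries more than one string.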
<EEVV> it is not for me to decide who sends the data I am simply sticking to a standard
<EEVV> Candle, but the problem is I would have to make this for every single `pop_X` function
<EEVV> surely there must be a better way
<aturley> EEVV i guess i'm not understanding why this is "not possible". i'm on a call, i'll go back and re-read what you've written about that later to see if i can get a better understanding.
<EEVV> aturley, because I do not control the client...
<Candle> the way I have it, the 'received' function dumps the data array into a buffer, then does a `while buffer.has_line() do dispatch_line(buffer.line()) end`
<Candle> Which you can't really do in your case.
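A rough picture of the loop Candle describes, in Python. Only the names `has_line` and `line` come from the message above; the `LineBuffer` class and its internals are assumptions made for illustration and are not the text-reader library's implementation.

```python
class LineBuffer:
    """Accumulates raw chunks and hands out complete lines only."""

    def __init__(self):
        self._data = bytearray()

    def append(self, chunk):
        self._data += chunk

    def has_line(self):
        return b"\n" in self._data

    def line(self):
        # Split off everything up to the first newline; keep the rest buffered.
        raw, _, rest = bytes(self._data).partition(b"\n")
        self._data = bytearray(rest)
        return raw.rstrip(b"\r").decode("utf-8")

buffer = LineBuffer()
lines = []
# Each chunk stands in for one `received` call; the request line is split mid-way.
for chunk in (b"GET / HT", b"TP/1.1\r\nHost: x\r\n"):
    buffer.append(chunk)
    while buffer.has_line():
        lines.append(buffer.line())
```

This works for line-delimited text because the delimiter tells the buffer when a unit is complete; a binary protocol needs a different "is a unit ready" test.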
<EEVV> Candle, what if I simulate a blocking read?
<EEVV> like let's say
<EEVV> my TCPConnection class would just receive and store in an internal buffer, when someone calls `read_block` it would return the internal buffer and reset the internal buffer to empty?
<EEVV> but maybe this contradicts what Sean said about locks...
<Candle> What is driving the reading of the data? the pony tcp connection gets an event when data arrives, so it needs to handle it and dispatch it. In Java, you tend to do it the other way around, your code is reading from the buffer, blocking when there's nothing to read - pony: the TCP connection is driving; Java the application is driving.
<Candle> (Picking two quite different approaches...)
<Candle> You can dump your data in to a https://stdlib.ponylang.io/buffered-Reader/ and then read it from there.
<EEVV> this will be the Java approach?
<Candle> Dumping into the Reader is more like the blocking approach that you describe (and my Java example)
<Candle> But with Pony being actor based, that's not always a viable approach.
<EEVV> I will try this, thanks
<Candle> Even then if you're doing something like a protobuf, you have a data structure that you are expecting, so you know when it's "ready". In that case, a state machine with a dispatch_payload(..) is the way I'd go about it.
<Candle> Just to complicate things, there's also the issue where you get two payloads in one received(..) call!
<aturley> you can stick it in a loop ...
<EEVV> Candle, I think I have basically implemented that Reader
<EEVV> but will it consume more data when it needs to? Like blocking reads
<aturley> it will raise an error if you try to read more data than the buffer contains.
<EEVV> well.. I basically reinvented the reader then
<EEVV> the problem is I don't want it to error but wait for more data somehow
<profetes> EEVV: how do you know you've read enough, in C? Since payload length is also encrypted, how do you know it's okay to process?
<EEVV> I know it's ok to decrypt because I use AES which is a block cypher, so I just read 128 bytes and decrypt, if I need more just read more and decrypt more
<profetes> maybe the same logic can be applied in pony, while reading more and more, keeping yet-unencrypted data in buffer and testing everytime you receive sth new?
<Candle> You have the "available()" function on the Reader to avoid the error...
<EEVV> but I don't want a state machine
<EEVV> it makes the whole thing harder to understand
<Candle> The built-in Reader is better when you have a mixed set of data types, its `line()` function is poor because it is the only one where you can't be sure of the length of data that you are reading, hence the error raised by it.
<profetes> subsequent reads can be scheduled with Timer (https://patterns.ponylang.io/async/waiting.html), to avoid too much checking, and when data is read - timer can be deactivated. It's a mini state machine :)
<Candle> I've been trying to develop a sensible pattern for that sort of state machine in Pony. Much like the otp patterns in Erlang. I have yet to see/have something that I'm happy with yet.
profetes has quit [Quit: Leaving]
jemc has joined #ponylang
jemc has quit [Quit: WeeChat 2.2]
jemc has joined #ponylang
<SeanTAllen> Candle: welcome to the club!
<SeanTAllen> aturley and i both have as well (as well as many others)
<SeanTAllen> EEVV: how does a "blocking read" know that it shouldn't block anymore?
<EEVV> when there is available data to return
<SeanTAllen> i literally do not know what that means
<SeanTAllen> how do you know there is available data to return?
<EEVV> so like
<EEVV> I would do `read(n)` and it would return me an array (max size of n) of bytes if it can't return any data it will wait until it can
<EEVV> so my connection would have a hidden buffer
<EEVV> when I receive something it goes to that internal buffer
<EEVV> or I should say when my computer receives something...
<SeanTAllen> wait until i can do what?
<SeanTAllen> that is what expect does btw
<SeanTAllen> you say "give me X bytes"
<EEVV> am I really this bad at explaining
<SeanTAllen> and there is a buffer
<SeanTAllen> that you dont see
<EEVV> give me X bytes or less?
<SeanTAllen> X bytes
<SeanTAllen> no more
<SeanTAllen> no less
<SeanTAllen> x bytes
<EEVV> this could work if I do expect(1) every time
<SeanTAllen> not sure why that is the case
<aturley> EEVV are you worried about the case where the client sends *fewer* bytes than it said it would?
<aturley> you've mentioned that the problem is that you "can't control the client", but i don't understand what out-of-control thing the client could do that you're worried about handling.
<aturley> tcp is stream-oriented, so basically your only options are to either say "give me what you've got for now and i'll take care of piecing it together and making sense of it" or "wait until you have this much data and then give it to me, because i'm expecting that much data".
<aturley> it sounds like your problem comes down to wanting a library/framework/something that stores the incoming bytes and only hands them off to you when it has received the correct number. but then it also looks like you're talking about a situation where you receive fewer bytes than expected, so i can't quite figure out how that fits in.
<aturley> again, i seem to not be understanding something.
<EEVV> hmm
<EEVV> I will try to rewrite with expect...
<EEVV> but first how does expect work? Do I just put it in my `received`?
<aturley> it depends on your protocol, but usually if you have a framed protocol then the first few bytes are of known lengths. so you should be able to put an `expect` call in the `connected` method of your notify class that expects however many bytes you want to look at in the beginning of the message.
<aturley> then, in the `received` method you'll read those bytes and call `expect` again with however many more bytes you want to read.
<aturley> if you *don't* call `expect` in the `connected` method then the first call to `received` will have an unknown number of bytes.
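The `connected` / `received` / `expect` cycle aturley describes can be simulated without a socket. In this Python sketch the method names mirror the notify methods mentioned above, but everything else is an assumption: returning the next expected byte count stands in for calling `expect`, `drive` stands in for the connection, the header is taken to be a 4-byte big-endian length, and payloads are assumed to be non-empty.

```python
import struct

class FramedNotify:
    """Two states: waiting for a 4-byte header, or waiting for the payload."""
    HEADER_SIZE = 4

    def __init__(self):
        self.in_header = True
        self.messages = []

    def connected(self):
        return self.HEADER_SIZE              # the first `expect`

    def received(self, data):
        if self.in_header:
            (length,) = struct.unpack(">I", data)
            self.in_header = False
            return length                    # next, expect the payload
        self.messages.append(data)
        self.in_header = True
        return self.HEADER_SIZE              # then expect the next header

def drive(notify, chunks):
    """Stand-in for the connection: deliver exactly the expected byte count."""
    buf = bytearray()
    expected = notify.connected()
    for chunk in chunks:
        buf += chunk
        while len(buf) >= expected:
            data = bytes(buf[:expected])
            del buf[:expected]
            expected = notify.received(data)

notify = FramedNotify()
drive(notify, [b"\x00\x00", b"\x00\x05he", b"llo\x00\x00\x00\x02hi"])
```

However the bytes are split on the wire, `received` only ever sees a whole header or a whole payload, so the state it has to track is a single flag.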
<EEVV> ah well
<SeanTAllen> EEVV: check out the example i sent yesterday
<EEVV> the first bytes I don't know how many there could be
<EEVV> the packet length system this uses is varint
<aturley> oh varint ...
<SeanTAllen> do you mean there's a delimiter where you know when you have "read to the end of this bit" EEVV ?
<aturley> now you've opened a whole new can of worms. :)
<aturley> if the first thing is a varint then i'd throw out `expect` altogether and go back to dropping things into a `Reader` buffer and pulling them out myself.
<aturley> way back in the day sylvan and i talked about adding `varint` support to `Reader` and `Writer` but that never happened.
inoas has joined #ponylang
<EEVV> SeanTAllen, yeah like there the MSB which says if this is the last byte of the varint
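A varint reader that copes with partial input might look like the following Python sketch. It assumes the protocol-buffers convention (low 7 bits of each byte carry data, least significant group first, a set MSB means another byte follows); EEVV's actual protocol is not specified here, and `try_read_varint` and its `max_bytes` parameter are hypothetical names.

```python
def try_read_varint(buf, max_bytes=5):
    """Return (value, bytes_used), or None if the varint is still incomplete."""
    value = 0
    for i in range(min(len(buf), max_bytes)):
        byte = buf[i]
        value |= (byte & 0x7F) << (7 * i)
        if byte & 0x80 == 0:          # MSB clear: this was the last byte
            return value, i + 1
    if len(buf) >= max_bytes:
        raise ValueError("varint longer than max_bytes")
    return None                       # ran out of bytes; try again after more arrive

partial = try_read_varint(bytes.fromhex("ac"))        # continuation bit set, no next byte yet
complete = try_read_varint(bytes.fromhex("ac02"))     # both bytes present
```

Because the function only inspects the buffer and reports how many bytes it would use, the caller decides when to actually consume them.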
<SeanTAllen> so when you say "blocking read" you mean something that understands the protocol and will return meaningful chunks when they are available?
<EEVV> yes
<EEVV> so for example current byte says "I'm not the last one" but my data buffer is empty so the blocking read would pause until there is enough data accumulated to return the integer
<EEVV> with the string example
<SeanTAllen> so this "blocking read" is what? its a loop that is constantly checking a buffer? is that what you imagine how it would work?
<SeanTAllen> what does "pause" mean there?
<EEVV> let's say the client wants to send 05 68 65 6c 6c 6f (first byte is length, rest is string in ascii), but the server (me!) receives it in 2 chunks first one 05 68 65 and second one 6c 6c 6f. The server knows it needs to accept a string so first it checks the size and then it has a simple loop to fetch the remaining bytes
<EEVV> but as I'm doing this in pony
<EEVV> it runs out of data since the first chunk is only 3 bytes big. So an error occurs
<SeanTAllen> how do you know the first byte is the length?
<EEVV> based on prior packets let's just say
<EEVV> and other data such as packet id
<SeanTAllen> "lets just say" is kind of vague
<EEVV> I don't think it is that relevant tho
<SeanTAllen> how do you know that the length is denoted in one byte?
<SeanTAllen> its very relevant
<EEVV> it is the protocol
<SeanTAllen> a protocol i dont know
<EEVV> I just gave an example protocol
<SeanTAllen> that you are asking for help with
<EEVV> I don't think you need to know this protocol
<SeanTAllen> is the length always 1 byte?
<SeanTAllen> you want help with handling the protocol but dont think i need to understand the protocol. i disagree.
<EEVV> for the protocol I'm working on, no
<SeanTAllen> ok well i cant help because i dont understand the protocol.
<SeanTAllen> sorry.
<EEVV> but it doesn't matter because I still would have this "chunking" issue elsewhere
<SeanTAllen> if you say so
<SeanTAllen> i dont understand the protocol so ¯\_(ツ)_/¯
<EEVV> I just want you to understand how I did it before and maybe you can help me translate this into pony somehow
<EEVV> I only want you to look at this simple example of me receiving a string and we are assuming that we know it is a string (for sake of example)
<EEVV> that I'm supposed to receive this data 05 68 65 6c 6c 6f, but my code gets two calls to the `received` function which means that the packet was effectively split
<SeanTAllen> do you understand what aturley meant by "tcp is stream oriented"?
<EEVV> yeah but here it is like pony hides it
<SeanTAllen> how does pony hide it?
<EEVV> in C I would just be able to use the `recv` function iirc
<EEVV> which would wait for new data and return it
<SeanTAllen> recv returns whatever is available
<SeanTAllen> that is all pony is doing
<SeanTAllen> tcp is stream oriented
<SeanTAllen> if you send "FOOBAR"
<SeanTAllen> its a stream not a message
<SeanTAllen> with UDP, you will always get FOOBAR
<SeanTAllen> there is no guarantee with TCP when you call recv that you will get FOOBAR
<SeanTAllen> its a stream
<SeanTAllen> you will get some portion of the stream
<SeanTAllen> getting FOOBAR in a single recv call from TCP is pure happenstance
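Even in the blocking style, code that wants all of FOOBAR has to loop, because each `recv` may return only part of it. A small Python sketch over a local stream socket pair; `recv_exactly` is a hypothetical helper, and this is the blocking approach from C-like environments, not something Pony offers:

```python
import socket

def recv_exactly(sock, n):
    """Block until exactly n bytes have been read from a stream socket."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)   # may return fewer than `remaining` bytes
        if not chunk:
            raise ConnectionError("peer closed before all bytes arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

left, right = socket.socketpair()
left.sendall(b"FOO")                   # two separate sends ...
left.sendall(b"BAR")
message = recv_exactly(right, 6)       # ... read back as one 6-byte unit
left.close()
right.close()
```

The loop is where the waiting happens: the calling thread is parked inside `recv` until more of the stream shows up.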
<EEVV> well let's say I established a new connection, and my server has a `recv` the server would wait for the client to send it data before it can process it hence it blocks
<EEVV> SeanTAllen, if TCP is a stream then I could do something like `get_next_byte()` correct?
<SeanTAllen> here is how the Pony TCPConnection works
<SeanTAllen> it uses epoll to be notified when new data is available on the socket
<SeanTAllen> it reads that and if you arent using expect, calls received() and passes it that data
<SeanTAllen> because TCP is stream oriented
<SeanTAllen> it might get
<SeanTAllen> F
<SeanTAllen> then
<SeanTAllen> OO
<SeanTAllen> then
<SeanTAllen> BAR
<SeanTAllen> it might get
<EEVV> yes
<SeanTAllen> FOOBAR
<EEVV> Sean I get that
<SeanTAllen> ok
<EEVV> I'm trying hard to explain what I want but it isn't working
<SeanTAllen> maybe someone else can help
<SeanTAllen> ive asked for what i think i need to understand
<SeanTAllen> you dont think i need that
<EEVV> maybe I can give some sort of analogy... Like if a user inputs to a program, the program waits, yes?
<SeanTAllen> i dont need an analogy
<aturley> EEVV just to be clear, your message starts with a varint?
<SeanTAllen> and no i wouldnt agree to that analogy
<EEVV> well Sean some part of your program waits and if the input is very important why wouldn't the program wait if it has nothing else to do?
<EEVV> aturley, yes it does
<SeanTAllen> maybe someone else can help, sorry i dont seem to be able to
<EEVV> thanks for trying to understand
<aturley> i think part of the problem is that the example you give has a fixed length piece of data at the beginning, but then the question you're asking is about messages with a variable-length piece of data at the beginning. these are different problems.
<aturley> if you have a fixed-size number at the beginning that tells you how big the rest of the message is then use `expect`.
<EEVV> ok, should I start over?
<aturley> if you have a varint then i'm not sure how even a blocking read would help you.
<EEVV> can I explain why it would help?
<aturley> sure.
<EEVV> say for example I get ff ff ff 00 each MSB = 1 says there is more bytes after the current one
<EEVV> or at least that is what the client sends
<aturley> right.
<aturley> i'm with you so far.
<EEVV> now the way I did it before in C was I would have a `pop_varint` function that would use the `recv` functions for each of those bytes
<EEVV> the `recv` is basically a `get_next_byte` function
<aturley> you mean you would `recv` one byte at a time?
<aturley> ok, got it.
<EEVV> yes something like that
<EEVV> so the point is that
<EEVV> if I get my data in chunks (as in not ff ff ff 00 immediately)
<EEVV> it wouldn't matter, right?
<aturley> when you say "it wouldn't matter", what do you mean?
<EEVV> what I mean is it wouldn't disrupt the function, the output would be the same regardless
<EEVV> agree?
<aturley> ok, i think i see what you mean.
<EEVV> so my point now is that
<aturley> the function doesn't return until it has all the bytes it needs to determine the value of the varint.
<aturley> is that what you mean?
<EEVV> in pony if I would receive ff ff ff 00 in chunks like ff ff then ff 00 it would cause two `received` calls which doesn't work because I made incorrect assumptions...
<EEVV> aturley, yes
<aturley> ok, so we may be talking around each other here.
<aturley> i see what you mean.
<EEVV> so long as you're understanding I'm fine
<aturley> the plain answer is: pony won't let you do that.
<EEVV> I just don't know how to do this well
<aturley> there's no way around that.
<aturley> you can't do it that way in pony.
<EEVV> that's sad for me
<aturley> however, there are ways to read a varint.
<aturley> we've suggested several of them.
<EEVV> yeah
<aturley> but you can't do it by writing a function that uses blocking io.
<EEVV> the one way I saw doing it was using a state machine but this makes it unreadable which is why I hoped for a better solution
<SeanTAllen> you are going to have a state machine somewhere EEVV
<SeanTAllen> "is there enough data for this to be considered a chunk" is a state machine
<SeanTAllen> its a question of where you put it
<aturley> well, in the blocking case the state machine is hidden.
<EEVV> yes but I rather have an "implicit" state machine where the structure of your code determines the state (ifs, whiles, function calls) instead of having to write one explicitly if you know what I mean
<EEVV> before I go rewrite my stuff I will ask what led to this design decision? (not implying anything here just want to know)
<aturley> the point was to not have blocked threads.
<EEVV> miss 'em already :)
<SeanTAllen> pony runs 1 thread per core
<SeanTAllen> and its locked to that core
<SeanTAllen> having more threads than cores leads to performance degradation
<EEVV> oh and since pony uses its own scheduler you would have to essentially make a better scheduler?
<SeanTAllen> im not going to touch that EEVV
<aturley> i don't know if i'd say "better".
<SeanTAllen> thats really insulting
<EEVV> sorry I didn't mean it like that
<aturley> there's a certain set of design tradeoffs.
<EEVV> better as in handle more cases
<aturley> i'm not sure i'd say that either.
<SeanTAllen> yup
<EEVV> I'm new to actor model but I think one thing about actor was that you want to have lightweight threads, correct?
<aturley> but yes, you could write a scheduler with different tradeoffs.
<SeanTAllen> i wouldnt say that either
<aturley> those tradeoffs would bleed over into a bunch of other choices outside of the scheduler.
<aturley> you'd end up with something different.
<aturley> you don't necessarily need lightweight threads.
<SeanTAllen> pony's two primary aims are maximizing performance and safety
<SeanTAllen> those come into conflict sometimes but they are the two primary goals
<EEVV> ok I will go rewrite, I didn't mean to insult anyone I am still learning so I may make some assumptions
<SeanTAllen> m:n threading models with blocking have a large performance impact in return for a "more familiar" programming model
<SeanTAllen> if you wanted to explain the protocol in some depth EEVV, i'd be happy to show you multiple ways that it could be done
<SeanTAllen> the example that i showed you for using expect is very low level, its from Wallaroo where performance is more important than "clean, easy to understand code"
<SeanTAllen> that's a wallaroo tradeoff
<EEVV> I will take the same tradeoff that was my goal too
<SeanTAllen> we will favor doing things that make the code harder to understand if they give a performance boost
<SeanTAllen> i could show you how to do it like the wallaroo way
<SeanTAllen> and it will probably be the fastest way
<EEVV> can I get context what wallaroo is?
<SeanTAllen> i could show you how to do it using multiple actors that is more like what you are thinking
<SeanTAllen> but there will be overhead and it will be slower
<SeanTAllen> Wallaroo is a high-performance stream processing engine written in Pony
<SeanTAllen> It has a Python API that is open source and we also use it to build high-performance data processing applications written in Pony
<EEVV> hmm I will ask: does wallaroo take a "bottom-up" approach?
<EEVV> "top-down" is easier for blocking, you would do something like `pop_my_data` and it might do many other calls to other `pop_X` functions
<SeanTAllen> i dont know what "top-down" and "bottom-up" are. sorry.
<EEVV> with "bottom-up" it would I guess build the current partial data structure and with more bytes coming in complete the data structure
<EEVV> it's ok, maybe you can send me the link?
<SeanTAllen> the link to what?
<EEVV> I won't try to bug you guys anymore I feel like already I did too much damage
<EEVV> wallaroo
<SeanTAllen> feel free to keep asking questions EEVV
<EEVV> oh does that contain pony code?
<SeanTAllen> for you "blocking style"... the closest you could do in Pony is
<EEVV> I think I should do it the pony way
<SeanTAllen> tcpconnection actor receives data... forwards it as is to a collector actor that knows when a chunk is available... any actors that need chunks register with the collector actor and are sent chunks when they are available
<SeanTAllen> in that case, your state machine for "is a chunk available" is isolated to that one collector actor
<SeanTAllen> you can take that same pattern and put it all in a single tcpconnection received if you make the collector a class
<SeanTAllen> then each time you get data
<SeanTAllen> you check to see if a chunk is available
<SeanTAllen> as long as a chunk is available, send it to whoever is interested
<SeanTAllen> the state machine there is in a class rather than an actor
<SeanTAllen> less overhead
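The class-based variant of the collector could be sketched like this in Python. All names (`Collector`, `register`, `_take_chunk`) are hypothetical, plain callbacks stand in for the interested actors, and the framing is assumed to be one length byte followed by the payload purely to keep the example short.

```python
class Collector:
    """Owns the buffer and is the only place that knows when a chunk is ready."""

    def __init__(self):
        self._buf = bytearray()
        self._listeners = []

    def register(self, listener):
        self._listeners.append(listener)

    def received(self, data):
        self._buf += data
        while True:
            chunk = self._take_chunk()
            if chunk is None:
                return                      # nothing complete yet; wait for more data
            for listener in self._listeners:
                listener(chunk)             # listeners never see partial data

    def _take_chunk(self):
        # Assumed framing: 1 length byte, then that many payload bytes.
        if not self._buf or len(self._buf) < 1 + self._buf[0]:
            return None
        length = self._buf[0]
        chunk = bytes(self._buf[1:1 + length])
        del self._buf[:1 + length]
        return chunk

seen = []
collector = Collector()
collector.register(seen.append)
collector.received(bytes.fromhex("056865"))          # partial first message
collector.received(bytes.fromhex("6c6c6f026869"))    # rest of it plus a second message
```

The listeners only ever receive whole chunks, so the "is a chunk available" logic stays in one method.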
<jemc> EEVV: just popping into this conversation, but yeah, the design decision to not allow blocking operations is a very conscious decision in Pony, so it's not something we're interested in changing - it's a feature rather than a bug or deficiency, and using Pony implies making this paradigm shift
<SeanTAllen> but either way there is no actual blocking
<EEVV> jemc, yes I see now
<SeanTAllen> the thing that receives the data from "a collector"
<SeanTAllen> doesnt need to know anything about a state machine
<EEVV> just not used to any other way, you know?
<SeanTAllen> when data is available it will receive it
<EEVV> SeanTAllen, can you link me part of wallaroo which does this state machine?
<SeanTAllen> the wallaroo one is low level
profetes has joined #ponylang
<SeanTAllen> and its using a framed message protocol so, its a simpler state machine
<SeanTAllen> you are either waiting for a header
<SeanTAllen> or you are getting a payload
<EEVV> hmm
<SeanTAllen> the header tells us how big the next payload is
<SeanTAllen> because it is a framed protocol, the header is always 4 bytes in length
<SeanTAllen> that's a very simple state machine
<jemc> EEVV: so, looking at the example you gave where you need to read each byte at a time to know what's coming next, this case is best served by `buffered.Reader` abstraction - you just keep shoveling bytes in one end, and on the other end you can test whether more bytes are available - the `Reader` becomes the basis of your state machine; whereas, as Sean is mentioning, if you have a framed protocol with an
<jemc> explicit length specified up front for the whole chunk, you get a much simpler state machine
<jemc> so, one option is to use a framed protocol on top of your varint-using protocol, so you only receive one message at a time
<jemc> another simplified option is to have a partial function as your parser that just raises an `error` whenever it runs out of bytes, and doesn't consume anything until a final commit for the whole message, so that it will run again later when you get the next chunk; this approach isn't totally performance-optimized because you will waste time parsing when you maybe didn't have to, but it at least provides a
<jemc> simpler, more familiar model where you don't need to keep a state machine outside of what's on the stack
<jemc> that code was written before we had `TCPConnection.expect`, and if I were writing it again today I'd probably optimize it by using `expect`
<jemc> but it still shows a reasonable example of how you can do this in a more familiar way if you're okay with giving up some performance
<jemc> (note that if you're using `error` to indicate "not enough bytes yet", you need to have another way to communicate a protocol error, which in the example I linked above is done using the `SessionNotify` object)
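The "raise when out of bytes, commit only at the end" idea can be illustrated as follows. This is a Python sketch under assumptions, not the code jemc refers to: `Cursor`, `NotEnoughBytes`, `parse_message` and `drain` are hypothetical names, and the message format is taken to be a protobuf-style varint length followed by that many payload bytes.

```python
class NotEnoughBytes(Exception):
    """Signals 'try again later'; it is not a protocol error."""

class Cursor:
    """Reads from a buffer without consuming it; `pos` tracks how far we got."""
    def __init__(self, data):
        self.data = data
        self.pos = 0

    def u8(self):
        if self.pos >= len(self.data):
            raise NotEnoughBytes
        byte = self.data[self.pos]
        self.pos += 1
        return byte

    def take(self, n):
        if self.pos + n > len(self.data):
            raise NotEnoughBytes
        out = bytes(self.data[self.pos:self.pos + n])
        self.pos += n
        return out

def parse_message(cur):
    # Straight-line parser: no explicit state, it simply fails if bytes run out.
    length, shift = 0, 0
    while True:
        byte = cur.u8()
        length |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            break
        shift += 7
    return cur.take(length)

def drain(buf):
    """Parse as many whole messages as the buffer holds; commit each on success."""
    out = []
    while True:
        cur = Cursor(buf)
        try:
            msg = parse_message(cur)
        except NotEnoughBytes:
            return out                 # buffer untouched; re-parse after the next chunk
        del buf[:cur.pos]              # the commit: consume only a fully parsed message
        out.append(msg)

buf = bytearray(bytes.fromhex("0568"))
first = drain(buf)                     # incomplete, nothing consumed
buf += bytes.fromhex("656c6c6f02")
second = drain(buf)                    # "hello" is complete; the trailing 02 stays buffered
```

The cost is the one jemc names: a message that arrives in several chunks is parsed from the start each time, in exchange for a parser that reads like blocking code.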
inoas has quit [Quit: inoas]
<EEVV> hmm
<EEVV> jemc, is a framed protocol one where the header has constant size?
<EEVV> and the header includes the size of the payload?
<SeanTAllen> EEVV: yes
<jemc> yeah
<EEVV> ok
<jemc> there are variations on it, but that's the core requirement - you know how many bytes to `expect` to start with, then reading that data tells you how many bytes to `expect` for the next read
<EEVV> then I cannot use the framed approach
<SeanTAllen> this reminds me jemc. we at Wallaroo Labs have a number of performance improvements to port over to TCPConnection re: expect.
<SeanTAllen> EEVV: that's not entirely true
<SeanTAllen> when you read the varint, you know how large the next chunk is right?
<EEVV> yes
<SeanTAllen> the payload, correct?
<EEVV> yes
<SeanTAllen> you can use expect for that
<SeanTAllen> but its a bit more complicated
<jemc> do you know the max number of bytes the varint can be?
<EEVV> this is true, so header need not be constant size?
<jemc> like, does it cap out at 4 or 8 bytes or something like that?
<EEVV> in my case I think varint can be like 5? Enough to fill 32 bit
<SeanTAllen> so what you can do
<EEVV> expect(5)?
<SeanTAllen> do expect (5) at the start
<SeanTAllen> yes
<SeanTAllen> and if some of those bytes are payload
<EEVV> I see, just do array manipulation
<EEVV> append, right
<SeanTAllen> you do expect(payload_size-extra_bytes_i_have)
<SeanTAllen> for the next round
<EEVV> ah ok
<EEVV> can I complicate this a bit more?
<SeanTAllen> then take your extra bytes and add them to what comes in on the next received
<SeanTAllen> ship them off
<SeanTAllen> and then
<SeanTAllen> back to expect(5) again
<aturley> O.O
<SeanTAllen> yes aturley ?
<SeanTAllen> there is however in that approach EEVV the chance for not getting all data
<SeanTAllen> if the very last thing you got was a varint of 1
<SeanTAllen> and the payload
<aturley> i guess just remember that if length is 1 and you don't get any more bytes for a while then it will be a while before you read again
<SeanTAllen> you'd need to account for that
<aturley> that one, yes
<EEVV> maybe this might work... I just have one thing where after a certain point the packets are encrypted with 128bit AES, I guess after a certain point I could do expect(128) and decrypt, get the header size then do the rest in terms of AES cypher blocks (if that's what they're called)
<EEVV> sorry not header size, header
<SeanTAllen> so it isnt literally expect(payload-bytes_i_have_already)
<aturley> dunno, that feels more complicated to me, but i'm saul and this is between y'all.
<SeanTAllen> because if its 0, then well, its different
<SeanTAllen> its an option aturley
<SeanTAllen> that is all
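The arithmetic behind the `expect(5)` option can be written down as a small function. A Python sketch with assumed names (`split_first_read` is hypothetical) and an assumed protobuf-style varint; it only computes what the next `expect` would ask for and leaves the awkward cases to the caller.

```python
def split_first_read(first):
    """`first` is the block delivered by the initial expect(5)."""
    length = 0
    for i, byte in enumerate(first):
        length |= (byte & 0x7F) << (7 * i)
        if byte & 0x80 == 0:
            used = i + 1               # bytes taken up by the varint itself
            break
    else:
        raise ValueError("no varint terminator in the first read")
    extra = bytes(first[used:])        # payload bytes that arrived with the length
    remaining = length - len(extra)    # what the next expect would ask for
    return length, extra, remaining

# Length 5 followed by the first four bytes of "hello".
length, extra, remaining = split_first_read(bytes.fromhex("0568656c6c"))
```

When `remaining` is zero or negative the first read already holds the whole payload (and possibly the start of the next message), which is the case Sean flags as different. A final short message is also a problem: if the last thing sent is a 1-byte varint plus a payload shorter than 4 bytes, the initial 5-byte read is not satisfied until more data arrives.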
<EEVV> Sean what about this
<SeanTAllen> I think that sticking all the data into something like Reader and after each received, removing any chunks that are ready is the best approach.
<EEVV> say for example I am taking in a string let's isolate the example. What if in my state machine it said I need a string, when I receive it checks the state machine which says I need a string and partially completes the string structure by appending whatever data I have gathered thus far
<SeanTAllen> yes
<EEVV> and once the string is complete my state machine would say something like I need a varint, etc...
<SeanTAllen> yes
<EEVV> this is what I meant by bottom-up
<SeanTAllen> lets not talk about bottom-up and top-down
<EEVV> ok
<SeanTAllen> probably not productive
<SeanTAllen> cos i dont really know what they mean
<SeanTAllen> but yes your general approach is sound
<SeanTAllen> something needs to be doing that
<SeanTAllen> no matter what you do
<SeanTAllen> something needs to be getting the data
<SeanTAllen> and understand the protocol and saying "ah-ha i have a chunk"
<EEVV> I will choose this method, I think it works best with the encryption down the line
<SeanTAllen> i cant comment on the particulars because i dont know the protocol
<EEVV> it is fine, I just need to brainstorm how I will go about doing this for a bit
<SeanTAllen> what were you using before that provided `pop_varint` EEVV ?
<EEVV> SeanTAllen, you want to understand how my current system does this?
<SeanTAllen> did you write pop_varint?
<EEVV> yes
<SeanTAllen> ok
<EEVV> I assumed that all the data wouldn't be "chunked"...
<SeanTAllen> what does "chunked" mean here?
<EEVV> SeanTAllen, the whole TCP is a stream thing you talked about
<SeanTAllen> so you assumed it worked like UDP? arrived as "whole messages"?
<EEVV> SeanTAllen, well I don't know how UDP works but yes "whole messages" as in if a client sends 100 bytes the server should receive 100 bytes, obviously a stupid assumption
<SeanTAllen> thats how UDP works
<EEVV> but UDP also drops packets sometimes right?
<SeanTAllen> UDP has no delivery guarantees that is correct
<SeanTAllen> but it is "message oriented"
<EEVV> hmm ok
<EEVV> I must go now, thanks for the help and the patience
<SeanTAllen> you're welcome
EEVV has quit [Quit: see you tomorrow]
<SeanTAllen> when you have an idea, come back and we can help
palmhead has quit [Ping timeout: 252 seconds]
srenatus has quit [Quit: Connection closed for inactivity]
<profetes> Hi, I missed sync yesterday, I was pretty sure it was today. But today there was nobody, and then I realized the sad truth. Is there a chance to upload the audio from sync a little sooner, pretty please?
<profetes> thanks in advance
travis-ci has joined #ponylang
<travis-ci> ponylang/ponyc#5592 (master - 4293efc : [Main]): The build was broken.
travis-ci has left #ponylang [#ponylang]
<jemc> I think @aturley recorded sync this week ^
<aturley> i did, i'll get it up soon.
<aturley> i’m starting my “Twitch IRC bot in Pony” stream in a few minutes. https://www.twitch.tv/aturls
<aturley> The recording of the January 15, 2019 Pony sync meeting is available:
<jemc> @profetes: ^
<profetes> aturley: thank you, i'll listen to that tomorrow. And to twitch stream as well. goodnight for now.
profetes has quit [Quit: Leaving]
aturley has quit [Ping timeout: 268 seconds]