de_henne_ has joined #stellar-dev
de_henne has quit [Ping timeout: 264 seconds]
loglaunch has quit [Ping timeout: 240 seconds]
loglaunch has joined #stellar-dev
de_henne has joined #stellar-dev
de_henne_ has quit [Ping timeout: 260 seconds]
pixelbeat has quit [Ping timeout: 268 seconds]
TheSeven has quit [Disconnected by services]
[7] has joined #stellar-dev
gst has quit [Quit: leaving]
gst has joined #stellar-dev
de_henne has quit [Remote host closed the connection]
de_henne has joined #stellar-dev
pixelbeat has joined #stellar-dev
stellar-slack has quit [Remote host closed the connection]
stellar-slack has joined #stellar-dev
<stellar-slack> <lab> i started a network with a 4-node quorum slice and 1 failure safety. nodes were started in this order: node1(scp), node2(scp), node3(scp), node4
<stellar-slack> <lab> then i stop node1.
<stellar-slack> <lab> node2 hit a network failure and lost track of consensus. then node3 and node4 lost it too.
<stellar-slack> <lab> but LCL of node2 is 144, and LCL of node3/node4 is 145.
<stellar-slack> <lab> and i can't restart the network without a full reset.
<stellar-slack> <lab> i tried manualclose / replacing the db on node2, but it didn't work.
<stellar-slack> <lab> is it a fork? @jed @monsieurnicolas @matschaffer
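(For context: a 4-node, 1-failure-safety setup like the one lab describes corresponds to a quorum-set threshold of 3 out of 4. A rough sketch of the relevant stellar-core.cfg fragment for that era; the validator keys are placeholders, not the real nodes':)

```
# Hypothetical quorum config, the same on each of the four nodes.
# THRESHOLD = n - f = 4 - 1 = 3: any single validator can fail
# without halting SCP; losing two halts it.
[QUORUM_SET]
THRESHOLD=3
VALIDATORS=[
"GNODE1...",
"GNODE2...",
"GNODE3...",
"GNODE4..."
]
```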
<stellar-slack> <jed> what is the LCL of node 1?
<stellar-slack> <jed> That doesn't sound like a fork. It just sounds like you lost too many servers and SCP couldn't keep going
<stellar-slack> <jed> you can start the network from that state
<stellar-slack> <lab> lcl of node1 is actually 1. i restarted it from newdb and it hasn't synced yet.
<stellar-slack> <jed> oh I meant before it was restarted
<stellar-slack> <jed> so the way to get it back is copy the DB and the bucket dir from the guys on 145 to one of the other nodes
<stellar-slack> <jed> then you forcescp on the 3 with 145
<stellar-slack> <jed> this should start over from 145
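(jed's recovery procedure as a rough shell sketch; hostnames and file paths are hypothetical and depend on each node's DATABASE / BUCKET_DIR_PATH config:)

```
# On the node stuck at 144 (node2): stop core, then overwrite its state
# with a copy from a node that closed ledger 145 (node3 here).
scp node3:stellar-core/stellar.db stellar-core/stellar.db
scp -r node3:stellar-core/buckets stellar-core/buckets

# On each of the three nodes now holding LCL 145: set the force-SCP
# flag, then start core normally. They resume consensus from 145.
stellar-core --forcescp
stellar-core
```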
<stellar-slack> <lab> node1 was restarted before lcl 100
<stellar-slack> <lab> i tried copying the db and bucket dir.
<stellar-slack> <jed> hmm and what happens when you start it after copying?
pixelbeat has quit [Ping timeout: 250 seconds]
pixelbeat has joined #stellar-dev
<stellar-slack> <lab> it just shows overlay INFO logs
<stellar-slack> <lab> i just tried copying db+buckets from node3 -> node1
<stellar-slack> <jed> I mean what happens when you run stellar-core --info
<stellar-slack> <lab> then i started node1 with forcescp. node1, node3, node4 got to 148 before i manually stopped node1
<stellar-slack> <lab> 2015-09-30T00:21:44.046 a96c31 [] [default] INFO { "info" : { "build" : "klm-1.0", "ledger" : { "age" : 456, "closeTime" : 1443543248, "hash" : "a46967ac5e734f48adc6472d0cec9c81a457e570fa90320b0247c027fcd0d27c", "num" : 149 }, "network" : "KLM is a Kilo of xLM; Strllar is an awesome copycat Stellar.", "numPeers" : 2, "protocol_version" :
<stellar-slack> <lab> 2015-09-30T00:22:01.106 a96c31 [] [default] INFO { "you" : "a96c31" }
<stellar-slack> <lab> above message is from node2
<stellar-slack> <jed> so what is the state of the network now?
<stellar-slack> <lab> node2 behaves normally after copying the db from node1 (whose lcl is 159)
<stellar-slack> <lab> very strange
<stellar-slack> <jed> so everything is fine now?
<stellar-slack> <lab> yes
<stellar-slack> <lab> maybe node2 has connectivity issue
de_henne has quit [Remote host closed the connection]
<stellar-slack> <lab> it's the only node located in china
<stellar-slack> <jed> it is probably lag
<stellar-slack> <jed> What happens when it gets out of sync?
<stellar-slack> <lab> not lag. just 100% packet loss.
<stellar-slack> <lab> 2015-09-29T14:11:37.226 a96c31 [] [Overlay] WARN read timeout
<stellar-slack> <lab> just before out of sync
<stellar-slack> <lab> i think it's because of GFW
<stellar-slack> <jed> yeah maybe
<stellar-slack> <jed> The connection to the others drops?
<stellar-slack> <lab> so, the number of nodes in china should be less than the failure safety
<stellar-slack> <lab> the behavior of the GFW is not quite predictable. it's route-dependent, time-dependent...
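(The arithmetic behind lab's rule of thumb, under the threshold-3-of-4 assumption above:)

```
n = 4 validators, failure safety f = 1  =>  threshold t = n - f = 3
1 node unreachable:  3 remaining >= t  ->  SCP keeps closing ledgers
2 nodes unreachable: 2 remaining <  t  ->  SCP stalls
(here node1 was stopped and the GFW cut off node2: two down,
so the network halted exactly as designed)
```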
pixelbeat has quit [Ping timeout: 240 seconds]
<stellar-slack> <buhrmi> the wall has to go. the wall has to go.
pixelbeat has joined #stellar-dev
nivah has joined #stellar-dev
pixelbeat has quit [Ping timeout: 260 seconds]
pixelbeat has joined #stellar-dev
<stellar-slack> <jed> restarting testnet shortly
<stellar-slack> <scott> the master branches of ruby/js/go base libraries are now updated to support the xdr as of stellar-core commit ad22bccafbbc14a358f05a989f7b95714dc9d4c6
<stellar-slack> <graydon> scott: updated scc locally, seems to mostly work!
<stellar-slack> <graydon> (except um, Failed to validate tx: 475a8fc56bee57f7a7144f506099e705dfb129f571266c4c74fdb7de069cc22e could not be found in txhistory table on process node0)
<stellar-slack> <graydon> scott: any chance I need to adjust something else scc-side?
<stellar-slack> <scott> graydon: Are you using ruby-stellar-base 0.6.0 or 0.6.1? 0.6.0 used fees that were too low
<stellar-slack> <scott> The latest scc master should be using 0.6.1
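(If the library comes in through bundler, the fix is just a version bump; a minimal Gemfile line, assuming the gem is published as stellar-base:)

```ruby
gem "stellar-base", "~> 0.6.1"  # 0.6.0 built txs with fees too low to be accepted
```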
<stellar-slack> <graydon> 0.6.1
<stellar-slack> <graydon> I just updated
<stellar-slack> <graydon> I'll merge w/ master
<stellar-slack> <graydon> sec
<stellar-slack> <scott> I was able to run all the recipes from horizon without error, so I would expect your recipes to work.
<stellar-slack> <graydon> not sure. txs are getting in and ledgers are closing but the one it is looking to verify isn't showing up
<stellar-slack> <graydon> this is examples/simple_payment.rb
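(For reference, scc recipes are a small Ruby DSL; a simple payment recipe has roughly this shape. This is a sketch, not necessarily the exact contents of examples/simple_payment.rb:)

```ruby
# Hypothetical scc (stellar_core_commander) recipe: fund an account
# from :master, close a ledger, then send it a native payment.
# scc checks each submitted tx hash against stellar-core's txhistory
# table, which is the verification step failing for graydon above.
account :dest
create_account :dest
close_ledger
payment :master, :dest, [:native, 100]
close_ledger
```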
<stellar-slack> <scott> oh weird! I ran that fine locally at one point today, let me try again
<stellar-slack> <graydon> (under docker, with a couple fixes to the .env file generation I'm about to push)
<stellar-slack> <graydon> I'll try under local
<stellar-slack> <graydon> same error locally
<stellar-slack> <scott> let me dig around the scc code for a bit to see if something pops out. Updating my docker images now to try that as well
<stellar-slack> <graydon> scott: maybe my image is behind. stick with what you're doing, it's probably on my end.
<stellar-slack> <graydon> was just wondering if you thought it should work / if it rang a bell
<stellar-slack> <graydon> "yes" and "no" is fine :)
<stellar-slack> <scott> kk! If you need any help, I’ll be around for the next hour or so
<stellar-slack> <graydon> ok
<stellar-slack> <scott> shutting down horizon to do some maintenance. Will start it back up after stellar-core is updated
<stellar-slack> <jed> testnet is down while we roll out the new version
Anduck has quit [Ping timeout: 246 seconds]
Anduck has joined #stellar-dev
<stellar-slack> <jed> ok testnet is back up
<stellar-slack> <scott> horizon-importer is back up and running. horizon is going to be down for about an hour while I fix up some last minute snafus