siruf has quit [*.net *.split]
dwknoxy has quit [*.net *.split]
vlad_starkov has quit [*.net *.split]
ckrailo has quit [*.net *.split]
blowmage has quit [*.net *.split]
dwknoxy has joined #rubygems
blowmage has joined #rubygems
ckrailo has joined #rubygems
vlad_starkov has joined #rubygems
siruf has joined #rubygems
jitendravyas has joined #rubygems
jitendravyas has quit [Ping timeout: 272 seconds]
Hanmac has quit [Ping timeout: 260 seconds]
dwknoxy has quit [Quit: Textual IRC Client: www.textualapp.com]
huoxito has quit [Remote host closed the connection]
huoxito has joined #rubygems
huoxito has quit [Ping timeout: 265 seconds]
havenwood has quit [Remote host closed the connection]
huoxito has joined #rubygems
huoxito has quit [Ping timeout: 245 seconds]
havenwood has joined #rubygems
huoxito has joined #rubygems
tenderlove has quit [Quit: Leaving...]
Hanmac has joined #rubygems
jhass is now known as jhass|off
havenwood has quit [Remote host closed the connection]
huoxito has quit [Remote host closed the connection]
lsegal has joined #rubygems
huoxito has joined #rubygems
huoxito has quit [Ping timeout: 245 seconds]
tbuehlmann has joined #rubygems
_redmenace has joined #rubygems
redmenace has quit [Ping timeout: 244 seconds]
_redmenace has quit [Read error: Connection reset by peer]
redmenace has joined #rubygems
redmenace has quit [Ping timeout: 250 seconds]
redmenace has joined #rubygems
_redmenace has joined #rubygems
redmenace has quit [Ping timeout: 258 seconds]
elia has joined #rubygems
seanlinsley has quit [Ping timeout: 272 seconds]
seanlinsley has joined #rubygems
dangerousdave has joined #rubygems
tbuehlmann has quit [Remote host closed the connection]
dangerousdave has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
workmad3 has joined #rubygems
sferik has joined #rubygems
<sferik> evan qrush drbrain samkottler: anyone awake? we’ve got problems
<dwradcliffe> Just woke up
<sferik> dwradcliffe: good morning
<sferik> dwradcliffe: there appears to be a situation
lsegal has quit [Quit: Quit: Quit: Quit: Stack Overflow.]
<sferik> dwradcliffe: I haven’t had much time to diagnose the problem but it seems we’re getting DDoS’d
<dwradcliffe> yeah pagerduty woke me up :)
<sferik> dwradcliffe: my internet connection sucks and I need to go give a conference talk in about an hour
<dwradcliffe> give me a second to sort out what's happening
<sferik> dwradcliffe: I noticed that app01-aws.rubygems.org was missing some security updates, so I decided to apply those
<sferik> dwradcliffe: I hope that’s not a mistake
<sferik> dwradcliffe: it seemed prudent since the site was already down
<dwradcliffe> that's still running?
<dwradcliffe> that's not in use
<sferik> dwradcliffe: yes (please don’t reboot)
<sferik> dwradcliffe: what's not in use?
<sferik> app01?
<dwradcliffe> that server
<dwradcliffe> yeah
<sferik> oh, I am very confused then
<sferik> which are the production servers?
<sferik> brb
sferik has quit [Quit: Textual IRC Client: www.textualapp.com]
sferik has joined #rubygems
redmenace has joined #rubygems
<dwradcliffe> ok looks like redis is down again
<sferik> dwradcliffe: okay, the update to app01-aws.rubygems.org is complete
<sferik> dwradcliffe: which servers are involved?
_redmenace has quit [Ping timeout: 265 seconds]
jhass|off is now known as jhass
tcopeland has quit [Quit: Leaving.]
rossgeesman has joined #rubygems
rossgeesman has quit [Remote host closed the connection]
sferik has quit [Quit: Textual IRC Client: www.textualapp.com]
<qrush> Hi
<qrush> Seems like things are ok?
<qrush> Apparently texts do not wake me always
dangerousdave has joined #rubygems
dwknoxy has joined #rubygems
dangerousdave has quit [Client Quit]
dangerousdave has joined #rubygems
rossgeesman has joined #rubygems
bbrowning_away is now known as bbrowning
rossgeesman has quit [Ping timeout: 265 seconds]
tcopeland has joined #rubygems
willywos has joined #rubygems
workmad3 has quit [Ping timeout: 250 seconds]
rossgeesman has joined #rubygems
dangerousdave has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
huoxito has joined #rubygems
tbuehlmann has joined #rubygems
workmad3 has joined #rubygems
bradland has joined #rubygems
rossgeesman has quit [Remote host closed the connection]
tenderlove has joined #rubygems
rossgeesman has joined #rubygems
tbuehlmann has quit [Quit: Leaving]
seanlinsley has quit [Ping timeout: 244 seconds]
havenwood has joined #rubygems
rossgeesman has quit [Remote host closed the connection]
rossgeesman has joined #rubygems
rossgeesman has quit [Remote host closed the connection]
rossgeesman has joined #rubygems
seanlinsley has joined #rubygems
rossgeesman has quit [Remote host closed the connection]
havenwood has quit [Remote host closed the connection]
havenwood has joined #rubygems
rossgeesman has joined #rubygems
dvu has joined #rubygems
dvu has quit [Remote host closed the connection]
dvu has joined #rubygems
rossgeesman has quit [Ping timeout: 260 seconds]
bradland has quit [Quit: bradland]
bbrowning is now known as bbrowning_away
drbrain has quit [Ping timeout: 240 seconds]
drbrain has joined #rubygems
tbuehlmann has joined #rubygems
elia has quit [Quit: Computer has gone to sleep.]
elia has joined #rubygems
bbrowning_away is now known as bbrowning
workmad3 has quit [Ping timeout: 272 seconds]
dwknoxy is now known as dknox-lunch
thumpba_ has joined #rubygems
havenwood has quit [Remote host closed the connection]
thumpba has quit [Ping timeout: 246 seconds]
thumpba has joined #rubygems
thumpba_ has quit [Ping timeout: 255 seconds]
bbrowning has quit [Remote host closed the connection]
havenwood has joined #rubygems
bbrowning has joined #rubygems
djbkd has joined #rubygems
havenwood has quit [Remote host closed the connection]
<samkottler> qrush: we need to move to a new redis box
<samkottler> because we're evicting too quickly
<qrush> evicting what
<samkottler> pages
<samkottler> to the disk
<qrush> ok :)
<evan> we also need to get off redis
<qrush> i thought the dependency API is off us anyway
<evan> it's still on my todo.
<qrush> which is the primary redis issue
<evan> we've still got stats in redis
<samkottler> evan: CRDT's
<samkottler> just sayin
<evan> yeah
<qrush> had to google that -_-
<samkottler> while we're at it stat-update or whatever that thing is called should get rewritten
<dwradcliffe> hey guys, already moved redis this morning
<evan> yeah
<samkottler> oh awesome dwradcliffe, to which instance type?
<dwradcliffe> m3-large
<qrush> samkottler: ah you were responding to my earlier question?
<qrush> i'm really surprised that just doing increments in redis is problematic
<qrush> is it because we're on AOF too?
<samkottler> it's not doing increments themselves
<samkottler> it's that heap frag means we need a huge amount of free memory
<dwradcliffe> I had to reboot redis02 about a dozen times. it was only lasting about 10 minutes before freezing.
<samkottler> like 40% overhead
<samkottler> AWS should make a redis instance type
<samkottler> 1 core and a bunch of memory
<samkottler> trololol
<evan> what does ElastiCache use?
<qrush> we run a shitload of redis at basecamp and it's not an issue
<qrush> i can bring one of our guys in here, maybe we can check our config against theirs?
<samkottler> sure
<samkottler> this is a known issue with redis
<samkottler> it just requires more hardware
<samkottler> evan: cache.r3.large 2 CPU's and 13.5 GB of RAM
<evan> nice
<qrush> samkottler: sure as in yes, let's double check?
<qrush> just trying to help :)
<samkottler> qrush: I'm totally open to chatting with folks about it :)
<dwradcliffe> our setup is fairly vanilla so maybe there's something we can tweak
mr_ndrsn has joined #rubygems
mkent has joined #rubygems
* qrush summoned a few to take a look
<qrush> samkottler: can you gist the config for mr_ndrsn and mkent ? also *waves*
<mr_ndrsn> *waves*
<mr_ndrsn> Hey Sam!
<qrush> or evan dwradcliffe :)
<evan> :D
<johnmwilliams___> Basecamp Ops REPRESENT!
<samkottler> hey hey mr_ndrsn!
<dwradcliffe> howdy mr_ndrsn
<dwradcliffe> I can gist it unless someone else has it already
<samkottler> I'm gisting right now
<samkottler> one second
<dwradcliffe> gist race!
<samkottler> high level overview:
<qrush> <3 <3
<samkottler> maxmemory-policy means we use it like a LRU cache
<samkottler> but _never_ evict key permanently, just to rdb
<samkottler> you can see the write policy
havenwood has joined #rubygems
<samkottler> we fsync every second
<samkottler> not particularly aggressive about BGREWRITEAOF
<samkottler> and then activerehashing
<samkottler> that's mostly it
<samkottler> other than some setting around data types
<samkottler> which are generally unimportant
<johnmwilliams___> Just to be clear, this is a disk space issue not a disk IO issue, correct?
<samkottler> mr_ndrsn: do you have lots of issues at basecamp around heap pressure?
<samkottler> johnmwilliams___: the disk isn't the issue at all
<johnmwilliams___> Ok, getting mixed things from here and internal chat.
<samkottler> internal chat where?
<dwradcliffe> johnmwilliams___: memory issue
<samkottler> oh work
<mr_ndrsn> No idea SK. mkent/johnmwilliams are better candidates for that question.
<samkottler> so here's what it looks like to me and past experience has shown this problem before
<samkottler> extreme pressure around heap alloc/dealloc
<samkottler> it's hard to actually prove that
<samkottler> other than throwing more memory at it and then hoping it works better
<samkottler> :P
<johnmwilliams___> Already verified that you are not hitting max open files or anything like that?
<johnmwilliams___> (We have done that before)
<samkottler> nope
<samkottler> this issue is pretty well isolated to memory pressure
<mr_ndrsn> What are the symptoms you’re seeing?
<qrush> if it helps i do have ssh access and we're all in the same place today
<qrush> so they could poke around with my box
<qrush> wow that sounds awful
<mkent> don't see maxmemory set in the gist
<mkent> wonder if it'll even observe the policy
<samkottler> mkent: maxmemory itself isn't set, but the policy is
<samkottler> mkent: it's possible this is a bug in redis
<samkottler> where it doesn't start using the policy when it's just putting pressure on system memory
<dwradcliffe> kernel: [13900.835381] Out of memory: Kill process 4162 (redis-server) score 888 or sacrifice child
<dwradcliffe> kernel: [13900.839891] Killed process 4162 (redis-server) total-vm:3635904kB, anon-rss:3590092kB, file-rss:0kB
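
To put the OOM-killer line above in more familiar units, here is a minimal sketch that pulls the kB fields out of the second kernel message and converts them to GiB. The function name `parse_oom_kill` is made up for illustration, and the regex only targets the field layout visible in this particular log line; other kernel versions may format it differently.

```python
import re

# The "Killed process" line exactly as pasted in the channel.
OOM_LINE = (
    "kernel: [13900.839891] Killed process 4162 (redis-server) "
    "total-vm:3635904kB, anon-rss:3590092kB, file-rss:0kB"
)

KB_PER_GIB = 1024 * 1024


def parse_oom_kill(line):
    """Return a dict mapping each 'name:<n>kB' field to its size in GiB.

    Hypothetical helper: it assumes fields look like 'anon-rss:3590092kB'.
    """
    fields = re.findall(r"([a-z-]+):(\d+)kB", line)
    return {name: int(kb) / KB_PER_GIB for name, kb in fields}


sizes = parse_oom_kill(OOM_LINE)
for name, gib in sizes.items():
    print(f"{name}: {gib:.2f} GiB")
```

By this arithmetic redis-server held roughly 3.4 GiB of anonymous memory when it was killed, which fits the discussion that follows about capping it with maxmemory.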
<samkottler> oom killer is a whole other thing
<samkottler> the problem should stop before oom-killer kicks in
<samkottler> maybe we should try to set maxmemory statically
<mr_ndrsn> statically == “uncomment the entry in the gist?”
<johnmwilliams___> If you don't set maxmemory there is a chance it will just eat up all available memory.
<johnmwilliams___> Including swap.
<samkottler> alright lemme try setting the maxmemory to 3gb
<johnmwilliams___> I'd say set it in the config to 85% of system memory.
<samkottler> or actually, 6.5GB on the new box
<johnmwilliams___> 5.5 would be alright.
<mkent> worth a try, we set one to 3G, seems to keep it at about 3.3G of rss
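
The sizing suggestions in this exchange come down to simple arithmetic, sketched below. The 7.5 GB total is an assumption about the new box (the log only says "m3-large"), and the function names are hypothetical. The 85% fraction, the 40% overhead figure, and the 3G-limit-to-3.3G-rss observation are taken from the conversation.

```python
def maxmemory_from_fraction(total_gb, fraction=0.85):
    """maxmemory as a fixed share of system memory (the 85% suggestion)."""
    return total_gb * fraction


def maxmemory_with_overhead(total_gb, overhead=0.40):
    """Largest maxmemory that still leaves `overhead` headroom on top of it."""
    return total_gb / (1.0 + overhead)


def observed_overhead(maxmemory_gb, rss_gb):
    """How far resident memory exceeds the configured limit, as a ratio."""
    return rss_gb / maxmemory_gb - 1.0


TOTAL_GB = 7.5  # assumption: RAM on the new redis box, not stated in the log

print(f"85% of RAM:       {maxmemory_from_fraction(TOTAL_GB):.2f} GB")
print(f"40% headroom:     {maxmemory_with_overhead(TOTAL_GB):.2f} GB")
print(f"3G limit, 3.3G rss: {observed_overhead(3.0, 3.3):.0%} over the limit")
```

Under that assumed total, the two rules of thumb land near the 6.5 GB and 5.5 GB figures floated above, though the log does not say how those numbers were actually chosen.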
<samkottler> I'm somewhat scared about the policy
<samkottler> which is why this hasn't been set before
<samkottler> because the data in our redis instance is not cache, it's real long-term data that needs to be persisted at all costs
<johnmwilliams___> It should get persisted to disk.
<samkottler> which is a big reason in and of itself to get rid of redis in its current form for us
<samkottler> johnmwilliams___: not necessarily, some of the policies evict keys like an LRU
<samkottler> an LRU in which pollution means the data is gone
<samkottler> brb, it's like 3:05pm and I haven't had lunch yet
<mkent> well, volatile-lru should only dump stuff set with an expiration
<mkent> unless there's data set with an absurdly high value
<mr_ndrsn> Or could try noeviction and handle it in the application code? ¯\_(ツ)_/¯
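
A rough sketch of what "noeviction and handle it in the application code" could look like. With the noeviction policy, Redis refuses writes that would exceed maxmemory and replies with an OOM error instead of dropping keys. Everything below is a stand-in for illustration: `FakeRedis`, `OutOfMemory`, `record_download`, and the key names are hypothetical, and this is not how rubygems.org handles stats.

```python
class OutOfMemory(Exception):
    """Stand-in for the error a Redis client raises on an OOM reply."""


class FakeRedis:
    """Hypothetical in-memory stand-in for a Redis client under noeviction.

    It caps the number of keys instead of bytes, purely to trigger the error.
    """

    def __init__(self, max_keys):
        self.max_keys = max_keys
        self.data = {}

    def incr(self, key):
        # Refuse to create new keys once "full", rather than evicting old ones.
        if key not in self.data and len(self.data) >= self.max_keys:
            raise OutOfMemory(
                "OOM command not allowed when used memory > 'maxmemory'"
            )
        self.data[key] = self.data.get(key, 0) + 1
        return self.data[key]


def record_download(client, key, pending):
    """Increment a counter; on OOM, queue the key for a later retry."""
    try:
        return client.incr(key)
    except OutOfMemory:
        pending.append(key)  # nothing is lost, the write is just deferred
        return None


client = FakeRedis(max_keys=1)
pending = []
first = record_download(client, "downloads:pry", pending)
second = record_download(client, "downloads:rake", pending)
print(first, second, pending)
```

The trade-off is that existing counters are never evicted, but the application has to decide what to do with the writes that get rejected.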
drbrain has quit [Quit: Goodbye]
drbrain has joined #rubygems
tbuehlmann has quit [Remote host closed the connection]
mr_ndrsn has quit [Quit: mr_ndrsn]
mkent has quit [Quit: Leaving.]
tenderlove has quit [Remote host closed the connection]
drbrain has quit [Ping timeout: 255 seconds]
dknox-lunch is now known as dknox
drbrain has joined #rubygems
seanlinsley has quit [Quit: seanlinsley]
seanlinsley has joined #rubygems
djbkd has quit [Remote host closed the connection]
dangerousdave has joined #rubygems
<qrush> hey evan samkottler
<qrush> what if we just had a separate redis per year
<qrush> for stats
<qrush> this isn't too stupid right? :)
<qrush> and then only one is written to per year
willywos has quit [Ping timeout: 272 seconds]
tcopeland has quit [Ping timeout: 244 seconds]
djbkd has joined #rubygems
dangerousdave has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
djbkd has quit [Ping timeout: 245 seconds]
djbkd has joined #rubygems
dvu has quit [Remote host closed the connection]
<evan> qrush: I dump the year stats anyway
<evan> that's not fine grained enough.
djbkd has quit [Remote host closed the connection]
djbkd has joined #rubygems
mkent has joined #rubygems
mr_ndrsn has joined #rubygems
huoxito has quit [Remote host closed the connection]
mr_ndrsn has quit [Client Quit]
elia has joined #rubygems
tenderlove has joined #rubygems
tenderlove has quit [Read error: Connection reset by peer]
tenderlove has joined #rubygems
lsegal has joined #rubygems
tcopeland has joined #rubygems
tenderlove has quit [Quit: Leaving...]
mkent has quit [Quit: Leaving.]
mkent has joined #rubygems
bbrowning is now known as bbrowning_away
tenderlove has joined #rubygems
jhass is now known as jhass|off
tenderlove has quit [Client Quit]
mkent has left #rubygems [#rubygems]
tenderlove has joined #rubygems
jhass|off is now known as jhass
dvu has joined #rubygems
<Rennex> ugh... i tried "gem install pry -v", "gem -v install pry", "gem --verbose install pry", and "gem install -v pry" before finally landing on the winning combination of "gem install --verbose pry"
seanlinsley has quit [Quit: seanlinsley]
huoxito has joined #rubygems
seanlinsley has joined #rubygems
tenderlove has quit [Read error: Connection reset by peer]