#m-labs on 2017-10-13 — irc logs at freenode.irclog.whitequark.org

2015-03-04 14:45 sb0 changed the topic of #m-labs to: ARTIQ, Migen, MiSoC, Mixxeo & other M-Labs projects :: fka #milkymist :: Logs http://irclog.whitequark.org/m-labs

01:05 <GitHub82> [smoltcp] podhrmic closed pull request #53: IGMP support (master...igmp) https://git.io/vdrfz

01:05 <GitHub175> [smoltcp] podhrmic commented on issue #53: Closing for now, will reopen with more complete feature set. https://git.io/vdiij

01:27 <GitHub106> [smoltcp] podhrmic opened issue #56: rustfmt.toml for smoltcp https://git.io/vdiX0

02:12 <GitHub111> [smoltcp] whitequark commented on issue #56: Sure. Do you think you could play with rustfmt's options to get an output very close to the current one? Then I'll merge that and reformat the rest. https://git.io/vdiMV

02:18 <GitHub193> [artiq] sbourdeauducq commented on issue #838: > Yes, "reset" refers to rebooting the FPGA board.... https://github.com/m-labs/artiq/issues/838#issuecomment-336331534

02:46 <sb0> rjo, I just built it locally last time (for the 3.0 release)

02:47 <sb0> (it == misoc)

02:49 <GitHub19> [migen] sbourdeauducq tagged 0.6.dev at fce4a41: https://git.io/vdiy8

03:00 <sb0> _florent_, so for artix7 you are just setting some fixed bitslip values? how did you determine them? what about setup/hold timing?

03:02 <sb0> I have the impression that your design is just working by chance, and the proper way of fixing this would be to shift the clock that goes to the SDRAM

03:03 <sb0> then do write leveling first (of course this only works if each SDRAM chip has its own clock), then read leveling

03:03 <sb0> is it possible to do glitch-free phase shifts on artix7 plls or did xilinx fuck that up, too?

03:26 rohitksingh_work has joined #m-labs

03:42 sb0 has quit [Quit: Leaving]

03:42 rohitksingh_work has quit [Ping timeout: 248 seconds]

03:44 rohitksingh_work has joined #m-labs

04:07 rohitksingh_work has quit [Ping timeout: 240 seconds]

04:17 rohitksingh_work has joined #m-labs

06:05 <_florent_> sb0: for the fpga to sdram direction, yes we are using a shifted dqs clock: https://github.com/enjoy-digital/arty-soc/blob/master/arty_base.py#L75

06:05 <_florent_> sb0: that's similar to what we are doing for SDR/DDR/DDR2 on other targets

06:06 <_florent_> sb0: for sdram to fpga direction, i'm using a script that shows the valid sampling window: https://github.com/enjoy-digital/arty-soc/blob/master/test/test_sdram.py#L102

06:07 <_florent_> sb0: and I use values in the middle of the window

06:09 <_florent_> sb0: so yes we are using static delay, i've been using that on 4 different boards (arty, nexys video + 2 custom boards) and it seems to be working fine
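
A minimal sketch of the "use values in the middle of the window" step described above. It is not taken from the linked test_sdram.py. The function name and the input format (one pass/fail flag per delay or bitslip setting) are assumptions made for illustration, and a window that wraps around the end of the range is not handled.

```python
def middle_of_widest_window(results):
    """Return the index at the centre of the longest run of passing settings.

    results[i] is True when a read test passed at delay/bitslip setting i
    (hypothetical input format). Returns None if no setting passed.
    """
    best_start, best_len = None, 0
    run_start = None
    # A trailing False acts as a sentinel so the last run is always closed.
    for i, ok in enumerate(list(results) + [False]):
        if ok and run_start is None:
            run_start = i
        elif not ok and run_start is not None:
            run_len = i - run_start
            if run_len > best_len:
                best_start, best_len = run_start, run_len
            run_start = None
    if best_start is None:
        return None
    # For even-length windows this picks the lower of the two centre settings.
    return best_start + (best_len - 1) // 2
```

A static value chosen this way has the most margin on both sides at the time of measurement. It does not track drift over temperature or voltage, which is the concern behind the firmware calibration suggested later in the log.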

08:04 <rjo> sb0: because the buildbot was broken already then?

08:59 rqou has quit [Ping timeout: 246 seconds]

09:00 rqou has joined #m-labs

09:09 <GitHub13> [smoltcp] phil-opp commented on issue #49: > @phil-opp Do you think you can take the abovementioned outline and implement this yourself? I'm fairly busy right now.... https://git.io/vdPUR

10:26 <GitHub164> [artiq] jordens pushed 1 new commit to master: https://github.com/m-labs/artiq/commit/e1e1f58ba983b08ac56d0cf2fddc63b30760a47c

10:26 <GitHub164> artiq/master e1e1f58 Robert Jordens: libboard: fix use

10:40 <bb-m-labs> build #827 of artiq-board is complete: Failure [failed conda_build] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/827 blamelist: Robert Jordens <rj@m-labs.hk>

10:40 <bb-m-labs> build #1714 of artiq is complete: Failure [failed] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/1714 blamelist: Robert Jordens <rj@m-labs.hk>

11:59 <GitHub75> [smoltcp] batonius commented on issue #49: Should we coordinate this effort with #55? It would be great if the new `Device` trait or the new `Interface` struct could be used as trait objects. https://git.io/vdPC5

12:02 sb0 has joined #m-labs

12:03 <sb0> rjo, yes, it was broken

12:19 <GitHub55> [smoltcp] whitequark commented on issue #49: @batonius I disagree with your approach on #55, but haven't had the time to write that down yet. In general I am quite opposed to trait objects in smoltcp; you can notice that in the existing code... https://git.io/vdPlS

12:52 <sb0> _florent_, why not calibrate it in firmware?

12:53 <sb0> like the BIOS does write leveling on kintex

13:03 rohitksingh_work has quit [Read error: Connection reset by peer]

13:44 <GitHub159> [smoltcp] batonius commented on issue #49: @whitequark Sure, I just think we should keep the issues in sync. https://git.io/vdP2v

13:46 <rjo> sb0: how was it fixed?

13:46 <sb0> it was not. I just built it locally

13:47 <rjo> sb0: doesn't that go against the idea of having the buildbot?

13:47 rohitksingh has joined #m-labs

13:51 <sb0> as long as conda brims with bugs, it's either that or wasting half a day on simple operations like this

13:51 <whitequark> once I get done with this DMA emergency I can spawn images on demand

13:52 <whitequark> can update the buildbot to*

13:52 <whitequark> that will take care of conda bullshit

13:59 <rjo> emergency?

14:00 <rjo> srtio is not even in master.

14:00 <rjo> sb0: that's a bad choice. you are just hoping that others will take care of it.

14:01 <rjo> sb0: and the complaining does not change one iota.

14:04 <sb0> rjo, yes. we need money that is independent from the sayma mess.

14:06 <sb0> rjo, and yes, someone else should take care of it. I've said many times that we need a yak-shaver. besides, I've already fixed a number of conda problems.

14:07 <whitequark> the sort of person who can effectively track down contrived problems in conda is typically not employed as a conda yak-shaver. that's the whole tragedy of it.

14:07 <whitequark> it should have just been designed decently instead

14:09 <rjo> sb0: that's not a workable attitude. saying something doesn't make it happen. in the same way that whining doesn't change much. every one of us has fixed numerous conda issues and wrestled the buildbot. let's set the priorities straight. let's get the buildbot working and look at sayma again before we return to srtio/dma interactions.

14:09 <sb0> rjo, disagreed.

14:09 <rjo> rewriting conda may be a nice idea for next year.

14:09 <rjo> but definitely not now.

14:10 <rjo> employing a personal slave is also not something we can wait for.

14:10 <sb0> just build misoc locally for now (it is the only package affected) and move forward

14:11 <rjo> that doesn't work.

14:11 <rjo> artiq is also affected.

14:12 <rjo> artiq-board is

14:12 <rjo> http://buildbot.m-labs.hk/builders/artiq-board/builds/827/steps/conda_build/logs/stdio

14:15 <rjo> fwiw this smells more like a bad interaction of buildbot with conda or divergence of the buildbot slaves or hidden and broken assumptions between buildbot and conda. not even an intrinsic conda problem. just snapping and whining about conda if something goes wrong might be very premature.

14:17 <sb0> well, artiq wasn't affected before. I built 3.0 and every other package on the buildbot

14:17 <sb0> _after_ misoc broke

14:18 <rjo> doesn't matter

14:18 <whitequark> no, both slaves failed misoc builds

14:18 <whitequark> and artiq-board builds

14:18 <sb0> also, the only reason srtio isn't in master is because I want to avoid breaking and delaying further SAWG on Sayma

14:18 <whitequark> not migen builds, so this really seems specific to conda workdir code

14:19 <sb0> and the DMA issue

14:19 <rjo> sb0: i appreciate that

14:20 <rjo> whitequark: i remember a similar situation a while back. and you had a trick to unbreak it. but i can't find it.

14:20 <whitequark> rjo: I have no memory of that event

14:20 <whitequark> (that doesn't mean it didn't happen)

14:21 <whitequark> sigh. let me try something

14:21 <sb0> rjo, was it the one that can be worked around by triggering a build with --branch?

14:21 <rjo> whitequark: forcing the branch maybe?

14:21 <rjo> yes. rings a bell.

14:22 <whitequark> bb-m-labs: force build misoc --branch master

14:22 <bb-m-labs> build forced [ETA 3m23s]

14:22 <bb-m-labs> I'll give a shout when the build finishes

14:22 <rjo> could also be unrelated... but the symptoms seemed familiar

14:23 <sb0> the workaround was to trigger a build with another branch, then restart the original build. idk if it works by simply adding --branch

14:23 <rjo> looks good. thanks.

14:24 <rjo> fwiw this is a conda-buildbot interaction issue.

14:25 <bb-m-labs> build #260 of misoc is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/misoc/builds/260

14:25 <sb0> yes, conda leaving some state... whitequark's vm solution would take care of it

14:25 <whitequark> ah, no, that would just create another problem

14:26 <whitequark> downloading all the required packages every time means 15+ min builds for misoc even

14:26 <whitequark> i think

14:26 <sb0> can't the vm come with some of those packages preinstalled?

14:26 <rjo> i.e. state?

14:26 <whitequark> yes if it is easily regenerated, because otherwise we'll have the travis problem

14:26 <whitequark> rjo: not really

14:26 <whitequark> or rather *known good* state

14:26 <sb0> state that is only changed when we explicitly touch the vm

14:27 <whitequark> sb0: the travis problem, that is, that vms get ancient and never updated

14:27 <sb0> not at random and usually inconvenient times

14:27 <whitequark> i could solve that through ansible

14:27 <whitequark> already moved all my own infra to it so i won't need to learn anything new

14:28 <whitequark> bb-m-labs: force build artiq-board --branch master

14:28 <bb-m-labs> build forced [ETA 16m06s]

14:28 <bb-m-labs> I'll give a shout when the build finishes

14:28 <bb-m-labs> build #828 of artiq-board is complete: Exception [exception conda_build] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/828

14:28 <whitequark> what

14:28 <whitequark> bb-m-labs: force build artiq-board --branch master --props=package=artiq-kc705-nist_clock

14:28 <bb-m-labs> build forced [ETA 16m06s]

14:28 <bb-m-labs> I'll give a shout when the build finishes

14:28 <bb-m-labs> build #829 of artiq-board is complete: Exception [exception conda_build] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/829

14:29 <rjo> whitequark: thanks!

14:29 <whitequark> bb-m-labs: force build --branch master --props=package=artiq-kc705-nist_clock artiq-board

14:29 <bb-m-labs> build forced [ETA 16m06s]

14:29 <bb-m-labs> I'll give a shout when the build finishes

14:30 <rjo> whitequark: while i have you on ansible... quick questions: (a) do you always write the recipes to be strictly idempotent? (b) how do you handle secrets (ssh keys etc) (c) do you develop the recipes with that idempotency and trial-and-error?

14:31 <bb-m-labs> build #830 of artiq-board is complete: Failure [failed conda_build] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/830

14:31 <rjo> (d) how do you port existing setups?

14:32 <whitequark> (a) yes (b) - vars_files: [vars/private.yml]; echo vars/private.yml >>.gitignore (c) i don't understand the question (d) by using --check (dry runs)

14:32 <whitequark> i wanted to reorganize my setup to use roles for a while so i'll publish some of what i consider high-quality roles soon

14:33 <whitequark> 3rd party roles are mostly crap

14:33 <whitequark> i don't even know why people share them

14:34 <whitequark> oh and anything targeting ansible prior to 2.4 is probably also crap because ansible only grew a functional dependency/abstraction system at 2.4

14:34 <rjo> crap because they are not generic enough?

14:34 <whitequark> the opposite

14:34 <whitequark> they're too generic in a way that creates tons of edge cases which they do not handle

14:35 <whitequark> i typically write a recipe that's only parameterized by names and such. rarely, one parameter, or some loops

14:35 <rjo> yes. not handling self-inflicted edge cases is what i meant.

14:35 <whitequark> so there isn't any combinatorial explosion

14:36 <rjo> then you can throw things like distribution packages over board.

14:37 <rjo> on (c): if you e.g. tweak your config for something, do you iterate over the edit/ansible-execute/test cycle?

14:38 <whitequark> (continuing) if there *is* combinatorics then i handle that with separate roles and dependencies

14:38 <rjo> or whatever the ansible command is to do something

14:38 <whitequark> much more robust ime

14:38 <whitequark> ansible-playbook since the ansible command doesn't handle roles

14:38 <whitequark> with a trivial playbook

14:38 <whitequark> oh and

14:38 <whitequark> i do two more things.

14:39 <whitequark> the first is my recipes are not just idempotent, they do set_fact: ... cacheable=yes for things like "checking if nginx is installed"

14:39 <whitequark> so that i never wait centuries on a high-latency link when i just want one config file updated

14:39 <whitequark> the second is the tagging system

14:40 <whitequark> tags are really a weak point, so i work around that by manually adding a combinatorial... well it's not a detonation, let's call it a combinatorial conflagration of tags

14:42 <whitequark> e.g. nginx (only nginx), nginx-site (all nginx sites), whitequark.org (everything that backs up whitequark.org), nginx-site:whitequark.org (intersection of nginx-site and whitequark.org) etc

14:42 <whitequark> this lets me do very finely grained config updates and a rapid modify-execute-apply cycle

14:42 <whitequark> erm, modify-execute-test
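
A small sketch of the tag scheme described above, assuming the combined tags are simply the individual tags joined with a colon, as in the nginx-site:whitequark.org example. How whitequark actually generates or maintains these tags is not shown in the log, so `combined_tags` is a hypothetical helper.

```python
from itertools import combinations


def combined_tags(*tags):
    """Return each tag plus every colon-joined combination, in input order.

    Hypothetical helper: the result could be pasted into a task's tag list
    so that a single tag such as "nginx-site:whitequark.org" selects only
    the intersection of the two groups.
    """
    out = []
    for size in range(1, len(tags) + 1):
        for combo in combinations(tags, size):
            out.append(":".join(combo))
    return out
```

For two tags this yields three entries; the list grows as 2^n - 1, which is presumably why the scheme is applied by hand to a few tags only.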

14:43 <rjo> ok. with the cacheable thing you have shortcuts in the playbooks to make the iteration quicker. and similarly with tags?

14:43 <whitequark> yes.

14:43 <whitequark> otherwise i found it unbearable. just getting distracted all the time with 10s of seconds of wait time per cycle

14:44 <whitequark> of course you could just check that out on the lab.m-labs.hk but i don't use vim

14:44 <rjo> i see.

14:45 <rjo> but if i e.g. have a working "manual" nginx setup. how do i re-implement that in ansible? step by step replacement? shut everything down for a day and then rebuild?

14:45 <whitequark> first, go through it manually and refactor it

14:46 <whitequark> so that all parts that can be reasonably shared, are

14:46 <whitequark> then do a step by step replacement until ansible reports no (significant?) changes

14:46 <whitequark> then run it

14:46 <whitequark> then run it on a fresh new VM to find where you had bad assumptions and missing dependency edges
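
The "until ansible reports no changes" criterion can be checked mechanically. The sketch below parses the PLAY RECAP summary that ansible-playbook prints at the end of a run (for example a dry run with --check, as mentioned earlier) and lists hosts that would still be changed. It assumes recap lines of the form `host : ok=N changed=M ...`, and `hosts_with_changes` is a hypothetical name.

```python
import re

# Matches recap lines such as "web1 : ok=12 changed=0 unreachable=0 failed=0".
RECAP_LINE = re.compile(
    r"^(?P<host>\S+)\s*:\s*ok=(?P<ok>\d+)\s+changed=(?P<changed>\d+)"
)


def hosts_with_changes(recap_text):
    """Map each host with a nonzero 'changed' count to that count."""
    changed = {}
    for line in recap_text.splitlines():
        match = RECAP_LINE.match(line.strip())
        if match and int(match.group("changed")) > 0:
            changed[match.group("host")] = int(match.group("changed"))
    return changed
```

An empty result on the existing machine suggests the playbook reproduces the manual setup. It says nothing about missing dependencies, which is what the fresh-VM run is for.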

14:47 <rjo> ok. apart from your code, any recommendations where to steal from?

14:47 <whitequark> i did that while migrating whitequark.org and it took me a total of something like three days. dovecot postfix roundcube nsd tarsnap

14:47 <whitequark> hm

14:48 <whitequark> i have not found any ansible playbooks or roles that i found even minimally acceptable

14:48 <whitequark> the only thing i stole were workarounds for ansible language deficiencies and even that was mostly the bugtracker

14:48 <whitequark> i would suggest thinking about it as any other programming language and apply the same techniques. and read the module documentation.

14:50 <rjo> do you use it for your laptop as well?

14:50 <whitequark> not currently but I should do this before the US visit

14:50 <rjo> hahaha.

14:50 <whitequark> so that I can wipe and reimage at will

14:50 <whitequark> one of the last things blocking that, really...

14:51 <whitequark> rjo: being able to set up a config on your laptop identical to the config on your server is *very* valuable

14:51 <whitequark> far more valuable than it first seems

14:51 <whitequark> insta-staging

14:52 <whitequark> also write a task for synchronizing the database/files from production, worth every second doing it

14:52 <whitequark> speaking of "wipe and reimage" that'll also be what i'll do if anyone hacks into whitequark.org or i lose the ssh key i have here

14:53 <whitequark> no other way to access it except via that key

14:53 <whitequark> also a good idea to have a task for restoring from backup fwiw

14:54 <rjo> how far down the setup do you go? what's the initial state? just a default/empty debian installation? do you set the hostname/gen the ssh key?

14:55 <whitequark> empty debian installation, which i strip even further down (i don't want systemd or acpi or exim4 or tty2-6 getty's on my server)

14:55 <whitequark> hostname and key is set by digitalocean

14:55 <whitequark> there's a task for creating a digitalocean machine too

14:55 <whitequark> i also remove systemd, install sysvinit and reboot, in an automated way

14:56 <whitequark> if digitalocean didn't set hostname and authorized_keys for me i'd do that with ansible, too

14:58 <rjo> if you worried about the gettys, do you build and distribute your own kernel?

14:59 <whitequark> of course not

14:59 <whitequark> i just think they clutter up `top`

15:00 <whitequark> (actually could kill that on tty1 too, seeing as how I don't have passwords anyway)

15:01 <rjo> and then you run unattended-upgrades, i guess (by the way, that is not really working on lab.m-labs.hk), and if that collides with your playbooks, what then?

15:02 <rjo> also isn't upgrading the debian release a mess? how do you handle that?

15:02 <whitequark> unattended-upgrades: yes, it doesn't work for some reason. doesn't work on whitequark.org either. i must be screwing up the config in the same way, but i never actually figured out how.

15:02 <whitequark> it doesn't screw up my servers because they're all on stable+stable-backports

15:04 <rjo> "it" being release upgrades?

15:04 <whitequark> no, unattended-upgrades

15:04 <whitequark> i do release upgrades by hand because i literally need it once per two years

15:04 <whitequark> went through the last one on whitequark.org and it was almost completely painless

15:04 <rjo> so u-a does not work on whitequark.org but it is stable+?

15:05 <rjo> but the procedure for a release upgrade is that you do the upgrade and then re-run your playbooks and fix them?

15:05 <whitequark> i removed a few default_release options, removed specific versions from mysql and postgresql packages, manually migrated the postgresql cluster, updated php to 7.0, and handled a breaking change in dovecot-sieve

15:06 <whitequark> that was all

15:06 <whitequark> took me all of like 30 minutes, including updating playbooks

15:06 <whitequark> and not including the DB migration, with corresponding downtime of several hrs

15:07 <whitequark> rjo: u-a updates packages as well as distros. i configured it to only update packages. it doesn't do anything.

15:07 <whitequark> release upgrade procedure: yes.

15:08 <whitequark> removing references to jessie-backports was trivial, php/mysql/postgres manifested as apt errors, sieve change manifested as getting several hundred emails in my inbox and a lot of headscratching

15:08 <whitequark> i blame that on dovecot people

15:09 <rjo> if you were to migrate e.g. a postgresql setup from one host to another, do you write a playbook to do that or do you just do the setup with the playbook and the data migration manually?

15:10 <whitequark> the latter. it is very likely to go wrong in ways that are of no benefit to automate handling

15:10 <whitequark> it did, in fact, go wrong, and i had to restart it a few times.

15:13 <rjo> bb-m-labs: force build --branch=rtio-sed artiq

15:13 <bb-m-labs> build forced [ETA 8h32m20s]

15:13 <bb-m-labs> I'll give a shout when the build finishes

15:13 <rjo> bb-m-labs: force build --branch=master artiq

15:13 <bb-m-labs> The build has been queued, I'll give a shout when it starts

15:15 <rjo> whitequark: ok. i think i got enough. thanks!

15:16 <whitequark> yw

15:37 <bb-m-labs> build #831 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/831

15:42 <bb-m-labs> build #1715 of artiq is complete: Failure [failed python_unittest_2] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/1715

15:42 <bb-m-labs> build forced [ETA 8h32m20s]

15:42 <bb-m-labs> I'll give a shout when the build finishes

15:52 <GitHub122> [smoltcp] phil-opp opened pull request #57: [WIP] New device trait design (master...new-device-trait) https://git.io/vdPH4

16:06 <bb-m-labs> build #832 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/832

16:48 <GitHub141> [smoltcp] podhrmic commented on issue #56: I ll give it a try. I think the main difference is the alignment - i.e. this is what is in smoltcp now:... https://git.io/vdPb8

16:53 <GitHub78> [smoltcp] whitequark commented on issue #56: To be honest I find that it helps readability a lot. https://git.io/vdPb1

17:19 kristianpaul has quit [Quit: Lost terminal]

17:21 rohitksingh has quit [Ping timeout: 240 seconds]

17:21 rohitksingh has joined #m-labs

17:29 rohitksingh has quit [Read error: Connection reset by peer]

17:30 rohitksingh has joined #m-labs

18:01 rohitksingh has quit [Quit: Leaving.]

18:08 kristianpaul has joined #m-labs

19:53 <whitequark> sb0: I have an idea

19:53 <whitequark> about DMA

19:53 <GitHub177> [smoltcp] podhrmic commented on issue #56: After poking around it looks like there is no setting for rustfmt that would keep the LHS => RHS alignment as you have it. Apparently it is a discouraged style for Rust. ... https://git.io/vdX32

19:53 <whitequark> verifying currently

19:54 <GitHub177> [smoltcp] whitequark commented on issue #56: Let's keep it, but just so that your work doesn't go nowhere, please post your rustfmt config here. If we ever switch to using rustfmt I'll make use of it. https://git.io/vdX3P

19:58 <GitHub87> [smoltcp] podhrmic commented on issue #56: OK. Place `rustfmt.toml` in the root dir of the project.... https://git.io/vdXs3

20:01 <whitequark> nevermind, a fluke.

20:14 <whitequark> ok, I do have some leads for tomorrow