thomas.apestaart.org » 2013

measuring puppet

Filed under: puppet — Thomas @ 20:58

2013-01-24
20:58

For one of work's projects, we'll soon be working on scaling our platform more, which will require deploying a bunch more machines. For this project, we basically have a local dev platform, an online dev platform, a preproduction platform, and a production platform.

Right now, all these platforms have exactly one host. There are two puppetmasters, one for the dev platforms and one for the pre/pro platforms.

Since deploying a bunch more machines is going to require a lot more puppet running, I want to work on removing as much friction as I can from my puppet work. I do the runs manually as we upgrade platforms during deployment, and a run typically takes well over a minute. For me, that's too long - it causes me to waste time, lose focus, task switch, and forget I should be following up on puppet runs. It makes finetuning puppet modules a chore as I hack on them.

So I wanted to start by trimming some of the obvious fat before I segment my puppet config into separately testable pieces. I would have expected puppet apply to actually have something to help with that, but it doesn't.

After thinking it through, I realized I wanted some kind of tool that would timestamp the output of puppet apply --debug, so I could see which of the things it does take more time than others.

I wasn't sure what to google for, but timestamp stdout seemed to bring up some results, and I hit on http://joeyh.name/code/moreutils/ which includes a tool called 'ts', a simple pipe filter that timestamps lines going to stdout.

That was almost good enough. What I really wanted though was to know how much time elapsed since printing the last line. My perl is rusty, but I managed to quickly cook up a patch that makes it print incremental timestamps.
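For anyone without the patched ts at hand, the idea of incremental timestamps can be sketched in plain shell. This is only an illustration and not the Perl patch itself: the function names incremental_ts and format_delta_ns are made up here, and the sketch assumes GNU date (for %N) and a shell with 64-bit arithmetic.

```shell
# Sketch only: not the actual ts patch. Assumes GNU date (%N = nanoseconds).

# format_delta_ns NANOSECONDS -> prints HH:MM:SS.micros
format_delta_ns() {
    local ns=$1
    local secs=$(( ns / 1000000000 ))
    local micros=$(( (ns % 1000000000) / 1000 ))
    printf '%02d:%02d:%02d.%06d' \
        $(( secs / 3600 )) $(( (secs % 3600) / 60 )) $(( secs % 60 )) "$micros"
}

# incremental_ts: prefix every line on stdin with the time elapsed
# since the previous line (or since startup, for the first line).
incremental_ts() {
    local prev now line
    prev=$(date +%s%N)
    while IFS= read -r line; do
        now=$(date +%s%N)
        printf '%s %s\n' "$(format_delta_ns $(( now - prev )))" "$line"
        prev=$now
    done
}
```

It would be used as a pipe filter, for example some_command | incremental_ts. Spawning date for every line adds a little overhead per line, so very small deltas will likely come out larger than what ts itself reports.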

Now I can do a puppet run like this:
puppet apply --modulepath=`pwd`/modules:`pwd`/dev/modules manifests/site.pp --debug | ts -i "%H:%M:%.S" | egrep -B 1 "^00:00:0[^0]"

00:00:00.001066 debug: Executing 'test -e ../commit && ( test xorigin/master == `cat ../commit` || ( git fetch -a; test `git rev-parse --verify origin/master^0 | head -n 1` == `cat ../commit` ) )'
00:00:02.646908 debug: Service[postfix](provider=redhat): Executing '/sbin/service postfix status'
--
00:00:00.000987 debug: Executing 'test -e ../commit && ( test xorigin/release-1.4.x == `cat ../commit` || ( git fetch -a; test `git rev-parse --verify origin/release-1.4.x^0 | head -n 1` == `cat ../commit` ) )'
00:00:01.871258 debug: Exec[git-checkout-/var/www/partner-test](provider=posix): Executing check 'test -e ../commit && ( test xorigin/release-1.4.x == `cat ../commit` || ( git fetch -a; test `git rev-parse --verify origin/release-1.4.x^0 | head -n 1` == `cat ../commit` ) )'
00:00:00.000942 debug: Executing 'test -e ../commit && ( test xorigin/release-1.4.x == `cat ../commit` || ( git fetch -a; test `git rev-parse --verify origin/release-1.4.x^0 | head -n 1` == `cat ../commit` ) )'
00:00:01.886606 debug: Prefetching aliases resources for mailalias
--
00:00:00.000957 debug: Executing '/usr/sbin/semanage fcontext -l | grep -q '^/home/git/dev(/.*)?''
00:00:01.750281 debug: /Stage[main]/Dexter::Apache/Selinux::Set_fcontext[home-httpd]/Exec[semanage-/home/git/dev-httpd_sys_content_t]/unless: /usr/sbin/semanage: Broken pipe
--
00:00:00.000855 debug: Executing 'test -e ../commit && ( test xorigin/release-1.4.x == `cat ../commit` || ( git fetch -a; test `git rev-parse --verify origin/release-1.4.x^0 | head -n 1` == `cat ../commit` ) )'
00:00:02.064475 debug: /Schedule[puppet]: Skipping device resources because running on a host
--
00:00:00.001048 debug: Executing '/usr/sbin/semanage fcontext -l | grep -q '^/srv/merchant(/.*)?''
00:00:01.750129 debug: /Stage[main]/Partner::Install/Selinux::Set_fcontext[srv-merchant-httpd]/Exec[semanage-/srv/merchant-httpd_sys_content_t]/unless: /usr/sbin/semanage: Broken pipe
--
00:00:00.000861 debug: Executing 'test -e ../commit && ( test xmaster == `cat ../commit` || ( git fetch -a; test `git rev-parse --verify master^0 | head -n 1` == `cat ../commit` ) )'
00:00:01.841316 debug: Exec[git-checkout-/var/www/merchant](provider=posix): Executing check 'test -e ../commit && ( test xorigin/release-1.4.x == `cat ../commit` || ( git fetch -a; test `git rev-parse --verify origin/release-1.4.x^0 | head -n 1` == `cat ../commit` ) )'
00:00:00.000955 debug: Executing 'test -e ../commit && ( test xorigin/release-1.4.x == `cat ../commit` || ( git fetch -a; test `git rev-parse --verify origin/release-1.4.x^0 | head -n 1` == `cat ../commit` ) )'
00:00:01.858206 debug: Service[httpd](provider=redhat): Executing '/sbin/service httpd status'

Some explanation is in order.

The puppet apply is straightforward if you know puppet a little - it will apply a manifest and spit out a lot of debug info.

The output gets piped into ts, which will do incremental timestamping (with -i, which is what my patch adds) according to the specified format (ts by default uses seconds precision, but can do microsecond precision if you use %.S in the format).

Then I grep for all lines that take at least 1 second to be output, and display the line before that too (since puppet is generating output either before or after a possibly long-running task, so either on the line before or the line that took too long).
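To see what the grep step keeps and what it drops, here is a small sketch on made-up input. The sample lines and the function names sample_log, slow_lines and slow_lines_wide are invented for illustration; grep -E is used as the equivalent of egrep, and the "wide" pattern is just one possible variant, not something from the original pipeline.

```shell
# Made-up sample of incrementally timestamped output, for illustration only.
sample_log() {
    cat <<'EOF'
00:00:00.000912 debug: fast step
00:00:00.001066 debug: Executing 'some slow command'
00:00:02.646908 debug: first line printed after the slow command
00:00:00.000800 debug: another fast step
00:00:00.001200 debug: Executing 'some very slow command'
00:00:12.300000 debug: first line printed after the very slow command
EOF
}

# Same pattern as in the pipeline: seconds field 01-09,
# plus one line of context before each match.
slow_lines() {
    grep -E -B 1 '^00:00:0[^0]'
}

# Hypothetical wider variant: any non-zero digit in the HH:MM:SS part,
# so deltas of 10 seconds or more are kept as well.
slow_lines_wide() {
    grep -E -B 1 '^[0-9:]*[1-9]'
}
```

Running sample_log | slow_lines keeps only the 2.6 second pair. The 12 second line is dropped because its seconds field starts with 1 rather than 0, so anything that takes 10 seconds or longer needs a wider pattern such as the one in slow_lines_wide, which keeps both pairs.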

In the first section, I doubt service postfix status is to blame, so it's probably my convoluted git updating that takes too long. I need to rework that module so it doesn't fetch on every run.

In the third section, semanage is to blame. Hm, maybe I need to find a different way to look up whether the particular fcontext rule I want to add is already there. I've considered converting it to facts, although that sounds like it would be stretching facts a little - that's a lot of info to store in a fact.

The others are repeats of both, so I know where to start trimming the fat now!

And when all > 1 sec items are gone, time to shave off more below that.

If you want to try out ts with incremental timestamping, it's available in the rebuilt moreutils rpm in my package repositories for CentOS 6 and F16/17/18.

If any puppetmaster (hah!) has good tips on how to debug and measure the catalog generation step (the one on the master), let me know!


mach 1.0.2 “ears” released

Filed under: mach,Releases — Thomas @ 22:32

2013-01-22
22:32

Another Fedora, another mach release. This release fixes a minor bug and adds support for Fedora 18.

Get the source, update from my repository, or wait until updates hit the Fedora repository.

Happy packaging!


morituri 0.2.0 “ears” released

Filed under: morituri,Releases — Thomas @ 23:45

2013-01-20
23:45

A new year, a new morituri release.

I got informed some people wanted to use morituri with a different log output, so I made the logger pluggable.

For my personal use, I have now gotten to ripping all my singles and EPs, and so instead of having singles with the same name as an album overwrite the album, I added template variables for the release type. I've also changed the default templates to use them, so if you were relying on the default template for your collection, you may want to either move those files or use the previous default template.

morituri now has a config file, so once you've run rip offset find to find your drive's offset, it will save it and automatically use it for ripping. Same for checking whether cdparanoia can defeat the drive's caching. morituri saves it by drive information, not by device node, so it will work with different USB drives too.

See the trac page for more info and download links. You can also download it from my package repository for Fedora 17 and 18 if that's your distro.

For the curious, here's some more info:

This is morituri 0.2.0, "ears"


Coverage in 0.2.0: 67 % (1890 / 2807), 95 python tests

Features added in 0.2.0:

- added plugins system for logger
- added rip cd rip --logger to specify logger
- added reading speed, cdparanoia and cdrdao version to logger
- added rip drive analyze to detect whether we can defeat audio cache behaviour
- store drive offsets and cache defeating in config file
- rip drive list shows configured offset and audio cache defeating
- added rip image retag --release-id to specify the release id to tag with
- added %r/%R for release type to use in track/disc template
- added %x for extension to release template

Bugs fixed in 0.2.0:

- 89: Fails to rip track with \ in its name
- 105: Backslash in track names causes "Cannot find file" during rip
- 108: Unable to find offset / rip
- 109: KeyError when running "rip offset find"
- 111: Python traceback when config has no read offset for CD
- 76: morituri should allow for a configuration file
- 96: rip image retag: allow specification of release ID
- 107: Backslash in track name confuses AR step
- 112: add MusicBrainz lookup URL to generated logfile


Fedora 18 part one

Filed under: Fedora,General — Thomas @ 00:10

2013-01-17
00:10

Yesterday, I was wondering if there shouldn't be a new Fedora out by now and if it would fix a bunch of my current GNOME 3 annoyances.

So I checked, and lo and behold, the final release date was yesterday! Excellent.

Let's do some completely unscientific scoring this time around. In part one, it's bound to get ugly because you always run into the negatives first when doing an upgrade.

First challenge was finding the torrent links for the full DVD. Apparently the DVD is now a well-hidden option, and the torrent even more so - I had to google for it, I couldn't find any links on the download site. -1 and -1. I appreciate that there is a small CD with a live installer and everything, but I have to upgrade 3 computers in total so I prefer to download once as much as I can - although it's likely all of them will need to upgrade gazillions of packages soon after.

The second issue: after booting, by default it gives you the second option - test media and install. I didn't realize that, and just hit Enter. Then anaconda starts counting something without telling you what it's doing, at which point I figured it would be a media check as it was really slow. But if you hit Esc to abort it, you get dropped into a rescue shell, instead of just continuing. Err, OK. I don't know if anyone out there really uses or prefers the media check option, but I never do.

Reboot, make sure to go up to select the first option (which, really, should be switched to the second if you're not going to default to it ?)

-1 for being confusing and defaulting to wasting my time.

This is the first anaconda that is actually full-screen on my laptop, nice. +1

The first impression of anaconda is that it looks good and it looks very GNOME 3-y. Not entirely sure I like the 'things popping up on my screen as anaconda presumably checks stuff' without telling me, because there's potential for getting it wrong (-1), but I'll accept it for now. Definitely liking that it figured out my network connection automatically, if NetworkManager is behind it then I'll be darned - yay for NetworkManager! (+10 alone for that one, so NetworkManager pulls out slightly from minus infinity.)

Date & Time: since you can now rely on the network possibly working, maybe you should look me up by IP as a sane default instead of New York ? Network Time was on by default, but no. No points for or against though.

On to partitioning. I've always used a custom layout. The new dialog scares me a little; I checked "I don't need help" but the only option forward is 'reclaim space', so I'm not sure it's not going to do anything bad to my drives. -5

On the next screen, I see a tree with New Fedora 18 Installation, Fedora Linux 15, Fedora Linux 16, and Unknown. I typically have two or three root partitions so I can test different Fedora versions and fall back to older ones when I'm upgrading. It's a little confusing to use the tree, but basically I figured out how to go through the Fedora Linux 15 config and get it to move the ones I wanted to use to the New Fedora 18 Installation. I definitely see the potential for this being easier to use than the old way, but it needs a bit more documentation or tooltips or explanation to make it really feel safe to use. Maybe it would help too to have a final overview page when finishing partitioning, so you can confirm that it looks like it's going to do the right thing. -1 for the confusion, but +5 for finding partition info from all my roots.

It's a small touch, but it's nice it's asking for your root password *while* already installing packages. Makes it a little faster to get it done. +3 guys! I'm wondering if it couldn't do more of that - your time zone config for example ?

The redesigned anaconda really looks nice, and fits in well with the GNOME 3 experience I'm now used to. Gutsy move, but this is going to pay off in the long run. +10

After waiting for the packages to install, I clicked Reboot, and it dropped me in a text mode that said [terminated]. oh well, nothing's perfect I guess. -2

On a hard reboot, I got greeted with a reasonably nice GUI that had managed to pick up my old boot options - even the Windows partition I have on this machine. +5 The text looked ugly and stretched (-1), but it could have been worse.

Firstboot starts and greets me first with a big white square around my cursor (-1), and then the nice-looking GUI. Firstboot asks me for date and time info again, not sure why. Maybe an oversight. -1

And then we're on to the login screen. And it definitely looks nice! +5

Logging in. Being told there are updates. Holy crap - 218 updates - for a release that's a day old. Does the word 'release' mean anything anymore? -5. Seriously, freeze that crap for a few days, only real security issues or facepalm bugs.

My update failed to process - because I had installed the rpmfusion rpm and it doesn't have the GPG key. Yet again, by default updating packages fails completely when anything in the config is not working, instead of at least getting me the updates that can be installed, in the name of, you know, security. -10 because this is a persistent attitude problem for yum.

Create some missing symlinks, and the upgrade can continue. So I leave for lunch.

And when I come back, I am greeted by some kind of lock screen. It looks pretty. (+3) It's like a video game, those arrows. Yes, that's it - it reminds me of when I pretend to be Batman in Arkham City and I'm on a mission and it's telling me to glide down in the direction of the three floating arrows. Except, it's not actually acting like a lock screen - when I click it, something happens and I go to a user selection ? It looks like I got logged out behind my back ? Really ? Is it doing some kind of automatic logout after upgrading ? I hope not, that would be horrible as a default. No clue what happened. -5

I log in again, and recover my vim sessions that got so brutally killed.

I start running the install commands that are part of my general upgrade checklist. In the meantime, I check out this rumour I heard that Fedora 18 installs with kernel 3.6 by default but the one day old upgrades install 3.7, so I run rpm -qa | grep kernel.

Oh my. It's spewing db errors halfway through the query. Three times in a row. Contrary to popular belief, rpm is really robust, and you really need to do evil things to get it to corrupt, like drop your hard drive or kill -9 during package installs. But here it fails on a simple query; presumably, for the first time in my experience, it can't handle querying while installing ? -5

After letting it sit there and install some more, I get that lock screen again. I click it, and some arrows flash. Maybe I'm supposed to drag it up or something ? But before I can do so, the screen flashes, and I'm back to the login prompt. Oh, so even worse - this new lock screen crashes my whole desktop somehow ? Ouch. -5

Evolution forgot my sort settings (per folder) and 3-pane window. -3 for making me suffer through having to sort every single folder by date, descending again (really, is unsorted a sane default to anyone ?)

The lock screen looked cool at first glance, but after what feels like lifting up the door to my garage four times today already it's getting on my nerves. -3 Same with the 'pressure-triggered' notification area, which is starting to cause pain in my hand on my laptop, and I never have that kind of trouble. I wonder if these things got designed with a console joypad as an interface, where you could accept that pressure-triggered actions make sense. -3 for sucking and another -3 for making me think originally that it looked cool until I actually had to use it.

My first login as a 'fresh' user (I don't mount my real home until I'm sure all the basics work ok) is very zippy and GNOME 3 looks tidier. +5

However, my second login, with my old user, takes a good 30 seconds before anything at all appears besides the desktop. I don't know which dead weight I'm dragging along from before, but this upgrade is not liking it one bit. No feedback whatsoever on what's going wrong though. -3

Total score so far: -13.

It didn't pull back to breakeven, but don't despair - now that the basics are done, it's bound to get better in the next part.

(editor's note - see if you can tear this whole article to pieces by pointing out a counting error in the score, cleverly invalidating my already unimportant opinion!)
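Taking that note at its word, the tally is easy to check with a few lines of shell. The list below is one reading of the scores in the order they appear in the text (counting three separate -3s in the lock screen and notification area paragraph, and ignoring the minus infinity joke), so it is an assumption rather than an official scorecard; the function name tally is made up.

```shell
# Scores in order of appearance, as read from the text above (an assumption).
scores="-1 -1 -1 +1 -1 +10 -5 -1 +5 +3 +10 -2 +5 -1 -1 -1 +5 -5 -10 +3 -5 -5 -5 -3 -3 -3 -3 +5 -3"

# Sum all scores and print the total.
tally() {
    local total=0 s
    for s in $scores; do
        total=$(( total + s ))
    done
    echo "$total"
}
```

With this reading, tally prints -13, which matches the total above.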

