Present Perfect

Puppet/puppetdb/storeconfigs validation issues

Filed under: puppet,sysadmin — Thomas @ 2016-10-09 21:31

Over the past year I've chipped away at setting up new servers for apestaart and managing the deployment in puppet, as opposed to the by now years-old, manually configured single server that would be hard to replicate if the drives fail (one of which did recently, making this more urgent).

It's been a while since I felt good enough at puppet to love and hate it in equal parts, and to mostly manage to keep a deployment of around ten servers under control at a previous job.

Things were progressing an hour or two here and there at a time, and accelerated when a friend in our collective was launching a new business for which I wanted to make sure he had a decent redundancy setup.

I was saving the hardest part for last - setting up Nagios monitoring with Matthias Saou's puppet-nagios module, which needs Exported Resources and storeconfigs working.

Even on the previous server setup based on CentOS 6, that was a pain to set up - needing MySQL and ruby's ActiveRecord. But it sorta worked.

It seems that for newer puppet setups, you're now supposed to use something called PuppetDB, which is not in fact a database on its own as the name suggests, but requires another database. Of course, it chose to need a different one - Postgres. Oh, and PuppetDB itself is in Java - now you get the cost of two runtimes when you use puppet!

So, to add useful Nagios monitoring to my puppet deploys, which without it are quite happy to be simple puppet apply runs from a local git checkout on each server, I now need storeconfigs, which needs puppetdb, which pulls in Java and Postgres. And that's just so a system that handles distributed configuration can actually be told about the results of that distributed configuration and create a useful feedback cycle, allowing it to do useful things with the observed result.

Since I test these deployments on local vagrant/VirtualBox machines, I had to double their RAM because of this - even just the puppetdb java server starts with 192MB of heap reserved out of the box.
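
(As an aside: that default heap can be tuned. A minimal sketch, assuming a CentOS-style install where the service reads its JVM options from /etc/sysconfig/puppetdb - the path and variable name may differ on other distributions:)

# /etc/sysconfig/puppetdb (assumed location) - shrink or grow the JVM heap, then restart the service
JAVA_ARGS="-Xmx192m"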

But enough complaining about these expensive changes - at least there was a working puppetdb module that managed to set things up well enough.

It was easy enough to get the first host monitored, and apart from some minor changes (like updating the default Nagios config template from 3.x to 4.x), I had a familiar Nagios view working showing results from the server running Nagios itself. Success!

But all runs from the other VMs did not trigger adding any exported resources, and I couldn't find anything wrong in the logs. In fact, I could not find /var/log/puppetdb/puppetdb.log at all...

fun with utf-8

After a long night of experimenting and head scratching, I chased down a first clue in /var/log/messages saying puppet-master[17702]: Ignoring invalid UTF-8 byte sequences in data to be sent to PuppetDB

I traced that down to puppetdb/char_encoding.rb, and with my limited ruby skills, I got a dump of the offending byte sequence by adding this code:


Puppet.warning "Ignoring invalid UTF-8 byte sequences in data to be sent to PuppetDB"
File.open('/tmp/ruby', 'w') { |file| file.write(str) }
Puppet.warning "THOMAS: is here"

(I tend to use my name in debugging to have something easy to grep for, and I wanted some verification that the File dump wasn't triggering any errors)
It took a little time at 3AM to remember where these /tmp files end up thanks to systemd, but once found, I saw it was a json blob with a command to "replace catalog". That could explain why my puppetdb didn't have any catalogs for other hosts. But file told me this was a plain ASCII file, so that didn't help me narrow it down.
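
(In case it saves someone else a 3AM search: with systemd's PrivateTmp, the dump doesn't land in /tmp itself but in a per-service private directory. A rough way to find it, assuming the master runs under such a unit:)

# the service's private tmp lives under a directory like this (exact pattern may vary per distribution)
ls -d /tmp/systemd-private-*/tmp/
file /tmp/systemd-private-*/tmp/ruby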

I brute forced it by just checking my whole puppet tree:


find . -type f -exec file {} \; > /tmp/puppetfile
grep -v ASCII /tmp/puppetfile | grep -v git

This turned up a few UTF-8 candidates. Googling around, I was reminded of how terrible utf-8 handling was in ruby 1.8, and found that puppet recommends sticking to ASCII in most manifests and files to avoid issues.

It turned out to be a config from a webalizer module:


webalizer/templates/webalizer.conf.erb: UTF-8 Unicode text

While it was written by a Jesús with a unicode name, the file itself didn't have his name in it, and I couldn't obviously find where the UTF-8 chars were hiding. One StackOverflow post later, I had nailed it down - UTF-8 spaces!


00004ba0 2e 0a 23 c2 a0 4e 6f 74 65 20 66 6f 72 20 74 68 |..#..Note for th|
00004bb0 69 73 20 74 6f 20 77 6f 72 6b 20 79 6f 75 20 6e |is to work you n|

The offending character is c2 a0 - the non-breaking space.

I have no idea how that slipped into a comment in a config file, but I changed the spaces and got rid of the error.
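
(These days I'd hunt for the offending bytes directly; a quick sketch, assuming GNU grep with PCRE support:)

# list files containing a non-breaking space (UTF-8 bytes c2 a0), skipping .git
grep -rPl '\xc2\xa0' . --exclude-dir=.git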

Puppet's error was vague, did not provide any context whatsoever (Where do the bytes come from? Dump the part that is parseable? Dump the hex representation? Tell me the position in it where the problem is?), did not give any indication of the potential impact, and in a sea of spurious puppet warnings that you simply have to live with, is easy to miss. One down.

However, still no catalogs on the server, so still only one host being monitored. What next?

users, groups, and permissions

My next lead turned out to be a problem of my own making. After temporarily turning off SELinux and checking all permissions on the puppetdb files to make sure they were group-owned by puppetdb and writable for puppet, I took the last step of switching to that user and trying to write the log file myself. And it failed. Huh? And then id told me why - while /var/log/puppetdb/ was group-writable and owned by the puppetdb group, my puppetdb user was actually in the www-data group.
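
(Roughly the check that gave it away - the touch target is just an illustration:)

sudo -u puppetdb id                                # showed www-data rather than puppetdb as the group
sudo -u puppetdb touch /var/log/puppetdb/test.log  # failed with permission denied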

It turns out that I had tried to move some uids and gids around, after puppet's automatic assignment gave different results on two hosts (a problem I still don't have a satisfying answer for, as I don't want to hard-code uids/gids for system accounts in other people's modules), and clearly I got one of them wrong.

I think a server that for whatever reason cannot log should simply not start, as this is a critical error if you want a defensive system.

After fixing that properly, I now had a puppetdb log file.

resource titles

Now I was staring at an actual exception:


2016-10-09 14:39:33,957 ERROR [c.p.p.command] [85bae55f-671c-43cf-9a54-c149cedec659] [replace catalog] Fatal error on attempt 0
java.lang.IllegalArgumentException: Resource '{:type "File", :title "/var/lib/puppet/concat/thomas_vimrc/fragments/75_thomas_vimrc-\" allow adding additional config through .vimrc.local_if filereadable(glob(\"~_.vimrc.local\"))_\tsource ~_.vimrc.local_endif_"}' has an invalid tag 'thomas:vimrc-" allow adding additional config through .vimrc.local
if filereadable(glob("~/.vimrc.local"))
source ~/.vimrc.local
endif
'. Tags must match the pattern /\A[a-z0-9_][a-z0-9_:\-.]*\Z/.
    at com.puppetlabs.puppetdb.catalogs$validate_resources.invoke(catalogs.clj:331) ~[na:na]

Given the name of the command (replace catalog), I felt certain this was going to be the problem standing between me and multiple hosts being monitored.

The problem was a few levels deep, but essentially I had code creating fragments of vimrc files using the concat module, and was naming the resources with file content as part of the title. That's not a great idea, admittedly, but no other part of puppet had ever complained about it before. Even the files on my file system that store the fragments, which get their filename from these titles, were happily stored with a double quote in their name.

So yet again, puppet's lax approach to specifying the types of variables at any of its layers (hiera, puppet code, ruby code, ruby templates, puppetdb), in any of its data formats (yaml, json, bytes-for-strings without encoding information), triggers errors somewhere in the stack without informing whatever triggered that error (i.e., the agent run on the client didn't complain or fail).

Once again, puppet has given me plenty of reasons to hate it with a passion, tipping the balance.

I couldn't imagine doing server management without a tool like puppet. But you love it when you don't have to tweak it much, and you hate it when you're actually making extensive changes. Hopefully after today I can get back to the loving it part.

399 days without hard drive failures

Filed under: Fedora,sysadmin — Thomas @ 2012-12-01 20:36

Well, it's been a record 399 days, but they have come to an end. Last weekend, a drive in my home desktop started failing. I had noticed some spurious SATA errors in dmesg before, and load times were rising (although lately, with the 3.4/3.5/3.6 kernels I've been running, I've actually seen that happen more and more, so it wasn't a clear clue).

Then things really started slowing down, and a little later I noticed the telltale clicking sound a drive can make when it's about to give up.

Luckily life has taught me many valuable lessons when it comes to dealing with hard drives. The failing drive was a 1TB drive in a RAID-1 software raid setup, so fixing it would be simple - buy a new 1TB drive and put it in the RAID, and just wait for hours on end (or, go to sleep) as the RAID rebuilds.

A few years ago I started keeping track of my drives in a spreadsheet, labeling each drive with a simple four digit code - the first two digits the year I bought the drive in, and the second two digits just a sequence (and before you ask, the highest those two digits got so far is 07 - both in '11 and '12). The particular drive failing was 0906, so the drive was about 3 years old - reasonable when it comes to failure (given that it has been running pretty much 24/7), but possibly still under warranty, and I've never had the opportunity to try and get a disk back under warranty, although this particular one was bought in Belgium.

But I digress.

Of course, I seldom take the simple route. When buying hard drives, I basically only follow one rule - buy the biggest drive at the lowest price per terabyte. And at last, Barcelona stores have gotten to the point where 3TB drives hit that sweet spot. So, why buy a comparatively expensive 1TB drive and not get to have any fun with a complicated drive migration?

So I settled on a 3TB Seagate Red drive (this is a new range specifically for NAS systems, although I'm not convinced they're worth the 6% extra cost, but let's give it a try) so I could replace the penultimate 2TB drive in my ReadyNAS, get 1TB of extra capacity on that, and then just use the newly freed 2TB drive in my desktop computer.

Of course, that's when I ended up with two problems.

Problem 1 was the NAS. The ReadyNAS was at 10TB already, having 4 3TB drives and 2 2TB drives with dual redundancy. I took out a tray, replaced the drive, put it back in, and then waited a good 18 hours for the array to rebuild. (The ReadyNAS has something they call XRaid2 which really is just a fancy way of creating software raids and grouping them with LVM, but in practice it usually works really well - figuring out a number of raid devices it should create using the mix of physical drives).

This time, it had correctly done the raid shuffle, but then gave me an error message saying it couldn't actually grow the ext4 filesystem on it because it had run out of free inodes. Ouch. A lot of googling told me that I should try an offline resize, so I stopped all services using the file system, killed the apple file serving processes that somehow wouldn't shut down, and did the offline resize. And then I rebooted.
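
(For reference, the offline resize itself is a short dance; a sketch with a made-up device path, since I don't remember the ReadyNAS volume name:)

umount /data                # nothing may have the filesystem mounted or open
e2fsck -f /dev/vg/data      # a forced check is required before an offline resize
resize2fs /dev/vg/data      # grow the filesystem to fill the (now larger) device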

The ReadyNAS seemed to be happy with that at first, saying it now had more space (although depending on the tool you use, it still says 10TB, because of the 2^10 vs. 10^3 unit differences adding up). But soon after that it gave me ext3 errors. Uh oh.

With sweaty palms, I stopped all services again, unmounted the file system, and fsck'd it. And almost immediately it gave me a bunch of warnings about wrong superblocks, wrong inodes, all in the first 2048 sectors. Sure I have backups, but I wasn't looking forward to figuring out how up-to-date they were and restoring up to 10 TB from them.

I gasped for air and soldiered on, answering yes to all questions, until it churned away, and I went to sleep. The next morning, a few more yeses, and the file system seemed to have been checked. Another reboot, and everything seemed to be there... Phew, bullet 1 dodged.

On to problem number 2 - the desktop. The first bit was easy enough - although I've never been able to use gdisk to copy over partition tables like I used to with fdisk - it seems to say it did it, but it never actually updates the partition table. Anyway, I created it by hand copying the exact numbers, then added the partitions to the software raid one by one, and again waited a good 6 hours.
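
(For what it's worth, sgdisk can replicate a GPT in one go; a sketch assuming /dev/sda is the existing disk and /dev/sdb the new one - device and array names are made up:)

sgdisk --replicate=/dev/sdb /dev/sda     # copy sda's partition table onto sdb
sgdisk --randomize-guids /dev/sdb        # give the copy its own disk and partition GUIDs
mdadm --manage /dev/md0 --add /dev/sdb1  # then add each new partition to its raid device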

And looking at my drive spreadsheet, I noticed I had a spare 2TB drive lying around that I was keeping in case one of the NAS drives would fail - but given that most of them are 3TB right now, that wouldn't be very useful. So, after the software raid rebuilt in my desktop, I switched out the working 1 TB drive as well, and repeated the whole dance.

So now I had 2 2TB drives, 1 TB of which was correctly used. At this point I would normally figure out how to grow the partitions, then the md device, then the LVM on it, and then finally grow my ext4 /home partition. But since it's using LVM and I never played with it much, this time I wanted to experiment.

I still had the working 1TB drive which I could use as a backup in case everything would fail, so I was safe as houses.

At first I was hoping to do this with gparted live, but it seems gparted doesn't understand either software raid or lvm natively, so it's back to the command line.
Create two linux raid partitions on the two 2TB drives, assemble a new md device, and spend a lot of time reading the LVM howto.

In the end it was pretty simple; step 1 was to use vgextend to add the new md device to the volume group, and then lvextend -l +100%FREE -r to grow the logical volume and resize the file system all at once. That automatically runs fsck (whose progress you can follow by sending it USR1) and then resize2fs (whose progress you can't really check once it's started).
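
(A sketch of that sequence, with made-up volume and device names:)

pvcreate /dev/md1                            # prepare the new md device as a physical volume
vgextend vg_home /dev/md1                    # add it to the existing volume group
lvextend -l +100%FREE -r /dev/vg_home/home   # grow the LV and resize the ext4 filesystem in one go
# from another terminal, e2fsck will print a progress line when poked:
pkill -USR1 e2fsck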

(By now, we're over a week into the whole disk dance, in case you were wondering - doing anything with TB-sized disks takes a good night for each operation).

Except that now rebooting for some reason didn't work - grub complained that it didn't know the filesystems it needed - /boot is on a software raid too, and even though I don't recall running anything grub-related in this whole process, I had swapped out a few disks and may have botched something up when transferring boot records.

At the same time, I was also experimenting with Matthias's excellent new GLIM boot usb project (where you finally just drop in .iso files if you want to have multiple bootable systems on your usb key, without too much fidgeting), so I tried doing this in system rescue cd.

Boot into that, manually mount the right partitions, chroot into that, and then grub2-install /dev/sda.
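
(Roughly the rescue-CD dance, with assumed device names - /dev/md0 for /boot and an LVM root:)

mount /dev/vg_root/root /mnt
mount /dev/md0 /mnt/boot
for fs in proc sys dev; do mount --bind /$fs /mnt/$fs; done
chroot /mnt grub2-install /dev/sda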

Except that grub complained saying

Path `/boot/grub2' is not readable by GRUB on boot. Installation is impossible. Aborting.

Most likely this was due to it being on a software raid. Lots of people seemed to run into that, but no clear solutions, so I went the dirty way. I stopped the raid device, mounted one half of it as a normal ext file system (tried read-only first, but grub2-install actually needs to write to it), ran grub2-install, unmounted again. Then I recreated the software raid device for /boot again by reassembling, and that somehow seemed to work.

Reboot again, and this time it got past GRUB, but dropped me into a rescue shell. My mdadm.conf didn't list the new raid device, so the whole volume group failed to start. Use blkid to identify the UUID, add that to /etc/mdadm.conf (changing the way it's formatted, those pesky dashes and colons in different places), verify that it can start the array, and reboot.
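
(The lazier alternative I'd reach for now - a sketch, assuming mdadm can see all the arrays:)

mdadm --detail --scan >> /etc/mdadm.conf   # append ARRAY lines, with UUIDs already in mdadm's own format
mdadm --assemble --scan                    # verify every array in the config can be started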

And finally, the reboot seems to work. Except, it needs to do an SELinux relabel for some reason! And in the time it took me to write this way-too-long blogpost, it only managed to get up to 54%.

And I was hoping to write some code tonight...

Oh well, it looks like I will have 1TB free again on my NAS, and 1TB of free space on my home desktop.

There is never enough space for all of the internet to go on your drives...

UPDATE: SELinux relabeling is now at 124%. I have no idea what to expect.

Puppet pains

Filed under: sysadmin — Thomas @ 2012-03-27 14:53

The jury is still out on puppet as far as I'm concerned.

On the one hand, of course I relish that feeling of ultimate power you are promised over all those machines... I appreciate the incremental improvements it lets you make, and have it give you the feeling that anything will be possible.

But sometimes, it is just so painful to deal with. Agent runs are incredibly slow - it really shouldn't take over a minute for a simple configuration with four machines. Also, does it really need to be eating 400 MB of RAM while it does so? And when running with the default included web server (is that webrick?), I have to restart my puppetmaster for every single run, because there is this one multiple definition that I can't figure out, which simply goes away when you restart but comes back after an agent run:

err: Could not retrieve catalog from remote server: Error 400 on SERVER: Duplicate definition: Class[Firewall::Drop] is already defined; cannot redefine at /etc/puppet/environments/testing/modules/manifests/firewall/drop.pp:19 on node esp

And sometimes it's just painfully silly. I just spent two hours trying to figure out why my production machine couldn't complete its puppet run.

All it was telling me was

Could not evaluate: 'test' is not executable

After a lot of googling, I stumbled on this ticket. And indeed, I had a file called 'test' in my /root directory.

I couldn't agree with the reporter more:

I find it incredibly un-pragmatic to have policies fail to run whenever someone creates a file in root which matches the name of an executable I am running.

More adventures in puppet

Filed under: General,Hacking,sysadmin — Thomas @ 2012-03-04 23:32

After last week's Linode incident I was getting a bit more worried about security than usual. That coincided with the fact that I found I couldn't run puppet on one of my linodes, and some digging turned up that it was because /tmp was owned by uid:gid 1000:1000. Since I didn't know the details of the break-in (and I hadn't slept more than 4 hours for two nights, one of which involved a Flumotion DVB problem), I had no choice but to be paranoid about it. And it took me a good half hour to realize that I had inflicted this problem on myself - a botched rsync command (rsync -arv . root@somehost:/tmp).
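
(For the record, undoing that particular accident just means giving /tmp back its usual owner and sticky bit:)

chown root:root /tmp
chmod 1777 /tmp   # world-writable, sticky bit set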

So I wasn't hacked, but I still felt I needed to tighten security a bit. So I thought I'd go with something simple to deploy using puppet - port knocking.

Now, that would be pretty easy to do if I just deployed firewall rules in a single set. But I started deploying firewall rules using the puppetlabs firewall module, which allows me to group rules per service. So that's the direction I wanted to head off into.

On Saturday, I worked on remembering enough iptables to actually understand how port knocking works in a firewall. Among other things, I realized that our current port knocking setup is not ideal - it uses only two ports. They're in descending order, so they would usually not be triggered by a normal port scan, but they would be triggered by one in reverse order. That is probably why most sources recommend using three ports, where the third port is between the first two, so they're out of order.

So I wanted to start by getting the rules right, and understanding them. I started with this post, and found a few problems in it that I managed to work out. The fixed version is this:

UPLINK="p21p1"
#
# Comma separated list of ports to protect, with no spaces.
SERVICES="22,3306"
#
# Location of iptables command
IPTABLES='/sbin/iptables'

# in stage1, connects on 3456 get added to knock2 list
${IPTABLES} -N stage1
${IPTABLES} -A stage1 -m recent --remove --name knock
${IPTABLES} -A stage1 -p tcp --dport 3456 -m recent --set --name knock2

# in stage2, connects on 2345 get added to heaven list
${IPTABLES} -N stage2
${IPTABLES} -A stage2 -m recent --remove --name knock2
${IPTABLES} -A stage2 -p tcp --dport 2345 -m recent --set --name heaven

# at the door:
# - jump to stage2 with a shot at heaven if you're on list knock2
# - jump to stage1 with a shot at knock2 if you're on list knock
# - get on knock list if connecting to 1234
${IPTABLES} -N door
${IPTABLES} -A door -m recent --rcheck --seconds 5 --name knock2 -j stage2
${IPTABLES} -A door -m recent --rcheck --seconds 5 --name knock -j stage1
${IPTABLES} -A door -p tcp --dport 1234 -m recent --set --name knock

${IPTABLES} -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
${IPTABLES} -A INPUT -p tcp --match multiport --dport ${SERVICES} -i ${UPLINK} -m recent --rcheck --seconds 5 --name heaven -j ACCEPT
${IPTABLES} -A INPUT -p tcp --syn -j door

# close everything else
${IPTABLES} -A INPUT -j REJECT --reject-with icmp-port-unreachable

And it gives me this iptables state:
[screenshot: resulting iptables rules]

So the next step was to reproduce these rules using puppet firewall rules.

Immediately I ran into the first problem - my rules need new chains, and there doesn't seem to be a way to create those with the firewall resource. At the same time, the rules use the iptables 'recent' match module, and none of that is implemented either. I spent a bunch of hours trying to add this, but since I don't really know Ruby and I've only started using Puppet for real in the last two weeks, that wasn't working out well. So then I thought, why not look in the bug tracker and see if anyone else has tried to do this? I ask my chains question on IRC, while I find a ticket about 'recent' support. A minute later danblack replies on IRC with a link to a branch that supports creating chains - the same person that made the 'recent' branch.

This must be a sign - the same person helping me with my problem in two different ways, with two branches? Today will be a git-merging to-the-death hacking session, fueled by the leftovers of yesterday's mexicaganza.

I start with the branch that lets you create chains, which works well enough, bar some documentation issues. I create a new branch and merge this one on, ending up in a clean rebase.

Next is the recent branch. I merge that one on. I choose to merge in this case, because I hope it will be easier to make the fixes needed in both branches, but still pull everything together on my portknock branch, and merge in updates every time.

This branch has more issues - rake test doesn't even pass. So I start digging through the failing testcases, adding print debugs and learning just enough ruby to be dangerous.

I slowly get better at fixing bugs. I create minimal .pp files in my /etc/puppet/manifests so I can test just one rule with e.g. puppet apply manifests/recent.pp

The firewall module hinges around being able to convert a rule to a hash as expressed in puppet, and back again, so that puppet can know that a rule is already present and does not need to be executed. I add a conversion unit test for each of the features that tests these basic operations, but I end up actually fixing the bugs by sprinkling print's and testing with a single apply.

I learn to do service iptables restart; service iptables stop to reset my firewall and start cleanly. It takes me a while to realize when I've botched the firewall so badly that I can't even google (in my case, by forgetting to have -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT) - not helped by the fact that for the last two weeks the network on my home desktop has been really flaky, simply stopping after some activity and forcing me to restart NetworkManager and reload network modules.

I start getting an intuition for how puppet's basic resource model works. For example, if a second puppet run produces output, something's wrong. I end up fixing lots of parsing bugs because of that - once I notice that a run tells me something like

notice: /Firewall[999 drop all other requests]/chain: chain changed '-p' to 'INPUT'
notice: Firewall[999 drop all other requests](provider=iptables): Properties changed - updating rule

I know that, even though the result seems to work, I have some parsing bug, and I can attack that bug by adding another unit test and adding more prints for a simple rule.

I learn that, even though the run may seem clean, if the module didn't figure out that it already had a rule (again, because of bogus parsing), it just adds the same rule again - another thing we don't want. That gets fixed on a few branches too.

And then I get to the point where my puppet apply brings all the rules together - except it still does not work. And I notice one little missing rule: ${IPTABLES} -A INPUT -p tcp --syn -j door

And I learn about --syn, and --tcp-flags, and to my dismay, there is no support for tcp-flags anywhere. There is a ticket for TCP flags matching support, but nobody worked on it.

So I think, how hard can it be, with everything I've learned today? And I get onto it. It turns out it's harder than expected. Before today, all firewall resource properties swallowed exactly one argument - for example, -p (proto). In the recent module, some properties are flags, and don't have an argument, so I had to support that with some hacks.

The rule_to_hash function works by taking an iptables rule line, and stripping off the parameters from the back in reverse order one by one, but leaving the arguments there. At the end, it has a list of keys it saw, and hopefully, a string of arguments that match the keys, but in reverse order. (I would have done this by stripping the line of both parameter and argument(s) and putting those on a list, but that's just me)

But the --tcp-flags parameter takes two arguments - a mask of flags, and a list of flags that needs to be set. So I hack it in by adding double quotes around it, so it looks the same way a --comment does (except --comment is always quoted in iptables --list-rules output), and handle it specially. But after some fidgeting, that works too!
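
(For reference, --syn is just shorthand, so the generated rule should end up equivalent to:)

# --syn matches packets with SYN set and ACK, RST and FIN clear
iptables -A INPUT -p tcp --tcp-flags FIN,SYN,RST,ACK SYN -j door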

And my final screenshot for the day:
[screenshot: final iptables rules with port knocking in place]

So, today's result: a working node that implements port knocking:

node 'ana' {

$port1 = '1234'
$port2 = '3456'
$port3 = '2345'

$dports = [22, 3306]

$seconds = 5

firewall { "000 accept all icmp requests":
proto => "icmp",
action => "accept",
}

firewall { "001 accept all established connections":
proto => "all",
state => ["RELATED", "ESTABLISHED"],
action => "accept",
}

firewall { "999 drop all other requests":
chain => "INPUT",
proto => "tcp",
action => "reject",
}

firewallchain { [':stage1:', ':stage2:', ':door:']:
}

# door
firewall { "098 knock2 goes to stage2":
chain => "door",
recent_command => "rcheck",
recent_name => "knock2",
recent_seconds => $seconds,
jump => "stage2",
require => [
Firewallchain[':door:'],
Firewallchain[':stage2:'],
]
}

firewall { "099 knock goes to stage1":
chain => "door",
recent_command => "rcheck",
recent_name => "knock",
recent_seconds => $seconds,
jump => "stage1",
require => [
Firewallchain[':door:'],
Firewallchain[':stage1:'],
]
}

firewall { "100 knock on port $port1 sets knock":
chain => "door",
proto => 'tcp',
recent_name => 'knock',
recent_command => 'set',
dport => $port1,
require => [
Firewallchain[':door:'],
]
}

# stage 1
firewall { "101 stage1 remove knock":
chain => "stage1",
recent_name => "knock",
recent_command => "remove",
require => Firewallchain[':stage1:'],
}

firewall { "102 stage1 set knock2 on $port2":
chain => "stage1",
recent_name => "knock2",
recent_command => "set",
proto => "tcp",
dport => $port2,
require => Firewallchain[':stage1:'],
}

# stage 2
firewall { "103 stage2 remove knock":
chain => "stage2",
recent_name => "knock",
recent_command => "remove",
require => Firewallchain[':stage2:'],
}

firewall { "104 stage2 set heaven on $port3":
chain => "stage2",
recent_name => "heaven",
recent_command => "set",
proto => "tcp",
dport => $port3,
require => Firewallchain[':stage2:'],
}

# let people in heaven
firewall { "105 heaven let connections through":
chain => "INPUT",
proto => "tcp",
recent_command => "rcheck",
recent_name => "heaven",
recent_seconds => $seconds,
dport => $dports,
action => accept,
require => Firewallchain[':stage2:'],
}

firewall { "106 connection initiation to door":
# FIXME: specifying chain explicitly breaks insert_order !
chain => "INPUT",
proto => "tcp",
tcp_flags => "FIN,SYN,RST,ACK SYN",
jump => "door",
require => [
Firewallchain[':door:'],
]
}
}

and I can log in with

nc -w 1 ana 1234; nc -w 1 ana 3456; nc -w 1 ana 2345; ssh -A ana

Lessons learned today:

  • watch iptables -nvL is an absolutely excellent way of learning more about your firewall - you see your rules and the traffic on them in real time. It made it really easy to see for example the first nc command triggering the knock.
  • Puppet is reasonably hackable - I was learning quickly as I progressed through test and bug after test and bug.
  • I still don't like ruby, and we may never be friends, but at least it's something I'm capable of learning. Puppet might just end up being the trigger.

Tomorrow, I need to clean up the firewall rules into something reusable, and deploy it on the platform.

How do you manage mailing lists?

Filed under: Question,sysadmin — Thomas @ 2012-01-02 17:00

Every new year is a time of cleaning. After getting back to Inbox 0, my next target is my mailing list subscriptions.

It must be something psychological, but I cannot bring myself to unsubscribe from some of these mailing lists. I don't check on them daily, but once in a while it's darn useful to search through my local copy of mails on, say, selinux, and find solutions for a problem I'm having.

However, all this mailing list mail brings me a lot of headaches. My email client is slow, and I want it to be fast for the real mail I'm getting (from actual people, needing actual work). It's hard to track the mails that matter - all my list mail gets put into folders automatically with some procmail magic, but that also means that some of the things I should be paying more attention to are just another bold folder in Evolution somewhere down the mail tree. And lastly, the server where I host my mail, shared with friends, gets too much traffic, and syncing 3 different Evolutions over IMAP with it is a big part of the burden.

I vastly preferred the newsreader model of old, and I think the de facto standard of mailing lists really is a mistake. But I'm not sure what to replace it with.

What I want:

  1. have selected mailing list archives be available on my machines, locally
  2. have them synced/updated automatically
  3. have them out of the way of my normal mail usage unless when I need them

I've been considering getting a separate email account just for email lists for this purpose, although I don't look forward much to having to change all my subscriptions, and would first like to hear from other people how this approach works out for them.

There used to be a push towards web-based mailing list subscriptions, but I don't know if anyone is really seriously using that, and I would like to have the option of reading these mailing list archives offline.

How do you separate your 'real' mail from your mailing list mail? How do you handle them?
