metche & collective system administration procedures
One of the initial motivations of the boum.org project (2003). Let's present some thoughts about this (still) ongoing experiment...
Most common way of doing sysadmin
- The Bastard Operator From Hell!
- One lonely, power-hungry male behind his keyboard
- Low "bus factor" (one person holds all the knowledge)
- Doesn't really fit with our ideals
In your collectives?
- Go-around on current practices (15 minutes)
- A collective of geeks and not-so-geeks
- Knowledge exchange between members
- Keep track of members' actions
- Peer review
- No one should be central
Creation of metche
- The "apt-get upgrade" syndrome:
- No one should be able to update the system without telling the others
- Force admins to pay attention to the whole server, not just a single service
- Unattended, unavoidable monitoring (big brother like)
- Encourage peer review and record tracking
- Rollback of quick & dirty hacks or mistakes
- Monitors system states:
- installed Debian packages
- Sends changes by e-mail
- Backs up important states
System status model
- Loosely based on Debian release system
- When a change is detected, state saved as "unstable"
- After one hour without new changes, the state is promoted to "testing":
- Mail is sent to the collective's mailing list
- Previous "unstable" states are deleted
- Three more days later, if there are no further modifications, the state is promoted to "stable":
- Mail is sent
- Get rid of previous "testing" states
- Every change must have a corresponding Changelog entry
- Document sysadmin work like software development
- Human-readable communication channel between human admins
- Keep track of server history
- Note tricky issues
- "Simple" shell script (828 lines in trunk 2006-08-22)
- Crontab runs metche every 5 minutes
- Full tarball of /etc in /var/lib/metche/
- apt-listchanges (apt-show-versions?) for Debian packages
- GNU diff to create reports
- Able to send GPG encrypted mails
- Next version will have vserver support
- Debian package works (nearly) out of the box; it should be in etch and is in backports.org now. It has debconf bugs.
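The status model described above (a change is saved as "unstable", promoted to "testing" after one quiet hour, then to "stable" after three more quiet days) can be sketched as a small state machine. This is only an illustration in Python, not metche's actual shell code; the names `StateTracker`, `record_change` and `tick` are made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

TESTING_DELAY = timedelta(hours=1)  # quiet time before "unstable" becomes "testing"
STABLE_DELAY = timedelta(days=3)    # further quiet time before "testing" becomes "stable"


@dataclass
class StateTracker:
    """Hypothetical model of one saved system state and its level."""
    since: datetime            # time of the last change or of the last promotion
    level: str = "unstable"

    def record_change(self, now: datetime) -> None:
        """A detected modification restarts the cycle at "unstable"."""
        self.since = now
        self.level = "unstable"

    def tick(self, now: datetime):
        """Periodic check (e.g. from cron). Returns the new level on promotion, else None."""
        waited = now - self.since
        if self.level == "unstable" and waited >= TESTING_DELAY:
            self.level, self.since = "testing", now
            return "testing"   # this is where a mail would be sent
        if self.level == "testing" and waited >= STABLE_DELAY:
            self.level, self.since = "stable", now
            return "stable"    # a second mail would be sent here
        return None
```

With a check every 5 minutes, a promotion happens on the first `tick` after the delay has elapsed, and any call to `record_change` in between resets the countdown.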
We got a lot of positive feedback for a piece of software that we wrote in 12 hours and that is not very long. System states are saved, so you can roll back if you fucked something up.
Mails can contain sensitive information (like shadow passwords), so they should be encrypted.
- Example mail received::
    > 2006 07 10 - Lunar
    >
    >   * squirrelmail:
    >     Update MOTD.

    < -rw-r--r-- 1 root root 7740 Jun 30 11:21 etc/squirrelmail/config.php
    > -rw-r--r-- 1 root root 7744 Jul 11 01:50 etc/squirrelmail/config.php
The real diff can be sent with the mail, as a configurable option.
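A report like the file listing in the example mail can be approximated by comparing the listings of two saved tarballs. The sketch below uses Python's `tarfile` and `difflib` as stand-ins for GNU tar and GNU diff, so its output is a unified diff rather than the `<`/`>` format shown above; `listing`, `report` and `make_tar` are hypothetical helper names, and `make_tar` exists only to build sample data.

```python
import difflib
import io
import tarfile


def make_tar(files):
    """Build an in-memory tarball from {path: bytes} (demo helper only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def listing(tar_bytes):
    """Sorted 'size path' lines for every regular file in a tarball."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as tar:
        return sorted(f"{m.size:>8} {m.name}" for m in tar.getmembers() if m.isfile())


def report(old_tar, new_tar):
    """Unified diff of the two listings; an empty string means nothing changed."""
    lines = difflib.unified_diff(
        listing(old_tar), listing(new_tar),
        fromfile="previous", tofile="current", lineterm="",
    )
    return "\n".join(lines)


if __name__ == "__main__":
    old = make_tar({"etc/squirrelmail/config.php": b"x" * 7740})
    new = make_tar({"etc/squirrelmail/config.php": b"x" * 7744})
    print(report(old, new))
```

Comparing listings only shows that a file changed; sending the real content diff is the separate, optional step mentioned above.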
Use cases (from real life ;))
- Jacques has done the security updates, Lunar is notified, and can go back to sleep
- Mistur forgot to document changes in /etc/postgrey/postgrey.conf, Lunar asks for more information on the mailing list
- intrigeri added a new domain to the mail service, and chouchou can learn how he did it
- ricola modified a DNS zone, intrigeri noticed that the serial hasn't been updated
Changelog good practices
- Keep it human-readable (not really human-readable, but at least newbie-system-administrator readable)
- Not too terse, not too verbose
- Intent first, if not obvious
- Explicitly mention modified files:
- Paste relevant changes of configuration files
- But not if the relevant lines can be determined by looking at the file
- Copy non-obvious shell commands (a2ensite, a2enmod, big find)
- Indymedia (sarai, kompost, ...)
- More or less: poivron.org, squat.net, a bunch of laptops :)
- 24 popcon-registered installations (popcon is the Debian popularity contest software: it records package installs, architecture, platform, etc.)
- Still have some (small) bugs
- Changelogs keep history, not document the whole current system:
- Think about newcomers
- Solved by the Wiki at boum.org
- Atomicity issues (design flaw): if several admins modify files at the same time, it creates problems
- Non-incremental backups (feature?): full tarballs don't get broken, so it's more reliable
- Automatically modified files issue (if the system changes itself without human interaction, metche doesn't note the changes)
Q: is the changelog modification enforced or voluntary?
A: it's voluntary, peer pressure forces people to document their stuff.
Comment: another issue: within the boum collective, in the context of mixing geeks and non-geeks, we have a broader project of collaborative administration. We are trying to split functions into smaller pieces. You don't have to manage a whole server when you join boum.org; you can, for example, manage just the mailing lists. You can look at the mailing list administration and read only the mailing list Changelog, so you are not overwhelmed by the number of different tasks. Once you have learned how to administer mailing lists, you can move on to a different area. The idea is to have multiple changelogs for different parts of sysadmin, to make it more accessible to people.
Q: future development/features?
A: Ideal setup: a special filesystem for /etc which would allow you to make a direct link between a file modification and a changelog entry, and links between files and their property/function... Move closer to the kernel and be able to export the list of changes when you close the command line.
It would be nice to integrate it with snoopy, so that every command is recorded when you start a session. Changelog file formats can be confusing. It would be nice to have a way to say "the session is over" and have it pop up a changelog editor.
Q: is there remote SMTP support or does it need a local sendmail thingy?
A: mutt is used so you can use remote SMTP
Q: Debian only?
A: We are Debian people. You can patch it.
Q: thought about using source code revision control systems?
A: In the first revision we tried that, but later we dropped it completely, because the design now relies on simple UNIX tools. Tarballs are easy to recover and manage. With any kind of RCS, it's harder to recover the state of the system. It can still be done, but it's simpler right now with GNU tar.
Q: can we use it to administer multiple servers?
A: that was one of the crazy ideas at the beginning.
Q: how many emails do you get per day? :)
A: It depends on the activity. metche waits for 1 hour of "non-modification" before sending an email, so you get one mail per work session rather than one per change.
Q: how many people working on that server (boum.org?)?
A: 6-8 people? there's not much work on it. Small server.
Comment: something about vserver support. You can have the Changelog on the vserver but they cannot remove it?
Q: Is it possible to join the developer team? And is it possible to port it to other Linux distributions? ...
A: We have a subversion server on poivron.org, dependency on Debian tools is optional, so it is possible to port it.
Q: Can FAMd (the File Alteration Monitor daemon) be used?
A: We could combine metche and FAMd to get rid of crontab.
Comment: on Autistici we have a similar thing and we do it in a different way. We'll present our tool tomorrow that manages multiple servers with a single set of users.
Q: There have been bad reports. Are you planning a next release?
A: Micah was our debian sponsor, and he's been busy.
A: If the PGA conference were not happening right now, it would already have been released.
Comment: feature suggestion: if someone logs out without documenting his changes, it can create a problem. You could add a list of last logins to the automatic notification mails.
Note: the debconf integration is broken: the configuration file is not updated after debconf runs. Just edit the metche configuration file by hand.
Note2: disable full diffs, because they lead to passwords being sent through email, which always makes a lot of noise and is not secure.
Comment: there is a program that filters out the passwords from certain files. However, many programs store their passwords in different files and in different notations.
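To illustrate the filtering idea from the comment above, here is a minimal sketch that redacts the password field in a diff of a shadow-style file (colon-separated `login:hash:...` lines). It is not the program mentioned in the comment; the function name `redact_shadow_diff` and the `[REDACTED]` marker are assumptions for the example, and it should only be applied to diffs of files known to use that format.

```python
import re

# A diff line for a shadow-style entry: diff marker, login, ':', hash, ':', rest.
SHADOW_LINE = re.compile(
    r"^(?P<mark>[<>+\-]\s?)(?P<user>[^:\s]+):(?P<hash>[^:]*):(?P<rest>.*)$"
)


def redact_shadow_diff(diff_text):
    """Replace the second field of every added/removed entry with a marker."""
    out = []
    for line in diff_text.splitlines():
        m = SHADOW_LINE.match(line)
        if m:
            line = f"{m.group('mark')}{m.group('user')}:[REDACTED]:{m.group('rest')}"
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    sample = (
        "3c3\n"
        "< lunar:$1$old$hash:13340:0:99999:7:::\n"
        "---\n"
        "> lunar:$1$new$hash:13341:0:99999:7:::"
    )
    print(redact_shadow_diff(sample))
```

As the comment points out, this only covers one file format; every other program that stores passwords would need its own rule, which is why disabling full diffs or encrypting the mails remains the safer default.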
The current program works for the collective, so we will not change it much in the future unless new problems come up. But we would like to encourage other collectives to try the software and make their own changes.
Thanks for coming!
- apt-get install metche :)
DigitalStruggles - PGA Conference 2006