Problems
- Centralization: 5000 anonymous users and hundreds of mailing lists, all on a single server - BAD
- Legal background
  - growing control paranoia, in Italy as elsewhere
  - coming laws that will endanger the possibility of offering anonymous services (mandatory log keeping, etc.)
- Technical problems
  - server reaching its capacity limits
Crackdown
June 2005: crackdown. The authorities came to ECN and told Autistici they had to censor a particular website; it was ordered by a judge.
So Autistici wondered what law was being used. A lawyer researched the issue and found a small reference to evidence obtained by the police. The police were explicit: "we were able to dump the content of the messages thanks to the SSL certificate that we obtained by seizing the server."
It had happened in 2004: the hosting provider allowed the police to copy the entire disks. The machine was off-line overnight, but the provider blamed a power outage and the time needed to get it running again in the morning. At the time Autistici were very suspicious, but they didn't have any evidence that something funny was going on. They sent a newsletter telling people to beware, but as time passed people forgot.
There was a big campaign around the actions of the police and the provider. There was much debate about whom to sue, but in the end they focused on raising awareness and a political campaign.
Build a distributed network
This was a big eye-opener for us. We discovered that we needed a really different approach: a different perspective on the needs of social struggles, one that scales from small communities to the universe. We had already been setting up community servers to serve small groups of people in a decentralised manner, but this was not enough to serve the movement.
As much as social struggle can defend our servers, it is not enough to actually protect them. We need a technical solution as well.
A more resilient, not only more resistant service!
Mass mobilisation did happen to defend Indymedia before, but it was not enough, because it was merely crisis management. We wanted to avoid the crisis!
Then we designed a secret plan: the...
R*Plan
The so-called R* plan is the attempt to provide an answer to the problems just identified:
- private data delocalization (i.e. mailboxes) - for economic reasons: we cannot replicate all data on all servers.
- infrastructure redundancy (partly) - as much as possible, the system still has to work if some servers are down.
- almost linear scalability - the bigger the network gets, the more it can handle.
The objective is to make it so that the communication flow cannot be stopped, unless they arrest everyone who is running the servers.
Network structure
- N functionally equivalent servers - they don't have specific roles
- each hosts a fraction 1/N of the users' private data - data is split across the servers
- servers are geographically distributed - almost all in different countries
- servers are interconnected through a VPN
The "quantum" is a user: all of a user's data is hosted on a single machine. (right?)
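The "one user = one machine" rule can be sketched as a simple deterministic mapping. This is only an illustration under assumptions of mine: the server names are invented, and A/I's real assignment lived in a database rather than being computed by hashing.

```python
import hashlib

SERVERS = ["srv1", "srv2", "srv3", "srv4", "srv5"]  # N = 5, hypothetical names

def home_server(username: str) -> str:
    """Map a user to the single server that holds ALL of their data.

    A stable hash keeps the assignment deterministic, so any node can
    compute where a user's mailbox lives without asking the others.
    """
    digest = hashlib.sha1(username.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(SERVERS)
    return SERVERS[index]

# Every lookup for the same user yields the same node:
assert home_server("alice") == home_server("alice")
```

Each server then ends up with roughly 1/N of the users, and all of a given user's mail, FTP, and web data sit together on that one node.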
VPN
This is the logical view of the "private" network. The various rings correspond to different levels of database access (different trust levels, indeed). The traffic in the VPN is compressed.
Encrypted file systems
- encrypted partitions:
- are mounted on boot with manual passphrase insertion
- contain the private keys of the various services that use asymmetric keys (SSL, cfengine, tinc...)
- implemented using dm-crypt (no performance issues)
- we're evaluating the possibility of encrypting the partitions that contain private user data - relying on them, however, is a problem.
We don't really trust it as a defense mechanism.
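A minimal setup along these lines might look as follows. This is a hypothetical ops sketch (placeholder device and mount point, requires root), not A/I's actual procedure:

```shell
# Create and open an encrypted partition for the services' private keys.
# /dev/sdb2 and /etc/secrets are placeholders. The passphrase is typed
# by hand at boot, so powering the box off locks the keys away.
cryptsetup luksFormat /dev/sdb2
cryptsetup luksOpen /dev/sdb2 secrets    # prompts for the passphrase
mkfs.ext3 /dev/mapper/secrets
mount /dev/mapper/secrets /etc/secrets   # holds SSL, cfengine, tinc keys
```

The point of the manual passphrase is exactly the scenario from 2004: a seized, powered-down machine no longer exposes the SSL private keys.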
User database
- all user-related data (mailboxes, FTP accounts, web sites, etc...) is stored in an LDAP database
- such db is replicated on all servers
- the db operates in single-master mode, however switching the master node is a quick operation (master is only a role)
- administration can be done either through normal LDAP tools (for easy tasks) or with our own custom-written administration software
All the essential user information is in the LDAP database.
LDAP is read a lot and written rarely, and fully replicable between the servers.
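A per-user record in such a database might look like the following LDIF fragment. The attribute names and values here are invented for illustration; they are not A/I's actual schema:

```
dn: uid=alice,ou=People,dc=example,dc=org
objectClass: posixAccount
uid: alice
mail: alice@example.org
homeDirectory: /home/alice
# hypothetical attribute: the single node holding alice's data
host: srv3.example.org
```

Because every server holds a full replica, any node can answer "which server has alice's mailbox?" locally, which is what makes the forwarding schemes below cheap.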
Configuration management
- configuration of all servers is centralized
- the mechanism used (CFengine, http://www.cfengine.org/) allows both file distribution and server-specific customizations
- the main repository is under version control so that changes can be monitored, accounted for, and rolled back
- CFengine is a stable and convergent solution (the LDAP database is similar in design: master and slave roles can be changed easily)
"We found it really hard to screw everything up." "Great tool but the syntax sucks."
Mail

- multiple MX records, with equal priority
- antispam checks are always performed locally
- internal routing uses the VPN interconnection (so you don't have to use TLS, because the VPN is already secured)
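Equal-priority MX records look like this in a zone file (hypothetical host names):

```
; any of the N servers can accept mail for the domain;
; equal preference values spread incoming connections across them
example.org.   IN  MX  10  srv1.example.org.
example.org.   IN  MX  10  srv2.example.org.
example.org.   IN  MX  10  srv3.example.org.
```

A sending MTA picks among equal-preference MX hosts at random, which is what produces the 1/N arrival probability used in the math below.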
POP/IMAP services are deployed using the same scheme:
- mail.autistici.org uses round-robin DNS records
- the connection is then forwarded through the VPN towards its final destination (using perdition, http://www.vergenet.net/linux/perdition/)
- for the moment, IMAP connections from the webmail are only local (you can read email only on the server that holds your mailbox) to reduce VPN traffic
Forwarding scheme (and some math)
- The forwarding scheme used for SMTP and POP/IMAP implies a certain amount of "wasted" bandwidth. This amount can be computed:
- each incoming connection lands on each of the N servers with equal probability 1/N
- the probability that it landed on the wrong server and must be forwarded is (N - 1)/N, which tends to 1 for large N
- every server then receives on average (N - 1)/N connections forwarded from other servers for every incoming external connection
- the total number of connections handled (external plus forwarded) is therefore 1 + (N - 1)/N per external connection, which tends asymptotically to 2 as N grows: each connection is handled at most twice, regardless of network size.
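The derivation above can be checked numerically with a small sketch (nothing A/I-specific, just the arithmetic):

```python
def forwarding_overhead(n: int) -> tuple[float, float]:
    """Return (probability a connection must be forwarded,
    total connections handled per external connection) for n servers."""
    p_forward = (n - 1) / n   # connection landed on the wrong node
    total = 1 + p_forward     # original hop plus at most one forward
    return p_forward, total

for n in (2, 5, 100):
    p, total = forwarding_overhead(n)
    print(f"N={n}: forwarded with p={p:.2f}, {total:.2f} connections handled")
```

For N=2 half the connections are forwarded (1.5x traffic); for N=100 almost all are (1.99x); the overhead never exceeds a factor of 2.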
Web
Some sites (the ones we directly manage) are active on all servers using round-robin DNS and MySQL database replication (with their own domain).
- Problem: the scarcity of interconnection bandwidth makes this approach valid only when database reads vastly outnumber writes.
- Moreover, the general complexity of the application configuration makes it unsuitable for users' websites (no control over the app configuration; the address is autistici.org/username)
- user sites (and their associated MySQL databases) are simply split up between the various servers.
HTTP redirect
- for users' sites, clients are redirected to the server hosting the site
- for distributed applications, clients are locked onto a specific server once a session is established
- the redirect maps (together with the rest of the Apache configuration) are periodically generated from the database.
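Generating such a redirect map could look like the sketch below. The data, URL layout, and map format are my assumptions (an Apache RewriteMap-style "key value" text file); A/I generated the real maps from their user database:

```python
# Hypothetical user -> home-server assignments, as they might be
# read out of the user database.
USER_HOME = {"alice": "srv1.example.org", "bob": "srv3.example.org"}

def build_redirect_map(users: dict[str, str]) -> str:
    """Emit a RewriteMap-style text file: one 'key value' line per user,
    mapping each username to the URL of the server hosting their site."""
    lines = [f"{user} https://{server}/{user}/"
             for user, server in sorted(users.items())]
    return "\n".join(lines) + "\n"

print(build_redirect_map(USER_HOME))
```

Regenerating the map periodically (rather than querying the database per request) keeps the web servers independent of database availability.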
Mailing Lists
- mailing list manager: Mailman (for historical reasons)
- Mailman stores its configuration on the filesystem (in binary format, which is very bad)
- since we don't have a distributed filesystem, each list needs to be active on a single server at a time
- Postfix transport maps direct the mail to the appropriate server
- configurations are copied periodically to every server so that we can rapidly switch servers for failover
Problem: we don't have a distributed filesystem because all solutions we know consume too much bandwidth.
The lists are active on only one server at a time and can be activated elsewhere in case of a server crash or problem. Changes from the mailing lists flow in the opposite direction from the rest of the system: they feed back into the LDAP database. This is a compromise.
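A Postfix transport map of the kind described could look like this (list and host names are hypothetical):

```
# /etc/postfix/transport -- route each list's mail to the one
# server where that list is currently active (over the VPN)
listA@lists.example.org    smtp:[srv2.vpn.example.org]
listB@lists.example.org    smtp:[srv4.vpn.example.org]
```

Failover is then just editing the map to point at another server (which already holds a copy of the list configuration) and re-running postmap.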
Q: how much do you have to patch software to run R*Plan? A: we don't.
Comment: there is apparently a way to patch Mailman to handle this (details unclear).
30 GB of Mailman archives: 3 GB of .mbox, 27 GB of HTML crap.
Anonymity
A/I implements many mechanisms for user identity protection:
- system logs are anonymized directly at the syslog level (IPs and mail addresses, mostly)
- Apache logs are anonymized (except the error.log!)
- SMTP headers of outgoing messages are stripped of the information about the sender's IP address
Every website has an anonymous email address for contact.
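Anonymizing at the syslog level amounts to rewriting each line before it is stored. A minimal sketch follows; the regexes and replacement tokens are illustrative, not A/I's actual filter:

```python
import re

IP_RE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")
MAIL_RE = re.compile(r"\b[\w.+-]+@[\w.-]+\.\w+\b")

def anonymize(line: str) -> str:
    """Strip IPs and mail addresses from a log line before writing it."""
    line = IP_RE.sub("0.0.0.0", line)      # blank out IPv4 addresses
    return MAIL_RE.sub("<address>", line)  # blank out mail addresses

print(anonymize("connect from 192.0.2.17, sender=alice@example.org"))
# -> connect from 0.0.0.0, sender=<address>
```

The key property is that the identifying data never reaches disk at all, so there is nothing for a seizure to recover.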
Q: How many servers? Any problems? A: 5 servers, 5000 mailboxes, 700 sites. We have only had hardware failures - and that is easy to fix if you have the data elsewhere! In the end, when we finished, everybody went on vacation, because it saved a lot of sysadmin time for us.
Q: Do you have all the servers in commercial spaces? A: Yes, there is no option to have servers in squats any more. But this factor is not relevant anymore. There is SeaCCP, XS4ALL, Sweden, the Florence DIY data centre. But the point is that you don't have to worry too much if you lose a server.
Comment: the R*Plan server uses less bandwidth so you can theoretically set it up in a squat.
Comment: all squats could run R*Plan servers built with skip hardware. Eviction is not a problem for the R*Plan, and you still have your server.
Q: It is pretty complex. A: We try to make it simpler and document it thoroughly. We also have to understand it ourselves.
Q: Could you write some non-geek language documentation as well, to explain the why and the how?
Q: I've tried to switch to R*Plan for a year... A: We are happy for everyone because it makes the network stronger.
apt-get install rplan! ... would be an objective when version 1.0 is reached.
The mission of autonomous servers: provide a constant, continuous, unbreakable communication flow, and defend against servers being taken away, surveillance, etc. - for example during the Geneva crackdown. We don't care what the server is called; our mission is more important than our identity!
Comment: We all face the same problem. Different approaches:
- Legal
- Political (French servers?)
- Technical (R*Plan)
- Physical (nadir?)
We should try all 4 at the same time.
If that doesn't work, there's still the war approach.
Nadir: We wanted to have our data separated on different servers. A/I: we could have separated the services as well, but that's even more complex.
We like the idea of having interchangeable nodes that can be spared, so we don't have separate services.
All the members have root access to all boxes. We are around 40 people, and we have to trust everybody very much before they join. Technical and political trust can be put on different levels, and it takes time to build political trust. That means that the R*Plan will grow slowly, because we don't trust everybody.
Q: You are developing some kind of culture of decentralisation of services here. You mentioned Mailman - that you don't want to maintain another version of Mailman. Did you try to find anyone to do that for you? That could build and spread the culture, as well as making R*Plan adoption easier. A: We only had problems with Mailman; the reason is that it is a shitty mailing list manager. Maybe Sympa would be better? There are many tools that support decentralisation, but they have different parameters. A: on the dev front, there are two possible things:
- Thunderbird plugin to use IMAP?
- an rsync for Maildirs that works properly: it could be possible to sync Maildirs very fast.
You could remove a server to work on it and the users wouldn't notice, and when you put it back the data would flow back automatically.
Each message gets delivered twice, so that there are potentially two live mailboxes.
Comment: eventually this leads to/overlaps with the routable VPN anarchist net that is not accessible from the Internet. "If we are not too many, they can route us out of the fucking Internet." On the other hand, I would be happy to force law enforcement to modify the routing tables of the Internet. The best thing would be if they had to shut down some intercontinental cables to stop us.
Q: Are you doing anything special to anonymise which servers a user uses?
A: The idea is to hide it from the casual observer. If you want to defeat traffic analysis, that's a different matter - you can proxy everything if you want. We think that this level of traffic analysis is beyond the powers of the people who got to our last server.
Q: What keeps someone who seizes a machine from breaking into the VPN, since they have access to the key?
A: This is still one of the weak points, because on a running server it's hard to keep an attacker from reading the key. But we can still remove the compromised node and change the keys.
The most probable threat for this model is to have a bribed hacker break into the server.
Even with the R*Plan, we need to fight server seizures, but now at least people are protected from seizure, at least partially. We must work on all fronts at once.