Distributed Software Development in Freenet

1. What?
2. How?
3. Why?
4. Improvements & Suggestions
5. Good Practice

   

1. What?

Distributed Software Development is a process in which many people, possibly in different geographical locations, work on a common software project. "Free Software" and "Open Source" projects in particular are often developed this way. Sites like sourceforge.net provide a convenient platform for such development; the most important features of such a platform are

  • communications between developers, and between developers and users
  • a version-controlled software repository
  • pseudonymous authentication

All of these can also be provided by Freenet. For communications there are various well-established tools such as NIM, frost and fmb. Pseudonymous authentication works via SSKs. The tricky part that remains is the version-controlled software repository.

Probably the most widely used repository software is "CVS". It has almost all of the important features one needs for developing software, and clients exist for most operating systems. However, it does not lend itself to being ported to Freenet, because it relies on a centralized repository. The same is true of some of CVS's would-be successors, like "subversion".

A rather new piece of repository software is Tom Lord's "Arch". The good thing about Arch is that it is distributed by design, i. e. it is relatively easy to use together with Freenet.

2. How?

Basically, I have written a patch for Arch that enables it to access archives located at URLs of the form freenet:..., and to mirror an existing (local) archive into Freenet. I also suggest a naming convention for in-Freenet archives, to be used in place of email addresses.

2.1 Namespaces

The important part of defining a namespace for Freenet-based Arch-controlled projects is that everyone must be able to create a globally unique ID for themselves and their projects while making it reasonably difficult for anyone to impersonate them.

In the non-anonymous internet, email addresses provide a namespace that fulfills these requirements: to generate a valid email address you must have control over the mailserver for the domain. The domain name system guarantees uniqueness of the domain part, the mailserver operator guarantees uniqueness of the local part. Of course anyone can claim to be "linus@transmeta.com" or "rms@fsf.org", but they will quickly be identified as impersonators when someone tries (and fails) to reach them at these addresses.

In Freenet, the only means to satisfy the requirements is probably the SSK namespace. Global uniqueness isn't actually guaranteed, but due to the sheer size of the SSK namespace it is extremely unlikely that two people by chance generate the same SSK for themselves. Impersonation is also extremely difficult because SSKs are cryptographically secured (remember, "SSK" means "Secure Subspace Key").

2.1.1 User IDs

In order to accommodate SSKs as part of the unique user ID, a small extension could be made to the original specification of the unique ID syntax in Arch, allowing the tilde character in the part after the "@":

<[-.[:alnum:]]+@[-.[:alnum:]]+\.[-.[:alnum:]]+> (original)

<[-.[:alnum:]]+@[-.~[:alnum:]]+\.[-.[:alnum:]]+> (new)

An alternative is to generate SSKs over and over again, until the public part doesn't contain any tilde character. This is (for now) the preferred and implemented method.

The suggested pattern to form a unique ID for Freenet-based Arch projects is, then:

<nickname@SSK-public.freenet>

Rationale: there is no top-level domain "freenet", and hopefully there will never be one. Therefore,

  1. there can't be any clashes with email-addresses, which guarantees global uniqueness,
  2. In-Freenet-IDs can be distinguished from other unique IDs by matching the end of the ID with the string ".freenet>", which may be useful.
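
As a rough illustration, the extended pattern and the ".freenet>" suffix test could be checked with a few lines of shell. This is a minimal sketch, not part of Arch or of the patch: the helper names is_valid_id and is_freenet_id are made up for this example, and the pattern is anchored at both ends so that the whole string has to match.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical and not part of Arch.

# Extended user ID pattern from the text (tilde allowed after the "@"),
# anchored here so that the complete string must match.
ID_PATTERN='^<[-.[:alnum:]]+@[-.~[:alnum:]]+\.[-.[:alnum:]]+>$'

# Succeeds if $1 is a syntactically valid unique ID.
is_valid_id() {
    printf '%s\n' "$1" | grep -Eq "$ID_PATTERN"
}

# Succeeds if $1 is a valid unique ID that ends in ".freenet>".
is_freenet_id() {
    is_valid_id "$1" || return 1
    case $1 in
        *'.freenet>') return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, is_freenet_id '<nickname@LCgWj0qabAqxUNKbfI93PCTH9RU.freenet>' succeeds, while is_freenet_id '<rms@fsf.org>' fails even though the latter is a valid unique ID.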

As a general rule, the owner of such a user ID should maintain a freesite at "freenet:SSK@SSK/nickname//".

2.1.2 Archive Names

Quoting from the (old) Arch documentation, section 9.2:

An archive name consists of an email address [...], followed by "--", followed by an additional string of numbers, letters and dashes [that must not contain two consecutive dashes].

(The stuff in brackets was inserted by me, and has been suggested to the Arch developers because it fixes an error in the current specification.)

By substituting the "unique ID" defined in the previous section for "email address", we can easily extend the existing namespace for archives to accommodate in-Freenet-archives. The nice thing about that definition is that the "unique ID" in the archive name can automatically be mapped to a freenet key (as described above) containing the actual archive. (A freesite that's published under the same key conveniently does not interfere with the archive at the same location. When using fcpputsite, both must be published simultaneously, though.)
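
The mapping from an archive name to a freenet key can be sketched as follows. This is an illustration under assumptions, not the code used by the patch: archive_to_url is a hypothetical helper, and the "PAgM" key suffix is simply copied from the example URLs in this document and treated as fixed.

```shell
#!/bin/sh
# Sketch only: archive_to_url is a hypothetical helper, not an Arch command.
# Assumption: the key suffix "PAgM" is copied from the example URLs in this
# document and is treated as fixed here.

# archive_to_url ARCHIVE-NAME
# Prints the freenet: URL for an archive named nickname@pubkey.freenet--name,
# or fails if the name does not have that form.
archive_to_url() {
    name=$1
    case $name in
        *@*.freenet--*) ;;
        *) return 1 ;;
    esac
    nickname=${name%%@*}           # text before the "@"
    rest=${name#*@}                # text after the "@"
    pubkey=${rest%%.freenet--*}    # public SSK part
    printf 'freenet:SSK@%sPAgM/%s\n' "$pubkey" "$nickname"
}
```

Called with DSDiF@LCgWj0qabAqxUNKbfI93PCTH9RU.freenet--DSDiF, the function prints freenet:SSK@LCgWj0qabAqxUNKbfI93PCTH9RUPAgM/DSDiF.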

2.2 Preserving Anonymity

This is one of the tricky parts. I have already talked about namespaces and how to choose an arch user ID and archive name that won't give away your email address. It is your job to follow these suggestions!

Another issue concerning anonymity is file paths. Section 5.1 of this document suggests a setup that's as anonymous as possible. Again, it is your job to follow these suggestions!

Both of the above points can be followed relatively easily by invoking the command "larch new-freenet-project" in an appropriate directory location. See the help message of that command for more information.

Finally, there is the problem of user and group names in tar archives. Arch stores patch sets as a simple tar archive containing plain diffs and some meta information. Tar files created by you contain your user and group names and IDs! Therefore, publish-freenet-mirror will use the included "anon-tar" command to anonymize the tar files before publishing them.
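
To illustrate the idea only: with GNU tar, ownership information can be overridden when an archive is created. The sketch below is not the anon-tar command shipped with the patch, whose implementation is not shown here; make_anonymous_tar is a hypothetical helper, and it assumes GNU tar with its --owner, --group and --numeric-owner options.

```shell
#!/bin/sh
# Sketch only: this is NOT the anon-tar command shipped with the patch.
# Assumption: GNU tar, whose --owner, --group and --numeric-owner options
# override the ownership recorded in the archive.

# make_anonymous_tar ARCHIVE DIRECTORY
# Packs the contents of DIRECTORY into ARCHIVE with uid/gid 0 and no names.
make_anonymous_tar() {
    tar --create --file "$1" \
        --owner=0 --group=0 --numeric-owner \
        --directory "$2" .
}
```

Listing the result with "tar --numeric-owner -tvf ARCHIVE" should then show 0/0 as owner and group for every entry. File modification times are left untouched by this sketch.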

2.3 The Arch Patch

The Arch Patch implementing the described changes is available at

freenet:SSK@LCgWj0qabAqxUNKbfI93PCTH9RUPAgM/DSDiF

You can either download the source tarball, or access the arch archive at the same URL using arch itself by entering

larch register-archive DSDiF@LCgWj0qabAqxUNKbfI93PCTH9RU.freenet--DSDiF freenet:SSK@LCgWj0qabAqxUNKbfI93PCTH9RUPAgM/DSDiF

For bootstrapping (and for simplicity :-/) the archive site is mirrored at

http://www.unix-ag.uni-kl.de/~conrad/Archives/DSDiF/

and

http://217.160.110.211/Archives/DSDiF/ (slow!)

The patch requires fcptools to be installed.

The current version is pretty alpha, use carefully! In particular, I recommend backing up your local archive before trying publish-freenet-mirror.

2.4 Known Problems

  • It seems that publish-freenet-mirror has managed to destroy two patch files from my archive in one case. As I said before, you should make a backup of your archive before publishing it into freenet!
  • Splitfiles are not supported at this time, which means that your archive must not contain files larger than (almost) 4 megabytes.
  • It's sloooooow...
  • publish-freenet-mirror ignores the FCP_* environment variables

2.5 Getting Started

First, you should get familiar with arch, e. g. by reading this tutorial.

Once you understand how to operate arch, understanding how to operate arch over freenet becomes very simple:

  • you can access (read-only) in-freenet archives by simply registering the archive name with a freenet: URL with a command like
    larch register-archive project@key.freenet--project freenet:SSK@...
    Once the archive is registered you can access it with the normal arch commands.
    Optionally, you can use the environment variables FCP_HOST, FCP_PORT, FCP_HTL, FCP_HTL_INCR and FCP_RETRIES to configure access to your freenet node. FCP_HOST and FCP_PORT specify the fcp client port of your node, FCP_HTL the initial hops-to-live value to use, FCP_HTL_INCR the increment for the hops-to-live value on each retry, and finally FCP_RETRIES the number of retries that should be made on each file.
  • If you want to publish your work in freenet you must first create an SSK key pair, e. g. with fcptools:
    fcpputsite -g
    Create a file containing nothing but the secret key on one line. And don't forget the public key, either. :-)
    Then, set up a local archive, into which you commit your work, e. g.
    larch make-archive project@pubkey.freenet--project /some/local/path
    Then, register your freesite as a mirror archive, e. g.
    larch register-archive project@pubkey.freenet--mirror freenet:SSK@pubkeyPAgM/project
    Now you can mirror your local archive into freenet using the command
    larch publish-freenet-mirror --ssk-file filename project@pubkey.freenet--project project@pubkey.freenet--mirror
    publish-freenet-mirror will automatically insert the files from local-archive into a new edition at e. g. freenet:SSK@.../project/edition-number for a new edition-number. For convenience, you should maintain a DBR link that's always pointing to the current edition.
  • The command larch new-freenet-project will perform most of the steps listed above.
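
The interplay of FCP_HTL, FCP_HTL_INCR and FCP_RETRIES described above can be made concrete with a small sketch. It only prints the hops-to-live value that would be used on each attempt. htl_schedule is a hypothetical helper, the fallback values are placeholders rather than the real defaults, and it assumes one initial attempt followed by FCP_RETRIES retries.

```shell
#!/bin/sh
# Sketch only: htl_schedule is a hypothetical helper.
# Assumptions: one initial attempt is followed by FCP_RETRIES retries, and
# the fallback values below are placeholders, not the defaults of fcptools
# or of the patch.

htl_schedule() {
    htl=${FCP_HTL:-10}
    incr=${FCP_HTL_INCR:-5}
    retries=${FCP_RETRIES:-3}
    attempt=0
    while [ "$attempt" -le "$retries" ]; do
        # Each retry raises the hops-to-live value by the increment.
        echo "attempt $attempt: HTL $((htl + attempt * incr))"
        attempt=$((attempt + 1))
    done
}
```

With FCP_HTL=10, FCP_HTL_INCR=5 and FCP_RETRIES=2 this lists the values 10, 15 and 20 for attempts 0, 1 and 2.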

3. Why?

So why does anyone need to develop software in freenet when there are already nice and convenient platforms like sourceforge available? Several reasons come to mind:

  • Just like free speech is an endangered essential right in many (if not all) parts of the world these days, so is free software. E. g. the distribution and development of PGP was hampered by the ITAR (despite the fact that it was widely available outside the USA). The DMCA has already proven a major nuisance e. g. in the case of Dmitry Sklyarov, a Russian programmer. With software patents looming over Europe it's easy to guess what the next big blow to free software will be.
    Even if accusations based on such laws do not hold up in court, the mere possibility can and will keep people from developing free software. I know what I'm talking about: in 1995, I published a small ZIP password cracking tool that today would probably be seen as a violation of the DMCA. Since I have no intention of becoming another Sklyarov I will not visit the US in the foreseeable future. So much for freedom.
  • As has been said before, freenet needs (legal) content. I've recently read an interesting article that comes to the conclusion that it might be difficult to ban freenet (at least in the USA) if it has "substantial non-infringing use". I think a few serious software projects could certainly be called "substantial use", probably much more so than certain porn sites... :-)
  • Ultimately, the freenet software itself could be developed within freenet. This is an important point: the RIAA and others usually attack P2P platforms by targeting the developers. By moving the development of the freenet software itself into freenet this approach would become virtually impossible.

4. Improvements & Suggestions

  • One of the big problems of freenet is that it is not very reliable as a storage medium. This is by design: unpopular content is deliberately dropped in order to replicate, and thereby improve the availability of, popular content. While this is OK for freesites (freesites are, like websites, dynamic and therefore need to be re-inserted anyway) and discussion groups (unless you need an extensive history), it is bad for software development, where a version history is of some importance.
    Therefore I propose a new type of datastore, to be used together with an instance of the current datastore implementation, that
    • is read-only for requests received via FNP,
    • read-write for requests received via FCP, and
    • does not discard any content (at the cost of possibly growing very large).
    In this way one could, by running a permanent node with the new datastore type, ensure that at least one copy of a given file is available in freenet at all times.
    One should be aware that the contents of that datastore could be used against you, so it should never be used for content that's clearly illegal.
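
    The proposed access rules can be written down as a small decision function. This is only a sketch of the policy, not a datastore implementation: store_allows is a hypothetical helper, and the operation names "read", "write" and "discard" are chosen for this example.

```shell
#!/bin/sh
# Sketch only: store_allows is a hypothetical helper illustrating the
# proposed access rules; it is not part of any Freenet node.

# store_allows PROTOCOL OPERATION
# PROTOCOL is "FNP" or "FCP"; OPERATION is "read", "write" or "discard".
store_allows() {
    case "$1:$2" in
        FNP:read)  return 0 ;;   # requests from the network may read
        FCP:read)  return 0 ;;   # local clients may read
        FCP:write) return 0 ;;   # only local clients may insert
        *)         return 1 ;;   # no remote writes, nothing is discarded
    esac
}
```

    For example, store_allows FCP write succeeds, while store_allows FNP write and store_allows FCP discard both fail.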

5. Good Practice

This section contains a few hints on how to set up a project for development via freenet.

5.1 Anonymity

If you choose to develop a software project via Freenet, it is not unlikely that you do so because Freenet offers anonymity. Anonymous development is quite uncommon in Open Source and Free Software projects, though, and revision control software therefore usually tries pretty hard to make sure you receive credit for your contributions. So you should set up a development environment that makes sure Arch does not include any hint of your identity in the project repository.

First, you should set up an encrypted filesystem, e. g. using loopback encryption on a Linux machine. Make sure it is mounted under some inconspicuous name like "/encrypted". (Using e. g. "/home/torvalds/cryptofs" would probably be a bad idea if your name was Linus Torvalds.)

Second, create an inconspicuous "development home" directory on the encrypted filesystem, e. g. "/encrypted/dev/CoolProject".

Third, change your $HOME environment variable to point to that directory before setting up your arch environment and the rest of the project. This is a little tricky, because if you do this manually in your shell the commands may end up in your shell's history file! Therefore, create a small helper shell script (with an inconspicuous name, of course, like "/encrypted/script1") that contains something like this:

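#!/bin/sh

# Point HOME at the project directory on the encrypted filesystem.
HOME=/encrypted/dev/CoolProject
export HOME
# Stop if the encrypted filesystem is not mounted.
cd "$HOME" || exit 1
# Start an interactive shell that inherits the new HOME.
"$SHELL" -i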

Only after executing that script should you initialize your arch environment and work on the project. Arch saves configuration information in "$HOME/.arch-params", which will automatically end up in the encrypted filesystem. Of course you should create your {archive} and {repository} on the encrypted volume, too. It might be clever to use a different {archive} for each project, so you can publish them independently. And don't use your email address as your Arch user ID! :-)

5.2 Project Management

For rapid publication of new patches you should publish your archive as an edition-based freesite. Immediately after creating a patch you should insert a new edition of the archive freesite. Using the naming convention described above (see section 2.1.2) you can publish a link to the new version immediately after uploading, e. g. via frost. At the same time you should maintain a DBR site that is regularly updated to point at the latest edition. This way, your co-developers can quickly get the latest version of your archive, while others can use the (never changing) DBR link to retrieve a current, but not necessarily the very latest, version.

The default page of your freesite should contain:

  • information about the project itself
  • information about different versions and latest changes
  • the Arch archive name of your project archive
  • links to your co-developers' freesites