Chapter 4: Advanced Server Configuration

4.1 Vocabulary

This chapter introduces some terminology which is needed to understand the functionality of apt-cacher-ng; it's recommended to understand it before continuing with the advanced configuration.

4.2 Configuration file types

By default, the /etc/apt-cacher-ng directory (or the one specified with program options) contains all config files, HTML page templates, the stylesheet and other text-based support files used by apt-cacher-ng. The contents may vary depending on the installation of apt-cacher-ng; for Linux distribution packages, refer to the package documentation.

apt-cacher-ng distinguishes a few file types:

  1. Main configuration files:

    *.conf files are assumed to contain configuration directives in the form of "key: value" pairs. The package comes with a commented example configuration file. apt-cacher-ng reads all files matching *.conf in alphabetical order and merges the contents. For documentation of the options, see the commented example file shipped with apt-cacher-ng (conf/ directory in the original source).

    For security reasons, the files can be made readable only to the daemon and administrator accounts, e.g. when they contain passwords or other sensitive data.

  2. URL lists and remote repository list files. The file names are arbitrary, no special suffix is required. They are read and included during processing of configuration files and can contain data in one of the following formats:
  3. Various support files used for the configuration web interface, named like *.css and *.html.
  4. *.default files are used in some rare cases as replacement for list files having the same name without .default suffix.
  5. *.hooks files specify custom actions which can be executed upon connection/disconnection (see section 4.3.2 for details).

Except for .conf files, most files listed above can be moved to another "support" directory, and the daemon will look for them there if they are not present in the primary configuration directory. This feature is intended to help keep pure configuration data and mostly static data in different locations. The directory path is specified at build time and can be overridden with the SupportDir directive (if used, it should be set as early as possible).
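
As an illustration, a small additional *.conf file could set the support directory. This is only a sketch: the file name 00_local.conf and the path /usr/local/share/acng-support are hypothetical, chosen so that the file sorts early in alphabetical order and SupportDir is therefore read before most other directives.

# hypothetical file: /etc/apt-cacher-ng/00_local.conf
# directives use the "key: value" form; the path below is a placeholder
SupportDir: /usr/local/share/acng-support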

4.3 Repositories and URL mapping

With the simplest configuration, apt-cacher-ng will act almost like an ordinary HTTP proxy with improved caching behaviour. When files are requested, they are downloaded from the remote location specified in the client's request and are stored in a unique way.

However, for some use cases it can be beneficial to specify additional rules to achieve further improvements, e.g. in order to detect and prevent avoidable downloads, to reduce space requirements for the cache directory or simply hide real download locations from the APT clients.

These modifications are generally achieved by two strategies, Merging and Redirection, which are configured in the context of a specific cache Repository. The configuration for them is created using one or multiple Remap-... configuration directives (see below).

Merging:

"Merging" of incoming requests can be done if certain subdirectories of different remote servers are considered equal, i.e. the same trailing part of the remote file path leads to the same file content on each of them. When specified, both the internal cache content and the live download stream are shared. The configuration work consists of setting an "equality list" containing a set of URLs representing the base directories (like http://ftp.debian.org/debian and http://ftp.uni-kl.de/pub/linux/debian).

Redirection:

With redirection, client requests cause a download from a remote location that differs from the one the clients requested and believe they are receiving data from. Redirection is an optional feature; if used, it is configured by one or multiple URL(s) pointing to target servers. The URL(s) must include a directory part which matches the directory level of the Merging URL(s), for example all ending with /ubuntu/ for the usual Ubuntu mirror URLs. If redirection is not used (i.e. the target URL list is empty), the original URL from the client's request is used to get the data.

Repository:

A (cache) repository is the internal identifier which declares the scope in which Merging/Redirection specs are applied. It also represents the name of an internal cache subdirectory.

4.3.1 Writing Remap-... configuration

When use cases for merging/redirection are identified and a repository name is chosen, these components are written into configuration directives starting with Remap- which follow the simple syntax:

Remap-RepositoryName: MergingURLs ; TargetURLs ; OptionalFlags

The repository name is a symbolic name which should be chosen carefully and should not be changed afterwards; otherwise the data might become inaccessible for clients until the files are extracted and reimported semi-manually. Internally, this string shares the namespace with host names and/or top directory names of other URLs. Name collisions can cause nasty side effects and should be avoided. Recommended names are made up of alphanumeric or URL-friendly characters. Also, a repository name should not be associated with a real hostname. Examples of good names: archlinux-repo, debianlocal. Examples of bad names: fedora.example.com, _very&weird.

The TargetURLs part is optional (see Redirection description above). If multiple targets are specified, the order of servers here defines their order of preference (see also the NetworkTimeout option and additional notes below).

Both URL lists simply contain URLs separated by spaces. The strings must be properly URL-encoded. Since all URLs are assumed to use the http:// protocol and to point to a remote directory, the http:// protocol prefix and trailing slashes are optional. There is no hard limit to the number of URLs. However, for readability reasons it's recommended to put them into separate list files (see section 4.2) and specify the particular list files with tags like file:urlsDebian.list instead of writing them into a single line. Raw URLs and file:... lists can be mixed.
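
As a sketch of the mixed form, the following line combines a list file with a raw URL in the Merging part and uses a list file for the targets. It assumes that list files named urlsDebian.list and backends_debian exist in the configuration directory; the repository name debrep is only an example.

# Merging URLs: one list file plus one raw URL (http:// prefix omitted)
# Target URLs: taken from a second list file
Remap-debrep: file:urlsDebian.list ftp.de.debian.org/debian ; file:backends_debian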

Fully configured Remap lines can look like:

Example I:

Remap-debrep: ftp.de.debian.org/debian http://ftp.at.debian.org/debian

for the use case: small home network, clients have de... or at... servers in their sources.list files and use acng as HTTP proxy. The files are still downloaded from the at... or de... mirrors depending on the user request, but already cached data is served to both at... and de... users.

Example II:

Remap-ubuntu: file:ubumir.lst ; 192.168.17.23/pu ca.archive.ubuntu.com/ubuntu

for the use case: small home network, clients have various Ubuntu mirrors (which are listed in ubumir.lst) in their sources.list files and use acng as HTTP proxy. All requests are redirected to a mirror in the /pu directory of some local machine. When that machine is down, the Canadian public server is used instead.

4.3.2 Special tricks and additional notes

There are some implementation details (partially explained above) and some configuration options related to repository settings which should be mentioned explicitly.

The internal cache directory tree follows the URL requests from the clients unless modified by Remapping rules. For proxy-style configuration on the user side, the top directory is always the hostname of the requested URL. But if clients access the apt-cacher-ng server like a regular mirror (not using APT's proxy config), the first path component is just passed on as a regular directory name. At this point, it is possible to use Remapping constructs to access arbitrary remote locations while the client assumes it is downloading from a subdirectory of apt-cacher-ng (acting as an HTTP server). This is configured by simply using /some/directory/string/ instead of URLs in the Merging list to let your clients download from http://acngserver/some/directory/string/... paths.
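
A minimal sketch of such a mirror-style setup could look as follows. The repository name localrepo and the target mirror.example.org/pub/repo are placeholders, not names taken from a real installation.

# the Merging part is a local path instead of a URL,
# the Target part points to the (hypothetical) real mirror
Remap-localrepo: /some/directory/string ; http://mirror.example.org/pub/repo

Clients would then use http://acngserver/some/directory/string/ as the base URL in their sources.list files (adding the port the daemon listens on where needed), and the requests would be fetched from the target mirror.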

If multiple Remap- lines for the same Repository are specified, the contents of both URL lists are merged.
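
For example, assuming the merge rule described above, the following two lines should have the same effect as a single Remap-debrep line listing both mirrors in the Merging part. The list file name backends_debian is only an example.

Remap-debrep: ftp.de.debian.org/debian ; file:backends_debian
Remap-debrep: http://ftp.at.debian.org/debian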

On some restricted networks, it may be necessary to enforce the use of predefined mirrors. If the ForceManaged option is set, only requests to URLs matched by some Remap-... configuration are allowed.
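
A sketch of such a restricted setup is shown below. It assumes that ForceManaged is a boolean option enabled with the value 1; the commented example configuration file shipped with the package is the authoritative source for the accepted values.

# only the mirrors covered by a Remap- line remain reachable
Remap-debrep: ftp.de.debian.org/debian http://ftp.at.debian.org/debian
ForceManaged: 1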

Sometimes, it may be necessary to execute a system command before a connection to certain machines is established. This is possible by associating commands with a repository declaration, i.e. by storing a file named like repositoryname.hooks in the main configuration directory. It can contain PreUp, Down and DownTimeout settings. PreUp/Down are executed by the system shell, and it is up to the administrator to make sure that no malicious code is contained there and that the execution of these commands does not cause significant delays for other apt-cacher-ng users. See the package documentation for an example hooks file.
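
The following is only a rough sketch of what a hooks file for a repository named debrep might contain. It assumes that hooks files use the same "key: value" syntax as *.conf files and that DownTimeout is a delay in seconds before the Down command runs; both assumptions should be checked against the example hooks file in the package. The script paths are hypothetical.

# hypothetical file: /etc/apt-cacher-ng/debrep.hooks
# command run before connecting to the servers of this repository
PreUp: /usr/local/bin/link-up
# command run after the connection is no longer needed
Down: /usr/local/bin/link-down
# assumed: delay in seconds before Down is executed
DownTimeout: 30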

If the Redirection part contains multiple URLs, the server prefers to use them in the order of appearance. On success, the first target is used all the time, so this should be the preferred mirror. Note that "success" in this context means either a started download or a non-critical failure. A "404 File not found" status is not considered critical, since the client's apt can expect it and use it to check the existence of remote files and then change its own behaviour accordingly.

And finally, there is an optional third field in the Remap directives which can contain extra flags to modify downloading behavior in the scope of that particular cache repository.

Config example:

Remap-debrep: file:deb_mirror*.gz ; file:backends_debian ;
   keyfile=Release keyfile=.deb

If the first mirror from backends_debian goes wild and returns 404 responses for everything, then the next candidate will be used. However, while this feature can improve redundancy for certain installations, it needs to be used with care! Some file types are allowed to be missing, and apt interprets their absence to change its behavior as needed. keyfile= should only match files which have an essential role and whose disappearance is a clear indication of a broken server.


Comments to blade@debian.org
[Eduard Bloch, Sun, 19 Apr 2015 10:25:49 +0200]