Creating a mirror

From ParabolaWiki
Note: See Mirroring on demand for an alternative in which packages are fetched only when clients of the mirror request them (essentially a cache). This is useful for serving multiple clients on an internal network.


1 Requirements

The repos currently take about 75GB per architecture, and there are currently 3 architectures, so plan for roughly 225GB. Treat this as the bare minimum: the mirror also carries material other than packages, such as LiveISOs and experimental snapshots, whose size can change at any time.
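
The sizing above can be checked before the first sync. The following is only a minimal sketch: the function names are made up for illustration, the figures are passed in as arguments because they change over time, and df reports GiB rather than GB, so treat the result as approximate.

```shell
#!/bin/bash
# Sketch: estimate the minimum mirror size and compare it with free disk space.
# All function names here are hypothetical helpers, not part of the sync script.

# required_gb PER_ARCH_GB ARCH_COUNT -> prints the minimum mirror size in GB
required_gb() {
        local per_arch_gb=$1 arch_count=$2
        echo $(( per_arch_gb * arch_count ))
}

# free_gb DIR -> prints the free space (whole GiB) on the filesystem holding DIR
free_gb() {
        # -P: POSIX output format, -k: 1024-byte blocks; column 4 is "Available"
        df -Pk "$1" | awk 'NR==2 { print int($4 / 1024 / 1024) }'
}

# has_room DIR PER_ARCH_GB ARCH_COUNT -> succeeds if DIR has enough free space
has_room() {
        [ "$(free_gb "$1")" -ge "$(required_gb "$2" "$3")" ]
}
```

For example, `has_room /srv 75 3 || echo "not enough space for a full mirror"` checks the filesystem that would hold /srv/repo against the current figures. Leave extra headroom for the LiveISOs and snapshots mentioned above.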

All of the software in the Parabola repos is freely distributable, so anyone may keep a personal mirror and offer access to it to others. That use case has no specific bandwidth requirement, so any computer with sufficient disk space will do. However, we expect every official Parabola public mirror to sustain at least one megabit per second of continuous outgoing bandwidth, and we expect its admins to do their best to keep the mirror online and serving downloads at the normal rate at all times.


2 Mirror Sync Script

Creating a mirror can be done with the script given below.

Adjust the target and tmp directories before using the script. To sync from another source, change source_url (currently rsync://repo.parabola.nu:875/repos/) and lastupdate_url.

To synchronise, run the script as a user with write permission on the target and tmp directories. Keep your mirror up to date by running the script regularly from a cron job, e.g. once a day. Please sync no more than once an hour, and start the job on a random minute. When run as a cron job, the script only performs a full sync if the upstream lastupdate file differs from the copy in the target directory.
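
As an illustration of the scheduling advice, the sketch below prints a crontab entry for a chosen minute. The helper names and the script path /usr/local/bin/syncrepo are assumptions made for this example. Save the sync script wherever suits your system.

```shell
#!/bin/bash
# Sketch: build a crontab entry for the sync script.
# cron_line and random_minute are hypothetical helpers for illustration.

# cron_line MINUTE HOUR SCRIPT -> prints a crontab entry
# HOUR may be '*' (every hour) or a fixed hour such as 3 (once a day).
cron_line() {
        local minute=$1 hour=$2 script=$3
        # Reject anything that is not a plain number in 0..59.
        case $minute in
                ''|*[!0-9]*) echo "minute must be a number" >&2; return 1 ;;
        esac
        if [ "$minute" -gt 59 ]; then
                echo "minute must be between 0 and 59" >&2
                return 1
        fi
        printf '%s %s * * * %s\n' "$minute" "$hour" "$script"
}

# random_minute -> prints a pseudo-random minute in 0..59 (uses bash's $RANDOM)
random_minute() {
        echo $(( RANDOM % 60 ))
}
```

Running `cron_line "$(random_minute)" 3 /usr/local/bin/syncrepo` prints a once-a-day entry at a random minute past 03:00, which can then be added with `crontab -e`. Pick the minute once and keep it, rather than re-randomising on every edit.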

Finally, you can set up a web server such as Apache or nginx to serve the files in the target directory to the public. If you have enough resources to run a public mirror, please post your mirror details to the dev mailing list.

#!/bin/bash

# Directory where the repo is stored locally. Example: /srv/repo
target="/srv/repo"

# Directory where files are downloaded to before being moved in place.
# This should be on the same filesystem as $target, but not a subdirectory of $target.
# Example: /srv/tmp
tmp="/srv/tmp"

# Lockfile path
lock="/var/lock/syncrepo.lck"

# If you want to limit the bandwidth used by rsync set this.
# Use 0 to disable the limit.
# The default unit is KiB (see man rsync /--bwlimit for more)
bwlimit=0

# The source URL of the mirror you want to sync from.
source_url='rsync://repo.parabola.nu:875/repos/'

# An HTTP(S) URL pointing to the 'lastupdate' file on your chosen mirror.
lastupdate_url='https://repo.parabola.nu/lastupdate'

#### END CONFIG

[ ! -d "${target}" ] && mkdir -p "${target}"
[ ! -d "${tmp}" ] && mkdir -p "${tmp}"

exec 9>"${lock}"
/usr/bin/flock -n 9 || exit

rsync_cmd() {
        local -a cmd=(/usr/bin/rsync -rtlH --safe-links --delete-after ${VERBOSE} "--timeout=600" "--contimeout=60" -p \
                --delay-updates --no-motd "--temp-dir=${tmp}")

        if /bin/stty &>/dev/null; then
                cmd+=(-h -v --progress)
        else
                cmd+=(--quiet)
        fi

        if ((bwlimit>0)); then
                cmd+=("--bwlimit=$bwlimit")
        fi

        "${cmd[@]}" "$@"
}


# if we are called without a tty (cronjob) only run when there are changes
if ! /usr/bin/tty -s && [[ -f "$target/lastupdate" ]] && /usr/bin/diff -b <(/usr/bin/curl -Ls "$lastupdate_url") "$target/lastupdate" >/dev/null; then
        # keep lastsync file in sync for statistics
        rsync_cmd "${source_url%/}/lastsync" "$target/lastsync"
        exit 0
fi

rsync_cmd \
        --exclude='*.links.tar.gz*' \
        "${source_url}" \
        "${target}"

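Once the cron job is running, it can be useful to confirm that the mirror is not falling behind. The sketch below is not part of the sync script. It ASSUMES that the lastsync file in the target directory contains a single Unix epoch timestamp, which this page does not state, so check the file's contents first. The function names are hypothetical.

```shell
#!/bin/bash
# Sketch: report how old a mirror timestamp file is.
# ASSUMPTION: the file holds one Unix epoch timestamp (seconds since 1970).

# age_seconds FILE [NOW] -> prints the age of FILE's timestamp in seconds
# NOW defaults to the current time; passing it explicitly helps with testing.
age_seconds() {
        local file=$1 now=${2:-$(date +%s)} stamp
        stamp=$(tr -d '[:space:]' < "$file")
        case $stamp in
                ''|*[!0-9]*) echo "not an epoch timestamp: $file" >&2; return 1 ;;
        esac
        echo $(( now - stamp ))
}

# is_stale FILE MAX_AGE [NOW] -> succeeds if the timestamp is older than
# MAX_AGE seconds; returns 2 if the file could not be parsed
is_stale() {
        local age
        age=$(age_seconds "$1" "${3:-$(date +%s)}") || return 2
        [ "$age" -gt "$2" ]
}
```

For example, `is_stale /srv/repo/lastsync 7200 && echo "no fresh sync in the last two hours"` could run from a monitoring job. The 7200 second threshold is arbitrary. Choose one that matches your sync interval.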

3 Join the Official Parabola Mirror Network

If you would like your mirror to become part of the official Parabola public mirror network, send an email to the Parabola development mailing list with the following information:

  • geographical location of the service
  • name of the primary responsible party and email address(es) of server admin(s)
  • URLs to the repository base dir on the server, for each supported protocol

Access via http:, https:, ftp:, and rsync: is supported, over both IPv4 and IPv6. Only IPv4 access via https: is mandatory for official mirrors; all other protocols are encouraged but optional. We expect that you will normally sync exactly once per hour, and we may ask you to start the sync job at a specific minute of each hour, in order to avoid congestion.

If you know the maximum nominal outgoing bandwidth that your server can offer consistently, you may optionally include that information. If your server can reliably provide a relatively large amount of outgoing bandwidth, we may ask you to become a Tier 1 mirror, from which less capable Tier 2 mirrors can sync directly.

Names and email addresses are for the internal use of Parabola developers only, so that they can contact the responsible admins if necessary (such as in case of a long-term failure). They will not be visible to any user. If you do not wish to send names or email addresses to the public mailing list, you can omit them and ask to send that information privately to one of the Parabola developers. Anyone who wishes to share Parabola packages publicly without revealing their identity can do so over the Pacman2Pacman p2p network.

Example:

Geographical Location:  Umeå, Sweden
Responsible Party:      Academic Computer Club, Umeå University
Admin Email:            admin@example.org
Alternate Email:        (optional)
Nominal Outgoing B/W:   (optional)
Sync Hour and Minute:   (optional)
Service URLs:           http://ftp.acc.umu.se/mirror/parabola.nu/
                        https://ftp.acc.umu.se/mirror/parabola.nu/
                        rsync://ftp.acc.umu.se/mirror/parabola.nu/