Mirroring on demand

From ParabolaWiki
Note: see Creating a mirror for the alternative approach, in which packages are synchronized periodically. That is the method public mirrors normally use.

A full mirror of the Parabola repos takes 70-100 GB, which may be more than you want to store. Instead, you can make your web server fetch, cache and serve packages as users request them.

The basic procedure would be like this:

  • Keep the repo databases up to date. This means either not caching them at all or caching them for at most a day.
  • Make your web server proxy packages and package signatures (that is, anything but *.db) and cache them, so that the second and later requests for a file are served locally.

Feel free to add instructions for your favourite web servers here.


1 Using Nginx

Being a reverse proxy, Nginx can be configured to proxy the Parabola repos while filling a local cache. Here's an annotated snippet:

# You can redirect queries to any number of mirrors to distribute the load
# as long as they serve the 'repo.parabolagnulinux.org' subdomain
upstream parabola {
    # Parabola Tier 0 repo IP
    server 93.95.226.249;
    # Other mirror IPs!
}

# The mirror
# It serves two subdomains:
# * The actual mirror, and
# * A bogus main repo subdomain so this server can act as another mirror
#   upstream ;)
server {
    listen 80;
    server_name mirror.example.org repo.parabolagnulinux.org;
    
    root /srv/http/mirror.example.org;
    autoindex on;
    
    # Databases are never stored on this server
    # Queries are proxied directly to upstream; 'expires 1h' only lets
    # clients cache the response for up to an hour
    # We trick upstream into serving the main repo subdomain
    location ~ \.db$ {
        expires 1h;
        proxy_pass http://parabola;
        proxy_set_header Host repo.parabolagnulinux.org;
    }
    
    # Get and cache everything else
    # The error_page directive means 'go fetch anything that can't be found here'
    location / {
        expires 7d;
        error_page 403 404 = @get;
    }
    
    # Named location, only reachable through the error_page directive above
    # Pass the query to the parabola upstream
    # We trick upstream into serving the main repo subdomain
    # Store the fetched file under the mirror root, at the requested path
    # Give it 644 permissions
    location @get {
        proxy_pass http://parabola;
        proxy_set_header Host repo.parabolagnulinux.org;
        proxy_store /srv/http/mirror.example.org$request_uri;
        proxy_store_access user:rw group:r all:r;
    }
}
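
The procedure above also allows caching the databases for a short while instead of proxying every query. A minimal sketch of that variant follows, using Nginx's proxy cache. The cache path, the zone name parabola_db and the sizes are assumptions made for illustration and are not part of the original setup. The one-hour validity applies only when upstream sends no caching headers of its own, since those take priority.

```nginx
# In the http context, outside any server block.
# Path, zone name and sizes are illustrative assumptions.
proxy_cache_path /var/cache/nginx/parabola-db keys_zone=parabola_db:1m
                 max_size=200m inactive=1d;

# Inside the mirror's server block, replacing the \.db$ location.
location ~ \.db$ {
    proxy_pass http://parabola;
    proxy_set_header Host repo.parabolagnulinux.org;

    # Keep successful responses in the cache zone for one hour,
    # unless upstream caching headers say otherwise
    proxy_cache parabola_db;
    proxy_cache_valid 200 1h;

    # Let clients cache the database for the same period
    expires 1h;
}
```

Unlike proxy_store, the proxy cache expires its entries by itself, so databases cached this way need no separate cleanup. Keep the validity well under a day so clients do not see stale package lists.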

2 Cleaning up

You can clean up old packages using the db-cleanup utility from our dbscripts. Clone the repository from https://projects.parabolagnulinux.org/dbscripts.git, edit the config file to suit your needs, and run db-cleanup periodically (we do it weekly).