Pacman2Pacman

From ParabolaWiki

Pacman2Pacman is a plugin for pacman which allows it to download via bittorrent and HTTP mirrors simultaneously and transparently, and to "seed" downloaded packages back to other Parabola users. This reduces load on the mirrors, and makes Parabola more autonomous, by making package distribution less dependent on centralized hosts.

More importantly, the Pacman2Pacman network provides an extra layer of resiliency that is absent from the standard federated mirror network. It is the "plan B" cushion against unexpected outages. It is also much easier to set up and operate than a complete mirror, and it can help even if your computer is not always online, and even if you only share the packages that you have installed anyway.

The standard federated mirror network is quite strong, but it has a certain vulnerability due to its hierarchical nature. As with any federated network, there is an ever-present risk of partial or total blackouts. This p2p (i.e. decentralized) distribution system allows Parabola to distribute software even in some of the worst-case scenarios (e.g. DNS blackout, censorship, or total loss of the Parabola web server infrastructure). However, its health depends on the participation of Parabola users. Adding your disk space and bandwidth to the distributed network, and encouraging other Parabola users to do the same, is actually preferable to growing the federated mirror network.

1 How it works

transmission-daemon runs at all times to seed packages and download them when needed.

A script `pacman2pacman-get' is put in pacman.conf's `XferCommand' setting. This script downloads a .torrent file for that package, modifies it to use your selected mirror as a webseed, and tells transmission-daemon to download the package by bittorrent.

pacman2pacman-get is written in bash and uses transmission-remote from the transmission-cli package to communicate with transmission-daemon. While the file downloads, a progress bar is displayed; the script polls transmission-daemon to find out how much of the file has been downloaded so far.
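The polling loop described above could look roughly like the following. This is a minimal sketch and not the actual pacman2pacman-get code: the function names are made up for illustration, and it assumes that the output of `transmission-remote -t ID -i` contains a line of the form "Percent Done: 45.2%".

```shell
# Hypothetical helpers; not taken from the real pacman2pacman-get source.

# Read `transmission-remote -t ID -i` output on stdin and print the number
# from its "Percent Done: NN%" line (assumed output format).
percent_done() {
    sed -n 's/^ *Percent Done: *\([0-9.]*\)%.*/\1/p' | head -n 1
}

# Draw a fixed-width text bar for an integer percentage (0-100).
render_bar() {
    local pct="$1"
    local width="${2:-20}"
    local filled=$(( pct * width / 100 ))
    local bar=""
    local i=0
    while [ "$i" -lt "$width" ]; do
        if [ "$i" -lt "$filled" ]; then bar="${bar}#"; else bar="${bar}-"; fi
        i=$(( i + 1 ))
    done
    printf '[%s] %3d%%' "$bar" "$pct"
}

# Poll the daemon once a second until the torrent reports 100%.
poll_progress() {
    local id="$1"
    local pct
    while :; do
        pct=$(transmission-remote -t "$id" -i | percent_done)
        pct=${pct%%.*}                 # drop the fractional part
        printf '\r%s' "$(render_bar "${pct:-0}")"
        if [ "${pct:-0}" -ge 100 ]; then break; fi
        sleep 1
    done
    printf '\n'
}
```

With a running daemon, `poll_progress 1` would redraw the bar for torrent 1 until it completes. A missing or non-numeric percentage is treated as 0.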

2 Installing

Make sure you have the pcr repo enabled in pacman.conf, then:

# pacman -S pacman2pacman

Then start transmission (if it is not already running) and enable it at boot. If you are using systemd:

# systemctl start transmission
# systemctl enable transmission

or if you are using OpenRC:

# rc-service transmission-daemon start
# rc-update add transmission-daemon default

Put the following into /etc/pacman.conf, under the [options] section:

XferCommand = /usr/bin/pacman2pacman-get %u %o
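In this setting, pacman replaces %u with the download URL and %o with the local file name it expects the download to be written to. To confirm which XferCommand is actually active (and not just present in a commented-out line), something like the following could be used. The function name is hypothetical; the configuration file path is passed as an argument.

```shell
# Print the first active (uncommented) XferCommand value from a
# pacman.conf-style file. Hypothetical helper for illustration only.
xfer_command() {
    sed -n 's/^[[:space:]]*XferCommand[[:space:]]*=[[:space:]]*//p' "$1" | head -n 1
}
```

For example, `xfer_command /etc/pacman.conf` prints the configured command line, or nothing if no XferCommand is set.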

Once you have installed or updated some packages, you can see the packages being seeded with transmission-remote -l

You can provide feedback by speaking to Xylon on IRC or opening bugs on: https://labs.parabola.nu/projects/pacman2pacman/issues

3 How the pacman mirrorlist is used

The pacman mirrorlist is still used with pacman2pacman, for two things:

  1. The download URL that pacman gives to the pacman2pacman-get script (through XferCommand) is used as a webseed for the torrent.
  2. pacman2pacman needs the .torrent file, which is about 400 bytes, to give to transmission. pacman2pacman-get chooses 3 random mirrors from your mirrorlist and tries to fetch the torrent file from all three simultaneously. The first one to respond is used; if none responds within 6 seconds, the download is done over plain HTTP instead.
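The "three random mirrors, first answer wins, 6 second limit" behaviour can be sketched as below. This is an illustration under assumptions, not the real script: the function names are invented, `shuf` from GNU coreutils and `curl` are assumed to be installed, and how a mirror entry is turned into the URL of a .torrent file is left to the caller because this page does not describe it.

```shell
# Hypothetical helpers; not taken from the real pacman2pacman-get source.

# Print up to N (default 3) randomly chosen Server entries from a mirrorlist.
# Commented-out "#Server = ..." lines are ignored.
pick_mirrors() {
    sed -n 's/^[[:space:]]*Server[[:space:]]*=[[:space:]]*//p' "$1" | shuf -n "${2:-3}"
}

# Fetch all URLs given as arguments in parallel. Print the path of the first
# completed download, or return 1 if none completes within about 6 seconds.
race_fetch() {
    local dir n f url waited
    dir=$(mktemp -d) || return 1
    n=0
    for url in "$@"; do
        n=$(( n + 1 ))
        # Download to N.part and rename on success, so that a file named
        # N.done is always complete.
        ( curl -fsS --max-time 6 -o "$dir/$n.part" "$url" 2>/dev/null \
            && mv "$dir/$n.part" "$dir/$n.done" ) &
    done
    waited=0
    while [ "$waited" -lt 6 ]; do
        for f in "$dir"/*.done; do
            if [ -e "$f" ]; then printf '%s\n' "$f"; return 0; fi
        done
        sleep 1
        waited=$(( waited + 1 ))
    done
    return 1
}
```

A caller would build one torrent URL per picked mirror, pass them all to `race_fetch`, and fall back to a plain HTTP download of the package when it returns non-zero. The losing downloads are simply left to finish or time out in the background.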

4 Use cases

4.1 Local peer discovery

If you activate local peer discovery in transmission, other computers on your local network can participate as peers. This usually gives much higher download speeds, because the local network is typically far faster than the Internet connection.
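In transmission's settings.json, local peer discovery is controlled by the "lpd-enabled" key. A minimal sketch for switching it on is shown below; the function name is hypothetical, GNU sed is assumed for the -i option, and the location of settings.json depends on how transmission is packaged, so it is passed as an argument. Stop the daemon before editing the file, since transmission rewrites its settings on exit.

```shell
# Switch "lpd-enabled" from false to true in a transmission settings.json.
# Hypothetical helper; the file path is supplied by the caller.
enable_lpd() {
    sed -i 's/"lpd-enabled":[[:space:]]*false/"lpd-enabled": true/' "$1"
}
```

After running it on your settings file, start the daemon again for the change to take effect.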

5 Notes

Xylon is running opentracker on taskenizer.crabdance.com to track the torrents. See the stats page: taskenizer.crabdance.com:6969/stats.

fauno has suggested a transmission-daemon settings file with some security options enabled: https://github.com/fauno/duraskel/blob/develop/src/.config/transmission/settings.json

fauno has also provided a list of links for transmission optimization:

  http://www.pps.univ-paris-diderot.fr/~jch/software/bittorrent/tcp-congestion-control.html
  http://falkhusemann.de/blog/2012/07/transmission-utp-and-udp-buffer-optimizations/
  http://blog.lxgr.net/posts/2013/01/28/my-openwrt-setup/
  https://github.com/dtaht/deBloat/
  http://www.bufferbloat.net/projects/bloat/wiki/Wiki

We plan to make a system for VPS owners to seed the most popular packages: PDCS

6 Possible massive performance enhancement

At the moment, pacman downloads all packages sequentially. It would be much faster if we could make it download all packages in parallel.

The plan could be: when pacman asks pacman2pacman-get to download a file, pacman2pacman-get adds the torrent to transmission and returns instantly. When pacman gets to the last download, pacman2pacman-get blocks and shows a progress meter for all files. In order to do this, we would have to modify pacman to give pacman2pacman-get some indication that it has reached the last download.

Implementation idea: patch pacman, modifying pacman_upgrade in src/pacman/upgrade.c so that, before downloading any packages, it calls an optional program specified in the configuration with all URLs as arguments; then write such a script for pacman2pacman that adds all torrents to transmission.
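The script side of this idea might be as small as the following sketch. Nothing like it exists yet: the function name and the TRANSMISSION_REMOTE override are hypothetical, and the step that turns each package URL into a .torrent file is omitted because it is not specified here. The function simply queues every torrent it is given with `transmission-remote -a`.

```shell
# Hypothetical pre-download hook: queue every torrent given as an argument.
# TRANSMISSION_REMOTE can be set to another command (e.g. "echo") for a dry run.
queue_torrents() {
    local t
    for t in "$@"; do
        "${TRANSMISSION_REMOTE:-transmission-remote}" -a "$t" || return 1
    done
}
```

Because each call only adds a torrent and returns, transmission would be free to download all queued packages at once while pacman continues.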