Save Bandwidth by Setting Up a Fedora Mirror

Configuring the Apache server

Enable KeepAlive

Enabling KeepAlive in httpd allows persistent connections. These long-lived HTTP sessions let a client send multiple requests over the same TCP connection, so a separate connection does not have to be set up for each file. This removes some overhead and noticeably reduces latency. By default, Fedora’s Apache httpd package has KeepAlive disabled. Enable it, with a timeout of two seconds. Don’t set the timeout much higher: every idle persistent connection ties up a server process until it times out, which can overload a busy server. Take a look at Figure 1 to see the changes required in the Apache configuration file.

Figure 1: Enabling KeepAlive in Apache
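
In text form, the change described above comes down to a few directives. This is a minimal sketch, assuming the stock /etc/httpd/conf/httpd.conf; MaxKeepAliveRequests is not mentioned in the text and is shown at Apache's default value only for completeness.

```apache
# Reuse one TCP connection for several requests from the same client.
KeepAlive On

# Close an idle persistent connection after two seconds, as suggested above.
KeepAliveTimeout 2

# Assumption: Apache's default cap on requests per connection; tune as needed.
MaxKeepAliveRequests 100
```

If the file already contains these directives, edit the existing lines instead of adding duplicates, since a later occurrence overrides an earlier one.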

Handling of metadata

Metadata are typically defined as ‘data about data’. When you try to install a package or update a system, the first things that get downloaded are the package metadata. These are files with information about the packages, their age and other details. If, for example, a computer or an intermediate proxy has old metadata cached, according to which all the packages are up-to-date, no new updates will be installed on the system. To work around this, we explicitly set the Cache-Control: must-revalidate header, which tells Yum, or any cache in between, to revalidate stale metadata against the server before using a cached copy. The Expires settings mark the metadata as expired the moment it is served. For this, add the following section to your /etc/httpd/conf/httpd.conf near the <Location> directives (around line 900; take a look at Figure 2 to get an understanding of the exact location):

<LocationMatch "\.(xml|xml\.gz|xml\.asc|sqlite)">
    Header set Cache-Control "must-revalidate"
    ExpiresActive On
    ExpiresDefault "now"
</LocationMatch>
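
The Header directive is provided by mod_headers, and ExpiresActive and ExpiresDefault by mod_expires, so Apache stops with an "Invalid command" error if either module is not loaded. A minimal sketch, assuming the module paths used by Fedora's httpd package (relative to ServerRoot); on newer releases these lines may live in a file under conf.modules.d rather than in httpd.conf itself:

```apache
# Needed for the Header directive.
LoadModule headers_module modules/mod_headers.so

# Needed for ExpiresActive and ExpiresDefault.
LoadModule expires_module modules/mod_expires.so
```

Check first whether the lines are already present but commented out; uncommenting them is enough in that case.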

Figure 2: Configuring metadata handling in Apache

Content types

ISO and RPM files should be served with the MIME type application/octet-stream (sent in the Content-Type header), so that clients download them rather than try to display them. In Apache, this can be done inside a VirtualHost or similar section:

<VirtualHost *:80>
    AddType application/octet-stream .iso
    AddType application/octet-stream .rpm
</VirtualHost>

Limiting download accelerators

Download accelerators open many connections for the same file and request it in chunks, hoping to download them in parallel. This can overload already heavily loaded mirror servers and amount to a denial of service. To cap the number of simultaneous connections from each IP address (ideally only for the ISO directories), add this to your Apache configuration file; the limit takes effect only if the mod_limitipconn module is loaded:

<IfModule mod_limitipconn.c>
    MaxConnPerIP 3
</IfModule>
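
As written, the limit above is not tied to any directory. To apply it only to the ISO directories, the directive can be wrapped in a Location block. A minimal sketch, assuming a hypothetical URL path (/pub/fedora/linux/releases) that you should replace with wherever your mirror serves ISO images; mod_limitipconn's README also asks for ExtendedStatus On, so check the documentation for the version you install:

```apache
# Assumption: mod_limitipconn reads per-client connection counts from the
# scoreboard, which its README says requires extended status.
ExtendedStatus On

<IfModule mod_limitipconn.c>
    # Hypothetical path; adjust it to the location of your ISO tree.
    <Location /pub/fedora/linux/releases>
        # At most three simultaneous connections per client IP.
        MaxConnPerIP 3
    </Location>
</IfModule>
```

Scoping the limit this way leaves ordinary package and metadata downloads unrestricted, so only the large image files are throttled.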

Download accelerators work by sending ranged requests. To block these for ISO files, add this section to your Apache configuration file:

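RewriteEngine on
# Match requests whose Range header ends in a digit
RewriteCond %{HTTP:Range} [0-9]$
# Refuse such requests for ISO files with 403 Forbidden
RewriteRule \.iso$ / [F,L]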

Restart Apache

Now restart Apache. If the configuration is valid, the server should start without reporting an error (you can check the syntax beforehand with apachectl configtest). Once Apache starts successfully, most of the work is done.

  • ashishkumar2703

    Just commenting to get a wave invite. ;)

  • tinhed

    Thanks for this very useful article.

  • DamnitDog

    Re the mount ISO “use cp -p” or DIE … errr, not quite. ;-)

    It’s a good idea to get used to doing the right thing, but if you don’t, then rsync -a (which implies -t) will save you in this case.

    Missing files get copied completely. Identical date/time/size/name files get skipped. But filenames with different date stamps get checksummed on both sides — and only the differences are sent. In this case there *are* none, so you spend a bit of time (but not much bandwidth) figuring that out — but the entire file doesn’t come down.

    (If a file _really has_ changed, then just the checksum and changes are transmitted, not the entire file.)

    VERY good article besides that single nit.

  • susmit

    @DamnitDog, you are right.

    If the file is already present, rsync will only change the timestamp; it won’t pull the entire content.

    Sorry for the mistake.

  • http://linuxexplore.wordpress.com Rahul Panwar

    Thanks for the useful post.

    I tried the same, but I am getting an error on starting the Apache service after adding the following lines to httpd.conf:

    Header set Cache-Control "must-revalidate"
    ExpiresActive On
    ExpiresDefault "now"

    ERROR:
    Invalid command 'Header', perhaps misspelled or defined by a module not included in the server configuration

    What can I do to resolve this? I think this is related to some module that is not included in my conf file. Can you please tell me the name of that module?

    Thanks & Regards,
    Your Fan :-)
    Rahul Panwar

  • http://www.mspy.com/ Tomfille

    A proxy mirror is a local mirror that does not sync the entire Fedora install tree. Instead, it serves files through a reverse caching proxy that connects to a public Fedora mirror and downloads files as needed.

All published articles are released under Creative Commons Attribution-NonCommercial 3.0 Unported License, unless otherwise noted.