Gitolite: Administer a Central Git Repository without Going Insane

Specify complex access controls with ease

Over the past few years, centralised version control systems like SVN have steadily given way to distributed systems like Git or Mercurial, the former undoubtedly the more powerful and popular of the two. In fact, Git has become very popular even in corporate environments, which have additional access control requirements. Gitolite was originally created to help with those requirements, but it turns out to be equally useful in the open source world. Read on for more on how projects like Fedora and KDE use Gitolite, and how it can help you.

Let’s go over the terminology first. “Authentication” is the process that determines if you are who you claim to be. If authentication succeeds, the system can associate a valid username with your connection or session. If it fails, the whole operation is aborted. “Authorisation” is the next step, during which the system tries to decide if you have the right to do what you are attempting to do.

From a security perspective, there is a good case to be made for “outsourcing” authentication to an external tool or library, and not making it part of the core product. Not everyone can, or should, be writing crypto software. The best algorithm can be compromised by a weak implementation — something that would be much more easily caught by a community of developers dedicated to cryptography and authentication, rather than a team centred around version control.

As a result, Git devolves the authentication decision to SSH or httpd (in normal server-client usage). Of course, authorisation is where all the fun is. You cannot easily “outsource” it completely, because authorisation depends not just on “who you are” but also on “what you are trying to do”.

A central server for a DVCS

Because Git is a distributed version control system (VCS), there is no need for a central server — it is quite easy for developers in small teams to simply pull from each other, and work. However, while a central server is not a technical need, it is often convenient to make it an administrative requirement.

It’s quite easy to set up a central server for a few developers to work together on, using UNIX group memberships and permissions to manage access at a very broad level. You just add all the relevant users to one group (say, dev), and give that group ownership of the repository. You can do all this with git init --shared, and a few invocations of chmod, chgrp, and usermod. This gives all users the same access to all the repos.

In reality (especially in a corporate setup), when the number of repos and the number of developers grow, there will be pressure to set up more fine-grained access control. In theory, you can still do it using UNIX groups, but the model starts to break down — becoming cumbersome, fragile, and very difficult to audit/review.

Worse, you have no real way of ensuring that some people have read-only access, while others have read-write access. You can get that if you start using POSIX ACLs, but it’s still painful — POSIX ACLs have a lot more power, but aren’t any friendlier than chmod :-)

Thus, the most common use for Gitolite is simply to help administer a central server without going insane! Here’s an example, with two repositories and five developers:

repo    r1
    RW      =   alice
    R       =   bob carol

repo    r2
    RW      =   alice bob
    R       =   eve mallory

As you can see, this is very easy to write, maintain (when changes are needed), read, and review. It’s also obvious that this can effortlessly scale to dozens of repos and users.

However, that’s not the end of it. Sometimes the number of repos is quite large, but all have a uniform set of access rules, so you can use the following code:

repo    git gitolite linux rakudo
    RW      =   alice bob carol
    R       =   eve mallory

If the number gets even higher, you can try the code below:

@open       =   git gitolite
@open       =   linux rakudo
    # add as many more lines like this as you want; they accumulate
@devs       =   alice bob carol

repo    @open
    RW      =   @devs
    R       =   eve mallory

The basic syntax is simple and clean. And it gets even better: you can specify access in bits and pieces, even if they overlap. Let’s say some access needs are best grouped based on the repos, some by username, and some by both. You just do all of them, and Gitolite will combine all the access lists correctly! Here’s an example (assume the groups are already defined):

# all bosses have read access to all projects
repo @open @closed @topsecret
    R   =   @bosses

# everyone has read access to "open" projects
repo @open
    R   =   @bosses @devs @interns

You may notice that @bosses are given read access to @open by both rules. Do not worry that this causes duplication or inefficiency, because it doesn’t; Gitolite simply combines the overlapping rules.
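To make the accumulation and combination behaviour concrete, here is a minimal Python sketch. It is a toy model, not Gitolite's own code, and it assumes that the readers of a repo are simply the union of everyone granted R by any rule. The function names (define, expand, readers) and the memberships of @closed, @topsecret, @bosses, and @interns are invented for illustration.

```python
# Toy model of group accumulation and rule combination; not Gitolite code.
from collections import defaultdict

groups = defaultdict(set)

def define(group, *members):
    """Repeated definitions of the same group accumulate."""
    groups[group].update(members)

def expand(names):
    """Recursively replace @group names with their members."""
    result = set()
    for name in names:
        if name.startswith("@"):
            result |= expand(groups[name])
        else:
            result.add(name)
    return result

# @open and @devs come from the article; the other memberships are made up.
define("@open", "git", "gitolite")
define("@open", "linux", "rakudo")      # accumulates with the line above
define("@closed", "payroll")
define("@topsecret", "skunkworks")
define("@devs", "alice", "bob", "carol")
define("@bosses", "pointy_hair")
define("@interns", "asok")

# Each rule is (repos, permission, users), mirroring the config above.
rules = [
    (["@open", "@closed", "@topsecret"], "R", ["@bosses"]),
    (["@open"], "R", ["@bosses", "@devs", "@interns"]),
]

def readers(repo):
    """Union of all users granted R on this repo by any rule."""
    allowed = set()
    for repos, perm, users in rules:
        if repo in expand(repos) and "R" in perm:
            allowed |= expand(users)
    return allowed

print(sorted(readers("linux")))    # an "open" repo
print(sorted(readers("payroll")))  # a "closed" repo
```

Because the result is a set, it makes no difference that @bosses reach the open repos through two different rules.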

Going out on a limb

Once your project really starts using Git, you will have branches and tags — and you will need to control how they are dealt with. Here’s one possible setup:

@QA_team    =   QA_guy QA_gal
@Lead_devs  =   sitaram dilbert
@devs       =   @Lead_devs alice wally

repo    foo
    RW  refs/tags/v[0-9]        =   @QA_team
    RW+                         =   @Lead_devs
    RW  dev/                    =   @devs

The language is simple enough that you can understand what it is doing without being told. A member of the QA team can only push tags which start with v, followed by a digit (optionally followed by anything else). A lead dev can push or rewind (a.k.a. “force push”; you may know about the + syntax in git push) just about anything. (When you don’t supply a pattern between the permissions and the = sign, the rule matches any ref.) A normal dev can only push branches whose names start with dev/.

Gitolite processes the rules in sequence, considering only those rules in which both the user and the branch or tag being written match. It stops at the first such rule whose permission covers the attempted operation (W for a fast-forward push, + for a rewind), or at the first matching “deny” rule. If no rule matches at all, the push is rejected.
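The procedure just described can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Gitolite's implementation: it assumes a pattern that does not start with refs/ is treated as a branch name under refs/heads/, that an empty pattern matches every ref, and that patterns are anchored only at the start. The names GROUPS, RULES, full_pattern, and check are hypothetical.

```python
import re

# Group memberships from the config above, with @devs already expanded.
GROUPS = {
    "@QA_team": {"QA_guy", "QA_gal"},
    "@Lead_devs": {"sitaram", "dilbert"},
    "@devs": {"sitaram", "dilbert", "alice", "wally"},
}

# (permission, ref pattern, group), in the order they appear in the config.
RULES = [
    ("RW",  "refs/tags/v[0-9]", "@QA_team"),
    ("RW+", "",                 "@Lead_devs"),
    ("RW",  "dev/",             "@devs"),
]

def full_pattern(refex):
    """Assumed expansion of a pattern into a full ref regex."""
    if refex == "":
        return "refs/.*"              # no pattern: match any ref
    if refex.startswith("refs/"):
        return refex                  # already a full ref pattern
    return "refs/heads/" + refex      # otherwise treat it as a branch

def check(rules, user, ref, op):
    """Return True if `user` may perform `op` ('W' or '+') on `ref`."""
    for perm, refex, group in rules:
        if user not in GROUPS[group]:
            continue                  # rule is for someone else
        if not re.match(full_pattern(refex), ref):
            continue                  # rule is for some other ref
        if perm == "-":
            return False              # a matching deny rule stops the search
        if op in perm:
            return True               # permission covers the operation
    return False                      # nothing matched: reject

print(check(RULES, "QA_gal", "refs/tags/v1.0", "W"))
print(check(RULES, "alice", "refs/heads/dev/feature", "+"))
print(check(RULES, "sitaram", "refs/heads/master", "+"))
```

Note that a rule whose permission does not cover the operation (for example RW when a rewind is attempted) is simply skipped, so a later rule still gets a chance to match.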

That might sound complicated, so let’s look at some examples. Let’s assume that you wanted to establish that “…any dev can do anything to any branch, except that only lead devs can push to master…”. To implement that, you have to use the “deny” rule, denoted by a minus sign:

repo    foo
    RW+ master                  =   @Lead_devs  # line 1
    -   master                  =   @devs       # line 2
    RW+                         =   @devs       # line 3

When a normal dev (not a lead dev) tries to write to “master”, the first matching rule is line 2, which says “deny”. If a lead dev tries it, though, line 1 (which comes before line 2) matches, and allows the access. (Just as an exercise, think about what happens if you switch lines 1 and 2. Since lead devs are also members of @devs, they will be denied any write access to “master”, because the deny rule will be matched first!)

How about if you wanted to say, “…any dev can do anything to any branch, except that only lead devs can rewind master,” (i.e., normal devs can now do a fast-forward push to “master”, which they could not do in the previous example)? Here’s how you can accomplish this:

repo    foo
    RW+ master                  =   @Lead_devs  # line 1
    RW  master                  =   @devs       # line 2
    -   master                  =   @devs       # line 3
    RW+                         =   @devs       # line 4

If a non-lead dev tries to force push to “master”, the first two lines have no effect: line 1 does not match because the user is not a lead dev, and line 2 does not match because RW does not cover a rewind (+). Line 3 does match, and so denies the access. Of course, all this applies only to “master”; the last line allows any dev to push anything to any other branch or tag.
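If you would rather check these walkthroughs than take them on trust, the following Python sketch replays the two “master” configurations, plus the swapped-lines exercise, through a small first-match model of the rule processing described earlier. It is only a model: the names allowed, only_leads_push, swapped, and only_leads_rewind are made up, and it assumes that patterns are matched from the start of the ref and that an unmatched push is rejected.

```python
import re

LEADS = {"sitaram", "dilbert"}
DEVS = LEADS | {"alice", "wally"}     # lead devs are also in @devs

def allowed(rules, user, branch, op):
    """First-match check; op is 'W' (fast-forward) or '+' (rewind)."""
    ref = "refs/heads/" + branch
    for perm, refex, members in rules:
        pattern = "refs/heads/" + refex if refex else "refs/.*"
        if user in members and re.match(pattern, ref):
            if perm == "-":
                return False          # deny rule reached first
            if op in perm:
                return True           # permission covers the operation
    return False                      # no rule matched

# "only lead devs can push to master"
only_leads_push = [
    ("RW+", "master", LEADS),         # line 1
    ("-",   "master", DEVS),          # line 2
    ("RW+", "",       DEVS),          # line 3
]

# The exercise: lines 1 and 2 switched.
swapped = [only_leads_push[1], only_leads_push[0], only_leads_push[2]]

# "only lead devs can rewind master"
only_leads_rewind = [
    ("RW+", "master", LEADS),         # line 1
    ("RW",  "master", DEVS),          # line 2
    ("-",   "master", DEVS),          # line 3
    ("RW+", "",       DEVS),          # line 4
]

for name, rules in [("push", only_leads_push), ("swapped", swapped),
                    ("rewind", only_leads_rewind)]:
    for user in ("alice", "sitaram"):
        for op in ("W", "+"):
            print(name, user, op, allowed(rules, user, "master", op))
```

In the swapped variant even sitaram is turned away from “master”, which is exactly why the order of the rules matters.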

As a final example, let’s look at access control based on what files are being changed. Let’s say that junior developers should not be allowed to push changes to a critical module, whose code resides in critmod.c:

repo    foo
    RW+                         =   @devs
    RW+ NAME/                   =   @Lead_devs
    -   NAME/critmod.c          =   @devs
    RW+ NAME/                   =   @devs

I’ll leave this as a tantalising example of what you can do, so you have some incentive to read the online documentation!

Walk on the wild side

The KDE project allows authenticated developers to create personal repositories, which can either be clones of existing KDE project repos, or completely ad-hoc, “scratch” repos for personal tools, scripts, etc. How do they do that? And how do they manage access to those repos? That’s where Gitolite’s “wildcard” feature comes in. It allows the administrator to say “these users are allowed to create new repos whose names match this pattern”. You do not have to do anything special to create such a repo; just clone it as if it already existed, and it will spring into existence, with you as the “owner”. Here’s a slightly modified snippet from the KDE infrastructure team’s Gitolite config file:

repo clones/[^\W][\w.-]*[^\W_]/CREATOR/[^\W][\w.-]*[^\W_]
    C                   = @all
    RW+CD               = CREATOR
    RWCD                = MANAGERS
    RWC                 = WRITERS
    RW                  = @all

As you can see, that is already pretty intuitive; it is easy to guess that using a regular expression as a repo name means all repos that have matching names are affected. You might even guess that the CREATOR is replaced by the name of the user creating the repo (so any clones I create, for example, will always look like clones/foo/sitaram/bar, where clones and sitaram cannot change).
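Here is a small Python sketch of how such a pattern could be checked. It is an illustration only: it assumes that CREATOR is replaced by the (escaped) username and that the resulting regular expression must match the whole repo name. The names WILD_REPO and may_create are hypothetical.

```python
import re

# The repo pattern from the config snippet above.
WILD_REPO = r"clones/[^\W][\w.-]*[^\W_]/CREATOR/[^\W][\w.-]*[^\W_]"

def may_create(user, repo_name):
    """True if repo_name fits the pattern once CREATOR becomes `user`."""
    pattern = WILD_REPO.replace("CREATOR", re.escape(user))
    return re.fullmatch(pattern, repo_name) is not None

print(may_create("sitaram", "clones/foo/sitaram/bar"))
print(may_create("alice", "clones/foo/sitaram/bar"))
```

With this particular pattern, each of the two free name components must be at least two characters long, so a name like clones/foo/sitaram/x would not qualify.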

The permissions given to special usernames are interesting. The owner can create, delete, or rewind any branch or tag. MANAGERS can create or delete branches, but cannot rewind them (although you can get around a rewind restriction by deleting and recreating a branch!), and WRITERS cannot delete or rewind branches. These special “user” names are actually user categories, and the owner can specify who belongs in each category. For example, if I had a repo called clones/foo/sitaram/bar, I might use the following code:

ssh git@server setperms clones/foo/sitaram/bar
READERS wally
WRITERS alice asok dilbert
MANAGERS pointy_hair
^D

(That’s a literal Ctrl-D there, because the setperms command takes its data from STDIN.)
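The link between those category lines and the rules in the config can be sketched in Python as follows. This is a rough model, not what Gitolite does internally: it simply returns the permission string on the first rule line that names the user, using the permissions copied from the KDE snippet above. The names parse_perms, CATEGORY_PERMS, and permission_for are made up.

```python
# The text sent to setperms in the example above.
PERMS_INPUT = """\
READERS wally
WRITERS alice asok dilbert
MANAGERS pointy_hair
"""

def parse_perms(text):
    """Turn 'CATEGORY user user ...' lines into {category: set(users)}."""
    categories = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, *users = line.split()
        categories.setdefault(name, set()).update(users)
    return categories

# Category -> permission, as listed in the config snippet (in that order).
CATEGORY_PERMS = {"MANAGERS": "RWCD", "WRITERS": "RWC"}

def permission_for(user, creator, categories):
    """Permission string on the first rule line that names this user."""
    if user == creator:
        return "RW+CD"
    for category, perm in CATEGORY_PERMS.items():
        if user in categories.get(category, set()):
            return perm
    return "RW"                       # falls through to RW = @all

cats = parse_perms(PERMS_INPUT)
for user in ("sitaram", "pointy_hair", "alice", "wally"):
    print(user, permission_for(user, "sitaram", cats))
```

Since the snippet has no rule for READERS, wally ends up with whatever @all gets.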

Note: The RW = @all rule is a KDE policy: any authenticated user can push changes to any branch of any repo. People in the corporate world may gasp in shock at this, but it is a very good idea for an open source project, because it encourages participation. And it is quite harmless, since RW permission alone can never destroy anything, and any unwanted push can always be rewound by the owner (or anyone who has RW+ permission).

And that’s basically it. You’ll find all the details in the documentation, especially the required setting in the rc file, but the basic idea is very simple: the Gitolite administrator decides what rights users and user categories have, and the owner decides who belongs to what category.

Gitolite also comes with a bunch of commands to allow you to create a clone directly on the server, delete it, undelete it up to a certain number of days later, and so on, which can all be enabled by the Gitolite administrator. If you look at the KDE link again, you will notice many such commands, although some of them are very customised from the default versions.

The end result is a system that can be pretty much “self-service” in terms of repos created by users, with access rights managed by themselves (but within broad limits set by the admin), which is very useful for large developer communities.

Wrapping it up

I hope this has sparked a little interest in those of you who’ve been considering Git, but are not sure how to handle access control. Gitolite has many more features; please check the README and the many documents that come with the source. The best starting point for using Gitolite is https://github.com/sitaramc/gitolite#start.


All published articles are released under Creative Commons Attribution-NonCommercial 3.0 Unported License, unless otherwise noted.
Open Source For You is powered by WordPress, which gladly sits on top of a CentOS-based LEMP stack.
