Posts Tagged ‘CPAN’

The CPAN ecosystem is one of the most compelling reasons for the continued growth of the Perl programming language. It has been discussed at length by numerous people, and there have been several attempts to imitate this aspect of the Perl community through projects such as CRAN, CCAN, and JSAN.

Unfortunately, in equal parts due to its age and design philosophy, the PAUSE system powering CPAN makes it difficult for distributions to be maintained by a group, rather than an individual. The inspiration for this post comes from a discussion I had recently with Florian Ragwitz, who contributes to several key Perl projects, including Catalyst, Moose, DBIx::Class and many more.

Permissions

First, a bit about how permissions on CPAN work.

In order to make a package installable using the CPAN Shell, there must be some mechanism to disambiguate a module name. Consider this simple example:

  1. I upload Acme::Package to CPAN.
  2. Some time passes and, unbeknownst to me, another author uploads a different package to CPAN that is also called Acme::Package.

In the absence of any permission checking, if I then instructed users to install Acme::Package using the CPAN Shell, they would inadvertently install the wrong distribution! This has some rather serious implications: the other Acme::Package is probably quite different from mine, and a malicious author could have taken my software and added a backdoor vulnerability.

CPAN solves this issue by tracking each module namespace separately using the PAUSE Indexer, which assigns upload permissions to users through two mechanisms:

  1. The module namespace registration list.
  2. First-come status (the first uploader of a given package namespace “owns” that namespace).

Going back to the example given, the second uploader of Acme::Package would not have permission to use the namespace. The package will be accepted into the archive, but will not be indexed, meaning that users installing Acme::Package will still get my distribution.

If users want to install the other author’s package (which is marked as an UNAUTHORIZED upload in big red letters on CPAN Search), they would need to explicitly specify AUTHOR/Acme-Package-1.00.tar.gz.
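
The difference between installing by module name and installing by explicit path can be sketched as a lookup against the index. This is a simplified illustration in Python, not the actual code of PAUSE or the CPAN Shell; the author IDs ME and OTHER, the version 0.01, and the function name resolve are all hypothetical.

```python
# Hypothetical index: module name -> distribution path of the authorized upload.
INDEX = {"Acme::Package": "ME/Acme-Package-0.01.tar.gz"}


def resolve(target: str) -> str:
    """Return the distribution path that an install request would fetch."""
    # An explicit AUTHOR/Dist-1.00.tar.gz path bypasses the index entirely.
    if "/" in target and target.endswith(".tar.gz"):
        return target
    # A bare module name is looked up in the index, so only indexed
    # (authorized) uploads can be reached this way.
    return INDEX[target]
```

Under these assumptions, resolve("Acme::Package") always yields my distribution, while the other author's upload is reachable only by spelling out its full path.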

For packages maintained by several people, it is also possible to assign co-maintainer status to others, so that they may also upload a package and have it correctly indexed. This way, two or more people can work on the same package together, and upload it under their own accounts (without causing the upload to be marked unauthorized). Thus, PAUSE credentials do not need to be shared.

This provides a nice solution to the malicious upload problem, but also has implications for team-maintained packages. In particular, consider the case where there are two authors working on Acme::Library.

  1. Alice uploads the first version to CPAN, containing modules: Acme::Library and Acme::Library::Main.
  2. The PAUSE Indexer grants Alice first-come permissions to both Acme::Library and Acme::Library::Main.
  3. Alice grants Bob co-maintainer status on both Acme::Library and Acme::Library::Main.
  4. Bob creates a new Acme::Library::Other module and adds it to the package.
  5. The PAUSE Indexer grants Bob first-come permissions to Acme::Library::Other.
  6. Subsequent uploads by Alice will have Acme::Library::Other marked UNAUTHORIZED, because Alice holds no permissions on that namespace.
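
The six steps above can be replayed against a toy model of per-namespace permissions. This is a minimal sketch in Python under stated assumptions: the Indexer class and its method names are invented for illustration and do not reflect how the PAUSE Indexer is actually implemented.

```python
class Indexer:
    """Toy model of per-namespace upload permissions (not PAUSE's real code)."""

    def __init__(self):
        self.owner = {}    # namespace -> first-come owner
        self.comaint = {}  # namespace -> set of co-maintainers

    def grant_comaint(self, namespace, user):
        self.comaint.setdefault(namespace, set()).add(user)

    def upload(self, user, namespaces):
        """Claim unowned namespaces; return those the uploader may not use."""
        unauthorized = []
        for ns in namespaces:
            # An unclaimed namespace goes to whoever uploads it first.
            owner = self.owner.setdefault(ns, user)
            if user != owner and user not in self.comaint.get(ns, set()):
                unauthorized.append(ns)
        return unauthorized


pause = Indexer()
base = ["Acme::Library", "Acme::Library::Main"]
everything = base + ["Acme::Library::Other"]

pause.upload("ALICE", base)                      # steps 1-2
for ns in base:
    pause.grant_comaint(ns, "BOB")               # step 3
bob_result = pause.upload("BOB", everything)     # steps 4-5
alice_result = pause.upload("ALICE", everything) # step 6
```

In this model, Bob's upload is fully authorized and leaves him as the first-come owner of Acme::Library::Other, so Alice's next upload comes back with that one namespace flagged.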

Solutions

Clever Perl authors have attempted to solve this problem in many different ways over the years, but none of them have been widely successful because they all rely on some degree of human interaction.

Shared PAUSE Accounts

Some notable projects have attempted to solve the issue by creating a shared PAUSE user to hold the requisite first-come or module list upload permissions, which may then be granted to all other team members through the existing co-maintainer facility.

Alternatively, since it is easier for smaller projects, many modules simply assign first-come permissions to a single person, who is then in charge of providing co-maintainer permissions to others who would like to work on it.

Both of these approaches have the same limitation: anyone uploading new modules must remember to assign first-come permissions to the group or user in question. In our case, Bob should have assigned first-come permissions on Acme::Library::Other to Alice, who must then pass co-maintainer permissions back to Bob. Unfortunately, this almost never happens, and Alice must chase down Bob (who happens to be on vacation in Antarctica) or, alternatively, the already over-worked PAUSE administrators.

Single Uploader

Some projects deal with this issue by sharing a version control system and having all the uploads go through a single person, in our case, Alice. This fixes the permission problem, since first-come permissions are always granted to Alice, but it results in a single point of failure. If there are some serious security issues requiring an immediate release, Alice must be available (and, as luck would have it, she is vacationing in Antarctica at the time).

Enter x_authority

One proposed solution, which is used in projects including Moose and Catalyst, is to use a special field in the CPAN Metadata file (META.yml or META.json) that defines someone as the “authority” for first-come namespaces in a distribution.

This is how it would work for Alice’s Acme::Library distribution:

  1. Alice uploads a package to CPAN, containing modules: Acme::Library and Acme::Library::Main.
  2. Alice specifies, in META.yml:
    x_authority: cpan:ALICE

    This refers to Alice’s PAUSE login, and names the person to whom permissions for new modules uploaded in this distribution are assigned.

  3. Alice grants Bob co-maintainer status on both Acme::Library and Acme::Library::Main.
  4. Bob creates a new Acme::Library::Other module and adds it to the package.
  5. The PAUSE indexer, seeing the x_authority defined in META.yml, grants Alice (not Bob!) first-come permissions to Acme::Library::Other. At this time, Bob also automatically gets co-maintainer permissions to Acme::Library::Other.
  6. Subsequent uploads by Alice will be indexed properly.
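
The same walk-through can be sketched with the authority read from the distribution metadata. This is an illustrative Python model only: the function index_upload, the "name" field value, and the use of META.json rather than META.yml are assumptions made for the example, and the fallback to the uploader mirrors the behaviour described above rather than any confirmed PAUSE implementation.

```python
import json


def index_upload(owner, comaint, uploader, namespaces, meta_json):
    """Toy indexing pass that honours x_authority (illustrative only)."""
    meta = json.loads(meta_json)
    authority = meta.get("x_authority", "")
    # "cpan:ALICE" -> "ALICE"; without the field, fall back to the uploader.
    if authority.startswith("cpan:"):
        new_owner = authority[len("cpan:"):]
    else:
        new_owner = uploader
    for ns in namespaces:
        if ns not in owner:
            owner[ns] = new_owner
            if uploader != new_owner:
                # The uploader keeps the ability to release the new module.
                comaint.setdefault(ns, set()).add(uploader)
    # Namespaces the uploader has no permission on would be unauthorized.
    return [ns for ns in namespaces
            if uploader != owner[ns] and uploader not in comaint.get(ns, set())]


META = '{"name": "Acme-Library", "x_authority": "cpan:ALICE"}'
base = ["Acme::Library", "Acme::Library::Main"]
everything = base + ["Acme::Library::Other"]
owner, comaint = {}, {}

index_upload(owner, comaint, "ALICE", base, META)                       # steps 1-2
for ns in base:
    comaint.setdefault(ns, set()).add("BOB")                            # step 3
bob_result = index_upload(owner, comaint, "BOB", everything, META)      # steps 4-5
alice_result = index_upload(owner, comaint, "ALICE", everything, META)  # step 6
```

Here Alice ends up owning Acme::Library::Other even though Bob uploaded it first, Bob is recorded as its co-maintainer, and neither upload has anything flagged.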

Problems

There are still some outstanding issues that need to be resolved, but the x_authority proposal represents a giant leap forward for team-maintained software.

The name: any keys not part of the CPAN Metadata Specification must be prefixed with “x_” – eventually, once it is used by more people and accepted into the specification, this name will become, simply, “authority.”

Other co-maintainers: if Charlie joined the project prior to Bob’s upload of Acme::Library::Other, then Alice still needs to grant co-maintainer permissions to Charlie. Unfortunately, the PAUSE Indexer cannot automatically grant permissions to him, since it has no notion of a “distribution,” only module namespaces.

Malicious uploaders: in the worst case, if Eve joins the project and maliciously (or unintentionally!) changes the x_authority, she will automatically get first-come permissions on the namespace of any modules she adds. However, this is the same behaviour that we had in the absence of x_authority.

Conclusions

Ultimately, the benefits of this feature (making group maintenance easier) far outweigh the cost (only a few small changes need to be made to the PAUSE Indexer). The outstanding issues described above are unlikely to cause any problems in practice, and the worst-case behaviour is the same as if we did not have x_authority at all.

It isn’t perfect, but it is a solution that requires minimal effort and minimal changes to PAUSE. Eventually, the goal is to create a more sophisticated system that will handle the issues outlined above, as well as more complex ones, such as renaming distributions or moving modules between distributions.

Thanks to Florian Ragwitz for spending some time discussing x_authority at length with me. He and Leon Timmermans proofread this article prior to publication.

Earlier in the year, I wrote a similar article discussing the Catalyst Web Framework and the MojoMojo Wiki software. At the beginning of December 2009, I wrote an article which was published in the Catalyst Advent Calendar. I’m re-posting it here for posterity, and because it is still relevant to others today.

Introduction

Because Catalyst is a rapidly evolving project, packages supplied by operating system vendors like Debian, Fedora, Ubuntu, and many others have historically been outdated compared to the stable versions. In effect, this limited users of Debian’s package management system to outdated versions of this software.

In 2009, thanks to the efforts of Matt S Trout and many others, Debian’s Catalyst packages have been improving. The idea that Debian’s Perl packages are outdated is an idea that is itself becoming obsolete. There are many situations where system-wide Debian packages (and similarly, Ubuntu packages) can be preferable to installing software manually via CPAN.

Advantages

Here are some reasons why packages managed by Debian are preferable to installing packages manually:

  • Unattended installation: the majority of our packages require absolutely no user interaction during installation, in contrast to installs via CPAN.
  • Quicker installs for binary packages: since binary packages are pre-built, installing the package is as simple as unpacking the package and installing the files to the appropriate locations. When many modules need to be built (as with Catalyst and MojoMojo), this can result in a significant time savings, especially when one considers rebuilding due to upgrades.
  • No unnecessary updates: if an update only affects the Win32 platform, for example, it does not make sense to waste bandwidth downloading and installing it. Our process separates packages with bugfixes and feature additions from those that have no functional difference to users, saving time, bandwidth, and administrative overhead.
  • Only packages offered by Debian are supported by Debian: if there are bugs in your Debian software, it is our responsibility to help identify and correct them. Often this means coordinating with the upstream software developers (i.e. the Catalyst community) and working toward a solution together – but our team takes care of this on your behalf.
  • Updates occur with the rest of your system: while upgrading your system using aptitude, synaptic, or another package management tool, your Perl packages will be updated as well. This prevents issues where a system administrator forgets to update CPAN packages periodically, leaving your systems vulnerable to potential security issues.
  • Important changes are always indicated during package upgrades: if there are changes to the API of a library which can potentially break applications, a supplied Debian.NEWS file will display a notice (either in a dialog box or on the command line) indicating these changes. You will need to install the “apt-listchanges” utility to see these.

This year has seen greatly improved interaction between the Debian Perl Group and the Catalyst community, which is a trend we’d like to see continue for many years to come. As with any open source project, communicating the needs of both communities and continuing to work together as partners will ultimately yield the greatest benefit for everyone.

Disadvantages

As with all good things, there are naturally some situations where using Debian Perl packages (or, indeed, most operating-system managed packages) is either impossible, impractical, or undesirable.

  • Inadequate granularity: due to some restrictions on the size of packages being uploaded into Debian, there are plenty of module “bundles”, including the main Catalyst module bundle (libcatalyst-modules-perl). Unfortunately, this means you may have more things installed than you need.
  • Not installable as non-root: if you don’t have root on the system, or a friendly system administrator, you simply cannot install Debian packages, let alone our Perl packages. This can add to complexity for shared hosting scenarios where using our packages would require some virtualization.
  • Multiple versions: with a solution like local::lib, it’s possible to install multiple versions of the same package in different locations. This can be important for a number of reasons, including ease of testing and to support your legacy applications. With operating-system based packages, you only ever have a single version installed: the most recent one available (and if you are using the stable release, the one with the most recent serious bug/security fixes).
  • Less useful in a non-homogeneous environment: if you use different operating systems, it can be easier to maintain a single internal CPAN mirror (especially a mini-CPAN installation) than a Debian repository, Ubuntu repository, Fedora/RedHat repository, etc.

For my purposes, I use Debian packages for everything because the benefits outweigh the perceived costs. However, this is not the case for everyone in all situations, so it is important to understand that Debian Perl packages are not a panacea.

Quality Assurance

The Debian Perl Group uses several tools to provide quality assurance for our users. Chief among them is the Package Entropy Tracker (PET), a dashboard that shows information like the newest upstream versions of modules. Our bug reports are available in Debian’s open bug reporting system.

If you have any requests for Catalyst-related modules (or other Perl modules) that you’d like packaged for Debian, please either contact me directly (via IRC or email) or file a “Request For Package” (RFP) bug. If you have general questions or would like to chat with us, you’re welcome to visit us at any time – we hang around on irc.debian.org, #debian-perl.

See Also

  • Our IRC channel, irc.debian.org (OFTC), channel #debian-perl
  • Package Entropy Tracker is a dashboard where we can see what needs to be updated. It allows us (and others, if interested!) to easily monitor our workflow, and also contains links to our repository.
  • Our welcome page talks about what we do and how you (yes you!) can join. You don’t need to be a Debian Developer to join the group (actually, I’m not yet a DD and yet I maintain 300+ packages through the group).
  • This guide explains how to file a Request For Package (RFP) bug, so that the modules you use can be added to the Debian archive. Note that Debian is subject to many restrictions, so issues like inadequate copyright information may prevent the package from entering the archive.

Statistics

Here are some statistics of note:

  • We maintain over 1400 packages as of today. For details, see our Quality Assurance report.
  • We have quite a few active members; probably around 10 or 20.

Acknowledgments

Thanks to Matt S Trout (mst) for working so closely with the group to help both communities achieve our goal of increasing Catalyst’s profile. Also thanks to Bogdan Lucaciu (zamolxes) for inviting us to contribute this article, and Florian Ragwitz (rafl) for his review and feedback.

Everything that is good in nature comes from cooperation. Neither Catalyst, nor Perl, nor Debian Perl packages could exist without the contributions of countless others. We all stand on the shoulders of giants.

I’ve recently been pushing for greater support for Catalyst and MojoMojo on Debian. For the uninitiated, Catalyst is a Model-View-Controller Framework designed for writing web applications. MojoMojo is a Wiki application based on Catalyst that provides a lot of neat features; while it seems less popular than Wikimedia’s MediaWiki software, it’s still got plenty of features other wikis don’t.

Here’s a blurb about it from their homepage:

We also have a bunch of features you won’t find in every wiki, like an attachment system that automatically makes a web gallery of your photos, live AJAX previews as you are editing your text, and a proper full text search engine built straight into the software.

Unfortunately, such a rich feature set comes at a price — this shiny piece of software has a rather large dependency chain. As a result, building the module (after building its prerequisites) from CPAN is both slow and prone to failure, since each module must be individually retrieved, extracted, built, tested and then installed.

To make matters worse, any failure anywhere in the chain (perhaps a new version of a module breaks things) will cause a complete failure to build the module — either Catalyst or MojoMojo — which has some serious implications for production applications.

In Debian, we mitigate this risk by having separate unstable and testing distributions, so if a newer version breaks things in unstable, we will catch it and have a chance to fix it before the package makes it into testing. By packaging these modules for Debian, we get the advantages of a faster installation process (since we’re installing pre-built binaries) combined with better Quality Assurance.

One of the big issues blocking both of these has been missing copyright information for a lot of modules. I’ve worked a lot with Matt S. Trout, one of the primary people behind coordinating the efforts of the Catalyst project, and gathered the necessary information for an upgrade and upload into Debian.

Recently, libcatalyst-modules-perl (version 35) and libcatalyst-modules-extra-perl (version 4) were uploaded to Debian, containing many necessary updates and fixes to improve the Catalyst experience on Debian. The next big push is to get MojoMojo’s dependencies packaged (currently only String::Diff is blocking it, due to missing copyright information).

A bounty of $150 is being offered by one of the MojoMojo developers to the first person who can re-implement the String::Diff functionality in a free/open source way.
