LJ Archive CD

RubyGems

Dirk Elmendorf

Issue #147, July 2006

A comprehensive guide to finding, creating and using Ruby resources called gems.

RubyGems is a system for managing Ruby software libraries. Ruby code packaged in this manner is called a gem. When you find Ruby software you want to use in a project, gems offer a means of downloading, installing and managing the software.

History

Ruby's connection with Perl caused converts to ask an obvious question: “Where is the CPAN (Comprehensive Perl Archive Network) for Ruby?” If you have done any Perl programming or used Perl software, you likely have downloaded something from CPAN to make the software work. As it is the de facto standard for sharing libraries in Perl, access to CPAN makes it easier to re-use code developed by others. This tool allows the developer to focus on new problems instead of re-inventing the wheel.

As it turns out, package management is not as simple as it sounds. It gets even more complicated when you are trying to solve a problem for a variety of platforms and operating systems (Ruby runs on *nix, *BSD, Mac OS X and Windows). There have been several attempts at building a working system.

Ryan Leavengood is credited with creating the very first RubyGems project back in 2001 (see the on-line Resources). The project got started but did not really gain enough momentum to take off. Other solutions were attempted, but they did not really catch on enough to dominate the field.

In November 2003, Rich Kilmer, Chad Fowler, David Black, Paul Brannan and Jim Weirich got together at a Ruby conference and started coding. Their goal was to create a solution once and for all. They obtained permission to use the existing name RubyGems from Leavengood, even though they did not use any code from the previous project.

RubyGems set out to solve several problems. The focus was on simplifying the process of installing, removing, updating and managing Ruby libraries. The developers added an interesting twist by allowing the system to manage multiple versions of the same library easily. Using the versioning scheme from RubyGems, it is possible to provide very powerful control over the selection of which version of a library your code will actually use.

Getting Started

There are plans to include RubyGems as part of the core distribution of Ruby, but until that happens, you need to install it. Your Linux distribution may have a package (RPM, Deb and so on) for RubyGems. In the event that it does not, you can install it from source easily, assuming you have Ruby and the development headers for Ruby already installed on your Linux box.

You can do the following as a user: go to rubyforge.org/projects/rubygems, and download the current version (0.8.11 at the time of this writing):

tar xzf rubygems-0.8.11.tgz
cd rubygems-0.8.11

You must be root to install the software (assuming you want it to be available to all users):

ruby setup.rb all

Now that RubyGems is installed, you should have the gem command (gem is the command used to interact with the RubyGems package system). Test it out by running:

gem list

It should show a single installed package: sources (0.0.1).

User Tasks

Now that you have the gem command, you can begin installing gem packages. You need to be root to install or modify gems, but any user can query the system to find out what is installed. When you want to find software, you can always check out RubyForge (see Resources). It is the main clearinghouse for Ruby open-source software.

One of the most popular RubyForge projects is Ruby on Rails. The Rails gem (and the gems it depends on) can be installed with the following command:

gem install rails --include-dependencies

Another very popular project is RMagick. RMagick is a useful Ruby interface to ImageMagick (see Resources), and it can be installed with the following command:

gem install rmagick

This gem includes non-Ruby code. When you install it, it will compile the C code as part of the installation process. If you do not have compile tools installed, the installation will fail.

RubyGems offers a number of useful commands, including a remote search:

gem search rails --remote gems.rubyforge.org

This returns a list of all the packages and versions available on RubyForge that have the word rails in the title of the package. Here are a few more, well, er, gems:

  • gem update: updates all the current versions of gems to their latest version.

  • gem cleanup: removes old versions of gems that are installed.

  • gem uninstall: removes a given gem from the repository.

Because I try to keep up with the most current version of the gem software, I usually gem update and then gem cleanup the repository to get rid of old libraries. Doing this keeps the gems directory a little cleaner and makes it easier to sort through if and when you need to look in the directory.
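To make the effect of gem cleanup more concrete, the following Ruby sketch picks out which installed versions would count as old. It is only an illustration and not the gem command's own implementation: the stale_versions helper and the sample version lists are hypothetical, and it assumes a RubyGems release that provides the Gem::Version class.

```ruby
require 'rubygems'

# Hypothetical helper: given a hash of gem name => installed version strings,
# return the versions that are older than the newest one for each gem.
def stale_versions(installed)
  installed.each_with_object({}) do |(name, versions), stale|
    sorted = versions.map { |v| Gem::Version.new(v) }.sort
    old = sorted[0...-1].map(&:to_s) # everything except the newest version
    stale[name] = old unless old.empty?
  end
end

# Made-up sample data, not the output of a real "gem list".
installed = {
  'rails'   => ['1.0.0', '1.1.2'],
  'rake'    => ['0.7.0', '0.6.2', '0.7.1'],
  'rmagick' => ['1.10.1']
}

stale = stale_versions(installed)
stale.each { |name, versions| puts "#{name}: #{versions.join(', ')}" }
```

Versions are compared as Gem::Version objects rather than as plain strings, so a release such as 0.10.0 would sort after 0.9.0.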

Developers

Now that you have some software installed, you will want to use it. To get started, you may want to read the documentation on the gems to learn their API. If you have installed rdoc on your system, gem automatically generates the rdoc (Ruby documentation) for all of the gems you install. You can view this documentation in two different ways. The first one is to run the command:

gem_server

This automatically launches a Ruby-based Web server on port 8808. You can add the -p option to launch the server on a different port. This makes it easy for you to use your Web browser to browse the documentation for all of the gems that are installed. The gem_server can be stopped by pressing Ctrl-C. Also, be aware that the server accepts connections from all hosts that are able to connect to that port. So, if you are concerned about opening a port on your server, you may want to try the alternate means of access.

The other way to access this documentation is to navigate to the place on the filesystem where gem has generated it. In most cases, it will be in /usr/lib/ruby/gems/1.8/doc, but in the event that gem has been installed in a different path, you can ask gem where the correct directory is:

gem environment gemdir

This command gives you the base directory where gem is installed. The documentation is stored in the doc subdirectory of that directory. When you access the files this way, you do not get the summary overview that you get from the gem_server; instead you get only a directory listing of all the gems that are installed.
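The same lookup can be done from inside Ruby. This is a minimal sketch, assuming that Gem.dir reports the same base directory as gem environment gemdir on your installation; the doc_dir variable name is just for illustration.

```ruby
require 'rubygems'

# Gem.dir is the base directory that RubyGems installs into.
doc_dir = File.join(Gem.dir, 'doc')
puts doc_dir

# List the per-gem documentation directories, if any have been generated.
if File.directory?(doc_dir)
  (Dir.entries(doc_dir) - %w[. ..]).sort.each { |entry| puts "  #{entry}" }
end
```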

In order to make your Ruby scripts able to use the Ruby libraries you have now installed, you need to use Ruby's require mechanism to load in the code. The simplest way to use RubyGems is to call the following lines:

require 'rubygems'
require 'RMagick'

This loads all the RubyGems code and automatically allows you to use the latest gem version of RMagick that you have installed. If the library is already available on Ruby's regular load path (that is, installed without RubyGems), it is loaded from there instead.

If you would like to tie your software to a specific version of the library, a different function must be called:


require 'rubygems'
require_gem 'RMagick', '>=1.10'
require_gem 'rake', '>=0.7.0', '<0.9.0'

These statements tell Ruby to use RMagick as long as its version is greater than or equal to 1.10. The second require_gem line allows any version of rake that is greater than or equal to 0.7.0 and less than 0.9.0. The version statement supports a number of operators: =, !=, >, >=, <, <= and ~>. The last one is a special operator. It assumes that you are following the RubyGems standard for version numbers, which takes the form:

X.Y.Z

You increase X when you release a version that is incompatible with the previous version. You increase Y when you release a version with a new feature that is otherwise compatible. You increase Z when you release a fix for the software.

This allows the ~> requirement to select within a special range. So, for example, 1.0, 1.0.1, 1.0.2 and 1.1 all satisfy ~> 1.0, and 1.1 and 1.1.2 satisfy ~> 1.1.

This lets you support minor changes in the gem version without having to change the require statements in your code.
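The operators described above can be tried out without installing anything. The sketch below assumes a RubyGems release that exposes the Gem::Requirement and Gem::Version classes (the release discussed in this article may organize them differently); the allowed? helper is a hypothetical name used only for this example.

```ruby
require 'rubygems'

# Hypothetical helper: true when the version string satisfies every constraint.
def allowed?(version, *constraints)
  Gem::Requirement.new(*constraints).satisfied_by?(Gem::Version.new(version))
end

# A single lower bound, as in the RMagick line.
puts allowed?('1.10.2', '>= 1.10')

# A lower and an upper bound together, as in the rake line.
puts allowed?('0.8.7', '>= 0.7.0', '< 0.9.0')
puts allowed?('0.9.0', '>= 0.7.0', '< 0.9.0')

# The ~> operator applied to the versions listed in the text.
%w[1.0 1.0.1 1.0.2 1.1 2.0].each do |v|
  puts "#{v} satisfies ~> 1.0: #{allowed?(v, '~> 1.0')}"
end
```

Under these assumptions, every version in the list except 2.0 satisfies ~> 1.0, because 2.0 changes the X part of X.Y.Z.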

A word of advice: if you are putting in require statements that are tied to a version, make sure you have a central place for calling and organizing them. This will make it easier to determine what other software you depend on and to adjust version requirements later when they need to change.
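One way to follow this advice is to keep every version requirement in a single table. The following is a sketch under stated assumptions, not a prescribed layout: the REQUIRED_GEMS constant and the acceptable? helper are hypothetical names, and it assumes a RubyGems release that provides Gem::Dependency.

```ruby
require 'rubygems'

# Hypothetical central list: one Gem::Dependency per library the project uses.
# In a real script, this is the one place to edit when a requirement changes.
REQUIRED_GEMS = [
  Gem::Dependency.new('RMagick', '>= 1.10'),
  Gem::Dependency.new('rake', '>= 0.7.0', '< 0.9.0')
].freeze

# Hypothetical helper: is this version of the named gem acceptable?
def acceptable?(name, version)
  dep = REQUIRED_GEMS.find { |d| d.name == name }
  return false unless dep
  dep.requirement.satisfied_by?(Gem::Version.new(version))
end

REQUIRED_GEMS.each { |dep| puts "#{dep.name}: #{dep.requirement}" }
puts acceptable?('rake', '0.8.7')
puts acceptable?('rake', '0.9.2')
```

A script could loop over the same table to issue its require_gem calls, so the list doubles as a record of what the software depends on.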

Building Your Own

So far, gems have really been about using other people's software in your code. If you decide you have a library that might be useful to other people, you easily can package it up as a gem.

Turning your code into a gem is a two-part process, and the nice thing about it is that you do not have to modify your code to make it available as a gem. The first part is getting your library set up in a directory structure that is suitable for conversion to a gem. I'm going to be using an existing project called IPAdmin (see Resources) as my example of how this works.

The directory structure is organized as follows:

  • /ipadmin/lib: this directory contains all of the Ruby code related to the project.

  • /ipadmin/pkg: this is where the gem will be generated.

  • /ipadmin/tests: this is where any unit or other tests should be stored.

  • /ipadmin/README: this file should contain a summary of the project—especially the license under which it is being released (feel free to add a separate file for the license).

This is the bare minimum layout you need to build up a gem.

More complex projects (rake for example) add the following directories:

  • /rake/bin: this is for any command-line scripts that are part of the project.

  • /rake/doc: additional documentation about the project.

This shows how some projects (rake, capistrano) are able to add in new command-line tools once they are installed on a system.

RMagick includes a special directory:

  • /RMagick/ext: this is where non-Ruby source should be stored if it is going to be compiled.

This is another power option. RubyGems supports shipping non-Ruby source code in the gem. When the user installs this “source” gem on the destination computer, gem attempts to compile the extra code as part of the installation. The advantage of shipping a gem this way is that the non-Ruby code will bind to the actual libraries that are installed on the destination computer. This is exactly what happens when you install RMagick. If you do not have the proper libraries (ImageMagick) or a compiler, the install will fail. To get around the problem of not being able to compile the code, it is possible to ship a precompiled version of the gem. In this case, the source files are compiled and then simply included in the gem.

Once you have your code set up in the correct directory structure, you can focus on the other part of the process of gem building—the gem specification. This is basically a manifest that gives gem all the information it needs about the gem being built. You can build a gem spec as a standalone file, but it is easier to work with if you make it a Rakefile. This simplifies the building process.

There is a Rakefile in the main directory of IPAdmin:

require 'rubygems'
Gem::manage_gems
require 'rake/gempackagetask'

spec = Gem::Specification.new do |s|
    s.platform  =   Gem::Platform::RUBY
    s.name      =   "ipadmin"
    s.version   =   "0.2.2"
    s.author    =   "Dustin Spinhirne"
    s.email     =   "dspinhir @nospam@ yahoo.com"
    s.summary   =   "A package for manipulating IPv4/IPv6 address space."
    s.files     =   FileList['lib/*.rb', 'tests/*'].to_a
    s.require_path  =   "lib"
    s.autorequire   =   "ip_admin"
    s.test_files = Dir.glob('tests/*.rb')
    s.has_rdoc  =   true
    s.extra_rdoc_files  =   ["README"]
end

Rake::GemPackageTask.new(spec) do |pkg|
    pkg.need_tar = true
end

task :default => "pkg/#{spec.name}-#{spec.version}.gem" do
    puts "generated latest version"
end

This is a good example of a standard Rakefile for a gem. Here you can see that it is including RubyGems and adding some tasks from rake. The main spec handles providing all the information about the gem that is being built. The last task adds a simple helper that allows you to run rake in the directory and automatically build a gem.
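The default task's prerequisite is simply a file name assembled from the spec. The short sketch below builds a cut-down spec with the same name and version and prints that name; it leaves out the rake parts, and it assumes a RubyGems release in which Gem::Specification#full_name is available.

```ruby
require 'rubygems'

# A trimmed copy of the spec above: just enough fields to compute the names.
spec = Gem::Specification.new do |s|
  s.platform = Gem::Platform::RUBY
  s.name     = 'ipadmin'
  s.version  = '0.2.2'
  s.summary  = 'A package for manipulating IPv4/IPv6 address space.'
end

# The same interpolation the default task uses for its prerequisite.
gem_path = "pkg/#{spec.name}-#{spec.version}.gem"
puts gem_path

# For a pure-Ruby platform, the full name is the name and version joined.
puts spec.full_name
```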

Each of the lines in the spec has a special meaning. The entire list of options that can be set is available from the Gemspec Reference on the RubyGems Manuals site (see Resources).

Specification Explained

platform determines for what platform the gem is meant. If you are just using pure Ruby, it can stay with this default. This flag becomes very important when you are shipping precompiled gems.

name, version, author, email and summary provide basic information about the gem and its author. This is how users can find out who is responsible for the code.

files defines the list of files that are to be included in the gem. The FileList class is provided by rake, and it does two things that make life easier. First, it handles globs (*) and patterns, meaning that you can grab a lot of files easily. Second, it understands that certain files should be excluded. By default, it excludes CVS, svn, bak and core files.
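A small experiment shows the glob and the default exclusions at work. This sketch assumes rake is installed and that its FileList is reachable as Rake::FileList; the file names are hypothetical and are created in a temporary directory only for the demonstration.

```ruby
require 'rake'
require 'tmpdir'

selected = nil
Dir.mktmpdir do |dir|
  Dir.chdir(dir) do
    Dir.mkdir('lib')
    # Hypothetical files: two Ruby sources and one editor backup.
    %w[lib/ip_admin.rb lib/cidr.rb lib/notes.bak].each do |path|
      File.write(path, '')
    end
    # The glob matches all three files; the backup is dropped by default.
    selected = Rake::FileList['lib/*'].to_a.sort
  end
end

puts selected
```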

require_path is set to determine what directories should be searched for code. The value for this would change if you were building extensions in the ext directory.

autorequire designates which file will be loaded when require ipadmin is called in code. ipadmin.rb in this module handles requiring the other three libraries that ship with ipadmin.

test_files is a list of files that should be executed when the gem is installed if the user adds the -t argument to the gem install. This is a way to provide safety checks to make sure everything worked after the gem is installed.

has_rdoc is a way to tell gem you have included rdoc tags in the code. If this flag is false or missing, gem will not generate documentation automatically.

extra_rdoc_files allows you to include other files in the documentation that is generated by gem. In this case, the README file is being linked into the documentation. If you had other documents, they could be listed here.

Because IPAdmin is a very simple project, it does not include one very useful command: add_dependency. If you build a gem that depends on another gem, this command allows these dependencies to be specified. You even can tie it to a version number in the same way you can with require_gem. When you install a gem that has a dependency, gem checks to see if it is met. If it is not met, gem offers to install it. To add a dependency on rake, you could add this to the spec definition:

s.add_dependency("rake",">=0.7.0")
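To see what that line records, you can build a throwaway spec and inspect it. This is a minimal sketch: the gem name example and its summary are placeholders, and it assumes a RubyGems release in which Gem::Specification#dependencies returns objects with name and requirement readers.

```ruby
require 'rubygems'

spec = Gem::Specification.new do |s|
  s.name    = 'example' # hypothetical gem name
  s.version = '0.0.1'
  s.summary = 'Placeholder spec used to show add_dependency.'
  s.add_dependency('rake', '>= 0.7.0')
end

# Each dependency pairs a gem name with a version requirement.
spec.dependencies.each do |dep|
  puts "#{dep.name} #{dep.requirement}"
end

# The requirement can be checked against a candidate version.
dep = spec.dependencies.first
puts dep.requirement.satisfied_by?(Gem::Version.new('0.7.1'))
puts dep.requirement.satisfied_by?(Gem::Version.new('0.6.2'))
```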

Signing Gems

Thanks to a patch from Paul Duncan, the latest version of RubyGems (0.8.11) now has some features to support signing your gems using a public/private key. This introduces some new options for the gem specification (signing_key and cert_chain). This change also allows you to install gems in a high-security mode that will install only gems that are signed by trusted sources. Because the feature itself is very new, some pieces of infrastructure to make it useful in the greater scheme of things are missing—namely, an easy way to build up a chain of trust so that end users do not have to add certificates for every single gem author out there. That being said, these features might be useful if you want to control gems inside your network across a lot of servers. You could download them once and sign them with an internal certificate. Then, you could update all your servers by requesting gems from the server where you distribute these signed gems. Duncan has written a great overview of getting started with gem signing on the RubyGems Manuals site (see Resources).

Distribution

Now that you have a gem, you probably want to share it. There are several ways to distribute your code. The simplest way is to host the file. When people want to install it, they can download the file and run gem install in the same directory.

The second option is to host the project at RubyForge.org. RubyGems ships with RubyForge as the default source for gems. RubyForge even runs a special script so that once you upload your new gem to your account, it automatically is available to all users of RubyGems.

Assuming you do not want to use RubyForge, there are two options left to make it possible to distribute your gem via RubyGems. First, you need to run your own server. The easiest way to do that is to simply fire up gem_server. It automatically shares gems with anyone who connects to it.

The other option is to cd to a directory inside of the webroot of an existing Web server. Create a directory called gems, and copy all the gems you want to distribute into that directory.

Run the following command, and replace DIR with the full path to the directory above the gems directory. This creates yaml and yaml.Z files:

generate_yaml_index.rb -d DIR

You need to re-run the script anytime you modify the gems you are serving. Keep in mind that if you use either of these options, your users have to add the --source URL_OF_YOUR_SITE option to the gem install command. This allows gem to search that site for gems.

Packaging

RubyGems is a package management system unto itself. If your system does not already have package management, this is a huge improvement. On the other hand, if your Linux system has package management, RubyGems can add some complexity. This is largely a side effect of RubyGems being completely separate from the host packaging system. According to the RubyGems Web site, the problem is related to the version-per-directory layout. This apparently conflicts with the Filesystem Hierarchy Standard (see Resources). Hopefully, some sort of middle ground will be found, because the joy of having a good package management system is having a single place to make sure everything is up to date and works properly together. The risk is really related to gems that install non-Ruby code. For example, I believe it is possible to install a gem and then have the host package system replace a shared library that is managed by the host system with an incompatible version, which would render the gem useless.

In the long run, I hope that someone comes up with a good solution to the problem. So far, I have not been affected seriously by this potential issue. I use apt to manage Ruby and the rest of the system, and I use RubyGems to manage the gems I need. The one problem I had was more related to user error. I failed to install a library that RMagick required. The compilation of the RMagick extension failed, but I did not see the error because it scrolled by too fast, and the gem reported that it was installed. Eventually, I figured out what was going on, and no computers were harmed in the process. It could be argued that this problem may have been prevented if I were doing everything in apt, because it would have installed the missing library as soon as I installed RMagick. On the other hand, because a lot of the Rails and other Ruby gems seem to be updating frequently, it has been nice to be able to keep up with the latest version of the Ruby software instead of having to wait for new Debs to be released.

Conclusion

Package management for Ruby got off to a rocky start. Now that we have RubyGems, it is hard to imagine working without it. RubyGems crams a lot of features into a very tiny package. It has made it a lot easier to find, distribute and manage a wide variety of Ruby software. Now that you have made it through this brief introduction, you can start using gems in your own development.

Resources for this article: /article/9019.

Dirk Elmendorf is one of the founders of Rackspace Managed Hosting (www.rackspace.com). He is currently addicted to Ruby on Rails, and by the time you read this he will be happily married to Annie Tiemann!

