Engineering Blog

Stop Bashing on Tarballs in Chef

by Ooyalan on 10-15-2015 09:55 AM

Written by Karen Bruner

Systems and Architecture




The Problem


Tarball handling in Chef recipes can lead to multiple problems. Sometimes it becomes a gateway drug to bad practices: "We had to use a bash block to call the system tar because there's no native tar resource. And that worked well, so we just decided to put more stuff in bash blocks." Next thing you know, the entire recipe is one big bash 'everything' do ... end. Or a recipe will untar some files, then try to modify them using the Chef file resource, only for the client run to throw an error saying the file doesn't exist, because when Chef went to create the resource at the beginning of the run, the file really didn't exist yet.


A couple of cookbooks exist that offer a tar resource provider, but they either didn't create extracted files as Chef resources, or they used the system tar, which, to someone who comes from environments with a mix of Linux with GNU tar and FreeBSD with its more traditional Unix tar, adds a bit of unpredictability.



The tarball Cookbook


Enter the tarball cookbook. It extracts files from a tarball as Chef resources, without shelling out. The tar extraction is done in pure Ruby, built around the RubyGems TarReader class, which exists in just about every Ruby installation, including the one in /opt/chef/embedded.

Features include:
* Automatic detection and handling of gzipped (or uncompressed!) tarballs
* Resource attributes for handling extracted file owner/group/mode (using masks)
* Extraction of the entire archive or of specific files (including wildcard support)
* Extraction of regular files, directories, and symbolic and hard links
* POSIX tar format support and partial GNU tar extension support
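To make the "pure Ruby" point concrete, here is a minimal sketch of reading a tarball with Gem::Package::TarReader and gunzipping only when the gzip magic bytes are present. It is not the cookbook's actual code: the helper names open_tar_stream and list_entries, and the TYPE_NAMES table, are made up for illustration.

```ruby
require 'rubygems/package'
require 'stringio'
require 'zlib'

# First two bytes of every gzip stream.
GZIP_MAGIC = [0x1f, 0x8b].freeze

# Tar typeflags for the entry kinds mentioned above (hypothetical table).
TYPE_NAMES = {
  '0' => :file, '' => :file, "\0" => :file,
  '1' => :hard_link, '2' => :symlink, '5' => :directory
}.freeze

# Hypothetical helper: return an IO of raw tar data, wrapping the input in a
# GzipReader only if it starts with the gzip magic bytes.
def open_tar_stream(io)
  magic = io.read(2).to_s.bytes
  io.rewind
  magic == GZIP_MAGIC ? Zlib::GzipReader.new(io) : io
end

# Hypothetical helper: list [name, type] pairs for every entry in the archive.
def list_entries(io)
  entries = []
  Gem::Package::TarReader.new(open_tar_stream(io)) do |tar|
    tar.each do |entry|
      type = TYPE_NAMES.fetch(entry.header.typeflag, :unsupported)
      entries << [entry.full_name, type]
    end
  end
  entries
end
```

Calling list_entries(File.open('/tmp/some_archive.tgz', 'rb')) would then work the same way for a gzipped or a plain tarball, with no call to the system tar.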



Example usage


Add the tarball cookbook as a dependency in your cookbook's metadata.rb, then in a recipe where you need to extract files from a tarball:

# Fetch the tarball if it's not a local file
remote_file '/tmp/some_archive.tgz' do
  source ''
end

# Extract only the matching files as Chef resources
tarball_x '/tmp/some_archive.tgz' do
  destination '/opt/my_app_path'    # Will be created if missing
  owner 'root'
  group 'root'
  extract_list [ '*.conf' ]
  umask 002             # Will be applied to perms in archive
  action :extract
end

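For intuition about the extract_list and umask attributes in the example, here is a rough sketch in plain Ruby. It assumes shell-style glob matching via File.fnmatch? and conventional umask semantics (clear the masked bits from the archived mode). The cookbook's real matching and masking logic may differ, and select_entries and apply_umask are hypothetical names.

```ruby
# Hypothetical helper: keep only the entry names matching at least one pattern.
# Assumes shell-style globs; without File::FNM_PATHNAME, '*' also matches '/'.
def select_entries(names, patterns)
  names.select do |name|
    patterns.any? { |pattern| File.fnmatch?(pattern, name) }
  end
end

# Hypothetical helper: apply a umask to a mode read from the archive,
# assuming the usual meaning: bits set in the umask are cleared.
def apply_umask(archive_mode, umask)
  archive_mode & ~umask
end

names = ['app.conf', 'README', 'etc/db.conf']
selected = select_entries(names, ['*.conf'])
masked = apply_umask(0666, 002)
```

Under those assumptions, a file stored in the archive with mode 0666 would land on disk as 0664 when the resource specifies umask 002.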

The Nitty Gritty


As it happens, there are almost as many variations on and extensions to the POSIX tar standard as there are snowflakes. Most variations are at least somewhat compatible with POSIX, although occasional quirks will crop up (invalid file modes, truncated paths, etc.). The Gem::Package::TarReader class was designed to handle the standard format used by RubyGems, but a Chef recipe can pull tarballs from various sources using any imaginable format. As it turns out, the most informative source for the basic details of the various specifications was the FreeBSD tar(5) man page.


The tarball cookbook supports the POSIX standard (via the Gem::Package::TarReader class) and the most common GNU tar extensions, as those were the formats we encountered most frequently. Support for additional extensions can be added in providers/x.rb in the extraction method.

During extraction, directories are created using Chef's directory resource, files using the file resource, and symbolic and hard links using the link resource with the appropriate link_type. This makes the created filesystem objects visible to Chef as resources that can be interacted with further along in the client run.
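The mapping from tar entry type to Chef resource can be pictured with a small sketch like the one below. It is plain Ruby with no Chef dependency and is not the provider's actual code: resource_for is a hypothetical helper that only describes which resource type and link_type would be used for each standard tar typeflag.

```ruby
# Hypothetical helper: describe the Chef resource that would represent a tar
# entry, keyed on the entry's typeflag from the tar header.
def resource_for(typeflag, path, link_target = nil)
  case typeflag
  when '0', '', "\0" # regular file
    { resource: :file, path: path }
  when '5'           # directory
    { resource: :directory, path: path }
  when '2'           # symbolic link
    { resource: :link, path: path, to: link_target, link_type: :symbolic }
  when '1'           # hard link
    { resource: :link, path: path, to: link_target, link_type: :hard }
  else
    raise ArgumentError, "unsupported tar typeflag: #{typeflag.inspect}"
  end
end

symlink = resource_for('2', '/opt/my_app_path/current', 'releases/1')
```

Because each entry ends up as an ordinary Chef resource rather than a side effect of a shell command, later resources in the same run can reference it.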



The Cookbook


You can find the tarball cookbook on GitHub.

Want to hack on the tarball cookbook and other infrastructure automation at Ooyala? Join us!