DSpace is a capable repository framework, but its administrative functionality has some gaps. You can do a lot from the web interface, and the command-line operations provide some nice tools, but some tasks just aren't supported. For example:

  • moving items or collections to other collections or communities
  • changing permissions on items or bitstreams
  • running anything but the most basic reports
  • and many more

This is unfortunate, because the DSpace Core API covers these areas well, but there is no easy way to get at it. Java, for all its strengths, sometimes isn't ideal for scripting small actions. What DSpace (and sometimes Java) needs is a scripting environment.

Fortunately, JRuby (Ruby that runs on the JVM and interoperates with Java classes) provides such an environment and, with some configuration, a very nice way to work with the DSpace Core API, whether in a REPL or in short scripts.

I've put together a small gem, dscriptor, to help set up a convenient scripting environment for DSpace. This gem provides three pieces of functionality:

  1. Loading the DSpace runtime, which involves defining the location of the configuration file and starting the service kernel.
  2. Importing required Java classes, so that they can be accessed directly in the shell. (Currently these are imported into the global namespace, which isn't ideal, but should be fine for small scripts or IRB sessions.)
  3. Providing a number of "mixins", or convenience methods, that make it easier to perform certain actions in the Core API.

To use dscriptor, provide a configuration block:

Dscriptor.configure do |c|
  c.dspace_cfg = ENV['DSPACE_CFG'] # the path to your dspace.cfg file
  c.admin_email = ENV['ADMIN_EMAIL'] # the admin email address
  c.imports.merge %w{
    org.dspace.authorize.AuthorizeManager
    org.dspace.authorize.ResourcePolicy
    org.dspace.storage.rdbms.TableRow
  } # any number of classes you want to import
end
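
If the block style above is unfamiliar, here is a minimal plain-Ruby sketch of the pattern. It is not dscriptor's actual source: MiniConfig and its Settings struct are stand-ins made up for illustration, and it assumes the imports collection behaves like a Set, which would explain why merge adds classes in place without an assignment.

```ruby
require 'set'

# Illustrative stand-in, NOT dscriptor's real implementation.
module MiniConfig
  Settings = Struct.new(:dspace_cfg, :admin_email, :imports)

  # One shared settings object, created on first use with a default import.
  def self.settings
    @settings ||= Settings.new(nil, nil, Set.new(%w[org.dspace.core.Context]))
  end

  # Hand the settings object to the caller's block, then return it.
  def self.configure
    yield settings
    settings
  end
end

MiniConfig.configure do |c|
  c.dspace_cfg = '/dspace/config/dspace.cfg' # example path only
  # Set#merge mutates the set in place; duplicates are silently collapsed.
  c.imports.merge %w[
    org.dspace.authorize.AuthorizeManager
    org.dspace.core.Context
  ]
end
```

Because the collection is a set in this sketch, listing a class that is already present is harmless.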

Prepare the runtime:

Dscriptor.prepare

This starts the DSpace kernel, requires all of the core jar files, and imports the specified Java classes.

Include the tool's mixins (optional):

include Dscriptor::Mixins

Currently, these mixins are rather sparse. The most useful so far is probably context, which provides an instance of org.dspace.core.Context with the admin user loaded, which can be used throughout the script/session.

However, it's easy to add new convenience methods:

module Dscriptor
  module Mixins

    # Look up a DSpaceObject by its handle.
    # Assumes org.dspace.handle.HandleManager is among the configured imports.
    def find_by_handle(hdl)
      HandleManager.resolve_to_object(context, hdl)
    end

    # Move a collection from one community to another.
    # Add to the new community first, so the collection is never left
    # without a parent community.
    def move_collection(coll, old_comm, new_comm)
      new_comm.add_collection(coll)
      old_comm.remove_collection(coll)
    end

  end
end

One thing to keep in mind: if you have modified any DSpace objects, you'll need to call context.complete at the end of your script so that the changes are committed.
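
One way to avoid forgetting that call is to wrap the work in a block. The sketch below is plain Ruby with a stand-in FakeContext so it runs without DSpace; with_commit is a hypothetical helper, not something dscriptor provides. The real org.dspace.core.Context has complete() and abort() methods, which is all the helper relies on.

```ruby
# Stand-in for org.dspace.core.Context so the sketch runs without DSpace.
# It only records which of the two methods were called.
class FakeContext
  attr_reader :calls

  def initialize
    @calls = []
  end

  def complete
    @calls << :complete
  end

  def abort
    @calls << :abort
  end
end

# Hypothetical helper (not part of dscriptor): run the block, then commit.
# If the block or the commit raises, roll back instead; the error still
# propagates to the caller.
def with_commit(ctx)
  committed = false
  result = yield ctx
  ctx.complete
  committed = true
  result
ensure
  ctx.abort unless committed
end

# Successful run: the context is completed.
ok = FakeContext.new
with_commit(ok) { |_c| :modified }

# Failing run: the context is aborted and the error is re-raised.
failed = FakeContext.new
begin
  with_commit(failed) { |_c| raise 'simulated failure' }
rescue RuntimeError
  # swallowed here only so the example keeps running
end
```

In a real session you would pass the context from the mixins instead of a FakeContext. The ensure-with-flag form is used so the rollback does not depend on which exception class was raised.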

Another thing to note: you'll need to have the DSpace Solr webapp running if you want to use any of the discovery classes/tools.

Some examples of useful scripts can be found on GitHub.