UPDATE – July 26, 2012: The release of Zend Framework 2 RC 1 makes much of this obsolete since it includes better native console support. I’ll try to post again when I have time to rework everything to use the new functionality, but for now I’ve just patched in a workaround. See comments at the bottom of the article if you are interested.
VuFind is primarily a web application, but it also includes a number of command-line tools for performing various harvest, import and maintenance tasks. It would be nice if these command-line tools could leverage the infrastructure of the web application so we don’t need to write redundant code for setting up autoloaders, configuring resources, etc. However, we don’t want to accidentally expose command-line behaviors through the web interface. Fortunately, Zend Framework 2’s module system makes this fairly easy to achieve.
The Goal
To keep things reasonably granular, running any given command-line utility should route the request to a special command-line controller whose name corresponds to the directory containing the tool, and execute an action corresponding to the tool’s filename. So, for example, running import/import-xsl.php would call ImportController::importXslAction().
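To make the intended mapping concrete, here is a minimal sketch of the naming convention. The function name describeCliTarget() is hypothetical and not part of VuFind; it assumes PHP 7 or later for the type declarations, and the framework’s own dispatch logic performs the real conversion from a dashed action name to a method name.

```php
<?php
// Hypothetical helper (not VuFind code): maps a tool's relative path to the
// controller method the article wants that tool to reach.
function describeCliTarget(string $scriptPath): string
{
    // The containing directory becomes the controller name, e.g. "import".
    $controller = basename(dirname($scriptPath));
    // The filename without its extension becomes the action, e.g. "import-xsl".
    $action = pathinfo($scriptPath, PATHINFO_FILENAME);
    // Dashed action names become camelCase: "import-xsl" -> "importXsl".
    $method = lcfirst(
        str_replace(' ', '', ucwords(str_replace('-', ' ', $action)))
    );
    return ucfirst($controller) . 'Controller::' . $method . 'Action';
}

echo describeCliTarget('import/import-xsl.php'), PHP_EOL;
```

Run as written, this prints ImportController::importXslAction, matching the example above.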
Naming the Module
Zend Framework 2 modules correspond with PHP namespaces. For example, the main VuFind module is located in module/VuFind/Module.php, and it defines a VuFind namespace. All supplementary files living within the VuFind namespace are found under module/VuFind/src/VuFind, and Zend Framework knows how to access them based on the module’s configuration.
When creating the CLI module, we have two options:
1.) We can create a new namespace, such as VuFindCLI, and locate the module in module/VuFindCLI/Module.php with a structure totally parallel to the main VuFind module. This results in the cleanest directory structure, but the namespacing isn’t very logical, since this is really a subset of VuFind functionality.
2.) We can create a sub-namespace, such as VuFind\CLI, and locate the module in module/VuFind/CLI/Module.php. Because this namespace is a subset of the main VuFind namespace, supplemental files will live in module/VuFind/src/VuFind/CLI rather than module/VuFind/CLI/src/VuFind/CLI, which is a potential source of some confusion.
I opted for approach #2 — I prefer having logical namespaces at the expense of a little directory irregularity.
Loading the Module
Having decided what to call the module, loading it is simply a matter of modifying the main application configuration (config/application.config.php) to load the CLI module when it detects that it is running in CLI mode:
$config = array(
    'modules' => array(
        'VuFind',
    ),
    /* ... trimmed for clarity ... */
);
if (PHP_SAPI == 'cli') {
    $config['modules'][] = 'VuFind\CLI';
}
return $config;
Creating the Module
The CLI module itself doesn’t need to contain very much content. We need a configuration to tell it how to load CLI-specific controllers (module/VuFind/CLI/config/module.config.php), and a module definition to set up custom routing (module/VuFind/CLI/Module.php).
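For the configuration half, a minimal sketch might look like the following. This is an assumption rather than the actual VuFind file: it uses the 'controllers' / 'invokables' layout from Zend Framework 2 RC and later (earlier betas used different keys), and it registers the controller under the key 'import' so that the name matches the directory-based controller name the routing code produces.

```php
<?php
// Sketch of module/VuFind/CLI/config/module.config.php (illustrative only).
// Assumption: ZF2 RC-style 'controllers' => 'invokables' configuration.
return array(
    'controllers' => array(
        'invokables' => array(
            // Key matches the name of the directory containing the tool.
            'import' => 'VuFind\CLI\Controller\ImportController',
        ),
    ),
);
```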
The routing is a little bit complicated, so let’s look more closely at it. Here is the relevant code from Module.php:
public function onBootstrap(MvcEvent $e)
{
    $callback = function ($e) {
        // Get command line arguments and present working directory from
        // server superglobal:
        $server = $e->getApplication()->getRequest()->getServer();
        $args = $server->get('argv');
        $filename = $args[0];
        $pwd = $server->get('PWD', CLI_DIR);

        // Convert base filename (minus .php extension) and containing directory
        // name into action and controller, respectively:
        $baseFilename = basename($filename);
        $baseFilename = substr($baseFilename, 0, strlen($baseFilename) - 4);
        $baseDirname = basename(dirname(realpath($pwd . '/' . $filename)));
        $routeMatch = new RouteMatch(
            array('controller' => $baseDirname, 'action' => $baseFilename), 1
        );

        // Override standard routing:
        $routeMatch->setMatchedRouteName('default');
        $e->setRouteMatch($routeMatch);
    };
    $events = $e->getApplication()->getEventManager();
    $events->attach('route', $callback);
}
The onBootstrap() method is called automatically on every module. Within this method, we are using the Zend Framework 2 event manager to associate a callback function with the route event. The callback function is defined as a closure.
Within the closure, we need to do two things:
1.) Figure out the directory and filename that were used to access VuFind (the idea here is that every CLI utility will simply be a wrapper that loads the core of Zend Framework with an include statement).
2.) Using this contextual information, force the router to load the appropriate controller by injecting a RouteMatch object that matches the ‘default’ route defined by the core VuFind module.
As it turns out, step 1 was a little harder than anticipated. Figuring out the filename that was accessed is easy; PHP’s $_SERVER superglobal (accessible in Zend Framework through the getServer() call) contains an ‘argv’ element representing command-line parameters, and the first element of this array always holds the name of the script as it was invoked, from which basename() extracts the base filename. The hard part is figuring out the containing directory. The __DIR__ magic constant is of no use to us, because it refers to the context of the currently-executing file, not the top-level script run by the user. Similarly, the getcwd() function is of no help, because part of the standard ZF2 initialization sets the current working directory to a fixed location.
In many environments, PHP comes to the rescue with a $_SERVER element called ‘PWD’, which contains the directory from which the user executed PHP. The problem is that it is not present on every operating system (it is missing on Windows, for example). For lack of a more elegant solution, I eventually settled on defining a constant called CLI_DIR in my command-line scripts so that code deeper in the framework can figure out the context. Hence the code:
$pwd = $server->get('PWD', CLI_DIR);
This attempts to use the $_SERVER['PWD'] variable, but if it is not set, it falls back to the CLI_DIR constant. That way, the ugly workaround is only triggered when absolutely necessary.
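The same fallback can be sketched without the framework, which makes it easy to try on different platforms. The function name resolveLaunchDir() is hypothetical, and unlike the get() call above, this version also treats an empty PWD value as missing (an assumption on my part, not something the original code does).

```php
<?php
// Hypothetical helper (not VuFind code): choose the directory the tool was
// launched from, preferring PWD and falling back to a caller-supplied value.
function resolveLaunchDir(array $server, string $fallback): string
{
    // PWD is set by most Unix shells but is typically absent on Windows.
    // Assumption: an empty value is treated the same as a missing one.
    if (isset($server['PWD']) && $server['PWD'] !== '') {
        return $server['PWD'];
    }
    return $fallback;
}

// In a real wrapper script the fallback would be the CLI_DIR constant;
// __DIR__ stands in for it here so the sketch runs on its own.
echo resolveLaunchDir($_SERVER, __DIR__), PHP_EOL;
```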
Putting it All Together
As my first proof of concept, I decided to implement the import/import-xsl.php script. The code inside the script is very simple:
define('CLI_DIR', __DIR__); // save directory name of current script
require_once __DIR__ . '/../public/index.php';
This just sets the CLI_DIR constant described above and loads the framework.
Now we just need to define a controller to respond to the request. I created a base controller with shared methods that are likely to be used by other CLI-oriented controllers (module/VuFind/src/VuFind/CLI/Controller/AbstractBase.php) and then extended that with the actual ImportController functionality (module/VuFind/src/VuFind/CLI/Controller/ImportController.php).
At this point, all the pieces are in place. When you run import-xsl.php, it loads Zend Framework. Zend Framework detects CLI mode and loads the CLI module. The CLI module overrides the router and directs the user to ImportController::importXslAction(). The controller is able to make use of all the same classes and resources as a web application, and no setup code has been duplicated anywhere.
The Rough Edges
There is one piece of the puzzle that I am not entirely happy about right now. Zend Framework 2 controllers work by building up a ViewModel or Response object and then returning that for further processing. This model does not work well in the CLI environment for two reasons:
1.) CLI tools often need to produce real-time output. Unlike a web request which gets built all at once, a CLI tool will often show incremental details as it works (“loading file 1, loading file 2, etc.”).
2.) CLI tools need to return an exit status to the operating system in order to indicate success or failure, which is critical for incorporating PHP tools into shell scripts and batch files.
Neither of these needs is currently met by native framework features (at least as far as I can tell). For now, I am simply using “echo” and “exit” inside the controllers to achieve the desired effects, which is functional but less than ideal.
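As a rough illustration of the echo-and-exit approach, the sketch below uses a plain function as a stand-in for a controller action. The name importFiles() and the 0/1 status convention are assumptions for the example, not VuFind code.

```php
<?php
// Hypothetical stand-in for a CLI controller action: prints progress as it
// works and returns a shell-style status (0 = success, 1 = failure).
function importFiles(array $files): int
{
    $status = 0;
    foreach ($files as $file) {
        // Incremental output, written as soon as each file is reached.
        echo "loading $file", PHP_EOL;
        if (!is_readable($file)) {
            // Report problems on standard error and remember the failure.
            fwrite(STDERR, "cannot read $file" . PHP_EOL);
            $status = 1;
        }
    }
    return $status;
}

// A wrapper script would hand the status back to the operating system:
// exit(importFiles(array_slice($argv, 1)));
```

Returning the status instead of calling exit inside the function keeps the logic testable, while the final exit call still lets shell scripts and batch files check for success or failure.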
There is hope, though: the Zend Framework community is currently thinking about CLI integration, as evidenced by this Request For Comment. I’ll try to keep an eye on developments in this area, and once the framework has the capabilities we need, the existing code can be more tightly integrated into it.