Moving VuFind to Zend Framework 2: Part 4 — Command Line Tools

UPDATE – July 26, 2012: The release of Zend Framework 2 RC 1 makes much of this obsolete since it includes better native console support. I’ll try to post again when I have time to rework everything to use the new functionality, but for now I’ve just patched in a workaround. See comments at the bottom of the article if you are interested.


VuFind is primarily a web application, but it also includes a number of command-line tools for performing various harvest, import and maintenance tasks.  It would be nice if these command-line tools could leverage the infrastructure of the web application so we don’t need to write redundant code for setting up autoloaders, configuring resources, etc.  However, we don’t want to accidentally expose command-line behaviors through the web interface.  Fortunately, Zend Framework 2’s module system makes this fairly easy to achieve.

The Goal

To keep things reasonably granular, it would be nice if running any given command-line utility caused VuFind to route the request to a special command-line controller whose name corresponds to the directory containing the tool, executing an action that corresponds to the tool’s filename.  So, for example, running import/import-xsl.php would call ImportController::importXslAction().
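
To make the naming convention concrete, here is a minimal sketch in plain PHP of how a script path could be translated into a controller and action name. The function cliPathToTarget() is hypothetical and not part of VuFind; it only illustrates the convention described above, not how the framework itself resolves names.

```php
<?php
/**
 * Hypothetical helper (not part of VuFind): translate a CLI script path
 * into the controller class and action method named by the convention.
 */
function cliPathToTarget($scriptPath)
{
    // Directory name becomes the controller: "import" -> "ImportController"
    $controller = ucfirst(basename(dirname($scriptPath))) . 'Controller';

    // File name minus ".php" becomes the action: "import-xsl" -> "importXsl"
    $words = explode('-', basename($scriptPath, '.php'));
    $action = array_shift($words);
    foreach ($words as $word) {
        $action .= ucfirst($word);
    }

    return $controller . '::' . $action . 'Action()';
}

echo cliPathToTarget('import/import-xsl.php'), "\n";
// prints "ImportController::importXslAction()"
```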

Naming the Module

Zend Framework 2 modules correspond with PHP namespaces.  For example, the main VuFind module is located in module/VuFind/Module.php, and it defines a VuFind namespace.  All supplementary files living within the VuFind namespace are found under module/VuFind/src/VuFind, and Zend Framework knows how to access them based on the module’s configuration.

When creating the CLI module, we have two options:

1.) We can create a new namespace, such as VuFindCLI, and locate the module in module/VuFindCLI/Module.php with a structure totally parallel to the main VuFind module.  This results in the cleanest directory structure, but the namespacing isn’t very logical, since this is really a subset of VuFind functionality.

2.) We can create a sub-namespace, such as VuFind\CLI, and locate the module in module/VuFind/CLI/Module.php.  Because this namespace is a subset of the main VuFind namespace, supplemental files will live in module/VuFind/src/VuFind/CLI rather than module/VuFind/CLI/src/VuFind/CLI, which is a potential source of confusion.

I opted for approach #2 — I prefer having logical namespaces at the expense of a little directory irregularity.
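
The directory irregularity is easier to see when the mapping is written out. The following sketch assumes a simple PSR-0 style lookup in which every class in the VuFind namespace, including the VuFind\CLI sub-namespace, resolves under module/VuFind/src. The function classToPath() is hypothetical; the real lookup is handled by the autoloader configuration, which may work differently.

```php
<?php
/**
 * Hypothetical illustration (not VuFind code): map a fully qualified class
 * name to the file that would hold it under approach #2.
 */
function classToPath($class, $srcDir = 'module/VuFind/src')
{
    // Namespace separators become directory separators (PSR-0 style).
    return $srcDir . '/' . str_replace('\\', '/', ltrim($class, '\\')) . '.php';
}

// CLI classes live under the main module's src directory, even though
// the CLI module definition itself sits in module/VuFind/CLI/Module.php:
echo classToPath('VuFind\CLI\Controller\ImportController'), "\n";
// prints "module/VuFind/src/VuFind/CLI/Controller/ImportController.php"
```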

Loading the Module

Having decided what to call the module, loading it is simply a matter of modifying the main application configuration (config/application.config.php) to load the CLI module when it detects that it is running in CLI mode:

$config = array(
    'modules' => array(
        'VuFind',
    ),
    /* ... trimmed for clarity ... */
);
if (PHP_SAPI == 'cli') {
    $config['modules'][] = 'VuFind\CLI';
}
return $config;

Creating the Module

The CLI module itself doesn’t need to contain very much content. We need a configuration to tell it how to load CLI-specific controllers (module/VuFind/CLI/config/module.config.php), and a module definition to set up custom routing (module/VuFind/CLI/Module.php).
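
For orientation, a configuration of this kind might look roughly like the sketch below. This is an assumption, not the actual VuFind file: it relies on the controllers/invokables convention used by Zend Framework 2's MVC layer, and it registers the controller under a short name matching the tool's directory so that the custom routing can request it.

```php
<?php
// Sketch of module/VuFind/CLI/config/module.config.php (an assumption, not
// the actual VuFind file): register CLI controllers under the short names
// that the custom routing will request.
return array(
    'controllers' => array(
        'invokables' => array(
            // Key matches the tool's directory name, e.g. import/
            'import' => 'VuFind\CLI\Controller\ImportController',
        ),
    ),
);
```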

The routing is a little bit complicated, so let’s look more closely at it. Here is the relevant code from Module.php:

    public function onBootstrap(MvcEvent $e)
    {
        $callback = function ($e) {
            // Get command line arguments and present working directory from
            // server superglobal:
            $server = $e->getApplication()->getRequest()->getServer();
            $args = $server->get('argv');
            $filename = $args[0];
            $pwd = $server->get('PWD', CLI_DIR);

            // Convert base filename (minus .php extension) and containing directory
            // name into action and controller, respectively:
            $baseFilename = basename($filename);
            $baseFilename = substr($baseFilename, 0, strlen($baseFilename) - 4);
            $baseDirname = basename(dirname(realpath($pwd . '/' . $filename)));
            $routeMatch = new RouteMatch(
                array('controller' => $baseDirname, 'action' => $baseFilename), 1
            );

            // Override standard routing:
            $routeMatch->setMatchedRouteName('default');
            $e->setRouteMatch($routeMatch);
        };
        $events = $e->getApplication()->getEventManager();
        $events->attach('route', $callback);
    }

The onBootstrap() method is called automatically on every module. Within this method, we are using the Zend Framework 2 event manager to associate a callback function with the route event. The callback function is defined as a closure.

Within the closure, we need to do two things:

1.) Figure out the directory and filename that were used to access VuFind (the idea here is that every CLI utility will simply be a wrapper that loads the core of Zend Framework with an include statement).

2.) Using this contextual information, force the router to load the appropriate controller by injecting a routeMatch object that matches the ‘default’ route defined by the core VuFind module.

As it turns out, step 1 was a little harder than anticipated. Figuring out the filename that was accessed is easy; PHP’s $_SERVER superglobal (accessible in Zend Framework through the getServer() call) contains an ‘argv’ element representing command-line parameters, and the first element of this array always contains the script name as it was passed to PHP, from which basename() extracts the base filename. The hard part is figuring out the containing directory. The __DIR__ magic constant is of no use to us, because it refers to the directory of the file in which it appears, not the top-level script run by the user. Similarly, the getcwd() function is of no help, because part of the standard ZF2 initialization sets the current working directory to a fixed location.

On many systems, PHP comes to the rescue with a $_SERVER element called ‘PWD’, which contains the directory from which the user executed PHP. The problem is that this is not present on every operating system (it is missing on Windows, for example). For lack of a more elegant solution, I eventually settled on defining a constant called CLI_DIR in my command-line scripts so that code deeper in the framework can figure out the context. Hence the code:

$pwd = $server->get('PWD', CLI_DIR);

This attempts to use the $_SERVER[‘PWD’] variable, but if it is not set, it falls back to the CLI_DIR constant. That way, the ugly workaround is only triggered when absolutely necessary.
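
The fallback logic can be tried outside the framework. This sketch uses a plain array in place of the $_SERVER wrapper and a hypothetical function name, resolveCliDir(); it mirrors the behavior described above rather than reproducing Zend Framework's implementation, and the paths are made up for illustration.

```php
<?php
/**
 * Hypothetical stand-in for $server->get('PWD', CLI_DIR): prefer the PWD
 * entry when it exists, otherwise use the supplied fallback directory.
 */
function resolveCliDir(array $server, $fallback)
{
    return isset($server['PWD']) ? $server['PWD'] : $fallback;
}

// Unix-like shell: PWD is present, so the fallback is ignored.
echo resolveCliDir(array('PWD' => '/usr/local/vufind/import'), '/fallback'), "\n";

// Windows-like environment: no PWD entry, so the fallback is used.
echo resolveCliDir(array(), 'C:\vufind\import'), "\n";
```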

Putting it All Together

As my first proof of concept, I decided to implement the import/import-xsl.php script. The code inside the script is very simple:

define('CLI_DIR', __DIR__);     // save directory name of current script
require_once __DIR__ . '/../public/index.php';

This just sets the CLI_DIR constant described above and loads the framework.

Now we just need to define a controller to respond to the request. I created a base controller with shared methods that are likely to be used by other CLI-oriented controllers (module/VuFind/src/VuFind/CLI/Controller/AbstractBase.php) and then extended that with the actual ImportController functionality (module/VuFind/src/VuFind/CLI/Controller/ImportController.php).

At this point, all the pieces are in place. When you run import-xsl.php, it loads Zend Framework. Zend Framework detects CLI mode and loads the CLI module. The CLI module overrides the router and directs the user to ImportController::importXslAction(). The controller is able to make use of all the same classes and resources as a web application, and no setup code has been duplicated anywhere.

The Rough Edges

There is one piece of the puzzle that I am not entirely happy about right now. Zend Framework 2 controllers work by building up a ViewModel or Response object and then returning that for further processing. This model does not work well in the CLI environment for two reasons:

1.) CLI tools often need to produce real-time output. Unlike a web request which gets built all at once, a CLI tool will often show incremental details as it works (“loading file 1, loading file 2, etc.”).

2.) CLI tools need to return an exit status to the operating system in order to indicate success or failure, which is critical for incorporating PHP tools into shell scripts and batch files.

Neither of these use cases is currently met through native framework features (at least as far as I can tell). For now, I am simply using “echo” and “exit” inside the controllers to achieve the desired effects, which is functional but less than ideal.
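
As a rough illustration of the pattern, the sketch below is plain PHP rather than an actual VuFind controller. A hypothetical importFiles() function echoes progress as each file is handled and returns a status code, which the calling script would pass to exit(). The rule that an unreadable file counts as a failure is an assumption made for the example.

```php
<?php
/**
 * Hypothetical example (not VuFind code): report progress as work happens
 * and return 0 on success or 1 on failure, following shell conventions.
 */
function importFiles(array $files)
{
    $status = 0;
    foreach (array_values($files) as $i => $file) {
        // Echo immediately so the user sees progress during a long run.
        echo 'loading file ', $i + 1, ': ', $file, "\n";
        if (!is_readable($file)) {
            echo 'ERROR: cannot read ', $file, "\n";
            $status = 1;
        }
    }
    return $status;
}

// In a real script the result would be handed to the operating system:
// exit(importFiles(array_slice($argv, 1)));
```

A shell script can then test the exit status in the usual way, treating any non-zero value as a failure.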

There is hope, though: the Zend Framework community is currently thinking about CLI integration, as evidenced by this Request For Comment. I’ll try to keep an eye on developments in this area, and once the framework has the capabilities we need, the existing code can be more tightly integrated into it.


4 Comments »

  1. Comment by dkatz — July 26, 2012 @ 3:10 PM

    As noted at the top of the article, the release of Zend Framework 2 RC1 this week renders this text largely obsolete. ZF2 now has native Console libraries that do some of this work.

    Since I don’t currently have time to fully investigate the new capabilities, I have put in some workarounds to allow my existing solution to continue functioning — it just amounts to moving some logic from the CLI module’s bootstrap callback into a custom router, and skipping a few other bootstrap routines when running in console mode.

    Here are the Git commits in case you are interested:

    http://vufind.git.sourceforge.net/git/gitweb.cgi?p=vufind/vufind;a=commitdiff;h=cd85deb6c2984aa6ac56a76879230e7397fe4af5

    http://vufind.git.sourceforge.net/git/gitweb.cgi?p=vufind/vufind;a=commitdiff;h=a30f235ca61c278fb3a883e11c65c522007f0b43

    However, this is just a temporary solution. I need to come up with something that’s better integrated with the framework. I’ll work on that later after some higher priority refactoring is done, and I’ll try to post something here if the subject isn’t better-documented elsewhere by that time.

  2. Comment by Curtis — August 13, 2012 @ 11:17 PM

    Demian,

    If you do update your code for the new Zend Console stuff, I’d like to see how it’s done. There’s basically zero documentation for it and I’m having a hard time figuring out how to create a cli script.

  3. Comment by Curtis — August 14, 2012 @ 12:25 AM

    I managed to get it working and I was wrong, there’s pretty good documentation provided by Zend:

    http://packages.zendframework.com/docs/latest/manual/en/modules/zend.console.controllers.html

    The use of the zf command confused me though since it’s been removed while they refactor it. It took me a while to figure out calling the index.php file after creating a console route was what I needed.

  4. Comment by dkatz — August 14, 2012 @ 8:32 AM

    Curtis,

    Glad to hear you were able to solve your own problem.

    On my end, I’ve further updated VuFind to better integrate some of the new Zend\Console features: specifically, I have replaced echo calls with Zend\Console\Console::writeLine(), and I have replaced exit() calls with return $this->getResponse()->setErrorLevel() calls. The relevant diffs are here:

    http://vufind.git.sourceforge.net/git/gitweb.cgi?p=vufind/vufind;a=commitdiff;h=3108e046d043db4ec1829d589410925bea2677a7

    I still need to think about whether to replace my custom routing solution with the new built-in console router. For the moment, I’m still using my custom router in combination with the old Zend\Console\Getopt parameter processing.

    Based on a bit of investigation, I don’t think I can easily incorporate the new router with my “individual PHP files for individual actions” approach. The best way to move forward without creating nasty hacks would probably be to replace all the individual PHP files with shell scripts and batch files that call index.php with the relevant action/controller parameters concatenated with user input. This may technically be a better solution than what we currently have, but it would break backward compatibility (i.e. people would need to change their existing automation to call scripts instead of execute PHP) and it would clutter up the directories (since cross-platform compatibility would require both batch files and shell scripts for every single action). For now, it’s a backburner item.

Last Modified: July 19, 2012
