FALVEY MEMORIAL LIBRARY




Separating Local Code Customizations in PHP

  • Posted by: Demian Katz
  • Posted Date: October 5, 2011
  • Filed Under: VuFind

Background

For the past few months, I have been working on a prototype of VuFind 2.0, a reimplementation of the software based on the Zend Framework. I’m very proud of the 1.x series of VuFind releases, and I think they stand pretty well on their own, but the software has been around long enough to begin outgrowing its initial architecture. This reimplementation is designed to clean up some long-standing messes and make the package even more developer-friendly.

One of the big issues for any open source project is figuring out how to deal with local code customizations. A major benefit of open source is that anyone can change it… but changes can come back to bite you when it comes time to upgrade. Two main strategies help alleviate this problem: use a version control system (e.g. Subversion or Git), and isolate your changes in separate files rather than changing core files whenever possible. Isolating changes is useful because, even if an upgrade breaks something, it helps you remember exactly what you customized. Version control is of obvious value: if you do have to resort to changing core modules, it helps you keep track of what you did and merge it with future developments.

VuFind already has some powerful mechanisms for isolating local changes from the core — theme inheritance makes user interface customization cleaner, a wealth of configuration file options reduces the need to change core code in many cases, and plug-in mechanisms like record drivers and recommendation modules offer hooks for inserting locally-built code. However, if you need to change some aspect of a core library class, you may still need to resort to editing core code.

The Goal

I have seen packages where you can override classes by copying a core PHP module, pasting it into a different directory, and making your changes to the copy. By taking advantage of a PHP search path that checks the “local” area prior to the “core” area, the package will then load your copy of the file in preference to the core version, allowing you to override the class. While this solution is a step in the right direction as far as avoiding the need to edit core files, it has significant drawbacks — you have to copy an entire class in order to change any one element of it, and when upgrade time comes around, chances are that you’re still going to have to do a significant amount of work to reconcile your locally-copied files with the new core. In fact, I would argue that this solution is actually worse than simply editing the core, since it makes it harder to effectively use version control software to merge changes.

As I see it, a better solution would be to find a way to extend core classes without completely overriding them — i.e. to create a child class that adjusts only the method or methods you need to change, without replacing the entire class. This would encapsulate your local changes in the most concise form possible, and while you still might have to do some reconciliation at upgrade time, good use of object-oriented principles combined with a stable application design could keep problems to a manageable minimum.

The biggest challenge to implementing this is that you run into naming problems. For obvious reasons, PHP doesn’t let you have two classes with the same name. If your core code refers to a class called VF_Search_Object and you want to change the behavior of the getResults() method without editing any other code, how can you do that? Fortunately, there is a way — it’s just a bit tricky.

The Solution

The answer to this problem relies on two key characteristics of PHP: class autoloading and dynamic code generation. With autoloading, PHP can call a function of your choosing whenever code refers to a class that has not been defined yet (for example, when you try to instantiate it). With dynamic code generation, PHP can actually create classes on the fly based on the contents of variables. The trick is to build an autoloader that detects whether or not local customizations have been made and dynamically generates a new class that derives from either the locally customized version or the original core version as needed.
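
As a minimal sketch of those two features in isolation (this is not VuFind code; the “Generated_” prefix and the class name are made up for illustration), the snippet below registers an autoload callback with PHP’s built-in spl_autoload_register() and has that callback define the requested class with eval():

```php
<?php
// Hypothetical illustration: the "Generated_" prefix is an assumption,
// not something VuFind or Zend Framework defines.
$requested = array();

spl_autoload_register(function ($class) use (&$requested) {
    // PHP calls this function when code refers to a class that is not yet defined.
    $requested[] = $class;

    // Only handle the made-up prefix, and only plain class names.
    if (strpos($class, 'Generated_') === 0
        && preg_match('/^[A-Za-z_][A-Za-z0-9_]*\z/', $class)
    ) {
        // Dynamic code generation: the class name comes from a variable.
        eval("class $class { }");
    }
});

$object = new Generated_Widget();   // triggers the callback above
echo get_class($object), "\n";      // Generated_Widget
```

The callback runs only once per class name; after the class has been defined, later references to it never reach the autoloader.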

Still sounds complicated? Fortunately, Zend Framework makes it easier with its powerful autoloader module. The Zend Autoloader gives you a great deal of control over how classes get autoloaded. It can be configured to look at different class name prefixes and load those classes from different directories… or even call different custom autoloader functions. To solve our problem, we need to set up three different class name prefixes:

Core – Any class that begins with “Core_” is core code. Users would never want to directly edit any of these files.

Local – Any class that begins with “Local_” is localized code. Normally these would only exist when a user wanted to customize some piece of functionality, and they would extend a Core_ class with the same name suffix (e.g. the “Local_Example” class would extend the “Core_Example” class).

Extensible – Whenever any code instantiates a class, it will use the “Extensible” prefix instead of “Core” or “Local”. This is how the magic happens: no PHP file on disk should contain a class whose name begins with “Extensible_”; instead, these classes are created dynamically as needed.

Here’s the code that sets this all up using the Zend Autoloader:

$autoloader = Zend_Loader_Autoloader::getInstance();
$autoloader->registerNamespace('Core_');
$autoloader->registerNamespace('Local_');
$autoloader->pushAutoloader('extensibleAutoloader', 'Extensible_');

Very simple — “Core_” and “Local_” are registered as standard namespaces within the autoloader, which means that they will be searched for on disk. “Extensible_” is registered as a special namespace that needs to trigger a custom autoloader called extensibleAutoloader. Here’s the code for that function:

/**
 * Autoloader that allows optional local classes to extend required core classes
 * seamlessly with the help of a particular namespace.
 *
 * @param string $class  Name of class to load
 * @param string $prefix Class namespace prefix
 *
 * @return void
 */
function extensibleAutoloader($class, $prefix = 'Extensible_')
{
    // Strip the class prefix off:
    $suffix = substr($class, strlen($prefix));

    // Check if a locally modified class exists; if that's not found, try to load
    // the core version.  If nothing is found, throw an exception.
    if (@class_exists('Local_' . $suffix)) {
        $base = 'Local_' . $suffix;
    } else if (@class_exists('Core_' . $suffix)) {
        $base = 'Core_' . $suffix;
    } else {
        throw new Exception('Cannot load class: ' . $class);
    }

    // Safety check -- make sure no crazy code has been injected; these have to be
    // simple class names:
    $base = preg_replace('/[^A-Za-z0-9_]/', '', $base);
    $class = preg_replace('/[^A-Za-z0-9_]/', '', $class);

    // Dynamically generate the requested class:
    eval("class $class extends $base { }");
}

As you can see, it’s actually pretty simple — extensibleAutoloader() takes advantage of the regular autoloader in combination with “class_exists” to check whether or not localized versions are available. This tells it which base class needs to be extended in order to generate the requested Extensible_ class… then it uses the eval() function to dynamically create the class.

So imagine you run this code:

$z = new Extensible_Sample();

If you haven’t created a Core_Sample or Local_Sample class, you’ll get an exception. But suppose you put this Core_Sample class into your library:

class Core_Sample
{
    public function __construct()
    {
        echo 'I am a rock.';
    }
}

Now instantiating the Extensible_Sample object will display “I am a rock.” on screen: the autoloader will find and load Core_Sample, then generate an empty Extensible_Sample class that extends it.

Let’s take it a step further and create a Local_Sample that extends Core_Sample:

class Local_Sample extends Core_Sample
{
    public function __construct()
    {
        echo 'Some people may think that ';
        parent::__construct();
    }
}

Now the Extensible_Sample object will display “Some people may think that I am a rock.” Magic!
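
To try the example without Zend Framework, here is a self-contained sketch that registers a simplified version of the autoloader through PHP’s built-in spl_autoload_register() and defines both sample classes inline (in a real installation they would live in separate files found by the regular autoloader). The function name demoExtensibleAutoloader is made up for this sketch, and unlike the function above it simply returns when nothing is found rather than throwing an exception. The last two lines confirm that the generated class really is a descendant of both the local and the core class:

```php
<?php
// Sketch only: a framework-free variant of extensibleAutoloader(). The function
// name and the inline class definitions are assumptions made for this demo.
function demoExtensibleAutoloader($class)
{
    $prefix = 'Extensible_';
    // Ignore anything outside the Extensible_ prefix or with unexpected characters.
    if (strpos($class, $prefix) !== 0 || !preg_match('/^[A-Za-z0-9_]+\z/', $class)) {
        return;
    }
    $suffix = substr($class, strlen($prefix));

    // Prefer the local customization; fall back to the core class.
    foreach (array('Local_', 'Core_') as $candidate) {
        if (class_exists($candidate . $suffix)) {
            eval("class $class extends {$candidate}{$suffix} { }");
            return;
        }
    }
    // Nothing found: return and let PHP report the missing class.
}
spl_autoload_register('demoExtensibleAutoloader');

class Core_Sample
{
    public function __construct()
    {
        echo 'I am a rock.';
    }
}

class Local_Sample extends Core_Sample
{
    public function __construct()
    {
        echo 'Some people may think that ';
        parent::__construct();
    }
}

$z = new Extensible_Sample();            // Some people may think that I am a rock.
echo "\n", get_parent_class($z), "\n";   // Local_Sample
var_dump($z instanceof Core_Sample);     // bool(true)
```

Removing the Local_Sample definition makes the same script print only “I am a rock.” and report Core_Sample as the parent class.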

Conclusions

I’m very happy to see that it is actually possible to achieve this effect — it’s something that I’ve been thinking about for a long time, and I’m happy I was able to make it work. That being said, I’m not sure if it’s worth the effort. I see three major drawbacks:

- This is a powerful mechanism for extending code IF YOU UNDERSTAND IT. But it increases the learning curve for getting into the codebase, since at a glance it will be very confusing to see all these references to Extensible_* classes that don’t actually exist on disk.
- All of the autoloading involved in the solution adds some overhead to the code. I haven’t done testing to see how significant the overhead actually is… but without some kind of caching or PHP acceleration, I have a feeling it might turn out to be somewhat expensive.
- The eval() function is one of the most dangerous features in PHP, since it provides an opportunity for attackers to execute arbitrary code. I believe that the way I’m using it here is safe (especially with the extra regex cleanup I’ve added), but it nonetheless makes me a little nervous.
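
One way to reduce the eval() risk a little further is to validate the class name before doing anything else, and to reject a suspicious name outright rather than silently repairing it. A rough sketch, assuming a hypothetical helper named assertSimpleClassName() that is not part of VuFind:

```php
<?php
// Hypothetical helper (not part of VuFind): refuse to continue unless the
// name looks like a plain, non-namespaced PHP class name.
function assertSimpleClassName($name)
{
    if (!is_string($name) || !preg_match('/^[A-Za-z_][A-Za-z0-9_]*\z/', $name)) {
        throw new InvalidArgumentException('Unsafe class name rejected');
    }
    return $name;
}

// A normal name passes through unchanged.
echo assertSimpleClassName('Extensible_Sample'), "\n";

// Anything that could smuggle in extra code is rejected before eval() is reached.
try {
    assertSimpleClassName('Extensible_Sample { } echo "injected"; class X');
} catch (InvalidArgumentException $e) {
    echo 'Rejected: ', $e->getMessage(), "\n";
}
```

Calling a check like this on $class at the very top of extensibleAutoloader() would mean that both the class_exists() lookups and the eval() call only ever see validated names.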

I would love to hear what other people think of this — is the solution technically sound? Is the benefit worth the cost? At this point, I’m not necessarily committed to implementing this as part of VuFind 2.0 (and obviously the namespaces won’t be “Core_”, “Local_” and “Extensible_” if I eventually do). It could be done, though, and I think it’s worth considering. All feedback is welcome!


8 Comments »

  1. Comment by Jonathan Rochkind — October 6, 2011 @ 1:57 pm

    I guess the other alternative is a more explicit ‘dependency injection’ design, where almost every class name used in code is contained in a variable, and you can change that variable to tell it to use your local version by name. Where your local version would usually be a sub-class of the core version, I’d figure.

    Man, I wouldn’t want to try and implement that in PHP either. PHP’s OO facilities are… not my preference to work in.

    You might want to check out Xerxes (the Metalib front-end). Xerxes does a pretty good job of keeping local and core code separate, but it is much simpler and less flexible software. It basically lets you specify an ‘action’ class for each action (basically each URL path endpoint) in an XML config file, where by default they are the core ones, but you could use your custom ones. But there’s no way to substitute PHP object classes more granularly than that.

    Xerxes’s ‘view’ layer, for better or worse is all XSLT, and it’s set up so your local XSLT files are loaded in dynamically on top of the core ones, such that you can over-ride the XSLT on a template-by-template basis. (That’s an individual XSLT template; designing the core XSLT to put these templates at the right granularity is the trick).

  2. Comment by Demian Katz — October 6, 2011 @ 2:52 pm

    I had originally thought about the “class name as variable” approach you suggest — in fact, Zend Framework provides plugin management tools that will return the most appropriate class name given certain parameters and which might help with that sort of implementation.

    Two things made me eventually reject that idea, though:

    1.) It puts its footprints all over the code — lots of extra variables everywhere, and potentially extra calls to populate those variables every time you try to construct an object. The approach detailed in this post is a lot more seamless.

    2.) It doesn’t work well when the core code already has class hierarchies in it. In PHP, you can’t instantiate classes using variable names without using the eval() function (unless I’m missing something). This means that there’s no easy way to make a subclass extend a variable parent class. Since VuFind has a lot of base classes that people might want to modify without necessarily changing their subclasses, I don’t think the variable approach is workable. However, again, the approach detailed in this post works around that problem, as long as your classes extend parent classes from the “Extensible_” namespace rather than the “Core_” namespace.

    I can see how that approach might be appropriate in a less complex project, though!

  3. Comment by Jonathan Rochkind — October 6, 2011 @ 7:54 pm

    If you can’t instantiate a class with a name in a string without using ‘eval’, then that would for me eliminate that technique right off the bat, so there you go.

    I think how you’ve done it is probably the neatest way to do it in PHP.

  4. Comment by Luke O'Sullivan — October 7, 2011 @ 6:03 am

    Thanks for this post Demian – I think it explains the proposed changes very well. However complex the “backend” of the changes are, what a system administrator / programmer will have to do is actually relatively simple – create a custom class / function. This is exactly what they have been doing for VuFind 1.x.

    The way that the record drivers and ILS drivers currently work is relatively straightforward when it comes to modifying methods, but here we have the option of naming the new PHP class we wish to call. It would be pretty tedious if we had to do that for every custom class!

    Of course, the benefits of the change are much cleaner and non-conflicting code so in essence, it’s a no-brainer.

    As you point out and as Jonathan says, I see the code eval() and I immediately think of hell and damnation but as long as one considers the advice given by Uncle Ben to Peter Parker “With great power comes great responsibility”, I’m sure I can get over that.

  5. Comment by Demian Katz — October 7, 2011 @ 8:30 am

    Just to be sure we’re on the same page, you can instantiate an object of a class using a variable with no problem:

    $class = 'MyClass';
    $object = new $class();

    …but you can’t define a class with a variable:

    $parent = 'MyClass';
    class SubClass extends $parent { /* … */ }

    It’s the latter case that gets in the way for VuFind… but if you don’t need to do that, you can probably get away with using variables.

  6. Comment by Bill Dueber — October 8, 2011 @ 11:35 am

    PHP, unfortunately, makes it hard to keep stuff in configuration and out of code. You could, however, have a system where (a) the same empty class is always loaded, but (b) users are expected to change where that class inherits from.

    So:

    $a = new TheInstantiatedClass(); # in code

    …where the class def looks like

    class TheInstantiatedClass extends TheCoreClass
    {
    }

    If you want to override TheCoreClass, the rule is that you do it by changing the “extends” clause and allow the auto-load stuff you explained above to take care of finding it.

    It’s more of a BFMI (Brute Force, Massive Ignorance) approach, but has simplicity on its side and doesn’t mess around with what PHP calls metaprogramming. The downside is that it makes it impossible to use both unmodified and modified classes at once.

  7. Comment by Tulie Amichal — October 11, 2011 @ 10:24 am

    I like the new direction. Having the option to extend the core classes selectively is a great direction.

  8. Comment by Demian Katz — March 14, 2013 @ 3:32 pm

    An update — while I am not using this approach in VuFind 2.0 (the Zend Framework “service locator” mechanism provides greater flexibility), in case anyone else finds it useful in another context, keep in mind that PHP 5.3+ adds a class_alias() function which can be used to map one class to multiple names without resorting to the ugly eval() hack I originally proposed.
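
The class_alias() alternative mentioned in the last comment could be sketched roughly as follows. This is an illustration under assumptions rather than code from VuFind: the function name aliasingAutoloader and the Core_Greeter class are made up, and the lookup is a simplified, framework-free version of the one in extensibleAutoloader().

```php
<?php
// Sketch only: the same Local_/Core_ lookup as extensibleAutoloader(), but using
// class_alias() (PHP 5.3+) instead of eval(). The function name is made up.
function aliasingAutoloader($class)
{
    $prefix = 'Extensible_';
    if (strpos($class, $prefix) !== 0) {
        return;
    }
    $suffix = substr($class, strlen($prefix));
    foreach (array('Local_', 'Core_') as $candidate) {
        if (class_exists($candidate . $suffix)) {
            // Map the requested name onto the existing class; no code is generated.
            class_alias($candidate . $suffix, $class);
            return;
        }
    }
}
spl_autoload_register('aliasingAutoloader');

class Core_Greeter
{
    public function greet()
    {
        return 'Hello from core';
    }
}

$greeter = new Extensible_Greeter();
echo $greeter->greet(), "\n";     // Hello from core
echo get_class($greeter), "\n";   // Core_Greeter
```

One difference from the eval() version is visible in the last line: an alias is a second name for the same class rather than a new subclass, so get_class() reports the underlying class name.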


Last Modified: October 5, 2011