Autoloading (Revisited)

September 19th, 2011 by Ralph Schindler

Upon the arrival of PHP 5.0, the ability to autoload classes was introduced. At the time, autoloading was such a new feature that it was hardly adopted. Many applications being ported from PHP 4 to PHP 5 still had lots of procedural code in them (code incapable of being autoloaded) and many class files with long ‘require_once’ lists. It wasn’t until years later that certain best practices emerged and the prolific usage of require_once/include_once throughout large bodies of code started drying up. Even after autoloading had been adopted by larger, more visible projects, a common pattern had yet to emerge. The PEAR project already had its one-class-per-file rule and a class-to-filesystem naming convention, but this was hardly the rule at the time, and as such, there were many different autoloading strategies.

As time passed, more and more projects went through rewrites, and the strategy that most projects landed on was the one that came from the PEAR group. Fast-forward to today, and we see that this standard for autoloading has been agreed upon by a large number of projects and has come to be named the “PSR-0 autoloading standard”.

What We’ve Learned

After having attained consistency (for code) in how we utilize autoloading, we’ve attempted to find the most efficient and performance-optimized way of executing our autoloading strategy. Matthew Weier O’Phinney has blogged about this in the past; it’s a good read if you have not already read it. To summarize, he found the following things to be true:

  • disk based class name to filesystem location maps are the fastest lookups
  • class filesystem paths that are absolute and that do not rely on include_path are fastest
  • lightweight autoload functions that utilize class maps directly are the fastest

For more information about the above generalization, see Matthew’s blog post.

Nearly a year ago, in conjunction with his findings, Matthew also wrote a classmap generation tool. This tool produced a .classmap.php file that would reside in the directory responsible for containing class files. The general idea here is that a developer could utilize an automatic mapping-based autoloader, like the PSR-0 autoloader, or could utilize this .classmap.php file in order to build a more performance-centric autoloading strategy.

This approach presents developers with two primary problems. One, dot files are generally hidden on a filesystem, which means that this PHP data array is part of a code path that is hidden from most developers’ view of the codebase. This then leads to moments of confusion when something related to the location of classes goes awry. The second problem is that this strategy assumes that the consumer has some way of consuming the contents of this class-map file. ZF users could utilize one of the shipped Zend\Loader classes that are designed to use a class-map. The problem here is not necessarily for ZF users, but that it promotes a strategy that is more ZF-specific than generic in nature.

The addition of, and swift adoption of PHP’s namespace support in PHP 5.3 has also presented us with both a platform for standardization as well as a few challenges. Traditionally, when we thought of the PEAR naming convention, we assumed that for a given class (in prefix notation) Alpha_Beta_Gamma, there would be a single mapping of this class to a single place on the filesystem, namely: some/path/Alpha/Beta/Gamma.php. This inherently presents no problems. What does present a problem is if we have another project that utilizes part of this prefix, but in a different location. Assume that you want to use part of the prefix, for example, the Alpha_Beta_ portion, with a different logical component/module/project within your organization. In this case, it might make sense that class Alpha_Beta_Gamma live in one project on disk, and that Alpha_Beta_Omega live somewhere completely different. Any number of situations could realistically present this problem, but the most apparent is that your organization wants to utilize a naming scheme that allows for MyCompany_MyDivisionWithinMyCompany_PerhapsSomeLogicalComponent_ClassName.
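
The one-class-to-one-path assumption described above can be sketched in a few lines. This is only an illustration: the function name pearClassToPath() and the base directory are assumptions, not part of any library mentioned here.

```php
<?php
// Hypothetical helper: map a PEAR-style (prefix notation) class name to a
// single file path below one base directory.
function pearClassToPath($class, $baseDir)
{
    // every underscore in the class name becomes a directory separator
    return rtrim($baseDir, '/') . '/' . str_replace('_', '/', $class) . '.php';
}

echo pearClassToPath('Alpha_Beta_Gamma', 'some/path'), PHP_EOL;
echo pearClassToPath('Alpha_Beta_Omega', 'some/path'), PHP_EOL;
```

With a rule like this, both classes necessarily land under the same base directory, which is exactly the limitation when Alpha_Beta_Omega needs to live in a different project.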

In any of the likely scenarios above, a simple mapping rule that governs class name to filesystem location for one class will not work for another class that could conceivably live within the same project, at least not without some kind of autoloader filter or filesystem munging. Either way, we can no longer assume that a simple mapping of class name to one location on disk will suffice.

More and more, we are seeing this pattern emerge (this time with namespaces):

namespace VendorName\ComponentName {

    class SomeComponentClass {

    }

}

This class is then found inside its own logical project, with its own data files, web files, or test files in a project structure that looks similar to this:

path/to/VendorName_Component/
    src/
        VendorName/
            ComponentName/
                SomeComponentClass.php
    data/
        some-data-file.txt
    tests/
        phpunit.xml
        phpunit-bootstrap.php
        VendorName/
            ComponentName/
                SomeComponentClassTest.php
    docs/
        some-documentation-format.xml
    README.md

As you can imagine, any one vendor/organization that is in the business of building software will more than likely have more than one project that both utilizes this kind of naming scheme and takes advantage of this project structure for developing and releasing code. This being the case, unless the project is merged with other code for the purposes of a consuming project, parts of the namespace will exist in two separate parts of the filesystem … something which a specialized autoloader will need to take into consideration.
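
One way such a specialized autoloader could take this into consideration is to map namespace prefixes to source directories and let the most specific prefix win. The sketch below is an assumption for illustration only: resolveClassFile(), the second component name and both paths are made up.

```php
<?php
// Hypothetical resolver: namespace prefixes are mapped to source directories,
// and the longest matching prefix wins.
function resolveClassFile($class, array $prefixDirs)
{
    // sort prefixes longest first so the most specific one is tried first
    uksort($prefixDirs, function ($a, $b) {
        return strlen($b) - strlen($a);
    });
    foreach ($prefixDirs as $prefix => $dir) {
        if (strpos($class, $prefix) === 0) {
            // the full class name is laid out below the source directory
            return $dir . '/' . str_replace('\\', '/', $class) . '.php';
        }
    }
    return false; // no prefix matched
}

$prefixDirs = array(
    'VendorName\\ComponentName\\'  => 'path/to/VendorName_Component/src',
    'VendorName\\OtherComponent\\' => 'path/to/VendorName_Other/src',
);
echo resolveClassFile('VendorName\\ComponentName\\SomeComponentClass', $prefixDirs), PHP_EOL;
```

Two classes that share the VendorName prefix can then resolve to entirely different project directories.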

Ideally, we should find a solution that will present class-map based autoloading in a way that is an easily identifiable code pattern, simple, expressive, works well with common development practices and takes advantage of the current-day PHP platform (namespaces and autoloading facilities).

And, What I’ve Found Is This …

And, what I’ve found is that projects should present a few different options as per how they provide an “out-of-the-box” experience as it relates to autoloading. Such a solution should offer the consumer a usage story that consists of the most minimal of requirements when it comes to bootstrapping this 3rd party code. Let’s examine the following project structure (expanded from our example above):

path/to/VendorName_Component/
    src/
        VendorName/
            ComponentName/
                SomeComponentClass.php
    data/
        some-data-file.txt
    tests/
        phpunit.xml
        phpunit-bootstrap.php
        VendorName/
            ComponentName/
                SomeComponentClassTest.php
    docs/
        some-documentation-format.xml
    autoload_classmap.php
    autoload_function.php
    autoload_register.php
    README.md

What you’ll notice is the addition of 3 autoload_*.php files. Let’s have a look at what these files provide and the reasons for their existence. First the autoload_classmap.php:

<?php
return array(
    'VendorName\ComponentName\SomeComponentClass' => __DIR__ . '/src/VendorName/ComponentName/SomeComponentClass.php',
    /* .. other classes here .. */
);

This file provides the exact map of the classname to the location on disk that this class can be found in. This file takes advantage of PHP’s ability to have return values returned from the inclusion of a file. A simple usage story for this file might be:

<?php
// ...
$classmapAutoloader = new MyClassMapAutoloader();
$classmapAutoloader->loadClassMap(include __DIR__ . '/vendors/VendorName_Component-1.5/autoload_classmap.php');
// ...
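
MyClassMapAutoloader is not defined anywhere in this article. A minimal sketch of what such a class might look like follows; everything beyond the loadClassMap() name used above (getFile(), autoload(), register()) is an assumption.

```php
<?php
// Hypothetical class-map autoloader; only loadClassMap() is named in the
// usage example, the rest of the API is assumed for illustration.
class MyClassMapAutoloader
{
    protected $map = array();

    // merge a class => file array, e.g. the return value of autoload_classmap.php
    public function loadClassMap(array $map)
    {
        $this->map = array_merge($this->map, $map);
        return $this;
    }

    // return the mapped file, or false when the class is unknown
    public function getFile($class)
    {
        return isset($this->map[$class]) ? $this->map[$class] : false;
    }

    // autoload callback: include the mapped file if there is one
    public function autoload($class)
    {
        $file = $this->getFile($class);
        if ($file === false) {
            return false; // let the next autoloader in the queue try
        }
        return include_once $file;
    }

    public function register()
    {
        spl_autoload_register(array($this, 'autoload'));
    }
}
```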

Let’s next look at the autoload_function.php file:

<?php
return function ($class) {
    static $classmap = null;
    if ($classmap === null) {
        $classmap = include __DIR__ . '/autoload_classmap.php';
    }
    if (!isset($classmap[$class])) {
        return false;
    }
    return include_once $classmap[$class];
};

This file provides a closure based autoloader as its return value. This function can then be used by the consumer directly for injecting into their own autoloader stack/queue, or directly into the autoloader queue provided by PHP:

<?php
// ...
spl_autoload_register(include __DIR__ . '/vendors/VendorName_Component-1.5/autoload_function.php');
// ... or ...
$autoloader = new MyFancyAutoloader();
$autoloader->registerAutoloaderFunction(include __DIR__ . '/vendors/VendorName_Component-1.5/autoload_function.php');

Either way, the consumer is provided with a callback that is capable of being utilized, in a single line, to bootstrap this component’s autoloading needs.

Finally, the complete, one-line solution can be found by utilizing autoload_register.php directly:

<?php
// autoload_register.php
spl_autoload_register(include __DIR__ . '/autoload_function.php');

While the above is so trivial as to ask why it should be included, it does offer a single-line usage story:

<?php
// ...
require_once __DIR__ . '/vendors/VendorName_Component-1.5/autoload_register.php';

Why not do this in the first place? Well, this approach assumes the consumer does not necessarily care about how the autoload function is loaded into PHP’s spl_autoload queue. One thing to keep in mind is that when spl_autoload_register() is called, autoloaders are placed at the end of the queue by default. This behavior can be changed by passing true as the 3rd parameter of spl_autoload_register(). This type of performance optimization might be important when you know some autoload-able code will be utilized more often than other code, and thus you want the autoloader for that code to be consulted first. Another reason for this kind of user registration is that some autoloaders might be so generic as to want to act as a fallback autoloader. For these kinds of autoloaders, it is important that they always be last in the queue, since they might throw an error or exception when they cannot find a class, as opposed to returning false and letting other autoloaders have an attempt at finding the requested class.
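
To make the queue ordering concrete, here is a small sketch with two placeholder autoloaders that only record that they were consulted. The variable names and the looked-up class name are made up for the example.

```php
<?php
// Two placeholder autoloaders that only record the order they run in.
$calls = array();
$generic = function ($class) use (&$calls) {
    $calls[] = 'generic';
    return false; // not found, let the next autoloader try
};
$hotPath = function ($class) use (&$calls) {
    $calls[] = 'hotPath';
    return false;
};

spl_autoload_register($generic);             // appended to the end of the queue
spl_autoload_register($hotPath, true, true); // 3rd parameter: prepend to the queue

// looking up a class that does not exist consults the queue in order
class_exists('Some\\Unknown\\ClassName');
echo implode(', ', $calls), PHP_EOL;
```

Even though $hotPath was registered second, it is consulted first because it was prepended.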

Conclusion

The above-mentioned strategy is something to be considered if you are creating reusable PHP components that you wish to provide, perhaps as Pyrus packages and/or as PHP phar archives, for 3rd party consumption. This autoloading strategy provides an out-of-the-box usability experience in a minimal amount of code. It also plays nicely with other autoloaders, provides a solution that is opcode cacheable, and, since it utilizes absolute paths (via __DIR__), minimizes the number of stat() calls to the filesystem your application will generate during its runtime.

PHP Component and Library API Design Overview

January 18th, 2011 by Ralph Schindler

There’s been lots of change in the PHP community over the past few years. PHP now has namespaces. More PHP developers are using an IDE. More PHP developers are pulling inspiration from the Java, C#/.NET, and Ruby communities. And even more PHP developers are embracing the object-oriented and, ironically, the functional nature (closures) of PHP. All these changes make for interesting code. What has also happened is that better and more readable code is being produced by this ever-growing PHP community. It’s been a long time since “PHP application” meant a series of transaction scripts as a mix of SQL, CSS, JS, with some PHP sprinkled in, and a few classes for good measure. Of course, that still exists, but you no longer need to go to the ends of the earth to find non-spaghetti code that is understandable within a few minutes.

For the most part, all of these changes are good changes. The number of good/senior/expert level PHP developers is ever increasing and there are more and more “enterprise grade” frameworks and libraries being produced. That said, with all of these new changes, the one area which is still fairly inconsistent from project to project is the naming conventions that are employed inside PHP 5.3 projects that utilize namespaces. This article will attempt to describe what an API is, how names and object-oriented features affect an API, and how various decisions affect the consumers of a particular API.

What Is An API?

Before we jump into naming, it’s important to have a common understanding of the actual problem area. When we talk about names, we are really talking about the API. An API is a particular set of rules and specifications that a developer can follow to access and make use of the services and resources provided by another particular software program, component or library. Put another way, it is an interface between various software pieces and facilitates their interaction, similar to the way the user interface facilitates interaction between humans and computers.

For PHP 4 / procedural based libraries, the API is defined by the functions that are declared for usage in that library. It is further described by the global names and global state that the library utilizes to do its job. Typically, APIs based on purely function based libraries are far simpler to understand.

Object-oriented APIs are a bit more complex. When you build an object-oriented library or component, you are typically designing two APIs at the same time, whether or not you know it. This is the nature of object-oriented languages when you employ the use of abstract classes and interfaces in your design.

The first API, the more common of the two, I call the Consumption API. This is the API that answers the question: “how do people consume this thing.” The answer to this question is generally situated around the great majority of use cases that were identified by the author of the software component/library. In PHP, consumption might look like this:

$foo = new SomeCompany\FooComponent\FooComponent($options);
$foo->setAdapter(new SomeCompany\FooComponent\Adapter\SomeAdapter($adapterOptions));
$interestingResult = $foo->doSomethingInteresting();

As you can see, no declarative code was required to fulfill the most common use case that was identified as a need for this component’s existence. The above API is defined by the totality of all the public (concrete) classes, their public properties and public methods. By examining these elements, a good API design should allow a developer to deduce how the component works without examining any documentation. When that is possible, the API has become the documentation as well as the “story” behind how the component/library is to be used.

Not all use cases are accounted for in generic components and generic libraries. As developers, we attempt to create generic libraries and components that will solve the majority of problems of the majority of the community. We cannot envision all use cases or even edge cases behind a particular component. That does not mean, though, that the outlying use cases are unimportant or should be unaccounted for. These use cases are handled by the secondary API: the “Extension API”.

The Extension API answers the question: “since this component does 90% of what I want, how can I extend it to fulfill the last few of my needs.” Clearly, it makes sense to leverage tools that do most of what you need especially if they can be extended in ways that are outside of the out-of-the-box feature-set. Object-oriented/class based code is particularly well suited to extension through the principle of overriding polymorphism.

The primary tool behind overriding polymorphism is method overriding. For this to be possible, base types, or the types that are shipped with the component/library you are extending, will be overridden to fulfill this new behavior that is your specialized use case. Consider the following code example:

namespace MyCompany\FooComponent\Adapter; // My Component
use SomeCompany\FooComponent\Adapter\SomeAdapter; // Consumed Component

// extend the provided Component with my special use case
class MyAdapter extends SomeAdapter
{
    protected function _someWorkToBeDone()
    {
        // do something special that fulfills our use case
        return parent::_someWorkToBeDone();  // protected method on parent class
    }
}

As you can see here, we’ve extended the functionality of the base adapter from the shipped component/library with our own functionality. This is possible since the base adapter tucked away the business logic we needed to alter inside a protected method. This is what allows us to rely on overriding polymorphism to extend code to suit more specific needs. This “Extension API” can therefore be defined by the totality of all protected members of a class: methods and properties that can be utilized in child classes. These protected methods are not all that important or even useful in the documented and de-facto use cases of a component, but become extremely important when extending.
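
The base adapter itself is not shown in this article. As a self-contained sketch of how the two APIs sit side by side, the classes below stand in for it; doWork() and the method bodies are assumptions, and namespaces are left out to keep the example in one file.

```php
<?php
// Hypothetical base adapter, standing in for the shipped component's class.
class SomeAdapter
{
    // Consumption API: the public entry point consumers call
    public function doWork()
    {
        return strtoupper($this->_someWorkToBeDone());
    }

    // Extension API: the overridable step tucked away in a protected method
    protected function _someWorkToBeDone()
    {
        return 'base work';
    }
}

// Our specialized use case, built on the Extension API
class MyAdapter extends SomeAdapter
{
    protected function _someWorkToBeDone()
    {
        return 'special + ' . parent::_someWorkToBeDone();
    }
}

$adapter = new MyAdapter();
echo $adapter->doWork(), PHP_EOL;
```

The consumer still calls only doWork(); the specialization happens entirely through the protected method.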

API Philosophy

It’s hard to quantify the importance of any one aspect of a codebase’s API over another without first talking about the general philosophy. In the land of a 1000 frameworks and libraries, being well written or poorly written divides the great majority of them. Of what is left of the (generally regarded) well-written ones, philosophy divides the rest.

There exist two common philosophical “goals” that most libraries/components generally subscribe to that, depending on your perspective, might be contradictory. For argument’s sake, let’s assume that each is as important as the other. The first: “easy to use”. A component’s likability among developers is greatly determined by how easy it is to use, whether it’s intuitive, and whether it fulfills the majority of one’s needs. The other: “easy to extend”. The majority of the time, a component is written for some well-known use cases. Generally, that will suit the majority of the needs of any one developer, but there are always some unknown use cases. A component’s ability to deliver a mostly working solution while allowing the developer to extend it for the unknown is what determines how easy it is to extend said component.

More often than not, ease of use and extensibility live at two ends of the spectrum. Things that are easy to use are generally hard to extend, and things that are simple to extend are generally harder to use. This is the case because to accommodate one usually comes at the expense of the other.

Getting back to philosophy and the example at hand, ease of use and extensibility are equally important. The goal, in terms of API design, is to accommodate each equally and strike a balance between the two so that each goal is represented in the API.

Basic Tips And Tricks For Better APIs

The list of tips and tricks for building better component APIs could get fairly long, so this article will attempt to cover some of the more “basic” ideas.

Adopt A Common Namespace & Class Naming Scheme

While it is true that the PHP platform has no built-in packaging, or file based import mechanism… the PHP autoloader with the help of some common conventions can get you 99% of the way there. Large projects like Zend Framework, Symfony, PHPUnit, and PEAR have all settled on a pretty simple and common naming scheme based on the PEAR naming standards. By utilizing this naming scheme, your code will be instantly familiar to developers who already have knowledge of this scheme in other projects. The benefit here is that developers will know exactly where to find classes inside the filesystem.

namespace MyCompany\MyComponent;
class Foo {
    // will be found relative to the include_path, or some path
    // managed by an autoloader at
    // MyCompany/MyComponent/Foo.php, pretty simple eh?
}

Avoid Doing Too Much In the Constructor

There’s lots of places on the web that discuss this, so I’ll link to them here and not go into too much detail. I’ve seen it called a “unified constructor”, but that’s not what we are talking about here, or at least, that is not the goal. The goal is to allow the consumer to give as much or as little information about the identity of the object at instantiation time. The common signature that I like for this is the following:

class Foo
{
    public function __construct($options = null)
    {
        if (is_array($options)) {
            $this->setOptions($options);
        } elseif (is_string($options)) {
            $this->setValueThatIsDocumentAndWellKnown($options);
        }
    }
}

Generally, the call to setOptions() will in turn call various setters if they exist. What is important is that at construction/instantiation time a consumer is not required to fulfill all of the class’s requirements. Why is this important? It reverses the order in which dependencies are required to be interacted with. Let’s examine this in code:

// Example 1
// assuming: class Foo { __construct(A $a, B $b, C $c) {} }
$a = new A($aOption1, $aOption2);
$b = new B();
$c = new C($cOption, $a);

$foo = new Foo($a, $b, $c); // and finally
$foo->doSomethingInteresting();

/** OR ALTERNATIVELY **/

// Example 2
// assuming: class Foo { __construct($options = null) {} }
$foo = new Foo(array(
    'a' => ($a = new A($aOption1, $aOption2)),
    'b' => new B(),
    'c' => new C($cOption, $a)
    ));
$foo->doSomethingInteresting();

// Example 3
// or better:
$foo = new Foo();
$a = new A($aOption1, $aOption2);
$foo->setA($a)
    ->setB(new B())
    ->setC(new C($cOption, $a));
$foo->doSomethingInteresting();

The difference is that in Example 1, even though our target use case is handled by class Foo, we are forced to interact with the dependencies first. Conversely, examples 2 and 3 show that our target object Foo is created up front, and dependencies are handled after instantiation. If code clarity is a goal, reading the code top down in examples 2 and 3 makes more sense than in example 1, since the API has allowed the developer to code the use case in a top-down or story-like code block. Why do I like this pattern of usage? Simple: it highlights PHP’s loose nature and flexibility in its use case… but mostly because it’s more readable.
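
The setOptions() method mentioned earlier is not shown in this article. One way it could dispatch option keys to setters is sketched below; the “set” + ucfirst(key) naming convention and the getters are assumptions made for the example.

```php
<?php
// Sketch of an options-aware class; the setter naming convention used by
// setOptions() is an assumption, not code from this article.
class Foo
{
    protected $a;
    protected $b;

    public function __construct($options = null)
    {
        if (is_array($options)) {
            $this->setOptions($options);
        }
    }

    // call setA(), setB(), ... for each option key that has a matching setter
    public function setOptions(array $options)
    {
        foreach ($options as $name => $value) {
            $setter = 'set' . ucfirst($name);
            if (method_exists($this, $setter)) {
                $this->$setter($value);
            }
        }
        return $this;
    }

    public function setA($a) { $this->a = $a; return $this; }
    public function setB($b) { $this->b = $b; return $this; }
    public function getA() { return $this->a; }
    public function getB() { return $this->b; }
}

// options with no matching setter are silently skipped in this sketch
$foo = new Foo(array('a' => 1, 'unknown' => 'ignored'));
$foo->setB(2);
```

Because the setters return $this, the same class supports both the array style of Example 2 and the fluent style of Example 3.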

Avoid final And private

This one speaks to extensibility. Unless you are attempting to restrict a user from utilizing some kind of use case, there is little gain in marking members as final or private. Sooner or later, someone somewhere will need to override a method you’ve implemented for some obscure use case. A better approach is to provide them with a codebase that will meet most of their needs and can be extended to fulfill the rest if they are outside the original scope. That way, they are not forced to patch your codebase.
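
A small sketch of why private gets in the way of extension: a child class cannot replace a private step that the parent calls, while a protected step can be replaced. All class names here are made up for the illustration.

```php
<?php
// Hypothetical classes showing how visibility affects overriding.
class PrivateBase
{
    public function run() { return $this->step(); }
    private function step() { return 'base'; }
}

class PrivateChild extends PrivateBase
{
    // not an override: run() is defined in PrivateBase, so its call to
    // step() still resolves to PrivateBase's own private method
    private function step() { return 'child'; }
}

class ProtectedBase
{
    public function run() { return $this->step(); }
    protected function step() { return 'base'; }
}

class ProtectedChild extends ProtectedBase
{
    // a real override: run() now picks up this implementation
    protected function step() { return 'child'; }
}

$private = new PrivateChild();
$protected = new ProtectedChild();
echo $private->run(), PHP_EOL;
echo $protected->run(), PHP_EOL;
```

The consumer extending PrivateBase would have to copy or patch run() to change its behavior, which is the situation this tip is meant to avoid.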

Summary

This is by no means an exhaustive list. As more of the larger projects move to using namespaces, closures and the other PHP 5.3 features, we’ll start to see a few more best practices emerge as they relate to API design. In the meantime, this overview will serve as a springboard for a few discussions on API design moving forward with the ZF2 and PHP 5.3 component development that is currently ongoing.

Composite Rowsets For Many-To-Many Relationships Via Zend_Db_Table

November 15th, 2010 by Ralph Schindler

One of the hardest problems to solve when developing an ORM of any complexity is deciding how to handle the retrieval of rows that satisfy a many-to-many relationship, also known as an M:N relationship. From the perspective of an object, there is no such thing as a many-to-many relationship. There are only two relationships an object understands. The first is the relationship of itself to another object, which is a one-to-one (1:1) relationship. The second is the relationship of itself to a group of other objects, or a one-to-many (1:N) relationship. It’s not until you look at the relationship of all objects in a system that the many-to-many relationship pattern emerges.

In RDBM systems, rows and their relationships are modeled through the use of foreign keys and foreign key constraints between a left table and a right table. Foreign key constraints, by themselves, can only model 1:1 and 1:N relationships of rows. To model M:N relationships, database developers must get creative. By employing the use of a “3rd party”, and by utilizing foreign keys that model a 1:N relationship, database developers can model an M:N relationship. This 3rd party comes in the form of another table that may or may not have any data model specific information attached to it. This table is generally known as a junction table, but has also been known as a cross-reference table, bridge table, join table, map table, intersection table, linking table, many-to-many resolver, link table, or association table.

Zend_Db_Table_Row And Junction Tables

Zend_Db_Table is a component in Zend Framework that implements the Table Data and Row Data Gateway patterns. In short, a row object attempts to create a single PHP object per actual row in the database table. Furthermore, Zend_Db_Table_Row objects can go as far as to describe, understand, and interrogate these various 1:1, 1:N and M:N relationships. This allows row objects to be able to find and return related row objects in the form of a rowset.

One of the primary tenets of Zend_Db_Table and Zend_Db_Table_Row is to be able to produce consistent row objects. This means that the properties of these row objects should be a complete and logical representation of how the row might look inside the table of the RDBMS.

Some time ago an issue (ZF-6232) was filed against Zend_Db_Table to report that columns from the junction table were being included in the resulting rowset’s row objects. This was causing issues for people who then attempted to save() the row object back to the database. If a developer mistakenly altered one of the junction table values that was accidentally included in the row, Zend_Db_Table_Row would throw an exception since the row object had more columns than the actual row in the database. Given that we want to create consistent, complete and logical row objects, a solution was devised to ensure that the junction table’s row information was not included in the resulting rowset’s rows. Consequently, this meant that anyone relying on this undocumented behavior would no longer be able to get data stored inside the junction table as part of the result set’s row object. This fix was incorporated into the 1.10.2 release.

Over the past several years of working on Zend Framework, I’ve noticed the developer population at large is really good at finding undocumented and previously unthought-of use-cases of Zend Framework components. These use-cases, while sometimes “inventive” to say the least, are also sometimes blatant misuses of a component. Suffice it to say that these use-cases are not captured in a unit test and consequently are not protected by backwards compatibility.

Relying on Zend_Db_Table_Row to include junction data is not only an unintended use case but also a misuse of the findManyToManyRowset() functionality provided by Zend_Db_Table_Row. That said, I do want to provide a solution for developers that relied on this behavior of Zend_Db_Table_Row in Zend Framework previous to 1.10.2.

A Solution

While the motivation for creating this class is based on providing a solution to developers who relied on utilizing junction table data in Zend_Db_Table_Row’s many-to-many rowsets, this same technique can be utilized with any ORM or database abstraction layer that handles many-to-many result sets.

Basically, I’ve created a single class that effectively takes the place of Zend_Db_Table_Row::findManyToManyRowset() for the purposes of creating an iterable rowset that allows access to both the target many-to-many rowset and the junction rowset. This solution is called a Composite Rowset. In this solution, both rowsets (iterators) are kept in sync with one another. This proves to be an ideal solution in a couple of ways. First, it will produce consistent row objects that are explicitly tied to a row in a database. Second, the cost of creating this composite rowset is 2 queries: the original many-to-many query and a similar query to retrieve the junction rowset. This is ideal since previously, to get the junction data, findDependentRowset() would have had to be called on each row within the rowset produced by Zend_Db_Table_Row::findManyToManyRowset().

The API for this Composite Rowset looks like this:


/**
 * @link https://github.com/gooeylabs/Gooey-PHP-5.2-Components/blob/master/library/Gooey/Db/Table/ManyToManyCompositeRowset.php
 */
class Gooey_Db_Table_ManyToManyCompositeRowset implements SeekableIterator, ArrayAccess, Countable
{

    public function __construct(Zend_Db_Table_Row_Abstract $row, $matchTableName, $junctionTableName, $matchRefRule = null);
    public function seek($position);
    public function current();
    public function currentJunction();
    public function next();
    public function rewind();
    public function key();
    public function valid();
    public function offsetSet($offset, $value);
    public function offsetGet($offset);
    public function offsetExists($offset);
    public function offsetUnset($offset);
    public function count();
    public function getRow($position, $seek = false);
    public function getJunctionRow($position, $seek = false);
    public function toArray();
    public function junctionRowsetToArray();
}

NOTE: Full class located here.

As you can see, the API mirrors that of Zend_Db_Table_Rowset to provide something that is immediately recognizable. Below is an example of sample usage. For this example, assume there is a typical artist/genre data model that demonstrates a many-to-many relationship. Inside of the junction table we are attempting to track the date that the relationship was created. This example shows this usage:


$aTable = new ArtistTable();
$artist1 = $aTable->find(1)->current();
echo 'Artist: ' . $artist1->name . PHP_EOL;
// instead of $genres = $artist1->findManyToManyRowset('GenreTable', 'ArtistGenreTable');
$genres = new Gooey_Db_Table_ManyToManyCompositeRowset($artist1, 'GenreTable', 'ArtistGenreTable');

// iterate
foreach ($genres as $genre) {
    echo '  Genre ' . $genre->name . ' added on ' . $genres->currentJunction()->added_on . PHP_EOL;
}

/**
 * Sample Output:
 *
 *    Artist: Foo Artist
 *      Genre Rock & Roll added on 2010-11-10
 *      Genre Hiphop added on 2010-11-11
 *
 */
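
The idea of keeping two rowsets in step can also be sketched without Zend Framework. The class below is not the Gooey implementation: it is a simplified stand-in that uses plain arrays instead of row objects and no database, and it is written with PHP 7.1+ return types (drop them on PHP 5.x). The class name and sample data are assumptions.

```php
<?php
// Simplified stand-in for the composite rowset idea: match rows and junction
// rows are held in parallel arrays and iterated with one shared position.
class CompositeRows implements Iterator, Countable
{
    protected $rows;
    protected $junctionRows;
    protected $position = 0;

    public function __construct(array $rows, array $junctionRows)
    {
        if (count($rows) !== count($junctionRows)) {
            throw new InvalidArgumentException('Row counts must match');
        }
        $this->rows = array_values($rows);
        $this->junctionRows = array_values($junctionRows);
    }

    public function current(): array { return $this->rows[$this->position]; }
    // the junction row that belongs to the row returned by current()
    public function currentJunction(): array { return $this->junctionRows[$this->position]; }
    public function key(): int { return $this->position; }
    public function next(): void { $this->position++; }
    public function rewind(): void { $this->position = 0; }
    public function valid(): bool { return isset($this->rows[$this->position]); }
    public function count(): int { return count($this->rows); }
}

$genres = new CompositeRows(
    array(array('name' => 'Rock & Roll'), array('name' => 'Hiphop')),
    array(array('added_on' => '2010-11-10'), array('added_on' => '2010-11-11'))
);
foreach ($genres as $genre) {
    $junction = $genres->currentJunction();
    echo 'Genre ' . $genre['name'] . ' added on ' . $junction['added_on'] . PHP_EOL;
}
```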

Where To Get It & Conclusions

This code is available on my GooeyLabs github account, specifically inside of the Gooey-PHP-5.2-Components repository. (Gooey is my namespace and moniker for my open source code contributions.) Hopefully, those who have found they’ve had issues with the above mentioned fix for Zend_Db_Table_Row::findManyToManyRowset() and junction table data might find value in this class.

Exception Best Practices in PHP 5.3

September 15th, 2010 by Ralph Schindler

Every new feature added to the PHP runtime creates an exponential number of ways developers can use and abuse that new feature-set. However, it’s not until developers have had that chance that some agreed-upon good usage and bad usage cases start to emerge. Once they do emerge, we can finally start to classify them as best or worst practices.

Exception handling in PHP is not a new feature by any stretch. In this article, we’ll discuss two new features in PHP 5.3 based around exceptions. The first is nested exceptions and the second is a new set of exception types offered by the SPL extension (which is now a core extension of the PHP runtime). Both of these new features have found their way into the book of best practices and deserve to be examined in detail.

Special note: some of these features have existed in PHP < 5.3 or are at least capable of being implemented in PHP < 5.3. When this article mentions PHP 5.3, it is not in the strictest sense of the PHP runtime. Instead, it refers to code bases and projects that are adopting not only PHP 5.3 as a minimum version but also all of the best practices that have emerged in this new phase of development. This phase of development is highlighted by the “2.0” efforts of projects like Zend Framework, Symfony, Doctrine and PEAR, to name a select few.

Background

Previously in PHP 5.2, there was a single exception class Exception. Generally, speaking from a Zend Framework / PEAR coding standard perspective, this exception class became the root for all exceptions that might be thrown from within your library. For example, if you created a library for your company MyCompany, then you would, according to ZF/PEAR standards, have prefixed all code with MyCompany_. For this library, you might create a base exception for your library code: MyCompany_Exception, which extends the PHP class Exception and from which all your components might inherit, subclass, and throw. So, if you created a component MyCompany_Foo, it might have a base exception class called MyCompany_Foo_Exception that is expected to be thrown from within the MyCompany_Foo component. These exceptions can be caught by attempting to catch MyCompany_Foo_Exception, MyCompany_Exception, or simply Exception. This gives consumers of the component 3 levels of granularity (or more, depending on how many times MyCompany_Foo_Exception was subclassed) at which to catch the exception and handle it in whatever way they see fit.
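The three levels of granularity described above can be sketched as follows. This is a minimal illustration only: the class bodies and the doWork() method are hypothetical stand-ins, not code from any real library.

```php
<?php
// Hypothetical PEAR/ZF1-style hierarchy for a library prefixed "MyCompany_".
class MyCompany_Exception extends Exception {}
class MyCompany_Foo_Exception extends MyCompany_Exception {}

class MyCompany_Foo
{
    // doWork() is an illustrative method that always fails.
    public function doWork()
    {
        throw new MyCompany_Foo_Exception('Foo failed');
    }
}

$foo = new MyCompany_Foo();
$caughtAs = array();

// The same throw can be handled at three levels of granularity.
try {
    $foo->doWork();
} catch (MyCompany_Foo_Exception $e) { // component level
    $caughtAs[] = 'component';
}

try {
    $foo->doWork();
} catch (MyCompany_Exception $e) {     // library level
    $caughtAs[] = 'library';
}

try {
    $foo->doWork();
} catch (Exception $e) {               // anything at all
    $caughtAs[] = 'generic';
}

echo implode(', ', $caughtAs) . PHP_EOL;
```

Which level a caller picks depends on how much it knows about the component it is consuming.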

New Feature: Nesting

In PHP 5.3, the base exception class now handles nesting. What is nesting? Nesting is the ability to catch a particular exception and throw a new exception object that holds a reference to the original exception. This gives the caller access to both the exception of the more well-known type thrown from within the consumed library and the exception that originally caused the exceptional behavior.

Why is this useful? Typically, this is most useful in code that consumes other code that throws exceptions of its own type. This might be code that utilizes the adapter pattern to wrap 3rd party code to deliver some kind of adaptable functionality, or simply code that utilizes some exception throwing PHP extension.

For example, the component Zend_Db uses the adapter pattern to wrap specific PHP extensions in order to create a database abstraction layer. In one adapter, Zend_Db wraps PDO, and PDO throws its own exception type, PDOException. Zend_Db needs to catch these PDO-specific exceptions and re-throw them as the expected and known type of Zend_Db_Exception. This gives developers the assurance that Zend_Db will always throw exceptions of type Zend_Db_Exception (so they can be caught), while still giving them access to the original PDOException in case it is needed.

The following is an example of how a fictitious database adapter might implement nested exceptions:


class MyCompany_Database
{
    /**
     * @var PDO object setup during construction
     */
    protected $_pdoResource = null;

    /**
     * @throws MyCompany_Database_Exception
     * @return int
     */
    public function executeQuery($sql)
    {
        try {
            $numRows = $this->_pdoResource->exec($sql);
        } catch (PDOException $e) {
            throw new MyCompany_Database_Exception('Query was unexecutable', 0, $e);
        }
        return $numRows;
    }

}

To utilize a nested exception, you would call the getPrevious() method of the caught exception:


// $sql and $connectionParams assumed
try {
    $db = new MyCompany_Database('PDO', $connectionParams);
    $db->executeQuery($sql);
} catch (MyCompany_Database_Exception $e) {
    echo 'General Error: ' . $e->getMessage() . "\n";
    $pdoException = $e->getPrevious();
    echo 'PDO Specific error: ' . $pdoException->getMessage() . "\n";
}
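Since a previous exception can itself carry a previous exception, it can be handy to walk the whole chain. The following is a minimal sketch; exceptionChainMessages() is a hypothetical helper name, and MyCompany_Database_Exception is defined here only as a stand-in so the example is self-contained (a RuntimeException plays the role of the low-level PDOException).

```php
<?php
// Stand-in for the library-level exception type used in the article.
class MyCompany_Database_Exception extends Exception {}

// Hypothetical helper: collect "Type: message" for an exception and all of
// its previous exceptions, outermost first.
function exceptionChainMessages($e)
{
    $messages = array();
    while ($e !== null) {
        $messages[] = get_class($e) . ': ' . $e->getMessage();
        $e = $e->getPrevious();
    }
    return $messages;
}

$chain = array();
try {
    try {
        // Stand-in for a low-level failure such as a PDOException.
        throw new RuntimeException('Connection refused');
    } catch (RuntimeException $low) {
        // Wrap it in the well-known type, keeping a reference to the cause.
        throw new MyCompany_Database_Exception('Query was unexecutable', 0, $low);
    }
} catch (MyCompany_Database_Exception $e) {
    $chain = exceptionChainMessages($e);
    echo implode(PHP_EOL, $chain) . PHP_EOL;
}
```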

Most recent PHP extensions have OO interfaces. As such, those APIs tend to lean on throwing exceptions instead of raising errors. A short list of exception-throwing extensions in PHP includes PDO, DOM, Mysqli, Phar, Soap and SQLite.

New Feature: New Core Exception Types

Also in PHP 5.3 development we are shining a light on some new and interesting exception types. These exceptions have been in place since PHP 5.2.x, but it is only recently, with the “re-evaluation” of exception best practices, that they have started gaining some limelight. They are implemented in the SPL extension and are listed on the manual pages located here. Since these new exception types are part of core PHP as part of SPL, they can be used by anyone who targets PHP 5.3 as the minimum runtime for their code. While this might seem less important when writing application layer code, the way we adopt and use these new exception types becomes even more important when we are writing and consuming library code.

So why new exception types in general? Previously, developers attempted to give more meaning to their exceptions by putting more information into the message of the exception. While this is good, it has a few drawbacks. One is that you cannot catch an exception based on a message. This can be a problem if you know a set of code is throwing the same exception type with various messages for various exceptional conditions that can be handled differently. For example, consider an authentication class whose $auth->authenticate() call throws the same type of exception (let’s assume Exception) with different messages for two specific failures: one where the authentication server cannot be reached and one for a failed authentication attempt. In this case (never mind that using exceptions might not be the best way to handle authentication responses), handling those two scenarios differently would require string parsing the message.

The solution to this is clearly some way to codify exceptions so that they can be easily interrogated when trying to discern how to react to an exceptional situation. The first response libraries have had is to use the $code property of the Exception base class. The other is to create multiple types, or new exception classes, that can be thrown to describe the behavior. Both of these approaches have the same simple drawback: neither has emerged as a best practice and, as such, neither is considered a standard. Each project attempting to replicate this solution might do so with small variations that force the consumer to go back to the documentation to understand the library-specific solution that was created. Now, with the new types in the SPL (the Standard PHP Library), developers can utilize the same types in the same way in their own projects and in the projects they are consuming, since a best practice for these new types has emerged.
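To make the “multiple types” idea concrete, here is a minimal sketch of the authentication scenario from above. Everything in it is hypothetical: the exception class names, the authenticate() function, its $serverUp flag and the hard-coded password merely stand in for real checks.

```php
<?php
// Hypothetical exception types; the names are illustrative only.
class MyCompany_Auth_ServerUnreachableException extends RuntimeException {}
class MyCompany_Auth_FailedException extends RuntimeException {}

// Hypothetical authenticator with two distinct failure modes.
function authenticate($password, $serverUp)
{
    if (!$serverUp) {
        throw new MyCompany_Auth_ServerUnreachableException('Cannot reach auth server');
    }
    if ($password !== 'secret') {
        throw new MyCompany_Auth_FailedException('Bad credentials');
    }
    return true;
}

// The caller reacts to the type; no parsing of the message is required.
function describeAttempt($password, $serverUp)
{
    try {
        authenticate($password, $serverUp);
        return 'ok';
    } catch (MyCompany_Auth_ServerUnreachableException $e) {
        return 'retry later';
    } catch (MyCompany_Auth_FailedException $e) {
        return 'ask for credentials again';
    }
}

echo describeAttempt('secret', true) . PHP_EOL;
echo describeAttempt('wrong', true) . PHP_EOL;
echo describeAttempt('secret', false) . PHP_EOL;
```

Because both hypothetical types extend RuntimeException, a caller that does not care about the difference could still catch them together with a single catch (RuntimeException $e).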

The second drawback of the detailed message approach is that it makes understanding the exceptional situation harder for non-English or limited-English speaking developers. This might slow down some developers when trying to decipher what an exception message is trying to convey. There are as many variations in how a situation gets described in the message as there are developers writing exceptions, since there is no standard for conformity or for codification.

So How Do I Use Them, Give Me The Dirty Details?

There are a total of 13 new exceptions in the SPL. Two of them can be considered “base” types: LogicException and RuntimeException; both extend the PHP Exception class. The remainder of the exceptions can thus be broken down into three logical groups: the dynamic call group, the logic group and the runtime group.

The dynamic call group contains the exceptions BadFunctionCallException and BadMethodCallException. BadMethodCallException is a subclass of BadFunctionCallException which in turn is a subclass of LogicException. That means that these exceptions can be caught by either their direct type, LogicException, or simply Exception. When do you use these? Generally, these should be used when a method call cannot be resolved inside __call(), or when a callback cannot find a valid function to call (or better put, when something is not is_callable()).

For example:


// OO variant
class Foo
{
    public function __call($method, $args)
    {
        switch ($method) {
            case 'doBar': /* ... */ break;
            default:
                throw new BadMethodCallException('Method ' . $method . ' is not callable by this object');
        }
    }

}

// procedural variant
function foo($bar, $baz) {
    $func = 'do' . $baz;
    if (!is_callable($func)) {
        throw new BadFunctionCallException('Function ' . $func . ' is not callable');
    }
}

While the most direct use is inside __call() and anywhere near code that will call_user_func(), this group of exceptions is also useful when developing any kind of API where dynamic method call and function call lookups are utilized. An example of this would be a SOAP or XML-RPC client/server that is capable of issuing and/or interpreting method requests.
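As a rough illustration of the RPC-style use mentioned above, the following sketch maps incoming method names to PHP callables. RpcDispatcher and the method names passed to it are hypothetical; this is not how any particular SOAP or XML-RPC library is implemented.

```php
<?php
// Hypothetical dispatcher, loosely modeled on what an RPC server does when
// it maps an incoming method name to a PHP callable.
class RpcDispatcher
{
    protected $handlers = array();

    public function register($name, $callable)
    {
        if (!is_callable($callable)) {
            throw new BadFunctionCallException('Handler for ' . $name . ' is not callable');
        }
        $this->handlers[$name] = $callable;
    }

    public function dispatch($name, array $args = array())
    {
        if (!isset($this->handlers[$name])) {
            throw new BadMethodCallException('Unknown remote method ' . $name);
        }
        return call_user_func_array($this->handlers[$name], $args);
    }
}

$dispatcher = new RpcDispatcher();
$dispatcher->register('string.upper', 'strtoupper');

$upper = $dispatcher->dispatch('string.upper', array('abc'));
echo $upper . PHP_EOL;

$error = null;
try {
    $dispatcher->dispatch('string.reverse'); // never registered
} catch (BadMethodCallException $e) {
    $error = $e->getMessage();
    echo $error . PHP_EOL;
}
```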

The second group is the logic group. This group consists of DomainException, InvalidArgumentException, LengthException, and OutOfRangeException. These exceptions are subclasses of LogicException, which is in turn a subclass of the PHP Exception class. You use these exceptions when there is an exceptional situation that arises from either a mutation of state or as a result of bad method or function parameters. To get a better understanding of this, we will first look at the last group of exceptions.

The final group is the runtime group. It consists of OutOfBoundsException, OverflowException, RangeException, UnderflowException, and UnexpectedValueException. These exceptions are subclasses of RuntimeException, which is in turn a subclass of the PHP Exception class. These exceptions should be used when an exceptional situation arises during the “runtime” of a function or method call.
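The parent of each type named in the three groups can be checked directly with get_parent_class(). The group labels in this short sketch are just the names used in this article, not anything defined by PHP itself.

```php
<?php
// Group labels follow the article's terminology; PHP only knows the classes.
$groups = array(
    'dynamic call' => array('BadFunctionCallException', 'BadMethodCallException'),
    'logic'        => array('DomainException', 'InvalidArgumentException',
                            'LengthException', 'OutOfRangeException'),
    'runtime'      => array('OutOfBoundsException', 'OverflowException',
                            'RangeException', 'UnderflowException',
                            'UnexpectedValueException'),
);

$parents = array();
foreach ($groups as $group => $types) {
    foreach ($types as $type) {
        // get_parent_class() accepts a class name and returns the parent's name
        $parents[$type] = get_parent_class($type);
        echo str_pad($group, 14) . $type . ' extends ' . $parents[$type] . PHP_EOL;
    }
}
```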

How do the logic group and the runtime group work together? If you look at the anatomy of an object, one of two things is generally happening. First, the object will be tracking and mutating state. This means the object is generally not doing anything (yet); it might have configuration passed to it; it might be setting up properties (via setters and getters); or, it might be getting references to other objects. Second, when the object is not tracking and mutating state, it is operating: doing what it was designed to do. This is the object’s runtime. For instance, during the object’s lifetime, it might be created, passed a configuration object, and then have setFoo($foo) and setBar($bar) called. During these times any kind of LogicException should be raised. In addition, when the object is asked to do something with parameters, for example $object->doSomething($someVariation), it would throw a LogicException during the first few lines, where it interrogates that $someVariation variable. After it is done interrogating $someVariation and goes on about doing its job of doSomething(), this is considered its “runtime”, and in this code it would throw RuntimeExceptions.

To better understand, we’ll look at this concept in code:


class Foo
{
    protected $number = 0;
    protected $bar = null;

    public function __construct($options)
    {
        /** this area throws LogicException types **/
    }

    public function setNumber($number)
    {
        /** this method throws LogicException types **/
    }

    public function setBar(Bar $bar)
    {
        /** this method throws LogicException types **/
    }

    public function doSomething($differentNumber)
    {
        if ($differentNumber != $expectedCondition) {
            /** this area throws LogicException types **/
        }

        /**
         * From here on down, this method throws
         * RuntimeException types
         */
    }

}

Now that this concept is understood, what does this do for a consumer of this code base? The caller can be sure that any time they are mutating the state of an object, they can catch exceptions with the most specific type, for example InvalidArgumentException or LengthException, or at the very least with LogicException. By having this level of granularity, and multiple types involved, they can catch the exception minimally with LogicException, but also get a greater understanding of what went wrong via the actual type of the exception. The same concept applies to the runtime group of exceptions: more specific types can be thrown, and they can be caught by either the specific or the less specific type. This offers a greater deal of knowledge about the situation and granularity of control to the caller.
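A small, concrete variation on the skeleton above may help. The Counter class here is hypothetical and exists only to show a setter throwing from the logic group, an operation throwing from the runtime group, and a caller catching each at the less specific level.

```php
<?php
// Hypothetical class following the state/runtime split described above.
class Counter
{
    protected $limit = 0;
    protected $count = 0;

    public function setLimit($limit)
    {
        // State mutation: bad input is reported with a LogicException subtype.
        if (!is_int($limit) || $limit < 1) {
            throw new InvalidArgumentException('Limit must be a positive integer');
        }
        $this->limit = $limit;
    }

    public function increment()
    {
        // "Runtime": the failure depends on what happened while operating.
        if ($this->count >= $this->limit) {
            throw new OverflowException('Counter is full');
        }
        return ++$this->count;
    }
}

$log = array();
$counter = new Counter();

try {
    $counter->setLimit('many');
} catch (LogicException $e) {
    $log[] = 'logic: ' . get_class($e);
}

$counter->setLimit(1);
try {
    $counter->increment(); // fine, the count becomes 1
    $counter->increment(); // exceeds the limit
} catch (RuntimeException $e) {
    $log[] = 'runtime: ' . get_class($e);
}

echo implode(PHP_EOL, $log) . PHP_EOL;
```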

Below is a table of the information you might find of interest concerning these SPL exceptions

Best Practices In Library Code

Since the advent of these new exception types in PHP 5.3, a new best practice for library code has also emerged. While it is most beneficial to get a standard specialized exception type like InvalidArgumentException or RuntimeException, it would also be useful to catch component level exceptions. You can read a more in-depth discussion of the concepts on the ZF2 wiki or the PEAR2 wiki.

The long and short of this, in addition to the best practices listed above, is that there should be a component level type that can be caught for any exception that emanates from a component. This is accomplished by using what is known as a Marker Interface. By creating a component level marker interface, real exception types inside a given component can extend the SPL exception types and be caught by any number of class types at runtime. Let’s examine the following code:


// usage of bracket syntax for brevity
namespace MyCompany\Component {

    interface Exception
    {}

    class UnexpectedValueException
        extends \UnexpectedValueException
        implements Exception
    {}

    class Component
    {
        public static function doSomething()
        {
            if ($somethingExceptionalHappens) {
                throw new UnexpectedValueException('Something bad happened');
            }
        }
    }

}

Assuming the above code, if one were to execute MyCompany\Component\Component::doSomething(), the exception that is emitted from the doSomething() method can be caught by any of the following types: PHP’s Exception, SPL’s UnexpectedValueException, SPL’s RuntimeException, the component’s MyCompany\Component\UnexpectedValueException, or the component’s MyCompany\Component\Exception. This affords the caller any number of opportunities to catch an exception that emanates from a given component within your library. Furthermore, by analyzing the types that make up the exception, more semantic meaning can be given to the exceptional situation that just occurred.
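The list of catchable types can be verified with a short script. This sketch repeats the component in a self-contained form, using the single-namespace-per-file syntax instead of brackets and throwing unconditionally; the $catchTypes and $matched variables are illustrative names only.

```php
<?php
namespace MyCompany\Component;

// Marker interface: no methods, it only tags exceptions from this component.
interface Exception
{}

class UnexpectedValueException
    extends \UnexpectedValueException
    implements Exception
{}

class Component
{
    public static function doSomething()
    {
        // thrown unconditionally for the purposes of this sketch
        throw new UnexpectedValueException('Something bad happened');
    }
}

// Class names held in strings are always treated as fully qualified.
$catchTypes = array(
    'Exception',
    'RuntimeException',
    'UnexpectedValueException',
    'MyCompany\Component\UnexpectedValueException',
    'MyCompany\Component\Exception',
);

$matched = array();
try {
    Component::doSomething();
} catch (Exception $e) {
    // Inside this namespace, "Exception" is the marker interface;
    // PHP's base class would be written \Exception.
    foreach ($catchTypes as $type) {
        if ($e instanceof $type) {
            $matched[] = $type;
        }
    }
}

echo implode(PHP_EOL, $matched) . PHP_EOL;
```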

Summary

In summary, this article should help guide you in creating and throwing more meaningful exceptions in a standards-based and best-practices way by putting less emphasis on the exception message and more emphasis on the exception type. If you’d like to carry on the discussion of these concepts, feel free to comment here, on the PHP documentation pages, or in the ZF2 wiki comments section for the Exception proposal linked above.

The Anatomy Of A Bug/Issue Reproduction Script

February 18th, 2010 by Ralph Schindler

“There is a problem with component Fooey-Bar-Bazzy, I think it’s related to Nanny-Nanny-Neener. Please Fix Now.” If you’ve written a bug/issue report like that in the past with no other details, shame on you! This may come as a shock, but as great as some developers might be, they cannot read minds. Each has their own way of coding, a custom working environment and their own favorite tools, quite apart from variances in coding standards and best practices. Some could argue these little intricacies are outside of the realm of coding standards and best practices and that these are the differences between good, great, and even terrible developers. Each developer has a different opinion on how particular applications, libraries of code, or even features of a particular project are expected to behave in practice. These varying expectations are why bugs/issues exist. No one developer producing code for mass consumption can anticipate every possible use case. Additionally, no one developer can replicate every environment surrounding every pre-conceived use case. There are simply not enough resources at hand, be it in the form of a variety of systems or simply the number of hours in a developer’s day.

With that in mind, I write this as a plea to all developers to be good to the maintainers of the code you use. In the simplest form of advice, I suggest that before you click submit on that bug/issue report form, ask yourself two questions: “Did I do enough due-diligence in determining if this is really a bug?” AND “If I got this bug report, would I be able to reproduce it, let alone understand it?”. If the answer is YES to both of those questions, go ahead and click submit. If your answer is no, you’ve got some more work to do.

Some Tenets Of the Good Reproduction Script

In this short article, I’d like to outline a few details of what should go into a bug/issue report. These are some simple guidelines that should be considered when you write a bug/issue report. It should be noted that this list is by no means exhaustive, but if you at least consider the list below before clicking submit, you’ll make a code maintainer’s day. I promise.

  1. List Out All Assumptions Clearly

    PHP specifically is well known for being a “glue language”. What that means is that PHP is generally sitting between multiple pieces of software that are, of course, not PHP. This means that these pieces of software each have their own set of configurations and environments that PHP is “gluing” together. That being the case, any assumptions about these non-PHP components should be clearly listed in the reproduction script. This could include the database flavor and its settings, a PHP library component, or perhaps a specific version of an extension that is being used and the underlying unmanaged/C-based library your PHP environment is consuming.

  2. Use The Shortest Possible Use Case

    As tempting as it is to copy a script from your project and paste it into the bug/issue submission box, don’t do this. If you are truly invested in seeing the bug/issue fixed in a timely fashion, take the time to create a small reproduction script. In this script should be the absolute minimal amount of code to demonstrate to another human that there is indeed a problem that needs solving. By keeping the script minimal and short, you are also removing any other distractions from the script that otherwise might confuse the maintainer and prevent him from fully understanding the real problem.

  3. Use Generic Yet Meaningful Names

    It cannot be stressed enough that any non-meaningful names should be discouraged at all costs. And as mentioned above, you want to have as few distractions as possible in the use case. For example, supplying your database table of customers, with first_name, last_name, etc., has virtually nothing to do with the problem at hand. In these cases where table and column names are ancillary to the actual problem, they should be generalized: a table named ‘foo’, and columns named ‘bar1’ and ‘bar2’. Unless …

    … the variable name can add context to the problem. What does this mean? $customer would be bad, but $faultyTableObject is good. The latter naming makes it easy for the maintainer to focus on the variable that needs to be tracked leading up to the problem.

  4. Document Both What You Expect, And The Actual Result

    Claiming something is broken without stating what you expect and what the actual result is offers next to nothing to the maintainer attempting to fix the problem. Generally speaking, most use cases that end up being bugs/issues are outside of the original preconceived use cases for the actual component. That said, the maintainer is going to need the context of the use case that you’ve found to be problematic. It also helps to point out any existing documentation that describes the more well-defined use cases, and how your use case relates to and/or deviates from those already defined use cases.

  5. Make The Reproduction Script As Generic As Possible

    Perhaps this is redundant, but it’s important to know the minimal requirements for reproducing a bug/issue. You are not expected to be an expert on how to fix the actual problem, but you should do your own due-diligence in order to hand the problem off to the maintainer. It’s already been said to “List out all assumptions clearly”, but it is just as important to peel off any specific pieces of the problem that are not directly part of the problem.

    This concept can best be described by example. While MySQL is a widely available database platform, SQLite is widely known as the easiest to use and most portable database platform, at least in the PHP runtime. If you find a problem while using MySQL, but it’s clear it can be replicated using SQLite, use SQLite. SQLite is built into PHP by default, and in a single script, you can create a memory based database and its schema in just a few lines of code.

    Sometimes an issue cannot be described in a single script. This is ok. This would be the case if, for example, you found an issue in a larger system, like Zend Framework’s MVC layer. In this case, it makes sense that you need to provide a minimal ZF project to demonstrate the issue. In these cases, again make sure to use as few files and as little code as possible to demonstrate the issue. Also, in the spirit of using generic code, be sure to make all file system paths relative. This will help the maintainer get up and running with the problematic project in a minimal amount of time, with minimal configuration.
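For the first tenet, part of the environment can be reported by the script itself rather than typed by hand. The following is only a sketch: describe_environment() is a hypothetical helper name, and the list of extensions passed to it is an assumption that should be adjusted to whatever the report actually depends on.

```php
<?php
// Hypothetical helper: build the "assumptions" header of a reproduction
// script from the running environment.
function describe_environment(array $extensions)
{
    $lines = array('PHP ' . PHP_VERSION . ' on ' . PHP_OS);
    foreach ($extensions as $extension) {
        // phpversion() returns false when the extension is not loaded
        $version = phpversion($extension);
        $lines[] = $extension . ': ' . ($version === false ? 'not loaded' : $version);
    }
    return $lines;
}

// The extension names below are examples only.
$environment = describe_environment(array('pdo_sqlite', 'SPL'));
echo implode(PHP_EOL, $environment) . PHP_EOL;
```

Pasting this output at the top of a report saves the maintainer from having to ask which versions were involved.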

A Reproduction Script By Example

The following is a reproduction script I have written based on an issue (ZF-3709) provided to Zend Framework in our issue tracker. I chose this issue to write a reproduction for because it offers the ability to talk about how one might go about describing the environment, more specifically what the database should look like in order to replicate the problem.

(This script can also be found at http://gist.github.com/307396)

<?php

/**
 * This reproduction script shall accompany the issue reported at
 * http://framework.zend.com/issues/browse/ZF-3709
 *
 * Assumptions:
 *   Zend_Db_Table_* from trunk
 *   PHP Environment has SQLite with :memory: capabilities
 *
 * Result:
 *   This script should run without any assertions failing (empty output)
 */

// ensure that Zend Framework trunk is being tested against & classes are available
// set_include_path('/path/to/ZendFramework/library');
require_once 'Zend/Loader/Autoloader.php';
Zend_Loader_Autoloader::getInstance();

// setup the adapter, this uses SQLite so that it's minimally invasive
// to anyone wishing to reproduce the issue on their local machine
$dbAdapter = Zend_Db::factory(
    'Pdo_Sqlite',
    array('dbname' => ':memory:')
    );

// ensure all tables have access to the adapter
Zend_Db_Table::setDefaultAdapter($dbAdapter);

// setup the database, classes, & assertion system
setup();

/**
 * BEGIN Reproduction Code
 */

// find a record that has a relationship to some bars through foo_to_bar
$fooTable = new Foo();
$fooRow = $fooTable->fetchRow('id = 2');
$fooIdOnesBars = $fooRow->findManyToManyRowset('Bar', 'FooToBar');

// the expected values for the next call
$expectedValues = array(
    array('id' => '2', 'name' => 'bravo'),
    array('id' => '3', 'name' => 'charlie')
    );

// when we loop through the rows, they should match the expected results above
foreach ($fooIdOnesBars as $index => $barRow) {
    // I'll use assert here to throw warnings when expected does not match actual
    $actualValue = $barRow->toArray();
    assert($expectedValues[$index] === $actualValue);
}

/**
 * END Reproduction Code
 *
 * Supporting code below
 */ 

// setup function
function setup() {
    setup_database();
    setup_classes();
    setup_assertions();
}

// This function will setup the proper database structure with test data
function setup_database() {
    global $dbAdapter;

    $conn = $dbAdapter->getConnection();
    $conn->query('
        CREATE TABLE foo (
            id INTEGER PRIMARY KEY,
            name VARCHAR(25)
            );
        ');

    foreach (array('one', 'two', 'three', 'four') as $numberName) {
        $conn->query('INSERT INTO foo (name) VALUES ("' . $numberName . '");');
    }

    $conn->query('
        CREATE TABLE bar (
            id INTEGER PRIMARY KEY,
            name VARCHAR(25));
        ');

    foreach (array('alpha', 'bravo', 'charlie', 'delta') as $word) {
        $conn->query('INSERT INTO bar (name) VALUES ("' . $word . '");');
    }

    $conn->query('
        CREATE TABLE foo_to_bar (
            id INTEGER PRIMARY KEY,
            foo_id INTEGER,
            bar_id INTEGER,
            extra VARCHAR(20)
            );
        ');
    $datas = array(
        array('foo_id' => 2, 'bar_id' => 2, 'extra' => 'Two to Two'),
        array('foo_id' => 2, 'bar_id' => 3, 'extra' => 'Two to Three'),
        array('foo_id' => 3, 'bar_id' => 4, 'extra' => 'Three to Four'),
        );
    foreach ($datas as $datum) {
        $conn->query('INSERT INTO foo_to_bar '
            . '(' . implode(',', array_keys($datum)) . ')'
            . ' VALUES ("' . implode('", "', array_values($datum))
            . '");');
    }
}

// This function will define the proper Zend_Db_Tables and their relationships
function setup_classes() {

    class Foo extends Zend_Db_Table_Abstract
    {
        protected $_name = 'foo';
    }

    class Bar extends Zend_Db_Table_Abstract
    {
        protected $_name = 'bar';
    }

    class FooToBar extends Zend_Db_Table_Abstract
    {
        protected $_name = 'foo_to_bar';
        protected $_referenceMap = array(
            'Foo' => array(
                'columns' => 'foo_id',
                'refTableClass' => 'Foo',
                'refColumn' => 'id'
                ),
            'Bar' => array(
                'columns' => 'bar_id',
                'refTableClass' => 'Bar',
                'refColumn' => 'id'
                )
            );
    }

}

// assertion setup
function setup_assertions() {
    assert_options(ASSERT_ACTIVE, true);
    assert_options(ASSERT_WARNING, false);
    assert_options(ASSERT_CALLBACK, 'assert_failure');
}

// callback for assertion failures
function assert_failure() {
    global $expectedValues, $index, $actualValue;
    echo 'Was expecting an array that looked like:' . PHP_EOL;
    var_dump($expectedValues[$index]);
    echo 'But got array that looked like:' . PHP_EOL;
    var_dump($actualValue);
    echo PHP_EOL . PHP_EOL;
}
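As a possible variation that is not part of the original script: assert_options() has been deprecated in recent PHP releases (as of 8.3), so on newer runtimes a plain comparison helper may be a safer way to report expected versus actual values. A minimal sketch, with expect_same() being a hypothetical helper name:

```php
<?php
// Hypothetical helper: report a mismatch without relying on assert_options()
// or on global variables.
function expect_same($expected, $actual, $label)
{
    if ($expected === $actual) {
        return true;
    }
    echo $label . ': was expecting a value that looked like:' . PHP_EOL;
    var_dump($expected);
    echo 'But got a value that looked like:' . PHP_EOL;
    var_dump($actual);
    echo PHP_EOL;
    return false;
}

// Matching values produce no output, mirroring the "empty output" goal.
$ok = expect_same(
    array('id' => '2', 'name' => 'bravo'),
    array('id' => '2', 'name' => 'bravo'),
    'row 0'
);
```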

To the best of my ability, this script passes both of my earlier questions: “Yes, I did enough due-diligence in determining if this is really a bug.” AND “Yes, if I got this bug report, I would be able to reproduce it and understand it.”

A Few Considerations

The above script does not have unit tests, nor does it represent a patch to the existing framework. While that would be ideal, it sets the bar much too high for people to report worthwhile issues. The consumers of the code are not expected to be experts on the actual issue at hand, or even on how to write valid unit tests that fully exercise a feature or bug. Ultimately, as a code maintainer, I simply want to be able to see the issue you are attempting to describe.

If you’d like to go above and beyond the standard reproduction script, you might also consider pointing out the lines of code that you feel might be problematic. This allows maintainers to set breakpoints at specific locations and really drill down into the offending code.

I hope this helps developers understand what is expected of them as they file issue reports on open source code they use. By following these guidelines you’ll be doing a service to the maintainer by making their life easier, and even your own since reproduction scripts offer quicker turn around time for issues over those that require in-depth research.

Dynamic Assertions for Zend_Acl in ZF

August 13th, 2009 by Ralph Schindler

In Zend Framework 1.9.1, Zend_Acl gets two major issues resolved and a simple API change that now make it possible to create a more robust, more expressive ACL definition with less code. ZF issues ZF-1721 and ZF-1722, each nearly two years old, have both been solved. Over the last two years, I’ve seen a variety of duplicate issues come into the issue tracker, which stem from two fundamental flaws in Zend_Acl – “Zend_Acl::isAllowed does not support Role/Resource Inheritance down to Assertions” and “Zend_Acl assertions breaks when inheritance is required (ie DepthFirstSearch)”. In this article, we’ll explore the API changes that alleviate these two problems, and we’ll demonstrate how to leverage the Zend_Acl assertion system to create expressive, dynamic assertions that work with your application’s models.

Backwards Compatible API Changes

Before discussing the issues, let’s go over the API change and how that affects the component. Previously, the two methods for setting up an ACL that were used by a developer were add() and addRole(). Interestingly, add() was intended to imply addResource(). Since add() implied that you were adding a resource, it’s clear that this component was created from the perspective of resources as a primary actor, and then roles and assertions as secondary actors.

The new API allows for the creation of an ACL by using strings instead of having to use Zend_Acl_Role and Zend_Acl_Resource objects explicitly. To me, this is a pretty important step towards what I’d like to see in 2.0. In 2.0, I would ideally like to see addRole() and addResource() accept strings for types of roles and resources to query against, and accept objects for explicit role and resource objects to query against (even if they match an already registered type). Put simply, I would expect addRole('user') and addRole($userObjectForRalph) to have different behaviors if different permissions were registered for each. This would allow me to specify specific access for the user object ‘ralph’ separately from the ACLs for objects of role type ‘user’. The behavior can be further defined to either inherit from the type, or override the type’s ACLs, depending on the desired effect. Ultimately, this would allow for a more dynamic experience with Zend_Acl.

Dynamic Assertions Example

In the following example, we’ll have a look at a common use case that is now possible in Zend_Acl. In plain English, what developers want is to design assertions that can accept application models that implement the Resource or Role interface, and to apply some dynamic or custom logic to assess whether or not the given role has access to the given resource. As mentioned previously, this was not possible because in the process of checking the ACL tree, using a depth-first search, the calling resource and role were lost, and only the originally registered objects were being passed into the assertions. Well, that’s fixed now.

For the purposes of this example, we’ll take a simple concept: a user should only be able to edit their own blog posts. The user, in this case, would be our application’s model for users. The actual class will implement the Zend_Acl_Role_Interface. We will also have a BlogPost model which will serve as the resource in question, thus implementing the Zend_Acl_Resource_Interface. Naturally, our system will be able to handle users of different role ‘types’, but our BlogPost will only be of a single resource type ‘blogPost’.

Note: the following code is for demonstration only. As such, some coding standards or conventions are not necessarily what you’d expect in proper object-oriented code or even a Zend Framework MVC based application. Some of the code might contain rogue ‘echo’ statements so that the demonstration below will be more expressive of what it’s actually doing.

class User implements Zend_Acl_Role_Interface
{
    // using public members here for brevity in this article
    public $id = null;
    public $role = 'guest';

    public function getRoleId()
    {
        return $this->role;
    }
}

class BlogPost implements Zend_Acl_Resource_Interface
{
    public $id          = null;
    public $ownerUserId = null;

    public function getResourceId()
    {
        return 'blogPost';
    }
}

Next, we’ll create the dynamic assertion. We generally would expect this assertion to be called when a User requests to modify a BlogPost. This assertion will ensure that the BlogPost’s owner id (the id of the user that owns said BlogPost) is the same as the provided User object’s id. If it is, pass; if not, fail. Fairly common use case, right? Here is what our assertion should look like, with a few inline comments:

class UserCanModifyBlogPostAssertion implements Zend_Acl_Assert_Interface
{
    /**
     * This assertion should receive the actual User and BlogPost objects.
     *
     * @param Zend_Acl $acl
     * @param Zend_Acl_Role_Interface $user
     * @param Zend_Acl_Resource_Interface $blogPost
     * @param $privilege
     * @return bool
     */
    public function assert(Zend_Acl $acl, Zend_Acl_Role_Interface $user = null, Zend_Acl_Resource_Interface $blogPost = null, $privilege = null)
    {
        echo ' == Checking the assertion ==' . PHP_EOL; // only here for the purposes of article

        // __METHOD__ already expands to "ClassName::methodName"
        if (!$user instanceof User) {
            throw new InvalidArgumentException(__METHOD__ . ' expects the role to be an instance of User');
        }

        if (!$blogPost instanceof BlogPost) {
            throw new InvalidArgumentException(__METHOD__ . ' expects the resource to be an instance of BlogPost');
        }

        // if role is publisher, he can always modify a post
        if ($user->getRoleId() == 'publisher') {
            return true;
        }

        // check to ensure that everyone else is only modifying their own post
        return $user->id != null && $blogPost->ownerUserId == $user->id;
    }
}

Note: Assertions, as with ACLs, can be treated, and most likely should be treated, as application models. As such, if you are using the Zend Framework MVC application structure, you might want to name this one something like Default_Model_Acl_UserCanModifyBlogPostAssertion, and it would live in application/models/Acl/UserCanModifyBlogPostAssertion.php. Likewise, the User class would actually be Default_Model_User, and BlogPost might be Default_Model_BlogPost.

Now that we have our models set up for our ACL to interact with, it’s time to define the ACL itself. For the purposes of this exercise, we’ll not assume that the ACL itself is a model; our consuming script below will simply interact with it. In a Zend Framework MVC application, you might find the ACL defined as a model within your application, depending on your needs.

$acl = new Zend_Acl();

// setup the various roles in our system
$acl->addRole('guest');
$acl->addRole('contributor', 'guest');
$acl->addRole('publisher', 'contributor');

// add the resources
$acl->addResource('blogPost');

// add privileges to role and resource combinations
$acl->allow('guest', 'blogPost', 'view');
$acl->allow('contributor', 'blogPost', 'contribute');
$acl->allow('contributor', 'blogPost', 'modify', new UserCanModifyBlogPostAssertion());
$acl->allow('publisher', 'blogPost', 'publish');

The above code has produced a fully defined ACL object, at least for the purposes of this article, that we can now start interacting with. In the following examples, we’ll interact with this ACL object. The User and BlogPost objects utilize public properties for brevity and illustrative purposes, but you can assume that these object properties might be populated and persisted via a Zend_Db_Table row, a web service, or some other persistence layer.

$user = new User();
$post = new BlogPost();

// some default values
$user->id = 1;
$post->ownerUserId = 1;

/**
 * Demonstrate guest Privileges
 */
echo 'Demonstrating ' . $user->role . ' privileges' . PHP_EOL
    . '------------------------------------------'
    . PHP_EOL . PHP_EOL;

echo 'Can user (' . $user->role . ') view?' . PHP_EOL
    . ($acl->isAllowed($user, $post, 'view') ? 'yes' : 'no') . PHP_EOL
    . PHP_EOL; 

echo 'Can user (' . $user->role . ') contribute?' . PHP_EOL
    . ($acl->isAllowed($user, $post, 'contribute') ? 'yes' : 'no') . PHP_EOL
    . PHP_EOL;

echo 'Can user (' . $user->role . ') modify?' . PHP_EOL
    . ($acl->isAllowed($user, $post, 'modify') ? 'yes' : 'no') . PHP_EOL
    . PHP_EOL;

echo 'Can user (' . $user->role . ') publish?' . PHP_EOL
    . ($acl->isAllowed($user, $post, 'publish') ? 'yes' : 'no') . PHP_EOL
    . PHP_EOL;

/**
 * Demonstrate contributor Privileges
 */

$user->role = 'contributor';

echo 'Demonstrating ' . $user->role . ' privileges' . PHP_EOL
    . '------------------------------------------'
    . PHP_EOL . PHP_EOL;

echo 'Can user (' . $user->role . ') view?' . PHP_EOL
    . ($acl->isAllowed($user, $post, 'view') ? 'yes' : 'no') . PHP_EOL
    . PHP_EOL; 

echo 'Can user (' . $user->role . ') contribute?' . PHP_EOL
    . ($acl->isAllowed($user, $post, 'contribute') ? 'yes' : 'no') . PHP_EOL
    . PHP_EOL;

$post->ownerUserId = 5;

// the following two examples should demonstrate the assertion being checked

echo 'Can user (' . $user->role . ') modify someone elses blogPost?' . PHP_EOL
    . ($acl->isAllowed($user, $post, 'modify') ? 'yes' : 'no') . PHP_EOL
    . PHP_EOL;

$post->ownerUserId = 1;

echo 'Can user (' . $user->role . ') modify own blogPost?' . PHP_EOL
    . ($acl->isAllowed($user, $post, 'modify') ? 'yes' : 'no') . PHP_EOL
    . PHP_EOL;

echo 'Can user (' . $user->role . ') publish?' . PHP_EOL
    . ($acl->isAllowed($user, $post, 'publish') ? 'yes' : 'no') . PHP_EOL
    . PHP_EOL;

/**
 * Demonstrate publisher Privileges
 */

$user->role = 'publisher';

echo 'Demonstrating ' . $user->role . ' privileges' . PHP_EOL
    . '------------------------------------------'
    . PHP_EOL . PHP_EOL;

echo 'Can user (' . $user->role . ') view?' . PHP_EOL
    . ($acl->isAllowed($user, $post, 'view') ? 'yes' : 'no') . PHP_EOL
    . PHP_EOL; 

echo 'Can user (' . $user->role . ') contribute?' . PHP_EOL
    . ($acl->isAllowed($user, $post, 'contribute') ? 'yes' : 'no') . PHP_EOL
    . PHP_EOL;

$post->ownerUserId = 5;

echo 'Can user (' . $user->role . ') modify someone elses blogPost?' . PHP_EOL
    . ($acl->isAllowed($user, $post, 'modify') ? 'yes' : 'no') . PHP_EOL
    . PHP_EOL;

$post->ownerUserId = 1;

echo 'Can user (' . $user->role . ') modify own blogPost?' . PHP_EOL
    . ($acl->isAllowed($user, $post, 'modify') ? 'yes' : 'no') . PHP_EOL
    . PHP_EOL;

echo 'Can user (' . $user->role . ') publish?' . PHP_EOL
    . ($acl->isAllowed($user, $post, 'publish') ? 'yes' : 'no') . PHP_EOL
    . PHP_EOL;

Once you have all of that in place, a run of such a script would produce these results:

/home/ralph/test-script/$ php acl-inheritance.php

Demonstrating guest privileges
------------------------------------------

Can user (guest) view?
yes

Can user (guest) contribute?
no

Can user (guest) modify?
no

Can user (guest) publish?
no

Demonstrating contributor privileges
------------------------------------------

Can user (contributor) view?
yes

Can user (contributor) contribute?
yes

 == Checking the assertion ==
Can user (contributor) modify someone elses blogPost?
no

 == Checking the assertion ==
Can user (contributor) modify own blogPost?
yes

Can user (contributor) publish?
no

Demonstrating publisher privileges
------------------------------------------

Can user (publisher) view?
yes

Can user (publisher) contribute?
yes

 == Checking the assertion ==
Can user (publisher) modify someone elses blogPost?
yes

 == Checking the assertion ==
Can user (publisher) modify own blogPost?
yes

Can user (publisher) publish?
yes

Conclusion

Zend_Acl can now be used to make concise, dynamic and expressive ACL systems. The assertion system that is in place in Zend_Acl can be leveraged in ways never seen before out of the box. While the User/BlogPost example is on the simple side, you can use this article to start thinking about the different ways such a system can be leveraged in your own projects where dynamic assertions would simplify controller or model code that is already in place.

Database Abstraction Layers Must Live!

July 15th, 2009 by Ralph Schindler

I come preaching true hope, against the fallacies.

I’ve heard the arguments for and against database abstraction layers (DALs) time and time again. I must say first, I agree with them all, both sides, equally. Interestingly, I can put the vocal proponents of each side of the argument in one of two boxes: a programmer guy box, or a database guy box. For some unknown reason though, they never seem to see eye to eye.

Honestly though, I like to put myself in the middle of that argument. I see both sides. I think fine tuning an application’s core business with vendor specific features is tremendously important; after all, that is why there are so many competing database vendors. Generally speaking of database driven projects, I feel like planning to use a specific vendor up front, knowing its pros and cons, and tailoring an application to the chosen database’s strengths can only help in the long run. Also, I feel that building a database model first, before any code, offers more performance and scalability advantages than code-first development does.

That said, I also see value in using a database as a simple data-store when the actual database is not a key component of the overall application. That’s right, it is completely valid to say that the data-storage & database component of an application sometimes is not the key component; a database guy probably will never agree with you there. Just as there are programmers who swear by this code first, database later mantra, there are database developers that will swear by the database first, code later mantra.

The fact is, each project is unique. It’s this uniqueness of projects and their execution that ultimately shapes the perspectives of developers as well as the tools they write and consume. To say that one mantra is clearly a better choice over another is simply being ignorant.

The Use Case of Abstraction Layers

To be honest, I don’t really buy the “I might switch database vendors at some point” argument either, as Jeremy Zawodny points out. For larger projects (on the scale of the Facebooks, the Twitters, etc.), switching the database underneath after a project has been in production is a monumental task, regardless of whether you have an abstraction layer or not. Chances are, you used some of the database specific features, not to mention that you now have a large set of mission critical data that also has to be ported. Long story short, it’s never as easy as swapping out the abstraction layer’s database adapter.

What I will buy, though, is that there are some problems that fall in the thicker end of the Pareto Principle that can be solved with a database abstraction layer. For the uninitiated, the Pareto Principle is effectively the 80/20 rule. Applied to software, the 80% is the majority of use cases. These use cases are generally not that interesting in terms of database interaction. To give it a label, we can call these the CRUD, BREAD, or <<insert your favorite terminology here>> operations. That is not to say that these operations are not important, but they are not special. In fact, they are so un-special that we can just about apply a standard query syntax (SQL 92) to them, and expect that the query is both portable between databases and common across applications that wish to use them.

This is where database abstraction fits in. As a developer, you’ll come across this problem time and time again. A large portion of an application is CRUD screens, and the smaller, more interesting part of your application is your reporting screens. With an abstraction layer, we are able to code against a unified API as well as have a layer that will produce consistent and vendor compatible queries. This allows us to build more specialized data access layers (patterns) for multiple database vendors with great ease. You want Table Gateway? Done. You want Row Gateway? Done. You want Active Record? Done. Each can be implemented to tackle the 80% part of the 80/20 rule when applied to the database centric business code of an application.

The Slow Path & The Fast Path

When I talk about this 80/20 rule in terms of the applications we write, I like to further refine the terminology so that it is easier to visualize. The most prominent terms that help developers visualize the 80/20 rule in their application are the slow path and the fast path of the application. Each of these terms has a set of characteristics that sets it apart from the other:

Slow Path:

  • Performance is not of primary importance
  • Has an interactive nature
  • Validation and verification of data are of high priority
  • Application to data-store interactions are fairly trivial
  • Does not comprise the application’s core business logic

Fast Path:

  • Performance is of importance
  • Limited interactive nature, information flow is fairly static (non-interactive)
  • Flow of information consists of already verified and validated data (originates from the database)
  • Application to data-store interaction can become complex (JOINs, SUB-SELECTS, VIEWS)
  • Is the core business of the application

To get a better understanding of how the terms are applied, let’s look at a typical web application. Generally speaking, there are a few web based forms that users interact with. These forms are the entry point of a code path that does not get a lot of throughput. This is generally because forms are submitted by people, and people can only type and submit forms so fast. In addition to this being a less traveled code path, it also has a few checks along the way: validation of data, and verification of data. Typically, the problems of verification and validation of data are not too unique to the application being executed. In fact, the web forms, validation and verification problems have been solved over and over again by various libraries.

On the other side of the equation, there is the aggregation and merging of the stored data (which inevitably came from the aforementioned web forms). Since the unique aggregation and processing of this data is the core business of said application, it stands to reason that this code path will be more heavily traveled by users. This is the fast path. The problems solved in this code path are generally unique, and since they are unique, it’s hard to find an off the shelf solution to them.

Since this is where the money is to be made, it also stands to reason that developers should concentrate their efforts in the fast path of their application. This means they should solve the slow path problems of their application with existing tried and tested solutions- this includes generic forms solutions, validation and verification libraries and yes, database abstraction layers.

Getting Cozy With Zend_Db, a Database Abstraction Layer

Now that we’ve made a case for DALs, what would one look like? Well, I’ll use Zend Framework’s Zend_Db as my example.

The connection code:

// the factory takes the adapter name as a string, then the connection parameters
$dbAdapter = Zend_Db::factory(
    'Pdo_Mysql', // could also be Pdo_Sqlite, Mysqli, Db2, or even Oracle
    array(
        'username' => 'test_user',
        'password' => 'test_pwd',
        'dbname'   => 'test'
        )
    );

You’ll note that since this factory takes a standardized array, it makes it trivial to swap out various connection information for different adapters.
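As a rough sketch of what that swapping might look like in practice, the connection details can be chosen per environment and handed to the factory without touching the calling code. The connectionConfig() helper, the environment names, the 'adapter'/'params' layout of its return value, and the in-memory SQLite setting are all assumptions made for illustration; only the adapter names and parameter keys mirror the example above.

```php
<?php
// Hypothetical helper: returns an adapter name plus connection params
// for a given environment. Names and values are examples only.
function connectionConfig($environment)
{
    $configs = array(
        'production' => array(
            'adapter' => 'Pdo_Mysql',
            'params'  => array(
                'username' => 'test_user',
                'password' => 'test_pwd',
                'dbname'   => 'test',
            ),
        ),
        'testing' => array(
            'adapter' => 'Pdo_Sqlite',
            'params'  => array(
                'dbname' => ':memory:', // assumption: in-memory SQLite for tests
            ),
        ),
    );

    if (!isset($configs[$environment])) {
        throw new InvalidArgumentException('Unknown environment: ' . $environment);
    }

    return $configs[$environment];
}

$config = connectionConfig('testing');
echo $config['adapter'] . PHP_EOL;

// With Zend Framework on the include_path, the consuming code is the same
// whichever entry was chosen:
// $dbAdapter = Zend_Db::factory($config['adapter'], $config['params']);
```

Everything after the factory call works against the adapter interface, so it does not need to know which entry was picked.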

Simple queries:

$data = array(
    'name'        => 'Remember the Milk',
    'description' => '2% Milk',
    'due_on'      => '2009-07-15',
    );
$dbAdapter->insert('todo_list', $data); // insert that data

// update the row we just inserted
$lastInsertId = $dbAdapter->lastInsertId('todo_list');
$dbAdapter->update('todo_list', array('completed' => 'YES'), 'id = ' . (int) $lastInsertId);

// and delete it again
$dbAdapter->delete('todo_list', 'id = ' . (int) $lastInsertId);

Here you’ll notice the generic and abstracted nature of this API. Since there are several tasks in database interaction that are consistent across the board, those such as INSERT, UPDATE and DELETE, it makes sense that we can create a generic API for handling such interactions. These interactions (INSERT, UPDATE and DELETE) represent the mutation methods of a database and as such, represent the most predominant way of getting data into a system.

For all intents and purposes though, simple SELECTs are fairly standardized too. They are standardized enough to complement the INSERT, UPDATE, and DELETE abstractions, so that we can find the actual rows these mutation operations act on.
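To see why these operations abstract so easily, consider a minimal sketch of how a layer might turn an array into a parameterized INSERT. This is not Zend_Db’s implementation: buildInsert() is a hypothetical helper, and the double-quote identifier quoting is an assumption (ANSI style; MySQL uses backticks by default), which is exactly the kind of per-vendor detail a real adapter takes care of.

```php
<?php
// Hypothetical helper, not part of Zend_Db: builds a parameterized INSERT.
function buildInsert($table, array $data)
{
    if (count($data) === 0) {
        throw new InvalidArgumentException('No data to insert');
    }

    $columns = array();
    foreach (array_keys($data) as $column) {
        // Assumption: ANSI double-quote identifier quoting.
        $columns[] = '"' . str_replace('"', '""', $column) . '"';
    }

    // One placeholder per value.
    $placeholders = array_fill(0, count($data), '?');

    $sql = 'INSERT INTO "' . str_replace('"', '""', $table) . '"'
         . ' (' . implode(', ', $columns) . ')'
         . ' VALUES (' . implode(', ', $placeholders) . ')';

    // Values travel separately from the SQL so the driver can bind them.
    return array($sql, array_values($data));
}

list($sql, $bind) = buildInsert('todo_list', array(
    'name'        => 'Remember the Milk',
    'description' => '2% Milk',
));

echo $sql . PHP_EOL;
```

The SQL text never contains the values themselves, which is what keeps the statement portable and safe to hand to any driver that supports positional placeholders.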

Now that we have a simple and consistent API for doing simple SELECTs, INSERTs, UPDATEs, and DELETEs, we can implement something a little more interesting: the table & row gateway:

Zend_Db_Table_Abstract::setDefaultAdapter($dbAdapter);
$userTable = new Zend_Db_Table('user'); // ZF 1.9 feature
// find() returns a rowset; current() gives us the row for primary key 5
$userRow = $userTable->find(5)->current();
echo $userRow->username;

Immediately, you should see the inherent value in the above example. Rudimentary and common tasks can now be handled with a consistent and simple API. But what happens when you’ve started using this DAL, and you want to use a vendor specific feature? Well..

// assuming what you want is really REPLACE or INSERT IGNORE from mysql
$dbAdapter->query('INSERT IGNORE INTO configuration (name, value) VALUES (?, ?)', array($name, $value));

// OR
$dbAdapter->query('REPLACE INTO configuration (name, value) VALUES (?, ?)', array($name, $value));

As you can see, the query method of our database adapter will allow us to pass custom SQL into the database thus taking advantage of vendor specific features.

What if you want to combine both paradigms for ultimate flexibility?


// assuming Zend_Db_Table_Row, with a FriendshipReference rule
$friendRowset = $currentUserRow->findDependentRowset('User', 'FriendshipReference');

// collect friend id's
$friendIds = array();
foreach ($friendRowset as $friendRow) {
    $friendIds[] = $friendRow->related_user_id;
}

// let the select object quote the id list for the IN clause
$select = $dbAdapter->select();
$select
    ->from('user', array(
        'user_id',
        'related_user_id',
        'became_friends_on'
        ))
    ->where('user_id IN (?)', $friendIds);

// interact with driver directly
$mysqli = $dbAdapter->getConnection();
$mysqli->query('CREATE TEMPORARY TABLE friend ('
        . ' `user_id` int(11) NOT NULL,'
        . ' `related_user_id` int(11) NOT NULL,'
        . ' `became_friends_on` DATE NOT NULL,'
        . ' PRIMARY KEY (`user_id`, `related_user_id`)' // Zend_Db_Table expects a primary key
        . ' ) ENGINE=MEMORY;'
    );
$mysqli->query('INSERT INTO friend ' . (string) $select);

// query new friend view
$friendTable = new Zend_Db_Table('friend');
$rows = $friendTable->fetchAll(
    'became_friends_on > DATE_SUB(CURDATE(), INTERVAL 6 MONTH)',
    'became_friends_on'
    );

While the above example is “a bit out there”, it does show that even with a DAL, if it’s flexible enough, you can code as close to or as far away from the database as you like. Ultimately the mantra here is: let’s get the job done in the most effective, efficient and sound way possible.

Conclusions

Simply put, a database abstraction layer is just another tool in the toolbox. You don’t have to completely change your paradigm of programming, nor do you have to apply an all-or-none approach to using a DAL. When applied correctly, you can build out the slow path of your application in little to no time, while leaving extra time for developing and fine-tuning the fast path of your application. And to keep code from becoming unruly, simply apply some best-practices code organization to your project.

PHP: Environments, Libraries, and Applications – Oh My!

May 24th, 2009 by Ralph Schindler

Over the past 10 years or so, I’ve worked with many different code bases and libraries. Originally, the “libraries” were my own because, in my earlier programming days, I had a bad case of “NCH” syndrome. That’s “Not Coded Here” syndrome for the uninitiated. As time went on, there were some solutions that I needed for a simple project and had neither the time nor the patience to develop a custom library for. That’s when I started relying on others’ experience and code to get me through projects.

The first “library” I remember using was px.sklar.com by David Sklar. There were some great components in there that were worth integrating into projects, but I hesitate to call it a true library since it’s a repository of both reusable components and complete solutions/applications. Moving on into the 21st century, a more “official” PHP library was being born: the PEAR project. The first component I really started depending on for many projects was Spreadsheet_Excel_Writer. PEAR is not without issues of its own, but that’s a topic for a separate article.

A Little History

My earliest PHP applications were fairly simple: a PHP page that would interact with a database and render some HTML. Looking back at them, they all look like oodles of hacks and spaghetti code. Of course this was 1999ish, so it was OK because, after all, it got the job done. As projects grew larger, so did a desire for better organization. This new wave of applications I was writing at the time was the first divergence from Model 1 applications, and came with the introduction of the second library I started using.

Smarty (which used to be part of the PHP Project), was a library I came to depend on in every project. The single greatest aspect of Smarty from a code organization standpoint was that it separated scripts into “business logic” scripts and “presentation logic” scripts. If an application was a soup of code, Smarty was the tool which divided out the presentation specific code, or what we’d call the ‘view’ in the MVC paradigm, from the business specific code, or what we’d call the controller and model in the MVC paradigm. This was the first step many took towards what is known in the JSP world as Model 2 programming.

So why this history wrapped in with a little personal experience? Well, I’d say the path I have followed is pretty typical of programmers that use scripting languages to build applications, specifically web applications. That said, as the technologies we’ve used have evolved and grown, we tend to move towards solutions that offer a sense of best practices, better code organization, and, most importantly, reduced time to market.

What does that have to do with you? Well, I’ve seen my share of PHP centric projects come and go. In addition to those projects, I’ve kept a watchful eye on projects in other communities such as the Ruby, Perl, Java and .NET communities. From them, we’ve borrowed concepts, ideas and tools to create better solutions for the PHP community. With that, I’ll continue on with explaining several of the most common facets of any PHP project. If this seems basic at first, it’s actually laying the groundwork for a few more in-depth articles down the line.

What is an Environment?

In PHP, the environment is the set of resources, capabilities and settings available for immediate use within the lifespan of any one PHP process. I know that’s a very general statement, but let’s explore that a bit. On most systems, you’ll find a php.ini file. This ini file generally sets values for the PHP process to initialize with when it starts up. Some of these can be modified by the SAPI (command line layer, Apache layer, etc.), while others can be modified during runtime via ini_set(), and others cannot be modified at all.

Each time a script is executed, it first inherits these php.ini values. This means, by default, if none have changed, a script is subject to the rules defined by the php.ini on the system. If these values (php.ini system values) are out of your control, this means that the script running has an ambiguous initial environment. This environment might have been defined by the system administrator or by the packager of the php distribution you are using.

If you are subject to an ambiguous environment setup, there is a greater chance your application will fail upon setup or during execution. At least one of these situations has come to plague most PHP developers at one time or another:

  • display_errors might be off, causing a WTF moment when an error arises.
  • error_reporting includes E_STRICT and the script was not written with that level in mind, thus creating hundreds of notices.
  • open_basedir was set and your script doesn’t have access to some resources it expects to have access to.

Those are just 3 of the more popular examples, stemming from 3 different keys that can be set within a php.ini. To put it in a bigger perspective: there are hundreds of these values. The point that most needs to be impressed is that any given PHP script or application should either check the environment at script startup, or at the least document all of the environment prerequisites and assumptions it makes. The ideal solution is to supply a script that will check the environment and report at installation time whether the ini values are correct.
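A minimal sketch of such a check might look like the following. The checkEnvironment() function and the expected values are hypothetical examples, and values are compared as raw strings, so a php.ini that says “On” where the script expects “1” would be reported as a mismatch; a real installer would want to normalize those.

```php
<?php
// Hypothetical installation-time check; the expected values are examples only.
function checkEnvironment(array $expected)
{
    $problems = array();

    foreach ($expected as $key => $wanted) {
        $actual = ini_get($key); // string on success, false for unknown settings

        if ($actual === false) {
            $problems[] = $key . ' is not a known ini setting in this PHP build';
        } elseif ((string) $actual !== (string) $wanted) {
            $problems[] = $key . ' is "' . $actual . '", expected "' . $wanted . '"';
        }
    }

    return $problems;
}

$problems = checkEnvironment(array(
    'display_errors' => '1',
    'open_basedir'   => '',
));

echo count($problems) === 0
    ? 'Environment looks OK' . PHP_EOL
    : implode(PHP_EOL, $problems) . PHP_EOL;
```

Returning the list of problems, rather than dying on the first one, lets an installer show everything that needs fixing in one pass.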

One of the more interesting environment variables in PHP, much like other languages and systems, is the common path. In PHP, the common path is called the include_path. The include_path just might be the most important php.ini based value to any script or project. During a PHP script’s runtime, files and components loaded by a relative path are generally looked up against the paths defined within the include_path. This means that any scripts or classes (effectively any PHP code) can be located and loaded with a path that is relative to any of the paths defined in the include_path.
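As a small sketch of how that lookup can be put to work, a script can prepend its own library directory to the include_path and then load code by relative path. The “library” directory name and the Zend/Acl.php file are assumptions about a hypothetical project layout.

```php
<?php
// Hypothetical library location; adjust to wherever your libraries live.
$libraryPath = __DIR__ . DIRECTORY_SEPARATOR . 'library';

// Prepend, so this copy of a library wins over any system-wide copy.
set_include_path($libraryPath . PATH_SEPARATOR . get_include_path());

// A relative path is now tried against each include_path entry in order.
// Assumes library/Zend/Acl.php exists:
// require_once 'Zend/Acl.php';

// stream_resolve_include_path() reports where a relative path would
// resolve, or false if it is not found on the include_path.
var_dump(stream_resolve_include_path('Zend/Acl.php'));
```

Because entries are searched in order, prepending rather than appending is what decides which copy of a library gets loaded when more than one is installed.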

The include_path is a pretty powerful thing. It makes it easier to bundle components and packages into “libraries”, and use them within projects. This helps facilitate DRY principles by encouraging good code reuse and solid library design. On the other hand, if you don’t properly manage the libraries that are on your include_path, this could pose some pretty significant problems down the line. More on that later though.

The general rule of thumb is this: take control of the php process’s environment as much as possible to ensure consistent behavior.
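In practice, taking control usually means a few lines at the very top of a bootstrap script. The following is a sketch only; the specific settings and values are examples I picked for illustration, not a recommended configuration.

```php
<?php
// Example values only; pick what suits your application.
error_reporting(E_ALL);
ini_set('display_errors', '1');
date_default_timezone_set('UTC');

// ini_set() returns false when a setting cannot be changed at runtime,
// so the result is worth checking for anything the application depends on.
if (ini_set('default_socket_timeout', '30') === false) {
    trigger_error('Could not set default_socket_timeout', E_USER_WARNING);
}
```

Setting these explicitly means the script behaves the same way regardless of what the system php.ini happens to contain, at least for the settings that can be changed at runtime.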

What is a Library?

It seems like library is a fairly generic term, but I want to add some specific meaning to it, at least in terms of PHP. A general definition of a library would effectively be a “collection of reusable code”, and that statement is true for all intents and purposes. For the purposes of this article, I’d like to take that a little further.

A library is a collection of components. While a library solves a less specific general problem, components solve a more specific general problem. Get it yet?

For demonstration purposes, I’ll use the Zend Framework, since I’m a little biased towards that one. The Zend Framework has a couple of libraries, the main one called the Standard Library. The ZF Standard Library solves a pretty general problem: “The PHP Application problem”. As you can see, that’s a fairly general (relatively speaking) problem it attempts to solve. This library is made up of several components that solve specific problems within the “PHP Application problem.” For example, Zend_View and Zend_Controller solve the “web application structure” problem. Zend_Form solves the “web forms” problem. So on and so forth. These are problems that can be solved with tried, tested, and true solutions. These solutions can generally be considered “best practices”. They are solved so that you can get onto solving the even more specific problems: those inside the “application”.

It’s worth noting that the definition of a library is also relative to the audience it’s targeted at. In our above example, the Zend Framework’s intended audience is all PHP developers. Your company, on the other hand, has a smaller target audience: its internal developers. Since that audience is a smaller and more concise group, their needs are more specific than those of the global developer community. That means that a company’s “library” might solve “more specific general problems” on a company wide scale. For example, a company might have 10 applications that use a single-sign-on system. Since those 10 applications within that company share the less specific problem of user sign on, that solution would be best fitted inside the company’s “library”.

In general, libraries solve problems that are generic enough for the entire intended audience, and each problem is solved by a component of the “library”. Everything else goes into your “application”.

What is an Application?

As hinted above in the section on libraries, an application too is defined by the problem it attempts to solve. An application is a collection of business specific code which solves a very specific business problem. Again, this sounds generic, but it can be further defined and explained.

A business problem is the most specific problem that can be solved with code; this is the application. It will be the sum of all target environments, target audiences, and target tasks that should be solved. These business problems have a very narrow focus. While applications can be further divided into specific areas of code, the whole of the application’s objective is to solve the business problem.

Depending on how complicated the target business problem is, an application might be modular. If an application is modular, that implies that the application’s problem area can be divided into even more specific areas of code with specific responsibilities. Let’s take a community website for example. The site might include forums, user management, mail, calendaring and news. Each of these respective areas of the site could be considered modules of the main application or website. While this is a generic example, it does demonstrate a logical division of responsibility, which is ultimately the point of introducing modules into an application. Each project and business should evaluate their application and decide up front how granular the application’s problem is, and how best to further divide it. Doing this up front will alleviate many issues that could arise later as the code base starts to grow.

Beyond the modularity of an application, a further, more logical division and organization of code is generally applied. While there are several paradigms of application organization, we’ll focus on the MVC architecture (if you are not familiar with the MVC architecture, it might be best to read the Wikipedia article first before moving forward). Both an application’s module and a non-modular application can be organized into Models, Views, and Controllers, the main constituents of the MVC paradigm. Without getting too involved in what MVC is, one should know that:

  • The model represents the code base for solving the business problem at hand in a UI and environment agnostic way.
  • The controller represents the code base responsible for bridging a user’s interaction with the UI to the business model, and setting up new UI.
  • The view represents the code base responsible for creating the environment specific UI.

The above grouping of purposes is what is called a separation of concerns.
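As a framework-free sketch of that separation, the three roles can be as small as the classes below. All class names here are hypothetical, and a real application would get routing, request handling and templating from its framework rather than writing them by hand.

```php
<?php
// All class names here are hypothetical and framework-free.

// Model: business rules only; no knowledge of HTML or the request.
class TodoList
{
    private $items = array();

    public function add($name)
    {
        $name = trim($name);
        if ($name === '') {
            throw new InvalidArgumentException('A todo item needs a name');
        }
        $this->items[] = $name;
    }

    public function all()
    {
        return $this->items;
    }
}

// View: turns data into environment specific output (HTML here).
class TodoListHtmlView
{
    public function render(array $items)
    {
        $html = '<ul>';
        foreach ($items as $item) {
            $html .= '<li>' . htmlspecialchars($item, ENT_QUOTES, 'UTF-8') . '</li>';
        }
        return $html . '</ul>';
    }
}

// Controller: maps user input onto the model, then picks a view.
class TodoController
{
    private $model;
    private $view;

    public function __construct(TodoList $model, TodoListHtmlView $view)
    {
        $this->model = $model;
        $this->view  = $view;
    }

    public function addAction(array $input)
    {
        $this->model->add(isset($input['name']) ? $input['name'] : '');
        return $this->view->render($this->model->all());
    }
}

$controller = new TodoController(new TodoList(), new TodoListHtmlView());
echo $controller->addAction(array('name' => 'Remember the <Milk>')) . PHP_EOL;
```

Swapping TodoListHtmlView for, say, a plain-text view for the command line would not touch TodoList at all, which is the practical payoff of keeping the model UI agnostic.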

Recap

Here is a recap of the terms defined within this article:

  • An Environment is the sum of all resources, capabilities and settings that exist in a PHP process. This generally includes what extensions and ini settings are preset for the PHP process.
  • A Library is a collection of code that solves a less specific problem, which is further defined by the library’s target audience and problem area.
  • A Component is a collection of code that solves a more specific problem within a library.
  • An Application is a collection of code that solves a specific business problem. Ideally, applications consume libraries and components to facilitate quicker and more standardized development.
  • A Module is a collection of code that solves a more specific atomic problem of the larger business problem. The sum of all modules within an application attempts to solve the larger business problem.
  • MVC is a way to group code within both a module and an application into a code base that facilitates a better separation of concerns.

The Semi-Official Zend Framework Pear Channel

January 7th, 2009 by Ralph Schindler

Pear Channel?

For the past few months, the ZF team has been playing with the idea of releasing ZF from a PEAR channel. Over the past 2 years, we have seen a few channels distributing ZF pop up here and there, which led us to believe there is an itch that needs scratching.

The compelling reason against a PEAR channel is that, with ZF, there is nothing to “install”. Just pop ZF in your include_path and off you go. You could obtain ZF from SVN via export, checkout or externals tag, or you could download it from the website. A PEAR channel (until recently) didn’t make enough sense because copying files from one location to another was all it would be doing.

ZF Grows beyond Component Library

That is … until ZF 1.8 (coming soon to developers near you). With 1.8, Zend_Tool will be going into production. I’ve chatted about it (#zftalk.dev@freenode), I’ve spoken about it (#zendcon08), and I’ve tweeted about it in recent months. But for those that don’t know, I can sum Zend_Tool up in 3 major aspects of functionality:

  • Zend_Tool_Framework is a dispatch system. While Zend_Controller has the Front Controller and web model hammered down pretty good, Zend_Tool_Framework is an introspective dispatch system for exposing its capabilities via command line (cli), XML-RPC, SOAP, or any other [insert your remoting platform of choice here].
  • Zend_Tool_Project is a profile driven system for managing project related resources and their relationships to one another, with the ability to create, remove and alter them within the lifecycle of a project’s development.
  • Zend_Tool_CodeGenerator is an abstracted system for generating code, including but not limited to PHP. Plans are in the works for generating Apache configuration files, ini and xml configuration files… all wrapped up in an API that is natural and similar to the APIs you’ve already become accustomed to inside ZF.

So, that said, what does this have to do with the PEAR channel? ZF is moving from a library of “runtime components” into more of a holistic framework with capabilities of code-generation, scaffolding, and project management, which complicates the process of installation. The PEAR installer is really good at installing code into an already running PHP stack, be it site wide or local. So, by delivering ZF through the PEAR channel, the complexity of installation is shifted off of the consumers and onto the delivery channel.

So what does “installing” mean? It means some elements of the package need to go into some pretty specific areas on your system for them to work correctly. For ZF, it means you will need to put zf.sh or zf.bat in your executable path, zf.php in the php_bin directory, and put the Zend Framework inside your include_path. If you’ve used tools like PHPUnit, PHPDoc, or some other framework, this type of “installation” should make sense to you. If not, go poke around your system after installation to better understand.
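
If you want a quick way to check the include_path part of that, something along these lines could work. This is only a sketch: findOnIncludePath() is a hypothetical helper written for this post, not part of ZF or PEAR, and it assumes that Zend/Version.php is a reasonable file to probe for.

```php
<?php
// Hypothetical helper: returns the first include_path entry that contains
// $relativeFile, or null when no entry does.
function findOnIncludePath($relativeFile, $includePath = null)
{
    if ($includePath === null) {
        $includePath = get_include_path();
    }
    foreach (explode(PATH_SEPARATOR, $includePath) as $dir) {
        if ($dir === '') {
            continue;
        }
        $candidate = rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . $relativeFile;
        if (is_file($candidate)) {
            return $dir;
        }
    }
    return null;
}

// Assumption: Zend/Version.php is used as the marker file for the framework.
$dir = findOnIncludePath('Zend/Version.php');
echo $dir === null
    ? "Zend Framework was not found on the include_path\n"
    : "Zend Framework found under $dir\n";
```

Run it with the same php binary your project uses, since the CLI and the web server can be configured with different include_path values.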

Details

So, onto the technical details. If you want to see what it can do, first discover and install:

(discover the zf channel)
/my/path# pear channel-discover pear.zfcampus.org

(install zf-devel)
/my/path# pear install zfcampus/zf-devel

(or for something stable)
/my/path# pear install zfcampus/zf

More information will be posted on http://pear.zfcampus.org as it becomes available (this includes other packages in the channel, and other releases like beta and alpha).

To see Zend_Tool in action:

/my/path# mkdir tmp; cd tmp;
/my/path/tmp# zf create project

Now, go explore the project that was created. In addition to that, you can also run “zf show profile” and it will generate a tree of your project. There will be more updates, and more providers available in the coming weeks to show off what we’ve been developing for Zend_Tool. Also keep Zend_Application in mind because as it formalizes, it will be the target of what we will be generating from Zend_Tool and the zf command line interface.

Details, Details, DETAILS!

As mentioned previously, the PEAR channel is beta. What could be beta about it, you ask? Well, for one, the package and release plan that comes along with it. As of this writing, here is the plan:

  • ZF Package
    • Stable (no version modifier)
      • source: tag
      • schedule: on tag
    • RC – Release Candidate
      • source: tag
      • schedule: on tag
    • Beta (beta)
      • source: branch of current release branch
      • schedule: weekly
      • version: current + 1 mini
    • Alpha (alpha)
      • source: trunk
      • schedule: weekly
      • version: current + 1 minor
    • Development (devel)
      • source: trunk patched with selected incubator components
        • maintained in a file in incubator (locally for now)
      • schedule: weekly (or on demand)
      • version: current + 1 minor
  • ZF_Minimal Package
    • (scheme same as above)
    • Source modified
      • no tests
  • ZF_Extras Package
    • planning
  • ZF_Laboratory Package
    • planning
  • ZF_Doc_Lang Package (maybe)
    • planning

This might get tweaked over time, but the idea is pretty solid. Stable releases come from tags, as do release candidates (and patch releases if they exist, not mentioned here). Betas are considered the next mini release, and alphas the next minor release. Development is super developmental, as you can see, since it’s cut from trunk with selected incubator components.
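
The “current + 1 mini” and “current + 1 minor” arithmetic from the plan can be sketched as below. nextBuildVersion() is a hypothetical helper, not something that ships with the channel, and it makes two assumptions that the plan does not spell out: versions are written as major.minor.mini, and the mini number resets to 0 when the minor number is bumped. It returns only the numeric part, since how the beta/alpha/devel modifier is attached is not covered here.

```php
<?php
// Hypothetical helper: computes the numeric version a weekly build would
// carry, given the current release as "major.minor.mini".
function nextBuildVersion($current, $stability)
{
    if (!preg_match('/^(\d+)\.(\d+)\.(\d+)$/', $current, $m)) {
        throw new InvalidArgumentException("Expected major.minor.mini, got '$current'");
    }
    list(, $major, $minor, $mini) = $m;

    switch ($stability) {
        case 'beta':
            // Beta: current + 1 mini
            return sprintf('%d.%d.%d', $major, $minor, $mini + 1);
        case 'alpha':
        case 'devel':
            // Alpha and devel: current + 1 minor
            // (assumption: mini resets to 0 on a minor bump)
            return sprintf('%d.%d.%d', $major, $minor + 1, 0);
        default:
            // Stable and RC are cut from tags, so there is nothing to compute.
            throw new InvalidArgumentException("No computed version for '$stability'");
    }
}

echo nextBuildVersion('1.7.2', 'beta'), "\n";   // 1.7.3
echo nextBuildVersion('1.7.2', 'alpha'), "\n";  // 1.8.0
```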

More details will be forthcoming, as I’m sure you will have questions in search of answers. Till then…

Happy ZF-ing!

Slides for ZendCon Talks

September 17th, 2008 by Ralph Schindler

This blog entry will be a working page for those who would like to download my slides and code for my talks at #zendcon.

Zend_Tool Talk