Compiling Gearman (or anything) for Zend Server CE on Snow Leopard

May 12th, 2010 by Ralph Schindler

The first thing you need to know about Mac OS X Snow Leopard is that all current Macs and MacBook Pros are 64-bit capable hardware. This does not necessarily mean you are running a 64-bit kernel; it simply means that the operating system is capable of executing x86 64-bit executables. We won’t go into the details of kernel architecture; you can read more about that here.

What is important, though, is that both x86_64 and i386 executables can run on Snow Leopard. It is not uncommon on OS X to have executables (and libraries) with multiple architectures compiled in. To see what architectures are inside a particular file, run something like this:

    /usr/local# file /usr/bin/php
    /usr/bin/php: Mach-O universal binary with 3 architectures
    /usr/bin/php (for architecture x86_64): Mach-O 64-bit executable x86_64
    /usr/bin/php (for architecture i386):   Mach-O executable i386
    /usr/bin/php (for architecture ppc7400):        Mach-O executable ppc

    /usr/local# file /usr/local/zend/apache2/bin/httpd
    /usr/local/zend/apache2/bin/httpd: Mach-O executable i386

This means that PHP (supplied by Apple) has been compiled with three architectures inside. What does that mean? It means there are basically three versions of PHP compiled into a single binary, and when it is loaded into memory, only one of them is used at a time. To demonstrate, let’s take a pretty common difference between 32-bit and 64-bit architectures: integer size. A 64-bit integer can hold a far larger range of values than a 32-bit one. The following demo runs different architectures from the same binary:

    /usr/local# arch -arch x86_64 /usr/bin/php -nr 'echo PHP_INT_MAX;'
    9223372036854775807

    /usr/local# arch -arch i386 /usr/bin/php -nr 'echo PHP_INT_MAX;'
    2147483647

We know we are running the same command through different architectures because PHP reports a different maximum integer size for each.
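The same check can be made from inside a script, which is handy when you are not sure which slice of a universal binary your web server actually loaded. This is only a minimal sketch: it relies on the built-in PHP_INT_SIZE constant (integer size in bytes), and it assumes a platform like OS X where the integer size follows the architecture of the binary. The $bits variable is just a name chosen for illustration.

```php
<?php
// PHP_INT_SIZE is the size of a PHP integer in bytes for the running binary.
// Assumption: on OS X this is 4 for the i386 slice and 8 for the x86_64 slice.
$bits = PHP_INT_SIZE * 8;

echo 'This PHP process uses ' . $bits . '-bit integers', PHP_EOL;
echo 'PHP_INT_MAX is ' . PHP_INT_MAX, PHP_EOL;
```

Dropping this into a page served by Zend Server CE should tell you which architecture your extensions need to match.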

The next important thing to understand is the nature of the PHP stack. PHP is generally regarded as a glue language. That might mean several things to different people, but we will look at this statement strictly in the purest technical sense. PHP is made of the core language and features, but also a rich set of extensions. These extensions are typically written in C and interface with PHP’s C layer (PHPAPI). Most of the really useful extensions are linked against libraries on your system. For example, the openssl set of functions is not actually implemented in PHP’s source code; the openssl extension is simply a wrapper that calls out to libssl.so (or .dylib on Mac, .dll on Windows). This is what is meant by PHP being a glue language/platform.

Since PHP relies on existing compiled libraries, you further have to understand how things are linked and compiled. There are generally two options here: linking dynamically, or compiling statically. Either way, one thing remains true: you cannot mix architectures. This means that if your apache/mod_php and/or php binary are i386 only, then all of the libraries on your system that will be used must contain the i386 architecture. Likewise, if your apache/mod_php and/or php binary are x86_64 only, then all of your libraries must contain the x86_64 architecture. If they do not, you will get a message like this:

    PHP Warning:  PHP Startup: Unable to load dynamic library '/usr/local/zend/lib/php_extensions/gearman.so' - dlopen(/usr/local/zend/lib/php_extensions/gearman.so, 9): no suitable image found.  Did find:
    /usr/local/zend/lib/php_extensions/gearman.so: mach-o, but wrong architecture in Unknown on line 0

Now that we understand that executables and libraries can have multiple architectures, let’s get to the task at hand: making sure new extensions can run with Zend Server CE.

Zend Server CE for Mac (as of this writing) comes compiled as an i386 executable only. This includes the PHP binary, PHP library, and Apache binaries that ship with ZSCE. While ZSCE works great out of the box with all the provided extensions, you might find that you want some additional 3rd party PHP extensions compiled/linked into this stack. That’s where things get a little confusing, and in this post, we’ll look at how to install the gearman extension.

PHP Extensions are basically wrappers around existing libraries, so generally, these extensions require the base library to already be on the system. In our case, we need “libgearman” compiled and on our system for us to be able to compile and use the PHP Gearman Extension.

At this point, I would generally instruct you to compile Gearman with multiple architectures and install it (--prefix=/usr/local). (Note: to compile for multiple architectures, simply do the following):

    export CFLAGS='-arch i386 -arch x86_64'

In the particular case of Gearman, this will not work as the Gearman makefile utilizes flags that are not compatible with multiple architecture targets. As such, we go to plan B.

Plan B is something I generally do to keep my system clean: statically building libraries. I have a personal rule of not keeping i386 only libraries installed in common places like /usr/lib or /usr/local/lib, in this case /usr/local/lib/libgearman.dylib. Since this is the case, I’ll build Gearman statically, compile it into the PHP Gearman Extension, and this will allow me to remove the temporary Gearman installation which will have to be i386 only.

    # check to ensure we have a multi-arch libevent (if not go create it as
    # normal with CFLAGS="-arch i386 -arch x86_64" and install to /usr/local)

    /usr/local/src/gearmand-0.13# file /usr/local/lib/libevent.dylib
        /usr/local/lib/libevent.dylib: Mach-O universal binary with 2 architectures
        /usr/local/lib/libevent.dylib (for architecture i386):  Mach-O dynamically linked shared library i386
        /usr/local/lib/libevent.dylib (for architecture x86_64):        Mach-O 64-bit dynamically linked shared library x86_64

    # next compile gearman to a temp location

    /usr/local/src/gearmand-0.13# export "CFLAGS=-arch i386"
    /usr/local/src/gearmand-0.13# ./configure --disable-shared --prefix=/usr/local/gearman-tmp
    /usr/local/src/gearmand-0.13# make && make install
        [gearman installed now, this should only have static files]

    # ensure we only have a .a library file for gearman
    /usr/local/src/gearmand-0.13# ls /usr/local/gearman-tmp/lib/
        libgearman.a    libgearman.la   pkgconfig

    # make sure zend/bin is first on your PATH
    /usr/local/zend/tmp# echo $PATH
        /usr/local/zend/bin:/var/root/.bin:/usr/local/git/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin
    /usr/local/zend/tmp# which phpize
        /usr/local/zend/bin/phpize

    # next, go to our zend server location, and pull down gearman extension
    /usr/local/src/gearmand-0.13# cd /usr/local/zend/tmp/
    /usr/local/zend/tmp# pecl download gearman-beta
        downloading gearman-0.7.0.tgz ...
        Starting to download gearman-0.7.0.tgz (29,258 bytes)
        .........done: 29,258 bytes
        File /usr/local/zend/tmp/gearman-0.7.0.tgz downloaded

    # next, unpack, phpize, and statically compile
    /usr/local/zend/tmp# tar zxf gearman-0.7.0.tgz
    /usr/local/zend/tmp# cd gearman-0.7.0
    /usr/local/zend/tmp/gearman-0.7.0# phpize
        Configuring for:
        PHP Api Version:         20090626
        Zend Module Api No:      20090626
        Zend Extension Api No:   220090626
    /usr/local/zend/tmp/gearman-0.7.0# ./configure --with-gearman=/usr/local/gearman-tmp/ --disable-shared
    /usr/local/zend/tmp/gearman-0.7.0# make
    /usr/local/zend/tmp/gearman-0.7.0# make install
        Installing shared extensions:     /usr/local/zend/lib/php_extensions/

    # Now go add extension=gearman.so to your php.ini file inside /usr/local/zend/etc/php.ini

    # Now go check that php will have gearman support
    /usr/local/zend# php -i | grep gearman
        gearman
        gearman support => enabled
        libgearman version => 0.13

    # Since we statically compiled it, we can remove our temp install of gearman
    /usr/local/zend# rm -Rf /usr/local/gearman-tmp/

At this point, you have a 3rd party PECL extension that is compiled and working with ZSCE on Mac OS X.

PHPundamentals Series: A Background on Statics (Part 1 on Statics)

May 6th, 2010 by Ralph Schindler

Just beyond reading the title, you’ve more than likely come to this article as the curious yet uninformed, the mad and raving lunatic, or as an enlightened one. Static class members (from here on called simply, “statics”) in PHP conjure both the best and worst in developers for a variety of reasons. In part 1 of this series of articles on statics, we’ll explore some background to get a better understanding of statics in PHP.

Some Static Background And Understanding

Before we can move into the arguments that surround statics, we first need to understand what they are in the context of PHP. The core of the PHP language and runtime has some pretty big parallels with the Java/JVM and C#/.NET language platforms. The biggest, and most important for the purposes of this article, is PHP’s object model. Like Java and .NET, PHP follows a class-based, single-inheritance, multiple-interface model, a tenet described by the grandfather of OO languages: Smalltalk. Of course, PHP applies its own “perspective” when it comes to implementation details such as typing, casting, mixed-paradigm usage, and so on; but the foundation for the object model is clearly defined.

That said, it is easy for the PHP community to draw comparisons and, more importantly, “borrow” best practices from both the Java and .NET communities. We certainly have borrowed our fair share with regard to development-time tools, infrastructure tools and design patterns. Over the past 5 to 7 years, there has been an increasing adoption of best practices and patterns from the enterprise Java community, particularly in the form of two major texts: GoF and PoEAA. The GoF (Gang of Four) text primarily discusses best practices in the form of code structure and reuse: factory, singleton, adapter, composite, facade, iterator and observer to name a few. PoEAA (Patterns of Enterprise Application Architecture), on the other hand, attempts to solve higher order problems, particularly architectural problems at the application layer: MVC, Page Controller, Front Controller, Domain Model, Table and Row Gateway, and so on. While the examples are primarily executed in Java, they are structurally similar when implemented in PHP, so much so that PHP developers can read the Java examples as pseudo-code. This is what makes these patterns so applicable and thus popular in the PHP community.

Since we now know where these usage patterns originated, we should have a look at the target language platform: PHP. The key concept that distinguishes the PHP platform from the JVM and .NET platforms is that PHP by default assumes a shared-nothing architecture. What does this mean? It means that out of the box, PHP is not a persistent application platform. PHP’s runtime is built primarily around solving the web problem. In turn, since the web is request driven, you might say that an application written in PHP is also request driven. Put another way, the scope of your application is bound to a single request. The shared-nothing aspect means that the state of the application is built up and torn down at the start and completion of each request to your application. Conversely, Java and .NET offer a persistent application stack, which means the application’s state exists separately from the requests that come in via the web server. So, in PHP, the many requests each contain a single running instance of your application. In Java/.NET, the single running application handles the many requests.

Statics in Analogies

Still don’t get it? Let’s talk in a couple of analogies. Let’s assume we’ve built a basic application with the “out-of-the-box” technologies offered; one built on top of PHP and the other built on top of Java (or .NET, you can choose). With your Java/.NET application, if a request is never received from your web server, the application is indeed still running. In PHP, if a request is never received from your web server, the application has NEVER run. The runtime of a Java/.NET application might be hours or days, whereas the runtime of a PHP application is as long as it takes to service the request. This analogy’s mileage may vary, and it is surely intended for demonstrative purposes. You could throw any number of monkey wrenches into it, but for all intents and purposes, it’s correct and it works.

Understanding the full scope of an application’s runtime state is the most important step in understanding the role of static class members in OO programming. Static class members live as long as the application runtime is valid and alive. What this means is that any class member state that has been set during any operation during the application’s runtime will persist until the application ceases to exist. Looking back at our main platform differences, we can see that in the Java/.NET platform, static members created in the scope of an application layer will be around until someone pushes the “shutdown” button on that application. This could mean a static member or static state is persisted for hours, days, or even longer. Like these persistent application stacks, PHP will destroy any static members and state at the end of the application’s lifecycle. Unlike these persistent application stacks, the application lifecycle ends with the completion of a web request. This means that static members and static state in PHP, for the average web application, stick around for seconds or less and are only valid in the context of a single web request.
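To make the lifetime point concrete, here is a minimal sketch. RequestCounter is a hypothetical class invented purely for this illustration; it is not part of PHP or any framework.

```php
<?php
// Hypothetical class for illustration only: it keeps a count in a static member.
class RequestCounter
{
    private static $count = 0;

    public static function increment()
    {
        return ++self::$count;
    }
}

// Within a single request (a single run of this script),
// the static state accumulates across calls.
RequestCounter::increment();
RequestCounter::increment();
echo RequestCounter::increment(), PHP_EOL;
```

Within one run the count climbs, but it is worth noticing what happens on the next run: under the shared-nothing model described above, the script starts from a fresh runtime and the count begins at zero again. In a persistent Java/.NET style application, an equivalent static counter would keep climbing for as long as the application stays up.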

Statics in Pictures

Still don’t get it? Let’s have a look at a few images to better explain these concepts.

The following images will attempt to explain the various layers of a web application, one from the perspective of the JVM/.NET platform, the other from the perspective of the PHP platform. (For all intents and purposes, the PHP platform could also be any scripting language executed by an apache module or fastcgi.)

The green layer is the web server layer, this is the process that will attach to port 80 and listen for requests. The blue layer represents the application process itself. This layer is responsible for global application state and class-based static state. The orange layer is a request which comes in from the web, this is typically what we’ve called a page request. Inside of each web request is the yellow layer, which represents the page-lifecycle. In terms of the application, this is where all of the request specific application routines happen including page startup and business logic.

Contrasted against …

The most important thing to take away from these images, particularly with respect to understanding statics, is the blue layer, or the layer that best represents the scope of globals and static members. This is the heart of what is meant by a “shared-nothing” architecture. It is this key difference that affects how we architect the code for our web applications.

In the next article in this series, we’ll have a look at PHP’s application architecture in greater detail and how it solves problems that might arise from a shared-nothing style architecture, why this architecture is arguably better for the web and cloud based services, but most importantly, how statics fit into this paradigm.

The Anatomy Of A Bug/Issue Reproduction Script

February 18th, 2010 by Ralph Schindler

“There is a problem with component Fooey-Bar-Bazzy, I think it’s related to Nanny-Nanny-Neener. Please Fix Now.” If you’ve written a bug/issue report like that in the past with no other details, shame on you! This may come as a shock, but as great as some developers might be, they cannot read minds. Each has their own way of coding, their own custom working environment and their own favorite tools, on top of variances in coding standards and best practices. Some could argue these little intricacies are outside of the realm of coding standards and best practices and that these are the differences between good, great, and even terrible developers. Each developer has a different opinion on how particular applications, libraries of code, or even features of a particular project are expected to behave in practice. These varying expectations are why bugs/issues exist. No one developer producing code for mass consumption can anticipate every possible use case. Additionally, no one developer can replicate every environment surrounding every preconceived use case. There are simply not enough resources at hand, be it in the form of a variety of systems or simply the number of hours in a developer’s day.

With that in mind, I write this as a plea to all developers to be good to the maintainers of the code you use. In the simplest form of advice, I suggest that before you click submit on that bug/issue report form, you ask yourself two questions: “Did I do enough due diligence in determining if this is really a bug?” AND “If I got this bug report, would I be able to reproduce it, let alone understand it?” If the answer is YES to both of those questions, go ahead and click submit. If your answer is no, you’ve got some more work to do.

Some Tenets Of the Good Reproduction Script

In this short article, I’d like to outline a few details of what should go into a bug/issue report. These are some simple guidelines that should be considered when you write a bug/issue report. It should be noted that this list is by no means exhaustive, but if you at least consider the list below before clicking submit, you’ll make a code maintainer’s day. I promise.

  1. List Out All Assumptions Clearly

    PHP specifically is well known for being a “glue language”. What that means is that PHP is generally sitting between multiple pieces of software that are, of course, not PHP. Each of these pieces of software has its own configuration and environment that PHP is “gluing” together. That being the case, any assumptions about non-PHP software should be clearly listed in the reproduction script. This could include the database flavor and its settings, a PHP library component, or perhaps a specific version of an extension that is being used and the underlying unmanaged/C-based library your PHP environment is consuming.

  2. Use The Shortest Possible Use Case

    As tempting as it is to copy a script from your project and paste it into the bug/issue submission box, don’t do this. If you are truly invested in seeing the bug/issue fixed in a timely fashion, take the time to create a small reproduction script. In this script should be the absolute minimal amount of code to demonstrate to another human that there is indeed a problem that needs solving. By keeping the script minimal and short, you are also removing any other distractions from the script that otherwise might confuse the maintainer and prevent him from fully understanding the real problem.

  3. Use Generic Yet Meaningful Names

    It cannot be stressed enough that non-meaningful names should be avoided at all costs. As mentioned above, you want as few distractions as possible in the use case. For example, supplying your database table of customers, with first_name, last_name, etc., has virtually nothing to do with the problem at hand. In cases where table and column names are ancillary to the actual problem, they should be generalized: a table named ‘foo’, and columns named ‘bar1’ and ‘bar2’. Unless …

    … the variable name can add context to the problem. What does this mean? $customer would be bad, but $faultyTableObject is good. The latter naming makes it easy for the maintainer to focus on the variable that needs to be tracked leading up to the problem.

  4. Document Both What You Expect, And The Actual Result

    Claiming something is broken without stating what you expect and what the actual result is offers next to nothing to the maintainer attempting to fix the problem. Generally speaking, most use cases that end up being bugs/issues are outside of the original preconceived use cases for the actual component. That said, the maintainer is going to need the context of the use case that you’ve found to be problematic. It also helps to point out any existing documentation that describes the more well-defined use cases, and how your use case relates to and/or deviates from those already defined use cases.

  5. Make The Reproduction Script As Generic As Possible

    Perhaps this is redundant, but it’s important to know the minimal requirements for reproducing a bug/issue. You are not expected to be an expert on how to fix the actual problem, but you should do your own due-diligence in order to hand the problem off to the maintainer. It’s already been said to “List out all assumptions clearly”, but it is just as important to peel off any specific pieces of the problem that are not directly part of the problem.

    This concept can best be described by example. While MySQL is a widely available database platform, SQLite is widely known as the easiest to use and most portable database platform, at least in the PHP runtime. If you find a problem while using MySQL, but it’s clear it can be replicated using SQLite, use SQLite. SQLite is built into PHP by default, and in a single script, you can create a memory based database and its schema in just a few lines of code.

    Sometimes an issue cannot be described in a single script. This is ok. This would be the case if, for example, you found an issue in a larger system, like Zend Framework’s MVC layer. In this case, it makes sense that you need to provide a minimal ZF project to demonstrate the issue. Again, make sure to use as few files and as little code as possible to demonstrate the issue. Also, in the spirit of using generic code, be sure to make all file system paths relative. This will help the maintainer get up and running with the problematic project in a minimal amount of time, with minimal configuration.
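As a rough illustration of the SQLite suggestion in point 5, the sketch below builds an in-memory database and a generic schema in a handful of lines. It assumes the pdo_sqlite extension is enabled in your PHP build; the table name foo and the columns bar1 and bar2 are simply the generic names suggested in point 3, not anything taken from a real issue.

```php
<?php
// Assumption: the pdo_sqlite extension is available in this PHP build.
// ':memory:' creates a throwaway database that lives only for this script run.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Generic schema: names carry no project-specific meaning.
$pdo->exec('CREATE TABLE foo (id INTEGER PRIMARY KEY, bar1 VARCHAR(25), bar2 VARCHAR(25))');

// Prepared statement so the test data needs no manual quoting.
$stmt = $pdo->prepare('INSERT INTO foo (bar1, bar2) VALUES (?, ?)');
$stmt->execute(array('alpha', 'bravo'));

// Fetch everything back so expected and actual values can be compared.
$rows = $pdo->query('SELECT id, bar1, bar2 FROM foo')->fetchAll(PDO::FETCH_ASSOC);
var_dump($rows);
```

Because nothing touches the file system or a database server, a maintainer can paste a script like this into a file and run it without any setup.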

A Reproduction Script By Example

The following is a reproduction script I have written based on an issue (ZF-3709) provided to Zend Framework in our issue tracker. I chose this issue to write a reproduction for because it offers the ability to talk about how one might go about describing the environment, more specifically what the database should look like in order to replicate the problem.

(This script can also be found at http://gist.github.com/307396)

<?php

/**
 * This reproduction script shall accompany the issue reported at
 * http://framework.zend.com/issues/browse/ZF-3709
 *
 * Assumptions:
 *   Zend_Db_Table_* from trunk
 *   PHP Environment has SQLite with :memory: capabilities
 *
 * Result:
 *   This script should run without any assertions failing (empty output)
 */



// ensure that Zend Framework trunk is being tested against & classes are available
// set_include_path('/path/to/ZendFramework/library');
require_once 'Zend/Loader/Autoloader.php';
Zend_Loader_Autoloader::getInstance();

// setup the adapter, this uses SQLite so that its minimally invasive
// to anyone wishing to reproduce the issue on their local machine
$dbAdapter = Zend_Db::factory(
    'Pdo_Sqlite',
    array('dbname' => ':memory:')
    );

// ensure all tables have access to the adapter
Zend_Db_Table::setDefaultAdapter($dbAdapter);

// setup the database, classes, & assertion system
setup();



/**
 * BEGIN Reproduction Code
 */



// find a record that has a relationship to some bars through foo_to_bar
$fooTable = new Foo();
$fooRow = $fooTable->fetchRow('id = 2');
$fooIdTwosBars = $fooRow->findManyToManyRowset('Bar', 'FooToBar');

// the expected values for the next call
$expectedValues = array(
    array('id' => '2', 'name' => 'bravo'),
    array('id' => '3', 'name' => 'charlie')
    );


// when we loop through the rows, they should match the expected results above
foreach ($fooIdTwosBars as $index => $barRow) {
    // I'll use assert here to throw warnings when expected does not match actual
    $actualValue = $barRow->toArray();
    assert($expectedValues[$index] === $actualValue);
}



/**
 * END Reproduction Code
 *
 * Supporting code below
 */


// setup function
function setup() {
    setup_database();
    setup_classes();
    setup_assertions();
}

// This function will setup the proper database structure with test data
function setup_database() {
    global $dbAdapter;

    $conn = $dbAdapter->getConnection();
    $conn->query('
        CREATE TABLE foo (
            id INTEGER PRIMARY KEY,
            name VARCHAR(25)
            );
        ');

    foreach (array('one', 'two', 'three', 'four') as $numberName) {
        // single quotes are the standard SQL string delimiter
        $conn->query("INSERT INTO foo (name) VALUES ('" . $numberName . "');");
    }

    $conn->query('
        CREATE TABLE bar (
            id INTEGER PRIMARY KEY,
            name VARCHAR(25));
        ');

    foreach (array('alpha', 'bravo', 'charlie', 'delta') as $word) {
        // single quotes are the standard SQL string delimiter
        $conn->query("INSERT INTO bar (name) VALUES ('" . $word . "');");
    }

    $conn->query('
        CREATE TABLE foo_to_bar (
            id INTEGER PRIMARY KEY,
            foo_id INTEGER,
            bar_id INTEGER,
            extra VARCHAR(20)
            );
        ');
    $datas = array(
        array('foo_id' => 2, 'bar_id' => 2, 'extra' => 'Two to Two'),
        array('foo_id' => 2, 'bar_id' => 3, 'extra' => 'Two to Three'),
        array('foo_id' => 3, 'bar_id' => 4, 'extra' => 'Three to Four'),
        );
    foreach ($datas as $datum) {
        // values are wrapped in single quotes, the standard SQL string delimiter
        $conn->query('INSERT INTO foo_to_bar '
            . '(' . implode(',', array_keys($datum)) . ')'
            . " VALUES ('" . implode("', '", array_values($datum))
            . "');");
    }
}

// This function will define the proper Zend_Db_Tables and their relationships
function setup_classes() {

    class Foo extends Zend_Db_Table_Abstract
    {
        protected $_name = 'foo';
    }

    class Bar extends Zend_Db_Table_Abstract
    {
        protected $_name = 'bar';
    }

    class FooToBar extends Zend_Db_Table_Abstract
    {
        protected $_name = 'foo_to_bar';
        protected $_referenceMap = array(
            'Foo' => array(
                'columns' => 'foo_id',
                'refTableClass' => 'Foo',
                'refColumn' => 'id'
                ),
            'Bar' => array(
                'columns' => 'bar_id',
                'refTableClass' => 'Bar',
                'refColumn' => 'id'
                )
            );
    }

}

// assertion setup
function setup_assertions() {
    assert_options(ASSERT_ACTIVE, true);
    assert_options(ASSERT_WARNING, false);
    assert_options(ASSERT_CALLBACK, 'assert_failure');
}

// callback for assertion failures
function assert_failure() {
    global $expectedValues, $index, $actualValue;
    echo 'Was expecting an array that looked like:' . PHP_EOL;
    var_dump($expectedValues[$index]);
    echo 'But got array that looked like:' . PHP_EOL;
    var_dump($actualValue);
    echo PHP_EOL . PHP_EOL;
}

To the best of my ability, this script passes both of my earlier questions: “Yes, I did enough due diligence in determining if this is really a bug.” AND “Yes, if I got this bug report, I would be able to reproduce it and understand it.”

A Few Considerations

The above script does not have unit tests, nor does it represent a patch to the existing framework. While that would be ideal, it sets the bar much too high for people to report worthwhile issues. The consumers of the code are not expected to be experts on the actual issue at hand, or even on how to write valid unit tests that fully exercise a feature or bug. Ultimately, as a code maintainer, I simply want to be able to see the issue you are attempting to describe.

If you’d like to go above and beyond the standard reproduction script, you might also consider pointing out lines of code that you feel might be problematic. That allows maintainers to set breakpoints at specific locations and really drill down into the offending code.

I hope this helps developers understand what is expected of them as they file issue reports on open source code they use. By following these guidelines you’ll be doing a service to the maintainer by making their life easier, and to yourself as well, since issues with reproduction scripts get a quicker turnaround than those that require in-depth research.