Coding Codes

github.com/ucirello | @ucirello in gophers.slack.com

Notes on PHP Programming

24 May 2015

Introduction

You are going to find a lot of overlap with Rob Pike’s “Notes on Programming in C”. It is more of a coincidence: often we would agree on the topic but disagree on the intensity of the opinion.

I do not want to write a guide or a standard about PHP programming here. These are opinions that I have been thinking about and accumulating for a long time. They are based on experience, so I hope they will help you think about the details of writing PHP code. If you disagree with them or think they twist some PHP design decision, fine. But if they spark thoughts on why you disagree with me, that is even better.

Your comments are welcome.

Coding style

The single most common aspect of PHP code, since version 3, is that no two programmers agree on a coding style. I would not mind different projects having different standards, but it is quite common to see several styles in the same file.

Fortunately this is no longer necessary: tools such as php.tools and PHP CS Fixer can solve the problem for you. Use them. There are several standards, the most important nowadays probably being PSR-2. Whichever you prefer, pick one standard and use the tools to apply it throughout the source code.

Remember to separate commits that change logic from those that change code style.

Variable names

I consider variable names the fingerprint of a developer in source code. Even if the code follows a single style, PHP developers still keep their own naming practices.

Again, no two developers agree on variable naming in the PHP world, and the possibilities are endless. The following list is not exhaustive, but it should give you a fair notion of the problem: those who use Hungarian notation the wrong way, those who use Hungarian notation in some right way, those who prefer short names everywhere, those who use long names everywhere, those who believe names are meaningless and so use cryptic names, those who believe form and content walk together and so embed the type into the name (a slightly different problem from Hungarian notation: the former will write $int_i, the latter will write class SomethingClass, and all its instances will be of the type SomethingClass), and many other cases.

Variable names are important, but usually they need context to be meaningful, so a smaller context allows for smaller names. $i for a local counter is good, but $obj for a global variable is a big no-no.
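As a small sketch of that rule (the constant and function names below are made up for illustration), a name visible everywhere earns a long, descriptive form, while names that live for three lines can stay short:

```php
<?php
// Hypothetical example: name length follows the size of the scope.

// Wide scope: visible everywhere, so the name has to explain itself.
const MAX_LOGIN_ATTEMPTS = 3;

// Narrow scope: $i and $n only live inside this loop, short names are enough.
function sumOfSquares(array $numbers)
{
    $sum = 0;
    for ($i = 0, $n = count($numbers); $i < $n; $i++) {
        $sum += $numbers[$i] * $numbers[$i];
    }
    return $sum;
}

echo sumOfSquares([1, 2, 3]), "\n"; // prints 14
```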

There are no good reasons for Hungarian notation: PHP is not (yet) a strongly typed language, but even if it were, the compiler knows the variable type and it is able to do many type conversions on your behalf.

One could argue that proper use of Hungarian notation is still valid, for instance to indicate whether a variable is tainted or to help name complex structures. Both arguments are misguided: if you need tainting, use the taint extension. If your structures are too complex, then you have a design problem.

I do use Hungarian notation at times, though implicitly, and only when it improves clarity, such as:

if ($hasAttr || $isValid) {...}

(note the prefix “has” and “is”)

References

References are good tools and help you improve clarity a lot. The deeper you go into a data structure, the more you will benefit from using them.

Thus:

$a[$b][$c][$d] = 10;
$a[$b][$c][$e] = 20;

can be made more concise with:

$item = &$a[$b][$c];
$item[$d] = 10;
$item[$e] = 20;
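A runnable sketch of the same shortcut, with made-up data ($config and its keys are assumptions for the example). It also shows two details worth knowing: it is safer to unset() the alias once you are done with it, and taking a reference to a dimension that does not exist yet creates it holding null.

```php
<?php
// Hypothetical data; $config and its keys are made up for illustration.
$config = ['db' => ['primary' => []]];

// Alias the deep element once, then write through the alias.
$item = &$config['db']['primary'];
$item['host'] = 'localhost';
$item['port'] = 3306;
unset($item); // drop the alias so later writes to $item do not touch $config

// Taking a reference to a missing dimension creates it, holding null.
$replica = &$config['db']['replica'];
var_dump(isset($config['db']['replica']));            // bool(false): null counts as not set
var_dump(array_key_exists('replica', $config['db'])); // bool(true): the key now exists
unset($replica);
```

In other words, isset() keeps answering "is there a value here?", while array_key_exists() will notice the key that the reference created.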

You can also point a reference at an undeclared variable or dimension: at the cost of one null initialization, isset() will still tell you correctly whether it has been set or not. Conversely, references in foreach values are tricky and should be avoided. One example of bad use:

foreach ($arr as &$i) {
	// change $i
}
foreach ($arr as $i) {
	// $i still references the last element of $arr, so that element is overwritten on every iteration.
}
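A minimal runnable demonstration of the pitfall, using throwaway arrays of my own choosing. The second half shows one possible way out, which is to unset() the loop variable as soon as the by-reference loop ends.

```php
<?php
// The pitfall: $i outlives the first loop as a reference to the last element.
$arr = [1, 2, 3];
foreach ($arr as &$i) {
    $i *= 2;            // $arr is now [2, 4, 6]
}
foreach ($arr as $i) {
    // each iteration copies the current value into $arr[2]
}
print_r($arr);          // [2, 4, 4]: the last element was lost

// One way out: break the reference right after the loop.
$fixed = [1, 2, 3];
foreach ($fixed as &$j) {
    $j *= 2;
}
unset($j);              // $j no longer points into $fixed
foreach ($fixed as $j) {
    // plain by-value iteration, nothing is overwritten
}
print_r($fixed);        // [2, 4, 6]
```

Iterating with keys and writing to $arr[$key] avoids the reference altogether, at the cost of slightly noisier code.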

Function names

Functions should be named after what they return, otherwise after what they do. Names are the code talking back to you. If you find a function hard to name, it is because it is overloaded with responsibilities.

In the case of closures, I just move the name to the variable that holds it.
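A short sketch of both habits; the function, the closure and the user records are hypothetical. The function is named after what it returns, and the closure's name lives in the variable that holds it, so the call site reads almost like a sentence.

```php
<?php
// Named after what it returns: the users that are active.
function activeUsers(array $users)
{
    return array_values(array_filter($users, function ($user) {
        return $user['active'];
    }));
}

// A closure has no name of its own, so the variable carries it.
$isAdult = function ($user) {
    return $user['age'] >= 18;
};

// Hypothetical records, only here to make the example runnable.
$users = [
    ['name' => 'ann', 'age' => 31, 'active' => true],
    ['name' => 'bob', 'age' => 17, 'active' => true],
    ['name' => 'cy',  'age' => 45, 'active' => false],
];

$adults = array_values(array_filter(activeUsers($users), $isAdult));
print_r(array_column($adults, 'name')); // ['ann']
```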

Comments

They should be clean, neat, banner free (except in the obvious case of document comments, /** .. */) and, most importantly, nonexistent.

If the code is clear, it should be self-evident and therefore need no comments. Also, comments are not checked or used by the PHP runtime. There are some applications that inspect document comments for annotations, but this is not about them.

I am merciless with comments, usually deleting them. The only comments I write and keep are those that introduce what follows and those that describe the meaning of a data structure.

Complexity

Design issues apart, there are some low-level decisions that create or worsen code complexity. From what I have seen so far, these are my thoughts when I run into complex code:

1 - Speed Hacks: There is no such thing as speed hacks in PHP, or rather, those available can be introduced through automated tooling, such as converting from double quotes to single quotes.

2 - Profile before optimizing, and act only when one part of the code is a clear and consistent outlier.

3 - Fancy data structures, even those from the Standard PHP Library, are slow. Either they are slow to read, slow to update or slow to create. Try to stick to standard scalars and the built-in array/hashmap implementation. For a period of time, objects holding data outperformed arrays, but nowadays any performance gain is likely to be slim.

4 - Focus on data structures. If you think about data structures and how they relate to each other, it becomes clear what to code and how. This has been voiced by many people, all of them relevant: Linus Torvalds, Rob Pike and Fred Brooks.
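To make points 1 and 2 concrete, here is a crude timing sketch. timeIt() is a hypothetical helper, not a real profiler; a tool such as Xdebug or XHProf gives a far better picture. It is only meant to show the habit of measuring a suspected speed hack before believing in it.

```php
<?php
// Hypothetical helper: wall-clock seconds spent calling $fn $runs times.
function timeIt(callable $fn, $runs = 100000)
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

// Two candidates for the classic "single quotes are faster" claim.
$candidates = [
    'single quotes' => function () {
        $w = 'world';
        return 'hello ' . $w;
    },
    'double quotes' => function () {
        $w = 'world';
        return "hello $w";
    },
];

foreach ($candidates as $label => $fn) {
    printf("%-14s %.6f s\n", $label, timeIt($fn));
}
```

Run it a few times. Unless one candidate is a clear and consistent outlier, there is nothing to optimize here.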

Includes and Autoloader

Avoid include, include_once and require_once. If you need to include a file, it means you require this file, thus use require. It will include the file and fail if not found.
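The difference is easy to see with a file that does not exist. The path below is hypothetical, and the @ operator is used only to keep the demo output quiet:

```php
<?php
// Hypothetical path to a file that does not exist.
$missing = __DIR__ . '/no-such-file-for-this-demo.php';

// include only raises a warning (silenced here with @) and returns false,
// so the script carries on without the code it was supposed to load.
$result = @include $missing;
var_dump($result); // bool(false)

// require $missing; // uncommenting this stops the script with a fatal error

echo "still running\n";
```

With include, the failure surfaces later and somewhere else, typically as an undefined function or class. With require, it surfaces immediately and at the right place.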

If you are satisfied with autoloader performance, then you should only care about dependency explosion (see next topic). Otherwise, consider requiring the dependencies at the top of the main file. It will help your code run faster, as it will not depend on the autoloader to find the classes, and it will also help you keep track of the real size of your dependencies (see next topic).

The golden rule is: includes must not include files. This helps keep the include graph small and avoids the problem of multiple inclusions, which makes include_once and require_once unnecessary.

Dependencies

It is perfectly normal to depend on third-party projects to run your code. But be careful: it does not follow that a package published on GitHub or Packagist, or in any other public place for that matter, is actually good or production ready.

Many of these packages themselves depend on more packages, and you might find yourself in a dependency explosion. This is bad because PHP spends more time loading files and classes that might be only marginally used, or that are not properly implemented. It means you could be importing not only a feature into your code base, but also a bottleneck or a bug.

Use some runtime browser, such as inclued, to see how big your class space has grown after you added a dependency. Try to keep it as small as possible.
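If no such extension is at hand, a rough measurement can be had from PHP's own get_included_files() and get_declared_classes(). This is a sketch that swaps those built-ins in for a runtime browser; loadFootprint() and the throwaway "dependency" file are hypothetical stand-ins for your real vendor code.

```php
<?php
// Hypothetical helper: how many files and classes the process has loaded so far.
function loadFootprint()
{
    return [
        'files'   => count(get_included_files()),
        'classes' => count(get_declared_classes()),
    ];
}

// Stand-in for a real dependency: a temporary file declaring one class.
$file = tempnam(sys_get_temp_dir(), 'dep');
file_put_contents($file, '<?php final class HypotheticalDependency {}');

$before = loadFootprint();
require $file; // in a real project: load and exercise the new package here
$after = loadFootprint();
unlink($file);

printf(
    "files: +%d, classes: +%d\n",
    $after['files'] - $before['files'],
    $after['classes'] - $before['classes']
);
```

With an autoloader, take the second snapshot after the code has actually exercised the package, since classes are only loaded on first use.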

Personally, I tend to vendor my dependencies, but I understand it might be a problem for those mixing incompatible licenses. At the very least, consider committing your composer.lock if you are using Composer.