The character, which is not
July 15th, 2007
I was busy integrating a search keyword-based advertising solution into one of our clients’ site some day last week. It seemed like a no-brainer task: communication was handled by a neatly implemented php script provided by the peer company, along with a pdf document describing all what need to be programmed with lots of examples, created for dummies.
As lazy as one can be, I tried to save a few minutes by copy-pasting five relevant code lines from this pdf manual, and only changing the necessary parts to fit into the site’s code base.
Think of something as simple as
...
require_once('peer_code.class.php');
$s = new peerCodeClass();
$s->search($keyword);
foreach ( $s->results as $result)
...
Done, great. Give it a go. Kaboom.
It just didn’t work. The error log kept saying “Call to undefined function: search()”. Darn.
I don’t count php as one of my favorite languages, partly because its lack of a decent debugger. So I needed to stick to tricks like echo-ing variable values at certain points of the code, and raising error_reporting level to maximum.
So it seemed that the class does get instantiated, and $s contains a fully equipped peerCodeClass. What’s wrong then? Why does it keep telling me that search() is undefined?
I’ve checked for every minuscule warning in the error log, and fixed the code so that even the pragmatic compiler could say nothing wrong to that. I even started renaming variables and function names to avoid any hidden name clash possibilities. Still nothing. I was about to start believing in trolls, when my eyes caught one strange character in one of the error log entries. It said “Undefined variable: s- in …”
A suspicion slowly crept into my mind. No that can’t be just it! I quickly retyped that one line with the $s->search(); and gave it another go. Bingo! It worked this time!
Folks, copy-pasting from pdf into console may result in character-alikes that are in fact something else. In my case, I’ll never know what sort of dash was in there, but definitely not the ASCII dash, which the compiler expected.
Doing the math, by copy-pasting that small piece of code I’ve saved maybe 2 minutes, only to lose 2 hours for debugging. When writing code, copy-paste is evil, and not just the usual - uh, forgot to update the parts that differ - way!
The FixedPointField bug
July 1st, 2007
Some time a year ago I have spent several unpleasant hours spread across two entire weeks tracing down a very nasty bug in FixedPointField, which is part of the Plone Archetypes code base. During the test phase of a web application built upon Plone, some of the numbers have mysteriously changed their signs. At first sight it seemed totally indeterministic. At a second glance, it seemed that in all occurrences of the phenomenon, the numbers turned positive from negative. However, only a few numbers did this. I’ve spent several hours checking all related code for a wrong clause, all without success.
It took quite a time to realize that all the affected numbers are always small. Then I wrote unit tests to determine the range in which the numbers fail to keep their signs. That turned out to be the range of (-1.0 .. 0.0). That was the point when I was beginning to suspect that there’s something fishy in the underlying maths… so I descended to the realm of Plone and Archetypes code to narrow down the problematic part. It wasn’t too long before I found that the sign gets lost in the following piece of code from Archetypes/Fields.py:
value = value.split('.')
__traceback_info__ = (self, value)
if len(value) < 2:
value = (int(value[0]), 0)
else:
fra = value[1][:self.precision]
fra += '0' * (self.precision - len(fra))
#handle leading comma e.g. .36
if value[0]=='':
value[0]='0'
value = (int(value[0]), int(fra))
return value
Arrgh. Too bad. I wish I had never seen this code. Archetypes has reinvented fixed point storage and handling in a very - hmm - “clever” way. For those the problem still isn’t obvious, let’s see how, say, “-0.5″ gets handled: as a string it gets split into whole and fraction at the decimal separator, and only then it gets converted into integer. Too bad there’s no difference between -0 and 0, isn’t it? So “-0.5″ becomes (”-0″,”5″), and finally ending up as (0,5), which equals to “0.5″.
The fix, from here, is obvious, though it raises some concerns. First I thought the sign need to be stored separately in the value tuple, however, that would mean tuples from there on will consist of 3 items, so the code becomes incompatible with already existing FixedPointField values in the world, which is a big no-no. This leaves us with the sole chance of applying sign to the fraction when in range (-1..0). Let’s see how much of the code need to be changed for that:
#handle leading comma e.g. .36
if value[0]=='' or value[0]==’-’:
value[0]+=’0′
value = (int(value[0]), int(fra)*([1,-1][value[0]==’-0′])
Since this stores “-0.5″ as (0,-5), we also need to change the get() method:
def get(self, instance, **kwargs):
template = '%%d.%%0%dd' % self.precision
value = ObjectField.get(self, instance, **kwargs)
__traceback_info__ = (template, value)
if value is None: return self.getDefault(instance)
if isinstance(value, basestring): value = self._to_tuple(instance, value)
return template % value
To something like this:
def get(self, instance, **kwargs):
template = '%%s%%d.%%0%dd’ % self.precision
value = ObjectField.get(self, instance, **kwargs)
__traceback_info__ = (template, value)
if value is None: return self.getDefault(instance)
if isinstance(value, basestring):
value = self._to_tuple(instance, value)
return template % ([”,’-'][value[1]<0],value[0],abs(value[1]))
And we’re done. Nevertheless, code quality remains questionable. (how do you change precision on the fly, for example. And I dare not mention performance issues of all those unnecessary string to integer and vice versa conversions.)
Unfortunately, submitting this bug report and patch isn’t so obvious as the fix. Archetypes trac only accepts new issues from registered members, but doesn’t provide a link to register. The Plone support page mentions the Plone issue tracker, which does link to a page where you can register, but registration for some reason failed in my case, because either the e-mail or the user name was considered wrong, even after several different attempts. And finally, nobody really cared on the Plone support mailing list either. Sorry guys, you lost my vote there and then. Recent svn still has the bug, and it is unlikely to be fixed without being reported.
The lessons to be learned here are:
- for me not to take any “complete” frameworks granted as bug-free, even in the basics.
- for developers in general, that no matter how clever you’re trying to reinvent the wheel, it is almost certain that there will be something overlooked, potentially causing nasty corner case bugs, so unless there is a rock solid reason behind reworking basic methods from scratch, it does not worth the potential havoc it surely going to wreak.
- for the Plone project to set up at least one obvious, and working channel to submit bug reports and patches.
