The FixedPointField bug

Some time a year ago I have spent several unpleasant hours spread across two entire weeks tracing down a very nasty bug in FixedPointField, which is part of the Plone Archetypes code base. During the test phase of a web application built upon Plone, some of the numbers have mysteriously changed their signs. At first sight it seemed totally indeterministic. At a second glance, it seemed that in all occurrences of the phenomenon, the numbers turned positive from negative. However, only a few numbers did this. I’ve spent several hours checking all related code for a wrong clause, all without success.

It took quite a time to realize that all the affected numbers are always small. Then I wrote unit tests to determine the range in which the numbers fail to keep their signs. That turned out to be the range of (-1.0 .. 0.0). That was the point when I was beginning to suspect that there’s something fishy in the underlying maths… so I descended to the realm of Plone and Archetypes code to narrow down the problematic part. It wasn’t too long before I found that the sign gets lost in the following piece of code from Archetypes/Fields.py:

value = value.split('.')
__traceback_info__ = (self, value)
if len(value) < 2:
    value = (int(value[0]), 0)
else:
    fra = value[1][:self.precision]
fra += '0' * (self.precision - len(fra))
#handle leading comma e.g. .36
if value[0]=='':
    value[0]='0'
    value = (int(value[0]), int(fra))
return value

Arrgh. Too bad. I wish I had never seen this code. Archetypes has reinvented fixed point storage and handling in a very - hmm - “clever” way. For those the problem still isn’t obvious, let’s see how, say, “-0.5″ gets handled: as a string it gets split into whole and fraction at the decimal separator, and only then it gets converted into integer. Too bad there’s no difference between -0 and 0, isn’t it? So “-0.5″ becomes (”-0″,”5″), and finally ending up as (0,5), which equals to “0.5″.

The fix, from here, is obvious, though it raises some concerns. First I thought the sign need to be stored separately in the value tuple, however, that would mean tuples from there on will consist of 3 items, so the code becomes incompatible with already existing FixedPointField values in the world, which is a big no-no. This leaves us with the sole chance of applying sign to the fraction when in range (-1..0). Let’s see how much of the code need to be changed for that:

#handle leading comma e.g. .36
if value[0]=='' or value[0]=='-':
    value[0]+='0'
value = (int(value[0]), int(fra)*([1,-1][value[0]=='-0'])

Since this stores “-0.5″ as (0,-5), we also need to change the get() method:

def get(self, instance, **kwargs):
    template = '%%d.%%0%dd' % self.precision
    value = ObjectField.get(self, instance, **kwargs)
    __traceback_info__ = (template, value)
    if value is None: return self.getDefault(instance)
    if isinstance(value, basestring): value = self._to_tuple(instance, value)
    return template % value

To something like this:

def get(self, instance, **kwargs):
     template = '%%s%%d.%%0%dd' % self.precision
     value = ObjectField.get(self, instance, **kwargs)
     __traceback_info__ = (template, value)
    if value is None: return self.getDefault(instance)
    if isinstance(value, basestring):
        value = self._to_tuple(instance, value)
    return template % (['','-'][value[1]<0],value[0],abs(value[1]))

And we’re done. Nevertheless, code quality remains questionable. (how do you change precision on the fly, for example. And I dare not mention performance issues of all those unnecessary string to integer and vice versa conversions.)

Unfortunately, submitting this bug report and patch isn’t so obvious as the fix. Archetypes trac only accepts new issues from registered members, but doesn’t provide a link to register. The Plone support page mentions the Plone issue tracker, which does link to a page where you can register, but registration for some reason failed in my case, because either the e-mail or the user name was considered wrong, even after several different attempts. And finally, nobody really cared on the Plone support mailing list either. Sorry guys, you lost my vote there and then. Recent svn still has the bug, and it is unlikely to be fixed without being reported.

The lessons to be learned here are:

  1. for me not to take any “complete” frameworks granted as bug-free, even in the basics.
  2. for developers in general, that no matter how clever you’re trying to reinvent the wheel, it is almost certain that there will be something overlooked, potentially causing nasty corner case bugs, so unless there is a rock solid reason behind reworking basic methods from scratch, it does not worth the potential havoc it surely going to wreak.
  3. for the Plone project to set up at least one obvious, and working channel to submit bug reports and patches.

UPDATE 05 Sep 2008: the above fix has finally been integrated into plone codebase, changelog here. Thanks, Maurits! [Slashdot] [Digg] [Reddit] [del.icio.us] [Facebook] [Technorati] [Google] [StumbleUpon]

2 Responses to “The FixedPointField bug”


  1. 1 Tom DufresneNo Gravatar

    Thanks for posting this. I’m sure you saved me many hours of head bashing.

  2. 2 Maurits van ReesNo Gravatar

    Hi,

    It’s a bit late, but I finally fixed this bug:
    https://dev.plone.org/plone/ticket/7549

    Thanks a lot for showing how it needed to be fixed!

Leave a Reply