Author Archive for tamasdecsi

Page 2 of 5

A performance patch

Back in May 2006, I needed to performance-tune a web application built on Plone. The symptom was an unacceptable 60 seconds delay upon creating a new user account on top of the already existing 10000 accounts. The source of the delay was very soon traced back by Zope’s built-in call profiler to GroupUserFolder, where the solution was instantly obvious.

The original method was checking for existing users by using linear list lookup, which is known to be algorithmically slower than dictionary key lookup. Knowing that, the performance patch was a no-brainer, and, as expected, it eliminated the reported delay. I’ve sent this patch to ingeniweb (the author of GroupUserFolder) in an email on 22 May 2006, but haven’t heard back from them since. Yet, the patch has been accepted, even got refined a bit and was applied on the code trunk on 13 Jun 2006. From the one side, I’m happy to see it at least being fixed, while on the other side, I could have been more happy receiving a feedback on my report.

Nevertheless, I was still wondering how exactly the two lookup methods compare, so I’ve put together a python script to check how much we can actually gain on using haystack_dict.has_key(needle) instead of needle in haystack_list. I’ve compared them with small to larger sample sizes in small to larger sets (and lists) of samples.

I’ve used TOPCAT, an excellent, yet easy to use visualization tool for tabular data, to graphically represent the results.

has_key vs in list

As you can see, even at the small sets of small samples region, has_key performs much faster, and with the growing set and sample sizes, it keeps performing equally well, while list lookup goes wild in the same time.

Can’t say I was surprised or learned anything by doing this, but thought I’d mention as it might turn out to be useful for someone out there, so let me know if you’ve found this useful.

[Slashdot] [Digg] [Reddit] [del.icio.us] [Facebook] [Technorati] [Google] [StumbleUpon]

Making of the logo font

This year the Helvetica typeface is celebrating its 50th anniversary, so everyone is talking about fonts. Not that I’m about to propose a replacement for Helvetica, or anything such, still I’ve created my very first font, and therefore I’ve wrote this post about it.

Although my blog logo is still just scraped, it’s time to write about the making of the font used there. I felt spending a couple of hours creating the font makes things learned worth sharing.

I named this font ArcNBrick mostly because of the basic concept behind its glyph shapes, where everything is built up of – surprise – arcs and bricks.

The tool I’ve chosen to build the font was FontForge, a cross-platform, open source application designed for the very purpose of creating fonts of different types.

Creating letter o in FontForge
Let’s take a quick tour of how I’ve started building a font with FontForge. First of all, the concept behind my font allowed me to work without any pre-drawn, scanned helper images, even though FontForge supports those. Let’s discuss the basics of drawing glyphs, then.

  • All paths are built up of curve, tangent, and corner points. Curve points are marked with a circle, tangent points with a triangle, and corner points with a rectangle.
  • Control points (marked with x) help you to shape the curves as desired.
  • Outer boundary paths should be drawn clockwise, while inner boundaries counter-clockwise

After learning these very few basic rules from the tutorial, I was very soon up to speed in drawing the glyphs for all the letters. In order to accelerate the process and also preserving design consistency, I used cheap transformation tricks to create glyphs from one another. For example the letter b is a perfect candidate to create letters d, p, and q.

So, drawing letter d consisted of the following operations: select all on letter b, copy, paste to letter d, then element: transformations: transform: flip horizontally, and finally element: correct direction. Drawing letter p wasn’t much harder: select all on letter b, copy, paste to letter p, then at element: transformations: transform: first flip vertically then move y by -200, and finally element: correct direction. That was it. Very soon, all letters took shape, so I’ve built the TrueType version to give it a go.

ArcNBrick font sample

As you can see, the result is nothing useful for everyday use, yet, you might find it appealing for short, emphasized labels used in logos, for example. Should this be the case, feel free to download and use my ArcNBrick.ttf. Make sure to leave a comment here on where and how this font was particularly useful for you, I’d be happy to receive feedback!

I need to confess that I haven’t tackled any of the issues that would make a font professional, such as hinting, accents, proper encodings, to name a few, but by this time you’ve already guessed that I am by no means a typographer. Yet it was fun building a unique font from scratch and seeing it working in a word processor. Some day I might even try building some wingdings fonts too, who knows…

[Slashdot] [Digg] [Reddit] [del.icio.us] [Facebook] [Technorati] [Google] [StumbleUpon]

What to do when NULL hides your data?

Relational databases have been around us for quite a while. One may think during all those years of SQL (being the de-facto standard query language for RDBMS), its numerous proprietary and open source implementations have matured. A recent issue I’ve been dealing with tells me they are still evolving. Nevertheless, this incompleteness only kicks in when you want to create complex SQL statements, but then there always are alternatives to reach a particular goal.

I’ve recently came across an issue with the ordering of the NULL elements. For some reason, PostgreSQL thinks NULL is the highest value. Well, it can be interpreted that way too, but with the following business case it wasn’t so.

The structure (simplified for easier digestion) being queried contains two tables: one for clients, and another for purchases made by clients. Obviously, every purchase is linked to a client, yet not every client has made a purchase.

I needed to create a report that displays every customer, with the total amount of purchases made by them, showing the client with the highest total amount first.

I’ve started off with the following query:
SELECT client.id, SUM(purchase.amount) AS total FROM client LEFT JOIN purchase ON purchase.client_id = client.id GROUP BY client.id ORDER BY total DESC;

Running the query revealed the problem: the aggregate function SUM returned NULL for the clients with no purchase, and NULL is treated as the highest value by Postgres.

This was just about time for looking this issue up. Well, according to the relevant wikipedia entry:

“The SQL standard does not explicitly define a default sort order for Nulls. Instead, on conforming systems, Nulls can be sorted before or after all data values by using the NULLS FIRST or NULLS LAST clauses of the ORDER BY list, respectively. Not all DBMS vendors implement this functionality, however. Vendors who do not implement this functionality may specify different treatments for Null sorting in the DBMS.”

Yes, that NULLS LAST sounded like a promising solution, so I’ve checked which DBMS implement it. A quick search revealed this forum entry that says, by default Postgres, DB2, and Oracle considers NULLs as higher than non-NULLs, while MSSQL and MySQL considers them being lower. DB2 V7 and Oracle 9i supports NULLS FIRST/LAST, while MSSQL, MySQL and PostgreSQL doesn’t.

However, that entry was two and a half years old, while PostgreSQL is known to being actively developed, so I’ve checked its status concerning NULLS LAST again. Its latest documentation does in fact tells NULLS LAST is supported, so I’ve checked when it became integrated. There it was, a patch from last December does in fact take care of the NULL ordering problem. However, it is only included in version 8.3, and an upgrade wasn’t an option in my case.

That meant another route to be taken. After a little more reading, I’ve found my alternative candidate, the COALESCE function, which is implemented in older versions of PostgreSQL too. COALESCE returns the first non-NULL value from its list of parameters, which is exactly what I needed here. My query therefore was altered to this:

SELECT client.id, COALESCE(SUM(purchase.amount),0) AS total FROM client LEFT JOIN purchase ON purchase.client_id = client.id GROUP BY client.id ORDER BY total DESC;

And suddenly, the problem ceased to exist, and every one was happy by looking at the numbers of the biggest customers at last. After all, from the business perspective, they represent the highest value, not NULL.

[Slashdot] [Digg] [Reddit] [del.icio.us] [Facebook] [Technorati] [Google] [StumbleUpon]

Upgrading a Toughbook CF-W2

In a previous post, I told the story of replacing the faulty HDD in my Panasonic Toughbook CF-W2. Back then I have followed a guide on a forum linked from repair4laptop.org, which has disappeared since. I also wanted to upgrade the built-in ipw2100 miniPCI WLAN adapter to an ipw2200bg, so I decided to document the upgrade process myself too. The results are here:

First of all, the most important rule is: don’t try to fix if it ain’t broken!! A working Toughbook is a charm, but the same machine as a paperweight is useless. Taking it apart is a long and tricky process, putting it together needs just as much care and attention.

All right, so yours is broken, and isn’t so lucky as “Tsiolkovsky” to have a Panasonic Laptop repair shop around, and you think have the skills, patience, and guts to disassemble it, and YOU UNDERSTOOD THAT THIS IS NOT AN OFFICIAL GUIDE, AND I TAKE ABSOLUTELY NO RESPONSIBILITY FOR ANY POSSIBLE DAMAGE OR LOSS EVEN REMOTELY CONNECTED TO THIS POST. The fact that I’m managed to repair and upgrade my CF-W2 this way does not guarantee that you’ll be as lucky as me.

Ok, enough said, let’s see the process…

Make sure you have the proper screwdrivers, some chewing gum, or a magnetic head screwdriver, and some thermal compound at hand, also prepare the new parts to be built in. Allow yourself at least an hour’s worth of time dedicated to the process.

1. Back up all your precious data to minimise damage if anything goes wrong.

2. Clear some flat workspace, and have something to hold the screws in.

3. Before shutting down, open the CD/DVD drive cover (there’s a screw beneath to tackle later).

4. Turn off the computer, remove the battery and unplug the power cord.

5. Take a big breath.

6. Remove the plastic cover on the WiFi antenna on the right side of the notebook. It is secured with two plastic tabs. You may need to strain it a bit.

7. Unwind the screw in the CD/DVD drive

8. Unwind the two screws at the wifi antenna on the right side of the laptop.

9. Unscrew the two bolts holding the VGA out connector.

10. Turn the notebook upside down.

11. Unwind the memory extension’s screw, remove the plastic cover.

12. Unwind the three long screws that hold the keyboard, remove the plastic cover.

13. Flip over the notebook.

14. Dislodge the keyboard. On the top, two plastic tabs keep it in place: one to the left of F1 and one to the right of F12, and a couple more plastic tabs around the spacebar. The keyboard is also glued with some sticky tape on three places, so a little force is needed to dislodge it from the casing. Its cable is also needed to be unplugged from the main board. Take extra care not to harm the ribbon cable during the process.

15. Unwind the 6 black screws underneath the keyboard.

16. unwind the 2 hidden screws: these are very tricky. you may consider using a magnetic head screwdriver, or a bit of chewing gum to prevent the escape of these screws into the casing.

17. Close the screen, then turn the notebook upside down. By closing the screen panel you actually secure the casing together while the bottom cover is removed.

18. Unwind all 8 screws on the bottom of the notebook. Some screws may stuck at first disassemble, be sure to use the right tool, otherwise the head will suffer, and you might be unable to take the notebook apart without considerable damage. There are four short, and four longer screws, mind their positions. (Update: Brett told me that if you mix the short and long screws, you might end up punching a hole in the motherboard during the reassemble, so better check it twice!)

19. Unwind the 2 screws securing the screen panel on both sides.

20. Carefully remove the bottom cover.

Now you can access both the HDD and the Wlan card in the miniPCI slot. Apart from the memory extension and the keyboard, these are the two modularly replaceable parts in the Toughbook W2.

21. The Wlan card has two antenna plugs (a main and an auxiliary). Simply, carefully pull them, then you can flip the module from the bay, and insert the upgrade module instead, and finally don’t forget to attach the antenna plugs.

Upgrading to an ipw2200bg or an ipw2915abg doesn’t raise any hardware compatibility issues, only a driver upgrade is needed then. You can find many of these cards on the second-hand market, usually originating from broken notebooks. I got a 2200bg for around $27 on an auction site.

22. Concerning the HDD: you need to know that most Panasonic Toughbooks use a low voltage (3.3V) HDD, which you can’t readily buy in retail shops, so you need to mod a 5V notebook HDD by clipping pins 41 and 44. (At least this worked in a few reported cases, including my Samsung SpinPoint HM080HC 80GB 2,5″ HDD).
I’ve found this HDD upgrade guide useful.

Ok, now that the parts have been replaced, you can put it back together.

23. Secure the back cover with the 2×2 screws fixing the screen panel, and the 4 long and 4 short screws fixing the cover to the casing (remember, DO NOT MIX THEM UP, as the long screw in a short’s place might damage the motherboard), then turn the notebook around and open the screen.

24. Continue with the two hidden screws: again, use a magnetic head screwdriver, or a bit of chewing gum to secure the screws while twisting. Should they fall, you’ll need to open the bottom panel again to get to them.

25. Next, take care of the the 6 black screws.

26. Now comes an important step: clean the surface of the processor (and the metallic part on the bottom of the keyboard), then apply some thermal compound on it. This is particularly important, as otherwise the CPU can get pretty hot, up to an unhealthy 90 degrees centigrade with processor intensive tasks running.

27. Plug the keyboard back, flip the plastic tabs in place. Again, handle the ribbon cable with extra care.

28. Flip the notebook over, secure the keyboard and the plastic cover with the 3 long screws, then the memory extension cover.

29. Flip once more, screw in the 2+1 screws, and put the WiFi antenna cover back in place.

30. Finally, screw the VGA connector’s bolts back to their respective place.

Now put the battery back in place, and hold your breath while starting up the notebook. If it works, well, congratulations, you’re done. In every other case, I’m sorry to say, but now it is your turn to find out what have gone wrong.

Should you be succeed or fail, I’d like to read about it, so please do post a comment here. Thanks!

[Slashdot] [Digg] [Reddit] [del.icio.us] [Facebook] [Technorati] [Google] [StumbleUpon]

The state of GPS car navigation

With prices falling and new devices appearing on the market every week, it seems that GPS car navigation systems are finally getting widely adopted. Still, the question remains whether they are ready to be used by the “average driver”?

Sure, it is useful to know where you are, and in what general direction your destination is, what more, how on Earth can you get there on four wheels. All this requires a GPS receiver (to read your current location), a mapping software (to calculate your moving direction from subsequent reads), and an accurate map (to give some context and meaning to your location readouts). The first two are more or less out there. Still, maps can be considered anything but reliable or accurate.

With news on blind faith put in navigation systems, people have already got wet, others found themselves down the stream, on the track, or fighting the dunes, the map inaccuracy issue finally got into the headlines.
People, of course, are eager to come up with various solutions.
* An almost-victim said she won’t ever use a navigation system again.
* others said people should update the maps in their navigation systems every once in a while
* some Welsh officers has even created a new, but ambiguous traffic sign to warn lorry drivers of the road being narrower than expected.
* while others came up with fairly good ideas, such as superimposing the map on a front-camera’s stream, so that even faithful users can compare the map with the reality on the same screen.

However, keeping maps up-to-date is not as simple as one may think. Downloading the latest updates doesn’t do the trick. Maps become inaccurate at (or even before) the very moment they are published. As it is their purpose to sketch the spatial layout of real-world features, the only way they can remain up to date is if they constantly evolve along with their “mapped” real-life features. And constantly means very short term here. Think of traffic, for example: maps that fails to mention actual road congestion, current roadworks or accidents, has only a limited use, routing their users through nasty traffic jams.

In order to better understand the situation, first let’s take a look at the various approaches of creating and updating maps:

The commercial route is what proprietary navigation software vendors take. Usually you get these applications bundled with a GPS receiver, and preloaded with some not-too-up-to-date maps. The map data used by them is something commercially available, which most likely originates from the digitized version of maps of the “pen and pencil” era, with updates from geocoded aerial photography, and most likely not thoroughly validated by actual GPS reading on the ground. User feedbacks are usually welcome, and the major inaccuracies gradually get updated , but that’s a slow process, and the more recent map you need the more deep you need to dig into your wallet for it.

On the other end, there are the open approaches, thinking of open source projects with an enthusiastic community of hobbyists and professionals dedicating their free time to make something useful or at least interesting.

These projects take different approaches to acquire map data.
Some of them make use of openly available data (where there is such a thing), for example, Roadmap uses the TIGER GIS data, where of course, data accuracy is limited again.

Others, such as gpsdrive use dumbed down version of “available” data in form of snapshots of rendered maps from various web map services, such as google maps or yahoo maps to mention the two biggest. This approach can’t do routing, though, as the map is treated as a simple raster image only, without any routing information or so.

Yet another approach is based on a mapping community, and let contributors draw the maps, like cgpsmapper’s MapCenter, or the geocaching communities offsprings (e.g. turistautak for Hungary), or the very active openstreetmap project, or the much less active FREEMAP project to mention a few. The resulting maps are definitely GPS-validated, as the data source itself is the GPS tracks gathered, but it takes enormous amount of total community time to create a quite complete, routable map for even a relatively small region, nonetheless, quite impressive results already exist.

Still another concept is algorithmic map generation from GPS tracks, using statistical methods, thus reducing the amount of expertise needed on the map creation side. Mapgeneration takes this algorithmic map-drawing route. The problem (again) is gathering enough usable tracks to work on.

Yet, these all are just base maps needing constant updates, having no information on current traffic data. For this latter, other, more sophisticated techniques are needed. Let’s take a look at some of these approaches:

Starting with the statistical approach, when previously collected and analyzed traffic patterns are attached to each road. With this incorporated traffic flow data, the routing algorithm can do better time estimates for the possible routes. Even though this approach gives a better route, it still didn’t solve the problem of keeping its data up-to-date.

Being up-to-date necessitates an interface to refresh data. The online service approach solves this by having a central hub that constantly collects traffic data from various sources (traffic surveillance and control systems, or sampling a subset of cars also known as Floating Car Data), and sends updates to subscribers through some wireless medium, typically over a cellphone data network. Such systems can provide information on traffic events minutes or even hours earlier than radio news, saving precious time for their users.

However, not all types of traffic data needs a central database. The decentralized version of Floating Car Data uses traffic information collected from neighbouring cars in an ad-hoc network fashion. In this scenario, cars broadcast their position, direction, velocity and other traffic information, as well as relay data from cars ahead, and filter data broadcasted by other cars for relevance. The strength of this approach lays in its adaptability to spontaneous short term situations, e.g. assisting drivers to cooperate for driving safety. In the same time it may also provide enough data for making longer term decisions, such as choosing a better route.

These advanced systems are still mostly lay on the drawing boards, and are still too expensive to become widespread. However, with the recent trend of car manufacturers (e.g. Toyota, GM, and others) equipping all new cars with GPS receivers, the location and velocity data will be available from more and more vehicles on the move. At the same time, ad-hoc wireless networks also getting more of a commodity, it seems finally things are getting in place for distributed, always up-to-date, cheap and reliable navigational systems to appear in the near future.

Better be soon, as driving safety is also more and more of an issue as people rush towards their goals with an ever hastening pace, and they need to be guided, not further confused by navigation systems.

[Slashdot] [Digg] [Reddit] [del.icio.us] [Facebook] [Technorati] [Google] [StumbleUpon]