Sucking XML in Perl

I'm working on some XML stuff in Perl, and took quite some time to explore the available XML parsers. There are quite a lot of them, so choosing the right one isn't always easy :

* XML::Twig looks quite interesting and powerful, but has quite a steep learning curve due to the (imo) cumbersome interface, especially the twig handler stuff. Not the right choice if you want to do something quickly in Perl, unless you already have Twig experience.

* XML::Simple seemed a good tool, so I started implementing with this module, but the performance is horrible : it took 14 seconds to parse a document of about 3500 lines, which took me straight back to the drawing board.

* XML::Parser was close to what I wanted, but the output was too cluttered, certainly if you're working with more complicated XML files.

* XML::LibXML is a module that avoids the performance issues of XML::Simple, but it is built around the Gnome XML library (libxml2), which wasn't available on the HP-UX 11.11i server.

* XML::Smart seemed a great and intuitive interface for my problem. Unfortunately it isn't part of the default Perl modules, and as it has quite a few dependencies, installing it on the server wasn't an option.

There are also the SAX modules, but they seem better suited to framework-style work, and SAX seemed too much of a burden to carry around (the program wasn't that big after all...).

So what did I do then ? As far as speed goes, the reality seems to be: a regexp-based, non-XML parser is going to be faster than a "close to the metal" parser (XML::Parser or XML::LibXML), which is going to be faster than a more convenient parser (XML::Simple, XML::Twig), which is going to be faster than a pipeline that passes events through various object-oriented layers (XML::SAX). So with that in mind, I implemented my own regexp-based XML parser. To my surprise, it took only half an hour to get it working, and it was about 1500% faster than XML::Simple.
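
The Perl code itself isn't shown here, so the following is only a rough sketch of the same regexp idea, written in shell with sed rather than Perl. The sample document, the element names (hosts, host, ip) and the function name extract_hosts are all made up for illustration. It only works because the markup is flat and predictable: no CDATA, no comments, no entities, one element per line.

```shell
# Hypothetical sample document; the element names are invented for this sketch.
xml='<hosts>
  <host name="alpha"><ip>10.0.0.1</ip></host>
  <host name="beta"><ip>10.0.0.2</ip></host>
</hosts>'

# Pull the name attribute and the <ip> text out of every <host> line
# with a single regular expression. This is pattern matching, not XML parsing:
# it breaks as soon as the layout of the document changes.
extract_hosts() {
  sed -n 's/.*<host name="\([^"]*\)"><ip>\([^<]*\)<\/ip><\/host>.*/\1 \2/p'
}

result=$(printf '%s\n' "$xml" | extract_hosts)
printf '%s\n' "$result"
```

For the sample above this prints "alpha 10.0.0.1" and "beta 10.0.0.2", one per line. The trade-off is the one described above: you gain speed and lose every guarantee a real parser gives you.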

SIGUSR1

If there's one thing you can say about Unix, it's that it keeps amazing you, even after years of use. Here's a new gem I recently discovered :

When using dd to copy data between devices, it can take a long time to finish, even when using large block sizes. The dd utility doesn't report status information by default, but when fed a SIGUSR1 signal it will dump the status of the current operation :

$ dd if=/dev/zero of=/tmp/foo bs=512 &
[1] 7749

$ kill -SIGUSR1 7749
1038465+0 records in
1038465+0 records out
554904576 bytes (555 MB) copied, 8.79635 seconds, 63.1 MB/s
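
If you want the status every few seconds without typing the kill command each time, a small helper along these lines could do it. This is a sketch only: the function name dd_progress and the default 5 second interval are my own choices, and it assumes GNU dd. On BSD-derived systems dd reports on SIGINFO instead, and a SIGUSR1 would terminate it.

```shell
# Send SIGUSR1 to a running dd at a fixed interval until the process exits.
# Assumption: GNU dd, which prints its transfer statistics on stderr on SIGUSR1.
# dd_progress is a hypothetical helper name, not part of any package.
dd_progress() {
  pid=$1
  interval=${2:-5}
  # kill -0 sends no signal, it only checks that the process still exists.
  while kill -0 "$pid" 2>/dev/null; do
    # Sleep first, so dd has time to install its signal handler.
    sleep "$interval"
    kill -USR1 "$pid" 2>/dev/null
  done
  return 0
}
```

Usage would be something like `dd if=/dev/zero of=/tmp/foo bs=512 &` followed by `dd_progress $!`. The status lines show up on the terminal because dd writes them to its own stderr.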

MySQL to Oracle migration

Lots of little sysadmin tools start out as small MySQL databases on a test server. When these tools become too important to fail, they eventually migrate to conventional production servers with backups and all that. When you're not that handy with the sql*plus Oracle client, this MySQL/sqlplus conversion page might come in handy.

Xgl

As Xgl is included in the Dapper repositories, I thought it would be a good idea to immerse myself for once in the eye candy that is available. There are several howtos available, and basically it comes down to adding two lines to /etc/apt/sources.list and creating the Xgl server entry in gdm.conf and the compiz-start script.

Xgl is fun to watch, but I get the same feeling as when I use KDE : lots of eye candy, but in the end it all keeps you from being productive. The 'wobbly' feature will probably be the first thing I disable, because it blurs the fonts on the screen while an item is moving, which tires the eyes. It also takes a second or so for a menu to stop wobbling, which is rather annoying in the case of right-click menus.

For the rest, everything is like in the Xgl promo video : changing a workspace gives you the image of a rotating cube, and Alt-Tab displays a column with previews of the available windows. What I really like in Xgl/Compiz is that inactive windows are displayed in grey, which draws your attention to the focused window.

There are still some small issues with other OpenGL programs, and the standard Xorg server is still started, which disables the 'Switch User' functionality. But those are issues to look at later.

Anonymous Fri, 06/09/2006 - 19:52

wiki.ubuntu.com contains more howtos and configuration details about Xgl and Compiz. It also contains more information about Compiz' features. Must-read !

Ubuntu on the desktop

There's no better date than 6.06.06 to install Ubuntu 6.06 on my desktop. Time to say goodbye to Debian, take a last glance at the mounted partitions to decide which ones to erase or keep, and then insert the 32bit install CD. Why 32bit ? Because the 64bit versions are still too much hassle. Things like OpenOffice or Java still don't play well in 64bit environments, and creating 32bit chroots or starting up a 32bit VMware Ubuntu just to check my bank accounts isn't imo worth the trouble.

But what a fast installer Ubuntu has ! The download of the install ISO took longer than the install itself (but maybe that says something about my ADSL download speed). I wish Ubuntu provided netinst ISOs, but seeing how they provide the installer as a graphical program on the live CD, I don't think this is going to happen soon.

A date like 6/6/6 isn't always a good one to perform installations or upgrades. There were some security issues with the installed Drupal version, so I thought it would be a good idea to upgrade to 4.7. Wrong guess : not only does Drupal now show the current module number from which to upgrade (instead of the version number, which is quite confusing), but something went horribly wrong with the upgrade of the phptemplate or xtemplate engine. The result was that I couldn't get any side blocks to display, which is annoying because that's where the administration and login menus live. I had no other option than to revert the database and Drupal engine back to 4.6, and upgrade from there to 4.6.8.

Serge van Gind… Wed, 06/07/2006 - 10:29

It's not a netinstall cd, and I'm not sure to what extent you want one, but perhaps the "Alternate install CD" should give you the other possibilities you miss?

Besides that, I wholeheartedly agree about preferring the 32bit version. I took the opportunity of the new Dapper myself to reinstall my setup, leaving 5.10 AMD behind.

Marc Thu, 06/08/2006 - 17:19

Cool, hope you like Dapper!