Syntactic Sirup

Friday, October 26, 2007

TaskPaper

I recently started using a small app for the Mac called TaskPaper. It's basically a small todo list/outliner app, but with one big difference that sets it apart from the more complex competing products: its file format is just plain text. In fact, it's so plain that it's probably very similar to what you would use if you maintained the same kind of list in a plain text editor. Let me show you a little sample from my own todo list at work.

Software update service:
- Add public/private key sign. to connections @code
- Add checksum after download and update @code
- Add temporary file cleanup @code
- Fix load issue with server side sockets @bug
- Write readme file @doc

That's really all there is to it. Having the file format this simple makes it possible to share your task lists using code repositories such as Subversion, with all the same benefits as regular source code files. In other words, it's the perfect tool for maintaining todo lists in your code projects :-)
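Because the format is this plain, reading it from code takes very little. Here is a minimal Java sketch, assuming only the conventions visible in the sample above: a project line ends with a colon, a task line starts with "- ", and a tag is a word starting with "@". The class and method names are made up for illustration, and this is not TaskPaper's own parser or a full description of its format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of TaskPaper. It only understands the
// conventions visible in the sample: "Project:" lines, "- task" lines
// and "@tag" words.
class TaskListSketch {

    // One task line: the project it belongs to, its text without tags,
    // and the tags found on the line.
    static final class Task {
        final String project;
        final String text;
        final List<String> tags;

        Task(String project, String text, List<String> tags) {
            this.project = project;
            this.text = text;
            this.tags = tags;
        }
    }

    static List<Task> parse(String content) {
        List<Task> tasks = new ArrayList<>();
        String project = "";
        for (String rawLine : content.split("\n")) {
            String line = rawLine.trim();
            if (line.startsWith("- ")) {
                List<String> tags = new ArrayList<>();
                StringBuilder text = new StringBuilder();
                for (String word : line.substring(2).split("\\s+")) {
                    if (word.startsWith("@") && word.length() > 1) {
                        tags.add(word.substring(1));
                    } else {
                        if (text.length() > 0) text.append(' ');
                        text.append(word);
                    }
                }
                tasks.add(new Task(project, text.toString(), tags));
            } else if (line.endsWith(":")) {
                // A project heading such as "Software update service:"
                project = line.substring(0, line.length() - 1);
            }
        }
        return tasks;
    }

    public static void main(String[] args) {
        String sample = "Software update service:\n"
                + "- Add temporary file cleanup @code\n"
                + "- Write readme file @doc\n";
        for (Task task : parse(sample)) {
            System.out.println(task.project + " / " + task.text + " " + task.tags);
        }
    }
}
```

Something like this could, for example, list every @bug task across the todo files in a repository.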

I could keep on blabbering about how nice and simple the interface is, or how cool the TextMate bundle for TaskPaper files is, but really... download the trial version and give it a spin. Remember, if you end up not buying the app when your trial is over, all of your work will be perfectly accessible in the plain text files TaskPaper stores its data in.

Friday, September 7, 2007

TupleSoup, Part 6

I have just uploaded a new release of TupleSoup. Unless you have multiple threads accessing your data, there is no real benefit for you in this release. In fact, you are even required to change a few things in your source... sorry :-( ...however, the possibilities these changes open up are really exciting, and they are what the rest of this post is about.

The primary change since the last version is that the class Table has been renamed to DualFileTable, and Table has instead been recreated as an interface. The idea behind this is that other parts of the system can work with any kind of table storage, as long as it adheres to the Table interface. To adapt your code, all you need to do is change your instantiations to use DualFileTable instead of Table, as shown in the following sample.

Table table=new DualFileTable("test","./");

The first new type of table is also included in this release. It's called HashedTable and is basically a hash of 16 separate DualFileTable objects. Whenever you insert a new row, it uses the row id to decide which of the 16 tables the row should be stored in. In an ideal situation, this allows up to 16 threads to write new rows to disk at the same time without locking (remember, even in an ideal situation most of these threads would still have to fight for I/O on a lower level). However, the penalty is that any thread that wants to scan through the entire set of data in the table will now have to scan through 16 sets of data instead of one. Even though each of these will be smaller than the one large set, this adds substantial overhead, so only use this new table for data that rarely needs full table scans.
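To make the idea concrete, here is a rough Java sketch of spreading rows over 16 stores by hashing the row id. It is not TupleSoup's HashedTable: the class name is made up, plain in-memory maps stand in for the DualFileTable objects, and using String.hashCode with a per-store lock is my assumption for the sketch rather than a description of the real implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of hashing rows across 16 stores. In-memory maps
// stand in for the 16 DualFileTable objects.
class ShardedStoreSketch {
    static final int SHARD_COUNT = 16;

    private final List<Map<String, String>> shards = new ArrayList<>();

    ShardedStoreSketch() {
        for (int i = 0; i < SHARD_COUNT; i++) {
            shards.add(new HashMap<>());
        }
    }

    // Map a row id to a shard number in the range 0..15.
    // floorMod keeps the result non-negative for negative hash codes.
    static int shardFor(String rowId) {
        return Math.floorMod(rowId.hashCode(), SHARD_COUNT);
    }

    // A writer only locks the one shard its row hashes to, so threads
    // writing to different shards do not wait on each other.
    void put(String rowId, String value) {
        Map<String, String> shard = shards.get(shardFor(rowId));
        synchronized (shard) {
            shard.put(rowId, value);
        }
    }

    String get(String rowId) {
        Map<String, String> shard = shards.get(shardFor(rowId));
        synchronized (shard) {
            return shard.get(rowId);
        }
    }

    // A full scan has to visit every shard, which is where the extra
    // overhead for table scans comes from.
    int countAll() {
        int total = 0;
        for (Map<String, String> shard : shards) {
            synchronized (shard) {
                total += shard.size();
            }
        }
        return total;
    }
}
```

Looking up a single row stays cheap because the id alone tells you which store to ask. Only operations that need every row pay for the split.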

This might not sound too exciting if you are not really doing massively parallel applications. However, this whole refactoring has opened up great ways for me to add future functionality. One of the primary features I have been trying to find a good solution for is value indexing. That is, to create an index on a table that indexes all values of a single key on all rows. This will make it really snappy to select a set of rows based on the values of that key in the rows. With this new design, this type of indexing can be designed as an object that implements the Table interface and simply wraps around another table, injecting the indexing functionality where it is needed.
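The wrapping idea could look roughly like the following Java sketch. Everything in it is hypothetical: RowStore is a tiny stand-in interface I made up for illustration (TupleSoup's real Table interface is not shown in this post and will differ), rows are plain maps, and the index lives in memory.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical minimal interface for illustration only.
interface RowStore {
    void addRow(String id, Map<String, String> values);
    Map<String, String> getRow(String id);
}

// Simplest possible backing store, kept in memory.
class MemoryRowStore implements RowStore {
    private final Map<String, Map<String, String>> rows = new HashMap<>();

    public void addRow(String id, Map<String, String> values) {
        rows.put(id, new HashMap<>(values));
    }

    public Map<String, String> getRow(String id) {
        return rows.get(id);
    }
}

// Wraps any RowStore and maintains an index from the values of one key
// to the ids of the rows holding that value. This sketch does not
// handle updates or deletes of existing rows.
class ValueIndexedStore implements RowStore {
    private final RowStore inner;
    private final String indexedKey;
    private final Map<String, Set<String>> index = new HashMap<>();

    ValueIndexedStore(RowStore inner, String indexedKey) {
        this.inner = inner;
        this.indexedKey = indexedKey;
    }

    public void addRow(String id, Map<String, String> values) {
        // Store the row as usual, then record its value in the index.
        inner.addRow(id, values);
        String value = values.get(indexedKey);
        if (value != null) {
            index.computeIfAbsent(value, v -> new HashSet<>()).add(id);
        }
    }

    public Map<String, String> getRow(String id) {
        return inner.getRow(id);
    }

    // Look up row ids by value without scanning the wrapped store.
    Set<String> findIds(String value) {
        return index.getOrDefault(value, Collections.emptySet());
    }
}
```

Since the wrapper implements the same interface as the store it wraps, code that only knows the interface keeps working whether or not an index sits in between.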

That's about it for this post. I hope to follow up soon with a post that provides a longer, full sample of TupleSoup usage.

Monday, August 27, 2007

TupleSoup Public Release

TupleSoup is a small, easy-to-use, Java-based framework for storing and retrieving simple hashes. If you are interested in the design and implementation choices I made, have a look at the following blog posts I wrote while working on the project.

Part 1, Introduction
Part 2, Record index
Part 3, Index caching
Part 4, First refactoring
Part 5, Sorting

Since the last post in this series, I have done some further refactoring and added a set of statistics counters that can be polled to look for performance issues in usage patterns. The project is far from done, but I feel that it has reached a maturity that might make it useful for other developers. We have been using TupleSoup for projects in our production environment for more than 3 months now without any stability issues. That being said, use at your own risk!

Let me end this post with a little bit of source code to show you how to actually use TupleSoup in your own code. This very short, completely useless example creates a table, adds a row and retrieves the row again. Hopefully I will soon find the time to write a more detailed piece of code showing some real world usage.

import com.solidosystems.tuplesoup.core.*;

// Create a table
Table table=new Table("test","./");

// Build a row with the id "1" and add it to the table
Row row=new Row("1");
row.put("username","kasperjj");
row.put("floatNumber",3.141592);
table.addRow(row);

// Retrieve the row again and read a value back
row=table.getRow("1");
String username=row.getString("username");

TupleSoup is available through SourceForge and has been released under the BSD license.

Friday, August 24, 2007

GTAC 07

I'm currently sitting in Google's offices in New York for a small two-day invitation-only conference called Google Test Automation Conference 07. This year, the primary theme seems to be automated testing of web interfaces using either WebDriver or Selenium.

The overall content of the conference so far has been very good, except for a few really boring sessions... however, since I'm primarily a developer and not directly within QA, I might just be too far away from the target audience for those sessions. This morning started off with a really interesting talk about using domain specific languages to write Selenium tests. It was very well delivered, and the content was definitely interesting, but at the same time I just couldn't help but notice how most of the samples in their DSL simply brought the interface of Selenium closer to the native interface for WebDriver. I haven't actually tried using either of the two systems yet, so this is purely prejudiced speculation on my part :-)

Although the rest of the day is filled with more talks about Selenium which all sound really interesting, I'm really starting to look forward to a lightning talk at the end of the day called "Selenium vs WebDriver steel cage knife fight". It will be featuring the authors of the two projects and I intend to be right up there by the edge of the cage ready to throw bacon at them as the fight begins!

Thursday, August 23, 2007

Lunch at GTAC 07

Ok... I take it all back, lunch at Google today was so incredibly yummy that I completely forgive the missing coffee this morning. Seriously, even after reading blog post after blog post of people raving about the food at Google, I was in no way prepared for the actual deliciousness of it :-)

Now I just gotta find a way to sneak back into the offices around lunchtime every day... or maybe I could permanently occupy a stall in the bathroom and just come out once a day for lunch...

Caffeine free GTAC 07

I just arrived at Google's offices in New York for a conference on test automation. The invitation said breakfast was included, so I slept in late and went directly from my bed to the L train towards Manhattan. The breakfast is fine, except for one important element... no more coffee... seriously, nothing but decaf :-(

How can you start a conference at 8:30 in the morning and run out of coffee at 8:35 ???

Wednesday, July 4, 2007

Netstat Sucks!!!

Well... at least the following version of netstat sucks:


netstat 1.42 (2001-04-15)
Fred Baumgarten, Alan Cox, Bernd Eckenfels, Phil Blundell, Tuan Hoang and others

At first glance, netstat works just fine. In fact, if you have fewer than 12 digits in the IP addresses you are looking at, you will probably never notice a problem. Unfortunately, the servers I usually work on have 12-digit addresses. Have a look at the following output line from netstat.


tcp 0 0 ::ffff:217.116.236.191:25 ::ffff:217.116.236.18:40280 ESTABLISHED

It looks like a connection from 217.116.236.18 to 217.116.236.191. Unfortunately, that's not the case. In this case, it was really a connection from .185. So what happened? Well, somebody in the group of people who wrote netstat decided that it was more important to have nice and pretty looking columns than correct data. So they actually crop the IP address if the IP and port data don't fit the column width.
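To show how little it takes to lose that last digit, here is a small Java sketch of the cropping behaviour described above. It is not netstat's source code: the class and method names are made up, and the column width of 27 is an assumption I picked only because it reproduces the sample line.

```java
// Hypothetical reconstruction of the behaviour described above, not
// netstat's actual code. The width of 27 used in main is an assumption
// chosen to match the sample output.
class ColumnCropSketch {

    // Format "address:port" so that it never exceeds width characters,
    // cropping the end of the address (never the port) when the pair
    // is too long for the column.
    static String formatCropped(String address, int port, int width) {
        String portText = ":" + port;
        int room = Math.max(0, width - portText.length());
        String shown = address.length() > room ? address.substring(0, room) : address;
        return shown + portText;
    }

    public static void main(String[] args) {
        // Fits in the column, so it is shown intact:
        // prints "::ffff:217.116.236.191:25"
        System.out.println(formatCropped("::ffff:217.116.236.191", 25, 27));
        // One character too long, so the final digit of the address is lost:
        // prints "::ffff:217.116.236.18:40280"
        System.out.println(formatCropped("::ffff:217.116.236.185", 40280, 27));
    }
}
```

The cropped result is still a perfectly valid looking address, which is exactly why the problem is so easy to miss.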

I really can't see any good reason for doing this.... netstat sucks!