Posts filed under 'Projects'

After Release Status

After last week's first public release of SpaceMapper DataStore and MN8, I have some data from which to draw conclusions. Unfortunately, on the release date the FreshMeat announcement contained links that did not pass through the SourceForge counters (they pointed directly to prdownloads instead of the downloads section on the status page), so I have no idea how many downloads there were on the first day. However, it seems that there is a lot more interest in an XML database than in a new scripting language. Even so, for the database project 78% of the downloads are binaries and only 22% are source. With the scripting language the situation is reversed: 79% source and only 21% binaries. I guess people are more interested in how to write an interpreter than in using a new one :)

No feedback, no bugs, no mailing list interest, no contributors, which is reasonable for a first public release.

What is not reasonable is that the Klez virus on somebody's computer noticed the release and is sending thousands of infected mails with my email address in the From field. In case anyone receives one, I'm really sorry; it's not my fault, it's not from me, and you can verify that by looking at the source of the message. All the mail I send goes through our server (194.102.233.6), which I'm sure you won't find in the Received headers.
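For anyone who wants to automate that check, here is a minimal sketch in Java. It assumes you have saved the full source of the message as text; the class and method names are made up for this example. It only collects the Received headers, ignores the body, and looks for the server address as a whole token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of SpaceMapper: checks the Received headers of a raw message.
class ReceivedHeaderCheck {

    // Returns the unfolded values of all Received headers in a raw message.
    static List<String> receivedHeaders(String rawMessage) {
        List<String> headers = new ArrayList<>();
        StringBuilder current = null; // the Received header being assembled, if any
        for (String line : rawMessage.split("\r?\n")) {
            if (line.isEmpty()) {
                break; // a blank line ends the header section; the body is ignored
            }
            char first = line.charAt(0);
            if (first == ' ' || first == '\t') {
                // folded continuation of the previous header line
                if (current != null) {
                    current.append(' ').append(line.trim());
                }
                continue;
            }
            if (current != null) {
                headers.add(current.toString());
                current = null;
            }
            if (line.regionMatches(true, 0, "Received:", 0, 9)) {
                current = new StringBuilder(line.substring(9).trim());
            }
        }
        if (current != null) {
            headers.add(current.toString());
        }
        return headers;
    }

    // True if any Received header mentions the given IP address as a whole token.
    static boolean passedThrough(String rawMessage, String serverIp) {
        Pattern ip = Pattern.compile(
                "(?<![0-9.])" + Pattern.quote(serverIp) + "(?![0-9]|\\.[0-9])");
        for (String header : receivedHeaders(rawMessage)) {
            if (ip.matcher(header).find()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Made-up message using documentation-only addresses.
        String message = "Received: from mail.example.net (mail.example.net [192.0.2.10])\n"
                + "\tby mx.example.org; Mon, 11 Nov 2002 10:00:00 +0200\n"
                + "From: someone@example.com\n"
                + "\n"
                + "body text\n";
        System.out.println(passedThrough(message, "194.102.233.6"));
    }
}
```

A message for which passedThrough(..., "194.102.233.6") is false did not go through our server. Keep in mind that Received headers below the first trusted hop can be forged, so this is a quick sanity check, not proof.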

November 11th, 2002

DataStore and MN8 (ver. 0.7a) finally out!

Just wanted to let you know that after many months of hard work a first public release of SpaceMapper DataStore and MN8 is available.

What is SpaceMapper DataStore?

DataStore is a Java-based document repository server for storing, querying and fetching XML-based documents. It is built around practical needs, allowing the storage of semi-structured documents (well-formed, optionally validated XML, XHTML and HTML) and unstructured documents (TXT).

The documents are stored in a conventional relational database (PostgreSQL, MySQL, DB2, SAP DB), which gives you the full advantages and reliability of these products. Being built on top of the Avalon Phoenix framework, it allows server components to be easily developed, deployed and shared. The documents are managed through a BEEP and/or XML-RPC interface using a subset of the SEP (Simple Exchange Profile) protocol.

What is SpaceMapper mn8?

mn8 is an experimental object-oriented scripting language, tightly integrated with the net, which emulates the concepts at the core of XML in order to make extracting and manipulating information from the WWW and XML documents as simple and transparent as possible.

Written in Java, it works with most operating systems and allows easy reuse of the huge number of libraries available through simple wrappers. At this point mn8 has concepts for: HTML, HTML-Forms, Cookies, RSS, OPML, HTTP, FTP, POP3, SMTP, Jabber, BEEP, XML-RPC, SOAP, MBox.

Then what is SpaceMapper?

The SpaceMapper effort was born from the classic Internet desire to see if there is a better way. The effort evolved from an early RFP on the now-defunct SourceXchange which was awarded to the Romanian open source development firm noLimits Technologies. The project is Open Source (Apache-like license) and was sponsored by the 501(c)(3) non-profit arm of media.org (Internet Multicasting Service) and noLimits Technologies.

For any questions related to the SpaceMapper and/or mn8 projects, please write to the spacemapper-user@lists.sourceforge.net mailing list.

MN8 and DataStore are still very young and far from reaching their purpose, so any feedback, ideas, questions and constructive criticism are more than welcome :)

November 5th, 2002

Java Community Server.

URL: http://radio.weblogs.com/0109405/2002/09/03.html#a30

Started the Java Community Server

It's an implementation of the Radio Community Server in Java of course. It consists of an XML-RPC backend, HSQL for the DB, and Pnuts to script all the XML-RPC stuff.

My main focus is to get the xmlStorageServer stuff working first, which is the bulk of the community server anyway.

So far I've got the basic stuff in place to allow all the XML-RPC stuff to be scripted via Pnuts.

[Miceda]

This is cool; it would be even cooler to have it as a block in Apache Phoenix (part of the Avalon project).

Even better, the storage part already exists and works well (though it is still in alpha). Check out DataStore, which is our XML data store on top of regular RDBMSs. What makes DataStore unique is that it is not only for XML: you can store well-formed documents (think XML, XHTML), well-formed and validated documents (DTD only for now) or regular text documents. It's a block in Phoenix, with an XML-RPC server block that gives you access to the storage. You can work with a cool XML-based query language called SEP (Simple Exchange Profile), which allows you to search, update and add documents over BEEP or XML-RPC. You even have a Java client API similar to JDBC. Plus, the license is Apache-like :) . We tested it with PostgreSQL, SAP DB and IBM DB2. Unfortunately MySQL 4 doesn't support SELECT INTERSECT yet, so the intersect part of SEP doesn't work with MySQL yet.

September 4th, 2002

Quick Palm RSS Reader.

Today I took some time and wrote an mn8 script to harvest my subscription file from Radio and render it into some basic HTML using a simple XSL. Once I had the HTML files rendered locally, I only had to make a main page so that Plucker would convert them into its internal format. Here are some screen shots.

Palm RSS Reader screen shot 0 Palm RSS Reader screen shot 1
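The rendering step is easy to try outside mn8 as well. The sketch below is not the mn8 script from this post; it is a rough Java equivalent of the XSL part only, using the XSLT support that ships with the JDK. The stylesheet, the class name and the sample feed are all invented for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class: renders an RSS document to very basic HTML.
class RssToHtml {

    // Invented stylesheet: channel title as heading, one linked list entry per item.
    static final String STYLESHEET =
          "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"html\"/>"
        + "<xsl:template match=\"/\">"
        + "<html><body><h1><xsl:value-of select=\"rss/channel/title\"/></h1><ul>"
        + "<xsl:for-each select=\"rss/channel/item\">"
        + "<li><a href=\"{link}\"><xsl:value-of select=\"title\"/></a></li>"
        + "</xsl:for-each>"
        + "</ul></body></html>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to the given RSS text and returns the HTML.
    static String render(String rssXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            transformer.setOutputProperty(OutputKeys.INDENT, "no");
            StringWriter html = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(rssXml)),
                    new StreamResult(html));
            return html.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("could not render feed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up feed, just enough structure for the stylesheet above.
        String feed = "<rss version=\"0.92\"><channel><title>Example feed</title>"
                + "<item><title>First post</title><link>http://example.org/1</link></item>"
                + "</channel></rss>";
        System.out.println(render(feed));
    }
}
```

Writing one such HTML file per feed, plus an index page linking to them, gives Plucker something it can crawl and convert.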

Not quite satisfied yet. I'm not sure whether I should leave the regular HTML tags in the description or convert it to plain text, since the markup doesn't make much sense here; the text would be a lot more readable. I also have to make it display only the differences since the last run. The buy line in the shots comes from the screen dumper program :)

Updated (8 June 2005): This was a simple experiment and by no means a final product or solution. You can check out Plucker for a complete Palm-based news reader.

August 14th, 2002

SAPDB Rocks!

After I managed to install PostgreSQL under Windows last week and tested DataStore with disappointing results, today I installed SAPDB and tested again.

Fantastic! I used the RFC reference XML documents from xml.resource.org: 3124 small XML documents, around 4M in total, DataStore over SAPDB, and an MN8 script to load all these documents and store them using SEP (the Simple Exchange Profile) over XML-RPC.

The procedure took less than an hour, which means over 60 documents per minute. This is the same as PostgreSQL, but what is different is that by the time the storing of the documents was completed, the indexing was too. With PostgreSQL this required around 30 hours.

Now, DataStore stores the document as it receives it (it does not alter it), but to allow very fast structured searches (get me the documents which have this element containing this value and this attribute with this value) it breaks the documents into little entities (pretty much as Google does). Relational databases can use these entities efficiently (they actually use their indexes in the queries), and DataStore keeps information about the structure around each entity (which element it is part of, and so on). That way you can query tons of well or not so well formatted documents using complex structure information and get the results instantly.

From the 3124 documents, DataStore extracted 27000 entities (most of them words).
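To give a feeling for what breaking a document into entities means, here is a minimal sketch in Java. It is not DataStore's code, and the real entity model and tables are different; the class name, the path notation and the word splitting rule are assumptions made for this example. Each word is recorded together with the element path it was found under, and each attribute value is recorded under an "@" path. Rows like these are what a relational database can index.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration of shredding an XML document into indexable entities.
class EntityShredder {

    // One searchable unit: a word (or attribute value) plus the path it was found under.
    static final class Entity {
        final String path;
        final String word;
        Entity(String path, String word) { this.path = path; this.word = word; }
        @Override public String toString() { return path + " -> " + word; }
    }

    static List<Entity> shred(String xml) {
        final List<Entity> entities = new ArrayList<>();
        final List<String> openElements = new ArrayList<>(); // stack of open element names
        final StringBuilder text = new StringBuilder();       // text seen since the last tag

        DefaultHandler handler = new DefaultHandler() {
            private String path() {
                return "/" + String.join("/", openElements);
            }

            // Splits the buffered text into lower-case words and records each one.
            private void flushText() {
                String buffered = text.toString().toLowerCase(Locale.ROOT);
                for (String word : buffered.split("[^\\p{L}\\p{N}]+")) {
                    if (!word.isEmpty()) {
                        entities.add(new Entity(path(), word));
                    }
                }
                text.setLength(0);
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                flushText(); // text before this tag belongs to the parent element
                openElements.add(qName);
                for (int i = 0; i < attrs.getLength(); i++) {
                    entities.add(new Entity(path() + "/@" + attrs.getQName(i), attrs.getValue(i)));
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length); // may arrive in several chunks
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                flushText();
                openElements.remove(openElements.size() - 1);
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
        return entities;
    }

    public static void main(String[] args) {
        String doc = "<rfc number=\"2629\"><title>Writing RFCs using XML</title></rfc>";
        for (Entity entity : shred(doc)) {
            System.out.println(entity);
        }
    }
}
```

Stored as rows of (document, path, word) with an index on path and word, a search such as "documents whose title contains xml" becomes an ordinary indexed lookup instead of a scan over the stored documents.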

And all this time I kept working as usual on the same computer.

I can't explain PostgreSQL's poor performance. It does not perform much better on the Linux server either, and we tried many versions. It must be something related to its JDBC driver; compared to IBM DB2 or MySQL it is way slower.

However, SAPDB rocks and it's free (GPL/LGPL), with versions for Linux and Windows (and many others), so if you are looking for a free, industrial-strength relational database, look no further. Should I mention that the Windows version comes with excellent GUI management tools?

June 14th, 2002

RUE rated “JARS top 25%”

One year after submission, JARS has reviewed RUE and rated it JARS top 25%.

The score RUE obtained is:

  • Presentation: 170
    (max possible 200, given for content: visual appeal, look, feel, packaging, graphics/sound/video quality, overview, documentation etc.)
  • AppletPerfect: 240
    (max possible 300, given for difficulty/complexity, correctness, approach, technique, style, etc.)
  • Function: 180
    (max possible 200, given for usefulness: to the net, ease of use, known bugs, operation, stability, etc.)
  • Originality: 280
    (max possible 300, given for uniqueness: new concept, new idea, improvement, innovative usage, need/reasoning, etc.)
Apprentice JAVAHOLIC certified developer

We were 80 points short of being rated JARS top 5%.

Heck, I even got certified by JARS as Apprentice Javaholic. Who cares, anyway?

June 2nd, 2002

Server side MN8

So, after finishing the first fully functional internal release of mn8, we decided that running mn8 on the web server side would also be useful, so we started to write a servlet which will run mn8-based scripts and concepts.

A couple of days should have been enough, yet it's still not ready. When I started I tried to make an accurate picture of the steps involved in order to improve the time estimate.

The reasons are: a malfunction of the custom-scheme-based URLs when mn8 is run inside Tomcat, and a synchronization problem, as multiple mn8 scripts can now run concurrently inside the same Java instance.

The synchronization issue could have been foreseen, but the Tomcat thing could not.

So, what does it take to make accurate project estimates? Experience?

May 18th, 2002

MN8

Hm, it took me 3 weeks to solve a bug; that's definitely a record. Truth is, I'm not sure whether it was a bug or a feature. It is a bug because it didn't work. It did when I implemented the feature, but then "mn8" changed and nobody tested those two methods. Now it works, and it does so nicely; that's what counts. It is also covered with test cases, so next time it gets screwed up we will know. Cool, we now have 1114 tests for "mn8", tests containing "mn8" script code running in an "mn8" script-based test framework. Cool.

We also have some new scrapers and a primitive "Jabber" client. For now it is only able to send and receive normal messages. We will see later what more we can do with it.

The RSS, RDF, OPML and scriptingNews2.xml concepts are implemented. A cool thing they all do is let you have a new RSS, RDF ... feed containing only the items that are new since it was last invoked.
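The idea behind the "only new items" feed can be sketched in a few lines of Java. This is not how the mn8 concepts are implemented; it is a simplified illustration that assumes every item can be identified by its link and that the set of already seen links is kept in memory (a real tool would have to persist it between runs). The class name is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration: remembers which items were seen on earlier invocations.
class NewItemsFilter {

    private final Set<String> seenLinks = new HashSet<>();

    // Returns, in feed order, only the links that were not returned by an earlier call.
    List<String> newItems(List<String> itemLinks) {
        List<String> fresh = new ArrayList<>();
        for (String link : itemLinks) {
            if (seenLinks.add(link)) { // add() is false when the link was already seen
                fresh.add(link);
            }
        }
        return fresh;
    }

    public static void main(String[] args) {
        NewItemsFilter filter = new NewItemsFilter();
        // First run: everything is new.
        System.out.println(filter.newItems(Arrays.asList("http://example.org/1", "http://example.org/2")));
        // Second run: only the item that appeared since the first run is returned.
        System.out.println(filter.newItems(Arrays.asList("http://example.org/2", "http://example.org/3")));
    }
}
```

Building the reduced feed is then just a matter of writing out the items whose links came back from newItems.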

The only remaining things to do before we hand it over to our publisher are: finishing the documentation for the language syntax (the API for the concepts is almost done), tackling the basic error handling a bit and doing some more examples.

April 19th, 2002

Project Management

We said six months, and it will be one year by the time the first phase of the project is over. That means 100% late. True, it is a bit of a research project, and the number of features needed and implemented for a pleasant release is double that of the initial proposal. Still, how could I be this wrong?

Since I was a team leader at my old company I have kept a time log. What I notice is that for 80% of the tasks I have to do I'm below or at the estimate. But with the remaining 20% I'm chaos. That is actually where all the extra time goes. Maybe one day I will have enough information to find all the correlations about the type of tasks, or maybe the area of the tasks, which are rebelling against my management :) .

March 19th, 2002

Development

On 14 March it will be one year since we actually started coding on all of the "SpaceMapper" projects; a very difficult year, but full of rewards.

So, I took JMetric and I did a couple of measurements on mn8. Here are the results:

  • Lines of code: 20311
  • Statements: 14218
  • Classes: 211
  • Methods: 2296
  • Variables: 981
  • Public Methods: 1877

This is just the code: no documentation, no unit tests, just the core Java source files. This was till recently a man/month effort. The actual code was started only in August; until August I was working on the prototype.

Not bad, I guess.

I only have two features open with one task in each, so I'm really at the end of a first serious release. It's not a bad feeling, but it is not good either; it's exhaustion, accomplishment and fear. In a couple of days/weeks your secret will be publicly exposed. It's like having a child and giving it away to strangers to take care of.

Enough of this mumbling; this is not what I had to say. I was about to tell you about testing and bugs.

Over the last two months, while closing the remaining features, we started intense testing. What I found is that bugs come in layers: three particular types of bugs, each type with its own schedule.

The first layer is the soft and easy bugs. Plenty of them, quick to catch and fix. Unit tests are a great investment for this layer.

Then comes the more complex layer. Not difficult to find, but a bit trickier to fix. Most of these bugs can be caught by unit tests and can easily be kept under control in the future, again through the unit tests.

But then comes the last layer, at the end, when you are really tired and sick of bugs. These bugs are the nasty ones. Very hard to catch, very hard to reproduce, very hard to understand what the heck is going on. Should I even mention fixing them? I spent the last two days chasing such a bug; I'm not there yet, but I will be. Sometimes I wonder if it is a good idea at all to spend so much time on just one bug.

Unit tests won't help you with these bugs, except maybe after you have fixed them, to make sure they don't reappear.

It was also interesting to notice that whoever said that 80% of bugs are located in 20% of the code was absolutely right. The majority of bugs were around 3 classes which were extremely complicated. Hard to believe that 80% of the bugs were actually in around 200 lines of code out of 20,000. The thing is that when I designed those particular portions of code I was aware of the degree of difficulty involved, so I tried to code the way Kent Beck recommends, explicitly expressing intention: using meaningful names, breaking the code into many minuscule methods and so on. Still, even if the functionality was a lot easier to understand that way, it is still not exactly easy.

Another interesting conclusion was that even if at the beginning all of us blame somebody or something else, always, and I mean always, we are the stupid ones. The debugging time would probably be reduced considerably if we always started by checking the code instead of trying to catch what we imagine is happening, which is almost always miles away from what is actually happening.

March 5th, 2002
