Posts filed under 'XDStore'

After Release Status

After the first public release of SpaceMapper DataStore and MN8 last week, I have some data to draw some conclusions from. Unfortunately, on the release date, the FreshMeat announcement contained links which did not pass through the SourceForge counters (the link went directly to prdownloads instead of the downloads section on the status page), so I have no idea about the downloads on the first day. However, it seems that there is a lot more interest in an XML database than in a new scripting language, and even so 78% are interested in the binaries and only 22% in the source of the database project. With the scripting language the situation is reversed: 79% interest in the source and only 21% in the binaries. I guess people are more interested in how to write an interpreter than in using a new one :)

No feedback, no bugs, no mailing list interest, no contributors, which is reasonable for a first public release.

What is not reasonable is that the Klez virus on somebody's computer noticed the release and is sending thousands of infected mails with my email address in the From field. In case anyone receives one, I'm really sorry; it's not my fault, it's not from me, and you can verify that by looking at the source of the message. All the mails I send go through our server (194.102.233.6), which I'm sure you won't find in the Received headers.

November 11th, 2002

DataStore and MN8 (ver. 0.7a) finally out!

Just wanted to let you know that after many months of hard work a first public release of SpaceMapper DataStore and MN8 is available.

What is SpaceMapper DataStore ?

DataStore is a Java-based document repository server for storing, querying and fetching XML-based documents. It is built on practical needs, allowing the storage of semi-structured documents (well formatted, maybe validated, XML, XHTML and HTML) and unstructured documents (TXT).

The documents are stored in a conventional relational database (Postgresql, MySQL, DB2, SAP DB), which brings along all the advantages and the reliability of these products. Being built on top of the Avalon Phoenix framework, it allows server components to be easily developed, deployed and shared. The documents are managed through a BEEP and/or XML-RPC interface using a subset of the SEP (Simple Exchange Profile) protocol.

What is SpaceMapper mn8 ?

mn8 is an experimental object oriented scripting language, tightly integrated with the net, which emulates the concepts at the core of XML in order to simplify and make as transparent as possible information extraction and manipulation from the WWW and XML documents.

Written in Java, it works with most operating systems and allows easy reuse of the huge number of libraries available through simple wrappers. At this point mn8 has concepts for: HTML, HTML-Forms, Cookies, RSS, OPML, HTTP, FTP, POP3, SMTP, Jabber, BEEP, XML-RPC, SOAP, MBox.

Then what is SpaceMapper ?

The SpaceMapper effort was born from the classic Internet desire to see if there is a better way. The effort evolved from an early RFP on the now-defunct SourceXchange which was awarded to the Romanian open source development firm noLimits Technologies. The project is Open Source (Apache-like license) and was sponsored by the 501(c)(3) non-profit arm of media.org (Internet Multicasting Service) and noLimits Technologies.

For any questions related to the SpaceMapper and/or mn8 project please write to the spacemapper-user@lists.sourceforge.net mailing list.

MN8 and DataStore are still very young and far from reaching their purpose, so any feedback, ideas, questions and constructive criticism are more than welcome :)

November 5th, 2002

SAPDB Rocks!

Last week I managed to install Postgresql under Windows and tested DataStore on it, with disappointing results. Today I installed SAPDB and tested again.

Fantastic! I used the RFC reference XML documents from xml.resource.org (3124 XML documents, small ones, around 4 MB in total), DataStore over SAPDB, and an MN8 script to load all these documents and store them using SEP (the Simple Exchange Profile) over XML-RPC.

The procedure lasted less than an hour, which means over 60 documents per minute. This is the same as Postgresql, but what is different is that by the time the storing of the documents was completed, the indexing was too. With Postgresql the indexing took around 30 hours.

Now, DataStore stores the document as it receives it (it does not alter it), but to allow very fast structured searches (get me the documents which have this element containing this value and this attribute with this value) it breaks the documents into little entities (pretty much as Google does). Relational databases can handle these entities efficiently (they actually use their indexes in the queries), and DataStore keeps information about the structure around each entity (part of that element ...). That way you can query tons of well or not so well formatted documents using complex structure information and get the results instantly.
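To make the idea a bit more concrete, here is a minimal Java sketch of such an entity index. It is only an illustration and not DataStore's actual schema or code: the class name `EntityIndexSketch`, the `path|word` key format and the in-memory map standing in for the relational tables are all assumptions.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch: an in-memory stand-in for indexed entity tables.
final class EntityIndexSketch {
    // key "element/path|word" -> ids of the documents containing that entity
    private final Map<String, Set<String>> index = new HashMap<>();

    // Parses the document and records every word together with its element path.
    void add(String docId, String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            walk(docId, root, "/" + root.getNodeName());
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse " + docId, e);
        }
    }

    private void walk(String docId, Node node, String path) {
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.TEXT_NODE) {
                // Each word becomes one entity, keyed by where it was found.
                for (String word : child.getNodeValue().toLowerCase().split("\\W+")) {
                    if (!word.isEmpty()) {
                        index.computeIfAbsent(path + "|" + word, k -> new TreeSet<>()).add(docId);
                    }
                }
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                walk(docId, child, path + "/" + child.getNodeName());
            }
        }
    }

    // "Get me the documents which have this element containing this value."
    Set<String> find(String path, String word) {
        return index.getOrDefault(path + "|" + word.toLowerCase(), new TreeSet<>());
    }
}
```

A structured search is then a single key lookup, which is the part a relational database can serve straight from an index instead of scanning the stored documents.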

After the 3124 documents, DataStore extracted 27000 entities (most of the time words).

And all this time I kept working as usual on the same computer.

I can't explain the poor performance of Postgresql. It does not perform much better on the Linux server either, and we tried many versions, so it must be something related to their JDBC drivers. In comparison to IBM DB2 or MySQL it is way slower.

However, SAPDB rocks and it's free (GPL/LGPL), with versions for Linux and Windows (and many others), so if you are looking for a free, industrial-strength relational database, look no further. Should I mention that it comes (the Windows version) with excellent GUI management tools?

June 14th, 2002

SpaceMapper Status

If I look at the mn8 cvs commits, I end up (again) at the conclusion that this was yet another unproductive week, with half the number of commits of last week. Hmm, this is something very hard to believe. "mn8" is working better than ever, everything is on its way, happily doing as planned, and every feature I was afraid of seems to work. This should be the time to code frenetically for the last 100 meters, and still ... Maybe it is just a matter of discipline?

Finally DataStore seems to be getting to frozen status, with no new bugs discovered, so Crow and Atech are 100% on "mn8". There is only one tricky issue to be solved around the net-centered part. It is very important and decisive for "mn8", and it is about integrating the net part with the query language in a transparent way. In this way you could do an each query from a URL, and the from command would get the query, select only the variables that can be transformed into a protocol-specific query (HTTP and HTML pages, beep and SEP queries, file and filtering, ...) and filter out documents at the source, before the rest are filtered by the where clause. If this doesn't work (but it will) we have serious problems: if you want to work on a couple of documents from an SEP subtree, that would mean getting all the documents from the subtree and filtering them locally, which is a waste of bandwidth and processing power. This is what Atech is doing now.
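A rough Java sketch of the splitting step described above, under heavy assumptions: mn8's real query model is not shown in this post, so the class `PushdownSketch`, the flat "field = value" conditions and the set of remotely supported fields are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: split a where clause into a part the remote protocol
// can evaluate and a part that has to be filtered locally.
final class PushdownSketch {
    final Map<String, String> remote = new LinkedHashMap<>(); // sent with the request
    final Map<String, String> local = new LinkedHashMap<>();  // applied after fetching

    static PushdownSketch split(Map<String, String> where, Set<String> remotelySupported) {
        PushdownSketch plan = new PushdownSketch();
        for (Map.Entry<String, String> condition : where.entrySet()) {
            // Only conditions on fields the protocol understands travel over the wire.
            boolean canPush = remotelySupported.contains(condition.getKey());
            (canPush ? plan.remote : plan.local).put(condition.getKey(), condition.getValue());
        }
        return plan;
    }
}
```

With something like this, only the documents matching the `remote` conditions would be transferred, and the `local` conditions would be checked on that smaller set.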

Crow was busy doing some word wrapping and is working on stripping out text from HTML pages. This could prove useful for converting them to XML and also for getting some results formatted in HTML and sending them by email.

"HyperPad" is quite usable and stable now. Borzy still has a few bugs to squash, but it looks and works quite decently, for a Java application ;)

Got a funny idea last week! "mn8" doesn't have too many user interaction possibilities right now, and I don't think the usual ones (button, text field, ...) have to be directly implemented, but concepts could be developed to emulate XForms. Now, having the concepts for some basic XForms widgets, someone could easily write the requested interface to the respective concept. Send the simple output of the wrapper concept (the one which produces the desired interface to the concept, just like in MVC) to an external program which knows how to render it, feed the resulting XML back to the initial concept, and voilà, you have the much wanted user interaction.

The good parts? First, you are not forced into a Java-based GUI application to get the interaction; there are and will be some native XForms implementations. Second, but also important, you could provide a web-based form starting from an XForms XML and a nice style sheet :). Heh, you could even have a CLI or a fake user interaction this way. I can't wait to get there, but this will come later, after the basic "mn8" works, in a second phase.

November 4th, 2001

MN8

mn8: Oh God, I'm running so late, like never before. I barely think or do anything else, yet progress is still so slow. The real problem is that designing an OO interpreter is not so trivial. All the time I have to come back to revise some design flaws. I think that is called refactoring ;) . At least I do it!

Spent the whole weekend working like crazy to refactor a few things instead of adding functionality, which means four more days of delay. But I had to do it, it was just not looking and functioning right. This was also the first time (that I remember anyway) of hating Java. I just don't understand why they made static behave the way it does during inheritance. In the end I had to use the Singleton pattern. It works, but I'm still not very happy. I will leave it as it is anyway, I can't afford to change it and I don't think there is another solution.
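I did not write down exactly which case bit me, so take the following as an illustration of the general Java behaviour and not as the mn8 code: static methods are bound at compile time and are hidden, not overridden, in subclasses, while instance methods are dispatched on the runtime type. All the class names here are made up for the example.

```java
// Hypothetical sketch of static hiding versus a singleton-style workaround.
final class StaticVsSingletonSketch {
    static class Base {
        static String kind() { return "base"; }
        String name() { return "base"; }

        String describeStatic() { return kind(); }   // always Base.kind(), bound at compile time
        String describeInstance() { return name(); } // dispatched on the runtime type
    }

    static class Derived extends Base {
        static String kind() { return "derived"; }   // hides Base.kind(), does not override it
        @Override String name() { return "derived"; }
    }

    // Singleton-style holder: the "static" state lives in one shared instance,
    // so a subclass can change behaviour through ordinary overriding.
    static final class Registry {
        private static Base instance = new Base();
        private Registry() {}
        static Base get() { return instance; }
        static void install(Base replacement) { instance = replacement; }
    }
}
```

Calling `describeStatic()` on a `Derived` object still answers "base", while `describeInstance()` answers "derived". Going through `Registry.get()` gives one global access point whose behaviour can still be replaced by a subclass instance.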

Being so late and still having to work on it always makes me wonder if I'm not like the cowboy programmer in the project management examples. But mn8 is (something which never stops amazing me) working as it was planned, more ready and more complete every day. But as I look around me, I don't really see people doing refactoring as radical and as extensive as mine, and this worries me. Is it possible that their design was right the first time? I don't think so, there is no such thing. I'm afraid that the others rather patch things instead of refactoring.

This time being late had some benefits too. DataStore got an alpha, but released, version of Avalon (finally we are not going to release it with a CVS version of Avalon), lots of bug fixes, a brand new SEP interpreter, a more stable XML-RPC server, and a full-blown PHP/XML-RPC example. Yep, it works great. Crow is working on some small Java tools to let us transform mails from mbox format into XML and then feed them to DataStore. We will need them later anyway, plus it is a good way of testing DataStore.

Atech did the BEEP handler, so now you can open a URLConnection to a beep://xxx URL and it will work. You still have to know what to talk over the connection, but at least it will allow mn8 to open URLs transparently. Now he is working on the XML-RPC handler. Let's see how that works out.
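For the curious, this is roughly how a custom scheme plugs into java.net. It is only a sketch, not Atech's handler: `BeepUrlSketch`, `BeepHandler` and the canned reply are assumptions, and a real handler would open a BEEP session to the host instead of returning a stub.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a URLStreamHandler for a "beep" scheme.
final class BeepUrlSketch {
    static final class BeepHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL target) {
            return new URLConnection(target) {
                @Override
                public void connect() {
                    connected = true; // a real handler would start the BEEP session here
                }

                @Override
                public InputStream getInputStream() {
                    connect();
                    String reply = "stub reply for " + getURL().getPath();
                    return new ByteArrayInputStream(reply.getBytes(StandardCharsets.UTF_8));
                }
            };
        }
    }

    // Opens the URL through the handler and returns the first line of the reply.
    static String fetch(String spec) {
        try {
            URL url = new URL(null, spec, new BeepHandler());
            try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                    url.openConnection().getInputStream(), StandardCharsets.UTF_8))) {
                return reader.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Here the handler is passed to the `URL` constructor directly to keep the example self-contained. To make plain `new URL("beep://...")` work everywhere, the handler would have to be registered globally, for example through `URL.setURLStreamHandlerFactory`.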

Ah, not to forget about "HyperPad". It got a pair of skin handlers so you can have skins in it (it doesn't really work well, but I don't think it is our fault). It amazes me how well the new Linux Java works. It has better font rendering than the IBM one, and it is definitely faster than under Windows 2000. It wasn't always like that. Linux rulez!

Thanks to neurogato for pointing out that Alan's code crew text is actually lyrics to the tune of Motorhead's "(We Are) The Road Crew"; yes, it indeed fits beautifully. BTW, since we are at the SmoothWall chapter, it just happened that last week we replaced an old router based on LRP with a firewall running SmoothWall. It took us about two hours, but only because we went for the installation first and read the manuals afterwards, just like any (in)sane person would do. Great piece of software!

October 9th, 2001


So, who is Remus?

Remus Pereni is a 32-year-old free thinker and IT addict who lives, works, and wonders about the meaning of life, relations, human nature, IT, technologies, clients, value and business from Satu Mare, Romania.
