Friday, March 5, 2010

OSGi and source code growth problem

Recently I found a pretty interesting OSGi DevCon keynote preview:
http://techdistrict.kirkk.com/2010/02/17/osgi-devcon-preview/
The author proposes to solve the source code growth problem by using OSGi in the system architecture.
I agree that OSGi is the best modularization technology for Java and quite useful if you need one or more of the following:
- modularity framework
- usage of multiple versions of the same module
- hot module redeployment
- dependency injection

However, in its present state OSGi has a major drawback. It does not make developers more productive; instead it adds work, such as managing bundle metadata and following OSGi guidelines. In many cases, the benefits of OSGi may not be worth the additional development overhead.

And, finally, I don't see how OSGi can solve the source code growth problem. Instead of managing thousands of classes, developers have to manage thousands of modules. The growth problem still exists with OSGi.

Monday, March 1, 2010

Source code as data

The von Neumann architecture opened an era of modern programming in which program instructions became data. Later, computer languages, compilers and source code were invented to help programmers deal with the growing complexity and size of programs. Today, 65 years later, people are able to manage terabytes of data and literally build clouds out of computers, but still struggle with programs of millions of lines of code, which are far beyond human comprehension.

In the programming world, source code is treated like sacred knowledge: stored in text files, written in programming languages and understandable mostly by its authors. Most difficulties in code comprehension and research stem from the fact that code is not considered data (data as in a database). Every programming language, technology or framework has its own syntax and program structure. Source code search is implemented by generic text search engines, which allow searching only by terms or by names of classes, methods and so on. It is impossible to formulate a precise query that combines names, program structure and references in a generic way. For example, a generic text search engine cannot list all functions called from a given function, directly or indirectly, or find all recursively called functions.
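To make the two example queries concrete, here is a minimal sketch of what they look like once call references are available as data rather than text. The call graph, the function names in it, and the helpers reachable_callees and recursive_functions are all hypothetical and exist only for illustration; a real tool would have to extract the graph from parsed source code.

```python
from collections import deque

# Hypothetical call graph: each function maps to the functions it calls directly.
CALLS = {
    "main": ["parse", "run"],
    "parse": ["read_token"],
    "run": ["step"],
    "step": ["step", "log"],  # step calls itself
    "read_token": [],
    "log": [],
}


def reachable_callees(graph, start):
    """Return every function called from `start`, directly or indirectly."""
    seen = set()
    queue = deque(graph.get(start, []))
    while queue:
        name = queue.popleft()
        if name in seen:
            continue  # already visited, which also protects against cycles
        seen.add(name)
        queue.extend(graph.get(name, []))
    return seen


def recursive_functions(graph):
    """Return the functions that can reach themselves through some call chain."""
    return {name for name in graph if name in reachable_callees(graph, name)}
```

Neither query can be expressed as a text search, because the answer depends on following references between artifacts, not on matching the characters of a name.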

It’s clear that the exponential growth of program size can be managed only by applying the principles of structured data organization to source code. Program artifacts must be indexed and stored in a database in a technology-independent, generic form. Such a program database must allow querying and modifying artifacts in a transactional way. A generic form of artifacts will allow transferring algorithms and business logic between programming languages and technologies.
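As a rough sketch of such a program database, assuming a deliberately tiny schema, the example below stores artifacts and the references between them in SQLite. The table names artifact and link, and the helpers open_program_db, add_artifact, add_call, callees and rename, are invented for this illustration and do not describe any existing tool.

```python
import sqlite3


def open_program_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE artifact (
            id   INTEGER PRIMARY KEY,
            kind TEXT NOT NULL,          -- e.g. 'function', 'class'
            name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE link (
            source INTEGER NOT NULL REFERENCES artifact(id),
            target INTEGER NOT NULL REFERENCES artifact(id),
            kind   TEXT NOT NULL         -- e.g. 'calls'
        );
    """)
    return conn


def add_artifact(conn, kind, name):
    cur = conn.execute(
        "INSERT INTO artifact (kind, name) VALUES (?, ?)", (kind, name))
    return cur.lastrowid


def add_call(conn, caller_id, callee_id):
    conn.execute(
        "INSERT INTO link (source, target, kind) VALUES (?, ?, 'calls')",
        (caller_id, callee_id))


def callees(conn, name):
    """Names of all functions called from `name`, directly or indirectly."""
    rows = conn.execute("""
        WITH RECURSIVE reach(id) AS (
            SELECT l.target FROM link l
            JOIN artifact a ON a.id = l.source
            WHERE a.name = ? AND l.kind = 'calls'
            UNION                         -- UNION drops duplicates, so cycles end
            SELECT l.target FROM link l
            JOIN reach r ON l.source = r.id
            WHERE l.kind = 'calls'
        )
        SELECT a.name FROM artifact a
        JOIN reach ON reach.id = a.id
        ORDER BY a.name
    """, (name,))
    return [row[0] for row in rows]


def rename(conn, old, new):
    """Rename an artifact in one transaction; roll back if it does not exist."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE artifact SET name = ? WHERE name = ?", (new, old))
        if cur.rowcount != 1:
            raise LookupError(old)
```

Because references point at artifact ids rather than names, a rename changes a single row and every reference to the artifact stays valid, which is one practical benefit of keeping artifacts in a generic, structured form.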