More news from the non-relational multiverse

bulbflow - a Python framework for the graph era

All Python enthusiasts and admirers of the big snake, rejoice! The open-source developer group TinkerPop has been writing a great stack of software on top of the new graph databases. The group was co-founded by Marko Rodriguez and focuses on technologies in the graph database space. Among these is “Bulbs” (bulbflow.com) –

… an open-source Python persistence framework for graph databases and the first piece of a larger Web-development toolkit that will be released in the upcoming weeks. It’s like an ORM (Object Relational Mapping) for graphs, but instead of SQL, you use the graph-traversal language Gremlin to query the database. You can use it to connect to any Blueprints-enabled database, including TinkerGraph, Neo4j, OrientDB, Dex, and OpenRDF (and there is an InfiniteGraph implementation in development). Blueprints is a collection of interfaces, implementations, ouplementations, and test suites for the property graph data model. Blueprints is analogous to the JDBC, but for graph databases.

– so you are not committing your code to any particular database or architecture. That is great and important, since I have not yet decided which of the aforementioned NoSQL database systems to commit to. As with current relational (SQL) systems (like MySQL, PostgreSQL, Oracle), there will probably be a couple of big players ruling the roost in the end, and you might end up using several of them in parallel. But just as JDBC does for relational databases, Blueprints means that this should not really affect the code (much).
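To illustrate why coding against a common interface keeps the database choice open, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: `GraphBackend`, `DictGraph`, `EdgeListGraph` and `liked_by` are hypothetical names and are not part of the Blueprints or Bulbs APIs. The two in-memory classes only stand in for two different databases.

```python
from typing import Dict, List, Protocol, Tuple


class GraphBackend(Protocol):
    """Hypothetical minimal interface (NOT the Blueprints or Bulbs API)."""

    def add_edge(self, source: str, label: str, target: str) -> None: ...

    def out(self, source: str, label: str) -> List[str]: ...


class DictGraph:
    """Stand-in backend 1: adjacency lists keyed by (source, label)."""

    def __init__(self) -> None:
        self._adj: Dict[Tuple[str, str], List[str]] = {}

    def add_edge(self, source: str, label: str, target: str) -> None:
        self._adj.setdefault((source, label), []).append(target)

    def out(self, source: str, label: str) -> List[str]:
        return list(self._adj.get((source, label), []))


class EdgeListGraph:
    """Stand-in backend 2: same interface, stored as a flat list of triples."""

    def __init__(self) -> None:
        self._edges: List[Tuple[str, str, str]] = []

    def add_edge(self, source: str, label: str, target: str) -> None:
        self._edges.append((source, label, target))

    def out(self, source: str, label: str) -> List[str]:
        return [t for s, l, t in self._edges if s == source and l == label]


def liked_by(graph: GraphBackend, person: str) -> List[str]:
    """Application code: only talks to the interface, never to a backend."""
    return sorted(graph.out(person, "likes"))
```

Swapping `DictGraph()` for `EdgeListGraph()` leaves `liked_by` untouched, which is the property the quoted description claims for Blueprints-enabled databases.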
The point of using a domain-specific language rather than a general-purpose one is the ability to express big constructs more elegantly, which seems to be the case here: the PageRank algorithm can be expressed in a few lines of code.
Or see this Gremlin example for collaborative filtering and eigenvector centrality:

g = new Neo4jGraph('/tmp/neo4j')

// calculate basic collaborative filtering for vertex 1
m = [:]
g.v(1).out('likes').in('likes').out('likes').groupCount(m)
m.sort{a,b -> a.value <=> b.value}

// calculate the primary eigenvector (eigenvector centrality) of a graph
m = [:]; c = 0
g.V.out.groupCount(m).loop(2){c++ < 1000}
m.sort{a,b -> a.value <=> b.value}
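For readers who do not speak Gremlin, here is a rough plain-Python sketch of what the two traversals compute. It does not use Bulbs or any database; graphs are plain dicts, and the function names and data layout are my own assumptions. The second function is not a literal translation of the Gremlin loop: it swaps in a normalised power iteration along outgoing edges, which is one common way to approximate eigenvector centrality.

```python
from collections import Counter
from typing import Dict, List


def collaborative_filter(likes: Dict[str, List[str]], start: str) -> Counter:
    """Count how often each item is reached via out -> in -> out on 'likes'."""
    # Reverse index: item -> users who like it (needed for the 'in' step).
    liked_by: Dict[str, List[str]] = {}
    for user, items in likes.items():
        for item in items:
            liked_by.setdefault(item, []).append(user)

    counts: Counter = Counter()
    for item in likes.get(start, []):            # out('likes')
        for user in liked_by.get(item, []):      # in('likes')
            for other in likes.get(user, []):    # out('likes')
                counts[other] += 1               # groupCount(m)
    return counts


def eigenvector_centrality(out_edges: Dict[str, List[str]],
                           iterations: int = 100) -> Dict[str, float]:
    """Approximate eigenvector centrality by power iteration (an assumption,
    not the exact semantics of the Gremlin loop)."""
    nodes = set(out_edges)
    for targets in out_edges.values():
        nodes.update(targets)

    score = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        nxt = {node: 0.0 for node in nodes}
        # Every vertex passes its current score along each outgoing edge.
        for source, targets in out_edges.items():
            for target in targets:
                nxt[target] += score[source]
        total = sum(nxt.values())
        if total == 0:
            break  # no edges left to propagate along
        # Normalise so the scores sum to 1 and do not blow up.
        score = {node: value / total for node, value in nxt.items()}
    return score


if __name__ == "__main__":
    likes = {"u1": ["x", "y"], "u2": ["x", "z"], "u3": ["y", "z", "w"]}
    print(collaborative_filter(likes, "u1").most_common())

    web = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    print(sorted(eigenvector_centrality(web).items()))
```

One caveat: plain power iteration only settles down on graphs that are connected enough and not strictly periodic (a simple two-node back-and-forth cycle will oscillate forever), so treat this as a toy for building intuition.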

Overall, I am quite happy and impressed with the current active developments in the field of non-relational graph databases. Labeled with the warning sign “some scientific assembly required”, these developments match the requirements of biological networks very well. The motivation for these developments is probably the pressure to handle the data of social networks and the semantic web efficiently. But life is a graph! (I lay claim to having originated that sentence.) To me this seems a great opportunity to reframe existing biomolecular datasets as graphs and use network analysis methods out of the box for further investigations on a scale unimaginable before. So many things to test and check out …

found via SchockWellenReiter

Algorithm, Code, Data, Graph, Networks, Source

This entry was posted on 2011/07/19, 22:52 and is filed under Computers & Code, Networks.

cistronic