Posts Tagged Algorithm
As said before, I am getting deeper into graph databases, specifically “neo4J”. The pace of development is breathtaking; it’s hard to keep up with the new versions and amazing features. In preparation for attending a “Cypher Hands On” (Meetup-Graph), I finally got round to updating to the latest 1.8M03 milestone. By now, there are a couple of nice introductory videos available:
You might want to check out the videoGraphy @ neo4J. I also recommend the following Intro to Graph Databases (on vimeo) which has a nice explanation on what the buzz/whole point is all about plus some real world examples and history:
Looking to deepen my understanding of the graph-theoretic foundations, I came across these books via blog.postmaster.gr:
– “Graph Theory and Complex Networks: An Introduction” by Maarten van Steen. It is very interesting to note that this book is also available electronically as a personalised PDF. As the author notes: “When you write a book containing mathematical symbols, thinking big and acting commercially doesn’t seem the right combination. I merely hope to see the material to be used by many students and instructors everywhere and to receive a lot of constructive feedback that will lead to improvements. Acting commercially has never been one of my strong points anyway”.
– Reinhard Diestel: “Graph Theory“.
It is fun, indeed. Enjoy!
Google started to roll out the Knowledge Graph, intended to be more about things rather than just strings. Delivering and disambiguating related content based on semantic network associations sounds great; whether this really is a step out of the filter bubble remains to be seen. Overall, it seems to be related to the idea of a conceptual graph, and Wikipedia forms a big chunk of the underlying knowledge base.
techcrunch.com “Google Just Got A Whole Lot Smarter, Launches Its Knowledge Graph”
Google’s official blog “Introducing the Knowledge Graph: things, not strings”
lifehacker.com “Google Knowledge Graph Brings Smarter Semantic Results to Your Google Searches”
webpronews.com “Knowledge Graph: Google Gets Tight With Wikipedia“
Pearl’s book on “Causality” has been on my shelf for a while now. I have also read in it a few times, but never managed to get through it in one go, cover to cover. Consequently, I haven’t come to grips with all the details, implications and equations yet. No reason to worry about my intellectual capabilities: it’s quite fundamental and takes time to sink in. Now Judea Pearl has been awarded the 2011 ACM Turing Award – Congratulations!
The annual Association for Computing Machinery (ACM) A.M. Turing Award, sometimes called the “Nobel Prize in Computing,” recognizes Pearl for his advances in probabilistic and causal reasoning. His work has enabled creation of thinking machines that can cope with uncertainty, making decisions even when answers aren’t black or white. […]
The UCLA computer science professor is widely credited with coining the term “Bayesian Network,” which refers to a statistical model ACM describes as mimicking “the neural activities of the human brain, constantly exchanging messages without benefit of a supervisor.” Bayesian networks have been used to, among other things, analyze biological data for studies of medicine and diseases.
Here is a chance to see him talk for yourself:
“I compute, therefore I understand” – More videos are here on theScienceNetwork.
Found via networkworld.com: “Judea Pearl, a big brain behind artificial intelligence, wins Turing Award”. See also ACM NEWS: “Judea Pearl Wins 2011 ACM Turing Award“.
Last week CGAL-4.0-beta1 was released – as with most X.0 and beta releases of any kind of software, this is not yet intended for use in production. However, previous releases look quite stable.
The goal of the CGAL Open Source Project is to provide easy access to efficient and reliable geometric algorithms in the form of a C++ library. CGAL is used in various areas needing geometric computation, such as: computer graphics, scientific visualization, computer aided design and modeling, geographic information systems, molecular biology, medical imaging, robotics and motion planning, mesh generation, numerical methods… CGAL can be used together with Open Source software free of charge.
Also, a Book on “CGAL Arrangements and Their Applications” just became available (Springer).
The list of features packed into the kernels is impressive and too long to be summed up in a few lines – see here for the Package Overview – I am sure you’ll find quite a few items of interest. Especially the spatial sorting functions and matrix searches sound very useful to me. In addition, there is support for 3rd party software such as the Boost Graph Library. So much to check out – here are some tutorials, manuals and videos on CGAL … For example the dynamic 3D Voronoi demo below. Have fun!
Thanks for hints to Kasthuri Kannan and Chris Sander.
From last week’s 28th Chaos Communication Congress (28C3) – an annual four-day conference on technology, society and utopia – there are a couple of really interesting talks. Of course, these are freely available (the logo on the right directs to their YouTube channel, the link in the blockquote takes you to the wiki) under a Creative Commons (BY-NC-ND) license.
As practised with 26C3 and 27C3 we want you to come together. no nerd left behind: Allow those unable to attend the Congress in Berlin to celebrate their own Hack Center Experience, watch the streams, participate via twitter or chats, drink Tschunk, cook and have a good time.
Indeed, in the very long run, it should only be necessary to
determine the amino acid sequence of a protein, and its three-dimensional
structure could then be predicted; in my view this day will not come soon,
but when it does come the X-ray crystallographers can go out of business,
perhaps with a certain sense of relief, and it will also be possible to discuss
the structures of many important proteins which cannot be crystallized and
therefore lie outside the crystallographer’s purview.
If you are into (structural) molecular biology, you will probably have seen this before. Honestly, I don’t get tired of reading this statement. That was 49 years (and 11 days, to be precise) ago – where are we now, almost half a century later? Are we there yet? (sounds like the little ones nagging on a long-distance journey – daddy told you it would take a while!) Seems we might be there soon, since we have made quite some headway recently.
First of all, the above statement displays some amazing farsightedness combined with a humble self-perception. He is not overstating it, indicating that not all will be crystallized. If you read on in his speech, he was already talking about larger assemblies and complexes, and that’s where we are now, and that’s where things get REALLY interesting. Besides, the picture of him modeling a 3D structure (with the sticks for the z axis) is by no means old-fashioned; to me it means he just took what was available at the time to get the 3D model constructed. Today we have sophisticated computer graphics, yet nothing beats the experience of building a physical model – an art that should not be forgotten but developed further (thinking of 3D printing here). I am convinced that even in the age of high-throughput techniques, interaction data etc., we ultimately need a structural view to truly understand the molecular mechanisms.
But the main point – or prediction – is that ultimately, we should be able to compute structure and function from sequence alone.
If you think about it, that’s a very bold statement indeed, with wide ramifications. By now our sequencing capabilities are growing at a pace beyond Moore’s law (see here). I probably don’t have to remind anyone that experimental structure determination is difficult and time-consuming, to say the least. And computer predictions in the absence of a related solved structure in the PDB are usually no match for the real thing (a.k.a. experimental 3D structure).
But there is a fresh breeze in the field: Recently a number of groups have reported that the ancient dream (from the mid-nineties and even before, “ancient” in bioinformatics = over 15 yrs) of using patterns of correlated mutations to derive useful spatial constraints for structure prediction does indeed work. Properly. Finally!
Given enough information content, there seem to be no limits to the size of the proteins, and even notoriously difficult ones like transmembrane structures seem to work. All you need is sequences. And lots of them. Properly aligned, of course. (That’s what a lot of bioinformatics was all about, wasn’t it?) But massive amounts of sequences are what we get anyway these days, more than you ever wanted (to analyze) from next-gen sequencing projects. That’s off-topic, though; delving deeper into that mania is a topic for a different post.
If you are interested in checking it out in depth: one of the methods is called EVfold, see http://EVfold.org.
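To make the idea of “correlated mutations” a bit more concrete, here is a minimal Python sketch. It is not EVfold, nor the method of any of the papers listed below; it merely computes plain mutual information between pairs of alignment columns, a classic and much simpler baseline. The toy alignment and the function names are made up for illustration. Note that raw correlations like these also pick up indirect effects (A correlates with C only because both correlate with B), which is the problem the newer methods set out to disentangle.

```python
import math
from collections import Counter
from itertools import combinations


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    `alignment` is a list of equal-length strings (one aligned sequence each).
    """
    n = len(alignment)
    freq_i = Counter(seq[i] for seq in alignment)
    freq_j = Counter(seq[j] for seq in alignment)
    freq_pair = Counter((seq[i], seq[j]) for seq in alignment)

    mi = 0.0
    for (a, b), count in freq_pair.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


def rank_column_pairs(alignment):
    """Return all column pairs sorted by mutual information, highest first."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    scores = [
        ((i, j), column_mutual_information(alignment, i, j))
        for i, j in combinations(range(length), 2)
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy alignment: columns 1 and 3 always change together
# (K goes with E, R goes with D), column 0 is fully conserved.
TOY_ALIGNMENT = [
    "AKLE",
    "AKLE",
    "ARLD",
    "ARLD",
    "AKIE",
    "ARID",
]

if __name__ == "__main__":
    for (i, j), score in rank_column_pairs(TOY_ALIGNMENT):
        print(f"columns {i} and {j}: {score:.3f} bits")
```

In the toy data, the pair of columns that always mutate together comes out on top, while the fully conserved column carries no signal at all. With real proteins, the highest-scoring pairs would be the candidates for spatial contacts, which is why you need lots of sequences and a good alignment before any of this means much.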
Of course, there is still some room for optimization, cross-fertilization and improvement in the methods, I think. Simply by looking at some of the predicted contact maps, it’s fairly obvious to me these methods are not only better than what was available so far, but they are also not identical. Seeing their performance and following the competition in this field heating up at next year’s CASP will be jolly exciting.
I’m sure I’ll keep you posted on further developments and deeper analysis – for the moment I’ll leave you with a few references to get started. As a final word, I am so glad most of them (at least the ones I list below) are not hidden behind a paywall but open access, free to check out by anyone who cares.
- Marks DS, Colwell LJ, Sheridan R, Hopf TA, Pagnani A, Zecchina R, Sander C. (2011) Protein 3D Structure Computed from Evolutionary Sequence Variation. PLoS ONE 6(12): e28766. doi:10.1371/journal.pone.0028766
- Taylor WR, Sadowski MI (2011) Structural Constraints on the Covariance Matrix Derived from Multiple Aligned Protein Sequences. PLoS ONE 6(12): e28265. doi:10.1371/journal.pone.0028265
- Burger L, van Nimwegen E (2010) Disentangling Direct from Indirect Co-Evolution of Residues in Protein Alignments. PLoS Comput Biol 6(1): e1000633. doi:10.1371/journal.pcbi.1000633
Stanford University offers free courses, mainly in Computer Science. Most closely related to the topics of this blog and the heart of yours truly are probably some of the following:
- Probabilistic Graphical Models http://www.pgm-class.org/
- Game Theory http://www.game-theory-class.org/
- Design and Analysis of Algorithms I http://www.algo-class.org/
- Model Thinking http://www.modelthinker-class.org/
- Machine Learning http://jan2012.ml-class.org/
- Information Theory http://www.infotheory-class.org/
From the FAQs:
How much does it cost to take the course? Nothing: it’s free!
Will I get university credit for taking this course? No.
In a nutshell, I very much like the idea of attending courses because one is interested in the topic per se, not for grabbing a title. That’s the spirit.
Thanks for hints to Esther Wojcicki (teacher and journalist) via google+.