On Computer Science and Computers

July 30, 2009

Computer Science is no more about computers than astronomy is about telescopes.
— E.W. Dijkstra

The summer REU has ended, and overall it was a very rewarding experience. I do not think I was really prepared for the overwhelming amount of work that goes into research, but I did learn a lot about computer science in general. The project I participated in was titled “Dynamics of Knowledge Creation in Open Biomedical Ontologies.” Its goal was to examine how knowledge grows over time, and which intrinsic qualities of a social network lead to maximal growth in a technical network.

First, a little bit of background. An ontology is a set of concepts within a domain, represented in a machine-parseable form. Think of an ontology as a dictionary: it contains a set of terms that define a domain, and a set of relations mapping terms to other terms and to other domains. My part of the project involved parsing these ontologies and loading them into a database so that we could build knowledge graphs and look at how they change over time.
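
To make the idea of "terms plus relations in a database" concrete, here is a minimal sketch, not the parser used in the project. It assumes the ontology arrives in an OBO-style flat file with `[Term]` stanzas and `id:`, `name:`, and `is_a:` tags; the sample terms, the table names, and the functions `parse_terms` and `load` are all made up for illustration.

```python
import sqlite3

# Hypothetical sample in an OBO-style layout (an assumption, not project data).
SAMPLE = """\
[Term]
id: EX:0001
name: cell

[Term]
id: EX:0002
name: neuron
is_a: EX:0001 ! cell
"""

def parse_terms(text):
    """Yield one dict per [Term] stanza with its id, name, and relations."""
    term = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins: emit the previous term, if any.
            if term is not None:
                yield term
            term = {"id": None, "name": None, "relations": []} if line == "[Term]" else None
        elif term is not None and ": " in line:
            tag, value = line.split(": ", 1)
            value = value.split(" ! ")[0].strip()  # drop the trailing "! comment"
            if tag in ("id", "name"):
                term[tag] = value
            elif tag == "is_a":
                term["relations"].append(("is_a", value))
    if term is not None:
        yield term

def load(conn, text):
    """Store terms and term-to-term relations in two simple tables."""
    conn.execute("CREATE TABLE IF NOT EXISTS term (id TEXT PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS relation (source TEXT, kind TEXT, target TEXT)")
    for term in parse_terms(text):
        conn.execute("INSERT OR REPLACE INTO term VALUES (?, ?)", (term["id"], term["name"]))
        conn.executemany(
            "INSERT INTO relation VALUES (?, ?, ?)",
            [(term["id"], kind, target) for kind, target in term["relations"]],
        )
    conn.commit()
```

Calling `load(sqlite3.connect(":memory:"), SAMPLE)` fills the `term` table with the nodes of a knowledge graph and the `relation` table with its edges. Loading successive versions of an ontology into tables like these is one way to compare how the graph changes over time.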

Most of the communities in Open Biomedical Ontologies communicate through mailing lists, SourceForge’s bug tracker, or both. Notre Dame provided us with an API for most of the data we needed, but some of it, such as the mailing list data, was not available that way. I built a web crawler to collect this information and place it into a database.
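
For readers who have not written one, the core of a crawler like this can be sketched in a few lines. This is an illustrative breadth-first version under my own assumptions, not the crawler from the project: the names `LinkCollector`, `fetch_http`, and `crawl` are hypothetical, it stays under a single starting URL, and it returns pages in a dictionary rather than writing them to a database.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

def fetch_http(url):
    """Download a page as text; used when no other fetcher is supplied."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")

def crawl(start_url, fetch=fetch_http, limit=100):
    """Breadth-first crawl of pages under start_url; returns {url: html}."""
    pages, queue = {}, deque([start_url])
    while queue and len(pages) < limit:
        url = queue.popleft()
        if url in pages:
            continue  # already visited
        pages[url] = fetch(url)
        collector = LinkCollector()
        collector.feed(pages[url])
        for href in collector.links:
            # Resolve relative links and drop fragments before queueing.
            target = urljoin(url, href).split("#")[0]
            if target.startswith(start_url) and target not in pages:
                queue.append(target)
    return pages
```

Passing the fetcher in as a parameter keeps the traversal logic testable without touching the network. A real crawler should also rate-limit its requests and respect the site's robots.txt, both of which are omitted here.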

Once I had all of the required data, I needed a tool to calculate graph metrics on it, such as centrality, density, and clustering coefficient. I built a GDF parser that reads GUESS graph files and outputs the data as CSV files.
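
The metrics named above are simple to compute once the graph is in memory. The sketch below is a small stand-in, not the tool from the project. It assumes a GDF file with a `nodedef>` section whose first column is the node name and an `edgedef>` section whose first two columns are the endpoints. It treats the graph as undirected and uses degree centrality as one concrete choice of centrality. The sample graph and all function names are invented for illustration.

```python
import csv
import io

# Hypothetical four-node graph in GDF layout (an assumption, not project data).
SAMPLE_GDF = """\
nodedef>name VARCHAR,label VARCHAR
a,Alice
b,Bob
c,Carol
d,Dave
edgedef>node1 VARCHAR,node2 VARCHAR
a,b
a,c
b,c
c,d
"""

def parse_gdf(text):
    """Return an undirected adjacency map {node: set of neighbours}."""
    adjacency, section = {}, None
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        if row[0].startswith("nodedef>"):
            section = "nodes"
        elif row[0].startswith("edgedef>"):
            section = "edges"
        elif section == "nodes":
            adjacency.setdefault(row[0], set())
        elif section == "edges" and len(row) >= 2 and row[0] != row[1]:
            adjacency.setdefault(row[0], set()).add(row[1])
            adjacency.setdefault(row[1], set()).add(row[0])
    return adjacency

def density(adjacency):
    """Fraction of possible undirected edges that are present."""
    n = len(adjacency)
    edges = sum(len(nbrs) for nbrs in adjacency.values()) / 2
    return 0.0 if n < 2 else 2 * edges / (n * (n - 1))

def degree_centrality(adjacency, node):
    """Degree of the node divided by the maximum possible degree."""
    n = len(adjacency)
    return 0.0 if n < 2 else len(adjacency[node]) / (n - 1)

def clustering(adjacency, node):
    """Local clustering coefficient: how connected the node's neighbours are."""
    nbrs = adjacency[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    # Each neighbour-to-neighbour edge is seen from both ends, hence the / 2.
    links = sum(1 for u in nbrs for v in adjacency[u] if v in nbrs) / 2
    return 2 * links / (k * (k - 1))

def write_metrics_csv(adjacency, out):
    """Write one CSV row of per-node metrics for each node."""
    writer = csv.writer(out)
    writer.writerow(["node", "degree_centrality", "clustering"])
    for node in sorted(adjacency):
        writer.writerow([node, degree_centrality(adjacency, node), clustering(adjacency, node)])
```

In the sample graph, `c` touches every other node, so its degree centrality is 1.0, while only one pair of its three neighbours is connected, which gives a clustering coefficient of 1/3. For larger graphs, an established graph library would be a safer choice than hand-rolled metrics.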

In essence, I spent most of the summer building tools to download empirical data. I plan to compare this data to a simulation designed by Dr. Yilmaz. Once we are confident in the accuracy of this simulation, we can begin to examine how changing structural elements of the social network affects the technical network, and which characteristics maximize innovation.

All in all, this project has been fascinating. It quickly turned into a multidisciplinary project involving sociology, graph theory, and computer science. A friend of mine once sent me that E.W. Dijkstra quote. He mentioned that computer scientists are often wanted in a variety of fields for their ability to manipulate massive quantities of data. I am now convinced he is right.

Categories: School