Sean Collins

sean [at] seanmcollins [dot] com

GPG Key ID: 0xf60f564978913931

sean [at] coreitpro [dot] com

GPG Key ID: 0xA1D7E590

Developing Bioinformatics Applications

During the winter of 2009 and spring of 2010, I developed applications to store patient and genetic data from a research study conducted by the University of Pennsylvania, Drexel University, Hahnemann University Hospital, and B-Tech Consulting, Ltd. The research team had gathered large sets of patient histories, genetic sequences, genetic mutations, and patient profiles. These datasets were stored in flat files, which complicated analysis, management, and export. It quickly became clear that a new strategy was needed.

Using a relational database to store the research data was the obvious choice for solving the issues of scalability and management. A well-designed database schema would reflect the complex relationships in the data, and SQL queries could quickly deliver answers to researchers.
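The study's actual schema is not reproduced here. As a rough sketch of how such relationships might be modeled, the following Python example uses the standard-library sqlite3 module. Every table and column name in it (patient, visit, viral_sequence, mutation, and their fields) is a hypothetical stand-in chosen for illustration, not the design used in the project.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative
# assumptions, not the ones used in the study database.
SCHEMA = """
CREATE TABLE patient (
    patient_id INTEGER PRIMARY KEY,
    study_code TEXT NOT NULL UNIQUE
);
CREATE TABLE visit (
    visit_id   INTEGER PRIMARY KEY,
    patient_id INTEGER NOT NULL REFERENCES patient(patient_id),
    visit_date TEXT NOT NULL
);
CREATE TABLE viral_sequence (
    sequence_id INTEGER PRIMARY KEY,
    visit_id    INTEGER NOT NULL REFERENCES visit(visit_id),
    residues    TEXT NOT NULL
);
CREATE TABLE mutation (
    mutation_id INTEGER PRIMARY KEY,
    sequence_id INTEGER NOT NULL REFERENCES viral_sequence(sequence_id),
    position    INTEGER NOT NULL,
    variant     TEXT NOT NULL
);
"""


def create_database(path=":memory:"):
    """Open a SQLite database and create the illustrative tables."""
    conn = sqlite3.connect(path)
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn
```

Each foreign key records one relationship that a flat file leaves implicit: a mutation belongs to a sequence, a sequence to a visit, and a visit to a patient. With enforcement switched on, a row that points at a nonexistent parent is rejected at insert time.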

A parser was written to import the research data into the database, and small programs were developed to run queries against it. One problem involved two datasets that had been imported into the database: genetic samples of HIV drawn from patients, and the patients' medical histories.

We are trying to find genetic markers within the HIV virus that are early indicators for neurological AIDS symptoms. The concept is that if we find the genetic markers appearing prior to the neurological symptoms, you can alter or enhance treatment and affect the course of the disease. There are no markers of this type currently known.

Dr. Brian Moldover, B-Tech Consulting, Ltd.

Both datasets originated in separate files, but once they were imported into the database, a unified view of each patient's mental health and genetic markers could be built and exported for statistical analysis.
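To give a sense of what building and exporting such a view can look like, here is a minimal sketch in Python using the standard-library sqlite3 and csv modules. It is not the code from the project. The table names (patient, assessment, marker), the column names, and the sample rows are all made-up placeholders, and the real study data looked nothing like them.

```python
import csv
import io
import sqlite3


def export_unified_view(conn, out):
    """Write one CSV row per (assessment, marker) pair for each patient."""
    rows = conn.execute(
        """
        SELECT p.study_code, a.assessed_on, a.neuro_score, m.marker
        FROM patient AS p
        JOIN assessment AS a ON a.patient_id = p.patient_id
        JOIN marker AS m ON m.patient_id = p.patient_id
        ORDER BY p.study_code, a.assessed_on, m.marker
        """
    )
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["study_code", "assessed_on", "neuro_score", "marker"])
    writer.writerows(rows)


def demo():
    """Build a tiny in-memory database of placeholder rows and export it."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical tables and invented values, for illustration only.
    conn.executescript(
        """
        CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, study_code TEXT);
        CREATE TABLE assessment (patient_id INTEGER, assessed_on TEXT, neuro_score INTEGER);
        CREATE TABLE marker (patient_id INTEGER, marker TEXT);
        INSERT INTO patient VALUES (1, 'P001'), (2, 'P002');
        INSERT INTO assessment VALUES (1, '2009-01-15', 3), (2, '2009-02-01', 0);
        INSERT INTO marker VALUES (1, 'M1'), (1, 'M2'), (2, 'M1');
        """
    )
    buffer = io.StringIO()
    export_unified_view(conn, buffer)
    return buffer.getvalue()
```

Calling demo() returns the CSV text as a string, which can be written to a file and loaded into a statistics package. The join on patient_id is what replaces the manual cross-referencing that two separate flat files would otherwise require.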