PGCon2011 - Add 4 Video (2015.09.18)
PGCon 2011
The PostgreSQL Conference
Speakers | |
---|---|
Luis Carvalho | |
Schedule | |
---|---|
Day | Talks - 2 - 2011-05-20 |
Room | DMS 1140 |
Start time | 13:30 |
Duration | 01:00 |
Info | |
---|---|
ID | 332 |
Event type | Lecture |
Track | Applications |
Language used for presentation | English |
Doing Bioinformatics in PostgreSQL
We introduce and describe two modules that grew from the need to perform integrated and efficient Bioinformatics tasks in PostgreSQL: PostBio, a set of methods to store and query genomic sequences and features, and PostStat, a collection of statistical functions that allow for integrated statistical tests. A few practical examples will be presented to showcase the modules.
PostBio includes three data types: a GiST-indexable integer interval type used to represent biological sequence features; a suffix tree type for finding maximal unique matches; and a compressed suffix array for fast exact matching of short sequences. In addition, PostBio provides a set of sequence utility routines.
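To make two of these ideas concrete, here is a minimal Python sketch of interval overlap on sequence features and of exact matching with a plain (uncompressed) suffix array. It illustrates the underlying techniques only and is not PostBio's API: the names `Interval`, `overlaps`, `build_suffix_array`, and `find_exact` are hypothetical, and the closed-interval convention is an assumption.

```python
from bisect import bisect_left, bisect_right
from typing import List, NamedTuple


class Interval(NamedTuple):
    """Closed integer interval [low, high]; a stand-in for a sequence feature."""
    low: int
    high: int


def overlaps(a: Interval, b: Interval) -> bool:
    # Two closed intervals overlap when each one starts before the other ends.
    return a.low <= b.high and b.low <= a.high


def build_suffix_array(text: str) -> List[int]:
    # Naive construction: sort suffix start positions by the suffix text.
    # Adequate for a demo; production tools use faster builders and compression.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_exact(text: str, sa: List[int], pattern: str) -> List[int]:
    # Suffixes that start with `pattern` form one contiguous run in `sa`.
    # Truncating each suffix to len(pattern) keeps the list sorted, so two
    # binary searches locate the run. Materializing `prefixes` is O(n) and is
    # done here only for clarity.
    prefixes = [text[i:i + len(pattern)] for i in sa]
    lo = bisect_left(prefixes, pattern)
    hi = bisect_right(prefixes, pattern)
    return sorted(sa[lo:hi])


# Hypothetical feature coordinates and a toy sequence.
features = [Interval(100, 250), Interval(300, 420), Interval(500, 600)]
query = Interval(245, 305)
hits = [f for f in features if overlaps(f, query)]

genome = "GATTACAGATTACA"
sa = build_suffix_array(genome)
positions = find_exact(genome, sa, "TTA")
```

In a database setting, the overlap predicate is the kind of operator a GiST index can accelerate, and a compressed suffix array answers the same exact-match question in far less memory than the plain list used here.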
PostStat comprises routines for a number of cumulative probability distributions, linear regression, and statistical tests, both parametric and non-parametric; the main motivation is to provide a way to test statistical hypotheses in simple models.
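As a rough illustration of the kinds of routines listed above, the following Python sketch computes a normal cumulative distribution, fits a simple least-squares line, and runs a two-sided one-sample z-test. The function names are hypothetical and do not reflect PostStat's actual interface; only parametric examples are shown, and the z-test assumes a known population standard deviation.

```python
import math
from typing import Sequence, Tuple


def normal_cdf(x: float, mean: float = 0.0, sd: float = 1.0) -> float:
    # Cumulative distribution of N(mean, sd^2), expressed via the error function.
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def linear_regression(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    # Ordinary least squares for y = slope * x + intercept.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def z_test_two_sided(sample: Sequence[float], mu0: float, sd: float) -> float:
    # Two-sided p-value for H0: population mean == mu0, with known sd.
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sd / math.sqrt(n))
    return 2.0 * (1.0 - normal_cdf(abs(z)))


# Made-up data purely for demonstration.
slope, intercept = linear_regression([0, 1, 2, 3], [1, 3, 5, 7])
p_value = z_test_two_sided([5.0, 5.0, 5.0, 5.0], mu0=5.0, sd=1.0)
```

Running such tests next to the data avoids exporting rows to an external statistics package for every simple hypothesis check, which matches the motivation the abstract gives.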