Metagenomics: RTAX, QIIME

GitHub: davidsoergel

Twitter: @loraxorg


Welcome! This is the home of my various development projects, all written in Java unless otherwise noted. Forums, issue trackers, downloads, and documentation can be found within each project.

In addition to the project-specific sites linked below, there is a build server for all Java projects, and a Maven2 repository containing both releases and snapshots.

Stand-alone projects

s3napback Cycling, incremental, compressed, encrypted backups to Amazon S3.
RTAX Rapid and accurate taxonomic classification of short paired-end sequence reads from the 16S ribosomal RNA gene.
featuretools A Perl package for dealing with sequence annotations. In particular, it includes featuregrep, a program that searches sequences, together with associated DAS files or feature databases, for patterns. It uses an extended regular expression syntax to specify annotations as part of the search pattern (in addition to the sequence characters themselves).
Jandy Jandy is a program for managing multiple runs of scientific software. It facilitates running a program a number of times, perhaps with different combinations of input parameters, and keeps track of the outputs produced from each run in a database. Runs may be computed in parallel on a cluster. Jandy can produce some plots from the collection of inputs and outputs.
MSENSR Microbe Statistics from Environmental Nucleotide Sequence Reads. Computes various statistics from collections of nucleotide sequences, in hopes of discovering correlations between sequence statistics, phylogeny, and community composition. Also, generates simulated metagenomic data sets.
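
As a rough illustration of the "different combinations of input parameters" idea in the Jandy entry above, here is a minimal sketch that expands a set of named parameter values into one map per run. It is not Jandy's API: the class name, the `combinations` helper, and the parameter names `k` and `metric` are all hypothetical, chosen only for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ParameterGridSketch {

    // Hypothetical helper (not part of Jandy): expands named lists of
    // parameter values into every combination, one map per run.
    public static List<Map<String, String>> combinations(Map<String, List<String>> grid) {
        List<Map<String, String>> result = new ArrayList<>();
        result.add(new LinkedHashMap<>()); // start with one empty combination
        for (Map.Entry<String, List<String>> entry : grid.entrySet()) {
            List<Map<String, String>> next = new ArrayList<>();
            for (Map<String, String> partial : result) {
                for (String value : entry.getValue()) {
                    // Copy the partial combination and add one more parameter.
                    Map<String, String> extended = new LinkedHashMap<>(partial);
                    extended.put(entry.getKey(), value);
                    next.add(extended);
                }
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        // Illustrative parameter names and values only.
        Map<String, List<String>> grid = new LinkedHashMap<>();
        grid.put("k", Arrays.asList("3", "5"));
        grid.put("metric", Arrays.asList("euclidean", "cosine"));
        for (Map<String, String> run : combinations(grid)) {
            System.out.println(run); // one line per run configuration
        }
    }
}
```

Each resulting map could then be handed to whatever launches a single run; recording outputs in a database and dispatching runs to a cluster, as Jandy does, are outside the scope of this sketch.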

Science packages

phyloutils Provides data structures for weighted phylogenetic trees, and various operations on such trees. Includes phylogenetic alpha and beta diversity measures such as Weighted UniFrac. Also computes phylogenetic distances between species based on the Ciccarelli et al. (2006) tree of life.
ncbitaxonomy Provides a Hibernate-based object-relational interface to the NCBI taxonomy database, and convenience classes for navigating it.
sequtils Utility classes for dealing with biological sequences.
stats Some basic data structures and distributions for statistical computations.
ml Generic machine learning package. Provides a framework for supervised and unsupervised clustering (both online and batch), and currently implements naive Bayesian, k-NN, k-means, and Kohonen SOM clustering. Computes Variable Memory Markov models (aka PSTs) on strings. Also implements various Monte Carlo methods (including Metropolis-coupled MCMC).
jlibsvm Heavily refactored Java port of LIBSVM, providing efficient training of Support Vector Machines. Provides many new features, including a fully generified API; the ability to add custom kernels for arbitrary data types; and integrated scaling and normalization.

Utilities packages

conja Incredibly easy functional concurrency in Java. Conja lets your code take advantage of multicore processors with no configuration and minimal code changes. It basically wraps java.util.concurrent in syntactic sugar that encourages a functional style.
dsutils Provides various general utility classes. Some of these have slowly been replaced over the years by new features in the JDK and by the Apache Commons packages.
runutils Provides standard APIs and utility classes for managing program runs and threads. Provides annotation-based runtime injection of configuration parameters into objects. Provides a framework for queueing tasks to be performed in separate threads (possibly obsoleted by similar functionality in java.util.concurrent).
chartutils Convenience classes for generating plots with the JFreeChart package.
springjpautils Base classes for using JPA with Spring and Hibernate.
event A notification framework for event-driven programs, especially useful for Swing GUIs. Provides an event broker that distributes events around a network of sources, relays, and listeners. Facilitates live updating of disparate components in an application whose relationships with each other are dynamic. Does not replace, but rather complements, methods of dealing with low-level events such as mouse clicks (e.g., the XML Actions framework). Provides a higher layer of events with semantic meaning to the application.
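
To give a flavor of the functional style of concurrency that the conja entry above describes, here is a minimal sketch built only on the JDK's own concurrency classes. It is not Conja's API: `ParallelMapSketch` and `parallelMap` are hypothetical names, and the choice of a fixed-size pool with one thread per available processor is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class ParallelMapSketch {

    // Hypothetical helper (not part of Conja): applies fn to every item on a
    // fixed-size thread pool and returns the results in input order.
    public static <T, R> List<R> parallelMap(List<T> items, Function<T, R> fn) {
        int threads = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per item; futures are kept in input order.
            List<Future<R>> futures = new ArrayList<>();
            for (T item : items) {
                Callable<R> task = () -> fn.apply(item);
                futures.add(pool.submit(task));
            }
            // Collect results, blocking on each future in turn.
            List<R> results = new ArrayList<>();
            for (Future<R> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for results", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> squares = parallelMap(Arrays.asList(1, 2, 3, 4), x -> x * x);
        System.out.println(squares); // prints [1, 4, 9, 16]
    }
}
```

The caller writes only the per-item function, while the pool setup, task submission, and result collection stay hidden in the helper, which is roughly the kind of boilerplate a wrapper library can absorb.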


devenvironment Summary of how my development environment works, including IDE, build tools, documentation generation, and so forth.
svnnotebook Facilitates storing notebooks intermingled with project files in Subversion repositories. Extracts changes for a given time range from the repositories and formats them nicely, e.g., for weekly reporting. Builds browseable web sites of all formatted notebooks, organized by both topic and date. Supports Markdown syntax.

Older projects

pdftank Automatically navigate journal web sites to download and cache full-text PDFs.