Finishing touches are in place for my convert2bed tool (GitHub site).
This utility converts common genomics data formats (BAM, GFF, GTF, PSL, SAM, VCF, WIG) to lexicographically-sorted UCSC BED format. It offers two benefits over alternatives:
- It runs about 3-10x as fast as bedtools
- It converts all input fields in as non-lossy a way as possible, to allow recovery of data to the original format
As an example, here we use convert2bed to convert a 14M-read, indexed BAM file to a sorted BED file (data are piped to /dev/null) on a 4 GB, dual-core Core 2 (2.4 GHz) workstation running RHEL 6:
$ samtools view -c ../DS27127A_GTTTCG_L001.uniques.sorted.bam
14090028
Conversion is performed with default options (sorted BED as output, sorted with BEDOPS sort-bed):
$ time ./convert2bed -i bam < ../DS27127A_GTTTCG_L001.uniques.sorted.bam > /dev/null
[bam_header_read] EOF marker is absent. The input is probably truncated.

real    3m5.508s
user    0m25.702s
sys     0m8.602s
Here is the same conversion, performed with bedtools v2.22:
$ time ../bedtools2/bin/bamToBed -i ../DS27127A_GTTTCG_L001.uniques.sorted.bam \
    | ../bedtools2/bin/sortBed -i stdin > /dev/null

real    28m22.057s
user    2m58.579s
sys     0m41.605s
The use of convert2bed for this file offers a 9.1x speed improvement. Other large BAM files show similar conversion speedups.
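The speedup figure follows directly from the two wall-clock (`real`) times reported above:

```python
# Wall-clock ("real") times reported above, converted to seconds.
convert2bed_s = 3 * 60 + 5.508    # 3m5.508s
bedtools_s = 28 * 60 + 22.057     # 28m22.057s

speedup = bedtools_s / convert2bed_s
print(speedup)  # just over 9x
```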
Further time reductions are possible with the bam2starchcluster scripts (TBA), which use GNU Parallel or a Sun Grid Engine job scheduler to break the conversion task down by chromosome.
For scientific work, I have used matrix2png to make a nice PNG image from a text-formatted matrix of data values. PNG looks great on the web, but it doesn't translate well to publication-quality figures.
Here are some useful resources for open-source C- and C++-based OCR libraries that could run under iOS (need to check licensing):
- Seven Segment Optical Character Recognition (ssocr)
- Advice for 7-Segment Display OCR with Tesseract
- Tesseract OCR iOS library
The end goal is to be able to use an iPhone to read LED displays, as commonly found on meters, etc., and then do something useful with that data (upload it somewhere, tagged with geodata). An aggregate of hundreds or thousands of users could conceivably collect data useful for themselves and also for the group as a whole.
$ unstarch --sha1-signature .foo
So far, so good.
But now I want to validate that the metadata are being digested correctly through some independent means, preferably via the command line, so that I can perform regression testing. I can use the openssl, xxd, and base64 tools together to test that I get the same answer:
$ unstarch --list-json-no-trailing-newline .foo \
    | openssl sha1 \
    | xxd -r -p \
    | base64
As a note to myself: I end up stripping the trailing newline from the JSON output of unstarch because this is what the PolarSSL library ends up digesting. This very nearly had me doubting whether PolarSSL was working correctly, or whether my command-line test was correct!
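The same digest chain is easy to sanity-check in a few lines of Python. The JSON string here is a hypothetical stand-in for the metadata that unstarch actually emits, but the transformation (SHA-1, raw digest bytes, then Base64) mirrors the pipeline above:

```python
import base64
import hashlib

# Hypothetical stand-in for the archive's JSON metadata; the real
# payload is whatever `unstarch --list-json-no-trailing-newline` prints.
metadata = '{"archive": {"type": "starch"}}'

# Mirror the pipeline: SHA-1, then the raw 20-byte digest
# (the `xxd -r -p` step), then Base64.
raw = hashlib.sha1(metadata.encode()).digest()
signature = base64.b64encode(raw).decode()

# A trailing newline yields a completely different digest, which is
# why the JSON must be stripped before hashing to match PolarSSL.
raw_with_newline = hashlib.sha1((metadata + "\n").encode()).digest()
assert raw != raw_with_newline
print(signature)
```

Hashing the stripped string on both sides is what makes the command-line result agree with the library's digest.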