Author: alexpreynolds

This mistake has caught me before, but I always overlook it: https://github.com/lindenb/magic/issues/1#issuecomment-54685236


Say we have a bunch of text files each containing a column of non-negative numerical values that we want to log-transform (base-10):

for i in *.txt; do echo "$i"; awk '{system("calc \"log("$1" + 1)\" | sed -e \"s/^[\t~]*//\"");}' "$i" > "$i.transformed"; done

Slow, but it seems to work in a pinch.
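Most of the slowness comes from starting one calc process per input line. As a rough sketch of a faster route, awk can do the arithmetic itself: its built-in log() is the natural logarithm, so dividing by log(10) gives base-10. The function name log10_plus1 and the six-decimal output format are assumptions for illustration, and the printed precision will differ from what calc reports.

```shell
# Hypothetical helper: print log10(x + 1) for the first column of each input line.
# Reads the files given as arguments, or standard input when none are given.
log10_plus1() {
    awk '{ printf "%.6f\n", log($1 + 1) / log(10) }' "$@"
}

# Small demonstration on three values.
printf '0\n9\n99\n' | log10_plus1
```

To transform every file, the same function could be called inside the loop, e.g. log10_plus1 "$i" > "$i.transformed".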


Cartograms with d3   TopoJSON   Same Sex Marriage

Shawn Allen wrote a d3.js-based implementation of a 2D cartogram, which sizes US states in an area-proportional manner, where area is based on some interesting statistic, like population.

There has been a great deal of progress made in the last year in defending the rights of GLBT Americans to marry and have their partnership rights acknowledged, rights like visitation and estate planning, rights that straight couples take for granted when visiting their loved one in the hospital, or sharing their lives in the house they own, etc.

It’s easy enough to see a map of the 50 states colored by legal status, but people are not spread evenly across the states. I wanted to see how the United States was progressing as a function of population.

I forked Allen’s project (GitHub project source code available here) and I redid the color scheme, which takes the 50 states and the District of Columbia and shades them by their legal status, whether their laws defend or remove same-sex marriage rights (and associated protections).

Green states allow same-sex marriage, light-green states allow civil unions, orange states allow marriage or civil unions (but rulings are currently held up on appeal), and red states do not defend same-sex marriage rights, whether by explicit law or constitutional amendment.

I based the color assignments initially on data from the Right to Marry site, up-to-date as of May 19th, 2014. But with Pennsylvania’s Gov. Corbett conceding defeat and vowing not to appeal the ruling, I added Pennsylvania to the list of pro-equality states.

Besides showing how fast things have changed, drawing each state with an area proportional to its population quickly shows that over half the country (by 2010 US Census population counts, at least) now enjoys, or will soon enjoy pending appeals, legal protections that were once denied to a minority of Americans.


For scientific work, I have used matrix2png to make a nice PNG image from a text-formatted matrix of data values. PNG looks great on the web, but it doesn’t translate well to making publication-quality figures.

My thought was to take matrix2png and, with the help of Haru (libharu), turn it into matrix2pdf. Maybe I can get this going on GitHub.


Here are some useful resources for open source C- and C++-based OCR libraries that could run under iOS (need to check licensing):

The end goal is to be able to use an iPhone to read LED displays, as commonly found on meters, etc. and then do something useful with that data (upload it somewhere, tagged with geodata). An aggregate of hundreds or thousands of users could conceivably collect data useful for themselves and also for the group as a whole.


I wrote a data extraction utility which uses PolarSSL to export a Base64-encoded SHA-1 digest of some internal metadata (a string of JSON-formatted data), to help validate archive integrity:

$ unstarch --sha1-signature .foo
7HkOxDUBJd2rU/CQ/zigR84MPTc=

So far, so good.

But now I want to validate that the metadata are being digested correctly through some independent means, preferably via the command-line, so that I can perform regression testing. I can use the openssl, xxd and base64 tools together to test that I get the same answer:

$ unstarch --list-json-no-trailing-newline .foo \
| openssl sha1 \
| xxd -r -p \
| base64
7HkOxDUBJd2rU/CQ/zigR84MPTc=

As a note to myself: I end up stripping the trailing newline from the JSON output of unstarch because this is what the PolarSSL library ends up digesting. This very nearly had me doubting whether PolarSSL was working correctly, or whether my command-line test was correct!
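A small sketch of the trailing-newline effect, using the string abc as a stand-in for the JSON metadata. It assumes an openssl command-line tool that supports dgst -sha1 -binary, which writes the raw digest bytes and so skips the hex-to-binary step. The helper name sha1_base64 is made up for this example.

```shell
# Hypothetical helper: Base64-encoded SHA-1 digest of standard input.
sha1_base64() {
    openssl dgst -sha1 -binary | openssl base64
}

# Same text, with and without a trailing newline: the two digests differ.
printf 'abc' | sha1_base64      # no trailing newline
printf 'abc\n' | sha1_base64    # trailing newline included
```

The first call should print qZk+NkcGgWq6PiVxeFDCbJzQ2J0=, the Base64 form of the well-known SHA-1 digest of abc. The second prints something else entirely, which is the kind of mismatch described above.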


MacPorts is useful for installing a variety of command-line utilities and programs for Mac OS X. There are others, e.g. Homebrew. After using MacPorts to update a GNU gcc installation, it is useful to select the new revision. Tips were posted to this Stack Overflow thread. Basically, it boils down to two steps:

  1. sudo port select --list gcc
  2. sudo port select --set gcc mp-gcc47


Here’s a one-liner that converts jarch files to starch format, stripping the input file’s extension so that it can be replaced with a new one:

$ for i in *.jarch; do echo "${i%.*}.starch"; gchr "$i" | starch - > "${i%.*}.starch"; done
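The extension stripping relies on POSIX parameter expansion: ${i%.*} removes the shortest suffix matching .* from the value of i, which is the last extension. A minimal sketch of just that step follows; the function name to_starch_name and the sample filenames are invented for illustration.

```shell
# Hypothetical helper: swap a filename's last extension for ".starch".
to_starch_name() {
    printf '%s\n' "${1%.*}.starch"
}

to_starch_name "sample.jarch"         # sample.starch
to_starch_name "sample.chr1.jarch"    # sample.chr1.starch

# For comparison, %% removes the longest matching suffix instead.
f="sample.chr1.jarch"
echo "${f%.*}"     # sample.chr1
echo "${f%%.*}"    # sample
```

Using % rather than %% matters when filenames contain more than one dot, since only the final extension should be replaced.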


PolarSSL is a C-based cryptography and SSL library with a GPL license, which makes it ideal for use with BEDOPS. I plan to use it for quick SHA-1 hashes of metadata, to help validate the integrity of the archive.

I’ve been testing it out in Mac OS X 10.8 and it seems pretty straightforward. Here’s a simple project that hashes the string abc:

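#include <stdlib.h>
#include <stdio.h>
#include <string.h>   /* strdup(), strlen() */
#include "polarssl/config.h"
#include "polarssl/sha1.h"

int main(int argc, char **argv)
{
    unsigned char output[20];   /* a SHA-1 digest is 20 bytes long */
    char *buf;
    size_t bufLength;
    size_t idx;

    buf = strdup("abc");
    if (buf == NULL)
        return EXIT_FAILURE;
    bufLength = strlen(buf);

    /* sha1() takes unsigned bytes, so cast the string buffer */
    sha1((const unsigned char *) buf, bufLength, output);

    /* print the digest as hex, in five groups of four bytes */
    for (idx = 0; idx < 20; idx++) {
        fprintf(stdout, "%02x", output[idx]);
        if ((idx + 1) % 4 == 0)
            fprintf(stdout, " ");
    }
    fprintf(stdout, "\n");

    free(buf);

    return EXIT_SUCCESS;
}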

To compile it:

gcc -Wall sha1test.c -o sha1test -lpolarssl

When we ask for the hash value of the string abc, we get the following result:

$ ./sha1test
a9993e36 4706816a ba3e2571 7850c26c 9cd0d89d

This agrees with the value reported at NIST, which is also:

a9993e36 4706816a ba3e2571 7850c26c 9cd0d89d

Testing output against standards is useful for validation.
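A comparison like this is easy to script as a regression check. The sketch below uses the openssl command-line tool as a stand-in for the compiled test program (an assumption, since it needs no PolarSSL build) and compares against the digest of abc quoted above with the spaces removed. The helper name sha1_hex is hypothetical.

```shell
# Known SHA-1 digest of "abc", written without spaces.
expected="a9993e364706816aba3e25717850c26c9cd0d89d"

# Hypothetical helper: hex-encoded SHA-1 digest of standard input.
sha1_hex() {
    openssl dgst -sha1 -binary | od -An -tx1 | tr -d ' \n'
}

actual=$(printf 'abc' | sha1_hex)
if [ "$actual" = "$expected" ]; then
    echo "PASS"
else
    echo "FAIL: got $actual"
fi
```

The same check could be pointed at the compiled program by stripping the spaces from its output, for example with ./sha1test | tr -d ' '.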
