Running a local fork of higlass-server

The goal is to set up a Django management command and API route (with authentication support) for deleting HiGlass tilesets.

We will do the work on a t2.micro EC2 instance running Ubuntu 18.04 (ami-0f65671a86f061fcd).



Install node, npm, and npx from Nodejs:

$ cd ~
$ wget -qO- https://nodejs.org/dist/v11.1.0/node-v11.1.0-linux-x64.tar.xz > node-v11.1.0-linux-x64.tar.xz
$ tar xvf node-v11.1.0-linux-x64.tar.xz
$ cd node-v11.1.0-linux-x64/bin
$ sudo ln -s ${PWD}/node /usr/bin/node
$ sudo ln -s ${PWD}/npm /usr/bin/npm
$ sudo ln -s ${PWD}/npx /usr/bin/npx



Install Miniconda:

$ cd ~
$ wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
$ chmod +x Miniconda3-latest-Linux-x86_64.sh
$ ./Miniconda3-latest-Linux-x86_64.sh


Python 3.7 does not appear to be compatible with Cython at this time, so we downgrade to 3.6:

$ conda create -n higlass-server python=3.6 --no-default-packages --yes



Clone and branch

We clone a fork of higlass-server, sync up with the upstream repository, and set up our branch off of the freshly-updated develop branch:

$ git clone
$ cd ~/higlass-server
$ git remote add upstream
$ git checkout develop
$ git pull upstream develop
$ git checkout -b delete-tileset develop

Clean up old hms-dbmi references:

$ cd ~/higlass-server
$ grep -rl hms-dbmi . | xargs sed -i 's/hms-dbmi/higlass/g'



Install current GCC kit and libraries:

$ sudo apt install build-essential
$ sudo apt install libglib2.0-dev
$ sudo apt install libbz2-dev
$ sudo apt install liblzma-dev
$ sudo apt install libhdf5-serial-dev
$ sudo apt install libcurl4-gnutls-dev
$ sudo apt install libpng-dev
$ sudo apt install libssl-dev
$ gcc --version

Python requirements

Note: Edit requirements-secondary.txt to require clodius v0.9.3 or newer.

$ cd ~/higlass-server
$ source activate higlass-server
(higlass-server) $ pip install --upgrade -r ./requirements.txt
(higlass-server) $ pip install --upgrade -r ./requirements-secondary.txt

Initialize server

(higlass-server) $ python manage.py makemigrations
(higlass-server) $ python manage.py migrate
(higlass-server) $ python manage.py runserver localhost:8000

Test API

From another terminal session:

$ wget -qO- http://localhost:8000/api/v1/tilesets
$ wget -qO- http://localhost:8000/api/v1/tileset_info/?d=1234
{"1234": {"error": "No such tileset with uid: 1234"}}
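The same check can be scripted. Here is a hedged Python sketch that queries the tileset_info endpoint and interprets the response; the error-response shape is taken from the wget example above, and the base URL is the local development server:

```python
#!/usr/bin/env python
# Sketch: query a local higlass-server tileset_info endpoint and report
# whether a requested uid resolved. Assumes the error-response shape
# shown above: {"<uid>": {"error": "..."}}.

import json
from urllib.request import urlopen

def parse_tileset_info(payload, uid):
    """Return (found, info_or_error) for one uid in a tileset_info response."""
    info = json.loads(payload).get(uid, {})
    if "error" in info:
        return False, info["error"]
    return True, info

def fetch_tileset_info(uid, base="http://localhost:8000"):
    """Fetch and parse tileset_info for one uid from a running server."""
    with urlopen("%s/api/v1/tileset_info/?d=%s" % (base, uid)) as response:
        return parse_tileset_info(response.read().decode("utf-8"), uid)

# Example with the error response shown above:
found, detail = parse_tileset_info(
    '{"1234": {"error": "No such tileset with uid: 1234"}}', "1234")
# found is False; detail holds the error message
```

With the server running, `fetch_tileset_info("1234")` performs the same request as the wget command above.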

Test ingestion and listing

$ wget -O- --user * --password * > /tmp/
$ python manage.py ingest_tileset --filetype hitile --datatype vector --filename /tmp/
uid: AIVpsJYwSemD8FVPBv6vrw
$ python manage.py list_tilesets
tileset: Tileset [name:] [ft: hitile] [uuid: AIVpsJYwSemD8FVPBv6vrw]

Set up superuser

$ python manage.py createsuperuser

Pull request

Running production and development React apps on the same EC2 host

Say you have an Amazon EC2 host with TCP ports 80 and 3000 open. For the purposes of this post, we’ll run Ubuntu 16 on this EC2 host.

You want to run production and development versions of a React application on ports 80 and 3000, respectively, using nginx as the web server underneath.

Even if you have port 3000 open, pointing your web browser at the EC2 host’s public IP and development port with the default React development server can raise EADDRNOTAVAIL errors. A little more work is necessary.

This post will walk you through setting up Nodejs, installing a blank React app, installing nginx, configuring nginx, and setting up the React app so that both production and development targets are served through the nginx server.



Install node, npm, and npx from Nodejs:

$ cd ~
$ wget -qO- https://nodejs.org/dist/v11.1.0/node-v11.1.0-linux-x64.tar.xz > node-v11.1.0-linux-x64.tar.xz
$ tar xvf node-v11.1.0-linux-x64.tar.xz
$ cd node-v11.1.0-linux-x64/bin
$ sudo ln -s ${PWD}/node /usr/bin/node
$ sudo ln -s ${PWD}/npm /usr/bin/npm
$ sudo ln -s ${PWD}/npx /usr/bin/npx



Install nginx:

$ sudo apt install nginx -y


Throughout this post, change my-react-app to the name of your React application.

$ sudo mkdir /var/www/my-react-app
$ sudo gpasswd -a "$USER" www-data
$ sudo chown -R "$USER":www-data /var/www
$ find /var/www -type f -exec chmod 0660 {} \;
$ sudo find /var/www -type d -exec chmod 2770 {} \;

Open a text file called /etc/nginx/sites-available/my-react-app-production and add the following boilerplate:

server {
  listen 80;
  server_name 203.0.113.10; # replace with your EC2 host's public IP
  root /var/www/my-react-app;
  index index.html;
  access_log /var/log/nginx/my-react-app-production.access.log;
  error_log /var/log/nginx/my-react-app-production.error.log;
  location / {
    try_files $uri /index.html =404;
  }
}

This sets up nginx to listen to requests on port 80 and serve files from the document root /var/www/my-react-app. Later on, we’ll show how to put the production React application into this folder. For now, we first want to get the server set up.

I am using 203.0.113.10 (a documentation address) as a placeholder in the server_name line. Your EC2 host will have its own IP address; replace the placeholder with the public IP of your EC2 host, which is available in the AWS EC2 console.

If you have a DNS name that resolves to the public IP address, you can add a second server_name line that specifies this name, and then your web browser can point to this hostname.

Make a symbolic link to this configuration file in the /etc/nginx/sites-enabled directory:

$ sudo ln -s /etc/nginx/sites-available/my-react-app-production /etc/nginx/sites-enabled/my-react-app-production


When we set up the EC2 host, we set up the security group to allow public-facing traffic on TCP ports 80 and 3000.

For the development server, we will use nginx as a proxy that takes any public-side web requests hitting port 3000 and redirects them internally to the EC2 host’s private IP address on port 8080.

We’ll show later how to configure our development React app to run on this private port assignment of 8080, but for now we start with the nginx configuration.

Open a text file called /etc/nginx/sites-available/my-react-app-development and add the following boilerplate:

server {
  listen 3000;
  access_log /var/log/nginx/my-react-app-development.access.log;
  error_log /var/log/nginx/my-react-app-development.error.log;
  location / {
    proxy_pass http://172.31.0.10:8080; # replace with your EC2 host's private IP
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection 'upgrade';
    proxy_set_header Host $host;
    proxy_cache_bypass $http_upgrade;
  }
}

In the proxy_pass line, replace the placeholder with the private IP address of your EC2 host.

As with the public IP, this private IP address is available in the AWS EC2 console.

Make a symbolic link to this configuration file in the /etc/nginx/sites-enabled directory:

$ sudo ln -s /etc/nginx/sites-available/my-react-app-development /etc/nginx/sites-enabled/my-react-app-development


Use the following command to start the nginx service:

$ sudo service nginx start

If you make any changes to the web server’s configuration, such as to these two site configuration files, use the following to restart the service:

$ sudo service nginx restart



To set up a new app, you can run the following:

$ npx create-react-app my-react-app
$ cd my-react-app

In the React application directory, create a text file called .env and put in HOST and PORT settings, replacing the placeholder with your EC2 host’s private IP:

HOST=172.31.0.10
PORT=8080

To start the development server:

$ npm run start

This starts a development server on the EC2 host’s private-facing network on port 8080. If nginx is running and if it is set up correctly, then you can open your web browser to the EC2 host’s public IP and specify port 3000:

Remember that the public IP shown here is a placeholder. Replace it with the one that Amazon assigned to your EC2 host.

The create-react-app tool creates an environment where you can edit the JSX files of your React application, and the development server will rebuild the application when those files change. So you should see any updates more or less immediately!


When you’re ready to deploy your application to production, there are two steps.

First, set up a deployment target in the React application’s package.json file.

Open the package.json file in a text editor and go to the scripts property. Initially, it will look like this or similar:

  "scripts": {
    "start": "react-scripts start",
    "build": "react-scripts build",
    "test": "react-scripts test",
    "eject": "react-scripts eject"
  },

To this file, add a deploy target that synchronizes the build folder with what nginx has been configured to serve. The package.json file should look something like this:

  "scripts": {
    "start": "react-scripts start",
    "build": "react-scripts build",
    "test": "react-scripts test",
    "eject": "react-scripts eject",
    "deploy": "rsync -avzhe ssh --progress ./build/* /var/www/my-react-app"
  },

You only need to set up the deploy target once. This uses rsync and ssh. The use of ssh here is optional as both source and destination are on the same local filesystem.

However, you may want to push a compiled application to a remote host at a later time. In that case, simply edit the destination to specify the username and host of the remote deployment host (e.g., user@remote-host:/var/www/my-react-app).

Second, run the build and deploy targets:

$ cd my-react-app
$ npm run build
$ npm run deploy

The first target “compiles” the React application to the build folder. The second target synchronizes the /var/www/my-react-app directory with the contents of the build folder.

After doing this, the application will be served by the public-facing web server on port 80. If nginx is running and is set up correctly, then you can open your web browser to the EC2 host’s public IP and specify port 80:

Remember that the public IP shown here is a placeholder. Replace it with the one that Amazon assigned to your EC2 host.

Going forward, to deploy to production, all you have to do is run the build and deploy targets. The updated application will be available to web clients.

How to install and set up a local UCSC BLAT environment

Downloading BLAT

To get BLAT source code:

$ mkdir /tmp/blat && cd /tmp/blat
$ wget
$ unzip

Patching (optional)

I decided to make blat a static binary to avoid missing shared library errors. Here’s a patch you can use to modify the blat makefile:

$ cat > static-blat-makefile.patch
< L += -lm $(SOCKETLIB)
---
> L += -lm -ldl $(SOCKETLIB) -static-libgcc
< ${CC} ${COPT} ${CFLAGS} -o ${DESTDIR}${BINDIR}/blat $O $(MYLIBS) $L
---
> ${CC} ${COPT} ${CFLAGS} -o ${DESTDIR}${BINDIR}/blat $O -static $(MYLIBS) $L

You may need static library packages installed on your system. The names of these packages will depend on your version of Linux.

Then apply the patch:

$ cd /tmp/blat/blatSrc/blat
$ cp makefile makefile.original
$ patch makefile.original -i ../../static-blat-makefile.patch -o makefile

You may decide to skip this patch entirely; I just don’t like dynamically linked binaries.

Building BLAT

In any case, you will want to go to the top level of the blatSrc directory and run make to build the kit:

$ cd /tmp/blat/blatSrc && make

This will take a few minutes to build binaries. Grab some coffee or whatevs.

Installing BLAT

To install them into ${HOME}/bin/${MACHTYPE}, run:

$ make install

This destination is a subdirectory of your home directory.

Once it is built and installed, you can copy the binary to /usr/local/bin or somewhere in your shell’s PATH that makes sense to you. For me, my ${MACHTYPE} is x86_64 and I like having binaries in /usr/local/bin:

$ sudo cp ~areynolds/bin/x86_64/blat /usr/local/bin/blat

Adjust this to the particulars of your setup.

Downloading genomes

Once you have installed blat, the next step is to download a FASTA file for your genome of interest.

If you wanted hg38, for instance:

$ for chr in `seq 1 22` X Y; do echo $chr; wget -qO- http://hgdownload.soe.ucsc.edu/goldenPath/hg38/chromosomes/chr$chr.fa.gz | gunzip -c - >> hg38.fa; done

Optimizing queries

Once you have this file hg38.fa, you can start doing queries against it to look for sequence matches, but it can help speed up searches if you first make an OOC file:

$ blat /path/to/hg38.fa /dev/null /dev/null -makeOoc=/path/to/hg38.fa.11.ooc -repMatch=1024

When you do searches, you’d pass this OOC file as an option to skip over regions with over-represented sequences.


Once you have this OOC file made, you can do searches with your FASTA file containing sequences of interest:

$ blat /path/to/hg38.fa /path/to/your-sequences.fa -ooc=/path/to/hg38.fa.11.ooc search-results.psl

The blat binary will write any search results to a PSL-formatted text file called search-results.psl. You can name this whatever you want.

The PSL format is described on the UCSC site.


If you have very many sequences, you can parallelize by splitting your input sequences file into smaller pieces and running one blat process per piece, with each process writing its own PSL file as output.
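That split-and-parallelize approach can be sketched in Python. This is a sketch, not a polished tool: the database and OOC paths are placeholders, blat must be on your PATH for run_blat to work, and the chunk size is arbitrary:

```python
#!/usr/bin/env python
# Sketch: split a query FASTA into chunks and run one blat process per
# chunk in parallel. Splitting happens on '>' header lines, so multi-line
# FASTA records stay intact.

import subprocess
from concurrent.futures import ThreadPoolExecutor

def split_fasta(path, records_per_chunk=1000):
    """Write chunk files of at most records_per_chunk FASTA records;
    return the list of chunk filenames."""
    chunks, current, nrecords, idx = [], [], 0, 0

    def flush():
        nonlocal current, idx
        if current:
            name = "%s.chunk%03d" % (path, idx)
            with open(name, "w") as out:
                out.writelines(current)
            chunks.append(name)
            idx += 1
            current = []

    with open(path) as fasta:
        for line in fasta:
            if line.startswith(">"):
                if nrecords and nrecords % records_per_chunk == 0:
                    flush()
                nrecords += 1
            current.append(line)
    flush()
    return chunks

def run_blat(database, query_chunk, ooc, blat="blat"):
    """Run one blat process; return the PSL filename it wrote."""
    psl = query_chunk + ".psl"
    subprocess.check_call([blat, database, query_chunk, "-ooc=" + ooc, psl])
    return psl

def parallel_blat(database, fasta, ooc, workers=4):
    """Split fasta and run blat over the chunks with a small worker pool."""
    chunks = split_fasta(fasta)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda chunk: run_blat(database, chunk, ooc), chunks))
```

Afterward, the per-chunk PSL files can be concatenated (or converted and merged downstream) into a single result set.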

Set operations

It can help to use a tool like BEDOPS psl2bed to convert PSL to a BED file to do set operations, but that depends on what you want to do with the results. In any case, to convert a PSL file to a sorted BED file:

$ psl2bed < search-results.psl > search-results.bed
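If you don’t have BEDOPS handy, the core of the conversion is small enough to sketch in Python. This is a minimal stand-in, not a re-implementation of psl2bed: it keeps only the target coordinates, query name, match count (as the score column), and strand, and it assumes a headerless, tab-delimited PSL file with untranslated (single-character) strand values:

```python
#!/usr/bin/env python
# Minimal PSL-to-BED sketch (not a psl2bed replacement). PSL column
# indices follow the UCSC PSL spec: matches=0, strand=8, qName=9,
# tName=13, tStart=15, tEnd=16.

def psl_line_to_bed(line):
    """Convert one PSL record to a BED6-style line."""
    f = line.rstrip("\n").split("\t")
    t_name, t_start, t_end = f[13], f[15], f[16]
    q_name, matches, strand = f[9], f[0], f[8]
    return "\t".join([t_name, t_start, t_end, q_name, matches, strand])

def psl_to_bed(psl_lines):
    """Convert headerless PSL records to BED6-style lines, sorted by
    chromosome name and numeric start position."""
    bed = [psl_line_to_bed(line) for line in psl_lines if line.strip()]
    bed.sort(key=lambda b: (b.split("\t")[0], int(b.split("\t")[1])))
    return bed
```

Feeding `psl_to_bed(open("search-results.psl"))` yields sorted BED lines ready for set operations, though for production work the BEDOPS tools handle the many PSL edge cases this sketch ignores.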

How to get a list of HGNC symbols and names (descriptions)

Here’s a quick method to get HGNC symbols and names that draws upon data from UCSC and the open source MyGene.info project:

$ wget -qO- http://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/refGene.txt.gz | gunzip -c | cut -f13 | sort | uniq | ./symbols_to_names.py > hgnc_symbols_with_names.txt

There’s a Python script in that pipeline, which I’ll call symbols_to_names.py here:

#!/usr/bin/env python

import sys
from mygene import MyGeneInfo

# Read HGNC symbols, one per line, from standard input
hgnc_symbols = [line.strip() for line in sys.stdin if line.strip()]

mg = MyGeneInfo()
results = mg.querymany(hgnc_symbols, scopes='symbol', species='human', verbose=False)

for result in results:
    # Skip symbols that MyGene.info could not resolve
    if result.get('notfound') or 'symbol' not in result or 'name' not in result:
        continue
    sys.stdout.write("%s\t%s\n" % (result['symbol'], result['name']))

The pipeline above writes a two-column text file called hgnc_symbols_with_names.txt that contains the HGNC symbol (e.g., AAR2) and its name (e.g., AAR2 splicing factor homolog), which could be put into a lookup table or, given that it is sorted, could be searched very quickly with a binary search via the Python bisect library.
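A lookup against that sorted file with the bisect library looks something like this. The filename and sample symbols are just illustrations, and the sketch assumes the file sorts the same way Python compares strings (e.g., produced with LC_ALL=C sort):

```python
#!/usr/bin/env python
# Sketch: binary-search a sorted two-column symbol<TAB>name file for one
# HGNC symbol. Assumes lines are sorted on the first column.

import bisect

def load_table(path):
    """Read the whole table into a list of lines, newline-stripped."""
    with open(path) as table:
        return [line.rstrip("\n") for line in table]

def lookup(sorted_lines, symbol):
    """Return the name for symbol, or None if the symbol is absent."""
    prefix = symbol + "\t"
    i = bisect.bisect_left(sorted_lines, prefix)
    if i < len(sorted_lines) and sorted_lines[i].startswith(prefix):
        return sorted_lines[i].split("\t", 1)[1]
    return None

# Usage: lookup(load_table("hgnc_symbols_with_names.txt"), "AAR2")
```

Each lookup costs O(log n) comparisons, so even tens of thousands of symbols resolve essentially instantly.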

Getting list of SIMD flags for Linux and OS X

Here are ways to get SIMD/SSE flags from machines running either Linux or OS X:

On Linux (CentOS 7):

$ cat /proc/cpuinfo | grep flags | uniq
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch ida arat epb pln pts dtherm intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdseed adx smap xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local

On Mac OS X 10.12:

$ sysctl -a | grep machdep.cpu.features
$ sysctl -a | grep machdep.cpu.leaf7_features

There are useful discussions online about how to detect supported instruction sets programmatically.
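On Linux, the flags line can also be parsed programmatically. A small sketch (the list of SIMD flag names is illustrative and not exhaustive; note that SSE3 appears in /proc/cpuinfo as pni):

```python
#!/usr/bin/env python
# Sketch: extract CPU feature flags from /proc/cpuinfo on Linux and
# filter them down to a few common SIMD-related flags.

def cpu_flags(path="/proc/cpuinfo"):
    """Return the set of feature flags from the first 'flags' line."""
    with open(path) as cpuinfo:
        for line in cpuinfo:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()

def simd_flags(path="/proc/cpuinfo"):
    """Return the sorted subset of flags that are SIMD-related.
    'pni' is how SSE3 is reported in /proc/cpuinfo."""
    simd = ("mmx", "sse", "sse2", "pni", "ssse3", "sse4_1", "sse4_2",
            "avx", "avx2", "avx512f", "fma")
    return sorted(flag for flag in cpu_flags(path) if flag in simd)
```

Passing a different path makes the function easy to test against a saved copy of another machine’s cpuinfo.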

Installing and setting up mongoDB 3.2.1 from source on CentOS 7

The following post explains steps I took to install and enable mongoDB 3.2.1 as a service running under CentOS 7.

Install development tools and libraries, download mongoDB and compile source, and install the compiled binaries:

$ sudo yum group install "Development Tools"
$ sudo yum install scons
$ sudo yum install glibc-static
$ curl -O https://fastdl.mongodb.org/src/mongodb-src-r3.2.1.tar.gz
$ tar zxvf mongodb-src-r3.2.1.tar.gz
$ cd mongodb-src-r3.2.1
$ scons --ssl all
$ sudo scons --prefix=/opt/mongo install

Set up a mongod account and relevant directories:

$ sudo groupadd --system mongod
$ sudo useradd --no-create-home --system --gid mongod --home-dir /var/lib/mongo --shell /sbin/nologin --comment 'mongod' mongod
$ sudo mkdir -p /var/lib/mongo
$ sudo chown -R mongod:mongod /var/lib/mongo
$ sudo chmod 0755 /var/lib/mongo/
$ sudo mkdir -p /var/{run,log}/mongodb/
$ sudo chown mongod:mongod /var/{run,log}/mongodb/
$ sudo chmod 0755 /var/{run,log}/mongodb/
$ sudo mkdir -p /data/db
$ sudo chown -R mongod:mongod /data/db
$ sudo chmod -R o+w /data/db

Copy over mongod.conf and mongod.service configuration files with modifications for our setup:

$ sudo cp rpm/mongod.conf /etc/mongod.conf
$ sudo cp rpm/mongod.service /lib/systemd/system/mongod.service
$ sudo sed -i -e 's@/usr/local/bin/mongod@/opt/mongo/bin/mongod@' /lib/systemd/system/mongod.service

Reload daemon templates, and start and enable the mongoDB service:

$ sudo systemctl --system daemon-reload
$ sudo systemctl start mongod.service
$ sudo systemctl enable mongod.service

Confirm that the service is running properly:

$ sudo systemctl status mongod.service
● mongod.service - High-performance, schema-free document-oriented database
   Loaded: loaded (/usr/lib/systemd/system/mongod.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2016-01-27 14:33:39 PST; 9min ago
 Main PID: 116789 (mongod)
   CGroup: /system.slice/mongod.service
           └─116789 /opt/mongo/bin/mongod --quiet -f /etc/mongod.conf run

Jan 27 14:33:39 systemd[1]: Started High-performance, schema-free document-oriented database.
Jan 27 14:33:39 systemd[1]: Starting High-performance, schema-free document-oriented database...
Jan 27 14:33:39 mongod[116787]: about to fork child process, waiting until server is ready for connections.
Jan 27 14:33:39 mongod[116787]: forked process: 116789
Jan 27 14:33:39 mongod[116787]: child process started successfully, parent exiting

You can also check the PID file under /var/run/mongodb/ for a valid process ID value. Sometimes it may be necessary to create the parent folder first so that the PID file can be written:

$ sudo mkdir /var/run/mongodb/

You could also check the mongoDB log for other errors:

$ tail /var/log/mongodb/mongod.log

If the mongod service is not active, double-check that folders are named correctly in the configuration and service files, and that permissions and ownership are set correctly on those folders. If anything is not named and attributed correctly, the service will likely fail to start and log something like the following error:

about to fork child process, waiting until server is ready for connections.
forked process: 1234
ERROR: child process failed, exited with error number 1

I hope this helps others with setting up mongoDB under CentOS — good luck!

Getting GitLab CE to work with SSL and intermediate certificates

Our research lab is non-profit, but private GitHub repositories still cost money, so I have been playing with GitLab Community Edition to serve up some private Git repositories from a third-party host on the cheap.

Before using GitLab CE, I had set up a Git repository that, for whatever reason, would not allow users to cache credentials and would also not allow access via https (SSL). It was getting pretty frustrating to have to type in a long string of credentials on every commit, so setting up a proper Git server was one of the goals.

Installing and setting up the server is pretty painless. After installing all the necessary files and editing the server’s configuration file, I go into the GitLab web console and add myself as a user, and then add myself as a master of a test repository called test-repo.

When I try to clone this test repository via https, I get a Peer's Certificate issuer is not recognized error, which prevents cloning.

To debug this, I use the fact that Git rides on the curl framework, which can be put into verbose mode:

$ export GIT_CURL_VERBOSE=1
When cloning, I get a bit more detail about the certificate issuer error message:

$ git clone
Cloning into 'test-repo'...
* Couldn't find host in the .netrc file; using defaults
* About to connect() to port 9999 (#0)
*   Trying ...
* Connection refused
*   Trying ...
* Connected to ( port 9999 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* failed to load '/etc/pki/tls/certs/renew-dummy-cert' from CURLOPT_CAPATH
* failed to load '/etc/pki/tls/certs/Makefile' from CURLOPT_CAPATH
* failed to load '/etc/pki/tls/certs/localhost.crt' from CURLOPT_CAPATH
* failed to load '/etc/pki/tls/certs/make-dummy-cert' from CURLOPT_CAPATH
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: /etc/pki/tls/certs
* Server certificate:
*   subject: CN=*,OU=Domain Control Validated
*   start date: Oct 10 19:14:52 2013 GMT
*   expire date: Oct 10 19:14:52 2018 GMT
*   common name: *
*   issuer: CN=Go Daddy Secure Certificate Authority - G2,OU=,O=", Inc.",L=Scottsdale,ST=Arizona,C=US
* NSS error -8179 (SEC_ERROR_UNKNOWN_ISSUER)
* Peer's Certificate issuer is not recognized.
* Closing connection 0
fatal: unable to access '': Peer's Certificate issuer is not recognized.

Something is up with the certificate from Go Daddy. From some Googling around, it looks like nginx doesn’t like using intermediate certificates to validate server certificates.

To fix this, I concatenate my wildcard CRT certificate file with GoDaddy’s intermediate and root certificates, which are available from their certificate repository:

$ sudo su -
# cd /etc/gitlab/ssl
# wget
# wget
# cat gdig2.crt gdroot-g2.crt >

I then edit the GitLab configuration file to point its nginx certificate file setting to this combined file:

...
################
# GitLab Nginx #
################
## see:
# nginx['enable'] = true
# nginx['client_max_body_size'] = '250m'
# nginx['redirect_http_to_https'] = true
# nginx['redirect_http_to_https_port'] = 443
nginx['ssl_certificate'] = "/etc/gitlab/ssl/"
...

Once this is done, I then reconfigure and restart GitLab the usual way:

$ sudo gitlab-ctl reconfigure
$ sudo gitlab-ctl restart

After giving the server a few moments to crank up, I then clone the Git repository:

$ git clone
Password for '':
...

I can even cache credentials!

$ git config credential.helper store

Much nicer than the previous, non-web setup.