Category: github

The goal is to set up a Django management command and API route (with authentication support) for deleting HiGlass tilesets.

We will do the work on a t2.micro EC2 instance running Ubuntu 18.04 (ami-0f65671a86f061fcd).

Node.js

Installation

Install node, npm, and npx from the Node.js binary distribution:

$ cd ~
$ wget -qO- https://nodejs.org/dist/v11.1.0/node-v11.1.0-linux-x64.tar.xz > node-v11.1.0-linux-x64.tar.xz
$ tar xvf node-v11.1.0-linux-x64.tar.xz
...
$ cd node-v11.1.0-linux-x64/bin
$ sudo ln -s ${PWD}/node /usr/bin/node
$ sudo ln -s ${PWD}/npm /usr/bin/npm
$ sudo ln -s ${PWD}/npx /usr/bin/npx
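
A quick sanity check after creating the symlinks, as a minimal sketch. The check_tools function name is hypothetical, and it only reports where each command resolves on the PATH, not whether the version is the one just unpacked:

```shell
# Hypothetical helper: report where node, npm, and npx resolve on the PATH.
check_tools() {
  for tool in node npm npx; do
    if command -v "$tool" >/dev/null 2>&1; then
      printf '%s -> %s\n' "$tool" "$(command -v "$tool")"
    else
      printf '%s: not found on PATH\n' "$tool"
    fi
  done
}

check_tools
```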

Anaconda

Installation

$ cd ~
$ wget https://repo.anaconda.com/archive/Anaconda3-5.3.0-Linux-x86_64.sh
$ chmod +x Anaconda3-5.3.0-Linux-x86_64.sh
$ ./Anaconda3-5.3.0-Linux-x86_64.sh
...

Environment

Python 3.7 does not appear to be compatible with Cython at this time, so we downgrade to 3.6:

$ conda create -n higlass-server python=3.6 --no-default-packages --yes
...

Cf. https://github.com/cython/cython/issues/1978

GitHub

Clone and branch

We clone a fork of higlass-server, sync up with the upstream repository, and set up our branch off of the freshly-updated develop branch:

$ git clone https://github.com/alexpreynolds/higlass-server
$ cd ~/higlass-server
$ git remote add upstream https://github.com/higlass/higlass-server.git
$ git checkout develop
$ git pull upstream develop
$ git checkout -b delete-tileset develop

Clean up old hms-dbmi references:

$ cd ~/higlass-server
$ grep -rl hms-dbmi . | xargs sed -i 's/hms-dbmi/higlass/g'
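
One caveat with the recursive replace above: grep -rl started from the repository root also descends into .git, and rewriting files there in place is risky. A minimal sketch of a more cautious variant, demonstrated on a throwaway directory rather than the real checkout. It assumes GNU grep (for --exclude-dir and -Z), GNU xargs (for -r), and GNU sed (for -i), as shipped with Ubuntu; the file names are made up for illustration:

```shell
# Demonstration on a scratch directory; file names are illustrative only.
demo=$(mktemp -d)
mkdir -p "$demo/.git" "$demo/docs"
printf 'url = https://github.com/hms-dbmi/higlass-server\n' > "$demo/.git/config"
printf 'See https://github.com/hms-dbmi/higlass-server\n' > "$demo/docs/README.md"
printf 'nothing to replace here\n' > "$demo/setup.py"

# List matching files, skipping .git, and rewrite them in place.
# -Z and -0 keep file names with spaces intact; -r skips sed when nothing matches.
grep -rlZ --exclude-dir=.git hms-dbmi "$demo" | xargs -0 -r sed -i 's/hms-dbmi/higlass/g'

cat "$demo/docs/README.md"   # now refers to higlass
cat "$demo/.git/config"      # left untouched
```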

Development

Environment

Install the current GCC toolchain and the development libraries needed to build the Python requirements:

$ sudo apt install build-essential
$ sudo apt install libglib2.0-dev
$ sudo apt install libbz2-dev
$ sudo apt install liblzma-dev
$ sudo apt install libhdf5-serial-dev
$ sudo apt install libcurl4-gnutls-dev
$ sudo apt install libpng-dev
$ sudo apt install libssl-dev
$ gcc --version
...

Python requirements

Note: Edit requirements-secondary.txt so that it installs clodius v0.9.3 or newer.

$ cd ~/higlass-server
$ source activate higlass-server
(higlass-server) $ pip install --upgrade -r ./requirements.txt
(higlass-server) $ pip install --upgrade -r ./requirements-secondary.txt

Initialize server

(higlass-server) $ python manage.py makemigrations
(higlass-server) $ python manage.py migrate
(higlass-server) $ python manage.py runserver localhost:8000
...

Test API

From another terminal session:

$ wget -qO- http://localhost:8000/api/v1/tilesets
{"count":0,"next":null,"previous":null,"results":[]}
$ wget -qO- 'http://localhost:8000/api/v1/tileset_info/?d=1234'
{"1234": {"error": "No such tileset with uid: 1234"}}
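
For scripted checks against these endpoints, one option is to test the response body for an "error" key. The sketch below is a rough string match rather than real JSON parsing, the response_ok name is hypothetical, and it is exercised on the two responses shown above instead of a live server:

```shell
# Hypothetical helper: succeed unless the response body contains an "error" key.
response_ok() {
  ! printf '%s' "$1" | grep -q '"error"'
}

# Response bodies copied from the two requests above.
listing='{"count":0,"next":null,"previous":null,"results":[]}'
missing='{"1234": {"error": "No such tileset with uid: 1234"}}'

response_ok "$listing" && echo "tilesets: ok"
response_ok "$missing" || echo "tileset_info: error reported"
```

Against a running server, the argument would instead be the captured output of the corresponding wget call.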

Test ingestion and listing

$ wget -O- --user * --password * https://resources.altius.org/~areynolds/LN43287.75_20.normalized.GRCh38_no_alts.bw.tileset > /tmp/LN43287.75_20.normalized.GRCh38_no_alts.bw.tileset
$ python manage.py ingest_tileset --filetype hitile --datatype vector --filename /tmp/LN43287.75_20.normalized.GRCh38_no_alts.bw.tileset
uid: AIVpsJYwSemD8FVPBv6vrw
$ python manage.py list_tilesets
tileset: Tileset [name: LN43287.75_20.normalized.GRCh38_no_alts.bw.tileset] [ft: hitile] [uuid: AIVpsJYwSemD8FVPBv6vrw]
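
Since the uid is what identifies the tileset afterwards, it can be convenient to capture it from the ingest_tileset output. A minimal sketch, assuming the command prints a line of the form "uid: <value>" as it does above; the sample output is pasted in as a string so the snippet runs without a server:

```shell
# Sample output copied from the ingest_tileset run above.
ingest_output='uid: AIVpsJYwSemD8FVPBv6vrw'

# Take the second whitespace-separated field of the line starting with "uid:".
uid=$(printf '%s\n' "$ingest_output" | awk '/^uid:/ { print $2 }')
echo "$uid"
```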

Set up superuser

$ python manage.py createsuperuser
...

Pull request

https://github.com/higlass/higlass-server/pull/79


Newer versions of Emacs include JavaScript and other modes useful for modern app development. To build the current development version from source:

$ git clone git://git.savannah.gnu.org/emacs.git
$ sudo yum groupinstall "Development Tools"
$ wget ftp://ftp.gnu.org/gnu/autoconf/autoconf-2.68.tar.bz2
$ tar jxvf autoconf-2.68.tar.bz2
$ cd autoconf-2.68
$ ./configure; make; sudo make install
$ sudo yum install texinfo libXpm-devel giflib-devel libtiff-devel libotf-devel
$ cd ../emacs
$ make bootstrap; sudo make install

This process can take 20-30 minutes or more.

With the git repo state as of 24 March 2015:

$ emacs --version
GNU Emacs 25.0.50.1
Copyright (C) 2015 Free Software Foundation, Inc.
GNU Emacs comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GNU Emacs under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING.

Via: http://haulynjason.net/weblog/?p=1592


Finishing touches are in place for my convert2bed tool (GitHub site).

This utility converts common genomics data formats (BAM, GFF, GTF, PSL, SAM, VCF, WIG) to lexicographically-sorted UCSC BED format. It offers two benefits over alternatives:

  • It runs about 3-10x as fast as bedtools *ToBed equivalents
  • It converts all input fields in as non-lossy a way as possible, to allow recovery of data to the original format

As an example, here we use convert2bed to convert a 14M-read, indexed BAM file to a sorted BED file (output is redirected to /dev/null) on a 4 GB, dual-Core 2 (2.4 GHz) workstation running RHEL 6:

$ samtools view -c ../DS27127A_GTTTCG_L001.uniques.sorted.bam
14090028

Conversion is performed with default options (sorted BED as output, using BEDOPS sort-bed):

$ time ./convert2bed -i bam < ../DS27127A_GTTTCG_L001.uniques.sorted.bam > /dev/null
[bam_header_read] EOF marker is absent. The input is probably truncated.

real 3m5.508s
user 0m25.702s
sys 0m8.602s

Here is the same conversion, performed with bedtools v2.22 bamToBed and sortBed:

$ time ../bedtools2/bin/bamToBed -i ../DS27127A_GTTTCG_L001.uniques.sorted.bam | ../bedtools2/bin/sortBed -i stdin > /dev/null

real    28m22.057s
user    2m58.579s
sys     0m41.605s

For this file, convert2bed is roughly 9.2x faster in wall-clock (real) time. Other large BAM files show similar conversion speedups.
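
The speedup figure is the ratio of the two wall-clock (real) times reported above. As a quick check of the arithmetic with awk (the ratio is about 9.18, which reads as 9.1x when truncated or 9.2x when rounded to one decimal place):

```shell
# Ratio of bedtools wall-clock time (28m22.057s) to convert2bed (3m5.508s),
# both converted to seconds, printed to two decimal places.
speedup=$(awk 'BEGIN { printf "%.2f", (28 * 60 + 22.057) / (3 * 60 + 5.508) }')
echo "speedup: ${speedup}x"
```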

Further time reductions come from the bam2bedcluster and bam2starchcluster scripts (TBA), which use GNU Parallel or a Sun Grid Engine job scheduler to break conversion tasks down by chromosome.
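
The cluster scripts themselves are not shown here, so the following is only a sketch of the general idea (split by chromosome, process the pieces concurrently, concatenate), not their implementation. It uses xargs -P in place of GNU Parallel or Sun Grid Engine, plain sort in place of the conversion step, and made-up toy intervals:

```shell
work=$(mktemp -d)

# Toy, unsorted BED-like input (made-up intervals).
printf 'chr2\t50\t60\nchr1\t30\t40\nchr2\t10\t20\nchr1\t5\t15\n' > "$work/in.bed"

# Split into one file per chromosome.
awk -v dir="$work" '{ print > (dir "/" $1 ".part") }' "$work/in.bed"

# Sort each chromosome's records by start, then stop, up to 4 jobs at a time.
ls "$work"/*.part | xargs -P 4 -I{} sh -c 'sort -k2,2n -k3,3n "$1" > "$1.sorted"' _ {}

# Globs expand in sorted order, so the per-chromosome results concatenate
# into output ordered by chromosome name, then by coordinates.
result=$(cat "$work"/*.part.sorted)
printf '%s\n' "$result"
rm -rf "$work"
```

Because each chromosome is handled independently, the final concatenation needs no merge step; that independence is what makes the per-chromosome split attractive for parallel work.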

When testing is complete, code will be wrapped into the upcoming BEDOPS v2.4.3 release. Source is now available via GitHub.
