John Lees' blog

Pathogens, informatics and modelling at EMBL-EBI

Easy debugging of C/C++/CUDA python extensions

Writing an extension called by python (in C, C++ or CUDA)? Not working? Typical.

When doing the same from R it’s pretty easy to debug: just run with R -d <debugger name>, e.g. R -d valgrind or R -d gdb. You get into the debugger, continue, then run interactively as usual. (For a more complex example using both at once, see this blog post.)

Doing this from python seems trickier to me. I started off following this guide: https://johnfoster.pge.utexas.edu/blog/posts/debugging-cc%2B%2B-libraries-called-by-python/ But I’m not really an ipython user, and prefer gdb over lldb (due to familiarity). I think this is a good way to do this if you need an interactive python session, but really this is overcomplicated for my typical use case.
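For comparison with R -d gdb, here is a minimal sketch of one non-interactive route: starting the python interpreter itself under gdb. This is not necessarily the approach the post goes on to describe. The helper name gdb_command and the script name run_extension.py are made up for illustration; the only real interface assumed is gdb's --args flag, which passes everything after it to the program being debugged.

```python
import sys


def gdb_command(script, *script_args):
    """Build an argv list that starts the current python interpreter under gdb.

    Hypothetical helper: everything after --args is treated by gdb as the
    program to debug plus its arguments.
    """
    return ["gdb", "--args", sys.executable, script, *script_args]


# run_extension.py is a placeholder for a script that imports the extension
cmd = gdb_command("run_extension.py", "--threads", "4")
print(" ".join(cmd))
```

Passing cmd to subprocess.run would open the gdb prompt, where typing run starts the script and a crash inside the extension drops back into the debugger.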

p-value < 2.2e-16

A claim: 2.2e-16 is the most popular p-value in research papers, even more popular than 0.05 (or if you’re being cynical 0.049).

Why?

2.2e-16 happens to be the epsilon of a double-precision float (i.e. a decimal number stored using 64 bits): the gap between 1 and the next larger number that can be represented. Roughly, this means that if you add anything much smaller than epsilon to 1, the answer will still be exactly 1.
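As a rough illustration in Python rather than R (Python floats are also 64-bit doubles), the sketch below finds epsilon by repeated halving and compares it with the value Python itself reports. The function name machine_epsilon is made up for this example.

```python
import sys


def machine_epsilon():
    """Halve a candidate until adding half of it to 1.0 no longer changes 1.0."""
    eps = 1.0
    while 1.0 + eps / 2.0 > 1.0:
        eps /= 2.0
    return eps


eps = machine_epsilon()
print(eps)                            # 2.220446049250313e-16
print(eps == sys.float_info.epsilon)  # True
print(1.0 + eps / 2.0 == 1.0)         # True: the addition is lost to rounding
```

Any quantity that depends on telling 1 apart from a slightly different number therefore runs out of resolution at about this scale.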

In R, you can calculate this by running the following code (+2 due to convention):

Honey Roast Parsnips (frozen), Iceland

£1.75, 750g

This weekend I wanted to buy some parsnips to roast, but they were absent from the produce section (except in a pack coming with four unwanted carrots, and one considerably more unwanted turnip). However, as I was shopping at Iceland, there was a handy pre-prepared frozen alternative:

/images/IMG_20210329_182003-241x300.jpg

Screamadelica, Primal Scream

Why is it only in 2021 that I am listening to Primal Scream’s Screamadelica for the first time?

/images/Screamadelica.png

A lot of critically acclaimed music from the 1980s maintains a pop appeal that means it still gets radio play, is featured in club nights, and is heavily promoted on my YouTube home page. However, perhaps the post-rock, trip-hop and grunge of the early 1990s doesn’t have the same enduring commercial appeal. Whatever the reason, I’ve been missing out.

Porting a bioinformatics tool to the web using WebAssembly, React and javascript

We recently released a beta version of PopPUNK-web (https://web.poppunk.net). It uses a WebAssembly (WASM) version of pp-sketchlib, which sketches a user-input genome assembly in the browser. The sketch is transmitted as JSON to a server running PopPUNK using gunicorn and Flask, which runs query assignment against a large database of genomes from the GPS project and returns JSON containing the strain assignment, a tree and a network. These are then displayed using a React app.
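The request and response cycle described above can be sketched with the standard library alone. Everything here is an assumption made for illustration: the field names, the function names assign_query and handle_request, and the placeholder values are hypothetical and are not taken from PopPUNK-web, and the real server sits behind gunicorn and Flask rather than a plain function.

```python
import json


def assign_query(sketch):
    """Stand-in for the server-side query assignment (hypothetical).

    The real service compares the sketch against a genome database;
    this stub only returns placeholder values in the same overall shape.
    """
    return {
        "strain": "placeholder",
        "tree": "(query,reference);",
        "network": {"nodes": ["query", "reference"],
                    "edges": [["query", "reference"]]},
    }


def handle_request(body):
    """Decode the JSON sketch, run assignment, and encode the JSON reply."""
    sketch = json.loads(body)
    return json.dumps(assign_query(sketch))


# The browser side would produce the sketch; this payload is made up
request_body = json.dumps({"sketch": [1, 2, 3]})
response = json.loads(handle_request(request_body))
print(sorted(response))
```

Keeping both directions as plain JSON means the browser front end and the server only need to agree on the field names.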

Thoughts on 'Whole genome phylogenies reflect the distributions of recombination rates for many bacterial species'

I was happy to see that this paper, which originally appeared as a preprint back in April 2019 (!), was published earlier this month. I thought it was one of the most thought-provoking papers I’ve read recently, so I suggested a journal club on the final version (it’s a long paper, over 80 pages).

There were some parts that I liked a lot, and some parts I didn’t like, which I wanted to summarise here. Overall, I thought the paper brought an interesting ‘outsider’ approach to the problem of bacterial population genomics, and quantified some issues in new and useful ways. However, I was less keen on the presentation, which to me was overly confrontational, and failed to put the research within a proper modern context. What follows covers some of our discussion; it is subjective, and is not meant to be a thorough review.