Photo (c) Jan Trutzschler von Falkenstein; CC-BY-NC · Photo (c) Rain Rabbit · Photo (c) Samuel Craven · Photo (c) Gregorio Karman; CC-BY-NC

Research

I am a research fellow, conducting research into automatic analysis of bird sounds using machine learning.
 —> Click here for more about my research.

Music


Other

Older things: Evolutionary sound · MCLD software · Sponsored haircut · StepMania · Oddmusic · Knots · Tetrastar fractal generator · Cipher cracking · CV · Emancipation fanzine

Blog

New journal article from us!

"Automatic acoustic identification of individuals in multiple species: improving identification across recording conditions" - a collaboration published in the Journal of the Royal Society Interface.

For machine learning, the main takeaway is that data augmentation is not just a way to create bigger training sets: used judiciously, it can mitigate the effect of confounds in the training data. It can also be used at test time to check a classifier's robustness.
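To make the augmentation idea a bit more concrete, here is a minimal sketch of one thing you might try: mixing background sound taken from a different individual's recordings into each call, so that a classifier can't lean on the background as a shortcut. It isn't necessarily the procedure used in the article. The function names, the data layout and the default SNR are all made up for illustration, and the background is assumed not to be silent.

```python
import math
import random


def mean_power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(foreground, background, snr_db):
    """Add background to foreground, scaled to the requested SNR in dB.

    The background is repeated cyclically to match the foreground length.
    Assumes the background is not silent (non-zero power).
    """
    bg = [background[i % len(background)] for i in range(len(foreground))]
    # Choose gain so that fg_power / (gain**2 * bg_power) == 10 ** (snr_db / 10)
    gain = math.sqrt(mean_power(foreground) / (mean_power(bg) * 10 ** (snr_db / 10)))
    return [f + gain * b for f, b in zip(foreground, bg)]


def augment_with_other_backgrounds(calls, backgrounds, snr_db=10.0, seed=0):
    """Mix each call with background sound from a *different* individual.

    calls: list of (individual_id, samples) pairs (hypothetical layout).
    backgrounds: dict mapping individual_id to background-only samples.
    """
    rng = random.Random(seed)
    augmented = []
    for individual, samples in calls:
        donors = [other for other in backgrounds if other != individual]
        donor = rng.choice(donors)
        augmented.append((individual, mix_at_snr(samples, backgrounds[donor], snr_db)))
    return augmented
```

The same function can be pointed at a held-out test set: if accuracy drops sharply once the backgrounds are swapped around, that hints that the classifier was relying on the recording conditions rather than the voice.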

For bioacoustics, the main takeaway is that previous automatic acoustic individual ID research may have been overconfident in its claimed accuracy, due to dataset confounds - and we provide methods to try to quantify such issues, even without gathering new data.

This journal article is the output of a nice collaboration we've been working on, to try and bring machine learning closer to solving the problems zoologists really need solved. It's been very pleasant working on these ideas with Pavel Linhart and Tereza Petrusková (I didn't actually meet Martin Šálek!). The problem of detecting individual animals' vocal signatures is not yet a solved one, but I hope this paper helps nudge us part of the way there, and helps the field to get there more efficiently by a careful use of audio datasets.

science · Wed 10 April 2019

Where are all the solar panels in Britain? Are they in the south? The sunny east? The countryside, the city?

The UK's energy regulator Ofgem publishes open data about the solar PV installations that they know about. In the latest "feed-in tariff" (FiT) data, there are about 800,000 of them. The "installed capacity" adds up to about 4.9 gigawatts, about half of which comes from big industrial field-scale installations and half from domestic rooftop solar.

It would be handy to know where the solar panels are - for example, if you're searching for solar panels to map...

For privacy purposes, Ofgem don't publish exact locations, nor unique IDs, in their big spreadsheet. So the data aren't perfect for mapping, but they do give us the postcode district for 90% of these 800 thousand. So, using that postcode info, I've taken their data and simply plotted them on a choropleth. Let's take a look!

Before you look, please note that I'm plotting the raw numbers per postcode district, and NOT normalising the data to account for the size of the district. This partly explains why the plots look "dark" in the regions (such as London) which are chopped up into lots of small districts. Smaller districts should have fewer things in... but on the other hand, smaller districts are supposed to equate to higher density of households, so maybe the postcode district is a good unit of analysis after all.
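For the curious, the per-district summary amounts to something like the following sketch. The column names `postcode_district` and `capacity_kw` are hypothetical (Ofgem's spreadsheet has its own headers), and the `per_household` helper shows the normalisation that I'm deliberately NOT applying in the plots.

```python
import csv
import io
from collections import defaultdict


def summarise_by_district(rows):
    """Count installations and sum capacity per postcode district.

    rows: iterable of dicts with the hypothetical keys
    'postcode_district' and 'capacity_kw'. Blank districts are
    grouped under 'UNKNOWN'.
    """
    counts = defaultdict(int)
    capacity = defaultdict(float)
    for row in rows:
        district = (row.get("postcode_district") or "").strip().upper() or "UNKNOWN"
        counts[district] += 1
        capacity[district] += float(row["capacity_kw"])
    return dict(counts), dict(capacity)


def per_household(counts, households):
    """Installations per household, for districts with a known household count."""
    return {d: counts[d] / households[d] for d in counts if households.get(d)}


# Tiny made-up example, just to show the shape of the data
sample = "postcode_district,capacity_kw\nE1,3.5\ne1 ,2.0\nYO10,4.0\n,1.0\n"
rows = list(csv.DictReader(io.StringIO(sample)))
counts, capacity = summarise_by_district(rows)
```

With real data you'd read the rows from the FiT spreadsheet export instead of the made-up `sample` string.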

Here are three plots showing, respectively, the raw number of installations per district, the total installed capacity in each district, and (to give an idea of household density) the number of households in each district according to census data:

And here's a CSV spreadsheet of the summary FiT numbers I used to plot these. Sorry for not showing (Northern) Ireland, it's not in the data I found.

(The CSV and the images are all derived from Ofgem's FiT data which are published under the Open Government Licence.)

Note that there are A LOT of caveats about this data. About 10% of the solar installations (80 thousand!) have their postcode district listed as "unknown". Also some postcodes are allegedly not quite right (e.g. some of them are the postcode of the person who registered, not the location of the thing itself). Some of the installations they've listed might have been discontinued, and we don't really have much way of knowing. Oh, and... the postcode area data I'm using seems to have some omissions, hence the occasional white gap in Britain. But notwithstanding all that, this gives us some indication of the distribution.

One thing that pops out to me is that these three plots don't seem very correlated. I'd have expected them all to be highly correlated. For some reason there seems to be a relatively high number of small-capacity installations across Yorkshire down into Essex. There's plenty of regional variation and clustering, which may be due to geographical/weather differences, or perhaps to local initiatives.
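Eyeballing maps is not a rigorous test, of course. If you wanted to put a number on "not very correlated", a plain Pearson correlation across the districts would be one starting point. This is only a sketch: I haven't run it on the FiT data, and the dictionaries are assumed to map district codes to numbers (e.g. installation counts and household counts).

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the variance is zero).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlate_districts(a, b):
    """Correlate two per-district dicts over the districts they share."""
    shared = sorted(set(a) & set(b))
    return pearson([a[d] for d in shared], [b[d] for d in shared])
```

A rank correlation might be a better choice in practice, since district sizes are very skewed.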

openstreetmap · Mon 25 March 2019

Jack had this great idea to find the locations of solar panels and add them to OpenStreetMap. (Why's that useful? He can explain: Solar PV is the single biggest source of uncertainty in the National Grid's forecasts.)

I think we can do this :) The OpenStreetMap community have done lots of similar things, such as the humanitarian mapping work we do, collaboratively adding buildings and roads for unmapped developing countries. Also, some people in France seem to have done a great job of mapping their power network (info here in French). But how easy or fast would it be for us to manually search the globe for solar panels?

(You might be thinking "automate it!" Yes, sure - I work with machine learning in my day job - but it's a difficult task even for machine learning to get to the high accuracy needed. 99% accurate is not accurate enough, because that equates to a massive number of errors at global scale, and no-one's even claiming 99% accuracy yet for tasks like this. For the time being we definitely need manual mapping or at least manual verification.)
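Here's the back-of-envelope arithmetic behind that, as a small sketch. The number of buildings and the fraction that have panels are made-up placeholders, not real statistics, and "99% accurate" is read here as 99% sensitivity and 99% specificity.

```python
def expected_errors(n_buildings, prevalence, sensitivity, specificity):
    """Expected detector outcomes when scanning n_buildings rooftops.

    prevalence: fraction of buildings that really have solar panels.
    sensitivity: fraction of real panels that are detected.
    specificity: fraction of panel-free buildings correctly left alone.
    """
    with_panels = n_buildings * prevalence
    without_panels = n_buildings - with_panels
    true_pos = with_panels * sensitivity
    false_neg = with_panels - true_pos
    false_pos = without_panels * (1 - specificity)
    return {
        "missed_panels": false_neg,
        "false_alarms": false_pos,
        # Of everything flagged as solar, what fraction really is?
        "precision": true_pos / (true_pos + false_pos),
    }


# Hypothetical numbers: a million buildings, 3% of them with panels
result = expected_errors(1_000_000, 0.03, 0.99, 0.99)
```

With those placeholder numbers, nearly ten thousand of the flagged buildings would be false alarms, roughly one in four of all detections. Scale that up to the globe and you can see why manual verification still matters.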

(Oh, or you might be thinking "surely someone officially has this data already?" Well you'd be surprised - some of it is held privately in one database or other, but not with substantial coverage, and certainly almost none of it has good geolocation coordinates, which you need if you're going to predict which hours the sun shines on each panel. Even official planning application data can be out by kilometres, sometimes.)

Solar panel aerial image examples

Jerry (also known as "SK53" on OSM) has had a look into it in Nottingham - he mapped a few hundred (!) solar panels already. He's written a great blog article about it.

This weekend here in London a couple of us thought we'd have a little dabble at it ourselves. We assumed that the aerial imagery must be at least as good as in Nottingham (because that's what London people think about everything ;) so we had a quick skim to look. Now, the main imagery used in OSM is provided by Bing, and unfortunately our area doesn't look anywhere near as crisp as in Nottingham.

We also went out and about (not systematically) and noticed some solar panels here and there, so we've a bit of ground truth to put alongside the aerial imagery. Here I'm going to show a handful of examples, using the standard aerial imagery. The main purpose is to get an idea of the trickiness of the task, especially with the idea of mapping purely from aerial imagery.

It took quite a lot of searching in aerial imagery to find any hits. Within about 30 minutes we'd managed to find three. Often we were unsure, because the distinction between solar panels, rooftop windows or other rooftop infrastructure is hard to spot unless you've got crystal-clear imagery. We swapped back and forth with various imagery sources, but none of the ones we had available by default gave us much boost.

While walking around town we saw a couple more. In the following image (of this location), the building "A" had some stood-up solar panels we saw from the ground; it also looks like "B" had some roof-mounted panels too, but we didn't spot them from the ground, because they don't stick up much.

Solar example

Finally this picture quite neatly puts 3 different examples right next to each other in one location. At first we saw a few solar panels mounted flush on someone's sloping roof ("A"), and you can see those on the aerial - though my certainty comes from having seen them in real life! Then next to it we saw some stood-up solar panels on a newbuild block at "B", though you can't actually see it in the imagery because the newbuild is too new for all the aerials we had access to. Then next to that at "C" there definitely looks to be some solar there in the aerials, though we couldn't see that from the ground.

Solar example

Our tentative learnings so far:

See Jerry's blog for more learnings.

There are plenty of virtuous feedback loops in here: the more we do as a community, the better we'll get (both humans and machines) at finding the solar panels and spotting the gaps in our data.

openstreetmap · Mon 11 March 2019

Based on a conversation we had in the Machine Listening Lab last week, here are some blogs and other things you can read when you're - say - a new PhD student who wants to get started with applying/understanding deep learning. We can recommend plenty of textbooks too, but here it's mainly blogs and other informal introductions. Our recommended reading:

PRACTICAL:

ADVANCED:

Science · Fri 08 February 2019

I was setting a new laptop up recently. If you're not familiar with Linux you probably don't know how amazing the ecosystem of software is that you can have for free, almost instantly. Yes, sure, the software is free, but what's actually impressive is how well it all stitches together through "package managers". I use Ubuntu (based on Debian) and Debian provides this amazing jiu-jitsu wherein you can just type

sudo apt install sonic-visualiser

and hey presto, you get Sonic Visualiser nicely installed and ready to go.

So what that means for me is that when I'm setting up a new computer, I don't need to go running around clicking on a million websites, clicking through download links and licence agreements. I can just copy over the list of all my favourite software packages, and apt will install them for me in just a few steps.
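If your package list has comments in it (like mine does), you need to strip those out before handing it to apt. Here's a rough sketch of one way to do that in Python. The function names are my own invention, and the script only builds the command rather than running it, so you can check it first.

```python
import shlex


def parse_package_list(text):
    """Extract package names from a list with '#' comments and blank lines."""
    packages = []
    for line in text.splitlines():
        # Drop anything after a '#', then surrounding whitespace
        name = line.split("#", 1)[0].strip()
        if name:
            packages.append(name)
    return packages


def apt_install_command(packages):
    """Build (but do not run) an 'apt install' command for the packages."""
    return ["sudo", "apt", "install", "--yes", *packages]


example = """
# file sharing, synchronisation
syncthing          # for fabulous dropbox-without-dropbox file synchronising
git

# graphics/photo editing
gimp            # great for bitmap (e.g. photo) editing
"""
command = apt_install_command(parse_package_list(example))
printable = shlex.join(command)  # a copy-pasteable shell command line
```

You could pass `command` to `subprocess.run` if you trust the list, though pasting the printed line into a terminal lets you read it before anything gets installed.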

For whatever reason - for my own recollection, at least - here's a list of lots of great packages I tend to install on my desktop/laptop. General useful stuff, plus things that an audio hacker, Python machine-learning developer, and computer science academic might use. I'll add some comments to highlight notable things:

# file sharing, synchronisation
syncthing          # for fabulous dropbox-without-dropbox file synchronising
syncthing-gtk
git
transmission-gtk

# graphics/photo editing
cheese
darktable
gimp            # great for bitmap (e.g. photo) editing
imagemagick
inkscape        # great for vector graphics
openshot        # great for video editing

# for a nice desktop environment:
pcmanfm
gnome-tweak-tool
caffeine-indicator   # helps to pause screensaver etc when you need to watch a film, give a talk, etc
xcalib               # I use this to invert colours sometimes

# academic
jabref
r-base
texlive
texlive-latex-extra
texlive-bibtex-extra
texlive-fonts-extra
texlive-fonts-recommended
texlive-publishers
texlive-science
texlive-extra-utils  # for texcount (latex word counting)
graphviz
gnuplot
latexdiff          # Super-useful for comparing original text against the re-submission text...
poppler-utils      # PDF manipulation
psutils
bibtex2html
pandoc

# for python programming fun
jupyter-notebook
virtualenv
python-matplotlib
python-nose
python-numpy
python-pip
python-scipy
python-six
python-skimage
python3-numpy
cython
ipython
ipython3

# for music playback
mopidy
mopidy-local-sqlite
ncmpcpp
pavucontrol
paprefs
brasero
banshee
qjackctl
jack-tools
jackd2
mixxx
mplayer
vlc

# music/audio file manipulation
audacity
youtube-dl
ffmpeg
rubberband-cli
sndfile-tools
sonic-visualiser
sox
id3v2
vorbis-tools
lame
mencoder
mp3splt

# audio programming libraries
libsndfile1
libsndfile1-dev
libfftw3-dev
librubberband-dev
libvorbis-dev

# for blogging / websiting:
pelican
lftp

# office
simple-scan
ttf-ubuntu-font-family
thunderbird-locale-en-gb
orage
xul-ext-lightning  # alt calendar software

# misc programming stuff
ansible
ant
build-essential
ccache
cmake
cmake-curses-gui
debhelper
debianutils
default-jdk
default-jre
devscripts
git-buildpackage
vim-gtk

# system utilities
apparmor
apport
anacron
nmap
hfsprogs
printer-driver-hpijs
dconf-editor
chkrootkit
dmidecode
zip
zsh             # zsh is so much better than bash
gparted
htop
baobab
wireshark-qt
bzip2
curl
dnsutils
dos2unix
dvd+rw-tools
less
openssh-server
openvpn
screen
unrar
unzip
wget

IT · Fri 01 February 2019

For a cook, Veganuary was a really interesting challenge. A whole month of being vegan! Here are some things I learnt:

  1. How to make a chia egg - it's a replacement for egg white made of... crushed seeds. Probably the main party-trick I learnt, since I already knew some of the other vegan voodoo secrets (e.g. aquafaba).
  2. How to veganise a cake
  3. Coconut becomes a bit of a fallback for creamy or fatty things: coconut cream, coconut oil, coconut milk... not too healthy to have too much of that!
  4. THERE'S MILK EXTRACT IN EVERYTHINGGG. This evening I picked up a pack of gnocchi - presumably just made of potato - and looked at the back... milk protein - gah!
  5. Vegan cheese: avoid Violife. Vegans will tell you "Oh this one tastes OK, this other one is good if you..." - don't listen to them! Violife will put you off vegans AND cheese!
  6. On the other hand, oat milk (Oatly) is an ace alternative milk, and can be used for a surprising number of baking recipes with no problem at all. I've cooked Yorkshire puddings, quiches, all kinds of things with it.

You know, I didn't miss real cheese much, but that's probably because a month is not too long really. Didn't even get round to trying all the recipes I wanted to try. Had some lovely vegan junk food (shout out to Vivera shawarma, not to mention McSween's veggie haggis).

And for the record, as well as to show you all how interesting Veganuary can be if you like cooking, here are some of the delicious things we cooked+ate!

Jackfruit rendang meal

Food · Wed 30 January 2019

OK, "vegan chorizo carbonara" - I think neither the Italians nor the Mexicans will forgive me for this one! But it's a veganuary experiment and I like it.

Thanks to veganuary I'm learning about chia egg, and here it really does work to provide the gloopy egg-like saucing. The chia also gives a little bit of flavour and crunch.

To get the flavour balanced, you add more lemon than you would to a "normal" carbonara - it isn't authentic but it adds some freshness and lightness.

Serves one, takes 15 minutes.

First, prepare the chia egg: grind up the chia seed in a pestle and mortar (or similar), not for too long - it doesn't need to be very fine - then add 3 tbsp of cold water. You can leave this to stand and thicken up as you do the other stuff.

Start the spaghetti cooking: put it in a large pan of boiling salted water. Cook it for maybe 12 mins until it is al dente.

Divide the chorizo into small bites. In a small frying pan, fry the chorizo in olive oil, hot at first but then turn it down to medium.

Chop the parsley roughly.

Mix the lemon juice and rind into the chia egg. You may need to stir the chia egg and poke it to beat out any clumps.

When the pasta has reached the "al dente" stage, drain it in a colander and then return it to the pan you cooked it in. (No need for any more heat at this point.) Add the chia egg and lemon, as well as the oat cream, and mix it through thoroughly. Then add the chorizo and the parsley, and mix them all up.

Serve this up, with nutritional yeast sprinkled on top.

recipes · Sun 27 January 2019

I've been using "black bean chorizo" in my cooking for years. It's based on Hugh Fearnley-Whittingstall's "tupperware chorizo" recipe - it makes a densely-flavoured black bean paste, not as firm as real chorizo but with the same kind of flavour depth.

It keeps in the fridge for a long time (let's say... a month?) and is really handy for a bit of complex strong flavour which, in vegetarian cooking, can otherwise be hard to get!

Put the black beans in a bowl and lightly mush/crush them, e.g. with a fork or a masher. They don't need to be fully minced but, at least... not bean-shaped any more!

Add all the other ingredients. Mix it all up thoroughly. It may well seem "too wet" with the red wine but don't worry, it all absorbs and matures.

Put the mix in a tupperware box that you can shut airtight. Shut it, put it in the fridge, and leave it for at least a day before using, ideally 1 to 3 weeks.

recipes · Sun 27 January 2019
