Photo (c) Jan Trutzschler von Falkenstein; CC-BY-NC · Photo (c) Rain Rabbit · Photo (c) Samuel Craven · Photo (c) Gregorio Karman; CC-BY-NC

Research

I am a research fellow, conducting research into automatic analysis of bird sounds using machine learning.
 —> Click here for more about my research.

Music


Other

Older things: Evolutionary sound · MCLD software · Sponsored haircut · StepMania · Oddmusic · Knots · Tetrastar fractal generator · Cipher cracking · CV · Emancipation fanzine

Blog

I'm very happy to publish a video of this installation piece that Sarah Angliss and I collaborated on a couple of years ago. We used computational methods to transcribe a dawn chorus birdsong recording into music for Sarah's robot carillon:

We presented this at Soundcamp in 2016. We'd also done a preview of it at an indoor event, but on this lush spring morning, with the very active birds all around in the park, it slotted in just perfectly.

If you listen you find that obviously the bells don't directly sound like birds singing. How could they! Ever since I started my research on birdsong, I've been fascinated by the rhythms of birdsong and how strongly they differ from human rhythms, and what I love about this piece is the way the bells take on that non-human patterning and re-present it in a way that makes it completely unfamiliar (yet still pleasing). We humans are so used to birdsong as background sound that we fail to notice what's so otherworldly about it. The piece has a lovely ebb and flow, and is full of little gestures and structures. None of that was composed by us - it all comes directly from an automatic transcription of a dawn chorus. (We did of course make creative decisions about how the automatic transcription was mapped. For example, we chose the pitch range to transpose to so as to get the best alignment between the birds' and the bells' singing ranges.) And in context with the ongoing atmosphere of the park, the birdsong and the children, it works really well.

art · Tue 20 February 2018

The paper "Wasserstein Learning of Deep Generative Point Process Models" published at the NIPS 2017 conference has some interesting ideas in it, connecting generative deep learning - which is mostly used for dense data such as pixels - together with point processes, which are useful for "spiky" timestamp events.

They use the Wasserstein distance (aka the "earth-mover's distance") to compare sequences of spikes, and they do acknowledge that this has advantages and disadvantages. It's all about pushing things around until they match up - e.g. move a spike a few seconds earlier in one sequence, so that it lines up with a spike in the other sequence. It doesn't nicely account for insertions or deletions, which is tricky because it's quite common to have "missing" spikes or added "clutter" in data coming from detectors, for example. It'd be better if this method could incorporate more general "edit distances", though that's non-trivial.

So I was thinking about distances between point processes. More reading to be done. But a classic idea, and a good way to think about insertions/deletions, is called "thinning". It's where you take some data from a point process and randomly delete some of the events, to create a new event sequence. If you're using Poisson processes then thinning can be used for example to sample from a non-stationary Poisson process, essentially by "rejection sampling" from a stationary one.
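Here's a minimal sketch of that rejection-sampling use of thinning, just to make it concrete. The function names and the sinusoidal rate are my own illustrative choices, and it assumes you know an upper bound `rate_max` on the rate over the whole interval:

```python
import math
import random

def sample_homogeneous(rate, t_end, rng):
    """Event times of a stationary Poisson process on [0, t_end)."""
    times = []
    t = rng.expovariate(rate)  # exponential gap before the first event
    while t < t_end:
        times.append(t)
        t += rng.expovariate(rate)
    return times

def sample_by_thinning(rate_fn, rate_max, t_end, rng):
    """Non-stationary sample: keep a candidate at time t with
    probability rate_fn(t) / rate_max (needs rate_fn(t) <= rate_max)."""
    candidates = sample_homogeneous(rate_max, t_end, rng)
    return [t for t in candidates if rng.random() < rate_fn(t) / rate_max]

rng = random.Random(0)
rate_fn = lambda t: 5.0 * (1.0 + math.sin(t))  # hypothetical rate, never above 10
events = sample_by_thinning(rate_fn, rate_max=10.0, t_end=20.0, rng=rng)
print(len(events), "events kept")
```

The tighter `rate_max` is to the true maximum of the rate, the fewer candidates get thrown away.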

Thinning is a probabilistic procedure: in the simplest case, take each event, flip a coin, and keep the event only if the coin says heads. So if we are given one event sequence, and a specification of the thinning procedure, we can define the likelihood that this would have produced any given "thinned" subset of events. Thus, if we take two arbitrary event sequences, we can imagine their union was the "parent" from which they were both derived, and calculate a likelihood that the two were generated from it. (Does it matter if the parent process actually generated this union list, or if there were unseen "extra" parent events that were actually deleted from both? In simple models where the thinning is independent for each event, no: the deletion process can happen in any order, and so we can assume those common deletions happened first to take us to some "common ancestor". However, this does make it tricky to compare distances across different datasets, because the unseen deletions are constant multiplicative factors on the true likelihood.)

We can thus define a "thinning distance" between two point process realisations as the negative log-likelihood under this thinning model. Clearly, the distance depends entirely on the number of events the two sequences have in common, and the numbers of events that are unique to them. The actual time positions of the events have no effect in this simple model: all that matters is whether they line up or not. It's one of the simplest comparisons we can make. It's complementary to the Wasserstein distance, which is all about time-position and not about insertions/deletions.

This distance boils down to:

NLL = -( n1 * log(n1/nu)  +  n2 * log(n2/nu)  +  (nu-n1) * log(1 - n1/nu)  +  (nu-n2) * log(1 - n2/nu) )

where "n1" is the number of events in seq 1, "n2" in seq 2, and "nu" in their union.
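As a rough sketch, the formula above can be written as a small Python function. This is not the example script itself, just a direct transcription under two assumptions of mine: events only count as shared if their timestamps are exactly equal, and terms of the form 0 * log(0) are taken to be zero.

```python
import math

def _xlogy(x, y):
    """x * log(y), using the convention that 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)

def thinning_distance(seq1, seq2):
    """Negative log-likelihood of two event sequences under the
    simple independent-thinning model, with their union as parent."""
    s1, s2 = set(seq1), set(seq2)
    n1, n2 = len(s1), len(s2)
    nu = len(s1 | s2)  # number of events in the union
    if nu == 0:
        return 0.0  # two empty sequences: nothing to compare
    log_lik = (_xlogy(n1, n1 / nu) + _xlogy(n2, n2 / nu)
               + _xlogy(nu - n1, 1 - n1 / nu)
               + _xlogy(nu - n2, 1 - n2 / nu))
    return -log_lik

# Two toy sequences sharing one event out of three in the union
print(thinning_distance([1, 2], [2, 3]))
```

Identical sequences come out at distance zero, and the value grows as more events are unique to one sequence or the other.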

Does this distance measure work? Yes, at least in limited toy cases. I generated two "parent" sequences (using the same rate for each) and separately thinned each one ten times. I then measured the thinning distance between all pairs of the child sequences, and there's a clear separation between related and unrelated sequences:

Distances between distinct children of same process:
Min 75.2, Mean 93.3, Median 93.2, Max 106.4
Distances between children of different processes:
Min 117.3, Mean 137.7, Median 138.0, Max 167.3

[Python example script here]

This is nice because it's easy to calculate. To be able to do work like in the paper I cited above, we'd need to be able to optimise against something like this, and even better, to be able to combine it into a full edit distance, one which we can parameterise according to situation (e.g. to balance the relative cost of moves vs. deletions).

This idea of distance based on how often the spikes coincide relates to "co-occurrence metrics" previously described in the literature. So far, I haven't found a co-occurrence metric that takes this form. To relax the strict requirement of events hitting at the exact same time, there's often some sort of quantisation or binning involved in practice, and I'm sure that'd help for direct application to data. Ideally we'd generalise over the possible quantisations, or use a jitter model to allow for the fact that spikes might move.
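To illustrate the quantisation idea, here's a small sketch that snaps event times onto bins before counting coincidences. The bin width and the helper names are hypothetical choices for illustration, and a fixed grid like this is only the crudest option (two events either side of a bin edge still won't match):

```python
import math

def quantise(times, bin_width):
    """Map each event time to the index of the bin it falls in."""
    return {math.floor(t / bin_width) for t in times}

def coincidence_counts(times1, times2, bin_width):
    """Count bins occupied by both sequences, and by only one of them."""
    b1 = quantise(times1, bin_width)
    b2 = quantise(times2, bin_width)
    return {
        "common": len(b1 & b2),
        "only_1": len(b1 - b2),
        "only_2": len(b2 - b1),
    }

# 0.01 and 0.3 land in the same half-second bin, so they count as a match
print(coincidence_counts([0.01, 0.49, 1.2], [0.3, 0.7], bin_width=0.5))
```

Note that several events landing in one bin collapse into a single occupied bin, which is itself a modelling decision.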

science · Fri 16 February 2018

I'm lucky to be working with a great set of PhD students on a whole range of exciting topics about sound and computation. (We're based in C4DM and the Machine Listening Lab.) Let me give you a quick snapshot of what my students are up to!

I'm primary supervisor for Veronica and Pablo:

I'm joint-primary supervisor for Will and Delia:

I'm secondary supervisor for Jiajie and Sophie:

science · Fri 09 February 2018

I was shocked - and frankly, really sceptical - to realise that eating meat was one of the biggest climate impacts I was having. On the flip side, that's a good thing, because it's one of the easiest things to change on our own, without upending society. Easier than rerouting the air travel industry! I've been doing it for a couple of years now and hey - it's actually been surprisingly easy and enjoyable.

Yes meat-eating really is an important factor in climate change: see e.g. this recent letter from scientists to the world - but see also the great book "How Bad Are Bananas" for a nice readable intro. For even more detail this 2014 paper and this 2013 paper both quantify the emissions of different diets.

The great thing is you don't have to be totally vegetarian, and you don't have to be an absolutist. Don't set yourself a goal that's way too far out of reach. Don't set yourself up for failure.

So, my climatarian diet. Here's how it works:

  1. Don't eat any more beef or lamb, or other ruminants. Those are by far the most climate-changing animals to eat (basically because of all the methane they produce... on top of the impacts of all the crops you need to feed them up, etc).

Actually, that rule is the only rule that I follow as an absolute. Avoid other meat as far as possible, but don't worry too much if you end up having some chicken/pork/etc now and again. The impact of chicken/pork/etc is not to be ignored but it's much less than beef, and I find I've not really wanted much meat since I shifted my eating habits a bit.

More tips:

  1. Seafood (and various fish) is a good CO2-friendly source of protein and vitamins etc for someone who isn't eating meat, so do go for those. Especially seafood, oily fish. (Though it's hard to be sure which fish is better/worse - see e.g. this article about how much fuel it takes to get different fish out of the sea. Farmed fish must be easier right?)
  2. Try not to eat too much cheese, but again don't worry too much.
  3. When people ask, it's easiest just to say "vegetarian" - they usually know how to feed you then :)

I never thought I'd be able to give up on beef (steaks, roasts, burgers) but the weirdest thing is, within a couple of months I just had no inclination towards it. Funny how these seemingly unchangeable things can change.

If CO2 is what you care about then you might end up preferring battery-farmed animals rather than free-range, because if you think about it battery-farming is all about efficiency, and typically uses fewer resources per animal (it also restricts animal movements) - however, I'm saying this not to justify it but to point out that maybe you don't want to have just a one-track mind.

Vegans have an even better CO2 footprint than vegetarians or almost-ish-vegetarians like me. Vegan food is getting better and better but I don't think I'm going to be ready to set the bar that high for myself, not for the foreseeable. Still, sampling vegan now and again is also worth doing.

Is this plan the ideal one? Of course not. The biggest problem, CO2-wise, is cheese, since personally I just don't know how to cut that out when I'm already cutting out meat etc - it's just a step too far for me. The book "How Bad Are Bananas" points out that a kilogram of cheese often has a CO2 footprint higher than that of some meats. Yet vegetarians who eat cheese still contribute one-third less CO2 than meat-eaters [source], because of course they don't eat a whole cheese steak every day!

You don't have to be perfect - you just have to be a bit better. Give it a go?

food · Mon 27 November 2017

SuperCollider works on Linux just great. I've been responsible for one specific part of that in recent years, which is that when a new release of SuperCollider is available, I put it into the Debian official package repository - which involves a few obscure admin processes - and then this means that in the next releases of Debian AND Ubuntu, it'll be available really easily.

These are my notes to help others understand how it's done.

First a few words: on Ubuntu you can also make SuperCollider available through an Ubuntu "PPA", and there's even a sort-of-official PPA where you can get it. Some people like this because it happens much quicker (there's less official approval needed). I strongly advise maintainers: it's really valuable to go through the Debian official repository, even though it's slower. There's no need to feel rushed! Getting it into Debian often means fixing a few little packaging quirks to make sure it installs nicely and interoperates nicely, and your work will result in much wider benefit. It's OK to do the PPA thing as well, of course, but you mustn't rely purely on the PPA. (You may as well do the debian thing and then repurpose the same codebase for PPA.)

The things I'm going to cover, i.e. the things you'll need to know/do in order to get a new SuperCollider release into Debian, include:

  1. joining the debian-multimedia team
  2. debian's lovely way of using git together with buildpackage
  3. importing a fresh sourcecode release of SC into the git
  4. compiling it, checking it, releasing it

But I'm mainly going to do this as a step-by-step walkthrough, NOT a broad overview. Sorry if that means some things seem unexplained.

DO IT IN DEBIAN

I'm an Ubuntu user normally, but to keep things clean I do this work in a Debian virtual machine, by using Virtualbox. Ubuntu is based on Debian so you might think you can do it directly in Ubuntu but in practice it tends to go wrong because you end up specifying the wrong versions of package dependencies etc.

Of course, Ubuntu "inherits" packages from Debian, so after we push the Debian package it will magically appear in Ubuntu too.

In the Debian VM you'll also need these packages, which you can get from apt install as normal:

You'll also need to install whatever is needed ordinarily to compile SuperCollider - check the readme. (There's a tool mk-build-deps which can help with this, as long as the dependencies haven't changed since the previous SC.)

GET THE DEBIAN-FLAVOURED CODE

The Debian "multimedia team" has a special git repository of their own, which contains the released version of SuperCollider plus the debian scripts and metadata.

Here are shell commands for fetching the git repo and specifically checking out the three branches that are used in debian's git-buildpackage workflow:

git clone https://anonscm.debian.org/git/pkg-multimedia/supercollider.git
cd supercollider
git checkout -b upstream origin/upstream
ls
git checkout -b pristine-tar origin/pristine-tar
ls
git checkout master
ls

If you do those "ls" commands you get a rough idea of what's in the 3 branches:

  1. "upstream": this should be the exact same as the contents of the "-Source-linux.tar.bz2" sourcecode downloaded from the main SuperCollider release.
  2. "master": this is the same as upstream EXCEPT that it has the special "debian" folder added, which contains all of the magic to compile and bundle up SuperCollider correctly.
  3. "pristine-tar": the file layout in here is very different from the others. It's simply an archive of all the source code tar files, created automatically.

This might seem a bit arcane, but don't change it - the debian "git-buildpackage" scripts expect the git repo to be laid out EXACTLY like this.

A shortcut that actually pulls all three branches is provided by gbp:

gbp clone https://anonscm.debian.org/git/pkg-multimedia/supercollider.git

but I'm doing it explicitly because it's kinda useful to get a bit of an idea what's going on in those three branches.

IMPORTING A FRESH SOURCE CODE RELEASE

Let's imagine the main SuperCollider team have released a new version, including putting a new sourcecode download on the website. IMPORTANT: it needs to be the "-Source-linux.tar.bz2" version, because that strips out some Windows- and Mac-specific stuff. Some people don't care about whether there's extra Windows and Mac cruft in a zip file, but the Debian administrators do care, because they monitor the code in the repository to make sure there's no non-free material in there etc.

Do this every time there's a new release of SuperCollider to bundle up:

  1. Run uscan which checks the SuperCollider website for a new source code download. If it finds one it'll download it, and it'll also automatically repack it (removing some crufty files that are either not needed or lead to copyright complications). It puts the resulting tar.gz in ../tarballs. You can run uscan --verbose and it'll show some text details that might help you understand what actions the program is actually doing.
  2. Run gbp import-orig --pristine-tar --sign-tags ${path-to-fetched-tarball}. The path, for me at least, is ../tarballs/ followed by the actual tarball file. Make sure it's the "repack" one. The procedure will check with you what the upstream version number is. Is it "3.8.0~repack"? No, it's "3.8.0".
  3. Refresh the patches. What this means is, the debian folder has a set of patches that it uses to modify the supercollider source code, to fix problems. These patches might not apply exactly to the new code, so we need to go through them:

    export QUILT_PATCHES=debian/patches
    quilt push -a
    quilt refresh
    quilt pop      # repeat refresh-and-pop until all are popped
    

    Did you run the last two lines again and again? Eventually it says "No patches applied".

    After this, it's a good idea to do a git commit debian to store any changes you made to the patches in a git commit of their own.

    You may need to remove a patch - typically, this happens if it's been "upstreamed". To do that, you can git rm the patch itself as well as edit its name out of debian/patches/series, then commit that. You may also find you need to make a new patch, to fix some issue such as getting the latest code to build properly on all the architectures that Debian supports.

    (Recently the debian admins have started using gbp pq to look after the patches. Maybe that's useful. I haven't got into it yet.)

  4. Create a changelog entry.

    gbp dch -a debian

Here's something that might be surprising: the changelog file is what tells debian which version of SC it's building. If it sees 3.7.0 as the top item, it tries to build from 3.7.0 source. It doesn't matter what's been happening in the git commits, or which source code you have downloaded. So if you're importing a new version you have to make sure to add a new entry to the top of the changelog. Hopefully the pattern is obvious from the file itself, but you can also look at general Debian packaging guidelines to understand it more.
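To make the point about the top changelog entry concrete, here's a small sketch that pulls the package name and version out of the first line of a changelog. The changelog text below is a made-up entry for illustration, and the regex only handles the ordinary "package (version) distribution; ..." header line. In real life Debian's own dpkg-parsechangelog tool is the thing to use.

```python
import re

# Header line of a changelog entry: "package (version) distribution; urgency=..."
HEADER = re.compile(r"^(?P<package>\S+) \((?P<version>[^)]+)\) ")

def top_changelog_version(changelog_text):
    """Return (package, version) from the first entry of a changelog."""
    lines = changelog_text.lstrip().splitlines()
    match = HEADER.match(lines[0]) if lines else None
    if match is None:
        raise ValueError("first line does not look like a changelog header")
    return match.group("package"), match.group("version")

# Hypothetical changelog content, for illustration only
example = """\
supercollider (3.8.0-1) unstable; urgency=medium

  * New upstream version (made-up entry)

 -- Example Maintainer <maintainer@example.org>  Sat, 28 Oct 2017 12:00:00 +0100
"""

print(top_changelog_version(example))
```

Whatever comes out of that first line is the version the build will go looking for, regardless of what the git history says.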

NOW TEST THAT IT BUILDS

First let's get gbp to build a debian style source package. (You may be wondering: we started with SuperCollider's original source code bundle, why are we now building a source code bundle? This is different, it makes a .dsc file that could be used to tell the Debian servers how to compile a binary.)

The main reason I'm telling you to do this is that it performs some of the build process but not the actual compiling. So it's a good way to check for any errors before doing the hardcore building:

gbp buildpackage -S --git-export-dir=../buildarea

This will also run lintian to check for errors in rule-following. Debian's rules are quite strict and you'll probably find some little error or other, which you should fix, then do a git commit for, then try again.

You can also then build proper binary debs:

gbp buildpackage --git-export-dir=../buildarea

This might take a long time. Eventually... it should produce some .deb files. It might even ask you to sign them.

THINGS YOU MAY HAVE TO CHANGE

  1. SuperCollider uses the "boost" code library. SuperCollider comes bundled with a recent version of it, and the exact version gets updated now and again. Debian makes this a little more complicated - they don't want to use the bundled version, instead they want to use the version that's packaged in Debian. So if you look in debian/control you'll see some "libboost" dependencies specified. If SuperCollider's dependency has changed, you may need to update this to get it building properly. You'll also need to use apt install to fetch those boost dependencies.
  2. If SC source code files get renamed or the folders change, it's fairly common that you need to edit one of the text files in the debian folder to point it at the right thing.

NOW TEST THAT IT RUNS

You've successfully made the .deb files, i.e. the actual installable binaries. Install them on your system. You can do it using dpkg -i like you would with most deb files, or for convenience you can use the debi command which makes sure you're installing the whole set of packages you've built:

debi ../buildarea/supercollider_3.8.0-1_*.changes

NOTE that this installing step is "real" installing. If you're working in a virtualbox like I am then you're probably not worried about whether you'll be overwriting your existing SC. Otherwise do bear in mind - this install will overwrite/upgrade the SC that's installed on your system.

Once installed, run it, make sure the thing is OK.

NOW PUBLISH THIS STUFF

In order to get this stuff actually live on the official debian package system, you need to do a few things. You'll need to join "debian multimedia team" as a guest (see their webpages for more info on that), and once you've done that it gives you permission to push your git changes up to their server:

git push origin master upstream pristine-tar
git push --follow-tags

Then after that you need to do a "request for upload" - i.e. asking one of the debian multimedia team with upload rights, to give it a quick check and publish it. You do this via the debian multimedia team mailing list. It's also possible to get upload rights yourself, but that's something I haven't gone through.

CONCLUSION

So there we have it. Thanks to Felipe Sateler and other Debian crew for lots of help inducting me into this process.

Interested in helping out? Whether it's SuperCollider or some other audio/video linux tool, the Debian MultiMedia team would love you to join!

P.S. some added notes from Mauro about using Docker as part of this

supercollider · Sat 28 October 2017

I'm just flying back from the International Bioacoustics Congress 2017, held in Haridwar in the north of India. It was a really interesting time. I'm glad that IBAC was successfully brought to India, i.e. to a developing country with a more fragmented bioacoustics community (I think!) than in the west. For me, getting to know some of the Indian countryside, the people, and the food was ace. Let me make a few notes about research themes that were salient to me:

So about those false-colour "long duration spectrograms". I've been advocating this visualisation method ever since I saw Michael Towsey present it (I think at the Ecoacoustics meeting in Paris). Just a couple of months ago I was at a workshop at the University of Sussex and Alice Eldridge and colleagues had been playing around with it too. At IBAC this week, ecologist Liz Znidersic talked really interestingly about how she had used them to detect a cryptic (i.e. hard-to-find) bird species. It shows that the tool helps with "needle in a haystack" problems, including those where you might not have a good idea of what needle you're looking for.

In Liz's case she looked at the long-duration spectrograms manually, to spot calling activity patterns. We could imagine automating this, i.e. using the long-duration spectrogram as a "feature set" to make inferences about diurnal activity. But even without automation it's still really neat.

Anyway back to the thematic listings...

As usual, my apologies to anyone I've misrepresented. IBAC has long days and lots of short talks (often 15 minutes), so it can all be a bit of a whirlwind! Also of course this is just a terribly partial list.

(PS: from the archives, here's my previous blog about IBAC 2015 in Murnau, Germany.)

science · Sun 15 October 2017

I just spent a couple of weeks in northern India (to attend IBAC). Did I have some great food? Of course I did.

No fancy gastro stuff, just tried the local cuisine. Funny thing is, as an English person and living in East London for more than a decade, I already knew plenty about the food! Or at least I knew more than the other international delegates did. The Indian names, the difference between naan and chapatti and poori, for example, or what's a palak paneer.

Credit's got to go to Sarab Sethi for showing me his favourite street food when we were in downtown Haridwar - aloo tikki, which is fried potatoes and yogurt with sweet and savoury sauces. Kind of like an Indian equivalent of chips with mayo+ketchup, maybe... or maybe that doesn't do it justice! And thanks Phil for the photo:

In the north India region, two staples of the local cuisine seem to be dhal makhani (buttery thick lentils) and paneer (cheese) curry, which we were served often. To be honest, though, those are very rich, too much to eat every day. On quite a few days I went for a lovely dosa instead - a crispy curry-filled pancake, which is from south India originally, and also not an everyday meal but I do like it.

I did have some top dishes locally though. At a dhaba (a roadside caff) we had some excellent-tasting chana (chickpeas) - no idea how they were flavoured so well but they were. The curried brinjal (aubergine) dishes were good too, nice and fresh-tasting.

In Delhi (at the start of the trip) I found a well-reputed eatery called Khaki di Hatti. They do massive naan breads - seriously massive, I recommend you agree to order the "baby" one when they suggest it! - and I had a great spinach kofte dish there. It was kofte ("meatballs", you might say, though vegetarian in this case) made of breadcrumbs and who-knows-what-else, in a lovely savoury spinach sauce.


Eating veggie is easy in this region, since so many people are veggie. But then so is eating non-veggie. Lots of places are "pure veg" and in most other places it's really clearly labelled.

food · Sat 14 October 2017

A nice fresh pea soup can be great sometimes, and also a good thing to do with leftovers. This worked well for me when I had some leftover spring onions, creme fraiche and wasabi. You can of course leave out the wasabi, or swap the creme fraiche for cream or a dab of milk, or you could add watercress perhaps.

Boil a kettle.

In a smallish pan melt the butter. Chop the spring onions, and fry the white bits gently to soften them, about 4 minutes. Then add the green bits of the spring onions, as well as the peas and the tiny dab of wasabi.

Turn up the heat and also add the boiling water, just enough to cover things. Once you've brought the pan to the boil you can turn it right down low, put a lid on it, and let it bubble gently for approx 10 minutes, no need for more.

Take the pan off the heat, and with a hand blender you can whizz up the pan's contents to blend it to a smooth soup. Add the black pepper and creme fraiche and stir it through.

recipes · Tue 26 September 2017
