I am a research fellow, conducting research into automatic analysis of bird sounds using machine learning.
I'm really pleased about the selection of presentations we have for our special session at ICEI2018 in Jena (Germany), 24th-28th September. The session is chaired by Jérôme Sueur and me, and is titled "Analysis of ecoacoustic recordings: detection, segmentation and classification".
Our session is special session S1.2 in the programme and here's a list of the accepted talks:
We also have poster presentations on related topics:
You can register for the conference here - early discount until 15th Sep. See you there!
I've been trying out an e-ink reader for my academic work.
"e-ink" - these are greyscale LCD-like displays. You see the image by light reflectance, almost the same way you read a printed page, not by luminance like a TV/laptop screen. This should be better in lots of ways: better on your eyes, low-power, and you can read outside. The low-power comes because it doesn't need a full jolt of energy 50 times a second as does an LED display: if the image doesn't change, no power is needed, the image stays there for free.
Why for academic work? A LARGE portion of my everyday work consists of looking at academic article PDFs, scribbling on them, then giving/implementing feedback. This comes from students, collaborators, reviews for journals/conferences, and from editing my own work as I do it. Some people can do this kind of stuff directly on a laptop screen. I'm afraid I can't. It's less effective, less detailed when I do that - so, for years, I've been printing things out, scribbling notes on them, then throwing them away afterwards.
If an e-ink reader can replace all that, maybe that's a good thing for the environment?
Note: it takes a lot of resources to build an e-reader. At what point is it "better" to print thousands of pages of paper, versus manufacture one e-reader? I don't know.
You can't use any old e-reader for academic reviewing: it needs to be large enough to render an A4 PDF well (ideally, full A4 size), and it needs some way of annotating. The one I'm trying has a stylus that you can use to scribble, and it works. Surprisingly good so far.
NOW THE NEXT STEP:
We sometimes have sunny days, you know. For some reason this often happens when we've a workshop or conference organised. "Why don't we have the session outside on the grass?" I'm tempted to say. The answer would be... because you can't really look at people's slideshow slides out on the grass. Pass a laptop around? Broadcast the slides to everyone's smartphones? Redraw everything from scratch on a flipchart? Meh.
What I'd like to see is an e-ink screen large enough to host a seminar with. The resolution doesn't need to be all that high, certainly not as high as is needed for reviewing PDFs. It just needs to be big. It would be great if there were a stylus or some other way of scribbling on the screen too.
Most academic slides are not animated. So an e-ink type screen is much more suitable than an LED screen, and would use much much less power. (Ever noticed the amount of cooling needed for those LED advertising signs in the street? Crazy power consumption.)
This week we've been at the LVA-ICA 2018 conference, at the University of Surrey. A lot of papers presented on source separation. Here are some notes:
An interesting feature of the week was the "SiSEC" Signal Separation Evaluation Challenge. We saw posters of some of the methods used to separate musical recordings into their component stems, but even better, we were used as guinea pigs, doing a quick listening test to see which methods we thought were giving the best results. In most SiSEC work this is evaluated using computational measures such as signal-to-distortion ratio (SDR), but there's quite a lot of dissatisfaction with these "objective" measures, since there's plenty that they get wrong. At the end of LVA-ICA the organisers announced the results of the listening test: surprisingly or not, they correlated broadly with the SDR measures, though there were some tracks for which this didn't hold. More analysis of the data to come, apparently.
From our gang, my students Will and Delia presented their posters and both went really well. Here's the photographic evidence:
Also from our research group (though not working with me) Daniel Stoller presented a poster as well as a talk, getting plenty of interest for his deep learning methods for source separation preprint here.
I've always thought fake meat was a bit silly. When I recently started eating more veggie food I promised myself I wouldn't have to eat Quorn pieces, those fake chicken pieces that taste bland and (unlike chicken) don't respond to cooking. They don't caramelise, they don't get melty tender, they just warm up. If you like cooking, you're much better off cooking some actual veg.
So it's a shock to be saying that some of the best meals I've had in 2017 have been fake meat. It seems the veggie world is just stepping up and stepping up. I've been lucky enough to travel for work and here are some amazing things I ate:
In Beijing, there was this braised fish dish, an extravagant centrepiece to a meal. A big pot of braised Chinese vegetables, and at the centre a mock fish steak. I don't know what it was made of but it had been slashed across the upper surface (like you would do with meat to get flavours in) and that upper surface was grilled and caramelised, while the lower part in the braising sauce was meltingly tender.
In Sweden, I got off the train in Lund and within a few minutes my eyes lighted on a kebab shop (Lunda Kitchen) with a massive list of things labelled "vegan": burgers, kebabs, pepperoni pizzas... My host actually said that he thought "vegan" probably didn't mean the same thing as it did in English. Anyway it does. Their vegan doner kebab was just ace: just meaty and spicy enough, all the trimmings as usual.
In Germany, I had this literally unbelievable vegan schnitzel (at Max Pett, Munich). It wasn't just that it had the taste of a breaded steak "Wiener art", but also the structure, the resistance and texture you expect when you cut into an actual schnitzel. The only reason I didn't grab the serving staff and double-check whether it was veggie or not was that I was in a very definitely vegan restaurant.
In France the seitan bourguignon was a great idea but the execution wasn't ideal. However we had excellent seaweed "tartare" and artichoke "rillettes", both of which captured specific je-ne-sais-quoi tastes of the traditional dishes they were paying tribute to. These were in various Paris vegan bistros.
In India... I didn't have any fake meat at all. I had some amazing dishes, since they've a massive history of veggie cuisine of their own, but it doesn't centre around fake meat.
Back in London? Yes there's plenty of good food around, such as vegan doner kebab or cheezburger from "Vx". But... the veggie version of a roast beef Sunday lunch? I haven't seen it yet...
Excited today to get a delivery of the new mail-order vegan cheese from my friend's new London cheezmakery, Black Arts Vegan! It came beautifully packed, see:
Their first cheese is a vegan mozzarella. We unpacked the cheese and had a taste - yes, a good clear taste like standard mozzarella. But they've worked on getting it right so it goes melty and gooey, and browns nicely in the oven. So let's try it on a pizza!
It really does come into its own on the pizza - the lovely warm melted mozzarella consistency is great, and it's easy to forget that it's plant-based and not dairy. Magic :)
Tamarind is ace. It imparts a deep, rich and sweet flavour to curries. Buy a block and put it in your fridge, it keeps for months, and you can hack a piece off and chuck it in your curry just like that. That's what I did in this lovely chana (chickpea) curry.
Note that the block sort-of dissolves as it cooks, and leaves behind inedible pips. If you prefer not to spit out pips then you could put the tamarind in a paper teabag perhaps, so you can fish it out afterwards.
You can change the veg choices in here - the red pepper is a nice bright contrasting flavour - but in particular the baby aubergines do this great thing of going gooey and helping to create the sauce. Full-sized aubergines don't seem to do that, in my experience. It's the tamarind and the aubergine that add body to the sauce, I think - I don't add any tomato or anything like that, and yet the sauce is flavoursome and thickened.
Heat the oil in a largeish deep pan which has a lid, on quite a hot frying heat. Add the spice seeds and the cloves - you might like to put the lid half-on at this point because as the seeds fry and pop they'll jump around and may jump out at you.
After 30 secs or so with the seeds, add the onion, then the chilli and the powdered spices. Give it a good stir round. Let the onion fry for a minute or two before adding the red pepper and the aubergines. Fry this all for another couple of minutes, stirring occasionally.
Add the chickpeas, the beetroot with its juices, the tamarind block, and maybe 1 cup of boiling water (don't add too much water - not enough to cover the mixture). Give this a good stir, then put the lid on, turn the heat down to its lowest, and let it bubble for 30 minutes or so. It can be longer or shorter, I'd say 20 minutes is an absolute minimum. No need to stir now, you can go and do something else, as long as you're sure it's not going to bubble over!
When the curry is nearly ready, take the lid off, turn the heat up to thicken the liquid if needed, and give it all a stir.
Give it a good twist of black pepper, then serve it up in bowls, with coriander leaf sprinkled on top. Serve it with bread (eg naan or roti).
I'm very happy to publish a video of this installation piece that Sarah Angliss and I collaborated on a couple of years ago. We used computational methods to transcribe a dawn chorus birdsong recording into music for Sarah's robot carillon:
We presented this at Soundcamp in 2016. We'd also done a preview of it at an indoor event, but in this lush Spring morning with the very active birds all around in the park, it slotted in just perfectly.
If you listen, you'll find that obviously the bells don't directly sound like birds singing. How could they! Ever since I started my research on birdsong, I've been fascinated by the rhythms of birdsong and how strongly they differ from human rhythms, and what I love about this piece is the way the bells take on that non-human patterning and re-present it in a way that makes it completely unfamiliar (yet still pleasing). We humans are so used to birdsong as background sound that we fail to notice what's so otherworldly about it. The piece has a lovely ebb and flow, and is full of little gestures and structures. None of that was composed by us - it all comes directly from an automatic transcription of a dawn chorus. (We did of course make creative decisions about how the automatic transcription was mapped: for example, the pitch range we transposed to, chosen to get the best alignment between the birds' and the bells' singing ranges.) And in context with the ongoing atmosphere of the park, the birdsong and the children, it works really well.
The paper "Wasserstein Learning of Deep Generative Point Process Models" published at the NIPS 2017 conference has some interesting ideas in it, connecting generative deep learning - which is mostly used for dense data such as pixels - together with point processes, which are useful for "spiky" timestamp events.
They use the Wasserstein distance (aka the "earth-mover's distance") to compare sequences of spikes, and they do acknowledge that this has advantages and disadvantages. It's all about pushing things around until they match up - e.g. move a spike a few seconds earlier in one sequence, so that it lines up with a spike in the other sequence. It doesn't nicely account for insertions or deletions, which is tricky because it's quite common to have "missing" spikes or added "clutter" in data coming from detectors, for example. It'd be better if this method could incorporate more general "edit distances", though that's non-trivial.
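For intuition, the 1-D case is cheap to compute. Here's a tiny sketch (my own illustration, not the paper's code) treating each spike sequence as an empirical distribution over time; the spike times are made up:

```python
import numpy as np
from scipy.stats import wasserstein_distance

# Two spike-time sequences (seconds) - purely illustrative values
spikes_a = np.array([0.2, 1.1, 3.0, 4.5])
spikes_b = np.array([0.3, 1.0, 2.9, 4.6])

# 1-D earth-mover's distance between the two sets of spike times,
# treating each sequence as an empirical distribution over time
print(wasserstein_distance(spikes_a, spikes_b))
```

Note that this treats the two sequences as normalised distributions, so a dropped or extra spike barely registers - which is exactly the insertion/deletion blind spot mentioned above.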
So I was thinking about distances between point processes. More reading to be done. But a classic idea, and a good way to think about insertions/deletions, is called "thinning". It's where you take some data from a point process and randomly delete some of the events, to create a new event sequence. If you're using Poisson processes then thinning can be used for example to sample from a non-stationary Poisson process, essentially by "rejection sampling" from a stationary one.
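To make that concrete, here's a minimal sketch of the classic thinning construction (my own illustration; the rate function and numbers are arbitrary): sample a stationary Poisson process at an upper-bound rate, then keep each event with probability lambda(t)/lambda_max.

```python
import numpy as np

def sample_nonstationary_poisson(rate_fn, rate_max, t_end, rng=None):
    """Sample a non-stationary Poisson process on [0, t_end] by thinning.

    rate_fn: intensity function lambda(t), assumed to satisfy lambda(t) <= rate_max.
    """
    rng = rng or np.random.default_rng()
    # Step 1: sample a stationary Poisson process at the upper-bound rate
    n = rng.poisson(rate_max * t_end)
    candidates = np.sort(rng.uniform(0, t_end, size=n))
    # Step 2: keep each candidate event with probability lambda(t) / rate_max
    keep = rng.uniform(size=n) < rate_fn(candidates) / rate_max
    return candidates[keep]

# e.g. a rate that rises and falls over a 10-second window
events = sample_nonstationary_poisson(lambda t: 5 * np.sin(np.pi * t / 10) ** 2, 5.0, 10.0)
```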
Thinning is a probabilistic procedure: in the simplest case, take each event, flip a coin, and keep the event only if the coin says heads. So if we are given one event sequence, and a specification of the thinning procedure, we can define the likelihood that this would have produced any given "thinned" subset of events. Thus, if we take two arbitrary event sequences, we can imagine their union was the "parent" from which they were both derived, and calculate a likelihood that the two were generated from it. (Does it matter if the parent process actually generated this union list, or if there were unseen "extra" parent events that were actually deleted from both? In simple models where the thinning is independent for each event, no: the deletion process can happen in any order, and so we can assume those common deletions happened first to take us to some "common ancestor". However, this does make it tricky to compare distances across different datasets, because the unseen deletions are constant multiplicative factors on the true likelihood.)
We can thus define a "thinning distance" between two point process realisations as the negative log-likelihood under this thinning model. Clearly, the distance depends entirely on the number of events the two sequences have in common, and the numbers of events that are unique to them - the actual time positions of the events has no effect, in this simple model, it's just whether they line up or not. It's one of the simplest comparisons we can make. It's complementary to the Wasserstein distance which is all about time-position and not about insertions/deletions.
This distance boils down to:
NLL = -( n1 * log(n1/nu) + n2 * log(n2/nu) + (nu-n1) * log(1 - n1/nu) + (nu-n2) * log(1 - n2/nu) )
where "n1" is the number of events in seq 1, "n2" in seq 2, and "nu" in their union.
Does this distance measure work? Yes, at least in limited toy cases. I generated two "parent" sequences (using the same rate for each) and separately thinned each one ten times. I then measured the thinning distance between all pairs of the child sequences, and there's a clear separation between related and unrelated sequences:
Distances between distinct children of the same process: Min 75.2, Mean 93.3, Median 93.2, Max 106.4
Distances between children of different processes: Min 117.3, Mean 137.7, Median 138.0, Max 167.3
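Here's a sketch of that kind of toy experiment, using the thinning_distance function above (the rate, duration and keep-probability are my own arbitrary choices, so the numbers it prints won't match those above exactly):

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
rate, t_end, keep_prob = 50.0, 10.0, 0.5   # assumed toy parameters

def sample_parent():
    # Stationary Poisson process on [0, t_end]
    n = rng.poisson(rate * t_end)
    return np.sort(rng.uniform(0, t_end, size=n))

def thin(parent):
    # Keep each parent event independently with probability keep_prob
    return parent[rng.uniform(size=len(parent)) < keep_prob]

parents = [sample_parent(), sample_parent()]
children = [[thin(p) for _ in range(10)] for p in parents]

same = [thinning_distance(a, b)
        for kids in children for a, b in itertools.combinations(kids, 2)]
diff = [thinning_distance(a, b)
        for a in children[0] for b in children[1]]

print("same parent:      min %.1f  mean %.1f  max %.1f" % (np.min(same), np.mean(same), np.max(same)))
print("different parent: min %.1f  mean %.1f  max %.1f" % (np.min(diff), np.mean(diff), np.max(diff)))
```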
This is nice because it's easy to calculate, etc. To be able to do work like in the paper I cited above, we'd need to be able to optimise against something like this, and even better, to combine it into a full edit distance, one which we can parameterise according to the situation (e.g. to balance the relative cost of moves vs. deletions).
This idea of distance based on how often the spikes coincide relates to "co-occurrence metrics" previously described in the literature. So far, I haven't found a co-occurrence metric that takes this form. To relax the strict requirement of events hitting at the exact same time, there's often some sort of quantisation or binning involved in practice, and I'm sure that'd help for direct application to data. Ideally we'd generalise over the possible quantisations, or use a jitter model to allow for the fact that spikes might move.