My PhD thesis now online

I'm glad to say the thesis corrections have been approved so my PhD thesis is now in its finished form - available here:

The title is "Making music through real-time voice timbre analysis: machine learning and timbral control". (Tip for future PhDs, try to choose a title that you can say in one breath...)

I'm really grateful to all the fab people in C4DM - I've got so much from being in a research environment with so many people knowledgeable about such a variety of cool things - and, well, I don't want to rewrite the whole acknowledgments here (they're on page 3), but thanks to all the people who took part in experiments or just chatted about research. (Including the folks at humanbeatbox.com.)

The thesis is available under Creative Commons. And, because I uploaded it to archive.org, they also seem to have converted it into some crazy ebook formats, so you can presumably read a garbled version of it on your Kindle if you like ;) It's probably best to use the original PDF if possible, though (the TeX source is also included).

Saturday 7th August 2010 | science | Permalink

SMC 2010 conference notes

I've just been at SMC2010, the Sound and Music Computing conference. It's the first time I've been so one question I had was, what differentiates it from other conferences in this research area like NIME, DAFX, ISMIR, ICMC? What's its specialist subject? The answer is that it deliberately tries not to over-specialise, they keep the topic broad to encourage cross-disciplinary thinking, and there's a good strong representation of young researchers so it's a good place for fresh ideas and making new connections. My paper about timbre remapping came across pretty well I think.

One reason I was keen to go to this conference was that it was hosted by UPF's Music Technology Group in Barcelona, because that group is the main place where people have done research on very similar lines as my PhD topic of beatbox-based control. It was great to meet Jordi Janer whose PhD was about singing-based control, and Marco Marchini and Hendrik Purwins who presented a poster about a kind of rhythmic beatboxing equivalent to the continuator - give it a piece of rhythmic audio and it will try to continue by chopping up the sound and outputting patterns in (hopefully) the same style. The most interesting part of their work is the automatic approach to clustering, where they hierarchically cluster all the sound events, and then let the system choose the appropriate clustering level (i.e. how many clusters to lump the events into) at playback time, by judging how 'informative' the markov-model resynthesis is at each level of clumpiness.

Also interesting was Ho-Hsiang Wu and Juan Bello's poster about representing the musical structure of a song. We all know that many songs have repetition in them, whether it's verse-chorus-verse-chorus or something else - and we can analyse this automatically from the audio, for example by detecting repeated sub-sequences of chord patterns or timbre. Their contribution is to visualise this detected repetition using 'arc plots', pretty little monochrome rainbows that reminded me of the kind of information aesthetics practised by Information Is Beautiful. The end result is that pop songs create little plots which generally all look quite similar but with little shape differences that you could spot by eye, whereas I imagine classical music pieces would probably each have their own visual signature that could be quite different. Could be a nice way to get an instant visual impression of the musical structure of a piece of recorded music.

The keynote talk by Ricard Sole was thought-provoking, discussing the theory of complex networks, with some results of his created by applying this theory to languages, software, and other things. Sound and music wasn't mentioned, but I know it's useful stuff that was food for thought for many people. (In our group we have some researchers who have looked at this kind of thing already - when you consider the network of MySpace bands & friends, for example, that's a complex network where issues of small-world-ness, hubs, etc. all come up. Which reminds me, I wonder how Kurt is getting on with his thesis... :)

In fact some of the research presented at SMC was grappling with these issues too, such as the work by Martin Gasser et al showing that the problem of hubs in music similarity (i.e. songs that keep getting returned as good similarity matches to various input songs, even if they don't sound that similar) may be affected by the "homogeneity" of the audio in the music database.

The concert programme was packed full of things: lots of soundscape-based work, and more generally electroacoustic stuff. My favourites out of those were Impulsus I by Lina Bativa (an audio-visual piece which had a great narrative energy despite being really abstract), and Juan Parra Cancino's reacTable performance which I mentioned in my post about the reacTable.

But one of the things I was most grateful for was the deliberate non-art-music session. Electroacoustic stuff is all very well, but I can't generally cope with so much of it packed into a week and after all, this is a broad conference where many of the researchers are working on pop music, techno, breakbeats, and stuff like that. As the conference chair (Xavier Serra) said, it's actually quite difficult to get the non-art-music in the conference, since research conferences aren't usually their scene and most of the good examples of techno-enhanced popular music are quite happily making music in front of normal crowds... So, many of us were glad to spend an hour listening to Japanese pop made using Vocaloid, and a dance set made using Loopmash. (Sergi Jorda also told me he had hoped to get a dance music set in the reacTable concert, but the performer wasn't available.)

This is something that we need to work on as a research community - the SMC hosts did well, assisted by the fact that some of their own technology has gone directly and quite notably into music tech used by producers - but it's one of those things that's going to need a constant bit of extra effort to try and encourage that kind of thing into these conferences.

Monday 26th July 2010 | science | Permalink

Automatic birdsong analysis

I've started my first project after my PhD, a small feasibility study into automatic birdsong analysis.

The picture visualises a few seconds of a skylark recording by Dr Elodie Briefer (in QMUL's School of Biological and Chemical Sciences), from her PhD research into the structure of skylark song.

What we're doing is looking at the potential for automatically analysing birdsong signals, which could mean picking them out of recordings, identifying species, identifying individual "syllables" in the song... who knows.

There are already a fair few published research papers about automatic birdsong analysis. I'm looking at the state of the art to determine the scope for future work, such as applying machine learning techniques we've developed in our group, or particular forms of signal analysis such as adaptive transforms.

In my PhD I was looking a lot at voice and music. Birdsong has interesting similarities to both music and spoken language - plus differences of course. So watch this space. And of course get in touch if you're interested.

Monday 5th July 2010 | science | Permalink

Unpredictable impact

There's a big change happening in UK science+engineering at the moment, and it goes by the name of Impact. What does it mean? When we do science we often do it just to find new things out, yet whether we intend it or not one of the great things about science is that it actually makes important changes to the world outside our research group. Impact is formally defined as being that effect that we have - on business and economy, on health, on public policy, on culture and the arts. There are billions of ways that impact spreads.

This has always been a very unpredictable thing and pretty hard to measure, so the government now has created a formal process for trying to account for the types of impact that we get out of research - and even further, to think hard about impact when deciding what research to fund. In a lot of cases the predicted impact will now account for up to 25% of the considerations in rating academic departments or allocating funding.

Sounds reasonable? Well many scientists are against it - and it's not because they don't like having to justify themselves (they already have to do that when they write grant applications etc), but because the real impact of science often happens in surprising ways, sometimes many years down the line. Take DNA fingerprinting for example. The scientists who came up with it were working with DNA, trying to measure various things, but they had no idea that the best thing they could do was make an unruly collection of DNA form patterns on a sheet of film - they discovered it by accident. And now it's an important part of many of the most serious court cases we have. Think of all the people who were convicted or freed based on DNA evidence - that's some serious impact there.

There are lots more examples of unpredictable impact - such as:

  • Email, when it was invented, was only able to send messages to people using the same mainframe. No-one predicted that tweaking it to send messages around the world would make it one of the most important communication tools we have.
  • Gregor Mendel - a lone priest planting peas in a garden, trying out different cross-breeds and making careful notes. It wasn't until years after his death that biologists realised how Mendel's laws of inheritance fit with Darwinian evolution, and formed the foundation of modern biology, with massive impact throughout society.
  • Texting. A phone is for phoning, right? Text messages were never planned to be the mainstay of what mobile phones were about, just a way to get a message through when you couldn't talk. But now many people text more than they call.
  • Liquid crystal displays eventually arose from the basically curiosity-driven research of Friedrich Reinitzer looking at the chemical cholesteryl benzoate. Now it's used in TVs, phones, watches...
  • Fibre optics was demonstrated as a curiosity and a demonstration of physical principles in the 19th century; but it wasn't until way into the 20th century that it became important for data transmission, for example in phone networks.

And the opposite is also true - history is littered with examples of discoveries/inventions that were widely expected to change the world, but didn't:

  • Video messaging: the phone companies seem to have thought that if we liked text messaging we were going to love video messaging. No.
  • Artificial intelligence: In the 1960s the artificial intelligence research community was an incredibly optimistic one, with leading lights such as Marvin Minsky basically thinking they would be able to recreate the intelligence of a whole human brain within a few years, and then we'd all be having conversations with robot pals. That optimism came crashing down. Sure, you can now buy robot pals, and sure, we're still researching artificial intelligence and indeed using it in various applications, but it hasn't yet been the revolutionary impact it was going to be.
  • Hovercrafts and maglev: these have become the clichés of misplaced futurology. After their invention they seemed to have been poised to take over the world - but no, we're still mostly using the good old wheel to get around.

So with all this evidence, it's not surprising that scientists are worried about this new approach of trying to plan your impact - much of the curiosity-driven stuff that has real impact could well get sidelined in favour of things which might be a bit less imaginative but which seem like they'll definitely make some public or business connection.

OK fine - seems like there's some misguided bureaucracy coming down from government, and we have to try and make sure it doesn't end up stifling what it's supposed to be helping. But there's a bigger question that maybe we can think about. As I've said, "impact" is very hard to pin down or predict, and we don't really know how predictable it could or should be. But in many grant applications and suchlike, scientists are now writing down their predictions about the impact they'll have. Are those predictions useful data? Could we use "impact plans" as a great big study about whether impact can be predictable?

We could for example wait for five years, then look back at the pile of impact plans and ask, how many of those predictions (the ones which got funded, at least) came true? What percentage? What proportion of the observable scientific+engineering impact made over the next five years will have been predicted, in writing, in advance?

It would still leave a million questions unanswered, especially about unidentifiable impact (subtle things which are hard to count), long-term impact, and really it would still be a very reductive way to think about how science affects our society. But I wonder... would that make all these "impact statements" worth their while?

Friday 4th December 2009 | science | Permalink

Tree recursion, python/octave/matlab/sc3, informal benchmark

I'm writing a tree data structure as part of my research. I'm not going to describe the algorithm in detail, but it takes a set of data points and repeatedly chops them into two groups so that you can divide a dataset up into spatial subgroups.
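The post doesn't give the algorithm, so the following is only a generic sketch of the "repeatedly chop into two groups" idea. It assumes a median split along whichever axis has the widest spread (a k-d-tree-style rule picked purely for illustration, not necessarily what was benchmarked here), and the names `build_tree` and `leaves` are hypothetical.

```python
import random

def build_tree(points, min_size=2):
    """Recursively split a list of equal-length tuples into two spatial groups.

    Assumption for this sketch: split at the median of the widest axis.
    """
    if len(points) <= min_size:
        return {"points": points}  # leaf: a small spatial subgroup
    dims = range(len(points[0]))
    # Choose the axis along which the points are most spread out.
    axis = max(dims, key=lambda d: max(p[d] for p in points) - min(p[d] for p in points))
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return {
        "axis": axis,
        "threshold": ordered[mid][axis],
        "left": build_tree(ordered[:mid], min_size),
        "right": build_tree(ordered[mid:], min_size),
    }

def leaves(node):
    """Collect the point groups stored at the leaves, left to right."""
    if "points" in node:
        return [node["points"]]
    return leaves(node["left"]) + leaves(node["right"])

# Usage: 100 made-up 3D points, chopped into groups of at most 2.
rng = random.Random(0)
data = [(rng.random(), rng.random(), rng.random()) for _ in range(100)]
groups = leaves(build_tree(data))
```

Because each call recurses into two halves, the running time depends heavily on how cheap function calls and list slicing are in a given language, which is the sort of thing an informal benchmark like this ends up measuring.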

Anyway, my first implementation (in SuperCollider 3) was running fairly slowly so I tried it in three other languages, to see which would be most practical for my situation.

It's an informal kind of benchmark - informal cos I'm not going to show you the code, and I haven't run the tests dozens of times, etc. (Some of the tests I ran just once, since they took so long.) The datasets consisted of artificially-generated 3D points sampled from a mixture of a cubic and a toroidal distribution. In the following graph, lower results (shorter times) are better:

The results show a couple of interesting things. SuperCollider was my starting point and it was never developed for large data-crunching tasks so I'm not surprised that it becomes the worst performer once we get to large datasets, although it actually doesn't do too badly. To be ten times as slow as Python or Matlab on big datasets is not embarrassing when both of those have had so many more person-hours of development effort specifically for big data crunching.

The comparison against Octave is illuminating. Octave was originally my open-source Matlab alternative of choice, but I've come to feel like it has all the drawbacks of Matlab (mainly the godawful design of the Matlab language) and none of the advantages (under-the-hood optimisation tricks, great plotting). Here I was running exactly the same code in Matlab (7.4) and Octave (3.0.5). I expected Octave to be roughly competitive, since this branching recursive code is quite difficult to auto-optimise, but Matlab generally handles it something like ten times as fast. So here I find another sign that Octave isn't quite there.

I now know, of course, that Python + numpy is the open-source Matlab alternative of choice. The language design is much better, and numpy (the module that provides all the matrix-crunching tools) has undergone lots of development effort and become better and better. And this (informal!) benchmark shows python (2.5.4, with numpy 1.3.0) performing just as well as Matlab on the large data.

(There is one thing that Python definitely lacks compared to Matlab: decent well-integrated 3D plotting. matplotlib doesn't have it except in old deprecated versions; python's gnuplot interface is poorly developed; other python plotting libs have drawbacks such as non-interactivity. I've mentioned this before.)

So I'll probably be using my Python implementation of the tree data structure. It's right up there in terms of speed, plus the code is conceptually cleaner than the Matlab version, so it'll be easier to maintain, and easier for others to grok, so it's better for reproducible research. Remember, this benchmark was only informal so do your own tests if you care about this kind of thing...

Tuesday 10th November 2009 | science | Permalink

Are probiotics real, or meaningless?

Today Danone was forced to withdraw an advert for probiotic yoghurt because the scientific evidence didn't support it. The company claimed it boosted children's "defences" and cited various research studies to support it. The Advertising Standards Authority read the studies and found that although the studies were good, most of them weren't about the children in question, some of them used the wrong dosage of yoghurt or an inappropriate test group, and overall the results were inconsistent and didn't particularly support the claim.

I'm interested in this because probiotics is one of those weird new turns in commercialism in which you can't quite tell if there's real science there, or if there is nothing but an actor on screen grinning and rubbing her belly, saying "I trust good bacteria" over and over again.

I've heard some scientists saying that probiotics have been shown to be good for ill people recovering in hospital (whose natural gut flora might need "topping up") but that the evidence isn't there yet for any point at all in healthy people gulping down these yoghurts once a day as if they were your daily medication.

There are moves afoot in the EU which sound to me like a good idea. In 2006 a new EU law came in, stipulating that all medical-sounding marketing claims must be verified, and they now have a committee which looks at the evidence and pronounces yes or no on them. The claims for various yoghurt drinks, as well as all kinds of other products, have been submitted to this committee. They made the judgment that general probiotic claims aren't supported by evidence, although they'll be looking at more specific manufacturers' claims later.

The change hasn't actually come into force yet, but when it does, hopefully it won't be down to us to peer at the TV advert and think to ourselves, "Is that science or is that bullshit?" - it's only reasonable that we shouldn't have to do that, and companies should have to prove their stuff works before they parade it around in scientific clothing.

Wednesday 14th October 2009 | science | Permalink

InterSpeech09 conference: emotional speech

The InterSpeech conference was in Brighton this year - now, my research is all about "non-speech" voice (e.g. beatboxing) but I took the opportunity to go down and see what the speech folks were up to.

Automatic speech recognition is the "traditional" problem for computers+speech, but there's been a tendency recently to try and automatically recognise the emotional content too. This year was the first year of the InterSpeech "emotion challenge", in which researchers were challenged to automatically detect a range of emotions in a dataset of audio - recorded from schoolchildren who were trying to guide an Aibo round a track, apparently with emotive consequences...

I was surprised that many of the approaches to emotion recognition were so similar to the standard speech-recognition model: take MFCCs plus maybe some other measurements, model them with GMMs, classify the results (maybe with a HMM) - so far, so 1960s. The spectral measures (MFCCs) were typically augmented with prosodic measures such as the number of pauses in a sentence, or measures about how the speaking pitch varied, and in quite a few of the papers it seemed that these prosodic features actually perform pretty strongly, often beating the spectral features. But I was surprised they were still relatively simple measures - no intricate prosody-specific models of temporal variation, for example; most seemed to use the average+minimum+maximum pitch. Combining the two types of data (spectral plus prosodic) was often the best but didn't seem to give a dramatic uplift vs using just one type. I suspect that more specific models could push the prosodic side a long way in the next few years. The winner of the "emotion challenge" was a kind of hand-designed decision-tree approach, pretty nice because they'd designed the classifier from theoretical motivations.

One thing about "emotion" is the same problem as for "timbre" (the musical attribute which I deal with in my research): it's still very hard to pin down exactly what you mean by it, specifically whether it's a continuous attribute or a set of categories. It seems that many datasets are labelled categorically - people mark a given word or sentence as being neutral/scared/happy/anxious/etc. But increasingly people are focusing on the continuous approach where emotion is treated as a 3D space, where one dimension is "arousal" (varying from calm to excited), one is "valence" (bad to good), and one is "potency" (dominated to dominant). If you combine those 3 dimensions variously you can cover the standard emotions pretty well (excitement, depression, boredom, anger, etc etc). This 3D approach gets around various cultural issues in the exact meaning of the labels, allows for some more refined analysis, and I believe it comes from a pretty well-validated area in psychology, although I don't know the literature on that.

Oh and there was a nice talk about automatically analysing and detecting laughter. Laughter is characterised by the bouts of vocal effort we push in, via the lungs and the tension in the vocal folds. That distinguishes it quite well from ordinary speech. So what these people did was a nice simple technique to estimate the glottal pulses (the moments of energy that come from our vocal folds), and to spot when these became more effortful and more frequent. You can't use an ordinary pitch tracker because each laugh is far too brief for a standard tracker to latch on to the quick pitch changes, but their custom analysis (plus a very basic classifier) seemed able to detect moments of laughter in TV talk shows etc. The analysis method (the zero-frequency filter) is technically very simple and potentially a useful trick...
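To give a flavour of how simple the zero-frequency filter is, here is a minimal sketch of the technique as it is usually described: difference the signal, run it through two cascaded resonators tuned to 0 Hz (which amounts to four running sums), then subtract a local mean a few times to remove the huge trend that builds up. Upward zero crossings of what's left are treated as candidate glottal pulse instants. This is not the code from the talk; the function names, the three trend-removal passes, and the choice of a window of roughly one to two pitch periods are assumptions for illustration.

```python
from itertools import accumulate

def moving_average(values, window):
    """Centred moving average (odd window); edges are effectively zero-padded."""
    half = window // 2
    n = len(values)
    return [sum(values[max(0, i - half):min(n, i + half + 1)]) / window
            for i in range(n)]

def zero_frequency_filter(signal, window, passes=3):
    """Return the zero-frequency filtered signal (values near the edges are unreliable)."""
    # Differencing removes any constant offset in the recording.
    y = [0.0] + [b - a for a, b in zip(signal, signal[1:])]
    # Two resonators at 0 Hz, each a double integrator: four running sums.
    for _ in range(4):
        y = list(accumulate(y))
    # Remove the polynomial trend by repeatedly subtracting the local mean.
    for _ in range(passes):
        trend = moving_average(y, window)
        y = [v - t for v, t in zip(y, trend)]
    return y

def upward_zero_crossings(y):
    """Indices where the filtered signal crosses from negative to non-negative."""
    return [i + 1 for i in range(len(y) - 1) if y[i] < 0 <= y[i + 1]]

# Usage: a synthetic pulse train with one pulse every 100 samples.
period = 100
pulses = [1.0 if n % period == 50 else 0.0 for n in range(6000)]
filtered = zero_frequency_filter(pulses, window=151)
marks = [m for m in upward_zero_crossings(filtered) if 600 <= m < 5400]
gaps = [b - a for a, b in zip(marks, marks[1:])]
```

On the synthetic pulse train the gaps between successive marks should match the pulse spacing. Detecting laughter would then be a matter of watching how those gaps and the strength of the crossings change over time, which is the part the sketch leaves out.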

Saturday 12th September 2009 | science | Permalink

Does processed meat cause cancer?

It's been on the radio news this morning, so it's timely that David Colquhoun has written this excellent article about diet and health. He goes through what the scientific evidence can and can't say about questions such as "Does eating processed food cause cancer?" - it's a long article but really clears things up.

Monday 17th August 2009 | science | Permalink

Vitamin supplements: avoid them?

This caught my eye in the paper this weekend: someone wrote in to the doctor's column asking if they should take vitamin A and E supplements to prevent cancer and heart disease, and the doctor's response was:

Several long-term and large trials have shown that taking extra vitamins A (such as betacarotene) and E does not reduce heart attack risk. In fact, some of the trials were stopped because there were more deaths in the vitamin groups than in those given placebos. As long ago as 14 June 2003 the Lancet reviewed the evidence and strongly discouraged any more research into the long-term use of such vitamin supplements. We get enough for our needs from a normal diet.

Blimey! I already knew that vitamin supplements were pointless (for healthy people) as long as you eat right. But do they actually do harm?

The doctor was referring to this 2004 review in the Lancet, which is a pretty good source. A web search also finds a 2008 Cochrane review of the evidence (another good source, but it's essentially an update of the earlier paper), which concludes:

We found no evidence to support antioxidant supplements for primary or secondary prevention. Vitamin A, beta-carotene, and vitamin E may increase mortality. Future randomised trials could evaluate the potential effects of vitamin C and selenium for primary and secondary prevention. Such trials should be closely monitored for potential harmful effects. Antioxidant supplements need to be considered medicinal products and should undergo sufficient evaluation before marketing.

This is pretty scary. According to these authors, there's no evidence that these supplements prevent cancer but there are hints that they might increase mortality? Such meta-analyses, when done properly, are very good ways to summarise the current state of research, but they're not set in stone - for example, when that review was published in the Lancet, the next issue featured some responses from some of the studies involved, who took issue with the general conclusion. But then, if the possibility of a negative effect looms strongly enough out of a systematic review like this, then it certainly needs to be considered.

Even this year more evidence arrives: this 2009 study finds that supplements of vitamins C or E or beta-carotene have no statistically significant effect on mortality (they don't increase or decrease the risk of death).

A couple of things to note:

  • This isn't about all vitamins, just about the vitamins mentioned above. As one correspondent notes, most people don't get enough Vitamin D, so maybe it's still worth taking Vitamin D supplements? (I haven't looked up any evidence about that yet.)
  • This is about vitamin supplements, not about vitamins in general. Fresh fruit and veg is a much better source of these vitamins in my opinion, and the evidence would seem to bear it out: here's a 2003 review which says, "A great deal of epidemiologic evidence has indicated that fruits and vegetables are protective against numerous forms of cancer." And here's a 2005 review which says a similar thing, and considers reasons why fruit and veg might be better than supplements.

Tuesday 28th July 2009 | science | Permalink

How does a PhD affect your salary?

In the lab we're chatting about what effect a PhD has on your career and your earning potential. This article is slightly old (2001) but it has some solid figures which are interesting:

Seems that a PhD in an electrical-engineering discipline (the closest match to ours) could raise your salary by around 8 or 9 percent.

Of course the economic car-crash puts a lot of things in question. But I'm glad at least that a PhD doesn't on average push your salary down, which some people say (and maybe it's true for some disciplines).

Wednesday 22nd April 2009 | science | Permalink

Distance analysis methods: Multidimensional Scaling and SplitsTree try to unravel the Tube map

In scientific research, one of the things you sometimes need to do is take a set of distance measurements (e.g. "it's 5 metres from A to B, 4 metres from A to C, and 3 metres from B to C") and try to reconstruct the actual spatial layout underlying that data.

So how to do it? Well one approach is Multidimensional Scaling (MDS) and it's been known for a few decades in timbre research. It assumes that the data exist in a Euclidean space (a pretty straightforward space like ordinary 3D space we're used to) and arranges the points in a layout that gives the least disagreement with the distance measurements. So if we have a set of musical timbre judgments (e.g. "a bassoon sounds quite like an oboe, but not much like a violin") we can try and force those objects into a spatial arrangement that suits their relationships, and then view the resulting map.
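As a rough illustration of the idea (and not the tool used for any of the plots in this post), here is a minimal sketch of classical MDS, one particular variant, in plain Python. It is applied to the A/B/C distances from the example above. The function name `classical_mds` and the little power-iteration eigen-solver are assumptions made to keep the sketch dependency-free; with numpy you would normally call a proper eigendecomposition routine instead.

```python
import math

def classical_mds(dist, dims=2, iters=500):
    """Place points in `dims` dimensions from a symmetric distance matrix."""
    n = len(dist)
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        v = [1.0 / (k + 1) for k in range(n)]  # arbitrary fixed start vector
        for _ in range(iters):  # power iteration towards the top eigenvector
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        eigval = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
        if eigval <= 0.0:
            break  # nothing left that a Euclidean axis can explain
        for i in range(n):
            coords[i][d] = v[i] * math.sqrt(eigval)
        # Deflate so the next round finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]
    return coords

# Usage: A-B is 5 metres, A-C is 4 metres, B-C is 3 metres.
distances = [[0, 5, 4],
             [5, 0, 3],
             [4, 3, 0]]
a, b_pt, c = classical_mds(distances)
```

These three distances happen to fit a flat triangle exactly, so the recovered layout reproduces them (up to rotation and reflection). With real timbre judgments the fit is only approximate, which is where the trouble starts.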

But there's a problem. Who in the world said that audio timbre behaved like a standard Euclidean space? Does it depend on context? (Yes.) Is the difference between A and B always the same as the difference between B and A? Does timbre behave more like categories (e.g. woody vs metallic vs watery) than like a space?

That's a big problem and there's no clear solution. I saw a talk by Ashley Burgoyne at ICMC 2007 which suggested some modifications to MDS to help account for the weirdness of timbre-space. Some of it makes intuitive sense: e.g. the use of "specificities" builds in the idea that one data-point may be more unique than it should be, having its own special distance to cover the fact that a trumpet sounds uniquely different from everything else. And he argued that the nonlinear versions coped better with the evidence about timbre judgments.

Then I heard about another completely different approach. Geneticists have developed rather clever ways of analysing the genes of different creatures, to produce "genetic distance" measures and then use those to reconstruct what the evolutionary tree could have been. The maths can be applied to any set of distance measurements (aha!) and creates a tree that best represents them - the "tree" is actually a kind of space, not the same as Euclidean space.

For an introduction to the maths involved, see Metric Spaces in Pure and Applied Mathematics.
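One way to get a feel for what "tree-like" means for a set of distances is the four-point condition, a standard property of tree metrics: take any four items, pair them up in the three possible ways, and the two largest pair-sums must be equal. The sketch below measures how far a distance matrix is from satisfying that. It is not part of SplitsTree, and the function name `tree_deviation` and the two toy matrices are made up for illustration.

```python
import math
from itertools import combinations

def tree_deviation(dist):
    """Largest violation of the four-point condition (0 means perfectly tree-like)."""
    worst = 0.0
    for i, j, k, l in combinations(range(len(dist)), 4):
        sums = sorted([dist[i][j] + dist[k][l],
                       dist[i][k] + dist[j][l],
                       dist[i][l] + dist[j][k]])
        # In a tree metric the two largest of the three sums coincide.
        worst = max(worst, sums[2] - sums[1])
    return worst

# A star-shaped tree: four leaves hanging off one hub, branch lengths 1 to 4.
branches = [1, 2, 3, 4]
star = [[0 if i == j else branches[i] + branches[j] for j in range(4)]
        for i in range(4)]

# Four corners of a unit square in ordinary Euclidean space (0, 1, 2, 3 in order).
diag = math.sqrt(2)
square = [[0, 1, diag, 1],
          [1, 0, 1, diag],
          [diag, 1, 0, 1],
          [1, diag, 1, 0]]

star_dev = tree_deviation(star)      # distances measured along a tree
square_dev = tree_deviation(square)  # distances measured across a plane
```

The star comes out at zero deviation, while the square does not: perfectly ordinary Euclidean data can be decidedly un-tree-like, which is why the two kinds of analysis can give such different pictures of the same measurements.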

I needed to get my head around how this approach might work, and whether it might be useful. So I decided to apply it to a weird space in which distance measurements might not correspond to actual spatial distances... the Tube map.

If you've been on the Tube you'll know that some journeys are longer than they should be, and the durations don't actually match up with the geographic distances they take you. You'll also know that the Tube map itself is highly nonlinear: the geographical layout is warped to make it neat and easy to read.

So I took this section of the Tube map:

and from the web I found two different sorts of data:

  1. how long it takes (in minutes) to walk from one station to another, overground;
  2. how long it takes to get from one station to another by tube.

Now the first set of data should be "more Euclidean" since walking is basically going in a straight line except for the buildings in the way; while the tube timings should be weirder because you're strongly constrained, there's only a few pipes you can go down and they don't always connect up in all the obvious ways.

So when you feed the walking-times into MDS you get this (I've painted the tube-lines back onto the map to make things more obvious):

Not bad eh? The arrangement is actually quite a lot like the real-world layout of the tube stations.

And here's what happened when the same walking-times were fed into SplitsTree:

Yes, it kind of works, except that Russell Square pokes out a bit weirdly, I think due to the algorithm's requirement that the data points sit at the edge of the graph. The SplitsTree representation is almost-but-not-quite happy to represent the data in 2D, shown by the patchwork of almost-rectangles.

Here's where the differences really show up though: the tube-timing data. The walking-time data was "easy"...

Tube-timing data after MDS:

Tube-timing data after SplitsTree:

Note that both algorithms push the circle line (the yellow line) away from the others, out towards the top-right of the space. That's because the circle line, although it crosses over the others, doesn't have as many intersections as it might do (it doesn't have a stop at Euston or Warren Street, for example). Both algorithms spot that Kings Cross is a hub in this network (meaning it's easy to get to most of these stops from Kings Cross), placing it right at the heart of the layout. More generally, neither algorithm reconstructs the geographical layout of the stations, simply because the time it takes to get from A to B isn't so much defined by geography but by the peculiarities of London Underground.

The SplitsTree representation seems here to use a lot of 3D boxes, and there are some convoluted goings-on inside the way it tries to rationalise all the distances.

Notice also that on the SplitsTree diagram, most stations have their own little spike to live on. These are similar to the "specificities" I mentioned earlier - each tube station takes that little bit of extra time because of the time needed to get up and down the escalators (or whatever). For the Piccadilly (dark blue) line, SplitsTree seems to suggest that the majority of the time taken is in getting up and down and the actual journey between stations is pretty quick, which I think pretty much reflects reality.

I did all this in order to try and grok the tree reconstruction algorithms. Not sure if I've got there yet, but this was definitely helpful...
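For anyone wanting to try the MDS half of this at home, here's a minimal sketch. It assumes classical (Torgerson) MDS, which may not be the exact variant behind the plots above, and it uses NumPy. The function name `classical_mds`, the four stations and their walking times are all made up for illustration (four points on a 3-by-4 rectangle), not the real tube data.

```python
import numpy as np

def classical_mds(dist, n_dims=2):
    """Place points in n_dims dimensions from a symmetric matrix of pairwise distances."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Double-centre the squared distances to get an inner-product (Gram) matrix
    centring = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centring @ (dist ** 2) @ centring
    # eigh returns eigenvalues in ascending order, so pick the largest n_dims
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_dims]
    # Non-Euclidean data (such as journey times) can give negative eigenvalues; clip them
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

# Hypothetical walking times (minutes) between four invented stations
stations = ["A", "B", "C", "D"]
times = np.array([
    [0.0, 3.0, 5.0, 4.0],
    [3.0, 0.0, 4.0, 5.0],
    [5.0, 4.0, 0.0, 3.0],
    [4.0, 5.0, 3.0, 0.0],
])

coords = classical_mds(times)
for name, (x, y) in zip(stations, coords):
    print(f"{name}: ({x:.2f}, {y:.2f})")
```

Because these invented times really are distances on a flat rectangle, the layout comes back exactly (up to rotation and reflection). Journey times on the tube don't have that property, which is one way to see why the tube-timing plots look so much less like a map.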

Wednesday 11th February 2009 | science | Permalink

10 new PhD places in Media and Arts Technology

Our research group has 10 new fully-funded PhD places in Media and Arts Technology thanks to a big grant we've been awarded. The places include working with an industrial partner such as last.fm, the BBC, or Sony. If you know anyone who might be into that, let them know...

Tuesday 10th February 2009 | science | Permalink

Chaos theory is like biology used to be

Looking through the International Journal of Bifurcation and Chaos, the thing that strikes me is that chaos theory seems to be at the same kind of point that biology was at, before Darwin's work gave it a structure and an explanation. In the 19th century biologists would publish articles describing new species they'd found, saying it's a bit like this one, a bit like that one, but without evolution and genetics you can't really say much more than that - and you get the same feeling from modern chaos papers: look, I've found a new chaotic attractor, it's a double-scroll, it makes patterns like this.

There are all sorts of ways of categorising chaotic systems, characterising their general surface behaviour, even controlling them, but it looks like nothing really gets to the heart of what's going on. Is the study of chaotic systems waiting for some big explanation?

Wednesday 4th February 2009 | science | Permalink

My work in a BBC radio programme

The BBC reported on the "Augmented Instruments" concert that Jean-Baptiste Thiebaut organised a couple of weeks ago. As part of the feature, I gave a quick demo of my beatboxing synthesiser interface... Check out the podcast of the radio programme:

Tuesday 23rd September 2008 | science | Permalink

Some notes from DAFx08

Just returning from DAFX 2008 in Espoo, Finland, which was a good do. My first visit to DAFX - it's a smaller and friendlier conference than some others I've been to, a nice size (about 120 people). Met up with lots of good digital audio people, some new, some old. Some notes about a few topics that came up:

  • Vesa Valimaki's digital sound synthesis tutorial was good, including some tips about low-cost synth techniques ("Differentiated Parabolic Wave") coming from his lab, new to me. Similarly Ville Pulkki's spatial sound tutorial and demo, featuring the DirAC technique which seemed to give some nice sonic results.
  • Our lab was well-represented, and it was nice that Anssi Klapuri picked up on Becky Stewart's spatial music navigation ideas in his keynote. My talk on voice timbre went fine too, despite the interruption of an automatic blackboard...
  • The keynote by Jyri Huopaniemi (of Nokia) didn't have as much news as I was hoping, but it was nice to see a bit about how the Princeton group's mobile-phone synth system is put together, a Python interface onto a C++ synthesis core.
  • Naofumi Aoki's poster on bandwidth extension of mobile phone audio was interesting, although not specifically for the bandwidth extension but for the steganography trick used to embed metadata into audio. This means you can do fancy things with mobile phone audio without having to change the way the worldwide phone system works...
  • There were quite a few good papers about guitar synthesis and guitar amp emulation, etc. Worth mentioning is Fredrik Eckerholm's guitar synth, just because to my ears it sounded very nice and had a lot of features (e.g. pickup placement, pick parameters).
  • Jari Kleimola's sound synthesis trick - essentially XOR on audio - caught a few people's attention, making some quite nice sounds despite its simplicity.
  • Damian Murphy's results on the quality of different DWM reverb techniques were interesting, although it's not my field so I can't judge it in detail.
  • Was nice to see spectutils which is a nice set of spectrogram plotting tools for GNU Octave. Should be useful.
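To give a flavour of the "XOR on audio" idea mentioned in the list above, here's a rough sketch. It is not Jari Kleimola's actual algorithm, just the simplest thing that name suggests: a bitwise XOR of two signals stored as 16-bit integer samples. The sample rate, the two sine frequencies and the function names are my own assumptions.

```python
import numpy as np

SAMPLE_RATE = 44100  # assumed sample rate in Hz

def sine_int16(freq_hz, duration_s, amplitude=0.8):
    """Generate a sine wave as signed 16-bit integer samples."""
    t = np.arange(int(SAMPLE_RATE * duration_s)) / SAMPLE_RATE
    wave = amplitude * 32767 * np.sin(2 * np.pi * freq_hz * t)
    return np.round(wave).astype(np.int16)

def xor_signals(a, b):
    """Combine two int16 signals sample-by-sample with bitwise XOR."""
    return np.bitwise_xor(a, b)

# Two arbitrary test tones (frequencies chosen only for illustration)
a = sine_int16(220.0, 0.5)
b = sine_int16(331.0, 0.5)
mixed = xor_signals(a, b)
print(mixed[:8])
```

The result stays within the 16-bit range by construction, so it can be written straight out as audio without any rescaling.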

The conference banquet was v good too, good food and in a really nicely-architected building called Dipoli. Also had a good time in and around Helsinki but I've documented that elsewhere.

Saturday 6th September 2008 | science | Permalink

Beatboxing with a very different voice

Someone has written a very nice popular-science-type article... about me :)

Friday 27th June 2008 | science | Permalink

My reading list: the past 18 months

I decided to make a public archive of my Bibtex file - i.e. almost everything I've read, or not read, in my PhD so far.

This bibliography might be useful to people interested in sound/music technology, vocal timbre, real-time audio processing, etc.

The general angle of my research topic is summarised on my QMUL homepage

Wednesday 11th June 2008 | science | Permalink

Laryngographs of my beatboxing

So after I beatboxed for the scientists they've sent me some of the output from the laryngograph tests. Here it is!

First of all here's me doing a kick-drum-plus-bass sound:

  • laryng-kick-drum-plus-bass.mp3 WARNING: this is NOT a normal recording. On the LEFT channel you get the normal recording from a microphone, and on the RIGHT channel you get the direct output from the laryngograph - essentially, you get to listen to what my larynx is doing itself, without any of the complicated stuff that happens afterwards (in the throat, lips, tongue). Use your computer's left-right balance controls to choose what to listen to.

Here's a picture of that same clip:

Laryngogram of kick-plus-bass

In that picture the audio recording is the blue "Sp" line in the middle, and the larynx trace is the green "Lx" just below it - the signal goes up when vocal cords close, goes down when they open.

Towards the end of the clip my larynx is opening and closing normally, a regular opening-and-closing just like in normal speech. But towards the beginning it's a bit more chaotic than that, and it almost looks like there are two different frequencies competing to take over. I'm not entirely sure what this implies, but the researchers pointed that feature out, and maybe it's connected to the sound that's produced somehow.

OK, now here's a bit of "vocal scratching":

  • laryng-vocal-scratching.mp3 WARNING: this is NOT a normal recording either. On the LEFT channel you get the normal recording from a microphone, and on the RIGHT channel you get the direct output from the laryngograph.

Here's a picture of that same clip:

Laryngogram of vocal scratching

The main thing they were looking at on the scratching was the very fast pitch changes - look at the lowest panel and the green "Fx" line, which is the fundamental frequency. It changes by up to one-and-a-half octaves in 150 milliseconds, which apparently is ridiculously fast. Now I'm not the best vocal-scratcher in the world, so I bet that it goes even faster than that for others...
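To put numbers on that pitch change, here's a quick bit of arithmetic as a sketch. The octave span and the duration come from the figures above; the 100 Hz starting pitch is just a hypothetical example, not something read off the trace.

```python
octaves = 1.5        # size of the pitch sweep
duration_s = 0.150   # time taken for the sweep, in seconds

frequency_ratio = 2 ** octaves        # each octave doubles the frequency
semitones = 12 * octaves              # 12 semitones per octave
octaves_per_second = octaves / duration_s

def sweep_end_frequency(start_hz, n_octaves):
    """Frequency reached after sweeping up n_octaves from start_hz."""
    return start_hz * 2 ** n_octaves

print(f"frequency ratio: {frequency_ratio:.2f}")
print(f"semitones: {semitones:.0f}")
print(f"rate: {octaves_per_second:.1f} octaves per second")
print(f"from 100 Hz: {sweep_end_frequency(100.0, octaves):.1f} Hz")
```

So the fundamental frequency nearly triples within 150 milliseconds, which works out at a rate of about ten octaves per second.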

Thursday 13th March 2008 | science | Permalink

They put a CAMERA up my NOSE

And it was all in the name of science. I volunteered for an experiment which wanted to look at beatboxers' voice-boxes while they were beatboxing, so I went and let someone put a camera up my nose (a nasal endoscopy). This was also being filmed for a Science Museum beatboxing project, so as well as the actual scientists there was a one-woman film crew plus a Science Museum person co-ordinating the thing and handing me the SM58 so I could bust some beats in the little clinic room.

I couldn't see the screen so I wasn't sure what my larynx was looking like but I dropped some of the usual beatbox stuff (some old-school hip-hop ones, a slightly poor DnB one, a quick rendition of If Your Mother Only Knew) and they seemed interested in what was happening. They'll take a while to do a proper analysis of the results but apparently there's a lot of muscular activity happening around and above the larynx while I'm doing kicks and snares and suchlike.

Some voice specialists are worried that beatboxing is bad for your voice so it was good to know that, after 7 years of beatboxing, I don't seem to have anything weird or wrong with my vocal folds, I'm not doing myself any damage.

One of the sounds that worries specialists is vocal scratching, so I gave them a bit of that. They confirmed that it involves a lot of constriction to produce that sound, and they also confirmed that there are lots of really fast pitch changes (one-and-a-half octaves in 150 milliseconds!). Whether that means it is bad for you I'm not sure. I don't actually do much vocal scratching myself.

There'll be more sessions, and at some point there'll be a video online, but that's all for now. I have a printed-out photo of my larynx but you don't want to see that ;)

There were also some tests with a laryngograph, which showed some of the controlled-weirdness involved in beatboxing, and some interesting discussion about whether super-deep bass tones were bad for you or not. The "received wisdom" is that they're dangerous since they involve your "false vocal folds" pushing down on your real vocal folds, but some researchers have evidence that if you do it right, that's not what's happening, instead your false vocal folds are basically flapping on their own. Watch this YouTube video on "Extreme vocal effects" to see what's happening when singers make deep growly sounds...

Tuesday 11th March 2008 | science | Permalink

Echinacea: Science says

There was an advert on the tube claiming echinacea could reduce my chance of developing a cold by 65%. Blimey, a big claim. So I went and found the source of the claim, and a couple of other review papers. My summary of the research is this:

  • Although it's hard to be certain (partly because there are so many different sorts of echinacea plant and different ways to prepare it), it does look like echinacea helps to shorten the duration of a cold and make it less severe. It might also prevent a cold happening in the first place, but that's less clear. The most likely useful type of echinacea is echinacea purpurea.

There are all sorts of caveats on this summary. Firstly it's not recommended for children, or for people with immune problems such as arthritis or HIV, or people who might have an allergic reaction. Secondly we basically don't know how it might work (it contains a few chemicals that probably interact with the immune system... but in what way?). Thirdly we need more big studies before we can be sure about the effect on outcomes - so the picture might change, might even change dramatically, as more science gets done.

But I want to emphasise: in terms of real-life evidence, echinacea has much better evidence than homeopathy, or than other herbs or other such stuff you might find in that same section at the chemist's.

My main sources for all this are two recent research summaries, a meta-analysis published in a Lancet journal and a Cochrane systematic review.

I noticed that some science bloggers tried a little bit to poo-poo the meta-analysis, and to be blunt I suspect that's because it finds quite decisively in favour of echinacea. (None of my favourite science bloggers had this prejudice: David Colquhoun and Ben Goldacre are my favourites by the way.)

I do personally have an instinctive scepticism of complementary medicines because of the way things often try to side-step proper evaluation while at the same time giving themselves a white-coated pseudo-medical image. But in this case I'm happy to say that both the reviews find generally that there is a positive effect on cold from (some) echinacea preparations.

Friday 30th November 2007 | science | Permalink

A beatbox experiment

After a good few months of working on my PhD I'm finally ready to get some people to use my stuff and see what they make of it.

If you're near London and you're a beatboxer check this out, I'm recruiting for a beatbox experiment

Thursday 29th November 2007 | science | Permalink

Smoking ban definitely improved health

Good news from Scotland, where the smoking ban came in before ours in England. A large study has found measurable health improvements due to the ban, such as a large decrease in heart attack admissions (including a noticeable effect on non-smokers due to less passive smoking). Woo.

Monday 10th September 2007 | science | Permalink

Onscreen violence really is bad for us

Given the shootings in the USA this week, the main feature in this week's New Scientist is eerily apt. As summarised in their editorial, the research on the effect of TV / video game violence seems to be persuasive, that it has generally bad effects including aggression/desensitisation/etc.

While the report does concede that you can get useful skills from modern media (such as the dexterity and quick thinking which can be demonstrated to come from computer games), it makes the point quite clearly that the bad outweighs the good. I'm not sure what the picture would be like for people who see only "non-violent" media... I've never read any research papers on the subject so I can only be vague.

The strange prevalence of violence in films and computer games puzzles me quite a bit. I'm not one of those people that automatically tuts about violent media but it's weird how much violence there is. It must be what people want, but why? One answer might be "escapism", escaping from humdrum life into exciting scenarios, and maybe violence is one of the easiest ways to make things exciting. But there are loads of imaginative ways to escape from the world... just look at some of the weird imaginative stuff that the Japanese come up with. The Japanese come up with lots of really sick and violent stuff too of course ;) and maybe the grass looks a little greener on the other side, but our media's imaginative range seems a bit stifled in comparison. Is poverty of imagination really anything to do with it? Or am I making it up?

Thursday 19th April 2007 | science | Permalink

Gillian McKeith stops calling herself a doctor!

The assumptions you make, eh? Not that I ever paid much attention to Gillian McKeith's TV programmes, but when someone called "Dr Gillian McKeith" appears regularly on Channel 4 telling people what they should be eating, who publishes books and so on, you tend to assume they've got medical qualifications in the straightforward sense just like my GP does.

This interesting article on Gillian McKeith throws a different light on the matter. Someone complained to the Advertising Standards Authority that calling herself "Dr Gillian McKeith" in advertising was misleading (since she's only a "Dr" by virtue of a correspondence course with a non-accredited American college). In order to avoid falling foul of a pending Advertising Standards Authority ruling (apparently a draft ruling seemed to be inclined in favour of the complaint) she's agreed not to use the term in future advertising.

The article has some really choice words to say about the woman, including quoting some of the very bizarre medical claims she's made, and the "Wild Pink Yam and Horny Goat Weed products" her company briefly marketed before the Medicines and Healthcare products Regulatory Agency ordered her to stop selling them and said they "were never legal for sale in the UK". The article's written by a doctor and it makes quite a lot of good points in general about the difference between science and nonscience, and real doctors and sort-of-doctors...

Tuesday 13th February 2007 | science | Permalink

How many eggs should I eat?

OK, here's yet another food dilemma: should you eat plenty of eggs, because they contain various healthy vitamins and minerals? Or should you not eat many eggs, because of the cholesterol they contain? As usual I'm determined to find an evidence-based answer.

The first things I find in a web search come from the egg marketing boards. So, bearing in mind that they're obviously quite biased, I check out "Healthy eggs" from britegg.co.uk and "Eggs and cholesterol" from nutritionandeggs.co.uk. So, as expected, they confirm that eggs are full of lots and lots of nutritious things, but they also argue that recent evidence shows that eggs aren't bad for health. They have two scientific studies to support this argument: one which looked at a large number of people in the USA and found eggs didn't increase the risk of heart disease; and one which reviewed the current state of scientific knowledge and found that saturated fat (rather than dietary cholesterol) was the main cause of people having high blood cholesterol levels.

So far, so good, although the source is not what you'd call 100% neutral. And even if saturated fat is the main cause of high blood cholesterol, could dietary cholesterol be a lesser but still important cause?

So, I found the cholesterol review article and had a look. It's a very tricky subject to unpick, actually. For example, the study finds that people who eat more dietary fat tend to eat more dietary cholesterol too. So it could be tricky to separate out the effect of these two. There are methods for doing this, of course, and in the multiple regression analysis used by the researchers, it seems that there were three significant influences on a person's blood cholesterol levels: their intakes of saturated fat, polyunsaturated fat (the more people eat, the lower their cholesterol, for polyunsaturates), and cholesterol. However, although these influences all took part, the cholesterol influence is strongly outweighed by the influence of saturated fat vs unsaturated fat - if I gloss over some of the details to come up with a very approximate rule of thumb, the study finds that reducing saturated fat is something on the scale of ten times more influential than reducing cholesterol.

OK, so what about some other sources of information? The BBC often has a lot of health information, but searching their site didn't actually find very much. The story Eggs 'protect against breast cancer' reports on a USA study of women, finding that eating eggs in teenage years seems to help lessen the likelihood of breast cancer; the study involved a large number of people and was published in a reputable journal so it seems trustworthy. The only other article I found was An egg a day 'is good for you' which seems to be based on the same studies as the ones I mentioned above. They did however confirm with a British Nutrition Foundation scientist, who agreed that there was unlikely to be a health risk from eating an egg a day (they recommend 2 or 3 a week apparently). There is opposition from the Vegan Society, but once again, they're hardly an unbiased source of information about whether people should eat eggs or not!

What about UK government advice? The UK government seems to be quite keen on inventing websites for public information these days, and one of their sites I searched is eatwell.gov.uk. They have two useful pages here: a page about eggs (including the section: "How many eggs?" - aha!) and a Q&A about eggs and cholesterol. The message from them is: eggs are good for you, and you don't need to cut down on them (unless your doctor tells you to for a specific reason). Just eat a balanced diet, as they always say.

And that's pretty much my conclusion. It seems that people used to (reasonably) assume that eating food with cholesterol in, would raise your blood cholesterol, and that was a reason not to eat too many eggs. But that assumption is too simple, and dietary cholesterol isn't that worrying after all. As long as you eat a balanced diet you can enjoy your eggs.

Sunday 10th December 2006 | science | Permalink

Does burnt food cause cancer?

Does burnt food cause cancer? Someone said to me that burnt food was "as dangerous as a cigarette", which is a pretty big claim, so I've been searching the web and some research databases, looking for evidence.

There's very little on the web about it, besides a lot of idle speculation on messageboards. This ScienceNews article from 2005 says that the US government now lists certain chemicals found in "meats when they're cooked too long at high temperature" as carcinogenic. It also says:

Finally, the report notes that while inconclusive, published studies in people "provide some indication" of human risks from eating broiled [grilled] or fried foods "that may contain IQ and/or other heterocyclic amines." The National Cancer Institute conducted one of those suggestive studies. It compared the diets of 176 stomach cancer patients and another 503 cancerfree individuals. Overall, people who regularly ate their beef medium-well or well-done faced more than three times the stomach cancer risk of those who ate their meat rare or medium-rare, according to a 1997 report of the research.

More information about this is in a very helpful summary by the USA National Cancer Institute. Note that one of the studies quoted looked at cooking at 200°C or 250°C, which is much hotter than ordinary baking/roasting. However, that is the kind of temperature you use to cook a pizza...

Statistics like "three times the cancer risk" always sound scary, but you need to ask, three times what? We need to know how the risk compares against other things. More on that later.

I found a messageboard thread on which someone said "You can put tomato sauce on it. I heard it helps lessen the production of carcinogen which causes the cancer." This is a big mistake. Fruits like tomatoes or cherries do contain antioxidants which counteract the formation of the carcinogens, but only during the cooking process, mixed in with the meat (e.g. in a burger mixture). Putting ketchup on afterwards will make zero difference.

I also found a journal article discussing the increased cancer risk from barbecued food especially (Lijinsky W, (1991), Mutation Research 259 (3-4): 251-261). It suggested that the reason for the risk was that fat will drip off the meat, then burn at high temperatures when it hits the coals, forming the cancer-causing substances that then mix in with the barbecue smoke and may then coat the outside of the meat being cooked. This explanation was proposed to explain their finding that the chemicals were mainly found in fattier foods cooked over burning logs.

Other relevant journal articles:

  1. One found a similar connection: the highest concentrations of these chemicals found in the Italian diet were in pizzas cooked in wood-burning ovens, and in barbecued beef and pork. (Ludovici M et al (1995), Food Additives and Contaminants 12 (5): 703-713)
  2. One found that the Indian tradition of cooking with homemade clay-stoves, called "Chulha", created a lot of smoke containing the problematic chemicals. (Bhargava A et al (2004), Atmospheric Environment 38 (28): 4761-4767) This was said to increase the risk for people who cook with them - remember that inhaling carcinogens is typically much more dodgy than swallowing them, because the route into the body is more direct.

The relative risk? Is a barbecued steak as dangerous as a cigarette, as certain internet message boards might lead you to believe? Clearly not: many people eat well-cooked meat, yet nine-out-of-ten cancer deaths can be attributed to smoking. (The nine-out-of-ten figure comes from a study of USA deaths in 1995: source.) Scientists can calculate guideline statistics such as the "incremental cancer risk", an averaged-out measure of the risk from something. For cigarettes it's 0.079 (source); for burnt meat it's somewhere between 0.00001 and 0.00038 (source). So the risk is somewhere between 200 and 8000 times lower - there's no comparison between one cigarette and one burnt steak.
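Here's the arithmetic behind that "between 200 and 8000 times lower" range, as a quick sketch. The three risk figures are the ones quoted above; the variable names are just mine.

```python
cigarette_risk = 0.079      # incremental cancer risk quoted for cigarettes
burnt_meat_low = 0.00001    # lower estimate quoted for burnt meat
burnt_meat_high = 0.00038   # upper estimate quoted for burnt meat

# Comparing against the upper estimate gives the smallest ratio, and vice versa
ratio_min = cigarette_risk / burnt_meat_high
ratio_max = cigarette_risk / burnt_meat_low

print(f"cigarette risk is roughly {ratio_min:.0f} to {ratio_max:.0f} times higher")
```

That gives roughly 208 at one end and 7900 at the other, which I've rounded to "between 200 and 8000".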

My conclusions:

  1. Regularly eating burnt or barbecued meat, especially meat that's been cooked at high temperatures for a long time, is relatively risky behaviour. But don't panic: it's not comparable to smoking.
  2. For non-meat food the research is less clear-cut: it's not obvious whether all smoke-cooked or overcooked food carries risks. Certainly if you don't eat it regularly there's nothing to worry about.

Monday 2nd October 2006 | science | Permalink

Digital vs analogue clocks

Both me and Philippa insist that it's easier to read an analogue clock-face (i.e. one with hands) than a digital clock-face. So I wondered: is there any research on the subject?

Of course there is! There's research about everything. But it doesn't seem to agree with us.

In Processing of visually presented clock times (Goolkasian, P and Park, D.C., 1980) the experimenters looked at the differences in speed for judging the time difference between two clocks, and found that "same/different reactions to digitally presented times were faster than to times presented on a clock face, and this format effect was found to be a result of differences in processing that occurred after encoding."

Minding the clock (Kathryn Bock, David E. Irwin, Douglas J. Davidson and W. J. M. Levelt, 2003) looked at explicitly linguistic effects (e.g. difference between Dutch and American English speakers). It also found that "responses to analog clocks were faster with relative expressions and responses to digital clocks were faster with absolute expressions," although overall it found again that digital clock-reading was faster than analogue in all cases. Note that the experimental method was explicitly linguistic - the speed measurements were measurements of how quickly the participants began to speak when they correctly named the time.

This is one of the most interesting (and most recent) results I found, partly because the experimental design included displaying the clocks for a short amount of time (as low as 0.1 seconds). 0.1 seconds is too quick for the eye to rove around the clock-face and fixate directly on the different parts of the display, and "the results from the 100 ms exposure conditions indicated that sufficient information for fairly accurate production can be extracted from the display without fixating the crucial information directly."

The effects of response format and other variables on comparisons of digital and dial displays (Miller R.J. and Penningroth S., 1997) "compared dial and digital clock displays to determine which could be read faster by 25 young adults" and found that "in general, digital displays led to faster responses than did dial displays. However, several combinations of the other variables, particularly those using the before-the-hour response format, effectively eliminated the superiority of digital displays. We suggest that in designing displays requiring such a response format, designers should not assume that a digital display is necessarily the best choice, especially if other factors encourage the selection of a dial display."

I haven't read the full paper (not available electronically; will have to visit my uni library) so I'm not sure if the experimental design was again based on participants reading the time out loud - and if so, I have an issue with that which I'll come to later. But this effect of before-the-hour responses is tantalising. For example: Philippa is a radio producer, and one of the things they need to do is glance at the clock to know how much time they've got before the programme ends at 6 o'clock precisely, so they can judge when to end interviews, when to bring in the next piece of music, etc. Philippa finds it much quicker to glance at an analogue clock in order to do this, and intuitively I can see why. You can literally see how much time is left (i.e. the size of the gap between the minute hand and the 12 o'clock mark), whereas with a digital clock you have to take in all the numbers and then do a quick arithmetic operation - not difficult, of course, but probably much slower, cognitively.

Judging a duration like this is very different from speaking the time. Reading out numbers is a one-to-one transformation which we do in so many contexts that it's very very easy; when reading out from a dial clock, we need to translate the hands' position into numbers before we can speak it. When using a dial clock to determine actions, however, we don't necessarily need to put the numerical step in the middle.

I'd like to run an experiment different to the ones I've found so far, one which tests the ability to comprehend clock-faces from a short glance - e.g. starting at 0.1 seconds and getting shorter. Rather than measuring the speed of vocalising, the measurement would be the minimum "glance time" for which the time could be correctly identified. My hunch is that the threshold will be a much shorter glance for analogue clocks.
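As a very rough sketch of how the threshold search in such an experiment might go: keep shortening the glance until the time is no longer read correctly, and record the shortest glance that still worked. Everything here is an assumption on my part: the function name, the 20% step size, and especially the pretend participant (who can read anything shown for at least 30 ms). A real experiment would need repeated trials and proper randomisation.

```python
def find_glance_threshold(can_read, start_s=0.1, factor=0.8, min_s=0.001):
    """Shorten the exposure step by step; return the shortest exposure read correctly.

    can_read: function taking an exposure in seconds and returning True if the
              displayed time was identified correctly at that exposure.
    Returns None if even the starting exposure fails.
    """
    exposure = start_s
    shortest_ok = None
    while exposure >= min_s and can_read(exposure):
        shortest_ok = exposure
        exposure *= factor
    return shortest_ok

# Hypothetical participant: succeeds whenever the glance lasts at least 30 ms
def pretend_participant(exposure_s):
    return exposure_s >= 0.030

threshold = find_glance_threshold(pretend_participant)
print(f"shortest successful glance: {threshold * 1000:.1f} ms")
```

Running the same search once with analogue clock-faces and once with digital ones would give two thresholds to compare.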

Friday 23rd June 2006 | science | Permalink

A load of Boswelox

Here's an interesting article about the science, or lack of science, behind skincare products and their arbitrary pseudo-scientific claims. It even uses the word "Boswelox", possibly the funniest word ever.

Wednesday 22nd February 2006 | science | Permalink
Creative Commons License
Dan's blog articles may be re-used under the Creative Commons Attribution-Noncommercial-Share Alike 2.5 License. Click the link to see what that means...