Since April 2023, I’ve been working on Muse – Unity’s in-Editor AI assistant for developers. I was one of the first to join this team, and it’s been quite a ride so far, working with some excellent colleagues and navigating several challenges along the way! Our product was recently featured as a success story at Microsoft’s Build conference: a camera crew visited us in Copenhagen on Microsoft’s behalf to shoot a promotional video, parts of which were shown during a live demo of Muse at the conference. I had the privilege of being part of this video, and got to talk about the product and share some of my contributions to it!
There’s a lot of exciting stuff coming up for Muse, and I’m really looking forward to being a part of the future of this product!
It’s just been confirmed that four of us from Moodagent – Reinier de Valk, Pierre Lafitte, Tomas Gajarsky and I – will be attending ISMIR 2019 in Delft (The Netherlands). This year, two of my colleagues from Moodagent will be presenting their work at ISMIR:
I was invited by the Music Tech Community – India (MTC – India) to deliver a talk on the 29th of December, 2018 in Bengaluru. The theme of the event was “Machine Learning for Art & Music Generation” where my work at Jukedeck fit in perfectly alongside that of the other speakers at the event.
I happened to be on holiday in beautiful Mararikulam in Kerala at the time, but I really didn’t want to miss this opportunity to speak, so we decided to make it a remote talk, which I delivered via Skype. Thanks to the excellent organisers – Albin Correya, Manaswi Mishra and Siddharth Bharadwaj – the talk went off smoothly and was apparently well-received. The other speakers at the event were Harshit Agarwal and two of the organisers themselves – Albin Correya and Manaswi Mishra.
A few months after the acceptance of our paper at ISMIR 2018, I attended the conference in Paris with several of my colleagues from Jukedeck. We had a fairly large presence there, dwarfed (as far as I could tell) only by a larger one from Spotify. The conference was organised very well and everything went off smoothly. It was great to be back in the beautiful city after my last visit nearly 8 years ago!
I was particularly pleased by the new format for presenting accepted papers at this ISMIR, wherein each paper was given both an oral and a poster presentation slot, thus removing the traditional distinction between oral and poster papers that exists at most conferences. In the case of our paper on StructureNet, I gave the oral presentation and my colleagues and co-authors – Gabriele and Marco – presented the poster. Fortunately, ISMIR was streamed live this year and the videos were later uploaded to YouTube, so I’m able to share the video of my presentation with you. It’s only a 4-minute presentation, so do check it out! Each time I passed our poster, it seemed to be receiving a lot of attention, which was of course great! With help from members of my team, I also prepared a blog post on StructureNet, which was published recently on the Jukedeck R&D team’s Medium page. I urge you to give it a read if you’re curious about what the paper is all about. Here’s a picture of the Jukedeck team at ISMIR:
I also signed up to play in this year’s ISMIR jam session organised by Uri Nieto from Pandora! If I remember correctly, it’s something that started in 2014 and has been getting more popular by the year. As anticipated, the jam session was a success and a lot of fun, with music ranging from AI-composed folk tunes to Jazz, Blues, Rock and Heavy Metal. I played two songs with my fellow attendees – Blackest Eyes by Porcupine Tree and Plush by Stone Temple Pilots. My friend Juanjo shared a recording of the first song with me in which I played bass.
As always, ISMIR this year provided a great opportunity to make new acquaintances and meet old friends and colleagues. As it turned out, quite a few of my friends from the Music Informatics Research Group (MIRG) at City, University of London showed up this time, and it was great to catch up with them.
And to top it all off, my Master’s thesis supervisor Hendrik Purwins managed to make it to the conference on the last day, giving me the opportunity to get this one selfie with him and Tillman (my PhD thesis supervisor).
I’m currently on a break from work at Jukedeck until the 22nd of September, visiting friends and old colleagues in Bangalore for a few days. On learning of my visit, my past mentors invited me to give talks at their respective organisations – the International Institute of Information Technology – Bangalore (IIIT-B), and Robert Bosch. Today I presented the work I did during my PhD on sequence modelling in music, RBMs and Recurrent RBMs to the staff and students at IIIT-B. And next Monday (the 18th of September, 2017) I will give more or less the same talk at Robert Bosch.
Here is a copy of the slides for those presentations.
“We are interested in modelling musical pitch sequences in melodies in the symbolic form. The task here is to learn a model to predict the probability distribution over the various possible values of pitch of the next note in a melody, given those leading up to it. For this task, we propose the Recurrent Temporal Discriminative Restricted Boltzmann Machine (RTDRBM). It is obtained by carrying out discriminative learning and inference as put forward in the Discriminative RBM (DRBM), in a temporal setting by incorporating the recurrent structure of the Recurrent Temporal RBM (RTRBM). The model is evaluated on the cross entropy of its predictions using a corpus containing 8 datasets of folk and chorale melodies, and compared with n-grams and other standard connectionist models. Results show that the RTDRBM has a better predictive performance than the rest of the models, and that the improvement is statistically significant.”
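The discriminative inference mentioned in the abstract is what makes the DRBM (and hence the RTDRBM) attractive: the class-conditional distribution can be computed in closed form by summing out the hidden units, with no sampling required. Here is a minimal sketch of that computation in Python/NumPy – the function name, toy dimensions and random parameters are my own illustration, not from the paper:

```python
import numpy as np

def softplus(x):
    return np.log1p(np.exp(x))

def drbm_predict(x, W, U, c, d):
    """Closed-form class-conditional distribution p(y | x) of a
    Discriminative RBM (illustrative sketch, not the paper's code).

    x : (n_visible,) input vector (e.g. an encoded melodic context)
    W : (n_hidden, n_visible) input-to-hidden weights
    U : (n_hidden, n_classes) class-to-hidden weights
    c : (n_hidden,) hidden biases
    d : (n_classes,) class biases

    Summing out the hidden units gives, for each class y:
        score(y) = d_y + sum_j softplus(c_j + U_{jy} + (W x)_j)
    and p(y|x) is the softmax over these scores.
    """
    pre = c[:, None] + U + (W @ x)[:, None]   # (n_hidden, n_classes)
    scores = d + softplus(pre).sum(axis=0)    # (n_classes,)
    scores -= scores.max()                    # numerical stability
    p = np.exp(scores)
    return p / p.sum()

# Toy usage: 12 pitch classes, 8 hidden units, a 24-dim context encoding
rng = np.random.default_rng(0)
n_h, n_v, n_y = 8, 24, 12
probs = drbm_predict(rng.random(n_v),
                     rng.normal(size=(n_h, n_v)),
                     rng.normal(size=(n_h, n_y)),
                     rng.normal(size=n_h),
                     rng.normal(size=n_y))
# probs is a proper distribution over the 12 candidate pitch classes
```

Roughly speaking, the RTDRBM then makes this same prediction at every position in the melody, with the biases at each time step conditioned on the hidden state inferred from the previous step, following the RTRBM’s recurrent structure.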
I presented the paper in the session on Recurrent Neural Networks. The model we proposed in the paper – the RTDRBM – was the first original machine learning contribution of my PhD, and it was a pleasure to collaborate on this work with my friend and colleague Son Tran. He also presented a second paper at the conference, titled “Efficient Representation Ranking for Transfer Learning”.
Yet again a conference has taken me to a place in the world that I probably would’ve never visited otherwise! This doesn’t at all mean that the visit wasn’t worthwhile. The lush green Irish landscape, the charming town of Killarney and the abounding nature around it, and a friendly and welcoming hostel all made this a very memorable trip! Unfortunately, I had a sore throat and a fever during much of my stay, so I chose Irish coffee over a pint of Guinness (which I hear tastes much better in Ireland) when I had the chance. I regret this, but maybe that’s another reason to visit Ireland again sometime!
Having had a paper accepted at the 14th International Society for Music Information Retrieval Conference (ISMIR 2013), I travelled to Brazil for two weeks – first to Curitiba, where the conference was held, and then to Rio de Janeiro for a holiday. ISMIR is the leading conference for research in Music Information Retrieval and related topics in Music Technology. The paper I presented there was titled “A Distributed Model for Multiple Viewpoint Melodic Prediction”. Its abstract is the following:
“The analysis of sequences is important for extracting information from music owing to its fundamentally temporal nature. In this paper, we present a distributed model based on the Restricted Boltzmann Machine (RBM) for melodic sequences. The model is similar to a previous successful neural network model for natural language. It is first trained to predict the next pitch in a given pitch sequence, and then extended to also make use of information in sequences of note-durations in monophonic melodies on the same task. In doing so, we also propose an efficient way of representing this additional information that takes advantage of the RBM’s structure. In our evaluation, this RBM-based prediction model performs slightly better than previously evaluated n-gram models in most cases. Results on a corpus of chorale and folk melodies showed that it is able to make use of information present in longer contexts more effectively than n-gram models, while scaling linearly in the number of free parameters required.”
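As a rough illustration of how an RBM can be used for prediction in this way: each candidate next pitch can be scored by the RBM’s free energy of the visible vector formed by concatenating the context with that candidate, which is available in closed form. The sketch below is my own minimal rendering of this idea in NumPy, with made-up names and toy dimensions – not the paper’s actual model or representation:

```python
import numpy as np

def softplus(x):
    return np.log1p(np.exp(x))

def free_energy(v, W, a, b):
    """RBM free energy: F(v) = -a·v - sum_j softplus(b_j + W_j·v)."""
    return -(a @ v) - softplus(b + W @ v).sum()

def next_pitch_distribution(context, W, a, b, n_pitches):
    """Score each candidate next pitch k by -F([context; onehot(k)])
    and normalise: p(k | context) ∝ exp(-F(·)). Illustrative sketch."""
    scores = np.empty(n_pitches)
    for k in range(n_pitches):
        candidate = np.zeros(n_pitches)
        candidate[k] = 1.0
        v = np.concatenate([context, candidate])
        scores[k] = -free_energy(v, W, a, b)
    scores -= scores.max()           # numerical stability
    p = np.exp(scores)
    return p / p.sum()

# Toy usage: a context of two one-hot pitches (12 classes each), 16 hidden units
rng = np.random.default_rng(1)
n_pitches, n_hidden = 12, 16
context = np.concatenate([np.eye(n_pitches)[4], np.eye(n_pitches)[7]])
n_visible = context.size + n_pitches
dist = next_pitch_distribution(context,
                               0.1 * rng.normal(size=(n_hidden, n_visible)),
                               0.1 * rng.normal(size=n_visible),
                               0.1 * rng.normal(size=n_hidden),
                               n_pitches)
# dist is a proper distribution over the 12 candidate next pitches
```

With trained weights, this is what lets a single RBM act as a predictive sequence model: the context units are clamped and the distribution over the next-event units is read off directly.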
The paper was chosen for an oral presentation, and it also won a Best Student Paper Award at the conference. On the final day of the conference, I also organised a late-breaking session on “MIR in Music Education” which is a topic I am very interested in, and also participated in several other sessions organised by others.
I also met a very interesting guy named Anderson during my stay at the Knock Knock hostel in Curitiba – also a PhD student, doing his research on armadillos!
Then I travelled to Rio de Janeiro for a week where I lived in a hostel located just a few minutes away from Copacabana beach. I spent my time there hanging out at the many beaches, and visiting iconic landmarks such as Cristo Redentor and Sugarloaf mountain among other places recommended to me by the locals I met in the hostel, and also taking a bus tour with some other tourists.
I was also joined there by my supervisor Tillman, and my friend and colleague Reinier, who accompanied me during some sightseeing.
All in all, this was a fabulous experience and I thoroughly enjoyed my time in Brazil! I’m sharing a copy of my paper and presentation slides below.
My paper was accepted at the 6th International Workshop on Machine Learning and Music (MML 2013) held in conjunction with the European Conference on Machine Learning (ECML 2013) this year. The paper is titled, “A Neural Probabilistic Model for Predicting Melodic Sequences”. Its abstract is the following:
“We present an approach for modelling melodic sequences using Restricted Boltzmann Machines, with an application to folk melody classification. Results show that this model’s predictive performance is slightly better in our experiment than that of previously evaluated n-gram models. The model has a simple structure and in our evaluation it scaled linearly in the number of free parameters with length of the modelled context. A set of these models is used to classify 7 different styles of folk melodies with an accuracy of 61.74%.”
Unfortunately, I was unable to go to the conference myself so my colleague Emmanouil presented the paper on my behalf. In addition to this, another paper I co-authored with Emmanouil, titled “An Efficient Shift-Invariant Model for Polyphonic Music Transcription” was also accepted into the same workshop which he presented as well.
I’m including copies of the paper and the presentation below.
My abstract was selected at the British Computer Society’s 5th Doctoral Consortium and I made a presentation, titled “A Neural Probabilistic Model for Music Prediction” on May 16, 2013. The abstract for the talk is the following:
“Neural Networks and Markov models have received long-standing attention in music prediction. The latter, while being successful at modelling the joint probability of short musical sequences, suffer from problems pertaining to the curse of dimensionality and zero-occurrence when the sequences become longer. We present a new model for music prediction based on the Restricted Boltzmann Machine (RBM) and evaluate it on sequences of musical pitch in a corpus of monophonic MIDI melodies. The results show that this model is able to make use of information present in longer sequences more effectively than recently evaluated Markov models, outperforming them on the said corpus while also scaling gracefully in the required number of free parameters. While initial results have been encouraging, there is also scope for considering more powerful models that build upon this basic architecture. Some questions we hope to address in the future are whether such models can provide an insight into the development of musical taste, make predictions about more complex musical structures that involve polyphony and variations in rhythm, be of use in music education and compositional assistance, and aid in Music Information Retrieval tasks such as music transcription and classification.”
I have also attached a PDF copy of the presentation (all made in Beamer/LaTeX) below.