AI on Srikanth Cherla

Completed Generative AI with LLMs Course on Coursera

Thu, 01 Aug 2024 00:00:00 +0000

I recently completed a Coursera foundation course on Large Language Models — my first structured learning in a while. The gap was mostly down to becoming a parent, which makes carving out time for professional development and blogging considerably harder.

Since April 2023 I’ve been working on Muse at Unity — an LLM-driven AI assistant for Unity developers that has evolved from a web-based chat interface into a deeply integrated Editor tool capable of analysing your project and performing contextual tasks.

Unity Muse at the Microsoft Build Conference

Sat, 25 May 2024 00:00:00 +0000

I’ve been building Unity’s AI assistant — Muse — since April 2023, and it recently got a moment in the spotlight: Microsoft featured it at their Build conference as a customer success story.

A film crew came to Copenhagen to record the team, and the resulting video was shown during a live Muse demonstration at the conference. I’m in it, talking about both the product and my contributions to it. A good reminder of how far the project has come from its early days as a simple chat interface.

Completed Practical Reinforcement Learning on Coursera

Sat, 01 May 2021 00:00:00 +0000

Reinforcement Learning is one of the most interesting topics in computer science and ML. I started with the canonical textbook — Reinforcement Learning: An Introduction by Sutton & Barto — which is a fantastic read, though some topics took multiple passes to properly absorb.

After working through Dynamic Programming, Monte-Carlo methods, and Temporal Difference learning, I wanted practical experience and turned to Coursera’s Practical Reinforcement Learning course.

Honest review: it was frustrating. The course tried to cover too much ground too quickly, assignment instructions were minimal and error feedback sparse, and by the halfway point I was no longer enjoying it. I pushed through to the end.

Completed Recommender Systems Specialisation on Coursera

Thu, 25 Apr 2019 00:00:00 +0000

After leaving Jukedeck I completed the four-course Recommender Systems specialisation from the University of Minnesota on Coursera:

Introduction to Recommender Systems: Non-personalised and Content-based
Nearest Neighbour Collaborative Filtering
Recommender Systems: Evaluation and Metrics
Matrix Factorisation and Advanced Techniques

Completed in about a month at a leisurely pace. Very well-taught — the coursework used spreadsheet-based implementations to make the algorithms tangible before diving into code. Content-based filtering, item-item and user-user collaborative filtering, matrix factorisation — all well covered.

TensorFlow Tip: Pretrain and Retrain

Sun, 22 Apr 2018 00:00:00 +0000

I recently needed to train a neural network on one dataset, save it, and later load it to train on a different dataset (or with a different training procedure). I implemented this in TensorFlow and thought I’d share a stripped-down version of the script here, as it could serve as an instructive example of how to use TensorFlow sessions. Note that this is not necessarily the best way of doing this; it might be simpler to load the original graph and continue training it directly by making its parameters trainable. The script can be found here.

In the first stage of the script (pre-training) there is a single graph, which contains the randomly initialised and then trained model. One could skip defining this graph explicitly, since TensorFlow’s default graph would serve the same purpose. The model, together with its parameters, is saved to a file and loaded for the second stage (re-training).

In the second stage there are two graphs. The first is loaded from the saved file and contains the pre-trained model. The second contains a new model whose parameters are randomly initialised and then overwritten with the pre-trained values before it is trained on a different dataset. For this to work I found it necessary to assign parameters across graphs: evaluate the parameters of the first model as NumPy arrays, then assign those arrays to the corresponding parameters of the second model.
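The two stages can be sketched roughly as follows. This is not the linked script: it is a minimal illustration that assumes a TensorFlow installation exposing the TF1-style API through `tensorflow.compat.v1`, and the tiny linear model, the variable names `w` and `b`, and the helper names `build_model`, `pretrain` and `retrain` are all hypothetical.

```python
import os
import tempfile

import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # TF1-style graphs and sessions


def build_model():
    """Hypothetical one-layer linear model, built in the current default graph."""
    x = tf.placeholder(tf.float32, [None, 1], name="x")
    y = tf.placeholder(tf.float32, [None, 1], name="y")
    w = tf.get_variable("w", [1, 1], initializer=tf.random_normal_initializer())
    b = tf.get_variable("b", [1], initializer=tf.zeros_initializer())
    loss = tf.reduce_mean(tf.square(tf.matmul(x, w) + b - y))
    train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)
    return x, y, loss, train_op


def pretrain(ckpt_path, xs, ys, steps=200):
    """Stage 1: a single graph, trained from random initialisation, then saved."""
    with tf.Graph().as_default():
        x, y, _, train_op = build_model()
        saver = tf.train.Saver()
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            for _ in range(steps):
                sess.run(train_op, {x: xs, y: ys})
            saver.save(sess, ckpt_path)  # also writes ckpt_path + ".meta"


def retrain(ckpt_path, xs, ys, steps=200):
    """Stage 2: copy parameters from the restored graph into a fresh graph."""
    # Graph 1: restore the pre-trained model and read its parameters
    # out as NumPy arrays, keyed by variable name.
    with tf.Graph().as_default():
        saver = tf.train.import_meta_graph(ckpt_path + ".meta")
        with tf.Session() as sess:
            saver.restore(sess, ckpt_path)
            pretrained = {v.op.name: sess.run(v) for v in tf.trainable_variables()}

    # Graph 2: a new, randomly initialised model that receives those values.
    with tf.Graph().as_default():
        x, y, loss, train_op = build_model()
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            for v in tf.trainable_variables():
                sess.run(tf.assign(v, pretrained[v.op.name]))
            copied = {v.op.name: sess.run(v) for v in tf.trainable_variables()}
            for _ in range(steps):  # continue training on the new dataset
                sess.run(train_op, {x: xs, y: ys})
            final_loss = sess.run(loss, {x: xs, y: ys})
    return pretrained, copied, final_loss
```

Matching parameters by variable name only works because both graphs are built by the same function; a model with a different structure would need an explicit mapping between names.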

Completed Machine Learning with Big Data on Coursera

Sat, 17 Mar 2018 00:00:00 +0000

Completed UCSD’s Machine Learning with Big Data on Coursera with a 98.9% mark. The ML theory was introductory — a good refresher on Naive Bayes, Decision Trees, and k-Means, but nothing new. The real value was the hands-on introduction to KNIME and Spark ML applied to real datasets.

Together with the previous course this was more practically focused than the earlier modules in the specialisation, which is what I was after.

The TensorFlow Datasets API for Sequence Data (Code Examples)

Mon, 18 Dec 2017 00:00:00 +0000

When TensorFlow 1.4 was released there were very few fully working examples of the Datasets API for sequence data. Rather than a full tutorial, here are two scripts with explanatory notes.

The GitHub repository contains:

placeholder_vs_iterators.py — Three data input approaches:

Traditional placeholder method
Iterators
Feedable iterators

generator_vs_tfrecord.py — Three methods for iterating through sequence data during training:

Generator function with preprocessing (zero-padding, batching)
Pre-processed data via generator
TFRecord files using SequenceExample Protocol Buffers (the most Datasets API-dependent approach)
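As a rough, framework-free illustration of the first approach (a generator that zero-pads and batches variable-length sequences), something like the following would do. It is not taken from the repository; the function name `padded_batches` and its parameters are assumptions made for this sketch, and it uses plain Python lists rather than the Datasets API.

```python
from typing import Iterator, List, Sequence, Tuple


def padded_batches(
    sequences: Sequence[Sequence[int]],
    batch_size: int,
    pad_value: int = 0,
) -> Iterator[Tuple[List[List[int]], List[int]]]:
    """Yield (batch, lengths) pairs, padding each batch to its longest sequence."""
    for start in range(0, len(sequences), batch_size):
        chunk = sequences[start:start + batch_size]
        lengths = [len(seq) for seq in chunk]
        max_len = max(lengths)
        # Right-pad every sequence in the chunk up to the chunk's maximum length.
        batch = [list(seq) + [pad_value] * (max_len - len(seq)) for seq in chunk]
        yield batch, lengths


batches = list(padded_batches([[1, 2, 3], [4], [5, 6]], batch_size=2))
# First batch is padded to length 3; the second holds the single leftover sequence.
```

Returning the original lengths alongside each padded batch lets a sequence model mask out the padding later.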

References: TF Datasets documentation, Google Developers blog post on the API.

Completed Andrew Ng's Convolutional Neural Networks on Coursera

Sun, 03 Dec 2017 00:00:00 +0000

Completed Andrew Ng’s Convolutional Neural Networks course — the third in his Deep Learning specialisation — with 100%. This was the most genuinely new material for me; I’d only skimmed a couple of papers on CNNs and never properly implemented one.

The course is excellent. Highlights: 1D, 2D, and 3D convolutions explained clearly and in depth; coverage of VGGNet, InceptionNet, and Network-in-Network architectures; applications including object recognition, face recognition, and Neural Style Transfer. The programming assignments were engaging and moderately challenging, and the reading list was valuable.

(Automated) Curriculum Learning

Sat, 18 Nov 2017 00:00:00 +0000

I’ve lately spent some time reading about Curriculum Learning and experimenting with the algorithms described in two papers in this domain:

Bengio, Y., Louradour, J., Collobert, R., & Weston, J. (2009, June). Curriculum learning. In Proceedings of the 26th annual international conference on machine learning (pp. 41-48). ACM.
Graves, A., Bellemare, M. G., Menick, J., Munos, R., & Kavukcuoglu, K. (2017). Automated Curriculum Learning for Neural Networks. arXiv preprint arXiv:1704.03003.

The first is important because its empirical results in support of Curriculum Learning revived researchers’ interest in the technique. The second is one of the more recently proposed approaches, which I thought would be interesting to understand in greater depth. I’ve summarised my thoughts on these in a short presentation, and I hope to share my code and results before too long.

Completed Andrew Ng's Improving Deep Neural Networks on Coursera

Sun, 01 Oct 2017 00:00:00 +0000

Completed Andrew Ng’s Improving Deep Neural Networks — the second in his Deep Learning specialisation — with 100%. Much of the material was familiar from my ML background, but several sections were genuinely valuable: the detailed treatment of optimisation techniques (exponential moving averages, Momentum, RMSProp, Adam), batch normalisation, and dropout.
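For reference, the optimisers listed above fit together in a few lines: Adam keeps one exponential moving average of the gradients (the Momentum part) and one of the squared gradients (the RMSProp part), and applies a bias correction to both. The sketch below is a scalar illustration with commonly used default hyperparameters, not code from the course; the function name `adam_minimise` is made up for this example.

```python
import math
from typing import Callable


def adam_minimise(
    grad: Callable[[float], float],
    x0: float,
    steps: int,
    lr: float = 0.1,
    beta1: float = 0.9,
    beta2: float = 0.999,
    eps: float = 1e-8,
) -> float:
    """Minimise a scalar function with Adam, given its gradient `grad`."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g      # moving average of gradients (Momentum)
        v = beta2 * v + (1.0 - beta2) * g * g  # moving average of squared gradients (RMSProp)
        m_hat = m / (1.0 - beta1 ** t)         # bias correction for the zero initialisation
        v_hat = v / (1.0 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Minimise f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = adam_minimise(lambda x: 2.0 * (x - 3.0), x0=0.0, steps=500)
```

On the first step the bias-corrected ratio is roughly the sign of the gradient, so the update has magnitude close to `lr` regardless of the gradient's scale.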

On to the Convolutional Neural Networks course next.

Completed Andrew Ng's Structuring Machine Learning Projects on Coursera

Mon, 28 Aug 2017 00:00:00 +0000

Completed Andrew Ng’s Structuring Machine Learning Projects with 96.7%. Reasonably familiar material given my background, but a few useful insights.

The lectures on Transfer Learning, Multitask Learning, and End-to-End ML were too brief to be immediately useful — they’d need to be followed up with deeper reading. But the practical advice and real-world scenario exercises were valuable, and I wish there were more of them (perhaps as optional material).

Breaking Down the Differentiable Neural Computer

Wed, 24 May 2017 00:00:00 +0000

My PhD work on RNNs for musical sequence prediction got me interested in memory-augmented neural architectures. I spent a couple of weeks working through two key papers:

Neural Turing Machine — Graves et al., Google DeepMind (arXiv)
Differentiable Neural Computer — a more advanced variant published in Nature

I put together a Google Slides presentation with my observations and notes. Feedback welcome — let me know if anything needs correcting.

Music and Connectionism

Mon, 01 Aug 2016 00:00:00 +0000

The many contributions made during the past three decades to computer-assisted analysis and generation of music with the aid of Connectionist architectures can be seen to have occurred in two waves, in parallel with developments in Connectionist research itself. During the first wave, the founding principles of Connectionism were introduced (Rumelhart et al., 1986) through the idea of Parallel Distributed Processing, according to which mental phenomena occur as a result of simultaneous interactions between simple elementary processing units, as opposed to the then prevailing notion of Sequential Symbolic Processing, which explained the same phenomena in terms of sequential interactions between complex goal-specific units. The significance of this first wave is largely theoretical, with a few experimental and empirical results to support the feasibility of the theory. Following several years of reduced interest, the second wave further strengthened the claims made by its precursor through a series of successful high-impact real-world applications. This was owing both to the proposal of newer theories and to the availability of greater computational power and vast amounts of data, which enabled the efficacy of these theories to be demonstrated nearly two decades on (Bengio, 2009; LeCun et al., 2012). The innovations that came about as a result of these two phases trickled down to several application domains (Krizhevsky et al., 2012; Hinton et al., 2012; Collobert et al., 2011), of which music is one (Todd and Loy, 1991; Griffith and Todd, 1999; Humphrey et al., 2012). This section reviews notable contributions among the many that demonstrated the application of connectionism to symbolic music modelling during these two waves, in order to present a historical perspective together with an overview of the techniques employed.