Completed the Course “Big Data Integration and Processing” offered by UCSD on Coursera

I successfully completed this course with a 97.7% mark. This course was once again broad and touched upon some big data technologies through a series of lectures, assignments and hands-on exercises. The focus was mainly on querying JSON data using MongoDB, analysing data using Pandas, and programming in Spark (Spark SQL, Spark Streaming, Spark MLlib and Spark GraphX). All these were things I was curious about and it was great that they introduced these in the course. There was also an exercise on analysing tweets using both MongoDB and Spark. They had one section on something called Splunk, which I thought was a waste of time, but I guess they have to keep their sponsors happy.

This specialisation so far (I’m halfway through) has been fairly introductory and lacking depth. It’s been good to the extent that I feel like I’m aware of all these different technologies and would know where to start if I were to use them for some specific application. As I expected, this course was more hands-on, which was great!

And here’s the certificate that I was awarded on completing the course.

Completed the Course “Big Data Modeling and Management Systems” offered by UCSD on Coursera

I successfully completed this course with a 100.0% mark. It was quite broad and covered a range of topics somewhat superficially, from relational databases and their relation to big data management systems to the various alternatives that exist for processing different types of big data. As with the first course, there were a lot of new names to grasp and connections to be made between the things they represented. The assignments were straightforward and involved running a few specific command-line tools and spreadsheet commands to process data and carry out some basic analysis, just to get a feel for data tables and how one might go about extracting information from them. The final assignment involved completing an incomplete relational database design for a game. In my opinion, its goals could have been more precise, its connection to the course material clearer, and, being a peer-graded assignment, its evaluation criteria better defined. Quite a few learners seem to have lost out because someone else was not able to evaluate their assignment properly due to the latter shortcoming. And as usual, here’s the certificate that I was awarded on completing the course.

It looks like the upcoming courses in this specialisation contain more practical and hands-on exercises, so looking forward to that in the coming weeks!

Completed the Course “Introduction to Big Data” offered by UCSD on Coursera

I successfully completed this course with a 98.9% mark. It was easy and covered mostly definitions, some history of big data, big data jargon and very basic principles. There was an emphasis on what constitutes big data (in terms of size, variety, complexity, etc.), what kinds of analyses one can carry out on big data, what sources it can come from, and what tools one could use to analyse it. When it came to the latter, the course offered a brief introduction to the Hadoop ecosystem that I found particularly interesting, as I hadn’t ever worked with any of the software that is a part of this ecosystem. And there was also a simple assignment that gave one a taste of what working with Hadoop could be like. Here’s a link to the certificate I received from Coursera on completing this course.

Looking forward to the remaining courses in the Big Data specialisation!