How to build a MOOC curriculum for yourself

If you are serious about online learning, an organized approach will help you get started. It took me a long time sifting through hundreds of random courses before I focused my efforts. Try the following steps to get on track with learning new skills.

First, find a course that really interests you. Skip the introductory courses and find the more advanced ones. Look for skills and subjects that you’ve always wanted to learn but never had the opportunity. Make sure the course is too difficult for you! This one is not supposed to be easy.

Then audit the course to familiarize yourself with the requirements. Don’t pay for it at this point. Start by watching the course introduction and intro videos to every section of the course. Have a look at some of the written course materials. Throughout all this, make notes of everything you don’t understand and how those concepts relate to each other.

From there, work backwards to find courses that will help you understand every concept you need to reach your ultimate goal.

For example

Let’s imagine we are interested in the Artificial Intelligence for Robotics course. Sign up and explore the materials. Then mark down which concepts are hard. Having trouble with calculating probabilities? Write it down. Don’t understand the charts? Make a note.

Then sign up for intro to statistics and intro to probabilities. From your notes, you can build a whole curriculum that takes you towards your target, the one you started from.

For a ready-made Computer Science curriculum

Open Source Society University

Open Source Society has built a 31-course curriculum of freely available Computer Science courses from the best universities around the world for you to study on your own time. Because the courses are from different platforms, they created a website to track your progress.

How many online courses have you taken? Did you have an organized approach or did you just study the most interesting courses you could find?

On the probability of finding diamonds in Minecraft

There is no better game for a peaceful Saturday evening than Minecraft. You can explore, build, fight monsters or tunnel your way through endless stone while watching math videos on YouTube. Thanks, 3Blue1Brown, for an excellent Fourier Transform video!

Inspired by math and data science, I wanted to optimize my yield of diamonds (the most valuable mineral in the game). The topic has really been over-analyzed already. My inspiration came from this post on Reddit /r/Minecraft showing odd streaks along the z-axis. Nevertheless, I wanted to see what I could discover on my own.

I started a Jupyter Notebook and searched the Internet for how to read a Minecraft .mca file. Of course, other people had already done work on this topic years ago, so it didn’t take too long to find a great source and a Python implementation for getting the data.

The area under examination was 1775 chunks from a world where I had run around a bit to generate more chunks. Each chunk is 16×16 blocks wide with a height between 63 and 255 (the minimum comes from the sea level), which gives a minimum of 28 627 200 blocks. Because Minecraft doesn’t save empty chunks of just air, my dataset had 39 870 463 data points.
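The block-count bounds can be double-checked with a few lines of Python. This is only a sketch of the arithmetic using the numbers quoted above; the constant names are mine.

```python
# Numbers quoted in the post; the names are illustrative.
CHUNKS = 1775
CHUNK_WIDTH = 16    # a chunk is 16 x 16 blocks wide
MIN_HEIGHT = 63     # sea level
MAX_HEIGHT = 255
DATA_POINTS = 39_870_463

columns = CHUNKS * CHUNK_WIDTH * CHUNK_WIDTH   # vertical block columns
min_blocks = columns * MIN_HEIGHT
max_blocks = columns * MAX_HEIGHT

print(f"{min_blocks:,}")   # 28,627,200
print(f"{max_blocks:,}")   # 115,872,000
print(min_blocks < DATA_POINTS < max_blocks)   # True
```

The dataset size falls between the two bounds, which is what you would expect if most columns are taller than sea level but far from the height limit.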

Based on what I found, I will be mining mainly on level 11 and sometimes at level 6. This maximizes my visibility of the high-density layers above and below me as well. It might be fastest to ignore the diamonds on level 6 because of the high concentration of lava lakes, which can easily be bridged over at level 11.
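A per-level count like the one behind this conclusion could be computed roughly as follows. This is a minimal sketch with made-up sample rows. The column names `block` and `y` and the block name `diamond_ore` are assumptions for illustration, not the layout of my actual notebook.

```python
import pandas as pd

# Hypothetical sample: one row per block with its name and height (y).
# A real run would use the rows parsed from the .mca files instead.
blocks = pd.DataFrame({
    "block": ["diamond_ore", "stone", "diamond_ore", "lava", "diamond_ore", "air"],
    "y": [11, 11, 6, 6, 11, 70],
})

# Count diamond ore blocks on each level, ordered by level.
diamonds_per_level = (
    blocks.loc[blocks["block"] == "diamond_ore", "y"]
    .value_counts()
    .sort_index()
)

# The level with the most diamond ore in this sample.
best_level = diamonds_per_level.idxmax()
print(diamonds_per_level)
print("Best level:", best_level)
```

With real data, plotting `diamonds_per_level` as a bar chart makes it easy to spot the densest levels.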

Running on 8 GB of RAM, I found myself running out of memory more than once, so I looked up tips for working with a large dataset using the Python library pandas. I also skipped stone and air as their distribution is not that interesting or relevant. Skipping those most frequent but irrelevant blocks dropped my memory consumption from over 700 MB to 110 MB. Of course, I could skip most of the blocks but I was exploring the data so I wanted to see as much as possible.
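The effect of skipping the common blocks can be illustrated with pandas on a small made-up table. This is a sketch under assumptions: the `block` and `y` columns and the sample proportions are hypothetical, and the filter runs after loading for simplicity, whereas skipping rows while parsing avoids ever holding the full dataset in memory. Converting the block names to the `category` dtype is one commonly suggested extra step, not necessarily what I did.

```python
import pandas as pd

# Hypothetical stand-in for the parsed world data: one row per block.
blocks = pd.DataFrame({
    "block": ["stone"] * 700 + ["air"] * 250 + ["coal_ore"] * 40 + ["diamond_ore"] * 10,
    "y": [i % 256 for i in range(1000)],
})
before = blocks.memory_usage(deep=True).sum()

# Drop the most frequent but uninteresting blocks.
interesting = blocks[~blocks["block"].isin(["stone", "air"])].copy()

# Store each distinct block name once instead of once per row.
interesting["block"] = interesting["block"].astype("category")
after = interesting.memory_usage(deep=True).sum()

print(f"before: {before} bytes, after: {after} bytes")
```

Passing `deep=True` matters here because it includes the memory used by the strings themselves, not only the references to them.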

On their own, my results are not very significant considering the number of different Minecraft worlds that could be generated randomly and the relatively small number of chunks inspected. So my work assumes that the distribution of diamonds is uniform across random worlds and independent of other variables such as the biome in which they appear.

As next steps, I could take a look at the distribution of other minerals. I’m starting a University course in Information Visualization next month and look forward to graduating to real world datasets.

Completing my first Reddit programming challenge

Today, I completed my first daily /r/programmerchallenge! The sub-reddit features three levels of challenges from easy to advanced and a weekly challenge. It was an interesting math-related challenge, which I decided to write in Python 3 as that’s what I’ve used the most lately.

As Wikipedia explains, “a Kaprekar number for a given base is a non-negative integer, the representation of whose square in that base can be split into two parts that add up to the original number again. For instance, 45 is a Kaprekar number, because 45² = 2025 and 20+25 = 45.”

Below is my solution, which is not the prettiest or the shortest but returns the right answer. I also read through other people’s solutions in Java, Python and Scala. In the future, I’ll try to solve a challenge in more than one language.
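For reference, one way to write such a check in Python 3 is sketched here. It is an illustration of the definition quoted above in base 10, not necessarily the same approach as my submission. It assumes the common convention that the right-hand part must be non-zero, so 10 and 100 do not count, and it treats 1 as a Kaprekar number. The function name is my own.

```python
def is_kaprekar(n: int) -> bool:
    """Return True if n is a Kaprekar number in base 10."""
    if n < 1:
        return False
    if n == 1:
        return True  # 1 * 1 = 1, read as 0 + 1
    digits = str(n * n)
    # Try every position where the square can be split into two parts.
    for i in range(1, len(digits)):
        left, right = int(digits[:i]), int(digits[i:])
        # Assumed convention: the right part must not be zero.
        if right > 0 and left + right == n:
            return True
    return False


print([n for n in range(1, 100) if is_kaprekar(n)])  # [1, 9, 45, 55, 99]
```

Working on the decimal string keeps the splitting logic easy to read; a version using `divmod` with powers of ten would avoid the string conversions.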

Going back to school – for real

It’s time for a new adventure! In early December, I will be leaving HERE after more than six amazing years and moving to Finland. I’ll start a spring track of math and physics studies at an open university to gain admission to Aalto University in Helsinki for a Computer Science degree.

A while back I wrote about how Massive Open Online Courses are great for those embracing life-long learning. I’ve certainly been active in that area, finishing three courses in recent months and working on another three computer science courses from MIT on edX.

And now it’s time to go back to school for real.

Why Computer Science?

I’ve been coding for as long as I can remember. The first programming language I tried was QBasic sometime in the mid-90s. I wanted to know what made the classic game Gorillas work behind the scenes. I remember going through some example code and being super proud when I made my computer screen blink randomly in different colors.

When I was fourteen, I had moved on to Visual Basic through some experiments in Delphi. I programmed a converter for resistor and capacitor color codes to numeric values. It replaced the old DOS-based program in the shop class of my elementary school and was used at least five years later when my brother went there. It’s a shame I don’t have the source code or the executable anymore.

Ever since I moved next to Silicon Valley, my interest in Computer Science has grown stronger. I’m working next to brilliant engineers at the HERE Berkeley office. I’ve found tech meetups in San Francisco for every day of the week. I started learning programming in Ruby after years of working in PHP for web development. I created reddit multis to follow all the /r/programming -related subreddits.

Now, over 10 years after my first published work as a hobbyist, it’s time to get serious about computers and programming. You can follow along on the journey here and on Twitter.

Splitting data to training and test sets in Ruby

I’m trying to implement a few simple machine learning techniques in a Ruby on Rails project. Before I get started, I need to have the tools in place to extract relevant data from the application and then split the data into a training set and a test set. The code below is my first crack at a method for splitting the data, inspired by scikit-learn’s train_test_split.

For now, this works for my purposes but I recognize that it might not be the optimal solution. How could I improve my code?