Music Recommender System — Part 1

Naga Sanka
Published in Level Up Coding
5 min read · Dec 8, 2021


Creating a Recommender System Using Machine Learning

Introduction

Automated recommender systems are everywhere; a few well-known examples are those used by Netflix, Amazon, YouTube, and LinkedIn. In this series, let’s see how to build a recommender system from scratch using machine learning. I would like to show how we can create a framework for applying different machine learning algorithms to a real-world music dataset to predict playlist and song recommendations. We will use four main approaches: content-based filtering, collaborative filtering, model-based methods, and deep neural networks.

The steps that we follow to build a recommender system are here:

In this article, let’s get started by creating the development environment and installing all necessary libraries.

Part 1: Create Development Environment

We would like to avoid “works on our machine” situations, so we need an environment that is available to everyone who wants to run this project’s code. I have used Google Colab for other projects; it is a free Jupyter notebook environment that runs entirely in the cloud. I have also used GitHub Codespaces, but I would like to try Gitpod for this project because it provides a workspace that includes the source code, a Linux shell with root/sudo access, a file system, the full VS Code editing experience (including extensions and language support), and all the other tools and binaries that run on Linux. Click the link if you want to set up a Codespaces development environment instead. Let’s see how to create and configure your first Gitpod workspace below.

I assume you already have a GitHub account; if not, register for a free account here. Once you have an account, log in to GitHub in a browser and create a new repository; let’s call it “RecSys”. Once it is created, GitHub takes you to the code page of that repository.

Create RecSys GitHub Repository

Next, let’s add an Open in Gitpod button to the README so that it is easy to start a Gitpod workspace for this project. Use the snippet below, replacing “project-url” with the GitHub repository URL. Alternatively, we can install the Gitpod browser extension.

[![Open in Gitpod](https://gitpod.io/button/open-in-gitpod.svg)](https://gitpod.io/#<project-url>)

Ex: [![Open in Gitpod](https://gitpod.io/button/open-in-gitpod.svg)](https://gitpod.io/#https://github.com/nsanka/RecSys)
Open in Gitpod button in README
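
If you maintain several repositories, the button Markdown can also be generated rather than typed by hand. The snippet below is a minimal sketch, not part of the project code: the function name `gitpod_button` and the constant `GITPOD_BADGE` are made up for illustration, and the only assumption is the `https://gitpod.io/#<project-url>` pattern shown above.

```python
from urllib.parse import urlparse

# Badge image URL taken from the README snippet above.
GITPOD_BADGE = "https://gitpod.io/button/open-in-gitpod.svg"


def gitpod_button(repo_url: str) -> str:
    """Return the README Markdown for an 'Open in Gitpod' button.

    Hypothetical helper: it only prefixes the repository URL with
    'https://gitpod.io/#', following the pattern shown in the article.
    """
    parsed = urlparse(repo_url)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"expected a full https repository URL, got {repo_url!r}")
    return f"[![Open in Gitpod]({GITPOD_BADGE})](https://gitpod.io/#{repo_url})"


if __name__ == "__main__":
    print(gitpod_button("https://github.com/nsanka/RecSys"))
```

Running the script prints the same line as the “Ex:” snippet above, ready to paste into the README.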

Project Specific Customization

Click the Add file -> Create new file button in the repository’s main folder and enter “.gitpod.yml” in the file name box. This file customizes the workspace environment for anyone who opens this project in Gitpod. More details on configuring Gitpod workspaces can be found here.

# Custom Docker Image
#image:
#  file: .gitpod.Dockerfile

# List the start up tasks.
# Learn more https://www.gitpod.io/docs/config-start-tasks/
tasks:
  - name: Check Setup
    init: |
      python -m pip install --upgrade pip
      # Add commands to Setup Python Environment
    command: |
      clear
      echo "=============="
      echo " Welcome "
      echo "=============="
      pyenv versions
      echo ""

# List the ports to expose.
# Learn more https://www.gitpod.io/docs/config-ports/
ports:
  # jupyter
  - port: 8888
    onOpen: ignore

# Install VSCode Extensions
vscode:
  extensions:
    - ms-azuretools.vscode-docker
    - ms-python.python

Setup Python Environment

Gitpod workspaces already have Python installed, along with pyenv, which can be used to manage Python versions. We will create a new file named “requirements.txt” in the repository’s main folder to define all the Python packages we need for this project.

# requirements.txt file
altair==4.1.0
matplotlib==3.5.0
numpy==1.19.5
openTSNE==0.6.1
pandas==1.2.5
pip==21.3.1
plotly==5.4.0
requests==2.25.1
scikit-learn==0.24.2
scipy==1.7.3
spotipy==2.19.0
streamlit==1.2.0
seaborn==0.11.2
tqdm==4.62.3
urllib3==1.26.7
wordcloud==1.8.1

We add the following commands to the “tasks -> init” section of the “.gitpod.yml” file to install the required Python version and the modules listed in “requirements.txt” using pip.

# Install Python 3.7.2
pyenv install -v 3.7.2
# Set Python 3.7.2 as default
pyenv global 3.7.2
# Install all libraries
python -m pip install -r requirements.txt
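
Once the init task has finished, it can be worth confirming that the workspace really ended up with the pinned versions. The script below is a rough sketch under stated assumptions, not part of the project: the helper names `parse_pins`, `find_mismatches`, and `installed_version` are hypothetical, it only understands plain `name==version` lines like the ones in “requirements.txt” above, and on Python 3.7 it assumes the `importlib-metadata` backport is available.

```python
import sys
from pathlib import Path
from typing import Callable, Dict, List, Optional


def parse_pins(text: str) -> Dict[str, str]:
    """Collect 'name==version' pins, skipping blank lines and comments."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "==" in line:
            name, pinned = line.split("==", 1)
            pins[name.strip()] = pinned.strip()
    return pins


def find_mismatches(
    pins: Dict[str, str], lookup: Callable[[str], Optional[str]]
) -> List[str]:
    """Describe every pin whose installed version is missing or different."""
    problems = []
    for name, pinned in pins.items():
        found = lookup(name)
        if found is None:
            problems.append(f"{name}: not installed (want {pinned})")
        elif found != pinned:
            problems.append(f"{name}: installed {found}, want {pinned}")
    return problems


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:  # standard library on Python 3.8+
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:  # assumption: backport installed on Python 3.7
        from importlib_metadata import PackageNotFoundError, version
    try:
        return version(name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    path = Path("requirements.txt")
    pins = parse_pins(path.read_text()) if path.exists() else {}
    for problem in find_mismatches(pins, installed_version):
        print(problem)
```

Run it from the repository’s main folder in the workspace terminal. It prints the interpreter version followed by one line per package that is missing or differs from its pin; no extra lines means everything matches. Versions are compared as plain strings, which is enough for exact pins like these.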

Container Specific Customization

This step is optional. By default, if we don’t specify a Dockerfile in the “.gitpod.yml” file, Gitpod creates the workspace from a standard Docker image called “Workspace-Full”. This image includes tools such as Docker, Go, Java, Node.js, C/C++, Python, Ruby, Rust, and PHP, as well as Homebrew, Tailscale, Nginx, and several more.

If you want to build a custom container, click the Add file -> Create new file button in the repository’s main folder and enter “.gitpod.Dockerfile” in the file name box. To make Gitpod use it, uncomment the “image” and “file” lines at the top of “.gitpod.yml”. Your custom container can contain anything you want. This is very useful if you are using a framework or SDK that is not present in the standard image, or if you have to install a specific package. I recommend using the standard Workspace-Full image as the base image and building on top of it, as shown below. If you would like to start from a different base, check the pre-built container configurations for Gitpod in the workspace-images repository.

FROM gitpod/workspace-full:latest

# [Optional] Uncomment this section to install additional OS packages.
# RUN apt-get update && export DEBIAN_FRONTEND=noninteractive \
#     && apt-get -y install --no-install-recommends <your-package-list-here>

Open Workspace

Everything is now set up and ready to start the Gitpod workspace. Click the Open in Gitpod button; Gitpod will ask you to log in to your Gitpod account. We use GitHub as the provider, and the workspace opens in a new tab.

Login to Gitpod

It takes a few seconds, and then you will see your workspace in the browser, as shown in the image below. The image also shows how to open a terminal.

This completes the development environment setup for this project using Gitpod workspaces. We can now quickly open the project code in a predefined environment, then build and test it.

Next Step

In the next article, we will see how to get the music dataset and perform exploratory data analysis.

If you enjoy reading my articles and want to support me, please consider signing up to become a Medium member. It’s $5 a month and gives you unlimited access to stories on Medium. Please sign up using my link to support me: https://nsanka.medium.com/membership.
