
Spectrogram of different genres of Music

My work is an extension of Pankaj Kumar’s work, which can be found here. Instead of a feed-forward neural network, I used a pre-trained ResNet model (transfer learning) to achieve better accuracy. (Thanks a lot, Pankaj Kumar.)

You can download the dataset from here. The dataset consists of 1,000 audio tracks, each 30 seconds long. It contains 10 genres, each represented by 100 tracks. The tracks are all 22050 Hz monophonic 16-bit audio files in .wav format.
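Before the tracks can be fed to an image model like ResNet, each one is turned into a spectrogram. A minimal NumPy sketch of that step, using a synthetic 440 Hz tone in place of an actual .wav file (the windowing and FFT parameters here are illustrative assumptions, not the original post's settings):

```python
import numpy as np

def spectrogram(signal, n_fft=1024, hop=512):
    """Magnitude spectrogram via a Hann-windowed short-time FFT."""
    window = np.hanning(n_fft)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = signal[start:start + n_fft] * window
        # rfft keeps only the non-negative frequency bins.
        frames.append(np.abs(np.fft.rfft(frame)))
    return np.array(frames).T  # shape: (n_fft // 2 + 1, n_frames)

sr = 22050                               # sample rate of the dataset's tracks
t = np.arange(sr * 3) / sr               # 3 seconds of audio (real tracks are 30 s)
signal = np.sin(2 * np.pi * 440.0 * t)   # stand-in 440 Hz tone

spec = spectrogram(signal)
print(spec.shape)  # (513, 128)
```

In practice a library such as librosa is usually used for this, and the magnitudes are mapped to a log/mel scale before being saved as images.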

The 10 genres are:

  • Blues
  • Classical
  • Country
  • Disco
  • Hip-hop
  • Jazz
  • Metal
  • Pop
  • Reggae
  • Rock

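For training, each genre name needs an integer class label. A stdlib-only sketch of that mapping; the alphabetical index order and the filename format (GTZAN files are named like `jazz.00005.wav`) are assumptions about the original code, not taken from it:

```python
GENRES = ["blues", "classical", "country", "disco", "hiphop",
          "jazz", "metal", "pop", "reggae", "rock"]

# Map genre name -> class index and back.
genre_to_idx = {g: i for i, g in enumerate(GENRES)}
idx_to_genre = {i: g for g, i in genre_to_idx.items()}

def label_from_filename(name: str) -> int:
    # GTZAN filenames put the genre in the first dot-separated field,
    # e.g. "jazz.00005.wav" -> "jazz".
    return genre_to_idx[name.split(".")[0]]

print(label_from_filename("jazz.00005.wav"))  # 5
```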
Let’s start:

!mkdir genres && wget http://opihi.cs.uvic.ca/sound/genres.tar.gz && tar…

Aryan Khatana

Student, Machine Learning Enthusiast.
