Using MySQL to design a database and perform a descriptive analysis on music database.
Hello and welcome to the MySQL personal project!
In this post, I will design a database about music albums, bands, and songs. After database is created, I inserting a data on each table. Then, I address some of the SQL Query problems to get better understanding of the information.
Here is the preparation required to help finish the MySQL project.
To make this project completed, I use SQL for cleaning and analysis. There are many SQL editors where you can design and manipulate a database. For this particular project, I used MySQL Workbench.
The dataset I used in this project is open-source data without any PII (Personally Identifiable Information). I obtain from my favorite web development skills YouTuber, WebDevSimplified, and you can watch all of their videos hereThis music dataset contains
This music dataset contains detailed information about bands, albums, and songs. This includes three tables that have more than 180 rows. Because the value is being inserted into the table, this dataset is ideal for designing a database from scratch.
In this section, I will explain how to design a database by inserting a few tables and add a value in each column.
record_company
to our MySQL schema.CREATE DATABASE record_company;
USE record_company;
record_company
database is already in use, it will appear in the schema with bold text.After creating a database named record_company
, we try to add a few tables including bands
, albums
, and songs
tables. Here are the steps:
bands
table to our record_company
database. This includes three columns:
id
to uniquely identify band data input,
name
for the band’s name, and
make id
our table identifier or Primary Key.
CREATE TABLE bands (
id INT NOT NULL AUTO_INCREMENT,
VARCHAR(255) NOT NULL,
name PRIMARY KEY (id)
);
albums
table including information on albums they have produced. This includes four columns:
id
to uniquely identify albums data input,
name
for the album’s name,
release_year
for the album’s year released,
band_id
for uniquely identifying the band (same as id on bands
table), and
Make the id
of the albums
table the primary key.
Connect band_id
column from albums
table to the id
column from bands
table.
CREATE TABLE albums (
id INT NOT NULL AUTO_INCREMENT,
VARCHAR(255) NOT NULL,
name INT,
release_year INT NOT NULL,
band_id PRIMARY KEY (id),
FOREIGN KEY (band_id) REFERENCES bands(id)
);
songs
table including information on songs they have produced. This includes four columns:
id
to uniquely identify songs data input,
name
for the songs’ name,
length
for the duration of the song,
album_id
for uniquely identifying the album (same as id on albums
table).
Make the id
of the songs
table the primary key.
Connect album_id
column from songs
table to the id
column from albums
table.
CREATE TABLE songs (
id INT NOT NULL AUTO_INCREMENT,
VARCHAR(255) NOT NULL,
name length FLOAT NOT NULL,
INT NOT NULL,
album_id PRIMARY KEY (id),
FOREIGN KEY (album_id) REFERENCES albums(id)
);
Because I already add three tables, it will appear in the schema like this.
After designing a few tables in the database, I insert a number of values into the table. Here, I will use an INSERT INTO
function into bands
table.
INSERT INTO bands(id,name) VALUES (1,'Seventh Wonder');
INSERT INTO bands(id,name) VALUES (2,'Metallica');
INSERT INTO bands(id,name) VALUES (3,'The Ocean');
INSERT INTO bands(id,name) VALUES (4,'Within Temptation');
INSERT INTO bands(id,name) VALUES (5,'Death');
INSERT INTO bands(id,name) VALUES (6,'Van Canto');
INSERT INTO bands(id,name) VALUES (7,'Dream Theater');
Because the dataset has a large number of rows, I will include another table in this file thanks to WebDevSimplified.
In this section, I will use some of SQL Query to answer a few question. This will helps us to get better understanding of the dataset.
SELECT bands.name AS 'Band Name'
FROM bands;
Here are all the bands that included in this dataset.
SELECT name AS 'Album Name', release_year FROM albums
WHERE release_year IS NOT NULL
ORDER BY release_year
LIMIT 1;
Here is the oldest album in this dataset.
SELECT DISTINCT bands.name AS 'Band Name' FROM bands
RIGHT JOIN albums ON albums.band_id = bands.id;
Here are the list of all bands that have released at least one album in this dataset.
SELECT b.name AS 'Band Name' FROM bands AS b
LEFT JOIN albums AS a ON a.band_id = b.id
GROUP BY a.band_id
HAVING COUNT(a.id) = 0;
Here are the list of all bands that have no album in this dataset.
SELECT a.name 'Album Name', a.release_year AS 'Release Year', SUM(s.length) AS 'Duration' FROM songs AS s
LEFT JOIN albums AS a ON s.album_id = a.id
WHERE a.release_year IS NOT NULL
GROUP BY a.id
ORDER BY SUM(s.length) DESC
LIMIT 1;
Here are the band with the longest album duration.
Here are the steps to updating the album data with no release year:
SELECT name, release_year FROM albums
WHERE release_year IS NULL;
UPDATE albums
SET release_year = 1982
WHERE release_year IS NULL;
SELECT name, release_year FROM albums
WHERE release_year IS NULL;
release_year
column, the result will be empty.Here are the steps to updating the album data with no release year:
SELECT id, name, release_year FROM albums
WHERE release_year = 1982;
id
column.DELETE FROM albums
WHERE id = 4;
Re-check again the value that has been deleted.
If there’s any problem with deleting the value, go refer to this discussion.
SELECT AVG(length) AS 'Average Song Duration (in min)'
FROM songs;
Here are the average duration of all song.
SELECT a.name AS 'Album Name', s.name AS 'Song Name', a.release_year AS 'Release Year', MAX(s.length) AS 'Duration' FROM albums AS a
INNER JOIN songs as s ON a.id = s.album_id
GROUP BY a.id;
Here are the longest song of each album with its release year.
SELECT b.name AS 'Band Name', COUNT(s.id) AS 'Number of Songs'
FROM bands AS b
JOIN albums AS a ON a.band_id = b.id
JOIN songs AS s ON s.album_id = a.id
GROUP BY b.name;
Here are the total songs counted for each band.
If you see mistakes or want to suggest changes, please create an issue on the source repository.
For attribution, please cite this work as
Hadi (2023, Feb. 15). DataHadi: Design a Music Database using MySQL. Retrieved from https://miftahulhadii.github.io/index.html/posts/2023-02-15-design-a-music-album-database-using-MySQL/
BibTeX citation
@misc{hadi2023design, author = {Hadi, Miftahul}, title = {DataHadi: Design a Music Database using MySQL}, url = {https://miftahulhadii.github.io/index.html/posts/2023-02-15-design-a-music-album-database-using-MySQL/}, year = {2023} }