View on GitHub

Nilesh Ingle

Github Projects

6. SQL

Posted: May 19, 2016; Updated: May 20, 2016

This project focusses on use of SQL to extract data from a clean database for analysis. The database was imported in Oracle SQL Developer (Version 4.1.3.20). The SQL script was written in SQL Developer for extracting data and creating charts. The objective in the project was to first ask a random question based on available tables in database. Then write a script i.e. query in SQL to extract that information. Finally, one or two charts or plots were created from the report-generator in SQL Developer. The complete script for all the queries are uploaded in Github repository.

Source of Data:
The 'Chinook' database analyzed was downloaded from CodePlex.

Questions asked:
  1. Find customers who purchased 'Aerosmith' tracks?
  2. Which states and countries have purchases for 'Metallica' tracks?
  3. Which month had the highest number of orders or any tracks?
  4. Which year had the highest number of orders or any tracks?
  5. What are the top trending mediat types?
  6. What are the top 'Genre'?
  7. What is the lowest priced media type?
  8. What are the highest buying customers?
  9. Which are the top earning tracks?
  10. Which are the top grossing artists?
  11. Which are the top grossing albums?
  12. In which country did the employees primarily help most customers?

Possible answers:
  1. Shown in sample code below
  2. USA, Denmark, Canada, United Kingdom etc.
  3. The highest orders were for the month of January of 34.
  4. The years 2009 to 2012 had highest number of 83 orders.
  5. The top trending media type was 'MPEG audio file' with 1976 orders.
  6. The top trending Genre was 'Rock' with 835 orders.
  7. The lowest priced media types were 'AAC audio file', 'MPEG audio file', 'Protected AAC audio file'. 'Purchased AAC audio file'. All had unit price of $0.99.
  8. The highest purchases were made by the customer named 'Helena Holy' with a total amount of $49.62.
  9. The top earning track was 'The Trooper' grossing $4.95.
  10. The top grossing artist was 'Iron Maiden' at $138.6.
  11. The top grossing album was 'Battlestar Galactica (Classic), Season 1' at $35.82.

Sample code:

-- Find customers who purchased 'Aersosmith' tracks
SELECT album.title, artist.name, track.name, customer.firstname, customer.lastname
FROM album 
  INNER JOIN artist
    ON album.artistid = artist.artistid
      INNER JOIN track
        ON album.albumid = track.albumid
          INNER JOIN invoiceline
            ON track.trackid = invoiceline.trackid
              INNER JOIN invoice
                ON invoiceline.invoiceid = invoice.invoiceid
                  INNER JOIN customer
                    ON invoice.customerid = customer.customerid
WHERE artist.name = 'Aerosmith';
--Query result:
  TITLE     NAME           TRACK        FIRSTNAME  LAST NAME
Big Ones  Aerosmith Love In An Elevator    Daan  Peeters
Big Ones  Aerosmith Rag Doll               Phil  Hughes
Big Ones  Aerosmith What It Takes          Ellie Sullivan
Big Ones  Aerosmith Janie's Got A Gun      Daan  Peeters
Big Ones  Aerosmith Amazing                Heather Leacock
Big Ones  Aerosmith Blind Man              Phil  Hughes
Big Ones  Aerosmith Deuces Are Wild        Ellie Sullivan
Big Ones  Aerosmith Deuces Are Wild        Daan  Peeters
Big Ones  Aerosmith Angel                  Daan  Peeters
Big Ones  Aerosmith Livin' On The Edge     Phil  Hughes
-- Which are the top grossing artists?
SELECT artist.name, SUM(invoiceline.unitprice) AS Price
FROM artist
  INNER JOIN album
    ON artist.artistid = album.artistid
      INNER JOIN track
        ON album.albumid = track.albumid
          INNER JOIN invoiceline
            ON track.trackid = invoiceline.trackid
GROUP BY artist.name
ORDER  BY SUM(invoiceline.unitprice) DESC;
-- In which country did the employees primarily help most customers?
SELECT CONCAT(employee.firstname, CONCAT(' ', employee.lastname)) AS employee_name, 
customer.country, Count(*)
FROM employee
  INNER JOIN customer
    ON employee.employeeid = customer.supportrepid
      INNER JOIN INVOICE
        ON customer.customerid = invoice.customerid
GROUP BY CONCAT(employee.firstname, CONCAT(' ', employee.lastname)), customer.country 
ORDER BY employee_name, Count(*) DESC;
--Query result (partial only)
  EMPLOYEE_NAME COUNTRY COUNT(*)
Jane Peacock  Canada          35
Jane Peacock  USA             21
Jane Peacock  France          14
Jane Peacock  Brazil          14
Jane Peacock  Germany         14
Jane Peacock  United Kingdom  14
Jane Peacock  India           13
Jane Peacock  Finland          7
Jane Peacock  Hungary          7
Jane Peacock  Ireland          7
Margaret Park USA             42
Margaret Park Brazil          14
Margaret Park Portugal        14
Margaret Park France          14
Margaret Park Belgium          7
Margaret Park Argentina        7
Margaret Park Denmark          7
Margaret Park Australia        7
Margaret Park Canada           7
Margaret Park Poland           7
Margaret Park Czech Republic   7
Margaret Park Norway           7
Steve Johnson USA             28
(continued...)

Figure gallery:
Plot_1
Figure 1: (Question 10) Top grossing artist.

Plot_2
Figure 2: (Question 12) Employee helped customers in countries.