This is a continuation of my work on the codeacademy data science independant project #1: Watching the Stock Market.

My first post on the subject explores how I put gathered the stock data.
In my second post, I explored the data using basic sql queries (and put my sql code out into the world for the first time!)

After completing the basic challenges from my last post on the topic, there are intermediate and advanced challenges too while playing with these stocks data.

Let’s jump into the challenges!

Intermediate Challenges

1. Use aggregate functions to look at key statistics about the data (e.g., min, max, average).

SELECT ROUND(AVG(price), 2) AS average_price, 
       ROUND(MIN(price), 2) AS min_price, 
       ROUND(MAX(price), 2) AS max_price 
FROM stocks;

Table 1: 1 records
average_price	min_price	max_price
189.5	87.5	377.36

At the end of my last post, I mentioned that it may be possible that I picked… well expensive stocks. at an average cost of nearly $190 per share, I’d definitely say these aren’t cheap!

2. Group the data by stock and repeat. How do the stocks compare to each other?

So, what’s the variability in stock price within each stock?

SELECT name, 
       ROUND(AVG(price), 2) AS average_price, 
       ROUND(MIN(price), 2) AS min_price, 
       ROUND(MAX(price), 2) AS max_price 
FROM stocks
GROUP BY name
ORDER BY average_price;

Table 2: 5 records
name	average_price	min_price	max_price
Electronic Arts	93.73	87.50	102.72
Spotify	140.56	121.58	155.38
NVIDIA	159.38	133.78	182.98
Apple	196.11	173.30	210.94
Boeing	357.74	337.37	377.36

So, EA’s stock is lower than all the others and never even overlaps the stock price of any of the other stocks. On the other end of the spectrum is Boeing - more expensive than all the others by leaps over $100 dollars a share! Spotify and Nvidia are closer in price to each other than Nvidia and Apple but Spotify isn’t as expensive as Nvidia.

3. Group the data by day or hour of day. Does day of week or time of day impact prices?

SELECT date, 
       ROUND(AVG(price), 2) AS average_price
FROM stocks
GROUP BY date
ORDER BY average_price DESC;

Table 3: Displaying records 1 - 10
date	average_price
5/3/19	199.34
5/1/19	199.20
7/23/19	198.91
5/2/19	198.42
7/24/19	198.18
5/6/19	197.51
7/22/19	197.37
4/30/19	197.33
7/12/19	196.27
7/15/19	196.12

Well, I don’t know about day of the week but it does look like that things were looking particularly good at the beginning of may!

4. Which of the rows have a price greater than the average of all prices in the dataset?

SELECT *
FROM stocks
WHERE price > 189.5;

Table 4: Displaying records 1 - 10
symbol	name	date	price
APPL	Apple	4/30/19	199.9002
APPL	Apple	5/1/19	209.7124
APPL	Apple	5/2/19	208.3477
APPL	Apple	5/3/19	210.9377
APPL	Apple	5/6/19	207.6802
APPL	Apple	5/7/19	202.0818
APPL	Apple	5/8/19	202.1216
APPL	Apple	5/9/19	199.9500
APPL	Apple	5/10/19	197.1800
APPL	Apple	5/15/19	190.9200

A lot of rows. A lot of rows have a price greater than the average price (110 to be precise). If you look at the table from challenge 2, you’ll see that the only stocks above the average are from Apple and Boeing.

Advanced Challenge

1. Calculate other key statistics, like median or variance.

SELECT AVG((stocks.price - sub.a) * (stocks.price - sub.a)) AS variance
FROM stocks, 
    (SELECT AVG(price) AS a 
     FROM stocks) AS sub;

Table 5: 1 records
variance
8257.67435647567

SELECT price AS median
FROM stocks
ORDER BY price
LIMIT 1
OFFSET (SELECT COUNT(price)
        FROM stocks) / 2;

Table 6: 1 records
median
159.398056

2. Refactor the data into 2 tables

We’ll refactor the stocks table into stock_info to store general info about the stock itself (ie. symbol, name) and stock_prices to store the collected data on price (ie. symbol, date, price).

CREATE TABLE stock_info
AS SELECT DISTINCT symbol, name 
   FROM stocks;

CREATE TABLE stock_prices
AS SELECT symbol, date, price
   FROM stocks;

SELECT *
FROM stock_info;

(#tab:show_stock_info)5 records
symbol	name
APPL	Apple
BA	Boeing
EA	Electronic Arts
SPOT	Spotify
NVDA	NVIDIA

SELECT *
FROM stock_prices;

(#tab:show_stock_prices)Displaying records 1 - 10
symbol	date	price
APPL	4/30/19	199.9002
APPL	5/1/19	209.7124
APPL	5/2/19	208.3477
APPL	5/3/19	210.9377
APPL	5/6/19	207.6802
APPL	5/7/19	202.0818
APPL	5/8/19	202.1216
APPL	5/9/19	199.9500
APPL	5/10/19	197.1800
APPL	5/13/19	185.7200

3. Now join the 2 tables in order to view more information on the stock with each row of price.

SELECT stock_prices.symbol, date, price, name
FROM stock_prices
LEFT JOIN stock_info ON stock_info.symbol = stock_prices.symbol;

Table 7: Displaying records 1 - 10
symbol	date	price	name
APPL	4/30/19	199.9002	Apple
APPL	5/1/19	209.7124	Apple
APPL	5/2/19	208.3477	Apple
APPL	5/3/19	210.9377	Apple
APPL	5/6/19	207.6802	Apple
APPL	5/7/19	202.0818	Apple
APPL	5/8/19	202.1216	Apple
APPL	5/9/19	199.9500	Apple
APPL	5/10/19	197.1800	Apple
APPL	5/13/19	185.7200	Apple

Awesome! Now on to the next set of SQL challenges in another post!

DROP TABLE stock_prices;

DROP TABLE stock_info;

Working with stocks: Intermediate and Advanced Challenges