Create your first equity portfolio with Python

Matteo Gambera
4 min read · May 4, 2020

This short article covers the first steps needed to start analyzing an equity portfolio with Python:

  • Download share prices
  • Calculate the correlation between stocks
  • Calculate the portfolio’s return and volatility

How to download share prices?

To download data easily and for free, you can rely on Yahoo Finance.

To do this, first import pandas, which provides the DataFrame used to store the data.

import pandas as pd

You also need a function to download the data.

from pandas_datareader import data as wb

Now you can download the share prices that interest you.
Create a list of ticker symbols, define the time period you want to extract, and create an empty DataFrame to store everything.

start_date = '2013-01-01'
end_date = '2019-01-01'
assets = ['T', 'WMB', 'IRM', 'MO', 'PM','D','PGR','GIL','IBM','VZ']
# Empty DataFrame that will hold one column of prices per ticker
pf_data = pd.DataFrame()
# Fill the DataFrame with the adjusted close of each ticker
for ticker in assets:
    pf_data[ticker] = wb.DataReader(ticker, data_source='yahoo', start=start_date, end=end_date)['Adj Close']

What is the ‘adjusted close’?

Adjusted close is the closing price after adjustments for all applicable splits and dividend distributions. Data is adjusted using appropriate split and dividend multipliers, adhering to Center for Research in Security Prices (CRSP) standards.

How to view data?

An easy way to view the data and verify that it is correct is to use the head() and tail() methods, which show the first five and last five rows of the pf_data DataFrame.

pf_data.head()
pf_data.tail()

It is useful to plot the prices to see how the different stocks in the portfolio have performed over time.
To compare them correctly, you must first normalize them.
The normalization used here is very simple: divide each stock’s price by its first available value (in this case, that of 2013-01-02) and multiply by 100.
This way all stocks start from the same “point”, 100.

(pf_data/pf_data.iloc[0]*100).plot(figsize=(10,5), grid= True);
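To make the normalization concrete, here is a minimal sketch on a tiny hand-made DataFrame. The tickers AAA and BBB and their prices are invented for illustration and are not part of the portfolio above.

```python
import pandas as pd

# Hypothetical prices for two made-up tickers (not real market data)
prices = pd.DataFrame(
    {"AAA": [50.0, 55.0, 60.0], "BBB": [200.0, 190.0, 210.0]},
    index=pd.to_datetime(["2013-01-02", "2013-01-03", "2013-01-04"]),
)

# Divide every row by the first row, then scale so each series starts at 100
normalized = prices / prices.iloc[0] * 100

print(normalized)
```

Although AAA and BBB trade at very different price levels, after normalization both start at 100, so their lines on a chart can be compared directly.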

How do I evaluate the correlation between stocks?

The next calculations require numpy.

To plot the results you also need matplotlib and seaborn.

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

I use logarithmic returns to calculate the daily return of each stock. If you want to understand the reasons for choosing the logarithm, I recommend this article.

log_returns = np.log(pf_data/pf_data.shift(1))
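As a small illustration of what this line computes, the sketch below applies the same formula to three invented prices and compares the result with simple percentage returns. One convenient property of log returns is that they add up over time: their sum equals the log of the total price change.

```python
import numpy as np
import pandas as pd

# Hypothetical price series (invented numbers, not market data)
prices = pd.Series([100.0, 110.0, 99.0], name="AAA")

# Same formula as above: log of today's price over yesterday's price
log_ret = np.log(prices / prices.shift(1))

# Simple (percentage) returns, for comparison
simple_ret = prices.pct_change()

# The first value is NaN because there is no previous day to compare with;
# sum() skips NaN, so this is the sum of the two daily log returns
total_log = log_ret.sum()

print(log_ret)
print(simple_ret)
print(total_log, np.log(prices.iloc[-1] / prices.iloc[0]))
```

The two numbers on the last line should match, which is not the case if you sum simple returns.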

To evaluate the correlation between different stocks, we apply the pandas corr() method to log_returns.
Here I use a threshold of 0.7 as the center of the color scale, so that strong correlations stand out in red.

threshold = 0.7
correlation = log_returns.corr()
# Mask the upper triangle (diagonal included) so each pair appears only once
mask = np.zeros_like(correlation, dtype=bool)
mask[np.triu_indices_from(mask)] = True
plt.subplots(figsize=(8, 8))
sns.heatmap(correlation, center=threshold, annot=True, mask=mask);
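If you prefer a list to a heatmap, the pairs above the threshold can also be extracted directly from the correlation matrix. This is only a sketch on randomly generated returns: the tickers AAA, BBB and CCC are invented, and BBB is deliberately built to follow AAA closely.

```python
import numpy as np
import pandas as pd

# Synthetic daily log returns for three made-up tickers (assumption: random data)
rng = np.random.default_rng(0)
base = rng.normal(0, 0.01, 500)
log_returns_demo = pd.DataFrame({
    "AAA": base,
    "BBB": base + rng.normal(0, 0.002, 500),  # closely tracks AAA
    "CCC": rng.normal(0, 0.01, 500),          # independent of the others
})

threshold = 0.7
corr = log_returns_demo.corr()

# Keep only the strictly lower triangle so each pair is listed once
lower_triangle = np.tril(np.ones(corr.shape, dtype=bool), k=-1)
pairs = corr.where(lower_triangle).stack()

# Pairs whose correlation exceeds the threshold
strong_pairs = pairs[pairs > threshold]
print(strong_pairs)
```

With real data you would pass log_returns instead of log_returns_demo.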

How to calculate the portfolio’s return and volatility?

Before calculating return and volatility, it is necessary to define how the portfolio is composed. In this example I assume that the weights are equally distributed among all assets, so each stock receives 10% of the invested capital.

#                      T  WMB IRM MO  PM  D   PGR GIL IBM VZ
pf_weights = np.array([.1, .1, .1, .1, .1, .1, .1, .1, .1, .1])
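If you change the number of tickers, writing the weights by hand becomes error-prone. A possible alternative, shown here as a sketch, is to build the equal weights from the length of the assets list; the name equal_weights is introduced only for this example.

```python
import numpy as np

assets = ['T', 'WMB', 'IRM', 'MO', 'PM', 'D', 'PGR', 'GIL', 'IBM', 'VZ']

# One identical weight per asset, so that the weights always sum to 1
equal_weights = np.full(len(assets), 1 / len(assets))

print(equal_weights)
print(equal_weights.sum())
```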

The concepts behind the formulas explained here are used to calculate return and volatility.
The printed result represents the portfolio’s average annual return and average annual volatility over the historical period defined above.
To annualize the daily figures I use 250, which approximates the number of days in a year on which the market is open.

returns = np.around(np.sum(pf_weights * log_returns.mean()) * 250, 4)
volatility = np.around(np.sqrt(np.dot(pf_weights.T, np.dot(log_returns.cov() * 250, pf_weights))), 4)
print(returns)
print(volatility)
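To see why the matrix formula gives the portfolio volatility, the sketch below repeats the calculation on randomly generated returns and cross-checks it against the standard deviation of the weighted daily return series. The tickers, the weights and the data are all invented for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic daily log returns for three made-up tickers (assumption: random data)
rng = np.random.default_rng(42)
demo_returns = pd.DataFrame(
    rng.normal(0.0004, 0.01, size=(750, 3)), columns=["AAA", "BBB", "CCC"]
)
w = np.array([0.5, 0.3, 0.2])  # hypothetical weights summing to 1
trading_days = 250

# Same formulas as above: weighted mean return and w' * Cov * w, both annualized
annual_return = np.sum(w * demo_returns.mean()) * trading_days
annual_vol = np.sqrt(w @ (demo_returns.cov().to_numpy() * trading_days) @ w)

# Cross-check: build the weighted daily return series and annualize its statistics
portfolio_daily = demo_returns.dot(w)
check_return = portfolio_daily.mean() * trading_days
check_vol = portfolio_daily.std() * np.sqrt(trading_days)

print(round(annual_return, 4), round(check_return, 4))
print(round(annual_vol, 4), round(check_vol, 4))
```

Both pairs of numbers should agree, because the variance of a weighted sum of returns is exactly the weights multiplied on both sides of the covariance matrix.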

This script and this article are very basic: my aim was to give a brief introduction to how a few lines of code can download data and perform a simple analysis.
In the next articles I will explain:

  • How to do a Monte Carlo analysis in Python to evaluate the best combination of weights for different types of portfolio.
  • How to extract and calculate the dividends obtained from the chosen portfolio.
  • How to obtain and analyze financial and balance sheet data of the companies in the portfolio for a fundamental analysis.

The script used is available on GitHub.


Written by Matteo Gambera

Automation Engineer & Founder @Stema. I talk about data applied to problems and decisions, and I also write about startups and team management.