Portfolio Optimization

Published:

Code for portfolio optimization using Modern Portfolio Theory.

Modern Portfolio Theory (MPT)

Modern Portfolio Theory (MPT), developed by Harry Markowitz in the 1950s, gives you a way to construct portfolios that maximize expected return for a given risk level (or equivalently, minimize risk for a given return). The core idea is that diversification across imperfectly correlated assets can reduce portfolio variance below the variance of any individual holding.

In this notebook we implement portfolio optimization from scratch and then use packages to automate it. We also compare different volatility estimation methods and see how they affect the results.

Equally-weighted portfolio

import yfinance as yf
import pandas_datareader.data as pdr
import seaborn as sns 
import matplotlib.pyplot as plt
import numpy as np 
import pandas as pd
import quantstats as qs

yf.pdr_override()

# Define a list of stock symbols belonging to tech sectors

stock_symbols = ["META", "TSLA", "X", "MSFT"]

# start date
start_date = '2018-01-01'
# end date
end_date = '2023-01-01'

# Fetch stock data along with sector information
prices_df = pdr.get_data_yahoo(stock_symbols,start_date,end_date)['Adj Close']


# assign equal weights to all assets
weights = np.array(len(stock_symbols) * [1 / len(stock_symbols)])


# returns
returns = prices_df.pct_change()
returns.dropna(inplace=True)


# portfolio returns into a dataframe
portfolio_returns = pd.Series(np.dot(returns,weights.T),index=returns.index)
[*********************100%%**********************]  4 of 4 completed
import logging
logging.getLogger('matplotlib.font_manager').setLevel(level=logging.CRITICAL)

# plot the portfolio performance
qs.plots.snapshot(portfolio_returns,title="Equal weight portfolio's performance",
grayscale=True)

plt.show()

png

A drawdown refers to the peak-to-trough decline in the value of an investment or a trading account, typically expressed as a percentage. It represents the largest loss an investor or fund experiences before a new peak is reached.

# get the metric for the portfolio using bechmark SPY which follows S&P500
# for more detailed discussion on the metric, please refer to Portfolio_analysis.ipynb

qs.reports.metrics(portfolio_returns,benchmark="SPY",mode="basic",prepare_returns=False)

[*********************100%%**********************]  1 of 1 completed
                    Benchmark (SPY)    Strategy
------------------  -----------------  ----------
Start Period        2018-01-03         2018-01-03
End Period          2022-12-30         2022-12-30
Risk-Free Rate      0.0%               0.0%
Time in Market      100.0%             100.0%

Cumulative Return   41.39%             139.13%
CAGR﹪              4.91%              12.82%

Sharpe              0.43               0.66
Prob. Sharpe Ratio  82.94%             93.0%
Sortino             0.59               0.94
Sortino/√2          0.42               0.67
Omega               1.12               1.12

Max Drawdown        -34.1%             -45.84%
Longest DD Days     361                534

Gain/Pain Ratio     0.09               0.12
Gain/Pain (1M)      0.43               0.63

Payoff Ratio        0.89               0.92
Profit Factor       1.09               1.12
Common Sense Ratio  0.92               1.12
CPC Index           0.52               0.56
Tail Ratio          0.84               1.0
Outlier Win Ratio   5.26               2.75
Outlier Loss Ratio  5.08               2.82

MTD                 -6.19%             -12.17%
3M                  5.41%              -11.36%
6M                  0.55%              -12.72%
YTD                 -19.48%            -39.93%
1Y                  -19.91%            -40.54%
3Y (ann.)           3.8%               18.95%
5Y (ann.)           4.23%              11.89%
10Y (ann.)          4.91%              12.82%
All-time (ann.)     4.91%              12.82%

Avg. Drawdown       -2.41%             -5.76%
Avg. Drawdown Days  21                 32
Recovery Factor     1.36               2.62
Ulcer Index         0.09               0.17
Serenity Index      0.4                0.5

Portfolio variance is defined as:

\[\sigma_{port}^2 = weights^{T} \cdot Cov_{port} \cdot weights\]

where \(Cov_{port}\) is the covariance matrix of the portfolio and \(weights\) is the vector of weights for each asset.

# variance matrix 
var_yr = returns.cov() * 252

print(var_yr.shape,weights.T.shape,np.transpose(weights).shape)
# portfolio variance
port_var = weights.T@var_yr@weights

# portfolio volatility
port_vol = np.sqrt(port_var)

print(port_var, port_vol)
(4, 4) (4,) (4,)
0.13078861835379163 0.36164709089634833

In Modern Portfolio Theory (MPT), the goal is typically to either maximize the expected portfolio return for a given level of risk or minimize the portfolio risk for a given level of expected return. The optimization problem involves finding the weights of each asset in the portfolio that achieve these objectives. Here are the general formulas for the optimization problems in MPT:

Maximizing Expected Portfolio Return for a Given Level of Risk:

Objective: \(\text{Maximize} \quad \text{Expected Portfolio Return} = \sum_{i=1}^{n} w_i \cdot \text{Expected Return}_i\)

Subject to: \(\sum_{i=1}^{n} w_i = 1\) \(\text{Portfolio Variance} = \sum_{i=1}^{n} \sum_{j=1}^{n} w_i \cdot w_j \cdot \text{Covariance}_{ij} \leq \text{Target Risk}\)

Where:

  • \(n\) is the number of assets in the portfolio.
  • \(w_i\) is the weight of asset \(i\) in the portfolio.
  • \(\text{Expected Return}_i\) is the expected return of asset \(i\).
  • \(\text{Covariance}_{ij}\) is the covariance between the returns of assets \(i\) and \(j\).
  • The constraint \(\sum_{i=1}^{n} w_i = 1\) ensures that the weights sum up to 1.

Minimizing Portfolio Risk for a Given Level of Expected Return:

Objective: \(\text{Minimize} \quad \text{Portfolio Variance} = \sum_{i=1}^{n} \sum_{j=1}^{n} w_i \cdot w_j \cdot \text{Covariance}_{ij}\)

Subject to: \(\sum_{i=1}^{n} w_i \cdot \text{Expected Return}_i \geq \text{Target Expected Return}\) \(\sum_{i=1}^{n} w_i = 1\)

Where:

  • \(\text{Target Expected Return}\) is the desired level of expected return for the portfolio.

Efficient Frontier Using Monte Carlo

The efficient frontier is a set of optimal portfolios in the risk-return spectrum.

# Efficient frontier
import pandas as pd 


mean_returns = returns.mean()
cov_matrix = returns.cov()
# Number of portfolios to simulate
nsim = 10000
# Risk free rate 
risk_free_rate = 0.018

mean_returns = returns.mean() * 252
covariance_matrix = returns.cov() * 252

# generate random portfolio weights
np.random.seed(42)
weights = np.random.random(size=(nsim, len(stock_symbols)))
weights /=  np.sum(weights, axis=1)[:, np.newaxis]


port_retn = np.dot(weights, mean_returns)
port_vol = []
port_weights = []
# run simulation
for i in range(0, len(weights)):
    vol = np.sqrt(
        np.dot(weights[i].T, np.dot(covariance_matrix, weights[i]))
    )
    port_vol.append(vol)
    port_weights.append(weights[i])

# make an array
portf_vol = np.array(port_vol)

# get sharpe ratio
portf_sharpe_ratio = port_retn / portf_vol


# results df
results_df = pd.DataFrame(
    {"returns": port_retn,
    "volatility": portf_vol,
    "sharpe_ratio": portf_sharpe_ratio}
)

port_df = pd.DataFrame(port_weights,index=results_df.index,columns=stock_symbols)


# get maximum sharpe ratio
idx_maxsharpe = results_df["sharpe_ratio"].idxmax()
df_max_sharpe = results_df.iloc[idx_maxsharpe]
weight_max_sharpe = port_df.iloc[idx_maxsharpe]

# get minimum volatility
idx_minvol = results_df["volatility"].idxmin()
df_min_vol = results_df.iloc[idx_minvol]
weight_min_vol = port_df.iloc[idx_minvol]

npoints = 30
ef_rtn_list = []
ef_vol_list = []

# get range of returns
ef_retns = np.linspace(results_df["returns"].min(),results_df["returns"].max(),npoints)
# round the values
ef_retns = np.round(ef_retns, 2)

# actual returns
portf_rtns = np.round(port_retn, 2)

# for every return value find a minimum volatility
for rtn in ef_retns:
    if rtn in portf_rtns:
        ef_rtn_list.append(rtn)
        matched_ind = np.where(portf_rtns == rtn)
        ef_vol_list.append(np.min(portf_vol[matched_ind]))


# plot the results
fig, ax = plt.subplots()
results_df.plot(kind="scatter", x="volatility",
                    y="returns", c="sharpe_ratio",
                    cmap="RdYlGn", edgecolors="black",
                    ax=ax)

ax.set(xlabel="Volatility",ylabel="Expected Returns",title="Efficient Frontier")


ax.plot(ef_vol_list, ef_rtn_list, "b--",linewidth=3,label='Effiecient frontier')


ax.legend()
plt.show()

png

print("At maximum Sharpe Ratio\n")
print(f"Return:{df_max_sharpe['returns']}")
print(f"Volatility: {df_max_sharpe['volatility']}")
print(f"Portfolio: {weight_max_sharpe.to_dict()}")
print(f"Sharpe ratio: {df_max_sharpe['sharpe_ratio']}")
print("\n")
print("At Minimum risk\n")
print(f"Return:{df_min_vol['returns']}")
print(f"Volatility: {df_min_vol['volatility']}")
print(f"Portfolio: {weight_min_vol.to_dict()}")
print(f"Sharpe ratio: {df_min_vol['sharpe_ratio']}")
print("\n")
At maximum Sharpe Ratio

Return:0.34062078416847674
Volatility: 0.34168156067574446
Portfolio: {'META': 0.00021933368698373237, 'TSLA': 0.7269567018473118, 'X': 0.2574952867649712, 'MSFT': 0.015328677700733293}
Sharpe ratio: 0.996895423606794


At Minimum risk

Return:0.2357896874186083
Volatility: 0.3026424789381148
Portfolio: {'META': 0.0917403347472675, 'TSLA': 0.7961297432004478, 'X': 0.021304457291594758, 'MSFT': 0.09082546476068984}
Sharpe ratio: 0.7791030797987326

Efficient frontier using Scipy

In order to find the portfolio that gives highest returns for a given risk (volatility) or lowest risk (volatility) for a given return we can utilise numerical optimization. Optimization gives the best value of a function subjected to some boundary contraints affecting the target variable.

In the case of porfolio optimization we can express the equations as:

\[min (\omega^T \Sigma \omega)\]

with constraints:

\begin{align} \omega^T \bold{I}=& b
\omega \geq& 0
\omega^{T}\mu =& \mu_p \end{align
}

here \(\omega\) is the weight vector, \(\Sigma\) is the covariance matrix, \(\mu\) is the vector of returns with \(\mu_p\) is the expected portfolio return.

For a range of expected returns we find the optimal weights

Minimum risk

import numpy as np
import scipy.optimize as sco

# function to calculate portfolio return.
def portfolio_return(weights, asset_returns):
    return np.dot(weights, asset_returns)

# function to calculate portfolio volatility.
def portfolio_volatility(weights, covariance_matrix):
    return np.sqrt(np.dot(weights.T, np.dot(covariance_matrix, weights)))

# function to efficient portfolios along the efficient frontier.
def Efficient_frontier(asset_returns, covariance_matrix, target_returns):
    """
    Parameters:
    - asset_returns (array): Array of average returns for each asset.
    - covariance_matrix (array): Covariance matrix of asset returns.
    - target_returns (array): Array of target returns for which efficient portfolios are generated.
    """
    efficient_portfolios = []

    # Number of assets
    n_assets = len(asset_returns)

    # Define bounds for weights (between 0 and 1)
    bounds = tuple((0, 1) for _ in range(n_assets))

    # Initial guess for weights (equally weighted)
    weights0 = n_assets * [1. / n_assets, ]

    # Loop through target returns and find corresponding efficient portfolios
    for target_return in target_returns:
        # Constraints for optimization
        constraints = (
            {"type": "eq", "fun": lambda weights: portfolio_return(weights, asset_returns) - target_return},
            {"type": "eq", "fun": lambda weights: np.sum(weights) - 1}
        )

        # Optimize for minimum volatility
        result = sco.minimize(
            portfolio_volatility,
            weights0,
            args=(covariance_matrix,),
            method="SLSQP",
            constraints=constraints,
            bounds=bounds
        )

        # Save information about the efficient portfolio
        efficient_portfolios.append({
            'Return': target_return,
            'Volatility': result.fun,
            'Weights': result.x.tolist()
        })

    return efficient_portfolios


# define returns range
returns_range = np.linspace(0.025, 0.55, 400)

# covariance matrix
covariance_matrix = returns.cov() * 252

# asset returns
asset_returns = returns.mean() * 252

# run efficient frontier optimization
efficient_portfolios = Efficient_frontier(asset_returns, covariance_matrix, returns_range)

# volatility range
volatility_range = [p['Volatility'] for p in efficient_portfolios]


# plot efficient frontier

fig, ax = plt.subplots()

results_df.plot(kind="scatter", x="volatility",y="returns", c="sharpe_ratio",cmap="RdYlGn", edgecolors="black",ax=ax)

ax.plot(volatility_range, returns_range, "b--", linewidth=3)
ax.set(xlabel="Volatility",ylabel="Expected Returns",title="Efficient Frontier")
plt.show()

png

min_vol_ind = np.argmin(volatility_range)
min_vol_portf_rtn = returns_range[min_vol_ind]
min_vol_portf_vol = efficient_portfolios[min_vol_ind]["Volatility"]
min_vol_weights = efficient_portfolios[min_vol_ind]["Weights"]

min_vol_weights = dict(zip(stock_symbols, min_vol_weights))

print("At Minimum risk\n")
print(f"Return: {min_vol_portf_rtn}")

print(f"Volatility: {min_vol_portf_vol}")
print(f"Sharpe Ratio: {min_vol_portf_rtn / min_vol_portf_vol}")
print(f"weights: {min_vol_weights}")
print(f"Sharpe Ratio: {(min_vol_portf_rtn / min_vol_portf_vol)}")

At Minimum risk

Return: 0.2276315789473684
Volatility: 0.30210659990319516
Sharpe Ratio: 0.7534809865799325
weights: {'META': 0.0937157693026046, 'TSLA': 0.8067261730061059, 'X': 0.0, 'MSFT': 0.09955805769128928}
Sharpe Ratio: 0.7534809865799325

The minimum volatility portfolio is achieved by investing mostly in TESLA, Meta and Microsoft while not investing in X at all.

Maximum Sharpe Ratio


# define negative sharpe ratio for a optimization
def neg_sharpe(weight, avg_rtns, covariance_matrix, risk_free_rate):
    # calculate portfolio return and volatility
    port_ret = np.sum(avg_rtns * weight)
    port_vol = np.sqrt(np.dot(weight.T, np.dot(covariance_matrix, weight)))

    # get sharpe ratio
    port_sr = (
        (port_ret - risk_free_rate) / port_vol)
    
    return -port_sr

# we are searching for the portfolio with maximum sharpe ratio so we need to optimize for negative sharpe ratio

average_returns = returns.mean() * 252
covariance_matrix = returns.cov() * 252
# assume zero risk free rate
risk_free_rate = 0.


args = (average_returns,covariance_matrix,risk_free_rate)

# contraints
constraints = (
    {
        "type": "eq",
        "fun": lambda x: np.sum(x) - 1
    }
)

# bounds for weights
bounds = tuple((0, 1) for stock in range(len(stock_symbols)))

# initial guess
weight0 = np.array(len(stock_symbols) * [1. / len(stock_symbols)])

# minimize the function

max_sharpe = sco.minimize(neg_sharpe,
                                x0=weight0,
                                args=args,
                                method="SLSQP",
                                bounds=bounds,
                                constraints=constraints)

# make dictionary containing results
max_sharpe_port = {

    "returns": portfolio_return(max_sharpe.x,average_returns),
    "volatility": portfolio_volatility(max_sharpe.x,covariance_matrix),
    "sharpe_ratio": -max_sharpe.fun,
    "weights": max_sharpe.x.tolist()

}

# get weights df
max_sharpe_weights = dict(zip(stock_symbols, max_sharpe_port["weights"]))

print("At maximum sharpe ratio\n")
print(f"Return: {max_sharpe_port['returns']}")
print(f"Volatility: {max_sharpe_port['volatility']}")
print(f"Sharpe Ratio: {max_sharpe_port['sharpe_ratio']}")
print(f"weights: {max_sharpe_weights}")

At maximum sharpe ratio

Return: 0.3632417567223674
Volatility: 0.3612470733167341
Sharpe Ratio: 1.0055216596976673
weights: {'META': 0.0, 'TSLA': 0.6746528482243279, 'X': 0.32534715177567214, 'MSFT': 0.0}

The analysis indicates that most weight should be in TESLA and X with no allocation to META and Microsoft

Efficient frontier using convex optimization

Instead of optimizing the portfolio using mean-variance we can pose the problem in risk-adjusted retrun maximization. The problem then becomes:

\begin{align} max( \omega^T\mu -& \gamma \omega^T \Sigma \omega)
\omega^T\bold{I} =& 1
\omega \geq& 0 \end{align
}

Here, \(\gamma \in [0,\infty)\) is risk aversion parameter. Higher value suggest more risk aversion.

This is a convex problem (quadratic objective, linear constraints), so any local minimum is the global minimum. We can solve it with cvxpy.

import cvxpy as cp

# setup the cvxpy variables
weights = cp.Variable(len(stock_symbols))

# risk aversion factor
gamma_para = cp.Parameter(nonneg=True)

# portfolio return
port_ret_cvx = mean_returns.values@weights

# portfolio volatility
port_vol_cvx = cp.quad_form(weights, covariance_matrix.values)

# define function
obj_fun = cp.Maximize(port_ret_cvx - gamma_para * port_vol_cvx)

# define problem and add constraints
prob = cp.Problem(obj_fun,[cp.sum(weights) == 1, weights >= 0])


# place holder for portfolio return, volatility and weights
port_ret_cvx_ef = []
port_vol_cvx_ef = []
weights_ef = []

# define gamma range
gamma_range = np.logspace(-3, 3, num=25)

# loop through gamma
for gamma in gamma_range:
    # set value of gamma
    gamma_para.value = gamma
    # solve problem
    prob.solve()
    # get portfolio return
    port_vol_cvx_ef.append(cp.sqrt(port_vol_cvx).value)
    # get portfolio volatility
    port_ret_cvx_ef.append(port_ret_cvx.value)
    # get portfolio weights
    weights_ef.append(weights.value)


# create dataframe for weights

weights_df = pd.DataFrame(weights_ef, index=np.round(gamma_range,3),columns=stock_symbols)

# plot the weight allocation for different gamma

fig,ax = plt.subplots(1,1)

weights_df.plot(kind='bar',ax=ax,stacked=True,title="Weights per gamma",xlabel="Gamma",ylabel="Weights")

plt.show()

png

For low risk aversion the allocation is 100% to X but as the risk aversion increases the majority of allocation is in TESLA.

# plot efficient frontier
fig,ax = plt.subplots(1,1)
ax.plot(port_vol_cvx_ef, port_ret_cvx_ef, "b-", linewidth=3)

results_df.plot(kind="scatter", x="volatility",y="returns", c="sharpe_ratio",cmap="RdYlGn", edgecolors="black",ax=ax)
<Axes: xlabel='volatility', ylabel='returns'>

png

Maximum allowed Leverage

Leverage lets you borrow to invest more than your own capital. A leverage ratio of 2:1 means for every $1 of your capital, you can invest $2 total. Higher leverage amplifies both gains and losses.

We can add a leverage constraint by bounding the L1 norm of the weight vector:

# define maximum leverage parameter
max_leverage = cp.Parameter()

# define problem with leverage
prob_with_l = cp.Problem(obj_fun, 
    [cp.sum(weights) == 1, 
    cp.norm(weights, 1) <= max_leverage])


# range of leverage
lev_range = [1,3,5]

N = len(gamma_range)
# place holder for portfolio return, volatility and weights
portfolio_return = np.zeros((N,len(lev_range)))
portfolio_volatility = np.zeros((N,len(lev_range)))
portfolio_weights = np.zeros((len(lev_range),N,len(stock_symbols)))


# loop through leverage and gamma
# record the results in the placeholders

for l,lev in enumerate(lev_range):
    for g in range(N):
        # set values of variables
        max_leverage.value = lev
        gamma_para.value = gamma_range[g]
        # solve the equations
        prob_with_l.solve()
        # fill the values in place holders
        portfolio_return[g,l] = port_ret_cvx.value
        portfolio_volatility[g,l] = cp.sqrt(port_vol_cvx).value
        portfolio_weights[l,g,:] = weights.value

# plot the weights for different gamma

fig,ax = plt.subplots(1,1)
ax.set_xlabel(r'Volatility')
ax.set_ylabel(r'Return')

for k,lev in enumerate(lev_range):
    ax.plot(portfolio_volatility[:,k], portfolio_return[:,k], label=f"{lev}")

ax.legend(title="Maximum Leverage")
plt.show()

png

# plot the allocation for different allocation for different leverage and gamma

fig,axes = plt.subplots(len(lev_range),1,sharex=True)

for axi in range(len(axes)):

    weights_df = pd.DataFrame(portfolio_weights[axi], index=np.round(gamma_range,3),columns=stock_symbols)

    weights_df.plot(kind='bar',ax=axes[axi],stacked=True,xlabel="Gamma",ylabel=f"max lev = {lev_range[axi]} \n Weights",legend=None)

axes[0].legend(bbox_to_anchor=(1, 1))
axes[0].set_title("Weights per risk aversion")

plt.show()

png

we note that with an increase in risk aversion, all cases converge to a similar allocation for all levels of the leverage.

Traditional optimization methods depend on inverting covariance matrix. They are also unstable. More correlated assets lead to need of more diversification and which leads to more errors in calculation of portfolio weights.

Hierarchical Risk Parity

Hierarchical Risk Parity (HRP), introduced by Marcos Lopez de Prado, takes a different approach from mean-variance optimization. Instead of inverting the covariance matrix (which is unstable when assets are correlated), it uses hierarchical clustering to group similar assets and then allocates inversely proportional to each cluster’s variance.

The steps:

  1. Dendrogram construction — cluster assets by pairwise correlation into a tree structure.
  2. Recursive bisection — split the tree into sub-clusters at each level.
  3. Inverse variance weighting — within each cluster, weight assets inversely by their variance (lower-risk assets get more weight).
  4. Bottom-up allocation — repeat at each level of the hierarchy, then combine into final portfolio weights.

The main advantage over mean-variance: HRP doesn’t require inverting the covariance matrix, so it’s more stable when you have many correlated assets or limited data.

import yfinance as yf
import pandas_datareader.data as pdr
import seaborn as sns 
import matplotlib.pyplot as plt
import numpy as np 
import pandas as pd
import quantstats as qs

yf.pdr_override()
from pypfopt.expected_returns import returns_from_prices
from pypfopt.hierarchical_portfolio import HRPOpt
from pypfopt.discrete_allocation import DiscreteAllocation, get_latest_prices
from pypfopt import plotting

# Define a list of stock symbols

stocks = {'Apple Inc.': 'AAPL',
'Microsoft Corporation': 'MSFT',
'Amazon.com Inc.': 'AMZN',
'Alphabet Inc.': 'GOOGL', 
'Meta Platforms, Inc.': 'META',
'Tesla, Inc.': 'TSLA',
'Johnson & Johnson': 'JNJ',
'JPMorgan Chase & Co.': 'JPM',
'Visa Inc.': 'V',
'Procter & Gamble Co.': 'PG',
'NVIDIA Corporation': 'NVDA',
'Mastercard Incorporated': 'MA',
'The Walt Disney Company': 'DIS',
'Salesforce.com Inc.': 'CRM'

}

stock_symbols = list(stocks.values())

# start date
start_date = '2018-01-01'
# end date
end_date = '2023-01-01'

stock_data = pdr.get_data_yahoo(stock_symbols,start_date,end_date)['Adj Close']

# returns
returns_df = returns_from_prices(stock_data)
[*********************100%%**********************]  14 of 14 completed
# Optimize using Hierarchical Risk Parity
hrp = HRPOpt(returns=returns_df)
hrp.optimize()

# get weights
weights = hrp.clean_weights()

print(weights)
OrderedDict([('AAPL', 0.05942), ('AMZN', 0.04517), ('CRM', 0.06996), ('DIS', 0.0784), ('GOOGL', 0.06652), ('JNJ', 0.16991), ('JPM', 0.08137), ('MA', 0.05232), ('META', 0.03086), ('MSFT', 0.06809), ('NVDA', 0.02474), ('PG', 0.15484), ('TSLA', 0.03146), ('V', 0.06692)])
# get portfolio performance
hrp.portfolio_performance(verbose=True, risk_free_rate=0);
Expected annual return: 16.1%
Annual volatility: 22.4%
Sharpe Ratio: 0.72
# plot the heirarchical clustering

fig,ax = plt.subplots()

ax.set_title("Dendogram for clusters")

plotting.plot_dendrogram(hrp, color_threshold=0, ax=ax, above_threshold_color='grey')

plt.show()

png

We see that similar companies are clustered together (eg. Visa and Mastercard or GOOGLE, Microsoft and Apple). So in order to diversify, we can choose companies belonging to different clusters.

We can also find the allocation for each company for a total amount of money we wish to invest.

# get current prices and allocate 100,000 USD
latest_prices = get_latest_prices(stock_data)

# find allocation based on weights using hrp
allocation_finder = DiscreteAllocation(weights,
        latest_prices,
        total_portfolio_value=100000)

# get portolio
allocation, leftover = allocation_finder.lp_portfolio()


for stock in stocks.keys():
        if stocks[stock] in allocation.keys():
                print(f"{stock}: {allocation[stocks[stock]]}")

print(f"Leftover: {leftover}")
Apple Inc.: 46
Microsoft Corporation: 29
Amazon.com Inc.: 54
Alphabet Inc.: 75
Meta Platforms, Inc.: 26
Tesla, Inc.: 26
Johnson & Johnson: 99
JPMorgan Chase & Co.: 62
Visa Inc.: 32
Procter & Gamble Co.: 105
NVIDIA Corporation: 17
Mastercard Incorporated: 15
The Walt Disney Company: 90
Salesforce.com Inc.: 53
Leftover: 1.369749858728028

Efficient Forntier using PyPortfolioOpt

Sample covariance matrices get noisy when you have few observations relative to the number of assets. Ledoit-Wolf shrinkage addresses this by blending the sample covariance with a structured target (typically a scaled identity matrix), reducing estimation error.

from pypfopt.expected_returns import mean_historical_return
from pypfopt.risk_models import CovarianceShrinkage
from pypfopt.efficient_frontier import EfficientFrontier
from pypfopt.plotting import plot_efficient_frontier


# get mean and variance of returns
mu = mean_historical_return(stock_data, frequency=252)
covariance_matrix = CovarianceShrinkage(stock_data).ledoit_wolf()

# calculate the efficient frontier
EF = EfficientFrontier(mu, covariance_matrix)
# plot the EF
fig,ax = plt.subplots(1,1)

plot_efficient_frontier(EF, ax=ax, show_assets=True)

plt.show()
/home/ankitsingh/miniconda3/envs/DS/lib/python3.11/site-packages/cvxpy/reductions/solvers/solving_chain.py:336: FutureWarning: 
    Your problem is being solved with the ECOS solver by default. Starting in 
    CVXPY 1.5.0, Clarabel will be used as the default solver instead. To continue 
    using ECOS, specify the ECOS solver explicitly using the ``solver=cp.ECOS`` 
    argument to the ``problem.solve`` method.
    
  warnings.warn(ECOS_DEPRECATION_MSG, FutureWarning)

png

# get weights for highest sharpe ratio
EF = EfficientFrontier(mu, covariance_matrix)
weights_ms = EF.max_sharpe()

# get weights for minimum volatility
EF = EfficientFrontier(mu, covariance_matrix)
weights_mv = EF.min_volatility()

# print weights
print("At maximum sharpe ratio\n")
for stock in stocks.keys():
        if stocks[stock] in weights_ms.keys():
                print(f"{stock}: {weights_ms[stocks[stock]]}")

print("\nAt minimum volatility\n")
for stock in stocks.keys():
        if stocks[stock] in weights_mv.keys():
                print(f"{stock}: {weights_mv[stocks[stock]]}")
At maximum sharpe ratio

Apple Inc.: 0.245191828703573
Microsoft Corporation: 0.223290432799307
Amazon.com Inc.: 0.0
Alphabet Inc.: 0.0
Meta Platforms, Inc.: 0.0
Tesla, Inc.: 0.1619657168958174
Johnson & Johnson: 0.0
JPMorgan Chase & Co.: 0.0
Visa Inc.: 0.0
Procter & Gamble Co.: 0.3695520216013028
NVIDIA Corporation: 0.0
Mastercard Incorporated: 0.0
The Walt Disney Company: 0.0
Salesforce.com Inc.: 0.0

At minimum volatility

Apple Inc.: 0.0
Microsoft Corporation: 0.0
Amazon.com Inc.: 0.0959515427915488
Alphabet Inc.: 0.0041594590053027
Meta Platforms, Inc.: 0.0
Tesla, Inc.: 0.0054829509292488
Johnson & Johnson: 0.4440479006669384
JPMorgan Chase & Co.: 0.0288517492601816
Visa Inc.: 0.0
Procter & Gamble Co.: 0.3507875353775004
NVIDIA Corporation: 0.0
Mastercard Incorporated: 0.0
The Walt Disney Company: 0.0707188619692793
Salesforce.com Inc.: 0.0