# Matrix Profile - Brute Force

This notebook is used to demonstrate the brute force algorithm for computing the matrix profile. This algorithm came from the slide deck:

https://www.cs.ucr.edu/~eamonn/Matrix_Profile_Tutorial_Part2.pdf

In [1]:
from IPython.display import Image

In [2]:
Image('images/bruteforce.png')

Out[2]:
In [3]:
import numpy as np

In [4]:
import warnings

In [5]:
def znormalize(series):
series -= np.mean(series)
std = np.std(series)

if std == 0:
raise ValueError("The Standard Deviation cannot be zero")
else:
series /= std

return series

def brute_force_mp(series, window_size=None, query=None):
if isinstance(query, np.ndarray) and window_size:
warnings.warn('Query and window_size provided. Using query to determine window size.')

if isinstance(query, np.ndarray):
query = znormalize(query)
window_size = len(query)

#If length is odd, zero-pad time time series
if len(series) % 2 == 1:
profile = np.insert(profile, 0, 0)

#If length is odd, zero-pad query
if len(query) % 2 == 1:
query = np.insert(query, 0, 0)
window_size += 1

series_len = len(series)
shape = series_len - window_size + 1
profile = np.full(shape, np.inf)

for index in range(0, series_len - window_size + 1):
sub_sequence = series[index:index + window_size]
sub_sequence = znormalize(sub_sequence)
distance = np.sqrt(np.sum(np.power(sub_sequence - query, 2)))
profile[index] = distance

return profile


## Data¶

The data consists of closing stock prices. I use the algorithm to identify which part of the data is closest to our subquery.

In [42]:
series = np.loadtxt('stock_close.txt')
query = series[35:55]

mp = brute_force_mp(series, query=query)

In [43]:
from matplotlib import pyplot as plt

%matplotlib inline

In [44]:
plt.figure(figsize=(15,2))
plt.plot(series)
x = np.arange(30, 40)
y = series[x]
plt.plot(x, y, c='r')
plt.title('Raw Data')
plt.show()


This plot shows the raw data and the subquery that is being searched for in red.

In [46]:
plt.figure(figsize=(15,2))
plt.plot(mp)
x = np.arange(30, 40)
y = mp[x]
plt.plot(x, y, c='r')
plt.title('Matrix Profile')
plt.show()


The matrix profile illustrates that the pattern was found hence the huge dip in the plot. This tells us that the distance between the subquery and that location of the series are very similar. In this case the subquery is identical. This pattern that is found is called a motif.