%% Processing and Analysis of Financial Data
% We will use MATLAB to plot, process, and analyze some data from
% the financial markets. There are huge quantities of such data,
% but to make the analysis manageable, we restrict attention here to
% two sets of numbers from 2012: the closing value each day of the Dow
% Jones Industrial Average (DJIA), a standard measure of the US stock
% market, and the daily value of the euro in US dollars, a measure of
% relative confidence in the US and European economies. The data
% were downloaded from, and are copyrighted by, Samuel H. Williamson,
% ``Daily Closing Value of the Dow Jones Average, 1885 to Present,'' 
% MeasuringWorth, 2012, and the European Central Bank Statistical Data 
% Warehouse, 2012.
%%
% Both data sets cover only certain days of the year, namely the days
% when the markets are open. To compensate for this, we assumed that
% each market indicator remained constant between the close of business
% on one market day and the next market day, even if several days
% (such as a holiday weekend) intervened. Thus, since 2012 was a leap
% year, each data set consisted of a vector of length 366 (for the 366
% days of the year 2012). The data were downloaded and converted to
% vector form using the *Import Data* button on the MATLAB Desktop,
% then stored in a |mat| file for future use. We begin by loading
% the data:
load('financial.mat')
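%%
% The fill-forward step described above was done before the data were
% saved, so it does not appear in this script. A minimal sketch of one
% way to do it follows, using made-up numbers. The names |demoDay|,
% |demoClose|, and |demoFilled| are hypothetical, and copying the first
% close back to any days before the first market day is our assumption.
demoDay   = [3 4 5 6 9];        % hypothetical market days, in increasing order
demoClose = [10 11 12 13 14];   % hypothetical closing values on those days
demoN     = 10;                 % number of calendar days to fill
% Count the market days seen so far; that count indexes the latest close.
demoIdx = cumsum(ismember(1:demoN, demoDay));
demoIdx(demoIdx == 0) = 1;      % assumption: back-fill before the first close
demoFilled = demoClose(demoIdx);
disp(demoFilled)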
%%
% Our first step is to plot the data. Two lesser-known MATLAB functions
% are useful here. The function *|detrend|* subtracts from a data set
% the best-fitting line, which we plot separately with a dotted line.
% This makes it easier to see the overall trend and the fluctuations
% from it.  The function *|datetick|* labels the desired axis with
% dates or times instead of numbers:
%%
plot(1:366, dj, 'k'), title 'Dow Jones average 2012'
hold on
plot(1:366, dj - detrend(dj), ':k')
axis([1,366,12000,14000])
datetick('x','mmm')
hold off
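%%
% The dotted line is obtained as the data minus the output of |detrend|.
% As a check on that idea, the sketch below compares it with a
% least-squares line from |polyfit| on synthetic data. All of the |demo|
% names are hypothetical, and the test signal is made up.
demoT = (1:50)';
demoY = 3 + 0.2*demoT + sin(demoT);           % line plus a wiggle
demoCoef = polyfit(demoT, demoY, 1);          % least-squares slope, intercept
demoLine = polyval(demoCoef, demoT);
% The part removed by detrend should agree with the polyfit line.
demoGap = max(abs((demoY - detrend(demoY)) - demoLine));
disp(['largest discrepancy = ', num2str(demoGap)])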
%%
figure
plot(1:366, euro, 'k'), title 'Euro/dollar exchange rate 2012'
hold on
plot(1:366, euro - detrend(euro), ':k')
hold off
axis([1,366,1.20,1.35])
datetick('x','mmm')
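%%
% The plots above rely on |datetick| reading the day numbers 1 to 366 as
% serial date numbers. If actual 2012 dates are wanted on the axis, one
% possible approach (a sketch; |demoDates| is a hypothetical name) is to
% build the serial date numbers explicitly with |datenum|:
demoDates = datenum(2012, 1, 1) + (0:365);    % one entry per day of 2012
disp([datestr(demoDates(1)), ' to ', datestr(demoDates(end))])
% A call such as plot(demoDates, dj, 'k') followed by
% datetick('x','mmm') would then label the months of 2012 itself.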
%%
% Now we can begin to analyze the data.  For example, we could ask
% for the correlation between the Dow Jones average and the euro/dollar
% exchange rate.  We might guess that when the US stock market is
% strong, then the dollar should strengthen against the euro and thus
% this exchange rate should go down. To test this, we can use
% *|corrcoef|*, which computes the correlation coefficient and also
% gives the p-value, the probability of getting a correlation at least
% as large in magnitude as the observed value by random chance when the
% true correlation is zero:
[r, p] = corrcoef(dj, euro);
disp(['correlation coefficient = ',num2str(r(1,2))])
disp(['p-value = ',num2str(p(1,2))])
%%
% The correlation is positive and the p-value is extremely small, so
% the data contradict the guess above: in 2012 the euro tended to
% strengthen against the dollar as the Dow Jones average rose.
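%%
% For reference, the correlation coefficient reported by |corrcoef| is
% the sum of products of the deviations from the means, divided by the
% square root of the product of the sums of squared deviations. The
% sketch below checks this on a small made-up data set; all of the
% |demo| names are hypothetical.
demoX = [1 2 3 4 5 6]';
demoY = [2 1 4 3 6 5]';
demoDX = demoX - mean(demoX);                 % deviations from the mean
demoDY = demoY - mean(demoY);
demoR = sum(demoDX.*demoDY) / sqrt(sum(demoDX.^2)*sum(demoDY.^2));
demoRef = corrcoef(demoX, demoY);             % 2-by-2 matrix
disp(['by hand: ', num2str(demoR), ...
    ', corrcoef: ', num2str(demoRef(1,2))])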
%%
% Next, we observe that if there were no ``noise'' in the data, we
% would expect _exponential_, not _linear_, growth or decay in prices.
% For that reason, it is better to work with the _logarithms_
% of the stock exchange index and the euro exchange rate.
% Let's plot these logarithms together with their best-fitting lines
% (dotted), saving the detrended logarithms for later use:
%%
ldj = log(dj); ldjdetrend = detrend(ldj);
plot(1:366, ldj, 'k'), title 'log of Dow Jones average 2012'
hold on
plot(1:366, ldj - ldjdetrend, ':k')
axis([1,366,9.2,9.8])
datetick('x','mmm')
hold off
%%
figure
leu = log(euro); leudetrend = detrend(leu);
plot(1:366, leu, 'k')
title 'log of Euro/dollar exchange rate 2012'
hold on
plot(1:366, leu - leudetrend, ':k')
hold off
axis([1,366,0.15,0.30])
datetick('x','mmm')
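%%
% To see why logarithms are the natural scale, note that a price growing
% exponentially has a logarithm that is exactly linear in time, and the
% slope of that line is the growth rate. A minimal sketch with a made-up
% price series follows; the |demo| names and the rate 0.001 per day are
% hypothetical.
demoT = (0:99)';
demoRate = 0.001;                             % assumed daily growth rate
demoP = 100*exp(demoRate*demoT);              % noise-free exponential growth
demoFit = polyfit(demoT, log(demoP), 1);      % straight-line fit to the log
disp(['recovered growth rate = ', num2str(demoFit(1))])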
%%
% We might also want to know if the data have certain natural frequencies
% or periodicities. For this, the routine *|fft|*, which computes the
% ``fast Fourier transform,'' is useful.  Here we want to work with
% the ``detrended'' data:
y1 = fft(ldjdetrend); y2 = fft(leudetrend);
%%
% These Fourier transforms are complex-valued vectors, so we
% plot the real part (solid) and the imaginary part (dotted) separately:
plot(1:366, real(y1), 'k')
hold on
plot(1:366, imag(y1), ':k')
hold off
%%
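% Note that both the solid and dotted
% plots are very close to zero except for peaks at the beginning
% and the end.  The peak at the end is a mirror image of the one at the
% beginning: the Fourier transform of real data is conjugate-symmetric,
% so the last entries repeat the information in the first few.  Thus
% the characteristic frequencies of the stock market are all very small;
% that is, the dominant fluctuations have periods that are a sizable
% fraction of the year.
% Let's try the same thing with the exchange rate data: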
%%
plot(1:366, real(y2), 'k')
hold on
plot(1:366, imag(y2), ':k')
hold off
%%
% Again, both plots are very close to zero except for peaks at each end.
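%%
% To help read such plots, the sketch below applies |fft| to a made-up
% signal with exactly 3 cycles in 366 days. The |demo| names are
% hypothetical. The transform has two large entries, at index 4 (that
% is, 3+1) and at its mirror image, index 364 (that is, 366-3+1), so a
% peak near the end of the plot represents the same low frequency as
% the matching peak near the beginning.
demoN = 366;
demoT = (0:demoN-1)';
demoS = cos(2*pi*3*demoT/demoN);              % period of 366/3 = 122 days
demoF = fft(demoS);
[~, demoOrder] = sort(abs(demoF), 'descend');
demoPeaks = sort(demoOrder(1:2))';            % indices of the two peaks
% Conjugate symmetry: entry k+1 is the conjugate of entry N-k+1.
demoSym = max(abs(demoF(2:end) - conj(demoF(end:-1:2))));
disp(['peak indices: ', num2str(demoPeaks)])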
%%
% We may now try to understand the data from the point of view of
% probability and statistics.  For the rest of this application, we
% use the Statistics Toolbox (called the Statistics and Machine Learning
% Toolbox in recent MATLAB releases). First, we can see if
% both data sets plausibly come from the same distribution,
% using a quantile-quantile plot constructed with *|qqplot|*:
%%
qqplot(ldjdetrend, leudetrend)
xlabel('Quantiles for detrended log of DJ average')
ylabel('Quantiles for detrended log of euro rate')
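%%
% For two samples of equal length, a quantile-quantile plot essentially
% plots the sorted values of one sample against the sorted values of
% the other. The sketch below, on made-up data with hypothetical |demo|
% names, shows why a straight line is the signature of a common
% distribution shape: if one sample is a rescaled and shifted copy of
% the other, the sorted values are exactly linearly related.
demoX = sin(1:200)';                          % arbitrary sample
demoY = 3*demoX + 2;                          % same shape, new scale and shift
demoQX = sort(demoX);
demoQY = sort(demoY);
demoC = corrcoef(demoQX, demoQY);
disp(['correlation of sorted values = ', num2str(demoC(1,2))])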
%%
% Since the quantile-quantile plot largely follows a straight line
% (though the tail at one end is a puzzle), we might expect the
% fluctuations from the linear trend in both data sets to come from the
% same type of distribution, differing only in scale.
% The simplest guess would be that the fluctuations roughly
% follow a normal distribution. (This is the basis for the
% famous _Black-Scholes model_, which assumes that logs of stock prices,
% after subtracting off a trend linear in time, are normally
% distributed.) Let's see if this is plausible,
% starting say with the DJ data, using a normal probability plot
% constructed with *|normplot|*:
%%
normplot(ldjdetrend)
%%
% The plot shows that the stock market fluctuations are largely
% normal, but with ``heavy tails'' at the ends. In other words,
% big fluctuations from the mean are more common than in a normal
% distribution. We should take this into account in applying
% simple statistical models to the stock market.  We can do the
% same with currency exchange rates:
normplot(leudetrend)
%% 
% Again we largely have a normal distribution, but with much heavier
% tails, and not as good a fit as with the Dow Jones average.
% Now we can try to fit the data to a normal curve with *|normfit|*:
%%
[djm, djsig] = normfit(ldjdetrend);
disp('Parameters for normal fit to the detrended data');
disp(['log of Dow Jones average: mean: ', num2str(djm), ...
    ', sigma: ', num2str(djsig)]);
[eurom, eurosig] = normfit(leudetrend);
disp(['log of euro exchange rate: mean: ', ...
    num2str(eurom), ', sigma: ', num2str(eurosig)]);
%%
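% That the means here are negligible is no surprise, as we are
% working with the detrended data: the residuals left after subtracting
% a least-squares line average to zero, up to roundoff. But the
% standard deviations are not small. Since we are working with
% logarithms, a standard deviation of sigma corresponds to typical
% fluctuations about the trend of roughly a fraction sigma of the value
% itself, so sizable swings in the markets are quite common. More
% sophisticated analysis of financial data using MATLAB can be done
% with the Financial Toolbox.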
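%%
% As a closing check on how to read these numbers, the sketch below uses
% made-up data and hypothetical |demo| names. It confirms that the
% estimates from |normfit| are the sample mean and the sample standard
% deviation, and converts an assumed sigma of 0.02 on the log scale
% (not a value taken from the data above) into a relative price change.
demoX = sin(1:100)';                          % arbitrary sample
[demoMu, demoSigma] = normfit(demoX);         % requires the Statistics Toolbox
disp(['normfit minus mean: ', num2str(demoMu - mean(demoX)), ...
    ', normfit minus std: ', num2str(demoSigma - std(demoX))])
demoLogSigma = 0.02;                          % assumed, for illustration only
demoRel = exp(demoLogSigma) - 1;              % relative change of one sigma
disp(['one sigma on the log scale is a change of about ', ...
    num2str(100*demoRel), ' percent'])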
