Transforming a QP Problem to an SOCP Problem

A Quadratic Programming problem (QP) in the form of

\(\begin{aligned} & \underset{x}{\text{minimize}} & & \frac{1}{2} x^T H x + p^T x \\ & \text{subject to} \\ & & & A_{eq} x = b_{eq} \\ & & & A x \geq b \end{aligned} \)

where \(H \in \Re^{n \times n}, A \in \Re^{m \times n}, p, x \in \Re^n, b \in \Re^m\), can be transformed to a Second-Order Cone Programming (SOCP) problem in the form of

\(\begin{aligned} & \underset{x}{\text{minimize}} & & f^T x \\ & \text{subject to} \\ & & & \left \| A_i x + b_i \right \|_2 \leq c_i^T x + d_i, \; i = 1, \ldots, m \end{aligned} \)

Consider \(y = \left \| F x - c \right \|\), and

\(\begin{aligned} y^2 & = \left \| F x - c \right \|^2 \\ & = (F x - c)^T (F x - c) \\ & = x^T F^T F x - 2 c^T F x + c^T c \end{aligned} \)

As \(c^T c\) is a constant that does not depend on \(x\), minimizing \(x^T F^T F x - 2 c^T F x\) is equivalent to minimizing \(y^2\); and since \(y \geq 0\), this is in turn equivalent to minimizing \(y\).

If we have \(H = 2 F^T F\) and \(p = -2 F^T c\), then the QP objective function \(\frac{1}{2} x^T H x + p^T x\) can be written as \(x^T F^T F x - 2 c^T F x\). Minimizing the QP objective is thus the same as minimizing \(y\).

Thus, the QP problem can now be written as

\(\begin{aligned} & \underset{x,\,y}{\text{minimize}} & & y \\ & \text{subject to} \\ & & & \left \| F x - c \right \| \leq y \\ & & & A_{eq} x = b_{eq} \\ & & & A x \geq b \end{aligned} \)

As \(H\) is symmetric by the definition of QP, a symmetric \(F\) can be found such that \(\frac{H}{2} = F^T F = F^2\). If the QP is assumed to be convex, \(H\) is positive semidefinite, and applying Cholesky factorization gives \(\frac{H}{2} = U^T U\) (or \(L L^T\)). In this case, \(F = U\) (or \(L^T\)).

Next, as \(\left \| \cdot \right \|_2 \) is always non-negative, the equality constraint \(A_{eq} x = b_{eq}\) can be written as

\(\left \| A_{eq} x - b_{eq} \right \| \leq 0 \)

Finally, each row in the inequality constraint \(A x \geq b\) can be written as

\(a_i x - b_i \geq 0, \; i = 1, \ldots, m \)

where \(a_i\) is the i-th row of \(A\), and \(b_i\) is the i-th element of \(b\).

Therefore, a QP problem can be transformed to an equivalent SOCP problem in the following way. We need to introduce a few variables first.

  • \(y = x_{n+1}\)
  • \(c = -\frac{1}{2} F^{-T} p\)

\(\begin{aligned} & \underset{x}{\text{minimize}} & & f^T x \\ & \text{subject to} \\ & & & \left \| A_1 x + b_1 \right \|_2 \leq c_1^T x + d_1 \\ & & & \left \| A_2 x + b_2 \right \|_2 \leq c_2^T x + d_2 \\ & & & \left \| A_3 x + b_3 \right \|_2 \leq c_3^T x + d_3 \\ & & & \vdots \\ & & & \left \| A_{m+2} x + b_{m+2} \right \|_2 \leq c_{m+2}^T x + d_{m+2} \\ & \text{where} \\ & & f & = (0, \ldots, 0, 1)^T \\ & & x & = (x_1, \ldots, x_n, x_{n+1})^T \\ & & A_1 & = \left [ F, 0 \right ], b_1 = \frac{1}{2} F^{-T} p, c_1 = (0, \ldots, 0, 1)^T, d_1 = 0, \left (\frac{H}{2} \right ) = F^T F \\ & & A_2 & = \left [ A_{eq}, 0 \right ], b_2 = -b_{eq}, c_2 = 0, d_2 = 0 \\ & & A_{i+2} & = 0, b_{i+2} = 0, c_{i+2} = \text{the i-th row of }A, d_{i+2} = -(\text{the i-th element of }b), \; i = 1, \ldots, m \\ & & & f, x, c_1, \ldots, c_{m+2} \in \Re^{n+1} \end{aligned} \)

The sub-vector with the first \(n\) elements in the solution of the transformed SOCP problem is the solution of the original QP problem.

SuanShu has implementations to solve both SOCP and QP problems.
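
To make the construction concrete, below is a minimal plain-Java sketch (illustrative code, not the SuanShu API) that assembles \(f\), \(A_1\), \(b_1\) and the degenerate inequality cones from the QP data, assuming \(F\) is the upper-triangular Cholesky factor of \(H/2\), so that \(b_1 = \frac{1}{2} F^{-T} p\) can be computed by forward substitution:

```java
/**
 * A minimal sketch (plain Java, not the SuanShu API) of assembling the SOCP
 * data from the QP data, following the construction above. F is assumed to be
 * the upper-triangular Cholesky factor of H/2, so F^T is lower triangular.
 */
public final class QpToSocpData {

    final double[] f;    // objective vector: picks out the auxiliary variable x_{n+1}
    final double[][] A1; // A_1 = [F, 0]
    final double[] b1;   // b_1 = (1/2) F^{-T} p
    final double[] c1;   // c_1 = (0, ..., 0, 1)^T

    QpToSocpData(double[][] F, double[] p) {
        int n = p.length;

        f = new double[n + 1];
        f[n] = 1.0; // f = (0, ..., 0, 1)^T

        A1 = new double[n][n + 1]; // last column stays 0: the cone ignores x_{n+1}
        for (int i = 0; i < n; i++) {
            System.arraycopy(F[i], 0, A1[i], 0, n);
        }

        // b_1 = (1/2) F^{-T} p: solve the lower-triangular system F^T z = p
        // by forward substitution, using (F^T)[i][j] = F[j][i]
        double[] z = new double[n];
        for (int i = 0; i < n; i++) {
            double s = p[i];
            for (int j = 0; j < i; j++) {
                s -= F[j][i] * z[j];
            }
            z[i] = s / F[i][i];
        }
        b1 = new double[n];
        for (int i = 0; i < n; i++) {
            b1[i] = 0.5 * z[i];
        }

        c1 = f.clone();
    }

    /**
     * The i-th row of the inequality constraint, a_i x >= b_i, becomes the
     * degenerate cone 0 <= c^T x + d with c = (a_i, 0)^T and d = -b_i.
     */
    static double[] inequalityCone(double[] ai) {
        double[] c = new double[ai.length + 1]; // last entry stays 0
        System.arraycopy(ai, 0, c, 0, ai.length);
        return c; // and d = -b_i
    }
}
```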

Fastest Java Matrix Multiplication

Introduction

Matrix multiplication occupies a central role in scientific computing with an extremely wide range of applications. Many numerical procedures in linear algebra (e.g. solving linear systems, matrix inversion, factorizations, determinants) can essentially be reduced to matrix multiplication [5, 3]. Hence, there is great interest in investigating fast matrix multiplication algorithms, to accelerate matrix multiplication (and other numerical procedures in turn).

Per our benchmark, SuanShu was already the fastest in matrix multiplication, and hence in linear algebra:
SuanShu v3.0.0 benchmark

Starting from version 3.3.0, SuanShu has implemented an advanced algorithm for even faster matrix multiplication. It makes some operations 100x faster than those of our competitors! The new benchmark can be found here:
SuanShu v3.3.0 benchmark

In this article, we briefly describe our implementation of a matrix multiplication algorithm that dramatically accelerates dense matrix-matrix multiplication compared to the classical IJK algorithm.

Parallel IJK

We first describe the method against which our new algorithm is compared: IJK. Here is the algorithm performing the multiplication \(C = A\times B\), where \(A\) is \(m\times n\), \(B\) is \(n\times p\), and \(C\) is \(m\times p\):
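
Below is a minimal serial sketch of this triple loop (illustrative Java mirroring the classical algorithm):

```java
// Classical IJK: C[i][j] = sum over k of A[i][k] * B[k][j]
static double[][] ijk(double[][] A, double[][] B) {
    int m = A.length, n = B.length, p = B[0].length;
    double[][] C = new double[m][p];
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < p; j++) {
            double sum = 0.0;
            for (int k = 0; k < n; k++) {
                sum += A[i][k] * B[k][j];
            }
            C[i][j] = sum;
        }
    }
    return C;
}
```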

In SuanShu, this is implemented in parallel; the outermost loop is passed to a ParallelExecutor.

As there are often more rows than available threads, the time complexity of this parallelized IJK is still roughly the same as that of IJK: \(O\left(mnp\right)\), or cubic time \(O\left(n^3\right)\) for \(m = n = p\). This complexity leaves much to be desired.

The core of our new multiplication algorithm, the Strassen algorithm, reduces the time complexity to \(O\left(n^{\log_27}\right)\).

Strassen’s Algorithm

The Strassen algorithm [6] is based on the following block matrix multiplication:

\(\displaystyle \begin{pmatrix}A_{11} & A_{12}\\ A_{21} & A_{22} \end{pmatrix}\begin{pmatrix}B_{11} & B_{12}\\ B_{21} & B_{22} \end{pmatrix}=\begin{pmatrix}A_{11}B_{11}+A_{12}B_{21} & A_{11}B_{12}+A_{12}B_{22}\\ A_{21}B_{11}+A_{22}B_{21} & A_{21}B_{12}+A_{22}B_{22} \end{pmatrix}\)

The naive method of completing this involves 8 submatrix multiplications and 4 additions. The version (Winograd’s variant [7]) of Strassen’s algorithm that we use forgoes one submatrix multiplication in exchange for eleven extra additions/subtractions, which is faster when the submatrices are large enough.

Fig 1. Visualization of Strassen's Algorithm (Winograd variant)

The algorithm runs as follows (visualization above):

1. Split \(A\) into four equally-sized quadrants \(A_{11}\), \(A_{12}\), \(A_{21}\), \(A_{22}\). The same for \(B\). (Assume first that all dimensions are even.)

2. Obtain the following factor matrices:

\(\displaystyle \begin{array}{ccccccc} M_{1} & = & A_{11} & \qquad & N_{1} & = & B_{11}\\ M_{2} & = & A_{12} & \qquad & N_{2} & = & B_{21}\\ M_{3} & = & A_{21}+A_{22} & \qquad & N_{3} & = & B_{12}-B_{11}\\ M_{4} & = & M_{3}-A_{11} & \qquad & N_{4} & = & B_{22}-N_{3}\\ M_{5} & = & A_{11}-A_{21} & \qquad & N_{5} & = & B_{22}-B_{12}\\ M_{6} & = & A_{12}-M_{4} & \qquad & N_{6} & = & B_{22}\\ M_{7} & = & A_{22} & \qquad & N_{7} & = & B_{21}-N_{4} \end{array}\)

3. Obtain \(P_{i}=M_{i}\times N_{i}\) for \(i\in\left\{ 1,2,\ldots,7\right\}\). Again, depending on the dimensions, we either use Parallel IJK or make a recursive call to Strassen's algorithm.

4. The final product \(C=A\times B\) can then be obtained as follows (using some temporary matrices \(T_{i}\)):

\(\displaystyle \begin{array}{ccccccc} C_{11} & = & P_{1}+P_{2} & \qquad & C_{22} & = & T_{2}+P_{3}\\ T_{1} & = & P_{1}+P_{4} & \qquad & T_{3} & = & T_{1}+P_{3}\\ T_{2} & = & T_{1}+P_{5} & \qquad & C_{12} & = & T_{3}+P_{6}\\ C_{21} & = & T_{2}+P_{7} & \qquad \end{array}\)
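
Putting steps 1-4 together, the following is a minimal plain-Java sketch of the recursion (illustrative code, not SuanShu's production implementation, which adds the parallelism, peeling and blocking described below; for simplicity it assumes square matrices whose dimension is a power of 2):

```java
public final class StrassenWinogradSketch {

    private static final int BASE = 64; // below this size, plain IJK is used

    public static double[][] multiply(double[][] A, double[][] B) {
        int n = A.length;
        if (n <= BASE) return ijk(A, B);
        int h = n / 2;

        // step 1: split A and B into four equally-sized quadrants
        double[][] A11 = block(A, 0, 0, h), A12 = block(A, 0, h, h),
                   A21 = block(A, h, 0, h), A22 = block(A, h, h, h);
        double[][] B11 = block(B, 0, 0, h), B12 = block(B, 0, h, h),
                   B21 = block(B, h, 0, h), B22 = block(B, h, h, h);

        // step 2: factor matrices (M1 = A11, M2 = A12, M7 = A22, N1 = B11, N2 = B21, N6 = B22)
        double[][] M3 = add(A21, A22);   double[][] N3 = sub(B12, B11);
        double[][] M4 = sub(M3, A11);    double[][] N4 = sub(B22, N3);
        double[][] M5 = sub(A11, A21);   double[][] N5 = sub(B22, B12);
        double[][] M6 = sub(A12, M4);    double[][] N7 = sub(B21, N4);

        // step 3: the seven recursive products P_i = M_i x N_i
        double[][] P1 = multiply(A11, B11), P2 = multiply(A12, B21),
                   P3 = multiply(M3, N3),   P4 = multiply(M4, N4),
                   P5 = multiply(M5, N5),   P6 = multiply(M6, B22),
                   P7 = multiply(A22, N7);

        // step 4: reassemble C via the temporaries T_i
        double[][] C11 = add(P1, P2);
        double[][] T1 = add(P1, P4), T2 = add(T1, P5), T3 = add(T1, P3);
        double[][] C21 = add(T2, P7), C22 = add(T2, P3), C12 = add(T3, P6);

        return join(C11, C12, C21, C22);
    }

    static double[][] block(double[][] X, int r, int c, int h) { // h x h quadrant at (r, c)
        double[][] Y = new double[h][h];
        for (int i = 0; i < h; i++) System.arraycopy(X[r + i], c, Y[i], 0, h);
        return Y;
    }

    static double[][] join(double[][] C11, double[][] C12, double[][] C21, double[][] C22) {
        int h = C11.length;
        double[][] C = new double[2 * h][2 * h];
        for (int i = 0; i < h; i++) {
            System.arraycopy(C11[i], 0, C[i], 0, h);
            System.arraycopy(C12[i], 0, C[i], h, h);
            System.arraycopy(C21[i], 0, C[h + i], 0, h);
            System.arraycopy(C22[i], 0, C[h + i], h, h);
        }
        return C;
    }

    static double[][] add(double[][] X, double[][] Y) { return combine(X, Y, 1); }
    static double[][] sub(double[][] X, double[][] Y) { return combine(X, Y, -1); }

    private static double[][] combine(double[][] X, double[][] Y, int sign) {
        int n = X.length;
        double[][] Z = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                Z[i][j] = X[i][j] + sign * Y[i][j];
        return Z;
    }

    static double[][] ijk(double[][] A, double[][] B) { // base case, as in the IJK sketch above
        int m = A.length, n = B.length, p = B[0].length;
        double[][] C = new double[m][p];
        for (int i = 0; i < m; i++)
            for (int j = 0; j < p; j++)
                for (int k = 0; k < n; k++)
                    C[i][j] += A[i][k] * B[k][j];
        return C;
    }
}
```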

Odd Dimensions

So far this algorithm has ignored the cases when \(A\) and/or \(B\) has an odd number of rows/columns. There are several methods of dealing with this [2, 4]. For example, one could pad the matrices statically so that the dimensions are always even until the recursion passes to IJK (static padding); or pad only when one of the dimensions is odd (dynamic padding).

Alternatively, one could disregard the extra rows/columns until the algorithm completes, and then take care of them afterwards: if \(A\) has an extra row or \(B\) has an extra column, use the appropriate matrix-vector operation to calculate the remaining row/column of \(C\); if \(A\) has an extra column and \(B\) has an extra row, their (outer) product can be added onto \(C\) afterwards. We chose this method, called dynamic peeling, for our implementation. A sketch of the fix-up step follows.
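
Below is an illustrative plain-Java sketch of the dynamic-peeling fix-up (hypothetical helper names; ijk is the routine shown earlier, and evenMultiply stands in for the even-dimension Strassen routine):

```java
// Dynamic peeling: multiply the even-dimension "core" with Strassen, then
// patch the result with cheap matrix-vector (and rank-1) fix-ups.
static double[][] multiplyWithPeeling(double[][] A, double[][] B) {
    int m = A.length, n = B.length, p = B[0].length;
    int m2 = m - m % 2, n2 = n - n % 2, p2 = p - p % 2; // even parts

    double[][] C = new double[m][p];
    double[][] core = evenMultiply(subMatrix(A, m2, n2), subMatrix(B, n2, p2));
    for (int i = 0; i < m2; i++) System.arraycopy(core[i], 0, C[i], 0, p2);

    if (n % 2 == 1) // extra column of A times extra row of B: a rank-1 update
        for (int i = 0; i < m2; i++)
            for (int j = 0; j < p2; j++)
                C[i][j] += A[i][n - 1] * B[n - 1][j];
    if (m % 2 == 1) // extra row of A: fill the last row of C entirely
        for (int j = 0; j < p; j++)
            for (int k = 0; k < n; k++)
                C[m - 1][j] += A[m - 1][k] * B[k][j];
    if (p % 2 == 1) // extra column of B: fill the last column of C (above the last row)
        for (int i = 0; i < m2; i++)
            for (int k = 0; k < n; k++)
                C[i][p - 1] += A[i][k] * B[k][p - 1];
    return C;
}

static double[][] subMatrix(double[][] X, int rows, int cols) { // top-left rows x cols block
    double[][] Y = new double[rows][cols];
    for (int i = 0; i < rows; i++) System.arraycopy(X[i], 0, Y[i], 0, cols);
    return Y;
}

static double[][] evenMultiply(double[][] A, double[][] B) {
    return ijk(A, B); // stand-in; the real implementation recurses into Strassen here
}
```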

Blocking and Tiling

Taken on its own, the above Strassen's algorithm works well provided both matrices are roughly square. In practice, we may encounter cases where either is highly rectangular (e.g., very tall or very wide).

We solve this by slicing the matrices into blocks which are nearly square, then using Strassen’s algorithm on the submatrices. The blocking scheme is devised so that long or tall strips are avoided.

Fig 2. Left: A strict blocking scheme (strips!). Right: An efficient blocking scheme (our implementation).

Performance

The following charts show the performance of our hybrid Block-Strassen (HBS) algorithm versus Parallel IJK on an Intel® Core i5-3337U CPU @ 1.80 GHz with 6GB RAM, running Java 1.8.0 update 40.

Tests are patterned after D'Alberto and Nicolau [1]: we ran \(C = A \times B\) for random matrices \(A\) (\(m\times n\)) and \(B\) (\(n \times p\)), for every triple \(\left(m, n, p\right)\) in \(S^3\), where \(S = \{100,500,1000,2500,5000,7500,10000\}\). Each multiplication was done three times using Parallel IJK, then three times using HBS. The average times for each method are compared and tabulated.

Fig 3. Multiplication Time (by Method) vs Complexity m×n×p

Figure 3 shows the multiplication time plotted against the product \(mnp\) of the dimensions. The multiplication time for IJK is \(O\left(mnp\right)\), and our empirical results show that the multiplication times for both Parallel IJK and HBS are strongly linearly related to this complexity. HBS, however, has a significantly smaller gradient.

The gradients of the best-fit lines suggest that as complexity approaches infinity (ignoring memory constraints), HBS will take 63.5% less time than Parallel IJK. Several data points (e.g., \(m,n,p\ge5000\)), however, show an even greater speedup.

Fig 4. Time Savings (in seconds) using HBS versus Parallel IJK
Fig 5. Time Savings (as a percent of Parallel IJK Time) using HBS versus Parallel IJK

Figures 4 and 5 show the time saving of HBS over Parallel IJK, in seconds (Figure 4), and as a percentage of the Parallel IJK time (Figure 5). Each table is for a specific value of \(n\) (number of columns of \(A\)), and runs over values of \(m\) (number of rows of \(A\)) and \(p\) (number of columns of \(B\)).

Fig 6. Maximum entry-wise Relative Error of HBS versus Parallel IJK (in units of 1e-15)

Finally, an accuracy test was also run to address concerns regarding the numerical stability of Strassen’s algorithm [3]. Figure 6 shows the maximum entry-wise relative error

\(\displaystyle\max_{i,j}\left|\frac{C_{i,j}^{{\rm IJK}}-C_{i,j}^{{\rm HBS}}}{C_{i,j}^{{\rm IJK}}}\right|\)

of the resulting product matrix. None of the errors exceeds \(2 \times 10^{-14}\). (Note that we did not run Strassen's algorithm all the way down to the scalar level; when the matrices \(M_i\) and \(N_i\) are small enough, IJK is used, which reduces the error.) This suggests that HBS is surprisingly accurate, and a strong candidate for use in general-purpose contexts where speed is a priority.

References

[1] Paolo D'Alberto and Alexandru Nicolau. Adaptive Strassen's matrix multiplication. In Proceedings of the 21st Annual International Conference on Supercomputing, ICS '07, pages 284–292, New York, NY, USA, 2007. ACM.
[2] Hossam ElGindy and George Ferizis. On improving the memory access patterns during the execution of Strassen's matrix multiplication algorithm. In Proceedings of the 27th Australasian Conference on Computer Science – Volume 26, ACSC '04, pages 109–115, Darlinghurst, Australia, 2004. Australian Computer Society, Inc.
[3] Nicholas J. Higham. Accuracy and Stability of Numerical Algorithms. SIAM, 2002.
[4] Steven Huss-Lederman, Elaine M. Jacobson, Anna Tsao, Thomas Turnbull, and Jeremy R. Johnson. Implementation of Strassen's algorithm for matrix multiplication. In Proceedings of the 1996 ACM/IEEE Conference on Supercomputing, Supercomputing '96, Washington, DC, USA, 1996. IEEE Computer Society.
[5] Steven S. Skiena. The Algorithm Design Manual. Springer Publishing Company, Incorporated, 2nd edition, 2008.
[6] Volker Strassen. Gaussian elimination is not optimal. Numerische Mathematik, 13(4):354–356, 1969.
[7] Shmuel Winograd. On multiplication of 2×2 matrices. Linear Algebra and its Applications, 4(4):381–388, 1971.

On Some Practical Issues when Using AlgoQuant to Compute the Markowitz Efficient Frontier

Markowitz suggests in his Nobel Prize-winning paper, Markowitz (1952), that when one selects a portfolio, one should consider both the return and the risk of the portfolio. Most of us, if not all, are risk-averse: if there are two portfolios with the same return but different risks (in this article, by risk we mean the standard deviation of the portfolio), we choose the one with the smaller risk without any hesitation. Therefore, given a set of risky assets and an expected return, we are interested in finding their best combination, i.e., the weights that minimize the risk of the portfolio. If we find the minimum risk of the portfolio for every return, we can draw a curve on the risk-return plane. This curve is the famous efficient frontier.

Assuming there are \(n\) risky assets whose return vector and covariance matrix are \(R\) and \(\Sigma\) respectively, the points on the efficient frontier are computed by solving the following problem:

\(\min_{w\in\mathbb{R}^{n}} w^{\top}\Sigma w \)

\(\text{s.t.} \sum_{i=1}^{n}w_{i}=1\)

\(R^{\top}w=\mu\)

where \(\mu\) is the pre-defined expected return and \(w\) is the weight vector. The above problem can be solved using Lagrange multipliers. We denote this problem as "Problem 1".
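
For reference, assuming \(\Sigma\) is invertible, the Lagrange-multiplier solution of Problem 1 has the closed form

\(w^{*}=\lambda\Sigma^{-1}\mathbf{1}+\gamma\Sigma^{-1}R, \quad \lambda=\frac{c-b\mu}{ac-b^{2}}, \quad \gamma=\frac{a\mu-b}{ac-b^{2}}\)

where \(a=\mathbf{1}^{\top}\Sigma^{-1}\mathbf{1}\), \(b=\mathbf{1}^{\top}\Sigma^{-1}R\) and \(c=R^{\top}\Sigma^{-1}R\); one can check that \(\mathbf{1}^{\top}w^{*}=1\) and \(R^{\top}w^{*}=\mu\).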

In AlgoQuant, we use another approach to compute the efficient frontier. The problem we solve is based on the utility function:

\(\min_{w\in\mathbb{R}^{n}} q\times w^{\top}\Sigma w-R^{\top}w \)

\(\text{s.t.} \sum_{i=1}^{n}w_{i}=1\)

\(R^{\top}w=\mu\)

\(R\), \(\Sigma\), \(w\) and \(\mu\) are the same parameters as in Problem 1. The newly added parameter \(q\) is the risk-aversion coefficient. This problem is denoted as "Problem 2".

The larger the \(q\), the less risk the investor is willing to take. Although most of us are risk-averse, the degree of risk aversion differs among individuals. As a result, a coefficient that describes this degree is introduced. Note that some papers use a risk-tolerance coefficient instead, for example Steinbach (2001). Risk tolerance is the reciprocal of risk aversion, and it is applied to the return term in the objective function rather than the risk term. For usages of the risk-aversion coefficient in portfolio optimization, please see page 75 of Lee and Lee (2010) and page 159 of Bodie et al. (2008).

It can be seen that Problem 2 and Problem 1 are equivalent: with the constraint \(R^{\top}w=\mu\), the second term in Problem 2's objective function is a constant, so it does not affect the optimization result. If the two problems are equivalent, why bother including the risk-aversion coefficient at all?

When computing the efficient frontier, the two problems are equivalent. However, the solutions of the two problems differ when the constraint \(R^{\top}w=\mu\) is removed.

Problem 1′:

\(\min_{w\in\mathbb{R}^{n}} w^{\top}\Sigma w \)

\(\text{s.t.} \sum_{i=1}^{n}w_{i}=1\)

Problem 2′:

\(\min_{w\in\mathbb{R}^{n}} q\times w^{\top}\Sigma w-R^{\top}w \)

\(\text{s.t.} \sum_{i=1}^{n}w_{i}=1\)

The solution of Problem 1′ is the minimum variance portfolio, i.e., the portfolio on the efficient frontier with the smallest variance (the leftmost point on the curve). The solution of Problem 2′, on the other hand, is the optimal portfolio given the utility function \(R^{\top}w-q\times w^{\top}\Sigma w\). This optimal portfolio is also on the efficient frontier, and different values of \(q\) lead to different optimal portfolios on it. When \(q\) is infinite, the return term in the objective function is negligible compared to the risk term; the only consideration in the optimization is then the risk, so the portfolio corresponding to \(q=\infty\) is the minimum variance portfolio. As \(q\) decreases towards 0, the significance of the risk term in the objective function also decreases, and the corresponding portfolio moves in the upper-right direction along the efficient frontier. Finally, when \(q\) is 0, we are just maximizing the expected return, without any constraint on the risk.

From the above discussion, it can be seen that varying \(q\) in Problem 2′ can also trace out the efficient frontier. However, this is not the approach used in AlgoQuant, because the point moves along the curve very slowly as \(q\) changes. For example, the portfolios corresponding to \(q = 0.1\) and \(q = 10\) are very close on the efficient frontier. As a result, varying the expected return in Problem 2 is a more convenient way to draw the efficient frontier.

When \(q\) is unknown, AlgoQuant provides a method to find the optimal \(q\): in the MarkowitzPortfolio class, there is a method called getOptimalRiskAversionCoefficient. It tries different risk-aversion coefficients and selects the one whose corresponding optimal portfolio has the largest Sharpe ratio, given a risk-free rate. The corresponding portfolio is the tangency portfolio on the efficient frontier.

So far, three different portfolios have been discussed: the minimum variance portfolio, the tangency portfolio and the optimal portfolio. They are connected as follows. The optimal portfolio is found by solving Problem 2′, given a risk-aversion coefficient \(q\), and every portfolio on the efficient frontier is an optimal portfolio for some \(q\). The optimal portfolio with the largest Sharpe ratio (given a risk-free rate) is the tangency portfolio, and the optimal portfolio with the smallest risk is the minimum variance portfolio. Moreover, the weights of any optimal portfolio are a linear combination of the weights of the minimum variance portfolio and the tangency portfolio.

By accepting an extra input parameter \(q\), AlgoQuant has the flexibility to give a client a specific optimal portfolio: if a client knows his risk-aversion coefficient \(q\), the corresponding optimal portfolio can be computed by AlgoQuant; and if he doesn't, AlgoQuant can always recommend a portfolio on the efficient frontier according to his expected return.

Appendix:

The following code is an example of computing the efficient frontier in AlgoQuant. In this example, Problem 2 is solved with different expected returns and a fixed \(q\).
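
As a self-contained illustration (plain Java with hypothetical toy numbers, not the AlgoQuant API), the sketch below traces the frontier equivalently by solving Problem 1's closed-form Lagrange solution for a grid of target returns, assuming an invertible \(\Sigma\) and short selling allowed:

```java
import java.util.Arrays;

/** Plain-Java sketch of tracing the efficient frontier: for each target
 *  return mu, solve the equality-constrained problem in closed form. */
public final class EfficientFrontierSketch {

    public static void main(String[] args) {
        double[][] sigma = {{0.04, 0.01}, {0.01, 0.09}}; // toy covariance matrix
        double[] R = {0.08, 0.12};                       // toy expected returns
        for (double mu = 0.08; mu <= 0.12; mu += 0.005) {
            double[] w = weights(sigma, R, mu);
            System.out.printf("mu=%.3f risk=%.4f w=%s%n", mu, risk(sigma, w), Arrays.toString(w));
        }
    }

    /** Minimum-variance weights for target return mu (closed-form Lagrange solution). */
    static double[] weights(double[][] sigma, double[] R, double mu) {
        int n = R.length;
        double[] ones = new double[n];
        Arrays.fill(ones, 1.0);
        double[] x = solve(sigma, ones); // Sigma^{-1} 1
        double[] y = solve(sigma, R);    // Sigma^{-1} R
        double a = dot(ones, x), b = dot(ones, y), c = dot(R, y);
        double d = a * c - b * b;
        double lambda = (c - b * mu) / d, gamma = (a * mu - b) / d;
        double[] w = new double[n];
        for (int i = 0; i < n; i++) w[i] = lambda * x[i] + gamma * y[i];
        return w;
    }

    static double risk(double[][] sigma, double[] w) { // portfolio standard deviation
        double v = 0.0;
        for (int i = 0; i < w.length; i++)
            for (int j = 0; j < w.length; j++)
                v += w[i] * sigma[i][j] * w[j];
        return Math.sqrt(v);
    }

    static double dot(double[] u, double[] v) {
        double s = 0.0;
        for (int i = 0; i < u.length; i++) s += u[i] * v[i];
        return s;
    }

    /** Naive Gaussian elimination with partial pivoting: solves M z = v. */
    static double[] solve(double[][] M, double[] v) {
        int n = v.length;
        double[][] a = new double[n][n + 1];
        for (int i = 0; i < n; i++) {
            System.arraycopy(M[i], 0, a[i], 0, n);
            a[i][n] = v[i];
        }
        for (int col = 0; col < n; col++) {
            int piv = col;
            for (int r = col + 1; r < n; r++)
                if (Math.abs(a[r][col]) > Math.abs(a[piv][col])) piv = r;
            double[] t = a[col]; a[col] = a[piv]; a[piv] = t;
            for (int r = col + 1; r < n; r++) {
                double f = a[r][col] / a[col][col];
                for (int k = col; k <= n; k++) a[r][k] -= f * a[col][k];
            }
        }
        double[] z = new double[n];
        for (int i = n - 1; i >= 0; i--) {
            double s = a[i][n];
            for (int k = i + 1; k < n; k++) s -= a[i][k] * z[k];
            z[i] = s / a[i][i];
        }
        return z;
    }
}
```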

Solving the “Corner Solution Problem” of Portfolio Optimization

Many portfolio optimization methods (e.g., Markowitz's Modern Portfolio Theory from 1952) face a well-known predicament called the "corner portfolio problem". When short selling is allowed, they usually give efficient allocation weightings that are highly concentrated in only a few assets in the portfolio. This means that the portfolio is not as diversified as we would like, which makes the optimized portfolio less useful in practice.

In [Corvalan, 2005], the author suggests looking instead for an "almost efficient" but "more diversified" portfolio within a close neighborhood of the Mean-Variance (MV) optimal solution. The paper shows that there are many eligible portfolios around the MV optimal solution on the efficient frontier. Specifically, given the MV optimal solution, those "more diversified" portfolios can be computed by relaxing the requirements for the portfolio return \(R\) and risk \(\sigma\) in an additional optimization problem:

\(\begin{aligned} & \underset{w}{\text{maximize}} & & D(w) \\ & \text{subject to} \\ & & & \sqrt{w' \Sigma w} \le \sigma^* + \Delta \sigma \\ & & & w'r \ge R^* - \Delta R \\ & & & w' 1 = 1 \\ & & & w_i \ge 0 \end{aligned} \)

where \(( \sigma^* , R^* ) = ( \sqrt{w^{*'} \Sigma w^*} , w^{*'} r ) \), \(w^*\) is the vector of Markowitz MV optimal weights, \(\Delta \sigma, \Delta R \) are the relaxation tolerance parameters, and \(D(w)\) is a diversification measure for the portfolio (for example, \(\sum_i \ln(w_i)\) or \(\prod_i w_i\)). In other words, the new optimization problem looks for the portfolio with maximal diversification around the optimal solution.

Corvalan’s approach can be extended to create an approximate, sufficiently optimal and well diversified portfolio from the optimal portfolio. The approximate portfolio keeps the constraints from the original optimization problem.

Mean-Variance Portfolio Optimization When Means And Covariances Are Unknown

[Lai, Xing and Chen, 2010], in the paper "Mean-Variance Portfolio Optimization When Means And Covariances Are Unknown", proposed a groundbreaking method for portfolio optimization. In what follows, we summarize their idea and use it to implement a periodic rebalancing strategy based on the AlgoQuant framework.

Harry Markowitz won the Nobel prize for his work in the 1950s on mean-variance (MV) portfolio optimization. The theory is widely regarded as fundamental in financial economics. It says that, given a target return for a portfolio of \(m\) assets, the optimal (in terms of information ratio) weighting \(w_{eff}\) is given by

\(w_{eff}=\arg \min_w w^T \Sigma w \text{ subject to } w^T\mu = \mu_*, w^T1 = 1\)

where \(\mu\) is the expected future returns, and \(\Sigma\) is the expected covariance matrix of future returns. This problem is readily solved by quadratic programming.

Nonetheless, the assumption that \(\mu\) and \(\Sigma\) are known in advance is very dubious. This has been referred to as the "Markowitz optimization enigma". The attempts made so far have been to forecast these estimators, namely \(\hat{\mu}\) and \(\hat{\Sigma}\), as accurately as possible. The maximum likelihood estimate (MLE) from the training sample is an example. It turns out, however, that MLE performs poorly because the estimates are quite different from the realized values [Michaud, 1989]. Since then, three approaches have been proposed to address the difficulty. The first approach uses a multi-factor model to reduce the dimensionality in estimating \(\Sigma\) [Fan, Fan and Lv, 2008]. The second approach uses Bayes or other shrinkage estimates of \(\Sigma\) [Ledoit and Wolf, 2004]. Both approaches attempt to use improved estimates of \(\Sigma\) for the plug-in efficient frontier. They have also been modified to provide better estimates of \(\mu\), for example, in the quasi-Bayesian approach of [Black and Litterman, 1990]. The third approach uses bootstrapping to correct for the bias of \(\hat{w}_{eff}\) as an estimate of \(w_{eff}\).

The innovation in [Lai, Xing and Chen, 2010] is to move away from the traditional approach of finding better estimators. They instead assume inherent uncertainty, hence risk, in these estimators. In other words, because the mean and covariance matrix of future returns are estimated from past returns, the uncertainties of these estimators should be incorporated into the portfolio risk. The authors formulate a stochastic optimization framework which solves a more fundamental problem. Given the past returns \(r_1, \ldots, r_n\),

maximize \(\left \{ E(w^Tr_{n+1}) - \lambda Var(w^Tr_{n+1}) \right \}\)

where \(w^Tr_{n+1}\) is now regarded as a random variable conditional on the past returns, and \(\lambda\) is a risk-aversion index (investor’s preference).

This stochastic optimization problem is not standard due to the \(E(\cdot)^2\) term in \(Var(\cdot)\) (\(Var(X) = E[X^2] - (E[X])^2\)). To tackle this problem, the authors solve instead an equivalent stochastic optimization problem:

\(\max_{\eta} \left \{ E[w^T(\eta)r_{n+1}] - \lambda Var[w^T(\eta)r_{n+1}] \right \}\)

where

\(w(\eta) = \arg \min_w \left \{ \lambda E[(w^Tr_{n+1})^2] - \eta E(w^Tr_{n+1}) \right \}\)

and

\(\eta = 1 + 2\lambda E(W_B)\)

Next, define the function

\(C_{\lambda,\mu_n,V_n}(\eta) = E[w^T(\eta)\mu_n] + \lambda (E[w^T(\eta)\mu_n])^2 - \lambda E[w^T(\eta) V_n w(\eta)]\)

where \(\mu_n\) and \(V_n\) are the first and second moments of the returns. The problem now becomes maximizing \(C_{\lambda,\mu_n,V_n}(\eta)\), which can be achieved by any univariate maximization algorithm such as Brent's method (the BrentMinimizer in SuanShu can be used to minimize the negative of this function).

The remaining task is to estimate \(\mu_n\) and \(V_n\) to plug into \(C_{\lambda,\mu_n,V_n}(\eta)\). \(\mu_n\) can be estimated by, for example, bootstrapping, a multi-factor model, a GARCH model, etc. Similarly, \(V_n = \Sigma + \mu_n \mu_n^T\) can be estimated by first estimating the sample covariance matrix \(\Sigma\). As \(\Sigma\) may be unstable when the sample size is small, our implementation uses a covariance-shrinkage method described in [Ledoit and Wolf 2004].

Putting all these steps together, the algorithm to compute the optimal weighting looks like this (a sketch in code follows the list):

  1. Estimate \(\mu_n\) and \(V_n\) from past returns
  2. Estimate \(a = E[w^T(\eta)\mu_n], b = E[w^T(\eta)V_nw(\eta)]\)
  3. Construct the univariate real function \(C_{\lambda,\mu_n,V_n}(\eta) = a + \lambda a^2 – \lambda b\)
  4. Maximize \(C_{\lambda,\mu_n,V_n}(\eta)\) over \(\eta\)
  5. Compute the optimal weighting \(w^* = w(\eta^*)\) using the optimal \(\eta^*\)
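
Below is a minimal plain-Java sketch of these steps (illustrative, not the AlgoQuant Lai2010NPEBModel API). It is the plug-in version: the estimates \(\mu_n\) and \(V_n\) are taken as given, Brent's method is replaced by a simple grid search over \(\eta\) (the grid bounds here are arbitrary), and \(w(\eta)\) is computed in closed form under a budget constraint \(1^{\top}w=1\), which we assume for this sketch:

```java
/** Sketch of steps 1-5: given estimates mu (= mu_n) and V (= V_n) and the
 *  risk-aversion index lambda, grid-search eta and keep w(eta*) maximizing C. */
static double[] lai2010Weights(double[] mu, double[][] V, double lambda) {
    int n = mu.length;
    double[] ones = new double[n];
    java.util.Arrays.fill(ones, 1.0);
    double[] x = solve(V, mu);   // V^{-1} mu
    double[] y = solve(V, ones); // V^{-1} 1
    double sx = dot(ones, x), sy = dot(ones, y);

    double bestC = Double.NEGATIVE_INFINITY;
    double[] best = null;
    for (double eta = 0.0; eta <= 10.0; eta += 0.01) { // step 4: grid over eta (bounds arbitrary)
        // closed form of w(eta) = argmin_{1'w=1} { lambda w'Vw - eta w'mu }:
        // w = (eta V^{-1}mu + kappa V^{-1}1) / (2 lambda), kappa enforcing 1'w = 1
        double kappa = (2.0 * lambda - eta * sx) / sy;
        double[] w = new double[n];
        for (int i = 0; i < n; i++) w[i] = (eta * x[i] + kappa * y[i]) / (2.0 * lambda);

        // steps 2-3: a = w'mu, b = w'Vw, and C(eta) = a + lambda a^2 - lambda b;
        // here w'Vw = (eta a + kappa) / (2 lambda) since Vw = (eta mu + kappa 1) / (2 lambda)
        double a = dot(w, mu);
        double b = (eta * a + kappa) / (2.0 * lambda);
        double C = a + lambda * a * a - lambda * b;
        if (C > bestC) { bestC = C; best = w; } // step 5: keep w(eta*)
    }
    return best;
}

static double dot(double[] u, double[] v) {
    double s = 0.0;
    for (int i = 0; i < u.length; i++) s += u[i] * v[i];
    return s;
}

/** Solves M z = v by Gaussian elimination (no pivoting: V_n is symmetric positive definite). */
static double[] solve(double[][] M, double[] v) {
    int n = v.length;
    double[][] a = new double[n][n + 1];
    for (int i = 0; i < n; i++) {
        System.arraycopy(M[i], 0, a[i], 0, n);
        a[i][n] = v[i];
    }
    for (int col = 0; col < n; col++)
        for (int r = col + 1; r < n; r++) {
            double f = a[r][col] / a[col][col];
            for (int k = col; k <= n; k++) a[r][k] -= f * a[col][k];
        }
    double[] z = new double[n];
    for (int i = n - 1; i >= 0; i--) {
        double s = a[i][n];
        for (int k = i + 1; k < n; k++) s -= a[i][k] * z[k];
        z[i] = s / a[i][i];
    }
    return z;
}
```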

We have implemented this algorithm in AlgoQuant as a component/module/signal. Using our component-based programming paradigm, it is easy to modify the steps in the implementation without many changes to the code. For instance, we can pass to the constructor of Lai2010NPEBModel a GridSearchCetaMaximizer object instead of a BrentCetaMaximizer object to change the algorithm that maximizes \(C(\eta)\).

In addition, we have coded up a trading strategy using Lai's portfolio optimization algorithm as a signal. This strategy periodically re-balances a basket of assets according to the optimal weighting computed by the model. Specifically, onDepthUpdate() collects the return series using the component ReturnSeries via its update() method. Then, onTimerUpdate() is triggered periodically (say, every 3 months) to call the model (passing in the returns matrix) and to send orders to re-balance the assets accordingly. We backtest this trading strategy using the Simulator object together with historical data from Yahoo! Finance (or other data sources).

References

  • Black, F. and Litterman, R. (1990). Asset Allocation: Combining Investor Views with Market Equilibrium. Goldman, Sachs & Co., New York.
  • Fan, J., Fan, Y. and Lv, J. (2008). High dimensional covariance matrix estimation using a factor model. J. Econometrics, 147, 186–197.
  • Lai, T. L., Xing, H. and Chen, Z. (2010). Mean-variance portfolio optimization when means and covariances are unknown. Annals of Applied Statistics, 5(2A), 798–823.
  • Ledoit, O. and Wolf, M. (2004). Honey, I shrunk the sample covariance matrix. J. Portfolio Management, 30, 110–119.
  • Michaud, R. O. (1989). Efficient Asset Management. Harvard Business School Press, Boston.

Bloomberg Tick-By-Tick Data Download

Bloomberg maintains tick-by-tick historical data for only 140 days. However, you may want to backtest your strategies with a longer history. In this case, you can archive the tick data yourself and backtest with the archived data. Since version 0.2, AlgoQuant supports downloading tick-by-tick data from Bloomberg and saving it as CSV files via the Bloomberg Java API v3 (assuming that you have access to a Bloomberg terminal).

After you download the AlgoQuant package, you will find a folder lib/blpapi-dummy which contains the Bloomberg API jar file (blpapi3.jar). This file is a dummy that lets AlgoQuant compile. To use the Bloomberg Java API, you need to replace the file with the real API jar file. If the Bloomberg API has been deployed on your machine, you can find the real jar file on your hard drive, for example,

C:\blpAPI\APIv3\JavaAPI\v3.4.3.2\lib\blpapi3.jar

Note that the version number of the API may be automatically upgraded by Bloomberg.

The code for connecting, downloading and saving the tick data is located in the package com.numericalmethod.algoquant.data.historicaldata.bloomberg. In the package you can find a simple demo application "TickDataDownloadApp", which accepts command-line arguments and downloads the tick data of a given period for a given security:

Usage: TickDataDownloadApp <security> <startDate (yyyy-MM-dd)> <endDate (yyyy-MM-dd)> <output dir>

Note that the start date should be within the last 140 days, as that is the oldest history you can download from Bloomberg.

Here is how it works. The Bloomberg system provides a core service named "//blp/refdata" which allows downloading a wide range of market data. The code opens a session to connect to localhost at port 8194 (change the settings in SimpleSessionFactory if you are using a different port). Then, it sends an IntradayTickRequest with the security symbol and the start and end dates. Upon receiving the response, the BloombergTickDataFileWriter saves the data as zipped CSV files in the specified output folder (one file per trading day). For example,

TickDataDownloadApp "5 HK Equity" 2012-12-01 2012-12-21 data

will save the data as .csv.zip files in the folder “data/5 HK Equity“.
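
Under the hood, the essential request flow looks roughly like the sketch below, which follows the Bloomberg API v3 Developer's Guide referenced at the end of this post (the actual AlgoQuant classes wrap these steps; error handling is omitted):

```java
import com.bloomberglp.blpapi.*;

/** Minimal sketch of the //blp/refdata request flow described above. */
public class IntradayTickDemo {
    public static void main(String[] args) throws Exception {
        SessionOptions options = new SessionOptions();
        options.setServerHost("localhost"); // change if your terminal uses another host/port
        options.setServerPort(8194);

        Session session = new Session(options);
        session.start();
        session.openService("//blp/refdata");
        Service refData = session.getService("//blp/refdata");

        Request request = refData.createRequest("IntradayTickRequest");
        request.set("security", "5 HK Equity");
        request.getElement("eventTypes").appendValue("TRADE");
        request.set("startDateTime", new Datetime(2012, 12, 1, 0, 0, 0, 0));
        request.set("endDateTime", new Datetime(2012, 12, 21, 0, 0, 0, 0));
        session.sendRequest(request, new CorrelationID(1));

        // consume partial responses until the final RESPONSE event arrives
        boolean done = false;
        while (!done) {
            Event event = session.nextEvent();
            MessageIterator it = event.messageIterator();
            while (it.hasNext()) {
                Message msg = it.next();
                System.out.println(msg); // a real writer would convert the ticks to CSV here
            }
            done = (event.eventType() == Event.EventType.RESPONSE);
        }
        session.stop();
    }
}
```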

Since AlgoQuant is source-available, you are free to change the code to download different data, or save the data into your database instead of files.

Happy downloading data! 🙂

Reference: Bloomberg API Version 3.x Developer’s Guide

Using SuanShu on Amazon EC2

Cloud computing is very popular nowadays. Delegating your CPU-intensive computation (or simulation) to the cloud seems to be a smart choice. Many of our users have asked whether SuanShu can run on Amazon's Elastic Compute Cloud (EC2), because the SuanShu license requires a MAC address and they have no control over which machine is used when they launch an EC2 instance. Here comes some good news! Amazon Web Services (AWS) now supports the Elastic Network Interface (ENI), with which you can bind your EC2 instance to a specified network interface. Therefore, you can license your SuanShu against the MAC address of the ENI, and launch an instance with the same ENI and MAC address. For details, please visit the blog here. The user guide for ENI can also be found here.

On White’s (2000) Reality Check

I was asked the following question the other day about White's reality check.

QUESTION:

I was reading White's (2000) paper (http://www.ssc.wisc.edu/~bhansen/718/White2000.pdf). It seems to suggest that to examine the P&L characteristics of a strategy, we should bootstrap the P&L. I am wondering why this makes sense. Why scramble the P&L time series? Why not scramble the original return series, e.g., the returns of the S&P 500, paste the segments together, and feed the "fake" return series into the strategy again? Then you can generate as many P&L time series as you like, and use them to compute the expected P&L, Sharpe ratio, etc. Would the second approach be more realistic than the approach suggested in the paper?


ANSWER:

1) In the statistical literature, there are basically two approaches to tackling the data snooping bias problem. The first and more popular approach focuses on data. It tries to avoid re-using the same data set, which can be done by testing a particular model on randomly chosen subsamples of the original time series, or by bootstrapping the original time series (e.g., returns). However, one may argue that this sampling approach is somewhat arbitrary and may lack the desired objectivity. See Chan, Karceski, and Lakonishok (1998) and Brock, Lakonishok, and LeBaron (1992) for examples of this sampling approach. Your proposed method basically falls into this category.

The second and more "formal" approach focuses on models, by considering all relevant models (trading strategies) and constructing a test with a properly controlled type I error (test size). Unfortunately, this method is not feasible when the number of strategies being tested is large. White's paper basically follows the latter approach, but he did it in a smart way that does not suffer from the aforementioned problem.

2) However, there is also a problem associated with White's reality check. It may reduce the rejection probabilities of the test under the null through the inclusion of poor and irrelevant alternative models (trading strategies), because it does not satisfy a relevant similarity condition that is necessary for a test to be unbiased. I did some research and found the following paper by Hansen, which attempts to amend this problem by modifying White's reality check.
Hansen, P. R. (2005), “A Test for Superior Predictive Ability”, Journal of Business & Economic Statistics, 23, pp. 365-380.

3) Basically, the two approaches place their emphasis differently. Your proposed method tries to answer the following question: will this particular strategy be profitable on a different data set that exhibits similar characteristics to the original one? White's (2000) reality check, on the other hand, tries to answer the following question: is at least one trading strategy (out of a pool of many seemingly profitable strategies) really profitable on this particular data set? I think a good and conservative approach would be: (a) answer White's question first, and if the answer is YES, then (b) answer your question by bootstrapping the return time series (and other covariates).

China ETFs, anyone?

Actually, I struggled to decide on the title of this post. At one point I believed the more appropriate title would be "Will China long funds disappear like penguins?" However, I decided to send a more positive message to my readers, so hopefully this title will spark my readers' interest in investing in China equities…

In case my readers are not aware, one of the latest trends in the Hong Kong market is the rise of sector ETFs. You see, products such as the China A-50 (2823 HK) and the Tracker Fund (2800 HK) have been in the market for quite some time, but it is only now that more specific China ETFs are becoming available. This is one of the most exciting developments in the Greater China equities market, as it gives investors more options to maximize their return.

Claymore has been the pioneer in terms of product offerings for the Chinese equities market. Although Claymore cannot really break into the A-share market, it is combining the best of the Hong Kong, US and Singapore listed Chinese stocks to form specific ETFs. The performance of its ETFs has been stellar. For example, the 2009 return of Claymore's China small cap ETF (HAO US Equity) was 98%, better than the majority of long funds (I heard from market rumor that the top-10-percentile fund performance, including long-only and hedge funds, was about 90%). Probably having seen the success of HAO (in Chinese, hao means good), Claymore has launched two additional sector funds this year: a real estate ETF and a technology ETF. It is pretty straightforward: the real estate ETF invests in stocks like CR Land, COLI, Shimao and Sino-Ocean, while the technology fund invests in stocks like Tencent and Baidu. These ETFs become good options if investors want to bet on specific sectors of the market. Especially given the huge volatility among different sectors (say, real estate was beaten down by government policy and I would like to bet on the upturn), they open up the option to buy selectively in the China market. Also, as a buy-side analyst, buying these ETFs allows me to participate in a more tactical way without violating most compliance rules. So I am not just recommending this to retail investors in general; it works even for finance professionals.

Another new set of ETFs, launched by Deutsche Bank, covers the A-share sectors. There was no way for foreign investors to bet on specific sectors in the A-share market until the launch of this product. Now if I want to, I can buy, say, the healthcare sector ETF (3057 HK) in the midst of the healthcare sector rally driven by the huge investment the government is making in the sector. Note that although investors do not actually own the A-share healthcare stocks (in a sense this is a swap-based investment), you can access a specific investment opportunity in the market rather than just moving along with the whole market. Almost at the same time, iShares launched 4 sector ETFs as well, covering financials, infrastructure, energy and materials, all mimicking the performance of the corresponding CSI 300 sector indices.

Still not enough? Well, consider the ETF launched by DaChen Asset Management just a couple of days ago (3071 HK), which has a specific focus on A-share consumer stocks. I haven't checked the holdings yet, but this could be the ETF to buy, given that the consumption growth story in China is well known to everybody and this is the sector currently most supported by the government, as China intends to build a big, strong middle class.

To me, the rise of China ETFs should send a big warning signal to long-only funds. I mean, there is little reason to invest in most funds given that HAO can produce a whopping 98% return. I simply do not understand the rationale for investing in long funds if I can get even better returns investing in ETFs. In Hong Kong, it is easy enough to open an etrade account and trade US stocks. I also do not understand why the HK fund managers are still so arrogant these days, when they do not realize that ETFs are seriously challenging their once lucrative and promising careers.

My last piece of advice is to bet on the beaten-up real estate and materials sectors in China. If you believe the market will turn around soon (given the limited downside and the overly pessimistic sentiment in the Mainland), why not give these way-oversold sector ETFs a good try, instead of those arrogant, ignorant fund managers who do not even realize these ETFs are available in the market?

I believe some of my wording is a little too harsh in this write-up, but I just want to comment honestly about what is really going on in the market. Yes, I do worry about my own job prospects in the long run as these ETFs keep coming out of nowhere.

Creative retailing

All right, I may bore my readers, as my focus has always been on the consumer sector in China. But I do find a lot of interesting subplots going on in this sector as I spend my weekend shopping in Hong Kong. You see, Hong Kong is quite a mature city (in terms of high rental costs and its status as an international center), so it is not strange to see new forms of retailing evolving in such a sophisticated city, catering to the needs of a diversified pool of customers. For instance:

  • The birth of Cosway (288 HK) in Hong Kong: I am actually quite amazed to see that direct sales is now happening at the street-store level. This is a subsidiary of Berjaya Corp, and so far it has already opened more than 90 little shops at the street level to recruit anyone who is interested in building a direct sales network. In fact, it is so visible now that I can see people selling Cosway's water filtration equipment in the noisiest part of Causeway Bay. If you still believe the direct sales business model belongs to the "underground economy", I hope you will have time to visit Cosway's stores. They are quite small, but you can find them and they work! Every time I walk by a Cosway store in Hong Kong, I see customers shopping and recruiting new customers. So go figure.
  • The funny shoes from Skechers selling in department stores (by the way, did you know GEM, the cute little HK singer, is the brand's spokesperson?): Okay, I know this is more of a global story, but I have tried the shape-up shoes and they are really comfortable. So other than athletic shoes and leather shoes, you can now find tone-up and shape-up shoes in stores. (Interestingly, Kingmaker is the OEM for Skechers, and we shall see if Skechers can help revive this company.)
  • The rise of L'Occitane: Okay, I have to admit this is essentially another global name, but its success in the Asia region is quite a unique story, partly coming from unique marketing efforts and product differentiation; the shops can sell a can of cream for up to 100 USD. Walking from a Cosway store to a L'Occitane store in Hong Kong makes my head a bit giddy… how can all these forms of retailing work within the same place, even in the same area?
  • To make me feel more schizophrenic, I can actually find milk powder in front of Sa Sa's shops. It looks to me like Sa Sa is now also diversifying into selling baby products, while the mothers are busy shopping for cosmetics at the same time. Oh, not to mention the health drinks sitting next to the milk powder. The other thing that really amazes me is Sa Sa's outlet store. Wait a second… Sa Sa already sells mass-market cosmetic products, and yet you can find a shop (at least in a crowded area of Jordan) to dump off old inventories? Kudos to good management.
  • The emerging concept of the outlet mall in China: Okay, outlet malls are nothing special in the US, but now you can find these malls in Beijing and Shanghai! You will find them in Guangzhou soon, and if you don't believe it, check out companies like PCD Stores or Guangzhou Friendship. Then you will understand how companies like Ports Design can dump off inventories and get additional cash from happy customers who think they have found bargains.
  • The continual evolution of Suning and Gome in China: Both players are now considering stores wholly run by store managers instead of guided by direct sales agents from the brands. They will be more like the Broadway stores in Hong Kong, where the staff are responsible for selecting, demonstrating and selling the consumer electronics to customers, instead of sales agents from the manufacturers. So margin improvement, hint hint…
  • Last but not least, the way everyone uses 3G phones these days. I used to think a phone was for talking, but now it can be used as a camera, or as a laptop for browsing and downloading… every time I enter a mobile phone store, I am totally lost among those new 3G/wifi/wireless plans that could easily cost me a lot of money.

Needless to say, for all the companies that I have mentioned above, I think their potential in the Greater China market is huge. Many of them still do not even have a presence in China but already have many, many customers from the mainland, such as Cosway. We shall see how these companies fare in the China market in the future, but to be honest, my view is that there will be many re-rating opportunities for the companies mentioned above, as investors discover how companies are rewarded for their creative strategies to differentiate and serve the specific needs of customers. For others, like Texwinca's Baleno brand or Esprit, it is now or never to reformulate a sales strategy to turn around their business, as they could eventually simply disappear from the Hong Kong market or become third-tier brands, even in China. A look at how Theme struggles even in China should remind us to stay motivated and ahead in this constantly changing retail landscape.

All right, I promise my readers I will move on to discuss other topics, so that it's not just all about the consumer sector. See you next time!