In the data stream model, data arrives at high volume and speed, and there is not enough storage to hold the entire input. The input is examined and processed one record at a time, upon arrival. The typical processing is randomized sketching, which requires only sublinear storage and often uses only poly-logarithmic space and time per record. We present some interesting sketching solutions from the literature for several statistical problems over data streams. We also discuss some recent advances in algorithms for numerical linear algebra obtained using linear sketching. In this technique, an input matrix is compressed into a much smaller matrix by multiplying it with a random matrix drawn from a suitable distribution. The original problem is then solved approximately, to within a factor of 1 ± ε with high probability, by computing over the smaller matrix. We illustrate this method using the least squares regression problem.
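
As a rough illustration of the linear sketching idea for least squares regression, the following minimal sketch compresses a tall problem with a dense Gaussian random matrix and solves the smaller problem instead. The Gaussian distribution, the function name `sketched_least_squares`, the problem sizes, and the sketch size of 500 rows are all assumptions made for this example; the abstract does not specify which distributions or parameters are used, and practical methods often prefer sparser or more structured sketching matrices.

```python
import numpy as np


def sketched_least_squares(A, b, sketch_rows, rng):
    """Approximately solve min_x ||A x - b||_2 using a Gaussian sketch.

    Hypothetical helper for illustration only; the choice of a dense
    Gaussian sketching matrix is an assumption, not the source's method.
    """
    n = A.shape[0]
    # Random sketching matrix S of shape (sketch_rows, n), scaled so that
    # squared norms are preserved in expectation.
    S = rng.standard_normal((sketch_rows, n)) / np.sqrt(sketch_rows)
    # Solve the compressed problem min_x ||(S A) x - (S b)||_2.
    x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x_sketch


rng = np.random.default_rng(0)
n, d = 5000, 10  # tall input: many rows, few columns (assumed sizes)
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.1 * rng.standard_normal(n)

# Exact solution on the full problem, for comparison.
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
# Approximate solution computed from a 500-row sketch.
x_approx = sketched_least_squares(A, b, sketch_rows=500, rng=rng)

# Compare residual norms measured on the ORIGINAL problem.
cost_exact = np.linalg.norm(A @ x_exact - b)
cost_approx = np.linalg.norm(A @ x_approx - b)
ratio = cost_approx / cost_exact
print(f"residual ratio (sketched / exact): {ratio:.4f}")
```

The ratio can never fall below 1, since the exact solution minimizes the residual; how close it stays to 1 depends on the sketch size relative to the number of columns. Note that this dense sketch is only meant to show the compress-then-solve pattern: forming `S @ A` this way costs more than solving the original problem directly, which is why faster sketching matrices matter in practice.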