Use your computer fearlessly.




[ Security | Consulting | Research ]




Algorithms for Big Data

An algorithm is a precisely specified computational procedure that can be proven correct. Most software on the market consists of many algorithms pieced together, and the assembled whole is not provably correct. Software works when the algorithms it is built from are correct.

Big data is any data set that is large relative to an algorithm's ability to process it efficiently. Suppose you have only a few dozen records; then even an exponential algorithm (i.e., an algorithm with O(2^n) running time) may finish quickly. But each additional record doubles that running time, so a single order-of-magnitude increase in the size of the data set makes the computation astronomically slow. So, you need a faster algorithm.
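
To make that blow-up concrete, here is a minimal sketch in Python (illustrative only: the input sizes 30 and 300 and the abstract "step" counts are stand-ins, not measurements) comparing how three common running-time classes scale across one order of magnitude:

    # Illustrative only: compare the abstract "step" counts that different
    # running-time classes demand as the input size n grows tenfold.

    growth_rates = {
        "linear      O(n)":   lambda n: n,
        "quadratic   O(n^2)": lambda n: n ** 2,
        "exponential O(2^n)": lambda n: 2 ** n,
    }

    for name, f in growth_rates.items():
        small, large = f(30), f(300)  # input sizes one order of magnitude apart
        print(f"{name}: n=30 needs {small:.3g} steps, "
              f"n=300 needs {large:.3g} steps ({large / small:.3g}x more)")

The exponential row dwarfs the others: going from n=30 to n=300 multiplies the work by a factor of 2^270, while the linear algorithm merely does ten times as much.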

If your program has running time f(n), then the largest data set it can process within a time budget T has size roughly f^-1(T), the inverse of the running-time function: the slower f grows, the larger the tractable input. Therefore, big data often requires algorithms with linear or quadratic running time. If the running time of the algorithm cannot be improved, parallelization may be a feasible option.
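
The same relationship can be read in reverse: fix a step budget T and invert each running-time function to see the largest input it can handle. A minimal sketch along the same lines, assuming a purely hypothetical budget of one trillion steps:

    # Illustrative only: given a fixed budget of T "steps", find the largest
    # input size n each running-time class can process within that budget.
    import math

    T = 10 ** 12  # hypothetical budget of one trillion steps

    largest_feasible_n = {
        "linear      O(n)":   T,                  # n = f^-1(T) = T
        "quadratic   O(n^2)": math.isqrt(T),      # n = f^-1(T) = sqrt(T)
        "exponential O(2^n)": int(math.log2(T)),  # n = f^-1(T) = log2(T)
    }

    for name, n_max in largest_feasible_n.items():
        print(f"{name}: largest feasible n = {n_max:,}")

Under this budget a linear algorithm handles a trillion items, a quadratic one a million, and an exponential one about forty, which is why big data pushes you toward linear running times.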

Dr. Kirkpatrick has over two decades of experience in algorithm design and over a decade of experience with big-data analysis, statistics, and machine learning. He is an expert at avoiding NP-hard problems and at inventing efficient, accurate algorithms for data analysis. He also has experience in debugging, testing, and parallel computing.




bbkirk@intrepidnetcomputing.com