A client recently asked me to assist with his numeric analysis function: it took 45 minutes to run (5000 runs of ~0.55 secs each), which was unacceptable. The code had to run in 10 minutes or less to be useful. It turns out that 99% of the run-time was taken up by Matlab's built-in fitdist function (part of the Statistics Toolbox), which my client was certain was already optimized for maximal performance. He therefore assumed that to get the necessary speedup he must either switch to another programming language (C/Java/Python), and/or upgrade his computer hardware at considerable expense, since parallelization was not feasible in his specific case.

Luckily, I was there to assist and was able to quickly bring the run-time down to 7 minutes, well below the required limit. In today's post I will show how I did this, which is relevant for a wide variety of similar performance issues with Matlab. Many additional speed-up techniques can be found in other performance-related posts on this website, as well as in my book Accelerating MATLAB Performance.

The first step in speeding up any function is to profile its current run-time behavior using Matlab's built-in Profiler tool, which can be started either from the Matlab Editor toolstrip ("Run and Time") or via the profile function.

The profile report for the client's function showed that 99% of the time was spent in the Statistics Toolbox's fitdist function. Drilling into this function in the profiling report, we see onion-like layers of functions that process input parameters, validate the data etc. The core processing is done inside a class that is unique to each required distribution (e.g., prob.StableDistribution, prob.BetaDistribution etc.), which fitdist invokes using an feval call, based on the distribution name that was specified by the user.

In our specific case, the external onion layers of sanity checks were unnecessary and could be avoided. In general, I advise not to discard such checks, because you never know whether future uses might have a problem with outlier data or input parameters. Moreover, in the specific case of fitdist they take only a very minor portion of the run-time (this may be different in other cases, such as the ismember function that I described years ago, where the sanity checks have a significant run-time impact compared to the core processing in the internal ismembc function).

However, since we wanted to significantly improve the total run-time, and this time was spent within the distribution class (prob.StableDistribution in the case of my client), we continued to drill down into this class to determine what could be done.

It turns out that prob.StableDistribution basically does the following in its main fit() method:

1. (… removeCensoring() methods): this turned out to be unnecessary for my client's data, but has little run-time impact.
2. call stablefit() in order to get fitting parameters: this took about half the run-time.
3. call stablelike() in order to get likelihood data: this took about half the run-time as well.
4. call () to create a probability-distribution object returned to the caller: this also took little run-time and was not worth bothering about.

With user-created code we could simply modify our code in-place.
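The profiling step described in this post can also be scripted rather than run from the toolstrip. The following is only a minimal sketch: `workload` is a hypothetical stand-in for the client's analysis function (which is not shown in this post), and the loop count is arbitrary. It uses the profile function to collect timing data and then lists the functions that took the most total time, which is how a hotspot such as fitdist would surface.

```matlab
% Hypothetical stand-in for the function being analyzed (an assumption,
% not the client's actual code)
workload = @() sort(rand(2e5, 1));

profile on                 % start collecting timing data
for k = 1:20
    workload();
end
profile off                % stop collecting

stats = profile('info');   % programmatic access to the collected results
funcs = stats.FunctionTable;

% Rank the profiled functions by total time, slowest first
[~, order] = sort([funcs.TotalTime], 'descend');
top = funcs(order(1:min(5, numel(order))));

for k = 1:numel(top)
    fprintf('%-40s %8.3f s  (%d calls)\n', ...
        top(k).FunctionName, top(k).TotalTime, top(k).NumCalls);
end

% For the interactive report, with drill-down into each function,
% run:  profile viewer
```

The interactive report is usually more convenient for drilling into nested calls; the programmatic form is handy when comparing run-times before and after a change.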