Friday, 26 June 2015

Progress report

The main goal of this post is to give a progress report before the coming midterm assessment.
As I mentioned before, I planned to complete the Dimensions and entropies section of the TISEAN documentation. This still seems to be a realistic goal.

So far I have ported d2, av-d2, c2g and c2t, and written documentation and demos for them. The current state of the tests needs improvement because they rely heavily on external files generated using the corresponding TISEAN programs. Because most of those functions/programs are closely linked, I plan to improve the tests once the functions from the entire section are ported.

Currently I am working on c1, which already passes its test. Once I complete it and write the documentation and demo, the only programs/functions that will still need to be ported are boxcount and c2d. Once they are complete I plan to release version 0.2.0.

My elaborated proposal, located on the Octave wiki, states that I also planned to port c2. Although the source code for such a program does exist in the TISEAN package (ver. 3.0.1), it does not seem to be mentioned in the documentation. Furthermore, installing the package on a computer does not give access to this program. It also seems to be redundant with other programs in the package. Therefore, I will not port it.

Sunday, 14 June 2015

Improving the code

TISEAN was originally written as a set of command-line programs. Because of this, the code is not very portable and many variables are global. So far I have dealt with this by creating local variables and extending the argument lists of function calls (in some cases to as many as 11 arguments). This is not optimal for code clarity or ease of maintenance, and because many of the variables are passed by value (they are parameters), it has also caused a slight slowdown in execution speed.

Because of these downsides I have considered possible solutions, which I describe below.

Using structs

One idea that came to mind is to pack all of the global variables into a struct, pass the struct to all of the functions, and read the former globals from it. This certainly solves the problem of passing so many parameters to functions. However, it does not improve code portability, because every *.cc oct-file function needs its own struct. It is also awkward because every former global now has to be referenced through the struct, so center[i][j] would become parameter_struct.center[i][j] (obviously the name of the struct could be as short as p).
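
The struct idea can be sketched in plain C++ (no Octave headers). This is only an illustration under stated assumptions: the names Params, squared_distance and is_neighbour, as well as the fields dim, delay and eps, are hypothetical and are not taken from the TISEAN sources.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical bundle of values that used to be globals in one program.
// The field names are illustrative only.
struct Params
{
  std::size_t dim;    // embedding dimension
  std::size_t delay;  // embedding delay
  double eps;         // neighbourhood size
};

// One const reference replaces a long argument list.
// Returns the squared distance between the delay vectors that start
// at indices i and j of the series.
double
squared_distance (const std::vector<double>& series, const Params& p,
                  std::size_t i, std::size_t j)
{
  double sum = 0.0;
  for (std::size_t k = 0; k < p.dim; k++)
    {
      const double diff = series[i + k * p.delay] - series[j + k * p.delay];
      sum += diff * diff;
    }
  return sum;
}

// True when the two delay vectors are closer than p.eps.
bool
is_neighbour (const std::vector<double>& series, const Params& p,
              std::size_t i, std::size_t j)
{
  return squared_distance (series, p, i, j) < p.eps * p.eps;
}
```

Passing the struct by const reference also avoids copying each parameter on every call, which is where the slowdown mentioned earlier would come from.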

Using classes

Another quite simple solution would be to create a class. Its data members would be the former global variables, and its member functions would be the old functions called from the old main(). As there are similarities between different TISEAN programs, it might even be possible to create a prototype class and inherit from it.

There are, however, downsides to this option as well. First of all, the Octave code guidelines specify that classes should be in separate files. This would mean creating 2 more files for each program that was ported using the C++ wrapper. Apart from that, the memory might have to be allocated using new/delete, because the preferred method of using the macro OCTAVE_LOCAL_BUFFER might be difficult (or impossible) to apply in this case. This objection can be worked around in other ways, such as using Array classes to allocate the data and then getting a pointer to it using fortran_vec().
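
A rough sketch of the class idea follows. It is written in standard C++ only, so std::vector and its data() method stand in for Octave's Array classes and fortran_vec(); that substitution is an assumption made to keep the example self-contained. The class names TiseanProgram and MeanOfVectors are hypothetical, and the computation is a placeholder rather than any real TISEAN algorithm. The sketch assumes dim is at least 1.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical base class: state shared by several ported programs.
class TiseanProgram
{
public:
  TiseanProgram (const std::vector<double>& series, std::size_t dim,
                 std::size_t delay)
    : m_series (series), m_dim (dim), m_delay (delay)
  { }

  virtual ~TiseanProgram () = default;

  // Number of delay vectors that fit into the series (assumes dim >= 1).
  std::size_t
  vector_count () const
  {
    const std::size_t span = (m_dim - 1) * m_delay;
    return m_series.size () > span ? m_series.size () - span : 0;
  }

protected:
  // Former globals become data members. The vector owns its memory,
  // so no explicit new/delete is needed.
  std::vector<double> m_series;
  std::size_t m_dim;
  std::size_t m_delay;
};

// Hypothetical derived class for one program.
class MeanOfVectors : public TiseanProgram
{
public:
  using TiseanProgram::TiseanProgram;

  // Placeholder computation: mean of the first component of every delay
  // vector, read through a raw pointer as legacy C code would expect.
  double
  run () const
  {
    const std::size_t n = vector_count ();
    if (n == 0)
      return 0.0;
    const double *x = m_series.data ();
    double sum = 0.0;
    for (std::size_t i = 0; i < n; i++)
      sum += x[i];
    return sum / n;
  }
};
```

The old helper functions keep their short variable names because they read the members directly, which avoids the parameter_struct. prefix problem of the struct approach.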

Summary

Performing the code improvements described above, although helpful, is not critical. Therefore any attempt to implement them will be deferred until the functions outlined at the beginning of the project are complete.

Timeline update

So far I have been giving progress reports on the TISEAN porting project. This time, however, I would like to also compare the outlined schedule for the project with the actual progress made.

Since the last post I have additionally ported:
  • xzero
  • lyap_r
  • lyap_k
  • lyap_spec
In one of my first posts I stated that I would like to finish Dimensions and Entropies before the midterm assessment. Currently I have finished up Lyapunov Exponents and I plan to start working on Dimensions and Entropies this week. Since there are 2+ weeks to the Midterm Assessment I believe it is possible to complete all of the goals for this section of the project as planned.

Sunday, 7 June 2015

Analyzing lfo-run

I have written tests that compare lfo-run from TISEAN with the ported version, lfo_run. The test that uses amplitude.dat works perfectly, but when I analyzed the results that the program and the function gave for henon (Henon map) I ran into some problems. I will attempt to describe them.

Input data

The problems occur when analyzing a 1000-element Henon map (henon(1000)). For all of the implementations, a simple call with the default parameters (m = 2, d = 1) made the program quit due to a matrix singularity. The discrepancies arose when (m = 4, d = 6) was used: with these parameters the results differed between the implementation methods.

It is important to note that the prediction that I was testing tried to predict 1000 future elements (default for all implementations) on the basis of given 1000 elements.

Implementations

There are 3 implementations I used:
  1. The TISEAN implementation (uses lfo-run)
  2. An implementation similar to 1. but compiled as C++ and wrapped in enough code to run as an m-file (uses __lfo_run__ and invert_matrix())
  3. The implementation that uses Matrix::solve() method
I tried to find out whether implementations 1. and 2. differ because of a bug introduced while porting. I therefore ported the program twice (the second time only to a very rudimentary stage), and both times I obtained the same results. I do not understand why there is a discrepancy between these two implementations.

Discrepancies

Since the goal of this project is to port TISEAN functions, I compared implementation 1. with 2. and 1. with 3. to see what differences I would come across.

A comparison between implementation 1. and 2. results in an error from implementation 2. The function generates about 700 elements (of the default 1000) and then throws an error that the forecast has failed.

The comparison between implementations 1. and 3. is much more fruitful, as the results are the same for about 150 elements and only then begin to differ (see Fig. 1).

Fig. 1 Comparison between the TISEAN implementation and using Matrix::solve() in the TISEAN package for Octave

These results can be reproduced by cloning the repo, running make run, and then running the script:

cd tests/lfo_run/; test_lfo_run


Analysis

Before I suggest what causes these discrepancies, I would like to discuss another interesting one: the maximum difference between the solutions of the equation system obtained from implementations 2. and 3. When implementation 3. was used for the forecast this value was 8.5e-14, but when implementation 2. was used for the forecast the difference was 5e-13.

I believe this is because the computational error accumulates throughout the program. Each new forecast point depends on the previous ones. Moreover, the Kahan algorithm (compensated summation) is never used in the TISEAN implementations. Even matrix multiplication (as seen e.g. in multiply_matrix()) uses the simple but error-accumulating for(...) sum += vec[i].

As to why implementations 1. and 2. give different results, I have two theories: either there is still a bug that I was unable to detect, or there is some compilation difference (e.g. a linked library) between the TISEAN program written in C (lfo-run) and the TISEAN package function written in C++ (__lfo_run__).

Summary

The question that arises is whether this warrants rewriting the other TISEAN functions that use simple summation, or whether the problem can be ignored. The authors of TISEAN said in their introduction that blindly using the programs they wrote may give unintended or even wrong results. Trying to predict 1000 elements of a 1000-element Henon map using first-order local linear prediction might be considered a bad use case.

Progress report

Since the last post I wrote a tutorial on http://wiki.octave.org/TISEAN_package. I also ported:
  • lzo_gm
  • lzo_run
  • ikeda
  • lfo_run
  • lfo_ar
  • lfo_test
  • rbf
  • polynom
During my work I discovered that polynom has related programs (polynomp, polypar, polyback) which provide extra options for performing polynomial fits. I will not include those functions in the project now, but they have a high priority once I finish all of the functions I outlined for this project.

These newly ported functions aren't completely polished (some need demos and documentation), but they pass their tests and don't have memory leaks. Once I clean these functions up and port xzero, the last function in this section, I intend to create version 0.1.0 of the package. With this version I intend to split the repo into a 'devel' and a 'stable' branch.

Afterwards, I will add more information to the tutorials on the wiki page.