We should try to optimize the scavenging, which in my opinion is not fulfilling its role as well as it should.
When you compare an application linked with the Boehm GC against the same application linked with our GC, the Boehm version runs in about 75% of the time. However, the time spent in the GC itself is clearly in favor of the ISE GC, which also uses less memory: we spend half as much time in the GC and use about 35MB less than the Boehm GC. So the only remaining difference is the generated code, and there the one major difference between the Boehm GC and ours is stack management. Ours is manual; Boehm's relies on the hardware stack.
Configuration | Time | Memory |
Boehm | 3m4s | 85MB |
Boehm in MT | 3m9s | 85MB |
ISE | 4m9s | 51MB |
ISE in MT | 5m17s | 51MB |
What is striking is the difference between non-MT and MT for ISE. The reason is that thread-specific data is used extensively in the ISE GC, but not at all in Boehm. As a consequence, all the stack management routines go through an indirection ("eif_globals"), which is the only difference I can see at this point.
I thought that EIF_GET_CONTEXT might be costly, but that does not seem to be the case. To test this, instead of using EIF_GET_CONTEXT to retrieve `eif_globals', I passed it as the first argument of all our routines. We avoid a call, but we push more on the stack. We get an improvement in speed, but only 8s out of more than 5m. We also get a smaller executable.
Configuration | Time | Executable Size |
ISE in MT using EIF_GET_CONTEXT | 5m17s | 7,843,840 bytes |
ISE in MT using argument passing of eif_globals | 5m9s | 7,741,440 bytes |
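To make the two variants measured above concrete, here is a minimal sketch. It is not the runtime's code: `ctx_t`, `bind_context`, `get_context` and the `caller_*`/`callee_*` routines are hypothetical names, and the per-thread lookup is modeled with a C11 `_Thread_local` variable, which may differ from how EIF_GET_CONTEXT is actually implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-thread runtime context (eif_globals). */
typedef struct {
    int depth;            /* illustrative field: a per-thread counter */
} ctx_t;

/* One context pointer per thread (C11 thread-local storage, an assumption). */
static _Thread_local ctx_t *current_ctx = NULL;

static void bind_context(ctx_t *ctx) { current_ctx = ctx; }

/* Variant A: each routine fetches the context itself,
 * in the spirit of EIF_GET_CONTEXT. */
static ctx_t *get_context(void) { return current_ctx; }

static int callee_lookup(void) {
    ctx_t *ctx = get_context();      /* one lookup per routine */
    return ++ctx->depth;
}

static int caller_lookup(void) {
    ctx_t *ctx = get_context();      /* and another one here */
    ctx->depth++;
    return callee_lookup();
}

/* Variant B: the context travels as an explicit first argument,
 * so no lookup is needed but every call carries one more argument. */
static int callee_passed(ctx_t *ctx) { return ++ctx->depth; }

static int caller_passed(ctx_t *ctx) {
    ctx->depth++;
    return callee_passed(ctx);
}

/* Both variants update the same per-thread context. */
static void demo_context(void) {
    ctx_t c = { 0 };
    bind_context(&c);
    assert(caller_lookup() == 2);    /* depth: 0 -> 2 */
    assert(caller_passed(&c) == 4);  /* depth: 2 -> 4 */
}
```

Variant A pays a lookup in every routine, while Variant B pays an extra argument on every call, which matches the trade-off described above (one call avoided, more pushed on the stack).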
At the moment we have a global variable `loc_set' which tracks all references pushed on the stack. It works like this:

    void Eiffel_routine (EIF_REFERENCE Current, EIF_REFERENCE arg1, EIF_INTEGER arg2)
    {
        EIF_REFERENCE loc1 = NULL;
        EIF_REFERENCE loc2 = NULL;
        RTLI (4);
        RTLR(0, Current);
        RTLR(1, arg1);
        RTLR(2, loc1);
        RTLR(3, loc2);
        ...
        RTLE;
    }
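As a rough model of the scheme above, the following sketch keeps a global stack of addresses of reference variables. The macros `LOC_INIT`, `LOC_REG` and `LOC_END` are hypothetical analogues of RTLI, RTLR and RTLE written for this illustration only, and `relocate` is an assumed stand-in for what a moving collector does with the registered addresses.

```c
#include <assert.h>
#include <stddef.h>

typedef void *ref_t;                  /* stand-in for EIF_REFERENCE */

#define LOC_SET_MAX 1024
static ref_t *loc_set[LOC_SET_MAX];   /* addresses of registered variables */
static size_t loc_top = 0;

/* Hypothetical analogues of RTLI / RTLR / RTLE. */
#define LOC_INIT(n)   size_t loc_base_ = loc_top; loc_top += (n)
#define LOC_REG(i, v) (loc_set[loc_base_ + (i)] = &(v))
#define LOC_END       (loc_top = loc_base_)

/* What a moving collector could do: rewrite every registered
 * variable that still points at `from`. Returns the number updated. */
static size_t relocate(ref_t from, ref_t to) {
    size_t updated = 0;
    for (size_t i = 0; i < loc_top; i++) {
        if (*loc_set[i] == from) {
            *loc_set[i] = to;
            updated++;
        }
    }
    return updated;
}

static size_t routine(ref_t Current, ref_t arg1, ref_t moved_to) {
    ref_t loc1 = NULL;
    ref_t loc2 = NULL;
    LOC_INIT(4);                      /* reserve four slots */
    assert(loc_top <= LOC_SET_MAX);
    LOC_REG(0, Current);              /* one registration per reference */
    LOC_REG(1, arg1);
    LOC_REG(2, loc1);
    LOC_REG(3, loc2);

    loc1 = arg1;                              /* loc1 now aliases arg1 */
    size_t n = relocate(arg1, moved_to);      /* updates arg1 and loc1 */
    assert(arg1 == moved_to && loc1 == moved_to);

    LOC_END;                          /* drop this routine's slots */
    return n;
}
```

Calling `routine` with three distinct addresses updates exactly two registered variables (`arg1` and `loc1`), and `loc_top` is back to its previous value on return.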
A new idea would be to do:
    void Eiffel_routine (EIF_REFERENCE Current, EIF_REFERENCE arg1, EIF_INTEGER arg2)
    {
        struct locals {
            EIF_REFERENCE loc1;
            EIF_REFERENCE loc2;
            EIF_REFERENCE Current;
            EIF_REFERENCE arg1;
        } l;
        /* Zero only loc1 and loc2, the first two fields. */
        memset(&l, 0, 2 * sizeof(EIF_REFERENCE));
        l.Current = Current;
        l.arg1 = arg1;
        add_loc_set (&l, 4);
        ...
        remove_loc_set;
    }
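One possible shape for this idea is sketched below. It is only an illustration under assumptions: `frame_t`, `frames` and `count_live` are hypothetical, `add_loc_set` and `remove_loc_set` are written here as plain functions rather than whatever the runtime would use, and the struct is treated as an array of references, which assumes it contains only reference fields with no padding (checked with a static assertion).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void *ref_t;                  /* stand-in for EIF_REFERENCE */

/* One entry per active routine: where its references start, and how many. */
typedef struct {
    ref_t *base;
    size_t count;
} frame_t;

#define FRAME_MAX 256
static frame_t frames[FRAME_MAX];
static size_t frame_top = 0;

/* Hypothetical implementations of the two operations used above. */
static void add_loc_set(void *base, size_t count) {
    assert(frame_top < FRAME_MAX);
    frames[frame_top].base = (ref_t *)base;
    frames[frame_top].count = count;
    frame_top++;
}

static void remove_loc_set(void) {
    assert(frame_top > 0);
    frame_top--;
}

/* A collector would walk every frame; here we only count non-NULL slots. */
static size_t count_live(void) {
    size_t live = 0;
    for (size_t f = 0; f < frame_top; f++)
        for (size_t i = 0; i < frames[f].count; i++)
            if (frames[f].base[i] != NULL)
                live++;
    return live;
}

static size_t routine(ref_t Current, ref_t arg1) {
    struct locals {
        ref_t loc1;
        ref_t loc2;
        ref_t Current;
        ref_t arg1;
    } l;
    /* Assumption: four references laid out contiguously. */
    _Static_assert(sizeof(struct locals) == 4 * sizeof(ref_t),
                   "locals must be a packed run of references");

    memset(&l, 0, 2 * sizeof(ref_t)); /* zero loc1 and loc2 only */
    l.Current = Current;
    l.arg1 = arg1;
    add_loc_set(&l, 4);               /* a single registration per call */

    size_t live = count_live();       /* loc1 and loc2 are still NULL */

    remove_loc_set();
    return live;
}
```

With two non-NULL arguments, `routine` reports two live references; the routine body would then read and write its references through `l` rather than through separate variables.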
The advantages I see are:
The disadvantages I see are: