Lines 194 to 204 in 93d2801
Medium values are allocated and deallocated very frequently. I tested several `numfree` settings, and the results show that a free list can improve performance.
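For readers unfamiliar with the idea, here is a minimal Python model of a bounded free list. It is only an illustration: the real change would live in C inside the interpreter, and the class name `BoundedFreeList`, its methods, and its counters are hypothetical names invented for this sketch. The only concept taken from the text is `numfree`, used here as the maximum number of released objects kept for reuse.

```python
class BoundedFreeList:
    """Toy model of a free list capped at `numfree` cached objects."""

    def __init__(self, numfree, factory):
        self.numfree = numfree      # upper bound on cached objects
        self.factory = factory      # stands in for a fresh allocation
        self._free = []             # released objects awaiting reuse
        self.created = 0            # number of fresh allocations
        self.reused = 0             # number of allocations served from the list

    def alloc(self):
        # Reuse a cached object when one is available, else allocate.
        if self._free:
            self.reused += 1
            return self._free.pop()
        self.created += 1
        return self.factory()

    def release(self, obj):
        # Cache the object only while the list is below its cap;
        # beyond that it is simply dropped (freed for real).
        if len(self._free) < self.numfree:
            self._free.append(obj)


if __name__ == "__main__":
    pool = BoundedFreeList(numfree=2, factory=list)
    objs = [pool.alloc() for _ in range(3)]
    for obj in objs:
        pool.release(obj)
    pool.alloc()
    print(pool.created, pool.reused)
```

A larger `numfree` lets more allocations be served from the cache, at the cost of holding on to more unused memory. That is the trade-off the measurements below explore.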
./main/bin/python3 -m pyperf compare_to --min-speed 5 -G --table --table-format md main.json free_list100.json
Benchmark | main | numfree 100 |
---|---|---|
spectral_norm | 112 ms | 96.7 ms: 1.16x faster |
json_loads | 29.8 us | 25.9 us: 1.15x faster |
logging_silent | 112 ns | 102 ns: 1.10x faster |
regex_v8 | 25.4 ms | 23.7 ms: 1.07x faster |
mako | 10.6 ms | 10.1 ms: 1.06x faster |
pyflate | 454 ms | 430 ms: 1.06x faster |
regex_dna | 228 ms | 217 ms: 1.05x faster |
crypto_pyaes | 87.0 ms | 91.6 ms: 1.05x slower |
chaos | 76.6 ms | 82.4 ms: 1.08x slower |
Geometric mean | (ref) | 1.01x faster |
./main/bin/python3 -m pyperf compare_to --min-speed 5 -G --table --table-format md main.json free_list500.json
Benchmark | main | numfree 500 |
---|---|---|
json_loads | 29.8 us | 26.3 us: 1.13x faster |
regex_v8 | 25.4 ms | 23.5 ms: 1.08x faster |
scimark_sparse_mat_mult | 4.63 ms | 4.34 ms: 1.07x faster |
spectral_norm | 112 ms | 106 ms: 1.06x faster |
scimark_fft | 361 ms | 343 ms: 1.05x faster |
regex_dna | 228 ms | 217 ms: 1.05x faster |
scimark_lu | 105 ms | 112 ms: 1.07x slower |
crypto_pyaes | 87.0 ms | 95.3 ms: 1.10x slower |
Geometric mean | (ref) | 1.00x faster |
./main/bin/python3 -m pyperf compare_to --min-speed 5 -G --table --table-format md main.json free_list1000.json
Benchmark | main | numfree 1000 |
---|---|---|
regex_dna | 228 ms | 184 ms: 1.24x faster |
spectral_norm | 112 ms | 96.9 ms: 1.16x faster |
json_loads | 29.8 us | 26.2 us: 1.13x faster |
scimark_sparse_mat_mult | 4.63 ms | 4.17 ms: 1.11x faster |
scimark_sor | 129 ms | 121 ms: 1.07x faster |
regex_effbot | 3.02 ms | 2.82 ms: 1.07x faster |
scimark_fft | 361 ms | 341 ms: 1.06x faster |
pickle_dict | 30.4 us | 28.8 us: 1.06x faster |
logging_silent | 112 ns | 106 ns: 1.06x faster |
telco | 7.23 ms | 6.85 ms: 1.06x faster |
crypto_pyaes | 87.0 ms | 99.0 ms: 1.14x slower |
Geometric mean | (ref) | 1.02x faster |
The current results show that a `numfree` of 1000 gives roughly a 2% speedup (geometric mean) on the pyperformance benchmark suite. In terms of memory, each medium value needs `sizeof(PyLongObject) + 4` bytes. In the worst case, every thread may have to pay about 36 KB of extra memory (if `numfree == 1000`).
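The worst-case figure can be checked with a short calculation. The sketch below models the object header with `ctypes`. The field layout in `PyLongObjectModel` is an assumption written for illustration, not copied from the CPython headers, and the result depends on the platform. The extra 4 bytes come straight from the `sizeof(PyLongObject) + 4` figure above.

```python
import ctypes


class PyLongObjectModel(ctypes.Structure):
    # Assumed layout for illustration only: refcount, type pointer,
    # size field, and a one-element array of 4-byte digits.
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),
        ("ob_type", ctypes.c_void_p),
        ("ob_size", ctypes.c_ssize_t),
        ("ob_digit", ctypes.c_uint32 * 1),
    ]


EXTRA_BYTES = 4  # the "+ 4" quoted in the text


def medium_value_bytes():
    """Bytes per cached object under the assumed layout."""
    return ctypes.sizeof(PyLongObjectModel) + EXTRA_BYTES


def worst_case_overhead_bytes(numfree):
    """Memory held when the free list is completely full."""
    return numfree * medium_value_bytes()


if __name__ == "__main__":
    for numfree in (100, 500, 1000):
        print(numfree, worst_case_overhead_bytes(numfree))
```

On a typical 64-bit build the modelled struct should come to 32 bytes with padding, so each cached object costs 36 bytes and a full list of 1000 entries holds about 36,000 bytes, in line with the estimate above.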