Performance » History » Version 1
Philippe May, 30/03/2019 03:00
| 1 | 1 | Philippe May | h1. Performance |
|---|---|---|---|
| 2 | 1 | Philippe May | |
| 3 | 1 | Philippe May | Gisaf is written basically in an object-oriented (OO) and asynchronous way. |
| 4 | 1 | Philippe May | |
| 5 | 1 | Philippe May | For manipulating potentially large datasets, the performance of SQLAlchemy (actually, asyncpg and Gino) has become a concern. |
| 6 | 1 | Philippe May | |
| 7 | 1 | Philippe May | A few techniques are being put in place to tackle this problem. |
| 8 | 1 | Philippe May | |
| 9 | 1 | Philippe May | |
| 10 | 1 | Philippe May | h2. Use Pandas (Numpy) instead of OO models |
| 11 | 1 | Philippe May | |
| 12 | 1 | Philippe May | This is a work in progress, but it already shows a speedup of about 4 times with a few thousand records. |
| 13 | 1 | Philippe May | |
| 14 | 1 | Philippe May | |
| 15 | 1 | Philippe May | h2. Parallel processing |
| 16 | 1 | Philippe May | |
| 17 | 1 | Philippe May | Vector-based processing (Pandas) serves as the base for future improvements: parallel processing and just-in-time (JIT) compilation of the code. |
| 18 | 1 | Philippe May | |
| 19 | 1 | Philippe May | For future reference, see https://towardsdatascience.com/how-i-learned-to-love-parallelized-applies-with-python-pandas-dask-and-numba-f06b0b367138 |
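
To make the two ideas above more concrete, here is a minimal sketch of the difference between per-object (OO) processing and vectorized processing with Pandas, followed by one possible way of splitting the work into chunks. It is not taken from the Gisaf code base: the @Measure@ model, its @depth@ and @level@ fields and all function names are hypothetical and only serve the illustration. The thread pool is likewise an assumption. Whether threads (rather than processes, Dask or Numba) actually bring a speedup depends on the workload and would need to be measured.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

import numpy as np
import pandas as pd


@dataclass
class Measure:
    """Hypothetical OO model: one Python object per database row."""
    depth: float
    level: float

    def elevation(self) -> float:
        return self.level - self.depth


def elevations_oo(measures):
    """OO style: one method call per record, in a Python loop."""
    return [m.elevation() for m in measures]


def elevations_vectorized(df):
    """Vectorized style: a single column-wise operation on the whole table."""
    return df["level"] - df["depth"]


def elevations_chunked(df, workers=4):
    """Split the rows into chunks and process each chunk in a thread pool.

    Assumes a non-empty DataFrame; empty chunks are skipped.
    """
    bounds = np.linspace(0, len(df), workers + 1, dtype=int)
    chunks = [
        df.iloc[start:stop]
        for start, stop in zip(bounds[:-1], bounds[1:])
        if stop > start
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps the chunk order, so the result stays aligned with df
        parts = list(pool.map(elevations_vectorized, chunks))
    return pd.concat(parts)


# Tiny made-up dataset, only to show that the three variants agree
df = pd.DataFrame({"depth": [1.5, 2.0, 0.5], "level": [10.0, 12.0, 9.0]})
measures = [Measure(depth=row.depth, level=row.level) for row in df.itertuples()]
```

All three functions return the same values for the same input. The vectorized variant avoids creating one Python object per record, which is the kind of overhead the Pandas approach is meant to remove. The chunked variant only shows how the same column-wise function could be reused as the unit of work for parallel processing.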