Philippe May, 30/03/2019 03:00


Performance

Gisaf is written mostly in an object-oriented (OO) and asynchronous style.

When manipulating potentially large datasets, the performance of SQLAlchemy (more precisely, of asyncpg and Gino) has become a concern.

A few techniques are being put in place to tackle this problem.

Use Pandas (NumPy) instead of OO models

This is a work in progress, but it already shows a speed-up of about 4 times with a few thousand records.
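To illustrate the idea, here is a minimal sketch that contrasts one Python object per record with a single vectorized operation on a DataFrame. It is not Gisaf's actual code: the `Point` class, the `records` list and the column names `x`, `y`, `z` are hypothetical and stand in for rows fetched from the database.

```python
import numpy as np
import pandas as pd

# Hypothetical records, standing in for rows fetched from the database.
records = [
    {"id": i, "x": float(i), "y": float(2 * i), "z": 100.0 + i}
    for i in range(5000)
]


class Point:
    """Hypothetical OO model: one Python object per record."""

    def __init__(self, id, x, y, z):
        self.id = id
        self.x = x
        self.y = y
        self.z = z

    def distance_to_origin(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5


# OO approach: instantiate every record, then loop in Python.
points = [Point(**record) for record in records]
oo_distances = [point.distance_to_origin() for point in points]

# Pandas approach: load all records at once, then compute on whole columns.
df = pd.DataFrame.from_records(records, index="id")
df["distance"] = np.sqrt(df["x"] ** 2 + df["y"] ** 2)
```

Both approaches give the same distances. The second one avoids creating thousands of Python objects and runs the arithmetic inside NumPy, which is presumably where most of the gain comes from.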

Parallel processing

Vector-based processing (Pandas) serves as the basis for future improvements: parallel processing and JIT compilation of the code.

For future reference, see https://towardsdatascience.com/how-i-learned-to-love-parallelized-applies-with-python-pandas-dask-and-numba-f06b0b367138
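As a rough sketch of what chunked parallel processing could look like, the example below splits a DataFrame into row chunks and processes them in a pool of workers from the standard library. This is an assumption about a possible design, not something Gisaf implements: `process_chunk`, `parallel_apply` and the columns `x` and `y` are hypothetical names. A thread pool is used here only to keep the example self-contained; a process pool, Dask or Numba (the tools named in the article title above) may suit CPU-bound work better.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import pandas as pd


def process_chunk(chunk: pd.DataFrame) -> pd.Series:
    """Hypothetical vectorized computation applied to one chunk of rows."""
    return np.sqrt(chunk["x"] ** 2 + chunk["y"] ** 2)


def parallel_apply(df: pd.DataFrame, func, n_chunks: int = 4) -> pd.Series:
    """Split df into n_chunks row ranges, run func on each, concatenate."""
    bounds = np.linspace(0, len(df), n_chunks + 1, dtype=int)
    chunks = [
        df.iloc[start:stop] for start, stop in zip(bounds[:-1], bounds[1:])
    ]
    with ThreadPoolExecutor(max_workers=n_chunks) as executor:
        # executor.map returns results in the same order as the chunks.
        results = list(executor.map(func, chunks))
    return pd.concat(results)


df = pd.DataFrame(
    {
        "x": np.arange(10_000, dtype=float),
        "y": np.arange(10_000, dtype=float) * 2,
    }
)
distances = parallel_apply(df, process_chunk)
```

Whether this pays off depends on the size of the chunks and on how much of the work runs outside the Python interpreter, so it should be measured on real data before being adopted.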