Version 2 - History - Performance - Gisaf - Redmine

Philippe May, 30/03/2019 03:05

+h1. Performance
+Gisaf is written in a largely object-oriented (OO) and asynchronous style.
+When manipulating potentially large datasets, the performance of SQLAlchemy (actually, asyncpg and Gino) has become a concern.
+A few techniques are being put in place to tackle this problem.
+h2. Use Pandas (NumPy) instead of OO models
+This is a work in progress, but it already shows a speed-up of about 4 times with a few thousand records.
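To make the difference concrete, here is a minimal sketch (not Gisaf's actual code) that performs the same computation once by looping over model-like objects and once on whole Pandas columns. The @Point@ class, its fields and the distance computation are hypothetical and only stand in for a real model; the data is made up.

```python
from dataclasses import dataclass

import numpy as np
import pandas as pd


@dataclass
class Point:
    """Hypothetical stand-in for an OO model instance (one object per record)."""
    x: float
    y: float


def distances_oo(points, x0, y0):
    """Per-object loop: one Python-level computation per record."""
    return [((p.x - x0) ** 2 + (p.y - y0) ** 2) ** 0.5 for p in points]


def distances_vectorized(df, x0, y0):
    """Same computation on whole columns; the loop runs inside NumPy."""
    return np.hypot(df["x"] - x0, df["y"] - y0)


# Small made-up dataset, used for both variants
points = [Point(float(i), float(2 * i)) for i in range(5000)]
df = pd.DataFrame({"x": [p.x for p in points], "y": [p.y for p in points]})

oo_result = distances_oo(points, 0.0, 0.0)
vec_result = distances_vectorized(df, 0.0, 0.0)
```

Both variants return the same values; the vectorized one avoids creating and visiting one Python object per record, which is where the gain is expected to come from.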
+h2. Parallel processing
+Vector-based processing (Pandas) serves as the base for future improvements: parallel processing and shameless JIT compilation of the code.
+For future reference, see https://towardsdatascience.com/how-i-learned-to-love-parallelized-applies-with-python-pandas-dask-and-numba-f06b0b367138
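As a rough sketch of what chunked parallel processing on top of Pandas could look like, the example below splits a DataFrame into row chunks, processes them concurrently and reassembles the result. It uses a thread pool from the standard library purely for simplicity; whether threads bring any speed-up depends on the workload, and tools such as Dask or Numba (as in the linked article) are alternatives. The function names, the chunk count and the computation are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import pandas as pd


def process_chunk(chunk):
    """Vectorized work on one chunk (hypothetical computation)."""
    return np.sqrt(chunk["x"] ** 2 + chunk["y"] ** 2)


def parallel_apply(df, func, n_chunks=4):
    """Split the frame by row position, process chunks concurrently, reassemble."""
    bounds = np.linspace(0, len(df), n_chunks + 1, dtype=int)
    chunks = [df.iloc[start:stop] for start, stop in zip(bounds[:-1], bounds[1:])]
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        # pool.map yields results in the order of the input chunks
        results = list(pool.map(func, chunks))
    return pd.concat(results)


# Made-up data
df = pd.DataFrame({"x": np.arange(10000.0), "y": np.arange(10000.0)})
parallel_result = parallel_apply(df, process_chunk)
serial_result = process_chunk(df)
```

The chunked result matches the serial one row for row, so the parallel path can be swapped in or out without changing the calling code.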
+h2. Geographical clustering
+Gisaf uploads complete layers; restricting them to bounding boxes based on the desired visualization, if possible with Mapbox, would bring substantial speed-ups. Generating vector tiles on the fly, rather than GeoJSON, seems to be the most promising track.
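The bounding box idea can be sketched as follows: only features whose extent intersects the current view are kept. This is an illustration in plain Python on a made-up GeoJSON layer, not Gisaf's implementation; the function names are hypothetical and only Point and LineString geometries are handled.

```python
def bbox_intersects(a, b):
    """True if two (min_x, min_y, max_x, max_y) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def feature_bbox(feature):
    """Bounding box of a GeoJSON Point or LineString feature."""
    geom = feature["geometry"]
    coords = [geom["coordinates"]] if geom["type"] == "Point" else geom["coordinates"]
    xs = [c[0] for c in coords]
    ys = [c[1] for c in coords]
    return (min(xs), min(ys), max(xs), max(ys))


def clip_layer(collection, view_bbox):
    """Keep only the features whose bounding box intersects the view."""
    kept = [f for f in collection["features"]
            if bbox_intersects(feature_bbox(f), view_bbox)]
    return {"type": "FeatureCollection", "features": kept}


def make_feature(name, geom_type, coordinates):
    """Helper building a minimal GeoJSON feature (made-up data)."""
    return {"type": "Feature", "properties": {"name": name},
            "geometry": {"type": geom_type, "coordinates": coordinates}}


layer = {"type": "FeatureCollection", "features": [
    make_feature("inside", "Point", [79.5, 11.5]),
    make_feature("outside", "Point", [10.0, 50.0]),
    make_feature("crossing", "LineString", [[78.0, 11.2], [79.2, 11.8]]),
]}

# View extent as (min_lon, min_lat, max_lon, max_lat)
visible = clip_layer(layer, (79.0, 11.0, 80.0, 12.0))
visible_names = [f["properties"]["name"] for f in visible["features"]]
```

Here only the "inside" and "crossing" features remain. The test is on bounding boxes, so a feature can be kept even if its actual geometry misses the view; that is acceptable for reducing the payload, but not for exact spatial queries.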