Data Analysis » History » Version 1
Giulio Di Anastasio, 03/05/2021 10:56
| 1 | 1 | Giulio Di Anastasio | h1. %{color:BLUE} Data analysis% |
|---|---|---|---|
| 2 | 1 | Giulio Di Anastasio | |
| 3 | 1 | Giulio Di Anastasio | We use "Jupyter":https://jupyter.org , "Pandas":https://pandas.pydata.org/ and "GeoPandas":http://geopandas.org/ for data analysis. The notebooks are accessible at http://gis.auroville.org.in/notebooks . |
| 4 | 1 | Giulio Di Anastasio | |
| 5 | 1 | Giulio Di Anastasio | To integrate notebooks into automated processes (that is, to execute them without user interaction), we use "papermill":https://github.com/nteract/papermill . Systemd "timers":https://wiki.archlinux.org/index.php/Systemd/Timers schedule the notebooks automatically on the server, i.e. for the dashboards. |
| 6 | 1 | Giulio Di Anastasio | |
| 7 | 1 | Giulio Di Anastasio | There's a dedicated virtual machine for Jupyter, accessible from our local network at @jupyter.csr.av@. |
| 8 | 1 | Giulio Di Anastasio | |
| 9 | 1 | Giulio Di Anastasio | h2. %{color:BLUE} Organization of notebooks% |
| 10 | 1 | Giulio Di Anastasio | |
| 11 | 1 | Giulio Di Anastasio | The setup is organized in two parts, which run as two separate instances of Jupyter for security reasons. |
| 12 | 1 | Giulio Di Anastasio | |
| 13 | 1 | Giulio Di Anastasio | h3. %{color:BLUE} Admin% |
| 14 | 1 | Giulio Di Anastasio | |
| 15 | 1 | Giulio Di Anastasio | The notebooks in the admin instance are mostly for maintenance, such as operations on the database. |
| 16 | 1 | Giulio Di Anastasio | |
| 17 | 1 | Giulio Di Anastasio | h3. %{color:BLUE} Users% |
| 18 | 1 | Giulio Di Anastasio | |
| 19 | 1 | Giulio Di Anastasio | The notebooks are organized in folders. All of these folders except "Sandbox" are kept in Gisaf's source code git repository. |
| 20 | 1 | Giulio Di Anastasio | |
| 21 | 1 | Giulio Di Anastasio | This notebook server connects to the database as a dedicated user (@jupyter@), which is set up on the database server with permission to read all data (@readonly@) and with write access to a few tables dedicated to storing analysis results. |
| 22 | 1 | Giulio Di Anastasio | |
| 23 | 1 | Giulio Di Anastasio | h2. %{color:BLUE} Integration with Gisaf% |
| 24 | 1 | Giulio Di Anastasio | |
| 25 | 1 | Giulio Di Anastasio | The notebook in @Templates@ demonstrates how to use notebooks with Gisaf: mainly, how to use the @gisaf.ipynb_tools@ module to access Gisaf models and the data from the database. |
| 26 | 1 | Giulio Di Anastasio | |
| 27 | 1 | Giulio Di Anastasio | This module is part of Gisaf: https://redmine.auroville.org.in/projects/gisaf/repository/revisions/master/entry/gisaf/ipynb_tools.py |
| 28 | 1 | Giulio Di Anastasio | |
| 29 | 1 | Giulio Di Anastasio | h2. %{color:BLUE} References% |
| 30 | 1 | Giulio Di Anastasio | |
| 31 | 1 | Giulio Di Anastasio | h3. %{color:BLUE} Geopandas% |
| 32 | 1 | Giulio Di Anastasio | |
| 33 | 1 | Giulio Di Anastasio | Some nice examples of processing with GeoPandas, using watershed and rainfall data: https://geohackweek.github.io/vector/06-geopandas-advanced/ |
| 34 | 1 | Giulio Di Anastasio | |
| 35 | 1 | Giulio Di Anastasio | h3. %{color:BLUE} Integration% |
| 36 | 1 | Giulio Di Anastasio | |
| 37 | 1 | Giulio Di Anastasio | A good example of how a company (Netflix) has integrated the same tools to schedule notebooks: https://medium.com/netflix-techblog/scheduling-notebooks-348e6c14cfd6 |
| 38 | 1 | Giulio Di Anastasio | |
| 39 | 1 | Giulio Di Anastasio | h2. %{color:BLUE} Other docs% |
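
As a complement to the papermill and systemd timers setup described at the top of this page, the following is a minimal sketch of how a scheduled job might execute a notebook and keep a dated copy of the result. It is not taken from the Gisaf repository: the function names @dated_output_path@ and @run_notebook@, the folder layout and the parameters are assumptions made for illustration. Only @papermill.execute_notebook@ is an actual papermill call.

```python
from datetime import date
from pathlib import Path


def dated_output_path(notebook: Path, out_dir: Path, day: date) -> Path:
    """Build the path of the executed copy, e.g. out_dir/rain-2021-05-03.ipynb.

    Hypothetical naming scheme: <notebook stem>-<ISO date><suffix>.
    """
    return out_dir / f"{notebook.stem}-{day.isoformat()}{notebook.suffix}"


def run_notebook(notebook: Path, out_dir: Path, day: date, parameters: dict) -> Path:
    """Execute `notebook` with papermill and store the executed copy in `out_dir`.

    `parameters` are injected by papermill into the notebook cell tagged
    "parameters"; which parameters a notebook accepts is up to that notebook.
    """
    # Imported here so that the path helper above is usable without papermill.
    import papermill as pm

    output = dated_output_path(notebook, out_dir, day)
    out_dir.mkdir(parents=True, exist_ok=True)
    pm.execute_notebook(str(notebook), str(output), parameters=parameters)
    return output


if __name__ == "__main__":
    # Hypothetical paths: only the naming helper is exercised here.
    print(dated_output_path(Path("dashboards/rain.ipynb"), Path("executed"), date.today()))
```

A systemd timer would then only need to start a service unit that runs a script calling @run_notebook@. Writing the output to a dated file keeps each run's results separate from the source notebook.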