Software

From imk-tro wiki

Essential Software

The following essential software packages are listed together with links to their web pages. Many of them can be installed using Anaconda [1].

CDO

CDO (Climate Data Operators) is available at https://code.mpimet.mpg.de/projects/cdo/

Note that CDO has many built-in functions that are not well documented. Details about these can usually be found in the CDO discussion forums: https://code.mpimet.mpg.de/projects/cdo/boards
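Before searching the forums, it can help to ask CDO itself what it offers. The following is a minimal sketch, assuming that the cdo executable is on the PATH; the helper names cdo_command and run_cdo are hypothetical and not part of CDO. It relies on the command-line options --operators (list all operators) and -h <operator> (help for one operator).

```python
import shutil
import subprocess


def cdo_command(*args):
    """Build the argument list for a CDO call, e.g. cdo_command("-h", "timmean")."""
    return ["cdo", *args]


def run_cdo(*args):
    """Run CDO and return its text output, or None if CDO is not installed."""
    if shutil.which("cdo") is None:
        return None
    result = subprocess.run(cdo_command(*args), capture_output=True, text=True)
    # Combine both streams in case help text is written to stderr.
    return result.stdout + result.stderr


# Show the built-in help for a single operator (timmean: mean over time).
help_text = run_cdo("-h", "timmean")
print(help_text if help_text is not None else "CDO not found on PATH")
```

Calling run_cdo("--operators") in the same way returns the list of available operators, which can then be searched for a keyword.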

Python

Instructions on how to download and install Python on all operating systems can be found at https://www.python.org/.

It is usually recommended to install Python through the Anaconda distribution. Details on how to do this are at https://www.anaconda.com/.

Python can be combined with a good Integrated Development Environment (IDE) of your choice. All existing IDEs have their pros and cons. Some of the most popular IDEs are the following:

Python boasts a large number of packages. Some of the most used packages for manipulating large files in NetCDF, HDF5 or CSV formats are the following:

Data analysis:

  • NumPy [9]
  • xarray [10], for easy handling of large NetCDF (nc) files
  • pandas [11]
  • SciPy [12]
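As a small illustration of how these packages work together, the sketch below reads a CSV table with pandas and processes it with NumPy. The data values and the column names date and temp_c are made up for this example; a real workflow would read from a file instead of an in-memory string.

```python
import io

import numpy as np
import pandas as pd

# Made-up daily temperatures standing in for a real CSV file
CSV_TEXT = """date,temp_c
2020-01-01,1.0
2020-01-02,3.0
2020-02-01,5.0
2020-02-02,7.0
"""

df = pd.read_csv(io.StringIO(CSV_TEXT), parse_dates=["date"])

# pandas: group the rows by calendar month and average each group
monthly_mean = df.groupby(df["date"].dt.month)["temp_c"].mean()

# NumPy: the same column as a plain array, for vectorised arithmetic
temps = df["temp_c"].to_numpy()
anomalies = temps - np.mean(temps)

print(monthly_mean)
print(anomalies)
```

For NetCDF files, xarray offers a similar labelled interface on top of NumPy arrays.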

Parallel computing, Machine learning, Deep learning etc.:

Data visualization:

Most of these packages are distributed through either Conda [24] or Pip [25].
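To see which of the data analysis packages are already present in the active environment, a short check such as the following can be used. This is only a sketch: the helper installed_versions is hypothetical, and the names in PACKAGES are the distribution names assumed for the packages listed above.

```python
from importlib import metadata

# Distribution names assumed for the data analysis packages listed above
PACKAGES = ["numpy", "xarray", "pandas", "scipy"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f"{name:10s} {version or 'not installed'}")
```

A missing package can then be added with, for example, conda install xarray or pip install xarray.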

On top of these, various users also create packages tuned for specific purposes. They are usually made public through GitHub [26].

Various forums exist for the sole purpose of answering specific coding questions. One such forum is Stack Overflow [27]. Medium [28] is also a good source for reading about new ideas and tools in Python and other languages.

Visualisation

The visualisation of "nc" files can be made easier using:

  • NcView [29]
  • Panoply [30]
  • ParaView for 3D rendering [31]. Note that this one requires more computational power and should ideally be run on a supercomputer via a VNC client.
  • Cartopy (Python package) for maps [32]
  • Matplotlib (standard Python package) [33]


Colourblind Friendly Plotting Tools

Here's a list of a few websites that have information and also palette generation for colourblind friendly plots:

HPC (High-performance computing) commands

Useful commands for Slurm [34], the workload manager mainly used on large clusters, are:

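  • sbatch, submits a batch script. The --array option is useful for submitting a job array; the index of each task (available in the script as $SLURM_ARRAY_TASK_ID) can be used as an argument. This can be used, for example, to apply an analysis to many files or to a large dataset (e.g. sbatch --array=0-99 script.sh).
  • squeue -l, lists queued and running jobs in long format
  • salloc, requests an allocation of resources, e.g. for interactive work
  • sinfo, shows the state of partitions and nodes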
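A job array submitted with sbatch --array=0-99 script.sh runs the script once per index, and Slurm exposes that index in the environment variable SLURM_ARRAY_TASK_ID. The sketch below shows one way a Python program called from script.sh could use the index to select its input file. The function pick_input and the file names data_000.nc to data_099.nc are hypothetical.

```python
import os


def pick_input(files, env=os.environ):
    """Return the file assigned to this array task.

    Slurm sets SLURM_ARRAY_TASK_ID for every task of a job array.
    """
    # Fall back to index 0 so the script can also be tested outside Slurm.
    task_id = int(env.get("SLURM_ARRAY_TASK_ID", "0"))
    if not 0 <= task_id < len(files):
        raise IndexError(f"task id {task_id} has no matching file")
    return files[task_id]


# Hypothetical inputs: data_000.nc ... data_099.nc, one per array index 0-99
files = [f"data_{i:03d}.nc" for i in range(100)]
print(pick_input(files))
```

Each of the 100 tasks then processes exactly one file, so the analysis is spread over the cluster without any further bookkeeping.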