==================
1. Python language
==================

1.1 Introduction
================

Python is a multi-platform, free and open-source programming language, first released in 1991. The current version, Python 3, is one of the most widely used programming languages to date.

Python is an *interpreted* language (as opposed to *compiled* languages such as C or Fortran), which means one does not need to compile the code before executing it. This allows you to run Python code *interactively* (you can modify it and immediately run it) through the provided Python **interpreter**, a command-line tool which runs the code you type in on the fly.

Python is defined as a high-level, general-purpose language and supports object-oriented programming.

An overview of the Python scientific ecosystem is given in `section 1.1 `_ of the SciPy Lecture Notes.

1.2 Installation
================

Python comes in various flavors and can be installed in different ways. However, the easiest way to have Python on your system is to install a scientific distribution such as the `Anaconda distribution `_, which provides a manager for a full set of libraries and software to perform data analysis with Python.

Alternatively, if you want more control over the installed packages and a better optimization, you can install the :doc:`Miniforge distribution `.

1.3 Usage
=========

After installing the Python 3 version of `Anaconda `_, you can run the Anaconda Navigator, which provides you with the tools to use and set up your Python installation.

Among these tools, `JupyterLab `_ is the core application you will use to actually perform your data analysis. It consists of a browser-based user interface for managing `Jupyter Notebooks `_. A Jupyter Notebook is an interactive programming interface that runs in the browser, with a Python interpreter running under the hood.

After opening JupyterLab, in the Launcher main window you can open a new Notebook (or do *File > New > Notebook*) and start typing some Python code right away.
In the empty cell that appears, you can try typing:

.. code-block:: python

    print("Hello World!")

or simply

.. code-block:: python

    7 * 6

and press the Run button ► (the small triangle just above the cell) or just hit *Shift+Enter*. Congratulations, you have just run your first Python code! See the next section for more information on Python programming.

To get an idea of what one can actually do in a Jupyter Notebook, have a look at the `Bokeh `_ and `Holoviews `_ galleries, which provide some interactive examples.

You can go through the `JupyterLab User Guide `_ and the `Jupyter Notebook documentation `_ to get familiar with the user interface.

1.4 Actual programming
======================

Printing ``Hello World!`` has been pretty exciting, but with Python you can do quite a lot more! Actually, one does not need to be a programmer at all in order to use Python for data analysis; still, knowing the basics of the language will help you immensely in doing it properly.

A very good introduction to the language itself is given in `section 1.2 `_ of the SciPy Lecture Notes, which is dedicated to Python; initially one should at least be familiar with sections `1.2.1 `_, `1.2.2 `_ and `1.2.3 `_. It would still be useful to go through the rest, especially if you are interested in writing some code or automating some operations. In the latter case, the `Spyder `_ development environment, included in Anaconda, could come in handy.

Depending on your inclination, these same basic aspects are covered in more detail in chapters 1 to 5 of the excellent official `Python tutorial `_, while if you are planning to develop code, consider going through chapters 6 to 10.

1.5 Modules and packages
========================

The following reports a few key concepts which are strictly related to programming, but which need to be clear in order to use Python for data analysis.
When launching a new Notebook (*i.e.* a new Python interpreter) you are provided by default with just a very basic set of functions, such as ``print()`` seen above or ``abs()``:

.. code-block:: python

    abs(-7)

which returns the absolute value of a number. To do anything more, you have to load (or more precisely to **import**) additional *packages* into the active session. Many of these packages are already present by default, others can be installed.

For example, supposing you want to calculate the cosine of π, you would do:

.. code-block:: python

    import math
    math.cos(math.pi)

This piece of code already illustrates two important aspects of Python:

* **Importing.** Here, ``math`` is a *module*, which is included by default in the Python distribution, but needs to be 'activated' with the ``import math`` statement.
* **Object-oriented programming.** The ``math`` module contains several objects, like functions (loosely called *methods* here, such as ``cos()``) and variables (called *attributes*, such as ``pi``). These objects are accessed through the dot ``.`` notation. So ``math.cos()`` or ``math.sin()`` give the cosine and sine functions, respectively, while ``math.pi`` returns the π constant.

To better illustrate this, let's try a variant of the import:

.. code-block:: python

    from math import cos, pi

This line of code is pretty self-explanatory. In this way, ``cos()`` and ``pi`` have been made directly available and one can just write:

.. code-block:: python

    cos(pi)

with the same result as before. You can inspect the type of the ``cos`` and ``pi`` objects with the ``type()`` function. For example:

.. code-block:: python

    type(pi)

will return ``float``, indicating that ``pi`` is a floating point number.

To summarize, here ``math`` is a *module*, which contains several *methods* (*i.e.* functions, such as ``cos()``) and *attributes* (*i.e.* variables, such as ``pi``).
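
As a quick check of the summary above, the short sketch below inspects the objects involved. It uses only the standard ``math`` module; the printed type names are the ones reported by the standard CPython interpreter and are given here only as an illustration.

.. code-block:: python

    import math
    from math import cos, pi

    # The module itself is an object, of type "module"
    print(type(math))   # <class 'module'>

    # pi is a plain floating point number
    print(type(pi))     # <class 'float'>

    # cos imported directly is the very same object as math.cos
    print(cos is math.cos)   # True

    # Both spellings therefore give the same result
    print(math.cos(math.pi) == cos(pi))   # True

The two import styles only change the name used to reach an object, not the object itself.
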
Similarly, other types of objects in Python can have their own methods and attributes. As an example, the object ``mydata``, which we assume has been properly constructed, can possess, let's say, the ``mydata.temperature`` attribute (which would probably be a floating point number representing the temperature at which the data has been acquired) or the ``mydata.normalize()`` method (which, for example, could rescale the ``mydata`` values so that the integral under the curve is equal to one).

A collection of modules is called a *package*. So to give another example, let's take the ``convolve`` *function* contained in the ``signal`` *module* of the ``scipy`` *package*. To access this function any of the following will work (the two short lists are only placeholder inputs):

.. code-block:: python

    # importing the submodule explicitly works with any SciPy version
    import scipy.signal
    scipy.signal.convolve([1, 2, 3], [0, 1, 0.5])

.. code-block:: python

    from scipy import signal
    signal.convolve([1, 2, 3], [0, 1, 0.5])

.. code-block:: python

    from scipy.signal import convolve
    convolve([1, 2, 3], [0, 1, 0.5])

.. code-block:: python

    from scipy.signal import convolve as conv
    conv([1, 2, 3], [0, 1, 0.5])

In the last example, ``convolve`` has been imported with the *shorthand* ``conv``. This is a useful and extensively used practice, especially when you need to use the same object several times.

The same concept of importing applies similarly to Python *scripts*: simple text files, which you may have written yourself, typically with the '.py' extension, and containing custom definitions of functions or other objects you want to reuse.

To gain more insight into scripts and modules, check section `1.2.5 `_ of the SciPy lectures and `chapter 6 `_ of the Python tutorial.
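
To make the ``mydata`` object described earlier in this section more concrete, here is a minimal sketch of how such an object could be built. The ``MyData`` class, its ``values`` and ``temperature`` attributes and its ``normalize()`` method are all hypothetical names chosen for illustration. For simplicity, the integral under the curve is approximated by the plain sum of the values, which assumes equally spaced points with unit spacing.

.. code-block:: python

    class MyData:
        """Hypothetical container for a measured curve (illustration only)."""

        def __init__(self, values, temperature):
            self.values = list(values)      # attribute: the measured data points
            self.temperature = temperature  # attribute: acquisition temperature

        def normalize(self):
            """Rescale the values in place so that their sum equals one.

            Assumption: unit spacing between points, so the sum stands in
            for the integral under the curve.
            """
            total = sum(self.values)
            self.values = [v / total for v in self.values]


    mydata = MyData([1.0, 2.0, 1.0], temperature=300.0)
    print(mydata.temperature)   # attribute, reached with the dot notation
    mydata.normalize()          # method, called with the dot notation
    print(mydata.values)        # [0.25, 0.5, 0.25]

Note that the attribute is read without parentheses, while the method is called with them, exactly as with ``math.pi`` and ``math.cos()``.
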