$\gamma$-ray spectra visualization and fit¶

In this example we want fit a power law function on one high energy $\gamma$-ray spectrum (Aharonian et al 2006) acquired by HESS. We will see, how to:

visualize the spectrum
overlay on spectrum the model already computed
fit a new model
compute the integrated energy of the 2 models

Key: error plot, scatter plot, curve fit, integration

Explore the file¶

In [1]:

from astropy.io import fits

tev = fits.open('data/TeVJ1826-148.fits')
tev.info()

Filename: data/TeVJ1826-148.fits
No.    Name      Ver    Type      Cards   Dimensions   Format
  0  PRIMARY       1 PrimaryHDU       9   ()      
  1  HESS_INFC     1 BinTableHDU     20   26R x 4C   [1E, 1E, 1E, 1E]   
  2  MODEL_HESS_INFC    1 BinTableHDU     20   300R x 4C   [1E, 1E, 1E, 1E]   
  3  MODEL_HESS_INFC_MAN    1 BinTableHDU     21   300R x 4C   [1E, 1E, 1E, 1E]

In [ ]:

Open the proper extension (HESS_INFC) and retrieve just data

In [2]:

infc=tev['HESS_INFC'].data

In [3]:

infc

Out[3]:

FITS_rec([( 0.195958  , 6.0770583e-11, 1.2443476e-11, 1.4777032e-11),
          ( 0.241226  , 4.1031869e-11, 4.0196353e-12, 7.6927215e-12),
          ( 0.301863  , 2.6202924e-11, 3.3654117e-12, 3.1804041e-12),
          ( 0.369569  , 1.7682928e-11, 1.7322793e-12, 2.3747540e-12),
          ( 0.45494398, 1.0170147e-11, 1.3062176e-12, 1.2344121e-12),
          ( 0.553947  , 8.5277436e-12, 9.2303097e-13, 1.1452271e-12),
          ( 0.681913  , 7.0764371e-12, 5.4529211e-13, 5.9083592e-13),
          ( 0.83030903, 3.2332483e-12, 4.4736605e-13, 4.7647048e-13),
          ( 1.02772   , 1.9477987e-12, 2.3061205e-13, 2.8703380e-13),
          ( 1.24455   , 1.8305073e-12, 1.9813155e-13, 2.6975433e-13),
          ( 1.52369   , 1.1143001e-12, 8.5867430e-14, 1.9383865e-13),
          ( 1.89631   , 5.1605725e-13, 8.6426261e-14, 9.6751478e-14),
          ( 2.30898   , 5.1384363e-13, 7.6140808e-14, 9.6336303e-14),
          ( 2.82687   , 3.8885307e-13, 7.2455070e-14, 5.2221624e-14),
          ( 3.49899   , 1.9280053e-13, 2.8568985e-14, 3.8784521e-14),
          ( 4.26044   , 1.5092543e-13, 3.2270847e-14, 2.6254316e-14),
          ( 5.21602   , 7.4754798e-14, 1.9255756e-14, 1.6072552e-14),
          ( 6.35114   , 5.4631165e-14, 1.1681232e-14, 1.3284150e-14),
          ( 7.73325   , 2.1507434e-14, 7.4304602e-15, 8.1334958e-15),
          ( 9.416121  , 1.5717886e-14, 5.4302965e-15, 6.7014456e-15),
          (11.5913    , 7.5261020e-15, 3.2328266e-15, 4.5117955e-15),
          (14.664599  , 0.0000000e+00, 0.0000000e+00, 3.1852115e-15),
          (17.9538    , 1.7490052e-15, 1.1180409e-15, 1.5729794e-15),
          (21.7416    , 0.0000000e+00, 0.0000000e+00, 1.4490910e-15),
          (26.185     , 1.4579962e-15, 8.4051572e-16, 1.4745043e-15),
          (31.1933    , 0.0000000e+00, 0.0000000e+00, 1.6060789e-16)],
         dtype=(numpy.record, [('ENERGY', '>f4'), ('FLUX', '>f4'), ('FLUX_ERROR_MIN', '>f4'), ('FLUX_ERROR_MAX', '>f4')]))

prepare the plot¶

In [4]:

#%matplotlib qt
%matplotlib notebook

In [5]:

from matplotlib import pyplot as plt

fig = plt.figure("HESS Spectrum")
ax = fig.subplots()

In [ ]:

Error bars¶

Display points and errors

In [6]:

ax.errorbar(infc['ENERGY'], infc['FLUX'], yerr=[infc['FLUX_ERROR_MIN'], infc['FLUX_ERROR_MAX']], 
            label='HESS INFC', color='red', fmt='o')

Out[6]:

<ErrorbarContainer object of 3 artists>

In [7]:

 infc

Out[7]:

FITS_rec([( 0.195958  , 6.0770583e-11, 1.2443476e-11, 1.4777032e-11),
          ( 0.241226  , 4.1031869e-11, 4.0196353e-12, 7.6927215e-12),
          ( 0.301863  , 2.6202924e-11, 3.3654117e-12, 3.1804041e-12),
          ( 0.369569  , 1.7682928e-11, 1.7322793e-12, 2.3747540e-12),
          ( 0.45494398, 1.0170147e-11, 1.3062176e-12, 1.2344121e-12),
          ( 0.553947  , 8.5277436e-12, 9.2303097e-13, 1.1452271e-12),
          ( 0.681913  , 7.0764371e-12, 5.4529211e-13, 5.9083592e-13),
          ( 0.83030903, 3.2332483e-12, 4.4736605e-13, 4.7647048e-13),
          ( 1.02772   , 1.9477987e-12, 2.3061205e-13, 2.8703380e-13),
          ( 1.24455   , 1.8305073e-12, 1.9813155e-13, 2.6975433e-13),
          ( 1.52369   , 1.1143001e-12, 8.5867430e-14, 1.9383865e-13),
          ( 1.89631   , 5.1605725e-13, 8.6426261e-14, 9.6751478e-14),
          ( 2.30898   , 5.1384363e-13, 7.6140808e-14, 9.6336303e-14),
          ( 2.82687   , 3.8885307e-13, 7.2455070e-14, 5.2221624e-14),
          ( 3.49899   , 1.9280053e-13, 2.8568985e-14, 3.8784521e-14),
          ( 4.26044   , 1.5092543e-13, 3.2270847e-14, 2.6254316e-14),
          ( 5.21602   , 7.4754798e-14, 1.9255756e-14, 1.6072552e-14),
          ( 6.35114   , 5.4631165e-14, 1.1681232e-14, 1.3284150e-14),
          ( 7.73325   , 2.1507434e-14, 7.4304602e-15, 8.1334958e-15),
          ( 9.416121  , 1.5717886e-14, 5.4302965e-15, 6.7014456e-15),
          (11.5913    , 7.5261020e-15, 3.2328266e-15, 4.5117955e-15),
          (14.664599  , 0.0000000e+00, 0.0000000e+00, 3.1852115e-15),
          (17.9538    , 1.7490052e-15, 1.1180409e-15, 1.5729794e-15),
          (21.7416    , 0.0000000e+00, 0.0000000e+00, 1.4490910e-15),
          (26.185     , 1.4579962e-15, 8.4051572e-16, 1.4745043e-15),
          (31.1933    , 0.0000000e+00, 0.0000000e+00, 1.6060789e-16)],
         dtype=(numpy.record, [('ENERGY', '>f4'), ('FLUX', '>f4'), ('FLUX_ERROR_MIN', '>f4'), ('FLUX_ERROR_MAX', '>f4')]))

Plot setup¶

Setup the plot: scales, axes labels and legend

In [8]:

ax.set_xscale("log")
ax.set_yscale("log")
ax.set_xlabel("E(TeV)")
ax.set_ylabel(r"Flux ($\frac{ph}{cm^2sTev}$)") #raw string, no characters escaping

Out[8]:

Text(0, 0.5, 'Flux ($\\frac{ph}{cm^2sTev}$)')

In [ ]:

In the same way we can retrieve data model, and plot it as a solid line without error

In [9]:

#Load the model
model=tev['MODEL_HESS_INFC'].data
ax.plot(model['ENERGY'], model['FLUX'], label='model', color='gray')
ax.legend()

Out[9]:

<matplotlib.legend.Legend at 0x7f1193ab3240>

In [10]:

model=tev['MODEL_HESS_INFC'].data
model.dtype

Out[10]:

dtype((numpy.record, [('ENERGY', '>f4'), ('FLUX', '>f4'), ('FLUX_ERROR_MIN', '>f4'), ('FLUX_ERROR_MAX', '>f4')]))

Fit a function¶

There are several ways to compute a fit in astropy, NumPy or SciPy.

NumPy and astropy.modeling modules offer a lot of prepared models for fitting

SciPy provides us a general purpose function for fit: curve_fit

In this example we will use SciPy and curve_fit.

In [11]:

from scipy.optimize import curve_fit

In [ ]:

optimize.curve_fit¶

curve_fit requires the definition of our function.

Signature of this function MUST be ydata = f(xdata, *params)

In our case we decide to fit a power law like that

$$N x^{-\alpha} e^{-\frac{x}{x_{off}}}$$

The parameters of the function are: $N$, $\alpha$ and $x_{off}$

In [12]:

import numpy as np

def powerlaw(x, N, alpha, x_off):
    return N * np.power(x, -alpha) * np.exp(-x/x_off)

In [ ]:

use the function with scalar

In [ ]:

powerlaw(1, 3.0, 2.0, 0.5)

In [13]:

powerlaw(1, 3.0, 2.0, 0.5)

Out[13]:

0.4060058497098381

and with numpy.array objects

In [ ]:

_x=np.array([0.5, 1.0, 2.0])
powerlaw(_x, 3.0, 2.0, 0.5)

In [14]:

_x=np.array([0.5, 1.0, 2.0])
powerlaw(_x, 3.0, 2.0, 0.5)

Out[14]:

array([4.41455329, 0.40600585, 0.01373673])

Compute the fit¶

In [15]:

xdata = infc['ENERGY']
ydata = infc['FLUX']
yerr = infc['FLUX_ERROR_MAX']
popt, pcov=curve_fit(powerlaw, xdata, ydata, sigma=yerr, p0=(1.0, 1, 6.0))
popt, pcov

Out[15]:

(array([2.97021086e-12, 1.82433667e+00, 7.68035476e+00]),
 array([[ 3.94792250e-26, -1.02996866e-14, -3.28658276e-13],
        [-1.02996866e-14,  5.28959877e-03,  1.20696691e-01],
        [-3.28658276e-13,  1.20696691e-01,  3.97762247e+00]]))

In [16]:

curve_fit?

Compute the $\chi^{2}$¶

$$ \chi^2 = \sum_j^n \left( \frac{f(x_j) - y_j}{y_{err}} \right)^2 $$$$ \chi^2_{r} = \frac{\chi^2}{n-n_{p}} $$

In [17]:

import numpy as np
chisq = np.sum(((powerlaw(xdata, *popt) - ydata)/yerr)**2)
reduced_chisq = (chisq)/(len(xdata)-len(popt))

In [18]:

powerlaw(xdata, *popt)

Out[18]:

array([5.6629090e-11, 3.8531258e-11, 2.5393347e-11, 1.7400369e-11,
       1.1777737e-11, 8.1182812e-12, 5.4646461e-12, 3.7425735e-12,
       2.4717812e-12, 1.6946432e-12, 1.1297008e-12, 7.2203468e-13,
       4.7777412e-13, 3.0874850e-13, 1.9168949e-13, 1.2121016e-13,
       7.3990015e-14, 4.4563884e-14, 2.5991438e-14, 1.4577113e-14,
       7.5163781e-15, 3.2801306e-15, 1.4776295e-15, 6.3637580e-16,
       2.5416812e-16, 9.6217808e-17], dtype=float32)

Plot the final model¶

In [19]:

# Compute the fitted function on the grid of x
x=np.linspace(xdata[0], xdata[-1], 1000)
yfit=powerlaw(x, *popt)

#Setup a new figure
fig=plt.figure()
ax = fig.subplots()

#Plot the fitted function
ax.plot(x, yfit, color='blue', label='powerlaw fit')

#Plot data
ax.errorbar(xdata, ydata, yerr=[infc['FLUX_ERROR_MIN'], infc['FLUX_ERROR_MAX']], label='HESS', color='red', fmt='o')

#Plot original model
ax.plot(model['ENERGY'], model['FLUX'], label='model', color='gray', ls='--')
ax.legend()
ax.set_xscale('log')
ax.set_yscale('log')

Fix some parameters¶

curve_fit assumes $f$ must have this signature ydata = f(xdata, *params) there are not additional options for fixed parameters.

curve_fit provides also the bounds parameter to set a lower and upper limits in parameters space exploration. In any case each lower bound must be strictly less than each upper bound.

In [ ]:

# This call raises an error!
popt, pcov=curve_fit(powerlaw, xdata, ydata, sigma=yerr, p0=(2.97, 1, 6.0), bounds=((2.97, 0, 0),
                                                                                    (2.97, 10., np.inf)))

Goal: we must find a way provide some parameter to the final function (the powerlaw in this case), but this parameter can not be a function argument.

We can define an auxiliary function and take advantage of variables scopes.

Variables scope¶

Variables can only reach the area in which they are defined, which is called scope. Think of it as the area of code where variables can be used. Python supports global variables (usable in the entire program) and local variables.

By default, all variables declared in a function are local variables. To access a global variable inside a function, it’s required to explicitly define global variable.

In [ ]:

def powerlaw(x, N, alpha, x_off):
    return N * np.power(x, -alpha) * np.exp(-x/x_off)

We can define "new" powerlaw function, with only 2 free parameter: $\alpha$ and $x_{off}$. The N parameter is defined out of powerlaw function.

In [ ]:

    def powerlaw(x, alpha, x_off):
        return N * np.power(x, -alpha) * np.exp(-x/x_off)

We nest powerlaw inside another function which provides N. In this way N is defined outside powerlaw, but it is in the powerlaw scope.

In [20]:

def powerlawN(N):
    # Variable N lives in this scope
    def powerlaw(x, alpha, x_off):
        return N * np.power(x, -alpha) * np.exp(-x/x_off)
    return powerlaw

powerlawN(2.97e-12) returns a powerlaw with a fixed N, and we can call the curve_fit in this way

In [21]:

popt2, pcov2=curve_fit(powerlawN(2.97e-12), xdata, ydata, sigma=yerr, p0=(1, 6.0))
popt2,pcov2

Out[21]:

(array([1.82439046, 7.68200904]), array([[0.00249409, 0.03350657],
        [0.03350657, 1.19053626]]))

In [ ]:

Comute the integrate energy: scipy.integrate¶

The scipy.integrate sub-package provides several integration techniques:

Methods for Integrating Functions given function object
Methods for Integrating Functions given fixed samples
Interface to numerical integrators of ODE systems

We use quad function, which use a technique from the Fortran library QUADPACK, on the analytic model

In [22]:

from scipy.integrate import quad

value, err = quad(powerlaw, 0.2, 20, args=tuple(popt))
value

Out[22]:

1.1697759599647136e-11

And the simps function, which use the composite Simpson’s rule, to integrate the sampled model.

In [23]:

from scipy.integrate import simps

idx = (model['ENERGY'] >=0.2) & (model['ENERGY'] <=20.0)
value = simps(model['FLUX'][idx], model['ENERGY'][idx])
value

Out[23]:

9.068625001793289e-12

In [24]:

idx

Out[24]:

array([False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False])

Fill area in plot¶

In [25]:

xfill = np.linspace(0.2, 20, 100)
ax.fill_between(xfill, 0, powerlaw(xfill, *popt), color='orange')

Out[25]:

<matplotlib.collections.PolyCollection at 0x7f11937d2198>

In [26]:

ax

Out[26]:

<matplotlib.axes._subplots.AxesSubplot at 0x7f11926e1160>

and close the file

In [27]:

tev.close()

Optional arguments¶

args and kwargs

In [28]:

def sum(*args):
    s=0
    for i in args:
        s+=i
    print(s)

In [ ]:

sum(1, 2, 3)
sum([1, 2, 3]) # Error!
sum(*[1, 2, 3])

In [29]:

def show(**kargs):
    print(kargs)
    print("keys", *kargs)

In [30]:

show(a=1, b="x")

{'a': 1, 'b': 'x'}
keys a b

curve_fit: N dimensional fit¶

In this example we fit the hyperbolic paraboloid defined by 3 parameters (a, b, c):

$$ z = ax^2-by^2+c $$

By Nicoguaro - Own work, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=20570051

load the observed point

In [31]:

x,y,z=np.loadtxt("data/saddle.dat", unpack=True)

Plot the points in 3D plot

In [32]:

from mpl_toolkits.mplot3d import Axes3D

In [33]:

#Defined a 3D projection figure
fig=plt.figure()
ax = fig.add_subplot(111, projection='3d')

#Plot the data
ax.scatter(x,y,z)

Out[33]:

<mpl_toolkits.mplot3d.art3d.Path3DCollection at 0x7f1192109630>

In [ ]:

Define the hyperbolic paraboloid function

In [34]:

def hypar(x, y, a, b, c):
  return a*x*x - b*y*y + c

PROBLEM: we can not use hypar function with curve_fit

$f$ interface must be ydata = f(xdata, *params)

In [ ]:

popt, pcov = curve_fit(hypar, x, y , z) # x and y not allow!

Define a new interface, where $x$ and $y$ data are packed together into a generic data

In [35]:

def hypar_interface(data, a, b, c):
  _x = data[:len(data)//2]
  _y = data[len(data)//2:]
  return hypar(_x, _y, a, b, c)

prepare the input data for hypar_interface

In [36]:

#Concatenate x,y for curve_fit
input_data = np.concatenate((x, y))

compute the fit

In [ ]:

#The invalid syntax just for comparison
popt, pcov = curve_fit(hypar, x, y , z)

In [37]:

#Compute the fit
popt, pcov = curve_fit(hypar_interface, input_data, z)
popt, pcov

Out[37]:

(array([2.11000788, 3.51269877, 1.03722845]),
 array([[ 5.97524246e-05,  1.52870003e-06, -1.95994463e-03],
        [ 1.52870003e-06,  5.87180572e-05,  1.83185189e-03],
        [-1.95994463e-03,  1.83185189e-03,  1.77500926e-01]]))

Show the results¶

Evaluate the function on a regular grid of X and Y. Plot the surface obtained by the fit

In [38]:

#Evaluate the function on a grid of values
X,Y = np.meshgrid(np.linspace(-10,10,20), np.linspace(-10,10,20))

zfit = hypar(X, Y, *popt)

#Plot the surface
from matplotlib import cm
surf = ax.plot_surface(X, Y, zfit, cmap=cm.rainbow,
                       linewidth=0, alpha=0.5)

In [ ]: