In this example we want fit a power law function on one high energy $\gamma$-ray spectrum (Aharonian et al 2006) acquired by HESS. We will see, how to:
Key: error plot, scatter plot, curve fit, integration
from astropy.io import fits
tev = fits.open('data/TeVJ1826-148.fits')
tev.info()
Filename: data/TeVJ1826-148.fits No. Name Ver Type Cards Dimensions Format 0 PRIMARY 1 PrimaryHDU 9 () 1 HESS_INFC 1 BinTableHDU 20 26R x 4C [1E, 1E, 1E, 1E] 2 MODEL_HESS_INFC 1 BinTableHDU 20 300R x 4C [1E, 1E, 1E, 1E] 3 MODEL_HESS_INFC_MAN 1 BinTableHDU 21 300R x 4C [1E, 1E, 1E, 1E]
Open the proper extension (HESS_INFC
) and retrieve just data
infc=tev['HESS_INFC'].data
infc
FITS_rec([( 0.195958 , 6.0770583e-11, 1.2443476e-11, 1.4777032e-11), ( 0.241226 , 4.1031869e-11, 4.0196353e-12, 7.6927215e-12), ( 0.301863 , 2.6202924e-11, 3.3654117e-12, 3.1804041e-12), ( 0.369569 , 1.7682928e-11, 1.7322793e-12, 2.3747540e-12), ( 0.45494398, 1.0170147e-11, 1.3062176e-12, 1.2344121e-12), ( 0.553947 , 8.5277436e-12, 9.2303097e-13, 1.1452271e-12), ( 0.681913 , 7.0764371e-12, 5.4529211e-13, 5.9083592e-13), ( 0.83030903, 3.2332483e-12, 4.4736605e-13, 4.7647048e-13), ( 1.02772 , 1.9477987e-12, 2.3061205e-13, 2.8703380e-13), ( 1.24455 , 1.8305073e-12, 1.9813155e-13, 2.6975433e-13), ( 1.52369 , 1.1143001e-12, 8.5867430e-14, 1.9383865e-13), ( 1.89631 , 5.1605725e-13, 8.6426261e-14, 9.6751478e-14), ( 2.30898 , 5.1384363e-13, 7.6140808e-14, 9.6336303e-14), ( 2.82687 , 3.8885307e-13, 7.2455070e-14, 5.2221624e-14), ( 3.49899 , 1.9280053e-13, 2.8568985e-14, 3.8784521e-14), ( 4.26044 , 1.5092543e-13, 3.2270847e-14, 2.6254316e-14), ( 5.21602 , 7.4754798e-14, 1.9255756e-14, 1.6072552e-14), ( 6.35114 , 5.4631165e-14, 1.1681232e-14, 1.3284150e-14), ( 7.73325 , 2.1507434e-14, 7.4304602e-15, 8.1334958e-15), ( 9.416121 , 1.5717886e-14, 5.4302965e-15, 6.7014456e-15), (11.5913 , 7.5261020e-15, 3.2328266e-15, 4.5117955e-15), (14.664599 , 0.0000000e+00, 0.0000000e+00, 3.1852115e-15), (17.9538 , 1.7490052e-15, 1.1180409e-15, 1.5729794e-15), (21.7416 , 0.0000000e+00, 0.0000000e+00, 1.4490910e-15), (26.185 , 1.4579962e-15, 8.4051572e-16, 1.4745043e-15), (31.1933 , 0.0000000e+00, 0.0000000e+00, 1.6060789e-16)], dtype=(numpy.record, [('ENERGY', '>f4'), ('FLUX', '>f4'), ('FLUX_ERROR_MIN', '>f4'), ('FLUX_ERROR_MAX', '>f4')]))
#%matplotlib qt
%matplotlib notebook
from matplotlib import pyplot as plt
fig = plt.figure("HESS Spectrum")
ax = fig.subplots()
Display points and errors
ax.errorbar(infc['ENERGY'], infc['FLUX'], yerr=[infc['FLUX_ERROR_MIN'], infc['FLUX_ERROR_MAX']],
label='HESS INFC', color='red', fmt='o')
<ErrorbarContainer object of 3 artists>
infc
FITS_rec([( 0.195958 , 6.0770583e-11, 1.2443476e-11, 1.4777032e-11), ( 0.241226 , 4.1031869e-11, 4.0196353e-12, 7.6927215e-12), ( 0.301863 , 2.6202924e-11, 3.3654117e-12, 3.1804041e-12), ( 0.369569 , 1.7682928e-11, 1.7322793e-12, 2.3747540e-12), ( 0.45494398, 1.0170147e-11, 1.3062176e-12, 1.2344121e-12), ( 0.553947 , 8.5277436e-12, 9.2303097e-13, 1.1452271e-12), ( 0.681913 , 7.0764371e-12, 5.4529211e-13, 5.9083592e-13), ( 0.83030903, 3.2332483e-12, 4.4736605e-13, 4.7647048e-13), ( 1.02772 , 1.9477987e-12, 2.3061205e-13, 2.8703380e-13), ( 1.24455 , 1.8305073e-12, 1.9813155e-13, 2.6975433e-13), ( 1.52369 , 1.1143001e-12, 8.5867430e-14, 1.9383865e-13), ( 1.89631 , 5.1605725e-13, 8.6426261e-14, 9.6751478e-14), ( 2.30898 , 5.1384363e-13, 7.6140808e-14, 9.6336303e-14), ( 2.82687 , 3.8885307e-13, 7.2455070e-14, 5.2221624e-14), ( 3.49899 , 1.9280053e-13, 2.8568985e-14, 3.8784521e-14), ( 4.26044 , 1.5092543e-13, 3.2270847e-14, 2.6254316e-14), ( 5.21602 , 7.4754798e-14, 1.9255756e-14, 1.6072552e-14), ( 6.35114 , 5.4631165e-14, 1.1681232e-14, 1.3284150e-14), ( 7.73325 , 2.1507434e-14, 7.4304602e-15, 8.1334958e-15), ( 9.416121 , 1.5717886e-14, 5.4302965e-15, 6.7014456e-15), (11.5913 , 7.5261020e-15, 3.2328266e-15, 4.5117955e-15), (14.664599 , 0.0000000e+00, 0.0000000e+00, 3.1852115e-15), (17.9538 , 1.7490052e-15, 1.1180409e-15, 1.5729794e-15), (21.7416 , 0.0000000e+00, 0.0000000e+00, 1.4490910e-15), (26.185 , 1.4579962e-15, 8.4051572e-16, 1.4745043e-15), (31.1933 , 0.0000000e+00, 0.0000000e+00, 1.6060789e-16)], dtype=(numpy.record, [('ENERGY', '>f4'), ('FLUX', '>f4'), ('FLUX_ERROR_MIN', '>f4'), ('FLUX_ERROR_MAX', '>f4')]))
Setup the plot: scales, axes labels and legend
ax.set_xscale("log")
ax.set_yscale("log")
ax.set_xlabel("E(TeV)")
ax.set_ylabel(r"Flux ($\frac{ph}{cm^2sTev}$)") #raw string, no characters escaping
Text(0, 0.5, 'Flux ($\\frac{ph}{cm^2sTev}$)')
In the same way we can retrieve data model, and plot it as a solid line without error
#Load the model
model=tev['MODEL_HESS_INFC'].data
ax.plot(model['ENERGY'], model['FLUX'], label='model', color='gray')
ax.legend()
<matplotlib.legend.Legend at 0x7f1193ab3240>
model=tev['MODEL_HESS_INFC'].data
model.dtype
dtype((numpy.record, [('ENERGY', '>f4'), ('FLUX', '>f4'), ('FLUX_ERROR_MIN', '>f4'), ('FLUX_ERROR_MAX', '>f4')]))
from scipy.optimize import curve_fit
curve_fit
requires the definition of our function.
Signature of this function MUST be ydata = f(xdata, *params)
In our case we decide to fit a power law like that
$$N x^{-\alpha} e^{-\frac{x}{x_{off}}}$$The parameters of the function are: $N$, $\alpha$ and $x_{off}$
import numpy as np
def powerlaw(x, N, alpha, x_off):
return N * np.power(x, -alpha) * np.exp(-x/x_off)
powerlaw(1, 3.0, 2.0, 0.5)
powerlaw(1, 3.0, 2.0, 0.5)
0.4060058497098381
_x=np.array([0.5, 1.0, 2.0])
powerlaw(_x, 3.0, 2.0, 0.5)
_x=np.array([0.5, 1.0, 2.0])
powerlaw(_x, 3.0, 2.0, 0.5)
array([4.41455329, 0.40600585, 0.01373673])
xdata = infc['ENERGY']
ydata = infc['FLUX']
yerr = infc['FLUX_ERROR_MAX']
popt, pcov=curve_fit(powerlaw, xdata, ydata, sigma=yerr, p0=(1.0, 1, 6.0))
popt, pcov
(array([2.97021086e-12, 1.82433667e+00, 7.68035476e+00]), array([[ 3.94792250e-26, -1.02996866e-14, -3.28658276e-13], [-1.02996866e-14, 5.28959877e-03, 1.20696691e-01], [-3.28658276e-13, 1.20696691e-01, 3.97762247e+00]]))
curve_fit?
import numpy as np
chisq = np.sum(((powerlaw(xdata, *popt) - ydata)/yerr)**2)
reduced_chisq = (chisq)/(len(xdata)-len(popt))
powerlaw(xdata, *popt)
array([5.6629090e-11, 3.8531258e-11, 2.5393347e-11, 1.7400369e-11, 1.1777737e-11, 8.1182812e-12, 5.4646461e-12, 3.7425735e-12, 2.4717812e-12, 1.6946432e-12, 1.1297008e-12, 7.2203468e-13, 4.7777412e-13, 3.0874850e-13, 1.9168949e-13, 1.2121016e-13, 7.3990015e-14, 4.4563884e-14, 2.5991438e-14, 1.4577113e-14, 7.5163781e-15, 3.2801306e-15, 1.4776295e-15, 6.3637580e-16, 2.5416812e-16, 9.6217808e-17], dtype=float32)
# Compute the fitted function on the grid of x
x=np.linspace(xdata[0], xdata[-1], 1000)
yfit=powerlaw(x, *popt)
#Setup a new figure
fig=plt.figure()
ax = fig.subplots()
#Plot the fitted function
ax.plot(x, yfit, color='blue', label='powerlaw fit')
#Plot data
ax.errorbar(xdata, ydata, yerr=[infc['FLUX_ERROR_MIN'], infc['FLUX_ERROR_MAX']], label='HESS', color='red', fmt='o')
#Plot original model
ax.plot(model['ENERGY'], model['FLUX'], label='model', color='gray', ls='--')
ax.legend()
ax.set_xscale('log')
ax.set_yscale('log')
curve_fit
assumes $f$ must have this signature ydata = f(xdata, *params)
there are not additional options for fixed parameters.
curve_fit
provides also the bounds
parameter to set a lower and upper limits in parameters space exploration. In any case each lower bound must be strictly less than each upper bound.
# This call raises an error!
popt, pcov=curve_fit(powerlaw, xdata, ydata, sigma=yerr, p0=(2.97, 1, 6.0), bounds=((2.97, 0, 0),
(2.97, 10., np.inf)))
Goal: we must find a way provide some parameter to the final function (the powerlaw
in this case), but this parameter can not be a function argument.
We can define an auxiliary function and take advantage of variables scopes.
Variables can only reach the area in which they are defined, which is called scope. Think of it as the area of code where variables can be used. Python supports global variables (usable in the entire program) and local variables.
By default, all variables declared in a function are local variables. To access a global variable inside a function, it’s required to explicitly define global variable.
def powerlaw(x, N, alpha, x_off):
return N * np.power(x, -alpha) * np.exp(-x/x_off)
We can define "new" powerlaw
function, with only 2 free parameter: $\alpha$ and $x_{off}$.
The N parameter is defined out of powerlaw function.
def powerlaw(x, alpha, x_off):
return N * np.power(x, -alpha) * np.exp(-x/x_off)
We nest powerlaw
inside another function which provides N. In this way N is defined outside powerlaw, but it is in the powerlaw scope.
def powerlawN(N):
# Variable N lives in this scope
def powerlaw(x, alpha, x_off):
return N * np.power(x, -alpha) * np.exp(-x/x_off)
return powerlaw
powerlawN(2.97e-12)
returns a powerlaw with a fixed N, and we can call the curve_fit
in this way
popt2, pcov2=curve_fit(powerlawN(2.97e-12), xdata, ydata, sigma=yerr, p0=(1, 6.0))
popt2,pcov2
(array([1.82439046, 7.68200904]), array([[0.00249409, 0.03350657], [0.03350657, 1.19053626]]))
The scipy.integrate
sub-package provides several integration techniques:
We use quad
function, which use a technique from the Fortran library QUADPACK, on the analytic model
from scipy.integrate import quad
value, err = quad(powerlaw, 0.2, 20, args=tuple(popt))
value
1.1697759599647136e-11
And the simps
function, which use the composite Simpson’s rule, to integrate the sampled model.
from scipy.integrate import simps
idx = (model['ENERGY'] >=0.2) & (model['ENERGY'] <=20.0)
value = simps(model['FLUX'][idx], model['ENERGY'][idx])
value
9.068625001793289e-12
idx
array([False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False])
xfill = np.linspace(0.2, 20, 100)
ax.fill_between(xfill, 0, powerlaw(xfill, *popt), color='orange')
<matplotlib.collections.PolyCollection at 0x7f11937d2198>
ax
<matplotlib.axes._subplots.AxesSubplot at 0x7f11926e1160>
and close the file
tev.close()
def sum(*args):
s=0
for i in args:
s+=i
print(s)
sum(1, 2, 3)
sum([1, 2, 3]) # Error!
sum(*[1, 2, 3])
def show(**kargs):
print(kargs)
print("keys", *kargs)
show(a=1, b="x")
{'a': 1, 'b': 'x'} keys a b
In this example we fit the hyperbolic paraboloid defined by 3 parameters (a, b, c):
$$ z = ax^2-by^2+c $$By Nicoguaro - Own work, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=20570051
load the observed point
x,y,z=np.loadtxt("data/saddle.dat", unpack=True)
Plot the points in 3D plot
from mpl_toolkits.mplot3d import Axes3D
#Defined a 3D projection figure
fig=plt.figure()
ax = fig.add_subplot(111, projection='3d')
#Plot the data
ax.scatter(x,y,z)
<mpl_toolkits.mplot3d.art3d.Path3DCollection at 0x7f1192109630>
Define the hyperbolic paraboloid function
def hypar(x, y, a, b, c):
return a*x*x - b*y*y + c
PROBLEM: we can not use hypar
function with curve_fit
$f$ interface must be ydata = f(xdata, *params)
popt, pcov = curve_fit(hypar, x, y , z) # x and y not allow!
Define a new interface, where $x$ and $y$ data are packed together into a generic data
def hypar_interface(data, a, b, c):
_x = data[:len(data)//2]
_y = data[len(data)//2:]
return hypar(_x, _y, a, b, c)
prepare the input data for hypar_interface
#Concatenate x,y for curve_fit
input_data = np.concatenate((x, y))
compute the fit
#The invalid syntax just for comparison
popt, pcov = curve_fit(hypar, x, y , z)
#Compute the fit
popt, pcov = curve_fit(hypar_interface, input_data, z)
popt, pcov
(array([2.11000788, 3.51269877, 1.03722845]), array([[ 5.97524246e-05, 1.52870003e-06, -1.95994463e-03], [ 1.52870003e-06, 5.87180572e-05, 1.83185189e-03], [-1.95994463e-03, 1.83185189e-03, 1.77500926e-01]]))
Evaluate the function on a regular grid of X and Y. Plot the surface obtained by the fit
#Evaluate the function on a grid of values
X,Y = np.meshgrid(np.linspace(-10,10,20), np.linspace(-10,10,20))
zfit = hypar(X, Y, *popt)
#Plot the surface
from matplotlib import cm
surf = ax.plot_surface(X, Y, zfit, cmap=cm.rainbow,
linewidth=0, alpha=0.5)