Add the energy dimension to the likelihood analysis

Energy and event type information are available in the data, but not used by the current wtlike light curve analysis.

For a single cell the likelihood, derived by Kerr, is,

$$ \displaystyle\log\mathcal{L}(\alpha,\beta\ |\ w)\ = \sum_{w} \log \big( 1 + \alpha\ w + \beta\ (1-w) \big) - (\alpha\ S + \beta\ B) \tag{1}$$

where $w$ represents the set of all weights in the cell, and $S$ is an estimate for $\sum w$ using the cell's exposure relative to the full dataset.

We extend this with three modifications. Denoting the energy subset, or band, with an index $k$ we:

  • Replace $\alpha$ with a value for each energy band, $\alpha_k$;
  • Break the set $w$ into sets with specific $k$, $w_k$;
  • Replace $S$ with its equivalent for each energy band, $S_k$, the estimate for the value of $\sum w_k$.

In the following we specialize to the usual case $\beta=0$. The likelihood, now a function of the set $\alpha_k$, is a sum over $k$:

$$ \log\mathcal{L}(\alpha_0,\alpha_1,\dots,| w)\ = \sum_k \big[ \sum_{w_k} \log ( 1 + \alpha_k\ w_k ) - \alpha_k\ S_k \big],\tag{2} $$

Calculation of $S_k$

Since the effective area also depends on event type, this is a sum of two terms. Let the function $\mathcal{E}_{kj}(I)$ represent the exposure, in $\mathrm{cm^2\ s}$, for energy band $k$ and event type $j$, for the time interval $I \equiv (t_a,t_b)$. Also, let $W_{kj}$ be the sums of $w_k$ for the subsets with $j$ = Front or Back.

The spacecraft pointing data relevant for a given source is a list of 30-s time intervals, each with a livetime $\tau$ and a value of $z\equiv \cos\theta$. Then for an interval $I$, either the full dataset or just a cell, we form an array of the sum of the livetime values binned in $z$. The final step to calculate $\mathcal{E}_{kj}$ involves the effective area $A$. This is determined from the $Fermi$ "IRF" tables, defining a function of energy $E$, $z$ and event type, returning values in $\mathrm{cm^2}$. Since we bin in $z$, it is represented by a $16\times 2 \times 12$ matrix for the 16 energy bins, 12 angular bins, and Front/Back. Symbolically, we have $\mathcal{E}(I)$ = $A_{eff} \cdot L(I)$ with $L$ the $12\times 2$ livetime matrix for interval $I$, resulting in the $16\times 2$ exposure matrix.

Thus we have $$ S_k =\sum_{j=F,B} \frac{\mathcal{E}_{kj}(I_{cell})}{\mathcal{E}_{kj}(I_{full})} W_{kj} \tag{3}$$

A question is how to prepare the SC data to facilitate this. Here is a snapshot of $\cos\theta$ for a fraction of a day:

image.png

The points are 30s apart, and the source is in the FOV on alternate orbits. Since an orbit is the minimum time interval for a cell, which is only appropriate for extremently bright flares, I propose to make an orbit the basic time unit to save for rebinning. For each orbit, I'll collect the 8x12 matrix $L(I_{orbit})$. For each orbit then, I make a sum of livetimes for bins in $z$, keeping the $z$ information.

I can save either the livetime binned in $z$ or, after applying the effective area, the exposure per energy band/event type. The former has (now) 12 bins, while the latter up to 32 with Front/Back and 16 energy bins. However, this is effectively 14 below 10 GeV, and I could make larger bins above 10 GeV.

Kerr evaluation of S:

This is an energy independent approach, used by Kerr, which ignores the energies and event types of the photons. In this case, the sum of all the weights, $\sum w \equiv W_{total}$ is used, with a relative exposure factor, to estimate the sum of the weights for a cell. The energy dependence of the effective area is averaged over with a weighting function $g$ representing the source spectrum.

So $\mathcal{E}(I) = A \cdot L(I)$ becomes $\bar{\mathcal{E}} = \bar{A} \cdot L(I) $, where $\bar{A}(z) = \sum_j \int g(E) A_{j}(E,z)\ dE$, is the energy-weighted effective area, now a function only of $z$. In the current approach it is evaluated only for the 12 binned values of $z$, instead of for each 30-s exposure interval. Finally,

$$ \bar{S}_{cell} = \frac{ \bar{\mathcal{E}}(I_{cell}) }{ \bar{\mathcal{E}}(I_{total}) } W_{total} \tag{4}$$


Making light curves

The objective is to look for changes over time. We consider two modes:

Normalization

The basic assumption is that the spectral shape is fixed. In Eq. (2), set all $\alpha$'s to a single value which makes it identical to Eq. (1) with the identification $$S = \sum_k S_k \tag{5}.$$ Compare this with the Kerr approach: $$ S = \frac{f(I_{cell})}{f(I_{total})} \sum w,$$ where $f(I)$ is an effective exposure weighted by an energy spectrum for the interval. He used a weighting function $E^{-2.1}$. The actual source spectrum should be an better alternative.

Spectral shape

As the count mode detects changes in the level of flux, assumimg a constant spectrum, we will try to charactrize a change corresponding to a difference corresponding to multiplication by a power law. The nominal logarithmic energy spectrum, $E\ \frac{dN}{dE}$, is proportional to the $S_k$'s.

Since energies are digitized to 4 bins per decade from 100 MeV to 1 TeV, with the logarithmic bin center value representing the bin's energy, we have $E_k = 10^{(k+1/2)/4} \times 100 \mathrm{MeV}$ with $k=0,\ldots, 15$. (But only the first eight play a role in this analysis.)

So we want a likelihood expression, like Equation (2), with a parameter that characterizes a power law spectral factor. We use $f(E,\gamma) = \big( \frac{E}{E_c} \big)^{-\gamma}$ where $E_c=1\ \mathrm{GeV}$. The nominal value for $\gamma$ is zero.

Then $$ \frac{df(E_k,\gamma)}{d\gamma} = (3.5-k)\ \frac{\log(10)}{4} $$

Consider a single energy term in Eq. (2). Factor possible changes in the flux into an overall normalization change represented by $\alpha$, so we have an actual factor of $(1+\alpha)$ and that from $f(E_k,\gamma)$. Expand the latter to first order since we expect $|\gamma|<<1$, and keeping only first-order terms to obtain $$ \log\mathcal{L}_k(\alpha, \gamma)\ = \sum_{w\in W_k} \log\big(1+(\alpha+c_k\gamma)w\big) - (\alpha+c_k\gamma)\ S_k, $$ where $c_k = \log(E_k/E_c)=\frac{\log(10)}{4}(3.5-k)$.

Implementation note

The code, see the class Poisson, uses a Poisson-like representation of $\log\mathcal{L}_k(\alpha)$ for each cell.

That can be reinterpreted as a function of $\alpha + c_k\gamma$.

For $s>0$, $$\log \mathcal{L}(s | s_p,e,b) = e\ \big[\ (s_p+b) \ \log( s+b ) - s\ \big] + \mathrm{const},$$ where the const is conventionally defined such that the log likelihood is zero at the peak.

For $s_p >> b$ or $e>>1$, the distribution approaches the normal distribution, and can be represented as,

$$ s = s_p \pm \sqrt{\frac{s_p+b}{e}}.$$

Here $s$ is the flux, $1+\alpha$.

These functions representing each subcell are created from the data. For a given cell, the likelihood for the flux for a cell is the the sum shown in Eq. (2). The modification for each of the subcell likelihood functions involves dividing the $e$ paramater for each function by $c_k$.

pull_plot[source]

pull_plot(pulls, ax=None, nbins=25, **kwargs)

Histogram of a set of "pulls", presumed to be normal (0,1) distributed, the function for which is overplotted.

class EnergyCells[source]

EnergyCells(wtl:SourceData object, interval, adjust_slope=-0.55)

Create a set of cells using energy-dependent livetime

Input is an instance of the SourceData class for a given source, containing photon and spacecraft info

Processing creates:

  • Orbit-based exposure vs $\cos\theta$, implemented by LivetimeHistory and using the effective area

  • The basic time-defined cells, implemented by FermiInterval

Each cell has 8 sub-cells for the 4/decade energies from 100 MeV to 10 GeV. Then the likelihood function for each sub-cell is determined using the tools LogLike and Poisson.

self = None # will be set below
@ipynb_doc
def setup_test(name='Geminga',  interval=7):  ##PSR J1227-4853'
    """ # Study {name} with {interval}-day bins
    
    Create an instance of an `EnergyCells` object to use for the following plots.
    Setting the Front-Back mask to {fbmask}.
    {output}
    """
    global self
    with capture_show('Printout') as output:
        wtl = SourceData(name)
        self =  EnergyCells( wtl, interval)
    return locals()
setup_test()

Study Geminga with 7-day bins

Create an instance of an EnergyCells object to use for the following plots. Setting the Front-Back mask to {fbmask}.

Printout
SourceData:  4FGL J0633.9+1746: Restoring from cache with key "PSR J0633+1746_data"
SourceData: Source Geminga with:
data: 1,222,521 photons from 2008-08-04 to 2022-09-28
exposure: 3,434,143 intervals, average effective area 2693 cm^2 for 102.7 Ms
rates: source 3.31e-06/s, background 1.11e-06/s, TS 2290923.9

pull_plot[source]

pull_plot(pulls, ax=None, nbins=25, **kwargs)

Histogram of a set of "pulls", presumed to be normal (0,1) distributed, the function for which is overplotted.

class FluxFixer[source]

FluxFixer(fluxfits, emom)

@ipynb_doc
def ebands_plot(self, **kwargs):
    """ ## Energy Band light curve
    
    Each {self.interval}-day cell has 8 sub-cells for the 4/decade energies from 100 MeV to 10 GeV.
    
    Thus we can construct a light curve for each energy band:
    {fig}
    """   
    fig = figure(self.plot_fluxes,**kwargs)

    return locals()

ebands_plot(self, ylim=(0,2), UTC=True )

Energy Band light curve

Each 7-day cell has 8 sub-cells for the 4/decade energies from 100 MeV to 10 GeV.

Thus we can construct a light curve for each energy band:

Figure 1
Figure 1. Flux light curves for each energy band.
@ipynb_doc
def flux_pull_plots(self, fbmask):
    """
    #### {which}
    {fig}
    """
    which = ['','Front','Back', 'Both'][fbmask]
    x = self.energy_moment()
    pff = [FluxFixer( self.flux_table( fbmask).iloc[:,i] , x) for i in range(8)]
    fig,axx = plt.subplots(2,4, figsize=(10,3))
    plt.subplots_adjust(hspace=0, wspace=0)
    for pf, ax in zip(pff, axx.flatten()):
        pf.pull_hist(ax=ax)
    return locals()
display_markdown("""
### Pull plots
The following show the pull histograms for each energy band.
""")
flux_pull_plots(self,1)
flux_pull_plots(self,2)

Pull plots

The following show the pull histograms for each energy band.

Front

Figure 1

Back

Figure 1
@ipynb_doc
def plot_sigmas(self, **kwargs):    
    """### Resolution for Energy and event type

    The top of each bar is the measured std, and the bottom the reduction 
    possible in the absence of systematics.
    {fig}
    Note that there is no Back data below 300 MeV.
    """
    fit_info = check_fits(self)
    fig, ax = plt.subplots(figsize=(6,4))
    kw = dict(yscale='linear', ylim=(0,None), 
              xlabel='Energy (GeV)', ylabel='Relative STD (%)',
              xticks=[-0.5,3.5,7.5], xticklabels='0.1 1 10'.split())
    kw.update(kwargs)

    for ii, name, color in zip(range(3), 'F B F+B'.split(),
                              'cornflowerblue orange violet'.split()):
        y = fit_info.loc[[ii],:]
        k = np.array([t[1] for t in y.index])
        for j, sig, psig in zip(k, 100*y.sigma.values, y.pull_std.values):
            x = j + 0.14*(ii-1)
            ax.plot( [x,x], [sig/psig, sig],'s-', lw=6, color=color, label=name if j==7 else '' );
    ax.set(**kw)
    ax.grid(alpha=0.5); 
    ax.legend();   
    return locals()

plot_sigmas(self)

Resolution for Energy and event type

The top of each bar is the measured std, and the bottom the reduction possible in the absence of systematics.

Figure 1
Note that there is no Back data below 300 MeV.

@ipynb_doc
def plot_energy_moment(self):
    """## Energy moment
    {text}
    {fig}
    """
    text = self.energy_moment.__doc__
    x = self.energy_moment()
    fig, (ax1,ax2) = plt.subplots(ncols=2, figsize=(12,4), sharey=True,
                                  gridspec_kw=dict(width_ratios=[4,1]), 
                                  #width_ratios=[4,1],
                                 ) 
    plt.subplots_adjust(wspace=0.05)
    ax1.scatter(self.cells.time.values, x, marker='.')
    ax1.grid(alpha=0.5)
    ax2.hist(x, np.linspace(-0.2,0.2,101), histtype='step', lw=2,
            orientation='horizontal');
    ax2.grid(alpha=0.5)
    return locals()

plot_energy_moment(self)

Energy moment

Return a measure of the exposure energy distribution.

Figure 1
@ipynb_doc
def plot_weighted_aeff(self, **kwargs):
    """
    ## Weighted exposure
    
    Compare the "Kerr-mode", power-law spectral shape, with that for the source.
    
    {fig}
    
    The difference is rather small.
    """

    from wtlike.exposure import weighted_aeff
    A = weighted_aeff(Config(use_kerr=True), self.source)
    Ank = weighted_aeff(Config(use_kerr=False), self.source)
    ctbins = LivetimeHistory.ctbins
    ctvals = 0.5*(ctbins[1:] + ctbins[:-1])
    Akerr = A(ctvals)
    Ankerr = Ank(ctvals)

    plt.rc('figure', figsize=(6,4))
    plt.rc('font', size=10)      

    fig, ax = plt.subplots()
    ax.plot(ctvals, Akerr, '+:', label='power-law');
    ax.plot(ctvals, Ankerr, 'x:', label='source')
    ax.set(ylabel=r'$\mathrm{Effective\ area\ (cm^2)}$', ylim=(0,None), xlabel=r'$\cos\theta$')
    ax.grid(alpha=0.4)
    ax.legend();
    return locals()
plot_weighted_aeff(self)

Weighted exposure

Compare the "Kerr-mode", power-law spectral shape, with that for the source.

Figure 1

The difference is rather small.

@ipynb_doc
def count_light_curve(self, ftmask=3, **kwargs):
    """### Light curve and pull distribution--{data}
    
   
    The left plot below shows relative flux, using all energies, for {self.source_name} in {self.interval}-day bins,
    showing the relative flux. 
    {fig}
    The right plot is the distribution of the "pulls", compared with the expected Gaussian.
    """
    from matplotlib.gridspec import GridSpec
    data = ['','Front only','Back only','Front and Back'][ftmask]
    fig = plt.figure(figsize=(12,3))
    gs = GridSpec(1, 2,  width_ratios=[4,1], wspace=0.15, top=0.95)
    (ax1,ax2) =  [fig.add_subplot(g) for g in gs]
    kw = dict(ylim=(0.5,1.5))
    kw.update(kwargs)
    ax1.set(**kw)
    ax1.axhline(1, color='grey')

    cf = self.count_flux(ftmask)    
    pulls = cf.fit.apply(lambda p: p.sigma_dev(1))
    flux_plot(cf,    ax=ax1,**kw);
    pull_plot(pulls, ax=ax2)

    return locals()



count_light_curve(self, 1)
count_light_curve(self, 2)
count_light_curve(self, 3)

Light curve and pull distribution--Front only

The left plot below shows relative flux, using all energies, for Geminga in 7-day bins, showing the relative flux.

Figure 1
The right plot is the distribution of the "pulls", compared with the expected Gaussian.

Light curve and pull distribution--Back only

The left plot below shows relative flux, using all energies, for Geminga in 7-day bins, showing the relative flux.

Figure 1
The right plot is the distribution of the "pulls", compared with the expected Gaussian.

Light curve and pull distribution--Front and Back

The left plot below shows relative flux, using all energies, for Geminga in 7-day bins, showing the relative flux.

Figure 1
The right plot is the distribution of the "pulls", compared with the expected Gaussian.

@ipynb_doc    
def plot_ratio_vs_eindex(self, strategy, rlim=(0.8,1.2), **kwargs):
    """   
    ### {comment}
    {fig}
    """    
    # # S, B = self._get_S_and_B()    
    # ssum = S.sum(axis=1)
    Scell = strategy(self)
    comment = strategy.__doc__

    wsum = np.array([sum(c['w']) for c in self])
    ratio = (Scell/wsum).clip(*rlim)
    
    # Energy index weighted by exposure
    F = self._get_Fbar()[:,:8].sum(axis=2)    
    iE = np.arange(8)
    EF = (iE * F).sum(axis=1) / F.sum(axis=1)
    
    x,y = EF-3.5, ratio
    class LSQ():
        def __init__(self, x, y):
            self.a, self.b = np.linalg.lstsq(np.vstack([x, np.ones(len(x))]).T, y, rcond=None)[0]
        def __call__(self, x):
            return self.a*x + self.b
        def __str__(self):
            return rf'$y={self.a:.2f}x + {self.b:.2f}$'

    fitfun = LSQ(x,y) 
    
    fig, (ax,ax2) = plt.subplots(ncols=2, figsize=(12,4), 
                                sharey=True, gridspec_kw=dict(width_ratios=[3,1]))
    plt.subplots_adjust(wspace=0.05)
    ax.plot(x,y, '.', color='cornflowerblue', label='cell data')
    ax.plot(x, fitfun(x), '--', color='orange', label=f'least-squares fit:\n{fitfun}')

    kw = dict(xlabel='Weighted Energy index', ylabel='ratio', ylim=rlim)
    kw.update(kwargs); ax.set(**kw)
    ax.grid(alpha=0.5)
    ax.legend()
    ax.axhline(1, color='grey')    
    ax2.hist(y, np.linspace(*rlim), histtype='step', lw=2, 
             orientation='horizontal', label=f'${y.mean():.3f}\pm{y.std():.3f}$')
    ax2.legend(); ax2.grid(alpha=0.5)
    ax2.axhline(1, color='grey')
    return locals()
display_markdown(
"""
## Study exposure weighting
""")
plot_ratio_vs_eindex(self, get_S_cell);  
plot_ratio_vs_eindex(self, Sbar(A1,'Power-law 2.1'))
plot_ratio_vs_eindex(self, Sbar(A2, 'Source spectral model'))
plot_ratio_vs_eindex(self, Sbar(A3, f'Power-law${Gamma}$'))

Study exposure weighting

summed Sk

Figure 1

weighted Aeff Power-law 2.1

Figure 1

weighted Aeff Source spectral model

Figure 1

weighted Aeff Power-law$-1.95$

Figure 1