
Kernel Density Estimates

Probability density functions for $m_t^{\mbox{reco}}$-$m_{\mbox{jj}}$ and $M_t^{\mbox{NWA}}$-$m_{\mbox{T2}}$ at every point in the $M_{\mbox{top}}$-$\Delta_{\mbox{JES}}$ grid, and for the backgrounds, are derived using a Kernel Density Estimate (KDE) approach. KDE is a non-parametric method for forming density estimates that generalizes easily to more than one dimension, which makes it useful for this analysis, where each event has two observables. The probability density for an event with observable $x$ is given by the sum of contributions from all entries in the MC sample:
\begin{equation}
\hat{f}(x) = \frac{1}{nh}\sum_{i=1}^{n} K\left(\frac{x-x_i}{h}\right)
\end{equation}

In the above equation, $\hat{f}(x)$ is the probability density to observe $x$ given some MC sample with known mass and JES (or the background). The MC sample has $n$ entries, with observables $x_i$. The kernel function $K$ is a normalized function that contributes less probability to a measurement at $x$ as its distance from $x_i$ increases. The smoothing parameter $h$ (sometimes called the bandwidth) determines the width of the kernel. Larger values of $h$ smooth out the contribution of each entry to the density estimate and give more weight to values of $x$ farther from $x_i$. Smaller values of $h$ introduce less bias into the density estimate, but make it more sensitive to statistical fluctuations. We use the Epanechnikov kernel, defined as:
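\begin{equation}
K(t) = \left\{ \begin{array}{ll} \frac{3}{4}\left(1-t^2\right) & \mbox{for } \left\vert t\right\vert < 1 \\ 0 & \mbox{otherwise} \end{array} \right. \qquad \mbox{with } t = \frac{x-x_i}{h},
\end{equation}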

so that only events with $\left\vert x-x_i\right\vert < h$ contribute to $\hat{f}(x)$. We use an adaptive KDE method, in which the single value $h$ is replaced by a per-entry value $h_i$, so that the amount of smoothing around $x_i$ depends on the value of $\hat{f}(x_i)$. In the peak of the distributions, where statistics are high, we use small values of $h_i$ to capture as much shape information as possible. In the tails of the distributions, where there are few events and the density estimates are sensitive to statistical fluctuations, larger values of $h_i$ are used. The overall scale of $h$ is set by the number of entries in the MC sample (more smoothing is used when fewer events are available) and by the RMS of the distribution (more smoothing is used for wider distributions). We extend KDE to two dimensions by multiplying the two kernels together:
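\begin{equation}
\hat{f}(x,y) = \frac{1}{n}\sum_{i=1}^{n} \frac{1}{h_{x,i}\,h_{y,i}} K\left(\frac{x-x_i}{h_{x,i}}\right) K\left(\frac{y-y_i}{h_{y,i}}\right)
\end{equation}
where $x$ and $y$ denote the two observables of an event, and $h_{x,i}$ and $h_{y,i}$ are the smoothing parameters of entry $i$ in each dimension.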
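The adaptive two-dimensional estimate described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the analysis: the function names are hypothetical, the global bandwidth scale (RMS times $n^{-1/6}$) and the square-root dependence of $h_i$ on a pilot estimate of $\hat{f}(x_i)$ are common textbook choices that the text does not specify, and the sample is a toy Gaussian stand-in for an MC sample.

```python
import numpy as np


def epanechnikov(t):
    """Epanechnikov kernel: 3/4 * (1 - t^2) for |t| < 1, zero otherwise."""
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) < 1.0, 0.75 * (1.0 - t**2), 0.0)


def kde_2d(x, y, xs, ys, hx, hy):
    """Product-kernel density estimate at the point (x, y).

    xs, ys: observables of the n MC entries.
    hx, hy: bandwidths, either scalars (fixed smoothing) or arrays of
            length n (one bandwidth per MC entry, i.e. adaptive smoothing).
    """
    kx = epanechnikov((x - xs) / hx) / hx
    ky = epanechnikov((y - ys) / hy) / hy
    return float(np.mean(kx * ky))


def adaptive_bandwidths(xs, ys, scale=1.0):
    """Per-entry bandwidths from a fixed-bandwidth pilot estimate.

    ASSUMPTION: both the n**(-1/6) global scale and the square-root law
    are textbook choices, not taken from the analysis described here.
    """
    n = len(xs)
    # Global scale: wider for broader distributions, narrower for larger n.
    hx0 = scale * np.std(xs) * n ** (-1.0 / 6.0)
    hy0 = scale * np.std(ys) * n ** (-1.0 / 6.0)
    # Pilot density at every MC entry (always > 0: each entry sees itself).
    pilot = np.array([kde_2d(xi, yi, xs, ys, hx0, hy0) for xi, yi in zip(xs, ys)])
    # Shrink the bandwidth where the pilot density is high, widen it in the tails.
    factor = np.sqrt(np.exp(np.mean(np.log(pilot))) / pilot)
    return hx0 * factor, hy0 * factor


# Toy stand-in for an MC sample: two correlated Gaussian observables.
rng = np.random.default_rng(1)
xs = rng.normal(0.0, 1.0, size=500)
ys = 0.5 * xs + rng.normal(0.0, 1.0, size=500)
hx, hy = adaptive_bandwidths(xs, ys)
density_at_origin = kde_2d(0.0, 0.0, xs, ys, hx, hy)
```

Because each kernel is divided by its own bandwidths, every MC entry contributes unit probability regardless of how widely it is smeared, so the estimate stays normalized when the bandwidths vary from entry to entry.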

Figures 5 and 6 show the 2d density estimates for Lepton+Jets and Dilepton signal events. Figures 7 and 8 show the 2d estimates for background events. The background density estimates are derived separately for each individual background, taking into account the sample size and width, and are then combined with the appropriate weights.
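Combining separately derived background densities amounts to forming a weighted mixture. The sketch below illustrates the idea under assumptions: `combine_densities` and `box_density` are hypothetical helpers, the box-shaped densities stand in for real KDE outputs, and the weights (here imagined as expected event yields) are invented numbers for illustration only.

```python
def combine_densities(densities, weights):
    """Return the weighted mixture of already-normalized 2d densities."""
    total = float(sum(weights))
    fractions = [w / total for w in weights]  # mixture fractions sum to one

    def combined(x, y):
        return sum(frac * f(x, y) for frac, f in zip(fractions, densities))

    return combined


def box_density(x0, x1, y0, y1):
    """Toy normalized density: uniform inside a rectangle, zero outside."""
    area = (x1 - x0) * (y1 - y0)
    return lambda x, y: 1.0 / area if (x0 <= x < x1 and y0 <= y < y1) else 0.0


# Two toy backgrounds with illustrative (made-up) relative weights.
bkg_a = box_density(0.0, 2.0, 0.0, 1.0)
bkg_b = box_density(1.0, 2.0, 0.0, 4.0)
background = combine_densities([bkg_a, bkg_b], weights=[30.0, 10.0])
```

Normalizing the weights to fractions keeps the combined estimate a proper density as long as each input density integrates to one.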
Figure 5: Full 2d density estimates for input mass of $\ensuremath{172~\mathrm{GeV}/c^{2}}$ and $\ensuremath{\Delta_{\mbox{JES}}}\xspace = 0.0$ for Lepton+Jets 1-tag events (left) and 2-tag events (right).
\includegraphics[width=0.45\textwidth]{plots/h2dpassCONTOUR_mass_1720_jes_00_sub_0.eps} \includegraphics[width=0.45\textwidth]{plots/h2dpassCONTOUR_mass_1720_jes_00_sub_1.eps}
Figure 6: Full 2d density estimates for input mass of $\ensuremath{172~\mathrm{GeV}/c^{2}}$ and $\ensuremath{\Delta_{\mbox{JES}}}\xspace = 0.0$ for Dilepton untagged events (left) and tagged events (right).
\includegraphics[width=0.45\textwidth]{plots/h2dpassCONTOUR_mass_1720_jes_00_sub_2.eps} \includegraphics[width=0.45\textwidth]{plots/h2dpassCONTOUR_mass_1720_jes_00_sub_3.eps}
Figure 7: Full 2d density estimates for the combined background for Lepton+Jets 1-tag events (left) and 2-tag events (right).
\includegraphics[width=0.45\textwidth]{plots/h2dpassCONTOUR_bkgd_jes_00_sub_0.eps} \includegraphics[width=0.45\textwidth]{plots/h2dpassCONTOUR_bkgd_jes_00_sub_1.eps}
Figure 8: Full 2d density estimates for the combined background for Dilepton untagged events (left) and tagged events (right).
\includegraphics[width=0.45\textwidth]{plots/h2dpassCONTOUR_bkgd_jes_00_sub_2.eps} \includegraphics[width=0.45\textwidth]{plots/h2dpassCONTOUR_bkgd_jes_00_sub_3.eps}

Hyunsu Lee 2009-02-06