Search for New Physics Using Quaero: A General Interface to DØ Event Data

We describe quaero, a method that (i) enables the automatic optimization of searches for physics beyond the standard model, and (ii) provides a mechanism for making high energy collider data generally available. We apply quaero to searches for standard model WW, ZZ, and tt̄ production, and to searches for these objects produced through a new heavy resonance. Through this interface, we make three data sets collected by the DØ experiment at √s = 1.8 TeV publicly available.

It is generally recognized that the standard model, a successful description of the fundamental particles and their interactions, must be incomplete. Models that extend the standard model often predict rich phenomenology at the scale of a few hundred GeV, an energy regime accessible to the Fermilab Tevatron. Due in part to the complexity of the apparatus required to test models at such large energies, experimental responses to these ideas have not kept pace. Any technique that reduces the time required to test a particular candidate theory would allow more such theories to be tested, reducing the possibility that the data contain overlooked evidence for new physics.
Once data are collected and the backgrounds have been understood, the testing of any specific model in principle follows a well-defined procedure. In practice, this process has been far from automatic. Even when the basic selection criteria and background estimates are taken from a previous analysis, the reinterpretation of the data in the context of a new model often requires a substantial length of time.
Ideally, the data should be "published" in such a way that others in the community can easily use those data to test a variety of models. The publishing of experimental distributions in journals allows this to occur at some level, but an effective publishing of a multidimensional data set has, to our knowledge, not yet been accomplished by a large particle physics experiment. The problem appears to be that such data are context-specific, requiring detailed knowledge of the complexities of the apparatus. This knowledge must somehow be incorporated either into the data or into whatever tool the non-expert would use to analyze those data.
Many data samples and backgrounds have been defined in the context of sleuth [1], a quasi-model-independent search strategy for new high-p_T physics that has been applied to a number of exclusive final states [2,3] in the data collected by the DØ detector [4] during 1992-1996 in Run I of the Fermilab Tevatron. In this Letter we describe a tool (quaero) that automatically optimizes an analysis for a particular signature, using these samples and standard model backgrounds. sleuth and quaero are complementary approaches to searches for new phenomena, enabling analyses that are both general (sleuth) and focused (quaero). We demonstrate the use of quaero in eleven separate searches: standard model WW and ZZ production; standard model tt̄ production with leptonic and semileptonic decays; resonant WW, ZZ, WZ, and tt̄ production; associated Higgs boson production; and pair production of first generation scalar leptoquarks. The data described here are accessible through quaero on the World Wide Web [5], for general use by the particle physics community.
The signals predicted by most theories of physics beyond the standard model involve an increased number of predicted events in some region of an appropriate variable space. In this case the optimization of the analysis can be understood as the selection of the region in this variable space that minimizes σ^95%, the expected 95% confidence level (CL) upper limit on the cross section of the signal in question, assuming the data contain no signal. The optimization algorithm consists of a few simple steps: (i) Kernel density estimation [6] is used to estimate the probability distributions p(x|s) and p(x|b) for the signal and background samples in a low-dimensional variable space V, where x ∈ V. The signal sample is contained in a Monte Carlo file provided as input to quaero. The background sample is constructed from all known standard model and instrumental sources.
(ii) A discriminant function D(x) = p(x|s) / [p(x|s) + p(x|b)] is defined from these densities. The semi-positive-definiteness of p(x|s) and p(x|b) restricts D(x) to the interval [0, 1] for all x. (iii) The sensitivity S of a particular threshold D_cut on the discriminant function is defined as the reciprocal of σ^95%. D_cut is chosen to maximize S. (iv) The region of variable space having D(x) > D_cut is used to determine the actual 95% CL cross section upper limit σ^95% [8].
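The steps above can be sketched in a few lines of Python. This is a minimal one-dimensional illustration, not the quaero implementation: it assumes Gaussian kernels with a fixed, hand-picked bandwidth, assumes the discriminant takes the form D(x) = p(x|s) / [p(x|s) + p(x|b)], and replaces the expected limit σ^95% with a crude proxy in which the expected upper limit on signal events is approximated as 3 + 1.64 √b. The function names, the toy samples, and the proxy are all hypothetical.

```python
import math
import random

def gaussian_kde(sample, bandwidth):
    """Return a 1-D density estimate built from Gaussian kernels (fixed bandwidth assumed)."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))
    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample)
    return density

def make_discriminant(p_s, p_b):
    """Assumed form D(x) = p(x|s) / (p(x|s) + p(x|b)); in [0, 1] since both densities are >= 0."""
    def d(x):
        s, b = p_s(x), p_b(x)
        return s / (s + b) if s + b > 0.0 else 0.0
    return d

def choose_cut(d_sig, d_bkg, b_total, cuts):
    """Scan thresholds; keep the one maximizing a proxy sensitivity (not the true 1/sigma^95%)."""
    best_cut, best_sens = None, -1.0
    for cut in cuts:
        eff = sum(v > cut for v in d_sig) / len(d_sig)        # signal efficiency
        b = b_total * sum(v > cut for v in d_bkg) / len(d_bkg)  # expected background
        n95_proxy = 3.0 + 1.64 * math.sqrt(b)  # crude stand-in for the expected event limit
        sens = eff / n95_proxy
        if sens > best_sens:
            best_cut, best_sens = cut, sens
    return best_cut, best_sens

# Toy samples (hypothetical): signal peaks at x = 3, background at x = 0.
rng = random.Random(1)
signal = [rng.gauss(3.0, 0.5) for _ in range(200)]
background = [rng.gauss(0.0, 1.0) for _ in range(200)]

D = make_discriminant(gaussian_kde(signal, 0.3), gaussian_kde(background, 0.3))
d_sig = [D(x) for x in signal]
d_bkg = [D(x) for x in background]
cuts = [i / 20 for i in range(20)]
d_cut, sensitivity = choose_cut(d_sig, d_bkg, b_total=50.0, cuts=cuts)
```

In this sketch the selected region is simply the set of x with D(x) > d_cut; a realistic version would use a multivariate kernel and a data-driven bandwidth.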
When provided with a signal model and a choice of variables V, quaero uses this algorithm and DØ Run I data to compute an upper limit on the cross section of the signal. Instructions for use are available from the quaero web site.
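To make the final step concrete, the sketch below turns an observed count into a 95% CL cross section limit for a single counting experiment. It assumes a Bayesian limit with a flat prior on the signal and a precisely known background, which is one common convention and not necessarily the procedure of Ref. [8]; systematic uncertainties are ignored. The function names and the numbers in the usage line are hypothetical.

```python
import math

def poisson_cdf(n, mu):
    """P(N <= n) for a Poisson distribution with mean mu."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total

def signal_upper_limit(n_obs, b, cl=0.95):
    """Flat-prior Bayesian upper limit on the number of signal events (assumed convention).

    Solves poisson_cdf(n_obs, s + b) = (1 - cl) * poisson_cdf(n_obs, b) for s by bisection.
    """
    target = (1.0 - cl) * poisson_cdf(n_obs, b)
    lo, hi = 0.0, 10.0
    while poisson_cdf(n_obs, hi + b) > target:  # widen the bracket until it contains the root
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n_obs, mid + b) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def cross_section_limit(n_obs, b, efficiency, luminosity):
    """Convert the event limit to a cross section limit: sigma = s_up / (efficiency * luminosity)."""
    return signal_upper_limit(n_obs, b) / (efficiency * luminosity)

# Hypothetical inputs: 3 events observed, 2.5 expected, 20% efficiency, 100 pb^-1.
limit_pb = cross_section_limit(n_obs=3, b=2.5, efficiency=0.2, luminosity=100.0)
```

With zero observed events this convention gives an event limit of about 3.0 regardless of the background, which is why very tight selections stop improving the limit once the expected background is well below one event.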
Table I shows the data available within quaero, and Table II summarizes the backgrounds. These data and their backgrounds are described in more detail in Ref. [3]. The final states are inclusive, with many events containing one or more additional jets. Kolmogorov-Smirnov tests have been used to demonstrate agreement between data and the expected backgrounds in many distributions. The "identification" efficiency (ε_ID) is the fraction of events whose true final state objects satisfy the cuts shown that also satisfy these cuts after reconstruction. Because electrons are more accurately measured and more efficiently identified than muons in the DØ detector, the corresponding muon channels μ E̸_T 2j and μμ 2j have been excluded from these data.
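As an illustration of the kind of Kolmogorov-Smirnov comparison mentioned above, the following sketch computes the two-sample statistic and an asymptotic p-value for two unweighted samples. It is a simplified stand-in: the actual comparison involves weighted background estimates, and the small-sample correction factor used here is a standard textbook approximation, not something specified in this Letter. Function names are hypothetical.

```python
import math
from bisect import bisect_right

def ks_statistic(a, b):
    """Largest distance between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:
        fa = bisect_right(a, x) / len(a)  # fraction of a with value <= x
        fb = bisect_right(b, x) / len(b)  # fraction of b with value <= x
        d = max(d, abs(fa - fb))
    return d

def ks_pvalue(d, n, m):
    """Asymptotic Kolmogorov p-value for statistic d with sample sizes n and m."""
    n_eff = math.sqrt(n * m / (n + m))
    lam = (n_eff + 0.12 + 0.11 / n_eff) * d  # textbook small-sample correction (assumption)
    if lam < 0.2:
        return 1.0  # the series below converges slowly here; the true value is ~1
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam) for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * total))
```

A large p-value indicates that the two distributions are compatible; a value near zero flags a distribution in which data and background disagree.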

TABLE II.
Standard model backgrounds (often produced with accompanying jets) to the final states considered. VV denotes WW, WZ, and ZZ; "data" indicates backgrounds from jets misidentified as electrons, estimated using data. Monte Carlo programs (isajet [9], pythia [10], herwig [11], and vecbos [12]) are used to estimate several sources of background.

TABLE III.
Limits on cross section × branching fraction for the processes discussed in the text. All final states are inclusive in the number of additional jets. The fraction of the signal sample satisfying quaero's selection criteria is denoted ε_sig; b is the number of expected background events satisfying these criteria; and N_data is the number of events in the data satisfying these criteria. The subscripts on h, W′, Z′, and LQ denote assumed masses, in units of GeV.

To check standard model results, we remove WW and ZZ production from the background estimate and search (i) for standard model WW production in the space defined by the transverse momentum of the electron (p_T^e) and missing transverse energy (E̸_T) in the final state eμ E̸_T, and (ii) for standard model ZZ production in the space defined by the invariant mass of the two electrons (m_ee) and two jets (m_jj) in the final state ee 2j. Removing tt̄ production from the background estimate, we search for this process (iii) in the final state e E̸_T 4j using the two variables laboratory aplanarity (A) and p_T^j, and (iv) in the final state eμ E̸_T 2j, using the two variables p_T^e and p_T^j, assuming a top quark mass of 175 GeV.
Including all standard model processes in the background estimate, we look for evidence of new heavy resonances. We search (v) for resonant WW production in the final state e E̸_T 2j, using the single variable m_eνjj after constraining m_eν and m_jj to M_W, and (vi) for resonant ZZ production in the final state ee 2j, using the variable m_eejj after constraining m_jj to M_Z. In both cases we remove events that cannot be so constrained. To obtain a specific signal prediction, we assume that the resonance behaves like a standard model Higgs boson in its couplings to the W and Z bosons. Constraining m_eν to M_W and m_jj to M_Z, we use the quality of the fit and m_eνjj to search (vii) for a massive W′ boson in the extended gauge model of Ref. [13]. Using m_eν4j after constraining m_eν to M_W, we search (viii) for a massive narrow Z′ resonance with Z-like couplings decaying to tt̄.
Non-resonant new phenomena are also considered. The variables m_jj and either m_T^eν or m_ee are used to search for a light Higgs boson produced (ix) in association with a W boson, and (x) in association with a Z boson. Finally, we search (xi) for first generation scalar leptoquarks with mass 225 GeV in the final state ee 2j using m_ee and S_T, the summed scalar transverse momentum of all electrons and jets in the event.
The numerical results of these searches are listed in Table III. Figures 1 and 2 present plots of the signal density, background density, and selected region in the variables considered.
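For reference, two of the kinematic variables named in these searches, the invariant mass of a set of objects (as in m_ee or m_jj) and the summed scalar transverse momentum S_T, can be computed from reconstructed four-momenta as sketched below. The (E, p_x, p_y, p_z) tuple layout in GeV and the function names are assumptions made for illustration; the mass-constrained fits described in the text are not reproduced here.

```python
import math

def invariant_mass(*four_vectors):
    """Invariant mass of the summed four-momenta, each given as (E, px, py, pz) in GeV."""
    e = sum(v[0] for v in four_vectors)
    px = sum(v[1] for v in four_vectors)
    py = sum(v[2] for v in four_vectors)
    pz = sum(v[3] for v in four_vectors)
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative values from rounding

def transverse_momentum(v):
    """Magnitude of the momentum component transverse to the beam (z) axis."""
    return math.hypot(v[1], v[2])

def summed_scalar_pt(objects):
    """S_T: scalar sum of the transverse momenta of the given electrons and jets."""
    return sum(transverse_momentum(v) for v in objects)

# Two back-to-back massless electrons of 45.6 GeV each (hypothetical event).
e1 = (45.6, 45.6, 0.0, 0.0)
e2 = (45.6, -45.6, 0.0, 0.0)
m_ee = invariant_mass(e1, e2)
s_t = summed_scalar_pt([e1, e2])
```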
We note slight indications of excess in the searches for tt̄ → e E̸_T 4j and tt̄ → eμ E̸_T 2j (corresponding to cross section × branching fractions of σ × B = 0.39 (+0.21/−0.19) pb and 0.14 (+0.15/−0.08) pb) that are consistent with our measured tt̄ production cross section of 5.5 ± 1.8 pb [14] and known W boson branching fractions. Observing no compelling excess in any of these processes, we determine limits on σ × B at the 95% CL. As expected, we find these data insensitive to standard model ZZ production (with predicted σ × B ≈ 0.05 pb), and to associated Higgs boson production (with predicted σ × B ≲ 0.01 pb). As a check of the method, quaero almost exactly duplicates a previous search for LQLQ → ee 2j [15].
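The consistency statement for the two tt̄ channels can be checked with rough arithmetic. The sketch below assumes that B is simply the product of W boson decay branching fractions for the tt̄ pair (with a factor of 2 for the two charge assignments), ignoring τ feed-down and any acceptance effects; the approximate branching fractions are world-average values supplied here and are not quoted in this Letter. Under those assumptions the predictions come out near 0.8 pb and 0.12 pb, each within about 1.2 standard deviations of the quoted values.

```python
import math

SIGMA_TTBAR = 5.5      # pb, measured cross section quoted in the text
SIGMA_TTBAR_ERR = 1.8  # pb

# Approximate W branching fractions (assumed values, not taken from this Letter).
BR_W_ENU = 0.107
BR_W_MUNU = 0.106
BR_W_HAD = 0.674

def predicted_sigma_b(branching):
    """Predicted sigma x B and its uncertainty from the measured cross section alone."""
    return SIGMA_TTBAR * branching, SIGMA_TTBAR_ERR * branching

def pull(measured, err_up, err_down, predicted, predicted_err):
    """Difference in units of the combined uncertainty, using the error on the relevant side."""
    err = err_up if predicted > measured else err_down
    return (predicted - measured) / math.hypot(err, predicted_err)

# Electron + jets: one W -> e nu, the other W -> hadrons (either W may be the leptonic one).
pred_ejets, err_ejets = predicted_sigma_b(2.0 * BR_W_ENU * BR_W_HAD)
# Electron + muon: one W -> e nu, the other W -> mu nu.
pred_emu, err_emu = predicted_sigma_b(2.0 * BR_W_ENU * BR_W_MUNU)

pull_ejets = pull(0.39, 0.21, 0.19, pred_ejets, err_ejets)
pull_emu = pull(0.14, 0.15, 0.08, pred_emu, err_emu)
```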
quaero is a method both for automatically optimizing searches for new physics and for allowing DØ to make a subset of its data available for general use. In this Letter we have outlined the algorithm used in quaero, and we have described the final states currently available for analysis using this method. quaero's performance on several examples, including both standard model and resonant WW, ZZ, and tt̄ production, has been demonstrated. The limits obtained are comparable to those from previous searches at hadron colliders, and the search for W′ → WZ is the first of its kind. This tool should in-

FIG. 1. The background density (a), signal density (b), and selected region (shaded) (c) determined by quaero for the standard model processes discussed in the text. From top to bottom the signals are: WW → eμ E̸_T, ZZ → ee 2j, tt̄ → e E̸_T 4j, and tt̄ → eμ E̸_T 2j. The dots in the plots in the rightmost column represent events observed in the data.

TABLE I.
A summary of the data available within quaero, including the selection cuts applied and the efficiency of identification requirements. The final states are inclusive, with many events containing one or more additional jets. Reconstructed jets satisfy p_T^j > 15 GeV and |η_det^j| < 2.5, and reconstructed electrons satisfy p_T^e > 15 GeV and (|η_det^e| < 1.1 or 1.5 < |η_det^e| < 2.5), where η_det is the pseudorapidity measured from the center of the detector.