From lys@fnal.gov Fri Jan 5 16:19:11 2001
Date: Tue, 31 Oct 2000 10:52:31 -0600
From: Jeremy Lys
To: frisch@fnal.gov, rlc@fnal.gov, rebcdf@fnal.gov, johny@fnal.gov, velev@fnal.gov
Subject: Berkeley comments on draft PRD "Searches..."

Berkeley Group Comments on the Draft PRD, "Searches for New Physics with a Photon and b-quark Jet at CDF"

We congratulate the authors on bringing this work to draft publication stage. However, we have some reservations about the paper as currently written. Our comments follow.

Jeremy, for Berkeley group.

General Comment.

1. We discussed the "model-independent limits" part of the paper. Some points that arose were as follows:

(a) Should general discussion of model-independent limits, and of how to apply such limits, be in a separate paper, perhaps for publication in NIM? If CDF is making a contribution to the field in this matter, the contribution should be evident, and should not be tied to one particular search. We note, for example, that the title of the paper gives no clue of this discussion. An alert reader has to read down to the last sentence of the Abstract to get a hint that such discussion is present. In a separate NIM paper one could think of presenting extensive information on the efficiencies of "standard" (have to be careful with that word) selections in CDF run 1 analyses. Or, such a paper could describe a "public Monte Carlo" that any interested physicist could make use of.

(b) Is in fact any appreciable contribution to the subject of model-independent limits being made in this paper? A cynical reader might assert: "all CDF has done is give limits in the case of 100% acceptance*efficiency, and state that any modellers must determine the appropriate acceptance*efficiency for their particular models". While there are responses to such a cynic, such cynicism may be encouraged by some of the wording in the draft, for example the words "A new paradigm... for reporting results...".

(c) Some reference must be made to other work concerning model-independence. In particular, there is the D0 work, "Search for New Physics....: A Quasi-Model-Independent Search Strategy for New Physics", hep-ex/0006001 (9 Jun 2000). While "Search Strategy" is not identical to "Search Results", there are overlaps between the D0 work and the present work.

(d) The second para. of Sect. 6 states that "These limits..... do not have an immediate interpretation." We disagree with that statement. Because A*eps cannot exceed 1, a limit on (sigma*BR*A*eps) clearly gives a lower bound on the limit on (sigma*BR) for all models. Any model that predicts a value for (sigma*BR) that is less than this lower bound will not be excluded by this CDF result. There is no need for an advocate of such a model to estimate the model's A*eps value in order to check compatibility.

(e) The inclusion of a "typical" uncertainty on A*eps in the model-independent limits (2nd para of Sect. 6.1) is at best confusing, and should be removed, because
(i) the equality given in each of the two preceding paras., i.e., (sigma*BR*A*eps)_lim = N_lim/L, is destroyed,
(ii) a particular method of combining a statistical uncertainty with a systematic uncertainty is forced upon a potential user,
(iii) it is not clear (will not be clear to many readers) that a "typical" uncertainty on A*eps is always appropriate,
(iv) it is better to tell potential users that a "typical" uncertainty on A*eps is 22%, and let users choose to accept that or to determine their own value, and to combine it as they wish.
Point (ii) also applies to the luminosity uncertainty, but its relative smallness makes the point less important; however, if the luminosity uncertainty is included in the limit it would be better to write the defining equations as (sigma*BR*A*eps)_lim = (N/L)_lim.

(f) The first page of the Appendix is very confusing. First, just what is being advocated here?
Apparently, that limits for specific model(s) should NOT be given - see the "are [no] longer given..." in para. below point 6 page 35. That is contrary to what the present paper does, and we believe many physicists would strongly disagree. That is, many physicists would argue strongly for giving limits on models that are of current interest - that is what many CDF papers do.

Second, all the six points given as "advantages" can be questioned. For example, point 6 appears to advocate that experimentalists ignore all models. Point 5 appears to be estimating roughly equal likelihoods of a discovery from concentrating on a particular model and from looking at variations on a signature, and therefore preferring the latter?

Third, disadvantages to the approach are not mentioned, such as the difficulty of how to optimise a model-independent search, and maybe also the difficulty of ignoring models one knows about when choosing what cut to make in order to give "model-independent" limits.

(g) The second para. of Section 5 is hard to understand. We do not know what a "rather obscure" model is, and suspect many readers will similarly not know. We do not understand what "the odds that any of them is the correct picture of nature are small" is intended to communicate. In the first sentence, in the "choosing models" and "specific models", does "models" refer to supersymmetry models or to models in general? Since the models chosen here are all supersymmetry models, presumably the argument just applies to supersymmetry models, since there are no words on how any or all other models are like supersymmetry in this property. Then, is the argument for signature-based limits just for the case of supersymmetry models?

In the last sentence of the para. there are the words "to demonstrate the effectiveness of signature-based limits". But it is hard to find where such effectiveness is demonstrated.
What follows (Sects 5 and 6) are determinations of the specific model limits and the signature-based limits, but no obvious demonstration of why one is more effective than the other.

Still more on the last sentence of the para.: the sentence clearly implies that in themselves these SUSY limits are not of much interest, so that the pages of Sect 5, including 6 Tables and 4 Figs., are not of much interest. Then why include all that material?

Detailed Comments.

2. First line of Abstract: here and in several places in the text, there is an unusual use of the word "search", which we believe many readers will find disconcerting. Usually we "search *for* something in some place" - as for a needle in a haystack. We would rephrase all such unusual uses of search. In this sentence, maybe "We have searched for evidence of physics beyond the standard model in a sample of p-pbar collision events that produce an energetic photon and an energetic b-quark jet." Then a sentence on CDF and 85 pb^-1.

3. First para of Introduction. The Tevatron Collider is a machine and does not have responsibilities. Perhaps the Director of Fermilab has responsibilities, but we should not discuss that in a PRD.

4. Intro, para 2. Should be "New physics models often involve...".

5. Sect 2 Intro. Is this para. needed at all? The 85 pb^-1 and 1.8 TeV belong in the Introduction section. And the rest of the para. potentially confuses by using such wordings as "*the* electromagnetic cluster", "*the* photon", "'standard'", "standard", "similar ... except..".

6. Sect 2.4. We suspect that some readers will recall that in many CDF top analyses an SVX b-tagged jet Et cut of 15 GeV (raw Et) was used. So maybe a reason for this 30 GeV choice should be given. We would also like to know, since the fake tag rate is low with the 15 GeV cut.

7. Sect 3.3. Most or all of this para. should be omitted. There is no need to semi-explain a background method that is not used. It is impolitic to say "a more sophisticated.."
and imply that the top analysis background estimate was unsophisticated. Besides, there are several top analyses, some currently in draft status, and there have been changes over time.

8. Sect 3.4, 1st para last line. "unmeasured effects" needs some explanation.

9. Sect 3.6. The second para here strongly suggests that the 197 number, rather than whatever produced the 312, should be used for estimating the fake tag background. Is there an argument against that? It should be checked that what is said here (and anywhere else in this paper) does not contradict the top cross section draft PRD - see sect X of the July 20 draft of cdf5375.

10. Sect 4, para 3 and Fig 1 and 2 caption: The word "approximate" in "approximate background prediction" seems strange. We have a *background prediction*. Or perhaps a *background estimate*. Most (possibly all) such predictions/estimates have uncertainties and so are "approximate". Can we just call it a background prediction/estimate and explain the particular approximations that were made?

11. Sect 4 para 4. This whole paragraph should be omitted or rewritten. First, almost any (maybe any) distribution that has a "tail" can be said to have several events in the tail. One takes a distribution where one can go far enough to the right (conventional "European" figure) so that there are no events further to the right, moves to the left until several events are to one's right, decrees that one is "at the start of the tail", and by construction several events are in the tail. Second, where new physics is likely to first appear is model or guess dependent; historically, the omega meson and the J/psi meson did not appear in a tail (unless one plotted 1/abs[m-m0]), for example. Third, whether "a few events at the kinematic limit" warrant much interest depends on how many "a few" is and what the background prediction is.

12. Fig 2, delta-phi plots.
Why does the scale go to 360 rather than 180, and why is there a sharp change at 360 in both plots?

13. Sect 4.1, para 2. Presumably we have consciously/deliberately chosen to accept two-track tags, so we cannot use a two-track tag to imply that a tag is not a b. The Pt of 2 and 60 GeV looks like a "discovered-after-the-fact" additional cut idea, so the discussion "is allowed", but it may be better if the "unlikely" assertion could be quantified.

14. Sect 4.1 para 4. Does "these" mean "these two" or "these six"? Also, there is an extra "not".

15. Sect 4.2. On reasons for looking at invariant masses, we should at least be aware that D0, in their Quasi-Model-Independent Search Strategy paper, argue (in their Appendix A) that invariant masses are "remarkably ineffective" in a general search.

16. Sect 4.2 para 3: Text says the small box is as close to the five events as possible, but that is not the box that appears in Fig. 3. That is, the small box in the Fig could be appreciably smaller and still enclose the five events.

17. Sect 4.2 and Fig. 5. Text says the min. prob. occurs for a cut at 350 GeV, but Fig shows 400 GeV. Also in the upper fig the data curve should be purely horizontal and vertical lines, but is not. Also, the data curve is .01 at around 450 GeV, while the extreme event in Table 5 is at 467 GeV. Also, it would be good to assure readers (and us) that in the pseudo-expt. procedure the statistical and systematic uncertainties on the background prediction were taken into account.

18. Sect 4.3. Some explanation must be given of why the expected numbers in Table 8 are sometimes negative. And the vertical axis in Fig 6 needs to be adjusted (i.e., go below zero) so that the predictions can be clearly seen.

19. Sect 6 intro. Text says "We make an important distinction between the acceptance ....... and the efficiency." But we see no explanation of why the distinction is important, nor of what the distinction is.
As just one example, does the tagged jet Et cut belong to acceptance or efficiency, and why does it matter? Also, epsilon is the probability of *not* losing events...