Tuesday, January 31, 2017 ... Français/Deutsch/Español/Česky/Japanese/Related posts from blogosphere

Anomaly! by Tommaso Dorigo, a review

A guest blog: a review by Tristan du Pree of CMS at CERN

Anomaly! Collider Physics and the Quest for New Phenomena at Fermilab
World Scientific Publishing

Book by Tommaso Dorigo, 2016

To the outside world, the Italian particle physicist Tommaso Dorigo is best known for his blogs, where he is not afraid to express his personal thoughts about particle physics research. His new book, ‘Anomaly!’, describes this research from the inside – it covers the parts of the field that one normally does not hear about: the history, the sociology, and everything that usually happens behind the scenes.


In the world of Big Science, nowadays performed at CERN in collaborations of thousands of particle physicists, press releases have become as common as reconstruction algorithms. In those releases, of course, any personal opinion is carefully hidden and controversial statements are avoided at all costs. Dorigo’s new book, about the research at the American Tevatron collider at the end of the 20th century, is his latest piece that goes against this trend.

Within the CMS Collaboration at CERN, where the INFN physicist performs his research nowadays, Dorigo is mostly known as a statistics expert. Professionally, I have interacted with Dorigo as a reviewer of some of my own CMS searches for physics beyond the Standard Model, where I encountered him as a thorough, sometimes pedantic, but usually very efficient reviewer. Unsurprisingly, his role in this book is also that of a reviewer – a young researcher with the important task of validating the claims of one of the CDF collaborators.

But this book is not about Tommaso himself. Like the extensive lists of authors of experimental particle physics papers, the main character in this book is the CDF Collaboration.


The CDF experiment was located at one of the interaction points of the Tevatron collider at Fermilab, near Chicago. Starting from a collaboration of the order of a hundred people in the mid eighties, the book illustrates how the CDF collaboration successfully constructed a silicon tracker, advanced the identification of heavy-flavor quarks, and discovered the top quark. In many senses, they did pioneering work for the research that my thousands of colleagues and I are currently doing at the LHC at CERN.

The discovery at the Tevatron of the top quark – the sixth, and surprisingly heavy, quark – was one of the major achievements of CDF, confirming the Standard Model of particle physics. While reading about the 1990s research in the US, one easily sees the similarity between the top quark discovery and the later discovery of the Higgs boson at CERN. Some people were quickly convinced the effect was real, whereas others needed more evidence. In the end, the first group turned out to be right, but the latter group wasn’t wrong either.

The internal reviews of such anomalous events – events that could possibly be the first signs of a new discovery – led to lively internal discussions. Those are the main subject of this book.


Whereas confirming the Standard Model in further detail is a great achievement, truly finding something beyond it would be the ultimate scientific jackpot. And apparently, people inside CDF were convinced that some unexpected events were the first hints of such new physics. In ‘Anomaly!’, several such cases pass by: anomalous Higgs-like events, events with “superjets”, possible hints of bottom squarks, dimuon bumps, etcetera... Even though we now know these claims were all premature, the discussions are very entertaining and useful to read.

Do we really understand all detector effects? Does it harm our image if we later have to retract a claim? Will the media misrepresent our claims, which could possibly negatively influence our funding? Should experimentalists just provide data, or should we go one step further and provide interpretations? Will people misinterpret our interpretation as a claim for discovery? Is it unscientific to keep a result for ourselves? These are just some of the questions being discussed.

All of these are valid questions, leading to a multidimensional discussion with opposing conclusions. Agreeing on how (if at all) to present the results is the topic of various conversations in this book. These time-consuming, but very important, discussions are the reason why experimental particle physicists spend so many days, evenings, and weekends in meetings, phone calls, and video conferences.

3 sigma

In the end, collider experimentalists are a funny bunch. As our collaborations have become so large, we will never be able to win a Nobel Prize for our individual work. Also, constructing a gigantic detector requires close collaboration with various colleagues. But still, curiosity and pride push us to be the first to have the histogram with new events on our own computer.

And whenever a 3-sigma’ish excess appears, some people will claim it might be something not understood, whereas others are convinced that it must be a statistical fluctuation. And one person (usually the first investigator) will claim with certainty that it is the first sign of new physics. Optimists and pessimists are everywhere, and most of the time both sides do have valid arguments.
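The arithmetic behind those recurring 3-sigma’ish excesses is easy to sketch. The following Python snippet is my own illustration (the bin counts are invented, not anything from the book): a 3 sigma upward fluctuation of a Gaussian background has a local probability of only about 0.1%, yet becomes quite likely once a collaboration looks in hundreds of places at once – the so-called look-elsewhere effect.

```python
from math import erfc, sqrt

# One-sided p-value of a 3 sigma upward fluctuation of a Gaussian background.
p_local = 0.5 * erfc(3 / sqrt(2))
print(f"local p-value of a 3 sigma excess: {p_local:.5f}")  # ~0.00135

# If an experiment inspects many independent mass bins / channels,
# the chance that *some* bin fluctuates up to 3 sigma grows quickly.
for n_bins in (1, 10, 100, 1000):
    p_global = 1 - (1 - p_local) ** n_bins
    print(f"{n_bins:5d} bins -> P(at least one 3 sigma excess) = {p_global:.3f}")
```

With a hundred independent places to look, the probability of at least one purely statistical 3 sigma excess is already above ten percent – which is why optimists and pessimists both have valid arguments.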

With so many clever (and stubborn) people, simultaneously collaborating to develop a good experiment while competing to make a first discovery, converging on the final result, and on its presentation, can be tough or even impossible. Why, in some cases, this can take so much time – that’s the one thing that Dorigo makes totally clear in this book.


What makes the book so interesting and special is the way the discussions are described.

The descriptions of the dialogues and the researchers are often recognizable and hilarious. For example, we hear Michelangelo Mangano (now a CERN theorist) in a restaurant asking “Are you crazy?” (while almost spilling his beer on his freshly ironed Yves Saint Laurent shirt), and we see Claudio Campagnari (currently CMS SUSY convener) saying in a meeting: “As long as I’m alive, this stuff will not make it out of here!”. In a collaboration with hundreds (and nowadays thousands) of such people with strong opinions, finding agreement will often take time.

And this brings me to the audience that could most benefit from reading this book: theorists! How often does one not hear, at conferences, on social media, and on blogs, theorists wondering why the experiments don’t publish faster. Just to quote a very recent post by Luboš at The Reference Frame:
“Don't you find it a bit surprising that now, in early 2017, we are still getting preprints based on the evaluation of the 2012 LHC data? The year was called "now" some five years ago. Are they hiding something? And when they complete an analysis like that, why don't they directly publish the same analysis including all the 2015+2016 = 4031 data as well? Surely the analysis of the same channel applied to the newer data is basically the same work.”
Well, this book gives the answer! Sometimes the data is not yet sufficiently understood to be made public. Often, further studies and cross-checks are needed, for example if the research reveals differences between different periods of data taking. And, finally, despite the large amount of automation, all this work is in the end done by humans, and the collaboration has to be convinced of the soundness of the obtained result.

Read this book once, and you’ll never again be surprised why some experimental particle physics publications appear, from the outside, to take so long.


It’s a great read, but if I had to mention one criticism of the book, it’s that parts of it are written for the non-expert reader. I personally think that the book is mostly readable and interesting for professionals (in experiment and theory), and possibly also for an additional audience with a relatively large knowledge of the field. But, in the end, that is a niche market (which also explains the price of the book).

Dorigo has tried to make the book readable for a wider audience, and he is certainly great at explaining fundamental concepts in a clear way, using original analogies. His two-page-long slow-motion description of a ppbar collision generating an “anomalous” dielectron-diphoton-MET event was actually quite amazing! But I doubt it’s sufficient to clarify our research to the non-expert reader. To me, those descriptions could’ve been dropped, or maybe formatted differently, to distinguish them from the main story line.


But all in all, it’s certainly a unique inside view in the history of particle physics.

This book really makes you experience the research atmosphere during the nineties at Fermilab. The situation around the Fermilab top discovery reminds one of the CERN Higgs discovery. The Tevatron dilepton-diphoton discussion reminds one of the excited discussion around the LHC diphoton events in 2015. And the discussions about internal results that have never been published… well, you know…

After finishing this description focusing on CDF Run-1, I was immediately curious to read more about the stories of CDF Run-2. Not just the measurements, but the everyday research inside a large experimental collaboration.


This week, I will attend CMS Week for the last time. After more than six years in this collaboration, I am moving to the ATLAS experiment. I will take all the stories of eventful internal meetings with me, as I also did in 2010, after more than four years in the LHCb collaboration. I won’t write a book about it now – that would certainly be too early. Maybe someone else will do it, perhaps twenty years from now.

If you want to know how experimental particle physics is really done behind the scenes, read this book! In all those years, the technology has advanced, as has our knowledge of particle physics, but the sociology is still very much the same.

Tristan du Pree

PS: Thanks to Orange, my provider in France, for recently cutting my telephone, TV, and internet (for no reason). It allowed me to quickly read this paper book without modern disturbances.

Friday, August 26, 2016

The delirium over beryllium

Flip Tanedo's guest blog is an example with a highly structured layout.

Guest blog by Prof Flip Tanedo, a co-author of the first highlighted paper

Article: Particle Physics Models for the \(17\,\mathrm{MeV}\) Anomaly in Beryllium Nuclear Decays
Authors: J.L. Feng, B. Fornal, I. Galon, S. Gardner, J. Smolinsky, T. M. P. Tait, F. Tanedo
Reference: arXiv:1608.03591 (Submitted to Phys. Rev. D)

Also featuring the results from:
  • Gulyás et al., “A pair spectrometer for measuring multipolarities of energetic nuclear transitions” (description of detector; 1504.00489; NIM)
  • Krasznahorkay et al., “Observation of Anomalous Internal Pair Creation in 8Be: A Possible Indication of a Light, Neutral Boson” (experimental result; 1504.01527; PRL version; note PRL version differs from arXiv)
  • Feng et al., “Protophobic Fifth-Force Interpretation of the Observed Anomaly in 8Be Nuclear Transitions” (phenomenology; 1604.07411; PRL)
Recently there has been some press (see links below) regarding early hints of a new particle observed in a nuclear physics experiment. In this bite, we’ll summarize the result that has raised the eyebrows of some physicists, and the hackles of others.

Friday, July 22, 2016

Resolving confusion over the term "nonlocality"

Few words stir up a hornet’s nest on TRF as reliably as “nonlocality,” so it is with some trepidation that I offer a few thoughts on the subject. To some extent, I think terminology has sown confusion. Different people use the word “nonlocality” in different ways, and if we can agree on our terms, much of the dispute will evaporate. But not all of it.

Luboš defines nonlocality as a violation of relativistic causality—an ability to signal at spacelike separation. (See here and here.) In our present understanding of physics, that is impossible, although, as Luboš has also explained, we may legitimately look for such nonlocal effects in black-hole physics and string-theoretic dualities. At times, physicists and popularizers of physics have been guilty of leaving the impression that quantum correlations are nonlocal in this sense, and Luboš is right to take them to task (for instance, here, here, and here). But we need to distinguish incautious presentation from bad physics. No one really thinks signaling can occur across spacelike separation. Not even advocates of Bohmian mechanics do (although they do think there is a type of Lorentz-violating nonsignaling causation). When Einstein spoke of spukhafte Fernwirkung, he was putting it forward not as an actual physical process, but as the scandalous consequence of claims that Bohr and others had been making.

When I and many other people use the term “nonlocality,” we have in mind a broader definition that includes the nonseparability of entangled states, which violate what Einstein called the Trennungsprinzip. We likewise speak of nonlocality in manifestly gauge-invariant formulations of Yang-Mills theory and in string theory. If we are attuned to these varied usages, I think we will find broad agreement on the physics.

Where we do disagree is on the significance of quantum correlations, so let us focus our energies on that. Does the disagreement reflect an outright error, or simply a question on which we can agree to disagree? Regarding a pair of electrons in the singlet state, Luboš draws a comparison to Bertlmann’s socks:

When you measure the colors of his socks, there is nothing mysterious about the anticorrelation. It was guaranteed by design because the same brain decided about the two socks in the morning.
Remember, when John Bell introduced Bertlmann’s socks, his point was that entangled particles do not behave like socks. Yes, the electrons are correlated by virtue of their joint preparation within the past light cone. But the sock metaphor is realist. We can assign definite colors to the socks, so we have a straightforward explanation of how they develop, maintain, and exhibit their correlation. We know from Bell-inequality violations that we cannot do anything analogous with the electrons. One might still argue that this is not mysterious and that quantum mechanics merely enlarges our conception of the types of objects that populate our world—objects that need not follow classical logic. But you cannot appeal to people’s intuition about matching socks.
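The difference between socks and singlet electrons can be made quantitative with the CHSH form of Bell’s inequality. The Python sketch below is my own illustration (the measurement angles are the standard CHSH choices, and the "sock-like" model is a deliberately simple deterministic hidden-variable toy): the quantum singlet correlation \(E(a,b)=-\cos(a-b)\) reaches \(|S|=2\sqrt{2}\), while any local model of this kind stays at or below 2.

```python
import math
import random

# Standard CHSH measurement angles.
a, a2 = 0.0, math.pi / 2               # Alice's two settings
b, b2 = math.pi / 4, 3 * math.pi / 4   # Bob's two settings

def chsh(E):
    """CHSH combination S = E(a,b) - E(a,b') + E(a',b) + E(a',b')."""
    return E(a, b) - E(a, b2) + E(a2, b) + E(a2, b2)

# Quantum mechanics: correlation of spin measurements on the singlet state.
def E_quantum(x, y):
    return -math.cos(x - y)

rng = random.Random(1)  # fixed seed for reproducibility

# A "sock-like" local hidden-variable model: each pair carries a random
# hidden angle lam fixed at the source, and each side deterministically
# outputs the sign of the projection onto its own measurement setting.
def E_socks(x, y, n=200_000):
    total = 0
    for _ in range(n):
        lam = rng.uniform(0, 2 * math.pi)
        A = 1 if math.cos(lam - x) >= 0 else -1
        B = -1 if math.cos(lam - y) >= 0 else 1  # anticorrelated, like the socks
        total += A * B
    return total / n

print(f"quantum |S|    = {abs(chsh(E_quantum)):.3f}")  # 2*sqrt(2) ~ 2.828
print(f"sock model |S| = {abs(chsh(E_socks)):.3f}")    # 2, up to Monte Carlo noise
```

The sock model reproduces perfect anticorrelation at equal settings, just like the real socks, yet it cannot reproduce the measured value \(2\sqrt{2}\) – which is the precise sense in which entangled electrons do not behave like socks.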

The situation is nonlocal inasmuch as we are speaking of joint properties of spatiotemporally separated objects. We know the singlet electrons have a total spin of zero, but we cannot ascribe either particle a definite spin in advance of measurement. If you object to the word “nonlocal” in this context, fine. I would also be happy with “nonseparable,” “delocalized,” or “global.”

The real issue is how to explain the phenomenology of correlations. I know that Luboš does not think highly of the EPR paper (neither did Einstein), but it is the usual starting point for this discussion, so let us focus on the most solid part of that paper: the dilemma it presents us with. Given certain assumptions, to explain correlated outcomes, we must either assign some preexisting values to the properties of entangled particles or we must imagine action at a distance. Einstein recoiled from the latter possibility—he was committed to (classical) field theory. The former possibility was later ruled out by Bell experiments. So, presumably we need to question one of the assumptions going into the argument, and that’s where we go down the interpretive rabbit hole of superdeterminism, Everettian views, and so forth, none of which is entirely satisfactory, either. We seem to be stuck. I personally look to emergent-spacetime models for some help, since those models suggest that the degrees of freedom we see arrayed in space are not fundamental.

Luboš has written:
An action at a distance would be needed in a classical model that would try to mimic the predictions of quantum mechanics.
True, but quantum mechanics does not provide a physical picture, either. It tells us that objects should be correlated, but does not tell us how, and it creates a serious tension between correlations and indeterminism. If you disagree, fine. Tell me what is going on. Give me a step-by-step explanation of how particle spins show the observed correlations even though neither has a determinate value in advance of being measured.

Friday, September 4, 2015

On what grounds can we trust a theory in the absence of empirical confirmation?

Thanks, Lubos, for your kind invitation to write a guest blog on non-empirical theory confirmation (which I have recently presented in the book String Theory and the Scientific Method, CUP 2013). As a long-time follower of this blog (who, I may dare to add, fervently disagrees with much of its non-physical content) I am very glad to do so. Fundamental physics today faces an unusual situation. Virtually all fundamental theories that have been devised during the last four decades still lack conclusive empirical confirmation. While the details with respect to empirical support and prospects for conclusive empirical testing vary from case to case, this general situation is exemplified by theories like low energy supersymmetry, grand unified theories, cosmic inflation or string theory. The fact that physics is characterised by decades of continuous work on empirically unconfirmed theories turns the non-empirical assessment of those theories' chances of being viable into an important element of the scientific process. Despite the scarcity of empirical support, many physicists working on the above-mentioned theories have developed substantial trust in their theories' viability based on an overall assessment of the physical context and the theories' qualities. In particular in the cases of string theory and cosmic inflation, that trust has been harshly criticised by others as unjustified and incompatible with basic principles of scientific reasoning. The critics argue that empirical confirmation is the only possible scientific basis for holding a theory viable. Relying on other considerations in their eyes amounts to abandoning necessary scientific restraint and leads to a relapse into pre-scientific modes of reasoning. The crux of the argument is the concept of scientific theory confirmation. 
In my recent book, I argue that the critics' wholesale condemnation of non-empirical reasons for having trust in a theory's viability is caused by their adherence to an oversimplified and inadequate understanding of scientific confirmation that, unfortunately, has dominated the philosophy of science in the 20th century. I propose an understanding of theory confirmation that is broader than what is commonly understood as empirical confirmation and therefore does provide a basis for acknowledging the general soundness of the lines of reasoning that lead physicists towards trusting their theories on a non-empirical basis. (To be sure, this does not imply that non-empirical arguments for a theory's viability are always convincing. It just means that arguments of that kind can be convincing in principle.) The canonical understanding of scientific confirmation, which can for example be found in classical hypothetico-deductivism and in most presentations of Bayesian confirmation theory, is the following. A theory can be confirmed only by empirical data that is predicted by that theory. Agreement between prediction and data amounts to confirmation, disagreement amounts to disconfirmation. Now, we may take this as a primitive definition of confirmation, in which case it makes no sense to question it. The problem is, however, that most scientists have a wider intuitive understanding of confirmation. According to that wider understanding, scientific confirmation is the scientifically supported generation of trust in a theory's viability based on observation. That wider intuitive understanding is implicitly adopted by the critics of non-empirical reasons for trusting a theory: they assume that, in the absence of confirmation, there can't be any good reasons for trusting a theory. My point of departure thus is the understanding that our concept of confirmation should cover all observation-based scientifically supported reasons for believing in a theory's viability. 
If so, however, it is by no means clear that the observations involved must always be predicted by the given theory. In fact, we can find cases of scientific reasoning where that is quite obviously not the case. A striking example is the Higgs hypothesis. High energy physicists were highly confident that some kind of Higgs particle (whatever the details) existed long before a Higgs particle was discovered in 2012. Their confidence was based on an assessment of the scientific context and their overall experience with predictive success in physics. Even before 2012, it would have been difficult to deny the scientific legitimacy of that assessment. It would be even more implausible today, after that assessment has been vindicated by the LHC. Clearly, there is an important difference between the status of the Higgs hypothesis before and after its successful empirical testing in 2011/2012. That difference can be upheld by distinguishing two different kinds of confirmation. Empirical confirmation is based on the empirical testing of the theory's predictions. Non-empirical confirmation is based on observations that are not of the kind that can be predicted by the theory. Conclusive empirical confirmation is more powerful than non-empirical confirmation. But non-empirical confirmation can also provide strong reasons for believing in a theory's viability. What are the observations that generate non-empirical confirmation in physics today? Three main kinds of argument, each relying on one type of observation, can be found when looking at the research process. They don't work in isolation but only acquire strength in conjunction. The first and most straightforward argument is the no alternatives argument (NAA). Physicists infer the probable viability of a theory that solves a specific physical problem from the observation that, despite extensive efforts to find one, no alternative theory that solves this problem has been found.
Trust in the Higgs hypothesis before empirical confirmation was crucially based on the fact that the Higgs hypothesis was the only known convincing theory for generating the observed mass spectrum within the empirically well-confirmed framework of gauge field theory. In the same vein, trust in string theory is based on the understanding that there is no other known approach to a coherent theory of all fundamental interactions. On its own, NAA has one obvious weakness: scientists might just not have been clever enough to find the alternatives that do exist. In order to take NAA seriously, one therefore needs a method of assessing whether or not scientists in the field typically are capable of finding the viable theories. The argument of meta-inductive inference from predictive success in the research field (MIA) does the job. Scientists observe that, in similar contexts, theories without known alternatives turned out to be successful once empirically tested. Both pre-discovery trust in the Higgs hypothesis and today's trust in string theory gain strength from the observation that standard model predictions were highly successful empirically. One important caveat remains, however. It often seems questionable whether previous examples of predictive success and the new theory under scrutiny are sufficiently similar to justify the use of MIA. In some cases, for example in the Higgs case, the concept under scrutiny and previous examples of predictive success are so closely related to each other that the deployment of MIA looks fairly unproblematic. NAA and MIA in conjunction thus were sufficient in the Higgs case for generating a high degree of trust in the theory. In other cases, like string theory, the comparison with earlier cases of predictive success is more contentious. In many respects, string theory does constitute a direct continuation of the high energy physics research program that was so successful in the case of the standard model.
But its evolution differs substantially from that of its predecessors. The far higher level of complexity of the mathematical problems involved makes it far more difficult to approach a complete theory. This higher level of complexity may throw the justification for a deployment of MIA into doubt. In cases like that, it is crucial to have a third argument indicating that, despite the high complexity of the theory in question, scientists are still capable of finding their way through the 'conceptual labyrinth' they face. The argument that can be used to that end is the argument from unexpected explanatory interconnections (UEA). The observation on which UEA is based is the following: scientists develop a theory in order to solve a specific problem. Later it turns out that this theory also solves other conceptual problems it was not developed to solve. This is taken as an indicator of the theory's viability. UEA is the theory-based 'cousin' of the well-known data-based argument of novel predictive success. The latter relies on the observation that a theory that was developed based on a given set of empirical data correctly predicts new data that had not entered the process of theory construction. UEA now replaces novel empirical prediction by unexpected explanation. The best-known example of UEA in the case of string theory is its role in understanding black hole entropy. String theory was proposed as a universal theory of all interactions because it was understood to imply the existence of a graviton and suspected to be capable of avoiding the problem of non-renormalizability faced by field-theoretical approaches to quantum gravity. Closer investigations of the theory's structure later revealed that - at least in special cases - it allowed for the exact derivation of the known macro-physical black hole entropy law from micro-physical stringy structure. Considerations about black hole entropy, however, had not entered the construction of string theory.
Beyond this particular example, string theory offers a considerable number of other unexpected explanatory interconnections that allow for the deployment of UEA. String theorists, asked to what extent NAA, MIA and UEA influence their trust in their theory, often answer that, while NAA and MIA do provide necessary stepping stones for trust in the theory, it is UEA that is the crucial reason for trusting it. NAA, MIA and UEA are applicable in a wide range of cases in physics. Their deployment is by no means confined to empirically unconfirmed theories. NAA and MIA play a very important role in understanding the significance of empirical theory confirmation. The continuity between non-empirical confirmation and the assessment of empirical confirmation based on NAA and MIA can be seen nicely by having another look at the example of the Higgs discovery. As argued above, the Higgs hypothesis was believed before 2012 based on NAA and MIA. But only the empirical discovery of a Higgs particle implied that calculations of the background for future scattering experiments had to contain Higgs contributions. That implication is based on the fact that the discovery of a particle in a specific experimental context is taken to be a reliable basis for having trust in that particle's further empirical implications. But why is that so? It relies on the very same types of consideration that had generated trust in the Higgs hypothesis already prior to discovery. First, no alternative theoretical conception is available that can account for the measured signal without having those further empirical implications (NAA). And second, in comparable cases of particle discoveries in the past, trust in the particle's further empirical implications was mostly vindicated by further experimentation (MIA). Non-empirical confirmation in this light is no new mode of reasoning in physics.
Very similar lines of reasoning have played a perfectly respectable role in the assessment of the conceptual significance of empirical confirmation throughout the 20th century. What has changed is the perceived power of non-empirical considerations already prior to empirical testing of the theory. While NAA, MIA and UEA are firmly rooted in the history of physical reasoning, string theory does add one entirely new kind of argument that can contribute to the strength of non-empirical confirmation. String theory contains a final theory claim, i.e. the claim that, if string theory is a viable theory at its own characteristic scale, it won't ever have to be superseded by an empirically distinguishable new theory. The future of theoretical conceptualization in that case would be devoted to fully developing the theory from the basic posits that are already known rather than to searching for new basic posits that are empirically more adequate. Though the character of string theory's final theory claim is not easy to understand from a philosophical perspective, it may shed new light on the epistemic status of string theory. My self-imposed space constraints for this blog don't allow a more far-reaching discussion of this question. I just want to point out that final theory claims seem to constitute a very interesting new twist to the question of non-empirical confirmation. For the remainder of this text, though, I want to confine my analysis to the role of the three 'classical' arguments NAA, MIA and UEA. Let us first address an important point. In order to be convincing, theory confirmation must not be a one-way street. If a certain type of observation has the potential to confirm a theory, it must also have the potential to dis-confirm it. Empirical confirmation trivially fulfils that condition.
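The requirement that confirmation not be a one-way street has a simple formal counterpart in Bayesian confirmation theory (a standard probabilistic identity, not anything specific to the arguments discussed here). By the law of total probability,

```latex
P(H) \;=\; P(H \mid E)\,P(E) \;+\; P(H \mid \neg E)\,\bigl(1 - P(E)\bigr),
```

so \(P(H)\) is a weighted average of \(P(H\mid E)\) and \(P(H\mid \neg E)\). Whenever \(0 < P(E) < 1\), the confirmation condition \(P(H\mid E) > P(H)\) therefore forces \(P(H\mid \neg E) < P(H)\): any type of observation that can confirm a hypothesis can, by failing to occur, also dis-confirm it.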
For any set of empirical data that agrees with a theory's prediction, there are many others that disagree with it and that, if actually measured, would therefore dis-confirm the theory. NAA, MIA and UEA fulfil that condition as well. The observation that no alternatives to a theory have been found could, in principle, always be overridden by the future observation that scientists do find alternatives later on. That later observation would reduce the trust in the initial theory and therefore amount to that theory's non-empirical dis-confirmation. Likewise, an observed trend of predictive success in a research field could later be overridden by a series of instances where a theory that was well trusted on non-empirical grounds turned out to disagree with empirical tests once they became possible. In the case of UEA, the observation that no unexpected explanatory interconnections show up would be taken to speak against a theory's viability. And once unexpected interconnections have been found, it could still happen that a more careful conceptual analysis reveals them to be the result of elementary structural characteristics of theory building in the given context that are not confined to the specific theory in question. To conclude, the three non-empirical arguments are not structurally biased in favour of confirmation but may just as well provide indications against a theory's viability. Next, I briefly want to touch on a more philosophical level of analysis. Empirical confirmation is based on a prediction of the confirmed theory that agrees with an observation. In the case of non-empirical confirmation, to the contrary, the confirming observations are not predicted by the theory. How can one understand the mechanism that makes those observations confirm the theory? It turns out that an element of successful prediction is involved in non-empirical confirmation as well. That element, however, is placed at the meta-level of understanding the context of theory building.
More specifically, the claim that is tested at the meta-level is a claim on the spectrum of possible scientific alternatives to the known theory. The observations on which NAA, MIA and UEA rely are all predicted by the meta-level hypothesis that the spectrum of possible scientific alternatives to the theory in question is very limited. Inversely, NAA, MIA and UEA indicate that the spectrum of unconceived alternatives to the known theory is strongly limited. Let us, for the sake of simplicity, just consider the most extreme form of this meta-level hypothesis, namely the hypothesis that, in all research contexts in the scientific field, there are no possible alternatives to the viable theory at all. This radical hypothesis predicts 1: that no alternatives are found because there aren't any (NAA); 2: that a theory that has been developed will always be predictively successful, given that there exists a predictively successful theory at all (MIA); and 3: that a theory that has been developed for one specific reason will explain all other things as well, because there are no alternatives that could (UEA). In order to use such claims of "limitations to scientific underdetermination", as I call them, as a serious foundation for non-empirical confirmation, one would have to say more on the criteria for accepting a theory as scientific, on how to individuate theories, etc. In this presentation, it shall suffice to give the general flavour of the line of reasoning: non-empirical confirmation is a natural extension of empirical confirmation that places the agreement between observation and the prediction of a hypothesis at the meta-level of theory dynamics. A clearer understanding of the mechanism of non-empirical confirmation and its close relation to empirical confirmation can be acquired based on a formalization of the arguments within the framework of Bayesian confirmation theory. 
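To make this meta-level mechanism concrete, here is a toy Bayesian sketch of the NAA. All probability values below are illustrative assumptions of mine, not numbers from the literature; the point is only that the observation F = "despite extensive search, no alternative has been found" confirms the hypothesis T = "the theory is viable" whenever F is more likely under T than under not-T.

```python
# Toy Bayesian sketch of the No Alternatives Argument (NAA).
# All numbers are illustrative assumptions, chosen only to show the mechanism.

prior_T = 0.3          # prior credence that the theory is viable
p_F_given_T = 0.8      # if the theory is viable, alternatives are plausibly scarce
p_F_given_notT = 0.3   # a non-viable theory should eventually meet rivals

# total probability of observing F ("no alternatives found")
p_F = p_F_given_T * prior_T + p_F_given_notT * (1 - prior_T)

# Bayes' theorem: posterior credence in T after observing F
posterior_T = p_F_given_T * prior_T / p_F

assert posterior_T > prior_T   # F non-empirically confirms T
print(round(posterior_T, 3))   # → 0.533
```

The inequality `p_F_given_T > p_F_given_notT` is exactly the condition under which the update raises the credence in T; with equal likelihoods, observing F would confirm nothing.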
An analysis of this kind has been carried out for NAA (which is the simplest case) in "The No Alternatives Argument", Dawid, Hartmann and Sprenger, BJPS 66(1), 213-34, 2015. A number of worries have been raised with respect to the concept of non-empirical confirmation. Let me, in the last part of this text, address a few of them. It has been argued (e.g. by Sabine Hossenfelder) that arguments of non-empirical confirmation are sociological and therefore don't constitute proper scientific reasoning. This claim may be read in two different ways. In its radical form, it would amount to the statement that there is no factual scientific basis to non-empirical confirmation at all. Confidence in a theory on that account would be driven entirely by sociological mechanisms in the physics community and only be camouflaged ex post by fake rational reasoning. The present text in its entirety aims at demonstrating that such an understanding of non-empirical confirmation is highly inadequate. A more moderate reading of the sociology claim is the following: there may be a factual core to non-empirical confirmation, but it is so difficult to disentangle from sociological factors that science is better off when fully discarding non-empirical confirmation. I concede that the role of sociology is trickier with respect to deployments of non-empirical confirmation than in cases where conclusive empirical confirmation is to be had. But I would argue that it must always be the aim of good science to extract all factual information that is provided by an investigation. If the existence of a sociological element in scientific analysis justified discarding that analysis, quite a lot of empirical data analysis would have to be discarded as well. To give a recent example: the year 2015 witnessed considerable differences of opinion among physicists interpreting the empirical data collected by BICEP2, which might be understood to a certain degree in terms of the sociological factors involved. 
No one would have suggested discarding the debate on the interpretation of that data as scientifically worthless on those grounds. I suggest that the very same point of view should also be taken with respect to non-empirical confirmation. It has also been suggested (e.g. by George Ellis and Joseph Silk) that non-empirical confirmation may lead to a disregard for empirical data and therefore to the abandonment of a pivotal principle of scientific reasoning. This worry is based on a misreading of non-empirical confirmation. Accepting the importance of non-empirical confirmation by no means devalues the search for empirical confirmation. To the contrary, empirical confirmation is crucial for the functioning of non-empirical confirmation in at least two ways. Firstly, non-empirical confirmation indicates the viability of a theory. But a theory's viability is defined as: the theory's empirical predictions would turn out correct if they could be specified and empirically tested. Conclusive empirical confirmation therefore remains the ultimate judge of a theory's viability - and thus the ultimate goal of science. Secondly, MIA, one cornerstone of non-empirical confirmation, relies on empirical confirmation elsewhere in the research field. Therefore, if empirical confirmation ended in the entire research field, that would remove the possibility of testing non-empirical confirmation strategies and, in the long run, make them dysfunctional. Non-empirical confirmation itself thus highlights the importance of testing theories empirically whenever possible. It implies, though, that not having empirical confirmation must not be equated with knowing nothing about the theory's chances of being viable. Finally, it has been argued (e.g. by Lee Smolin) that non-empirical confirmation further strengthens the dominant research program and therefore in an unhealthy way contributes to thinning out the search for alternative perspectives that may turn out productive later on. 
To a given extent, that is correct. Taking non-empirical confirmation seriously does support the focus on those research strategies that generate theories with a considerable degree of non-empirical confirmation. I would argue, however, that this is, by and large, a positive effect. It is an important element of successful science to understand which approaches merit further investigation and which don't. But a very important second point must be added. The way non-empirical confirmation has been presented, it is a technique for understanding the spectrum of possible alternatives to the theory one knows. One crucial test in that respect is to check whether serious and extensive search for alternatives has produced any alternative theories (this is the basis for NAA). Therefore, the search for alternatives is a crucial element of non-empirical confirmation. Non-empirical confirmation does exactly the opposite of denying the value of the search for alternatives: it adds a new reason why it is important. The search for alternatives remains fruitful even if the alternative strands of research fail to produce coherent theories. In that case the understanding that none of the alternative approaches has succeeded makes an important contribution to the non-empirical confirmation of the theory that is available. So what is the status of non-empirical confirmation? The arguments I present support the general relevance of non-empirical confirmation in physics. In the absence of empirical confirmation, non-empirical confirmation can provide a strong case for taking a theory to be viable. This by no means renders empirical confirmation obsolete. Conclusive empirical testing will always trump non-empirical confirmation and therefore remains the ultimate goal in science. Arguments of non-empirical confirmation can in some cases lead to a nearly consensual assessment in the physics community (see the trust in the Higgs particle before 2012). 
In other cases, they can also be more controversial. As in all contexts of scientific inquiry, the arguments presented can be balanced and well founded but may also be exaggerated and unsound in some cases. The actual strength of each specific case of non-empirical confirmation has to be assessed and discussed by the physicists concerned with the given theory, based on a careful scientific analysis of the particular case. Criticism of non-empirical confirmation at that level constitutes an integral part of theory assessment. I suggest, however, that a wholesale verdict that non-empirical theory confirmation is unscientific and should not be taken seriously does not do justice to the actual research process in physics and obscures the actual state of contemporary physics by disregarding one important element of scientific analysis.

Richard Dawid
Center for Mathematical Philosophy, LMU Munich

Wednesday, July 8, 2015

This is the second part of a guest blog on double field theory (thanks again to Lubos for giving me this opportunity). I will introduce the extension of double field theory to `exceptional field theory', a subject developed in collaboration with Henning Samtleben, and explain how it allowed us to resolve open problems in basic Kaluza-Klein theory that could not be solved by standard techniques.

Exceptional field theory is the completion (in a sense I shall make precise below) of a research program that goes back to the early 80s and attempts to understand why maximal supergravity knows about exceptional groups, such as \(E_6\), \(E_7\) and \(E_8\). These groups emerge, miraculously, as global (continuous) symmetries upon compactifying maximal supergravity on tori. This looks like a miracle because exceptional groups had no role to play in the original construction of, say, 11-dimensional supergravity. Although these symmetries are now understood as the supergravity manifestations of the (discrete) U-dualities of string-/M-theory, they remained deeply mysterious from the point of view of conventional geometry. Exceptional field theory (EFT) makes these symmetries manifest prior to dimensional reduction, in the same sense that double field theory (DFT) makes the T-duality group \(O(d,d)\) manifest.

It should be emphasized that U-dualities are tied to toroidal backgrounds. Similarly, the continuous exceptional symmetries of supergravity only emerge for compactification on tori. For compactifications on curved backgrounds, such as spheres, there is no exceptional symmetry. Understandably, this fact led various researchers to conclude that DFT and EFT are consistently defined only on toroidal backgrounds. This is not correct, however, despite the continuing claims by some people. In its most conservative interpretation, EFT (like DFT) is simply a reformulation of (maximal) supergravity that makes its duality properties manifest. In particular, it is background-independent, and so one may describe any desired compactification. The real question therefore is whether this formalism is useful for compactifications other than toroidal ones.

Since on curved backgrounds none of the exceptional symmetries are preserved, it is reasonable to expect that EFT is more awkward than useful for such compactifications. Remarkably, it turns out that, on the contrary, EFT allows one to describe such compactifications very efficiently as generalized Scherk-Schwarz compactifications, governed by `twist matrices' taking values in the duality group. For instance, the compactification of type IIB on \(AdS_5\times S^5\) can be described by a matrix valued in \(E_6\) (the U-duality group in \(D=5\)). Moreover, in this formulation one can solve problems that could not be addressed otherwise. Thus, although physically there is no \(E_6\) symmetry in any conventional sense, this group somehow still governs these spaces `behind the scenes'.

Before I explain this and EFT in more detail, let me first discuss what exactly the issues in conventional Kaluza-Klein theory are that we resolved recently. They are related to the `consistency of Kaluza-Klein truncations', a subject that unfortunately is not appreciated even by many experts. In Kaluza-Klein theory we start with some higher-dimensional theory and decompose fields and coordinates in a way that is appropriate for a lower-dimensional theory. For instance, the metric \(G\) is written as \[ G = \begin{pmatrix} g_{\mu\nu}+A_{\mu}{}^m A_{\nu}{}^n g_{mn} & A_{\mu}{}^{k}g_{kn}\\[0.5ex] A_{\nu}{}^{k} g_{km} & g_{mn} \end{pmatrix} \] with `external' indices \(\mu,\nu\) and `internal' indices \(m,n\). The resulting fields will eventually be interpreted as lower-dimensional metric, vectors and scalars. The question is how the fields depend on the internal coordinates \(y^m\), in other words, what the `Kaluza-Klein ansatz' is.
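As a sanity check on this block decomposition, a short sympy computation confirms that the block form of \(G\) is just the `completed square' of the familiar Kaluza-Klein line element. I use one external and one internal direction with generic symbols; this is a toy illustration of the formula above, not a computation from any specific paper.

```python
import sympy as sp

# one external direction x and one internal direction y; all symbols generic
g_xx, g_yy, A = sp.symbols('g_xx g_yy A')
dx, dy = sp.symbols('dx dy')

# Kaluza-Klein block form of the higher-dimensional metric G
G = sp.Matrix([[g_xx + A**2 * g_yy, A * g_yy],
               [A * g_yy,           g_yy]])

# line element ds^2 = (dx dy) G (dx dy)^T
v = sp.Matrix([dx, dy])
ds2 = sp.expand((v.T * G * v)[0])

# the same line element in completed-square form:
# ds^2 = g_xx dx^2 + g_yy (dy + A dx)^2
ds2_kk = sp.expand(g_xx * dx**2 + g_yy * (dy + A * dx)**2)
assert sp.simplify(ds2 - ds2_kk) == 0
```

The completed-square form makes it obvious why the shift \(y^m \rightarrow y^m + \lambda^m(x)\) acts on \(A_{\mu}{}^m\) like a gauge transformation.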

In one extreme we may declare the fields to be completely independent of the internal coordinates, which means we are effectively truncating to the massless modes of a torus compactification. In another extreme, we may keep the full \(y\)-dependence but expand the fields in a complete basis of harmonics (such as Fourier modes on a torus or spherical harmonics on a sphere), which means keeping the full tower of Kaluza-Klein modes. In both cases there is no danger of inconsistency. The interesting question is whether there is anything in between, i.e., a non-trivial truncation that is nevertheless consistent.

The standard lore is that for a compactification on a manifold with metric \(g_{mn}\) and isometry group \(G\) the appropriate ansatz is written in terms of the Killing vectors \(K_{\alpha}=K_{\alpha}{}^{m}\partial_m\) as \[ G_{\mu n}(x,y) = A_{\mu}{}^{\alpha}(x) K_{\alpha}{}^m(y) g_{mn}(y) \] and similarly for the other metric components. Working out the infinitesimal general coordinate transformations of the metric (using standard formulas from differential geometry that give these transformations in terms of Lie derivatives), one finds that the \(A_{\mu}{}^{\alpha}\) transform like Yang-Mills gauge fields with the gauge group given by the isometry group \(G\) of the internal manifold. Concretely, in order to verify this we have to use that the Killing vectors satisfy the following algebra \[ \big[K_{\alpha}, K_{\beta}\big] \ = \ f_{\alpha\beta}{}^{\gamma} K_{\gamma} \] where \(f_{\alpha\beta}{}^{\gamma}\) are the structure constants of \(G\). Moreover, if we take the gravity action and integrate over the internal manifold, we obtain a lower-dimensional Einstein-Yang-Mills theory. This is the famous `Kaluza-Klein miracle' in which an internal gauge symmetry (the Yang-Mills gauge group) is `geometrized' in terms of a higher-dimensional manifold and its spacetime (diffeomorphism) symmetry.
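A minimal illustration of the Killing vector algebra, assuming nothing beyond the round two-sphere: its three rotational Killing vectors close into \(so(3)\) with structure constants \(f_{\alpha\beta}{}^{\gamma}=\epsilon_{\alpha\beta\gamma}\), which sympy can verify directly from the Lie bracket formula.

```python
import sympy as sp

theta, phi = sp.symbols('theta phi')
coords = (theta, phi)
cot = sp.cos(theta) / sp.sin(theta)

# the three rotational Killing vectors of the round two-sphere,
# with components (K^theta, K^phi)
K1 = (sp.Integer(0), sp.Integer(1))          # rotation about the z-axis
K2 = (sp.sin(phi), cot * sp.cos(phi))
K3 = (sp.cos(phi), -cot * sp.sin(phi))

def bracket(V, W):
    """Lie bracket [V, W]^m = V^n d_n W^m - W^n d_n V^m."""
    return tuple(sum(V[n] * sp.diff(W[m], coords[n])
                     - W[n] * sp.diff(V[m], coords[n]) for n in range(2))
                 for m in range(2))

def equal(V, W):
    return all(sp.simplify(v - w) == 0 for v, w in zip(V, W))

# so(3) algebra: [K1,K2] = K3, [K2,K3] = K1, [K3,K1] = K2
assert equal(bracket(K1, K2), K3)
assert equal(bracket(K2, K3), K1)
assert equal(bracket(K3, K1), K2)
```

In the Kaluza-Klein ansatz above, it is precisely this closure of the brackets that turns the \(A_{\mu}{}^{\alpha}\) into \(SO(3)\) Yang-Mills fields when the internal manifold is \(S^2\).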

The trouble with this ansatz is that in general it is inconsistent! A Kaluza-Klein truncation is consistent if and only if any solution of the (truncated) lower-dimensional theory can be embedded into a solution of the (original) higher-dimensional theory. One way to see that Kaluza-Klein truncations on curved manifolds in general are inconsistent is to insert the Kaluza-Klein ansatz discussed above into the Einstein equations and to observe that the \(y\)-dependence does not factor out consistently: one may obtain equations in which the left-hand side depends only on \(x\), but the right-hand side depends on \(x\) and \(y\). (A nice discussion of this can be found in a classic 1984 PLB paper by Duff, Nilsson, Pope and Warner.) Consequently, a solution of the Einstein-Maxwell equations, following from the action obtained by simply integrating over the internal manifold, in general does not give rise to a solution of the original field equations. Consistency only holds for very specific theories and very special internal geometries and requires a suitable non-linear extension of the Kaluza-Klein ansatz.

The known consistent truncations include 11-dimensional supergravity on \(AdS_4\times S^7\), as established by de Wit and Nicolai in 1986, and \(AdS_7\times S^4\), shown to be consistent by Nastase, Vaman and van Nieuwenhuizen in hep-th/9911238. In contrast, until our recent paper, for the celebrated case of type IIB on \(AdS_5\times S^5\) there was no proof of consistency, except for certain truncations and sub-sectors. At this point let me stress that the size of the sphere is of the same order as the AdS scale. There is no low-energy sense that would justify keeping only the `massless' modes, and hence it is especially important to actually prove that the truncation is consistent.

What has been known since 1984 is 1) the complete Kaluza-Klein spectrum of type IIB on \(AdS_5\times S^5\), determined by Gunaydin and Marcus, which requires only the linearized theory, and 2) the complete \(SO(6)\) gauged supergravity in five dimensions, constructed directly in \(D=5\) by Gunaydin, Romans, and Warner, which was believed (and is now proven) to be a consistent truncation of type IIB. What was missing since 1984 was a way to uplift the \(D=5\) gauged supergravity to type IIB. This means that we didn't even know in principle how to obtain the \(D=5\) theory from the type IIB theory in \(D=10\), because we simply didn't have the Kaluza-Klein ansatz that needs to be plugged into the higher-dimensional action and field equations.

After this digression into the consistency issues of Kaluza-Klein theory, let me return to exceptional field theory (EFT) and explain how the above problems are resolved in a strikingly simple way. As in DFT, EFT makes the duality groups manifest by introducing extended/generalized spacetimes and organizing the fields into covariant tensors under these groups. In contrast to DFT, the coordinates are not simply doubled (or otherwise multiplied). Rather, the coordinates are split into `external' and `internal' coordinates as in Kaluza-Klein theory, but without any truncation, and the internal coordinates are extended to live in the fundamental representation of the duality group.

EFT has been constructed for \(E_6\), \(E_7\) and \(E_8\) in the series of papers 1308.1673, 1312.0614, 1312.4542, 1406.3348, but for the present discussion I will focus on the \(E_6\) case. So let's first recall some basic facts about this group, which more precisely is here given by \(E_{6(6)}\). The extra 6 in parentheses means that we are dealing with a non-compact version of \(E_6\), in which the number of non-compact and compact generators differs by 6. \(E_{6(6)}\) has two fundamental representations of dimension 27, denoted by \({\bf 27}\) and \(\bar{\bf 27}\), with corresponding lower and upper indices \(M,N=1,\ldots, 27\). There is no invariant metric to raise and lower indices, and so these two representations are inequivalent. \(E_{6(6)}\) admits two cubic fully symmetric invariant tensors \(d^{MNK}\) and \(d_{MNK}\).

The generalized spacetime of the \(E_{6(6)}\) EFT is given by `external' coordinates \(x^{\mu}\), \(\mu=0,\ldots,4\), and (extended) `internal' coordinates \(Y^M\) in the 27-dimensional fundamental representation. As in DFT, this does not mean that the theory is physically 32-dimensional. Rather, all functions on this extended space are subject to a section constraint, which is similar to the analogous constraint in DFT. In the present case it takes the manifestly \(E_{6(6)}\) covariant form \[ d^{MNK}\partial_N\partial_K A = 0 \qquad d^{MNK}\partial_NA\,\partial_K B = 0 \] with \(A,B\) denoting any fields or gauge parameters. Interestingly, this constraint allows for at least two inequivalent solutions: one preserves \(GL(6)\) and leaves six physical coordinates; the other preserves \(GL(5)\times SL(2)\) and leaves five physical coordinates. The first solution leads to a theory in \(5+6\) dimensions and turns out to be equivalent to 11-dimensional supergravity; the second solution leads to a theory in \(5+5\) dimensions and turns out to be equivalent to type IIB supergravity.

The field content of the theory comprises again a generalized metric, here denoted by \({\cal M}_{MN}\), which takes values in \(E_{6(6)}\) in the fundamental representation. Due to the splitting of coordinates, however, more fields are needed. The bosonic field content is given by \[ g_{\mu\nu}\;, \quad {\cal M}_{MN}\;, \quad {\cal A}_{\mu}{}^{M}\;, \quad {\cal B}_{\mu\nu M}\;. \] Here \(g_{\mu\nu}\) is the external, five-dimensional metric, while \({\cal A}_{\mu}{}^{M}\) and \({\cal B}_{\mu\nu M}\) are higher-form potentials needed for consistency. The fields depend on all \(5+27\) coordinates, subject to the above constraint.

The theory is uniquely determined by its invariance under the bosonic gauge symmetries, including internal and external generalized diffeomorphisms. Again, there is not enough space to explain this properly, but in order to give the reader at least a sense of the underlying extended geometry, let me display the generalized Lie derivative, which satisfies an algebra governed by the analogue of the `C-bracket' in DFT (which we call the `E-bracket'), and which encodes the internal generalized diffeomorphisms. Specifically, with respect to vectors \(V^M\) and \(W^M\) in the fundamental representation it reads \[ \big(\mathbb{L}_{V}W\big)^M \ \equiv \ V^N\partial_NW^M-W^N\partial_N V^M+10\,d^{MNP}\,d_{KLP}\,\partial_NV^K\,W^L \] As for the C-bracket, the first two terms coincide with the Lie bracket between vector fields, but the new term, which explicitly requires the \(E_{6(6)}\) structure, shows that the full symmetry cannot be viewed as conventional diffeomorphisms on an extended space.

There is one more fascinating aspect of the symmetries of EFT that I can't resist mentioning. The vector fields \({\cal A}_{\mu}{}^{M}\) act as Yang-Mills-like gauge potentials for the internal diffeomorphisms. The novelty here is that the underlying algebraic structure is not a Lie algebra, because the E-bracket does not satisfy the Jacobi identity. The failure of the E-bracket to satisfy the Jacobi identity is, however, of a certain exact form. As a consequence, one can construct covariant objects like field strengths by introducing higher-form potentials, in this case the two-forms \({\cal B}_{\mu\nu M}\), and assigning suitable gauge transformations to them. This is referred to as the tensor hierarchy. One can then write a gauge invariant action, which structurally looks like five-dimensional gauged supergravity, except that it encodes, through its non-abelian gauge structure, the full dependence on the internal coordinates, as it should be in order to encode either 11-dimensional or type IIB supergravity. EFT explains the emergence of exceptional symmetries upon reduction, because the formulation is already fully covariant before reduction. (For more details see the recent review 1506.01065 with Henning Samtleben and Arnaud Baguet.)

[At this point let me stress that, as always in science, exceptional field theory did not originate out of thin air, but rather is the culmination of efforts by many researchers starting with the seminal work by Cremmer and Julia. The most important work for the present story is by de Wit and Nicolai in 1986, which made some symmetries, normally only visible upon reduction, manifest in the full \(D=11\) supergravity. It did not, however, make the exceptional symmetries manifest. These were discussed in more recent work dealing with the truncation to the internal sector governed by \({\cal M}_{MN}\). Notable work is due to West, Hillmann, Berman, Perry and many others.]

We are now ready to address the issue of consistent Kaluza-Klein truncations in EFT, following the two papers 1410.8145, 1506.01385. The Kaluza-Klein ansatz takes the form of a generalized Scherk-Schwarz reduction, governed by `twist matrices' \(U\in E_{6(6)}\). For technical reasons that I can't explain here we also need to introduce a scale factor \(\rho(Y)\). The ansatz for the bosonic fields collected above then reads \[ \begin{split} g_{\mu\nu}(x,Y) &= \rho^{-2}(Y)\,{g}_{\mu\nu}(x)\\ {\cal M}_{MN}(x,Y) &= U_{M}{}^{{K}}(Y)\,U_{N}{}^{{L}}(Y)\,M_{{K}{L}}(x) \\ {\cal A}_{\mu}{}^{M}(x,Y) &= \rho^{-1}(Y)\, A_{\mu}{}^{{N}}(x)\,(U^{-1})_{{N}}{}^{M}(Y) \\ {\cal B}_{\mu\nu\,M}(x,Y) &= \rho^{-2}(Y)\, U_M{}^{{N}}(Y)\,B_{\mu\nu\,{N}}(x) \end{split} \] The \(x\)-dependent fields on the right-hand side are the fields of five-dimensional gauged supergravity.

We have to verify that the \(Y\)-dependence factors out consistently both in the action and equations of motion. This is the case provided some consistency conditions are satisfied, which have a very natural geometric interpretation within the extended geometry of EFT. To state these it is convenient to introduce the combination \[ {\cal E}_{{M}}{}^{N} \ \equiv \ \rho^{-1}(U^{-1})_{{M}}{}^{N} \] [Here I am deviating from the notation in the paper in order to simplify the presentation.] The consistency condition can then be written in terms of the \(E_{6(6)}\) generalized Lie derivative discussed above. It takes the form \[ \mathbb{L}_{\,{\cal E}_{{M}}}\,{\cal E}_{{N}} \ = \ -X_{{M}{N}}{}^{{K}}\, {\cal E}_{{K}} \] where the \(X_{{M}{N}}{}^{{K}}\) are the `structure constants' of gauged supergravity that encode the gauge group. This relation is the extended geometry version of the Lie bracket algebra of Killing vector fields given above. Thus, we can view the \({\cal E}_{M}\) as generalized Killing vectors on the extended space of EFT. An important and intriguing difference is that the \(X_{MN}{}^{K}\) in general are not the structure constants of a Lie group. In general they are not even antisymmetric in their lower two indices. They do, however, satisfy a quadratic Jacobi-type identity, leading to a structure that in the mathematics literature is referred to as a `Leibniz algebra'.
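To give a feel for what a Leibniz algebra is, here is a minimal toy example of my own making (not the EFT structure constants): a two-dimensional bracket that is not antisymmetric, so it is not a Lie algebra, yet satisfies the quadratic Leibniz (Loday) identity \([[x,y],z]=[[x,z],y]+[x,[y,z]]\).

```python
import numpy as np
from itertools import product

# structure constants X[m, n, k] for the bracket [e_m, e_n] = X[m, n, k] e_k;
# toy non-Lie example: [e2, e2] = e1, all other brackets zero (0-based indices)
dim = 2
X = np.zeros((dim, dim, dim))
X[1, 1, 0] = 1.0

def bracket(v, w):
    return np.einsum('m,n,mnk->k', v, w, X)

e = np.eye(dim)

# not antisymmetric: [e2, e2] = e1 != 0, so this is not a Lie bracket
assert np.allclose(bracket(e[1], e[1]), e[0])

# Leibniz (Loday) identity holds on all basis triples:
# [[x,y],z] = [[x,z],y] + [x,[y,z]]
for i, j, k in product(range(dim), repeat=3):
    x, y, z = e[i], e[j], e[k]
    lhs = bracket(bracket(x, y), z)
    rhs = bracket(bracket(x, z), y) + bracket(x, bracket(y, z))
    assert np.allclose(lhs, rhs)
```

The identity here holds trivially because every bracket lands in the kernel direction \(e_1\); in the EFT case the quadratic constraint on \(X_{MN}{}^{K}\) plays exactly this structural role.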

Due to these novel algebraic structures, there is no general procedure of how to solve the above consistency equations, i.e., of how, for given structure constants \(X_{MN}{}^{K}\), to find a twist matrix \(U\) satisfying the above equation. I think it is a mathematically fascinating open problem to understand systematically how to integrate the above Leibniz algebra to the corresponding `Leibniz group' (whatever that could mean). What we did in the paper instead is to solve the equations `by hand' for a few interesting cases, in particular spheres and their non-compact counterparts (inhomogeneous hyperboloidal spaces \(H^{p,q}\)).

These twist matrices take a surprisingly simple universal form, which then allows us to cover in one stroke the sphere compactifications of \(D=11\) supergravity (\(AdS_4\times S^7\) and \(AdS_7\times S^4\)) and of type IIB (\(AdS_5\times S^5\)). This, finally, settles the issue of consistency of the corresponding Kaluza-Klein truncation and also gives the explicit uplift formulas: they are given by the above generalized Scherk-Schwarz ansatz. Thus, for any solution of five-dimensional gauged supergravity, given by \(g_{\mu\nu}(x)\), \(M_{MN}(x)\), etc., we can directly read off the corresponding solution of EFT and, via the embedding discussed above, of type IIB. In particular, every stationary point and every holographic RG flow of the scalar potential directly lifts to a solution of type IIB.

This concludes my summary of exceptional field theory and its applications to Kaluza-Klein compactifications. Far from being impossible to describe in exceptional field theory, spheres and other curved spaces actually fit intriguingly well into these extended geometries, which allows us to resolve open problems. This is one example of a phenomenon I have seen again and again in the last couple of years: the application of this geometry to areas where duality symmetries are not present in any standard sense still leads to quite dramatic simplifications. I believe this points to a deeper significance of these extended geometries for our understanding of string theory more generally, but of course it remains to be seen which radically new geometry (if one may still call it that) we will eventually have to get used to.

Thursday, July 2, 2015

First of all I would like to thank Luboš for giving me the opportunity to write a guest blog on double field theory. This is a subject that in some sense is rather old, almost as old as string theory, but that has seen a remarkable revival over the last five years or so and that, as a consequence, has reached a level of maturity comparable to that of many other sub-disciplines of string theory. In spite of this, double field theory is viewed by some as a somewhat esoteric theory in which unphysical higher-dimensional spacetimes are introduced in an ad-hoc manner for no reasons other than purely aesthetic ones and that, ultimately, does not give any results that might not as well be obtained with good old-fashioned supergravity. It is the purpose of this blog post to introduce double field theory (DFT) and to explain that, on the contrary, even in its most conservative form it allows us to attack problems several decades old that were beyond reach until recently.

Concretely, in the first part I will review work done in collaboration with Warren Siegel and Barton Zwiebach on a formulation of DFT that includes higher-derivative \(\alpha'\) corrections and that describes certain subsectors of string theory in a way that is exact to all orders in \(\alpha'\). This casts the old problem of determining and understanding these corrections into a radically new form that, we believe, provides a significant step forward in understanding the interplay of two of the main players of string theory: \(\alpha'\) and duality symmetries. In the second part, I will explain how an extension of DFT to exceptional groups, now commonly referred to as exceptional field theory, allows us to settle open problems in Kaluza-Klein truncations of supergravity that, although of conventional nature, were impossible to solve with standard techniques.

So let's start by explaining what DFT is. It is a framework for the spacetime (target space) description of string theory that makes the T-duality properties manifest. T-duality implies that string theory on the torus \(T^d\) with background metric and B-field looks the same for any background obtained by an \(O(d,d;\ZZ)\) transformation. The discrete nature of the group is due to the torus identifications (periodicity conditions) which the transformations need to respect. In the supergravity approximation to string theory the dimensional reduction on the torus truncates the massive Kaluza-Klein modes (as it should be, since in the effective supergravity we have also truncated massive string modes), and so all memory of the torus is gone. Consequently, the duality symmetry visible in supergravity is actually the continuous \(O(d,d;\RR)\), and in the following I will exclusively consider this group. In contrast to what people sometimes suspect, the continuous symmetry is preserved by \(\alpha'\) corrections, which will be important below.

This implies that gravity in \(D=10\) or \(D=26\) dimensions, extended by the bosonic and fermionic fields predicted by string theory, yields an enhanced global symmetry upon reduction that cannot be understood in terms of the symmetries present in the standard formulation of gravity. Consider the minimal field content of all closed string theories, the metric \(g_{ij}\), the antisymmetric b-field \(b_{ij}\) and a scalar (dilaton) \(\phi\), with effective two-derivative action \[ S = \int d^Dx\,\sqrt{g}e^{-2\phi}\big[R+4(\partial\phi)^2-\tfrac{1}{12}H^2\big] \] This action is invariant under standard diffeomorphisms (general coordinate transformations) \(x^i\rightarrow x^i-\xi^i(x)\) and b-field gauge transformations \(b\rightarrow b+{\rm d}\tilde{\xi}\), with vector gauge parameter \(\xi^i\) and one-form parameter \(\tilde{\xi}_i\). The diffeomorphism symmetry explains the emergence of the \(GL(d,\RR)\) subgroup of \(O(d,d)\), representing global reparametrizations of the torus, while the b-field gauge symmetry permits a residual global shift symmetry \(b\rightarrow b+c\), with antisymmetric constant c. The full symmetry is larger, however: the complete \(O(d,d)\), as predicted by string theory. String theory is trying to teach us a lesson that we fail to understand by writing the spacetime actions as above. DFT is the framework that for the first time made the full symmetries manifest before dimensional reduction.

The idea behind DFT is to introduce a doubled space with coordinates \(X^M=(\tilde{x}_i, x^i)\), \(M=1,…,2D\), on which \(O(D,D)\) acts naturally in the fundamental representation. (Note that here, at least to begin with, we have doubled the number of all spacetime coordinates.) This idea is actually well motivated by string theory on toroidal backgrounds, where these coordinates are dual both to momentum and winding modes. In fact, in closed string field theory on such backgrounds, the doubled coordinates are a necessity and not a luxury. [Perhaps it's time now for some references: the idea of doubling coordinates in connection to T-duality is rather old, going back at least to the early 90's and work by Duff, Tseytlin, Kugo, Zwiebach and others, but the most important paper for the present story is hep-th/9305073 by Warren Siegel. The modern revival of these ideas was initiated in a paper by Chris Hull and Barton Zwiebach, 0904.4664, and then continued with myself in 1003.5027, 1006.4823. There is also a close relation to `generalized geometry', which I will comment on below. For more references see for instance the review 1309.2977 with Barton Zwiebach and Dieter Lust.]

In DFT we reorganize the fields into \(O(D,D)\) covariant variables as follows \[ {\cal H}_{MN} = \begin{pmatrix} g^{ij} & -g^{ik}b_{kj}\\[0.5ex] b_{ik}g^{kj} & g_{ij}-b_{ik}g^{kl}b_{lj}\end{pmatrix}\;, \qquad e^{-2d} = \sqrt{g}\,e^{-2\phi} \] where the `generalized metric' \({\cal H}_{MN}\) transforms as a symmetric 2-tensor and \(e^{-2d}\) is taken to be an \(O(D,D)\) singlet. Moreover, \({\cal H}_{MN}\) can be thought of as an \(O(D,D)\) group element in the following way: Defining \[ {\cal H}^{MN} \equiv \eta^{MK}\eta^{NL}{\cal H}_{KL}\;, \qquad \eta_{MN} = \begin{pmatrix} 0 & 1\\[0.5ex] 1 & 0 \end{pmatrix} \] where \(\eta_{MN}\) is the metric left invariant by \(O(D,D)\), it satisfies \[ {\cal H}^{MK}{\cal H}_{KN} = \delta^{M}{}_{N}\;, \qquad \eta^{MN}{\cal H}_{MN} = 0 \] Conversely, the parametrization above is the most general solution of these constraints. Thus, we may forget about \(g\) and \(b\) and simply view \({\cal H}\) as the fundamental gravitational field, which is a constrained field.
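Since the parametrization and the constraints are completely explicit, they are easy to verify numerically. Here is a minimal sketch (my own check, not from any paper) for \(D=2\):

```python
import numpy as np

# Build the generalized metric H_{MN} from a random positive-definite
# metric g and an antisymmetric b-field for D = 2, then verify the two
# O(D,D) constraints quoted above.
rng = np.random.default_rng(0)
D = 2
A = rng.normal(size=(D, D))
g = A @ A.T + D * np.eye(D)                 # positive-definite metric g_{ij}
b = np.array([[0.0, 0.7], [-0.7, 0.0]])     # antisymmetric b-field b_{ij}
ginv = np.linalg.inv(g)

# H_{MN} in the block form given in the text
H = np.block([[ginv, -ginv @ b],
              [b @ ginv, g - b @ ginv @ b]])
# O(D,D) invariant metric eta_{MN}
eta = np.block([[np.zeros((D, D)), np.eye(D)],
                [np.eye(D), np.zeros((D, D))]])

# H^{MK} H_{KN} = delta^M_N  reads  (eta H eta) H = 1 in matrix form
assert np.allclose(eta @ H @ eta @ H, np.eye(2 * D))
# eta^{MN} H_{MN} = tr(eta H) = 0
assert np.isclose(np.trace(eta @ H), 0.0)
print("constraints satisfied")
```

Both constraints hold identically for any \(g\) and \(b\), confirming that \({\cal H}\) built this way is automatically an \(O(D,D)\) element.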

If we forget about \(g\) and \(b\), how do we write an action for \({\cal H}\)? We can write an action in the Einstein-Hilbert-like form \[ S_{\rm DFT} = \int d^{2D}X\,e^{-2d}\,{\cal R}({\cal H},d) \] where the scalar \({\cal R}\), depending both on \({\cal H}\) and \(d\), denotes a generalization of the Ricci scalar in standard differential geometry. But how is it constructed? There is a beautiful story here, closely analogous to conventional Riemannian geometry with its notions of Levi-Civita connections, invariant curvatures, etc., but also with subtle differences. Most importantly, there is a notion of generalized diffeomorphisms, infinitesimally given by generalized Lie derivatives \({\cal L}_{\xi}\) parametrized by an \(O(D,D)\) vector \(\xi^{M}=(\tilde{\xi}_i,\xi^i)\) that unifies the diffeomorphism vector parameter with the one-form gauge parameter. These Lie derivatives form an interesting algebra, \([{\cal L}_{\xi_1},{\cal L}_{\xi_2}]={\cal L}_{[\xi_1,\xi_2]_C}\), defining the `C-bracket' \[ \big[\,\xi_1\,,\;\xi_2\,\big]_{C}^M = \xi_1^N\partial_N\xi_{2}^M - \xi_2^N\partial_N\xi_{1}^M - \tfrac{1}{2}\,\xi_{1N}\partial^M \xi_{2}^N + \tfrac{1}{2}\,\xi_{2N}\partial^M\xi_{1}^N
\] where indices are raised and lowered with the \(O(D,D)\) invariant metric. The first two terms look like the standard Lie bracket between vector fields, but the remaining two terms are new. Incidentally, this shows that these transformations are not diffeomorphisms on the doubled space, for these would close according to the Lie bracket, not the C-bracket. Due to lack of space I cannot review the geometry further, but suffice it to say that the generalized diffeomorphisms uniquely determine the Ricci scalar, and since I haven't introduced connections, etc., let me just give the explicit and manifestly \(O(D,D)\) invariant expression, written in terms of the derivatives \(\partial_M\) dual to the doubled coordinates, \[ \begin{split} {\cal R} \equiv\ & 4\,{\cal H}^{MN}\partial_{M}\partial_{N}d - \partial_{M}\partial_{N}{\cal H}^{MN} \\ & - 4\,{\cal H}^{MN}\partial_{M}d\,\partial_{N}d + 4\,\partial_M {\cal H}^{MN}\,\partial_N d \\ & + \tfrac{1}{8}\,{\cal H}^{MN}\partial_{M}{\cal H}^{KL}\,\partial_{N}{\cal H}_{KL} - \tfrac{1}{2}\,{\cal H}^{MN}\partial_{M}{\cal H}^{KL}\,\partial_{K}{\cal H}_{NL} \end{split} \] With this form of the generalized Ricci scalar, the above DFT action reduces to the standard low-energy action upon truncating the extra coordinates by setting \(\tilde{\partial}=0\).
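Returning briefly to the C-bracket: its behavior once the tilde-derivatives are switched off is easy to check symbolically. The following sympy sketch (my own conventions, for the doubled line \(D=1\) with \(\xi^M=(\tilde{\xi},\xi)\) and all fields depending on \(x\) only) confirms that the vector part then reduces to the ordinary Lie bracket:

```python
import sympy as sp

# C-bracket for D = 1, doubled coordinates (x_tilde, x), with the strong
# constraint solved by d/dx_tilde = 0. Index order: M = 0 is the tilde
# (one-form) component, M = 1 the vector component; eta swaps the two.
x = sp.symbols('x')
v1, v2, l1, l2 = [f(x) for f in sp.symbols('v1 v2 l1 l2', cls=sp.Function)]
dx = lambda f: sp.diff(f, x)
lower = lambda xi: (xi[1], xi[0])            # xi_M = eta_{MN} xi^N

def c_bracket(xi1, xi2):
    out = []
    for M in range(2):
        # transport terms xi_1^N d_N xi_2^M - (1 <-> 2); only d_x survives
        term = xi1[1]*dx(xi2[M]) - xi2[1]*dx(xi1[M])
        # the extra terms involve d^M = eta^{MN} d_N, nonzero only for M = 0
        if M == 0:
            term += -sp.Rational(1, 2)*sum(lower(xi1)[N]*dx(xi2[N]) for N in range(2)) \
                    + sp.Rational(1, 2)*sum(lower(xi2)[N]*dx(xi1[N]) for N in range(2))
        out.append(sp.expand(term))
    return out

one_form_part, vector_part = c_bracket((l1, v1), (l2, v2))
lie_bracket = v1*dx(v2) - v2*dx(v1)
assert sp.simplify(vector_part - lie_bracket) == 0
print("vector part of C-bracket = Lie bracket when tilde-derivatives vanish")
```

The one-form component, by contrast, retains the extra half-terms, reproducing the characteristic structure of the Courant bracket of generalized geometry.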

So far I have remained silent about the nature of the extended coordinates. Surely, we don't mean to imply that the theory is defined in 20 dimensions, right? Indeed, the gauge invariance of the theory actually requires a constraint, the `strong constraint' or `section constraint', \[ \eta^{MN}\partial_M\partial_N = 2\tilde{\partial}^{i}\partial_{i} = 0 \] This means that \(\partial^M\partial_MA=0\) for any field or gauge parameter A, but also \(\partial^M\partial_M(AB)=0\) for any product of fields, which then requires \(\partial^MA\,\partial_MB=0\). One may convince oneself that the only way to satisfy these constraints is to set \(\tilde{\partial}=0\) (or \(\partial_i=0\), or any choice obtained from these by an \(O(D,D)\) transformation). This constraint is therefore an \(O(D,D)\) invariant way of saying that the theory is only defined on any of the half-dimensional subspaces, on each of which it is equivalent to the original spacetime action. (This will change, however, once we go to type II theories or M-theory, in which case different theories emerge on different subspaces, but more about this later.)
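Spelling out the Leibniz argument behind the last step: for fields \(A\) and \(B\) that each satisfy the constraint, \(\partial^M\partial_M A = \partial^M\partial_M B = 0\), \[ 0 = \partial^M\partial_M(AB) = (\partial^M\partial_M A)\,B + 2\,\partial^M\!A\,\partial_M B + A\,\partial^M\partial_M B = 2\,\partial^M\!A\,\partial_M B \] so demanding the constraint on all products is equivalent to the bilinear condition \(\partial^M A\,\partial_M B=0\).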

Is it possible to relax this constraint? Indeed, some of the initial excitement about DFT was due to the prospect of having a framework to describe so-called non-geometric fluxes, which are relevant for gauged supergravities in lower dimensions that apparently cannot be embedded into higher-dimensional supergravity/string theory in any conventional `geometric' way. Those non-geometric compactifications most likely require a genuine dependence on both types of coordinates, and various proposals have been put forward for how to relax the above constraint. (I should also mention that in the full closed string field theory the constraint is relaxed: one only requires the level-matching constraint, which allows for certain dependences on both \(x\) and \(\tilde{x}\).) While these preliminary results are intriguing and, in my opinion, capture part of the truth, I think it is fair to say that we still do not have a sufficiently rigorous framework. I will therefore assume the strong form of the constraint, and so if you don't like this constraint you may simply imagine at any step that \(\partial_M\) is really only a short-hand notation for \[ \partial_M = (0,\partial_i) \] In this way of thinking, the extended coordinates play a purely auxiliary role. They are necessary in order to make certain symmetries manifest and are thus analogous, for instance, to the fermionic coordinates in superspace (although technically they appear to be rather different). We then have a strict reformulation of supergravity; in particular, we are not tied to the torus. I want to emphasize that it is not only possible to use this theory for general curved backgrounds (spheres, for instance), but it is actually highly beneficial to do so, as will be explained in a follow-up post.

[Let me also point to a close connection with a beautiful field in pure mathematics called `generalized geometry', going back to work by Courant, Severa, Weinstein, Hitchin, Gualtieri and others, in which the generalized metric is also a central object. Subsequently, this was picked up by string theorists, suggesting that one should forget about \(g\) and \(b\) and view the generalized metric as the fundamental object. Curiously, however, before the advent of DFT, no physicist seemed to bother (or to be capable) to take the obvious next step and to formulate the spacetime theory in terms of this object, although this is technically straightforward once phrased in the right language. What is the reason for this omission? Of course I can only speculate, but the reason must be that the (auxiliary) additional coordinates, which are absent in generalized geometry, are really needed to get any idea of what kind of terms one could write to construct an action.]

Finally we are now ready to turn to higher-derivative \(\alpha'\) corrections. We constructed a particular subset of these corrections in the paper 1306.2970 with Warren Siegel and Barton Zwiebach by using a certain chiral CFT in the doubled space with a novel propagator and simplified OPEs, earlier introduced by Warren. This is a beautiful story, but slightly too technical to be properly explained here. Therefore, let me give the idea in an alternative but more pedagogical way. First, recall that the generalized metric satisfies a constraint, which we can simply write as \({\cal H}^2=1\) (leaving implicit the metric \(\eta\)). Is there a way to define the theory for an unconstrained metric? The trouble is that when checking gauge invariance of the action we use this constraint, and so simply replacing the constrained \({\cal H}\) by an unconstrained field, which in the paper we called the `double metric' \({\cal M}\), violates gauge invariance. However, by construction, the failure of gauge invariance must be proportional to \({\cal M}^2-1\), and therefore we can restore gauge invariance, at least to first order, by adding the following term to the action: \[ S = \int e^{-2d}\,\Big[\tfrac{1}{2}\,\eta^{MN}\big({\cal M}-\tfrac{1}{3}{\cal M}^3\big)_{MN} + {\cal R}'({\cal M},d) + \dots \Big] \] The variation of the first term is proportional to \({\cal M}^2-1\), and so whatever the `anomalous' transformation of the generalized Ricci scalar is, we can cancel it by assigning a suitable extra gauge variation to \({\cal M}\). (The scalar \({\cal R}'\) carries a prime here, because one actually has to augment the original Ricci scalar by terms that vanish when the metric is constrained.) Since \({\cal R}\) already contains two derivatives, this means we have a higher-derivative (order \(\alpha'\)) deformation of the gauge transformations.
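To see the claimed variation explicitly (spelling out the statement in the text, with all indices raised and lowered by \(\eta\)): using cyclicity of the contractions, \[ \delta\Big[\tfrac{1}{2}\,\eta^{MN}\big({\cal M}-\tfrac{1}{3}{\cal M}^3\big)_{MN}\Big] = \tfrac{1}{2}\big(\eta^{MN}-{\cal M}^{MK}{\cal M}_{K}{}^{N}\big)\,\delta{\cal M}_{MN} \] which vanishes precisely for constrained fields with \({\cal M}^2=1\); for unconstrained \({\cal M}\) it is proportional to \(1-{\cal M}^2\), and this is what allows the anomalous variation of \({\cal R}'\) to be cancelled.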

Now the problem is that we also have to use these \({\cal O}(\alpha')\) gauge transformations in the variation of \({\cal R}'\), which requires extra higher-derivative terms in the action, in turn necessitating yet higher order terms in the gauge transformations. One would think that this leads to an iterative procedure that never stops, thus giving at best a gauge and T-duality invariant action to some finite order in \(\alpha'\). Remarkably, however, we found an exact deformation, with gauge transformations carrying a finite number of higher derivatives. Moreover, the chiral CFT construction allowed us to define an explicit action with up to six derivatives, which is of the same structural form as above. Intriguingly, the action in terms of \({\cal M}\) is cubic. Since the \(O(D,D)\) metric \(\eta\) is used to raise and lower indices, we never need the inverse of \({\cal M}\) (in fact, the action is completely well-defined for singular \({\cal M}\)), and so we have a truly polynomial action for gravity.

How can this be, given the standard folklore that gravity must be non-polynomial? The resolution is actually quite simple: since \({\cal M}\) is now unconstrained, it encodes more fields than the expected metric and b-field, and the extra field components act as auxiliary fields. Integrating them out leads to the non-polynomial form of gravity, but now including infinitely many higher-derivative corrections. Actually, the fact that gravity can be made polynomial by introducing auxiliary fields is well known, but in all cases I am aware of this is achieved by using connection variables; the novelty here is that components of the metric itself (of the double metric, that is) serve as auxiliary fields, in a way that does not simply reproduce Einstein gravity but also generates infinitely many higher-derivative corrections!
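A cartoon of this mechanism (my own toy model, not from the papers): even a cubic Lagrangian with an auxiliary field can reproduce an infinite power series, i.e. a "non-polynomial" theory, once the auxiliary field is eliminated by its equation of motion. A quick sympy sketch:

```python
import sympy as sp

# Cubic Lagrangian in (phi, sigma), with sigma auxiliary (no derivatives):
# L = -sigma + (1 + phi) * sigma^2 / 2
phi, sigma = sp.symbols('phi sigma')
L = -sigma + sp.Rational(1, 2)*(1 + phi)*sigma**2

# Eliminate sigma via its algebraic equation of motion
sigma_star = sp.solve(sp.diff(L, sigma), sigma)[0]   # sigma = 1/(1 + phi)
L_eff = sp.simplify(L.subs(sigma, sigma_star))       # L_eff = -1/(2*(1 + phi))

assert sp.simplify(L_eff + 1/(2*(1 + phi))) == 0
# Expanding in phi exposes the infinite tower of induced terms:
print(sp.series(L_eff, phi, 0, 4))
```

In DFT the analogue is of course vastly richer, with derivatives entering everywhere, but the pattern is the same: the extra components of \({\cal M}\) play the role of \(\sigma\), and eliminating them generates the infinite series of corrections.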

What are these higher-derivative corrections in conventional language? This is actually quite a non-trivial question, since beyond zeroth order in \(\alpha'\) the conventional metric and b-field are not encoded in \({\cal M}\) in any simple manner, as is to be expected given the rather dramatic reorganization of the spacetime theory. Barton and I managed to show last year that to first order in \(\alpha'\) the theory encodes, in particular, the deformation due to the Green-Schwarz mechanism in heterotic string theory. In this mechanism, the spacetime gauge symmetries (local Lorentz transformations or diffeomorphisms) are deformed in order to cancel anomalies, which in turn requires higher-derivative terms in the action in the form of Chern-Simons modifications of the three-form curvature. We are currently trying to figure out what exactly the higher-derivative modifications are at yet higher orders in \(\alpha'\). These are only a subsector of all possible \(\alpha'\) corrections. For instance, this theory does not describe the Riemann-square correction present in both bosonic and heterotic string theory. This is not an inconsistency, because T-duality is not supposed to completely constrain the \(\alpha'\) corrections; after all, there are different closed string theories with different corrections. This theory describes one particular \(O(d,d)\) invariant, and we are currently trying to extend this construction to other invariants.

I hope my brief description conveys some of the reasons why we are so excited about double field theory. In a follow-up blog post I will explain how the extension of double field theory to exceptional groups, exceptional field theory, allows us to solve problems that, although strictly in the realm of the two-derivative supergravity, were simply intractable before. So stay tuned.

Sunday, April 12, 2015 ... Français/Deutsch/Español/Česky/Japanese/Related posts from blogosphere

Manifest unitarity

Guest blog by Prof Dejan Stojkovic, University of Buffalo

Dear Lubos,
First, I would like to thank you very much for your kind invitation to write a guest post. I am certainly honored by this gesture.

We recently published a paper titled “Radiation from a Collapsing Object is Manifestly Unitary” in PRL. The title was carefully chosen (note the absence of the term “black hole”) because of its potential implications for a very touchy issue of the information loss paradox. I will use this opportunity to explain our points of view.