diff --git a/gaps-1.1/LICENSE b/gaps-1.1/LICENSE new file mode 100644 index 0000000000000000000000000000000000000000..f288702d2fa16d3cdf0035b15a9fcbc552cd88e7 --- /dev/null +++ b/gaps-1.1/LICENSE @@ -0,0 +1,674 @@ + GNU GENERAL PUBLIC LICENSE + Version 3, 29 June 2007 + + Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/> + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + Preamble + + The GNU General Public License is a free, copyleft license for +software and other kinds of works. + + The licenses for most software and other practical works are designed +to take away your freedom to share and change the works. By contrast, +the GNU General Public License is intended to guarantee your freedom to +share and change all versions of a program--to make sure it remains free +software for all its users. We, the Free Software Foundation, use the +GNU General Public License for most of our software; it applies also to +any other work released this way by its authors. You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +them if you wish), that you receive source code or can get it if you +want it, that you can change the software or use pieces of it in new +free programs, and that you know you can do these things. + + To protect your rights, we need to prevent others from denying you +these rights or asking you to surrender the rights. Therefore, you have +certain responsibilities if you distribute copies of the software, or if +you modify it: responsibilities to respect the freedom of others. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must pass on to the recipients the same +freedoms that you received. 
You must make sure that they, too, receive +or can get the source code. And you must show them these terms so they +know their rights. + + Developers that use the GNU GPL protect your rights with two steps: +(1) assert copyright on the software, and (2) offer you this License +giving you legal permission to copy, distribute and/or modify it. + + For the developers' and authors' protection, the GPL clearly explains +that there is no warranty for this free software. For both users' and +authors' sake, the GPL requires that modified versions be marked as +changed, so that their problems will not be attributed erroneously to +authors of previous versions. + + Some devices are designed to deny users access to install or run +modified versions of the software inside them, although the manufacturer +can do so. This is fundamentally incompatible with the aim of +protecting users' freedom to change the software. The systematic +pattern of such abuse occurs in the area of products for individuals to +use, which is precisely where it is most unacceptable. Therefore, we +have designed this version of the GPL to prohibit the practice for those +products. If such problems arise substantially in other domains, we +stand ready to extend this provision to those domains in future versions +of the GPL, as needed to protect the freedom of users. + + Finally, every program is threatened constantly by software patents. +States should not allow patents to restrict development and use of +software on general-purpose computers, but in those that do, we wish to +avoid the special danger that patents applied to a free program could +make it effectively proprietary. To prevent this, the GPL assures that +patents cannot be used to render the program non-free. + + The precise terms and conditions for copying, distribution and +modification follow. + + TERMS AND CONDITIONS + + 0. Definitions. + + "This License" refers to version 3 of the GNU General Public License. 
+ + "Copyright" also means copyright-like laws that apply to other kinds of +works, such as semiconductor masks. + + "The Program" refers to any copyrightable work licensed under this +License. Each licensee is addressed as "you". "Licensees" and +"recipients" may be individuals or organizations. + + To "modify" a work means to copy from or adapt all or part of the work +in a fashion requiring copyright permission, other than the making of an +exact copy. The resulting work is called a "modified version" of the +earlier work or a work "based on" the earlier work. + + A "covered work" means either the unmodified Program or a work based +on the Program. + + To "propagate" a work means to do anything with it that, without +permission, would make you directly or secondarily liable for +infringement under applicable copyright law, except executing it on a +computer or modifying a private copy. Propagation includes copying, +distribution (with or without modification), making available to the +public, and in some countries other activities as well. + + To "convey" a work means any kind of propagation that enables other +parties to make or receive copies. Mere interaction with a user through +a computer network, with no transfer of a copy, is not conveying. + + An interactive user interface displays "Appropriate Legal Notices" +to the extent that it includes a convenient and prominently visible +feature that (1) displays an appropriate copyright notice, and (2) +tells the user that there is no warranty for the work (except to the +extent that warranties are provided), that licensees may convey the +work under this License, and how to view a copy of this License. If +the interface presents a list of user commands or options, such as a +menu, a prominent item in the list meets this criterion. + + 1. Source Code. + + The "source code" for a work means the preferred form of the work +for making modifications to it. "Object code" means any non-source +form of a work. 
+ + A "Standard Interface" means an interface that either is an official +standard defined by a recognized standards body, or, in the case of +interfaces specified for a particular programming language, one that +is widely used among developers working in that language. + + The "System Libraries" of an executable work include anything, other +than the work as a whole, that (a) is included in the normal form of +packaging a Major Component, but which is not part of that Major +Component, and (b) serves only to enable use of the work with that +Major Component, or to implement a Standard Interface for which an +implementation is available to the public in source code form. A +"Major Component", in this context, means a major essential component +(kernel, window system, and so on) of the specific operating system +(if any) on which the executable work runs, or a compiler used to +produce the work, or an object code interpreter used to run it. + + The "Corresponding Source" for a work in object code form means all +the source code needed to generate, install, and (for an executable +work) run the object code and to modify the work, including scripts to +control those activities. However, it does not include the work's +System Libraries, or general-purpose tools or generally available free +programs which are used unmodified in performing those activities but +which are not part of the work. For example, Corresponding Source +includes interface definition files associated with source files for +the work, and the source code for shared libraries and dynamically +linked subprograms that the work is specifically designed to require, +such as by intimate data communication or control flow between those +subprograms and other parts of the work. + + The Corresponding Source need not include anything that users +can regenerate automatically from other parts of the Corresponding +Source. + + The Corresponding Source for a work in source code form is that +same work. + + 2. 
Basic Permissions. + + All rights granted under this License are granted for the term of +copyright on the Program, and are irrevocable provided the stated +conditions are met. This License explicitly affirms your unlimited +permission to run the unmodified Program. The output from running a +covered work is covered by this License only if the output, given its +content, constitutes a covered work. This License acknowledges your +rights of fair use or other equivalent, as provided by copyright law. + + You may make, run and propagate covered works that you do not +convey, without conditions so long as your license otherwise remains +in force. You may convey covered works to others for the sole purpose +of having them make modifications exclusively for you, or provide you +with facilities for running those works, provided that you comply with +the terms of this License in conveying all material for which you do +not control copyright. Those thus making or running the covered works +for you must do so exclusively on your behalf, under your direction +and control, on terms that prohibit them from making any copies of +your copyrighted material outside their relationship with you. + + Conveying under any other circumstances is permitted solely under +the conditions stated below. Sublicensing is not allowed; section 10 +makes it unnecessary. + + 3. Protecting Users' Legal Rights From Anti-Circumvention Law. + + No covered work shall be deemed part of an effective technological +measure under any applicable law fulfilling obligations under article +11 of the WIPO copyright treaty adopted on 20 December 1996, or +similar laws prohibiting or restricting circumvention of such +measures. 
+ + When you convey a covered work, you waive any legal power to forbid +circumvention of technological measures to the extent such circumvention +is effected by exercising rights under this License with respect to +the covered work, and you disclaim any intention to limit operation or +modification of the work as a means of enforcing, against the work's +users, your or third parties' legal rights to forbid circumvention of +technological measures. + + 4. Conveying Verbatim Copies. + + You may convey verbatim copies of the Program's source code as you +receive it, in any medium, provided that you conspicuously and +appropriately publish on each copy an appropriate copyright notice; +keep intact all notices stating that this License and any +non-permissive terms added in accord with section 7 apply to the code; +keep intact all notices of the absence of any warranty; and give all +recipients a copy of this License along with the Program. + + You may charge any price or no price for each copy that you convey, +and you may offer support or warranty protection for a fee. + + 5. Conveying Modified Source Versions. + + You may convey a work based on the Program, or the modifications to +produce it from the Program, in the form of source code under the +terms of section 4, provided that you also meet all of these conditions: + + a) The work must carry prominent notices stating that you modified + it, and giving a relevant date. + + b) The work must carry prominent notices stating that it is + released under this License and any conditions added under section + 7. This requirement modifies the requirement in section 4 to + "keep intact all notices". + + c) You must license the entire work, as a whole, under this + License to anyone who comes into possession of a copy. This + License will therefore apply, along with any applicable section 7 + additional terms, to the whole of the work, and all its parts, + regardless of how they are packaged. 
This License gives no + permission to license the work in any other way, but it does not + invalidate such permission if you have separately received it. + + d) If the work has interactive user interfaces, each must display + Appropriate Legal Notices; however, if the Program has interactive + interfaces that do not display Appropriate Legal Notices, your + work need not make them do so. + + A compilation of a covered work with other separate and independent +works, which are not by their nature extensions of the covered work, +and which are not combined with it such as to form a larger program, +in or on a volume of a storage or distribution medium, is called an +"aggregate" if the compilation and its resulting copyright are not +used to limit the access or legal rights of the compilation's users +beyond what the individual works permit. Inclusion of a covered work +in an aggregate does not cause this License to apply to the other +parts of the aggregate. + + 6. Conveying Non-Source Forms. + + You may convey a covered work in object code form under the terms +of sections 4 and 5, provided that you also convey the +machine-readable Corresponding Source under the terms of this License, +in one of these ways: + + a) Convey the object code in, or embodied in, a physical product + (including a physical distribution medium), accompanied by the + Corresponding Source fixed on a durable physical medium + customarily used for software interchange. 
+ + b) Convey the object code in, or embodied in, a physical product + (including a physical distribution medium), accompanied by a + written offer, valid for at least three years and valid for as + long as you offer spare parts or customer support for that product + model, to give anyone who possesses the object code either (1) a + copy of the Corresponding Source for all the software in the + product that is covered by this License, on a durable physical + medium customarily used for software interchange, for a price no + more than your reasonable cost of physically performing this + conveying of source, or (2) access to copy the + Corresponding Source from a network server at no charge. + + c) Convey individual copies of the object code with a copy of the + written offer to provide the Corresponding Source. This + alternative is allowed only occasionally and noncommercially, and + only if you received the object code with such an offer, in accord + with subsection 6b. + + d) Convey the object code by offering access from a designated + place (gratis or for a charge), and offer equivalent access to the + Corresponding Source in the same way through the same place at no + further charge. You need not require recipients to copy the + Corresponding Source along with the object code. If the place to + copy the object code is a network server, the Corresponding Source + may be on a different server (operated by you or a third party) + that supports equivalent copying facilities, provided you maintain + clear directions next to the object code saying where to find the + Corresponding Source. Regardless of what server hosts the + Corresponding Source, you remain obligated to ensure that it is + available for as long as needed to satisfy these requirements. 
+ + e) Convey the object code using peer-to-peer transmission, provided + you inform other peers where the object code and Corresponding + Source of the work are being offered to the general public at no + charge under subsection 6d. + + A separable portion of the object code, whose source code is excluded +from the Corresponding Source as a System Library, need not be +included in conveying the object code work. + + A "User Product" is either (1) a "consumer product", which means any +tangible personal property which is normally used for personal, family, +or household purposes, or (2) anything designed or sold for incorporation +into a dwelling. In determining whether a product is a consumer product, +doubtful cases shall be resolved in favor of coverage. For a particular +product received by a particular user, "normally used" refers to a +typical or common use of that class of product, regardless of the status +of the particular user or of the way in which the particular user +actually uses, or expects or is expected to use, the product. A product +is a consumer product regardless of whether the product has substantial +commercial, industrial or non-consumer uses, unless such uses represent +the only significant mode of use of the product. + + "Installation Information" for a User Product means any methods, +procedures, authorization keys, or other information required to install +and execute modified versions of a covered work in that User Product from +a modified version of its Corresponding Source. The information must +suffice to ensure that the continued functioning of the modified object +code is in no case prevented or interfered with solely because +modification has been made. 
+ + If you convey an object code work under this section in, or with, or +specifically for use in, a User Product, and the conveying occurs as +part of a transaction in which the right of possession and use of the +User Product is transferred to the recipient in perpetuity or for a +fixed term (regardless of how the transaction is characterized), the +Corresponding Source conveyed under this section must be accompanied +by the Installation Information. But this requirement does not apply +if neither you nor any third party retains the ability to install +modified object code on the User Product (for example, the work has +been installed in ROM). + + The requirement to provide Installation Information does not include a +requirement to continue to provide support service, warranty, or updates +for a work that has been modified or installed by the recipient, or for +the User Product in which it has been modified or installed. Access to a +network may be denied when the modification itself materially and +adversely affects the operation of the network or violates the rules and +protocols for communication across the network. + + Corresponding Source conveyed, and Installation Information provided, +in accord with this section must be in a format that is publicly +documented (and with an implementation available to the public in +source code form), and must require no special password or key for +unpacking, reading or copying. + + 7. Additional Terms. + + "Additional permissions" are terms that supplement the terms of this +License by making exceptions from one or more of its conditions. +Additional permissions that are applicable to the entire Program shall +be treated as though they were included in this License, to the extent +that they are valid under applicable law. 
If additional permissions +apply only to part of the Program, that part may be used separately +under those permissions, but the entire Program remains governed by +this License without regard to the additional permissions. + + When you convey a copy of a covered work, you may at your option +remove any additional permissions from that copy, or from any part of +it. (Additional permissions may be written to require their own +removal in certain cases when you modify the work.) You may place +additional permissions on material, added by you to a covered work, +for which you have or can give appropriate copyright permission. + + Notwithstanding any other provision of this License, for material you +add to a covered work, you may (if authorized by the copyright holders of +that material) supplement the terms of this License with terms: + + a) Disclaiming warranty or limiting liability differently from the + terms of sections 15 and 16 of this License; or + + b) Requiring preservation of specified reasonable legal notices or + author attributions in that material or in the Appropriate Legal + Notices displayed by works containing it; or + + c) Prohibiting misrepresentation of the origin of that material, or + requiring that modified versions of such material be marked in + reasonable ways as different from the original version; or + + d) Limiting the use for publicity purposes of names of licensors or + authors of the material; or + + e) Declining to grant rights under trademark law for use of some + trade names, trademarks, or service marks; or + + f) Requiring indemnification of licensors and authors of that + material by anyone who conveys the material (or modified versions of + it) with contractual assumptions of liability to the recipient, for + any liability that these contractual assumptions directly impose on + those licensors and authors. + + All other non-permissive additional terms are considered "further +restrictions" within the meaning of section 10. 
If the Program as you +received it, or any part of it, contains a notice stating that it is +governed by this License along with a term that is a further +restriction, you may remove that term. If a license document contains +a further restriction but permits relicensing or conveying under this +License, you may add to a covered work material governed by the terms +of that license document, provided that the further restriction does +not survive such relicensing or conveying. + + If you add terms to a covered work in accord with this section, you +must place, in the relevant source files, a statement of the +additional terms that apply to those files, or a notice indicating +where to find the applicable terms. + + Additional terms, permissive or non-permissive, may be stated in the +form of a separately written license, or stated as exceptions; +the above requirements apply either way. + + 8. Termination. + + You may not propagate or modify a covered work except as expressly +provided under this License. Any attempt otherwise to propagate or +modify it is void, and will automatically terminate your rights under +this License (including any patent licenses granted under the third +paragraph of section 11). + + However, if you cease all violation of this License, then your +license from a particular copyright holder is reinstated (a) +provisionally, unless and until the copyright holder explicitly and +finally terminates your license, and (b) permanently, if the copyright +holder fails to notify you of the violation by some reasonable means +prior to 60 days after the cessation. + + Moreover, your license from a particular copyright holder is +reinstated permanently if the copyright holder notifies you of the +violation by some reasonable means, this is the first time you have +received notice of violation of this License (for any work) from that +copyright holder, and you cure the violation prior to 30 days after +your receipt of the notice. 
+ + Termination of your rights under this section does not terminate the +licenses of parties who have received copies or rights from you under +this License. If your rights have been terminated and not permanently +reinstated, you do not qualify to receive new licenses for the same +material under section 10. + + 9. Acceptance Not Required for Having Copies. + + You are not required to accept this License in order to receive or +run a copy of the Program. Ancillary propagation of a covered work +occurring solely as a consequence of using peer-to-peer transmission +to receive a copy likewise does not require acceptance. However, +nothing other than this License grants you permission to propagate or +modify any covered work. These actions infringe copyright if you do +not accept this License. Therefore, by modifying or propagating a +covered work, you indicate your acceptance of this License to do so. + + 10. Automatic Licensing of Downstream Recipients. + + Each time you convey a covered work, the recipient automatically +receives a license from the original licensors, to run, modify and +propagate that work, subject to this License. You are not responsible +for enforcing compliance by third parties with this License. + + An "entity transaction" is a transaction transferring control of an +organization, or substantially all assets of one, or subdividing an +organization, or merging organizations. If propagation of a covered +work results from an entity transaction, each party to that +transaction who receives a copy of the work also receives whatever +licenses to the work the party's predecessor in interest had or could +give under the previous paragraph, plus a right to possession of the +Corresponding Source of the work from the predecessor in interest, if +the predecessor has it or can get it with reasonable efforts. + + You may not impose any further restrictions on the exercise of the +rights granted or affirmed under this License. 
For example, you may +not impose a license fee, royalty, or other charge for exercise of +rights granted under this License, and you may not initiate litigation +(including a cross-claim or counterclaim in a lawsuit) alleging that +any patent claim is infringed by making, using, selling, offering for +sale, or importing the Program or any portion of it. + + 11. Patents. + + A "contributor" is a copyright holder who authorizes use under this +License of the Program or a work on which the Program is based. The +work thus licensed is called the contributor's "contributor version". + + A contributor's "essential patent claims" are all patent claims +owned or controlled by the contributor, whether already acquired or +hereafter acquired, that would be infringed by some manner, permitted +by this License, of making, using, or selling its contributor version, +but do not include claims that would be infringed only as a +consequence of further modification of the contributor version. For +purposes of this definition, "control" includes the right to grant +patent sublicenses in a manner consistent with the requirements of +this License. + + Each contributor grants you a non-exclusive, worldwide, royalty-free +patent license under the contributor's essential patent claims, to +make, use, sell, offer for sale, import and otherwise run, modify and +propagate the contents of its contributor version. + + In the following three paragraphs, a "patent license" is any express +agreement or commitment, however denominated, not to enforce a patent +(such as an express permission to practice a patent or covenant not to +sue for patent infringement). To "grant" such a patent license to a +party means to make such an agreement or commitment not to enforce a +patent against the party. 
+ + If you convey a covered work, knowingly relying on a patent license, +and the Corresponding Source of the work is not available for anyone +to copy, free of charge and under the terms of this License, through a +publicly available network server or other readily accessible means, +then you must either (1) cause the Corresponding Source to be so +available, or (2) arrange to deprive yourself of the benefit of the +patent license for this particular work, or (3) arrange, in a manner +consistent with the requirements of this License, to extend the patent +license to downstream recipients. "Knowingly relying" means you have +actual knowledge that, but for the patent license, your conveying the +covered work in a country, or your recipient's use of the covered work +in a country, would infringe one or more identifiable patents in that +country that you have reason to believe are valid. + + If, pursuant to or in connection with a single transaction or +arrangement, you convey, or propagate by procuring conveyance of, a +covered work, and grant a patent license to some of the parties +receiving the covered work authorizing them to use, propagate, modify +or convey a specific copy of the covered work, then the patent license +you grant is automatically extended to all recipients of the covered +work and works based on it. + + A patent license is "discriminatory" if it does not include within +the scope of its coverage, prohibits the exercise of, or is +conditioned on the non-exercise of one or more of the rights that are +specifically granted under this License. 
You may not convey a covered +work if you are a party to an arrangement with a third party that is +in the business of distributing software, under which you make payment +to the third party based on the extent of your activity of conveying +the work, and under which the third party grants, to any of the +parties who would receive the covered work from you, a discriminatory +patent license (a) in connection with copies of the covered work +conveyed by you (or copies made from those copies), or (b) primarily +for and in connection with specific products or compilations that +contain the covered work, unless you entered into that arrangement, +or that patent license was granted, prior to 28 March 2007. + + Nothing in this License shall be construed as excluding or limiting +any implied license or other defenses to infringement that may +otherwise be available to you under applicable patent law. + + 12. No Surrender of Others' Freedom. + + If conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot convey a +covered work so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you may +not convey it at all. For example, if you agree to terms that obligate you +to collect a royalty for further conveying from those to whom you convey +the Program, the only way you could satisfy both those terms and this +License would be to refrain entirely from conveying the Program. + + 13. Use with the GNU Affero General Public License. + + Notwithstanding any other provision of this License, you have +permission to link or combine any covered work with a work licensed +under version 3 of the GNU Affero General Public License into a single +combined work, and to convey the resulting work. 
The terms of this +License will continue to apply to the part which is the covered work, +but the special requirements of the GNU Affero General Public License, +section 13, concerning interaction through a network will apply to the +combination as such. + + 14. Revised Versions of this License. + + The Free Software Foundation may publish revised and/or new versions of +the GNU General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + + Each version is given a distinguishing version number. If the +Program specifies that a certain numbered version of the GNU General +Public License "or any later version" applies to it, you have the +option of following the terms and conditions either of that numbered +version or of any later version published by the Free Software +Foundation. If the Program does not specify a version number of the +GNU General Public License, you may choose any version ever published +by the Free Software Foundation. + + If the Program specifies that a proxy can decide which future +versions of the GNU General Public License can be used, that proxy's +public statement of acceptance of a version permanently authorizes you +to choose that version for the Program. + + Later license versions may give you additional or different +permissions. However, no additional obligations are imposed on any +author or copyright holder as a result of your choosing to follow a +later version. + + 15. Disclaimer of Warranty. + + THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY +APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT +HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY +OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, +THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE. 
THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM +IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF +ALL NECESSARY SERVICING, REPAIR OR CORRECTION. + + 16. Limitation of Liability. + + IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS +THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY +GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE +USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF +DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD +PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), +EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF +SUCH DAMAGES. + + 17. Interpretation of Sections 15 and 16. + + If the disclaimer of warranty and limitation of liability provided +above cannot be given local legal effect according to their terms, +reviewing courts shall apply local law that most closely approximates +an absolute waiver of all civil liability in connection with the +Program, unless a warranty or assumption of liability accompanies a +copy of the Program in return for a fee. + + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +state the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. 
+ + <one line to give the program's name and a brief idea of what it does.> + Copyright (C) <year> <name of author> + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <https://www.gnu.org/licenses/>. + +Also add information on how to contact you by electronic and paper mail. + + If the program does terminal interaction, make it output a short +notice like this when it starts in an interactive mode: + + <program> Copyright (C) <year> <name of author> + This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + +The hypothetical commands `show w' and `show c' should show the appropriate +parts of the General Public License. Of course, your program's commands +might be different; for a GUI interface, you would use an "about box". + + You should also get your employer (if you work as a programmer) or school, +if any, to sign a "copyright disclaimer" for the program, if necessary. +For more information on this, and how to apply and follow the GNU GPL, see +<https://www.gnu.org/licenses/>. + + The GNU General Public License does not permit incorporating your program +into proprietary programs. If your program is a subroutine library, you +may consider it more useful to permit linking proprietary applications with +the library. 
If this is what you want to do, use the GNU Lesser General +Public License instead of this License. But first, please read +<https://www.gnu.org/licenses/why-not-lgpl.html>. diff --git a/gaps-1.1/README.md b/gaps-1.1/README.md new file mode 100644 index 0000000000000000000000000000000000000000..b404d3764d643b017ebffb4b218e16ac4c6a48a0 --- /dev/null +++ b/gaps-1.1/README.md @@ -0,0 +1,63 @@ +# GAPS: a GPU-Amplified Parton Shower + +> **Version 1.1.0**: Some extra features added on demand, folders restructured. + +Code for "An Algorithm to Parallelise Parton Showers on a GPU" [[arxiv:2403.08692](https://arxiv.org/abs/2403.08692)] + +The aim of this project is to demonstrate how a Parton Shower Veto Algorithm can be written to run in parallel on a GPU. The code runs a simple LEP Event Generator on NVIDIA GPUs using CUDA. It is based on S. Höche's Tutorial on Parton Showers [[arxiv:1411.4085](https://arxiv.org/abs/1411.4085)]. + +## What can the code do on the GPU? + +- Calculate the Matrix Element for $e^+ e^- \to q \bar{q}$ at 91.2 GeV +- Simulate a Final State Dipole Shower +- Calculate Jet Rates and Event Shapes + +## Requirements + +You will need an NVIDIA GPU designed for data centres (this code is verified to run on the NVIDIA Tesla V100 and A100 Devices). To build the code, you will need CMake, G++, Python and the NVIDIA CUDA Toolkit, which contains the NVCC compiler. + +## Running the Code + +The executable ```rungaps``` is written to simplify the use of the code. One can simply execute the command: + +```bash +./rungaps +``` + +NB: If you get a permission denied error, please run ```chmod +x rungaps```. + +This should build the program and generate 10000 events on the GPU. 
More customisation options are available, and are listed below: + +```bash +# Simulate different numbers of events and build the code using multiple CPU cores +./rungaps -n nevents -c ncores + +# Run C++ Simulation +./rungaps -n nevents -c ncores -r cpp + +# Run the same number of events on C++ and CUDA and compare times +./rungaps -n nevents -c ncores -r compare + +# Run a range of event numbers 100 times, as seen in the paper +./rungaps -c ncores -r full +``` + +The histograms are saved as YODA files [[arxiv:2312.15070](https://arxiv.org/abs/2312.15070)]. To generate the plots, use Rivet [[arxiv:1912.05451](https://arxiv.org/abs/1912.05451)] as follows: + +```shell +rivet-mkhtml my-output.yoda:"Results" -s --mc-errs -c plots.conf +``` + +## Modifying Parameters and Going Further + +**New**: You can now adjust the Centre of Mass Energy using the ```-e``` flag. + +To focus on the computational aspects and make it simple to replicate the results in the paper, we don't allow direct access to the physics parameters (yet!). For now, please use the ```base.cuh``` file to adjust parameters like $\alpha_s(m_Z)$, $t_{C}$ and $n_{Bins}$. + +To learn more about the code and how it all works, see the [documentation](doc/README.md). 
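A note for anyone adjusting these parameters: in the C++ `base.h`, `asmax` is commented as the pre-calculated value of $\alpha_s$ at the cutoff, so changing `asmz` or `tC` presumably means updating `asmax` as well. The block below is a minimal standalone sketch of how such a value could be recomputed. It assumes that `base.cuh` mirrors `base.h`, that `tC` is the scale in GeV$^2$ passed to the running coupling, and that the running follows the two-loop formula and flavour thresholds of the `AlphaS` class in `qcd.h`. The function names `beta0`, `beta1`, `alphaS2Loop` and `alphaSAt` are illustrative and not part of GAPS, and the result should be checked against the hard-coded value rather than taken as authoritative.

```cpp
#include <cmath>

// Colour factors, copied from qcd.h
const double kPi = 3.14159265358979323846;
const double kNC = 3.;
const double kTR = 0.5;
const double kCA = kNC;
const double kCF = (kNC * kNC - 1.) / (2. * kNC);

// Beta function coefficients for nf active flavours (same form as qcd.h)
double beta0(int nf) { return 11. / 6. * kCA - 2. / 3. * kTR * nf; }
double beta1(int nf) {
  return 17. / 6. * kCA * kCA - (5. / 3. * kCA + kCF) * kTR * nf;
}

// Two-loop running from a reference point (tref, asref) to the scale t
double alphaS2Loop(double t, double tref, double asref, int nf) {
  double b0 = beta0(nf) / (2. * kPi);
  double b1 = beta1(nf) / std::pow(2. * kPi, 2);
  double w = 1. + b0 * asref * std::log(t / tref);
  return asref / w * (1. - b1 / b0 * asref * std::log(w) / w);
}

// alpha_s at scale t (GeV^2), stepping down through the b and c thresholds.
// The default masses are the ones used in the AlphaS constructor in qcd.h.
double alphaSAt(double t, double mz, double asmz, double mb = 4.75,
                double mc = 1.27) {
  double mz2 = mz * mz, mb2 = mb * mb, mc2 = mc * mc;
  if (t >= mb2) return alphaS2Loop(t, mz2, asmz, 5);
  double asmb = alphaS2Loop(mb2, mz2, asmz, 5);  // value at the b threshold
  if (t >= mc2) return alphaS2Loop(t, mb2, asmb, 4);
  double asmc = alphaS2Loop(mc2, mb2, asmb, 4);  // value at the c threshold
  return alphaS2Loop(t, mc2, asmc, 3);
}
```

Calling `alphaSAt(tC, mz, asmz)` with the values from `base.h` gives a candidate for `asmax`. If it differs noticeably from the hard-coded number, the assumptions above (loop order, quark masses, meaning of `tC`) are the first things to check.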
+ +*** + +### Sid Sule + Mike Seymour, March 2024 + +For issues and queries, email: [siddharth.sule@manchester.ac.uk](mailto:siddharth.sule@manchester.ac.uk) diff --git a/gaps-1.1/cpp-shower/.vscode/settings.json b/gaps-1.1/cpp-shower/.vscode/settings.json new file mode 100644 index 0000000000000000000000000000000000000000..23fd35f0e0e708ef622c7d957b9c8bb60c7876eb --- /dev/null +++ b/gaps-1.1/cpp-shower/.vscode/settings.json @@ -0,0 +1,3 @@ +{ + "editor.formatOnSave": true +} \ No newline at end of file diff --git a/gaps-1.1/cpp-shower/CMakeLists.txt b/gaps-1.1/cpp-shower/CMakeLists.txt new file mode 100644 index 0000000000000000000000000000000000000000..7722eb24f8812c0b49d569e805d0301fdefcd885 --- /dev/null +++ b/gaps-1.1/cpp-shower/CMakeLists.txt @@ -0,0 +1,27 @@ +# Minimum required version of CMake +cmake_minimum_required(VERSION 3.10) + +# Project name and languages used +project(cpp-shower LANGUAGES CXX) + +# Set C++ standard +set(CMAKE_CXX_STANDARD 17) +set(CMAKE_CXX_STANDARD_REQUIRED ON) + +# List of subdirectories +set(SUBDIRS base matrix shower observables) + +# Include the directories for the headers +foreach(subdir ${SUBDIRS}) + include_directories(${subdir}/include) + add_subdirectory(${subdir}) +endforeach() + +# Set the directory for the executable +set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_SOURCE_DIR}/bin) + +# Add main.cpp to the executable +add_executable(cpp-shower main.cpp) + +# Link the libraries from the subdirectories +target_link_libraries(cpp-shower matrix shower observables) \ No newline at end of file diff --git a/gaps-1.1/cpp-shower/base/CMakeLists.txt b/gaps-1.1/cpp-shower/base/CMakeLists.txt new file mode 100644 index 0000000000000000000000000000000000000000..47434fffccdadd849fcbdb1d8b41a503a829a5ed --- /dev/null +++ b/gaps-1.1/cpp-shower/base/CMakeLists.txt @@ -0,0 +1,5 @@ +cmake_minimum_required(VERSION 3.10) + +set(CMAKE_CXX_STANDARD 17) + +include_directories(include) \ No newline at end of file diff --git 
a/gaps-1.1/cpp-shower/base/include/base.h b/gaps-1.1/cpp-shower/base/include/base.h new file mode 100644 index 0000000000000000000000000000000000000000..81a34b05647a556f5fffa5b731384105288d760c --- /dev/null +++ b/gaps-1.1/cpp-shower/base/include/base.h @@ -0,0 +1,34 @@ +#ifndef BASE_H_ +#define BASE_H_ + +// ----------------------------------------------------------------------------- +// Import Libraries + +#include <cmath> // Math Functions +#include <fstream> // File I/O +#include <iostream> // Standard I/O +#include <random> // Random Number Generation +#include <vector> // Vector + +// ----------------------------------------------------------------------------- +// Program Settings - CAREFUL WITH CHANGES + +// Max Number of Partons, set to save memory +const int maxPartons = 30; + +// LEP 91.2 settings +const double mz = 91.1876; +const double asmz = 0.118; + +// Cutoff and its value of alpha_s (pre-calculated) +const double tC = 1.; +const double asmax = 0.440886; + +// Number of Histogram Bins: Common for all Plots (for now...) 
+const int nBins = 100; +const int nBins2D = 100; // Bins per axis: Histo2D builds an nBins2D x nBins2D grid + +// Maximum Number of Events, beyond which the program runs in batches +const int maxEvents = 1000000; + +#endif // BASE_H_ \ No newline at end of file diff --git a/gaps-1.1/cpp-shower/base/include/event.h b/gaps-1.1/cpp-shower/base/include/event.h new file mode 100644 index 0000000000000000000000000000000000000000..26d5580dce7ee50bf0acf0d824fb71fd2d861a2d --- /dev/null +++ b/gaps-1.1/cpp-shower/base/include/event.h @@ -0,0 +1,219 @@ +#ifndef EVENT_H_ +#define EVENT_H_ + +#include "parton.h" + +// Event Class +// Built to contain the partons and the dxs as one accessible object +// In future, can use to store thrust, log10y23 to parallelise those +class Event { + private: + // Temporary Solution - Allows a limited number of partons + // Better Solution would be to use a dynamic array, but not GPU friendly + Parton partons[maxPartons]; + + // ME Params ----------------------------------------------------------------- + + double dxs = 0.; // Differential Cross Section + int nHard = 0; // Number of Hard Partons + // int nInitial = 0; // Number of Initial Partons (Prep for ISR) + // int nNonParton = 0; // Number of Non-Parton Partons (Prep for ISR) + + // Shower Params ------------------------------------------------------------- + + int nEmission = 0; // Number of Emissions + double showerT = 0.; // Evolution and Splitting Variables + double showerZ = 0.; + double showerY = 0.; + int showerC = 0; // Colour Counter + + // Selecting Winner Emission - Default Values which represent no winner + int winSF = 16; + int winDipole[2] = {-1, -1}; + double winParams[2] = {0., 0.}; + + bool endShower = false; // Shower End Flag - used if T < 1 GeV + + // Analysis Variables -------------------------------------------------------- + + // Event Validity - Momentum and Colour Conservation + bool validity = true; + + // Jet Rates using the Durham Algorithm + double y23 = -50., 
y34 = -50., y45 = -50., y56 = -50.; + + 
// Event Shape Variables - Thrust, Jet Masses and Broadenings + double thr = -50., hjm = -50., ljm = -50., wjb = -50., njb = -50.; + Vec4 t_axis = Vec4(); + + // Dalitz Plot + double dalitz[2] = {-50., -50.}; + + public: + // Constructor --------------------------------------------------------------- + + // Empty, so that we can build our ME, PS onto it + Event() {} + + // Getters ------------------------------------------------------------------- + + // Access Partons in the Event + Parton GetParton(int i) const { return partons[i]; } + int GetSize() const { return nHard + nEmission; } + int GetHard() const { return nHard; } + int GetEmissions() const { return nEmission; } + int GetPartonSize() const { return (nHard + nEmission) - 2; } // -2: e+, e- + + // Get Differential Cross Section + double GetDxs() const { return dxs; } + + // Get Shower Params + double GetShowerT() const { return showerT; } + double GetShowerY() const { return showerY; } + double GetShowerZ() const { return showerZ; } + int GetShowerC() const { return showerC; } + + // Get Winner Emission + int GetWinSF() const { return winSF; } + int GetWinDipole(int i) const { return winDipole[i]; } + double GetWinParam(int i) const { return winParams[i]; } + + // Get Analysis Variables + bool GetValidity() const { return validity; } + + double GetY23() const { return y23; } + double GetY34() const { return y34; } + double GetY45() const { return y45; } + double GetY56() const { return y56; } + double GetThr() const { return thr; } + double GetHJM() const { return hjm; } + double GetLJM() const { return ljm; } + double GetWJB() const { return wjb; } + double GetNJB() const { return njb; } + + Vec4 GetTAxis() const { return t_axis; } + + double GetDalitz(int i) const { return dalitz[i]; } + + // Setters ------------------------------------------------------------------- + + // Add / Replace Parton + void SetParton(int i, Parton parton) { partons[i] = parton; } + + // Not used in ME [HOST] + void 
SetPartonPid(int i, int pid) { partons[i].SetPid(pid); } + void SetPartonMom(int i, Vec4 mom) { partons[i].SetMom(mom); } + void SetPartonCol(int i, int col) { partons[i].SetCol(col); } + void SetPartonAntiCol(int i, int anticol) { partons[i].SetAntiCol(anticol); } + + // Set Differential Cross Section and nHard + void SetDxs(double dxs) { this->dxs = dxs; } + void SetHard(int nHard) { this->nHard = nHard; } + + // Adjust and Increment Number of Emissions + void SetEmissions(int nEmission) { this->nEmission = nEmission; } + void IncrementEmissions() { nEmission++; } + + // Set Shower Params + void SetShowerT(double showerT) { this->showerT = showerT; } + void SetShowerY(double showerY) { this->showerY = showerY; } + void SetShowerZ(double showerZ) { this->showerZ = showerZ; } + + void SetShowerC(int showerC) { this->showerC = showerC; } + void IncrementShowerC() { showerC++; } + + // Set Winner Emission + void SetWinSF(int winSF) { this->winSF = winSF; } + void SetWinDipole(int i, int winDipole) { this->winDipole[i] = winDipole; } + void SetWinParam(int i, double winParams) { this->winParams[i] = winParams; } + + // Set Analysis Variables + void SetValidity(bool validity) { this->validity = validity; } + + void SetY23(double y23) { this->y23 = y23; } + void SetY34(double y34) { this->y34 = y34; } + void SetY45(double y45) { this->y45 = y45; } + void SetY56(double y56) { this->y56 = y56; } + void SetThr(double thr) { this->thr = thr; } + void SetHJM(double hjm) { this->hjm = hjm; } + void SetLJM(double ljm) { this->ljm = ljm; } + void SetWJB(double wjb) { this->wjb = wjb; } + void SetNJB(double njb) { this->njb = njb; } + + void SetTAxis(Vec4 t_axis) { this->t_axis = t_axis; } + + void SetDalitz(double x1, double x2) { + dalitz[0] = x1; + dalitz[1] = x2; + } + + // Member Functions ---------------------------------------------------------- + + // Validation of Result Data + bool Validate() { + Vec4 psum = Vec4(); + + std::vector<int> csum(100, 0); + + for (int i = 
0; i < GetSize(); i++) { + Parton p = GetParton(i); + + Vec4 pmom = p.GetMom(); + int pcol = p.GetCol(); + int pAntiCol = p.GetAntiCol(); + + psum = psum + pmom; + + if (pcol > 0) { + csum[pcol] += 1; + } + + if (pAntiCol > 0) { + csum[pAntiCol] -= 1; + } + } + + bool pcheck = (psum[0] < 1e-12 && psum[1] < 1e-12 && psum[2] < 1e-12 && + psum[3] < 1e-12); + if (!pcheck) { + std::cout << psum << std::endl; + } + + bool ccheck = true; + for (int i = 0; i < 100; i++) { + if (csum[i] != 0) { + std::cout << "Colour " << i << " is not conserved." << std::endl; + ccheck = false; + break; + } + } + + return pcheck && ccheck; + } + + void print_info() const { + std::cout << "Event Information:\n"; + std::cout << "Dxs: " << GetDxs() << "\n"; + std::cout << "Number of Emissions: " << GetEmissions() << "\n"; + std::cout << "Shower T: " << GetShowerT() << "\n"; + std::cout << "Shower Y: " << GetShowerY() << "\n"; + std::cout << "Shower Z: " << GetShowerZ() << "\n"; + std::cout << "Shower C: " << GetShowerC() << "\n"; + std::cout << "Winner SF: " << GetWinSF() << "\n"; + std::cout << "Winner Dipole 1: " << GetWinDipole(0) << "\n"; + std::cout << "Winner Dipole 2: " << GetWinDipole(1) << "\n"; + std::cout << "Winner Params 1: " << GetWinParam(0) << "\n"; + std::cout << "Winner Params 2: " << GetWinParam(1) << "\n"; + std::cout << "Partons:\n"; + for (int i = 0; i < GetSize(); i++) { + Parton parton = GetParton(i); + std::cout << " Parton " << i << ":\n"; + std::cout << " Pid: " << parton.GetPid() << "\n"; + std::cout << " Mom: " << parton.GetMom() << "\n"; + std::cout << " Col: " << parton.GetCol() << "\n"; + std::cout << " AntiCol: " << parton.GetAntiCol() << "\n"; + } + } +}; + +#endif // EVENT_H_ \ No newline at end of file diff --git a/gaps-1.1/cpp-shower/base/include/histogram.h b/gaps-1.1/cpp-shower/base/include/histogram.h new file mode 100644 index 0000000000000000000000000000000000000000..82163b67594d04fa79481a5628c3c767b12332db --- /dev/null +++ 
b/gaps-1.1/cpp-shower/base/include/histogram.h @@ -0,0 +1,331 @@ +#ifndef HISTOGRAM_H_ +#define HISTOGRAM_H_ + +#include <fstream> +#include <iomanip> +#include <sstream> +#include <string> + +#include "base.h" + +// Bin1D class +class Bin1D { + public: + double xmin, xmax, w, w2, wx, wx2, n; + + public: + Bin1D(double xmin, double xmax) + : xmin(xmin), xmax(xmax), w(0.), w2(0.), wx(0.), wx2(0.), n(0.) {} + + std::string Format(const std::string& tag) const { + std::stringstream ss; + ss << std::scientific << std::setprecision(6); + ss << tag << "\t" << tag << "\t" << w << "\t" << w2 << "\t" << wx << "\t" + << wx2 << "\t" << static_cast<int>(n); + return ss.str(); + } + + std::string ToString() const { + std::stringstream ss; + ss << std::scientific << std::setprecision(6); + ss << xmin << "\t" << xmax << "\t" << w << "\t" << w2 << "\t" << wx << "\t" + << wx2 << "\t" << static_cast<int>(n); + return ss.str(); + } + + double Width() const { return xmax - xmin; } + + void Fill(double x, double weight) { + this->w += weight; + w2 += weight * weight; + wx += weight * x; + wx2 += weight * x * x; // sumwx2 is the weighted second moment, sum(w * x^2) + n += 1.; + } + + void ScaleW(double scale) { + w *= scale; + w2 *= scale * scale; + wx *= scale; + wx2 *= scale; // linear in the weight, like wx + } +}; + +class Bin2D { + public: + double xmin, xmax, ymin, ymax, w, w2, wx, wx2, wy, wy2, wxy, n; + + public: + Bin2D(double xmin, double xmax, double ymin, double ymax) + : xmin(xmin), + xmax(xmax), + ymin(ymin), + ymax(ymax), + w(0.), + w2(0.), + wx(0.), + wx2(0.), + wy(0.), + wy2(0.), + wxy(0.), + n(0.) 
{} + + std::string Format(const std::string& tag) const { + std::stringstream ss; + ss << std::scientific << std::setprecision(6); + ss << tag << "\t" << tag << "\t" << w << "\t" << w2 << "\t" << wx << "\t" + << wx2 << "\t" << wy << "\t" << wy2 << "\t" << static_cast<int>(n); + return ss.str(); + } + + std::string ToString() const { + std::stringstream ss; + ss << std::scientific << std::setprecision(6); + ss << xmin << "\t" << xmax << "\t" << ymin << "\t" << ymax << "\t" << w + << "\t" << w2 << "\t" << wx << "\t" << wx2 << "\t" << wy << "\t" << wy2 + << "\t" << wxy << "\t" << static_cast<int>(n); + return ss.str(); + } + + double WidthX() const { return xmax - xmin; } + double WidthY() const { return ymax - ymin; } + + void Fill(double x, double y, double weight) { + this->w += weight; + w2 += weight * weight; + wx += weight * x; + wx2 += weight * x * x; // sumwx2 = sum(w * x^2) + wy += weight * y; + wy2 += weight * y * y; // sumwy2 = sum(w * y^2) + wxy += weight * x * y; + n += 1.; + } + + void ScaleW(double scale) { + w *= scale; + w2 *= scale * scale; + wx *= scale; + wx2 *= scale; // linear in the weight + wy *= scale; + wy2 *= scale; // linear in the weight + wxy *= scale; // linear in the weight + } +}; + +// Histo1D class +class Histo1D { + public: + std::string name; + std::vector<Bin1D> bins; + Bin1D uflow; + Bin1D oflow; + Bin1D total; + double scale; + + public: + // Constructor for Histo1D + Histo1D(double xmin = 0., double xmax = 1., const std::string& name = "hst") + : name(name), + uflow(xmin - 100., xmin), + oflow(xmax, xmax + 100.), + total(xmin - 100., xmax + 100.), + scale(1.) 
{ + double width = (xmax - xmin) / nBins; + for (int i = 0; i < nBins; ++i) { + double xlow = xmin + i * width; + double xhigh = xlow + width; + bins.push_back(Bin1D(xlow, xhigh)); + } + } + + std::string ToString() const { + std::stringstream ss; + ss << "BEGIN YODA_HISTO1D " << name << "\n\n"; + ss << "Path=" << name << "\n\n"; + ss << "ScaledBy=" << scale << "\n"; + ss << "Title=\nType=Histo1D\n"; + ss << "# ID\tID\tsumw\tsumw2\tsumwx\tsumwx2\tnumEntries\n"; + ss << total.Format("Total") << "\n"; + ss << uflow.Format("Underflow") << "\n"; + ss << oflow.Format("Overflow") << "\n"; + ss << "# xlow\txhigh\tsumw\tsumw2\tsumwx\tsumwx2\tnumEntries\n"; + for (const auto& bin : bins) { + ss << bin.ToString() << "\n"; + } + ss << "END YODA_HISTO1D\n\n"; + return ss.str(); + } + + void Fill(double x, double w) { + int l = 0; + int r = bins.size() - 1; + int c = (l + r) / 2; + double a = bins[c].xmin; + + while (r - l > 1) { + if (x < a) { + r = c; + } else { + l = c; + } + c = (l + r) / 2; + a = bins[c].xmin; + } + + if (x > bins[r].xmin) { + if (x > bins[r].xmax) { + oflow.Fill(x, w); + } else { + bins[r].Fill(x, w); + } + } else if (x < bins[l].xmin) { + uflow.Fill(x, w); + } else { + bins[l].Fill(x, w); + } + + total.Fill(x, w); + } + + void ScaleW(double scale) { + for (auto& bin : bins) { + bin.ScaleW(scale); + } + uflow.ScaleW(scale); + oflow.ScaleW(scale); + total.ScaleW(scale); + this->scale *= scale; + } + + void Write(const std::string& filename) const { + std::ofstream file; + file.open(filename, std::ios::out | std::ios::app); + file << ToString(); + file.close(); + } +}; + +// Histo2D class +class Histo2D { + public: + std::string name; + std::vector<std::vector<Bin2D>> bins; + Bin2D uflow; + Bin2D oflow; + Bin2D total; + double scale; + + public: + // Constructor for Histo2D + Histo2D(double xmin = 0., double xmax = 1., double ymin = 0., + double ymax = 1., const std::string& name = "hst") + : name(name), + uflow(xmin - 100., xmin, ymin - 100., ymin), + 
oflow(xmax, xmax + 100., ymax, ymax + 100.), + total(xmin - 100., xmax + 100., ymin - 100., ymax + 100.), + scale(1.) { + double xwidth = (xmax - xmin) / nBins2D; + double ywidth = (ymax - ymin) / nBins2D; + for (int i = 0; i < nBins2D; ++i) { + double xlow = xmin + i * xwidth; + double xhigh = xlow + xwidth; + std::vector<Bin2D> binRow; + for (int j = 0; j < nBins2D; ++j) { + double ylow = ymin + j * ywidth; + double yhigh = ylow + ywidth; + binRow.push_back(Bin2D(xlow, xhigh, ylow, yhigh)); + } + bins.push_back(binRow); + } + } + + std::string ToString() const { + std::stringstream ss; + ss << "BEGIN YODA_HISTO2D " << name << "\n\n"; + ss << "Path=" << name << "\n\n"; + ss << "ScaledBy=" << scale << "\n"; + ss << "Title=\nType=Histo2D\n"; + ss << "# ID\tID\tsumw\tsumw2\tsumwx\tsumwx2\tsumwy\tsumwy2\tnumEntries\n"; + ss << total.Format("Total") << "\n"; + ss << "# " + "xlow\txhigh\tylow\tyhigh\tsumw\tsumw2\tsumwx\tsumwx2\tsumwy\tsumwy2" + "\tnumEntries\n"; + for (const auto& binRow : bins) { + for (const auto& bin : binRow) { + ss << bin.ToString() << "\n"; + } + } + ss << "END YODA_HISTO2D\n\n"; + return ss.str(); + } + + void Fill(double x, double y, double w) { + // Find the bin for the x-coordinate + int lx = 0; + int rx = bins.size() - 1; + int cx = (lx + rx) / 2; + double ax = bins[cx][0].xmin; + + while (rx - lx > 1) { + if (x < ax) { + rx = cx; + } else { + lx = cx; + } + cx = (lx + rx) / 2; + ax = bins[cx][0].xmin; + } + + // Find the bin for the y-coordinate + int ly = 0; + int ry = bins[0].size() - 1; + int cy = (ly + ry) / 2; + double ay = bins[0][cy].ymin; + + while (ry - ly > 1) { + if (y < ay) { + ry = cy; + } else { + ly = cy; + } + cy = (ly + ry) / 2; + ay = bins[0][cy].ymin; + } + + // Fill the appropriate bin + if (x > bins[rx][0].xmin && y > bins[0][ry].ymin) { + if (x > bins[rx][0].xmax || y > bins[0][ry].ymax) { + oflow.Fill(x, y, w); + } else { + bins[rx][ry].Fill(x, y, w); + } + } else if (x < bins[lx][0].xmin || y < bins[0][ly].ymin) { + 
uflow.Fill(x, y, w); + } else { + bins[lx][ly].Fill(x, y, w); + } + + total.Fill(x, y, w); + } + + void ScaleW(double scale) { + for (auto& binRow : bins) { + for (auto& bin : binRow) { + bin.ScaleW(scale); + } + } + uflow.ScaleW(scale); + oflow.ScaleW(scale); + total.ScaleW(scale); + this->scale *= scale; + } + + void Write(const std::string& filename) const { + std::ofstream file; + file.open(filename, std::ios::out | std::ios::app); + file << ToString(); + file.close(); + } +}; + +#endif // HISTOGRAM_H_ \ No newline at end of file diff --git a/gaps-1.1/cpp-shower/base/include/parton.h b/gaps-1.1/cpp-shower/base/include/parton.h new file mode 100644 index 0000000000000000000000000000000000000000..35f292d156022c13601aa0cb9212e25bf2a55d4a --- /dev/null +++ b/gaps-1.1/cpp-shower/base/include/parton.h @@ -0,0 +1,36 @@ +#ifndef PARTON_H_ +#define PARTON_H_ + +// Partons have Vec4 Momentum, Vec4 #includes Base +#include "vec4.h" + +class Parton { + public: + // Constructor + Parton(int pid = 0, Vec4 momentum = Vec4(), int col = 0, int anticol = 0) + : pid(pid), mom(momentum), col(col), anticol(anticol) {} + + // Getters and Setters + int GetPid() const { return pid; } + Vec4 GetMom() const { return mom; } + int GetCol() const { return col; } + int GetAntiCol() const { return anticol; } + + void SetPid(int pid) { this->pid = pid; } + void SetMom(Vec4 mom) { this->mom = mom; } + void SetCol(int col) { this->col = col; } + void SetAntiCol(int anticol) { this->anticol = anticol; } + + // Boolean - If two partons are in a Colour Connected Dipole + bool IsColorConnected(Parton p) { + return (col > 0 && col == p.anticol) || (anticol > 0 && anticol == p.col); + } + + private: + int pid; + Vec4 mom; + int col; + int anticol; +}; + +#endif // PARTON_H_ diff --git a/gaps-1.1/cpp-shower/base/include/qcd.h b/gaps-1.1/cpp-shower/base/include/qcd.h new file mode 100644 index 0000000000000000000000000000000000000000..3ac5682ed0543e1813915335ea2bdfd75350a7a5 --- /dev/null +++ 
b/gaps-1.1/cpp-shower/base/include/qcd.h @@ -0,0 +1,87 @@ +#ifndef QCD_H_ +#define QCD_H_ + +// Base Class, with all the important definitions +#include "base.h" + +const double kNC = 3.; +const double kTR = 0.5; +const double kCA = kNC; +const double kCF = (kNC * kNC - 1.) / (2. * kNC); + +class AlphaS { + private: + int order; + double mc2, mb2, mz2, asmz, asmb, asmc; + + public: + // Constructor + AlphaS(double mz, double asmz, int order = 1, double mb = 4.75, + double mc = 1.27) + : order(order), + mc2(mc * mc), + mb2(mb * mb), + mz2(mz * mz), + asmz(asmz), + asmb((*this)(mb2)), + asmc((*this)(mc2)) {} + + // Beta functions + double Beta0(int nf) const { return (11. / 6. * kCA) - (2. / 3. * kTR * nf); } + + double Beta1(int nf) const { + return (17. / 6. * kCA * kCA) - ((5. / 3. * kCA + kCF) * kTR * nf); + } + + // Alpha_s at order 0 and 1 (One-Loop and Two-Loop) + double As0(double t) const { + double tref, asref, b0; + if (t >= mb2) { + tref = mz2; + asref = asmz; + b0 = Beta0(5) / (2. * M_PI); + } else if (t >= mc2) { + tref = mb2; + asref = asmb; + b0 = Beta0(4) / (2. * M_PI); + } else { + tref = mc2; + asref = asmc; + b0 = Beta0(3) / (2. * M_PI); + } + return 1. / (1. / asref + b0 * log(t / tref)); + } + + double As1(double t) const { + double tref, asref, b0, b1, w; + if (t >= mb2) { + tref = mz2; + asref = asmz; + b0 = Beta0(5) / (2. * M_PI); + b1 = Beta1(5) / pow(2. * M_PI, 2); + } else if (t >= mc2) { + tref = mb2; + asref = asmb; + b0 = Beta0(4) / (2. * M_PI); + b1 = Beta1(4) / pow(2. * M_PI, 2); + } else { + tref = mc2; + asref = asmc; + b0 = Beta0(3) / (2. * M_PI); + b1 = Beta1(3) / pow(2. * M_PI, 2); + } + w = 1. + b0 * asref * log(t / tref); + return asref / w * (1. 
- b1 / b0 * asref * log(w) / w); + } + + // Call operator to calculate alpha_s + double operator()(double t) { + if (order == 0) { + return As0(t); + } else { + return As1(t); + } + } +}; + +#endif // QCD_H_ \ No newline at end of file diff --git a/gaps-1.1/cpp-shower/base/include/vec4.h b/gaps-1.1/cpp-shower/base/include/vec4.h new file mode 100644 index 0000000000000000000000000000000000000000..d4b032ea475ea18cfc31e2f7a127d8928c381618 --- /dev/null +++ b/gaps-1.1/cpp-shower/base/include/vec4.h @@ -0,0 +1,131 @@ +#ifndef VEC4_H_ +#define VEC4_H_ + +// Base Class, with all the important definitions +#include "base.h" + +class Vec4 { + private: + double E, px, py, pz; + + public: + // Constructor - Define key attributes Energy and Momentum + Vec4(double E = 0., double px = 0., double py = 0., double pz = 0.) + : E(E), px(px), py(py), pz(pz) {} + + // Get Method to Obtain Attribute Value + double operator[](int i) const { + switch (i) { + case 0: + return E; + case 1: + return px; + case 2: + return py; + case 3: + return pz; + default: + return 0; + } + } + + // Print a Column Vector with the attributes + friend std::ostream& operator<<(std::ostream& os, const Vec4& v) { + os << "(" << v.E << "," << v.px << "," << v.py << "," << v.pz << ")"; + return os; + } + + // Simple Mathematics with Four vectors + Vec4 operator+(const Vec4& v) const { + return Vec4(E + v.E, px + v.px, py + v.py, pz + v.pz); + } + + Vec4 operator-() const { return Vec4(-E, -px, -py, -pz); } + + Vec4 operator-(const Vec4& v) const { + return Vec4(E - v.E, px - v.px, py - v.py, pz - v.pz); + } + + // Multiplication (and Dot Product) + double operator*(const Vec4& v) const { + return E * v.E - px * v.px - py * v.py - pz * v.pz; + } + + Vec4 operator*(double v) const { return Vec4(E * v, px * v, py * v, pz * v); } + + // Division + Vec4 operator/(double v) const { return Vec4(E / v, px / v, py / v, pz / v); } + + // Magnitude of the Vector + double M2() const { return (*this) * (*this); } + + 
double M() const { + double m2 = M2(); + return m2 > 0 ? sqrt(m2) : 0; + } + + // 3 Momenta + double P2() const { return px * px + py * py + pz * pz; } + + double P() const { + double p2 = P2(); + return p2 > 0 ? sqrt(p2) : 0; + } + + // Transverse Momenta + double PT2() const { return px * px + py * py; } + + double PT() const { + double pt2 = PT2(); + return pt2 > 0 ? sqrt(pt2) : 0; + } + + // Angles + double Theta() const { + double p = P(); + return p != 0 ? acos(pz / p) : 0; + } + + double Phi() const { + if (px == 0 && py == 0) { + return 0.; + } else { + return atan2(py, px); + } + } + + double Rapidity() const { + double denominator = (E - pz); + return denominator != 0 ? 0.5 * log((E + pz) / denominator) : 0; + } + + double Eta() const { + double theta = Theta(); + return -log(tan(theta / 2.)); + } + + // Three Momenta Dot and Cross Product + double Dot(const Vec4& v) const { return px * v.px + py * v.py + pz * v.pz; } + + Vec4 Cross(const Vec4& v) const { + return Vec4(0., py * v.pz - pz * v.py, pz * v.px - px * v.pz, + px * v.py - py * v.px); + } + + // Boosts + Vec4 Boost(const Vec4& v) const { + double rsq = M(); + double v0 = (E * v.E - px * v.px - py * v.py - pz * v.pz) / rsq; + double c1 = (v.E + v0) / (rsq + E); + return Vec4(v0, v.px - c1 * px, v.py - c1 * py, v.pz - c1 * pz); + } + + Vec4 BoostBack(const Vec4& v) const { + double rsq = M(); + double v0 = (E * v.E + px * v.px + py * v.py + pz * v.pz) / rsq; + double c1 = (v.E + v0) / (rsq + E); + return Vec4(v0, v.px + c1 * px, v.py + c1 * py, v.pz + c1 * pz); + } +}; + +#endif // VEC4_H_ diff --git a/gaps-1.1/cpp-shower/main.cpp b/gaps-1.1/cpp-shower/main.cpp new file mode 100644 index 0000000000000000000000000000000000000000..17b7c115e568b6a7bd1008ae00d87a5d3b0fe3f6 --- /dev/null +++ b/gaps-1.1/cpp-shower/main.cpp @@ -0,0 +1,174 @@ +// To Measure Wall Clock Time and Write to File +#include <chrono> +#include <fstream> + +// Base Components +#include "base.h" + +// ME +#include "matrix.h" + +// 
Shower +#include "shower.h" + +// Analysis +#include "observables.h" + +/** + * GAPS: C++ Shower for Comparison + * ------------------------------------ + * + * This program is a translation of S. Höche's "Introduction to Parton Showers" + * Python tutorial[1], with added functionality for parallelisation, an Event + * class and event shape analyses. + * + * The purpose of this program is to compare the performance of the C++ and + * CUDA versions of the shower, and to compare the performance of the C++ with + * parallelisation and CUDA. + * + * [1] https://arxiv.org/abs/1411.4085 and MCNET-CTEQ 2021 Tutorial + */ + +// ----------------------------------------------------------------------------- + +void runGenerator(const int& N, const double& E, const std::string& filename) { + // --------------------------------------------------------------------------- + // Give some information about the simulation + + std::cout << "-------------------------------------------------" << std::endl; + std::cout << "| GAPS: C++ Shower for Comparison |" << std::endl; + std::cout << "-------------------------------------------------" << std::endl; + std::cout << "Process: e+ e- --> q qbar" << std::endl; + std::cout << "Number of Events: " << N << std::endl; + std::cout << "Centre of Mass Energy: " << E << " GeV" << std::endl; + std::cout << "" << std::endl; + + // --------------------------------------------------------------------------- + // Initialisation + + std::cout << "Initialising..." << std::endl; + std::vector<Event> events(N); + + // --------------------------------------------------------------------------- + // Matrix Element Generation + + std::cout << "Generating Matrix Elements (C++)..." << std::endl; + auto start = std::chrono::high_resolution_clock::now(); + + Matrix xs(asmz, E); + + for (int i = 0; i < N; i++) { + xs.GenerateLOPoint(events[i]); // Random Seed + // (same seed option is off in matrix.cpp!) 
+ } + + auto end = std::chrono::high_resolution_clock::now(); + std::chrono::duration<double> diff_me = end - start; + + // --------------------------------------------------------------------------- + // Showering + + std::cout << "Showering Partons (C++)..." << std::endl; + start = std::chrono::high_resolution_clock::now(); + + Shower sh; + + for (int i = 0; i < N; i++) { + sh.Run(events[i]); // Random Seed + // (same seed option is off in shower.cpp!) + } + + end = std::chrono::high_resolution_clock::now(); + std::chrono::duration<double> diff_sh = end - start; + + // --------------------------------------------------------------------------- + // Analysis + + std::cout << "Analysing Events (C++)..." << std::endl; + start = std::chrono::high_resolution_clock::now(); + + // Remove existing file + std::remove(filename.c_str()); + + Analysis an; + + // Analyze events (Including Validation of Colour and Momentum Conservation) + for (int i = 0; i < N; i++) { + an.Analyze(events[i]); + } + + // Storage + an.Finalize(filename); + + end = std::chrono::high_resolution_clock::now(); + std::chrono::duration<double> diff_an = end - start; + + // --------------------------------------------------------------------------- + // Results + + double diff = diff_me.count() + diff_sh.count() + diff_an.count(); + + std::cout << "" << std::endl; + std::cout << "EVENT GENERATION COMPLETE" << std::endl; + std::cout << "" << std::endl; + std::cout << "ME Time: " << diff_me.count() << " s" << std::endl; + std::cout << "Sh Time: " << diff_sh.count() << " s" << std::endl; + std::cout << "An Time: " << diff_an.count() << " s" << std::endl; + std::cout << "" << std::endl; + std::cout << "Total Time: " << diff << " s" << std::endl; + std::cout << "" << std::endl; + + // Open the file in append mode. This will create the file if it doesn't + // exist. + std::ofstream outfile("cpp-time.dat", std::ios_base::app); + + // Write the ME, shower, analysis and total times to the file. 
+ outfile << diff_me.count() << ", " << diff_sh.count() << ", " + << diff_an.count() << ", " << diff << std::endl; + + // Close the file. + outfile.close(); + + std::cout << "Histograms written to " << filename << std::endl; + std::cout << "Timing data written to cpp-time.dat" << std::endl; + std::cout << "------------------------------------------------" << std::endl; +} +// ----------------------------------------------------------------------------- + +int main(int argc, char* argv[]) { + // Import Settings + int N = argc > 1 ? atoi(argv[1]) : 10000; + double E = argc > 2 ? atof(argv[2]) : 91.2; + + // If more than maxEvents, run in batches + if (N > maxEvents) { + std::cout << "-------------------------------------------------" + << std::endl; + std::cout << "More Events than GPU Can Handle at Once!" << std::endl; + std::cout << "Running in batches..." << std::endl; + std::cout << "Please use rivet-merge to combine runs" << std::endl; + + // Split into batches + int nBatches = N / maxEvents; + int nRemainder = N % maxEvents; + std::cout << "Number of Batches: " << nBatches << std::endl; + std::cout << "Size of Remainder: " << nRemainder << std::endl; + + // Run in batches + for (int i = 0; i < nBatches; i++) { + std::string filename = "cpp-" + std::to_string(i) + ".yoda"; + runGenerator(maxEvents, E, filename); + } + + // Run remainder + if (nRemainder > 0) { + std::string filename = "cpp-" + std::to_string(nBatches) + ".yoda"; + runGenerator(nRemainder, E, filename); + } + } else { + runGenerator(N, E, "cpp.yoda"); + } + + return 0; +} +// ----------------------------------------------------------------------------- diff --git a/gaps-1.1/cpp-shower/matrix/CMakeLists.txt b/gaps-1.1/cpp-shower/matrix/CMakeLists.txt new file mode 100644 index 0000000000000000000000000000000000000000..fa77830f48948f10463e363b186f227bafdb9c70 --- /dev/null +++ b/gaps-1.1/cpp-shower/matrix/CMakeLists.txt @@ -0,0 +1,9 @@ +cmake_minimum_required(VERSION 3.10) +project(matrix) + 
+set(CMAKE_CXX_STANDARD 17) + +include_directories(include ../base/include) +file(GLOB SOURCES "src/*.cpp") + +add_library(matrix ${SOURCES}) \ No newline at end of file diff --git a/gaps-1.1/cpp-shower/matrix/include/matrix.h b/gaps-1.1/cpp-shower/matrix/include/matrix.h new file mode 100644 index 0000000000000000000000000000000000000000..ab27db2768074ad886dfbb2060636e6478537391 --- /dev/null +++ b/gaps-1.1/cpp-shower/matrix/include/matrix.h @@ -0,0 +1,23 @@ +#ifndef MATRIX_H_ +#define MATRIX_H_ + +// Parton includes Base, which has the CUDA libraries +#include "event.h" + +// Matrix class +class Matrix { + private: + double alphas, ecms, MZ2, GZ2, alpha, sin2tw, amin, ye, ze, ws; + + public: + // Constructor + Matrix(double alphas = asmz, double ecms = 91.2); + + // Leading Order Matrix Element Generation + double ME2(int fl, double s, double t); + + // Generate a leading order point + void GenerateLOPoint(Event &ev); +}; + +#endif // MATRIX_H_ \ No newline at end of file diff --git a/gaps-1.1/cpp-shower/matrix/src/matrix.cpp b/gaps-1.1/cpp-shower/matrix/src/matrix.cpp new file mode 100644 index 0000000000000000000000000000000000000000..1ce172091c75a301b1ab66d011cd80c35fc63091 --- /dev/null +++ b/gaps-1.1/cpp-shower/matrix/src/matrix.cpp @@ -0,0 +1,80 @@ +#include "matrix.h" + +// Constructor +Matrix::Matrix(double alphas, double ecms) + : alphas(alphas), + ecms(ecms), + MZ2(pow(91.1876, 2.)), + GZ2(pow(2.4952, 2.)), + alpha(1. / 128.802), + sin2tw(0.22293), + amin(1.e-10), + ye(0.5), + ze(0.01), + ws(0.25) {} + +// Leading Order Matrix Element Generation +double Matrix::ME2(int fl, double s, double t) { + double qe = -1.; + double ae = -0.5; + double ve = ae - 2. * qe * sin2tw; + double qf = (fl == 2 || fl == 4) ? 2. / 3. : -1. / 3.; + double af = (fl == 2 || fl == 4) ? 0.5 : -0.5; + double vf = af - 2. * qf * sin2tw; + double kappa = 1. / (4. * sin2tw * (1. - sin2tw)); + double chi1 = kappa * s * (s - MZ2) / (pow(s - MZ2, 2.) 
+ GZ2 * MZ2); + double chi2 = pow(kappa * s, 2.) / (pow(s - MZ2, 2.) + GZ2 * MZ2); + double term1 = (1. + pow(1. + 2. * t / s, 2.)) * + (pow(qf * qe, 2.) + 2. * (qf * qe * vf * ve) * chi1 + + (ae * ae + ve * ve) * (af * af + vf * vf) * chi2); + double term2 = (1. + 2. * t / s) * (4. * qe * qf * ae * af * chi1 + + 8. * ae * ve * af * vf * chi2); + return pow(4. * M_PI * alpha, 2.) * 3. * (term1 + term2); +} + +// Generate a point +void Matrix::GenerateLOPoint(Event &ev) { + thread_local std::random_device rd; + thread_local std::mt19937 gen(rd()); + + // Same seed option. Turn off by commenting when not in use! + // Having an if statement if no seed is given would not be a fair comparison + // to the GPU, so commented out is better for now. Maybe in the future. + // thread_local std::mt19937 gen(seed); + + std::uniform_real_distribution<> dis(0., 1.); // Uniform distribution + + double ct = 2. * dis(gen) - 1.; + double st = std::sqrt(1. - ct * ct); + double phi = 2. * M_PI * dis(gen); + + int fl = std::rand() % 5 + 1; // Faster than using dis(gen) + double p0 = 0.5 * ecms; + + Vec4 pa(p0, 0., 0., p0); + Vec4 pb(p0, 0., 0., -p0); + Vec4 p1(p0, p0 * st * cos(phi), p0 * st * sin(phi), p0 * ct); + Vec4 p2(p0, -p0 * st * cos(phi), -p0 * st * sin(phi), -p0 * ct); + + double lome = ME2(fl, (pa + pb).M2(), (pa - p1).M2()); + + // Calculate the differential cross section + // 5 = 5 flavours (?) + // 3.89379656e8 = Convert from GeV^-2 to pb + // 8 pi = Standard Phase Space Factor + // pow(matrix->GetECMS(), 2.) = center of mass energy squared, s + double dxs = + 5. * lome * 3.89379656e8 / (8. * M_PI) / (2. 
* std::pow(ecms, 2.)); + + Parton p[4] = {Parton(-11, -pa, 0, 0), Parton(11, -pb, 0, 0), + Parton(fl, p1, 1, 0), Parton(-fl, p2, 0, 1)}; + + // Set the Partons + for (int i = 0; i < 4; i++) { + ev.SetParton(i, p[i]); + } + + // Set the ME Params + ev.SetDxs(dxs); + ev.SetHard(4); +} \ No newline at end of file diff --git a/gaps-1.1/cpp-shower/observables/CMakeLists.txt b/gaps-1.1/cpp-shower/observables/CMakeLists.txt new file mode 100644 index 0000000000000000000000000000000000000000..b08539307937bf78af8679ad4bb7114dd86f3d82 --- /dev/null +++ b/gaps-1.1/cpp-shower/observables/CMakeLists.txt @@ -0,0 +1,9 @@ +cmake_minimum_required(VERSION 3.10) +project(observables) + +set(CMAKE_CXX_STANDARD 17) + +include_directories(include ../base/include) +file(GLOB SOURCES "src/*.cpp") + +add_library(observables ${SOURCES}) \ No newline at end of file diff --git a/gaps-1.1/cpp-shower/observables/include/dalitz.h b/gaps-1.1/cpp-shower/observables/include/dalitz.h new file mode 100644 index 0000000000000000000000000000000000000000..e3eb3258021b9b97cd50d6038198b4b72ceaaf3b --- /dev/null +++ b/gaps-1.1/cpp-shower/observables/include/dalitz.h @@ -0,0 +1,9 @@ +#ifndef DALITZ_H_ +#define DALITZ_H_ + +#include "event.h" + +// Dalitz Plot +void CalculateDalitz(Event& ev); + +#endif // DALITZ_H_ \ No newline at end of file diff --git a/gaps-1.1/cpp-shower/observables/include/eventshapes.h b/gaps-1.1/cpp-shower/observables/include/eventshapes.h new file mode 100644 index 0000000000000000000000000000000000000000..0e82a721790061f8f7b04eeac358636061e8b6fb --- /dev/null +++ b/gaps-1.1/cpp-shower/observables/include/eventshapes.h @@ -0,0 +1,11 @@ +#ifndef EVENTSHAPES_H_ +#define EVENTSHAPES_H_ + +#include "event.h" + +// Event Shapes +void bubbleSort(Vec4* moms, int n); +void CalculateThrust(Event& ev); +void CalculateJetMBr(Event& ev); + +#endif // EVENTSHAPES_H_ \ No newline at end of file diff --git a/gaps-1.1/cpp-shower/observables/include/jetrates.h 
b/gaps-1.1/cpp-shower/observables/include/jetrates.h new file mode 100644 index 0000000000000000000000000000000000000000..ff348342f823aaa03bfda6e0674a35a1fd2ab196 --- /dev/null +++ b/gaps-1.1/cpp-shower/observables/include/jetrates.h @@ -0,0 +1,10 @@ +#ifndef JETRATES_H_ +#define JETRATES_H_ + +#include "event.h" + +// Jet Rates using the Durham Algorithm +double Yij(const Vec4& p, const Vec4& q, double ecm2); +void Cluster(Event& ev); + +#endif // JETRATES_H_ \ No newline at end of file diff --git a/gaps-1.1/cpp-shower/observables/include/observables.h b/gaps-1.1/cpp-shower/observables/include/observables.h new file mode 100644 index 0000000000000000000000000000000000000000..d202855173709cc3a24f29a81713a41319be77a0 --- /dev/null +++ b/gaps-1.1/cpp-shower/observables/include/observables.h @@ -0,0 +1,47 @@ +#ifndef DURHAM_H_ +#define DURHAM_H_ + +#include "dalitz.h" +#include "event.h" +#include "eventshapes.h" +#include "histogram.h" +#include "jetrates.h" + +/** + * Slight Difference between C++ and CUDA codes + * -------------------------------------------- + * + * Here, we don't need Analyze and Finalize functions to be outside the class. + * So we can just include them in the class definition. Also we have to use the + * same class for all events! + */ +class Analysis { + public: + Histo1D hists[10]; + Histo2D dalitz; + + double wtot; // Scale by Weight for 1/sigma d(sigma)/d Observable + double ntot; // Scale by Number for d(sigma)/d Observable + + public: + Analysis() : wtot(0.), ntot(0.) 
{ + hists[0] = Histo1D(-4.3, -0.3, "/gaps/log10y23\n"); + hists[1] = Histo1D(-4.3, -0.3, "/gaps/log10y34\n"); + hists[2] = Histo1D(-4.3, -0.3, "/gaps/log10y45\n"); + hists[3] = Histo1D(-4.3, -0.3, "/gaps/log10y56\n"); + hists[4] = Histo1D(0., 0.5, "/gaps/tvalue\n"); + hists[5] = Histo1D(0., 0.5, "/gaps/tzoomd\n"); + hists[6] = Histo1D(0., 1., "/gaps/hjm\n"); + hists[7] = Histo1D(0., 0.5, "/gaps/ljm\n"); + hists[8] = Histo1D(0., 0.5, "/gaps/wjb\n"); + hists[9] = Histo1D(0., 0.2, "/gaps/njb\n"); + + dalitz = Histo2D(0., 1., 0., 1., "/gaps/dalitz\n"); + } + + // Member Functions Here, but not in CUDA + void Analyze(Event& ev); + void Finalize(const std::string& filename); +}; + +#endif // DURHAM_H_ \ No newline at end of file diff --git a/gaps-1.1/cpp-shower/observables/src/dalitz.cpp b/gaps-1.1/cpp-shower/observables/src/dalitz.cpp new file mode 100644 index 0000000000000000000000000000000000000000..978e65b8f86ed3ab0e7903bbbc5b5a16554086b1 --- /dev/null +++ b/gaps-1.1/cpp-shower/observables/src/dalitz.cpp @@ -0,0 +1,23 @@ +#include "dalitz.h" + +// Dalitz Plot + +void CalculateDalitz(Event& ev) { + if (!ev.GetValidity() || ev.GetPartonSize() != 3) { + return; + } + + // Obtain Energy from incoming partons + double E = abs(ev.GetParton(0).GetMom()[0] + ev.GetParton(1).GetMom()[0]); + + // By default, element 2 is quark and 3 is antiquark + // i.e. 
emission will be element 4 + Vec4 p1 = ev.GetParton(2).GetMom(); + Vec4 p2 = ev.GetParton(3).GetMom(); + + // Calculate x1 and x2 + double x1 = 2 * p1.P() / E; + double x2 = 2 * p2.P() / E; + + ev.SetDalitz(x1, x2); +} \ No newline at end of file diff --git a/gaps-1.1/cpp-shower/observables/src/eventshapes.cpp b/gaps-1.1/cpp-shower/observables/src/eventshapes.cpp new file mode 100644 index 0000000000000000000000000000000000000000..59566c651e27a4de4792ae539001643cb150f1de --- /dev/null +++ b/gaps-1.1/cpp-shower/observables/src/eventshapes.cpp @@ -0,0 +1,156 @@ +#include "eventshapes.h" + +// Event Shapes + +void bubbleSort(Vec4* moms, int n) { + for (int i = 0; i < n - 1; i++) { + for (int j = 0; j < n - i - 1; j++) { + if (moms[j].P() < moms[j + 1].P()) { + Vec4 temp = moms[j]; + moms[j] = moms[j + 1]; + moms[j + 1] = temp; + } + } + } +} + +void CalculateThrust(Event& ev) { + if (!ev.GetValidity() || ev.GetPartonSize() < 3) { + return; + } + + Vec4 moms[maxPartons]; + for (int i = 2; i < ev.GetSize(); ++i) { + moms[i - 2] = ev.GetParton(i).GetMom(); + } + + bubbleSort(moms, maxPartons); + + double momsum = 0.; + for (int i = 0; i < ev.GetPartonSize(); ++i) { + momsum += moms[i].P(); + } + + double thr = 0.; + Vec4 t_axis = Vec4(); + + for (int k = 1; k < ev.GetPartonSize(); ++k) { + for (int j = 0; j < k; ++j) { + Vec4 tmp_axis = moms[j].Cross(moms[k]); + Vec4 p_thrust = Vec4(); + Vec4 p_combin[4]; + + for (int i = 0; i < ev.GetPartonSize(); ++i) { + if (i != j && i != k) { + if (moms[i].Dot(tmp_axis) >= 0) { + p_thrust = p_thrust + moms[i]; + } else { + p_thrust = p_thrust - moms[i]; + } + } + } + + p_combin[0] = (p_thrust + moms[j] + moms[k]); + p_combin[1] = (p_thrust + moms[j] - moms[k]); + p_combin[2] = (p_thrust - moms[j] + moms[k]); + p_combin[3] = (p_thrust - moms[j] - moms[k]); + + for (int i = 0; i < 4; ++i) { + double temp = p_combin[i].P(); + if (temp > thr) { + thr = temp; + t_axis = p_combin[i]; + } + } + } + } + + thr /= momsum; + thr = 1. 
- thr; + + t_axis = t_axis / (t_axis).P(); + if (t_axis[3] < 0) { + t_axis = t_axis * -1.; + } + + if (thr < 1e-12) { + thr = -5.; + } + + ev.SetThr(thr); + ev.SetTAxis(t_axis); +} + +void CalculateJetMBr(Event& ev) { + if (!ev.GetValidity() || ev.GetPartonSize() < 3) { + return; + } + + Vec4 moms[maxPartons]; + for (int i = 2; i < ev.GetSize(); ++i) { + moms[i - 2] = ev.GetParton(i).GetMom(); + } + + double momsum = 0.; + for (int i = 0; i < ev.GetSize(); ++i) { + momsum += moms[i].P(); + } + + Vec4 p_with, p_against; + int n_with = 0, n_against = 0; + double e_vis = 0., broad_with = 0., broad_against = 0., + broad_denominator = 0.; + + for (int i = 0; i < ev.GetPartonSize(); ++i) { + double mo_para = moms[i].Dot(ev.GetTAxis()); + double mo_perp = (moms[i] - (ev.GetTAxis() * mo_para)).P(); + double enrg = moms[i].P(); + + e_vis += enrg; + broad_denominator += 2. * enrg; + + if (mo_para > 0.) { + p_with = p_with + moms[i]; + broad_with += mo_perp; + n_with++; + } else if (mo_para < 0.) { + p_against = p_against + moms[i]; + broad_against += mo_perp; + n_against++; + } else { + p_with = p_with + (moms[i] * 0.5); + p_against = p_against + (moms[i] * 0.5); + broad_with += 0.5 * mo_perp; + broad_against += 0.5 * mo_perp; + n_with++; + n_against++; + } + } + + double e2_vis = e_vis * e_vis; + + double mass2_with = fabs(p_with.M2() / e2_vis); + double mass2_against = fabs(p_against.M2() / e2_vis); + + double mass_with = sqrt(mass2_with); + double mass_against = sqrt(mass2_against); + + broad_with /= broad_denominator; + broad_against /= broad_denominator; + + double mH = fmax(mass_with, mass_against); + double mL = fmin(mass_with, mass_against); + + double bW = fmax(broad_with, broad_against); + double bN = fmin(broad_with, broad_against); + + if (n_with == 1 || n_against == 1) { + ev.SetHJM(mH); + ev.SetWJB(bW); + } else { + ev.SetHJM(mH); + ev.SetLJM(mL); + ev.SetWJB(bW); + ev.SetNJB(bN); + } +} \ No newline at end of file diff --git 
a/gaps-1.1/cpp-shower/observables/src/jetrates.cpp b/gaps-1.1/cpp-shower/observables/src/jetrates.cpp new file mode 100644 index 0000000000000000000000000000000000000000..840fec6342fda7c317cdea44352fdaf3f9800de5 --- /dev/null +++ b/gaps-1.1/cpp-shower/observables/src/jetrates.cpp @@ -0,0 +1,95 @@ +#include "jetrates.h" + +// Jet Rates + +// Yij function used for the Durham analysis +double Yij(const Vec4& p, const Vec4& q, double ecm2) { + double pq = p[1] * q[1] + p[2] * q[2] + p[3] * q[3]; + double min_pq = std::min(p[0], q[0]); + double max_pq = std::max(pq / std::sqrt(p.P2() * q.P2()), -1.); + return 2. * std::pow(min_pq, 2) * (1. - std::min(max_pq, 1.)) / ecm2; +} + +// Durham Clustering Algorithm +void Cluster(Event& ev) { + if (!ev.GetValidity()) { + return; + } + // For C++, one can use the std::vector class to store the parton 4-momenta. + // However, this would be inefficient for CUDA, so we use a pre-set array size. + // To make the comparison fair, we make this class use the same array size + // as well. 
+ + // Get the center of mass energy squared + double ecm2 = (ev.GetParton(0).GetMom() + ev.GetParton(1).GetMom()).M2(); + + // Extract the 4-momenta of the partons + Vec4 p[maxPartons]; + for (int i = 2; i < ev.GetSize(); ++i) { + p[i - 2] = ev.GetParton(i).GetMom(); + } + + // kt2 will store the kt2 values for each clustering step + // If not changed, set to -1 so we can ignore when histogramming + double kt2[maxPartons] = {-1.}; + int counter = 0; + + // Number of partons (which will change when clustered), lower case to avoid N + int n = ev.GetPartonSize(); + + // imap will store the indices of the partons + int imap[maxPartons]; + for (int i = 0; i < ev.GetPartonSize(); ++i) { + imap[i] = i; + } + + // kt2ij will store the kt2 values for each pair of partons + double kt2ij[maxPartons][maxPartons] = {0.}; + double dmin = 1.; + int ii = 0, jj = 0; + for (int i = 0; i < n; ++i) { + for (int j = 0; j < i; ++j) { + double dij = kt2ij[i][j] = Yij(p[i], p[j], ecm2); + if (dij < dmin) { + dmin = dij; + ii = i; + jj = j; + } + } + } + + // Cluster the partons + while (n > 2) { + --n; + kt2[counter] = dmin; + counter++; + int jjx = imap[jj]; + p[jjx] = p[jjx] + p[imap[ii]]; + for (int i = ii; i < n; ++i) { + imap[i] = imap[i + 1]; + } + for (int j = 0; j < jj; ++j) { + kt2ij[jjx][imap[j]] = Yij(p[jjx], p[imap[j]], ecm2); + } + for (int i = jj + 1; i < n; ++i) { + kt2ij[imap[i]][jjx] = Yij(p[jjx], p[imap[i]], ecm2); + } + dmin = 1.; + for (int i = 0; i < n; ++i) { + for (int j = 0; j < i; ++j) { + double dij = kt2ij[imap[i]][imap[j]]; + if (dij < dmin) { + dmin = dij; + ii = i; + jj = j; + } + } + } + } + + // Store the kt2 values in the output arrays + ev.SetY23(counter > 0 ? log10(kt2[counter - 1 - 0]) : -50.); + ev.SetY34(counter > 1 ? log10(kt2[counter - 1 - 1]) : -50.); + ev.SetY45(counter > 2 ? log10(kt2[counter - 1 - 2]) : -50.); + ev.SetY56(counter > 3 ? 
log10(kt2[counter - 1 - 3]) : -50.); +} diff --git a/gaps-1.1/cpp-shower/observables/src/observables.cpp b/gaps-1.1/cpp-shower/observables/src/observables.cpp new file mode 100644 index 0000000000000000000000000000000000000000..8a1621e21864e9b4c1e763816fd02919820653d4 --- /dev/null +++ b/gaps-1.1/cpp-shower/observables/src/observables.cpp @@ -0,0 +1,69 @@ +#include "observables.h" + +// Observable Analysis + +void Analysis::Analyze(Event& ev) { + // Validate Event + ev.SetValidity(ev.Validate()); + + if (!ev.GetValidity()) { + printf("Invalid Event\n"); + return; + } + + // Cluster + Cluster(ev); + + // Calculate Thrust + CalculateThrust(ev); + + // Calculate JetMBr + CalculateJetMBr(ev); + + /** + * Why is the Dalitz Plot off? + * --------------------------- + * + * While the Dalitz analysis also benefits from the GPU parallelisation, the + * writing of the data to file severely limits the performance, as instead of + * the usual 100 bins, we have 100^2 = 10000 bins. This takes around 0.04s, + * which is minute in the C++ case, but is in fact 40% of the total analysis + * time! So for our tests, we keep this off, to keep our comparisons fair, + * and relevant to the actual GPU effect. + * + * If you want to turn it on, uncomment the lines in this file, and its + * equivalent in the 'observables.cpp' file. 
+ */ + // Calculate Dalitz + // CalculateDalitz(ev); + + // Fill Histograms + hists[0].Fill(ev.GetY23(), ev.GetDxs()); + hists[1].Fill(ev.GetY34(), ev.GetDxs()); + hists[2].Fill(ev.GetY45(), ev.GetDxs()); + hists[3].Fill(ev.GetY56(), ev.GetDxs()); + hists[4].Fill(ev.GetThr(), ev.GetDxs()); + hists[5].Fill(ev.GetThr(), ev.GetDxs()); + hists[6].Fill(ev.GetHJM(), ev.GetDxs()); + hists[7].Fill(ev.GetLJM(), ev.GetDxs()); + hists[8].Fill(ev.GetWJB(), ev.GetDxs()); + hists[9].Fill(ev.GetNJB(), ev.GetDxs()); + + // Dalitz Plot is OFF + // dalitz.Fill(ev.GetDalitz(0), ev.GetDalitz(1), ev.GetDxs()); + + // Weighted Total + wtot += ev.GetDxs(); + ntot += 1.; +} + +void Analysis::Finalize(const std::string& filename) { + for (auto& hist : hists) { + hist.ScaleW(1. / ntot); + hist.Write(filename); + } + + // Dalitz Plot is OFF + // dalitz.ScaleW(1. / ntot); + // dalitz.Write(filename); +} \ No newline at end of file diff --git a/gaps-1.1/cpp-shower/shower/CMakeLists.txt b/gaps-1.1/cpp-shower/shower/CMakeLists.txt new file mode 100644 index 0000000000000000000000000000000000000000..eaeab5843cf3b69c7d5d2152f34ab82a6e207a42 --- /dev/null +++ b/gaps-1.1/cpp-shower/shower/CMakeLists.txt @@ -0,0 +1,9 @@ +cmake_minimum_required(VERSION 3.10) +project(shower) + +set(CMAKE_CXX_STANDARD 17) + +include_directories(include ../base/include) +file(GLOB SOURCES "src/*.cpp") + +add_library(shower ${SOURCES}) \ No newline at end of file diff --git a/gaps-1.1/cpp-shower/shower/include/shower.h b/gaps-1.1/cpp-shower/shower/include/shower.h new file mode 100644 index 0000000000000000000000000000000000000000..39e1a7d9c5ddff2be70f0ab9c91e1de0c4577b36 --- /dev/null +++ b/gaps-1.1/cpp-shower/shower/include/shower.h @@ -0,0 +1,51 @@ +#ifndef SHOWER_H_ +#define SHOWER_H_ + +#include "event.h" +#include "qcd.h" + +// Splitting Function Codes - Only FF for now (Removed Zeroes) +// ------------------------------------------ +const int sfCodes[] = {1, 2, 3, 4, 5, 11, 12, 13, + 14, 15, 200, 301, 302, 303, 
304, 305}; + +// Shower Class - NOT IN GPU as kernels cannot be member functions +class Shower { + private: + AlphaS as = AlphaS(mz, asmz); + + public: + Shower(); // tC and asmax are preset in base + + // In CUDA, we cannot point by reference, so we use + // pointers (*) instead of references (&). + + // Splitting Functions + double sfValue(double z, double y, int sf); + double sfEstimate(double z, int sf); + double sfIntegral(double zm, double zp, int sf); + double sfGenerateZ(double zm, double zp, double rand, int sf); + + // Utilities + bool validateSplitting(int ij, const int sf); + void sfToFlavs(int sf, int* flavs); + + // Select the Winner Emission + void SelectWinner(Event& ev, std::mt19937& gen); + + // Kinematics + void MakeKinematics(Vec4* kinematics, const double z, const double y, + const double phi, const Vec4 pijt, const Vec4 pkt); + + // Colours + void MakeColours(Event& ev, int* coli, int* colj, const int flavs[3], + const int colij[2], const int colk[2], const double r); + + // Veto Algorithm + Perform the Splitting + void GenerateSplitting(Event& ev, std::mt19937& gen); + + // Run the Shower + void Run(Event& ev); +}; + +#endif // SHOWER_H_ \ No newline at end of file diff --git a/gaps-1.1/cpp-shower/shower/src/colours.cpp b/gaps-1.1/cpp-shower/shower/src/colours.cpp new file mode 100644 index 0000000000000000000000000000000000000000..54648c8514ad6a8bc7d48198dce9e645b496f899 --- /dev/null +++ b/gaps-1.1/cpp-shower/shower/src/colours.cpp @@ -0,0 +1,55 @@ +#include "shower.h" + +void Shower::MakeColours(Event& ev, int* coli, int* colj, const int flavs[3], + const int colij[2], const int colk[2], + const double r) { + // Increase variable ev.GetShowerC() by 1 + ev.IncrementShowerC(); + + if (flavs[0] != 21) { + if (flavs[0] > 0) { + coli[0] = ev.GetShowerC(); + coli[1] = 0; + colj[0] = colij[0]; + colj[1] = ev.GetShowerC(); + } else { + coli[0] = 0; + coli[1] = ev.GetShowerC(); + colj[0] = ev.GetShowerC(); + colj[1] = colij[1]; + } + } else { + if 
(flavs[1] == 21) { + if (colij[0] == colk[1]) { + if (colij[1] == colk[0] && r > 0.5) { + coli[0] = colij[0]; + coli[1] = ev.GetShowerC(); + colj[0] = ev.GetShowerC(); + colj[1] = colij[1]; + } else { + coli[0] = ev.GetShowerC(); + coli[1] = colij[1]; + colj[0] = colij[0]; + colj[1] = ev.GetShowerC(); + } + } else { + coli[0] = colij[0]; + coli[1] = ev.GetShowerC(); + colj[0] = ev.GetShowerC(); + colj[1] = colij[1]; + } + } else { + if (flavs[1] > 0) { + coli[0] = colij[0]; + coli[1] = 0; + colj[0] = 0; + colj[1] = colij[1]; + } else { + coli[0] = 0; + coli[1] = colij[1]; + colj[0] = colij[0]; + colj[1] = 0; + } + } + } +} \ No newline at end of file diff --git a/gaps-1.1/cpp-shower/shower/src/kinematics.cpp b/gaps-1.1/cpp-shower/shower/src/kinematics.cpp new file mode 100644 index 0000000000000000000000000000000000000000..597c6a1a53c7faecf088267c06cd93a00f5101aa --- /dev/null +++ b/gaps-1.1/cpp-shower/shower/src/kinematics.cpp @@ -0,0 +1,31 @@ +#include "shower.h" + +void Shower::MakeKinematics(Vec4* kinematics, const double z, const double y, + const double phi, const Vec4 pijt, const Vec4 pkt) { + Vec4 Q = pijt + pkt; + + // Generating the Momentum (0, kt1, kt2, 0) + double rkt = sqrt(Q.M2() * y * z * (1. - z)); + + Vec4 kt1 = pijt.Cross(pkt); + if (kt1.P() < 1.e-6) { + Vec4 xaxis(0., 1., 0., 0.); + kt1 = pijt.Cross(xaxis); + } + kt1 = kt1 * (rkt * cos(phi) / kt1.P()); + + Vec4 kt2cms = Q.Boost(pijt); + kt2cms = kt2cms.Cross(kt1); + kt2cms = kt2cms * (rkt * sin(phi) / kt2cms.P()); + Vec4 kt2 = Q.BoostBack(kt2cms); + + // Conversion to {i, j, k} basis + Vec4 pi = pijt * z + pkt * ((1. - z) * y) + kt1 + kt2; + Vec4 pj = pijt * (1. - z) + pkt * (z * y) - kt1 - kt2; + Vec4 pk = pkt * (1. 
- y); + + // No need to do *kinematics[0], for arrays the elements are already pointers + kinematics[0] = pi; + kinematics[1] = pj; + kinematics[2] = pk; +} \ No newline at end of file diff --git a/gaps-1.1/cpp-shower/shower/src/shower.cpp b/gaps-1.1/cpp-shower/shower/src/shower.cpp new file mode 100644 index 0000000000000000000000000000000000000000..50eee1180111adfa44af93a4363bb73a8b2c7739 --- /dev/null +++ b/gaps-1.1/cpp-shower/shower/src/shower.cpp @@ -0,0 +1,187 @@ +#include "shower.h" + +Shower::Shower() {} + +/** + * This Function is Different from S.H's Tutorial, we keep it the same as the + * GPU version for a fair test + */ +void Shower::SelectWinner(Event& ev, std::mt19937& gen) { + std::uniform_real_distribution<> dis(0., 1.); + + // Default Values + double win_tt = tC; // Lowest possible value is Cutoff Scale (in base.cuh) + int win_sf = 0; // 0 = No Splitting + int win_ij = 0; + int win_k = 0; + double win_zp = 0.; + double win_m2 = 0.; + + // We start at 2 because elements 0 and 1 are electrons - To change with ISR + for (int ij = 2; ij < ev.GetSize(); ij++) { + for (int k = 2; k < ev.GetSize(); k++) { + // Sanity Check to ensure ij != k + if (ij == k) { + continue; + } + + // Need to check if ij and k are colour connected + if (!ev.GetParton(ij).IsColorConnected(ev.GetParton(k))) { + continue; + } + + // Params Identical to all splitting functions + double m2 = (ev.GetParton(ij).GetMom() + ev.GetParton(k).GetMom()).M2(); + if (m2 < 4. * tC) { + continue; + } + + double zp = 0.5 * (1. + sqrt(1. - 4. * tC / m2)); + + // Codes instead of Object Oriented Approach! + for (int sf : sfCodes) { + // Check if the Splitting Function is valid for the current partons + if (!validateSplitting(ev.GetParton(ij).GetPid(), sf)) { + continue; + } + + // Calculate the Evolution Variable + double g = asmax / (2. * M_PI) * sfIntegral(1 - zp, zp, sf); + double tt = ev.GetShowerT() * pow(dis(gen), 1. 
/ g); + + // Check if tt is greater than the current winner + if (tt > win_tt) { + win_tt = tt; + win_sf = sf; + win_ij = ij; + win_k = k; + win_zp = zp; + win_m2 = m2; + } + } + } + } + + // Store the results + ev.SetShowerT(win_tt); + ev.SetWinSF(win_sf); + ev.SetWinDipole(0, win_ij); + ev.SetWinDipole(1, win_k); + ev.SetWinParam(0, win_zp); + ev.SetWinParam(1, win_m2); +} + +/** + * In the GPU version, this would be split into multiple CUDA Kernels + */ +void Shower::GenerateSplitting(Event& ev, std::mt19937& gen) { + std::uniform_real_distribution<> dis(0., 1.); + + while (ev.GetShowerT() > tC) { + SelectWinner(ev, gen); + + if (ev.GetShowerT() > tC) { + // Get the Splitting Function + int sf = ev.GetWinSF(); + + double rand = dis(gen); + + // Generate z + double zp = ev.GetWinParam(0); + double z = sfGenerateZ(1 - zp, zp, rand, sf); + + double y = ev.GetShowerT() / ev.GetWinParam(1) / z / (1. - z); + + double f = 0.; + double g = 0.; + double value = 0.; + double estimate = 0.; + + // CS Kernel: y can't be 1 + if (y < 1.) { + value = sfValue(z, y, sf); + estimate = sfEstimate(z, sf); + + f = (1. - y) * as(ev.GetShowerT()) * value; + g = asmax * estimate; + + if (dis(gen) < f / g) { + ev.SetShowerZ(z); + ev.SetShowerY(y); + + double phi = 2. 
* M_PI * dis(gen); + + int win_ij = ev.GetWinDipole(0); + int win_k = ev.GetWinDipole(1); + + Vec4 moms[3] = {Vec4(), Vec4(), Vec4()}; + MakeKinematics(moms, z, y, phi, ev.GetParton(win_ij).GetMom(), + ev.GetParton(win_k).GetMom()); + + int flavs[3]; + sfToFlavs(sf, flavs); + + int colij[2] = {ev.GetParton(win_ij).GetCol(), + ev.GetParton(win_ij).GetAntiCol()}; + + int colk[2] = {ev.GetParton(win_k).GetCol(), + ev.GetParton(win_k).GetAntiCol()}; + + int coli[2] = {0, 0}; + int colj[2] = {0, 0}; + MakeColours(ev, coli, colj, flavs, colij, colk, dis(gen)); + + // Modify Splitter + ev.SetPartonPid(win_ij, flavs[1]); + ev.SetPartonMom(win_ij, moms[0]); + ev.SetPartonCol(win_ij, coli[0]); + ev.SetPartonAntiCol(win_ij, coli[1]); + + // Modify Recoiled Spectator + ev.SetPartonMom(win_k, moms[2]); + + // Add Emitted Parton + Parton em = Parton(flavs[2], moms[1], colj[0], colj[1]); + ev.SetParton(ev.GetSize(), em); + + // Increment Emissions (IMPORTANT) + ev.IncrementEmissions(); + + return; + } + } + } + } +} + +void Shower::Run(Event& ev) { + /** + * Thread Local + * ------------ + * + * We observed significant slowdown due to the rng - this is because the + * code was re initialising the RNG. Using thread_local means that it is + * initialised once and then reused, giving a massive speed-up! + */ + thread_local std::random_device rd; + thread_local std::mt19937 gen(rd()); + + // Same seed option. Turn off by commenting when not in use! + // Having an if statement if no seed is given would not be a fair comparison + // to the GPU, so commented out is better for now. Maybe in the future. 
+ // thread_local std::mt19937 gen(seed); + + // Set the starting shower scale + double t_max = (ev.GetParton(0).GetMom() + ev.GetParton(1).GetMom()).M2(); + ev.SetShowerT(t_max); + + // Set the initial number of emissions + ev.SetEmissions(0); + + // Set the Colour Counter to 1 (q and qbar) + ev.SetShowerC(1); + + while (ev.GetShowerT() > tC) { + GenerateSplitting(ev, gen); + } +} diff --git a/gaps-1.1/cpp-shower/shower/src/splittings.cpp b/gaps-1.1/cpp-shower/shower/src/splittings.cpp new file mode 100644 index 0000000000000000000000000000000000000000..83efd48248fb411a520413223fbd0d989afe991d --- /dev/null +++ b/gaps-1.1/cpp-shower/shower/src/splittings.cpp @@ -0,0 +1,240 @@ +#include "shower.h" + +/** + * Splitting Functions as a function - safer but less sophisticated + * ---------------------------------------------------------------- + * + * This is a safer and more straightforward way to implement the splitting + * functions for the shower. Although the class-based approach is good for + * C++, in CUDA many issues arise that mean that OOP might not always be the + * best strategy in coding. As a simpler approach, we will use switch-case + * statements to select the correct splitting function. + * + * We have a LOT of splitting functions: + * - Four types (FF, FI, IF, II) + * - Three or Four Possible DGLAP Splittings (q->qg, q->gq, g->gg, g->qq) + * - Five Flavours of Quarks (d, u, s, c, b) and each of their antiquarks + * - At most, In total: 4 * (10 + 10 + 1 + 5) = 104 splitting functions + * + * So we need to organise ourselves with some kind of structure. 
As a first + * attempt lets use four digit codes to identify the splitting functions: + * + * - 1st digit: Type of Split-Spect (FF, FI, IF, II) - 0, 1, 2, 3 + * - 2nd digit: Type of DGLAP (q->qg, q->gq, g->gg, g->qq) - 0, 1, 2, 3 + * - 3rd digit: Emitter is a Particle or Antiparticle - 0, 1 (gluon is 0) + * - 4th digit: Flavor of the Emitter - 1, 2, 3, 4, 5; 0 for gluon + * + * Examples: + * - FF u -> ug = 0 0 0 2 + * - FF ubar -> ubar g = 0 0 1 2 + * - FF g -> gg = 0 2 0 0 + * - FF g -> ccbar = 0 3 0 4 + * + * - FI u -> ug = 1 0 0 2 + * - FI g -> ccbar = 1 3 0 4 + * + * - IF d -> dg = 2 0 0 1 + * - IF d -> gd = 2 1 0 1 + * - IF sbar -> sbar g = 2 0 1 3 + * - IF g -> uubar = 2 3 0 2 + * + * - II g -> gg = 3 2 0 0 + * - II g -> bbbar = 3 3 0 5 + * + * This way we can easily identify the splitting functions and select the + * correct one using a switch-case statement. This can be used for value, + * estimate, integral and generateZ functions. + */ + +double Shower::sfValue(double z, double y, int sf) { + switch (sf) { + // FF Splittings --------------------------- + + // FF q -> qg + case 1: + case 2: + case 3: + case 4: + case 5: + case 11: + case 12: + case 13: + case 14: + case 15: + return kCF * (2. / (1. - z * (1. - y)) - (1. + z)); + break; + + // FF g -> gg + case 200: + return kCA / 2. * (2. / (1. - z * (1. - y)) - 2. + z * (1. - z)); + break; + + // FF g -> qqbar + case 301: + case 302: + case 303: + case 304: + case 305: + return kTR / 2. * (1. - 2. * z * (1. - z)); + break; + } + return 0.; +} + +double Shower::sfEstimate(double z, int sf) { + switch (sf) { + // FF Splittings --------------------------- + + // FF q -> qg + case 1: + case 2: + case 3: + case 4: + case 5: + case 11: + case 12: + case 13: + case 14: + case 15: + return kCF * 2. / (1. - z); + break; + + // FF g -> gg + case 200: + return kCA / (1. 
- z); + break; + + // FF g -> qqbar + case 301: + case 302: + case 303: + case 304: + case 305: + return kTR / 2.; + break; + } + return 0.; +} + +double Shower::sfIntegral(double zm, double zp, int sf) { + switch (sf) { + // FF Splittings --------------------------- + + // FF q -> qg + case 1: + case 2: + case 3: + case 4: + case 5: + case 11: + case 12: + case 13: + case 14: + case 15: + return kCF * 2. * log((1. - zm) / (1. - zp)); + break; + + // FF g -> gg + case 200: + return kCA * log((1. - zm) / (1. - zp)); + break; + + // FF g -> qqbar + case 301: + case 302: + case 303: + case 304: + case 305: + return kTR / 2. * (zp - zm); + break; + } + return 0.; +} + +double Shower::sfGenerateZ(double zm, double zp, double rand, int sf) { + switch (sf) { + // FF Splittings --------------------------- + + // FF q -> qg + case 1: + case 2: + case 3: + case 4: + case 5: + case 11: + case 12: + case 13: + case 14: + case 15: + return 1. + (zp - 1.) * pow((1. - zm) / (1. - zp), rand); + break; + + // FF g -> gg + case 200: + return 1. + (zp - 1.) * pow((1. - zm) / (1. 
- zp), rand); + break; + + // FF g -> qqbar + case 301: + case 302: + case 303: + case 304: + case 305: + return zm + (zp - zm) * rand; + break; + } + return 0.; +} + +bool Shower::validateSplitting(int ij, int sf) { + // Obtain the splitting function code + // int firstDigit = sf / 1000; + int secondDigit = (sf / 100) % 10; + int thirdDigit = (sf / 10) % 10; + int fourthDigit = sf % 10; + + // Insert FF, FI, IF, II checks here + // --------------------------------- + + // Skip if ij is a quark and the sf is not a quark sf (2nd digit), or + // if ij is a gluon and the sf is not a gluon sf (2nd digit) + if ((ij != 21 && secondDigit >= 2) || (ij == 21 && secondDigit < 2)) { + return false; + } + + // Skip if ij is a particle and sf is an antiparticle sf (3rd digit), or + // if ij is an antiparticle and sf is a particle sf (3rd digit) + if ((ij < 0 && thirdDigit == 0) || (ij > 0 && thirdDigit == 1)) { + return false; + } + + // q->qg case: Skip if the flavor of ij is different from the flavor of the sf + // g->gg and g->qq case: No need to check the flavor + if ((ij != 21 && abs(ij) != fourthDigit)) { + return false; + } + + return true; +} + +void Shower::sfToFlavs(int sf, int* flavs) { + if (sf < 16) { + if (sf < 6) { + flavs[0] = sf; + flavs[1] = sf; + flavs[2] = 21; + } else { + flavs[0] = -1 * (sf - 10); + flavs[1] = -1 * (sf - 10); + flavs[2] = 21; + } + } else if (sf == 200) { + flavs[0] = 21; + flavs[1] = 21; + flavs[2] = 21; + } else if (sf < 306) { + flavs[0] = 21; + flavs[1] = sf - 300; + flavs[2] = -1 * (sf - 300); + } +} diff --git a/gaps-1.1/doc/README.md b/gaps-1.1/doc/README.md new file mode 100644 index 0000000000000000000000000000000000000000..7d0d95ce262e33e4b91d74066e8b001791977bf6 --- /dev/null +++ b/gaps-1.1/doc/README.md @@ -0,0 +1,15 @@ +# GAPS - Documentation Home + +Welcome to the GAPS Documentation! Here, we go into the computational depths of the GPU algorithm, explaining what everything means and how it all works. 
+ +We'll divide the documentation into a few sections: + +- [Getting Started](sections/getting-started.md): Similar to [README.md](../README.md), this page covers the setting up and basic usage +- [Code Structure](sections/code-structure.md): This contains details of where all params and settings are defined +- [Matrix](sections/matrix-element.md), [Shower](sections/parton-shower.md) and [Observables](sections/observables.md): This contains the descriptions of the different GPU components + +Additionally, the 'tutorials' folder contains introductory CUDA codes available online, which we have commented on. (Links to the Original codes are also provided!) + +For more information/help, feel free to contact [siddharth.sule@manchester.ac.uk](mailto:siddharth.sule@manchester.ac.uk) + +Happy Simulating! diff --git a/gaps-1.1/doc/extra/acceptveto.csv b/gaps-1.1/doc/extra/acceptveto.csv new file mode 100644 index 0000000000000000000000000000000000000000..724d941f307dbbe667ffbbca5649d4bd0de5d3d8 --- /dev/null +++ b/gaps-1.1/doc/extra/acceptveto.csv @@ -0,0 +1,116 @@ +1, 1000000, 999985, 15 +2, 1000000, 999856, 144 +3, 1000000, 999561, 439 +4, 1000000, 998749, 1251 +5, 1000000, 997321, 2679 +6, 1000000, 994948, 5052 +7, 1000000, 991838, 8162 +8, 1000000, 987655, 12345 +9, 999999, 983085, 16914 +10, 999998, 977434, 22564 +11, 999987, 971937, 28050 +12, 999974, 965649, 34325 +13, 999932, 959482, 40450 +14, 999845, 952889, 46956 +15, 999678, 946216, 53462 +16, 999353, 939089, 60264 +17, 998692, 931821, 66871 +18, 997563, 924423, 73140 +19, 995659, 916316, 79343 +20, 992718, 907665, 85053 +21, 988307, 897644, 90663 +22, 981971, 886133, 95838 +23, 973220, 873161, 100059 +24, 961687, 857094, 104593 +25, 946832, 838767, 108065 +26, 928216, 818203, 110013 +27, 906379, 795303, 111076 +28, 880536, 768942, 111594 +29, 851298, 740511, 110787 +30, 818801, 709177, 109624 +31, 783677, 676693, 106984 +32, 746218, 642369, 103849 +33, 707289, 607062, 100227 +34, 667263, 571330, 95933 +35, 
626727, 535600, 91127 +36, 586427, 499773, 86654 +37, 546688, 464755, 81933 +38, 507944, 431426, 76518 +39, 470231, 398306, 71925 +40, 434309, 367053, 67256 +41, 399704, 337120, 62584 +42, 366947, 309257, 57690 +43, 336330, 283062, 53268 +44, 307838, 258481, 49357 +45, 280984, 235541, 45443 +46, 255876, 214290, 41586 +47, 232433, 193874, 38559 +48, 210830, 175847, 34983 +49, 190589, 158917, 31672 +50, 172071, 143034, 29037 +51, 155009, 128631, 26378 +52, 139260, 115518, 23742 +53, 124920, 103625, 21295 +54, 112128, 92998, 19130 +55, 100202, 82590, 17612 +56, 89330, 73908, 15422 +57, 79464, 65601, 13863 +58, 70386, 57856, 12530 +59, 62336, 51247, 11089 +60, 55098, 45301, 9797 +61, 48524, 39831, 8693 +62, 42601, 34924, 7677 +63, 37297, 30527, 6770 +64, 32703, 26773, 5930 +65, 28502, 23266, 5236 +66, 24881, 20268, 4613 +67, 21538, 17617, 3921 +68, 18780, 15283, 3497 +69, 16219, 13168, 3051 +70, 14019, 11381, 2638 +71, 12070, 9853, 2217 +72, 10430, 8525, 1905 +73, 8972, 7303, 1669 +74, 7718, 6289, 1429 +75, 6630, 5392, 1238 +76, 5663, 4567, 1096 +77, 4834, 3977, 857 +78, 4083, 3285, 798 +79, 3463, 2813, 650 +80, 2908, 2331, 577 +81, 2432, 2006, 426 +82, 2025, 1636, 389 +83, 1709, 1386, 323 +84, 1425, 1142, 283 +85, 1214, 988, 226 +86, 1003, 800, 203 +87, 841, 653, 188 +88, 701, 565, 136 +89, 591, 475, 116 +90, 489, 402, 87 +91, 413, 349, 64 +92, 348, 288, 60 +93, 293, 240, 53 +94, 243, 188, 55 +95, 199, 155, 44 +96, 162, 127, 35 +97, 133, 115, 18 +98, 108, 89, 19 +99, 93, 72, 21 +100, 67, 54, 13 +101, 58, 48, 10 +102, 47, 36, 11 +103, 36, 27, 9 +104, 28, 26, 2 +105, 19, 14, 5 +106, 17, 14, 3 +107, 16, 13, 3 +108, 14, 10, 4 +109, 10, 6, 4 +110, 9, 6, 3 +111, 7, 6, 1 +112, 4, 3, 1 +113, 2, 2, 0 +114, 1, 0, 1 +115, 1, 1, 0 +116, 0, 0, 0 \ No newline at end of file diff --git a/gaps-1.1/doc/extra/acceptveto.py b/gaps-1.1/doc/extra/acceptveto.py new file mode 100644 index 0000000000000000000000000000000000000000..82611040737ebfed946f839cd67f123fbd6baec0 --- /dev/null +++ 
b/gaps-1.1/doc/extra/acceptveto.py @@ -0,0 +1,35 @@ +import numpy as np +import matplotlib.pyplot as plt +import matplotlib as mpl + +mpl.rc_file("../../test/mplstyleerc") + +data = np.genfromtxt('acceptveto.csv', delimiter=',') + +data = data[data[:,1] > 0] + +x = data[:,0] +n = data[:,1] +y1 = data[:,2] +y2 = data[:,3] + +fig, ax = plt.subplots(1, 2, figsize=(9, 3.75)) + +ax[0].plot(x, y1, label='Vetoed Emissions', color='C3') +ax[0].plot(x, y2, label='Accepted Emissions', color='C2') + +ax[1].plot(x, y1/n, label='Vetoed Emissions', color='C3') +ax[1].plot(x, y2/n, label='Accepted Emissions', color='C2') + +for i in range(2): + ax[i].set_xlabel('Number of Events') + ax[i].set_ylabel('Emissions') + ax[i].legend() + ax[i].grid(True) + +ax[0].set_title('Accepted and Vetoed Emissions') +ax[1].set_title('Accepted and Vetoed Emissions, (Rescaled by $N_{Active Events}$)') + +fig.tight_layout() +fig.savefig('acceptveto.pdf') +fig.savefig('acceptveto.png') \ No newline at end of file diff --git a/gaps-1.1/doc/extra/gaps.sh b/gaps-1.1/doc/extra/gaps.sh new file mode 100755 index 0000000000000000000000000000000000000000..8ea162938893b355ec44b76cf31e93aa68310a41 --- /dev/null +++ b/gaps-1.1/doc/extra/gaps.sh @@ -0,0 +1,123 @@ +#!/bin/bash + +# ------------------------------------------------------------------------------ + +# GAPS - Run Script +# ----------------- +# This script is used to compile and run the GAPS and C++ Shower codes. It +# provides a number of options to control the number of events, the number of +# cores to use, and the type of run to perform. The run types are: 
The run types are: +# - gaps: Run the GAPS simulation +# - cpp: Run the C++ Shower simulation +# - compare: Run both the GAPS and C++ Shower and compare the results +# - full: Run both the GAPS and C++ Shower for a range of event numbers + + +# ------------------------------------------------------------------------------ +# Default Parameters + +events=10000 +energy=91.2 +ncores=1 + +# Defaults to just the CUDA simulation +runtype="gaps" + +# ------------------------------------------------------------------------------ +# Use optargs to adjust default values and define run + +while getopts "n:e:c:r:h" opt; do + case $opt in + n) events=$OPTARG ;; + e) energy=$OPTARG ;; + c) ncores=$OPTARG ;; + r) runtype=$OPTARG ;; + h) echo "Usage: $0 [-n nevents] [-e energy] [-c cores] [-r runtype] [-h help]" + echo " -n: set the number of events (default: 10000)" + echo " -e: set the CoM energy of the system (default: 91.2)" + echo " -c: set the number of cores (default: 1)" + echo " -r: set the run type (default: gaps, options: gaps, cpp, compare, full)" + echo " -h: display this help message" + exit 0 + ;; + esac +done + +# ------------------------------------------------------------------------------ +# Compile Code + +compile() { + dir=$1 + echo "Compiling $dir" + (cd $dir && mkdir -p build && cd build && cmake .. 
&& make -j $ncores) +} + +if [ "$runtype" = "gaps" ] || [ "$runtype" = "compare" ] || [ "$runtype" = "full" ]; then + compile "gaps" +fi + +if [ "$runtype" = "cpp" ] || [ "$runtype" = "compare" ] || [ "$runtype" = "full" ]; then + compile "cpp-shower" +fi + +# ------------------------------------------------------------------------------ +# Default: Just run GAPS + +if [ "$runtype" = "gaps" ]; then + echo "Running GAPS" + ./gaps/bin/gaps $events $energy +fi + +# ------------------------------------------------------------------------------ +# Run C++ Shower + +if [ "$runtype" = "cpp" ]; then + echo "Running C++ Shower" + ./cpp-shower/bin/cpp-shower $events $energy +fi + +# ------------------------------------------------------------------------------ +# Compare GAPS and C++ Shower + +if [ "$runtype" = "compare" ]; then + echo "Running GAPS" + ./gaps/bin/gaps $events $energy + echo "Running C++ Shower" + ./cpp-shower/bin/cpp-shower $events $energy +fi + +# ------------------------------------------------------------------------------ +# Do a full analysis, comparing GAPS and C++ Shower at many values of N + +if [ "$runtype" = "full" ]; then + + # Remove previous results and make folder + rm -rf results + mkdir -p results + + # Clear previous log files + rm cpp-time.dat + rm gaps-time.dat + + # Run the comparison 100 times, for different number of events + neventsarray=(1 2 5 10 20 50 100 200 500 1000 2000 5000 10000 20000 50000 100000 200000 500000 1000000) + for n in "${neventsarray[@]}" + do + # Run and store the output in a log file + for i in {1..100} + do + echo "Running GAPS with $n events" + ./gaps/bin/gaps $n $energy + echo "Running C++ Shower with $n events" + ./cpp-shower/bin/cpp-shower $n $energy + done + done + + # Move the log files to the results directory + mv cpp-time.dat results/ + mv gaps-time.dat results/ + mv cpp.yoda results/ + mv gaps.yoda results/ + +fi + diff --git a/gaps-1.1/doc/extra/results-in-paper.tar.gz 
b/gaps-1.1/doc/extra/results-in-paper.tar.gz new file mode 100644 index 0000000000000000000000000000000000000000..fb10d79114b3a6050abbf54ab82ad1719601d45a Binary files /dev/null and b/gaps-1.1/doc/extra/results-in-paper.tar.gz differ diff --git a/gaps-1.1/doc/extra/splittings-test.cuh b/gaps-1.1/doc/extra/splittings-test.cuh new file mode 100644 index 0000000000000000000000000000000000000000..c40cd44201362aa7ae30c14fdb89e57739baec61 --- /dev/null +++ b/gaps-1.1/doc/extra/splittings-test.cuh @@ -0,0 +1,100 @@ +#ifndef SPLITTINGS_CUH_ +#define SPLITTINGS_CUH_ + +// #include "qcd.cuh" + +// Splitting functions as a class - Here be dragons! + +class Kernel { + public: + int flavs[3]; + __device__ virtual double Value(double z, double y) = 0; + virtual ~Kernel() {} +}; + +class Pqq : public Kernel { + public: + __device__ double Value(double z, double y) override { + return CF * (2. / (1. - z * (1. - y)) - (1. + z)); + } + __device__ static Pqq* getInstance() { + __shared__ Pqq instance; + return &instance; + } +}; + +__global__ void initKernels(Kernel** Kernels) { + if (threadIdx.x == 0 && blockIdx.x == 0) { + // Quarks + for (int i = 0; i < 5; i++) { + Kernels[i] = Pqq::getInstance(); + Kernels[i]->flavs[0] = i + 1; // id = i + 1 + Kernels[i]->flavs[1] = i + 1; + Kernels[i]->flavs[2] = 21; + } + + // Anti-quarks + for (int i = 5; i < 10; i++) { + Kernels[i] = Pqq::getInstance(); + Kernels[i]->flavs[0] = i - 11; // id = i - 11 + Kernels[i]->flavs[1] = i - 11; + Kernels[i]->flavs[2] = 21; + } + } +} + +// Kernel to perform computations +__global__ void computeValues(Kernel** Kernels, double* input, double* output, + int n) { + int idx = blockIdx.x * blockDim.x + threadIdx.x; + if (idx < n) { + double temp = 0; + double max = 0; + for (int i = 0; i < 10; i++) { + temp = Kernels[i]->Value(input[idx], input[idx + 1]); + if (temp > max) { + max = temp; + } + } + output[idx] = max; + } +} + +int main() { + Kernel** deviceKernels; + cudaMalloc(&deviceKernels, 10 * 
sizeof(Kernel*)); + + // Initialize Kernels on device + initKernels<<<1, 1>>>(deviceKernels); + cudaDeviceSynchronize(); // Ensure initialization is completed + + // Input data + double input[2] = {0.1, 0.2}; + double* deviceInput; + cudaMalloc(&deviceInput, 2 * sizeof(double)); + cudaMemcpy(deviceInput, input, 2 * sizeof(double), cudaMemcpyHostToDevice); + + // Output data + double output[1]; + double* deviceOutput; + cudaMalloc(&deviceOutput, 1 * sizeof(double)); + + // Compute values + computeValues<<<1, 1>>>(deviceKernels, deviceInput, deviceOutput, 1); + cudaDeviceSynchronize(); + + // Copy output back to host + cudaMemcpy(output, deviceOutput, 1 * sizeof(double), cudaMemcpyDeviceToHost); + + // Print output + printf("Output: %f\n", output[0]); + + // Free memory + cudaFree(deviceKernels); + cudaFree(deviceInput); + cudaFree(deviceOutput); + + return 0; +} + +#endif // SPLITTINGS_CUH_ diff --git a/gaps-1.1/doc/extra/versioncontrol.py b/gaps-1.1/doc/extra/versioncontrol.py new file mode 100644 index 0000000000000000000000000000000000000000..d04688ee313185b57b075c8403f8f5ca1b0a3adb --- /dev/null +++ b/gaps-1.1/doc/extra/versioncontrol.py @@ -0,0 +1,36 @@ +import numpy as np +import pandas as pd +import matplotlib.pyplot as plt +# from scipy.stats import iqr + +# Load the data +cpp0 = pd.read_csv('cpp-time-0.dat') +cud0 = pd.read_csv('gaps-time-0.dat') +cpp1 = pd.read_csv('cpp-time-1.dat') +cud1 = pd.read_csv('gaps-time-1.dat') + +# Name the Columns Matrix, Shower, Observables, and Total +cpp0.columns = ['Matrix', 'Shower', 'Observables', 'Total'] +cud0.columns = ['Matrix', 'Shower', 'Observables', 'Total'] +cpp1.columns = ['Matrix', 'Shower', 'Observables', 'Total'] +cud1.columns = ['Matrix', 'Shower', 'Observables', 'Total'] + +# Get the median speedup between cpp and cud for the two versions +speedup0 = cpp0.median() / cud0.median() +speedup1 = cpp1.median() / cud1.median() + +# Get the median speedup between the two versions for cpp and cud +speedup_cpp = 
cpp0.median() / cpp1.median() * 100 +speedup_cud = cud0.median() / cud1.median() * 100 + +# Concatenate all results +speedup0 = pd.concat([speedup0, speedup1], axis=1) +speedup1 = pd.concat([speedup_cpp, speedup_cud], axis=1) +speedup = pd.concat([speedup0, speedup1], axis=1) + +# Name the columns +speedup.columns = ['Old Speedup', 'New Speedup', + 'Cpp Old/New (%)', 'CUDA Old/New (%)'] + +# Print the results to 3 decimal places +print(speedup.round(3)) diff --git a/gaps-1.1/doc/sections/code-structure.md b/gaps-1.1/doc/sections/code-structure.md new file mode 100644 index 0000000000000000000000000000000000000000..8ceac227eb6153cbd505f8abd477e71d3ebbf819 --- /dev/null +++ b/gaps-1.1/doc/sections/code-structure.md @@ -0,0 +1,24 @@ +# Code Structure + +> **Version 1.0.0**: All code in `include` and `src` to be split by component later + +While the focus of the paper is the parton shower, there are several necessary components that are exciting in their own right. Below is a figure showing the structure of the classes. + +**Note**: The code structure is identical for the C++ and CUDA generators. We want to demonstrate that one can get a speedup without altering the code much. + + + +Let's go through the Classes: + +- Base: All the necessary definitions and settings are provided here. For now, you can adjust the Centre of Mass Energy and Cutoff and the physical contents here. + +- Vec4 contains the definition of the four vectors—the primary tool for all our kinematics. This is one of many header-only files! This is because the code inside a .cu file might not be accessible to another .cu file (weird, I know; it could be me struggling to do things the C++ way, though!). + +- Parton and Event: This is the backbone of the Generator. Unlike the original tutorial, we choose to store all and any properties of the code in these objects. The parton class holds the ID, momentum and colour of the parton. 
The event class stores an array of partons (it has to be static because dynamic arrays are either unavailable in CUDA or unsuitable for speed). It also stores several parameters like the differential cross section and observable values and acts as a temporary store for shower parameters. It's good to have everything in there rather than in separate arrays! + +- Histogram: This has a class for Bins and Histograms. Fun feature: CUDA allows you to do atomic operations, i.e., I can bin all my events at the same time! It makes things unbelievably fast! + +- Physics Classes + - Matrix: Computes the matrix element for $e^+e^- \to q\bar{q}$ + - Shower: Runs the final state shower + - Observable: Calculates the observable and bins the Histogram diff --git a/gaps-1.1/doc/sections/getting-started.md b/gaps-1.1/doc/sections/getting-started.md new file mode 100644 index 0000000000000000000000000000000000000000..01fd128b182c76f7202b118c73d28b0adde5f667 --- /dev/null +++ b/gaps-1.1/doc/sections/getting-started.md @@ -0,0 +1,65 @@ +# Getting Started + +The code simulates just one experiment, so this should take little time. + +The file `rungaps` can be used to build and operate both the C++ and CUDA generators. It includes all the run modes (including the one used for the paper results). + +To run GAPS, you will need the following: + +- An NVIDIA V100, A100 or above: These are the only GPUs with the required features +- CMake: To create the makefile +- NVCC and GCC: To build the two generators +- Python: To make plots of the results + +Simply execute the command: + +```bash +./rungaps +``` + +NB: If you get a permission denied error, please run ```chmod +x rungaps```. + +This should build the program and generate 10000 events on the GPU. The output should look something like this: + +```bash +------------------------------------------------- +| GAPS: a GPU-Amplified Parton Shower | +------------------------------------------------- +Process: e+ e- --> q qbar +Number of Events: 10000 + +Initialising... 
+Generating Matrix Elements... +Showering Partons... +Analysing Events... + +EVENT GENERATION COMPLETE + +ME Time: 0.000666688 s +Sh Time: 0.0235208 s +An Time: 0.00896093 s + +Total Time: 0.0331484 s + +Histograms written to gaps.yoda +Timing data written to gaps-time.dat +------------------------------------------------ +``` + +Then you have free rein over what you need. As in [README.md](../../README.md), here are all the possible ways you can run GAPS: + +```bash +# Simulate different numbers of events and build the code using multiple CPU cores +./rungaps -n nevents -c ncores + +# Run C++ Simulation +./rungaps -n nevents -c ncores -r cpp + +# Run the same number of events on C++ and CUDA and compare times +./rungaps -n nevents -c ncores -r compare + +# Run a range of event numbers 100 times each, as seen in the paper +./rungaps -c ncores -r full +``` + +And that's all there is to it! In upcoming versions, we'll add features like increasing the number of events, different centre-of-mass energies, new analyses, and potentially some matching! diff --git a/gaps-1.1/doc/sections/matrix-element.md b/gaps-1.1/doc/sections/matrix-element.md new file mode 100644 index 0000000000000000000000000000000000000000..78fad3e4e1f3c42e0c137f776cf218a76548b5e1 --- /dev/null +++ b/gaps-1.1/doc/sections/matrix-element.md @@ -0,0 +1,13 @@ +# Matrix Element Calculation + +This part of the code resembles the classic axpy example (y = a*x + y). The matrix element code evaluates pre-computed matrix element expressions and uses random numbers to create authentic events. + +Here are the steps involved: + +- The code generates a random number for the flavour of the quark-antiquark pair and for the angles theta and phi. + +- Then it does an analytical calculation, which might be complicated for us but is simple for the device as it is designed for this. This calculation gives the matrix element and the differential cross-section. 
+ +- It generates the momenta of the e+e- pair and the quark-antiquark pair and adds all of this information to the event variable + +And that's all! The shower and observables sections are a lot more complicated. For more sophisticated calculations, look at the PEPPER [2311.06198](https://arxiv.org/abs/2311.06198.pdf) and MG4GPU [2303.18244](https://arxiv.org/abs/2303.18244) projects! diff --git a/gaps-1.1/doc/sections/observables.md b/gaps-1.1/doc/sections/observables.md new file mode 100644 index 0000000000000000000000000000000000000000..754868580a025ccc8b42eb3c167db24f648428d4 --- /dev/null +++ b/gaps-1.1/doc/sections/observables.md @@ -0,0 +1,15 @@ +# Calculating Observables and Histogramming + +After the parton shower stage, we are left with multiple events of different sizes (meaning different numbers of partons in the final state). Like the shower, we write our code to treat each event independently and use the as-prescribed algorithm to calculate the observables. + +The observables that are currently available are: + +- Durham Jet Rates: These are the 2->3, 3->4, 4->5 and 5->6 Jet rates on a log scale. These tell us how many jets a particular event possibly has, along with intricate details like the transverse momentum spectra [Phys. Lett. B 269 (1991)](https://inspirehep.net/literature/317695) + +- Thrust: This tells us how pencil-like or spherical a final state is [Phys. Rev. Lett. 39 (1977)](https://link.aps.org/doi/10.1103/PhysRevLett.39.1587). There are two possible algorithms for calculating thrust: one that is more time-consuming but consistent for all event shapes and a simplified, faster algorithm that incorrectly calculates thrust for highly spherical events [(Reported in Pythia)](https://pythia.org/latest-manual/EventAnalysis.html). We used the more time-consuming algorithm, first seen in [Z. 
Physik C 1, 61 (1979)](https://link.springer.com/article/10.1007/BF01450381) + +- Jet Masses and Broadenings: These split the momenta of the partons with and against the thrust direction and yield properties of each hemisphere. [Phys. Lett. B 272 (1991)](https://www.sciencedirect.com/science/article/pii/037026939191845M) [Phys. Lett. B 295 (1992)](https://www.sciencedirect.com/science/article/pii/037026939291565Q) + +- **New** Dalitz Plot: This studies the emitter and spectator momentum in a single emission event [Phil. Mag. 44 (1953)](https://www.tandfonline.com/doi/abs/10.1080/14786441008520365) (OFF by default though, as the writing time dominates the GPU analysis time!) + +One exciting thing that happens during this stage is binning. CUDA has a feature called atomic operations, which lets many threads safely update the same variable. In our case, this means that once an observable is calculated for all events, every value of that observable is binned at once! (Pretty cool, right?) diff --git a/gaps-1.1/doc/sections/paraveto.png b/gaps-1.1/doc/sections/paraveto.png new file mode 100644 index 0000000000000000000000000000000000000000..f9b265ed72e212201341629e31c13483933a563f Binary files /dev/null and b/gaps-1.1/doc/sections/paraveto.png differ diff --git a/gaps-1.1/doc/sections/parton-shower.md b/gaps-1.1/doc/sections/parton-shower.md new file mode 100644 index 0000000000000000000000000000000000000000..a5fc8c42144cccb00247b70b9d8d26824956f645 --- /dev/null +++ b/gaps-1.1/doc/sections/parton-shower.md @@ -0,0 +1,21 @@ +# Parton Shower + +Now I won't really go too much in depth with this page, there's a whole paper on it... you're most likely here after reading it, too... + +Either way, here is a breakdown: Due to Independent Thread Scheduling [1](https://developer.nvidia.com/blog/inside-volta/), we have access to for and while loops. 
This means that we can adapt our original veto algorithm: + + + +Into something like this: + + + +The GPU handles each event independently and works regardless of its size. As a security measure, the "Sync GPU and Check Everything" function is run after each step so that there are no spills. Furthermore, when a thread encounters a veto event, it can skip it. + +Just a few things to point out that may not have made it to the paper: + +- Max number of Partons: It is much faster to have a static event size than a dynamic size, as most of the time is used to manage memory. So we set the maximum number of partons to about 30 (we usually see at most 17-18 partons from the shower). While we are in the LEP phase, we shouldn't have a problem, but once we move on to the LHC, we'll have to be careful about this. + +- Splitting Kernels as a "shared" object: I have had a LOT of trouble making the splitting kernel an object that can be shared across the device without making a copy. So, instead, as a first solution, we manually sample through all kernels and have the relevant functions written into the CUDA kernel. One possible improvement to this could be a code system where the code determines the splitting, the case (FF, FI, IF, and II) and the relevant partons (see v1.1.0) + +Besides that, the code is thoroughly commented, so a potential developer should find it helpful. 
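To give a feel for the code system mentioned above, here is a minimal sketch of how a four-digit splitting code could be unpacked. It assumes the digit convention documented in `splittings.cpp` (type, DGLAP splitting, particle/antiparticle, flavour). The names `SplittingCode`, `decodeSplitting` and `splittingFlavours` are hypothetical and not part of GAPS, and the flavour ordering for `q -> gq` is my own assumption, since the shipped `sfToFlavs` only covers the FF cases.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper (not part of GAPS): the four digits of a splitting code.
struct SplittingCode {
  int type;        // 1st digit: 0 = FF, 1 = FI, 2 = IF, 3 = II
  int dglap;       // 2nd digit: 0 = q->qg, 1 = q->gq, 2 = g->gg, 3 = g->qq
  bool antiquark;  // 3rd digit: true if the emitter is an antiparticle
  int flavour;     // 4th digit: 1-5 for d, u, s, c, b; 0 for a gluon
};

// Unpack the digits with integer division and modulo, as validateSplitting does.
inline SplittingCode decodeSplitting(int sf) {
  assert(sf >= 0 && sf < 4000);  // only four splitting types are defined
  return {sf / 1000, (sf / 100) % 10, ((sf / 10) % 10) == 1, sf % 10};
}

// PDG ids of {emitter before, emitter after, emitted parton}.
inline std::array<int, 3> splittingFlavours(const SplittingCode& c) {
  const int gluon = 21;
  const int q = c.antiquark ? -c.flavour : c.flavour;
  switch (c.dglap) {
    case 0:
      return {q, q, gluon};  // q -> q g
    case 1:
      return {q, gluon, q};  // q -> g q (ordering assumed for illustration)
    case 2:
      return {gluon, gluon, gluon};  // g -> g g
    default:
      return {gluon, c.flavour, -c.flavour};  // g -> q qbar
  }
}
```

For example, `splittingFlavours(decodeSplitting(12))` describes FF ubar -> ubar g, and `decodeSplitting(2302)` describes IF g -> u ubar. Keeping the decoding in plain functions, rather than in virtual classes, follows the same reasoning as the switch-case approach used for the splitting functions themselves.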
diff --git a/gaps-1.1/doc/sections/structure.png b/gaps-1.1/doc/sections/structure.png new file mode 100644 index 0000000000000000000000000000000000000000..d0e53cff95d26323a00ef47bf17fcc15b9beae80 Binary files /dev/null and b/gaps-1.1/doc/sections/structure.png differ diff --git a/gaps-1.1/doc/sections/veto.png b/gaps-1.1/doc/sections/veto.png new file mode 100644 index 0000000000000000000000000000000000000000..312ff9e386705523c7da92b94026e4b0a2764589 Binary files /dev/null and b/gaps-1.1/doc/sections/veto.png differ diff --git a/gaps-1.1/doc/tutorials/basicExample.cu b/gaps-1.1/doc/tutorials/basicExample.cu new file mode 100644 index 0000000000000000000000000000000000000000..8187262ce3247dcdf701c8d835ad6f1d1120d72e --- /dev/null +++ b/gaps-1.1/doc/tutorials/basicExample.cu @@ -0,0 +1,32 @@ +int main(void) { + // DOES NOT RUN - JUST TO SHOW THE STRUCTURE OF A BASIC CUDA PROGRAM + + // Declare Variables + // h_ = on host, d_ = on device + int *h_c, *d_c; + + // Allocate memory on the device + // cudaMalloc( Location of the Memory, Size of the Memory ) + cudaMalloc((void **)&d_c, sizeof(int)); + + // If h_c initialised, copy info from h_c to d_c + // cudaMemcpy( destination, source, numBytes, Direction ) + cudaMemcpy(d_c, h_c, sizeof(int), cudaMemcpyHostToDevice); + + // Kernel Configuration Parameters + dim3 grid_size(1); + dim3 block_size(1); + + // Launch the Kernel + kernel<<<grid_size, block_size>>>(...); + + // Copy data back to host + // cudaMemcpy( destination, source, numBytes, Direction ) + cudaMemcpy(h_c, d_c, sizeof(int), cudaMemcpyDeviceToHost); + + // Deallocate Memory + cudaFree(d_c); + free(h_c); + + return 0; +} diff --git a/gaps-1.1/doc/tutorials/parallelFor.cu b/gaps-1.1/doc/tutorials/parallelFor.cu new file mode 100644 index 0000000000000000000000000000000000000000..f29a3a370c31d0f7fa7f0011982a9a7b1538a127 --- /dev/null +++ b/gaps-1.1/doc/tutorials/parallelFor.cu @@ -0,0 +1,106 @@ +// Adapted from https://github.com/olcf-tutorials/vector_addition_cuda 
+#include <stdio.h> + +// Large array, 2^20 +#define N 1048576 + +// Kernel that adds the elements of two vectors +// Global = Called on Host, Run on Device +__global__ void add_vectors(double *a, double *b, double *out) { + /* + Indexing within Grids + --------------------- + + Can use dim3 variables to get the index of a block/thread, + as well as the size of a grid/block + + dim3 blockIdx - unique + dim3 threadIdx - unique in own block + dim3 gridDim - size of grid + dim3 blockDim - size of block + + A useful indexing command is + blockDim.x * blockIdx.x + threadIdx.x + - blockDim.x * blockIdx.x gives the global index of the first thread in a block + - + threadIdx.x allows access to threads in a block + */ + + // Shared Memory Example + /* + __shared__ int shared_array[N]; + shared_array[i] = in[i] // Each Thread writes to one element of s_a + */ + + int id = blockDim.x * blockIdx.x + threadIdx.x; + if (id < N) out[id] = a[id] + b[id]; +} + +// Main Program +int main() { + // Time the execution + cudaEvent_t start, stop; + float time; + cudaEventCreate(&start); + cudaEventCreate(&stop); + + // Number of bytes to allocate for N Doubles + size_t bytes = N * sizeof(double); + + // Allocate memory for arrays A, B and Out on host + // malloc returns void*, use (double*) to match the pointers + double *h_a = (double *)malloc(bytes); + double *h_b = (double *)malloc(bytes); + double *h_out = (double *)malloc(bytes); + + // Allocate memory for arrays A, B and Out on Device + double *d_a, *d_b, *d_out; + cudaMalloc(&d_a, bytes); + cudaMalloc(&d_b, bytes); + cudaMalloc(&d_out, bytes); + + // Fill A and B + for (int i = 0; i < N; i++) { + h_a[i] = 1.; + h_b[i] = 2.; + } + + // Copy data from host to device + cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice); + cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice); + + // Grid and Block Size + int block_size = 256; + int grid_size = ceil(float(N) / block_size); + + // Start Timer + cudaEventRecord(start, 0); + + // Launch kernel + 
add_vectors<<<grid_size, block_size>>>(d_a, d_b, d_out); + + // End Timer + cudaEventRecord(stop, 0); + cudaEventSynchronize(stop); + + // Get Elapsed Time + cudaEventElapsedTime(&time, start, stop); + cudaEventDestroy(start); + cudaEventDestroy(stop); + + // Copy data from device to host + cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost); + + // Free Memory + free(h_a); + free(h_b); + free(h_out); + cudaFree(d_a); + cudaFree(d_b); + cudaFree(d_out); + + // Print if everything works + printf("\n---------------------------\n"); + printf("SUCCESS\n"); + printf("%f\n", time); + printf("---------------------------\n"); +} diff --git a/gaps-1.1/doc/tutorials/saxpy.cu b/gaps-1.1/doc/tutorials/saxpy.cu new file mode 100644 index 0000000000000000000000000000000000000000..1aa2fbae539ee2ee6c9058f6eabb3f2509052a02 --- /dev/null +++ b/gaps-1.1/doc/tutorials/saxpy.cu @@ -0,0 +1,45 @@ +// Adapted From +// https://developer.nvidia.com/blog/easy-introduction-cuda-c-and-c/ +#include <stdio.h> + +// Large array, 2^20 +#define N 1048576 + +// Kernel function computing y = a * x + y (SAXPY) element by element +__global__ void saxpy(int n, float a, float *x, float *y) { + int i = blockIdx.x * blockDim.x + threadIdx.x; + if (i < n) y[i] = a * x[i] + y[i]; +} + +int main(void) { + // Host input vectors + float *h_x, *h_y, *d_x, *d_y; + h_x = (float *)malloc(N * sizeof(float)); + h_y = (float *)malloc(N * sizeof(float)); + + // Device input vectors + cudaMalloc(&d_x, N * sizeof(float)); + cudaMalloc(&d_y, N * sizeof(float)); + + // Initialize x and y arrays on the host + for (int i = 0; i < N; i++) { + h_x[i] = 1.0f; + h_y[i] = 2.0f; + } + + // Copy data from host to device + cudaMemcpy(d_x, h_x, N * sizeof(float), cudaMemcpyHostToDevice); + cudaMemcpy(d_y, h_y, N * sizeof(float), cudaMemcpyHostToDevice); + + // Perform SAXPY on 1M elements + saxpy<<<(N + 255) / 256, 256>>>(N, 2.0f, d_x, d_y); + + // Copy data from device to host + cudaMemcpy(h_y, d_y, N * sizeof(float), cudaMemcpyDeviceToHost); + + // 
Cleanup + cudaFree(d_x); + cudaFree(d_y); + free(h_x); + free(h_y); +} diff --git a/gaps-1.1/gaps/.vscode/settings.json b/gaps-1.1/gaps/.vscode/settings.json new file mode 100644 index 0000000000000000000000000000000000000000..23fd35f0e0e708ef622c7d957b9c8bb60c7876eb --- /dev/null +++ b/gaps-1.1/gaps/.vscode/settings.json @@ -0,0 +1,3 @@ +{ + "editor.formatOnSave": true +} \ No newline at end of file diff --git a/gaps-1.1/gaps/CMakeLists.txt b/gaps-1.1/gaps/CMakeLists.txt new file mode 100644 index 0000000000000000000000000000000000000000..0a701379bbfe794620f833ff1c6f8f548562207b --- /dev/null +++ b/gaps-1.1/gaps/CMakeLists.txt @@ -0,0 +1,43 @@ +# Minimum required version of CMake +cmake_minimum_required(VERSION 3.10) + +# Enable CUDA +enable_language(CUDA) + +# Project name and languages used +project(gaps LANGUAGES CXX CUDA) + +# Set C++ standard +set(CMAKE_CXX_STANDARD 17) +set(CMAKE_CXX_STANDARD_REQUIRED ON) + +# Add the --expt-extended-lambda flag to the CUDA flags +set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} --expt-extended-lambda") + +# List of subdirectories +set(SUBDIRS base matrix shower observables) + +# Include the directories for the headers +foreach(subdir ${SUBDIRS}) + include_directories(${subdir}/include) + add_subdirectory(${subdir}) +endforeach() + +# Set the directory for the executable +set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_SOURCE_DIR}/bin) + +# Add executable +add_executable(gaps main.cu) + +# Link the libraries from the subdirectories +target_link_libraries(gaps base matrix shower observables) + +# Set CUDA architecture to 7.0 for Tesla V100 +set_property(TARGET gaps PROPERTY CUDA_ARCHITECTURES 70) + +# Set CUDA properties +set_target_properties( + gaps + PROPERTIES + CUDA_SEPARABLE_COMPILATION ON +) \ No newline at end of file diff --git a/gaps-1.1/gaps/base/CMakeLists.txt b/gaps-1.1/gaps/base/CMakeLists.txt new file mode 100644 index 0000000000000000000000000000000000000000..8f87bb37891c243f2fc208fbc3e0da52e59b8c61 --- /dev/null +++ 
b/gaps-1.1/gaps/base/CMakeLists.txt @@ -0,0 +1,18 @@ +cmake_minimum_required(VERSION 3.10) +project(base) + +# Enable CUDA +enable_language(CUDA) + +set(CMAKE_CXX_STANDARD 17) + +include_directories(include) +file(GLOB SOURCES "src/*.cu") + +# Set all source files to compile with CUDA +set_source_files_properties(${SOURCES} PROPERTIES LANGUAGE CUDA) + +add_library(base ${SOURCES}) + +# Set CUDA architecture to 7.0 for Tesla V100 +set_property(TARGET base PROPERTY CUDA_ARCHITECTURES 70) \ No newline at end of file diff --git a/gaps-1.1/gaps/base/include/base.cuh b/gaps-1.1/gaps/base/include/base.cuh new file mode 100644 index 0000000000000000000000000000000000000000..cdee98787b7cc8f26572832be3756ff532ae0995 --- /dev/null +++ b/gaps-1.1/gaps/base/include/base.cuh @@ -0,0 +1,69 @@ +#ifndef BASE_CUH_ +#define BASE_CUH_ + +/** + * The Base Class + * -------------- + * + * This file contains the necessary includes and definitions that are used + * throughout the program. This includes CUDA Libraries, Thrust Libraries, + * C++ Libraries, and some global variables. Make changes here if you want to + * change the global settings of the program! + * + * (but also be careful with what you change, as it may break the program...) 
+ */ + +// ----------------------------------------------------------------------------- +// Import Libraries + +// CUDA Libraries +#include <cuda_runtime.h> // CUDA Runtime +#include <curand_kernel.h> // CURAND Library + +// Thrust Libraries +#include <thrust/device_vector.h> // ALL EVENTS ON DEVICE + +// C++ Libraries (Generally Used) +#include <cmath> // Math Functions +#include <cstdlib> // SYS EXIT Command +#include <fstream> // File I/O +#include <iostream> // Standard I/O + +// ----------------------------------------------------------------------------- +// Program Settings - CAREFUL WITH CHANGES + +// Debugging - only debug levels 0 and 1 (true or false) +const bool debug = false; + +// Max Number of Partons, set to save memory +// at 10^6 Events: +// 50 works for all, but observables calc is slow +// 100 works for ME + PS, but not for Observables +// 30 is more than enough to do e+e- at 91.2 GeV +const int maxPartons = 30; + +// LEP 91.2 settings +const double mz = 91.1876; +const double asmz = 0.118; + +// Cutoff and its value of alpha_s (pre-calculated) +const double tC = 1.; +const double asmax = 0.440886; + +// Number of Histogram Bins: Common for all Plots (for now...) +const int nBins = 100; +const int nBins2D = 100; // NxN Grid + +// Maximum Number of Events, beyond which the program runs in batches +const int maxEvents = 1000000; + +// ----------------------------------------------------------------------------- +// Common Functions + +// Sync Device and Check for CUDA Errors +void syncGPUAndCheck(const char *operation); + +// Debugging Function - Available in Kernels too! 
+__host__ __device__ void DEBUG_MSG(const char *msg); + +#endif // BASE_CUH_ \ No newline at end of file diff --git a/gaps-1.1/gaps/base/include/event.cuh b/gaps-1.1/gaps/base/include/event.cuh new file mode 100644 index 0000000000000000000000000000000000000000..6c4d398df400d38c109f89cf122c437fa5fcd661 --- /dev/null +++ b/gaps-1.1/gaps/base/include/event.cuh @@ -0,0 +1,242 @@ +#ifndef EVENT_CUH_ +#define EVENT_CUH_ + +#include "parton.cuh" + +/** + * The Event Class + * --------------- + * + * This is the backbone of the program. An event contains the differential cross + * section calculated from the ME, the hard partons and the showered partons. + * It also contains the shower parameters which are constantly updated during + * the showering process. The event also contains the analysis variables, which + * are calculated after the showering process is complete. + */ + +class Event { + private: + // Temporary Solution - Allows a limited number of partons + // Better Solution would be to use a dynamic array, but not GPU friendly + Parton partons[maxPartons]; + + // ME Params ----------------------------------------------------------------- + + double dxs = 0.; // Differential Cross Section + int nHard = 0; // Number of Hard Partons + // int nInitial = 0; // Number of Initial Partons (Prep for ISR) + // int nNonParton = 0; // Number of Non-Parton Partons (Prep for ISR) + + // Shower Params ------------------------------------------------------------- + + int nEmission = 0; // Number of Emissions + double showerT = 0.; // Evolution and Splitting Variables + double showerZ = 0.; + double showerY = 0.; + int showerC = 0; // Colour Counter + + // Selecting Winner Emission - Default values that represent no winner + int winSF = 16; + int winDipole[2] = {-1, -1}; + double winParams[2] = {0., 0.}; + + // End Shower Flag + bool endShower = false; // Shower End Flag - used if T < 1 GeV + + // Analysis Variables -------------------------------------------------------- + + 
// Event Validity - Momentum and Colour Conservation + bool validity = true; + + // Jet Rates using the Durham Algorithm + double y23 = -50., y34 = -50., y45 = -50., y56 = -50.; + + // Event Shape Variables - Thrust, Jet Masses and Broadenings + double thr = -50., hjm = -50., ljm = -50., wjb = -50., njb = -50.; + Vec4 t_axis = Vec4(); + + // Dalitz Plot + double dalitz[2] = {-50., -50.}; + + public: + // Constructor --------------------------------------------------------------- + + // Empty, so that we can build our ME, PS onto it + __device__ Event() {} + + // Getters ------------------------------------------------------------------- + + // Access Partons in the Event + __device__ Parton GetParton(int i) const { return partons[i]; } + __device__ int GetSize() const { return nHard + nEmission; } + __device__ int GetHard() const { return nHard; } + __device__ int GetEmissions() const { return nEmission; } + __device__ int GetPartonSize() const { + // -2: e+, e- + return (nHard + nEmission) - 2; + } + + // Get Differential Cross Section + __device__ double GetDxs() const { return dxs; } + + // Get Shower Params + __device__ double GetShowerT() const { return showerT; } + __device__ double GetShowerZ() const { return showerZ; } + __device__ double GetShowerY() const { return showerY; } + __device__ int GetShowerC() const { return showerC; } + + __device__ int GetWinSF() const { return winSF; } + __device__ int GetWinDipole(int i) const { return winDipole[i]; } + __device__ double GetWinParam(int i) const { return winParams[i]; } + + __device__ bool GetEndShower() const { return endShower; } + + // Analysis Getters + __device__ bool GetValidity() const { return validity; } + + __device__ double GetY23() const { return y23; } + __device__ double GetY34() const { return y34; } + __device__ double GetY45() const { return y45; } + __device__ double GetY56() const { return y56; } + __device__ double GetThr() const { return thr; } + __device__ double GetHJM() const { 
return hjm; } + __device__ double GetLJM() const { return ljm; } + __device__ double GetWJB() const { return wjb; } + __device__ double GetNJB() const { return njb; } + + __device__ Vec4 GetTAxis() const { return t_axis; } + + __device__ double GetDalitz(int i) const { return dalitz[i]; } + + // Setters ------------------------------------------------------------------- + + // Add / Replace Parton + __device__ void SetParton(int i, Parton parton) { partons[i] = parton; } + + // Set Parton Data + __device__ void SetPartonPid(int i, int pid) { partons[i].SetPid(pid); } + __device__ void SetPartonMom(int i, Vec4 mom) { partons[i].SetMom(mom); } + __device__ void SetPartonCol(int i, int col) { partons[i].SetCol(col); } + __device__ void SetPartonAntiCol(int i, int anticol) { + partons[i].SetAntiCol(anticol); + } + + // Set Differential Cross Section and nHard + __device__ void SetDxs(double dxs) { this->dxs = dxs; } + __device__ void SetHard(int nHard) { this->nHard = nHard; } + + // Adjust and Increment Number of Emissions + __device__ void SetEmissions(int nEmission) { this->nEmission = nEmission; } + __device__ void IncrementEmissions() { nEmission++; } + + // Set Shower Params + __device__ void SetShowerT(double showerT) { this->showerT = showerT; } + __device__ void SetShowerZ(double showerZ) { this->showerZ = showerZ; } + __device__ void SetShowerY(double showerY) { this->showerY = showerY; } + + __device__ void SetShowerC(int showerC) { this->showerC = showerC; } + __device__ void IncrementShowerC() { showerC++; } + + __device__ void SetWinSF(int winSF) { this->winSF = winSF; } + __device__ void SetWinDipole(int i, int winParton) { + this->winDipole[i] = winParton; + } + __device__ void SetWinParam(int i, double winParam) { + this->winParams[i] = winParam; + } + + __device__ void SetEndShower(bool endShower) { this->endShower = endShower; } + + // Set Analysis Variables + __device__ void SetValidity(bool validity) { this->validity = validity; } + + __device__ 
void SetY23(double y23) { this->y23 = y23; } + __device__ void SetY34(double y34) { this->y34 = y34; } + __device__ void SetY45(double y45) { this->y45 = y45; } + __device__ void SetY56(double y56) { this->y56 = y56; } + __device__ void SetThr(double thr) { this->thr = thr; } + __device__ void SetHJM(double hjm) { this->hjm = hjm; } + __device__ void SetLJM(double ljm) { this->ljm = ljm; } + __device__ void SetWJB(double wjb) { this->wjb = wjb; } + __device__ void SetNJB(double njb) { this->njb = njb; } + + __device__ void SetTAxis(Vec4 t_axis) { this->t_axis = t_axis; } + + __device__ void SetDalitz(double x1, double x2) { + dalitz[0] = x1; + dalitz[1] = x2; + } + + // Member Functions ---------------------------------------------------------- + + // Validate the Event - Check Momentum and Colour Conservation + __device__ bool Validate() { + Vec4 psum = Vec4(); + + int csum[100] = {0}; + + for (int i = 0; i < GetSize(); i++) { + Parton p = GetParton(i); + + Vec4 pmom = p.GetMom(); + int pcol = p.GetCol(); + int pAntiCol = p.GetAntiCol(); + + psum = psum + pmom; + + if (pcol > 0) { + csum[pcol] += 1; + } + + if (pAntiCol > 0) { + csum[pAntiCol] -= 1; + } + } + + bool pcheck = (psum[0] < 1e-12 && psum[1] < 1e-12 && psum[2] < 1e-12 && + psum[3] < 1e-12); + + /* // No need to print for GPU, it counts number of invalid events + if (!pcheck) { + printf("%f %f %f %f\n", psum[0], psum[1], psum[2], psum[3]); + } + */ + + bool ccheck = true; + for (int i = 0; i < maxPartons - 1; i++) { + if (csum[i] != 0) { + // printf("Colour %d is not conserved.\n", i); + ccheck = false; + break; + } + } + + return pcheck && ccheck; + } + + __device__ void print_info() const { + printf("Event Information:\n"); + printf("Dxs: %f\n", GetDxs()); + printf("Number of Emissions: %d\n", GetEmissions()); + printf("Shower Parameters:\n"); + printf(" T: %f\n", GetShowerT()); + printf(" Y: %f\n", GetShowerY()); + printf(" Z: %f\n", GetShowerZ()); + printf(" C: %d\n", GetShowerC()); + printf("Shower 
Winner:\n"); + printf(" Kernel Number: %d\n", GetWinSF()); + printf(" Partons: [%d, %d]\n", GetWinDipole(0), GetWinDipole(1)); + printf(" Params: [%f, %f]\n", GetWinParam(0), GetWinParam(1)); + printf("Partons:\n"); + for (int i = 0; i < GetSize(); i++) { + Parton parton = GetParton(i); + printf(" Parton %d:\n", i); + printf(" Pid: %d\n", parton.GetPid()); + printf(" Mom: %f\n", parton.GetMom().P()); + printf(" Col: %d\n", parton.GetCol()); + printf(" AntiCol: %d\n", parton.GetAntiCol()); + } + } +}; + +#endif // EVENT_CUH_ \ No newline at end of file diff --git a/gaps-1.1/gaps/base/include/histogram.cuh b/gaps-1.1/gaps/base/include/histogram.cuh new file mode 100644 index 0000000000000000000000000000000000000000..2fe9540ab492352f06c91963442fdf264f94cdca --- /dev/null +++ b/gaps-1.1/gaps/base/include/histogram.cuh @@ -0,0 +1,259 @@ +#ifndef HISTOGRAM_CUH_ +#define HISTOGRAM_CUH_ + +#include "base.cuh" + +/** + * Binning and Histogramming + * ------------------------- + * + * This file contains tools needed for binning and histogramming data. The data + * is binned and then stored as a Yoda file[1] + * + * Yoda: https://yoda.hepforge.org/ + */ + +class Bin1D { + public: + double xmin, xmax, w, w2, wx, wx2, n; + + __host__ __device__ Bin1D(double xmin = 0., double xmax = 0.) + : xmin(xmin), xmax(xmax), w(0.), w2(0.), wx(0.), wx2(0.), n(0.) {} + + __host__ __device__ double Width() const { return xmax - xmin; } + + __device__ void AtomicFill(double x, double weight) { + atomicAdd(&w, weight); + atomicAdd(&w2, weight * weight); + atomicAdd(&wx, weight * x); + atomicAdd(&wx2, weight * weight * x); + atomicAdd(&n, 1.); + } + + __host__ __device__ void ScaleW(double scale) { + w *= scale; + w2 *= scale * scale; + wx *= scale; + wx2 *= scale * scale; + } +}; + +class Bin2D { + public: + double xmin, xmax, ymin, ymax, w, w2, wx, wx2, wy, wy2, wxy, n; + + public: + __host__ __device__ Bin2D(double xmin = 0., double xmax = 0., + double ymin = 0., double ymax = 0.) 
+ : xmin(xmin), + xmax(xmax), + ymin(ymin), + ymax(ymax), + w(0.), + w2(0.), + wx(0.), + wx2(0.), + wy(0.), + wy2(0.), + wxy(0.), + n(0.) {} + + __host__ __device__ double WidthX() const { return xmax - xmin; } + __host__ __device__ double WidthY() const { return ymax - ymin; } + + __device__ void AtomicFill(double x, double y, double weight) { + atomicAdd(&w, weight); + atomicAdd(&w2, weight * weight); + atomicAdd(&wx, weight * x); + atomicAdd(&wx2, weight * weight * x); + atomicAdd(&wy, weight * y); + atomicAdd(&wy2, weight * weight * y); + atomicAdd(&wxy, weight * x * y); + atomicAdd(&n, 1.); + } + + __host__ __device__ void ScaleW(double scale) { + w *= scale; + w2 *= scale * scale; + wx *= scale; + wx2 *= scale * scale; + wy *= scale; + wy2 *= scale * scale; + wxy *= scale * scale; + } +}; + +/** + * Name component of Histo1D + * ------------------------- + * + * Unfortunately, std::string is not available in CUDA device code, so the name + * has to be provided additionally when writing to file. This is the only + * difference between the CUDA and C++ versions of the code. + */ +// Histo1D class +class Histo1D { + public: + Bin1D bins[nBins]; // Array of Bin1D objects on the device + Bin1D uflow; + Bin1D oflow; + Bin1D total; + double scale; + // static constexpr int nbin = nBins; // SET IN BASE.CUH + + public: + __host__ __device__ Histo1D(double xmin = 0., double xmax = 1.) + : uflow(xmin - 100., xmin), + oflow(xmax, xmax + 100.), + total(xmin - 100., xmax + 100.), + scale(1.) { + double width = (xmax - xmin) / nBins; + for (int i = 0; i < nBins; ++i) { + double xlow = xmin + i * width; + double xhigh = xlow + width; + bins[i] = Bin1D(xlow, xhigh); // Initialize Bin1D object on the device + } + } + + // Atomic Binning! 
Each event is binned simultaneously here + __device__ void Fill(double x, double w) { + int l = 0; + int r = nBins - 1; + int c = (l + r) / 2; + double a = bins[c].xmin; + + while (r - l > 1) { + if (x < a) { + r = c; + } else { + l = c; + } + c = (l + r) / 2; + a = bins[c].xmin; + } + + if (x > bins[r].xmin) { + if (x > bins[r].xmax) { + oflow.AtomicFill(x, w); + } else { + bins[r].AtomicFill(x, w); + } + } else if (x < bins[l].xmin) { + uflow.AtomicFill(x, w); + } else { + bins[l].AtomicFill(x, w); + } + + total.AtomicFill(x, w); + } + + __host__ __device__ void ScaleW(double scale) { + for (int i = 0; i < nBins; ++i) { + bins[i].ScaleW(scale); + } + uflow.ScaleW(scale); + oflow.ScaleW(scale); + total.ScaleW(scale); + this->scale *= scale; + } +}; + +class Histo2D { + public: + Bin2D bins[nBins2D][nBins2D]; + Bin2D uflow; + Bin2D oflow; + Bin2D total; + double scale; + + public: + __host__ __device__ Histo2D(double xmin = 0., double xmax = 1., + double ymin = 0., double ymax = 1.) + : uflow(xmin - 100., xmin, ymin - 100., ymin), + oflow(xmax, xmax + 100., ymax, ymax + 100.), + total(xmin - 100., xmax + 100., ymin - 100., ymax + 100.), + scale(1.) 
{ + double widthX = (xmax - xmin) / nBins2D; + double widthY = (ymax - ymin) / nBins2D; + for (int i = 0; i < nBins2D; ++i) { + for (int j = 0; j < nBins2D; ++j) { + double xlow = xmin + i * widthX; + double xhigh = xlow + widthX; + double ylow = ymin + j * widthY; + double yhigh = ylow + widthY; + bins[i][j] = Bin2D(xlow, xhigh, ylow, yhigh); + } + } + } + + __device__ void Fill(double x, double y, double w) { + // Find the bin for the x-coordinate + int lx = 0; + int rx = nBins2D - 1; + int cx = (lx + rx) / 2; + double ax = bins[cx][0].xmin; + + while (rx - lx > 1) { + if (x < ax) { + rx = cx; + } else { + lx = cx; + } + cx = (lx + rx) / 2; + ax = bins[cx][0].xmin; + } + + // Find the bin for the y-coordinate + int ly = 0; + int ry = nBins2D - 1; + int cy = (ly + ry) / 2; + double ay = bins[0][cy].ymin; + + while (ry - ly > 1) { + if (y < ay) { + ry = cy; + } else { + ly = cy; + } + cy = (ly + ry) / 2; + ay = bins[0][cy].ymin; + } + + // Fill the appropriate bin + if (x > bins[rx][0].xmin && y > bins[0][ry].ymin) { + if (x > bins[rx][0].xmax || y > bins[0][ry].ymax) { + oflow.AtomicFill(x, y, w); + } else { + bins[rx][ry].AtomicFill(x, y, w); + } + } else if (x < bins[lx][0].xmin || y < bins[0][ly].ymin) { + uflow.AtomicFill(x, y, w); + } else { + bins[lx][ly].AtomicFill(x, y, w); + } + + total.AtomicFill(x, y, w); + } + + void ScaleW(double scale) { + for (auto& binRow : bins) { + for (auto& bin : binRow) { + bin.ScaleW(scale); + } + } + uflow.ScaleW(scale); + oflow.ScaleW(scale); + total.ScaleW(scale); + this->scale *= scale; + } +}; + +// Writing is done outside of the class in CUDA implementation +std::string ToString(Histo1D h, std::string name); +void Write(Histo1D h, std::string name, const std::string& filename); + +// Overload for Histo2D +std::string ToString(Histo2D h, std::string name); +void Write(Histo2D h, std::string name, const std::string& filename); + +#endif // HISTOGRAM_CUH_ \ No newline at end of file diff --git 
a/gaps-1.1/gaps/base/include/parton.cuh b/gaps-1.1/gaps/base/include/parton.cuh new file mode 100644 index 0000000000000000000000000000000000000000..f251911bfe7211b4e2a5a77bfbab8bf8be935bf9 --- /dev/null +++ b/gaps-1.1/gaps/base/include/parton.cuh @@ -0,0 +1,45 @@ +#ifndef PARTON_CUH_ +#define PARTON_CUH_ + +// Partons have Vec4 Momentum, Vec4 #includes Base +#include "vec4.cuh" + +/** + * The Parton Class + * ---------------- + + * This file contains the Parton Object, which has attributes ID, momentum and + * colour. For now we use it for Electrons too. + */ + +class Parton { + public: + // Constructor + __device__ Parton(int pid = 0, Vec4 momentum = Vec4(), int col = 0, + int anticol = 0) + : pid(pid), mom(momentum), col(col), anticol(anticol) {} + + // Getters and Setters + __device__ int GetPid() const { return pid; } + __device__ Vec4 GetMom() const { return mom; } + __device__ int GetCol() const { return col; } + __device__ int GetAntiCol() const { return anticol; } + + __device__ void SetPid(int pid) { this->pid = pid; } + __device__ void SetMom(Vec4 mom) { this->mom = mom; } + __device__ void SetCol(int col) { this->col = col; } + __device__ void SetAntiCol(int anticol) { this->anticol = anticol; } + + // If two partons are in a Colour Connected Dipole + __device__ bool IsColorConnected(Parton p) { + return (col > 0 && col == p.anticol) || (anticol > 0 && anticol == p.col); + } + + private: + int pid; + Vec4 mom; + int col; + int anticol; +}; + +#endif // PARTON_CUH_ \ No newline at end of file diff --git a/gaps-1.1/gaps/base/include/qcd.cuh b/gaps-1.1/gaps/base/include/qcd.cuh new file mode 100644 index 0000000000000000000000000000000000000000..416c75f0774b377173ffe613e19ede708f1b1084 --- /dev/null +++ b/gaps-1.1/gaps/base/include/qcd.cuh @@ -0,0 +1,55 @@ +#ifndef QCD_CUH_ +#define QCD_CUH_ + +// Parton includes Base, which has the CUDA libraries +#include "event.cuh" + +/** + * The Strong Coupling Constant + * ---------------------------- + * + * This 
file contains the QCD constants and the alpha_s class. The alpha_s class + * is a simple class that calculates the strong coupling constant at a given + * scale. The class is designed to be used in a CUDA kernel, so it is a simple + * class with no dynamic memory allocation. + */ + +// QCD Constants, maybe you can use for SU(Not 3) ? +const double kNC = 3.; +const double kTR = 0.5; +const double kCA = kNC; +const double kCF = (kNC * kNC - 1.) / (2. * kNC); + +// A lot of the member functions can be defined here, but we need a .cu file to +// define the kernels, so might as well define everything in the .cu file! +class AlphaS { + private: + int order; + double mc2, mb2, mz2, asmz, asmb, asmc; + + public: + // Constructor + __device__ AlphaS(double mz, double asmz, int order = 1, double mb = 4.75, + double mc = 1.27); + + // Setup function for device code + __device__ void setup(double mz, double asmz, int order = 1, double mb = 4.75, + double mc = 1.27); + + // All the required functions to calculate the strong coupling constant + __device__ double Beta0(int nf); + __device__ double Beta1(int nf); + __device__ double As0(double t); + __device__ double As1(double t); + __device__ double operator()(double t); +}; + +// Setup the alpha_s class +__global__ void asSetupKernel(AlphaS *as, double mz, double asmz, int order = 1, + double mb = 4.75, double mc = 1.27); + +// Calculate the strong coupling constant +__global__ void asValue(AlphaS *as, double *asval, double t); +__global__ void asShowerKernel(AlphaS *as, Event *events, double *asval, int N); + +#endif // QCD_CUH_ \ No newline at end of file diff --git a/gaps-1.1/gaps/base/include/vec4.cuh b/gaps-1.1/gaps/base/include/vec4.cuh new file mode 100644 index 0000000000000000000000000000000000000000..43f05a4fed3d56ba293aed23710e6253cb326628 --- /dev/null +++ b/gaps-1.1/gaps/base/include/vec4.cuh @@ -0,0 +1,160 @@ +#ifndef VEC4_CUH_ +#define VEC4_CUH_ + +// Base Class, with all the important definitions +#include 
"base.cuh" + +/** + * Four Momenta + * ------------ + * + * This file contains the definition of the Four Momenta class, which is used to + * represent the four-momentum of particles in the event. It is a simple class + * with the four attributes (E, px, py, pz) and some basic operations that can + * be performed with them. + * + * + * Why is this Header Only? + * ------------------------ + * + * When you declare a __device__ function in a .cuh (header) file and define it + * in a .cu file, that function's definition is only available to the .cu file + * in which it is defined. This is because __device__ functions are compiled by + * nvcc into the device code, and unlike host functions, they do not have + * external linkage that allows them to be seen or linked across different .cu + * files after individual compilation. + */ + +class Vec4 { + private: + double E, px, py, pz; + + public: + // Constructor - Define key attributes Energy and Momentum + // Used in ME and out [HOST + DEVICE] + __device__ Vec4(double E = 0., double px = 0., double py = 0., double pz = 0.) 
+ : E(E), px(px), py(py), pz(pz) {} + + // Get Method to Obtain Attribute Value + __device__ double operator[](int i) const { + switch (i) { + case 0: + return E; + case 1: + return px; + case 2: + return py; + case 3: + return pz; + default: + // CUDA does not support exceptions, so we just return 0 + return 0; + } + } + + // Print a Column Vector with the attributes + friend std::ostream& operator<<(std::ostream& os, const Vec4& v) { + os << "(" << v.E << "," << v.px << "," << v.py << "," << v.pz << ")"; + return os; + } + + // Simple Mathematics with Four vectors + __device__ Vec4 operator+(const Vec4& v) const { + return Vec4(E + v.E, px + v.px, py + v.py, pz + v.pz); + } + + __device__ Vec4 operator-() const { return Vec4(-E, -px, -py, -pz); } + + __device__ Vec4 operator-(const Vec4& v) const { + return Vec4(E - v.E, px - v.px, py - v.py, pz - v.pz); + } + + // Multiplication (and Dot Product) + __device__ double operator*(const Vec4& v) const { + return E * v.E - px * v.px - py * v.py - pz * v.pz; + } + + __device__ Vec4 operator*(double v) const { + return Vec4(E * v, px * v, py * v, pz * v); + } + + // Division + __device__ Vec4 operator/(double v) const { + return Vec4(E / v, px / v, py / v, pz / v); + } + + // Magnitude of the Vector + __device__ double M2() const { return (*this) * (*this); } + + __device__ double M() const { + double m2 = M2(); + return m2 > 0 ? sqrt(m2) : 0; + } + + // 3 Momenta + __device__ double P2() const { return px * px + py * py + pz * pz; } + + __device__ double P() const { + double p2 = P2(); + return p2 > 0 ? sqrt(p2) : 0; + } + + // Transverse Momenta + __device__ double PT2() const { return px * px + py * py; } + + __device__ double PT() const { + double pt2 = PT2(); + return pt2 > 0 ? sqrt(pt2) : 0; + } + + // Angles + __device__ double Theta() const { + double p = P(); + return p != 0 ? 
acos(pz / p) : 0; + } + + __device__ double Phi() const { + if (px == 0 && py == 0) { + return 0.; + } else { + return atan2(py, px); + } + } + + __device__ double Rapidity() const { + double denominator = (E - pz); + return denominator != 0 ? 0.5 * log((E + pz) / denominator) : 0; + } + + __device__ double Eta() const { + double theta = Theta(); + return -log(tan(theta / 2.)); + } + + // Three Momenta Dot and Cross Product + __device__ double Dot(const Vec4& v) const { + return px * v.px + py * v.py + pz * v.pz; + } + + __device__ Vec4 Cross(const Vec4& v) const { + return Vec4(0., py * v.pz - pz * v.py, pz * v.px - px * v.pz, + px * v.py - py * v.px); + } + + // Boosts + __device__ Vec4 Boost(const Vec4& v) const { + double rsq = M(); + double v0 = (E * v.E - px * v.px - py * v.py - pz * v.pz) / rsq; + double c1 = (v.E + v0) / (rsq + E); + return Vec4(v0, v.px - c1 * px, v.py - c1 * py, v.pz - c1 * pz); + } + + __device__ Vec4 BoostBack(const Vec4& v) const { + double rsq = M(); + double v0 = (E * v.E + px * v.px + py * v.py + pz * v.pz) / rsq; + double c1 = (v.E + v0) / (rsq + E); + return Vec4(v0, v.px + c1 * px, v.py + c1 * py, v.pz + c1 * pz); + } +}; + +#endif // VEC4_CUH_ diff --git a/gaps-1.1/gaps/base/src/base.cu b/gaps-1.1/gaps/base/src/base.cu new file mode 100644 index 0000000000000000000000000000000000000000..256dbb8bd8b5ef5a75e92f63e59d562a34b30c93 --- /dev/null +++ b/gaps-1.1/gaps/base/src/base.cu @@ -0,0 +1,25 @@ +#include "base.cuh" + +// Sync Device and Check for Errors +void syncGPUAndCheck(const char *operation) { + // synchronize with the device + cudaDeviceSynchronize(); + + // check for an error + cudaError_t error = cudaGetLastError(); + if (error != cudaSuccess) { + // print the CUDA error message + std::cerr << "CUDA error @" << operation << ": " + << cudaGetErrorString(error) << std::endl; + + // abort the program + std::exit(EXIT_FAILURE); + } +} + +// Debug messages +__host__ __device__ void DEBUG_MSG(const char *message) { + if 
(debug) { + printf("DEBUG: %s\n", message); + } +} \ No newline at end of file diff --git a/gaps-1.1/gaps/base/src/histogram.cu b/gaps-1.1/gaps/base/src/histogram.cu new file mode 100644 index 0000000000000000000000000000000000000000..5c6deabbea06b4881bee2000ceb4bc9254974294 --- /dev/null +++ b/gaps-1.1/gaps/base/src/histogram.cu @@ -0,0 +1,83 @@ +#include "histogram.cuh" + +// Libraries needed for file writing +#include <fstream> +#include <iomanip> +#include <sstream> +#include <string> + +// Text for the Yoda File +std::string ToString(Histo1D h, std::string name) { + std::stringstream ss; + ss << "BEGIN YODA_HISTO1D " << name << "\n\n"; + ss << "Path=" << name << "\n\n"; + ss << "ScaledBy=" << h.scale << "\n"; + ss << "Title=\nType=Histo1D\n"; + ss << "# ID\tID\tsumw\tsumw2\tsumwx\tsumwx2\tnumEntries\n"; + ss << std::scientific << std::setprecision(6); + ss << "Total" + << "\t" << h.total.w << "\t" << h.total.w2 << "\t" << h.total.wx << "\t" + << h.total.wx2 << "\t" << static_cast<int>(h.total.n) << "\n"; + ss << "Underflow" + << "\t" << h.uflow.w << "\t" << h.uflow.w2 << "\t" << h.uflow.wx << "\t" + << h.uflow.wx2 << "\t" << static_cast<int>(h.uflow.n) << "\n"; + ss << "Overflow" + << "\t" << h.oflow.w << "\t" << h.oflow.w2 << "\t" << h.oflow.wx << "\t" + << h.oflow.wx2 << "\t" << static_cast<int>(h.oflow.n) << "\n"; + ss << "# xlow\txhigh\tsumw\tsumw2\tsumwx\tsumwx2\tnumEntries\n"; + for (size_t i = 0; i < nBins; ++i) { + ss << std::scientific << std::setprecision(6); + ss << h.bins[i].xmin << "\t" << h.bins[i].xmax << "\t" << h.bins[i].w + << "\t" << h.bins[i].w2 << "\t" << h.bins[i].wx << "\t" << h.bins[i].wx2 + << "\t" << static_cast<int>(h.bins[i].n) << "\n"; + } + ss << "END YODA_HISTO1D\n\n"; + return ss.str(); +} + +// Write the Yoda File +void Write(Histo1D h, std::string name, const std::string& filename) { + std::ofstream file; + file.open(filename, std::ios::out | std::ios::app); + file << ToString(h, name); + file.close(); +} + +std::string 
ToString(Histo2D h, std::string name) { + std::stringstream ss; + ss << "BEGIN YODA_HISTO2D " << name << "\n\n"; + ss << "Path=" << name << "\n\n"; + ss << "ScaledBy=" << h.scale << "\n"; + ss << "Title=\nType=Histo2D\n"; + ss << "# " + "ID\tID\tsumw\tsumw2\tsumwx\tsumwx2\tsumwy\tsumwy2\tsumwxy\tnumEntries" + "\n"; + ss << std::scientific << std::setprecision(6); + ss << "Total" + << "\t" << h.total.w << "\t" << h.total.w2 << "\t" << h.total.wx << "\t" + << h.total.wx2 << "\t" << h.total.wy << "\t" << h.total.wy2 << "\t" + << h.total.wxy << "\t" << static_cast<int>(h.total.n) << "\n"; + ss << "# " + "xlow\txhigh\tylow\tyhigh\tsumw\tsumw2\tsumwx\tsumwx2\tsumwy\tsumwy2\ts" + "umwxy\tnumEntries\n"; + for (size_t i = 0; i < nBins2D; ++i) { + for (size_t j = 0; j < nBins2D; ++j) { + ss << std::scientific << std::setprecision(6); + ss << h.bins[i][j].xmin << "\t" << h.bins[i][j].xmax << "\t" + << h.bins[i][j].ymin << "\t" << h.bins[i][j].ymax << "\t" + << h.bins[i][j].w << "\t" << h.bins[i][j].w2 << "\t" << h.bins[i][j].wx + << "\t" << h.bins[i][j].wx2 << "\t" << h.bins[i][j].wy << "\t" + << h.bins[i][j].wy2 << "\t" << h.bins[i][j].wxy << "\t" + << static_cast<int>(h.bins[i][j].n) << "\n"; + } + } + ss << "END YODA_HISTO2D\n\n"; + return ss.str(); +} + +void Write(Histo2D h, std::string name, const std::string& filename) { + std::ofstream file; + file.open(filename, std::ios::out | std::ios::app); + file << ToString(h, name); + file.close(); +} \ No newline at end of file diff --git a/gaps-1.1/gaps/base/src/qcd.cu b/gaps-1.1/gaps/base/src/qcd.cu new file mode 100644 index 0000000000000000000000000000000000000000..5739ea67bcf5287f6a14bcb5aae9431c11053527 --- /dev/null +++ b/gaps-1.1/gaps/base/src/qcd.cu @@ -0,0 +1,110 @@ +#include "qcd.cuh" + +// Constructor +__device__ AlphaS::AlphaS(double mz, double asmz, int order, double mb, + double mc) + : order(order), + mc2(mc * mc), + mb2(mb * mb), + mz2(mz * mz), + asmz(asmz), + asmb((*this)(mb2)), + asmc((*this)(mc2)) {} + +// 
Setup +__device__ void AlphaS::setup(double mz, double asmz, int order, double mb, + double mc) { + this->order = order; + this->mc2 = mc * mc; + this->mb2 = mb * mb; + this->mz2 = mz * mz; + this->asmz = asmz; + this->asmb = (*this)(mb2); + this->asmc = (*this)(mc2); +} + +// Beta and Alpha S functions +__device__ double AlphaS::Beta0(int nf) { + return (11. / 6. * kCA) - (2. / 3. * kTR * nf); +} + +__device__ double AlphaS::Beta1(int nf) { + return (17. / 6. * kCA * kCA) - ((5. / 3. * kCA + kCF) * kTR * nf); +} + +// Alpha_s at order 0 and 1 (One-Loop and Two-Loop) +__device__ double AlphaS::As0(double t) { + double tref, asref, b0; + if (t >= mb2) { + tref = mz2; + asref = asmz; + b0 = Beta0(5) / (2. * M_PI); + } else if (t >= mc2) { + tref = mb2; + asref = asmb; + b0 = Beta0(4) / (2. * M_PI); + } else { + tref = mc2; + asref = asmc; + b0 = Beta0(3) / (2. * M_PI); + } + return 1. / (1. / asref + b0 * log(t / tref)); +} + +__device__ double AlphaS::As1(double t) { + double tref, asref, b0, b1, w; + if (t >= mb2) { + tref = mz2; + asref = asmz; + b0 = Beta0(5) / (2. * M_PI); + b1 = Beta1(5) / pow(2. * M_PI, 2); + } else if (t >= mc2) { + tref = mb2; + asref = asmb; + b0 = Beta0(4) / (2. * M_PI); + b1 = Beta1(4) / pow(2. * M_PI, 2); + } else { + tref = mc2; + asref = asmc; + b0 = Beta0(3) / (2. * M_PI); + b1 = Beta1(3) / pow(2. * M_PI, 2); + } + w = 1. + b0 * asref * log(t / tref); + return asref / w * (1. 
- b1 / b0 * asref * log(w) / w); +} + +__device__ double AlphaS::operator()(double t) { + if (order == 0) { + return As0(t); + } else { + return As1(t); + } +} + +// Set up Kernel on the Device +__global__ void asSetupKernel(AlphaS *as, double mz, double asmz, int order, + double mb, double mc) { + as->setup(mz, asmz, order, mb, mc); +} + +// Calculate AlphaS on the Device for ONE input +__global__ void asValue(AlphaS *as, double *asval, double t) { + int idx = blockIdx.x * blockDim.x + threadIdx.x; + + if (idx >= 1) return; + + asval[idx] = (*as)(t); + printf("asVal: %f\n", (*as)(t)); +} + +// Calculate AlphaS on the Device for MANY inputs +// Exclusively used for Parton Shower Veto Algorithm +__global__ void asShowerKernel(AlphaS *as, Event *events, double *asval, + int N) { + int idx = blockIdx.x * blockDim.x + threadIdx.x; + + if (idx >= N) return; + Event &ev = events[idx]; + + asval[idx] = (*as)(ev.GetShowerT()); +} diff --git a/gaps-1.1/gaps/main.cu b/gaps-1.1/gaps/main.cu new file mode 100644 index 0000000000000000000000000000000000000000..b63570e27f9852c067d66d07e984d2af7da2374e --- /dev/null +++ b/gaps-1.1/gaps/main.cu @@ -0,0 +1,166 @@ +// To Measure Wall Clock Time and Write to File +#include <chrono> +#include <fstream> + +// Base Components +#include "base.cuh" + +// Matrix Element +#include "matrix.cuh" + +// Parton Shower +#include "shower.cuh" + +// Jet and Event Shape Analysis +#include "observables.cuh" + +/** + * GAPS: a GPU-Amplified Parton Shower + * ----------------------------------- + * + * This program is a simple event generator for e+e- -> partons. It is designed + * to be a simple example of how to use the GPU to calculate matrix elements and + * perform parton showering. The program is designed to be a proof of concept as + * well as an intuitive example of how to use the GPU for event generation. + * + * This program is based on S. Höche's "Introduction to Parton Showers" Python + * tutorial[1]. 
This program calculates ee -> qq and then showers the partons. + * + * [1] https://arxiv.org/abs/1411.4085 and MCNET-CTEQ 2021 Tutorial + */ + +// ----------------------------------------------------------------------------- + +void runGenerator(const int& N, const double& E, const std::string& filename) { + // --------------------------------------------------------------------------- + // Give some information about the simulation + + std::cout << "-------------------------------------------------" << std::endl; + std::cout << "| GAPS: a GPU-Amplified Parton Shower |" << std::endl; + std::cout << "-------------------------------------------------" << std::endl; + std::cout << "Process: e+ e- --> q qbar" << std::endl; + std::cout << "Number of Events: " << N << std::endl; + std::cout << "Center of Mass Energy: " << E << " GeV" << std::endl; + std::cout << "" << std::endl; + + // --------------------------------------------------------------------------- + // Initialisation + + std::cout << "Initialising..." << std::endl; + thrust::device_vector<Event> d_events(N); + + // --------------------------------------------------------------------------- + // ME Generation + + std::cout << "Generating Matrix Elements..." << std::endl; + auto start = std::chrono::high_resolution_clock::now(); + + calcLOME(d_events, E); + + auto end = std::chrono::high_resolution_clock::now(); + std::chrono::duration<double> diff_me = end - start; + + // --------------------------------------------------------------------------- + // Do the Showering + + std::cout << "Showering Partons..." << std::endl; + start = std::chrono::high_resolution_clock::now(); + + runShower(d_events); + + end = std::chrono::high_resolution_clock::now(); + std::chrono::duration<double> diff_sh = end - start; + + // --------------------------------------------------------------------------- + // Analyze Events + + std::cout << "Analysing Events..." 
<< std::endl; + start = std::chrono::high_resolution_clock::now(); + + // Analysis + doAnalysis(d_events, filename); + + end = std::chrono::high_resolution_clock::now(); + std::chrono::duration<double> diff_an = end - start; + + // --------------------------------------------------------------------------- + // Empty the device vector (Not necessary, but good practice) + + d_events.clear(); + d_events.shrink_to_fit(); + + /** + * Maybe in the future, to allow > 10^6 events, we can split the large number + * into smaller batches. Right now, we write events to file directly from the + * do... functions, so the code is not ready for this. + */ + + // --------------------------------------------------------------------------- + // Results + + double diff = diff_me.count() + diff_sh.count() + diff_an.count(); + + std::cout << "" << std::endl; + std::cout << "EVENT GENERATION COMPLETE" << std::endl; + std::cout << "" << std::endl; + std::cout << "ME Time: " << diff_me.count() << " s" << std::endl; + std::cout << "Sh Time: " << diff_sh.count() << " s" << std::endl; + std::cout << "An Time: " << diff_an.count() << " s" << std::endl; + std::cout << "" << std::endl; + std::cout << "Total Time: " << diff << " s" << std::endl; + std::cout << "" << std::endl; + + // Open the file in append mode. This will create the file if it doesn't + // exist. + std::ofstream outfile("gaps-time.dat", std::ios_base::app); + + // Write the ME, shower, analysis and total times to the file. + outfile << diff_me.count() << ", " << diff_sh.count() << ", " + << diff_an.count() << ", " << diff << std::endl; + + // Close the file. 
+ outfile.close(); + + std::cout << "Histograms written to " << filename << std::endl; + std::cout << "Timing data written to gaps-time.dat" << std::endl; + std::cout << "------------------------------------------------" << std::endl; +} +// ----------------------------------------------------------------------------- + +int main(int argc, char* argv[]) { + // Import Settings + int N = argc > 1 ? 
atoi(argv[1]) : 10000; + double E = argc > 2 ? atof(argv[2]) : 91.2; + + // If more than maxEvents, run in batches + if (N > maxEvents) { + std::cout << "-------------------------------------------------" + << std::endl; + std::cout << "More Events than GPU Can Handle at Once!" << std::endl; + std::cout << "Running in batches..." << std::endl; + std::cout << "Please use rivet-merge to combine runs" << std::endl; + + // Split into batches + int nBatches = N / maxEvents; + int nRemainder = N % maxEvents; + std::cout << "Number of Batches: " << nBatches << std::endl; + std::cout << "Size of Remainder: " << nRemainder << std::endl; + + // Run in batches + for (int i = 0; i < nBatches; i++) { + std::string filename = "gaps-" + std::to_string(i) + ".yoda"; + runGenerator(maxEvents, E, filename); + } + + // Run remainder + if (nRemainder > 0) { + std::string filename = "gaps-" + std::to_string(nBatches) + ".yoda"; + runGenerator(nRemainder, E, filename); + } + } else { + runGenerator(N, E, "gaps.yoda"); + } + + return 0; +} +// ----------------------------------------------------------------------------- diff --git a/gaps-1.1/gaps/matrix/CMakeLists.txt b/gaps-1.1/gaps/matrix/CMakeLists.txt new file mode 100644 index 0000000000000000000000000000000000000000..7b5722c28872a5238b1b0aa0cf51e135e54d9907 --- /dev/null +++ b/gaps-1.1/gaps/matrix/CMakeLists.txt @@ -0,0 +1,21 @@ +cmake_minimum_required(VERSION 3.10) +project(matrix) + +# Enable CUDA +enable_language(CUDA) + +set(CMAKE_CXX_STANDARD 17) + +include_directories(include ../base/include) +file(GLOB SOURCES "src/*.cu") + +# Set all source files to compile with CUDA +set_source_files_properties(${SOURCES} PROPERTIES LANGUAGE CUDA) + +add_library(matrix ${SOURCES}) + +# Set CUDA architecture to 7.0 for Tesla V100 +set_property(TARGET matrix PROPERTY CUDA_ARCHITECTURES 70) + +# Link to Base +target_link_libraries(matrix base) \ No newline at end of file diff --git a/gaps-1.1/gaps/matrix/include/matrix.cuh 
b/gaps-1.1/gaps/matrix/include/matrix.cuh new file mode 100644 index 0000000000000000000000000000000000000000..eca606167dfe993ac3e6bdb14fa71f22d00dc5a2 --- /dev/null +++ b/gaps-1.1/gaps/matrix/include/matrix.cuh @@ -0,0 +1,43 @@ +#ifndef MATRIX_CUH +#define MATRIX_CUH + +// Parton includes Base, which has the CUDA libraries +#include "event.cuh" + +/** + * Matrix Element Generation + * ------------------------- + * + * This class is used to generate the leading order matrix element for the + * e+e- -> q qbar process. The ME^2 is calculated simultaneously for all + * events, but with a few random numbers for flavour and direction. This is a + * massless shower, so the system generates theoretically identical events for + * all flavours. + */ +class Matrix { + private: + double alphas, ecms, MZ2, GZ2, alpha, sin2tw, amin, ye, ze, ws; + + public: + // Constructor + Matrix(double alphas = asmz, double ecms = 91.2); + + // Setup function for device code + __device__ void setup(double alphas = asmz, double ecms = 91.2); + + // Leading Order Matrix Element Generation + __device__ double ME2(int fl, double s, double t); + + // Getters + __device__ double GetECMS() { return ecms; }; +}; + +// CUDA Kernels to Setup Matrix and make LO Points +// TIP: CUDA KERNELS CANNOT BE MEMBER FUNCTIONS +__global__ void matrixSetupKernel(Matrix* matrix, double E); +__global__ void loPointKernel(Matrix* matrix, Event* ev, int N); + +// All tasks wrapped in a function +void calcLOME(thrust::device_vector<Event>& d_events, double E); + +#endif // MATRIX_CUH \ No newline at end of file diff --git a/gaps-1.1/gaps/matrix/src/matrix.cu b/gaps-1.1/gaps/matrix/src/matrix.cu new file mode 100644 index 0000000000000000000000000000000000000000..2babef49f02023c779226b1eca40a6103d8e08eb --- /dev/null +++ b/gaps-1.1/gaps/matrix/src/matrix.cu @@ -0,0 +1,131 @@ +#include "matrix.cuh" + +// Host constructor +Matrix::Matrix(double alphas, double ecms) + : alphas(alphas), + ecms(ecms), + MZ2(pow(91.1876, 2.)), 
+ GZ2(pow(2.4952, 2.)), + alpha(1. / 128.802), + sin2tw(0.22293), + amin(1.e-10), + ye(0.5), + ze(0.01), + ws(0.25) {} + +// Device setup function - Default Values in matrix.cuh +__device__ void Matrix::setup(double alphas, double ecms) { + this->alphas = alphas; + this->ecms = ecms; + this->MZ2 = pow(91.1876, 2.); + this->GZ2 = pow(2.4952, 2.); + this->alpha = 1. / 128.802; + this->sin2tw = 0.22293; + this->amin = 1.e-10; + this->ye = 0.5; + this->ze = 0.01; + this->ws = 0.25; +} + +// ME^2 Formula +__device__ double Matrix::ME2(int fl, double s, double t) { + double qe = -1.; + double ae = -0.5; + double ve = ae - 2. * qe * sin2tw; + double qf = (fl == 2 || fl == 4) ? 2. / 3. : -1. / 3.; + double af = (fl == 2 || fl == 4) ? 0.5 : -0.5; + double vf = af - 2. * qf * sin2tw; + double kappa = 1. / (4. * sin2tw * (1. - sin2tw)); + double chi1 = kappa * s * (s - MZ2) / (pow(s - MZ2, 2.) + GZ2 * MZ2); + double chi2 = pow(kappa * s, 2.) / (pow(s - MZ2, 2.) + GZ2 * MZ2); + double term1 = (1. + pow(1. + 2. * t / s, 2.)) * + (pow(qf * qe, 2.) + 2. * (qf * qe * vf * ve) * chi1 + + (ae * ae + ve * ve) * (af * af + vf * vf) * chi2); + double term2 = (1. + 2. * t / s) * (4. * qe * qf * ae * af * chi1 + + 8. * ae * ve * af * vf * chi2); + return pow(4. * M_PI * alpha, 2.) * 3. * (term1 + term2); +} + +// Kernel to set up the Matrix object on the device +__global__ void matrixSetupKernel(Matrix *matrix, double E) { + matrix->setup(asmz, E); +} + +// Kernel to generate the Event +__global__ void loPointKernel(Matrix *matrix, Event *events, int N) { + int idx = blockIdx.x * blockDim.x + threadIdx.x; + + curandState state; + + // Every events[idx] has a seed idx + // curand_init(idx, 0, 0, &states[idx]); + + // Seed with clock64(); idx selects the subsequence used for events[idx] + curand_init(clock64(), idx, 0, &state); + + if (idx >= N) { + return; + } + + Event &ev = events[idx]; + + double ct = 2. * curand_uniform(&state) - 1.; + double st = sqrt(1. - ct * ct); + double phi = 2. 
* M_PI * curand_uniform(&state); + + int fl = curand(&state) % 5 + 1; + double p0 = matrix->GetECMS() / 2.; // Need to use Get because outside class + + Vec4 pa(p0, 0., 0., p0); + Vec4 pb(p0, 0., 0., -p0); + Vec4 p1(p0, p0 * st * cos(phi), p0 * st * sin(phi), p0 * ct); + Vec4 p2(p0, -p0 * st * cos(phi), -p0 * st * sin(phi), -p0 * ct); + + double lome = matrix->ME2(fl, (pa + pb).M2(), (pa - p1).M2()); + + // Calculate the differential cross section + // 5 = 5 flavours (?) + // 3.89379656e8 = Convert from GeV^-2 to pb + // 8 pi = Standard Phase Space Factor + // pow(matrix->GetECMS(), 2.) = center of mass energy squared, s + double dxs = 5. * lome * 3.89379656e8 / (8. * M_PI) / + (2. * pow(matrix->GetECMS(), 2.)); + + Parton p[4] = {Parton(-11, -pa, 0, 0), Parton(11, -pb, 0, 0), + Parton(fl, p1, 1, 0), Parton(-fl, p2, 0, 1)}; + + // Set the Partons + for (int i = 0; i < 4; i++) { + ev.SetParton(i, p[i]); + } + + // Set the ME Params + ev.SetDxs(dxs); + ev.SetHard(4); +} + +// Function to generate the LO Matrix Elements + Momenta +void calcLOME(thrust::device_vector<Event> &d_events, double E) { + // Number of Events - Can get from d_events.size() + int N = d_events.size(); + + // Allocate memory for a Matrix object on the device + Matrix *d_matrix; + cudaMalloc(&d_matrix, sizeof(Matrix)); + + // Set up the device Matrix object + DEBUG_MSG("Running @matrixSetupKernel"); + matrixSetupKernel<<<1, 1>>>(d_matrix, E); + syncGPUAndCheck("matrixSetupKernel"); + + // Generate the LO Matrix Elements + DEBUG_MSG("Running @loPointKernel"); + loPointKernel<<<(N + 255) / 256, 256>>>( + d_matrix, thrust::raw_pointer_cast(d_events.data()), N); + syncGPUAndCheck("loPointKernel"); + + // Free Memory + cudaFree(d_matrix); + + return; +} diff --git a/gaps-1.1/gaps/observables/CMakeLists.txt b/gaps-1.1/gaps/observables/CMakeLists.txt new file mode 100644 index 0000000000000000000000000000000000000000..2beb9366625f6d0d04b09dfc9d36930bdb03993e --- /dev/null +++ 
b/gaps-1.1/gaps/observables/CMakeLists.txt @@ -0,0 +1,21 @@ +cmake_minimum_required(VERSION 3.10) +project(observables) + +# Enable CUDA +enable_language(CUDA) + +set(CMAKE_CXX_STANDARD 17) + +include_directories(include ../base/include) +file(GLOB SOURCES "src/*.cu") + +# Set all source files to compile with CUDA +set_source_files_properties(${SOURCES} PROPERTIES LANGUAGE CUDA) + +add_library(observables ${SOURCES}) + +# Set CUDA architecture to 7.0 for Tesla V100 +set_property(TARGET observables PROPERTY CUDA_ARCHITECTURES 70) + +# Link to Base +target_link_libraries(observables base) \ No newline at end of file diff --git a/gaps-1.1/gaps/observables/include/dalitz.cuh b/gaps-1.1/gaps/observables/include/dalitz.cuh new file mode 100644 index 0000000000000000000000000000000000000000..ecb92cd6a95e7562d18e594b5c5c641627e3d51c --- /dev/null +++ b/gaps-1.1/gaps/observables/include/dalitz.cuh @@ -0,0 +1,9 @@ +#ifndef DALITZ_CUH_ +#define DALITZ_CUH_ + +#include "event.cuh" + +// Dalitz Plot +__global__ void calculateDalitz(Event* events, int N); + +#endif // DALITZ_CUH_ \ No newline at end of file diff --git a/gaps-1.1/gaps/observables/include/eventshapes.cuh b/gaps-1.1/gaps/observables/include/eventshapes.cuh new file mode 100644 index 0000000000000000000000000000000000000000..f1ef393ce97e961d3c7d752aec19a16266320951 --- /dev/null +++ b/gaps-1.1/gaps/observables/include/eventshapes.cuh @@ -0,0 +1,11 @@ +#ifndef EVENTSHAPES_CUH_ +#define EVENTSHAPES_CUH_ + +#include "event.cuh" + +// Event Shapes +__device__ void bubbleSort(Vec4* moms, int n); +__global__ void calculateThr(Event* events, int N); +__global__ void calculateJetMBr(Event* events, int N); + +#endif // EVENTSHAPES_CUH_ \ No newline at end of file diff --git a/gaps-1.1/gaps/observables/include/jetrates.cuh b/gaps-1.1/gaps/observables/include/jetrates.cuh new file mode 100644 index 0000000000000000000000000000000000000000..95f5566b654472726054eb6d458edee1ad349f23 --- /dev/null +++ 
b/gaps-1.1/gaps/observables/include/jetrates.cuh @@ -0,0 +1,10 @@ +#ifndef JETRATES_CUH_ +#define JETRATES_CUH_ + +#include "event.cuh" + +// Jet rates using Durham algorithm +__device__ double Yij(const Vec4& p, const Vec4& q, double ecm2); +__global__ void doCluster(Event* events, int N); + +#endif // JETRATES_CUH_ \ No newline at end of file diff --git a/gaps-1.1/gaps/observables/include/observables.cuh b/gaps-1.1/gaps/observables/include/observables.cuh new file mode 100644 index 0000000000000000000000000000000000000000..0db3d97461f4a403c8e288b7e3b99f704586ae13 --- /dev/null +++ b/gaps-1.1/gaps/observables/include/observables.cuh @@ -0,0 +1,60 @@ +#ifndef DURHAM_CUH_ +#define DURHAM_CUH_ + +// Histogram and Parton include relevant headers +#include "event.cuh" +#include "histogram.cuh" + +// Add Analyses Here +#include "dalitz.cuh" +#include "eventshapes.cuh" +#include "jetrates.cuh" + +/** + * Observables and their Analysis + * ------------------------------- + * + * This file contains the necessary classes and functions to perform a Durham + * algorithm analysis on a set of events, as well as thrust and jet masses + + * broadenings. The observables are calculated here and then analysed using + * Rivet[1]. + * + * [1] Rivet: https://rivet.hepforge.org/ + */ + +class Analysis { + public: + // Similar to Histo1D in C++/Rivet, just split into Host / Device Components + Histo1D hists[10]; + Histo2D dalitz; + + double wtot; // Scale by Weight for 1/sigma d(sigma)/d Observable + double ntot; // Scale by Number for d(sigma)/d Observable + + // Can't have strings as device variables, in future could use char* + __host__ __device__ Analysis() : wtot(0.), ntot(0.) 
{ + hists[0] = Histo1D(-4.3, -0.3); // "/gaps/log10y23" + hists[1] = Histo1D(-4.3, -0.3); // "/gaps/log10y34" + hists[2] = Histo1D(-4.3, -0.3); // "/gaps/log10y45" + hists[3] = Histo1D(-4.3, -0.3); // "/gaps/log10y56" + hists[4] = Histo1D(0., 0.5); // "/gaps/tvalue" + hists[5] = Histo1D(0., 0.5); // "/gaps/tzoomd" + hists[6] = Histo1D(0., 1.); // "/gaps/hjm" + hists[7] = Histo1D(0., 0.5); // "/gaps/ljm" + hists[8] = Histo1D(0., 0.5); // "/gaps/wjb" + hists[9] = Histo1D(0., 0.2); // "/gaps/njb" + + dalitz = Histo2D(0., 1., 0., 1.); // "/gaps/dalitz" + } +}; + +// Validate Events - Colour and Momentum Conservation +__global__ void validateEvents(Event* events, int* invalid, int N); + +// Fill Histograms simultaneously +__global__ void fillHistos(Analysis* an, Event* events, int N); + +// Analysis wrapped in a function +void doAnalysis(thrust::device_vector<Event>& d_events, std::string filename); + +#endif // DURHAM_CUH_ \ No newline at end of file diff --git a/gaps-1.1/gaps/observables/src/dalitz.cu b/gaps-1.1/gaps/observables/src/dalitz.cu new file mode 100644 index 0000000000000000000000000000000000000000..639693d9704fa5d8727d856e580302ce43bf9bc5 --- /dev/null +++ b/gaps-1.1/gaps/observables/src/dalitz.cu @@ -0,0 +1,31 @@ +#include "dalitz.cuh" + +// Dalitz Plot + +__global__ void calculateDalitz(Event* events, int N) { + int idx = blockIdx.x * blockDim.x + threadIdx.x; + + if (idx >= N) { + return; + } + + Event& ev = events[idx]; + + if (!ev.GetValidity() || ev.GetPartonSize() != 3) { + return; + } + + // Obtain Energy from incoming partons + double E = abs(ev.GetParton(0).GetMom()[0] + ev.GetParton(1).GetMom()[0]); + + // By default, element 2 is quark and 3 is antiquark + // i.e. 
emission will be element 4 + Vec4 p1 = ev.GetParton(2).GetMom(); + Vec4 p2 = ev.GetParton(3).GetMom(); + + // Calculate x1 and x2 + double x1 = 2 * p1.P() / E; + double x2 = 2 * p2.P() / E; + + ev.SetDalitz(x1, x2); +} \ No newline at end of file diff --git a/gaps-1.1/gaps/observables/src/eventshapes.cu b/gaps-1.1/gaps/observables/src/eventshapes.cu new file mode 100644 index 0000000000000000000000000000000000000000..4a9001b34d800a49f8fa563d6091a97be40aef54 --- /dev/null +++ b/gaps-1.1/gaps/observables/src/eventshapes.cu @@ -0,0 +1,175 @@ +#include "eventshapes.cuh" + +// Event Shapes + +// Bubble sort to sort the momenta in descending order of P() +__device__ void bubbleSort(Vec4* moms, int n) { + for (int i = 0; i < n - 1; i++) { + for (int j = 0; j < n - i - 1; j++) { + if (moms[j].P() < moms[j + 1].P()) { + Vec4 temp = moms[j]; + moms[j] = moms[j + 1]; + moms[j + 1] = temp; + } + } + } +} + +// Thrust calculation from TASSO +__global__ void calculateThr(Event* events, int N) { + int idx = blockIdx.x * blockDim.x + threadIdx.x; + + if (idx >= N) { + return; + } + + Event& ev = events[idx]; + + if (!ev.GetValidity() || ev.GetPartonSize() < 3) { + return; + } + + Vec4 moms[maxPartons]; + for (int i = 2; i < ev.GetSize(); ++i) { + moms[i - 2] = ev.GetParton(i).GetMom(); + } + + bubbleSort(moms, maxPartons); + + double momsum = 0.; + for (int i = 0; i < ev.GetPartonSize(); ++i) { + momsum += moms[i].P(); + } + + double thr = 0.; + Vec4 t_axis = Vec4(); + + for (int k = 1; k < ev.GetPartonSize(); ++k) { + for (int j = 0; j < k; ++j) { + Vec4 tmp_axis = moms[j].Cross(moms[k]); + Vec4 p_thrust = Vec4(); + Vec4 p_combin[4]; + + for (int i = 0; i < ev.GetPartonSize(); ++i) { + if (i != j && i != k) { + if (moms[i].Dot(tmp_axis) >= 0) { + p_thrust = p_thrust + moms[i]; + } else { + p_thrust = p_thrust - moms[i]; + } + } + } + + p_combin[0] = (p_thrust + moms[j] + moms[k]); + p_combin[1] = (p_thrust + moms[j] - moms[k]); + p_combin[2] = (p_thrust - moms[j] + moms[k]); + 
p_combin[3] = (p_thrust - moms[j] - moms[k]); + + for (int i = 0; i < 4; ++i) { + double temp = p_combin[i].P(); + if (temp > thr) { + thr = temp; + t_axis = p_combin[i]; + } + } + } + } + + thr /= momsum; + thr = 1. - thr; + + t_axis = t_axis / (t_axis).P(); + if (t_axis[3] < 0) { + t_axis = t_axis * -1.; + } + + if (thr < 1e-12) { + thr = -5.; + } + + ev.SetThr(thr); + ev.SetTAxis(t_axis); +} + +// Jet Mass and Broadening +__global__ void calculateJetMBr(Event* events, int N) { + int idx = blockIdx.x * blockDim.x + threadIdx.x; + + if (idx >= N) { + return; + } + + Event& ev = events[idx]; + + if (!ev.GetValidity() || ev.GetPartonSize() < 3) { + return; + } + + Vec4 moms[maxPartons]; + for (int i = 2; i < ev.GetSize(); ++i) { + moms[i - 2] = ev.GetParton(i).GetMom(); + } + + double momsum = 0.; + for (int i = 0; i < ev.GetSize(); ++i) { + momsum += moms[i].P(); + } + + Vec4 p_with, p_against; + int n_with = 0, n_against = 0; + double e_vis = 0., broad_with = 0., broad_against = 0., + broad_denominator = 0.; + + for (int i = 0; i < ev.GetPartonSize(); ++i) { + double mo_para = moms[i].Dot(ev.GetTAxis()); + double mo_perp = (moms[i] - (ev.GetTAxis() * mo_para)).P(); + double enrg = moms[i].P(); + + e_vis += enrg; + broad_denominator += 2. * enrg; + + if (mo_para > 0.) { + p_with = p_with + moms[i]; + broad_with += mo_perp; + n_with++; + } else if (mo_para < 0.) 
{ + p_against = p_against + moms[i]; + broad_against += mo_perp; + n_against++; + } else { + p_with = p_with + (moms[i] * 0.5); + p_against = p_against + (moms[i] * 0.5); + broad_with += 0.5 * mo_perp; + broad_against += 0.5 * mo_perp; + n_with++; + n_against++; + } + } + + double e2_vis = e_vis * e_vis; + + double mass2_with = fabs(p_with.M2() / e2_vis); + double mass2_against = fabs(p_against.M2() / e2_vis); + + double mass_with = sqrt(mass2_with); + double mass_against = sqrt(mass2_against); + + broad_with /= broad_denominator; + broad_against /= broad_denominator; + + double mH = fmax(mass_with, mass_against); + double mL = fmin(mass_with, mass_against); + + double bW = fmax(broad_with, broad_against); + double bN = fmin(broad_with, broad_against); + + if (n_with == 1 || n_against == 1) { + ev.SetHJM(mH); + ev.SetWJB(bW); + } else { + ev.SetHJM(mH); + ev.SetLJM(mL); + ev.SetWJB(bW); + ev.SetNJB(bN); + } +} \ No newline at end of file diff --git a/gaps-1.1/gaps/observables/src/jetrates.cu b/gaps-1.1/gaps/observables/src/jetrates.cu new file mode 100644 index 0000000000000000000000000000000000000000..0014af945ad5b00eab69e53d455b215227d43cf0 --- /dev/null +++ b/gaps-1.1/gaps/observables/src/jetrates.cu @@ -0,0 +1,110 @@ +#include "jetrates.cuh" + +// Jet Rates + +// Yij function Used for the Durham analysis +__device__ double Yij(const Vec4& p, const Vec4& q, double ecm2) { + double pq = p[1] * q[1] + p[2] * q[2] + p[3] * q[3]; + double min_pq = min(p[0], q[0]); + double max_pq = max(pq / sqrt(p.P2() * q.P2()), -1.); + return 2. * pow(min_pq, 2) * (1. 
- min(max_pq, 1.)) / ecm2; +} + +// Durham Clustering Algorithm +__global__ void doCluster(Event* events, int N) { + int idx = blockIdx.x * blockDim.x + threadIdx.x; + + if (idx >= N) { + return; + } + + Event& ev = events[idx]; + + if (!ev.GetValidity()) { + return; + } + + /** + * On the size of arrays during the clustering process: + * + * The number of partons in the event is not known at compile time, so we + * cannot use a fixed size array. We could use a dynamic array, but that + * would require a lot of memory management, and we would have to use + * malloc and free. Instead, we will use a fixed size array, and we will + * assume that the number of partons will not exceed maxPartons. This is + * not a great solution, but ok for now. + */ + + // Get the center of mass energy squared + double ecm2 = (ev.GetParton(0).GetMom() + ev.GetParton(1).GetMom()).M2(); + + // Extract the 4-momenta of the partons + Vec4 p[maxPartons]; + for (int i = 2; i < ev.GetSize(); ++i) { + p[i - 2] = ev.GetParton(i).GetMom(); + } + + // kt2 will store the kt2 values for each clustering step + // If not changed, set to -1 so we can ignore when histogramming + double kt2[maxPartons] = {-1.}; + int counter = 0; + + // Number of partons (which will change when clustered), lower case to avoid N + int n = ev.GetPartonSize(); + + // imap will store the indices of the partons + int imap[maxPartons]; + for (int i = 0; i < ev.GetPartonSize(); ++i) { + imap[i] = i; + } + + // kt2ij will store the kt2 values for each pair of partons + double kt2ij[maxPartons][maxPartons] = {0.}; + double dmin = 1.; + int ii = 0, jj = 0; + for (int i = 0; i < n; ++i) { + for (int j = 0; j < i; ++j) { + double dij = kt2ij[i][j] = Yij(p[i], p[j], ecm2); + if (dij < dmin) { + dmin = dij; + ii = i; + jj = j; + } + } + } + + // Cluster the partons + while (n > 2) { + --n; + kt2[counter] = dmin; + counter++; + int jjx = imap[jj]; + p[jjx] = p[jjx] + p[imap[ii]]; + for (int i = ii; i < n; ++i) { + imap[i] = imap[i + 
1]; + } + for (int j = 0; j < jj; ++j) { + kt2ij[jjx][imap[j]] = Yij(p[jjx], p[imap[j]], ecm2); + } + for (int i = jj + 1; i < n; ++i) { + kt2ij[imap[i]][jjx] = Yij(p[jjx], p[imap[i]], ecm2); + } + dmin = 1.; + for (int i = 0; i < n; ++i) { + for (int j = 0; j < i; ++j) { + double dij = kt2ij[imap[i]][imap[j]]; + if (dij < dmin) { + dmin = dij; + ii = i; + jj = j; + } + } + } + } + + // Store the kt2 values in the output arrays + ev.SetY23(counter > 0 ? log10(kt2[counter - 1 - 0]) : -50.); + ev.SetY34(counter > 1 ? log10(kt2[counter - 1 - 1]) : -50.); + ev.SetY45(counter > 2 ? log10(kt2[counter - 1 - 2]) : -50.); + ev.SetY56(counter > 3 ? log10(kt2[counter - 1 - 3]) : -50.); +} \ No newline at end of file diff --git a/gaps-1.1/gaps/observables/src/observables.cu b/gaps-1.1/gaps/observables/src/observables.cu new file mode 100644 index 0000000000000000000000000000000000000000..08f4edec18edc604124ab92ba11a02acc01916ef --- /dev/null +++ b/gaps-1.1/gaps/observables/src/observables.cu @@ -0,0 +1,160 @@ +#include <fstream> + +#include "observables.cuh" + +// ----------------------------------------------------------------------------- +// Validate Events before binning + +__global__ void validateEvents(Event* events, int* invalid, int N) { + int idx = blockIdx.x * blockDim.x + threadIdx.x; + + if (idx >= N) { + return; + } + + Event& ev = events[idx]; + ev.SetValidity(ev.Validate()); + + if (!ev.GetValidity()) { + // printf("Invalid Event\n"); + atomicAdd(invalid, 1); + } +} + +// ----------------------------------------------------------------------------- +// Analysis + +// Fill the Histograms (Atomically!) 
+__global__ void fillHistos(Analysis* an, Event* events, int N) { + int idx = blockIdx.x * blockDim.x + threadIdx.x; + + if (idx >= N) { + return; + } + + Event& ev = events[idx]; + + an->hists[0].Fill(ev.GetY23(), ev.GetDxs()); + an->hists[1].Fill(ev.GetY34(), ev.GetDxs()); + an->hists[2].Fill(ev.GetY45(), ev.GetDxs()); + an->hists[3].Fill(ev.GetY56(), ev.GetDxs()); + an->hists[4].Fill(ev.GetThr(), ev.GetDxs()); + an->hists[5].Fill(ev.GetThr(), ev.GetDxs()); + an->hists[6].Fill(ev.GetHJM(), ev.GetDxs()); + an->hists[7].Fill(ev.GetLJM(), ev.GetDxs()); + an->hists[8].Fill(ev.GetWJB(), ev.GetDxs()); + an->hists[9].Fill(ev.GetNJB(), ev.GetDxs()); + + // Dalitz Plot is OFF + // an->dalitz.Fill(ev.GetDalitz(0), ev.GetDalitz(1), ev.GetDxs()); + + atomicAdd(&an->wtot, ev.GetDxs()); + atomicAdd(&an->ntot, 1.); +} + +// Run the above kernels +void doAnalysis(thrust::device_vector<Event>& d_events, std::string filename) { + /** + * Only Place for a Host Object + * ---------------------------- + * + * While most of the work is done on the device, one cannot directly write + * to a file from the device. Therefore, we will create a host object to + * store the histograms, and then copy the results back to the host for + * writing to file. 
+ */ + + // Host and Device Analysis Objects + Analysis *h_an, *d_an; + + // Allocate memory for the device analysis object + h_an = new Analysis(); + cudaMalloc(&d_an, sizeof(Analysis)); + cudaMemcpy(d_an, h_an, sizeof(Analysis), cudaMemcpyHostToDevice); + + // Get Event Data + int N = d_events.size(); + Event* d_events_ptr = thrust::raw_pointer_cast(d_events.data()); + + // Validate the Events + int* d_invalid; + cudaMalloc(&d_invalid, sizeof(int)); + cudaMemset(d_invalid, 0, sizeof(int)); + + validateEvents<<<(N + 255) / 256, 256>>>(d_events_ptr, d_invalid, N); + syncGPUAndCheck("validateEvents"); + + int h_invalid; + cudaMemcpy(&h_invalid, d_invalid, sizeof(int), cudaMemcpyDeviceToHost); + cudaFree(d_invalid); + + if (h_invalid > 0) { + std::cout << "" << std::endl; + std::cout << "ERROR: Invalid Events Found" << std::endl; + std::cout << "Number of Invalid Events: " << h_invalid << "\n"; + } + + // Calculate the Observables + doCluster<<<(N + 255) / 256, 256>>>(d_events_ptr, N); + syncGPUAndCheck("doCluster"); + + calculateThr<<<(N + 255) / 256, 256>>>(d_events_ptr, N); + syncGPUAndCheck("calculateThr"); + + calculateJetMBr<<<(N + 255) / 256, 256>>>(d_events_ptr, N); + syncGPUAndCheck("calculateJetMBr"); + + /** + * Why is the Dalitz Plot off? + * --------------------------- + * + * While the Dalitz analysis also benefits from the GPU parallelisation, the + * writing of the data to file severely limits the performance, as instead of + * the usual 100 bins, we have 100^2 = 10000 bins. This takes around 0.04s, + * which is minute in the C++ case, but is in fact 40% of the total analysis + * time! So for our tests, we keep this off, to keep our comparisons fair, + * and relevant to the actual GPU effect. + * + * If you want to turn it on, uncomment the lines in this file, and their + * equivalents in the 'observables.cpp' file. 
+ */ + // calculateDalitz<<<(N + 255) / 256, 256>>>(d_events_ptr, N); + // syncGPUAndCheck("calculateDalitz"); + + // Do the Analysis + fillHistos<<<(N + 255) / 256, 256>>>(d_an, d_events_ptr, N); + syncGPUAndCheck("fillHistos"); + + // Copy the results back to the host + cudaMemcpy(h_an, d_an, sizeof(Analysis), cudaMemcpyDeviceToHost); + + // Normalize the histograms + for (auto& hist : h_an->hists) { + hist.ScaleW(1. / h_an->ntot); + } + + // Dalitz Plot is OFF + // h_an->dalitz.ScaleW(1. / h_an->ntot); + + // Remove existing file + std::remove(filename.c_str()); + + // Write the histograms to file + Write(h_an->hists[0], "/gaps/log10y23\n", filename); + Write(h_an->hists[1], "/gaps/log10y34\n", filename); + Write(h_an->hists[2], "/gaps/log10y45\n", filename); + Write(h_an->hists[3], "/gaps/log10y56\n", filename); + Write(h_an->hists[4], "/gaps/tvalue\n", filename); + Write(h_an->hists[5], "/gaps/tzoomd\n", filename); + Write(h_an->hists[6], "/gaps/hjm\n", filename); + Write(h_an->hists[7], "/gaps/ljm\n", filename); + Write(h_an->hists[8], "/gaps/wjb\n", filename); + Write(h_an->hists[9], "/gaps/njb\n", filename); + + // Dalitz Plot is OFF + // Write(h_an->dalitz, "/gaps/dalitz\n", filename); + + // Clean up + delete h_an; + cudaFree(d_an); +} \ No newline at end of file diff --git a/gaps-1.1/gaps/shower/CMakeLists.txt b/gaps-1.1/gaps/shower/CMakeLists.txt new file mode 100644 index 0000000000000000000000000000000000000000..e8d7ac6732651c62afd813960f0b5af4f5c6ba6d --- /dev/null +++ b/gaps-1.1/gaps/shower/CMakeLists.txt @@ -0,0 +1,21 @@ +cmake_minimum_required(VERSION 3.10) +project(shower) + +# Enable CUDA +enable_language(CUDA) + +set(CMAKE_CXX_STANDARD 17) + +include_directories(include ../base/include) +file(GLOB SOURCES "src/*.cu") + +# Set all source files to compile with CUDA +set_source_files_properties(${SOURCES} PROPERTIES LANGUAGE CUDA) + +add_library(shower ${SOURCES}) + +# Set CUDA architecture to 7.0 for Tesla V100 +set_property(TARGET shower 
PROPERTY CUDA_ARCHITECTURES 70) + +# Link to Base +target_link_libraries(shower base) \ No newline at end of file diff --git a/gaps-1.1/gaps/shower/include/colours.cuh b/gaps-1.1/gaps/shower/include/colours.cuh new file mode 100644 index 0000000000000000000000000000000000000000..d149ab9594951e78352e07e93166f76640c46466 --- /dev/null +++ b/gaps-1.1/gaps/shower/include/colours.cuh @@ -0,0 +1,66 @@ +#ifndef SHOWER_COLOURS_CUH_ +#define SHOWER_COLOURS_CUH_ + +#include "qcd.cuh" + +// Colours + +/** + * Why is this function in a header file? See Vec4.cuh + */ + +__device__ void MakeColours(Event &ev, int *coli, int *colj, const int flavs[3], + const int colij[2], const int colk[2], + const double rand) { + // Increase variable ev.GetShowerC() by 1 + ev.IncrementShowerC(); + + if (flavs[0] != 21) { + if (flavs[0] > 0) { + coli[0] = ev.GetShowerC(); + coli[1] = 0; + colj[0] = colij[0]; + colj[1] = ev.GetShowerC(); + } else { + coli[0] = 0; + coli[1] = ev.GetShowerC(); + colj[0] = ev.GetShowerC(); + colj[1] = colij[1]; + } + } else { + if (flavs[1] == 21) { + if (colij[0] == colk[1]) { + if (colij[1] == colk[0] && rand > 0.5) { + coli[0] = colij[0]; + coli[1] = ev.GetShowerC(); + colj[0] = ev.GetShowerC(); + colj[1] = colij[1]; + } else { + coli[0] = ev.GetShowerC(); + coli[1] = colij[1]; + colj[0] = colij[0]; + colj[1] = ev.GetShowerC(); + } + } else { + coli[0] = colij[0]; + coli[1] = ev.GetShowerC(); + colj[0] = ev.GetShowerC(); + colj[1] = colij[1]; + } + } else { + if (flavs[1] > 0) { + coli[0] = colij[0]; + coli[1] = 0; + colj[0] = 0; + colj[1] = colij[1]; + } else { + coli[0] = 0; + coli[1] = colij[1]; + colj[0] = colij[0]; + colj[1] = 0; + } + } + } +} + +#endif // SHOWER_COLOURS_CUH_ \ No newline at end of file diff --git a/gaps-1.1/gaps/shower/include/kinematics.cuh b/gaps-1.1/gaps/shower/include/kinematics.cuh new file mode 100644 index 0000000000000000000000000000000000000000..52835457d9cb3ddd2250e75920e01e3c87fef00f --- /dev/null +++ 
b/gaps-1.1/gaps/shower/include/kinematics.cuh @@ -0,0 +1,42 @@ +#ifndef SHOWER_KINEMATICS_CUH_ +#define SHOWER_KINEMATICS_CUH_ + +#include "qcd.cuh" + +/** + * Why is this function in a header file? See Vec4.cuh + */ + +__device__ void MakeKinematics(Vec4 *kinematics, const double z, const double y, + const double phi, const Vec4 pijt, + const Vec4 pkt) { + Vec4 Q = pijt + pkt; + + // Generating the Momentum (0, kt1, kt2, 0) + double rkt = sqrt(Q.M2() * y * z * (1. - z)); + + Vec4 kt1 = pijt.Cross(pkt); + if (kt1.P() < 1.e-6) { + Vec4 xaxis(0., 1., 0., 0.); + kt1 = pijt.Cross(xaxis); + } + kt1 = kt1 * (rkt * cos(phi) / kt1.P()); + + Vec4 kt2cms = Q.Boost(pijt); + kt2cms = kt2cms.Cross(kt1); + kt2cms = kt2cms * (rkt * sin(phi) / kt2cms.P()); + Vec4 kt2 = Q.BoostBack(kt2cms); + + // Conversion to {i, j, k} basis + Vec4 pi = pijt * z + pkt * ((1. - z) * y) + kt1 + kt2; + Vec4 pj = pijt * (1. - z) + pkt * (z * y) - kt1 - kt2; + Vec4 pk = pkt * (1. - y); + + // No need to do *kinematics[0], for arrays the elements are already + // pointers + kinematics[0] = pi; + kinematics[1] = pj; + kinematics[2] = pk; +} + +#endif // SHOWER_KINEMATICS_CUH_ \ No newline at end of file diff --git a/gaps-1.1/gaps/shower/include/shower.cuh b/gaps-1.1/gaps/shower/include/shower.cuh new file mode 100644 index 0000000000000000000000000000000000000000..c41c8358a848d9dc9790061f1f99c6271be99827 --- /dev/null +++ b/gaps-1.1/gaps/shower/include/shower.cuh @@ -0,0 +1,56 @@ +#ifndef SHOWER_CUH_ +#define SHOWER_CUH_ + +// qcd includes all the necessary headers +#include "qcd.cuh" + +/** + * A Dipole Shower on GPU + * ---------------------- + * + * NOTE: Kernel = CUDA function, Splitting Function = QCD function + * + * This is the main result of the published work. It is a full implementation of + * a dipole shower on the GPU. It is designed to be as fast as possible*, and + * uses a number of tricks to achieve this. 
The main trick is to use a single + * kernel to perform the entire shower, and to use a number of optimisations to + * make the code as fast as possible. + * + * With the Event Object storing all the necessary information and with the + * fact that kernels can't be member functions, the Shower Class has been + * removed. + * + * *: as possible as a second year PhD student can make it ;) + */ + +// Initialise the curandStates +__global__ void initCurandStates(curandState *states, int N); + +// Prepare Events for the Shower +__global__ void prepShower(Event *events, int N); + +// Selecting the Winner Emission +__global__ void selectWinnerSplitFunc(Event *events, curandState *states, + int N); + +// Check the Cutoff +__global__ void checkCutoff(Event *events, int *d_completed, double cutoff, + int N); + +// Veto Algorithm +__global__ void vetoAlg(Event *events, double *asval, bool *acceptEmission, + curandState *states, int N); + +// Perform the Splitting +__global__ void doSplitting(Event *events, bool *acceptEmission, + curandState *states, int N); + +// Kinematics +__device__ void MakeKinematics(Vec4 *kinematics, const double z, const double y, + const double phi, const Vec4 pijt, + const Vec4 pkt); + +// All tasks wrapped into a function +void runShower(thrust::device_vector<Event> &d_events); + +#endif // SHOWER_CUH_ diff --git a/gaps-1.1/gaps/shower/include/splittings.cuh b/gaps-1.1/gaps/shower/include/splittings.cuh new file mode 100644 index 0000000000000000000000000000000000000000..a5b5bcb15032e26f8a061e7ab6319afa034e3638 --- /dev/null +++ b/gaps-1.1/gaps/shower/include/splittings.cuh @@ -0,0 +1,266 @@ +#ifndef SPLITTINGS_CUH_ +#define SPLITTINGS_CUH_ + +#include "qcd.cuh" + +/** + * Splitting Functions as a function - safer but less sophisticated + * ---------------------------------------------------------------- + * + * This is a safer and more straightforward way to implement the splitting + * functions for the shower. 
Although the class-based approach is good for + * C++, in CUDA many issues arise that mean that OOP might not always be the + * best strategy in coding. As a simpler approach, we will use switch-case + * statements to select the correct splitting function. + * + * We have a LOT of splitting functions: + * - Four types (FF, FI, IF, II) + * - Three or Four Possible DGLAP Splittings (q->qg, q->gq, g->gg, g->qq) + * - Five Flavours of Quarks (d, u, s, c, b) and each of their antiquarks + * - At most, in total: 4 * (10 + 10 + 1 + 5) = 104 splitting functions + * + * So we need to organise ourselves with some kind of structure. As a first + * attempt, let's use four-digit codes to identify the splitting functions: + * + * - 1st digit: Type of Split-Spect (FF, FI, IF, II) - 0, 1, 2, 3 + * - 2nd digit: Type of DGLAP (q->qg, q->gq, g->gg, g->qq) - 0, 1, 2, 3 + * - 3rd digit: Emitter is a Particle or Antiparticle - 0, 1 (gluon is 0) + * - 4th digit: Flavor of the Emitter - 1, 2, 3, 4, 5; 0 for gluon + * + * Examples: + * - FF u -> ug = 0 0 0 2 + * - FF ubar -> ubar g = 0 0 1 2 + * - FF g -> gg = 0 2 0 0 + * - FF g -> ccbar = 0 3 0 4 + * + * - FI u -> ug = 1 0 0 2 + * - FI g -> ccbar = 1 3 0 4 + * + * - IF d -> dg = 2 0 0 1 + * - IF d -> gd = 2 1 0 1 + * - IF sbar -> sbar g = 2 0 1 3 + * - IF g -> uubar = 2 3 0 2 + * + * - II g -> gg = 3 2 0 0 + * - II g -> bbbar = 3 3 0 5 + * + * This way we can easily identify the splitting functions and select the + * correct one using a switch-case statement. This can be used for value, + * estimate, integral and generateZ functions. 
+ */ + +// Splitting Function Codes - Only FF for now (Removed Zeroes) +// ------------------------------------------ +__constant__ int sfCodes[] = {1, 2, 3, 4, 5, 11, 12, 13, + 14, 15, 200, 301, 302, 303, 304, 305}; + +// ----------------------------------------------------------------------------- + +__device__ double sfValue(double z, double y, int sf) { + switch (sf) { + // FF Splittings --------------------------- + + // FF q -> qg + case 1: + case 2: + case 3: + case 4: + case 5: + case 11: + case 12: + case 13: + case 14: + case 15: + return kCF * (2. / (1. - z * (1. - y)) - (1. + z)); + break; + + // FF g -> gg + case 200: + // Asymmetric Splitting Function: Fixed Emitter and Emission in our code + return kCA / 2. * (2. / (1. - z * (1. - y)) - 2. + z * (1. - z)); + break; + + // FF g -> qqbar + case 301: + case 302: + case 303: + case 304: + case 305: + // Divide by 2 to avoid double counting of the same splitting + return kTR / 2. * (1. - 2. * z * (1. - z)); + break; + } + return 0.; +} + +// ----------------------------------------------------------------------------- + +__device__ double sfEstimate(double z, int sf) { + switch (sf) { + // FF Splittings --------------------------- + + // FF q -> qg + case 1: + case 2: + case 3: + case 4: + case 5: + case 11: + case 12: + case 13: + case 14: + case 15: + return kCF * 2. / (1. - z); + break; + + // FF g -> gg + case 200: + return kCA / (1. - z); + break; + + // FF g -> qqbar + case 301: + case 302: + case 303: + case 304: + case 305: + return kTR / 2.; + break; + } + return 0.; +} + +// ----------------------------------------------------------------------------- + +__device__ double sfIntegral(double zm, double zp, int sf) { + switch (sf) { + // FF Splittings --------------------------- + + // FF q -> qg + case 1: + case 2: + case 3: + case 4: + case 5: + case 11: + case 12: + case 13: + case 14: + case 15: + return kCF * 2. * log((1. - zm) / (1. 
- zp)); + break; + + // FF g -> gg + case 200: + return kCA * log((1. - zm) / (1. - zp)); + break; + + // FF g -> qqbar + case 301: + case 302: + case 303: + case 304: + case 305: + return kTR / 2. * (zp - zm); + break; + } + return 0.; +} + +// ----------------------------------------------------------------------------- + +__device__ double sfGenerateZ(double zm, double zp, double rand, int sf) { + switch (sf) { + // FF Splittings --------------------------- + + // FF q -> qg + case 1: + case 2: + case 3: + case 4: + case 5: + case 11: + case 12: + case 13: + case 14: + case 15: + // int = - 2CF * log(1 - z); inv = 1 - exp(-x/2CF) + return 1. + (zp - 1.) * pow((1. - zm) / (1. - zp), rand); + break; + + // FF g -> gg + case 200: + // int = - CA * log(1 - z); inv = 1 - exp(-x/CA) + return 1. + (zp - 1.) * pow((1. - zm) / (1. - zp), rand); + break; + + // FF g -> qqbar + case 301: + case 302: + case 303: + case 304: + case 305: + // int = TR/2 * z; inv = x / (TR/2) + return zm + (zp - zm) * rand; + break; + } + return 0.; +} + +// ----------------------------------------------------------------------------- +// Utility Functions + +__device__ bool validateSplitting(int ij, int sf) { + // Obtain the splitting function code + // int firstDigit = sf / 1000; + int secondDigit = (sf / 100) % 10; + int thirdDigit = (sf / 10) % 10; + int fourthDigit = sf % 10; + + // Insert FF, FI, IF, II checks here + // --------------------------------- + + // Skip if ij is a quark and the sf is not a quark sf (2nd digit), or + // if ij is a gluon and the sf is not a gluon sf (2nd digit) + if ((ij != 21 && secondDigit >= 2) || (ij == 21 && secondDigit < 2)) { + return false; + } + + // Skip if ij is a particle and sf is an antiparticle sf (3rd digit), or + // if ij is an antiparticle and sf is a particle sf (3rd digit) + if ((ij < 0 && thirdDigit == 0) || (ij > 0 && thirdDigit == 1)) { + return false; + } + + // q->qg case: Skip if the flavor of ij is different from the flavor of the sf 
+ // g->gg and g->qq case: No need to check the flavor + if ((ij != 21 && abs(ij) != fourthDigit)) { + return false; + } + + return true; +} + +__device__ void sfToFlavs(int sf, int* flavs) { + if (sf < 16) { + if (sf < 6) { + flavs[0] = sf; + flavs[1] = sf; + flavs[2] = 21; + } else { + flavs[0] = -1 * (sf - 10); + flavs[1] = -1 * (sf - 10); + flavs[2] = 21; + } + } else if (sf == 200) { + flavs[0] = 21; + flavs[1] = 21; + flavs[2] = 21; + } else if (sf < 306) { + flavs[0] = 21; + flavs[1] = sf - 300; + flavs[2] = -1 * (sf - 300); + } +} + +#endif // SPLITTINGS_CUH_ diff --git a/gaps-1.1/gaps/shower/src/shower.cu b/gaps-1.1/gaps/shower/src/shower.cu new file mode 100644 index 0000000000000000000000000000000000000000..4d43ebcf2d3430003fa292d28400881e843e306c --- /dev/null +++ b/gaps-1.1/gaps/shower/src/shower.cu @@ -0,0 +1,462 @@ +#include "shower.cuh" + +// Need to be here to avoid multiple definitions +#include "colours.cuh" +#include "kinematics.cuh" +#include "splittings.cuh" + +// ----------------------------------------------------------------------------- +// Random Number Generator + +// No need during matrix as initialised once and used once only +// But for shower used 80 to 100 times +__global__ void initCurandStates(curandState *states, int N) { + int idx = threadIdx.x + blockIdx.x * blockDim.x; + if (idx >= N) { + return; + } + // Every events[idx] has a seed idx + // curand_init(idx, 0, 0, &states[idx]); + + // Every events[idx] uses clock64() as the seed and idx as the sequence number + curand_init(clock64(), idx, 0, &states[idx]); +} + +// ----------------------------------------------------------------------------- +// Preparing the Shower + +__global__ void prepShower(Event *events, int N) { + int idx = threadIdx.x + blockIdx.x * blockDim.x; + + if (idx >= N) { + return; + } + + Event &ev = events[idx]; + + // Set the starting shower scale + double t_max = (ev.GetParton(0).GetMom() + ev.GetParton(1).GetMom()).M2(); + ev.SetShowerT(t_max); + + // Set the 
initial number of emissions + ev.SetEmissions(0); + + // Set the Colour Counter to 1 (q and qbar) + ev.SetShowerC(1); + + // Set the initial end shower flag - No need, default value is false + // ev.SetEndShower(false); +} + +// ----------------------------------------------------------------------------- + +/** + * Selecting the Winner Splitting Function + * --------------------------------------- + * + * When you profile the code, you will notice that this is the process that + * takes up half of the shower time. This method below is a first attempt at + * parallelizing the process. + */ + +__global__ void selectWinnerSplitFunc(Event *events, curandState *states, + int N) { + int idx = blockIdx.x * blockDim.x + threadIdx.x; + + if (idx >= N) { + return; + } + + curandState state = states[idx]; + + Event &ev = events[idx]; + + // Do not run if the shower has ended + if (ev.GetEndShower()) { + return; + } + + // Default Values + double win_tt = tC; // Lowest possible value is Cutoff Scale (in base.cuh) + int win_sf = 0; // 0 = No Splitting + int win_ij = 0; + int win_k = 0; + double win_zp = 0.; + double win_m2 = 0.; + + // We start at 2 because elements 0 and 1 are electrons - To change with ISR + for (int ij = 2; ij < ev.GetSize(); ij++) { + for (int k = 2; k < ev.GetSize(); k++) { + // Sanity Check to ensure ij != k + if (ij == k) { + continue; + } + + // Need to check if ij and k are colour connected + if (!ev.GetParton(ij).IsColorConnected(ev.GetParton(k))) { + continue; + } + + // Params Identical to all splitting functions + double m2 = (ev.GetParton(ij).GetMom() + ev.GetParton(k).GetMom()).M2(); + if (m2 < 4. * tC) { + continue; + } + + // Phase Space Limits + double zp = 0.5 * (1. + sqrt(1. - 4. * tC / m2)); + + // Codes instead of Object Oriented Approach! 
+ for (int sf : sfCodes) { + // Check if the Splitting Function is valid for the current partons + if (!validateSplitting(ev.GetParton(ij).GetPid(), sf)) { + continue; + } + + // Calculate the Evolution Variable + double g = asmax / (2. * M_PI) * sfIntegral(1 - zp, zp, sf); + double tt = ev.GetShowerT() * pow(curand_uniform(&state), 1. / g); + + states[idx] = state; // So that the next number is not the same! + + // Check if tt is greater than the current winner + if (tt > win_tt) { + win_tt = tt; + win_sf = sf; + win_ij = ij; + win_k = k; + win_zp = zp; + win_m2 = m2; + } + } + } + } + + // Store the results + ev.SetShowerT(win_tt); + ev.SetWinSF(win_sf); + ev.SetWinDipole(0, win_ij); + ev.SetWinDipole(1, win_k); + ev.SetWinParam(0, win_zp); + ev.SetWinParam(1, win_m2); +} + +// ----------------------------------------------------------------------------- + +__global__ void checkCutoff(Event *events, int *d_completed, double cutoff, + int N) { + int idx = blockIdx.x * blockDim.x + threadIdx.x; + + if (idx >= N) { + return; + } + + Event &ev = events[idx]; + + // Do not run if the shower has ended + if (ev.GetEndShower()) { + return; + } + + /** + * End shower if t < cutoff + * + * ev.GetShowerT() <= cutoff is equally valid + * I just prefer this way because this way is + * how we usually write it in the literature + */ + if (!(ev.GetShowerT() > cutoff)) { + ev.SetEndShower(true); + atomicAdd(d_completed, 1); // Increment the number of completed events + } +} + +// ----------------------------------------------------------------------------- + +__global__ void vetoAlg(Event *events, double *asval, bool *acceptEmission, + curandState *states, int N) { + int idx = blockIdx.x * blockDim.x + threadIdx.x; + + if (idx >= N) { + return; + } + + Event &ev = events[idx]; + curandState state = states[idx]; + + // Do not run if the shower has ended + if (ev.GetEndShower()) { + return; + } + + // Set to False, only set to True if accepted + acceptEmission[idx] = false; + + // Get 
the Splitting Function + int sf = ev.GetWinSF(); + + double rand = curand_uniform(&state); + states[idx] = state; + + // Generate z + double zp = ev.GetWinParam(0); + double z = sfGenerateZ(1 - zp, zp, rand, sf); + + double y = ev.GetShowerT() / ev.GetWinParam(1) / z / (1. - z); + + double f = 0.; + double g = 0.; + double value = 0.; + double estimate = 0.; + + // CS Kernel: y can't be 1 + if (y < 1.) { + value = sfValue(z, y, sf); + estimate = sfEstimate(z, sf); + + f = (1. - y) * asval[idx] * value; + g = asmax * estimate; + + if (curand_uniform(&state) < f / g) { + acceptEmission[idx] = true; + ev.SetShowerZ(z); + ev.SetShowerY(y); + } + states[idx] = state; + } +} + +// ----------------------------------------------------------------------------- + +// Do Splitting +__global__ void doSplitting(Event *events, bool *acceptEmission, + curandState *states, int N) { + int idx = blockIdx.x * blockDim.x + threadIdx.x; + + if (idx >= N) { + return; + } + + Event &ev = events[idx]; + + // Do not run if the shower has ended + if (ev.GetEndShower()) { + return; + } + + if (!acceptEmission[idx]) { + return; + } + + curandState state = states[idx]; + + double phi = 2. 
* M_PI * curand_uniform(&state); + states[idx] = state; + + int win_ij = ev.GetWinDipole(0); + int win_k = ev.GetWinDipole(1); + + // Make Kinematics + Vec4 moms[3] = {Vec4(), Vec4(), Vec4()}; + + MakeKinematics(moms, ev.GetShowerZ(), ev.GetShowerY(), phi, + ev.GetParton(win_ij).GetMom(), ev.GetParton(win_k).GetMom()); + + // Adjust Colors + + // Get Flavs from Kernel Number + int sf = ev.GetWinSF(); + int flavs[3]; + sfToFlavs(sf, flavs); + + int colij[2] = {ev.GetParton(win_ij).GetCol(), + ev.GetParton(win_ij).GetAntiCol()}; + + int colk[2] = {ev.GetParton(win_k).GetCol(), + ev.GetParton(win_k).GetAntiCol()}; + + int coli[2] = {0, 0}; + int colj[2] = {0, 0}; + + double rand = curand_uniform(&state); + states[idx] = state; + + MakeColours(ev, coli, colj, flavs, colij, colk, rand); + + // Modify Splitter + ev.SetPartonPid(win_ij, flavs[1]); + ev.SetPartonMom(win_ij, moms[0]); + ev.SetPartonCol(win_ij, coli[0]); + ev.SetPartonAntiCol(win_ij, coli[1]); + + // Modify Recoiled Spectator + ev.SetPartonMom(win_k, moms[2]); + + // Add Emitted Parton + Parton em = Parton(flavs[2], moms[1], colj[0], colj[1]); + ev.SetParton(ev.GetSize(), em); + + // Increment Emissions (IMPORTANT) + ev.IncrementEmissions(); +} + +// ----------------------------------------------------------------------------- + +/* +__global__ void countBools(Event *events, int *trueCount, bool *acceptEmission, + int *falseCount, int N) { + int idx = threadIdx.x + blockIdx.x * blockDim.x; + if (idx >= N) { + return; + } + + Event &ev = events[idx]; + + if (ev.GetEndShower()) { + return; + } + + if (!acceptEmission[idx]){ + atomicAdd(trueCount, 1); + } else { + atomicAdd(falseCount, 1); + } +} +*/ + +// ----------------------------------------------------------------------------- + +void runShower(thrust::device_vector<Event> &d_events) { + // Number of Events - Can get from d_events.size() + int N = d_events.size(); + + // Set up the device alphaS + AlphaS *d_as; + cudaMalloc(&d_as, sizeof(AlphaS)); + 
asSetupKernel<<<1, 1>>>(d_as, mz, asmz); + syncGPUAndCheck("asSetupKernel"); + + // Allocate device memory for completed events counter + int *d_completed; + cudaMalloc(&d_completed, sizeof(int)); + cudaMemset(d_completed, 0, sizeof(int)); + + // as(t) and veto + double *d_asval; + cudaMalloc(&d_asval, N * sizeof(double)); + bool *d_acceptEmission; + cudaMalloc(&d_acceptEmission, N * sizeof(bool)); + + // Allocate space for curand states + curandState *d_states; + cudaMalloc(&d_states, N * sizeof(curandState)); + + // Initialize the states + initCurandStates<<<(N + 255) / 256, 256>>>(d_states, N); + + // Store the number of finished events per cycle + std::vector<int> completedPerCycle; + + // Use a pointer to the device events + Event *d_events_ptr = thrust::raw_pointer_cast(d_events.data()); + + // --------------------------------------------------------------------------- + // Prepare the Shower + + DEBUG_MSG("Running @prepShower"); + prepShower<<<(N + 255) / 256, 256>>>(d_events_ptr, N); + syncGPUAndCheck("prepShower"); + + // --------------------------------------------------------------------------- + // Run the Shower + + int completed = 0; + int cycle = 0; + while (completed < N) { + // Run all the kernels here... 
+ // ------------------------------------------------------------------------- + // Select the winner kernel + + DEBUG_MSG("Running @selectWinnerSplitFunc"); + selectWinnerSplitFunc<<<(N + 255) / 256, 256>>>(d_events_ptr, d_states, N); + syncGPUAndCheck("selectWinnerSplitFunc"); + + // ------------------------------------------------------------------------- + // Check Cutoff + + DEBUG_MSG("Running @checkCutoff"); + checkCutoff<<<(N + 255) / 256, 256>>>(d_events_ptr, d_completed, tC, N); + syncGPUAndCheck("checkCutoff"); + + // ------------------------------------------------------------------------- + // Calculate AlphaS for Veto Algorithm + + DEBUG_MSG("Running @asShowerKernel"); + asShowerKernel<<<(N + 255) / 256, 256>>>(d_as, d_events_ptr, d_asval, N); + syncGPUAndCheck("asShowerKernel"); + + // ------------------------------------------------------------------------- + // Veto Algorithm + + DEBUG_MSG("Running @vetoAlg"); + vetoAlg<<<(N + 255) / 256, 256>>>(d_events_ptr, d_asval, d_acceptEmission, + d_states, N); + syncGPUAndCheck("vetoAlg"); + + // ------------------------------------------------------------------------- + // Splitting Algorithm + + DEBUG_MSG("Running @doSplitting"); + doSplitting<<<(N + 255) / 256, 256>>>(d_events_ptr, d_acceptEmission, + d_states, N); + syncGPUAndCheck("doSplitting"); + + // ------------------------------------------------------------------------- + // Import the Number of Completed Events + + cudaMemcpy(&completed, d_completed, sizeof(int), cudaMemcpyDeviceToHost); + cycle++; + + // Until Paper is Published, we will use this + completedPerCycle.push_back(completed); + + // ------------------------------------------------------------------------- + // Print number of Accepted / Vetoed Events - for A. V. 
+ + /* + // TRUE means that the event is vetoed + // FALSE means that the event is accepted + + int *d_trueCount, *d_falseCount; + cudaMalloc(&d_trueCount, sizeof(int)); + cudaMalloc(&d_falseCount, sizeof(int)); + cudaMemset(d_trueCount, 0, sizeof(int)); + cudaMemset(d_falseCount, 0, sizeof(int)); + + DEBUG_MSG("Running @countBools"); + countBools<<<(N + 255) / 256, 256>>>(d_events_ptr, d_acceptEmission, + d_trueCount, d_falseCount, N); syncGPUAndCheck("countBools"); + + int h_trueCount(0), h_falseCount(0); // Number of vetoed events + cudaMemcpy(&h_trueCount, d_trueCount, sizeof(int), cudaMemcpyDeviceToHost); + cudaMemcpy(&h_falseCount, d_falseCount, sizeof(int), + cudaMemcpyDeviceToHost); + + std::cout << cycle << ", " << N - completed << ", " << h_trueCount << ", " + << h_falseCount << std::endl; + */ + } + + // --------------------------------------------------------------------------- + // Write completedPerCycle to file + std::ofstream file("gaps-cycles.dat"); + for (auto &i : completedPerCycle) { + file << i << std::endl; + } + + // --------------------------------------------------------------------------- + // Clean Up Device Memory + cudaFree(d_asval); + cudaFree(d_acceptEmission); + cudaFree(d_completed); +} diff --git a/gaps-1.1/rungaps b/gaps-1.1/rungaps new file mode 100755 index 0000000000000000000000000000000000000000..91d36f1d5e9759578f399289812bae004fbf2994 --- /dev/null +++ b/gaps-1.1/rungaps @@ -0,0 +1,104 @@ +#!/usr/bin/env python3 + +# ------------------------------------------------------------------------------ + +# GAPS - Run Script +# ----------------- +# This script is used to compile and run the GAPS and C++ Shower codes. It +# provides a number of options to control the number of events, the number of +# cores to use, and the type of run to perform. 
The run types are: +# - gaps: Run the GAPS simulation +# - cpp: Run the C++ Shower simulation +# - compare: Run both the GAPS and C++ Shower and compare the results +# - full: Run both the GAPS and C++ Shower for a range of event numbers + +# ------------------------------------------------------------------------------ + +import argparse +import os +import shutil +import subprocess + +# ------------------------------------------------------------------------------ + +# Set up argument parser +parser = argparse.ArgumentParser(description='Run GAPS or C++ Shower') +parser.add_argument('-n', '--nevents', type=int, default=10000, + help='set the number of events (default: 10000)') +parser.add_argument('-e', '--energy', type=float, default=91.2, + help='set the CoM energy of the system (default: 91.2)') +parser.add_argument('-c', '--cores', type=int, default=1, + help='set the number of cores (default: 1)') +parser.add_argument('-r', '--runtype', type=str, default='gaps', + help='set the run type (default: gaps, options: gaps, cpp, compare, full)') + +args = parser.parse_args() + +# ------------------------------------------------------------------------------ +# Compile code + + +def compile(dir): + print(f'Compiling {dir}') + os.chdir(dir) + os.makedirs('build', exist_ok=True) + os.chdir('build') + subprocess.run(['cmake', '..']) + subprocess.run(['make', '-j', str(args.cores)]) + os.chdir('../..') + +# ------------------------------------------------------------------------------ +# Run GAPS or C++ Shower + + +def run(runtype, events, energy): + print(f'Running {runtype}') + subprocess.run([f'./{runtype}/bin/{runtype}', str(events), str(energy)]) + + +# ------------------------------------------------------------------------------ +# Compile and run based on runtype + +if args.runtype in ['gaps', 'compare', 'full']: + compile('gaps') + +if args.runtype in ['cpp', 'compare', 'full']: + compile('cpp-shower') + +if args.runtype in ['gaps', 'compare']: + run('gaps', 
args.nevents, args.energy) + +if args.runtype in ['cpp', 'compare']: + run('cpp-shower', args.nevents, args.energy) + +if args.runtype == 'full': + # Remove previous results and make folder + if os.path.exists('results'): + shutil.rmtree('results') + os.makedirs('results', exist_ok=True) + + # Clear previous log files + if os.path.exists('cpp-time.dat'): + os.remove('cpp-time.dat') + if os.path.exists('gaps-time.dat'): + os.remove('gaps-time.dat') + + # Run the comparison 100 times, for different number of events + neventsarray = [1, 2, 5, 10, 20, 50, 100, 200, 500, 1000, 2000, + 5000, 10000, 20000, 50000, 100000, 200000, 500000, 1000000] + for n in neventsarray: + # Run and store the output in a log file + for i in range(1, 101): + print(f"Running GAPS with {n} events") + subprocess.run(['./gaps/bin/gaps', str(n), str(args.energy)]) + print(f"Running C++ Shower with {n} events") + subprocess.run( + ['./cpp-shower/bin/cpp-shower', str(n), str(args.energy)]) + + # Move the log files to the results directory + shutil.move('cpp-time.dat', 'results/') + shutil.move('gaps-time.dat', 'results/') + shutil.move('cpp.yoda', 'results/') + shutil.move('gaps.yoda', 'results/') + +# ------------------------------------------------------------------------------ diff --git a/gaps-1.1/test/SH-Tutorial.yoda b/gaps-1.1/test/SH-Tutorial.yoda new file mode 100644 index 0000000000000000000000000000000000000000..c902c96eed1d216581c4ee1114800415c071b44a --- /dev/null +++ b/gaps-1.1/test/SH-Tutorial.yoda @@ -0,0 +1,464 @@ +# BEGIN YODA_HISTO1D /gaps/log10y23 +Path=/gaps/log10y23 +ScaledBy=1.5213690739441659e-07 +Title= +Type=Histo1D +XLabel= +YLabel= +# Mean: -2.272216e+00 +# Area: 4.069205e+04 +# ID ID sumw sumw2 sumwx sumwx2 numEntries +Total Total 4.069205e+04 1.736849e+04 -9.246111e+04 2.330336e+05 95336 +Underflow Underflow 1.707311e+00 7.287275e-01 -7.543883e+00 3.333822e+01 4 +Overflow Overflow 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +# xlow xhigh sumw sumw2 sumwx 
sumwx2 numEntries +-4.300000e+00 -4.260000e+00 4.268278e-01 1.821819e-01 -1.828374e+00 7.832089e+00 1 +-4.260000e+00 -4.220000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-4.220000e+00 -4.180000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-4.180000e+00 -4.140000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-4.140000e+00 -4.100000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-4.100000e+00 -4.060000e+00 4.268277e-01 1.821819e-01 -1.738981e+00 7.084955e+00 1 +-4.060000e+00 -4.020000e+00 8.536553e-01 3.643638e-01 -3.459724e+00 1.402169e+01 2 +-4.020000e+00 -3.980000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-3.980000e+00 -3.940000e+00 2.134138e+00 9.109094e-01 -8.460362e+00 3.353980e+01 5 +-3.940000e+00 -3.900000e+00 1.280483e+00 5.465456e-01 -5.011295e+00 1.961258e+01 3 +-3.900000e+00 -3.860000e+00 4.268278e+00 1.821819e+00 -1.652896e+01 6.400908e+01 10 +-3.860000e+00 -3.820000e+00 5.292663e+01 2.259055e+01 -2.030135e+02 7.787153e+02 124 +-3.820000e+00 -3.780000e+00 1.289020e+02 5.501893e+01 -4.898154e+02 1.861270e+03 302 +-3.780000e+00 -3.740000e+00 1.506702e+02 6.431020e+01 -5.664653e+02 2.129725e+03 353 +-3.740000e+00 -3.700000e+00 1.784140e+02 7.615203e+01 -6.637134e+02 2.469086e+03 418 +-3.700000e+00 -3.660000e+00 2.095724e+02 8.945131e+01 -7.710762e+02 2.837036e+03 491 +-3.660000e+00 -3.620000e+00 2.364626e+02 1.009288e+02 -8.606023e+02 3.132182e+03 554 +-3.620000e+00 -3.580000e+00 2.731697e+02 1.165964e+02 -9.832301e+02 3.539014e+03 640 +-3.580000e+00 -3.540000e+00 2.543893e+02 1.085804e+02 -9.056297e+02 3.224087e+03 596 +-3.540000e+00 -3.500000e+00 3.149988e+02 1.344502e+02 -1.108818e+03 3.903160e+03 738 +-3.500000e+00 -3.460000e+00 3.316451e+02 1.415553e+02 -1.154187e+03 4.016828e+03 777 +-3.460000e+00 -3.420000e+00 3.747547e+02 1.599557e+02 -1.289185e+03 4.434944e+03 878 +-3.420000e+00 -3.380000e+00 4.029254e+02 1.719797e+02 -1.370001e+03 4.658246e+03 944 +-3.380000e+00 -3.340000e+00 
4.212789e+02 1.798135e+02 -1.415479e+03 4.756005e+03 987 +-3.340000e+00 -3.300000e+00 4.498764e+02 1.920197e+02 -1.493384e+03 4.957412e+03 1054 +-3.300000e+00 -3.260000e+00 4.801812e+02 2.049546e+02 -1.574809e+03 5.164831e+03 1125 +-3.260000e+00 -3.220000e+00 4.806080e+02 2.051368e+02 -1.556770e+03 5.042704e+03 1126 +-3.220000e+00 -3.180000e+00 4.759129e+02 2.031328e+02 -1.523220e+03 4.875321e+03 1115 +-3.180000e+00 -3.140000e+00 5.092055e+02 2.173430e+02 -1.608873e+03 5.083429e+03 1193 +-3.140000e+00 -3.100000e+00 5.160347e+02 2.202579e+02 -1.609875e+03 5.022400e+03 1209 +-3.100000e+00 -3.060000e+00 5.561565e+02 2.373830e+02 -1.712860e+03 5.275367e+03 1303 +-3.060000e+00 -3.020000e+00 5.979856e+02 2.552368e+02 -1.817744e+03 5.525622e+03 1401 +-3.020000e+00 -2.980000e+00 6.009734e+02 2.565121e+02 -1.802986e+03 5.409239e+03 1408 +-2.980000e+00 -2.940000e+00 6.334123e+02 2.703579e+02 -1.874667e+03 5.548409e+03 1484 +-2.940000e+00 -2.900000e+00 6.022539e+02 2.570586e+02 -1.758938e+03 5.137221e+03 1411 +-2.900000e+00 -2.860000e+00 6.372538e+02 2.719976e+02 -1.834839e+03 5.283118e+03 1493 +-2.860000e+00 -2.820000e+00 6.833512e+02 2.916731e+02 -1.940674e+03 5.511484e+03 1601 +-2.820000e+00 -2.780000e+00 6.812170e+02 2.907623e+02 -1.907389e+03 5.340731e+03 1596 +-2.780000e+00 -2.740000e+00 7.221925e+02 3.082518e+02 -1.993194e+03 5.501156e+03 1692 +-2.740000e+00 -2.700000e+00 7.285949e+02 3.109845e+02 -1.981971e+03 5.391586e+03 1707 +-2.700000e+00 -2.660000e+00 7.435339e+02 3.173609e+02 -1.992374e+03 5.338867e+03 1742 +-2.660000e+00 -2.620000e+00 7.405461e+02 3.160856e+02 -1.954813e+03 5.160201e+03 1735 +-2.620000e+00 -2.580000e+00 7.785337e+02 3.322998e+02 -2.024242e+03 5.263277e+03 1824 +-2.580000e+00 -2.540000e+00 7.174974e+02 3.062477e+02 -1.836498e+03 4.700772e+03 1681 +-2.540000e+00 -2.500000e+00 7.332900e+02 3.129885e+02 -1.847727e+03 4.655958e+03 1718 +-2.500000e+00 -2.460000e+00 7.674362e+02 3.275630e+02 -1.903412e+03 4.720981e+03 1798 +-2.460000e+00 -2.420000e+00 
7.776801e+02 3.319355e+02 -1.897378e+03 4.629309e+03 1822 +-2.420000e+00 -2.380000e+00 7.653021e+02 3.266521e+02 -1.836655e+03 4.407904e+03 1793 +-2.380000e+00 -2.340000e+00 7.729850e+02 3.299314e+02 -1.823983e+03 4.304083e+03 1811 +-2.340000e+00 -2.300000e+00 7.793874e+02 3.326641e+02 -1.808450e+03 4.196337e+03 1826 +-2.300000e+00 -2.260000e+00 7.968873e+02 3.401335e+02 -1.817041e+03 4.143273e+03 1867 +-2.260000e+00 -2.220000e+00 7.499363e+02 3.200936e+02 -1.679825e+03 3.762834e+03 1757 +-2.220000e+00 -2.180000e+00 7.542046e+02 3.219154e+02 -1.659304e+03 3.650687e+03 1767 +-2.180000e+00 -2.140000e+00 7.460948e+02 3.184540e+02 -1.611323e+03 3.480033e+03 1748 +-2.140000e+00 -2.100000e+00 7.157900e+02 3.055190e+02 -1.517353e+03 3.216625e+03 1677 +-2.100000e+00 -2.060000e+00 6.991438e+02 2.984140e+02 -1.454305e+03 3.025228e+03 1638 +-2.060000e+00 -2.020000e+00 7.209120e+02 3.077052e+02 -1.470529e+03 2.999704e+03 1689 +-2.020000e+00 -1.980000e+00 7.349973e+02 3.137172e+02 -1.469807e+03 2.939333e+03 1722 +-1.980000e+00 -1.940000e+00 6.667049e+02 2.845681e+02 -1.307091e+03 2.562668e+03 1562 +-1.940000e+00 -1.900000e+00 6.726805e+02 2.871187e+02 -1.291873e+03 2.481111e+03 1576 +-1.900000e+00 -1.860000e+00 6.688390e+02 2.854790e+02 -1.257647e+03 2.364900e+03 1567 +-1.860000e+00 -1.820000e+00 6.594488e+02 2.814709e+02 -1.213600e+03 2.233502e+03 1545 +-1.820000e+00 -1.780000e+00 6.432293e+02 2.745481e+02 -1.157664e+03 2.083610e+03 1507 +-1.780000e+00 -1.740000e+00 6.445099e+02 2.750947e+02 -1.134317e+03 1.996445e+03 1510 +-1.740000e+00 -1.700000e+00 5.996929e+02 2.559656e+02 -1.031302e+03 1.773631e+03 1405 +-1.700000e+00 -1.660000e+00 5.928637e+02 2.530507e+02 -9.960368e+02 1.673462e+03 1389 +-1.660000e+00 -1.620000e+00 5.868881e+02 2.505001e+02 -9.625968e+02 1.578902e+03 1375 +-1.620000e+00 -1.580000e+00 5.634126e+02 2.404801e+02 -9.013291e+02 1.441996e+03 1320 +-1.580000e+00 -1.540000e+00 5.407907e+02 2.308244e+02 -8.436723e+02 1.316262e+03 1267 +-1.540000e+00 
-1.500000e+00 4.878640e+02 2.082339e+02 -7.417458e+02 1.127812e+03 1143 +-1.500000e+00 -1.460000e+00 5.322541e+02 2.271808e+02 -7.879504e+02 1.166552e+03 1247 +-1.460000e+00 -1.420000e+00 4.818885e+02 2.056833e+02 -6.941000e+02 9.998282e+02 1129 +-1.420000e+00 -1.380000e+00 4.912787e+02 2.096914e+02 -6.876112e+02 9.624702e+02 1151 +-1.380000e+00 -1.340000e+00 4.473154e+02 1.909266e+02 -6.084743e+02 8.277518e+02 1048 +-1.340000e+00 -1.300000e+00 4.328033e+02 1.847324e+02 -5.714894e+02 7.546728e+02 1014 +-1.300000e+00 -1.260000e+00 4.379252e+02 1.869186e+02 -5.604155e+02 7.172243e+02 1026 +-1.260000e+00 -1.220000e+00 4.033522e+02 1.721619e+02 -5.002264e+02 6.204217e+02 945 +-1.220000e+00 -1.180000e+00 3.888400e+02 1.659677e+02 -4.666680e+02 5.601240e+02 911 +-1.180000e+00 -1.140000e+00 3.713401e+02 1.584982e+02 -4.309385e+02 5.001507e+02 870 +-1.140000e+00 -1.100000e+00 3.491451e+02 1.490248e+02 -3.911219e+02 4.381895e+02 818 +-1.100000e+00 -1.060000e+00 3.132916e+02 1.337215e+02 -3.382910e+02 3.653281e+02 734 +-1.060000e+00 -1.020000e+00 3.098769e+02 1.322640e+02 -3.223734e+02 3.354175e+02 726 +-1.020000e+00 -9.800000e-01 2.919502e+02 1.246124e+02 -2.920970e+02 2.922856e+02 684 +-9.800000e-01 -9.400000e-01 2.770111e+02 1.182360e+02 -2.661491e+02 2.557498e+02 649 +-9.400000e-01 -9.000000e-01 2.590844e+02 1.105844e+02 -2.382954e+02 2.192093e+02 607 +-9.000000e-01 -8.600000e-01 2.364626e+02 1.009288e+02 -2.081998e+02 1.833470e+02 554 +-8.600000e-01 -8.200000e-01 2.134139e+02 9.109094e+01 -1.793725e+02 1.507885e+02 500 +-8.200000e-01 -7.800000e-01 1.831091e+02 7.815603e+01 -1.465709e+02 1.173476e+02 429 +-7.800000e-01 -7.400000e-01 1.664628e+02 7.105093e+01 -1.267104e+02 9.647328e+01 390 +-7.400000e-01 -7.000000e-01 1.557921e+02 6.649639e+01 -1.124971e+02 8.125291e+01 365 +-7.000000e-01 -6.600000e-01 1.374385e+02 5.866257e+01 -9.357336e+01 6.372601e+01 322 +-6.600000e-01 -6.200000e-01 1.109752e+02 4.736729e+01 -7.110283e+01 4.556981e+01 260 +-6.200000e-01 -5.800000e-01 
8.493871e+01 3.625420e+01 -5.099670e+01 3.063068e+01 199 +-5.800000e-01 -5.400000e-01 6.530464e+01 2.787383e+01 -3.664402e+01 2.057112e+01 153 +-5.400000e-01 -5.000000e-01 3.073159e+01 1.311710e+01 -1.609667e+01 8.434279e+00 72 +-5.000000e-01 -4.600000e-01 2.560966e+00 1.093091e+00 -1.267354e+00 6.271989e-01 6 +-4.600000e-01 -4.200000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-4.200000e-01 -3.800000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-3.800000e-01 -3.400000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-3.400000e-01 -3.000000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +# END YODA_HISTO1D + +# BEGIN YODA_HISTO1D /gaps/log10y34 +Path=/gaps/log10y34 +ScaledBy=1.5213690739441659e-07 +Title= +Type=Histo1D +XLabel= +YLabel= +# Mean: -2.955824e+00 +# Area: 3.505536e+04 +# ID ID sumw sumw2 sumwx sumwx2 numEntries +Total Total 3.505536e+04 1.496260e+04 -1.036175e+05 3.185104e+05 82130 +Underflow Underflow 4.016449e+02 1.714332e+02 -1.905189e+03 9.121964e+03 941 +Overflow Overflow 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +# xlow xhigh sumw sumw2 sumwx sumwx2 numEntries +-4.300000e+00 -4.260000e+00 4.652422e+01 1.985783e+01 -1.991134e+02 8.521675e+02 109 +-4.260000e+00 -4.220000e+00 4.823153e+01 2.058655e+01 -2.044932e+02 8.670214e+02 113 +-4.220000e+00 -4.180000e+00 5.762174e+01 2.459455e+01 -2.421109e+02 1.017291e+03 135 +-4.180000e+00 -4.140000e+00 5.463395e+01 2.331928e+01 -2.271905e+02 9.447592e+02 128 +-4.140000e+00 -4.100000e+00 6.402415e+01 2.732729e+01 -2.637788e+02 1.086776e+03 150 +-4.100000e+00 -4.060000e+00 6.914609e+01 2.951346e+01 -2.822597e+02 1.152216e+03 162 +-4.060000e+00 -4.020000e+00 9.560940e+01 4.080874e+01 -3.862197e+02 1.560169e+03 224 +-4.020000e+00 -3.980000e+00 9.603623e+01 4.099093e+01 -3.841338e+02 1.536505e+03 225 +-3.980000e+00 -3.940000e+00 1.062801e+02 4.536329e+01 -4.208226e+02 1.666287e+03 249 +-3.940000e+00 -3.900000e+00 1.404263e+02 5.993784e+01 
-5.504107e+02 2.157391e+03 329 +-3.900000e+00 -3.860000e+00 1.924993e+02 8.216403e+01 -7.466274e+02 2.895895e+03 451 +-3.860000e+00 -3.820000e+00 2.441454e+02 1.042080e+02 -9.372402e+02 3.597966e+03 572 +-3.820000e+00 -3.780000e+00 3.239622e+02 1.382760e+02 -1.230712e+03 4.675443e+03 759 +-3.780000e+00 -3.740000e+00 4.234131e+02 1.807244e+02 -1.591719e+03 5.983735e+03 992 +-3.740000e+00 -3.700000e+00 4.648154e+02 1.983961e+02 -1.728976e+03 6.431341e+03 1089 +-3.700000e+00 -3.660000e+00 5.634126e+02 2.404801e+02 -2.073259e+03 7.629303e+03 1320 +-3.660000e+00 -3.620000e+00 5.676808e+02 2.423019e+02 -2.066057e+03 7.519429e+03 1330 +-3.620000e+00 -3.580000e+00 6.214611e+02 2.652568e+02 -2.237056e+03 8.052756e+03 1456 +-3.580000e+00 -3.540000e+00 6.974364e+02 2.976852e+02 -2.482464e+03 8.836198e+03 1634 +-3.540000e+00 -3.500000e+00 7.123754e+02 3.040616e+02 -2.507229e+03 8.824372e+03 1669 +-3.500000e+00 -3.460000e+00 7.559119e+02 3.226441e+02 -2.630332e+03 9.152816e+03 1771 +-3.460000e+00 -3.420000e+00 7.887776e+02 3.366721e+02 -2.713231e+03 9.333055e+03 1848 +-3.420000e+00 -3.380000e+00 8.890821e+02 3.794849e+02 -3.023193e+03 1.028005e+04 2083 +-3.380000e+00 -3.340000e+00 8.711553e+02 3.718332e+02 -2.927181e+03 9.835779e+03 2041 +-3.340000e+00 -3.300000e+00 9.321916e+02 3.978852e+02 -3.094975e+03 1.027577e+04 2184 +-3.300000e+00 -3.260000e+00 9.360331e+02 3.995248e+02 -3.069929e+03 1.006864e+04 2193 +-3.260000e+00 -3.220000e+00 9.518259e+02 4.062656e+02 -3.084172e+03 9.993675e+03 2230 +-3.220000e+00 -3.180000e+00 9.360331e+02 3.995248e+02 -2.995515e+03 9.586438e+03 2193 +-3.180000e+00 -3.140000e+00 9.731672e+02 4.153746e+02 -3.075145e+03 9.717387e+03 2280 +-3.140000e+00 -3.100000e+00 9.206675e+02 3.929663e+02 -2.871914e+03 8.958721e+03 2157 +-3.100000e+00 -3.060000e+00 9.496916e+02 4.053547e+02 -2.925011e+03 9.009042e+03 2225 +-3.060000e+00 -3.020000e+00 9.057284e+02 3.865900e+02 -2.753249e+03 8.369496e+03 2122 +-3.020000e+00 -2.980000e+00 9.492649e+02 4.051725e+02 
-2.847736e+03 8.543159e+03 2224 +-2.980000e+00 -2.940000e+00 8.967650e+02 3.827642e+02 -2.654302e+03 7.856494e+03 2101 +-2.940000e+00 -2.900000e+00 9.091430e+02 3.880474e+02 -2.654565e+03 7.751067e+03 2130 +-2.900000e+00 -2.860000e+00 9.014601e+02 3.847682e+02 -2.596162e+03 7.476942e+03 2112 +-2.860000e+00 -2.820000e+00 8.643261e+02 3.689183e+02 -2.454806e+03 6.972103e+03 2025 +-2.820000e+00 -2.780000e+00 8.485335e+02 3.621776e+02 -2.376002e+03 6.653223e+03 1988 +-2.780000e+00 -2.740000e+00 8.203628e+02 3.501536e+02 -2.264000e+03 6.248194e+03 1922 +-2.740000e+00 -2.700000e+00 8.020092e+02 3.423198e+02 -2.181819e+03 5.935618e+03 1879 +-2.700000e+00 -2.660000e+00 7.900581e+02 3.372186e+02 -2.117265e+03 5.674137e+03 1851 +-2.660000e+00 -2.620000e+00 7.354241e+02 3.138993e+02 -1.941803e+03 5.127208e+03 1723 +-2.620000e+00 -2.580000e+00 7.303022e+02 3.117133e+02 -1.898754e+03 4.936772e+03 1711 +-2.580000e+00 -2.540000e+00 6.982901e+02 2.980495e+02 -1.787753e+03 4.577070e+03 1636 +-2.540000e+00 -2.500000e+00 6.628634e+02 2.829284e+02 -1.670347e+03 4.209189e+03 1553 +-2.500000e+00 -2.460000e+00 6.419489e+02 2.740016e+02 -1.592108e+03 3.948702e+03 1504 +-2.460000e+00 -2.420000e+00 5.745101e+02 2.452168e+02 -1.401716e+03 3.420050e+03 1346 +-2.420000e+00 -2.380000e+00 5.412175e+02 2.310066e+02 -1.298770e+03 3.116757e+03 1268 +-2.380000e+00 -2.340000e+00 5.151810e+02 2.198935e+02 -1.215839e+03 2.869475e+03 1207 +-2.340000e+00 -2.300000e+00 5.015226e+02 2.140637e+02 -1.163692e+03 2.700201e+03 1175 +-2.300000e+00 -2.260000e+00 4.639617e+02 1.980317e+02 -1.057842e+03 2.411964e+03 1087 +-2.260000e+00 -2.220000e+00 4.293887e+02 1.832750e+02 -9.616848e+02 2.153903e+03 1006 +-2.220000e+00 -2.180000e+00 4.003644e+02 1.708866e+02 -8.807010e+02 1.937372e+03 938 +-2.180000e+00 -2.140000e+00 3.867059e+02 1.650568e+02 -8.349481e+02 1.802812e+03 906 +-2.140000e+00 -2.100000e+00 3.243891e+02 1.384582e+02 -6.878354e+02 1.458531e+03 760 +-2.100000e+00 -2.060000e+00 3.239622e+02 1.382760e+02 
-6.738023e+02 1.401471e+03 759 +-2.060000e+00 -2.020000e+00 2.834136e+02 1.209688e+02 -5.784531e+02 1.180672e+03 664 +-2.020000e+00 -1.980000e+00 2.676210e+02 1.142280e+02 -5.352522e+02 1.070559e+03 627 +-1.980000e+00 -1.940000e+00 2.569503e+02 1.096735e+02 -5.035793e+02 9.869650e+02 602 +-1.940000e+00 -1.900000e+00 2.189626e+02 9.345931e+01 -4.202817e+02 8.067282e+02 513 +-1.900000e+00 -1.860000e+00 2.146943e+02 9.163749e+01 -4.038162e+02 7.595623e+02 503 +-1.860000e+00 -1.820000e+00 1.762798e+02 7.524111e+01 -3.243858e+02 5.969508e+02 413 +-1.820000e+00 -1.780000e+00 1.562189e+02 6.667857e+01 -2.813286e+02 5.066551e+02 366 +-1.780000e+00 -1.740000e+00 1.417068e+02 6.048439e+01 -2.493923e+02 4.389292e+02 332 +-1.740000e+00 -1.700000e+00 1.297556e+02 5.538329e+01 -2.231547e+02 3.837999e+02 304 +-1.700000e+00 -1.660000e+00 1.084142e+02 4.627420e+01 -1.821012e+02 3.058846e+02 254 +-1.660000e+00 -1.620000e+00 8.920698e+01 3.807601e+01 -1.463894e+02 2.402381e+02 209 +-1.620000e+00 -1.580000e+00 8.579237e+01 3.661856e+01 -1.373283e+02 2.198341e+02 201 +-1.580000e+00 -1.540000e+00 8.109726e+01 3.461455e+01 -1.264421e+02 1.971519e+02 190 +-1.540000e+00 -1.500000e+00 6.317050e+01 2.696292e+01 -9.610781e+01 1.462264e+02 148 +-1.500000e+00 -1.460000e+00 5.506077e+01 2.350146e+01 -8.156273e+01 1.208272e+02 129 +-1.460000e+00 -1.420000e+00 4.780470e+01 2.040437e+01 -6.893583e+01 9.941371e+01 112 +-1.420000e+00 -1.380000e+00 3.201208e+01 1.366364e+01 -4.476444e+01 6.260089e+01 75 +-1.380000e+00 -1.340000e+00 3.457304e+01 1.475673e+01 -4.704116e+01 6.401019e+01 81 +-1.340000e+00 -1.300000e+00 2.432918e+01 1.038437e+01 -3.222315e+01 4.268140e+01 57 +-1.300000e+00 -1.260000e+00 2.304870e+01 9.837822e+00 -2.952935e+01 3.783521e+01 54 +-1.260000e+00 -1.220000e+00 1.323166e+01 5.647639e+00 -1.642233e+01 2.038424e+01 31 +-1.220000e+00 -1.180000e+00 1.493897e+01 6.376366e+00 -1.791882e+01 2.149532e+01 35 +-1.180000e+00 -1.140000e+00 8.536554e+00 3.643638e+00 -9.908212e+00 1.150167e+01 
20 +-1.140000e+00 -1.100000e+00 3.841449e+00 1.639637e+00 -4.301635e+00 4.817251e+00 9 +-1.100000e+00 -1.060000e+00 5.548760e+00 2.368364e+00 -6.017099e+00 6.525701e+00 13 +-1.060000e+00 -1.020000e+00 3.414621e+00 1.457455e+00 -3.558022e+00 3.708047e+00 8 +-1.020000e+00 -9.800000e-01 1.707311e+00 7.287275e-01 -1.712616e+00 1.718031e+00 4 +-9.800000e-01 -9.400000e-01 4.268276e-01 1.821819e-01 -4.145039e-01 4.025361e-01 1 +-9.400000e-01 -9.000000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-9.000000e-01 -8.600000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-8.600000e-01 -8.200000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-8.200000e-01 -7.800000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-7.800000e-01 -7.400000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-7.400000e-01 -7.000000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-7.000000e-01 -6.600000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-6.600000e-01 -6.200000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-6.200000e-01 -5.800000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-5.800000e-01 -5.400000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-5.400000e-01 -5.000000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-5.000000e-01 -4.600000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-4.600000e-01 -4.200000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-4.200000e-01 -3.800000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-3.800000e-01 -3.400000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-3.400000e-01 -3.000000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +# END YODA_HISTO1D + +# BEGIN YODA_HISTO1D /gaps/log10y45 +Path=/gaps/log10y45 +ScaledBy=1.5213690739441662e-07 +Title= +Type=Histo1D +XLabel= +YLabel= +# Mean: -3.322106e+00 +# Area: 2.626698e+04 +# ID ID sumw sumw2 sumwx sumwx2 numEntries 
+Total Total 2.626698e+04 1.121147e+04 -8.726170e+04 2.973293e+05 61540 +Underflow Underflow 1.017557e+03 4.343216e+02 -4.821037e+03 2.303868e+04 2384 +Overflow Overflow 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +# xlow xhigh sumw sumw2 sumwx sumwx2 numEntries +-4.300000e+00 -4.260000e+00 9.091430e+01 3.880474e+01 -3.890031e+02 1.664474e+03 213 +-4.260000e+00 -4.220000e+00 1.118289e+02 4.773166e+01 -4.741670e+02 2.010537e+03 262 +-4.220000e+00 -4.180000e+00 1.297556e+02 5.538329e+01 -5.449119e+02 2.288390e+03 304 +-4.180000e+00 -4.140000e+00 1.280483e+02 5.465457e+01 -5.327211e+02 2.216304e+03 300 +-4.140000e+00 -4.100000e+00 1.600604e+02 6.831820e+01 -6.594999e+02 2.717372e+03 375 +-4.100000e+00 -4.060000e+00 1.707311e+02 7.287275e+01 -6.964693e+02 2.841156e+03 400 +-4.060000e+00 -4.020000e+00 1.852432e+02 7.906694e+01 -7.482914e+02 3.022751e+03 434 +-4.020000e+00 -3.980000e+00 2.257918e+02 9.637422e+01 -9.030245e+02 3.611558e+03 529 +-3.980000e+00 -3.940000e+00 2.403040e+02 1.025684e+02 -9.516179e+02 3.768492e+03 563 +-3.940000e+00 -3.900000e+00 2.928038e+02 1.249768e+02 -1.147685e+03 4.498554e+03 686 +-3.900000e+00 -3.860000e+00 3.461572e+02 1.477495e+02 -1.342846e+03 5.209344e+03 811 +-3.860000e+00 -3.820000e+00 3.931083e+02 1.677895e+02 -1.509379e+03 5.795469e+03 921 +-3.820000e+00 -3.780000e+00 5.032299e+02 2.147924e+02 -1.912095e+03 7.265347e+03 1179 +-3.780000e+00 -3.740000e+00 5.360956e+02 2.288205e+02 -2.015507e+03 7.577580e+03 1256 +-3.740000e+00 -3.700000e+00 6.120709e+02 2.612489e+02 -2.276614e+03 8.468007e+03 1434 +-3.700000e+00 -3.660000e+00 7.123754e+02 3.040615e+02 -2.621205e+03 9.644887e+03 1669 +-3.660000e+00 -3.620000e+00 7.234730e+02 3.087983e+02 -2.633214e+03 9.584170e+03 1695 +-3.620000e+00 -3.580000e+00 7.542045e+02 3.219154e+02 -2.715003e+03 9.773635e+03 1767 +-3.580000e+00 -3.540000e+00 7.977410e+02 3.404980e+02 -2.840076e+03 1.011120e+04 1869 +-3.540000e+00 -3.500000e+00 9.018869e+02 3.849503e+02 -3.174553e+03 1.117423e+04 
2113 +-3.500000e+00 -3.460000e+00 8.809724e+02 3.760234e+02 -3.065353e+03 1.066605e+04 2064 +-3.460000e+00 -3.420000e+00 8.946309e+02 3.818532e+02 -3.077435e+03 1.058617e+04 2096 +-3.420000e+00 -3.380000e+00 9.232283e+02 3.940594e+02 -3.139006e+03 1.067285e+04 2163 +-3.380000e+00 -3.340000e+00 9.326185e+02 3.980675e+02 -3.133574e+03 1.052886e+04 2185 +-3.340000e+00 -3.300000e+00 9.193869e+02 3.924198e+02 -3.052583e+03 1.013542e+04 2154 +-3.300000e+00 -3.260000e+00 8.942040e+02 3.816711e+02 -2.933012e+03 9.620475e+03 2095 +-3.260000e+00 -3.220000e+00 8.903626e+02 3.800314e+02 -2.884871e+03 9.347409e+03 2086 +-3.220000e+00 -3.180000e+00 8.553627e+02 3.650925e+02 -2.737458e+03 8.760934e+03 2004 +-3.180000e+00 -3.140000e+00 8.058507e+02 3.439595e+02 -2.546375e+03 8.046292e+03 1888 +-3.140000e+00 -3.100000e+00 7.588996e+02 3.239194e+02 -2.367674e+03 7.386951e+03 1778 +-3.100000e+00 -3.060000e+00 7.017047e+02 2.995070e+02 -2.161320e+03 6.657179e+03 1644 +-3.060000e+00 -3.020000e+00 7.217656e+02 3.080695e+02 -2.194015e+03 6.669443e+03 1691 +-3.020000e+00 -2.980000e+00 6.765219e+02 2.887583e+02 -2.029554e+03 6.088711e+03 1585 +-2.980000e+00 -2.940000e+00 6.090831e+02 2.599736e+02 -1.803068e+03 5.337699e+03 1427 +-2.940000e+00 -2.900000e+00 5.659735e+02 2.415732e+02 -1.653009e+03 4.827930e+03 1326 +-2.900000e+00 -2.860000e+00 5.454858e+02 2.328285e+02 -1.570972e+03 4.524391e+03 1278 +-2.860000e+00 -2.820000e+00 4.934128e+02 2.106023e+02 -1.401553e+03 3.981216e+03 1156 +-2.820000e+00 -2.780000e+00 4.767666e+02 2.034972e+02 -1.334846e+03 3.737352e+03 1117 +-2.780000e+00 -2.740000e+00 4.174375e+02 1.781739e+02 -1.152484e+03 3.181891e+03 978 +-2.740000e+00 -2.700000e+00 3.849986e+02 1.643281e+02 -1.047100e+03 2.847902e+03 902 +-2.700000e+00 -2.660000e+00 3.546938e+02 1.513931e+02 -9.507592e+02 2.548564e+03 831 +-2.660000e+00 -2.620000e+00 3.158525e+02 1.348146e+02 -8.338809e+02 2.201569e+03 740 +-2.620000e+00 -2.580000e+00 2.979258e+02 1.271629e+02 -7.747753e+02 2.014896e+03 
698 +-2.580000e+00 -2.540000e+00 2.770112e+02 1.182360e+02 -7.090866e+02 1.815140e+03 649 +-2.540000e+00 -2.500000e+00 2.159748e+02 9.218403e+01 -5.442547e+02 1.371548e+03 506 +-2.500000e+00 -2.460000e+00 2.146943e+02 9.163749e+01 -5.326361e+02 1.321446e+03 503 +-2.460000e+00 -2.420000e+00 1.809750e+02 7.724512e+01 -4.417733e+02 1.078425e+03 424 +-2.420000e+00 -2.380000e+00 1.681701e+02 7.177966e+01 -4.036536e+02 9.688995e+02 394 +-2.380000e+00 -2.340000e+00 1.408531e+02 6.012002e+01 -3.325143e+02 7.849900e+02 330 +-2.340000e+00 -2.300000e+00 1.126825e+02 4.809601e+01 -2.616231e+02 6.074420e+02 264 +-2.300000e+00 -2.260000e+00 1.015850e+02 4.335929e+01 -2.316704e+02 5.283507e+02 238 +-2.260000e+00 -2.220000e+00 9.134114e+01 3.898692e+01 -2.046705e+02 4.586235e+02 214 +-2.220000e+00 -2.180000e+00 7.640216e+01 3.261056e+01 -1.681483e+02 3.700756e+02 179 +-2.180000e+00 -2.140000e+00 5.975588e+01 2.550547e+01 -1.290406e+02 2.786672e+02 140 +-2.140000e+00 -2.100000e+00 5.676808e+01 2.423020e+01 -1.204278e+02 2.554824e+02 133 +-2.100000e+00 -2.060000e+00 4.097546e+01 1.748946e+01 -8.523877e+01 1.773233e+02 96 +-2.060000e+00 -2.020000e+00 4.225594e+01 1.803601e+01 -8.626589e+01 1.761185e+02 99 +-2.020000e+00 -1.980000e+00 3.073159e+01 1.311710e+01 -6.153447e+01 1.232162e+02 72 +-1.980000e+00 -1.940000e+00 2.262187e+01 9.655639e+00 -4.440178e+01 8.715383e+01 53 +-1.940000e+00 -1.900000e+00 1.664628e+01 7.105093e+00 -3.193404e+01 6.126450e+01 39 +-1.900000e+00 -1.860000e+00 1.963408e+01 8.380367e+00 -3.690659e+01 6.937698e+01 46 +-1.860000e+00 -1.820000e+00 1.152435e+01 4.918912e+00 -2.122259e+01 3.908336e+01 27 +-1.820000e+00 -1.780000e+00 9.817037e+00 4.190183e+00 -1.767355e+01 3.181910e+01 23 +-1.780000e+00 -1.740000e+00 5.975587e+00 2.550547e+00 -1.048800e+01 1.840849e+01 14 +-1.740000e+00 -1.700000e+00 5.548760e+00 2.368364e+00 -9.545105e+00 1.642027e+01 13 +-1.700000e+00 -1.660000e+00 2.560966e+00 1.093091e+00 -4.314286e+00 7.268198e+00 6 +-1.660000e+00 -1.620000e+00 
2.987794e+00 1.275273e+00 -4.889778e+00 8.002809e+00 7 +-1.620000e+00 -1.580000e+00 5.975588e+00 2.550546e+00 -9.577661e+00 1.535178e+01 14 +-1.580000e+00 -1.540000e+00 2.134138e+00 9.109094e-01 -3.338561e+00 5.222807e+00 5 +-1.540000e+00 -1.500000e+00 4.268276e-01 1.821819e-01 -6.485781e-01 9.855350e-01 1 +-1.500000e+00 -1.460000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.460000e+00 -1.420000e+00 4.268278e-01 1.821819e-01 -6.229367e-01 9.091490e-01 1 +-1.420000e+00 -1.380000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.380000e+00 -1.340000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.340000e+00 -1.300000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.300000e+00 -1.260000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.260000e+00 -1.220000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.220000e+00 -1.180000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.180000e+00 -1.140000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.140000e+00 -1.100000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.100000e+00 -1.060000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.060000e+00 -1.020000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.020000e+00 -9.800000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-9.800000e-01 -9.400000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-9.400000e-01 -9.000000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-9.000000e-01 -8.600000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-8.600000e-01 -8.200000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-8.200000e-01 -7.800000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-7.800000e-01 -7.400000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-7.400000e-01 -7.000000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-7.000000e-01 -6.600000e-01 
0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-6.600000e-01 -6.200000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-6.200000e-01 -5.800000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-5.800000e-01 -5.400000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-5.400000e-01 -5.000000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-5.000000e-01 -4.600000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-4.600000e-01 -4.200000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-4.200000e-01 -3.800000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-3.800000e-01 -3.400000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-3.400000e-01 -3.000000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +# END YODA_HISTO1D + +# BEGIN YODA_HISTO1D /gaps/log10y56 +Path=/gaps/log10y56 +ScaledBy=1.5213690739441659e-07 +Title= +Type=Histo1D +XLabel= +YLabel= +# Mean: -3.582652e+00 +# Area: 1.714780e+04 +# ID ID sumw sumw2 sumwx sumwx2 numEntries +Total Total 1.714780e+04 7.319158e+03 -6.143460e+04 2.250285e+05 40175 +Underflow Underflow 1.466153e+03 6.257949e+02 -6.990336e+03 3.364129e+04 3435 +Overflow Overflow 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +# xlow xhigh sumw sumw2 sumwx sumwx2 numEntries +-4.300000e+00 -4.260000e+00 1.459751e+02 6.230620e+01 -6.248300e+02 2.674536e+03 342 +-4.260000e+00 -4.220000e+00 1.438409e+02 6.139530e+01 -6.098786e+02 2.585877e+03 337 +-4.220000e+00 -4.180000e+00 1.609140e+02 6.868257e+01 -6.756488e+02 2.836947e+03 377 +-4.180000e+00 -4.140000e+00 1.788408e+02 7.633421e+01 -7.438346e+02 3.093780e+03 419 +-4.140000e+00 -4.100000e+00 1.895115e+02 8.088876e+01 -7.807761e+02 3.216777e+03 444 +-4.100000e+00 -4.060000e+00 2.142675e+02 9.145530e+01 -8.741523e+02 3.566328e+03 502 +-4.060000e+00 -4.020000e+00 2.385967e+02 1.018397e+02 -9.638848e+02 3.893942e+03 559 +-4.020000e+00 -3.980000e+00 2.710356e+02 1.156855e+02 -1.084007e+03 4.335519e+03 
635 +-3.980000e+00 -3.940000e+00 2.902428e+02 1.238837e+02 -1.149071e+03 4.549207e+03 680 +-3.940000e+00 -3.900000e+00 3.423158e+02 1.461099e+02 -1.341684e+03 5.258685e+03 802 +-3.900000e+00 -3.860000e+00 3.692060e+02 1.575873e+02 -1.431958e+03 5.553871e+03 865 +-3.860000e+00 -3.820000e+00 4.157302e+02 1.774452e+02 -1.595929e+03 6.126602e+03 974 +-3.820000e+00 -3.780000e+00 4.532910e+02 1.934772e+02 -1.722544e+03 6.545867e+03 1062 +-3.780000e+00 -3.740000e+00 5.207298e+02 2.222619e+02 -1.957939e+03 7.361898e+03 1220 +-3.740000e+00 -3.700000e+00 5.279859e+02 2.253590e+02 -1.963836e+03 7.304531e+03 1237 +-3.700000e+00 -3.660000e+00 5.809125e+02 2.479496e+02 -2.137552e+03 7.865511e+03 1361 +-3.660000e+00 -3.620000e+00 6.107904e+02 2.607023e+02 -2.223118e+03 8.091654e+03 1431 +-3.620000e+00 -3.580000e+00 6.419489e+02 2.740015e+02 -2.310580e+03 8.316597e+03 1504 +-3.580000e+00 -3.540000e+00 6.782292e+02 2.894870e+02 -2.414566e+03 8.596200e+03 1589 +-3.540000e+00 -3.500000e+00 6.598756e+02 2.816532e+02 -2.323104e+03 8.178621e+03 1546 +-3.500000e+00 -3.460000e+00 6.526196e+02 2.785561e+02 -2.270888e+03 7.901986e+03 1529 +-3.460000e+00 -3.420000e+00 6.419489e+02 2.740015e+02 -2.208415e+03 7.597416e+03 1504 +-3.420000e+00 -3.380000e+00 6.223148e+02 2.656211e+02 -2.116107e+03 7.195649e+03 1458 +-3.380000e+00 -3.340000e+00 5.941441e+02 2.535972e+02 -1.996645e+03 6.709884e+03 1392 +-3.340000e+00 -3.300000e+00 6.001198e+02 2.561478e+02 -1.992515e+03 6.615623e+03 1406 +-3.300000e+00 -3.260000e+00 5.241444e+02 2.237194e+02 -1.719456e+03 5.640746e+03 1228 +-3.260000e+00 -3.220000e+00 5.079250e+02 2.167964e+02 -1.645891e+03 5.333448e+03 1190 +-3.220000e+00 -3.180000e+00 4.652422e+02 1.985783e+02 -1.489004e+03 4.765612e+03 1090 +-3.180000e+00 -3.140000e+00 4.199985e+02 1.792670e+02 -1.327201e+03 4.194030e+03 984 +-3.140000e+00 -3.100000e+00 3.807303e+02 1.625062e+02 -1.188115e+03 3.707709e+03 892 +-3.100000e+00 -3.060000e+00 3.576816e+02 1.526684e+02 -1.101774e+03 3.393865e+03 838 
+-3.060000e+00 -3.020000e+00 2.979258e+02 1.271630e+02 -9.060402e+02 2.755455e+03 698 +-3.020000e+00 -2.980000e+00 2.898160e+02 1.237015e+02 -8.696831e+02 2.609794e+03 679 +-2.980000e+00 -2.940000e+00 2.300601e+02 9.819604e+01 -6.810366e+02 2.016073e+03 539 +-2.940000e+00 -2.900000e+00 2.189626e+02 9.345930e+01 -6.394456e+02 1.867432e+03 513 +-2.900000e+00 -2.860000e+00 1.916456e+02 8.179966e+01 -5.520373e+02 1.590174e+03 449 +-2.860000e+00 -2.820000e+00 1.843896e+02 7.870257e+01 -5.235936e+02 1.486824e+03 432 +-2.820000e+00 -2.780000e+00 1.519507e+02 6.485676e+01 -4.254782e+02 1.191404e+03 356 +-2.780000e+00 -2.740000e+00 1.344507e+02 5.738730e+01 -3.711245e+02 1.024435e+03 315 +-2.740000e+00 -2.700000e+00 9.945085e+01 4.244838e+01 -2.705964e+02 7.362808e+02 233 +-2.700000e+00 -2.660000e+00 8.749969e+01 3.734729e+01 -2.346092e+02 6.290589e+02 205 +-2.660000e+00 -2.620000e+00 7.810947e+01 3.333929e+01 -2.063096e+02 5.449327e+02 183 +-2.620000e+00 -2.580000e+00 6.317050e+01 2.696292e+01 -1.641781e+02 4.267018e+02 148 +-2.580000e+00 -2.540000e+00 5.463395e+01 2.331928e+01 -1.398750e+02 3.581185e+02 128 +-2.540000e+00 -2.500000e+00 4.182911e+01 1.785382e+01 -1.054938e+02 2.660636e+02 98 +-2.500000e+00 -2.460000e+00 3.201207e+01 1.366364e+01 -7.941134e+01 1.969976e+02 75 +-2.460000e+00 -2.420000e+00 3.243890e+01 1.384582e+01 -7.923107e+01 1.935237e+02 76 +-2.420000e+00 -2.380000e+00 1.878042e+01 8.016003e+00 -4.505429e+01 1.080879e+02 44 +-2.380000e+00 -2.340000e+00 1.963407e+01 8.380367e+00 -4.635020e+01 1.094217e+02 46 +-2.340000e+00 -2.300000e+00 1.621945e+01 6.922911e+00 -3.761231e+01 8.722329e+01 38 +-2.300000e+00 -2.260000e+00 8.536554e+00 3.643638e+00 -1.947298e+01 4.442134e+01 20 +-2.260000e+00 -2.220000e+00 1.195117e+01 5.101093e+00 -2.675066e+01 5.987843e+01 28 +-2.220000e+00 -2.180000e+00 4.268277e+00 1.821819e+00 -9.384951e+00 2.063610e+01 10 +-2.180000e+00 -2.140000e+00 4.268277e+00 1.821819e+00 -9.226594e+00 1.994531e+01 10 +-2.140000e+00 -2.100000e+00 
2.134138e+00 9.109094e-01 -4.530094e+00 9.616247e+00 5 +-2.100000e+00 -2.060000e+00 1.280483e+00 5.465456e-01 -2.671949e+00 5.575768e+00 3 +-2.060000e+00 -2.020000e+00 2.134138e+00 9.109094e-01 -4.345335e+00 8.847881e+00 5 +-2.020000e+00 -1.980000e+00 8.536554e-01 3.643638e-01 -1.705871e+00 3.408881e+00 2 +-1.980000e+00 -1.940000e+00 4.268276e-01 1.821819e-01 -8.388888e-01 1.648756e+00 1 +-1.940000e+00 -1.900000e+00 1.707311e+00 7.287275e-01 -3.273572e+00 6.277061e+00 4 +-1.900000e+00 -1.860000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.860000e+00 -1.820000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.820000e+00 -1.780000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.780000e+00 -1.740000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.740000e+00 -1.700000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.700000e+00 -1.660000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.660000e+00 -1.620000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.620000e+00 -1.580000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.580000e+00 -1.540000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.540000e+00 -1.500000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.500000e+00 -1.460000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.460000e+00 -1.420000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.420000e+00 -1.380000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.380000e+00 -1.340000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.340000e+00 -1.300000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.300000e+00 -1.260000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.260000e+00 -1.220000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.220000e+00 -1.180000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.180000e+00 -1.140000e+00 
0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.140000e+00 -1.100000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.100000e+00 -1.060000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.060000e+00 -1.020000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-1.020000e+00 -9.800000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-9.800000e-01 -9.400000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-9.400000e-01 -9.000000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-9.000000e-01 -8.600000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-8.600000e-01 -8.200000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-8.200000e-01 -7.800000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-7.800000e-01 -7.400000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-7.400000e-01 -7.000000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-7.000000e-01 -6.600000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-6.600000e-01 -6.200000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-6.200000e-01 -5.800000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-5.800000e-01 -5.400000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-5.400000e-01 -5.000000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-5.000000e-01 -4.600000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-4.600000e-01 -4.200000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-4.200000e-01 -3.800000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-3.800000e-01 -3.400000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +-3.400000e-01 -3.000000e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0 +# END YODA_HISTO1D + diff --git a/gaps-1.1/test/completedevents.sh b/gaps-1.1/test/completedevents.sh new file mode 100644 index 
0000000000000000000000000000000000000000..938aa46e0f8f25756ea7e0595b13b6459ad2942b --- /dev/null +++ b/gaps-1.1/test/completedevents.sh @@ -0,0 +1,25 @@ +# Algorithm to get all the completed events data +# Not really worth putting in rungaps, so in Test + +# Move to the root directory +cd .. + +# Remove the old and make the new results directory +rm -rf results +mkdir -p results + +# Run for 1000, 10000, 100000 events and get results +for i in 1000 10000 100000 +do + echo "Running for $i events" + # Run the algorithm + ./bin/gaps $i + mv gaps-cycles.dat results/gaps-cycles-$i.dat +done + +# For 1000000 events, profile and store the results +echo "Running for 1000000 events" +# Run the algorithm +nsys profile --stats=true ./bin/gaps 1000000 > profile.log +mv gaps-cycles.dat results/gaps-cycles-1000000.dat +mv profile.log results/profile.log \ No newline at end of file diff --git a/gaps-1.1/test/mplstyleerc b/gaps-1.1/test/mplstyleerc new file mode 100644 index 0000000000000000000000000000000000000000..6297440335efd2dad150fa5ec942133a4a77e0cc --- /dev/null +++ b/gaps-1.1/test/mplstyleerc @@ -0,0 +1,21 @@ +# To match with LaTeX Paper +font.family: serif +font.serif: Times New Roman +font.size: 7 +font.weight: 400 +text.usetex: True +text.latex.preamble: \usepackage{amsmath} +mathtext.fontset: cm +mathtext.rm: Times New Roman +mathtext.it: Times New Roman:italic +mathtext.bf: Times New Roman:bold +mathtext.cal: Times New Roman:caligraphic + +# Common settings +axes.titlesize: 12 +axes.labelsize: 10 + +# Savefig settings +savefig.bbox: tight +savefig.pad_inches: 0.1 + diff --git a/gaps-1.1/test/plot-completedevents.py b/gaps-1.1/test/plot-completedevents.py new file mode 100644 index 0000000000000000000000000000000000000000..702c9b19face5105de6e83e41d9a39206a4ace56 --- /dev/null +++ b/gaps-1.1/test/plot-completedevents.py @@ -0,0 +1,62 @@ +import numpy as np +import matplotlib.pyplot as plt +import pandas as pd + +import matplotlib as mpl +mpl.rc_file("mplstyleerc") + 
+# Plotting Number of Completed Events per cycle and Number of newly Completed +# Events per cycle + +# Plot +fig, ax = plt.subplots(1, 2, figsize=(9, 3.75)) + +nev = [1000, 10000, 100000, 1000000] + +for n in nev: + + filename = "../results-events/gaps-cycles-" + str(n) + + temp = np.genfromtxt(filename + ".dat", delimiter='\n') + temp /= n # Divide by number of events + + comp = np.zeros((200)) + diff = np.zeros((200)) + max_len = 0 + + comp[:len(temp)] = temp + comp[len(temp):] = temp[-1] + + for i in range(1, len(temp) - 1): + + diff[i] = temp[i] - temp[i-1] + + if len(temp) > max_len: + max_len = len(temp) + + comp = comp[:max_len] + comp = np.append(comp, 0.) + + diff = diff[:max_len] + diff = np.append(diff, 0.) + + cycles = np.arange(1, len(comp) + 1) + + ax[0].scatter(cycles, comp, label=str(n) + ' Events ') + ax[0].plot(cycles, comp, alpha=0.5) + ax[0].set_xlabel('Cycle') + ax[0].set_ylabel('Number of Completed Events / Total') + ax[0].set_title('Number of Completed Events per Cycle') + ax[0].legend() + ax[0].grid(True) + + ax[1].scatter(cycles, diff, label=str(n) + ' Events ') + ax[1].plot(cycles, diff, alpha=0.5) + ax[1].set_xlabel('Cycle') + ax[1].set_ylabel('Number of Newly Completed Events / Total') + ax[1].set_title('Number of Newly Completed Events per Cycle') + ax[1].legend() + ax[1].grid(True) + +fig.tight_layout() +plt.savefig("completedEvents.pdf") diff --git a/gaps-1.1/test/plot-executiontime.py b/gaps-1.1/test/plot-executiontime.py new file mode 100644 index 0000000000000000000000000000000000000000..5e561d351d6c8e88374385c5626a13fd4a7acfb6 --- /dev/null +++ b/gaps-1.1/test/plot-executiontime.py @@ -0,0 +1,147 @@ +import numpy as np +import matplotlib.pyplot as plt +import pandas as pd +from scipy.stats import iqr + +import matplotlib as mpl +mpl.rc_file("mplstyleerc") + +# Data to plot +nev = np.array([1, 2, 5, 10, 20, 50, 100, 200, 500, 1000, + 2000, 5000, 10000, 20000, 50000, + 100000, 200000, 500000, 1000000]) + +# Params +nreps = 100 + +dir_to_results = 
"../results-times/" + +# Import data +cpp_full = np.genfromtxt(dir_to_results + "cpp-time.dat", delimiter=',') +cud_full = np.genfromtxt(dir_to_results + "gaps-time.dat", delimiter=',') + +# Calculate the median and interquartile range over all repetitions +cpp_median = np.zeros((len(nev), 4)) +cud_median = np.zeros((len(nev), 4)) + +cpp_iqr = np.zeros((len(nev), 4)) +cud_iqr = np.zeros((len(nev), 4)) + +for i in range(len(nev)): + + """ + # To Check if there are any outliers + if nev[i] == 5000: + print(cud_full[i*nreps:i*nreps+nreps]) + + fig, ax = plt.subplots() + ax.hist(cud_full[i*nreps:i*nreps+nreps, 0], bins=50) + fig.savefig("histo.pdf") + """ + + cpp_median[i] = np.median(cpp_full[i*nreps:i*nreps+nreps], axis=0) + cud_median[i] = np.median(cud_full[i*nreps:i*nreps+nreps], axis=0) + + cpp_iqr[i] = iqr(cpp_full[i*nreps:i*nreps+nreps], axis=0) + cud_iqr[i] = iqr(cud_full[i*nreps:i*nreps+nreps], axis=0) + +cpp = cpp_median +cud = cud_median + +# Convert the arrays to pandas DataFrames for easier printing +columns_lin = ['Matrix Element', 'Parton Shower', 'Observables', 'Total'] +cpp_df = pd.DataFrame(cpp, index=nev, columns=columns_lin) +cud_df = pd.DataFrame(cud, index=nev, columns=columns_lin) + +# Calculate the CPU / GPU ratios of the median times +cpp_cud_ratio = (cpp / cud) + +# Convert the ratio arrays to DataFrame +cpp_cud_ratio_df = pd.DataFrame(cpp_cud_ratio, index=nev, columns=columns_lin) + +# Print the DataFrame +print("CPU / GPU Ratio for different Number of Events:") +print(cpp_cud_ratio_df) +print("\n") + +# Initialize p as a 3D array +p = np.zeros((2, 2, 2)) + +# Define the labels for printing +labels = ["GPU Matrix Element Gradient", "GPU Parton Shower Gradient", + "GPU Observables Gradient", "GPU Total Gradient"] + +# Loop over the columns - to prevent lots of repeated statements +""" +For i = 0, i // 2 is 0 and i % 2 is 0. +For i = 1, i // 2 is 0 and i % 2 is 1. +For i = 2, i // 2 is 1 and i % 2 is 0. +For i = 3, i // 2 is 1 and i % 2 is 1. 
+ +Kept it here because I thought it was a neat way of doing loops! +""" +for i in range(4): + + # Linear fit in log-log space over the largest event counts (nev[14:]) + p1, c1 = np.polyfit(np.log(nev[14:]), np.log(cud[14:, i]), 1, cov=True) + + # Store the fit parameters (gradient, intercept) + p[i//2, i % 2, :] = p1 + + # Print the results + print(labels[i], ":", round(p1[0], 3), "±", round(np.sqrt(c1[0, 0]), 3)) + +# print(p) + +# Create a new figure with a 2x2 grid of subplots +fig, axs = plt.subplots(2, 2, figsize=(10, 6.4)) + +# Define the column names +columns = [['Matrix Element', 'Parton Shower'], ['Observables', 'Total']] + +# Add this line to adjust the space between subplots +fig.subplots_adjust(wspace=1, hspace=1) + +# Add linspace for the linear fit +x = np.linspace(40000, 1300000, 1000) + +# Loop over the columns and plot the data +for i in range(2): + for j in range(2): + ax = axs[i, j] + cpp_errorbar = ax.errorbar( + nev, cpp[:, 2*i + j], yerr=cpp_iqr[:, 2*i + j], fmt='o', color='C0') + cud_errorbar = ax.errorbar( + nev, cud[:, 2*i + j], yerr=cud_iqr[:, 2*i + j], fmt='o', color='C2') + cpp_plot = ax.plot(nev, cpp[:, 2*i + j], color='C0', alpha=0.3) + cud_plot = ax.plot(nev, cud[:, 2*i + j], color='C2', alpha=0.3) + fit_plot = ax.plot(x, np.exp(p[i, j, 1]) * x**p[i, j, 0], color='C1', + linestyle='--') + + ax.set_xscale('log') + ax.set_yscale('log') + ax.set_xlabel('Number of events') + ax.set_ylabel('Execution time (s)') + ax.set_title(columns[i][j]) + ax.grid(True) + + # Add a vertical line + ax.axvline(x=5120, color='C2', linestyle='--') + + # Create a proxy artist for the axvline to use in the legend + v100_gpu_cores_line = mpl.lines.Line2D( + [], [], color='C2', label="V100 GPU Cores") + + # Create a list of handles and labels manually, including the proxy artist + handles = [cpp_errorbar, cud_errorbar, + v100_gpu_cores_line, fit_plot[0]] + labels = ['CPU', 'GPU', "V100 GPU Cores", + "Linear Fit, Gradient = " + str(round(p[i, j, 0], 2))] + + # Add the legend with the updated handles and labels + ax.legend(handles, labels) + + 
ax.legend(handles, labels) + +fig.tight_layout() +plt.savefig('executionTime.pdf') diff --git a/gaps-1.1/test/plots.conf b/gaps-1.1/test/plots.conf new file mode 100644 index 0000000000000000000000000000000000000000..00b8bcf194a5fbf4971babb7a12ed36fe247171c --- /dev/null +++ b/gaps-1.1/test/plots.conf @@ -0,0 +1,88 @@ +# BEGIN PLOT /.* +LegendXPos=0.74 +LegendYPos=0.95 +# END PLOT + +# BEGIN PLOT /gaps/log10y23 +Title=Differential $2 \to 3$ jet resolution (Durham algorithm) at 91.2 GeV +XLabel=$\log_{10}(y_{23})$ +YLabel=$\text{d}\sigma/\text{d}\log_{10}(y_{23})$ +LegendXPos=0.7 +LegendYPos=0.95 +XMin=-4 +XMax=-0.4 +YMin=4e2 +# END PLOT + +# BEGIN PLOT /gaps/log10y34 +Title=Differential $3 \to 4$ jet resolution (Durham algorithm) at 91.2 GeV +XLabel=$\log_{10}(y_{34})$ +YLabel=$\text{d}\sigma/\text{d}\log_{10}(y_{34})$ +XMin=-4 +XMax=-0.8 +# END PLOT + +# BEGIN PLOT /gaps/log10y45 +Title=Differential $4 \to 5$ jet resolution (Durham algorithm) at 91.2 GeV +XLabel=$\log_{10}(y_{45})$ +YLabel=$\text{d}\sigma/\text{d}\log_{10}(y_{45})$ +XMin=-4 +XMax=-1.2 +# END PLOT + +# BEGIN PLOT /gaps/log10y56 +Title=Differential $5 \to 6$ jet resolution (Durham algorithm) at 91.2 GeV +XLabel=$\log_{10}(y_{56})$ +YLabel=$\text{d}\sigma/\text{d}\log_{10}(y_{56})$ +XMin=-4 +XMax=-1.6 +# END PLOT + +# BEGIN PLOT /gaps/tvalue +Title=Thrust $(1 - T)$ at 91.2 GeV +XLabel=$1 - T$ +YLabel=$\text{d}\sigma/\text{d}(1 - T)$ +# END PLOT + +# BEGIN PLOT /gaps/tzoomd +Title=Thrust $(1 - T)$ at 91.2 GeV - zoomed in +XLabel=$1 - T$ +YLabel=$\text{d}\sigma/\text{d}(1 - T)$ +XMin=0 +XMax=0.1 +# END PLOT + +# BEGIN PLOT /gaps/hjm +Title=Heavy Jet Mass at 91.2 GeV +XLabel=$\rho_H$ +YLabel=$\text{d}\sigma/\text{d}\rho_H$ +XMax=0.65 +# END PLOT + +# BEGIN PLOT /gaps/ljm +Title=Light Jet Mass at 91.2 GeV +XLabel=$\rho_L$ +YLabel=$\text{d}\sigma/\text{d}\rho_L$ +XMin=0 +XMax=0.4 +# END PLOT + +# BEGIN PLOT /gaps/wjb +Title=Wide Jet Broadening at 91.2 GeV +XLabel=$B_W$ +YLabel=$\text{d}\sigma/\text{d}B_W$ 
+XMax=0.3 +# END PLOT + +# BEGIN PLOT /gaps/njb +Title=Narrow Jet Broadening at 91.2 GeV +XLabel=$B_N$ +YLabel=$\text{d}\sigma/\text{d}B_N$ +# END PLOT + +# BEGIN PLOT /gaps/dalitz +Title=Dalitz plot at 91.2 GeV +XLabel=$x_1$ +YLabel=$x_2$ +PlotSize=8,8 +# END PLOT \ No newline at end of file diff --git a/gaps-1.1/test/rivet-command.sh b/gaps-1.1/test/rivet-command.sh new file mode 100644 index 0000000000000000000000000000000000000000..1d504d6d09fc32ca939db06f6ae34b4d7d4a0556 --- /dev/null +++ b/gaps-1.1/test/rivet-command.sh @@ -0,0 +1,6 @@ +# Command to Make Plots of Jet and Event Shapes +# Write something similar in the terminal +rivet-mkhtml -s --mc-errs -c plots.conf \ + ../cpp.yoda:"C++" \ + ../gaps.yoda:"GAPS" \ + SH-Tutorial.yoda:"S. H." \ \ No newline at end of file