CMS Data Analysis School Pre-Exercises - Second Set

Introduction

Welcome to the second set of CMSDAS pre-exercises. As you know by now, the purpose of the pre-workshop exercises is for prospective workshop attendees to become familiar with the basic software tools required to perform physics analysis at CMS before the workshop begins. Post your answers in the online response form available from the course web area.

A large amount of additional information about these exercises is available in the twikis that we reference. Please remember that twikis evolve but aim to provide the best information at any given time. If you encounter problems, please e-mail the
  • LPC contact CMSDASATLPC@fnal.gov for CMSDAS@LPC2018
with a detailed description of your problem. The instructors will be delighted to help you.

The second set of exercises begins with Exercise 7. We will use collision data events and simulated (Monte Carlo, MC) events. To work comfortably with these files, we will first make them smaller by selecting only the objects that we are interested in (electrons and muons in our case).

The collision data events are stored in DoubleMuon.root. The name DoubleMuon refers to the fact that, when these events were recorded, they were believed to contain two muons. This is true most of the time, but other objects can fake muons, so on closer inspection we might find events that do not actually have two muons.

The MC file is called DYJetsToLL. You will need to get used to cryptic names like this if you want to survive in the high energy physics environment! The MC file contains Drell-Yan events that decay to two leptons and that might be accompanied by one or several jets.

Exercise 8 and Exercise 9 use FWLite (Framework Lite). This is an interactive analysis tool integrated with the CMSSW EDM (Event Data Model) framework. It automatically loads the shared libraries that define the CMSSW data formats and tools, so that parts of an event in EDM format can easily be accessed within interactive ROOT sessions. It reads produced ROOT files and has full access to the class methods, and there is no need to write full-blown framework modules. Thus, with the FWLite distribution installed locally on a desktop, one can do CMS analysis outside the full CMSSW framework. In these two exercises, we will analyze the data stored in a MiniAOD sample using FWLite. We will loop over muons and make a Z mass peak.

Exercise 10 and Exercise 11 cover Fireworks, the CMS event display. Data handling is greatly simplified by using only reconstructed information and ideal geometry. Data is presented via graphical and textual views. Fireworks provides an easy-to-use interface that allows physicists to concentrate only on the data they are interested in. One can select which events (e.g. require a high-energy muon), what data (e.g. which track list) and which items in a collection (e.g. only high-pt tracks) to show.

We assume that, having done the first set of pre-exercises, you are comfortable with logging on to cmslpc-sl6.fnal.gov and setting up the CMS environment.

NOTE: Legend of colors for this tutorial:

GRAY background for the commands to execute  (cut&paste)
GREEN background for the output sample of the executed commands
BLUE background for the configuration files  (cut&paste)
PINK background for the code (EDAnalyzer etc.)  (cut&paste)

Obtain a CERN account (in case you don't have one already)

  • Use the following link for a CMS CERN account: CMS CERN account
    • A CERN account is needed, for example, to log in to any e-learning website or to obtain a file from the afs area. A CERN account will be needed for future exercises.
    • Obtaining a CERN account can be time-consuming and requires people at CERN to process applications during business hours. The relevant institutional team leader must start the CERN account request. Try to initiate this process as early as possible.

Obtain a Grid Certificate and CMS VO registration

  • A Grid Certificate and CMS VO registration will be needed for the Grid Exercises. The registration process can be time-consuming (actions by several people are required), so it is important to start it as soon as possible. There are two main requirements which can be simply summarized: A certificate ensures that you are who you claim to be. A registration in the VO recognizes you (identified by your certificate) as a member of CMS. Use the following link for this: Get Your Grid Certificate and CMSVO. Both are needed to submit jobs on the Grid. Make sure you follow any additional instructions for US-CMS users.

Exercise 7 - Slim MiniAOD sample from Exercise 6 to reduce its size by keeping only Muon and Electron branches

In order to reduce the size of the MiniAOD we would like to keep only the slimmedMuons and slimmedElectrons objects and drop all others. The config files should now look like slimMiniAOD_MC_MuEle_cfg.py and slimMiniAOD_data_MuEle_cfg.py. To work with these config files and make the slim MiniAODs, execute the following steps in the directory YOURWORKINGAREA/CMSSW_9_3_2/src:

1. Cut and paste the scripts slimMiniAOD_MC_MuEle_cfg.py and slimMiniAOD_data_MuEle_cfg.py in their entirety and save each with the same name. Open them with your favorite editor and take a look at these Python files. The number of events has been set to 1000:

process.maxEvents = cms.untracked.PSet( input = cms.untracked.int32(1000) )

To run over all events in the sample, one can change it to -1.

2. Now run the following command:

cmsRun slimMiniAOD_MC_MuEle_cfg.py 

This produces an output file called slimMiniAOD_MC_MuEle.root in your $CMSSW_BASE/src area.

3. Now run the following command:

cmsRun slimMiniAOD_data_MuEle_cfg.py 

This produces an output file called slimMiniAOD_data_MuEle.root in your $CMSSW_BASE/src area.

On opening these two MiniAODs one observes that only the slimmedMuons and the slimmedElectrons objects are retained as intended.
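The keep/drop selection behind this result can be illustrated with a plain-Python toy model. This is only a sketch, not CMSSW code. It assumes that the output commands are `drop *` followed by two `keep` statements for the muon and electron collections (the linked config files are not reproduced here), that branch names follow the `type_label_instance_process` pattern, and that a later matching command overrides an earlier one. The function name `select_branches` and the example branch list are hypothetical.

```python
from fnmatch import fnmatchcase


def select_branches(branches, commands):
    """Return the branches kept after applying keep/drop commands in order.

    Each command is 'keep <pattern>' or 'drop <pattern>'. The last command
    whose pattern matches a branch decides its fate (toy model, assumption).
    """
    kept = []
    for branch in branches:
        keep = True  # assumed default when no command matches: keep
        for command in commands:
            action, pattern = command.split(None, 1)
            if action not in ("keep", "drop"):
                raise ValueError("unknown command: " + command)
            if fnmatchcase(branch, pattern):
                keep = (action == "keep")
        if keep:
            kept.append(branch)
    return kept


# Assumed output commands: drop everything, then keep muons and electrons.
commands = [
    "drop *",
    "keep *_slimmedMuons_*_*",
    "keep *_slimmedElectrons_*_*",
]

# Hypothetical branch names, for illustration only.
branches = [
    "patMuons_slimmedMuons__PAT",
    "patElectrons_slimmedElectrons__PAT",
    "patJets_slimmedJets__PAT",
    "patMETs_slimmedMETs__PAT",
]

kept = select_branches(branches, commands)
print(kept)
```

With these inputs only the two branches whose label is slimmedMuons or slimmedElectrons survive, which is what you should also see when browsing the slimmed files.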

To find the size of your MiniAODs, execute the following Linux commands:

ls -lh slimMiniAOD_MC_MuEle.root

and

ls -lh slimMiniAOD_data_MuEle.root
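For the size comparison asked for in Question 7.3a, the ratio can also be computed with a few lines of Python. The sketch below uses only the standard library and works only for files that are readable as ordinary local paths. The function names `human_size` and `size_ratio` are hypothetical, and you would have to supply the local path of the original Exercise 6 sample yourself.

```python
import os


def human_size(num_bytes):
    """Format a byte count with binary units, similar in spirit to 'ls -lh'."""
    size = float(num_bytes)
    for unit in ("B", "K", "M", "G"):
        if size < 1024.0:
            return "%.1f%s" % (size, unit)
        size /= 1024.0
    return "%.1f%s" % (size, "T")


def size_ratio(slim_path, original_path):
    """Return size(slim) / size(original) for two locally readable files."""
    original = os.path.getsize(original_path)
    if original == 0:
        raise ValueError("original file is empty")
    return os.path.getsize(slim_path) / float(original)


if __name__ == "__main__":
    # Print the sizes of the slimmed files if they exist in this directory.
    for name in ("slimMiniAOD_MC_MuEle.root", "slimMiniAOD_data_MuEle.root"):
        if os.path.exists(name):
            print("%s %s" % (name, human_size(os.path.getsize(name))))
```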

You may also try the following:

To know the size of each branch, use the edmEventSize utility as follows (also explained in First Set of Exercises ):

 edmEventSize -v slimMiniAOD_MC_MuEle.root

and

 edmEventSize -v slimMiniAOD_data_MuEle.root

To see what objects there are, open the ROOT file as follows and browse the MiniAOD samples as you did in Exercise 6. Here is how to do it for the output file slimMiniAOD_MC_MuEle.root:

root -l slimMiniAOD_MC_MuEle.root
TBrowser b;

OR

root -l
TFile *theFile = TFile::Open("slimMiniAOD_MC_MuEle.root");
TBrowser b;

To quit the ROOT application, execute:

.q

For CMSDAS@LPC2018 please submit your answers at the Google Form second set.

QUESTION 7.1a - What is the size of the MiniAOD slimMiniAOD_MC_MuEle.root?

QUESTION 7.1b - What is the size of the MiniAOD slimMiniAOD_data_MuEle.root?

QUESTION 7.2a - What is the mean eta of the muons for MC?

QUESTION 7.2b - What is the mean eta of the muons for data?

QUESTION 7.3a - What is the size of the output file compared to the original sample?

QUESTION 7.3b - Is the mean eta for muons for MC and data the same as in the original sample in Exercise 6?

Exercise 8 - Use FWLite on the MiniAOD created in Exercise 7 and make a Z Peak (applying pt and eta cuts)

FWLite (pronounced "framework-light") is basically a ROOT session with CMS data format libraries loaded. CMS uses ROOT to persistify data objects. CMS data formats are thus "ROOT-aware"; that is, once the shared libraries containing the ROOT-friendly description of CMS data formats are loaded into a ROOT session, these objects can be accessed and used directly from within ROOT like any other ROOT class!

In addition, CMS provides a couple of classes that greatly simplify access to the collections of CMS data objects. These classes (Event and Handle) have the same names as the analogous ones in the full framework; this mnemonic trick keeps the code for accessing CMS collections very similar between FWLite and the full framework.

In this exercise we will make a ZPeak using our data and MC sample. We will use the corresponding slim MiniAOD created in Exercise 7. To read more about FWLite, have a look at Section 3.5 of Chapter 3 of the WorkBook.

We will first make a ZPeak. We will loop over the slimmedMuons in the MiniAOD and get the mass of oppositely charged muons. These are filled in a histogram that is written to an output ROOT file.
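The physics of this step can be sketched in plain Python, independent of CMSSW. The example below computes the invariant mass of oppositely charged muon pairs from (pT, eta, phi, charge). It is an illustration only, not the code in FWLiteHistograms.cc: the function names, the tuple layout, and the cut values `min_pt` and `max_abs_eta` are assumptions chosen for the example.

```python
import math
from itertools import combinations

MUON_MASS = 0.1056584  # muon mass in GeV


def four_vector(pt, eta, phi, mass=MUON_MASS):
    """Convert (pt, eta, phi, mass) to Cartesian (E, px, py, pz) in GeV."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz


def invariant_mass(mu1, mu2):
    """Invariant mass of two muons given as (pt, eta, phi, charge) tuples."""
    e1, x1, y1, z1 = four_vector(*mu1[:3])
    e2, x2, y2, z2 = four_vector(*mu2[:3])
    m2 = (e1 + e2) ** 2 - (x1 + x2) ** 2 - (y1 + y2) ** 2 - (z1 + z2) ** 2
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding


def dimuon_masses(muons, min_pt=20.0, max_abs_eta=2.4):
    """Masses of all opposite-charge pairs passing the (assumed) cuts."""
    selected = [m for m in muons if m[0] > min_pt and abs(m[1]) < max_abs_eta]
    return [invariant_mass(a, b)
            for a, b in combinations(selected, 2)
            if a[3] * b[3] < 0]


# Hypothetical event: two back-to-back muons plus one soft muon that is cut.
muons = [
    (45.6, 0.0, 0.0, +1),
    (45.6, 0.0, math.pi, -1),
    (5.0, 1.0, 1.0, +1),
]
masses = dimuon_masses(muons)
print(masses)
```

In this made-up event only one pair survives the cuts, and its mass comes out close to 91.2 GeV because the two muons are back to back with equal pT. In the real exercise each such mass would be filled into the histogram.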

First make sure that you have the MiniAODs created in Exercise 7. They should be called slimMiniAOD_MC_MuEle.root and slimMiniAOD_data_MuEle.root.

1. Go to the src area of the current CMSSW release:

cd $CMSSW_BASE/src

The environment variable CMSSW_BASE points to the base area of the current CMSSW release.

2. Check out a package from GitHub.

Make sure that your GitHub account is set up properly, as described in the instructions to obtain a GitHub account. It is particularly important to set up ssh keys so that you can check out code without problems: https://help.github.com/articles/generating-ssh-keys

To check out the package, run:

git cms-addpkg PhysicsTools/FWLite

Then to compile the packages, do

scram b
cmsenv

Note: You can try scram b -j 8 to speed up the compilation. Here -j 8 runs up to 8 compile jobs in parallel. Using several cores to compile also makes the interactive machine slower for others, so use it with care!

Note 2: It is necessary to call cmsenv again after compiling this package because it adds executables in the $CMSSW_BASE/bin area.

3. To make a Z peak, we will use the FWLite executable called FWLiteHistograms. The corresponding code should be in $CMSSW_BASE/src/PhysicsTools/FWLite/bin/FWLiteHistograms.cc

With this executable we will use the command line options. More about these can be learned from SWGuideCommandLineParsing.

To make a ZPeak from this executable, using the MC MiniAOD, run the following command (which will not work out of the box, see below):

FWLiteHistograms inputFiles=slimMiniAOD_MC_MuEle.root outputFile=ZPeak_MC.root maxEvents=-1 outputEvery=100

You can see that you will get the following error:

terminate called after throwing an instance of 'cms::Exception'
  what():  An exception of category 'ProductNotFound' occurred.
Exception Message:
getByLabel: Found zero products matching all criteria
Looking for type: edm::Wrapper<std::vector<reco::Muon> >
Looking for module label: muons
Looking for productInstanceName: 

The data is registered in the file but is not available for this event

This error occurs because your input file slimMiniAOD_MC_MuEle.root is a MiniAOD and does not contain a reco::Muon collection with the label muons. It does, however, contain slimmedMuons (check for yourself by opening the ROOT file with the ROOT browser). In the code FWLiteHistograms.cc there are lines that say:

using reco::Muon;

and

event.getByLabel(std::string("muons"), muons);

This means you need to change reco::Muon to pat::Muon, and muons to slimmedMuons.

To implement these changes, open the code $CMSSW_BASE/src/PhysicsTools/FWLite/bin/FWLiteHistograms.cc. In this code, look at the line that says:

using reco::Muon;

and change it to

using pat::Muon;

Similarly, find the line:

event.getByLabel(std::string("muons"), muons);

and change it to

event.getByLabel(std::string("slimmedMuons"), muons);

Now you need to re-compile:

scram b

Now again run the executable as follows:

FWLiteHistograms inputFiles=slimMiniAOD_MC_MuEle.root outputFile=ZPeak_MC.root maxEvents=-1 outputEvery=100

You can see that it now runs successfully and produces a ROOT file called ZPeak_MC.root. Open this ROOT file and look at the Z mass peak histogram called mumuMass. Answer the following questions.

QUESTION 8.1a - What is the mean mass of the ZPeak for your MC MiniAOD?

QUESTION 8.1b - How can you increase statistics in your ZPeak histogram?

4. Now a little bit about the command that you executed.

In the command above, slimMiniAOD_MC_MuEle.root is the input file and ZPeak_MC.root is the output file. maxEvents is the number of events you want to run over; you can change it to any other number. The value -1 means running over all events, which is 1000 in this case. outputEvery sets after how many events the code reports the number of the event being processed. As you may have noticed, with the value you specified the executable prints processing event: after every 100 events.
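The meaning of these options can be sketched as a toy event loop in plain Python. This is not the CMSSW command-line parser or the actual loop in FWLiteHistograms.cc. The function names are hypothetical, and the exact rule for when a progress line is printed (here: at every non-zero multiple of outputEvery) is an assumption.

```python
def parse_options(args):
    """Split 'key=value' strings into a dict (toy stand-in, assumption)."""
    options = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if not sep or not key:
            raise ValueError("expected key=value, got: " + arg)
        options[key] = value
    return options


def run_loop(n_events_in_file, max_events=-1, output_every=100):
    """Return (number of processed events, list of reported event indices)."""
    reported = []
    processed = 0
    for ievt in range(n_events_in_file):
        if max_events > 0 and ievt >= max_events:
            break  # in this toy, any non-positive max_events means no limit
        if output_every > 0 and ievt > 0 and ievt % output_every == 0:
            reported.append(ievt)  # stands in for the progress printout
        processed += 1
    return processed, reported


command = ("inputFiles=slimMiniAOD_MC_MuEle.root outputFile=ZPeak_MC.root "
           "maxEvents=-1 outputEvery=100")
opts = parse_options(command.split())
processed, reported = run_loop(1000,
                               max_events=int(opts["maxEvents"]),
                               output_every=int(opts["outputEvery"]))
print(processed)
print(reported)
```

Changing maxEvents to a positive number smaller than the number of events in the file stops the loop early, and a larger outputEvery produces fewer progress lines.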

If you look at the code FWLiteHistograms.cc , it also contains the defaults corresponding to the above command line options. Answer the following question:

QUESTION 8.2 - What is the default name of the output file?

Exercise 9 - Re-run the above executable with the data MiniAOD

Re-run the above executable with the data MiniAOD file called slimMiniAOD_data_MuEle.root as follows:

FWLiteHistograms inputFiles=slimMiniAOD_data_MuEle.root outputFile=ZPeak_data.root maxEvents=-1 outputEvery=100

This will create an output histogram ROOT file called ZPeak_data.root

Then answer the following question.

QUESTION 9a - What is the mean mass of the ZPeak for your data MiniAOD?

QUESTION 9b - How can you increase statistics in your ZPeak histogram?

Exercise 10 - Fireworks - CMS Event Display

Fireworks is the CMS event-display project and cmsShow is the official name of the executable. Both names are used interchangeably. With this tool one can display events for physics. The core of Fireworks is built on top of the Event Data Model (EDM) and the light version of the software framework (FWLite). The Event Visualization Environment (EVE) of ROOT is used to manage 3D and 2D views, selection, and user interaction with the graphics windows. Several EVE components were developed in a collaboration between the Fireworks and ROOT teams. The event display operates using simple plugins which are registered into the system to perform conversion from EDM collections into their visual representations. As a guiding principle, Fireworks shows only what is available in the EDM event data; no reconstruction or result enhancement is performed internally. Visibility of collection elements can be filtered via a generic expression.

An instructive introduction to the features of Fireworks is given in this video tutorial. The video tutorial shows an older version of Fireworks, so some elements of the user interface (UI) have since changed. With a little browsing through the new UI, you will be able to find all the functionality. Many of the features are very helpful, and they are nicely explained in the video.

Please be aware that for any issues with the Fireworks display, you should first have a look at the twiki WorkBookFireworksHowToFix and then send email to the Fireworks support list at fireworks-support@cernSPAMNOT.ch.

1. First we will look at the event display from YOURWORKINGAREA/CMSSW_9_3_2/src. After you log in and run cmsenv, execute the following command from YOURWORKINGAREA/CMSSW_9_3_2/src. We will look at the Drell-Yan MC sample (DYJetsToLL) that you have used in earlier exercises:

cmsShow root://cmseos.fnal.gov//store/user/cmsdas/2017/pre_exercises/DYJetsToLL.root

If you get an error about incompatible data, try this instead:

cmsShow --no-version-check root://cmseos.fnal.gov//store/user/cmsdas/2017/pre_exercises/DYJetsToLL.root

Depending on the connection you have, it might take a while to open Fireworks, since the GUI has to be forwarded to your local computer. It will pop up a window like this:

cmsShow_MC_730pre1.png

You will soon realise that it is very slow. This is because the data file is not stored locally. Let us copy a few events locally using a config file. You can use this config file to copy events from any file that you can access locally. To achieve this, open the file copy_CMSDAS_cfg.py, select the entire text and save it in a file with the same name in YOURWORKINGAREA/CMSSW_9_3_2/src. Now execute the following command:

cmsRun copy_CMSDAS_cfg.py

This will copy 100 events to a file called DYJetsToLL_n100.root. If you want to copy fewer events, you are free to change the number 100 to a different value in the following line in copy_CMSDAS_cfg.py:

process.maxEvents = cms.untracked.PSet( input = cms.untracked.int32(100) )

To also copy the data file, open copy_CMSDAS_cfg.py and replace DYJetsToLL with DoubleMuon everywhere; there should be two occurrences of DYJetsToLL. Then run the config file again for data:

cmsRun copy_CMSDAS_cfg.py

This will copy 100 collision events to a file called DoubleMuon_n100.root.
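The search-and-replace step for the data file can also be scripted. The following is a minimal sketch using only the Python standard library. The function names `retarget_config` and `retarget_file`, the output filename copy_CMSDAS_data_cfg.py, and the two-line sample text are assumptions, since the content of copy_CMSDAS_cfg.py is not reproduced here.

```python
def retarget_config(text, old="DYJetsToLL", new="DoubleMuon", expected=2):
    """Replace every occurrence of `old` by `new` and check the count."""
    count = text.count(old)
    if count != expected:
        raise ValueError("expected %d occurrences of %r, found %d"
                         % (expected, old, count))
    return text.replace(old, new)


def retarget_file(src="copy_CMSDAS_cfg.py", dst="copy_CMSDAS_data_cfg.py"):
    """Write a retargeted copy of the config (dst name is hypothetical)."""
    with open(src) as handle:
        text = handle.read()
    with open(dst, "w") as handle:
        handle.write(retarget_config(text))
    return dst


# Hypothetical stand-in for the two lines of the config that name the sample.
sample = "input: DYJetsToLL.root\noutput: DYJetsToLL_n100.root\n"
result = retarget_config(sample)
print(result)
```

Checking the number of occurrences guards against editing a config that differs from the one described above. Writing to a separate file keeps the MC config intact; the new file would then be passed to cmsRun in the same way.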

Now we are ready to play with Fireworks for the event display.

To open the Fireworks event display for the collision data file, do the following:

cmsShow DoubleMuon_n100.root

This will open the Fireworks display window as shown in the snapshot below. This window has several parts that can be swapped or undocked for a separate view. Now do the following after the Fireworks window opens.

event_display_804.png

1. As you see, the very first event displayed has the event number 133325096.

2. On the top left part that says "Summary View/Add Collection" uncheck all collections EXCEPT Muons. As you uncheck, notice how the different color coded objects disappear from the main display sub-window that says "Rho Phi".

3. Now look at the Muons collection from the left side menu. Click the arrow to the left, to see some basic properties of the muon candidates.

QUESTION 10.1a - What is the pT of the highest-pT muon that you see in the first event?

QUESTION 10.1b - What is the run number?

QUESTION 10.1c - What is the LS number?

QUESTION 10.2 - How many tracks does the first event have?

You can also open the DYJetsToLL_n100.root file and display its events too.

Exercise 11 - Run Fireworks locally from Desktop

As you noticed, accessing a remote file with cmsShow makes things run slowly. To overcome that you copied some events to your interactive machine. However, despite having the data and MC files in your working area, the display is still very slow. The fastest way to run Fireworks is to have everything locally on your laptop. In this exercise, we will first download Fireworks and then run the display. We will also copy the ROOT files locally to the laptop/desktop.

First copy the ROOT files locally to your laptop. To do that, we use scp (secure copy), which copies files with the ssh protocol. Alternate copy methods for the cmslpc-sl6 cluster can be found at http://uscms.org/uscms_at_work/computing/setup/uaf_data_transfer.shtml.

While on your laptop, run the commands:

scp USERNAME@SERVER:YOURWORKINGAREA/CMSSW_9_3_2/src/DYJetsToLL_n100.root .
scp USERNAME@SERVER:YOURWORKINGAREA/CMSSW_9_3_2/src/DoubleMuon_n100.root .

Replace USERNAME with your username, SERVER with the server you have been using (cmslpc-sl6.fnal.gov; this presumes you already have a valid Kerberos ticket), and YOURWORKINGAREA with the path to your working area (find it with the pwd command while logged into the cmslpc-sl6 cluster).

Now we will get the Fireworks executable locally. To do this, please follow the instructions in the WorkBookFireworks twiki. Just download, uncompress and come back here.

Next, copy the ROOT files to the cmsShow-X.Y directory. To open the event display, go into the cmsShow-X.Y directory and execute the following:

./cmsShow DoubleMuon_n100.root

QUESTION 11 - How many primary vertices are in the first event of DoubleMuon_n100.root?

For CMSDAS@LPC2018 please submit your answers at the Google Form second set.

Link to SWGuideCMSDataAnalysisSchoolPreExerciseFirstSet

Link to SWGuideCMSDataAnalysisSchoolPreExerciseThirdSet

Link to SWGuideCMSDataAnalysisSchoolPreExerciseFourthSet

Link to SWGuideCMSDataAnalysisSchoolPreExerciseFifthSet

Link to SWGuideCMSDataAnalysisSchoolPreExerciseSixthSet

Questions/Problems/Suggestions - mailto: basil.schneider@cern.ch.

Reviewer/Editor and Date (copy from screen) Comments
Main.Basil.Schneider, MargueriteTonjes - 22 October 2017 Updated to Run2/2017 for CMSDAS@LPC2018
AdrianPerieanu - 15 July 2016 Updated to CMSSW_8_0_6 for CMSDAS@HH
NitishDhingra - 03 October 2013 Updated to CMSSW_5_3_11 for CMSDAS@Kolkata
NitishDhingra - 22 July 2012 Updated to CMSSW_5_2_5 for CMSDASia
SudhirMalik - 02 July 2012 General Review

%REVIEW% MargueriteTonjes - 26 October 2017
%REVIEW% AdrianPerieanu - 15 July 2016
%REVIEW% NitishDhingra - 03 October 2013
%RESPONSIBLE% SudhirMalik


-- Sudhir Malik - 2017-12-15
