Accessing the CDF databases through Python


If you want to be able to access information in the CDF databases from a script, with a minimal amount of code, Python may be just the thing you need. I've compiled a recent version (2.1.1) of Python and added the DCOracle2 module to it, which binds to Oracle databases. (This python executable is called ~peonyisi/bin/python.)

The most difficult part about getting information from the database is knowing what to ask for (which fields from which tables). This task is pretty much equivalent to writing a SQL statement. I'm not going to cover this; I'll assume that you know what SQL request you want to execute. (A couple of examples of SQL statements follow below.)

Given that you know what SQL query you wish to execute, you can just fit it in to the following skeleton:

 
#!/cdf/home/cdf/peonyisi/bin/python 
import sys, DCOracle2

db = DCOracle2.connect('cdf_reader/reader@cdfofprd') 
c = db.cursor() 
c.execute('SQL STATEMENT GOES HERE') 
r = c.fetchall() 

for entry in r:
	# for example,
	print entry
An explanation of what is going on:

A simple example:

# This script just dumps the information from the GOODRUN_STATUS table
# for a given run number

#!/cdf/home/cdf/peonyisi/bin/python
import sys, DCOracle2

if len(sys.argv) < 2:
	print "Usage:", sys.argv[0], "run"
	sys.exit(1)

db = DCOracle2.connect('cdf_reader/reader@cdfofprd')
c = db.cursor()

c.execute("SELECT * FROM GOODRUN_STATUS "
	  "WHERE RUNNUMBER = " + sys.argv[1])
r = c.fetchall()
for entry in r:
	print entry

A example with slightly prettier output:

# This program lists every Run 2 trigger, at every level, ever
# sorted alphabetically within trigger level

#!/cdf/home/cdf/peonyisi/bin/python
import DCOracle2
 
db = DCOracle2.connect('cdf_reader/reader@cdfofprd')
c = db.cursor()
 
c.execute("SELECT UNIQUE TRIGGER_NAME||'_v-'||TRIGGER_TAG, "
        "TRIGGER_LEVEL||':'||TRIGGER_NAME||':'||TRIGGER_TAG, "
        'TRIGGER_LEVEL, TRIGGER_ID, '
        'PHYSICS_GROUP, HTML_DOCUMENT FROM TRIGGERS '
        "ORDER BY TRIGGER_LEVEL||':'||TRIGGER_NAME||':'||TRIGGER_TAG")
r = c.fetchall()
format = '%-1d     %-50s %-5d %-15s'
print '%-5s %-50s %-5s %-15s' % ('Level', 'Trigger name and version', 'ID #', 'Physics group')
print '%-5s %-50s %-5s %-20s' % ('-----', '------------------------', '----', '-------------')
for entry in r:
        print format % (entry[2], entry[0], entry[3], entry[4]);

A few references:


Peter Onyisi, 29 Nov 2001