ANN: script to visualize python profiling data with kcachegrind

Discussion in 'C++' started by Joerg Beyer, Sep 17, 2003.

  1. Joerg Beyer

    Joerg Beyer Guest

    Dear List Reader,

    with the kind help of Josef Weidendorfer, I wrote a script
    to transform the profiling data of the pstats.Stats class
    to a format that the program kcachegrind could read
    and visualize.

    kcachegrind is free software by Josef Weidendorfer,
    see . It was developed
    with the very nice valgrind (see ),
    which is (among other things a C/C++ profiling tool).
    kcachegrind is the best visualisation of gcc compiled
    profiling code I have seen yet.

    With this, one could profile it's python python code and
    generate the pstats.Stats results, e.g. like that:
    #!/usr/bin/env python
    # _*_ coding: latin1 _*_

    import hotshot

    def busy(x):
    for i in range(x):

    def h():

    def f():

    def run():

    filename = ""
    prof = hotshot.Profile(filename, lineevents=1)

    When you run this tiny python code, you get the profiling results
    in "". Now you can transform them with hotshot2cachegrind
    (my script, see below) like this:

    hotshot2cachegrind -o cachegrind.out.42

    this will produce the file cachegrind.out.42, the input to kcachegrind.
    The filename _has_ to be in the format "cachegrind.out."+<number>.

    now start kcachegrind with this data file as input:
    kcachegrind cachegrind.out.42

    Now a kcachegrind window should open, with the profiling
    results, like a calltree and a list of costs (how expensive
    are the functions). For a description of how to interpret
    this data, please take a look at the kcachegrind documentation,
    but most of it is very easy and intuitive.

    The costs are given in "Ticks", which is what hotshot
    tells me as costs (tdelta of the logread).

    I am not sure what is the unit of this, could anybody
    tell me? There may be small mistakes, since
    I am e.g. not sure where the costs of a call should go:
    to the calling or the called function. So, python experts
    are needed to justify these details.

    Besides this, I was able to find and improve the hot spots
    in some python sources without big efforts. At least the
    relative numbers seem right.

    The script hotshot2cachegrind is in it's early stages, I plan
    to extend it to display the source line and the associated
    costs, just like kcachegrind could do with assembler lines.
    I believe in "release early, release often", so please take
    a look at it, profile some of you code an tell me, if the
    results are resonable.

    Josef and I plan to make the current version of the
    hotshot2cachegrind script available from

    Until that, I append the python code to this posting.

    I am not aware of other visualisations of python profiling
    results, so links are welcome.

    Comments are welcome.

    below is hotshot2cachegrind:
    #!/usr/bin/env python
    # _*_ coding: latin1 _*_

    # Copyright (c) 2003 by WEB.DE, Karlsruhe
    # Autor: Jörg Beyer <>
    # hotshot2cachegrind is free software; you can redistribute it and/or
    # modify it under the terms of the GNU General Public
    # License as published by the Free Software Foundation, version 2.
    # This program is distributed in the hope that it will be useful,
    # but WITHOUT ANY WARRANTY; without even the implied warranty of
    # General Public License for more details.

    # You should have received a copy of the GNU General Public License
    # along with this program; see the file COPYING. If not, write to
    # the Free Software Foundation, Inc., 59 Temple Place - Suite 330,
    # Boston, MA 02111-1307, USA.
    # This script transforms the pstat output of the hotshot
    # python profiler into the input of kcachegrind.
    # example usage:
    # modify you python script to run this code:
    # import hotshot
    # filename = ""
    # prof = hotshot.Profile(filename, lineevents=1)
    # prof.runcall(run) # assuming that "run" should be called.
    # prof.close()
    # it will run the "run"-method under profiling and write
    # the results in a file, called "".
    # then call this script:
    # hotshot2cachegrind -o <output> <input>
    # or here:
    # hotshot2cachegrind cachegrind.out.0
    # then call kcachegrind:
    # kcachegrind cachegrind.out.0

    import os, sys
    from hotshot import stats,log
    import os.path

    what2text = {
    log.WHAT_EXIT : "EXIT",
    log.WHAT_ENTER : "ENTER"}

    # a pseudo caller on the caller stack. This represents
    # the Python interpreter that executes the given python
    # code.
    root_caller = ("PythonInterpreter",0,"execute")

    class Stack:
    """A tiny Stack implementation, based on python lists"""
    def __init__(self):
    self.stack = []
    def push(self, elem):
    """put something on the stack"""
    self.stack = [elem] + self.stack

    def pop(self):
    """get the head element of the stack and remove it from teh stack"""
    head = self.stack[0]
    self.stack = self.stack[1:]
    return head

    def top(self):
    """get the head element of the stack, stack is unchanged."""
    head = self.stack[0]
    return head
    def size(self):
    """ return how many elements the stack has"""
    return len(self.stack)

    def do_calls(output, current_function, current_source_file, costs):
    """print, what kcachegrind needs for a function call"""
    output.write("cfn=%s\n" % current_function)
    output.write("calls=1\n0 %d\n" % costs)
    output.write("fn=%s\n" % current_function)

    def return_from_call(caller_stack, call_dict, current_cost):
    """return from a function call
    remove the function from the caller stack,
    add the costs to the calling function.
    called, cost_of_left_funct = caller_stack.pop()
    caller, caller_cost = caller_stack.pop()
    #print "pop: caller: %s size: %d" % (caller, caller_stack.size())

    per_file_dict = call_dict.get(caller[0], {})
    per_caller_dict = per_file_dict.get(caller[2], {})
    call_cost, count = per_caller_dict.get(called, (0, 0))
    cost_of_left_funct += current_cost

    per_caller_dict[called] = (call_cost+cost_of_left_funct, count + 1)
    per_file_dict[caller[2]] = per_caller_dict
    call_dict[caller[0]] = per_file_dict
    #print "push caller %s, size: %d : %s" % (caller, caller_stack.size(), (caller,caller_cost + cost_of_left_funct))
    caller_stack.push((caller,caller_cost + cost_of_left_funct))

    def convertProfFiles(output, inputfilenames):
    """convert all the given input files into one kcachegrind
    input file.
    call_dict = {}
    cost_per_line = {}
    cost_per_function = {}
    caller_stack = Stack()
    caller_stack.push((root_caller, 0))

    total_cost = 0
    i = 0
    number_of_files = len(inputfilenames)
    for inputfilename in inputfilenames:
    i += 1
    sys.stdout.write("lese File %d von %d, costs: %s \r" % (i, number_of_files, total_cost))
    cost = convertProfFile(inputfilename, caller_stack, call_dict, cost_per_line, cost_per_function)
    total_cost += cost

    print "total_cost:",total_cost
    dumpResults(output, call_dict, total_cost, cost_per_line, cost_per_function)

    def convertProfFile(inputfilename, caller_stack, call_dict, cost_per_line, cost_per_function):
    """convert a single input file into one kcachegrind

    item_counter = 0
    total_cost = 0
    logreader = log.LogReader(inputfilename)
    current_cost = 0
    last_file = None
    last_func = None
    for item in logreader:
    item_counter += 1
    what, pos ,tdelta = item
    (file, lineno, func) = pos
    #line = "%s %s %d %s %d" % (what2text[what], file, lineno, func, tdelta)
    #print line
    #if what == log.WHAT_LINENO:
    if what == log.WHAT_ENTER:
    caller_stack.push((pos, tdelta))
    elif what == log.WHAT_EXIT:
    return_from_call(caller_stack, call_dict, tdelta)
    # add the current cost to the current function
    p, c = caller_stack.pop()
    c += tdelta

    cost = cost_per_line.get(pos, 0)
    #print "buche tdelta: %d auf pos: %s -> %d" % (tdelta, pos, cost + tdelta)
    cost_per_line[pos] = cost + tdelta
    total_cost += tdelta

    # I have no idea, why sometimes the stack is not empty - we
    # have to rewind the stack to get 100% for the root_caller
    while caller_stack.size() > 1:
    return_from_call(caller_stack, call_dict, 0)

    except IOError:
    print "could not open inputfile '%s', ignore this." % inputfilename
    except EOFError, m:
    print "item_counter: %d %s" % (item_counter, m)
    return total_cost

    def pretty_name(file, function):
    #pfile = os.path.splitext(os.path.basename(file)) [0]
    return "%s [%s]" % (function, file)
    #return "%s_%s" % (pfile, function)

    def write_fn(output, file, function):
    output.write("fn=%s\n" % pretty_name(file, function))

    def write_fl(output, file, function):
    output.write("fl=%s\n" % pretty_name(file, function))

    def write_cfn(output, file, function):
    output.write("cfn=%s\n" % pretty_name(file, function))

    def dumpResults(output, call_dict, total_cost, cost_per_line, cost_per_function):
    """write the collected results in the format kcachegrind
    could read.
    # the intro
    output.write("events: Tick\n")
    output.write("summary: %d\n" % total_cost)
    output.write("cmd: your python script\n")
    last_func = None
    last_file = None
    # now the costs per line
    for pos in cost_per_line.keys():
    output.write("ob=%s\n" % pos[0])
    write_fn(output, pos[0], pos[2])
    # cost line
    output.write("%d %d\n" % (pos[1], cost_per_line[pos]))
    # now the function calls. For each caller all the called
    # functions and their costs are written.
    for file in call_dict.keys():
    per_file_dict = call_dict[file]
    output.write("ob=%s\n" % file)
    #print "file %s -> %s" % (file, per_file_dict)
    for caller in per_file_dict.keys():
    write_fn(output, file ,caller)
    write_fl(output, file ,caller)
    per_caller_dict = per_file_dict[caller]
    #print "caller %s -> %s" % (caller, per_caller_dict)
    for called in per_caller_dict.keys():
    output.write("cob=%s\n\ncfl=%s\n" % (called[0], called[0]))
    write_cfn(output, called[0], called[2])
    cost, count = per_caller_dict[called]
    # detect recursion
    if file == called[0] and caller == called[2]:
    output.write("calls=%d\n%d %d\n" % (count, called[1], cost))
    #output.write("calls=%d\n%d %d\n" % (count, called[1], cost))

    def run_without_optik():
    """parse the options without optik, use sys.argv"""
    if len(sys.argv) < 4 or sys.argv[1] != "-o" :
    print "usage: hotshot2cachegrind -o outputfile in1 [in2 [in3 [...]]]"
    outputfilename = sys.argv[2]
    output = file(outputfilename, "w")
    args = sys.argv[3:]
    convertProfFiles(output, args)
    except IOError:
    print "could not open '%s' for writing." % outputfilename

    def run_with_optik():
    """parse the options with optik"""
    parser = OptionParser()
    parser.add_option("-o", "--output",
    action="store", type="string", dest="outputfilename",
    help="write output into FILE")
    output = sys.stdout
    close_output = 0
    (options, args) = parser.parse_args()
    if options.outputfilename and options.outputfilename != "-":
    output = file(options.outputfilename, "w")
    close_output = 1
    except IOError:
    print "could not open '%s' for writing." % options.outputfilename
    if output:
    convertProfFiles(output, args)
    if close_output:

    # check if optik is available.
    from optik import OptionParser
    run = run_with_optik
    except ImportError:
    run = run_without_optik

    if __name__ == "__main__":
    except KeyboardInterrupt:

    Simplicity is prerequisite for reliability.
    Edsger W. Dijkstra
    Joerg Beyer, Sep 17, 2003
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Sebastien de Menten

    profile output for kcachegrind ?

    Sebastien de Menten, Jul 22, 2003, in forum: Python
    Sebastien de Menten
    Jul 23, 2003
  2. Joerg Beyer
    Joerg Beyer
    Sep 18, 2003
  3. A. Novruzi
    Robert Ferrell
    Jan 15, 2004
  4. Thomas Korimort

    how to visualize symbol table?

    Thomas Korimort, Jan 12, 2005, in forum: Python
    Pierre Barbier de Reuille
    Jan 12, 2005
  5. Michael Sgier


    Michael Sgier, Sep 29, 2005, in forum: C++
    Sep 29, 2005

Share This Page