Help With PyParsing of output from win32pdhutil.ShowAllProcesses()

Discussion in 'Python' started by Steve, Sep 11, 2007.

  1. Steve

    Steve Guest

    Hi All (especially Paul McGuire!)

    Could you lend a hand with the grammar and parsing of the output from the
    function win32pdhutil.ShowAllProcesses()?

    This is the code that I have so far (it is very clumsy at the
    moment) :


    import string
    import win32api
    import win32pdhutil
    import re
    import pyparsing


    process_info = win32pdhutil.ShowAllProcesses()

    print process_info
    print

    ## Output from ShowAllProcesses :

    ##Process Name ID Process,% Processor Time,% User Time,% Privileged Time,Virtual Bytes Peak,Virtual Bytes
    ##PyScripter 2572 0 0 0 96370688 96370688
    ##vmnetdhcp 1184 0 0 0 13942784 13942784
    ##vmount2 780 0 0 0 40497152 38400000
    ##ipoint 260 0 0 0 63074304 58531840


    sProcess_Info = str(process_info)
    print('type = ', type(sProcess_Info))

    ## Try some test data :
    test = ('Process Name ID Process,% Processor Time,% User Time,% Privileged Time,Virtual Bytes Peak,Virtual Bytes',
    'PyScripter 2572 0 0 0 96370688 96370688',
    'vmnetdhcp 1184 0 0 0 13942784 13942784',
    'vmount2 780 0 0 0 40497152 38400000',
    'ipoint 260 0 0 0 63074304 58531840')

    heading = pyparsing.Literal('Process Name ID Process,% Processor Time,% User Time,% Privileged Time,Virtual Bytes Peak,Virtual Bytes').suppress()
    integer = pyparsing.Word(pyparsing.nums)
    process_name = pyparsing.Word(pyparsing.alphas)

    #ProcessList = heading + process_name + pyparsing.OneOrMore(integer)
    ProcessList = process_name + pyparsing.OneOrMore(integer)

    # Now parse data and print results

    for current_line in test :
        print('Current line = %s') % (current_line)

        try:
            data = ProcessList.parseString(current_line)
            print "data:", data
        except:
            pass


    print('\n\nParse Actual data : \n\n')
    ## Parse the actual data from ShowAllProcesses :

    ProcessList = heading + process_name + pyparsing.OneOrMore(integer)
    data = ProcessList.parseString(sProcess_Info)
    print "data:", data
    print "data.asList():", data.asList()
    print "data keys:", data.keys()



    =====

    Output from run :


    Process Name ID Process,% Processor Time,% User Time,% Privileged Time,Virtual Bytes Peak,Virtual Bytes
    PyScripter 2572 0 0 0 101416960 97730560
    vmnetdhcp 1184 0 0 0 13942784 13942784
    vmount2 780 0 0 0 40497152 38400000
    ipoint 260 0 0 0 65175552 58535936
    DockingDirector 916 0 0 0 102903808 101695488
    vmnat 832 0 0 0 15757312 15757312
    svchost 1060 0 0 0 74764288 72294400
    svchost 1120 0 0 0 46632960 45846528
    svchost 1768 0 0 0 131002368 113393664
    svchost 1988 0 0 0 33619968 31047680
    svchost 236 0 0 0 39841792 39055360
    System 4 0 0 0 3624960 1921024
    .....

    None

    ('type = ', <type 'str'>)
    Current line = Process Name ID Process,% Processor Time,% User Time,% Privileged Time,Virtual Bytes Peak,Virtual Bytes
    Current line = PyScripter 2572 0 0 0 96370688 96370688
    data: ['PyScripter', '2572', '0', '0', '0', '96370688', '96370688']
    Current line = vmnetdhcp 1184 0 0 0 13942784 13942784
    data: ['vmnetdhcp', '1184', '0', '0', '0', '13942784', '13942784']
    Current line = vmount2 780 0 0 0 40497152 38400000
    data: ['vmount', '2', '780', '0', '0', '0', '40497152', '38400000']
    Current line = ipoint 260 0 0 0 63074304 58531840
    data: ['ipoint', '260', '0', '0', '0', '63074304', '58531840']


    Parse Actual data :


    Traceback (most recent call last):
      File "ProcessInfo.py", line 55, in <module>
        data = ProcessList.parseString(sProcess_Info)
      File "C:\Python25\lib\site-packages\pyparsing.py", line 821, in parseString
        loc, tokens = self._parse( instring.expandtabs(), 0 )
      File "C:\Python25\lib\site-packages\pyparsing.py", line 712, in _parseNoCache
        loc,tokens = self.parseImpl( instring, preloc, doActions )
      File "C:\Python25\lib\site-packages\pyparsing.py", line 1864, in parseImpl
        loc, resultlist = self.exprs[0]._parse( instring, loc, doActions, callPreParse=False )
      File "C:\Python25\lib\site-packages\pyparsing.py", line 716, in _parseNoCache
        loc,tokens = self.parseImpl( instring, preloc, doActions )
      File "C:\Python25\lib\site-packages\pyparsing.py", line 2106, in parseImpl
        return self.expr._parse( instring, loc, doActions, callPreParse=False )
      File "C:\Python25\lib\site-packages\pyparsing.py", line 716, in _parseNoCache
        loc,tokens = self.parseImpl( instring, preloc, doActions )
      File "C:\Python25\lib\site-packages\pyparsing.py", line 1118, in parseImpl
        raise exc
    pyparsing.ParseException: Expected "Process Name ID Process,% Processor Time,% User Time,% Privileged Time,Virtual Bytes Peak,Virtual Bytes" (at char 0), (line:1, col:1)



    Many thanks!

    Steve
    Steve, Sep 11, 2007
    #1

  2. David

    David Guest

    On 9/11/07, Steve <> wrote:
    > Hi All (especially Paul McGuire!)
    >
    > Could you lend a hand with the grammar and parsing of the output from the
    > function win32pdhutil.ShowAllProcesses()?
    >
    > This is the code that I have so far (it is very clumsy at the
    > moment) :


    Any particular reason you need to use pyparsing? It seems like
    overkill for such simple data.

    Here's an example:

    import pprint

    X="""Process Name ID Process,% Processor Time,% User Time,% Privileged Time,Virtual Bytes Peak,Virtual Bytes
    PyScripter 2572 0 0 0 96370688 96370688
    vmnetdhcp 1184 0 0 0 13942784 13942784
    vmount2 780 0 0 0 40497152 38400000
    ipoint 260 0 0 0 63074304 58531840"""

    data = []
    for line in X.split('\n')[1:]: # Skip the first row
        split = line.split()
        row = [split[0]] # Get the process name
        row += [int(x) for x in split[1:]] # Convert strings to int, fail if any aren't.
        data.append(row)

    pprint.pprint(data)

    # Output follows:
    #
    #[['PyScripter', 2572, 0, 0, 0, 96370688, 96370688],
    # ['vmnetdhcp', 1184, 0, 0, 0, 13942784, 13942784],
    # ['vmount2', 780, 0, 0, 0, 40497152, 38400000],
    # ['ipoint', 260, 0, 0, 0, 63074304, 58531840]]
    #
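
If the column names are wanted as well, the same split approach can pair each value with its heading. This is a minimal Python 3 sketch, assuming the header has the shape shown above (the words "Process Name" followed by comma-separated counter names). `parse_process_table` is a made-up helper name, not part of any library.

```python
SAMPLE = """Process Name ID Process,% Processor Time,% User Time,% Privileged Time,Virtual Bytes Peak,Virtual Bytes
PyScripter 2572 0 0 0 96370688 96370688
vmnetdhcp 1184 0 0 0 13942784 13942784
vmount2 780 0 0 0 40497152 38400000
ipoint 260 0 0 0 63074304 58531840"""


def parse_process_table(text):
    """Return one dict per data row, keyed by the column headings."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header, rows = lines[0].strip(), lines[1:]
    # Assumption: the header starts with "Process Name" and the remaining
    # counter names are comma-separated.
    prefix = "Process Name"
    if not header.startswith(prefix):
        raise ValueError("unexpected header: %r" % header)
    columns = [c.strip() for c in header[len(prefix):].split(",")]
    records = []
    for row in rows:
        fields = row.split()
        record = {"Process Name": fields[0]}
        # int() raises ValueError if a field is not an integer.
        record.update(zip(columns, (int(f) for f in fields[1:])))
        records.append(record)
    return records


records = parse_process_table(SAMPLE)
for record in records:
    print(record["Process Name"], record["ID Process"], record["Virtual Bytes"])
```

Like the loop above, this splits on whitespace, so a process name containing a space would need different handling.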
    David, Sep 11, 2007
    #2

  3. Steve

    Steve Guest

    Hi All,

    I did a lot of digging into the code in the module, win32pdhutil, and
    decided to create some custom methods.


    Added to win32pdhutil:



    def ShowAllProcessesAsList():

        object = find_pdh_counter_localized_name("Process")
        items, instances = win32pdh.EnumObjectItems(None,None,object,win32pdh.PERF_DETAIL_WIZARD)

        # Need to track multiple instances of the same name.
        instance_dict = {}
        all_process_dict = {}

        for instance in instances:
            try:
                instance_dict[instance] = instance_dict[instance] + 1
            except KeyError:
                instance_dict[instance] = 0

        # Bit of a hack to get useful info.

        items = [find_pdh_counter_localized_name("ID Process")] + items[:5]
        # print items
        # print "Process Name", string.join(items,",")

        all_process_dict['Headings'] = items # add headings to dict

        for instance, max_instances in instance_dict.items():

            for inum in xrange(max_instances+1):
                hq = win32pdh.OpenQuery()
                hcs = []
                row = []

                for item in items:
                    path = win32pdh.MakeCounterPath( (None,object,instance,None, inum, item) )
                    hcs.append(win32pdh.AddCounter(hq, path))

                win32pdh.CollectQueryData(hq)
                # as per http://support.microsoft.com/default.aspx?scid=kb;EN-US;q262938, some "%" based
                # counters need two collections
                time.sleep(0.01)
                win32pdh.CollectQueryData(hq)
                # print "%-15s\t" % (instance[:15]),

                row.append(instance[:15])

                for hc in hcs:
                    type, val = win32pdh.GetFormattedCounterValue(hc, win32pdh.PDH_FMT_LONG)
                    # print "item : %5d" % (val),
                    row.append(val)
                    win32pdh.RemoveCounter(hc)

                # print
                # print ' row = ', instance ,row
                all_process_dict[instance] = row # add current row to dict

                win32pdh.CloseQuery(hq)

        return all_process_dict


    def ShowSingleProcessAsList(sProcessName):

        object = find_pdh_counter_localized_name("Process")
        items, instances = win32pdh.EnumObjectItems(None,None,object,win32pdh.PERF_DETAIL_WIZARD)

        # Need to track multiple instances of the same name.
        instance_dict = {}
        all_process_dict = {}

        for instance in instances:
            try:
                instance_dict[instance] = instance_dict[instance] + 1
            except KeyError:
                instance_dict[instance] = 0

        # Bit of a hack to get useful info.

        items = [find_pdh_counter_localized_name("ID Process")] + items[:5]
        # print items
        # print "Process Name", string.join(items,",")

        # all_process_dict['Headings'] = items # add headings to dict

        # print 'instance dict = ', instance_dict
        # print

        if sProcessName in instance_dict:
            instance = sProcessName
            max_instances = instance_dict[sProcessName]
            # print sProcessName, ' max_instances = ', max_instances

            for inum in xrange(max_instances+1):
                hq = win32pdh.OpenQuery()
                hcs = []
                row = []

                for item in items:
                    path = win32pdh.MakeCounterPath( (None,object,instance,None, inum, item) )
                    hcs.append(win32pdh.AddCounter(hq, path))

                try:
                    win32pdh.CollectQueryData(hq)
                except:
                    all_process_dict[sProcessName] = [0,0,0,0,0,0,0] # process not found - set to all zeros
                    break

                # as per http://support.microsoft.com/default.aspx?scid=kb;EN-US;q262938, some "%" based
                # counters need two collections
                time.sleep(0.01)
                win32pdh.CollectQueryData(hq)
                # print "%-15s\t" % (instance[:15]),

                row.append(instance[:15])

                for hc in hcs:
                    type, val = win32pdh.GetFormattedCounterValue(hc, win32pdh.PDH_FMT_LONG)
                    # print "item : %5d" % (val),
                    row.append(val)
                    win32pdh.RemoveCounter(hc)

                # print
                # print ' row = ', instance ,row
                all_process_dict[instance] = row # add current row to dict

                win32pdh.CloseQuery(hq)
        else:
            all_process_dict[sProcessName] = [0,0,0,0,0,0,0] # process not found - set to all zeros

        return all_process_dict

    =============================

    Demo :

    import time
    import win32pdhutil # with customized methods in win32pdhutil (above)


    ###################################################################
    # GetMemoryStats #
    ###################################################################

    def GetMemoryStats(sProcessName, iPauseTime):

        Memory_Dict = {}

        ## Headings ['ProcessName', 'ID Process', '% Processor Time', '% User Time',
        ##           '% Privileged Time', 'Virtual Bytes Peak', 'Virtual Bytes']
        ## machine process = {'firefox': ['firefox', 2364, 0, 0, 0, 242847744, 211558400]}

        loop_counter = 0

        print('\n\n** Starting Free Memory Sampler **\n\n')
        print('Process : %s\n Delay : %d seconds\n\n') % (sProcessName, iPauseTime)
        print('\n\nPress : Ctrl-C to stop and output stats...\n\n')


        try:

            while 1:
                print('Sample : %d') % loop_counter
                row = []

                machine_process = win32pdhutil.ShowSingleProcessAsList(sProcessName)
                # print 'machine process = ', machine_process
                row.append(machine_process[sProcessName][5]) # Virtual Bytes Peak
                row.append(machine_process[sProcessName][6]) # Virtual Bytes

                Memory_Dict[loop_counter] = row # add values to the dictionary
                loop_counter += 1
                time.sleep(iPauseTime)

        except KeyboardInterrupt: # Ctrl-C encountered
            print "End of Sample...\n\n"


        return Memory_Dict


    ###################################################################
    ############# M A I N ###########################
    ###################################################################

    def Main():

        iPause_time = 5 # pause time - seconds
        sProcessName = 'firefox' # Process to watch
        sReportFileName = 'MemoryStats.csv' # output filename

        Memory_Dict = GetMemoryStats(sProcessName, iPause_time)


        outfile = open(sReportFileName,"w") # send output to a file
        outfile.write('SampleTime, VirtualBytesMax, VirtualBytes\n')


        for current_stat in sorted(Memory_Dict): # write samples in the order taken
            line = ('%s,%d,%d\n') % (current_stat, Memory_Dict[current_stat][0], Memory_Dict[current_stat][1] )
            outfile.write(line)


        outfile.close() # close output file


    if __name__ == "__main__":
        Main()


    -------------------------

    I have found that the process you want to monitor needs to be started
    before this script is started. If the process disappears while the
    script is running, the script handles it by setting the stats to zeros.
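
The sampling and CSV steps can also be written with a list and the csv module, which keeps the samples in the order they were taken. This is a Python 3 sketch with a stand-in sampler: `fake_sample`, `collect_samples` and `write_csv` are hypothetical names, and `fake_sample` returns fixed numbers where the real script would call ShowSingleProcessAsList.

```python
import csv
import io


def fake_sample(process_name):
    # Hypothetical stand-in for the row returned for one process:
    # [name, pid, % processor, % user, % privileged, peak bytes, bytes]
    return [process_name, 2364, 0, 0, 0, 242847744, 211558400]


def collect_samples(process_name, count):
    """Take `count` samples and keep (index, peak, current) in order."""
    samples = []
    for index in range(count):
        row = fake_sample(process_name)
        samples.append((index, row[5], row[6]))
    return samples


def write_csv(samples, stream):
    """Write a header line followed by one line per sample."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["SampleTime", "VirtualBytesMax", "VirtualBytes"])
    writer.writerows(samples)


buffer = io.StringIO()
write_csv(collect_samples("firefox", 3), buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Writing to an in-memory buffer here is only for demonstration; an open file object can be passed to `write_csv` in the same way.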

    Enjoy!

    Steve
    Steve, Sep 12, 2007
    #3
  4. Paul McGuire

    Paul McGuire Guest

    On Sep 11, 1:12 pm, Steve <> wrote:
    > Hi All (especially Paul McGuire!)
    >
    > Could you lend a hand with the grammar and parsing of the output from the
    > function win32pdhutil.ShowAllProcesses()?
    >
    > This is the code that I have so far (it is very clumsy at the
    > moment) :
    >

    <snip>
    >
    > Many thanks!
    >
    > Steve


    Steve -

    Well, your first issue is not a pyparsing one, but one of redirecting
    stdout. win32pdhutil.ShowAllProcesses does not *return* the output
    you listed, it just prints it to stdout. The value returned is None,
    which is why you are having trouble parsing it (even after converting
    None to a string).

    For you to parse out this data, you will need to redirect stdout to a
    string buffer, run ShowAllProcesses, and then put stdout back the way
    it was. Python's cStringIO module is perfect for this:


    from cStringIO import StringIO
    import sys
    import win32pdhutil

    save_stdout = sys.stdout
    process_info = StringIO()
    sys.stdout = process_info

    win32pdhutil.ShowAllProcesses()
    sys.stdout = save_stdout
    sProcess_Info = process_info.getvalue()


    *Now* you have all that data captured into a processable string.
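
On Python 3, where cStringIO no longer exists, the same capture can be done with io.StringIO and contextlib.redirect_stdout, which also puts stdout back if the call raises. This is a sketch only: `fake_show_all_processes` is a hypothetical stand-in, used because win32pdhutil is only available on Windows.

```python
import contextlib
import io


def fake_show_all_processes():
    # Hypothetical stand-in: prints a table and returns None, as described
    # above for win32pdhutil.ShowAllProcesses.
    print("Process Name ID Process,% Processor Time,% User Time,"
          "% Privileged Time,Virtual Bytes Peak,Virtual Bytes")
    print("PyScripter 2572 0 0 0 96370688 96370688")
    print("vmount2 780 0 0 0 40497152 38400000")


buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    result = fake_show_all_processes()  # printed text goes into the buffer

captured = buffer.getvalue()
print(repr(result))              # the call itself returns nothing useful
print(captured.splitlines()[1])  # first data row of the captured text
```

The return value and the captured text are separate things: only `captured` holds the table.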

    As others have mentioned, this data is pretty predictably formatted,
    so pyparsing may be more than you need. How about plain old split?


    for line in sProcess_Info.splitlines()[1:]:
        data = line.split()
        print data


    Done!

    Still have an urge to parse with pyparsing? Here are some comments on
    your grammar:

    - Your definition of process_name was not sufficient on my system. I
    had some processes running whose names include numeric digits and
    other non-alphas. I needed to modify process_name to:

    process_name = pyparsing.Word(pyparsing.alphanums+"_.-")

    - Similarly, some of my values returned by ShowAllProcesses had
    negative values, so your definition of integer needs to comprehend an
    optional leading '-' sign. (This actually sounds like a bug in
    win32pdhutil - I don't think any of these listed quantities should
    report a negative value.)

    - Whenever I have integers in a grammar, I usually convert them to
    ints at parse time, using a parse action:

    integer.setParseAction( lambda tokens : int(tokens[0]) )

    - The tabular format of this data, and the fact that the initial entry
    in each row appears to be a label of some sort invites the use of the
    pyparsing Dict class. I note that you are already trying to extract
    keys from the parsed data, so it looks like you are already thinking
    along these lines. (Unfortunately, it is very likely you will get
    duplicate keys, since process names do not have to be unique - this
    will involve some loss of data in this example.) The Dict class
    auto-generates results names in the parsed results. Dict turns out to be
    awkward to use directly, so I added the dictOf method to simplify
    things. The concept of dictOf(keyExpr,valueExpr) is "parse a list of
    dict entries, each of which is a key-value pair; while parsing, label
    each entry with the parsed key." In your example, this would be:

    ProcessList = heading + pyparsing.dictOf(process_name, pyparsing.OneOrMore(integer) )

    The key is a leading process_name, and the value is the following list
    of integers. With this, you can print out the results using:


    data = ProcessList.parseString(sProcess_Info)

    print "data keys:", data.keys()
    for k in sorted(data.keys()):
        print k, ":", data[k]


    Getting:

    BCMWLTRY : [684, 0, 0, 0, 54353920, 53010432]
    CLI : [248, 0, 0, 0, 171941888, 153014272]
    D4 : [2904, 0, 0, 0, 37527552, 36413440]
    F-StopW : [2064, 0, 0, 0, 33669120, 30121984]
    ....
    (again, note that the multiple entries for "CLI" have been reduced to
    a single dict entry)

    You could get similar results using something like:

    data = dict((vals[0],vals[1:]) for vals in
                map(str.split,sProcess_Info.splitlines()))

    But then you would never have learned about dictOf!
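
If losing the duplicate process names matters, one option is to keep a list of rows per name instead of a single entry. The Python 3 sketch below does not use pyparsing: a regular expression stands in for the grammar tweaks described above, accepting digits and "_.-" in names and an optional leading minus sign. The `group_by_name` helper is made up for illustration, and the last sample row is invented to exercise the minus sign.

```python
import re
from collections import defaultdict

# A name made of letters, digits, "_", "." or "-", followed by one or more
# whitespace-separated integers, each with an optional leading "-".
ROW = re.compile(r"^\s*([A-Za-z0-9_.\-]+)((?:\s+-?\d+)+)\s*$")

SAMPLE = """Process Name ID Process,% Processor Time,% User Time,% Privileged Time,Virtual Bytes Peak,Virtual Bytes
svchost 1060 0 0 0 74764288 72294400
svchost 1120 0 0 0 46632960 45846528
vmount2 780 0 0 0 40497152 38400000
made_up.proc 9999 0 0 0 -1 -1
"""


def group_by_name(text):
    """Map each process name to a list of integer rows, keeping duplicates."""
    grouped = defaultdict(list)
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header, blank, or otherwise unrecognised line
        name, numbers = match.groups()
        grouped[name].append([int(n) for n in numbers.split()])
    return dict(grouped)


grouped = group_by_name(SAMPLE)
for name in sorted(grouped):
    print(name, ":", grouped[name])
```

Both svchost rows survive here because each name maps to a list of rows rather than to one row.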

    Enjoy!
    -- Paul
    Paul McGuire, Sep 12, 2007
    #4
