Using String for new List name

Scott

I am new to Python but I have studied hard and written a fairly big
(to me) script/program. I have solved all of my problems by Googling
but this one has got me stumped.

I want to check a string for a substring and if it exists I want to
create a new, empty list using that substring as the name of the list.
For example:

Let's say file1 has line1 through line100 as the first word in each
line.

for X in open("file1"):
    Do a test.
    If true:
        Y = re.split(" ", X)
        Z = Y[0]    # This is a string, maybe it is "Line42"
        Z = []      # This doesn't work, I want a new, empty list created called Line42, not Z.

Is there any way to do this?
 
Olof Bjarnason

2009/9/28 Scott said:
I want to check a string for a substring and if it exists I want to
create a new, empty list using that substring as the name of the list.

What do you mean by "as the name of the list"?

You cannot alter the name "Z" in the source code to be the content of
the file, unless you do some serious magic ;)
 
Ethan Furman

Assuming you made this work, and had a new variable called "Line42", how
would you know it was called "Line42" in the rest of your program?

What you could do is create a dict and have the key set to the new name,
e.g.:

new_names = {}
for X in open("file1"):
    # Do a test.
    if True:    # i.e. if the test passed
        Y = X.split(" ")
        new_names[Y[0]] = []

then in the rest of your program you can refer to the keys in new_names:

for var in new_names:
    item = new_names[var]
    do_something_with(item)
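
A minimal runnable sketch of the same idea, assuming Python 3. The `io.StringIO` object stands in for file1, its contents are invented, and the `"keep" in line` check is only a placeholder for whatever test the real script performs.

```python
import io

# Stand-in for open("file1"); the contents are made up for illustration.
sample_file = io.StringIO(
    "line1 keep this\n"
    "line2 skip this\n"
    "line42 keep this too\n"
)

new_names = {}
for line in sample_file:
    # Placeholder test: the real script would apply its own condition here.
    if "keep" in line:
        first_word = line.split(" ")[0]
        new_names[first_word] = []   # the string becomes a dict key, not a variable name

# Later on, each list is reached through its key.
new_names["line42"].append("new stuff")

for name in new_names:
    print(name, new_names[name])
```

The program never needs to know the names in advance, because it only ever handles them as strings.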

Hope this helps!

~Ethan~
 
Scott

Thank you fine folks for getting back with your answers!

So down the road I do dictname[line42].append("new stuff"). (or [var]
if I'm looping through the dict)

This is cool and should do the trick!

-Scott Freemire
disclosure - Ok, I'm new to *any* language. I've been teaching myself
for about 3 months with "Learning Python, 3rd Edition" and I think
it's going well! Of course I picked something way too complicated for
a first try. Thanks again!
 
Olof Bjarnason

Good luck!
 
Ethan Furman

Scott said:
So down the road I do dictname[line42].append("new stuff"). (or [var]
if I'm looping through the dict)

That should actually be dictname["line42"].append("new stuff"). Notice
the quotes around line42.

Good luck! Python is a fine language, I hope you like it.

~Ethan~
 
Terry Reedy

Scott said:
Thank you fine folks for getting back with your answers!

So down the road I do dictname[line42].append("new stuff").

The keys are strings, so

dictname['line42'].append("new stuff")

or

for key in dictname.keys():
    ...
    dictname[key]....
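
A short sketch of the difference, assuming Python 3 and a dictionary that already holds the key "line42" (the sample values are invented). A quoted key is a string literal, a variable holding that string works the same way, and a bare name is looked up as a variable, which fails when no such variable exists.

```python
dictname = {"line42": []}

# A quoted key is a string literal and finds the entry.
dictname["line42"].append("new stuff")

# A variable that holds the same string works just as well.
var = "line42"
dictname[var].append("more stuff")

# A bare name is treated as a variable. None is defined here, so Python
# raises NameError before the dictionary is even consulted.
try:
    dictname[line42].append("oops")
    outcome = "no error"
except NameError as exc:
    outcome = "NameError"
    print("NameError:", exc)

print(dictname)
```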

tjr
 
Scott

Ethan said:
That should actually be dictname["line42"].append("new stuff"). Notice
the quotes around line42.

Doh. I sent it before my type, fail, fix cycle had taken place.
Got it.
Thanks again all!
 
Dave Angel

Scott said:
Thank you fine folks for getting back with your answers!

So down the road I do dictname[line42].append("new stuff"). (or [var]
if I'm looping through the dict)

Nope, you still haven't gotten it. Of course, I really don't know where
you're going wrong, since you didn't use the same symbols as any of the
responses you had gotten.

I suspect that you meant dictname[] to be the dictionary that Duncan
called values[]. On that assumption, in order to append, you'd want
something like:

values["line42"].append("new stuff")
or
values[var].append("new stuff") if you happen to have a variable called
var with a value of "line42".

You will need to get a firm grasp on the distinctions between symbol
names, literals, and values. And although Python lets you blur these in
some pretty bizarre ways, you haven't a chance of understanding those
unless you learn how to play by the rules first. I'd suggest your first
goal should be to come up with better naming conventions. And when
asking questions here, try for more meaningful data than "Line42" to
make your point.


Suppose a text file called "customers.txt" has on each line a name and
some data. We want to initialize an (empty) list for each of those
customers, and refer to it by the customer's name. At first glance we
might seem to want to initialize a variable for each customer, but our
program doesn't know any of the names ahead of time, so it's much better
to have some form of collection. We choose a dictionary.

transactions = {}
with open("customers.txt") as infile:
    for line in infile:
        fields = line.split()
        customername = fields[0]           # customer is first thing on the line
        transactions[customername] = []    # this is where we'll put the transactions at some later point, for this customer

Now, if our program happens to have a special case for a single
customer, we might have in our program something like:

transactions["mayor"].append("boots")

But more likely, we'll be in a loop, working through another file:

......
for line in otherfile:
    fields = line.split()
    customername = fields[0]
    transaction = fields[1]

    transactions[customername].append(transaction)    # append one transaction

or interacting:
name = raw_input("Customer name")
trans = raw_input("transaction for that customer")
transactions[name].append(trans)
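
For readers on Python 3, here is a self-contained sketch of the two file loops. It makes several assumptions: the `io.StringIO` objects stand in for customers.txt and the second file, their contents are invented, and `collections.defaultdict` is swapped in for the explicit `= []` initialisation so that a customer who appears only in the second file does not raise KeyError.

```python
import io
from collections import defaultdict

# Stand-ins for the two files; the contents are made up for illustration.
customers_file = io.StringIO("mayor 12 Main St\nbaker 3 Mill Rd\n")
other_file = io.StringIO("mayor boots\nbaker flour\nmayor hat\nsmith nails\n")

# A missing key gets a fresh empty list the first time it is looked up.
transactions = defaultdict(list)

for line in customers_file:
    fields = line.split()
    if not fields:              # skip blank lines
        continue
    transactions[fields[0]]     # looking the key up creates its empty list

for line in other_file:
    fields = line.split()
    if len(fields) < 2:         # skip lines without a transaction
        continue
    customername, transaction = fields[0], fields[1]
    transactions[customername].append(transaction)

for name in sorted(transactions):
    print(name, transactions[name])
```

A plain dict with `transactions.setdefault(customername, []).append(transaction)` would behave the same way here. Note that `raw_input` in the interactive lines above is Python 2, and its Python 3 equivalent is `input`.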
 
Scott

Dave,

I'm amazed at everyone's willingness to share and teach! I will sure
do the same once I have the experience.

I think that one of the problems here is that I tried to make my
initial question as bone simple as possible. When I first tried to
explain what I was doing I was getting up to 2 pages and I thought "I
bet these folks don't need to read my program. They probably just need
to know the one bit I'm looking for." So I deleted it all and reduced
it to the 10 line example that I posted.

It was then suggested that I eschew using regular expressions when not
required because I used Y = re.split(" ", X) in my example. In my
program it is actually aclLs = re.split("\s|:|/", aclS) which I think
requires a regex. I just didn't want anyone to waste their time
parsing the regex when it was not really germane to my actual
question.
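
A small sketch of what that split returns, assuming Python 3. The sample string is borrowed from the log example further down rather than from a real ACL line, and the pattern is written as a raw string so the backslash reaches the regex engine untouched.

```python
import re

# Sample text taken from the syslog example later in this post.
sample = "cramer:192.168.0.171/2531 to lab:10.1.0.195/1433"

# Split on any whitespace character, a colon, or a slash.
parts = re.split(r"\s|:|/", sample)
print(parts)

# The same split written as a character class.
same = re.split(r"[\s:/]", sample)
print(same == parts)
```

Plain `str.split` accepts only one separator at a time, which is why a regex is a reasonable fit when three different separators are involved.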

The same applies to the suggestion for using meaningful variables. In
the above aclLs represents (to me) "access control list List-Split"
and aclS represents "access control list String." Again, I thought X
and Y, like foo and bar or spam and eggs would do for a simple
example.

Of course I then went and forgot the quotes around "line42" and really
looked confused. I was so excited to have an answer that I typed the
reply without thinking it through. Not good.

Don't worry though, I take no offense. I understand and welcome the
advice. I don't have anyone to work with and this post is my first
interaction with any person who knows programming and Python. I am but
a network engineer (Cisco, Lan/Wan, firewalls, security, monitoring
(this is the connection), etc.) who has never programmed. I will work
on clearer communications in future posts.

I'm happy for a chance to share what I am actually trying to
accomplish here.

I have a firewall with a static list of access-control-list (ACL)
rules (about 500 rules). I also have a directory with one week of
syslog output from the firewall. About 100 text files that are each
about 10 to 30 MB in size.

My quest, if you will, is to create a list of syslog entries, each
representing a successful network connection, with each syslog entry
listed under the access-list rule that allowed it.

Since ACL rules can be written with a range of granularity, i.e. loose
or tight, with or without Port Number, etc., their order is important.
A firewall scans the rules in order, using the first successful match.
I have 18 varieties of ACL rule to deal with. Furthermore Cisco
sometimes outputs a Protocol Port Name string instead of a numeric
Port Number so I had to write a function to translate every Port to
its numeric equivalent.
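
One way such a translation could look, as a sketch only. The function name `to_port_number` and the table `PORT_NAMES` are hypothetical, and the table is a tiny illustrative subset of well-known port names, not the full list the firewall can emit.

```python
# Illustrative subset only; a real table would need every name the device uses.
PORT_NAMES = {
    "ftp": 21,
    "ssh": 22,
    "telnet": 23,
    "smtp": 25,
    "domain": 53,
    "www": 80,
    "https": 443,
}

def to_port_number(port):
    """Return the numeric port for either a digit string or a known name."""
    if port.isdigit():
        return int(port)
    # An unknown name raises KeyError, which flags a gap in the table.
    return PORT_NAMES[port.lower()]

print(to_port_number("1433"))
print(to_port_number("www"))
```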

I have written a function that takes each ACL rule and generates a
regex that matches that rule based on several factors including the
value of the Subnet Mask. E.g.:
Rule:
    access-list cramer line 54 extended permit tcp 192.168.0.0 255.255.252.0 host 10.1.0.195 eq 1433 (hitcnt=0) 0xc3fbbb5c
Regex:
    tcp.*cramer:192\.168\..*10\.1\.0\.195/1433
And an example log to match against:
    2009-09-21 12:10:04 local4.info 10.1.0.1 sep 21 2009 12:10:03 fwsm : %fwsm-6-302013: built inbound tcp connection 146034729239394289 for cramer:192.168.0.171/2531 (192.168.0.171/2531) to lab:10.1.0.195/1433 (10.1.0.195/1433)
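
A quick check of that regex against that log entry, assuming Python 3 and assuming the log entry sits on a single line (`.` does not match a newline by default). The second log line is invented by changing the destination host, to show a non-match.

```python
import re

# Regex and log entry copied from the example above.
rule_regex = re.compile(r"tcp.*cramer:192\.168\..*10\.1\.0\.195/1433")

log_line = (
    "2009-09-21 12:10:04 local4.info 10.1.0.1 sep 21 2009 12:10:03 fwsm : "
    "%fwsm-6-302013: built inbound tcp connection 146034729239394289 for "
    "cramer:192.168.0.171/2531 (192.168.0.171/2531) to lab:10.1.0.195/1433 "
    "(10.1.0.195/1433)"
)

# Invented variant with a different destination host.
other_host_line = log_line.replace("10.1.0.195/1433", "10.1.0.196/1433")

# search() is needed because the pattern does not start at the beginning of the line.
matched = rule_regex.search(log_line) is not None
other_matched = rule_regex.search(other_host_line) is not None

print(matched, other_matched)
```

Note that `192\.168\..*` accepts any 192.168.x.x source, which is broader than the 255.255.252.0 mask in the rule (192.168.0.0 through 192.168.3.255).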

Writing that was a lot of fun. Ouch, my brain.

Next I iterate across all of the log files saving successful network
connections while filtering out remarks, denials, duplicates, etc.

Now I take the selected logs and test them with the Rule regexes, in
order, until I get a match. (Actually I'm going to attempt to test
each log as I accept it instead of building up a multi-Gigabyte List
Object and then looping through it)

Now the part that has tied me up for a while - I'll try to be clear
about it:

Say I take the first chosen syslog string and begin testing it against
my rule regexes. I find that the 10th regex is a match. I want to save
rule # 10 and below it the current (matching) syslog string. (by save
I mean put it in some object that I can print it out later, in order)

The second syslog string matches rule # 20. I want to save rule # 20
and below it the second syslog string.

Finally, the third syslog string matches rule # 10 again and I simply
want to add this syslog beneath the one that was saved in the first
step.

This way I could print a report that showed:
******
ACL rule1 details, details (This is the string referenced by aclS)
    syslog string showing a connection
    another syslog string showing a connection

ACL rule2 details, details
    (never used)

ACL rule3 details, details
    syslog string...
    another syslog string...
    yet another syslog string...

....
******

This is why I thought that I needed to take something, a substring or
even the entire string, from each rule (aclS) and use it for the
"name" (reference, pointer?) of a new List to save all of the matching
logs in.
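
The steps described above could be sketched roughly like this, assuming Python 3. The rule texts, regexes and log strings are all invented and shortened. The rules are kept in an ordered list so the first match wins, and a dictionary keyed by the rule text collects the matching logs.

```python
import re

# Ordered (rule text, compiled regex) pairs; all values are invented.
rules = [
    ("ACL rule1 permit tcp to 10.1.0.195 eq 1433", re.compile(r"tcp.*10\.1\.0\.195/1433")),
    ("ACL rule2 permit udp to 10.1.0.9 eq 53", re.compile(r"udp.*10\.1\.0\.9/53")),
    ("ACL rule3 permit tcp any", re.compile(r"tcp")),
]

logs = [
    "built inbound tcp connection for cramer:192.168.0.171/2531 to lab:10.1.0.195/1433",
    "built inbound tcp connection for cramer:192.168.0.12/4000 to lab:10.1.0.7/80",
    "built inbound tcp connection for cramer:192.168.1.5/2600 to lab:10.1.0.195/1433",
]

# One empty list per rule, so unused rules still show up in the report.
matches = {rule_text: [] for rule_text, _ in rules}

for log in logs:
    for rule_text, regex in rules:
        if regex.search(log):
            matches[rule_text].append(log)
            break   # first matching rule wins, like the firewall

# Build the report by walking the rules in their original order.
report_lines = []
for rule_text, _ in rules:
    report_lines.append(rule_text)
    hits = matches[rule_text]
    if hits:
        report_lines.extend("    " + log for log in hits)
    else:
        report_lines.append("    (never used)")
    report_lines.append("")

print("\n".join(report_lines))
```

Because each log is tested as it arrives, the full set of logs never has to be held in memory, only the ones that matched.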

I could not do this so I registered and posted here on
comp.lang.python - and within 17 minutes I had my answer!
Duncan showed the dictionary method. Boy, I had to stare at that for a
while to get it. Ethan gave an example of how to reference it once it
was created, and then you came along and added an example of doing it
interactively! Oh and your scenarios were spot on. What a community!
I'm really impressed.

I think the dictionary approach will get me there but if there is some
convention that would work better I'm all ears.

Thanks again,
Scott
 
Hendrik van Rooyen

Scott said:
Is there any way to do this?

Yes

Look at exec and eval

But also look at using the string as a key in a dict.
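
A small comparison of the two routes, assuming Python 3. The explicit `namespace` dictionary passed to `exec` is a choice made for this sketch. It shows that the exec version still ends up fetching the list from a dictionary by its string name, while also running arbitrary text as code.

```python
name = "Line42"   # imagine this string came from the file

# Route 1: exec builds and runs an assignment statement from the string.
namespace = {}
exec(name + " = []", namespace)
namespace[name].append("via exec")      # still a lookup by string

# Route 2: use the string directly as a dictionary key.
lists = {}
lists[name] = []
lists[name].append("via dict")

print(namespace[name], lists[name])
```

If the string read from the file contained something other than a valid name, the exec route would either fail or execute it, which is the main reason the dictionary is usually preferred.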

- Hendrik
 
Peter Otten

Hendrik said:
Look at exec and eval

Look. But don't touch ;)
 
nn

Scott said:
I think the dictionary approach will get me there but if there is some
convention that would work better I'm all ears.

No, that is pretty much it. Your dictionary could look something like
this:

{'ACL rule 1': ['detail detail', 'syslog string1', 'another syslog string'],
 'ACL rule 2': ['detail detail'],
 'ACL rule 3': ['detail detail', 'syslog string1', 'yet another syslog string']}

The only wrinkle is that a dictionary doesn't care about any
particular ordering of its entries, so if you want them in a
particular order you might have to sort them first when you output
them. You know, like:

for rule in sorted(rules_info): print rule,rules_info[rule]

ACL rule 1 ['detail detail', 'syslog string1', 'another syslog string']
ACL rule 2 ['detail detail']
ACL rule 3 ['detail detail', 'syslog string1', 'yet another syslog string']

Producing output in an order different than the dictionary key is left
as an exercise for the reader :)
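
One possible take on that exercise, assuming Python 3 and assuming the keys end in a rule number (the sample data is invented). A plain `sorted` compares the keys as text, so "ACL rule 10" lands before "ACL rule 2"; a key function that pulls out the number restores rule order. The helper name `rule_number` is hypothetical.

```python
rules_info = {
    "ACL rule 10": ["syslog string1"],
    "ACL rule 2": [],
    "ACL rule 1": ["syslog string2", "syslog string3"],
}

# Text comparison: "ACL rule 10" sorts before "ACL rule 2".
plain_order = sorted(rules_info)

def rule_number(rule):
    """Return the trailing number of a key such as 'ACL rule 10'."""
    return int(rule.split()[-1])

# Numeric comparison on the trailing rule number.
numeric_order = sorted(rules_info, key=rule_number)

for rule in numeric_order:
    print(rule, rules_info[rule])
```

On Python 3.7 and later a plain dict also remembers insertion order, so filling it in rule order and iterating it directly is another option.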
 
