Need to parse a delimited file into data structures

LuckyBoy

I have a delimited file in the format
USER:ROLE:RESOURCE
It can have any number of records.
I want to parse this file in Java.
What I want to know is how to do this:
for each distinct USER, add its ROLE and RESOURCE values
to some Java data structure (which would be best?).
I mean, if the file contains
SAM:USER:PC
SAM:ADMIN:LAN
MIKE:USER:LAPTOP
then my data structure must have only one entry each for SAM and MIKE,
but hold all the records for SAM and MIKE.
Something like:
SAM:USER:PC
:ADMIN:LAN
MIKE:USER:LAPTOP
Any suggestions or ideas? Which data structure? HashMap?
How do I do this?
 
Lionel

LuckyBoy said:
I have a delimited file in the format
USER:ROLE:RESOURCE
It can have any number of records.
I want to parse this file in Java.
What I want to know is how to do this:
for each distinct USER, add its ROLE and RESOURCE values
to some Java data structure (which would be best?).
I mean, if the file contains
SAM:USER:PC
SAM:ADMIN:LAN
MIKE:USER:LAPTOP
then my data structure must have only one entry each for SAM and MIKE,
but hold all the records for SAM and MIKE.
Something like:
SAM:USER:PC
:ADMIN:LAN
MIKE:USER:LAPTOP
Any suggestions or ideas? Which data structure? HashMap?
How do I do this?

Just a guess: StringTokenizer to parse, and HashMap sounds good, or
Hashtable, depending on your needs and on whether or not your user names
are unique, which by your specification they have to be.
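
A minimal sketch of that idea, assuming Java 8 or later. The class name TokenizerSketch and the choice to keep each ROLE/RESOURCE pair as a two-element String[] are made up for illustration, and skipping malformed lines is an assumption, not something the file format specifies:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

class TokenizerSketch {
    // Groups "USER:ROLE:RESOURCE" lines by USER. Each map value is the list of
    // that user's records, stored as String[] {role, resource}.
    static Map<String, List<String[]>> group(List<String> lines) {
        Map<String, List<String[]>> byUser = new HashMap<>();
        for (String line : lines) {
            StringTokenizer tok = new StringTokenizer(line, ":");
            if (tok.countTokens() != 3) {
                continue; // assumption: silently skip malformed lines
            }
            String user = tok.nextToken().trim();
            String role = tok.nextToken().trim();
            String resource = tok.nextToken().trim();
            // Create the user's list on first sight, then append the record.
            byUser.computeIfAbsent(user, k -> new ArrayList<>())
                  .add(new String[] {role, resource});
        }
        return byUser;
    }
}
```

Each user name appears once as a key, and index 0 or 1 of every stored array tells ROLE and RESOURCE apart.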

Lionel.
 
LuckyBoy

How would we use the HashMap?
For each unique token found in the first field,
put (field1, (field2, field3)) into the HashMap(K, V)?
How do I distinguish between field2 and field3 if I store them that way?
Later on I will need, for a given field1, to get all the field2 and
field3 values into separate variables.

How would I do this with a Hashtable too, as you suggested?
 
Lionel

LuckyBoy said:
How would we use the HashMap?
For each unique token found in the first field,
put (field1, (field2, field3)) into the HashMap(K, V)?
How do I distinguish between field2 and field3 if I store them that way?
Later on I will need, for a given field1, to get all the field2 and
field3 values into separate variables.

How would I do this with a Hashtable too, as you suggested?

Well, at a very superficial look I would create a class that holds the
person's name (field1) and a list (perhaps ArrayList would be suitable)
containing instances of a second class containing String variables for
ROLE and RESOURCE. It really depends on your other requirements.

You can then use NAME as the key to the Hashtable and the class as the
object to store. I haven't used HashMap as much so I'm sure you can
figure out how to use that in a similar way.
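
Roughly, that two-class layout could look like the sketch below. The names Grant, UserEntry and UserDirectory are hypothetical, and it assumes Java 8 or later; it uses a HashMap behind the Map interface, but a Hashtable would drop in the same way:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical second class: one ROLE/RESOURCE pair.
class Grant {
    final String role;
    final String resource;

    Grant(String role, String resource) {
        this.role = role;
        this.resource = resource;
    }
}

// Hypothetical first class: a person's name plus all of that person's pairs.
class UserEntry {
    final String name;
    final List<Grant> grants = new ArrayList<>();

    UserEntry(String name) {
        this.name = name;
    }
}

// Keeps exactly one UserEntry per NAME, keyed by NAME.
class UserDirectory {
    private final Map<String, UserEntry> byName = new HashMap<>();

    void add(String name, String role, String resource) {
        // Create the entry the first time a name is seen, then append the pair.
        byName.computeIfAbsent(name, UserEntry::new)
              .grants.add(new Grant(role, resource));
    }

    UserEntry get(String name) {
        return byName.get(name);
    }

    int size() {
        return byName.size();
    }
}
```

ROLE and RESOURCE stay in separate, named fields, so there is no need to remember which position holds which value.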

You need to think about how you will need to access the data, it may be
that HashMap will be faster for lookups if you don't know NAME. If you
do know NAME then HashTable will be faster but I think it uses more memory.

Lionel.
 
LuckyBoy

I used ArrayList: three ArrayLists.
I have successfully collected the UNIQUE NAMES with the code below.

My only hitch is the storage class for the multiple ROLE and
RESOURCE values belonging to one UNIQUE record NAME:

List alRec = new ArrayList();
FileReader fr = new FileReader(csvFile);
BufferedReader br = new BufferedReader(fr);

// Read every line of the file into alRec.
while ((strRec = br.readLine()) != null) {
    feedRecCtr++;
    alRec.add(strRec); // gets line records
}

FeedFileLength = feedRecCtr;
Iterator itr = alRec.iterator();

while (itr.hasNext()) {
    strRec = (String) itr.next();

    StringTokenizer strtok = new StringTokenizer(strRec, ":");

    // tokenCtr tracks which of the three fields comes next
    // (assumed to be initialised to 1 elsewhere).
    while (strtok.hasMoreElements()) {

        if (tokenCtr == 1) {
            String kerbToken = strtok.nextToken();
            // For UNIQUE Names
            if ((null != alNameID) && !(alNameID.contains(kerbToken))) {
                alNameID.add(kerbToken); // ADDs UNIQUE NAMEs
            }
        }

        if (tokenCtr == 2) {
            alRole.add(strtok.nextToken()); // ADDS ROLES
        }

        if (tokenCtr == 3) {
            alResourceType.add(strtok.nextToken()); // ADDS RESOURCES
            tokenCtr = 0;
        }

        tokenCtr++;
    }
}
 
Lew

Lionel said:
You can then use NAME as the key to the Hashtable and the class as the
object to store. I haven't used HashMap as much so I'm sure you can
figure out how to use that in a similar way.

You need to think about how you will need to access the data, it may be
that HashMap will be faster for lookups if you don't know NAME. If you
do know NAME then HashTable will be faster but I think it uses more memory.

HashMap and Hashtable (not HashTable) work the exact same way, except that
Hashtable methods are synchronized. That means that HashMap pretty much will
always be faster, but not safer in multi-threaded use unless you synchronize
it yourself. They are *both* implementations of Map (the interface
supertype). There is no evidence that their memory requirements differ.

Usually you should declare a variable as a Map and implement it as a HashMap
or Hashtable or TreeMap or whatever, depending on the desired performance
characteristics.

In all those cases you will access the data in the exact same way.
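
To illustrate, here is a small sketch (assuming Java 8 or later for Map.merge; the class and method names are made up) where the calling code only ever sees a Map, and the implementation is chosen at the point of construction:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
import java.util.TreeMap;

class MapChoiceSketch {
    // Counts records per user. It depends only on the Map interface, so any
    // implementation can be passed in without changing this method.
    static Map<String, Integer> countPerUser(Map<String, Integer> target, String... users) {
        for (String user : users) {
            target.merge(user, 1, Integer::sum);
        }
        return target;
    }

    // The same call with four different implementations behind the interface.
    static boolean allAgree(String... users) {
        Map<String, Integer> hashed = countPerUser(new HashMap<String, Integer>(), users);
        Map<String, Integer> sorted = countPerUser(new TreeMap<String, Integer>(), users);
        Map<String, Integer> legacy = countPerUser(new Hashtable<String, Integer>(), users);
        Map<String, Integer> synced = countPerUser(
                Collections.synchronizedMap(new HashMap<String, Integer>()), users);
        return hashed.equals(sorted) && sorted.equals(legacy) && legacy.equals(synced);
    }
}
```

The contents come out the same in every case; what differs is iteration order (TreeMap walks its keys in sorted order) and whether the methods are synchronized.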

Have you considered actually reading the javadocs on these classes?
http://java.sun.com/j2se/1.5.0/docs/api/
http://java.sun.com/j2se/1.5.0/docs/api/java/util/Map.html

- Lew Bloch
 
Lionel

Lew said:
HashMap and Hashtable (not HashTable) work the exact same way, except
that Hashtable methods are synchronized. That means that HashMap pretty
much will always be faster, but not safer in multi-threaded use unless
you synchronize it yourself. They are *both* implementations of Map
(the interface supertype). There is no evidence that their memory
requirements differ.

Usually you should declare a variable as a Map and implement it as a
HashMap or Hashtable or TreeMap or whatever, depending on the desired
performance characteristics.

In all those cases you will access the data in the exact same way.

Have you considered actually reading the javadocs on these classes?

I was just giving the OP some direction. I've always used Hashtable, and
it wasn't necessary in this case for me to look up HashMap; I just made
it clear that I wasn't completely familiar with it. I admit that I
had an incorrect understanding of how HashMap worked, and you are
right that they should use the same amount of memory. That's why I left
it to the OP to figure out which they wanted to use.

Lionel.
 
Lionel

LuckyBoy said:
I used ArrayList: three ArrayLists.
I have successfully collected the UNIQUE NAMES with the code below.

My only hitch is the storage class for the multiple ROLE and
RESOURCE values belonging to one UNIQUE record NAME:

List alRec = new ArrayList();
FileReader fr = new FileReader(csvFile);
BufferedReader br = new BufferedReader(fr);

while((strRec=br.readLine()) != null){
feedRecCtr++;
alRec.add(strRec); // gets line records
}

FeedFileLength = feedRecCtr;
Iterator itr = alRec.iterator();

while(itr.hasNext()){
strRec = (String)itr.next();

StringTokenizer strtok = new StringTokenizer(strRec,":");

while(strtok.hasMoreElements()){

if(tokenCtr == 1){
String kerbToken = strtok.nextToken();
if((null != alNameID) && !(alNameID.contains(kerbToken))) // For UNIQUE Names
{
alNameID.add(kerbToken); // ADDs UNIQUE NAMEs
}
}

if(tokenCtr == 2){
alRole.add(strtok.nextToken()); // ADDS ROLES
}

if(tokenCtr == 3){
alResourceType.add(strtok.nextToken()); // ADDS RESOURCES
tokenCtr = 0;
}

tokenCtr++;
}

I can't read the above code the way it is formatted. I can say that you
are doing things the hard way. My suggestion wasn't to use only
ArrayLists; in fact, the way you have done it is quite a bad
implementation, because it seems to rely on the indexes in each list
being aligned, so to speak.

I get the feeling that this is an assignment question? If so you should
really figure out what you have to do yourself or you won't learn anything.

You can also perform the above with one while loop and less code, I'll
leave you to think about it for a while with the hint that you really do
need to create a new class.

Lionel.
 
Lionel

Lew said:
HashMap and Hashtable (not HashTable) work the exact same way, except
that Hashtable methods are synchronized. That means that HashMap pretty
much will always be faster, but not safer in multi-threaded use unless
you synchronize it yourself. They are *both* implementations of Map
(the interface supertype). There is no evidence that their memory
requirements differ.

Usually you should declare a variable as a Map and implement it as a
HashMap or Hashtable or TreeMap or whatever, depending on the desired
performance characteristics.

In fact I was thinking TreeMap rather than HashMap which is why I also
had in my head log(n) for performance and the fact that TreeMap only
uses as much memory as necessary.
 
jiji

The data can be stored as follows.

1. There will be a root HashMap which has NAME as key and another
HashMap as value.
2. The second HashMap will have ROLE as key and an ArrayList as
value.
3. The ArrayList will hold all the resources for that user in
that particular role.

So the root HashMap will look like

HashMap {
  [NAME1, HashMap { [ROLE1, ArrayList {RES11, RES12, ...}],
                    [ROLE2, ArrayList {RES21, RES22, ...}], ... }],
  [NAME2, HashMap { [ROLE1, ArrayList {RES11, RES12, ...}],
                    [ROLE2, ArrayList {RES21, RES22, ...}], ... }],
  ...
}

eg:

SAM:ADMIN:LAP
SAM:ADMIN:PEN
SAM:USER:PC
MIKE:USER:PC

then the storage will be like

HashMap { [SAM, HashMap { [ADMIN, ArrayList {LAP, PEN}], [USER,
ArrayList {PC}] } ], [MIKE, HashMap { [USER, ArrayList {PC} ] } ] }


The following is a simple function which implements the above logic.

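// Adds one USER:ROLE:RESOURCE record to the nested map.
public void store(Map userMap, String user, String role, String resource) {
    // Look up (or create) the role map for this user.
    Map roleMap = (Map) userMap.get(user);
    if (roleMap == null) {
        roleMap = new HashMap();
        userMap.put(user, roleMap);
    }

    // Look up (or create) the resource list for this role.
    List resList = (List) roleMap.get(role);
    if (resList == null) {
        resList = new ArrayList();
        roleMap.put(role, resList);
    }

    // Skip resources already recorded for this user and role.
    if (!resList.contains(resource)) {
        resList.add(resource);
    }
}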


Using this you can access all the data very easily. I guess the memory
utilization is better, since no unnecessary or duplicate data is
stored (I don't know the exact memory utilization).
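
As a rough sketch of how store() might be driven while the file is read: create the root map once, then call store() once per line. This assumes Java 8 or later; the class name NestedMapLoader and the load() helper are made up for illustration, store() is restated with generics so the block compiles on its own, and skipping malformed lines is an assumption:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class NestedMapLoader {
    // Same logic as store() above, written with generics and computeIfAbsent.
    static void store(Map<String, Map<String, List<String>>> userMap,
                      String user, String role, String resource) {
        List<String> resources = userMap
                .computeIfAbsent(user, k -> new HashMap<>())
                .computeIfAbsent(role, k -> new ArrayList<>());
        if (!resources.contains(resource)) {
            resources.add(resource);
        }
    }

    // Reads USER:ROLE:RESOURCE lines and calls store() once per record.
    // The returned map is the userMap that store() fills in.
    static Map<String, Map<String, List<String>>> load(Reader source) throws IOException {
        Map<String, Map<String, List<String>>> userMap = new HashMap<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                String[] fields = line.split(":");
                if (fields.length != 3) {
                    continue; // assumption: skip lines without exactly three fields
                }
                store(userMap, fields[0].trim(), fields[1].trim(), fields[2].trim());
            }
        }
        return userMap;
    }
}
```

For a real file, something like load(new FileReader(csvFile)) would be the entry point; afterwards userMap.get("SAM").get("ADMIN") gives the list of SAM's ADMIN resources.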
 
LuckyBoy

Hi Jiji, your solution looks fine, but can you tell me what the
userMap is that you are passing to the store() method? In the context
of the file being read, when will this store method be called?
Can you give a complete picture?
 
LuckyBoy

Can anybody suggest a good data structure solution to this problem?

Parsing a delimited text file:
USER:ROLE:RESOURCE
The input file looks like this (it can be huge, with many records):
===================
sam:user:pc
mike:admin:laptop
sam:admin:usb drive
richard:user:pc
===================
The output, in some class-level object(s), must store each user only
once for n roles and n resources:
sam:user:pc
:admin:usb drive
mike:admin:laptop
richard:user:pc

I got HashMap1 [ K , HashMap2 [ K, ArrayList ] ] as a solution.
Anything better ?
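
One possible variation on that nested-map idea, as a sketch only (assuming Java 8 or later; GroupedReport, group() and format() are hypothetical names): LinkedHashMap keeps users and roles in first-seen order, and LinkedHashSet drops duplicate resources without a contains() scan over a list. The format() method prints the user name only on that user's first record, as in the desired output above:

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class GroupedReport {
    // Builds user -> role -> resources, preserving first-seen order at each level.
    static Map<String, Map<String, Set<String>>> group(Iterable<String> lines) {
        Map<String, Map<String, Set<String>>> byUser = new LinkedHashMap<>();
        for (String line : lines) {
            String[] fields = line.split(":");
            if (fields.length != 3) {
                continue; // assumption: skip lines without exactly three fields
            }
            byUser.computeIfAbsent(fields[0].trim(), k -> new LinkedHashMap<>())
                  .computeIfAbsent(fields[1].trim(), k -> new LinkedHashSet<>())
                  .add(fields[2].trim());
        }
        return byUser;
    }

    // Renders one record per line, leaving the user column empty after the
    // first record of each user.
    static String format(Map<String, Map<String, Set<String>>> byUser) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Map<String, Set<String>>> user : byUser.entrySet()) {
            boolean first = true;
            for (Map.Entry<String, Set<String>> role : user.getValue().entrySet()) {
                for (String resource : role.getValue()) {
                    out.append(first ? user.getKey() : "")
                       .append(':').append(role.getKey())
                       .append(':').append(resource)
                       .append('\n');
                    first = false;
                }
            }
        }
        return out.toString();
    }
}
```

Whether this counts as better depends on how the data is read back. For a file too large to hold in memory, no in-memory map will do, and that is a separate problem.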
 
