Best way to access this file?

Brian

Hello,

I have a text file I'm attempting to parse. There are about 50 fixed width
fields in each line / row. For example (shortened for brevity):

W1234Somebody East 101110001111010101
E1235Someone Else West 010111001001010101

I'm having problems pulling these fields into structures, in order to be
able to access each individually. I am currently opening as a sequential
file. Is there a better way?

My structure looks something like:

struct data{
char area[1];
char empNumber[4];
char name[16]
char region[5];
char options[20];
}

int index = 0;
data user[100];

I would like to read the entire file into memory. Please tell me if I'm
going about this the wrong way. So far after reading the file in, I'm
unable to access any individual items (ie. user[index].area) Also should
I stick with a sequential file, or should I consider binary access (seems
like it may be easier to address individual elements)?



(I don't want to include too many details here, but will be happy to provide
whatever is needed).

Thank you in advance,
Brian
 
Victor Bazarov

Brian said:
I have a text file I'm attempting to parse. There are about 50 fixed width
fields in each line / row. For example (shortened for brevity):

W1234Somebody East 101110001111010101
E1235Someone Else West 010111001001010101

I'm having problems pulling these fields into structures, in order to be
able to access each individually. I am currently opening as a sequential
file. Is there a better way?

My structure looks something like:

struct data{
char area[1];
char empNumber[4];
char name[16]
char region[5];
char options[20];
}

int index = 0;
data user[100];

I would like to read the entire file into memory. Please tell me if I'm
going about this the wrong way.

Which way are you going? As soon as you specify the way we can tell if
it's wrong or not.

So far after reading the file in, I'm
unable to access any individual items (ie. user[index].area)

WTF does "unable to access" mean?

Also should
I stick with a sequential file, or should I consider binary access (seems
like it may be easier to address individual elements)?

Binaryness and sequentialness of a file are two orthogonal traits.
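
To illustrate that the two traits are independent, here is a minimal sketch that opens a file in binary mode and then jumps straight to one record instead of reading from the start. The helper name `readRecord` and the assumption that every record has exactly the same byte length (line terminator included) are mine, not something stated in the thread.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: read record number `n` (0-based) from a file whose
// records are all exactly `recordSize` bytes long, terminator included.
// Returns an empty string if the record cannot be read.
std::string readRecord(const std::string& path, std::size_t n, std::size_t recordSize)
{
    // Binary mode only switches off newline translation. It does not by
    // itself make access random; the seek below is what does that.
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return std::string();
    }

    in.seekg(static_cast<std::streamoff>(n * recordSize));

    std::string record(recordSize, '\0');
    if (!in.read(&record[0], static_cast<std::streamsize>(recordSize))) {
        return std::string();  // seek landed past the end, or a short read
    }
    return record;
}
```

The same stream could just as well be read front to back with no seek at all, which is sequential access to a binary-mode file.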

(I don't want to include too many details here, but will be happy to provide
whatever is needed).

FAQ 5.8.

V
 
Alf P. Steinbach

* Brian:
I have a text file I'm attempting to parse. There are about 50 fixed width
fields in each line / row. For example (shortened for brevity):

W1234Somebody East 101110001111010101
E1235Someone Else West 010111001001010101

I'm having problems pulling these fields into structures, in order to be
able to access each individually. I am currently opening as a sequential
file. Is there a better way?

That question seems to be meaningless.

My structure looks something like:

struct data{
char area[1];
char empNumber[4];
char name[16]
char region[5];
char options[20];
}

Are you sure? This should not compile due to lacking semicolon
at the end. Always copy and paste code, don't retype it.

Assuming it is accurate in other details:

Every field is 1 character too short to be used as a C zero-terminated
string. Perhaps that is an error, or perhaps it's intentional: that
you intend this structure to directly reflect the data. But if the
latter it's non-portable, because different compilers may add padding
in different ways (you're almost guaranteed that 'area' will occupy at
least two bytes of structure space, no matter which compiler).
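
A small sketch of both points, using the layout from the original post with the missing semicolons added. Whether a given compiler actually inserts padding is implementation-specific, so the only thing checked here is that the struct is never smaller than its fields. The names `kFieldBytes`, `dataWithTerminators` and `copyField` are made up for illustration.

```cpp
#include <cstddef>
#include <cstring>

// The layout from the original post, with the missing semicolons added.
struct data {
    char area[1];
    char empNumber[4];
    char name[16];
    char region[5];
    char options[20];
};

// Sum of the field widths as they appear in one line of the text file.
const std::size_t kFieldBytes = 1 + 4 + 16 + 5 + 20;

// The struct is at least this big. A compiler may add padding, so code
// should not rely on sizeof(data) being exactly kFieldBytes.
static_assert(sizeof(data) >= kFieldBytes, "struct cannot be smaller than its members");

// To hold an N-character field as a C string, each array needs N + 1 bytes.
struct dataWithTerminators {
    char area[1 + 1];
    char empNumber[4 + 1];
    char name[16 + 1];
    char region[5 + 1];
    char options[20 + 1];
};

// Copy `width` characters from `src` into `dst` and zero-terminate the result.
// `dst` must have room for width + 1 bytes.
void copyField(char* dst, const char* src, std::size_t width)
{
    std::memcpy(dst, src, width);
    dst[width] = '\0';
}
```

With the second struct, `copyField(user.empNumber, line + 1, 4)` would leave a properly terminated string that can be printed with `cout`.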


int index = 0;
data user[100];

Do think about using C++'s facilities for _abstraction_.

E.g.,


class AreaCode{ ... };
class EmployeeNumber{ ... };
class ...

class FixedWidthFields
{
public:
FixedWidthFields( std::string line ) { ... }
FixedWidthFields(
AreaCode const& anAreaCode,
EmployeeNumber const& anEmployeeNumber,
...
)
{ ... }

AreaCode areaCode() const { return ...; }
EmployeeNumber employeeNumber() const { return ...; }
...
};


...
std::ifstream f( filename );
std::string s;
std::vector<

if( !f ) { throw std::runtime_error( "asølkjasd" ); }
std::getline( f, s );
FixedWidthFields fields( s );
// Create real user data representation from 'fields'.
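
One way the outline above might be filled in, as a sketch only. It simplifies the idea by returning plain `std::string` values instead of dedicated `AreaCode` / `EmployeeNumber` classes, and the offsets and widths (1, 4, 16, 5, 20) are taken from the struct in the original post, so they are assumptions about the real file.

```cpp
#include <stdexcept>
#include <string>

// Wraps one line of the file and hands out its fields by name.
// Field positions are assumptions based on the struct in the original post.
class FixedWidthFields
{
public:
    explicit FixedWidthFields(const std::string& line) : line_(line)
    {
        // 1 + 4 + 16 + 5 + 20 characters are needed for all five fields.
        if (line_.size() < 46) {
            throw std::runtime_error("line shorter than 46 characters");
        }
    }

    std::string areaCode() const       { return line_.substr(0, 1); }
    std::string employeeNumber() const { return line_.substr(1, 4); }
    std::string name() const           { return trimRight(line_.substr(5, 16)); }
    std::string region() const         { return trimRight(line_.substr(21, 5)); }
    std::string options() const        { return line_.substr(26, 20); }

private:
    // Remove the space padding that follows short values in a fixed-width field.
    static std::string trimRight(const std::string& s)
    {
        const std::string::size_type last = s.find_last_not_of(' ');
        return last == std::string::npos ? std::string() : s.substr(0, last + 1);
    }

    std::string line_;
};
```

The calling code then never deals with offsets: it reads a line with `std::getline`, constructs a `FixedWidthFields` from it, and asks for `name()` or `region()`.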
 
Brian

Thanks for your comments - points well taken. I'll be the first to
admit I have a long way to go, but I'm trying. Also have to admit I
haven't been through the group's faq in a long time. Will have another
look.

Brian

 
Brian

Alf,

First of all many thanks for your thoughtful reply to my "scattered"
post. I see it's quite obvious I don't know what I'm doing. I've
been "messing" with C and C++ for at least the past 10 years, but
until now have not really tried taking on a meaningful project. I
have a (I would say) basic understanding of the syntax.

I would love to be able to apply the code you presented, but quite
honestly it's way over my head right now. Most of the manuals I have
on my shelf don't do a great job of walking me through these commands
(could be b/c I have old books on my shelf!)

Anyhow I "sort of" understand the way I'm attempting to solve this,
but even so there are many holes. For example even though I've
declared the array of structures named 'user', when I try reading the
file into this array I come up with blanks (or whatever data already
happens to be there).

Here's the code, if you have a moment to take a quick look and
(hopefully) point out my errors. The data file will be included at
the end.

Also please let me know if you have any suggestions on books, etc..
that can help me with the "True" C++ syntax.

Thank you,
Brian

------------------------------------------------------------------
#include <iostream.h>
#include <string.h>
#include <fstream.h>
#include <conio.h>
using namespace std;

int main()
{
int counter = 0;
// create file object and open file
ifstream inFile;
inFile.open("BDNTest.prn", ios::in);

// Determine if the file was opened successfully

if (inFile.is_open())
{
string name = "";
getline(inFile, name, '\n');
while (!inFile.eof())
{
counter +=1;
cout << name << endl;
getline(inFile, name, '\n');
}
inFile.close();
}
else
cout << "File could not be opened" << endl;
cout << endl << "Total records: " << counter;

cout << "Now ready to read BDN File" << endl;

struct BDNData {
char area;
char EIN[5];
char station[4];
char name[17];
char fileNum[10];
char userType;
char orgCode[4];
char accessLevel;
char rest[78];
};


BDNData user[100];
int index=0;

//read BDN file into memory array


cout << "Address first array: " << &user[0] << endl;
cout << "Address second array: " << &user[1] << endl;
cout << sizeof(user[0]);
cout << endl << "user-0: " << user[0].name << user[0].EIN;

inFile.open("BDNTest.prn", ios::in);

while (!inFile.eof()) {
cout << "Array " << index << ": " << &user[index] << endl;
inFile >> &user[index];
cout << endl << "Name: " << user[index].name << endl;
cout << "For Index " << index << "Process complete" << endl;
getch();
index++;

}


inFile.close();
return 0;

}

----------------------------------------------------------------------

************************** BDNTEST.PRN *******************************
----------------------------------------------------------------------

W1212345JONESS KIMBE 123456789121
0000001011000000000001000000000000000
0000100111010000000000000100001000000000
W1092345ROESE DAVID 234567890121
6011111111110000011111001101100010000
0000110111011000000100100110101000011111
W1111345FUNNNYYWICT 121
7011111111110111011011101101101010000
0000111111011000000100100110101000011011
W4444345CRUISE BRIAN 123456789124
7000001000000000000000000000000000000
0000100000010000000000000100001001000000
W5055644D SMITTY 01234567 213
5000001000000000000000001000000100001
0000100000110000000000101100001000000000
W6666345SMITH SANDRA 987654321100
8000001000000000000000000000000000000
0000100000010000000000000100001001000001
W6789345BERTHA BIG 246802461100
8000001000000100000000000000000000000
0000100000010000000000000100001001000001
W2222345JAGGER MICK 126
0000001000000000000000001000000000000
0000100000000000000000000000000000000000
W1111345BROOKS GARTH 100
7000001000000000000000000000000000000
0000100000000000000000000100001000000000
W9876345LAUPER CYNTHIA
120F8000001000000000000000000000000000000
0000100000010000000000000100001000000001
-------------------------END OF FILE
-------------------------------------

Sorry about the text wrap!



 
Alf P. Steinbach

* Brian:
Here's the code, if you have a moment to take a quick look and
(hopefully) point out my errors. The data file will be included at
the end.

Code and test data is precisely the thing to get useful suggestions.

Also please let me know if you have any suggestions on books, etc..
that can help me with the "True" C++ syntax.

"Accelerated C++" by Koenig and (somebody help out here) Moi?

A copy of Bjarne Stroustrup's "The C++ Programming Language" would
also be handy, but start with AC++.


#include <iostream.h>

Non-standard header, use


#include <iostream>


#include <string.h>

Probably not the header you think it is. The one you've specified
here is (almost, modulo namespace issues) equivalent to writing


#include <cstring>


i.e. it's a header from the C library, while the C++ header that defines
the std::string type is


#include <string>
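
A tiny sketch of the difference between the two headers. The function names `cLength` and `cppLength` are invented for the example.

```cpp
#include <cstddef>
#include <cstring>  // C library: std::strlen, std::strcpy, ... working on char arrays
#include <string>   // C++ library: the std::string class

// Length of a zero-terminated char array, found by scanning for the '\0'.
std::size_t cLength(const char* s)
{
    return std::strlen(s);
}

// Length of a std::string, which keeps track of its own size.
std::size_t cppLength(const std::string& s)
{
    return s.size();
}
```

Code that declares `string name;` and calls `getline(inFile, name)` needs the second header, not the first.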


#include <fstream.h>

Non-standard header, use


#include <fstream>


#include <conio.h>

System-specific header, OK for system-specific program.

using namespace std;

int main()
{
int counter = 0;
// create file object and open file
ifstream inFile;
inFile.open("BDNTest.prn", ios::in);

// Determine if the file was opened successfully

if (inFile.is_open())
{
string name = "";
getline(inFile, name, '\n');
while (!inFile.eof())

Better


while( inFile )


because that also checks for error state.

{
counter +=1;
cout << name << endl;
getline(inFile, name, '\n');
}
inFile.close();
}
else
cout << "File could not be opened" << endl;
cout << endl << "Total records: " << counter;

cout << "Now ready to read BDN File" << endl;

struct BDNData {
char area;
char EIN[5];
char station[4];
char name[17];
char fileNum[10];
char userType;
char orgCode[4];
char accessLevel;
char rest[78];
};

Replace every character array with a std::string.

BDNData user[100];

int index=0;

//read BDN file into memory array


cout << "Address first array: " << &user[0] << endl;
cout << "Address second array: " << &user[1] << endl;
cout << sizeof(user[0]);
cout << endl << "user-0: " << user[0].name << user[0].EIN;

inFile.open("BDNTest.prn", ios::in);

while (!inFile.eof()) {
cout << "Array " << index << ": " << &user[index] << endl;
inFile >> &user[index];

Don't do that (whatever it is).

Use std::getline to get a string.

Then use e.g. std::string::substr to extract the fields in that string.
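
A sketch of that approach with `std::string` members in place of the character arrays. The field widths are guesses: each is the array size from the posted `BDNData` struct minus one for the terminator, and 1 for the single-`char` members. `BDNRecord`, `takeField` and `parseRecord` are hypothetical names.

```cpp
#include <cstddef>
#include <string>

// Same fields as the posted BDNData struct, held as std::string.
struct BDNRecord {
    std::string area, ein, station, name, fileNum;
    std::string userType, orgCode, accessLevel, rest;
};

// Return up to `width` characters of `line` starting at `pos`, then advance
// `pos` by `width`. A line that is too short yields a short or empty field.
std::string takeField(const std::string& line, std::size_t& pos, std::size_t width)
{
    const std::size_t start = pos;
    pos += width;
    return start < line.size() ? line.substr(start, width) : std::string();
}

// Split one line into fields. Widths are assumptions, see the note above.
BDNRecord parseRecord(const std::string& line)
{
    std::size_t pos = 0;
    BDNRecord r;
    r.area        = takeField(line, pos, 1);
    r.ein         = takeField(line, pos, 4);
    r.station     = takeField(line, pos, 3);
    r.name        = takeField(line, pos, 16);
    r.fileNum     = takeField(line, pos, 9);
    r.userType    = takeField(line, pos, 1);
    r.orgCode     = takeField(line, pos, 3);
    r.accessLevel = takeField(line, pos, 1);
    // Everything after the named fields (the long run of option flags).
    r.rest        = pos < line.size() ? line.substr(pos) : std::string();
    return r;
}
```

Each line read with `std::getline` can be passed to `parseRecord` and the result pushed onto a `std::vector<BDNRecord>`, which also removes the fixed limit of 100 users.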

cout << endl << "Name: " << user[index].name << endl;
cout << "For Index " << index << "Process complete" << endl;
getch();
index++;

}


inFile.close();
return 0;

}
 
Default User

Brian wrote:

Also please let me know if you have any suggestions on books, etc..
that can help me with the "True" C++ syntax.

Accelerated C++, by Koenig and Moo, is often recommended.

#include <iostream.h>

Nonstandard header, use <iostream>.

#include <string.h>

Standard, but probably not what you want. string.h is the header for
the C-style string functions, not the C++ std::string.

#include <fstream.h>

Nonstandard header, use <fstream>.

#include <conio.h>

Platform specific header, don't use it or anything from it on this
newsgroup.

Fix all that first. Oh, and don't top-post, your replies belong
following properly trimmed quotes.



Brian
 
