Looking for an open source C library for text manipulation


1230987za

Hi,

I have been working on BASH shell programming for a while and am very
pleased with its functionality for processing text and strings. Now I wonder if
there are any open source C or C++ libraries which can do similar
things, e.g. regular expression support, cut/grep/sed functions, etc.

I know some, like "cut/grep/sed", are not actually BASH functions, but
it is really a pain to work out every such function by myself. I think
such a thing should be in place already, but a simple Google search does
not return any. Maybe it is hidden somewhere?

Thanks.
 

jackassplus

Try the regex.h library. I've had good results.
 

Ian Collins

Which language are you using, C or C++?
 

Keith Thompson

cut, grep, and sed are Unix commands, not Bash functions. It's hard
to tell from the way you phrased it whether you're acknowledging that
fact or are unaware of it.

You can always invoke external commands from a C program, using
system() or some implementation-specific method. That's what bash
does.

You might try asking in comp.unix.programmer. <OT>Or you might
consider using something like Perl rather than C.</OT>
 

Spiros Bousbouras

Regarding regular expressions, regex.h has been
mentioned already. See also http://www.pcre.org

Regarding "cut/grep/sed", what is it that you want
the library to do?
 

WANG Cong

I think you should use Perl instead of C; Perl is good at exactly the
things you want.
 

Richard

William Pursell said:
regex.h is a header, not a library. Very likely you are referring
to libregex, whose API is published in a file named regex.h.

*blink*

Is this some sort of joke?
 

Kaz Kylheku

William Pursell said:
regex.h is a header, not a library. Very likely you are referring
to libregex, whose API is published in a file named regex.h.

So you are wrong in two ways.

Firstly, it's perfectly normal to refer to a
library by its header name, e.g. ``<stdio.h> library''.

Kindly read this page: http://en.wikipedia.org/wiki/Metonymy
And also this: http://en.wikipedia.org/wiki/Synecdoche

Then when you're done reading these, read them four more times until
they sink in.

Secondly, although C has no regular expression support, the Unix specification
defines a <regex.h>. No mention of a ``libregex'' is made in the specification.

In many implementations you don't have to take any additional toolchain-related
steps to make the functions available in your program, because the functions
are in the main C library.

# sample implementation:
$ nm /lib/libc.so.6 | grep regex
00105f60 t __compat_regexec
000acd10 t __regexec
00105f60 T regexec@GLIBC_2.0
000acd10 T regexec@@GLIBC_2.3.4

So your pointless pedantry is completely misplaced in the one case, and
inaccurate in the other.
 

Richard

It's an instance of compulsive pedantry blurring the ability to
read - either that or unfamiliarity with the rules of English for
forming possessives.

The poster must have been away for a while and not realised that the
"regs" have been stripped of their dominant "alpha" position in no
uncertain terms.
 

user923005

If you want to find libraries, try SourceForge and Google.
 

Richard

William Pursell said:
It is indeed quite common for people to refer to libraries
by referring to the header. It is wrong to do so. It is
extremely common for people to be so totally ignorant of
how libraries work and to indeed believe that the header
is the library.

No it is not. You are talking through your backside. If someone is clued
in enough to suggest regexp.h, for example, I think it's fair to say they
know how to link it in. For years programming in C and other languages
people have always referred to the header as the library because other
clued in professionals know to what they are referring. It is totally
correct and totally conventional to do so.

Why you have this compulsion to look like you're trying so hard to stand
out with your compulsion to be confused by fuzzy conversation is anyone's
guess. You just appear slightly foolish to be honest.
 

Richard

William Pursell said:
WTF? I'm attempting to prevent people from being confused.

Hopefully the replies have proven to you that the only confusion is
emanating from your direction. You appear to be confused (as well as
mistaken) in thinking that referring to a header as a library in a
technical C programming group would, in any way, confuse anyone.
 

Richard

Richard Heathfield said:
William Pursell said:


And it's wrong because there's a real comprehension cost for newbies
in being misled by such people.

Nonsense. The world would be a strange place if everything said had to
be true all the way to first principles.
You're right, in other words. Now you have two choices - defend

No. He is NOT right. He is wrong. He is WRONG because people DO refer to
the header as the library ALL the time and there is NO confusion. The
same way people say a pointer references a block of memory. In the REAL
world.
yourself against people who can't stand the idea that someone
actually posts a correction, or accept that no matter what you say
there will always be bozos to argue with you, and let it go, saving
your energy for *useful* discussions.

The latter strategy is a *huge* timesaver, as I have discovered.

What is a huge timesaver is giving little credence to your
nonsense. Everyone knew what was meant. Only you and your type could try
and confuse the thread with such petty and almost vindictive
criticism. No wonder so many people have termed you arrogant and have
killfiled you. It's almost like you enjoy being a pariah.
 

Ben Pfaff

Richard said:
He is WRONG because people DO refer to the header as the
library ALL the time and there is NO confusion.

On the contrary, this confusion is so common that there are
multiple questions in the FAQ aimed to dispel it:

10.11: I seem to be missing the system header file <sgtty.h>.
Can someone send me a copy?

A: Standard headers exist in part so that definitions appropriate
to your compiler, operating system, and processor can be
supplied. You cannot just pick up a copy of someone else's
header file and expect it to work, unless that person is using
exactly the same environment. Ask your compiler vendor why the
file was not provided (or to send a replacement copy).

13.25: I keep getting errors due to library functions being undefined,
but I'm #including all the right header files.

A: In general, a header file contains only declarations. In some
cases (especially if the functions are nonstandard) obtaining
the actual *definitions* may require explicitly asking for the
correct libraries to be searched when you link the program.
(#including the header doesn't do that.) See also questions
11.30, 13.26, and 14.3.
 

Antoninus Twink

If the day ever comes that no newbie programmer is ever confused about
why he needs to add a -lfoo to CFLAGS when the source code clearly
includes foo.h, then I won't correct someone who refers to the "foo.h
library".

Well, it would be better if he added -lfoo to LDLIBS instead...
Until then, it is necessary to refer to headers as headers, and
libraries as libraries.

...but on the general principle, I agree with you that this is often a
source of confusion and it's a good thing to make sure the distinction
is clear in newbies' minds.
 

Richard

Ben Pfaff said:
On the contrary, this confusion is so common that there are
multiple questions in the FAQ aimed to dispel it:

There are many things in the FAQ which are basic and the great majority
have no issues with.

Are you claiming you would be CONFUSED when someone said use the
stdio.h library?

I do not believe it for one minute.

Pedantry in c.l.c is not dead it seems.
10.11: I seem to be missing the system header file <sgtty.h>.
Can someone send me a copy?

A: Standard headers exist in part so that definitions appropriate
to your compiler, operating system, and processor can be
supplied. You cannot just pick up a copy of someone else's
header file and expect it to work, unless that person is using
exactly the same environment. Ask your compiler vendor why the
file was not provided (or to send a replacement copy).

13.25: I keep getting errors due to library functions being undefined,
but I'm #including all the right header files.

A: In general, a header file contains only declarations. In some
cases (especially if the functions are nonstandard) obtaining
the actual *definitions* may require explicitly asking for the
correct libraries to be searched when you link the program.
(#including the header doesn't do that.) See also questions
11.30, 13.26, and 14.3.

Also notice "in general".

None of these issues, of course, has much of a link to the original
statement.
 

Chris McDonald

Richard said:
.... because people DO refer to
the header as the library ALL the time and there is NO confusion.


Richard, on your text (quoted above) you are correct in stating that
people blur the distinction, all the time. However you are certainly
wrong about it not causing confusion.

You have obviously never participated in the teaching and conversations
of hundreds of undergraduates, who are people too, and observed their
massive confusion about this issue.

Not everyone is a long-time expert in C, and it would be a sad place if
they are the only people we ever talk to.
 

Richard

Chris McDonald said:
Richard, on your text (quoted above) you are correct in stating that
people blur the distinction, all the time. However you are certainly
wrong about it not causing confusion.

You have obviously never participated in the teaching and conversations
of hundreds of undergraduates, who are people too, and observed their
massive confusion about this issue.

Incorrect. I have done a lot of teaching. And this is why I refuse to be
so pedantic. Being pedantic creates more confusion IMO.

But, massive confusion? No. I don't think so. If your experience is
different, fair enough. Then I think someone explained it in such a way
that did not allow sentient beings to make a small leap of faith.

If it was massive, how come I never heard of it before? I don't doubt
there might be some. In the same way there is some about integers and
bit places. Someone will always be confused.

If you say massive confusion then someone, somewhere needs to redesign a
course if they are confused that a header does not contain the main
library code.

The header is the interface to the library and is commonly referred to
as the library. This is my point. And why I claim it is not incorrect in
common language to refer to the library as "stdio.h" for example.

Yes, I know I am a liberal thinker. But I think most people are.
Not everyone is a long-time expert in C, and it would be a sad place if
they are the only people we ever talk to.

I agree. Some people will be confused about many things.

This is not one of them in the main.

I can only go on personal experience. And in my personal experience I
have never known this to be an issue. I can see why people can pick at
it. But it's pedantic and petty to pick at it as the first responder
did.
 

Ben Pfaff

Richard said:
There are many things in the FAQ which are basic and the great majority
have no issues with.

Are you claiming you would be CONFUSED when someone said use the
stdio.h library?

I would not be confused. Some people would be, and have been,
confused by similar wording. Clarity should not be unwelcome.
 

Richard

Ben Pfaff said:
I would not be confused. Some people would be, and have been,
confused by similar wording. Clarity should not be unwelcome.

Of course not Ben.

But to come barging in claiming "it is wrong" in a thread talking about
regex for C, well, come off it. It was the usual "reg"/CBF type pettiness.

The person who replied was being pedantic and silly.
 
