accessing single characters of strings

Marten Lehmann · Oct 21, 2008

Hello,

how can I access a single character of a string in Perl like I can do it
in C with text[10]? I don't want to split it up and work on an array.
How can I work directly in place?

Kind Regards
Marten

Tad J McClellan · Oct 21, 2008

Marten Lehmann said:
like I can do it
in C with text[10]? I don't want to split it up and work on an array.

It is not possible to do it like in C, because in C it _is_ an array.

Jürgen Exner · Oct 21, 2008

Marten Lehmann said:
how can I access a single character of a string in Perl like I can do it
in C with text[10]? I don't want to split it up and work on an array.
How can I work directly in place?

substr() is the function you are looking for.
However I very strongly recommend that you change your mental model.
Perl strings are _NOT_ arrays of characters. That primitive model is
adequate for C. Perl strings are much more powerful and if you use them
like you would use arrays of characters in C, then you are missing most
of their functionality.
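
For reference, a minimal sketch of what substr() offers, assuming a plain scalar string. The variable names and the helper count_doubled_quotes() are made up for illustration and are not part of any module. It shows reading one character, replacing one in place (4-argument form and lvalue form), and walking a string with one character of lookahead.

```perl
use strict;
use warnings;

my $text = "Hello, world!";

# Read one character: substr(STRING, OFFSET, LENGTH); offsets start at 0.
my $char = substr($text, 7, 1);

# Replace one character in place with the 4-argument form.
substr($text, 7, 1, "W");

# substr() can also be assigned to (lvalue form).
substr($text, 0, 1) = "J";

print "$char $text\n";

# Hypothetical helper: count "" pairs by looking at the current
# character and the one after it.
sub count_doubled_quotes {
    my ($s) = @_;
    my $count = 0;
    for (my $i = 0; $i < length($s) - 1; $i++) {
        if (substr($s, $i, 1) eq '"' && substr($s, $i + 1, 1) eq '"') {
            $count++;
            $i++;    # skip the second quote of the pair
        }
    }
    return $count;
}
```

This works, but for most real tasks a regex, tr///, or index() will probably be shorter than indexing character by character.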

jue

Marten Lehmann · Oct 22, 2008

Jürgen Exner said:
substr() is the function you are looking for.
However I very strongly recommend that you change your mental model.
Perl strings are _NOT_ arrays of characters. That primitive model is
adequate for C. Perl strings are much more powerful and if you use them
like you would use arrays of characters in C, then you are missing most
of their functionality.

I need to parse a file line by line. It is basically a CSV file, but not
completely.

Imagine this content:

"one","""two"",""three"""

"" means a replacement for one "

Correctly parsed, I would get two values:

one
and
"two","three"

But I cannot split at , and I cannot split at ",", since both would lead
to wrong parsing. So I have to lexically go through every character. And
thus I need to know not only the current character, but also the next
one. Is substr() really the only choice? It looks a bit awkward to call
substr() dozens of times.

Kind regards
Marten

Willem · Oct 22, 2008

Marten Lehmann wrote:
) I need to parse a file line by line. It is basically a CSV file, but not
) completely.
)
) Imagine this content:
)
) "one","""two"",""three"""
)
) "" means a replacement for one "
)
) Correctly parsed, I would get two values:
)
) one
) and
) "two","three"
)
) But I cannot split at , and I cannot split at ",", since both would lead
) to wrong parsing.

Obviously.

) So I have to lexically go through every character.

A strange conclusion. There are dozens of other ways to do it.

One example: split at , and then use some simple processing to determine
if an entry has an odd number of quotes and, if so, join it to the next
entry. Then postprocess the entries for the quotes.

Another example: Use one of the many existing CSV parsing modules.
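
The first suggestion (split at commas, rejoin pieces that have an odd number of quotes, then postprocess) could look roughly like this. It is only a sketch: split_csv_line() is a hypothetical helper name, it handles a single line, and it assumes fields are either unquoted or fully wrapped in double quotes.

```perl
use strict;
use warnings;

# Hypothetical helper: split one CSV line into fields.
sub split_csv_line {
    my ($line) = @_;
    my @fields;
    my $pending;    # pieces collected so far for the current field

    # Limit -1 keeps trailing empty fields.
    for my $piece (split /,/, $line, -1) {
        $pending = defined $pending ? "$pending,$piece" : $piece;

        # tr/"// with an empty replacement list only counts the quotes.
        my $quotes = ($pending =~ tr/"//);
        next if $quotes % 2;    # odd count: that comma was inside quotes

        push @fields, $pending;
        undef $pending;
    }
    # An unterminated quote leaves something pending; keep it as-is.
    push @fields, $pending if defined $pending;

    # Postprocess: strip the outer quotes and collapse "" to ".
    for (@fields) {
        if (/^"(.*)"$/s) {
            $_ = $1;
            s/""/"/g;
        }
    }
    return @fields;
}

print "$_\n" for split_csv_line(q("one","""two"",""three"""));
```

Quoted fields that span several lines are not covered here, which is one more reason to prefer an existing module.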

SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT

Jürgen Exner · Oct 22, 2008

Marten Lehmann said:
I need to parse a file line by line. It is basically a CSV file, but not
completely.

Imagine this content:

"one","""two"",""three"""

"" means a replacement for one "

That is a normal, standard, boring CSV, nothing special. Why do you
think the standard CSV parsers wouldn't be able to parse it?

Correctly parsed, I would get two values:
one
and
"two","three"

Which of the existing CSV parsers did you try? How did they fail to
parse that line?

But I cannot split at , and I cannot split at ",", since both would lead
to wrong parsing.

Yes, but that's not how you normally would write a parser anyway.

So I have to lexically go through every character. And
thus I need to know not only the current character, but also the next
one.

It has been a really long time, but AFAIR that depends on the kind of
tokenizer you are using.

Is substr() really the only choice? It looks a bit awkward to call
substr() dozens of times.

If you really insist on reinventing the wheel and writing your own CSV
parser (why would you want to do that? An exercise in parser writing?),
then the low-level tokenizer may indeed work best on an array of
characters.
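
If it is meant as an exercise, a character-level pass over one line might be sketched like this. parse_csv_chars() is a hypothetical name, and the sketch assumes a single line with comma separators and "" as the escaped quote. It does not deal with embedded newlines or malformed input.

```perl
use strict;
use warnings;

# Hypothetical helper: parse one CSV line character by character.
sub parse_csv_chars {
    my ($line) = @_;
    my @chars = split //, $line;    # one array element per character
    my @fields;
    my $field     = '';
    my $in_quotes = 0;

    for (my $i = 0; $i < @chars; $i++) {
        my $c    = $chars[$i];
        my $next = $i + 1 < @chars ? $chars[$i + 1] : '';

        if ($in_quotes) {
            if ($c eq '"' && $next eq '"') {
                $field .= '"';      # "" inside quotes is one literal "
                $i++;               # consume the second quote as well
            }
            elsif ($c eq '"') {
                $in_quotes = 0;     # closing quote
            }
            else {
                $field .= $c;
            }
        }
        elsif ($c eq '"') {
            $in_quotes = 1;         # opening quote
        }
        elsif ($c eq ',') {
            push @fields, $field;   # separator outside quotes ends the field
            $field = '';
        }
        else {
            $field .= $c;
        }
    }
    push @fields, $field;           # last field has no trailing comma
    return @fields;
}

print "$_\n" for parse_csv_chars(q("one","""two"",""three"""));
```

Note that the lookahead is just the next array element, so no substr() calls are needed once the line has been split into characters.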

jue

Tad J McClellan · Oct 22, 2008

Marten Lehmann said:
I need to parse a file line by line. It is basically a CSV file, but not
completely.

Imagine this content:

"one","""two"",""three"""

"" means a replacement for one "

That looks like normal CSV to me...

Correctly parsed, I would get two values:

one
and
"two","three"

---------------------------
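#!/usr/bin/perl
use warnings;
use strict;
use Text::CSV_XS;

# The sample line; q() avoids having to escape the double quotes.
my $line = q("one","""two"",""three""");
my $csv = Text::CSV_XS->new();

# parse() returns false on malformed input, so check it.
$csv->parse($line)
    or die "Cannot parse line: " . $csv->error_input() . "\n";

# Print one field per line.
print "$_\n" for $csv->fields();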

Marten Lehmann · Oct 22, 2008

Have a look at the Text::CSV_XS[1] module, which will parse your csv
data correctly and spare you a lot of headache about balanced
quotations. Using the double quote itself to escape a literal
double quote isn't that uncommon, so it's a default setting in
Text::CSV_XS.

Thanks, you are right. I usually do simple line-by-line parsing on my
own and had just planned to extend it a bit, but I realized that
recognizing doubled quotes and newlines takes a bit more work, so
I'm now using the Text::CSV_XS module you recommended.

Regards
Marten

grocery_stocker · Oct 22, 2008

Tad J McClellan said:
Marten Lehmann said:
like I can do it
in C with text[10]? I don't want to split it up and work on an array.

It is not possible to do it like in C, because in C it _is_ an array.

an array of objects.
