splitting string with unknown number of similar substrings

Jan Biel

Hi!

I'm trying to split a string into an array of tokens. Some of the tokens are
similar but I don't know how many will be in the string.

Example: (just imagine \n instead of line breaks, this is really one string)

CREATE TABLE personal (
personal_id integer NOT NULL,
vorname varchar(50) NOT NULL,
nachname varchar(50) NOT NULL,
privattelefon varchar(30),
email varchar(130),
datenschutzcode bigint NOT NULL,
CONSTRAINT emailkorrekt_chk CHECK ((email ~~ '%@%'::text))
);

So the string always has a head

"CREATE TABLE personal ("

and a tail

");"


The tokens in between always have the form ".*," except the last one, which
omits the ",". It is not known beforehand how many tokens exist.

My idea of a fitting Perl expression looks like this:
my @table = split /(CREATE TABLE)(.*?)(\()(.*?,)*?(.*?\);)/s, $table;


But it does not work. Of the tokens in between, it only finds

"datenschutzcode bigint NOT NULL,"

and the one after that.
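
The observation above is consistent with a general property of Perl regexes: when a capturing group is repeated by a quantifier, only the text from its last repetition is kept. A minimal sketch on a made-up string (the string and variable names below are illustrative assumptions, not the table definition from the post):

```perl
use strict;
use warnings;

# Hypothetical sample string, much simpler than the CREATE TABLE text.
my $s = "a,b,c,d";

# (\w,) is repeated by *, but the group remembers only its final repetition.
# In list context the match returns the captured groups ($1, $2).
my ($last_rep, $final) = $s =~ /^(\w,)*(\w)$/;

print "repeated group: $last_rep\n";   # prints "repeated group: c,"
print "final group:    $final\n";      # prints "final group:    d"
```

So a single quantified group cannot return an unknown number of tokens; a global match (/g) or a plain split on a separator is the usual way around that.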


I have tried

my @table = split /(CREATE TABLE)(.*?)(\()(.*?,)(.*?,)(.*?,)(.*?,)(.*?,)(.*?,)(.*?\);)/s, $table;

This splits at exactly the correct places. Unfortunately, with this solution I
have to know the number of tokens beforehand. I also consider it
really bad style.

I hope you can help,
Janbiel
 
Jan Biel

Bernard said:
If the data always fits that description then this should work:


my @tokens = m/(?:,|CREATE TABLE personal \()([^,]*)/g;

Thanks for the tip, but could you elaborate?
I'm a newbie and don't fully understand it.

The expression looks for a match (m/).

I see two groups
(?:,|CREATE TABLE personal \()
and
([^,]*)

I suppose each match is copied into an array element.

The first group either detects "CREATE TABLE personal (" or "?:,", but I
have no idea what the latter means. :(

[^,]* is also a mystery to me.

Just trying to understand instead of blindly copy-pasting. Could you please
clear it up?

Thanks a lot,
Janbiel
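
For readers with the same question, here is a hedged sketch of how the quoted expression can be read. `(?:...)` is a non-capturing group (it groups the alternatives but stores nothing), and `[^,]*` is a negated character class matching any run of characters that are not commas. The shortened table string and the explicit `$table =~` binding are assumptions added for illustration; the quoted line matches against `$_` instead.

```perl
use strict;
use warnings;

# Shortened, hypothetical version of the table definition from the post.
my $table = "CREATE TABLE personal (\n"
          . "personal_id integer NOT NULL,\n"
          . "vorname varchar(50) NOT NULL,\n"
          . "email varchar(130)\n"
          . ");";

# (?:,|CREATE TABLE personal \()  non-capturing: a comma OR the literal head
# ([^,]*)                         capturing: everything up to the next comma
# With /g in list context, every capture from every match is returned.
my @tokens = $table =~ /(?:,|CREATE TABLE personal \()([^,]*)/g;

# Trim the leading and trailing whitespace (including newlines) of each token.
s/^\s+|\s+$//g for @tokens;

print "[$_]\n" for @tokens;
```

Two caveats follow from the pattern itself: the last token still carries the tail ");", and any column definition that contains a comma would be cut in two.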
 
GreenLight

Jan Biel said:
Hi!

I'm trying to split a string into an array of tokens. Some of the tokens are
similar but I don't know how many will be in the string.

Example: (just imagine \n instead of line breaks, this is really one string)

If the string has embedded newlines (\n), then why not split it out
into an array of lines (or read it into an array in the first place)?

my @array = split /\n/, $table;

The individual lines might be easier to work with than trying to come
up with one (easily understandable) RE.
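
A minimal sketch of that line-based approach, assuming the head and the tail each sit on a line of their own, as in the example. The shortened table string and the variable names are illustrative assumptions.

```perl
use strict;
use warnings;

# Shortened, hypothetical version of the table definition from the post.
my $table = "CREATE TABLE personal (\n"
          . "personal_id integer NOT NULL,\n"
          . "vorname varchar(50) NOT NULL,\n"
          . "email varchar(130)\n"
          . ");";

# One array element per line.
my @lines = split /\n/, $table;

# Take off the head ("CREATE TABLE personal (") and the tail (");").
my $head = shift @lines;
my $tail = pop @lines;

# Strip the trailing comma that every token except the last one carries.
s/,\s*$// for @lines;

print "[$_]\n" for @lines;
```

This works for any number of column lines, since nothing in it depends on how many tokens sit between the head and the tail.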
 
