String Generation using Mask Parsing

Discussion in 'C Programming' started by James Arnold, Sep 21, 2008.

  1. James Arnold

    James Arnold Guest

    Hello,

    I am new to C and I am trying to write a few small applications to get
    some hands-on practise! I am trying to write a random string
    generator, based on a masked input. For example, given the string:
    "AAANN" it would return a string containing 3 alphanumeric characters
    followed by 2 digits. This part I have managed :)
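
For reference, a flat mask of this kind can be handled in a single pass. The following is only a minimal sketch, not the code from the post: the function name `fill_from_mask`, the return convention (0 on success, -1 on error) and the choice of which characters count as "alphanumeric" are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: write one random character per mask character.
 * 'A' gives a letter or digit, 'N' gives a digit.
 * Returns 0 on success, -1 if the mask has an unknown character
 * or the output buffer is too small. */
int fill_from_mask(const char *mask, char *out, size_t cap)
{
    static const char alnum[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    size_t n;

    assert(mask != NULL && out != NULL);
    n = strlen(mask);
    if (n + 1 > cap)
        return -1;                      /* no room for the '\0' */

    for (size_t i = 0; i < n; i++) {
        switch (mask[i]) {
        case 'A':
            out[i] = alnum[rand() % (sizeof alnum - 1)];
            break;
        case 'N':
            out[i] = (char)('0' + rand() % 10);
            break;
        default:
            return -1;                  /* unknown mask character */
        }
    }
    out[n] = '\0';
    return 0;
}
```

Calling `fill_from_mask("AAANN", buf, sizeof buf)` on a buffer of at least six bytes fills it with five characters. Seed the generator once with `srand()` beforehand if different runs should give different strings.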

    I would now like to add some complexity to this, such as repetitions
    and grouping. For example, I'd like to have masks similar to:
    "AAN*10", which would return two alphanumeric chars followed by a
    sequence of 10 numeric characters. However, the characters could be
    grouped, such as: "A(AN)*10", which would now return an alphanumeric
    character followed by a sequence of ten alternating alphanumeric/
    numeric characters.

    I'm not really sure where to start with this next step as I have
    minimal experience. Any pointers in the right direction, or sample
    code, would be appreciated.

    Thanks in advance,
    James.
    James Arnold, Sep 21, 2008
    #1

  2. CBFalconer

    CBFalconer Guest

    James Arnold wrote:
    >
    > I am new to C and I am trying to write a few small applications
    > to get some hands-on practise! I am trying to write a random
    > string generator, based on a masked input. For example, given
    > the string: "AAANN" it would return a string containing 3
    > alphanumeric characters followed by 2 digits. This part I have
    > managed :)
    >
    > I would now like to add some complexity to this, such as
    > repetitions and grouping. For example, I'd like to have masks
    > similar to: "AAN*10", which would return two alphanumeric chars
    > followed by a sequence of 10 numeric characters. However, the
    > characters could be grouped, such as: "A(AN)*10", which would
    > now return an alphanumeric character followed by a sequence of
    > ten alternating alphanumeric/ numeric characters.
    >
    > I'm not really sure where to start with this next step as I
    > have minimal experience. Any pointers in the right direction,
    > or sample code, would be appreciated.


    I think a study of regular expressions, as implemented in Unix and
    Linux, would be instructive here.

    --
    [mail]: Chuck F (cbfalconer at maineline dot net)
    [page]: <http://cbfalconer.home.att.net>
    Try the download section.
    CBFalconer, Sep 21, 2008
    #2

  3. James Arnold

    James Arnold Guest

    > I think a study of regular expressions, as implemented in Unix and
    > Linux, would be instructive here.


    I am already familiar with regular expressions, but I was under the
    impression they can't be used to match braces when nested? So if for
    example I wanted to do A(A(N)*10)*5, regular expressions wouldn't be
    appropriate?

    > It turns into a grammar parsing problem.


    I have been looking at Lex/Yacc (well, Flex/Bison) and written a
    grammar to handle what I would like. I've compiled it and managed to
    get it to output the detected tokens, but it definitely seemed to be
    overkill for such a small program. Currently I'm just iterating
    through a string and switch()'ing on each character, which covers most
    of the functionality I'd like. I figured there must be a way of
    tracking nested depth and calling the parse routine recursively for
    each matched group?

    > After that, my suggestion would be to divide the program into two parts.
    > The first one would input a string like "AA(AN){10}" and expand it to
    > something like "AAANANANANANANANANANAN", which is really what you're
    > looking at.


    This is also something I had considered, but I want to be able to use
    a range for a specified repetition, e.g. repeat between 5 to 10 times.
    This is fine, but if I want to generate 50 different outcomes the full
    mask would need to be expanded each time, rather than just the
    repeated bit. Surely that is not going to be very efficient? :)

    Thanks for the replies!
    James Arnold, Sep 21, 2008
    #3
  4. Ben Bacarisse

    [You or your news reader is not adding attribution lines. This is not
    a good idea and you should have a look to see if you can fix it.]

    James Arnold writes:

    >> I think a study of regular expressions, as implemented in Unix and
    >> Linux, would be instructive here.

    >
    > I am already familiar with regular expressions, but I was under the
    > impression they can't be used to match braces when nested? So if for
    > example I wanted to do A(A(N)*10)*5, regular expressions wouldn't be
    > appropriate?


    I think the suggestion was only that you could look at REs for how to
    write your masks. You are right that REs won't be any good as a way of
    implementing this. For example, some REs use (abc){3,6} for 3 to 6
    repeats of "abc" and you might one day want things like [aeiou] rather
    than just A and N indicators.
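
If the mask syntax borrows the `{3,6}` form, reading the bounds is a small job for `strtol`. What follows is a sketch under assumptions, not a fixed design: the names `parse_repeat` and `pick_count` are made up for illustration, only the forms `{n}` and `{min,max}` are accepted, and `rand() % range` is used for simplicity even though it is slightly biased.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: parse "{n}" or "{min,max}" starting at p.
 * On success, store the bounds and return a pointer just past '}'.
 * On malformed input, return NULL. */
const char *parse_repeat(const char *p, long *min, long *max)
{
    char *end;

    assert(p != NULL && min != NULL && max != NULL);
    if (p[0] != '{' || !isdigit((unsigned char)p[1]))
        return NULL;
    *min = strtol(p + 1, &end, 10);
    *max = *min;                        /* "{n}" means exactly n */
    if (*end == ',') {
        if (!isdigit((unsigned char)end[1]))
            return NULL;
        *max = strtol(end + 1, &end, 10);
    }
    if (*end != '}' || *max < *min)
        return NULL;
    return end + 1;
}

/* Pick a repeat count in [min, max]; assumes 0 <= min <= max and a
 * range small enough to fit what rand() can return. */
long pick_count(long min, long max)
{
    return min + rand() % (max - min + 1);
}
```

With this, `parse_repeat("{3,6}", &lo, &hi)` sets `lo` to 3 and `hi` to 6, and `pick_count(lo, hi)` chooses how many times to repeat the preceding item for one generated string.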

    >> It turns into a grammar parsing problem.

    >
    > I have been looking at Lex/Yacc (well, Flex/Bison) and written a
    > grammar to handle what I would like. I've compiled it and managed to
    > get it to output the detected tokens, but it definitely seemed to be
    > overkill for such a small program.


    Agreed. You have at most brackets and a couple of operators. No
    need for lex and yacc.

    > Currently I'm just iterating
    > through a string and switch()'ing on each character, which covers most
    > of the functionality I'd like. I figured there must be a way of
    > tracking nested depth and calling the parse routine recursively for
    > each matched group?


    That's roughly what I'd do. In fact, I'd probably make what you call
    the parse routine do the actual generation as well. The parsing will
    be so simple that actually storing the parse in some form is probably
    not needed.
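
A rough sketch of that idea, for the `*N` syntax from the original post: the routine walks the mask, and for each letter or bracketed group it reads the optional repeat count and then generates the item that many times, calling itself for the inside of a group. All names here (`generate`, `gen_range`, `skip_atom`, `struct out`, `MAX_REPEAT`) are hypothetical, the repeat limit is arbitrary, and output that does not fit the buffer is silently dropped.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_REPEAT 10000L   /* arbitrary safety limit (an assumption) */

struct out { char *buf; size_t len, cap; };

static void emit(struct out *o, char c)
{
    if (o->len + 1 < o->cap) {          /* keep room for '\0' */
        o->buf[o->len++] = c;
        o->buf[o->len] = '\0';
    }
}

static char random_char(char kind)
{
    static const char alnum[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    if (kind == 'N')
        return (char)('0' + rand() % 10);
    return alnum[rand() % (sizeof alnum - 1)];
}

/* Return a pointer just past the item at p (a letter or a balanced
 * bracketed group), or NULL if the mask is malformed. */
static const char *skip_atom(const char *p)
{
    int depth = 0;
    if (*p == 'A' || *p == 'N')
        return p + 1;
    if (*p != '(')
        return NULL;
    do {
        if (*p == '\0')
            return NULL;                /* unclosed bracket */
        if (*p == '(')
            depth++;
        else if (*p == ')')
            depth--;
        p++;
    } while (depth > 0);
    return p;
}

/* Parse and generate the part of the mask in [p, end). */
static int gen_range(const char *p, const char *end, struct out *o)
{
    while (p < end) {
        const char *after = skip_atom(p);
        const char *next = after;
        long count = 1;

        if (after == NULL || after > end)
            return -1;
        if (next < end && *next == '*') {       /* optional "*N" */
            char *num_end;
            if (!isdigit((unsigned char)next[1]))
                return -1;
            count = strtol(next + 1, &num_end, 10);
            if (count > MAX_REPEAT)
                return -1;
            next = num_end;
        }
        for (long i = 0; i < count; i++) {
            if (*p == '(') {
                /* recurse on the text between the brackets */
                if (gen_range(p + 1, after - 1, o) != 0)
                    return -1;
            } else {
                emit(o, random_char(*p));
            }
        }
        p = next;
    }
    return 0;
}

/* Returns 0 on success, -1 on a malformed mask. */
int generate(const char *mask, char *buf, size_t cap)
{
    struct out o;
    assert(mask != NULL && buf != NULL && cap > 0);
    o.buf = buf; o.len = 0; o.cap = cap;
    buf[0] = '\0';
    return gen_range(mask, mask + strlen(mask), &o);
}
```

With a large enough buffer, `generate("A(AN)*10", buf, sizeof buf)` produces 21 characters and `generate("A(A(N)*10)*5", buf, sizeof buf)` produces 56. A repeated group is re-parsed on every repetition, which keeps the code free of any stored tree at the cost of rescanning a few characters.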

    >> After that, my suggestion would be to divide the program into two parts.
    >> The first one would input a string like "AA(AN){10}" and expand it to
    >> something like "AAANANANANANANANANANAN", which is really what you're
    >> looking at.

    >
    > This is also something I had considered, but I want to be able to use
    > a range for a specified repetition, e.g. repeat between 5 to 10 times.
    > This is fine, but if I want to generate 50 different outcomes the full
    > mask would need to be expanded each time, rather than just the
    > repeated bit. Surely that is not going to be very efficient? :)


    I agree. If you parse and generate on the fly, there is no need for
    either an intermediate mask or a stored parse tree.

    --
    Ben.
    Ben Bacarisse, Sep 21, 2008
    #4