regular expression to parse {"hello", "hello world","1hello-2*hello"}

Discussion in 'Java' started by Roy, Jan 6, 2008.

  1. Roy

    Roy Guest

    Hi,

    I was trying to use Java's regular expressions to parse the following
    string:

    {"hello", "hello world", "1hello-2*hello"}

    I'd like to extract the words inside the quotation marks as follows:

    hello
    hello world
    1hello-2*hello

    I've tried different ways to write the expression but couldn't work it
    out. Can anyone help?

    Thanks a lot.
    Roy
     
    Roy, Jan 6, 2008
    #1

  2. Why are you asking us to do your homework?

    Go to: http://java.sun.com/j2se/1.5.0/docs/api/ , class Pattern,
    and remember to double each backslash (\\) in Java string literals.
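
    To illustrate the double-backslash point, here is a minimal sketch (the
    class and method names are made up for this example). The regex \s+ has
    to be written as "\\s+" in Java source, because the compiler consumes one
    level of backslashes before Pattern ever sees the string. A double quote
    is escaped as \" for the compiler only; the regex engine sees a plain ".

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of any library.
class BackslashDemo {
    // Regex \s+ (one or more whitespace characters), written as a Java literal.
    static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Regex " (a single double quote); the backslash here is for the Java
    // compiler, not for the regex engine.
    static final Pattern QUOTE = Pattern.compile("\"");

    // Splits the input on runs of whitespace.
    static String[] splitOnWhitespace(String input) {
        return WHITESPACE.split(input);
    }

    // Returns true if the input contains at least one double quote.
    static boolean containsQuote(String input) {
        return QUOTE.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(String.join("|", splitOnWhitespace("hello   world")));
        System.out.println(containsQuote("say \"hi\""));
    }
}
```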
     
    Jacek Wojciechowski, Jan 6, 2008
    #2

  3. Hey Roy, here's a free tip: you are very likely to get responses like this
    when making requests of the form "Please solve my problem!" without
    showing and explaining *what you* have tried already and what your exact
    problem with your approach is.


    regards,
    /W
     
    Wildemar Wildenburger, Jan 6, 2008
    #3
  4. Roy

    Roy Guest

    Hi Jacek,
    Thank you for the tip. Actually, this is not my homework. I just
    started learning regular expressions last night and came up with this
    problem. I've tried many ways to parse this string but haven't
    succeeded yet.

    The problem that bugs me is the spaces inside the quotation marks and
    the spaces outside of them. I don't know how to write an expression to
    distinguish them. Of course I believe I can find other ways to parse
    the string without using any regular expressions. But I am just
    curious whether a simple expression can do the job.

    Here is what I've tried:

    Enter your regex: [^{},"]+
    Enter input string to search: {"hello", "hello world",
    "1hello-2*hello"}
    I found the text "hello" starting at index 2 and ending at index 7.
    I found the text " " starting at index 9 and ending at index 10.
    I found the text "hello world" starting at index 11 and ending at
    index 22.
    I found the text " " starting at index 24 and ending at index
    33.
    I found the text "1hello-2*hello" starting at index 34 and ending at
    index 48.
    I found the text " " starting at index 50 and ending at index 51.

    Enter your regex: [^{},"\s+]+
    Enter input string to search: {"hello", "hello world",
    "1hello-2*hello"}
    I found the text "hello" starting at index 2 and ending at index 7.
    I found the text "hello" starting at index 11 and ending at index 16.
    I found the text "world" starting at index 17 and ending at index 22.
    I found the text "1hello-2*hello" starting at index 34 and ending at
    index 48.
     
    Roy, Jan 6, 2008
    #4
  5. The simplest regex:

    "\\b\\w+\\b"

    The "\\b" matches a word boundary (a zero-width position, so it doesn't
    consume a character), and the "\\w" matches a word character (a letter,
    digit, or underscore).
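
    Note that on the sample input this pattern would yield hello, hello,
    world, 1hello, 2, hello, because the space, "-" and "*" are not word
    characters. If the goal is the full text between each pair of quotes,
    one possible approach is to match the quotes themselves and capture what
    lies between them. The following is only a sketch under that assumption
    (the class and method names are invented here, and escaped quotes inside
    a string are not handled):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
class QuotedStrings {
    // Regex "([^"]*)" : an opening quote, a captured run of non-quote
    // characters (possibly empty), and a closing quote.
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    // Returns the text found between each pair of double quotes, in order.
    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = QUOTED.matcher(input);
        while (m.find()) {
            result.add(m.group(1)); // group 1 is the text between the quotes
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "{\"hello\", \"hello world\", \"1hello-2*hello\"}";
        for (String s : extract(input)) {
            System.out.println(s);
        }
    }
}
```

    Spaces outside the quotes are never part of a match, so they need no
    special treatment; spaces inside the quotes are kept because [^"]
    accepts any character other than the quote.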
     
    Joshua Cranmer, Jan 6, 2008
    #5
  6. Roedy Green

    Roedy Green Guest

    Roedy Green, Jan 7, 2008
    #6
  7. Roedy Green

    Roedy Green Guest

    If you had a file that looked like that, you could read it as a CSV
    file. See http://mindprod.com/jgloss/csv.html
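
    As a rough illustration of the quote-aware scanning that a CSV-style
    reader performs, here is a minimal sketch without any regex. It is not
    the reader from the page linked above and not a full CSV parser (no
    escaped quotes, no unquoted fields); the class and method names are
    invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical scanner for illustration only.
class QuoteScanner {
    // Collects the text of every double-quoted field in the line.
    static List<String> fields(String line) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                if (inQuotes) {
                    // Closing quote: the current field is complete.
                    out.add(current.toString());
                    current.setLength(0);
                }
                inQuotes = !inQuotes;
            } else if (inQuotes) {
                // Keep everything inside quotes, spaces included.
                current.append(c);
            }
            // Characters outside quotes (braces, commas, spaces) are skipped.
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "{\"hello\", \"hello world\", \"1hello-2*hello\"}";
        for (String field : fields(line)) {
            System.out.println(field);
        }
    }
}
```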
     
    Roedy Green, Jan 7, 2008
    #7