Convert String Containing Hex Values

Discussion in 'Perl Misc' started by cp, Oct 29, 2004.

  1. cp

    cp Guest

    If I have a string that looks like this:

    Job_x0020_Number

    How do I turn that into:

    Job Number

    ?
     
    cp, Oct 29, 2004
    #1

  2. cp

    cp Guest

    s/(.*)\_[[:xdigit:]]+\_(.*)/$1 $2/;
     
    cp, Oct 29, 2004
    #2

  3. s/_x\d+_/ /;
     
    Gunnar Hjalmarsson, Oct 29, 2004
    #3
  4. If you try to solve the problem in a language you know, you will see
    that you need to define the problem in more detail before you can
    start coding. You ought to do so before asking here.

    Making some plausible assumptions about your problem, this might work:

    $string = "Job_x0020_Number";
    $string =~ s/_x[0-9a-fA-F]+_/ /;

    Since you wrote "hex values" in the subject line, I assume all
    hexadecimal digits can occur after the 'x'.
     
    Arndt Jonasson, Oct 29, 2004
    #4
  5. cp

    cp Guest

    Just tested mine; it doesn't work when the hex number has a leading x.
    Change it to:
    s/(.*)\_x[[:xdigit:]]+\_(.*)/$1 $2/;

    Still uglier than the other solutions given, but it should work.
     
    cp, Oct 29, 2004
    #5
  6. Uri Guttman

    Uri Guttman Guest

    c> s/(.*)\_[[:xdigit:]]+\_(.*)/$1 $2/;

    overkill. you don't need to grab and put back the leading and trailing
    strings. like gunnar did, just delete the stuff you want to delete.

    and _ doesn't need escaping there (or anywhere as it is just a word
    char).

    so your regex should be:

    s/_x[[:xdigit:]]+_/ /;

    a lot cleaner and easier to read.

    uri
     
    Uri Guttman, Oct 29, 2004
    #6
  7. cp

    cp Guest

    Thanks for the tip. I mistakenly remembered from my quick reading of
    'Programming Perl' that all non alpha characters were metacharacters, but I
    was wrong. I am looking at the list right now (p. 141 3rd ed.) which is :

    \ | ( ) [ { ^ $ * + ? .

    and then you have / which only needs a backslash in front to match literally
    if it is also used as a delimiter.

    I probably shouldn't be trying to help out around here yet, but I couldn't
    resist trying to help a fellow cp!
     
    cp, Oct 29, 2004
    #7
  8. cp

    cp Guest

    This could be a stretch in trying to justify backslashing characters
    unnecessarily, but what about the possibility of reserved metacharacters?
    What if Larry Wall decides to make use of _ and the other
    non-metacharacter, non-alpha characters and old scripts that did not
    backslash them will be broken? I know it's a stretch and I did read that
    even if Perl6 breaks old scripts, there will be a tool to upgrade scripts
    from Perl5 to Perl6 so maybe it's not even an issue.
     
    cp, Oct 29, 2004
    #8
  9. Uri Guttman

    Uri Guttman Guest

    c> This could be a stretch in trying to justify backslashing
    c> characters unnecessarily, but what about the possibility of
    c> reserved metacharacters? What if Larry Wall decides to make use of
    c> _ and the other non-metacharacter, non-alpha characters and old
    c> scripts that did not backslash them will be broken? I know it's a
    c> stretch and I did read that even if Perl6 breaks old scripts, there
    c> will be a tool to upgrade scripts from Perl5 to Perl6 so maybe it's
    c> not even an issue.

    _ is in \w and will always be a word char and needs no more escaping
    than does k or 3. perl5 regexes ain't gonna change metachar meanings or
    it will break too much code. perl6 not only will have a perl5 regex
    compiler it will have a much easier regex (actually called rules and
    grammars) extension mechanism that it won't need to change its metachars
    in the future.

    uri
     
    Uri Guttman, Oct 29, 2004
    #9
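
The point about `_` can be checked directly. This is a quick sketch, not part of the original posts, using the core `quotemeta` function, which backslashes every ASCII character that does not match `[A-Za-z_0-9]`. The sample string is just the field name from this thread with `(s)` appended for illustration.

```perl
use strict;
use warnings;

# "_" matches \w, just like a letter or a digit.
print "underscore is a word char\n" if '_' =~ /^\w$/;

# quotemeta leaves word characters alone and escapes the rest.
my $quoted = quotemeta('Job_x0020_Number(s)');
print "$quoted\n";    # Job_x0020_Number\(s\)
```

Only the parentheses come back escaped; the underscores are left untouched.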
  10. What about
    s/_.*_/ /;

    jue
     
    Jürgen Exner, Oct 30, 2004
    #10
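
One caveat with a greedy `.*`: it matches from the first underscore to the last one, so a name containing several escapes loses everything in between. The sketch below compares it with the more specific pattern from post #4, with `/g` added as an assumption about what is wanted for multiple escapes. The sample string is one of the field names quoted later in the thread.

```perl
use strict;
use warnings;

my $greedy   = 'Client_x0020_ID_x0020_Number';
my $specific = $greedy;

# .* is greedy: it spans from the first "_" to the last "_".
$greedy =~ s/_.*_/ /;

# Matching only "_x<hex digits>_" and using /g replaces each escape separately.
$specific =~ s/_x[0-9a-fA-F]+_/ /g;

print "$greedy\n";      # Client Number
print "$specific\n";    # Client ID Number
```

For the single-escape string in the original question both patterns give "Job Number"; the difference only shows once a name has more than one escape.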
  11. cp

    cp Guest

    My fault for not explaining the whole problem.

    MS Word allows custom data fields in its Word XML files. My users are
    typically creative with them, and insert fields like:

    Client ID Number
    Job Number(s)

    while some would know not to include non-alphanumerics,
    and would write the fields as:

    client_id_or_job_number
    etc.

    Word, when it saves the file as XML, translates illegal characters so
    in the above example, I would get:
    <o:Client_x0020_ID_x0020_Number dt:dt="string">
    12345
    </o:Client_x0020_ID_x0020_Number>

    <o:Job_x0020_Number_x0028_s_x0029_ dt:dt="string">
    5 and 6
    </o:Job_x0020_Number_x0028_s_x0029_>

    The regex Abigail provided fits the bill nicely, as I would like to
    spit out a text file with the custom data fields as:

    Client ID Number: 12345
    Job Number(s): 5 and 6


    Thanks to all who helped
     
    cp, Nov 1, 2004
    #11
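
Since the escapes in these field names encode different characters (`0020` is a space, `0028` and `0029` are parentheses), they can be decoded generically instead of being replaced with a fixed space. This is a minimal sketch, assuming every escape has the form `_xHHHH_` with exactly four hex digits as in the samples above. The helper name `decode_field_name` is hypothetical, and this is not necessarily the regex referred to in the post.

```perl
use strict;
use warnings;

# Hypothetical helper: turn each _xHHHH_ escape back into the character
# whose code point is the hex number HHHH (assumes exactly four hex digits).
sub decode_field_name {
    my ($name) = @_;
    # /e evaluates the replacement as code; /g handles every escape in the name.
    $name =~ s/_x([0-9A-Fa-f]{4})_/chr(hex($1))/ge;
    return $name;
}

for my $raw ('Client_x0020_ID_x0020_Number', 'Job_x0020_Number_x0028_s_x0029_') {
    print decode_field_name($raw), "\n";
}
```

With the two element names from the post this prints "Client ID Number" and "Job Number(s)". A name such as client_id_or_job_number passes through unchanged because none of its underscores starts an `_xHHHH_` sequence.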
