 Pattern Matching of Regular Expressions 

Last post Mon, Jun 9 2008 5:33 AM by archive. 5 replies.
Started by archive 09 Jun 2008 05:33 AM. Topic has 5 replies and 5034 views
  • Mon, Jun 9 2008 5:33 AM

    • archive
    • Top 75 Contributor
    • Joined on Fri, Jul 4 2008
    • Posts 88
    • Points 4,950
    Pattern Matching of Regular Expressions Reply

    I'm studying this topic and I'm having trouble understanding this example:

     

    rexCompile("\\([a-z]+\\)\\.\\1")        => t

    rexExecute("abc.bc")                    => t

    rexExecute("abc.ab")                    => nil

     

    It is not clear to me why the second case is nil. Also, if I type:

     

    rexCompile("\\([a-z]+\\)\\.\\1")        => t

    rexExecute("abc.bc")                    => t

     

    rexSubstitute("debug: \\0")             => debug: bc.bc

    Why is the result bc.bc?

    Shouldn't register \1 hold abc?

    Could you please clarify how this works? Where can I find accurate information? The cdsdoc is not clear on how registers work.

     

    Giuseppe

     


    Originally posted in cdnusers.org by Giuseppe Greco
    • Post Points: 0
  • Mon, Jun 9 2008 5:52 AM

    • archive
    RE: Pattern Matching of Regular Expressions Reply

    Giuseppe,

    Looks like a bug. I don't think rexExecute("abc.bc") should return t either. Only "abc.abc" should return t.

    What seems to be happening is that \1 is incorrectly being stored as "bc", and that's why the first case returns t.

    You should contact customer support about this.

    Note that in IC61 you have the "pcre" functions, which provide access to a much more powerful regular expression parser/compiler, so I'd take a look at that.

    Regards,

    Andrew.


    Originally posted in cdnusers.org by adbeckett
    • Post Points: 0
  • Mon, Jun 9 2008 6:03 AM

    • archive
    RE: Pattern Matching of Regular Expressions Reply

    Hi Giuseppe,
    It looks correct to me. Your first expression:
    rexCompile("\\([a-z]+\\)\\.\\1") ; => t
    stands for: "search for any pattern that matches 'blablablaXXXX.XXXX'".
    So if you enter 'abc.bc', the parser catches 'bc.bc' as XXXX.XXXX.
    Everything is fine.
    Regards,
    Remark: 'blablablaXXXXblabla.XXXX' is not caught by rexCompile("\\([a-z]+\\)\\.\\1")


    Originally posted in cdnusers.org by ebecheto
    • Post Points: 0
  • Mon, Jun 9 2008 6:29 AM

    • archive
    RE: Pattern Matching of Regular Expressions Reply

    Good point. I agree. This should work. In general it tries to match the whole pattern, not each bit in turn.

    So I agree. It's behaving correctly.

    Andrew.


    Originally posted in cdnusers.org by adbeckett
    • Post Points: 0
  • Tue, Jun 10 2008 12:02 AM

    • archive
    RE: Pattern Matching of Regular Expressions Reply

    Ok, perhaps I do not understand how it works...

    rexCompile("\\([a-z]+\\)\\.\\1") ; => t
    ok,

    If I do rexExecute("abc.bc")

    \1 should contain the value matched by \([a-z]+\), ok? In our case it should be "abc". Is that correct?
    I would expect our rexCompile to be equivalent to

    rexCompile("[a-z]+\\.abc")

    so

    rexExecute("abc.bc") => nil

    and

    rexExecute("abc.ab") => nil

    Why is the first one true?

    >>stand for :" search any pattern that match 'blablablaXXXX.XXXX'

    Could you explain how to interpret rexCompile("\\([a-z]+\\)\\.\\1")?

    Thank you,

    Giuseppe


    Originally posted in cdnusers.org by Giuseppe Greco
    • Post Points: 0
  • Tue, Jun 10 2008 12:22 AM

    • archive
    RE: Pattern Matching of Regular Expressions Reply

    In the first case, you're expecting the \([a-z]+\) matching to happen in isolation. It doesn't. What happens is that it tries to match the entire pattern, so it looks for a combination of one or more a-z characters, followed by ".", followed by exactly the same combination. The match does not have to start at the first character of the string, so since the whole pattern can succeed if the \1 part ends up as "bc", that is what it does.
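
    As a minimal sketch of the point above, the registers can be inspected after a match with rexSubstitute, using only the calls already shown in this thread. The values after => are what the matching described above implies: \\0 is the whole matched text and \\1 is the first \( \) group.

    ```skill
    rexCompile("\\([a-z]+\\)\\.\\1")   ; => t
    rexExecute("abc.bc")               ; => t, the match starts at "b", not at "a"
    rexSubstitute("\\0")               ; => "bc.bc"  (entire matched text)
    rexSubstitute("\\1")               ; => "bc"     (register 1)

    rexExecute("abc.abc")              ; => t, here the match can start at "a"
    rexSubstitute("\\1")               ; => "abc"
    ```

    Register 1 holds whatever the group matched in the attempt that finally succeeded, not what it grabbed in an earlier attempt that failed.
    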

    If you had anchored the pattern:

    rexCompile("^\\([a-z]+\\)\\.\\1")

    then this would have forced the group to start at the beginning of the string, and so to match the complete sequence of characters before the ".". In that case "abc.abc" would match, but "abc.bc" and "abc.ab" would not.
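
    A short sketch of the anchored behaviour. The trailing $ in the second pattern is an addition that was not discussed in this thread; it assumes the usual end-of-string anchor and is shown only to illustrate that ^ alone leaves the end of the string unconstrained.

    ```skill
    rexCompile("^\\([a-z]+\\)\\.\\1")    ; anchored at the start only
    rexExecute("abc.abc")                ; => t
    rexExecute("abc.bc")                 ; => nil, the group must now start at "a"
    rexExecute("abc.abcd")               ; => t, nothing constrains what follows \1

    rexCompile("^\\([a-z]+\\)\\.\\1$")   ; anchored at both ends
    rexExecute("abc.abc")                ; => t
    rexExecute("abc.abcd")               ; => nil
    ```
    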

    So as was stated previously, you're matching:

    blablahblaXXXX.XXXX

    where XXXX is a variable-length sequence of a-z characters, but is the same on either side of the dot.

    Regards,

    Andrew.


    Originally posted in cdnusers.org by adbeckett
    • Post Points: 0