
Thread: Regex with xml tag

  1. #1
    Join Date
    Dec 2009
    Posts
    192

    Regex with xml tag

    Hello,
    I am working on a String that contains XML tags, and I am trying to recognize certain patterns:
    Code:
    t <ind> test </ind> with a <ind> </ind> More
    My goal is to isolate the unmarked text, that is, everything outside the <ind> tags:
    Code:
    t with a More
    To remove the marked parts I use the following regexp: <ind>.*</ind>
    It works, but I just noticed that it works too well, because for the example above it returns
    Code:
    t More
    The reason is that the pattern matches everything from the first <ind> to the last </ind> in
    Code:
    t <ind> test </ind> with a <ind> </ind> More
    So, how can I use a regexp effectively with XML tags, so that I can extract data from them?

  2. #2
    Join Date
    Nov 2009
    Posts
    343

    Re: Regex with xml tag

    Hello,
    Note that regex is rarely the right tool for parsing XML; it works well only for the simplest XML. In your case, try the reluctant (non-greedy) quantifier:
    Code:
    <ind>.*?</ind>
    I think this is the suitable option for you. As I said, regex is not a good idea for parsing XML, so it would be better to find another way. In the end it is up to you; use whatever you prefer.
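
    To make the difference between the two quantifiers concrete, here is a minimal Java sketch. The class and method names (GreedyVsReluctant, strip) are made up for illustration; only Pattern and Matcher.replaceAll come from java.util.regex. It assumes the tags are written without inner spaces and that the input contains no line breaks.

```java
import java.util.regex.Pattern;

final class GreedyVsReluctant {
    // Greedy: .* extends to the LAST </ind> in the input.
    static final Pattern GREEDY = Pattern.compile("<ind>.*</ind>");
    // Reluctant: .*? stops at the FIRST </ind> after each <ind>.
    static final Pattern RELUCTANT = Pattern.compile("<ind>.*?</ind>");

    // Removes every match of the pattern from the input.
    static String strip(Pattern pattern, String input) {
        return pattern.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String input = "t <ind> test </ind> with a <ind> </ind> More";
        System.out.println(strip(GREEDY, input));    // "t  More" (two spaces)
        System.out.println(strip(RELUCTANT, input)); // "t  with a  More"
    }
}
```

    The spaces that surrounded each tag pair are left behind, so the result may need a whitespace cleanup pass afterwards.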

  3. #3
    Join Date
    Dec 2009
    Posts
    192

    Re: Regex with xml tag

    Hi,
    Code:
    <ind>.*?</ind>
    Yes, it works better, but it still fails on nested tags like these:
    Code:
    t<ind> test </ind> with <ind> a <ind> two </ind></ind> More
    If regexp is not the best way to do this, what is? A language such as XPath? I have no idea about the alternatives, so if you know one that fits here, please post it or link to an article or documentation about it.

  4. #4
    Join Date
    Nov 2009
    Posts
    446

    Re: Regex with xml tag

    Hello,
    I think you should limit yourself, i.e. make sure the XML stays simple enough for regex analysis. For somewhat historical reasons (the classic lex/yacc separation between lexing and grammar), a regex has no grammar, and nesting like yours is very difficult to handle with one. If I wanted to do it anyway, I would write a single regex with an OR group that matches either an opening or a closing tag, walk the string with find(), and react each time I hit an opening or a closing tag. You can then assemble the overall result however you want, but this only works if the XML contains no comments or other constructs.
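
    The walk described above could look roughly like this in Java. This is only a sketch under assumptions: the tag name ind is hard-coded, the names OutsideTagText and textOutside are hypothetical, and the input is assumed to contain no comments, CDATA sections or attributes. A depth counter tracks how many <ind> tags are open, and text is kept only while the depth is zero.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class OutsideTagText {
    // One pattern with an OR: group 1 is set for an opening tag,
    // and is null when the match is a closing tag.
    private static final Pattern TAG = Pattern.compile("(<ind>)|</ind>");

    static String textOutside(String input) {
        StringBuilder out = new StringBuilder();
        Matcher m = TAG.matcher(input);
        int depth = 0; // number of <ind> tags currently open
        int last = 0;  // index just past the previous tag
        while (m.find()) {
            if (depth == 0) {
                // Text between the previous tag and this one is unmarked.
                out.append(input, last, m.start());
            }
            if (m.group(1) != null) {
                depth++;
            } else if (depth > 0) {
                depth--; // a stray closing tag at depth 0 is ignored
            }
            last = m.end();
        }
        if (depth == 0) {
            out.append(input, last, input.length());
        }
        // If a tag is never closed, the trailing text is dropped.
        return out.toString();
    }

    public static void main(String[] args) {
        String nested = "t<ind> test </ind> with <ind> a <ind> two </ind></ind> More";
        System.out.println(textOutside(nested)); // "t with  More" (two spaces)
    }
}
```

    The regex here only locates tags; the nesting logic lives in the counter, which is why this handles the nested example that a single pattern cannot.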

  5. #5
    Join Date
    Dec 2009
    Posts
    192

    Re: Regex with xml tag

    Hi,
    Yes, that is how I already handle a number of tags: I match the opening and the closing tag, then go through the lists of positions returned by the two find() loops to locate the extrema. Then, by recursion, I take the string contained between the two extrema and start again. I will apply this principle here at first (because I am in a hurry to make this work), but I will also look at ways to use an XML API to locate the positions of tags in a String.

  6. #6
    Join Date
    Nov 2009
    Posts
    583

    Re: Regex with xml tag

    Hi,
    Here is a regex that should suit you (Perl-compatible, but it should work in Java):
    Code:
    <(\w+)>(.*?)</\1>
    It does not handle nested tags, but everything else is fine. As far as I know, you cannot do recursion with Java's regex engine, so what you are trying to do is not possible with a single regex there. In any case, an XML parser is probably easier.
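
    For comparison, here is a rough sketch of the parser route using the DOM API that ships with the JDK (javax.xml.parsers). The names DomUnmarkedText and textOutsideInd, and the wrapping <root> element, are assumptions made for this example: the string from the first post is a fragment without a single root, so it has to be wrapped before a parser will accept it. The fragment must otherwise be well-formed XML.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class DomUnmarkedText {
    static String textOutsideInd(String fragment) {
        // Wrap the fragment in a hypothetical <root> so it becomes a document.
        String xml = "<root>" + fragment + "</root>";
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Element root = builder.parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            StringBuilder out = new StringBuilder();
            NodeList children = root.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Text directly under <root> lies outside every element;
                // child elements (including nested ones) are skipped whole.
                if (child.getNodeType() == Node.TEXT_NODE) {
                    out.append(child.getNodeValue());
                }
            }
            return out.toString();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Fragment is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String nested = "t<ind> test </ind> with <ind> a <ind> two </ind></ind> More";
        System.out.println(textOutsideInd(nested)); // "t with  More" (two spaces)
    }
}
```

    Nesting needs no special handling here because the parser has already built the tree; the loop only looks at the top level.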
