Extracting string (RegExp)
Hello,
I have a string
Code:
hello robert <a href="http://www.test.net/f/"> test </a> bla bla bl <a href="http://www.google.com"> google </a> NEWS google </b> <<google
How can I extract every substring that starts with:
Code:
<a
and ends with </a>, so that I can put them in an array?
I tried the regular expression below:
Code:
String[] tabText = text.split("[<A * </ A]");
But it did not work. Thank you in advance.
Re: Extracting string (RegExp)
Hello,
Just a suggestion, since I'm no expert in regular expressions: you could take a look at the following.
Code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// [^>]* and [^<]* keep each match inside a single <a ...>...</a> element
String regex = "<a[\\s]+href=[^>]*>[^<]*</a>";
Pattern pat = Pattern.compile(regex);
Matcher matcher = pat.matcher("hello robert <a href=\"http://www.test.net/f/\"> test </a> bla bla bl <a href=\"http://www.google.com\"> google </a> NEWS google </b> <<google");
while (matcher.find()) {
    System.out.println("Found URL (" + matcher.start() + ", " + matcher.end() + ") --> " + matcher.group());
}
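Since the question asks for the results in an array, here is a minimal sketch that builds on the loop above and collects each match instead of printing it. The class name AnchorExtractor and the helper extractAnchors are only illustrative names, and the extra \\s* before the = sign is my own assumption, added to tolerate a space there.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnchorExtractor {
    // Negated character classes keep each match inside one <a ...>...</a> element.
    private static final Pattern ANCHOR =
            Pattern.compile("<a\\s+href\\s*=[^>]*>[^<]*</a>");

    // Hypothetical helper: returns every matched element as one array entry.
    public static String[] extractAnchors(String text) {
        List<String> found = new ArrayList<>();
        Matcher matcher = ANCHOR.matcher(text);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String text = "hello robert <a href=\"http://www.test.net/f/\"> test </a> bla bla bl "
                + "<a href=\"http://www.google.com\"> google </a> NEWS google </b> <<google";
        for (String anchor : extractAnchors(text)) {
            System.out.println(anchor);
        }
    }
}
```

Note that [^<]* stops at the first nested tag, so a link whose text contains another element would not be matched by this pattern.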
Re: Extracting string (RegExp)
Hello,
For convenience you can use the non-greedy (reluctant) quantifiers: .*?, .+? or .??. They behave like .*, .+ and .?, but match the shortest string that satisfies the pattern:
Code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "hello robert <a href=\"http://www.test.net/f/\"> test </a> bla bla bl <a href=\"http://www.google.com\"> google </a> NEWS google </b> <<google";
// .*? stops at the first possible end, so each link is matched separately
Pattern p = Pattern.compile("<a.*?>.*?</a>", Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(s);
while (m.find()) {
    System.out.println(m.group());
}
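To make the difference visible, here is a small sketch that runs a greedy and a reluctant version of the pattern on the same text. The class GreedyVsReluctant, the helper findAll and the shortened sample string are only illustrative, not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsReluctant {
    // Hypothetical helper: collects every match of regex in text.
    public static List<String> findAll(String regex, String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String s = "<a href=\"u1\">one</a> and <a href=\"u2\">two</a>";
        // Greedy: .* runs to the last </a>, so both links end up in one match.
        System.out.println(findAll("<a.*>.*</a>", s));
        // Reluctant: .*? stops at the first </a>, giving one match per link.
        System.out.println(findAll("<a.*?>.*?</a>", s));
    }
}
```

With the greedy pattern the whole sample string comes back as a single match, while the reluctant pattern returns the two links separately.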
Re: Extracting string (RegExp)
Hello,
Thank you very much, that's exactly what I wanted. I did not know about the Pattern class; it fits this task very well. I think I will gain a lot in performance and simplicity thanks to your help. A big thank you again, this is better than I had hoped. If you know of any alternative or another method for doing the same thing, please post it; I am interested.
Re: Extracting string (RegExp)
Hello,
Getting to grips with regex handling in Java is not easy.
Code:
Pattern p = Pattern.compile("<a href=\"(.*?)\">(.*?)</a>");
Matcher m = p.matcher(link); // link is the string to search
Parentheses delimit capturing groups.
Code:
while (m.find()) {
    System.out.println(m.group(0)); // returns everything the pattern matched
    System.out.println(m.group(1)); // returns what the first pair of parentheses captured
    System.out.println(m.group(2)); // returns what the second pair of parentheses captured
}
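Put together, the two fragments above could look like the following sketch. The class LinkGroups, the helper extractLinks and the choice of a map from URL to link text are my own assumptions for the example; the pattern expects href=" without any space, as in the original question.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkGroups {
    // Group 1 captures the href value, group 2 the text between the tags.
    private static final Pattern LINK =
            Pattern.compile("<a href=\"(.*?)\">(.*?)</a>");

    // Hypothetical helper: maps each URL to its trimmed link text, in order of appearance.
    public static Map<String, String> extractLinks(String html) {
        Map<String, String> links = new LinkedHashMap<>();
        Matcher m = LINK.matcher(html);
        while (m.find()) {
            links.put(m.group(1), m.group(2).trim());
        }
        return links;
    }

    public static void main(String[] args) {
        String link = "hello robert <a href=\"http://www.test.net/f/\"> test </a> bla bla bl "
                + "<a href=\"http://www.google.com\"> google </a>";
        extractLinks(link).forEach((url, text) -> System.out.println(url + " -> " + text));
    }
}
```

A map keeps only one text per URL, so a list of pairs would be the better choice if the same address can appear several times.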
Re: Extracting string (RegExp)
Hello,
Indeed it is not. I am trying to learn, but it is not clear to me yet. In any case, great once again!
Better than I'd hoped. Thanks for your help. Before closing the topic: I found a small snag. Do you know whether I can exclude the <b> and </b> tags from the text that is found?
Code:
<a href=""> big Thank NEWS </b> </a>
Thanks in any case.
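One possible approach to this last question, offered as a sketch rather than a definitive answer: capture the link text with a group, then remove the <b> and </b> tags from the captured text with a second, small regex. The class LinkTextCleaner, the helper firstLinkText and the sample input with a matching <b> ... </b> pair are assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkTextCleaner {
    private static final Pattern LINK =
            Pattern.compile("<a href=\"(.*?)\">(.*?)</a>", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: returns the text of the first link without <b> or </b>,
    // or null when no link is found.
    public static String firstLinkText(String html) {
        Matcher m = LINK.matcher(html);
        if (!m.find()) {
            return null;
        }
        // (?i)</?b> matches <b>, </b>, <B> and </B>; use "<[^>]+>" to drop every tag instead.
        return m.group(2).replaceAll("(?i)</?b>", "").trim();
    }

    public static void main(String[] args) {
        System.out.println(firstLinkText("<a href=\"\"> <b> Thank NEWS </b> </a>"));
    }
}
```

For anything beyond simple, well-formed fragments like these, an HTML parser is usually a safer tool than regular expressions.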