Windows Phone Developers


Sunday, June 1, 2008

Extract Ref Links From WebPage using VB.Net Regular Expressions



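' Place this Imports statement at the top of the source file.
Imports System.Text.RegularExpressions

Sub Extract_Links_From_WebPage()
    Dim oReg As Regex
    Dim oMat As Match
    Dim sInputString As String
    Dim sLink As String

    ' Sample input. The anchor markup of the original sample was lost,
    ' so the URL below is only a placeholder.
    sInputString = "some have <a href=""http://example.com/page.html"">links</a> that direct to some html files"

    ' Capture the href value, quoted or unquoted, in the named group "link".
    oReg = New Regex("href\s*=\s*(?:""(?<link>[^""]*)""|(?<link>\S+))", RegexOptions.Compiled Or RegexOptions.IgnoreCase)

    oMat = oReg.Match(sInputString)
    While oMat.Success
        sLink = oMat.Groups("link").Value
        Console.WriteLine(sLink)
        ' Move to the next match; without this the loop never ends.
        oMat = oMat.NextMatch()
    End While
End Sub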

The code above uses the Group class, which represents the results from a single capturing group. Because a group can capture zero, one, or more strings in a single match (when it sits under a quantifier), a Group contains a collection of Capture objects. Because Group inherits from Capture, the last substring captured can be accessed directly: the Group instance itself is equivalent to the last item of the collection returned by its Captures property.
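As a minimal sketch of how a Group relates to its Captures, consider the example below. The module name, pattern, and input string are hypothetical and not part of the original post; they only illustrate a group placed under a quantifier.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module CaptureDemo
    Sub Main()
        ' Hypothetical pattern: the group "word" sits under a + quantifier,
        ' so it can capture several substrings within one match.
        Dim m As Match = Regex.Match("one two three", "(?:(?<word>\w+)\s*)+")
        Dim g As Group = m.Groups("word")

        ' The Group itself exposes only the last substring captured.
        Console.WriteLine(g.Value)

        ' The Captures collection holds every substring the group captured,
        ' in the order they were captured.
        For Each c As Capture In g.Captures
            Console.WriteLine(c.Value)
        Next
    End Sub
End Module
```

Here the group captures three words in a single match, so g.Value is the last one ("three"), while g.Captures lists all three.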

Instances of Group are returned by indexing the GroupCollection object returned by the Groups property. The indexer can be a group number, or the name of a capture group if the "(?<groupname>)" grouping construct is used. For example, in C# code you can use Match.Groups[groupnum] or Match.Groups["groupname"], and in Visual Basic code you can use Match.Groups(groupnum) or Match.Groups("groupname").
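The two indexing styles can be sketched as follows. The module name, input, and pattern are assumptions made up for illustration: one unnamed group and one named group called "val".

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module GroupIndexDemo
    Sub Main()
        ' Hypothetical input and pattern: one unnamed group, one named group.
        Dim m As Match = Regex.Match("key=value", "(\w+)=(?<val>\w+)")

        If m.Success Then
            Console.WriteLine(m.Groups(0).Value)      ' group 0 is the whole match
            Console.WriteLine(m.Groups(1).Value)      ' first unnamed group, by number
            Console.WriteLine(m.Groups("val").Value)  ' named group, by name
        End If
    End Sub
End Module
```

Indexing by name tends to be easier to maintain than indexing by number, because adding another group to the pattern does not shift the names.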

In the above example, (?<link>[^""]*) stores the text matched by the [^""]* pattern in the group ‘link’, which can be accessed with Groups("link").

See Also
Extract Ref Links From WebPage using VB.Net Regular Expressions
Remove HTML Tags from String using .NET Regular Expressions
VB.NET Regular Expression to Check URL
VB.NET Regular Expression to Check Email Addresses
VB.NET Regular Expression to Check MAC Address
Regular Expression to Check Zip Code
Validate eMail Addresses using VB.NET Function
Regular Expressions in Dot Net (.NET)
