Extract Links From WebPage using VB.Net Regular Expressions
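Imports System.Text.RegularExpressions

Sub Extract_Links_From_WebPage()
    Dim oReg As Regex
    Dim oMat As Match
    Dim sInputString As String
    Dim sLink As String
    ' HTML to search; supply text that contains href attributes
    sInputString = "some have links that direct to some html files"
    ' Match href="..." (quoted) or href=... (unquoted) and store the URL in the group "link"
    oReg = New Regex("href\s*=\s*(?:""(?<link>[^""]*)""|(?<link>\S+))", RegexOptions.Compiled Or RegexOptions.IgnoreCase)
    oMat = oReg.Match(sInputString)
    While oMat.Success
        sLink = oMat.Groups("link").ToString
        Console.WriteLine(sLink)
        ' Advance to the next match; without this the loop never ends
        oMat = oMat.NextMatch()
    End While
End Sub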
The above code uses the Group class, which represents the results from a single capturing group. Because a Group can capture zero, one, or more strings in a single match (when a quantifier is applied to the group), it contains a collection of Capture objects. Because Group inherits from Capture, the last substring captured can be accessed directly: the Group instance itself is equivalent to the last item of the collection returned by the Captures property.
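The relationship between Group and Capture can be sketched as follows. This is a minimal illustration, not part of the original example: the module name CaptureDemo, the helper GetCaptures, the group name "word", and the sample input are all assumptions chosen for the demonstration. A quantifier is applied to a named group so that one match captures several substrings.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module CaptureDemo
    ' Hypothetical helper: returns every substring captured by the group "word"
    ' in the first match of a pattern whose group is repeated by a quantifier.
    Function GetCaptures(ByVal source As String) As List(Of String)
        Dim results As New List(Of String)
        ' (?<word>\w+\s*)+ repeats the group, so a single match holds several captures
        Dim m As Match = Regex.Match(source, "(?<word>\w+\s*)+")
        If m.Success Then
            For Each c As Capture In m.Groups("word").Captures
                results.Add(c.Value.Trim())
            Next
        End If
        Return results
    End Function

    Sub Main()
        Dim m As Match = Regex.Match("one two three", "(?<word>\w+\s*)+")
        ' The Group itself exposes only the last captured substring: "three"
        Console.WriteLine(m.Groups("word").Value)
        ' The Captures collection holds every captured substring: one, two, three
        For Each s As String In GetCaptures("one two three")
            Console.WriteLine(s)
        Next
    End Sub
End Module
```

In the link-extraction code, the "link" group is not repeated by a quantifier, so each match has a single capture and reading the Group value directly is sufficient.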
Instances of Group are returned by indexing the GroupCollection object returned by the Groups property. The indexer can be a group number, or the name of a capture group if the "(?<groupname>...)" grouping construct is used. For example, in C# code you can use Match.Groups[groupnum] or Match.Groups["groupname"], and in Visual Basic code you can use Match.Groups(groupnum) or Match.Groups("groupname").
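As a short sketch of both indexer forms in Visual Basic, the following reads the same group by name and by number. The module name GroupIndexDemo, the helpers LinkByName and LinkByNumber, and the sample input href="page.html" are hypothetical and used only for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module GroupIndexDemo
    ' Illustrative pattern: captures the quoted href value in the named group "link"
    Private Const Pattern As String = "href\s*=\s*""(?<link>[^""]*)"""

    ' Hypothetical helper: reads the captured value by group name
    Function LinkByName(ByVal source As String) As String
        Return Regex.Match(source, Pattern).Groups("link").Value
    End Function

    ' Hypothetical helper: reads the captured value by group number.
    ' The pattern has no unnamed capturing groups, so "link" is group 1.
    Function LinkByNumber(ByVal source As String) As String
        Return Regex.Match(source, Pattern).Groups(1).Value
    End Function

    Sub Main()
        Dim sample As String = "href=""page.html"""
        ' Group 0 is always the entire match: href="page.html"
        Console.WriteLine(Regex.Match(sample, Pattern).Groups(0).Value)
        ' Both calls print page.html
        Console.WriteLine(LinkByName(sample))
        Console.WriteLine(LinkByNumber(sample))
    End Sub
End Module
```

Indexing by name is usually easier to maintain, because group numbers shift when capturing groups are added to or removed from the pattern.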
In the above example, (?<link>[^""]*) stores the text matched by the [^""]* pattern in the group 'link', which can be accessed with Groups("link").