How to parse a webpage in C#? Text Parsing in C#
Posted on 4/3/2007 1:50:55 PM
in #C# .NET
We all know how to remove/replace a string in C#, for example:
string myString = "Yousef (programmer) likes to write in C#";
To remove "(programmer)", we can write:
myString = myString.Replace( "(programmer)", "");
But what if you have a text file with the following strings:
"Yousef (programmer) likes to write in C#"
"John (musician) likes to play guitar"
"Anna (flower shop attendant) likes to arrange flowers"
How do you remove what's between the opening and closing parentheses, or for that matter what's between any begin and end tag (HTML tags, for example)? For that, it would be nice to have a function like:
string RemoveBetween(string strBegin, string strEnd, string strSource);
Finding the string between the parentheses can be done easily using the function GetStringInBetween discussed in this post:
http://www.mycsharpcorner.com/Post.aspx?postID=15
Then you can pass the result to the Replace function and you are done.
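Since GetStringInBetween lives in the linked post, here is a minimal sketch of what such a helper might look like. The class name TextParsingSketch, the parameter names includeBegin/includeEnd, and the meaning of the returned array (result[0] is the text found, result[1] is the remainder after the end string) are assumptions inferred from how the function is called in this post, not the original implementation.

```csharp
using System;

public static class TextParsingSketch
{
    // Hypothetical stand-in for GetStringInBetween from the linked post.
    // Assumed contract: result[0] holds the text between strBegin and strEnd
    // (optionally including the delimiters), or "" when either one is missing;
    // result[1] holds whatever follows the end delimiter.
    public static string[] GetStringInBetween(string strBegin, string strEnd,
        string strSource, bool includeBegin, bool includeEnd)
    {
        string[] result = { "", "" };

        int beginIndex = strSource.IndexOf(strBegin, StringComparison.Ordinal);
        if (beginIndex < 0)
        {
            return result; // begin delimiter not found
        }

        // Search for the end delimiter only after the begin delimiter.
        int contentStart = beginIndex + strBegin.Length;
        int endIndex = strSource.IndexOf(strEnd, contentStart, StringComparison.Ordinal);
        if (endIndex < 0)
        {
            return result; // end delimiter not found
        }

        int from = includeBegin ? beginIndex : contentStart;
        int to = includeEnd ? endIndex + strEnd.Length : endIndex;

        result[0] = strSource.Substring(from, to - from);
        result[1] = strSource.Substring(endIndex + strEnd.Length);
        return result;
    }
}
```

Under these assumptions, calling it with "(" and ")" on the first sample string and both flags set to true yields "(programmer)" in result[0]; with both flags false it yields "programmer".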
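Here is the implementation for RemoveBetween with the default behavior, which removes only the text between the begin and end strings and leaves the strings themselves in place: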
public static string RemoveBetween(string strBegin, string strEnd, string strSource)
{
    // default behavior
    return RemoveBetween(strBegin, strEnd, strSource, false, false);
}
Here is the overload of RemoveBetween that lets you choose whether the begin and end strings are removed along with the text between them:
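public static string RemoveBetween(string strBegin, string strEnd, string strSource, bool removeBegin, bool removeEnd)
{
    // result[0] is the text found between (and optionally including) the begin and end strings
    string[] result = GetStringInBetween(strBegin, strEnd, strSource, removeBegin, removeEnd);
    // Replace throws on a null or empty search string, so guard against both
    if (!string.IsNullOrEmpty(result[0]))
    {
        // note: Replace removes every occurrence of the found text, not just the first
        return strSource.Replace(result[0], "");
    }
    // nothing found between begin & end
    return strSource;
}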
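RemoveBetween only looks up the first begin/end pair in the source. If a string contains several pairs, one alternative is a regular expression. The sketch below is not part of the original code; the class name RemoveAllSketch and the method name RemoveAllBetween are made up for illustration, and it always removes the begin and end strings together with what is between them.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RemoveAllSketch
{
    // Removes every begin...end span, delimiters included, from strSource.
    public static string RemoveAllBetween(string strBegin, string strEnd, string strSource)
    {
        // Regex.Escape keeps characters such as "(" from being read as regex syntax.
        // ".*?" is a lazy match, so each span stops at the nearest end delimiter.
        string pattern = Regex.Escape(strBegin) + ".*?" + Regex.Escape(strEnd);

        // Singleline lets "." match newlines, so a span may cross line breaks.
        return Regex.Replace(strSource, pattern, "", RegexOptions.Singleline);
    }
}
```

Called with "(" and ")" on the first sample line, this returns "Yousef  likes to write in C#". Note the two spaces left behind where "(programmer)" used to be; collapse them afterwards if that matters for your text.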
Hope this helps... Happy programming!