Simple C# Tokenizer using Regex
I'm looking to tokenize some really simple strings, but I'm having trouble getting the Regex right.
The lines might look like this:
string1 = "{[Surname]}, some text... {[FirstName]}"
string2 = "{Item}foo.{Item2}bar"
And I want to extract the tokens in curly braces, so string1 gets "{[Surname]}" and "{[FirstName]}", and string2 gets "{Item}" and "{Item2}".
So basically there are two different types of tokens that I want to extract: {[Foo]} and {Bar}.
This question is not bad, but I can't get the regex right: "Poor man's lexer for C#". Thanks for the help!
3 answers
Both are good answers, guys, thanks. Here's what I ended up with:
// Requires: using System.Text.RegularExpressions;
// DataToken = {[foo]}
// FieldToken = {Bar}
string pattern = @"(?<DataToken>\{\[\w+\]\})|(?<FieldToken>\{\w+\})";
MatchCollection matches = Regex.Matches(expression.ExpressionString, pattern,
    RegexOptions.ExplicitCapture);
string fieldToken = string.Empty;
string dataToken = string.Empty;
foreach (Match m in matches)
{
    // Note that EITHER FieldToken OR DataToken will have a value in each iteration;
    // the group that did not take part in the match yields an empty string.
    fieldToken = m.Groups["FieldToken"].Value;
    dataToken = m.Groups["DataToken"].Value;
    if (!string.IsNullOrEmpty(dataToken))
    {
        // Do something
    }
    if (!string.IsNullOrEmpty(fieldToken))
    {
        // Do something else
    }
}
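For anyone who wants to try this against the two sample strings from the question, here is a minimal self-contained sketch of the same pattern. The class name TokenDemo and the helper Tokenize are made up for illustration, and it checks Group.Success instead of comparing against an empty string, which is just an alternative way to see which branch matched, not what the snippet above does.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TokenDemo
{
    // Same pattern as above: DataToken = {[foo]}, FieldToken = {Bar}
    private static readonly Regex TokenRegex = new Regex(
        @"(?<DataToken>\{\[\w+\]\})|(?<FieldToken>\{\w+\})",
        RegexOptions.ExplicitCapture);

    // Hypothetical helper: returns each token paired with the name of the group that matched.
    public static List<KeyValuePair<string, string>> Tokenize(string input)
    {
        var tokens = new List<KeyValuePair<string, string>>();
        foreach (Match m in TokenRegex.Matches(input))
        {
            // Exactly one of the two alternatives takes part in each match.
            string kind = m.Groups["DataToken"].Success ? "DataToken" : "FieldToken";
            tokens.Add(new KeyValuePair<string, string>(kind, m.Value));
        }
        return tokens;
    }

    public static void Main()
    {
        string[] samples =
        {
            "{[Surname]}, some text... {[FirstName]}",
            "{Item}foo.{Item2}bar"
        };
        foreach (string s in samples)
        {
            foreach (KeyValuePair<string, string> t in Tokenize(s))
            {
                Console.WriteLine(t.Key + ": " + t.Value);
            }
        }
    }
}
```

Since \w also matches digits, "{Item2}" is picked up as a FieldToken, and "{[Surname]}" can only match the DataToken branch because "[" is not a word character.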