Simple C# Tokenizer using Regex

I'm looking to tokenize some really simple strings, but I'm having trouble getting the regex right.

The lines might look like this:

string1 = "{[Surname]}, some text... {[FirstName]}"

string2 = "{Item}foo.{Item2}bar"

      

I want to extract the tokens in curly braces, so string1 yields "{[Surname]}" and "{[FirstName]}", and string2 yields "{Item}" and "{Item2}".

So basically there are two different types of tokens that I want to extract: {[Foo]} and {Bar}.

This question is pretty good, but I can't get the regex right: "Poor man's lexer for C#". Thanks for the help!



3 answers


Both are good answers, thanks. Here's what I ended up with:



// Requires: using System.Text.RegularExpressions;

// DataToken = {[foo]}
// FieldToken = {Bar}
string pattern = @"(?<DataToken>\{\[\w+\]\})|(?<FieldToken>\{\w+\})";

MatchCollection matches = Regex.Matches(expression.ExpressionString, pattern,
    RegexOptions.ExplicitCapture);

string fieldToken = string.Empty;
string dataToken = string.Empty;

foreach (Match m in matches)
{
    // Exactly one of the two named groups participates in each match;
    // the other group's Value is the empty string.
    fieldToken = m.Groups["FieldToken"].Value;
    dataToken = m.Groups["DataToken"].Value;

    if (!string.IsNullOrEmpty(dataToken))
    {
        // Do something
    }

    if (!string.IsNullOrEmpty(fieldToken))
    {
        // Do something else
    }
}

      



If the rules aren't very complicated, (?<Token>\{\[.+?\]\}) would work for the first string and (?<Token>\{.+?\}) for the second.
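
To make that concrete, here is a minimal sketch that runs the two lazy patterns against the sample strings from the question. The class name LazyTokenDemo and the helper Extract are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LazyTokenDemo
{
    // Hypothetical helper: returns the "Token" group of every match of pattern in input.
    public static List<string> Extract(string input, string pattern)
    {
        var tokens = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            tokens.Add(m.Groups["Token"].Value);
        }
        return tokens;
    }

    public static void Main()
    {
        string string1 = "{[Surname]}, some text... {[FirstName]}";
        string string2 = "{Item}foo.{Item2}bar";

        // The lazy quantifier .+? stops at the first closing delimiter,
        // so two tokens on the same line are matched separately.
        Console.WriteLine(string.Join(" ", Extract(string1, @"(?<Token>\{\[.+?\]\})")));
        Console.WriteLine(string.Join(" ", Extract(string2, @"(?<Token>\{.+?\})")));
    }
}
```

Note that the second pattern on its own also matches the {[Foo]} form, because .+? accepts the square brackets too. If the two kinds of token need to be told apart, they have to be distinguished some other way.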





How about this: (?<token>\{[^\}]*\})
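
A rough sketch of how this single pattern could be used for both kinds of token, assuming the kinds are told apart afterwards by checking for the "{[" prefix and "]}" suffix. That check, the class name BraceTokenDemo, and the "data"/"field" labels are my own assumptions based on the examples in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BraceTokenDemo
{
    // Matches "{", then any run of characters other than "}", then "}".
    private const string Pattern = @"(?<token>\{[^\}]*\})";

    // Hypothetical helper: labels each token "data" ({[Foo]}) or "field" ({Bar}).
    public static List<string> Classify(string input)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            string token = m.Groups["token"].Value;
            bool isData = token.StartsWith("{[", StringComparison.Ordinal)
                       && token.EndsWith("]}", StringComparison.Ordinal);
            result.Add((isData ? "data:" : "field:") + token);
        }
        return result;
    }

    public static void Main()
    {
        foreach (string line in Classify("{[Surname]}, some text... {[FirstName]}"))
            Console.WriteLine(line);
        foreach (string line in Classify("{Item}foo.{Item2}bar"))
            Console.WriteLine(line);
    }
}
```

Because [^\}]* allows zero characters, this pattern also matches an empty "{}". Changing * to + would reject that if empty tokens are not valid.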
