Regular expression to match the text between two delimiter characters at the nth position

I am trying to extract a piece of text that sits between two delimiter characters (_), where the piece I want may be the word at the nth position.

I currently have the following

!((?:.*?(_)){2})_(.+?)$

      

which I run against the following data

D20_Mbps_U10_Mbps_TC4_P

      

where I would expect to get

U10

      

but get nothing as the first part grabs

D20_Mbps_

      

and thus leaves nothing for the second part to capture

I tried

_\s*(.*?)(?=\s*_)

      

But that only gives me the first match, whereas I need the one at the nth position. Where can I put n, which is only known at runtime?

any ideas?

Thanks



1 answer


Let me try to answer this in detail.

If you want to match the Nth occurrence of a substring in a delimited string, you should really consider a split function such as String.Split. In your case, splitting on _ and picking the values you want is a trivial task.
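As a rough illustration of the split approach, here is a minimal Python sketch. The helper name nth_field and its 1-based n parameter are hypothetical, chosen only for this example; the sample string is the one from the question.

```python
def nth_field(text, n, delimiter="_"):
    """Return the n-th (1-based) delimiter-separated field, or None if there is none."""
    parts = text.split(delimiter)
    # Guard against n being out of range instead of raising IndexError.
    return parts[n - 1] if 1 <= n <= len(parts) else None


print(nth_field("D20_Mbps_U10_Mbps_TC4_P", 3))  # U10
```

Because n is an ordinary argument here, it can be supplied at runtime without touching any pattern.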

If you cannot use programming tools to retrieve this value, you can only do it with a limiting quantifier, grouping and capturing (in Java and .NET, you can achieve the same even without capturing).

So, the basic idea is to match 0 or more characters other than your delimiter, then match the delimiter itself, and repeat that whole unit N-1 times. After that, just capture the following run of non-delimiter characters. For the third element (N = 3), the unit is repeated twice:

^(?:[^_]*_){2}([^_]*)

      



Group 1 will contain U10.
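To supply n at runtime, the pattern can be assembled as a string before it is compiled. The following Python sketch assumes a single-character delimiter; the function name nth_field_regex is hypothetical and only serves this example.

```python
import re


def nth_field_regex(text, n, delimiter="_"):
    """Return the n-th (1-based) field using the ^(?:[^_]*_){n-1}([^_]*) idea.

    Assumes a single-character delimiter, since it is placed inside a
    character class. Returns None when the string has fewer than n fields.
    """
    d = re.escape(delimiter)
    # Skip n-1 "field + delimiter" units, then capture the next field.
    pattern = "^(?:[^" + d + "]*" + d + "){" + str(n - 1) + "}([^" + d + "]*)"
    match = re.match(pattern, text)
    return match.group(1) if match else None


print(nth_field_regex("D20_Mbps_U10_Mbps_TC4_P", 3))  # U10
```

With n = 3 the assembled pattern is exactly the one shown above; with n = 1 the repetition count becomes {0} and the first field is captured.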

Or another option :

^(?:[^_]*_){2}([^_]*)_(.+)$

      

This will grab the third _-delimited element into group 1. Group 2 in this case holds the fourth and all later elements, i.e. the rest of the line to the end.
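A short Python check of the two-group variant, using the sample string from the question (the variable names are only illustrative):

```python
import re

# Group 1: third field; group 2: everything after the next delimiter.
pattern = re.compile(r"^(?:[^_]*_){2}([^_]*)_(.+)$")

m = pattern.match("D20_Mbps_U10_Mbps_TC4_P")
third, rest = m.group(1), m.group(2)
print(third)  # U10
print(rest)   # Mbps_TC4_P

# The pattern requires at least four fields, so a three-field string fails.
no_match = pattern.match("a_b_c")
print(no_match)  # None
```

Note that this variant only matches when something follows the third element, so the first pattern is the safer choice if the third element may be the last one.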

Note that in some regex flavors { and ( must be escaped (vim, sed in basic, non-extended regex mode, etc.).
