Regular expression to capture the text between delimiters at the nth position
I am trying to extract the text that sits between two delimiter characters (_), where the text I want is the word at the nth position.
I currently have the following
!((?:.*?(_)){2})_(.+?)$
which I run against the following data
D20_Mbps_U10_Mbps_TC4_P
where I would expect to get
U10
but get nothing as the first part grabs
D20_Mbps_
and thus leaves nothing for the second part to capture
I tried
_\s*(.*?)(?=\s*_)
But that only gives me the first match, whereas I need the one at the nth position. Where can I put n so that it can be set at runtime?
Any ideas?
Thanks
Let me try to answer this in detail.
If you want to match the Nth occurrence of a substring in a delimited string, you should really consider a function such as String.Split. In your case, splitting on _ and picking the value you want is a trivial task.
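As a rough illustration of the splitting approach, here is a minimal Python sketch. The helper name nth_field and its 1-based n parameter are made up for this example and are not part of the question; Python's str.split stands in for whatever String.Split equivalent your language offers.

```python
def nth_field(text: str, n: int, delimiter: str = "_") -> str:
    """Return the n-th (1-based) delimiter-separated field of text.

    nth_field is a hypothetical helper written for this example.
    """
    fields = text.split(delimiter)
    if not 1 <= n <= len(fields):
        raise IndexError(f"text has {len(fields)} fields, cannot take field {n}")
    return fields[n - 1]


print(nth_field("D20_Mbps_U10_Mbps_TC4_P", 3))  # U10
```

Because n is an ordinary argument here, it can be chosen at runtime without touching any pattern.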
If you cannot use programming tools to retrieve this value, you can only do so with a limiting quantifier, grouping and capturing (in Java and .NET, you can achieve the same even without capturing).
So, the basic idea is to match 0 or more characters other than your delimiter, then match the delimiter itself, and repeat that N-1 times. After that, just capture the following run of characters that are not the delimiter. For the third element (N = 3) this gives:
^(?:[^_]*_){2}([^_]*)
Group 1 will contain U10.
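To put n in at runtime, one option is to build the pattern string from n before compiling it. The sketch below does this in Python; the function name nth_field_regex is hypothetical, and it assumes a single-character delimiter so that the delimiter can be placed inside a character class.

```python
import re


def nth_field_regex(text, n, delimiter="_"):
    """Return the n-th (1-based) field using ^(?:[^_]*_){n-1}([^_]*).

    Hypothetical helper; assumes delimiter is a single character.
    Returns None when the text has fewer than n fields.
    """
    d = re.escape(delimiter)
    # Skip n-1 "field + delimiter" pairs, then capture the next field.
    pattern = "^(?:[^" + d + "]*" + d + "){" + str(n - 1) + "}([^" + d + "]*)"
    match = re.match(pattern, text)
    return match.group(1) if match else None


print(nth_field_regex("D20_Mbps_U10_Mbps_TC4_P", 3))  # U10
```

With n = 3 the generated pattern is the same as the one above, with {2} as the quantifier.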
Or another option :
^(?:[^_]*_){2}([^_]*)_(.+)$
This will grab the third _-delimited element into group 1. Group 2 in this case holds everything from the fourth element to the end of the line.
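For reference, this is how the two groups of that second pattern come out on the sample data, sketched in Python. The function name third_and_rest is invented for the example. Note that this variant requires a delimiter after the third element, so a string with only three elements does not match.

```python
import re

# Third element in group 1, everything after the next "_" in group 2.
PATTERN = re.compile(r"^(?:[^_]*_){2}([^_]*)_(.+)$")


def third_and_rest(text):
    """Return (third element, rest of line), or None if there is no match."""
    match = PATTERN.match(text)
    return match.groups() if match else None


print(third_and_rest("D20_Mbps_U10_Mbps_TC4_P"))  # ('U10', 'Mbps_TC4_P')
print(third_and_rest("a_b_c"))                    # None
```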
Note that in some regex flavors { and ( must be escaped as \{ and \( (vim, sed without extended regular expressions, etc.).