Regular expression to match one of the two video IDs in a Google Video URL

I need to grab a video ID from a Google Video URL. There are two types of URL that I need to handle:

http://video.google.com/videoplay?docid=-3498228245415745977#

where I need to match -3498228245415745977 (note the dash), and

video.google.com/videoplay?docid=-3498228245415745977#docid=2728972720932273543

where I need to match 2728972720932273543.

Is there a good regex that can match this?

This is what I have so far: @"docid=(-?\d{19}+)"

as the video ID appears to be 19 digits long, not counting the leading dash when there is one.

I am using C# (with which I have very little experience), in case that changes anything.

P.S. I would also appreciate a review of my regexes for YouTube ( @"[\?&]v=([^&#])"; ), RedTube ( @"/(\d{1,6})" ) and Vimeo ( @"/(\d*)" ).

I do not expect users to enter the full URL, so I am not matching ^http://\\.?sitename+\\.\\w{2,3} at the start.

+2




2 answers


The following regex uses what is called a negative lookahead to ensure that the matched ID is not immediately followed by #docid=:

docid=(-?\d{19}(?!\#docid=))

      

(?!\#docid=)

is the negative lookahead part of the regex: the match only succeeds when the text right after the 19 digits is not #docid=. If you want to know more about this, take a look at http://www.regular-expressions.info/lookaround.html
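To make the lookahead's effect concrete, here is a minimal sketch that runs the pattern against the two URLs from the question. It is written in Python purely for illustration (the question is about C#, and I have not run it there); the helper name extract_docid and the variable names are made up for this example.

```python
import re

# Pattern from the answer above: optional dash, exactly 19 digits,
# and a negative lookahead that rejects an ID directly followed by "#docid=".
DOCID_RE = re.compile(r"docid=(-?\d{19}(?!#docid=))")


def extract_docid(url):
    """Return the captured video ID, or None if the pattern does not match.

    Hypothetical helper for this example only.
    """
    match = DOCID_RE.search(url)
    return match.group(1) if match else None


# The two URL shapes given in the question.
single = "http://video.google.com/videoplay?docid=-3498228245415745977#"
double = "video.google.com/videoplay?docid=-3498228245415745977#docid=2728972720932273543"

print(extract_docid(single))  # -3498228245415745977
print(extract_docid(double))  # 2728972720932273543
```

In the second URL the first ID is skipped because it is followed by #docid=, so the search moves on and captures the ID after the hash.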



Hope this helps.

EDIT: If you haven't got it yet, you should get "The Regulator 2.0" from SourceForge. It's a design and testing tool for regular expressions. I find it very helpful when I design regular expressions.

+2




Use this regex:

docid=-([0-9]*)

      

Result



Array
(
    [0] => docid=-3498228245415745977
    [1] => 3498228245415745977
)

      

I've tested it in Java, PHP, awk and Perl.
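As a rough check of the result shown above, here is a minimal sketch in Python (used only for illustration; it is not one of the languages listed in this answer, and the variable names are made up for the example).

```python
import re

# Pattern from this answer: a literal dash after "docid=",
# then any run of digits captured in group 1.
DASHED_RE = re.compile(r"docid=-([0-9]*)")

url = "http://video.google.com/videoplay?docid=-3498228245415745977#"
found = DASHED_RE.search(url)

print(found.group(0))  # docid=-3498228245415745977
print(found.group(1))  # 3498228245415745977

# A docid without a leading dash does not match, because the dash is required.
undashed = DASHED_RE.search("video.google.com/videoplay?docid=2728972720932273543")
print(undashed)  # None
```

Note that the dash sits outside the capture group, so group 1 holds the digits only.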

0

