Thursday, October 3, 2013

Python regexp multiple expressions with grouping

I’m trying to match the output a modem returns when asked about the network info. It looks like this:

Network survey started...
For BCCH-Carrier:
arfcn: 15,bsic: 4,dBm: -68
For non BCCH-Carrier:
arfcn: 10,dBm: -72
arfcn: 6,dBm: -78
arfcn: 11,dBm: -81
arfcn: 14,dBm: -83
arfcn: 16,dBm: -83

So I have two types of expressions to match, the BCCH and the non BCCH. The following code is almost working:

match = re.findall(r'(?:arfcn: (\d*),dBm: (-\d*))|(?:arfcn: (\d*),bsic: (\d*),dBm: (-\d*))', data)

But it seems that both expressions contribute fields to every result, and the fields that were not found are left blank:

>>> match
[('', '', '15', '4', '-68'), ('10', '-72', '', '', ''), ('6', '-78', '', '', ''), ('11', '-81', '', '', ''), ('14', '-83', '', '', ''), ('16', '-83', '', '', '')]

Can anyone help? Why does it behave this way? I’ve tried changing the order of the expressions, with no luck.

Thanks!

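That is how capturing groups work with re.findall: each match comes back as a tuple with one slot per capturing group in the whole pattern, whether or not that group took part in the match. Since you have five groups, there will always be five parts returned, and the groups from the alternative that did not match come back as empty strings.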
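
To make the five-slot behaviour concrete, here is a minimal sketch that runs the pattern from the question on a shortened sample. The names `rows` and `compact` are only illustrative, and dropping the empty strings afterwards is just one possible workaround, not something the question or answer prescribes.

```python
import re

# Shortened sample in the same shape as the modem output in the question.
data = "arfcn: 15,bsic: 4,dBm: -68 arfcn: 10,dBm: -72"

# The original two-alternative pattern: five capturing groups in total.
pattern = r'(?:arfcn: (\d*),dBm: (-\d*))|(?:arfcn: (\d*),bsic: (\d*),dBm: (-\d*))'
rows = re.findall(pattern, data)

# Every tuple has five slots; groups that did not participate are ''.
print(rows)
# [('', '', '15', '4', '-68'), ('10', '-72', '', '', '')]

# Illustrative workaround: drop the empty slots after the fact.
compact = [tuple(field for field in row if field) for row in rows]
print(compact)
# [('15', '4', '-68'), ('10', '-72')]
```

Filtering like this loses the fixed column positions, which is why restructuring the pattern itself is usually the cleaner route.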

Based on your data, I think you could simplify your regex by making the bsic part optional. That way each row would return three parts, the middle one being empty for non BCCH-Carriers.

match = re.findall(r'arfcn: (\d*)(?:,bsic: (\d*))?,dBm: (-\d*)', data)
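
As a rough check, the sketch below applies the single-pattern approach to the sample output from the question (joined onto one line here). Two things are my own assumptions rather than part of the answer: it uses `\d+` instead of `\d*` so that a field can never match as an empty number, and the `cells` list with its dictionary keys is a hypothetical way to convert the strings to integers.

```python
import re

# Sample output from the question, written as one continuous string.
data = (
    "Network survey started..."
    "For BCCH-Carrier:arfcn: 15,bsic: 4,dBm: -68"
    "For non BCCH-Carrier:arfcn: 10,dBm: -72arfcn: 6,dBm: -78"
    "arfcn: 11,dBm: -81arfcn: 14,dBm: -83arfcn: 16,dBm: -83"
)

# One pattern, three capturing groups; the bsic part is optional.
# Assumption: \d+ replaces the answer's \d* to require at least one digit.
pattern = re.compile(r'arfcn: (\d+)(?:,bsic: (\d+))?,dBm: (-\d+)')

rows = pattern.findall(data)
print(rows)
# [('15', '4', '-68'), ('10', '', '-72'), ('6', '', '-78'),
#  ('11', '', '-81'), ('14', '', '-83'), ('16', '', '-83')]

# Hypothetical post-processing: integers, with None where bsic is absent.
cells = [
    {"arfcn": int(arfcn), "bsic": int(bsic) if bsic else None, "dBm": int(dbm)}
    for arfcn, bsic, dbm in rows
]
print(cells[0])  # {'arfcn': 15, 'bsic': 4, 'dBm': -68}
```

A row with an empty middle field is a non BCCH carrier, so the two cases can still be told apart without a second alternative in the pattern.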
