regex - Retrieving text from Pattern1 to Pattern2 - Python
I have the input file below:
pattern1
ptr1 blah blah blah
needthis blah blah blah
thisoneaswell blah blah blah
pattern2
pattern1
ptr2 blah blah blah
needthis blah blah blah
thisoneaswell blah blah blah
pattern2
............................
............................
pattern1
ptrn blah blah
needthis blah blah blah
thisoneaswell blah blah blah
pattern2
I need a function that returns the first-column entries between pattern1 and pattern2, as below:
ptr1
needthis
thisoneaswell
ptr2
needthis
thisoneaswell
......................
......................
ptrn
needthis
thisoneaswell
ptr1, ptr2, ..., ptrn are each different texts. pattern1 and pattern2 are different from each other and are consistently present in the file.
How can I achieve this in Python?
I am still a beginner in Python. I am trying to achieve this with re.findall(), but I am not getting the desired output:
def retrieve():
    file = open("filename","r")
    string = re.findall(r"pattern1",file.read())
    print string
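You can nest two regexes: the outer one captures each block between pattern1 and pattern2, and the inner one picks the first word of every line in that block: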
txt = '''\
pattern1
ptr1 blah blah blah
needthis1 blah blah blah
thisoneaswell1 blah blah blah
pattern2
pattern1
ptr2 blah blah blah
needthis2 blah blah blah
thisoneaswell2 blah blah blah
pattern2
............................
............................
pattern1
ptrn blah blah
needthisn blah blah blah
thisoneaswelln blah blah blah
pattern2'''

import re

# Outer regex: lazily capture everything from a line starting with pattern1
# up to (but not including) the next line starting with pattern2.
for m in re.finditer(r'^pattern1\s*(.*?)(?=^pattern2)', txt, re.M | re.S):
    # Inner regex: the first word of each line inside the captured block.
    print(re.findall(r'(^\w+)', m.group(1), re.M))
Prints:
['ptr1', 'needthis1', 'thisoneaswell1']
['ptr2', 'needthis2', 'thisoneaswell2']
['ptrn', 'needthisn', 'thisoneaswelln']
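Since the question asks for a function, the nested-regex approach above could be wrapped roughly as follows. This is only a sketch: the name `first_columns` and its `start`/`end` parameters are hypothetical, and it assumes each marker begins a line, as in the sample input.

```python
import re


def first_columns(text, start='pattern1', end='pattern2'):
    """Return one list of first-column words per start..end block in text."""
    # re.escape keeps the markers literal even if they contain regex
    # metacharacters (an assumption; the sample markers are plain words).
    block_re = re.compile(
        r'^{0}\s*(.*?)(?=^{1})'.format(re.escape(start), re.escape(end)),
        re.M | re.S,
    )
    # For each captured block, collect the first word of every line.
    return [re.findall(r'^\w+', m.group(1), re.M)
            for m in block_re.finditer(text)]


sample = '''\
pattern1
ptr1 blah blah blah
needthis1 blah blah blah
thisoneaswell1 blah blah blah
pattern2
pattern1
ptr2 blah blah blah
needthis2 blah blah blah
thisoneaswell2 blah blah blah
pattern2'''

for words in first_columns(sample):
    print(words)
```

Returning a list of lists instead of printing leaves the caller free to join or format the entries however it likes.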
Edit 1
If you are using a file that fits in memory:
# fn is the path to the input file
with open(fn) as f:
    txt = f.read()

for m in re.finditer(r'^pattern1\s*(.*?)(?=^pattern2)', txt, re.M | re.S):
    print(re.findall(r'(^\w+)', m.group(1), re.M))
Use mmap for larger files that won't fit in memory.
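A minimal sketch of the mmap variant, assuming Python 3 and ASCII markers. The function name `first_columns_mmap` is hypothetical. Because a memory-mapped file exposes bytes rather than str, the patterns are written as bytes patterns and the matched words are decoded afterwards.

```python
import mmap
import re

# Bytes patterns are needed because an mmap object holds bytes, not str.
BLOCK_RE = re.compile(rb'^pattern1\s*(.*?)(?=^pattern2)', re.M | re.S)
FIRST_WORD_RE = re.compile(rb'^\w+', re.M)


def first_columns_mmap(path):
    """Return one list of first-column words per pattern1..pattern2 block."""
    results = []
    with open(path, 'rb') as f:
        # Length 0 maps the whole file; mmap raises ValueError on an empty file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for m in BLOCK_RE.finditer(mm):
                words = FIRST_WORD_RE.findall(m.group(1))
                # \w in a bytes pattern matches ASCII only, so this decode is safe.
                results.append([w.decode('ascii') for w in words])
    return results
```

The regexes are the same as in the in-memory version. Only the file access changes, so the operating system pages the data in as the search advances instead of the whole file being read up front.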
Edit 2
Just join each block's matches into a string and append it to a results list:
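# fn is the path to the input file
with open(fn) as f:
    txt = f.read()

results = []
for m in re.finditer(r'^pattern1\s*(.*?)(?=^pattern2)', txt, re.M | re.S):
    # One newline-joined string per pattern1..pattern2 block.
    results.append('\n'.join(re.findall(r'(^\w+)', m.group(1), re.M)))

print('\n===\n'.join(results))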