Python: find the max value and print the first 5 lines of the file


I am trying to write a program where I have a directory containing a list of text files. If I find "color=" in a file, I compute the fuzzy value between the 'filename' and the 'starting line of the file'.


I need to find the maximum fuzzy value, and then print the first 5 lines of the file that has that maximum value.

I wrote code that can find the fuzzy value, but I don't know how to find the max value and print the first 5 lines of the file having the maximum fuzzy value. Please help!

    import os
    from fuzzywuzzy import fuzz

    path = r'c:\python27'

    data = {}

    for dir_entry in os.listdir(path):
        dir_entry_path = os.path.join(path, dir_entry)
        if os.path.isfile(dir_entry_path):
            with open(dir_entry_path, 'r') as my_file:
                for line in my_file:
                    for part in line.split():
                        if "color=" in part:
                            print(part)
                            string1= "filename:", dir_entry_path
                            print(string1)
                            string2= "start line of file:", list(my_file)[0]
                            print(string1)
                            string3=(fuzz.ratio(string1, string2))
                            print(string3)

and the output looks like this:

    "color="
    ('filename:', 'c:\\python27\\maybeee.py')
    ('filename:', 'c:\\python27\\maybeee.py')
    20
    "color="
    ('filename:', 'c:\\python27\\mayp.py')
    ('filename:', 'c:\\python27\\mayp.py')
    28
    part.startswith('color='):
    ('filename:', 'c:\\python27\\mayp1.py')
    ('filename:', 'c:\\python27\\mayp1.py')
    29

As for the output I need: in the example here the max value is 29, so I need to print the first 5 lines of the file that has that max value. Please help! Answers are appreciated.

Your code attempts to reread the entire file again (at list(my_file)[0]) while there is already an iterator going over it. That is troublesome.
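
To illustrate why that is troublesome, here is a minimal sketch that uses an in-memory io.StringIO object as a stand-in for the open file (the three sample lines are made up for the demonstration). Calling list(my_file) inside the loop drains whatever the loop has not read yet, so index 0 is not the first line of the file and the outer loop stops early.

```python
import io

# In-memory stand-in for an open text file (hypothetical contents).
# The u prefix keeps this working on Python 2 as well.
my_file = io.StringIO(u"line 1\nline 2\nline 3\n")

seen = []
for line in my_file:
    seen.append(line)
    # list(my_file) consumes every line the loop has not reached yet,
    # so the loop has nothing left to iterate over afterwards.
    rest = list(my_file)

# The "first" element is really the line after the current one.
first_of_rest = rest[0]
```

Here the loop body runs only once, and first_of_rest holds the second line rather than the first.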

It is better to store the first 5 lines (this is what you're asking for, yes?) in a variable, and print them when the condition matches.
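
As one possible way to grab those lines up front, the sketch below uses itertools.islice from the standard library. The helper name first_lines and its count parameter are hypothetical, chosen only for this example.

```python
from itertools import islice


def first_lines(path, count=5):
    """Return up to `count` leading lines of the file at `path`."""
    with open(path, 'r') as handle:
        # islice stops after `count` lines, so large files are not read fully.
        return list(islice(handle, count))
```

A file shorter than count lines simply yields fewer lines, so no extra length check is needed.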

Also, you're printing string1 twice.

Try changing the loop to:

    from collections import defaultdict

    filenames2fuzz = defaultdict(list)

    for dir_entry in os.listdir(path):
        dir_entry_path = os.path.join(path, dir_entry)
        if os.path.isfile(dir_entry_path):
            first5lines = []
            condition_matched_in_file = False
            with open(dir_entry_path, 'r') as my_file:
                for line_nbr, line in enumerate(my_file):
                    if line_nbr < 5:
                        first5lines.append(line)
                    for part in line.split():
                        if "color=" in part:
                            print(part)
                            string1 = "filename:", dir_entry_path
                            print(string1)
                            condition_matched_in_file = True
                            fuzziness = fuzz.ratio(string1, first5lines[0])
                            filenames2fuzz[dir_entry_path].append(fuzziness)
                            print(fuzziness)
            if condition_matched_in_file:
                print('\n'.join(first5lines))

    # Now that we have a dictionary that holds the filenames and their
    # fuzziness values, we can find the first 5 lines again
    # of the file that has the best fuzziness value.

    best_fuzziness_ratio = 0  # as far as I can tell, the docs indicate it is between 0 and 100
    for k, v in filenames2fuzz.items():
        if max(v) > best_fuzziness_ratio:
            best_fuzzy_file = k
            best_fuzziness_ratio = max(v)
    print('file {} has the highest fuzzy value '
          'of {}. \nthe first 5 lines are:\n'
          ''.format(best_fuzzy_file, best_fuzziness_ratio))
    with open(best_fuzzy_file) as f:
        for _ in range(5):
            print(f.readline())
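
The search for the best entry can also be written with the built-in max and a key function. This is only a sketch of an alternative, not a required change: the helper best_file is hypothetical, and the sample dictionary reuses the scores 20, 28 and 29 from the question's output plus one made-up extra value to show a file with several matches.

```python
def best_file(filenames2fuzz):
    """Return (filename, score) for the highest score, or None if empty."""
    if not filenames2fuzz:
        return None
    # Rank each file by the largest fuzziness value recorded for it.
    best = max(filenames2fuzz, key=lambda name: max(filenames2fuzz[name]))
    return best, max(filenames2fuzz[best])


# Hypothetical scores, shaped like the filenames2fuzz dictionary above.
scores = {'maybeee.py': [20], 'mayp.py': [28, 12], 'mayp1.py': [29]}
result = best_file(scores)
```

Returning None for an empty dictionary avoids the case where no file matched and there is no best file to open.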

There are a few more optimizations possible (have a look at os.walk), but without a better explanation of the problem (give details on the files you're looping over, and list parts of their contents), this is the best I can do.
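
For reference, a minimal os.walk sketch could look like the following. The generator name iter_files is hypothetical. Note that os.walk also descends into subdirectories, which os.listdir does not, so this is only appropriate if nested files should be scanned as well.

```python
import os


def iter_files(root):
    """Yield the full path of every regular file under `root`."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Each yielded path can be opened directly, which removes the need for the os.path.isfile check in the loop.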

