Apr 12 2011
Grepping through log files for lines that match a timestamp is fiddly. It’s hard to catch multi-line entries (e.g. stack traces) and to craft a regex that captures an exact time range. I wrote a little Python script to simplify the process.
Usage:
./by_time.py 9:40 9:44:15 < example.log
Code below is maintained on GitHub.
#!/usr/bin/env python
# Selects time range from a log file. Lines with no time (e.g. stack traces)
# are presumed to have occurred at the time of the preceding line.
#
# Assumes first time-like phrase on a line is the timestamp for that line.
#
# Assumes time format is pairs of digits separated by colons with optional , or
# . initiated suffix. E.g. HH:mm:ss,SSS, HH:mm, etc.
#
# Does not strip blank lines; just use awk 'NF>0' for that.

# Lets the print() calls below behave the same on Python 2 and Python 3.
from __future__ import print_function

import re
import sys

time_pattern = re.compile(r"(?:^|.*?\D)(\d{1,2}(?::\d{2})+(?:[,.]\d+)?)")
fields_pattern = re.compile(r"[:,.]")

if len(sys.argv) < 3:
    print("Please specify start and end times (e.g. %s 13:50 14:10:01,101)." % sys.argv[0], file=sys.stderr)
    sys.exit(1)

for item, index in [["start time", 1], ["end time", 2]]:
    if not time_pattern.match(sys.argv[index]):
        raise ValueError("Cannot parse %s: %s" % (item, sys.argv[index]))

# Times become lists of ints, e.g. "9:44:15" -> [9, 44, 15], which compare field by field.
start, end = [[int(x) for x in fields_pattern.split(s)] for s in sys.argv[1:3]]

too_soon = True
try:
    for line in sys.stdin:
        line = line.strip()
        m = time_pattern.match(line)
        if m:
            t = [int(x) for x in fields_pattern.split(m.group(1))]
            if t >= end:
                break
            elif too_soon and t >= start:
                too_soon = False
        # Lines without a timestamp inherit the state set by the preceding timestamped line.
        if not too_soon:
            print(line)
except IOError:
    # Ignore I/O errors such as a closed pipe (e.g. when piping into head).
    pass
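The range check works because Python compares lists element by element, so a start time given as 9:40 (parsed to `[9, 40]`) sorts before a log time of 09:40:03 (parsed to `[9, 40, 3]`) without converting anything to seconds. Here is a rough sketch of the same logic wrapped in functions so it can be tried on a list of lines. The names `parse_time` and `select_range` and the sample log lines are made up for illustration and are not part of by_time.py.

```python
import re

# Same patterns as by_time.py, written as raw strings.
TIME_PATTERN = re.compile(r"(?:^|.*?\D)(\d{1,2}(?::\d{2})+(?:[,.]\d+)?)")
FIELDS_PATTERN = re.compile(r"[:,.]")


def parse_time(text):
    """Return the first time-like phrase in text as a list of ints, or None."""
    m = TIME_PATTERN.match(text)
    if m is None:
        return None
    return [int(x) for x in FIELDS_PATTERN.split(m.group(1))]


def select_range(lines, start, end):
    """Yield lines from the first timestamp >= start until the first >= end.

    Lines without a timestamp follow the state of the preceding line.
    """
    too_soon = True
    for line in lines:
        t = parse_time(line)
        if t is not None:
            if t >= end:
                break
            if too_soon and t >= start:
                too_soon = False
        if not too_soon:
            yield line


# Hypothetical log lines, invented for this example.
sample = [
    "09:39:58 INFO starting up",
    "09:40:03 ERROR request failed",
    "java.lang.IllegalStateException: boom",
    "    at com.example.Foo.bar(Foo.java:42)",
    "09:44:20 INFO recovered",
]

selected = list(select_range(sample, parse_time("9:40"), parse_time("9:44:15")))
for line in selected:
    print(line)
```

With these sample lines, the error entry and both lines of its stack trace are selected, while the entries at 09:39:58 and 09:44:20 fall outside the range. Unlike the script, this sketch leaves leading whitespace on each line untouched.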