Re: Sorting out of sequence log Apache log file
- From: Ray Van Dolson <rayvd@xxxxxxxxxxxxxxx>
- Date: Wed, 26 Apr 2006 12:43:20 -0700
On Wed, Apr 26, 2006 at 12:13:51PM -0700, Chris W. Parker wrote:
> Hello,
> I had a hiccup with syslog/apache/logrotate recently and as a result
> some of the Apache log files are out of sequence. This is bad because
> Webalizer no longer recognizes the out of sequence lines and my
> reporting results are skewed.
> Is there a command line util that will sort the records correctly? I've
> been looking around through Google without any luck so far.
By out of sequence, I assume you mean the timestamps are out of order?
The following quick Python hack works for me. Call it as follows:
% cat access_log | /path/to/sort_apache.py > sorted_log.log
With a huge logfile the script may give you some problems, because it
reads every line of the file into memory, sorts the lines by date and
time, and then prints them in the right order.
#!/usr/bin/python
#
# Simple script to sort an Apache log based on its time/date field.
#
# Ray Van Dolson <rayvd@xxxxxxxxxxxxxxx>
#
import re
import sys
from time import strptime, mktime

# Captures the date/time inside [...], e.g. 26/Apr/2006:12:43:20. The
# timezone offset that follows is matched but not used.
TS_RE = re.compile(
    r".*\[(\d\d/[A-Za-z]{3}/[0-9]{4}:\d{2}:\d{2}:\d{2}) .+?\].*")

def main():
    # Map each timestamp to the lines that carry it, in input order.
    line_dict = {}
    for buf in sys.stdin:
        t = TS_RE.match(buf)
        if t:
            ts = mktime(strptime(t.group(1), "%d/%b/%Y:%H:%M:%S"))
            if ts not in line_dict:
                line_dict[ts] = []
            line_dict[ts].append(t.group(0))
        # Lines without a recognizable timestamp are dropped.
    for entry in sorted(line_dict):
        for line in line_dict[entry]:
            print(line)

if __name__ == '__main__':
    main()
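
For anyone on Python 3, here is a minimal sketch of the same read-everything-then-sort idea. It is not the script above: it assumes every timestamp looks like [26/Apr/2006:12:43:20 -0700], it also takes the timezone offset into account, and the names TS_RE, timestamp_key and sort_log_lines are made up for illustration.

```python
import re
from datetime import datetime

# Assumed timestamp layout: [dd/Mon/yyyy:HH:MM:SS +zzzz]
TS_RE = re.compile(
    r"\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")


def timestamp_key(line):
    """Return the line's timestamp as an aware datetime, or None."""
    m = TS_RE.search(line)
    if m is None:
        return None
    return datetime.strptime(m.group(1), "%d/%b/%Y:%H:%M:%S %z")


def sort_log_lines(lines):
    """Return the lines that carry a timestamp, oldest first.

    Lines without a recognizable timestamp are dropped. The sort is
    stable, so lines with equal timestamps keep their input order.
    """
    stamped = [(timestamp_key(line), line) for line in lines]
    stamped = [(key, line) for key, line in stamped if key is not None]
    stamped.sort(key=lambda pair: pair[0])
    return [line for _, line in stamped]
```

To use it as a filter like the script above, something along the lines of sys.stdout.writelines(sort_log_lines(sys.stdin)) should do (after importing sys). It still holds the whole log in memory, so the caveat about huge files applies here too.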