Recently I had to parse an Apache error log and find all the images that were missing.
After referring to a couple of man pages, I came up with this one-liner. I am sure I will need it again, so I am noting it here so that I know where to look 😉
Feel free to use it in whatever way you want, if it solves your problem as well.
Assumption
This assumes that each 404 entry in your Apache error log contains the text “File does not exist” and ends with the path of the missing file.
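For illustration, entries in that shape might look like the ones below. This is only a sketch: the timestamps, client address, paths, and the file name sample_error.log are all made up, and the layout of your own log may differ.

```shell
# Create a small, made-up error log to experiment with (hypothetical data).
cat > sample_error.log <<'EOF'
[Mon Jan 02 10:15:01 2012] [error] [client 192.0.2.10] File does not exist: /var/www/html/images/logo.png
[Mon Jan 02 10:15:02 2012] [error] [client 192.0.2.10] File does not exist: /var/www/html/favicon.ico
[Mon Jan 02 10:15:03 2012] [notice] caught SIGTERM, shutting down
EOF
```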
Explanation
Filter all 404 lines from the log file
The first step is to filter all lines that contain the “File does not exist” text. This is done by using sed.
- By default, sed prints out all lines. This is prevented by the -n option.
- The second option is the regular expression followed by the p flag. This option prints out all lines which match the text.
- The third option is the name of the error log file.
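As a rough sketch of this step, the snippet below runs sed against a throwaway log file. The log contents are invented for the example; in practice you would pass the path to your own error log.

```shell
# Hypothetical sample log; replace "$log" with the path to your own error log.
log=$(mktemp)
cat > "$log" <<'EOF'
[Mon Jan 02 10:15:01 2012] [error] [client 192.0.2.10] File does not exist: /var/www/html/images/logo.png
[Mon Jan 02 10:15:02 2012] [notice] caught SIGTERM, shutting down
[Mon Jan 02 10:15:03 2012] [error] [client 192.0.2.10] File does not exist: /var/www/html/robots.txt
EOF

# -n suppresses the default output; /.../p prints only the matching lines.
matches=$(sed -n '/File does not exist/p' "$log")
printf '%s\n' "$matches"

rm -f "$log"
```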
Extract the last column of matching lines
The next step is to retrieve the file name from the matching lines. This is done by using awk.
- By default, awk uses whitespace as the delimiter and splits each line into columns. If you look at each line, we want the last column. NF is a special variable that holds the number of columns in the current line, so $NF refers to the last column. print $NF prints the last column.
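A minimal sketch of the awk step on a single made-up log line follows. Note that it assumes the path contains no spaces; a path with spaces would be split into several columns and only its last piece would be printed.

```shell
# A made-up log line (hypothetical data).
line='[Mon Jan 02 10:15:01 2012] [error] [client 192.0.2.10] File does not exist: /var/www/html/images/logo.png'

# NF is the number of whitespace-separated columns, so $NF is the last one.
last=$(printf '%s\n' "$line" | awk '{print $NF}')
printf '%s\n' "$last"
```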
Filter only images
The next step is to keep only the images. This is done again by using sed.
- I use -n again to prevent sed from printing all lines.
- The -r option is added so that we can use extended regular expressions.
- The regular expression (jpg|jpeg|png|gif)$ matches lines that end in an image extension, and the p at the end prints only the lines that match.
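The image filter can be tried on its own with a few invented paths, as sketched below. The -r flag is the GNU sed spelling; -E is the more portable way to ask for extended regular expressions, so swap it in if your sed rejects -r.

```shell
# Made-up list of paths, one per line (hypothetical data).
paths='/var/www/html/images/logo.png
/var/www/html/favicon.ico
/var/www/html/photos/cat.jpeg
/var/www/html/robots.txt'

# Keep only the lines that end in one of the listed image extensions.
images=$(printf '%s\n' "$paths" | sed -n -r '/(jpg|jpeg|png|gif)$/p')
printf '%s\n' "$images"
```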
Sort and find uniques
The sort and uniq commands sort the list and remove the duplicate lines.
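Here is a small sketch of this step with invented paths. The sort has to come first because uniq only collapses duplicates that are next to each other; sort -u would give the same result in a single command.

```shell
# Made-up list with a repeated entry (hypothetical data).
paths='/img/b.png
/img/a.gif
/img/b.png'

# Sort so that duplicates become adjacent, then drop the repeats.
unique=$(printf '%s\n' "$paths" | sort | uniq)
printf '%s\n' "$unique"
```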
Write to a file
The final output is written to a file by using the > redirection operator. If you want to append to an existing file instead, use the >> operator.
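Putting the steps above together, the whole pipeline can be sketched roughly as follows. This is a reconstruction from the steps described here, run against an invented log; the temporary file names are placeholders for your own error log and output file.

```shell
# Hypothetical input and output files; use your own paths in practice.
log=$(mktemp)
out=$(mktemp)
cat > "$log" <<'EOF'
[Mon Jan 02 10:15:01 2012] [error] [client 192.0.2.10] File does not exist: /var/www/html/images/logo.png
[Mon Jan 02 10:15:02 2012] [error] [client 192.0.2.10] File does not exist: /var/www/html/robots.txt
[Mon Jan 02 10:15:03 2012] [error] [client 192.0.2.11] File does not exist: /var/www/html/images/banner.gif
[Mon Jan 02 10:15:04 2012] [error] [client 192.0.2.12] File does not exist: /var/www/html/images/logo.png
[Mon Jan 02 10:15:05 2012] [notice] caught SIGTERM, shutting down
EOF

# Filter 404 lines, take the last column, keep images, de-duplicate, save.
sed -n '/File does not exist/p' "$log" \
  | awk '{print $NF}' \
  | sed -n -r '/(jpg|jpeg|png|gif)$/p' \
  | sort | uniq > "$out"

result=$(cat "$out")
printf '%s\n' "$result"
rm -f "$log" "$out"
```

Replacing > with >> in the last stage would add the results to the end of an existing file instead of overwriting it.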
More to come
It is really amazing how you can combine these tools to do powerful things. I am planning to document the other one-liners that I end up creating to solve my problems. So stay tuned 🙂
Also, if you think this can be improved, do let me know.