HW5: apache2cols.py
- Due Feb 3 by 9am
- Points 1
- Submitting a file upload
- Available after Jan 31 at 9:50am
1. Watch this video on writing Python programs for use on the CLI. Note that in the examples, I don't include the lines:
from signal import signal, SIGPIPE, SIG_DFL
signal(SIGPIPE, SIG_DFL)
at the top of the Python files; you should include these in any Python CLI programs you write.
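As a rough sketch of where those lines go, here is a tiny stand-alone script. The function name emit_numbers and the file name demo.py are made up for illustration; they are not part of the assignment. Note that SIGPIPE is only available on Unix-like systems.

```python
import sys
from signal import signal, SIGPIPE, SIG_DFL

# Restore the default SIGPIPE behavior so the script exits quietly when a
# downstream command (such as head) closes the pipe early. Python ignores
# SIGPIPE by default and raises BrokenPipeError on the next write instead.
signal(SIGPIPE, SIG_DFL)


def emit_numbers(count, out):
    """Hypothetical demo function: write 0..count-1 to out, one per line."""
    for i in range(count):
        out.write(f"{i}\n")


if __name__ == "__main__":
    # Enough output that head will exit long before we finish writing.
    emit_numbers(100000, sys.stdout)
```

Running something like python3 demo.py | head -n 3 should print three lines and exit without a traceback; without the two signal lines, Python will typically complain about a broken pipe.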
2. Modify the apache2cols.py program we started today so that...
- the logic for parsing the incoming Apache log is moved to a well-named function
- it has a main that processes command line arguments and calls the function you made in the previous step
- it has a __name__ == "__main__" guard that prevents the script from running if it's imported into another script versus run from the command line
- the program takes an optional command line argument: a filename.
- if present, your script should process that file (still outputting the result to stdout)
- if not, your script should behave as it already does, processing stdin
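The overall shape described in the list above might look something like the sketch below. This is only a skeleton under assumptions: convert_lines is a hypothetical name, and its body (splitting on whitespace and joining with tabs) is a placeholder, not the actual Apache parsing logic from class. Swap in your own function and logic.

```python
import sys
from signal import signal, SIGPIPE, SIG_DFL

# Exit quietly if a downstream command closes the pipe early.
signal(SIGPIPE, SIG_DFL)


def convert_lines(lines, out):
    """Placeholder for the real parsing logic (hypothetical name).

    Each line is split on whitespace and re-joined with tabs. This is NOT
    the Apache conversion from class; it only stands in for it.
    """
    for line in lines:
        out.write("\t".join(line.split()) + "\n")


def main(argv):
    """Process the optional filename argument, falling back to stdin."""
    if len(argv) > 1:
        # A filename was given: read from that file.
        with open(argv[1]) as infile:
            convert_lines(infile, sys.stdout)
    else:
        # No filename: behave as before and read from stdin.
        convert_lines(sys.stdin, sys.stdout)


if __name__ == "__main__":
    # Runs only when executed from the command line, not when imported.
    main(sys.argv)
```

Because the function takes any iterable of lines, the same code handles both an open file and stdin, and output always goes to stdout.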
Your program should work in both of the following ways (I'm running these from the data/apache-logs/ directory of the class git repository):
cat hank.feild.org-access.log | python3 ../../scripts/apache2cols.py
python3 ../../scripts/apache2cols.py hank.feild.org-access.log
Add your Python program to your GitHub project in the scripts/ folder.
3. Rewrite the expression below so that it uses apache2cols.py to convert the columns before grabbing the user-agent column. If you're using macOS, use gzcat, not zcat.
zcat *.gz | \
cat - *.log *.log.1 | \
cut -d' ' -f12- | \
grep -i "bot" | \
sort | \
uniq -c | \
sort -rn | \
head
In a Word or Google document, include a link to your code for #2 above on GitHub, the newly constructed command you wrote for #3, and a screenshot of that command working. Submit your document to this assignment.
When doing these, I'm looking for a good effort, not necessarily the exact right answer. Don't shortchange yourselves by just trying random things; putting in the time to really give this a good go will pay off in the long run, even if you don't get the exact right answer. Make sure you attempt both questions and that your attempts involve the components asked for in each question. You may work on this with others in the class.