Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 7632549
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 31, 20262026-05-31T06:39:16+00:00 2026-05-31T06:39:16+00:00

I have a python file, i pass the log file and a url to

  • 0

I have a python file, i pass the log file and a url to the code. The output file contains how many times the URL was accessed by which IP addresses.

#!/usr/bin/env python
# 
# Counts the IP addresses of a log file.
# 
# Assumption: the IP address is logged in the first column.
# Example line: 117.195.185.130 - - [06/Mar/2012:00:00:00 -0800] \
#    "GET /mysidebars/newtab.html HTTP/1.1" 404 0 - -
#

import sys

def urlcheck(line, url):
    '''Checks if the url is part of the log line.'''
    lsplit = line.split()
    if len(lsplit)<7:
        return False
    return url==lsplit[6]

def extract_ip(line):
    '''Extracts the IP address from the line.
       Currently it is assumed, that the IP address is logged in
       the first column and the columns are space separated.'''
    return line.split()[0]

def increase_count(ip_dict, ip_addr):
    '''Increases the count of the IP address.
       If an IP address is not in the given dictionary,
       it is initially created and the count is set to 1.'''
    if ip_addr in ip_dict:
        ip_dict[ip_addr] += 1
    else:
        ip_dict[ip_addr] = 1

def read_ips(infilename, url):
    '''Read the IP addresses from the file and store (count)
       them in a dictionary - returns the dictionary.'''
    res_dict = {}
    log_file = file(infilename)
    for line in log_file:
        if line.isspace():
            continue
        if not urlcheck(line, url):
            continue
        ip_addr = extract_ip(line)
        increase_count(res_dict, ip_addr)
    return res_dict

def write_ips(outfilename, ip_dict):
    '''Write out the count and the IP addresses.'''
    out_file = file(outfilename, "w")
    for ip_addr, count in ip_dict.iteritems():
        out_file.write("%s\t%5d\n" % (ip_addr, count))
    out_file.close()

def parse_cmd_line_args():
    '''Return the in and out file name.
       If there are more or less than two parameters,
       an error is logged in the program is exited.'''
    if len(sys.argv)!=4:
        print("Usage: %s [infilename] [outfilename] [url]" % sys.argv[0])
        sys.exit(1)
    return sys.argv[1], sys.argv[2], sys.argv[3]

def main():
    infilename, outfilename, url = parse_cmd_line_args()
    ip_dict = read_ips(infilename, url)
    write_ips(outfilename, ip_dict)

if __name__ == "__main__":
    main()

now i want to modify the code in such a way that, if i pass a date instead of url, the output file should contain all ips which are pinged on that particular date.

the log file format is as follows.

220.227.40.118 - - [06/Mar/2012:00:00:00 -0800] "GET /mysidebars/newtab.html HTTP/1.1" 404 0 - -
220.227.40.118 - - [06/Mar/2012:00:00:00 -0800] "GET /hrefadd.xml HTTP/1.1" 204 214 - -
59.95.13.217 - - [06/Mar/2012:00:00:00 -0800] "GET /dbupdates2.xml HTTP/1.1" 404 0 - -
111.92.9.222 - - [06/Mar/2012:00:00:00 -0800] "GET /mysidebars/newtab.html HTTP/1.1" 404 0 - -
120.56.236.46 - - [06/Mar/2012:00:00:00 -0800] "GET /hrefadd.xml HTTP/1.1" 204 214 - -
49.138.106.21 - - [06/Mar/2012:00:00:00 -0800] "GET /add.txt HTTP/1.1" 204 214 - -
117.195.185.130 - - [06/Mar/2012:00:00:00 -0800] "GET /mysidebars/newtab.html HTTP/1.1" 404 0 - -
122.160.166.220 - - [06/Mar/2012:00:00:00 -0800] "GET /mysidebars/newtab.html HTTP/1.1" 404 0 - -
117.214.20.28 - - [06/Mar/2012:00:00:00 -0800] "GET /welcome.html HTTP/1.1" 204 212 - -
117.18.231.5 - - [06/Mar/2012:00:00:00 -0800] "GET /mysidebars/newtab.html HTTP/1.1" 404 0 - -
117.18.231.5 - - [06/Mar/2012:00:00:00 -0800] "GET /mysidebars/newtab.html HTTP/1.1" 404 0 - -
122.169.136.211 - - [06/Mar/2012:00:00:00 -0800] "GET /mysidebars/newtab.html HTTP/1.1" 404 0 - -
203.217.145.10 - - [06/Mar/2012:00:00:00 -0800] "GET /mysidebars/newtab.html HTTP/1.1" 404 0 - -
117.18.231.5 - - [06/Mar/2012:00:00:00 -0800] "GET /hrefadd.xml HTTP/1.1" 204 214 - -
59.95.13.217 - - [06/Mar/2012:00:00:00 -0800] "GET /dbupdates2.xml HTTP/1.1" 404 0 - -
203.217.145.10 - - [06/Mar/2012:00:00:00 -0800] "GET /mysidebars/newtab.html HTTP/1.1" 404 0 - -
117.206.70.4 - - [06/Mar/2012:00:00:00 -0800] "GET /mysidebars/newtab.html HTTP/1.1" 404 0 - -
117.214.20.28 - - [06/Mar/2012:00:00:00 -0800] "GET /css/epic.css HTTP/1.1" 204 214 "http://www.epicbrowser.com/welcome.html" -
117.206.70.4 - - [06/Mar/2012:00:00:00 -0800] "GET /add.txt HTTP/1.1" 204 214 - -
117.206.70.4 - - [06/Mar/2012:00:00:00 -0800] "GET /hrefadd.xml HTTP/1.1" 204 214 - -
118.97.38.130 - - [06/Mar/2012:00:00:00 -0800] "GET /mysidebars/newtab.html HTTP/1.1" 404 0 - -
117.214.20.28 - - [06/Mar/2012:00:00:00 -0800] "GET /js/flash_detect_min.js HTTP/1.1" 304 0 "http://www.epicbrowser.com/welcome.html" -
117.214.20.28 - - [06/Mar/2012:00:00:00 -0800] "GET /images/home-page-bottom.jpg HTTP/1.1" 304 0 "http://www.epicbrowser.com/welcome.html" -
117.214.20.28 - - [06/Mar/2012:00:00:00 -0800] "GET /images/Facebook_Like.png HTTP/1.1" 204 214 "http://www.epicbrowser.com/welcome.html" -
117.214.20.28 - - [06/Mar/2012:00:00:00 -0800] "GET /images/Twitter_Follow.png HTTP/1.1" 204 214 "http://www.epicbrowser.com/welcome.html" -
117.214.20.28 - - [06/Mar/2012:00:00:00 -0800] "GET /images/home-page-top.jpg HTTP/1.1" 304 0 "http://www.epicbrowser.com/welcome.html" -
49.138.106.21 - - [06/Mar/2012:00:00:01 -0800] "GET /dbupdates2.xml HTTP/1.1" 404 0 - -
117.18.231.5 - - [06/Mar/2012:00:00:01 -0800] "GET /mysidebars/newtab.html HTTP/1.1" 404 0 - -
117.18.231.5 - - [06/Mar/2012:00:00:01 -0800] "GET /hrefadd.xml HTTP/1.1" 204 214 - -
120.61.182.186 - - [06/Mar/2012:00:00:01 -0800] "GET /mysidebars/newtab.html HTTP/1.1" 404 0 - -
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-31T06:39:18+00:00Added an answer on May 31, 2026 at 6:39 am

    What’s wrong with grep, cut, sort and uniq?

    grep "\[07/Mar/2012" logfile.txt | cut -d " " -f 1 | sort | uniq
    
    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I have a python code like this. File named mymodule.py class MyBase(object): pass File
I have a python file testHTTPAuth.py which uses module deliciousapi and is kept in
I have a python class which reads a config file using ConfigParser: Config file:
I have a small python script which i use everyday......it basically reads a file
I have a Python script which accepts a XML file as input and then
I have a python file ( html2text.py ) which gives the desired result when
Say that I have the Python code: someString = file(filename.jpg).read() How can I replicate
I have a Python file with as content: import re import urllib class A(object):
I have created a Python file to generate a Mandelbrot set image. The original
I have a python program/file that I want to run repeatedly and calculate the

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.