The Archive Base

Editorial Team
Asked: May 26, 2026 (2026-05-26T00:50:53+00:00)

How to find visited pages for a particular user from a big log file


I have a big log file in which each line contains one sessionId and PageId combination. How can I find the pages visited by a particular user?

The file is too big to fit in memory. The goal is to find the page that is visited most often within the same session (user).

For example:

My file is (the order is sessionId, PageID):

usera  page1
userb  page2
userb  page1
usera  page3
....

It should print

usera visits page1 most followed by page3.

If several pages have the same number of visits, it is up to you how to handle the case (you can print all of them, or any one of them).

Which data structure or algorithm would you use for this? Since this is an interview question, an efficient algorithm and data structure would be appreciated. The interviewer did not specify what order of complexity he was looking for.

I came up with a std::map<string,std::pair<string,int> > solution. The interviewer asked whether I could do anything better than this, and what should be done if the key set is so large that a map cannot handle it efficiently.


1 Answer

  1. Editorial Team
    Added an answer on May 26, 2026 at 12:50 am

    I think the first step would be to remove all “non-usera” lines, since you’re doing per-user parsing. This would be a one-time job that separates all users into different files. After that you can do a line-by-line analysis, keeping only a couple of lines in “history”. A simple line parser can do this without storing the whole file in memory.
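The per-user, line-by-line pass described above could look roughly like the sketch below. It is only an illustration under assumptions: the function name `rankPagesForUser` is made up for this example, fields are assumed to be whitespace-separated, and malformed lines are simply skipped. Memory use depends on the number of distinct pages the chosen user visited, not on the size of the log.

```cpp
#include <algorithm>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: streams "sessionId pageId" lines from `in` and counts
// page visits for a single session. Only that session's counters are kept.
std::vector<std::pair<std::string, int>>
rankPagesForUser(std::istream& in, const std::string& user) {
    std::unordered_map<std::string, int> counts;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string session, page;
        if (!(fields >> session >> page)) continue;  // skip malformed lines
        if (session == user) ++counts[page];         // ignore other users
    }

    // Copy the counters out and sort: most visited first, ties by page name.
    std::vector<std::pair<std::string, int>> ranked(counts.begin(), counts.end());
    std::sort(ranked.begin(), ranked.end(),
              [](const auto& a, const auto& b) {
                  if (a.second != b.second) return a.second > b.second;
                  return a.first < b.first;
              });
    return ranked;
}
```

Calling it with a `std::ifstream` opened on the log and the session id `"usera"` would return that user's pages ordered by visit count. Breaking ties by page name is one arbitrary choice; the question leaves tie handling open.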

    If a dedicated data structure is really necessary, you might want to look into the map-reduce paradigm; Hadoop would be ideal for files on the scale of 10 GB or more.
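To make the map-reduce idea a little more concrete, here is a single-process sketch of its shape. It is not Hadoop code and uses no Hadoop API. The names `Record`, `shuffleBySession` and `reduceTopPage` are hypothetical, and the partitions are plain in-memory vectors here; in a real job each partition would be a separate file or worker, so that only one partition needs to fit in memory at a time.

```cpp
#include <cstddef>
#include <functional>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Record = std::pair<std::string, std::string>;  // (sessionId, pageId)

// "Map + shuffle": route each record to a partition chosen by hashing the
// session id, so all records of one session land in the same partition.
// Assumes numPartitions > 0.
std::vector<std::vector<Record>>
shuffleBySession(std::istream& in, std::size_t numPartitions) {
    std::vector<std::vector<Record>> partitions(numPartitions);
    std::hash<std::string> hasher;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        Record rec;
        if (!(fields >> rec.first >> rec.second)) continue;  // skip bad lines
        const std::size_t index = hasher(rec.first) % numPartitions;
        partitions[index].push_back(std::move(rec));
    }
    return partitions;
}

// "Reduce": inside one partition, count visits per (session, page) and keep
// the most visited page of each session. Ties are resolved arbitrarily.
std::map<std::string, std::pair<std::string, int>>
reduceTopPage(const std::vector<Record>& partition) {
    std::unordered_map<std::string, std::unordered_map<std::string, int>> counts;
    for (const auto& rec : partition) ++counts[rec.first][rec.second];

    std::map<std::string, std::pair<std::string, int>> top;
    for (const auto& [session, pages] : counts) {
        for (const auto& [page, visits] : pages) {
            auto it = top.find(session);
            if (it == top.end() || visits > it->second.second) {
                top[session] = {page, visits};
            }
        }
    }
    return top;
}
```

Because a session never spans two partitions, the per-partition results can be merged without any further counting. The result type deliberately matches the `std::map<string,std::pair<string,int> >` from the question; the nested counters needed to fill it exist only for one partition at a time. The sketch needs C++17 for structured bindings.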

