The Archive Base

Editorial Team
Asked: May 26, 2026

I am trying to come up with equivalent of wc -l using Haskell Iteratee

I am trying to come up with an equivalent of “wc -l” using the Haskell iteratee library. Below is the code for a “wc -c” equivalent (it just counts the bytes in the file, similar to the code in the iteratee example on Hackage), and it runs very fast:


{-# LANGUAGE BangPatterns #-}
import Data.Iteratee as I
import Data.ListLike as LL
import Data.Iteratee.IO
import Data.ByteString


length1 :: (Monad m, Num a, LL.ListLike s el) => Iteratee s m a
length1 = liftI (step 0)
  where
    step !i (Chunk xs) = liftI (step $ i + fromIntegral (LL.length xs))
    step !i stream     = idone i stream
{-# INLINE length1 #-}
main = do
  i' <- enumFile 1024 "/usr/share/dict/words" (length1 :: (Monad m) => Iteratee ByteString m Int)
  result <- run i'
  print result
  {- Time measured on a linux x86 box: 
  $ time ./test ## above haskell compiled code
  4950996

  real    0m0.013s
  user    0m0.004s
  sys     0m0.007s

  $  time wc -c /usr/share/dict/words
  4950996 /usr/share/dict/words

  real    0m0.003s
  user    0m0.000s
  sys     0m0.002s
  -}

Now, how do you extend it to count the number of lines and still run fast? I wrote a version that uses Prelude.filter to keep only the “\n” characters before taking the length, but it is slower than Linux “wc -l” because it uses too much memory and spends too much time in GC (lazy evaluation, I guess). So I wrote another version using Data.ListLike.filter, but it does not type check and so will not compile. Help here would be appreciated:


  {-# LANGUAGE BangPatterns #-}
  import Data.Iteratee as I
  import Data.ListLike as LL
  import Data.Iteratee.IO
  import Data.ByteString
  import Data.Char
  import Data.ByteString.Char8 (pack)

  numlines :: (Monad m, Num a, LL.ListLike s el) => Iteratee s m a
  numlines = liftI $ step 0
    where
      step !i (Chunk xs) = liftI (step $i + fromIntegral (LL.length $ LL.filter (\x ->  x == Data.ByteString.Char8.pack "\n")  xs))
      step !i stream = idone i stream
  {-# INLINE numlines #-}

  main = do
    i' <- enumFile 1024 "/usr/share/dict/words" (numlines :: (Monad m) => Iteratee ByteString m Int)
    result <- run i'
    print result
1 Answer

    Editorial Team
    Added an answer on May 26, 2026 at 3:41 pm

    There are a lot of good answers already; I have very little to offer performance-wise, but I do have a few style points.

    First, I would write it this way:

    import Prelude as P
    import Data.Iteratee
    import qualified Data.Iteratee as I
    import qualified Data.Iteratee.IO as I
    import qualified Data.ByteString as B
    import Data.Char
    import System.Environment
    
    -- numLines has a concrete stream type so it's not necessary to provide an
    -- annotation later.  It could have a more general type.
    numLines :: Monad m => I.Iteratee B.ByteString m Int
    numLines = I.foldl' step 0
     where
      --step :: Int -> Word8 -> Int
      -- Add one to the running count for every newline byte.
      step acc el = if el == (fromIntegral $ ord '\n') then acc + 1 else acc
    
    main :: IO ()
    main = do
      -- The file to count is the first command-line argument.
      f:_       <- getArgs
      lineCount <- run =<< I.enumFile 65536 f numLines
      print lineCount
    

    The biggest difference is that this uses Data.Iteratee.ListLike.foldl'. Note that only the individual stream elements matter to the step function, not the stream type. It’s exactly the same function as you would use with e.g. Data.ByteString.Lazy.foldl'.
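
To illustrate that point, here is a minimal sketch that reuses the same step function with Data.ByteString.Lazy.foldl' instead of an iteratee. The name countLinesLazy is made up for this example, and reading from standard input is only an assumption to keep the program short.

```haskell
import qualified Data.ByteString.Lazy as BL
import Data.Char (ord)
import Data.Word (Word8)

-- The same step function as in numLines: it only ever sees one element.
step :: Int -> Word8 -> Int
step acc el = if el == fromIntegral (ord '\n') then acc + 1 else acc

-- Hypothetical helper: count the newline bytes in a lazy ByteString.
countLinesLazy :: BL.ByteString -> Int
countLinesLazy = BL.foldl' step 0

main :: IO ()
main = do
  contents <- BL.getContents
  print (countLinesLazy contents)
```

It can be tried with something like `./count < /usr/share/dict/words`; the step function does not change when the stream type does.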

    Using foldl' also means that you don’t need to manually write iteratees with liftI. I would discourage users from doing so unless absolutely necessary. The result is usually longer and harder to maintain with little to no benefit.
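
If you do want a hand-written iteratee anyway, the numlines in the question most likely fails to type check because the predicate passed to LL.filter receives single stream elements (Word8 for a ByteString stream), while the code compares each element against a packed ByteString. The following is only a sketch, assuming the same liftI/idone API the question already uses; it fixes the stream type to ByteString and counts with Data.ByteString.count. The name numLinesManual is made up for this example.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.Iteratee (Iteratee, Stream (..), liftI, idone, run)
import qualified Data.Iteratee.IO as I
import qualified Data.ByteString as B

-- Hand-written counterpart of numLines, fixed to a ByteString stream.
numLinesManual :: Monad m => Iteratee B.ByteString m Int
numLinesManual = liftI (step 0)
  where
    -- B.count takes a single Word8; 10 is the byte value of '\n'.
    step !i (Chunk xs) = liftI (step (i + B.count 10 xs))
    -- Any other stream state (EOF) finishes with the count so far.
    step !i stream     = idone i stream

main :: IO ()
main = do
  n <- run =<< I.enumFile 65536 "/usr/share/dict/words" numLinesManual
  print n
```

Compared with the foldl' version, this is longer and has to handle the stream states itself, which is the maintenance cost mentioned above.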

    Finally, I’ve increased the buffer size significantly. On my system this is marginally faster than enumerator’s default of 4096, which is in turn marginally faster (with iteratee) than your choice of 1024. YMMV with this setting, of course.
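
To try different buffer sizes on your own machine, one option is to take the size from the command line. This is a rough sketch along the lines of the program above; the argument order (buffer size first, then the file) is just an assumption for this example.

```haskell
import Data.Iteratee (Iteratee, run)
import qualified Data.Iteratee as I
import qualified Data.Iteratee.IO as I
import qualified Data.ByteString as B
import System.Environment (getArgs)

-- Same fold as before; 10 is the byte value of '\n'.
numLines :: Monad m => Iteratee B.ByteString m Int
numLines = I.foldl' step 0
  where
    step acc el = if el == 10 then acc + 1 else acc

main :: IO ()
main = do
  -- Assumed usage: ./numlines BUFFER_SIZE FILE
  [sizeArg, path] <- getArgs
  n <- run =<< I.enumFile (read sizeArg) path numLines
  print n
```

Running it under `time` with 1024, 4096 and 65536 on the same file should show whether the buffer size matters on your system.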
