The Archive Base

Editorial Team
Asked: May 24, 2026


I have been running a highly concurrent application on my HP Proliant servers. The application is a file system indexer I wrote in Erlang. It spawns a process per folder it finds on the file system and records all file paths in a fragmented Mnesia database. (The database consists of disc_only_copies tables; a screenshot of its file system can be viewed here.)

The snippet of code that does the intensive work of walking the file system is shown below:


%%% -------- COPYRIGHT NOTICE --------------------------------------------------------------------
%% @author Muzaaya Joshua, <joshmuza@gmail.com> [http://joshanderlang.blogspot.com]
%% @version 1.0 free software, but modification prohibited
%% @copyright Muzaaya Joshua (file_scavenger-1.0) 2011 - 2012 . All rights reserved
%% @reference <a href="http://www.erlang.org">OpenSource Erlang WebSite</a>
%% 
%%% ---------------- EDOC INTRODUCTION TO THE MODULE ----------------------------------------------
%% @doc This module provides the low level APIs for reading, writing,
%% searching, joining and moving within directories. The module implementation
%% took place on @date at @time.
%% @end

-module(file_scavenger_utilities).

%%% ------- EXPORTS -------------------------------------------------------------------------------
-compile(export_all).

%%% ------- INCLUDES -----------------------------------------------------------------------------

%%% -------- MACROS ------------------------------------------------------------------------------
-define(IS_FOLDER(X),filelib:is_dir(X)).
-define(IS_FILE(X),filelib:is_file(X)).
-define(FAILED_TO_LIST_DIR(X),error_logger:error_report(["*** File Scavenger Utilities Error ***** ",{error,"Failed to List Directory"},{directory,X}])).
-define(NOT_DIR(X),error_logger:error_report(["*** File Scavenger Utilities Error ***** ",{error,"Not a Directory"},{alleged,X}])).
-define(NOT_FILE(X),error_logger:error_report(["*** File Scavenger Utilities Error ***** ",{error,"Not a File"},{alleged,X}])).
%%%--------- TYPES -------------------------------------------------------------------------------

%% @type dir() = string(). 
%%  Must contain forward slashes, not backslashes. Must not end with a slash
%%  after the exact directory, e.g. this is wrong: "C:/Program Files/SomeDirectory/"
%%  but this is right: "C:/Program Files/SomeDirectory"
%% @type file_path() = string(). 
%%  Must contain forward slashes, not backslashes.
%%  Should include the file extension as well, e.g. "C:/Program Files/SomeFile.pdf"

%% -----------------------------------------------------------------------------------------------
%% @doc Enters a directory and executes the fun ForEachFileFound/2 for each file it finds.
%% If it finds a directory, it executes the fun ForEachDirFound/2.
%% Both funs take the top-most directory as the first argument. It then spawns an
%% Erlang process that spreads the found directory in the same way as the parent directory
%% was spread. Spreading continues until every file (whether or not it is in a nested
%% directory) is registered by its full path.
%% @end
%%
%% @spec spread_directory(dir(),dir(),function(),function())-> ok.

spread_directory(Dir,Top_Directory,ForEachFileFound,ForEachDirFound) when is_function(ForEachFileFound),is_function(ForEachDirFound) ->
    case ?IS_FOLDER(Dir) of
        false -> ?NOT_DIR(Dir); 
        true -> 
            F = fun(X)->
                    FileOrDir = filename:absname_join(Dir,X),
                    case ?IS_FOLDER(FileOrDir) of
                        true -> 
                            (catch ForEachDirFound(Top_Directory,FileOrDir)),
                            spawn(fun() -> ?MODULE:spread_directory(FileOrDir,Top_Directory,ForEachFileFound,ForEachDirFound) end);
                        false -> 
                            case ?IS_FILE(FileOrDir) of
                                false -> {error,not_a_file,FileOrDir};
                                true -> (catch ForEachFileFound(Top_Directory,FileOrDir))
                            end
                    end
                end,
            case file:list_dir(Dir) of      
                {error,_} -> ?FAILED_TO_LIST_DIR(Dir);
                {ok,List} -> lists:foreach(F,List)
            end
    end.    

The function spread_directory/4 is generic in that it takes two funs. ForEachFileFound/2 receives the top-most directory and the file that was found and can do anything with it; ForEachDirFound/2 receives the top-most directory and the folder that was found and can use it however it wants.

The start script I use for this application makes sure that Erlang can spawn as many processes as possible. Once a process finishes indexing a folder, it exits.
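
The callback contract described above can be illustrated with a small sequential stand-in. This is only a sketch, not the module from the question: `walk_sketch`, `walk/4`, `OnFile` and `OnDir` are hypothetical names, and unlike spread_directory/4 it recurses in the calling process instead of spawning one process per folder.

```erlang
-module(walk_sketch).
-export([walk/4]).

%% Sequential stand-in for spread_directory/4 (hypothetical name).
%% Both callbacks receive (TopDir, Path), the same shape as
%% ForEachFileFound/2 and ForEachDirFound/2 in the question.
%% Note: filelib:is_dir/1 follows symbolic links, so a link cycle
%% would make this recursion run without end.
walk(Dir, TopDir, OnFile, OnDir)
  when is_function(OnFile, 2), is_function(OnDir, 2) ->
    case file:list_dir(Dir) of
        {error, Reason} ->
            %% Returned to the caller; errors in nested directories
            %% are dropped by lists:foreach/2 below.
            {error, {Reason, Dir}};
        {ok, Names} ->
            lists:foreach(
              fun(Name) ->
                      Path = filename:join(Dir, Name),
                      case filelib:is_dir(Path) of
                          true ->
                              OnDir(TopDir, Path),
                              walk(Path, TopDir, OnFile, OnDir);
                          false ->
                              case filelib:is_regular(Path) of
                                  true  -> OnFile(TopDir, Path);
                                  false -> skipped
                              end
                      end
              end, Names)
    end.
```

A call such as `walk_sketch:walk("/tmp", "/tmp", fun(_, F) -> io:format(" File: ~p~n", [F]) end, fun(_, D) -> io:format(" Folder: ~p~n", [D]) end)` would print entries in the same style as the verbose tty mode, with memory use that does not depend on the number of folders being walked at once.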

#!/usr/bin/env sh
echo "Starting File Scavenger System. Layer 1 on the P2P File Sharing System....."
erl \
    -name file_scavenger@127.0.0.1 \
    +P 13421779 \
    -pa ./ebin ./lib/*/ebin ./include \
    -mnesia dir '"./database"' \
    -mnesia dump_log_write_threshold 10000 \
    -eval "application:load(file_scavenger)" \
    -eval "application:start(file_scavenger)"

A gen_server interfaces the intensive module with the database in which I record all paths. The snippet where it starts the spread_directory work is shown below:

handle_cast(index_dirs,#scavenger{directory_paths = Dirs} = State)->
    {File,Folder} = case {State#scavenger.verbose,State#scavenger.verbose_to} of
                        {true,tty} -> 
                            {
                            fun(TopDir,Fl)-> 
                                io:format(" File: ~p~n",[Fl]),
                                file_scavenger_database:insert_file(filename:basename(Fl),file,Fl,TopDir,filename:extension(Fl))
                            end,
                            fun(TopDir,Fd) -> 
                                io:format(" Folder: ~p~n",[Fd]),
                                file_scavenger_database:insert_file(Fd,folder,Fd,TopDir,undefined)
                            end
                            };
                        {true,SomeFile}-> 
                            {
                            fun(TopDir,Fl)-> 
                                os:cmd("echo File: " ++ Fl ++ " >> " ++ SomeFile),
                                file_scavenger_database:insert_file(filename:basename(Fl),file,Fl,TopDir,filename:extension(Fl))
                            end,
                            fun(TopDir,Fd)-> 
                                os:cmd("echo Folder: " ++ Fd ++ " >> " ++ SomeFile),
                                file_scavenger_database:insert_file(Fd,folder,Fd,TopDir,undefined)
                            end
                            }                       
                    end,
    Main = fun(Dir) -> 
                error_logger:info_msg("*** File scavenger Server indexing directory: ~p~n",[Dir]),
                spawn(fun() -> file_scavenger_utilities:spread_directory(Dir,Dir,File,Folder) end)
            end,
    lists:foreach(Main,Dirs),
    {noreply,State};    
handle_cast(stop, State) -> {stop, normal, State}.

More source details can be found in the whole application.
The application's entire source and build can be found here: File_scavenger-1.0.zip.

Now, I start the application on the server, whose terminal looks like the output below. The server is an HP Proliant G6 with two Intel processors (4 cores each, 2.4 GHz per core, 8 MB cache), 20 GB of RAM and 1.5 terabytes of disk space. Two of these machines are at our disposal, and the system database should be replicated across the two. Each server runs 64-bit Solaris 10.

bash-3.00# sh file_scavenger.sh
Starting File Scavenger System. Layer 1 on the P2P File Sharing System.....
Erlang R14B03 (erts-5.8.4) [source] [smp:8:8] [rq:8] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.8.4  (abort with ^G)
(file_scavenger@127.0.0.1)1>
=INFO REPORT==== 18-Aug-2011::09:36:04 ===
Starting File Scavenger Database......
=INFO REPORT==== 18-Aug-2011::09:36:04 ===
Database Successfully Started....

=INFO REPORT==== 18-Aug-2011::09:36:04 ===
Starting File Scavenger Database......
=INFO REPORT==== 18-Aug-2011::09:36:04 ===
Database Successfully Started....

=INFO REPORT==== 18-Aug-2011::09:36:04 ===
File Scavenger Server starting with default verbose settings....

(file_scavenger@127.0.0.1)1> file_scavenger_server:index_dirs().

The server starts to run and prints to the terminal every file and folder it finds. The server has plenty of RAM (20 GB) and swap space (16 GB). However, it ran for about 18 hours and finally the Erlang virtual machine reported this:

File: "/proc/4324/root/opt/csw/gcc4/share/locale/ja/LC_MESSAGES/gcc.mo"
 Folder: "/proc/4324/root/opt/csw/gcc4/share/locale/da"
 Folder: "/proc/4324/root/opt/csw/gcc4/share/locale/es/LC_MESSAGES"
 File: "/proc/4324/root/proc/4984/root/.thumbnails/normal/dc259e3897e8af4b379c6d956b6c1393.png"
 File: "/proc/4324/root/proc/4984/root/.thumbnails/fail/gnome-thumbnail-factory/223c19786421b7101d14075bdec46f61.png"
 File: "/proc/4324/root/opt/csw/gcc4/libexec/gcc/i386-pc-solaris2.10/4.5.1/install-tools/mkheaders"
 File: "/proc/4324/root/opt/csw/gcc4/libexec/gcc/i386-pc-solaris2.10/4.5.1/cc1plus"
 File: "/proc/4324/root/opt/csw/gcc4/lib/libsupc++.la"

Crash dump was written to: erl_crash.dump
eheap_alloc: Cannot allocate 153052320 bytes of memory (of type "heap").
Abort - core dumped
bash-3.00#

Question 1. With such a powerful server, why would the operating system fail to provide such memory to the application (it was the only application running)?

Question 2. The Erlang emulator I start is instructed to be able to spawn as many processes as it may need (the value +P 13421779). Is the Erlang VM failing to access this memory, or failing to allocate it to its processes?

Question 3. Solaris sees one process, epmd, perhaps containing and starting thousands of micro threads. What configuration can I apply to Solaris so that it never stops my application, however “memory hungry” it may be? The available swap space is 16 GB and the RAM is 20 GB; honestly, there must be something wrong.

Question 4. Which configurations can I make to the Erlang emulator to avoid these heap memory crash dumps, especially when all the memory it may need is available on the server? How will I run more memory-consuming apps on this server if Erlang still fails to allocate such memory to a simple file system indexer (well, it is heavily concurrent)?

Finally, any other tweaks I could make to avoid heap memory problems on such capable hardware are welcome. Thanks in advance.

1 Answer

  1. Editorial Team
    Added an answer on May 24, 2026 at 9:37 pm

    I haven’t had time to look at the source, but here are some comments:

    Question 1. With such a powerful server, why would the operating
    system fail to provide such memory to the application (it was the only
    application running)?

    Because the Erlang VM tried to consume more than the available free memory.

    Question 2. The Erlang emulator I start is instructed to be able to
    spawn as many processes as it may need (the value +P 13421779). Is
    the Erlang VM failing to access this memory, or failing to allocate it to
    its processes?

    No. If you had run out of processes, the Erlang VM would have said so (and the VM would still be up and running):

    =ERROR REPORT==== 18-Aug-2011::10:04:04 ===
    Error in process <0.31775.138> with exit value: {system_limit,[{erlang,spawn_link,    [erlang,apply,[#Fun<shell.3.130303173>,[]]]},{erlang,spawn_link,1},{shell,get_command,5},    {shell,server_loop,7}]}
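
To see how close a running node is to the process limit rather than waiting for a system_limit error, the counters can be read from the shell. A minimal sketch, assuming a hypothetical helper module named `proc_headroom`; the `erlang:system_info/1` and `erlang:memory/1` calls are standard BIFs:

```erlang
-module(proc_headroom).
-export([report/0]).

%% Compares the number of live processes with the configured limit
%% and reports the memory currently allocated for process heaps,
%% stacks and related process data.
report() ->
    [{process_count, erlang:system_info(process_count)},
     {process_limit, erlang:system_info(process_limit)},
     {processes_bytes, erlang:memory(processes)}].
```

If `process_count` stays far below `process_limit` while `processes_bytes` keeps growing, the limit set with +P is not what the node is running into.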
    

    Question 3. Solaris sees one process, epmd, perhaps containing
    and starting thousands of micro threads. What configuration can I
    apply to Solaris so that it never stops my application, however
    “memory hungry” it may be? The available swap space is 16 GB and the
    RAM is 20 GB; honestly, there must be something wrong.

    epmd is the Erlang port mapper daemon. It is responsible for managing distributed Erlang and has nothing to do with your individual Erlang application. The processes you should look for will most likely be named beam.smp. These will show the OS memory consumption of the Erlang VM.
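
One way to find the right OS process is to ask the node itself for its process id and then look that id up with an OS tool such as ps. A small sketch, assuming a hypothetical module name `vm_footprint`; `os:getpid/0` and `erlang:memory/1` are standard calls:

```erlang
-module(vm_footprint).
-export([info/0]).

%% os:getpid/0 returns the OS process id of this emulator as a string.
%% erlang:memory(total) is the emulator's own count of allocated bytes,
%% which will not match the resident size reported by the OS exactly.
info() ->
    [{os_pid, os:getpid()},
     {total_bytes, erlang:memory(total)}].
```

Comparing `total_bytes` with what the operating system reports for that pid gives a rough idea of whether the growth is inside the Erlang heaps or elsewhere.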

    Question 4. Which configurations can I make to the Erlang emulator to
    avoid these heap memory crash dumps, especially when all the memory it
    may need is available on the server? How will I run more
    memory-consuming apps on this server if Erlang still fails to allocate
    such memory to a simple file system indexer (well, it is heavily concurrent)?

    The Erlang VM should be able to use all of the available memory in your machine. However, it depends on how your application is written. There can be many reasons for memory leaks:

    • Atom table filling up (you create too many unique atoms)
    • ETS or Mnesia tables are not garbage collected (you do not delete old unused elements)
    • Not enough memory for processes (you spawn too many processes)
    • Too many binaries are created (you might keep unused references to old binaries)
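
The categories listed above map roughly onto the keys returned by `erlang:memory/0`, so a periodic snapshot can hint at which one is growing. This is a sketch under assumptions: the module name `mem_breakdown` and both function names are hypothetical, and walking every process in `largest/1` is itself costly on a node with millions of processes.

```erlang
-module(mem_breakdown).
-export([snapshot/0, largest/1]).

%% Bytes allocated per category: process heaps, the atom table,
%% binaries and ETS tables, plus the overall total.
snapshot() ->
    Mem = erlang:memory(),
    [{Key, proplists:get_value(Key, Mem)}
     || Key <- [total, processes, atom, binary, ets]].

%% The N processes currently using the most memory.
%% process_info/2 returns undefined for a process that has already
%% exited; the pattern in the generator silently skips those.
largest(N) when is_integer(N), N >= 0 ->
    Sizes = [{Pid, Bytes}
             || Pid <- erlang:processes(),
                {memory, Bytes} <- [erlang:process_info(Pid, memory)]],
    lists:sublist(lists:reverse(lists:keysort(2, Sizes)), N).
```

Calling `mem_breakdown:snapshot()` every few minutes during an indexing run and comparing the results would show whether `processes`, `binary`, `ets` or `atom` is the category that keeps climbing.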