We have an images folder that holds about a million images.
We need to write a program that fetches an image based on a keyword entered by the user.
The search has to match the keyword against the file names to find the right image.
Looking for any suggestions.
Thanks
N
Depending on the operating system, I suggest you use Indexing Service, Windows Desktop Search, or the latest version of Windows Search. This solves your problem of looking up files by keyword, addresses the performance issues caused by the number of files within a folder, is scalable, and is easily extended.
The DSearch sample at http://msdn.microsoft.com/en-us/library/dd940335(VS.85).aspx does almost exactly what you want and is easy to implement.
For example, if you are querying a million files and need to move the files into subfolders to increase performance, you can simply create the folders and move the files. You will not need to change any code.
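The folder reorganisation described above could be scripted roughly as follows. This is only a sketch: the function name `shard_folder`, the `shard_` folder prefix, and the choice of bucketing by the first two characters of the file name are assumptions made for illustration, not something the search service requires.

```python
import shutil
from pathlib import Path


def shard_folder(root: Path, prefix_len: int = 2) -> int:
    """Move each file directly under `root` into a subfolder chosen from
    the first `prefix_len` characters of its name. Returns the number of
    files moved. (Hypothetical helper; the bucketing rule is an assumption.)"""
    moved = 0
    # Snapshot the listing first so the moves do not disturb the iteration.
    for entry in list(root.iterdir()):
        if not entry.is_file():
            continue  # leave existing subfolders alone
        prefix = entry.stem[:prefix_len].lower()
        # Keep only letters and digits so the folder name is always valid.
        safe = "".join(c if c.isalnum() else "_" for c in prefix)
        bucket = root / f"shard_{safe}"
        bucket.mkdir(exist_ok=True)
        shutil.move(str(entry), str(bucket / entry.name))
        moved += 1
    return moved


if __name__ == "__main__":
    import sys

    # Usage: python shard.py C:\images
    print(shard_folder(Path(sys.argv[1])))
```

Because the index is queried by scope and file name rather than by a fixed path, a search scoped to the top-level folder should still find the files after they have been moved, once the indexer has caught up with the change.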
If you need to change how keywords are applied, such as using the keywords of the file’s summary properties, then you only need to change the query.
For the later operating systems, you do not even need to install any software, because the search feature is part of the operating system and is available through OLE DB. If you want to use Advanced Query Syntax (AQS), Microsoft provides a type library for accessing the COM interfaces, which makes it easy to generate the SQL command that queries the index database.
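To give a feel for what such a SQL command looks like, here is a minimal sketch that only builds the query text for a file-name keyword match. The helper names `sql_literal` and `build_filename_query` are made up for this example. The connection string and the `SystemIndex`, `SCOPE`, `System.FileName` and `System.ItemPathDisplay` names are the ones commonly used with the Windows Search OLE DB provider, but treat the exact query shape as an assumption to verify against the DSearch sample.

```python
# Connection string commonly used for the Windows Search OLE DB provider.
# It is not used below; it is what an OLE DB client would be given.
CONNECTION_STRING = (
    "Provider=Search.CollatorDSO;Extended Properties='Application=Windows';"
)


def sql_literal(text: str) -> str:
    """Escape a value for use inside a single-quoted SQL string literal."""
    return text.replace("'", "''")


def build_filename_query(keyword: str, folder: str) -> str:
    """Build a query that lists indexed files under `folder` whose file
    name contains `keyword`. LIKE wildcards in the keyword (% and _) are
    deliberately not escaped in this sketch."""
    # The file: scope is written with forward slashes.
    scope = sql_literal(folder.replace("\\", "/"))
    pattern = sql_literal(keyword)
    return (
        "SELECT System.ItemPathDisplay FROM SystemIndex "
        f"WHERE SCOPE='file:{scope}' "
        f"AND System.FileName LIKE '%{pattern}%'"
    )


if __name__ == "__main__":
    print(build_filename_query("sunset", r"C:\images"))
```

Running the query itself needs Windows and an OLE DB capable client; only the string construction is shown here. Matching on a different property, such as the file's keywords, would mean changing only the `WHERE` clause.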
Honestly, all these other suggestions about databases, and so on, are a waste of time.
MSDN search of windows search at http://social.msdn.microsoft.com/Search/en-US?query=windows+search
Related Search Technologies to Windows Search at http://msdn.microsoft.com/en-us/library/bb286798(VS.85).aspx
Searching a million files in one folder is going to be prohibitively slow. (See my response at “Directory file size calculation – how to make it faster?”.)
I can search my hard drive of ~300,000 files for “*tabcontrol.cs” in less than a second once the index is warm. The first query takes approx. 4000 ms, and each query after that, using a different search term, takes 300-600 ms.
See the DSearch sample at http://msdn.microsoft.com/en-us/library/dd940335(VS.85).aspx for how easy this is to implement.
“Searching the Desktop” at http://blogs.msdn.com/b/coding4fun/archive/2007/01/05/1417884.aspx