The Archive Base

Editorial Team
Asked: June 8, 2026

Here’s my scenario. User selects a document in my software and my software extracts

Here’s my scenario. The user selects a document in my software, and the software extracts some key data from it. The software handles two formats: PDF and DOCX. For each format there are several templates, and the uploaded document is supposed to belong to one of them. I don’t know whether this is a well-known problem or whether an established design pattern exists for it (that’s why I’m on SO). Here’s what I have designed so far:

Since each template has specific structure/contents, I’m thinking of creating separate classes for each template. There will be a top-level interface called IExtractor, then there will be two top-level classes called PdfExtractor and DocxExtractor, each implementing the IExtractor interface. Any functionality common to all PDF (or DOCX) templates will go into these parent classes.

Below these two parent classes, there will be several template classes, one for each template. For example, a class called Template571_PdfExtractor inherits from PdfExtractor and has methods specific to Template 571, but provides results in the same form as any other extractor.

I’m using C# 4.0 if that matters. Here’s the skeleton:

The interface:

interface IExtractor
{
    void ExtractDocument(System.IO.FileInfo document, dsExtract dsToFill);
}

The two parent classes:

public class DocxExtractor : IExtractor 
{
    public virtual void ExtractDocument(System.IO.FileInfo document, dsExtract dsToFill)
    {
    }
}

public class PdfExtractor : IExtractor 
{
    public virtual void ExtractDocument(System.IO.FileInfo document, dsExtract dsToFill)
    {
    }
}

One of the concrete classes:

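public class Template571_PdfExtractor : PdfExtractor
{
    // "override" rather than "virtual": a second "virtual" would hide the base method,
    // so calls made through IExtractor or PdfExtractor would never reach this one.
    public override void ExtractDocument(System.IO.FileInfo document, dsExtract dsToFill)
    {
    }
}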

Now there are a few key questions I’m not sure about. All of them revolve around the same problem: I don’t know how and where to instantiate the concrete (template) class. I can use the file extension to decide whether to go down the PdfExtractor branch or the DocxExtractor branch. After that, it is the file’s contents that tell me which template the user’s document belongs to. So where do I put this “decision” code? My idea was to put it in the PdfExtractor class (or DocxExtractor, for that matter). Is that the correct way?

Sorry this got a bit long, but I didn’t know how else to fully describe my situation. Thanks for your ideas.

Shujaat

1 Answer

  1. Editorial Team
    Added an answer on June 8, 2026 at 8:22 pm

    Once you dig deeper into design patterns and such you’ll surely find out that most of the time there is no one correct way to implement something…

    One possible way would be to create so-called factory classes: one for PdfExtractors and another for DocxExtractors. Each factory class would probably have a single static method like

    public final class PdfExtractorFactory {
       public static PdfExtractor getExtractor(String filename) { ... }
    
       ... // constructor, or singleton getter here
    }
    

    The logic that decides which concrete subclass of PdfExtractor to return (i.e., which template to use) would then reside in the factory method. This way, neither the abstract base class PdfExtractor nor its subclasses would be cluttered with this decision logic. Only the factory classes would need to know about the subclasses of PdfExtractor (or DocxExtractor, respectively), and the rest of your code would be totally unaware of the concrete subclasses, since the factories hand out instances typed as the superclasses.
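To make this concrete for the question's C# code, here is a minimal sketch of such a factory. It rests on several assumptions: the extractor classes are reduced to empty stand-ins, Template602_PdfExtractor is a hypothetical second template, the factory receives text that has already been pulled out of the PDF, and the marker strings "FORM 571" and "FORM 602" are invented. How a template is really recognised depends on the documents.

```csharp
using System;

// Stand-ins for the question's classes so the sketch compiles on its own;
// the real ExtractDocument(FileInfo, dsExtract) members are left out.
public abstract class PdfExtractor { }

public class Template571_PdfExtractor : PdfExtractor { }

// Hypothetical second template, present only to show the branching.
public class Template602_PdfExtractor : PdfExtractor { }

public static class PdfExtractorFactory
{
    // The only place that knows the concrete template classes.
    // documentText is assumed to be text already extracted from the PDF;
    // the marker strings below are invented for illustration.
    public static PdfExtractor GetExtractor(string documentText)
    {
        if (documentText == null)
            throw new ArgumentNullException("documentText");

        if (documentText.Contains("FORM 571"))
            return new Template571_PdfExtractor();

        if (documentText.Contains("FORM 602"))
            return new Template602_PdfExtractor();

        throw new NotSupportedException("Document matches no known PDF template.");
    }
}

public static class Program
{
    public static void Main()
    {
        // Callers only ever see the base type.
        PdfExtractor extractor = PdfExtractorFactory.GetExtractor("... FORM 571 ...");
        Console.WriteLine(extractor.GetType().Name); // prints "Template571_PdfExtractor"
    }
}
```

Adding a new template then means adding one subclass and one branch in the factory; the calling code stays unchanged.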

    Since you’re likely to need only a single instance of PdfExtractorFactory and DocxExtractorFactory, you might choose to implement these factory classes as singletons.

    Update: Of course you can use either a static factory method or the Singleton pattern and a non-static factory method (but you don’t need both).
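For comparison, the singleton variant with a non-static factory method might look roughly like the following C# sketch. The same caveats apply: DocxExtractor is an empty stand-in, Template571_DocxExtractor is a hypothetical class named after the question's scheme, and the "TEMPLATE 571" marker is invented.

```csharp
using System;

// Stand-in for the question's class so the sketch compiles on its own.
public abstract class DocxExtractor { }

// Hypothetical template class, named after the question's naming scheme.
public class Template571_DocxExtractor : DocxExtractor { }

public sealed class DocxExtractorFactory
{
    // Single shared instance, created once by the static field initializer.
    private static readonly DocxExtractorFactory instance = new DocxExtractorFactory();

    // Private constructor prevents callers from creating further instances.
    private DocxExtractorFactory() { }

    public static DocxExtractorFactory Instance
    {
        get { return instance; }
    }

    // Non-static factory method; "TEMPLATE 571" is a made-up marker string.
    public DocxExtractor GetExtractor(string documentText)
    {
        if (documentText == null)
            throw new ArgumentNullException("documentText");

        if (documentText.Contains("TEMPLATE 571"))
            return new Template571_DocxExtractor();

        throw new NotSupportedException("Document matches no known DOCX template.");
    }
}

public static class Program
{
    public static void Main()
    {
        DocxExtractor extractor =
            DocxExtractorFactory.Instance.GetExtractor("... TEMPLATE 571 ...");
        Console.WriteLine(extractor.GetType().Name); // prints "Template571_DocxExtractor"
    }
}
```

The instance-based form mainly pays off if the factory later needs state or has to be swapped out in tests; otherwise the static method is the shorter route.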

© 2021 The Archive Base. All Rights Reserved