The data source I am working with is terrible. Some places where you would


Asked: May 19, 2026 (2026-05-19T00:29:51+00:00)


The data source I am working with is terrible. Some places where you would expect integers, you get “Three”. In the phone number field, you may get “the phone # is xxx”. Some fields are simply blank.

This is OK: I’m parsing each field, so “Three” will end up in my model as the integer 3, and phone numbers (and the like) will be extracted via regex. Users of the service KNOW that the data is sketchy and incomplete. It’s an unfortunate fact of the way our data source is maintained, and there’s nothing we can do about it but step up our parsing game! As an aside, we are slowly producing our own version of the data as we parse more and more of the original, but this poor source has to do for now.
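A minimal sketch of what that kind of lenient field parsing might look like in plain Ruby. The helper names (`parse_integer`, `parse_phone`), the word list, and the phone pattern are all assumptions made for illustration, not the asker's actual code; a real data source would almost certainly need a richer pattern.

```ruby
# Hypothetical lookup table for spelled-out numbers (illustrative, not exhaustive).
NUMBER_WORDS = {
  "zero" => 0, "one" => 1, "two" => 2, "three" => 3, "four" => 4,
  "five" => 5, "six" => 6, "seven" => 7, "eight" => 8, "nine" => 9, "ten" => 10
}.freeze

# Hypothetical pattern: a digit, at least five digits/separators, then a digit.
PHONE_PATTERN = /\+?\d[\d\s().-]{5,}\d/

# Returns an Integer when one can be recovered, otherwise nil.
def parse_integer(raw)
  text = raw.to_s.strip.downcase
  return nil if text.empty?
  return NUMBER_WORDS[text] if NUMBER_WORDS.key?(text)

  digits = text[/-?\d+/] # first run of digits anywhere in the field
  digits ? digits.to_i : nil
end

# Returns the first phone-like run with separators stripped, otherwise nil.
def parse_phone(raw)
  match = raw.to_s[PHONE_PATTERN]
  match && match.gsub(/[^\d+]/, "")
end
```

Returning `nil` rather than raising keeps blank or unparseable fields as gaps in the partial model, which the user can then fill in by hand.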

So users select the data they wish to parse, and we do what we can, returning a partial/incorrect model. Now the final model that we want to store should be validated – there are certain fields that can’t be null, certain strings must adhere to a format and so on.

The flow of the app is:

1. User tells the service which data to parse.
2. Service goes off and grabs the data, parses what it can and returns a partial model with whatever data it could retrieve.
3. We display the data to the user, allowing them to make corrections and to fill in any mandatory fields for which no data was collected.
4. This user-corrected data is to be saved, and therefore validated.
5. If validation fails, show data again for user to make fixes, rinse & repeat.

What is the best way to go about having a model which starts off being potentially completely invalid or containing no data, but which needs to be validated eventually? The two ways I’ve thought of (and partially implemented) are:

1. Two models: a Data model, which has validations etc., and an UnconfirmedData model, which has no validations. The original data is put into an UnconfirmedData model until the user has made their corrections, at which point it is put into a Data model and validation is attempted.
2. One model with a “confirmed data” flag, with validation performed manually rather than through Rails’ validations.

In practice I lean towards using 2 models, but I’m pretty new to Rails so I thought there may be a nicer way to do this. Rails has a habit of surprising me like that 🙂


1 Answer

Editorial Team · Answer 1 · 2026-05-19T00:29:52+00:00

Must you save your data in between requests? If so, I would use your two-model approach, but use Single Table Inheritance (STI) to keep things DRY.

The first model, the one responsible for the parsing and the rendering and the doing-the-best-it-can, shouldn’t have any validations or restrictions on saving it. It should however have the type column in the migration so you can use the inheritance goodness. If you don’t know what I’m talking about, read up on the wealth of information on STI, a good place to start would be a definitive guide.

The second model would be the one you use in the rest of the application: the strict model, the one which has all the validations. Every time a user submits reworked and potentially valid data, your app would take the instance of the open model built from the params, try to convert it to an instance of the second model, and check whether it is valid. If it is, save it to the database; the type attribute will change, and everything will be wonderful. If it isn’t valid, save the first instance, and return the second instance to the user so the validation error messages can be shown.

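class ArticleData < ActiveRecord::Base
  # Lenient base model: no validations, so partial data can always be saved.
  def parse_from_url(url)
    # parses some stuff from the data source
  end
end

class Article < ArticleData
  # Strict STI subclass: only records that pass validation are saved as type "Article".
  validates_presence_of :title, :body
  # validates_length_of takes :minimum/:maximum; :greater_than belongs to validates_numericality_of
  validates_length_of :title, :minimum => 20
  # ...
end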

You’ll need a pretty intense controller action to facilitate the above process, but it shouldn’t be too difficult. In the rest of your application, make sure you run your queries on the Article model to only get back valid ones.
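The core decision that controller action has to make can be sketched without Rails. The following is a plain-Ruby stand-in under stated assumptions: `DraftArticle` and `StrictArticle` are hypothetical classes standing in for `ArticleData` and `Article`, `submit_corrections` is a hypothetical helper, and the hand-written `errors` method only imitates what ActiveRecord validations would provide. In a real Rails app, `ActiveRecord::Base#becomes` is the usual way to get an instance of another class in the same STI hierarchy.

```ruby
# Lenient stand-in for ArticleData: accepts anything, validates nothing.
class DraftArticle
  attr_accessor :title, :body

  def initialize(attrs = {})
    @title = attrs[:title]
    @body  = attrs[:body]
  end

  def attributes
    { title: title, body: body }
  end
end

# Strict stand-in for Article: same data, plus hand-rolled checks that
# imitate validates_presence_of and a minimum title length.
class StrictArticle < DraftArticle
  MIN_TITLE_LENGTH = 20

  def errors
    problems = []
    problems << "title can't be blank" if title.to_s.strip.empty?
    problems << "body can't be blank" if body.to_s.strip.empty?
    if !title.to_s.strip.empty? && title.length < MIN_TITLE_LENGTH
      problems << "title is too short"
    end
    problems
  end

  def valid?
    errors.empty?
  end
end

# Hypothetical controller-style step: merge the user's corrections into the
# draft, then try to promote it to the strict class.
def submit_corrections(draft, params)
  draft.title = params[:title] if params.key?(:title)
  draft.body  = params[:body]  if params.key?(:body)

  candidate = StrictArticle.new(draft.attributes)
  if candidate.valid?
    [:confirmed, candidate]   # a Rails app would persist the strict record here
  else
    [:needs_fixes, candidate] # re-render the form using candidate.errors
  end
end
```

The draft keeps the user's latest corrections either way, so a failed attempt loses nothing and the next submission starts from where the previous one left off.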

Hope this helps!
