Question

Asked: May 13, 2026

I am creating a page where people can post articles. When the user posts an article, it shows up on a list, like the related questions on Stack Overflow (when you add a new question). It’s fairly simple.

My problem is that I have 2 types of users. 1) Unregistered private users. 2) A company.

Unregistered users need to enter their name, email and phone, whereas company users only need to enter their company name and password. Fairly simple.

I need to reduce the excess database usage and try to optimize the database and build the tables effectively.

Now to the problem at hand:

So I have one table with the information about the companies, ID (guid), Name, email, phone etc.

I was thinking about making one table called articles that contained ArticleID, Headline, Content and Publishing date.

One table with the information about the unregistered users, ID, their name, email and phone.

How do I tie the articles table to the company/unregistered users tables? Is it a good idea to add an integer column that takes one of two values (1 = unregistered user, 2 = company) plus a second column holding the ID of the specific user or company? It looks like that would need a lot of extra code to query the database. What about performance? How could I then return an article along with its contact information? It should also be possible to return all the articles from a specific company.

So Table company would be:

ID (guid), company name, phone, email, password, street, zip, country, state, www, description, contact person and a few more that I don't have here right now.

Table Unregistered user:

ID (guid), name, phone, email

Table article:

ID (int/guid/short guid), headline, content, published date, is_company, id_to_user

Is there a better approach?

The qualities I am looking for are: performance, ease of querying and ease of maintenance (adding new fields, indexes etc.).

1 Answer

Editorial Team · Answer 1 · 2026-05-13T20:54:40+00:00

Theory

The problem you described is called Table Inheritance in data modeling theory. In Martin Fowler's book Patterns of Enterprise Application Architecture, the solutions are:

single table inheritance: a single table that contains all fields.
class table inheritance: one table per class, with a table for abstract classes.
concrete table inheritance: one table per non-abstract class, abstract members are repeated in each concrete table

So from a theory and industry practice point of view, all three solutions are acceptable: one table Posters with NULLable columns (i.e. single table inheritance), three tables Posters, Companies and Persons (i.e. class table inheritance), or two tables Companies and Persons (i.e. concrete table inheritance).
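As a minimal sketch of the class table inheritance option, the following uses Python's built-in sqlite3 module as a stand-in for the real database. The table and column names (Posters, poster_id and so on) and the sample rows are assumptions made for illustration, not part of the original question. It shows that an article can be tied to either kind of poster through one foreign key, and that both "article with contact information" and "all articles from one company" become plain queries.

```python
import sqlite3

# Class table inheritance: shared fields live in Posters, type-specific
# fields live in Companies / Persons. All names here are illustrative.
SCHEMA = """
CREATE TABLE Posters (
    poster_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    email     TEXT NOT NULL,
    phone     TEXT
);
CREATE TABLE Companies (
    poster_id INTEGER PRIMARY KEY REFERENCES Posters(poster_id),
    password  TEXT NOT NULL,
    country   TEXT
);
CREATE TABLE Persons (
    poster_id INTEGER PRIMARY KEY REFERENCES Posters(poster_id)
);
CREATE TABLE Articles (
    article_id INTEGER PRIMARY KEY,
    poster_id  INTEGER NOT NULL REFERENCES Posters(poster_id),
    headline   TEXT NOT NULL,
    content    TEXT NOT NULL,
    published  TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database with one company and one person."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO Posters VALUES (?, ?, ?, ?)", [
        (1, "Acme Ltd", "info@acme.example", "555-0100"),
        (2, "Jane Doe", "jane@example.com", "555-0101"),
    ])
    conn.execute("INSERT INTO Companies VALUES (1, 'not-a-real-hash', 'SE')")
    conn.execute("INSERT INTO Persons VALUES (2)")
    conn.executemany("INSERT INTO Articles VALUES (?, ?, ?, ?, ?)", [
        (1, 1, "We are hiring", "...", "2026-05-13"),
        (2, 2, "Bike for sale", "...", "2026-05-13"),
        (3, 1, "New office", "...", "2026-05-14"),
    ])
    return conn


def articles_with_contact(conn):
    """Return (headline, name, email) per article using a single join."""
    return conn.execute(
        "SELECT a.headline, p.name, p.email FROM Articles a "
        "JOIN Posters p ON p.poster_id = a.poster_id ORDER BY a.article_id"
    ).fetchall()


def headlines_by_poster(conn, poster_id):
    """Return all headlines posted by one company or person."""
    rows = conn.execute(
        "SELECT headline FROM Articles WHERE poster_id = ? ORDER BY article_id",
        (poster_id,),
    ).fetchall()
    return [r[0] for r in rows]
```

No type flag such as is_company is needed in Articles under this layout: the poster's type is implied by which child table holds its poster_id.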

Now, to pros and cons.

Cost of NULL columns

The record structure is discussed in Inside the Storage Engine: Anatomy of a record:

NULL bitmap

two bytes for count of columns in the record

variable number of bytes to store one bit per column in the record, regardless of whether the column is nullable or not (this is different and simpler than SQL Server 2000 which had one bit per nullable column only)

So if you have at least one NULLable column, you pay the cost of the NULL bitmap in each record, at least 3 bytes. But the cost is identical whether you have 1 or 8 columns! Since the bitmap holds one bit per column, the 9th column adds a byte to the NULL bitmap in each record. The formula is described in Estimating the Size of a Clustered Index: 2 + ((Num_Cols + 7) / 8), where only the integer part of the division is used.
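The arithmetic can be checked with a few lines of Python. The function name null_bitmap_bytes is made up for this sketch; the formula itself is the one quoted above, with integer division.

```python
def null_bitmap_bytes(num_cols: int) -> int:
    """Per-record NULL bitmap size in bytes: 2 + ((Num_Cols + 7) / 8).

    Two bytes hold the column count; the remainder is one bit per column,
    rounded up to whole bytes by the integer division.
    """
    if num_cols < 1:
        raise ValueError("a record needs at least one column")
    return 2 + (num_cols + 7) // 8


# Sizes for a few column counts, to show where the extra byte appears.
sizes = {n: null_bitmap_bytes(n) for n in (1, 8, 9, 16, 17)}
```

With this, 1 and 8 columns both cost 3 bytes, 9 through 16 columns cost 4 bytes, and 17 columns cost 5 bytes.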

Performance Driving Factor

In a database system there is really only one factor that drives performance: the amount of data scanned. How large are the records scanned by a query plan, and how many records does it have to scan? So to improve performance you need to:

narrow the records: reduce the data size, covering include indexes, vertical partitioning
reduce the number of records scanned: indexes
reduce the number of scans: eliminate joins

Now in order to analyze these criteria, there is something missing in your post: the prevalent data access pattern, ie. the most common query that the database will be hit with. This is driven by how you display your posts on the site. Consider these possible approaches:

posts front page: like SO, a page of recent posts with header, excerpt, time posted and basic author information (name, gravatar). To display this page you need to join Posts with authors, but you only need the author name and gravatar. Both single table inheritance and class table inheritance would work, but concrete table inheritance would fail. This is because such a query would have to do conditional joins (i.e. join the articles posted to either Companies or Persons), which you cannot afford on the hottest page because the resulting plan will be less than optimal.
posts per author: users have to log in first and then they'll see their own posts (this is common for non-public, post-oriented sites, think incident tracking for instance). For such a design, all three table inheritance schemes would work.
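To make the conditional join concrete, here is a small sketch of the front page query under concrete table inheritance, using the is_company / id_to_user columns proposed in the question. It runs on Python's sqlite3 module as a stand-in for the real database, and the sample rows are invented. Note that both author tables must be probed for every article and the name picked afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Companies (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE Persons   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE Articles (
    id         INTEGER PRIMARY KEY,
    headline   TEXT NOT NULL,
    is_company INTEGER NOT NULL,  -- 1 = company, 0 = unregistered user
    id_to_user INTEGER NOT NULL   -- id in Companies OR Persons
);
INSERT INTO Companies VALUES (1, 'Acme Ltd');
INSERT INTO Persons   VALUES (1, 'Jane Doe');
INSERT INTO Articles  VALUES (1, 'We are hiring', 1, 1);
INSERT INTO Articles  VALUES (2, 'Bike for sale', 0, 1);
""")

# The join target depends on a per-row flag, so each article needs an
# outer join against both tables, and COALESCE picks whichever matched.
FRONT_PAGE_SQL = """
SELECT a.headline, COALESCE(c.name, p.name) AS author
FROM Articles a
LEFT JOIN Companies c ON a.is_company = 1 AND c.id = a.id_to_user
LEFT JOIN Persons   p ON a.is_company = 0 AND p.id = a.id_to_user
ORDER BY a.id
"""


def front_page(conn):
    """Return (headline, author name) pairs for the front page."""
    return conn.execute(FRONT_PAGE_SQL).fetchall()
```

Also note that id_to_user cannot be declared as a foreign key here, because it points at two different tables; with a shared authors table the same page needs only one inner join on a real foreign key.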

Conclusion

There are some general performance considerations (ie. narrow the data) to consider, but the critical information is missing: how are you going to query the data, your access pattern. The data model has to be optimized for that access pattern:

Which fields from Companies and Persons will be displayed on the landing page of the site (i.e. the most frequent and performance-critical query)? You don't want to join 5 tables to show those fields.
Are some Company/Person information fields only needed on the user information page? Perhaps partition the table vertically into CompaniesExtra and PersonsExtra tables. Or use an index that covers the frequently used fields (this approach simplifies code and is easier to keep consistent, at the cost of data duplication).
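The covering index idea can be sketched as follows, again on sqlite3 as a stand-in. SQLite has no INCLUDE clause, so a composite index on (name, email) plays the role that an index with included columns would play in SQL Server; the index name, columns and data are assumptions for illustration. The helper returns the plan text so you can see whether the wide base table is touched at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Companies (
    id          INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    email       TEXT,
    street      TEXT,
    zip         TEXT,
    country     TEXT,
    description TEXT
);
-- Narrow index holding only the fields the hot query needs.
CREATE INDEX IX_Companies_Name_Email ON Companies (name, email);
INSERT INTO Companies (id, name, email, description)
VALUES (1, 'Acme Ltd', 'info@acme.example', 'long text ...');
""")

LANDING_SQL = "SELECT name, email FROM Companies WHERE name = ?"


def query_plan(conn, sql, params=()):
    """Return the human-readable steps of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # last column is the detail text


plan = query_plan(conn, LANDING_SQL, ("Acme Ltd",))
```

If the plan mentions a covering index, the query is answered from the narrow index alone, which is the "narrow the records" effect without splitting the table; the price is that name and email are stored twice.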

PS

Needless to say, don’t use guids for ids. Unless you’re building a distributed system, they are a horrible choice for reasons of excessive width. Fragmentation is also a potential problem, but that can be alleviated by use of sequential guids.
