Yay, first post on SO! (Good work Jeff et al.)
We’re trying to solve a bottleneck in one of our web-applications that was introduced when we started allowing users to generate reports on-demand.
Our infrastructure is as follows: 1 server acting as a Webserver/DBServer (ColdFusion 7 and MSSQL 2005)
It’s serving a web-application for our backend users and a frontend website. The reports are generated by the users from the backend so there’s a level of security where the users have to log in (web based).
During peak hours, report generation slows the web-application and frontend website to an unacceptable speed, because SQL Server uses resources for the huge queries and ColdFusion then generates multi-page PDFs.
We’re not exactly sure what the best practice would be to remove some load, but restricting access to the reports isn’t an option at the moment.
We’ve considered denormalizing data to other tables to simplify the most common queries, but that seems like it would just push the issue further.
So, we’re thinking of getting a second server and using it as a ‘report server’ with a replicated copy of our DB on which the queries would be run. This would fix one issue, but the second remains: generating PDFs is resource intensive.
We would like to offload that task to the reporting server as well. But because this is a secured web-application, we can’t just fire an HTTP GET at server 2 to create the PDF: the user is logged in to the web-application on server 1, where the PDF would be displayed, and server 2 would be generating/serving it without validating the user’s credentials…
Anyone have experience with this? Thanks in advance Stack Overflow!!
‘We would like to offload that task to the reporting server as well. But because this is a secured web-application, we can’t just fire an HTTP GET at server 2 to create the PDF: the user is logged in to the web-application on server 1, where the PDF would be displayed, and server 2 would be generating/serving it without validating the user’s credentials…’
why can’t you? you’re using the world’s easiest language for writing webservices. here are my suggestions.
first, move the database to its own server, so that cf and sql server run on separate machines. the first reason to do this is performance: as already mentioned, having both cf and sql on the same server isn’t an ideal setup. the second reason is security. if someone is able to hack your webserver, they’re right there next to your data. you should have a firewall in place between your cf and sql servers to give you more security. the last reason is scalability. if you ever need to throw more resources at your database or cluster it, that’s easier when it’s on its own server.
now for the webservices. what you can do is install cf on another server and write webservices to handle the generation of reports. just lock down the new cf server to accept only ssl connections and pass the login credentials of the users to the webservice. inside your webservice, authenticate the user before invoking the methods that generate the report.
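to make the "authenticate first, then generate" flow concrete, here is a minimal language-agnostic sketch in python (not coldfusion). everything in it is an assumption for illustration: the in-memory `_USERS` store, the function names, and the placeholder report body are all hypothetical, and the ssl transport is assumed to be handled by the server in front of it.

```python
import hashlib
import hmac
import os

# Hypothetical in-memory user store: username -> (salt, password_hash).
# A real report server would check the same user table the main application uses.
_USERS = {}


def _hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash so plain-text passwords are never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register_user(username: str, password: str) -> None:
    """Helper for this sketch only: add a user to the in-memory store."""
    salt = os.urandom(16)
    _USERS[username] = (salt, _hash_password(password, salt))


def authenticate(username: str, password: str) -> bool:
    """Return True only if the user exists and the password matches."""
    record = _USERS.get(username)
    if record is None:
        return False
    salt, expected = record
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(expected, _hash_password(password, salt))


def generate_report(username: str, password: str, report_id: str) -> bytes:
    """Entry point the main application would call over an SSL connection."""
    if not authenticate(username, password):
        raise PermissionError("invalid credentials")
    # Placeholder for the expensive query and PDF rendering step.
    return f"report {report_id} for {username}".encode("utf-8")
```

the point of the design is that the report server never trusts the caller: every request carries credentials and is rejected before any expensive work starts if they don't check out.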
now for the pdfs themselves. one method i’ve used in the past is generating a hash based on some of the parameters passed (the user credentials and the generated sql for the query). once the pdf is generated, you use the hash as the name of the pdf and save it to disk. now you have a simple caching system: look to see if the pdf already exists and if so, return it, otherwise generate it and cache it.
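a rough sketch of that hash-named cache, again in python rather than coldfusion and under stated assumptions: `cache_key`, `get_report_pdf`, and the `render` callback are hypothetical names, sha-256 is just one reasonable hash choice, and the key uses the username rather than the password so no secret ends up in a filename.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cache_key(username: str, sql: str) -> str:
    """Build a stable hex digest from the user and the generated SQL."""
    digest = hashlib.sha256()
    digest.update(username.encode("utf-8"))
    # NUL separator so ("ab", "c") and ("a", "bc") produce different keys.
    digest.update(b"\0")
    digest.update(sql.encode("utf-8"))
    return digest.hexdigest()


def get_report_pdf(
    cache_dir: Path,
    username: str,
    sql: str,
    render: Callable[[str], bytes],
) -> bytes:
    """Return the cached PDF if present; otherwise render, store, and return it."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    pdf_path = cache_dir / f"{cache_key(username, sql)}.pdf"
    if pdf_path.exists():
        return pdf_path.read_bytes()
    # Cache miss: run the expensive step once and keep the result on disk.
    pdf_bytes = render(sql)
    pdf_path.write_bytes(pdf_bytes)
    return pdf_bytes
```

one thing to keep in mind with a cache like this: a stored pdf never changes on its own, so if the underlying data changes you need some way to expire or delete old files.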
in closing, your problem is one that most people have seen before. you just need to do a little work and your application will be orders of magnitude faster.