I migrated a news database into a CakePHP news site I am creating. I have a problem displaying the text of the migrated articles, because the imported text contains HTML tags that control the formatting of the text inside them.
Could anyone help me find a way to remove this markup without compromising the layout of the articles?
Basically, I would like to accomplish the following:
- Create a one-time-use function that I can include in my ArticlesController.
- For example, the function name would be function fixtext(){...}.
- When I call this function from, let's say, http://mydomain.com/articles/fixtext, all the affected rows in the Article.body column would be scanned and fixed.
The section of text I want to remove is font-size: 12pt; line-height: 115%;, which is inside the <span>...</span> tag.
I had something in mind like this, but I am not sure how to implement it:
function fixtext(){
    // Boolean false, not the string 'FALSE' (a non-empty string is truthy)
    $this->autoRender = false;
    // Fetch only the columns needed, without associated models
    $articles = $this->Article->find(
        'all',
        array(
            'fields' => array(
                'Article.body',
                'Article.id'
            ),
            'recursive' => -1
        )
    );
    foreach($articles as $article){
        // Per Dunhamzzz suggestion
        $text = str_replace('font-size: 12pt; line-height: 115%;', '', $article['Article']['body']);
        $this->Article->id = $article['Article']['id'];
        // saveField() expects the bare column name, without the model alias
        $this->Article->saveField('body', $text);
    }
    $this->redirect('/');
}
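To check the replacement in isolation before touching the database, here is a minimal sketch in plain PHP. The helper name stripImportedStyles and the sample body are made up for illustration and are not part of CakePHP or of the real data.

```php
<?php
// Hypothetical helper (not part of CakePHP): removes the unwanted inline
// style declarations from one article body and leaves the <span> tags alone.
function stripImportedStyles($body)
{
    // Exact text to remove; str_replace() only matches this literal string.
    $unwanted = 'font-size: 12pt; line-height: 115%;';
    return str_replace($unwanted, '', $body);
}

// Made-up sample body, for illustration only.
$sample = '<p><span style="font-size: 12pt; line-height: 115%;">Hello</span></p>';
echo stripImportedStyles($sample), PHP_EOL;
```

This prints <p><span style="">Hello</span></p>. Because the match is literal, any row where the spacing or the order of the declarations differs would be left unchanged.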
I am not sure how to approach this or what the best way is.
Firstly, I would personally create a shell to accomplish this, as it is a batch job and (depending on the number of records involved) you may hit Apache’s request timeout limit. Also, it’s a good (fun) learning experience, and the shell can be extended to perform future maintenance tasks.
Secondly, it is a bad idea to parse HTML using (greedy) regular expressions, because the markup may be malformed. It is safer to use an HTML parser or simple string replacements instead, but if it is a small, regular string that can be pattern matched safely (i.e. you're not trying to remove the closing </span> tags), regular expressions can work.
Something like this (untested):