The Archive Base

Editorial Team
Asked: May 27, 2026 (2026-05-27T14:46:01+00:00)

Converting a newline into a space makes sense for English. For example, take the following HTML:

<p>
This is
a sentence.
</p>

The browser converts the newline into a space, so we get the following:

This is a sentence.

This is good for English, but not for Chinese, because we don’t use spaces to separate words in Chinese. Here’s an example (the Chinese sentence has the same meaning as "This is a sentence"):

<p>
这是
一句话。
</p>

I get the following result on Chrome, Safari and IE…

这是 一句话。

…but what I wanted is the following, without the extra space:

这是一句话。

I don’t know why browsers do not ignore the newline when the last character of the current line and the first character of the next line are both Chinese characters (which I think would make more sense). Or do they provide such a mechanism, but one that needs special handling?

BTW, in Vim, when using "J" to join lines, no space is added if the last character of the first line and the first character of the second line are both Chinese characters. For English, a space is added. So I guess Vim does some special handling for this.

UPDATE:

Though I think this is an issue with the browser, I have to live with it. So for now I preprocess my Markdown text to join Chinese lines before generating HTML. Here’s how I do this in Ruby; the complete code, which also handles Chinese punctuation, is on gist:

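# encoding: UTF-8

# Requires Ruby 1.9.x or later and assumes UTF-8 encoding.

class String
  # The regular expression trick to match CJK characters comes from
  # http://stackoverflow.com/a/4681577/306935
  #
  # Removes a newline only when it sits between two Han characters.
  # Lookarounds keep the neighbouring characters out of the match, so
  # runs of one-character lines are joined as well.
  def join_chinese
    gsub(/(?<=\p{Han})\n(?=\p{Han})/, '')
  end
end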
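
A minimal sketch of how the punctuation handling mentioned above might look. This is not the code from the gist: the method name join_cjk_lines and the two punctuation ranges (CJK Symbols and Punctuation, Halfwidth and Fullwidth Forms) are assumptions chosen for illustration.

```ruby
# encoding: UTF-8

# Matches a newline that sits between two CJK characters. "CJK" here means
# (by assumption) Han ideographs plus the blocks U+3000-U+303F and
# U+FF00-U+FFEF, which contain punctuation such as 。 and ，.
CJK_LINE_BREAK =
  /(?<=[\p{Han}\u3000-\u303F\uFF00-\uFFEF])\n(?=[\p{Han}\u3000-\u303F\uFF00-\uFFEF])/

# Hypothetical helper: removes line breaks between CJK characters and
# leaves every other newline untouched.
def join_cjk_lines(text)
  text.gsub(CJK_LINE_BREAK, '')
end

puts join_cjk_lines("这是\n一句话。")        # prints 这是一句话。
puts join_cjk_lines("你好，\n世界")          # prints 你好，世界
puts join_cjk_lines("This is\na sentence.")  # unchanged, still two lines
```

Lines that mix scripts (a Chinese line followed by an English one) keep their newline, so the browser still inserts a space there.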

1 Answer

  1. Editorial Team
    Added an answer on May 27, 2026 at 2:46 pm (2026-05-27T14:46:01+00:00)

    Browsers treat newlines as spaces because the specifications have said so ever since HTML 2.0. In fact, HTML 2.0 was milder than later specifications; it said: “An HTML user agent should treat end of line in any of its variations as a word space in all contexts except preformatted text.” (Conventional Representation of Newlines), whereas newer specifications state this more strongly (describing it as what happens in HTML).
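
    As a rough illustration of that rule (a simplified sketch, not the actual browser algorithm; the function name collapse_whitespace is made up for this example), runs of white space including line breaks can be modelled as collapsing to a single space:

```ruby
# encoding: UTF-8

# Simplified model of white space handling outside preformatted text:
# every run of spaces, tabs, and line breaks becomes one ASCII space,
# and leading/trailing white space is dropped. The script of the
# surrounding characters plays no role.
def collapse_whitespace(text)
  text.gsub(/[ \t\n\r\f]+/, ' ').strip
end

puts collapse_whitespace("\nThis is\na sentence.\n")  # prints This is a sentence.
puts collapse_whitespace("\n这是\n一句话。\n")          # prints 这是 一句话。
```

    The second call shows where the unwanted space in the question comes from: the rule treats both inputs identically.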

    The background is that HTML and the Web were developed mainly with Western European languages in mind; this is reflected in many features of the original specifications and early implementations. Only slowly have they been internationalized.

    It is unlikely that the parsing rules will be changed. What might happen instead is that rendering becomes sensitive to language or character properties. This would mean that a line break still gets taken as a space (and the DOM string will contain an ASCII space character), but a string like 这是 一句话。 would be rendered as if the space were not there. This is what the HTML 4.01 specification seems to refer to (White space). The text is somewhat confused, but I think it tries to say that the behavior would depend on the content language, either inferred by the browser or declared in markup.
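
    To make the idea concrete, here is a hypothetical sketch of such a rendering step. It is an assumption about what a character-sensitive renderer could do, not something a specification or browser defines, and the function name render_without_han_gaps is invented for the example. The DOM text keeps its space; only the displayed string drops it.

```ruby
# encoding: UTF-8

# Hypothetical display-time rule: hide a single space when the characters
# on both sides are Han ideographs. Spaces next to Latin text or
# punctuation are kept.
def render_without_han_gaps(dom_text)
  dom_text.gsub(/(?<=\p{Han}) (?=\p{Han})/, '')
end

dom_text = "这是 一句话。"               # what the parser produces
puts render_without_han_gaps(dom_text)  # prints 这是一句话。
puts render_without_han_gaps("This is a sentence.")  # prints This is a sentence.
```

    A rule based only on character properties cannot tell a space that came from a line break from one the author typed deliberately, which is one reason the decision might be tied to the declared or inferred content language instead.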

    But browsers don’t do such things yet. Declaring the language of content, e.g. <html lang=zh>, is a good principle but has little practical impact. In rendering, it may affect the browser’s choice of a default font (but how many authors let browsers use their default fonts?). It may even result in added spacing, if the space character happens to be wider in the browser’s default font for the language specified.

    According to the CSS3 Text draft, you could use the text-spacing property. The value none “Turns off all text-spacing features. All fullwidth characters are set with full-width glyphs.” Unfortunately, no browser seems to support this yet.

© 2021 The Archive Base. All Rights Reserved