Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 8202907
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 7, 20262026-06-07T07:21:33+00:00 2026-06-07T07:21:33+00:00

In the following Java program running in Linux using OpenJDK 1.6.0_22 I simply list

  • 0

In the following Java program running in Linux using OpenJDK 1.6.0_22 I simply list the contents of the directory taken in as parameter at the command line. The directory contains the files which have file names in UTF-8 (e.g. Hindi, Mandarin, German etc.).

import java.io.*;

class ListDir {

    public static void main(String[] args) throws Exception {
    //System.setProperty("file.encoding", "en_US.UTF-8");
        System.out.println(System.getProperty("file.encoding"));
    File f = new File(args[0]);
    for(String c : f.list()) {
        String absPath = args[0] + "" + c;
        File cf = new File(args[0] + "/" + c);
        System.out.println(cf.getAbsolutePath() + " --> " + cf.exists());
    }
    }
}

If I set the LC_ALL variable to en_US.UTF-8 the results are printed fine. But if I set the LC_ALL variable to POSIX and supply the file.encoding and sun.jnu.encoding properties as UTF-8 from command line I get the garbage output and cf.exists() returns false.

Can you please explain this behavior. As I read on so many websites file.encoding is said to be sufficient to read file names and use them for operations. Here it looks like that property has no effect at all.

Update 1: If I set file.encoding to something like GBK (Chinese) and LC_ALL variable to en_US.UTF-8 then cf.exists() returns true. only the ‘?’ appears instead of file name. Surprise o_O.

Update 2: More investigation and it looks like its not a Java issue. It looks like libc on Linux used locale settings to translate file name encodings and those settings will cause file not found error/exception. “file.encoding” is for how Java interprets file names.

Update 3 Now it looks problem is how Java interprets file names. The following simple C code works on Linux regardless of file encoding and value of LC_ALL environment variable (I am happy that this proves for answer given here: https://unix.stackexchange.com/questions/39175/understanding-unix-file-name-encoding). But still I am not clear how Java interprets on LC_ALL variable. Now looking into OpenJDK code for that.

Sample C code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <dirent.h>

int main(int argc, char *argv[])
{
    char *argdir = argv[1];
    DIR *dp = opendir(argdir);
    struct dirent *de;
    while(de = readdir(dp)) {
        char *abspath = (char *) malloc(strlen(argdir)  + 1 + strlen(de->d_name) + 1);
        strcpy(abspath, argdir);
        abspath[strlen(argdir)] = '/';
        strcpy(abspath + strlen(argdir) + 1, de->d_name);
        printf("%d %s ", de->d_type, abspath);
        FILE *fp = fopen(abspath, "r");
        if (fp) {
            printf("Success");
        }
        fclose(fp);
        putchar('\n');
    }
}
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-07T07:21:34+00:00Added an answer on June 7, 2026 at 7:21 am

    Note: So finally I think that I have nailed it down. I am not confirming that it is right. But with some code reading and tests this is what I found out and I don’t have additional time to look into it. If anyone is interested they can check it out and tell if this answer is right or wrong – I would be glad 🙂

    The reference I used was from this tarball available at OpenJDK’s site:
    openjdk-6-src-b25-01_may_2012.tar.gz

    1. Java natively translates all string to platform’s local encoding in this method: jdk/src/share/native/common/jni_util.c - JNU_GetStringPlatformChars() . System property sun.jnu.encoding is used to determine the platform’s encoding.

    2. The value of sun.jnu.encoding is set at jdk/src/solaris/native/java/lang/java_props_md.c - GetJavaProperties() using setlocale() method of libc. Environment variable LC_ALL is used to set the value of sun.jnu.encoding. Value given at the command prompt using -Dsun.jnu.encoding option to Java is ignored.

    3. Call to File.exists() has been coded in file jdk/src/share/classes/java/io/File.java and it returns as

      return ((fs.getBooleanAttributes(this) & FileSystem.BA_EXISTS) != 0);

    4. getBooleanAttributes() is natively coded (and I am skipping steps in code browsing through many files) in jdk/src/share/native/java/io/UnixFileSystem_md.c in function :
      Java_java_io_UnixFileSystem_getBooleanAttributes0(). Here the macro
      WITH_FIELD_PLATFORM_STRING(env, file, ids.path, path) converts path string to platform’s encoding.

    5. So conversion to wrong encoding will actually send a wrong C string (char array) to subsequent call to stat() method. And it will return with result that file cannot be found.

    LESSON: LC_ALL is very important

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I am running a Java program with the G1 garbage collector using the following
I get the following error when running mongo-hadoop streaming: java.io.IOException: Cannot run program "mapper.py":
Why gives the following Windows 7 .cmd command script: set SUN_JAVA_HOME=C:\Program Files (x86)\Java\jdk1.6.0_17 if
I am looking for a way to access running java program from command line.
I am running the following java program in the Eclipse IDE: import java.net.*; import
I'm calling a java program from my python code in the following way: subprocess.check_output([java,
I get the following error when I run a java program: Exception in thread
Given the following program: import java.io.*; import java.util.*; public class GCTest { public static
Observe the following program written in Java (complete runnable version follows, but the important
Why am I getting java.io.FileNotFoundException in the following program? import java.io.*; class FisDemo {

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.