The Archive Base

Editorial Team
Asked: June 16, 2026

macOS generates a .DS_Store file in my training data set directory, and load_files fails on it


macOS generates a .DS_Store file in my training data set directory. load_files tries to load it and raises an exception like:

UnicodeDecodeError: 'utf8' codec can't decode byte 0xff in position 1116

How can I filter out the .DS_Store file without deleting it?


1 Answer

  1. Editorial Team
    Added an answer on June 16, 2026 at 1:42 pm

    Looking at the documentation, there doesn’t seem to be any way to filter directly in load_files (or, rather, you can whitelist categories, but you can’t whitelist files within the categories, or blacklist at either level).

    You might want to consider filing a feature request to the scikit-learn project. Alternatively, you might consider it a bug that hidden files (as defined appropriately for the platform—but on OS X and other POSIX systems that should include files whose names start with .) are loaded, and file a bug report on that.

    Meanwhile, there is a load_content flag that you can set:

    load_content : boolean, optional (default=True)

    Whether to load or not the content of the different files. If true a ‘data’ attribute containing the text information is present in the data structure returned. If not, a filenames attribute gives the path to the files.

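    Pass False, and it will just find the filenames for you. The entries in ret.filenames are full paths, so test the base name rather than the whole string (e.g., filenames = [filename for filename in ret.filenames if not os.path.basename(filename).startswith('.')]), apply the same filter to ret.target so the labels stay aligned, then load the files manually.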

    This seems like the best solution available with the given tools.
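    As a rough sketch of the load_content=False approach, the helpers below filter hidden files by base name and keep the labels aligned with the surviving paths. The names drop_hidden, read_texts and load_visible_files are made up for illustration, and the UTF-8 encoding is an assumption about the data set. scikit-learn is imported inside the last function so that the filtering helpers can be tried without it.

```python
import os


def drop_hidden(filenames, targets):
    """Keep only (filename, target) pairs whose base name does not start with '.'."""
    kept = [(name, target) for name, target in zip(filenames, targets)
            if not os.path.basename(name).startswith(".")]
    return [name for name, _ in kept], [target for _, target in kept]


def read_texts(filenames, encoding="utf-8"):
    """Read each file as text; the encoding is an assumption about the data."""
    texts = []
    for path in filenames:
        with open(path, encoding=encoding) as handle:
            texts.append(handle.read())
    return texts


def load_visible_files(container_path, encoding="utf-8"):
    """Hypothetical wrapper: list files with load_files, then load only visible ones."""
    # Imported lazily so the helpers above work without scikit-learn installed.
    from sklearn.datasets import load_files

    bunch = load_files(container_path, load_content=False)
    filenames, targets = drop_hidden(bunch.filenames, bunch.target)
    return read_texts(filenames, encoding), targets, bunch.target_names
```

    Filtering filenames and targets together matters because load_files returns them as parallel sequences; dropping a path without dropping its label would shift every label after it.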

    On the other hand, given how simple load_files actually is—especially if you don’t use the extra features like categories or shuffle—it might be simpler to just not use it, and instead use os.walk or just os.listdir. In this case, given that the files are exactly 2 levels deep, rather than at an arbitrary depth, the latter is probably simpler:

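    import os

    def getfilenames(category):
        # Skip hidden files such as .DS_Store.
        return [filename for filename in os.listdir(category)
                if not filename.startswith('.')]
    # Only descend into category directories under rootpath.
    categoryfiles = [getfilenames(os.path.join(rootpath, category))
                     for category in os.listdir(rootpath)
                     if os.path.isdir(os.path.join(rootpath, category))]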
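    To go one step further than listing names, here is a minimal sketch of a stand-in for load_files built on os.listdir, assuming the rootpath/category/file layout described above and UTF-8 text files. The function name load_text_dataset and its return shape are illustrative choices, not part of scikit-learn.

```python
import os


def load_text_dataset(rootpath, encoding="utf-8"):
    """Return (texts, labels, categories) for a rootpath/category/file layout."""
    # Categories are the visible subdirectories; a stray .DS_Store at the
    # top level is a file, so the isdir check skips it as well.
    categories = sorted(
        name for name in os.listdir(rootpath)
        if not name.startswith(".")
        and os.path.isdir(os.path.join(rootpath, name))
    )
    texts, labels = [], []
    for label, category in enumerate(categories):
        folder = os.path.join(rootpath, category)
        for filename in sorted(os.listdir(folder)):
            path = os.path.join(folder, filename)
            # Ignore hidden entries and anything that is not a regular file.
            if filename.startswith(".") or not os.path.isfile(path):
                continue
            with open(path, encoding=encoding) as handle:
                texts.append(handle.read())
            labels.append(label)
    return texts, labels, categories
```

    Sorting the directory listings makes the label numbering reproducible across runs, which os.listdir alone does not guarantee.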
    