I am looking for a natural language tool that can automatically de-identify English text.

Question

Asked: May 30, 2026 (2026-05-30T09:12:32+00:00)

I am looking for a natural language tool that can automatically de-identify English text. For example, every email address should be renamed or obscured. Proper names should also be de-identified, as should postal addresses and similar identifying details.

There is the MITRE Identification Scrubber Toolkit, but I don't know how well it works.

My questions:

Are there any other tools out there?
Does anyone have experience with the MITRE tool? How well does it work?

Thanks.

1 Answer

Editorial Team · Answer 1 · 2026-05-30T09:12:34+00:00

De-identification (perhaps more often called anonymization) is a very active research area, because it is a prerequisite for using authentic text corpora in fields such as NLP for healthcare and medicine. I recommend looking at the tools listed in the answer to this question on CrossValidated. If you follow the links further, you will find research papers that describe how these tools work, along with further references and evaluation results.
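
For the narrow case of email addresses mentioned in the question, a rough pattern-based pass may be enough as a first step. The sketch below is only an illustration and is not how any of the tools mentioned here work: the function name `scrub_emails`, the `[EMAIL]` placeholder, and the simplified pattern are all assumptions made for the example. Proper names and postal addresses are not handled, since they generally need a trained named-entity recognizer rather than a regular expression.

```python
import re

# Simplified pattern (an assumption for this sketch); real email syntax
# is broader, so some unusual addresses will be missed.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scrub_emails(text: str, placeholder: str = "[EMAIL]") -> str:
    """Replace every substring that looks like an email address."""
    return EMAIL_RE.sub(placeholder, text)


if __name__ == "__main__":
    sample = "Contact jane.doe@example.com or bob@example.org."
    print(scrub_emails(sample))
```

A pass like this only covers identifiers with a regular surface form. It is best treated as a complement to a dedicated de-identification tool, not a replacement for one.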
