The Text Cleaner transforms text from AI models, PDFs, websites, and Office documents into clean, continuous text. With one click, it removes all formatting clutter like line breaks, tabs, hyphenation, special characters, and even hidden AI markers. Enjoy cleanly structured text – perfect for your projects at work, in school, or online.
What does it remove?
- Line breaks (single or multiple)
- Tabs and duplicate spaces
- Indents at the beginning of lines (spaces or tabs)
- Non-breaking and invisible characters (e.g. \u00A0, \u200B, \uFEFF)
- All control and formatting markers (soft hyphen, word joiner, bidi marks, invisible operators, variation selectors, tag characters)
- AI watermark characters (Zero-Width Joiner patterns, Combining Grapheme Joiner, Hangul Fillers)
- Homoglyphs – similar-looking foreign characters (Cyrillic/Greek letters normalized to Latin)
- All Unicode space variants (Em Space, En Space, Thin Space, Ideographic Space, etc.)
- Typographic symbols ("smart punctuation" such as –, —, “ ” ‘ ’)
- Markdown and bullet symbols (*, —, 1., • etc.)
- HTML entities (e.g. &amp;, &gt;, decoded into plain text)
- Hyphenation across lines (e.g. "Be-\nreich" → "Bereich")
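
To make the list above concrete, here is a minimal Python sketch of how such a cleanup pass could work. It is not the Text Cleaner's actual implementation: the function name `clean_text`, the character tables, and the order of the steps are all assumptions chosen for illustration, and the tables cover only a small subset of the characters listed above.

```python
import html
import re
import unicodedata

# Hypothetical subset of invisible characters that are not in the Unicode
# "format" category: combining grapheme joiner and Hangul fillers.
INVISIBLE = dict.fromkeys(map(ord, "\u034f\u115f\u1160\u3164"))

# Tiny illustrative homoglyph table (Cyrillic/Greek -> Latin), not exhaustive.
HOMOGLYPHS = str.maketrans({
    "\u0430": "a", "\u0435": "e", "\u043e": "o", "\u0440": "p", "\u0441": "c",
    "\u0391": "A", "\u0392": "B", "\u03bf": "o",
})

# Smart punctuation mapped to plain ASCII equivalents.
SMART = str.maketrans({
    "\u2013": "-", "\u2014": "-",
    "\u201c": '"', "\u201d": '"', "\u2018": "'", "\u2019": "'",
})


def clean_text(text: str) -> str:
    """Collapse messy input into one line of plain, continuous text."""
    # Decode HTML entities (&amp; -> &, &nbsp; -> U+00A0, ...).
    text = html.unescape(text)
    # Delete listed invisible characters, then map homoglyphs and smart punctuation.
    text = text.translate(INVISIBLE).translate(HOMOGLYPHS).translate(SMART)
    # Drop control (Cc) and format (Cf) characters, keeping newline and tab for now.
    # Cf covers zero-width space/joiner, BOM, soft hyphen, word joiner, bidi marks.
    text = "".join(
        ch for ch in text
        if ch in "\n\t" or unicodedata.category(ch) not in ("Cc", "Cf")
    )
    # Turn every Unicode space separator (category Zs) into a plain space.
    text = "".join(" " if unicodedata.category(ch) == "Zs" else ch for ch in text)
    # Rejoin words hyphenated across a line break: "Be-\nreich" -> "Bereich".
    text = re.sub(r"(\w)-[ \t]*\n[ \t]*(\w)", r"\1\2", text)
    # Strip leading bullet or numbering markers at the start of each line.
    text = re.sub(r"^[ \t]*(?:[*\-•]|\d+\.)[ \t]+", "", text, flags=re.MULTILINE)
    # Collapse line breaks, tabs, and repeated spaces into single spaces.
    return re.sub(r"\s+", " ", text).strip()


print(clean_text("Be-\nreich &amp; mehr"))  # prints: Bereich & mehr
```

One caveat of this sketch: the hyphenation rule cannot tell a line-break hyphen from a genuine compound, so "well-\nknown" would become "wellknown". A production tool would likely need a dictionary or language-specific heuristics for that case.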