Nanoscope Tools
Files processed in your session only · No account required · One-time purchase
Data & Analytics · webapp

CSV Row Deduplicator

Remove duplicate rows from any CSV — exact matches or fuzzy matching on key columns you choose.

Use this tool: $5 per use
Input
.csv, .xlsx, .tsv
Output
.csv
#csv, #deduplication, #data cleaning, #duplicates, #spreadsheet, #data quality

What it does

Upload a CSV (.xlsx and .tsv files are also accepted). Choose which columns to use as the uniqueness key. Download a deduplicated CSV, plus a separate file of the removed duplicates so you can review what was dropped.

The problem it solves

Duplicate rows creep into every dataset: forms submitted twice, list merges that overlapped, ETL jobs that re-processed records. Removing them with Excel pivot tables or VLOOKUP formulas takes longer than it should, and it is hard to verify afterwards exactly which rows were removed.

What it handles automatically

  • Key column selection: choose which columns define a "duplicate", for example email alone, name + address, or any other column combination
  • Exact matching: byte-for-byte comparison after normalization (trim whitespace, lowercase)
  • Fuzzy matching: optionally mark near-duplicates that survive normalization (e.g. "john@example.com" vs "john @example.com"; a case-only variant such as "John@Example.Com" is already caught by exact matching)
  • Keep strategy: when duplicates are found, keep the first occurrence, the last, or the row with the most non-empty fields
  • Review file: outputs a second CSV of the removed rows, with a _duplicate_of column showing which kept row each one matched
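
The behaviours listed above can be sketched in a few lines of Python. This is an illustration only, not the tool's actual implementation: the function names (normalize, filled, dedupe), the keep option names, and the use of 1-based data-row numbers in _duplicate_of are all assumptions made for the example.

```python
import csv
import io


def normalize(value):
    """Trim surrounding whitespace and lowercase the value."""
    return value.strip().lower()


def filled(row):
    """Count the non-empty fields in a row."""
    return sum(1 for v in row.values() if v and v.strip())


def dedupe(rows, key_columns, keep="first"):
    """Return (kept_rows, removed_rows).

    keep is "first", "last", or "most_complete" (hypothetical option names).
    Each removed row gains a _duplicate_of field holding the 1-based number
    of the kept row it matched (an assumed convention).
    """
    groups = {}  # normalized key -> list of (row_number, row)
    for number, row in enumerate(rows, start=1):
        key = tuple(normalize(row[c]) for c in key_columns)
        groups.setdefault(key, []).append((number, row))

    kept, removed = [], []
    for members in groups.values():
        if keep == "first":
            winner = members[0]
        elif keep == "last":
            winner = members[-1]
        else:
            # Most non-empty fields; max() returns the earliest row on a tie.
            winner = max(members, key=lambda m: filled(m[1]))
        kept.append(winner)
        for number, row in members:
            if number != winner[0]:
                removed.append({**row, "_duplicate_of": winner[0]})

    kept.sort(key=lambda m: m[0])  # restore original row order
    return [row for _, row in kept], removed


SAMPLE = """email,name,phone
john@example.com,John,
 John@Example.com ,John Smith,555-0100
mary@example.com,Mary,555-0199
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
kept, removed = dedupe(rows, ["email"], keep="most_complete")
```

With the sample data, rows 1 and 2 share a key once the email is normalized. Under the "most_complete" strategy row 2 is kept because it has a phone number, and row 1 lands in the review list with _duplicate_of set to 2.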

FAQ

How many columns can I use as the key? Up to 5. Each additional key column makes matching more specific, so fewer distinct rows are wrongly flagged as duplicates.
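
To see why extra key columns reduce false positives, the sketch below builds a composite key from the chosen columns. The make_key helper and the sample records are hypothetical; the only detail taken from this page is the limit of 5 key columns.

```python
MAX_KEY_COLUMNS = 5  # limit stated in the FAQ answer above


def make_key(row, key_columns):
    """Build a comparison key from the chosen columns (hypothetical helper)."""
    if not 1 <= len(key_columns) <= MAX_KEY_COLUMNS:
        raise ValueError("choose between 1 and 5 key columns")
    return tuple(row[c].strip().lower() for c in key_columns)


# Two different people who happen to share a name.
a = {"name": "Ana Silva", "address": "12 Oak St"}
b = {"name": "ana silva", "address": "98 Pine Ave"}

same_by_name = make_key(a, ["name"]) == make_key(b, ["name"])
same_by_name_and_address = (
    make_key(a, ["name", "address"]) == make_key(b, ["name", "address"])
)
print(same_by_name, same_by_name_and_address)  # True False
```

Keyed on name alone, the two records collide and one would be dropped. Adding the address column separates them.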

What's the difference between exact and fuzzy matching? Exact matching normalizes whitespace and case before comparing. Fuzzy matching also catches minor typos, extra characters, and common variations (useful for names and addresses).
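
The distinction can be made concrete with Python's standard-library difflib module. This is a sketch under assumptions: the page does not say which similarity measure the tool uses, so SequenceMatcher and the 0.9 threshold are stand-ins chosen for illustration.

```python
from difflib import SequenceMatcher


def normalize(value):
    """Trim surrounding whitespace and lowercase the value."""
    return value.strip().lower()


def exact_match(a, b):
    """Equal after normalization."""
    return normalize(a) == normalize(b)


def fuzzy_match(a, b, threshold=0.9):
    """Similar enough after normalization (threshold is an assumed value)."""
    ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return ratio >= threshold


# A case-only difference is already an exact match.
case_only = exact_match("John@Example.Com", "john@example.com")

# An internal space defeats exact matching but not fuzzy matching.
space_exact = exact_match("john @example.com", "john@example.com")
space_fuzzy = fuzzy_match("john @example.com", "john@example.com")

# A one-letter typo in a name is caught by fuzzy matching.
typo_fuzzy = fuzzy_match("Jon Smith", "John Smith")

# Different people on the same domain stay distinct at this threshold.
different = fuzzy_match("john@example.com", "mary@example.com")
```

The threshold controls the trade-off: lowering it catches more typos but starts to merge values that only share a common suffix, such as two addresses on the same email domain.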

Does it store my data? No. All processing happens in your browser.
