How each format works, where each one shines, and how to convert between them without breaking on the edge cases.
CSV and JSON are the two formats you reach for constantly when moving data between programs, yet they are built for almost opposite purposes. Choosing the wrong one leads to fragile imports, mangled values, and hours lost to escaping bugs. This article explains how each format actually works, where each one shines, and how to convert cleanly between them.
CSV — comma-separated values — is a flat, tabular format. Every row is a record, every column a field, separated by commas. It is the lingua franca of spreadsheets: Excel, Google Sheets, and every database export tool speaks it. Its strengths are simplicity and density. A million-row dataset in CSV is dramatically smaller than the same data in JSON, because CSV writes each field name exactly once, in the header, instead of repeating it on every record.
CSV is the right choice when your data is genuinely tabular — a list of rows that all share the same columns — and when a human might open it in a spreadsheet. Sales exports, mailing lists, sensor logs, and analytics dumps all fit naturally.
CSV has no concept of nesting. The moment your data has structure — an order that contains several line items, a user with a list of roles — CSV forces you to flatten or duplicate. It also has no types: every value is text, so the reader has to guess whether 012 is the number twelve or a zip code that must keep its leading zero, and whether 2026-01-05 is a date or a string. And the “comma-separated” promise breaks the instant a value itself contains a comma, a quote, or a line break, forcing a thicket of escaping rules that different tools implement inconsistently.
JSON — JavaScript Object Notation — represents structured data: objects with named fields, arrays, and values nested to any depth. It has real types: strings, numbers, booleans, null, arrays, and objects. This makes it the natural format for APIs, configuration files, and anything where one record contains another. An order with its line items, customer, and shipping address travels as a single, self-describing object.
JSON is also unambiguous in a way CSV is not. A number is quoted differently from a string, so there is no guessing. Almost every programming language can parse it natively, which is why web APIs settled on it.
JSON is verbose. Every record repeats every field name, so for large flat datasets it can be several times bigger than the CSV equivalent. It is also harder for non-programmers to read or edit — you cannot open a 50,000-record JSON file in a spreadsheet and scan it the way you can a CSV. And while JSON is great for nested data, forcing inherently tabular data into it just adds noise.
| Question | Use CSV | Use JSON |
|---|---|---|
| Is the data a flat table of identical rows? | Yes | No |
| Does a record contain nested lists or objects? | No | Yes |
| Will a human open it in a spreadsheet? | Yes | Rarely |
| Is it feeding or coming from a web API? | No | Yes |
| Do you need to preserve exact data types? | No | Yes |
| Is raw file size the priority? | Yes | No |
The most common task is turning a spreadsheet export into JSON so a program can consume it. The mapping is straightforward: the CSV header row becomes the keys of each JSON object, and each data row becomes one object in an array. The catch is types — a good converter lets numbers and booleans stay unquoted rather than wrapping everything in strings.
name,age,active
Asha,29,true
Ravi,34,false
becomes:
[
{ "name": "Asha", "age": 29, "active": true },
{ "name": "Ravi", "age": 34, "active": false }
]
When converting, three things trip people up. First, quoted values containing commas — a proper parser respects the quotes and does not split mid-field. Second, leading zeros — a zip code like 02134 must stay a string, or it becomes 2134. Third, empty cells — decide whether they should become an empty string, null, or be omitted entirely, because downstream code will treat those very differently.
Reach for CSV when the data is flat, large, and might be opened in a spreadsheet. Reach for JSON when the data is nested, typed, or destined for an API. And when you need to move between the two, use a converter that handles quoting and types correctly rather than a hand-rolled split on commas — the edge cases are exactly where hand-rolled code breaks. If your structured data lives in YAML instead, the same nesting rules apply; a YAML formatter can tidy it before you convert.