JSON to CSV and Back: Data Format Conversion for Developers
JSON and CSV are the two most common data interchange formats. JSON is the standard for APIs and configuration. CSV dominates spreadsheets, database exports, and data pipelines. Developers frequently need to convert between the two, and the conversion is not always as straightforward as it looks.
When to Use Each Format
CSV works best for flat, tabular data where every record has the same fields. It is human-readable in any text editor, opens natively in Excel, and streams efficiently line by line.
JSON handles nested structures, arrays, mixed types, and sparse records. It is the native format for JavaScript and most web APIs.
The conversion challenge arises because CSV is inherently flat while JSON supports arbitrary depth. Flattening nested JSON into CSV rows requires decisions about how to represent the hierarchy.
JSON to CSV: The Simple Case
When your JSON is an array of flat objects with identical keys, conversion is direct:
interface Row {
  [key: string]: string | number | boolean | null;
}

function jsonToCsv(data: Row[]): string {
  if (data.length === 0) return "";
  const headers = Object.keys(data[0]);
  const lines = [
    headers.join(","),
    ...data.map((row) =>
      headers.map((h) => escapeCsvField(String(row[h] ?? ""))).join(",")
    ),
  ];
  return lines.join("\n");
}

function escapeCsvField(field: string): string {
  if (field.includes(",") || field.includes('"') || field.includes("\n")) {
    return `"${field.replace(/"/g, '""')}"`;
  }
  return field;
}
The escapeCsvField function handles the three cases that require quoting: commas within values, double quotes (escaped by doubling), and newlines.
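Put together, the converter behaves like this (the two functions above are repeated so the snippet runs on its own):

```typescript
interface Row {
  [key: string]: string | number | boolean | null;
}

function jsonToCsv(data: Row[]): string {
  if (data.length === 0) return "";
  const headers = Object.keys(data[0]);
  const lines = [
    headers.join(","),
    ...data.map((row) =>
      headers.map((h) => escapeCsvField(String(row[h] ?? ""))).join(",")
    ),
  ];
  return lines.join("\n");
}

function escapeCsvField(field: string): string {
  if (field.includes(",") || field.includes('"') || field.includes("\n")) {
    return `"${field.replace(/"/g, '""')}"`;
  }
  return field;
}

const csv = jsonToCsv([
  { id: 1, name: "Acme, Inc.", active: true },
  { id: 2, name: "Beta", active: false },
]);
// → id,name,active
//   1,"Acme, Inc.",true
//   2,Beta,false
```

Note that the headers come from the first record only, which is exactly why this version requires identical keys across rows.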
Handling Nested JSON
Real-world API responses are rarely flat. Consider this structure:
{
  "id": 1,
  "name": "Acme Corp",
  "address": {
    "street": "123 Main St",
    "city": "Springfield"
  },
  "tags": ["enterprise", "active"]
}
There are two common strategies for flattening:
Dot notation creates columns like address.street and address.city:
function flattenObject(
  obj: Record<string, any>,
  prefix = ""
): Record<string, string> {
  const result: Record<string, string> = {};
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value && typeof value === "object" && !Array.isArray(value)) {
      Object.assign(result, flattenObject(value, path));
    } else if (Array.isArray(value)) {
      result[path] = value.join("; ");
    } else {
      result[path] = String(value ?? "");
    }
  }
  return result;
}
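Applied to the sample record above, the flattener produces one flat row per object (the function is repeated so the snippet runs on its own):

```typescript
function flattenObject(
  obj: Record<string, any>,
  prefix = ""
): Record<string, string> {
  const result: Record<string, string> = {};
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value && typeof value === "object" && !Array.isArray(value)) {
      Object.assign(result, flattenObject(value, path));
    } else if (Array.isArray(value)) {
      result[path] = value.join("; ");
    } else {
      result[path] = String(value ?? "");
    }
  }
  return result;
}

const flat = flattenObject({
  id: 1,
  name: "Acme Corp",
  address: { street: "123 Main St", city: "Springfield" },
  tags: ["enterprise", "active"],
});
// → { id: "1", name: "Acme Corp", "address.street": "123 Main St",
//     "address.city": "Springfield", tags: "enterprise; active" }
```

Joining arrays with "; " is lossy, of course: the original element boundaries cannot be recovered if an element itself contains a semicolon.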
Array expansion creates separate rows for array elements, duplicating the parent data. This is useful when the array represents a one-to-many relationship (e.g., orders with line items).
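A minimal sketch of the row-expansion strategy; the function name and shape are illustrative, not from any library:

```typescript
// Sketch: expand a one-to-many array field into separate rows,
// duplicating the parent's scalar fields on each row.
function expandArrayField(
  record: Record<string, unknown>,
  field: string
): Record<string, unknown>[] {
  const items = record[field];
  if (!Array.isArray(items) || items.length === 0) return [record];
  const parent = { ...record };
  delete parent[field]; // Parent fields get copied onto every expanded row
  return items.map((item) => ({ ...parent, [field]: item }));
}

const rows = expandArrayField(
  { id: 1, name: "Acme Corp", tags: ["enterprise", "active"] },
  "tags"
);
// → two rows, each carrying id and name plus a single tag
```

The duplication is the trade-off: the CSV grows with the array lengths, but each row stays flat and self-describing.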
CSV to JSON
Parsing CSV back to JSON requires handling the same edge cases in reverse:
function csvToJson(csv: string): Record<string, string>[] {
  const lines = parseCsvLines(csv);
  if (lines.length < 2) return [];
  const headers = lines[0];
  return lines.slice(1).map((fields) => {
    const obj: Record<string, string> = {};
    headers.forEach((header, i) => {
      obj[header] = fields[i] ?? "";
    });
    return obj;
  });
}
The tricky part is parseCsvLines. A proper CSV parser must handle quoted fields that contain commas, newlines, and escaped quotes. The RFC 4180 specification defines the rules, but real-world CSV files frequently violate them.
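One way to write parseCsvLines is a small character-by-character state machine following RFC 4180's quoting rules. This is a minimal sketch; production code should be more defensive about malformed input:

```typescript
function parseCsvLines(csv: string): string[][] {
  const rows: string[][] = [];
  let row: string[] = [];
  let field = "";
  let inQuotes = false;
  for (let i = 0; i < csv.length; i++) {
    const ch = csv[i];
    if (inQuotes) {
      if (ch === '"') {
        if (csv[i + 1] === '"') {
          field += '"'; // Escaped quote ("") inside a quoted field
          i++;
        } else {
          inQuotes = false; // Closing quote
        }
      } else {
        field += ch; // Commas and newlines are literal inside quotes
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      row.push(field);
      field = "";
    } else if (ch === "\n") {
      row.push(field);
      rows.push(row);
      row = [];
      field = "";
    } else if (ch !== "\r") {
      field += ch; // \r outside quotes (CRLF line endings) is dropped
    }
  }
  if (field !== "" || row.length > 0) {
    row.push(field); // Flush the final record if the file has no trailing newline
    rows.push(row);
  }
  return rows;
}
```

Parsing character by character, rather than splitting on commas and newlines, is what makes quoted commas, embedded newlines, and doubled quotes come out right.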
Common Edge Cases
Mixed delimiters. Some CSV files use semicolons or tabs instead of commas. European locales often use semicolons because commas serve as decimal separators.
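A rough sniffing heuristic is to count candidate delimiters in the header line and pick the most frequent; this sketch is illustrative and will be fooled by quoted headers:

```typescript
// Sketch: guess the delimiter by counting occurrences in the header line.
function detectDelimiter(headerLine: string): string {
  const candidates = [",", ";", "\t"];
  let best = ",";
  let bestCount = 0;
  for (const d of candidates) {
    const count = headerLine.split(d).length - 1;
    if (count > bestCount) {
      best = d;
      bestCount = count;
    }
  }
  return best; // Defaults to comma when nothing else appears
}
```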
Encoding issues. CSV files from Excel may use Windows-1252 encoding instead of UTF-8. Check for garbled characters and convert encoding before parsing.
Type inference. CSV values are all strings. When converting to JSON, you may want to parse numbers and booleans:
function inferType(value: string): string | number | boolean | null {
  if (value === "") return null;
  if (value === "true") return true;
  if (value === "false") return false;
  const num = Number(value);
  if (!isNaN(num) && value.trim() !== "") return num;
  return value;
}
Be cautious with number detection. Values like zip codes ("07001") and phone numbers should remain strings.
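The leading-zero problem is easy to demonstrate. One common safeguard is to declare which columns must stay strings; the column names below are assumptions for illustration (inferType is repeated so the snippet runs on its own):

```typescript
function inferType(value: string): string | number | boolean | null {
  if (value === "") return null;
  if (value === "true") return true;
  if (value === "false") return false;
  const num = Number(value);
  if (!isNaN(num) && value.trim() !== "") return num;
  return value;
}

// Leading zeros are silently lost by naive inference:
inferType("07001"); // → 7001, not "07001"

// Illustrative safeguard: skip inference for declared string columns.
const STRING_COLUMNS = new Set(["zip", "phone"]); // assumed column names
function inferColumn(column: string, value: string) {
  return STRING_COLUMNS.has(column) ? value : inferType(value);
}
```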
BOM markers. UTF-8 files from Excel often start with a BOM (byte order mark, \uFEFF). Strip it before parsing:
const cleanCsv = csv.replace(/^\uFEFF/, "");
Streaming Large Files
For files with millions of rows, loading everything into memory is not practical. Process line by line with a streaming parser:
async function* streamCsvLines(
  reader: ReadableStreamDefaultReader<Uint8Array>
): AsyncGenerator<string[]> {
  const decoder = new TextDecoder();
  let buffer = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop()!; // Keep the incomplete last line for the next chunk
    for (const line of lines) {
      // Strip a trailing \r so CRLF files parse cleanly
      yield parseCsvLine(line.endsWith("\r") ? line.slice(0, -1) : line);
    }
  }
  if (buffer.trim()) yield parseCsvLine(buffer);
}
This processes each line as it arrives, keeping memory usage roughly constant regardless of file size. One caveat: splitting on newlines alone breaks quoted fields that contain line breaks, so a fully RFC 4180-compliant streaming parser must carry quote state across chunk boundaries.
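The chunk-boundary buffering is the essential trick. Here is the same idea in a synchronous sketch, with no real stream, just byte chunks deliberately split mid-line (linesFromChunks is an illustrative name):

```typescript
// Sketch: buffer the incomplete tail of each chunk so lines stay intact
// even when a chunk boundary falls in the middle of a line.
function linesFromChunks(chunks: Uint8Array[]): string[] {
  const decoder = new TextDecoder();
  let buffer = "";
  const lines: string[] = [];
  for (const chunk of chunks) {
    // { stream: true } also protects multi-byte characters split across chunks
    buffer += decoder.decode(chunk, { stream: true });
    const parts = buffer.split("\n");
    buffer = parts.pop()!; // Incomplete last line waits for the next chunk
    lines.push(...parts);
  }
  if (buffer) lines.push(buffer); // Flush a final line with no trailing newline
  return lines;
}

const bytes = new TextEncoder().encode("id,name\n1,Acme\n2,Beta");
// Split at awkward places: mid-header and mid-row
const chunks = [bytes.slice(0, 5), bytes.slice(5, 12), bytes.slice(12)];
linesFromChunks(chunks); // → ["id,name", "1,Acme", "2,Beta"]
```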
Practical Tips
- Always quote fields that might contain the delimiter character.
- Use UTF-8 encoding with a BOM if the file will be opened in Excel.
- Include a header row. Headerless CSV files are ambiguous.
- Escape newlines within fields by quoting the entire field.
- When round-tripping (JSON to CSV to JSON), expect some information loss for nested structures.
Try our JSON CSV Converter to convert between JSON and CSV instantly — right in your browser, no upload required.