JSON Best Practices: Structure, Validation, and Common Mistakes
Master JSON data handling with best practices for structure, validation, and debugging. Learn common mistakes and how to avoid them.
JSON emerged from a simple idea: what if data interchange used the same syntax as JavaScript object literals? Douglas Crockford formalized the format in the early 2000s, deliberately limiting it to a small set of data types and strict syntax rules. That deliberate simplicity — no comments, no trailing commas, no ambiguity — made JSON parsable by any programming language with minimal effort, and it rapidly displaced XML as the web's dominant data format.
Today JSON is everywhere: REST APIs, configuration files, NoSQL databases, log systems, and inter-service communication. Its ubiquity means that understanding its quirks and best practices pays dividends across nearly every area of software development.
The Six Data Types and Their Gotchas
JSON supports exactly six data types, and everything you express in JSON must fit one of them. Understanding their constraints prevents subtle bugs.
Strings must use double quotes. Single quotes, which are valid in JavaScript and Python, are a syntax error in JSON. Strings can contain any Unicode character, with special characters escaped using backslashes. The most common escapes are \n for newlines, \t for tabs, and \" for literal double quotes.
Numbers follow JavaScript's number representation — they can be integers or floating point, positive or negative, with optional scientific notation. However, JSON has no concept of Infinity, NaN, or leading zeros. The number 019 is invalid JSON, as is .5 (it must be 0.5). A subtler issue: JSON numbers have arbitrary precision in the specification, but most parsers use 64-bit floating point, which means integers beyond 2^53 can lose precision. If your API produces 64-bit integer IDs, transmit them as strings to prevent silent truncation.
{
  "valid_integer": 42,
  "valid_float": 3.14,
  "valid_scientific": 2.998e8,
  "safe_large_id": "9007199254740993"
}
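To see the truncation happen, paste this into any JavaScript console (the ID value is the one from the example above — 2^53 + 1):

// 2^53 = 9007199254740992 is the largest power of two a double tracks exactly;
// 2^53 + 1 rounds down to it during parsing
JSON.parse('{"id": 9007199254740993}').id   // → 9007199254740992, off by one
Number.isSafeInteger(9007199254740993)      // → false
// As a string, the digits survive intact
JSON.parse('{"id": "9007199254740993"}').id // → "9007199254740993"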
Booleans are the lowercase words true and false — unquoted, case-sensitive. True, TRUE, and "true" are either invalid or semantically different (the last is a string, not a boolean).
Null is the lowercase word null, unquoted. It represents the intentional absence of a value, which is semantically different from omitting the key entirely. In an API response, "middleName": null communicates "this field exists but has no value," while omitting middleName entirely says "this field isn't applicable."
Arrays are ordered, heterogeneous lists enclosed in square brackets. While JSON allows mixed types within an array, doing so in practice creates headaches for consumers who need to handle unpredictable element types. Keep arrays homogeneous whenever possible.
Objects are unordered collections of key-value pairs enclosed in curly braces. Keys must be strings. The specification says keys should be unique, but doesn't require it — most parsers will silently use the last occurrence of a duplicate key, which can mask bugs.
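JavaScript's JSON.parse shows the last-one-wins behavior in a single line (other parsers may differ, and some strict ones reject duplicates outright):

// The duplicate "role" key parses without an error; the last value wins
JSON.parse('{"role": "admin", "role": "viewer"}') // → { role: "viewer" }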
Common Syntax Errors
Three syntax mistakes account for the vast majority of JSON parsing failures, and all three come from habits in other languages.
Trailing Commas
JavaScript, Python, and many other languages allow (or even encourage) trailing commas in lists and objects. JSON does not. This is the single most common JSON syntax error, especially when editing files by hand.
// INVALID - trailing comma after "age"
{
  "name": "John",
  "age": 30,
}

// VALID
{
  "name": "John",
  "age": 30
}
Comments
JSON has no comment syntax. Not //, not /* */, not #. This is a deliberate design decision — Crockford removed comments to prevent them from being abused as parsing directives, as had happened with XML comments. If you need annotated configuration files, consider JSONC (JSON with Comments, supported by VS Code and TypeScript) or YAML.
Single Quotes and Unquoted Keys
// INVALID - single quotes
{ 'name': 'John' }
// INVALID - unquoted key
{ name: "John" }
// VALID
{ "name": "John" }
These are valid JavaScript but not valid JSON. This distinction trips up developers who prototype in a browser console and paste results into JSON files.
Structuring JSON for APIs
Good JSON structure makes APIs intuitive to use and resilient to change. Several patterns have emerged as best practices through years of API design experience.
Naming Conventions
Choose one naming convention and apply it consistently across your entire API. Mixing conventions within a single response forces consumers to remember which style applies to which field.
camelCase is the JavaScript convention and the most common choice for frontend-facing APIs: firstName, createdAt, isActive. snake_case is standard in Python and Ruby ecosystems: first_name, created_at, is_active. Neither is objectively better — consistency matters more than the specific choice.
Date Formatting
Dates in JSON should always use ISO 8601 format with timezone information:
{
  "createdAt": "2024-01-15T10:30:00Z",
  "updatedAt": "2024-01-15T14:45:30.123Z",
  "expiresAt": "2024-06-15T00:00:00+05:30"
}
The Z suffix indicates UTC. Including timezone information prevents a pervasive class of bugs where dates are interpreted in different timezones by different systems. Unix timestamps (like 1705312200) are unambiguous about timezone but unreadable to humans and ambiguous about precision (seconds vs. milliseconds).
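In JavaScript, Date.prototype.toISOString emits exactly this format, and JSON.stringify applies it to Date values automatically (the timestamp shown is illustrative):

const event = { createdAt: new Date() };
JSON.stringify(event);
// → '{"createdAt":"2024-01-15T10:30:00.000Z"}' — always UTC, millisecond precision
new Date('2024-01-15T10:30:00Z').getTime(); // → 1705314600000, parses back losslessly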
Collection Responses
Wrap collections in an object with a descriptive key rather than returning a bare array. This gives you a natural place to add metadata — pagination, total counts, filter information — without breaking existing consumers.
{
  "users": [
    { "id": 1, "name": "Alice" },
    { "id": 2, "name": "Bob" }
  ],
  "pagination": {
    "page": 1,
    "perPage": 20,
    "totalPages": 10,
    "totalItems": 195
  }
}
A bare array response like [{"id": 1}, {"id": 2}] makes it impossible to add pagination metadata later without a breaking change.
Nesting Depth
Deep nesting makes JSON difficult to navigate and query. If your JSON regularly exceeds four levels of nesting, consider flattening related data or using references:
// Too deeply nested
{
  "company": {
    "department": {
      "team": {
        "member": {
          "address": {
            "city": "NYC"
          }
        }
      }
    }
  }
}

// Flattened with references
{
  "members": [{ "id": "m1", "name": "Alice", "teamId": "t1", "addressId": "a1" }],
  "teams": [{ "id": "t1", "departmentId": "d1" }],
  "addresses": [{ "id": "a1", "city": "NYC" }]
}
The flattened structure is more verbose but easier to query, cache, and update independently.
JSON Schema: Defining Structure
JSON Schema is a vocabulary for annotating and validating JSON documents. It lets you define what your JSON should look like — required fields, data types, value constraints, string patterns — in a machine-readable format.
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "name": {
      "type": "string",
      "minLength": 1,
      "maxLength": 100,
      "description": "The user's full name"
    },
    "email": {
      "type": "string",
      "format": "email"
    },
    "age": {
      "type": "integer",
      "minimum": 0,
      "maximum": 150
    },
    "role": {
      "type": "string",
      "enum": ["admin", "editor", "viewer"]
    }
  },
  "required": ["name", "email"],
  "additionalProperties": false
}
Schemas serve triple duty: they validate incoming data (catching malformed requests before they reach your business logic), document your API (schema files are always accurate, unlike hand-written docs that drift), and generate types (tools like json-schema-to-typescript produce TypeScript interfaces directly from schemas).
The additionalProperties: false setting is worth highlighting — it rejects any keys not explicitly defined in the schema, which catches misspelled field names and prevents clients from accidentally sending data your API ignores.
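Putting the schema to work takes only a few lines with a validator library. Here's a minimal sketch using Ajv, a popular JavaScript validator (it assumes the ajv and ajv-formats packages are installed and that userSchema holds the schema object above — the format: "email" check requires ajv-formats):

const Ajv2020 = require('ajv/dist/2020'); // the schema above declares draft 2020-12
const addFormats = require('ajv-formats');

const ajv = new Ajv2020({ allErrors: true }); // report every violation, not just the first
addFormats(ajv);                              // enables "format": "email" and friends

const validate = ajv.compile(userSchema);     // compile once, reuse per request

const input = { name: 'Alice', email: 'not-an-email' };
if (!validate(input)) {
  console.log(validate.errors);
  // e.g. [{ instancePath: '/email', message: 'must match format "email"', ... }]
}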
JSON Alternatives Worth Knowing
JSON's strictness is an asset for machine-to-machine communication but a liability for human-authored files.
JSONC (JSON with Comments) adds // and /* */ comments to JSON. It's used by VS Code's settings files (settings.json) and TypeScript's tsconfig.json. You can't use JSONC with standard JSON parsers, but JSONC-aware parsers are available for most languages.
JSON5 extends JSON more aggressively: it allows single-quoted strings, trailing commas, unquoted keys (if they're valid identifiers), hex numbers, multi-line strings, and comments. It aims to be "JSON for humans" while remaining a strict subset of ES5. It's a good choice for configuration files where human readability matters more than universal parser support.
YAML is (in its 1.2 revision) essentially a superset of JSON that uses indentation instead of braces, making it popular for configuration files (Docker Compose, Kubernetes manifests, GitHub Actions). YAML is more readable than JSON for simple structures but becomes treacherous with complex nesting — significant whitespace errors can silently change meaning. It also has surprising type coercion (under YAML 1.1 rules, still followed by parsers like PyYAML, the bare words yes, no, on, and off are parsed as booleans) that catches newcomers off guard.
For API communication, standard JSON remains the right choice. For configuration files, JSONC or JSON5 offer a better developer experience. YAML is an option when the ecosystem expects it, but its complexity-to-benefit ratio is unfavorable compared to JSON5.
Security Considerations
JSON Injection
When constructing JSON by string concatenation rather than using a serialization library, you risk JSON injection. User input containing quotes or backslashes can break out of a string value and inject additional JSON structure. Always use your language's built-in JSON serializer (JSON.stringify in JavaScript, json.dumps in Python) rather than building JSON strings manually.
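A minimal illustration of the difference (the attacker-controlled name value is hypothetical):

const userInput = 'Alice", "isAdmin": true, "note": "';

// UNSAFE: the quote inside userInput terminates the string value early
const unsafe = '{"name": "' + userInput + '"}';
// → {"name": "Alice", "isAdmin": true, "note": ""} — injected key parses cleanly

// SAFE: the serializer escapes the quotes, keeping everything inside one string
const safe = JSON.stringify({ name: userInput });
// → {"name":"Alice\", \"isAdmin\": true, \"note\": \""}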
Prototype Pollution (JavaScript)
In JavaScript, JSON.parse can return objects with __proto__ keys. The parse itself leaves the prototype chain untouched — the key becomes an ordinary own property — but the danger surfaces as soon as the object is merged:
const malicious = '{"__proto__": {"isAdmin": true}}';
const parsed = JSON.parse(malicious);
// parsed.__proto__ is now the own property { isAdmin: true } — not the real prototype yet
When this parsed object is later merged into another object using naive deep-merge utilities, the isAdmin property can leak into all objects. Use Object.create(null) for dictionaries, or validate and strip __proto__, constructor, and prototype keys from parsed input.
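One defensive pattern is a reviver that drops the dangerous keys during parsing — returning undefined from a JSON.parse reviver deletes the property (a sketch; note it also strips legitimate fields that happen to use these names):

const DANGEROUS = new Set(['__proto__', 'constructor', 'prototype']);

function safeParse(text) {
  return JSON.parse(text, (key, value) =>
    DANGEROUS.has(key) ? undefined : value
  );
}

const clean = safeParse('{"__proto__": {"isAdmin": true}, "name": "Alice"}');
// clean is { name: "Alice" } — the __proto__ key never survives parsing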
Denial of Service via Large Payloads
JSON parsers must process the entire input before returning, and deeply nested structures require stack space proportional to nesting depth. An attacker can send a small payload with extreme nesting ([[[[[[...) to exhaust your parser's stack. Set request body size limits and maximum nesting depth in your API infrastructure.
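In an Express app, both limits can live in middleware. The limit option below is standard body-parser behavior; maxDepth is a hypothetical helper sketched here, since Express has no built-in nesting check:

const express = require('express');

// Crude depth pre-scan: count unmatched brackets, skipping string contents
function maxDepth(text) {
  let depth = 0, max = 0, inString = false;
  for (let i = 0; i < text.length; i++) {
    const c = text[i];
    if (inString) {
      if (c === '\\') i++;            // skip the escaped character
      else if (c === '"') inString = false;
    } else if (c === '"') inString = true;
    else if (c === '{' || c === '[') max = Math.max(max, ++depth);
    else if (c === '}' || c === ']') depth--;
  }
  return max;
}

const app = express();
app.use(express.json({
  limit: '100kb',                     // reject oversized bodies outright
  verify: (req, res, buf) => {        // runs before the body is parsed
    if (maxDepth(buf.toString('utf8')) > 32) {
      throw new Error('JSON nesting too deep');
    }
  }
}));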
Handling Large JSON
Processing multi-gigabyte JSON files with standard parsers fails because the entire file must fit in memory. Streaming parsers solve this by processing the file incrementally:
// Node.js with the stream-json streaming parser
const { parser } = require('stream-json');
const { streamArray } = require('stream-json/streamers/StreamArray');
const fs = require('fs');

const pipeline = fs.createReadStream('huge-file.json')
  .pipe(parser())
  .pipe(streamArray());

pipeline.on('data', ({ value }) => {
  // Process one item at a time; memory usage stays constant.
  // processItem is your per-record handler.
  processItem(value);
});
For JSON Lines format (JSONL), where each line is a separate JSON object, you can process the file line-by-line with standard text streaming tools — no special parser needed. JSONL is increasingly popular for log files, data exports, and machine learning datasets precisely because it's streamable.
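Reading JSONL needs nothing more than Node's built-in line reader. A sketch (the events.jsonl filename and handleRecord callback are illustrative):

const fs = require('fs');
const readline = require('readline');

const rl = readline.createInterface({
  input: fs.createReadStream('events.jsonl'),
  crlfDelay: Infinity               // treat \r\n as a single line break
});

rl.on('line', (line) => {
  if (line.trim() === '') return;   // tolerate blank lines
  handleRecord(JSON.parse(line));   // one small parse per record; memory stays flat
});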
Conclusion
JSON's strength is its constraints. The inability to add comments forces documentation into proper documentation systems. The lack of trailing commas catches copy-paste errors. The strict quoting rules eliminate ambiguity. These aren't limitations — they're features that make JSON the most reliably parseable data format in widespread use.
The practices that matter most are consistent naming conventions, ISO 8601 dates, wrapped collections with pagination metadata, JSON Schema validation at API boundaries, and using serialization libraries instead of string concatenation. Get these right, and JSON becomes a transparent layer in your architecture rather than a source of bugs.
Use our JSON Formatter & Validator to check syntax, pretty-print complex structures, and explore nested data — all processed in your browser with no data uploaded.