# Dataset schema specification

The dataset schema defines the structure and presentation of data produced by an Actor. It controls what fields each dataset item contains and how that data appears in the Output tab UI.

## Schema components

A dataset schema has two main components:

* **`fields`** (optional) - JSON Schema describing the structure of each dataset item. Enables validation and provides metadata for AI agents.
* **`views`** (required) - Display configurations that control how data appears in the Output tab. Each view can show different fields, ordering, and formatting.

Both components work together: `fields` describes *what* data your Actor produces, while `views` controls *how* that data is presented.

.actor/dataset\_schema.json


```json
{
    "actorSpecification": 1,
    "fields": {
        "type": "object",
        "properties": {
            "title": { "type": "string" },
            "price": { "type": "number" }
        }
    },
    "views": {
        "overview": {
            "title": "Overview",
            "transformation": { "fields": ["title", "price"] },
            "display": { "component": "table" }
        }
    }
}
```


## File structure

Place the dataset schema in the `.actor` folder in your Actor's root directory. You can organize it in two ways:

### Inline in actor.json

.actor/actor.json


```json
{
    "actorSpecification": 1,
    "name": "my-scraper",
    "title": "My Scraper",
    "version": "1.0.0",
    "storages": {
        "dataset": {
            "actorSpecification": 1,
            "fields": {},
            "views": {
                "overview": {
                    "title": "Overview",
                    "transformation": {},
                    "display": { "component": "table" }
                }
            }
        }
    }
}
```


### Separate file

.actor/actor.json


```json
{
    "actorSpecification": 1,
    "name": "my-scraper",
    "title": "My Scraper",
    "version": "1.0.0",
    "storages": {
        "dataset": "./dataset_schema.json"
    }
}
```


.actor/dataset\_schema.json


```json
{
    "actorSpecification": 1,
    "fields": {},
    "views": {
        "overview": {
            "title": "Overview",
            "transformation": {},
            "display": { "component": "table" }
        }
    }
}
```


Use a separate file when your schema is complex or you want to keep `actor.json` concise.
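
Tooling that reads `actor.json` has to handle both layouts. The following is a minimal sketch of that branching, not platform code: `resolveDatasetSchema` and the `loadJson` callback are hypothetical names introduced for illustration, and the stand-in loader only echoes the path it receives.

```javascript
// Hypothetical helper: returns the dataset schema object for either layout.
// `loadJson` is a caller-supplied function (an assumption of this sketch) that
// reads and parses the JSON file referenced from actor.json.
function resolveDatasetSchema(actorConfig, loadJson) {
    const dataset = actorConfig.storages?.dataset;
    if (dataset === undefined) return null; // no dataset schema declared
    if (typeof dataset === 'string') return loadJson(dataset); // separate file
    return dataset; // inline object
}

const inlineConfig = {
    storages: { dataset: { actorSpecification: 1, views: {} } },
};
const fileConfig = { storages: { dataset: './dataset_schema.json' } };

// Stand-in loader for the example; a real one would read from the .actor folder.
const fakeLoader = (path) => ({ actorSpecification: 1, views: {}, loadedFrom: path });

console.log(resolveDatasetSchema(inlineConfig, fakeLoader));
console.log(resolveDatasetSchema(fileConfig, fakeLoader));
```

Both calls return an object with the same top-level shape, so downstream code does not need to know which layout the Actor uses.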

## Fields

The `fields` property defines the structure of each dataset item using [JSON Schema](https://json-schema.org/). This schema enables validation and provides metadata that helps both humans and AI agents understand your Actor's output.

### Why define fields

When AI agents interact with Actors through the MCP server or API, they rely on field metadata to understand what data the Actor produces. Including `title`, `description`, and `example` properties enables agents to:

* Understand the meaning of each output field
* Chain Actors together by matching inputs to outputs
* Generate appropriate queries and handle responses correctly

Without this metadata, agents must infer field meanings from names alone, which can lead to misinterpreted data and failed Actor chains.
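
One way to keep this metadata complete is to check for it before publishing. The sketch below is a hypothetical lint helper (`findFieldsMissingMetadata` is not part of any Apify tool) that reports top-level fields lacking `title`, `description`, or `example`.

```javascript
// Hypothetical lint helper: lists fields that lack title, description, or example.
function findFieldsMissingMetadata(fieldsSchema) {
    const wanted = ['title', 'description', 'example'];
    const report = {};
    for (const [name, definition] of Object.entries(fieldsSchema.properties ?? {})) {
        const missing = wanted.filter((key) => !(key in definition));
        if (missing.length > 0) report[name] = missing;
    }
    return report;
}

const fieldsSchema = {
    type: 'object',
    properties: {
        productName: {
            type: 'string',
            title: 'Product name',
            description: 'The full name of the product.',
            example: 'Wireless Bluetooth Headphones',
        },
        price: { type: 'number', title: 'Price' },
        sku: { type: 'string' },
    },
};

console.log(findFieldsMissingMetadata(fieldsSchema));
// { price: [ 'description', 'example' ], sku: [ 'title', 'description', 'example' ] }
```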

### Field properties

Each field in your schema can include standard JSON Schema properties:

| Property      | Type   | Description                                                               |
| ------------- | ------ | ------------------------------------------------------------------------- |
| `type`        | string | The data type (`string`, `number`, `boolean`, `array`, `object`, `null`). |
| `title`       | string | A human-readable name for the field.                                      |
| `description` | string | Explains what the field contains and how to interpret it.                 |
| `example`     | any    | A sample value that demonstrates the expected format.                     |
| `enum`        | array  | A list of allowed values for the field.                                   |

### Example with field metadata

.actor/dataset\_schema.json


```json
{
    "actorSpecification": 1,
    "fields": {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {
            "productName": {
                "type": "string",
                "title": "Product name",
                "description": "The full name of the product as displayed on the product page.",
                "example": "Wireless Bluetooth Headphones"
            },
            "price": {
                "type": "number",
                "title": "Price",
                "description": "The current price, expressed in the currency given by the currency field. Does not include shipping or taxes.",
                "example": 49.99
            },
            "currency": {
                "type": "string",
                "title": "Currency code",
                "description": "Three-letter ISO 4217 currency code.",
                "example": "USD",
                "enum": ["USD", "EUR", "GBP"]
            },
            "inStock": {
                "type": "boolean",
                "title": "In stock",
                "description": "Whether the product is currently available for purchase.",
                "example": true
            },
            "url": {
                "type": "string",
                "title": "Product URL",
                "description": "Direct link to the product page.",
                "example": "https://example.com/products/wireless-headphones"
            }
        },
        "required": ["productName", "price", "url"]
    },
    "views": {
        "overview": {
            "title": "Overview",
            "transformation": {
                "fields": ["productName", "price", "inStock", "url"]
            },
            "display": {
                "component": "table",
                "properties": {
                    "url": { "format": "link" },
                    "inStock": { "format": "boolean" }
                }
            }
        }
    }
}
```


**Naming convention** - Use `camelCase` for field names. This matches the convention used in input schemas and ensures consistency across your Actor's configuration.

For validation options and error handling, see [Dataset validation](https://pr-2554.preview.docs.apify.com/platform/actors/development/actor-definition/dataset-schema/validation.md).
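
To get a feel for what `required`, `type`, and `enum` express, the following sketch checks a single item against a `fields` schema by hand. It is an illustration only: `checkItem` is a hypothetical helper, it covers top-level fields and single-string `type` values, and it is not how the platform validates datasets.

```javascript
// Maps JSON Schema primitive type names to runtime checks.
const typeChecks = {
    string: (v) => typeof v === 'string',
    number: (v) => typeof v === 'number' && Number.isFinite(v),
    boolean: (v) => typeof v === 'boolean',
    array: (v) => Array.isArray(v),
    object: (v) => typeof v === 'object' && v !== null && !Array.isArray(v),
    null: (v) => v === null,
};

// Hypothetical helper: checks `required`, `type`, and `enum` for top-level fields only.
function checkItem(item, fieldsSchema) {
    const errors = [];
    for (const name of fieldsSchema.required ?? []) {
        if (!(name in item)) errors.push(`${name}: required field is missing`);
    }
    for (const [name, definition] of Object.entries(fieldsSchema.properties ?? {})) {
        if (!(name in item)) continue; // optional field not present
        const value = item[name];
        const check = typeChecks[definition.type];
        if (check && !check(value)) errors.push(`${name}: expected ${definition.type}`);
        if (definition.enum && !definition.enum.includes(value)) {
            errors.push(`${name}: value is not in enum`);
        }
    }
    return errors;
}

const productSchema = {
    type: 'object',
    properties: {
        productName: { type: 'string' },
        price: { type: 'number' },
        currency: { type: 'string', enum: ['USD', 'EUR', 'GBP'] },
        url: { type: 'string' },
    },
    required: ['productName', 'price', 'url'],
};

console.log(checkItem({ productName: 'Headphones', price: '49.99', currency: 'JPY' }, productSchema));
// [ 'url: required field is missing', 'price: expected number', 'currency: value is not in enum' ]
```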

## Views

Views control how data appears in the Output tab UI. Each view defines which fields to show, in what order, and with what formatting.

### Why use views

Dataset views are like database views - different perspectives on the same data. Instead of showing all fields at once, views present focused subsets. Users find data faster, and AI agents can better understand your output.

For a real-world example, see [Google Maps Scraper](https://apify.com/compass/crawler-google-places) which uses views to separate place details from review data.

### When to use views

* **Control field order and formatting** - Without views, fields appear in JSON property order. Views let you order fields logically and format URLs as links, images inline, etc.
* **Expand nested data with `unwind`** - Arrays of nested objects appear collapsed by default. Use `unwind` to expand them into readable rows.
* **Create focused perspectives** - A scraper with 50+ fields can offer an "Overview" view and a "Details" view. Same data, different focus.

A single view is fine for simple Actors with fewer than 10 fields where all fields are equally relevant.

### Organizing views by use case

The same data often serves different purposes. An e-commerce scraper could offer a "Marketing" view (name, image, description) and a "Pricing" view (price, discount, competitor price). The first view defined becomes the default.

### What views are NOT for

Views show the same data from different angles. They're NOT for:

* **Separating unrelated data types** - Storing posts, comments, and profiles in one dataset, then using views to separate them. Use separate datasets for unrelated data types.
* **Controlling export formats** - Views don't change how data exports to JSON, CSV, or Excel. Export format is set in download options or the Dataset API `format` parameter. Views only affect Console UI display.
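
Because the export format is chosen per request rather than in a view, a client selects it with the `format` parameter. The sketch below only builds such a URL; `buildExportUrl` is a hypothetical helper, `DATASET_ID` is a placeholder, and the endpoint shape is an assumption based on the public Apify API v2.

```javascript
// Hypothetical helper: builds a Dataset API items URL for a given export format.
// The endpoint shape is an assumption based on the public Apify API v2.
function buildExportUrl(datasetId, format) {
    const url = new URL(`https://api.apify.com/v2/datasets/${encodeURIComponent(datasetId)}/items`);
    url.searchParams.set('format', format);
    return url.toString();
}

console.log(buildExportUrl('DATASET_ID', 'csv'));
// https://api.apify.com/v2/datasets/DATASET_ID/items?format=csv
```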

### Basic view example

The following Actor stores data using `Actor.pushData()`:

main.js


```javascript
import { Actor } from 'apify';

// Initialize the Actor before using its storages.
await Actor.init();

// Store one item with a field for each display format.
await Actor.pushData({
    numericField: 10,
    pictureUrl: 'https://www.google.com/images/branding/googlelogo/2x/googlelogo_color_92x30dp.png',
    linkUrl: 'https://google.com',
    textField: 'Google',
    booleanField: true,
    dateField: new Date(),
    arrayField: ['#hello', '#world'],
    objectField: {},
});

// Shut down the Actor gracefully.
await Actor.exit();
```


Configure the Output tab UI with a dataset schema:

.actor/actor.json


```json
{
    "actorSpecification": 1,
    "name": "Actor Name",
    "title": "Actor Title",
    "version": "1.0.0",
    "storages": {
        "dataset": {
            "actorSpecification": 1,
            "views": {
                "overview": {
                    "title": "Overview",
                    "transformation": {
                        "fields": [
                            "pictureUrl",
                            "linkUrl",
                            "textField",
                            "booleanField",
                            "arrayField",
                            "objectField",
                            "dateField",
                            "numericField"
                        ]
                    },
                    "display": {
                        "component": "table",
                        "properties": {
                            "pictureUrl": {
                                "label": "Image",
                                "format": "image"
                            },
                            "linkUrl": {
                                "label": "Link",
                                "format": "link"
                            },
                            "textField": {
                                "label": "Text",
                                "format": "text"
                            },
                            "booleanField": {
                                "label": "Boolean",
                                "format": "boolean"
                            },
                            "arrayField": {
                                "label": "Array",
                                "format": "array"
                            },
                            "objectField": {
                                "label": "Object",
                                "format": "object"
                            },
                            "dateField": {
                                "label": "Date",
                                "format": "date"
                            },
                            "numericField": {
                                "label": "Number",
                                "format": "number"
                            }
                        }
                    }
                }
            }
        }
    }
}
```


Each view has two parts:

1. `transformation` - Which fields to fetch and how to transform them
2. `display` - How to visually present the data in the UI

The Output tab displays fields from `transformation.fields` in the specified order:

![Output tab UI](/assets/images/output-schema-example-42bf91c1c1f39834fad5bbedf209acaa.png)
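
The relationship between `transformation.fields` and `display.properties` can be sketched in plain JavaScript. This is an illustration, not the Console's rendering code: `toTableRow` is a hypothetical helper, and falling back to the field name when no `label` is set is an assumption of this sketch.

```javascript
// Hypothetical helper: turns one item into ordered table cells for a view.
function toTableRow(item, view) {
    const properties = view.display.properties ?? {};
    return view.transformation.fields.map((field) => ({
        label: properties[field]?.label ?? field, // assumption: fall back to the field name
        format: properties[field]?.format, // undefined means no explicit format
        value: item[field], // undefined when the item lacks the field
    }));
}

const view = {
    transformation: { fields: ['linkUrl', 'textField', 'missingField'] },
    display: {
        component: 'table',
        properties: { linkUrl: { label: 'Link', format: 'link' } },
    },
};
const item = { textField: 'Google', linkUrl: 'https://google.com', numericField: 10 };

console.log(toTableRow(item, view).map((cell) => cell.label));
// [ 'Link', 'textField', 'missingField' ]
```

Column order follows `transformation.fields`, not the property order of the stored item, and `numericField` is left out because the view does not list it.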

### Multiple views example

Create multiple views for different use cases. This e-commerce scraper offers Marketing and Pricing views:

.actor/dataset\_schema.json


```json
{
    "actorSpecification": 1,
    "views": {
        "marketing": {
            "title": "Marketing",
            "description": "Fields for marketing and content creation",
            "transformation": {
                "fields": ["productName", "imageUrl", "description", "price"]
            },
            "display": {
                "component": "table",
                "properties": {
                    "imageUrl": {
                        "label": "Image",
                        "format": "image"
                    }
                }
            }
        },
        "pricing": {
            "title": "Pricing analysis",
            "description": "Fields for competitive pricing analysis",
            "transformation": {
                "fields": [
                    "productName",
                    "price",
                    "currency",
                    "discountPercent",
                    "competitorPrice",
                    "priceDifference"
                ]
            },
            "display": {
                "component": "table",
                "properties": {
                    "discountPercent": {
                        "label": "Discount %",
                        "format": "number"
                    },
                    "priceDifference": {
                        "label": "vs. Competitor",
                        "format": "number"
                    }
                }
            }
        }
    }
}
```


The first view defined becomes the default tab.
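
A client that lists the tabs could derive the default from definition order, as in the sketch below. `listViewTabs` is a hypothetical helper, and the sketch assumes that "first defined" corresponds to key order after `JSON.parse`. JavaScript moves integer-like keys such as `"1"` to the front, so this only holds for non-numeric view IDs.

```javascript
// Hypothetical helper: lists view tabs in definition order and marks the first as default.
function listViewTabs(schema) {
    return Object.entries(schema.views).map(([id, view], index) => ({
        id,
        title: view.title,
        isDefault: index === 0,
    }));
}

const schema = JSON.parse(`{
    "actorSpecification": 1,
    "views": {
        "marketing": { "title": "Marketing" },
        "pricing": { "title": "Pricing analysis" }
    }
}`);

console.log(listViewTabs(schema));
// [
//   { id: 'marketing', title: 'Marketing', isDefault: true },
//   { id: 'pricing', title: 'Pricing analysis', isDefault: false }
// ]
```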

## Handling nested structures

Tabular formats (Output tab table, Excel, CSV) require flat data. If your Actor produces nested JSON structures, transform them using these options:

### Flatten nested objects

Use `transformation.flatten` to convert nested objects into flat key-value pairs:


```json
{
    "transformation": {
        "flatten": ["address"],
        "fields": ["name", "address.street", "address.city"]
    }
}
```


With `flatten: ["address"]`, the object `{"address": {"street": "Main St", "city": "NYC"}}` becomes `{"address.street": "Main St", "address.city": "NYC"}`.

Use the flattened property names (e.g., `address.street`) in both `transformation.fields` and `display.properties`.
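
The effect of `flatten` on a single item can be approximated as follows. This is a sketch, not the platform's implementation: `flattenFields` is a hypothetical helper that flattens one level only, and how the platform treats deeper nesting is not covered here.

```javascript
// Hypothetical helper: flattens the listed object fields one level deep into dotted keys.
function flattenFields(item, fieldsToFlatten) {
    const result = {};
    for (const [key, value] of Object.entries(item)) {
        const isPlainObject = typeof value === 'object' && value !== null && !Array.isArray(value);
        if (fieldsToFlatten.includes(key) && isPlainObject) {
            for (const [childKey, childValue] of Object.entries(value)) {
                result[`${key}.${childKey}`] = childValue;
            }
        } else {
            result[key] = value; // fields not listed are copied unchanged
        }
    }
    return result;
}

const place = { name: 'Cafe', address: { street: 'Main St', city: 'NYC' } };

console.log(flattenFields(place, ['address']));
// { name: 'Cafe', 'address.street': 'Main St', 'address.city': 'NYC' }
```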

### Unwind arrays

Use `transformation.unwind` to expand arrays of nested objects into separate rows:


```json
{
    "transformation": {
        "unwind": ["reviews"],
        "fields": ["productName", "reviewText", "rating"]
    }
}
```


With `unwind: ["reviews"]`, a product with 5 reviews becomes 5 rows in the output, each containing the product name plus one review's data.
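
The row expansion can be sketched in plain JavaScript. `unwindField` is a hypothetical helper, not platform code. It handles arrays of objects and single objects, and leaving every other case (missing field, empty array, array of primitives) unchanged is an assumption of this sketch.

```javascript
const isPlainObject = (v) => typeof v === 'object' && v !== null && !Array.isArray(v);

// Hypothetical helper: expands one item into rows by unwinding a single field.
function unwindField(item, field) {
    const { [field]: value, ...parent } = item;
    if (Array.isArray(value) && value.length > 0 && value.every(isPlainObject)) {
        // One row per array element, merged with the parent's other fields.
        return value.map((element) => ({ ...parent, ...element }));
    }
    if (isPlainObject(value)) return [{ ...parent, ...value }];
    return [item]; // assumption: anything else is kept as is
}

const product = {
    productName: 'Headphones',
    reviews: [
        { reviewText: 'Great sound', rating: 5 },
        { reviewText: 'Too tight', rating: 3 },
    ],
};

console.log(unwindField(product, 'reviews'));
// [
//   { productName: 'Headphones', reviewText: 'Great sound', rating: 5 },
//   { productName: 'Headphones', reviewText: 'Too tight', rating: 3 }
// ]
```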

### Flatten in Actor code

Alternatively, flatten nested structures in your Actor code before calling `Actor.pushData()`.
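
A minimal sketch of that approach follows. `flattenItem` is a hypothetical helper; the dotted key style, the choice to keep arrays and dates intact, and dropping empty nested objects are all decisions of this sketch rather than platform requirements.

```javascript
// Hypothetical helper: recursively flattens nested objects into dotted keys.
function flattenItem(value, prefix = '', out = {}) {
    for (const [key, child] of Object.entries(value)) {
        const path = prefix ? `${prefix}.${key}` : key;
        const isNestedObject = typeof child === 'object' && child !== null
            && !Array.isArray(child) && !(child instanceof Date);
        if (isNestedObject) flattenItem(child, path, out);
        else out[path] = child; // primitives, arrays, and dates are stored as is
    }
    return out;
}

const flatItem = flattenItem({
    name: 'Cafe',
    address: { street: 'Main St', geo: { lat: 40.7, lng: -74 } },
    tags: ['coffee'],
});

console.log(flatItem);
// In an Actor, the flattened object would then be stored with:
// await Actor.pushData(flatItem);
```

Flattening in code gives every consumer (API, exports, and the Output tab) the same flat shape, whereas `transformation.flatten` leaves the stored items nested.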

## Reference

### DatasetSchema object

| Property             | Type              | Required | Description                                                                                       |
| -------------------- | ----------------- | -------- | ------------------------------------------------------------------------------------------------- |
| `actorSpecification` | integer           | true     | Version of the dataset schema structure. Currently only version 1 is available.                   |
| `fields`             | JSONSchema object | false    | Schema of one dataset object using JSON Schema Draft 2020-12 or compatible format.                |
| `views`              | Object            | true     | An object containing view definitions. Each key is a view ID, each value is a DatasetView object. |

### DatasetView object

| Property         | Type               | Required | Description                                                   |
| ---------------- | ------------------ | -------- | ------------------------------------------------------------- |
| `title`          | string             | true     | The title shown in the Output tab and API.                    |
| `description`    | string             | false    | Description of the view. Only available in API responses.     |
| `transformation` | ViewTransformation | true     | Defines how to fetch and transform data from the Dataset API. |
| `display`        | ViewDisplay        | true     | Defines how to render data in the Output tab UI.              |
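
The required properties in the two tables above can be checked mechanically before deploying. The sketch below uses a hypothetical helper, `findMissingRequired`. It only tests for the presence of keys and does not replace the platform's own schema checks.

```javascript
// Hypothetical helper: reports missing required keys on the schema and on each view.
function findMissingRequired(schema) {
    const problems = [];
    for (const key of ['actorSpecification', 'views']) {
        if (!(key in schema)) problems.push(`schema: missing "${key}"`);
    }
    for (const [viewId, view] of Object.entries(schema.views ?? {})) {
        for (const key of ['title', 'transformation', 'display']) {
            if (!(key in view)) problems.push(`views.${viewId}: missing "${key}"`);
        }
    }
    return problems;
}

const draftSchema = {
    actorSpecification: 1,
    views: {
        overview: { title: 'Overview', transformation: { fields: ['title'] } },
    },
};

console.log(findMissingRequired(draftSchema));
// [ 'views.overview: missing "display"' ]
```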

### ViewTransformation object

| Property  | Type      | Required | Description                                                                                                                 |
| --------- | --------- | -------- | --------------------------------------------------------------------------------------------------------------------------- |
| `fields`  | string\[] | true     | Fields to include in the output. Order determines column order in the UI. Missing field values display as **undefined**.    |
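| `unwind`  | string\[] | false    | Fields to unwind. Each element of an array field becomes a separate row merged with its parent object; an object field is merged into the parent, so with `unwind: ["foo"]`, `{"foo": {"bar": "hello"}}` becomes `{"bar": "hello"}`. |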
| `flatten` | string\[] | false    | Object fields to flatten. With `flatten: ["foo"]`, `{"foo": {"bar": "hello"}}` becomes `{"foo.bar": "hello"}`.              |
| `omit`    | string\[] | false    | Fields to exclude from output. Supports nested field names.                                                                 |
| `limit`   | integer   | false    | Maximum number of results. Default is all results.                                                                          |
| `desc`    | boolean   | false    | Sort order. Default is ascending (oldest first). Set `true` for descending (newest first).                                  |
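
How `fields`, `limit`, and `desc` combine can be sketched on an in-memory array. `selectRows` is a hypothetical helper, and applying `desc` before `limit` is an assumption of this sketch, not a documented guarantee.

```javascript
// Hypothetical helper: applies desc, limit, and fields to items given in storage order.
function selectRows(items, { fields, limit, desc = false }) {
    const ordered = desc ? [...items].reverse() : [...items]; // newest first when desc is true
    const limited = limit === undefined ? ordered : ordered.slice(0, limit);
    // Keep only the listed fields, in the listed order.
    return limited.map((item) => Object.fromEntries(fields.map((f) => [f, item[f]])));
}

const stored = [
    { id: 1, title: 'first', internal: 'a' },
    { id: 2, title: 'second', internal: 'b' },
    { id: 3, title: 'third', internal: 'c' },
];

console.log(selectRows(stored, { fields: ['id', 'title'], limit: 2, desc: true }));
// [ { id: 3, title: 'third' }, { id: 2, title: 'second' } ]
```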

### ViewDisplay object

| Property     | Type   | Required | Description                                                                                                                                                                 |
| ------------ | ------ | -------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `component`  | string | true     | Display component. Only `table` is available.                                                                                                                               |
| `properties` | Object | false    | Field display settings. Keys match `transformation.fields`, values are ViewDisplayProperty objects. If not set, fields render as strings, arrays, or objects automatically. |

### ViewDisplayProperty object

| Property | Type   | Required | Description                                                                                 |
| -------- | ------ | -------- | ------------------------------------------------------------------------------------------- |
| `label`  | string | false    | Column header text in the table view.                                                       |
| `format` | string | false    | Display format: `text`, `number`, `date`, `link`, `boolean`, `image`, `array`, or `object`. |
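
The constraints in the ViewDisplay and ViewDisplayProperty tables lend themselves to a quick consistency check. The sketch below uses a hypothetical helper, `checkDisplayProperties`, which flags `display.properties` keys that are not listed in `transformation.fields` and `format` values outside the documented list.

```javascript
const knownFormats = ['text', 'number', 'date', 'link', 'boolean', 'image', 'array', 'object'];

// Hypothetical helper: reports display properties that do not match the view definition.
function checkDisplayProperties(view) {
    const problems = [];
    const fields = view.transformation.fields ?? [];
    for (const [name, property] of Object.entries(view.display.properties ?? {})) {
        if (!fields.includes(name)) problems.push(`${name}: not listed in transformation.fields`);
        if (property.format !== undefined && !knownFormats.includes(property.format)) {
            problems.push(`${name}: unknown format "${property.format}"`);
        }
    }
    return problems;
}

const overview = {
    title: 'Overview',
    transformation: { fields: ['title', 'url'] },
    display: {
        component: 'table',
        properties: {
            url: { label: 'Link', format: 'link' },
            price: { label: 'Price', format: 'currency' },
        },
    },
};

console.log(checkDisplayProperties(overview));
// [ 'price: not listed in transformation.fields', 'price: unknown format "currency"' ]
```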
