Can Inorganic Intelligence Do Useful Tasks?

For this research topic, GDPval serves as the framework for defining useful tasks, even though GDP itself may not be the best measure of societal output.

What is GDPval?

GDPval measures real-world, economically valuable professional work, not academic puzzles.

Key properties:

Scale:

Occupations Covered (Examples)

GDPval spans representative roles across the economy, including:

Professional & Technical

Healthcare

Sales, Marketing & Media

Operations & Public Services

(44 total roles across 9 sectors)


What GDPval Tasks Look Like

GDPval tasks are real work outputs, not trivia.

Typical deliverables include:

Tasks often include reference files (spreadsheets, datasets, templates) and require multi-step reasoning.

The Public “Gold Subset”

OpenAI has released a 220-task “gold subset” for public inspection and research.

Purpose:

These tasks:


Finance & Accounting Example

Auditor Sample Testing & Variance Analysis

Scenario: You are an auditor reviewing an Anti-Financial Crime risk dataset.

Required deliverables:

Skills tested:


Finance Example

Profit & Loss Report for a Music Tour

Scenario: You are Finance Lead for a touring production company.

Required deliverables:

Skills tested:


How Outputs Are Judged

GDPval does not score answers as “correct” or “incorrect”.

Instead:

A reported score like ~70% GDPval means:
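
As a rough illustration, suppose each model deliverable is compared blind against a human expert's deliverable for the same task, and the grader records a win, tie, or loss for the model. Under that assumption, a score of ~70% would be the share of comparisons the model won or tied. The verdict labels, the sample data, and the function name `win_or_tie_rate` below are hypothetical and are not taken from the GDPval release; this is a minimal sketch of the arithmetic, not the benchmark's grading code.

```python
from collections import Counter


def win_or_tie_rate(judgments):
    """Return the fraction of pairwise comparisons the model won or tied.

    `judgments` is a list of strings, each "win", "tie", or "loss",
    recording a hypothetical grader's verdict on the model's deliverable
    versus the human expert's deliverable for one task.
    """
    if not judgments:
        raise ValueError("need at least one judgment")
    counts = Counter(judgments)
    unknown = set(counts) - {"win", "tie", "loss"}
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    return (counts["win"] + counts["tie"]) / len(judgments)


# Made-up verdicts for 10 tasks (illustrative data only).
example = ["win"] * 5 + ["tie"] * 2 + ["loss"] * 3
print(f"{win_or_tie_rate(example):.0%}")  # prints 70%
```

Read this way, the score describes how often graders preferred or matched the model's output, not the percentage of answers that were "right".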

Why GDPval Matters

GDPval attempts to capture something traditional benchmarks miss:

It’s designed to answer:

Can this model actually do economically useful work?

Summary


Appendix

The following example tasks come from the 220-task gold subset of OpenAI’s GDPval benchmark, the portion that has been open-sourced so researchers can inspect real prompts and reference files.


Finance & Accounting

Auditor Sample Testing & Variance Analysis

Scenario
You’re an auditor reviewing a spreadsheet of Anti-Financial Crime risk metrics.

Deliverables include:

The prompt gives clear instructions and expects structured output.

Profit and Loss Report for a Music Tour

Scenario
You’re Finance Lead for a production company’s fall tour.

Deliverables include:

Requires both numerical analysis and professional report structure.


Professional Services / Consulting

Many tasks require multi-step deliverables such as slide decks or written outputs based on real data and business context (not all prompts are publicly visible).

Examples include:

These combine analysis, narrative, and structured formatting.


Where to Explore the Full Gold Subset

OpenAI’s GDPval gold subset (220 tasks with prompts and reference files) is available via:


GDPval is designed to answer: Can this system actually do economically useful work at a professional level?