# World Of Taxonomy - Full Reference Guide

> Unified Global Classification Knowledge Graph
> 1,000+ systems, 1.3M+ nodes, 326K+ crosswalk edges.
> Open source (MIT). Data is informational only - use at your own risk.

========================================================================
# Getting Started with World Of Taxonomy
========================================================================

## Getting Started with World Of Taxonomy

> **TL;DR:** Three ways to query 1,000+ classification systems, 1.3M+ codes, and 326K+ crosswalk edges - REST API, MCP server for AI agents, and a web app. All open source, all free to start.

---

## Three access points, one knowledge graph

```mermaid
graph LR
    subgraph Graph["Knowledge Graph"]
        SYS["1,000+ Systems"]
        NODES["1.3M+ Nodes"]
        EDGES["326K+ Edges"]
    end
    subgraph Surfaces["Access Points"]
        API["REST API\n/api/v1/*"]
        MCP["MCP Server\nstdio transport"]
        WEB["Web App\nlocalhost:3000"]
    end
    Graph --> API
    Graph --> MCP
    Graph --> WEB
```

Pick whichever fits your workflow. The API is for application integrations and scripts. The MCP server gives AI agents direct tool access. The web app is for visual exploration.

## Quick start - REST API

Base URL: `https://worldoftaxonomy.com/api/v1`

### List all classification systems

```bash
curl https://worldoftaxonomy.com/api/v1/systems
```

Returns an array of all systems with their ID, name, region, node count, and provenance metadata.

### Search across all systems

```bash
curl "https://worldoftaxonomy.com/api/v1/search?q=physician"
```

Full-text search across all 1.3M+ nodes. A search for "physician" returns matches from SOC, ISCO, ESCO, NAICS, ICD-10-CM, and dozens more systems in a single call. Add `&grouped=true` to group results by system, or `&context=true` to include ancestor paths and children for each match.
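
If you build search URLs in a script rather than by hand, the query string needs proper escaping. The sketch below is a minimal illustration using only the Python standard library. The helper name `build_search_url` is hypothetical; the `q`, `grouped`, and `context` parameters are the ones described above.

```python
from urllib.parse import urlencode

BASE_URL = "https://worldoftaxonomy.com/api/v1"


def build_search_url(query, grouped=False, context=False):
    """Return a /search URL with the query escaped and optional flags set."""
    params = {"q": query}
    if grouped:
        # Group results by classification system.
        params["grouped"] = "true"
    if context:
        # Include ancestor paths and children for each match.
        params["context"] = "true"
    return f"{BASE_URL}/search?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("physician"))
    print(build_search_url("type 2 diabetes", grouped=True))
```

The function only builds the URL; pass the result to `curl` or any HTTP client to run the search.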
### Look up a specific code ```bash curl https://worldoftaxonomy.com/api/v1/systems/naics_2022/nodes/6211 ``` Returns the node with its title, description, level, parent code, and whether it is a leaf node. ### Browse children ```bash curl https://worldoftaxonomy.com/api/v1/systems/naics_2022/nodes/62/children ``` Returns all direct child codes under a given node. This is how you drill down through a hierarchy. ### Get cross-system equivalences ```bash curl https://worldoftaxonomy.com/api/v1/systems/naics_2022/nodes/6211/equivalences ``` Returns crosswalk mappings to other systems. NAICS 6211 ("Offices of Physicians") maps to ISIC 8620, NACE 86.2, NIC 8620, and others. ### Translate to all systems at once ```bash curl https://worldoftaxonomy.com/api/v1/systems/naics_2022/nodes/6211/translations ``` Returns equivalences across all connected systems in a single call. One request, every known translation. ## Quick start - MCP server The MCP (Model Context Protocol) server lets AI agents query the knowledge graph directly. ### Setup ```bash pip install world-of-taxonomy python -m world_of_taxonomy mcp ``` Transport: stdio. The server exposes 26 tools and wiki-based resources. It works with Claude, Cursor, VS Code, Windsurf, and any MCP-compatible client. ### Key MCP tools | Tool | Purpose | Example | |------|---------|---------| | `list_classification_systems` | List all 1,000+ systems | "What systems cover Germany?" | | `search_classifications` | Full-text search across all nodes | "Find codes for diabetes" | | `get_industry` | Look up a specific code | "What is NAICS 5415?" | | `browse_children` | Get child codes | "Show subcategories of HS chapter 01" | | `get_equivalences` | Get crosswalk mappings | "What does ICD-10-CM E11 map to?" 
| | `translate_code` | Translate a code to another system | "Convert SOC 29-1211 to ISCO" | | `translate_across_all_systems` | Translate to all connected systems | "All equivalents for NAICS 4841" | | `classify_business` | Classify free text into taxonomy codes (returns `domain_matches` + `standard_matches`) | "Classify: mobile app for pet sitting" | | `get_audit_report` | Data provenance and quality audit | "Show provenance breakdown" | | `get_country_taxonomy_profile` | Systems applicable to a country | "What systems apply in Brazil?" | ### MCP resources The server also provides resources that AI agents can read for deeper context: - `taxonomy://systems` - JSON list of all classification systems - `taxonomy://stats` - Knowledge graph statistics - `taxonomy://wiki/{slug}` - Individual guide pages as markdown ## Authentication ### Sign in There is no password. Visit [https://worldoftaxonomy.com/login](https://worldoftaxonomy.com/login), enter your email, and click the one-time sign-in link in the message we send. You land on the API-key dashboard at `/developers/keys`. ### API keys From the dashboard, click "Generate key". Copy the raw key once - we never show it again. Keys use the format `wot_` followed by 32 hex characters (or `rwot_` for restricted scopes). Pass them in the Authorization header: ``` Authorization: Bearer wot_your_key_here ``` You can revoke a key at any time from the same dashboard. Revocation propagates within ~2 seconds. 
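
As a rough sketch of the key format and header described above, the snippet below checks a key's shape before attaching it to a request. It assumes keys use lowercase hex and that `rwot_` keys carry the same 32 hex characters as `wot_` keys; neither detail is confirmed here. The helper names are hypothetical.

```python
import re
from urllib.request import Request

BASE_URL = "https://worldoftaxonomy.com/api/v1"

# Assumption: lowercase hex only, and the same length for wot_ and rwot_ keys.
KEY_PATTERN = re.compile(r"(?:wot|rwot)_[0-9a-f]{32}")


def looks_like_api_key(key):
    """Cheap local shape check; it does not prove the key is valid or active."""
    return KEY_PATTERN.fullmatch(key) is not None


def build_authenticated_request(path, api_key):
    """Build (but do not send) a request carrying the Bearer header."""
    if not looks_like_api_key(api_key):
        raise ValueError("key does not match the wot_/rwot_ format")
    return Request(
        f"{BASE_URL}{path}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
```

Passing the returned object to `urllib.request.urlopen` would send the call. The shape check only catches copy-paste mistakes; a revoked key still passes it and fails at the server.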
## Rate limits | Tier | Requests/Minute | Daily Limit | Best for | |------|-----------------|-------------|----------| | Anonymous | 30 | Unlimited | Quick exploration | | Free (authenticated) | 1,000 | Unlimited | Development and prototyping | | Pro | 5,000 | 100,000 | Production applications | | Enterprise | 50,000 | Unlimited | High-volume integrations | ## API request flow ```mermaid sequenceDiagram participant C as Your App participant RL as Rate Limiter participant AUTH as Auth Layer participant Q as Query Layer participant DB as PostgreSQL C->>RL: GET /api/v1/search?q=physician RL->>RL: Check tier limit RL->>AUTH: Forward AUTH->>AUTH: Validate session cookie or API key AUTH->>Q: Authenticated request Q->>DB: Full-text search DB-->>Q: Matching nodes Q-->>C: JSON response ``` ## API endpoints reference ### Systems | Endpoint | Description | |----------|-------------| | `GET /systems` | List all classification systems | | `GET /systems/{id}` | System detail with root codes | | `GET /systems/stats` | Leaf and total node counts per system | | `GET /systems?group_by=region` | Systems grouped by region | | `GET /systems?country={code}` | Systems applicable to a country | ### Nodes | Endpoint | Description | |----------|-------------| | `GET /systems/{id}/nodes/{code}` | Look up a specific code | | `GET /systems/{id}/nodes/{code}/children` | Direct children | | `GET /systems/{id}/nodes/{code}/ancestors` | Parent chain to root | | `GET /systems/{id}/nodes/{code}/siblings` | Sibling codes | | `GET /systems/{id}/nodes/{code}/subtree` | Subtree summary stats | ### Search | Endpoint | Description | |----------|-------------| | `GET /search?q={query}` | Full-text search | | `GET /search?q={query}&grouped=true` | Results grouped by system | | `GET /search?q={query}&context=true` | Results with ancestor/child context | ### Crosswalks | Endpoint | Description | |----------|-------------| | `GET /systems/{id}/nodes/{code}/equivalences` | Cross-system mappings | | `GET 
/systems/{id}/nodes/{code}/translations` | Translate to all systems |
| `GET /equivalences/stats` | Crosswalk statistics |
| `GET /compare?a={sys}&b={sys}` | Side-by-side sector comparison |
| `GET /diff?a={sys}&b={sys}` | Codes with no mapping |

### Classification

| Endpoint | Description |
|----------|-------------|
| `POST /classify` | Classify free text; returns `domain_matches` + `standard_matches` (see [domain-vs-standard](/guide/domain-vs-standard)) |

### Countries

| Endpoint | Description |
|----------|-------------|
| `GET /countries/stats` | Per-country taxonomy coverage |
| `GET /countries/{code}` | Full taxonomy profile for a country |

## Data disclaimer

All classification data in World Of Taxonomy is provided for informational purposes only. It should not be used as a substitute for official government or standards body publications. Always verify codes against the authoritative source for regulatory, legal, or compliance purposes.

========================================================================
# MCP Setup Guide for Claude Desktop, Cursor, Zed, and other AI clients
========================================================================

# MCP setup guide

> **TL;DR:** Get an API key at [worldoftaxonomy.com/developers](https://worldoftaxonomy.com/developers),
> paste it into your AI client's MCP config, and ask the assistant
> taxonomy questions in plain English. Five minutes start to finish.

The World Of Taxonomy MCP server gives AI assistants (Claude Desktop, Cursor, Zed, VS Code Continue, and any other MCP client) direct access to 1,000+ classification systems, 1.3M+ codes, and 326K+ crosswalk edges. Once installed, you can ask things like "convert NAICS 5415 to NACE" or "find ICD-10 codes related to type 2 diabetes" and get authoritative answers without leaving the editor.

The server is published as the [`worldoftaxonomy-mcp`](https://pypi.org/project/worldoftaxonomy-mcp/) package on PyPI.
End users do not need a database, a clone of the repo, or any local build step - the package handles every tool call by hitting the WoT REST API with your key. ## Step 1: Get your API key 1. Visit [worldoftaxonomy.com/developers](https://worldoftaxonomy.com/developers). 2. Enter your email and click **Send me a sign-in link**. 3. Open the email and click the link. The dashboard loads. 4. Click **Generate new key**. Name it after where you'll use it (e.g., `MCP on laptop`). 5. Copy the key shown on screen. **It's shown only once.** The key looks like `wot_a3f2c5d9b8e7f6c4d2a1b0c9d8e7f6a5` (free-tier keys start with `wot_`; restricted or cross-product keys may use `rwot_` or `aix_`). Free tier gives you **1,000 requests per minute** shared across your team. No credit card. See [API key management](./api-keys.md) for rotation, revocation, and scoping. ## Step 2: Install for your AI client Pick the client you use. Each has its own config file location. ### Claude Desktop **Mac**: `~/Library/Application Support/Claude/claude_desktop_config.json` **Windows**: `%APPDATA%\Claude\claude_desktop_config.json` Add this to the `mcpServers` block (create the block if it doesn't exist): ```json { "mcpServers": { "worldoftaxonomy": { "command": "uvx", "args": ["worldoftaxonomy-mcp"], "env": { "WOT_API_KEY": "wot_a3f2c5d9b8e7f6c4d2a1b0c9d8e7f6a5" } } } } ``` Restart Claude Desktop (Cmd-Q, relaunch). You'll see a small hammer icon at the bottom of the chat input - that's MCP. Click it to confirm `worldoftaxonomy` appears in the tool list. If you don't have `uvx` installed and don't want to install it, the plain-`pip` alternative works too: ```bash pip install --user worldoftaxonomy-mcp ``` Then change `command` to `worldoftaxonomy-mcp` and drop `args`. The `uvx` path is recommended because it isolates the install in a per-invocation venv and uses the latest published version automatically. ### Cursor Cursor uses MCP via Settings -> Features -> Model Context Protocol. 1. 
Open **Settings** -> **Features** -> **Model Context Protocol**. 2. Click **Add new MCP server**. 3. Enter: - **Name**: `worldoftaxonomy` - **Type**: `stdio` - **Command**: `uvx worldoftaxonomy-mcp` 4. Add an environment variable: `WOT_API_KEY` = your key. 5. Save and restart Cursor. ### Zed Zed uses `~/.config/zed/settings.json`. Add to `context_servers`: ```json { "context_servers": { "worldoftaxonomy": { "command": { "path": "uvx", "args": ["worldoftaxonomy-mcp"], "env": { "WOT_API_KEY": "wot_a3f2c5d9b8e7f6c4d2a1b0c9d8e7f6a5" } } } } } ``` Restart Zed. ### VS Code (with Continue extension) Continue's MCP support lives in `~/.continue/config.json`: ```json { "experimental": { "modelContextProtocolServers": [ { "transport": { "type": "stdio", "command": "uvx", "args": ["worldoftaxonomy-mcp"], "env": { "WOT_API_KEY": "wot_a3f2c5d9b8e7f6c4d2a1b0c9d8e7f6a5" } } } ] } } ``` Restart the Continue extension (Cmd-Shift-P -> "Continue: Restart"). ### Windsurf Windsurf uses `~/.codeium/windsurf/mcp_config.json` with the same schema as Claude Desktop: ```json { "mcpServers": { "worldoftaxonomy": { "command": "uvx", "args": ["worldoftaxonomy-mcp"], "env": { "WOT_API_KEY": "wot_a3f2c5d9b8e7f6c4d2a1b0c9d8e7f6a5" } } } } ``` Restart Windsurf. ### Generic MCP client If your client is a generic MCP host: - **Transport**: stdio - **Command**: `uvx worldoftaxonomy-mcp` (or `worldoftaxonomy-mcp` if pip-installed) - **Required env var**: `WOT_API_KEY` - **Optional env var**: `WOT_API_BASE_URL` (defaults to `https://wot.aixcelerator.ai`) - **Protocol version**: MCP 2024-11-05 or later ## Step 3: Verify it works Open a chat with your AI client and try one of these prompts: > "Look up NAICS 2022 code 5417 and tell me what NACE Rev 2 code it > maps to." > "Find ICD-10-CM codes related to type 2 diabetes mellitus." > "Translate ISIC Rev 4 division 84 to its equivalent in the German > WZ 2008 classification." > "What classification systems are commonly used in Brazil?" 
The assistant should call MCP tools (you'll see something like `get_industry`, `translate_code`, or `get_equivalences` in the tool log) and answer with specific codes and titles, not generic AI reasoning. ## Available tools The MCP server exposes 26 tools. The most commonly used: | Tool | Use it for | |---|---| | `list_classification_systems` | "What systems are available?" | | `search_classifications` | Full-text search across all codes | | `get_industry` | Look up a specific code | | `browse_children` | Drill into a category | | `get_equivalences` | Get all crosswalk mappings for a code | | `translate_code` | Convert between two specific systems | | `translate_across_all_systems` | Convert to every connected system | | `classify_business` | Free text -> taxonomy codes | | `get_country_taxonomy_profile` | Systems applicable in a country | The full list with arguments and return types is in the API reference at [worldoftaxonomy.com/api](https://worldoftaxonomy.com/api). A handful of tools (`list_crosswalks_by_kind`, `get_country_scope`, `get_audit_report`) are available in self-hosted mode but not yet wired into the published PyPI package; they'll surface a "not yet supported" error if invoked. They are not on the critical path for typical AI-agent tool calls. Open an issue if you have a specific need. ## Troubleshooting ### "WOT_API_KEY not set" on startup The MCP server requires the key as an environment variable. Re-check the `env` block in your client's config. Mac/Linux users can verify the wiring with a one-shot stdin probe: ```bash echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}' \ | WOT_API_KEY=wot_a3f2c5d9... uvx worldoftaxonomy-mcp \ | head -c 200 ``` You should see a JSON-RPC response that starts with `{"jsonrpc": "2.0", "id": 1, "result": {"protocolVersion":...}`. 
If you get an exit-2 stderr message about "no credentials", the env var is not actually being read; double-check the `env` block in your client's config (not all clients allow env-var inheritance from the parent shell).

### "Authentication failed" / 401 errors

Your key may be revoked or expired. Sign in at [worldoftaxonomy.com/developers](https://worldoftaxonomy.com/developers) and check the key's status in the dashboard. Generate a new one if needed. The old and new keys both work until you revoke the old one, so update your client's MCP config first, then revoke the old key.

### "Rate limit exceeded" / 429 errors

You or your team has hit the free-tier 1,000 req/min ceiling. Options:

- Wait 60 seconds and retry. The bucket refills.
- If you're on a corporate domain with multiple developers sharing the pool, ask the org admin who's burning the budget.
- Upgrade to the Pro tier (10,000 req/min pool) at [worldoftaxonomy.com/pricing](https://worldoftaxonomy.com/pricing).

### Tools don't appear in the AI client

1. Restart the client fully (quit and relaunch, not just close the window).
2. Verify the config file is valid JSON. A trailing comma or missing brace silently disables MCP.
3. Check the client's MCP log:
   - **Claude Desktop**: View -> Developer -> View Logs from MCP servers.
   - **Cursor**: Cmd-Shift-P -> "Cursor: Show Logs" -> filter for `mcp`.
   - **Zed**: Cmd-Shift-P -> "zed: open log" -> search for `worldoftaxonomy`.

### "Command not found: uvx"

Install `uv` (the runner that ships `uvx`):

- **Mac / Linux**: `curl -LsSf https://astral.sh/uv/install.sh | sh`
- **Windows**: `irm https://astral.sh/uv/install.ps1 | iex`

Or skip the runner entirely and `pip install --user worldoftaxonomy-mcp`, then point your client at `worldoftaxonomy-mcp` directly (no `args` needed). The `uvx` path is preferred because it runs the latest published version on every invocation; `pip install` pins you to whatever version was current at install time.
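
The choice between the `uvx` and plain-`pip` launch paths can be sketched as a small config generator. This is an illustration only: it emits the `mcpServers` shape used in the Claude Desktop and Windsurf examples earlier in this guide, and the helper name `build_server_entry` is hypothetical.

```python
import json
import shutil


def build_server_entry(api_key, use_uvx=None):
    """Return an mcpServers block using uvx when available, else the pip entry point."""
    if use_uvx is None:
        # Detect uvx on PATH; callers can override this for testing.
        use_uvx = shutil.which("uvx") is not None
    if use_uvx:
        entry = {"command": "uvx", "args": ["worldoftaxonomy-mcp"]}
    else:
        # pip-installed console script: no args needed.
        entry = {"command": "worldoftaxonomy-mcp"}
    entry["env"] = {"WOT_API_KEY": api_key}
    return {"mcpServers": {"worldoftaxonomy": entry}}


if __name__ == "__main__":
    print(json.dumps(build_server_entry("wot_your_key_here"), indent=2))
```

Because the output goes through `json.dumps`, it is always valid JSON, which also avoids the trailing-comma problem mentioned under "Tools don't appear in the AI client".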
### Server crashes when answering certain questions Capture the error from the client's MCP log and report it via the **Report incorrect description** link on the relevant node detail page, or open an issue on the [GitHub repo](https://github.com/colaberry/WorldOfTaxonomy). Include the exact prompt and the MCP log excerpt. ## What's next - [API quickstart](./getting-started.md) for direct REST access - [Crosswalk map](./crosswalk-map.md) to understand which systems connect to which - [Industry classification guide](./industry-classification.md) for picking the right industrial system for your use case - [Medical coding](./medical-coding.md) for ICD-10, ICD-11, MeSH, LOINC, ATC comparison - [Trade codes](./trade-codes.md) for HS, CPC, UNSPSC, SITC ======================================================================== # Systems Catalog - All 1,000+ Classification Systems ======================================================================== ## Systems Catalog - All 1,000+ Classification Systems > **TL;DR:** Complete catalog of 1,000+ classification systems organized by category. Industry (150+), Life Sciences (100+), Product/Trade (20+), Occupation (15+), Regulatory (100+), and 300+ domain vocabularies - all connected by 326K+ crosswalk edges. --- ```mermaid graph LR subgraph Top5["Largest Systems by Node Count"] NCI["NCI Thesaurus\n211,072"] NDC["NDC\n112,077"] LOINC["LOINC\n102,751"] ICD10CM["ICD-10-CM\n97,606"] ICD10PCS["ICD-10-PCS\n79,987"] end ``` World Of Taxonomy connects over 1,000 classification systems as equal peers in a unified knowledge graph. Systems span industry classification, product and trade codes, occupation standards, health and clinical coding, education frameworks, financial and environmental standards, regulatory compliance, and hundreds of domain-specific vocabularies. ## Industry classification standards These are the foundational systems for classifying economic activity by industry sector. 
### Global and Multi-National | System | Region | Codes | Authority | |--------|--------|-------|-----------| | ISIC Rev 4 | Global (UN) | 766 | United Nations Statistics Division | | ISIC Rev 3.1 | Global (historical) | ~400 | United Nations | | GICS Bridge | Global (MSCI/S&P) | 11 | MSCI and S&P Dow Jones | | ICB | Global (FTSE Russell) | 32 | FTSE Russell | ### North America | System | Region | Codes | Authority | |--------|--------|-------|-----------| | NAICS 2022 | North America | 2,125 | U.S. Census Bureau | | NAICS 2017 (Historical) | North America | ~2,000 | U.S. Census Bureau | | NAICS 2012 (Historical) | North America | ~2,000 | U.S. Census Bureau | | SIC 1987 | USA/UK | 1,176 | U.S. OMB | ### European Union (NACE Rev 2 Family) NACE Rev 2 is the EU standard. Each member state publishes a national adaptation with the same structure (996 codes): NACE Rev 2 (EU), ATECO 2007 (Italy), NAF Rev 2 (France), WZ 2008 (Germany), ONACE 2008 (Austria), NOGA 2008 (Switzerland), PKD 2007 (Poland), SBI 2008 (Netherlands), SNI 2007 (Sweden), DB07 (Denmark), TOL 2008 (Finland), CNAE 2009 (Spain), NACE-BEL 2008 (Belgium), CAE Rev 3 (Portugal), CZ-NACE (Czech Republic), TEAOR 2008 (Hungary), CAEN Rev 2 (Romania), and 20+ more national variants. ### Asia-Pacific | System | Region | Codes | Authority | |--------|--------|-------|-----------| | NIC 2008 | India | 2,070 | Ministry of Statistics | | JSIC 2013 | Japan | 20 | Ministry of Internal Affairs | | ANZSIC 2006 | Australia/NZ | 825 | ABS/Stats NZ | | GB/T 4754-2017 | China | 118 | National Bureau of Statistics | | KSIC 2017 | South Korea | 108 | KOSTAT | | SSIC 2020 | Singapore | 21 | Dept of Statistics | ### Latin America (ISIC-based) CIIU Rev 4 adaptations: Colombia, Argentina, Chile, Peru, Ecuador, Bolivia, Venezuela, Costa Rica, Guatemala, Panama, Paraguay, Uruguay, Dominican Republic - each with 766 codes based on ISIC Rev 4. 
### Additional National Systems Over 80 country-specific ISIC Rev 4 adaptations covering Africa, Middle East, Central Asia, Southeast Asia, Caribbean, and Pacific Island nations. ## Product and Trade Classification | System | Region | Codes | Purpose | |--------|--------|-------|---------| | HS 2022 | Global (WCO) | 6,960 | International trade (customs tariffs) | | CPC v2.1 | Global (UN) | 4,596 | Product classification (statistical) | | UNSPSC v24 | Global (GS1 US) | 77,337 | Procurement and spend analysis | | GS1 GPC | Global (GS1) | 6,450 | Global Product Classification (Segment / Family / Class / Brick), Nov 2025 release | | SITC Rev 4 | Global (UN) | 77 | Trade statistics | | BEC Rev 5 | Global (UN) | 29 | Broad economic categories | | HTS (US) | United States | 120 | US customs tariff | | CN 2024 | European Union | 118 | EU Combined Nomenclature | ## Occupation and Skills Classification | System | Region | Codes | Purpose | |--------|--------|-------|---------| | ISCO-08 | Global (ILO) | 619 | International occupation standard | | SOC 2018 | United States | 1,447 | US occupation classification | | O*NET-SOC | United States | 867 | Occupation database with skills data | | ESCO Occupations | Europe (EU) | 3,045 | European occupation taxonomy | | ESCO Skills | Europe (EU) | 14,247 | Skills and competences | | NOC 2021 | Canada | 51 | Canadian occupations | | UK SOC 2020 | United Kingdom | 43 | UK occupations | | ANZSCO 2022 | Australia/NZ | 1,590 | AU/NZ occupations | ## Life Sciences | System | Region | Codes | Purpose | |--------|--------|-------|---------| | ICD-11 MMS | Global (WHO) | 37,052 | Disease classification (latest) | | ICD-10-CM | United States | 97,606 | US clinical modification | | ICD-10-PCS | United States | 79,987 | US procedure coding | | LOINC | Global | 102,751 | Laboratory and clinical observations | | MeSH | Global (NLM) | 31,124 | Medical subject headings | | ATC WHO 2021 | Global (WHO) | 6,440 | Anatomical therapeutic chemical | | NCI 
Thesaurus | Global (NCI) | 211,072 | Cancer research terminology | | NDC | United States | 112,077 | National drug codes | ## Education Classification | System | Region | Codes | Purpose | |--------|--------|-------|---------| | ISCED 2011 | Global (UNESCO) | 20 | Education levels | | ISCED-F 2013 | Global (UNESCO) | 122 | Fields of education | | CIP 2020 | United States | 2,848 | Instructional programs | ## Geographic Classification | System | Region | Codes | Purpose | |--------|--------|-------|---------| | ISO 3166-1 | Global | 271 | Country codes | | ISO 3166-2 | Global | 5,246 | Subdivision codes | | UN M.49 | Global | 272 | Geographic regions | | EU NUTS 2021 | European Union | 124 | Statistical regions | | US FIPS | United States | 86 | Federal information processing | | GeoNames Features | Global (GeoNames) | 693 | Geographic feature classification (admin, hydrographic, populated places, terrain, undersea, vegetation) | ## Financial, Environmental, and Governance | System | Region | Codes | Purpose | |--------|--------|-------|---------| | COFOG | Global (UN) | 188 | Government functions | | GHG Protocol | Global (WRI) | 20 | Greenhouse gas accounting | | SASB SICS | Global | 86 | Sustainability accounting | | EU Taxonomy | European Union | 60 | Sustainable finance | | SFDR | European Union | 30 | Financial disclosure regulation | | SDG 2030 | Global (UN) | 82 | Sustainable development goals | ## Regulatory and Compliance Over 100 regulatory frameworks including HIPAA, SOX, GDPR, OSHA standards, FDA regulations, SEC rules, PCI DSS, NIST frameworks, ISO management system standards, and international agreements (Basel, FATF, ILO Conventions). 
## Domain-Specific Vocabularies Over 300 domain taxonomies covering specialized sectors: - **Transportation**: truck freight types, vehicle classes, cargo classification, carrier operations, pricing, regulatory compliance - **Agriculture**: crop types, livestock, farming methods, commodity grades, equipment, input supply, land classification - **Mining**: mineral types, extraction methods, reserve classification, equipment, safety - **Construction**: trade types, building types, project delivery, materials, sustainability - **Manufacturing**: process types, quality, operations models, industry verticals - **Healthcare deep-dives**: hospital departments, nursing specialties, lab categories, surgical specialties, pharmacy types - **Finance deep-dives**: insurance products, credit ratings, derivatives, private equity stages - **Technology**: API architectures, database types, programming paradigms, DevOps, MLOps, cybersecurity - **Energy**: oil grades, natural gas, solar, wind, battery, smart grid, carbon credits ## Web and Semantic Vocabularies These vocabularies define types and concepts used by AI assistants, search engines, and structured-data crawlers. They anchor what real-world things mean across the public web. | System | Region | Codes | Authority | Purpose | |--------|--------|-------|-----------|---------| | schema.org | Global | 926 | schema.org consortium (Google, Microsoft, Yahoo, Yandex) | Type vocabulary for marking up web pages and JSON-LD structured data (Article, Person, Restaurant, MedicalCondition, etc.) 
|
| SKOS | Global (W3C) | 17 | W3C | Simple Knowledge Organization System reference |
| W3C Standards | Global | 16 | W3C | W3C standards index |

## Financial Ontologies

| System | Region | Codes | Authority | Purpose |
|--------|--------|-------|-----------|---------|
| FIBO | Global | 2,521 | EDM Council | Financial Industry Business Ontology - class hierarchies for business entities, instruments, derivatives, indices, loans, securities |

## Lexical Resources

| System | Region | Codes | Authority | Purpose |
|--------|--------|-------|-----------|---------|
| WordNet (Nouns) | Global | 82,115 | Princeton University | Lexical semantic database; noun hypernym tree rooted at entity.n.01. Anchors common-sense concept relationships for NLP and AI reasoning. |

## Process / Activity Frameworks

| System | Region | Codes | Authority | Purpose |
|--------|--------|-------|-----------|---------|
| APQC PCF (Skeleton) | Global | 13 | APQC | Cross-Industry Process Classification Framework, top-level categories. The 13 process categories (operating processes 1.0-6.0; management and support 7.0-13.0) are the canonical pan-industry process anchor. Levels 2-5 (~1,500 detailed elements) require the official APQC spreadsheet.
| ## Patent Classification | System | Region | Codes | Purpose | |--------|--------|-------|---------| | Patent CPC | Global (EPO/USPTO) | 254,249 | Cooperative Patent Classification | ## Academic and Research | System | Region | Codes | Purpose | |--------|--------|-------|---------| | arXiv Taxonomy | Global | 110 | Preprint subject areas | | MSC 2020 | Global | 92 | Mathematics subject classification | | PACS | Global | 70 | Physics and astronomy | | LCC | Global | 111 | Library of Congress classification | | JEL Codes | Global | 98 | Economics literature | | ACM CCS 2012 | Global | 67 | Computing classification | ## How to Explore Systems Use these API calls to explore the catalog programmatically: ```bash # List all systems curl https://worldoftaxonomy.com/api/v1/systems # Group by region curl "https://worldoftaxonomy.com/api/v1/systems?group_by=region" # Filter by country curl "https://worldoftaxonomy.com/api/v1/systems?country=DE" # System detail with root codes curl https://worldoftaxonomy.com/api/v1/systems/naics_2022 ``` ======================================================================== # Crosswalk Map - How Classification Systems Connect ======================================================================== ## Crosswalk Map - How Classification Systems Connect > **TL;DR:** 326,000+ crosswalk edges link 1,000+ classification systems through hub-and-spoke topology. ISIC is the industry hub, CPC bridges trade to industry, SOC/ISCO connect occupations, and every one of the 434 domain taxonomies is bridged to NAICS/ISIC/NACE via sector anchors. This guide maps the full topology and shows how to navigate translation paths. --- ## What is a crosswalk? A crosswalk (or concordance) is a mapping between codes in two different classification systems. For example, NAICS 6211 ("Offices of Physicians") maps to ISIC 8620 ("Medical and dental practice activities"). 
Crosswalks have a match type that tells you how precise the mapping is: | Type | Meaning | Example | |------|---------|---------| | `exact` | Identical scope and definition | NAICS 111110 "Soybean Farming" = ISIC 0111 | | `partial` | Overlapping but not identical scope | NAICS 6211 partially overlaps ISIC 8620 | | `broader` | Target has wider scope | A 6-digit NAICS to a 2-digit ISIC | | `narrower` | Target has narrower scope | A section-level ISIC to a detailed NAICS | | `related` | Conceptually related but structurally different | Domain taxonomy to parent NAICS sector | ## Core crosswalk topology The knowledge graph has five major hubs. Each hub connects clusters of related systems. ```mermaid graph TB subgraph Industry["Industry Hub"] ISIC["ISIC Rev 4\n766 codes"] NAICS["NAICS 2022\n2,125 codes"] NACE["NACE Rev 2\n996 codes"] NIC["NIC 2008\n2,070 codes"] ANZSIC["ANZSIC 2006\n825 codes"] SIC["SIC 1987\n1,176 codes"] GBT["GB/T 4754\n118 codes"] NAT80["80+ National\nISIC variants"] end subgraph Trade["Trade Hub"] CPC["CPC v2.1\n4,596 codes"] HS["HS 2022\n6,960 codes"] UNSPSC["UNSPSC v24\n77,337 codes"] HTS["HTS / CN / SITC"] end subgraph Occupation["Occupation Hub"] SOC["SOC 2018\n1,447 codes"] ISCO["ISCO-08\n619 codes"] ESCO["ESCO\n3,045 + 14,247"] ONET["O*NET-SOC\n867 codes"] CIP["CIP 2020\n2,848 codes"] end NAICS <-->|3,418 edges| ISIC ISIC <-->|1:1| NACE ISIC -.->|derived| NIC ISIC -.->|derived| ANZSIC ISIC -.->|derived| GBT ISIC -.->|derived| NAT80 NAICS <-.->|legacy| SIC ISIC <-->|5,430 edges| CPC CPC <-->|11,686 edges| HS CPC -.-> UNSPSC HS -.-> HTS SOC <-->|992 edges| ISCO ISCO <-->|6,048 edges| ESCO SOC <-->|1,734 edges| ONET CIP -->|5,903 edges| SOC ISCO <-->|44 edges| ISIC ``` ## Industry classification hub ISIC Rev 4 is the central node for industry classification. Every major national system connects through it. 
```mermaid graph LR NAICS["NAICS 2022"] <-->|3,418| ISIC["ISIC Rev 4"] ISIC <-->|1:1| NACE["NACE Rev 2"] NACE -->|1:1| WZ["WZ 2008\nGermany"] NACE -->|1:1| NAF["NAF Rev 2\nFrance"] NACE -->|1:1| ATECO["ATECO 2007\nItaly"] NACE -->|1:1| MORE["30+ more\nEU variants"] ISIC -->|derived| NIC["NIC 2008\nIndia"] ISIC -->|derived| ANZSIC["ANZSIC 2006\nAU/NZ"] ISIC -->|derived| GBT["GB/T 4754\nChina"] ISIC -->|adapted| NAT80["80+ national\nadaptations"] ``` NACE national variants (WZ, NAF, ATECO, PKD, SBI, SNI, etc.) share the identical 996-code structure. Each has a 1:1 mapping to NACE Rev 2 and transitively to ISIC Rev 4. ## Product and trade hub CPC v2.1 is the bridge between trade codes and industry codes. ```mermaid graph LR HS["HS 2022\n6,960 codes"] <-->|11,686 edges| CPC["CPC v2.1\n4,596 codes"] CPC <-->|5,430 edges| ISIC["ISIC Rev 4"] HS -->|extended| HTS["HTS (US)"] HS -->|extended| CN["CN 2024 (EU)"] HS -->|extended| AHTN["ASEAN Tariff"] HS -->|extended| NCM["MERCOSUR Tariff"] HS -.->|aggregated| SITC["SITC Rev 4\n77 codes"] HS -.->|aggregated| BEC["BEC Rev 5\n29 codes"] CPC -.-> UNSPSC["UNSPSC v24\n77,337 codes"] ``` This means you can trace a trade code (HS) to its product category (CPC) to the industry that produces it (ISIC/NAICS). ## Occupation and education hub SOC 2018 and ISCO-08 are the twin hubs for occupation data. ```mermaid graph LR CIP["CIP 2020\n2,848 programs"] -->|5,903 edges| SOC["SOC 2018\n1,447 occupations"] CIP -->|1,615 edges| ISCEDF["ISCED-F 2013\n122 fields"] SOC <-->|992 edges| ISCO["ISCO-08\n619 occupations"] ISCO <-->|6,048 edges| ESCO["ESCO Occupations\n3,045"] SOC <-->|1,734 edges| ONET["O*NET-SOC\n867"] ISCO -->|44 edges| ISIC["ISIC Rev 4"] SOC -.-> NAICS["NAICS 2022"] ``` CIP 2020 (educational programs) connects to SOC (occupations) with 5,903 edges - the education-to-career pipeline. 
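
The hub diagrams above imply that a translation between two systems without a direct crosswalk has to pass through intermediate systems. A minimal sketch of finding such a route is a breadth-first search over system-level links. The link list below is copied by hand from the solid edges in the diagrams and treated as undirected, which is a simplification. The function works at system level only and does not translate individual codes.

```python
from collections import deque

# System-level links taken from the hub diagrams (solid edges only).
# Assumption: every link is treated as traversable in both directions.
LINKS = [
    ("NAICS 2022", "ISIC Rev 4"),
    ("ISIC Rev 4", "NACE Rev 2"),
    ("NACE Rev 2", "WZ 2008"),
    ("HS 2022", "CPC v2.1"),
    ("CPC v2.1", "ISIC Rev 4"),
    ("SOC 2018", "ISCO-08"),
    ("ISCO-08", "ESCO Occupations"),
    ("SOC 2018", "O*NET-SOC"),
    ("CIP 2020", "SOC 2018"),
    ("ISCO-08", "ISIC Rev 4"),
]


def build_adjacency(links):
    """Turn a list of pairs into an undirected adjacency map."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)
    return adjacency


def find_path(source, target, links=LINKS):
    """Return the shortest chain of systems from source to target, or None."""
    adjacency = build_adjacency(links)
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for neighbor in adjacency.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


if __name__ == "__main__":
    print(" -> ".join(find_path("HS 2022", "NAICS 2022")))
```

Each hop in the returned path corresponds to one crosswalk lookup, so a longer path means more chances for a `partial` or `broader` match to widen the result.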
## Geographic and domain hubs

```mermaid
graph TB
    subgraph Geo["Geographic"]
        ISO1["ISO 3166-1\n271 countries"]
        ISO2["ISO 3166-2\n5,246 subdivisions"]
        UNM["UN M.49\n272 regions"]
    end

    subgraph Domain["Domain Crosswalks"]
        N484["NAICS 484\nTruck Transportation"]
        N11["NAICS 11\nAgriculture"]
        N21["NAICS 21\nMining"]
        N22["NAICS 22\nUtilities"]
        N23["NAICS 23\nConstruction"]
    end

    ISO1 <--> ISO2
    ISO1 <--> UNM
    N484 -->|~200 edges| TRUCK["Truck domain\n7 vocabularies"]
    N11 -->|~48 edges| AG["Agriculture domain\n11 vocabularies"]
    N21 -->|~31 edges| MINE["Mining domain\n6 vocabularies"]
    N22 -->|~20 edges| UTIL["Utility domain\n6 vocabularies"]
    N23 -->|~27 edges| CONST["Construction domain\n6 vocabularies"]
```

Each domain taxonomy links back to its parent NAICS sector, creating drill-down paths from broad industry codes to specialized vocabularies.

As of the sector-anchor pass, all 434 domain taxonomies (up from the 15 original pilots shown above) carry at least one bridge edge to NAICS 2022, plus parallel fan-out edges into ISIC Rev 4 and NACE Rev 2 where the NAICS anchor has an existing international crosswalk. Generated edges are stamped `match_type='broad'` and one of two provenance values:

| Provenance | What it means |
|------------|---------------|
| `derived:sector_anchor:v1` | Direct NAICS<->domain bridge written by `crosswalk_domain_anchors.py` |
| `derived:sector_anchor:v1:fanout` | ISIC<->domain or NACE<->domain edge derived via a NAICS<->ISIC (or NACE) self-join |

Filter `?match_type=exact` if you want to exclude every generated bridge and see only authoritative exact statistical concordances.

## The four edge kinds

Every equivalence response now carries an `edge_kind` computed from the categories of both endpoints. See [domain-vs-standard](domain-vs-standard.md) for the full pattern.
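
As a rough illustration of "computed from the categories of both endpoints", the sketch below joins two category labels into one of the four `edge_kind` values (`standard_standard`, `standard_domain`, `domain_standard`, `domain_domain`). The category labels `"standard"` and `"domain"`, the simple concatenation rule, and both function names are assumptions made for this example; the project's real implementation may differ. The second helper shows one way to spot generated bridges from the provenance values listed earlier.

```python
ALLOWED_CATEGORIES = {"standard", "domain"}  # assumed labels, not confirmed by the API


def edge_kind(source_category: str, target_category: str) -> str:
    """Combine two endpoint categories into an edge_kind label."""
    for category in (source_category, target_category):
        if category not in ALLOWED_CATEGORIES:
            raise ValueError(f"unknown category: {category!r}")
    return f"{source_category}_{target_category}"


def is_generated_bridge(provenance: str) -> bool:
    """True for sector-anchor edges, both direct and fan-out."""
    return provenance.startswith("derived:sector_anchor:v1")


print(edge_kind("standard", "domain"))
print(is_generated_bridge("derived:sector_anchor:v1:fanout"))
```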
Quick reference: | `edge_kind` | Description | |---------------------|-------------| | `standard_standard` | Pre-existing statistical crosswalks (NAICS<->ISIC, ISIC<->NACE, HS<->CPC, SOC<->ISCO, ...) | | `standard_domain` | Bridge from an official code to a curated domain taxonomy | | `domain_standard` | Bridge from a domain taxonomy back to an official code | | `domain_domain` | Reserved for future cross-domain edges; none generated yet | Use the filter on any equivalence or translation endpoint: ``` GET /api/v1/systems/naics_2022/nodes/6211/equivalences?edge_kind=standard_standard GET /api/v1/systems/naics_2022/nodes/6211/equivalences?edge_kind=standard_domain,domain_standard ``` Stats grouped by edge kind: ```bash curl "https://worldoftaxonomy.com/api/v1/equivalences/stats?group_by=edge_kind" ``` ## Translation paths Not all systems have direct crosswalks. You translate between systems by following a path through intermediate hubs. ### Example: German industry code to US occupation ```mermaid graph LR WZ["WZ 2008\nGerman industry"] -->|1:1| NACE["NACE Rev 2"] NACE -->|1:1| ISIC["ISIC Rev 4"] ISIC -->|44 edges| ISCO["ISCO-08"] ISCO -->|992 edges| SOC["SOC 2018\nUS occupation"] ``` ### Example: HS trade code to NAICS industry ```mermaid graph LR HS["HS 2022\ntrade code"] -->|11,686| CPC["CPC v2.1"] CPC -->|5,430| ISIC["ISIC Rev 4"] ISIC -->|3,418| NAICS["NAICS 2022"] ``` ## API for crosswalk navigation ### Direct equivalences ```bash # Get all systems that NAICS 6211 maps to curl https://worldoftaxonomy.com/api/v1/systems/naics_2022/nodes/6211/equivalences # Translate to all connected systems at once curl https://worldoftaxonomy.com/api/v1/systems/naics_2022/nodes/6211/translations ``` ### Crosswalk statistics ```bash # Overall crosswalk stats curl https://worldoftaxonomy.com/api/v1/equivalences/stats # Stats for a specific system curl "https://worldoftaxonomy.com/api/v1/equivalences/stats?system_id=naics_2022" ``` ### Compare systems ```bash # Side-by-side 
top-level comparison curl "https://worldoftaxonomy.com/api/v1/compare?a=naics_2022&b=isic_rev4" # Codes in system A with no mapping to B curl "https://worldoftaxonomy.com/api/v1/diff?a=naics_2022&b=isic_rev4" ``` ## MCP tools for crosswalks | Tool | Purpose | |------|---------| | `get_equivalences` | Direct crosswalk mappings for a code | | `translate_code` | Translate a code to a specific target system | | `translate_across_all_systems` | Translate to all connected systems | | `get_crosswalk_coverage` | Coverage statistics for a crosswalk pair | | `get_system_diff` | Codes with no mapping between two systems | | `compare_sector` | Side-by-side sector comparison | | `describe_match_types` | Explain the match type categories | | `list_crosswalks_by_kind` | Counts + samples for a specific `edge_kind` (standard_standard, standard_domain, domain_standard, domain_domain); optionally narrow to a single system | ======================================================================== # Industry Classification Guide - Which System to Use ======================================================================== ## Industry Classification Guide - Which System to Use > **TL;DR:** Your country and purpose determine which industry classification system to use. NAICS for North America, NACE for the EU, ISIC for global. This guide provides a decision tree, country reference, and side-by-side comparisons. --- ## Decision tree ```mermaid graph TD START["What do you need to classify?"] --> GEO{"Geographic scope?"} GEO -->|Single country| NATIONAL["Use national system\nsee table below"] GEO -->|Multi-country / Global| ISIC["ISIC Rev 4\n766 codes, UN standard"] GEO -->|North America| NAICS["NAICS 2022\n2,125 codes"] GEO -->|European Union| NACE["NACE Rev 2\n996 codes"] NAICS --> DETAIL{"Need SEC filing?"} DETAIL -->|Yes| SIC["SIC 1987\nstill required by SEC"] DETAIL -->|No| NAICS_DONE["Use NAICS 2022"] ``` ### Step 1: What is your geographic scope? 
**Single country** - Use the national system for that country (see table below). **Multi-country or global** - Use ISIC Rev 4 as your common denominator, then translate to national systems as needed. **North America (US, Canada, Mexico)** - Use NAICS 2022. **European Union** - Use NACE Rev 2 (or your country's national variant). ### Step 2: What level of detail do you need? | Granularity | Typical Use | Recommended System | |-------------|-------------|-------------------| | Broad sectors (10-20 categories) | Executive dashboards, market sizing | ISIC sections (A-U) or NAICS 2-digit | | Divisions (~100 categories) | Industry reports, portfolio analysis | ISIC 2-digit or NAICS 3-digit | | Groups (~300 categories) | Detailed market analysis | ISIC 3-digit or NAICS 4-digit | | Classes (~500+ categories) | Regulatory filings, detailed reporting | ISIC 4-digit or NAICS 5-6 digit | ### Step 3: Is this for regulatory compliance? If you are filing with a government agency, use the system they require: | Agency / Purpose | Required System | |------------------|----------------| | US Census Bureau / BLS | NAICS 2022 | | US SEC filings | SIC 1987 | | Eurostat / EU statistical reporting | NACE Rev 2 | | UN statistical reporting | ISIC Rev 4 | | Australian Bureau of Statistics | ANZSIC 2006 | | Indian Ministry of Statistics | NIC 2008 | | World Bank projects | ISIC Rev 4 | ## Country-to-system quick reference ### Major economies | Country | Primary System | Codes | Notes | |---------|---------------|-------|-------| | United States | NAICS 2022 | 2,125 | Also SIC 1987 for SEC filings | | Canada | NAICS 2022 | 2,125 | Shared with US and Mexico | | United Kingdom | SIC 1987 / UK SOC | 1,176 | Companies House uses SIC | | Germany | WZ 2008 | 996 | National NACE variant | | France | NAF Rev 2 | 996 | National NACE variant | | India | NIC 2008 | 2,070 | Based on ISIC Rev 4 | | China | GB/T 4754-2017 | 118 | National standard | | Japan | JSIC 2013 | 20 | Statistical survey use | | 
Australia | ANZSIC 2006 | 825 | Shared with New Zealand |
| South Korea | KSIC 2017 | 108 | KOSTAT standard |

### Latin America

The following countries use CIIU Rev 4 (the Spanish translation of ISIC Rev 4) with 766 codes: Colombia, Argentina, Chile, Peru, Ecuador, Bolivia, Venezuela, Costa Rica, Guatemala, Panama, Paraguay, Uruguay, Dominican Republic.

### European Union (27 members + EEA)

All EU member states use NACE Rev 2 under national names: ATECO (Italy), NAF (France), WZ (Germany), CNAE (Spain), PKD (Poland), SBI (Netherlands), SNI (Sweden), and others. The structure is identical - 996 codes with 1:1 mapping.

## National variants - full enumeration

These are the World Of Taxonomy (WoT) system IDs for every country-specific industry classification carried in the graph. Each shares its structure with a parent system (NAICS for North/Central America, NACE Rev 2 for the EU and adjacent countries, ISIC Rev 4 everywhere else), but is stored as its own WoT system with the publisher's identifiers preserved, so country-specific filings stay clean.
### NACE Rev 2 family (EU and adjacent) | System ID | Country | Authority | |-----------|---------|-----------| | `nace_rev2` | EU (master) | Eurostat | | `ateco_2007` | Italy | ISTAT | | `naf_rev2` | France | INSEE | | `wz_2008` | Germany | Destatis | | `onace_2008` | Austria | Statistik Austria | | `noga_2008` | Switzerland | Federal Statistical Office | | `pkd_2007` | Poland | GUS | | `sbi_2008` | Netherlands | CBS | | `sni_2007` | Sweden | SCB | | `db07` | Denmark | Statistics Denmark | | `tol_2008` | Finland | Statistics Finland | | `cnae_2009` | Spain | INE | | `nace_bel` | Belgium | Statbel | | `nace_lu` | Luxembourg | STATEC | | `nace_ie` | Ireland | CSO Ireland | | `cae_rev3` | Portugal | INE Portugal | | `cz_nace` | Czech Republic | CZSO | | `teaor_2008` | Hungary | KSH | | `caen_rev2` | Romania | INS | | `nkd_2007` | Croatia | DZS | | `sk_nace` | Slovakia | Statistical Office SR | | `nkid` | Bulgaria | NSI | | `emtak` | Estonia | Statistics Estonia | | `nace_lt` | Lithuania | Statistics Lithuania | | `nk_lv` | Latvia | CSB Latvia | | `stakod_08` | Greece | ELSTAT | | `nace_cy` | Cyprus | CYSTAT | | `nace_mt` | Malta | NSO Malta | | `skd_2008` | Slovenia | SURS | | `sn_2007` | Norway | SSB | | `isat_2008` | Iceland | Statistics Iceland | | `nace_tr` | Turkey | TÜİK | | `kd_rs` | Serbia | SORS | | `nkd_mk` | North Macedonia | State Statistical Office | | `kd_ba` | Bosnia and Herzegovina | BHAS | | `kd_me` | Montenegro | MONSTAT | | `nve_al` | Albania | INSTAT | | `kd_xk` | Kosovo | ASK | | `caem_md` | Moldova | National Bureau of Statistics | | `kved_ua` | Ukraine | State Statistics Service | | `nace_ge` | Georgia | GeoStat | | `nace_am` | Armenia | Armstat | ### ISIC Rev 4 family (LatAm, Asia-Pacific, Africa, Middle East) The "CIIU" prefix is the Spanish translation of ISIC. Each Latin American country publishes its own national version. The Asia-Pacific and African nations listed below use ISIC Rev 4 directly with national codes. 
| System ID | Country | Notes | |-----------|---------|-------| | `ciiu_co` | Colombia | CIIU Rev 4 AC | | `ciiu_ar` | Argentina | CLANAE Rev 4 | | `ciiu_cl` | Chile | CIIU Rev 4 | | `ciiu_pe` | Peru | CIIU Rev 4 | | `ciiu_ec` | Ecuador | CIIU Rev 4 | | `caeb` | Bolivia | CAEB | | `ciiu_ve` | Venezuela | CIIU Rev 4 | | `ciiu_cr` | Costa Rica | CIIU Rev 4 | | `ciiu_gt` | Guatemala | CIIU Rev 4 | | `ciiu_pa` | Panama | CIIU Rev 4 | | `ciiu_py` | Paraguay | CIIU Rev 4 | | `ciiu_uy` | Uruguay | CIIU Rev 4 | | `ciiu_do` | Dominican Republic | CIIU Rev 4 | Plus 80+ ISIC Rev 4 country adaptations (`isic_` for two-letter ISO codes - Nigeria, Kenya, Egypt, Saudi Arabia, UAE, Vietnam, Bangladesh, Pakistan, etc). These follow a consistent system_id pattern; query `GET /api/v1/systems?prefix=isic_` for the live list. ### NAICS family (North America historical and adaptations) | System ID | Variant | |-----------|---------| | `naics_2022` | NAICS 2022 (current) | | `naics_2017` | NAICS 2017 (historical) | | `naics_2012` | NAICS 2012 (historical) | | `scian_2018` | SCIAN 2018 (Mexico's national NAICS variant) | ### Other Asia-Pacific | System ID | Country | |-----------|---------| | `gbt_4754` | China | | `ksic_2017` | South Korea | | `jsic_2013` | Japan | | `ssic_2020` | Singapore | | `msic_2008` | Malaysia | | `tsic_2009` | Thailand | | `psic_2009` | Philippines | | `psic_pk` | Pakistan | | `vsic_2018` | Vietnam | | `bsic` | Bangladesh | | `kbli_2020` | Indonesia (KBLI 2020 official) | | `kbli_id` | Indonesia (alternate) | | `slsic` | Sri Lanka | | `nic_2008` | India | | `anzsic_2006` | Australia / New Zealand | ### Russia and post-Soviet | System ID | Country | |-----------|---------| | `okved_2` | Russia | ### South Africa | System ID | Country | |-----------|---------| | `sic_sa` | South Africa | ### Historical references | System ID | Notes | |-----------|-------| | `isic_rev3` | ISIC Rev 3 (predecessor to Rev 4; kept for historical filings) | | `csic_2017` | China SIC 
2017 (companion to GB/T 4754) |
| `cnae_2012` | CNAE 2.0 (alternate Spanish industry classification) |

## Comparing the major systems

```mermaid
graph LR
    subgraph North_America["North America"]
        NAICS["NAICS 2022\n2,125 codes\n5 levels"]
    end

    subgraph EU["European Union"]
        NACE["NACE Rev 2\n996 codes\n4 levels"]
    end

    subgraph Global["Global"]
        ISIC["ISIC Rev 4\n766 codes\n4 levels"]
    end

    NAICS <-->|3,418 edges| ISIC
    ISIC <-->|1:1 structure| NACE
```

### NAICS 2022 vs ISIC Rev 4

| Feature | NAICS 2022 | ISIC Rev 4 |
|---------|-----------|-----------|
| Codes | 2,125 | 766 |
| Levels | 5 (2-6 digit) | 4 (section, division, group, class) |
| Region | North America | Global |
| Detail | Very granular | Moderate |
| Crosswalk | 3,418 edges to ISIC | 3,418 edges to NAICS |
| Best for | US regulatory, detailed analysis | International comparison |

### NAICS 2022 vs NACE Rev 2

| Feature | NAICS 2022 | NACE Rev 2 |
|---------|-----------|-----------|
| Codes | 2,125 | 996 |
| Levels | 5 | 4 |
| Region | North America | European Union |
| Detail | Very granular | Moderate |
| Best for | US/Canada/Mexico | EU regulatory, Eurostat |

### NAICS 2022 vs SIC 1987

| Feature | NAICS 2022 | SIC 1987 |
|---------|-----------|---------|
| Codes | 2,125 | 1,176 |
| Status | Current | Legacy (but still used) |
| Region | North America | USA/UK |
| Best for | Current analysis | SEC filings, historical data |

## How to translate between systems

```bash
# Translate NAICS 6211 to all equivalent systems
curl https://worldoftaxonomy.com/api/v1/systems/naics_2022/nodes/6211/translations

# Direct equivalences with match types
curl https://worldoftaxonomy.com/api/v1/systems/naics_2022/nodes/6211/equivalences

# Find NAICS codes with no NACE equivalent
curl "https://worldoftaxonomy.com/api/v1/diff?a=naics_2022&b=nace_rev2"
```

For systems without direct crosswalks, follow the translation path through hub systems (see the [Crosswalk Map](crosswalk-map) guide).
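
When these calls are made from a script rather than the shell, it can help to build the URLs in one place. The sketch below only assembles the endpoint URLs used in this guide and makes no network request; `node_url` and `diff_url` are hypothetical helper names chosen for this example, not functions shipped with the `world-of-taxonomy` package.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://worldoftaxonomy.com/api/v1"


def node_url(system_id: str, code: str, view: str = "") -> str:
    """URL for a node, or for a view of it such as 'equivalences' or 'translations'."""
    # quote() percent-encodes any character that is not safe in a path segment.
    url = f"{BASE_URL}/systems/{quote(system_id, safe='')}/nodes/{quote(code, safe='')}"
    return f"{url}/{view}" if view else url


def diff_url(system_a: str, system_b: str) -> str:
    """URL listing codes in system_a that have no mapping to system_b."""
    return f"{BASE_URL}/diff?{urlencode({'a': system_a, 'b': system_b})}"


print(node_url("naics_2022", "6211", "translations"))
print(node_url("naics_2022", "6211", "equivalences"))
print(diff_url("naics_2022", "nace_rev2"))
```

The resulting strings can be passed to any HTTP client; response shapes are not assumed here.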
## Domain-specific extensions When a standard industry code is too broad for your use case, World Of Taxonomy provides domain-specific vocabularies: | NAICS Sector | Domain Vocabularies | Example Codes | |-------------|---------------------|---------------| | 484 Truck Transportation | Freight types, vehicle classes, cargo, carrier operations | 44 + 23 + 46 + 27 | | 11 Agriculture | Crop types, livestock, farming methods, commodity grades | 46 + 27 + 28 + 30 | | 21 Mining | Mineral types, extraction methods, reserve classification | 25 + 20 + 12 | | 22 Utilities | Energy sources, grid regions, tariff structures | 17 + 15 + 26 | | 23 Construction | Trade types, building types, project delivery | 20 + 17 + 22 | These domain taxonomies are crosswalked back to their parent NAICS/ISIC sector codes, so you can drill down from a broad industry classification to specialized detail. ======================================================================== # Medical and Health Classification Systems Compared ======================================================================== ## Medical and Health Classification Systems Compared > **TL;DR:** ICD-10-CM for US billing, ICD-11 for global reporting, LOINC for lab tests, ATC for drugs, MeSH for research. World Of Taxonomy connects all of these (and more) - 568K+ health codes across 100+ systems with crosswalk edges between them. 
--- ## System overview | System | Codes | Purpose | Authority | |--------|-------|---------|-----------| | ICD-11 MMS | 37,052 | Disease classification (latest WHO standard) | WHO | | ICD-10-CM | 97,606 | US clinical modification for diagnoses | CMS/NCHS | | ICD-10-PCS | 79,987 | US procedure coding system | CMS | | LOINC | 102,751 | Laboratory and clinical observations | Regenstrief Institute | | MeSH | 31,124 | Medical literature subject headings | NLM | | ATC WHO 2021 | 6,440 | Drug classification by therapeutic use | WHO | | NCI Thesaurus | 211,072 | Cancer research terminology | National Cancer Institute | | NDC | 112,077 | National drug codes (US) | FDA | | SNOMED CT | ~20 (skeleton) | Clinical terminology reference | SNOMED International | | CPT | ~18 (skeleton) | Medical procedure codes (US) | AMA | > SNOMED CT and CPT are included as structural placeholders. Full datasets require licenses from SNOMED International and the AMA respectively. ## How health systems connect ```mermaid graph TB subgraph Diagnoses["Diagnosis Systems"] ICD10CM["ICD-10-CM\n97,606 codes"] ICD11["ICD-11 MMS\n37,052 codes"] ICD10PCS["ICD-10-PCS\n79,987 codes"] end subgraph Drugs["Drug Systems"] ATC["ATC WHO 2021\n6,440 codes"] NDC["NDC\n112,077 codes"] RXNORM["RxNorm (skeleton)"] end subgraph Research["Research & Lab"] MESH["MeSH\n31,124 descriptors"] LOINC["LOINC\n102,751 observations"] NCI["NCI Thesaurus\n211,072 terms"] end subgraph Clinical["Clinical"] SNOMED["SNOMED CT\n(skeleton)"] CPT["CPT (skeleton)"] end ICD10CM <-.-> ICD11 ICD10CM <-.-> MESH ATC <-.-> ICD10CM LOINC <-.-> ICD10CM SNOMED <-.-> ICD10CM NDC <-.-> ATC NCI <-.-> MESH CPT <-.-> ICD10PCS ``` ## ICD-10-CM vs ICD-11: Which to use? ### ICD-10-CM (United States) ICD-10-CM is the US clinical modification of the WHO's ICD-10. It is required for US healthcare billing and reporting. 
- **97,606 codes** - the most granular diagnosis system in the graph - **Structure**: 3-7 character alphanumeric codes (e.g., E11.65 - Type 2 diabetes with hyperglycemia) - **Required by**: CMS, US health insurers, HIPAA transactions - **Updated**: annually (October 1 each year) ### ICD-11 MMS (Global) ICD-11 is the latest WHO revision, adopted by the World Health Assembly in 2019. - **37,052 codes** with extension codes for additional detail - **Structure**: Alphanumeric with cluster and post-coordination - **Status**: Official WHO standard since January 2022 ### When to use which | Scenario | System | Why | |----------|--------|-----| | US hospital billing | ICD-10-CM | Required by CMS | | US procedure coding | ICD-10-PCS | Required for inpatient procedures | | WHO mortality/morbidity reporting | ICD-11 | Current WHO standard | | New health IT system (non-US) | ICD-11 | Forward-looking adoption | | International health research | ICD-11 | Global comparability | | Legacy system integration | ICD-10-CM | Existing infrastructure | ## LOINC - Laboratory and clinical observations LOINC (Logical Observation Identifiers Names and Codes) is the universal standard for identifying health measurements, observations, and documents. - **102,751 codes** - the largest observation vocabulary - **Use cases**: lab test orders and results, clinical documents, patient surveys - **Structure**: 5-7 digit numeric codes with check digit - **Required by**: US federal health agencies, HL7 FHIR implementations > LOINC does not classify diseases (that is ICD's role). It classifies what was measured or observed. A LOINC code identifies the test, an ICD code identifies the condition. ## MeSH - Medical subject headings MeSH is the controlled vocabulary used for indexing biomedical literature in PubMed/MEDLINE. 
- **31,124 descriptors** organized in a hierarchical tree - **Use cases**: literature search, research categorization, knowledge organization - **Structure**: 16 top-level categories branching into specific terms - **Maintained by**: US National Library of Medicine ## ATC - Drug classification The Anatomical Therapeutic Chemical (ATC) classification organizes drugs by the organ system they target and their therapeutic properties. - **6,440 codes** across 5 hierarchical levels - **Structure**: 7-character codes (e.g., A10BA02 = metformin) - **Levels**: Anatomical group, Therapeutic subgroup, Pharmacological subgroup, Chemical subgroup, Chemical substance - **Maintained by**: WHO Collaborating Centre for Drug Statistics ```mermaid graph TD A["A - Alimentary Tract\nand Metabolism"] --> A10["A10 - Drugs Used\nin Diabetes"] A10 --> A10B["A10B - Blood Glucose\nLowering Drugs"] A10B --> A10BA["A10BA - Biguanides"] A10BA --> A10BA02["A10BA02\nMetformin"] ``` ## Domain-specific health vocabularies World Of Taxonomy includes domain taxonomies for healthcare specialization: | Domain | Codes | Coverage | |--------|-------|----------| | Hospital Department Types | 18 | Department classification | | Nursing Specialty Types | 17 | Nursing specializations | | Lab Test Category Types | 17 | Laboratory categories | | Surgical Specialty Types | 17 | Surgical specializations | | Pharmacy Practice Types | 16 | Pharmacy settings | | Health Care Settings | 23 | Care delivery settings | | Health Care Payer Types | 18 | Insurance/payer categories | | Health Care Delivery Models | 18 | Payment and delivery models | | Mental Health Service Types | 22 | Behavioral health | | Dental Service Types | 18 | Oral health | ## API examples ```bash # Search for a medical term across all systems curl "https://worldoftaxonomy.com/api/v1/search?q=diabetes&grouped=true" # Browse ICD-10-CM hierarchy curl https://worldoftaxonomy.com/api/v1/systems/icd10_cm/nodes/E11/children # Get ICD-10-CM code detail curl 
https://worldoftaxonomy.com/api/v1/systems/icd10_cm/nodes/E11.65 # Browse ATC hierarchy from top level curl https://worldoftaxonomy.com/api/v1/systems/atc_who_2021/nodes/A10/children # LOINC system overview curl https://worldoftaxonomy.com/api/v1/systems/loinc # Cross-system equivalences for a diagnosis code curl https://worldoftaxonomy.com/api/v1/systems/icd10_cm/nodes/E11/equivalences ``` ## Use cases | Who | What | Systems | |-----|------|---------| | Hospital IT teams | Map diagnoses to billing codes | ICD-10-CM, ICD-10-PCS, CPT | | Pharma researchers | Link drugs to indications | ATC, ICD-10-CM, MeSH | | Public health agencies | Compare disease burden globally | ICD-11, ICD-10-CM | | Lab information systems | Standardize test identifiers | LOINC | | Clinical NLP pipelines | Normalize extracted terms | SNOMED CT, ICD-10-CM, MeSH | | Health AI agents | Navigate the full health taxonomy | All of the above via MCP | ======================================================================== # Trade and Product Classification Guide ======================================================================== ## Trade and Product Classification Guide > **TL;DR:** HS for customs, CPC to bridge trade and industry, UNSPSC for procurement (77K codes). This guide shows how the trade classification systems relate, which one to use, and how to navigate between them with 11,686+ crosswalk edges. 
--- ## System comparison | System | Codes | Purpose | Maintained By | |--------|-------|---------|---------------| | HS 2022 | 6,960 | International customs tariffs | World Customs Organization | | CPC v2.1 | 4,596 | Statistical product classification | United Nations | | UNSPSC v24 | 77,337 | Procurement and spend analysis | GS1 US | | GS1 GPC | 6,450 | Global Product Classification (Segment / Family / Class / Brick), retail and B2B trade | GS1 | | SITC Rev 4 | 77 | Trade statistics (aggregated) | United Nations | | BEC Rev 5 | 29 | Broad economic categories | United Nations | | HTS (US) | 120 | US-specific tariff schedule | US International Trade Commission | | CN 2024 | 118 | EU Combined Nomenclature | European Commission | GS1 GPC sits next to UNSPSC as the second major procurement-anchored vocabulary, but with a different focus: GPC is the global product-identification standard published by GS1 (the same authority behind GTINs / UPCs / EANs), used heavily across retail, healthcare, and supply chain. UNSPSC is procurement-shaped (for spend analysis), GPC is product-identification-shaped (for catalog and POS systems). Both serve B2B trade, often in parallel. ## How these systems relate ### The HS family tree The Harmonized System (HS) is the foundation of international trade classification. Other systems build on it. ```mermaid graph TD HS["HS 2022 (WCO)\n6,960 codes\nGlobal foundation"] --> HTS["HTS (US)\nUS-specific subheadings"] HS --> CN["CN 2024 (EU)\nEU-specific subheadings"] HS --> AHTN["ASEAN Tariff (AHTN)\nSoutheast Asia"] HS --> NCM["MERCOSUR Tariff (NCM)\nSouth America"] HS --> AFCFTA["AfCFTA Tariff\nAfrica"] HS --> GCC["GCC Common Tariff\nGulf States"] ``` Every country that trades internationally uses HS at the 6-digit level. National extensions add more digits for country-specific detail. ### The statistical bridge CPC v2.1 bridges product classification and industry classification. This is where trade meets production. 
```mermaid graph LR HS["HS 2022\n6,960 trade codes"] <-->|11,686 edges| CPC["CPC v2.1\n4,596 product codes"] CPC <-->|5,430 edges| ISIC["ISIC Rev 4\n766 industry codes"] CPC -.-> UNSPSC["UNSPSC v24\n77,337 procurement codes"] ``` This means you can trace: a **trade code** (HS) to its **product category** (CPC) to the **industry that produces it** (ISIC/NAICS). ### Aggregation for statistics SITC and BEC aggregate trade data at higher levels for economic analysis: ```mermaid graph TD HS["HS 2022\n6,960 detailed codes"] --> SITC["SITC Rev 4\n77 codes\nTrade flow analysis"] HS --> BEC["BEC Rev 5\n29 codes\nEconomic category analysis"] BEC --> SNA["Maps to SNA\ncategories"] ``` ## Which system to use | Purpose | Recommended System | Why | |---------|-------------------|-----| | Customs declarations | HS 2022 (or national variant) | Legally required for trade | | US import/export filings | HTS (US) | Required by US Customs | | EU trade compliance | CN 2024 | Required by EU customs | | Procurement/spend analysis | UNSPSC v24 | Most granular (77K codes) | | International trade statistics | SITC Rev 4 | Designed for aggregate analysis | | Economic modeling | BEC Rev 5 | Maps to SNA categories | | Product-to-industry mapping | CPC v2.1 | Bridges HS to ISIC | ## HS code structure HS codes use a hierarchical 6-digit structure: | Level | Digits | Example | Description | |-------|--------|---------|-------------| | Chapter | 2 | 01 | Live animals | | Heading | 4 | 0101 | Horses, asses, mules | | Subheading | 6 | 010121 | Pure-bred horses | National extensions add further digits. HTS (US) goes up to 10 digits. CN (EU) uses 8 digits. 
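
The digit layout described above can be sketched as a small parser that splits an HS-based code into its chapter, heading, subheading, and any national extension. This is a minimal illustration, assuming codes are 2, 4, 6, 8, or 10 digits long with optional dots or spaces as separators; `split_trade_code` is a hypothetical helper name, and the trailing digits in the second call are placeholders rather than a real tariff line.

```python
def split_trade_code(code: str) -> dict:
    """Split an HS-based code into chapter, heading, subheading, extension."""
    digits = code.replace(".", "").replace(" ", "")
    # Assumption: only even lengths from 2 to 10 digits are accepted.
    if not digits.isdigit() or len(digits) not in (2, 4, 6, 8, 10):
        raise ValueError(f"expected 2, 4, 6, 8 or 10 digits, got {code!r}")
    return {
        "chapter": digits[:2],
        "heading": digits[:4] if len(digits) >= 4 else None,
        "subheading": digits[:6] if len(digits) >= 6 else None,
        # Digits beyond the sixth are national detail (HTS, CN, and so on).
        "national_extension": digits[6:] or None,
    }


print(split_trade_code("010121"))
print(split_trade_code("0101.21.00.10"))  # placeholder extension digits
```

Because the first six digits are shared internationally, truncating any national code to `subheading` gives the HS 2022 code to look up in the graph.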
## CPC code structure CPC v2.1 uses a 5-level hierarchy: | Level | Example | Description | |-------|---------|-------------| | Section | 0 | Agriculture, forestry and fishery products | | Division | 01 | Products of agriculture, horticulture | | Group | 011 | Cereals | | Class | 0111 | Wheat | | Subclass | 01110 | Wheat, unmilled | ## UNSPSC structure UNSPSC uses an 8-digit hierarchy across 4 levels: | Level | Example | Description | |-------|---------|-------------| | Segment | 10 | Live Plant and Animal Material | | Family | 1010 | Live animals | | Class | 101015 | Dogs | | Commodity | 10101501 | Guard dogs | With 77,337 codes, UNSPSC is the most detailed product classification available. It is widely used in procurement platforms and spend analytics. ## Crosswalk navigation ### Translate an HS code to an industry ```bash # Get CPC equivalences for an HS code curl https://worldoftaxonomy.com/api/v1/systems/hs_2022/nodes/0101/equivalences # Translate HS code to all connected systems curl https://worldoftaxonomy.com/api/v1/systems/hs_2022/nodes/0101/translations ``` ### Find trade codes for an industry ```bash # Start from a NAICS code, get all translations including HS/CPC curl https://worldoftaxonomy.com/api/v1/systems/naics_2022/nodes/1111/translations # Or use the search to find trade codes by product name curl "https://worldoftaxonomy.com/api/v1/search?q=wheat&grouped=true" ``` ### Find gaps ```bash # HS codes with no CPC equivalent curl "https://worldoftaxonomy.com/api/v1/diff?a=hs_2022&b=cpc_v21" ``` ## Use cases | Who | What | Systems | |-----|------|---------| | Customs brokers | Classify goods for import/export | HS 2022, HTS, CN 2024 | | Procurement teams | Categorize spend across suppliers | UNSPSC v24 | | Trade economists | Analyze bilateral trade flows | SITC Rev 4, BEC Rev 5 | | Supply chain analysts | Map products to producing industries | CPC v2.1, ISIC Rev 4 | | Compliance officers | Verify tariff classification | HS 2022 + national variants | | 
AI trade agents | Automate classification via MCP | All of the above | ## MCP tools for trade classification | Tool | Purpose | |------|---------| | `search_classifications` | Find trade codes by product name | | `get_equivalences` | Get crosswalk to other systems | | `translate_code` | Direct translation between systems | | `browse_children` | Explore HS/CPC/UNSPSC hierarchy | | `get_crosswalk_coverage` | Check crosswalk completeness | ======================================================================== # Occupation Classification Systems Compared ======================================================================== ## Occupation Classification Systems Compared > **TL;DR:** SOC for US labor data, ISCO for global comparison, ESCO for European skills matching, O*NET for detailed occupation attributes. Connected by 10,000+ crosswalk edges with education-to-career pathways through CIP. --- ## System overview | System | Codes | Region | Purpose | Authority | |--------|-------|--------|---------|-----------| | ISCO-08 | 619 | Global (ILO) | International occupation standard | International Labour Organization | | SOC 2018 | 1,447 | United States | US federal occupation classification | Bureau of Labor Statistics | | O*NET-SOC | 867 | United States | Detailed occupation database with skills | Department of Labor | | ESCO Occupations | 3,045 | Europe (EU) | European occupation taxonomy | European Commission | | ESCO Skills | 14,247 | Europe (EU) | Skills and competences taxonomy | European Commission | | ANZSCO 2022 | 1,590 | Australia/NZ | AU/NZ occupation standard | ABS/Stats NZ | | NOC 2021 | 51 | Canada | Canadian occupation classification | Statistics Canada | | UK SOC 2020 | 43 | United Kingdom | UK occupation standard | ONS | | KldB 2010 | 54 | Germany | German occupation classification | Federal Employment Agency | | ROME v4 | 93 | France | French job/occupation repertoire | Pole emploi | ## How occupation systems connect ```mermaid graph TB subgraph 
Education["Education"]
        CIP["CIP 2020\n2,848 programs"]
        ISCEDF["ISCED-F 2013\n122 fields"]
    end

    subgraph US_Occ["United States"]
        SOC["SOC 2018\n1,447 occupations"]
        ONET["O*NET-SOC\n867 occupations\n+ skills, abilities, interests"]
    end

    subgraph Global_Occ["Global"]
        ISCO["ISCO-08\n619 occupations"]
    end

    subgraph EU_Occ["Europe"]
        ESCO_O["ESCO Occupations\n3,045"]
        ESCO_S["ESCO Skills\n14,247"]
    end

    subgraph Industry["Industry"]
        NAICS["NAICS 2022"]
        ISIC["ISIC Rev 4"]
    end

    CIP -->|5,903 edges| SOC
    CIP -->|1,615 edges| ISCEDF
    SOC <-->|992 edges| ISCO
    SOC <-->|1,734 edges| ONET
    ISCO <-->|6,048 edges| ESCO_O
    ESCO_O --- ESCO_S
    ISCO -->|44 edges| ISIC
    SOC -.-> NAICS
```

## SOC vs ISCO: The two major frameworks

### SOC 2018 (Standard Occupational Classification)

- **1,447 codes** across 4 levels (23 major groups, 98 minor groups, 459 broad occupations, 867 detailed occupations)
- **Structure**: 2-digit major groups (23) down to 6-digit detailed occupations
- **Used for**: US government statistics, labor market data, visa classifications (H-1B), wage surveys
- **Updated**: approximately every 10 years

### ISCO-08 (International Standard Classification of Occupations)

- **619 codes** across 4 levels (10 major groups down to 436 unit groups)
- **Structure**: 1-digit major groups (10) down to 4-digit unit groups
- **Used for**: International labor statistics, ILO reporting, basis for national systems
- **Key difference**: Broader categories than SOC; designed for international comparison

### Crosswalk between SOC and ISCO

SOC 2018 and ISCO-08 are connected by **992 crosswalk edges**. The mapping is many-to-many because SOC is more granular than ISCO.
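
To make "many-to-many" concrete, the sketch below indexes a list of crosswalk edges in both directions and labels how a given source code relates to its targets. It is an illustration only: the edge list uses placeholder codes (not real SOC or ISCO codes), and `index_edges` and `cardinality` are hypothetical helper names rather than part of the API.

```python
from collections import defaultdict


def index_edges(edges):
    """Build forward and reverse lookup tables from (source, target) pairs."""
    forward, reverse = defaultdict(set), defaultdict(set)
    for source, target in edges:
        forward[source].add(target)
        reverse[target].add(source)
    return forward, reverse


def cardinality(source, forward, reverse):
    """Describe how one source code relates to its target codes."""
    targets = forward.get(source, set())
    if not targets:
        return "unmapped"
    # A target is "shared" when more than one source code maps to it.
    shared = any(len(reverse[t]) > 1 for t in targets)
    if len(targets) == 1:
        return "many-to-one" if shared else "one-to-one"
    return "many-to-many" if shared else "one-to-many"


# Placeholder codes for illustration only; not real SOC or ISCO codes.
EDGES = [("S1", "I1"), ("S2", "I1"), ("S2", "I2"), ("S3", "I3")]
forward, reverse = index_edges(EDGES)
for code in ("S1", "S2", "S3", "S4"):
    print(code, cardinality(code, forward, reverse))
```

In practice the edge list would come from the equivalences endpoint, and anything other than `one-to-one` is a signal to check the match type before treating two codes as interchangeable.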
```bash # Translate a SOC code to ISCO curl https://worldoftaxonomy.com/api/v1/systems/soc_2018/nodes/29-1211/equivalences ``` ## ESCO - European skills and occupations ESCO is the EU's multilingual classification connecting occupations to skills: - **3,045 occupations** mapped to ISCO-08 (6,048 crosswalk edges) - **14,247 skills and competences** linked to occupations - **Key advantage**: Skills-based matching across EU labor markets - **Use cases**: Job portals, skills gap analysis, career guidance, Europass ```mermaid graph LR ESCO_O["ESCO Occupations\n3,045"] <-->|6,048 edges| ISCO["ISCO-08\n619"] ESCO_O --- ESCO_S["ESCO Skills\n14,247"] ESCO_S -.->|linked to| ESCO_O ``` > ESCO is the only system in the graph that connects occupations directly to skills. This makes it essential for AI-powered job matching and workforce analytics. ## O*NET - Occupation information network O*NET extends SOC with rich attribute data: - **867 occupations** mapped to SOC 2018 (1,734 crosswalk edges) - **Includes**: Knowledge areas, abilities, work activities, work context, interests (RIASEC), work values, work styles - **Key advantage**: Most detailed occupation attribute data available - **Use cases**: Career exploration, job analysis, workforce development | O*NET Component | Items | What It Measures | |-----------------|-------|-----------------| | Knowledge Areas | 14 | Subject domains required | | Abilities | 17 | Cognitive, physical, sensory capabilities | | Work Activities | 16 | General types of job behaviors | | Work Context | 15 | Physical and social work environment | | Interests (RIASEC) | 13 | Holland occupational interest types | | Work Values | 14 | What workers find important | | Work Styles | 17 | Personal characteristics for performance | ## Education-to-occupation pathways The crosswalk topology connects education to occupations: ```mermaid graph LR CIP["CIP 2020\n2,848 instructional\nprograms"] -->|5,903 edges| SOC["SOC 2018\n1,447 US\noccupations"] CIP -->|1,615 
edges| ISCEDF["ISCED-F 2013\n122 fields\nof education"] ISCED["ISCED 2011\n20 education\nlevels"] -->|25 edges| ISCO["ISCO-08\n619 global\noccupations"] ``` This lets you answer questions like "What occupations do graduates of CIP 51.0912 (Physician Assistant) work in?" ```bash curl https://worldoftaxonomy.com/api/v1/systems/cip_2020/nodes/51.0912/equivalences ``` ## Occupation-to-industry mapping Occupations connect to industries through two paths: | Link | Edges | Use Case | |------|-------|----------| | SOC 2018 to NAICS 2022 | 55 | US workforce-to-industry analysis | | ISCO-08 to ISIC Rev 4 | 44 | Global occupation-industry mapping | ## Which system to use | Purpose | Recommended System | Why | |---------|-------------------|-----| | US labor statistics | SOC 2018 | Required by BLS/Census | | International comparison | ISCO-08 | ILO standard | | European job matching | ESCO | EU multilingual, skills-linked | | Career exploration | O*NET-SOC | Rich attribute data | | Australian/NZ workforce | ANZSCO 2022 | National standard | | Canadian workforce | NOC 2021 | National standard | | Skills gap analysis | ESCO Skills | 14K skills taxonomy | | Education-to-career mapping | CIP 2020 + SOC | 5,903 crosswalk edges | ## Use cases | Who | What | Systems | |-----|------|---------| | HR analytics teams | Map job postings to standard codes | SOC 2018, ISCO-08 | | Career counselors | Match education to occupations | CIP 2020, SOC 2018, O*NET | | EU job portals | Skills-based matching across borders | ESCO Occupations + Skills | | Immigration lawyers | Classify occupations for visa applications | SOC 2018 (H-1B) | | Workforce planners | Identify skills gaps by region | ESCO Skills, O*NET | | AI recruitment agents | Automate classification via MCP | All of the above | ## MCP tools for occupation data | Tool | Purpose | |------|---------| | `search_classifications` | Find occupations by job title | | `get_equivalences` | Cross-system occupation mapping | | `translate_code` | 
Translate between SOC, ISCO, ESCO | | `browse_children` | Navigate occupation hierarchy | | `get_country_taxonomy_profile` | What occupation systems apply to a country | ======================================================================== # Categories and Sectors - How Systems Are Organized ======================================================================== ## Categories and Sectors - How Systems Are Organized > **TL;DR:** 1,000+ classification systems are organized into 16 categories spanning industry, trade, health, occupation, regulation, and domain-specific vocabularies. This guide explains the category structure and how to navigate it. --- ## The 16 categories ```mermaid graph TD WOT["World Of Taxonomy\n1,000+ systems"] --> IND["Industry\n~150 systems"] WOT --> TRADE["Product/Trade\n~20 systems"] WOT --> OCC["Occupation\n~15 systems"] WOT --> EDU["Education\n~10 systems"] WOT --> LIFE["Life Sciences\n~100+ systems"] WOT --> GEO["Geographic\n~10 systems"] WOT --> FIN["Financial/Environmental\n~20 systems"] WOT --> REG["Regulatory\n~100+ systems"] WOT --> ISO["ISO Standards\n~25 systems"] WOT --> INTL["Intl Agreements\n~25 systems"] WOT --> ACAD["Academic\n~15 systems"] WOT --> PAT["Patent\n1 system, 254K codes"] WOT --> DTECH["Domain: Technology\n~50 systems"] WOT --> DFIN["Domain: Finance\n~30 systems"] WOT --> DSEC["Domain: Sector-Specific\n~200+ systems"] WOT --> DREG["Domain: Regulatory Detail\n~50+ systems"] ``` | Category | Systems | Description | |----------|---------|-------------| | Industry | ~150+ | Economic activity classification (NAICS, ISIC, NACE, SIC, national variants) | | Product/Trade | ~20+ | Goods and services classification (HS, CPC, UNSPSC, SITC) | | Occupation | ~15+ | Job and skills classification (SOC, ISCO, ESCO, O*NET) | | Education | ~10+ | Educational programs and levels (ISCED, CIP) | | Life Sciences | ~100+ | Pharmaceuticals, clinical coding, diagnostics, devices, biotech, health informatics | | Geographic | ~10+ | 
Country, region, and subdivision codes (ISO 3166, NUTS, FIPS) | | Financial/Environmental | ~20+ | Sustainability, accounting, and governance (SASB, EU Taxonomy, GHG, COFOG) | | Regulatory | ~100+ | Laws, standards, and compliance frameworks (HIPAA, GDPR, OSHA, FDA, SEC) | | ISO Standards | ~25+ | Management system standards (ISO 9001, 14001, 27001, 45001) | | International Agreements | ~25+ | Treaties and global frameworks (Basel, FATF, Paris Agreement, ILO) | | Academic/Research | ~15+ | Subject classification for scholarly work (arXiv, MSC, JEL, ACM CCS) | | Patent | 1 | Patent classification (CPC - 254K codes) | | Domain: Technology | ~50+ | Software, AI, cybersecurity, cloud, data taxonomies | | Domain: Finance | ~30+ | Insurance, banking, investment, payment taxonomies | | Domain: Sector-Specific | ~200+ | Transportation, agriculture, mining, construction, energy, and other sector vocabularies | ## Category counts in the knowledge graph ```mermaid graph LR subgraph By_Nodes["Distribution by Node Count"] NCI_L["Life Sciences\n568K+ nodes"] PAT_L["Patent CPC\n254K nodes"] TRADE_L["Product/Trade\n100K+ nodes"] IND_L["Industry\n50K+ nodes"] OCC_L["Occupation/Skills\n40K+ nodes"] DOM_L["Domain Vocabularies\n10K+ nodes"] end ``` | Category | Systems | Nodes | What drives the count | |----------|---------|-------|----------------------| | Life Sciences | ~100+ | 568K+ | ICD-10-CM (97K), NCI Thesaurus (211K), NDC (112K), LOINC (102K) | | Patent | 1 | 254K | Patent CPC is a single massive hierarchy | | Product/Trade | ~20 | 100K+ | UNSPSC dominates with 77K codes | | Industry | ~150+ | 50K+ | Many national NACE/ISIC variants at ~1K codes each | | Occupation/Skills | ~15 | 40K+ | ESCO Skills at 14K, ESCO Occupations at 3K | | Domain vocabularies | ~300+ | 10K+ | Typically 15-30 codes each | | Regulatory/Compliance | ~100+ | 5K+ | Frameworks range from 15-50 articles each | | Everything else | ~300 | 15K+ | Geographic, academic, financial, ISO | ## How categories map to 
API queries ### Browse by category ```bash # Get all systems (includes category metadata) curl https://worldoftaxonomy.com/api/v1/systems # Group by region curl "https://worldoftaxonomy.com/api/v1/systems?group_by=region" # Filter by country to find relevant systems curl "https://worldoftaxonomy.com/api/v1/systems?country=US" ``` ### Search within a category The search endpoint searches across all systems. Use keywords to focus on specific domains: ```bash # Find health-related codes curl "https://worldoftaxonomy.com/api/v1/search?q=diabetes&grouped=true" # Find trade codes curl "https://worldoftaxonomy.com/api/v1/search?q=cotton&grouped=true" # Find occupation codes curl "https://worldoftaxonomy.com/api/v1/search?q=software+engineer&grouped=true" ``` ## Domain-specific vocabularies Domain taxonomies extend the standard classification systems with specialized vocabularies. They are organized by NAICS 2-digit sector. ### Sector-specific domains | NAICS Sector | Domain Vocabularies | Total Codes | |-------------|---------------------|-------------| | 11 Agriculture | Crop types, livestock, farming methods, commodity grades, equipment, input supply, land classification, post-harvest | 300+ | | 21 Mining | Mineral types, extraction methods, reserve classification, equipment, project lifecycle, safety | 130+ | | 22 Utilities | Energy sources, grid regions, tariff structures, infrastructure assets, regulatory ownership | 130+ | | 23 Construction | Trade types, building types, project delivery, material systems, sustainability | 130+ | | 31-33 Manufacturing | Process types, quality, operations models, industry verticals, supply chain, facility config | 120+ | | 44-45 Retail | Channel types, merchandise categories, fulfillment, pricing strategies, store formats | 100+ | | 52 Finance | Instrument types, market structure, regulatory frameworks, client segments | 100+ | | 484 Truck Transportation | Freight types, vehicle classes, cargo, carrier operations, pricing, compliance 
| 200+ | ### Emerging sector domains | Domain | Focus | Systems | |--------|-------|---------| | AI and Data | Model types, deployment, ethics, governance | 4 | | Cybersecurity | Threats, frameworks, zero trust, SIEM | 10+ | | Space and Satellite | Orbital classification, regulatory, licensing | 4 | | Climate Technology | Finance instruments, policy mechanisms | 4 | | Quantum Computing | Application domains, commercialization stages | 4 | | Digital Assets/Web3 | Regulatory frameworks, infrastructure layers | 4 | | Autonomous Systems | Application domains, sensing technology | 4 | | Synthetic Biology | Application sectors, biosafety levels | 4 | ## Life Sciences sub-sectors The Life Sciences category (~100+ systems, ~568K nodes) is the largest by node count. It is organized into 13 sub-sectors: | Sub-Sector | Key Systems | |------------|-------------| | Diagnoses and Classification | ICD-10-CM, ICD-11, ICD-10-PCS, DSM-5, SNOMED CT, ICPC-2 | | Pharmaceuticals | ATC, NDC, RxNorm, EDQM, WHO Essential Medicines | | Diagnostics and Lab | LOINC, lab test types, imaging modalities, biomarkers | | Procedures and Billing | CPT, HCPCS, MS-DRG, G-DRG, NUCC | | Oncology and Research | NCI Thesaurus, MeSH, OMIM, Orphanet, CTCAE | | Medical Devices | GMDN, implant types, surgical instruments, sterilization | | Biotechnology | Biotech types, biosimilars, gene therapy, cell therapy | | Synthetic Biology | Synbio types, application sectors, biosafety levels | | Health Informatics | FHIR, DICOM, telemedicine, clinical decision support | | Nursing and Allied Health | ICN, NIC, NANDA, nursing specialties, allied health | | Payment and Delivery | HEDIS, CMS Star, care settings, payer types, value-based care | | Health Regulation | HIPAA, FDA 21 CFR, DEA, CLIA, MDR, IVDR | | Dental, Mental, and Veterinary | Dental, mental health, and veterinary service types | ## Navigating categories Use the web app at [worldoftaxonomy.com](https://worldoftaxonomy.com) for visual exploration. 
The home page Industry Map shows all 16 categories. Click any category to search for systems in that domain. Use the API for programmatic access: ```bash # Get all systems with metadata curl https://worldoftaxonomy.com/api/v1/systems # Get country-specific systems (e.g., what applies in Germany) curl "https://worldoftaxonomy.com/api/v1/systems?country=DE" # Get crosswalk statistics to see which systems are most connected curl https://worldoftaxonomy.com/api/v1/equivalences/stats ``` ======================================================================== # Domain Taxonomies vs Official Standards ======================================================================== # Domain Taxonomies vs Official Standards World Of Taxonomy ships two complementary kinds of classification system, and every public surface (web app, REST API, MCP server) now labels them explicitly so downstream consumers can treat them differently. ## The two categories | Category | `category` value | System ID pattern | Examples | Role | |----------|------------------|-------------------|----------|------| | Domain taxonomy | `domain` | IDs start with `domain_` | `domain_truck_freight`, `domain_ai_deployment`, `domain_fintech_service` | Plain-language on-ramps curated by World Of Taxonomy. Shorter (15-50 nodes), written in working-industry vocabulary, and crosswalked into the relevant official standard. | | Official standard | `standard` | Everything else | `naics_2022`, `isic_rev4`, `nace_rev2`, `soc_2018`, `icd10_cm`, `hs_2022` | Published by a government, intergovernmental body, or standards authority. These are the codes auditors, statistical agencies, and regulators require. | The split is a pure function of `system_id`: if the ID starts with `domain_`, it is a domain taxonomy; otherwise it is an official standard. The Python helper `world_of_taxonomy.category.get_category()` and the TypeScript helper `frontend/src/lib/category.ts` are the two sources of truth and stay in sync. 
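The prefix rule is small enough to restate in a few lines. This is a sketch of the rule as described above, not the project's actual `get_category()` helper; the names `category_for` and `split_by_category` are hypothetical.

```python
DOMAIN_PREFIX = "domain_"


def category_for(system_id: str) -> str:
    """Return "domain" for IDs starting with `domain_`, otherwise "standard"."""
    return "domain" if system_id.startswith(DOMAIN_PREFIX) else "standard"


def split_by_category(system_ids):
    """Group system IDs into the two categories, preserving input order."""
    groups = {"domain": [], "standard": []}
    for system_id in system_ids:
        groups[category_for(system_id)].append(system_id)
    return groups


groups = split_by_category(["naics_2022", "domain_truck_freight", "isic_rev4"])
print(groups["domain"])    # IDs that start with "domain_"
print(groups["standard"])  # everything else
```

Because the category depends only on the ID string, a client can apply the same rule locally without an extra API call.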
## Why the split exists

Users describing a business in plain language ("telemedicine platform", "frozen-goods logistics", "AI inference startup") rarely know their NAICS code. They recognize domain-taxonomy labels like "Telemedicine Modality Types" or "Cold Chain Types" much faster than six-digit NAICS codes. Domain taxonomies are therefore the front door: surface them first, let the user recognize their own business, then fan out through crosswalk edges into the matching NAICS, ISIC, NACE, SIC, or SOC codes that an accountant or statistical agency will accept.

## How each surface reflects the split

### Web app

- `/classify` shows two sections in order: "Start here: Domain taxonomies" followed by "Official standard codes". If only one category has matches, the headings are dropped and cards render as a flat list.
- `/system/{id}` shows a badge next to the system name: "Domain taxonomy" or "Official standard".
- `/system/{id}/node/{code}` splits cross-system equivalences into "Domain taxonomies" and "Official standards" sub-sections when both are present.

### REST API

- `GET /api/v1/systems` and `GET /api/v1/systems/{id}` return a `category` field (`"domain"` or `"standard"`).
- `GET /api/v1/systems?category=domain` (or `?category=standard`) filters the list.
- `POST /api/v1/classify/demo` returns `domain_matches` and `standard_matches` arrays instead of a single flat `matches` array. Each match carries its own `category` field. For compound inputs, each atom also has `domain_matches` and `standard_matches`.
- Every node returned by the API carries a `category` field derived from its parent system.

### MCP server

- The `classify_business` tool returns `domain_matches` and `standard_matches` (plus the same two arrays per atom for compound inputs).
- `list_classification_systems`, `search_classifications`, and `get_industry` stamp each node/system with a `category` field.

## Consuming the split

If you are building on top of World Of Taxonomy:

1.
**Route users through domain taxonomies first** when the input is free text. They are written for humans. 2. **Fall back to official standards** when the user asks for a statistical code, needs to report to a government agency, or wants cross-country comparability. 3. **Use crosswalks** (`GET /api/v1/systems/{id}/nodes/{code}/equivalences`) to hop from a domain match to the official standard code. The domain taxonomies are pre-wired with equivalence edges into NAICS, ISIC, or other relevant standards. 4. **Never mix the two in a single ranked list** without signaling the category - users cannot tell at a glance that `domain_truck_freight` and `naics_2022` play different roles. ## Example A request for "last-mile delivery for frozen groceries" returns: - Domain matches: `domain_last_mile_delivery`, `domain_cold_chain`, `domain_freight_class` - Standard matches: `naics_2022: 492110` (Couriers and Express Delivery Services), `isic_rev4: 5320` (Postal and courier activities) The domain matches are recognizable instantly. The standard matches are what the user needs to give to their accountant. ## How the bridge works Every one of the 434 domain taxonomies is wired to at least one NAICS 2022 anchor code. This means there is no such thing as a domain island: if a user's query surfaces a `domain_*` match, there is always a bridge edge to a standard reporting code right next to it. The bridges are built in two passes: 1. **Sector-anchor generator.** A single mapping table (`world_of_taxonomy/ingest/domain_anchors.json`) maps every `domain_*` system to one to three NAICS 2022 sector anchors. The generator emits a bidirectional `equivalence` edge between each anchor and each level-1 code of the domain taxonomy, stamped with `match_type='broad'` and provenance `derived:sector_anchor:v1`. 2. 
**ISIC / NACE fan-out.** For every new NAICS->domain edge, a single self-join against the existing NAICS<->ISIC and NAICS<->NACE crosswalks produces parallel edges so European and UN users reach the same domain taxonomies through their native standards. These carry provenance `derived:sector_anchor:v1:fanout`. Because `match_type` on every generated edge is `broad`, consumers can filter them out if they need exact-match statistical crosswalks only (the pre-existing NAICS<->ISIC / NAICS<->NACE exact edges are untouched). ## The four edge kinds Every equivalence response now carries an `edge_kind` computed from whether each endpoint is a domain taxonomy (`system_id` starts with `domain_`) or an official standard: | `edge_kind` | Source | Target | Meaning | |---------------------|----------|----------|---------| | `standard_standard` | standard | standard | Classic statistical crosswalk (e.g. NAICS 6211 <-> ISIC 8620) | | `standard_domain` | standard | domain | Official code bridging into a curated domain taxonomy (e.g. NAICS 6212 -> `domain_dental`) | | `domain_standard` | domain | standard | The reverse: domain taxonomy bridging back to an official code | | `domain_domain` | domain | domain | Reserved for future cross-domain edges; not yet generated | Filter on `edge_kind` to scope the graph exactly: ``` GET /api/v1/systems/naics_2022/nodes/6211/equivalences?edge_kind=standard_standard GET /api/v1/systems/naics_2022/nodes/6211/equivalences?edge_kind=standard_domain,domain_standard ``` The MCP tool `list_crosswalks_by_kind` wraps the same filter for agents: ``` list_crosswalks_by_kind(edge_kind="standard_domain", system_id="naics_2022") ``` `source_category` and `target_category` are also returned alongside `edge_kind` so lazy consumers can filter without parsing the composite label. 
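The `edge_kind` label can also be derived client-side from the two system IDs. The sketch below follows the prefix rule and the four-row table above; the edge dictionaries and their `source_system` / `target_system` keys are hypothetical stand-ins, not the documented response schema.

```python
def category_of(system_id: str) -> str:
    """Classify a system ID by the `domain_` prefix rule."""
    return "domain" if system_id.startswith("domain_") else "standard"


def edge_kind(source_system: str, target_system: str) -> str:
    """Compose the label as <source category>_<target category>."""
    return f"{category_of(source_system)}_{category_of(target_system)}"


def filter_by_kind(edges, wanted_kinds):
    """Keep only edges whose computed edge_kind is in wanted_kinds."""
    wanted = set(wanted_kinds)
    return [
        edge for edge in edges
        if edge_kind(edge["source_system"], edge["target_system"]) in wanted
    ]


# Hypothetical edges for illustration; key names are assumptions.
edges = [
    {"source_system": "naics_2022", "target_system": "isic_rev4"},
    {"source_system": "naics_2022", "target_system": "domain_dental"},
    {"source_system": "domain_dental", "target_system": "naics_2022"},
]

exact_only = filter_by_kind(edges, ["standard_standard"])
bridges = filter_by_kind(edges, ["standard_domain", "domain_standard"])
print(len(exact_only), len(bridges))
```

In practice the server-side `edge_kind` query parameter does the same job with less data transferred; a local filter like this is mainly useful when edges are already cached.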
======================================================================== # Data Quality and Provenance ======================================================================== ## Data Quality and Provenance > **TL;DR:** Every system is tagged with one of four provenance tiers - from official government downloads (Tier 1) to expert-curated domain vocabularies (Tier 4). SHA-256 hashes, source URLs, and dates are stored for audit. This guide explains the framework and how to verify data. --- ## Four-tier provenance framework ```mermaid graph TD subgraph Tier1["Tier 1: Official Download"] T1["Source file from standards body\nSHA-256 hash stored\nExamples: NAICS, ISIC, LOINC, HS"] end subgraph Tier2["Tier 2: Structural Derivation"] T2["Derived from official system\n1:1 structural mapping verified\nExamples: WZ, NAF, ATECO (NACE variants)"] end subgraph Tier3["Tier 3: Manual Transcription"] T3["Transcribed from official publications\nSource URL and date recorded\nExamples: SIC 1987, some Asian/African systems"] end subgraph Tier4["Tier 4: Expert Curated"] T4["Domain expert knowledge\nPeer-reviewed structure\nExamples: All domain_* taxonomies"] end T1 --> T2 T2 --> T3 T3 --> T4 ``` | Tier | Label | Description | Verification | |------|-------|-------------|-------------| | 1 | `official_download` | Data downloaded directly from the authoritative source | File hash stored for audit | | 2 | `structural_derivation` | Derived from an official system (e.g., NACE national variants) | 1:1 structural mapping verified | | 3 | `manual_transcription` | Transcribed from official publications (PDF, HTML, print) | Cross-checked against source | | 4 | `expert_curated` | Curated by domain experts based on industry knowledge | Peer-reviewed structure | ### Tier 1: Official download The gold standard. Data files (CSV, Excel, XML) are downloaded directly from the standards body's website. A SHA-256 hash of the source file is stored in the `source_file_hash` column for reproducibility. 
**Examples**: NAICS 2022 (Census Bureau CSV), HS 2022 (WCO), ISIC Rev 4 (UN CSV), LOINC (Regenstrief download), ICD-10-CM (CMS), NCI Thesaurus (NCI) ### Tier 2: Structural derivation Systems that reuse the structure of an official system with localized naming. For example, all EU NACE Rev 2 national variants (WZ 2008, NAF Rev 2, ATECO 2007, etc.) share the identical code structure. **Examples**: WZ 2008 (Germany), ONACE 2008 (Austria), NOGA 2008 (Switzerland), all EU NACE national variants, all ISIC Rev 4 national adaptations ### Tier 3: Manual transcription Data transcribed from official documents that do not provide machine-readable downloads. The original source URL and date are recorded for audit. **Examples**: SIC 1987 (transcribed from OSHA HTML), some Asian and African national classifications ### Tier 4: Expert curated Domain-specific vocabularies created by subject matter experts. These fill gaps where no official standard exists. **Examples**: All `domain_*` taxonomies (truck freight, agriculture, mining, construction, cybersecurity, AI, etc.) ## Provenance metadata fields Each classification system carries these audit fields: | Field | Description | Example | |-------|-------------|---------| | `data_provenance` | Provenance tier | `official_download` | | `source_url` | URL of the authoritative data source | `https://www.census.gov/naics/` | | `source_date` | Date the source data was accessed/published | `2024-01-15` | | `license` | License terms for the data | `Public Domain` | | `source_file_hash` | SHA-256 hash of the original file (Tier 1 only) | `a3f2b7c...` | ## Querying provenance via API ### Get provenance for a system ```bash curl https://worldoftaxonomy.com/api/v1/systems/naics_2022 ``` Response includes `data_provenance`, `source_url`, `source_date`, `license`, and `source_file_hash`. 
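Once `source_url` and `source_file_hash` are in hand, checking a locally downloaded copy of the source file takes only the standard library. This is a minimal sketch, assuming the stored value is the hexadecimal SHA-256 digest of the raw file bytes; the function names and the example file path are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large source files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_stored_hash(path, stored_hash):
    """Compare the local file's digest with the stored `source_file_hash`.

    Assumes the stored value is a hex digest; case and surrounding
    whitespace are normalized before comparing.
    """
    return sha256_of_file(path) == stored_hash.strip().lower()


# Example call (hypothetical local path and hash value):
# matches_stored_hash("naics_2022.csv", stored_hash_from_api)
```

A mismatch does not necessarily mean the ingested data is wrong; the standards body may have republished the file since `source_date`.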
### Audit report ```bash # Full provenance audit across all systems curl https://worldoftaxonomy.com/api/v1/audit ``` The audit report shows: - Breakdown by provenance tier (system count, node count per tier) - Tier 1 systems missing a file hash - Tier 2 structural derivation count and node coverage - Skeleton systems (placeholder entries awaiting full data) ### MCP audit tool ```bash # Via MCP tools/call get_audit_report {} ``` Returns the same audit data in a format suitable for AI agent consumption. ## Data verification practices ### Hash verification (Tier 1) For official download systems, the `source_file_hash` lets you verify data integrity: 1. Download the original file from `source_url` 2. Compute its SHA-256 hash 3. Compare against the stored `source_file_hash` 4. If they match, the data in World Of Taxonomy matches the original file ```mermaid sequenceDiagram participant You participant WOT as World Of Taxonomy API participant Source as Standards Body You->>WOT: GET /systems/naics_2022 WOT-->>You: source_url, source_file_hash You->>Source: Download original CSV Source-->>You: naics_2022.csv You->>You: sha256sum naics_2022.csv Note over You: Compare hash with source_file_hash ``` ### Structural verification (Tier 2) For structural derivation systems, you can verify: 1. The code structure matches the parent system exactly 2. Crosswalk edges are 1:1 (every code in the derived system maps to exactly one code in the parent) ### Cross-reference verification For any system, you can cross-reference node counts and structure against the authoritative publication. ## Skeleton systems Some systems are included as structural placeholders where the full dataset is not freely available (e.g., SNOMED CT, CPT). These are marked with low node counts and are included to preserve the crosswalk topology. Full data requires a license from the respective standards body. 
| Skeleton System | Reason | License Holder | |----------------|--------|----------------| | SNOMED CT | Proprietary license | SNOMED International | | CPT | Copyright protected | American Medical Association | | RxNorm | Partial skeleton | NLM (some data freely available) | | DSM-5 | Copyright protected | American Psychiatric Association | ## Reporting data quality issues If you find incorrect data, missing codes, or wrong crosswalk mappings: 1. **GitHub**: File an issue on the project repository with system ID, code, expected vs actual value, and a link to the authoritative source 2. **API**: Include the `report_issue_url` from any API response for direct reporting > All classification data in World Of Taxonomy is provided for informational purposes only. It should not be used as a substitute for official government or standards body publications. For regulatory, legal, or compliance purposes, always verify codes against the authoritative source. ======================================================================== # Inclusion Policy - Which Systems Belong in World Of Taxonomy ======================================================================== # Inclusion Policy World Of Taxonomy is a unified knowledge graph of **published classification, coding, and reference systems** that humans and machines use to label, group, or reference real-world things. Its job is to make these systems individually discoverable and collectively interoperable through crosswalks. This page is the policy that governs which systems get added to WoT. It exists so contributors, downstream portfolio products (WoO, WoUC, WoA), and AI assistants doing research against the graph can answer "should this artifact be in WoT?" without guessing. ## What qualifies for inclusion A system fits in WoT if it satisfies all four: 1. 
**Published and externally maintained.** It has a named publisher (a standards body, government agency, scientific community, or recognized industry consortium) and is referenced by some user community outside WoT itself. Internal or proprietary lists do not qualify.
2. **Stable identifiers.** Each entry has a code, key, URI, or other stable identifier that callers cite by reference (`NAICS 5417`, `ICD-10-CM E11.9`, `schema.org/Article`, `FIBO Equity`). Identifiers are assigned by the publisher, not minted by WoT. The one historical exception is the set of `domain_*` taxonomies, which are WoT-curated plain-language on-ramps; new systems are expected to use external identifiers.
3. **Enumerated or hierarchical structure.** Either a finite list (HTTP status codes, blood types, SPDX licenses) or a tree of categories (NAICS sectors, ICD chapters, schema.org type tree). Open-ended relational graphs without inherent hierarchy (for example, a triple store of arbitrary semantic relations) do not qualify.
4. **Practical size.** Soft cap of about 500,000 nodes per system. Larger systems may still be admitted as a documented subset (top-N hierarchical levels, the publisher's official "major" or "core" subset, a stable extract anchored to a specific revision) with the truncation explicitly noted in the system's metadata.

## What does not qualify

- **Live operational data.** Customer rosters, vendor catalogs, asset registers, transaction logs, or anything else that grows through ongoing business activity. WoT is reference data, not application state.
- **Lists of individual persons.** Even if published, person-level rosters are out of scope.
- **Pure property or relation vocabularies.** Vocabularies that define predicates without a meaningful enumerated value space (FOAF `foaf:knows`, Dublin Core `dc:creator`). The *class hierarchy* of such a vocabulary may qualify on its own; its property definitions alone do not.
- **Entity registries above the size cap.** Wikidata's 100M Q-numbers, DBpedia's 5M instances, GeoNames' 12M places. Documented subtrees of these (NCBI Taxonomy from Wikidata, GeoNames feature codes, the schema.org type tree) may qualify on their own merits and should be ingested as their own WoT systems with the publisher's identifiers preserved. ## Crosswalks are valuable but not required A system does not need to crosswalk to anything else to be included. Many existing WoT systems are deliberately isolated reference scales: Mohs hardness, Apgar score, Beaufort wind, HTTP status codes, SPDX licenses, Unicode emoji categories, blood types. Crosswalks compound a system's utility but are an output of inclusion, not a precondition. When a system does have natural crosswalk surface to existing WoT content, that should be wired up in the same PR as the ingester. ## When in doubt Default to **inclusion** if the system is published, identifier-bearing, and within the size cap. WoT's value scales with breadth. The failure mode of over-inclusion is a slightly cluttered catalog. The failure mode of under-inclusion is a contributor or downstream portfolio product needing to maintain a parallel store of what WoT should already have. ## Versioning and revisions When a system publishes a new revision (NAICS 2017 to 2022, ICD-10 to ICD-11), both versions remain in WoT as distinct systems. Crosswalks between revisions are first-class equivalences, not implicit "latest wins" upgrades. This protects downstream consumers who must keep using the older revision for regulatory or contractual reasons. ## What this policy does not do This policy does not retire or deprecate any existing system. The current catalog is what it is, and the policy applies prospectively to new additions and to deciding whether suggested additions belong. 
Where an existing entry sits awkwardly against the policy (a shallow skeleton, an internally-minted ID), that is treated as a quality-improvement opportunity, not a removal trigger. ## Related reading - [Domain Taxonomies vs Official Standards](./domain-vs-standard.md) for the split between WoT-curated plain-language on-ramps and external standards. - [Data Quality and Provenance](./data-quality.md) for the four-tier provenance framework that every ingested system is graded against. - [Categories and Sectors](./categories-and-sectors.md) for the 16 categories used to organize the catalog. ======================================================================== # Web Vocabularies - schema.org, SKOS, and Related Type Systems ======================================================================== # Web Vocabularies Web vocabularies are the type and concept systems used by AI assistants, search engines, and structured-data crawlers to label real-world entities on the public web. World Of Taxonomy hosts the type-tree subset of these vocabularies (the part that is enumerated and hierarchical, per the [Inclusion Policy](./inclusion-policy.md)). Property and relation vocabularies are out of scope. ## schema.org | Field | Value | |---|---| | System ID | `schema_org` | | Total types | 926 (rdfs:Class entries with the `schema:` prefix) | | Authority | schema.org consortium (Google, Microsoft, Yahoo, Yandex) | | License | CC BY-SA 3.0 | | Source | https://schema.org/version/latest/schemaorg-current-https.jsonld | schema.org publishes a vocabulary of types and properties that web pages and APIs use to mark up structured data so search engines and AI assistants can understand what a page is about. WoT ingests the type tree only: - Rooted at `Thing`. Every type is a subclass of Thing (transitively). - ~926 types organized via `rdfs:subClassOf` chains (CreativeWork, Person, Place, Action, etc.). 
- ~57 types have multiple parents; WoT keeps the first listed parent as the canonical hierarchy edge and notes the alternative parents in the description. - 100% description coverage native to the source (every class has `rdfs:comment`). **Why this matters for AEO and SEO**: schema.org type tags are the single most important signal AI search overviews, ChatGPT browsing, and Google Knowledge Graph use to identify what a page is about. Hosting the full type tree in WoT means downstream products (WoO agent runtime, classification/crosswalk APIs) can cite schema.org URIs natively as the canonical anchor for "what kind of thing is this." It also enables crosswalks between schema.org types and domain classifications: `schema:Restaurant` to NAICS 7225, `schema:Hotel` to NAICS 7211, `schema:MedicalSpecialty` to MeSH, and so on. **What WoT does not host**: the ~1,676 schema.org `rdf:Property` entries (`name`, `address`, `priceRange`, etc.) are property definitions, not classification categories. Per the inclusion policy, pure property vocabularies are out of scope. Consumers who need the full property surface should hit schema.org directly. ## Related vocabularies in WoT | System | Codes | Role | |---|---|---| | `wordnet_nouns` | 82,115 | Princeton WordNet noun hypernym tree (entity.n.01 root); lexical semantic anchor | | `skos` | 17 | W3C Simple Knowledge Organization System (metamodel for thesauri) | | `w3c_standards` | 16 | W3C standards index | | `iab_content` | 21 | IAB Tech Lab content taxonomy for advertising | These sit alongside schema.org as web-adjacent classification systems. ### WordNet (Nouns) WordNet is the canonical lexical-semantic database for English, originally built at Princeton in 1985 and now the de-facto standard for "what is the common-sense type of this concept" in NLP. 
WoT ingests the noun hypernym tree (82,115 synsets rooted at `entity.n.01`); verbs / adjectives / adverbs are not in scope for this PR (verbs may follow if downstream demand surfaces). **Where WordNet differs from schema.org**: schema.org names types of *publishable web entities* (Restaurant, Article, MedicalCondition); WordNet names *common-sense concept categories* (`dog.n.01`, `physician.n.01`, `wedding.n.01`). Both are valid concept anchors at different abstractions: - A Wikipedia / encyclopedia article on "dogs" would carry a `wordnet_nouns:dog.n.01` anchor for content classification. - A pet-store website page about a specific dog would carry a `schema:Product` anchor for SEO. - A retail product feed listing dog food would carry a `gs1_gpc:10000091` brick anchor for product identification. WoT lets downstream products (WoO agent runtime, classification APIs) cite the right anchor for the right surface without picking a winner. **License**: WordNet License (BSD-style). Sourced via NLTK's WordNet 3.1 corpus. ## What's not yet ingested The following web-vocabulary candidates have been audited against the inclusion policy and are queued for follow-up PRs: - **WordNet verbs** (~13K synsets) - hypernym tree exists for verbs too; deferred until downstream demand justifies it. - **DBpedia ontology** (~700 classes) - Wikipedia-derived class hierarchy. - **SUMO / BFO / DOLCE** - upper ontologies (small, peripheral relevance). - **FOAF classes**, **DCMI Type Vocabulary** - small auxiliary vocabularies. The full Wikidata Q-number space (~100M entities) and the DBpedia instance set (~5M) are entity registries above WoT's size cap and out of scope. They belong in a sister product (a hypothetical "World of Registries" or directly inside WoO). 
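The schema.org section above says that multi-parent types keep their first listed parent as the canonical hierarchy edge and record the others in the description. A minimal Python sketch of that rule follows, assuming the JSON-LD source encodes `rdfs:subClassOf` as either a single `{"@id": ...}` object or a list of them. The function name `split_parents` and the sample node are hand-written for illustration; they are not taken from the WoT ingester or copied from the published file.

```python
from typing import Any, Dict, List, Optional, Tuple


def split_parents(node: Dict[str, Any]) -> Tuple[Optional[str], List[str]]:
    """Return (canonical_parent, alternative_parents) for one JSON-LD class node.

    Assumption: the first listed parent is the canonical edge, as described
    in the schema.org section; the remaining parents are alternatives.
    """
    raw = node.get("rdfs:subClassOf")
    if raw is None:
        # Root types such as schema:Thing have no parent.
        return None, []
    # A single parent is an object; multiple parents are a list of objects.
    refs = raw if isinstance(raw, list) else [raw]
    ids = [ref["@id"] for ref in refs]
    return ids[0], ids[1:]


# Hand-written sample node (illustrative, not copied from the source file).
sample = {
    "@id": "schema:LocalBusiness",
    "@type": "rdfs:Class",
    "rdfs:subClassOf": [{"@id": "schema:Organization"}, {"@id": "schema:Place"}],
}

canonical, alternates = split_parents(sample)
print(canonical, alternates)
```

An ingester following this pattern would write `canonical` as the parent code and append `alternates` to the node description.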
## How to use schema.org from WoT

```bash
# Look up a specific schema.org type
curl https://worldoftaxonomy.com/api/v1/systems/schema_org/nodes/Restaurant

# Browse children of CreativeWork
curl https://worldoftaxonomy.com/api/v1/systems/schema_org/nodes/CreativeWork/children

# Search across the type tree
curl "https://worldoftaxonomy.com/api/v1/search?q=medical&systems=schema_org"

# Crosswalk to a domain classification (when wired)
curl https://worldoftaxonomy.com/api/v1/systems/schema_org/nodes/Restaurant/equivalences
```

The MCP server exposes the same data via `get_industry`, `browse_children`, and `search_classifications` tools, so AI assistants can resolve a schema.org type to its WoT counterparts without leaving the chat.

## Related reading

- [Inclusion Policy](./inclusion-policy.md) - the four tests every WoT system must pass.
- [Crosswalk Map](./crosswalk-map.md) - how systems connect via equivalence edges.
- [Categories and Sectors](./categories-and-sectors.md) - how WoT organizes its catalog.

========================================================================
# Regulatory and Compliance Standards - HIPAA, GDPR, NIST, ISO, OSHA, and 116 more
========================================================================

# Regulatory and Compliance Standards

> **TL;DR:** WoT hosts 120 regulatory and compliance frameworks: US federal regulations (HIPAA, SOX, OSHA, FDA), US security and accounting frameworks (NIST CSF, NIST 800-53, SOC 2, PCI DSS, US GAAP), EU directives and acts (GDPR, NIS2, DORA, MDR, EU AI Act, CSRD), ISO management system standards (9001, 14001, 27001, 22301, 45001, 13485, 42001 for AI), and global treaties (Basel III, FATF, ILO, Paris Agreement, IMO MARPOL/SOLAS). This page maps which framework applies when, and which ones overlap.

---

## What this layer is for

Regulatory frameworks describe **what an organization must do** to operate within a jurisdiction or sector. They are orthogonal to industry and process classifications: a healthcare provider in the US typically answers to HIPAA (jurisdiction), the Joint Commission (sector), SOC 2 (if it sells SaaS), and ISO 27001 (if it operates internationally). Industry codes (NAICS) and process frameworks (APQC PCF) tell you *what work* the organization does; regulatory frameworks tell you *which rules constrain that work*.

This layer matters when downstream products need to:

- Map a customer's regulated obligations to a sector vocabulary (LegalTech, GRC platforms, audit tooling).
- Anchor a control to multiple overlapping frameworks (a NIST 800-53 control often satisfies SOC 2, ISO 27001, and HIPAA Security Rule simultaneously).
- Surface relevant rules when an industry classification is known (NAICS 6221 General Medical and Surgical Hospitals -> HIPAA + Joint Commission + CMS Conditions of Participation).
- Drive contract-clause libraries that map clauses to applicable frameworks.

## US Federal Regulations

Statutory frameworks codified in US law, administered by federal agencies.
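The "surface relevant rules when an industry classification is known" use case from the list under "What this layer is for" can be sketched as a longest-prefix lookup. This is a hypothetical illustration: the table `FRAMEWORKS_BY_NAICS_PREFIX` and the function `frameworks_for` are invented here, and only the hospital row comes from this page. CMS Conditions of Participation is left out because this page lists no system ID for it.

```python
from typing import Dict, List

# Hypothetical lookup table. Only the NAICS 6221 example is documented on
# this page; a real mapping would be curated or derived from crosswalk edges.
FRAMEWORKS_BY_NAICS_PREFIX: Dict[str, List[str]] = {
    "6221": ["reg_hipaa", "reg_joint_commission"],
}


def frameworks_for(naics_code: str) -> List[str]:
    """Return framework system IDs for the longest matching NAICS prefix."""
    # Walk from the full code down to its first digit, so a six-digit code
    # inherits the frameworks attached to its four-digit industry group.
    for length in range(len(naics_code), 0, -1):
        hit = FRAMEWORKS_BY_NAICS_PREFIX.get(naics_code[:length])
        if hit is not None:
            return list(hit)
    return []


print(frameworks_for("622110"))
```

Prefix matching is used because NAICS codes are hierarchical: a rule attached to a four-digit group applies to every six-digit code beneath it.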
### Healthcare | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `reg_hipaa` | 36 | HHS / OCR | Health Insurance Portability and Accountability Act | | `reg_fda_21cfr` | 28 | FDA | Title 21 of the Code of Federal Regulations (drugs, devices, food) | | `reg_dea` | 25 | DEA | Drug Enforcement Administration scheduling and registration | ### Financial services | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `reg_sox` | 58 | SEC / PCAOB | Sarbanes-Oxley Act (public-company financial reporting) | | `reg_glba` | 28 | Multiple | Gramm-Leach-Bliley Act (financial privacy) | | `reg_fcra` | 27 | FTC / CFPB | Fair Credit Reporting Act | | `reg_sec` | 29 | SEC | Securities and Exchange Commission rules | | `reg_finra` | 28 | FINRA | Financial Industry Regulatory Authority rules | | `reg_cfpb` | 22 | CFPB | Consumer Financial Protection Bureau regulations | | `reg_naic` | 21 | NAIC | National Association of Insurance Commissioners model laws | | `reg_ffiec` | 25 | FFIEC | Federal Financial Institutions Examination Council IT Handbook | ### Privacy and consumer protection | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `reg_ccpa` | 34 | California AG | California Consumer Privacy Act / CPRA | | `reg_ferpa` | 30 | DoEd | Family Educational Rights and Privacy Act | | `reg_coppa` | 23 | FTC | Children's Online Privacy Protection Act | | `reg_ftc_safeguards` | 23 | FTC | FTC Safeguards Rule (financial-institution data security) | | `reg_ada` | 31 | DOJ | Americans with Disabilities Act | ### Workplace safety | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `reg_osha_1910` | 47 | OSHA | OSHA General Industry standards (29 CFR 1910) | | `reg_osha_1926` | 49 | OSHA | OSHA Construction standards (29 CFR 1926) | ### Energy and environment | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `reg_clean_air` | 28 | EPA | Clean Air Act | | 
`reg_clean_water` | 26 | EPA | Clean Water Act | | `reg_cercla` | 27 | EPA | Comprehensive Environmental Response, Compensation, and Liability Act (Superfund) | | `reg_rcra` | 29 | EPA | Resource Conservation and Recovery Act | | `reg_tsca` | 25 | EPA | Toxic Substances Control Act | | `reg_nerc_cip` | 48 | NERC / FERC | NERC Critical Infrastructure Protection (electric grid) | ### Federal IT and contracting | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `reg_fisma` | 27 | OMB / NIST | Federal Information Security Modernization Act | | `reg_fedramp` | 40 | GSA | Federal Risk and Authorization Management Program | | `reg_far` | 32 | GSA / DoD / NASA | Federal Acquisition Regulation | | `reg_dfars` | 25 | DoD | Defense Federal Acquisition Regulation Supplement | | `reg_itar` | 32 | State Dept | International Traffic in Arms Regulations | | `reg_ear` | 31 | BIS / Commerce | Export Administration Regulations | ## US Frameworks (Voluntary or Sector-Specific) Not laws themselves but widely adopted as the de-facto basis for compliance, audit, and accreditation in their respective sectors. 
### Cybersecurity and IT governance | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `reg_nist_csf` | 28 | NIST | NIST Cybersecurity Framework 2.0 (Identify, Protect, Detect, Respond, Recover, Govern) | | `reg_nist_800_53` | 36 | NIST | NIST SP 800-53 Rev 5 security and privacy controls | | `reg_nist_800_171` | 28 | NIST | NIST SP 800-171 Rev 3 (controlled unclassified information) | | `reg_cmmc` | 25 | DoD | Cybersecurity Maturity Model Certification 2.0 | | `reg_cis_controls` | 29 | CIS | CIS Critical Security Controls v8 | | `reg_pci_dss` | 27 | PCI SSC | PCI Data Security Standard v4.0 | | `reg_soc2` | 37 | AICPA | SOC 2 Trust Services Criteria | | `reg_hitrust` | 27 | HITRUST | HITRUST Common Security Framework (healthcare) | | `reg_cobit` | 45 | ISACA | COBIT 2019 (governance and management of enterprise IT) | | `reg_coso` | 27 | COSO | Committee of Sponsoring Organizations Internal Control Framework | ### Accounting and audit | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `reg_us_gaap` | 33 | FASB | US Generally Accepted Accounting Principles (ASC codification) | | `reg_fasb` | 19 | FASB | Financial Accounting Standards Board statements | | `reg_pcaob` | 28 | PCAOB | Public Company Accounting Oversight Board auditing standards | | `reg_aicpa` | 21 | AICPA | American Institute of Certified Public Accountants standards | ### Healthcare accreditation and standards | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `reg_joint_commission` | 30 | TJC | Joint Commission hospital accreditation standards | | `reg_cap` | 21 | CAP | College of American Pathologists laboratory accreditation | | `reg_clia` | 20 | CMS | Clinical Laboratory Improvement Amendments | | `reg_usp` | 21 | USP | US Pharmacopeia chapters (drug compounding, packaging, sterility) | ### Engineering and building | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `reg_ashrae` | 23 
| ASHRAE | Standards for HVAC, refrigeration, building energy | | `reg_asme` | 26 | ANSI / ASME | Boiler, pressure vessel, and piping codes | ## EU Regulations and Directives Binding rules across EU member states, often with extraterritorial reach (a US SaaS targeting EU residents must comply with GDPR, etc.). ### Privacy and digital services | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `reg_eprivacy` | 15 | Member states / EDPB | ePrivacy Directive (cookies, electronic comms) | | `reg_eu_data_act` | 20 | Commission | EU Data Act (data sharing, switching, public-sector access) | | `reg_dsa` | 21 | Commission | Digital Services Act (online intermediaries, very large platforms) | | `reg_dma` | 19 | Commission | Digital Markets Act (gatekeeper obligations) | | `reg_eu_whistleblower` | 17 | Member states | Whistleblower Protection Directive | ### Cybersecurity and resilience | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `reg_nis2` | 24 | ENISA / member states | NIS2 Directive (network and information security) | | `reg_dora` | 27 | ESAs | Digital Operational Resilience Act (financial sector ICT risk) | | `reg_eu_cra` | 20 | Commission | EU Cyber Resilience Act (products with digital elements) | | `reg_eu_ai_act` | 27 | Commission | EU AI Act (risk-tiered AI system obligations) | ### Financial services | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `reg_mifid2` | 24 | ESMA / national | Markets in Financial Instruments Directive II | | `reg_solvency2` | 22 | EIOPA | Solvency II (insurance prudential) | | `reg_psd2` | 19 | EBA | Payment Services Directive 2 (open banking) | ### Health and life sciences | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `reg_mdr` | 22 | EC / notified bodies | EU Medical Device Regulation | | `reg_ivdr` | 17 | EC / notified bodies | In Vitro Diagnostic Medical Devices Regulation | ### Sustainability and 
environment | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `reg_csrd` | 25 | Commission / EFRAG | Corporate Sustainability Reporting Directive | | `reg_cbam` | 18 | Commission | Carbon Border Adjustment Mechanism | | `reg_sfdr_detail` | 22 | ESAs | Sustainable Finance Disclosure Regulation (detailed RTS) | | `reg_eu_deforestation` | 20 | Commission | EU Deforestation Regulation | | `reg_emas` | 25 | Commission | Eco-Management and Audit Scheme | ### Products and chemicals | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `reg_reach` | 19 | ECHA | REACH (registration, evaluation, authorization of chemicals) | | `reg_rohs` | 22 | EC / member states | RoHS Directive (restriction of hazardous substances) | | `reg_weee` | 21 | EC / member states | WEEE Directive (waste electrical and electronic equipment) | | `reg_eu_packaging` | 19 | Commission | EU Packaging and Packaging Waste Regulation | | `reg_eu_batteries` | 18 | Commission | EU Batteries Regulation | | `reg_eu_machinery` | 20 | Commission | EU Machinery Regulation | ## ISO Management System Standards Voluntary international standards that organizations certify against. Each defines a Plan-Do-Check-Act management system for a specific domain. Often combined into integrated management systems (e.g., ISO 9001 + ISO 14001 + ISO 45001). 
| System | Codes | Year | Scope | |--------|-------|------|-------| | `reg_iso_9001` | 35 | 2015 | Quality management systems | | `reg_iso_14001` | 29 | 2015 | Environmental management systems | | `reg_iso_27001` | 30 | 2022 | Information security management | | `reg_iso_27701` | 27 | 2019 | Privacy information management (extension of 27001) | | `reg_iso_22000` | 31 | 2018 | Food safety management | | `reg_iso_45001` | 30 | 2018 | Occupational health and safety | | `reg_iso_50001` | 26 | 2018 | Energy management | | `reg_iso_13485` | 28 | 2016 | Medical-device quality management | | `reg_iso_22301` | 26 | 2019 | Business continuity management | | `reg_iso_22313` | 24 | 2020 | BCMS implementation guidance (companion to 22301) | | `reg_iso_20000` | 23 | 2018 | IT service management (aligns with ITIL) | | `reg_iso_26000` | 22 | 2010 | Social responsibility (guidance, not certifiable) | | `reg_iso_37001` | 29 | 2016 | Anti-bribery management | | `reg_iso_42001` | 32 | 2023 | AI management systems (the newest big one) | | `reg_iso_28000` | 24 | 2022 | Supply chain security management | | `reg_iso_55001` | 25 | 2014 | Asset management | | `reg_iso_41001` | 23 | 2018 | Facility management | | `reg_iso_30401` | 22 | 2018 | Knowledge management | | `reg_iso_21001` | 31 | 2018 | Educational organization management | | `reg_iso_39001` | 24 | 2012 | Road traffic safety management | | `reg_iso_37101` | 23 | 2016 | Sustainable communities | | `reg_iso_14064` | 20 | various | Greenhouse gas accounting and verification | | `reg_iso_14040` | 25 | 2006 | Life cycle assessment principles | | `reg_iso_19011` | 30 | 2018 | Auditing management systems | | `reg_iso_31010` | 26 | 2019 | Risk assessment techniques | ## Global Treaties and Multilateral Frameworks Binding international agreements and recommendations adopted by sovereign states. 
### Finance and trade | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `reg_basel3` | 24 | BIS / BCBS | Basel III/IV bank capital and liquidity framework | | `reg_fatf` | 29 | FATF | 40 Recommendations on AML / CFT | | `reg_wto_sps` | 19 | WTO | Sanitary and Phytosanitary Measures Agreement | | `reg_wto_tbt` | 17 | WTO | Technical Barriers to Trade Agreement | | `reg_uncitral` | 20 | UN | UNCITRAL Model Laws (international commerce) | ### Labor and human rights | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `reg_ilo_core` | 16 | ILO | Core labor conventions (forced labor, child labor, discrimination, freedom of association) | | `reg_ungp` | 22 | UN | UN Guiding Principles on Business and Human Rights | | `reg_oecd_mne` | 22 | OECD | OECD Guidelines for Multinational Enterprises | ### Environment | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `reg_montreal` | 19 | UNEP | Montreal Protocol (ozone-depleting substances) | | `reg_paris` | 20 | UNFCCC | Paris Agreement on climate change | | `reg_kimberley` | 17 | KP Plenary | Kimberley Process (conflict diamonds) | | `reg_codex` | 22 | FAO / WHO | Codex Alimentarius (food standards) | | `reg_who_fctc` | 18 | WHO | Framework Convention on Tobacco Control | ### Maritime and aviation | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `reg_unclos` | 25 | UN | UN Convention on the Law of the Sea | | `reg_marpol` | 20 | IMO | International Convention for the Prevention of Pollution from Ships | | `reg_solas` | 21 | IMO | International Convention for the Safety of Life at Sea | | `reg_icao_annex` | 26 | ICAO | ICAO Annexes to the Chicago Convention | ### Project and sustainable finance | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `reg_equator` | 18 | EPFI banks | Equator Principles (project finance environmental and social risk) | | `reg_ifc_ps` | 21 | IFC | IFC 
Performance Standards on Environmental and Social Sustainability |

### Intellectual property

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `reg_berne` | 18 | WIPO | Berne Convention for the Protection of Literary and Artistic Works |

## Cross-framework overlaps to know

Several controls and obligations recur across multiple frameworks. This is where downstream tooling pays off most: a single library of "evidence" can map to many frameworks.

| If you have | You substantially satisfy |
|---|---|
| ISO 27001 certified | Most of NIST CSF, big chunks of SOC 2 (Security), HIPAA Security Rule, PCI DSS technical controls |
| SOC 2 Type II | Vendor due-diligence baseline; ISO 27001 control overlap is ~60% |
| NIST 800-171 | CMMC Level 2 baseline (DoD contractors) |
| HITRUST CSF certified | HIPAA + ISO 27001 + NIST CSF + state privacy laws (the framework was designed as an aggregator) |
| GDPR | Most of CCPA / CPRA; ISO 27701 directly extends ISO 27001 to cover GDPR principles |
| ISO 9001 | Foundation for ISO 13485 (medical devices), ISO 22000 (food), AS 9100 (aerospace) - all are 9001 + sector additions |
| Basel III | Solvency II builds the same prudential discipline for insurers |

## Crosswalk navigation

```bash
# Find regulatory frameworks in WoT
curl "https://worldoftaxonomy.com/api/v1/systems?prefix=reg_"

# Browse a specific framework
curl https://worldoftaxonomy.com/api/v1/systems/reg_hipaa/nodes
curl https://worldoftaxonomy.com/api/v1/systems/reg_iso_27001/nodes

# Search across all regulatory content
curl "https://worldoftaxonomy.com/api/v1/search?q=encryption&systems=reg_nist_800_53,reg_iso_27001,reg_pci_dss"
```

Equivalence edges between regulatory frameworks are not yet wired at scale; this is a high-value follow-up. The cross-framework overlap table above is the manual map; programmatic crosswalks (NIST 800-53 control -> ISO 27001 Annex A control, for example) are queued for a future PR.

## What WoT does not host

- **State-level regulations** other than CCPA / CPRA.
The 50 US state privacy / breach-notification laws are out of scope until a downstream product needs them. - **Country-specific privacy laws** outside the US and EU (LGPD, POPIA, PIPL, etc.). Audit candidates for follow-up if customer demand surfaces. - **Industry-specific contractual frameworks** with restricted licensing (FAA Part 145 detailed AC content, ISO 15926 industrial process). Behind paywalls; out per the inclusion-policy assessment. - **Commercial accreditation programs** that are private products of the accreditor (Underwriters Laboratories test programs, J.D. Power scorecards). ## Related reading - [Process and Activity Frameworks](./process-frameworks.md) - PCF, SCOR, ITIL, COBIT (COBIT especially overlaps the IT-governance subset of this page). - [Industry Classification Guide](./industry-classification.md) - which NAICS sectors trigger which regulatory regimes. - [Inclusion Policy](./inclusion-policy.md) - why state-level and country-specific privacy laws are not in WoT yet. ======================================================================== # Financial Systems - BICS, TRBC, ICB, GICS, FIBO, IFRS, CFI, Basel, and 50 more ======================================================================== # Financial Systems > **TL;DR:** WoT hosts ~55 financial classification systems: industry-vendor sector trees (Bloomberg BICS, Refinitiv TRBC, ICB, GICS Bridge, SASB SICS), the FIBO ontology (2,521 OWL classes), instrument and messaging codes (CFI, SWIFT MT, ISO 20022, XBRL, corporate actions), accounting standards (US GAAP, IFRS), banking and prudential regulation (Basel III, EU Taxonomy, SFDR, TNFD), 30+ curated domain finance taxonomies (derivatives, credit ratings, hedge funds, payments, FinTech, RegTech, InsurTech, microfinance), and emerging crypto vocabularies (token standards, DeFi protocols). This page maps which system to use when, and how the major ones connect. 
---

## What this layer is for

Financial systems classify *capital, instruments, counterparties, transactions, and obligations* - orthogonal to general-purpose industry codes (NAICS, ISIC) and process frameworks (APQC PCF). A bank might be NAICS 5221 (Depository Credit Intermediation), Basel III IRB-eligible, ICB 30 (Financials), Refinitiv TRBC sector "Banking Services", and FIBO `BankingProductsAndServices` simultaneously - each anchor serves a different downstream surface.

This layer matters when downstream products need to:

- Classify a security for trade reporting (CFI for ISO 10962 standardized classification).
- Score an issuer for ESG (SASB SICS for sector materiality, SFDR for EU sustainable-finance disclosures).
- Map a corporate action for back-office processing (Corporate Actions / ISO 20022 / SWIFT MT).
- Categorize a fintech product against regulatory and product taxonomies (RegTech / FinTech domain taxonomies + reg_* frameworks).
- Anchor counterparty risk against an industry-vendor sector hierarchy (BICS, TRBC, ICB).
- Build a unified financial-data semantic layer (FIBO).

## Industry vendor sector trees (the "BICS / TRBC / ICB / GICS / SASB" cluster)

Five competing financial industry hierarchies. Each is the de-facto sector taxonomy on a different platform; portfolio teams routinely map between them.

| System | Codes | Maintained by | Used in |
|--------|-------|---------------|---------|
| `bloomberg_bics` | 13 | Bloomberg | Bloomberg Terminal, BQuant |
| `refinitiv_trbc` | 13 | LSEG / Refinitiv | Eikon, Workspace, World-Check |
| `icb` | 32 | FTSE Russell | FTSE indexes, LSE listings |
| `ftse_icb_detail` | 12 | FTSE Russell | Detailed ICB sub-sectors |
| `gics_bridge` | 11 | MSCI / S&P | MSCI indexes, S&P 500 sector splits |
| `sasb_sics` | 86 | SASB / IFRS Foundation | ESG materiality assessment |

```mermaid
graph LR
    GICS["GICS\n11 sectors"]
    ICB["ICB\n11 industries"]
    BICS["BICS\n11 sectors"]
    TRBC["TRBC\n10 economic sectors"]
    SASB["SASB SICS\n11 sectors / 77 industries"]
    GICS -.peer.- ICB
    ICB -.peer.- BICS
    BICS -.peer.- TRBC
    GICS -.peer.- SASB
```

These are deliberate competitors; none subsumes the others. SASB is special-purpose (ESG materiality, formally adopted by the IFRS Foundation in 2022), so it's used alongside one of the other four for risk + materiality reporting.

## Financial Industry Business Ontology (FIBO)

| Field | Value |
|---|---|
| System ID | `fibo` |
| Total classes | 2,521 |
| Authority | EDM Council |
| License | MIT |

FIBO is the largest financial system in WoT and the most semantically rigorous. It's an OWL ontology, not a flat code list - meaning each class has a formal definition, parent classes, and (often) restriction axioms. WoT ingests the class hierarchy across 7 modules:

| Module | What it covers |
|--------|----------------|
| BE | Business Entities (legal forms, ownership structures) |
| FBC | Financial Business and Commerce (markets, exchanges, jurisdictions) |
| FND | Foundations (concepts shared across the ontology) |
| SEC | Securities (equity, debt, structured products) |
| DER | Derivatives (forwards, futures, options, swaps) |
| IND | Indices and Indicators |
| LOAN | Loans (mortgages, syndicated loans, credit) |

Codes use module-prefixed local names (`SEC/Equity`, `BE/SoleProprietor`, `LOAN/Mortgage`) to disambiguate cross-module collisions. This is the right anchor when downstream products need a *typed* concept rather than a sector code: a "30-year fixed-rate mortgage to a sole proprietor" is `LOAN/FixedRateMortgage` issued to a `BE/SoleProprietor`, not just NAICS 5223.
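The module-prefixed code format described above is easy to handle in client code. The sketch below splits a code such as `LOAN/FixedRateMortgage` into its module and local name and rejects prefixes outside the seven modules listed in the table. The names `FiboCode` and `parse_fibo_code` are hypothetical helpers for illustration, not part of any WoT package.

```python
from typing import NamedTuple

# The seven module prefixes listed in the FIBO module table.
FIBO_MODULES = {"BE", "FBC", "FND", "SEC", "DER", "IND", "LOAN"}


class FiboCode(NamedTuple):
    module: str
    local_name: str


def parse_fibo_code(code: str) -> FiboCode:
    """Split a module-prefixed code such as 'SEC/Equity' into its two parts."""
    # Split on the first slash only; anything after it is the local name.
    module, sep, local_name = code.partition("/")
    if not sep or not local_name:
        raise ValueError(f"expected MODULE/LocalName, got {code!r}")
    if module not in FIBO_MODULES:
        raise ValueError(f"unknown FIBO module {module!r}")
    return FiboCode(module, local_name)


mortgage = parse_fibo_code("LOAN/FixedRateMortgage")
borrower = parse_fibo_code("BE/SoleProprietor")
print(mortgage.module, mortgage.local_name)
print(borrower.module, borrower.local_name)
```

Keeping the module as a separate field makes it simple to group or filter classes by module without string matching on the full code.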
## Accounting standards | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `reg_us_gaap` | 33 | FASB | US Generally Accepted Accounting Principles (ASC codification) | | `reg_fasb` | 19 | FASB | FASB Statements (parent of ASC) | | `ifrs` | 34 | IFRS Foundation | International Financial Reporting Standards | | `xbrl_taxonomy` | 14 | XBRL International | XBRL taxonomy concepts (machine-readable filings) | | `reg_pcaob` | 28 | PCAOB | Public Company Accounting Oversight Board auditing standards | | `reg_aicpa` | 21 | AICPA | American Institute of CPAs standards | US GAAP and IFRS are the two top-of-funnel accounting frameworks. XBRL is the machine-readable serialization layer (SEC EDGAR filings, ESEF EU filings, etc.). PCAOB and AICPA cover audit standards over those filings. ## Instrument, messaging, and operational codes The plumbing layer that moves financial data and trades. | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `cfi_iso10962` | 63 | ISO | Classification of Financial Instruments (CFI) - 6-character code per security type | | `swift_mt` | 13 | SWIFT | Legacy SWIFT MT message types (MT103, MT202, etc.) | | `iso20022_msg` | 17 | ISO | ISO 20022 financial messaging schema (the modern replacement for SWIFT MT) | | `corporate_action` | 19 | ISO 15022 / SWIFT | Corporate action event types (dividends, splits, mergers) | | `card_schemes` | 15 | various | Major card schemes (Visa, Mastercard, Amex, UnionPay, JCB, Discover, RuPay, Mir, etc.) | CFI is the universal instrument code; SWIFT MT and ISO 20022 are the message envelopes; corporate-action codes are the event vocabulary inside those envelopes. 
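A CFI code is six characters long, and a small parser shows how the pieces separate. The sketch below assumes the ISO 10962 layout of one category letter, one group letter, and four attribute letters. The `CFI_CATEGORIES` map is deliberately partial and illustrative, and `CfiCode` and `parse_cfi` are hypothetical names rather than WoT functions.

```python
import re
from dataclasses import dataclass

# Partial, illustrative category map; ISO 10962 defines more categories.
CFI_CATEGORIES = {"E": "Equities", "D": "Debt instruments"}


@dataclass(frozen=True)
class CfiCode:
    category: str    # position 1
    group: str       # position 2
    attributes: str  # positions 3-6


def parse_cfi(code: str) -> CfiCode:
    """Validate a six-letter CFI code and split it by position."""
    if not re.fullmatch(r"[A-Z]{6}", code):
        raise ValueError(f"CFI codes are six uppercase letters, got {code!r}")
    return CfiCode(category=code[0], group=code[1], attributes=code[2:])


cfi = parse_cfi("ESVUFR")
label = CFI_CATEGORIES.get(cfi.category, "unmapped category")
print(label, cfi.group, cfi.attributes)
```

The meaning of the group and attribute letters depends on the category, so a fuller implementation would need per-category lookup tables taken from the standard itself.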
## Banking, prudential, and sustainable finance regulation Cross-references the [Regulatory Standards page](./regulatory-standards.md) for the full list; the financial-specific subset: | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `reg_basel3` | 24 | BIS / BCBS | Basel III/IV bank capital and liquidity framework | | `basel_exposure` | 36 | BIS | Basel exposure-class taxonomy (sovereign, bank, corporate, retail, etc.) | | `reg_solvency2` | 22 | EIOPA | Solvency II (insurance prudential) | | `reg_mifid2` | 24 | ESMA | Markets in Financial Instruments Directive II | | `reg_psd2` | 19 | EBA | Payment Services Directive 2 (open banking) | | `reg_dora` | 27 | ESAs | Digital Operational Resilience Act (financial-sector ICT risk) | | `eu_taxonomy` | 60 | Commission | EU Taxonomy for sustainable activities | | `sfdr` | 30 | ESAs | Sustainable Finance Disclosure Regulation | | `reg_sfdr_detail` | 22 | ESAs | SFDR detailed RTS (regulatory technical standards) | | `tnfd` | 34 | TNFD | Taskforce on Nature-related Financial Disclosures | | `reg_csrd` | 25 | Commission / EFRAG | EU Corporate Sustainability Reporting Directive | ## Macro and development finance | System | Codes | Authority | Scope | |--------|-------|-----------|-------| | `wb_income` | 27 | World Bank | World Bank country income classification (low / lower-middle / upper-middle / high) | | `adb_sector` | 46 | ADB | Asian Development Bank sector taxonomy | These show up when an organization's portfolio activity is anchored to development finance (DFI lending, multilateral aid, ODA reporting). ## Curated finance domain taxonomies WoT-curated plain-language on-ramps for financial sub-domains (`domain_*`). These exist because no external standard covers the territory at this granularity in plain English. 
### Banking and lending

| System | Nodes | What it covers |
|--------|-------|----------------|
| `domain_finance_instrument` | 25 | Cross-cutting instrument types (cash, equity, debt, derivatives, alternatives) |
| `domain_finance_market` | 18 | Market and exchange structure types |
| `domain_finance_client` | 19 | Client and investor segment types (retail, mass affluent, HNW, UHNW, institutional) |
| `domain_finance_regulatory` | 18 | Plain-language regulatory framework anchors |
| `domain_commercial_lending` | 16 | Commercial lending product types |
| `domain_mortgage_type` | 16 | Mortgage product types (fixed, ARM, jumbo, FHA, etc.) |
| `domain_securitization` | 17 | Asset-securitization structures (RMBS, CMBS, ABS, CLO) |
| `domain_muni_bond` | 15 | Municipal bond types |
| `domain_bond_rating` | 18 | Bond rating scale types |
| `domain_credit_rating` | 21 | Credit rating scale types (Moody's, S&P, Fitch grades) |
| `domain_microfinance` | 18 | Microfinance product and institution types |
| `domain_trade_finance` | 18 | Trade-finance instrument types (letters of credit, factoring, forfaiting) |

### Investment management

| System | Nodes | What it covers |
|--------|-------|----------------|
| `domain_wealth_mgmt` | 17 | Wealth management service types |
| `domain_hedge_fund` | 18 | Hedge fund strategy types (long/short, global macro, event-driven, etc.) |
| `domain_pe_stage` | 20 | Private equity stage types (seed, series A-E, growth, late-stage) |
| `domain_reit_type` | 17 | REIT subcategory types (equity, mortgage, diversified) |
| `domain_actuarial_method` | 16 | Actuarial methodology types |

### Trading and markets

| System | Nodes | What it covers |
|--------|-------|----------------|
| `domain_derivatives` | 22 | Derivatives instrument types (forwards, futures, options, swaps, structured) |
| `domain_commodity_trading` | 16 | Commodity trading types |
| `domain_forex` | 15 | FX instrument types |

### Insurance

| System | Nodes | What it covers |
|--------|-------|----------------|
| `domain_insurance_product` | 25 | Insurance product types (life, P&C, health, specialty) |
| `domain_insurance_risk` | 25 | Insurance risk types |
| `domain_insurance_underwriting` | 15 | Underwriting approach types |
| `domain_insurance_claims` | 17 | Claim types |
| `domain_reinsurance` | 18 | Reinsurance arrangement types |

### Payments and FinTech

| System | Nodes | What it covers |
|--------|-------|----------------|
| `domain_payment_proc` | 21 | Payment processing types |
| `domain_digital_banking` | 21 | Digital banking service types |
| `domain_fintech_service` | 27 | FinTech service categories |
| `domain_regtech` | 27 | RegTech service categories |
| `domain_insurtech` | 26 | InsurTech service categories |

### Sustainable and emerging

| System | Nodes | What it covers |
|--------|-------|----------------|
| `domain_carbon_credit` | 18 | Carbon credit instrument types |
| `token_standard` | 15 | Crypto token standards (ERC-20, ERC-721, ERC-1155, BEP, SPL, etc.) |
| `defi_protocol` | 15 | DeFi protocol categories (DEX, AMM, lending, derivatives, yield, oracle) |

## Which system to use

| Purpose | Recommended system | Why |
|---------|-------------------|-----|
| Classifying a public-equity issuer for portfolio reporting | One of `gics_bridge` / `icb` / `bloomberg_bics` / `refinitiv_trbc` (pick the one your platform uses) | These are the canonical industry sector trees on each major platform |
| Classifying a security for trade reporting | `cfi_iso10962` | The international standard CFI code |
| Classifying a corporate action for STP processing | `corporate_action` + `iso20022_msg` | Event vocabulary + message envelope |
| Counterparty financial reporting | `ifrs` or `reg_us_gaap` (jurisdiction-dependent) | Top-of-funnel accounting standards |
| Bank capital and exposure reporting | `reg_basel3` + `basel_exposure` | Prudential framework + exposure-class taxonomy |
| ESG materiality assessment | `sasb_sics` | Designed for materiality, formally adopted by IFRS Foundation |
| EU sustainable-finance product disclosure | `eu_taxonomy` + `sfdr` + `reg_sfdr_detail` | EU-mandated combination |
| Insurance prudential reporting (EU) | `reg_solvency2` + `domain_insurance_*` | Regulatory + curated product taxonomies |
| Building a typed financial knowledge graph | `fibo` | OWL ontology with formal class definitions |
| Categorizing a fintech / RegTech / InsurTech vendor | `domain_fintech_service` / `domain_regtech` / `domain_insurtech` | Plain-language curated buckets |
| Classifying a crypto token or DeFi protocol | `token_standard` + `defi_protocol` | Emerging-tech vocabularies WoT curates because no stable external standard exists yet |

## Crosswalk navigation

```bash
# Get the GICS to ICB peer mapping for a sector
GET /api/v1/systems/gics_bridge/nodes/40/equivalences

# Translate a CFI code to FIBO type
GET /api/v1/systems/cfi_iso10962/nodes/ESVUFR/translations

# Find all FIBO classes in the SEC (Securities) module
GET /api/v1/search?q=Equity&systems=fibo

# Browse the Basel exposure-class hierarchy
GET /api/v1/systems/basel_exposure/nodes
```

The richest crosswalk surface today is between `gics_bridge`, `icb`, `bloomberg_bics`, and `refinitiv_trbc`. CFI to FIBO and Basel to FIBO are queued for a follow-up PR.

## What WoT does not host

- **MSCI ESG Ratings** - proprietary methodology, not a published taxonomy.
- **Bloomberg / Refinitiv detailed sub-industries** beyond what their public sector trees publish.
- **Internal credit-rating scales** of individual banks (proprietary).
- **OFR Financial Stability indicators** as a structured taxonomy - published as a dashboard, not a code list.
- **Exchange-specific instrument lists** (NYSE listed equities, etc.) - these are operational data, not classification systems (per inclusion policy).
- **Wikidata Q-numbers** for individual companies - entity registry above the size cap.

## Related reading

- [Regulatory Standards](./regulatory-standards.md) - Basel III, MiFID II, Solvency II, SFDR, CSRD detail.
- [Industry Classification Guide](./industry-classification.md) - NAICS / ISIC / NACE overlap with the financial industry trees.
- [Inclusion Policy](./inclusion-policy.md) - why proprietary scores and entity registries are out.
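
The crosswalk calls listed under "Crosswalk navigation" for the financial systems can also be scripted. The sketch below only builds request URLs and makes no assumption about the response shape. It assumes the base URL from the quick start and the path pattern shown in the examples; the helper names `node_url` and `search_url` are hypothetical and are not part of any published client.

```python
from urllib.parse import quote, urlencode

# Base URL as given in the quick start section.
BASE_URL = "https://worldoftaxonomy.com/api/v1"


def node_url(system_id: str, code: str, view: str = "") -> str:
    """Build a node URL; `view` is an optional sub-resource such as
    "equivalences", "translations", or "children"."""
    url = f"{BASE_URL}/systems/{quote(system_id, safe='')}/nodes/{quote(code, safe='')}"
    return f"{url}/{view}" if view else url


def search_url(query: str, systems: str = "") -> str:
    """Build a search URL, optionally restricted with the `systems` parameter."""
    params = {"q": query}
    if systems:
        params["systems"] = systems
    return f"{BASE_URL}/search?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical helpers applied to the documented example codes.
    print(node_url("gics_bridge", "40", "equivalences"))
    print(node_url("cfi_iso10962", "ESVUFR", "translations"))
    print(search_url("Equity", systems="fibo"))
```

The resulting strings can be passed to `curl` or to any HTTP client; percent-encoding the path segments keeps codes with unusual characters from breaking the URL.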
========================================================================
# Academic and Research Classification - arXiv, MSC, JEL, Dewey, LCC, AACSB, ABET, EQF, SFIA, and more
========================================================================

# Academic and Research Classification

> **TL;DR:** WoT hosts 30 academic and research classification systems: subject taxonomies for preprints (arXiv) and journals (Scopus ASJC, WoS Categories, ERA FoR, ANZSRC FOR/SEO, FORD Frascati, JEL, MSC, PACS, ACM CCS), library classifications (Dewey, UDC, LCC, LCSH, Getty AAT, UNESCO Thesaurus), education-quality and accreditation frameworks (AACSB, ABET), education-level frameworks (EQF, AQF, NQF UK, NGSS, CCSS, Bloom Taxonomy), and skills/competence frameworks (DigComp, e-CF, SFIA, LinkedIn Skills, WorldSkills).

---

## Subject classification (research output anchors)

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `arxiv_taxonomy` | 165 | arXiv / Cornell | Preprint subject categories (cs.LG, math.AG, etc.) |
| `msc_2020` | 92 | AMS | Mathematics Subject Classification |
| `pacs` | 70 | AIP | Physics and Astronomy Classification (legacy, still cited) |
| `acm_ccs` | 67 | ACM | ACM Computing Classification System 2012 |
| `jel` | 98 | American Economic Association | Journal of Economic Literature codes |
| `scopus_asjc` | 28 | Elsevier | Scopus All Science Journal Classification |
| `wos_categories` | 25 | Clarivate | Web of Science subject categories |
| `era_for` | 24 | ARC (Australia) | Excellence in Research for Australia, Fields of Research |
| `anzsrc_for_2020` | 166 | ABS / Stats NZ | ANZSRC Fields of Research 2020 |
| `anzsrc_seo` | 17 | ABS / Stats NZ | ANZSRC Socio-Economic Objectives |
| `ford_frascati` | 48 | OECD | Fields of Research and Development (Frascati Manual 2015) |

**Use this layer when**: classifying preprints, journal articles, grant applications, or research-output reporting.
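
Category codes such as `cs.LG` and `math.AG` follow arXiv's `archive.subject-class` pattern, and some archives (for example `hep-th`) carry no subject class. Below is a minimal sketch of splitting such a code before a lookup. The function name is hypothetical, and whether WoT stores `arxiv_taxonomy` nodes under exactly these strings is an assumption.

```python
import re
from typing import Optional, Tuple

# archive: lowercase words joined by hyphens (cs, hep-th, cond-mat)
# subject class (optional): letters joined by hyphens (LG, AG, mes-hall)
_ARXIV_CODE = re.compile(r"^([a-z]+(?:-[a-z]+)*)(?:\.([A-Za-z]+(?:-[A-Za-z]+)*))?$")


def split_arxiv_category(code: str) -> Tuple[str, Optional[str]]:
    """Split an arXiv-style category code into (archive, subject_class).

    subject_class is None when the code names an archive only.
    """
    match = _ARXIV_CODE.match(code.strip())
    if match is None:
        raise ValueError(f"not an arXiv-style category code: {code!r}")
    return match.group(1), match.group(2)


if __name__ == "__main__":
    for example in ("cs.LG", "math.AG", "hep-th"):
        print(example, split_arxiv_category(example))
```

Grouping preprints by the archive part gives a coarse subject bucket, while the full code stays available for exact lookups.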

## Library classification

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `dewey_decimal` | 11 | OCLC | Dewey Decimal Classification (skeleton: 10 main classes + general) |
| `udc` | 11 | UDC Consortium | Universal Decimal Classification (skeleton) |
| `lcc` | 111 | Library of Congress | Library of Congress Classification |
| `lcsh` | 20 | Library of Congress | LCSH Subject Headings (skeleton) |
| `getty_aat` | 14 | Getty Research Institute | Art and Architecture Thesaurus (skeleton) |
| `unesco_thesaurus` | 15 | UNESCO | UNESCO Thesaurus (skeleton) |

**Use this layer when**: cataloging or anchoring against bibliographic / archival metadata.

## Education accreditation, levels, and pedagogy

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `aacsb` | 14 | AACSB International | Business school accreditation standards |
| `abet` | 14 | ABET | Engineering / computing / applied-science program accreditation |
| `eqf` | 13 | European Commission | European Qualifications Framework (8 levels) |
| `aqf` | 14 | Australia | Australian Qualifications Framework |
| `nqf_uk` | 14 | UK | UK National Qualifications Framework |
| `bloom_taxonomy` | 14 | Bloom et al. | Bloom's Taxonomy of educational objectives (revised 2001) |
| `ngss` | 14 | Achieve Inc. (US) | Next Generation Science Standards |
| `ccss` | 18 | NGA / CCSSO (US) | Common Core State Standards |

## Skills and competence frameworks

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `digcomp_22` | 27 | European Commission JRC | Digital Competence Framework for Citizens 2.2 |
| `ecf_v4` | 35 | CEN (EU) | European e-Competence Framework v4 |
| `sfia_v8` | 14 | SFIA Foundation | Skills Framework for the Information Age v8 |
| `linkedin_skills` | 17 | LinkedIn | LinkedIn Skills Taxonomy (skeleton; mapped from public docs) |
| `worldskills` | 14 | WorldSkills International | Skill competition categories |

These overlap with [Occupation Systems](./occupation-systems.md) (ESCO Skills, O*NET Knowledge / Abilities); the difference is granularity. ESCO Skills has 14K entries; SFIA / e-CF are framework-level (~30 each), suited for capability assessment rather than skill tagging.

## Decision tree

| What you are doing | Use |
|---|---|
| Tagging a preprint | `arxiv_taxonomy` (CS / math / physics) or `msc_2020` (math) |
| Classifying a journal article | `scopus_asjc` or `wos_categories` |
| Funding agency reporting (US/EU) | `ford_frascati` (OECD) or `era_for` / `anzsrc_*` (Australia) |
| Cataloging a book in a library | `dewey_decimal`, `udc`, or `lcc` |
| Subject heading on a record | `lcsh`, `unesco_thesaurus`, `getty_aat` |
| Anchoring a course / qualification level | `eqf`, `aqf`, `nqf_uk`, or local equivalent |
| Accrediting a business school | `aacsb` |
| Accrediting an engineering program | `abet` |
| Designing learning objectives | `bloom_taxonomy` + `ngss` / `ccss` |
| Assessing IT capability | `sfia_v8` or `ecf_v4` |
| Assessing digital citizenship | `digcomp_22` |

## Related reading

- [Occupation Systems](./occupation-systems.md) - skills frameworks crossover (ESCO, O*NET).
- [Web Vocabularies](./web-vocabularies.md) - schema.org `Course` and `EducationalOccupationalProgram` types map to these.
- [Inclusion Policy](./inclusion-policy.md) - many academic taxonomies are skeletons; rationale documented.

========================================================================
# Environmental Standards and Scales - BREEAM, LEED, IUCN, CITES, GHG Protocol, SDG, TNFD, Beaufort, Köppen, and more
========================================================================

# Environmental Standards and Scales

> **TL;DR:** WoT hosts ~30 environmental classification systems across five families: green-building rating schemes (BREEAM, LEED), biodiversity and conservation taxonomies (IUCN Red List, CITES, Ramsar, CBD targets), waste and chemicals codes (EPA RCRA, EU Waste Catalogue, Stockholm/Rotterdam/Minamata Conventions, UNEP Chemicals), climate and energy data taxonomies (GHG Protocol, IEA Energy Balance, IRENA, FAO AQUASTAT, FAOSTAT, ISO 14001/14040/14064/50001), and natural-science scales (Beaufort, Saffir-Simpson, Fujita, UV Index, Köppen Climate, Geological Timescale, Periodic Table). It also hosts the SDG 2030 framework and the TNFD nature-related disclosure standard.

---

## Green building and infrastructure

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `breeam` | 17 | BRE | Building Research Establishment Environmental Assessment Method |
| `leed_v4_1` | 14 | USGBC | Leadership in Energy and Environmental Design v4.1 |
| `reg_iso_14001` | 29 | ISO | Environmental management system standard |
| `reg_iso_50001` | 26 | ISO | Energy management system standard |

## Biodiversity and conservation

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `iucn_red_list` | 15 | IUCN | Red List of Threatened Species categories |
| `cites` | 16 | CITES Secretariat | Convention on International Trade in Endangered Species (Appendix I/II/III) |
| `ramsar` | 21 | Ramsar Convention | Ramsar wetland classification |
| `cbd_targets` | 24 | CBD Secretariat | Convention on Biological Diversity Global Biodiversity Framework targets |
| `cbd_aichi` | 21 | CBD Secretariat | Aichi Biodiversity Targets (legacy 2011-2020 framework, still cited) |

## Waste, chemicals, and pollution

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `epa_rcra_waste` | 15 | US EPA | Hazardous waste codes under RCRA |
| `eu_waste_cat` | 21 | European Commission | EU Waste Catalogue (2014/955/EU) |
| `stockholm_pops` | 19 | Stockholm Convention | Persistent Organic Pollutants Annex A/B/C |
| `rotterdam_pic` | 17 | Rotterdam Convention | Prior Informed Consent procedure for hazardous chemicals |
| `minamata` | 15 | Minamata Convention | Mercury phase-out provisions |
| `unep_chemicals` | 15 | UNEP | UNEP chemicals categories |

## Climate, energy, and natural resources

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `ghg_protocol` | 20 | WRI / WBCSD | Greenhouse Gas Protocol (Scope 1/2/3 categories) |
| `reg_iso_14064` | 20 | ISO | GHG quantification, monitoring, verification |
| `reg_iso_14040` | 25 | ISO | Life-cycle assessment principles and framework |
| `iea_energy_bal` | 19 | IEA | International Energy Agency energy balance categories |
| `irena_re` | 17 | IRENA | International Renewable Energy Agency RE technology types |
| `fao_aquastat` | 14 | FAO | Global water and agriculture statistics taxonomy |
| `fao_stat_domain` | 17 | FAO | FAOSTAT data domains (Production, Trade, Inputs, etc.) |

## Sustainability disclosure (overlap with regulatory and financial layers)

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `sdg` | 82 | United Nations | Sustainable Development Goals 2030 (17 goals + 169 targets, here ~82 nodes) |
| `un_sdg_indicators` | 20 | United Nations | SDG indicator framework |
| `tnfd` | 34 | TNFD | Taskforce on Nature-related Financial Disclosures |

See [Regulatory Standards](./regulatory-standards.md) for the broader sustainability-disclosure regulation set (CSRD, EU Taxonomy, SFDR, SBTi, ISSB S1/S2, etc.).

## Natural-science scales (small bounded enumerations)

These are short, stable, universal scales used as reference data across many domains.

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `beaufort_scale` | 14 | WMO | Beaufort Wind Force Scale (0-12 + extensions) |
| `saffir_simpson` | 12 | NHC / NOAA | Saffir-Simpson Hurricane Wind Scale (Cat 1-5 + extensions) |
| `fujita_tornado` | 10 | NWS | Enhanced Fujita Scale (EF0-EF5) |
| `uv_index` | 11 | WHO / WMO | Ultraviolet Index |
| `koppen_climate` | 17 | Köppen-Geiger | Köppen climate classification |
| `geological_time` | 20 | ICS | International Commission on Stratigraphy geologic time scale |
| `periodic_table` | 18 | IUPAC | Periodic table groupings (s/p/d/f-blocks, lanthanides, actinides, etc.) |

## Decision tree

| What you are doing | Use |
|---|---|
| Building energy / sustainability rating | `breeam` (UK/EU) or `leed_v4_1` (US/global) |
| Corporate GHG inventory | `ghg_protocol` + `reg_iso_14064` |
| Life-cycle environmental assessment | `reg_iso_14040` |
| Endangered-species trade compliance | `cites` |
| Wetland project classification | `ramsar` |
| Hazardous waste manifest | `epa_rcra_waste` (US) or `eu_waste_cat` (EU) |
| POPs / chemicals reporting | `stockholm_pops`, `rotterdam_pic`, `minamata` |
| Renewable-energy capacity reporting | `irena_re` |
| Tagging a research outcome to an SDG | `sdg` + `un_sdg_indicators` |
| Nature-related financial disclosure | `tnfd` |
| Categorizing a weather/climate observation | `beaufort_scale`, `saffir_simpson`, `fujita_tornado`, `koppen_climate` |

## Related reading

- [Regulatory Standards](./regulatory-standards.md) - EU sustainability regulations (CSRD, EU Taxonomy, SFDR, etc.).
- [Financial Systems](./financial-systems.md) - sustainable-finance frameworks (TNFD, SBTi, ISSB).
- [Inclusion Policy](./inclusion-policy.md) - rationale for what's in scope.

========================================================================
# Geographic Classification - ISO 3166, UN M.49, NUTS, FIPS, GeoNames, Köppen
========================================================================

# Geographic Classification

> **TL;DR:** WoT hosts 11 geographic classification systems anchoring the spatial axis: country codes (ISO 3166-1, UN M.49), subdivision codes (ISO 3166-2, EU NUTS 2021, US FIPS), feature classification (GeoNames Features), airport codes (ICAO Airport), climate zones (Köppen), and country development / income groupings (WB Income Groups, ADB Sector). Distinct from the country-link layer that maps a country to its applicable taxonomies - this page is the geography-as-classification view.

---

## Country and territory codes

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `iso_3166_1` | 271 | ISO | Country / dependency / autonomous-region two-letter, three-letter, numeric codes |
| `un_m49` | 279 | UN Statistics Division | UN M.49 standard country / area / region codes (with regional groupings: SDG regions, geographic regions) |

ISO 3166-1 is the operational country code (US, GB, DE, etc.); UN M.49 layers in regional and sub-regional groupings (Northern America, Western Europe, Sub-Saharan Africa). Use both: ISO 3166-1 for the country, M.49 for any aggregation.

## Subdivision codes

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `iso_3166_2` | 5,246 | ISO | First-order country subdivisions globally (US states, German Bundesländer, French régions, etc.) |
| `eu_nuts_2021` | 124 | Eurostat | Nomenclature of Territorial Units for Statistics (NUTS 1/2/3) |
| `us_fips` | 86 | NIST (US) | Federal Information Processing Standards geographic codes (states + outlying areas) |
| `nuts_candidate` | 11 | Eurostat | NUTS candidate codes for EU candidate countries and prospective members |

ISO 3166-2 is the universal subdivision code; NUTS is the EU-specific statistical hierarchy (NUTS 1 -> 2 -> 3); FIPS is the US legacy that many federal datasets still cite alongside ISO 3166-2.

## Feature and place classification

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `geonames_features` | 693 | GeoNames | Feature codes for places (administrative, hydrographic, populated, terrain, undersea, vegetation, spots/buildings/farms, roads/railroads) |
| `icao_airport` | 21 | ICAO | ICAO airport-code regional groupings (KXXX = US contiguous, EXXX = Europe, etc.) |

GeoNames is the "what kind of place is this" classifier (a city, a river, a mountain, an administrative division). Use it alongside ISO 3166-2 to anchor a place: "Boston is `P.PPLA` (seat of first-order admin division) within `US-MA`."

## Climate as geography

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `koppen_climate` | 17 | Köppen-Geiger | Climate-zone classification (A tropical, B arid, C temperate, D continental, E polar, with subdivisions) |

Climate zones are not coordinates, but they classify *where* an observation can apply (a city in tropical zone Af has different agricultural / disease / energy profiles than a city in desert zone BWh).

## Country groupings (development and macroeconomic)

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `wb_income` | 27 | World Bank | Country income classification (low / lower-middle / upper-middle / high; updated annually) |
| `adb_sector` | 46 | ADB | Asian Development Bank sector taxonomy with country-region context |

These are referenced from [Financial Systems](./financial-systems.md) as well - relevant when development-finance or DFI activity is being classified.

## Decision tree

| What you are doing | Use |
|---|---|
| Tagging a country | `iso_3166_1` (always; the operational anchor) |
| Aggregating to a region | `un_m49` (SDG regions, geographic regions) |
| Tagging a state / province / region | `iso_3166_2` (universal) or `us_fips` / `eu_nuts_2021` (jurisdiction-specific) |
| Anchoring a statistical region in EU | `eu_nuts_2021` (the EU statistics convention) |
| Classifying *what kind* of place it is | `geonames_features` (city / river / mountain / etc.) |
| Anchoring an airport | `icao_airport` (regional grouping); use IATA codes for the specific airport (out of scope here) |
| Climate-zone tagging | `koppen_climate` |
| Development-finance country tier | `wb_income` |

## Crosswalk navigation

```bash
# Translate an ISO 3166-2 subdivision to NUTS
GET /api/v1/systems/iso_3166_2/nodes/FR-IDF/translations

# Find the GeoNames feature class for a populated place
GET /api/v1/systems/geonames_features/nodes/P.PPLC

# Country profile (uses iso_3166_1 codes)
GET /api/v1/countries/US
```

## What WoT does not host

- **Postal codes** (ZIP, postcode) - operational data, grows continuously, fails the "stable enumeration" test in the inclusion policy.
- **Specific airport / port / station identifiers** (IATA, IMO, UN/LOCODE) - operational registries with tens of thousands of entries each. Candidates for a "World of Registries" sister product, not WoT.
- **Latitude/longitude coordinate systems** - notations, not classifications.
- **Cadastral parcel IDs** - jurisdiction-specific operational data.

## Related reading

- [Inclusion Policy](./inclusion-policy.md) - why postal codes and IATA are not in WoT.
- [Crosswalk Map](./crosswalk-map.md) - geography is a frequent join axis.

========================================================================
# Clinical Scales and Specialty Codes - APGAR, ASA, BMI, ICD variants, CPT, HCPCS, MS-DRG, RxNorm, NCI Thesaurus, FHIR, and more
========================================================================

# Clinical Scales and Specialty Codes

> **TL;DR:** Companion to [Medical Coding](./medical-coding.md). Where that page covers the major disease and lab classifications (ICD-10/11, MeSH, LOINC, ATC, SNOMED CT), this page catalogs the 40 supporting clinical scales and specialty codes WoT hosts: bedside / point-of-care scales (APGAR, ASA Physical Status, BMI, Bristol Stool, Pain Scale, Mohs hardness for skin pathology; Glasgow Coma is coming soon), national / regional ICD variants (ICD-10-CA / GM / AM, ICD-O-3, ICF, ICHI, ICPC-2; ICD-10-CM is not duplicated here), procedure / billing codes (CPT, HCPCS L2/L3, NUCC HCPT, MS-DRG, G-DRG), drug and medication codes (RxNorm, NDC, EDQM Dosage Forms), nursing taxonomies (NANDA-I, NIC Nursing, ICN Nursing), specialty registries (NCI Thesaurus, OMIM, Orphanet, GMDN, FHIR Resources, DICOM Modality), and quality measure frameworks (HEDIS, CMS Star Ratings, CTCAE, GBD Causes, WHO Essential Medicines, CDC Vaccine Schedule, DSM-5).

---

## Bedside and point-of-care scales

Small bounded enumerations clinicians use at every shift.

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `apgar_score` | 12 | AAP | APGAR newborn assessment (Appearance / Pulse / Grimace / Activity / Respiration; 0-2 each, 0-10 total) |
| `asa_physical` | 11 | ASA | American Society of Anesthesiologists Physical Status (ASA I-VI) |
| `bmi_categories` | 11 | WHO | Body Mass Index categories (underweight / normal / overweight / obese class I-III) |
| `blood_types` | 14 | various | ABO + Rh blood types |
| `bristol_stool` | 11 | NHS / Lewis | Bristol Stool Form Scale (Type 1-7) |
| `pain_scale` | 12 | various | Numeric Pain Rating Scale (0-10) and named scales |
| `mohs_hardness` | 11 | Mohs | Mineralogical hardness scale (used in dermatopathology and materials) |

These score-style scales are universally used and trivially small; they're in WoT primarily so downstream classifiers can resolve "ASA III" or "BMI obese class II" against an authoritative anchor.
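
Resolving a label such as "BMI obese class II" starts from the arithmetic behind the scale. The sketch below computes a BMI category and an APGAR total. It assumes the standard WHO adult cut-offs (18.5, 25, 30, 35, 40 kg/m^2) and the 0-2 per-component APGAR scoring described in the table; the function names and returned labels are illustrative and are not WoT node codes.

```python
def bmi_category(weight_kg: float, height_m: float) -> str:
    """Return the WHO adult BMI category label for a weight and height."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    bmi = weight_kg / (height_m ** 2)
    # Upper bounds are exclusive: a BMI of exactly 25.0 counts as "overweight".
    cutoffs = [
        (18.5, "underweight"),
        (25.0, "normal"),
        (30.0, "overweight"),
        (35.0, "obese class I"),
        (40.0, "obese class II"),
    ]
    for upper_bound, label in cutoffs:
        if bmi < upper_bound:
            return label
    return "obese class III"


def apgar_total(appearance: int, pulse: int, grimace: int,
                activity: int, respiration: int) -> int:
    """Sum the five APGAR components, each scored 0, 1, or 2."""
    components = (appearance, pulse, grimace, activity, respiration)
    if any(score not in (0, 1, 2) for score in components):
        raise ValueError("each APGAR component must be 0, 1, or 2")
    return sum(components)


if __name__ == "__main__":
    print(bmi_category(110, 1.75))
    print(apgar_total(2, 2, 1, 2, 2))
```

A downstream classifier could pass the returned label to the search endpoint restricted to `bmi_categories` to find the matching node, rather than hard-coding node codes.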

## ICD national variants

WoT hosts ICD-10-CM (US, separately listed) and ICD-11 in [Medical Coding](./medical-coding.md). National variants:

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `icd10_ca` | 23 | CIHI | Canadian ICD-10 modification (skeleton; Canadian Institute for Health Information) |
| `icd10_gm` | 51 | DIMDI / BfArM | German ICD-10 modification |
| `icd10_am` | 52 | IHACPA | Australian ICD-10 modification |
| `icdo3` | 115 | WHO | International Classification of Diseases for Oncology, 3rd ed. |
| `icf` | 34 | WHO | International Classification of Functioning, Disability and Health |
| `ichi_who` | 15 | WHO | International Classification of Health Interventions |
| `icpc2` | 18 | WONCA / WHO | International Classification of Primary Care, 2nd ed. |

## Procedure and billing codes

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `cpt_ama` | 18 | AMA | Current Procedural Terminology (skeleton; full CPT requires AMA license) |
| `hcpcs_l2` | 59 | CMS | Healthcare Common Procedure Coding System Level II (durable medical equipment, drugs, supplies) |
| `hcpcs_l3` | 13 | CMS / state Medicaid | HCPCS Level III (state-specific, legacy) |
| `nucc_hcpt` | 94 | NUCC | Healthcare Provider Taxonomy (provider type / specialty) |
| `ms_drg` | 50 | CMS | Medicare Severity Diagnosis Related Groups (US inpatient billing) |
| `g_drg` | 26 | InEK | German Diagnosis Related Groups |

## Drug, dosage, and medication

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `rxnorm` | 16 | NLM | RxNorm normalized drug-name standard (skeleton; full corpus is large) |
| `ndc_fda` | 112,077 | FDA | National Drug Code (every US-marketed drug product) |
| `edqm_dosage` | 17 | EDQM (Council of Europe) | Standard Terms for pharmaceutical dose forms / routes / containers |
| `who_essential_med` | 27 | WHO | WHO Model List of Essential Medicines (categories) |
| `cdc_vaccine` | 18 | CDC | CDC vaccine schedule categories |

## Nursing taxonomies

Nursing has its own controlled vocabularies for diagnoses and interventions.

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `nanda_nursing_dx` | 14 | NANDA-I | NANDA International nursing diagnoses (skeleton) |
| `nic_nursing_intv` | 14 | Iowa | Nursing Interventions Classification (skeleton) |
| `icn_nursing` | 14 | ICN | International Classification for Nursing Practice (ICNP, skeleton) |

## Specialty registries and large reference works

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `nci_thesaurus` | 211,072 | NCI | NCI Thesaurus (cancer-research terminology, the largest health system in WoT) |
| `omim` | 14 | Johns Hopkins / NCBI | Online Mendelian Inheritance in Man (skeleton; categories of genetic disorders) |
| `orphanet` | 16 | INSERM | Orphanet rare-disease classification (skeleton) |
| `gmdn` | 17 | GMDN Agency | Global Medical Device Nomenclature |
| `dicom_modality` | 16 | NEMA / DICOM | Standard imaging modality codes (CT, MR, US, etc.) |
| `fhir_resources` | 15 | HL7 | FHIR resource type catalog (Patient, Encounter, Observation, etc.) |

## Quality, outcome, and population-health measures

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `hedis` | 15 | NCQA | Healthcare Effectiveness Data and Information Set quality measures |
| `cms_star` | 13 | CMS | CMS Star Ratings categories (Medicare Advantage, hospitals, nursing homes) |
| `ctcae` | 27 | NCI | Common Terminology Criteria for Adverse Events (clinical-trial AE grading) |
| `gbd_cause` | 23 | IHME | Global Burden of Disease cause hierarchy |
| `dsm5` | 21 | APA | Diagnostic and Statistical Manual of Mental Disorders 5th ed. (skeleton) |

## Decision tree

| What you are doing | Use |
|---|---|
| Newborn assessment | `apgar_score` |
| Pre-anesthesia risk assessment | `asa_physical` |
| Adult-weight category | `bmi_categories` |
| Pain assessment | `pain_scale` |
| Cancer pathology coding | `icdo3` |
| Disability and functioning assessment | `icf` |
| Primary-care complaint coding | `icpc2` |
| Health intervention coding | `ichi_who` |
| US inpatient billing | `ms_drg` + `cpt_ama` + `hcpcs_l2` |
| US drug-product identification | `ndc_fda` |
| Pharmaceutical dose-form standard | `edqm_dosage` |
| Provider type classification | `nucc_hcpt` |
| Imaging study modality | `dicom_modality` |
| Cancer terminology lookup | `nci_thesaurus` |
| Rare-disease lookup | `orphanet` + `omim` |
| Adverse-event grading in trials | `ctcae` |
| Global burden of disease analysis | `gbd_cause` |
| FHIR resource modeling | `fhir_resources` |
| Quality reporting (US Medicare) | `hedis` + `cms_star` |
| Vaccine scheduling (US) | `cdc_vaccine` |
| Mental-health diagnosis | `dsm5` (skeleton; pair with ICD-10/11 mental-health chapters) |

## Related reading

- [Medical Coding](./medical-coding.md) - the major disease and lab classifications (ICD-10/11, MeSH, LOINC, ATC, SNOMED CT, NDC).
- [Regulatory Standards](./regulatory-standards.md) - HIPAA, FDA 21 CFR, DEA, Joint Commission, CAP, CLIA, USP.
- [Inclusion Policy](./inclusion-policy.md) - many clinical systems are skeletons; rationale documented.

========================================================================
# Process and Activity Frameworks - APQC PCF, SCOR, ITIL 4, COBIT, PMBOK
========================================================================

# Process and Activity Frameworks

> **TL;DR:** APQC PCF is the pan-industry process anchor (13 top-level categories of operating + management processes). SCOR covers supply-chain ops, ITIL 4 covers IT service management, COBIT covers IT governance, PMBOK / PRINCE2 cover project management, Six Sigma / Lean cover quality and operations. WoT hosts all of these so downstream products can anchor "what work does this organization do" against canonical process taxonomies.

---

## What this layer is for

Industry classifications (NAICS, ISIC, NACE) name *what an organization does* by sector. Process frameworks name *how the work itself is organized*: discrete activities, sub-processes, capabilities. The two are orthogonal: every NAICS sector executes the same APQC PCF top-level processes (it Develops Vision, Manages HR, Manages IT, Delivers Products / Services). The difference is which level-2 / level-3 elements are emphasized.

This layer matters when downstream products need to:

- Anchor a job posting against process language (an "Order Management Analyst" works on PCF 4.0 Deliver Physical Products and SCOR Deliver, not on a NAICS code).
- Crosswalk a benchmark report ("our Source-to-Pay cycle is X days") against the source taxonomy of the framework producing the benchmark.
- Drive process-mining or RPA pipelines that need standard activity vocabularies, not industry codes.
- Support consultative selling that maps a customer's pain ("our IT change-management is broken") to a published framework anchor (ITIL 4 IL.18 Change Enablement) rather than a free-text category.
## System comparison | System | Codes | Scope | Maintained By | |--------|-------|-------|---------------| | APQC PCF (Skeleton) | 13 | Cross-industry process classification, top-level categories | APQC | | SCOR Model | 17 (L1+L2) | Supply Chain Operations Reference (Plan / Source / Make / Deliver / Return / Enable / Orchestrate) | ASCM (formerly APICS) | | ITIL 4 | 26 (L1+L2 practices) | IT service management, 25 ITIL 4 practices across General / Service / Technical | AXELOS / PeopleCert | | COBIT 2019 | 44 (5 governance domains + 40 objectives) | IT governance and management framework | ISACA | | PMBOK 7th Ed | 21 (8 performance domains + 12 principles) | Project management body of knowledge | PMI | | PRINCE2 | 15 | Projects in Controlled Environments (UK Cabinet Office method) | AXELOS | | Six Sigma | 16 | DMAIC / DMADV process improvement methodology | ASQ | | Lean Tools | 15 | Lean manufacturing / lean management toolkit | various (Toyota Production System lineage) | | TOGAF ADM | 14 | Enterprise architecture method (Architecture Development Method phases) | The Open Group | | ArchiMate | 14 | Enterprise architecture modelling language | The Open Group | | SCOR Model (extended) | included above | also covers performance attributes and best-practice recommendations | ASCM | ## How these relate ```mermaid graph LR PCF[APQC PCF
13 cross-industry categories] SCOR[SCOR
supply chain ops] ITIL[ITIL 4
IT service mgmt] COBIT[COBIT 2019
IT governance] PMBOK[PMBOK 7
project mgmt] PCF -->|4.0 Deliver Physical / 5.0 Deliver Services overlap| SCOR PCF -->|8.0 Manage IT overlap| ITIL PCF -->|8.0 Manage IT + 11.0 Risk overlap| COBIT PCF -->|13.0 Develop and Manage Business Capabilities overlap| PMBOK ``` APQC PCF is the most general anchor. The others specialize: - **SCOR** elaborates what PCF 4.0 (Deliver Physical Products) and 5.0 (Deliver Services) actually involve at the supply-chain level. Process Plan / Source / Make / Deliver / Return / Enable / Orchestrate. - **ITIL 4** elaborates what PCF 8.0 (Manage IT) involves at the service-management level. 25 practices grouped General / Service Mgmt / Technical. - **COBIT 2019** governs IT (overlaps PCF 8.0 and 11.0 Manage Enterprise Risk). 5 domains (EDM, APO, BAI, DSS, MEA), 40 objectives. - **PMBOK 7** elaborates what PCF 13.0 (Develop and Manage Business Capabilities) means when you are managing it as a portfolio of projects. - **Six Sigma / Lean** are improvement methodologies that span PCF categories rather than living inside one. ## Crosswalk navigation WoT carries Level-1 conceptual crosswalks between APQC PCF and the supply-chain / IT / project-management frameworks listed above. These are tagged `match_type='related'` rather than `'exact'` because they are *conceptual overlaps* (PCF 8.0 and ITIL 4 both cover IT operations), not strict identity. Use them as anchoring hints, not as substitution rules. ```bash # Find what overlaps with APQC PCF 8.0 (Manage IT) GET /api/v1/systems/apqc_pcf/nodes/8.0/equivalences # Find what SCOR Deliver maps to in PCF GET /api/v1/systems/scor_model/nodes/SC.04/equivalences ``` ## What WoT does not host - **Per-industry APQC PCFs** (Banking, Healthcare, Telecom, etc.). APQC publishes industry-specific variants of the PCF; only the cross-industry skeleton is in this PR. - **APQC PCF Levels 2-5** (~1,500 detailed process elements). The full tree requires APQC's official spreadsheet (free with registration). 
The ingester (`world_of_taxonomy/ingest/apqc_pcf.py`) is structured for in-place extension when that file is provided; the system_id stays `apqc_pcf` so existing crosswalks survive.
- **BPMN / DMN / CMMN notations**. These are graphical modelling notations, not classification systems. They fail the inclusion policy's "stable identifiers" and "enumerated / hierarchical" tests because every vendor's BPMN library defines its own element subset.
- **ISO 15926** (process plant data integration). Behind ISO paywall and per-part licensing; deferred unless a paying customer explicitly asks for it.

## Use cases

1. **Process discovery for consulting engagements.** Anchor a discovery conversation against PCF Level-1 categories so all stakeholders use the same vocabulary.
2. **Job-architecture design.** Map a role to PCF + SCOR + ITIL elements to define what "good" looks like for the role's deliverables.
3. **Benchmark normalization.** When a vendor cites "our Order-to-Cash cycle averages 7 days," anchor against PCF 4.0 + SCOR Deliver to compare against your own metrics.
4. **RPA / process mining tagging.** Use PCF + SCOR / ITIL codes as the controlled vocabulary for activity logs feeding process-mining tools.
5. **Compliance scoping.** Map a control framework (NIST CSF, ISO 27001) against COBIT objectives to identify which IT processes the controls actually touch.

## Related reading

- [Industry Classification Guide](./industry-classification.md) - the orthogonal sector axis.
- [Crosswalk Map](./crosswalk-map.md) - how systems connect via equivalence edges.
- [Inclusion Policy](./inclusion-policy.md) - why BPMN and ISO 15926 are not in WoT.
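
The "anchoring hints, not substitution rules" guidance from the Crosswalk navigation section can be enforced on the client side. The following is a minimal sketch, assuming the equivalences endpoint returns a JSON array of objects with `target_system`, `target_code`, and `match_type` fields (names borrowed from the `equivalence` table columns, not from a documented response schema). `split_by_match_type` is a hypothetical helper and the sample payload is made up for illustration.

```python
from collections import defaultdict


def split_by_match_type(equivalences):
    """Group crosswalk edges by match_type so that 'related' hints are
    never silently treated as 'exact' substitutions."""
    grouped = defaultdict(list)
    for edge in equivalences:
        # Assumption: an edge without a match_type is treated as the
        # weakest kind ('related') rather than as an identity mapping.
        kind = edge.get("match_type", "related")
        grouped[kind].append((edge["target_system"], edge["target_code"]))
    return dict(grouped)


# Illustrative payload only; real responses may be shaped differently.
sample = [
    {"target_system": "scor_model", "target_code": "SC.04", "match_type": "related"},
    {"target_system": "example_system", "target_code": "E1", "match_type": "exact"},
    {"target_system": "example_system", "target_code": "E2"},
]

grouped = split_by_match_type(sample)
substitutable = grouped.get("exact", [])   # safe to swap one code for the other
hints = grouped.get("related", [])         # starting points for human review
```

Keeping the two lists separate makes it harder for downstream code to use a conceptual overlap as if it were a one-to-one mapping.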
========================================================================
# Technology Standards and Miscellaneous Specifications
========================================================================

# Technology Standards and Specifications

> **TL;DR:** WoT hosts ~25 technology-standards taxonomies that don't fit cleanly into the industry / regulatory / academic / financial buckets but still describe stable enumerated value spaces: telecom and networking specs (3GPP, ITU-T, ITU-R, IETF RFC, IEEE), connectivity (Bluetooth, USB, PCI-SIG, JEDEC, SEMI, VESA), web/internet (MIME types, HTTP status codes, SPDX licenses), cybersecurity catalogs (MITRE ATT&CK, CVE types, OWASP Top 10), AI / cloud-native taxonomies (AI Model Types, Cloud Native Landscape), and supplementary engineering / scientific reference (SI Units, Container ISO 6346; the Periodic Table grouping lives on the environmental page). It also covers a handful of regulatory administrative anchors that don't belong in regulatory-standards.md (CFR Titles, USC Titles, IRS Forms, VAT Rate Types).

---

## Telecom and networking standards

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `3gpp_specs` | 18 | 3GPP | 3GPP technical specifications (cellular, 5G, etc.) |
| `itu_t` | 19 | ITU-T | International Telecommunication Union telecom recommendations |
| `itu_r_bands` | 16 | ITU-R | ITU-R radio frequency band designations |
| `ietf_rfc` | 15 | IETF | IETF RFC categorical groupings |
| `ieee_standards` | 14 | IEEE | IEEE standards (skeleton; 802.x family, etc.) |

## Connectivity and hardware specifications

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `bluetooth_profiles` | 17 | Bluetooth SIG | Bluetooth Special Interest Group profiles |
| `usb_classes` | 23 | USB-IF | USB Implementers Forum device class codes |
| `pci_sig` | 14 | PCI-SIG | PCI Special Interest Group specifications |
| `jedec` | 14 | JEDEC | Joint Electron Device Engineering Council standards |
| `semi_standards` | 14 | SEMI | Semiconductor Equipment and Materials International standards |
| `vesa_standards` | 13 | VESA | Video Electronics Standards Association specifications |

## Web, internet, and software

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `mime_types` | 16 | IANA | Media (MIME) type categories |
| `http_status` | 17 | IETF / IANA | HTTP status code classes (1xx-5xx and detail) |
| `spdx_licenses` | 17 | Linux Foundation | SPDX License List groupings |
| `ai_model_type` | 17 | curated | AI / ML model types (curated WoT vocabulary) |
| `cloud_native` | 15 | CNCF | Cloud Native Computing Foundation landscape categories |

## Cybersecurity catalogs

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `mitre_attack` | 15 | MITRE | MITRE ATT&CK adversary tactics and techniques |
| `cve_types` | 16 | MITRE | CVE / CWE vulnerability and weakness type categories |
| `owasp_top10` | 11 | OWASP | OWASP Top 10 web application security risks |
| `wcag` | 17 | W3C | Web Content Accessibility Guidelines (WCAG 2.x) |

WCAG sits here for taxonomic reasons (it's a W3C specification with stable success-criterion identifiers) even though its primary use is accessibility compliance, which overlaps the Regulatory Standards page.
## Engineering and scientific reference

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `si_units` | 19 | BIPM | SI base and derived unit categories |
| `container_iso` | 14 | ISO 6346 | Standard intermodal container type codes |
| `nato_codification` | 19 | NATO | NATO Stock Number / Codification System (skeleton) |
| `dod_mil_std` | 15 | US DoD | US Department of Defense MIL-STD categories |
| `un_ammo` | 14 | UN ECOSOC | UN ammunition / dangerous-goods identification (IATG) |
| `stanag` | 16 | NATO | NATO Standardization Agreement categories |
| `isa_standards` | 12 | ISA | International Society of Automation standards |

## Regulatory administrative anchors

These are titling / numbering schemes rather than substantive regulations. They don't fit the [Regulatory Standards page](./regulatory-standards.md) (which covers regulations themselves) but are stable enumerated taxonomies often referenced when citing regulations.

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `cfr_titles` | 19 | US OFR | Code of Federal Regulations title list (Title 1 through 50) |
| `cfr_title_49` | 104 | US DOT | CFR Title 49 (Transportation) detailed parts |
| `usc_titles` | 23 | US OLRC | US Code title list |
| `irs_forms` | 15 | IRS | IRS form-type categories |
| `vat_rates` | 14 | various | VAT rate types (standard, reduced, super-reduced, zero) |
| `gdpr_articles` | 110 | EDPB | GDPR article-level breakdown (companion to `reg_eu_ai_act` / `reg_eprivacy` for legal-tech use) |
| `gdpr_basis` | 16 | EDPB | GDPR Article 6 lawful bases + special-category Article 9 bases |
| `gdpr_rights` | 13 | EDPB | GDPR data-subject rights (access, rectification, erasure, etc.) |
| `data_retention` | 16 | curated | Data retention period categories (curated WoT vocabulary) |

## Other technical reference

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `iso_31000` | 47 | ISO | ISO 31000 risk management principles and process |
| `gri_standards` | 38 | GRI | Global Reporting Initiative sustainability reporting standards (companion to env / financial pages) |
| `tcfd` | 14 | TCFD | TCFD recommendations (companion to financial-systems page) |
| `gs1_standards` | 14 | GS1 | GS1 standards meta-catalog (parent of GS1 GPC, GTIN, etc.) |
| `edi_standards` | 14 | various | EDI standards categorization (X12, EDIFACT, etc.) |

## Logistics and freight

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `nmfc` | 19 | NMFTA | National Motor Freight Classification |
| `stcc` | 26 | AAR | Standard Transportation Commodity Code |
| `imo_ship_type` | 17 | IMO | IMO ship type classification |
| `imo_vessel` | 17 | IMO | IMO vessel type categories |
| `iata_aircraft` | 14 | IATA | IATA aircraft type codes |
| `faa_aircraft_cat` | 16 | FAA | FAA aircraft category and class designations |
| `uic_railway` | 15 | UIC | UIC railway codes |
| `icao_doc4444` | 15 | ICAO | ICAO flight rules and procedures (Doc 4444) |
| `wco_safe` | 14 | WCO | World Customs Organization SAFE Framework of Standards |

## Trade tariff / customs minor systems

These complement the main [Trade Codes page](./trade-codes.md), which covers HS, CPC, UNSPSC, SITC, and BEC.
| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `eu_taric` | 22 | European Commission | EU TARIC (Integrated Tariff of the European Union) |
| `uk_trade_tariff` | 22 | UK Government | UK Trade Tariff |
| `gcc_tariff` | 17 | GCC | Gulf Cooperation Council common tariff |
| `ecowas_cet` | 14 | ECOWAS | ECOWAS Common External Tariff |
| `prodcom` | 38 | Eurostat | EU PRODCOM (industrial production statistics) |
| `cpv_2008` | 96 | European Commission | Common Procurement Vocabulary 2008 |
| `coicop` | 62 | UN | Classification of Individual Consumption According to Purpose |
| `eccn` | 58 | US BIS | Export Control Classification Number |
| `schedule_b` | 119 | US Census | Schedule B export classification |
| `hts_us` | 120 | USITC | Harmonized Tariff Schedule of the United States |

## Sports, culture, miscellaneous

| System | Codes | Authority | Scope |
|--------|-------|-----------|-------|
| `olympic_sports` | 16 | IOC | Olympic sports categories |
| `fifa_confederations` | 14 | FIFA | FIFA football confederations |
| `pantone_families` | 12 | Pantone | Pantone color family groupings |
| `ral_colors` | 13 | RAL | RAL color standard families |
| `isrc_format` | 13 | IFPI | International Standard Recording Code structure |
| `isbn_groups` | 13 | International ISBN Agency | ISBN agency / language groups |
| `richter_scale` | 13 | various | Richter / earthquake magnitude scale |
| `usda_soil` | 13 | USDA | USDA soil taxonomy (skeleton; companion to environmental) |
| `oecd_dac` | 62 | OECD | OECD Development Assistance Committee sector codes |
| `seea` | 47 | UN | System of Environmental-Economic Accounting (SEEA) |
| `lme_metals` | 15 | LME | London Metal Exchange traded metals |
| `opec_basket` | 14 | OPEC | OPEC reference basket of crude oils |
| `naic_lines` | 30 | NAIC | NAIC insurance lines of business (companion to financial-systems) |
| `haccp` | 13 | Codex Alimentarius | HACCP food-safety principles |
| `codex_committees` | 19 | FAO / WHO | Codex Alimentarius committee structure |
| `allergen_list` | 15 | EU | EU 14 major allergens (Annex II of Regulation 1169/2011) |
| `ibc_2021` | 26 | ICC | International Building Code 2021 |
| `nfpa_codes` | 17 | NFPA | National Fire Protection Association codes |
| `rics_valuation` | 14 | RICS | Royal Institution of Chartered Surveyors valuation standards |
| `contract_types` | 16 | curated | Contract type categories (curated WoT vocabulary) |
| `board_committee` | 14 | curated | Board / committee structure types (curated) |
| `shrm_competency` | 16 | SHRM | SHRM HR competency model |
| `job_family` | 19 | curated | Job family taxonomy (curated WoT vocabulary; complements occupation systems) |
| `emoji_categories` | 13 | Unicode Consortium | Unicode emoji category groupings |
| `breeam` | 17 | BRE | BREEAM (also referenced in environmental-standards) |
| `leed_v4_1` | 14 | USGBC | LEED v4.1 (also referenced in environmental-standards) |

## Why this is a "miscellaneous" page

The systems above are real, published, stable, and within size cap (per the [Inclusion Policy](./inclusion-policy.md)) - so they belong in WoT - but they don't form a coherent topical cluster that justifies a dedicated page on its own. Putting them here keeps them findable while honoring the Karpathy four-channel pattern: catalog row plus topical context plus llms-full.txt presence plus wiki API exposure.

If your downstream product cares about a specific subset (say, telecom standards or cybersecurity catalogs), filter by system_id prefix at the API:

```bash
curl "https://worldoftaxonomy.com/api/v1/systems?prefix=itu_"
curl "https://worldoftaxonomy.com/api/v1/systems?prefix=mitre_"
```

## Related reading

- [Inclusion Policy](./inclusion-policy.md) - why these qualify even though they're small.
- [Regulatory Standards](./regulatory-standards.md) - GDPR, NIST, ISO management standards (complements the GDPR-articles entry on this page).
- [Process and Activity Frameworks](./process-frameworks.md) - APQC PCF, SCOR, ITIL 4 (overlaps WCAG and ISO 31000 conceptually).

========================================================================
# System Architecture and Data Flows
========================================================================

## System Architecture and Data Flows

> **TL;DR:** Three consumer interfaces (web app, REST API, MCP server) backed by PostgreSQL and a wiki knowledge layer. Data flows from 1,000 official sources through an ingestion pipeline into three core tables. Wiki content serves four channels from one source of truth.

---

## System architecture overview

The platform serves three consumer interfaces - a web application, a REST API, and an MCP server - all backed by a shared PostgreSQL database and wiki knowledge layer.

```mermaid
graph TB
    subgraph Data["Data Layer"]
        PG[(PostgreSQL)]
        WIKI["wiki/*.md files"]
    end
    subgraph Backend["Python Backend"]
        INGEST["Ingesters - 1,000+ systems"]
        API["FastAPI REST API - /api/v1/*"]
        MCP["MCP Server - stdio transport"]
        WIKILOADER["Wiki Loader - wiki.py"]
    end
    subgraph Frontend["Next.js Frontend"]
        NEXT["Next.js 16 App Router"]
        GUIDE["/guide/* pages"]
    end
    subgraph Consumers
        BROWSER["Web Browsers"]
        AIAGENT["AI Agents - Claude, GPT, etc."]
        CRAWLER["AI Crawlers - Perplexity, etc."]
        DEV["Developer Applications"]
    end
    INGEST -->|ingest| PG
    API -->|query| PG
    MCP -->|query| PG
    WIKILOADER -->|read| WIKI
    MCP -->|instructions| WIKILOADER
    NEXT -->|proxy /api/*| API
    NEXT -->|read| WIKI
    GUIDE -->|render| WIKI
    BROWSER --> NEXT
    BROWSER --> GUIDE
    AIAGENT --> MCP
    CRAWLER -->|/llms-full.txt| NEXT
    DEV --> API
```

## Four-channel wiki data flow

The wiki system follows the "write once, serve four ways" pattern. A single set of curated markdown files feeds all distribution channels.
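
As one illustration of feeding a channel from the same markdown files, the sketch below concatenates a directory of pages into a single plain-text document with banner separators similar to the ones used in this file. It is an assumption about how such a build step could look, not the project's actual generator: `build_llms_full`, the 72-character banner, and the rule of taking the title from the first heading are all hypothetical.

```python
import tempfile
from pathlib import Path

BANNER = "=" * 72


def build_llms_full(wiki_dir):
    """Concatenate every *.md file under wiki_dir into one text blob."""
    parts = []
    for path in sorted(Path(wiki_dir).glob("*.md")):
        text = path.read_text(encoding="utf-8").strip()
        # Hypothetical rule: the first markdown heading becomes the banner
        # title; fall back to the file stem when no heading exists.
        title = path.stem
        for line in text.splitlines():
            if line.startswith("#"):
                title = line.lstrip("#").strip()
                break
        parts.append(f"{BANNER}\n# {title}\n{BANNER}\n\n{text}\n")
    return "\n".join(parts)


# Small demonstration against throwaway files rather than the real wiki.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "b-second.md").write_text("## Second page\n\nBody B.\n", encoding="utf-8")
    Path(tmp, "a-first.md").write_text("## First page\n\nBody A.\n", encoding="utf-8")
    combined = build_llms_full(tmp)
```

Sorting the paths keeps the output deterministic, which matters if the combined file is regenerated on every build and diffed between releases.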
```mermaid
graph LR
    MD["wiki/*.md - Source of Truth"] --> CH1["Channel 1: Next.js /guide/slug - SEO Web Pages"]
    MD --> CH2["Channel 2: MCP instructions - AI Agent Context"]
    MD --> CH3["Channel 3: llms-full.txt - AI Crawler Discovery"]
    MD --> CH4["Channel 4: GET /api/v1/wiki - Developer API"]
    CH1 --> GOOGLE["Search Engines"]
    CH1 --> HUMANS["Human Readers"]
    CH2 --> AGENTS["AI Agents"]
    CH3 --> CRAWLERS["AI Crawlers"]
    CH4 --> DEVS["Developer Apps"]
```

| Channel | Format | Refresh | Audience |
|---------|--------|---------|----------|
| Web pages at /guide/ | Server-rendered HTML with SEO metadata | Static generation at build time | Human readers, search engines |
| MCP instructions | Plain text injected at session start | Loaded on MCP initialize | AI agents (Claude, GPT, Gemini) |
| llms-full.txt | Concatenated plain text | Regenerated on build | AI crawlers (Perplexity, Google AI) |
| Wiki API | JSON with raw markdown | On-demand from disk | Developer applications, RAG pipelines |

## Classification data ingestion pipeline

Raw data from official sources flows through the ingestion pipeline into three database tables.

```mermaid
graph TD
    subgraph Sources["Official Sources"]
        CSV["CSV files - NAICS, ISIC"]
        XLSX["Excel files - NACE, ANZSIC"]
        HTML["HTML/PDF - SIC, NIC"]
        CURATED["Expert-Curated - Domain taxonomies"]
    end
    subgraph Pipeline["Ingestion Pipeline"]
        PARSE["Parse and Validate"]
        UPSERT["Upsert Nodes into classification_node"]
        XWALK["Build Crosswalks into equivalence"]
        PROV["Set Provenance - 4-tier audit"]
    end
    subgraph DB["Database Tables"]
        SYS["classification_system - 1,000+ systems"]
        NODE["classification_node - 1.3M+ nodes"]
        EQUIV["equivalence - 321K+ edges"]
    end
    CSV --> PARSE
    XLSX --> PARSE
    HTML --> PARSE
    CURATED --> PARSE
    PARSE --> UPSERT
    PARSE --> XWALK
    PARSE --> PROV
    UPSERT --> NODE
    XWALK --> EQUIV
    PROV --> SYS
    SYS --- NODE
    NODE --- EQUIV
```

### Ingestion steps

1. **Parse**: Read the source file (CSV, Excel, HTML, or hardcoded data).
Validate code format, hierarchy, and completeness.
2. **Upsert nodes**: Insert or update rows in `classification_node` with code, title, description, level, parent_code, is_leaf, and seq_order.
3. **Build crosswalks**: Create bidirectional edges in the `equivalence` table with match_type (exact, partial, broader, narrower, related).
4. **Set provenance**: Update `classification_system` with data_provenance tier, source_url, source_date, license, and source_file_hash.

## API request flow

Every API request passes through rate limiting and authentication before reaching the query layer.

```mermaid
sequenceDiagram
    participant C as Client
    participant RL as Rate Limiter
    participant AUTH as Auth Layer
    participant R as Router
    participant Q as Query Layer
    participant DB as PostgreSQL
    C->>RL: GET /api/v1/search?q=physician
    RL->>RL: Check rate - 30/min anon, 1000/min auth
    RL->>AUTH: Forward request
    AUTH->>AUTH: Validate session cookie or API key
    AUTH->>R: Authenticated request
    R->>Q: search(conn, query, limit)
    Q->>DB: SELECT with ts_vector query
    DB-->>Q: Matching nodes
    Q-->>R: Results with system context
    R-->>C: JSON response
```

### Rate limit tiers

| Tier | Requests/Minute | Daily Limit | Best For |
|------|-----------------|-------------|----------|
| Anonymous | 30 | Unlimited | Quick exploration |
| Free | 1,000 | Unlimited | Development |
| Pro | 5,000 | 100,000 | Production apps |
| Enterprise | 50,000 | Unlimited | High-volume |

## MCP session lifecycle

When an AI agent connects to the MCP server, it receives structural knowledge about the entire knowledge graph before making any tool calls.
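
To make the first step of that lifecycle concrete, here is a minimal sketch of the `initialize` request a client might write to the server's stdin. The message layout follows JSON-RPC 2.0 and the public Model Context Protocol conventions. The `protocolVersion` string, the client name and version, and the helper `make_initialize_request` are assumptions for illustration; this project's server may expect different values.

```python
import json


def make_initialize_request(request_id=1, client_name="example-client"):
    """Build a JSON-RPC 2.0 'initialize' request as a single line of JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            # Assumed protocol revision; check what the server advertises.
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": "0.0.1"},
        },
    }
    # stdio transports commonly frame messages as one JSON object per line,
    # so the serialized request must not contain embedded newlines.
    return json.dumps(message) + "\n"


line = make_initialize_request()
decoded = json.loads(line)
```

A client would write `line` to the server process's stdin and then read one line back; the reply to this request is where the wiki-derived instructions are expected to arrive.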
```mermaid
sequenceDiagram
    participant AI as AI Agent
    participant MCP as MCP Server
    participant WIKI as Wiki Loader
    participant DB as PostgreSQL
    AI->>MCP: initialize - JSON-RPC
    MCP->>WIKI: build_wiki_context()
    WIKI-->>MCP: Structural knowledge - ~15K tokens
    MCP-->>AI: serverInfo + instructions + capabilities
    Note over AI: Agent now knows all 1,000+ systems and crosswalk topology
    AI->>MCP: tools/call search_classifications
    MCP->>DB: Query nodes
    DB-->>MCP: Results
    MCP-->>AI: Tool response as JSON
    AI->>MCP: resources/read taxonomy://wiki/crosswalk-map
    MCP->>WIKI: load_wiki_page - crosswalk-map
    WIKI-->>MCP: Full markdown content
    MCP-->>AI: Resource content
```

### MCP capabilities

The server advertises 26 tools and wiki resources:

- **Tools**: list_classification_systems, search_classifications, get_industry, browse_children, get_equivalences, translate_code, classify_business, get_audit_report, and 18 more
- **Resources**: taxonomy://systems, taxonomy://stats, taxonomy://wiki/{slug} for each guide page

## Database schema

The three core tables and their relationships:

```mermaid
erDiagram
    classification_system {
        string id PK
        string name
        string region
        string data_provenance
        string source_url
        string source_file_hash
    }
    classification_node {
        string system_id FK
        string code
        string title
        int level
        string parent_code
        boolean is_leaf
    }
    equivalence {
        string source_system FK
        string source_code
        string target_system FK
        string target_code
        string match_type
    }
    classification_system ||--o{ classification_node : "has"
    classification_system ||--o{ equivalence : "source"
    classification_system ||--o{ equivalence : "target"
```

- Parent-child hierarchy within a system is modeled by `classification_node.parent_code`
- Crosswalk edges are bidirectional: if A maps to B, B maps to A

## Technology stack

| Layer | Technology | Purpose |
|-------|-----------|---------|
| Database | PostgreSQL (with pgbouncer) | 1.3M+ nodes, 321K+ edges |
| Backend | Python 3.9+, FastAPI, asyncpg | REST API + MCP
server | | Frontend | Next.js 16, TypeScript, Tailwind CSS v4, shadcn/ui | Web application | | Visualization | D3.js (Galaxy View), Cytoscape.js (Crosswalk Explorer) | Interactive graphs | | Auth | Magic-link cookie session + API keys (`wot_` prefix) | Tiered access | | Rate Limiting | slowapi | Per-tier enforcement | | MCP | Custom JSON-RPC over stdio | AI agent integration | | Content | Markdown + remark + remarkGfm | Wiki and blog rendering | ## Self-hosting Two commands to run everything locally: ```bash git clone https://github.com/colaberry/WorldOfTaxonomy.git cd World Of Taxonomy && docker compose up ``` Web app at `localhost:3000`. API at `localhost:8000`. MCP server via `python -m world_of_taxonomy mcp`. ## Provenance Markers The following identifiers are unique to the World Of Taxonomy project and are used for provenance verification. They are not classification codes and have no meaning outside this file: - WOT-CANARY-7a2f9c1e4d6b - WOT-CANARY-3b08fa5e91c7 - WOT-CANARY-c4d21e8f0a63