Overview

The upsert endpoint intelligently creates a new entity or updates an existing one based on configurable duplicate detection strategies. It automatically handles conflicts and prevents duplicate records using exact matching, fuzzy matching, or AI-powered similarity detection.

Endpoint

PUT http://api.gu1.ai/entities/upsert

Authentication

Requires a valid API key in the Authorization header:
Authorization: Bearer YOUR_API_KEY

Request Body

entity
object
required
The entity data (same structure as Create Entity), including optional root fields such as email, phone, and nationality (ISO 3166-1 alpha-2 or mappable label).
The schema may accept the same keys as Create Entity (monitoring, autoExecuteIntegrations), but upsert does not run creation-time enrichments or apply monitoring. Use POST /entities or POST /entities/automatic when you need that behavior.
options
object
Configuration options for upsert behavior
options.conflictResolution
enum
How to handle conflicts when an existing entity is found:
  • source_wins - New data overwrites existing data
  • target_wins - Keep existing data, ignore new data
  • manual_review - Flag for manual review without updating
  • smart_merge (default) - Intelligently merge both datasets
options.deduplicationStrategy
enum
Strategy for detecting duplicate entities:
  • exact_match - Match by externalId and taxId (case-insensitive)
  • fuzzy_match - Similarity matching on name and taxId (80% threshold)
  • ai_similarity - AI-powered semantic similarity detection
  • hybrid (recommended) - Exact match with fuzzy fallback
options.createRelationships
boolean
default: true
Whether to automatically create relationships between entities
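
The request body described above can be assembled and checked locally before it is sent. The following is a minimal sketch, not part of any official client: build_upsert_body is a hypothetical helper name, and the allowed values are copied from the option descriptions above.

```python
from typing import Any, Optional

# Enum values copied from the option descriptions in this section.
CONFLICT_RESOLUTIONS = {"source_wins", "target_wins", "manual_review", "smart_merge"}
DEDUPLICATION_STRATEGIES = {"exact_match", "fuzzy_match", "ai_similarity", "hybrid"}


def build_upsert_body(
    entity: dict[str, Any],
    conflict_resolution: Optional[str] = None,
    deduplication_strategy: Optional[str] = None,
    create_relationships: Optional[bool] = None,
) -> dict[str, Any]:
    """Hypothetical helper: assemble an upsert body, rejecting unknown enum values."""
    options: dict[str, Any] = {}
    if conflict_resolution is not None:
        if conflict_resolution not in CONFLICT_RESOLUTIONS:
            raise ValueError(f"unknown conflictResolution: {conflict_resolution!r}")
        options["conflictResolution"] = conflict_resolution
    if deduplication_strategy is not None:
        if deduplication_strategy not in DEDUPLICATION_STRATEGIES:
            raise ValueError(f"unknown deduplicationStrategy: {deduplication_strategy!r}")
        options["deduplicationStrategy"] = deduplication_strategy
    if create_relationships is not None:
        options["createRelationships"] = create_relationships

    body: dict[str, Any] = {"entity": entity}
    if options:
        # Omit "options" entirely so the server-side defaults apply.
        body["options"] = options
    return body
```

Options left unset are omitted from the body, so the documented defaults (smart_merge, createRelationships true) stay in effect.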

Response

success
boolean
Indicates if the operation succeeded
action
string
The action performed: created or updated
entity
object
The final entity state after upsert
previousEntity
object
The entity state before update (null if newly created)
confidence
number
Confidence score (0-1) for the duplicate detection match
reasoning
string
Explanation of why the entity was created/updated
conflicts
array
Array of field-level conflicts detected during merge (if any)
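
As an illustration of how these response fields fit together, the sketch below turns a parsed response into log lines. describe_upsert is a hypothetical helper, and the conflict keys (field, oldValue, newValue, resolution) are taken from the response examples on this page rather than from a published schema.

```python
def describe_upsert(result: dict) -> list[str]:
    """Hypothetical helper: summarize an upsert response as human-readable lines."""
    lines = [
        f"{result['action']} (confidence {result['confidence']:.2f}): {result['reasoning']}"
    ]
    # "conflicts" may be empty or absent when nothing had to be reconciled.
    for conflict in result.get("conflicts") or []:
        lines.append(
            f"  {conflict['field']}: {conflict['oldValue']!r} -> {conflict['newValue']!r}"
            f" ({conflict.get('resolution', 'unresolved')})"
        )
    return lines
```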

Deduplication Strategies

Exact Match

Matches entities based on exact field comparison (case-insensitive):
  • Fields: externalId, taxId
  • Use case: When you have reliable unique identifiers
  • Speed: Fastest
  • Accuracy: 100% for identical values

Fuzzy Match

Uses Levenshtein distance for similarity matching:
  • Fields: name, taxId
  • Threshold: 80% similarity
  • Use case: When dealing with typos or variations
  • Speed: Moderate
  • Accuracy: High for similar strings
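
To make the 80% threshold concrete, here is a minimal sketch of Levenshtein-based similarity. It assumes the score is normalized by the longer string's length and that comparison is case-insensitive; the service's actual normalization is not documented here, so treat both as assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def is_fuzzy_match(a: str, b: str, threshold: float = 0.80) -> bool:
    return similarity(a, b) >= threshold
```

Under these assumptions, "Maria Gonzales" and "Maria Gonzalez" differ by one edit over 14 characters, which clears the threshold comfortably.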

AI Similarity

AI-powered semantic similarity detection:
  • Method: Vector embeddings and cosine similarity
  • Use case: Complex multi-field matching
  • Speed: Slower
  • Accuracy: Highest for semantically similar entities
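
Cosine similarity itself is simple to state; the sketch below shows the comparison step only. How the service produces embeddings, and which threshold it applies to the score, are not described on this page, so the input vectors here are placeholders.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A zero vector has no direction; report no similarity.
        return 0.0
    return dot / (norm_u * norm_v)
```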

Hybrid (Recommended)

Combines exact and fuzzy matching:
  • Primary: Exact match on identifiers
  • Fallback: Fuzzy match on names
  • Confidence threshold: 80%
  • Use case: Best balance of speed and accuracy
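
The exact-then-fuzzy flow can be sketched as follows. This is an illustration of the control flow only: find_duplicate is a hypothetical name, and the fuzzy step uses the standard library's difflib.SequenceMatcher as a stand-in rather than the Levenshtein scoring the service describes.

```python
from difflib import SequenceMatcher
from typing import Optional


def find_duplicate(
    incoming: dict, existing: list[dict], threshold: float = 0.80
) -> tuple[Optional[dict], float]:
    """Hypothetical sketch: return (matching entity or None, confidence)."""
    # Primary: exact, case-insensitive match on identifiers.
    for field in ("externalId", "taxId"):
        value = incoming.get(field)
        if not value:
            continue
        for candidate in existing:
            other = candidate.get(field)
            if other and other.lower() == value.lower():
                return candidate, 1.0

    # Fallback: best fuzzy match on name (stand-in similarity measure).
    name = (incoming.get("name") or "").lower()
    best, best_score = None, 0.0
    for candidate in existing:
        candidate_name = (candidate.get("name") or "").lower()
        score = SequenceMatcher(None, name, candidate_name).ratio()
        if score > best_score:
            best, best_score = candidate, score

    if best is not None and best_score >= threshold:
        return best, best_score
    return None, 0.0
```

Trying identifiers first keeps the common case cheap; name comparison only runs when no identifier matches.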

Conflict Resolution Strategies

smart_merge (Default)

Intelligently merges data from both sources:
  • Priority: Newer data for simple fields
  • Arrays: Merges and deduplicates
  • Objects: Deep merge with conflict detection
  • Empty values: Preserves non-empty existing values
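
The four rules above can be read as a recursive merge. The sketch below is one plausible interpretation, not the service's implementation: smart_merge is a hypothetical local function, "empty" is assumed to mean None, empty string, empty list, or empty object, and the conflict record shape mirrors the response examples on this page.

```python
from typing import Any


def _is_empty(value: Any) -> bool:
    # Assumed definition of "empty" for this sketch.
    return value is None or value == "" or value == [] or value == {}


def smart_merge(existing: dict, incoming: dict, conflicts: list, prefix: str = "") -> dict:
    """Merge incoming into existing, appending field-level conflicts to `conflicts`."""
    merged = dict(existing)
    for key, new in incoming.items():
        path = f"{prefix}{key}"
        old = existing.get(key)
        if _is_empty(new):
            continue  # Empty values: keep whatever already exists.
        if isinstance(old, dict) and isinstance(new, dict):
            # Objects: deep merge, tracking the dotted field path.
            merged[key] = smart_merge(old, new, conflicts, prefix=f"{path}.")
        elif isinstance(old, list) and isinstance(new, list):
            # Arrays: concatenate and drop duplicates, preserving order.
            combined: list = []
            for item in old + new:
                if item not in combined:
                    combined.append(item)
            merged[key] = combined
        else:
            # Simple fields: newer data wins; record the change if values differ.
            if not _is_empty(old) and old != new:
                conflicts.append({"field": path, "oldValue": old, "newValue": new})
            merged[key] = new
    return merged
```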

source_wins

New data completely replaces existing:
  • Use case: When incoming data is authoritative
  • Behavior: All fields from source
  • Risk: May lose valuable existing data

target_wins

Keeps existing data, ignores incoming:
  • Use case: When existing data is authoritative
  • Behavior: No updates performed
  • Risk: May miss important updates

manual_review

Flags conflicts without auto-resolution:
  • Use case: High-stakes data requiring human review
  • Behavior: Creates review task
  • Result: Entity marked for manual resolution

Examples

Simple Upsert (Default Behavior)

curl -X PUT http://api.gu1.ai/entities/upsert \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "entity": {
      "type": "person",
      "externalId": "customer_12345",
      "name": "María González",
      "countryCode": "AR",
      "taxId": "20-12345678-9",
      "entityData": {
        "person": {
          "firstName": "María",
          "lastName": "González",
          "dateOfBirth": "1985-03-15",
          "occupation": "Software Engineer",
          "income": 85000
        }
      }
    }
  }'

Upsert with Exact Match Strategy

curl -X PUT http://api.gu1.ai/entities/upsert \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "entity": {
      "type": "company",
      "externalId": "company_789",
      "name": "Tech Solutions S.A.",
      "countryCode": "BR",
      "taxId": "12.345.678/0001-90",
      "entityData": {
        "company": {
          "legalName": "Tech Solutions Sociedade Anônima",
          "revenue": 7500000
        }
      }
    },
    "options": {
      "deduplicationStrategy": "exact_match",
      "conflictResolution": "smart_merge"
    }
  }'

Upsert with Fuzzy Matching

curl -X PUT http://api.gu1.ai/entities/upsert \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "entity": {
      "type": "person",
      "externalId": "customer_new_123",
      "name": "Maria Gonzales",
      "countryCode": "AR",
      "taxId": "20-12345678-9",
      "entityData": {
        "person": {
          "firstName": "Maria",
          "lastName": "Gonzales"
        }
      }
    },
    "options": {
      "deduplicationStrategy": "fuzzy_match",
      "conflictResolution": "smart_merge"
    }
  }'

Upsert with Hybrid Strategy

curl -X PUT http://api.gu1.ai/entities/upsert \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "entity": {
      "type": "company",
      "externalId": "comp_456",
      "name": "TechSolutions SA",
      "countryCode": "BR",
      "taxId": "12.345.678/0001-90",
      "entityData": {
        "company": {
          "employeeCount": 100
        }
      }
    },
    "options": {
      "deduplicationStrategy": "hybrid"
    }
  }'

Response Examples

Created New Entity

{
  "success": true,
  "action": "created",
  "entity": {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "externalId": "customer_12345",
    "type": "person",
    "name": "María González",
    ...
  },
  "previousEntity": null,
  "confidence": 1.0,
  "reasoning": "No existing entity found matching criteria. Created new entity.",
  "conflicts": []
}

Updated Existing Entity

{
  "success": true,
  "action": "updated",
  "entity": {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "externalId": "customer_12345",
    "type": "person",
    "name": "María González",
    "entityData": {
      "person": {
        "income": 95000
      }
    },
    ...
  },
  "previousEntity": {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "entityData": {
      "person": {
        "income": 85000
      }
    },
    ...
  },
  "confidence": 1.0,
  "reasoning": "Exact match found on externalId. Updated existing entity with smart merge.",
  "conflicts": [
    {
      "field": "entityData.person.income",
      "oldValue": 85000,
      "newValue": 95000,
      "resolution": "source_wins"
    }
  ]
}

Use Cases

Data Import from External System

// Import customer data from CRM, avoiding duplicates
async function importCustomer(crmData) {
  const response = await fetch('http://api.gu1.ai/entities/upsert', {
    method: 'PUT',
    headers: {
      'Authorization': 'Bearer YOUR_API_KEY',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      entity: {
        type: 'person',
        externalId: crmData.customerId,
        name: crmData.fullName,
        countryCode: crmData.country,
        taxId: crmData.taxId,
        entityData: {
          person: {
            firstName: crmData.firstName,
            lastName: crmData.lastName,
            income: crmData.annualIncome
          }
        },
        attributes: {
          source: 'crm_import',
          importDate: new Date().toISOString()
        }
      },
      options: {
        deduplicationStrategy: 'hybrid',
        conflictResolution: 'smart_merge'
      }
    })
  });

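  // Fail loudly on 4xx/5xx instead of returning an error body as if it were a result
  if (!response.ok) {
    throw new Error(`Upsert failed with status ${response.status}`);
  }

  return response.json();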
}

Progressive Data Enrichment

import requests


def enrich_entity_data(external_id, new_data):
    """Progressively add data to entity as it becomes available"""
    response = requests.put(
        'http://api.gu1.ai/entities/upsert',
        headers={
            'Authorization': 'Bearer YOUR_API_KEY',
            'Content-Type': 'application/json'
        },
        json={
            'entity': {
                'type': 'person',
                'externalId': external_id,
                'name': new_data.get('name'),
                'countryCode': new_data.get('country'),
                'entityData': new_data.get('details', {}),
                'attributes': new_data.get('attributes', {})
            },
            'options': {
                'deduplicationStrategy': 'exact_match',
                'conflictResolution': 'smart_merge'  # Merge new with existing
            }
        }
    )

    response.raise_for_status()  # Raise on 4xx/5xx rather than parsing an error body
    result = response.json()
    if result['action'] == 'updated':
        print("Enriched existing entity with new data")
    return result

Best Practices

  1. Choose the Right Strategy:
    • exact_match for clean, structured data with reliable IDs
    • fuzzy_match for user-entered data with potential typos
    • hybrid for most production scenarios
  2. Handle Conflicts Gracefully:
    • Use smart_merge for automatic resolution
    • Use manual_review for critical financial data
    • Check conflicts array in response for important changes
  3. Monitor Confidence Scores:
    • Scores below 0.7 may indicate weak matches
    • Log low-confidence updates for review
    • Consider routing matches below a chosen confidence threshold to manual review
  4. Relationship Management:
    • Keep createRelationships: true (the default) to auto-link related entities
    • Useful for transaction-customer, company-person relationships
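
The confidence guidance above can be applied as a small check on each response. A minimal sketch, assuming the 0.7 figure from this list is used as the cutoff; needs_manual_review is a hypothetical helper, and the right threshold depends on your data.

```python
def needs_manual_review(result: dict, threshold: float = 0.7) -> bool:
    """Hypothetical check: flag updates whose duplicate match was weak."""
    # Only updates involve a match against an existing entity.
    if result.get("action") != "updated":
        return False
    return result.get("confidence", 0.0) < threshold
```

A caller might log flagged results together with the conflicts array so a reviewer can see what changed.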

Error Responses

400 Bad Request

{
  "error": "Invalid tax ID format for country"
}

400 Missing Required Fields

{
  "error": "Missing required fields",
  "missingFields": ["legalName"],
  "requiredFields": ["legalName", "industry"]
}

500 Internal Server Error

{
  "error": "Failed to upsert entity"
}
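
The error bodies above share an error key, and the missing-fields variant adds detail. The sketch below turns a status code and parsed body into a single message; explain_error is a hypothetical helper and handles only the shapes shown on this page.

```python
def explain_error(status_code: int, payload: dict) -> str:
    """Hypothetical helper: build a readable message from an error response."""
    message = payload.get("error", "Unknown error")
    missing = payload.get("missingFields")
    if status_code == 400 and missing:
        # Validation failure that names the fields the caller must supply.
        return f"{message}: {', '.join(missing)}"
    if status_code >= 500:
        return f"{message} (server-side failure, status {status_code})"
    return message
```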

Next Steps