Overview

The upsert endpoint creates a new entity or updates an existing one, depending on whether the configured duplicate detection strategy finds a match. Duplicates can be detected by exact matching, fuzzy matching, or AI-powered similarity, and differences between incoming and existing data are resolved according to the selected conflict resolution strategy.

Endpoint

PUT http://api.gu1.ai/entities/upsert

Authentication

Requires a valid API key in the Authorization header:
Authorization: Bearer YOUR_API_KEY

Request Body

entity
object
required
The entity data (same structure as Create Entity endpoint)
options
object
Configuration options for upsert behavior
options.conflictResolution
enum
How to handle conflicts when an existing entity is found:
  • source_wins - New data overwrites existing data
  • target_wins - Keep existing data, ignore new data
  • manual_review - Flag for manual review without updating
  • smart_merge (default) - Intelligently merge both datasets
options.deduplicationStrategy
enum
Strategy for detecting duplicate entities:
  • exact_match - Match by externalId and taxId (case-insensitive)
  • fuzzy_match - Similarity matching on name and taxId (80% threshold)
  • ai_similarity - AI-powered semantic similarity detection
  • hybrid (recommended) - Exact match with fuzzy fallback
options.createRelationships
boolean
default: true
Whether to automatically create relationships between entities
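
The request body described above can be assembled with a small helper. This is a minimal sketch: `build_upsert_payload` is a hypothetical function name, not part of any official SDK, and the allowed values are copied from the option lists above. Client-side validation is an illustrative choice, not documented behavior.

```python
CONFLICT_RESOLUTIONS = {"source_wins", "target_wins", "manual_review", "smart_merge"}
DEDUPLICATION_STRATEGIES = {"exact_match", "fuzzy_match", "ai_similarity", "hybrid"}


def build_upsert_payload(entity, conflict_resolution=None,
                         deduplication_strategy=None, create_relationships=None):
    """Assemble an upsert request body, rejecting unknown enum values early."""
    if not isinstance(entity, dict) or not entity:
        raise ValueError("entity is required and must be a non-empty dict")

    options = {}
    if conflict_resolution is not None:
        if conflict_resolution not in CONFLICT_RESOLUTIONS:
            raise ValueError(f"unknown conflictResolution: {conflict_resolution!r}")
        options["conflictResolution"] = conflict_resolution
    if deduplication_strategy is not None:
        if deduplication_strategy not in DEDUPLICATION_STRATEGIES:
            raise ValueError(f"unknown deduplicationStrategy: {deduplication_strategy!r}")
        options["deduplicationStrategy"] = deduplication_strategy
    if create_relationships is not None:
        options["createRelationships"] = bool(create_relationships)

    # Omitted options are left out so the server applies its own defaults.
    payload = {"entity": entity}
    if options:
        payload["options"] = options
    return payload
```

The returned dictionary can be passed as the JSON body of the PUT request.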

Response

success
boolean
Indicates if the operation succeeded
action
string
The action performed: created or updated
entity
object
The final entity state after upsert
previousEntity
object
The entity state before update (null if newly created)
confidence
number
Confidence score (0-1) for the duplicate detection match
reasoning
string
Explanation of why the entity was created/updated
conflicts
array
Array of field-level conflicts detected during merge (if any)
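
The response fields above can be condensed into a log-friendly summary. The sketch below assumes the response has already been parsed into a dictionary; `summarize_upsert` is a hypothetical helper name, and the per-conflict `field` key follows the response examples in this document.

```python
def summarize_upsert(result):
    """Return a one-line, human-readable summary of an upsert response."""
    if not result.get("success"):
        return "upsert failed"
    if result.get("action") == "created":
        return "created new entity"
    # For updates, list the fields where incoming and existing data disagreed.
    conflicts = result.get("conflicts") or []
    fields = ", ".join(c["field"] for c in conflicts) or "none"
    return f"updated (confidence {result.get('confidence')}); conflicting fields: {fields}"
```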

Deduplication Strategies

Exact Match

Matches entities based on exact field comparison (case-insensitive):
  • Fields: externalId, taxId
  • Use case: When you have reliable unique identifiers
  • Speed: Fastest
  • Accuracy: 100% for identical values

Fuzzy Match

Uses Levenshtein distance for similarity matching:
  • Fields: name, taxId
  • Threshold: 80% similarity
  • Use case: When dealing with typos or variations
  • Speed: Moderate
  • Accuracy: High for similar strings
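
To make the 80% threshold concrete, here is a minimal sketch of Levenshtein-based similarity. The normalization (one minus the distance divided by the longer string's length) and the lowercasing step are assumptions; the service's exact formula is not documented here.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Similarity in [0, 1]; the normalization is an assumption of this sketch."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def is_fuzzy_match(a, b, threshold=0.8):
    """True when the two strings meet the similarity threshold."""
    return similarity(a, b) >= threshold
```

Under these assumptions, "Maria Gonzales" and "Maria Gonzalez" differ by one substitution over 14 characters, so they score about 0.93 and count as a match.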

AI Similarity

AI-powered semantic similarity detection:
  • Method: Vector embeddings and cosine similarity
  • Use case: Complex multi-field matching
  • Speed: Slower
  • Accuracy: Highest for semantically similar entities
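
Cosine similarity, the comparison step named above, can be sketched in a few lines. How the service produces embeddings is not documented here, so the vectors in the usage note are toy values chosen for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention for zero vectors in this sketch
    return dot / (norm_u * norm_v)
```

Parallel vectors such as `[1, 2, 3]` and `[2, 4, 6]` score 1.0, while orthogonal vectors such as `[1, 0]` and `[0, 1]` score 0.0.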

Hybrid (Recommended)

Combines exact and fuzzy matching:
  • Primary: Exact match on identifiers
  • Fallback: Fuzzy match on names
  • Confidence threshold: 80%
  • Use case: Best balance of speed and accuracy
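
The exact-then-fuzzy flow can be sketched as follows. Several things here are assumptions: `find_duplicate` is a hypothetical name, a match on either identifier is treated as sufficient, and the standard library's `difflib.SequenceMatcher` ratio stands in for the Levenshtein-based score to keep the example short.

```python
from difflib import SequenceMatcher


def find_duplicate(incoming, existing, threshold=0.8):
    """Return (match, confidence, method); match is None when nothing qualifies."""
    # Primary: exact, case-insensitive comparison of identifiers.
    for candidate in existing:
        for key in ("externalId", "taxId"):
            a, b = incoming.get(key), candidate.get(key)
            if a and b and a.lower() == b.lower():
                return candidate, 1.0, "exact"

    # Fallback: best fuzzy score on the name field.
    name = incoming.get("name", "").lower()
    best, best_score = None, 0.0
    for candidate in existing:
        other = candidate.get("name", "").lower()
        score = SequenceMatcher(None, name, other).ratio()
        if score > best_score:
            best, best_score = candidate, score

    if best is not None and best_score >= threshold:
        return best, best_score, "fuzzy"
    return None, 0.0, "none"
```

Trying identifiers first keeps the common case cheap; the fuzzy pass only runs when no identifier matches.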

Conflict Resolution Strategies

smart_merge (Default)

Intelligently merges data from both sources:
  • Priority: Newer data for simple fields
  • Arrays: Merges and deduplicates
  • Objects: Deep merge with conflict detection
  • Empty values: Preserves non-empty existing values
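
The four merge rules above can be illustrated with a short recursive function. This is a sketch of the described behavior, not the service's implementation: the definition of "empty" (None, empty string, empty list, empty dict) and the shape of the conflict records are assumptions.

```python
EMPTY = (None, "", [], {})


def smart_merge(existing, incoming, path="", conflicts=None):
    """Merge incoming into existing; return (merged, conflicts)."""
    if conflicts is None:
        conflicts = []
    merged = dict(existing)
    for key, new in incoming.items():
        field = f"{path}.{key}" if path else key
        old = existing.get(key)
        if new in EMPTY:
            continue  # empty incoming value: keep whatever already exists
        if isinstance(old, dict) and isinstance(new, dict):
            # Objects: deep merge, collecting conflicts along the way.
            merged[key], _ = smart_merge(old, new, field, conflicts)
        elif isinstance(old, list) and isinstance(new, list):
            # Arrays: concatenate, then drop duplicates while keeping order.
            combined = []
            for item in old + new:
                if item not in combined:
                    combined.append(item)
            merged[key] = combined
        else:
            # Simple fields: newer data wins; record the change if values differ.
            if old not in EMPTY and old != new:
                conflicts.append({"field": field, "oldValue": old, "newValue": new})
            merged[key] = new
    return merged, conflicts
```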

source_wins

New data completely replaces existing:
  • Use case: When incoming data is authoritative
  • Behavior: All fields from source
  • Risk: May lose valuable existing data

target_wins

Keeps existing data, ignores incoming:
  • Use case: When existing data is authoritative
  • Behavior: No updates performed
  • Risk: May miss important updates

manual_review

Flags conflicts without auto-resolution:
  • Use case: High-stakes data requiring human review
  • Behavior: Creates review task
  • Result: Entity marked for manual resolution
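
For contrast with smart_merge, the three non-merging strategies can be sketched as a simple dispatch. `resolve` is a hypothetical name, and returning a review flag alongside the entity is an assumption about how a client might model the result; the server's handling of system fields such as `id` is not covered.

```python
def resolve(existing, incoming, strategy):
    """Return (entity, needs_review) for the non-merging strategies."""
    if strategy == "source_wins":
        return dict(incoming), False   # incoming data replaces existing data
    if strategy == "target_wins":
        return dict(existing), False   # existing data is kept unchanged
    if strategy == "manual_review":
        return dict(existing), True    # nothing applied; flagged for a human
    raise ValueError(f"unsupported strategy: {strategy!r}")
```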

Examples

Simple Upsert (Default Behavior)

curl -X PUT http://api.gu1.ai/entities/upsert \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "entity": {
      "type": "person",
      "externalId": "customer_12345",
      "name": "María González",
      "countryCode": "AR",
      "taxId": "20-12345678-9",
      "entityData": {
        "person": {
          "firstName": "María",
          "lastName": "González",
          "dateOfBirth": "1985-03-15",
          "occupation": "Software Engineer",
          "income": 85000
        }
      }
    }
  }'

Upsert with Exact Match Strategy

curl -X PUT http://api.gu1.ai/entities/upsert \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "entity": {
      "type": "company",
      "externalId": "company_789",
      "name": "Tech Solutions S.A.",
      "countryCode": "BR",
      "taxId": "12.345.678/0001-90",
      "entityData": {
        "company": {
          "legalName": "Tech Solutions Sociedade Anônima",
          "revenue": 7500000
        }
      }
    },
    "options": {
      "deduplicationStrategy": "exact_match",
      "conflictResolution": "smart_merge"
    }
  }'

Upsert with Fuzzy Matching

curl -X PUT http://api.gu1.ai/entities/upsert \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "entity": {
      "type": "person",
      "externalId": "customer_new_123",
      "name": "Maria Gonzales",
      "countryCode": "AR",
      "taxId": "20-12345678-9",
      "entityData": {
        "person": {
          "firstName": "Maria",
          "lastName": "Gonzales"
        }
      }
    },
    "options": {
      "deduplicationStrategy": "fuzzy_match",
      "conflictResolution": "smart_merge"
    }
  }'

Upsert with Hybrid Strategy

curl -X PUT http://api.gu1.ai/entities/upsert \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "entity": {
      "type": "company",
      "externalId": "comp_456",
      "name": "TechSolutions SA",
      "countryCode": "BR",
      "taxId": "12.345.678/0001-90",
      "entityData": {
        "company": {
          "employeeCount": 100
        }
      }
    },
    "options": {
      "deduplicationStrategy": "hybrid"
    }
  }'

Response Examples

Created New Entity

{
  "success": true,
  "action": "created",
  "entity": {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "externalId": "customer_12345",
    "type": "person",
    "name": "María González",
    ...
  },
  "previousEntity": null,
  "confidence": 1.0,
  "reasoning": "No existing entity found matching criteria. Created new entity.",
  "conflicts": []
}

Updated Existing Entity

{
  "success": true,
  "action": "updated",
  "entity": {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "externalId": "customer_12345",
    "type": "person",
    "name": "María González",
    "entityData": {
      "person": {
        "income": 95000
      }
    },
    ...
  },
  "previousEntity": {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "entityData": {
      "person": {
        "income": 85000
      }
    },
    ...
  },
  "confidence": 1.0,
  "reasoning": "Exact match found on externalId. Updated existing entity with smart merge.",
  "conflicts": [
    {
      "field": "entityData.person.income",
      "oldValue": 85000,
      "newValue": 95000,
      "resolution": "source_wins"
    }
  ]
}

Use Cases

Data Import from External System

// Import customer data from CRM, avoiding duplicates
async function importCustomer(crmData) {
  const response = await fetch('http://api.gu1.ai/entities/upsert', {
    method: 'PUT',
    headers: {
      'Authorization': 'Bearer YOUR_API_KEY',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      entity: {
        type: 'person',
        externalId: crmData.customerId,
        name: crmData.fullName,
        countryCode: crmData.country,
        taxId: crmData.taxId,
        entityData: {
          person: {
            firstName: crmData.firstName,
            lastName: crmData.lastName,
            income: crmData.annualIncome
          }
        },
        attributes: {
          source: 'crm_import',
          importDate: new Date().toISOString()
        }
      },
      options: {
        deduplicationStrategy: 'hybrid',
        conflictResolution: 'smart_merge'
      }
    })
  });

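  // Fail loudly on 4xx/5xx instead of treating an error body as a result
  if (!response.ok) {
    throw new Error(`Upsert failed with status ${response.status}`);
  }

  return response.json();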
}

Progressive Data Enrichment

import requests


def enrich_entity_data(external_id, new_data):
    """Progressively add data to entity as it becomes available"""
    response = requests.put(
        'http://api.gu1.ai/entities/upsert',
        headers={
            'Authorization': 'Bearer YOUR_API_KEY',
            'Content-Type': 'application/json'
        },
        json={
            'entity': {
                'type': 'person',
                'externalId': external_id,
                'name': new_data.get('name'),
                'countryCode': new_data.get('country'),
                'entityData': new_data.get('details', {}),
                'attributes': new_data.get('attributes', {})
            },
            'options': {
                'deduplicationStrategy': 'exact_match',
                'conflictResolution': 'smart_merge'  # Merge new with existing
            }
        }
    )

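    # Raise on 4xx/5xx so an error body is never read as a successful result
    response.raise_for_status()
    result = response.json()
    if result['action'] == 'updated':
        print("Enriched existing entity with new data")
    return result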

Best Practices

  1. Choose the Right Strategy:
    • exact_match for clean, structured data with reliable IDs
    • fuzzy_match for user-entered data with potential typos
    • hybrid for most production scenarios
  2. Handle Conflicts Gracefully:
    • Use smart_merge for automatic resolution
    • Use manual_review for critical financial data
    • Check conflicts array in response for important changes
  3. Monitor Confidence Scores:
    • Scores below 0.7 may indicate weak matches
    • Log low-confidence updates for review
    • Consider routing matches below a chosen threshold to manual review
  4. Relationship Management:
    • Keep createRelationships set to true (the default) to auto-link related entities
    • Useful for transaction-customer, company-person relationships

Error Responses

400 Bad Request

{
  "error": "Invalid tax ID format for country"
}

400 Missing Required Fields

{
  "error": "Missing required fields",
  "missingFields": ["legalName"],
  "requiredFields": ["legalName", "industry"]
}

500 Internal Server Error

{
  "error": "Failed to upsert entity"
}
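
The error bodies above can be turned into readable messages with a helper like the one below. It is a sketch that assumes only the fields shown in these examples (`error`, `missingFields`); `describe_error` is a hypothetical name.

```python
def describe_error(status_code, body):
    """Build a readable message from an error response body."""
    message = body.get("error", "Unknown error")
    missing = body.get("missingFields")
    if status_code == 400 and missing:
        # Name the missing fields so the caller knows what to supply.
        return f"{message}: {', '.join(missing)}"
    if status_code >= 500:
        return f"{message} (server error)"
    return message
```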
