SQL Repository Implementation¶
The SQL Repository provides a relational database implementation of the Repository interface using GORM as the ORM layer. It supports PostgreSQL for production deployments and SQLite for embedded/development use cases. This document covers the architecture, initialization, and data storage approach of the SQL repository implementation.
For detailed information about specific operations:
- Entity CRUD operations: see SQL Entity Operations
- Edge management and querying: see SQL Edge Operations
- Tag creation and retrieval: see SQL Tag Management
For the Neo4j graph database implementation, see Neo4j Repository.
Repository Architecture¶
The SQL repository is implemented by the sqlRepository struct defined in . It wraps a GORM database connection and implements all methods of the Repository interface.
Core Components¶
graph TB
RepositoryInterface["repository.Repository<br/>Interface"]
sqlRepository["sqlRepository<br/>struct"]
GORMDB["*gorm.DB<br/>ORM Connection"]
subgraph "Database Backends"
PostgresDriver["postgres.Open()<br/>PostgreSQL Driver"]
SQLiteDriver["sqlite.Open()<br/>SQLite Driver"]
end
subgraph "Storage Layer"
PostgresDB[("PostgreSQL<br/>Database")]
SQLiteDB[("SQLite<br/>File or Memory")]
end
RepositoryInterface -.->|"implements"| sqlRepository
sqlRepository -->|"sql.db"| GORMDB
sqlRepository -->|"sql.dbtype"| DBType["dbtype: string<br/>'postgres', 'sqlite', 'sqlite_memory'"]
GORMDB -->|"uses"| PostgresDriver
GORMDB -->|"uses"| SQLiteDriver
PostgresDriver --> PostgresDB
SQLiteDriver --> SQLiteDB
Database Type Constants¶
The SQL repository supports three database configurations, identified by string constants:
| Constant | Value | Purpose |
|---|---|---|
Postgres |
"postgres" |
Production PostgreSQL deployment |
SQLite |
"sqlite" |
File-based SQLite database |
SQLiteMemory |
"sqlite_memory" |
In-memory SQLite for testing |
Initialization Process¶
The New function creates and configures a SQL repository instance. It establishes the database connection with appropriate driver settings and connection pooling parameters.
Factory Function¶
graph LR
Client["Client Code"]
NewFunc["sqlrepo.New(dbtype, dsn)"]
NewDatabase["newDatabase(dbtype, dsn)"]
subgraph "Database Creation"
PostgresFunc["postgresDatabase(dsn)"]
SQLiteFunc["sqliteDatabase(dsn, conns, idles)"]
end
sqlRepo["sqlRepository{<br/>db: *gorm.DB<br/>dbtype: string<br/>}"]
Client -->|"'postgres', dsn"| NewFunc
NewFunc --> NewDatabase
NewDatabase -->|"dbtype='postgres'"| PostgresFunc
NewDatabase -->|"dbtype='sqlite'<br/>dbtype='sqlite_memory'"| SQLiteFunc
PostgresFunc -->|"*gorm.DB"| NewFunc
SQLiteFunc -->|"*gorm.DB"| NewFunc
NewFunc --> sqlRepo
sqlRepo -->|"return"| Client
Key Implementation Details:
The factory function routes to database-specific initialization based on the dbtype parameter :
- PostgreSQL: Uses
gorm.Open(postgres.Open(dsn))with connection pooling configured for production workloads - SQLite (file): Uses
gorm.Open(sqlite.Open(dsn))with 3 max connections and 5 max idle connections - SQLite (memory): Uses the same driver but with increased limits (50 max connections, 100 max idle) for test performance
Connection Pooling Configuration¶
Each database backend is configured with specific connection pooling parameters to optimize performance and resource usage.
PostgreSQL Connection Pool¶
| Parameter | Value | Purpose |
|---|---|---|
MaxIdleConns |
5 | Maintains 5 idle connections for quick reuse |
MaxOpenConns |
10 | Limits concurrent connections to 10 |
ConnMaxLifetime |
1 hour | Recycles connections after 1 hour |
ConnMaxIdleTime |
10 minutes | Closes idle connections after 10 minutes |
SQLite Connection Pool¶
File-based SQLite:
- MaxOpenConns: 3
- MaxIdleConns: 5
- Same lifetime settings as PostgreSQL
In-memory SQLite:
- MaxOpenConns: 50
- MaxIdleConns: 100
- Optimized for high-throughput testing scenarios
Data Storage Model¶
The SQL repository stores graph data (entities and edges) in relational tables using JSON serialization for flexible content storage.
Table Structure¶
erDiagram
entities ||--o{ edges : "from_entity_id"
entities ||--o{ edges : "to_entity_id"
entities ||--o{ entity_tags : "entity_id"
edges ||--o{ edge_tags : "edge_id"
entities {
uint64 entity_id PK
string etype
string content
timestamp created_at
timestamp updated_at
}
edges {
uint64 edge_id PK
string type
string content
uint64 from_entity_id FK
uint64 to_entity_id FK
timestamp created_at
timestamp updated_at
}
entity_tags {
uint64 tag_id PK
uint64 entity_id FK
string ttype
string content
timestamp created_at
timestamp updated_at
}
edge_tags {
uint64 tag_id PK
uint64 edge_id FK
string ttype
string content
timestamp created_at
timestamp updated_at
}
GORM Model Structs¶
The repository uses internal GORM model structs (not shown in provided files but referenced by operations) that map to these tables:
Entitystruct: Maps toentitiestableEdgestruct: Maps toedgestable with foreign key constraintsEntityTagstruct: Maps toentity_tagstableEdgeTagstruct: Maps toedge_tagstable
JSON Content Serialization¶
All Open Asset Model objects (Assets, Relations, Properties) are serialized to JSON for storage in the content fields. This approach provides:
- Schema Flexibility: Supports arbitrary asset types without schema migrations
- OAM Compatibility: Preserves full fidelity of OAM objects
- Query Capability: JSON operators enable content-based queries
Serialization Flow¶
graph LR
Asset["oam.Asset<br/>(Go struct)"]
JSON1["asset.JSON()"]
Content1["content: string<br/>(JSON)"]
Relation["oam.Relation<br/>(Go struct)"]
JSON2["relation.JSON()"]
Content2["content: string<br/>(JSON)"]
Property["oam.Property<br/>(Go struct)"]
JSON3["property.JSON()"]
Content3["content: string<br/>(JSON)"]
Asset --> JSON1 --> Content1
Relation --> JSON2 --> Content2
Property --> JSON3 --> Content3
Content1 --> Store1["Entity.Content"]
Content2 --> Store2["Edge.Content"]
Content3 --> Store3["EntityTag.Content<br/>EdgeTag.Content"]
Examples in Code:
- Entity serialization:
- Edge serialization:
- Deserialization:
GORM Query Patterns¶
The SQL repository uses GORM's query builder extensively throughout its operations. Common patterns include:
Basic CRUD Operations¶
| Operation | GORM Method | Example Location |
|---|---|---|
| Create | db.Create(&model) |
|
| Read by ID | db.First(&model) |
|
| Update | db.Save(&model) |
|
| Delete | db.Delete(&model) |
Query Chaining¶
The repository builds complex queries by chaining GORM methods:
graph LR
DB["sql.db"]
Where1["Where('etype = ?', type)"]
Where2["Where('updated_at >= ?', since)"]
Find["Find(&entities)"]
DB --> Where1
Where1 --> Where2
Where2 --> Find
Example: Finding entities by type with temporal filtering at
Duplicate Prevention¶
The SQL repository implements duplicate detection logic to prevent redundant data storage.
Entity Deduplication¶
When creating an entity, the repository:
- Searches for existing entities with matching content
- If found, updates the existing entity's timestamp instead of creating a new one
- Uses GORM's
Savemethod which performs an upsert operation
Edge Deduplication¶
Edge creation uses the isDuplicateEdge helper function :
- Queries existing outgoing edges from the source entity
- Checks for matching destination entity and relation content using
reflect.DeepEqual - If duplicate found, updates the timestamp via
edgeSeen - Returns the existing edge instead of creating a new one
Timestamp Management¶
All entities and edges track two timestamps:
created_at: When the record was first created (immutable after creation)updated_at: When the record was last seen/updated (modified on subsequent observations)
The repository uses UTC timezone consistently and converts to local time when returning results to clients .
Repository Interface Methods¶
The sqlRepository implements all methods defined by the Repository interface. Operations are organized into three categories:
Entity Operations¶
CreateEntity(*types.Entity) (*types.Entity, error)CreateAsset(oam.Asset) (*types.Entity, error)- Convenience wrapperFindEntityById(string) (*types.Entity, error)FindEntitiesByContent(oam.Asset, time.Time) ([]*types.Entity, error)FindEntitiesByType(oam.AssetType, time.Time) ([]*types.Entity, error)DeleteEntity(string) error
Detailed documentation: SQL Entity Operations
Edge Operations¶
CreateEdge(*types.Edge) (*types.Edge, error)FindEdgeById(string) (*types.Edge, error)IncomingEdges(*types.Entity, time.Time, ...string) ([]*types.Edge, error)OutgoingEdges(*types.Entity, time.Time, ...string) ([]*types.Edge, error)DeleteEdge(string) error
Detailed documentation: SQL Edge Operations
Tag Operations¶
CreateEntityTag(*types.EntityTag) (*types.EntityTag, error)GetEntityTags(*types.Entity, time.Time) ([]*types.EntityTag, error)CreateEdgeTag(*types.EdgeTag) (*types.EdgeTag, error)GetEdgeTags(*types.Edge, time.Time) ([]*types.EdgeTag, error)
Detailed documentation: SQL Tag Management
Utility Methods¶
Close() error- Closes the database connectionGetDBType() string- Returns the database type constant
Error Handling¶
The SQL repository propagates GORM errors to callers with minimal wrapping. Common error scenarios include:
| Scenario | Error Source | Example Location |
|---|---|---|
| Record not found | gorm.ErrRecordNotFound |
|
| ID parsing failure | strconv.ParseUint |
|
| Database constraint violation | GORM/database driver | |
| JSON serialization failure | asset.JSON() |
The repository also returns custom errors for semantic issues:
- "zero entities found" when queries return no results
- "failed input validation checks" for nil parameters
- OAM validation errors for invalid relationships
Performance Considerations¶
Query Optimization¶
The SQL repository relies on proper database indexing for performance. Key indexes should include:
- Primary keys on
entity_id,edge_id,tag_id - Foreign key indexes on
from_entity_id,to_entity_idin edges table - Index on
etypecolumn for type-based queries - Index on
updated_atfor temporal queries - JSON path indexes for content-based queries (database-specific)
These indexes are created by the migration system (see SQL Schema Migrations).
Connection Pooling¶
The conservative connection limits (5-10 for PostgreSQL, 3 for SQLite) balance resource usage with throughput. For high-concurrency scenarios, consider:
- Wrapping the repository with the caching layer (see Caching System)
- Adjusting pool sizes at
SQL Entity Operations¶
This page documents entity CRUD (Create, Read, Delete) operations in the SQL repository implementation. It covers how entities representing assets are created, queried, and managed in PostgreSQL and SQLite databases using GORM. For edge (relationship) operations, see SQL Edge Operations. For entity tag management, see SQL Tag Management.
Overview¶
The SQL repository implements entity operations defined in the Repository interface using GORM as the ORM layer. Entities are stored in the entities table with JSON-serialized content. The implementation handles duplicate detection, timestamp management, and conversion between database records and types.Entity objects.
Entity Storage Model¶
Database Schema¶
Entities in SQL repositories are stored using the Entity struct, which maps to the entities table:
| Field | Type | Purpose |
|---|---|---|
ID |
uint64 |
Auto-incrementing primary key |
Type |
string |
Asset type (e.g., "FQDN", "IPAddress") from oam.AssetType() |
Content |
string |
JSON-serialized asset data from oam.Asset.JSON() |
CreatedAt |
time.Time |
First creation timestamp |
UpdatedAt |
time.Time |
Last seen/update timestamp |
The Content field stores the complete asset as JSON, allowing flexible storage of any asset type defined in the Open Asset Model.
graph TB
subgraph "types Package"
TypesEntity["types.Entity<br/>ID: string<br/>CreatedAt: time.Time<br/>LastSeen: time.Time<br/>Asset: oam.Asset"]
end
subgraph "sqlrepo Package"
SQLEntity["Entity struct<br/>ID: uint64<br/>Type: string<br/>Content: string (JSON)<br/>CreatedAt: time.Time<br/>UpdatedAt: time.Time"]
Parse["Parse() method<br/>Deserializes JSON<br/>to oam.Asset"]
JSONQuery["JSONQuery() method<br/>Builds GORM where clause"]
end
subgraph "Database"
EntitiesTable["entities table<br/>PostgreSQL or SQLite"]
end
TypesEntity -->|"CreateEntity()"| SQLEntity
SQLEntity -->|"GORM Save()"| EntitiesTable
EntitiesTable -->|"GORM Find()"| SQLEntity
SQLEntity -->|"Parse()"| TypesEntity
SQLEntity -.->|"Uses"| Parse
SQLEntity -.->|"Uses"| JSONQuery
Create Operations¶
CreateEntity¶
The CreateEntity method persists a types.Entity to the database. It includes sophisticated duplicate detection logic that updates existing entities rather than creating duplicates.
flowchart TD
Start["CreateEntity(input *types.Entity)"]
Serialize["Serialize input.Asset to JSON<br/>jsonContent, err := input.Asset.JSON()"]
CreateStruct["Create Entity struct<br/>Type = input.Asset.AssetType()<br/>Content = jsonContent"]
CheckDup["FindEntitiesByContent()<br/>Check for existing entity"]
DupExists{{"Duplicate<br/>exists?"}}
SameType{{"Same<br/>AssetType?"}}
UpdateExisting["Update existing:<br/>entity.ID = existing.ID<br/>entity.UpdatedAt = now"]
SetNew["Set timestamps:<br/>entity.CreatedAt = now or input.CreatedAt<br/>entity.UpdatedAt = now or input.LastSeen"]
Save["db.Save(&entity)<br/>GORM upsert operation"]
Convert["Convert to types.Entity<br/>ID: strconv.FormatUint()<br/>Timestamps: UTC to Local"]
Return["Return *types.Entity"]
Start --> Serialize
Serialize --> CreateStruct
CreateStruct --> CheckDup
CheckDup --> DupExists
DupExists -->|"Yes"| SameType
DupExists -->|"No"| SetNew
SameType -->|"Yes"| UpdateExisting
SameType -->|"No"| SetNew
UpdateExisting --> Save
SetNew --> Save
Save --> Convert
Convert --> Return
Key behaviors:
- Duplicate detection: Queries FindEntitiesByContent to check if entity already exists
- Update vs Insert: If duplicate found with matching AssetType, updates UpdatedAt timestamp but preserves ID and CreatedAt
- GORM Save: Uses db.Save() which performs upsert (insert or update based on primary key)
- ID conversion: Database uint64 ID converted to string for types.Entity
- Timezone handling: Timestamps stored in UTC, returned in local time
CreateAsset¶
The CreateAsset method is a convenience wrapper around CreateEntity:
func (sql *sqlRepository) CreateAsset(asset oam.Asset) (*types.Entity, error) {
return sql.CreateEntity(&types.Entity{Asset: asset})
}
It accepts an oam.Asset directly and wraps it in a types.Entity before calling CreateEntity.
Query Operations¶
FindEntityById¶
Retrieves a single entity by its string ID.
| Operation | Implementation |
|---|---|
| Input | id string - Entity ID as string |
| Conversion | strconv.ParseUint(id, 10, 64) to convert to uint64 |
| Query | db.First(&entity) - GORM query by primary key |
| Parsing | entity.Parse() - Deserializes JSON content to oam.Asset |
| Output | *types.Entity with populated Asset field |
Error cases: - Invalid ID format (not a valid uint64) - Entity not found in database - JSON parsing failure
FindEntitiesByContent¶
Searches for entities matching specific asset content, with optional time filtering.
graph TD
Input["FindEntitiesByContent(assetData, since)"]
SerializeInput["Serialize input asset to JSON"]
BuildQuery["entity.JSONQuery()<br/>Build JSON matching query"]
ApplyFilters["tx.Where('etype = ?', type)"]
CheckSince{{"since.IsZero()?"}}
AddTimeFilter["tx.Where('updated_at >= ?', since.UTC())"]
Execute["tx.Find(&entities)<br/>Execute GORM query"]
Parse["Parse each Entity.Content<br/>to oam.Asset"]
BuildResults["Build []*types.Entity slice"]
Return["Return results"]
Input --> SerializeInput
SerializeInput --> BuildQuery
BuildQuery --> ApplyFilters
ApplyFilters --> CheckSince
CheckSince -->|"No"| AddTimeFilter
CheckSince -->|"Yes"| Execute
AddTimeFilter --> Execute
Execute --> Parse
Parse --> BuildResults
BuildResults --> Return
Key features:
- Content matching: Uses JSONQuery() method to build database-specific JSON query
- Type filtering: Always filters by etype field
- Time filtering: Optional since parameter filters by updated_at >= since
- Zero value handling: If since.IsZero(), time filter is skipped
FindEntitiesByType¶
Retrieves all entities of a specific asset type, with optional time filtering.
| Parameter | Type | Purpose |
|---|---|---|
atype |
oam.AssetType |
Asset type to filter (e.g., "FQDN", "IPAddress") |
since |
time.Time |
Optional time filter for updated_at >= since |
Query variations:
// Without time filter (since.IsZero())
db.Where("etype = ?", atype).Find(&entities)
// With time filter
db.Where("etype = ? AND updated_at >= ?", atype, since.UTC()).Find(&entities)
Delete Operations¶
DeleteEntity¶
Removes an entity by ID from the database.
sequenceDiagram
participant Client
participant DeleteEntity
participant GORM
participant Database
Client->>DeleteEntity: DeleteEntity(id string)
DeleteEntity->>DeleteEntity: strconv.ParseUint(id, 10, 64)
DeleteEntity->>GORM: db.Delete(&Entity{ID: entityId})
GORM->>Database: DELETE FROM entities WHERE id = ?
Database-->>GORM: Rows affected
GORM-->>DeleteEntity: result.Error
DeleteEntity-->>Client: error or nil
Implementation details:
- Converts string ID to uint64
- Uses GORM's Delete() method with primary key
- Returns error if conversion fails or deletion fails
- Does not check if entity exists before deletion
Duplicate Handling Strategy¶
The SQL repository implements intelligent duplicate detection in CreateEntity:
graph LR
subgraph "Duplicate Detection Flow"
Call["CreateEntity called<br/>with new asset"]
Query["FindEntitiesByContent()<br/>Search existing"]
Check{{"Matching entity<br/>found?"}}
TypeCheck{{"Same<br/>AssetType?"}}
Update["Update existing:<br/>- Keep same ID<br/>- Keep CreatedAt<br/>- Update UpdatedAt"]
Create["Create new:<br/>- New ID<br/>- Set CreatedAt<br/>- Set UpdatedAt"]
Call --> Query
Query --> Check
Check -->|"Yes"| TypeCheck
Check -->|"No"| Create
TypeCheck -->|"Yes"| Update
TypeCheck -->|"No"| Create
end
Rationale: This prevents duplicate entities with identical content while updating the LastSeen timestamp to track recency. This is critical for time-based queries in discovery systems like OWASP Amass.
Test validation: The test TestLastSeenUpdates verifies that calling CreateAsset twice with the same asset updates LastSeen while preserving ID and CreatedAt .
JSON Serialization¶
Encoding: Asset to JSON¶
The CreateEntity method serializes oam.Asset to JSON:
The JSON string is stored in the Content field of the database entity .
Decoding: JSON to Asset¶
The Parse() method (implementation in Entity struct) deserializes JSON back to oam.Asset:
This method is called in all query operations to reconstruct the asset from database storage .
JSONQuery Method¶
The JSONQuery() method builds database-specific WHERE clauses for JSON content matching. This is used by FindEntitiesByContent to efficiently query entities with specific content .
Time Handling¶
The SQL repository performs careful timezone conversions to ensure consistency:
Storage: Local/Input to UTC¶
// For new entities
if input.CreatedAt.IsZero() {
entity.CreatedAt = time.Now().UTC()
} else {
entity.CreatedAt = input.CreatedAt.UTC()
}
if input.LastSeen.IsZero() {
entity.UpdatedAt = time.Now().UTC()
} else {
entity.UpdatedAt = input.LastSeen.UTC()
}
All timestamps are converted to UTC before storage .
Retrieval: UTC to Local¶
return &types.Entity{
ID: strconv.FormatUint(entity.ID, 10),
CreatedAt: entity.CreatedAt.In(time.UTC).Local(),
LastSeen: entity.UpdatedAt.In(time.UTC).Local(),
Asset: input.Asset,
}
Timestamps are converted back to local time when returning entities .
Rationale: Storing in UTC ensures consistency across different database servers and clients in different timezones. Converting to local time on retrieval maintains compatibility with client expectations.
Integration with GORM¶
All SQL entity operations use GORM methods:
| GORM Method | Purpose | Used In |
|---|---|---|
db.Save(&entity) |
Insert or update based on primary key | CreateEntity |
db.First(&entity) |
Query single record by primary key | FindEntityById |
db.Where(...).Find(&entities) |
Query multiple records with conditions | FindEntitiesByContent, FindEntitiesByType |
db.Delete(&entity) |
Delete record by primary key | DeleteEntity |
The sqlRepository struct contains a *gorm.DB field that executes these operations against PostgreSQL or SQLite databases configured at initialization.
SQL Edge Operations¶
This document details the SQL repository implementation for edge operations. Edges represent directed relationships between entities in the property graph model. The SQL implementation uses GORM to manage edges stored in relational databases (PostgreSQL and SQLite).
This page covers edge creation, querying, retrieval, and deletion operations. For entity management, see SQL Entity Operations. For edge tag operations, see SQL Tag Management. For the Neo4j graph database implementation of edge operations, see Neo4j Edge Operations.
Edge Table Structure¶
The SQL repository stores edges in an edges table with the following structure:
| Column | Type | Description |
|---|---|---|
edge_id |
uint64 | Primary key, auto-incremented |
type |
string | Relation type from OAM |
content |
JSON | Serialized relation data |
from_entity_id |
uint64 | Foreign key to source entity |
to_entity_id |
uint64 | Foreign key to destination entity |
created_at |
timestamp | When the edge was first created |
updated_at |
timestamp | Last seen timestamp |
The Edge struct in the SQL repository maps to this table structure:
type Edge struct {
ID uint64
Type string
Content []byte
FromEntityID uint64
ToEntityID uint64
CreatedAt time.Time
UpdatedAt time.Time
}
Edge Creation Flow¶
CreateEdge Method¶
flowchart TD
Start["CreateEdge(edge)"]
Validate["Validate input fields<br/>(line 23-26)"]
TaxCheck["Validate against OAM taxonomy<br/>oam.ValidRelationship()<br/>(line 28-32)"]
DupCheck["isDuplicateEdge()<br/>(line 41-43)"]
Found{{"Duplicate<br/>found?"}}
ParseFrom["Parse FromEntity.ID<br/>(line 45-48)"]
ParseTo["Parse ToEntity.ID<br/>(line 50-53)"]
Serialize["Serialize Relation to JSON<br/>(line 55-58)"]
Create["Create Edge struct<br/>with timestamps<br/>(line 60-71)"]
Insert["sql.db.Create(&r)<br/>(line 73-76)"]
Convert["toEdge(r)<br/>(line 77)"]
Return["Return edge"]
Start --> Validate
Validate --> TaxCheck
TaxCheck --> DupCheck
DupCheck --> Found
Found -->|"Yes"| Return
Found -->|"No"| ParseFrom
ParseFrom --> ParseTo
ParseTo --> Serialize
Serialize --> Create
Create --> Insert
Insert --> Convert
Convert --> Return
style DupCheck fill:#f9f9f9
style Found fill:#f9f9f9
Diagram: Edge Creation Process
The CreateEdge method implements comprehensive validation and duplicate detection:
- Input Validation : Verifies that the edge, relation, and both entities are non-nil
- Taxonomy Validation : Calls
oam.ValidRelationship()to ensure the relationship is valid according to the Open Asset Model taxonomy - Duplicate Detection : Checks if an identical edge already exists
- Timestamp Management : Uses provided
LastSeentimestamp or defaults to current UTC time - Entity ID Parsing : Converts string entity IDs to uint64
- Content Serialization : Serializes the relation to JSON format
- Database Insert : Uses GORM to insert the edge record
Duplicate Edge Detection¶
isDuplicateEdge Logic¶
flowchart TD
Start["isDuplicateEdge(edge, updated)"]
Query["OutgoingEdges(FromEntity)<br/>(line 85)"]
Iterate["Iterate through outgoing edges<br/>(line 86-98)"]
Match{{"ToEntity.ID matches<br/>AND<br/>Relation content matches?"}}
Update["edgeSeen(out, updated)<br/>(line 88)"]
Fetch["FindEdgeById(out.ID)<br/>(line 90)"]
ReturnDup["Return edge, true"]
ReturnNew["Return nil, false"]
Start --> Query
Query --> Iterate
Iterate --> Match
Match -->|"Yes"| Update
Update --> Fetch
Fetch --> ReturnDup
Match -->|"No"| Iterate
Iterate -->|"No more edges"| ReturnNew
style Match fill:#f9f9f9
Diagram: Duplicate Edge Detection Mechanism
The duplicate detection mechanism prevents redundant edges in the database:
- Query Existing Edges : Retrieves all outgoing edges from the source entity with the same label
- Deep Comparison : Compares both the destination entity ID and the relation content using
reflect.DeepEqual - Update Timestamp : If a duplicate is found, updates its
updated_attimestamp viaedgeSeen() - Return Existing : Fetches and returns the existing edge instead of creating a new one
This approach ensures that re-discovering the same relationship updates the temporal information without creating duplicate records.
Edge Timestamp Management¶
edgeSeen Method¶
The edgeSeen method updates the updated_at timestamp for an existing edge:
flowchart LR
Input["Edge + new timestamp"]
ParseID["Parse edge.ID<br/>(line 105-108)"]
Serialize["Serialize Relation to JSON<br/>(line 110-113)"]
ParseFrom["Parse FromEntity.ID<br/>(line 115-118)"]
ParseTo["Parse ToEntity.ID<br/>(line 120-123)"]
BuildStruct["Build Edge struct<br/>with new UpdatedAt<br/>(line 125-133)"]
Save["sql.db.Save(&r)<br/>(line 135-138)"]
Input --> ParseID
ParseID --> Serialize
Serialize --> ParseFrom
ParseFrom --> ParseTo
ParseTo --> BuildStruct
BuildStruct --> Save
Diagram: Edge Timestamp Update Flow
The method preserves the original created_at timestamp while updating updated_at to reflect the most recent observation of the relationship.
Querying Edges¶
Incoming and Outgoing Edge Queries¶
Both IncomingEdges and OutgoingEdges methods support:
- Temporal filtering via the since parameter
- Label filtering to retrieve edges of specific relation types
- Optional label filtering (returns all edges if no labels specified)
IncomingEdges Query Structure¶
flowchart TD
Start["IncomingEdges(entity, since, labels...)"]
ParseID["Parse entity.ID to int64<br/>(line 157-160)"]
CheckSince{{"since.IsZero()?"}}
QueryAll["WHERE to_entity_id = ?<br/>(line 165)"]
QueryTime["WHERE to_entity_id = ?<br/>AND updated_at >= ?<br/>(line 167)"]
Filter{{"labels<br/>provided?"}}
FilterLoop["Filter by relation label<br/>(line 174-186)"]
NoFilter["Use all edges<br/>(line 188)"]
Return["Return filtered edges<br/>via toEdges()<br/>(line 194)"]
Start --> ParseID
ParseID --> CheckSince
CheckSince -->|"Yes"| QueryAll
CheckSince -->|"No"| QueryTime
QueryAll --> Filter
QueryTime --> Filter
Filter -->|"Yes"| FilterLoop
Filter -->|"No"| NoFilter
FilterLoop --> Return
NoFilter --> Return
style CheckSince fill:#f9f9f9
style Filter fill:#f9f9f9
Diagram: IncomingEdges Query Flow
OutgoingEdges Query Structure¶
The OutgoingEdges method follows the same pattern but queries on from_entity_id instead:
| Query Parameter | SQL WHERE Clause |
|---|---|
| No time filter | from_entity_id = ? |
| With time filter | from_entity_id = ? AND updated_at >= ? |
Label Filtering Implementation:
When labels are provided : 1. Parse each edge's JSON content to extract the relation 2. Compare the relation's label against the requested labels 3. Include the edge only if its label matches
Edge Retrieval and Deletion¶
FindEdgeById¶
The FindEdgeById method retrieves a single edge by its ID:
Implementation details :
- Parses the string ID
- Queries using WHERE edge_id = ?
- Converts the database Edge to types.Edge via toEdge()
DeleteEdge¶
The DeleteEdge method removes an edge from the database:
Implementation :
- Parses the string ID to uint64
- Delegates to deleteEdges() which executes: DELETE FROM edges WHERE edge_id IN ?
The deleteEdges helper method accepts a slice of IDs, enabling batch deletion operations.
Type Conversion Functions¶
Database to Domain Model Conversion¶
flowchart LR
DB["Edge<br/>(database struct)"]
Parse["Parse JSON content<br/>to oam.Relation<br/>(line 260)"]
Build["Build types.Edge<br/>with converted fields<br/>(line 265-278)"]
Domain["types.Edge<br/>(domain model)"]
DB --> Parse
Parse --> Build
Build --> Domain
Note1["FromEntity and ToEntity<br/>contain only IDs,<br/>not full asset content<br/>(line 272-276)"]
Build -.-> Note1
Diagram: Edge Type Conversion
toEdge Function¶
The toEdge function converts a database Edge struct to a types.Edge:
- Parses the JSON
Contentfield back to anoam.Relation - Converts numeric timestamps to
time.Timein local timezone - Creates entity references with IDs only (does not join to fetch full entity data)
- Converts uint64 IDs to string format
toEdges Function¶
The toEdges helper applies toEdge to a slice of edges, filtering out any that fail to parse.
Important Note: The converted FromEntity and ToEntity objects contain only the ID field. Full entity content is not joined during edge queries for performance reasons. To retrieve complete entity data, use FindEntityById() on the entity IDs.
Integration Testing Examples¶
The test suite demonstrates typical edge operations:
Basic Edge Creation and Query¶
From :
edge := &types.Edge{
Relation: tc.relation,
FromEntity: sourceEntity,
ToEntity: destinationEntity,
}
e, err := store.CreateEdge(edge)
// Query incoming edges
incoming, err := store.IncomingEdges(destinationEntity, start, tc.relation.Label())
// Query outgoing edges
outgoing, err := store.OutgoingEdges(sourceEntity, start, tc.relation.Label())
Duplicate Handling¶
From :
// Store duplicate relation - last_seen is updated
rr, err := store.CreateEdge(edge2)
// Verify LastSeen timestamp increased
assert(rr.LastSeen > originalEdge.LastSeen)
Unfiltered Edge Queries¶
From :
// Retrieve all outgoing edges (no label filter)
outs, err := store.OutgoingEdges(sourceEntity, time.Time{})
// Returns all edges regardless of relation type
Error Handling¶
Edge operations return errors for the following conditions:
| Operation | Error Condition | Error Type |
|---|---|---|
CreateEdge |
Nil input fields | "failed input validation checks" |
CreateEdge |
Invalid taxonomy relationship | "%s -%s-> %s is not valid in the taxonomy" |
CreateEdge |
Entity ID parse failure | Parse error |
CreateEdge |
JSON serialization failure | Serialization error |
CreateEdge |
Database insert failure | GORM error |
IncomingEdges |
Entity ID parse failure | Parse error |
IncomingEdges |
Zero edges found | "zero edges found" |
OutgoingEdges |
Entity ID parse failure | Parse error |
OutgoingEdges |
Zero edges found | "zero edges found" |
FindEdgeById |
Edge not found | GORM error (record not found) |
DeleteEdge |
ID parse failure | Parse error |
DeleteEdge |
Database delete failure | GORM error |
Performance Considerations¶
Query Optimization¶
- Indexed Lookups: The
from_entity_idandto_entity_idcolumns should be indexed for efficient edge traversal queries - Time-based Filtering: The
updated_atcolumn enables temporal queries for incremental data retrieval - Label Filtering: Performed in-memory after database query, requires parsing JSON content
Duplicate Detection Cost¶
The isDuplicateEdge check queries all outgoing edges with the same label and performs deep equality checks. For entities with many outgoing edges, this can be expensive. The cost is traded off against preventing duplicate records.
Lazy Entity Loading¶
Edge queries return entity references with IDs only, avoiding JOIN operations. This design improves query performance but requires separate queries to fetch full entity data when needed.
SQL Tag Management¶
This document details how the SQL repository implementation manages entity and edge tags. Tags are metadata containers that store OAM properties (from the Open Asset Model) as JSON content attached to entities and edges. This page covers tag creation, retrieval, content-based searching, duplicate handling, and deletion.
For entity and edge operations themselves, see SQL Entity Operations and SQL Edge Operations. For tag management in Neo4j, see Neo4j Tag Management.
Overview¶
Tags in the SQL repository serve as a flexible metadata system for attaching properties to both entities and edges. Each tag wraps an oam.Property object, serializes it to JSON, and stores it in the database with timestamp tracking. The system prevents duplicate tags and supports content-based queries.
Database Schema and Core Structures¶
The SQL repository uses two separate tables for tags: entity_tags and edge_tags. Both follow a similar structure, storing the property type, JSON content, and foreign key references.
erDiagram
entities ||--o{ entity_tags : "has many"
edges ||--o{ edge_tags : "has many"
entities {
uint64 id PK
string atype
string content
timestamp created_at
timestamp updated_at
}
entity_tags {
uint64 id PK
string ttype
string content
uint64 entity_id FK
timestamp created_at
timestamp updated_at
}
edges {
uint64 id PK
string rtype
string content
uint64 from_entity_id FK
uint64 to_entity_id FK
timestamp created_at
timestamp updated_at
}
edge_tags {
uint64 id PK
string ttype
string content
uint64 edge_id FK
timestamp created_at
timestamp updated_at
}
Internal Tag Structures¶
The EntityTag and EdgeTag structs are internal GORM models that map to database tables:
| Field | Type | Description |
|---|---|---|
ID |
uint64 |
Auto-incrementing primary key |
Type |
string |
The property type (e.g., "simple_property") |
Content |
string |
JSON-serialized property data |
EntityID / EdgeID |
uint64 |
Foreign key reference |
CreatedAt |
time.Time |
Initial creation timestamp |
UpdatedAt |
time.Time |
Last seen timestamp |
These internal structs are converted to types.EntityTag and types.EdgeTag for external API consumption.
Entity Tag Operations¶
Creating Entity Tags¶
The CreateEntityTag function persists property metadata for an entity. It serializes the OAM property to JSON and implements duplicate detection logic.
flowchart TD
Start["CreateEntityTag(entity, input)"]
ParseID["Parse entity.ID to uint64"]
Serialize["Serialize input.Property to JSON"]
CreateStruct["Create EntityTag struct<br/>with Type, Content, EntityID"]
CheckDup["GetEntityTags(entity, time.Time{}, prop.Name())"]
DupFound{"Duplicates<br/>found?"}
IterateDups["Iterate through existing tags"]
MatchFound{"PropertyType &<br/>Value match?"}
UpdateID["Set tag.ID = existing.ID<br/>Set tag.CreatedAt = existing.CreatedAt<br/>Set tag.UpdatedAt = now()"]
SetNew["Set tag.CreatedAt from input<br/>or now()<br/>Set tag.UpdatedAt from input.LastSeen<br/>or now()"]
Save["sql.db.Save(&tag)"]
Return["Return types.EntityTag"]
Start --> ParseID
ParseID --> Serialize
Serialize --> CreateStruct
CreateStruct --> CheckDup
CheckDup --> DupFound
DupFound -->|Yes| IterateDups
IterateDups --> MatchFound
MatchFound -->|Yes| UpdateID
MatchFound -->|No| IterateDups
UpdateID --> Save
DupFound -->|No| SetNew
SetNew --> Save
Save --> Return
Key behaviors:
- Duplicate Prevention: The function queries existing tags with the same name
- Type and Value Matching: Duplicates are identified by matching
PropertyType()andValue() - Update vs Insert: Duplicates update
UpdatedAtwhile preservingCreatedAt; new tags set both timestamps - GORM Save: Uses
Save()to perform INSERT or UPDATE based on whetherIDis set
Convenience Wrapper¶
The CreateEntityProperty function provides a simpler interface when you only have an oam.Property:
// Wrapper that creates EntityTag from Property
func (sql *sqlRepository) CreateEntityProperty(entity *types.Entity, prop oam.Property) (*types.EntityTag, error)
Finding Entity Tags by ID¶
The FindEntityTagById function retrieves a single tag by its unique identifier:
flowchart LR
Input["id (string)"]
Parse["strconv.ParseUint(id, 10, 64)"]
Query["sql.db.First(&tag)"]
ParseContent["tag.Parse()"]
Convert["Convert to types.EntityTag"]
Output["Return *types.EntityTag"]
Input --> Parse --> Query --> ParseContent --> Convert --> Output
Content-Based Tag Searching¶
The FindEntityTagsByContent function enables searching for tags by property content. It uses JSON field extraction to query specific property values.
flowchart TD
Input["FindEntityTagsByContent(prop, since)"]
Serialize["prop.JSON()"]
CreateTag["Create EntityTag with<br/>Type and Content"]
BuildQueries["tag.NameJSONQuery()<br/>tag.ValueJSONQuery()"]
BaseQuery["tx = sql.db.Where('ttype = ?', tag.Type)"]
SinceCheck{"since.IsZero()?"}
AddSince["tx.Where('updated_at >= ?', since.UTC())"]
Execute["tx.Where(nameQuery).Where(valueQuery).Find(&tags)"]
Parse["Parse each tag.Content to Property"]
Convert["Convert to []*types.EntityTag"]
Return["Return results"]
Input --> Serialize --> CreateTag --> BuildQueries
BuildQueries --> BaseQuery --> SinceCheck
SinceCheck -->|No| AddSince
SinceCheck -->|Yes| Execute
AddSince --> Execute
Execute --> Parse --> Convert --> Return
The NameJSONQuery() and ValueJSONQuery() methods generate database-specific JSON extraction queries for PostgreSQL and SQLite. This allows efficient content filtering at the database level.
Retrieving All Entity Tags¶
The GetEntityTags function retrieves all tags for a specific entity with optional filtering:
Function Signature:
func (sql *sqlRepository) GetEntityTags(entity *types.Entity, since time.Time, names ...string) ([]*types.EntityTag, error)
Parameters:
- entity: The entity whose tags to retrieve
- since: If not zero, only returns tags with updated_at >= since
- names: Optional property names to filter by
Query Logic:
| Condition | Query |
|---|---|
since.IsZero() |
WHERE entity_id = ? |
!since.IsZero() |
WHERE entity_id = ? AND updated_at >= ? |
After database retrieval, the function filters results by property name if names are provided .
Deleting Entity Tags¶
The DeleteEntityTag function removes a tag by its ID:
It parses the string ID to uint64, creates an EntityTag struct with that ID, and uses GORM's Delete() method.
Edge Tag Operations¶
Edge tag operations mirror entity tag operations but target edge relationships instead of entities.
Creating Edge Tags¶
The CreateEdgeTag function follows the same pattern as CreateEntityTag:
graph TB
subgraph "CreateEdgeTag Flow"
A["Parse edge.ID"]
B["Serialize property to JSON"]
C["Create EdgeTag struct"]
D["Check for duplicates via GetEdgeTags"]
E["Duplicate found?"]
F["Update existing tag ID and timestamps"]
G["Set new timestamps"]
H["sql.db.Save(&tag)"]
I["Return types.EdgeTag"]
end
A --> B --> C --> D --> E
E -->|Yes| F --> H
E -->|No| G --> H
H --> I
Duplicate detection logic:
1. Queries existing edge tags with the same property name
2. Compares PropertyType() and Value()
3. Updates timestamp on match or creates new tag
Edge Tag Convenience Functions¶
Similar to entity tags, edge tags provide a convenience wrapper:
func (sql *sqlRepository) CreateEdgeProperty(edge *types.Edge, prop oam.Property) (*types.EdgeTag, error)
Finding Edge Tags¶
Edge tag retrieval functions parallel their entity counterparts:
| Function | Purpose | Sources |
|---|---|---|
FindEdgeTagById |
Retrieve by unique ID | |
FindEdgeTagsByContent |
Search by property content | |
GetEdgeTags |
Get all tags for an edge | |
DeleteEdgeTag |
Remove by ID |
The implementation details match entity tag operations but operate on the edge_tags table with edge_id foreign keys instead of entity_id.
Tag Lifecycle and Timestamp Management¶
Tags maintain two timestamps that track their lifecycle:
stateDiagram-v2
[*] --> Created: CreateEntityTag/CreateEdgeTag<br/>with new property
Created --> Updated: CreateEntityTag/CreateEdgeTag<br/>with duplicate property
Updated --> Updated: Subsequent duplicate creates
Updated --> Deleted: DeleteEntityTag/DeleteEdgeTag
Deleted --> [*]
note right of Created
created_at: Set once
updated_at: Set to now()
end note
note right of Updated
created_at: Preserved
updated_at: Updated to now()
end note
Timestamp Behavior¶
On Initial Creation :
- created_at: Set from input.CreatedAt if provided, otherwise time.Now().UTC()
- updated_at: Set from input.LastSeen if provided, otherwise time.Now().UTC()
On Duplicate Update :
- created_at: Preserved from existing tag
- updated_at: Set to time.Now().UTC()
Time Zone Handling:
All timestamps are stored in UTC but converted to local time when returned via types.EntityTag or types.EdgeTag .
JSON Content Storage and Querying¶
Tags store OAM properties as JSON strings in the content field. This enables flexible property storage while supporting content-based queries.
Serialization Process¶
flowchart LR
Property["oam.Property<br/>(Go struct)"]
JSON["property.JSON()<br/>(method call)"]
String["JSON string<br/>(database storage)"]
Parse["Parse()<br/>(method call)"]
PropertyOut["oam.Property<br/>(Go struct)"]
Property -->|"Serialize"| JSON
JSON --> String
String -->|"Deserialize"| Parse
Parse --> PropertyOut
JSON Query Methods¶
The internal EntityTag and EdgeTag structs implement methods for generating database-specific JSON extraction queries:
NameJSONQuery(): Extracts the property name fieldValueJSONQuery(): Extracts the property value field
These methods handle differences between PostgreSQL's ->> operator and SQLite's json_extract() function, abstracting database-specific syntax.
Code Entity Mapping¶
The following diagram maps the public API functions to their internal implementations and database operations:
flowchart TB
subgraph "Public API (repository.Repository interface)"
CreateEntityTag["CreateEntityTag(entity, tag)"]
CreateEntityProperty["CreateEntityProperty(entity, prop)"]
FindEntityTagById["FindEntityTagById(id)"]
FindEntityTagsByContent["FindEntityTagsByContent(prop, since)"]
GetEntityTags["GetEntityTags(entity, since, names...)"]
DeleteEntityTag["DeleteEntityTag(id)"]
CreateEdgeTag["CreateEdgeTag(edge, tag)"]
CreateEdgeProperty["CreateEdgeProperty(edge, prop)"]
FindEdgeTagById["FindEdgeTagById(id)"]
FindEdgeTagsByContent["FindEdgeTagsByContent(prop, since)"]
GetEdgeTags["GetEdgeTags(edge, since, names...)"]
DeleteEdgeTag["DeleteEdgeTag(id)"]
end
subgraph "sqlRepository Implementation"
CreateEntityTagImpl["sqlRepository.CreateEntityTag<br/>[tag.go:21-76]"]
CreateEntityPropertyImpl["sqlRepository.CreateEntityProperty<br/>[tag.go:82-84]"]
FindEntityTagByIdImpl["sqlRepository.FindEntityTagById<br/>[tag.go:89-113]"]
FindEntityTagsByContentImpl["sqlRepository.FindEntityTagsByContent<br/>[tag.go:120-169]"]
GetEntityTagsImpl["sqlRepository.GetEntityTags<br/>[tag.go:174-226]"]
DeleteEntityTagImpl["sqlRepository.DeleteEntityTag<br/>[tag.go:231-243]"]
CreateEdgeTagImpl["sqlRepository.CreateEdgeTag<br/>[tag.go:249-304]"]
CreateEdgePropertyImpl["sqlRepository.CreateEdgeProperty<br/>[tag.go:310-312]"]
FindEdgeTagByIdImpl["sqlRepository.FindEdgeTagById<br/>[tag.go:317-346]"]
FindEdgeTagsByContentImpl["sqlRepository.FindEdgeTagsByContent<br/>[tag.go:353-402]"]
GetEdgeTagsImpl["sqlRepository.GetEdgeTags<br/>[tag.go:407-459]"]
DeleteEdgeTagImpl["sqlRepository.DeleteEdgeTag<br/>[tag.go:464-476]"]
end
subgraph "GORM Database Operations"
Save["db.Save(&tag)"]
First["db.First(&tag)"]
Where["db.Where(...).Find(&tags)"]
Delete["db.Delete(&tag)"]
end
CreateEntityTag --> CreateEntityTagImpl --> Save
CreateEntityProperty --> CreateEntityPropertyImpl --> CreateEntityTagImpl
FindEntityTagById --> FindEntityTagByIdImpl --> First
FindEntityTagsByContent --> FindEntityTagsByContentImpl --> Where
GetEntityTags --> GetEntityTagsImpl --> Where
DeleteEntityTag --> DeleteEntityTagImpl --> Delete
CreateEdgeTag --> CreateEdgeTagImpl --> Save
CreateEdgeProperty --> CreateEdgePropertyImpl --> CreateEdgeTagImpl
FindEdgeTagById --> FindEdgeTagByIdImpl --> First
FindEdgeTagsByContent --> FindEdgeTagsByContentImpl --> Where
GetEdgeTags --> GetEdgeTagsImpl --> Where
DeleteEdgeTag --> DeleteEdgeTagImpl --> Delete
Usage Examples from Tests¶
The test suite demonstrates typical tag operations:
Entity Tag Test Flow¶
sequenceDiagram
participant Test as Test Code
participant Repo as sqlRepository
participant DB as Database
Test->>Repo: CreateAsset(&dns.FQDN{Name: "utica.edu"})
Repo->>DB: INSERT entity
DB-->>Repo: entity with ID
Repo-->>Test: entity
Test->>Repo: CreateEntityProperty(entity, SimpleProperty{Name:"test", Value:"foo"})
Repo->>DB: Check for duplicates via GetEntityTags
DB-->>Repo: No duplicates found
Repo->>DB: INSERT entity_tag
DB-->>Repo: tag with ID
Repo-->>Test: tag (CreatedAt and LastSeen set)
Note over Test: Wait 1 second
Test->>Repo: CreateEntityProperty(entity, same property)
Repo->>DB: Check for duplicates via GetEntityTags
DB-->>Repo: Duplicate found
Repo->>DB: UPDATE entity_tag (UpdatedAt only)
DB-->>Repo: updated tag
Repo-->>Test: tag (LastSeen updated, CreatedAt preserved)
Test->>Repo: GetEntityTags(entity, since, "test")
Repo->>DB: SELECT entity_tags WHERE entity_id = ? AND name = 'test'
DB-->>Repo: matching tags
Repo-->>Test: []*types.EntityTag
Test->>Repo: DeleteEntityTag(tag.ID)
Repo->>DB: DELETE entity_tag WHERE id = ?
DB-->>Repo: Success
Repo-->>Test: nil error
Edge Tag Test Flow¶
Edge tags follow an identical pattern but require an edge to be created first:
- Create two entities (
CreateAsset) - Create an edge between them (
CreateEdge) - Attach properties to the edge via
CreateEdgeProperty - Query and verify tags via
GetEdgeTags - Clean up with
DeleteEdgeTag
Error Handling¶
The tag management system returns errors in the following scenarios:
| Scenario | Error Source | Functions Affected |
|---|---|---|
| Invalid ID format | strconv.ParseUint |
FindEntityTagById, DeleteEntityTag, FindEdgeTagById, DeleteEdgeTag |
| JSON serialization failure | property.JSON() |
CreateEntityTag, CreateEdgeTag, FindEntityTagsByContent, FindEdgeTagsByContent |
| Tag not found | gorm.First |
FindEntityTagById, FindEdgeTagById |
| Database operation failure | gorm.Save, gorm.Delete |
All create and delete operations |
| Zero results | Custom check | FindEntityTagsByContent, GetEntityTags, FindEdgeTagsByContent, GetEdgeTags |
The "zero tags found" error is explicitly returned when queries produce no results, distinguishing between database errors and legitimate empty result sets.