Module: CIA Schema Update Detection - Upstream Change Monitoring

Automated schema update detection system that monitors the CIA GitHub repository for changes to its published JSON export specifications. It identifies when upstream CIA schemas diverge from the locally cached versions and triggers synchronization workflows to keep data products consistent.

Operational Purpose: Ensures riksdagsmonitor stays compatible with the CIA platform's 19 data products by detecting upstream schema modifications before they break data pipelines. Change detection is based on SHA-256 checksums, so schema evolution is identified quickly and without manual comparison.

CIA Data Products Monitored (19 schemas):

  • overview-dashboard: Parliamentary activity summary
  • party-performance: Party voting and activity metrics
  • cabinet-scorecard: Government performance tracking
  • election-analysis: Electoral outcomes and trends
  • top10-influential-mps: Parliamentary power analysis
  • top10-productive-mps: Legislation productivity metrics
  • top10-controversial-mps: Political controversy tracking
  • top10-absent-mps: Attendance pattern analysis
  • top10-rebels: Party discipline violations
  • top10-coalition-brokers: Coalition dynamics influencers
  • top10-rising-stars: Career trajectory identification
  • top10-electoral-risk: Election vulnerability analysis
  • top10-ethics-concerns: Ethics violation tracking
  • top10-media-presence: Political media prominence
  • committee-network: Committee membership networks
  • politician-career: Career progression analysis
  • party-longitudinal: Historical party data trends
  • riksdag-overview: Parliamentary structure and history
  • ministry-performance: Government ministry effectiveness

Change Detection Architecture:

  • Fetches remote schema files from CIA GitHub repository
  • Computes SHA-256 checksums of remote files
  • Compares checksums with locally cached metadata
  • Identifies added, modified, or deleted schemas
  • Generates change report for action planning
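
The checksum step above can be sketched as follows. This is a minimal illustration using Node's built-in crypto module; the function names sha256Hex and hasChanged are hypothetical and not taken from the script itself.

```javascript
const crypto = require('crypto');

// Compute the SHA-256 checksum of a schema body as a lowercase hex string.
// Accepts a string (hashed as UTF-8) or a Buffer.
function sha256Hex(content) {
  return crypto.createHash('sha256').update(content).digest('hex');
}

// A schema counts as changed when its fresh checksum differs from the cached one.
function hasChanged(remoteContent, cachedChecksum) {
  return sha256Hex(remoteContent) !== cachedChecksum;
}
```

Hashing the raw bytes means that whitespace-only edits are also reported as modifications; normalizing the JSON before hashing would be a different design choice.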

Remote Data Source:

  • CIA Repository: https://github.com/Hack23/cia
  • Schema Location: /json-export-specs/schemas/
  • Access Method: GitHub raw content CDN (no authentication required)
  • Data License: Apache-2.0 (compatible with riksdagsmonitor)
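
As an illustration of how a schema name might map to a raw-content URL, the sketch below builds one from the repository and schema location listed above. The branch name ("master") and the file suffix (".json") are assumptions made for the example; the source does not state either, so both are left as parameters.

```javascript
// Repository and directory come from the list above.
// ASSUMPTION: branch name and file suffix are illustrative defaults only.
const RAW_HOST = 'https://raw.githubusercontent.com';
const REPO = 'Hack23/cia';
const SCHEMA_DIR = 'json-export-specs/schemas';

// Build the raw-content URL for one schema name, e.g. "party-performance".
function schemaUrl(name, { branch = 'master', suffix = '.json' } = {}) {
  const file = encodeURIComponent(name + suffix);
  return `${RAW_HOST}/${REPO}/${branch}/${SCHEMA_DIR}/${file}`;
}
```

Encoding the file name keeps an unexpected character in a schema name from altering the request path.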

Local Cache Structure:

  • Schemas Directory: ./schemas/cia/
  • Metadata Directory: ./schemas/metadata/
  • Stores: Downloaded schema files, checksum verification data
  • Updated by: sync-cia-schemas.js (separate synchronization script)

Metadata Management:

  • Checksums: SHA-256 hashes for change detection
  • Update timestamps: ISO 8601 format with timezone
  • Fetch status: Success/failure/error indicators
  • Version tracking: Schema version numbers if available
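
A metadata record covering the four items above might look like the sketch below. The field names (schema, sha256, updatedAt, fetchStatus, version) are hypothetical; the source only names the categories of information that are stored.

```javascript
// Build one metadata record for a schema.
// ASSUMPTION: all field names are illustrative, not the script's actual format.
function buildMetadata({ schema, sha256, ok, version = null, now = new Date() }) {
  return {
    schema,
    sha256,                                   // checksum used for change detection
    updatedAt: now.toISOString(),             // ISO 8601 in UTC ("Z" suffix)
    fetchStatus: ok ? 'success' : 'failure',
    version,                                  // null when the schema carries no version
  };
}
```

Passing the clock in as a parameter keeps the record reproducible in tests.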

Detection Workflow:

  1. Fetch remote schema file list from CIA GitHub
  2. Compute SHA-256 checksum of each remote file
  3. Load local metadata (previous checksums)
  4. Compare remote vs. local checksums
  5. Identify differences: new, modified, deleted
  6. Generate change report with details
  7. Trigger downstream actions if changes detected
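
Steps 4 and 5 of the workflow reduce to a comparison of two name-to-checksum maps. The sketch below shows one way to do that; diffChecksums is a hypothetical name, and the plain-object map format is an assumption about how the checksums are held in memory.

```javascript
const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// remote and local map schema name -> SHA-256 hex string.
function diffChecksums(remote, local) {
  const added = [];
  const modified = [];
  const deleted = [];
  for (const [name, checksum] of Object.entries(remote)) {
    if (!hasOwn(local, name)) added.push(name);           // only upstream
    else if (local[name] !== checksum) modified.push(name); // content differs
  }
  for (const name of Object.keys(local)) {
    if (!hasOwn(remote, name)) deleted.push(name);        // only in local cache
  }
  return { added: added.sort(), modified: modified.sort(), deleted: deleted.sort() };
}
```

Sorting the result keeps the change report stable between runs regardless of fetch order.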

Update Triggers & Actions:

  • If schema added: Notify administrator for evaluation
  • If schema modified: Trigger validate-against-cia-schemas.js
  • If schema deleted: Update local cache, assess impact
  • If validation fails: Alert operations team, prevent deployment
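
The trigger rules above can be expressed as a small lookup that turns each detected change into a planned action. This is a sketch only: the action identifiers ("notify-admin", "run-script", "update-cache") and the function names are hypothetical, and the code returns a plan rather than running any script.

```javascript
// Change type -> planned action, following the trigger rules listed above.
// ASSUMPTION: action identifiers are illustrative labels.
const ACTIONS = {
  added: { action: 'notify-admin', detail: 'evaluate new schema' },
  modified: { action: 'run-script', detail: 'validate-against-cia-schemas.js' },
  deleted: { action: 'update-cache', detail: 'assess impact' },
};

function planActions(details) {
  return details.map(({ schema, change }) => {
    if (!Object.prototype.hasOwnProperty.call(ACTIONS, change)) {
      throw new Error(`Unknown change type: ${change}`);
    }
    return { schema, ...ACTIONS[change] };
  });
}

// A failed validation alerts operations and blocks deployment.
function gateDeployment(validationPassed) {
  return validationPassed
    ? { deploy: true, alert: false }
    : { deploy: false, alert: true };
}
```

Separating planning from execution makes the rules easy to test without invoking the downstream scripts.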

Error Handling:

  • Network failures: Retry with exponential backoff
  • Malformed schemas: Log and skip with alert
  • File access errors: Report with detailed diagnostics
  • Partial failures: Complete check for other schemas
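
A minimal sketch of the retry and partial-failure behavior follows. The base delay, the cap, the retry count, and all function names are assumptions chosen for illustration; the source does not specify them.

```javascript
// Delay before retry number `attempt` (0-based): base, 2x, 4x, ... capped at maxMs.
// ASSUMPTION: 500 ms base and 8000 ms cap are illustrative values.
function backoffDelay(attempt, baseMs = 500, maxMs = 8000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run `task`, retrying up to `retries` times with exponential backoff.
async function withRetry(task, { retries = 3, baseMs = 500, wait = sleep } = {}) {
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await task();
    } catch (err) {
      if (attempt >= retries) throw err; // out of retries: surface the error
      await wait(backoffDelay(attempt, baseMs));
    }
  }
}

// Check every schema; one failure is recorded but does not abort the others.
async function checkAll(names, checkOne) {
  const settled = await Promise.allSettled(names.map((name) => checkOne(name)));
  const results = [];
  const errors = [];
  settled.forEach((outcome, i) => {
    if (outcome.status === 'fulfilled') {
      results.push(outcome.value);
    } else {
      const reason = outcome.reason;
      errors.push({
        schema: names[i],
        message: reason instanceof Error ? reason.message : String(reason),
      });
    }
  });
  return { results, errors };
}
```

A caller would typically combine the two, for example checkAll(names, (n) => withRetry(() => fetchSchema(n))), where fetchSchema stands for whatever function performs the download.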

Output Report Structure:

  {
    timestamp: ISO 8601,
    status: 'success' | 'failure',
    summary: { total, added, modified, deleted },
    details: [ { schema: 'name', change: 'added|modified|deleted', ... } ],
    errors: [ ... ]
  }
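
A report with the structure described above could be assembled as in the sketch below. Two points are assumptions rather than documented behavior: the status is set to 'failure' whenever any error was recorded, and each details entry carries only the schema and change fields.

```javascript
// Assemble a change report from the lists of changed schema names.
// ASSUMPTION: any recorded error marks the whole run as 'failure'.
function buildReport({ total, added = [], modified = [], deleted = [], errors = [], now = new Date() }) {
  const tag = (change) => (schema) => ({ schema, change });
  return {
    timestamp: now.toISOString(),
    status: errors.length === 0 ? 'success' : 'failure',
    summary: {
      total,
      added: added.length,
      modified: modified.length,
      deleted: deleted.length,
    },
    details: [
      ...added.map(tag('added')),
      ...modified.map(tag('modified')),
      ...deleted.map(tag('deleted')),
    ],
    errors,
  };
}
```

The result serializes directly with JSON.stringify for use in CI logs.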

Integration Points:

  • CI/CD pipeline: Scheduled check during build process
  • sync-cia-schemas.js: Triggered to download new/updated schemas
  • validate-against-cia-schemas.js: Validates local data against updated schemas
  • Intelligence dashboards: Alerts for schema compatibility issues

Network Security:

  • HTTPS only (GitHub raw content CDN)
  • No authentication required (public repository)
  • Rate limiting: GitHub's REST API allows 60 requests/hour per IP address for unauthenticated clients
  • Implements delay between requests to respect rate limits
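
The HTTPS-only rule and the delay between requests can be sketched as below. The 200 ms default spacing and the function names are assumptions for illustration; fetchOne stands for whatever function performs the actual download.

```javascript
// Reject anything that is not an https: URL before a request is made.
function assertHttps(url) {
  const parsed = new URL(url);
  if (parsed.protocol !== 'https:') {
    throw new Error(`Refusing non-HTTPS URL: ${url}`);
  }
  return parsed;
}

// Fetch URLs one at a time with a fixed pause between requests.
// ASSUMPTION: the 200 ms default is an illustrative value.
async function fetchPaced(urls, fetchOne, intervalMs = 200) {
  const responses = [];
  for (const [index, url] of urls.entries()) {
    if (index > 0) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
    responses.push(await fetchOne(assertHttps(url).href));
  }
  return responses;
}
```

Sequential fetching is slower than parallel requests but keeps the request rate predictable.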

Performance Characteristics:

  • Fetches ~19 schema files (avg 2-5 KB each)
  • Checksum computation: < 100ms per file
  • Total execution time: 3-5 seconds typical
  • Can be scheduled hourly without performance impact

Data Integrity:

  • Checksums enable detection of file corruption
  • Change log maintains audit trail of schema evolution
  • Version control in git for local metadata tracking
  • Complies with data integrity principles
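
Corruption detection on the local cache amounts to re-hashing a cached file and comparing the result with the recorded checksum. The sketch below assumes the recorded checksum is available as a hex string; verifyCachedSchema is a hypothetical name.

```javascript
const crypto = require('crypto');
const fs = require('fs');

// Returns true when the cached file still matches its recorded SHA-256 checksum.
function verifyCachedSchema(filePath, expectedSha256) {
  const actual = crypto
    .createHash('sha256')
    .update(fs.readFileSync(filePath))
    .digest('hex');
  return actual === expectedSha256.toLowerCase();
}
```

A false result means the cached copy no longer matches its metadata, so the file should be downloaded again before it is used for validation.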

ISMS Compliance:

  • ISO 27001:2022 A.5.9 - Inventory of information and other associated assets (track schema versions)
  • ISO 27001:2022 A.8.32 - Change management (schema version tracking)
  • NIST CSF 2.0 RC.IM-2 - Incident management and improvements
  • CIS Control 5.3 - Configuration change control

Usage: node scripts/check-cia-schema-updates.js

Reports: New schemas, modified schemas, deleted schemas

Triggers: sync-cia-schemas.js if updates detected

Version:
  • 1.3.0
Author:
  • Hack23 AB (Data Infrastructure Team)
License:
  • Apache-2.0
See:
  • sync-cia-schemas.js (schema download and cache management)
  • validate-against-cia-schemas.js (data validation against schemas)
  • CIA Repository: https://github.com/Hack23/cia
  • ISO 27001:2022 A.8.32 - Change management