Module: Intelligence Operations/Data Transformation Pipeline

Core data transformation pipeline that converts raw MCP server responses into structured content for intelligence articles. The module covers legislative document processing, parliamentary event analysis, and the data mapping needed for automated journalism.

The transformation pipeline provides four specialized processing stages:

Stage 1 - Calendar Event Processing (transformCalendarToEventGrid): Transforms raw calendar data from riksdag-regering-mcp into a structured event grid suitable for visual presentation. Handles multiple timestamp fields (MCP responses may use 'datum', 'from', or 'start'), normalizes them, and groups events by date for calendar visualization. Compares each event date with the current date so that "today" events can be marked with visual indicators.
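
The grouping step described for Stage 1 can be sketched as follows. This is a minimal illustration, not the module's actual code: the helper names eventDateKey and groupEventsByDate, the output shape, and the assumption that timestamps are ISO-like strings are all hypothetical. Only the field names 'datum', 'from', and 'start' come from the description above.

```javascript
// Hypothetical helper: pick the first timestamp field present on an event
// and reduce it to a YYYY-MM-DD key (assumes ISO-like strings).
function eventDateKey(event) {
  const raw = event.datum ?? event.from ?? event.start;
  if (typeof raw !== 'string') return null;
  const key = raw.slice(0, 10);
  return /^\d{4}-\d{2}-\d{2}$/.test(key) ? key : null;
}

// Hypothetical grouping: one entry per date, sorted ascending, with a
// flag for the date that matches todayKey.
function groupEventsByDate(events, todayKey) {
  const byDate = new Map();
  for (const event of events) {
    const key = eventDateKey(event);
    if (key === null) continue; // skip events without a usable date
    if (!byDate.has(key)) byDate.set(key, []);
    byDate.get(key).push(event);
  }
  return [...byDate.keys()].sort().map((date) => ({
    date,
    isToday: date === todayKey,
    events: byDate.get(date),
  }));
}

const grid = groupEventsByDate(
  [
    { datum: '2026-02-10', title: 'Committee meeting' },
    { from: '2026-02-09T09:00:00', title: 'Plenary session' },
    { start: '2026-02-10 13:00', title: 'Vote' },
    { title: 'Event without a date' },
  ],
  '2026-02-10'
);
```

Passing todayKey in as a parameter, rather than reading the clock inside the function, keeps the sketch deterministic and easy to test.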

Stage 2 - Document Content Generation (generateArticleContent): Processes structured parliamentary documents into narrative article prose. Maps document types (propositions, motions, reports) to narrative structures, extracts semantic meaning from legislative language, and generates coherent paragraphs suitable for journalist review. Applies natural language processing techniques for readability optimization and audience targeting.

Stage 3 - Intelligence Extraction (extractWatchPoints): Performs analytical extraction of critical intelligence points from parliamentary data. Identifies policy implications, fiscal impacts, timeline constraints, and political risk factors. Uses rule-based analysis for common legislative patterns (votes, committee decisions, government actions) and produces structured watch points for inclusion in article "watch sections".
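
A rule-based pass of the kind Stage 3 describes might look like the sketch below. The rule set, the item fields (type, title), and the output shape are assumptions made for illustration; the real extractWatchPoints may use different rules and structures.

```javascript
// Hypothetical rules: each pairs a watch-point category with a predicate.
// The first matching rule wins.
const WATCH_RULES = [
  { category: 'vote', test: (item) => /\b(vote|votering)\b/i.test(item.title ?? '') },
  { category: 'committee-decision', test: (item) => item.type === 'committee-report' },
  { category: 'government-action', test: (item) => item.type === 'proposition' },
];

// Returns one watch point per item that matches a rule; other items are dropped.
function extractWatchPointsSketch(items) {
  const points = [];
  for (const item of items) {
    const rule = WATCH_RULES.find((candidate) => candidate.test(item));
    if (rule) points.push({ category: rule.category, title: item.title });
  }
  return points;
}

const watchPoints = extractWatchPointsSketch([
  { type: 'proposition', title: 'Budget bill' },
  { type: 'motion', title: 'Vote on energy policy' },
  { type: 'motion', title: 'Minor motion' },
]);
```

Keeping the rules in a data table makes it possible to add a new legislative pattern without touching the extraction loop.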

Stage 4 - Metadata Generation (generateMetadata, calculateReadTime, generateSources): Synthesizes article metadata including publication date, author attribution, reading time estimates, source citations, and SEO keywords. Generates machine-readable metadata for structured data (Schema.org JSON-LD) and social media sharing.
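
For the Schema.org JSON-LD output mentioned in Stage 4, a minimal sketch could map article metadata onto a NewsArticle object. The function name toJsonLd and the input field names (title, date, lang, author, keywords) are assumptions; the property names on the right-hand side are standard Schema.org vocabulary.

```javascript
// Hypothetical mapping from article metadata to a Schema.org NewsArticle.
function toJsonLd(meta) {
  return {
    '@context': 'https://schema.org',
    '@type': 'NewsArticle',
    headline: meta.title,
    datePublished: meta.date,
    inLanguage: meta.lang,
    author: { '@type': 'Organization', name: meta.author },
    keywords: meta.keywords.join(', '),
  };
}

const jsonLd = toJsonLd({
  title: 'Week Ahead',
  date: '2026-02-09',
  lang: 'en',
  author: 'Hack23 AB',
  keywords: ['riksdag', 'budget'],
});

// Serialized form, ready to embed in a script tag of type application/ld+json.
const jsonLdText = JSON.stringify(jsonLd, null, 2);
```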

Supported Data Types:

  • Calendar events (committee meetings, plenary sessions, parliamentary breaks)
  • Legislative documents (propositions, motions, parliamentary inquiries)
  • Voting records (roll-call votes with party/member positions)
  • Government announcements (press releases, policy documents, ministerial statements)
  • Committee reports (analysis, recommendations, decisions)
  • Debate transcripts (parliamentary speeches with speaker context)

Multi-Language Processing:

  • Swedish source content transformation into 14 target languages
  • Terminology mapping for political/legal concepts
  • Date formatting and timezone adjustment per target language
  • Pluralization and grammatical agreement handling
  • RTL language support for Arabic and Hebrew output
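
Two of the items above, per-language date formatting with timezone adjustment and RTL support, can be illustrated with the built-in Intl API. The function names and the choice of Europe/Stockholm as the default timezone are assumptions for this sketch, not details confirmed by the module.

```javascript
// Languages rendered right-to-left, per the list above.
const RTL_LANGUAGES = new Set(['ar', 'he']);

// Hypothetical helper: value for an HTML dir attribute.
function textDirection(lang) {
  return RTL_LANGUAGES.has(lang) ? 'rtl' : 'ltr';
}

// Hypothetical helper: format a date for a locale, shifting it into the
// given timezone first (assumed here to be Swedish local time).
function formatArticleDate(date, locale, timeZone = 'Europe/Stockholm') {
  return new Intl.DateTimeFormat(locale, { dateStyle: 'long', timeZone }).format(date);
}

// 23:30 UTC on 10 February is already 11 February in Stockholm.
const lateEvening = new Date('2025-02-10T23:30:00Z');
const formatted = formatArticleDate(lateEvening, 'en-US');
```

The timezone matters near midnight: without it, an event late in the evening UTC would be shown on the wrong calendar day for Swedish readers.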

Data Validation & Quality Assurance:

  • Schema validation against CIA data model definitions
  • Null/undefined field handling with intelligent fallbacks
  • Temporal consistency checking (dates in correct order)
  • Cross-reference validation (referenced documents exist)
  • Semantic completeness assessment
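
Two of the checks above, fallback handling for missing fields and temporal consistency, are simple enough to sketch. Both helper names are hypothetical, and the sketch assumes dates are ISO 8601 strings in a uniform format so that string comparison matches chronological order.

```javascript
// Hypothetical helper: return the first value that is not null, undefined,
// or an empty string; otherwise return the fallback.
function firstDefined(values, fallback) {
  const found = values.find((v) => v !== null && v !== undefined && v !== '');
  return found === undefined ? fallback : found;
}

// Hypothetical helper: true when every date is on or after the one before it.
// Relies on ISO 8601 strings of equal format sorting lexicographically.
function isChronological(isoDates) {
  return isoDates.every((date, i) => i === 0 || isoDates[i - 1] <= date);
}

const title = firstDefined([undefined, null, '', 'Titel'], 'Untitled');
const ordered = isChronological(['2025-01-10', '2025-02-01', '2025-02-01']);
```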

Version:
  • 2.0.0
Author:
  • Hack23 AB - Intelligence Operations Team
License:
  • Apache-2.0
See:
  • ./mcp-client.js: MCP API client providing raw data
  • ./article-template.js: Template rendering consuming transformed data
  • ./generate-news-enhanced.js: Article generation orchestration
  • ./html-utils.js: HTML sanitization (escapeHtml)
  • docs/DATA_TRANSFORMATION_GUIDE.md: Detailed transformation algorithms
  • docs/MCP_DATA_SCHEMA.md: MCP response schema definitions
  • docs/INTELLIGENCE_EXTRACTION.md: Intelligence analysis methodology

Members

(static, constant) CONTENT_LABELS

Generate Week Ahead article content

(inner, constant) COMMITTEE_NAMES

Map Swedish committee codes to full names for richer descriptions
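
A code-to-name lookup of this kind can be sketched as below. The three entries are real Riksdag committee codes, but they are only an illustrative subset chosen for this example; the actual contents of COMMITTEE_NAMES and the name committeeNameSketch are assumptions.

```javascript
// Illustrative subset only; the real constant may hold different entries.
const COMMITTEE_NAMES_SKETCH = new Map([
  ['FiU', 'Finansutskottet'],
  ['KU', 'Konstitutionsutskottet'],
  ['JuU', 'Justitieutskottet'],
]);

// Hypothetical lookup: fall back to the raw code when it is unknown,
// so that descriptions never end up empty.
function committeeNameSketch(code) {
  return COMMITTEE_NAMES_SKETCH.get(code) ?? code;
}

const known = committeeNameSketch('FiU');
const unknown = committeeNameSketch('XyZ');
```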

(inner, constant) LOCALE_MAP

Map of custom locale codes to Intl-compatible locale strings
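
As a sketch of how such a map is typically used: a custom code is translated before being handed to Intl, and codes without an entry pass through unchanged. The entries shown and the helper name toIntlLocale are hypothetical, since the actual contents of LOCALE_MAP are not listed here.

```javascript
// Hypothetical entries; the real LOCALE_MAP may differ.
const LOCALE_MAP_SKETCH = new Map([
  ['no', 'nb-NO'],
  ['sv', 'sv-SE'],
  ['en', 'en-US'],
]);

// Hypothetical helper: resolve a custom code to an Intl-compatible locale.
function toIntlLocale(code) {
  return LOCALE_MAP_SKETCH.get(code) ?? code;
}

// The resolved locale can be passed straight to Intl constructors.
const formatter = new Intl.DateTimeFormat(toIntlLocale('en'), { year: 'numeric', timeZone: 'UTC' });
const year = formatter.format(new Date(Date.UTC(2025, 1, 10)));
```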

Methods

(static) L()

Get localized label with fallback to English
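
The fallback behavior can be sketched as a two-step lookup: try the requested language, then English. The label table, the key names, the (lang, key) parameter order, and the final fallback to the key itself are all assumptions for this example, because the signature of L() is not documented here.

```javascript
// Hypothetical label table; Swedish intentionally lacks the 'sources' key.
const LABELS_SKETCH = {
  en: { readMore: 'Read more', sources: 'Sources' },
  sv: { readMore: 'Läs mer' },
};

// Hypothetical lookup: requested language first, then English,
// then the key itself as a last resort.
function labelSketch(lang, key) {
  return LABELS_SKETCH[lang]?.[key] ?? LABELS_SKETCH.en[key] ?? key;
}

const swedish = labelSketch('sv', 'readMore');
const fallback = labelSketch('sv', 'sources');
```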

(static) calculateReadTime(content) → {string}

Calculate estimated read time

Parameters:
Name     Type    Description
content  string  Article HTML content

Returns:

Read time (e.g., "5 min read")

Type
string
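
A common way to estimate read time is to strip the markup, count words, and divide by a reading speed. The sketch below assumes 200 words per minute and a one-minute minimum; neither value is confirmed by the module, and readTimeSketch is a hypothetical name.

```javascript
// Hypothetical estimate: strip tags, count whitespace-separated words,
// and round up to whole minutes (never less than one).
function readTimeSketch(html, wordsPerMinute = 200) {
  const text = html.replace(/<[^>]*>/g, ' ');
  const words = text.split(/\s+/).filter(Boolean).length;
  const minutes = Math.max(1, Math.ceil(words / wordsPerMinute));
  return `${minutes} min read`;
}

// 450 words at 200 words per minute rounds up to 3 minutes.
const estimate = readTimeSketch('<p>' + 'word '.repeat(450) + '</p>');
```

The tag-stripping regular expression is adequate for counting words but is not an HTML sanitizer.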

(static) extractTopics(documents) → {Array}

Extract key topics from documents

Parameters:
Name       Type   Description
documents  Array  Documents from MCP server

Returns:

Topic tags

Type
Array

(static) extractWatchPoints(data, lang) → {Array}

Extract "Watch Points" from data

Parameters:
Name  Type    Description
data  Object  MCP data
lang  string  Language code

Returns:

Watch points for article

Type
Array

(static) generateArticleContent(data, type, lang) → {string}

Generate article content from MCP data

Parameters:
Name  Type    Description
data  Object  MCP data (events, documents, etc.)
type  string  Article type (week-ahead, committee-reports, etc.)
lang  string  Language code

Returns:

Article HTML content

Type
string

(static) generateMetadata(data, type, lang) → {Object}

Generate article metadata

Parameters:
Name  Type    Description
data  Object  Article data
type  string  Article type
lang  string  Language code

Returns:

Article metadata

Type
Object

(static) generateSources(tools) → {Array}

Generate article sources list

Parameters:
Name   Type   Description
tools  Array  MCP tools used

Returns:

Sources list

Type
Array

(static) transformCalendarToEventGrid(events, lang) → {Array}

Transform calendar events into event grid structure for template

Parameters:
Name    Type    Description
events  Array   Calendar events from MCP server
lang    string  Language code (en, sv)

Returns:

Event grid structure for article template

Type
Array

(inner) formatDayLabel()

Format day label (e.g., "February 10 - Monday") using Intl for all 14 languages
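
A label in the documented "February 10 - Monday" shape can be assembled from two Intl.DateTimeFormat calls. The function name dayLabelSketch and the fixed UTC timezone are assumptions made so the example is deterministic; the real function may handle timezones differently.

```javascript
// Hypothetical helper: "<month day> - <weekday>" in the given locale.
// UTC is used here only to keep the example independent of the host timezone.
function dayLabelSketch(date, locale) {
  const monthDay = new Intl.DateTimeFormat(locale, {
    month: 'long',
    day: 'numeric',
    timeZone: 'UTC',
  }).format(date);
  const weekday = new Intl.DateTimeFormat(locale, {
    weekday: 'long',
    timeZone: 'UTC',
  }).format(date);
  return `${monthDay} - ${weekday}`;
}

const englishLabel = dayLabelSketch(new Date(Date.UTC(2025, 1, 10)), 'en-US');
```

Because Intl supplies the month and weekday names, the same function serves every supported language without per-language string tables.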

(inner) formatDayName()

Format day name (Monday, Tuesday, etc.) using Intl for all 14 languages

(inner) generateCommitteeContent()

Generate Committee Reports content with analytical narrative

(inner) generateEnhancedSummary(doc, type, lang) → {string}

Generate an enhanced summary from document metadata when the summary field is missing. Uses document type, subtype, organ, and other metadata to create an informative placeholder.

Parameters:
Name  Type    Description
doc   Object  Document object
type  string  Document type (report, proposition, motion)
lang  string  Language code

Returns:

Enhanced summary text

Type
string
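
The placeholder logic described above might be sketched like this. The field names summary, subtype, organ, and title, the English-only wording, and the name placeholderSummary are assumptions; the real function also takes a language code, which this sketch omits.

```javascript
// Hypothetical sketch: return the existing summary if present, otherwise
// build a short placeholder from whichever metadata fields are available.
function placeholderSummary(doc, type) {
  if (doc.summary) return doc.summary;
  const parts = [type];
  if (doc.subtype) parts.push(`(${doc.subtype})`);
  if (doc.organ) parts.push(`from ${doc.organ}`);
  if (doc.title) parts.push(`titled "${doc.title}"`);
  return parts.join(' ');
}

const generated = placeholderSummary({ organ: 'FiU', title: 'Budget' }, 'report');
const existing = placeholderSummary({ summary: 'Already summarized.' }, 'motion');
```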

(inner) generateGenericContent()

Generate generic content

(inner) generateMotionsContent()

Generate Motions content with analytical narrative

(inner) generatePropositionsContent()

Generate Propositions content with analytical narrative

(inner) getCommitteeName()

Get human-readable committee name from code

(inner) isHighPriority()

Determine if event is high priority

(inner) isTodayDate()

Check if date is today
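
One way to implement such a check without timezone surprises is to compare calendar-date strings rather than Date objects. The sketch below assumes the input starts with a YYYY-MM-DD date and that "today" means the host's local date; both function names are hypothetical.

```javascript
// Hypothetical helper: local calendar date of a Date as YYYY-MM-DD.
function localDateKey(date) {
  const year = date.getFullYear();
  const month = String(date.getMonth() + 1).padStart(2, '0');
  const day = String(date.getDate()).padStart(2, '0');
  return `${year}-${month}-${day}`;
}

// Hypothetical check: does the date string fall on the same local day as now?
// The now parameter exists so the check can be tested with a fixed clock.
function isTodaySketch(dateString, now = new Date()) {
  return typeof dateString === 'string' && dateString.slice(0, 10) === localDateKey(now);
}

const fixedNow = new Date(2026, 1, 10, 15, 0); // 10 February 2026, local time
const sameDay = isTodaySketch('2026-02-10T09:00:00', fixedNow);
```

Comparing strings avoids a known pitfall: new Date('2026-02-10') is parsed as UTC midnight, so local getters can report the previous day in timezones west of UTC.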
