Core data transformation pipeline that converts raw MCP server responses into structured intelligence article content. This module processes legislative data, analyzes parliamentary events, and maps the results into the structures used for automated journalism.
The transformation pipeline provides four specialized processing stages:
Stage 1 - Calendar Event Processing (transformCalendarToEventGrid): Transforms raw calendar data from riksdag-regering-mcp into a structured event grid suitable for visual presentation. Handles multiple timestamp fields (MCP responses may use 'datum', 'from', or 'start'), normalizes the timestamps, and groups events by date for calendar visualization. Compares each date against the current date so that "today" events can be marked with visual indicators.
Stage 2 - Document Content Generation (generateArticleContent): Processes structured parliamentary documents into narrative article prose. Maps document types (propositions, motions, reports) to narrative structures, extracts semantic meaning from legislative language, and generates coherent paragraphs suitable for journalist review. Applies natural language processing techniques for readability optimization and audience targeting.
Stage 3 - Intelligence Extraction (extractWatchPoints): Performs analytical extraction of critical intelligence points from parliamentary data. Identifies policy implications, fiscal impacts, timeline constraints, and political risk factors. Uses rule-based analysis for common legislative patterns (votes, committee decisions, government actions) and produces structured watch points for inclusion in article "watch sections".
Stage 4 - Metadata Generation (generateMetadata, calculateReadTime, generateSources): Synthesizes article metadata including publication date, author attribution, reading time estimates, source citations, and SEO keywords. Generates machine-readable metadata for structured data (Schema.org JSON-LD) and social media sharing.
Supported Data Types:
- Calendar events (committee meetings, plenary sessions, parliamentary breaks)
- Legislative documents (propositions, motions, parliamentary inquiries)
- Voting records (roll-call votes with party/member positions)
- Government announcements (press releases, policy documents, ministerial statements)
- Committee reports (analysis, recommendations, decisions)
- Debate transcripts (parliamentary speeches with speaker context)
Multi-Language Processing:
- Swedish source content transformation into 14 target languages
- Terminology mapping for political/legal concepts
- Date formatting and timezone adjustment per target language
- Pluralization and grammatical agreement handling
- RTL language support for Arabic and Hebrew output
Data Validation & Quality Assurance:
- Schema validation against CIA data model definitions
- Null/undefined field handling with intelligent fallbacks
- Temporal consistency checking (dates in correct order)
- Cross-reference validation (referenced documents exist)
- Semantic completeness assessment
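The temporal consistency check listed above could look roughly like the following. This is a minimal sketch, not the module's actual validation code: the function name `datesInOrder` and the assumption that dates arrive as ISO 8601 strings are hypothetical.

```javascript
// Hypothetical helper (not part of the documented API).
// Returns true when every date parses and no date is earlier
// than the one before it.
function datesInOrder(isoDates) {
  let previous = -Infinity;
  for (const value of isoDates) {
    const time = Date.parse(value);
    if (Number.isNaN(time)) return false; // unparseable dates fail the check
    if (time < previous) return false; // out of order
    previous = time;
  }
  return true;
}
```

Equal consecutive dates are treated as valid here, since several events can legitimately share a day.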
- Version: 2.0.0
- License: Apache-2.0
- See:
  - ./mcp-client.js: MCP API client providing raw data
  - ./article-template.js: Template rendering consuming transformed data
  - ./generate-news-enhanced.js: Article generation orchestration
  - ./html-utils.js: HTML sanitization (escapeHtml)
  - docs/DATA_TRANSFORMATION_GUIDE.md: Detailed transformation algorithms
  - docs/MCP_DATA_SCHEMA.md: MCP response schema definitions
  - docs/INTELLIGENCE_EXTRACTION.md: Intelligence analysis methodology
Members
(static, constant) CONTENT_LABELS
Generate Week Ahead article content
(inner, constant) COMMITTEE_NAMES
Map Swedish committee codes to full names for richer descriptions
(inner, constant) LOCALE_MAP
Map of custom locale codes to Intl-compatible locale strings
Methods
(static) L()
Get localized label with fallback to English
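The fallback behaviour of `L()` can be illustrated with a small sketch. The signature of `L()` is not documented on this page, so the `(lang, key)` parameters, the `EXAMPLE_LABELS` table, and the final fallback to the key itself are all assumptions made for illustration.

```javascript
// Hypothetical label table; the real CONTENT_LABELS structure is not shown here.
const EXAMPLE_LABELS = {
  en: { readMore: 'Read more', sources: 'Sources' },
  sv: { readMore: 'Läs mer' },
};

// Look up a label in the requested language, then in English,
// and finally return the key itself so the output is never empty.
function label(lang, key) {
  return EXAMPLE_LABELS[lang]?.[key] ?? EXAMPLE_LABELS.en[key] ?? key;
}
```

With this table, `label('sv', 'sources')` falls back to the English entry because the Swedish table has no `sources` key.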
(static) calculateReadTime(content) → {string}
Calculate estimated read time
Parameters:
| Name | Type | Description |
|---|---|---|
| content | string | Article HTML content |
Returns:
Read time (e.g., "5 min read")
- Type: string
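A read-time estimate of this kind is usually a word count divided by a reading speed. The sketch below assumes 200 words per minute and a one-minute minimum; neither value is confirmed by this page, and `estimateReadTime` is a hypothetical name rather than the module's function.

```javascript
// Hypothetical sketch of a read-time estimate for HTML content.
function estimateReadTime(html, wordsPerMinute = 200) {
  // Replace tags with spaces so words on either side of a tag stay separate.
  const text = html.replace(/<[^>]*>/g, ' ');
  const words = text.split(/\s+/).filter(Boolean).length;
  // Round up and never report less than one minute.
  const minutes = Math.max(1, Math.ceil(words / wordsPerMinute));
  return `${minutes} min read`;
}
```

The regular expression is only adequate for counting words; it is not a substitute for real HTML parsing or sanitization.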
(static) extractTopics(documents) → {Array}
Extract key topics from documents
Parameters:
| Name | Type | Description |
|---|---|---|
| documents | Array | Documents from MCP server |
Returns:
Topic tags
- Type: Array
(static) extractWatchPoints(data, lang) → {Array}
Extract "Watch Points" from data
Parameters:
| Name | Type | Description |
|---|---|---|
| data | Object | MCP data |
| lang | string | Language code |
Returns:
Watch points for article
- Type: Array
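The overview describes watch-point extraction as rule-based analysis of common legislative patterns (votes, committee decisions, government actions). One way to structure such rules is a table of patterns, sketched below. The rule patterns, the `title` field, and the shape of the returned objects are assumptions for illustration, not the module's actual rules.

```javascript
// Hypothetical rule table: each rule tags items whose title matches a pattern.
const WATCH_RULES = [
  { category: 'vote', pattern: /votering|vote/i },
  { category: 'committee-decision', pattern: /betänkande|committee/i },
  { category: 'government-action', pattern: /proposition|regeringen|government/i },
];

// Apply the first matching rule to each item; items matching no rule are skipped.
function findWatchPoints(items) {
  const points = [];
  for (const item of items) {
    const title = item.title ?? '';
    const rule = WATCH_RULES.find((r) => r.pattern.test(title));
    if (rule) points.push({ category: rule.category, title });
  }
  return points;
}
```

Keeping the rules in a table means new patterns can be added without touching the matching loop.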
(static) generateArticleContent(data, type, lang) → {string}
Generate article content from MCP data
Parameters:
| Name | Type | Description |
|---|---|---|
| data | Object | MCP data (events, documents, etc.) |
| type | string | Article type (week-ahead, committee-reports, etc.) |
| lang | string | Language code |
Returns:
Article HTML content
- Type: string
(static) generateMetadata(data, type, lang) → {Object}
Generate article metadata
Parameters:
| Name | Type | Description |
|---|---|---|
| data | Object | Article data |
| type | string | Article type |
| lang | string | Language code |
Returns:
Article metadata
- Type: Object
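The overview mentions machine-readable metadata as Schema.org JSON-LD. A minimal sketch of that mapping is shown below. The input field names (`title`, `date`, `lang`, `keywords`) and the helper name `buildJsonLd` are hypothetical; the output properties (`headline`, `datePublished`, `inLanguage`, `keywords`) are standard Schema.org `NewsArticle` properties.

```javascript
// Hypothetical sketch: map article metadata to a Schema.org NewsArticle object.
function buildJsonLd(meta) {
  return {
    '@context': 'https://schema.org',
    '@type': 'NewsArticle',
    headline: meta.title,
    datePublished: meta.date,
    inLanguage: meta.lang,
    keywords: (meta.keywords ?? []).join(', '),
  };
}
```

The resulting object can be serialized with `JSON.stringify` and embedded in a `<script type="application/ld+json">` element.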
(static) generateSources(tools) → {Array}
Generate article sources list
Parameters:
| Name | Type | Description |
|---|---|---|
| tools | Array | MCP tools used |
Returns:
Sources list
- Type: Array
(static) transformCalendarToEventGrid(events, lang) → {Array}
Transform calendar events into event grid structure for template
Parameters:
| Name | Type | Description |
|---|---|---|
| events | Array | Calendar events from MCP server |
| lang | string | Language code (en, sv) |
Returns:
Event grid structure for article template
- Type: Array
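The overview notes that calendar events may carry their timestamp in a 'datum', 'from', or 'start' field and are grouped by date with a "today" marker. The sketch below shows one way to do that. The field priority order, the output shape (`date`, `isToday`, `events`), and all function names are assumptions, not the module's actual implementation.

```javascript
// Pick the first available timestamp field (priority order is an assumption).
function pickTimestamp(event) {
  return event.datum ?? event.from ?? event.start ?? null;
}

// Reduce a timestamp to a YYYY-MM-DD key. Strings keep their leading date
// part as written; Date objects are converted using UTC.
function toDateKey(value) {
  if (typeof value === 'string') {
    const match = /^(\d{4}-\d{2}-\d{2})/.exec(value.trim());
    return match ? match[1] : null;
  }
  if (value instanceof Date && !Number.isNaN(value.getTime())) {
    return value.toISOString().slice(0, 10);
  }
  return null;
}

// Group events by day, sorted by date, flagging the group that matches "today".
function groupEventsByDate(events, today = new Date()) {
  const todayKey = toDateKey(today);
  const groups = new Map();
  for (const event of events) {
    const key = toDateKey(pickTimestamp(event));
    if (key === null) continue; // skip events without a usable date
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(event);
  }
  return [...groups.keys()].sort().map((date) => ({
    date,
    isToday: date === todayKey,
    events: groups.get(date),
  }));
}
```

Passing `today` as a parameter keeps the function deterministic, which makes the "today" marker straightforward to test.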
(inner) formatDayLabel()
Format day label (e.g., "February 10 - Monday") using Intl for all 14 languages
(inner) formatDayName()
Format day name (Monday, Tuesday, etc.) using Intl for all 14 languages
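Both day formatters are described as using `Intl`, with LOCALE_MAP translating custom locale codes into Intl-compatible ones. A minimal sketch using `Intl.DateTimeFormat` follows. The entries in `EXAMPLE_LOCALES`, the English fallback, the fixed UTC time zone, and the function names are assumptions, since the real map and signatures are not shown on this page.

```javascript
// Hypothetical mapping from short codes to Intl locale strings.
const EXAMPLE_LOCALES = { en: 'en-US', sv: 'sv-SE', no: 'nb-NO' };

function resolveLocale(lang) {
  return EXAMPLE_LOCALES[lang] ?? 'en-US'; // fall back to English
}

// Full weekday name in the target language.
function dayName(date, lang) {
  return new Intl.DateTimeFormat(resolveLocale(lang), {
    weekday: 'long',
    timeZone: 'UTC',
  }).format(date);
}

// "Month day - Weekday" label; the separator follows the example in the docs.
function dayLabel(date, lang) {
  const monthDay = new Intl.DateTimeFormat(resolveLocale(lang), {
    month: 'long',
    day: 'numeric',
    timeZone: 'UTC',
  }).format(date);
  return `${monthDay} - ${dayName(date, lang)}`;
}
```

Fixing the time zone avoids the label shifting by a day depending on where the code runs; a production version would more likely use the Swedish time zone.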
(inner) generateCommitteeContent()
Generate Committee Reports content with analytical narrative
(inner) generateEnhancedSummary(doc, type, lang) → {string}
Generate an enhanced summary from document metadata when the summary field is missing. Uses document type, subtype, organ, and other metadata to create an informative placeholder.
Parameters:
| Name | Type | Description |
|---|---|---|
| doc | Object | Document object |
| type | string | Document type (report, proposition, motion) |
| lang | string | Language code |
Returns:
Enhanced summary text
- Type: string
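The fallback described above can be sketched as follows. The document field names (`summary`, `subtype`, `organ`, `title`) and the English-only sentence template are assumptions; this sketch ignores the `lang` parameter and uses a hypothetical name, `buildFallbackSummary`.

```javascript
// Hypothetical sketch: build a placeholder summary from document metadata.
function buildFallbackSummary(doc, type) {
  // Prefer a real summary when one is present.
  if (typeof doc.summary === 'string' && doc.summary.trim() !== '') {
    return doc.summary.trim();
  }
  // Otherwise assemble a sentence from whatever metadata exists.
  let text = type.charAt(0).toUpperCase() + type.slice(1);
  if (doc.subtype) text += ` (${doc.subtype})`;
  if (doc.organ) text += ` from ${doc.organ}`;
  if (doc.title) text += `: ${doc.title}`;
  return `${text}.`;
}
```

Each metadata field is optional, so the placeholder degrades gracefully down to the bare document type.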
(inner) generateGenericContent()
Generate generic content
(inner) generateMotionsContent()
Generate Motions content with analytical narrative
(inner) generatePropositionsContent()
Generate Propositions content with analytical narrative
(inner) getCommitteeName()
Get human-readable committee name from code
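A lookup like `getCommitteeName()` is typically a table access with a fallback for unknown codes. The sketch below uses three real Riksdag committee codes as sample data. The table name, the function name, and the choice to return the raw code when no match exists are assumptions; the module's COMMITTEE_NAMES constant is presumably larger.

```javascript
// Sample entries only; not the module's full COMMITTEE_NAMES table.
const EXAMPLE_COMMITTEES = {
  FiU: 'Finansutskottet',
  KU: 'Konstitutionsutskottet',
  JuU: 'Justitieutskottet',
};

// Return the full committee name, or the code itself when it is unknown.
function committeeName(code) {
  return Object.prototype.hasOwnProperty.call(EXAMPLE_COMMITTEES, code)
    ? EXAMPLE_COMMITTEES[code]
    : code;
}
```

Using an own-property check avoids accidentally matching inherited keys such as `constructor`.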
(inner) isHighPriority()
Determine if event is high priority
(inner) isTodayDate()
Check if date is today