Essential utility library providing safe HTML generation and sanitization functions used across the political intelligence platform. Prevents XSS vulnerabilities while enabling dynamic content generation for multi-language news articles and dashboards.
Core Functionality:
- HTML entity encoding: Converts &, <, >, ", ' to their safe HTML entity equivalents
- XSS attack prevention: Blocks script injection through user-generated or external content
- Safe HTML generation: Enables dynamic DOM construction without direct innerHTML risks
- JSON-LD integration: Escapes special characters for embedded structured data
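The entity encoding listed above can be sketched as a single-pass replacement. This is a minimal illustration, not the module's actual source: the `HTML_ENTITIES` map is a hypothetical name, and the choice of `&#39;` for the apostrophe is an assumption, since the documentation does not say which entity form is used.

```javascript
// Hypothetical stand-in for escapeHtml; the real code in html-utils.js may differ.
const HTML_ENTITIES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;', // assumption: numeric entity form for the apostrophe
};

function escapeHtml(text) {
  // One regex pass over the input, so the '&' inside an emitted entity
  // is never escaped a second time.
  return text.replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

console.log(escapeHtml('<script>alert("x")</script>'));
// → &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;
```

Doing the replacement in one pass with a lookup table avoids the ordering bug of chained `replace` calls, where escaping `&` after `<` would turn `&lt;` into `&amp;lt;`.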
Security Architecture:
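- Fixed-set entity encoding: only the five characters &, <, >, ", ' are escaped; all other text passes through unchanged
- Mitigates XSS in HTML element content and quoted attribute values through output encoding; script, style, and URL contexts need their own encoding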
- Supports generation of safe HTML templates for 14 languages
- Compatible with content security policy (CSP) headers
Integration Usage Across Codebase:
- generate-news-indexes.js: Escapes article titles and metadata in dynamic index pages
- generate-news-backport.js: Sanitizes article content during legacy migration
- generate-sitemap.js: Escapes URL parameters and article descriptions
- generate-news-enhanced.js: Handles safe HTML generation for multi-language articles
- validate-articles-playwright.js: Validates generated HTML content integrity
OWASP Security Standards:
- Implements Output Encoding from OWASP Top 10 (A03:2021 - Injection)
- Prevents Stored XSS through proper entity escaping
- Does not address OWASP API Security API2 (Broken Authentication); authentication and authorization are outside this module's scope
Data Protection:
- No sensitive data storage; pure utility functions
- Operates on publicly available political content
- References ISO 27001:2022 A.8.22 (segregation of networks; numbered A.13.1.3 in the 2013 edition)
- Supports GDPR Article 32 (security of processing)
Multi-Language Support:
- Handles UTF-8 text across all supported scripts (Latin, CJK, Arabic, Hebrew, etc.)
- Preserves linguistic integrity while ensuring security
- Supports bidirectional text (Hebrew, Arabic) in HTML attributes
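The multi-language behavior above follows from escaping only five ASCII characters. A short sketch, assuming a stand-in `escapeHtml` equivalent to the documented one; the sample titles are made up for illustration.

```javascript
// Hypothetical stand-in for escapeHtml; the real code in html-utils.js may differ.
function escapeHtml(text) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return text.replace(/[&<>"']/g, (ch) => entities[ch]);
}

// Made-up sample titles in Latin, CJK, Arabic, and Hebrew scripts.
const titles = [
  'Riksdagen röstar om budgeten',
  '議会が予算を可決',
  'البرلمان يقر الميزانية',
  'הכנסת אישרה את התקציב',
];

// Text without any of the five special characters comes back unchanged.
const unchanged = titles.every((t) => escapeHtml(t) === t);
console.log(unchanged); // → true

// Special characters are replaced; the surrounding non-Latin text is kept.
const mixed = escapeHtml('議会 <速報> & 予算');
console.log(mixed); // → 議会 &lt;速報&gt; &amp; 予算

// Right-to-left text containing a double quote, placed in a quoted attribute.
const attr = escapeHtml('דו"ח התקציב');
const heading = `<h2 dir="rtl" title="${attr}"></h2>`;
console.log(heading);
```

Because the function never reorders or normalizes characters, text direction is left to the `dir` attribute and the browser's bidirectional algorithm.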
Performance Considerations:
- Lightweight regex-based character replacement
- Minimal memory footprint for bulk article processing
- Suitable for high-volume content generation pipelines
Functions:
- escapeHtml(text): Escapes HTML special characters for safe inclusion in HTML/JSON-LD
Usage Example:
import { escapeHtml } from './html-utils.js';
const safeTitle = escapeHtml(userProvidedTitle);
const jsonLd = `"headline": "${escapeHtml(articleTitle)}"`;
Version: 2.1.0
License: Apache-2.0
See:
- OWASP Output Encoding
- CWE-79: Improper Neutralization of Input During Web Page Generation
- ISO 27001:2022 A.8.22 - Segregation of networks (A.13.1.3 in the 2013 edition)
- GDPR Article 32 - Security of processing
Methods
(static) escapeHtml(text) → {string}
Escape HTML special characters for safe inclusion in HTML/JSON-LD. Prevents XSS by converting &, <, >, ", ' to their HTML entity equivalents.
Parameters:
| Name | Type | Description |
|---|---|---|
| text | string | Raw text to escape |
Returns:
Escaped text safe for HTML insertion
Type: string
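As a usage illustration for the method above, the following sketch builds one entry of an index page. `renderIndexEntry` and the `article` fields are hypothetical names, and `escapeHtml` is a local stand-in so the snippet runs on its own; in the codebase it would be imported from './html-utils.js'.

```javascript
// Hypothetical stand-in for escapeHtml; the real code in html-utils.js may differ.
function escapeHtml(text) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return text.replace(/[&<>"']/g, (ch) => entities[ch]);
}

// Hypothetical helper: renders one list item for an article index page.
// Every interpolated value is escaped at the point of output.
function renderIndexEntry(article) {
  const url = escapeHtml(article.url);
  const title = escapeHtml(article.title);
  return `<li><a href="${url}" title="${title}">${title}</a></li>`;
}

const entry = renderIndexEntry({
  url: '/news/2024-budget.html?lang=en&page=2',
  title: 'Budget "showdown" <live>',
});
console.log(entry);
// → <li><a href="/news/2024-budget.html?lang=en&amp;page=2" title="Budget &quot;showdown&quot; &lt;live&gt;">Budget &quot;showdown&quot; &lt;live&gt;</a></li>
```

Escaping keeps the value inside its quoted attribute, but it does not check URL schemes, so a `javascript:` link would still need separate validation.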