Themes Summary (CSV)
Contains one row per theme with:
- Theme ID: Numeric identifier (1, 2, 3...)
- Top Words: Comma-separated list of characteristic terms
- Document Count: Number of documents assigned to this theme
- Percentage: Share of total documents
Use for: Executive summaries, presentations, theme labeling reference
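As a rough illustration of how the Document Count and Percentage columns relate to the per-document assignments, the sketch below rebuilds both from a list of assigned theme IDs. The sample IDs, the `summarize_themes` helper, and the output field names are illustrative assumptions, not the tool's actual implementation.

```python
from collections import Counter

# Hypothetical assigned theme IDs, one per document (illustrative data only).
assigned_themes = [1, 1, 2, 3, 1, 2, 1, 3]


def summarize_themes(theme_ids):
    """Return one row per theme with its document count and share of the total."""
    counts = Counter(theme_ids)
    total = len(theme_ids)
    return [
        {
            "theme_id": theme_id,
            "document_count": count,
            "percentage": round(100 * count / total, 1),
        }
        for theme_id, count in sorted(counts.items())
    ]


summary = summarize_themes(assigned_themes)
for row in summary:
    print(row)
```

Because every document has exactly one primary theme, the percentages should add up to roughly 100, apart from rounding.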
Document Assignments (CSV)
Contains one row per document with:
- Document ID: Row number from your original data
- Original Text: The full text of the document
- Assigned Theme: Primary theme ID
- Confidence: Assignment confidence percentage
- Theme_1_Score, Theme_2_Score, ...: Raw scores for each theme
Use for: Further analysis in Excel/R/Python, filtering by theme, joining with other data
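A minimal sketch of reading the assignments file in Python and filtering by theme, using only the standard library. The header spellings and the inline sample rows are assumptions made for illustration (check them against your exported file), and the sketch assumes Confidence is stored as a plain number without a percent sign.

```python
import csv
import io

# Illustrative stand-in for the exported file; header spellings are assumed.
SAMPLE_CSV = """Document ID,Original Text,Assigned Theme,Confidence,Theme_1_Score,Theme_2_Score
1,"Shipping was slow, but support helped",2,81.5,0.185,0.815
2,Great price for the quality,1,93.0,0.930,0.070
3,Support never called me back,2,67.2,0.328,0.672
"""


def load_assignments(handle):
    """Parse assignment rows, converting the numeric columns from strings."""
    rows = []
    for row in csv.DictReader(handle):
        row["Document ID"] = int(row["Document ID"])
        row["Assigned Theme"] = int(row["Assigned Theme"])
        row["Confidence"] = float(row["Confidence"])
        rows.append(row)
    return rows


assignments = load_assignments(io.StringIO(SAMPLE_CSV))

# Keep only the documents whose primary theme is 2.
theme_2_docs = [r for r in assignments if r["Assigned Theme"] == 2]
print([r["Document ID"] for r in theme_2_docs])
```

For a real export, replace the `io.StringIO(...)` handle with `open(path, newline="", encoding="utf-8")`, where `path` points to your downloaded file.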
💡 Analysis Ideas
- Join assignments with customer metadata (segment, region, NPS score)
- Filter low-confidence assignments for manual review
- Track theme distribution over time periods
- Compare themes across product lines or customer segments
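The ideas above can be sketched in a few lines of Python. Everything here is hypothetical: the field names, the metadata table, and the 60% review threshold are assumptions chosen for illustration, so adapt them to your own export and data.

```python
from collections import Counter, defaultdict

# Illustrative assignment rows (assumed field names, not the tool's exact headers).
assignments = [
    {"doc_id": 1, "theme": 1, "confidence": 92.0},
    {"doc_id": 2, "theme": 2, "confidence": 48.5},
    {"doc_id": 3, "theme": 1, "confidence": 75.0},
    {"doc_id": 4, "theme": 2, "confidence": 88.0},
]

# Hypothetical customer metadata keyed by the same document ID.
metadata = {
    1: {"segment": "Enterprise", "month": "2024-01"},
    2: {"segment": "SMB", "month": "2024-01"},
    3: {"segment": "SMB", "month": "2024-02"},
    4: {"segment": "Enterprise", "month": "2024-02"},
}

REVIEW_THRESHOLD = 60.0  # assumed cut-off; tune it for your data

# Join: attach metadata to each assignment (rows without metadata are kept as-is).
joined = [{**row, **metadata.get(row["doc_id"], {})} for row in assignments]

# Filter: low-confidence assignments to send for manual review.
needs_review = [r["doc_id"] for r in joined if r["confidence"] < REVIEW_THRESHOLD]

# Track and compare: theme counts per time period and per customer segment.
by_month = defaultdict(Counter)
by_segment = defaultdict(Counter)
for r in joined:
    by_month[r.get("month", "unknown")][r["theme"]] += 1
    by_segment[r.get("segment", "unknown")][r["theme"]] += 1

print("Needs review:", needs_review)
print("By month:", {k: dict(v) for k, v in by_month.items()})
print("By segment:", {k: dict(v) for k, v in by_segment.items()})
```

Joining on the document ID only works if the ID still maps to a row in your original data, so keep the original row order (or an explicit key column) when you prepare the metadata.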
Sample Comments by Theme
Most representative documents for each theme (highest confidence scores).
Understanding Sample Comments & Confidence Scores
What Are Sample Comments?
These are the most representative documents for each theme: the ones the algorithm is most confident about assigning to that theme. They serve as concrete examples of what each theme "looks like" in your data.
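One way to reproduce this kind of selection from the Document Assignments data is to group documents by assigned theme and keep the top few by confidence. This is a sketch under assumptions: the `top_samples` helper, the field names, and the sample rows are hypothetical, and the tool's own selection logic may differ.

```python
from collections import defaultdict

# Illustrative rows; in practice these would come from the assignments export.
documents = [
    {"doc_id": 1, "theme": 1, "confidence": 92.0},
    {"doc_id": 2, "theme": 1, "confidence": 61.0},
    {"doc_id": 3, "theme": 2, "confidence": 88.0},
    {"doc_id": 4, "theme": 1, "confidence": 97.5},
    {"doc_id": 5, "theme": 2, "confidence": 54.0},
]


def top_samples(docs, n=2):
    """Return the n highest-confidence documents for each theme."""
    grouped = defaultdict(list)
    for doc in docs:
        grouped[doc["theme"]].append(doc)
    return {
        theme: sorted(members, key=lambda d: d["confidence"], reverse=True)[:n]
        for theme, members in sorted(grouped.items())
    }


samples = top_samples(documents, n=2)
for theme, members in samples.items():
    print(theme, [(d["doc_id"], d["confidence"]) for d in members])
```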
Understanding Confidence Scores
The confidence percentage reflects how strongly a document matches its assigned theme:
💡 How to Use Samples
⚠️ When Samples Don't Match Expectations
If sample comments seem wrong for a theme: