Use this skill when asked to explore, analyse, or visualise any dataset, CSV, table, or collection of numbers.
Always follow this sequence:
Never skip steps. Never jump to "Key findings" without the groundwork.
## Dataset Overview
Rows, columns, source, date range if applicable.

## Data Quality
Missing values per column, duplicates found, anomalies flagged.

## Descriptive Statistics
Table of numeric column stats. Categorical column value counts (top 5).

## Key Patterns
Bullet points - one insight per bullet, quantified.

- Bad: "Sales seem higher in Q4"
- Good: "Q4 sales average 34% higher than Q1-Q3 combined (mean: $2.1M vs $1.57M)"

## Recommendations
What to investigate further, or what action the data supports.
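
The Data Quality and Key Patterns sections above can be illustrated with a minimal sketch. It uses only the Python standard library and an invented in-memory sample; `SAMPLE_CSV`, `load_rows`, `missing_per_column`, `duplicate_count`, and `percent_higher` are hypothetical names for illustration and are not taken from the scripts in scripts/.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data; a real run would read the user's file instead.
SAMPLE_CSV = """region,quarter,sales
North,Q1,100
North,Q4,150
South,Q1,
South,Q4,130
North,Q1,100
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def missing_per_column(rows):
    """Count empty or whitespace-only cells in each column."""
    counts = {column: 0 for column in rows[0]}
    for row in rows:
        for column in counts:
            value = row.get(column)
            if value is None or value.strip() == "":
                counts[column] += 1
    return counts


def duplicate_count(rows):
    """Count rows that exactly repeat an earlier row."""
    seen = Counter(tuple(row.items()) for row in rows)
    return sum(n - 1 for n in seen.values())


def percent_higher(a, b):
    """Return how much higher a is than b, as a percentage of b."""
    return (a - b) / b * 100


rows = load_rows(SAMPLE_CSV)
print("rows:", len(rows))
print("missing per column:", missing_per_column(rows))
print("duplicate rows:", duplicate_count(rows))
# Reproduces the arithmetic behind the "Good" example: $2.1M vs $1.57M.
print(f"Q4 vs Q1-Q3: {percent_higher(2.1, 1.57):.0f}% higher")
```

Stating the two means alongside the percentage, as the "Good" example does, lets a reader check the claim: (2.1 - 1.57) / 1.57 is roughly 0.34.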
Pre-built analysis scripts live in scripts/. Read them before writing any analysis code, and do not write raw matplotlib from scratch.
| Script | Purpose |
|---|---|
| scripts/profile.py | Full dataset profile: types, nulls, stats, top values |
| scripts/correlations.py | Pearson + Spearman correlation matrix with heatmap |
| scripts/time_series.py | Date-aware trend analysis, resampling, rolling averages |
Usage: copy the relevant script, adapt the INPUT_FILE and column name variables at the top, then run.
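
The contents of the scripts are not reproduced here, so read the real files first. As a rough sketch of the copy-and-adapt pattern only, a script with its settings at the top might look like the following. Only `INPUT_FILE` is named above; `VALUE_COLUMN`, `WINDOW`, `read_column`, and `rolling_mean` are hypothetical names, and the actual variables in scripts/ may differ.

```python
import csv
from pathlib import Path
from statistics import mean

# --- Settings to adapt (hypothetical names, except INPUT_FILE) ---
INPUT_FILE = "data.csv"   # path to the dataset to analyse
VALUE_COLUMN = "sales"    # numeric column to analyse
WINDOW = 3                # rolling-average window, in rows


def read_column(path, column):
    """Read one numeric column from a CSV file, skipping empty cells."""
    values = []
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            cell = (row.get(column) or "").strip()
            if cell:
                values.append(float(cell))
    return values


def rolling_mean(values, window):
    """Return the mean of each full window of consecutive values."""
    return [mean(values[i - window:i]) for i in range(window, len(values) + 1)]


if Path(INPUT_FILE).exists():
    print(rolling_mean(read_column(INPUT_FILE, VALUE_COLUMN), WINDOW))
else:
    print(f"{INPUT_FILE} not found; edit INPUT_FILE at the top of the script.")
```

Keeping every adjustable value in one block at the top means a copied script can be pointed at a new dataset without touching the analysis logic.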