Merge cowork/majorair/wiki-updates-apr25 — 3 new articles + nav updates
commit
5d7ce294b6
10 changed files with 1038 additions and 329 deletions
|
|
@ -1,203 +0,0 @@
|
|||
---
|
||||
title: WSL2 Fedora 43 Training Environment Rebuild
|
||||
domain: linux
|
||||
category: distro-specific
|
||||
tags:
|
||||
- wsl2
|
||||
- fedora
|
||||
- unsloth
|
||||
- pytorch
|
||||
- cuda
|
||||
- majorrig
|
||||
- majortwin
|
||||
status: published
|
||||
created: 2026-03-16
|
||||
updated: 2026-04-29T22:45
|
||||
---
|
||||
|
||||
# WSL2 Fedora 43 Training Environment Rebuild
|
||||
|
||||
How to rebuild the MajorTwin training environment from scratch on MajorRig after a WSL2 loss. Covers Fedora 43 install, Python 3.11 via pyenv, PyTorch with CUDA, Unsloth, and llama.cpp for GGUF conversion.
|
||||
|
||||
## The Short Answer
|
||||
|
||||
```bash
|
||||
# 1. Install Fedora 43 and move to D:
|
||||
wsl --install -d FedoraLinux-43 --no-launch
|
||||
wsl --export FedoraLinux-43 D:\WSL\fedora43.tar
|
||||
wsl --unregister FedoraLinux-43
|
||||
wsl --import FedoraLinux-43 D:\WSL\Fedora43 D:\WSL\fedora43.tar
|
||||
|
||||
# 2. Set default user
|
||||
echo -e "[boot]\nsystemd=true\n[user]\ndefault=majorlinux" | sudo tee /etc/wsl.conf
|
||||
useradd -m -G wheel majorlinux && passwd majorlinux
|
||||
echo "%wheel ALL=(ALL) ALL" | sudo tee /etc/sudoers.d/wheel
|
||||
|
||||
# 3. Install Python 3.11 via pyenv, PyTorch, Unsloth
|
||||
# See full steps below
|
||||
```
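The `/etc/wsl.conf` one-liner in step 2 is easier to review when a small function renders it. This is a minimal sketch, not part of the original rebuild: the function name `render_wsl_conf` is hypothetical, and it only prints the file content so it can be inspected before being piped to `sudo tee /etc/wsl.conf`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a wsl.conf that enables systemd and sets the
# default login user. It writes nothing to disk on its own.
render_wsl_conf() {
    local user="$1"
    printf '[boot]\nsystemd=true\n\n[user]\ndefault=%s\n' "$user"
}

render_wsl_conf majorlinux
```

Usage, once the output looks right: `render_wsl_conf majorlinux | sudo tee /etc/wsl.conf`.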
|
||||
|
||||
## Step 1 — System Packages
|
||||
|
||||
```bash
|
||||
sudo dnf update -y
|
||||
sudo dnf install -y git curl wget tmux screen htop rsync unzip \
|
||||
python3 python3-pip python3-devel gcc gcc-c++ make cmake \
|
||||
ninja-build pkg-config openssl-devel libffi-devel \
|
||||
gawk patch readline-devel sqlite-devel
|
||||
```
|
||||
|
||||
## Step 2 — Python 3.11 via pyenv
|
||||
|
||||
Fedora 43 ships Python 3.14; this Unsloth setup targets 3.11. Use pyenv to install it:
|
||||
|
||||
```bash
|
||||
curl https://pyenv.run | bash
|
||||
|
||||
# Add to ~/.bashrc
|
||||
export PYENV_ROOT="$HOME/.pyenv"
|
||||
[[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"
|
||||
eval "$(pyenv init - bash)"
|
||||
|
||||
source ~/.bashrc
|
||||
pyenv install 3.11.9
|
||||
pyenv global 3.11.9
|
||||
```
|
||||
|
||||
The tkinter warning during install is harmless — it's not needed for training.
|
||||
|
||||
## Step 3 — Training Virtualenv + PyTorch
|
||||
|
||||
```bash
|
||||
mkdir -p ~/majortwin/{staging,datasets,outputs,scripts}
|
||||
python -m venv ~/majortwin/venv
|
||||
source ~/majortwin/venv/bin/activate
|
||||
|
||||
pip install --upgrade pip
|
||||
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
|
||||
|
||||
# Verify GPU
|
||||
python -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))"
|
||||
```
|
||||
|
||||
Expected output: `True NVIDIA GeForce RTX 3080 Ti`
|
||||
|
||||
## Step 4 — Unsloth + Training Stack
|
||||
|
||||
```bash
|
||||
source ~/majortwin/venv/bin/activate
|
||||
|
||||
pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
|
||||
pip install transformers datasets accelerate peft trl bitsandbytes \
|
||||
sentencepiece protobuf scipy einops
|
||||
|
||||
# Pin transformers for unsloth-zoo compatibility
|
||||
pip install "transformers<=5.2.0"
|
||||
|
||||
# Verify
|
||||
python -c "import unsloth; print('Unsloth OK')"
|
||||
```
|
||||
|
||||
> [!warning] Never run `pip install -r requirements.txt` from inside llama.cpp while the training venv is active. It installs CPU-only PyTorch and downgrades transformers, breaking the CUDA setup.
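One quick way to notice the CPU-only downgrade described in the warning is to look at the installed torch version string. The sketch below assumes the wheel came from download.pytorch.org, where builds normally carry a local version suffix such as `+cu128` or `+cpu`; the function name `torch_build_kind` and the sample version numbers are illustrative only.

```bash
#!/usr/bin/env bash
# Hypothetical check: classify a torch.__version__ string by its suffix.
# Assumption: wheels from download.pytorch.org use "+cuXYZ" or "+cpu".
torch_build_kind() {
    case "$1" in
        *+cu*) echo "cuda" ;;
        *+cpu) echo "cpu" ;;
        *)     echo "unknown" ;;
    esac
}

torch_build_kind "2.7.0+cu128"   # prints: cuda
torch_build_kind "2.7.0+cpu"     # prints: cpu
```

With the venv active, it can be fed the live value: `torch_build_kind "$(python -c 'import torch; print(torch.__version__)')"`. An answer of `cpu` suggests the CUDA install needs to be repeated.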
|
||||
|
||||
## Step 5 — llama.cpp (CPU-only for GGUF conversion)
|
||||
|
||||
CUDA 12.8 is incompatible with Fedora 43's glibc for compiling llama.cpp (math function conflicts in `/usr/include/bits/mathcalls.h`). Build CPU-only — it's sufficient for GGUF conversion, which doesn't need GPU:
|
||||
|
||||
```bash
|
||||
# Install GCC 14 (CUDA 12.8 doesn't support GCC 15 which Fedora 43 ships)
|
||||
sudo dnf install -y gcc14 gcc14-c++
|
||||
|
||||
cd ~/majortwin
|
||||
git clone https://github.com/ggerganov/llama.cpp.git
|
||||
cd llama.cpp
|
||||
|
||||
cmake -B build \
|
||||
-DGGML_CUDA=OFF \
|
||||
-DCMAKE_BUILD_TYPE=Release \
|
||||
-DCMAKE_C_COMPILER=/usr/bin/gcc-14 \
|
||||
-DCMAKE_CXX_COMPILER=/usr/bin/g++-14
|
||||
|
||||
cmake --build build --config Release -j$(nproc) 2>&1 | tee /tmp/llama_build.log &
|
||||
tail -f /tmp/llama_build.log
|
||||
```
|
||||
|
||||
Verify:
|
||||
```bash
|
||||
ls ~/majortwin/llama.cpp/build/bin/llama-quantize && echo "OK"
|
||||
ls ~/majortwin/llama.cpp/build/bin/llama-cli && echo "OK"
|
||||
```
|
||||
|
||||
## Step 6 — Shell Environment
|
||||
|
||||
```bash
|
||||
cat >> ~/.bashrc << 'EOF'
|
||||
# MajorInfrastructure Paths
|
||||
export VAULT="/mnt/c/Users/majli/Documents/MajorVault"
|
||||
export MAJORANSIBLE="/mnt/d/MajorAnsible"
|
||||
export MAJORTWIN_D="/mnt/d/MajorTwin"
|
||||
export MAJORTWIN_WSL="$HOME/majortwin"
|
||||
export LLAMA_CPP="$HOME/majortwin/llama.cpp"
|
||||
|
||||
# Venv
|
||||
alias mtwin='source $MAJORTWIN_WSL/venv/bin/activate && cd $MAJORTWIN_WSL'
|
||||
alias vault='cd $VAULT'
|
||||
alias ll='ls -lah --color=auto'
|
||||
|
||||
# SSH Fleet Aliases
|
||||
alias majorhome='ssh majorlinux@100.120.209.106'
|
||||
alias dca='ssh root@100.104.11.146'
|
||||
alias majortoot='ssh root@100.110.197.17'
|
||||
alias majorlinuxvm='ssh root@100.87.200.5'
|
||||
alias majordiscord='ssh root@100.122.240.83'
|
||||
alias majorlab='ssh root@100.86.14.126'
|
||||
alias majormail='ssh root@100.84.165.52'
|
||||
alias teelia='ssh root@100.120.32.69'
|
||||
alias tttpod='ssh root@100.84.42.102'
|
||||
alias majorrig='ssh majorlinux@100.98.47.29' # port 2222 retired 2026-03-25, fleet uses port 22
|
||||
|
||||
# DNF5
|
||||
alias update='sudo dnf upgrade --refresh'
|
||||
alias install='sudo dnf install'
|
||||
alias clean='sudo dnf clean all'
|
||||
|
||||
# MajorTwin helpers
|
||||
stage_dataset() {
|
||||
cp "$VAULT/20-Projects/MajorTwin/03-Datasets/$1" "$MAJORTWIN_WSL/datasets/"
|
||||
echo "Staged: $1"
|
||||
}
|
||||
export_gguf() {
|
||||
cp "$MAJORTWIN_WSL/outputs/$1" "$MAJORTWIN_D/models/"
|
||||
echo "Exported: $1 → $MAJORTWIN_D/models/"
|
||||
}
|
||||
EOF
|
||||
|
||||
source ~/.bashrc
|
||||
```
|
||||
|
||||
## Key Rules
|
||||
|
||||
- **Always activate venv before pip installs:** `source ~/majortwin/venv/bin/activate`
|
||||
- **Never train from /mnt/c or /mnt/d** — stage files in `~/majortwin/staging/` first
|
||||
- **Never put ML artifacts inside MajorVault** — models, venvs, artifacts go on D: drive
|
||||
- **Max viable training model:** 7B at QLoRA 4-bit (RTX 3080 Ti, 12GB VRAM)
|
||||
- **Current base model:** Qwen2.5-7B-Instruct (ChatML format — stop token: `<|im_end|>` only)
|
||||
- **Transformers must be pinned:** `pip install "transformers<=5.2.0"` for unsloth-zoo compatibility
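The "never train from /mnt/c or /mnt/d" rule can be enforced at the top of a training script. This is a minimal sketch under two assumptions: WSL mounts Windows drives at `/mnt/<letter>` (the default), and paths are passed as absolute paths. The function name `is_wsl_native_path` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical guard: succeed only if the path is outside the Windows mounts.
# Assumption: drives are automounted at /mnt/<single letter>.
is_wsl_native_path() {
    case "$1" in
        /mnt/[a-z]|/mnt/[a-z]/*) return 1 ;;
        *) return 0 ;;
    esac
}

for p in "$HOME/majortwin/staging/run1" /mnt/d/MajorTwin/datasets; do
    if is_wsl_native_path "$p"; then
        echo "ok:    $p"
    else
        echo "stage: $p (copy into ~/majortwin/staging/ first)"
    fi
done
```

Relative paths are not resolved here; a script that accepts them would need to expand them with `realpath` before calling the guard.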
|
||||
|
||||
## D: Drive Layout
|
||||
|
||||
```
|
||||
D:\MajorTwin\
|
||||
models\ ← finished GGUFs
|
||||
datasets\ ← dataset archives
|
||||
artifacts\ ← training run artifacts
|
||||
training-runs\ ← logs, checkpoints
|
||||
D:\WSL\
|
||||
Fedora43\ ← WSL2 VHDX
|
||||
Backups\ ← weekly WSL2 backup tars
|
||||
```
|
||||
|
||||
## See Also
|
||||
|
||||
- [WSL2 Instance Migration](wsl2-instance-migration-fedora43.md)
|
||||
- [WSL2 Backup via PowerShell](wsl2-backup-powershell.md)
|
||||
180
02-selfhosting/dns-networking/pihole-doh-dot-bypass-defense.md
Normal file
|
|
@ -0,0 +1,180 @@
|
|||
---
|
||||
title: Pi-hole DoH / DoT Bypass Defense
|
||||
domain: selfhosting
|
||||
category: dns-networking
|
||||
tags:
|
||||
- pihole
|
||||
- dns
|
||||
- doh
|
||||
- dot
|
||||
- privacy
|
||||
- adblock
|
||||
- bypass
|
||||
- hagezi
|
||||
status: published
|
||||
created: 2026-04-22
|
||||
updated: 2026-04-23T09:09
|
||||
---
|
||||
|
||||
# Pi-hole DoH / DoT Bypass Defense
|
||||
|
||||
## The Problem
|
||||
|
||||
A LAN-wide ad/tracker/threat-intel blocklist at the DNS layer is only effective if clients actually use the DNS server doing the blocking. Three classes of client routinely bypass LAN DNS:
|
||||
|
||||
1. **Modern browsers with built-in DNS-over-HTTPS (DoH).** Chrome, Firefox, Safari, Edge all ship with DoH either on by default or a one-toggle opt-in. When enabled, the browser sends DNS queries over HTTPS directly to Cloudflare / Google / Quad9 / NextDNS, bypassing the OS resolver and every DNS-layer blocklist on the network.
|
||||
2. **IoT / smart devices with hardcoded public DNS.** Chromecast, Google Home, Nest, many Samsung TVs, some Amazon devices include hardcoded `8.8.8.8` or `1.1.1.1`. They ignore DHCP-pushed DNS entirely.
|
||||
3. **Applications using DNS-over-TLS (DoT).** Rarer than DoH but used by some privacy-focused apps and occasional malware C2 — hits Cloudflare / Quad9 on port 853 instead of 53.
|
||||
|
||||
Without defense, a compromised IoT device or a telemetry-hungry app can resolve names through outside resolvers freely even though Pi-hole is "running."
|
||||
|
||||
## What This Guide Covers
|
||||
|
||||
- How Pi-hole's `blocking.mode = NULL` structurally prevents the most common fallback-resolver bypass.
|
||||
- Why the `HaGeZi doh-vpn-proxy-bypass` adlist is the single highest-leverage defense against browser DoH.
|
||||
- What still leaks and how to assess whether the router-level firewall is worth the effort for your threat model.
|
||||
|
||||
## Pi-hole's block mode matters
|
||||
|
||||
Pi-hole v6's default `dns.blocking.mode` is `NULL`. A blocked domain resolves to `0.0.0.0` — a **valid** DNS answer, not an NXDOMAIN. Verify on your host:
|
||||
|
||||
```bash
|
||||
dig +short <blocked-domain> @<pihole-ip>
|
||||
# → 0.0.0.0
|
||||
```
|
||||
|
||||
Why this matters: multi-resolver OSes (macOS, iOS, Windows) only consult fallback resolvers on a **failure** (timeout, SERVFAIL). A valid NULL answer short-circuits that — the client accepts the 0.0.0.0, tries to connect, fails at TCP, and never retries DNS. Even if `/etc/resolv.conf` has `1.1.1.1` as a secondary, it's never queried.
|
||||
|
||||
If you've set blocking mode to `NXDOMAIN`, clients **will** fall back — and every telemetry domain on every adlist becomes bypassable through whatever secondary resolver the OS is configured with. **Leave it at NULL.**
|
||||
|
||||
Check:
|
||||
```bash
|
||||
pihole-FTL --config dns.blocking.mode
|
||||
# → NULL
|
||||
```
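When probing many domains, it helps to turn the raw `dig +short` answer into a label. The sketch below is illustrative: the function name `classify_answer` is hypothetical, and it assumes NULL blocking mode, where Pi-hole answers `0.0.0.0` for A queries and `::` for AAAA queries.

```bash
#!/usr/bin/env bash
# Hypothetical helper: classify the first line of `dig +short` output.
classify_answer() {
    case "$1" in
        "")         echo "no-answer" ;;     # NXDOMAIN, empty answer, or timeout
        0.0.0.0|::) echo "blocked-null" ;;  # NULL-mode block
        *)          echo "resolved" ;;      # a real address came back
    esac
}

classify_answer "0.0.0.0"       # prints: blocked-null
classify_answer ""              # prints: no-answer
classify_answer "203.0.113.10"  # prints: resolved
```

Against a live Pi-hole: `classify_answer "$(dig +short <blocked-domain> @<pihole-ip> | head -n 1)"`. A `no-answer` result for a domain that should be blocked is a hint that the blocking mode is not NULL.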
|
||||
|
||||
## HaGeZi DoH/VPN/Proxy Bypass — the biggest single win
|
||||
|
||||
HaGeZi maintains `adblock/doh-vpn-proxy-bypass.txt` — ~18,000 DoH resolver hostnames, including the bootstrap domains used by every major browser:
|
||||
|
||||
| Browser | DoH bootstrap |
|
||||
|---|---|
|
||||
| Firefox | `mozilla.cloudflare-dns.com` |
|
||||
| Chrome | `chrome.cloudflare-dns.com`, `dns.google` |
|
||||
| Safari (iCloud Private Relay bootstrap) | Apple-specific, *not* in this list — Apple uses QUIC |
|
||||
| Edge | `dns.google`, other public resolvers |
|
||||
|
||||
When the bootstrap hostname can't be resolved (Pi-hole answers `0.0.0.0`), the browser's DoH setup fails and it falls back to the system resolver — which is Pi-hole. This flips the default behavior from "browsers can bypass" to "browsers respect LAN DNS."
|
||||
|
||||
### Adding it
|
||||
|
||||
```bash
|
||||
NOW=$(date +%s)
|
||||
sudo pihole-FTL sqlite3 /etc/pihole/gravity.db <<SQL
|
||||
INSERT INTO adlist (address, enabled, comment, date_added, date_modified, type)
|
||||
VALUES
|
||||
('https://cdn.jsdelivr.net/gh/hagezi/dns-blocklists@latest/adblock/doh-vpn-proxy-bypass.txt',
|
||||
1, 'HaGeZi DoH/VPN/Proxy bypass', $NOW, $NOW, 0);
|
||||
SQL
|
||||
sudo pihole -g
|
||||
```
|
||||
|
||||
See [[pihole-v6-adlist-management]] for general adlist CRUD via SQL.
|
||||
|
||||
### Verification
|
||||
|
||||
After `pihole -g` completes, probe major DoH hostnames:
|
||||
|
||||
```bash
|
||||
for h in mozilla.cloudflare-dns.com dns.google chrome.cloudflare-dns.com dns.quad9.net; do
|
||||
echo -n "$h → "
|
||||
dig +short $h @<pihole-ip>
|
||||
done
|
||||
# All should return 0.0.0.0
|
||||
```
|
||||
|
||||
### Known false positives
|
||||
|
||||
The list is aggressive. Expect occasional pushback:
|
||||
|
||||
- **`claude.ai`** — gets caught by the broader `pro.txt` or TIF list in some combinations; DoH bypass list itself is usually clean. If you use Claude on LAN and see blocks, allowlist `claude.ai` — note that `api.anthropic.com` is typically **not** on any of these lists, so Claude Code / API traffic is unaffected.
|
||||
- **Zscaler ZPA / Zscaler Internet Access** — **this will break work-from-home auth if you don't allowlist it.** The DoH/VPN bypass list classifies Zscaler's ZTNA backbone as a "VPN proxy" and blocks it. Symptom: users see a blank / failed page at `https://samlsp.private.zscaler.com/...` during SAML sign-in, and the Zscaler Client Connector fails to authenticate.
|
||||
|
||||
The critical piece is that Zscaler's SAML SP hostname is a **CNAME chain**:
|
||||
|
||||
```
|
||||
samlsp.private.zscaler.com. CNAME samlsp.prod.zpath.net.
|
||||
samlsp.prod.zpath.net. CNAME zapp2saml.gslb.prod.zpath.net.
|
||||
zapp2saml.gslb.prod.zpath.net. CNAME snico2br.gslb.prod.zpath.net.
|
||||
snico2br.gslb.prod.zpath.net. A <IP>
|
||||
```
|
||||
|
||||
Pi-hole walks the CNAME chain and blocks on the target (status 9 = `blocked_gravity_cname`), so **an exact-hostname allowlist for `samlsp.private.zscaler.com` will NOT fix it** — you have to allowlist the CNAME target domain. The GSLB subdomains rotate, so use a regex allowlist for the whole `zpath.net` zone:
|
||||
|
||||
```sql
|
||||
INSERT OR IGNORE INTO domainlist (type, domain, enabled, comment)
|
||||
VALUES (2, '(\.|^)zpath\.net$', 1, 'Zscaler ZPA CNAME backbone — do not block');
|
||||
```
|
||||
|
||||
Don't forget `pihole reloaddns` after. Expect to also need regex allowlists for `zscaler.net`, `zscalertwo.net`, `zscalerthree.net`, `zscalerone.net`, `zscloud.net` if any are gravity-blocked — HaGeZi's lists may cover different combinations over time.
|
||||
- **iCloud Private Relay** — if you want iCPR to keep working on your Apple devices, allowlist its mask ingresses. The DoH/VPN bypass list blocks `mask.icloud.com`, `mask-h2.icloud.com`, and `mask-api.icloud.com` (Apple's iCPR entrance points). Without them, iCPR silently falls back to standard DNS — which means **Pi-hole is covering the bypass whether you want it to or not**. For hosts where iCPR is desired:
|
||||
|
||||
```sql
|
||||
INSERT OR IGNORE INTO domainlist (type, domain, enabled, comment)
|
||||
VALUES (2, '(\.|^)mask[a-z0-9-]*\.icloud\.com$', 1, 'iCloud Private Relay ingress');
|
||||
```
|
||||
|
||||
Keep this surgical — do **not** allowlist all of `icloud.com`. Other subdomains (`metrics.icloud.com`, `init.gc.apple.com` family) are Apple telemetry that the adlists correctly block. After allowlist + `pihole reloaddns`, toggle Wi-Fi or flip iCPR off/on in Settings on each Apple device to force DNS re-resolution — iOS/macOS caches DNS aggressively and won't pick up the change otherwise.
|
||||
- **`dot.txt` companion adlist** — as of April 2026, HaGeZi's separate `adblock/dot.txt` URL returns 403. DoT resolver hostnames are folded into `doh-vpn-proxy-bypass.txt` already.
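The two regex allowlist entries above (Zscaler's `zpath.net` zone and the iCloud Private Relay ingresses) can be sanity-checked before they go into `domainlist`. This sketch uses bash's `=~` operator, which takes POSIX extended regular expressions; treating that as equivalent to FTL's regex engine is an assumption that should hold for patterns this simple. The `matches` helper is hypothetical.

```bash
#!/usr/bin/env bash
# Patterns copied from the allowlist entries above.
zpath_re='(\.|^)zpath\.net$'
icpr_re='(\.|^)mask[a-z0-9-]*\.icloud\.com$'

# Hypothetical helper: report whether a hostname matches a pattern.
matches() {
    local host="$1" re="$2"
    if [[ "$host" =~ $re ]]; then echo "match"; else echo "no match"; fi
}

matches samlsp.prod.zpath.net        "$zpath_re"   # prints: match
matches snico2br.gslb.prod.zpath.net "$zpath_re"   # prints: match
matches notzpath.net                 "$zpath_re"   # prints: no match
matches mask-h2.icloud.com           "$icpr_re"    # prints: match
matches metrics.icloud.com           "$icpr_re"    # prints: no match
```

The last line is the important one: the iCPR pattern stays surgical and does not open up the telemetry subdomains.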
|
||||
|
||||
## What still leaks
|
||||
|
||||
The DoH adlist does not defend against:
|
||||
|
||||
1. **IoT devices with hardcoded public DNS.** Chromecast et al. send UDP/53 queries directly to `8.8.8.8`. Pi-hole never sees them.
|
||||
2. **Apps that hardcode a DoH or DoT endpoint by IP.** If an app has `1.1.1.1` baked in rather than `cloudflare-dns.com`, the hostname block can't help.
|
||||
3. **Apple iCloud Private Relay.** Uses QUIC (UDP/443) to Cloudflare with oblivious DNS. Safari + Apple services route around Pi-hole entirely. Acceptable tradeoff for most users; mostly a privacy win even if it weakens your LAN-side visibility.
|
||||
|
||||
Estimated residual gap after the DoH adlist: **~3%** of tracker/telemetry traffic, mostly from hardcoded-DNS IoT.
|
||||
|
||||
## Router-level enforcement (optional, higher effort)
|
||||
|
||||
To close the remaining 3%, redirect outbound `udp/53` and `tcp/53` to the Pi-hole and reject outbound port `853` at the router, for every source except the Pi-hole's own IP:
|
||||
|
||||
```bash
|
||||
# Transparently redirect all LAN :53 traffic to Pi-hole, except Pi-hole itself
|
||||
iptables -t nat -I PREROUTING -i br0 -p udp --dport 53 ! -s <pihole-ip> -j DNAT --to <pihole-ip>:53
|
||||
iptables -t nat -I PREROUTING -i br0 -p tcp --dport 53 ! -s <pihole-ip> -j DNAT --to <pihole-ip>:53
|
||||
|
||||
# Reject DoT so apps fall back to classic DNS (→ Pi-hole via above)
|
||||
iptables -I FORWARD -i br0 -p tcp --dport 853 ! -s <pihole-ip> -j REJECT --reject-with tcp-reset
|
||||
iptables -I FORWARD -i br0 -p udp --dport 853 ! -s <pihole-ip> -j REJECT
|
||||
```
|
||||
|
||||
Design choices:
|
||||
- **REDIRECT (DNAT), not DROP, for port 53** — devices with hardcoded `8.8.8.8` receive transparent answers from Pi-hole instead of silently breaking.
|
||||
- **REJECT, not DROP, for port 853** — DoT clients see a fast error and fall back to classic DNS immediately instead of timing out.
|
||||
- **Exempt the Pi-hole** — it needs to reach upstream resolvers (`1.1.1.1` etc.) unimpeded.
|
||||
- **`-i br0` only** — LAN ingress, not WAN.
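Because the same rules are needed with a different Pi-hole address or LAN interface on each router, a generator that prints them for review can be handy. This is a sketch only: the function name `dns_enforcement_rules` is hypothetical, `192.0.2.53` is a documentation address standing in for `<pihole-ip>`, and nothing is executed against the firewall.

```bash
#!/usr/bin/env bash
# Hypothetical generator: print (do not run) the four rules shown above
# for a given Pi-hole address and LAN interface (default: br0).
dns_enforcement_rules() {
    local pihole="$1" lan_if="${2:-br0}" proto
    for proto in udp tcp; do
        echo "iptables -t nat -I PREROUTING -i $lan_if -p $proto --dport 53 ! -s $pihole -j DNAT --to $pihole:53"
    done
    echo "iptables -I FORWARD -i $lan_if -p tcp --dport 853 ! -s $pihole -j REJECT --reject-with tcp-reset"
    echo "iptables -I FORWARD -i $lan_if -p udp --dport 853 ! -s $pihole -j REJECT"
}

dns_enforcement_rules 192.0.2.53
```

After reviewing the output, it can be appended to the firmware's firewall script or piped to `sh` on the router.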
|
||||
|
||||
### Persistence depends on router firmware
|
||||
|
||||
- **Asuswrt-Merlin:** add rules to `/jffs/scripts/firewall-start` — runs on every firewall init.
|
||||
- **Stock AsusWRT 388+:** `/jffs/scripts/firewall-start` is **not** honored. Rules added live persist until the next `restart_firewall` event (reboot, WAN flap, GUI config change). Workarounds: flash to Merlin, use the GUI's "LAN ▸ Network Services Filter" (DROP-only, less flexible), or schedule a cron re-apply in `/jffs/configs/crontab`.
|
||||
- **OpenWrt / pfSense / OPNsense:** their respective firewall config persistence works out of the box.
|
||||
|
||||
## Summary — minimum viable DoH defense
|
||||
|
||||
1. Pi-hole block mode = `NULL` (default — verify).
|
||||
2. Install HaGeZi `doh-vpn-proxy-bypass` adlist.
|
||||
3. Run `pihole -g`.
|
||||
4. Verify major DoH bootstraps return `0.0.0.0`.
|
||||
5. Optional: add router iptables rules to close the IoT/hardcoded-DNS gap.
|
||||
|
||||
Steps 1–4 give you ~97% effectiveness with zero client-side changes and no broken devices. Step 5 is polish for threat models where LAN-wide DNS visibility matters.
|
||||
|
||||
## Related
|
||||
|
||||
- [[MajorPi]] — local Pi-hole deployment
|
||||
- [[pihole-v6-adlist-management]] — adlist CRUD via SQL (v5 CLI commands don't work in v6)
|
||||
- [[Network Overview]] — fleet network context
|
||||
180
02-selfhosting/dns-networking/pihole-v6-adlist-management.md
Normal file
|
|
@ -0,0 +1,180 @@
|
|||
---
|
||||
title: "Pi-hole v6 Adlist Management via SQL"
|
||||
domain: selfhosting
|
||||
category: dns-networking
|
||||
tags: [pihole, pihole-v6, adlist, dns, sql, sqlite, gravity, runbook]
|
||||
status: published
|
||||
created: 2026-04-22
|
||||
updated: 2026-04-22
|
||||
---
|
||||
|
||||
# Pi-hole v6 Adlist Management via SQL
|
||||
|
||||
## The Problem
|
||||
|
||||
Pi-hole v6 removed the `pihole -a adlist` CLI subcommands. The old muscle-memory commands (`pihole -a adlist add <url>`, `pihole -a adlist remove <url>`, `pihole -a adlist list`) all return errors or are no-ops on v6. The Web UI works, but for scripting, Ansible, or SSH-only hosts, you need a CLI-level method.
|
||||
|
||||
The answer is to hit the `gravity.db` SQLite database directly. It's simple, idempotent, and scriptable.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- Pi-hole v6 installed (`pihole -v` → Core version v6.x).
|
||||
- `sudo` access — `gravity.db` is owned `pihole:pihole` mode 660.
|
||||
- `sqlite3` binary is **not** required. Pi-hole ships `pihole-FTL` with a built-in `sqlite3` subcommand that you can use instead:
|
||||
```bash
|
||||
sudo pihole-FTL sqlite3 /etc/pihole/gravity.db "SELECT 1;"
|
||||
```
|
||||
Use this on any host where you don't want to install the standalone `sqlite3` package (e.g., Raspberry Pi OS minimal).
|
||||
|
||||
## Listing adlists
|
||||
|
||||
```bash
|
||||
sudo pihole-FTL sqlite3 -column -header /etc/pihole/gravity.db \
|
||||
"SELECT id, enabled, address, comment FROM adlist ORDER BY id;"
|
||||
```
|
||||
|
||||
| Column | Meaning |
|
||||
|---|---|
|
||||
| `id` | Internal ID (autoincrement, **does not match `queries.list_id`** — see note below) |
|
||||
| `enabled` | `1` = active, `0` = disabled (still in DB but not compiled into gravity) |
|
||||
| `address` | The URL fetched by `pihole -g` |
|
||||
| `comment` | Human-readable label shown in the Web UI |
|
||||
|
||||
## Adding an adlist
|
||||
|
||||
```bash
|
||||
NOW=$(date +%s)
|
||||
sudo pihole-FTL sqlite3 /etc/pihole/gravity.db <<SQL
|
||||
INSERT INTO adlist (address, enabled, comment, date_added, date_modified, type)
|
||||
VALUES
|
||||
('https://example.com/blocklist.txt', 1, 'My Blocklist', $NOW, $NOW, 0);
|
||||
SQL
|
||||
```
|
||||
|
||||
`type = 0` means a regular blocklist (as opposed to an allowlist). `date_added` and `date_modified` are unix seconds.
|
||||
|
||||
**Always follow with `pihole -g`** to fetch the list and rebuild the gravity blob:
|
||||
|
||||
```bash
|
||||
sudo pihole -g
|
||||
```
|
||||
|
||||
This takes 30s–3min depending on adlist size. Expect output like:
|
||||
```
|
||||
[✓] Parsed 0 exact domains and 18121 ABP-style domains (blocking, ignored 0 non-domain entries)
|
||||
[i] Number of gravity domains: 2669352 (2409506 unique domains)
|
||||
[✓] Building gravity tree
|
||||
```
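For Ansible or other repeatable runs, the insert is safer if re-running it does not fail on a URL that is already present. The sketch below builds the statement with `INSERT OR IGNORE` and escapes single quotes. It assumes the `adlist` table has a uniqueness constraint covering `address` (check your schema with `.schema adlist`); the function names `sql_quote` and `adlist_insert_sql` are hypothetical.

```bash
#!/usr/bin/env bash
# Escape single quotes for use inside a SQL string literal.
sql_quote() {
    local q="'"
    printf '%s' "${1//$q/$q$q}"
}

# Hypothetical builder: print an insert that is a no-op if the row exists.
# Args: URL, comment, optional unix timestamp (defaults to now).
adlist_insert_sql() {
    local url comment now="${3:-$(date +%s)}"
    url="$(sql_quote "$1")"
    comment="$(sql_quote "$2")"
    printf 'INSERT OR IGNORE INTO adlist (address, enabled, comment, date_added, date_modified, type)\n'
    printf "VALUES ('%s', 1, '%s', %s, %s, 0);\n" "$url" "$comment" "$now" "$now"
}

adlist_insert_sql 'https://example.com/blocklist.txt' "Test's list" 1776800000
```

The output can be piped straight in: `adlist_insert_sql "$URL" "$COMMENT" | sudo pihole-FTL sqlite3 /etc/pihole/gravity.db`, followed by `sudo pihole -g` as usual.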
|
||||
|
||||
## Removing an adlist
|
||||
|
||||
By address:
|
||||
```bash
|
||||
sudo pihole-FTL sqlite3 /etc/pihole/gravity.db \
|
||||
"DELETE FROM adlist WHERE address = 'https://example.com/blocklist.txt';"
|
||||
sudo pihole -g
|
||||
```
|
||||
|
||||
By id:
|
||||
```bash
|
||||
sudo pihole-FTL sqlite3 /etc/pihole/gravity.db \
|
||||
"DELETE FROM adlist WHERE id = 9;"
|
||||
sudo pihole -g
|
||||
```
|
||||
|
||||
## Enabling / disabling without removing
|
||||
|
||||
```bash
|
||||
sudo pihole-FTL sqlite3 /etc/pihole/gravity.db \
|
||||
"UPDATE adlist SET enabled=0 WHERE id=9;"
|
||||
sudo pihole -g
|
||||
```
|
||||
|
||||
This is the right move when you want to toggle an adlist on/off without losing the URL/comment (e.g., a situational blocklist like Disney+ streaming).
|
||||
|
||||
## Verifying a new adlist is actually blocking
|
||||
|
||||
After `pihole -g` finishes, probe a known domain from the list directly against Pi-hole:
|
||||
|
||||
```bash
|
||||
dig +short <known-blocked-domain> @192.168.50.238
|
||||
# Expected: 0.0.0.0 (when dns.blocking.mode = NULL)
|
||||
```
|
||||
|
||||
If you get a real answer, either the adlist fetch failed (check `pihole -g` output for 403/404), or the entry isn't in the list you added.
|
||||
|
||||
## Common gotchas
|
||||
|
||||
### `pihole -g` fails with "Forbidden"
|
||||
|
||||
The adlist URL returned HTTP 403 or 404. HaGeZi and OISD in particular reorganize file paths occasionally. Remove the broken entry and either substitute the new URL or drop it:
|
||||
|
||||
```bash
|
||||
sudo pihole-FTL sqlite3 /etc/pihole/gravity.db \
|
||||
"DELETE FROM adlist WHERE address = '<404-url>';"
|
||||
```
|
||||
|
||||
### `queries.list_id` doesn't match `adlist.id`
|
||||
|
||||
In Pi-hole v6's FTL query log, the `list_id` column on `queries`/`query_storage` does **not** reliably point back at the `adlist.id`. For `status=4` (regex), it references a `domainlist.id`. For `status=1` (gravity), it can reference a `gravity` table rowid, not the adlist. Do not assume a bidirectional mapping — treat `list_id` as an opaque debug hint.
|
||||
|
||||
### Stale regex after editing `domainlist`
|
||||
|
||||
FTL compiles regex rules into memory at process start and on explicit reload. Editing `domainlist` via SQL without calling `pihole reloaddns` afterwards leaves the old compiled regex active. Symptom: `queries.status=4` blocks firing for domains whose `list_id` points at deleted entries.
|
||||
|
||||
Fix: always follow `domainlist` edits with:
|
||||
```bash
|
||||
sudo pihole reloaddns
|
||||
```
|
||||
|
||||
Verify via the FTL log:
|
||||
```bash
|
||||
sudo grep "Compiled .* regex" /var/log/pihole/FTL.log | tail
|
||||
# → "Compiled N allow and M deny regex for X clients"
|
||||
```
|
||||
|
||||
The numbers should match the count of `enabled=1` entries in `domainlist` by `type`.
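To compare those numbers in a script rather than by eye, the counts can be pulled out of the log line. This sketch assumes the log line has exactly the shape quoted above; the function name `regex_counts` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical parser: print "<allow> <deny>" from an FTL log line of the
# form "Compiled N allow and M deny regex for X clients".
regex_counts() {
    local line="$1"
    local re='Compiled ([0-9]+) allow and ([0-9]+) deny regex'
    if [[ "$line" =~ $re ]]; then
        echo "${BASH_REMATCH[1]} ${BASH_REMATCH[2]}"
    else
        return 1
    fi
}

regex_counts "Compiled 3 allow and 12 deny regex for 27 clients"   # prints: 3 12
```

The database side of the comparison is a count of enabled regex rows, for example `SELECT type, COUNT(*) FROM domainlist WHERE enabled=1 AND type IN (2,3) GROUP BY type;`, where type `2` is the regex allowlist used elsewhere in this wiki and type `3` is taken here to be the regex denylist.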
|
||||
|
||||
### No standalone `sqlite3` on the host
|
||||
|
||||
Use `pihole-FTL sqlite3` — ships with every Pi-hole install, behaves identically to the standalone binary for the commands shown here. Do not install the `sqlite3` package just to manage Pi-hole.
|
||||
|
||||
## Useful introspection queries
|
||||
|
||||
**Total gravity domains by adlist:**
|
||||
```sql
|
||||
SELECT a.id, a.comment, COUNT(g.domain) AS domains
|
||||
FROM gravity g
|
||||
JOIN adlist a ON a.id = g.adlist_id
|
||||
GROUP BY a.id
|
||||
ORDER BY domains DESC;
|
||||
```
|
||||
|
||||
**Active regex rules (what FTL SHOULD be running):**
|
||||
```sql
|
||||
SELECT * FROM vw_regex_denylist;
|
||||
SELECT * FROM vw_regex_allowlist;
|
||||
```
|
||||
|
||||
**Blocked queries in the last hour by adlist source:**
|
||||
```sql
|
||||
SELECT
|
||||
CASE status
|
||||
WHEN 1 THEN 'gravity'
|
||||
WHEN 4 THEN 'regex_deny'
|
||||
WHEN 5 THEN 'exact_deny'
|
||||
WHEN 9 THEN 'gravity_cname'
|
||||
WHEN 10 THEN 'regex_cname'
|
||||
WHEN 11 THEN 'exact_cname'
|
||||
END AS source,
|
||||
COUNT(*) AS n
|
||||
FROM queries
|
||||
WHERE timestamp > strftime('%s','now','-1 hour')
|
||||
AND status IN (1,4,5,9,10,11)
|
||||
GROUP BY status;
|
||||
```
|
||||
|
||||
## Related
|
||||
|
||||
- [[MajorPi]] — the host running this
|
||||
- [[pihole-doh-dot-bypass-defense]] — DoH/DoT bypass defense (reasons to add specific adlists)
|
||||
143
02-selfhosting/services/mastodon-db-maintenance.md
Normal file
|
|
@ -0,0 +1,143 @@
|
|||
---
|
||||
title: Mastodon DB Maintenance — Statuses, Accounts, and VACUUM
|
||||
domain: selfhosting
|
||||
category: services
|
||||
tags:
|
||||
- mastodon
|
||||
- database
|
||||
- postgresql
|
||||
- maintenance
|
||||
- tootctl
|
||||
- majortoot
|
||||
status: published
|
||||
created: 2026-04-22
|
||||
updated: 2026-04-22
|
||||
---
|
||||
|
||||
# Mastodon DB Maintenance
|
||||
|
||||
Mastodon aggressively caches remote content — avatars, statuses, follow graphs — from every instance it federates with. On an active instance, this causes substantial PostgreSQL bloat over time. Without periodic maintenance, the database grows unbounded even if S3 handles media.
|
||||
|
||||
## The Problem — majortoot at ~3.5 years
|
||||
|
||||
| Table | Size | Rows |
|
||||
|-------|------|------|
|
||||
| `statuses` | 3.5 GB | 3.6M rows (3.6M remote cached, 37K local) |
|
||||
| `accounts` | 499 MB | 214,770 remote cached, 18 local |
|
||||
| `preview_cards` | 837 MB | remote link previews |
|
||||
| `statuses_tags` | 506 MB | cascades from statuses |
|
||||
| `conversations` | 436 MB | cascades from statuses |
|
||||
| `mentions` | 305 MB | cascades from statuses |
|
||||
|
||||
The `statuses remove` and `accounts cull` commands address most of this.
|
||||
|
||||
---
|
||||
|
||||
## Maintenance Tasks
|
||||
|
||||
### 1. Cache Clear
|
||||
|
||||
Clears in-memory Rails caches. Fast (<5 seconds), safe to run anytime.
|
||||
|
||||
```bash
|
||||
tootctl cache clear
|
||||
```
|
||||
|
||||
### 2. Statuses Remove
|
||||
|
||||
Removes cached remote statuses (and their cascaded rows in `statuses_tags`, `mentions`, `conversations`, `status_stats`) older than N days. Does **not** touch local statuses.
|
||||
|
||||
```bash
|
||||
tootctl statuses remove --days=90
|
||||
```
|
||||
|
||||
> [!warning] This is the slowest step
|
||||
> On a 3.6M-row statuses table, the extraction phase alone can take 20–40 minutes. PostgreSQL will be under heavy load. Run during off-peak hours.
|
||||
|
||||
**What gets removed:** Remote statuses not pinned, not boosted by local users, and not replied to by local users, older than the threshold.
|
||||
|
||||
### 3. Accounts Cull
|
||||
|
||||
Sends an HTTP request to each remote account's home instance to check whether the account still exists, and removes accounts that come back as 404 Not Found or 410 Gone. This catches deleted accounts and renamed handles.
|
||||
|
||||
```bash
|
||||
tootctl accounts cull
|
||||
```
|
||||
|
||||
> [!note] Network-bound
|
||||
> Cull makes HTTP requests to remote servers. It's slower on flaky network conditions and will skip accounts it can't reach (to avoid false deletions).
|
||||
|
||||
### 4. VACUUM ANALYZE
|
||||
|
||||
After large deletions, PostgreSQL does not immediately return space to the OS: dead rows stay in their pages until vacuumed. `VACUUM ANALYZE` marks that space as reusable for new rows and updates query planner statistics.
|
||||
|
||||
```bash
|
||||
sudo -u postgres psql mastodon_production -c "VACUUM ANALYZE;"
|
||||
```
|
||||
|
||||
To shrink the files on disk rather than just mark pages reusable, `VACUUM FULL` rewrites each table, but it holds an exclusive lock on the table while it runs. Stick with plain `VACUUM ANALYZE` for routine maintenance.
|
||||
|
||||
---
|
||||
|
||||
## The Maintenance Script
|
||||
|
||||
**Location:** `/home/mastodon/maintenance.sh`
|
||||
**Cron:** `0 2 * * 0` — Sunday 2 AM (runs before media prune at 3 AM)
|
||||
**Log:** `/var/log/mastodon/maintenance.log`
|
||||
**Notifications:** Email to `marcus@majorshouse.com` at each step via Postfix → MajorMail
|
||||
|
||||
The script runs all four tasks in sequence and sends a notification email:
|
||||
|
||||
- **On start** — lists steps and current DB size
|
||||
- **After cache clear** — confirms done, warns statuses remove will take a while
|
||||
- **After statuses remove** — summary output + current DB size
|
||||
- **After accounts cull** — accounts removed + current DB size
|
||||
- **On completion** — full timing breakdown and final DB size
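The per-step logging and timing described above can be sketched as a small wrapper. This is not the contents of `/home/mastodon/maintenance.sh`; it is a hypothetical illustration of the pattern, with the email notifications left out and the function name `run_step` invented for the example.

```bash
#!/usr/bin/env bash
# Hypothetical step runner: print a start line, run the command, then print
# its exit code and elapsed seconds. The caller redirects output to a log.
run_step() {
    local name="$1"; shift
    local start end rc=0
    start=$(date +%s)
    echo "[$(date '+%F %T')] START $name"
    "$@" 2>&1 || rc=$?
    end=$(date +%s)
    echo "[$(date '+%F %T')] END   $name rc=$rc elapsed=$((end - start))s"
    return "$rc"
}

run_step "demo" echo "hello from the step"
```

In a real script each task would be wrapped the same way, for example `run_step "statuses remove" tootctl statuses remove --days=90 >>"$LOG" 2>&1`, where `LOG` is a variable pointing at the maintenance log.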
|
||||
|
||||
### Running Manually
|
||||
|
||||
```bash
|
||||
ssh root@100.110.197.17
|
||||
bash /home/mastodon/maintenance.sh
|
||||
```
|
||||
|
||||
### Monitoring Progress
|
||||
|
||||
```bash
|
||||
ssh root@100.110.197.17 "tail -f /var/log/mastodon/maintenance.log"
|
||||
```
|
||||
|
||||
### tootctl Wrapper (one-off commands)
|
||||
|
||||
The `mastodon` user's rbenv is not on PATH in a login shell. Always use the wrapper:
|
||||
|
||||
```bash
|
||||
su - mastodon -c 'export PATH="/home/mastodon/.rbenv/bin:/home/mastodon/.rbenv/shims:$PATH" && eval "$(rbenv init -)" && cd /home/mastodon/live && RAILS_ENV=production bin/tootctl <command>'
|
||||
```
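Typing that wrapper for every one-off command is error-prone, so it can be folded into a shell function in root's `~/.bashrc`. The sketch below is an assumption-laden convenience, not something the original setup ships: `tootctl_as_mastodon_cmd` and `toot` are hypothetical names, and arguments are joined into one string, so anything containing spaces must be quoted by the caller.

```bash
#!/usr/bin/env bash
# Hypothetical builder: print the command string that the mastodon user's
# shell should run. $PATH and $(rbenv init -) stay literal on purpose so
# they are expanded by that shell, not by root's.
tootctl_as_mastodon_cmd() {
    printf '%s' 'export PATH="/home/mastodon/.rbenv/bin:/home/mastodon/.rbenv/shims:$PATH" && eval "$(rbenv init -)" && cd /home/mastodon/live && RAILS_ENV=production bin/tootctl '
    printf '%s\n' "$*"
}

# Hypothetical front end: run the built command as the mastodon user.
toot() {
    su - mastodon -c "$(tootctl_as_mastodon_cmd "$@")"
}

# Show what would be run, without running it.
tootctl_as_mastodon_cmd statuses remove --days=90
```

On majortoot this would be used as `toot cache clear` or `toot accounts cull`.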
|
||||
|
||||
---
|
||||
|
||||
## Full Cron Schedule on majortoot
|
||||
|
||||
| Time | Job | Script |
|
||||
|------|-----|--------|
|
||||
| Sun 2 AM | DB maintenance | `/home/mastodon/maintenance.sh` |
|
||||
| Sun 3 AM | Media prune (S3) | `/home/mastodon/media-prune.sh` |
|
||||
| Daily 8 AM | Fail2Ban digest | `/usr/local/bin/fail2ban-digest.sh` |
|
||||
| Monthly | Fail2Ban nginx-botsearch prune | `/usr/local/bin/f2b-prune.sh` |
|
||||
| Daily | Certbot renewal | `service nginx stop; certbot renew; service nginx start` |
|
||||
|
||||
---
|
||||
|
||||
## First Run Results (2026-04-22)
|
||||
|
||||
First maintenance run ever on majortoot after ~3.5 years of operation. Results pending (job running in background at time of writing). Check `/var/log/mastodon/maintenance.log` for final numbers.
|
||||
|
||||
---
|
||||
|
||||
## See Also
|
||||
|
||||
- [[Mastodon]] — service doc (deployment, access, S3 config)
|
||||
- [[majortoot]] — server doc (incident log, specs)
|
||||
- [[mastodon-federation]] — domain blocks, silencing, FediSeer
|
||||
- [[mastodon-instance-tuning]] — character limits, media cache
|
||||
168
02-selfhosting/services/mastodon-federation.md
Normal file
|
|
@ -0,0 +1,168 @@
|
|||
---
|
||||
title: Mastodon Federation — Domain Blocks, Silencing, and FediSeer
|
||||
domain: selfhosting
|
||||
category: services
|
||||
tags:
|
||||
- mastodon
|
||||
- federation
|
||||
- fediverse
|
||||
- domain-blocks
|
||||
- fediseer
|
||||
- majortoot
|
||||
status: published
|
||||
created: 2026-04-22
|
||||
updated: 2026-04-22
|
||||
---
|
||||
|
||||
# Mastodon Federation — Domain Blocks, Silencing, and FediSeer
|
||||
|
||||
## Domain Block Severity — Critical Gotcha
|
||||
|
||||
The Mastodon admin UI labels severities as **Silence** and **Suspend**, but the integer values stored in the database do **not** run from least to most severe: `noop`, the mildest setting, has the highest value. The Rails enum is:
|
||||
|
||||
```ruby
|
||||
# app/models/domain_block.rb
|
||||
enum :severity, { silence: 0, suspend: 1, noop: 2 }, validate: true
|
||||
```
|
||||
|
||||
| DB value | Meaning | Effect |
|
||||
|----------|---------|--------|
|
||||
| `0` | **silence** | Instance limited — posts hidden from public timelines; follows require manual approval |
|
||||
| `1` | **suspend** | Full defederation — all content removed, all follows severed |
|
||||
| `2` | **noop** | No effect — entry tracked but no federation action taken |
|
||||
|
||||
> [!warning] Don't trust raw integer queries
|
||||
> If you query `domain_blocks` directly via psql, severity `0` looks like "the lowest level" but it's actually **silence** — a meaningful restriction. Always map through the enum. This tripped up a defederation investigation on 2026-04-22 where 13 silenced instances (including mastodon.social) were initially misread as noop.
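When reading rows straight out of psql, a small lookup keeps the integers honest. The sketch below simply hard-codes the enum quoted above; `severity_name` and `tally` are illustrative helper names, not part of Mastodon, and the input is assumed to be `(domain, severity)` pairs exported from `domain_blocks`.

```python
# Mirror of the Rails enum quoted above: silence=0, suspend=1, noop=2.
SEVERITY_BY_VALUE = {0: "silence", 1: "suspend", 2: "noop"}


def severity_name(value: int) -> str:
    """Translate a raw domain_blocks.severity integer into its enum name."""
    try:
        return SEVERITY_BY_VALUE[value]
    except KeyError:
        raise ValueError(f"unknown severity value: {value!r}") from None


def tally(rows):
    """Count (domain, severity) rows per enum name."""
    counts = {name: 0 for name in SEVERITY_BY_VALUE.values()}
    for _domain, value in rows:
        counts[severity_name(value)] += 1
    return counts
```

Raising on an unknown integer is deliberate: a silent default is exactly how a silence ends up misread as a noop.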
|
||||
|
||||
### majortoot block inventory (as of 2026-04-22)
|
||||
|
||||
| Severity | Count | Notable entries |
|
||||
|----------|-------|-----------------|
|
||||
| silence (0) | 13 | mastodon.social, mastodon.world, chaos.social, fosstodon.org, tech.lgbt, threads.net |
|
||||
| suspend (1) | 413 | Full defederation list |
|
||||
| noop (2) | 0 | — |
|
||||
|
||||
---
|
||||
|
||||
## How Silencing Affects Follows
|
||||
|
||||
When your instance silences a remote domain, **every follow request from that domain requires manual approval** — even if your account has `locked = false`.
|
||||
|
||||
This is enforced in `app/lib/activitypub/activity/follow.rb`:
|
||||
|
||||
```ruby
|
||||
if target_account.locked? || @account.silenced?
  LocalNotificationWorker.perform_async(target_account.id, follow_request.id, 'FollowRequest', 'follow_request')
|
||||
```
|
||||
|
||||
`@account.silenced?` returns true when the sending account's domain is in your `domain_blocks` at severity=0. The follow goes to the follow_requests queue instead of being automatically accepted.
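The condition is a plain OR of two flags, which is why an unlocked account still sees follow requests. A toy model in Python, assuming a dict of `domain -> severity` exported from `domain_blocks`; the function names are hypothetical and suspended domains are deliberately not modeled here.

```python
SILENCE = 0  # domain_blocks.severity value for "silence"


def is_silenced(sender_domain: str, domain_blocks: dict) -> bool:
    """True when the sender's domain carries a silence-level block."""
    return domain_blocks.get(sender_domain) == SILENCE


def needs_manual_approval(target_locked: bool, sender_domain: str, domain_blocks: dict) -> bool:
    """Model of the quoted condition: either flag routes the follow to the request queue."""
    return target_locked or is_silenced(sender_domain, domain_blocks)
```

With `{"mastodon.social": 0}` as the block map, a follow from mastodon.social needs approval even when `target_locked` is `False`.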
|
||||
|
||||
**Practical effect on majortoot:** mastodon.social is silenced (added 2026-12-11, same day as a FluentInFinance follow-spam report). All follows from mastodon.social accounts appear as pending follow requests requiring manual approval. This is intentional — it's the expected behavior of a silence block.
|
||||
|
||||
---
|
||||
|
||||
## Checking Defederation Status
|
||||
|
||||
### Are major instances blocking you?
|
||||
|
||||
Check if your domain appears in another instance's public block list:
|
||||
|
||||
```bash
|
||||
# Check mastodon.social's public block list (397 entries as of 2026-04-22)
|
||||
curl -s "https://mastodon.social/api/v1/instance/domain_blocks" | \
|
||||
python3 -c "import sys,json; data=json.load(sys.stdin); \
|
||||
found=[b for b in data if b['domain']=='toot.majorshouse.com']; \
|
||||
print('BLOCKED' if found else 'Not in public block list')"
|
||||
```
|
||||
|
||||
Note: instances can mark blocks as private, so absence from the public list is not a guarantee.
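A second caveat is that instances may obfuscate the `domain` field in the public list (for example by replacing characters with `*`), in which case an exact string comparison misses the entry. Each entry also carries a `digest` field holding the SHA-256 of the real domain, so a more robust check compares both. A minimal sketch, with `find_block` as a hypothetical helper name:

```python
import hashlib


def find_block(blocks, domain):
    """Return the entry that blocks `domain`, matching by name or by SHA-256 digest."""
    wanted = domain.strip().lower()
    digest = hashlib.sha256(wanted.encode("utf-8")).hexdigest()
    for entry in blocks:
        if entry.get("domain") == wanted or entry.get("digest") == digest:
            return entry
    return None
```

Feed it the parsed JSON from the `curl` call above, e.g. `find_block(json.load(sys.stdin), "toot.majorshouse.com")`. A `None` result still only covers the public list.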
|
||||
|
||||
### Are you in their peer list?
|
||||
|
||||
If you're in an instance's peer list, they've federated with you at some point:
|
||||
|
||||
```bash
|
||||
curl -s "https://mastodon.social/api/v1/instance/peers" | \
|
||||
python3 -c "import sys,json; data=json.load(sys.stdin); print('toot.majorshouse.com' in data)"
|
||||
```
|
||||
|
||||
### Is the account visible from a remote instance?
|
||||
|
||||
```bash
|
||||
curl -s "https://mastodon.social/api/v1/accounts/lookup?acct=majorlinux@toot.majorshouse.com" | \
|
||||
python3 -c "import sys,json; d=json.load(sys.stdin); print('limited:', d.get('limited'), 'suspended:', d.get('suspended'))"
|
||||
```
|
||||
|
||||
`limited: true` means the remote instance has limited (silenced) this account, either individually or through a domain-level silence of toot.majorshouse.com.
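The lookup can come back in four shapes, and it helps to name them. The sketch below assumes the parsed JSON body of the lookup call and that a failed lookup returns an object with an `error` key; `federation_view` is an illustrative name.

```python
def federation_view(account: dict) -> str:
    """Classify how a remote instance presents the looked-up account."""
    if "error" in account:
        return "not found"   # the remote instance could not resolve the account
    if account.get("suspended"):
        return "suspended"
    if account.get("limited"):
        return "limited"
    return "visible"
```

Checking `suspended` before `limited` is a design choice for this sketch: the harsher state is reported first if both flags ever appear.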
|
||||
|
||||
### Check federation delivery health (Sidekiq)
|
||||
|
||||
```bash
|
||||
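ssh root@100.110.197.17 "redis-cli zcard dead; redis-cli zcard retry"
# Sidekiq keeps both queues as sorted sets, so count them with ZCARD;
# LLEN on a key that does not exist always returns 0 and hides problems.
# Key names assume no REDIS_NAMESPACE prefix is configured.
# Both should be 0 for a healthy instance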
|
||||
```
|
||||
|
||||
### Check unavailable domains (delivery consistently failing)
|
||||
|
||||
```bash
|
||||
ssh root@100.110.197.17 "
|
||||
sudo -u postgres psql mastodon_production -c \
|
||||
'SELECT domain, updated_at FROM unavailable_domains ORDER BY updated_at DESC LIMIT 20;'"
|
||||
```
|
||||
|
||||
These are domains where ActivityPub delivery has repeatedly failed. Most are dead instances, not active blocks.
|
||||
|
||||
---
|
||||
|
||||
## FediSeer Registration
|
||||
|
||||
[FediSeer](https://fediseer.com) is a community service that tracks censures (formal complaints) against fediverse instances. Registering lets you monitor if any instance formally censures toot.majorshouse.com.
|
||||
|
||||
### majortoot status (registered 2026-04-22)
|
||||
|
||||
| Field | Value |
|
||||
|-------|-------|
|
||||
| Domain | toot.majorshouse.com |
|
||||
| ID | 5575 |
|
||||
| State | UP |
|
||||
| Censures received | 0 |
|
||||
| Endorsements | 0 |
|
||||
| Tags | mastodon, selfhosted, leftist, foss |
|
||||
| Guarantor | none |
|
||||
| API key | Bitwarden — "FediSeer — toot.majorshouse.com" |
|
||||
|
||||
### Claiming / re-claiming your instance
|
||||
|
||||
```bash
|
||||
# Claim (sends API key via DM from @fediseer@fediseer.com)
|
||||
curl -s -X PUT "https://fediseer.com/api/v1/whitelist/toot.majorshouse.com" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"admin": "majorlinux", "pm_proxy": "MASTODON"}'
|
||||
|
||||
# The API key arrives as a DM — delete the DM after saving to Bitwarden
|
||||
```
|
||||
|
||||
### Check censures
|
||||
|
||||
```bash
|
||||
curl -s "https://fediseer.com/api/v1/censures/toot.majorshouse.com" | python3 -c \
|
||||
"import sys,json; d=json.load(sys.stdin); print('Censures:', d.get('total',0))"
|
||||
```
|
||||
|
||||
### Update tags
|
||||
|
||||
```bash
|
||||
curl -s -X PUT "https://fediseer.com/api/v1/tags" \
|
||||
-H "Content-Type: application/json" \
|
||||
-H "apikey: <key-from-bitwarden>" \
|
||||
-d '{"tags_csv": "mastodon,selfhosted,leftist,foss"}'
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## See Also
|
||||
|
||||
- [[Mastodon]] — service doc
|
||||
- [[majortoot]] — server doc
|
||||
- [[mastodon-db-maintenance]] — statuses remove, accounts cull, vacuum
|
||||
- [[mastodon-instance-tuning]] — character limits, media cache
|
||||
|
|
@ -0,0 +1,111 @@
|
|||
---
|
||||
title: "Fantastical Google Sync Error Flood — Phantom Calendars Fixed via syncselect"
|
||||
domain: troubleshooting
|
||||
category: productivity
|
||||
tags: [fantastical, google-calendar, caldav, sync, macos, syncselect]
|
||||
status: published
|
||||
created: 2026-04-24
|
||||
updated: 2026-04-24
|
||||
---
|
||||
|
||||
# Fantastical Google Sync Error Flood — Phantom Calendars Fixed via syncselect
|
||||
|
||||
Fantastical floods its macOS unified log with Google Calendar sync errors, the app shows persistent sync failures in the UI, and re-adding the Google account inside Fantastical doesn't fix it. The cause is usually orphan calendar references — calendars that were deleted from Google Calendar but still enabled in Google's per-account CalDAV sync whitelist.
|
||||
|
||||
## The Short Answer
|
||||
|
||||
Visit **`https://www.google.com/calendar/syncselect`**, uncheck any calendars that no longer exist or you don't want Fantastical / Apple Calendar to try syncing, click Save. Fantastical's error flood stops within one sync cycle.
|
||||
|
||||
This is a per-Google-account page — completely independent of Fantastical's settings, and independent of the calendar list inside Google Calendar's main web UI.
|
||||
|
||||
## Background
|
||||
|
||||
Google Calendar has **three** separate notions of calendar "visibility" for a given account:
|
||||
|
||||
| Surface | What it controls |
|
||||
|---|---|
|
||||
| `calendar.google.com` main UI — calendar list in the left sidebar | What you see in Google's own web interface |
|
||||
| `calendar.google.com/calendar/u/0/r/settings/calendar/...` — per-calendar settings | Sharing, notifications, access control |
|
||||
| **`google.com/calendar/syncselect`** — sync selection | **What Google exposes to third-party CalDAV/Exchange clients** (Apple Calendar, Fantastical, Outlook, Thunderbird, etc.) |
|
||||
|
||||
Fantastical talks to Google via CalDAV. It asks Google for the list of calendars enabled for CalDAV sync. If `syncselect` still has a calendar flagged for sync but the calendar has been deleted from Google (or unshared from you), Google returns an inconsistent response — the CalDAV principal lists the calendar ID but any request for its data returns 404. Fantastical dutifully logs an error and retries next sync cycle. Multiply by the number of orphans and you get a flood.
|
||||
|
||||
Deleting a calendar from Google Calendar's main UI does **not** automatically remove it from `syncselect`. That's the gotcha.
|
||||
|
||||
## Symptoms
|
||||
|
||||
- Fantastical UI shows "Sync Error" or a red badge on the account
|
||||
- macOS unified log filling with lines like:
|
||||
```
|
||||
[FBGooglePrincipalSyncSession.m] Unable to find Google Calendar information:
|
||||
<calendar-id>@group.calendar.google.com in (<list of real calendars>)
|
||||
```
|
||||
- `dataaccessd` logs `Error Domain=kEKAccountErrorDomain Code=0` with `lastSyncStartDate = (null)`
|
||||
- Fantastical's helper `85C27NK92C.com.flexibits.fantastical2.mac.helper` spams XPC / CoreData token errors every 3 seconds (secondary symptom when the token store gets wedged in the retry loop)
|
||||
|
||||
## Diagnosis
|
||||
|
||||
### Step 1 — Spot the phantom calendar IDs in the log
|
||||
|
||||
```bash
|
||||
log show --last 5m --style compact \
|
||||
--predicate 'eventMessage CONTAINS "Unable to find Google Calendar"' 2>/dev/null \
|
||||
| grep -oE 'information: [a-zA-Z0-9._%@-]+' | sort -u
|
||||
```
|
||||
|
||||
Each line returned is a calendar ID Fantastical is asking Google for that Google can't find.
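If the log output has already been saved to a file, the same extraction can be done in Python. This is a sketch that mirrors the `grep -oE` pattern above; `phantom_calendar_ids` is a hypothetical name, and the regex allows whitespace (including a line break) between the message and the ID in case the log line wraps.

```python
import re

# Same character class as the grep pattern in Step 1.
PHANTOM_ID = re.compile(
    r"Unable to find Google Calendar information:\s*([A-Za-z0-9._%@-]+)"
)


def phantom_calendar_ids(log_text: str) -> list:
    """Return the unique calendar IDs named in the sync errors, sorted."""
    return sorted(set(PHANTOM_ID.findall(log_text)))
```

Pass it the captured text of `log show`; the result is the list of IDs to look up in Step 2.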
|
||||
|
||||
### Step 2 — Get calendar names from Fantastical's local DB
|
||||
|
||||
The orphan IDs alone look random. To match them to what the calendars were called (so you know what to uncheck in syncselect), query Fantastical's SQLite DB:
|
||||
|
||||
```bash
|
||||
DB="$HOME/Library/Group Containers/85C27NK92C.com.flexibits.fantastical2.mac/Database/Fantastical-8.fcdata"
|
||||
|
||||
for id in <each-orphan-id-here>; do
  echo "--- $id ---"
  strings "$DB" 2>/dev/null | grep "$id" | head -5
done
|
||||
```
|
||||
|
||||
Fantastical stores the calendar's display name near each ID in the binary form. You may see names like `Kitchen Lights`, `Major7`, or other labels that remind you what the calendar was used for: often a deleted smart-home automation trigger, an old device's dedicated calendar, a former coworker's shared calendar, or a subscribed sports or holiday calendar that moved.
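Where `strings` is unavailable, or the lookup needs to be scripted, the same idea can be approximated in Python: pull printable ASCII runs out of the raw bytes and keep the ones that mention the ID. This is a rough sketch, not a parser for the `.fcdata` format; `strings_containing` is a hypothetical helper, and non-ASCII display names will be missed.

```python
import re


def strings_containing(data: bytes, needle: str, min_len: int = 4, limit: int = 5):
    """Return up to `limit` printable ASCII runs in `data` that contain `needle`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    hits = []
    for match in pattern.finditer(data):
        text = match.group().decode("ascii")
        if needle in text:
            hits.append(text)
            if len(hits) == limit:
                break
    return hits
```

Typical use would be `strings_containing(Path(db_path).read_bytes(), orphan_id)`, with `db_path` set to the database location used in the shell loop.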
|
||||
|
||||
### Step 3 — Visit syncselect
|
||||
|
||||
Open `https://www.google.com/calendar/syncselect` in the same browser you're signed in with. You'll see every calendar Google knows about for this account, with a checkbox per entry:
|
||||
|
||||
- ✅ Live calendars you want on devices — leave checked
|
||||
- ❌ Orphans, former smart-home triggers, deleted shared calendars — **uncheck**
|
||||
- Unsure? Cross-reference against the names from Step 2
|
||||
|
||||
Click **Save**.
|
||||
|
||||
## Fix
|
||||
|
||||
1. Uncheck orphans at `https://www.google.com/calendar/syncselect`, click Save.
|
||||
2. Let Fantastical complete one more sync cycle (or quit + relaunch for faster turnaround).
|
||||
3. Verify the log is clean:
|
||||
```bash
|
||||
log show --last 2m --style compact \
|
||||
--predicate 'eventMessage CONTAINS "Unable to find Google Calendar"' 2>/dev/null \
|
||||
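| grep -c "Unable to find Google Calendar"   # count matches only; wc -l would also count the header line log show prints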
|
||||
```
|
||||
Should return 0.
|
||||
|
||||
**What you should NOT do as a first attempt:**
|
||||
|
||||
- Remove and re-add the Google account inside Fantastical. This fixes some orphans but not all — Fantastical's local event cache keeps references to calendars that have associated cached events, so orphans with historical data survive a standard account re-add. Hit `syncselect` first.
|
||||
- Delete Fantastical's `.fcdata` SQLite. Nuclear, loses local cache, unnecessary for this specific issue.
|
||||
|
||||
## Gotchas & Notes
|
||||
|
||||
- **syncselect is per-Google-account**, so if you have multiple Google accounts in Fantastical, each needs its own visit. The URL will use whichever account you're currently signed in with in the browser.
|
||||
- **Calendar deletion from `calendar.google.com` doesn't propagate to syncselect.** This is a Google quirk, not a Fantastical bug.
|
||||
- **The same fix applies to Apple Calendar.app** if it's showing the same sync errors — Fantastical and Apple Calendar use identical CalDAV plumbing via macOS's `dataaccessd`.
|
||||
- The phantom calendar IDs will remain in Fantastical's `.fcdata` for a while even after the fix — Fantastical doesn't aggressively garbage-collect cached event data. This is cosmetic and doesn't re-trigger sync errors as long as syncselect no longer lists them.
|
||||
- The XPC `Unable to create token NSXPCConnection` loop is downstream of the sync error flood — when Fantastical's helper gets wedged on repeated failed syncs, its CoreData-backed OAuth token store can't initialize cleanly. Fixing syncselect + a full Fantastical quit (menubar → Quit Fantastical, not just `Cmd+Q`) + relaunch clears this too.
|
||||
|
||||
## Related
|
||||
|
||||
- [[Recap]] skill — uses Google Calendar MCPs that are unaffected by this issue (MCPs go through Google's API directly, not CalDAV)
|
||||
- Google's syncselect URL: https://www.google.com/calendar/syncselect
|
||||
98
05-troubleshooting/fantastical-mcp-permission-denied.md
Normal file
|
|
@ -0,0 +1,98 @@
|
|||
---
|
||||
title: "Fantastical MCP Server: Permission Denied on Launch (macOS Quarantine)"
|
||||
domain: troubleshooting
|
||||
category: productivity
|
||||
tags: [fantastical, mcp, claude, macos, gatekeeper, quarantine, cowork]
|
||||
status: published
|
||||
created: 2026-04-26
|
||||
updated: 2026-04-26
|
||||
---
|
||||
|
||||
# Fantastical MCP Server: Permission Denied on Launch (macOS Quarantine)
|
||||
|
||||
Fantastical's MCP server fails to connect in Claude/Cowork with a `Server disconnected` error and no dialog or prompt to explain why. The binary is installed but macOS Gatekeeper silently blocks it from executing.
|
||||
|
||||
## The Short Answer
|
||||
|
||||
```bash
|
||||
xattr -d com.apple.quarantine "/Users/majorlinux/Library/Application Support/Claude/Claude Extensions/ant.dir.gh.flexibits.fantastical-mcp/server/FantasticalMCP.app"
|
||||
```
|
||||
|
||||
Fully quit and reopen Cowork. Fantastical MCP reconnects cleanly.
|
||||
|
||||
If the quarantine attribute isn't present, also try setting the executable bit:
|
||||
|
||||
```bash
|
||||
chmod +x "/Users/majorlinux/Library/Application Support/Claude/Claude Extensions/ant.dir.gh.flexibits.fantastical-mcp/server/FantasticalMCP.app/Contents/MacOS/FantasticalMCP"
|
||||
```
|
||||
|
||||
## Why This Happens
|
||||
|
||||
macOS automatically tags downloaded files with a `com.apple.quarantine` extended attribute. When you launch an app yourself, macOS shows a Gatekeeper dialog — click Open, and the quarantine flag is cleared. But the FantasticalMCP binary is never launched by the user directly; Claude/Cowork spawns it as a subprocess. There's no dialog, and Gatekeeper just returns `Permission denied`. Claude sees the process die immediately and logs `Server disconnected`.
|
||||
|
||||
This recurs after any Fantastical update that replaces the MCP binary — the new binary comes in quarantined again.
|
||||
|
||||
## Diagnosis
|
||||
|
||||
The log tells the whole story. Check:
|
||||
|
||||
```bash
|
||||
tail -n 50 ~/Library/Logs/Claude/mcp-server-Fantastical.log
|
||||
```
|
||||
|
||||
If you see this sequence, it's the quarantine issue:
|
||||
|
||||
```
|
||||
Server started and connected successfully
|
||||
Failed to spawn process: Permission denied
|
||||
Server transport closed
|
||||
Server disconnected.
|
||||
```
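For anyone who wants to check this from a script (for example after every Fantastical update), the signature is easy to detect. A minimal sketch, assuming the log text has already been read from the path above and that the two message strings appear verbatim as shown; `quarantine_failure` is a hypothetical name.

```python
STARTED = "Server started and connected successfully"
DENIED = "Failed to spawn process: Permission denied"


def quarantine_failure(log_text: str) -> bool:
    """True if the most recent start attempt was followed by a spawn denial."""
    lines = log_text.splitlines()
    # Index of the last "started" line, or -1 if the log has none.
    last_start = max(
        (i for i, line in enumerate(lines) if STARTED in line), default=-1
    )
    return any(DENIED in line for line in lines[last_start + 1:])
```

Only the lines after the latest start are inspected, so an old failure earlier in the log does not trigger a false alarm once the server has come up cleanly.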
|
||||
|
||||
## Full Fix
|
||||
|
||||
**Step 1 — Remove the quarantine flag:**
|
||||
|
||||
```bash
|
||||
xattr -d com.apple.quarantine \
|
||||
"/Users/majorlinux/Library/Application Support/Claude/Claude Extensions/ant.dir.gh.flexibits.fantastical-mcp/server/FantasticalMCP.app"
|
||||
```
|
||||
|
||||
**Step 2 — Verify the attribute is gone:**
|
||||
|
||||
```bash
|
||||
xattr "/Users/majorlinux/Library/Application Support/Claude/Claude Extensions/ant.dir.gh.flexibits.fantastical-mcp/server/FantasticalMCP.app"
|
||||
```
|
||||
|
||||
Should return empty or only non-quarantine attributes. If `com.apple.quarantine` is still listed, re-run step 1.
|
||||
|
||||
**Step 3 — Fully quit and reopen Cowork:**
|
||||
|
||||
Cmd+Q (not just close the window). Closing the window leaves the MCP host process running — it won't retry the failed server until the app fully relaunches.
|
||||
|
||||
**Step 4 — Verify connection:**
|
||||
|
||||
Check the log again:
|
||||
|
||||
```bash
|
||||
tail -n 10 ~/Library/Logs/Claude/mcp-server-Fantastical.log
|
||||
```
|
||||
|
||||
You should see `Server started and connected successfully` with no `Permission denied` line following it.
|
||||
|
||||
## After a Fantastical Update
|
||||
|
||||
If this breaks again after Fantastical auto-updates, re-run the `xattr -d` command from Step 1. The update replaces the binary and macOS re-quarantines the new one.
|
||||
|
||||
## MCP Log Locations
|
||||
|
||||
| Log | Path |
|
||||
|-----|------|
|
||||
| Fantastical MCP | `~/Library/Logs/Claude/mcp-server-Fantastical.log` |
|
||||
| All MCP servers | `~/Library/Logs/Claude/mcp*.log` |
|
||||
| Combined MCP log | `~/Library/Logs/Claude/mcp.log` |
|
||||
|
||||
## Related
|
||||
|
||||
- [[Fantastical Google Sync Error Flood — Phantom Calendars Fixed via syncselect]]
|
||||
- MCP debugging docs: https://modelcontextprotocol.io/docs/tools/debugging
|
||||
|
|
@ -1,126 +0,0 @@
|
|||
---
|
||||
title: "Fedora usrmerge: ebtables Symlink Blocks Directory Consolidation"
|
||||
domain: troubleshooting
|
||||
category: fedora
|
||||
tags: [fedora, usrmerge, ebtables, update-alternatives, ansible, dnf]
|
||||
status: published
|
||||
created: 2026-04-19
|
||||
updated: 2026-04-19
|
||||
---
|
||||
|
||||
# Fedora usrmerge: ebtables Symlink Blocks Directory Consolidation
|
||||
|
||||
## Symptom
|
||||
|
||||
Every `dnf upgrade` on Fedora 43 (and some earlier Fedora releases) emits a warning partway through the transaction:
|
||||
|
||||
```
|
||||
/usr/sbin cannot be merged yet, /usr/sbin/ebtables points to /etc/alternatives/ebtables
|
||||
```
|
||||
|
||||
When the upgrade is driven by Ansible, the warning contaminates the module's JSON output and surfaces in a play run as:
|
||||
|
||||
```
|
||||
TASK [Upgrade all packages on CentOS/Fedora servers] ***
|
||||
changed: [majorlab]
|
||||
[WARNING]: Module invocation had junk after the JSON data:
|
||||
/usr/sbin cannot be merged yet, /usr/sbin/ebtables points to /etc/alternatives/ebtables
|
||||
changed: [majordiscord]
|
||||
```
|
||||
|
||||
The upgrade succeeds — the warning is cosmetic — but it keeps firing on every run until the underlying state is cleaned up.
|
||||
|
||||
## Why It Happens
|
||||
|
||||
Fedora's `usrmerge` transition turns `/usr/sbin` into a symlink to `/usr/bin`. The `filesystem` package's post-install scriptlet enforces that at every transaction: it walks `/usr/sbin` looking for any entity still pinned to the old path and refuses to consolidate until they're removed.
|
||||
|
||||
`ebtables` triggers this because `update-alternatives` can create registrations at `/usr/sbin/<cmd>` with targets in `/etc/alternatives/<cmd>`. Those symlinks:
|
||||
|
||||
- Are **not owned by any rpm** (confirmable with `rpm -qf /usr/sbin/ebtables` → "not owned")
|
||||
- Predate the usrmerge — they were created when `/usr/sbin` was still a real directory
|
||||
- Point to a target (`/etc/alternatives/ebtables`) that in turn points back into `/usr/sbin/ebtables-legacy` or `/usr/bin/ebtables-nft`
|
||||
|
||||
Because these live outside rpm, no package upgrade can clean them up. The filesystem scriptlet detects the blocker and backs off.
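To see what else might be blocking the merge besides `ebtables`, the check can be approximated in a few lines of Python. This is a sketch of the idea only, not the logic of the `filesystem` scriptlet: it lists symlinks in a directory whose targets sit under the alternatives directory. `alternatives_blockers` is a hypothetical name and both default paths are assumptions taken from the warning text.

```python
import os


def alternatives_blockers(directory="/usr/sbin", alternatives_dir="/etc/alternatives"):
    """List (link, target) pairs in `directory` whose target lives in `alternatives_dir`."""
    if os.path.islink(directory) or not os.path.isdir(directory):
        return []  # already merged into /usr/bin, or nothing to inspect
    prefix = alternatives_dir.rstrip("/") + "/"
    blockers = []
    with os.scandir(directory) as entries:
        for entry in sorted(entries, key=lambda e: e.name):
            if entry.is_symlink():
                target = os.readlink(entry.path)
                if target.startswith(prefix):
                    blockers.append((entry.path, target))
    return blockers
```

On an affected host this would be expected to include `/usr/sbin/ebtables`; an empty list means either the merge has already happened or no alternatives-managed links remain.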
|
||||
|
||||
## Investigation
|
||||
|
||||
1. Confirm which hosts are affected:
|
||||
```bash
|
||||
ansible fedora -m shell -a '[ -e /usr/sbin/ebtables ] && ls -la /usr/sbin/ebtables'
|
||||
```
|
||||
2. Inspect the alternatives registration:
|
||||
```bash
|
||||
update-alternatives --display ebtables
|
||||
```
|
||||
Note whether the link points at `/usr/bin/ebtables-nft` (nft backend) or `/usr/sbin/ebtables-legacy` (legacy backend). Different Fedora images ship with different defaults.
|
||||
3. Confirm ownership:
|
||||
```bash
|
||||
rpm -qf /usr/sbin/ebtables /etc/alternatives/ebtables
|
||||
```
|
||||
Both should report "not owned by any package." That's the signal.
|
||||
|
||||
## Fix
|
||||
|
||||
Tear down the alternative, delete the blocker symlinks, then re-register with **`/usr/bin` paths on both sides of the registration** so the scriptlet has nothing left to complain about.
|
||||
|
||||
```bash
|
||||
# Capture current provider first (nft or legacy)
|
||||
update-alternatives --display ebtables
|
||||
|
||||
# Remove the stale registration
|
||||
update-alternatives --remove-all ebtables
|
||||
|
||||
# Clear the blocking symlinks (not rpm-owned)
|
||||
rm -f /usr/sbin/ebtables /etc/alternatives/ebtables
|
||||
|
||||
# Re-register with /usr/bin paths — example for nft backend
|
||||
update-alternatives --install /usr/bin/ebtables ebtables /usr/bin/ebtables-nft 10 \
|
||||
--slave /usr/bin/ebtables-restore ebtables-restore /usr/bin/ebtables-nft-restore \
|
||||
--slave /usr/bin/ebtables-save ebtables-save /usr/bin/ebtables-nft-save \
|
||||
--slave /usr/share/man/man8/ebtables.8.gz ebtables.8.gz /usr/share/man/man8/ebtables-nft.8.gz
|
||||
|
||||
# For legacy backend, swap -nft suffixes for -legacy
|
||||
```
|
||||
|
||||
Verify:
|
||||
|
||||
```bash
|
||||
which ebtables # should resolve to /usr/bin/ebtables
|
||||
ebtables -V # should print the version without error
|
||||
test -e /usr/sbin/ebtables && echo BLOCKER || echo clean
|
||||
```
|
||||
|
||||
Next `dnf upgrade` will consolidate `/usr/sbin` cleanly with no warning.
|
||||
|
||||
## Ansible Playbook
|
||||
|
||||
`MajorAnsible/fix_ebtables_usrmerge.yml` handles this fleet-wide:
|
||||
|
||||
- Detects the backend (nft vs legacy) per host via `update-alternatives --display`
|
||||
- Uses `check_mode: false` on the detection query — otherwise `ansible.builtin.command` is skipped in `--check`, the detection fact defaults, and downstream conditionals misfire (see [Ansible Check Mode False Positives](ansible-check-mode-false-positives.md) for the broader pattern)
|
||||
- Safety check: bails out if `/usr/bin/ebtables-<backend>` is missing before touching anything
|
||||
- Idempotent on re-run — no alternative registered → `end_host`
|
||||
|
||||
Applied 2026-04-19 across the four Fedora hosts:
|
||||
|
||||
| Host | Backend |
|
||||
|---|---|
|
||||
| majorlab | nft (`ebtables v1.8.11 nf_tables`) |
|
||||
| majorhome | nft |
|
||||
| majormail | legacy (`ebtables v2.0.11 (legacy)`) |
|
||||
| majordiscord | legacy |
|
||||
|
||||
## Why not just remove ebtables?
|
||||
|
||||
Tempting, since nothing on the fleet currently writes L2 bridge firewall rules. But:
|
||||
|
||||
- `ebtables` is a transitive dependency of iptables/libvirt/networking packages on Fedora — removing it fights the package manager
|
||||
- The package itself isn't the problem; the **stale alternatives state** is
|
||||
|
||||
Cleaning up the registration is cheaper than untangling the dependency graph.
|
||||
|
||||
## Related
|
||||
|
||||
- [Ansible Check Mode False Positives in Verify/Assert Tasks](ansible-check-mode-false-positives.md)
|
||||
- Playbook: `MajorAnsible/fix_ebtables_usrmerge.yml`
|
||||
- Fedora usrmerge background: `man file-hierarchy`, Fedora Change pages "UsrMove" (Fedora 17) and "Unify /usr/bin and /usr/sbin" (Fedora 42)
|
||||
60
05-troubleshooting/python-smtplib-missing-rfc-headers.md
Normal file
|
|
@ -0,0 +1,60 @@
|
|||
---
|
||||
title: "Python smtplib: Missing Date/Message-ID Headers Break Mail Clients"
|
||||
domain: troubleshooting
|
||||
category: general
|
||||
tags: [email, python, smtplib, spam, rfc, spark]
|
||||
status: published
|
||||
created: 2026-04-29
|
||||
updated: 2026-04-29
|
||||
---
|
||||
# Python smtplib: Missing Date/Message-ID Headers Break Mail Clients
|
||||
|
||||
## Problem
|
||||
|
||||
Emails sent via Python's `smtplib` and `EmailMessage` appear on some mail clients but not others. The emails are delivered to the server and visible in Maildir, but specific clients silently suppress them.
|
||||
|
||||
## Root Cause
|
||||
|
||||
Python's `EmailMessage` does **not** automatically add `Date:` or `Message-ID:` headers. RFC 5322 requires `Date:` and says every message SHOULD carry a `Message-ID:`. Without them:
|
||||
|
||||
- **SpamAssassin** flags `MISSING_DATE` and `MISSING_MID`, and may set `X-Spam-Flag: YES` even if the overall score is below the spam threshold
|
||||
- **Mail clients** (e.g., Spark) may filter on the spam flag header and silently hide the message — no Junk folder, just invisible
|
||||
- **Other clients** (e.g., iPhone Mail, some Spark builds) may be more lenient and display the message anyway
|
||||
|
||||
This creates a confusing situation where the same email appears on one device but not another, despite both using the same IMAP account.
|
||||
|
||||
## Fix
|
||||
|
||||
Always include `Date` and `Message-ID` headers when constructing emails with `EmailMessage`:
|
||||
|
||||
```python
|
||||
import smtplib
|
||||
from email.message import EmailMessage
|
||||
from email.utils import formatdate, make_msgid
|
||||
|
||||
msg = EmailMessage()
|
||||
msg['Subject'] = 'Your subject here'
|
||||
msg['From'] = 'sender@example.com'
|
||||
msg['To'] = 'recipient@example.com'
|
||||
msg['Date'] = formatdate(localtime=True)
|
||||
msg['Message-ID'] = make_msgid(domain='example.com')
|
||||
msg.set_content('Email body here')
|
||||
|
||||
with smtplib.SMTP('mail.example.com', 25) as s:
    s.send_message(msg)
|
||||
```
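In scripts that build messages in several places, one option is a small guard that fills in whichever header is missing just before sending. A minimal sketch; `ensure_required_headers` is a hypothetical helper name, not part of the standard library, and the `domain` argument is whatever domain should appear in the generated Message-ID.

```python
from email.message import EmailMessage
from email.utils import formatdate, make_msgid


def ensure_required_headers(msg: EmailMessage, domain=None) -> EmailMessage:
    """Add Date and Message-ID only if the caller has not set them already."""
    if msg["Date"] is None:
        msg["Date"] = formatdate(localtime=True)
    if msg["Message-ID"] is None:
        # With domain=None, make_msgid falls back to the local hostname.
        msg["Message-ID"] = make_msgid(domain=domain)
    return msg
```

Calling it as `s.send_message(ensure_required_headers(msg, domain='example.com'))` leaves existing headers untouched, so it is safe to apply to messages that already set them.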
|
||||
|
||||
## Verification
|
||||
|
||||
After applying the fix, check that SpamAssassin no longer flags the headers:
|
||||
|
||||
```bash
|
||||
# Check email headers on the mail server
|
||||
grep -E 'MISSING_DATE|MISSING_MID|X-Spam' /var/vmail/domain/user/cur/<message-file>
|
||||
```
|
||||
|
||||
A clean message should show `X-Spam-Status: No` with no `MISSING_DATE` or `MISSING_MID` in the test list.
|
||||
|
||||
## Key Takeaway
|
||||
|
||||
Python's `EmailMessage` is a low-level builder — it trusts you to set all required headers. Unlike higher-level mail libraries or webmail interfaces, it will happily send a message with no date or message ID. Always add both explicitly in any script that sends email via `smtplib`.
|
||||
98
05-troubleshooting/ubuntu-dist-upgrade-repo-quarantine.md
Normal file
|
|
@ -0,0 +1,98 @@
|
|||
---
|
||||
title: "Ubuntu dist-upgrade Quarantines Third-Party Repos"
|
||||
domain: troubleshooting
|
||||
category: ubuntu
|
||||
tags: [ubuntu, apt, dist-upgrade, repositories, tailscale, digitalocean]
|
||||
status: published
|
||||
created: 2026-04-28
|
||||
updated: 2026-04-28
|
||||
---
|
||||
|
||||
# Ubuntu dist-upgrade Quarantines Third-Party Repos
|
||||
|
||||
## Problem
|
||||
|
||||
When running `do-release-upgrade` (e.g., Jammy 22.04 to Noble 24.04), Ubuntu renames all third-party `.list` files in `/etc/apt/sources.list.d/` to `.list.distUpgrade`. This silently disables every third-party repo — packages from those repos stop receiving updates with no warning.
|
||||
|
||||
The upgrade process does this intentionally because it can't guarantee third-party repos will have packages for the new release. Some repos get re-added as `.sources` files during the upgrade, but many don't.
|
||||
|
||||
## Symptoms
|
||||
|
||||
- `apt list --upgradable` shows nothing for packages you know have updates (e.g., Tailscale stuck on an old version)
|
||||
- `apt list --installed` shows packages as `[installed,local]` instead of `[installed]` — the "local" tag means apt has no repo to check for updates
|
||||
- `.distUpgrade` files accumulate in `/etc/apt/sources.list.d/` indefinitely
|
||||
|
||||
## Diagnosis
|
||||
|
||||
Check for quarantined repos:
|
||||
|
||||
```bash
|
||||
ls /etc/apt/sources.list.d/*.distUpgrade
|
||||
```
|
||||
|
||||
For each file, check whether a replacement `.list` or `.sources` file already exists:
|
||||
|
||||
```bash
|
||||
ls /etc/apt/sources.list.d/*.list /etc/apt/sources.list.d/*.sources
|
||||
```
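On hosts with many quarantined files, a rough triage script can sort them into the three cases handled in the Fix section. The sketch below is a heuristic under stated assumptions: a replacement counts only if it shares the same base name (a renamed replacement such as a `-noble.sources` file will not be detected), and the codename list covers only the Jammy to Noble upgrade described here. `triage` and `CODENAMES` are illustrative names.

```python
from pathlib import Path

CODENAMES = ("jammy", "noble")  # assumption: only this upgrade path matters
SUFFIX = ".distUpgrade"


def triage(sources_dir="/etc/apt/sources.list.d"):
    """Map each quarantined file name to 'replaced', 'distro-specific' or 'distro-agnostic'."""
    directory = Path(sources_dir)
    verdicts = {}
    for quarantined in sorted(directory.glob("*" + SUFFIX)):
        base = quarantined.name[: -len(SUFFIX)]            # e.g. tailscale.list
        stem = base[: -len(".list")] if base.endswith(".list") else base
        if (directory / base).exists() or (directory / f"{stem}.sources").exists():
            verdicts[quarantined.name] = "replaced"
        elif any(name in quarantined.read_text(errors="replace") for name in CODENAMES):
            verdicts[quarantined.name] = "distro-specific"
        else:
            verdicts[quarantined.name] = "distro-agnostic"
    return verdicts
```

The script only reads files and reports; renaming, regenerating, or deleting is still a manual decision per repo.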
|
||||
|
||||
## Fix
|
||||
|
||||
### Distro-agnostic repos (e.g., DigitalOcean agents)
|
||||
|
||||
If the repo URL doesn't reference a distro codename (jammy/noble), just rename:
|
||||
|
||||
```bash
|
||||
mv /etc/apt/sources.list.d/digitalocean-agent.list.distUpgrade \
|
||||
/etc/apt/sources.list.d/digitalocean-agent.list
|
||||
```
|
||||
|
||||
### Distro-specific repos (e.g., Tailscale, ondrej-php)
|
||||
|
||||
The quarantined file references the old distro (jammy). Re-run the upstream install script to get a correct entry for the new release:
|
||||
|
||||
```bash
|
||||
# Tailscale
|
||||
curl -fsSL https://tailscale.com/install.sh | sh
|
||||
|
||||
# Or manually: update the codename
|
||||
sed 's/jammy/noble/' /etc/apt/sources.list.d/tailscale.list.distUpgrade \
|
||||
> /etc/apt/sources.list.d/tailscale.list
|
||||
apt update && apt upgrade tailscale
|
||||
```
|
||||
|
||||
### Already replaced by .sources
|
||||
|
||||
If the upgrade process already created a `.sources` replacement (common for ubuntu-esm-apps, ondrej-php), the `.distUpgrade` file is just clutter — delete it:
|
||||
|
||||
```bash
|
||||
rm /etc/apt/sources.list.d/ondrej-ubuntu-php-jammy.list.distUpgrade
|
||||
```
|
||||
|
||||
### After all fixes
|
||||
|
||||
```bash
|
||||
apt update
|
||||
apt list --upgradable # should now show pending updates
|
||||
apt upgrade
|
||||
```
|
||||
|
||||
## Real-World Example: MajorsHouse Fleet (2026-04-28)
|
||||
|
||||
Five Ubuntu 24.04 servers were dist-upgraded from Jammy in October 2024. The `.distUpgrade` quarantine was discovered 18 months later when Tailscale's website wouldn't load (Pi-hole was blocking subdomains, but the investigation revealed teelia was stuck on Tailscale 1.76.0, 20 versions behind, because the repo was disabled).
|
||||
|
||||
| Host | Quarantined files | Impact |
|
||||
|------|------------------|--------|
|
||||
| dcaprod | 8 | Tailscale, DO agents, MySQL, ondrej-php, ESM, vector |
|
||||
| teelia | 4 | Tailscale (stuck on 1.76.0), DO agents, certbot bionic PPA |
|
||||
| majorlinux | 8 | Tailscale, DO agents, MySQL, ondrej-php, ESM, apt-fast |
|
||||
| majortoot | 11 | Tailscale, DO agents, nodesource, PostgreSQL, vector, zabbix, ESM |
|
||||
| tttpod | 0 | Clean — was likely rebuilt rather than upgraded |
|
||||
|
||||
All files were audited, stale ones deleted, distro-agnostic repos renamed, and distro-specific repos re-added via upstream install scripts. DO agents upgraded from 3.16.11 to 3.18.12, teelia's Tailscale jumped from 1.76.0 to 1.96.4.
|
||||
|
||||
## Prevention
|
||||
|
||||
- **Post-upgrade audit:** After any `do-release-upgrade`, immediately run `ls /etc/apt/sources.list.d/*.distUpgrade` and resolve each file.
|
||||
- **Prefer `.sources` format:** When adding new third-party repos, use the DEB822 `.sources` format — it's what Ubuntu itself uses on Noble and is handled more gracefully during upgrades.
|
||||
- **Ansible playbook:** Consider a post-upgrade play that checks for `.distUpgrade` files and alerts or auto-fixes distro-agnostic repos.
|
||||