wiki: add manual update guide for Gemini CLI

2026-03-13 22:45:52 -04:00
parent 70d9657b7f
commit 2861cade55
10 changed files with 323 additions and 21 deletions


@@ -0,0 +1,47 @@
# 🛠️ Gemini CLI: Manual Update Guide
If the automatic update fails or you need to force a specific version of the Gemini CLI, use these steps.
## 🔴 Symptom: Automatic Update Failed
You may see an error message like:
`✕ Automatic update failed. Please try updating manually`
## 🟢 Manual Update Procedure
### 1. Verify Current Version
Check the version currently installed on your system:
```bash
gemini --version
```
### 2. Check Latest Version
Query the npm registry for the latest available version:
```bash
npm show @google/gemini-cli version
```
### 3. Perform Manual Update
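Reinstall the global package with `npm`, using `sudo` if npm's global install directory is owned by root. To force a specific version, replace `latest` with that version number: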
```bash
sudo npm install -g @google/gemini-cli@latest
```
### 4. Confirm Update
Verify that the new version is active:
```bash
gemini --version
```
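Steps 1 to 3 can be combined so the install only runs when the registry has something newer. This is a minimal sketch, not part of the Gemini CLI: `needs_update` is a hypothetical helper, it assumes `gemini --version` prints a bare `x.y.z` string, and it relies on `sort -V` being available (GNU coreutils and most current systems).

```bash
# Sketch only: needs_update is a hypothetical helper, not part of npm or the Gemini CLI.
# Succeeds (exit 0) when the installed version ($1) is older than the latest one ($2).
needs_update() {
  if [ "$1" = "$2" ]; then
    return 1
  fi
  # sort -V orders version strings, so the older version is printed first.
  oldest="$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)"
  [ "$oldest" = "$1" ]
}

# Usage, assuming gemini and npm are on PATH:
#   if needs_update "$(gemini --version)" "$(npm show @google/gemini-cli version)"; then
#     sudo npm install -g @google/gemini-cli@latest
#   fi
```

Comparing with `sort -V` rather than plain string equality keeps `0.9.0` ordered before `0.10.0`; pre-release suffixes may not sort the way semver expects.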
## 🛠️ Troubleshooting Update Failures
### Permissions Issues
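An `EACCES` error when running `npm install -g` without `sudo` means your user cannot write to npm's global install directory. Either run the command with `sudo` as shown above, or give your user write access to that directory.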
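To see whether `sudo` is actually required, you can test whether the global install directory is writable. A minimal sketch, assuming `npm root -g` prints that directory on your system; `can_write` is a hypothetical helper, not an npm command.

```bash
# Hypothetical helper: succeeds when the path is an existing directory
# that the current user can write to.
can_write() {
  [ -d "$1" ] && [ -w "$1" ]
}

# Usage, assuming npm is on PATH:
#   if can_write "$(npm root -g)"; then
#     npm install -g @google/gemini-cli@latest
#   else
#     sudo npm install -g @google/gemini-cli@latest
#   fi
```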
### Registry Connectivity
If `npm` cannot reach the registry, check your internet connection or any local firewall/proxy settings.
### Cache Issues
If `gemini --version` still reports the old version after the install, clear the npm cache and repeat step 3:
```bash
npm cache clean --force
```


@@ -0,0 +1,58 @@
# Qwen2.5-14B OOM on RTX 3080 Ti (12GB)
## Problem
When attempting to run or fine-tune **Qwen2.5-14B** on an NVIDIA RTX 3080 Ti with 12GB of VRAM, the process fails with an Out of Memory (OOM) error:
```
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate X GiB (GPU 0; 12.00 GiB total capacity; Y GiB already allocated; Z GiB free; ...)
```
The 12GB VRAM limit is hit during the initial model load or immediately upon starting the first training step.
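Before loading a model, it can help to check how much VRAM is already in use by the desktop or other processes. The sketch below assumes the NVIDIA driver's `nvidia-smi` tool is installed; `free_mib` is a hypothetical helper that only parses its CSV output.

```bash
# Hypothetical helper: reads "used, total" lines (in MiB) on stdin
# and prints the free MiB for each GPU.
free_mib() {
  awk -F', *' '{ print $2 - $1 }'
}

# Usage on a machine with the NVIDIA driver installed:
#   nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits | free_mib
```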
## Root Causes
1. **Model Size:** A 14B parameter model in FP16/BF16 requires ~28GB of VRAM just for the weights (14 billion parameters × 2 bytes each).
2. **Context Length:** Long context lengths (e.g., 4096+ tokens) significantly increase activation memory during training.
3. **Training Overhead:** Even with QLoRA (4-bit quantization), the 14B weights alone take roughly 7GB, and gradients, optimizer states, and activations can push total usage past 12GB.
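The weight figures above follow from simple arithmetic: parameters × bits per parameter ÷ 8. A rough sketch, where `weights_gb` is a hypothetical helper and the result ignores quantization metadata, activations, and optimizer state:

```bash
# Hypothetical helper: approximate weight memory in GB.
# $1 = parameter count in billions, $2 = bits per parameter.
weights_gb() {
  awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

weights_gb 14 16   # 14B in FP16/BF16 -> 28.0
weights_gb 14 4    # 14B in 4-bit     -> 7.0
weights_gb 7 4     # 7B in 4-bit      -> 3.5
```

These are lower bounds for the weights only, which is why a 4-bit 14B model can load for inference on 12GB but leaves little room for training.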
---
## Solutions
### 1. Pivot to a 7B Model (Recommended)
For a 12GB GPU, a 7B parameter model (like **Qwen2.5-7B-Instruct**) is the practical choice. It performs well while leaving VRAM headroom for longer context lengths and larger batch sizes.
- **VRAM Usage (7B QLoRA):** ~6-8GB
- **Pros:** Stable, fast, supports long context.
- **Cons:** Slightly lower reasoning capability than 14B.
### 2. Aggressive Quantization
If you must run 14B, use a 4-bit quantized build (for example in GGUF or EXL2 format) for inference only. Training 14B on 12GB is not reliably possible, even with extreme offloading.
```bash
# Example Ollama run (uses 4-bit quantization by default)
ollama run qwen2.5:14b
```
### 3. Training Optimizations (if attempting 14B)
If you have no choice but to try 14B training:
- Set `max_seq_length` to 512 or 1024.
- Use `Unsloth` (it is highly memory-efficient).
- Enable `gradient_checkpointing`.
- Set `per_device_train_batch_size = 1`.
---
## Maintenance
Keep your NVIDIA drivers and CUDA toolkit updated. On Windows (MajorRig), ensure WSL2 has sufficient memory allocation in `.wslconfig`.
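For reference, the WSL2 memory limit is set with the `memory` key in the `[wsl2]` section of `.wslconfig`, which lives in the Windows user profile folder. The sketch below only prints such a fragment. `wslconfig_snippet` is a hypothetical helper and `16GB` is an illustrative value, not a recommendation from this guide.

```bash
# Hypothetical helper: print a .wslconfig fragment that sets the WSL2 memory limit.
# $1 = memory size, e.g. 16GB (illustrative value only).
wslconfig_snippet() {
  printf '[wsl2]\nmemory=%s\n' "$1"
}

wslconfig_snippet 16GB
```

After editing `.wslconfig`, run `wsl --shutdown` from Windows so the new limit takes effect on the next start.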
---
## Tags
#gpu #cuda #oom #qwen #majortwin #llm #fine-tuning


@@ -2,8 +2,18 @@
Practical fixes for common Linux, networking, and application problems.
- [ISP SNI Filtering with Caddy](isp-sni-filtering-caddy.md)
- [Obsidian Cache Hang Recovery](obsidian-cache-hang-recovery.md)
- [yt-dlp Fedora JS Challenge](yt-dlp-fedora-js-challenge.md)
- [MajorWiki Setup & Publishing Pipeline](majwiki-setup-and-pipeline.md)
## 🖥️ GPU & AI
- [Qwen2.5-14B OOM on RTX 3080 Ti (12GB)](gpu-display/qwen-14b-oom-3080ti.md)
## 🌐 Networking & Web
- [Apache Outage: Fail2ban Self-Ban + Missing iptables Rules](networking/fail2ban-self-ban-apache-outage.md)
- [ISP SNI Filtering & Caddy](isp-sni-filtering-caddy.md)
- [yt-dlp YouTube JS Challenge Fix](yt-dlp-fedora-js-challenge.md)
## 📦 Docker & Systems
- [Docker & Caddy Recovery After Reboot (Fedora + SELinux)](docker-caddy-selinux-post-reboot-recovery.md)
- [MajorWiki Setup & Publishing Pipeline](majwiki-setup-and-pipeline.md)
## 📝 Application Specific
- [Obsidian Vault Recovery — Loading Cache Hang](obsidian-cache-hang-recovery.md)
- [Gemini CLI Manual Update](gemini-cli-manual-update.md)


@@ -119,3 +119,20 @@ The webhook runs as a systemd service so it survives reboots:
systemctl status majwiki-webhook
systemctl restart majwiki-webhook
```
---
*Updated 2026-03-13: Obsidian Git plugin dropped. See canonical workflow below.*
## Canonical Publishing Workflow
The Obsidian Git plugin was evaluated but dropped because it was too convoluted for a simple push. Manual git from the terminal is the canonical workflow.
```bash
cd ~/Documents/MajorVault
git add 20-Projects/MajorTwin/08-Wiki/
git commit -m "wiki: describe your changes"
git push
```
From there: Gitea receives the push → fires webhook → majorlab pulls → MkDocs rebuilds → `notes.majorshouse.com` updates.
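The commit message in the workflow above follows a `wiki: ` prefix convention. If you want to enforce it, a small wrapper can add the prefix when it is missing. This is a sketch only; `wiki_msg` is a hypothetical helper, not part of the pipeline.

```bash
# Hypothetical helper: make sure a commit message carries the "wiki: " prefix.
wiki_msg() {
  case "$1" in
    "wiki: "*) printf '%s\n' "$1" ;;
    *)         printf 'wiki: %s\n' "$1" ;;
  esac
}

# Usage, from the vault directory shown above:
#   git add 20-Projects/MajorTwin/08-Wiki/
#   git commit -m "$(wiki_msg "add manual update guide for Gemini CLI")"
#   git push
```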