HackerNews GitHub Archive: 17,900+ Curated Repos

HackerNews GitHub Archive: 17,900+ Curated Repos (2025)

Stop wasting time digging through threads. Get every high-signal GitHub repo that hit HackerNews for the last two years — cleaned, verified, and ready to use.

Download CSV: Instant Access
See Why It Saves Weeks

CSV, JSON & Markdown • 17,900+ rows • Verified GitHub links • Commercial use allowed

What this is

A ready-to-use dataset: 17,900+ GitHub repositories that were posted to HackerNews in 2025. Each row contains the project title, the direct GitHub link, and the submission date, cleaned and deduplicated so you get signal, not noise.

• No scraping required: We already did the extraction and cleanup. You get a clean CSV (plus JSON and Markdown) and can get straight to work.
• Verified links: Links were filtered and reviewed, with dead links and spam removed.
• Plug & play: The CSV (or JSON, or Markdown) loads into Excel, BigQuery, Pandas, or any other analytics tool.

What's included

• 17,900+ curated rows (title, GitHub URL, submission date)
• Clean CSV, JSON, Markdown (UTF-8)
• Zero duplicates, minimal noise
• Instant download link after purchase
• Commercial use allowed
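
As a rough illustration of working with the three fields listed above, here is a minimal Python sketch using only the standard library. The header names (title, github_url, submission_date), the ISO date format, and the sample rows are all assumptions made for this example; check the real file's header row and adjust before using it.

```python
import csv
import io
from collections import Counter
from datetime import date

# Hypothetical sample in the ASSUMED layout; the real header names and
# date format may differ, and these rows are placeholders, not real data.
SAMPLE_CSV = """title,github_url,submission_date
Example project A,https://github.com/example-org/project-a,2025-01-15
Example project B,https://github.com/example-org/project-b,2025-01-20
Example project C,https://github.com/another-org/project-c,2025-02-03
"""


def load_rows(handle):
    """Parse CSV rows into dicts, converting the date column to datetime.date."""
    rows = []
    for row in csv.DictReader(handle):
        row["submission_date"] = date.fromisoformat(row["submission_date"])
        rows.append(row)
    return rows


def submissions_per_month(rows):
    """Count rows per (year, month) of the submission date."""
    return Counter(
        (r["submission_date"].year, r["submission_date"].month) for r in rows
    )


rows = load_rows(io.StringIO(SAMPLE_CSV))
per_month = submissions_per_month(rows)
print(per_month)
```

To run this against the downloaded file, replace the in-memory sample with an opened file handle, for example open(path, newline="", encoding="utf-8"), where path is wherever you saved the CSV.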

Get the Dataset — Instant

Who this is for

Developers, indie hackers, founders, content creators, data scientists, and researchers who want a fast path to real-world repo signals — without grinding through threads and scraping failures.

Why this saves you weeks

Time to collect: Manually scraping and verifying 2+ years of HackerNews posts takes days or weeks.
Noise reduction: We removed non-GitHub links, spam, and repeats — you only get projects.
Format headaches: CSV, JSON, and Markdown files that load directly into your tools, with no parsing or schema drama.

Bottom line: do the valuable work — analysis, product design, model training — not the boring extraction and cleanup.

Legitimacy & Credibility

This dataset was compiled using date-based extraction from HackerNews listings, strict filtering for GitHub links, and manual cleanup to remove dead links and spam. No private data, no scraped credentials — only public postings and repo links.
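
For readers who want to sanity-check or extend the data, the filtering and deduplication steps described above might look roughly like the sketch below. This is not the actual pipeline used to build the dataset; the function names and the normalization rules (lowercasing owner and repo, trimming a trailing .git, ignoring deeper paths such as /issues) are assumptions chosen for illustration. It does not check whether a link is still alive, and it does not exclude non-repository GitHub pages that happen to have two path segments.

```python
from urllib.parse import urlparse


def normalize_repo_url(url):
    """Return a canonical https://github.com/owner/repo URL, or None.

    Hypothetical helper: treats only github.com (optionally www.) links
    with at least an owner and a repo path segment as repository links.
    """
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host not in ("github.com", "www.github.com"):
        return None
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        return None  # e.g. a user or organization page, not a repo
    owner, repo = parts[0].lower(), parts[1].lower()
    if repo.endswith(".git"):
        repo = repo[:-4]
    return f"https://github.com/{owner}/{repo}"


def filter_and_dedupe(posts):
    """Keep the first post per repository; drop non-GitHub links and repeats."""
    seen = set()
    kept = []
    for title, url in posts:
        canonical = normalize_repo_url(url)
        if canonical is None or canonical in seen:
            continue
        seen.add(canonical)
        kept.append((title, canonical))
    return kept
```

Given a list of (title, url) pairs, filter_and_dedupe returns one entry per repository in first-seen order, so a post linking to a repo's issue page collapses into the same entry as a post linking to the repo itself.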

• 17k curated rows
• 2025 source window
• CSV: ready to use
• JSON: ready to process
• Markdown: ready to render